Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
This article introduces contrastive alignment instructions (AlignInstruct) to address two challenges in machine translation (MT) on large language models (LLMs). One is the...
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
This paper was accepted at the Efficient Systems for Foundation Models Workshop at ICML 2024
The inference of transformer-based large language models consists of...
Model-Driven Heart Rate Estimation and Heart Murmur Detection Based on Phonocardiogram
Acoustic signals are crucial for health monitoring, particularly heart sounds, which provide essential data like heart rate and can reveal cardiac anomalies such as...
DataComp-LM: In Search of the Next Generation of Training Sets for Language Models
This paper was accepted at the Datasets and Benchmarks Track at NeurIPS 2024
We introduce DataComp for Language Models (DCLM), a testbed for...
Instance Optimal Private Density Estimation in the Wasserstein Distance
Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an...
Pre-Trained Foundation Model Representations to Uncover Breathing Patterns in Speech
The process of human speech production involves coordinated respiratory action to elicit acoustic speech signals. Typically, speech is produced when air is forced...
Apple Intelligence Foundation Language Models
We present foundation language models developed to power Apple Intelligence features, including a ∼3 billion parameter model designed to run efficiently on devices...
Federated Learning With Differential Privacy for End-to-End Speech Recognition
While federated learning (FL) has recently emerged as a promising approach to train machine learning models, it is limited to only preliminary...
Ferret-v2: An Improved Baseline for Referring and Grounding
While Ferret seamlessly integrates regional understanding into the Large Language Model (LLM) to facilitate its referring and grounding capability, it poses certain limitations:...
Samplable Anonymous Aggregation for Private Federated Data Analytics
We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Locally...