Multimodal Autoregressive Pre-Training of Large Vision Encoders
A dominant paradigm in large multimodal models is to pair a large language decoder with a vision encoder. While it is...
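The pairing the abstract describes is, in many recent systems, implemented by projecting vision-encoder outputs into the decoder's token space and prepending them to the text sequence. A minimal numpy sketch of that generic pattern follows; all dimensions and names are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch (illustrative, not the paper's model): pairing a vision
# encoder with a language decoder via a learned projection layer.
import numpy as np

rng = np.random.default_rng(0)
d_vision, d_model, n_patches, n_text = 64, 128, 16, 8

patch_embeddings = rng.normal(size=(n_patches, d_vision))   # vision encoder output
W_proj = rng.normal(size=(d_vision, d_model)) * 0.02        # learned projector (toy init)
visual_tokens = patch_embeddings @ W_proj                   # map into decoder space

text_tokens = rng.normal(size=(n_text, d_model))            # embedded text prompt
decoder_input = np.concatenate([visual_tokens, text_tokens])  # visual tokens as prefix
print(decoder_input.shape)  # (24, 128): one fused sequence for the language decoder
```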
Do LLMs Internally “Know” When They Follow Instructions?
This paper was accepted at the Foundation Model Interventions (MINT) Workshop at NeurIPS 2024.
Instruction-following is crucial for building AI agents with large language...
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Large language models (LLMs) are commonly trained on datasets consisting of fixed-length token sequences. These datasets are created by randomly concatenating documents of...
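The fixed-length construction the abstract refers to is often called concat-and-chunk: documents are shuffled, concatenated into one token stream, and cut into equal-length sequences. A minimal sketch of that baseline pipeline, assuming a toy integer tokenization; it shows the standard setup, not the paper's decomposition method.

```python
# Minimal sketch, assuming toy integer "tokens": the standard concat-and-chunk
# pipeline for building fixed-length LLM training sequences.
import random

def concat_and_chunk(documents, seq_len, seed=0):
    """Shuffle documents, concatenate their tokens, cut into fixed-length chunks."""
    rng = random.Random(seed)
    docs = documents[:]
    rng.shuffle(docs)
    stream = [tok for doc in docs for tok in doc]
    # Drop the trailing remainder that doesn't fill a full sequence.
    return [stream[i:i + seq_len] for i in range(0, len(stream) - seq_len + 1, seq_len)]

docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10]]
print(concat_and_chunk(docs, seq_len=4))  # chunks may mix unrelated documents
```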
Do Compressed LLMs Forget Knowledge? An Experimental Study with Practical Implications
This paper was accepted at the Machine Learning and Compression Workshop at NeurIPS 2024.
Compressing Large Language Models (LLMs) often leads to reduced performance,...
Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization
Learning with identical train and test distributions has been extensively investigated both practically and theoretically. Much remains to be understood, however, in statistical...
Faster Algorithms for User-Level Private Stochastic Convex Optimization
We study private stochastic convex optimization (SCO) under user-level differential privacy (DP) constraints. In this setting, there are n users, each possessing m...
Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions
We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a kth-moment bound on the Lipschitz...
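For context, the textbook way to reduce heavy-tailed gradients to a bounded-sensitivity problem is per-sample clipping followed by Gaussian noise, as in DP-SGD. A hedged numpy sketch of that generic reduction follows; it is not the algorithm proposed in the paper, and all parameter names are illustrative.

```python
# Hedged sketch (not the paper's method): clipping bounds per-sample gradient
# sensitivity to clip_norm, so adding Gaussian noise to the mean gives a
# differentially private update even when raw gradients are heavy-tailed.
import numpy as np

def dp_sgd_step(w, per_sample_grads, clip_norm, noise_mult, lr, rng):
    clipped = []
    for g in per_sample_grads:
        scale = min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        clipped.append(g * scale)
    mean_grad = np.mean(clipped, axis=0)
    # Gaussian mechanism: noise scaled to the sensitivity of the clipped mean.
    noise = rng.normal(scale=noise_mult * clip_norm / len(clipped), size=w.shape)
    return w - lr * (mean_grad + noise)

rng = np.random.default_rng(0)
w = np.zeros(3)
grads = rng.standard_t(df=2, size=(32, 3))   # heavy-tailed per-sample gradients
w = dp_sgd_step(w, grads, clip_norm=1.0, noise_mult=1.0, lr=0.1, rng=rng)
```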
Instance-Optimal Private Density Estimation in the Wasserstein Distance
Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an...
Private Online Learning via Lazy Algorithms
We study the problem of private online learning, specifically, online prediction from experts (OPE) and online convex optimization (OCO). We propose a new...
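For readers unfamiliar with the OPE setting, the classical non-private baseline is multiplicative weights over K experts, which the sketch below illustrates. The paper's private transformation of lazy (low-switching) algorithms is not shown here; this only fixes the problem setup.

```python
# Minimal sketch of the standard (non-private) online-prediction-from-experts
# setting: multiplicative weights over K experts with losses in [0, 1].
import numpy as np

def multiplicative_weights(losses, eta=0.1):
    """losses: (T, K) array of per-round expert losses in [0, 1]."""
    T, K = losses.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()          # play an expert drawn from probs
        total_loss += probs @ losses[t]          # expected loss this round
        weights *= np.exp(-eta * losses[t])      # downweight poorly performing experts
    return total_loss

rng = np.random.default_rng(1)
print(multiplicative_weights(rng.uniform(size=(100, 5))))
```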
Do LLMs Estimate Uncertainty Well in Instruction-Following?
This paper was accepted at the Safe Generative AI Workshop (SGAIW) at NeurIPS 2024.
Large language models (LLMs) could be valuable personal AI agents...