Faster Algorithms for User-Level Private Stochastic Convex Optimization
We study private stochastic convex optimization (SCO) under user-level differential privacy (DP) constraints. In this setting, there are n users, each possessing m...
Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions
We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a kth-moment bound on the Lipschitz...
Instance-Optimal Private Density Estimation in the Wasserstein Distance
Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an...
Private Online Learning via Lazy Algorithms
We study the problem of private online learning, specifically, online prediction from experts (OPE) and online convex optimization (OCO). We propose a new...
Do LLMs Estimate Uncertainty Well in Instruction-Following?
This paper was accepted at the Safe Generative AI Workshop (SGAIW) at NeurIPS 2024.
Large language models (LLMs) could be valuable personal AI agents...
Towards Low-Bit Communication for Tensor Parallel LLM Inference
This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2024.
Tensor parallelism provides an effective way to...
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
We present Recurrent Drafter (ReDrafter), an advanced speculative decoding approach that achieves state-of-the-art speedup for large language model (LLM) inference. The performance gains...
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2024.
The pre-training phase of language models often...
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Neural contextual biasing allows speech recognition models to leverage contextually relevant information, leading to improved transcription accuracy. However, the biasing mechanism is typically...
Device-Directed Speech Detection for Follow-up Conversations Using Large Language Models
This paper was accepted at the Adaptive Foundation Models (AFM) Workshop at NeurIPS 2024.
Follow-up conversations with virtual assistants (VAs) enable a user...