Contextualization of ASR with LLM Using Phonetic Retrieval-Based Augmentation
Large language models (LLMs) have shown a superb capability for modeling multimodal signals, including audio and text, allowing the model to generate spoken or...
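The truncated abstract stops before describing the retrieval step, but the "phonetic retrieval-based augmentation" in the title is commonly realized by matching a recognized span against a personal catalog by phoneme similarity. The snippet below is a hypothetical sketch of that generic idea only; the `phonetic_retrieve` helper, the catalog layout, and the similarity measure are assumptions, not the paper's method.

```python
from difflib import SequenceMatcher

def phonetic_retrieve(hyp_phonemes, catalog):
    """Return the catalog entry whose phoneme sequence is closest to the
    (possibly misrecognized) hypothesis span. Similarity here is a plain
    sequence-match ratio; the paper's actual retriever may differ."""
    def similarity(a, b):
        return SequenceMatcher(None, a, b).ratio()
    return max(catalog, key=lambda entry: similarity(hyp_phonemes, entry["phonemes"]))

# Hypothetical contact catalog with precomputed phoneme sequences.
catalog = [
    {"name": "Katie", "phonemes": ["K", "EY", "T", "IY"]},
    {"name": "Cole",  "phonemes": ["K", "OW", "L"]},
]

# An ASR hypothesis rendered as "Cady" retrieves the phonetically closest entry.
print(phonetic_retrieve(["K", "EY", "D", "IY"], catalog)["name"])  # -> "Katie"
```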
Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments
To deploy machine learning models on-device, practitioners use compression algorithms to shrink and speed up models while maintaining their high-quality output. A...
Generalizable Error Modeling for Human Data Annotation: Evidence from an Industry-Scale Search Data Annotation...
Machine learning (ML) and artificial intelligence (AI) systems rely heavily on human-annotated data for training and evaluation. A major challenge in this context...
Misty: UI Prototyping Through Interactive Conceptual Blending
UI prototyping often involves iterating and blending elements from examples such as screenshots and sketches, but current tools offer limited support for incorporating...
Optimizing Byte-level Representation for End-to-End ASR
This paper was accepted at the IEEE Spoken Language Technology Workshop (SLT) 2024.
In this paper, we propose an algorithm to optimize a byte-level...
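The abstract is cut off before the proposed algorithm is described. As background, the sketch below only illustrates the vanilla byte-level representation such work builds on, where text is tokenized as raw UTF-8 bytes; the optimized representation the paper proposes is not reproduced here.

```python
def to_byte_tokens(text: str) -> list[int]:
    """Represent text as a sequence of UTF-8 byte IDs (0-255).
    This is the plain byte-level representation, not the optimized
    scheme proposed in the paper."""
    return list(text.encode("utf-8"))

def from_byte_tokens(tokens: list[int]) -> str:
    """Decode byte IDs back to text, skipping any invalid byte
    sequences a model might emit."""
    return bytes(tokens).decode("utf-8", errors="ignore")

# A multilingual string fits in a 256-symbol vocabulary, at the cost of
# longer sequences for non-Latin scripts.
tokens = to_byte_tokens("hello 你好")
print(tokens)                    # [104, 101, 108, 108, 111, 32, 228, 189, 160, 229, 165, 189]
print(from_byte_tokens(tokens))  # "hello 你好"
```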
Classifier-Free Guidance Is a Predictor-Corrector
We investigate the unreasonable effectiveness of classifier-free guidance (CFG). CFG is the dominant method of conditional sampling for text-to-image diffusion models, yet unlike other aspects...
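For reference, the standard CFG update combines the conditional and unconditional noise predictions with a guidance weight. The sketch below shows only that textbook combination; the predictor-corrector interpretation developed in the paper is not shown, and the helper name and toy values are illustrative.

```python
import numpy as np

def cfg_combine(eps_uncond: np.ndarray, eps_cond: np.ndarray, w: float) -> np.ndarray:
    """Standard classifier-free guidance: extrapolate from the unconditional
    noise prediction toward the conditional one. With this convention,
    w = 1 recovers plain conditional sampling and w > 1 sharpens guidance."""
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy example with dummy noise predictions and a guidance weight of 7.5,
# a value commonly used for text-to-image diffusion models.
eps_u = np.zeros(4)
eps_c = np.ones(4)
print(cfg_combine(eps_u, eps_c, w=7.5))  # [7.5 7.5 7.5 7.5]
```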
Apple Workshop on Privacy-Preserving Machine Learning 2024
At Apple, we believe privacy is a fundamental human right. It’s also one of our core values, influencing both our research and the...
Positional Description for Numerical Normalization
We present a Positional Description Scheme (PDS) tailored for digit sequences, integrating placeholder value information for each digit. Given the structural limitations of...
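The abstract states only that each digit is paired with its place-value information. The sketch below is a hypothetical illustration of that general idea; the function name and the verbalized output format are assumptions, and the paper's actual PDS format may differ.

```python
def describe_digits(number: str) -> str:
    """Pair each digit with its place value, e.g. '8704' ->
    '8 thousands 7 hundreds 0 tens 4 units'. Illustrative only;
    the paper's PDS may encode this information differently."""
    places = ["units", "tens", "hundreds", "thousands",
              "ten-thousands", "hundred-thousands", "millions"]
    n = len(number)
    assert n <= len(places), "extend the place list for longer numbers"
    return " ".join(f"{d} {places[n - 1 - i]}" for i, d in enumerate(number))

print(describe_digits("8704"))  # 8 thousands 7 hundreds 0 tens 4 units
```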
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition
Audio-visual speech contains synchronized audio and visual information that provides cross-modal supervision to learn representations for both automatic speech recognition (ASR) and visual...