Towards Automated Accessibility Report Generation for Mobile Apps
Many apps have basic accessibility issues, like missing labels or low contrast. Automated tools can help app developers catch basic issues, but can...
On a Neural Implementation of Brenier’s Polar Factorization
In 1991, Brenier proved a theorem that generalizes the polar decomposition for square matrices -- factored as PSD × unitary -- to any...
Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Learning meaningful representations of complex objects that can be seen through multiple (k ≥ 3) views or modalities is a core task in machine...
A Direct Algorithm for Multi-Gyroscope Infield Calibration
In this paper, we address the problem of estimating the rotational extrinsics, as well as the scale factors of two gyroscopes rigidly mounted...
CodeAct: Your LLM Agent Acts Better when Generating Code
Large Language Model (LLM) agents, capable of performing a broad range of actions, such as invoking tools and controlling robots, show great potential...
On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions
We investigate the out-of-domain generalization of random feature (RF) models and Transformers. We first prove that in the ‘generalization on the unseen (GOTU)’...
Revealing the Utilized Rank of Subspaces of Learning in Neural Networks
In this work, we study how well the learned weights of a neural network utilize the space available to them. This notion is...
Enhancing CTC-based Speech Recognition with Diverse Modeling Units
In recent years, the evolution of end-to-end (E2E) automatic speech recognition (ASR) models has been remarkable, largely due to advances in deep learning...
Bytes Are All You Need: Transformers Operating Directly On File Bytes
Modern deep learning approaches usually utilize modality-specific processing. For example, the most common deep learning approach to image classification involves decoding image file...
On Computationally Efficient Multi-Class Calibration
Consider a multi-class labelling problem, where the labels can take values in [k], and a predictor predicts a distribution over the labels. In...