Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format. Code behind this work: https://github.com/imelnyk/ArxivPapers Support this podcast: https://podcasters.spotify.com/pod/s ...
Infer Human's Intentions Before Following Natural Language Instruction (27:36)
The FISER framework enhances AI's ability to follow ambiguous human instructions by inferring intentions, outperforming traditional methods in collaborative tasks, particularly on the HandMeThat benchmark. https://arxiv.org/abs//2409.18073
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models (15:10)
This paper presents a learnable pruning method for Large Language Models, achieving efficient N:M sparsity, improved mask quality, and transferability across tasks, outperforming existing techniques in empirical evaluations. https://arxiv.org/abs//2409.17481
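For context, N:M sparsity means that in every contiguous group of M weights at most N may be nonzero (the GPU-friendly 2:4 pattern keeps 2 of every 4). A minimal NumPy sketch of enforcing the 2:4 constraint by magnitude; note this is only an illustration of the constraint, since MaskLLM learns the mask end-to-end rather than picking it by magnitude:

```python
import numpy as np

def mask_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero out the 2 smallest-magnitude weights in every group of 4.

    Illustrates the 2:4 (N:M) sparsity pattern only; the paper learns
    the mask instead of deriving it from weight magnitudes.
    """
    flat = weights.reshape(-1, 4)                 # contiguous groups of 4
    keep = np.argsort(np.abs(flat), axis=1)[:, 2:]  # 2 largest per group
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return (flat * mask).reshape(weights.shape)

w = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.6, -0.3])
print(mask_2_4(w))  # each group of 4 keeps only its 2 largest magnitudes
```

The resulting pattern is what hardware such as sparse tensor cores can exploit: a fixed, predictable 50% sparsity within every small block.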
Counterfactual Token Generation in Large Language Models (14:52)
This paper presents a method that enables large language models to perform counterfactual token generation without fine-tuning, and applies it to bias detection. https://arxiv.org/abs//2409.17027
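Counterfactual generation asks: with the sampling randomness held fixed, what would the model have produced had the context been different? One standard device for making token sampling counterfactual-ready (a toy sketch, not necessarily the authors' exact construction) is Gumbel-max sampling, where fixing the Gumbel noise turns each sampling step into a deterministic function of the logits:

```python
import numpy as np

def gumbel_max_sample(logits: np.ndarray, noise: np.ndarray) -> int:
    # argmax(logits + Gumbel noise) is a draw from softmax(logits);
    # reusing the same noise makes the draw deterministic in the
    # logits, so changing the context yields a counterfactual token.
    return int(np.argmax(logits + noise))

rng = np.random.default_rng(0)
noise = rng.gumbel(size=3)  # shared randomness, held fixed

factual_logits = np.array([2.0, 0.5, 0.1])         # as observed
counterfactual_logits = np.array([0.1, 0.5, 3.0])  # had the prefix differed

print(gumbel_max_sample(factual_logits, noise))
print(gumbel_max_sample(counterfactual_logits, noise))
```

With the noise fixed, rerunning the factual logits always reproduces the factual token, while altered logits answer the "what if" question.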
The paper identifies stable regions in Transformers' residual streams, showing insensitivity to small changes but high sensitivity at boundaries, aligning with semantic distinctions and clustering similar prompts. https://arxiv.org/abs//2409.17113
Watch Your Steps: Observable and Modular Chains of Thought (29:35)
We introduce Program Trace Prompting, enhancing chain-of-thought explanations with formal syntax, improving observability, and enabling analysis of reasoning errors across diverse tasks in the BIG-Bench Hard benchmark. https://arxiv.org/abs//2409.15359
Seeing Faces in Things: A Model and Dataset for Pareidolia (10:54)
This paper explores face pareidolia in computer vision, presenting a dataset of annotated images and analyzing the differences in face detection between humans and machines. https://arxiv.org/abs//2409.16143
[QA] Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts (8:20)
Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts (29:04)
The paper investigates out-of-distribution behavior in autoregressive LLMs through rule extrapolation in formal languages, analyzing various architectures and proposing a normative theory inspired by algorithmic information theory. https://arxiv.org/abs//2409.13728
Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking (11:39)
This study evaluates the effectiveness of LLM-judge preferences in improving alignment, finding no correlation with concrete metrics and highlighting biases in LLM judgments. https://arxiv.org/abs//2409.15268
LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models (13:56)
This paper introduces LLM Surgery, a framework for efficiently modifying large language models to unlearn outdated information and integrate new knowledge without complete retraining, demonstrating significant performance improvements. https://arxiv.org/abs//2409.13054
Embedding Geometries of Contrastive Language-Image Pre-Training (15:25)
This paper explores alternative geometries and softmax logits for language-image pre-training, finding that Euclidean CLIP (EuCLIP) performs as well as or better than the original CLIP. https://arxiv.org/abs//2409.13079
Kolmogorov–Arnold Transformer (15:05)
The Kolmogorov–Arnold Transformer (KAT) enhances transformer performance by replacing MLP layers with Kolmogorov–Arnold Network layers, addressing key challenges and demonstrating superior results in various tasks. https://arxiv.org/abs//2409.10594
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think (11:52)
This paper reveals a flaw in the inference pipeline of diffusion models for depth estimation; fixing it yields a 200× speed improvement and superior performance through end-to-end fine-tuning. https://arxiv.org/abs//2409.11355
[QA] Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm (7:03)
Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm (12:28)
This paper explores the geometric implications of LayerNorm in transformers, revealing its irreversibility and redundancy, and advocates for RMSNorm as a more efficient alternative with similar performance. https://arxiv.org/abs//2409.12951
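For reference, the two normalizations differ only in mean-centering: LayerNorm subtracts the mean before scaling to unit variance, while RMSNorm divides by the root mean square alone, which is cheaper to compute. A minimal NumPy sketch (the learnable gain/bias parameters of the real layers are omitted):

```python
import numpy as np

def layer_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # subtract the mean, then scale to unit variance
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def rms_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # no mean subtraction: only divide by the root mean square,
    # so the output keeps the direction of x
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return x / rms

x = np.array([1.0, 2.0, 3.0, 4.0])
print(layer_norm(x))  # zero-mean, unit-variance output
print(rms_norm(x))    # same direction as x, unit RMS
```

The mean subtraction is exactly the component the paper examines geometrically: dropping it (RMSNorm) skips a projection step while leaving the scaling behavior intact.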
Is Tokenization Needed for Masked Particle Modelling? (20:39)
This paper enhances masked particle modeling (MPM) for high-energy physics, improving performance through better implementation and a powerful decoder, outperforming previous methods in various jet physics tasks. https://arxiv.org/abs//2409.12589
Finetuning Language Models to Emit Linguistic Expressions of Uncertainty (12:41)
https://arxiv.org/abs//2409.12180 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning (26:23)
Chain-of-thought prompting enhances reasoning in large language models, particularly for math and logic tasks, but shows limited benefits for other tasks, suggesting a need for new computational paradigms. https://arxiv.org/abs//2409.12183
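The comparison at issue is between answer-only prompting and prompting that elicits intermediate reasoning steps. A minimal sketch of the two prompt styles (illustrative strings only, not the paper's exact prompts):

```python
def direct_prompt(question: str) -> str:
    # answer-only baseline: ask for the final answer immediately
    return f"Q: {question}\nA: The answer is"

def cot_prompt(question: str) -> str:
    # chain-of-thought: elicit intermediate reasoning before the answer
    return f"Q: {question}\nA: Let's think step by step."

q = "A train travels 60 km in 1.5 hours. What is its average speed?"
print(direct_prompt(q))
print(cot_prompt(q))
```

The paper's finding is that swapping the first style for the second helps substantially on math- and logic-like questions but changes little elsewhere.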
On the limits of agency in agent-based models (19:50)
AgentTorch is a framework that enhances agent-based modeling by using large language models to simulate millions of agents, demonstrating its utility in analyzing complex systems like the COVID-19 pandemic. https://arxiv.org/abs//2409.10568
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models (15:25)
Promptriever is a novel retrieval model that follows instructions, achieving state-of-the-art performance and improved robustness, demonstrating the potential of prompting in information retrieval. https://arxiv.org/abs//2409.11136
Finetuning CLIP to Reason about Pairwise Differences (16:47)
This paper enhances CLIP's contrastive learning by aligning image embeddings with text descriptions, improving image ranking and zero-shot classification, and introduces comparative prompting for better performance and geometric properties. https://arxiv.org/abs//2409.09721
Think Twice Before You Act: Improving Inverse Problem Solving With MCMC (11:16)
The paper introduces Diffusion Posterior MCMC (DPMC), an improved algorithm for solving inverse problems using pretrained diffusion models, outperforming existing methods and reducing errors in high-noise scenarios. https://arxiv.org/abs//2409.08551
Explaining Datasets in Words: Statistical Models with Natural Language Parameters (18:38)
The paper introduces interpretable statistical models using natural language predicates, optimizing parameters with a model-agnostic algorithm, applicable across various domains for enhanced data understanding and explanation. https://arxiv.org/abs//2409.08466
This paper argues that hallucinations in Large Language Models are inevitable due to their mathematical structure, introducing "Structural Hallucinations" and challenging the belief that they can be eliminated. https://arxiv.org/abs//2409.05746
[QA] PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation (7:01)
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation (7:21)
We present a benchmark for assessing language models' role-playing abilities through dynamic conversations, utilizing player, interrogator, and judge models, validated by experiments comparing automated and human evaluations. https://arxiv.org/abs//2409.06820
LLaMA-Omni: Seamless Speech Interaction with Large Language Models (21:20)
LLaMA-Omni is a novel model for real-time speech interaction with LLMs, offering low-latency, high-quality responses without transcription, built on a new dataset of 200K speech instructions. https://arxiv.org/abs//2409.06666
WINDOWS AGENT ARENA: Evaluating Multi-Modal OS Agents at Scale (16:53)
The WINDOWSAGENTARENA introduces a scalable benchmark for evaluating multi-modal agents in a real Windows environment, demonstrating enhanced performance through the Navi agent across diverse tasks. https://arxiv.org/abs//2409.08264
Deep Schema Grounding (DSG) enhances vision-language models' ability to interpret visual abstractions by using structured representations, improving reasoning and understanding of abstract concepts in images. https://arxiv.org/abs//2409.08202