Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format. Code behind this work: https://github.com/imelnyk/ArxivPapers Support this podcast: https://podcasters.spotify.com/pod/s ...
Infer Human's Intentions Before Following Natural Language Instruction (27:36)
The FISER framework enhances AI's ability to follow ambiguous human instructions by inferring intentions, outperforming traditional methods in collaborative tasks, particularly on the HandMeThat benchmark. https://arxiv.org/abs//2409.18073
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models (15:10)
This paper presents a learnable pruning method for Large Language Models, achieving efficient N:M sparsity, improved mask quality, and transferability across tasks, outperforming existing techniques in empirical evaluations. https://arxiv.org/abs//2409.17481
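For context, N:M sparsity means that in every contiguous group of M weights at most N may be nonzero (the GPU-friendly 2:4 pattern keeps 2 of every 4). A minimal NumPy sketch of enforcing the 2:4 constraint by magnitude; note this is only an illustration of the constraint, since MaskLLM learns the mask end-to-end rather than picking it by magnitude:

```python
import numpy as np

def mask_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero out the 2 smallest-magnitude weights in every group of 4.

    Illustrates the 2:4 (N:M) sparsity pattern only; the paper learns
    the mask instead of deriving it from weight magnitudes.
    """
    flat = weights.reshape(-1, 4)                 # contiguous groups of 4
    keep = np.argsort(np.abs(flat), axis=1)[:, 2:]  # 2 largest per group
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return (flat * mask).reshape(weights.shape)

w = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.6, -0.3])
print(mask_2_4(w))  # each group of 4 keeps only its 2 largest magnitudes
```

The resulting pattern is what hardware such as sparse tensor cores can exploit: a fixed, predictable 50% sparsity within every small block.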
Counterfactual Token Generation in Large Language Models (14:52)
This paper presents a method that enables large language models to perform counterfactual token generation without fine-tuning, and applies it to bias detection. https://arxiv.org/abs//2409.17027
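Counterfactual generation asks: with the sampling randomness held fixed, what would the model have produced had the context been different? One standard device for making token sampling counterfactual-ready (a toy sketch, not necessarily the authors' exact construction) is Gumbel-max sampling, where fixing the Gumbel noise turns each sampling step into a deterministic function of the logits:

```python
import numpy as np

def gumbel_max_sample(logits: np.ndarray, noise: np.ndarray) -> int:
    # argmax(logits + Gumbel noise) is a draw from softmax(logits);
    # reusing the same noise makes the draw deterministic in the
    # logits, so changing the context yields a counterfactual token.
    return int(np.argmax(logits + noise))

rng = np.random.default_rng(0)
noise = rng.gumbel(size=3)  # shared randomness, held fixed

factual_logits = np.array([2.0, 0.5, 0.1])         # as observed
counterfactual_logits = np.array([0.1, 0.5, 3.0])  # had the prefix differed

print(gumbel_max_sample(factual_logits, noise))
print(gumbel_max_sample(counterfactual_logits, noise))
```

With the noise fixed, rerunning the factual logits always reproduces the factual token, while altered logits answer the "what if" question.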
The paper identifies stable regions in Transformers' residual streams, showing insensitivity to small changes but high sensitivity at boundaries, aligning with semantic distinctions and clustering similar prompts. https://arxiv.org/abs//2409.17113
Watch Your Steps: Observable and Modular Chains of Thought (29:35)
We introduce Program Trace Prompting, enhancing chain-of-thought explanations with formal syntax, improving observability, and enabling analysis of reasoning errors across diverse tasks in the BIG-Bench Hard benchmark. https://arxiv.org/abs//2409.15359
Seeing Faces in Things: A Model and Dataset for Pareidolia (10:54)
This paper explores face pareidolia in computer vision, presenting a dataset of annotated images and analyzing the differences in face detection between humans and machines. https://arxiv.org/abs//2409.16143
[QA] Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts (8:20)
Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts (29:04)
The paper investigates out-of-distribution behavior in autoregressive LLMs through rule extrapolation in formal languages, analyzing various architectures and proposing a normative theory inspired by algorithmic information theory. https://arxiv.org/abs//2409.13728
Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking (11:39)
This study evaluates the effectiveness of LLM-judge preferences in improving alignment, finding no correlation with concrete metrics and highlighting biases in LLM judgments. https://arxiv.org/abs//2409.15268
LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models (13:56)
This paper introduces LLM Surgery, a framework for efficiently modifying large language models to unlearn outdated information and integrate new knowledge without complete retraining, demonstrating significant performance improvements. https://arxiv.org/abs//2409.13054
Embedding Geometries of Contrastive Language-Image Pre-Training (15:25)
This paper explores alternative geometries and softmax logits for language-image pre-training, finding that Euclidean CLIP (EuCLIP) performs as well as or better than the original CLIP. https://arxiv.org/abs//2409.13079
Kolmogorov–Arnold Transformer (15:05)
The Kolmogorov–Arnold Transformer (KAT) enhances transformer performance by replacing MLP layers with Kolmogorov–Arnold Network layers, addressing key challenges and demonstrating superior results in various tasks. https://arxiv.org/abs//2409.10594
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think (11:52)
This paper reveals a flaw in the inference pipeline of diffusion models for depth estimation; fixing it yields a 200× speed improvement and superior performance through end-to-end fine-tuning. https://arxiv.org/abs//2409.11355
[QA] Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm (7:03)
Re-Introducing LayerNorm: Geometric Meaning, Irreversibility and a Comparative Study with RMSNorm (12:28)
This paper explores the geometric implications of LayerNorm in transformers, revealing its irreversibility and redundancy, and advocates for RMSNorm as a more efficient alternative with similar performance. https://arxiv.org/abs//2409.12951
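For reference, the two normalizations differ only in mean-centering: LayerNorm subtracts the mean before scaling to unit variance, while RMSNorm divides by the root mean square alone, which is cheaper to compute. A minimal NumPy sketch (the learnable gain/bias parameters of the real layers are omitted):

```python
import numpy as np

def layer_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # subtract the mean, then scale to unit variance
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def rms_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # no mean subtraction: only divide by the root mean square,
    # so the output keeps the direction of x
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return x / rms

x = np.array([1.0, 2.0, 3.0, 4.0])
print(layer_norm(x))  # zero-mean, unit-variance output
print(rms_norm(x))    # same direction as x, unit RMS
```

The mean subtraction is exactly the component the paper examines geometrically: dropping it (RMSNorm) skips a projection step while leaving the scaling behavior intact.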
Is Tokenization Needed for Masked Particle Modelling? (20:39)
This paper enhances masked particle modeling (MPM) for high-energy physics, improving performance through better implementation and a powerful decoder, outperforming previous methods in various jet physics tasks. https://arxiv.org/abs//2409.12589
Finetuning Language Models to Emit Linguistic Expressions of Uncertainty (12:41)
https://arxiv.org/abs//2409.12180 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning (26:23)
Chain-of-thought prompting enhances reasoning in large language models, particularly for math and logic tasks, but shows limited benefits for other tasks, suggesting a need for new computational paradigms. https://arxiv.org/abs//2409.12183
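The comparison at issue is between answer-only prompting and prompting that elicits intermediate reasoning steps. A minimal sketch of the two prompt styles (illustrative strings only, not the paper's exact prompts):

```python
def direct_prompt(question: str) -> str:
    # answer-only baseline: ask for the final answer immediately
    return f"Q: {question}\nA: The answer is"

def cot_prompt(question: str) -> str:
    # chain-of-thought: elicit intermediate reasoning before the answer
    return f"Q: {question}\nA: Let's think step by step."

q = "A train travels 60 km in 1.5 hours. What is its average speed?"
print(direct_prompt(q))
print(cot_prompt(q))
```

The paper's finding is that swapping the first style for the second helps substantially on math- and logic-like questions but changes little elsewhere.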
On the limits of agency in agent-based models (19:50)
AgentTorch is a framework that enhances agent-based modeling by using large language models to simulate millions of agents, demonstrating its utility in analyzing complex systems like the COVID-19 pandemic. https://arxiv.org/abs//2409.10568
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models (15:25)
Promptriever is a novel retrieval model that follows instructions, achieving state-of-the-art performance and improved robustness, demonstrating the potential of prompting in information retrieval. https://arxiv.org/abs//2409.11136
Finetuning CLIP to Reason about Pairwise Differences (16:47)
This paper enhances CLIP's contrastive learning by aligning image embeddings with text descriptions, improving image ranking and zero-shot classification, and introduces comparative prompting for better performance and geometric properties. https://arxiv.org/abs//2409.09721
Think Twice Before You Act: Improving Inverse Problem Solving With MCMC (11:16)
The paper introduces Diffusion Posterior MCMC (DPMC), an improved algorithm for solving inverse problems using pretrained diffusion models, outperforming existing methods and reducing errors in high-noise scenarios. https://arxiv.org/abs//2409.08551
Explaining Datasets in Words: Statistical Models with Natural Language Parameters (18:38)
The paper introduces interpretable statistical models using natural language predicates, optimizing parameters with a model-agnostic algorithm, applicable across various domains for enhanced data understanding and explanation. https://arxiv.org/abs//2409.08466
This paper argues that hallucinations in Large Language Models are inevitable due to their mathematical structure, introducing "Structural Hallucinations" and challenging the belief that they can be eliminated. https://arxiv.org/abs//2409.05746
[QA] PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation (7:01)
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation (7:21)
We present a benchmark for assessing language models' role-playing abilities through dynamic conversations, utilizing player, interrogator, and judge models, validated by experiments comparing automated and human evaluations. https://arxiv.org/abs//2409.06820
LLaMA-Omni: Seamless Speech Interaction with Large Language Models (21:20)
LLaMA-Omni is a novel model for real-time speech interaction with LLMs, offering low-latency, high-quality responses without transcription, built on a new dataset of 200K speech instructions. https://arxiv.org/abs//2409.06666
WINDOWS AGENT ARENA: Evaluating Multi-Modal OS Agents at Scale (16:53)
The WINDOWSAGENTARENA introduces a scalable benchmark for evaluating multi-modal agents in a real Windows environment, demonstrating enhanced performance through the Navi agent across diverse tasks. https://arxiv.org/abs//2409.08264
Deep Schema Grounding (DSG) enhances vision-language models' ability to interpret visual abstractions by using structured representations, improving reasoning and understanding of abstract concepts in images. https://arxiv.org/abs//2409.08202