[QA] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
This paper introduces Tar, a multimodal framework integrating visual understanding and generation through a shared semantic representation, enhancing efficiency and performance in cross-modal tasks.
https://arxiv.org/abs/2506.18898
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
2421 episodes
All episodes
[QA] Inverse Scaling in Test-Time Compute 7:35
[QA] The Invisible Leash: Why RLVR May Not Escape Its Origin 8:26
The Invisible Leash: Why RLVR May Not Escape Its Origin 21:49
[QA] Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination 8:49
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination 22:17
[QA] Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation 7:58
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation 27:15
[QA] AGENTSNET: Coordination and Collaborative Reasoning in Multi-Agent LLMs 7:37
AGENTSNET: Coordination and Collaborative Reasoning in Multi-Agent LLMs 19:47
[QA] Should We Still Pretrain Encoders with Masked Language Modeling? 8:09
Should We Still Pretrain Encoders with Masked Language Modeling? 16:52
[QA] Token Bottleneck: One Token to Remember Dynamics 7:30
Token Bottleneck: One Token to Remember Dynamics 16:06
[QA] A Systematic Analysis of Hybrid Linear Attention 7:55
A Systematic Analysis of Hybrid Linear Attention 15:40
[QA] First Return, Entropy-Eliciting Explore 7:43
First Return, Entropy-Eliciting Explore 21:32
[QA] Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs 8:31
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs 15:32
[QA] Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving 8:09
Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving 21:33
[QA] Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful 7:03
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful 18:57
[QA] The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation 7:35
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation 23:36
[QA] Cascade: Token-Sharded Private LLM Inference 7:04
Cascade: Token-Sharded Private LLM Inference 35:03
[QA] Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data 7:28
Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data 10:15
[QA] Strategic Intelligence in Large Language Models Evidence from evolutionary Game Theory. 7:21
Strategic Intelligence in Large Language Models Evidence from evolutionary Game Theory. 34:06
[QA] Fast and Simplex: 2-Simplicial Attention in Triton 7:28
Fast and Simplex: 2-Simplicial Attention in Triton 17:55
[QA] Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 7:21
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 15:33
[QA] DABstep: Data Agent Benchmark for Multi-step Reasoning 7:54
DABstep: Data Agent Benchmark for Multi-step Reasoning 16:50
[QA] Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 8:16