[QA] RL, but don't do anything I wouldn't do
The paper critiques KL regularization in reinforcement learning, showing it fails with Bayesian predictive models, and proposes a new principle to better control advanced RL agent behavior.
https://arxiv.org/abs/2410.06213
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1581 episodes