[QA] RL, but don't do anything I wouldn't do

Arxiv Papers

Content provided by Igor Melnyk. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Igor Melnyk or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ar.player.fm/legal.

3d ago 7:43

The paper critiques KL regularization in reinforcement learning, showing it fails with Bayesian predictive models, and proposes a new principle to better control advanced RL agent behavior.

https://arxiv.org/abs/2410.06213

YouTube: https://www.youtube.com/@ArxivPapers

TikTok: https://www.tiktok.com/@arxiv_papers

Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016

Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers

--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support

1581 episodes

#Science #Igor Melnyk