16: Infini-Attention: Google's Solution for Infinite Memory in LLMs
Content provided by Deeper Insights. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Deeper Insights or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ar.player.fm/legal.
In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques welcome Leticia Fernandes, Senior Data Scientist and Generative AI Ambassador at Deeper Insights. Together, they explore the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google. The paper addresses the challenge of letting large language models process arbitrarily long context and introduces the Infini-attention method to do so. The trio discusses how the approach works, including how it uses linear attention and a compressive memory that stores key-value pairs, enabling models to handle extensive contexts.
We also extend a special thank you to the research team at Google for developing this month’s paper. If you are interested in reading the paper for yourself, please follow this link: https://arxiv.org/pdf/2404.07143.pdf
For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at thepaperclub@deeperinsights.com.