Artwork

Player FM - Internet Radio Done Right
Checked 5d ago
Added five weeks ago
Content provided by Red Hat. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Red Hat or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ar.player.fm/legal.

Scaling AI inference with open source ft. Brian Stevens

29:39
 
 

Manage episode 486763187 series 3668811
Explore the future of enterprise AI with Red Hat's SVP and AI CTO, Brian Stevens. In this episode, we delve into how AI is being practically reimagined for real-world business environments, focusing on the pivotal shift to production-quality inference at scale and the transformative power of open source. Brian Stevens shares his expertise and unique perspective on:
• The evolution of AI from experimental stages to essential, production-ready enterprise solutions.
• Key lessons from the early days of enterprise Linux and their application to today’s AI inference challenges.
• The critical role of projects like vLLM in optimizing AI models and creating a common, efficient inference stack for diverse hardware.
• Innovations in GPU-based inference and distributed systems (like KV cache) that enable AI scalability.
Tune in for a deep dive into the infrastructure and strategies making enterprise AI a reality. Whether you're a seasoned technologist, an AI practitioner, or a leader charting your company's AI journey, this discussion will provide valuable insights into building an accessible, efficient, and powerful AI future with open source.

3 episodes


All episodes

 
Explore what it takes to run massive language models efficiently with Red Hat's Senior Principal Software Engineer in AI Engineering, Nick Hill. In this episode, we go behind the headlines to uncover the systems-level engineering making AI practical, focusing on the pivotal challenge of inference optimization and the transformative power of the vLLM open-source project. Nick Hill shares his experiences working in AI, including:
• The evolution of AI optimization, from early handcrafted systems like IBM Watson to the complex demands of today's generative AI.
• The critical role of open-source projects like vLLM in creating a common, efficient inference stack for diverse hardware platforms.
• Key innovations like PagedAttention that solve GPU memory fragmentation and manage the KV cache for scalable, high-throughput performance.
• How the open-source community is rapidly translating academic research into real-world, production-ready solutions for AI.
Join us to explore the infrastructure and optimization strategies making large-scale AI a reality. This conversation is essential for any technologist, engineer, or leader who wants to understand the how and why of AI performance. You’ll come away with a new appreciation for the clever, systems-level work required to build a truly scalable and open AI future.
 
Scaling AI inference with open source ft. Brian Stevens
 
Ready to go deeper into the ever-evolving landscape of technology? Technically Speaking, hosted by Red Hat CTO and SVP of Global Engineering Chris Wright, is back and reimagined to guide you with more depth and candor. This series cuts through the noise, offering insightful, casual, and now even more in-depth conversations with leading experts from across the globe. Each discussion delves further into new and emerging technologies, helping you understand not just the 'what,' but the 'why' and 'how' these advancements will impact long-term strategic developments for your company and your career. From AI and open source innovation to cloud computing and beyond, Chris Wright and his guests humanize technology, providing an unparalleled insider’s look at what’s next with enhanced detail and open discussion.
The revamped "Technically Speaking with Chris Wright" champions innovation and thought leadership, blending even deeper-dive discussions with updates on the latest tech news. Tune in for richer insights on complex topics, explore varied perspectives with greater nuance, and equip yourself to shape the future of technology. Discover how to turn today's emerging tech into tomorrow's strategic advantage.


 
