For data-hungry tech companies, YouTube is a gold mine
Companies competing in the chatbot wars are using something known in the industry as “the Pile” to train their large language models. It’s a trove of open-source data made up of text scraped from all around the internet, including Wikipedia and the European Parliament. Annie Gilbertson, investigative reporter for Proof News, recently took a deep dive into the Pile and discovered something else: a dataset called “YouTube Subtitles.” Marketplace’s Lily Jamali spoke with Gilbertson about her investigation and how YouTube creators feel about their content being used without their consent.
864 episodes