Suhail Doshi: The Future of Computer Vision
Episode 123
I spoke with Suhail Doshi about:
* Why benchmarks aren’t prepared for tomorrow’s AI models
* How he thinks about artists in a world with advanced AI tools
* Building a unified computer vision model that can generate, edit, and understand pixels.
Suhail is a software engineer and entrepreneur known for founding Mixpanel, Mighty Computing, and Playground AI (they’re hiring!).
Reach me at editor@thegradient.pub for feedback, ideas, and guest suggestions.
Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter
Outline:
* (00:00) Intro
* (00:54) Ad read — MLOps conference
* (01:30) Suhail is *not* in pivot hell but he *is* all-in on 50% AI-generated music
* (03:45) AI and music, similarities to Playground
* (07:50) Skill vs. creative capacity in art
* (12:43) What we look for in music and art
* (15:30) Enabling creative expression
* (18:22) Building a unified computer vision model, underinvestment in computer vision
* (23:14) Enhancing the aesthetic quality of images: color and contrast, benchmarks vs user desires
* (29:05) “Benchmarks are not prepared for how powerful these models will become”
* (31:56) Personalized models and personalized benchmarks
* (36:39) Engaging users and benchmark development
* (39:27) What a foundation model for graphics requires
* (45:33) Text-to-image is insufficient
* (46:38) DALL-E 2 and Imagen comparisons, FID
* (49:40) Compositionality
* (50:37) Why Playground focuses on images vs. 3D, video, etc.
* (54:11) Open source and Playground’s strategy
* (57:18) When to stop open-sourcing?
* (1:03:38) Suhail’s thoughts on AGI discourse
* (1:07:56) Outro
Links:
* Suhail on Twitter
Get full access to The Gradient at thegradientpub.substack.com/subscribe