14: Building a Character-Based Text Classifier

The Data Life Podcast

Content provided by Sanket Gupta. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Sanket Gupta or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ar.player.fm/legal.

5y ago 23:20


Ever wonder how to automatically detect language from a script? How does Google do it?

Ever wonder how Amazon knows whether you are searching for a product or a SKU on its search bar?

We look into character-based text classifiers in this episode. We cover two types of models. The first is bag-of-words models such as Naive Bayes, logistic regression, and a vanilla neural network. The second is sequence models such as LSTMs, including how to prepare your characters for them: one-hot encoding, padding, and creating character embeddings that are then fed into the LSTM. We also cover how to set up and compile these sequence models.
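To make the two input styles concrete, here is a minimal sketch in plain Python of how characters might be prepared for each family of model. It is not code from the episode: the helper names (`build_vocab`, `bag_of_chars`, `encode_and_pad`, `one_hot`), the reserved `PAD` and `UNK` indices, and the toy texts are all assumptions made for illustration.

```python
from collections import Counter

PAD = 0  # index reserved for padding (an assumption of this sketch)
UNK = 1  # index reserved for characters missing from the vocabulary


def build_vocab(texts):
    """Map each distinct character to an integer index, starting at 2."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def bag_of_chars(text, vocab):
    """Order-free count vector, one slot per index (the PAD slot stays 0)."""
    counts = Counter(vocab.get(ch, UNK) for ch in text)
    return [counts.get(i, 0) for i in range(len(vocab) + 2)]


def encode_and_pad(text, vocab, max_len):
    """Integer-encode characters, truncate to max_len, right-pad with PAD."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def one_hot(ids, vocab_size):
    """Turn a list of indices into a list of one-hot rows."""
    return [[1 if j == i else 0 for j in range(vocab_size)] for i in ids]


texts = ["hola", "hello"]          # toy corpus, purely illustrative
vocab = build_vocab(texts)         # {'a': 2, 'e': 3, 'h': 4, 'l': 5, 'o': 6}
vocab_size = len(vocab) + 2        # plus the PAD and UNK slots

counts = bag_of_chars("hello", vocab)             # input for a bag-style model
padded = encode_and_pad("hello", vocab, max_len=8)  # input for a sequence model
matrix = one_hot(padded, vocab_size)              # one row per time step
```

The count vector discards character order, which is what models like Naive Bayes or logistic regression consume. The padded integer sequence keeps order and a fixed length, so it could be passed to an embedding layer or, in one-hot form, directly to an LSTM in whichever deep learning library you use.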

Thanks for listening, and if you find this content useful, please leave a review and consider supporting this podcast from the link below.

--- Send in a voice message: https://podcasters.spotify.com/pod/show/the-data-life-podcast/message Support this podcast: https://podcasters.spotify.com/pod/show/the-data-life-podcast/support

27 episodes

#Tech #Sanket Gupta #Data Science