Big Models, Small Pitfalls: My random walk towards building sequence-to-sequence models

Shreya Khurana

Hi, I'm a data scientist on the Domain Search team at GoDaddy, where we research, develop, and deploy deep learning models for GoDaddy's domain business. I enjoy working with data in general and building NLP models in particular. I'm a Python enthusiast and enjoy sharing what I've learned with the community - I've previously presented at the Grace Hopper Conference, PyCon US, EuroPython, and GeoPython.

    Abstract

    From Googling reviews of that new deli place to watching Parasite with subtitles, sequence-to-sequence learning is behind a variety of applications: machine translation, speech recognition, chatbots, and more. In this talk, I share my experiences modeling real-world, messy data with attention-based transformer and CNN models - the good, the bad, and the ugly. I'll discuss some of the lessons we learned through experimentation with sequence-to-sequence models: about architecture size, vocabulary, the relation between validation and training error, and so on. Working in ML makes you realize one thing: it's not always black and white, and it doesn't have to be a black box! The process of building good NLP models is not a straight line. Not-so-little decisions can require tons of experiments, but in this talk I share my modeling experience and lessons learned so your walk isn't as random as mine was!
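To make the abstract concrete, here is a minimal, hypothetical sketch - not code from the talk - of the kind of experiment it alludes to: a tiny sequence-to-sequence transformer (assuming PyTorch's nn.Transformer) trained on a toy copy task while logging training versus validation loss, the signal the talk mentions. The model sizes, task, and names are all illustrative assumptions.

```python
# Illustrative sketch only (not the speaker's code): a tiny seq2seq
# transformer on a toy copy task, watching train vs. validation loss.
import torch
import torch.nn as nn

VOCAB, D_MODEL, SEQ_LEN = 32, 64, 10  # toy sizes, chosen arbitrarily

class TinySeq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=4, num_encoder_layers=2,
            num_decoder_layers=2, dim_feedforward=128, batch_first=True)
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt):
        # Causal mask so each target position only attends to the past.
        mask = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt), tgt_mask=mask)
        return self.out(h)

def make_batch(n=64):
    # Toy task: predict the next source token given the full source.
    src = torch.randint(1, VOCAB, (n, SEQ_LEN))
    return src, src[:, :-1], src[:, 1:]  # source, decoder input, labels

model = TinySeq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

val_src, val_in, val_lbl = make_batch()  # held-out batch
for step in range(200):
    model.train()
    src, tgt_in, labels = make_batch()
    loss = loss_fn(model(src, tgt_in).reshape(-1, VOCAB), labels.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 50 == 0:
        model.eval()
        with torch.no_grad():
            val_loss = loss_fn(model(val_src, val_in).reshape(-1, VOCAB),
                               val_lbl.reshape(-1))
        # A widening gap between the two curves would suggest overfitting.
        print(f"step {step}: train {loss:.3f}  val {val_loss:.3f}")
```

The talk's point is that decisions like d_model, layer count, and vocabulary size in a sketch like this can each require many such runs to get right.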

    Location: R2
    Date: Day 2 • 11:15-11:45 (GMT+8)
    Language: English talk
    Level: Intermediate
    Category: Natural Language Processing