Malav Shah, DIRECTV
Malav Shah is a data scientist II at DIRECTV. He joined DIRECTV from AT&T, where he worked on multiple consumer businesses — including broadband, wireless, and video — and implemented machine learning (ML) models for a wide variety of use cases spanning the entire customer lifecycle, from acquisition to retention. Malav holds a master's degree in computer science with a specialization in machine learning from Georgia Tech, a degree he puts to good use every day at DIRECTV applying modern ML techniques to help the company deliver innovative entertainment experiences.
Can you outline your career path and why you first started with machine learning?
It has been an interesting journey. During my undergraduate years, I actually studied information technology, so most of my courses weren't about machine learning at first. Around my junior year, I took an AI course where we learned about Turing machines, and that really got me interested in the world of artificial intelligence. Even then, I knew I had found my calling. I started taking extra classes on top of my usual coursework and eventually took on a capstone project to build a model that predicted outliers in medical diagnosis and prognosis, which opened my eyes to the power of machine learning. I did my master's degree at Georgia Tech specializing in machine learning, taking a variety of courses from data and visual analytics to an AI class taught by Thad Starner, a former Google Glass technical lead. After graduation, I took my first role at AT&T, where I spent about a year and a half in the Chief Data Officer's organization building acquisition and retention models for the company's broadband product. In July 2020, I joined a new organization within DIRECTV as part of the team responsible for all things data science, with a say in how we build ML infrastructure and our MLOps pipeline across the organization. Being in a centralized data organization where I could influence not only my team but also other teams was a big motivator to join DIRECTV.
What attracted you to your current position?
While completing my master's degree, I did an internship at AT&T. Although the internship focused on the broadband product, I also touched on wireless and streaming video – things I used every day as a consumer. After graduation, most of the other roles I was offered were in software engineering or ML engineering, but AT&T offered me a position as a data scientist. Being a data scientist, thinking about how to conduct research and how to solve problems, turned out to be what attracted me in the end.
That role led directly to an opportunity to be part of a new chapter in video streaming, building on a nearly 30-year legacy at DIRECTV. The opportunity to build and define new cloud infrastructure and machine learning tooling at such an early stage in my career is exciting. I don't think I could get this much exposure to so many levels of executives anywhere else.
How is the machine learning organization at DIRECTV structured – is there a central ML team, or are data scientists mostly attached to the product or business teams?
Our team within DIRECTV acts as a center of excellence. Our responsibilities are twofold. The primary responsibility is to help solve problems and develop solutions for stakeholders from marketing, customer experience (CX), and other teams. For example, we can help build a model from scratch and get it into production for the marketing team before handing it over to their data scientists to own – keeping them in charge of the day-to-day business while we offer ongoing model updates as new requirements emerge. The second part of our team's job is to define the infrastructure these teams will use so that they have the tools and technologies they need to effectively create and deploy machine learning models. Our team is also responsible for defining best practices for ML development and deployment across the organization. That's why we're always looking for ways to improve our existing ML pipelines based on our strategy and goals, either by building something ourselves or by looking at the options in the market.
When assessing this infrastructure, how do you decide whether to build or buy? The ML infrastructure landscape has clearly evolved in recent years.
That's an interesting question that came up recently in the context of reviewing ML observability platforms like Arize. In general, we look at business value first to make sure that each new opportunity will actually deliver value to the organization. Then we look at how quickly we need the capability, the time it would take to build in-house, the capabilities we could build ourselves versus what a vendor offers, and finally the cost to buy versus build. This evaluation process takes up quite a bit of our time, but it has proven effective in delivering maximum return on investment for the business.
What are your machine learning use cases?
First of all, DIRECTV does a lot of structured data modeling. For example, we're working with our customer experience team to build a net promoter score (NPS) detractor model that helps us deliver better experiences to customers having issues with our service. We also work with our marketing stakeholders to build models for personalized customer offers and for forecasting both short-term and long-term churn.
Another area of interest is content intelligence – not analytics, but intelligence. In content intelligence, one of our main focus areas is building a recommendation engine for the various carousels that customers see on the DIRECTV product. We are also starting to get to grips with computer vision and natural language processing (NLP) models. Arize's launch of embedding monitoring for image and NLP models is something we will probably need as we start working more with unstructured data in the coming year.
In the last few years alone, so much has changed in the media landscape. Do you see an increase in things like concept drift?
Consumption definitely skyrocketed after the pandemic hit. With people stuck in their homes, churn decreased across the industry. With people working from home, these habits may have some staying power — and not just in rural areas where satellite TV is already a leader. Another trend in the streaming industry is a historic increase in sports viewership compared to 2019 (you shouldn't really compare against 2020 or 2021 given compressed sports schedules and canceled events). Sports fan engagement is also becoming a big trend as more streaming services get involved in sports and add interactivity, such as letting people place bets from the TV. With these ever-changing consumption patterns, it becomes more important for us to monitor things like concept drift and feature drift so that we can address model performance issues immediately.
What are some of the challenges you face once models go into production, and why is model monitoring important?
In the video industry, behavior changes rapidly. If you only catch drift a month later, it can hurt model performance and lead to lost business value. That's one of the main reasons I think real-time ML monitoring is so important in MLOps. If my model drifted this morning, I should know that second. If my predictions have drifted, or if there's a feature anomaly, or if a feature is coming in blank, I don't want to wait a week for an analyst to catch it – ideally, I want to know before a week's worth of predictions is out in the field.
Models are never perfect; they will always drift based on changing behavior, changing data or changing source systems. Having a centralized monitoring platform like Arize is hugely beneficial.
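To make the idea of drift detection concrete, here is a minimal sketch of the kind of check a monitoring platform automates: a population stability index (PSI) comparing a production score distribution against a training-time baseline. The bucket count, alert threshold, and sample data are illustrative assumptions, not DIRECTV's or Arize's actual implementation.

```python
import numpy as np

def psi(baseline: np.ndarray, production: np.ndarray, buckets: int = 10) -> float:
    """Population Stability Index between a baseline and a production sample.

    Buckets come from quantiles of the baseline distribution. A common rule of
    thumb (an assumption here, not a universal standard) is that PSI > 0.2
    signals drift worth investigating.
    """
    # Bucket edges from baseline quantiles; open the ends to catch out-of-range values
    edges = np.quantile(baseline, np.linspace(0, 1, buckets + 1))
    edges[0], edges[-1] = -np.inf, np.inf

    base_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    prod_frac = np.histogram(production, bins=edges)[0] / len(production)

    # Avoid log(0) for empty buckets
    base_frac = np.clip(base_frac, 1e-6, None)
    prod_frac = np.clip(prod_frac, 1e-6, None)

    return float(np.sum((prod_frac - base_frac) * np.log(prod_frac / base_frac)))


# Example: compare this morning's prediction scores against the training baseline
baseline_scores = np.random.beta(2, 5, size=10_000)   # stand-in for training-time scores
production_scores = np.random.beta(2, 3, size=2_000)  # stand-in for today's scores
if psi(baseline_scores, production_scores) > 0.2:
    print("Prediction drift detected -- alert the team instead of waiting a week")
```

In practice a monitoring platform runs checks like this continuously across predictions and features and routes the alert the moment a threshold is crossed, which is the "know that second" behavior described above.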
What advice would you give to people taking on their first role in data science?
One of the things I advise new data science graduates not to do is become obsessed with having perfect metric scores right away. While concentrating on a model statistic like accuracy is important, it's conceptually more important to focus on understanding the underlying data – what the data does, what the data tells you – and to make sure you understand the business impact and the problem you are trying to solve. These basics matter, but people often lose sight of them because they try to build the best model too quickly. Instead, I'd say spend 70 to 80% of your time on the data you put into the model, because garbage in is garbage out. Once you're sure you're not feeding garbage into the model, the rest usually takes care of itself.
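As a rough illustration of what "spend your time on the data" can look like in practice, here is a small sketch of a per-feature sanity report to run before any modeling. The file name, target column, and thresholds are hypothetical, not from the interview.

```python
import pandas as pd

def profile_features(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """Quick per-feature sanity report: missing rates, cardinality, and
    target correlation for numeric features (high values can hint at leakage)."""
    rows = []
    for col in df.columns.drop(target):
        series = df[col]
        rows.append({
            "feature": col,
            "missing_pct": series.isna().mean() * 100,
            "n_unique": series.nunique(dropna=True),
            "constant": series.nunique(dropna=True) <= 1,
            "corr_with_target": (
                series.corr(df[target])
                if pd.api.types.is_numeric_dtype(series) else None
            ),
        })
    return pd.DataFrame(rows).sort_values("missing_pct", ascending=False)


# Hypothetical churn dataset with a 0/1 "churned" column
df = pd.read_csv("churn_training_data.csv")
print(profile_features(df, target="churned").head(10))
```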
Additional advice for new graduates is to pay attention to the wave of data-centric AI tools emerging. These are likely to be the next big thing in machine learning and are worth following closely.
How do you collaborate with business and product leads and link model metrics to business results?
That collaboration happens constantly. Whenever we create models for stakeholders, we meet with them regularly to make sure that what we see matches what they see in the real world. When starting a project, it is critical to ensure that the requirements and data are in place and that you understand the data correctly. I don't even get into what kind of model I'm going to build until the later stages of the development cycle – maybe sprint four or even sprint five. My approach is not to start by deciding what type of model I want to build; I prefer to start with what will drive business value. An in-depth understanding of the data also helps me answer nuanced questions when presenting to business leaders and stakeholders.
How do you view the evolving MLOps and ML infrastructure space?
I think we are moving into a very innovative era in machine learning, with new ML solutions appearing in the industry every week. ML observability is a good example of a space where a lot is happening. Productionizing ML is completely different from productionizing other applications, because other applications have been around for a while – 15 or even 25 years – and have very mature production pipelines, whereas for machine learning all of this is still relatively new. It will be exciting to see how we can make ML deployment, which is a pain point for many teams, simpler and more seamless. Other innovation areas I'll be keeping a close eye on include automated insight generation tools, data-centric AI tools, and how we can further improve the ML infrastructure space where everything lives in the cloud.