Introduction to machine learning problem framing

2024-10-22

Introduction to Machine Learning Problem Framing is a course provided by Google. It teaches you how to determine if machine learning (ML) is a good approach for a problem and explains how to outline an ML solution.

developers.google.com/machine-learning/problem-framing

Objectives

Identify if ML is a good solution for a problem.
Learn how to frame an ML problem.
Understand how to pick the right model and define success metrics.

Problem framing

Problem framing is the process of analyzing a problem to isolate the individual elements that need to be addressed to solve it. Problem framing helps determine your project’s technical feasibility and provides a clear set of goals and success criteria.

At a high level, ML problem framing consists of two distinct steps:

Determining whether ML is the right approach for solving a problem.
Framing the problem in ML terms.

Understand the problem

To understand the problem, perform the following tasks:

State the goal for the product you are developing or refactoring.
Determine whether the goal is best solved using predictive ML, generative AI, or a non-ML solution.
Verify you have the data required to train a model if you’re using a predictive ML approach.

State the goal

“What am I trying to accomplish?”

Examples:

Weather app: Calculate precipitation in six-hour increments for a geographic region.
Fashion app: Generate a variety of shirt designs.
Video app: Recommend useful videos.
Mail app: Detect spam.
Financial app: Summarize financial information from multiple news sources.
Map app: Calculate travel time.
Banking app: Identify fraudulent transactions.
Dining app: Identify cuisine by a restaurant’s menu.
Ecommerce app: Reply to reviews with helpful answers.

Clear use case for ML

You don’t want to implement a complex ML solution when a simpler non-ML solution will work. (But see the Schillace laws.)

Predictive ML and data

To make good predictions, you need data that contains features with predictive power.
Determining which features have predictive power can be a time-consuming process. You can automate the search with measures such as the Pearson correlation coefficient, adjusted mutual information (AMI), and Shapley values, each of which gives a numerical assessment of a feature’s predictive power.
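
As a rough illustration of scoring features numerically, here is a minimal sketch that ranks two candidate features by the absolute value of their Pearson correlation with the label. The feature names and numbers are made-up toy data, not from the course, and Pearson correlation only captures linear relationships, so a low score does not prove a feature is useless.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data for a weather app (assumed, for illustration only).
humidity = [30, 45, 60, 75, 90]          # candidate feature
station_id = [3, 1, 4, 1, 5]             # candidate feature, likely uninformative
rainfall_mm = [0.0, 1.0, 2.5, 4.0, 6.0]  # label to predict

scores = {
    "humidity": pearson(humidity, rainfall_mm),
    "station_id": pearson(station_id, rainfall_mm),
}

# Rank features by the strength of the linear relationship, ignoring its sign.
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
for name in ranked:
    print(f"{name}: {scores[name]:+.3f}")
```

In this toy data humidity ranks above station_id, which matches the intuition that humidity tells you more about rainfall than an arbitrary identifier does.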

Framing an ML problem

You frame a problem in ML terms by completing the following tasks:

Define the ideal outcome and the model’s goal.
Identify the model’s output.
Define success metrics.

Ideal outcome and model’s goal

Weather app
- Ideal outcome: Calculate precipitation in six-hour increments for a geographic region.
- Model’s goal: Predict six-hour precipitation amounts for specific geographic regions.
Fashion app
- Ideal outcome: Generate a variety of shirt designs.
- Model’s goal: Generate three varieties of a shirt design from text and an image, where the text states the style and color and the image is the type of shirt (t-shirt, button-up, polo).
Video app
- Ideal outcome: Recommend useful videos.
- Model’s goal: Predict whether a user will click on a video.
Mail app
- Ideal outcome: Detect spam.
- Model’s goal: Predict whether or not an email is spam.
Financial app
- Ideal outcome: Summarize financial information from multiple news sources.
- Model’s goal: Generate 50-word summaries of the major financial trends from the previous seven days.
Map app
- Ideal outcome: Calculate travel time.
- Model’s goal: Predict how long it will take to travel between two points.
Banking app
- Ideal outcome: Identify fraudulent transactions.
- Model’s goal: Predict if a transaction was made by the cardholder.
Dining app
- Ideal outcome: Identify cuisine by a restaurant’s menu.
- Model’s goal: Predict the type of restaurant.
Ecommerce app
- Ideal outcome: Generate customer support replies about the company’s products.
- Model’s goal: Generate replies using sentiment analysis and the organization’s knowledge base.

The model’s output

“What type of output do I need to solve my problem?”

“Classify an email as spam or not spam.” (Classification model)
“Predict the number of views a video will get.” (Regression model)
“Summarize an article.” (Generative AI)
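
The difference between the first two output types shows up in the return type. The sketch below uses hypothetical function names and a toy formula in place of trained models: a classification output is one label from a fixed set, while a regression output is a continuous number.

```python
from typing import Literal

SpamLabel = Literal["spam", "not spam"]


def classify_email(spam_probability: float, threshold: float = 0.5) -> SpamLabel:
    """Classification output: one label from a fixed set.

    The 0.5 threshold is an assumed default, not a recommendation.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")
    return "spam" if spam_probability >= threshold else "not spam"


def predict_views(base_views: float, growth_per_day: float, days: int) -> float:
    """Regression output: a continuous number.

    This is a toy linear formula standing in for a trained model.
    """
    return base_views + growth_per_day * days


print(classify_email(0.9))             # a label
print(predict_views(100.0, 25.0, 4))   # a number
```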

Define the success metrics

Success metrics define what you care about, like engagement or helping users take appropriate action, such as watching videos that they’ll find useful.

Success metrics differ from the model’s evaluation metrics, like accuracy, precision, recall, or AUC.
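
To make the contrast concrete, evaluation metrics are computed from the model's predictions against known labels, without reference to any product outcome. A minimal sketch for a binary classifier such as a spam detector, using made-up labels (1 = spam, 0 = not spam):

```python
def evaluation_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive class)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Define precision and recall as 0.0 when their denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels and predictions (assumed, for illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]

metrics = evaluation_metrics(y_true, y_pred)
print(metrics)
```

A model can score well on these numbers and still fail the product's success metric, which is why the two are tracked separately.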

Example for the weather app:

Success: Users open the “Will it rain?” feature 50 percent more often than they did before.
Failure: Users open the “Will it rain?” feature no more often than before.
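
The weather app criteria can be written as a simple check on usage before and after launch. This is a sketch with hypothetical function names and inputs. The source defines only success and failure, so the "inconclusive" label for a lift between 0 and 50 percent is an assumption.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change in usage, e.g. 0.5 means 50 percent more often."""
    if before <= 0:
        raise ValueError("baseline usage must be positive")
    return (after - before) / before


def judge_launch(opens_before: float, opens_after: float,
                 success_lift: float = 0.5) -> str:
    """Apply the weather app's success and failure criteria.

    opens_before and opens_after are average opens of the feature per user
    over comparable periods (an assumed measurement, not from the course).
    """
    lift = relative_change(opens_before, opens_after)
    if lift >= success_lift:
        return "success"
    if lift <= 0:
        return "failure"
    return "inconclusive"  # assumed label: the source leaves this range undefined


print(judge_launch(2.0, 3.0))  # 50 percent more often
print(judge_launch(2.0, 2.0))  # no more often than before
print(judge_launch(2.0, 2.5))  # somewhere in between
```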

Implementing a model

Train your own model versus using an already trained model

Monitoring

Summary

Framing a problem in terms of ML is a two-step process:

Verify that ML is a good approach by doing the following:
- Understand the problem.
- Identify a clear use case.
- Understand the data.
Frame the problem in ML terms by doing the following:
- Define the ideal outcome and the model’s goal.
- Identify the model’s output.
- Define success metrics.