ML crash course - Datasets, generalization, and overfitting

The datasets, generalization, and overfitting chapter of Machine Learning Crash Course.

developers.google.com/machine-learning/crash-course/overfitting

Introduction

Data characteristics

Types of data

Quantity of data

Quality and reliability of data

Complete vs. incomplete examples

Labels

Direct versus proxy labels

Human-generated data

Imbalanced datasets

Downsampling and upweighting

Rebalance ratios

Dividing the original dataset

Training, validation, and test sets

Additional problems with test sets

Transforming data

Generalization

Overfitting

Fitting, overfitting, and underfitting

Detecting overfitting

What causes overfitting?

Generalization conditions

Model complexity

Regularization

What is complexity?

L2 regularization

Regularization rate (lambda)

Early stopping: an alternative to complexity-based regularization

Finding equilibrium between learning rate and regularization rate

Interpreting loss curves

What’s next?

2024 © ak