# ML crash course - Linear regression

> The [Linear regression](https://wiki.g15e.com/pages/Linear%20regression.txt) chapter of the [Machine learning crash course](https://wiki.g15e.com/pages/Machine%20learning%20crash%20course.txt).

https://developers.google.com/machine-learning/crash-course/linear-regression

## Introduction

Learning objectives:

- Explain a [loss function](https://wiki.g15e.com/pages/Loss%20function.txt) and how it works.
- Define and describe how [gradient descent](https://wiki.g15e.com/pages/Gradient%20descent.txt) finds the optimal model [parameters](https://wiki.g15e.com/pages/Parameter%20(machine%20learning.txt)).
- Describe how to tune [hyperparameters](https://wiki.g15e.com/pages/Hyperparameter.txt) to efficiently [train](https://wiki.g15e.com/pages/Training%20(machine%20learning.txt)) a [linear model](https://wiki.g15e.com/pages/Linear%20model.txt).

Prerequisites:

- [Introduction to Machine Learning](https://wiki.g15e.com/pages/Introduction%20to%20Machine%20Learning.txt)

Definition of [linear regression](https://wiki.g15e.com/pages/Linear%20regression.txt):

> Linear regression is a statistical technique used to find the relationship between variables. In an ML context, linear regression finds the relationship between [features](https://wiki.g15e.com/pages/Feature%20(machine%20learning.txt)) and a [label](https://wiki.g15e.com/pages/Label%20(machine%20learning.txt)).

### Linear regression equation

In algebraic terms, the model would be defined as $y = mx + b$, where

- $y$ is the value we want to predict.
- $m$ is the slope of the line
- $x$ is our input value
- $b$ is the y-intercept

In ML, we write the equation for a linear regression model as $y' = b + w_1 x_1$, where

- $y'$ is the [predicted label](https://wiki.g15e.com/pages/Prediction%20(machine%20learning.txt)) - the output
- $b$ is the [bias](https://wiki.g15e.com/pages/Bias%20(machine%20learning.txt)) of the [model](https://wiki.g15e.com/pages/Model%20(machine%20learning.txt)). Bias is the same concept as the y-intercept in the algebraic equation for a line. In ML, bias is sometimes referred to as $w_0$. Bias is a [parameter](https://wiki.g15e.com/pages/Parameter%20(machine%20learning.txt)) of the model and is calculated during [training](https://wiki.g15e.com/pages/Training%20(machine%20learning.txt)).
- $w_1$ is the [weight](https://wiki.g15e.com/pages/Weight%20(machine%20learning.txt)) of the feature. Weight is the same concept as the slope $m$ in the algebraic equation for a line. Weight is a [parameter](https://wiki.g15e.com/pages/Parameter%20(machine%20learning.txt)) of the model and is calculated during [training](https://wiki.g15e.com/pages/Training%20(machine%20learning.txt)).
- $x_1$ is a [feature](https://wiki.g15e.com/pages/Feature%20(machine%20learning.txt)) - the input

Models with multiple features have two or more weights, e.g.:

$y' = b + w_1 x_1 + w_2 x_2 + w_3 x_3$

### Key terms

- [Bias](https://wiki.g15e.com/pages/Bias%20(machine%20learning.txt))
- [Feature](https://wiki.g15e.com/pages/Feature%20(machine%20learning.txt))
- [Label](https://wiki.g15e.com/pages/Label%20(machine%20learning.txt))
- [Linear regression](https://wiki.g15e.com/pages/Linear%20regression.txt)
- [Parameter](https://wiki.g15e.com/pages/Parameter%20(machine%20learning.txt))
- [Weight](https://wiki.g15e.com/pages/Weight%20(machine%20learning.txt))

## Loss

Definition:

> [Loss](https://wiki.g15e.com/pages/Loss%20(machine%20learning.txt)) is a numerical metric that describes how wrong a [model](https://wiki.g15e.com/pages/Model%20(machine%20learning.txt))'s [predictions](https://wiki.g15e.com/pages/Prediction%20(machine%20learning.txt)) are. Loss measures the distance between the model's predictions and the actual [labels](https://wiki.g15e.com/pages/Label%20(machine%20learning.txt)). The goal of [training](https://wiki.g15e.com/pages/Training%20(machine%20learning.txt)) a model is to minimize the loss, reducing it to its lowest possible value.

Distance of loss:

> Loss focuses on the distance between the values, not the direction. … Thus, all methods for calculating loss remove the sign.

Types of loss:

- [L1 loss](https://wiki.g15e.com/pages/L1%20loss.txt): The sum of the absolute values of the difference between the predicted values and the actual values.
- [Mean absolute error](https://wiki.g15e.com/pages/Mean%20absolute%20error.txt): The average of L1 losses across a set of examples.
- [L2 loss](https://wiki.g15e.com/pages/L2%20loss.txt): The sum of the squared difference between the predicted values and the actual values.
- [Mean squared error](https://wiki.g15e.com/pages/Mean%20squared%20error.txt): The average of L2 losses across a set of examples.
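The prediction equation and the four loss formulas above can be sketched in plain Python. This is a minimal illustration, not code from the course; all function names and sample values are mine:

```python
def predict(bias, weights, features):
    """y' = b + w1*x1 + w2*x2 + ... for a single example."""
    return bias + sum(w * x for w, x in zip(weights, features))

def l1_loss(predictions, labels):
    """Sum of absolute differences; taking |.| removes the sign."""
    return sum(abs(p - y) for p, y in zip(predictions, labels))

def l2_loss(predictions, labels):
    """Sum of squared differences; squaring also removes the sign."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels))

def mae(predictions, labels):
    """Mean absolute error: L1 loss averaged over the examples."""
    return l1_loss(predictions, labels) / len(labels)

def mse(predictions, labels):
    """Mean squared error: L2 loss averaged over the examples."""
    return l2_loss(predictions, labels) / len(labels)

# A tiny model with bias b = 1.0 and weights w1 = 2.0, w2 = 0.5,
# applied to two examples with two features each.
preds = [predict(1.0, [2.0, 0.5], x) for x in [[1.0, 2.0], [0.0, 4.0]]]
print(preds)                 # -> [4.0, 3.0]
print(mse(preds, [4.0, 2.0]))  # -> 0.5
```

Because squaring amplifies large differences, an outlier example contributes far more to MSE than to MAE, which is the basis of the "choosing a loss" guidance below.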
When processing multiple examples at once, we recommend averaging the losses across all the examples, whether using [MAE](https://wiki.g15e.com/pages/Mean%20absolute%20error.txt) or [MSE](https://wiki.g15e.com/pages/Mean%20squared%20error.txt).

### Choosing a loss

When choosing the best loss function, consider how you want the model to treat [outliers](https://wiki.g15e.com/pages/Outliers.txt). For instance, [MSE](https://wiki.g15e.com/pages/Mean%20squared%20error.txt) moves the model more toward the outliers, while [MAE](https://wiki.g15e.com/pages/Mean%20absolute%20error.txt) doesn't.

### Key terms

- [Mean absolute error](https://wiki.g15e.com/pages/Mean%20absolute%20error.txt)
- [Mean squared error](https://wiki.g15e.com/pages/Mean%20squared%20error.txt)
- [L1 loss](https://wiki.g15e.com/pages/L1%20loss.txt)
- [L2 loss](https://wiki.g15e.com/pages/L2%20loss.txt)
- [Loss](https://wiki.g15e.com/pages/Loss%20(machine%20learning.txt))
- [Outliers](https://wiki.g15e.com/pages/Outliers.txt)
- [Prediction](https://wiki.g15e.com/pages/Prediction%20(machine%20learning.txt))

## Parameters exercise

https://developers.google.com/machine-learning/crash-course/linear-regression/parameters-exercise

## Gradient descent

[Gradient descent](https://wiki.g15e.com/pages/Gradient%20descent.txt) is a mathematical technique that iteratively finds the [weights](https://wiki.g15e.com/pages/Weight%20(machine%20learning.txt)) and [bias](https://wiki.g15e.com/pages/Bias%20(machine%20learning.txt)) that produce the model with the lowest [loss](https://wiki.g15e.com/pages/Loss%20(machine%20learning.txt)).

### Model convergence and loss curves

When [training](https://wiki.g15e.com/pages/Training%20(machine%20learning.txt)) a model, you'll often look at a [loss curve](https://wiki.g15e.com/pages/Loss%20curve.txt) to determine if the model has [converged](https://wiki.g15e.com/pages/Convergence%20(machine%20learning.txt)). The loss curve shows how the loss changes as the model trains.
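The iterative search for the best weight and bias can be sketched for a one-feature model trained with full-batch gradient descent on MSE. This is a hedged sketch under my own toy data and learning rate, not the course's code:

```python
# Gradient descent for y' = b + w * x, minimizing MSE on a toy dataset.
# The gradients of MSE with respect to w and b are:
#   dMSE/dw = (2/N) * sum((y' - y) * x)
#   dMSE/db = (2/N) * sum(y' - y)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]      # generated by y = 2x + 1, so w = 2, b = 1 is optimal

w, b = 0.0, 0.0                 # initial parameters
learning_rate = 0.05            # illustrative value

for step in range(2000):
    preds = [b + w * x for x in xs]
    errors = [p - y for p, y in zip(preds, ys)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    w -= learning_rate * grad_w  # step in the direction that lowers the loss
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))  # -> 2.0 1.0
```

Recording the MSE at each step of this loop and plotting it against the step number would produce exactly the loss curve described above: steep early drops, then a flattening tail as the model converges.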
### Convergence and convex functions

The [loss functions](https://wiki.g15e.com/pages/Loss%20function.txt) for [linear models](https://wiki.g15e.com/pages/Linear%20model.txt) always produce a [convex](https://wiki.g15e.com/pages/Convex%20function.txt) surface. As a result of this property, when a [linear regression](https://wiki.g15e.com/pages/Linear%20regression.txt) model converges, we know the model has found the weights and bias that produce the lowest loss.

### Key terms

- [Convergence](https://wiki.g15e.com/pages/Convergence%20(machine%20learning.txt))
- [Convex function](https://wiki.g15e.com/pages/Convex%20function.txt)
- [Gradient descent](https://wiki.g15e.com/pages/Gradient%20descent.txt)
- [Iteration](https://wiki.g15e.com/pages/Iteration%20(machine%20learning.txt))
- [Loss curve](https://wiki.g15e.com/pages/Loss%20curve.txt)

## Hyperparameters

Definition:

> [Hyperparameters](https://wiki.g15e.com/pages/Hyperparameter.txt) are variables that control different aspects of [training](https://wiki.g15e.com/pages/Training%20(machine%20learning.txt)).
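In a hand-rolled training loop, hyperparameters show up as plain arguments that are fixed before training starts, in contrast to the parameters $w$ and $b$, which the loop itself learns. The sketch below uses mini-batch stochastic gradient descent; the function name, default values, and data are illustrative, not from the course:

```python
import random

def train(xs, ys, learning_rate=0.05, batch_size=2, epochs=500):
    """Mini-batch SGD for y' = b + w * x on MSE. learning_rate,
    batch_size, and epochs are hyperparameters: chosen up front,
    never updated by training itself."""
    w, b = 0.0, 0.0                       # parameters: learned during training
    indices = list(range(len(xs)))
    rng = random.Random(0)                # fixed seed for reproducibility
    for _ in range(epochs):               # one epoch = one full pass over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [(b + w * xs[i]) - ys[i] for i in batch]
            grad_w = 2 * sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = 2 * sum(errors) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Data generated by y = 2x + 1, so training should recover w ≈ 2, b ≈ 1.
w, b = train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Changing any of the three keyword arguments changes how training proceeds: a larger learning rate takes bigger steps (and can overshoot), a batch size of 1 gives classic stochastic gradient descent, and more epochs means more passes over the data.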
Common hyperparameters:

- [Learning rate](https://wiki.g15e.com/pages/Learning%20rate.txt)
- [Batch size](https://wiki.g15e.com/pages/Batch%20size%20(machine%20learning.txt))
- [Epochs](https://wiki.g15e.com/pages/Epoch%20(machine%20learning.txt))

### Key terms

- [Batch size](https://wiki.g15e.com/pages/Batch%20size%20(machine%20learning.txt))
- [Epoch](https://wiki.g15e.com/pages/Epoch%20(machine%20learning.txt))
- [Generalization](https://wiki.g15e.com/pages/Generalization%20(machine%20learning.txt))
- [Hyperparameter](https://wiki.g15e.com/pages/Hyperparameter.txt)
- [Iteration](https://wiki.g15e.com/pages/Iteration%20(machine%20learning.txt))
- [Learning rate](https://wiki.g15e.com/pages/Learning%20rate.txt)
- [Parameter](https://wiki.g15e.com/pages/Parameter%20(machine%20learning.txt))
- [Stochastic gradient descent](https://wiki.g15e.com/pages/Stochastic%20gradient%20descent.txt)

## Gradient descent exercise

https://developers.google.com/machine-learning/crash-course/linear-regression/gradient-descent-exercise

## Programming exercise

https://developers.google.com/machine-learning/crash-course/linear-regression/programming-exercise

## What's next

- [ML crash course - Logistic regression](https://wiki.g15e.com/pages/ML%20crash%20course%20-%20Logistic%20regression.txt)