# Log loss

> The loss function used in binary logistic regression.[^1]

The [loss function](https://wiki.g15e.com/pages/Loss%20function.txt) used in binary logistic regression.[^1]

$$
\text{Log Loss} = \sum_{(x,y) \in D} -y \log(y') - (1-y) \log(1-y')
$$

where

- $(x, y) \in D$ is the [dataset](https://wiki.g15e.com/pages/Dataset%20(machine%20learning).txt) containing many labeled examples, which are $(x, y)$ pairs.
- $y$ is the label in a [labeled example](https://wiki.g15e.com/pages/Labeled%20example.txt). Since this is logistic regression, every value of $y$ must be either 0 or 1.
- $y'$ is the predicted value (somewhere between 0 and 1, exclusive), given the set of features in $x$.

## Rationale

[Squared loss](https://wiki.g15e.com/pages/L2%20loss.txt) works well for a [linear regression](https://wiki.g15e.com/pages/Linear%20regression.txt), where the rate of change of the output values is constant. However, the rate of change of a logistic regression model is not constant. If you used squared loss to calculate errors for the sigmoid function, then as the output got closer and closer to 0 and 1, you would need more memory to preserve the precision needed to track these values. Instead, the loss function for logistic regression is [log loss](https://wiki.g15e.com/pages/Log%20loss.txt). The log loss equation returns the logarithm of the magnitude of the change, rather than just the distance from data to prediction.[^2]

## Footnotes

[^1]: https://developers.google.com/machine-learning/glossary#Log_Loss
[^2]: [ML crash course - Logistic regression](https://wiki.g15e.com/pages/ML%20crash%20course%20-%20Logistic%20regression.txt)
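
## Example

A minimal Python sketch of the summation above, as a rough illustration; the function name `log_loss` and the sample (label, prediction) pairs are made up for this example, not taken from the source.

```python
import math

def log_loss(examples):
    """Sum of -y*log(y') - (1-y)*log(1-y') over (y, y') pairs."""
    total = 0.0
    for y, y_pred in examples:
        # y must be 0 or 1; y_pred must lie strictly between 0 and 1.
        total += -y * math.log(y_pred) - (1 - y) * math.log(1 - y_pred)
    return total

# Confident, correct predictions yield a small loss ...
print(log_loss([(1, 0.9), (0, 0.1)]))  # ~0.211
# ... while confident, wrong predictions are penalized heavily.
print(log_loss([(1, 0.1), (0, 0.9)]))  # ~4.605
```

Note how the loss grows without bound as a prediction for a positive example approaches 0 (or for a negative example approaches 1), which is the behavior the Rationale section contrasts with squared loss.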