Log loss

The loss function used in binary logistic regression.[1]

$$\text{Log Loss} = \sum_{(x,y) \in D} -y \log(y') - (1-y) \log(1-y')$$

where

  • $(x, y) \in D$ is the dataset containing many labeled examples, which are $(x, y)$ pairs.
  • $y$ is the label in a labeled example. Since this is logistic regression, every value of $y$ must be either 0 or 1.
  • $y'$ is the predicted value (somewhere between 0 and 1, exclusive), given the set of features in $x$.
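
A minimal sketch of this sum in plain Python (the example data here is made up for illustration):

```python
import math

def log_loss(examples):
    """Sum of per-example log loss over (y, y_pred) pairs.

    y must be 0 or 1; y_pred must lie strictly between 0 and 1.
    """
    total = 0.0
    for y, y_pred in examples:
        total += -y * math.log(y_pred) - (1 - y) * math.log(1 - y_pred)
    return total

# Hypothetical labeled examples: (label, predicted probability)
data = [(1, 0.9), (0, 0.2), (1, 0.6)]
print(log_loss(data))  # ≈ 0.8393
```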

Rationale

Squared loss works well for linear regression, where the rate of change of the output values is constant. The rate of change of a logistic regression model, however, is not constant.

If you used squared loss to calculate errors for the sigmoid function, then as the output got closer and closer to 0 and 1, you would need more and more memory to preserve the precision needed to track these values.

Instead, the loss function for logistic regression is log loss. The log loss equation returns the logarithm of the magnitude of the change, rather than just the distance from data to prediction.[2]
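
A rough numeric illustration of that point: for a positive example ($y = 1$), squared loss saturates near 1 as the prediction drifts toward the wrong extreme, while log loss keeps growing with each order of magnitude (the prediction values below are made up):

```python
import math

# For a positive example (y = 1), compare losses as the prediction
# drifts toward the wrong extreme. Squared loss saturates near 1,
# while log loss grows with each order of magnitude.
for y_pred in (0.1, 0.01, 0.001, 0.0001):
    squared = (1 - y_pred) ** 2
    logloss = -math.log(y_pred)
    print(f"y'={y_pred:<7} squared={squared:.4f} log loss={logloss:.2f}")
```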

Footnotes

  1. developers.google.com/machine-learning/glossary#Log_Loss

  2. ML crash course - Logistic regression
