Hi,
I’m having trouble replicating the log loss that Keras calculates. I understand log loss / cross-entropy and know how it is calculated, yet something is going terribly wrong.
According to Keras, the cross-entropy on my validation set is around 0.0xxx. But when I use sklearn.metrics.log_loss with the one-hot encoded labels and my network's predictions, I get a value of 3.87xxx. For y_true I pass a one-hot encoded array of shape (n_samples, n_classes), and y_pred is the array of corresponding softmax predictions.
So:
val_predictions[:5]
>>>array([[  9.77910817e-01,   2.20891740e-02],
          [  9.98937905e-01,   1.06205838e-03],
          [  9.99959946e-01,   4.00941935e-05],
          [  9.99999404e-01,   5.46007016e-07],
          [  9.99951005e-01,   4.89662743e-05]], dtype=float32)
y_val[:5]
>>>array([[ 1.,  0.],
          [ 1.,  0.],
          [ 1.,  0.],
          [ 1.,  0.],
          [ 1.,  0.]])
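For context, here is roughly how both numbers come about (a minimal sketch, assuming the model is compiled with categorical_crossentropy to match the one-hot labels; the optimizer, metric, and the x_train/x_val names are placeholders, and the actual architecture is omitted):

# Sketch only; the real model definition is not shown here.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # matches the one-hot labels
              metrics=['accuracy'])

# During fit(), Keras reports a val_loss of roughly 0.0xxx:
model.fit(x_train, y_train, validation_data=(x_val, y_val))

# Softmax predictions on the same validation data:
val_predictions = model.predict(x_val)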
I was able to replicate sklearn's log loss with numpy, which made me wonder even more.
With numpy I used:
-np.mean(np.sum(y_val * np.log(val_predictions), axis=1))
and with sklearn:
log_loss(y_val, val_predictions)
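Put together as a runnable check (y_val and val_predictions are the arrays shown above; the eps clipping is only there to guard against log(0), which sklearn's log_loss also does internally):

import numpy as np
from sklearn.metrics import log_loss

eps = 1e-15  # guard against log(0); sklearn clips predictions similarly
p = np.clip(val_predictions, eps, 1 - eps)

# Categorical cross-entropy: mean over samples of -sum_k y_k * log(p_k)
manual = -np.mean(np.sum(y_val * np.log(p), axis=1))

print(manual)                            # ~3.87 for me
print(log_loss(y_val, val_predictions))  # agrees with the manual value

The two agree with each other, but not with the ~0.0xxx that Keras reports.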
I appreciate your help.