Logloss is (I think?) the same as categorical cross-entropy. As discussed around lessons 1 and 2, it is necessary to clip the probabilities, because logloss penalises confidently wrong predictions (probabilities near 0 or 1) almost without bound. I created a bespoke loss function that clips the probabilities before computing the loss, as it seemed sensible to train on the same metric used for the submission. In practice I am not sure whether it makes any difference.
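To see why the clipping matters: the per-sample logloss is just the negative log of the probability assigned to the true class, so a near-zero probability on the correct answer costs an enormous amount. A quick numeric check (the 0.02 threshold is purely illustrative):

import numpy as np

# An almost-zero probability on the true class costs ~34.5 per sample,
# while clipping at 0.02 caps the worst case at ~3.9.
print(-np.log(1e-15))  # ~34.54
print(-np.log(0.02))   # ~3.91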
import numpy as np
from keras.metrics import categorical_crossentropy

def do_clip(arr, mx):
    # clip probabilities to [mx, 1-mx], then renormalise each row to sum to 1
    clipped = arr.clip(mx, 1 - mx)
    return clipped / clipped.sum(axis=1)[:, np.newaxis]

def logloss(ytrue, ypred):
    return categorical_crossentropy(ytrue, do_clip(ypred, clip))  # `clip` is a module-level threshold, e.g. 0.02
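A minimal sketch of what do_clip does to a batch of predictions; the clip value of 0.02 and the toy probabilities here are just illustrations (note that using logloss as a Keras training loss relies on the backend's tensors supporting the numpy-style .clip and indexing above, as Theano's did):

import numpy as np

clip = 0.02  # hypothetical threshold, worth tuning against the leaderboard
preds = np.array([[0.999, 0.001, 0.000],
                  [0.100, 0.700, 0.200]])
print(do_clip(preds, clip))
# first row becomes roughly [0.961, 0.020, 0.020] after clipping and renormalising;
# second row is already inside [0.02, 0.98], so it is unchanged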