Meet Mish: New Activation function, possible successor to ReLU?

As a reminder, @grankin reran the Imagewoof baseline for 5 epochs and got 61.25% averaged over 20 runs, which is higher than what you got with Mish.
The true baseline for 20 epochs is most likely higher than the figure on the leaderboard as well.

I’ve explained the issue with the leaderboard here: “ImageNette/Woof Leaderboards - guidelines for proving new high scores?”

Accuracy on Imagewoof/Imagenette also varies a lot from run to run, so I wouldn’t rush to a conclusion based on a single run, especially one that is compared to a wrongly measured baseline.
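
To make the variance point concrete, here is a rough sketch of the kind of comparison I have in mind: run both setups several times, then look at the mean, the spread, and a Welch t-statistic rather than two single numbers. The accuracy lists below are made-up placeholders, not real results, and `summarize` and `welch_t` are just hypothetical helper names for this illustration.

```python
import math
import statistics


def summarize(accs):
    """Return (mean, sample std, standard error of the mean) for a list of accuracies."""
    mean = statistics.mean(accs)
    std = statistics.stdev(accs)  # sample standard deviation (n - 1 in the denominator)
    sem = std / math.sqrt(len(accs))
    return mean, std, sem


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    mean_a, _, sem_a = summarize(a)
    mean_b, _, sem_b = summarize(b)
    return (mean_a - mean_b) / math.sqrt(sem_a ** 2 + sem_b ** 2)


# Placeholder accuracies (in %) for illustration only; NOT measured results.
baseline_runs = [60.2, 62.1, 61.0, 59.8, 62.5]
candidate_runs = [61.5, 60.9, 62.8, 61.1, 63.0]

for name, runs in [("baseline", baseline_runs), ("candidate", candidate_runs)]:
    mean, std, sem = summarize(runs)
    print(f"{name}: mean={mean:.2f} std={std:.2f} sem={sem:.2f} (n={len(runs)})")

print(f"Welch t-statistic: {welch_t(candidate_runs, baseline_runs):.2f}")
```

If the difference in means is small relative to the standard errors (a t-statistic close to zero), the runs don't support claiming an improvement. With only a handful of runs, the estimate of the spread is itself noisy, which is another reason to use more runs.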

I’ve made these points before in the other SOTA threads, but I still see the same method being used: running things once and comparing against underestimated baselines…