Meet Mish: New Activation function, possible successor to ReLU?

Completed:

[Significance-level benchmark table image; please zoom in to view]

Results can also be viewed here - https://github.com/digantamisra98/Mish#significance-level
Mish beats all of the other activations in peak accuracy, mean accuracy, mean loss, and standard deviation of accuracy, and in most cases the improvement is statistically significant (the P-values are shown in the table).
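For anyone joining the thread here: Mish is defined as f(x) = x * tanh(softplus(x)). A minimal PyTorch sketch of the function (not the repository's official implementation, which lives in the repo linked above):

```python
import torch
import torch.nn.functional as F

class Mish(torch.nn.Module):
    """Minimal sketch of Mish: f(x) = x * tanh(softplus(x))."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # softplus(x) = ln(1 + e^x) is a smooth approximation of ReLU;
        # gating x by tanh(softplus(x)) gives Mish its smooth,
        # non-monotonic shape.
        return x * torch.tanh(F.softplus(x))
```

It drops in anywhere an `nn.Module` activation is expected, e.g. `y = Mish()(torch.randn(8))`.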
Also, if anyone is familiar with Keras and LSTMs, please help me solve this issue - https://github.com/digantamisra98/Mish/issues/13

@Seb I think your second suggestion, whether Mish increases network learning capacity, is somewhat demonstrated in this table, where Mish was the only activation to cross the 88% mark in Top-1 accuracy.

@ilovescience do you want only the EfficientNet B0 table sorted, or all of the 70+ benchmark tables in the repository? (If it is the latter, that is going to take some significant time.)
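If it comes to sorting all of them, a small script could automate it rather than doing it by hand. A rough sketch, assuming the benchmark tables are plain Markdown with a numeric accuracy column; the function name, column name, and sample values below are illustrative assumptions, not the repo's actual numbers:

```python
# Hypothetical helper: sort the rows of a Markdown benchmark table
# by a numeric column (e.g. Top-1 Accuracy).
def sort_markdown_table(md: str, column: str, descending: bool = True) -> str:
    lines = md.strip().splitlines()
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    col = header.index(column)

    def key(row: str) -> float:
        cell = row.strip("|").split("|")[col].strip()
        return float(cell.rstrip("%"))  # tolerate a trailing percent sign

    body = sorted(lines[2:], key=key, reverse=descending)  # skip header + ruler
    return "\n".join(lines[:2] + body)

if __name__ == "__main__":
    # Illustrative values only, not actual benchmark results.
    demo = """\
| Activation | Top-1 Accuracy |
|------------|----------------|
| ReLU       | 87.01%         |
| Mish       | 88.02%         |
| Swish      | 87.50%         |"""
    print(sort_markdown_table(demo, "Top-1 Accuracy"))
```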
