Meet Mish: a new activation function, possible successor to ReLU?

As promised, I've made a post that gives everyone an overview of the new techniques we've been using here:
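For context on the activation itself, Mish is defined as f(x) = x · tanh(softplus(x)), where softplus(x) = ln(1 + eˣ). A minimal pure-Python sketch (the `mish` and `softplus` function names here are my own, not from the paper's code):

```python
import math

def softplus(x: float) -> float:
    # Numerically stable softplus: ln(1 + e^x) = max(x, 0) + ln(1 + e^{-|x|}),
    # which avoids overflow in math.exp() for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x: float) -> float:
    # Mish activation: x * tanh(softplus(x)).
    # Smooth, non-monotonic, and approximately identity for large positive x.
    return x * math.tanh(softplus(x))
```

Like Swish, it passes small negative values through (rather than zeroing them as ReLU does), which is part of the paper's argument for better gradient flow.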

Re: missed paper - good spot, @muellerzr - I've added the LARS link!

Re: notebook - yes, please add it to the GitHub repo; that will be a nice addition for sure. Thanks for putting together that list of papers - it's a big help for anyone who wants to dig into the details.

@Seb - I had to stretch to summarize the self-attention aspect in my post, so I've referenced you in that thread for people to ask for a tutorial about it 🙂 It does look promising, though, after seeing the results here and a quick read of the paper.
