I made the promised post to try and provide an overview for everyone on the new techniques we’ve been using here:
Re: missed paper - Good spot @muellerzr - I’ve added the LARS link!
Re: notebook - yes, please add it to the GitHub repo, that will be a nice addition for sure. Thanks for making that list of papers, that’s a big help for anyone who wants to delve into more details.
@Seb - I had to stretch to summarize the self-attention aspect in my post, so I’ve referenced you in that thread for people to ask for a tutorial about it. It does look promising, though, after seeing the results here and a quick read of the paper.