Implementing WideResNet from scratch

Hey, all!

Today I’ve got an implementation of WideResNet for you. Trying to figure out what these papers are saying and how to implement them is really teaching me a lot. Some takeaways:

  • Darknet really is fast – really
  • Though I did not test this myself, extrapolating from WRN’s performance in this notebook, I can see how it would outperform plain ResNets with a similar number of parameters. Based on this and other results from the paper, I’ll seriously consider widening a network before deepening it if I ever feel the need to go beyond 50 layers
  • As you can see, train_loss consistently sits above valid_loss in later epochs, while valid_loss continues to drop; my suspicion is that the dropout layers are really helping the network generalize – this bears further investigation
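For anyone curious what "wide" means concretely, here's a minimal sketch of one pre-activation wide residual block in PyTorch. This is my own illustrative version, not the exact code from the notebook: the class name `WideBlock`, the widening factor `k=6`, and the dropout rate `drop_p=0.3` are assumptions – WRN just multiplies the per-stage channel counts by `k` and puts dropout between the two 3×3 convs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WideBlock(nn.Module):
    """One pre-activation wide residual block: BN-ReLU-Conv, dropout, BN-ReLU-Conv.

    "Wide" just means out_ch is the base channel count times a widening
    factor k; the block structure itself is an ordinary residual block.
    """
    def __init__(self, in_ch, out_ch, stride=1, drop_p=0.3):  # drop_p is an assumed value
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.drop_p = drop_p
        # 1x1 projection on the shortcut when the shape changes
        self.shortcut = (nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False)
                         if stride != 1 or in_ch != out_ch else nn.Identity())

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        # dropout between the convs is the WRN-specific detail
        out = F.dropout(out, p=self.drop_p, training=self.training)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + self.shortcut(x)

# Widening by k=6: a stage that would have 16 channels in a plain ResNet gets 96
block = WideBlock(16, 16 * 6, stride=1)
y = block(torch.randn(2, 16, 32, 32))
print(y.shape)  # (2, 96, 32, 32)
```

Note the quadratic cost of widening: each 3×3 conv inside the block has roughly k² times the parameters of its narrow counterpart, which is why a fairly shallow WRN can match the parameter budget of a much deeper plain ResNet.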
