Training loop, optimizer, scheduler API

Yes I just visited AI2 last week! :slight_smile:

Jeremy, it was good to host you and Sebastian last Friday at AI2. Let me know if I can help with anything related to AI2.

@sebastianruder and I enjoyed our visit! :slight_smile: If you have any ideas for cool stuff from AllenNLP (or elsewhere) that might be a good fit for fastai_v1, please do let us know.

I have been trying to get a handle on how to smoothly transition between losses.

In one of the lessons @jeremy talked about curriculum learning, where the network is trained on increasingly harder images; for example, the harder images would have classes that look very similar to each other. But my experience with this form of curriculum learning hasn't been great. I started with a bunch of very disparate classes, and every epoch (or few epochs) I introduced a new class similar to one already trained on, keeping the learning rate the same. After all the classes were added, I started annealing. The accuracy was poorer than with the standard approach of training on all classes from the start with CLR. Maybe I need to anneal while introducing the new classes. Roughly, the schedule I mean looks like the sketch below.
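(A plain PyTorch sketch, not fastai API; the class ordering, the starting count, and the number of epochs per new class are just placeholders for what I tried.)

```python
from torch.utils.data import Subset

def allowed_classes(ordered_classes, epoch, start_with=5, epochs_per_class=2):
    """`ordered_classes` is assumed sorted from most disparate to most similar;
    unlock one extra class every `epochs_per_class` epochs."""
    n = min(len(ordered_classes), start_with + epoch // epochs_per_class)
    return set(ordered_classes[:n])

def curriculum_subset(dataset, labels, classes):
    """Keep only the samples whose label is currently unlocked."""
    idxs = [i for i, y in enumerate(labels) if y in classes]
    return Subset(dataset, idxs)
```

Then I just rebuilt the DataLoader at the start of each epoch from `curriculum_subset(train_ds, train_labels, allowed_classes(order, epoch))`.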

My idea was to refine this further by changing the loss function as well: instead of CE, use EMD (Earth Mover's Distance). With an EMD loss, the signal from similar classes isn't lost, as it is with CE, which treats all classes as orthogonal. But this hasn't panned out either. Maybe the distances I assigned between classes didn't match reality, but either way the EMD loss wasn't converging.
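For reference, the EMD-style loss I had in mind is basically the expected ground distance to the true class (a rough sketch; the distance matrix `dist` is the hand-assigned one I mentioned, so it's an assumption, not something standard):

```python
import torch
import torch.nn.functional as F

class ExpectedDistanceLoss(torch.nn.Module):
    """EMD-style surrogate: expected inter-class distance under the predicted
    distribution. `dist` is a (C, C) tensor where dist[i, j] is the cost of
    predicting class j when the true class is i."""
    def __init__(self, dist):
        super().__init__()
        self.register_buffer('dist', dist)

    def forward(self, logits, target):
        probs = F.softmax(logits, dim=1)          # (B, C)
        costs = self.dist[target]                 # (B, C): distance row for each true class
        return (probs * costs).sum(dim=1).mean()  # mean expected distance over the batch
```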

I have a gut feeling that slowly transitioning from CE to EMD while also exposing the network to more difficult classes should work, but I'm not sure. Something like the blend below is what I'm imagining.
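(Again just a sketch of the idea, not anything I've validated; the linear ramp for `alpha` is a guess, any monotone schedule would do.)

```python
import torch.nn.functional as F

def blended_loss(logits, target, emd_loss, alpha):
    """Interpolate from cross-entropy (alpha=0) to the EMD-style loss (alpha=1)."""
    return (1 - alpha) * F.cross_entropy(logits, target) + alpha * emd_loss(logits, target)

# In the training loop, ramp alpha over the first n_warmup epochs, e.g.:
# alpha = min(1.0, epoch / n_warmup)
```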

@jeremy, any pointers on this?