Yes, I figured it would make the implementation more coherent, since we inherit from Optimizer. It also lets us reuse the inherited methods and keep the code base smaller.
The only method that I'm not overriding is a private one, which ensures smooth param_group addition for the slow weights.
Also, you might want to check the discussion on this thread. I added a param synchronization method for external calls, so that users can choose which state (fast or slow weights) their model is evaluated in. My current version is available here!
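To illustrate the synchronization idea, here is a torch-free toy sketch. It is not the actual implementation: plain floats stand in for parameter tensors, the inner update is plain SGD, and the names `ToyLookahead`, `sync_params`, `sync_rate`, `alpha` and `sync_period` are hypothetical, chosen only for this example.

```python
class ToyLookahead:
    """Toy illustration only: floats stand in for parameter tensors."""

    def __init__(self, params, lr=0.1, alpha=0.5, sync_period=5):
        self.fast = list(params)   # updated at every step
        self.slow = list(params)   # updated every `sync_period` steps
        self.lr = lr
        self.alpha = alpha
        self.sync_period = sync_period
        self.step_count = 0

    def step(self, grads):
        # Plain SGD update on the fast weights (assumed inner optimizer).
        self.fast = [w - self.lr * g for w, g in zip(self.fast, grads)]
        self.step_count += 1
        if self.step_count % self.sync_period == 0:
            self.sync_params(self.alpha)

    def sync_params(self, sync_rate=0.0):
        """Move slow weights towards fast ones, then copy slow into fast.

        sync_rate=0.0 resets the fast weights to the slow ones;
        sync_rate=1.0 makes the slow weights take the fast values.
        """
        self.slow = [s + sync_rate * (f - s)
                     for s, f in zip(self.slow, self.fast)]
        self.fast = list(self.slow)
```

In this sketch, calling `sync_params()` before evaluation would evaluate the model on the slow weights, while `sync_params(1.0)` would keep the current fast weights.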
Cheers