Oh so happy to see that someone is digging deep into this!
All 3 bits are used in seq2seq_reg(). I’m not promising they’re used correctly, or not redundantly, but they are at least used! This is largely stolen from Smerity’s AWD LSTM, so that’s the place to look for more details about the regularization here.
If after looking there anything is unclear or looks wrong, please let me know, since this code hasn’t had a 2nd pair of eyes on it before…
Might this be the relevant paper?
Putting it in my Mendeley to find out
Yes that’s the one. Also check their AWD LSTM code on github.
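For anyone skimming before reading the paper: as far as I understand it, the AWD LSTM regularization adds two penalties to the loss, activation regularization (AR) on the dropped-out output of the last RNN layer and temporal activation regularization (TAR) on the difference between consecutive time steps of the raw output. Below is a rough sketch of that idea in plain Python on nested lists (time steps x hidden units). The names ar_tar_penalty and mean_square, and the alpha/beta defaults, are made up for illustration; this is not the actual seq2seq_reg() code, which works on tensors.

```python
def mean_square(rows):
    """Mean of the squared entries of a 2-D list (time steps x hidden units)."""
    values = [v * v for row in rows for v in row]
    return sum(values) / len(values)


def ar_tar_penalty(dropped_h, raw_h, alpha=2.0, beta=1.0):
    """Hypothetical helper: AR + TAR penalty to add to the base loss.

    dropped_h: last-layer output after dropout, one row per time step.
    raw_h:     last-layer output before dropout, one row per time step.
    alpha, beta: illustrative weights for the AR and TAR terms.
    """
    # AR: discourage large activations in the dropped-out output.
    ar = alpha * mean_square(dropped_h)

    # TAR: discourage large jumps between consecutive time steps.
    diffs = [
        [nxt - prev for prev, nxt in zip(prev_row, next_row)]
        for prev_row, next_row in zip(raw_h, raw_h[1:])
    ]
    tar = beta * mean_square(diffs) if diffs else 0.0

    return ar + tar


# Usage: total_loss = base_loss + ar_tar_penalty(dropped_h, raw_h)
```

Setting alpha or beta to 0 switches the corresponding term off, which makes it easy to check each penalty separately.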
Ah ok … will take a look as time permits this week.
Do any of you know, or have recommendations for, how to debug the framework code while running things from Jupyter notebooks? I’m using VSCode as my IDE and would love to find a way to put something akin to a set_trace() statement in the .py file and be able to debug it interactively, just like in the notebooks.
set_trace works fine in the .py files when used from notebooks. Not sure I’m following your question…
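In case it helps, here is a minimal sketch of what I mean, using the standard library's pdb. The function clip_gradient_norm and its debug flag are hypothetical stand-ins for whatever function you want to inspect in the .py file; the flag is only there so the breakpoint can be switched off.

```python
import pdb


def clip_gradient_norm(norm, max_norm, debug=False):
    """Hypothetical helper standing in for a function in a library .py file."""
    if debug:
        # Execution pauses here; the debugger prompt appears in the
        # output of the notebook cell that called this function.
        pdb.set_trace()
    return min(norm, max_norm)
```

Calling clip_gradient_norm(5.0, 1.0, debug=True) from a notebook cell should drop you into the debugger prompt under that cell. If the breakpoint never seems to trigger, the notebook may still be running an older copy of the module; restarting the kernel (or using IPython's autoreload extension) usually picks up the edit.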
You’re right … I had to restart my machine to get it working. Thanks!