Why does the forward() method return both raw_outputs and outputs? The two lists contain the same data, except that dropout has been applied to outputs. Why do we even return raw_outputs?
Re: LinearDecoder
Why does the forward() method return result, raw_outputs, outputs? It seems like result is all that is needed, yet both raw_outputs and outputs are returned here as well. I can’t understand why the last two items are needed, nor how our loss function knows to use only the first item in the tuple to compute its score.
Oh so happy to see that someone is digging deep into this!
All 3 bits are used in seq2seq_reg(). I’m not promising they’re used correctly, or not redundantly, but they are at least used! This is largely stolen from Smerity’s AWD LSTM, so that’s the place to look for more details about the regularization here.
If after looking there anything is unclear or looks wrong, please let me know, since this code hasn’t had a 2nd pair of eyes on it before…
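To make the idea concrete, here is a rough, framework-free sketch of how the three returned items could be consumed. It is not the library's actual code: it assumes the regularization follows the AWD LSTM recipe (activation regularization on the dropped-out last-layer outputs, temporal activation regularization on the raw last-layer outputs), it uses nested Python lists in place of tensors, and the names seq2seq_reg_sketch, training_step_loss, mean_square, alpha and beta are made up for illustration.

```python
# Framework-free sketch: activations are nested lists [timestep][hidden_unit].
# The real code works on tensors; every name here is hypothetical.

def mean_square(rows):
    """Mean of the squared values over a list of rows."""
    values = [v * v for row in rows for v in row]
    return sum(values) / len(values)


def seq2seq_reg_sketch(raw_loss, raw_outputs, outputs, alpha=0.0, beta=0.0):
    """Add AR and TAR penalties to an already-computed loss.

    raw_outputs / outputs hold one entry per RNN layer, each [timestep][unit].
    """
    loss = raw_loss
    if alpha:
        # Activation regularization (assumed): penalize large activations
        # of the last layer *after* dropout.
        loss += alpha * mean_square(outputs[-1])
    if beta:
        # Temporal activation regularization (assumed): penalize large changes
        # between consecutive timesteps of the *raw* last-layer activations.
        h = raw_outputs[-1]
        if len(h) > 1:
            diffs = [[b - a for a, b in zip(prev, nxt)]
                     for prev, nxt in zip(h[:-1], h[1:])]
            loss += beta * mean_square(diffs)
    return loss


def training_step_loss(model_output, target, crit, alpha=0.0, beta=0.0):
    """Unpack the (result, raw_outputs, outputs) tuple from forward().

    Only `result` is handed to the criterion; the other two items are
    passed on to the regularizer.
    """
    result, raw_outputs, outputs = model_output
    raw_loss = crit(result, target)
    return seq2seq_reg_sketch(raw_loss, raw_outputs, outputs, alpha, beta)
```

Under these assumptions, the loss function itself never sees the tuple: the training step unpacks it, scores only the first item, and then lets the regularizer add its penalties using the other two. That would explain why both the raw and the dropped-out activations need to be returned.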
Ah ok … will take a look as time permits this week.
Does anyone know, or have recommendations for, how to debug the framework code while running things from Jupyter notebooks? I’m using VSCode as my IDE and would love to find a way to put something akin to a set_trace() statement in the .py file and be able to debug it interactively, just like in the notebooks.