Questions about RNN_Encoder and LinearDecoder

Re: RNN_Encoder

  1. Why does the forward() method return both raw_outputs and outputs? The two lists contain the same data, except that dropout has been applied to outputs. Why do we even return raw_outputs?
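
For context, the pattern being asked about looks roughly like this. This is a minimal plain-Python sketch, not the library's actual code: `layers` and `dropouts` are hypothetical stand-ins for the LSTM layers and dropout modules, and the point is only that both the pre-dropout and post-dropout activations are kept:

```python
def rnn_encoder_forward(layers, dropouts, x):
    """Toy sketch: run x through each layer, recording activations
    both before and after dropout is applied."""
    raw_outputs, outputs = [], []
    for layer, drop in zip(layers, dropouts):
        x = layer(x)
        raw_outputs.append(x)  # pre-dropout activations
        x = drop(x)
        outputs.append(x)      # post-dropout activations
    return raw_outputs, outputs
```

So raw_outputs isn't redundant data: it's the same activations at a different point in the pipeline, which matters once regularization enters the picture.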

Re: LinearDecoder

  1. Why does the forward() method return result, raw_outputs, outputs? It seems like result is all that is needed, yet raw_outputs and outputs appear again. I can’t understand why the last two items are needed, nor how our loss function will know to use only the first item in the tuple to compute its score.
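
A rough sketch of how a tuple return like this can coexist with a loss function (hypothetical names and shapes, plain Python for illustration): the loss wrapper unpacks the tuple and only scores the first element, leaving the other two available for regularization:

```python
def linear_decoder_forward(encoder_raw, encoder_out, weight=0.5):
    """Hypothetical sketch: decode the last (dropped-out) layer's
    activations, but pass both activation lists through unchanged."""
    result = [h * weight for h in encoder_out[-1]]  # the actual predictions
    return result, encoder_raw, encoder_out

def loss_with_reg(preds, target):
    """The loss proper only needs the first item of the tuple."""
    result, raw_outputs, outputs = preds
    base = sum((p - t) ** 2 for p, t in zip(result, target)) / len(result)
    # raw_outputs / outputs are reserved for regularization penalties.
    return base
```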

Oh so happy to see that someone is digging deep into this! :slight_smile:

All 3 bits are used in seq2seq_reg(). I’m not promising they’re used correctly, or not redundantly, but they are at least used! This is largely stolen from Smerity’s AWD LSTM, so that’s the place to look for more details about the regularization here.
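
For anyone following along, the gist of why both lists exist: the AWD-LSTM paper applies activation regularization (AR) to the dropped-out activations and temporal activation regularization (TAR) to the raw ones. A plain-Python sketch of that idea (function and argument names are illustrative, using nested lists of floats in place of tensors):

```python
def mean(xs):
    return sum(xs) / len(xs)

def seq2seq_reg(raw_loss, outputs, raw_outputs, alpha=2.0, beta=1.0):
    """Sketch of AWD-LSTM-style regularization.
    outputs[t][i]: post-dropout activation at time t; raw_outputs[t][i]: pre-dropout."""
    # AR: penalize large activations -- uses the dropped-out outputs
    ar = alpha * mean([h * h for step in outputs for h in step])
    # TAR: penalize big jumps between consecutive hidden states -- uses raw outputs,
    # since dropout noise would make consecutive differences meaningless
    tar = beta * mean([(a - b) ** 2
                       for t in range(1, len(raw_outputs))
                       for a, b in zip(raw_outputs[t], raw_outputs[t - 1])])
    return raw_loss + ar + tar
```

That asymmetry (AR on outputs, TAR on raw_outputs) is the reason the forward() methods return both.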

If after looking there anything is unclear or looks wrong, please let me know, since this code hasn’t had a 2nd pair of eyes on it before…

Might this be the relevant paper?

Putting it in my Mendeley to find out :slight_smile:

Yes that’s the one. Also check their AWD LSTM code on github.

Ah ok … will take a look as time permits this week.

Do any of you know of, or have recommendations for, a way to debug the framework code while running things from Jupyter notebooks? I’m using VSCode as my IDE and would love to find a way to put something akin to a set_trace() statement in the .py file and interactively debug it, just like in the notebooks.

set_trace works fine in the .py files when used from notebooks. Not sure I’m following your question…
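
To illustrate the pattern: drop a set_trace() call into the library .py file at the point you care about, and the debugger prompt appears in the notebook cell's output when that code path runs. A toy version (the module and function names are made up; the `debug` flag guard is just so the file can also be imported without stopping):

```python
# model_utils.py -- a hypothetical module you import from a notebook
import pdb

def forward_step(x, debug=False):
    """Toy stand-in for framework code you want to inspect."""
    if debug:
        pdb.set_trace()  # drops into the debugger in the notebook's output cell
    return x * 2
```

From a notebook cell, calling `forward_step(3, debug=True)` then stops at the breakpoint, where you can step through the .py file interactively.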

You’re right … I had to restart my machine to get it working. Thanks!