Questions RNN_Encoder and LinearDecoder

wgpubs · January 30, 2018, 12:01am

Re: RNN_Encoder

Why does the forward() method return both raw_outputs and outputs? The two lists contain the same data, except that dropout has been applied to outputs. Why do we even return raw_outputs?

Re: LinearDecoder

Why does the forward() method return result, raw_outputs, outputs? It seems like result is all that is needed, but here again we get both raw_outputs and outputs. I can’t understand why the last two items are needed, nor how our loss function will know to use only the first item in the tuple to compute its score.

jeremy · January 30, 2018, 5:02am

Oh so happy to see that someone is digging deep into this!

All 3 bits are used in seq2seq_reg(). I’m not promising they’re used correctly, or not redundantly, but they are at least used! This is largely stolen from Smerity’s AWD LSTM, so that’s the place to look for more details about the regularization here.

If after looking there anything is unclear or looks wrong, please let me know, since this code hasn’t had a 2nd pair of eyes on it before…
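A minimal sketch of how all three return values could be consumed, assuming the AWD LSTM-style penalties: activation regularization (AR) on the dropped-out outputs and temporal activation regularization (TAR) on the raw outputs, both taken from the last layer. Plain Python lists stand in for tensors here, and the names mean_sq, reg_sketch and step_sketch are hypothetical, not the library's code. It also shows one way the loss function ends up seeing only result: the training step unpacks the tuple before calling the criterion.

```python
# Plain lists stand in for tensors: one list per layer, each holding one
# list of activations per timestep. All names below are hypothetical.

def mean_sq(rows):
    """Mean of the squared values across a list of activation lists."""
    vals = [v * v for row in rows for v in row]
    return sum(vals) / len(vals)

def reg_sketch(raw_loss, raw_outputs, outputs, alpha=0.0, beta=0.0):
    """Add AR and TAR penalties (last layer only) to the raw loss."""
    loss = raw_loss
    if alpha:
        # AR: penalize large activations, measured after dropout.
        loss += alpha * mean_sq(outputs[-1])
    if beta:
        # TAR: penalize large changes between consecutive timesteps,
        # measured before dropout.
        h = raw_outputs[-1]
        diffs = [[b - a for a, b in zip(prev, nxt)]
                 for prev, nxt in zip(h, h[1:])]
        if diffs:
            loss += beta * mean_sq(diffs)
    return loss

def step_sketch(model, crit, x, y, alpha=0.0, beta=0.0):
    """Unpack the model's tuple so the criterion only receives `result`."""
    result, raw_outputs, outputs = model(x)
    raw_loss = crit(result, y)
    return reg_sketch(raw_loss, raw_outputs, outputs, alpha, beta)
```

With alpha and beta both left at 0, the extra two items are ignored and the sketch reduces to the plain criterion on result.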

radek · January 30, 2018, 11:01am

Might this be the relevant paper?

Putting it in my Mendeley to find out

jeremy · January 30, 2018, 2:31pm

Yes that’s the one. Also check their AWD LSTM code on github.

wgpubs · January 30, 2018, 5:41pm

Ah ok … will take a look as time permits this week.

Do any of you guys know, or have recommendations on, how to debug the framework code whilst running things from Jupyter notebooks? I’m using VSCode as my IDE and would love to find a way to put something akin to a set_trace() statement in the .py file and be able to debug it interactively, just like in the notebooks.

jeremy · January 30, 2018, 6:28pm

set_trace works fine in the .py files when used from notebooks. Not sure I’m following your question…
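As a rough illustration, a breakpoint in a .py file can look like the sketch below. The function forward_sketch and its debug flag are made up for the example; only pdb.set_trace is the real standard-library call. When the function is called from a notebook cell with debug=True, the debugger prompt should appear in that cell's output.

```python
# Contents of a hypothetical module imported by the notebook.
from pdb import set_trace

def forward_sketch(xs, debug=False):
    """Sum the inputs, pausing in the debugger on the first negative value."""
    total = 0
    for x in xs:
        if debug and x < 0:
            # Execution stops here; inspect `x` and `total`, then use
            # `n` (next), `c` (continue) or `q` (quit) at the prompt.
            set_trace()
        total += x
    return total
```

Note that an edit to the .py file only takes effect once the module is re-imported, for example after restarting the kernel.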

wgpubs · January 31, 2018, 7:10pm

You’re right … I had to restart my machine to get it working. Thanks!