Wikipedia Bi-directional model

ranih · September 12, 2018, 8:38pm

In part 10 Jeremy mentioned an improvement on top of the Wikipedia model: also using the reversed (backward) language model. What’s the right way to combine the 2 models?
Maybe I’m wrong, but I don’t think I can just take np.mean of the 2 output vectors, because I want to add more layers on top (a classifier, for example).

Thanks!

itaishch · February 18, 2019, 8:56am

You would probably want to concatenate the two output vectors and let the layers you add on top learn how much weight to give each one. Using a mean is like hard-coding the weight of each output vector to 0.5.
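
To make the comparison concrete, here is a minimal NumPy sketch of the two options. It is only an illustration under assumptions: the random arrays stand in for the final outputs of the forward and backward language models, and the names (fwd_out, bwd_out, W_cls) and sizes are made up for the example, not taken from the course code. It checks that averaging is the same as a linear layer on the concatenation with every weight fixed at 0.5, which a trained layer is free to move away from.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for the example.
batch, hidden, n_classes = 4, 8, 3

# Stand-ins for the output vectors of the forward and backward models.
# In practice these would come from the two trained encoders.
fwd_out = rng.normal(size=(batch, hidden))
bwd_out = rng.normal(size=(batch, hidden))

# Option 1: average the two outputs. Shape stays (batch, hidden).
mean_feats = np.mean([fwd_out, bwd_out], axis=0)

# Option 2: concatenate along the feature axis. Shape is (batch, 2 * hidden).
concat_feats = np.concatenate([fwd_out, bwd_out], axis=1)

# A linear layer on the concatenation whose weights are two stacked
# 0.5 * identity blocks reproduces the mean exactly, so the mean is just
# one fixed choice of weights among those a learned layer could pick.
W_mean = np.vstack([0.5 * np.eye(hidden), 0.5 * np.eye(hidden)])
assert np.allclose(concat_feats @ W_mean, mean_feats)

# A classifier head on the concatenation instead has free weights.
# They are randomly initialised here; training would fit them.
W_cls = rng.normal(scale=0.1, size=(2 * hidden, n_classes))
b_cls = np.zeros(n_classes)
logits = concat_feats @ W_cls + b_cls
```

The cost of concatenation is that the first layer on top has twice as many input features, so its weight matrix doubles in size.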