In lesson 10, one of the suggestions for improving the IMDB classifier is to train another version of the same network on backwards versions of the reviews. Once trained, that model's outputs are averaged with those from the forward network, and the resulting ensemble outperforms either model on its own.
@jeremy do you know of any papers that have studied that, or is this just another trick you had up your sleeve?
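
To make sure we're talking about the same thing, here is a rough sketch of the idea as I understand it. This is not the lesson's code: `reverse_reviews` and `ensemble_average` are names I made up, the token ids are toy values, and the probabilities are invented numbers standing in for the outputs of a trained forward model and a trained backward model.

```python
import numpy as np

def reverse_reviews(token_ids):
    """Reverse the token order of each review (a list of lists of token ids)."""
    return [ids[::-1] for ids in token_ids]

def ensemble_average(probs_fwd, probs_bwd):
    """Average the predicted probabilities from the forward and backward models."""
    probs_fwd = np.asarray(probs_fwd, dtype=float)
    probs_bwd = np.asarray(probs_bwd, dtype=float)
    return (probs_fwd + probs_bwd) / 2.0

# Toy tokenized reviews (hypothetical ids); the backward model would be
# trained on, and later fed, these reversed sequences.
reviews = [[5, 8, 2], [7, 1]]
backward_reviews = reverse_reviews(reviews)

# Invented positive-sentiment probabilities standing in for each model's
# predictions on the same two reviews.
probs_fwd = [0.9, 0.4]
probs_bwd = [0.7, 0.2]

avg = ensemble_average(probs_fwd, probs_bwd)
preds = (avg > 0.5).astype(int)  # threshold the averaged probability
```

The only requirement I can see is that both models score the same reviews in the same order, so the two probability arrays line up before averaging.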