Incorporating syntactic dependency information in FitLaM

Hi @jeremy. In your FitLaM paper you propose incorporating syntactic dependencies as a future direction for further improvements. This makes sense because dependency paths have been shown to significantly improve performance in relation extraction tasks (among many others).

My question is: since dependency trees are essentially directed graphs, how should one represent the dependency structures and modify the language model objective to make use of them? I’ve bookmarked the linked paper (Linzen et al., 2016) and will read it over the weekend. Any other ideas or papers would be really helpful.
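
To make the representation part of the question concrete, here is a minimal sketch of one encoding I could imagine: store the tree as a per-token array of head indices plus a parallel array of relation labels. This is only my assumption about a possible starting point, not something taken from the paper. The function names (`to_head_arrays`, `path_to_root`), the example sentence, and its hand-written arcs are all hypothetical.

```python
from typing import List, Tuple

ROOT = -1  # sentinel head index for the root token (an assumed convention)

def to_head_arrays(
    n_tokens: int, arcs: List[Tuple[int, int, str]]
) -> Tuple[List[int], List[str]]:
    """Turn (head, dependent, label) arcs into per-token head and label arrays."""
    heads = [ROOT] * n_tokens
    labels = ["root"] * n_tokens
    for head, dep, label in arcs:
        heads[dep] = head
        labels[dep] = label
    return heads, labels

def path_to_root(heads: List[int], i: int) -> List[int]:
    """Follow head pointers from token i up to the root."""
    path = [i]
    while heads[path[-1]] != ROOT:
        path.append(heads[path[-1]])
    return path

# Hand-annotated toy parse, for illustration only.
tokens = ["The", "dog", "chased", "the", "cat"]
arcs = [(1, 0, "det"), (2, 1, "nsubj"), (4, 3, "det"), (2, 4, "obj")]

heads, labels = to_head_arrays(len(tokens), arcs)
print(heads)                    # [1, 2, -1, 4, 2]
print(labels)                   # ['det', 'nsubj', 'root', 'det', 'obj']
print(path_to_root(heads, 3))   # [3, 4, 2]
```

With this encoding, each token has exactly one head, so the head index (or the relation label) could in principle serve as a per-token target for an auxiliary loss next to the usual next-word prediction. Whether that is a sensible way to modify the objective is exactly what I am unsure about.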