Combining image, relational and NLP architectures

I’ve had really good initial results in image categorization using the fast.ai library. I’m now interested in adding relational and NLP data to improve the categorizations. Are there any examples in the videos, forums, wiki, or elsewhere on the web of combining these architectures into one larger architecture and retraining the combined network? (I haven’t been able to find any.)