What techniques exist in meta-learning to encourage generalization across data sets?

I want to learn about techniques in machine learning (and especially deep learning) for making meta-learning work across different data sets.

Is there a reference or a survey paper on this topic?

Things I had in mind include:

  • a meta-learner that outputs architectures that work well across different data sets;
  • a system that outputs fully trained models that perform well on a new data set.

But essentially, the focus of my question is to understand what has been done to make meta-learners suggest models or architectures that generalize across different data sets.
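To make the kind of thing I mean concrete: one well-known family of approaches is optimization-based meta-learning (e.g. MAML-style methods), where the meta-learner searches for an initialization that adapts quickly to any new data set. Below is a minimal first-order sketch of that idea on toy linear-regression "data sets"; the task distribution, learning rates, and model are my own illustration, not from any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    # Each "data set" is a separate linear task y = a*x + b
    # with its own randomly drawn coefficients a, b.
    a, b = rng.uniform(-2, 2, size=2)
    x = rng.uniform(-1, 1, size=(20, 1))
    y = a * x + b
    return x, y

def loss_grad(w, x, y):
    # Squared-error loss and gradient for a linear model
    # y_hat = w[0]*x + w[1].
    pred = x * w[0] + w[1]
    err = pred - y
    grad = np.array([2 * np.mean(err * x), 2 * np.mean(err)])
    return np.mean(err ** 2), grad

# Meta-training: find an initialization w that, after one
# inner-loop gradient step, does well on a freshly sampled task.
w = np.zeros(2)
inner_lr, outer_lr = 0.1, 0.01
for step in range(500):
    meta_grad = np.zeros(2)
    for _ in range(5):                          # batch of tasks (data sets)
        x, y = make_task()
        _, g = loss_grad(w, x, y)
        w_adapted = w - inner_lr * g            # one inner-loop step
        _, g_adapted = loss_grad(w_adapted, x, y)
        meta_grad += g_adapted                  # first-order MAML approximation
    w -= outer_lr * meta_grad / 5
```

The point of the sketch is the nested loop: the inner loop adapts to one data set, while the outer loop updates the shared initialization so that adaptation works across the whole distribution of data sets. The same structure is what I imagine a cross-data-set meta-learner for architectures or trained models would generalize.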

Cross-posted: https://qr.ae/TWyFff