What does the encoder actually learn? 🤔

Friends,

If any of you have spent time training LM and Classifier backbones, I’d appreciate if you could run the above experiments and report your results.

This is purely inference and should run in a few seconds.