Word embeddings in a structured learner

Would it be insane, or just a pointless exercise, to try to use the word embeddings generated by a language model as the embeddings in a structured learner? They should work the same way, correct? Or am I fundamentally misunderstanding the kind of embeddings/weights a language model creates?

I’m guessing you can treat these like any other embeddings: add a linear layer followed by a sigmoid/softmax and turn the whole thing into a classifier or regressor.
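For what it's worth, a minimal sketch of that idea in NumPy (the `pretrained` matrix here is a random stand-in for whatever embedding table your language model actually learned, and the pooling/head choices are just illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim, n_classes = 1000, 50, 3

# Stand-in for the LM's learned embedding matrix (vocab_size x emb_dim);
# in practice you'd copy the weights out of the trained language model.
pretrained = rng.normal(size=(vocab_size, emb_dim))

# A small linear "head" trained on top of the frozen embeddings.
W = rng.normal(size=(emb_dim, n_classes)) * 0.01
b = np.zeros(n_classes)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify(token_ids):
    # Look up each token's embedding, mean-pool over the sequence,
    # then apply the linear + softmax layer to get class probabilities.
    x = pretrained[token_ids].mean(axis=1)
    return softmax(x @ W + b)

# Batch of 4 sequences, 12 tokens each.
probs = classify(rng.integers(0, vocab_size, size=(4, 12)))
print(probs.shape)        # (4, 3)
print(probs.sum(axis=1))  # each row sums to 1
```

Only `W` and `b` would be trained for the downstream task; whether to also fine-tune the embedding table is a separate choice.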