Text Classification - unexpected very low performance?

I’m not entirely sure what happens behind the scenes when you pass the vocab to the new databunch, but here is what I suspect.

Glad that it helped :slight_smile:

It’s always a good sign if your intuition about the data is in line with the model’s behavior. I would go ahead and label some more data, especially since you currently have less data for the harder classes. Then you can re-train the model and see if it improves.
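If it helps to decide which classes to label first, here is a rough sketch of how you could put the number of examples per class next to the per-class recall. The helper name `per_class_report` and the toy labels are made up for illustration; you would swap in your own validation labels and predictions.

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return {label: (n_examples, recall)} for every label in y_true."""
    # Number of true examples per class.
    support = Counter(y_true)
    # Number of correctly predicted examples per class.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: (n, correct[label] / n) for label, n in support.items()}

# Hypothetical labels and predictions, for illustration only.
y_true = ["a", "a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "a", "b", "b", "a", "a"]

report = per_class_report(y_true, y_pred)
# List the smallest classes first, since they are the labelling candidates.
for label, (n, recall) in sorted(report.items(), key=lambda kv: kv[1][0]):
    print(f"{label}: {n} examples, recall {recall:.2f}")
```

Classes that show both few examples and low recall are probably the ones where extra labels pay off most.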

Nope, sorry. Maybe someone else knows. But it would help to post the error message.