I have a question about using ULMFiT. Let’s say I am trying to predict a rating between 1 and 5 for an essay response. I don’t necessarily need the exact class, because the errors are not all equally bad: if the argmax is 2 and the true value is 3, that’s a lot better than the argmax being 1 when the true value is 3. Is there a small change to the architecture that would turn the classification model into a regression model? I’ve seen people scale the score down to be between 0 and 1 and then use a sigmoid activation function, but I’m still having trouble understanding how this works conceptually. The scaled score is still just a number between 0 and 1; it’s not the probability of a class being 0 or 1, which is what the sigmoid provides.
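To make the scaling idea concrete, here is a minimal sketch of how I understand it, in plain Python with no ULMFiT or fastai calls. The function names and the fixed 1 to 5 range are my own assumptions for illustration. The sigmoid here is only used to squash a single raw output into (0, 1), and the loss is squared error against the scaled rating, so nothing in it is treated as a class probability.

```python
import math

# Assumed rating range from the question (hypothetical constants).
LOW, HIGH = 1.0, 5.0


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def scale_target(rating: float) -> float:
    """Map a rating in [LOW, HIGH] to [0, 1] so it matches the sigmoid's range."""
    return (rating - LOW) / (HIGH - LOW)


def unscale_prediction(p: float) -> float:
    """Map a sigmoid output in (0, 1) back to the rating scale."""
    return LOW + p * (HIGH - LOW)


def squared_error(raw_output: float, rating: float) -> float:
    """Regression loss: distance between squashed output and scaled rating."""
    return (sigmoid(raw_output) - scale_target(rating)) ** 2


# A raw output of 0.0 squashes to 0.5, which maps back to the middle rating.
middle = unscale_prediction(sigmoid(0.0))

# A prediction near 2 is penalised less than a prediction near 1 when the truth is 3.
loss_near_2 = squared_error(-1.1, 3.0)
loss_near_1 = squared_error(-5.0, 3.0)
```

Under this setup the loss grows with the distance between predicted and true rating, which is the ordering behaviour a plain softmax classifier does not give.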