What sanity checks should I run when designing a regression model?

I have a problem similar to object detection in that it involves both classification and regression. However, it is in NLP: I am trying to predict the length of a concept in a sentence. We use regression to predict the length and classification to predict the concept behind the sentence.
I notice that the classification part works okay, but the regression part does not perform well. We are using Transformers for our project.
What checks or tricks do people usually apply to the regression part of their models to improve it?
My ground-truth regression targets are normalized to the range 0-1. I have tried MSE, L1, MSE with a sigmoid, etc., but the model does not seem to make good use of any of them: it learns, but the performance is very poor. It can also reach 100% on the training data if I let it overfit.
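As a rough illustration of the loss variants mentioned above, here is a minimal sketch in plain Python. It assumes "mse+sigmoid" means squashing the raw model output with a sigmoid before taking the MSE against the 0-1 targets; the function names and toy values are hypothetical and are not taken from the actual project code.

```python
import math

def mse(preds, targets):
    # Mean squared error over paired predictions and targets.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def l1(preds, targets):
    # Mean absolute error (L1 loss).
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)

def sigmoid(x):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def mse_with_sigmoid(raw_outputs, targets):
    # Assumed reading of "mse+sigmoid": sigmoid on the raw output, then MSE.
    return mse([sigmoid(z) for z in raw_outputs], targets)

# Toy values only: targets normalized to 0-1, made-up raw outputs.
targets = [0.1, 0.5, 0.9]
raw_outputs = [-2.0, 0.0, 2.0]

print("MSE on raw outputs:", mse(raw_outputs, targets))
print("L1 on raw outputs:", l1(raw_outputs, targets))
print("MSE after sigmoid:", mse_with_sigmoid(raw_outputs, targets))
```

The sigmoid variant keeps predictions inside the same 0-1 range as the targets, whereas the plain MSE and L1 variants leave the output unbounded.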
I am basically out of ideas on how to debug this and what to try next.
Any suggestions would be really appreciated.