Training Loss > Validation Loss

Thanks. I applied that to the code.

Absolutely. Right now I am trying to establish a baseline where the model overfits, and then I will try to fix that step by step.

So your suggestion is to increase model complexity to create overfitting? That is a nice way to put it. Unfortunately, I have not managed to do so.

The details are in this notebook, but below you will find the graphs. What I was looking for is a point after which the training loss keeps getting smaller while the validation loss keeps getting higher. But I do not see such a point.
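To make the point I am looking for concrete, here is a minimal sketch of the check in plain Python. It is not taken from the notebook: the function name `find_overfitting_epoch`, the `patience` parameter, and the loss values are all made up for illustration. It assumes you have one training loss and one validation loss per epoch.

```python
def find_overfitting_epoch(train_losses, val_losses, patience=3):
    """Return the index of the first epoch in a run of `patience` consecutive
    epochs where training loss falls and validation loss rises.
    Returns None if no such run exists."""
    if len(train_losses) != len(val_losses):
        raise ValueError("loss histories must have the same length")
    run = 0
    for i in range(1, len(train_losses)):
        train_down = train_losses[i] < train_losses[i - 1]
        val_up = val_losses[i] > val_losses[i - 1]
        # Extend the run only when both conditions hold, otherwise reset it.
        run = run + 1 if (train_down and val_up) else 0
        if run == patience:
            return i - patience + 1
    return None


# Hypothetical loss histories, not results from the notebook.
train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
val = [1.1, 0.9, 0.8, 0.75, 0.8, 0.9, 1.0]
print(find_overfitting_epoch(train, val))
```

If this returns None for every model, the curves never diverge in the sense described above, which matches what I see in the graphs. A larger `patience` makes the check less sensitive to noisy epochs.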

I did not, however, train the more complex models longer than the others. I am not sure whether this is the issue, but I will try it next, maybe just for the most complex model.