Training Loss Starts Lower Than Validation Loss

Not sure what is going on. The first time I run .fit(), it immediately reports the training loss as lower than the validation loss, no matter what values I use for the parameters. Any idea why this could be? I understand this usually suggests overfitting, but how can that happen the first time I train, on the very first iteration?

I’ve tried drastically increasing the dropout rates to no avail.

A few lines of code that may be useful:

m = md.get_learner(emb_szs, len(df.columns) - len(cat_vars), 0.04, 1, [500, 250], [0.04, 0.4],
                   y_range=y_range, use_bn=True)
lr = 1e-3; wd = 1e-7
m.fit(lr, 2, wd, cycle_len=1, cycle_mult=2)

Any information would be very helpful!


I’m not sure why you think that means you’re overfitting. Unless the difference is huge, there’s no problem with the training loss being smaller than the validation loss. There is only a problem if your validation loss starts to go up over time rather than down.
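To make that rule concrete, here’s a minimal sketch (plain Python, the function name and `patience` parameter are my own) of a check that ignores the train/validation gap and only flags trouble once validation loss stops improving:

```python
# Training loss below validation loss is fine on its own; the warning sign
# is a validation loss that has stopped going down.
def should_stop(val_losses, patience=2):
    """True once the last `patience` epochs all fail to beat the best earlier loss."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return all(v >= best_before for v in val_losses[-patience:])
```

For example, `should_stop([0.9, 0.8, 0.85, 0.87])` returns True (two epochs in a row without a new best), while `should_stop([0.9, 0.8, 0.7, 0.6])` returns False even if training loss sits well below every one of those values.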

The validation loss starts going up after the second epoch. All signs point to overfitting right away. I never saw an example like this anywhere in the course.

How much data do you have?

About 40,000 rows. I was thinking that could be the issue, but IIRC there was a portion of the course where we trained a model on less data than that.

How many columns are in those rows? I don’t know enough about the model you’re using, but perhaps it has way too many parameters.

Around 30 features total. I didn’t realize more features would make overfitting easier; I would’ve thought it was the other way around. Thanks for the help, btw!

I’d try lowering the size of the hidden layers (it looks like 500 and 250 neurons right now?) as well as the learning rate, just to see what happens.
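For a sense of scale, here’s a rough parameter count for a two-hidden-layer fully connected net on ~30 inputs (a sketch only: it ignores the embedding layers, which would make the real input wider, so treat these as lower bounds):

```python
def mlp_param_count(n_in, hidden, n_out=1):
    # Weights plus biases for a dense stack n_in -> hidden[0] -> ... -> n_out.
    sizes = [n_in] + list(hidden) + [n_out]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))

big = mlp_param_count(30, [500, 250])   # the [500, 250] layers from the post
small = mlp_param_count(30, [100, 50])  # a smaller alternative to try
```

That works out to roughly 141k parameters for the [500, 250] net versus about 8k for [100, 50]. With only ~40,000 rows, the bigger net has several parameters per training example, which makes it easy for the model to memorize the training set.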

Did you manage to reduce overfitting?