Training Loss Starts Lower Than Validation Loss


(Matthew Krehbiel) #1

Not sure what is going on. The first time I run .fit() it immediately displays the training loss as lower than the validation loss, no matter what numbers I put for the parameters. Any idea why this could be? I understand this means I’m overfitting, but how is it happening the first time I train, and on the first iteration?

I’ve tried drastically increasing the dropout rates to no avail.

A few lines of code that may be useful:

# fastai structured-data learner: (embedding sizes, # continuous cols,
# embedding dropout, output size, hidden layer sizes, dropout per hidden layer)
m = md.get_learner(emb_szs, len(df.columns) - len(cat_vars), 0.04, 1,
                   [500, 250], [0.04, 0.4], y_range=y_range, use_bn=True)
lr = 1e-3; wd = 1e-7
# SGDR: 2 cycles, cycle_len=1 epoch, doubled each cycle -> 3 epochs total
m.fit(lr, 2, wd, cycle_len=1, cycle_mult=2)

Any information would be very helpful!


(Matthew Krehbiel) #2

@jeremy


(Matthijs) #3

I’m not sure why you think that means you’re overfitting. Unless the difference is huge, there’s no problem if the training loss is smaller than the validation loss. There is only a problem if your validation loss starts to go up over time rather than down.
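
For example, you could express that check in plain Python (val_losses here is a hypothetical list of per-epoch validation losses that you record yourself; the name and the patience value are just illustrative):

def is_overfitting(val_losses, patience=2):
    # True once validation loss has stopped improving for `patience` epochs
    if len(val_losses) <= patience:
        return False
    best = min(val_losses[:-patience])
    return all(v > best for v in val_losses[-patience:])

print(is_overfitting([0.9, 0.7, 0.72, 0.75]))  # True: rising since epoch 2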


(Matthew Krehbiel) #4

The validation loss starts going up after the second epoch. All signs point to overfitting right away. I never saw an example like this anywhere in the course.


(Matthijs) #5

How much data do you have?


(Matthew Krehbiel) #6

About 40,000 rows. I was thinking that could be the issue, but IIRC there was a portion of the course where we trained a model on less data than that.


(Matthijs) #7

How many columns are in those rows? I don’t know enough about the model you’re using, but perhaps it has way too many parameters.


(Matthew Krehbiel) #8

Around 30 total features. I didn’t realize more features would make overfitting easier; I’d have thought it would be the other way around. Thanks for the help btw!
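
Doing a rough count on the fully connected part alone (a sketch that treats the post-embedding input width as ~30, which understates it since the embeddings widen the input), the parameter count is large relative to my 40,000 rows:

n_in = 30            # approximate input width (assumption)
layers = [500, 250]  # hidden layer sizes from my get_learner call
out_sz = 1

params, prev = 0, n_in
for h in layers + [out_sz]:
    params += prev * h + h   # weights + biases per layer
    prev = h

print(params)  # 141,001 parameters for ~40,000 training rows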


(Matthijs) #9

I’d try lowering the sizes of the hidden layers (it looks like 500 and 250 neurons right now?) as well as the learning rate, just to see what happens. Something like the sketch below.
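
This reuses the exact call from your first post; the smaller layer sizes, heavier dropout, and lower learning rate are just illustrative guesses to experiment with, not tuned recommendations:

# Same API as before, but a smaller network with more regularization
m = md.get_learner(emb_szs, len(df.columns) - len(cat_vars), 0.04, 1,
                   [100, 50], [0.2, 0.4], y_range=y_range, use_bn=True)
lr = 1e-4; wd = 1e-5   # lower learning rate, a bit more weight decay
m.fit(lr, 2, wd, cycle_len=1, cycle_mult=2)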


#10

Did you manage to reduce overfitting?