I’m not quite sure you can call this overfitting, since the validation loss isn’t consistently increasing with each epoch. I suppose doubling the number of epochs should give you a better picture of what’s happening?
If you’re doing image classification, try passing the argument
`callback_fns=ShowGraph`
to `cnn_learner` to plot training and validation losses against the number of iterations. Interpreting a graph is much simpler than reading a table.
Doubling the number of epochs does increase the validation loss overall. I’m using a `tabular_learner` with `layers=[200, 100]`. Can a tabular learner be graphically analyzed?
Hi, are you trying out the Titanic dataset from kaggle?
I couldn’t manage >88% accuracy. It seems you’ll need to do some feature engineering.
Also, you can try changing up the `[200, 100]` values.
If I understand correctly, you are not overfitting. The loss surface around the point you’re trying to converge to is rugged, so it throws the optimizer off, and a lower learning rate still gives high loss on this particular problem.
In my experiments, that’s exactly how these metrics (losses, accuracy, etc.) behave in tabular models.
To my mind, the model has already fit at around cycle 5–7. The rest looks like fluctuations (in terms of validation error/accuracy) and overfitting.
Unfortunately, I haven’t come to a general answer for what to do next in these situations.
You can try adding more dropout; it may help. Feature engineering could as well.
But judging by the accuracy values, I think your dataset is not very big, and it may simply not be big enough to get much further with this data.
I am! That makes sense. I actually already did some feature engineering (e.g. extracting titles from names, and encoding the cabin column as 0 for no cabin and 1 for a cabin). I’ll try changing the `layers` values, though there doesn’t seem to be much documentation on that.
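For anyone trying the same features, here’s a minimal sketch of the title extraction and cabin encoding described above (the regex, function names, and sample rows are my own illustration, not from the original post):

```python
import re
from typing import Optional

def extract_title(name: str) -> str:
    # Titanic names look like "Braund, Mr. Owen Harris";
    # the title is the word ending in "." right after the comma.
    match = re.search(r",\s*([A-Za-z]+)\.", name)
    return match.group(1) if match else "Unknown"

def has_cabin(cabin: Optional[str]) -> int:
    # Encode a missing cabin as 0, any recorded cabin as 1.
    return 0 if cabin in (None, "") else 1

print(extract_title("Braund, Mr. Owen Harris"))  # Mr
print(extract_title("Heikkinen, Miss. Laina"))   # Miss
print(has_cabin("C85"), has_cabin(None))         # 1 0
```

With pandas you would apply these per column, e.g. `df["Title"] = df["Name"].map(extract_title)`, and then treat `Title` as another categorical variable for the tabular learner.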
@jimsu2012 If I remember correctly, there is a correlation between cabin location (C, E, and so forth) and survival. It just shows how far people have gone with feature engineering on this dataset.
I think there are some kernels on the dataset explaining feature importance and useful pre-processing before feeding it into random forests. These might be helpful.
The dataset is indeed small; oversampling might be worth looking into.
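A minimal sketch of what oversampling could look like here: duplicate minority-class rows (with replacement) until the classes are balanced, then train on the enlarged set. The `Survived` label key and toy rows are illustrative assumptions, not from the thread:

```python
import random

def oversample(rows, label_key="Survived", seed=0):
    # Group rows by class label.
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    # Resample each smaller group up to the size of the largest one.
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

rows = [{"Survived": 0}] * 6 + [{"Survived": 1}] * 3
balanced = oversample(rows)
print(len(balanced))  # 12: six of each class
```

Note that only the training split should be oversampled; the validation set has to keep its original distribution or the metrics will be misleading.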