Lesson 3 - Why use the log of sales in the Rossmann exercise, and what about overfitting?

tomsthom · February 2, 2018, 1:50pm

Hello,

I am working on the Rossmann example.
I don’t understand why we need to use the log of sales in the model:

    yl = np.log(y)

Why couldn’t we use the sales values directly?

Afterwards, we convert the values back (using exp) rather than working with the log values:

    def exp_rmspe(y_pred, targ):
        targ = inv_y(targ)
        pct_var = (targ - inv_y(y_pred))/targ
        return math.sqrt((pct_var**2).mean())
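
For anyone who wants to run that metric outside the notebook, here is a minimal sketch. It assumes inv_y is simply np.exp (undoing the np.log above), and the sales numbers are made up for illustration:

```python
import math
import numpy as np

def inv_y(a):
    # Assumption: the inverse of yl = np.log(y)
    return np.exp(a)

def exp_rmspe(y_pred, targ):
    # Both arguments are in log space; move back to real sales first
    targ = inv_y(targ)
    pct_var = (targ - inv_y(y_pred)) / targ
    return math.sqrt((pct_var ** 2).mean())

y = np.array([5000.0, 8000.0, 12000.0])   # made-up sales values
yl = np.log(y)                            # what the model is trained on

perfect = exp_rmspe(yl, yl)               # predictions equal to the targets
off = exp_rmspe(yl + np.log(1.05), yl)    # every prediction 5% too high
```

Here perfect is 0 and off is about 0.05, so the metric reads as a percentage error on the real sales scale even though the model works in log space.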

Is there a specific reason?

My second question is about overfitting. In the course example, we can see what looks like overfitting (for instance, after several rounds: [ 2. 0.00707 0.01088 0.09878] ).
In the image classification lesson, I had understood that we have to be careful to avoid the validation loss being greater than the training loss (because of dropout). Why can we push the model to this kind of loss in this case? Is it because structured data is less sensitive to overfitting?

Thanks in advance for your help

radek · February 2, 2018, 3:16pm

We take the log of the target variable because we are interested in relative changes rather than absolute changes of the target value. You can find more info here
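
To make the relative vs absolute point concrete, here is a small sketch with made-up numbers (the store sizes and the 10% error are assumptions for illustration, not values from the Rossmann data). Two stores are both over-predicted by 10%:

```python
import numpy as np

def rmse(pred, targ):
    # Root mean squared error
    return float(np.sqrt(((targ - pred) ** 2).mean()))

def rmspe(pred, targ):
    # Root mean squared percentage error on the original scale
    pct = (targ - pred) / targ
    return float(np.sqrt((pct ** 2).mean()))

# Hypothetical daily sales for a small store and a large store
targ = np.array([1_000.0, 20_000.0])
pred = targ * 1.10                    # both predictions are 10% too high

abs_rmse = rmse(pred, targ)                   # dominated by the large store
log_rmse = rmse(np.log(pred), np.log(targ))   # log(1.1) for both stores
pct_rmspe = rmspe(pred, targ)                 # 0.10
```

On the raw values the error is roughly 1416, almost all of it from the large store. In log space both stores contribute the same log(1.1), about 0.095, which is close to the 10% percentage error that RMSPE measures. So minimizing squared error on log(sales) is a reasonable stand-in for minimizing the percentage error, at least when the errors are small.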

As far as overfitting goes… The values look okay. The middle one should be the results on the train set vs the last one on the validation set. Yeah, if dropout is used, then it could account for the difference. But it could just as well be that the period in the val set is simply easy to predict. Unless I am missing something, these numbers look quite okay.

It’s also hard to talk about overfitting to the val set as we didn’t mess around with the parameters too much and our model never trains on the examples it contains.