Difficulty adapting a different dataset to the Rossman notebook

While trying to adapt the Rossman notebook to the Corporación dataset, I have run into some issues. Rossman runs fine, but the fit for Corporación is not working (I even tried reducing the features to 5).

Does anyone have an idea where I might be going wrong?

I have tried going back into the Rossman code, and I suspect the problem is around here.

In particular, what is this code doing?

val_idx = np.flatnonzero(
    (df.index<=datetime.datetime(2014,9,17)) & (df.index>=datetime.datetime(2014,8,1)))

I thought it was breaking off the validation set and removing the attached values we were estimating. After all, val_idx is as long as the validation set above. However, the dates did not correspond to the val_idx set up a bit earlier. It’s as if a slice is cut out of the middle.
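For anyone else puzzling over that snippet, here is a minimal sketch of what it does. The toy `df` and its dates are made up for illustration and are not the Rossman data. `np.flatnonzero` returns the integer positions where the boolean mask is True, so a date window that sits in the middle of the index gives back a block of positions from the middle.

```python
import datetime

import numpy as np
import pandas as pd

# Toy frame indexed by date: 10 consecutive days (hypothetical data).
dates = pd.date_range("2014-07-28", periods=10, freq="D")
df = pd.DataFrame({"sales": np.arange(10)}, index=dates)

# Boolean mask: True for rows whose date falls inside the window.
mask = (df.index <= datetime.datetime(2014, 8, 3)) & (
    df.index >= datetime.datetime(2014, 8, 1)
)

# flatnonzero turns the mask into integer row positions.
val_idx = np.flatnonzero(mask)
print(val_idx)

# Check which dates those positions actually cover.
print(df.index[val_idx].min(), df.index[val_idx].max())
```

Printing the min and max dates of `df.index[val_idx]` on the real data is a quick way to confirm the validation window is the one you intended.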

Try just picking 10 items and 10 stores and use that as your initial dataset. Your current dataset is very very big!!!
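A rough sketch of one way to cut the data down like this. It assumes the frame has `store_nbr` and `item_nbr` columns, and it uses random toy data in place of the real file. Picking the most frequent stores and items is just one option, not the only sensible one.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the training data (hypothetical values and sizes).
rng = np.random.default_rng(0)
train = pd.DataFrame({
    "store_nbr": rng.integers(1, 55, size=5000),
    "item_nbr": rng.integers(1000, 1100, size=5000),
    "unit_sales": rng.random(5000) * 10,
})

# Take the 10 most frequent stores and items so each keeps plenty of rows.
top_stores = train["store_nbr"].value_counts().index[:10]
top_items = train["item_nbr"].value_counts().index[:10]

# Keep only rows that belong to both a chosen store and a chosen item.
small = train[
    train["store_nbr"].isin(top_stores) & train["item_nbr"].isin(top_items)
]
print(small["store_nbr"].nunique(), small["item_nbr"].nunique(), len(small))
```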


As discussed in this week's lecture, the val_idx code is selecting the last few weeks as the validation set. I went back and made sure that val_idx accurately catches the last month of data I wanted to predict.

As suggested, I reduced the data down to 3 items and 3 stores, reduced the variables to 4, and cut it to 100,000 rows. I get the same error, which you can see below. I even tried converting unit_sales to int64 to better match the Rossman dataset. At this point I think I have obviously missed another big step and need to rewrite everything.

My Code

Rossman Code - What I want to see

Can you check whether the values in your y_range are correct?
I made one silly mistake and got NaN as the loss too: a typo on y1, yL, which caused the y_range to be (0, NaN) :sweat_smile:

If you just set y_range=None you can check it’s working ok without it. (We didn’t discuss this parameter in class - will do so next week; you should get nearly as good results by setting it to None)
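A quick way to sanity-check y_range, as a sketch only. The names `yl` and `unit_sales` and the toy values are assumptions for illustration. The log of a negative number is NaN, so a single bad target value is enough to turn the upper bound into NaN.

```python
import numpy as np

# Hypothetical targets; the negative value stands in for a bad row.
unit_sales = np.array([3.0, 1.0, 7.0, -2.0, 12.0])

# Silence the warnings numpy raises for log of non-positive values.
with np.errstate(invalid="ignore", divide="ignore"):
    yl = np.log(unit_sales)

# np.max propagates NaN, so the upper bound is NaN here.
y_range = (0, np.max(yl) * 1.2)
print(y_range, np.isfinite(y_range).all())

# One possible fix (an assumption, not necessarily what the notebook does):
# clip negatives to zero and use log1p so that zero maps to zero.
yl_fixed = np.log1p(np.clip(unit_sales, 0, None))
y_range_fixed = (0, np.max(yl_fixed) * 1.2)
print(y_range_fixed, np.isfinite(y_range_fixed).all())
```

If you train on a `log1p` target like this, remember to invert predictions with `np.expm1` rather than `np.exp`.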


