Difficulty adapting a different dataset to Rossman

While trying to adapt the Rossman notebook to the Corporación dataset, I have run into some issues. Rossman runs fine, but the fit for Corporación is not working (I even tried reducing the features to 5).

Does anyone have an idea where I might be going wrong?

I have tried going back into the Rossman code, and I suspect the problem is around here.

In particular, what is this code doing?

val_idx = np.flatnonzero(
    (df.index <= datetime.datetime(2014, 9, 17)) & (df.index >= datetime.datetime(2014, 8, 1)))

I thought it was splitting off the validation set and removing the target values we were estimating. After all, val_idx is as long as the validation set above. However, the dates did not correspond to the val_idx set up a bit earlier. It’s like a slice from the middle is cut out.

Try just picking 10 items and 10 stores and use that as your initial dataset. Your current dataset is very very big!!!
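A minimal sketch of that kind of subsetting with pandas, in case it helps. The column names `store_nbr` and `item_nbr` are assumptions about how the training frame is laid out, and `subset_stores_items` is a hypothetical helper, not something from the notebook.

```python
import numpy as np
import pandas as pd

def subset_stores_items(df, n_stores=10, n_items=10,
                        store_col="store_nbr", item_col="item_nbr"):
    """Keep only rows for the first n_stores stores and n_items items (by sorted id)."""
    # Column names are assumptions; adjust them to match your frame.
    stores = np.sort(df[store_col].unique())[:n_stores]
    items = np.sort(df[item_col].unique())[:n_items]
    mask = df[store_col].isin(stores) & df[item_col].isin(items)
    return df[mask].copy()

# Tiny synthetic frame standing in for the real training data
toy = pd.DataFrame({
    "store_nbr": [1, 1, 2, 3, 3, 4],
    "item_nbr": [10, 20, 10, 30, 10, 20],
    "unit_sales": [5.0, 2.0, 1.0, 7.0, 3.0, 4.0],
})
small = subset_stores_items(toy, n_stores=2, n_items=2)
print(small)
```

Filtering on whole stores and items (rather than taking the first N rows) keeps each remaining time series complete, which matters if you build lag or elapsed-time features afterwards.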

As discussed in this week's lecture,

val_idx = np.flatnonzero(
    (df.index <= datetime.datetime(2014, 9, 17)) & (df.index >= datetime.datetime(2014, 8, 1)))

is selecting the last few weeks as the validation set. I went back and made sure that val_idx was accurately capturing the last month of data I wanted to predict.
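To see what the expression returns, here is a small sketch on a toy frame. The date range and the `Sales` values are made up for illustration; only the `val_idx` expression itself comes from the notebook.

```python
import datetime
import numpy as np
import pandas as pd

# Toy daily data indexed by date, standing in for the real frame (made-up values)
dates = pd.date_range("2014-07-25", "2014-09-20", freq="D")
df = pd.DataFrame({"Sales": np.arange(len(dates))}, index=dates)

# Boolean mask: True for rows dated 2014-08-01 through 2014-09-17 inclusive
in_window = ((df.index <= datetime.datetime(2014, 9, 17))
             & (df.index >= datetime.datetime(2014, 8, 1)))

# flatnonzero turns the mask into integer row positions
val_idx = np.flatnonzero(in_window)

print(val_idx[0], val_idx[-1], len(val_idx))
print(df.index[val_idx].min(), df.index[val_idx].max())
```

Note that val_idx holds row positions, not dates. If the frame has rows dated after the window, or is not sorted by date, the selected positions will not be the last rows of the frame, which can look like a slice taken from the middle. Printing the min and max dates of `df.index[val_idx]` is a quick way to confirm you are validating on the period you intended.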

As suggested, I reduced the data to 3 items and 3 stores, reduced the variables to 4, and cut it to 100,000 rows. I get the same error, which you can see below. I even tried converting unit_sales to int64 to better match the Rossman dataset. At this point I think I have missed another big step and need to rewrite everything.

My Code

Rossman Code - What I want to see

Can you check whether the values in your y_range are correct?
I made a silly mistake and got a NaN loss too: a typo between y1 and yL caused the y_range to be (0, NaN) :sweat_smile:
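A quick way to check this is to validate the log target before building y_range. The sketch below is only an illustration: `safe_y_range` is a hypothetical helper, the 1.2 scale factor is an assumption about how the upper bound is set, and using `log1p` on clipped values is one possible choice for targets that may contain zeros or negatives (worth checking for unit_sales), not necessarily what the notebook does.

```python
import numpy as np

def safe_y_range(y, scale=1.2):
    """Return the log target and a (0, upper) range, failing loudly on NaN or inf."""
    y = np.asarray(y, dtype=np.float64)
    # A plain log gives -inf for 0 and NaN for negatives, so clip at 0 and use log1p.
    yl = np.log1p(np.clip(y, 0, None))
    if not np.isfinite(yl).all():
        raise ValueError("log target contains NaN or inf; check the raw target column")
    upper = float(yl.max()) * scale  # scale is an assumed safety margin
    return yl, (0.0, upper)

yl, y_range = safe_y_range([0.0, 3.0, -2.0, 10.0])
print(y_range)
```

If the check raises, the problem is in the target column itself (for example missing values), and no choice of y_range will stop the loss from going to NaN.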

If you just set y_range=None you can check it’s working ok without it. (We didn’t discuss this parameter in class - will do so next week; you should get nearly as good results by setting it to None)

Is there an API call to perform prediction on the test sets?

@mindtrinket Hi James, can you share how you made the Kaggle submission file for Rossman? I am getting an error while using TTA.

@arjunrajkumar I pivoted to getting the Rossman model to work correctly on the Porto Driver dataset and haven’t worked on the submission part for either.
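For the submission-file question, the part that does not depend on the model can be sketched with pandas alone. This assumes you already have an array of test predictions from whatever prediction call you use, that the model was trained on the log of the target (so predictions need `np.exp`), and that the competition expects `Id` and `Sales` columns. `make_submission` and the sample values are hypothetical.

```python
import io
import numpy as np
import pandas as pd

def make_submission(ids, log_preds, id_col="Id", target_col="Sales"):
    """Build a submission frame from test ids and log-scale predictions."""
    # Undo the log transform and flatten (n, 1) model output to (n,)
    preds = np.exp(np.asarray(log_preds, dtype=np.float64)).ravel()
    if len(preds) != len(ids):
        raise ValueError("number of predictions does not match number of ids")
    return pd.DataFrame({id_col: np.asarray(ids), target_col: preds})

# Made-up ids and predictions standing in for real model output
sub = make_submission([1, 2, 3], np.log([[100.0], [250.0], [0.5]]))

# Write to an in-memory buffer here; pass a file path to to_csv for a real submission
buf = io.StringIO()
sub.to_csv(buf, index=False)
print(buf.getvalue())
```

The length check is there because a mismatch between the number of test rows and the number of predictions is an easy mistake to miss until Kaggle rejects the file.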