I am trying to understand the approach of the Rossmann notebook. Here are my questions:
- If this is a time-series problem, where do we limit the model’s ability to see into the future? It seems to me that the network (and the random forest at the end) can see all of the data at once (i.e. it is fully connected).
- Under ‘Durations’, the following two lines assign different things to df in succession. It seems to me that the second assignment overwrites the first, so df ends up holding only the test set. Have I misunderstood something?
df = train[columns]
df = test[columns]
- Jeremy follows the authors in removing instances where the store was closed. Why does he not do the same for the test set? And why does the code break if we don’t remove the closed stores?
- Why is ‘Id’ added to the test set, but not the training set?
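To make the question about df concrete, here is a minimal sketch of what I think happens when the two lines run back to back. The toy train and test frames and their values are made up for illustration and are not the notebook's real data; only the two assignment lines come from the notebook.

```python
import pandas as pd

# Hypothetical toy frames standing in for the notebook's train and test sets.
train = pd.DataFrame({
    "Store": [1, 2],
    "Date": ["2015-07-30", "2015-07-31"],
    "Sales": [5263, 6064],
})
test = pd.DataFrame({
    "Store": [1, 2],
    "Date": ["2015-08-01", "2015-08-02"],
    "Id": [1, 2],
})

columns = ["Date", "Store"]

df = train[columns]  # df now refers to the train subset
df = test[columns]   # df is rebound; the train subset is no longer reachable via df

# After both lines, df matches the test subset and not the train subset.
print(df.equals(test[columns]))
print(df.equals(train[columns]))
```

If that reading is right, anything computed from df after these two lines would only ever reflect the test set, unless the cell is meant to be run once per assignment.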
Many thanks for your help!