Rossmann questions


Hi everyone,

I am trying to understand the approach of the Rossmann notebook. Here are my questions:

  1. If this is a time-series problem, where do we limit the model’s ability to see into the future? It seems to me that the network (and random forest at the end) can see all data at once ( = fully connected)?
  2. Under ‘Durations’, the following two lines assign two different things to df. It seems to me that they break the code, because only the test set ends up in df. Have I misunderstood something?
    df = train[columns]
    df = test[columns]
  3. Jeremy follows the authors in removing instances where the store was closed. Why does he not do the same thing with the test set? Why does the code break down if we don’t remove the closed shops?
  4. Why is ‘Id’ added to the test set, but not the training set?

Many thanks for your help!


I’ll answer some of your questions:

Question 1: Once you have trained and validated your model and set up all the hyperparameters, it is advisable to retrain the model on the entire training set provided to you. The test set (at Kaggle) contains dates in the future, and your model will be tested on those.
Question 2: The code is not meant to be run linearly. When you train, you comment out the test[columns] code and vice versa when you test.
Question 4: AFAICT ‘Id’ is not “added” here. Only Id and Sales are selected to be written out to the csv for submission to kaggle.
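
On Question 2, one way to avoid commenting lines in and out is to wrap the step in a function and call it once per table. The sketch below is not the notebook's code: `add_days_since`, the `AfterPromo` column name and the tiny tables are made up for illustration, and it uses plain Python lists of dicts where the notebook uses DataFrames. It computes a "days since the last event" duration per store, which is roughly the kind of feature the ‘Durations’ section builds.

```python
from datetime import date

def add_days_since(rows, flag, out):
    """Return copies of rows with `out` = days since `flag` was last truthy, per store.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    last_seen = {}  # store id -> date of the most recent event
    result = []
    for row in sorted(rows, key=lambda r: (r["Store"], r["Date"])):
        if row[flag]:
            last_seen[row["Store"]] = row["Date"]
        prev = last_seen.get(row["Store"])
        new_row = dict(row)
        # None means no event has been seen yet for this store.
        new_row[out] = (row["Date"] - prev).days if prev is not None else None
        result.append(new_row)
    return result

# Made-up miniature train and test tables.
train = [
    {"Store": 1, "Date": date(2015, 7, 1), "Promo": 1},
    {"Store": 1, "Date": date(2015, 7, 2), "Promo": 0},
    {"Store": 1, "Date": date(2015, 7, 3), "Promo": 0},
]
test = [
    {"Store": 1, "Date": date(2015, 8, 1), "Promo": 0},
    {"Store": 1, "Date": date(2015, 8, 2), "Promo": 1},
]

# The same step runs on both tables, so neither assignment overwrites the other.
tables = {name: add_days_since(rows, "Promo", "AfterPromo")
          for name, rows in {"train": train, "test": test}.items()}
```

With this shape, `tables["train"]` and `tables["test"]` both exist at the end of the cell, and the cell can be run top to bottom.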


Hi Sudarshan,

Thanks for your feedback. I understand your points on Q2 and Q4, but I still struggle to understand how this is posed as a time-series problem. The model should be able to predict y for period tn with X up to period tn-1. Say the train/test set breaks at today’s date (May 6, 2018). How can we make predictions into the future with this model? We have no data (X_test) for the future?
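
To make the split described above concrete, here is a minimal sketch of a date-based train/validation split, where everything before a cutoff is used for fitting and everything from the cutoff onward is held out. The function name `split_by_date` and the toy rows are assumptions for illustration, not code from the notebook.

```python
from datetime import date, timedelta

def split_by_date(rows, cutoff):
    """Rows strictly before `cutoff` go to train, the rest to validation."""
    train = [r for r in rows if r["Date"] < cutoff]
    valid = [r for r in rows if r["Date"] >= cutoff]
    return train, valid

# Made-up daily data: 40 consecutive days starting 1 April 2018.
start = date(2018, 4, 1)
rows = [{"Date": start + timedelta(days=i), "Sales": 100 + i} for i in range(40)]

# Hold out everything from 6 May 2018 onward.
cutoff = date(2018, 5, 6)
train_rows, valid_rows = split_by_date(rows, cutoff)
```

Splitting this way, instead of sampling rows at random, means the validation score is measured on dates the model never saw during fitting.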

Many thanks!


How can we make predictions into the future with this model? We have no data (X_test) for the future?

The underlying assumption is that the distribution of the data will not drastically change in the near future. If what happens tomorrow is drastically different than what has happened till today for the past year, then even the best model would not be able to predict that. Remember, the point of machine learning is to learn the distribution of the data (aka function aka probability).

I would think that, as time goes by, you would have to retrain your model on the latest n observations to update its parameters, so that it captures any changes in the distribution that occurred during those last n observations.
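
A rough sketch of that retraining idea, under heavy simplification: the "model" here is just the mean of the last n observations, standing in for whatever model you would actually refit. `fit` and `rolling_refit` are hypothetical names, and the numbers are made up.

```python
from collections import deque

def fit(window):
    """Stand-in 'model': the mean of the observations in the window."""
    return sum(window) / len(window)

def rolling_refit(stream, n):
    """Refit on the latest n observations before each new one arrives."""
    window = deque(maxlen=n)  # old observations fall out automatically
    forecasts = []
    for y in stream:
        if len(window) == n:
            # The forecast uses only data seen before y.
            forecasts.append(fit(window))
        window.append(y)
    return forecasts

# Made-up series with a level shift halfway through.
sales = [10, 12, 11, 30, 32, 31, 33]
forecasts = rolling_refit(sales, n=3)
```

The forecasts lag behind the level shift at first and then catch up once the window contains only recent values, which is the effect the retraining is meant to achieve.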

Any problem that has a time component can be posed as a time-series problem. The time difference between observations does not have to be constant either.


Sorry, I’m confused. Even if the distribution doesn’t change (and in a time-dependent, time-series setting it usually does), how do you make predictions into the future without data? In a time-series model, you always make predictions for period tn with data up to tn-1. As far as I have understood, the Rossmann notebook assumes you know your predictors (X) for time period tn. How would you go about making predictions into the future with the Rossmann notebook? Please correct me if I have misunderstood the architecture of the model. It just seems to me that the model can use data (‘independent’ variables) at period t1 to make predictions for period t1. The random forest at the end certainly can see all data at once, and it seems to me that the same is true of the neural net. Many thanks.
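
One possible reading of this, offered as a sketch and not as the notebook's method: the competition's test file supplies predictors for the forecast dates (store, date, promo and holiday flags), and these are calendar facts or plans that are known ahead of time. Under that assumption, X for a future day can be built before the day arrives, while anything derived from sales would have to be lagged. The helper `future_rows` and the promo dates below are made up for illustration.

```python
from datetime import date, timedelta

def future_rows(store, start, days, promo_days):
    """Build predictor rows for future dates from information known in advance."""
    rows = []
    for i in range(days):
        d = start + timedelta(days=i)
        rows.append({
            "Store": store,
            "Date": d,
            "DayOfWeek": d.isoweekday(),   # 1 = Monday ... 7 = Sunday
            "Month": d.month,
            "Promo": int(d in promo_days),  # planned promotions, known ahead
        })
    return rows

# Hypothetical promotion plan for the days right after 6 May 2018.
planned_promos = {date(2018, 5, 7), date(2018, 5, 8)}
X_future = future_rows(store=1, start=date(2018, 5, 7), days=3,
                       promo_days=planned_promos)
```

None of these columns depends on future sales, so the rows can be fed to an already fitted model to get forecasts for dates that have not happened yet.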

(Luke Byrne) #6

Hi all,

I have a similar question regarding future predictions using a Rossmann-style architecture.

  1. How do I make predictions using the .predict() method on just one row of data? Will I need to look up the embedding matrix myself to get the relevant embedding representation to pass into the predict method?

  2. Say I get new data coming in. Can I keep the existing model weights and retrain by giving the model only the new data?

  3. How can I deploy this to a flask app for realtime predictions?
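
On the first question, a toy illustration may help. In typical entity-embedding models the embedding lookup happens inside the model's forward pass, so the caller passes category codes, not embedding vectors, and a single row is just a batch of size one. Everything below is a made-up stand-in in plain Python (the code mappings, the embedding tables, the weights and `predict` itself); it is not the fastai API, whose exact calls depend on the library version.

```python
# Category-to-index mappings, fixed at training time (hypothetical values).
store_codes = {"store_a": 0, "store_b": 1}
day_codes = {"Mon": 0, "Tue": 1}

# Made-up "learned" embedding tables: one row of weights per category index.
store_emb = [[0.5, -0.5], [1.0, 2.0]]   # 2 stores, 2 dimensions each
day_emb = [[0.1], [0.2]]                # 2 days, 1 dimension each

# Made-up linear layer over [store embedding (2), day embedding (1), distance (1)].
weights = [1.0, 1.0, 10.0, 2.0]
bias = 3.0

def predict(batch):
    """Toy model: looks up embeddings internally, then applies a linear layer."""
    outputs = []
    for store, day, distance in batch:
        features = (store_emb[store_codes[store]]
                    + day_emb[day_codes[day]]
                    + [distance])
        outputs.append(bias + sum(w * x for w, x in zip(weights, features)))
    return outputs

# A single row is passed as a batch containing one element.
single = predict([("store_b", "Tue", 0.5)])[0]
```

The part that matters in practice is reusing the same category-to-index mappings at prediction time that were used during training; a category coded differently would silently pick the wrong embedding row.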

I look forward to any responses.

Kind regards,




For 3 check this out.

(Lou Acresti) #8

Has anyone attempted to provide a fixed version of the notebook? I haven’t seen anything out there, and I’ve spent hours carefully trying to “perform” this notebook properly… I imagine many others have also spent a lot of time (or just gave up) as well.