Rossmann Notebook - Country Level Trends Intuition

thomzi12 · August 3, 2018, 6:14am

Hey! Not sure if this is the right place to post this, but I had a question about the Rossmann notebook.

A little before the halfway point of the notebook, we pull out the Rossmann Google Trends data for Germany as a whole using the code:

trend_de = googletrend[googletrend.file == 'Rossmann_DE']

A little later, as part of a series of joins, we add this Germany-specific data to the training and test data:

joined = joined.merge(trend_de, 'left', ["Year", "Week"], suffixes=('', '_DE'))
joined_test = joined_test.merge(trend_de, 'left', ["Year", "Week"], suffixes=('', '_DE'))
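To check my understanding of what that join produces, here is a minimal sketch on made-up data. The toy `googletrend` and `joined` frames and all their values are my own assumptions, not the notebook's data, and I am assuming the state-level trend was already merged in on `State`, `Year` and `Week`.

```python
import pandas as pd

# Toy stand-in for the notebook's googletrend table (values are invented).
googletrend = pd.DataFrame({
    "file": ["Rossmann_DE", "Rossmann_DE", "Rossmann_DE_BY", "Rossmann_DE_BY"],
    "State": [None, None, "BY", "BY"],
    "Year": [2015, 2015, 2015, 2015],
    "Week": [1, 2, 1, 2],
    "trend": [60, 70, 55, 80],
})

# Toy stand-in for `joined`: one store in one state, two weeks.
joined = pd.DataFrame({
    "Store": [1, 1],
    "State": ["BY", "BY"],
    "Year": [2015, 2015],
    "Week": [1, 2],
})

# Assumed earlier step: attach the state-level trend.
state_trend = googletrend.loc[
    googletrend.file != "Rossmann_DE", ["State", "Year", "Week", "trend"]
]
joined = joined.merge(state_trend, how="left", on=["State", "Year", "Week"])

# The step from the post: attach the country-level trend by week only.
trend_de = googletrend[googletrend.file == "Rossmann_DE"]
joined = joined.merge(trend_de, how="left", on=["Year", "Week"], suffixes=("", "_DE"))

print(joined[["Store", "State", "Year", "Week", "trend", "trend_DE"]])
```

So every store-date row ends up with two trend columns: `trend` (its own state) and `trend_DE` (all of Germany for the same week).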

My question is: what's the intuition for attaching the country-level Google trend to each store-date row? Why is this feature useful, given that we also have state-specific Google trend data?

In particular, I'm trying to work out what extra information we get when the country and state trends move in opposite directions, or when there is a large difference between the two values. That part isn't totally clear to me.
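To make those two cases concrete, here is a rough sketch of how I would flag them. The numbers are invented, and `trend_gap` and `opposite` are hypothetical columns of my own, not something the notebook computes.

```python
import pandas as pd

# Made-up values; `trend` is the state-level column and `trend_DE` the
# country-level one, named as the suffixes=('', '_DE') merge would name them.
joined = pd.DataFrame({
    "Week": [1, 2, 3],
    "trend": [55, 80, 40],
    "trend_DE": [60, 50, 65],
})

# Hypothetical feature: how far the state sits above or below the country.
joined["trend_gap"] = joined["trend"] - joined["trend_DE"]

# Hypothetical flag: week-over-week changes with opposite signs.
state_move = joined["trend"].diff()
country_move = joined["trend_DE"].diff()
joined["opposite"] = (state_move * country_move) < 0

print(joined)
```

The first week has no previous week to compare against, so its `opposite` flag comes out as False.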

Thoughts?