Please help me understand “cross store variance”.
I am trying to apply lessons from the Rossmann notebook to a model of my own. It is very different in setup and the data has very different dimensions. Looking at the lesson3-rossmann notebook is helping me a lot. Trying to roll my own is forcing me to learn a lot about the Python stack (pandas, numpy, matplotlib, fastai, …). Now I am halfway in and I find myself wanting a little help keeping afloat in a sea of new insights and knowhow.
So, could you help me visualize what the Rossmann model (in the nb) is actually doing, from the perspective of the data?
Here’s how I see it now. The model is trained on lots of X’s in a (daily) time series. Here’s my ASCII-art viz:
day ; storeID; storeProperties; globalProperties;
1 jan; 4; bla;blabla;12;a; yak;5;4;3;yakyak;
2 jan; 4; bla;blabla;88;a; yak;5;4;3;yakYAK;
1 jan; 6; blb;meh;22;a; yak;5;4;3;yakyak;
2 jan; 6; blb;mehhh;75;a; yak;5;4;3;yakYAK;
Never mind the redundancy here. The trainee likes it like that.
We are training “supervised”, so we must also supply our Y’s (Sales). We put them (initially) in the same series. Easy enough: one store-day = one Sales number.
> day ; storeID; storeProperties; globalProperties; Sales
> 1 jan; 4; bla;blabla;12;a; yak;5;4;3;yakyak; 1000
> 2 jan; 4; bla;blabla;88;a; yak;5;4;3;yakYAK; 1200
> 1 jan; 6; blb;meh;22;a; yak;5;4;3;yakyak; 3300
> 2 jan; 6; blb;mehhh;75;a; yak;5;4;3;yakYAK; 3100
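To pin down the layout I have in mind, here is a minimal sketch in plain Python (no pandas, so it runs anywhere). The tuple layout and the names `rows`, `X` and `y` are my own assumptions for illustration, and the values are just the placeholders from the sketch above, not real Rossmann fields.

```python
# Each row is one store-day. Values are the placeholders from the
# ASCII sketch above, not real Rossmann columns.
rows = [
    # (day, store_id, store_properties, global_properties, sales)
    ("1 jan", 4, ("bla", "blabla", 12, "a"), ("yak", 5, 4, 3, "yakyak"), 1000),
    ("2 jan", 4, ("bla", "blabla", 88, "a"), ("yak", 5, 4, 3, "yakYAK"), 1200),
    ("1 jan", 6, ("blb", "meh", 22, "a"), ("yak", 5, 4, 3, "yakyak"), 3300),
    ("2 jan", 6, ("blb", "mehhh", 75, "a"), ("yak", 5, 4, 3, "yakYAK"), 3100),
]

# Split into inputs (X) and targets (y): one Sales number per store-day.
X = [row[:-1] for row in rows]
y = [row[-1] for row in rows]

for features, sales in zip(X, y):
    day, store_id = features[0], features[1]
    print(day, "store", store_id, "->", sales)
```

Note that the global properties are repeated on every row for the same day; that is the redundancy I mentioned.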
We are expected to predict Sales for each store, based on the per-store data. RIGHT?!?!?
Question: does our model learn to consider properties not related to the store in question? Can, for instance, the properties for store #4 affect the Sales in store #6 (in our model, at the moment of inference)?
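To make the question concrete, here is the experiment I have in mind, as a minimal sketch. `predict_row` is a made-up stand-in for a trained model, and I am assuming (not confirming) that the notebook's model scores one row at a time like this. The idea: change a property of store #4 and check whether the prediction for store #6 moves.

```python
def predict_row(row):
    """Hypothetical stand-in for a trained model that sees one row at a time."""
    day, store_id, store_value, global_value = row
    return 100 * store_id + 10 * store_value + global_value


def predict_all(rows):
    # Each prediction is computed from its own row only.
    return [predict_row(row) for row in rows]


# Two store-days on the same date: store #4 and store #6.
rows = [("1 jan", 4, 12, 5), ("1 jan", 6, 22, 5)]
baseline = predict_all(rows)

# Perturb a store-level property of store #4 only.
perturbed = list(rows)
perturbed[0] = ("1 jan", 4, 99, 5)
changed = predict_all(perturbed)

store4_affected = baseline[0] != changed[0]
store6_affected = baseline[1] != changed[1]
print("store 4 prediction changed:", store4_affected)
print("store 6 prediction changed:", store6_affected)
```

With a stand-in like this, store #6 cannot react to store #4 at inference time by construction, unless store #4's information is put into store #6's row as an extra column. Whether the real model behaves the same way is exactly what I am asking.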
For the model I am building, this “cross store variance” is very relevant: read my comments below.