(Copying this from the lesson 4 wiki thread, where it didn’t get a response. Maybe it’s better here as its own question.)
Is this statement correct: in Rossman, we’re not using past Sales as an input to help predict future Sales? Or are we, indirectly in some way?
My understanding is that the Sales column is removed from the inputs during training and only used as the prediction target. I do see the causality issues with having sales data in the training inputs; I’m trying to understand the consequences of leaving it out. (Perhaps I need to complete the DL course and do the ML course, but I’m only up to lesson 4 in DL.) I’m generalising my thinking to any model where the output has a momentum-like component.
If there are external factors affecting sales that are not captured in the data set, their effects are still visible in past sales. For example, let’s say a competing chain has sales a few times a year, each lasting a week or two, and that affects Rossman sales. If they fall on consistent calendar dates, our data set (with date encodings) will pick this effect up. If (hypothetically; my argument is perhaps moot here) the dates were inconsistent, then I’m picturing irregular periods where sales were lower than predicted, but there’d be a pattern to this. Similarly, a neighbouring store could have sale dates that bring people in and increase Rossman sales. Once we noticed unexpected sales levels, we could theoretically predict that the deviation would continue for the rest of that period. My understanding is that our model can’t deal with this.
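To make concrete what I mean by feeding past sales back in, here is a minimal pandas sketch. It assumes a frame with `Store`, `Date` and `Sales` columns; the helper name `add_lagged_sales` and the toy numbers are made up for illustration, and as far as I can tell this is not something the lesson notebook does.

```python
import pandas as pd


def add_lagged_sales(df, lags=(1, 7), store_col="Store",
                     date_col="Date", sales_col="Sales"):
    """Return a copy of df with per-store lagged sales columns added.

    Hypothetical helper: each new column holds the sales value from
    `lag` rows earlier within the same store (rows sorted by date).
    """
    out = df.sort_values([store_col, date_col]).copy()
    grouped = out.groupby(store_col)[sales_col]
    for lag in lags:
        # shift() moves values forward in time within each store, so a
        # row only ever sees sales from earlier dates of that store
        out[f"{sales_col}_lag{lag}"] = grouped.shift(lag)
    return out


# Toy data (invented numbers): two stores, three consecutive days each
toy = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"] * 2),
    "Sales": [100, 110, 90, 200, 210, 190],
})

lagged = add_lagged_sales(toy, lags=(1,))
print(lagged)
```

The catch, as I understand it, is that the lag would have to be at least as long as the forecast horizon; otherwise the feature relies on sales that aren’t known yet at prediction time.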
I hope that’s clear.