Hello,
It’s been a while since I watched the courses (v2, sometime early to mid last year). I’m thinking about building a CNN with embeddings to do time-series forecasting. My proposed initial architecture is roughly the following:
- Create datetime-related features (month, day of week, percentage of month, year)
- Create event-related data (holidays, local events)
- Embed the categorical features (event data, month, day of week)
- Concatenate the features for each day
- Convolve the features with different window sizes (+/- 7, 5, 3 days)
- Batch norm/convolve repeat…
- Output a prediction vector for 31 days at a time
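To make the first step concrete, here is a minimal sketch of the datetime features using only the Python standard library. The function name `date_features` is made up for illustration, and I'm assuming "percentage of month" means the day of the month divided by the number of days in that month; other definitions are possible.

```python
import calendar
from datetime import date


def date_features(d):
    """Return the datetime-related features for a single day (hypothetical helper)."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return {
        "year": d.year,
        "month": d.month,                        # 1..12, categorical (to be embedded)
        "day_of_week": d.weekday(),              # 0 = Monday .. 6 = Sunday, categorical
        "pct_of_month": d.day / days_in_month,   # assumed definition, continuous in (0, 1]
    }


feats = date_features(date(2019, 3, 15))
```

The categorical entries (month, day of week) would go through embedding layers, while year and percentage of month could be fed in as plain numeric inputs.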
My biggest issue is figuring out how to appropriately perform the embeddings for event-related data. Initially, I was going to use only holiday-related events, which are generally mutually exclusive. However, if I allow for local events that my collaborators think will be relevant to the prediction, they are no longer necessarily mutually exclusive (as a concrete example, there may be a music festival and a large conference going on in the same city on the same day).
My question becomes: how do I embed something like “events on day X” when the list of events on “day X” could be 0, 1, 2, 3… elements long? The simplest thing I can think of would be to embed each event individually, then add the resulting vectors together (they are just vectors, after all). But this doesn’t feel appropriate. Alternatively, could I assume a “longest length” (i.e., the maximum number of events on any one day, say 5), encode each event with a number, then use the sorted vector (e.g., [12, 345, 451, 99999, 99999], where 99999 is a stand-in for “no event”) as the key for my embedding lookup?
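For reference, here is a rough sketch of the first option (embed each event, then sum), combined with the fixed-length padding idea from the second. It assumes PyTorch; `PAD`, `MAX_EVENTS`, `pad_events`, and `EventBag` are names made up for illustration, and the vocabulary size of 500 events is arbitrary. I reserve index 0 for “no event” instead of 99999, since `nn.Embedding` needs every index to be smaller than the table size.

```python
import torch
import torch.nn as nn

PAD = 0          # hypothetical index reserved for "no event"
MAX_EVENTS = 5   # assumed upper bound on events per day


def pad_events(event_ids, max_events=MAX_EVENTS, pad=PAD):
    """Sort one day's event IDs and right-pad them to a fixed length."""
    ids = sorted(event_ids)[:max_events]
    return ids + [pad] * (max_events - len(ids))


class EventBag(nn.Module):
    """Embeds each event on a day and sums the vectors into one per-day vector."""

    def __init__(self, num_events, dim):
        super().__init__()
        # padding_idx keeps the PAD row at zero and excludes it from gradient
        # updates, so padded slots contribute nothing to the sum.
        self.emb = nn.Embedding(num_events + 1, dim, padding_idx=PAD)

    def forward(self, ids):
        # ids: (batch, MAX_EVENTS) integer tensor -> (batch, dim)
        return self.emb(ids).sum(dim=1)


days = [[], [12], [345, 12, 451]]   # event IDs per day, numbered from 1
ids = torch.tensor([pad_events(d) for d in days])
model = EventBag(num_events=500, dim=8)
out = model(ids)                    # one 8-dimensional vector per day
```

A day with no events maps to the zero vector, and the result does not depend on the order of the events. Swapping `sum` for a mean or max over the event dimension would be an easy variation to compare.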
Can anyone think of a better way? I suppose I should just try both of the above approaches, but if someone has experience with this sort of thing that I can leverage, that’d be great.