How to combine non-mutually exclusive embeddings


It’s been a while since I watched the courses (I watched v2 sometime early/mid last year). I’m thinking about building a CNN with embeddings to do time-series forecasting. My proposed initial architecture is something like the following:

  1. Create datetime-related features (month, day of week, percentage of month, year)
  2. Create event-related data (holidays, local events)
  3. Embed the categorical features (event data, month, day of week)
  4. Concatenate the features for each day
  5. Convolve the features with different window sizes (+/- 7, 5, 3 days)
  6. Batch norm/convolve repeat…
  7. Output a prediction vector for 31 days at a time
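
To make steps 3 to 7 concrete, here is a rough sketch of what I have in mind, assuming PyTorch. The class name ForecastCNN, all layer sizes, the 60-day input window, and the pooling head are placeholders I made up for illustration. I'm also reading “+/- 7 days” as a kernel of size 15 centered on each day, and treating percentage of month and year as the two continuous inputs. It assumes at most one event per day, which is exactly the limitation I ask about below.

```python
import torch
import torch.nn as nn


class ForecastCNN(nn.Module):
    """Hypothetical sketch of steps 3-7; every size here is a placeholder."""

    def __init__(self, n_events=10, n_cont=2, emb_dim=4, channels=16, horizon=31):
        super().__init__()
        # Step 3: one embedding table per categorical feature.
        self.month_emb = nn.Embedding(12, emb_dim)
        self.dow_emb = nn.Embedding(7, emb_dim)
        self.event_emb = nn.Embedding(n_events, emb_dim)  # one event id per day
        in_ch = 3 * emb_dim + n_cont
        # Step 5: parallel convolutions; a window of +/- w days is assumed to
        # mean kernel_size = 2 * w + 1, padded so the sequence length is kept.
        self.convs = nn.ModuleList(
            [nn.Conv1d(in_ch, channels, kernel_size=2 * w + 1, padding=w) for w in (7, 5, 3)]
        )
        # Step 6: a single batch norm layer (the real model would repeat this).
        self.bn = nn.BatchNorm1d(3 * channels)
        # Step 7: pool over days, then predict the whole horizon at once.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(3 * channels, horizon)
        )

    def forward(self, month, dow, event, cont):
        # month, dow, event: (batch, days) integer ids; cont: (batch, days, n_cont)
        # Step 4: concatenate the features for each day.
        x = torch.cat(
            [self.month_emb(month), self.dow_emb(dow), self.event_emb(event), cont], dim=-1
        )
        x = x.transpose(1, 2)  # Conv1d expects (batch, channels, days)
        x = torch.cat([conv(x) for conv in self.convs], dim=1)
        x = torch.relu(self.bn(x))
        return self.head(x)


# Random inputs just to check the shapes: 2 samples, 60 days of history each.
batch, days = 2, 60
model = ForecastCNN()
out = model(
    torch.randint(0, 12, (batch, days)),  # month ids
    torch.randint(0, 7, (batch, days)),   # day-of-week ids
    torch.randint(0, 10, (batch, days)),  # event ids
    torch.rand(batch, days, 2),           # continuous features
)
print(out.shape)
```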

My biggest issue is figuring out how to appropriately perform the embeddings for event-related data. Initially, I was going to use only holiday-related events, which are generally mutually exclusive. However, if I allow for local events that my collaborators think will be relevant to the prediction, they are no longer necessarily mutually exclusive (as a concrete example, there may be a music festival and a large conference going on in the same city on the same day).

My question becomes: how do I embed something like “events on day X” when the number of events on day X can be 0, 1, 2, 3, or more? The simplest thing I can think of is to embed each event individually and then add the vectors together (they are just vectors, after all), but this doesn’t feel appropriate. Alternatively, could I assume a maximum number of events per day (say 5), encode each event as an integer, and use the sorted, padded vector (e.g., [12, 345, 451, 99999, 99999], where 99999 is a stand-in for “no event”) as the key for my embedding lookup?
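
For reference, here is roughly how I picture the first option (embed each event, then add), again assuming PyTorch. The event ids, the table size of 500, and the embedding width of 8 are made up, and I use 0 rather than 99999 as the “no event” id so the table stays small. With padding_idx set, the “no event” row stays at zero, so a day with no events ends up as the zero vector, and since addition doesn't care about order the ids wouldn't need to be sorted.

```python
import torch
import torch.nn as nn

PAD = 0          # assumed id for "no event"; real event ids start at 1
MAX_EVENTS = 5   # assumed maximum number of simultaneous events per day

# padding_idx keeps row PAD at zeros, so empty slots add nothing to the sum.
event_emb = nn.Embedding(num_embeddings=500, embedding_dim=8, padding_idx=PAD)

# Two days: one with three events, one with none (ids are made up).
events = torch.tensor([
    [12, 345, 451, PAD, PAD],
    [PAD, PAD, PAD, PAD, PAD],
])

per_slot = event_emb(events)       # (days, MAX_EVENTS, 8)
day_vector = per_slot.sum(dim=1)   # (days, 8): one vector per day

# The same three events in a different order give the same day vector.
shuffled = torch.tensor([[451, PAD, 12, PAD, 345]])
shuffled_vector = event_emb(shuffled).sum(dim=1)
```

The resulting day_vector would then be concatenated with the other per-day features before the convolutions. As far as I can tell, nn.EmbeddingBag with mode="sum" does the lookup and the sum in one step, but I haven't tried it here.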

Can anyone think of a better way? I suppose I should just try both of the above suggestions, but if someone has experience with this sort of thing that I can leverage, that’d be great.