Hi all, I am training a deep reinforcement learning agent. The data I am working with is a time series with m features, but not all of the m features are available from time step T0. A few features start at T0, while others start at some later time step, say T1000. To summarise, the data has the shape (n_samples, T_time_steps, m_features), and certain features are not available until some nth time step. What is the best way to train the model under such circumstances?
This can be done in a few steps:
- Gather a list of all the features that ever show up (if they show up only once or a couple of times, you can probably safely toss them)
- Create a table with the features at each time point and set the unseen features to NaN.
- Fill the NaNs using a technique of your choosing (for example a constant, or the feature's mean over its observed values).
- Train on the resulting dataset.
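As a rough illustration of the steps above, here is a minimal NumPy sketch. It assumes the data is already an array of shape (n_samples, T_time_steps, m_features) with NaN wherever a feature has not started yet. The function name `prepare_features`, the 0.2 threshold and the mean fill are all arbitrary choices for the example, and the availability mask appended at the end is an extra that is not part of the steps listed; drop it if you don't want it.

```python
import numpy as np


def prepare_features(data, min_observed_fraction=0.2):
    """Drop rarely seen features, fill NaNs, and append an availability mask.

    data: float array of shape (n_samples, T_time_steps, m_features),
          with NaN where a feature is not available yet.
    """
    # Step 1: keep only features observed often enough (threshold is arbitrary).
    observed_fraction = (~np.isnan(data)).mean(axis=(0, 1))
    keep = observed_fraction >= min_observed_fraction
    kept = data[:, :, keep]

    # Step 3: fill NaNs, here with each feature's mean over its observed values.
    mask = ~np.isnan(kept)
    feature_mean = np.nanmean(kept, axis=(0, 1))
    filled = np.where(mask, kept, feature_mean)

    # Assumption, not in the steps above: also give the model a 0/1 flag per
    # feature so it can tell real values from filled ones.
    model_input = np.concatenate([filled, mask.astype(filled.dtype)], axis=-1)
    return model_input, keep


# Toy example with made-up values: feature 0 starts at step 0,
# feature 1 at step 5, feature 2 only at the last step.
rng = np.random.default_rng(0)
toy = rng.normal(size=(4, 10, 3))
toy[:, :5, 1] = np.nan  # step 2: unseen values are NaN
toy[:, :9, 2] = np.nan
model_input, keep = prepare_features(toy)
print(model_input.shape, keep)
```

The mean fill is only a placeholder; any other fill strategy slots into the same place.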
It also depends on whether the features are related in some way. Are their appearances correlated? Does one appear only when the other does not?
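One way to start answering those questions is to look at when each feature is available. The sketch below again assumes a NaN-marked array of shape (n_samples, T_time_steps, m_features); `availability_summary` is a hypothetical helper name, and treating a feature as "available" at a time step if any sample has a value there is my own simplification.

```python
import numpy as np


def availability_summary(data):
    """Summarise when features are available and how their availability overlaps."""
    # avail[t, j] is 1.0 if feature j has a value in at least one sample at step t.
    avail = (~np.isnan(data)).any(axis=0).astype(float)
    n_steps = avail.shape[0]

    # First time step at which each feature appears (-1 if it never appears).
    first_step = np.where(avail.any(axis=0), avail.argmax(axis=0), -1)

    # both[i, j]: fraction of time steps where features i and j are both available.
    both = avail.T @ avail / n_steps

    # only_one[i, j]: fraction of time steps where exactly one of i, j is available.
    frac = avail.mean(axis=0)
    only_one = frac[:, None] + frac[None, :] - 2 * both
    return first_step, both, only_one


# Toy example with made-up values: feature 1 only starts at step 5.
toy = np.ones((3, 10, 2))
toy[:, :5, 1] = np.nan
first_step, both, only_one = availability_summary(toy)
print(first_step)
print(both)
print(only_one)
```

A large `only_one` value for a pair suggests the two features rarely overlap, which may matter when choosing how to fill the gaps.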
Unfortunately, deep learning currently only takes care of some aspects of feature engineering; you still have to do a lot of manual work to wrangle the data.