RNN for multiple short time series?

Hi all! I have the following data problem. My data consists of N time series of length L (N is in the thousands, L is about 50). The data for each time series comes in labeled form (x(t), y(t)) for t = 1,…,L, where:

  • x(t) is a 20-dimensional numeric vector (continuous)
  • y(t) is a scalar value (continuous)

What I would like the model to do is estimate y(t+1) from past information, which could be x(t), x(t-1), …, and y(t), y(t-1), … , as well as possibly some state. Each time series should follow highly similar dynamics, so I would like to train a single model for all the time series.

My current idea is to use an RNN architecture for this, but I am not sure where to start. It seems to me that, unlike in most of the examples we’ve seen, an embedding layer is not needed because the inputs are already continuous vectors, so I can probably skip it. But how should I format the data, for instance? I am trying to wrap my head around the input, output, and hidden state dimensions.
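
For the dimensions question, here is a minimal sketch of one possible layout, not a definitive recipe. It assumes a batch-first array of shape (series, time steps, features), feeds [x(t), y(t)] as the input at each step, and uses y(t+1) as the target. The sizes, the random placeholder data, and the helper name `make_rnn_arrays` are hypothetical, and the code is 0-indexed where the post counts from t = 1.

```python
import numpy as np

# Hypothetical sizes matching the post: N series, length L, 20 features per step.
N, L, D = 1000, 50, 20
rng = np.random.default_rng(0)
x = rng.normal(size=(N, L, D))  # placeholder data; x[i, t] is the 20-dim vector
y = rng.normal(size=(N, L))     # placeholder data; y[i, t] is the scalar


def make_rnn_arrays(x, y):
    """Build inputs and targets for next-step prediction of y.

    The input at step t is [x(t), y(t)] and the target at step t is y(t+1),
    so the last step of each series has no target and is dropped from the inputs.
    """
    inputs = np.concatenate([x[:, :-1, :], y[:, :-1, None]], axis=-1)
    targets = y[:, 1:, None]
    return inputs, targets


inputs, targets = make_rnn_arrays(x, y)
print(inputs.shape)   # (1000, 49, 21)
print(targets.shape)  # (1000, 49, 1)
```

With this layout the input dimension is 21 and the output dimension is 1 per time step. The hidden state size is not fixed by the data; it is a hyperparameter to tune.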

Also, I am not sure how to split the data into train/test/validation.
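
One option for the split, sketched under the assumption that the N series are independent of each other, is to assign whole series to train, validation, or test, so that no series contributes time steps to more than one set. The function name `split_series_ids` and the 70/15/15 fractions are hypothetical choices for illustration.

```python
import random


def split_series_ids(n_series, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle series indices and split them into train/validation/test lists.

    Whole series are assigned to a single set; individual time steps
    are never split across sets.
    """
    ids = list(range(n_series))
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_test = round(n_series * test_frac)
    n_val = round(n_series * val_frac)
    test_ids = ids[:n_test]
    val_ids = ids[n_test:n_test + n_val]
    train_ids = ids[n_test + n_val:]
    return train_ids, val_ids, test_ids


train_ids, val_ids, test_ids = split_series_ids(1000)
print(len(train_ids), len(val_ids), len(test_ids))  # 700 150 150
```

If the series overlap in calendar time and the model will be used on future data, a split by time period may be a better fit than this random split.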

Any thoughts would be very welcome!



I have a similar problem. I’d love to hear more if you’ve had any progress with yours.

Hi @ranih, to my frustration this project was cancelled before I was able to really get started on it.

I would love to hear about your own problem though, and how it goes!

I’m still going over the videos, but my current approach is to convert the times to ages and use only the last 5-10 values of each series, without an RNN.
My approach is based on this blog post: http://blog.kaggle.com/2015/07/27/taxi-trajectory-winners-interview-1st-place-team-🚕/
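
As a rough illustration of the "last few values, no RNN" idea, the sketch below flattens the previous k steps of (x, y) into one fixed-length feature row per target, which any ordinary regressor could then consume. It is an assumption about how such features might be built, not the method from the linked post; the helper name `last_k_features`, the array shapes, and the random placeholder data are hypothetical.

```python
import numpy as np


def last_k_features(x, y, k=5):
    """Turn each series into fixed-length rows built from the previous k steps.

    x has shape (N, L, D) and y has shape (N, L). For every step t >= k,
    one row holds the flattened [x, y] values from steps t-k .. t-1,
    and the matching target is y at step t.
    """
    n, length, _ = x.shape
    xy = np.concatenate([x, y[:, :, None]], axis=-1)  # (N, L, D + 1)
    rows, targets = [], []
    for t in range(k, length):
        rows.append(xy[:, t - k:t, :].reshape(n, -1))  # (N, k * (D + 1))
        targets.append(y[:, t])
    return np.concatenate(rows, axis=0), np.concatenate(targets, axis=0)


rng = np.random.default_rng(0)
x_demo = rng.normal(size=(4, 50, 20))  # placeholder data
y_demo = rng.normal(size=(4, 50))      # placeholder data
features, window_targets = last_k_features(x_demo, y_demo, k=5)
print(features.shape)        # (180, 105)
print(window_targets.shape)  # (180,)
```

Each row has a fixed width of k * 21 values, so no recurrent state is needed, at the cost of ignoring anything older than k steps.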