RNN for multiple short time series?

msp · September 2, 2017, 11:09am

Hi all! I have the following data problem. My data consists of N time series of length L (N is in the thousands, L is about 50). The data for each time series comes in labeled form (x(t),y(t)) for t = 1,…,L, where:

x(t) is a 20-dimensional numeric vector (continuous)
y(t) is a scalar value (continuous)

What I would like the model to do is estimate y(t+1) from past information, which could be x(t), x(t-1), …, and y(t), y(t-1), … , as well as possibly some state. Each time series should follow highly similar dynamics, so I would like to train a single model for all the time series.

My current idea is to use an RNN architecture for this, but I am not sure where to start. It seems to me that, unlike in most of the examples we’ve seen, an embedding layer is not needed since the inputs are already continuous vectors, so I can probably skip it. But how should I format the data, for instance? I am trying to wrap my head around the input dimensions, output dimensions and hidden state dimensions.
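To make the dimension question concrete, here is a minimal sketch of one possible layout, not a definitive answer. It assumes the input at step t is x(t) with y(t) appended (21 values) and the target is y(t+1), so each series of length L yields L-1 steps. The helper names (`make_series`, `to_inputs_targets`) and the random toy data are made up for illustration. The hidden state size is not fixed by the data; it is a free hyperparameter chosen separately.

```python
import random

# Toy sizes: the post has N in the thousands, L about 50, x(t) of dimension 20.
N, L, D = 4, 50, 20

def make_series(length, dim, rng):
    """Hypothetical stand-in for one real series: random x(t) and y(t)."""
    xs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(length)]
    ys = [rng.gauss(0, 1) for _ in range(length)]
    return xs, ys

def to_inputs_targets(xs, ys):
    """Input at step t is x(t) with y(t) appended; target is y(t+1)."""
    steps = len(ys) - 1  # the last step has no y(t+1) to predict
    inputs = [xs[t] + [ys[t]] for t in range(steps)]
    targets = [ys[t + 1] for t in range(steps)]
    return inputs, targets

rng = random.Random(0)
batch_inputs, batch_targets = [], []
for _ in range(N):
    xs, ys = make_series(L, D, rng)
    inp, tgt = to_inputs_targets(xs, ys)
    batch_inputs.append(inp)
    batch_targets.append(tgt)

# (series, time steps, features per step) and (series, time steps)
input_shape = (len(batch_inputs), len(batch_inputs[0]), len(batch_inputs[0][0]))
target_shape = (len(batch_targets), len(batch_targets[0]))
print(input_shape)   # (4, 49, 21)
print(target_shape)  # (4, 49)
```

With this layout, a recurrent layer would read 21 features per step and a final linear layer would map the hidden state at each step to a single predicted value.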

Also, I am not sure how to split the data into train/test/validation.
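One option, sketched below under assumptions, is to split by whole series rather than by time step, so that no series contributes to more than one set. This assumes the series are independent of each other and that the goal is to generalise to unseen series; if the series overlap in calendar time, a time-based split may be more appropriate. The function name `split_series_ids` and the 70/15/15 fractions are illustrative choices, not something from the original problem.

```python
import random

def split_series_ids(n_series, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle series indices and cut them into train/validation/test lists."""
    ids = list(range(n_series))
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_test = round(n_series * test_frac)
    n_val = round(n_series * val_frac)
    test_ids = ids[:n_test]
    val_ids = ids[n_test:n_test + n_val]
    train_ids = ids[n_test + n_val:]
    return train_ids, val_ids, test_ids

train_ids, val_ids, test_ids = split_series_ids(2000)
print(len(train_ids), len(val_ids), len(test_ids))  # 1400 300 300
```

Each index refers to one complete series of length L, so all of its time steps end up in the same set.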

Any thoughts would be very welcome!

Cheers

ranih · September 10, 2018, 9:34pm

I have a similar problem. I’d love to hear more if you’ve had any progress with yours.

msp · September 11, 2018, 8:46am

Hi @ranih, to my frustration this project was cancelled before I was able to really get started on it.

I would love to hear about your own problem though, and how it goes!

ranih · September 13, 2018, 6:42am

I’m still going over the videos, but my current approach is to convert the times to ages and use only the last 5-10 values of the series, without an RNN.
My approach is based on this blog post: http://blog.kaggle.com/2015/07/27/taxi-trajectory-winners-interview-1st-place-team-🚕/
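As a rough illustration of the "last few values, no RNN" idea, the sketch below flattens the k most recent (x, y) pairs into one fixed-length feature vector that any ordinary regressor could consume. The function name `last_k_features` and the toy data are hypothetical, and the time-to-age conversion is not shown here.

```python
def last_k_features(xs, ys, t, k=5):
    """Flatten the k most recent (x, y) pairs up to and including step t."""
    if t < k - 1:
        raise ValueError("not enough history for a window of size k")
    features = []
    for s in range(t - k + 1, t + 1):
        features.extend(xs[s])   # all components of x(s)
        features.append(ys[s])   # followed by y(s)
    return features

# Toy series with a 2-dimensional x to keep the output readable.
xs = [[float(t), float(10 * t)] for t in range(8)]
ys = [float(100 + t) for t in range(8)]

feats = last_k_features(xs, ys, t=6, k=3)
target = ys[7]  # the value to predict from the window ending at t=6

print(len(feats))  # 9
print(feats[-3:])  # [6.0, 60.0, 106.0]
print(target)      # 107.0
```

For 20-dimensional x and k = 5, this would give 105 features per training example.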