Hi! I’ve been looking for a Fast.ai example with tabular data where each row depends on the previous one, but couldn’t find a suitable one. I’m trying to forecast the weather from previous hourly temperature data for several cities. The data looks like this:
| day | time | Paris | Berlin | London | Amsterdam |
|---|---|---|---|---|---|
| 7.3.2019 | 8:00 | 15.3 | 13.4 | 12.9 | 13.2 |
| 7.3.2019 | 9:00 | 15.4 | 13.5 | 13.1 | 13.3 |
| 7.3.2019 | 10:00 | 15.7 | 13.7 | 13.4 | 13.5 |
| 7.3.2019 | 11:00 | 15.5 | 13.9 | 13.6 | 13.9 |
| 7.3.2019 | 12:00 | 15.9 | 13.4 | 13.2 | 13.5 |
| 7.3.2019 | 13:00 | 16.0 | 14.0 | 13.9 | 13.9 |
| 7.3.2019 | 14:00 | 15.7 | 14.1 | 14.0 | 14.0 |
| 7.3.2019 | 15:00 | 15.6 | 13.9 | 14.1 | 13.8 |
| 7.3.2019 | 16:00 | ? | ? | ? | ? |
I’ve searched for time series and LSTM examples, but I’m not sure those are the right keywords, since I only found more advanced pieces of code or NLP-related examples. Which data loader and learner should I use for predicting the temperature in each city for the next hour? Thanks a bunch!
It seems there isn’t a ready-made method in Fast.ai for this type of data. Have any of you stumbled upon a relatively beginner-friendly implementation of this using non-standard Fast.ai methods?
You could try something like this: in pandas, put the next row’s values on the same row as the previous one. Then pass a list of those columns as dep_var. I have no idea if it’ll give good results, but that’s how I’d do it.
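A minimal pandas sketch of that shift idea, using the column names from the table above (the `_next` suffix and `train_df` name are my own choices, not Fast.ai conventions):

```python
import pandas as pd

cities = ["Paris", "Berlin", "London", "Amsterdam"]

# A small slice of the data from the original post.
df = pd.DataFrame({
    "day":       ["7.3.2019"] * 3,
    "time":      ["8:00", "9:00", "10:00"],
    "Paris":     [15.3, 15.4, 15.7],
    "Berlin":    [13.4, 13.5, 13.7],
    "London":    [12.9, 13.1, 13.4],
    "Amsterdam": [13.2, 13.3, 13.5],
})

# Copy each city's next-hour temperature onto the current row.
for city in cities:
    df[f"{city}_next"] = df[city].shift(-1)

# The last row has no known next hour, so drop it for training.
dep_var = [f"{c}_next" for c in cities]  # pass this list as dep_var
train_df = df.dropna(subset=dep_var)
```

Each remaining row now contains the current hour’s temperatures as inputs and the next hour’s temperatures as the (multi-column) target.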
In theory flattening the rows should work, since the learner gets all the information from previous time steps. I’ll hit a performance wall very quickly though, since I’m planning to run it with hundreds of cities and include other features too, such as wind speed and humidity. One time step has nearly a thousand features, leading to very slow training when stacked up 5–10 times. I’ll try this today anyway, thanks!
Doesn’t the tabular learner only read rows as individual observations rather than as a sequence? I could concatenate the sequences into single rows as Mueller suggested above, but this will lead to a massive number of features. I could be missing something here, though. I’ll have a look into Gramian Angular Fields.
Thanks!
Yes, it reads one row at a time, but the network (linear layers and embedding matrices) learns the relationships between inputs over time.
Give it a try: the Rossmann paper addresses the same problem, but instead of forecasting the weather they predict sales, feeding many rows one at a time.
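To sketch that Rossmann-style, one-row-per-observation layout with plain pandas (names like `long_df` are mine, not from any library): reshape the wide table so each row is one (city, hour) observation, with the date/time parts as categorical inputs the embeddings can learn from.

```python
import pandas as pd

# A small slice of the data from the original post, in wide format.
df = pd.DataFrame({
    "day":    ["7.3.2019", "7.3.2019"],
    "time":   ["8:00", "9:00"],
    "Paris":  [15.3, 15.4],
    "Berlin": [13.4, 13.5],
})

# Melt the wide city columns into long format: one row per city per hour.
long_df = df.melt(id_vars=["day", "time"], var_name="city", value_name="temp")

# Categorical inputs (city, hour, day-of-week, ...) each get an embedding
# matrix in a tabular model; "temp" would be the dependent variable.
long_df["hour"] = long_df["time"].str.split(":").str[0].astype(int)
```

This keeps the feature count small regardless of how many cities you add, since cities become category values rather than columns.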
There’s nothing wrong with a massive number of features. As Jeremy says in the Intro to ML course, the curse of dimensionality isn’t really a thing in practice. I’ve done research where I had 100+ features, and I engineered 80% of those myself.