Time Series Forecasting

Hi all,

I am working on time series data (basically predicting a number in the future). What might be the best approach besides RNNs? I think I’ve heard Jeremy say that he obtained better results using a traditional tabular approach, but I couldn’t find the lesson that covers it.

Thanks and stay safe!

Part 1 v3 Lesson 6, Rossmann is the video you want :wink:

Thank you!

The state of the art in time series forecasting is often achieved with variants of the LSTM architecture. Yoshua Bengio and co-authors recently published a new architecture, N-BEATS, that uses a multi-layer fully connected (FC) network. @takotab implemented N-BEATS for fastai2. You can find his package, fastseq, here.

You might check this post for an overview of the different time series projects that we are developing for fastai in this forum.

Here is the Time Series Thread (for fastai2) where we are discussing the implementation of a common time series package for fastai2.

PS: The case treated in Rossmann is a regression problem, which is a kind of single-point forecasting. In the present project, we are more interested in multi-point forecasting, and more specifically in probabilistic forecasting. The latter gives you a confidence interval around your forecast (prediction). Single-point forecasting gives only the median or the mean of the forecast, which is insufficient because we don’t know the size of the confidence interval.
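To make the difference concrete, here is a minimal sketch (not taken from fastseq or GluonTS) of how a set of sampled forecast paths can be summarized into a point forecast plus an interval. The sample paths are synthetic, and the function name `summarize_forecast` and the 10%/90% quantile levels are assumptions chosen only for illustration.

```python
import numpy as np

def summarize_forecast(samples, lower=0.1, upper=0.9):
    """Summarize sampled forecast paths of shape (n_samples, horizon)."""
    samples = np.asarray(samples, dtype=float)
    return {
        "mean": samples.mean(axis=0),                   # single-point forecast (mean)
        "median": np.median(samples, axis=0),           # single-point forecast (median)
        "lower": np.quantile(samples, lower, axis=0),   # lower bound of the interval
        "upper": np.quantile(samples, upper, axis=0),   # upper bound of the interval
    }

# Synthetic sample paths (an assumption for this sketch): 500 draws over a
# 7-step horizon, with uncertainty growing the further ahead we forecast.
rng = np.random.default_rng(0)
horizon = 7
spread = np.arange(1, horizon + 1)
samples = 100 + rng.normal(0.0, 1.0, size=(500, horizon)) * spread

summary = summarize_forecast(samples)
for step in range(horizon):
    print(
        f"t+{step + 1}: median={summary['median'][step]:.1f} "
        f"interval=[{summary['lower'][step]:.1f}, {summary['upper'][step]:.1f}]"
    )
```

A single-point forecast would report only the mean or median column; the interval columns are what a probabilistic forecast adds.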

Thank you Farid. This is precisely what I am looking for.

I am looking to forecast daily traffic to our website, where I have about 1200 days’ worth of data. Does this seem like a solvable problem? I have been using Facebook’s Prophet model. Eager to see what I can do with Fast AI!

@farid—do I understand correctly you guys implemented that N-BEATS architecture in fast.ai within a month of that article coming out?

@jwithing, for N-BEATS, all the credit goes to @takotab: He did a fantastic job by implementing it in fastai2. You might check out his documentation.

I built the timeseries package which is a time series classification package for fastai2. You can find its documentation here.

I’m planning to start porting Amazon GluonTS soon. Tako and I are teaming up.

thanks @farid! I’m by no means a developer or engineer, so it’s awesome to see how Fast AI and its community help enable marketers like me :wink:

I haven’t started on this yet, but I am thinking one issue will be that our paid marketing channels are a significant contributor to site traffic. And our paid marketing spend is variable over time.

We are a large online fashion resale site so I currently have a dataframe of the following:

  1. Date
  2. Number of Visitors
  3. Fashion item release day? (0 or 1)

Would it be of value to add daily paid marketing spend as another variable? Probably right?

@jwithing Yes, it would be good to include daily paid marketing as a co-variate (feature time-series). Your model will learn the impact of that data and will incorporate it in its forecasting.

Below is a brief illustrated explanation of how it works at a high level:

A model is trained by randomly sampling several training examples from each of the time series in the training dataset. Each training example consists of a pair of adjacent context and prediction windows with fixed predefined lengths. The context_length hyperparameter controls how far in the past the network can see, and the prediction_length hyperparameter controls how far in the future predictions can be made.
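As a rough sketch of that sampling step (not the code of any particular library), the snippet below draws random pairs of adjacent context and prediction windows from a single series. The function name `sample_windows` and the toy hourly series are assumptions made for illustration; real implementations may also pad windows that start before the beginning of the series.

```python
import numpy as np

def sample_windows(series, context_length, prediction_length, num_samples, seed=0):
    """Draw random (context, prediction) window pairs from one time series."""
    series = np.asarray(series, dtype=float)
    window = context_length + prediction_length
    if len(series) < window:
        raise ValueError("series is shorter than context_length + prediction_length")
    rng = np.random.default_rng(seed)
    # Valid start positions run from 0 to len(series) - window, inclusive.
    starts = rng.integers(0, len(series) - window + 1, size=num_samples)
    examples = []
    for start in starts:
        split = start + context_length
        context = series[start:split]                      # what the model sees
        target = series[split:split + prediction_length]   # what it must predict
        examples.append((context, target))
    return examples

# Toy hourly series: 48 consecutive values, so the windows are easy to inspect.
hourly = np.arange(48, dtype=float)
examples = sample_windows(hourly, context_length=12, prediction_length=6, num_samples=5)
for context, target in examples:
    print(f"context {context[0]:.0f}..{context[-1]:.0f} -> predict {target[0]:.0f}..{target[-1]:.0f}")
```

Each pair is adjacent in time: the prediction window starts right where the context window ends.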

The following figure represents five samples with context lengths of 12 hours and prediction lengths of 6 hours drawn from element i. The feature time series are x_{i,1,t} and u_{i,2,t} (also called co-variates in the literature).

To capture seasonality patterns, a model can also automatically feed lagged values from the target time series. In the example with hourly frequency, for each time index t = T, the model exposes the z_{i,t} values that occurred approximately one, two, and three days in the past.
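If you want to see what lagged values look like as plain features, here is a small pandas sketch. It is only an illustration of the idea, not how any specific model builds its lags internally; the helper name `add_lag_features`, the column names, and the toy data are assumptions.

```python
import numpy as np
import pandas as pd

def add_lag_features(target, lags_in_days=(1, 2, 3), steps_per_day=24):
    """Return a frame with the target and copies shifted by whole days."""
    frame = pd.DataFrame({"z": target})
    for days in lags_in_days:
        # shift(n) moves values n rows forward, so row t holds the value from t - n.
        frame[f"z_lag_{days}d"] = target.shift(days * steps_per_day)
    return frame

# Toy hourly target: five days of consecutive values.
index = pd.date_range("2020-01-01", periods=24 * 5, freq=pd.Timedelta(hours=1))
target = pd.Series(np.arange(len(index), dtype=float), index=index)

frame = add_lag_features(target)
print(frame.iloc[72:75])  # first rows where all three lags are available
```

The first rows of each lag column are NaN because there is no history that far back, so those rows need to be dropped or filled before training.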

Got it, those illustrations are very helpful! I’m reaching out to @takotab over email; I can’t figure out how to install fastseq on a Colab notebook. I have fastai2 installed there.

@jwithing what were you having issues with? The modifications to his install code to get it working in colab should be:

!git clone https://github.com/takotab/fastseq.git
%cd fastseq
!pip install -e .

Thanks @muellerzr! I was using !cd, not %cd.

Well, I’m still doing something wrong. Somehow I am not installing the correct packages. I think this is the issue because I get an error about not having nbdev. So I add !pip install nbdev, then get another missing-package error, and so on until I get to TSDataLoader, which can’t be installed with !pip install. Here is what I am running:

!curl -s https://course.fast.ai/setup/colab | bash

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'siteprediction/'

!pip install fastai2

!git clone https://github.com/takotab/fastseq.git
%cd fastseq
!pip install -e .

from fastai import *
from fastai2.basics import *
from fastseq.all import *
from fastseq.nbeats.model import *
from fastseq.nbeats.learner import *
from fastseq.nbeats.callbacks import *

Do you have any idea why the training/validation loss is NaN? I tried filling in the missing values manually, without going through FillMissing, and I still got NaN.