Time Series Forecasting

youcefjd · March 24, 2020, 2:34am

Hi all,

I am working on time series data (basically predicting a number in the future). What might be the best approach besides RNNs? I think I’ve heard Jeremy say that he obtained better results using a traditional tabular approach, but I couldn’t find the course related to that.

Thanks and stay safe!

muellerzr · March 24, 2020, 2:41am

Part 1 v3 Lesson 6, Rossmann is the video you want

youcefjd · March 24, 2020, 2:43am

Thank you!

farid · March 24, 2020, 3:46am

The state of the art in time series forecasting is achieved using different variants of the LSTM architecture. Yoshua Bengio’s group recently published a new architecture, N-BEATS, that uses a multi-layer FC network. @takotab implemented N-BEATS for fastai2. You can find his package, fastseq, here

You might check this post to have an overview of the different time series projects that we are developing for fastai in this forum.

Here is the Time Series Thread (for fastai2) where we are discussing the implementation of a common time series package for fastai2.

PS: The case treated in Rossmann is a regression: it is a kind of single-point forecasting. In the present project, we are more interested in multi-point forecasting and, more specifically, in probabilistic forecasting. The latter gives you a confidence interval around your forecast (prediction). Single-point forecasting only gives either the median or the mean of the forecast, which is insufficient because we don’t know the size of the confidence interval.
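
To make the point-versus-probabilistic distinction a bit more concrete, here is a minimal sketch of the pinball (quantile) loss, which is one common way to train or score a forecast of a given quantile rather than a single mean or median. It is plain Python and is not taken from fastseq or GluonTS; the function name and the numbers are made up for illustration.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Average pinball (quantile) loss for one quantile level in (0, 1)."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted
        # Under-prediction is weighted by `quantile`,
        # over-prediction by (1 - quantile).
        total += max(quantile * error, (quantile - 1) * error)
    return total / len(y_true)


# Made-up values, for illustration only.
actuals = [100.0, 120.0, 90.0]
median_forecast = [110.0, 110.0, 110.0]  # hypothetical 0.5-quantile forecast
upper_forecast = [130.0, 130.0, 130.0]   # hypothetical 0.9-quantile forecast

loss_median = pinball_loss(actuals, median_forecast, 0.5)
loss_upper = pinball_loss(actuals, upper_forecast, 0.9)
```

Forecasting a low and a high quantile together (say 0.1 and 0.9) gives an interval around the median, which is the extra information a single-point forecast lacks.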

youcefjd · March 25, 2020, 2:14am

Thank you Farid. This is precisely what I am looking for.

jwithing · March 25, 2020, 2:49am

I am looking to forecast daily traffic to our website, where I have about 1200 days worth of data. Does this seem like a solvable problem? I have been using Facebook’s Prophet model. Eager to see what I can do with Fast AI!

jwithing · March 25, 2020, 2:58am

@farid—do I understand correctly you guys implemented that N-BEATS architecture in fast.ai within a month of that article coming out?

farid · March 25, 2020, 4:28am

@jwithing, for N-BEATS, all the credit goes to @takotab: He did a fantastic job by implementing it in fastai2. You might check out his documentation.

I built the timeseries package which is a time series classification package for fastai2. You can find its documentation here.

I’m planning to start porting Amazon GluonTS soon. Tako and I are teaming up.

jwithing · March 25, 2020, 1:17pm

thanks @farid! I’m by no means a developer or engineer, so it’s awesome to see how Fast AI and its community help enable marketers like me

jwithing · March 25, 2020, 2:49pm

I haven’t started on this yet, but I am thinking one issue will be that our paid marketing channels are a significant contributor to site traffic. And our paid marketing spend is variable over time.

We are a large online fashion resale site so I currently have a dataframe of the following:

Date
Number of Visitors
Fashion item release day? (0 or 1)

Would it be of value to add daily paid marketing spend as another variable? Probably right?

farid · March 25, 2020, 4:56pm

@jwithing Yes, it would be good to include daily paid marketing as a co-variate (feature time-series). Your model will learn the impact of that data and will incorporate it in its forecasting.
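
As a rough sketch of what "including it as a covariate" means for the data layout: every day gets one row holding the target (visitors) plus the covariates for that same day. The helper name, column names, and numbers below are hypothetical, and this is plain Python rather than any particular library's input format.

```python
from datetime import date, timedelta


def build_rows(start, visitors, release_flags, spend):
    """Align the target and its covariates on one row per day."""
    if not (len(visitors) == len(release_flags) == len(spend)):
        raise ValueError("all series must cover the same days")
    rows = []
    for offset, (v, r, s) in enumerate(zip(visitors, release_flags, spend)):
        day = start + timedelta(days=offset)
        rows.append({
            "date": day.isoformat(),
            "day_of_week": day.weekday(),  # 0 = Monday ... 6 = Sunday
            "visitors": v,                 # target
            "release_day": r,              # covariate, 0 or 1
            "paid_spend": s,               # covariate
        })
    return rows


# Made-up values, for illustration only.
rows = build_rows(
    date(2020, 3, 23),
    visitors=[1200, 1350, 1500],
    release_flags=[0, 0, 1],
    spend=[500.0, 650.0, 900.0],
)
```

One thing to keep in mind: a covariate only helps at forecast time if its future values are known (planned spend, scheduled release days) or are forecast separately.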

Here is a brief illustrated explanation of how it works at a high level:

A model is trained by randomly sampling several training examples from each of the time series in the training dataset. Each training example consists of a pair of adjacent context and prediction windows with fixed predefined lengths. The context_length hyperparameter controls how far in the past the network can see, and the prediction_length hyperparameter controls how far in the future predictions can be made.
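
The sampling described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the actual sampler of any library: the function name is hypothetical, and it only draws windows that fit entirely inside the series.

```python
import random


def sample_windows(series, context_length, prediction_length, num_samples, seed=0):
    """Randomly cut pairs of adjacent (context, prediction) windows from one series."""
    rng = random.Random(seed)
    window = context_length + prediction_length
    if len(series) < window:
        raise ValueError("series is shorter than context_length + prediction_length")
    examples = []
    for _ in range(num_samples):
        # randint is inclusive on both ends, so the last valid start is allowed.
        start = rng.randint(0, len(series) - window)
        split = start + context_length
        examples.append((series[start:split], series[split:start + window]))
    return examples


# 48 hourly observations; five samples with a 12-hour context and a 6-hour prediction.
hourly = list(range(48))
examples = sample_windows(hourly, context_length=12, prediction_length=6, num_samples=5)
```

Each element of `examples` is a `(context, prediction)` pair: the model sees the context window and is trained to predict the window that immediately follows it.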

The following figure represents five samples with context lengths of 12 hours and prediction lengths of 6 hours drawn from element i. The feature time series are x_{i,1,t} and u_{i,2,t} (also called co-variates in literature).

To capture seasonality patterns, a model can also automatically feed in lagged values from the target time series. In the example with hourly frequency, for each time index, t = T, the model exposes the z_{i,t} values, which occurred approximately one, two, and three days in the past.
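
A minimal sketch of the lagged-value idea for hourly data, assuming lags of exactly 24, 48, and 72 steps (the text says "approximately", so a real model may use other offsets). The function name is hypothetical.

```python
def lagged_values(series, t, lags):
    """Return the target values observed `lag` steps before index t (None if unavailable)."""
    return [series[t - lag] if t - lag >= 0 else None for lag in lags]


hourly = [float(i) for i in range(100)]
day_lags = [24, 48, 72]  # one, two, and three days back at hourly frequency

features = lagged_values(hourly, t=80, lags=day_lags)
# features == [56.0, 32.0, 8.0]
```

Early in the series some lags reach back before the first observation, which is why the sketch returns None there instead of wrapping around.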

jwithing · March 26, 2020, 12:38am

Got it, those illustrations are very helpful! I’m reaching out to @takotab over email… I can’t figure out how to install fastseq on a Colab notebook. I have fastai2 installed there.

muellerzr · March 26, 2020, 12:44am

@jwithing what were you having issues with? The modifications to his install code to get it working in colab should be:

!git clone https://github.com/takotab/fastseq.git
%cd fastseq
!pip install -e .

jwithing · March 26, 2020, 1:52am

thanks @muellerzr! i was using !cd not %cd

jwithing · March 26, 2020, 2:11am

Well, I’m still doing something wrong. Somehow I am not installing the correct packages. I think this is the issue because I am getting an error about not having nbdev. So then I add !pip install nbdev and get another package error, and so on, until I get to TSDataLoader, which can’t be installed with !pip install

!curl -s https://course.fast.ai/setup/colab | bash

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'siteprediction/'

!pip install fastai2

!git clone https://github.com/takotab/fastseq.git
%cd fastseq
!pip install -e .

from fastai import *
from fastai2.basics import *
from fastseq.all import *
from fastseq.nbeats.model import *
from fastseq.nbeats.learner import *
from fastseq.nbeats.callbacks import *

youcefjd · March 26, 2020, 2:26am

Do you have any idea why the training/validation loss is NaN? Although I filled in the missing values manually, without going through FillMissing, I still got NaN.

hahmed988 · March 28, 2020, 1:39pm

Can someone point me to an example or blog post on multivariate time series forecasting using fastai, where we can also pass in other categorical columns, like day of week?

I looked in the fastseq example but that is a univariate example. I have 2 months of data and I need to predict for next fifteen days.

swell · March 28, 2020, 4:29pm

I’m having the same issue where I can’t get past the TSDataLoader import. Have you had any luck fixing it?

farid · March 28, 2020, 4:39pm

You might check out AWS Labs’ time series forecasting repo, called GluonTS.

GluonTS uses Apache MXNet (instead of PyTorch or TensorFlow). They implemented many state-of-the-art architectures (DeepFactor, DeepAR, DeepState, GP Forecaster, GP Var, LSTNet, N-BEATS, NPTS, Prophet, R Forecast, seq2seq, Simple FeedForward, Transformer, Trivial, and WaveNet). Many of them (DeepFactor, DeepAR, DeepState) also use categorical data (covariates) and probabilistic forecasting.

jwithing · March 29, 2020, 3:41am

@takotab indicates that he needs to do some more work to get it supported on Colab. I’m sure he’ll get to it!