Time series / sequential data study group

I am glad that we all agree that there is huge potential here. But unlike computer vision, we do not have a large amount of publicly available labeled data, which is why I think data acquisition is very important here. Competitions like Kaggle’s and others help with that, as do people who make their data publicly available.


@hfawaz have you tried running a regular GD method on UCR? The datasets are so small that you could use something like LBFGS instead of Adam/SGD.
It may improve accuracy considerably. Is there an implementation of (non-stochastic) GD in fastai, @jeremy?
I have used PyTorch’s built-in LBFGS before, so I may give it a try.
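
For anyone who wants to try this, here is a minimal sketch of full-batch training with PyTorch's built-in `torch.optim.LBFGS`. The toy data, the tiny model and the hyperparameters are assumptions made for illustration only; nothing here comes from the UCR experiments.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a small dataset: 60 univariate series of length 50, 2 classes.
x = torch.randn(60, 1, 50)
y = torch.randint(0, 2, (60,))

# Deliberately tiny model, just enough to have something to optimize.
model = nn.Sequential(
    nn.Conv1d(1, 8, 7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(8, 2),
)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.LBFGS(model.parameters(), lr=1.0, max_iter=20,
                        line_search_fn="strong_wolfe")

def closure():
    # LBFGS evaluates the loss several times per step, so it needs a closure.
    opt.zero_grad()
    loss = loss_fn(model(x), y)  # full batch: the whole training set at once
    loss.backward()
    return loss

initial_loss = loss_fn(model(x), y).item()
for _ in range(5):
    opt.step(closure)
final_loss = loss_fn(model(x), y).item()
print(initial_loss, final_loss)
```

Because every step uses the whole training set, this only makes sense when the data fits in memory, which is the case for many of the small datasets discussed here.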

For InceptionTime we did play with the batch size. With a big enough batch size you are effectively running batch gradient descent instead of mini-batch gradient descent over the small datasets. I am not sure how that would affect ResNet and FCN, but I think the original implementations of ResNet and FCN use a formula to compute the mini-batch size, which I found to be very suboptimal.

Yeah just use the ones in pytorch or scipy - they work fine.

Lol, yeah:

batch_size = min(x_train.shape[0]/10, 16)
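
To get a feel for what this rule produces, here is a small sketch that evaluates it for a few training-set sizes. The function name, the integer truncation and the floor of 1 are assumptions of this sketch, not part of the original implementation.

```python
def formula_batch_size(n_train, cap=16):
    # Same rule as the line above, truncated to an integer and floored at 1
    # (the truncation and the floor are assumptions of this sketch).
    return max(1, int(min(n_train / 10, cap)))

for n in (20, 60, 500, 5000):
    bs = formula_batch_size(n)
    steps = -(-n // bs)  # ceiling division: mini-batches per epoch
    print(n, bs, steps)
```

On a training set of 20 series the rule gives a batch size of 2, so each gradient step sees very little data, while anything above 160 series is capped at 16.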

Hi @hfawaz,

Welcome to our study group! It’s a privilege to have a world-class Time Series researcher joining us!

I hope you’ll find the experience as useful and rewarding as I have. I can say that for me the fastai community’s been the best learning and collaborative environment I’ve found in the area of ML.

I’d really like to thank you and the rest of the team for the quality of the work you are producing and for openly sharing your code. I think you’re raising the standard of research in TS.

I also work in the area of Time Series Classification and Regression (not Forecasting), mainly with multivariate datasets.

I have a few comments on your previous post:

  • InceptionTime: I read your paper when it was made public, found it super interesting, and created a PyTorch version. I’ve been using it for a couple of weeks and the results on my own datasets are better than with ResNet. So thanks a lot for developing it! Personally I think that the idea of using larger receptive fields goes in the right direction. I’m building a Practical Time Series repo that I’ll be able to share either today or tomorrow. It contains all that is required to train TS models with fastai, as well as a collection of some of the state-of-the-art TS architectures (FCN, ResNet, ResCNN, InceptionTime, etc). I’m currently investigating ways to improve the performance of the InceptionTime network by applying the fastai framework.
  • Imaging Time Series: I’m with you and Jeremy that encoding TS as images seems like a waste of time, since all the information is contained in the raw data. However, I’ve seen that on some datasets imaging works really well, even if the dataset is tiny, as you can benefit from computer vision transfer learning. I have tried multiple encodings (Gramian, MTF, RecurrencePlots, Wavelets, etc) with mixed results. I believe that in the end raw input models should prevail, but it’s also true that our brain is far better at identifying patterns in charts than in numerical data.
  • Recurrent models: In all comparisons I’ve made, I’ve always found CNN models far superior to RNNs, and they are much faster to train. I gave up on RNNs some time ago.
  • Regression: I’m also working in this area, but my datasets are proprietary, so I cannot share them. Sorry about that!
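
As an illustration of the imaging idea mentioned above, here is a minimal sketch of one such encoding, a Gramian angular summation field, in PyTorch. The function name and the min-max rescaling to [-1, 1] are choices made for this sketch, not code from any of the repos discussed here, and a constant series would need special handling.

```python
import torch

def gramian_angular_summation_field(x):
    # x: 1D tensor holding one univariate series (must not be constant).
    x_min, x_max = x.min(), x.max()
    x = 2 * (x - x_min) / (x_max - x_min) - 1   # rescale to [-1, 1]
    x = x.clamp(-1, 1)                          # guard against rounding error
    phi = torch.acos(x)                         # polar-angle encoding
    # Entry (i, j) is cos(phi_i + phi_j), giving a T x T image.
    return torch.cos(phi[:, None] + phi[None, :])

series = torch.sin(torch.linspace(0, 6.28, 64))
image = gramian_angular_summation_field(series)
print(image.shape)
```

The resulting square image can then be fed to a pretrained vision model, which is where the transfer-learning benefit would come from.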

Just to give you an idea, here are a few areas I’m currently testing in the area of multivariate TS (everything using fastai):

  • Impact of LSUV (and related) initialization
  • New optimizers (like Ranger, developed by some great fastai colleagues - thread)
  • New activation function (also developed by some great fastai colleagues - thread)
  • Data augmentation: cutout, mixup, cutmix,…
  • Semi-supervised learning: mixmatch, uda, s4l
  • Training: progressive resizing
  • Ensembles vs multi-branch models vs hybrids
  • New hybrid Time-Frequency models
  • Inception architecture tweaks: ’bag of tricks’
  • Visualization of activations
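
To make one item of this list concrete, here is a minimal sketch of mixup applied to a batch of time series. It assumes one-hot labels and a hypothetical `alpha` value; it is not the fastai callback, just the underlying arithmetic.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, y_onehot, alpha=0.4):
    # Draw one mixing coefficient for the whole batch from Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))            # random partner for each sample
    x_mix = lam * x + (1 - lam) * x[perm]       # blend the series
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]  # blend the labels
    return x_mix, y_mix, lam

x = torch.randn(16, 3, 100)                     # batch, channels, time steps
y = F.one_hot(torch.randint(0, 4, (16,)), num_classes=4).float()
x_mix, y_mix, lam = mixup_batch(x, y)
print(x_mix.shape, y_mix.shape, lam)
```

The mixed labels are soft targets, so the loss has to accept class probabilities rather than class indices.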

I’ll post any significant insights I get during my experiments.

I’m more than happy to discuss any of this with anybody who’s interested. I’ll also create notebooks to demonstrate this functionality.


@oguiza I am also very glad to be here, thanks for taking this great initiative and creating this study group!
I find it great to be able to discuss with everyone interested in such an important topic.
I will be eagerly waiting for your results and implementation of InceptionTime in fastai.

As for imaging time series, I think that for some datasets (and maybe most of them) adding domain knowledge to the design of an architecture will help improve accuracy. Imaging (a frequency-domain transform, for example) is a kind of domain knowledge, which would explain why it improved accuracy on some datasets.

I am also working on multivariate, semi-supervised, data augmentation, ensembling and some architecture tweaks. I will keep everyone up-to-date once I have something concrete to show.

Thanks again for all of this!


@oguiza I implemented the Inception module today, it looks like this:

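import torch
import torch.nn as nn
# `conv` and `noop` are helpers defined outside this post; judging by the printed
# module further down, conv(ni, nf, ks) builds a bias-free Conv1d with padding ks//2.

class InceptionModule(nn.Module):
    def __init__(self, ni, use_bottleneck=True, kss=[41, 21, 11], bottleneck_size=32, nb_filters=32, stride=1):
        super().__init__()
        if use_bottleneck:
            self.conv0 = nn.Conv1d(ni, bottleneck_size, 1, bias=False)
        else:
            self.conv0 = noop
            bottleneck_size = ni  # no bottleneck: the branches read the input channels directly
        # Three parallel convolutions with different kernel sizes
        self.conv1 = conv(bottleneck_size, nb_filters, kss[0])
        self.conv2 = conv(bottleneck_size, nb_filters, kss[1])
        self.conv3 = conv(bottleneck_size, nb_filters, kss[2])
        # Fourth branch: max pooling followed by a 1x1 convolution
        self.conv_bottle = nn.Sequential(nn.MaxPool1d(3, stride, padding=1),
                                         nn.Conv1d(bottleneck_size, nb_filters, 1, bias=False))
        self.bn_relu = nn.Sequential(nn.BatchNorm1d(4*nb_filters),
                                     nn.ReLU())
    def forward(self, x):
        x = self.conv0(x)
        # Concatenate the four branches along the channel dimension
        return self.bn_relu(torch.cat([self.conv1(x), self.conv2(x), self.conv3(x), self.conv_bottle(x)], dim=1))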

and to create the network:

def create_inception(ni, nout, kss=[41, 21, 11], stride=1, depth=6, bottleneck_size=32, nb_filters=32, head=True):
    layers = [InceptionModule(ni, kss=kss, use_bottleneck=False, nb_filters=nb_filters, stride=stride), MergeLayer(), nn.ReLU()]
    # Build every block separately: multiplying the list would repeat the same
    # module objects, so all the repeated blocks would share one set of weights.
    for _ in range(depth-1):
        layers += [InceptionModule(4*nb_filters, kss=kss, bottleneck_size=bottleneck_size, nb_filters=nb_filters, stride=stride), MergeLayer(), nn.ReLU()]
    head = [AdaptiveConcatPool1d(), Flatten(), nn.Linear(8*nb_filters, nout)] if head else []
    return SequentialEx(*layers, *head)

I think it can be simplified a bit. @hfawaz can you check if it is correct? From my initial tests, it is not training that well. The 40 epochs needed for ResNet do almost nothing for InceptionTime, so I probably have a bug somewhere.
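
One general PyTorch pitfall worth ruling out when stacking identical blocks: multiplying a Python list repeats references to the same module objects, so the repeated layers share a single set of weights. The sketch below shows the effect with plain `nn.Linear` layers; whether this explains the training behaviour described above is only a guess.

```python
import torch.nn as nn

# List multiplication: three references to ONE Linear layer.
shared = nn.Sequential(*(3 * [nn.Linear(4, 4)]))
# List comprehension: three independent Linear layers.
separate = nn.Sequential(*[nn.Linear(4, 4) for _ in range(3)])

def n_params(m):
    # parameters() yields each distinct tensor once, so shared weights count once.
    return sum(p.numel() for p in m.parameters())

print(n_params(shared), n_params(separate))
print(shared[0] is shared[2], separate[0] is separate[2])
```

A quick parameter count like this is an easy way to check that a network has the expected number of independent layers.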

Nice, that was fast!
I am not quite sure. Is there an output similar to Keras’s model.summary()?

ni=1, bottleneck=32, nb_filters=32

InceptionModule(
  (conv1): Conv1d(1, 32, kernel_size=(41,), stride=(1,), padding=(20,), bias=False)
  (conv2): Conv1d(1, 32, kernel_size=(21,), stride=(1,), padding=(10,), bias=False)
  (conv3): Conv1d(1, 32, kernel_size=(11,), stride=(1,), padding=(5,), bias=False)
  (conv_bottle): Sequential(
    (0): MaxPool1d(kernel_size=3, stride=1, padding=1, dilation=1, ceil_mode=False)
    (1): Conv1d(1, 32, kernel_size=(1,), stride=(1,), bias=False)
  )
  (bn_relu): Sequential(
    (0): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (1): ReLU()
  )
)

This would be the first layer for reading a one-channel TS. The problem with this display method is that you don’t see that the 3 convs and the conv_bottle are concatenated. You can only guess it from the BatchNorm1d(128) layer that comes afterwards.
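
Since the printed module hides the concatenation, a shape check makes it visible. The sketch below rebuilds a first-layer module in plain PyTorch, without the fastai helpers. The names `conv1d_same` and `InceptionModuleSketch` are hypothetical, and the layer sizes are copied from the printout above.

```python
import torch
import torch.nn as nn

def conv1d_same(ni, nf, ks):
    # Hypothetical helper: an odd kernel with padding ks // 2 keeps the length.
    return nn.Conv1d(ni, nf, ks, padding=ks // 2, bias=False)

class InceptionModuleSketch(nn.Module):
    def __init__(self, ni, kss=(41, 21, 11), nb_filters=32):
        super().__init__()
        self.branches = nn.ModuleList([conv1d_same(ni, nb_filters, ks) for ks in kss])
        self.pool_branch = nn.Sequential(
            nn.MaxPool1d(3, stride=1, padding=1),
            nn.Conv1d(ni, nb_filters, 1, bias=False),
        )
        self.bn_relu = nn.Sequential(
            nn.BatchNorm1d((len(kss) + 1) * nb_filters), nn.ReLU()
        )

    def forward(self, x):
        # Four branches, each with nb_filters channels, stacked on the channel axis.
        outs = [branch(x) for branch in self.branches] + [self.pool_branch(x)]
        return self.bn_relu(torch.cat(outs, dim=1))

x = torch.randn(8, 1, 100)           # batch of 8 one-channel series, length 100
out = InceptionModuleSketch(ni=1)(x)
print(out.shape)                     # torch.Size([8, 128, 100])
```

The 128 output channels are the 4 branches times 32 filters, which is exactly what the BatchNorm1d(128) layer expects.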

I guess here you are applying a bottleneck operation for the first layer. You can see here that I skip it for the first layer explicitly.

Thanks, I will change that. Would you mind checking here if I got it right?

TimeseriesAI

For those of you interested in the area of Time Series Classification, I’ve created a new repo called “Practical Deep Learning for Time Series” based on the fastai library.
It’s based on an idea I’ve been developing for quite some time. What I plan to do is to share a lot of code that I’ve created over the last few months, as well as some notebooks to demo how that code can be used. You will see that everything is focused on Time Series (Classification and Regression in particular).
The first commit of this repo contains the following:

  • Fastai time series library called fast_timeseries. It contains lots of things I’ll be demoing in notebooks in the next few weeks. In the first one we’ll make use of custom TSItem, TSItemLists, TSDataBunch, etc. You’ll see that it makes the use of time series in fastai really easy.

  • I’ve also included a PyTorch model library called torchtimeseries.models. It contains some of the state-of-the-art models for time series classification (based on raw data). I’ve included FCN, ResNet, ResCNN and InceptionTime. I have other models, but I believe these work really well on small/medium datasets. I’ll add more models in the near future.

  • I’ve also created a first notebook (Intro to Time Series Classification) to demo how to integrate all this in a simple way, so that you can create state-of-the-art models in just a few minutes.

In future notebooks, I’ll try to explain how you can start using more advanced initialization schemes, data augmentation for time series, visualization techniques, and many other topics related to Time Series.

I’d love to receive some feedback, especially if there’s anything that doesn’t work as expected, is not clear, or is missing.


Thanks so much for this. I look forward to getting stuck in! It’s work like this, willingly shared, that makes the fast.ai community such an amazing place!


@oguiza thank you very much for a very clear and concise notebook to follow for time series. I’ve been iffy about getting my feet wet with it, but your notebook has made things very clear for me. Thanks :)


Thanks so much @AnthonyHolmes! I’ve learned so much from Jeremy, Rachel, and the great fastai community that I wanted to give something back.

Excellent! You have helped me understand so many things that I’m very glad you found the repo clear and useful. I value your opinion a lot! Thanks for sharing!


Great work, and thanks for reacting so fast! I believe both implementations (@tcapelle’s and @oguiza’s) should achieve almost the same results.
Is anyone willing or planning to run the fastai implementation on the whole archive of 128 UCR datasets?

BTW, today I updated the InceptionTime repository, which now contains the results for the 128 UCR datasets as well as the multivariate ones.

Has anyone tried using the Mish activation for time series yet instead of ReLU? (out of curiosity, I want to play with it myself later in the week but I cannot at the moment).
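
For reference, Mish is defined as x * tanh(softplus(x)). Here is a minimal sketch of it as a PyTorch module that could be swapped in wherever `nn.ReLU()` is used; the small conv block around it is only an illustration, not one of the architectures from this thread.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    def forward(self, x):
        # Mish(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + exp(x))
        return x * torch.tanh(F.softplus(x))

act = Mish()
print(act(torch.tensor([-2.0, 0.0, 1.0])))

# Drop-in use inside a small 1D conv block (illustrative sizes).
block = nn.Sequential(nn.Conv1d(1, 8, 7, padding=3), nn.BatchNorm1d(8), Mish())
out = block(torch.randn(4, 1, 50))
print(out.shape)
```

Unlike ReLU, Mish is smooth and lets small negative values through, so the block output is not guaranteed to be non-negative.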

Now you understand why this is called fast.ai! ;)
No merit on my side. I developed the architecture a couple of weeks ago and have been using it on my datasets. @tcapelle has really reacted very quickly!
I have not compared the implementations. I’ll take a look at them tomorrow. I’ll let you both know if I have any questions.
As to testing the fastai implementation, I’d love to but don’t have the time or resources to do it. I just have a single cloud GPU. So if you want to go ahead and run the test, I’ll be more than happy to assist in any way I can, but I won’t be able to run long tests. I think it’d be good to benchmark against your TensorFlow implementation as a starting point, although there are a few approaches that could further improve the result.


No, not yet. I’ve already tested the Ranger + Flat + cosannealing framework and it seems to work better than one_cycle. I have a few ideas on how to tweak the InceptionTime arch that I’m planning to test over the next few days, but if you have the time before me just go for it!
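
For readers unfamiliar with the flat + cosine annealing schedule mentioned here, the idea is to hold the learning rate constant for most of training and then anneal it to zero along a cosine curve. Below is a minimal sketch; the function name and the 70% flat fraction are assumptions of this sketch, not the values used in these experiments.

```python
import math

def flat_cos_lr(step, total_steps, base_lr, flat_frac=0.7):
    # Constant learning rate for the first flat_frac of training...
    flat_steps = round(total_steps * flat_frac)
    if step < flat_steps:
        return base_lr
    # ...then cosine annealing from base_lr down to 0.
    progress = (step - flat_steps) / max(1, total_steps - flat_steps)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

schedule = [flat_cos_lr(s, 100, 1e-3) for s in range(101)]
print(schedule[0], schedule[70], schedule[85], schedule[100])
```

By contrast, one_cycle warms the learning rate up and then anneals it back down, so the two schedules differ mainly in the early part of training.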
