Actually I might be struggling with some scaling problems right now. I have a multivariate time series with 10 channels, 8 of which are related (their values sum to 1). I have 2 problems:
As you mention in your nb, using per_channel scaling would break the ratio between the dependent channels.
Some of the variables have values close to zero almost all the time, which would be a problem when using standardize, because the standard deviation would be really close to zero. Is there a way to add an epsilon value to that scaling in your repo? I remember Jeremy doing that in some of the lessons of the course.
Adding epsilon seems like a good idea to avoid issues. I'll add it to the code.
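For reference, a minimal sketch of what an epsilon-guarded standardization could look like (the function name and the 1e-8 default are illustrative, not the repo's actual API):

```python
import numpy as np

def standardize_eps(x, eps=1e-8, axis=-1):
    """Standardize along `axis`; eps keeps near-constant channels from
    dividing by a standard deviation that is ~0."""
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / (std + eps)
```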
In your case you may consider scaling the data before creating the databunch. An option would be to standardize the 8 features one way and the other 2 in a different way, create the databunch, and set the scale type to None (or remove it). To test this quickly, you could pass only the 8 features, select a scaling method, and train the model. That way you could check whether maintaining the ratio between them makes sense or not.
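Something along these lines could work as a pre-scaling step before building the databunch (a rough sketch only; the shapes and the shared-statistics choice for the 8 dependent channels are assumptions, used so those channels all go through the same affine map and keep their relative structure):

```python
import numpy as np

def prescale(X, eps=1e-8):
    """X: (n_samples, 10, seq_len); channels 0-7 are the dependent ones."""
    X = X.astype(np.float64)
    # One shared mean/std for the 8 dependent channels, so they all get the
    # same transform and stay on a common scale
    dep = X[:, :8]
    X[:, :8] = (dep - dep.mean()) / (dep.std() + eps)
    # Standardize the remaining 2 channels independently
    for c in range(8, 10):
        ch = X[:, c]
        X[:, c] = (ch - ch.mean()) / (ch.std() + eps)
    return X
```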
Well, I wasn't thinking of combining 2 different databunches, but of testing them separately to quickly learn which scaling strategy could work best for your problem.
I'm continuing to try out different things with my NBA forecasting project. One thing I'm unsure of is how to deal with different players. Basically, the history of each player is its own time series. I'd like to train my model on all the players, in order to leverage all the data, and because there ought to be many common patterns. At the same time, players are clearly not identical. Any ideas on how to deal with this? My first thought is to create player embeddings, but I'm not sure if this is the best approach.
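For what it's worth, here is one way the embedding idea could look in plain PyTorch (all names and sizes are made up for illustration, not a recommendation of a specific architecture):

```python
import torch
import torch.nn as nn

class PlayerForecaster(nn.Module):
    """Toy sketch: a per-player embedding concatenated with the encoded
    series before the forecasting head."""
    def __init__(self, n_players, emb_dim=16, n_features=10, hidden=64, horizon=1):
        super().__init__()
        self.player_emb = nn.Embedding(n_players, emb_dim)
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden + emb_dim, horizon)

    def forward(self, x, player_id):
        # x: (batch, seq_len, n_features); player_id: (batch,)
        _, h = self.encoder(x)            # h: (1, batch, hidden)
        emb = self.player_emb(player_id)  # (batch, emb_dim)
        return self.head(torch.cat([h[-1], emb], dim=1))
```

The embedding lets the model share weights across all players while still learning a per-player representation; whether that beats training separate models is an empirical question.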
This paper has just hit ArXiv, and looks promising:
ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels
I was just coming here to comment on this; it is super effective. I am trying it on my regression problem, and the results are amazing. Could the code be ported to PyTorch to make use of the GPU?
For my dataset (1M curves of 800 points) it is slow.
That's the other question I had. My time series have 2 channels, so I just reshaped them to one channel with .view(N, -1), but probably the generate_kernels function would need to be applied per channel. Also, since I am doing regression, I'm using RidgeCV instead of RidgeClassifierCV.
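In case it helps, this is roughly what that workaround looks like, assuming the generate_kernels / apply_kernels functions from the ROCKET reference code (the kernel count and alpha grid are arbitrary):

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# X: (N, 2, seq_len) array of 2-channel series; y: (N,) regression targets.
# Flattening concatenates the two channels end-to-end, so the random kernels
# see them as one long univariate series -- a workaround, not a true
# multi-channel convolution.
X_flat = X.reshape(X.shape[0], -1).astype(np.float64)

kernels = generate_kernels(X_flat.shape[1], 10_000)  # ROCKET reference function
features = apply_kernels(X_flat, kernels)

reg = RidgeCV(alphas=np.logspace(-3, 3, 10))
reg.fit(features, y)
```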
Fair warning: while I was able to get the model to train, it does not currently learn anything useful. Even when I use a small subset of training data, it isn't able to overfit. So something (or more likely several things) is seriously wrong in my setup.
Currently I've gone back to the drawing board with much simpler models that are easier for me to understand and debug. If you can get fastai working for a forecasting problem, I would love to learn what you did!
@oguiza I just finished adapting Rocket for multi-channel classification/regression.
You can check it here
I have zero experience with numba, but it appears to work (I had to remove some functions like mean, sum, etc. that numba didn't handle and replace them with for loops).
If I understand it correctly, you need to add the channel dimension to the convolutional kernels, so I added a channel dimension to each kernel.
Then I had to adjust how the kernel is applied, so I added an extra for loop to perform the convolution channel-wise and then sum the results across channels. I tried it with some data I had on hand, and it gets good results, so it should be safe to use.
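A rough sketch of that idea (not the actual code in the repo; the ppv/max feature extraction is omitted and the explicit loops stand in for the numpy reductions numba wouldn't accept):

```python
import numpy as np
from numba import njit

@njit
def apply_kernel_multichannel(x, weights, bias, dilation, padding):
    """x: (n_channels, length); weights: (n_channels, kernel_length).
    Convolve each channel with its slice of the kernel and sum the
    responses, as described above."""
    n_channels, length = x.shape
    klen = weights.shape[1]
    out_len = length + 2 * padding - (klen - 1) * dilation
    out = np.zeros(out_len)
    for i in range(out_len):
        acc = bias
        for c in range(n_channels):          # extra loop over channels
            for j in range(klen):
                idx = i - padding + j * dilation
                if 0 <= idx < length:
                    acc += weights[c, j] * x[c, idx]
        out[i] = acc
    return out
```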
Thanks so much, @tcapelle! That was fast, again!!
I don't think I'll have time today, but will take a deeper look at this tomorrow. Looks like a radically different approach to time series, but much faster than traditional approaches.
I've noticed they haven't uploaded the code they used for the Satellite Image Time Series dataset (the one they recommend for larger datasets) yet. It'll also be interesting to see how they apply this idea using logistic regression.