One question for a complete noob in this field. Is it normal to use a row-wise representation of the time series, when stored as data frames? I see that it is the format used in both the repositories of @oguiza and @tcapelle, but as far as I know, data frames are optimized to work column-wise.
I’ve seen multiple variations. You usually need to manipulate the df to get the expected input format. To use the functionality I’ve shared, you need to have samples in rows, a column for features (in the case of a multivariate ts, or None for univariate) and time steps in columns. But if we see other uses I may need to update the code.
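To make the expected layout concrete, here's a minimal sketch (the column names and the final array shape are my assumptions, not taken from the actual repo):

```python
import numpy as np
import pandas as pd

# Hypothetical univariate example: 2 samples in rows, 4 time steps in columns.
# Column names "t0".."t3" are made up for illustration.
df = pd.DataFrame(
    [[0.1, 0.3, 0.2, 0.5],
     [0.4, 0.2, 0.6, 0.1]],
    columns=["t0", "t1", "t2", "t3"],
)

# The array a model would typically consume: (samples, features, time steps).
# For univariate data we add a singleton feature axis.
X = df.to_numpy()[:, None, :]
print(X.shape)  # (2, 1, 4)
```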
As to the optimization, it’s difficult to know what would be better. Sometimes you have more samples than time steps or vice versa.
I think you can use anything you want as features or target. What is important is to indicate whether the target should be handled as a category or a float. If you try it and it doesn’t work as expected, please let me know.
Yes, it’s implicit in the order of the rows. I will add this to the notebook to make it clear. You need to sort the data frame rows by sample, otherwise data from different samples will be mixed. Thanks for raising this!
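For example, with a long-format df (the `sample`/`step`/`value` column names here are hypothetical), sorting before reshaping keeps each series contiguous:

```python
import pandas as pd

# Hypothetical long-format df: rows must be grouped by sample before any
# reshape, otherwise steps from different samples get interleaved.
df = pd.DataFrame({
    "sample": [1, 0, 1, 0],
    "step":   [0, 0, 1, 1],
    "value":  [0.3, 0.1, 0.5, 0.2],
})
df = df.sort_values(["sample", "step"]).reset_index(drop=True)
print(df["sample"].tolist())  # [0, 0, 1, 1] -- each sample's steps are now contiguous
```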
So I had a look at the mixup data augmentation technique. I believe it is a special case of a weighted data augmentation technique that we proposed previously but hadn’t had much success with. Maybe you guys can make it work.
Basically the method computes the weighted average of a set of time series and considers this weighted average as a new time series (to augment the training set).
The average is computed in the DTW space instead of the Euclidean one.
Here are the relevant papers, this is the original method and this one shows its use with ResNet.
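A rough sketch of the idea, with one big simplification: the papers compute the average in DTW space (DTW Barycenter Averaging), while this toy version just averages in plain Euclidean space to show the convex-combination part:

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_average_augment(X, n_new=1, rng=rng):
    """Create synthetic series as random convex combinations of training series.

    Simplified sketch: averages in Euclidean space. The actual method averages
    in DTW space (DTW Barycenter Averaging) instead.
    X: array of shape (n_samples, n_steps).
    """
    new = []
    for _ in range(n_new):
        w = rng.dirichlet(np.ones(len(X)))  # random convex weights, sum to 1
        new.append(w @ X)                   # weighted average series
    return np.stack(new)

X = rng.normal(size=(5, 20))       # 5 toy univariate series, 20 steps each
X_aug = weighted_average_augment(X, n_new=3)
print(X_aug.shape)  # (3, 20)
```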
I have been playing this morning with a 1D implementation of Res2Net from here. But the more I play with UCR, the more I dislike this benchmark. It has so few training samples for some tasks that I am not sure we could create a model that performs well. Here is how I see it:
The more you train, the more your training loss decreases, but at some point you start getting worse results on the test set. For instance, for SmallKitchenAppliances, which has 375 samples, we get this after only 40 epochs:
OliveOil is a completely different story, with only 30 miserable samples:
I am using mixup all the time now to augment our little data. The results of res2net50 for our bench tasks (100 epochs, lr=1e-3, FlatCos anneal) are:
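For anyone curious what mixup does to a batch, here is a minimal standalone sketch (in practice you'd use fastai's MixUp callback inside the training loop; this numpy version is just to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup_batch(x, y, alpha=0.4, rng=rng):
    """Mixup: blend each sample with a randomly paired one, and blend the
    (one-hot) targets with the same coefficient lam ~ Beta(alpha, alpha)."""
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

x = rng.normal(size=(8, 1, 50))        # toy batch: (batch, channels, steps)
y = np.eye(3)[rng.integers(0, 3, 8)]   # one-hot targets, 3 classes
x_mix, y_mix = mixup_batch(x, y)
print(x_mix.shape, y_mix.shape)  # (8, 1, 50) (8, 3)
```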
I agree that it’s very frustrating sometimes!
But I have to say that it is no different from other real life datasets.
The one I use only has around 1000 samples, and I can tell you it’s equally frustrating!!
The good thing is that when you deal with very challenging datasets, you end up trying so many things that you learn a lot.
And there are really small datasets (like OliveOil) where you can get very high accuracy (96%+ with some models I’ve used) with only 30 samples (leveraging image encoding and transfer learning).
I’m still convinced we can beat HIVE-COTE and TS-CHIEF using this dataset!
No, there are no missing values. I guess you could delete some randomly if you wanted to try it.
I’ve never dealt with missing values to be honest with you. I don’t know if they would even work.
Usually you would replace those missing values with a constant, or an average, or median, etc.
Sorry I can’t help more.
I’d think you’d want an external preprocessing step on your data frame to handle this with the average. There are a few different methods, but I usually take the average, and here I’d compute it over the particular series instance (row). That’s how I’d go about missing values in this case.
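A quick sketch of that row-wise mean imputation with pandas (the df here is made up; the layout assumed is samples in rows, time steps in columns):

```python
import numpy as np
import pandas as pd

# Hypothetical df: samples in rows, time steps in columns, with some gaps.
df = pd.DataFrame([[1.0, np.nan, 3.0],
                   [4.0, 5.0, np.nan]], columns=["t0", "t1", "t2"])

# Fill each gap with the mean of its own series (row). fillna with a Series
# matches on column labels, so we transpose, fill, and transpose back.
row_means = df.mean(axis=1)            # per-sample means, NaNs skipped
filled = df.T.fillna(row_means).T
print(filled.to_numpy().tolist())  # [[1.0, 2.0, 3.0], [4.0, 5.0, 4.5]]
```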