To create a supervised (classification or regression) problem out of a time series, you can use a sliding window approach (this link may be useful).
If, in addition, you need to convert a continuous variable into a class, you'll need to apply some threshold.
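A minimal sketch of that idea (the window size, horizon, and threshold below are arbitrary placeholders):

import numpy as np

def sliding_window(series, window, horizon=1):
    """Turn a 1-D series into (X, y) pairs: each window of past values
    predicts the value `horizon` steps after the window ends."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])
        y.append(series[i + window + horizon - 1])
    return np.array(X), np.array(y)

series = np.sin(np.linspace(0, 20, 500))       # toy series
X, y_cont = sliding_window(series, window=30)  # regression targets

# to get a classification problem, threshold the continuous target
y_cls = (y_cont > 0.5).astype(int)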
I tried the different pyts visualizations: GASF, GADF, MTF (this one I couldn't get working) and RecurrencePlots. Then the 3 pyts classifiers: BOSSVSClassifier, SAXVSMClassifier and KNNClassifier. Here is the confusion matrix for those classifiers:
Here is the link to the notebook.
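For anyone trying to reproduce this, a rough sketch of the image transforms. Note the class names below follow the current pyts API, which differs from the older GASF/GADF/MTF/RecurrencePlots names used in the notebook (the classifiers live in pyts.classification, with modern names BOSSVS, SAXVSM and KNeighborsClassifier):

import numpy as np
from pyts.image import GramianAngularField, MarkovTransitionField, RecurrencePlot

X = np.random.randn(10, 128)  # 10 toy series of length 128

gasf = GramianAngularField(method='summation').fit_transform(X)   # GASF
gadf = GramianAngularField(method='difference').fit_transform(X)  # GADF
mtf  = MarkovTransitionField(n_bins=8).fit_transform(X)           # MTF
rp   = RecurrencePlot().fit_transform(X)                          # recurrence plots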
100% agree!
That is a great article, thank you for sharing it!
Hi All,
So, I spent a little more time on this.
- As for the training/test mistake, the mistake is at the source: the UCR_TS_Archive_2015 has some datasets with the train and test sets reversed. Some authors (like the Weasel paper authors) fix this manually. All the dataset TRAIN/TEST splits are correct in UCRArchive_2018. It turns out the review paper uses the raw 2015 archive, so their 78% result on the Earthquakes dataset using FCN is actually worse than an all-false baseline. That said, contrary to the LSTM_FCN paper, they do not cherry-pick the epoch with the lowest validation error!
- To check that I could still reproduce their results, I chose a dataset where traditional methods perform significantly worse than FCN (or ResNet): the Adiac dataset. For this dataset, DTW methods give an accuracy of ~60% on 36 classes. Using FCN, I was able to reproduce the authors' result of 80+% accuracy (by selecting on the lowest training loss); otherwise the model's generalizable accuracy is probably around 78%, which is still very impressive. See notebook here.
Some conclusions/observations/issues:
- evaluating on a single UCR TS dataset is a tough way to gauge new architectures, and I would argue for evaluating new methods on the entire set of datasets, but:
- training is long (over an hour for the Adiac dataset on my 1080Ti) and does not really stretch the GPU's memory
- I couldn't get fit_one_cycle and the learning rate finder to work well for the Adiac dataset, and went back to the old step function, decreasing the LR when the training loss stopped decreasing, which required some manual intervention…
Thanks again @henripal for the time and effort you are putting into this competition! These are, again, extremely useful findings.
I would propose the following:
- We need to make sure participants use the right TRAIN/TEST split, i.e. the correct UCRArchive_2018.
- I think we should propose a new rule for the competition (because it's useful in real problems): to perform any hyperparameter search, we should use a validation set carved out of the training set, and then report the test set performance. That's best practice (a sketch follows below).
To achieve both points we should update the notebook we created to prepare the data. I cannot work on that until tomorrow. If anybody can do it before then, that'd be great. Otherwise I'll do it tomorrow.
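As a sketch of the validation-split rule (the array shape below is a placeholder for a loaded TRAIN file; UCR files store the label in column 0):

import numpy as np

rng = np.random.RandomState(42)
data_train = rng.randn(100, 151)   # placeholder: (n_samples, 1 + seq_len), label in column 0

# carve a validation set out of TRAIN; never tune on the official TEST split
idx = rng.permutation(len(data_train))
n_val = int(0.2 * len(data_train))
val, trn = data_train[idx[:n_val]], data_train[idx[n_val:]]
# tune hyperparameters on (trn, val); report TEST performance once, at the end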
I'd propose we continue the discussion on how to proceed with the TS project in the TSSD study group, since it's beyond the scope of this competition. All those interested are welcome to participate.
@henripal I'm trying to implement the multivariate LSTM-FCN. Since the softmax function returns probabilities for the classes (2 in the case of the Earthquakes dataset), I one-hot encoded the labels. But looking into your code I see you are not using a softmax; is there a reason?
My current problem is with fastai's accuracy function. If I create a Learner without accuracy, the following works when I unfreeze and then fit:

data = DataBunch(train_dl=train_dl, valid_dl=test_dl, path=path)
learner = Learner(data, model, loss_func=loss_func)
learner.fit(10, lr=5e-5)

If I pass accuracy=accuracy to the Learner, I get this error:
RuntimeError Traceback (most recent call last)
<ipython-input-196-b5d89f36ce92> in <module>()
----> 1 learner.fit(10, lr=5e-5)
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
160 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
161 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162 callbacks=self.callbacks+callbacks)
163
164 def create_opt(self, lr:Floats, wd:Floats=0.)->None:
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
92 except Exception as e:
93 exception = e
---> 94 raise e
95 finally: cb_handler.on_train_end(exception)
96
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
87 if hasattr(data,'valid_dl') and data.valid_dl is not None:
88 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89 cb_handler=cb_handler, pbar=pbar)
90 else: val_loss=None
91 if cb_handler.on_epoch_end(val_loss): break
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
52 if not is_listy(yb): yb = [yb]
53 nums.append(yb[0].shape[0])
---> 54 if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
55 if n_batch and (len(nums)>=n_batch): break
56 nums = np.array(nums, dtype=np.float32)
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in on_batch_end(self, loss)
236 "Handle end of processing one batch with `loss`."
237 self.state_dict['last_loss'] = loss
--> 238 stop = np.any(self('batch_end', not self.state_dict['train']))
239 if self.state_dict['train']:
240 self.state_dict['iteration'] += 1
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
184 def __call__(self, cb_name, call_mets=True, **kwargs)->None:
185 "Call through to all of the `CallbakHandler` functions."
--> 186 if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
187 return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
188
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in <listcomp>(.0)
184 def __call__(self, cb_name, call_mets=True, **kwargs)->None:
185 "Call through to all of the `CallbakHandler` functions."
--> 186 if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
187 return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
188
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in on_batch_end(self, last_output, last_target, train, **kwargs)
269 if not is_listy(last_target): last_target=[last_target]
270 self.count += last_target[0].size(0)
--> 271 self.val += last_target[0].size(0) * self.func(last_output, *last_target).detach().cpu()
272
273 def on_epoch_end(self, **kwargs):
/usr/local/lib/python3.6/dist-packages/fastai/metrics.py in accuracy(input, targs)
37 input = input.argmax(dim=-1).view(n,-1)
38 targs = targs.view(n,-1)
---> 39 return (input==targs).float().mean()
40
41 def error_rate(input:Tensor, targs:Tensor)->Rank0Tensor:
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'other'
Here is the current implementation: notebook
+1! I have the same problem every time I deviate from the exact lesson notebooks (which is basically always). Training works but the metrics don't. I have also never gotten the f1 metric to work, so I seem to lack a fundamental understanding here. It would be extremely helpful if someone who fully understands the reasons for these errors and the mechanics of the metrics callbacks could explain this. I think other people struggle with this as well. (Examples mentioned here: https://forums.fast.ai/t/lesson-4-advanced-discussion/30319/94?u=marcmuc and here: https://forums.fast.ai/t/dataset-for-regression-cnn/30188/5?u=marcmuc)
What about trying metrics=[accuracy_thresh]?
In the case of a multi-class problem the prediction is a single integer; in the case of a multi-label problem it is multiple floats, so the accuracy is calculated with a different algorithm. Based on the error Expected object of scalar type Long but got scalar type Float for argument #2 'other', I guess that one-hot encoding the labels triggers the multi-label setting.
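A toy example of the difference (not from the notebook; the error is what older PyTorch versions raise on a Long-vs-Float comparison):

import torch

preds = torch.tensor([[2.0, -1.0], [0.5, 1.5]])  # raw scores, shape (n, n_classes)

targs_int = torch.tensor([0, 1])                 # integer (Long) targets: what accuracy expects
(preds.argmax(dim=-1) == targs_int).float().mean()  # -> tensor(1.)

targs_oh = torch.tensor([[1., 0.], [0., 1.]])    # one-hot Float targets
# comparing the argmax output (Long) against these Floats is what triggers
# "Expected object of scalar type Long but got scalar type Float ..."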
Hi, I didn't rerun your code but looked briefly through it; I hope the following will help. Most of the models Jeremy ran during the course went like this:
- the model outputs scores (unbounded floats directly from your final, dense layer)
- the loss is computed using cross entropy
- accuracy is run automagically by computing the index (argmax) of your model's output and comparing it to your label (see code here). Note that this requires integer labels, not one-hot encoded ones!
If you look at the documentation for cross entropy in the pytorch docs, you see that it is a combination of "nn.LogSoftmax() and nn.NLLLoss() in one single class"; so the cross entropy loss also requires your model to output scores, not softmax outputs (otherwise you're softmaxing twice).
As to the question why I had BCEWithLogits instead of softmax: it is because the Earthquakes dataset only has two classes, so instead of outputting a size-2 score vector, you can output a single score and predict class 0 when the score is negative and class 1 when it is positive. The BCEWithLogits loss takes care of running that score through a sigmoid, then through the binary cross entropy loss. This also causes accuracy to fail, as there's no point in calling argmax on a size-1 vector!
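To make those equivalences concrete, a sketch with toy tensors (not from any of the notebooks):

import torch
import torch.nn.functional as F

scores = torch.randn(4, 3)             # raw scores from a final dense layer
targets = torch.tensor([0, 2, 1, 2])   # integer labels, NOT one-hot

# cross entropy == LogSoftmax followed by NLLLoss
assert torch.allclose(F.cross_entropy(scores, targets),
                      F.nll_loss(F.log_softmax(scores, dim=-1), targets))

# binary case: one score per example; BCEWithLogits == sigmoid + BCE
score1  = torch.randn(4)
target1 = torch.tensor([0., 1., 1., 0.])
assert torch.allclose(F.binary_cross_entropy_with_logits(score1, target1),
                      F.binary_cross_entropy(torch.sigmoid(score1), target1))
# predict class 1 when score1 > 0, i.e. sigmoid(score1) > 0.5 -- no argmax needed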
Thanks, all your input helped me fix/understand the problem. I ended up using a Linear layer with n outputs (2 here), then a LogSoftmax, and an NLLLoss loss function. This way I can reapply the same architecture to another problem just by passing the right number of classes.
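A sketch of that head (the input size is a placeholder):

import torch.nn as nn

n_features, n_classes = 128, 2          # placeholder sizes

head = nn.Sequential(
    nn.Linear(n_features, n_classes),   # one score per class
    nn.LogSoftmax(dim=-1),              # log-probabilities
)
loss_func = nn.NLLLoss()                # expects log-probs + integer labels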
I trained the model for 100 epochs, but after only 7 it reached 100% accuracy, which is odd:
Total time: 00:38
epoch train_loss valid_loss accuracy
1 0.437780 0.483696 0.819876 (00:00)
2 0.365820 0.430517 0.819876 (00:00)
3 0.309986 0.299840 0.875776 (00:00)
4 0.252181 0.139254 0.956522 (00:00)
5 0.207332 0.055020 0.984472 (00:00)
6 0.169240 0.021992 0.996894 (00:00)
7 0.140646 0.010572 1.000000 (00:00)
. . .
96 0.000043 0.000005 1.000000 (00:00)
97 0.000040 0.000004 1.000000 (00:00)
98 0.000038 0.000004 1.000000 (00:00)
99 0.000037 0.000004 1.000000 (00:00)
100 0.000037 0.000004 1.000000 (00:00)
I used the test set as a validation set; I probably need to go back and change this.
Here is the link to the notebook
I'll give you a hint:
def split_xy(data, classes):
    # note: the `data` argument is never used; X and y always come from data_train
    X = data_train[:, 1:]
    y = data_train[:, 0].astype(int)
    # hot encode
    #y = one_hot_encode(y, classes)
    return X, y
haha my bad! fixed notebook
As the classes are imbalanced, I tried using a weighted sampler for the training dataloader, like this:
class_sample_count = [class_0_count, class_1_count]  # number of samples in each class
weights = 1 / torch.Tensor(class_sample_count)
sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, bs)
train_dl = DataLoader(train_ds, batch_size=bs, shuffle=False, sampler=sampler)
test_dl = DataLoader(test_ds, batch_size=bs, shuffle=False)
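For reference: WeightedRandomSampler draws num_samples indices from range(len(weights)), so it expects one weight per sample, not per class. A sketch of the usual construction, assuming train_ds yields (x, y) pairs:

import torch
from torch.utils.data import WeightedRandomSampler

labels = torch.as_tensor([int(y) for _, y in train_ds])  # assumes train_ds yields (x, y)
class_count = torch.bincount(labels).float()
class_weights = 1.0 / class_count
sample_weights = class_weights[labels]                   # one weight per sample
sampler = WeightedRandomSampler(sample_weights, num_samples=len(train_ds))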
But the accuracy is best when not using the sampler (now it’s clearly overfitting).
92 0.006502 0.775092 0.762590 (00:00)
93 0.006222 0.775078 0.762590 (00:00)
94 0.005929 0.779696 0.762590 (00:00)
95 0.005682 0.784288 0.762590 (00:00)
96 0.005545 0.787708 0.762590 (00:00)
97 0.005510 0.786689 0.762590 (00:00)
98 0.005290 0.779218 0.762590 (00:00)
99 0.005020 0.780385 0.762590 (00:00)
100 0.004803 0.775525 0.762590 (00:00)
And the learning rate finder run (stopped at epoch 15) showed:
epoch train_loss valid_loss accuracy
1 0.575163
2 0.563449
3 0.573439
4 0.571960
5 0.568286
6 0.565529
7 0.561100
8 0.551227
9 0.529277
10 0.496591
11 0.457549
12 0.404592
13 0.359023
14 0.360828
15 0.720238
vs when using the sampler, the accuracy was
94 0.021156 0.671377 0.525180 (00:00)
95 0.020713 0.734338 0.503597 (00:00)
96 0.020271 0.655608 0.539568 (00:00)
97 0.019853 0.706484 0.525180 (00:00)
98 0.019432 0.778318 0.474820 (00:00)
99 0.019033 0.779579 0.474820 (00:00)
100 0.018644 0.648912 0.532374 (00:00)
and the learning rate plot went up to 100 epochs!!
epoch train_loss valid_loss accuracy
1 0.681915
2 0.690184
3 0.681726
4 0.675613
5 0.677629
6 0.678535
7 0.675027
8 0.675557
9 0.673278
10 0.669624
11 0.671635
12 0.672354
13 0.671159
14 0.672066
15 0.672701
. . .
95 0.155813
96 0.152173
97 0.148630
98 0.145182
99 0.141823
Why is that? Sampling is supposed to help in exactly this kind of case!
I think it's interesting that the train loss is so much greater than the valid loss despite such great accuracy. Great work!
EDIT: Update, I just got to the portion regarding mix-ups of train-val-test sets. As if Data Science wasn't hard enough! Regardless, keep up the good work. Now that the holiday is over and a big project at my day job is complete, I'm going to start working on this project as well!
I had a weighting scheme different from yours (I weighted the classes in the loss rather than oversampling).
That said, I found the same result: slightly better without the sampler. I wouldn't read too much into it, as the Earthquakes problem is maybe close to impossible; I don't know which version of the dataset you're using, but your 76.25% accuracy is probably "all zeros".
As for why your lr plot goes crazy with the sampler: I think maybe because fastai "counts" the number of examples that went through and stops at 1000. If you're weighting your examples by weights that sum to less than 1, maybe you need more examples to get to 1000. Pure speculation, but that's where I would start looking. Maybe change your weights to sum to 1 and see what happens.
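For reference, a sketch of the loss-weighting variant I mentioned (the counts are illustrative, not the real Earthquakes ones):

import torch
import torch.nn as nn

class_count = torch.tensor([368., 93.])          # illustrative majority/minority counts
weights = class_count.sum() / (2 * class_count)  # inverse-frequency weights

loss_func = nn.NLLLoss(weight=weights)           # or nn.CrossEntropyLoss(weight=weights)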
A quick question: in this notebook https://gist.github.com/oguiza/c9c373aec07b96047d1ba484f23b7b47, in cell [38], len(data.train_ds) appears to be 60. How did this happen? In my case it is 30.
If you have selected the ‘OliveOil’ dataset, 30 is correct. I ran many experiments and I guess I didn't rerun that cell to get the correct output.
But if you are willing to participate in this competition, remember to change the selected dataset to ‘Earthquakes’.
I was wondering if the problem with sampling is due to the fact that this is a time series and there is a time/order dependency between the rows, so with shuffling or sampling we would lose this order (I don't know the right name for this).
I'm using this dataset from the timeseries classification website. I checked the predictions of my model and I see it's outputting 1s, so it probably learned something:
>> learner.get_preds(DatasetType.Valid)
tensor([[9.9425e-01, 5.7537e-03],
        [7.2724e-01, 2.7276e-01],
        [9.9526e-01, 4.7443e-03],
        [9.9658e-01, 3.4218e-03],
        [6.0188e-01, 3.9812e-01],
        ...,
        [5.0640e-01, 4.9360e-01],
        [5.9930e-01, 4.0070e-01],
        [9.7728e-01, 2.2716e-02]]),
tensor([0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0,
0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0])]
Is that the output of preds[1] (where preds is what your get_preds call returned)? Then those are the original labels. preds[0] should give you the probabilities (or logits etc., depending on your model setup) for your predictions (get_preds returns a list of [predictions, targets]).
https://docs.fast.ai/basic_train.html#Learner.get_preds
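For example, a quick sketch reusing your learner to turn that output into hard predictions and an accuracy check:

probs, targets = learner.get_preds(DatasetType.Valid)  # [predictions, targets]
pred_classes = probs.argmax(dim=-1)                    # hard class predictions
acc = (pred_classes == targets).float().mean()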
Thanks @marcmuc for pointing that out; I thought it was outputting the one-hot encoded version plus the argmax one! I updated the output and will investigate it later.