Welcome to our first Time Series Learning Competition: Earthquakes!

I’ll give you a hint :slight_smile:

def split_xy(data, classes):
    X = data_train[:, 1:]            # <- note: uses the global data_train, not the data argument
    y = data_train[:, 0].astype(int)
    # hot encode
    #y = one_hot_encode(y, classes)
    return X, y
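(If the hint is the data_train vs. data mix-up, the fix would presumably be to use the function's own argument:)

def split_xy(data, classes):
    X = data[:, 1:]                  # every column after the first is a feature
    y = data[:, 0].astype(int)       # the first column is the class label
    return X, y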

Haha :rofl: :rofl: :rofl: my bad! I've fixed the notebook.

As the classes are imbalanced, I tried using a weighted sampler for the training dataloader, like this:

class_sample_count = [class_0_count, class_1_count]  # e.g. 10 class-0 samples, 1 class-1 sample, etc.
weights = 1 / torch.Tensor(class_sample_count)
sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, bs)

train_dl = DataLoader(train_ds, batch_size=bs, shuffle=False, sampler=sampler)
test_dl = DataLoader(test_ds, batch_size=bs, shuffle=False)
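Note: torch.utils.data.WeightedRandomSampler expects one weight per sample, not one per class, and num_samples is the number of draws per epoch. A sketch of that pattern (assuming train_labels is a LongTensor holding each training sample's class):

import torch
from torch.utils.data import WeightedRandomSampler

class_weights = 1.0 / torch.tensor(class_sample_count, dtype=torch.float)
sample_weights = class_weights[train_labels]         # one weight per *sample*
sampler = WeightedRandomSampler(sample_weights, num_samples=len(train_labels))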

But the accuracy is best when not using the sampler (though now it's clearly overfitting):

epoch  train_loss  valid_loss  accuracy  time
92     0.006502    0.775092    0.762590  (00:00)
93     0.006222    0.775078    0.762590  (00:00)
94     0.005929    0.779696    0.762590  (00:00)
95     0.005682    0.784288    0.762590  (00:00)
96     0.005545    0.787708    0.762590  (00:00)
97     0.005510    0.786689    0.762590  (00:00)
98     0.005290    0.779218    0.762590  (00:00)
99     0.005020    0.780385    0.762590  (00:00)
100    0.004803    0.775525    0.762590  (00:00)

And the LR finder's loss curve was (it stopped at epoch 15):

epoch	train_loss
1	0.575163
2	0.563449
3	0.573439
4	0.571960
5	0.568286
6	0.565529
7	0.561100
8	0.551227
9	0.529277
10	0.496591
11	0.457549
12	0.404592
13	0.359023
14	0.360828
15	0.720238

vs. when using the sampler, the accuracy was:

epoch  train_loss  valid_loss  accuracy  time
94     0.021156    0.671377    0.525180  (00:00)
95     0.020713    0.734338    0.503597  (00:00)
96     0.020271    0.655608    0.539568  (00:00)
97     0.019853    0.706484    0.525180  (00:00)
98     0.019432    0.778318    0.474820  (00:00)
99     0.019033    0.779579    0.474820  (00:00)
100    0.018644    0.648912    0.532374  (00:00)

and with the sampler, the LR finder ran all the way to 100 epochs!!


epoch	train_loss
1	0.681915
2	0.690184
3	0.681726
4	0.675613
5	0.677629
6	0.678535
7	0.675027
8	0.675557
9	0.673278
10	0.669624
11	0.671635
12	0.672354
13	0.671159
14	0.672066
15	0.672701
16	0.673906
17	0.673748
18	0.671662
19	0.668255
20	0.669909
21	0.669830
22	0.670423
23	0.670914
24	0.669922
25	0.670148
26	0.670190
27	0.669763
28	0.670010
29	0.667932
30	0.665625
31	0.663546
32	0.661079
33	0.658819
34	0.655920
35	0.652094
36	0.648777
37	0.643848
38	0.639082
39	0.633803
40	0.626214
41	0.618589
42	0.610349
43	0.600344
44	0.590048
45	0.578545
46	0.566156
47	0.553994
48	0.540469
49	0.526928
50	0.512890
51	0.499177
52	0.485039
53	0.471086
54	0.457436
55	0.444083
56	0.431182
57	0.418695
58	0.406655
59	0.395032
60	0.383811
61	0.372989
62	0.362555
63	0.352489
64	0.342777
65	0.333405
66	0.324351
67	0.315605
68	0.307154
69	0.298985
70	0.291084
71	0.283442
72	0.276046
73	0.268887
74	0.261955
75	0.255240
76	0.248734
77	0.242428
78	0.236315
79	0.230387
80	0.224637
81	0.219059
82	0.213645
83	0.208389
84	0.203286
85	0.198331
86	0.193517
87	0.188840
88	0.184295
89	0.179878
90	0.175583
91	0.171407
92	0.167346
93	0.163396
94	0.159552
95	0.155813
96	0.152173
97	0.148630
98	0.145182
99	0.141823

Why is that? Sampling is supposed to help in exactly this kind of case!


I think it's interesting that valid loss is so much higher than train loss despite such great accuracy. Great work!

EDIT: Update, just got to the portion regarding mix-ups of train-val-test sets. As if Data Science wasn't hard enough! Regardless, keep up the good work. Now that the holiday is over and a big project at my day job is complete, I'm going to start working on this project as well!

I had a weighting scheme different from yours (weighting the classes in the loss rather than oversampling).
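(A minimal sketch of that kind of loss weighting in PyTorch; the class counts here are made up:)

import torch
import torch.nn as nn

class_counts = torch.tensor([322.0, 42.0])             # hypothetical counts for class 0 / class 1
class_weights = class_counts.sum() / (2 * class_counts)
loss_func = nn.CrossEntropyLoss(weight=class_weights)  # the rarer class contributes more to the loss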

That said, I found the same result - slightly better without the sampler. I wouldn't read too much into it, as the earthquake problem is maybe close to impossible - I don't know which version of the dataset you're using, but your 76.26% accuracy is probably "all zeros", i.e. always predicting the majority class.

As for why your LR plot goes crazy with the sampler - I think maybe it's because fastai "counts" the number of examples that went through and stops at 1000. If you're weighting your examples with weights that sum to less than 1, maybe it needs more examples to get to 1000. Pure speculation, but that's where I would start looking. Maybe change your weights to sum to 1 and see what happens.
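Something like:

weights = weights / weights.sum()   # normalize so the weights sum to 1

before building the sampler.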


A quick question.
In this notebook https://gist.github.com/oguiza/c9c373aec07b96047d1ba484f23b7b47
in cell [38]
len(data.train_ds) appears to be 60.
How did this happen?
In my case it is 30

If you have selected the ‘OliveOil’ dataset, 30 is correct. I did many experiments and I guess I didn't rerun that cell, so the output in the gist is stale.
But if you are willing to participate in this competition, remember to change the selected dataset to ‘Earthquakes’.

I was wondering if the problem with sampling is due to the fact that this is a time series, and there is a temporal order dependency between the rows, so with shuffling or sampling we would lose this order (I don't know the right name for this).

I'm using this dataset from the time series classification website. I checked the predictions of my model and I see it's outputting some 1s, so it probably learned something :slight_smile:

>> learner.get_preds(DatasetType.Valid)

tensor([[9.9425e-01, 5.7537e-03],
         [7.2724e-01, 2.7276e-01],
         [9.9526e-01, 4.7443e-03],
         [9.9658e-01, 3.4218e-03],
         [6.0188e-01, 3.9812e-01],
         [9.9603e-01, 3.9694e-03],
         [9.8937e-01, 1.0634e-02],
         [9.8595e-01, 1.4051e-02],
         [6.0619e-01, 3.9381e-01],
         [9.5675e-01, 4.3249e-02],
         [9.9305e-01, 6.9478e-03],
         [9.5982e-01, 4.0185e-02],
         [8.4680e-01, 1.5320e-01],
         [9.7382e-01, 2.6183e-02],
         [9.9200e-01, 8.0023e-03],
         [7.2946e-01, 2.7054e-01],
         [9.0504e-01, 9.4956e-02],
         [9.9728e-01, 2.7198e-03],
         [9.6200e-01, 3.7998e-02],
         [8.4773e-01, 1.5227e-01],
         [8.2070e-01, 1.7930e-01],
         [9.9535e-01, 4.6532e-03],
         [9.0245e-01, 9.7546e-02],
         [9.6488e-01, 3.5121e-02],
         [9.6708e-01, 3.2921e-02],
         [9.1114e-01, 8.8860e-02],
         [6.7380e-01, 3.2620e-01],
         [9.9140e-01, 8.6026e-03],
         [9.9147e-01, 8.5279e-03],
         [9.9574e-01, 4.2569e-03],
         [6.7916e-01, 3.2084e-01],
         [8.1221e-01, 1.8779e-01],
         [7.2926e-01, 2.7074e-01],
         [6.7195e-01, 3.2805e-01],
         [9.7636e-01, 2.3637e-02],
         [9.8936e-01, 1.0645e-02],
         [9.8023e-01, 1.9773e-02],
         [9.8331e-01, 1.6693e-02],
         [9.8056e-01, 1.9441e-02],
         [8.0585e-01, 1.9415e-01],
         [8.5409e-01, 1.4591e-01],
         [9.6576e-01, 3.4244e-02],
         [5.7844e-01, 4.2156e-01],
         [8.5839e-01, 1.4161e-01],
         [8.3605e-01, 1.6395e-01],
         [7.9539e-01, 2.0461e-01],
         [9.9599e-01, 4.0103e-03],
         [8.9708e-01, 1.0292e-01],
         [9.9833e-01, 1.6663e-03],
         [9.9922e-01, 7.7785e-04],
         [9.3736e-01, 6.2635e-02],
         [9.7506e-01, 2.4941e-02],
         [9.9683e-01, 3.1704e-03],
         [9.6329e-01, 3.6711e-02],
         [9.9205e-01, 7.9503e-03],
         [8.9988e-01, 1.0012e-01],
         [3.4490e-01, 6.5510e-01],
         [9.9277e-01, 7.2321e-03],
         [9.4736e-01, 5.2636e-02],
         [8.4546e-01, 1.5454e-01],
         [9.8601e-01, 1.3993e-02],
         [9.6343e-01, 3.6571e-02],
         [9.6380e-01, 3.6198e-02],
         [5.7684e-01, 4.2316e-01],
         [7.6970e-01, 2.3030e-01],
         [5.4828e-01, 4.5172e-01],
         [9.5975e-01, 4.0253e-02],
         [6.9527e-01, 3.0473e-01],
         [8.5458e-01, 1.4542e-01],
         [9.9969e-01, 3.0637e-04],
         [9.5228e-01, 4.7721e-02],
         [9.5492e-01, 4.5078e-02],
         [9.8068e-01, 1.9323e-02],
         [7.1458e-01, 2.8542e-01],
         [5.6506e-01, 4.3494e-01],
         [9.8045e-01, 1.9546e-02],
         [9.2896e-01, 7.1041e-02],
         [9.9604e-01, 3.9574e-03],
         [9.8500e-01, 1.4995e-02],
         [9.3539e-01, 6.4615e-02],
         [6.9669e-01, 3.0331e-01],
         [8.9084e-01, 1.0916e-01],
         [9.2574e-01, 7.4258e-02],
         [9.9943e-01, 5.7221e-04],
         [9.5959e-01, 4.0410e-02],
         [9.5426e-01, 4.5743e-02],
         [9.8531e-01, 1.4690e-02],
         [9.9888e-01, 1.1250e-03],
         [6.8742e-01, 3.1258e-01],
         [9.9715e-01, 2.8496e-03],
         [7.7061e-01, 2.2939e-01],
         [6.5534e-01, 3.4466e-01],
         [4.4688e-01, 5.5312e-01],
         [8.8147e-01, 1.1853e-01],
         [9.9980e-01, 1.9565e-04],
         [7.6115e-01, 2.3885e-01],
         [9.9205e-01, 7.9482e-03],
         [6.2418e-01, 3.7582e-01],
         [9.5457e-01, 4.5432e-02],
         [9.3219e-01, 6.7809e-02],
         [9.7844e-01, 2.1556e-02],
         [8.5520e-01, 1.4480e-01],
         [9.2151e-01, 7.8488e-02],
         [9.9145e-01, 8.5452e-03],
         [6.2208e-01, 3.7792e-01],
         [9.5117e-01, 4.8829e-02],
         [5.7008e-01, 4.2992e-01],
         [9.8578e-01, 1.4219e-02],
         [9.9276e-01, 7.2404e-03],
         [6.2644e-01, 3.7356e-01],
         [9.4103e-01, 5.8970e-02],
         [2.6042e-01, 7.3958e-01],
         [8.4114e-01, 1.5886e-01],
         [9.9983e-01, 1.7048e-04],
         [9.8680e-01, 1.3196e-02],
         [8.4676e-01, 1.5324e-01],
         [9.6020e-01, 3.9797e-02],
         [8.7532e-01, 1.2468e-01],
         [9.8866e-01, 1.1343e-02],
         [8.3955e-01, 1.6045e-01],
         [8.9132e-01, 1.0868e-01],
         [9.9871e-01, 1.2883e-03],
         [9.8665e-01, 1.3352e-02],
         [7.1213e-01, 2.8787e-01],
         [9.3852e-01, 6.1476e-02],
         [2.2901e-01, 7.7099e-01],
         [9.4070e-01, 5.9303e-02],
         [9.7276e-01, 2.7241e-02],
         [7.0276e-01, 2.9724e-01],
         [9.7210e-01, 2.7895e-02],
         [9.9671e-01, 3.2880e-03],
         [5.3830e-01, 4.6170e-01],
         [8.5895e-01, 1.4105e-01],
         [8.7945e-01, 1.2055e-01],
         [9.9073e-01, 9.2734e-03],
         [9.6120e-01, 3.8799e-02],
         [5.0640e-01, 4.9360e-01],
         [5.9930e-01, 4.0070e-01],
         [9.7728e-01, 2.2716e-02]]),
tensor([0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
         1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0,
         0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
         0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0,
         0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0])]

Is that the output of preds[1] (where preds is what your get_preds call returned)? Then those are the original labels. preds[0] should give you probabilities (or logits etc., depending on your model setup) for your predictions (get_preds returns a list of (predictions, targets)).
https://docs.fast.ai/basic_train.html#Learner.get_preds
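A minimal sketch of turning that output into hard predictions and an accuracy check (fastai v1):

preds, targets = learner.get_preds(DatasetType.Valid)  # probabilities, labels
pred_classes = preds.argmax(dim=1)                     # predicted class per row
accuracy = (pred_classes == targets).float().mean()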


Thanks @marcmuc for pointing that out, I thought it was outputting the one-hot encoded version plus the argmax one! I updated the output and will investigate it later.

I tried my implementation on the OliveOil dataset, and it seems that I'm definitely getting something wrong with it:

>> learner.fit(50, lr=3e-3)
Total time: 00:03
epoch	train_loss	valid_loss	accuracy
1	1.396385	1.359485	0.400000
2	1.381889	1.357261	0.400000
3	1.372547	1.333944	0.400000
4	1.369740	1.322976	0.400000
5	1.353715	1.312739	0.400000
. . .
45	1.313794	1.295826	0.400000
46	1.314028	1.295697	0.400000
47	1.312592	1.295847	0.400000
48	1.311265	1.296315	0.400000
49	1.309709	1.296983	0.400000
50	1.308661	1.297208	0.400000


>> learner.get_preds(DatasetType.Train)
[tensor([[0.1642, 0.2768, 0.1338, 0.4252],
         [0.1633, 0.2772, 0.1331, 0.4264],
         [0.1634, 0.2770, 0.1331, 0.4264],
         [0.1623, 0.2776, 0.1324, 0.4276],
         [0.1638, 0.2770, 0.1333, 0.4260],
         [0.1625, 0.2775, 0.1328, 0.4272],
         [0.1620, 0.2778, 0.1324, 0.4277],
         [0.1619, 0.2779, 0.1324, 0.4279],
         [0.1620, 0.2777, 0.1326, 0.4277],
         [0.1621, 0.2777, 0.1324, 0.4277],
         [0.1612, 0.2783, 0.1321, 0.4284],
         [0.1618, 0.2780, 0.1324, 0.4278],
         [0.1623, 0.2777, 0.1325, 0.4276],
         [0.1605, 0.2785, 0.1318, 0.4292],
         [0.1609, 0.2783, 0.1319, 0.4289],
         [0.1658, 0.2760, 0.1354, 0.4228],
         [0.1657, 0.2762, 0.1355, 0.4226],
         [0.1616, 0.2779, 0.1322, 0.4283],
         [0.1617, 0.2778, 0.1320, 0.4285],
         [0.1616, 0.2778, 0.1320, 0.4286],
         [0.1609, 0.2782, 0.1318, 0.4291],
         [0.1619, 0.2777, 0.1321, 0.4283],
         [0.1614, 0.2780, 0.1320, 0.4287],
         [0.1611, 0.2781, 0.1319, 0.4289],
         [0.1623, 0.2775, 0.1325, 0.4278],
         [0.1616, 0.2778, 0.1321, 0.4285],
         [0.1618, 0.2777, 0.1321, 0.4284],
         [0.1622, 0.2776, 0.1324, 0.4278],
         [0.1619, 0.2776, 0.1321, 0.4283],
         [0.1618, 0.2779, 0.1322, 0.4281]]),
 tensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3,
         3, 3, 3, 3, 3, 3])]

Here is my notebook.

I am currently trying to solve a similar bug which somehow prevents my small custom NN from training properly on sequence data transformed to images, i.e. train loss, valid loss, and accuracy do not improve and stay almost the same over several epochs.

So far I have tried these strategies, but have not been successful yet (see the sketch after the list):

  • Verify that the y values are handed over as int values and that the data is stored as a categorical variable (print the DataBunch object to check). Otherwise fastai will misinterpret your application and will not choose the right loss function. (In my case, with two labels, the loss function should be torch.nn.functional.cross_entropy, afaik.)
  • Make sure that your NN's end stage is compatible with your loss function (cross-entropy loss has a LogSoftmax included, so you don't need one in your NN).
  • The create_cnn function uses apply_init(model[1], nn.init.kaiming_normal_) for the new head, i.e. the new and untrained part. (However, so far I could not see a huge difference when I use it for my NN - maybe a little more change in the parameters, which should be a good thing when visualized with TensorBoard.)
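A sketch of those checks, assuming a fastai v1 DataBunch `data` and Learner `learn`:

print(data.train_ds.y)    # should be a CategoryList with integer class codes
print(learn.loss_func)    # should resolve to cross entropy for classification
print(learn.model)        # the last layer should emit raw logits, i.e. no final
                          # Softmax, since cross entropy applies LogSoftmax itself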

I also tried an adapted pretrained ResNet18 with my image data and got the same problem. Because of that, I am currently checking my data object setup, loss functions, and my simple NN setup.

Maybe this helps you, and maybe you have tried other approaches to solve this problem?

Kind regards
Michael


Have you tried the focal loss? Jeremy used it for object localization. I have used it for segmentation, where it works well - better than weighting by the inverse class frequency.
Just an idea!
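(A minimal focal-loss sketch in PyTorch, after Lin et al. 2017 - not a fastai built-in:)

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    ce = F.cross_entropy(logits, targets, reduction='none')  # per-example CE
    pt = torch.exp(-ce)                     # probability assigned to the true class
    return ((1 - pt) ** gamma * ce).mean()  # easy examples (high pt) are down-weighted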


Do you have a classification of the kinds of problems time series models try to solve, like this:

  • predictors, like the next-word problem
  • counters, like the following:
    if I provide “aaaX” it will output “bbb”
    if I provide “aXaXaX” it will also output “bbb”
    if I provide “aaaaX” it will output “bbbb”
  • seq2seq (these would be translators)
    and so on…

Let me know if this question is misplaced.

Hi. From the Rossmann example in fastai, how does the embedding approach fit into this mix? For instance, the contention in the fastai lecture is that for “structured” time series datasets (sales, inventory, etc.) the embedding approach (encoding the time attributes “DayofWeek”, “MonthofYear”, “Date”) may be well suited, but perhaps in some cases a CNN/RNN/hybrid may work better. Could someone on this forum clarify what delineates which problems fall into which category? Thanks again, and great forum.
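(For concreteness, a hypothetical sketch of the embedding approach for a calendar attribute - each category gets a learned vector that is concatenated with the continuous features and fed to the network; the sizes here are illustrative only:)

import torch
import torch.nn as nn

emb_day_of_week = nn.Embedding(7, 4)    # 7 categories -> 4-dim learned vector
emb_month = nn.Embedding(12, 6)
day = torch.tensor([2])                 # e.g. Wednesday encoded as 2
vec = emb_day_of_week(day)              # shape (1, 4), trained with the rest of the model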

I have answered something similar in the TSSG thread; maybe this makes it a little clearer?

Yes that helps. Thank you.

Hello guys, I implemented a Fully Convolutional Network (FCN) architecture with data augmentation using GANs and got 76.97%; without data augmentation I got 75.53%. The augmentation generates synthetic samples of class 1 to balance them against the class 0 data.
Here is a link to the notebook.
PS: I'm using UCRArchive_2018
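(Not the exact model from the notebook, but a minimal sketch of an FCN for time series classification in the style of Wang et al. 2017:)

import torch
import torch.nn as nn

class FCN(nn.Module):
    def __init__(self, in_channels=1, n_classes=2):
        super().__init__()
        def block(c_in, c_out, k):       # conv -> batchnorm -> ReLU
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, k, padding=k // 2),
                nn.BatchNorm1d(c_out),
                nn.ReLU())
        self.features = nn.Sequential(
            block(in_channels, 128, 8),
            block(128, 256, 5),
            block(256, 128, 3))
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):                # x: (batch, channels, length)
        x = self.features(x)
        x = x.mean(dim=-1)               # global average pooling over time
        return self.head(x)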