Time series / sequential data study group

Hi friends! :wave:

I’ve built a little GRU model in PyTorch that tries to predict shampoo sales from this tiny, old shampoo dataset: https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv

Here’s the model:

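import torch
from torch import nn

class GRU(nn.Module):
    def __init__(self, input_size=1, hidden_size=100, num_of_recurrent_layers=2, output_size=1):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, num_of_recurrent_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)
        # initial hidden state: (num_layers, batch, hidden_size); the batch size of 2 is hard-coded
        self.memory_cell = torch.zeros(num_of_recurrent_layers, 2, hidden_size)

    def forward(self, x):
        # reshape x to (len(x), 1, features) before passing it to the GRU
        res, memory_cell = self.gru(x.view(len(x), 1, -1), self.memory_cell)
        # detach so gradients do not flow back into earlier calls
        self.memory_cell = memory_cell.detach()
        result = self.fc(res.view(len(x), -1))
        return result[-1]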

Model was inspired by amazing people on the interwebz and I can hardly take all of the credit here :slight_smile:

I’ve been playing around with shapes and sizes of inputs and outputs for the last few days and I finally got the model to a place where it’s performing really well.

Now, the main thing I don’t understand: almost all of the papers/posts I’ve read that use PyTorch for univariate problems take the last item of the predicted array as the prediction at the end of the forward method, and I can’t figure out why.

I’m doing the same thing with my model and it works great but I don’t understand why. Here’s the rundown of shapes and sizes in forward:

res -> 2x1x100
result -> 2x1

Input to a linear layer is converting res into 2x100 which makes sense but the part that does not make sense is that there are two things returned from gru and that the last one is the one I need. I was expecting 1x100 :thinking:

any ideas?

Any feedback/response much appreciated,

Nik

Ok so after a few espressos I think I figured it out. Here’s the pic that helped out a lot:

Basically, res (output in the pic on the right) in my GRU holds the hidden states of the last layer (“depth”-wise) at every time step, while memory_cell holds the hidden state of every layer at the last time step (“time”-wise).

So res has a shape of 2x1x100 because I have two inputs and each of them has a state (aka activation), which I then put through a linear layer to get an actual prediction. The reason I need -1 is that I only care about the last item in the output sequence, since I’m predicting shampoo sales for the next month.
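
A quick way to check this is to print the shapes of both values returned by nn.GRU. The sketch below is only an illustration and differs from the model above in one assumed detail: it feeds the two values as a single sequence of length 2 (batch size 1), so the time axis is explicit. The names seq_len, output and h_n are chosen for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)

seq_len, hidden_size, num_layers = 2, 100, 2
gru = nn.GRU(input_size=1, hidden_size=hidden_size,
             num_layers=num_layers, batch_first=True)

# One sequence with two time steps: (batch=1, seq_len=2, features=1)
x = torch.randn(1, seq_len, 1)
output, h_n = gru(x)

# output: top-layer hidden state at every time step -> (batch, seq_len, hidden)
print(tuple(output.shape))  # (1, 2, 100)
# h_n: hidden state of every layer at the last time step -> (layers, batch, hidden)
print(tuple(h_n.shape))     # (2, 1, 100)

# The last time step of output is the top layer's entry in h_n.
same = torch.allclose(output[:, -1, :], h_n[-1])
print(same)  # True
```

So taking the last item of the output sequence and reading the top layer of the returned hidden state give the same vector for a unidirectional GRU.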

Does this sound right?

Here’s the stack overflow post that helped: https://stackoverflow.com/questions/48302810/whats-the-difference-between-hidden-and-output-in-pytorch-lstm

Also here’s the video that cleared up a lot of the confusion https://www.youtube.com/watch?v=Bl6WVj6wQaE (~11:30 is when Rachel starts talking about output vs hidden state)

Hello Oguiza,

I downloaded your repository timeseriesAI-master a month ago, and installed numba. At that time, “05_ROCKET_a_new_SOTA_classifier” ran perfectly without error. But I just now went back to the same notebook and find an error. As far as I know, I made no changes to the conda environment, but who knows? Can you help?

pytorch: 1.2.0
fastai : 1.0.59
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]

What I have done:

  1. Downloaded timeseriesAI-master again, with no changes.
  2. Removed numba and reinstalled numba 0.46.0.
  3. Restarted Jupyter.

The error occurs with
kernels = generate_kernels(seq_len, 10000)

The error is quite verbose, but here goes…

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/targets/base.py in get_constant_generic(self, builder, ty, val)
    498         try:
--> 499             impl = self._get_constants.find((ty,))
    500             return impl(self, builder, ty, val)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/targets/base.py in find(self, sig)
     49         if out is None:
---> 50             out = self._find(sig)
     51             self._cache[sig] = out

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/targets/base.py in _find(self, sig)
     58         else:
---> 59             raise NotImplementedError(self, sig)
     60 

NotImplementedError: (<numba.targets.base.OverloadSelector object at 0x7fd647b34c90>, (reflected list(int64),))

During handling of the above exception, another exception occurred:

NotImplementedError                       Traceback (most recent call last)
~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/errors.py in new_error_context(fmt_, *args, **kwargs)
    716     try:
--> 717         yield
    718     except NumbaError as e:

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower_block(self, block)
    259                                    loc=self.loc, errcls_=defaulterrcls):
--> 260                 self.lower_inst(inst)
    261 

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower_inst(self, inst)
    302             ty = self.typeof(inst.target.name)
--> 303             val = self.lower_assign(ty, inst)
    304             self.storevar(val, inst.target.name)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower_assign(self, ty, inst)
    482                 const = self.context.get_constant_generic(self.builder, valty,
--> 483                                                           pyval)
    484                 # cast it to the variable type

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/targets/base.py in get_constant_generic(self, builder, ty, val)
    501         except NotImplementedError:
--> 502             raise NotImplementedError("Cannot lower constant of type '%s'" % (ty,))
    503 

NotImplementedError: Cannot lower constant of type 'reflected list(int64)'

During handling of the above exception, another exception occurred:

LoweringError                             Traceback (most recent call last)
<ipython-input-8-4e463a985c18> in <module>
----> 1 kernels = generate_kernels(seq_len, 10000)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
    418                     e.patch_message('\n'.join((str(e).rstrip(), help_msg)))
    419             # ignore the FULL_TRACEBACKS config, this needs reporting!
--> 420             raise e
    421 
    422     def inspect_llvm(self, signature=None):

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
    351                 argtypes.append(self.typeof_pyval(a))
    352         try:
--> 353             return self.compile(tuple(argtypes))
    354         except errors.ForceLiteralArg as e:
    355             # Received request for compiler re-entry with the list of arguments

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     30         def _acquire_compile_lock(*args, **kwargs):
     31             with self:
---> 32                 return func(*args, **kwargs)
     33         return _acquire_compile_lock
     34 

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/dispatcher.py in compile(self, sig)
    766             self._cache_misses[sig] += 1
    767             try:
--> 768                 cres = self._compiler.compile(args, return_type)
    769             except errors.ForceLiteralArg as e:
    770                 def folded(args, kws):

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/dispatcher.py in compile(self, args, return_type)
     75 
     76     def compile(self, args, return_type):
---> 77         status, retval = self._compile_cached(args, return_type)
     78         if status:
     79             return retval

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/dispatcher.py in _compile_cached(self, args, return_type)
     89 
     90         try:
---> 91             retval = self._compile_core(args, return_type)
     92         except errors.TypingError as e:
     93             self._failed_cache[key] = e

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/dispatcher.py in _compile_core(self, args, return_type)
    107                                       args=args, return_type=return_type,
    108                                       flags=flags, locals=self.locals,
--> 109                                       pipeline_class=self.pipeline_class)
    110         # Check typing error if object mode is used
    111         if cres.typing_error is not None and not flags.enable_pyobject:

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler.py in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library, pipeline_class)
    526     pipeline = pipeline_class(typingctx, targetctx, library,
    527                               args, return_type, flags, locals)
--> 528     return pipeline.compile_extra(func)
    529 
    530 

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler.py in compile_extra(self, func)
    324         self.state.lifted = ()
    325         self.state.lifted_from = None
--> 326         return self._compile_bytecode()
    327 
    328     def compile_ir(self, func_ir, lifted=(), lifted_from=None):

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler.py in _compile_bytecode(self)
    383         """
    384         assert self.state.func_ir is None
--> 385         return self._compile_core()
    386 
    387     def _compile_ir(self):

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler.py in _compile_core(self)
    363                 self.state.status.fail_reason = e
    364                 if is_final_pipeline:
--> 365                     raise e
    366         else:
    367             raise CompilerError("All available pipelines exhausted")

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler.py in _compile_core(self)
    354             res = None
    355             try:
--> 356                 pm.run(self.state)
    357                 if self.state.cr is not None:
    358                     break

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler_machinery.py in run(self, state)
    326                     (self.pipeline_name, pass_desc)
    327                 patched_exception = self._patch_error(msg, e)
--> 328                 raise patched_exception
    329 
    330     def dependency_analysis(self):

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler_machinery.py in run(self, state)
    317                 pass_inst = _pass_registry.get(pss).pass_inst
    318                 if isinstance(pass_inst, CompilerPass):
--> 319                     self._runPass(idx, pass_inst, state)
    320                 else:
    321                     raise BaseException("Legacy pass in use")

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     30         def _acquire_compile_lock(*args, **kwargs):
     31             with self:
---> 32                 return func(*args, **kwargs)
     33         return _acquire_compile_lock
     34 

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler_machinery.py in _runPass(self, index, pss, internal_state)
    279             mutated |= check(pss.run_initialization, internal_state)
    280         with SimpleTimer() as pass_time:
--> 281             mutated |= check(pss.run_pass, internal_state)
    282         with SimpleTimer() as finalize_time:
    283             mutated |= check(pss.run_finalizer, internal_state)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/compiler_machinery.py in check(func, compiler_state)
    266 
    267         def check(func, compiler_state):
--> 268             mangled = func(compiler_state)
    269             if mangled not in (True, False):
    270                 msg = ("CompilerPass implementations should return True/False. "

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/typed_passes.py in run_pass(self, state)
    378             state.library.enable_object_caching()
    379 
--> 380         NativeLowering().run_pass(state) # TODO: Pull this out into the pipeline
    381         lowered = state['cr']
    382         signature = typing.signature(state.return_type, *state.args)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/typed_passes.py in run_pass(self, state)
    323                 lower = lowering.Lower(targetctx, library, fndesc, interp,
    324                                        metadata=metadata)
--> 325                 lower.lower()
    326                 if not flags.no_cpython_wrapper:
    327                     lower.create_cpython_wrapper(flags.release_gil)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower(self)
    177         if self.generator_info is None:
    178             self.genlower = None
--> 179             self.lower_normal_function(self.fndesc)
    180         else:
    181             self.genlower = self.GeneratorLower(self)

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower_normal_function(self, fndesc)
    218         # Init argument values
    219         self.extract_function_arguments()
--> 220         entry_block_tail = self.lower_function_body()
    221 
    222         # Close tail of entry block

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower_function_body(self)
    243             bb = self.blkmap[offset]
    244             self.builder.position_at_end(bb)
--> 245             self.lower_block(block)
    246 
    247         self.post_lower()

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/lowering.py in lower_block(self, block)
    258             with new_error_context('lowering "{inst}" at {loc}', inst=inst,
    259                                    loc=self.loc, errcls_=defaulterrcls):
--> 260                 self.lower_inst(inst)
    261 
    262     def create_cpython_wrapper(self, release_gil=False):

~/anaconda3/envs/fastai3/lib/python3.7/contextlib.py in __exit__(self, type, value, traceback)
    128                 value = type()
    129             try:
--> 130                 self.gen.throw(type, value, traceback)
    131             except StopIteration as exc:
    132                 # Suppress StopIteration *unless* it's the same exception that

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/errors.py in new_error_context(fmt_, *args, **kwargs)
    723         from numba import config
    724         tb = sys.exc_info()[2] if config.FULL_TRACEBACKS else None
--> 725         six.reraise(type(newerr), newerr, tb)
    726 
    727 

~/anaconda3/envs/fastai3/lib/python3.7/site-packages/numba/six.py in reraise(tp, value, tb)
    667         if value.__traceback__ is not tb:
    668             raise value.with_traceback(tb)
--> 669         raise value
    670 
    671 else:

LoweringError: Failed in nopython mode pipeline (step: nopython mode backend)
Cannot lower constant of type 'reflected list(int64)'

File "fastai_timeseries/exp/rocket_functions.py", line 16:
def generate_kernels(input_length, num_kernels, kss=[7, 9, 11], pad=True, dilate=True):
    candidate_lengths = np.array((kss))
    ^

[1] During: lowering "kss = arg(2, name=kss)" at /home/malcolm/fastaiActive/ROCKET/timeseriesAI-master20191230/fastai_timeseries/exp/rocket_functions.py (16)

-------------------------------------------------------------------------------
This should not have happened, a problem has occurred in Numba's internals.
You are currently using Numba version 0.46.0.

Please report the error message and traceback, along with a minimal reproducer
at: https://github.com/numba/numba/issues/new

If more help is needed please feel free to speak to the Numba core developers
directly at: https://gitter.im/numba/numba

Thanks in advance for your help in improving Numba!
--------------------------------------

Do you have any clue about what is going on here? This numba error might as well be ancient Sumerian for all I can understand of it.

Thanks for any help.
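
For what it is worth, the last frames of the traceback point at the default argument kss=[7, 9, 11]: Numba reports that it cannot turn a Python list ('reflected list(int64)') into a compile-time constant. A possible workaround, sketched below on a stripped-down, hypothetical stand-in for generate_kernels (not the real function from rocket_functions.py), is to make the default a tuple. The import fallback is only there so the sketch also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback so the sketch runs without Numba (no compilation)
    def njit(func):
        return func


@njit
def candidate_kernel_lengths(kss=(7, 9, 11)):
    # Hypothetical stand-in for the first line of generate_kernels.
    # The default is a tuple here instead of the list from the traceback.
    return np.array(kss)


lengths = candidate_kernel_lengths()
print(lengths)  # [ 7  9 11]
```

Passing the kernel sizes explicitly as a NumPy array at the call site would be another option, since the failure only concerns the default value.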

@Pomo I’m also seeing the issue in the ROCKET notebook. I’ve opened a GitHub issue on the repo:

Some new papers that might be interesting:

Temporal Tensor Transformation Network for Multivariate Time Series Prediction
https://arxiv.org/abs/2001.01051

Root Cause Detection Among Anomalous Time Series Using Temporal State Alignment
https://arxiv.org/abs/2001.01056

Hello,

I want to compare the performance of an EfficientNet-B0 model trained on images with the performance of the same model trained on time series data. The problem is a multi-label classification of 9 classes.

So far I have been able to create an appropriate TimeSeriesList using fastai_timeseries (timeseriesAI):


I know that I have to define the number of classes the EfficientNet-B0 model should have:

Yet it seems that I should also change the input size of the model to match the data, because when trying:

print(learn.summary())

I get this error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [1, 2, 100] instead.

Can you help?

Dear Thomas,

Can you please share with me the code for regression that you have used? I am stuck at using InceptionTime for regression…
Thank you!

For regression I am using ItemLists very similar to the ones in @oguiza’s timeseries library.
The only thing you need to do is change the loss_func to L2, i.e. MSE.
So you need a LabelList where the x’s are time series and the y’s are floats, and a Learner with a compatible loss_func. The network is exactly the same. My code is something like this:

def curves_from_arrays(X_train, y_train, X_valid, y_valid, label_cls=FloatList):
    src = ItemLists('.', IVCurveList(X_train), IVCurveList(X_valid))
    # pass label_cls through instead of hard-coding FloatList
    return src.label_from_lists(y_train, y_valid, label_cls=label_cls)
# reading from numpy arrays
data = curves_from_arrays(X_train, y_train, X_valid, y_valid, label_cls=FloatList)

Each x is a time series (an IV curve with 2 channels, voltage and current) and each y is 3 floats (I am regressing on 3 values).

>>data
LabelLists;

Train: LabelList (162176 items)
x: IVCurveList
IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200)
y: FloatList
[ 0.596429 -0.778986  2.887455],[ 0.191421 -0.728332 -0.999715],[ 0.766844 -0.152139 -0.545443],[-1.48883   0.756254 -0.743682],[ 0.776515 -0.732029 -0.526256]
Path: .;

Valid: LabelList (18020 items)
x: IVCurveList
IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200),IVCurve(ch=2, seq_len=200)
y: FloatList
[ 0.530938 -0.308305 -0.074202],[-2.674041  0.203413 -0.858775],[ 0.75456   0.636721 -0.159548],[0.762624 0.798379 0.160289],[ 0.74852  -0.539873 -0.589922]
Path: .;

Test: None

Then we put everything on a DataBunch

#on a databunch
db = data.databunch(bs=1024, val_bs=2048, num_workers=10)
#the network
model = create_inception(2,3)
#the learner
learn = Learner(db, model, loss_func=nn.MSELoss(), metrics=[mean_absolute_error, r2_score])

et voila!

dear Thomas,

Thank you for your valuable help!
I was able to create time series data using timeseriesAI (my data are time series of length 500 with 11 targets):

df = pd.read_csv('D:\\Projects\\PQD_classification\\NILM_Classific\\TS_AVG.csv')
db = (TimeSeriesList.from_df(df, '.', cols=df.columns.values[:500], feat=None)
      .split_by_rand_pct(valid_pct=0.2, seed=seed)
      .label_from_df(cols=['AVG_1','AVG_2','AVG_3','AVG_4','AVG_5','AVG_6','AVG_7','AVG_8','AVG_9','AVG_10','AVG_11'], label_cls=FloatList)
      .databunch(bs=bs, val_bs=bs * 2, num_workers=0, device=torch.device('cuda'))
      .scale(scale_type=scale_type, scale_by_channel=scale_by_channel,
             scale_by_sample=scale_by_sample, scale_range=scale_range)
     )
db



Still, once that’s done, I run into two problems:
- First, the learning rate finder: the validation loss during the search is always #nan.

arch = InceptionTime # :eight_spoked_asterisk:
arch_kwargs = dict() # :eight_spoked_asterisk:
opt_func=Ranger
model = arch(db.features, db.c, **arch_kwargs).to(device) #db.c=11
learn = Learner(db, model, metrics= [mean_absolute_error, r2_score], opt_func=opt_func,loss_func= nn.MSELoss())
learn.lr_find()
learn.recorder.plot(suggestion=True)


(also the Loss is very high…I wonder if regression for this problem is doable!)

- Second, once I try to train the model, I get: ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead.
This is the code I used for training:

from sklearn.model_selection import StratifiedKFold
skf=StratifiedKFold(n_splits=2, random_state=1, shuffle=True)
acc_val=[]
acc2_val=[]
np.random.seed(42)

import time
start_time = time.time()

for train_index, val_index in skf.split(df.index, df['AVG_1']):
    src = (TimeSeriesList.from_df(df, base_dir, cols=df.columns.values[:500], feat=None)
           .split_by_idxs(train_index, val_index)
           .label_from_df(cols=['AVG_1','AVG_2','AVG_3','AVG_4','AVG_5','AVG_6','AVG_7','AVG_8','AVG_9','AVG_10','AVG_11'], label_cls=FloatList))
    data_fold = (src.databunch(bs=bs, val_bs=bs * 2, num_workers=1, device=torch.device('cuda'))
                 .scale(scale_type=scale_type, scale_by_channel=scale_by_channel, scale_by_sample=scale_by_sample, scale_range=scale_range))
    model = arch(db.features, db.c, **arch_kwargs).to(device)
    learn = Learner(data_fold, model, loss_func=nn.MSELoss(), metrics=[mean_absolute_error, r2_score], opt_func=opt_func, callback_fns=ShowGraph)
    learn.fit_one_cycle(2, slice(lr2))
    loss, acc, acc2 = learn.validate()
    acc_val.append(acc.numpy())
    acc2_val.append(acc2.numpy())

print("--- %s seconds ---" % (time.time() - start_time))

Do you have any input on the problems? Thank you again for all!

I can give you some hints that worked for me:

  • Regression needs the data prepared for the task. I normalize both my input time series and my output variables.
  • Regressing on 11 variables is a hard problem; try with 1 first.
  • Play with loss functions. I found that MAE (mean absolute error, L1) worked better for me than MSE (L2).
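
The first and third hints can be sketched in a few lines. The numbers below are made up for illustration. The point is that the target statistics are computed on the training set only, and that a single large error moves MSE far more than MAE. In PyTorch the MAE loss is available as nn.L1Loss().

```python
import numpy as np

# Made-up targets: 4 training samples, 3 regression outputs each.
y_train = np.array([[1.0, 10.0, 100.0],
                    [2.0, 20.0, 200.0],
                    [3.0, 30.0, 300.0],
                    [4.0, 40.0, 400.0]])

# Per-output statistics from the training set only.
mean = y_train.mean(axis=0)
std = y_train.std(axis=0)
y_train_norm = (y_train - mean) / std

# After scaling, every output column has mean 0 and std 1.
centered = np.allclose(y_train_norm.mean(axis=0), 0.0)
unit_std = np.allclose(y_train_norm.std(axis=0), 1.0)
print(centered, unit_std)

# One outlier error dominates MSE but not MAE.
errors = np.array([0.1, 0.1, 0.1, 3.0])
mae = np.abs(errors).mean()
mse = (errors ** 2).mean()
print(mae, mse)
```

To get predictions back in the original units, multiply them by std and add mean.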

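Your second bug comes from sklearn: StratifiedKFold can only stratify on discrete class labels, and df['AVG_1'] is continuous. Use a splitter that does not stratify, such as KFold.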
I got pretty stable results for my problem after doing this.
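
A minimal sketch of a non-stratified split for a continuous target, using sklearn's KFold in place of StratifiedKFold. The target values are made up and stand in for df['AVG_1']; the resulting train_index and val_index can be passed to split_by_idxs as in the loop above.

```python
import numpy as np
from sklearn.model_selection import KFold

# Made-up continuous target standing in for df['AVG_1'].
rng = np.random.default_rng(42)
y = rng.normal(size=10)

kf = KFold(n_splits=2, shuffle=True, random_state=1)
folds = list(kf.split(y))  # KFold only needs the number of samples
for train_index, val_index in folds:
    print(len(train_index), len(val_index))
```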

Here is a recent paper using transfer learning with time series
https://www.researchgate.net/publication/337095012_A_Novel_Approach_to_Short-Term_Stock_Price_Movement_Prediction_using_Transfer_Learning

Amazing! Thank you for setting up this repo. I followed the 01_intro notebook to build a weather forecasting model. It trains well but now I’m stuck with doing inference and I can’t quite get it working. I’d like to try it in the real world which means I don’t have the test data at the time of defining the initial databunch, only afterwards and as individual sequences.

What’s the best way to do inference on individual sequences on-demand and use the same scaling as in training? The new data sequence is a dataframe like this:

feat   0   1   2   3   4  5  6   7  target
0     12  11  13  12  10  7  9  10  None
1      7   6   6   8   9  5  6   6  None
2      9  10   9   8   8  6  7   6  None

Thanks a lot for your work!
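
Independent of which helper timeseriesAI offers for this, the general pattern can be sketched without the library, assuming per-feature standardization: compute the scaling statistics once on the training data, keep them with the model, and apply them unchanged to every new sequence. The training array, its shape and the helper name scale_like_training are assumptions made for this example, and the new sequence from the post is read here as 3 features by 8 time steps.

```python
import numpy as np

# Made-up training set: (samples, features, seq_len)
rng = np.random.default_rng(0)
X_train = rng.normal(loc=10.0, scale=2.0, size=(50, 3, 8))

# Statistics per feature, computed once on the training data.
train_mean = X_train.mean(axis=(0, 2), keepdims=True)  # shape (1, 3, 1)
train_std = X_train.std(axis=(0, 2), keepdims=True)    # shape (1, 3, 1)


def scale_like_training(seq):
    """Standardize one new (features, seq_len) sequence with the training stats."""
    return (seq - train_mean[0]) / train_std[0]


# The new sequence from the post.
new_seq = np.array([[12, 11, 13, 12, 10, 7, 9, 10],
                    [7, 6, 6, 8, 9, 5, 6, 6],
                    [9, 10, 9, 8, 8, 6, 7, 6]], dtype=float)
scaled = scale_like_training(new_seq)
print(scaled.shape)  # (3, 8)
```

Saving train_mean and train_std next to the model weights (for example with np.save) keeps inference consistent with training.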

Hi @angusde, how does Rocket handle seasonality trends?

Hi,

I built a library for univariate forecasting, based on N-BEATS. I adapted it a lot because initially it was quite hard to train. Now it trains very fast. I also added some features to interpret the model.

Please let me know what you guys think. I’m thinking of writing up my changes in a blog post and/or expanding the library. But I would love your feedback first.

github: https://github.com/takotab/fastseq
docs: https://takotab.github.io/fastseq//index.html

Wow, that’s great! I was starting to read the paper just a few days ago.

How are the results with fastai compared to the results from the paper?

The full M4 dataset does not fit in my memory, but the results are at least up there. I have not yet done a full training run that switches between datasets. Help (or better ideas to circumvent this issue) in this direction would be appreciated.

I saw that Google has a new model for time-series forecasting using a transformer; maybe someone is interested in it.


I am currently trying that challenge, really cool seeing it here. I am new to ML/DL so I am struggling with the approach.

Did you find an approach to using time series regression for this challenge? I tried Tabular but the results are meh: it only sees the individual CDMs, not the time series they form.

The timeseriesAI repo has been helpful, but I am struggling to get the Kelvins challenge data into the right format for a DataBunch. Maybe you figured out a way to do it?

Hi, welcome to the community!

We tried different approaches, not all of them focused on DL. Our best attempt used LSTMs on only the last values of the target time series. However, the leaderboard results suggested that ML was not playing a huge role, probably because of the differences between the training and test sets.

To use timeseriesAI and DataBunches, each time series has to sit in its own row. If you go multivariate, the order of the rows matters and must be the same across the different variables.
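
As a sketch of that layout (in Python rather than R, and with hypothetical column names event_id and risk; the challenge's real columns may differ), the snippet below groups long-format records by series and builds one fixed-length row per series, left-padding short series with NaN.

```python
import math
from collections import defaultdict

# Made-up long-format records: one (event_id, risk) pair per time step.
records = [
    ("A", -7.2), ("A", -6.9), ("A", -6.5),
    ("B", -9.1), ("B", -8.8),
]

seq_len = 3  # fixed length expected by the model

# Collect each series in arrival order.
series = defaultdict(list)
for event_id, risk in records:
    series[event_id].append(risk)

# One row per series: keep the last seq_len values, left-pad with NaN.
rows = {}
for event_id, values in series.items():
    values = values[-seq_len:]
    rows[event_id] = [math.nan] * (seq_len - len(values)) + values

for event_id, row in rows.items():
    print(event_id, row)
```

For a multivariate series, build one such table per variable and iterate over the event ids in the same order each time, so that row i refers to the same series everywhere.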

I think I wrote a function to transform the input dataset from the challenge to a format for timeseriesAI. It is programmed in R though, but I can share it with you if you are interested.

Best!

Dear community,
I would love to hear from you what you currently consider “best practice” for working with time series data in fastai.
Do people stick to tabular transformations or use functionalities originally intended for text?
