Time series / sequential data study group

I have read that post before. The GitHub repository together with the Medium blog looks very suspicious, especially after reading his resume. The code snippets he shows are all snippets you can copy online: one part uses MXNet, the next part uses Keras.

Chances are it is either a really sophisticated system or just a scam.

Thank you, Ignacio.

There is indeed something you can help with.

But let me briefly summarize the most recent results:

Meanwhile, I acquired a complete dataset of the S&P 500 covering 92 years and did a lot of feature engineering today. I was actually stunned by the feature ranking, since many well-known stock indicators (MACD, ADX, etc.) are practically useless because they correlate only about 50-60% with the closing price, while the technical indicators that correlate the most are pretty obscure combinations rarely used in practice. The correlation heatmap of the final feature set indicates a promising start for training the model.
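
In case anyone wants to reproduce that kind of check, here is a rough sketch of how such a ranking and heatmap can be computed with pandas/matplotlib; the DataFrame and column names are placeholders, not my actual feature set:

import pandas as pd
import matplotlib.pyplot as plt

def rank_features_by_close_correlation(df, target="Close"):
    "Rank all feature columns by absolute correlation with the closing price."
    corr = df.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)

def plot_corr_heatmap(df):
    "Correlation heatmap of the full feature set."
    corr = df.corr()
    plt.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
    plt.xticks(range(len(corr)), corr.columns, rotation=90)
    plt.yticks(range(len(corr)), corr.columns)
    plt.colorbar()
    plt.show()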

In case you are interested, you can run your algorithms over the dataset. I don’t mind sharing the data & feature set through email or DM, but for obvious reasons, I cannot share a download link in a public forum.

In case you really want to dig deeper into financial forecasting, you can start here: Financial forecasting with probabilistic programming and Pyro

Probabilistic programming with Pyro is perhaps the most underrated trend I know at the moment, and combining your image-net with a Pyro Bayesian neural network might be a first of its kind and could lead to an entirely new category of time-series prediction approaches.
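
For anyone curious what the Pyro side could look like, here is a minimal, untested sketch of a small Bayesian regression network trained with SVI. It only illustrates the general idea; the layer sizes, priors, and names below are arbitrary assumptions, not the model from the article:

import torch
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDiagonalNormal
from pyro.optim import Adam

class BayesianRegressor(PyroModule):
    "Tiny Bayesian NN: Normal priors over all weights, Gaussian likelihood over the target."
    def __init__(self, n_features, n_hidden=32):
        super().__init__()
        self.fc1 = PyroModule[torch.nn.Linear](n_features, n_hidden)
        self.fc1.weight = PyroSample(dist.Normal(0., 1.).expand([n_hidden, n_features]).to_event(2))
        self.fc1.bias = PyroSample(dist.Normal(0., 1.).expand([n_hidden]).to_event(1))
        self.fc2 = PyroModule[torch.nn.Linear](n_hidden, 1)
        self.fc2.weight = PyroSample(dist.Normal(0., 1.).expand([1, n_hidden]).to_event(2))
        self.fc2.bias = PyroSample(dist.Normal(0., 1.).expand([1]).to_event(1))

    def forward(self, x, y=None):
        mean = self.fc2(torch.relu(self.fc1(x))).squeeze(-1)
        sigma = pyro.sample("sigma", dist.Uniform(0., 1.))   # observation noise
        with pyro.plate("data", x.shape[0]):
            pyro.sample("obs", dist.Normal(mean, sigma), obs=y)
        return mean

model = BayesianRegressor(n_features=10)
guide = AutoDiagonalNormal(model)
svi = SVI(model, guide, Adam({"lr": 1e-3}), loss=Trace_ELBO())
# one optimization step per mini-batch: svi.step(x_batch, y_batch)

The payoff is that you get a predictive distribution (and hence uncertainty estimates) instead of a single point forecast.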

I am super interested to see how your image-net compares to the transfer learning I am working on and how both stack up against a Bayesian hybrid CNN.

Would you be interested?
Marvin

S&P500 Feature Ranking

Correlation-Matrix-Heatmap


Thanks, nok, for looking into that guy.

The entire approach is just not working, no matter what you call it.

I spent some time rebuilding the data, features, and some of the code, but ultimately most of the features had no correlation, and the code I could get working delivered noticeably different plots… I assume that one was a dead end and got dumped. Total waste of time.

Hi Marvin, very interesting stuff you are working on; I wish I could participate. I had a thought some time ago about using triggers when some of the features come together, like RSI, MACD, and moving-average conditions, and backtesting that. But after some testing I found out this is an imbalanced-class problem, as these conditions do not occur very often, so I stopped there after bad results with LSTMs. I had the same imbalanced-class issue with divergence conditions, which also looked promising at first sight. I would want to test with tick data, but it’s very hard to find tick data.
So I am very interested and will monitor this thread for new insights.

I would be also very interested in the pytorch + pyro combination.

This sounds very similar to the approach outlined in this paper:


(However, so far I haven’t found a Pyro implementation of it, and I’m still working on my Pyro skills.)

I’m happy if you can recommend stuff in that direction. :slight_smile:

@MicPie
I don’t know more than you do, but here is a starting point:

Financial forecasting with probabilistic programming

https://medium.com/@alexrachnog/financial-forecasting-with-probabilistic-programming-and-pyro-db68ab1a1dba


@gevezex

Yes, you can participate soon. I’m preparing to share data & code within the next few days. @oguiza suggested launching a competition to predict the S&P 500, and I think that’s the way to go. I am preparing the release of my related work as a baseline to get things started.

You can get 1-minute tick data for free from AWS:

https://registry.opendata.aws/deutsche-boerse-pds/

And a sample prediction system done in TensorFlow:


I have been following this thread silently, but I am very interested in TS forecasting as well.
I have tried XGBoost and the tabular model to forecast time series, but with pretty poor results, so the LSTM approach looks interesting.
My data is composed of various TS (power consumption, meteorological data, calendar data), and I need to forecast the power consumption for the next hour using the other TSs and the previous power data. I can update my model in real time with the actual data every hour.

The super meta-model shown predicting Goldman Sachs looks very suspicious, but not completely wrong. Was anyone able to reproduce the results?
Any advice appreciated.

Hi Thomas,
It’s great that you joined this group.
It’s a bit difficult to give any hints without seeing the data, but if the series are equally spaced (1 h), I think I’d start with either a ResNet or an FCN model on the raw data. These models work pretty well on multivariate time series data. You may want to take a look at this link, where a review article is summarized.

Maybe I am not getting something, but aren’t ResNets and CNNs for classification purposes?
I have equally spaced data (5min) for each TS.

Not necessarily. MLPs, RNNs and CNNs can all be used for forecasting problems.
The ResNet model I’m talking about has a ResNet-like structure, but adapted to time series (this is described in the link I shared before). Instead of having 2D convolutions, it has 1D convolutions. The rest is similar to a traditional ResNet model.
This TS ResNet model takes a 2D array as input (not an image), but the filters are only convolved in one dimension, along the time axis.
In this case, instead of predicting classes, what we want is to predict an amount. So the output of the last fully connected layer will be a single float instead of multiple floats (one per class), i.e. the number of classes should be set to 1. The other thing you need to do is use an adequate loss (for example, MSELoss) and the metrics you want to optimize.
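
As a very rough sketch of that last point, a minimal 1D conv backbone with a regression head could look like this (not the full TS ResNet/FCN, just the idea of a single-float output plus MSELoss):

import torch.nn as nn

n_features = 2   # e.g. TS and TOUT, with a sliding window along the time axis

model = nn.Sequential(
    nn.Conv1d(n_features, 64, kernel_size=3, padding=1), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.BatchNorm1d(128), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(128, 1),       # a single float instead of one output per class
)
loss_func = nn.MSELoss()     # regression loss instead of cross-entropy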

My data looks like this:


I am trying to forecast P as a function of TS, TOUT, and the date. I am currently working with the tabular model and XGBoost, but it is not working that well.
Also, my model will act as a controller, so I want to use it online and update it with the real values of P that I will be getting.

I get your idea of a 1D ResNet, but how would you set up the data to train such a model?
Would you cut the TS into smaller pieces, for instance into n+1 values, where you use the n preceding values to predict the (n+1)-th? How would you incorporate datetime data (categorical)? Would you join a tabular model to the output of the feature map of the ResNet?


One option would be to split the data first between train and test (or train, val, and test), so that there is no overlap between them.
This example just includes 1 val and 1 test fold, but you could have more using a walk-forward approach.

Now, within each of the datasets, you will need to use a sliding window to create X_train, y_train, etc. as you describe. You would have 2 features (TS and TOUT). So if your sliding window is of size 20, you would create a 3D array of size (n_samples, 2, 20). In this way, none of the train samples would overlap with the val or test samples.

As to time, it really depends on whether you think it has some predictive power or not, and how long your sliding window is. You could extract features from time like minute, hour, day/night, day of week, month, etc. if you think those could have an impact on the prediction. You would treat those time features like the other 2 features. If you extract 1 time feature, you would then feed an array of shape (n_samples, 3, 20) into the model.
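
A rough numpy sketch of that sliding window (untested; the column names and the hour feature are just an example):

import numpy as np

def make_windows(df, feature_cols, target_col, window=20, horizon=1):
    "Build X of shape (n_samples, n_features, window) and y with the value `horizon` steps after each window."
    values = df[feature_cols].values          # (n_steps, n_features)
    target = df[target_col].values
    X, y = [], []
    for start in range(len(df) - window - horizon + 1):
        X.append(values[start:start + window].T)      # -> (n_features, window)
        y.append(target[start + window + horizon - 1])
    return np.stack(X), np.array(y)

# e.g. add the hour of day as a 3rd channel next to TS and TOUT (assumes a DatetimeIndex):
# df = df.assign(hour=df.index.hour / 23.0)
# X_train, y_train = make_windows(df_train, ["TS", "TOUT", "hour"], "P", window=20)
# X_train.shape -> (n_samples, 3, 20)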

I mentioned ResNet and FCN because they’ve been shown to be effective multivariate time series models, and they may give you a good starting point. But of course you may also use many other models: non-DL, LSTM, GRU, etc., or hybrid models.


Do you have an implementation of this? An example?

Time series walk-forward validation and sliding window

You can find here an example of how to create one or multiple walk-forward folds, as well as how to use a sliding window approach to create the train, val (and test) datasets from a single array.
Bear in mind you don’t necessarily need to use multiple folds in walk-forward validation. You can just select a single fold (with train and val, and test if you wish), and then create the 3D array using the sliding window.
I haven’t tested this code very thoroughly, so please use it with caution.


Thanks, I was mostly asking for a model implementation; I had created something similar to build my dataset:

import torch

def sliding_window(data, window=20, step=5):
    "Creates a new Tensor of windowed data every `step` samples"
    # number of windows that still leave one point after them to use as the target
    num_pieces = (data.shape[1] - 1 - window) // step
    X, y = [], []
    for j in range(num_pieces):
        X.append(data[:, step*j:step*j + window])   # all rows (features) over the window
        y.append(data[-1, step*j + window])         # value of the last row right after the window
    return torch.stack(X), torch.stack(y)

With this, I have a dataset that is

X.shape, y.shape
>>(torch.Size([70270, 3, 24]), torch.Size([70270]))

I was trying a naive ConvNet, but it does not seem to work at all:

import torch.nn as nn
# AdaptiveConcatPool1d and Flatten are assumed to come from fastai / custom layers

basic_conv = nn.Sequential(nn.Conv1d(3, 32, 3),
                           nn.Conv1d(32, 64, kernel_size=3, stride=2, padding=1),
                           nn.Conv1d(64, 64, kernel_size=3, stride=1),
                           nn.Conv1d(64, 128, kernel_size=3, stride=2, padding=1),
                           nn.Conv1d(128, 128, kernel_size=3, stride=1),
                           AdaptiveConcatPool1d(),   # concat of adaptive max & avg pooling
                           Flatten(),
                           nn.Linear(128*2, 512),
                           nn.ReLU(),
                           nn.Linear(512, 1)
                           )

Any recommendations?

Thomas,

Thanks for sharing your thoughts. The first thing you need to do is test for a random walk. To do so, take the previous value, store it in a separate column, and plot a correlation matrix.

Two scenarios:

Low correlation: you can spin around the universe and it’s not going to get better. If the current value does not correlate with the previous (n) values, it’s all a random walk and there is not much you can do.

High correlation: Great, all you need is better features and a better model.

If the correlation is somewhere in the middle, you may get around it by making the data stationary, e.g. by computing the first and second differences (discrete derivatives). Plot a correlation matrix to see how that compares.
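
A quick pandas sketch of that check (untested; "Close" is just a placeholder column name):

import pandas as pd

df = pd.read_csv("prices.csv")                  # placeholder: any frame with a "Close" column

# lag-1 correlation: how much does the previous value tell you about the current one?
df["Close_lag1"] = df["Close"].shift(1)
print(df[["Close", "Close_lag1"]].corr())

# if it sits in the middle, difference the series and compare again
df["diff1"] = df["Close"].diff()
df["diff1_lag1"] = df["diff1"].shift(1)
print(df[["diff1", "diff1_lag1"]].corr())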

In terms of better features, you can use ta-lib to calculate a myriad of time series features widely used in finance.
Also, you need to categorify the date to capture trends.

However, the standard correlation matrix isn’t terribly useful for feature assessment, so you actually need to do a feature ranking in XGBoost.
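
And a minimal sketch of that XGBoost feature ranking (assuming you already have a feature frame X and a target y):

import pandas as pd
from xgboost import XGBRegressor

model = XGBRegressor(n_estimators=200)
model.fit(X, y)                                   # X: engineered features, y: e.g. next close

ranking = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
print(ranking.head(20))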

Hope that helps
Marvin


I have a version that works “betterish” now:

import torch.nn as nn
# bn_drop_lin, AdaptiveConcatPool1d and Flatten are fastai helper layers (assumed to be in the namespace)

class ResLayer(nn.Module):
    "ResNet-style layer with `ni` inputs."
    def __init__(self, ni:int):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv1d(ni, ni, kernel_size=3, stride=1, padding=1),
                                   nn.BatchNorm1d(ni),
                                   nn.ReLU(inplace=True)
                                  )
    def forward(self, x): return x + self.conv1(x)  # residual connection

def basic_conv(out):
    return nn.Sequential(nn.Conv1d(3, 32, 3),
                         nn.Conv1d(32, 64, kernel_size=3, stride=2, padding=1),
                         ResLayer(64),
                         nn.Conv1d(64, 128, kernel_size=3, stride=2, padding=1),
                         ResLayer(128),
                         AdaptiveConcatPool1d(),
                         Flatten(),
                         *bn_drop_lin(128*2, 256, p=0.5, actn=nn.ReLU(inplace=True)),
                         *bn_drop_lin(256, 512, p=0.3, actn=nn.ReLU(inplace=True)),
                         nn.Linear(512, out)
                         )

I created a windowed dataset to predict the next N points; as I have a sample every 5 minutes, predicting 6 points seems to be enough for the application I want (30 min).
I don’t get the correlation idea; do you want me to check whether the dep_var has a trend?
What I really need is to forecast the future of the power, as accurately as possible for the near future, using the past values of P and the past and current values of T and time data.
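
For what it’s worth, here is a small, untested tweak of my sliding-window function above that returns the next `horizon` values as the target (so `basic_conv(out)` would be built with out=horizon):

import torch

def sliding_window_multi(data, window=24, step=1, horizon=6):
    "Windows of `window` steps; targets are the next `horizon` values of the last row (the power series)."
    X, y = [], []
    last_start = data.shape[1] - window - horizon
    for start in range(0, last_start + 1, step):
        X.append(data[:, start:start + window])
        y.append(data[-1, start + window:start + window + horizon])
    return torch.stack(X), torch.stack(y)   # X: (n, n_features, window), y: (n, horizon)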


Hi, have you tried tsfresh yet? It automatically extracts features from your time series.

Here’s an example using tsfresh for forecasting: https://github.com/blue-yonder/tsfresh/blob/master/notebooks/timeseries_forecasting_google_stock.ipynb

It works with a rolling window already. My experience with tsfresh + XGBoost has been good. Or you could just extract features and then use them in a deep model.
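
In case it helps, a rough sketch of the rolling-window + feature-extraction part (untested; the toy series and window sizes are just placeholders):

import pandas as pd
from tsfresh import extract_features
from tsfresh.utilities.dataframe_functions import roll_time_series

series = pd.Series(range(100), dtype=float)        # toy data; use e.g. your closing prices here

# long format expected by tsfresh: one row per (id, time, value); a single series with id=1
df_long = pd.DataFrame({"id": 1, "time": range(len(series)), "value": series.values})

# build rolling sub-series of up to 24 past points for every time step
rolled = roll_time_series(df_long, column_id="id", column_sort="time",
                          max_timeshift=24, min_timeshift=5)

# extract hundreds of generic TS features per window; feed these to XGBoost or a deep model
X = extract_features(rolled, column_id="id", column_sort="time")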


It looks like this approach. Has anyone tried nested CV?

Time Series Nested Cross-Validation
