98.7% accuracy on Bitcoin price prediction?


(Martin) #1

Automated Bitcoin Trading via Machine Learning Algorithms:

It seems unbelievable to me that they could predict whether the daily close of Bitcoin would be higher or lower than the previous day’s with 98.7% accuracy!

Is this paper peer-reviewed?

Was (or is) this kind of accuracy actually possible?


(Luke Byrne) #2

The quote further down in the blurb is more telling; there they try to predict the price at 10-minute and 10-second intervals:

These results had 50-55% accuracy in predicting the sign of future price change using 10 minute time intervals.

Not much better than a coin toss.


(Martin) #3

55% accuracy on 10-minute intervals is really good! But they used different data for that. Still, I am wondering about the daily predictions.


(Luke Byrne) #4

I have done some ML/DL work on horse racing in the past; the only way to know for sure is to put some money down!


(Clayton Yochum) #5

There’s no way this is a peer-reviewed paper; it looks like a project from Stanford’s CS229, an undergrad intro-to-ML class, since the only other link I can find is http://cs229.stanford.edu/proj2014/Isaac%20Madan,%20Shaurya%20Saluja,%20Aojia%20Zhao,Automated%20Bitcoin%20Trading%20via%20Machine%20Learning%20Algorithms.pdf. Given that, we should be careful about judging it too harshly, but it’s worth taking all of this with a big grain of salt.

98.7% accuracy does seem insanely good, good enough that we should be very suspicious. I work in this area, and virtually every impressive-looking ML-for-stock-prediction blog post/paper/etc. commits one or more cardinal sins: standardizing over the entire dataset (information leakage, since the mean/sd calculations include the test set), improper data alignment (using data from now to predict now, instead of now + 1), or blatantly using the future to predict the past (e.g. applying k-fold CV or reporting OOB error from RFs). It’s hard to say anything concrete about if/where this paper misses the mark because no code is provided.
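
For anyone who wants to sanity-check their own setup, here’s a minimal sketch (synthetic data, made-up column names) of the leak-free version of the first two sins, alignment and scaling:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a daily Bitcoin feature frame; column names are made up.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "close": 100 + rng.standard_normal(500).cumsum(),
    "volume": rng.lognormal(10, 1, 500),
})

# Proper alignment: the label for row t is whether the *next* close is higher,
# so features at time t never peek at the value they are predicting.
df["target"] = (df["close"].shift(-1) > df["close"]).astype(int)
df = df.iloc[:-1]  # the last row has no "tomorrow" to label

# Chronological split -- no shuffling, no k-fold CV on a time series.
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]
cols = ["close", "volume"]

# LEAKY: mean/sd computed over the full dataset include the test period.
# leaky_scaled = (df[cols] - df[cols].mean()) / df[cols].std()

# CORRECT: fit the scaling on the training slice only, then apply to both.
mu, sd = train[cols].mean(), train[cols].std()
X_train = (train[cols] - mu) / sd
X_test = (test[cols] - mu) / sd
```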

Another concern here is the lack of confusion matrices and model-free baselines; what performance do you get by guessing “positive” every time, or by guessing the same sign as the last interval? I also wonder what’s going on in the graphs where they show a continuous response despite only doing binary classification.
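
A quick sketch of what those two baselines look like, on hypothetical up/down labels:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels (1 = close rose vs. the previous interval);
# with real data you would load the paper's daily series instead.
rng = np.random.default_rng(1)
y = (rng.standard_normal(1000) > 0).astype(int)

# Baseline 1: always guess the majority class.
majority = np.full_like(y, np.bincount(y).argmax())
print("always-majority accuracy:", accuracy_score(y, majority))

# Baseline 2: persistence -- guess the same sign as the last interval.
pred = y[:-1]  # the prediction for step t is the observed label at t-1
print("persistence accuracy:", accuracy_score(y[1:], pred))
print(confusion_matrix(y[1:], pred))
```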

The approach itself also seems quite shallow, which makes the results even more suspicious. Time enters only through the train/test split and the first-differencing; there is no additional feature engineering like longer lags or moving averages, no time-related features, etc.
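
For concreteness, something like this (synthetic series, assumed column names) is the minimal sort of feature engineering I’d expect before believing the numbers:

```python
import numpy as np
import pandas as pd

# Hypothetical daily close series indexed by date.
rng = np.random.default_rng(2)
idx = pd.date_range("2014-01-01", periods=400, freq="D")
df = pd.DataFrame({"close": 100 + rng.standard_normal(400).cumsum()}, index=idx)

# Lagged returns and moving averages, built strictly from past values.
df["ret_1d"] = df["close"].pct_change()
for lag in (1, 2, 3, 7):
    df[f"ret_lag_{lag}"] = df["ret_1d"].shift(lag)
for window in (7, 30):
    # shift(1) keeps today's close out of the window used to predict tomorrow
    df[f"ma_{window}"] = df["close"].rolling(window).mean().shift(1)

# Simple calendar features, in the spirit of the Rossmann notebook.
df["day_of_week"] = df.index.dayofweek
df["month"] = df.index.month

df = df.dropna()
```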

Applying the approach from the Rossmann notebook (lesson 3) here might be interesting: use the same features, toss in some lags, moving averages, and timestamp features, and see what happens. Last I looked you’d need to hack the StructuredLearner class a bit to support binary outcomes, but someone on this forum has posted an example of that, and I imagine some of that code has improved over the last few months. You could also just treat it as regression and collapse the output to binary to compare with the paper.
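
If you want to skip the fastai plumbing entirely, a bare-bones sketch of that regression-then-threshold comparison could look like this (random stand-in data, sklearn in place of StructuredLearner):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# "Regress, then collapse to binary": fit on next-day *returns*, then score
# sign accuracy against the paper's classification numbers. X and ret are
# stand-ins; in practice they'd come from a chronological split of
# engineered features like the ones above.
rng = np.random.default_rng(3)
X = rng.standard_normal((500, 5))
ret = rng.standard_normal(500) * 0.01

X_train, X_test = X[:400], X[400:]
ret_train, ret_test = ret[:400], ret[400:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, ret_train)

pred = model.predict(X_test)
sign_acc = np.mean((pred > 0) == (ret_test > 0))
print("sign accuracy from regression:", sign_acc)
```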


(Martin) #6

@claytonjy I was also thinking that some information leakage could be happening there. I will train a network on the same data and see where it goes, but only after another AI project of mine is finished. I will write an update later for those who are interested, if they can wait 🙂


(Clayton Yochum) #7

Please do! Curious to see what you find. I’d also suggest looking at naive model-free benchmarks like “always predict the more common class” or “always predict the last class”. I suspect the latter in particular might do awfully well depending on the period of time covered. Good luck!