98.7% accuracy on Bitcoin price prediction?


(Martin) #1

Automated Bitcoin Trading via Machine Learning Algorithms:

It seems unbelievable to me that they could predict whether the daily close of Bitcoin would be higher or lower than the previous day’s with 98.7% accuracy!

Is this paper peer-reviewed?

Was (or is) this kind of accuracy actually possible?


(Luke Byrne) #2

The quote further down in the blurb is more telling; there they try to predict the price at 10-minute and 10-second intervals:

These results had 50-55% accuracy in predicting the sign of future price change using 10 minute time intervals.

Not much better than a coin toss.


(Martin) #3

55% accuracy on 10-minute intervals is really good! But they used different data for that. Still, I am wondering about the daily predictions.


(Luke Byrne) #4

I have done some ML/DL work on horse racing in the past; the only way to know for sure is to put some money down!


(Clayton Yochum) #5

There’s no way this is a peer-reviewed paper; it looks like a project from Stanford’s CS229, an undergrad intro-to-ML class, since the only other link I can find is http://cs229.stanford.edu/proj2014/Isaac%20Madan,%20Shaurya%20Saluja,%20Aojia%20Zhao,Automated%20Bitcoin%20Trading%20via%20Machine%20Learning%20Algorithms.pdf. Given that, we should be careful about judging it too harshly, but it’s worth taking all of this with a big grain of salt.

98.7% accuracy does seem insanely good, good enough that we should be very suspicious. I work in this area, and virtually every impressive-looking ML-for-stock-prediction blog post/paper/etc. commits one or more cardinal sins: standardizing over the entire dataset (information leakage, since the mean/sd calculations include the test set), improper data alignment (using data from now to predict now, instead of now + 1), or blatantly using the future to predict the past (e.g. applying k-fold CV or reporting OOB error from RFs). It’s hard to say anything concrete about if/where this paper misses the mark because no code is provided.
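
For anyone who wants to sanity-check their own setup, here’s a minimal sketch (synthetic data, made-up column names) of the leak-free version of the first two sins, alignment and scaling:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a daily Bitcoin feature frame; column names are made up.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "close": 100 + rng.standard_normal(500).cumsum(),
    "volume": rng.lognormal(10, 1, 500),
})

# Proper alignment: the label for row t is whether the *next* close is higher,
# so features at time t never peek at the value they are predicting.
df["target"] = (df["close"].shift(-1) > df["close"]).astype(int)
df = df.iloc[:-1]  # the last row has no "tomorrow" to label

# Chronological split -- no shuffling, no k-fold CV on a time series.
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]
cols = ["close", "volume"]

# LEAKY: mean/sd computed over the full dataset include the test period.
# leaky_scaled = (df[cols] - df[cols].mean()) / df[cols].std()

# CORRECT: fit the scaling on the training slice only, then apply to both.
mu, sd = train[cols].mean(), train[cols].std()
X_train = (train[cols] - mu) / sd
X_test = (test[cols] - mu) / sd
```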

Another concern here is the lack of confusion matrices and model-free baselines; what performance do you get by guessing “positive” every time, or by guessing the same sign as the last interval? I also wonder what’s going on in the graphs where they show a continuous response despite only doing binary classification.
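
A quick sketch of what those two baselines look like, on hypothetical up/down labels:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels (1 = close rose vs. the previous interval);
# with real data you would load the paper's daily series instead.
rng = np.random.default_rng(1)
y = (rng.standard_normal(1000) > 0).astype(int)

# Baseline 1: always guess the majority class.
majority = np.full_like(y, np.bincount(y).argmax())
print("always-majority accuracy:", accuracy_score(y, majority))

# Baseline 2: persistence -- guess the same sign as the last interval.
pred = y[:-1]  # the prediction for step t is the observed label at t-1
print("persistence accuracy:", accuracy_score(y[1:], pred))
print(confusion_matrix(y[1:], pred))
```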

The approach itself also seems quite shallow, which makes the results even more suspicious. Time enters only through the train/test split and the first-differencing; there is no additional feature engineering like longer lags or moving averages, no time-related features, etc.
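
For concreteness, something like this (synthetic series, assumed column names) is the minimal sort of feature engineering I’d expect before believing the numbers:

```python
import numpy as np
import pandas as pd

# Hypothetical daily close series indexed by date.
rng = np.random.default_rng(2)
idx = pd.date_range("2014-01-01", periods=400, freq="D")
df = pd.DataFrame({"close": 100 + rng.standard_normal(400).cumsum()}, index=idx)

# Lagged returns and moving averages, built strictly from past values.
df["ret_1d"] = df["close"].pct_change()
for lag in (1, 2, 3, 7):
    df[f"ret_lag_{lag}"] = df["ret_1d"].shift(lag)
for window in (7, 30):
    # shift(1) keeps today's close out of the window used to predict tomorrow
    df[f"ma_{window}"] = df["close"].rolling(window).mean().shift(1)

# Simple calendar features, in the spirit of the Rossmann notebook.
df["day_of_week"] = df.index.dayofweek
df["month"] = df.index.month

df = df.dropna()
```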

Applying the approach from the Rossmann notebook (lesson 3) here might be interesting: use the same features, toss in some lags, moving averages, and timestamp features, and see what happens. Last I looked you’d need to hack the StructuredLearner class a bit to support binary outcomes, but someone on this forum has posted an example of that, and I imagine some of that code has improved over the last few months. You could also just treat it as regression and collapse the output to binary to compare with the paper.
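
If you want to skip the fastai plumbing entirely, a bare-bones sketch of that regression-then-threshold comparison could look like this (random stand-in data, sklearn in place of StructuredLearner):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# "Regress, then collapse to binary": fit on next-day *returns*, then score
# sign accuracy against the paper's classification numbers. X and ret are
# stand-ins; in practice they'd come from a chronological split of
# engineered features like the ones above.
rng = np.random.default_rng(3)
X = rng.standard_normal((500, 5))
ret = rng.standard_normal(500) * 0.01

X_train, X_test = X[:400], X[400:]
ret_train, ret_test = ret[:400], ret[400:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, ret_train)

pred = model.predict(X_test)
sign_acc = np.mean((pred > 0) == (ret_test > 0))
print("sign accuracy from regression:", sign_acc)
```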


(Martin) #6

@claytonjy I was also thinking that some information leakage could be happening there. I will train a network on the same data and see where it goes, but only after another AI project of mine is finished. I will write an update later for those who are interested, if they can wait 🙂


(Clayton Yochum) #7

Please do! Curious to see what you find. I’d also suggest looking at naive model-free benchmarks like “always predict the more common class” or “always predict the last class”. I suspect the latter in particular might do awfully well depending on the period of time covered. Good luck!