There’s no way this is a peer-reviewed paper; it looks like a project from Stanford’s CS229, an undergrad intro-to-ML class, since the only other link I can find is http://cs229.stanford.edu/proj2014/Isaac%20Madan,%20Shaurya%20Saluja,%20Aojia%20Zhao,Automated%20Bitcoin%20Trading%20via%20Machine%20Learning%20Algorithms.pdf. Given that, we should be careful about judging it too harshly, but it’s worth taking this all with a big grain of salt.
98.7% accuracy does seem insanely good, good enough that we should be very suspicious. I work in this area, and virtually every impressive-looking ML-for-stock-prediction blog/paper/etc. commits one or more cardinal sins, like standardizing using the entire dataset (information leakage via mean/sd calculations that include the test set), improper data alignment (using data from now to predict now, instead of now + 1), or blatantly using the future to predict the past (e.g. applying k-fold CV or reporting OOB error from RFs). It’s hard to say anything concrete about if/where this paper misses the mark, because no code is provided.
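To make the first two sins concrete, here’s a minimal sketch on made-up data (the series and the 80/20 split point are just placeholders) of doing both steps correctly: aligning the label to the *next* interval, and computing scaling statistics on the training slice only.

```python
import numpy as np
import pandas as pd

# Hypothetical price series, purely for illustration.
rng = np.random.default_rng(0)
prices = pd.Series(rng.normal(0, 1, 500).cumsum() + 100)
returns = prices.diff()

# Sin 1: alignment. The label must be the NEXT interval's move,
# so shift it back one step relative to the features.
X = pd.DataFrame({"ret": returns})
y = (returns.shift(-1) > 0).astype(int)  # predict sign at t+1, not t

# Drop rows made NaN by the diff/shift.
mask = X["ret"].notna() & returns.shift(-1).notna()
X, y = X[mask], y[mask]

# Sin 2: leakage. Split chronologically FIRST, then fit mean/sd
# on the training slice only and reuse them for the test slice.
split = int(len(X) * 0.8)
X_tr, X_te = X.iloc[:split], X.iloc[split:]
mu, sd = X_tr.mean(), X_tr.std()
X_tr_z = (X_tr - mu) / sd
X_te_z = (X_te - mu) / sd  # test set scaled with TRAIN statistics
```

Standardizing before splitting looks harmless, but the test set’s mean/sd leak into the training features, which is exactly the kind of thing that inflates reported accuracy.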
Another concern here is the lack of confusion matrices and model-free baselines; what performance do you get by guessing “positive” every time, or by guessing the same sign as the last interval? I also wonder what’s going on in the graphs where they show a continuous response despite only doing binary classification.
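Those two baselines take about five lines to compute; a sketch on a fabricated label sequence (the labels here are random, just to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical up/down labels (1 = price rose this interval).
y = (rng.normal(0.02, 1, 1000) > 0).astype(int)

# Baseline 1: always guess "up".
acc_always_up = (y == 1).mean()

# Baseline 2: persistence -- guess the same sign as the last interval.
acc_persist = (y[1:] == y[:-1]).mean()

print(f"always-up: {acc_always_up:.3f}, persistence: {acc_persist:.3f}")
```

Any claimed 98.7% should be compared against numbers like these, plus a confusion matrix, before anyone gets excited.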
The approach itself also seems quite shallow, which makes the results even more suspicious. They only incorporate time in the train/test splitting and in taking first differences; there’s no additional feature engineering like longer lags or moving averages, no time-derived features, etc.
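The kind of feature engineering I mean is cheap in pandas. A sketch, assuming daily data and arbitrary lag/window choices (the specific lags and the 12-period window are illustrative, not from the paper):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Hypothetical daily price series.
df = pd.DataFrame(
    {"price": rng.normal(0, 1, 300).cumsum() + 100},
    index=pd.date_range("2014-01-01", periods=300, freq="D"),
)

# First differences (roughly what the paper uses)...
df["ret"] = df["price"].diff()

# ...plus longer lags of the return...
for k in (1, 2, 3, 6):
    df[f"ret_lag{k}"] = df["ret"].shift(k)

# ...trailing-window rolling statistics (past data only)...
df["ma_12"] = df["ret"].rolling(12).mean()
df["vol_12"] = df["ret"].rolling(12).std()

# ...and calendar features pulled from the timestamp.
df["dayofweek"] = df.index.dayofweek
df["month"] = df.index.month
```

Note everything is built from `shift` and trailing `rolling` windows, so no row ever sees future data.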
Applying the approach from the Rossmann notebook (lesson 3) here might be interesting: use the same features, toss in some lags, moving averages, and timestamp features, and see what happens. Last I looked you’d need to hack the StructuredLearner class a bit to support binary outcomes, but someone on this forum has posted an example of that, and I imagine some of that code has improved over the last few months. You could also just treat it as regression and collapse to binary to compare against the paper.
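The regression-then-collapse idea is the easiest to try; a sketch with a plain sklearn random forest on fake features (this stands in for whatever feature matrix you’d actually build, and sidesteps the StructuredLearner hack entirely):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
# Hypothetical feature matrix and NEXT-interval return target.
X = rng.normal(size=(500, 8))
y_ret = rng.normal(size=500)

# Chronological split -- no k-fold, no shuffling.
split = 400
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X[:split], y_ret[:split])

# Regress the return, then collapse to an up/down call so the
# number is comparable to the paper's classification accuracy.
pred_up = rf.predict(X[split:]) > 0
actual_up = y_ret[split:] > 0
print(f"directional accuracy: {(pred_up == actual_up).mean():.3f}")
```

On random features like these you’d expect roughly coin-flip accuracy, which is itself a useful sanity check for the pipeline.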