L1 homework: Price prediction using images a.k.a "Forex Cats and Dogs"


(Alex Ragalie) #1

Hello everyone,

I wanted to share a little experiment I ran as homework after Lesson 1: could the current best-in-class image recognition algorithms be used to predict “winning” and “losing” price chart configurations instead of cats and dogs?

To try and answer the question, I’ve gone ahead and built an entire mini-pipeline (excluding the trading engine part) in Python:

  • it gets FX price data (I’m using Oanda.com)
  • then it plots it in specific “viewPorts” (60 minutes of price data shown as a line, in 1-minute increments)
  • it saves each plot as a .png file, names it according to its class (buy, sell or hold) and places it in the matching class folder inside either the “train” or the “valid” folder
  • it also follows Rachel’s excellent advice on how to set up train and valid datasets properly (only the latest price data is used in the valid set)
  • finally, the data is run through the Lesson 1 image recognition model
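
For readers who want the gist without opening the notebooks, here is a minimal sketch of the windowing, labelling and train/valid foldering steps. It is not the code from the notebooks. The labelling rule (price change over a `horizon` of 10 minutes against a `threshold`), the `valid_fraction` and the file naming are all assumptions made for illustration. Rendering each window to a .png is left out.

```python
from pathlib import Path

WINDOW = 60  # minutes of 1-minute prices per chart, as described above


def label_for(current, future, threshold=0.0005):
    """Hypothetical labelling rule (an assumption, not from the post):
    compare a later price with the last price shown in the window."""
    change = (future - current) / current
    if change > threshold:
        return "buy"
    if change < -threshold:
        return "sell"
    return "hold"


def build_samples(prices, horizon=10, valid_fraction=0.1, root="data"):
    """Return (target_path, window) pairs.

    Windows are ordered by time, and only the newest ones are assigned
    to 'valid', so validation never uses data older than training data.
    """
    n_windows = len(prices) - WINDOW - horizon + 1
    if n_windows <= 0:
        return []
    n_valid = max(1, int(n_windows * valid_fraction))
    first_valid = n_windows - n_valid
    samples = []
    for i in range(n_windows):
        window = prices[i:i + WINDOW]
        # Price `horizon` minutes after the last point in the window.
        future = prices[i + WINDOW - 1 + horizon]
        label = label_for(window[-1], future)
        split = "train" if i < first_valid else "valid"
        path = Path(root) / split / label / f"{label}_{i:06d}.png"
        samples.append((path, window))
    return samples
```

Note that with a stride of one minute, the last train windows overlap heavily with the first valid windows, so leaving a gap of at least `WINDOW + horizon` minutes between the two sets may be worth considering.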

All the code can be found on my GitHub, presented in Jupyter Notebooks with explanations for each step and code section.

CONCLUSION

At this stage the accuracy is random at best (~50%), which is to be expected given the specifics of my dataset (granular FX data with little denoising work done). I knew this from the beginning, but it was a useful exercise nonetheless.

Appreciate any feedback, and I hope some of my code can be a useful starting point for anyone else interested in the topic.


(Jeremy Howard) #2

That’s a fun project! How many images did you have?


(Alex Ragalie) #3

Hehe, glad you like the idea Jeremy.

At its largest, the dataset had ~16,000 images in each category for train (e.g. in data/train/buy) and ~2,000 images for valid (as these had to be fairly recent). The best I got was ~60% accuracy, but the training and valid losses are still significant, so nothing to write home about.

I need to tweak things much more going forward on this project, as I also see significant improvements depending on which indicators I plot on the charts… right now I have only 3 moving averages, and these already make a noticeable difference in de-noising the price data and therefore increase accuracy.
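
As an illustration of the kind of indicator meant here, a simple moving average can be computed as below. This is a minimal sketch: the post does not say which type of moving average or which three periods were plotted, so the periods in the usage line are purely hypothetical.

```python
def simple_moving_average(prices, period):
    """Return one value per price; positions without a full window are None."""
    if period <= 0:
        raise ValueError("period must be positive")
    averages = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= period:
            # Drop the price that just left the window.
            running -= prices[i - period]
        averages.append(running / period if i >= period - 1 else None)
    return averages


# Hypothetical periods, chosen only for illustration.
closes = [1.10, 1.11, 1.12, 1.11, 1.13, 1.14, 1.12, 1.15, 1.16, 1.17]
overlays = {period: simple_moving_average(closes, period) for period in (2, 3, 5)}
```

Each list in `overlays` lines up with `closes` index by index, so it can be drawn on the same axes as the price line.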

I also need to try out different architectures (I have only used the one in Lesson 1 so far); perhaps the baseline result will improve. Nonetheless, for a relatively “fast” implementation on top of your existing code, I feel it was a good start :)

Please also let me know if you have any suggestions for things I can try next on this topic, I would greatly appreciate it!


(Jeremy Howard) #4

Well… I guess ImageNet pretraining won’t help, and data augmentation won’t help either. It may help to use an architecture that doesn’t need much data, like DenseNet.


(Alex Ragalie) #5

Highly appreciate the reco, will give it a try next.


#6

Definitely not a picture that would be found in a usual “ImageNet”-like environment =) Though it could be interesting to see whether a CNN could give insights comparable with RNNs in this case, i.e. to extend the concept of an image beyond things usually treated as images, like applying deep learning models to urban traffic network snapshots or traffic flow matrices.


(Alex Ragalie) #7

I guess the biggest difference in my dataset is the degree of randomness in the pictures. Market data by its very nature, especially at small time intervals, tends to be mostly “noise”, and is therefore extremely hard to “learn” from via ML. This is the “Holy Grail” of quants and algo-trading funds all over the world, so I know what I’m trying out is a very challenging topic… it’s fun nonetheless to look at it from different points of view…

I can see your point though that some of the lessons (if any come out) can be applied in other fields where the degree of noise and/or randomness is high… not sure traffic data is in that category though, but I’m most definitely not an expert…


(Will) #8

I’ve had a similar thought about using charts as pictures to feed into an image classifier.

Don’t be too discouraged by benchmarks like cats vs dogs with 99% accuracy. 60% accuracy in investing can still be an incredibly profitable trading strategy. I would spend some time computing returns given an up signal versus down signal (making sure to reduce your returns to account for transaction costs) before writing off your project.
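
A rough sketch of that check, assuming you already have the model's signal for each chart and the market's fractional move over the holding period that follows it. The function name, the flat per-trade `cost` and the sample numbers are illustrative assumptions, not values from this thread.

```python
def mean_net_return_by_signal(signals, returns, cost=0.0001):
    """Average return per trade for each signal, net of an assumed cost.

    signals: 'buy', 'sell' or 'hold' for each chart
    returns: fractional market move over the holding period after each chart
    cost:    assumed round-trip transaction cost per trade, as a fraction
    """
    totals = {}
    counts = {}
    for signal, market_return in zip(signals, returns):
        if signal == "hold":
            continue  # no trade, so no return and no cost
        # A 'sell' (short) profits when the market falls.
        gross = market_return if signal == "buy" else -market_return
        totals[signal] = totals.get(signal, 0.0) + gross - cost
        counts[signal] = counts.get(signal, 0) + 1
    return {signal: totals[signal] / counts[signal] for signal in totals}


# Made-up numbers, only to show the call.
example = mean_net_return_by_signal(
    ["buy", "buy", "sell", "hold"],
    [0.002, -0.001, -0.003, 0.01],
    cost=0.0005,
)
```

If the net averages stay positive on the valid set, the signal may be worth more attention than the raw accuracy figure suggests; if costs wipe them out, accuracy alone was misleading.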