I recently joined the XTX forecasting challenge and have been experimenting with different DL architectures. The problem I’m facing is that my Tabular Learner model keeps timing out on the test server (15-hour limit).
I tested inference on my own CPU and it seems to run quite quickly on a single row of data, so I’m guessing the test dataset is very large.
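A rough way to turn that single-row timing into a budget check is sketched below. It only uses the standard library; `predict_fn`, `seconds_per_call` and `max_rows_within` are hypothetical names for illustration, and the stand-in predictor is not the real model. The size of the test set is not known, so the sketch only reports how many rows would fit inside the 15-hour limit at the measured speed.

```python
import time

def seconds_per_call(predict_fn, row, n_calls=100):
    """Average wall-clock seconds for one call of predict_fn on a single row."""
    start = time.perf_counter()
    for _ in range(n_calls):
        predict_fn(row)
    return (time.perf_counter() - start) / n_calls

def max_rows_within(limit_hours, sec_per_row):
    """How many rows can be scored one at a time inside the time limit."""
    return int(limit_hours * 3600 / sec_per_row)

# Stand-in predictor (assumption): replace with the real single-row predict call.
def dummy_predict(row):
    return sum(row) / len(row)

per_row = seconds_per_call(dummy_predict, [0.0] * 60)
print(f"{per_row:.6f} s per row, about {max_rows_within(15, per_row)} rows fit in 15 hours")
```

If the number of rows that fit is smaller than the expected test set, the per-row cost is the thing to reduce, whatever its source.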
One thing I’ve tried is reducing the hidden layer sizes from [2000,1000] to [250,125], but it’s still too slow for the competition.
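For a rough sense of how much that change shrinks the arithmetic, here is a small sketch that counts multiply-accumulate operations per row for a plain fully connected stack. The input width of 60 features and the single output are assumptions for illustration, and embeddings, batch norm and biases are ignored.

```python
def dense_macs(n_in, hidden, n_out=1):
    """Multiply-accumulates for one forward pass through fully connected layers."""
    sizes = [n_in, *hidden, n_out]
    # Each layer costs (inputs x outputs) multiply-accumulates per row.
    return sum(a * b for a, b in zip(sizes, sizes[1:]))

N_FEATURES = 60  # assumption: the feature count is not stated in the post

big = dense_macs(N_FEATURES, [2000, 1000])
small = dense_macs(N_FEATURES, [250, 125])
print(big, small, round(big / small, 1))
```

Under these assumptions the smaller network does roughly 45 times less arithmetic per row. If the run time does not drop by anything like that factor, the matrix multiplies are probably not the bottleneck, and per-call overhead (for example preprocessing each row before it reaches the model) may be worth measuring separately.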
Are there better ways to simplify the architecture so that inference is faster?