I am using FastAI v0.7 for a text classifier that makes fairly accurate predictions. However, we have another data source that is numerical (6 numbers, standard-normalized using the training-set mean and std) and has been independently verified to have good predictive performance on the same task.
So, I have been investigating how to combine the text and numerical data and evaluate the resulting performance. Here are some ideas I had:
1. For each example, concatenate the 6 numbers with the concat-pooling output in the PoolingLinearClassifier, so the concat-pooling output would look like this:
`[output[-1], maxpool, avgpool, the_6_numbers]`
The rationale for concatenating here was that I didn't want the numerical data itself to get pooled. I tried this, and the performance dropped significantly. I am not sure why; any possible explanation would be appreciated.
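For clarity, here is a minimal PyTorch sketch of what I mean by this first idea. The class name, shapes, and sizes are hypothetical, not fastai v0.7 internals; it just mimics concat pooling and appends the raw numeric features before the final linear layer:

```python
import torch
import torch.nn as nn

class ConcatPoolWithNumeric(nn.Module):
    # Hypothetical head: concat pooling over the RNN hidden states,
    # with the raw numeric features appended before the linear layer.
    def __init__(self, emb_dim, n_numeric, n_classes):
        super().__init__()
        # concat pooling yields 3 * emb_dim features; numeric ones are appended
        self.lin = nn.Linear(3 * emb_dim + n_numeric, n_classes)

    def forward(self, hidden, numeric):
        # hidden: (seq_len, batch, emb_dim); numeric: (batch, n_numeric)
        avg_pool = hidden.mean(dim=0)
        max_pool = hidden.max(dim=0)[0]
        last = hidden[-1]
        x = torch.cat([last, max_pool, avg_pool, numeric], dim=1)
        return self.lin(x)
```

One thing I wonder about is scale: the three pooled vectors have very different magnitudes from the normalized numbers, which could be part of the problem.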
2. Put a LinearBlock on top of the 6 numerical inputs, maybe of size 50 or 100, and then concatenate the output of this block with the concat-pooling output as in 1.
I had tried a similar approach with a CNN-based classifier and it worked great, but I have yet to investigate this experimentally here.
EDIT 1: I tried this with ULMFiT, but I get results similar to 1. The validation loss after 1 epoch was 8749143523.580736, while the training loss hovered in the range 0.4-0.6. I can't figure out what's wrong.
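For idea 2, the numeric branch I have in mind looks roughly like this (a sketch with hypothetical sizes and dropout; the batchnorm/dropout/linear ordering follows the usual LinearBlock pattern, but I'm not reproducing fastai's exact implementation):

```python
import torch
import torch.nn as nn

class NumericBlock(nn.Module):
    # Hypothetical block: project the 6 numbers into a wider space
    # (size 50 here) before concatenating with the pooled text features.
    def __init__(self, n_in=6, n_out=50, p=0.1):
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm1d(n_in),
            nn.Dropout(p),
            nn.Linear(n_in, n_out),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)
```

The hope is that learning a projection first lets the network rescale the numeric features to be compatible with the pooled text features, instead of concatenating them raw as in 1.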
3. Train 2 independent classifiers, one on the text data and one on the numerical data (both have shown good predictive performance on their own), and then add a 2-layer classifier on top of the outputs of those two.
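And for idea 3, a rough sketch of the two-stream setup (class and parameter names are my own; `text_clf` and `num_clf` stand in for the two pretrained classifiers, which in practice I'd probably freeze while training the head):

```python
import torch
import torch.nn as nn

class TwoStreamEnsemble(nn.Module):
    # Hypothetical sketch: feed the outputs (e.g. logits) of the two
    # independently trained classifiers into a small 2-layer head.
    def __init__(self, text_clf, num_clf, n_classes, hidden=32):
        super().__init__()
        self.text_clf = text_clf
        self.num_clf = num_clf
        self.head = nn.Sequential(
            nn.Linear(2 * n_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_x, num_x):
        # concatenate both classifiers' outputs along the feature dim
        combined = torch.cat([self.text_clf(text_x), self.num_clf(num_x)], dim=1)
        return self.head(combined)
```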
I would appreciate any feedback and comments on the best way to go about this.
EDIT 2: It turns out I was normalizing the validation data incorrectly, which made its distribution very different from the training data's. That caused the erroneous training I mentioned under 1 and 2, so I'd consider those issues closed. However, I would still welcome feedback on which of methods 1, 2, or 3 would be best.
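For anyone hitting the same bug: the fix is to compute the mean and std on the training set only, and apply those same statistics to the validation set. A minimal sketch with dummy data:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(5.0, 3.0, size=(100, 6))  # dummy training features
valid = rng.normal(5.0, 3.0, size=(20, 6))   # dummy validation features

# Compute statistics on the TRAINING set only...
mean, std = train.mean(axis=0), train.std(axis=0)

# ...and apply those same statistics to both splits.
train_norm = (train - mean) / std
valid_norm = (valid - mean) / std  # NOT valid's own mean/std
```

Normalizing the validation set with its own statistics (or with stale ones) puts it in a different feature space than the one the model was trained in, which explains the huge validation losses above.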