Hello, I started with machine learning about a year ago and am writing my master's thesis about transfer learning, with the main focus being to train a model with ULMFiT and a limited amount of data. My goal is to show that ULMFiT with the fastai library can be applied easily and quickly by beginners like me, without sacrificing too much time on training and hyperparameter tuning, while still getting decent performance on different tasks.
The strange thing is that basic fine-tuning of the classifier mostly ends up with similar or even better results than the one trained with the ULMFiT method. A possible cause is that I am heavily relying on the lr_find function, since I have to train multiple BERT models to evaluate all variations of my data sets (IMDB, SST-5, and a self-labeled Counter-Strike news article set) without spending too much time on choosing the right learning rate. Each data set is split into 100/75/50/25% of its size to see how much impact the data size has on performance.
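For context, the 100/75/50/25% splits could be produced with a small helper like the one below. This is just a minimal sketch (the function name and toy data are hypothetical, not from my notebook); the key point is seeding the sampler so each fraction is reproducible across runs:

```python
import random

def subsample(dataset, fraction, seed=42):
    """Return a reproducible random subset with `fraction` of the items.

    A fixed seed means the 75% subset is the same every run, so
    performance differences come from data size, not sampling noise.
    """
    rng = random.Random(seed)
    n = max(1, round(len(dataset) * fraction))
    return rng.sample(dataset, n)

# Hypothetical toy "data set" of (text, label) pairs.
data = [(f"doc {i}", i % 2) for i in range(1000)]

for frac in (1.0, 0.75, 0.5, 0.25):
    subset = subsample(data, frac)
    print(f"{int(frac * 100):3d}% -> {len(subset)} examples")
```

Note that `random.sample` draws without replacement, so the smaller splits are not nested subsets of the larger ones; if nesting matters for the comparison, one could instead shuffle once and take prefixes.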
Here is the Colab notebook I wrote to evaluate the different transfer learning methods. Does anyone have a suggestion as to where the problem could be in my implementation of ULMFiT? My guess is that I should probably spend more time on hyperparameter tuning, but that would contradict the purpose of my thesis, which is that the method should be usable quickly.