Using XGBoost with Densenet161

Hello! I was using the pre-trained densenet161 for a car make and model classification problem. The accuracy is pretty good after 10 epochs, without fine-tuning the learning rate.
However, I came across XGBoost recently. From what I understood, we can use the trained CNN as a feature extractor and then do the training with XGBoost, which will handle the feature selection part.

I am a little bit confused.

  1. Will training with XGBoost also handle the classification part, or do I need to set up a separate classifier for that?
  2. I trained the model with create_cnn; can I access the features through the learner?
  3. Should I fine-tune the learning rate before extracting the features?

Can someone please help or direct me to some good resources?

Hi Catalina,

XGBoost is an effective classifier for tabular (structured) data. It does not work on unstructured data like images unless you first do some encoding/feature generation. I assume the idea here is to encode each image into a “flat” feature vector using a pretrained CNN and feed the encoded images to XGBoost.

In theory it may work, but I am not sure what the benefit of using XGBoost would be in this case.

You could train a classifier with DenseNet (or another pretrained CNN), then drop the head (the fully connected classification layers at the end) and feed the resulting flat feature tensor into XGBoost.
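
If you still want to experiment with this, here is a minimal sketch using plain PyTorch/torchvision and XGBoost. It assumes you already have DataLoaders named train_dl and valid_dl yielding (image batch, label batch) pairs; with a fastai learner you could reuse learn.model as the backbone instead of reloading densenet161. Note that in this setup XGBoost itself acts as the classifier, so no extra head is needed.

```python
import torch
import torch.nn as nn
from torchvision import models
from xgboost import XGBClassifier

# Load the pretrained densenet161 and drop its head by replacing the final
# classifier layer with an identity, so the forward pass returns the
# 2208-dim pooled feature vector instead of class scores.
backbone = models.densenet161(pretrained=True)
backbone.classifier = nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_features(dl):
    feats, labels = [], []
    for xb, yb in dl:
        feats.append(backbone(xb).cpu())
        labels.append(yb.cpu())
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# train_dl / valid_dl are assumed to be your existing DataLoaders.
X_train, y_train = extract_features(train_dl)
X_valid, y_valid = extract_features(valid_dl)

# XGBoost is then the classifier on top of the frozen CNN features.
clf = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_valid, y_valid))
```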

What would the benefit be? You have already trained a classifier with your CNN, starting from the pretrained one (see the first chapters of Fastbook for transfer learning). It does not seem to make much sense to train a second classifier with a totally new model that has a completely different architecture and uses different libraries.

The only possible benefit would be if you have lots of tabular data/metadata together with the images and you want to train on both. But even then, it seems more natural to extend the CNN to accept tabular inputs as well.
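
For example, a rough sketch of such a multimodal model in PyTorch could look like the following (the hidden layer size, dropout, and the n_tab_features/n_classes values are arbitrary placeholders, not taken from your setup):

```python
import torch
import torch.nn as nn
from torchvision import models

class ImageTabularModel(nn.Module):
    """Illustrative model: CNN image features concatenated with tabular inputs."""
    def __init__(self, n_tab_features, n_classes):
        super().__init__()
        cnn = models.densenet161(pretrained=True)
        cnn.classifier = nn.Identity()          # keep the 2208-dim feature output
        self.cnn = cnn
        self.head = nn.Sequential(
            nn.Linear(2208 + n_tab_features, 512),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(512, n_classes),
        )

    def forward(self, image, tabular):
        img_feats = self.cnn(image)                 # (batch, 2208)
        x = torch.cat([img_feats, tabular], dim=1)  # fuse the two modalities
        return self.head(x)

# e.g. model = ImageTabularModel(n_tab_features=12, n_classes=196)
```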

I hope it helps :slight_smile:

1 Like

Hello @vtecftwy! Thank you very much for your response. It does make it clear that XGBoost would be an extra step that may not be necessary. From reading different sources, I thought it might be a better classifier and that I could apply it to my scenario.
For fine-tuning, I have only managed to play around with the learning rate so far. I plan to experiment with batch size variations as well, but do you have any suggestions?

Work on your learning rate first, and possibly other hyperparameters. As long as you stay with batch sizes in the range of roughly 16 to 128, you should not see much of a difference in error rate or other metrics; the batch size will mainly influence the memory required to train the model.

Watch out for overfitting by plotting the losses after training (learn.recorder.plot_loss()).
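
For the learning rate, the usual fastai workflow is roughly the following sketch (method names are from recent fastai versions and differ slightly in the older v1 API where create_cnn lives):

```python
learn.lr_find()                      # plots loss vs. learning rate and suggests a value
learn.fine_tune(10, base_lr=1e-3)    # base_lr chosen from the lr_find plot
learn.recorder.plot_loss()           # training vs. validation loss curves to spot overfitting
```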

1 Like