Use a CNN as a feature extractor for a tabular model


I am trying to perform something similar to this previous post: Using a trained Resnet as feature extractor

But I would like a bit more context from anyone who has faced this problem before (maybe @tapashettisr?). Which layer is best to extract the features from? How do you do it in practice? With a hook? Any code snippet would be very welcome.
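For reference, here is a minimal sketch of the hook approach asked about above, assuming PyTorch. `TinyCNN` is a hypothetical stand-in for a trained ResNet-style network, and hooking the global pooling layer (the last layer before the classifier) is only one common choice, not necessarily the best layer for every task.

```python
import torch
import torch.nn as nn


# Hypothetical small CNN standing in for a trained ResNet-style backbone.
class TinyCNN(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x)
        x = torch.flatten(x, 1)
        return self.fc(x)


model = TinyCNN().eval()
captured = {}


def save_pooled(module, inputs, output):
    # Store the pooled activations (not the weights) as one vector per image.
    captured["feat"] = torch.flatten(output, 1).detach()


# Attach the hook to the layer whose output we want to keep.
handle = model.pool.register_forward_hook(save_pooled)

torch.manual_seed(0)
images = torch.randn(4, 3, 32, 32)  # made-up batch of 4 RGB images
with torch.no_grad():
    logits = model(images)
handle.remove()  # detach the hook once the features are collected

features = captured["feat"].numpy()  # one row per image, ready for xgb/rf
```

Each row of `features` can then be used as a row of the tabular dataset, on its own or concatenated with other columns.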

Do you think the model should be trained on the specific task I want to solve before extracting the features? Or should it be trained on some other task better suited to learning useful features?

Also, do you think applying dimensionality reduction (like PCA) to the features would be useful before feeding them into the tabular model (xgb, rf, etc.)?
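To make the PCA idea concrete, here is a rough sketch using only NumPy. The feature matrix is random made-up data, and `n_components = 32` is an arbitrary illustrative value, not a recommendation. Whether the reduction actually helps the tabular model would have to be checked on validation data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1024))  # made-up data: 200 images x 1024 CNN features
n_components = 32                 # hypothetical target dimensionality

# PCA via SVD: center the data, then project onto the top right-singular vectors.
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:n_components]    # (32, 1024) principal directions
X_reduced = Xc @ components.T     # (200, 32) reduced features

# Fraction of the total variance kept by the retained components.
explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
```

In practice, `mean` and `components` should be fitted on the training rows only and then reused to transform validation and test rows.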


An update on this. I used a published pipeline to perform the feature extraction. I end up with a 1024xN matrix per image, where N is the number of patches for that image.

The problem now is how to integrate the information from the different patches into something with the same shape for every image. For example, one image may be 1024x2500 while another is something like 1024x100.

I was thinking of doing something like PCA to reduce the dimensions to a fixed size, but I am unsure whether this makes sense. Any ideas?
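One thing to note is that reducing the 1024 feature dimension leaves N unchanged, so some aggregation across patches is still needed to get a fixed shape. A minimal sketch of one simple option, pooling over the patch axis, is below. The helper `pool_patches` and the random matrices are hypothetical, and mean plus max pooling is just one possible aggregation, not the method of the published pipeline.

```python
import numpy as np


def pool_patches(patch_features):
    """Collapse a (1024, N) patch matrix into one fixed-length vector.

    Mean and max are taken over the patch axis, so the output length
    (2 * 1024) does not depend on N.
    """
    mean = patch_features.mean(axis=1)
    mx = patch_features.max(axis=1)
    return np.concatenate([mean, mx])


rng = np.random.default_rng(0)
image_a = rng.normal(size=(1024, 2500))  # made-up image with 2500 patches
image_b = rng.normal(size=(1024, 100))   # made-up image with 100 patches

# Both images now map to rows of identical length for the tabular model.
rows = np.stack([pool_patches(image_a), pool_patches(image_b)])
```

After pooling, every image contributes one row of the same length, and PCA (or any other reduction) can be applied across images if 2048 columns is too many for the tabular model.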