Newbie here with computer vision and neural networks, so hoping someone can assist me. I want to apply lesson 1 to a set of satellite images taken over properties at a fixed zoom level (from the Google Static Maps API). I have the images plus residential property sale data for South Africa, which lets me create two classes by taking the top and bottom 10% of sale prices and labelling them 1 (luxury) and 0 (affordable), to see whether a CNN can distinguish expensive from affordable property areas using satellite imagery alone. The images are in labeled folders (“0” and “1”).
I’m trying to replicate a white paper that, instead of just fine-tuning a pretrained network to distinguish between my two classes, outputs 256 features before making the softmax prediction of luxury vs affordable. The reason is that the 256 features are later used as inputs to a neural net model, concatenated with other data features such as area under roof, number of bedrooms, etc. So I need to figure out how to alter lesson 1 to add a custom head that takes the original model’s output, adds a fully connected linear layer producing 256 features, and then applies a softmax on top of that to classify as 1 or 0. Or at least that’s what I think I need to do.
The plan is to first check whether a luxury/affordable classifier is accurate using only 256 features from an image, and if so, use those features as inputs to another neural network. I hope I’ve explained this well enough.
Appreciate any help you guys can offer me!