Is it possible to add classes to a model and train just for that class?

Is it possible to add classes to a model and train just for that class or do I have to retrain it from scratch?

Hi, I think some training is basically always involved when you add something new.
I think the relevant lesson covers transfer learning, so you don't have to start totally from scratch.

I hope someone else who knows more about the process of adding just one more category can comment.

Good luck

Yes. You can copy the weights over and change the last layer to have one or more extra outputs, so you keep your old classes and train for the new one. Look for a thread called “Transfer Learning Twice”, this has been brought up many times :slight_smile:

I remember reading that out_features had to be smaller than in the original model. The thing is, my old model had fewer classes than the new model I need.

It doesn’t have to be at all. It can be more or less, it doesn’t matter :slight_smile: The point of transfer learning is to reuse weights that are relevant to the new task to some degree, which makes training faster and/or improves results compared to training from scratch :slight_smile:

Re-reading your question, though: do you want a model that identifies just one class? Do you mean something along the lines of “it either is or is not x”?

Otherwise, my earlier comment stands. You retain the information for the other n classes you started with (minus the final layer) and then you add one more class into the model :slight_smile:
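
A minimal sketch of that idea in plain PyTorch, assuming the classifier's last layer is an nn.Linear. The helper name expand_linear is hypothetical, and copying the old rows into the new layer is an optional extra that the replies above do not require (they simply replace the final layer with a fresh one):

```python
import torch
import torch.nn as nn


def expand_linear(old: nn.Linear, n_extra: int) -> nn.Linear:
    """Hypothetical helper: build a wider final layer that keeps the old class rows."""
    has_bias = old.bias is not None
    new = nn.Linear(old.in_features, old.out_features + n_extra, bias=has_bias)
    new = new.to(old.weight.device)  # keep the new layer on the same device as the old one
    with torch.no_grad():
        # Rows 0..old.out_features-1 are the old classes; the extra rows keep
        # their fresh random initialization and still have to be trained.
        new.weight[: old.out_features].copy_(old.weight)
        if has_bias:
            new.bias[: old.out_features].copy_(old.bias)
    return new


old_head = nn.Linear(512, 196)           # 196 classes in the old model
new_head = expand_linear(old_head, 8)    # 204 classes in the new model
```

Going the other way (fewer outputs) just means creating a smaller nn.Linear. In either case the layers before the head are reused unchanged.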

No, I’m trying to expand on the Stanford Cars dataset that I trained on before. Thank you for the information :smiley:

Hey, I tried using learn.model[-1][-1]=nn.Linear(in_features=512,out_features=196, bias=True) before loading my model and learn.model[-1][-1]=nn.Linear(in_features=512,out_features=204, bias=True) after loading, reflecting the old and new numbers of classes. However, when I try to run lr_find() or fit_one_cycle I get:

RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_addmm

Please note that this does not happen when I don’t try to learn using my old model.
Do you have any idea why it is happening?

If I had to guess, try loading your model with .cuda() at the end

I’m using learn.load('pretrained') to load the model. Using learn.load('pretrained').cuda() gives me AttributeError: 'Learner' object has no attribute 'cuda'. Am I doing something wrong?
Edit: I’m loading a .pth file, not the output of learn.export(name). Should I load the exported version instead?

No, you’re loading it correctly :slight_smile: Try:

learn.model = learn.model.cuda()

That worked! For other people that might look at this thread, adding learn.model = learn.model.cuda() after learn.model[-1][-1]=nn.Linear(in_features=512,out_features=204, bias=True) (the second one) will fix your problem.
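
For reference, here is a rough sketch of the same sequence in plain PyTorch rather than fastai. The make_model function and the in-memory state dict are hypothetical stand-ins for the real architecture and the saved .pth file. The point it illustrates is that a freshly created nn.Linear lives on the CPU, so the model has to be moved to the GPU again after the last layer is swapped:

```python
import torch
import torch.nn as nn


def make_model(n_classes: int) -> nn.Sequential:
    # Hypothetical stand-in: a tiny "body" plus a head whose last layer is the classifier.
    body = nn.Sequential(nn.Linear(32, 512), nn.ReLU())
    head = nn.Sequential(nn.Linear(512, n_classes))
    return nn.Sequential(body, head)


# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

old_model = make_model(196)
old_state = old_model.state_dict()   # stands in for the weights saved in the .pth file

model = make_model(196)              # head must match the old class count before loading
model.load_state_dict(old_state)
model[-1][-1] = nn.Linear(in_features=512, out_features=204, bias=True)  # created on the CPU
model = model.to(device)             # move everything, including the new layer

x = torch.randn(4, 32, device=device)
logits = model(x)                    # one row of 204 class scores per input
```

Without the final move, the body would sit on the GPU while the new layer stays on the CPU, which is the kind of mismatch the RuntimeError above reports.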

Yes, we can do that, but I think that for a facial recognition system using the current FaceNet architecture, we need to retrain the model to generate the 128-D encodings for each person. Is there any way that, when I add a new person to my database, I can calculate the embedding for just that person and add it to the database, without recalculating the embeddings of everyone else?