New custom head in trained model

I have trained a multilabel/multiclass model with pretrained resnet34 weights on data with 28 classes. I would now like to use the weights of this model, minus the head, to train a BINARY classifier using the same data but with only one of the labels. The new classifier uses the exact same input images, and the model needs only small changes, all in and near the head.

Lesson 9 gave me the ability to drop a new custom head into the resnet34 pretrained model. But it involves a call:

    ConvLearner.pretrained(f_model, md, custom_head=head_reg4)

which, I think, requires a standard model function (e.g. resnet34, which is in the meta table) as its first argument, and not, for example, simply a path to my transfer-trained model.

So I do not know how to change the head on the model I have trained (it was saved in a prior run and I will read it in using, I guess, learner.load()).

One possible way is to:

  1. create a new model with the new custom head and original resnet34 weights
  2. read in the model that I trained and copy its weights into the new model

Step 1 is easy. What is the best way to accomplish step 2?
Am I thinking about this the right way?
Finally: is there documentation that would answer this question?

I would also love some help with this.

You have to dig a bit into the code base for this. The model created by create_cnn is a sequential model with two things: the body and the head. If you want to keep the pretrained body, you’ll find it in model[0]; the head is model[1].
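
For instance, a minimal sketch of inspecting that structure (fastai v1 API; `data` here is a placeholder for whatever DataBunch was used):

    from fastai.vision import *

    learn = create_cnn(data, models.resnet34)
    print(learn.model[0])  # the pretrained convolutional body
    print(learn.model[1])  # the head: pooling, flatten, bn, dropout, linear layers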

If you want a new CNN with fewer classes, you can create your new model with learn1 = create_cnn(data1, ...), then load the weights from the previous model's body into the new one with

    learn1.model[0].load_state_dict(learn.model[0].state_dict())
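
Putting it together, a hedged sketch of the whole transfer (the DataBunch names `data28`/`data1` and the saved-weights name 'stage-1-28' are placeholders, not anything from this thread):

    from fastai.vision import *

    # original 28-class learner, restored from the weights saved earlier
    learn = create_cnn(data28, models.resnet34)
    learn.load('stage-1-28')          # placeholder name for the saved weights

    # new binary learner on the same images, with a freshly initialised head
    learn1 = create_cnn(data1, models.resnet34)

    # copy only the body (model[0]); the new head (model[1]) keeps its own init
    learn1.model[0].load_state_dict(learn.model[0].state_dict())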

Thanks for this response - it is exactly what I was looking for.

I have a question on a detail, though. If I print learn1.model[0], I get back the very first layer (attached to the input). model[1] gives the next layer (a sequential block). Layers 8 through 13 give the AdaptiveConcatPool plus what I would call the head: flatten, batchnorm, dropout, linear, FC, ReLU.

So I think I need to do:
    learn1.model[:9].load_state_dict(learn.model[:9].state_dict())

Would you agree with this?

Oh, I didn’t remember they were joined like this, but in this case, yes.
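
For anyone checking their own model before slicing, a hedged way to confirm where the head starts (this continues with the `learn`/`learn1` learners from above, assumes the model is one flat nn.Sequential, and the index 9 is just the value taken from the printout in the previous post):

    # print the top-level children with their indices to find where the head begins
    for i, layer in enumerate(learn1.model):
        print(i, layer.__class__.__name__)

    # copy everything before index 9 (the body) from the old learner
    learn1.model[:9].load_state_dict(learn.model[:9].state_dict())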

OK thanks for your help. I believe that this is working properly now.

Are you working on HPA comp?

I was, but I just made my last submission. I thought it was a good competition - I learned a lot - though the statistics were a little flawed. It will be very interesting to see the private LB!

How about you?

Yeah, me too. In the end I should have chosen my final submissions better…
Do you want to team up in another comp? Maybe the whale challenge or the seismic challenge?

I haven’t looked at these - I’m going on vacation for a while but afterwards I’ll take a look.

So I don’t need to call create_head myself when starting from a pretrained model, and this way is better?

If you look at the source, create_cnn calls create_head.
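
For completeness, a hedged sketch of supplying your own head through the v1 API instead of the default one create_head builds (the layer sizes are assumptions for a resnet34 body, whose 512 features are doubled by AdaptiveConcatPool2d; `data1` is a placeholder DataBunch):

    from fastai.vision import *
    from torch import nn

    # hand-built head for a binary classifier, passed in place of the default
    custom_head = nn.Sequential(
        AdaptiveConcatPool2d(), Flatten(),
        nn.BatchNorm1d(1024), nn.Dropout(0.5),
        nn.Linear(1024, 2),          # 2 outputs for the binary classifier
    )
    learn1 = create_cnn(data1, models.resnet34, custom_head=custom_head)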