Is vgg.finetune needed for vgg.test?

The questions:

  • Is vgg.finetune needed for vgg.test?
  • Why did a 1000-output layer only output 43 numbers? (see the story below)
  • Why was I able to load weights of one architecture into a network of a different architecture?

The main observation of the story:

I loaded weights corresponding to a trained 43-class Vgg16 model into a fresh 1000-class Vgg16 model. The predictions of the 1000-class Vgg16 model contained only 43 probabilities per test image. These probabilities matched the probabilities of the 43-class Vgg16 model.

The story:

  1. I trained a Vgg16 model using the finetune procedure from lesson 1.

  2. I saved the weights. (A rough sketch of what steps 1 and 2 looked like appears right after the story.)

  3. I made a fresh Vgg16 model, loaded the weights, and made predictions:

    vgg1 = Vgg16()

    vgg1.model.load_weights(weights_path)
    test_batches1, probs1 = vgg1.test(test_path, batch_size=batch_size)

  4. I noticed that I forgot to finetune it:

    vgg1.model.layers[-1].output_shape
    # Returns: (None, 1000)

  5. I made a fresh Vgg16 model again, this time finetuning it:

    vgg2 = Vgg16()

    # Equivalent to vgg2.finetune(batches)
    vgg2.model.pop()
    for layer in vgg2.model.layers: layer.trainable=False
    vgg2.model.add(Dense(43, activation='softmax'))
    vgg2.compile()
    # end finetune

    vgg2.model.load_weights(weights_path)
    test_batches2, probs2 = vgg2.test(test_path, batch_size=batch_size)

  6. I noted the output shape of its final layer (43 now, instead of 1000):

    vgg2.model.layers[-1].output_shape
    # Returns: (None, 43)

  7. I compared the predictions of the two models and saw that they were the same:

    probs1 == probs2
    # Returns:
    array([[ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           ...,
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True],
           [ True,  True,  True, ...,  True,  True,  True]], dtype=bool)
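
For reference, here is a rough sketch of what steps 1 and 2 looked like. The paths (train_path, valid_path, weights_path), batch size, and single epoch are placeholders, not the exact values used; those are in the german-traffic-signs notebook linked below.

    # Sketch of steps 1 and 2 (lesson-1 finetune procedure); paths and
    # hyperparameters here are assumptions, not the notebook's exact values.
    vgg = Vgg16()

    batches = vgg.get_batches(train_path, batch_size=batch_size)
    val_batches = vgg.get_batches(valid_path, batch_size=batch_size*2)

    vgg.finetune(batches)                   # swap the 1000-way softmax for a 43-way one
    vgg.fit(batches, val_batches, nb_epoch=1)

    vgg.model.save_weights(weights_path)    # step 2: save the fine-tuned weights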

Notebooks:

Steps 3 to 7: https://github.com/MatthewKleinsmith/fast-ai-MOOC/blob/master/is_finetune_unneeded_for_test.ipynb

Steps 1 and 2: https://github.com/MatthewKleinsmith/fast-ai-MOOC/blob/master/german-traffic-signs.ipynb

Print out probs1 and probs2, and their shapes. I suspect the length of probs1 is your number of batches, not your number of outputs (43).

I think

import numpy as np

print(len(probs1), len(probs2))
print(np.shape(probs1), np.shape(probs2))
print(probs1[0])
print(probs2[0])

would be informative.

(12630, 12630)
((12630, 43), (12630, 43))
[  1.05845601e-16   1.18765105e-15   1.55969603e-17   3.12666475e-18
   1.70540237e-16   1.22288792e-19   2.69106317e-30   3.74874844e-15
   2.59100473e-21   2.25113171e-19   4.36713468e-19   9.76924002e-01
   8.76862737e-15   1.86876538e-15   5.57775907e-16   1.15655743e-27
   5.81721904e-29   3.09473505e-24   6.31797820e-06   1.24634525e-09
   5.13620364e-07   3.41670006e-04   3.62355053e-07   4.51457072e-06
   1.58777766e-07   8.44739436e-04   5.16915679e-05   1.90466265e-09
   4.52431646e-04   8.01832584e-07   1.52249979e-09   2.13727858e-02
   6.48145080e-25   1.20713343e-20   2.46993940e-22   6.70727605e-17
   3.91028934e-25   1.34636500e-23   1.48310757e-19   4.20323457e-17
   5.02027875e-25   1.60621171e-26   2.14951021e-21]
[  1.05845601e-16   1.18765105e-15   1.55969603e-17   3.12666475e-18
   1.70540237e-16   1.22288792e-19   2.69106317e-30   3.74874844e-15
   2.59100473e-21   2.25113171e-19   4.36713468e-19   9.76924002e-01
   8.76862737e-15   1.86876538e-15   5.57775907e-16   1.15655743e-27
   5.81721904e-29   3.09473505e-24   6.31797820e-06   1.24634525e-09
   5.13620364e-07   3.41670006e-04   3.62355053e-07   4.51457072e-06
   1.58777766e-07   8.44739436e-04   5.16915679e-05   1.90466265e-09
   4.52431646e-04   8.01832584e-07   1.52249979e-09   2.13727858e-02
   6.48145080e-25   1.20713343e-20   2.46993940e-22   6.70727605e-17
   3.91028934e-25   1.34636500e-23   1.48310757e-19   4.20323457e-17
   5.02027875e-25   1.60621171e-26   2.14951021e-21]

When you load the weights that were saved with a 43-class last layer, the weights for your last layer end up as numpy arrays sized for only 43 outputs:

    myweights = vgg1.model.get_weights()
    myweights[39].shape   # weights that were loaded into the last layer

and hence the model's predictions only contain 43 classes.
Maybe Keras should have at least thrown a warning that the shapes of the weights differ, instead of silently overwriting the weight arrays with ones of a different shape.
Nevertheless, you are not doing any training after loading the weights. Does a call to vgg1.fit throw any warnings?
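
To see the silent overwrite directly, one option is to compare the weight shapes before and after loading. This is only a sketch, assuming the same weights_path as above and the load_weights behaviour described in this thread (no shape check on load); vgg_check is just an illustrative name.

    vgg_check = Vgg16()                        # fresh model, built with a 1000-way last layer
    shapes_before = [w.shape for w in vgg_check.model.get_weights()]

    vgg_check.model.load_weights(weights_path)
    shapes_after = [w.shape for w in vgg_check.model.get_weights()]

    # Print the indices whose shape changed; with weights saved from the
    # 43-class model, only the last Dense layer's kernel and bias should
    # differ, e.g. (4096, 1000) -> (4096, 43) and (1000,) -> (43,).
    for i, (b, a) in enumerate(zip(shapes_before, shapes_after)):
        if b != a:
            print(i, b, '->', a)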