I thought it would be good to start a fresh thread about this particular network type. There are several posts in this forum about it, but I thought it would be good to discuss approaches to using fastai to help create a dataset and module to get started with.
These resources were useful but if you have any others please share them and I’ll add them.
You’ll find some classes to help create a Siamese network.
The notebook shows how to:
Create a Siamese dataset from an already labelled dataset
Build a Siamese network module
Currently it generates pairs of items before training, because I thought that would make it easier to evaluate bad data. Pairs could also be generated on the fly?
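For reference, precomputing pairs from a labelled dataset might look something like this sketch (plain Python; the `make_pairs` name and its signature are my own, not the notebook's):

```python
import random

def make_pairs(items, labels, n_pairs, seed=0):
    """Build (item_a, item_b, is_same) tuples from a labelled dataset.

    Alternates between positive pairs (same class, target 1) and
    negative pairs (different classes, target 0) for an even balance.
    """
    rng = random.Random(seed)
    by_class = {}
    for item, lbl in zip(items, labels):
        by_class.setdefault(lbl, []).append(item)
    classes = list(by_class)
    pairs = []
    for i in range(n_pairs):
        if i % 2 == 0:
            # positive pair: two items drawn from the same class
            c = rng.choice(classes)
            pairs.append((rng.choice(by_class[c]), rng.choice(by_class[c]), 1))
        else:
            # negative pair: one item from each of two different classes
            c1, c2 = rng.sample(classes, 2)
            pairs.append((rng.choice(by_class[c1]), rng.choice(by_class[c2]), 0))
    return pairs

pairs = make_pairs(["a1", "a2", "b1", "b2"], ["a", "a", "b", "b"], 4)
```

Precomputing like this makes it easy to inspect the pairs for bad data before training, at the cost of a fixed pair set per epoch.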
The Siamese network module uses resnet34 as the encoder by default, but any architecture could be used as the encoder.
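A minimal sketch of what such a module can look like in plain PyTorch; the `SiameseNet` class and toy encoder here are illustrative, not the notebook's code (a real setup would pass in a pretrained resnet34 body):

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """Twin network: one shared encoder applied to both inputs."""
    def __init__(self, encoder):
        super().__init__()
        # e.g. a resnet34 body with its classification head removed
        self.encoder = encoder

    def forward(self, x1, x2):
        # Both branches share the same weights, so similar inputs
        # should land close together in embedding space.
        e1, e2 = self.encoder(x1), self.encoder(x2)
        return nn.functional.pairwise_distance(e1, e2)

# toy encoder standing in for a pretrained body
enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
net = SiameseNet(enc)
d = net(torch.randn(4, 3, 8, 8), torch.randn(4, 3, 8, 8))  # one distance per pair
```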
You can choose classes to make the validation set. This validation set is hidden from training for better evaluation of the network.
I’ve used hinge loss, but there are many other choices of loss function. I’ll add some more and see how they behave.
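For anyone curious, the hinge-style contrastive loss commonly used with Siamese distances can be sketched like this (my own implementation, not the notebook's; `margin` is a hyperparameter):

```python
import torch

def contrastive_loss(dist, target, margin=1.0):
    """Contrastive loss on a batch of pair distances.

    target == 1 -> similar pair: penalise large distances.
    target == 0 -> dissimilar pair: penalise distances inside the margin
                   (the hinge: pairs further apart than `margin` cost nothing).
    """
    pos = target * dist.pow(2)
    neg = (1 - target) * torch.clamp(margin - dist, min=0).pow(2)
    return (pos + neg).mean()

dist = torch.tensor([0.1, 0.9])     # distances for one similar, one dissimilar pair
target = torch.tensor([1.0, 0.0])
loss = contrastive_loss(dist, target)
```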
Should we be freezing the encoder?
How close to 0 should our results be for us to accept them as a good guess?
Do Siamese networks with heads improve results?
The number of pairs we can train on is very large even for a small dataset like this one. How should we allow the user to configure how many are used, while preserving good defaults (an even balance of pairs for each class)?
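One possible default, just as a sketch: cap the number of pairs drawn per class so that classes stay balanced and small classes aren't asked for more pairs than they can supply. The `pair_budget` name and interface here are hypothetical:

```python
from math import comb

# With n images in a class, the number of distinct positive pairs is
# comb(n, 2), which grows quadratically -- far more than we usually
# want to train on, so we sample a fixed budget per class.
def pair_budget(class_sizes, pairs_per_class):
    """Cap the positive pairs drawn for each class, keeping classes balanced."""
    return {c: min(pairs_per_class, comb(n, 2)) for c, n in class_sizes.items()}

budget = pair_budget({"cat": 100, "dog": 3, "fish": 50}, pairs_per_class=50)
```

A user-facing knob could then be just `pairs_per_class`, with the balanced cap as the default behaviour.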
There is this fastai v0.7 kernel that I highly recommend. The code is easy to understand with a lot of comments.
Other kernels like the following were very informative too:
this PyTorch code, which used ProtoNet (few-shot learning), an interesting middle ground between one-shot learning approaches like Siamese networks and plain classification. This is a very nice explanation of what it is and why it is a major trend in metric learning approaches.
this Keras kernel, which was unbeatable due to its use of LAP for selecting the image pairs. LAP is the Linear Assignment Problem, a special type of linear programming problem that allocates resources to activities on a one-to-one basis, in such a way that the cost or time involved is minimised and the profit is maximised. It was a very compute-heavy way to choose image pairs from the ~20K images: it took around 50-70% of the total 3 days of training time on CPU, with the rest spent training on GPU. But no other approach could beat it.
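To make the LAP idea concrete, here is a tiny brute-force sketch of the assignment problem (real code would use an efficient solver such as `scipy.optimize.linear_sum_assignment` rather than enumerating permutations; the cost values below are made up):

```python
from itertools import permutations

def solve_lap(cost):
    """Brute-force Linear Assignment: pick one column per row (one-to-one)
    so that the total cost is minimal. Only feasible for tiny matrices;
    it just demonstrates what an LAP solver computes."""
    n = len(cost)
    best_cols, best_total = None, float("inf")
    for cols in permutations(range(n)):
        total = sum(cost[r][c] for r, c in enumerate(cols))
        if total < best_total:
            best_cols, best_total = cols, total
    return best_cols, best_total

# cost[i][j] = "price" of pairing anchor image i with candidate image j
# (e.g. negated similarity, so the solver picks the hardest pairs overall)
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
cols, total = solve_lap(cost)  # row i gets paired with column cols[i]
```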
If you are serious about Siamese networks and metric learning in general, I highly recommend running the above three kernels and playing with them until you understand the concepts.
How you select the image pairs, and how the code increases the difficulty of the negative pairs, were the major factors behind a 20-30% variation in accuracy. So it is not just a matter of whether the Siamese network itself works; these seemingly minor details are crucial to making its performance on par with, or better than, other classification approaches.
@baz Thanks for this post. Do you have an answer to your question about how close the distance should be for a pair to be considered similar or different? I know it should be somewhere between 0 and the margin, but I’m not sure how to choose it properly.
Currently v2 is still under development and will make creating a Siamese pipeline much easier. Check out the video just above that I posted, which shows Jeremy explaining how to use the new API to do this. I will get to creating an example of Siamese networks at some point and I’ll post it in here.
This is a great question. I did some manual exploration but didn’t find any glaringly obvious answer. I would re-run others’ experiments, see what answers they got, and see how large their loss values got.
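One simple way to explore this empirically: sweep candidate thresholds between 0 and the margin on a held-out set of pair distances, and keep the one that best separates "same" from "different" pairs. This is just a sketch (the function name and toy numbers are mine, not from anyone's kernel):

```python
def best_threshold(dists, targets, margin=1.0, steps=100):
    """Sweep thresholds in [0, margin] and keep the one that best
    separates 'same' (target 1) from 'different' (target 0) pairs."""
    best_t, best_acc = 0.0, 0.0
    for i in range(steps + 1):
        t = margin * i / steps
        preds = [1 if d < t else 0 for d in dists]   # below threshold -> "same"
        acc = sum(p == y for p, y in zip(preds, targets)) / len(targets)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# toy validation distances: two similar pairs, two dissimilar pairs
dists = [0.1, 0.2, 0.8, 0.9]
targets = [1, 1, 0, 0]
t, acc = best_threshold(dists, targets)
```

The threshold found this way is specific to the loss, margin, and dataset, which may be why there is no single universal answer.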
I have been trying to implement a Siamese network on a different dataset than the PETS dataset shown in the tutorial. I tried the MNIST dataset, and when I looked at show_batch(),
there is a clear problem: it is showing dissimilar images as similar. I don’t know where the problem is?