Siamese Network DataBunch

mohamed1 · May 27, 2019, 1:26pm

I am trying to make a custom image list to create a DataBunch for a Siamese network. I found this very helpful notebook:

github.com

afitts/kaggle/blob/master/competitions/humpback-whale/siamese-with-fast-ai.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook I will explore setting up a [Siamese Neural Network](http://www.cs.utoronto.ca/~gkoch/files/msc-thesis.pdf) (SNN), using the fastai/pytorch framework, to try and identify whales by their flukes (tail fins). The dataset comes from the kaggle humpback whale identification [challege](https://www.kaggle.com/c/humpback-whale-identification). The inspiriation for this technique originated from Martin Piotte's [kaggle kernel](https://www.kaggle.com/martinpiotte/whale-recognition-model-with-score-0-78563) which implemented a SNN in keras.\n",
    "\n",
    "I'll be focusing on training an SNN as they are specifically tailored for one-shot learning tasks, which consists of classification under the restriction that we may only observe a single example of each possible class before making a prediction about a test instance. This is extremely useful given that in my previous [post](http://afitts.github.io/2018/11/04/humpback/) I found that the majority of whales in the dataset only have 1-4 examples in the training set. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "_uuid": "24248b49182cbb21fff652ebeccb9c171ac0c9be"
   },
   "source": [
    "## Imports"
   ]
  },


However, it uses fastai version 1.0.39. In the latest version, ImageItemList has been replaced by ImageList. After changing that, I get this error:
'SiamImageItemList' object has no attribute 'xtra'
which is raised by these lines in SiamImageItemList:
imgs = self.xtra.Image.values
ids = self.xtra.Id.values

So what was the xtra attribute renamed to?

matejthetree · May 27, 2019, 8:39pm

https://docs.fast.ai/vision.data.html#ItemList-specific-to-vision

This is what I found in the docs:

It inherits from ItemList and overwrite ItemList.get to call open_image in order to turn an image file in Path object into an Image object. label_cls can be specified for the labels, xtra contains any extra information (usually in the form of a dataframe) and processor is applied to the ItemList after splitting and labelling.

baz · May 27, 2019, 9:33pm

I’ve been working on creating some simple abstractions to help with Siamese Networks:

Have a look and feel free to make PRs, although I am going to commit a better version this week with examples of how to work with images (currently it focuses on audio).

mohamed1 · May 27, 2019, 9:33pm

Update: xtra was an attribute in the old version of fastai v1. Now it's called inner_df.
Jeremy should at least document such API-breaking changes. It's really hard to keep up with them, especially since they happen so often, and I had to dig deep into the docs to find this out.
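For anyone porting the notebook, here is a minimal sketch of a lookup that works on both sides of the rename. The helper name `get_item_df` is hypothetical (it is not part of fastai); the only facts it relies on are the ones in this thread, namely that the DataFrame was exposed as `xtra` in older v1 releases and as `inner_df` in newer ones.

```python
def get_item_df(item_list):
    """Return the DataFrame attached to a fastai v1 item list.

    Tries `inner_df` (newer v1 releases) first and falls back to
    `xtra` (older releases such as 1.0.39).
    """
    for name in ("inner_df", "xtra"):
        df = getattr(item_list, name, None)
        if df is not None:
            return df
    raise AttributeError("item list has neither 'inner_df' nor 'xtra'")


# Inside SiamImageItemList the two failing lines would then become:
#   df = get_item_df(self)
#   imgs = df.Image.values
#   ids = df.Id.values
```

If you only target a recent fastai v1 release, replacing `self.xtra` with `self.inner_df` directly is enough.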

baz · May 27, 2019, 9:35pm

I’d also like to have a graphical representation of embeddings as shown in the notebook you linked. This is extremely useful in visualising the results.

mohamed1 · May 27, 2019, 9:38pm

Thanks. I'll also make sure to post my work on that thread when I'm done. Siamese networks are extremely useful, and it would be great if fastai incorporated them.

baz · May 27, 2019, 9:39pm

Yes, I'm very interested in them too! I'm almost finished with a repo that will help you work with them in fastai. It would be great to get some help.

mohamed1 · May 27, 2019, 9:50pm

I'm afraid posting my work is all I can do for now, since I'm in the middle of my graduation project. But I hope that after that I can contribute to your repo and make a pull request.

FangChung · November 7, 2019, 3:54am

This Siamese architecture with fastai is useful for me, thanks a lot.
I have another question, about how to do prediction.
For example, in Jeremy's course on image classification, prediction is done like this:

pred_class,pred_idx,outputs = learn.predict("img.jpg")

So, how can I implement Siamese prediction like the following?

pred_target, dist = siam_learner.predict("img1.jpg", "img2.jpg")

Thanks

mohamed1 · November 7, 2019, 4:39pm

I believe that will not work for you. That shouldn't be a problem though, since you can easily implement prediction in pure PyTorch. But prediction for a Siamese network is different, in the sense that you need a support set (images with known classes) and a test set of images, each of which you compare to the images in the support set to find the most similar one. That's how Siamese networks make predictions, so you can see that fastai's predict method will not work for this.
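A minimal sketch of that support-set lookup in plain PyTorch is below. It rests on assumptions not stated in the thread: `encoder` stands for one branch of the trained Siamese network and maps a batch of images to a batch of embedding vectors, similarity is measured with Euclidean distance, and the function name `classify_with_support` is hypothetical. If your network ends in a learned comparison head instead, you would score each pair with that head rather than with `torch.cdist`.

```python
import torch
import torch.nn as nn


def classify_with_support(encoder, support_x, support_labels, query_x):
    """Label each query image with the class of its nearest support image.

    support_x:      tensor [N, C, H, W], images with known classes
    support_labels: list of N labels, one per support image
    query_x:        tensor [M, C, H, W], images to classify
    Returns (predicted labels, distance to the matched support image).
    """
    encoder.eval()
    with torch.no_grad():
        support_emb = encoder(support_x)   # [N, D]
        query_emb = encoder(query_x)       # [M, D]
    # Pairwise Euclidean distances between every query and every support image.
    dists = torch.cdist(query_emb, support_emb)   # [M, N]
    min_dist, nearest = dists.min(dim=1)
    preds = [support_labels[i] for i in nearest.tolist()]
    return preds, min_dist
```

The support images and query images must already be preprocessed the same way as during training.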

zerov · March 5, 2020, 10:43am

I have been stuck on this problem for several days. Have you solved it?

FangChung · March 6, 2020, 4:02am

As @mohamed1 said, it can't be done directly with a fastai function; you need to use PyTorch. You should convert your image to tensor format:

1. Crop the image to match the network's input size.
2. Normalize the image (because 'dist' should fall in a range, e.g. [0, 1], so that you can apply a decision strategy).
3. Convert the images to tensors and add one dimension (the network was trained on batches, so it expects input of shape [batch_size, channel_size, pixel, pixel]; add a leading dimension to get [1, channel_size, pixel, pixel]).
4. Determine 'pred_target' by comparing 'dist' against a threshold, which is calculated from the validation data.
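The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code anyone in this thread actually used: the function names (`preprocess`, `predict_pair`, `pick_threshold`) are hypothetical, the image is assumed to be already loaded as a float tensor of shape [C, H, W] with values in [0, 1] and at least `size` pixels on each side, the `mean`/`std` defaults are placeholders that should be replaced with the statistics used in training, and the threshold is chosen by simple accuracy on validation pairs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def preprocess(img, size=224, mean=0.5, std=0.5):
    """Centre-crop, normalize, and add a batch dimension: [C,H,W] -> [1,C,size,size]."""
    _, h, w = img.shape
    top, left = (h - size) // 2, (w - size) // 2
    img = img[:, top:top + size, left:left + size]   # crop to the network's input size
    img = (img - mean) / std                          # placeholder stats (assumption)
    return img.unsqueeze(0)                           # add the batch dimension


def predict_pair(encoder, img1, img2, threshold, size=224):
    """Return (same_class, dist) for two images, using Euclidean distance."""
    encoder.eval()
    with torch.no_grad():
        e1 = encoder(preprocess(img1, size))
        e2 = encoder(preprocess(img2, size))
    dist = F.pairwise_distance(e1, e2).item()
    return dist <= threshold, dist


def pick_threshold(dists, same):
    """Pick the distance cut-off with the best accuracy on validation pairs.

    dists: list of distances for validation pairs
    same:  list of bools, True where the pair shares a class
    """
    best_t, best_acc = None, -1.0
    for t in sorted(dists):
        acc = sum((d <= t) == s for d, s in zip(dists, same)) / len(dists)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t
```

Usage would look like `pred_target, dist = predict_pair(encoder, img1, img2, threshold)`, with `threshold` obtained once from `pick_threshold` on distances computed over labelled validation pairs.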

zerov · March 7, 2020, 12:10pm

Thanks for your patience, I will try.

abhinavt · June 6, 2020, 10:45am

I have used PyTorch to create a dataset, and fastai's lr_find and fit_one_cycle to train a Siamese network. It might be useful.

Here is the link to the code.

baz · June 9, 2020, 7:59am

@abhinavt That's great, well done!