Based on the questions about 4-channel inputs, I’m guessing that some people are working on this competition right now. Wanted to share that I released my starter code here: https://www.kaggle.com/c/human-protein-atlas-image-classification/discussion/71039 that gets you up and running with fastai v1.
This is a really neat starter William - thank you. I’m a little far down the road with an older version of fastai at the moment but am keen to see the progress people make with v1.
What does this do when you normalise with it further below? And how did you determine the values?
protein_stats = ([0.08069, 0.05258, 0.05487, 0.08282], [0.13704, 0.10145, 0.15313, 0.13814])
Those are per-channel means and standard deviations for the protein data set. Normalizing works just like with imagenet_stats: it subtracts the mean and divides by the standard deviation. I didn’t calculate them myself; I got those values from another fastai kernel (https://www.kaggle.com/iafoss/pretrained-resnet34-with-rgby-0-460-public-lb)
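For anyone wondering how stats like those get computed and applied, here is a minimal NumPy sketch. The function names and the shape convention `(N, H, W, C)` are my own assumptions, not from the kernel:

```python
import numpy as np

def channel_stats(images):
    """Compute per-channel mean and std over a batch of images.

    images: float array of shape (N, H, W, C) with values in [0, 1].
    Returns (means, stds), each a list of length C.
    """
    means = images.mean(axis=(0, 1, 2))
    stds = images.std(axis=(0, 1, 2))
    return means.tolist(), stds.tolist()

def normalize(image, means, stds):
    """Subtract the per-channel mean and divide by the per-channel std."""
    return (image - np.asarray(means)) / np.asarray(stds)
```

After normalizing with its own stats, a batch has roughly zero mean and unit variance per channel, which is all that `imagenet_stats` or `protein_stats` are doing under the hood.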
I wonder how adding a new input channel to the model affects learning rate tuning. Should one freeze all layers except the first one, to fine-tune its weights from zeros to something meaningful?
I think there’s a better solution; see my comment here https://www.kaggle.com/c/human-protein-atlas-image-classification/discussion/71039#418772. Someone else suggested that instead of initializing the weights for the new input channel to zeros, we just copy over the weights of one of the other channels. That’s probably good enough that the first layer doesn’t need separate tuning; it will get trained along with the others when you unfreeze.
If the fourth channel is correlated with the other channels in the image, then this should be a good approach. I guess that multiplying by 3/4 would also be a good idea, in order to preserve the magnitude of the signal that the neuron (ReLU) receives.
I logged a pull request on GitHub to update the threshold from 0.5 to 0.2. This is consistent with the planet example we covered in the course, and it also gets better results for the Human Protein data set. A small change that will make a big difference.
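To illustrate what that threshold changes: in multi-label classification each class gets an independent sigmoid probability, and the threshold decides which ones count as predicted. A minimal sketch (the function name is mine, not from the PR):

```python
import numpy as np

def predict_labels(probs, thresh=0.2):
    """Binarize multi-label sigmoid outputs at the given threshold.

    With many rare classes, as in the protein data, a 0.5 threshold misses
    weak positives; lowering it to 0.2 recovers more of them.
    """
    return (probs > thresh).astype(int)
```

For example, a class scored at 0.3 is dropped at threshold 0.5 but kept at 0.2, which matters a lot for the macro F-score when positives are scarce.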
Also, a question was asked in lesson 3 about what normalization stats to use for new data sets, and Jeremy indicated that with transfer learning, best practice is to use the same stats the pre-trained model was trained on.
So perhaps consider using ImageNet normalisation stats instead of the data-set-specific stats for quicker results. If the model were trained from scratch, I suspect yours would be better. The Human Protein images are so different, however, that I don’t think this will really make a difference.
Thanks for the code. I’ve been trying to get a PyTorch dataset working with fastai v1 for a while now.
Thanks for the pull request! I just merged it. I’m curious: with the change to 0.2, what were you able to get on the public leaderboard?
I played around with a few things, so I’m not sure what the change to the 0.2 threshold contributed in isolation.
I got an F-score of 0.4, and if I had to guess I’d say it’s responsible for a 0.02 to 0.04 uptick, so 5-10%. This value needs to be refined again once the model is trained, as I have seen it shift in previous exercises.
Thank you for sharing. To continue this thread, I’d like to share my baseline for this competition too, which is mainly inspired by the planet notebook from lesson 3.
My approach was to simply combine three 1-channel images (discarding the yellow images) into one RGB 3-channel image with the great tool ImageMagick https://www.imagemagick.org/script/index.php
Combining images is as simple as this command in a Linux shell:
convert r.jpg g.jpg b.jpg -channel RGB -combine combined.jpg
So the workflow assumes you have the combined images in your train and test folders.
This notebook won’t get you to the top of the LB, but it provides a nice and simple way to start playing with the data and experimenting further.
And here is a link to the Kaggle discussion topic.
Thanks to the fastai team and community.
I’ve tried to tackle the same competition, but even using transfer learning on the Y channel too, the model seems to overfit without reaching your score…
Comparing the two models, it seems you’re using a model considerably more complex than resnet50: it appears to mix two copies of resnet50, one whose first convolution takes 3-channel images as input and one that takes 4-channel images. Why do you do that?
Here is my work:
@wdhorton why is the size = 224, and would it be possible to amend resnet.py to change the size up or down?
The size 224 is passed into the databunch create method. I chose it because it’s what resnet50 was originally trained on, but I’ve also run 128, 256, and 512 with this same notebook. You don’t have to modify resnet.py; just change the size number when you create the databunch.
I’m not using two copies of resnet, though I can see how it might look like that. I initialize a single pretrained resnet50 as the encoder variable. Then I have to modify the first convolutional layer, so I make a new Conv2d but copy over some of the weights from the pretrained model. The rest of the model is exactly like resnet50, if you look at the code in torchvision.models. In the forward pass I don’t call encoder (resnet50) directly; I just use the layers of it that I added to self in the init method.
For some reason, changing the size results in an error when I start to train the model:
RuntimeError: Given input size: (2048x2x2). Calculated output size: (2048x-4x-4). Output size is too small at /opt/conda/conda-bld/pytorch-nightly_1540201584778/work/aten/src/THCUNN/generic/SpatialAveragePooling.cu:63
The only change I made to your code is the size, from 224 to 128, when I create the data bunch.
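For what it’s worth, that stack trace usually points at resnet’s fixed-size pooling head: `nn.AvgPool2d(kernel_size=7)` assumes the 7x7 feature map that a 224px input produces, so smaller inputs shrink the map below the kernel size. One common fix, though not necessarily the one used in the updated notebook, is an adaptive pool, which outputs 1x1 regardless of input size:

```python
import torch
import torch.nn as nn

fixed_pool = nn.AvgPool2d(kernel_size=7)   # assumes a 7x7 feature map
adaptive_pool = nn.AdaptiveAvgPool2d(1)    # works for any feature-map size

# A 2x2 feature map, like the one in the error message above:
feat = torch.randn(1, 2048, 2, 2)
out = adaptive_pool(feat)   # fine: shape (1, 2048, 1, 1)
# fixed_pool(feat) would raise the "Output size is too small" error
```

Swapping the fixed pool for the adaptive one lets the same head handle 128, 256, or 512 inputs.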
Ok, I think I know the issue. I’ve got a couple of fixes I’m going to get out in the next few days (including migrating to the data block API); I’ll keep you updated.
Update: I made another notebook to work with the new data_block API. In this version, I also made changes to use the create_cnn function. You can find it at resnet50_basic_datablocks.ipynb. @chrisoos this should fix the issue you ran into too (caused by saving the encoder as self.encoder when it wasn’t needed).