For those of you having a hard time getting started with the Kaggle challenge + fastai library, I’ve worked with Prince to put together a “starter kit” for processing the Iceberg matrices into RGB images and then running your first convnet. Hopefully this gives you a starting point for tuning and trying other image techniques.
*11.14.17: Updated the code for proper Kaggle submission formatting:

- saved the test images with the id built in, e.g. img_ab2348fed28.png
- after running the fastai model, pull the file names + probabilities
- extract the ids from the file names + package them with the probabilities
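The id-extraction and packaging steps can be sketched in plain Python (assuming the test filenames really do follow the img_&lt;id&gt;.png pattern above; `ids_from_filenames` and `make_submission` are made-up helper names, not part of the starter kit):

```python
import os

def ids_from_filenames(filenames):
    """Extract the ids baked into filenames like 'img_ab2348fed28.png'."""
    ids = []
    for name in filenames:
        stem = os.path.splitext(os.path.basename(name))[0]  # 'img_ab2348fed28'
        ids.append(stem.split("_", 1)[1])                   # 'ab2348fed28'
    return ids

def make_submission(filenames, probs, path="submission.csv"):
    """Write the two-column CSV Kaggle expects: id,is_iceberg."""
    with open(path, "w") as f:
        f.write("id,is_iceberg\n")
        for img_id, p in zip(ids_from_filenames(filenames), probs):
            f.write(f"{img_id},{p:.6f}\n")
```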
Caveats about the Code

- Python 3
- Based on the fastai library
- Authored on a Paperspace instance with a GPU; one model trains in ~10 mins
- 100% Markdown: since Kaggle can’t run fastai, the notebook on Kaggle is 100% markdown, which makes copying code a little more difficult
Also, maybe try pretraining on scaled-down versions of the planet data or similar. And think about how best to do data augmentation, since flipping etc. doesn’t work well with the angle data in the iceberg dataset, as you may have noticed.
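One illustrative direction for geometry-safe augmentation, purely a sketch outside the fastai pipeline with made-up names: small pixel shifts and intensity scaling sidestep the flip/rotate problem entirely, since they don’t change the viewing geometry the incidence angle encodes.

```python
import numpy as np

def augment(band, rng, max_shift=4, max_scale=0.05):
    """Augmentations that leave the satellite geometry alone: small
    translations and intensity scaling, but no flips or 90-degree
    rotations, which conflict with the incidence-angle feature."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # wrap-around shift; fine for small offsets on 75x75 bands
    shifted = np.roll(np.roll(band, dy, axis=0), dx, axis=1)
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    return shifted * scale
```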
Thanks @timlee
I kinda got left behind in the Dog Breed Challenge competition. Definitely will get on this.
@jeremy @timlee
Would you mind if I give it a stab at getting the SeNet piece working? Also, what exactly did you mean by trying to get SeNet to work? Did you mean simply integrating it into the fastai library? After a meagre attempt at doing the same for VGG-16, I just might be able to do that. Or did you mean literally training SeNet on ImageNet and publishing the weights, so we have a pretrained model to work with?
Also, my GPU’s been sitting idle for some days and is surely eager to crunch some data (think training on ImageNet).
Some people have had good luck trying different models. You can find the different models available near the top of the conv_learner.py file in the fastai directory.
@KevinB
Sorry… I didn’t get my first post quite right. I was able to use VGG-16 just fine. I think Jeremy actually pushed a feature update for it himself.
But will let you know what happens with the SegNet architecture.
Hi, do we have a pretrained network on CIFAR-10 integrated in fastai? Otherwise I can run https://github.com/kuangliu/pytorch-cifar on my server, save the model, and then maybe it can be added.
I’ve been reading the PyTorch forums, and people seem to have issues with saving and loading models. Shouldn’t it be as easy as saving the trained model parameters/weights into a pickle-like file, as mentioned in the PyTorch docs, then loading and using it in any environment such as fastai?
And why don’t people just share those serialized files from the best models, trained on different datasets?
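For reference, the pattern PyTorch recommends really is about that simple: save the `state_dict`, rebuild the architecture on the receiving side, and load the weights in. A minimal sketch (the helper names are my own):

```python
import torch
import torch.nn as nn

def save_weights(model, path):
    # Save only the parameter tensors, not the whole pickled module;
    # a state_dict is portable across environments and code versions.
    torch.save(model.state_dict(), path)

def load_weights(model, path):
    # The receiving side rebuilds the architecture first, then fills
    # in the weights; map_location keeps this working without a GPU.
    model.load_state_dict(torch.load(path, map_location="cpu"))
    return model
```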
We don’t have a pretrained CIFAR-10 model, but we’d love one - or many! So if you do train one or more of those models I’d be happy to host the weights on our web site.
“Why don’t people share the serialized files?” I have no idea - it’s a huge opportunity that no-one is taking advantage of, other than a few imagenet files. There should be pretrained nets available for satellite, medical (CT, MRI, etc), microscopic (cell) images, etc, but there aren’t any…
I haven’t heard of problems with saving and loading models in general, although I know that if you train on multiple GPUs you can’t load on a single CPU, and vice versa.
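One commonly reported cause of that particular mismatch is the “module.” prefix that `nn.DataParallel` adds to every key in the state dict; stripping it is usually enough to load a multi-GPU checkpoint into a plain single-device model. A sketch of the key renaming (the function name is made up):

```python
def strip_data_parallel_prefix(state_dict):
    """Checkpoints saved from an nn.DataParallel-wrapped model prefix
    every key with 'module.'; strip it so the weights load into an
    unwrapped model."""
    return {k[len("module."):] if k.startswith("module.") else k: v
            for k, v in state_dict.items()}
```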
@jeremy I modified the dogs & cats notebook to accomplish this, along with the “Starter Kit”. Thanks @timlee
I think TTA is essential in this challenge.
But we need to take rotation of the images into consideration, not just the 90 or 180 degree rotations that TTA does with the standard 4 options.
We do up to 10-degree rotations as well. Although apparently the iceberg dataset doesn’t play nicely with standard augmentations (according to folks on the Kaggle forum).
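The averaging that TTA does can be sketched generically; for this dataset you would pass only small-rotation augmentations, not flips. This is just the idea behind fastai’s `learn.TTA()`, not its implementation, and `predict_fn` and the helper name are placeholders:

```python
import numpy as np

def tta_predict(predict_fn, image, augmentations):
    """Average a model's probabilities over several augmented copies
    of the image plus the original -- the core idea of TTA."""
    views = [image] + [aug(image) for aug in augmentations]
    preds = [predict_fn(v) for v in views]
    return float(np.mean(preds))
```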