@rameshsingh thanks. I will summarize what we learn from the different experiments we are running in a blog post once they are completed. I will share the notebook after some cleanup, as it is too messy to share in the forum right now.
With 39 zucchinis and 47 cucumbers …
a ResNet50 with input size 299 managed to perfectly distinguish between the two on the validation set (10% of the above numbers) after 2 epochs:
I know we must be careful when interpreting results on a validation set with 13 samples, but I had a limited number of images and wanted to share the results in any case. The fact that training loss >> validation loss is probably an indication that my validation set is not difficult enough.
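To put a number on how careful one should be with 13 samples, here is a minimal sketch (not part of the original notebook) of the exact one-sided binomial bound for the case where zero validation errors were observed. It assumes the validation samples are independent and representative; the function name `error_upper_bound` is made up for illustration.

```python
def error_upper_bound(n_samples: int, confidence: float = 0.95) -> float:
    """Upper bound on the true error rate when 0 of n_samples were misclassified.

    If the true error rate is p, the chance of seeing zero errors is (1 - p) ** n.
    Solving (1 - p) ** n = 1 - confidence for p gives the bound below.
    Assumption: samples are independent draws from the target distribution.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_samples)


# 13 validation samples, all classified correctly
bound = error_upper_bound(13)
print(f"95% upper bound on the error rate: {bound:.1%}")
```

So a perfect score on 13 samples is still compatible with a true error rate of roughly one in five, which is why a larger validation set would be more convincing.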
I pulled data from Kaggle - https://www.kaggle.com/slothkong/10-monkey-species
10 species of monkeys with about 100 training images for each. I didn’t see any need to do the fine tuning section with this data because how do you get better than pretty much perfect right from the start? Amazing. Gonna find some other data and go again.
Thanks for your reply Ethan. I will look into that.
Hello everyone, I found this dataset for classifying fruit photos on Kaggle using Google Dataset Search. So I tried my hand at it and got to a 0.5% error rate within 4 epochs. I used resnet34 as the architecture.
Here are the images
The only images my model misclassifies are the same things appearing under different labels.
Hi Radek, how do you approach the memory problem with fastai for this competition? The fastai learner loads the whole dataset into in-memory arrays, and this dataset is too huge for that.
These are some experiments I did with the previous fastai version: Experiments on using Redis as DataSet for fast.ai for huge datasets
Hi Vitaliy - for this competition I load the data directly from HDD.
I wanted to represent for the Caribbean programmers. So I built a classifier to classify Trinidad & Tobago Masqueraders versus regular islanders.
Here is a sample of my dataset
Here is a sample of my predictions
Here is my confusion matrix
Pretty decent results for a very small dataset. Notebook will be forthcoming.
I was thinking that the data loader accepts paths and loads tensors on demand, no? Otherwise, it would be impossible to deal with any modern dataset, even a relatively small one. I remember that I had out-of-memory errors even when I trained a dog breeds classifier.
As far as I know, the PyTorch datasets API doesn’t force you to load everything into memory at once. You only need to define how to retrieve a single instance based on its index.
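A minimal sketch of that idea, written in plain Python so it runs without PyTorch installed: a map-style dataset only needs `__len__` and `__getitem__`, which is the same protocol `torch.utils.data.Dataset` subclasses implement. The class `LazyFileDataset` and the helper `make_demo_files` are hypothetical names for illustration, and the tiny binary files stand in for images.

```python
import os
import tempfile


class LazyFileDataset:
    """Map-style dataset that keeps only file paths in memory.

    A file is read from disk only when its index is requested.
    """

    def __init__(self, paths):
        self.paths = list(paths)
        self.reads = 0  # counts disk reads, only to make the laziness visible

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        self.reads += 1
        with open(self.paths[index], "rb") as f:
            return f.read()


def make_demo_files(directory, count=5):
    """Write a few tiny placeholder files and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"sample_{i}.bin")
        with open(path, "wb") as f:
            f.write(bytes([i]) * 4)
        paths.append(path)
    return paths


tmp_dir = tempfile.TemporaryDirectory()
dataset = LazyFileDataset(make_demo_files(tmp_dir.name))

third = dataset[2]  # only this one file is read from disk
print(len(dataset), dataset.reads, third)
```

With real images, `__getitem__` would decode the file and return a tensor and a label instead of raw bytes, and a data loader would call it once per item in each batch, so memory use scales with the batch size rather than with the dataset size.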
How? If you take all 50M images, you will likely run out of memory on a p2/p3.xlarge instance.
I haven’t looked into the latest fastai, but the fastai from last year loaded all the data into an in-memory ArraysDataset in the learner.precomputed call.
I’m not sure where, maybe it was with precompute=True, but in vision, fastai only loads the images a batch at a time when needed for training/validation.
I think the precompute option has been removed in this version of fastai.
You can also use an S3 bucket to store the data on AWS and use a library like boto3 to access the data from the bucket.
A small and simple spin-off from lesson 1.
One of the tasks at my last hackathon was
to recognize road signs.
As we can see, without much hassle I achieved 98% on a rather unusual dataset: black-and-white data with three classes
LD - left diagonal
RD - right diagonal
I loaded the data from CSV.
This is quite a nice project. I love the simplicity of it. Nice work.
This is cool. How big is your dataset?
I am talking about RAM.
Have you managed to train one cycle with all the images?
Row size variety
I first did it on fast.ai v2 by modifying the same dog breeds lesson
and achieved 95% at most back then.
The accuracy you actually achieved with v3 in your notebook with resnet50 is 98.05%! You forgot to remove learn.load('stage-1-50') at step.