Share your work here ✅

I have been working on the Happy Whale dataset. I used a ResNet-34 model to classify 3,000+ whale species. The classifier that I built didn't perform very well, with only 10% accuracy. From this dataset I understand there might be more I can do to increase its accuracy, like applying certain specific transformations, choosing a different model, etc. Since this is a multi-class prediction problem, I hope the course will later cover how to build multi-class prediction models in fastai, which is so much easier than building the notebook completely from scratch in PyTorch. Below is my gist for Happy Whale dataset classification:

Model predictions from a human perspective

This is what my model predicts, separated by actual labels. These predictions seem pretty reasonable to me. Even I would have predicted the same. Not sure what to expect from the model in terms of scope for improvement. Funny :thinking:
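For context, a minimal sketch of the kind of fastai setup described above (fastai v1, as taught in this course); the paths and train.csv layout are assumptions based on the Kaggle Humpback Whale data, not taken from the gist:

from fastai.vision import *

path = Path('data/whale')  # hypothetical data location
data = ImageDataBunch.from_csv(
    path, folder='train', csv_labels='train.csv',  # assumed Image,Id columns
    ds_tfms=get_transforms(), size=224,
).normalize(imagenet_stats)

learn = create_cnn(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)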


3 Likes

We created an Italian food classifier.
Buon Appetito! A fast.ai spin on Italian food by Francesco Gianferrari Pini https://link.medium.com/NAM4XPwumR

3 Likes

Does this mean that if we set size = 224 it will add extra zero padding, or do reflection along the edges, to make the sizes match when the original image size is 48 × 48?
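For reference, in fastai v1 this behaviour is controlled by the padding_mode argument when the transforms are applied; as far as I know the default is 'reflection', and 'zeros' pads with black instead. A hedged sketch (the path is an assumption):

data = ImageDataBunch.from_folder(
    path, ds_tfms=get_transforms(), size=224,
    padding_mode='reflection',  # default; alternatives include 'zeros', 'border'
).normalize(imagenet_stats)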

1 Like

I’m having a hard time training on the Flowers dataset (it has 102 classes)!
I trained a ResNet-34 and got the following bad performance:

Total time: 21:00
epoch  train_loss  valid_loss  error_rate
1      2.709790    5.458967    0.984719    (02:40)
2      2.661998    5.569498    0.984108    (02:38)
3      2.546748    5.700339    0.983496    (02:37)
4      2.363964    5.859702    0.982274    (02:37)
5      2.127904    6.052469    0.982274    (02:36)
6      1.910503    6.106037    0.983496    (02:37)
7      1.760798    6.104397    0.982274    (02:38)
8      1.696391    6.164757    0.982274    (02:35)

Then the same thing with a ResNet-50:


Total time: 30:58
epoch  train_loss  valid_loss  error_rate
1      5.143882    4.833019    0.985330    (03:08)
2      4.918839    4.791797    0.980440    (03:06)
3      4.749866    4.747269    0.976773    (03:01)
4      4.577477    4.718012    0.975550    (03:06)
5      4.449985    4.724235    0.971883    (03:04)
6      4.291201    4.759896    0.979218    (03:06)
7      4.113739    4.790433    0.975550    (03:08)
8      3.995913    4.814462    0.972494    (03:01)
9      3.880915    4.824159    0.976773    (03:05)
10     3.836075    4.842245    0.981051    (03:09)

Looking at the confusion matrix, I see an interesting pattern: it basically classifies everything into the first 8 classes (look at the wide dark blue squares).


Any ideas how I could improve the performance of the classifier?

Here is the Jupyter notebook: link.
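For reference, the confusion-matrix inspection described above can be reproduced with something like this (fastai v1; the learner name learn is an assumption):

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12, 12), dpi=60)
interp.most_confused(min_val=5)  # worst (actual, predicted, count) triples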

1 Like

(Cross posting here from kaggle discussions)

I’m trying the former, and I will share if I find out anything interesting, although most likely I won’t.

I used your starter code, bumped the data up to 5%, and then decided to go ahead and train it on the complete dataset.

Fair warning if anyone else wants to try this approach: it took me about 2 days to extract the images, and it will take a lot more to train on the complete data. But I did make the mistake of joining a competition that’s currently running (against @radek’s advice to join a fresh comp), so my money is on this idea.

2 Likes

Have you tried increasing the number of epochs? Your validation loss is still going down so it may be that you need more training time.

That’s interesting. I did the following and ended up with the same pattern:


I created my own chicken dataset using google_images_download.
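A hedged sketch of that download step; the googleimagesdownload class and its download() call are the package’s API as far as I know, but the search terms and limits here are invented:

from google_images_download import google_images_download

downloader = google_images_download.googleimagesdownload()
downloader.download({
    'keywords': 'Brahma hen,Brahma rooster',  # hypothetical search terms
    'limit': 100,                             # images per keyword
    'output_directory': 'data/chickens',
})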

I trained the dataset without unfreezing; it looks quite promising. Only the difference between male and female of each chicken type is difficult (also due to a lot of noise in the dataset, I think).

Then I unfroze and trained all layers (with ‘learn.fit_one_cycle(2, max_lr=slice(1e-3,1e-1))’)
and ended up with the following pattern (as if one chicken type is quite generic):

With ‘max_lr=slice(1e-6,1e-3)’ instead, it does improve, so 1e-3 seems too high in this case. Can we then state that too high a learning rate will create an over-generalised model? And that your learning rate is probably too high?

The learning rate plot looks like this:
[learning-rate plot]
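For reference, a sketch of the unfreeze-and-retrain step discussed above (fastai v1; learn is assumed to be the already-trained, frozen learner):

learn.unfreeze()       # make all layers trainable
learn.lr_find()        # sweep learning rates
learn.recorder.plot()  # pick a max_lr well before the loss blows up
learn.fit_one_cycle(2, max_lr=slice(1e-6, 1e-3))  # the range that worked here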

Hey @bachir - not sure about this, but I’m looking through your notebook and it seems to me that your “labels” list and your “fnames” list aren’t in the same order.

I say that because the first 250 labels are ‘77’, while the first flower images from your fnames list are definitely not the same flower. You might be training with random labels.
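As a hypothetical check, you could print a few (filename, label) pairs side by side and eyeball whether they correspond (fnames and labels are the lists from your notebook):

for fname, label in zip(fnames[:10], labels[:10]):
    print(fname, '->', label)  # these pairs should visibly match up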

1 Like

Hey everyone. I made an image classifier model that can tell the difference between the 2 types of buses used in Panama City: the classic type, called the Diablo Rojo, which is being phased out, and the modern ones, called the Metrobus. The accuracy of the model on a validation set was 98.2%. As you can see, it’s pretty easy to tell the difference.

Using an algorithm to download the images was a real pain. It took longer than anything else. I’m glad that’s over with. Here is the model.

6 Likes

Hi @Mauro

Nice buses ;-). To help with the cumbersome image download, I wrote a small package…
You just specify your search terms and it’ll pull the images from multiple search engines…

Details are here if you are interested:
https://forums.fast.ai/t/small-tool-to-build-image-dataset-fastclass/28281/2

6 Likes

Hi,

I trained a model with street pictures from SF, NYC, Tokyo, and Paris. The error rate is pretty high (32%) but looking at what it gets correct and wrong is pretty interesting – it thinks SF’s Chinatown is Tokyo and it seems to associate Paris with beige.

Top loss:

Top correct:

Notebook here.

Any suggestions to improve welcome!!

5 Likes

This is an interesting case, and the training is going well: your error rate and losses are improving steadily. You will probably be able to improve the result a lot by using the training parameters that Jeremy will present later, probably in the next session.

It could also be interesting to aggregate the species so that you have fewer classes, i.e. use the class Warbler for Prothonotary_Warbler, Swainson_Warbler, Tennessee_Warbler, etc. You could do that by writing a regular expression or a string search to locate the class name Warbler after the “_”, as in the sketch below.
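A small sketch of that aggregation; the helper name family_label is made up for illustration:

import re

def family_label(name):
    # keep only the token after the last underscore, e.g.
    # 'Prothonotary_Warbler' -> 'Warbler'
    m = re.search(r'_([^_]+)$', name)
    return m.group(1) if m else name

assert family_label('Prothonotary_Warbler') == 'Warbler'
assert family_label('Swainson_Warbler') == 'Warbler'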

1 Like

For a rectangular image input such as (256, 512), is it better to resize to (256, 256) or to (512, 512)?

1 Like

Hi - I trained a model to classify 4 kinds/brands of tea cups (Royal Albert, Paragon, Aynsley, and Shelley). The error rate is 32%, which is pretty good considering there are 4 roughly balanced classes and they are all very similar!

Top loss:

Here’s the notebook if you have any suggestions for improvement https://github.com/LWinger/fastai-part-1/blob/master/assignments/lesson1-teacups.ipynb

4 Likes

Recognising emotions. 6% error using resnet50. Notebook here.


9 Likes

Hi Everyone,
Here I worked on the Dog Breeds dataset from Kaggle, which has 120 breeds, and used ResNet-34 and ResNet-50 for classification. I got an error rate of 7% with just 2 training cycles.

Here is my first public gist.

I’d ask everyone to go through it and let me know your thoughts and suggestions. I think there are improvements possible.

4 Likes

Applied classification to the Devanagari script. With ResNet-18 and a few epochs: 99.3% accuracy.

2 Likes

Interesting! I used the same dataset last year and got lower accuracy :frowning:
This classification is very interesting and useful in the food processing industry, where fruit and vegetable classification can be automated. I heard a while back that there are startups working for Reliance Fresh in this domain.

See here… https://github.com/nik-hil/ai-noteook/blob/master/FruitsClasificationKaggle-1.ipynb

1 Like

Hello!

I trained a model with images of different tourist spots in Tokyo (where I currently reside) and got it to predict with a 10.1% error rate. Now when my friends come to visit, my model can tell them what’s what :stuck_out_tongue:

Here’s the top loss:

interp.most_confused(min_val=2)
[('shinjuku', 'roppongi_hills', 3)]

It’s mostly confused between Shinjuku and Roppongi Hills, understandably so, as they are both urban areas and images of both have tall skyscrapers.

5 Likes