Share your work here ✅

I worked on the Fashion MNIST dataset and reached close to 91% accuracy on the test set provided with the dataset. I trained my model in a Google Colab notebook.

I used resnet34 and then switched to resnet50. Then I probably overfitted it with resnet101.

idx2class = {
    0: 'T-shirt/top',
    1: 'Trouser',
    2: 'Pullover',
    3: 'Dress',
    4: 'Coat',
    5: 'Sandal',
    6: 'Shirt',
    7: 'Sneaker',
    8: 'Bag',
    9: 'Ankle boot'
}

There is a lot of scope for improvement. Please suggest changes and tell me how I can improve my notebook.


A couple of past epochs.

epoch    train_loss  valid_loss  error_rate  time
1        0.341232    0.255859    0.095961    (04:58)
2        0.324185    0.252643    0.096382    (04:58)
3        0.328457    0.240561    0.088709    (04:58)
4        0.297743    0.237131    0.089215    (05:10)
5        0.301143    0.236496    0.089468    (05:19)
6        0.303664    0.236342    0.089384    (04:58)
7        0.295196    0.233819    0.088034    (05:02)
8        0.282361    0.236206    0.087191    (04:57)
9        0.289749    0.234244    0.087781    (05:11)
10       0.298668    0.234326    0.088878    (05:07)
11       0.293974    0.236059    0.089299    (05:17)
12       0.286787    0.232928    0.085589    (05:06)
13       0.295403    0.232092    0.088119    (05:14)
14       0.284026    0.232384    0.086685    (05:07)
15       0.300362    0.232536    0.087697    (05:18)
16       0.283786    0.231558    0.086179    (05:10)

Isn’t this a clear sign that I am overfitting? My error rate is wiggling between 0.085 and 0.089 even though the train loss fell from 0.34 to 0.28 and the valid loss from 0.2558 to 0.2316.
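
One way to read the log above is to compare how far each loss fell and which epoch had the lowest error rate. The snippet below is only a reading aid built from the numbers in the table; `summarize` is a hypothetical helper, not part of fastai.

```python
# (epoch, train_loss, valid_loss, error_rate), copied from the log above
log = [
    (1, 0.341232, 0.255859, 0.095961),
    (2, 0.324185, 0.252643, 0.096382),
    (3, 0.328457, 0.240561, 0.088709),
    (4, 0.297743, 0.237131, 0.089215),
    (5, 0.301143, 0.236496, 0.089468),
    (6, 0.303664, 0.236342, 0.089384),
    (7, 0.295196, 0.233819, 0.088034),
    (8, 0.282361, 0.236206, 0.087191),
    (9, 0.289749, 0.234244, 0.087781),
    (10, 0.298668, 0.234326, 0.088878),
    (11, 0.293974, 0.236059, 0.089299),
    (12, 0.286787, 0.232928, 0.085589),
    (13, 0.295403, 0.232092, 0.088119),
    (14, 0.284026, 0.232384, 0.086685),
    (15, 0.300362, 0.232536, 0.087697),
    (16, 0.283786, 0.231558, 0.086179),
]


def summarize(log):
    """Summarize loss movement and the best error rate (hypothetical helper)."""
    first, last = log[0], log[-1]
    best = min(log, key=lambda row: row[3])  # row with the lowest error_rate
    return {
        "train_loss_drop": first[1] - last[1],
        "valid_loss_drop": first[2] - last[2],
        "final_gap": last[1] - last[2],  # train loss minus valid loss
        "best_epoch": best[0],
        "best_error_rate": best[3],
    }


for key, value in summarize(log).items():
    print(key, round(value, 6))
```

On these numbers the train loss is still above the valid loss in every epoch, and from epoch 3 on the error rate moves by less than half a percentage point.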

Ok… I did a complete refactor to include the user interface from the FileDeleter we learned about tonight. Now it’s a super clean interface for finding duplicate or near duplicate images as well as garbage images using intermediate representations of a pretrained network.
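
For readers wondering how near-duplicates can be found from intermediate representations, one common approach is to compare the activation vectors of two images with cosine similarity and flag pairs above a threshold. The sketch below assumes the feature vectors have already been extracted; the file names, the vectors, and the 0.95 threshold are all made up for illustration, and this is not necessarily how the project above does it.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def near_duplicates(features, threshold=0.95):
    """Return every pair of names whose vectors are at least `threshold` similar."""
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(features.items(), 2):
        if cosine_similarity(vec_a, vec_b) >= threshold:
            pairs.append((name_a, name_b))
    return pairs


# Toy 3-dimensional "activations"; real ones would have hundreds of values.
features = {
    "cat_1.jpg": [0.9, 0.1, 0.3],
    "cat_1_copy.jpg": [0.88, 0.12, 0.31],
    "truck.jpg": [0.1, 0.95, 0.2],
}
pairs = near_duplicates(features)
print(pairs)
```

Comparing every pair is quadratic in the number of images, so this form only suits small collections.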


Very nice work, thank you very much! It will be very helpful.

Any insight on why it took Jeremy as long as it did to get it running tonight?

Hi, I will be working with mammographies (the CBIS-DDSM dataset). For now I have extracted a subset of x-ray tiles in order to classify them as healthy or malignant tissue. I have converted the x-rays to 16-bit PNG using pydicom and created a modified open_image in order to read the 16-bit PNG files. Furthermore, to work with pretrained networks, I have created a small function to convert the input layer of a resnet to accept 1-channel input.

I plan to use this dataset for data augmentation and segmentation throughout the course and combine it with some of the wonderful ideas that have come up in the first lessons. There are lots of challenges with this dataset :)
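
The conversion function itself is not shown in this post. As a rough sketch of one common way to do it, the first convolution can be replaced by a 1-channel convolution whose kernels are the sum of the pretrained RGB kernels, so a grayscale tile gives the same activations as the same tile copied into three channels. `to_single_channel` is a hypothetical name, the layer below is a randomly initialised stand-in for a ResNet's first layer, and the sketch assumes an ungrouped convolution with default zero padding.

```python
import torch
import torch.nn as nn


def to_single_channel(conv: nn.Conv2d) -> nn.Conv2d:
    """Return a copy of `conv` that accepts 1-channel input (hypothetical helper)."""
    new_conv = nn.Conv2d(
        in_channels=1,
        out_channels=conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Collapse the input-channel axis: (out, 3, kh, kw) -> (out, 1, kh, kw)
        new_conv.weight.copy_(conv.weight.sum(dim=1, keepdim=True))
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in for a ResNet's first layer: 64 filters, 7x7, stride 2, no bias.
first_layer = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
gray_layer = to_single_channel(first_layer)

tile = torch.rand(2, 1, 224, 224)  # two random 1-channel tiles
out = gray_layer(tile)
print(out.shape)
```

Averaging the kernels instead of summing them is another option; it scales the activations down by a factor of three compared with feeding the tile as three identical channels.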

Here is the notebook:


Hey guys, I decided to work with the Architectural Heritage Elements image dataset, and I think I have achieved something better than SOTA for it. The authors claim 93.19% whereas I achieve 97.155% with ResNet50 after some fine-tuning. I think there is a lot of further potential for fine-tuning.

Link to the paper I am currently comparing my results to. I am not sure whether there are any further papers improving on their results.

Paper introducing the dataset and accuracies. I am working with 128x128.

My model and accuracies:

After some fine-tuning:

The model makes logical mistakes: the things it gets wrong are errors that make sense.

confusion matrix plot

cc @jeremy


Hey, I went through your notebook but didn’t exactly get how you passed in the 3 crops. Did you pass in all three crops and then take the most frequent result, or concatenate the images in some manner? It seems to me like you still passed in one image at a time. (Please also point to where in the code you passed in the different crops.)

Val loss lower than train loss means you are under-fitting. Val loss should always be higher than train loss when you are finished fitting.

edit: fix over->under


Can you tell us what you changed to make it more accurate?

I believe there is a typo here. You said today that ‘when train loss > val loss means you have not fitted enough’. I think what you mean here is: ‘Val loss lower than train loss means you are underfitting’.

I cropped them locally and created a separate dataset for that. See:

(Check the filename: fastai-vehicles-crops.tgz)

Read the end of this blog post on how I cropped those images with ImageMagick.

Used a mel spectrogram


I am asking how you passed these multiple squares of one image into the CNN to make the classification. I read your blog but this is not mentioned.

They all belong to a single category. That is enough.
No need to relate it to a given image.

Great work and insights, Alex! It might be a good idea to start a new thread on the topic of working with huge datasets in fastai v1.

After listening again to the part2 v2 lectures, I realize that what I’m trying to do may be done better with the U-Net architecture. It gives per-pixel classification while using multiple levels of detail to generate the result. The ground truth would be trivial: all pixels are the same class (the artist of that painting). I’ll get back to this project later.

How can we apply 10-fold in fastai?
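
Independent of any fastai API, the index bookkeeping for 10-fold cross-validation can be sketched in plain Python: shuffle the item indices once, deal them into k folds, and use each fold in turn as the validation set. `k_fold_indices` is a hypothetical helper, and how the resulting index lists are handed to the library is left open here.

```python
import random


def k_fold_indices(n_items, k=10, seed=42):
    """Yield (train_idx, valid_idx) index lists for each of the k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed so splits are repeatable
    folds = [indices[i::k] for i in range(k)]  # deal indices round-robin
    for i in range(k):
        valid_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, valid_idx


splits = list(k_fold_indices(100, k=10))
for fold_number, (train_idx, valid_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(valid_idx))
```

A fresh model would be trained on each split and the k validation scores averaged at the end.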


This is neat! :clap:

I am also a bit confused now. In the

This is counter-intuitive. When we say val loss > train loss, it means my model did well on training (low train loss) but performed worse on validation (high val loss). It learned the training data well but is not generalizing very well to the validation set, so it is “overfitting”.
On the other hand, when we say val loss < train loss, it means I am doing well on validation but not so well on training, so I have scope for improvement. I am “underfitting”.

Am I missing something? (Sorry to tag you directly @jeremy)


Many apologies - not enough sleep and I didn’t notice I’d typed the opposite of what I meant! Fixed my post now, and removed most of the replies of people that I confused in the process, so as to avoid confusing people even more… :blush: