Share your work here ✅

negative. had to switch to the On Demand plan.

Take a look at create_fold_directory

It sets up 10 directories each with train/ and valid/
For example,
1/ has train dir with 2-10 fold images and valid dir with 1 fold images
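For anyone following along, here is a rough sketch of what a layout like that could look like in code. This is not the actual create_fold_directory; `make_fold_dirs`, the `fold1` … `fold10` source folders and the file names are all assumptions made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def make_fold_dirs(src: Path, dst: Path, n_folds: int = 10) -> None:
    """For each fold k, copy fold k into dst/k/valid and all other folds into dst/k/train.

    Assumes src contains fold1 ... fold{n_folds} (hypothetical names) and that
    file names are unique across folds.
    """
    for k in range(1, n_folds + 1):
        for part in ("train", "valid"):
            (dst / str(k) / part).mkdir(parents=True, exist_ok=True)
        for j in range(1, n_folds + 1):
            part = "valid" if j == k else "train"
            for img in sorted((src / f"fold{j}").iterdir()):
                shutil.copy(img, dst / str(k) / part / img.name)


# Tiny demo with one empty placeholder file per fold.
root = Path(tempfile.mkdtemp())
src, dst = root / "src", root / "cv"
for j in range(1, 11):
    (src / f"fold{j}").mkdir(parents=True)
    (src / f"fold{j}" / f"clip_{j}.png").write_bytes(b"")

make_fold_dirs(src, dst)
valid_1 = sorted(p.name for p in (dst / "1" / "valid").iterdir())
train_1 = sorted(p.name for p in (dst / "1" / "train").iterdir())
print(valid_1, len(train_1))
```

Copying keeps the original folds untouched; symlinks would save disk space if the dataset is large.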

ah, got it :+1: (didn’t realize you moved around the folds beforehand)

I really like your project. Just a hint for an easy way to get more data: you mention that you have 200 secs of audio and take 5 sec windows, giving you 40 examples. Something that is often useful with time series/sequence data is to let the windows overlap. So slide your window by moving it e.g. 1 sec instead of 5, and you immediately have a lot more images to train on. Each image will be slightly different from the next, which is similar to e.g. rotating a photo by a tiny amount, so this is a kind of data augmentation for your case. Try it and see what kind of error rate you can achieve with that!
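To make the numbers concrete, here is a small sketch of the window arithmetic. `window_starts` is just an illustrative helper, not something from the notebook, and it assumes whole-second lengths.

```python
def window_starts(total_secs: int, window_secs: int, hop_secs: int) -> list:
    """Start times (in seconds) of every full window that fits in the clip."""
    n = (total_secs - window_secs) // hop_secs + 1
    return [i * hop_secs for i in range(max(n, 0))]


no_overlap = window_starts(200, 5, 5)    # hop == window length
with_overlap = window_starts(200, 5, 1)  # hop of 1 sec, so 4 sec of overlap

print(len(no_overlap))    # 40
print(len(with_overlap))  # 196
print(with_overlap[:3], with_overlap[-1])  # [0, 1, 2] 195
```

One thing to watch: neighbouring windows share most of their audio, so it is probably safest to keep all windows from one recording on the same side of the train/valid split.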


My first project was a detector for common stuff you lose. I decided on keys, wallet, credit card, remote control and sunglasses. For this initial work I downloaded pictures automatically using @cwerner's tool. That quickly gave me a super good error rate, but the current dataset is too easy.

I guess this upcoming session we will have image segmentation. I do not yet know how to build a good dataset for what I am doing right now, but I guess there it will get interesting ^_^…


Very interesting @r2d2. Based on your code I played around with the hook.callback and found that we can extract the activations of the last layer in the way below (which uses hook_output, as sgugger suggested).

    last_layer = flatten_model(learn.model)[-3]
    hook = hook_output(last_layer)
    n_valid = len(data.valid_ds.ds.y)
    for i in range(n_valid):
        img,label = data.valid_dl.dl.dataset[i]
        img = apply_tfms(, img, **
        ds = TensorDataset([None], torch.zeros(1))
        dl = DeviceDataLoader.create(ds, bs=1, shuffle=False,,,
        pred = learn.model(dl.one_batch()[0])
        if i % 1000 == 0:
            print(f'{i/n_valid*100:.2f}% ready')
        if i == 0:
            acts = hook.stored
        else:
            acts = torch.cat((acts, hook.stored), dim=0)

I can’t find Image.predict anymore. With that function, the code would be more compact. I don’t know how to use HookCallback yet :D. We want to save the activations for the validation set, so I’m not sure whether we can add a callback after the learner has already been trained. I need to read more of the source code.

p/s: I guess it is useful for you too @MicPie :smiley:


Ah cool. I’ll be waiting for that.

Sports action classifier: article here and gist. 92.5% accuracy!

This is awesome.

Do you happen to have your notebook on github or other public repo so we can see the whole thing?

Because I can, I made a Baldwin brothers classifier. Some weirdness, but it was 87.5% to 90% accurate when I ran it.


I worked on the Fashion MNIST dataset and got close to 91% accuracy on the test set provided with the dataset. I trained my model in a Google Colab notebook.

I used resnet34 and then switched to resnet50. Then I probably overfitted it with resnet101.

idx2class = {
    0: 'T-shirt/top',
    1: 'Trouser',
    2: 'Pullover',
    3: 'Dress',
    4: 'Coat',
    5: 'Sandal',
    6: 'Shirt',
    7: 'Sneaker',
    8: 'Bag',
    9: 'Ankle boot'
}

There is a lot of scope for improvement. Please suggest changes and tell me how I can improve my notebook.


A couple of past epochs.

epoch    train_loss  valid_loss  error_rate
1        0.341232    0.255859    0.095961    (04:58)
2        0.324185    0.252643    0.096382    (04:58)
3        0.328457    0.240561    0.088709    (04:58)
4        0.297743    0.237131    0.089215    (05:10)
5        0.301143    0.236496    0.089468    (05:19)
6        0.303664    0.236342    0.089384    (04:58)
7        0.295196    0.233819    0.088034    (05:02)
8        0.282361    0.236206    0.087191    (04:57)
9        0.289749    0.234244    0.087781    (05:11)
10       0.298668    0.234326    0.088878    (05:07)
11       0.293974    0.236059    0.089299    (05:17)
12       0.286787    0.232928    0.085589    (05:06)
13       0.295403    0.232092    0.088119    (05:14)
14       0.284026    0.232384    0.086685    (05:07)
15       0.300362    0.232536    0.087697    (05:18)
16       0.283786    0.231558    0.086179    (05:10)

Isn’t this a clear sign that I am overfitting? My error rate is wiggling between 0.085 and 0.089 even though train loss went down from 0.34 to 0.28 and valid loss from 0.2558 to 0.231.
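One quick way to look at a log like this is to compare the two losses epoch by epoch instead of eyeballing the error rate. The sketch below only re-types the numbers from the table above; the variable names are made up, and it is a sanity check rather than a verdict on the model.

```python
# Values copied from the 16 logged epochs above.
train_loss = [0.341232, 0.324185, 0.328457, 0.297743, 0.301143, 0.303664,
              0.295196, 0.282361, 0.289749, 0.298668, 0.293974, 0.286787,
              0.295403, 0.284026, 0.300362, 0.283786]
valid_loss = [0.255859, 0.252643, 0.240561, 0.237131, 0.236496, 0.236342,
              0.233819, 0.236206, 0.234244, 0.234326, 0.236059, 0.232928,
              0.232092, 0.232384, 0.232536, 0.231558]
error_rate = [0.095961, 0.096382, 0.088709, 0.089215, 0.089468, 0.089384,
              0.088034, 0.087191, 0.087781, 0.088878, 0.089299, 0.085589,
              0.088119, 0.086685, 0.087697, 0.086179]

# Is train loss above valid loss in every epoch?
train_above_valid = all(t > v for t, v in zip(train_loss, valid_loss))

# Epochs are 1-indexed in the log.
best_error_epoch = error_rate.index(min(error_rate)) + 1
best_valid_epoch = valid_loss.index(min(valid_loss)) + 1

print(train_above_valid)   # True
print(best_error_epoch)    # 12
print(best_valid_epoch)    # 16
```

Train loss stays above valid loss in all 16 epochs, and the lowest valid loss is in the last one. By the rule of thumb quoted further down in this thread, that points away from overfitting.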

Ok… I did a complete refactor to include the user interface from the FileDeleter we learned about tonight. Now it’s a super clean interface for finding duplicate or near duplicate images as well as garbage images using intermediate representations of a pretrained network.


Very nice work, thank you very much! It will be very helpful.

Any insight into why it took Jeremy as long as it did to get it running tonight?

Hi, I will be working with mammograms (the CBIS-DDSM dataset). For now I have extracted a subset of x-ray tiles in order to classify them as healthy or malignant tissue. I converted the x-rays to 16-bit PNG using pydicom and created a modified open_image to read the 16-bit PNG files. Furthermore, to work with pretrained networks, I created a small function that converts the input layer of a resnet to accept 1-channel input.

I plan to use this dataset for data augmentation and segmentation throughout the course and combine it with some of the wonderful ideas that have come up in the first lessons. There are lots of challenges with this dataset :)
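For anyone curious what such a conversion can look like, here is a minimal sketch in plain PyTorch. It is not the function from the notebook: `to_single_channel` is a made-up name, and summing the pretrained RGB filters is only one common choice (averaging is another). The demo layer copies the shape of a ResNet stem (7x7, 64 filters, stride 2) but uses random weights, so nothing is downloaded.

```python
import torch
import torch.nn as nn


def to_single_channel(conv: nn.Conv2d) -> nn.Conv2d:
    """Return a 1-channel copy of a multi-channel conv by summing its filters over input channels."""
    new_conv = nn.Conv2d(
        1, conv.out_channels,
        kernel_size=conv.kernel_size, stride=conv.stride,
        padding=conv.padding, dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Weight shape goes from (out, in, kh, kw) to (out, 1, kh, kw).
        new_conv.weight.copy_(conv.weight.sum(dim=1, keepdim=True))
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


torch.manual_seed(0)
rgb_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
gray_conv = to_single_channel(rgb_conv)

x = torch.randn(2, 1, 32, 32)  # stand-in for a batch of grayscale tiles
with torch.no_grad():
    out_gray = gray_conv(x)
    out_rgb = rgb_conv(x.repeat(1, 3, 1, 1))  # same tiles copied into 3 channels

print(out_gray.shape)
```

With summed filters, the 1-channel layer gives the same output as feeding the grayscale tile copied into all three channels of the original layer, so the pretrained features are kept while the input tensor is a third of the size.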

Here is the notebook:


Hey guys, I decided to work with the Architectural Heritage Elements image dataset, and I think I have achieved something better than SOTA for it. The authors claim 93.19%, whereas I achieve 97.155% with ResNet50 after some fine-tuning. I think there is a lot of further potential for fine-tuning.

Link to the paper I am currently comparing my results to. I am not sure whether there are any further papers improving on their results.

Paper introducing the dataset and accuracies. I am working with 128x128.

My model and accuracies:

After some fine-tuning

The model makes understandable mistakes: the things it gets wrong are confusions that make sense.

confusion matrix plot

cc @jeremy


Hey, I went through your notebook but didn’t quite get how you passed in the 3 crops. Did you pass in all three crops and then take the most frequent result, or did you concatenate the images in some manner? It seems to me like you still passed in one image at a time. (Please also point to where in the code you passed in the different crops.)

Val loss lower than train loss means you are under-fitting. Val loss should always be higher than train loss when you are finished fitting.

edit: fix over->under


Can you tell us what you changed to make it more accurate?

I believe there is a typo here. You said today that ‘when train loss > val loss means you have not fitted enough’. I think what you mean here is: ‘Val loss lower than train loss means you are underfitting’.