Share your work here ✅

THNX @radek, looking at your code I’ve figured out how to make TTA predictions with v1 :wink:

preds = learn.TTA(is_test=True)[0]

For an old kaggle competition:

Using mostly the default values from lesson 1, I got very good results on the validation set (error_rate = 0.030588):

Late-submitting my predictions, I got a private score of 0.359, around 240th out of 1440.

I’ll clean up and share my notebook ASAP!


I thought of trying out something simple, to get used to the new fastai library. So, I built a model on the DHCD dataset.

It’s a dataset similar to MNIST, but for Devanagari(देवनागरी) characters instead.

The dataset is not as well known or used, so it was fun to try out, and I got pretty good results (error rate: 1.49% | accuracy: 98.51%) for less than an hour of tinkering.

Also, these top losses seem legit to me :joy:


Hi All,

I put together a Medium blog post. It’s a draft for now and I haven’t published it yet; I want to run it by all of you to hear your opinions and feedback: issues with the article, accuracy metrics, grammar, typos, anything that comes to mind.

I am planning to publish this by Saturday, Eastern Time (US), so any feedback before then is totally appreciated.

Huge thanks to @jeremy for making this highly valuable education free for everyone, compared to the many other institutions I attended in the past for ML and DL.


Hi Community!

I’m seeing promising results on a 10-class audio classification task.

I am getting 76.3% accuracy with fastai and basically no effort, so that’s really cool! However, according to the publications listed on the dataset’s website, the top accuracy is 79%.

My goal is to surpass that by next week’s class, so I’m asking you all for suggestions on which avenue might be most fruitful:

  1. tune hyperparameters
  2. add audio specific data augmentation (obviously the common transformations don’t help with spectrograms)
  3. create better spectrograms which could be easier to classify
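On point 3, here is a minimal sketch of how a spectrogram can be computed from raw audio, using only NumPy and a synthetic 440 Hz tone as a stand-in for real data (most real pipelines would use librosa or torchaudio instead; the function name and parameters here are just illustrative):

```python
import numpy as np

def log_spectrogram(signal, n_fft=512, hop=128):
    """Compute a log-magnitude spectrogram with a Hann window."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    spec = np.array(frames).T            # shape: (freq_bins, time_steps)
    return np.log(spec + 1e-10)          # log scale compresses dynamic range

# 1 second of a 440 Hz tone at 16 kHz, standing in for real audio
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)
spec = log_spectrogram(signal)
print(spec.shape)  # (257, 122)
```

Tweaking `n_fft` and `hop` changes the time/frequency trade-off of the resulting image, which is one cheap way to experiment with "better spectrograms".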

Here is my notebook.

Thanks @jeremy for making this course so fun!


Hi all, really excited to see all the cool stuff people have been working on. Thanks again Jeremy, Rachel, and the rest of the community here - I’m looking forward to learning from all of you during this journey :slight_smile:

On the heels of week 1’s lesson, I decided to take a stab at my own image classification exploration between 4 different kinds of french fries (waffle, curly, shoestring, and tater tots). I wrote about it in a Medium blog post, in addition to a little intro about my general excitement. Feel free to look through it if you’re interested!

Thanks :slight_smile:


Hey everyone, I wanted to see if cultural urban characteristics could be recognized by a resnet-style architecture.

So I downloaded 4000 satellite images of the 1000 largest cities (using GMap’s static API), and labeled each with its country.

I was expecting it not to work that well, but it got to 85% accuracy with resnet34 (over 110 classes!). Here are the worst offenders:

I’m not sure whether it’s picking up individual satellite-imagery artifacts rather than the urban specifics of each country, but I’m curious to investigate this some more over the coming weeks.

The repo is here, but the notebook is essentially a copy-paste of the lesson’s code. The scraper is in there too (sorry, it’s in Go, but I’m happy to share the binary).
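For anyone curious, the per-city download step could be sketched like this in Python (the Static Maps endpoint and parameters are the documented ones, but the function name and API key are illustrative; the original scraper is in Go):

```python
from urllib.parse import urlencode

def static_map_url(lat, lng, zoom=16, size=512, api_key="YOUR_KEY"):
    """Build a Google Static Maps request URL for one satellite tile."""
    params = {
        "center": f"{lat},{lng}",
        "zoom": zoom,
        "size": f"{size}x{size}",
        "maptype": "satellite",
        "key": api_key,
    }
    return "https://maps.googleapis.com/maps/api/staticmap?" + urlencode(params)

url = static_map_url(48.8566, 2.3522)  # Paris city center
print(url)
```

Fetching each URL (e.g. with `requests.get`) and saving the bytes into per-country folders then gives you data in the layout the lesson’s `ImageDataBunch.from_folder` expects.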


This is a continuing project that I will be working on. Part one was an analysis of an American football college team taking the ball at the 25-yard line versus trying to gain more yards. This is something I wouldn’t have been able to accomplish without the help of fastai. Some of the future work I plan on doing with this data is more deep-learning related, whereas this part was more analytical. The next thing I want to do is build a model that predicts the outcome of a play, which I think will be more interesting from a technical perspective. But this is my first blog post that I actually published (I’ve had some drafts, but nothing that felt polished enough to publish).


That’s an awesome idea for data collection. It could help get a good data sample together to train on before doing transfer learning on your real images. I think that would be a useful tool. Maybe it could be a standalone class that fetches a decent number of images and puts them into train/valid folders. You could maybe even have it display each image and let you y/n it before it gets saved, so you don’t get any false positives.

GetImagesFromGoogle("puppies", activeValidation=True) :slight_smile:


Very interesting. I’d suggest focusing on one fold first, make that as good as you can, then do the full CV at the end.

Because this is so different from ImageNet, the pre-unfreezing step doesn’t get you very far. As you can see, after just one epoch the accuracy stops improving. Therefore, do just one epoch before unfreezing. Then try the LR finder. Then run as many epochs as you can before the error starts getting worse.


Hello everyone, I wanted to know how a resnet34 model would differentiate between a person playing one of two musical instruments: guitar and sitar. Both instruments look quite similar, so I thought, why not build a model to classify them?

I used 100 examples for each class, i.e. sitar and guitar.

Using the resnet34 model I got an accuracy of 94%. Here are some of my predictions:

Jupyter notebook:


I wonder if isolating specific frequencies would help? Throwing away the right-hand side of the spectrograms doesn’t look like it loses much visually.

I just finished combining @r2d2’s PCA-based feature interpretation of resnet50 trained on @hkristen’s imageCLEF (full) Plant Identification Challenge dataset. Here’s the notebook:
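For anyone who wants to reproduce the PCA step without the full notebook, here is a minimal NumPy sketch (the activations here are random stand-ins for pooled resnet50 features; real ones would come out of the trained model):

```python
import numpy as np

def pca(features, n_components=2):
    """Project feature vectors onto their top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered matrix gives principal directions in the rows of vt
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Stand-in for pooled resnet50 activations: 100 samples x 512 features
rng = np.random.default_rng(0)
acts = rng.normal(size=(100, 512))
coords = pca(acts, n_components=2)
print(coords.shape)  # (100, 2)
```

Scatter-plotting `coords` colored by class label then shows how the classes cluster in feature space.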


That’s overfitting

No it’s not. Accuracy is always shown on the validation set. See my earlier reply.


Hey everyone,

For my project I wanted to check whether a resnet model can generalize to the style of an artist. So I downloaded artworks by Van Gogh and Monet and, using resnet50, trained it to 94% accuracy in under 10 epochs.

I thought that because all the images are so different, it would be harder for the network to pick up the style (i.e. generalize), but it turns out this is not true at all.


I grabbed activations from inside the model and ran them through a t-SNE to see how different dog/cat classes would cluster.

This is from the final conv block:

Also looking at how different activation layers in a conv block extract different features.

From a ReLU layer:


Very interesting work!
I have a query: the plots show the weights of the model for a particular class, which is fine, but why do they also have so much in common?
A few are detecting edges and contours, some the dog as a whole (even the background, as if there’s an emboss of the dog in the surroundings), and a few are black, so it’s hard to see what they are actually doing.

Intriguing! How about trying to normalize with a subset of your train data rather than with ImageNet stats?
You should also definitely augment.
Let me know if you’ve tried :slight_smile:
Also, +1 on fine-tuning on just one fold first.
FYI, the link you shared is broken.
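On the normalization suggestion, computing your own stats from a sample of training images only takes a few lines. A sketch with random stand-in images (in fastai v1 you would then pass the resulting mean/std pair to `normalize` instead of `imagenet_stats`):

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std over a batch of HxWx3 images in [0, 1]."""
    batch = np.stack(images).astype(np.float64)
    mean = batch.mean(axis=(0, 1, 2))
    std = batch.std(axis=(0, 1, 2))
    return mean, std

# Stand-in for a subset of the training images
rng = np.random.default_rng(1)
sample = [rng.random((64, 64, 3)) for _ in range(16)]
mean, std = channel_stats(sample)
print(mean.round(3), std.round(3))
```

A few hundred images is usually plenty for stable estimates.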

Here’s the link to the ds:

Hi, really fun project!

  1. It looks like you are training one fold at a time instead of all in one go. Is there a good reason for reducing the size of the training data like this?

  2. I would take a look at the default transform settings and disable rotation. I cannot see how your data could benefit from rotation.

  3. Let your training run longer, say a cycle of 10.

  4. Possibly have a look at the max_lr and wd arguments in fit_one_cycle.

By the way, your visualisation could really benefit from colorcet’s linear colorscales; it’s hard to get a good linear grayscale otherwise.

@etown we are working along similar lines, i.e. basically representing audio through images and doing classification. It’s really good to have someone thinking along similar lines… we are working on data from this Kaggle competition and will share the results shortly. If you have any more learnings, please do let us know :slight_smile: