Share your work here ✅

Woohoo! Congrats, this is a superb result. Cheers! Thanks for sharing the notebook.

Hello!

I've tried to classify flowers using this dataset from Kaggle.
It was a great experience; the final accuracy is about 97.4% with the ResNet-50 architecture.
I trained incrementally, with different learning rates for different layers, and checked for an optimal learning rate before unfreezing the layers.
Looking at the most confused images, the biggest prediction errors are on images that contain no flowers at all (because the data was gathered from the internet), which I think is a good sign.
You can find the notebook here.
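For anyone curious, the training schedule looked roughly like this (a sketch from memory rather than the exact notebook code; the path, epoch counts, and learning rates are placeholders), using fastai v1's create_cnn:

```python
from fastai.vision import *

# placeholder path to the Kaggle flowers images, organised one folder per class
path = Path('data/flowers')
data = ImageDataBunch.from_folder(path, valid_pct=0.2, ds_tfms=get_transforms(),
                                  size=224).normalize(imagenet_stats)

learn = create_cnn(data, models.resnet50, metrics=accuracy)
learn.fit_one_cycle(4)                  # train only the new head first

learn.lr_find()                         # check a sensible learning-rate range
learn.recorder.plot()                   # before unfreezing

learn.unfreeze()                        # then fine-tune all layers with
learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))  # discriminative learning rates
```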

Do I understand correctly that if the training loss is greater than the validation loss, and both are still decreasing through the last epoch, but the error rate has stayed the same for the last 3 epochs, that means the model is underfitting? Or could this particular train/validation split be causing this behavior of the loss values? Of course it would be better to use a separate test set, but I decided to practice more on the topic of lesson 1.

1 Like

I was going through your notebook to understand how resnet18 was more efficient, when it caught my eye that you're using the same data for both the validation dataset and the test dataset.

This doesn't seem correct to me. If I understand this, the validation dataset and test dataset should be different, as per their definitions. The test dataset is used during the training step, and the validation set is used afterwards to judge how good the training was. (Since the test param is optional, I'm assuming fastai handles this automatically in a neat way if it's not provided.)

The idea is that, if the model were to see the validation set during training, it would then fit to the validation set directly. This could explain why you've been getting superb results in so few epochs.

However, I'll try to browse through the source to see how the test parameter is used, so take my word with a grain of salt.

Perhaps @sgugger can clarify how the test param is supposed to be used, as I haven’t seen this used in course notebooks very often.

2 Likes

It’s the other way around, actually! :slight_smile:

5 Likes

Ah, right, I forget this too easily. So, in that case, is it fine to load the same data into both validation and test?
I tried to follow the source but got a bit lost in DataBunch.
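To make sure I have the terminology straight, here is a minimal sketch of how the split looks in fastai v1, assuming a hypothetical train/valid/test folder layout (so this is just my reading, not verified against the source):

```python
from fastai.vision import *

# hypothetical layout: path/train, path/valid, path/test
path = Path('data/myset')
data = ImageDataBunch.from_folder(path, train='train', valid='valid', test='test',
                                  ds_tfms=get_transforms(), size=224).normalize(imagenet_stats)

learn = create_cnn(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(4)   # the validation set is scored after every epoch here

# in fastai v1 the test set is unlabeled and only used for inference afterwards
preds, _ = learn.get_preds(ds_type=DatasetType.Test)
```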

I was interested in doing voice recognition. I used Audacity (https://www.audacityteam.org) to trim the audio from the following clips:

  1. Ben Affleck’s speech in The Boiler Room (https://www.youtube.com/watch?v=JfIKzReNDF4&t=62s)
  2. Joe Rogan and Elon Musk Podcast (https://www.youtube.com/watch?v=Ra3fv8gl6NE)

I used 3 minutes 30 seconds of voice audio from each of Ben Affleck, Joe Rogan, and Elon Musk.

I used a 5-second sliding window to plot their spectrograms, following the tutorial outlined here: https://github.com/drammock/spectrogram-tutorial/blob/master/spectrogram.ipynb

Since there was roughly 200 seconds of audio per speaker, that gave me roughly 40 spectrogram images for each person.
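The spectrogram step itself was roughly the following (a simplified sketch rather than the tutorial's exact code; the file name and FFT settings are placeholders):

```python
import matplotlib.pyplot as plt
from scipy.io import wavfile

# placeholder file name; assumes a mono WAV exported from Audacity
rate, samples = wavfile.read('ben_affleck.wav')

window = 5 * rate                              # 5 seconds of samples per image
for i, start in enumerate(range(0, len(samples) - window, window)):
    chunk = samples[start:start + window]
    plt.figure(figsize=(4, 4))
    plt.specgram(chunk, Fs=rate, NFFT=1024, noverlap=512)   # draw the spectrogram
    plt.axis('off')                            # keep only the image itself
    plt.savefig(f'spectrograms/ben_affleck_{i:03d}.png',
                bbox_inches='tight', pad_inches=0)
    plt.close()
```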

Here is a sample of the spectrograms for each class (I am not sure why some of them are warped - the original pictures I uploaded are not warped):

Despite the warping of these pictures, I moved on anyway to see what would happen.

I trained it on Resnet34 over 4 epochs (default settings) and got roughly 60% error:

So I decided to go with Resnet50. The error rate improved to 30% over 10 epochs:

So, 30% is not quite as low as some of the other work that we’ve been seeing on here, but I’m quite pleased with the results:

The model was pretty accurate with Ben Affleck and Elon Musk, while it was still better than random guessing for Joe Rogan.

I'd love to hear your thoughts on how I can improve the model. Obviously, I could add more training data - 40 samples each is probably too low (but trimming the audio down to a single speaker is a very tedious process, and I may have run out of time for now). The warped-picture issue is also concerning - not sure why that happened.

What do you think? Otherwise, I'm pretty impressed that it did so well for Elon Musk and Ben Affleck with virtually zero tuning except adding epochs on ResNet50.

Because it did so well, I’m just convinced it will do much better on easier images :wink: Those spectrograms look very similar to the human eye!

Thanks for reading this!

36 Likes

I was talking about RAM, since disk size is not a problem (you can attach many terabytes, virtually any size). I also tried only 1% of the data, since I definitely agree with the redundancy argument. Thank you for your replies!

Hi Jeremy, thank you for your explanation.

I was probably confused by the fact that I got a memory error (not a GPU memory error), and when I took a look at the code I jumped to conclusions too quickly. Unfortunately, I lost my log files from that run, so I can't check now what was going on there.

Did you try disabling transforms?

Thanks for the suggestion - no, how do I do that? :slight_smile:

From a discussion in another thread I looked at using activations to optimize an input and ended up implementing a deep dream sort of thing.
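The basic recipe, in case anyone wants to try it, is to pick a layer of a pretrained network and take gradient-ascent steps on the input image so that layer's activations grow. A minimal plain-PyTorch sketch (not the notebook's code; the network, layer index, and step count are arbitrary choices):

```python
import torch
import torchvision.models as models

# pretrained feature extractor; layer_idx is an arbitrary intermediate layer
model = models.vgg16(pretrained=True).features.eval()
layer_idx = 20

# start from noise (a real photo works too) and optimise the pixels themselves
img = torch.rand(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.05)

for _ in range(100):
    optimizer.zero_grad()
    x = img
    for i, layer in enumerate(model):
        x = layer(x)
        if i == layer_idx:
            break
    loss = -x.norm()   # maximising the activations = minimising their negative norm
    loss.backward()
    optimizer.step()
```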

21 Likes

I used the course method for downloading images from Google to create a dataset of hotdogs, tacos, burgers, pizza, and fries. Must be time for dinner!
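For anyone who hasn't tried it, the download step looks roughly like this (a sketch from memory, assuming fastai v1's download_images/verify_images helpers and one URL file per class saved from a Google Images search):

```python
from fastai.vision import *

classes = ['hotdog', 'taco', 'burger', 'pizza', 'fries']
path = Path('data/dinner')

for c in classes:
    dest = path/c
    dest.mkdir(parents=True, exist_ok=True)
    # urls_<class>.txt is a plain text file of image URLs for that class
    download_images(path/f'urls_{c}.txt', dest, max_pics=200)
    verify_images(dest, delete=True, max_size=500)   # drop files that won't open

data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                                  ds_tfms=get_transforms(), size=224).normalize(imagenet_stats)
```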

I got 94% accuracy right off the bat, with no cleaning/pruning of the data. I can see there are some errors in the training data. Here are the top losses. I wonder why it got #1 wrong? :slight_smile:

2 Likes

I was also looking into the flower dataset but was not achieving good accuracy. Did you resize the images to 224x224 pixels?

I utilized the transfer learning method shown in Lesson 1 to train a crop leaf disease classifier using the PlantVillage dataset. I wrote a blog post about it; do check it out - https://medium.com/@aayushmnit/transfer-learning-using-the-fastai-library-d686b238213e

3 Likes

Hello guys))

I'm trying to work with the birds dataset http://www.vision.caltech.edu/visipedia/CUB-200-2011.html, which is a really fine-grained classification task (it's pretty hard to differentiate all the kinds of sparrows with the naked eye). I was wondering what the SOTA results on this data were. I found two papers claiming state-of-the-art results. The first one is from before the deep learning era (https://arxiv.org/pdf/1310.1531.pdf), and their accuracy with manual feature engineering was only 64.96%. The second paper is recent and uses deep learning (https://arxiv.org/pdf/1807.07320.pdf); with MA-CNN, their accuracy is 86.5%. This one is the current SOTA to the best of my knowledge.

With the fastai v1 library and almost no tuning, I got 76.4% accuracy with a pretrained ResNet34 and 83.2% with ResNet50. I could be overfitting a little, but still, those are great results, not only compared with the pre-deep-learning result but also compared with the current SOTA.

Here is a reference to the notebook in case someone wants to take a look (https://github.com/ademyanchuk/course-v3/blob/master/nbs/dl1/lesson-1-birds-dai.ipynb).

Thanks to the fastai team and the whole community.

4 Likes

I attended Science Hack Day in San Francisco this weekend and used fastai to help build a Twitter bot that attempts to identify if a photograph is of a Cougar or not - part of a science communication hashtag game. You can see the bot in action at https://twitter.com/critter_vision/with_replies

My machine learning model is currently pretty terrible (I only got the error rate down to 24% - I'm certain I can do a lot better than that with more work), but that's mainly because I spent most of the time figuring out how to deploy and run the resulting model as an API. I got that working, and I've just published some extensive notes on how I did that here: https://simonwillison.net/2018/Oct/29/transfer-learning/
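If anyone wants a bare-bones starting point for the "model behind an API" part, here is a generic sketch (not necessarily the approach in the notes above; it assumes Flask and a model exported beforehand with fastai v1's learn.export()):

```python
from io import BytesIO
from flask import Flask, request, jsonify
from fastai.vision import *   # provides load_learner and open_image in fastai v1

app = Flask(__name__)
learn = load_learner('models')   # folder containing the export.pkl from learn.export()

@app.route('/classify', methods=['POST'])
def classify():
    # expects a multipart form upload with an 'image' field
    img = open_image(BytesIO(request.files['image'].read()))
    pred_class, pred_idx, probs = learn.predict(img)
    return jsonify({'prediction': str(pred_class),
                    'probability': float(probs[pred_idx])})

if __name__ == '__main__':
    app.run()
```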

14 Likes

I used resnet34 instead of resnet18 and got an accuracy of 99.4203%.

I see we both used different data sources; I used the Kaggle data instead of UCI.

Maybe this is causing the difference in accuracy.
I have to see how the two datasets differ.

1 Like

I was interested in using satellite imaging to detect the amount of human presence in the Amazon forest. That could be used to find out how much of the forest is pristine, how much is in peril, and trends over time.

I found a Kaggle dataset that fit this perfectly, but it was multi-class, and I thought I could make a better classifier if I focused only on the "human vs. forest" question, so I basically grouped classes like "road, habitation, etc." into "human" and treated the rest as "forest". I was surprised at how easy it was to get something off the ground and running with a 7% error rate; here's my gist:

I would appreciate any tips on how to improve it!
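In case it helps, the regrouping step only needs a few lines of pandas. This is a sketch under assumptions (the Planet competition's train_v2.csv with image_name/tags columns, and my own guess at which tags count as "human"), not necessarily what the gist does:

```python
import pandas as pd

# tags judged to indicate human presence; everything else counts as "forest"
human_tags = {'agriculture', 'habitation', 'road', 'cultivation',
              'selective_logging', 'conventional_mine', 'artisinal_mine',
              'slash_burn', 'bare_ground'}

df = pd.read_csv('train_v2.csv')   # columns: image_name, tags (space-separated)
df['label'] = df['tags'].apply(
    lambda t: 'human' if human_tags & set(t.split()) else 'forest')
df[['image_name', 'label']].to_csv('train_binary.csv', index=False)
```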

4 Likes

Hi Radek,

Did you succeed in looking at the most_confused data for your 1% of the whole dataset?
I am getting the following error:
interp.most_confused(min_val=2000)


RuntimeError                              Traceback (most recent call last)
in ()
----> 1 interp.most_confused(min_val=2000)

~/anaconda3/lib/python3.6/site-packages/fastai/vision/learner.py in most_confused(self, min_val)
    116     def most_confused(self, min_val:int=1)->Collection[Tuple[str,str,int]]:
    117         "Sorted descending list of largest non-diagonal entries of confusion matrix"
--> 118         cm = self.confusion_matrix()
    119         np.fill_diagonal(cm, 0)
    120         res = [(self.data.classes[i],self.data.classes[j],cm[i,j])

~/anaconda3/lib/python3.6/site-packages/fastai/vision/learner.py in confusion_matrix(self)
     91         "Confusion matrix as an np.ndarray."
     92         x=torch.arange(0,self.data.c)
---> 93         cm = ((self.pred_class==x[:,None]) & (self.y_true==x[:,None,None])).sum(2)
     94         return to_np(cm)
     95

RuntimeError: $ Torch: not enough memory: you tried to allocate 64GB. Buy new RAM! at /opt/conda/conda-bld/pytorch-nightly_1540719301766/work/aten/src/TH/THGeneral.cpp:204



Congrats on your nice work. I have a question about using datasets on Colab. Does the folder that you define for Path (content/data/102flowers.mat) exist next to your notebook on Colab?