Share your work here ✅

I’m doing the course with my ten-year-old daughter. We made a classifier for deep-sky Messier Objects.

Instead of using all 110 objects, I used only the list of named objects at the bottom of this page: http://www.seasky.org/astronomy/astronomy-messier.html. This way I was getting only the most commonly photographed and distinctive ones.

However, that list only includes one globular cluster (the Great Cluster in Hercules, M13), and I really wanted to find out whether my model would be able to tell different globular clusters apart, so I added M3, M4, M15, and M80. I chose these because I think they are distinct from each other: M3 is particularly brilliant. M4 has more easily resolvable stars and a distinctive bar across an egg-shaped ring of bright stars. M15 has a particularly dense and bright core. M80 also has a dense core but tapers off much more quickly. I figured that if there are any globulars the model would be able to tell apart, it’s these.

Anyways, the first step was to gather the images from Google. I followed Lesson 2 and downloaded about 200 images for each object. I could already see from my Google search results that this wasn’t going to be easy and that I would need to do lots of cleaning: the images were a mess, with most of them quite obviously wrongly labelled. Some weren’t Messier Objects at all; some were wide sky shots or pictures of telescopes.

Because I had so many classes of images to download, instead of running the download command manually for each one, I just wrote a quick loop. I made a CSV listing the Messier Objects and used pandas to take the second column (the first column was the name, the second was the M## code) and make a list out of it to iterate over:

    import pandas as pd
    from fastai.vision import *   # provides download_images (lesson 2)

    # the CSV has no header row: column 0 is the object's name, column 1 is its M## code
    messiers_table = pd.read_csv('./images/messiers.csv', header=None)
    messiers = messiers_table[1].tolist()
    for m in messiers:
        print(m)
        # each class folder holds a 'download' file of image URLs scraped from Google
        download_images('images/' + m + '/download', 'images/' + m, max_pics=200)

I trained using the same parameters as the lesson notes: resnet34 with 20% validation data. After the first round of training, my accuracy was only 45%. YIKES! Unfreezing and retraining didn’t help much. Clearly I was going to need to clean my data.
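For anyone following along, the setup was essentially the lesson-2 recipe; a minimal sketch in fastai v1, assuming one folder per class as created by the download loop above (image size and epoch counts are just placeholders):

    from fastai.vision import *

    path = Path('images')
    data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
                                      ds_tfms=get_transforms(), size=224).normalize(imagenet_stats)
    learn = cnn_learner(data, models.resnet34, metrics=accuracy)
    learn.fit_one_cycle(4)
    # then unfreeze and continue training with a lower learning rate
    learn.unfreeze()
    learn.fit_one_cycle(2, max_lr=slice(1e-5, 1e-3))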

Soooo… I ran ImageCleaner and started cleaning… At first I was trying to relabel wrongly labelled images by actually trying to recognize what they were, but it turns out that this takes a human a bit longer than recognizing black bears versus teddy bears. It was taking me several minutes per page… After doing this for a couple of hours, and several hundred images later, I decided to just delete images that were clearly not Messiers and do the relabelling later. My dataset was over 5,000 images, I had only cleaned a few hundred, and the quality wasn’t getting much better as I worked through the top-losses list.
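The cleaning step itself is lesson 2’s widget workflow; roughly this, assuming a regular Jupyter environment (not Colab), the learner from above, and the same `path`:

    from fastai.widgets import DatasetFormatter, ImageCleaner

    # pull the highest-loss images so you can delete or relabel them in the widget
    ds, idxs = DatasetFormatter().from_toplosses(learn)
    ImageCleaner(ds, idxs, path)
    # the widget records its decisions in cleaned.csv, which you use for retraining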

So, I retrained the model on the slightly cleaned dataset and got the accuracy up to about 50% - not much of an improvement.

I tried a few more iterations of cleaning, retraining, recleaning, and so on, and I am now at an accuracy of about 65%. Much better, but still not good enough. At least now the top losses are a lot more sensible:

[top losses images]

Here are some of my most confused classes. At least they are sensible:

    [('M76', 'M27', 7),
     ('M15', 'M13', 6),
     ('M3', 'M13', 6),
     ('M11', 'M24', 5),
     ('M81', 'M82', 5),
     ('M82', 'M81', 5),
     ('M24', 'M11', 4),
     ('M24', 'M6', 4),
     ('M44', 'M11', 4),
     ('M6', 'M11', 4),
     ('M6', 'M24', 4)]
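For reference, output in this shape, i.e. (actual, predicted, count) tuples, is what fastai’s interpretation object produces; a minimal sketch, assuming the trained learner is called `learn`:

    interp = ClassificationInterpretation.from_learner(learn)
    interp.most_confused(min_val=4)   # only show pairs confused at least 4 times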

The lesson for me was that it’s really hard to get a nice, clean, labelled dataset to start with. Google Images is a quick way to get images, but not very clean ones!

10 Likes

Hello guys,

As Jeremy teaches us throughout his courses, we should try to build something ourselves from beginning to end, be it research or a product. Our joint team from a local hospital and a tech company went all the way through this process: designing a study, collecting the dataset, training the models, and finally writing a paper. Happy to share our work on hydrocephalus (buildup of cerebrospinal fluid) verification with deep learning from brain magnetic resonance images. Here is an arXiv link to our paper if someone wants to read more.

Thanks, Jeremy and fastai team and all the community!

8 Likes

Created a simple Bird Sound Classifier based on Lesson 2.

The data is from: https://datadryad.org/resource/doi:10.5061/dryad.4g8b7/1

There are 6 types of bird calls: distance, hat, kackle, song, stack, tet.

This model gets around 80% accuracy, which is not bad at all for something that relies on so many different factors.


This approach is very useful for audio processing in general, as the audio itself can be converted into images (e.g. spectrograms) and fed to an image classifier!
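The linked notebook has the actual pipeline; as a rough illustration of the audio-to-image idea, here is a minimal sketch using librosa (the file names are hypothetical):

    import numpy as np
    import librosa
    import librosa.display
    import matplotlib.pyplot as plt

    # load a call recording and compute a mel spectrogram
    y, sr = librosa.load('bird_call.wav', sr=None)
    S = librosa.feature.melspectrogram(y=y, sr=sr)
    S_db = librosa.power_to_db(S, ref=np.max)

    # save it as an image an ordinary CNN image classifier can consume
    plt.figure(figsize=(4, 4))
    librosa.display.specshow(S_db, sr=sr)
    plt.axis('off')
    plt.savefig('bird_call.png', bbox_inches='tight', pad_inches=0)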

This is the entire notebook: https://github.com/vishnubharadwaj00/BirdSoundClassifier

3 Likes

Part 1: Lesson 6. I have tried to predict the 2019 Cycling World Champion using a tabular model

1 Like

Hi joeldick, nice work.

I had a similar experience to yours when building a wristwatch classifier, and followed the same procedure. It was good at recognizing 3 classes of watch, but when I increased it to 100 classes the performance was very poor. I think when a dataset like yours has images that look quite similar, it’s much more difficult to build a good classifier. One way to improve my wristwatch classifier was to combine it with character recognition to pick up brand names such as Rolex. Not sure what you could do in your case.

Good work.

mrfabulous1 :smiley::smiley:

Hi joeldick, would this help?

mrfabulous1 :smiley::smiley:

If you are interested in this, you may find the following article interesting.
It looks like they used an autoencoder trained on a mixture of patches and full paintings.

1 Like

It’s fantastic reading about all the great ideas and fantastic implementations. Still a long way for me to go. fast.ai definitely helps even non-software-engineers deploy surprisingly strong solutions :wink:

The dataset I’ve used comes from the “Fruit recognition from images using deep learning” dataset (it can be downloaded from GitHub), which at the time of writing had 120 classes and over 82,000 images.


The training loss I arrived at after 3 epochs was 0.040127, the validation loss 0.001555, and the error rate 0.000417. Training took 5:29 min on a Google Colab GPU runtime. I found these to be incredible results: only 3 misclassified images in the 9,600-image validation set.

The next step is finding my next project, which I would also like to deploy in a web app. Awesome trip so far :smiley:

2 Likes

Hi,
I am interested in Reinforcement Learning for games. Before starting the fast.ai course, I programmed a Connect Four self-learning agent. I thought it would be interesting to see to what degree a 2-class classifier can identify winning positions, i.e. 4 noughts or crosses in a row, column or diagonal.
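To make the labelling criterion concrete, here is a small, hypothetical helper (not from the notebook below) that checks a standard 6x7 Connect Four board for four connected pieces:

    import numpy as np

    def has_four_in_a_row(board, player):
        """Return True if `player` (1 or 2) has four connected discs on the board."""
        rows, cols = board.shape
        # directions to scan: right, down, down-right, down-left
        for dr, dc in [(0, 1), (1, 0), (1, 1), (1, -1)]:
            for r in range(rows):
                for c in range(cols):
                    if all(
                        0 <= r + i * dr < rows
                        and 0 <= c + i * dc < cols
                        and board[r + i * dr, c + i * dc] == player
                        for i in range(4)
                    ):
                        return True
        return False

    # example: a horizontal win for player 1 on the bottom row
    board = np.zeros((6, 7), dtype=int)
    board[5, 1:5] = 1
    print(has_four_in_a_row(board, 1))  # True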


The results are a mixed bag: a 90% success rate after providing 1,200 training and 400 validation images. I feel a bit disappointed because I can’t spot any discernible pattern in the top losses; the failure modes seem to be almost random. But I have learned a lot.

If you are interested in the full journey, feel free to read the notebook.

I am looking forward to your comments and hints on how to improve the accuracy!
Regards, Marius

3 Likes

After watching Jeremy make something in Lesson 2, I started looking for examples to classify and came across a very interesting problem. I wanted to see how a classifier might handle three monuments which look quite similar:

  • India Gate
  • Arc de Triomphe
  • Washington Square Arch

As you can see, they are quite similar in appearance and only a close look at the architecture on them might differentiate them. The model did pretty well on a set of ~300 images from each class.

Although the error rate in the last epoch increased, I let it go and wanted to keep going with the rest of the code.


Here are the corresponding LR plot and confusion matrix:
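For reference, these plots come from the standard fastai v1 calls, something like the following (a sketch, assuming the learner is named `learn`):

    learn.lr_find()
    learn.recorder.plot()   # learning-rate finder plot

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_confusion_matrix(figsize=(6, 6))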

Curious, I looked at the images which were confusing the model, and a lot of them were irrelevant. I wanted to use the widget Jeremy demonstrated, but since I am using Google Colab, I couldn’t.

Overall, I am quite surprised by the results and love how it is going so far!

5 Likes

Hi anujmenta, nice work!

I also use Google Colab.
If I do not have too many images in my dataset, I prune them by hand before I start training; this often helps make the model more accurate.

However, your model appears to be working well!

A wonderful person called muellerzr has created a widget which apparently works on Google Colab, which may help now or in the future PR: New Widget! Class Confusion - With Google Colab Support!.

Have a fun day!

mrfabulous1 :smiley::smiley:

1 Like

Hi everybody,
Once I saw the possibility of implementing something real, I tried to make a metal corrosion/rust detector. In 2013 we tried to do this at my company and didn’t finish it at the time. The goal was to inspect communication transmission towers and scale the job of an inspector, as we used to have one person checking 400 towers. If we could do that filtering from images, maybe we could improve his job.

It was a little hard to find a good number of good images to create the training set, because Google gave me too much garbage. In the end I worked with around 600 images, most of them with corrosion on some metal part. I guess around a hundred were without corrosion, so my model was supposed to say whether an image contains rusted metal or not. I got quite poor results compared with the ones shown in Lesson 2: my error rate was around 14%. But anyway, I created, alone and without much knowledge, a working model online, something that we didn’t manage six years ago. I must say that in the beginning the project only aimed to define the workflow for taking the tower pictures and sending them to the inspectors, so we have more knowledge about images, photography, and drones, but after a while we saw that deep learning could help us.

Anyway, my model is online and I’d love it if people could test it: https://doc.cartola.org/ - you can send images with rusted or clean metal and tell me whether it worked or not. The catch is that the site is in Portuguese, my native language, as I wanted my colleagues to use it. If you send it an image or URL, the result “Resultado: com ferrugem” means it detected rust in the image and “sem ferrugem” means “without rust” (with = com, without = sem).

I basically used the project from Natalie Downe, who created a cougar-or-not web application over a weekend and won the Science Hack Day award in San Francisco (as mentioned in Lesson 2). Her project was my starting point and I basically adapted it to my case without upgrading it.

Thanks.

Practical DL for Time Series
For those of you interested in time series data, I’ve just uploaded a GitHub repo (more info here) called timeseriesAI where I’ve shared fastai time series code, some state-of-the-art PyTorch models, and a notebook demonstrating how to integrate everything. You’ll see you can achieve great results in a few minutes by leveraging fastai.

6 Likes

Hey, awesome-fastai is live. I think I can make a projects section there. Suggestions welcome; lots of refactoring needed.

https://twitter.com/iamShashank/status/1179100146178523138

2 Likes

Hi all,

I just wanted to share my little test project:

Adapting some code from Lesson 1, I was able to look at images of African Grey parrots, and identify the species: “Timneh” or “CAG” with 82% accuracy.

They look fairly similar, so I’m pleased with this!

I have more exciting ideas following from this which I’ll share if/when they’re coded up!

Cheers,
Lloyd

1 Like

In case it is of interest, this cloud detection thing ended up being weirdly useful. As my goal is to create practical models for each fast.ai application, I ended up negotiating with the Cloud Appreciation Society (https://cloudappreciationsociety.org/), training the model on their cloud dataset (over 150,000 clouds, probably the largest in the world), and we used the trained model to add a cloud detector to their Cloud-A-Day app (https://play.google.com/store/apps/details?id=com.cloudappreciationsociety.cloudaday&hl=en).

The model (single-label, 11 categories [10 main cloud types + not-a-cloud], resnet50) is now used daily in production by their members (“over 46,000 members worldwide from 120 different countries, as of January 2019”), which, as a lover of clouds of the physical type, I find pretty cool.

Thank you @jeremy I really appreciate the weird and wonderful things you and the fast.ai community are doing.

14 Likes

Thank you for sharing @vedran.grcic! :slight_smile: Congrats on the fantastic outcome.

Hi Fast AI Community!

I created Estimate Body Fat - https://www.estimatebodyfat.com/ - an AI body fat calculator using the ResNet-50 from fast.ai’s Lesson 2 and my own Haar cascades (for upper-body identification).

The reason for creating my own Haar cascades was to be able to distinguish between the different sexes. I now truly understand firsthand how AI can be ethically challenging, as my first few Haar cascades were unable to recognize people from certain ethnicities. Yikes!!! But not to worry, I was able to solve this issue.
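For anyone curious what the upper-body detection step looks like in code, here is a minimal sketch using OpenCV’s stock haarcascade_upperbody.xml as a stand-in for the custom-trained cascades (file names are hypothetical):

    import cv2

    # stock OpenCV upper-body cascade; the real app uses custom-trained cascades
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_upperbody.xml"
    )

    img = cv2.imread("person.jpg")              # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # returns (x, y, w, h) boxes; a crop could then be passed to the ResNet-50 classifier
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in boxes:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("detected.jpg", img)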

To get your body fat percentage, all you need to do is upload a picture as described in the instructions. After getting your body fat percentage, you also get tips on how to lose the fat you don’t need and a lot more useful information on diet and lifestyle.

I personally began my fat loss journey this year and created this application as a way to keep track of my progress and motivate myself. Give it a try and let me know if you have any questions.

You can find me on Twitter at bruce_rebello

6 Likes

As for the dataset, I did create my own. I hope to keep updating it in the future to be able to produce results that are far more accurate than what I have right now!

Thanks @jeremy and @rachel for these amazing courses!