Share your work here ✅

Weeds vs Grass

So there is great interest in reducing the use of herbicides in parks where children play, so I though I would take a walk in the park and take photos (25 weeds/25 grass) with my iphone and see whether the resnet34 classifier would work. And it did! The error rate was 12.5%. I had 8 in my validation set and 17 in my training. I did it again with a different mix of train and validation. I got an error rate of 25%. But I then swapped to resnet50 and that dropped back down to 12.5%.

BTW, I’m totally running this on my windows machine and I spent very little effort in installing fastai and pytorch in my virtual environment using visual studio and pip. I’ve only really tested all of lesson 1 though - fingers crossed.

1 Like

I’m a non-engineer business guy diving into this world of deep learning, and I’m loving it. After the 1st lesson I have created (painstakingly) my first trained model AND web app (took me hours for figure this out). You can input any picture of a human face and my model will be able to tell you if it is smiling, frowning, or sad! It has a 82% accuracy rate. I’m not sure if that’s a bad or good accuracy rate, but I’m proud of it. Looking forward to lesson 2-7!

Here is my web app! https://expressive.onrender.com/

I wrote about it here: https://www.instagram.com/p/B1afnSgjzY3/

2 Likes

Hey everyone, I tried to see if I could beat the IMDB results by including SentencePiece and ensembling four different models (Forward + backwards, SentencePiece + Spacy). I did not quite achieve state of the art, but I need to see what I missed as I did not get Jeremy’s results, but they look promising! CrossPost, Article, Notebook

1 Like

Hi Ramon,
thank you very much for sharing this app! I learned a lot deploying it in a local server. It took a while since I had to learn about docker and other stuff.
Anyway I got it running, but I am doing something wrong.
I compared my version with the heroku version and everything works fine except for the heatmap the heroku version its fine mine version looks a bit scrambled.
This is the Heroku


This is mine:

Any help will be appreciated.

Lorenzo

1 Like

Hi Iphattori,

Thanks for your appreciation!

My best guess right now is that I did some changes (and committed them without deploy) after the last Heroku deploy. Maybe I upgraded the model or changed the predict.py file. I’m not able to dig into this right now. But you might want to check the latest commits to see if a change broke this part of the app.

Best regards,
Ramon

Continuing the discussion from Share your work here :white_check_mark::

Should SentencePiece help on an English corpus? I treat it as a necessary evil for Polish as we have too many forms for each word to make standard vocabulary work, but wasn’t aware this is needed/ helpful for English language…

In the class (NLP), Jeremy discussed trying a blend of all four, that’s why I did it. Overall I noticed sentence piece performing slightly less, but only barely.

1 Like

That’s cool - what did you use for data? How many images and did you manually label?

I scraped images of the members of congress from https://congress.gov using Beautiful Soup and built a classifier model using the lesson 2 notebook to determine whether an image was of a Republican or Democrat. It is deployed on render at https://repubordem.onrender.com.

I used images for 304 Republican members of Congress and 249 images of Democratic members of Congress. I got the overall error rate down to 35%, which I interpreted to mean the model was picking up something meaningful to distinguish between Republicans and Democrats, but not that great since you could get an error rate of 45% by just picking Republican every time.

1 Like

Hi all,

New member here. Thanks for this open source software!

Recently completed lesson 1 and 2 of the fast.ai course and I wanted to get stuck in. Decided to classify ships. Wrote a blog post on this at https://sites.google.com/view/raybellwaves/blog/classifying-ship-classes

Had a few hiccups along the way but as a result I am more familiar with the software. Here’s a list of some of my stalling points as how I got around them:

1 Like

Banknotes detection for blind people

I wanted to share a banknote detector I made. It recognizes what currency it is (euro or usd dollar) and what denomination (5,10,20, …). The social impact purpose is to help blind people, so I took care to make “real-life” training images holding the banknotes in my hand, sometimes folded, sometimes covering part of it.

It is deployed on iris.brunosan.eu

As others have shared, the fast, fun and easier part is the deep learning part (congrats fastai!), and the production server took roughly 10x time (I also had to learn some details about docker and serverless applications).

The challenge

I found just a few efforts on identifying banknotes to help blind people. Some attempts use Computer Vision and “scale-invariant features” (with ~70% accuracy) and some use Machine Learning (with much higher accuracy. On the machine learning side, worth mentioning one by Microsoft research last year and one by a Nepali programmer, Kshitiz Rimal, with support from Intel, this year.

  • Microsoft announced their version at an AI summit last year, “has been downloaded more than 100,000 times and has helped users with over three million tasks.” Their code is available here (sans training data). Basically, they use Keras and transfer learning, as we do in our course, but they don’t unfreeze for fine-tuning, and they create a “background” class of non-relevant pictures (which, as Jeremy says, it’s odd to create a “negative” class). They used a mobile-friendly pre-trained net “MobileNet” to run the detection on-device, and 250 images per banknote (+ plus data augmentation). They get 85% accuracy.

  • The nepali version from Kshitiz: 14,000 images in total (taken by him), and gets 93% accuracy. He started with VGG19 and Keras for the nural net, and “Reach Native” for the app (This is a framework that can create both an iOS and Android app with the same code), but then he switched to Tensorflow with MobileNetV2 and native apps on each platform. This was a 6 months effort. Kudos!! He has the code for the training, AND the code for the apps, AND the training data on github.

My goal was to replicate a similar solution, but I will only make a functioning website, not the app, or on-device detection (I’m leaving that for now). Since I wanted to do several currencies at once, I wanted to try multi-class classification. All the solutions I’ve seen use single-class detection, e.g. “1 usd”, and I wanted to break it into two classes, “1” and “usd”. The reason being that I think there are features to learn across currencies (all USD look similar) and also across denominations (the 5usd and 5eur have the number in common). The commonalities should help the net reinforce those features for each class (e.g. a big digit “5”).

The easy part, Deep learning

I basically followed the multi-class lessons for satellite detection, without really many changes:

The data

It is surprisingly hard to get images on single banknotes in real-life situations. After finishing this project I found the Jordan paper and the Nepali project which both link to their dataset.

I decided to lean on Google Image searches, which I knew was going to give me unrealistically good images of banknotes, and then some that I took myself with money I had home for the low denominations (sadly I don’t have 100$ or 500eur lying around at home). In total I had between 14 and 30 images per banknote denomination. Not much at all. My dataset is here.

Since I didn’t have many images, I used data augmentation with widened parameters. (I wrongly added flips, it’s probably not a good idea):

tfms = get_transforms(do_flip=True,flip_vert=True, 
                      max_rotate=90, 
                      max_zoom=1.5, 
                      max_lighting=0.5, 
                      max_warp=0.5)

In the end, the training/validation set it looked like this:

It’s amazing one can get such good results with that few images.

The training

I used 20% split for validation, 256 pixel size for the images, resnet50 as the pre-trained model. With the resnet frozen, I did 15 epochs (2 minutes each) and got an fbeta of .087, pretty good already. Then unfroze and did more training with sliced learning rates (bigger on the last layers) on 20 epochs, to get .098. I was able to squeeze some more accuracy by freezing again the pre-trained model and doing some more epochs. The best was fbeta=0.983. No signs of over-fitting, and I used the default parameters of dropout.

Exporting the model and testing inference.

Exporting the model to PyTorch Torch script for deployment is just a few lines of code.

I did spend some time testing the exported model, and looking at the outputs (both the raw activations and the softmax. I then realized that I could use it to infer confidence:

  • positive raw activations (which always translate to high softmax) usually meant high confidence
  • negative raw activations but non-zero softmax probabilities happened when there was no clear identification, so I could use them as “tentative alternatives”.

e.g. this problematic image of a folded 5usd covering most of the 5

{‘probabilities’:
‘classes’: [‘1’, ‘10’, ‘100’, ‘20’, ‘200’, ‘5’, ‘50’, ‘500’, ‘euro’, ‘usd’]
‘softmax’: [‘0.00’, ‘0.00’, ‘0.01’, ‘0.04’, ‘0.01’, ‘0.20’, ‘0.00’, ‘0.00’, ‘0.00’, ‘99.73’],
‘output’: [’-544.18’, ‘-616.93’, ‘-347.05’, ‘-246.08’, ‘-430.36’, ‘-83.76’, ‘-550.20’, ‘-655.22’, ‘-535.67’, ‘537.59’],
‘summary’: [‘usd’],
‘others’: {‘5’: ‘0.20%’, ‘20’: ‘0.04%’, ‘100’: ‘0.01%’, ‘200’: ‘0.01%’}}

Only the activations for class “usd” positive (last on the array), but the softmax also correctly brings the class “5” up, together with some doubt about the class 20.

Deployment

This was the hard part.

Basically you need 2 parts. The client and the server.

  • The front-end is what people see, and what it does is give you a page to look at (I use Bootstrap for the UI), the code to select an image and finally displays the result. I added some code to downsample the image on the client using Javascript. The reason being that camera pictures are quite heavy nowadays and all the inference process needs is a 256 pixel image. These are the 11 lines of code to downsample on the client. Since these are all static code, I used github pages on the same repository.

  • The back-end is the one that receives the image, runs the inference code on our model, and returns the results. It’s the hard part of the hard part :slight_smile:, see below:

I first used Google Cloud Engine (GCE), as instructed here . My deployment code is here, and it includes code to upload and save a copy of the user images with the infered class, so I can check false classifications use them for further training.

Overall it was very easy to deploy. It basically creates a docker that deploys whatever code you need, and spins instances as needed. My problem was that the server is always running, actually 2 copies, at least. GCE is meant for very high scalability and response which is great, but it also meant I was paying all the time, even if no one is using it. I think it would have been 5-10$/month. If possible I wanted to deploy something that can remain online for long without paying much.

I decided to switch to AWS Lambda (course instructions here). The process looks more complicated, but it’s actually not that hard, and the huge benefit is that you only pay for use. Moreover, for the usage level, we will be well within the free tier (except the cost of keeping the model on S3, which is minimal). My code to deploy is here. Since you are deploying a Torchscript model, you just need PyTorch dependencies, and AWS has a nice docker file with all that you need. I had to add some libraries for formatting the output and logging and they were all there. That means your actual python code is minimal and you don’t need to bring fastai (On this thread Laura shared her deployment tricks IF you need to also bring fastai to the deployment).

UX, response time.

Inference of the classification takes .2 seconds roughly, which is really fast, but the overall time for the user from selecting the image to getting the result can be up to 30s, or even fail. The extra time is partly uploading the image from the client to the server, and downscaling it before uploading if needed. In real-life tests, the response time was roughly 1s, which is acceptable… except for the first times, it sometimes took up to 30s to respond for the first time. I think this is called “cold start”, as AWS pulls the Lambda from storage. To minimize the impact I added some code that triggers a ping to the server as soon as you load the client page. That ping just returns “pong” so it doesn’t consume much billing time, but it triggers AWS to get the lambda function ready for the real inference call.

Advocacy

This summer I have a small weekly section to talk about Impact Science on a spanish national radio, and we dedicated the last one to talk about Artifical Intelligence and the impact on employment and Society. I presented this tool as an example. You can listen to it (in Spanish) here (timestamp 2h31m) Julia en la Onda, Onda Cero.

Next steps

I’d love to get your feedback and ideas. Or if you try to replicate it have problems, let me know.

  • Re-train the model using a mobile-friendly like “MobileNetV2”
  • Re-train the model using as many currencies (and coins) as possible. The benefits of multi-category classification to detect the denomination should become visible as you add more currencies.
  • Add server code to upload a copy of the user images, as I did with the GCE deployment.
  • Smartphone apps with on-device inference.
21 Likes

I did a cucumber detection model that looks at English, Field and Lemon Cucumbers. It was fun to actually get the images on google. Because I used Google Colab I couldn’t use the widgets apparently, so I looked at the downloaded images on the Google drive to view and delete the bad ones.

1 Like

Hi!

Project Overview:

The project is about recognising shot-types/ shot-scale in cinematic images. The methodology is just disciplined application of Jeremy’s teachings. Nothing fancy there for fastai users. The fanciest part of the project is the dataset, which I spent a few months creating – this is where my domain knowledge as a film student came in handy.

There’s 8 different kinds of shots in cinema, and this project focuses on 6 of them:

  1. Extreme Wide Shot

  1. Long Shot

  1. Medium Shot

  1. Medium Close Up

  1. Close Up

  1. Extreme Close Up

There’s also some fascinating heatmaps which really gives us insight into why the model works well:



Blog Post:

I created an interactive website explaining the project. The blog post is aimed at different target audiences – curious readers, filmmakers/ film students, and of course, deep learning practitioners like ourselves. It assumes no background in deep learning, filmmaking, or math.

I’d highly appreciate you taking the time to read this and give me feedback:

https://rsomani95.github.io/ai-film-1.html



GitHub Repo:

I also released the pretrained model, code and validation set in this repo:


I’ve kept the training set private because there’s more work to be done there. I plan on releasing it once I think it’s diverse and robust enough to train a great model.




I hope that this project kicks off data-driven research into film. I think there’s immense potential here to create tools that could be invaluable to filmmakers. Anyways, there’s more on that in the blog post.

Personal Background:

I’d been a Python programmer for 2 months before starting the fastai course, and it’s been an absolute pleasure to be part of this community and learning all this very accessible material.
I came into the course with no conceptions of what deep learning could and couldn’t do, but after watching just two lectures it was clear that I could solve a problem that I’d been pondering upon for about a year!

Thank you for your time, and thank you to the fastai team for making this possible. I’m extremely grateful for this course.

10 Likes

Rather than creating a web app, I thought I would be a rebel and create an IOS app. It didn’t work.

TL;DR

Although I did use Watson vision cloud service in a learning IOS app (so maybe technically I did do the web app challenge).

I tried the same grass/weed training in coreML (apple’s offering), I didn’t get very good results, but I’m super excited in the possibility of creating something that works (transferring from a pytorch model into a coreml one). I’ve actually created a number of basic image recognition IOS apps in my lead up to this experiment, so based on my grass/weed coreML training, I decided not to proceed with this actual app (trust me).

1 Like

Wow!
Hi rsomani95 Hope you are well!
I though your post was excellent, both technically informative and highlighted how important it is for domain experts to become involved with “AI”.

I learned a fair bit about film making also! I never it was so nuanced as I only point and shoot when taking video with my mobile or Nikon.

Well done!

mrfabulous1 :smiley::smiley:

Hi @mrfabulous1, I am well, I hope you are too :slight_smile:.

Thanks for the feedback. I’m glad you felt that way, I was trying to cater to multiple audiences with the post, and it’s exciting to see it hit the spot.

I’m trying to reach out to people in the film industry and see if I can get to talk about this at workshops or something, help get filmmakers involved. It would be interesting to see if that could lead to a data building initiative (as the post said, datasets are the biggest missing block to widening application to film).

Filmmaking is so intricate! It boggles my mind when I try to analyse a scene. There’s always more stuff to find.

Thanks again :smiley:

1 Like

Excellent work here @Interogativ! Very well written. I went through the process with no hiccups at all.

Running headless, I cloned fastai course version 3, and pulled up the lesson 1 notebook, in my Jupyter notebook.

But when I try to import fastai, I get an error ‘fastai module not found’.

I’m hoping you might have some insight or suggestions for me? I ran through your instructions with absolutely no issues.

Thanks,

-Kalen

@kalensr did you just clone it? Or install it. If you just cloned it perhaps you need to navigate to the directory?

Before cloning the course repo, I followed Interogative’s instructions, including downloading his scripts to install pytorch and fastai, on a fresh flashed image (jeston-nano-sd-r32.2.1.zip).

After that was all setup, I then cloned the course repository, and within my jupyter notebook, I navigated to the lesson notebook from the repo. It loaded with no issues. Until I tried to import the fastai module.

Upgrade your fastai version to v1.0.57

!pip install fastai==1.0.57

1 Like