Share your work here ✅

Nice work lukew!
mrfabulous1 :smiley::smiley:

This is so cool… I think this gives me some answers I have been looking for (multiple objects in a single frame, and also an “I don’t know” answer from classification).

I am going to try this and bother you if I get stuck.

Any clue if this can be done on Google Colab?

Hi everyone!

I want to share with you a project I have been working on for the last few weeks. It is a mushroom identifier web app (based on Shubham Kumar’s repo) that uses a resnet34 model to make predictions.

The dataset has about 8000 images of 43 mushroom classes, and the model’s accuracy is ~90.4%. I think that’s quite good, taking into account the difficulty of image recognition for mushrooms. Most of the commonly confused mushrooms would probably not be correctly identified even by expert mushroom hunters based on a single image!

Here is the url to the web app:

https://mushroom-identifier.onrender.com/

When showing the results, I would like to show which mushroom classes tend to be confused with the predicted class, but I haven’t found a way to do it.

If you have any suggestion, comment or doubt about the project I will be happy to hear you!

Cheers,

Jordi


Any basketball lovers out there? I created an iPhone app that I now use to keep track of my shooting progress. All I do is attach my phone (iPhone 7) to a tripod at the gym and it locates where I shot from and whether it went in or not.

Here’s a link to a video demo: https://steven874.wixsite.com/shotcount

It’s in testflight now so if you’d like to try it out, DM me and I’ll shoot you a special invite :slight_smile:


Hi Jordi. This is a cool project, and one that is close to my kitchen and heart! 90% accuracy seems very good for a single photo. As you noted, the other 10% might kill you.

When showing the results, I would like to show which mushroom classes tend to be confused with the predicted class, but I haven’t found a way to do it.

You are very close to what you ask for. I did this a year ago with imagenet categories, so please forgive me if my memory is not entirely accurate.

The CNN model outputs activations for the 43 classes. fastai automagically applies a softmax activation and nll_loss to these activations. I am not sure how well this invisible process is documented, but you can see it by tracing fastai with a debugger.

So first define your own loss function that does the same as fastai and assign it to learn.loss_func. This assignment prevents fastai from automatically deducing the correct activation and loss functions. In your loss function, between softmax and nll_loss, you will find the probabilities for each class. Then you can list the probabilities of the most likely classes.

Note that these class probabilities are relative to each other. They will tell you, given the image, which classes are most likely, but they will not tell you that there is no mushroom present of any class. For that, you would need to train with sigmoid activation and set a threshold. I make this comment only because it is a recurring question on the forums that has not been clearly and definitively addressed.
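A minimal sketch of the loss-function trick described above, assuming the fastai v1 convention of raw-activation outputs (`probing_loss` and the top-3 inspection are my own names and choices, not from the original post):

```python
import torch
import torch.nn.functional as F

def probing_loss(output, target):
    # Reproduce what fastai applies implicitly: softmax, then NLL loss.
    probs = F.softmax(output, dim=-1)        # per-class probabilities
    # Between softmax and nll_loss you can inspect the most likely classes,
    # e.g. grab the 3 highest probabilities and their class indices:
    top_p, top_idx = probs.topk(3, dim=-1)
    return F.nll_loss(torch.log(probs), target)

# Assigning it stops fastai from auto-deducing the activation/loss:
# learn.loss_func = probing_loss
```

Because softmax followed by NLL is exactly cross-entropy, this sketch trains identically to the default while exposing the probabilities in between.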

Thanks for sharing this project!

Malcolm

Cool. Mind if I ask how it works?

Recently dabbled with the tabular learner. Found it easy to use https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson4-tabular.ipynb as a base and build from there.

Have a file at work with ~300 data points. I have about 4 features and want to predict a continuous variable.

For my workflow at the time I just used a multi-variable linear regression model on all the data. However, seeing how easy it is to apply a NN to this dataset, I wanted to give it a go.

By definition I am overfitting the data, as I am fitting a model to the ~300 points and then seeing how well it predicts those same points. I am not creating a prediction tool; this was more of an exercise in understanding the data, as well as learning about tabular learners.

Documenting my learnings here:

test = TabularList.from_df(df, path=path, cont_names=cont_names)
data = (TabularList.from_df(df, path=path, cont_names=cont_names, procs=procs)
        .split_none()                 # no validation split: deliberately fitting all the data
        .label_from_df(cols=dep_var)
        .databunch()
       )
data.add_test(test)

nn_arch = [800, 600, 400, 200] # Length is the number of layers; each value is the hidden units in that layer
learn = tabular_learner(data, layers=nn_arch, metrics=r2_score)
...

preds, y = learn.get_preds(DatasetType.Test)
targs = torch.tensor(df[dep_var].values)  # dep_var holds the target column name
r2_score(preds, targs)

ProFeTorch

I’m building an alternative to fb prophet using fastai and pytorch here: https://github.com/sachinruk/ProFeTorch. A worked minimal example can be found here.

Started this off mainly because I hated how fbprophet didn’t force the output into the upper and lower bounds that were set, so I created a class that does just that, and found how easy it is to do thanks to Jeremy’s lectures:

class Squasher(nn.Module):
    def __init__(self, low=0, high=1, alpha=0.01):
        super().__init__()
        self.L, self.H, self.alpha = low, high, alpha
    def forward(self, x):
        # Leaky clamp: values outside [L, H] are pulled back toward the
        # bound with a small slope alpha, so gradients can still flow.
        x[x < self.L] = self.alpha * (x[x < self.L] - self.L) + self.L
        x[x > self.H] = self.alpha * (x[x > self.H] - self.H) + self.H
        return x
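To illustrate the behaviour, here is a standalone functional restatement of the same leaky clamp (`squash` is my name for it, not part of ProFeTorch):

```python
import torch

def squash(x, low=0.0, high=1.0, alpha=0.01):
    # Values outside [low, high] are pulled back toward the nearest bound
    # with a small slope alpha, instead of being hard-clamped, so the
    # model still gets a gradient signal for out-of-bound predictions.
    x = x.clone()
    x[x < low] = alpha * (x[x < low] - low) + low
    x[x > high] = alpha * (x[x > high] - high) + high
    return x

out = squash(torch.tensor([-1.0, 0.5, 2.0]))
# -1.0 -> 0.01*(-1.0 - 0) + 0 = -0.01; 0.5 unchanged; 2.0 -> 0.01*(2.0 - 1) + 1 = 1.01
```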

Anyway, it’s still in early days, and I may have butchered the fastai API when using this.

I do need some help with:

  1. Making this into a pip package. I still don’t quite understand how requirements.txt and setup.py work together to make this a proper package.
  2. Finalising circleci to run tests.
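For the packaging question, a minimal `setup.py` along these lines is usually enough (the name, version, and dependency list here are placeholders, not taken from the actual repo):

```python
# setup.py -- minimal packaging sketch; adjust metadata to the real project.
from setuptools import setup, find_packages

setup(
    name="profetorch",
    version="0.1.0",
    description="Prophet-like forecasting built on fastai and PyTorch",
    packages=find_packages(),
    # Runtime dependencies belong here; requirements.txt is conventionally
    # used to pin an exact, reproducible development environment instead.
    install_requires=["torch", "fastai"],
    python_requires=">=3.6",
)
```

With this in place, `pip install -e .` installs the package in editable mode for development, and `python setup.py sdist bdist_wheel` builds the distributions you would upload to PyPI.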

So if anyone knows how, I would greatly appreciate the help.

Thanks,
Sachin


Hi stev3, nice app!
I’m intrigued: how did you train your model? Did you take a video and split the frames into various classes, such as position and score or miss?

cheers mrfabulous1 :smiley::smiley:

Hi @hkristen
I’ve recently been playing with camera trap datasets, so maybe my notes will be helpful for you.
Many nature-related datasets are here: http://lila.science/datasets. Especially interesting is the Snapshot Serengeti dataset. It’s huge, but you can download single files as well:

import urllib.request

# Download a single image from the Snapshot Serengeti bucket
basic_link = "https://snapshotserengeti.s3.msi.umn.edu/"
filepath = "S1/B04/B04_R1/S1_B04_R1_PICT0001.JPG"
filename = "S1_B04_R1_PICT0001.JPG"
file_link = basic_link + filepath

urllib.request.urlretrieve(file_link, filename)

More details:


Hi, this is my first post! :grinning: I am from Colombia, and this is my first mini project from lesson 2.
Because we use Jupyter notebooks on a daily basis, I decided to build a classifier in honor of this wonderful project, and also of the father of modern science, Galileo Galilei, who discovered the four largest moons of Jupiter: Io, Europa, Ganymede, and Callisto.

My dataset was obtained from Google image search, and using the default parameters for the resnet34 model, I obtained a near-zero error rate.

https://galileo-moons.onrender.com/


My first post and my first project from lesson 1. I tried to classify dog breeds that look like wolves, using resnet34 and resnet50.


Hi all!! I started the course a few weeks back and just after the second lesson, I made a classifier to distinguish between bulls and buffaloes. https://isitabull.onrender.com/ :slight_smile:


I created a proposal for the Society of Actuaries predictive analytics contest and won second place. Any suggestions on how to improve the model are welcome.


Hey Maria

Are you an actuary? Nice to see a member from my community here :slight_smile:

Nice work and congrats! Can you please let me know where we can find details of such competitions organised by the SOA?

Regards
Abhik

Hi All,

I am a relative newbie to this forum. I am using Google Colab to run the fast.ai notebooks. One of the projects I am running is to identify aerial images of different infrastructure (schools, airports, etc.). I have downloaded images from Google, but the model is pretty noisy, with a very high error rate. Looking at the learning rate graph, nothing helps much, and lower learning rates lead to a spike in the error.


Even after optimizing the learning rates, I am not able to reduce the error rate to less than 25%.

Is this a hard problem for CNNs to figure out, or is there an error in my data? I am unable to tell. Any help is welcome.

Thanks
Ramesh

Hello Abhik:
Thank you for your message. Yes, I am an Associate of the Society of Actuaries (ASA). Nice to see more actuaries here as well. For the competitions organized by the SOA, there is no specific place I know of where they get announced; it is usually by email. The active involvement programs I know of right now are the Kaggle/SOA involvement program https://www.soa.org/programs/predictive-analytics/kaggle-program/ and the call for essays on actuarial practice and innovation https://www.soa.org/research/opportunities/actuarial-practice-innovation/
Best,
Maria


@Jabberwocky From my experience with a classifier that I tried to get working to identify four classes of Boeing’s commercial airplanes, I think the issue was the quality and quantity of the data. The differences among them are very subtle, and you also have to deal with airplanes in different liveries. I could never obtain an error rate below 0.46.

I used around 200 photos per class.

I’m sure you could do this in Google Colab! They make it a bit difficult to work with cloned repositories… I think what you need to do is download the repo/zip file into a Google Drive folder, and then you can right-click a notebook to open it in Colab.

Colab is a bit messy for this kind of thing, though. What I like to do is develop on my local machine, especially for this project because it involves a lot of image/file management, then run it on a cloud GPU only once everything is in place and running the way I want it to. It’s just the training that needs a GPU. I’d say I spent 90% of my time building the inference loop and the OpenCV/matplotlib image formatting, so that was fine running on my local machine’s CPU.

I think the architecture is cool because it isolates the target, which makes it a much easier image-classification problem to solve. Whether running two networks at once per image is the optimal way to do it…? I’m not so sure!

Thanks. I looked at the data as well. Maybe I have to start with a problem that is a bit more tractable and move on from there.