Share your work here ✅

This is an intriguing concept. I think from a security perspective, I’d be paranoid enough to not let anything hit the camera sensor that I don’t want it to see, so I’d probably go for a “physical filter”. I’ve worked in offices that used motion detectors above the cubicles to determine presence (to turn the lights off when people leave), but the most annoying part was that the sensors’ resolution wasn’t great, so they’d turn the lights off if you were working late just sitting at your desk, since they couldn’t detect your presence. I think a vision-based system like you propose could check itself against a base case and keep the lights on even if no motion is detected, and not much resolution should be needed to determine whether there is motion within the monitored space.

Something really cool would be if the system could “learn” the base case of an empty room and then detect the presence of objects within it, especially objects that tend to move once in a while. Or maybe a combination of multiple cheap sensors that feed into the network? Like a very low-resolution camera, a microphone, and an ultrasonic motion detector all in one case?
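For what it’s worth, here’s a rough sketch of that “learn the empty room” idea using OpenCV’s background subtractor. Everything here (camera index, frame counts, resolution, thresholds) is a placeholder rather than a tested setup:

import cv2

# Phase 1: learn the "empty room" base case from a few hundred frames.
cap = cv2.VideoCapture(0)  # hypothetical low-res camera
subtractor = cv2.createBackgroundSubtractorMOG2()
for _ in range(300):
    ok, frame = cap.read()
    if ok:
        subtractor.apply(cv2.resize(frame, (160, 120)))

# Phase 2: freeze the model (learningRate=0) so a motionless person
# keeps showing up as foreground instead of fading into the background,
# which is exactly where PIR motion sensors fail.
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(cv2.resize(frame, (160, 120)), learningRate=0)
    occupied = cv2.countNonZero(mask) > 200  # arbitrary changed-pixel threshold

The multi-sensor version would presumably feed the foreground mask, audio level, and ultrasonic reading into a small network instead of using a hard-coded threshold.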

The more I think about it, the more complicated this problem gets in my mind :sweat_smile:

Definitely an interesting and challenging problem for sure!

2 Likes

I also tried to re-use the “Is it a bird?” code with a different set of images to see if the classifier could distinguish between war memes and news photos. It performed very well, but it was interesting to see that a captioned news photo was classified as a meme! Looking forward to playing with this idea a little more, potentially adding an NLP element to it.

3 Likes

Might be even more interesting to see if you can find out why, using the techniques described here: Share your work here ✅ - #77

1 Like

Art Mood

This is a variation of my last model, in which I trained an image classifier on pictures of the four seasons. In Art Mood I wanted to determine how well the model could predict the mood a painting evokes in me. (I limited these moods to the four seasons.) The examples in the Art Mood app are a combination of real and abstract art. Subjectively, I thought the model did very well. My wife is an artist and she too found the predictions to largely match her seasonal moods—and her perspective should be more highly regarded than my own...

This was a proof of concept. To make it more rigorous, I would need to create a labeled training set of paintings, with moods assigned a priori, against which to compare the model’s predictions.

Let me know if the model aligns with your own Art Mood.

9 Likes

I built a classifier to match an image with the best-fitting art movement from a fixed set of movements, Expressionism and Ukiyo-e among them.

I gathered about 200 photos per movement: 150 for Expressionism from Google Arts & Culture, and 200 apiece for the rest from DuckDuckGo.

Using a resnet34 and presizing the images got the error rate below 5%.
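In case it’s useful to anyone, the presizing setup looks roughly like this in fastai (the path, sizes, and training schedule here are illustrative, not my exact code):

from fastai.vision.all import *

# Presizing: resize generously at the item level (CPU), then do the
# random crop + augmentation at the batch level (GPU).
movements = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(460),
    batch_tfms=aug_transforms(size=224, min_scale=0.75),
)
dls = movements.dataloaders(Path('art_movements'))  # one folder per movement
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(5)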

Testing my classifier on images outside the genres provided seems to align with my intuitions pretty well: Keith Haring is categorized as Expressionism.

While for Mickey Mouse, Ukiyo-e is the favored category.

Next steps might be looking at training on the WikiArt dataset.

Also curious about scraping Twitter hashtags for generative art (DALL-E 2, Midjourney) and predicting likes/retweets.

Here’s my Hugging Face Demo

9 Likes

@davidrd Very cool project. I tested it with some of the paintings I used for my Art Mood project and your model performed very well! Thanks also for the lead on the WikiArt source; that looks quite interesting.

1 Like

I decided to try using an image classifier for a task that it’s not “supposed” to be used for: language detection! I.e., given an image of a chunk of text, can the computer tell whether it is in French or German? (I decided to use French and German rather than, say, French and English, because I wanted two languages that both use accent marks. Also, I wanted to use languages that are written in the same alphabet, as otherwise the model would be learning to recognise writing systems, not languages.)

You can see my Kaggle notebook here: https://www.kaggle.com/code/skalyan91/is-it-a-bird-creating-a-model-from-your-own-data. (I haven’t adjusted the text to reflect the fact that I’m no longer doing bird/forest classification—apologies!)

For the data collection, I did DuckDuckGo image searches for “text en français” (“text in French”) and “Text auf Deutsch” (“text in German”). However, since the resulting images were usually pretty large, I had to crop them to a standard size rather than squishing them. (Unfortunately, the fastai library does not seem to provide a convenient function for batch-cropping a folder of images, as it does for resizing. This meant I had to do the cropping and saving with the Pillow library, which was slower because I didn’t have time to parallelise it.)
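For anyone who hits the same issue, the crop-and-save step can look something like this, with fastcore’s parallel map doing the parallelisation I skipped. The folder names and crop size are made up:

from pathlib import Path
from PIL import Image
from fastcore.parallel import parallel

def crop_one(src, size=400, out_dir=Path('cropped')):
    # Fixed top-left crop to a standard size, so nothing gets squished.
    out_dir.mkdir(exist_ok=True)
    Image.open(src).crop((0, 0, size, size)).save(out_dir / src.name)

files = list(Path('text_images').glob('*.jpg'))  # hypothetical source folder
parallel(crop_one, files, n_workers=4)           # runs crop_one across 4 workers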

Here are some sample images I used (French, then German):

image (French)
image (German)

I fine-tuned the model for 10 epochs, since I thought this might give it a better chance of learning the classification. However, after about the 6th epoch, both the error rate and the validation loss pretty much got stuck.

epoch train_loss valid_loss error_rate time
0 0.841501 1.059826 0.545455 00:09
1 0.805841 0.899394 0.428571 00:09
2 0.719036 0.953295 0.363636 00:09
3 0.619781 1.105118 0.376623 00:08
4 0.540305 1.153665 0.350649 00:09
5 0.477164 1.134642 0.337662 00:09
6 0.429748 1.095640 0.311688 00:09
7 0.397670 1.083638 0.311688 00:09
8 0.367227 1.104923 0.337662 00:09
9 0.340956 1.109644 0.324675 00:09

In the end, the model reached an error rate (on the validation set) of about one-third. I think this is surprisingly good for a model that’s trying to do something it wasn’t designed for!

8 Likes

We published this work at an HCI conference last year on detecting hand postures, using the reflected image from a laptop’s webcam to trigger commands and play games. The posture recogniser uses fastai.

7 Likes

A few days back, I built a Marvel character classifier app and deployed it on Jarvislabs.ai. While doing so, I figured out that I could make a few changes to the Jarvislabs.ai platform to make this super easy for anyone to do.

I wrote a blog post and made a video describing how this can be done. All you need is to add server_name and server_port to the launch function:

import gradio as gr

# find_character is the inference function described in the blog post.
demo = gr.Interface(fn=find_character,
                    inputs=gr.inputs.Image(shape=(256, 256)),     # resize uploads to 256x256
                    outputs=gr.outputs.Label(num_top_classes=3))  # show the top 3 predictions

if __name__ == "__main__":
    # Bind to all interfaces on the port the instance exposes,
    # so the app is reachable from outside the machine.
    demo.launch(server_name="0.0.0.0", server_port=6006)

I have described it in detail in the blog post and video.

I believe this can help:

  • When you need to deploy models that need the power of a GPU.
  • When you want to share the app only with your team/friends and not make it public.

I hope you like it :smiley:.

8 Likes

Also wanted to mention an earlier experiment I did, using the image classifier to identify pictures of Indian musical instruments! (It’s in Version 1 of the same notebook as in my previous post.)

In particular, I trained the model to differentiate between a veena and a sitar, both fretted wooden instruments with two resonating chambers:

image (veena)
image (sitar)

More sample images:

image

I trained for 3 epochs, and got a final error rate of about 16%. Not bad!

3 Likes

fastai has a transform called RandomResizedCrop which would be best for this - it’ll probably give you better results, and doesn’t require any preprocessing. See the book for details.
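For instance, something like this (the path and sizes are illustrative):

from fastai.vision.all import *

# Each epoch takes a different random crop of each full-size image,
# so no offline cropping or squishing is needed.
dls = ImageDataLoaders.from_folder(
    Path('text_images'), valid_pct=0.2,
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
)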

This is great. Although IIUC it’ll cost at least $75/week to run it? Are there any ways a user can get the cost down?

Would it be a good idea to introduce new instance types that are quite a bit lighter-weight for running full-time for stuff like this? E.g. 16GB RAM, 2 cores, and a 1080 Ti?

1 Like

After attending the first class, I built an image classification model to detect different types of rock (sandstone and coal), which I use a lot in a mining industry project. I am looking forward to today’s second lesson. Thanks!

1 Like

Yes, the cheapest GPU option currently costs $75/week.

If we can leverage spot instances, it would cost less than $32 a week. Using spot instances, though, brings two challenges:

  • Spot instances can be terminated at any time.
  • Deciding whether it needs to run 24/7.

Currently, users can use the Jarvislabs API to handle the above challenges.

We want to make this even easier by making it completely configurable by the user. For example:

  • Choose spot instances for deployment.
  • Deploy on the first request, similar to AWS Lambda, and stop when there is no activity.

Regarding the 1080 Ti: Nvidia’s rules don’t allow us to use consumer cards in a data center. However, I am thinking about the possibility of using shared GPUs like the A6000, which comes with a huge 48GB of VRAM.

6 Likes

Sounds interesting! I find 8GB of VRAM is enough for most stuff I do, and 12GB adds a bit of headroom.

5 Likes

Yes, I love the idea too. I want to explore the possibility of sharing GPUs safely. TF has a way to create virtual GPUs that limit GPU memory.
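If I remember right, the TF mechanism is roughly this (the 8GB cap and the two-way split are just example numbers):

import tensorflow as tf

# Split one physical GPU into logical devices with hard memory caps (in MB),
# so two tenants can't step on each other's VRAM.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=8192),
         tf.config.LogicalDeviceConfiguration(memory_limit=8192)],
    )
    print(tf.config.list_logical_devices('GPU'))  # two capped virtual GPUs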

I’ll explore this more and would love to show it when we’ve made progress.

3 Likes

I applied Jeremy’s notebook to waste management data. It works like magic.

Next, I’ll try to create an app for this model using Hugging Face Spaces as well as ipywidgets.

5 Likes

Ha, yes, it can get complicated for sure. Maybe this course will inspire a solution! Will keep you posted :smiley:

1 Like

I’ve always had problems with our city not recycling certain items, and with people being confused about whether something is “recyclable as per city regulations”. So a couple of years back I built a trash classifier, but the dataset wasn’t huge. I recently found this dataset, which you can also contribute to by classifying new objects on their website.

3 Likes

Created a Jupyter notebook extension to convert an IPython notebook to a Python file using nbdev.
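For reference, the nbdev call that does the actual conversion is presumably something like this (nbdev2 API; older nbdev used notebook2script instead, and the file names here are illustrative):

from nbdev.export import nb_export

# Write the Python module(s) exported from one notebook into a target folder.
nb_export('my_notebook.ipynb', 'exported_lib')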

Obviously there are lots of improvements to be made. Let me know your feedback.

3 Likes