I used the Fatkun image downloader, which has been mentioned in another topic on the forums. My dataset consisted of 670 images.
Thanks for showing me the importance of having an ‘I don’t know’ class. I feel there are two ways of making the Neural Network say “I don’t know”:
Create a new class labeled ‘unknown’ and fill it with images of everything that is not related to your problem. However, covering “everything else” representatively is very unrealistic.
Extract the features from the Neural Network and pass them through a classical Machine Learning classifier (like a decision tree or SVM). This method, however, requires some knowledge of Neural Network internals.
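The second option can be sketched end to end. This is a toy illustration, not anyone's actual implementation: a fixed random projection plus ReLU stands in for a pretrained CNN's penultimate layer (in practice you would grab those activations from the real network), and `predict_or_abstain` is a hypothetical helper showing how the classifier's confidence produces the "I don't know" answer.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for a pretrained network's penultimate layer: a fixed random
# projection + ReLU. In a real setup you would use the CNN's last hidden
# layer activations instead (this extractor is purely illustrative).
W = rng.normal(size=(64, 256))

def extract_features(x):
    return np.maximum(x @ W, 0.0)

# Two synthetic "known" classes, well separated in input space.
x0 = rng.normal(loc=-2.0, size=(100, 64))
x1 = rng.normal(loc=+2.0, size=(100, 64))
X = extract_features(np.vstack([x0, x1]))
y = np.array([0] * 100 + [1] * 100)

# Fit an SVM on the extracted features, with probability estimates enabled.
clf = SVC(probability=True, random_state=0).fit(X, y)

def predict_or_abstain(x, threshold=0.9):
    """Return a class label, or 'unknown' when the SVM is not confident."""
    proba = clf.predict_proba(extract_features(x))[0]
    return int(np.argmax(proba)) if proba.max() >= threshold else "unknown"
```

The threshold is the knob: anything the SVM is not sure enough about becomes "unknown" instead of a forced guess, which avoids having to collect an artificial "everything else" dataset.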
Hi everyone! After 5-6 months of intensive study I deployed my first Deep Learning app. It is a personal project I have been working on, and I felt it would be cool to deploy it. The app lets users turn random pictures into paintings in the style of three old masters: Van Gogh, Monet and Cézanne. You can check out the app at photo2painting.tech
The app is a demonstration of how CycleGAN models work in production and is deployed on a DigitalOcean droplet (free with GitHub as I am a student). The project is still in development, so I am eager for your feedback. Feel free to contact me if you want to collaborate on the project. If you want to check the code, I have open-sourced it here.
Your app is marvelous, you have done a great job and have created a good application of the course content.
The app shows your 5-6 months of intensive study have been well spent!
Hello @henripal, @daveluo and @lesscomfortable,
I saw that you all worked with satellite images. I am currently mentoring a group of students from the University of Brasília in Deep Learning. One of the PhD students is working on a DL project with satellite images and would like to use a pre-trained model, but their images have more than 3 channels. How did you handle this problem? Thank you.
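Not speaking for the people tagged above, but one standard trick is to replace the model's first conv layer with one accepting N channels and initialize the extra filters from the pretrained RGB ones, e.g. with their mean. The weight surgery itself is just array manipulation; here is a sketch with a random numpy array standing in for the pretrained kernel (the function name and the rescaling choice are my own assumptions):

```python
import numpy as np

# Pretend pretrained first-conv weights: (out_channels, in_channels=3, kH, kW),
# e.g. ResNet's 64 x 3 x 7 x 7 kernel (random here for illustration).
rng = np.random.default_rng(0)
w_rgb = rng.normal(size=(64, 3, 7, 7))

def expand_first_conv(w, n_channels):
    """Build an n_channels-input kernel from a 3-channel pretrained one.

    Keeps the RGB filters and fills the extra channels with the mean of
    the RGB filters, rescaling so activations keep a similar magnitude.
    """
    out_c, in_c, kh, kw = w.shape
    extra = np.repeat(w.mean(axis=1, keepdims=True), n_channels - in_c, axis=1)
    w_new = np.concatenate([w, extra], axis=1)
    # Rescale so the sum over input channels stays comparable.
    return w_new * (in_c / n_channels)

w_5ch = expand_first_conv(w_rgb, 5)
print(w_5ch.shape)  # (64, 5, 7, 7)
```

In PyTorch you would then build a new first conv layer with 5 input channels, copy this tensor into its weights, and swap it into the pretrained model; every layer after the first stays untouched.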
I made a web app that detects if/how an image is rotated and “derotates” it. The idea came to me because pictures I take with my phone don’t end up with a consistent orientation, I think because of the auto-rotate feature (or the lack of it?)…
The code (including the training notebook) can be found on Github and the web app at derotate.appspot.com. I don’t know if the idea is useful in itself, but I didn’t find a lot of webapps that output images (and not a class like in fastai’s tutorial), so maybe it can be useful in that sense.
Eventually, I would like to turn it into a web service and call it from an image-processing pipeline, but I’m still a bit stuck on this step. Does anyone have a recommendation on how to do it?
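On the web-service question I can only guess, but one approach is to keep the core as a plain function (the model predicts one of the four 90° orientations, the function applies the inverse rotation) and wrap whatever HTTP layer you like around it later; a pipeline can then simply import and call it. A minimal numpy sketch of that derotation step, with a hypothetical function name:

```python
import numpy as np

def derotate(img, quarter_turns):
    """Undo a predicted rotation of quarter_turns * 90 degrees (CCW)."""
    return np.rot90(img, k=-quarter_turns)

# Round trip on a dummy "image": rotating then derotating is a no-op.
img = np.arange(12).reshape(3, 4)
rotated = np.rot90(img, k=1)           # the orientation the model would detect
restored = derotate(rotated, 1)        # apply the inverse rotation
print(np.array_equal(restored, img))   # True
```

Keeping the model-plus-derotation logic framework-free like this makes it equally easy to expose behind Flask/FastAPI later or to call directly inside a batch pipeline.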
Hi @sebderhy, thanks for an immensely useful app.
Seeing your app, all the great things people create in this thread, and all the great work done by the fastai team and the community is truly inspirational.
12-class sentiment classification of US Airline Tweets with standard ULMFiT - ~60% accuracy
Hello everyone! I’m really interested in deep learning for NLP, so I’ve been using it to train language models to do downstream tasks (document similarity, sentiment classification etc).
I had a go at this Kaggle dataset, and after relatively little training I got around 60% accuracy on 12 classes (positive, neutral, and 10 negative classes).
I’ve been playing with momentum and learning rates, but I never seem to be able to get much further. Does anyone have pointers on how I could substantially improve this result?
Here’s another experiment on video enhancement / superresolution I’ve been working on recently (and highly enjoyed doing!).
The idea: since a video is a small dataset, if we start with a good image-enhancement model (for example the fastai lesson 7 model) and fine-tune it on the video’s own frames, the model can hopefully learn specific details of the scene when the camera gets close, and reintegrate them when the camera moves away (does this make sense?).
Here is a screenshot of the first results I got (more results screenshots and videos can be found on my github repository):
In my experiments, the algorithm achieved a better image quality than the pets lesson 7 model, which seems logical since it’s fine-tuned for each specific video.
I actually posted this work in the Deep Learning section initially, because I feel it’s not finished yet, and I’m looking for help on how to move forward. I haven’t found much work on transfer learning in video enhancement so far (did I miss something?), although it looks like an interesting research direction to me. Do you think this kind of transfer learning has potential for video enhancement? If so, what would you do to improve on this work?
I recently wrote a Medium article that I wish had been available when I started this journey. I feel like some of the questions it addresses come up fairly frequently (and have even been addressed in this course).
Hoping this might be helpful to somebody and eager to continue to give back to the community that has given us this resource!
In my recent Medium article, I wrote about a project in which I created a CNN-based model to predict a person’s exact age from their image.
This is the link:
There are many new things I learnt while working on this project:
Reconstructing the architecture of the ResNet34 model to deal with image regression tasks
Discriminative Learning Technique
Image resizing techniques
Powerful Image augmentation techniques of Fastai v1 library
To validate the prediction accuracy of my model, I used a picture of India’s PM Modi taken in 2015 (when he was 64 years old) and checked the result:
Racket Classifier
Created my first GitHub repo: a classifier identifying Tennis, Badminton and Table Tennis rackets. I was surprised to reach 95% accuracy. The confusion matrix also makes sense: a few badminton and tennis rackets look similar from certain angles/crops.
PS: the repo also has the cleaned URL files if someone wants to replicate it.
This being my first GitHub repo, I’m looking for experts to point out issues/mistakes/suggestions to make it better!
With a bunch of tree-friendly volunteers from Data for Good, we’ve been working for two months on a wildfire detection system! Following up on the increasing severity of forest wildfires across the globe this summer, we started interviewing firefighters and surveillance teams in southern France to gain some field expertise: with the adoption of cell phones, detection itself is not an issue anymore but early detection is crucial to contain the fires.
Existing approaches leverage high-end optical equipment but don’t make the most of the processing side; we believe wider accessibility comes with lower deployment costs.
Our first draft is quite simple: train a reliable detection model, get it to run on Raspberry Pis and place those on existing surveillance towers.
Collecting data from publicly available images, we trained a single-frame classifier using what we learned in the first fastai lessons. We released a first version of the library earlier this week (available through PyPI as well), including our image classification dataset and a lightweight model (MobileNet v2) with an error rate lower than 4.4%.
The project is open-source, and our goal is that anyone with a Raspberry Pi (and its camera) can download and install the inference model at home, completely free of charge.
We are always looking to expand our datasets and improve the model, so any feedback, suggestions or contributions are very much welcome.
I wanted to create a craft-beer identification network that would tell me the quality/rating of a beer from an image. I realised early on that I couldn’t just use the lesson 2 classifier, because this problem requires not just image classification but object detection too: when there are multiple craft beers in the frame, I need to return different predictions for different coordinates.
The way I solved this was to first use this pre-trained PyTorch implementation of YOLOv3 to detect and draw bounding boxes around objects from the 80 COCO classes.
Then, if the detected class was 39 (a bottle), I would pass the cut-out to the custom-trained fastai ResNet model and display the results as an overlay on the original image.
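For anyone wanting to replicate this kind of detector-to-classifier handoff: the glue is mostly just cropping the bounding box out of the frame before passing it on. A toy numpy sketch (not the author's code; `classify_beer` is a hypothetical stand-in for the fastai model, and boxes are assumed to be (x1, y1, x2, y2) pixel coordinates):

```python
import numpy as np

BOTTLE_CLASS = 39  # "bottle" in the 80-class COCO label list

def crop(frame, box):
    """Cut the bounding-box region out of an H x W x 3 frame."""
    x1, y1, x2, y2 = box
    return frame[y1:y2, x1:x2]

def classify_beer(patch):
    # Hypothetical stand-in for the custom-trained fastai model.
    return "some_ipa"

def annotate(frame, detections):
    """Run the classifier on every detected bottle, keyed by its box."""
    return {
        box: classify_beer(crop(frame, box))
        for cls, box in detections
        if cls == BOTTLE_CLASS
    }

frame = np.zeros((480, 640, 3), dtype=np.uint8)
detections = [(39, (10, 20, 110, 220)), (0, (200, 200, 300, 400))]  # bottle, person
print(annotate(frame, detections))  # only the bottle box gets a label
```

Filtering on the class id before cropping keeps the classifier from ever seeing non-bottle patches, which is exactly the two-stage split described above.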
The code works well, but of course it only classifies beers that I’ve already looked up and created a dataset for. I can imagine future work that automatically adds brands to the model by scraping Google Images based on a master list, using YOLOv3 to extract bottles from the search results, and then using those images for training.
I also have a short video of it running on my GitHub, but it would need considerable refactoring to actually run in real-time (and that’s beyond the scope of this hobby project).