Really cool! I made something similar with a slightly different use case. I spend a lot of time correcting wrongly labeled text data, so I use an active learning approach where I iteratively integrate newly annotated data into the existing model.
Here is the (not at all new) idea:
Get predictions from the trained model of the previous iteration
Get all the cases where prediction is different from the annotated label
Correct the annotated label if wrongly labeled
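The disagreement step above can be sketched in plain Python (this is a minimal illustration, not the actual tool; the function name and data layout are my own):

```python
# Hypothetical sketch: compare the previous model's predictions against the
# stored annotations and surface the mismatches, sorted so the model's most
# confident disagreements are reviewed first.
def find_disagreements(predictions, labels):
    """predictions: list of (predicted_class, probability); labels: list of classes."""
    hits = []
    for i, ((pred, prob), label) in enumerate(zip(predictions, labels)):
        if pred != label:
            hits.append((i, pred, label, prob))
    # High-confidence disagreements are the most likely annotation errors.
    return sorted(hits, key=lambda h: h[3], reverse=True)

preds = [("cat", 0.95), ("dog", 0.60), ("cat", 0.99)]
labels = ["cat", "cat", "dog"]
print(find_disagreements(preds, labels))
# indices 2 and 1 disagree; index 2 comes first (prob 0.99 > 0.60)
```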
For this I need a slightly different interface that integrates model predictions and probabilities.
The options in the dropdown are ordered by probability. This makes it way easier to find/select the desired category.
High and low probability predictions are color coded, so you can see potential corrections at a glance.
I basically hacked the vision ImageCleaner to make this work, but I will go on to do a fancier implementation. Would you be interested in collaborating?
Hey guys, I completed the first few lectures at fastai and finished creating a gender voice classifier. I trained it using spectrogram images of voice clips (male and female) of American speakers. I did not use the entire dataset; I used about 2000 images (a quarter of the full dataset) and got an accuracy of about 99%. Then, when I tested the result on spectrogram images of Nepalese female voices, it showed surprisingly good results: 100% accuracy on the 217 images I tested it on. If you are interested, you can check the source code on GitHub and give me some feedback. Is there anything I did wrong here or could have done better? https://github.com/SamratThapa120/Gender-Classifier-by-Voice/tree/master
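For anyone curious how voice clips turn into images a CNN can classify, here is a minimal NumPy sketch of a magnitude spectrogram. The repo presumably uses a proper audio library; the function name and parameters here are my own assumptions:

```python
import numpy as np

def spectrogram(signal, frame=256, hop=128):
    # Magnitude spectrogram via a short-time FFT: window overlapping frames,
    # take the real FFT of each, and keep the magnitudes. The 2-D result is
    # what gets rendered as an image for the classifier.
    frames = [signal[i:i + frame] * np.hanning(frame)
              for i in range(0, len(signal) - frame + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T  # (freq_bins, time)

sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000.0)  # 1 s of a 440 Hz tone
spec = spectrogram(sig)
# spec has 129 frequency bins (frame//2 + 1), with energy concentrated
# near bin 7 (440 Hz / 62.5 Hz-per-bin).
```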
After following the first lesson, I used ResNet-50 to predict gender from pictures in the CelebA dataset. I did not have to put in any effort other than preparing the dataset and waiting (the dataset is ~200k images, so training takes tens of minutes to complete).
Not a deep learning project, but I have made a GitHub repository of fancy Python tricks I have come across. Some of these are from the walkthrus that Jeremy did. Check them out here:
If you know more tricks that will make people’s lives easier feel free to submit a PR.
I have an ongoing project about making neural networks faster and lighter in terms of parameters for fastai:
Currently, there are 3 techniques implemented:
Neural Network sparsifying: make the network sparse
Neural Network pruning: remove the useless convolution filters
Batch Normalization folding, which I already explained here
I’m planning to continue developing it and to add new techniques such as quantization, knowledge distillation, … and also to make it compatible with fastai2
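Batch norm folding is easy to illustrate with a tiny NumPy example: for a linear (or conv) layer followed by batch norm, the BN statistics can be folded into the layer's weights and bias, so the BN op disappears at inference time. This is a generic sketch of the math, not the library's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)
cin, cout = 4, 3
W = rng.normal(size=(cout, cin))          # layer weights
b = rng.normal(size=cout)                 # layer bias
gamma = rng.normal(size=cout)             # BN scale
beta = rng.normal(size=cout)              # BN shift
mean = rng.normal(size=cout)              # BN running mean
var = rng.uniform(0.5, 2.0, size=cout)    # BN running variance
eps = 1e-5

# Fold BN into the layer: W' = (gamma/std) * W, b' = gamma*(b - mean)/std + beta
scale = gamma / np.sqrt(var + eps)
W_fold = W * scale[:, None]
b_fold = (b - mean) * scale + beta

x = rng.normal(size=cin)
y_ref = gamma * ((W @ x + b) - mean) / np.sqrt(var + eps) + beta  # layer + BN
y_fold = W_fold @ x + b_fold                                      # folded layer only
# y_ref and y_fold agree to floating-point precision
```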
I have been trying to import the dataset into the model.
Somehow I am having trouble with the path command.
Can someone explain what a PosixPath is and how to import a dataset?
In short, can someone explain the picture below?
Note: I am a newbie to this coding world, so if this question seems too basic, that’s my bad.
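On the PosixPath part: fastai's path objects come from Python's standard pathlib module. A `Path` automatically becomes a `PosixPath` on Linux/macOS and a `WindowsPath` on Windows, and you build paths by joining with `/`. A minimal illustration (the `data/mnist` path here is just an example, not your actual dataset path):

```python
from pathlib import Path

path = Path("data") / "mnist"   # the / operator joins path components
print(path.name)                # mnist
print(path.parent)              # data
# On Linux (e.g. a cloud notebook) type(path) is pathlib.PosixPath;
# fastai functions that take a path accept these objects directly.
```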
Big thanks to @oguiza for some debugging help; here is an article on speeding up fastai2 tabular with NumPy. I was able to get about a 40% boost in speed during training! Article
Note: bits like show_batch etc. don’t work yet; this was purely a “get it to work” effort.
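The core idea (hedged: the article's actual implementation is more involved) is to convert the processed tabular data to NumPy arrays once, then serve minibatches by slicing those arrays, which is much cheaper than indexing into a DataFrame row by row:

```python
import numpy as np

def numpy_batches(xs, ys, bs):
    # Yield (x, y) minibatches by slicing pre-converted NumPy arrays.
    # Slicing an ndarray is a cheap view-like operation, which is where
    # the training-loop speedup comes from.
    for i in range(0, len(xs), bs):
        yield xs[i:i + bs], ys[i:i + bs]

xs = np.arange(10, dtype=np.float32).reshape(10, 1)
ys = np.arange(10)
batches = list(numpy_batches(xs, ys, bs=4))
# 3 batches with sizes 4, 4 and 2
```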
I completed the first lesson recently and started a project that uses around one hundred pictures each of four of the least recognizable Democratic primary candidates to create an image classifier that can recognize them with a success rate of over 97%. I would love feedback on it! Let me know what you think. Thanks!
I’ve picked up fastai again and I’m refreshing the course.
When I’m cleaning my desk, I frequently throw all the nuts, bolts, rings etc. that I cannot easily identify (kind of lazy) into a bowl. And instead of picking another empty bowl when the first few start to overflow, I thought I’d use fastai to help identify the unassorted hardware for easier sorting.
I created the data by simply placing one type of hardware on a sheet of paper and using OpenCV and a webcam to quickly capture a lot of images of the part, going left, right, and around over the part.
I created different directories with the images, like m5_nut, m5_ring, m5_cil_head_16, m4_cil_head_16 etcetera, which I used to repeat lesson 2.
Writing the OpenCV program to record the images took longer than training the model.
I’ve got some very good results already. Some improvements need to be made to account for scale, since taking pictures of an ‘m4_nut’ in close proximity gives me an inference of ‘m5_nut’, which is to be expected of course.
But I’m very happy with the quick success.
Thanks for all the work that went into this great library!
Great idea for a test! I randomly picked an image from a Google search for “nut”, and the inference was an M6 nut; the M6 nut is the largest of the three I taught the model (it got M5 and M4 too).
The model recognized it’s a nut, but it is not really cut out to differentiate between sizes. Since the nut fills the picture, the biggest size was predicted.
I found a solution for the scaling problem by recording the images on my paper notebook, which has a 5x5 mm grid. This makes size detection independent of the distance between the webcam and the part.
Hi basdebruijn I hope you are having a brilliant day!
I read your previous post about size issues, and if the graph paper doesn’t affect the accuracy of your model this appears to be a great idea.
I was just wondering:
Do you keep the distance between the object and the camera the same?
Does your camera have an autofocus that refocuses depending on the size of the object?
Am I right in thinking that, to detect the size of nuts and bolts from other images, you would need them to be photographed on graph paper of the same size, with the factors in the previous questions kept constant?
From your experiments, it looks like the above would work very well for a manufacturing department or on a production line.
After watching Lesson 3, I wanted to explore the effect of progressive resizing on the Architectural Heritage Elements image dataset (https://old.datahub.io/dataset/architectural-heritage-elements-image-dataset). I also wanted to compare the results to the ones presented in Classification of Architectural Heritage Images Using Deep Learning Techniques (Llamas et al. 2017).
Link for the paper:
The results table in the paper says that the highest accuracy achieved at that time was 93.19%.
I attempted to apply ResNet50 with a bs of 64, an 80/20 train/valid split, and image size 224; the final error rate after unfreezing and retraining the earlier layers was around 0.0229, which corresponds to an accuracy of 97.71%.
The most confused pairs looked like below:
Then I applied ResNet50 with the same bs and train/valid split, but trained in three stages with different image sizes: 64, 128 and 256, unfreezing and retraining the earlier layers at each stage. The error rate at image size 256 fluctuated a bit around the 0.01 mark, which corresponds to an accuracy of around 99%.
After the above, I decided to reduce the learning rate a bit and train a few more epochs. The final error rate was 0.0083, which corresponds to an accuracy of 99.17%.
My only concern is that the original image size is 128 and the last-stage model is applied at size 256. I was wondering if this “image expanding” is bad practice?
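On the "image expanding" question: upscaling a 128 px image to 256 px adds no new information, it only interpolates (or repeats) existing pixels, as a toy nearest-neighbour example shows. In practice it can still help, since the network then sees objects at a scale closer to its pretraining resolution. A NumPy sketch:

```python
import numpy as np

def upscale_nn(img, factor=2):
    # Nearest-neighbour upscaling: each pixel becomes a factor x factor block,
    # so the result carries exactly the information of the original resolution.
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

img = np.arange(4).reshape(2, 2)  # [[0, 1], [2, 3]]
big = upscale_nn(img, 2)
# big is 4x4: [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
```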
I have not applied the model to other types of reviews yet.
But, yes, I’m planning to get a small labelled dataset for other types like amazon and test it.
There are not many labelled datasets to do so. Do you have any at hand?
Just a thought experiment: can you try putting a coin (a penny or quarter of known size) next to each nut photo, or any other object you want to classify, in order to adjust the scale of your first linear transform? A couple of benefits: (a) research and results can be replicated by others; (b) this solves the generic issue of photographic scale for any photo.
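A sketch of how that calibration could work, assuming the coin's diameter in pixels can be measured in the photo. The constants and function names here are mine; 19.05 mm is the diameter of a US penny, and 7 mm / 8 mm are the standard across-flats widths of M4 and M5 nuts:

```python
# Hypothetical scale calibration from a reference coin of known physical size.
PENNY_DIAMETER_MM = 19.05  # US penny

def mm_per_pixel(coin_diameter_px):
    # Known physical size divided by measured pixel size gives the photo's scale.
    return PENNY_DIAMETER_MM / coin_diameter_px

def object_width_mm(object_width_px, coin_diameter_px):
    # Convert any measured pixel width in the same photo to millimetres.
    return object_width_px * mm_per_pixel(coin_diameter_px)

# A nut spanning 84 px next to a penny spanning 200 px comes out at ~8 mm,
# which would separate an M5 nut (8 mm across flats) from an M4 (7 mm).
print(round(object_width_mm(84, 200), 2))  # → 8.0
```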
Hi @mrfabulous1,
To answer your questions:
1: No, I do not keep the distance the same. This does make it harder to determine the size; I would probably find a more rugged solution for a real application.
2: Yes, the camera has autofocus, but it sometimes goes out of focus, which makes for blurry images. So instead of helping, it sometimes does just the opposite.
I think indeed that the graph paper needs to be the same size. As somebody mentioned, maybe a coin would work too. But I think I’d stick with the same zoom level and no autofocus if I had a choice.