@jeremy In one of the videos where you showed training on car images, you were able to graphically browse a large amount of data and label it rather easily. What are some good tools for visualizing data labels or creating training datasets?
I looked around on Google but I couldn’t find any.
Great question. We had to write our own. It’s not publicly available, but we hope to create publicly available tools as part of our mission at fast.ai. Help is most welcome.
What you had seemed pretty amazing to me… I have seen some solutions that display content in 3D space and let you browse it easily, but for a different purpose (an eBay virtual shopping store).
I’ve created a little tool for myself to quickly label images directly from a Jupyter notebook. It loads all images from a specified folder and creates a CSV file with all the labels.
It supports single- and multi-class labeling.
I’ve uploaded it to GitHub. Maybe others will find it useful:
This looks like a really good idea - @Winston do you have any plans to polish it up to make it more widely useful? The things I notice right away are:
Fix bugs (e.g. elf should be self)
Add a screenshot to the readme with some info as to what it does
Make it work in Binder so people can try it without installing (probably just requires you adding 3-4 images to your repo and having the sample notebook use those).
Basically, take a large amount of data you want to label and:
Do the tedious task of labeling but only for 10% of the data.
Train a neural net with a data bunch made from this 10%.
Thanks to inference, use that same neural net to label the other 90% of data
Intervene only when the neural net gives a low level of certainty; by intervene, I mean confirm or correct the output label(s)
Add those labeled data to the 10% of already labeled data
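To make the steps above concrete, here is a rough Python sketch of one pass of that loop. It is only an illustration, not the tool from the video: `label_with_model`, `toy_predict`, `toy_human`, the file names, and the 0.9 confidence threshold are all made-up assumptions, and the real `predict` would be the neural net trained on the hand-labeled 10%.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Tuple[str, float]  # (predicted label, confidence in [0, 1])


def label_with_model(
    unlabeled: List[str],
    predict: Callable[[str], Prediction],
    ask_human: Callable[[str, str], str],
    threshold: float = 0.9,  # assumed cutoff for "low level of certainty"
) -> Tuple[Dict[str, str], List[str]]:
    """Label items with a model, asking a human only for low-confidence ones.

    Returns the new labels and the list of items the human had to review.
    """
    labels: Dict[str, str] = {}
    reviewed: List[str] = []
    for item in unlabeled:
        guess, confidence = predict(item)
        if confidence < threshold:
            # Low certainty: the human confirms or corrects the guess.
            labels[item] = ask_human(item, guess)
            reviewed.append(item)
        else:
            # High certainty: accept the model's label as-is.
            labels[item] = guess
    return labels, reviewed


# Hypothetical stand-ins so the sketch runs end to end.
def toy_predict(item: str) -> Prediction:
    return ("car", 0.95) if "car" in item else ("car", 0.40)


def toy_human(item: str, guess: str) -> str:
    return "truck" if "truck" in item else guess


labeled = {"car_001.jpg": "car"}  # the hand-labeled 10%
pool = ["car_002.jpg", "truck_001.jpg", "car_003.jpg"]  # the other 90%
new_labels, reviewed = label_with_model(pool, toy_predict, toy_human)
labeled.update(new_labels)  # add the new labels to the labeled set
```

In this toy run the human is only asked about `truck_001.jpg`; the two confident predictions go straight into the labeled set.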
Cooperative learning
Well, let’s assume Active learning is a subset of Cooperative learning.
In active learning, what you inject back are only the output labels from the predictions your neural net had a hard time making and that you had to intervene on.
Cooperative learning is the same as above, plus you also inject the labels that your neural net predicted easily back into the labeled data, without checking them.
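The difference between the two, as I understand it, comes down to which labels go back into the training set. A tiny sketch of that distinction (the `Labeled` record and the `labels_to_inject` function are hypothetical names, just for illustration):

```python
from typing import Dict, List, NamedTuple


class Labeled(NamedTuple):
    item: str
    label: str
    human_checked: bool  # True if a person confirmed or corrected the label


def labels_to_inject(results: List[Labeled], mode: str) -> Dict[str, str]:
    """Pick the labels that get added back to the labeled data."""
    if mode == "active":
        # Only the hard cases a human intervened on.
        kept = [r for r in results if r.human_checked]
    elif mode == "cooperative":
        # The hard cases plus the unchecked, confidently predicted ones.
        kept = list(results)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {r.item: r.label for r in kept}


results = [
    Labeled("img_1.jpg", "car", human_checked=False),
    Labeled("img_2.jpg", "truck", human_checked=True),
]
active = labels_to_inject(results, "active")
cooperative = labels_to_inject(results, "cooperative")
```

Here `active` holds only `img_2.jpg`, while `cooperative` holds both images.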
It is stated in the video that you can expect to spend only 10 to 20% of the time you would normally spend labeling data.
I think it was worth mentioning it.
What do you think of this approach?
@jeremy I tried to look for active/cooperative learning in the forum but I couldn’t find anything: