Data Labeling

@jeremy In one of the videos where you showed training on car images, you were able to browse a large amount of data graphically and label it rather easily. What are some good tools to visualize data labels or to create training data sets?

I looked around on Google but I couldn’t find any.

Great question. We had to write our own. It’s not publicly available, but we hope to create publicly available tools as part of our mission at fast.ai. Help is most welcome :slight_smile:

Sign me up :slight_smile:

What you had seemed pretty amazing to me… I have seen some solutions that display content in 3D space so that it can be browsed easily, but for a different purpose (eBay’s virtual shopping store).

Thanks!

If you are using Mac OS X, you can use RectLabel, an image annotation tool for labeling images with bounding boxes for object detection.

Key features:

  • Create a label dialog from settings
  • Settings for objects, attributes and format
  • Support the PASCAL VOC format
  • Layer order for overlapped boxes
  • Zoom in on a point
  • Quick zoom to existing boxes
  • Smart guides for creating and transforming boxes
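
For anyone who wants to read such annotations back in Python, here is a minimal sketch that pulls the labels and boxes out of a PASCAL VOC style XML file using only the standard library. The function name `read_voc_boxes` and the sample annotation are made up for illustration, and I am assuming the usual VOC layout (`object` / `name` / `bndbox`); check it against the files your tool actually exports.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Box = Tuple[str, int, int, int, int]  # (label, xmin, ymin, xmax, ymax)


def read_voc_boxes(xml_text: str) -> List[Box]:
    """Return one (label, xmin, ymin, xmax, ymax) tuple per <object> element."""
    root = ET.fromstring(xml_text)
    boxes: List[Box] = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects that carry no bounding box
        xmin, ymin, xmax, ymax = (
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name", default=""), xmin, ymin, xmax, ymax))
    return boxes


# Hypothetical annotation, made up for illustration.
SAMPLE = """<annotation>
  <filename>car_001.jpg</filename>
  <object>
    <name>car</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

print(read_voc_boxes(SAMPLE))
```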

I’ve created a little tool for myself to quickly label images directly from a Jupyter notebook. It loads all images from a specified folder and creates a CSV file with all the labels.
It supports single- and multi-class labeling.
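
To give an idea of the output side, here is a rough sketch of writing labels to a CSV file. This is not the tool's actual code: the function name `write_labels_csv`, the `filename,labels` columns, and the space-separated multi-label cell are all assumptions chosen for illustration.

```python
import csv
import io
from typing import Dict, List, TextIO


def write_labels_csv(labels: Dict[str, List[str]], out: TextIO) -> None:
    """Write one row per image; multiple labels share one space-separated cell."""
    writer = csv.writer(out)
    writer.writerow(["filename", "labels"])  # assumed header, not a fixed standard
    for filename in sorted(labels):
        writer.writerow([filename, " ".join(labels[filename])])


# Hypothetical labels: one single-class image and one multi-class image.
example = {"b.jpg": ["cat", "dog"], "a.jpg": ["cat"]}
buffer = io.StringIO()
write_labels_csv(example, buffer)
print(buffer.getvalue())
```

In a notebook you would pass an open file instead of the in-memory buffer, for example `open("labels.csv", "w", newline="")`.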

I’ve uploaded it to GitHub. Maybe others will find it useful:

This looks like a really good idea - @Winston do you have any plans to polish it up to make it more widely useful? The things I notice right away are:

  • Fix bugs (e.g. elf should be self)
  • Add a screenshot to the readme with some info as to what it does
  • Make it work in Binder so people can try it without installing (probably just requires you adding 3-4 images to your repo and having the sample notebook use those).

Hi everyone, can I suggest an approach called active learning (and cooperative learning)?

I heard of it from Dr Michel Valstar in an excellent short video from the YouTube channel Computerphile, which you might already know.
The video: Active (Machine) Learning - Computerphile.

Active learning:

Basically, take a large amount of data you want to label and:

  • Do the tedious task of labeling but only for 10% of the data.
  • Train a neural net with a data bunch made from this 10%.
  • Use that same neural net, in inference mode, to label the other 90% of the data.
  • Intervene only when the neural net gives a low level of certainty. By intervene, I mean confirm or correct the output label(s).
  • Add those labeled data to the 10% of already labeled data.
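
The steps above can be sketched as one round of a loop. This is only an illustration under stated assumptions: `train`, `predict_proba` and `ask_human` are hypothetical callables you would supply yourself (they are not from fastai or any other library), and the 0.9 certainty threshold is an arbitrary choice.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def active_learning_round(
    labeled: Dict[str, str],
    unlabeled: Iterable[str],
    train: Callable[[Dict[str, str]], object],
    predict_proba: Callable[[object, str], Dict[str, float]],
    ask_human: Callable[[str, str], str],
    threshold: float = 0.9,  # arbitrary cut-off for "low certainty"
) -> Tuple[Dict[str, str], List[str]]:
    """Train on the labeled set, then ask a human only about uncertain items."""
    model = train(labeled)
    new_labeled = dict(labeled)  # copy so the caller's dict is left untouched
    still_unlabeled: List[str] = []
    for item in unlabeled:
        probs = predict_proba(model, item)
        suggestion, certainty = max(probs.items(), key=lambda kv: kv[1])
        if certainty < threshold:
            # Low certainty: a human confirms or corrects the suggested label.
            new_labeled[item] = ask_human(item, suggestion)
        else:
            # High certainty: left alone in this round.
            still_unlabeled.append(item)
    return new_labeled, still_unlabeled
```

You would call this repeatedly, retraining on the growing labeled set each round, until the remaining items are all predicted with enough certainty.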

Cooperative learning

Well, let’s assume active learning is a subset of cooperative learning.
In active learning, the only labels you inject back are the ones from predictions the neural net had a hard time making, where you had to intervene.
Cooperative learning does the same, and in addition injects the data the neural net labeled with high certainty back into the labeled set, without checking.
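
As I understand it, the difference comes down to which labels get merged back after each round. Here is a minimal sketch; the function name `merge_round` and its arguments are hypothetical, chosen for illustration.

```python
from typing import Dict


def merge_round(
    labeled: Dict[str, str],
    reviewed: Dict[str, str],
    confident: Dict[str, str],
    cooperative: bool = False,
) -> Dict[str, str]:
    """Return the labeled set for the next round.

    reviewed:  labels a human confirmed or corrected (low-certainty items)
    confident: labels the neural net assigned with high certainty, unchecked
    """
    merged = dict(labeled)
    merged.update(reviewed)  # both approaches keep the human-reviewed labels
    if cooperative:
        merged.update(confident)  # cooperative learning also trusts the net
    return merged


base = {"a.jpg": "car"}
reviewed = {"c.jpg": "truck"}
confident = {"b.jpg": "car"}
print(merge_round(base, reviewed, confident))                    # active
print(merge_round(base, reviewed, confident, cooperative=True))  # cooperative
```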

It is stated in the video that you can expect to spend only 10 to 20% of the time you would normally spend labeling data.

I think it was worth mentioning.
What do you think of this approach?

@jeremy I tried to look for active/cooperative learning in the forum but I couldn’t find anything.

Do you think it could be worth a sub-wiki article or something? I would gladly help.

Cheers,
Alexandre.

Here you go:
https://forums.fast.ai/search?q=%22active%20learning%22

:laughing: Thanks!