[Invitation to open collaboration] Practice what we learned in lec 3, learn more about ipywidgets and voila

I’m working on adding another audio dataset to fastai2. It has 3405 zebra finch calls and songs collected from 45 zebra finches (18 chicks and 27 adults). Each vocalization has been assigned to one of 11 call types.

This is likely a much more challenging dataset than the one we worked on earlier.

I am currently working on cleaning up the labeling. Calls were annotated manually (by multiple researchers typing in the file name). This has led to several inconsistencies.

I have fixed most of them but a major one remains: there should be only 11 distinct labels, but there are 33! Different annotators adopted different conventions. More concerning, some of the labels are ambiguous.

Here is a voila app I created, hosted on https://mybinder.org/, that plays the recordings to help with relabeling. The source code can be found here.

Here is the original paper that analyzes the data.

Task #1
Out of the ambiguous labels ThuC, ThuckC, ThukC and TukC, can you infer which map to Thuk and which to Tuck (these are the correct label names)?

Could you please post your guesses below? This would be very helpful. If you do, ideally please work on the task before looking at other people's guesses.

Task #2
Can you assign the call types in the dataset to one of the correct labels? You can submit your guesses using the following template.

type2labels = {
    'Ag': ['example_label', 'example_label2'],
    'Be': [],
    'DC': [],
    'Di': [],
    'LT': [],
    'Ne': [],
    'So': [],
    'Te': [],
    'Th': [],
    'Tu': [],
    'Wh': []
}

Task #3
Can you add useful functionality to the voila app? For instance, maybe it would be helpful to see the spectrograms? Or maybe people using this application would feel more engaged with the task if they were greeted with an image of a zebra finch? How about including a set of zebra finch images and rotating through them every x seconds?
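If you want to try the spectrogram idea, here is a minimal numpy-only sketch of computing one to display (a sketch only — the app could equally use librosa, torchaudio, or matplotlib's specgram; the 440 Hz test tone below is a made-up stand-in for a real recording):

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Magnitude spectrogram via a short-time FFT over Hann-windowed frames."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    # one column of |FFT| magnitudes per frame, keeping positive frequencies
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T

# a synthetic 1-second "call" at 440 Hz, 8 kHz sample rate
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (n_fft // 2 + 1 frequency bins, number of frames)
```

The resulting 2-D array can be handed to any image widget (e.g. via matplotlib's imshow) inside the app.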

Task #4 (tricky)
While processing the data, I found 14 vocalizations labeled as both ‘chick’ and ‘adult’. Not knowing which class they belong to, I removed them. But can you somehow infer which is the correct label? Can you do so by listening to the calls? Or would some information from the paper be helpful?

Task #5 (advanced)
Can you run a clustering or dimensionality reduction algorithm on this data? Is there a way of using this information, obtained in an unsupervised way, to reason about which examples may have been misclassified?
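For the dimensionality reduction side, a minimal numpy-only PCA sketch (the feature matrix here is random noise standing in for real per-call features, which you would have to extract yourself, e.g. from spectrograms):

```python
import numpy as np

def pca(X, n_components=2):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)           # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T   # coordinates in the reduced space

# stand-in for per-call features: 100 calls, 20 features each
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
Z = pca(X)
print(Z.shape)  # (100, 2) -- ready for a scatter plot colored by label
```

Points that land far from their label's cluster in such a plot are natural candidates for a second listen.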

Task #6 (advanced)
With the initially corrected labels, can you train a model and see what calls it is least confident about? Could they potentially be misclassified in the dataset?
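Once you have per-class probabilities (in fastai you could get them from `Learner.get_preds`), ranking by the top predicted probability is one simple notion of confidence. The probabilities below are made up for illustration:

```python
import numpy as np

def least_confident(probs, k=3):
    """Indices of the k examples whose top predicted probability is lowest."""
    confidence = probs.max(axis=1)    # model's confidence per example
    return np.argsort(confidence)[:k]

# hypothetical softmax outputs for 5 calls over 3 classes
probs = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],
                  [0.60, 0.30, 0.10],
                  [0.34, 0.33, 0.33],
                  [0.80, 0.10, 0.10]])
print(least_confident(probs, k=2))  # rows 3 and 1 have the lowest top probability
```

Listening to the calls the model is least sure about is a cheap way to surface likely mislabels.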

If you get any results on any of the tasks, please consider submitting a PR to the repo. It would be great if others could learn from your work :slightly_smiling_face:

The plan is to clean this dataset so that we can practice more advanced methods of classifying audio, similarly to what we are doing in this thread (this dataset should be much more challenging).

FAQ

1. Why this initiative?
Quite often, threads pop up where people are struggling to find ways to practice the lecture material. The intent behind this thread is to help in that regard. Working on something that is interesting and can be useful to others is, in my experience, a great way to learn. Also, this task closely models situations you will encounter in the wild.
2. Radek, is it true that you find voila and ipywidgets amazing?
Yes! But it hasn’t always been that way! Up until a couple of days ago, I never thought I would resort to this way of creating applications. Strangely enough, a situation manifested where I genuinely felt I could benefit in my work from hacking a quick application together. Luckily, I had watched lecture 3 and… voila :smile:
3. Damn, this ipywidgets thing is confusing :exploding_head:. How do you make sense of it?
It was super confusing to me as well! In fact, it still is. Luckily, I stumbled across the fabulous documentation for ipywidgets, read through the examples, and did a couple of them along the way. I still can’t say I understand it well, but I now have some way of getting dangerous, even if it’s by copying most of the code from the documentation and only making minor adjustments here and there! That is precisely how we learn.


@radek nice work. Did you by any chance see an example (ipywidgets) of making a text link to another notebook, something like next page / previous page…

Sorry, I haven’t. I think you can achieve something like this using the HTML widget:

ipywidgets.HTML('<a href="relative/path/to/NB">next page</a>')

But I am not sure how well this would work on mybinder.org. I am thinking it won’t (or it might want to spin up a new VM after each click). Then again, who knows, it might be worth a try; maybe there is some way of specifying the path that would work. Or it might not be that big of a deal for your use case even if it does spin up a new VM.

If you were to host your application on a dedicated VM (your own machine), the ipywidgets.HTML approach should work, I believe.

Hope this is helpful, but maybe someone else will have a better answer to this question.

Thank you, it helps… If you check the link at the very bottom of the page, there is another example… but I cannot dig into the source — I lack the knowledge of how to do it. :slight_smile:

Thanks @radek! Now we’re getting wild :stuck_out_tongue:


A good strategy might be trying to run the code in that notebook and seeing if you can modify it to fit your needs. But yes, you are right, this probably goes a little bit above just a simple case of adding widgets to the notebook.

Here is a blog post on relative vs absolute paths; maybe it can be helpful. You probably want to point the href attribute at the name of the notebook you would like to link to. In its simplest form, if you have two notebooks, a.ipynb and b.ipynb, in a single directory, and you want to link from b to a, you could open a cell and change it into a markdown cell (as opposed to a code one, by pressing m or using the dropdown)
[image: the cell-type dropdown in Jupyter]

and type this:

[link to a](a.ipynb)

and upon executing that cell you should have a working link (like in the image I am attaching).

Not sure if this is helpful, but nothing else comes to mind :slight_smile: I believe anything you would do with ipywidgets on top of this would just be making it prettier / having some programmatic way of creating these links, based on ids or whatnot.

@radek, I’m not sure I quite understand Task #1 here. I can hear the audio samples of the ambiguous labels, but I don’t have any examples of the correct labels (Thuk and Tuck). How can I say which category the ambiguously labelled calls belong to when I don’t know what the correct labels sound like? I can only guess from the spellings what the mistakes could’ve been while labelling. E.g.:

  1. Thuk: ThuC, ThuckC and ThukC
  2. Tuck: TukC

Or, by listening to the ambiguous labels, I could try to group them into two groups, but then I can’t say which group corresponds to which correct label.
Have I missed something?

I think you are right :slight_smile: This is a bit of an open-ended question. Trying to work off the spellings sounds like a viable option to me. Another thing worth considering is that the names of the calls often describe the vocalization you are hearing. In the case of macaques, a coo call is literally the macaque making a coo sound :slight_smile:

I know that in English, especially for non-native speakers (if you ever heard my accent you would know I am very much in that boat :slight_smile: ), it can be quite challenging to tell the words thuk and tuck apart. But that is also another way to approach this.

I think it’s okay not to get this right; it would be great though to hear how you and others get on with this task :slight_smile:

What might be helpful, and I am very sorry for not including this initially, is the full list of possible calls and their names. Indeed, with just th and tu it is really hard to figure out what is going on. Maybe knowing one is looking for thuk and tuck makes things a little more tractable.

Thanks so much for looking into this Gautam!

[image: the full list of call names]


Thanks for the clarification Radek!
Might I just add how and why I guessed it that way?
Many Indian languages differentiate between a th and a t sound (in fact, Hindi has four different T sounds and, additionally, four different D sounds).
That’s why I guessed the way I did.


Definitely interested in learning more about ipywidgets and voila; I used ipywidgets exclusively in this project:
Visual GUI.

There is a lot of potential in creating usable interactive interfaces with ipywidgets, voila and Binder (a CPU is just fine for data and augmentation visualizations).

Although I did get a comment on another forum (not fastai): ‘Didn’t realize people are still using ipywidgets’ :slight_smile: :grimacing:


My guesses for Task #2 (this is definitely the simplest possible guess but I couldn’t really convince myself to guess differently after listening to many calls).

type2labels = {
   'Ag': ['Ag', 'AggC'],
   'Be': ['BeggSeq', 'Beggseq'],
   'DC': ['DC'],
   'Di': ['DisC'],
   'LT': ['LTC'],
   'Ne': ['Ne', 'NeArkC', 'NeKakleC', 'NeSeq', 'NekakleC', 'NestC', 'NestCSeq', 'NestCseq', 'NestSeq'],
   'So': ['So', 'Song'],
   'Te': ['Te', 'Tet', 'TetC'],
   'Th': ['ThuC', 'ThuckC', 'ThukC'],
   'Tu': ['TukC'],
   'Wh': ['WC', 'Wh', 'Whi', 'WhiC', 'WhiCNestC', 'Whine', 'WhineC', 'WhineCSeq']
}

Nice new project Radek!

Based on spellings (like Gautam_e):
Task #1
'Th': ['ThuC', 'ThuckC', 'ThukC'],
'Tu': ['TukC']

Listening to the TukC sounds, I do not hear any difference from the Th* sounds. Maybe group them as one?


For Task #3, I’ve added two lines to @radek’s code to show the spectrogram, the length of the file, and the filename, yielding something like this:
[image: the app showing a spectrogram, file length and filename]

Unfortunately, I’m struggling with Binder to host the app. I was able to get output from voila, but on following the instructions for hosting the app on Binder I get a “404: File not found” error.
I’d be happy to see anyone else’s results.


Thank you very much for all your help on the relabeling. :slight_smile:

I went through this exercise myself and will be going with your suggestions, that is, a mapping like the one in the post from @gautam_e:

type2labels = {
   'Ag': ['Ag', 'AggC'],
   'Be': ['BeggSeq', 'Beggseq'],
   'DC': ['DC'],
   'Di': ['DisC'],
   'LT': ['LTC'],
   'Ne': ['Ne', 'NeArkC', 'NeKakleC', 'NeSeq', 'NekakleC', 'NestC', 'NestCSeq', 'NestCseq', 'NestSeq'],
   'So': ['So', 'Song'],
   'Te': ['Te', 'Tet', 'TetC'],
   'Th': ['ThuC', 'ThuckC', 'ThukC'],
   'Tu': ['TukC'],
   'Wh': ['WC', 'Wh', 'Whi', 'WhiC', 'WhiCNestC', 'Whine', 'WhineC', 'WhineCSeq']
}

The one additional piece of information that played into discriminating Th and Tu is this table from the paper:
[image: call-count table from the paper]

The counts align with your interpretation based on the misspellings.

I have now inverted the dictionary and will be using it to relabel the examples.
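The inversion itself is a one-line dict comprehension; sketched here on an excerpt of the mapping above:

```python
# excerpt of type2labels (correct type -> raw labels) from the post above
type2labels = {
    'Ag': ['Ag', 'AggC'],
    'Th': ['ThuC', 'ThuckC', 'ThukC'],
    'Tu': ['TukC'],
}

# invert into labels2type (raw label -> corrected call type)
labels2type = {label: call_type
               for call_type, labels in type2labels.items()
               for label in labels}
print(labels2type['ThukC'])  # Th
```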


I’m running the downloading_and_processing_of_data notebook and got an error.

df.call_type = df.call_type.apply(lambda x: labels2type[x])

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-45-a36040f6c228> in <module>
----> 1 df.call_type = df.call_type.apply(lambda x: labels2type[x])

/opt/conda/envs/fastai/lib/python3.6/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3192             else:
   3193                 values = self.astype(object).values
-> 3194                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3195 
   3196         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input-45-a36040f6c228> in <lambda>(x)
----> 1 df.call_type = df.call_type.apply(lambda x: labels2type[x])

KeyError: 'C'

There is no ‘C’ in labels2type:
{'Ag': 'Ag', 'AggC': 'Ag', 'BeggSeq': 'Be', 'Beggseq': 'Be', 'DC': 'DC', 'DisC': 'Di', 'LTC': 'LT', 'Ne': 'Ne', 'NeArkC': 'Ne', 'NeKakleC': 'Ne', 'NeSeq': 'Ne', 'NekakleC': 'Ne', 'NestC': 'Ne', 'NestCSeq': 'Ne', 'NestCseq': 'Ne', 'NestSeq': 'Ne', 'So': 'So', 'Song': 'So', 'Te': 'Te', 'Tet': 'Te', 'TetC': 'Te', 'ThuC': 'Th', 'ThuckC': 'Th', 'ThukC': 'Th', 'TukC': 'Tu', 'WC': 'Wh', 'Wh': 'Wh', 'Whi': 'Wh', 'WhiC': 'Wh', 'WhiCNestC': 'Wh', 'Whine': 'Wh', 'WhineC': 'Wh', 'WhineCSeq': 'Wh'}

So this label should be removed, right?

[Update] I checked the filenames for label C and found they should be Nest. Example: |861|BlaLbl8026_110608-Nest-C-13.wav|True|blalbl8026|NaT|C|13|. So I will manually change the 2 files which are wrongly labeled.

I’ve just created a PR which adds just call_type[869] = 'Ne'; call_type[1663] = 'Ne'. However, I think maybe I should do something to clean the notebook, because right now the PR includes many metadata changes (for example the order of executed cells, the Python version, …). You can accept it as is, or I will look into what I need to do to clean up the notebook changes.
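As an aside, a defensive sketch (not something the notebook currently does): checking for unmapped labels up front would surface a stray label like 'C' before the apply fails with a KeyError:

```python
# hypothetical sample of labels observed in the metadata, including the stray 'C'
call_types = ['Ag', 'NestC', 'C', 'TukC', 'C']

# excerpt of the raw-label -> type mapping from the notebook
labels2type = {'Ag': 'Ag', 'NestC': 'Ne', 'TukC': 'Tu'}

# surface anything the mapping doesn't cover before applying it,
# instead of hitting a KeyError mid-apply
unmapped = sorted(set(call_types) - set(labels2type))
print(unmapped)  # ['C']
```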


Thank you again, Dien-Hoa, for your PR and post! I just redownloaded the data and reran the entire notebook, and I am not seeing any C labels.

These are the only labels that I see:

['Ag',
 'AggC',
 'BeggSeq',
 'Beggseq',
 'DC',
 'DisC',
 'LTC',
 'Ne',
 'NeArkC',
 'NeKakleC',
 'NeSeq',
 'NekakleC',
 'NestC',
 'NestCSeq',
 'NestCseq',
 'NestSeq',
 'So',
 'Song',
 'Te',
 'Tet',
 'TetC',
 'ThuC',
 'ThuckC',
 'ThukC',
 'TukC',
 'WC',
 'Wh',
 'Whi',
 'WhiC',
 'WhiCNestC',
 'Whine',
 'WhineC',
 'WhineCSeq']

Would you please be so kind as to double-check that you are working from unmodified data, and if you are, could you please provide a bit of information on which rows are affected? (Or which file names? That might make it easier to track what is going on.)

Thank you so much again for all your efforts on this! Appreciate it!

I just cloned the project this afternoon. After redownloading the data, I still get the error.

Running df[df.call_type == 'C'] I get the result below:

       fn                               adult  name        date_recorded  call_type  rendition_num
 861   BlaLbl8026_110608-Nest-C-13.wav  True   blalbl8026  NaT            C          13
 1647  BlaLbl8026_110608-Nest-C-14.wav  True   blalbl8026  NaT            C          14

So I think that, for the names above, this pattern re.compile('(.*)_(.*)[-_](.*)-(.*)\.wav') is what produces the ‘C’ label
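You can reproduce the mismatch directly (pattern quoted from the post above, as a raw string): with greedy matching, the pattern folds “Nest” into the date group and leaves “C” as the call type:

```python
import re

# the filename pattern from the notebook
pat = re.compile(r'(.*)_(.*)[-_](.*)-(.*)\.wav')

m = pat.match('BlaLbl8026_110608-Nest-C-13.wav')
# greedy (.*) pushes 'Nest' into the date group, so the call_type group is 'C'
print(m.groups())  # ('BlaLbl8026', '110608-Nest', 'C', '13')
```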


Wow, you are right! :slight_smile: But the thing is, I am already correcting this! (I just forgot I was fixing it.)

I think what is happening is that our OSes are somehow reading the files in a different order. I am on Linux; are you on Mac or Windows by any chance?

I am hoping I have fixed this now by sorting the paths after reading them in. If you would be so kind, could you please pull from the repo and see if the code works for you now?
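The idea behind the fix, sketched with a throwaway temp directory (glob's order is filesystem-dependent, while sorted() is stable everywhere):

```python
import tempfile
from pathlib import Path

# create a few files in non-alphabetical order, then glob them
with tempfile.TemporaryDirectory() as d:
    for name in ['b.wav', 'a.wav', 'c.wav']:
        (Path(d) / name).touch()
    # sort the globbed paths for a deterministic, OS-independent order
    paths = sorted(Path(d).glob('*.wav'))
    names = [p.name for p in paths]
print(names)  # ['a.wav', 'b.wav', 'c.wav']
```

With a stable ordering, positional fixes like call_type[869] point at the same file on every machine.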

Thx a lot for all your help on this!


I use a Paperspace instance, so I think the OS is Linux. I will try what you suggest later. I’m very happy that it can help.


The problem is solved!
