Hi, kindly use the following for installs:
!pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
!pip install --upgrade git+https://github.com/fastaudio/fastaudio.git
And then restart the kernel.
Hi, is there a way to have different transformations for the training and validation sets for audio, similar to the one for images?
I tried to use Pipeline to compose two transformations, but I got a ValueError:
ValueError: too many values to unpack (expected 3)
class AlbumentationsTransform(RandTransform):
    "A transform handler for multiple `Albumentation` transforms"
    split_idx, order = None, 2

    def __init__(self, train_aug, valid_aug):
        store_attr()

    def before_call(self, b, split_idx):
        self.idx = split_idx

    def encodes(self, sig):
        if self.idx == 0:
            aug_audio = self.train_aug(sig)
        else:
            aug_audio = self.valid_aug(sig)
        return aug_audio

def get_train_aug():
    return Pipeline([a2s, MaskFreq(), MaskTime()])

def get_valid_aug():
    return Pipeline([a2s])
Semi-related, but should the SpecAugment Transforms in fastaudio be RandTransforms, instead of Transforms? It seems they are currently applied on the validation set as well. It has been some time since I looked into split_idx and Transform mapping, so fastai may be correctly handling this under the hood. Just want to be sure! Thanks
^ The same reasoning could apply to other Augments that are currently Transforms and not RandTransforms. But it is not as clear which should be one vs. the other and it is likely problem-specific, so maybe the flexibility is best kept.
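For readers unfamiliar with split_idx, here is a dependency-free toy sketch of the rule being discussed. It is not fastai or fastaudio code: `ToyTransform` and `mask_first` are hypothetical names, and the rule it mimics (a transform skips itself when its own split_idx is set and differs from the split being processed) is my understanding of how fastcore behaves, so please verify against the fastcore source.

```python
class ToyTransform:
    """Toy stand-in for a transform with a split_idx (hypothetical, not fastai code)."""

    def __init__(self, func, split_idx=None):
        self.func = func
        # None: run on every split; 0: training only; 1: validation only
        self.split_idx = split_idx

    def __call__(self, x, split_idx=None):
        # Skip the transform when it is tied to a different split
        if self.split_idx is not None and split_idx != self.split_idx:
            return x
        return self.func(x)


def mask_first(values):
    """Stand-in for an augmentation such as masking: zero the first element."""
    return [0] + list(values[1:])


always = ToyTransform(mask_first)                   # behaves like split_idx = None
train_only = ToyTransform(mask_first, split_idx=0)  # behaves like split_idx = 0

print(always([1, 2, 3], split_idx=1))      # augmented on the validation split too
print(train_only([1, 2, 3], split_idx=1))  # left untouched on the validation split
```

Under that assumption, an augmentation with split_idx = None runs on the validation set as well, which is the concern raised above.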
Looks like someone recently brought this up as well:
Hello Robert,
I would like to experiment with the latest version - 0.9.x - of Torchaudio wav2vec stuff in FastAudio.
Could you please give some direction on how to upgrade FastAudio to support TorchAudio 0.9.x in order to achieve this?
I'm pretty sure there is a reason why FastAudio is pinned to TorchAudio 0.8.0.
Thanks,
Victor
Hey Victor,
Can you tell me more about what you're trying to do? FastAudio is built for classification but not for speech recognition tasks. If you are working with speech, I would recommend using thunder-speech, which supports wav2vec2, or using torchaudio and pytorch directly.
Hello Robert,
Yes, I'm doing audio classification with Fastaudio and it works very well. But I also need to do some kind of "liftering" of the audio before the classification, in order to validate that the audio meets some criteria. The task is just to classify 3 types of sound; everything else should be ignored.
In the test I did, if the sound is saturated it is wrongly classified. The sounds are 1s short.
Intuitively, I think wav2vec2 will help. (I have less experience in audio manipulation)
But in general, is it difficult to update fastaudio to support torchaudio 0.9.x on Linux?
Thanks,
I haven't touched the code base in a while, but if I remember correctly we pinned to 0.8.0 both to avoid future breaking changes, and because that version of fastai insisted on using a specific version of pytorch that wasn't compatible with future torchaudio versions.
As for how difficult it will be to update, I would recommend creating a new environment, stepping through, and seeing what breaks. Currently fastai supports pytorch 1.7+, and for torchaudio 0.9 you only need pytorch 1.4+. If you try this I expect some fastai stuff will break, and that would be a pain to debug/upgrade, and there may be some small fixes for torchaudio as well. Release notes help, and fastai has a Discord with an audio channel; you're more likely to get responses there than here, I think.
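When stepping through a fresh environment like this, it can help to record exactly which versions ended up installed before and after each change. A minimal sketch using only the standard library (Python 3.8+); the helper name `installed_versions` is hypothetical, and the package list is just the set of names mentioned in this thread:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["torch", "torchaudio", "torchvision", "fastai", "fastcore", "fastaudio"]
    for name, version in installed_versions(names).items():
        print(f"{name:12} {version or 'not installed'}")
```

Pasting that output alongside a stack trace makes it much easier for others to reproduce the problem.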
Good luck and happy to answer any questions you may have.
Hello Robert,
Thanks for the tips. I rebuilt fastaudio against the following versions, and the only thing I had to do was recreate the model.
fastaudio 1.0.2.post0.dev1+g3d6c0a0.dirty (edit setup.cfg & re-build )
install_requires =
    fastai>=2.5.0
    torchaudio>=0.9
    librosa==0.8
    colorednoise>=1.1
    IPython  # Temporarily remove the bound on IPython
    fastcore>=1.3.20
fastai 2.5.2
fastbook 0.0.18
fastcore 1.3.26
fastdownload 0.0.5
fastprogress 1.0.0
fastrelease 0.1.12
torch 1.9.1
torchaudio 0.9.1
torchvision 0.10.1
Thanks,
Hello Robert,
One error I got after the upgrade is this:
dls.show_batch(max_n=3)
…
~/projects/torch_1.9.1/lib/python3.7/site-packages/fastaudio/core/spectrogram.py in __getattr__(self, name)
     70             return self._settings[name]
     71         raise AttributeError(
---> 72             f"{self.__class__.__name__} object has no attribute {name}"
     73         )
     74
AttributeError: AudioSpectrogram object has no attribute _settings
Thanks,
Victor
Hey Victor,
Sorry, it's been a long time since I've looked at the code and I'm having trouble following it. The error is occurring in this block of code:
def __getattr__(self, name):
    if name == "settings":
        return self._settings
    if not name.startswith("_"):
        return self._settings[name]
    raise AttributeError(
        f"{self.__class__.__name__} object has no attribute {name}"
    )
If you're still stuck, can you share the full stack trace? Thanks
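For anyone following along, here is a small standalone sketch of the mechanism behind that error message. It only reuses the `__getattr__` logic quoted above; `ToySpectrogram` and `describe` are hypothetical names and this is not the fastaudio class. Python calls `__getattr__` only when normal attribute lookup fails, so the message appears whenever an instance exists without `_settings` ever having been set on it. Whether that is what happens after the upgrade is an assumption, not something established in this thread.

```python
class ToySpectrogram:
    """Hypothetical stand-in that reuses the __getattr__ logic quoted above."""

    def __init__(self, **settings):
        self._settings = settings

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has already failed
        if name == "settings":
            return self._settings
        if not name.startswith("_"):
            return self._settings[name]
        raise AttributeError(
            f"{self.__class__.__name__} object has no attribute {name}"
        )


def describe(obj):
    """Return obj.sr, or the error message if the lookup fails."""
    try:
        return obj.sr
    except AttributeError as err:
        return str(err)


ok = ToySpectrogram(sr=16000)
# Create an instance without running __init__, so _settings is never assigned
bare = ToySpectrogram.__new__(ToySpectrogram)

print(describe(ok))
print(describe(bare))
```

The second lookup fails with the same wording as the reported error, so one thing worth checking is which code path creates the spectrogram object without attaching its settings.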
Hello Robert,
Here is the full stack trace. Would be nice to have it resolved.
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_2594/1652635938.py in <module>
----> 1 dls.show_batch(max_n=3)

~/projects/torch_1.9.1/lib/python3.7/site-packages/fastai/data/core.py in show_batch(self, b, max_n, ctxs, show, unique, **kwargs)
    100         if b is None: b = self.one_batch()
    101         if not show: return self._pre_show_batch(b, max_n=max_n)
--> 102         show_batch(*self._pre_show_batch(b, max_n=max_n), ctxs=ctxs, max_n=max_n, **kwargs)
    103         if unique: self.get_idxs = old_get_idxs
    104

~/projects/torch_1.9.1/lib/python3.7/site-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
    116         elif self.inst is not None: f = MethodType(f, self.inst)
    117         elif self.owner is not None: f = MethodType(f, self.owner)
--> 118         return f(*args, **kwargs)
    119
    120     def __get__(self, inst, owner):

~/projects/fast-src/fastaudio/src/fastaudio/core/spectrogram.py in show_batch(x, y, samples, ctxs, max_n, nrows, ncols, figsize, **kwargs)
    116             min(len(samples), max_n), nrows=nrows, ncols=ncols, figsize=figsize
    117         )
--> 118     ctxs = show_batch[object](x, y, samples, ctxs=ctxs, max_n=max_n, **kwargs)
    119     return ctxs
    120

~/projects/torch_1.9.1/lib/python3.7/site-packages/fastai/data/core.py in show_batch(x, y, samples, ctxs, max_n, **kwargs)
     16     else:
     17         for i in range_of(samples[0]):
---> 18             ctxs = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs,range(max_n))]
     19     return ctxs
     20

~/projects/torch_1.9.1/lib/python3.7/site-packages/fastai/data/core.py in <listcomp>(.0)
     16     else:
     17         for i in range_of(samples[0]):
---> 18             ctxs = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs,range(max_n))]
     19     return ctxs
     20

~/projects/fast-src/fastaudio/src/fastaudio/core/spectrogram.py in show(self, ctx, ax, title, **kwargs)
     75     def show(self, ctx=None, ax=None, title="", **kwargs):
     76         "Show spectrogram using librosa"
---> 77         return show_spectrogram(self, ctx=ctx, ax=ax, title=title, **kwargs)
     78
     79

~/projects/fast-src/fastaudio/src/fastaudio/core/spectrogram.py in show_spectrogram(sg, title, ax, ctx, **kwargs)
     87         ia = ax.inset_axes((i / sg.nchannels, 0.2, 1 / sg.nchannels, 0.7))
     88         z = specshow(
---> 89             channel.cpu().numpy(), ax=ia, **sg._all_show_args(show_y=i == 0), **kwargs
     90         )
     91         ia.set_title(f"Channel {i}")

~/projects/fast-src/fastaudio/src/fastaudio/core/spectrogram.py in _all_show_args(self, show_y)
     50     def _all_show_args(self, show_y: bool = True):
     51         proper_kwargs = get_usable_kwargs(
---> 52             specshow, self._settings, exclude=["ax", "kwargs", "data"]
     53         )
     54         if "mel" not in self._settings or not show_y:

~/projects/fast-src/fastaudio/src/fastaudio/core/spectrogram.py in __getattr__(self, name)
     70             return self._settings[name]
     71         raise AttributeError(
---> 72             f"{self.__class__.__name__} object has no attribute {name}"
     73         )
     74

AttributeError: AudioSpectrogram object has no attribute _settings
========
Thanks,
Victor
Hey Victor, I'm sorry, but I looked this over and I'm still not sure how the bug is getting triggered. I'm on my way out of town, but you may want to try asking in the Discord group. Good luck, and sorry I couldn't be of more help.
Thanks Robert.
Hi all! I am a data scientist/cofounder, and a deep learning practitioner. I mostly work on speech technologies, training models, pushing them to production and publishing whenever I can. But in order to really learn about deep learning fundamentals, I recently started getting into fastai.
Since I work a lot with audio and speech I was both happy and excited to see that the audio part of the project is community driven. I was wondering what are the development plans of fastai audio and whether there are maintenance issues. I checked the issues but they are mostly enhancement and there are not many commits during the last months.
Thanks!
Hey @gullabi, sorry for the delayed response. Fastaudio is not currently under active development. The original developers switched to mainly doing speech-to-text and text-to-speech, while fastaudio is focused on classification and isn't suited for ASR/TTS. One of the original developers, @scart97, maintains a simple but awesome ASR library, thunder-speech (scart97/thunder-speech on GitHub, "A Hackable speech recognition library"). We also still have an audio machine learning Telegram group where there is very little chatter, but if you ask a question someone usually answers. Let me know if you'd like to join.
Right now I'd recommend maybe contributing to torchaudio. When we started fastaudio, audio ML was a pain and you had to do lots of stuff manually, so we tried to build that stuff so you wouldn't have to be an audio expert to do ML in the domain. But torchaudio came around and built a lot of the same stuff (more of it, and a lot better). Hope this helps, take care.
Hi, I'm Harry. I'm one of the creators of fastaudio (classification library) and have contributed to other audio libraries like pyannote-audio (speaker diarization).
Currently I work at a TTS company (sonantic.io) as a Research Engineer.
There are several audio chats / communities for different audio problems, and they are all quite small and quiet. For people who are interested in various audio applications, it's also annoying to have to log in to so many chats.
I've created a Discord channel for all types of audio problems to try to merge the communities a bit, as many of the things we work on are shared / similar when working on ML with audio.
Feel free to join
Hi all, I was interested in better understanding audio spectrogram vision learning, so I recreated some useful notebooks I found into a simple notebook using fastai v2:
I hope this may be useful as a simple starting point or learning tool! It's not currently using fastaudio, but I may look into adding that as well.
Hey guys, it's been a while since this library was maintained.
I wanted to try it out with the latest libraries, so I opened a PR to update the dependencies.
Not sure who is maintaining this library today, so I'm also writing here in case my PR reaches no one.
For those who are interested, here's a link
Hey Tal, thanks for that!
But wouldn't it be better not to fix the versions of dependencies, and to consistently use minimum versions instead? I.e. fastai>=2.7.0 instead of fastai==2.7.12.
Otherwise, the error you describe is bound to happen again.
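To make the difference concrete, here is a deliberately simplified sketch of how the two specifiers treat a newer release. `parse` and `satisfies` are hypothetical helpers that handle only plain dotted integer versions and the `==` and `>=` operators; real installers follow the full PEP 440 rules, and 2.7.13 is used purely as an example of a later version.

```python
def parse(version):
    """Turn a plain dotted version such as '2.7.12' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed, requirement):
    """Check an installed version against 'name==X' or 'name>=X' (simplified)."""
    for op in ("==", ">="):
        if op in requirement:
            _, wanted = requirement.split(op)
            if op == "==":
                return parse(installed) == parse(wanted)
            return parse(installed) >= parse(wanted)
    raise ValueError(f"unsupported requirement: {requirement}")


# An exact pin rejects any later release; a minimum bound accepts it
print(satisfies("2.7.13", "fastai==2.7.12"))
print(satisfies("2.7.13", "fastai>=2.7.0"))
```

The trade-off is the one mentioned in this thread: a minimum bound lets new releases in, which also means a new fastai release can break fastaudio without any change on the fastaudio side.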
Hey!
Generally I agree, but I just wanted this PR to be accepted, so I went by the old rules.
Also, I remember seeing a previous comment somewhere (maybe in a previous PR) saying it's like that so that fastai doesn't sometimes break fastaudio.
But as you can see, my PR hasn't been accepted yet, and it's been a while, sadly.