Hi folks. I’ve been away for a long time doing my own ML-audio work. I saw this thread come up near the top and decided to try fastai2_audio…
…but I’m getting an error from the tutorial notebook:
from fastprogress import progress_bar as pb
produces the error
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-4-383e3a70bf7e> in <module>
----> 1 from fastprogress import progress_bar as pb
ImportError: cannot import name 'progress_bar' from 'fastprogress' (/home/shawley/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastprogress/__init__.py)
I’m able to run the main fastai2 notebooks and see progress bars. It’s just this part of the 02_tutorial.ipynb for fastai2_audio that’s producing this error.
Any suggestions?
Update: Seems that installing fastai2_audio broke my working fastai2 environment. I’d assumed fastai2_audio was an add-on to fastai2, but it seems to have grabbed different versions of packages that I already had, and replaced them, e.g. with pytorch=1.4 instead of the 1.3 I had.
Hey Scott, sorry that happened and thanks for reporting it. I will look into why we don’t have compatibility and see how we can fix it, and if we can’t in the short run then I’ll at least notify people in the repo of the temporary incompatibility. (Sometimes we are beholden to certain versions of pytorch due to our current reliance on torchaudio.) Thanks again.
@MadeUpMasters thanks. I’ll keep trying; it can easily just be user error.
My collaborator has a new idea for an alternative architecture to our SignalTrain model and I’m thinking of trying to put the new one in ‘Fastai’ form…both to make it more accessible to others (like you all) and so I can benefit from the collaborative tool-making of the FastAI community.
So in the coming weeks/months I may have a lot of questions about writing custom DataLoaders!
ERROR: torchvision 0.4.2 has requirement torch==1.3.1, but you’ll have torch 1.4.0 which is incompatible.
ERROR: fastai2 0.0.7 has requirement torch<1.4.0,>=1.2.0, but you’ll have torch 1.4.0 which is incompatible.
Should I install the libraries some other way?
[EDIT] I installed the fastai2 in the following way, as recommended here
If you look at the start of the notebooks we do some uninstalling of libraries so that we can import the fastai2 modules successfully. It looks like you could be having a similar issue.
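For anyone hitting the same conflict, here is a minimal sketch for checking which versions actually ended up installed and whether a torch version falls inside the range fastai2 0.0.7 asked for in the error above. It uses only the standard library (`importlib.metadata`, Python 3.8+; on Python 3.7 you would need the `importlib_metadata` backport instead). The helper names `installed_version`, `version_tuple` and `in_range` are made up for this example and are not part of fastai2 or pip, and the comparison is a simplification of what pip really does.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.4.0' into (1, 4, 0); suffixes such as 'a0+e95282a' are dropped."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def in_range(version, minimum, below):
    """True if minimum <= version < below, compared numerically (simplified)."""
    return version_tuple(minimum) <= version_tuple(version) < version_tuple(below)


# Packages mentioned in this thread; the list is only an illustration.
for name in ["torch", "torchvision", "torchaudio", "fastai2", "fastprogress"]:
    print(name, installed_version(name))

# fastai2 0.0.7 reported the requirement torch<1.4.0,>=1.2.0
print("torch 1.3.1 ok:", in_range("1.3.1", "1.2.0", "1.4.0"))
print("torch 1.4.0 ok:", in_range("1.4.0", "1.2.0", "1.4.0"))
```

Running this before and after installing fastai2_audio should show which packages were swapped out.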
Well done on all the work here folks, just after watching @muellerzr’s run through, this lib looks super useful!
I am hoping to use it in kaggle’s deepfake comp as some of the videos also have fake audio. Just wondering if anyone has any suggestions on the easiest way to extract audio from mp4 files? And is there a preferred format I should save them to?
ffmpeg is a great tool to manipulate video and audio via the command line on linux. The usage may look scary at first, but it’s very powerful. To extract the audio from only one video:
If you search on the internet you’ll find some posts listing all of the different ways you can use ffmpeg like this one. To process multiple files, it’s just a matter of using a bash loop:
for vid in *.mp4; do ffmpeg -i "$vid" -vn -acodec pcm_s16le -ac 1 -ar 16000 "${vid%.mp4}.wav"; done
About the format: .wav with this codec is a common choice for audio data. The only parameters you should need to change are the number of channels (-ac 2 if you want stereo audio) and the sampling rate. For pure voice audio, 8 kHz (-ar 8000) should be enough, but if you have other sources of sound besides voice you may want to use 16 kHz (-ar 16000) or even 44.1 kHz (-ar 44100). Those rates are directly related to the highest frequency present in your audio through the Nyquist theorem: a given sampling rate can only represent frequencies up to half that rate.
Amazing, appreciate it! It’s only voice, although maybe I’ll use 16 kHz because the goal is to identify fake/manipulated voice, so maybe some crazy artefacts show up beyond the expected 8 kHz…thanks again!
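If you would rather drive this from Python, here is a rough sketch that builds the same ffmpeg arguments as the bash loop above and shows the Nyquist limit for each sampling rate. The function names `ffmpeg_extract_args` and `nyquist_hz` and the file name `clip.mp4` are made up for illustration; the block only builds the argument list and does not run ffmpeg.

```python
from pathlib import Path


def ffmpeg_extract_args(video, sample_rate=16000, channels=1):
    """Build the ffmpeg argument list that writes a 16-bit PCM .wav next to `video`."""
    video = Path(video)
    return [
        "ffmpeg", "-i", str(video),
        "-vn",                      # drop the video stream
        "-acodec", "pcm_s16le",     # 16-bit little-endian PCM
        "-ac", str(channels),       # 1 = mono, 2 = stereo
        "-ar", str(sample_rate),    # sampling rate in Hz
        str(video.with_suffix(".wav")),
    ]


def nyquist_hz(sample_rate):
    """Highest frequency (Hz) representable at the given sampling rate."""
    return sample_rate / 2


print(ffmpeg_extract_args("clip.mp4"))
for rate in (8000, 16000, 44100):
    print(rate, "Hz sampling keeps content up to", nyquist_hz(rate), "Hz")
```

With ffmpeg installed, each list could be passed to `subprocess.run(args, check=True)`, looping over `Path(".").glob("*.mp4")` to mirror the bash version.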
I thought I’d introduce myself after lurking for enough time! My background is in acoustic consultancy/engineering and I’m currently making a career change towards ML. I’m currently doing the Udacity ML Engineer Nanodegree and will (hopefully) be going to Georgia Tech to start the OMSCS ML specialization later in the year.
First of all, I absolutely love the work you all have done - machine listening is such a fascinating area, so I would love to contribute however I can. I also have my own personal project working on bird sound recognition for an area next to a national park in Colombia, near where I’m lucky enough to live (Bogotá), so will have a play around with V2 and feedback in due course. I used V1 late last year and it worked pretty well with mel-spectrograms on a dataset of xeno-canto recordings of 134 bird species ranging from excellent to pretty dodgy quality, so I’m excited to see how V2 can do.
I would like to use the library for my Udacity Capstone project, would you recommend I stick with V1 for now or go ahead with V2?
Hi all, I’m having some trouble running my code on the google tpu using a colab notebook. I thought you might have some more experience in this field, so I’m asking here.
I’m trying to run a pytorch script which is using torchaudio on a google TPU. To do this I’m using pytorch xla following this notebook, more specifically I’m using this code cell to load the xla:
!pip install torchaudio
import os
assert os.environ['COLAB_TPU_ADDR'], 'Make sure to select TPU from Edit > Notebook settings > Hardware accelerator'
VERSION = "20200220" #@param ["20200220","nightly", "xrt==1.15.0"]
!curl https://raw.githubusercontent.com/pytorch/xla/master/contrib/scripts/env-setup.py -o pytorch-xla-env-setup.py
!python pytorch-xla-env-setup.py --version $VERSION
import torch
import torchaudio
import torch_xla
However, this is incompatible with the version of torchaudio that I need: ERROR: torchaudio 0.4.0 has requirement torch==1.4.0, but you'll have torch 1.5.0a0+e95282a which is incompatible.
I couldn’t find anywhere how to load torch 1.4.0 using pytorch xla.
I tried to use the nightly version of torch audio but that gives the error as follows:
I just came across this audio extension for fastai and I was amazed. I’m trying to write a naive app to classify between 2 data sources. The model trains well, thanks to the notebook provided on GitHub.
I’m trying to load a single wav file and get predictions but I’m doing something wrong here.
I don’t understand why the learner is looking for an AudioTensor file and when I simply pass the path to test file it can’t process it. I’m sure I’m missing a key understanding of Data Block API here, please help.
@muellerzr Maybe you can help, I picked the AudioTensor creation part from your video tutorial. My apologies, I usually don’t at-mention at all but I’ve been fighting this for the last 6 hours and going crazy. And I just found a similar thread and I’m not sure if it’s a fastaiv2 issue.