This is my first post. I'm new to deep learning and have just finished the book. For my first big project, I have data saved in several thousand .npy files. Each file loads as an array of shape (3, 4096), which is essentially a three-channel waveform.
I want to convert each file to a CQT spectrogram in which the three channels act as 'RGB', and then use a CNN for binary classification. When it came to building the DataBlock, I came up with this solution:
First Approach
For blocks = (ImageBlock, CategoryBlock) I had get_x = cqt_tfm, with code below:
import librosa
import numpy as np

def scale_minmax(X, lo=0.0, hi=1.0):
    # Linearly rescale X so its values span [lo, hi]
    X_std = (X - X.min()) / (X.max() - X.min())
    return X_std * (hi - lo) + lo

def cqt_image(arr):
    # CQT magnitude of a single channel, rescaled to 8-bit pixel values
    cqt = np.abs(librosa.cqt(arr/np.max(arr), sr=2048, fmin=8, hop_length=64, filter_scale=0.8, bins_per_octave=12))
    return scale_minmax(cqt, 0, 255).astype(np.uint8)

def cqt_tfm(fname):
    # Run the CQT on each of the 3 channels, then move channels last: (bins, frames, 3)
    return np.apply_along_axis(cqt_image, axis=1, arr=np.load(fname)).transpose(1,2,0)
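As a sanity check on the settings above, here is a back-of-the-envelope sketch of the image size they should produce. It assumes librosa's default of n_bins=84 (the post does not set it) and centered framing, i.e. 1 + n_samples // hop_length frames. The helper names are made up for illustration.

```python
def cqt_output_shape(n_samples, hop_length=64, n_bins=84):
    # Assumption: centered framing gives 1 + n_samples // hop_length frames
    n_frames = 1 + n_samples // hop_length
    return n_bins, n_frames

def top_bin_hz(fmin=8.0, n_bins=84, bins_per_octave=12):
    # Centre frequency of the highest CQT bin
    return fmin * 2 ** ((n_bins - 1) / bins_per_octave)

print(cqt_output_shape(4096))   # (84, 65) per channel
print(top_bin_hz() < 2048 / 2)  # True: the top bin sits below Nyquist
```

Under those assumptions, each file becomes a small 84 x 65 x 3 image, so the per-file cost is dominated by the CQT itself rather than by image handling.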
This works, but one downside is that it's slow, because it has to do all this processing for several hundred thousand files.
Instead of running the CQT code inside get_x, I also tried writing a transform to pass into item_tfms. That seems like the "correct" way to do it, since new files would then be transformed the same way. But when I run the DataBlock summary, shown below, it calls my get_x function first and then PILBase.create, which interprets the (3, 4096) array as an image. What I actually want is get_x → CQTTransform → PILBase.create. As it stands, the data gets scaled in weird ways, and I'd rather not have to undo what PILBase.create did just to call it again.
Setting up Pipeline: getx -> PILBase.create
Setting up Pipeline: label_func -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
Setting up after_item: Pipeline: CQTTransform -> ToTensor
Setting up before_batch: Pipeline:
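One possible way to get the get_x → CQT → PILBase.create ordering is to put the CQT step into the block's type_tfms, which run right after the getter and before any item_tfms. The sketch below is untested against this dataset; cqt_block is a hypothetical helper name, and it assumes the CQT function takes a file name and returns a uint8 array of shape (H, W, 3), as cqt_tfm does.

```python
def cqt_block(cqt_fn):
    """Hypothetical helper: an image block that runs `cqt_fn` before PILImage.create."""
    # Imported lazily so the sketch can be defined without fastai installed
    from fastai.data.block import TransformBlock
    from fastai.data.transforms import IntToFloatTensor
    from fastai.vision.core import PILImage

    # type_tfms are applied in the order given, directly after the getter,
    # so the item pipeline becomes: getter -> cqt_fn -> PILImage.create
    return TransformBlock(type_tfms=[cqt_fn, PILImage.create],
                          batch_tfms=IntToFloatTensor)

# Usage sketch (assumes `cqt_tfm` and `label_func` from the post and a folder `path`):
# from fastai.vision.all import *
# dblock = DataBlock(blocks=(cqt_block(cqt_tfm), CategoryBlock),
#                    get_items=partial(get_files, extensions=['.npy']),
#                    get_y=label_func,
#                    splitter=RandomSplitter())
# dblock.summary(path)
```

This mirrors how ImageBlock is built (PILImage.create as a type transform plus IntToFloatTensor as a batch transform), with the CQT step placed in front. It only fixes the ordering; the CQT is still computed on every access, so it does not address the speed problem.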
Is there a good/fast way of going about this? Currently, I’m running a separate script to convert into CQT format in parallel for speed.
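For reference, a minimal sketch of what such a separate conversion script could look like, assuming the raw .npy files sit in one folder and the converted arrays are cached in another. The function names, folder layout, and the skip-if-cached behaviour are illustrative assumptions, not the actual script.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from pathlib import Path

import numpy as np


def convert_one(src, dst_dir, transform):
    """Load one .npy file, transform it, and cache the result (skipped if already cached)."""
    src = Path(src)
    dst = Path(dst_dir) / src.name
    if not dst.exists():
        np.save(dst, transform(np.load(src)))
    return dst


def convert_all(src_dir, dst_dir, transform,
                executor_cls=ProcessPoolExecutor, workers=None):
    """Convert every .npy file in src_dir in parallel; returns the cached paths."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(Path(src_dir).glob("*.npy"))
    job = partial(convert_one, dst_dir=dst_dir, transform=transform)
    with executor_cls(max_workers=workers) as pool:
        # chunksize reduces inter-process overhead when there are many small files
        return list(pool.map(job, files, chunksize=64))
```

With a process pool, `transform` has to be a picklable top-level function that maps a (3, 4096) array to a (H, W, 3) uint8 image, and on platforms that spawn worker processes the call to convert_all should sit under an `if __name__ == "__main__":` guard. Training can then read the cached arrays with get_x=np.load and a plain ImageBlock, so the CQT is computed once per file rather than once per epoch.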