I’m trying to classify spectra for medical purposes using the fastai library. I originally thought about storing them as tabular data, but that limits you to a TabularModel-style model. The data is structured like an image: continuous floating-point values, which a CNN should work well with because of the specific features in the data (e.g. stretching the peak would give the same classification).
Here’s an example
Where x is the index and y is the value at that index.
My plan is to create a 1D CNN using the Tabular type, but I can’t find solid documentation on how to do this, or any pre-trained models.
Any help or pointers would be appreciated.
First question: what is this data all about? What are the important features you would use to classify it?
If you want an introduction to the tabular learner, have a look here: https://confusedcoders.com/data-science/deep-learning/how-to-apply-deep-learning-on-tabular-data-with-fastai
From my point of view, you should calculate the full width at half maximum (FWHM), the intensity of the peak (max y value), and the position of the peak along x, and use them as continuous variables in your tabular learner. This should work pretty well. Maybe try to remove the noise from your data beforehand.
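As a rough illustration of the three features suggested above, here is a minimal pure-Python sketch. The function names `moving_average` and `peak_features` are made up for this example, and it assumes a single dominant peak, a baseline at zero, and x being the sample index; real spectra with several peaks or a sloping baseline would need more care.

```python
def moving_average(y, window=5):
    """Smooth a 1D signal with a centred moving average (simple denoising)."""
    half = window // 2
    out = []
    for k in range(len(y)):
        lo, hi = max(0, k - half), min(len(y), k + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def peak_features(y):
    """Return (position, height, fwhm) of the tallest peak in a 1D spectrum.

    Assumes the baseline is zero, so half maximum is simply height / 2.
    """
    y = list(y)
    height = max(y)
    pos = y.index(height)
    half = height / 2.0

    # Walk left from the peak while the signal stays at or above half maximum.
    i = pos
    while i > 0 and y[i - 1] >= half:
        i -= 1
    if i > 0:
        # Linear interpolation between samples i-1 and i for the crossing.
        left = (i - 1) + (half - y[i - 1]) / (y[i] - y[i - 1])
    else:
        left = 0.0  # peak runs off the left edge

    # Same walk to the right of the peak.
    j = pos
    while j < len(y) - 1 and y[j + 1] >= half:
        j += 1
    if j < len(y) - 1:
        right = j + (y[j] - half) / (y[j] - y[j + 1])
    else:
        right = float(len(y) - 1)  # peak runs off the right edge

    return pos, height, right - left


spectrum = [0, 1, 2, 3, 4, 3, 2, 1, 0]
position, height, fwhm = peak_features(spectrum)
print(position, height, fwhm)
```

On noisy data, calling `moving_average` first and passing its output to `peak_features` is one simple way to do the denoising step. The three returned values can then go into the tabular learner as continuous columns.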
If you have any categorical features (male/female, crystal structure, etc.) that you know of, use them as input too (you can see how to do that in the provided example).
If you need more help, you can also write to me any time; I’m happy to contribute.
If you manage to do it, please tell me whether it worked. I work with similar spectra in another field of science and would be happy if we could work together.
The data is supposed to have peaks at specific points representing the sample composition and its relative concentrations. Unfortunately, the peaks are hard to resolve because of noise that dwarfs the sample signal. The noise is introduced by the measurement itself, so data science techniques like an SVM are needed.
I wanted to try some transfer learning with a 1D CNN along the lines of this: https://ieeexplore.ieee.org/document/8210784
Do you have any idea how I’d go about implementing it in fastai?
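For reference, a small 1D CNN for spectra could look roughly like the sketch below in plain PyTorch. This is not the architecture from the linked paper and it does not load any pre-trained weights; the class name `SpectrumCNN`, the layer sizes, and the input length of 1024 are all assumptions made up for the example.

```python
import torch
import torch.nn as nn


class SpectrumCNN(nn.Module):
    """Tiny 1D CNN: two conv blocks, global max pooling, linear classifier."""

    def __init__(self, n_classes, n_channels=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, n_channels, kernel_size=7, padding=3),
            nn.BatchNorm1d(n_channels),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(n_channels, n_channels * 2, kernel_size=5, padding=2),
            nn.BatchNorm1d(n_channels * 2),
            nn.ReLU(),
            # Global max pooling makes the head independent of spectrum length.
            nn.AdaptiveMaxPool1d(1),
        )
        self.head = nn.Linear(n_channels * 2, n_classes)

    def forward(self, x):
        # x: (batch, length) -> add a channel axis -> (batch, 1, length)
        x = x.unsqueeze(1)
        x = self.features(x).squeeze(-1)  # (batch, n_channels * 2)
        return self.head(x)               # (batch, n_classes) logits


model = SpectrumCNN(n_classes=2)
batch = torch.randn(8, 1024)  # 8 random stand-in spectra, 1024 points each
logits = model(batch)
print(logits.shape)
```

Since this is an ordinary `nn.Module`, it should be possible to hand it to a fastai `Learner` together with your own data loaders and a loss function such as cross-entropy, although the exact data-loading code depends on the fastai version you are on.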
To be honest, I haven’t tried out much 1D data in fastai yet. I found this https://magenta.tensorflow.org/ddsp in the fastai audio thread (Deep Learning with Audio Thread); maybe you can learn something there, because they basically also do at least some kind of signal processing. If I find out more, I’ll let you know.
I followed a tutorial in the fastai docs. I subclassed the Tabular datatype and set all the data to be continuous. I’ll share the code in another update once it’s production-ready.