Modeling sequences of sensor data (structured dataset with no time value)

I have posted a similar thread in the fastai Intro to Machine Learning forum too.

I have some sensor data that captures analog waveforms, each converted by an ADC into a sequence of 80 floating-point numbers at some sampling rate. There are 6 sensors, whose values arrive as 6 columns, so each waveform reading is 80 rows of floating-point values. Together, these rows are the input for 1 of 120 output categories. The label is also available at the end of each row: the testing hardware adds it, because during testing a sequence is generated for each of the 120 categories. So, my dataset looks like this:
columns ---->  sensor1  sensor2  sensor3  sensor4  sensor5  sensor6  output
rows
 |
 1             x11      x12      x13      x14      x15      x16      y1
 2             x21      x22      x23      x24      x25      x26      y1
 .             ...      ...      ...      ...      ...      ...      ...
 80            x801     x802     x803     x804     x805     x806     y1

So, for one reading of y1, x11 to x806 form the input, but I need to make sure that x11, x21, …, x801 is treated as a sequence, x12, x22, …, x802 is a sequence, and so on. That is, each column over rows 1 to 80 is a sequence, and the 6 sequences together form the pattern for y1. The amplitude of the x values could change for a given y, but the shape of the sequence would be similar: a, b, c, … and 1.2a, 1.2b, 1.2c, … would produce the same y as long as a, b, c, etc. occur in the same or similar order across the 6 sensors.
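To make the layout concrete, here is a minimal NumPy sketch of how such a table could be regrouped into one array per reading and made insensitive to overall amplitude. It assumes the rows are stored reading after reading (80 consecutive rows each), that all 80 rows of a reading carry the same label, and that one common gain applies to all 6 sensors of a reading. The function names `to_samples` and `scale_invariant` are made up for illustration.

```python
import numpy as np

N_STEPS, N_SENSORS = 80, 6  # from the description: 80 values per waveform, 6 sensors


def to_samples(rows, labels):
    """Regroup a long (n_readings * 80, 6) table into (n_readings, 6, 80).

    Assumption: rows are ordered reading after reading, and every row of a
    reading carries the same label, so the first row's label is used.
    """
    rows = np.asarray(rows, dtype=np.float32)
    labels = np.asarray(labels)
    if rows.shape[0] % N_STEPS != 0 or rows.shape[1] != N_SENSORS:
        raise ValueError("expected a multiple of 80 rows and exactly 6 columns")
    n_readings = rows.shape[0] // N_STEPS
    # (readings, steps, sensors) -> (readings, sensors, steps):
    # each sensor column becomes one sequence of 80 values.
    x = rows.reshape(n_readings, N_STEPS, N_SENSORS).transpose(0, 2, 1)
    y = labels.reshape(n_readings, N_STEPS)[:, 0]
    return x, y


def scale_invariant(x, eps=1e-8):
    """Divide each reading by its largest absolute value over all sensors,
    so that a, b, c, ... and 1.2a, 1.2b, 1.2c, ... give the same input."""
    peak = np.abs(x).max(axis=(1, 2), keepdims=True)
    return x / (peak + eps)


# Placeholder data standing in for two readings with labels 3 and 7.
rng = np.random.default_rng(0)
rows = rng.normal(size=(2 * N_STEPS, N_SENSORS))
labels = np.repeat([3, 7], N_STEPS)

x, y = to_samples(rows, labels)
x_norm = scale_invariant(x)
```

If the gain can differ per sensor, taking the maximum over `axis=2` only (per sensor, per reading) would be the corresponding variant.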

I need to quickly develop a model in fastai: given a set of training data, I need a predictor that detects the output category when the 80 rows of sensor data are streamed in.
Given a labelled dataset, I have the following questions:

  • How do I load this data in fastai as 6 sequences of 80 values for a given output? I could transpose the 80 rows to 80 columns, but how do I link the columns as a sequence representation for each sensor?

  • What type of trainer and loss function is a good starting point for such problems?

  • Once I figure out the dataset structure for the train/test sets of the labelled data, what model class do I choose? It looks like an RNN problem, but I would not have millions of rows; the dataset is fairly small, with a few tens of thousands of rows for each category. In that case, would it still be an RNN?
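As one possible starting point for the loss and model questions above, here is a small sketch in plain PyTorch (not the fastai API). It swaps in a 1D convolutional network as an alternative to an RNN, treats the 6 sensors as input channels over 80 time steps, and uses cross-entropy loss for the 120 categories. The class name `SensorCNN`, the layer sizes, and the random batch are placeholders for illustration, not a tuned design.

```python
import torch
from torch import nn

N_SENSORS, N_STEPS, N_CLASSES = 6, 80, 120  # values taken from the description


class SensorCNN(nn.Module):
    """Hypothetical 1D CNN: input (batch, 6, 80) -> logits (batch, 120)."""

    def __init__(self, n_sensors=N_SENSORS, n_classes=N_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_sensors, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # average over the time axis
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):
        # features: (batch, 64, 1) -> flatten to (batch, 64) -> logits
        return self.head(self.features(x).flatten(1))


model = SensorCNN()
loss_fn = nn.CrossEntropyLoss()  # standard choice for single-label classification
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random placeholder batch standing in for 16 labelled readings.
xb = torch.randn(16, N_SENSORS, N_STEPS)
yb = torch.randint(0, N_CLASSES, (16,))

# One training step.
logits = model(xb)
loss = loss_fn(logits, yb)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because fastai is built on PyTorch, a module like this should be usable as the model inside a fastai training loop, though the exact wiring depends on the fastai version. An RNN (for example `nn.LSTM` over the 80 steps) would take the same data with the last two axes swapped.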

I would appreciate any guidance on getting started. Is a notebook like the Rossmann sample a good way to get started?
