TL;DR: How can I implement an RNN architecture to solve a non-NLP binary classification problem, using the highest-level fastai API possible? Each sample is a series of 8001 integers.
Data examples, formatted in three different ways, can be found here.
Link to notebook here. (You might have to navigate to ‘Untitled.ipynb’ when you open the link)
I have been studying machine learning for about a year now through some classes at my university, and this semester I decided that I would try out fastai. I’m currently through the “Practical deep learning for coders” course, and I really like the ideas of the fastai library. I’m looking forward to becoming more skilled in using it.
I’m looking for advice on one of my projects, and I would be very grateful for any help on how best to go about it.
I’m trying to make a binary classifier that handles one-dimensional series of 8001 integers. Data examples, formatted in three different ways, can be found here. My first goal is to test an LSTM architecture.
I wanted to use the high-level API of fastai, so I tried modifying this example.
import pandas as pd
from fastai.text.all import *
df = pd.read_csv('dummy_data_like_IMDB.csv')
dls = TextDataLoaders.from_df(df, text_col='text', label_col='label', valid_col='is_valid')
learn = text_classifier_learner(dls, AWD_LSTM)
Q1: I formatted the data to be similar to the example, but I’m getting the error “IndexError: single positional indexer is out-of-bounds” on the TextDataLoaders line. Any suggestions? See the notebook for the full code. (You might have to navigate to ‘Untitled.ipynb’ when you open the link)
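To make the intended layout concrete, here is a minimal sketch of an IMDB-like CSV for integer series. The column names follow the `from_df` call above; the sample values, the space-separated encoding, and the helper name `series_to_text` are assumptions made for illustration and may not match the actual file in the notebook.

```python
import csv
import io

def series_to_text(values):
    """Hypothetical helper: join one integer series into a space-separated string."""
    return " ".join(str(v) for v in values)

# Made-up samples of (series, label, is_valid); real ones would hold 8001 integers each.
samples = [
    ([3, -100, 42, 9999], 1, False),
    ([0, 17, -5, 250], 0, True),
]

# Write to an in-memory buffer so the sketch leaves no files behind.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["text", "label", "is_valid"])
for values, label, is_valid in samples:
    writer.writerow([series_to_text(values), label, is_valid])

csv_text = buf.getvalue()
print(csv_text)
```

If the real file has different column names, or is missing one of these three columns, the arguments passed to `from_df` would need to change accordingly.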
Q2: TextDataLoaders builds a vocabulary, which isn’t quite right for this non-NLP task. Any suggestions? The integers in each sample will be within a range of roughly -100 to 10,000.
Q3: I’m not quite comfortable with the lower API levels of fastai yet; honestly, I’m just getting to know the upper levels. If I need to dig deeper to accomplish a non-NLP RNN task, do you have any suggestions on where to start?
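For context, the kind of model I have in mind could be sketched in plain PyTorch as below. This is not fastai's API and not a tested solution: it skips the vocabulary entirely by scaling the raw values and feeding them to an LSTM one value per time step. The class name `SeriesLSTM`, the hidden size, and the scaling bounds are assumptions for illustration.

```python
import torch
import torch.nn as nn

SEQ_LEN = 8001               # length of each sample, as described above
LO, HI = -100.0, 10_000.0    # assumed value range, taken from Q2

class SeriesLSTM(nn.Module):
    """Hypothetical binary classifier: LSTM over scaled raw values plus a linear head."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)

    def forward(self, x):
        # x: (batch, seq_len) integers; scale to roughly [0, 1]
        x = (x.float() - LO) / (HI - LO)
        x = x.unsqueeze(-1)            # (batch, seq_len, 1): one feature per step
        _, (h_n, _) = self.lstm(x)     # h_n: (num_layers, batch, hidden_size)
        return self.head(h_n[-1])      # (batch, 2) class logits

model = SeriesLSTM()
batch = torch.randint(-100, 10_001, (4, SEQ_LEN))  # fake data for a shape check
logits = model(batch)
print(logits.shape)
```

A module like this could presumably be handed to fastai's `Learner` together with a `DataLoaders` object, though that part is not shown here.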
text_vocab should include all numbers.