I know how to write collaborative filtering and tabular regression models, but I don’t know where to begin with a problem where you have only a sequence of large numbers and are trying to predict the next number, or the next ten numbers, in the series.
I can’t use collaborative filtering because I don’t have an “item and ID,” nor can I use tabular regression because I don’t have any categories or continuous data associated with each number in a sequence.
Any help on how to set up a model to predict the next number in a sequence of numbers is greatly appreciated.
Hi Duchaba
This is the language model problem. You can use an RNN. Jeremy shows an example where he generates one, two, three. I think it is in the NLP course. I will try to find it.
Regards Conwyn
Thanks for the information. I took a quick look at chapter 12 RNN and LSTM and did not immediately see it. I guess it’s time to knuckle down, study, and do the Jupyter notebook.
NLP basically takes words and assigns each one a number, so it converts “Mary had a little lamb” to 1 2 3 4 5. You take the sequence 1 2 3 4 as the independent variable and 2 3 4 5 as the dependent variable. So given any sequence 6 7 8 9, the model predicts the most likely continuation as 6 7 8 9 10. Likewise, it converts “Mary had a little” to “Mary had a little lamb” if you use 1 2 3 4 as the input and 2 3 4 5 as the target. In language we often refer back to previous parts of the sentence: “The man had a dog whose loud voice/bark could be heard over the silence of the night.” To resolve the subject of the voice/bark, the model has a memory.
So I suggest you only need the inner piece, whereby your number sequence is predicted by the memory mechanism of the RNN. The book covers the various techniques.
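The shift between input and target described above can be sketched in a few lines of plain Python. This is only a toy illustration, not fastai's tokenizer: the `vocab` dictionary and the id numbering starting at 1 are assumptions chosen to match the 1 2 3 4 5 example.

```python
# Toy illustration (not fastai's tokenizer): give each distinct word an
# integer id, then build the input/target pair by shifting one step.
sentence = "Mary had a little lamb".split()

vocab = {}  # hypothetical word -> id mapping
for word in sentence:
    if word not in vocab:
        vocab[word] = len(vocab) + 1  # ids start at 1 to match the example

ids = [vocab[w] for w in sentence]

x = ids[:-1]  # independent variable: every id except the last
y = ids[1:]   # dependent variable: the same ids shifted by one position

print(x, y)
```

For a sequence that is already numeric, the mapping step disappears and only the shift remains.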
I think what you are describing is a time series. There is a whole topic, with many participants, devoted to classifying and predicting time series:
You will find many tested, refined techniques and ideas there – RNNs, Transformers, auto-regressive, Rocket, etc.
But if you are thinking of these sequences not as time series, but rather as the kind of problem found on an aptitude test for humans… that’s a very interesting question! As Conwyn points out, there is a grammar to these types of sequences. Something like a language model might discover and extend the patterns. I also have some ideas in this direction. So would you please clarify what kind of problem you are trying to solve?
Thank you for your thoughts. I understand the concept of RNN and how NLP uses it to predict the next word-token in the sequence/text.
I am struggling with the basic implementation of the “fastai.data.block.DataBlock” for the LSTM. The DataBlock requires x and y block types, e.g. for image classification it would be:
blocks=(fastai.vision.data.ImageBlock, fastai.vision.data.CategoryBlock)
But what is the definition for an LSTM DataBlock?
Hi Duchaba
A language model predicts the next word, so 1,2,3 predicts 2,3,4. On page 385 they are using
# nums, bs, and group_chunks are defined earlier in that chapter of the book
sl = 16
seqs = L((tensor(nums[i:i+sl]), tensor(nums[i+1:i+sl+1]))
         for i in range(0, len(nums)-sl-1, sl))
cut = int(len(seqs) * 0.8)
dls = DataLoaders.from_dsets(group_chunks(seqs[:cut], bs),
                             group_chunks(seqs[cut:], bs),
                             bs=bs, drop_last=True, shuffle=False)
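To show why the snippet above sets shuffle=False and drop_last=True, here is a plain-list sketch of the reordering that the book's group_chunks helper performs, as I understand it. It uses ordinary Python lists instead of fastai's L, so treat it as an approximation and not as the library's own code.

```python
def group_chunks(ds, bs):
    """Reorder ds so that each batch of size bs continues the previous batch.

    Plain-list approximation of the book's helper (assumption: the original
    does the same thing with fastai's L class).
    """
    m = len(ds) // bs  # number of batches
    new_ds = []
    for i in range(m):
        # batch i takes item i from each of the bs contiguous streams
        new_ds += [ds[i + m * j] for j in range(bs)]
    return new_ds

# Six chunks, batch size 2: stream A is chunks 0,1,2 and stream B is 3,4,5.
print(group_chunks(list(range(6)), 2))
```

With this ordering, the first slot of every batch walks through chunks 0, 1, 2 in order, so the hidden state carried over from one batch lines up with the text that follows it. Shuffling would break that alignment.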
(repost because I was replying to the wrong person)
Thank you so much for pointing me to the “Time series” group/discussion. I will read the postings to understand the group/topic base before I post a question on it.
Time series is a fascinating topic in neural networks. I have a vague idea of [somehow] using embedding layers, much like in collaborative filtering, with time series prediction. However, I need more experience in coding time series before I can begin that research.
I am putting the time series number sequence in a pandas DataFrame, where columns 1 to 12 are the X input and column 13 is the Y. Using a sliding-window technique, columns 2 to 13 are then the X input and column 14 is the Y, and so on. I am not using rank-1 tensors as in the book.
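One reading of the sliding-window layout described above, sketched with plain lists. The function name `sliding_windows` and the toy series are hypothetical; each resulting row holds twelve inputs followed by one target, and the list of rows could then be passed to `pandas.DataFrame(rows)` if a DataFrame is wanted.

```python
def sliding_windows(series, window=12):
    """Return rows of `window` inputs followed by the next value as target."""
    rows = []
    for start in range(len(series) - window):
        x = list(series[start:start + window])  # 12 consecutive values
        y = series[start + window]              # the value right after them
        rows.append(x + [y])
    return rows

series = list(range(1, 21))  # stand-in data: 1, 2, ..., 20
rows = sliding_windows(series)

print(len(rows))  # one row per window position
print(rows[0])    # values 1..12 as X, 13 as Y
print(rows[1])    # values 2..13 as X, 14 as Y
```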
Maybe I should rethink my data representation.
Hi again. I appreciate your initiative and Conwyn’s help implementing a language model. Still, I need to point out that yours is not a language problem but rather a time series problem. The way you are approaching it, with a sliding window, will make predictions based only on the previous twelve values. This works fine for training a basic language model. There, the input naturally divides into short sentences. (It may work OK for your time series problem if there are no long-range dependencies.) But that approach does not work well in general for a time series problem, where the prediction should be based on the entire series up to that point.
An RNN applied to a time series takes the sequence value and the hidden state at a time point, and outputs an updated hidden state. The hidden state goes into a “head” that derives a prediction. The prediction and target are compared by a loss function that trains the RNN. PyTorch provides functions that seemingly process the entire sequence in one step, though they must necessarily operate sequentially on the GPU. So the RNN, typically an LSTM or GRU, trains from the whole sequence, not in batches of twelve numbers.
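The recurrence described above can be written out by hand. The sketch below is a deliberately tiny scalar Elman-style RNN in plain Python, not an LSTM or GRU and not PyTorch code. The weights `w_x`, `w_h`, `b`, `w_out` are arbitrary, untrained values picked for illustration, and no training step is shown.

```python
import math

def rnn_forward(series, w_x=0.5, w_h=0.8, b=0.0, w_out=1.0):
    """Run a scalar RNN over the whole series, one time step at a time."""
    h = 0.0      # initial hidden state
    preds = []
    for x in series:
        # the new hidden state depends on the current value AND the old state
        h = math.tanh(w_x * x + w_h * h + b)
        # "head": turn the hidden state into a prediction of the next value
        preds.append(w_out * h)
    return preds

def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

series = [0.1, 0.2, 0.3, 0.4, 0.5]  # stand-in data
preds = rnn_forward(series[:-1])    # a prediction after each time step
loss = mse(preds, series[1:])       # each target is the following value
print(preds, loss)
```

Because `h` is carried through the loop, the prediction at each step depends on every earlier value in the series, not only on a fixed window of twelve.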
My suggestion is that you find a working tutorial on LSTM, study how it’s done, and experiment with changing the sequence to your own. There are many more advanced methods for predicting time series (it’s a rich area), and most of them work better than RNN. But an RNN is a fine place to start learning.