Estimating distribution parameters using RNNs


(Tim) #1

Hello!

I’m working on a problem where I am trying to forecast unit sales of products (so integers) at different locations. So what I have is a lot of time series: unit sales at each location for each product. I’ve found this great paper: https://arxiv.org/pdf/1704.04110.pdf, where they do exactly this, and I am trying to implement it. However, I’m a bit confused about something fundamental in the paper, which is:

Instead of predicting sequences of numbers, they use an RNN with LSTM cells to return sequences of ESTIMATES of the mean and variance of a Gaussian distribution, or the mean and dispersion of a negative binomial distribution. I am particularly interested in the negative binomial case because I am also predicting count data.

Most of the ML tutorials and everything I’ve read typically teach loss functions in terms of predicting values: you compare the predicted value with the training target and then adjust the network weights and biases to minimize the loss. That makes intuitive sense to me. In this paper, though, they frame it in terms of estimating the parameters of a distribution for the following time point.
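To make the "loss on distribution parameters" idea concrete, here is a minimal sketch of a negative binomial negative log-likelihood in plain Python. It assumes that "dispersion" means the size parameter `r` (so variance = mean + mean² / r), which may not match the paper's exact parametrization; the function names `nb_log_pmf` and `nb_nll` and the numbers are made up for illustration.

```python
import math

def nb_log_pmf(z, mean, r):
    """Log-probability of count z under a negative binomial with the given
    mean and size parameter r (assumed: variance = mean + mean**2 / r)."""
    return (
        math.lgamma(z + r) - math.lgamma(z + 1) - math.lgamma(r)
        + r * math.log(r / (r + mean))
        + z * math.log(mean / (r + mean))
    )

def nb_nll(targets, means, sizes):
    """Average negative log-likelihood over the time steps of one series."""
    total = sum(nb_log_pmf(z, m, r) for z, m, r in zip(targets, means, sizes))
    return -total / len(targets)

# Hypothetical observed counts and two sets of predicted means.
observed = [3, 0, 5, 2]
good = nb_nll(observed, means=[3.0, 0.5, 4.5, 2.0], sizes=[5.0] * 4)
bad = nb_nll(observed, means=[10.0] * 4, sizes=[5.0] * 4)
print(good, bad)
```

The loss is lower when the predicted distributions put more probability on the counts that were actually observed, so minimizing it plays the same role as minimizing a squared error on point predictions.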

The way I understand it, the inputs are sequences (time-lagged time series) and the targets are the non-lagged series. The inputs get fed into the LSTM cells, which return a sequence whose length is whatever period you want to predict and whose width is 2 (mean and dispersion in the negative binomial case).
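One detail about that width-2 output: both the mean and the dispersion have to be positive, while raw network outputs can be any real number. A minimal sketch of one way to handle this, assuming a softplus mapping (I believe the paper does something along these lines, but treat that as an assumption); `to_nb_params` and the `raw` values are hypothetical.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x)); maps any real number to a
    positive value (it can underflow to 0.0 for very negative x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def to_nb_params(raw_outputs):
    """Map raw (horizon x 2) network outputs to (mean, dispersion) pairs."""
    return [(softplus(a), softplus(b)) for a, b in raw_outputs]

# Hypothetical raw outputs for a 3-step forecast horizon.
raw = [(1.2, -0.3), (0.0, 2.0), (-4.0, 0.5)]
params = to_nb_params(raw)
for step, (mean, dispersion) in enumerate(params, start=1):
    print(step, mean, dispersion)
```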

My question then is how to turn these estimates into counts in order to assess how accurate the model is on test data. My intuition is that if you have the mean and dispersion of the distribution, you could calculate the mode of that distribution, floor(mean * ((dispersion - 1) / dispersion)), and because it is a discrete distribution, you would get a series of integers whose accuracy you could then calculate.
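A minimal sketch of that idea, again assuming "dispersion" is the size parameter `r` of the negative binomial. Under that assumption the closed-form mode only holds for r > 1; for r <= 1 the mode is 0. The median is included as an alternative integer point forecast. All names and parameter values here are hypothetical.

```python
import math

def nb_pmf(z, mean, r):
    """P(Z = z) for a negative binomial with the given mean and size r."""
    log_p = (
        math.lgamma(z + r) - math.lgamma(z + 1) - math.lgamma(r)
        + r * math.log(r / (r + mean))
        + z * math.log(mean / (r + mean))
    )
    return math.exp(log_p)

def nb_mode(mean, r):
    """Most likely count; the closed form only applies when r > 1."""
    if r <= 1.0:
        return 0
    return math.floor(mean * (r - 1.0) / r)

def nb_quantile(mean, r, q=0.5, max_count=100_000):
    """Smallest count whose cumulative probability reaches q (0.5 = median)."""
    cumulative = 0.0
    for z in range(max_count):
        cumulative += nb_pmf(z, mean, r)
        if cumulative >= q:
            return z
    return max_count  # search cap reached without hitting q

# Hypothetical per-step (mean, size) estimates coming out of the network.
forecast_params = [(10.0, 4.0), (2.5, 0.8), (6.0, 3.0)]
modes = [nb_mode(m, r) for m, r in forecast_params]
medians = [nb_quantile(m, r) for m, r in forecast_params]
print(modes, medians)
```

Either integer series could then be compared against the held-out counts with an ordinary error metric. Which point summary (mode, median, or mean) is the right one depends on the metric being used.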

Does anyone have experience with something similar to this? Modeling count data, estimating distribution parameters using neural networks?

Any input would be great! :slight_smile: