Understanding LSTMs - Regarding the 3 sigmoids in Chris Olah's blog


(Aseem Bansal) #1

I was reading Chris Olah’s blog at http://colah.github.io/posts/2015-08-Understanding-LSTMs/

I was thinking that given the same input the sigmoid will always give the same output. So why would we have 3 different sigmoids in the cells? Is it just to make the diagram clear or am I missing something?


(Niyas Mohammed) #2

You are probably making one of the two following assumptions, just as I did :)

  1. The input is fed directly to a sigmoid function
  2. There is a matrix multiplication involved before piping it through the sigmoid function, but the weights and biases are the same for all three cases.

If you look closely, the input and the previous state are the same for all three sigmoids, but there is a matrix multiplication involved first, and the output of that matrix multiplication is what gets passed to the sigmoid function. If the weights and biases involved in these operations are different for the three cases, we can expect the outputs of the sigmoids to be completely different, even though the inputs are identical.
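To make that concrete, here is a rough NumPy sketch (my own illustration, not code from the blog). The names `f_t`, `i_t` and `o_t` follow the blog's notation for the forget, input and output gates. The sizes and the random weights are assumptions made up for the demo; in a real LSTM the weights are learned.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function; the same z always gives the same result
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3  # arbitrary sizes, assumed for illustration

# The same previous hidden state and the same input feed all three gates
h_prev = rng.standard_normal(hidden_size)
x_t = rng.standard_normal(input_size)
concat = np.concatenate([h_prev, x_t])  # [h_{t-1}, x_t]

# Each gate owns a separate weight matrix and bias (random stand-ins here)
shape = (hidden_size, hidden_size + input_size)
W_f, b_f = rng.standard_normal(shape), rng.standard_normal(hidden_size)
W_i, b_i = rng.standard_normal(shape), rng.standard_normal(hidden_size)
W_o, b_o = rng.standard_normal(shape), rng.standard_normal(hidden_size)

# Same input, different affine transform, so different sigmoid outputs
f_t = sigmoid(W_f @ concat + b_f)  # forget gate
i_t = sigmoid(W_i @ concat + b_i)  # input gate
o_t = sigmoid(W_o @ concat + b_o)  # output gate

print("forget:", f_t)
print("input: ", i_t)
print("output:", o_t)
```

All three results lie between 0 and 1, but they differ from each other because what reaches each sigmoid is `W @ concat + b` with that gate's own `W` and `b`, not the raw input.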

And sure enough…