Listening to Lecture 5, a student brings up a good question about whether order matters in our sentiment analysis of the IMDB dataset, right after we’ve built the most basic NN to start off with.
It turns out that even though our input may be sorted by frequency (?), we are actually not using any kind of Bag of Words model or technique.
Jeremy says (and I’m paraphrasing because he’s talking faster than I can type!):
“We are connecting every one of the inputs to the output, but doing it for every one of the incoming factors, creating a big cartesian product of all the weights, which takes into account position.”
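If I’m understanding that right, here’s a minimal numpy sketch of the difference (all names, shapes, and values are invented for illustration): a dense layer over the *flattened* sequence has a separate weight for every (position, feature) pair, so shuffling the words changes the output, while a Bag of Words sum doesn’t care about order at all.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, seq_len, emb_dim = 10, 4, 3

# toy embedding matrix, and one weight per (position, feature) pair --
# the "cartesian product" of positions and features from the quote
emb = rng.normal(size=(vocab, emb_dim))
W = rng.normal(size=(seq_len * emb_dim,))

tokens = np.array([2, 7, 1, 5])
shuffled = np.array([5, 1, 7, 2])  # same words, different order

def dense(t):
    # position-aware: weights are indexed by where each word sits
    return emb[t].reshape(-1) @ W

def bow(t):
    # bag of words: summing over positions discards order entirely
    return emb[t].sum(axis=0)

print(dense(tokens), dense(shuffled))           # different values -- order matters
print(np.allclose(bow(tokens), bow(shuffled)))  # order is ignored
```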
This is quite a mouthful. Jeremy, could you please elaborate?
When we looked at the collaborative filtering example, which helped motivate embeddings (including a dot product), there wasn’t any talk of position or cartesian products. Is there more going on, and how is order maintained in this way?
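To make my question concrete, here’s my current mental model in numpy (again, everything here is made up for illustration, so please correct me if it’s wrong): in collaborative filtering there is just one embedding per user and per movie and a single dot product, so there is no sequence for “position” to apply to; in the sentiment model, the per-token embeddings get laid side by side before the dense layer, and that concatenation seems to be where position sneaks in.

```python
import numpy as np

rng = np.random.default_rng(1)
emb_dim = 3

# collaborative filtering: one embedding each, one dot product, no sequence
user = rng.normal(size=emb_dim)
movie = rng.normal(size=emb_dim)
rating = user @ movie  # no notion of position anywhere

# sentiment model: one embedding per token, concatenated in order,
# so the downstream weights are tied to (position, feature)
tok_a, tok_b = rng.normal(size=emb_dim), rng.normal(size=emb_dim)
flat_ab = np.concatenate([tok_a, tok_b])
flat_ba = np.concatenate([tok_b, tok_a])
W = rng.normal(size=2 * emb_dim)

print(flat_ab @ W, flat_ba @ W)  # swapping the two tokens changes the result
```

Is that the right way to think about it?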