@jeremy We are looking at the BOW model in our lessons for NLP tasks and would like to know how we can use N-gram models or any other improved models?
We got good results in the course using CNNs and RNNs. I would suggest trying those.
I have tried RNNs and multi-size CNN architectures, but can you please explain how RNNs and CNNs go beyond the bag-of-words model, since we have used vocab_size in all of them?
There’s no relationship between vocab_size and bag of words. Can you explain your issue in more detail?
This is my understanding:
Any corpus (all data points) is tokenized into a set of unique words (this set is the vocabulary, and its size is vocab_size).
Each word in the vocabulary is mapped to a 50-dimensional (or 100, 300, etc.) embedding.
Here each word is a feature in itself.
These embeddings are passed through RNNs or CNNs, which learn to predict the correct category label.
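To make the steps above concrete, here is a minimal sketch in plain Python. The toy corpus, the 4-dimensional embedding size, the random initial vectors, and the helper names `encode` and `embed` are all assumptions for illustration, not what the course notebooks do. The point is that vocab_size only fixes how many rows the embedding table has; the sentence is still passed on as an ordered sequence of ids.

```python
import random

# Toy corpus (made up for illustration).
corpus = ["the movie was not good", "the movie was good"]

# Build the vocabulary: one integer id per unique token, in first-seen order.
vocab = {}
for sentence in corpus:
    for token in sentence.split():
        vocab.setdefault(token, len(vocab))
vocab_size = len(vocab)

# Hypothetical embedding table: one small random vector per vocabulary entry.
# In a real model these vectors are learned; 4 dimensions is an arbitrary choice.
embedding_dim = 4
rng = random.Random(0)
embeddings = [
    [rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)]
    for _ in range(vocab_size)
]

def encode(sentence):
    """Map a sentence to its ordered list of token ids."""
    return [vocab[token] for token in sentence.split()]

def embed(sentence):
    """Look up one embedding vector per token, preserving word order."""
    return [embeddings[token_id] for token_id in encode(sentence)]

ids = encode("the movie was not good")
vectors = embed("the movie was not good")
print(vocab_size, ids, len(vectors), len(vectors[0]))
```

The list returned by `embed` has one vector per word position, so whatever consumes it (an RNN or a CNN) still sees the words in order.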
My question is: how do I use “phrases” as features instead of only words, since phrases in a sentence capture the semantic meaning of the context?
Phrases can in turn be broken down into groups of words: 2 words at a time (bi-grams), 3 words at a time (tri-grams), or n words at a time (n-grams).
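For reference, extracting bi-grams and tri-grams from a tokenized sentence can be sketched as below. This assumes simple whitespace tokenization, and the helper name `ngrams` is made up for the example. Each n-gram could then be counted or given its own id, just like a single word.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens, in order, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the movie was not good".split()

bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

# Counting n-grams gives a "bag of n-grams" feature set.
bigram_counts = Counter(bigrams)

print(bigrams)
print(trigrams)
```

Note that the bi-gram ("not", "good") becomes a feature in its own right, which a plain bag of single words cannot express.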
@janardhanp22 The order in which the words are passed into the RNN affects which parameters are learned. For a CNN, the filters are applied to consecutive words, so again the word order of a sentence matters. BOW has no notion of sequence or order.
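A small sketch of that difference, using two made-up sentences that contain the same words in a different order. The helper names are hypothetical, and the width-2 windows only stand in for the consecutive word positions a CNN filter of width 2 would be applied to. There are no learned weights here.

```python
from collections import Counter

def bag_of_words(tokens):
    """Count tokens, discarding their positions."""
    return Counter(tokens)

def windows(tokens, width):
    """Consecutive word windows, like the spans a CNN filter slides over."""
    return [tuple(tokens[i:i + width]) for i in range(len(tokens) - width + 1)]

a = "dog bites man".split()
b = "man bites dog".split()

# Bag of words: identical, because only counts are kept.
same_bag = bag_of_words(a) == bag_of_words(b)

# Ordered input (what an RNN consumes step by step): different.
same_sequence = a == b

# Width-2 windows (what a CNN filter would see): different.
same_windows = windows(a, 2) == windows(b, 2)

print(same_bag, same_sequence, same_windows)
```

So a model fed the ordered sequence can tell the two sentences apart, while a bag-of-words representation cannot, even though both use the same vocabulary.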