Sorry if this question is very obvious.
I have a document:
[I ate the food]
[I am eating today]
[I cannot do it]
[What are you doing]
During the fine-tuning step, what does one training example look like?
That is, are the tokens [“I”, “ate”, “the”] used as the input to predict “food”?
Is there a window size here? I am confused.
Please let me know how the actual training works.
For example, in word2vec there is a sliding window over the words that determines which pairs are trained on.
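
To make the question concrete, here is a minimal sketch of the two schemes I have in mind, using the first line of my document. The function names `prefix_examples` and `window_pairs` are made up for illustration, the whitespace split stands in for a real tokenizer, and I am not claiming either one is what fine-tuning actually does.

```python
def prefix_examples(tokens):
    """Every (prefix, next token) pair: the scheme I am guessing at above."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def window_pairs(tokens, window):
    """word2vec skip-gram style (center, context) pairs within a fixed window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Naive whitespace split (an assumption, not a real tokenizer).
tokens = "I ate the food".split()

for context, target in prefix_examples(tokens):
    print(context, "->", target)

for center, context in window_pairs(tokens, window=1):
    print(center, "<->", context)
```

Is one training example closer to the first kind (all previous tokens predict the next one) or the second kind (a fixed window around each word)?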