This is my first time doing image captioning (IC) modeling. I want to extract images and their corresponding text sequences. What is the best way to organise those files for training an image captioning model?