Image Captioning

yashkatariya · May 5, 2017, 3:53am

@jeremy Was image captioning implemented in Part 2 of the course?

jeremy · May 5, 2017, 3:23pm

Not directly, although the section on DeVISE covered pretty much all you need to implement it. Would be a good class project.

yashkatariya · June 6, 2017, 11:40am

@jeremy I took your advice and have started implementing Image captioning as a project but I am running into some problems. The loss is stuck around 5 and the accuracy around 0.07. I have tried changing the activations, learning rate, the optimizer, but it had no effect. Can anyone tell me how to resolve this situation?

Link for the code:- https://gist.github.com/yashk2810/d49f2e40b624a782552b32bfdb9cc6cc

markovbling · June 7, 2017, 5:21pm

Jeremy, I LOVED the section of the course covering DeVISE - such an elegant idea!

I suspect you use an LSTM or some sort of RNN to go from word2vec embedding of the image to a sentence but I’d appreciate if you could please give a couple sentences of elaboration on how to go from DeVISE to image captioning?

Thanks!

jeremy · June 8, 2017, 5:19pm

I’m so glad you liked it - it was fun to put together! Yes you’d use an RNN for the sentence embedding. See the later translation videos for some ideas that might be useful.

markovbling · June 15, 2017, 10:28pm

Makes sense, thank you!

In case anyone is interested:
I’ve started watching the Andrej Karpathy CS231n videos and came across his 2015 paper on generating image captions using an RNN on top of a CNN:

Also, the DeepMind team recently released their image captioning model written in tensorflow available here:

yashkatariya · June 16, 2017, 2:21am

I have started implementing Image Captioning using Keras on the Flickr8k dataset. At 35.33% accuracy, I am getting decent results and I still have to train more. If you want, we can collaborate.

Code link:- https://github.com/yashk2810/Image-Captioning/blob/master/Image%20Captioning%20Bidirectional%20LSTM.ipynb

markovbling · June 18, 2017, 9:01pm

Very cool! Absolutely swarmed with work right now but will try remember to come back and take a look at this when I have some time to delve deeply into it

yashkatariya · July 8, 2017, 5:53pm

Image Captioning using InceptionV3 and Beam Search. https://github.com/yashk2810/Image-Captioning
https://yashk2810.github.io/Image-Captioning-using-InceptionV3-and-Beam-Search/