Has anybody had experience with a CTC loss implementation, either in PyTorch or Keras? I found various GitHub repos, and a bunch are mentioned in this nice CTC guide: Sequence Modeling With CTC
The main goal is to implement the CRNN architecture from An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. I started with Keras because my current implementation for a similar problem is based on a VGG-like CNN, but then I found the authors' GitHub repo. The code is in Torch/Lua, so maybe it makes sense to port it to PyTorch instead…
I think TensorFlow has one.
Yes, it does, but again I'm thinking maybe it's time to take a deep breath and dive into PyTorch.
The best CTC is Baidu’s, and there’s a pytorch lib that uses it here https://github.com/SeanNaren/deepspeech.pytorch . Or there’s a pytorch version based on TF’s CTC here: https://github.com/ryanleary/pytorch-ctc
Let us know how you get along!
Over the weekend, after a couple of small hiccups, I was able to install and test Sean Naren's PyTorch CTC version.
On a tangent, I also found a PyTorch implementation of the CRNN architecture I was planning to experiment with. It comes with weights, which I tested on my data, and the results are no better (I'd say worse) than what I currently get from my VGG-like CNN. Of course, it's not clear what the results would be if I retrained the CRNN on my own data; I suspect they would be about the same…
Both architectures have pretty much the same CNN part, so it boils down to how the output sequence is predicted: using an RNN, or using a multi-output head with an independent FC layer for each character in the sequence…
Thanks for the update - what hiccups did you have? And how did you get past them? Do you have any plans to write a little post about your experiences with this? (I, for one, would be very interested to learn more!)
Tbh there is not much to write about, at least not yet. The hiccups were around the OSX build for the PyTorch bindings: the Torch complaints that one needs to ignore, the C++ compiler options, and the shared libs. Nothing that basic hackery won't solve, but… there is one more serious problem that responding to you helped me realize: the target deployment env is Windows, and there is no PyTorch CTC version available for it, which means I need to do the porting. I think this should be doable (though it also depends on the availability/stability of the PyTorch Windows version (yes, I saw your post on Twitter about this)).
ETA: I'm doing the training on Ubuntu with a GPU, but the dev is mostly on a Mac.
@helena, can you share your experience implementing CRNN?
Putting in a few updates, in case anybody else is looking for a CTC loss implementation:
It's now packaged as torch.nn.CTCLoss, as of version 1.0.
Further details in this link
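For anyone who wants to see the call shape, here is a minimal sketch of torch.nn.CTCLoss on random data. The sizes (T, N, C, S) and the random tensors are made up for illustration; in a real CRNN the log-probabilities would come from the network's output after a log_softmax, and class 0 is assumed to be the blank.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only: time steps, batch size, classes (incl. blank), max target length
T, N, C, S = 50, 4, 20, 10

# CTCLoss expects log-probabilities of shape (T, N, C)
log_probs = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()

# Targets are label indices 1..C-1 (0 is reserved for the blank here), padded to length S
targets = torch.randint(1, C, (N, S), dtype=torch.long)

# Per-sample lengths of the inputs and of the (unpadded) targets
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(5, S + 1, (N,), dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow back to log_probs
```

Note the time-major (T, N, C) layout; if your model outputs (N, T, C), you would need to permute it first.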
You should check out CTCModel. The code can be used directly in Keras and is simple to use.
There is also a basic from-scratch implementation of CTC in Python in stanford-ctc.
It might be helpful for you.
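If it helps to see what a from-scratch version looks like, below is a rough sketch of the CTC forward (alpha) recursion in plain Python. This is not the code from either repo mentioned above; the function names are my own, and it only computes the negative log-likelihood for a single sequence (no gradients, no batching).

```python
import math

def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) for a few values."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """log_probs: T rows of C log-probabilities; target: label ids without blanks."""
    # Extended target: a blank before, between, and after every label
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(log_probs)

    # Initialisation: a path may start in the first blank or the first label
    alpha = [-math.inf] * S
    alpha[0] = log_probs[0][blank]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, T):
        new = [-math.inf] * S
        for s in range(S):
            cands = [alpha[s]]                  # stay on the same symbol
            if s >= 1:
                cands.append(alpha[s - 1])      # advance by one position
            # Skipping a blank is allowed only between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])
            new[s] = logsumexp(*cands) + log_probs[t][ext[s]]
        alpha = new

    # Valid paths end in the last label or the trailing blank
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total
```

As a sanity check: with two frames, two classes (blank and one label) and uniform probabilities, the target [1] has three valid alignments out of four equally likely paths, so the result should be -log(0.75).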