CTC loss Implementation?

Does anybody have experience with a CTC loss implementation, either in PyTorch or Keras? I found various GitHub repos, and a bunch are also mentioned in this nice CTC guide: Sequence Modeling With CTC
The main goal is to implement the CRNN architecture from An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. I started with Keras because my current implementation for a similar problem is based on a VGG-like CNN, but then I found the authors’ GitHub repo. The code is in Torch/Lua, so maybe it makes sense to port it to PyTorch instead…


I think TensorFlow has one.

Yes, it does, but again I’m thinking maybe it’s time to take a deep breath and dive into PyTorch :exploding_head:

The best CTC implementation is Baidu’s (warp-ctc), and there’s a PyTorch lib that uses it here: https://github.com/SeanNaren/deepspeech.pytorch . Or there’s a PyTorch version based on TF’s CTC here: https://github.com/ryanleary/pytorch-ctc

Let us know how you get along!


Over the weekend, after a couple of small hiccups, I was able to install and test Sean Naren’s PyTorch CTC version.
On a tangent, I also found a PyTorch implementation of the CRNN architecture I was planning to experiment with. It comes with weights, which I tested on my data, and the results are no better (I’d say worse) than what I currently get from my VGG-like CNN. Of course, it’s not clear what the results would be if I retrained the CRNN on my own data; I suspect they would be about the same…

Both architectures have pretty much the same CNN part, so it boils down to how the output sequence is predicted: using an RNN, or a multi-output FC head with an independent FC layer for each character in the sequence…

Thanks for the update - what hiccups did you have? And how did you get past them? Do you have any plans to write a little post about your experiences with this? (I, for one, would be very interested to learn more!)

tbh there is not much to write about, at least not yet. The hiccups were around the OSX build of the PyTorch bindings: torch complaints that one needs to ignore, the C++ compiler options, and the shared libs. Nothing that basic hackery won’t solve :slight_smile: But there is one more serious problem that responding to you helped me realize: the target deployment environment is Windows, and there is no PyTorch CTC version available for it, which means I need to do the porting. I think this should be doable, though it also depends on the availability and stability of the PyTorch Windows version (yes, I saw your post on Twitter about this).
ETA: I’m doing the training on Ubuntu with a GPU, but the dev work is mostly on Mac.

@helena, can you share your experience implementing CRNN?

Adding a few updates, in case anybody else is looking for a CTC loss implementation.

It’s now built into PyTorch as torch.nn.CTCLoss, as of version 1.0.

Further details in this link
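
For anyone landing here, a minimal sketch of calling torch.nn.CTCLoss, assuming PyTorch 1.0 or later. The random tensor only stands in for a real network's output, and the sizes T, N, C, S are made up for illustration. Index 0 is used as the blank label.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not from any real model):
T, N, C = 50, 4, 20   # time steps, batch size, classes including blank
S = 10                # padded target length

# Stand-in for the model output: log-probabilities of shape (T, N, C).
log_probs = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()

# Padded targets of shape (N, S); labels are 1..C-1 because 0 is the blank.
targets = torch.randint(1, C, (N, S), dtype=torch.long)

# Actual (unpadded) length of every input and target sequence in the batch.
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(5, S + 1, (N,), dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()   # gradients flow back to log_probs
print(loss.item())
```

In a CRNN-style setup, log_probs would be the log_softmax of the RNN output, with T equal to the number of feature-map columns coming out of the CNN.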


You should take a look at CTCModel. Its code can be used directly in Keras with little effort.
There is also a basic from-scratch implementation of CTC in Python in stanford-ctc.
It might be helpful for you.
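
To give a rough idea of what a from-scratch version computes, here is a minimal pure-Python sketch of the CTC forward (alpha) recursion in log space. It is not taken from either project above; the function names and the toy input are made up for illustration.

```python
import math

NEG_INF = float("-inf")

def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) for a few values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """log_probs: T rows of C log-probabilities; target: list of label ids."""
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    T, L = len(log_probs), len(ext)

    # At t = 0 a path can only start in the first blank or the first label.
    alpha = [NEG_INF] * L
    alpha[0] = log_probs[0][ext[0]]
    if L > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, T):
        new = [NEG_INF] * L
        for s in range(L):
            candidates = [alpha[s]]                 # stay on the same symbol
            if s >= 1:
                candidates.append(alpha[s - 1])     # advance by one
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            new[s] = logsumexp(*candidates) + log_probs[t][ext[s]]
        alpha = new

    # Valid paths end in the last label or the trailing blank.
    final = logsumexp(alpha[-1], alpha[-2]) if L > 1 else alpha[-1]
    return -final

# Toy check: 2 time steps, classes {blank, a}, uniform probabilities.
# Alignments for target [a] are "a-", "-a", "aa", each with probability 0.25.
uniform = [[math.log(0.5)] * 2 for _ in range(2)]
print(ctc_neg_log_likelihood(uniform, [1]))  # equals -log(0.75)
```

This only gives the loss value. A trainable version also needs the backward pass, which is what the library implementations mentioned in this thread provide.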