Keras vs PyTorch

Just wondering what people’s thoughts are on PyTorch vs Keras? E.g.

  • Do you use one or the other exclusively, or do you use both depending on the task?
  • Is PyTorch much more tricky than Keras (e.g. could you code faster in Keras than in PyTorch)?
  • What about in the longer term? Is one better for a wider range of activities (e.g. mobile, robotics, more complicated networks,…)?


I like some of the freedom PyTorch affords in terms of putting together your models. Keras seems to have documentation written by someone who cares a lot more about it. Also, the author / originator of Keras is pretty active on Twitter daily and posts good stuff. As weird as it seems, for similar network architectures, I seem to hit the GPU out-of-memory error on Keras more than on PyTorch. Also, for complicated or deep networks, building the computation graph takes forever. The way I figure, if we’re going to be functional / bleeding-edge data scientists, and if we already know how to program, then it’s in our best interest to just bite the bullet and learn both. I hope someone passionate goes in and submits a bunch of decent pull requests for PyTorch’s docs though…


PyTorch’s documentation is less complete, but they do have an active Discourse forum too:


Someone posted a superb explanation on Reddit a few days ago; it’s quite an eye-opener on software management from a company’s operations perspective.