FastAI vs Keras+TensorFlow

Hello everyone,

Maybe I’m asking something that has been asked many times before, but I can’t find an exact answer to my question, so I’m asking it one more time here.

Throughout this course, fastai is used instead of Keras+TensorFlow. The FAQ of this forum also mentions why: the limitations of TF + Keras (they are much slower, result in less accurate models, and take more code).

My question is: are there limitations to fastai, (1) on its own and (2) in comparison to TF + Keras? If fastai is better than TF + Keras in all areas, can we forget about TF + Keras altogether?

I hope this question and the discussion to come will give us a better understanding of both fastai and TF + Keras. Thanks!



Limitations of fastai that jump to mind:

  • Not much documentation
  • Relies on PyTorch, whose production capabilities (mobile or high-scalability serving) are not yet as mature as TensorFlow’s
  • PyTorch doesn’t run on as many devices yet (e.g. Google’s TPUs)
  • Not supported by as big an organization as TF
  • Some parts still missing or incomplete (e.g. object localization APIs)

Benefits of fastai include:

  • Much less code for you to write for most common tasks
  • More best practices baked in, so normally faster to train and higher accuracy
  • Easier to understand
  • Handles tabular data much better
  • Fits in better with the wider Python ecosystem (e.g. pandas)
  • The dynamic nature of PyTorch is much better for experimentation and iteration, so many recent research papers are implemented in PyTorch first

I think the biggest issue with fastai, or even standalone pytorch, would be production deployability.

Another (implicit in your list) is that Keras/TF skills may be much more marketable. And that’s very important for those of us who would be really happy to switch to a DL job.

My solution (not particularly original) is to follow along with you and at the same time keep studying Keras/TF. For example, in my opinion Chollet’s book is an ideal complement to fastai.


Could you elaborate on the production deployability aspect?

From what I have learnt from folks such as James Bradbury (Salesforce) and other engineers working on TensorFlow, they acknowledge that while deploying PyTorch code is non-trivial, it is not impossible. There are three options for us: one, convert the PyTorch code to Caffe2 (the preferred technique at Facebook); two, use ONNX; and three, once your prototyping phase in PyTorch is done, convert the code to tf.eager. My impression is that the differences between tf.eager and PyTorch are shrinking, since the code can look quite similar at the kernel level.
Not sure if I am missing anything here.


Most people just stick PyTorch straight behind a Flask endpoint or similar, with CPU inference. It’s only likely to be an issue if you need to scale really big (in which case you can afford to hire specialized engineers!) or you need to deploy to mobile.
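The “stick it behind an endpoint” approach really is only a few lines. Here is a minimal sketch using only the standard library (a Flask version would look almost identical); `predict` is a hypothetical stand-in for a real model’s forward pass, e.g. a fastai learner’s prediction call:

```python
# Minimal sketch: serve a model behind an HTTP endpoint with CPU inference.
# `predict` is a placeholder for real model inference (hypothetical).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stand-in for a model's forward pass; here just a trivial "score"
    # so the example is self-contained.
    return {"score": sum(features) / max(len(features), 1)}

def handle_request(body_bytes):
    # Parse the JSON payload, run inference, return JSON bytes.
    payload = json.loads(body_bytes)
    result = predict(payload["features"])
    return json.dumps(result).encode()

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = handle_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), InferenceHandler).serve_forever()
```

For anything beyond a prototype you would put this behind a proper WSGI server, but the shape of the solution is the same.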

I don’t think the marketability issue is a big one, myself, since switching languages/libraries is pretty straightforward (a couple of days of study) once you understand the concepts. But I think trying to learn DL in a library that doesn’t show best practices, and hides concepts behind layers of complexity, is a big problem!

Having said that, Chollet’s book is excellent and reading it after or at the same time as doing the courses is a fine idea, as long as you don’t find it distracting from the concepts we’re teaching.


Jeremy has quite objectively and nicely answered this question.

It is true that today TensorFlow+Keras is much more prevalent than PyTorch+FastAi.

However, this situation can change, if:

  • FastAi’s Deep Learning courses popularize PyTorch+FastAi the way Coursera’s Data Science Specialization popularized R.
  • People realize the benefits of PyTorch+FastAi.

This reminds me of the war around 2000 between Intel’s Rambus and ?’s USB port standards.


Core limitations and the many layers of unnecessary complexity aside (which I think “eager” now resolves for TensorFlow), what best practices is the TensorFlow library lacking?

Learning rate finder, discriminative learning rates, batchnorm freezing, NLP classification on large documents, use of continuous and categorical columns in tabular data, access to CSV files, etc etc etc… :wink:
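The first of those, the learning rate finder (LR range test), is a simple idea at heart: run training steps while growing the learning rate exponentially and record the loss, then pick an LR somewhat below where the loss starts to blow up. A rough self-contained sketch (the quadratic loss and plain SGD are stand-ins, not fastai’s actual implementation):

```python
# Sketch of the idea behind the learning rate finder (LR range test).
# The quadratic loss loss(w) = w^2 and plain SGD are assumptions so the
# example runs on its own, without any DL framework.

def lr_schedule(lr_min, lr_max, n_steps):
    # Exponentially spaced learning rates from lr_min to lr_max.
    ratio = lr_max / lr_min
    return [lr_min * ratio ** (i / (n_steps - 1)) for i in range(n_steps)]

def lr_find(loss_grad, w0=1.0, lr_min=1e-5, lr_max=1.0, n_steps=50):
    w, history = w0, []
    for lr in lr_schedule(lr_min, lr_max, n_steps):
        w = w - lr * loss_grad(w)      # one SGD step at this learning rate
        history.append((lr, w * w))    # (lr, loss) pairs for loss(w) = w^2
    return history

# Inspect where the recorded loss starts growing instead of shrinking.
history = lr_find(loss_grad=lambda w: 2 * w)
```

In fastai this is one call (`learn.lr_find()`) with the loss-vs-LR plot done for you, which is exactly the kind of baked-in best practice being pointed at.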


Regarding deployment, I was working on ONNX export for fastai, and got it working pretty well for the standard image models, but it is hacky.

I would like to start a community effort to add ONNX export functionality to fastai, if people are interested and no other project like this is currently running.

I will write a message tomorrow going into much more depth about what I found.


In the technology world, there is always a war.

One of the oldest: video tape = Betamax vs VHS.

Even in recent times, there are many famous wars:

desktop = Microsoft vs Apple

mobile OS = Google vs Apple

The point is how to survive and fight for more market share.

DL frameworks (fastai and TF alike) are still under rapid development; neither is very mature yet. TF version 1.0 was released in 2017, so it doesn’t have a very long history. There is a long way (or war) to go. Can we put it that way?

Great response.

I’m new to deep learning and was doing this course. Something I believe you said in the course was that with another library such as TensorFlow it would be much easier to screw up. From that perspective, wouldn’t you also say that knowing how things operate under the hood is important for building a serious model?

fastai might be easier to use, but as a beginner who wants to build their own predictive model, if I need my model to be accurate I’m going to have to know how every single aspect works under the hood and how parameters must be tuned. Maybe I’m wrong, but to get down to a more granular level, isn’t the heavy lifting that other libraries require something that shouldn’t be ignored?

You could also write a blog post, if you have time to spare. I think most people would find it interesting.

Yes, I was thinking specifically about mobile.

Absolutely. That’s why in the course we show how everything works under the hood. Some of that is in part 1, some in part 2 (still in development).


We (Prolego) switched to pytorch/fastai for our consulting engagements for many of the reasons @jeremy lists. Additionally we find that enterprise clients are (understandably) wary about taking over whatever we create for them.

When we start the conversation with “Pytorch is a python class …” the leadership gets much more comfortable.

Most Fortune 500 companies want to compete on their data, business model, expertise - and not AI innovation. So we help them apply AI to what they already know how to do.


Hi @j.laute,
would you please help me deploy a fastai-trained model to ONNX?
I am trying, but I am getting this error:

RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 1000, 1000, 3] to have 3 channels, but got 1000 channels instead

Any idea how to fix this?
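That error usually means the input tensor is in channels-last (N, H, W, C) layout, while PyTorch convolutions expect channels-first (N, C, H, W): the conv layer reads your 1000 rows as 1000 channels. A common fix is to permute the axes before export or inference, sketched here with NumPy (with a torch tensor you would call `x.permute(0, 3, 1, 2)` instead):

```python
# Illustrating the layout mismatch behind the error: (1, 1000, 1000, 3)
# is channels-last, but PyTorch conv layers want (1, 3, 1000, 1000).
import numpy as np

x_nhwc = np.zeros((1, 1000, 1000, 3), dtype=np.float32)  # channels-last
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))              # channels-first
assert x_nchw.shape == (1, 3, 1000, 1000)
```

This is only a guess from the error message; if your preprocessing already produces channels-first tensors, the mismatch is happening somewhere else in the export pipeline.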