Abstraction and Focus

I think there’s a lack of clarity in how the Swift for TensorFlow project and the fastai library fit together. In today’s SIG Swift meeting [meeting notes] someone went to the trouble of implementing a learning rate scheduler for S4TF [link]. That would be useful if fastai didn’t already have one. A few meetings ago there was discussion about setting up an S4TF ‘model zoo’ inspired by PyTorch’s. When Jeremy brought up that fastai already had this, the response from several people implied they thought fastai was simply a collection of interesting notebooks presented in the two fastai+S4TF lectures at the end of the Deep Learning Course v3.

I’m noticing a trend to bring everything together into S4TF, or rather to recreate it there. I also notice a lot of people either mentioning fastai only in connection with a few notebooks, or being unaware that it’s a deep learning training library.

I also don’t really know what S4TF or MLIR are. MLIR I’m okay with: it may be some sort of differentiable compiler, and I’ll get that in time. But what is S4TF? It looks like there’s confusion about abstraction levels.

We have serious limitations in Python. We have an easy-to-use and powerful language in Swift, connected to a powerful compiler, LLVM, and to the MLIR project. We also have the world’s best training library in fastai, which will probably evolve (if it hasn’t already) into a general-purpose differentiable computing library.

My understanding is that there are serious gains in combining the two (presumably using TensorFlow as the initial ML framework). That sounds like a great idea. Then why is work being done to reinvent the wheel? Work on the Swift and fastai sides should focus on those domains and on integration.

This is the forum for S4TF and fastai collaboration. We should figure out how these pieces fit together and what range of abstractions is appropriate to each. That way we save people from doing serious duplicate work, and keep the community on the same page. It’ll also give a clear sense of progress and of what real-world goal we’re moving towards.

The way I envision the deep learning / differentiable stack is Low / Mid / High : MLIR / Swift / fastai. Each part of it has an important role.

@jeremy and @clattner, I think this is relevant to you; and anyone involved on either side of harebrain.


Thanks for the comments and questions @Borz. They’re absolutely the right questions to be asking - and I’m not sure anyone fully has the answers just yet. Frankly, we have the same issues in trying to figure out the demarcation between fastai and pytorch too!

The good news is that the s4tf team has committed to working on answering these questions right from the start. So I think we’ll end up with a better answer than we’ve gotten to with pytorch (where, for instance, they have their own optimizers and schedulers and stuff, but they’re much less flexible than fastai’s - and they don’t integrate with any training loop because there isn’t a pytorch training loop at all!)

It’s challenging, because @sgugger and I are vastly less experienced with Swift than most of the s4tf team, but on the other hand some on the s4tf team have less experience training practical DL models or doing DL research. So we’re doing our best to leverage each other’s expertise and develop skills in the areas we’re less familiar with.

Hopefully in a couple of weeks we’ll be able to present the s4tf fastai library to the wider s4tf community, showing some of its capabilities, explaining some of the background behind its features and API design choices, and discussing ideas for next steps. And before that happens we’ll try to package it up better so that it’s easier to get started with both using and contributing to the library.

As part of that process, we’ll work closely with @clattner, @dynamicwebpaige, @saeta and the whole team to better figure out where the API boundaries are, what API layers should exist, and who is responsible for what. But I’d expect that to be pretty fluid for quite a while! :slight_smile:


Hey all,

+1 to the questions, and to everything @jeremy has said.

Just as a follow-up: a few weeks ago, I presented a proposal for layering the S4TF APIs at the Oct 18 design meeting, and I have now started implementing it. We still have a ways to go to fully flesh out the boundaries, but we’re actively working in this direction.

Questions, comments, and concerns welcome!

All the best,