Hi, I’m the tech lead of a small AI startup. We’re mostly working on a product that does human pose detection on mobile devices and in the cloud.
Currently we’re planning the next major phase of our product development, and part of this is a full upgrade of our deep learning development pipeline to enable us to do more interesting things.
After watching the course video, reading through the Swift notebooks, and setting up the environment on my dev box, I’m completely sold. S4TF solves the problems of using Python in a production environment. Our application frontend and backend exclusively use TypeScript, which may be the best decision I’ve ever made for this project. And Swift felt the same once I got the VSCode extension working: syntax highlighting, auto-completion, and accurate real-time errors/warnings as you type. This is the language I’ve been searching for for years, and it is here.
Basically, I can compromise A LOT to get our team using Swift. And I do realize it’s not ready for production use right now. But the idea just keeps coming back to my mind: what if we bite the bullet and give it a try? We’re a small team and we can hack a lot of things if they’re hackable. So I’m collecting opinions here to help me estimate how big the “bullet” is.
These are some of the things we may care about:
The performance, especially the GPU performance for training
Using a public cloud (we are using Azure Machine Learning right now) for training
Ways to deploy to production (for example, TensorFlow Serving).
I believe there is no “one-click” export-and-deploy option right now. But what if we can afford to manually collect the weights from S4TF and apply them on the other side? What if we can even afford to replicate the network structure? Can we extract the graph pb from the S4TF binary?
How about pretrained weights? How hard would it be to convert some pretrained weights to an S4TF model? (I don’t think there is a way to import the model structure right now, correct me if I’m wrong)
Is there anything that may prevent us from implementing certain network architectures? For example, we use the MobileNet family a lot, and the perf issue with depthwise convolution in PyTorch (https://github.com/pytorch/pytorch/issues/18631) forced us to move away from PyTorch (and FastAI).
Anything else? Like non-deterministic results? Untraceable crashes? Unhackable blockers?
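To make the “manually collect the weights and apply them on the other side” idea a bit more concrete, here is a minimal Python sketch. It is not an S4TF or TensorFlow API. It assumes the weights start out in the (out, in, height, width) kernel layout that PyTorch uses and that the receiving side wants the (height, width, in, out) layout that TensorFlow uses. The JSON schema and all function names are hypothetical, chosen only for illustration.

```python
import json
from itertools import product


def oihw_to_hwio(flat, shape):
    """Reorder a flat, row-major conv kernel from (out, in, h, w) to (h, w, in, out)."""
    o, i, h, w = shape
    if len(flat) != o * i * h * w:
        raise ValueError("flat length does not match shape")
    out = [0.0] * len(flat)
    for oo, ii, hh, ww in product(range(o), range(i), range(h), range(w)):
        src = ((oo * i + ii) * h + hh) * w + ww   # row-major index in OIHW
        dst = ((hh * w + ww) * i + ii) * o + oo   # row-major index in HWIO
        out[dst] = flat[src]
    return out, (h, w, i, o)


def dump_weights(named):
    """Serialize {layer_name: (flat_values, shape)} to a JSON string (hypothetical schema)."""
    payload = {
        name: {"shape": list(shape), "data": list(values)}
        for name, (values, shape) in named.items()
    }
    return json.dumps(payload, sort_keys=True)


def load_weights(text):
    """Inverse of dump_weights: JSON string -> {layer_name: (flat_values, shape)}."""
    return {
        name: (entry["data"], tuple(entry["shape"]))
        for name, entry in json.loads(text).items()
    }
```

Each framework would then only need a small reader or writer for this neutral file, plus a hand-written copy of the network structure. Depthwise kernels and batch-norm statistics would need their own layout rules, which this sketch does not cover.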
If our internal schedule works out, we’ll pick a small “framework validation task”, like porting PoseNet to S4TF, and have 1-2 developers spend a week or so giving it a try. And I’ll post what we learn back here.
But any thoughts from the community before we get started will be super helpful!
There are a lot of questions here and I don’t have full answers, but: we are investing heavily in solving all of these, because it is supremely important to jump over the status quo and redefine expectations. There are a few things we are struggling with on the infra side of things (e.g. stack traces that go back to your source) but we expect those to be fixed when the plumbing gets sorted out.
Thanks Chris!
After thinking harder, I think there are only two potentially “unhackable” issues for me:
Is the current S4TF expected to achieve full TensorFlow GPU performance?
Is the “expressiveness” of the current S4TF expected to equal that of bare TF?
In my very naive understanding of how things relate to each other, S4TF has a full TensorFlow C++ core inside it, so the answer to both questions should be yes. Is this the case?
To be honest, I know nothing about S4TF, but before you use it in production, you should perhaps at least read this post: http://boringtechnology.club/
I wouldn’t bet the future of my newly founded company on this technology. But this is just me…
I’m not sure what you’re hoping to get out of using S4TF, other than playing around with cool technology. (You definitely don’t need it in order to do human pose detection on mobile.)
I don’t know anything about your company, but if I was an investor in an AI startup I’d probably not be happy if they decided to go with unproven technology in their stack. There are already plenty of risks involved in running a startup – not smart to make it even more risky…
Hi Matthijs, glad to see you here! Remember we exchanged emails about 2 months ago about pose estimation on mobile? (We’re still working on this, just taking a little “cloudy shortcut” at the moment… will get back to you soon.)
For me it’s not a technology stack choice, it’s more like a toolchain choice, since underneath it’s just the same TensorFlow. It’s like adding a unit test framework or a type checker for Python.
But I know it definitely means much more than adding a static type checker to our Python code.
I’m not trying to make the decision to “jump to S4TF” here. Instead I’m trying to decide whether it’s worth spending a little bit of my team’s time to give it a shot and see how far it goes, mostly as a learning experience.
So right now (after playing with it a little more last night) I’m inclined to make it my personal side project for the next month instead of bringing it up with the team.
Yeah, I’m well aware of the tendency of tech people, especially myself, to “chase shiny objects”. And I hope I’ve grown out of that (through some hard lessons…)
What I’m looking for is “Engineering Efficiency” for our team. And Python has a huge problem here: it makes the code hard to maintain and reuse, especially for teams of our size right now.
For a smaller team or a single-person project, Python is OK since everyone knows everything. For a bigger team and project, infrastructure and process-level solutions can be put in place (like unit tests, or a check-in process with static type checks) to mitigate the problems caused by Python’s dynamic typing.
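As a small illustration of the static type check idea, here is a sketch that assumes a checker such as mypy runs as part of check-in. `Keypoint` and `confident_keypoints` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Keypoint:
    """One detected body joint (hypothetical structure for illustration)."""
    name: str
    x: float
    y: float
    score: float


def confident_keypoints(keypoints: List[Keypoint], threshold: float = 0.5) -> List[Keypoint]:
    """Keep only the keypoints whose score reaches the threshold."""
    return [k for k in keypoints if k.score >= threshold]


# A static checker would reject the call below before the code ever runs,
# because a str is not a List[Keypoint]. Plain Python only fails once the
# line is actually executed:
# confident_keypoints("nose")
```

The annotations cost little to write, but they only help if the checker is enforced in the process. Nothing in the Python runtime itself stops unannotated or wrongly typed code from being checked in.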
Last year we switched our entire stack to TypeScript, which is a language roughly comparable to Swift in lots of ways (and their rankings are very close on language trend boards). This change made our lives orders of magnitude better. We don’t have the resources to write unit tests for the majority of our code. But with typings enforced across frontend and backend, we can confidently make last-minute API changes without worrying about breaking random things in some hidden corner of the code base, because the compiler checks every corner for us.
With S4TF I’m looking for the same thing for our research team. But now I realize the ecosystem is at a very early stage and there are still rough edges in the language itself.
It’s too early to use S4TF in production just now - as we said in the course, most stuff doesn’t quite work yet! But it’s a great time to get involved for research and learning, since it has more potential than anything else out there IMHO.
The underlying TF runtime is really slow, unfortunately, but Swift will hopefully leverage MLIR to leapfrog all other libs, and in the future might well be the fastest option.