Getting neural networks to work on phones

I want to get neural networks running on a smartphone. To that end, I've been going through some papers accepted to the upcoming ICLR 2017 conference, but there are just too many different techniques. Does anyone have experience with them and can help me figure out which one works best?
Currently I am planning to use these two papers:

    Also, I came across techniques that can compress 32-bit neural networks down to 4–5 bit nets with no loss in accuracy.
    But it looks like this would be quite complicated and time-consuming to implement; we would need to implement low-level operations that work on 4–5 bits. Any suggestions from someone who has tried?

I’d suggest initially trying either SqueezeNet or ENet as your architecture and seeing if that’s enough. Maybe also look into the new Caffe2, since it’s designed specifically for mobile.

PS: You can even squeeze a network down to 1-bit :slight_smile: e.g.
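For intuition, here’s a minimal NumPy sketch of the 1-bit idea (BinaryConnect-style; the function name and setup are illustrative, not from any library): full-precision weights are kept around for gradient updates, and only a sign-binarized copy is used in the forward pass.

```python
import numpy as np

def binarize(w):
    """Deterministic binarization: map each weight to -1 or +1 by sign.
    Zeros go to +1 so the output is strictly binary."""
    return np.where(w >= 0, 1.0, -1.0)

# Full-precision "shadow" weights are kept for the optimizer;
# the forward pass only ever sees the 1-bit copy.
w = np.random.randn(3, 3)
w_bin = binarize(w)
print(sorted(np.unique(w_bin)))  # a subset of [-1.0, 1.0]
```

In the actual training schemes, the gradient is passed through the binarization via a straight-through estimator, but the storage win is already visible here: one bit per weight plus a scale factor, instead of 32.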

SqueezeNet is not accurate enough for our applications, so I was thinking a better alternative would be modern compression techniques (pruning, quantization) that compress without any loss in accuracy. This paper decreases the precision of a network down to 4 bits with an increase in accuracy. But implementing low-precision (<8-bit) ops would mean writing those operations in C++, so I was thinking of going ahead with weight sharing for now. The problem with pruning- and quantization-based methods is that they don’t significantly reduce the convolutional-layer parameters, which is why decreasing the number of bits becomes important.
Any suggestions for this? Is there any open-source library that has low-precision (<8-bit) convolution and other ops?
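Weight sharing in the Deep Compression sense can be sketched with a small k-means pass over a layer’s weights: with 2**bits clusters, a 4-bit codebook means at most 16 distinct values, stored as one codebook plus per-weight indices. A hypothetical NumPy sketch (names and loop structure are my own, not from any library):

```python
import numpy as np

def share_weights(w, bits=4, iters=20):
    """Cluster a weight tensor into 2**bits shared values with a small
    k-means loop, then replace every weight by its cluster centroid.
    Storage drops to the codebook plus per-weight cluster indices."""
    k = 2 ** bits
    flat = w.ravel()
    # Initialize centroids linearly over the weight range.
    centroids = np.linspace(flat.min(), flat.max(), k)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            members = flat[idx == j]
            if members.size:
                centroids[j] = members.mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[idx].reshape(w.shape), centroids, idx

w = np.random.randn(64, 64)
shared, codebook, idx = share_weights(w, bits=4)
print(len(np.unique(shared)))  # at most 16 distinct values
```

Note this only saves storage; the shared values are still floats at inference time unless you also implement low-precision ops, which is exactly the C++ work mentioned above.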


I haven’t seen one - although it’s not something I’ve looked into.

If you are running on a newer Snapdragon (82x or 83x), then SNPE offers quantized support as well as GPU and Hexagon DSP support, which is quite fast and efficient.


About newer Snapdragons: you can take a look at TensorFlow’s Runtime with HVX Acceleration.

It looks deprecated and no longer supported, though, so I’d recommend SNPE (Snapdragon Neural Processing Engine) instead.

Just to add my 2 cents here, as I’ve been working on getting NNs to run better on phones for the last few months, with a decent degree of success.


  1. MobileNet is really good, IMO. It gives accuracy as good as VGG-16, and it’s fast. Definitely not as fast as SqueezeNet, but who wants AlexNet-level accuracy in 2017?

  2. ENet is also really smooth, but that’s if you want segmentation.

  3. I’m starting to play with DenseNet (I found pre-trained Caffe models which I’ve converted to Caffe2 and also CoreML). I have NOT tested it on mobile devices yet, but I’m guessing it should work just fine.

  4. YOLO. TinyYOLO runs really fast on the iPhone 6s and above. There are umpteen implementations of TinyYOLO on GitHub which you can use on both iOS and Android devices. If you’re targeting iOS devices, you can use frameworks like Forge or CoreML to run YOLO on the phone.

  5. im2txt - Again, super easy to run on iOS/Android devices. A tad slow; not anywhere close to real time.
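Since the MobileNet recommendation hinges on depthwise-separable convolutions, here’s a quick back-of-the-envelope comparison in plain Python (layer sizes are illustrative) of why it’s so much smaller than a plain-convolution network at similar accuracy:

```python
def conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_params(c_in, c_out, k=3):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1x1 pointwise conv, as in MobileNet."""
    return k * k * c_in + c_in * c_out

c_in, c_out = 256, 256
standard = conv_params(c_in, c_out)        # 589824
separable = separable_params(c_in, c_out)  # 67840
print(standard / separable)                # roughly 8.7x fewer parameters
```

The saving grows with channel count, which is why the trick pays off most in the deeper, wider layers of the network.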


  1. TensorFlow - it’s sad to see that TF isn’t bullish on iOS. Their iOS APIs need a major facelift and should support Swift out of the box; no iOS developer wants to write Obj-C++. For Android, I think the support is pretty solid - you can even train on-device, and from what I heard they have a pretty robust Java API, which is awesome.

  2. Caffe2 - A framework that held SO much promise when it was released, but the work has more or less slowed down. Again, you only have a C++ and a Python API, along with a bunch of dependencies. Not a fan.

  3. Metal/MPSCNN - Going a little more iOS-specific, you can check out MPSCNN for running NNs on iOS 10+ devices. Works super smoothly, and with a framework like Forge it gets a lot easier. The downside is that you need to know Metal and have some background with it to use it.

  4. CoreML/Vision - A godsend for people working on iOS. So far, for practical purposes, CoreML/Vision works great on iOS 11+ devices. Of course, you need to pay a lot of attention while converting your models - always specify the exact pre-processing needed for that model. The major downside is that you can’t define your own layer types; until iOS 11 Beta 2, MobileNets couldn’t be implemented using CoreML because CoreML didn’t have a depthwise convolution layer.

Hope that helps, @singlasahil14. I’m yet to try tricks that @jeremy mentioned, stuff like quantization or pruning.

