How to scale an Image Recognition API?

Hi all,
So I have deployed a CNN written in Keras as a REST API using Flask on Render, and it works perfectly. I followed this tutorial.
However, I have a few doubts.

In my project, the client side will send a folder of images to get classified. This folder will usually contain ~10,000 images.

How can the client side send 10,000 images as POST requests? I tried running a for loop over a list of images, sending every single image as its own POST request, but this process is very slow.

How can I send images in batches to the REST API?

I tried reading this, but it’s a little too advanced for my DevOps skills right now. Is there a simpler solution?

I suspect the easiest way to scale this would be to run the inference in separate processes, e.g. multiple Flask worker processes managed by gunicorn/uwsgi/… (I assume you do inference on the CPU). That way you can process more images in parallel.
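To make the multi-process idea concrete, here is a minimal sketch of a gunicorn config file. The file name, port, worker count and timeout are assumptions to tune for your instance, not values from your project.

```python
# gunicorn.conf.py: hypothetical file name and values, adjust to your setup
import multiprocessing

# Address and port the workers listen on (assumed values).
bind = "0.0.0.0:8000"

# One worker process per CPU core is a common starting point for CPU-bound
# inference. Each worker loads its own copy of the model, so memory use
# grows with this number.
workers = multiprocessing.cpu_count()

# Seconds a worker may stay silent on a request before gunicorn restarts it.
timeout = 120
```

You would start it with something like `gunicorn -c gunicorn.conf.py app:app`, where `app:app` assumes your Flask object is called `app` and lives in `app.py`.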

Then you need to solve uploading: as you noticed, you won’t get any speed benefit if you just upload one image after another in a single process. In production you may have multiple sources of images to classify, in which case the requests already arrive concurrently and you don’t have to do anything special.

If you have a single process which should submit images for classification, ideally you’d use async uploading (httpx library) so that many requests are in flight at once.

Can you share a GitHub repository (if you are aware of any) where I can find the basic structure for my API?

As far as I understand, you just scale it like any other web app, so any Flask deployment tutorial with multiple worker processes should be fine. For the client-side submission, the httpx docs should tell you everything you need.

(Of course you could also send image batches and use a worker queue like Celery/RQ to distribute the load, but I guess that async uploading should be straightforward.)

You could perhaps try sending them all in JSON as base64. That would turn them all into one gigantic request.
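If you go the base64 route, a middle ground is several medium-sized requests instead of one gigantic one, since base64 inflates the data by about a third and a single request with 10,000 images may run into body size limits. Here is a minimal sketch using only the standard library. The JSON layout (`images`, `name`, `data`) and the helper names are made up for illustration.

```python
import base64
import json


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def encode_batch(batch):
    """Client side: [(name, bytes), ...] -> JSON string with base64 image data."""
    payload = {
        "images": [
            {"name": name, "data": base64.b64encode(data).decode("ascii")}
            for name, data in batch
        ]
    }
    return json.dumps(payload)


def decode_batch(body):
    """Server side: JSON string -> [(name, bytes), ...], the inverse of encode_batch."""
    payload = json.loads(body)
    return [(item["name"], base64.b64decode(item["data"])) for item in payload["images"]]
```

On the client you would loop over `chunked(images, 32)` (32 is an arbitrary batch size) and POST each `encode_batch(...)` result with a JSON content type. On the server, `decode_batch` gives you the raw bytes back so the model can predict on the whole batch in one call.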