You’d think APIs are simple. You send a request and get back a response. What’s so complicated about that? But sometimes the response takes too long to compute.
Google recommends you aim for a response time under 200 milliseconds; anything over half a second is a problem. For example, if you have developed a deep learning algorithm and want to share it with the world, you need to build an API that exposes it.
But you obviously can’t run the algorithm for 10 minutes before returning a response from the API. No user will wait that long, and you most certainly should not tie up an expensive, powerful GPU just to serve API requests.
In this talk we’ll explore tools for solving this problem and building a scalable system. We will get to know Celery, an open-source, asynchronous, distributed task queue. It will save you blood, sweat and tears when you set up a distributed system of workers to perform tasks for your API.
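
To make the idea concrete before the talk, here is a minimal sketch of the pattern a task queue enables: the API accepts the job, returns a job ID right away, and the client polls for the result later. This is not Celery code. It swaps in Python's standard-library thread pool so that it runs with no broker or extra packages, and the names slow_inference, submit_job and job_status are hypothetical, chosen only for illustration. With Celery, the in-process pool would be replaced by a message broker and separate worker processes.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# In-process stand-in for a pool of workers (assumption: a real
# deployment would run the workers as separate processes or machines).
executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job_id -> Future


def slow_inference(x):
    """Hypothetical stand-in for a long-running model computation."""
    time.sleep(0.2)
    return x * x


def submit_job(x):
    """What the 'start' endpoint does: enqueue the work, return at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(slow_inference, x)
    return job_id


def job_status(job_id):
    """What the 'status' endpoint does: report progress without blocking."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "PENDING"}
    if future.exception() is not None:
        return {"state": "FAILURE", "error": str(future.exception())}
    return {"state": "SUCCESS", "result": future.result()}
```

A client would call submit_job once, keep the returned ID, and call job_status until the state is no longer PENDING. The request handlers stay fast because the slow work never runs inside them.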