Flask is often described as synchronous. Async can be made to work, but it takes extra effort. So, let’s see how a naive web server handles requests synchronously. We expect that when a bunch of requests arrive, the server will process them one at a time, meaning that while one request is being served, the other requests cannot be processed and have to wait.
Let’s see it in action. We first define an endpoint that applies a simple 1-second delay every time a request is received.
L9~L21 define the endpoint. We capture several interesting data points, such as the process ID, thread ID and execution time, for demonstration purposes. At L26, we have to specify `threaded=False` to make the server single-threaded.
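For readers who want something to run, here is a minimal sketch of such an endpoint. It is not the original listing, so its line numbers will not match the ones mentioned above. The route path, the JSON field names, the `DELAY_SECONDS` constant and the `main` helper are all assumptions made for illustration.

```python
import os
import threading
import time

from flask import Flask, jsonify

app = Flask(__name__)

DELAY_SECONDS = 1  # artificial delay applied to every request


@app.route("/")
def index():
    started = time.time()
    time.sleep(DELAY_SECONDS)  # simulate blocking work
    # Report which process and thread served the request, and how long it took.
    return jsonify(
        process_id=os.getpid(),
        thread_id=threading.get_ident(),
        execution_time=round(time.time() - started, 3),
    )


def main():
    # threaded=False makes the development server handle one request at a time.
    app.run(threaded=False)
```

Calling `main()` starts the development server on its default address. It is wrapped in a function here only so that importing the module does not block.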
Looking at the source code of Flask.run, we can see that it runs a development server by calling the werkzeug.serving.run_simple method. Flask’s own documentation repeats several times that its built-in server is not for production. In production, Flask serves only as the web application framework, and we do need a separate WSGI server.
Below is a diagram showing the underlying dependencies behind flask.run, and how the development server is built from some of the most foundational built-in facilities like os.fork, threading, etc.
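The same dependencies can also be inspected from Python. The sketch below prints the direct base classes of Werkzeug's development server classes. The class names `BaseWSGIServer`, `ThreadedWSGIServer` and `ForkingWSGIServer` are Werkzeug internals as I know them and may change between versions; `base_names` is a hypothetical helper.

```python
import http.server
import socketserver

from werkzeug.serving import BaseWSGIServer, ForkingWSGIServer, ThreadedWSGIServer


def base_names(cls):
    """Return the names of the classes that cls directly inherits from."""
    return [base.__name__ for base in cls.__bases__]


# Each development server class is a thin layer over the standard library.
for server_cls in (BaseWSGIServer, ThreadedWSGIServer, ForkingWSGIServer):
    print(server_cls.__name__, "->", base_names(server_cls))
```

The threaded variant gets its behavior from the standard library's `socketserver.ThreadingMixIn`, and the base server is a subclass of `http.server.HTTPServer`.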
Putting aside for a second the question of how the server is built, let’s spawn some requests and test the performance.
The code above will spawn 10 requests simultaneously. There are several other ways of doing it, and the easiest way to make asynchronous requests is grequests, with code as simple as the above. There is a whole discussion about imap versus map, and whether they truly issue 10 requests at a time, that is worth noting, but the code above works for me.
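As one of those other ways, here is a sketch that uses only the standard library (a thread pool plus urllib) instead of grequests. The `URL` value assumes the development server is listening on its default local address, and `fetch` and `fire` are hypothetical helper names.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://127.0.0.1:5000/"  # assumed address of the local development server


def fetch(url):
    """Issue one GET request and return (status code, seconds taken)."""
    started = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
        status = response.status
    return status, round(time.perf_counter() - started, 3)


def fire(n, url=URL, fetch_fn=fetch):
    """Send n requests at the same time, one worker thread per request."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(fetch_fn, [url] * n))
```

With the server running, `fire(10)` returns one `(status, seconds)` pair per request, so the per-request timings can be compared between the two server settings.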
There is a clear difference between threaded=True and threaded=False on the server side. With flask.run threaded=False, even though all the requests were issued simultaneously, they were processed one by one.
After we changed app.run to threaded=True, this is what the responses look like now.
All the requests come back at once, after about 1 second (1.457s).
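The same contrast can be reproduced without Flask. The sketch below is a stand-in, not the development server itself: it uses the standard library's `HTTPServer` (one request at a time) and `ThreadingHTTPServer` (one thread per request). The 0.2-second delay, the burst of 5 requests, and the names `SlowHandler`, `get` and `time_burst` are assumptions chosen to keep the run short.

```python
import http.client
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

DELAY_SECONDS = 0.2  # shorter than the 1 second used in this post


class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(DELAY_SECONDS)  # simulate blocking work
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the per-request log lines


def get(port):
    """Send one GET request to the local server and return the body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read()
    finally:
        conn.close()


def time_burst(server_cls, n=5):
    """Start server_cls on a free port, send n concurrent GETs, return total seconds."""
    server = server_cls(("127.0.0.1", 0), SlowHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        started = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n) as pool:
            list(pool.map(get, [port] * n))
        return time.perf_counter() - started
    finally:
        server.shutdown()
        server.server_close()
```

Comparing `time_burst(HTTPServer)` with `time_burst(ThreadingHTTPServer)` should show the single-threaded server taking roughly n times the delay, while the threaded one finishes in roughly one delay.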
This is the difference that using threads makes on the server side. We can also test it using multiple processes. The end outcome looks similar, as all the requests are processed quickly, but the difference is that the server creates many new processes, one for each new request.
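A sketch of the process-based setup could look like the following. The value `processes=4` is arbitrary and `main` is a hypothetical helper. As far as I know, Werkzeug refuses to run with both threads and multiple processes enabled, so `threaded=False` is passed explicitly, and forking is only available on platforms that support `os.fork`.

```python
import os
import time

from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    time.sleep(1)  # simulate blocking work
    # The process ID shows which process handled this request.
    return jsonify(process_id=os.getpid())


def main():
    # threaded and processes cannot be combined, so threading is turned off.
    # processes=4 is an arbitrary choice for this sketch.
    app.run(threaded=False, processes=4)
```

After calling `main()`, the `process_id` values in the responses should differ from request to request, unlike in the threaded setup where they stay the same.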
These two different approaches have their own pros and cons. There is a great comparison between threads and processes.
After seeing what just happened, the next time people say Flask is synchronous and can handle only one request at a time, you will know for sure that by configuring threads or processes in app.run, it can handle more than one request at a time. Should we use it? Clearly not. Even app.run itself is not supposed to be called in production, but can/cannot is different from should/should not.
Have a bet with your coworkers and win yourself a beer!