Forums

Multi-tasking with Flask / Gunicorn / Gevent ?

Please help me if you can; I am py_level = "nu-bie".

Problem description: I can't obtain any kind of "multi-tasking" on my own PC in a Flask application (not with Flask's built-in server, not with Gunicorn, not with Gevent).

Environment:
* Python version: 3.10.13
* Flask version: 3.0.1
* Ubuntu 20.04

Assume I have a Flask application where one of the "branches" is:

import time
from flask import Flask

app = Flask(__name__)

@app.route('/slow')
def slow_branch():
    # simulate a slow, blocking task (15 seconds total)
    for _ in range(15):
        time.sleep(1)
    return "finished"

(the function takes 15 seconds to finish)

The app is started with Flask, at the terminal:

$flask --app myapp run --host=localhost --port=5000

Now, on my own PC, I open 2 web browsers and point both of them to:

http://localhost:5000/slow

As fast as I can, I press [ENTER] on both browsers. The first browser finishes in 15 seconds, as expected. The second browser takes 32 seconds to finish (15 waiting + 15 function time + a 2-second delay on my part, to go from browser 1 to browser 2).

My conclusion: there is no multi-tasking taking place.

I try a "front-end" Gunicorn with:

$gunicorn -w 4 -b 127.0.0.1:5000 myapp:app

Same thing. 32 seconds for the second browser to finish. No multi-tasking.

I try Gevent, with a file containing:

from gevent.pywsgi import WSGIServer
from myapp import app
http_server = WSGIServer(('', 5000), app)
http_server.serve_forever()

This file is named [gevent.start]. Then, from the command line:

$python gevent.start

(I have a virtual environment, so [python] starts Python 3.10.13)

It starts the server OK, but there is no multi-tasking.
The first browser finishes in 15 seconds; the second browser takes 32 seconds to finish (15 + 15 + a 2-second delay on my part).
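One thing I am not sure about: as far as I can tell, gevent only makes blocking calls like time.sleep() cooperative if the standard library is monkey-patched before anything else is imported. A minimal sketch of that variant (assuming the same myapp module), in case it matters:

from gevent import monkey
monkey.patch_all()  # must run before other imports, so time.sleep() becomes cooperative

from gevent.pywsgi import WSGIServer
from myapp import app

http_server = WSGIServer(('', 5000), app)
http_server.serve_forever()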

What am I doing wrong?

Why are the workers not working?

How can multi-tasking be achieved? (Both browsers should finish in 15-16 seconds total, because they would be running in "parallel".)

I was thinking... (!) Should slow_branch() start a thread to do the work? In this way slow_branch() would return almost immediately. But the "Request" object will probably be destroyed by Flask, and my thread will need it (I think). So complicated! Do I need to write a socket application from scratch to do this multi-tasking?
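A rough sketch of that idea (assuming the thread only needs plain values copied out of the request before the handler returns; do_slow_work is just a made-up name for illustration):

import threading
import time

from flask import Flask, request

app = Flask(__name__)

def do_slow_work(name):
    # hypothetical long-running job, running in a background thread
    time.sleep(15)
    print("finished work for", name)

@app.route('/slow')
def slow_branch():
    # copy what is needed from the request *before* starting the thread,
    # because the request context is gone once this function returns
    name = request.args.get('name', '')
    threading.Thread(target=do_slow_work, args=(name,), daemon=True).start()
    return "accepted"  # returns almost immediately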

Thank you in advance for your kind help.

Take a look at https://help.pythonanywhere.com/pages/AsyncInWebApps/.

Hello, thank you for directing me to the AsyncWebApps link.

My question still remains the same.
I don't know how to resolve this "basic" concurrency issue.

Question: How can I achieve, on my own PC, concurrency for 2 requests to a Flask app?
Please, I need a very practical* answer.

*practical = the command line to start Gunicorn correctly (or a different server), and changes to the Flask app to make this work.

This is not a problem with [pythonanywhere]. This is a situation I do not know how to resolve on my own PC, with Flask/Gunicorn. On my PC I can have multiple "workers".

Please take into consideration that I have little Python experience/knowledge (same with Flask and Gunicorn). I have some knowledge in other areas of programming.

Problem: Two browsers make the same request (localhost:5000/somename) at the "same time". Result: no concurrency observed. The requests go to a routine that wastes time with sleep(15). The first browser finishes in 15 seconds; the second browser takes 30 seconds to finish (the requests are not concurrent, is my conclusion).

I have tried many options in Gunicorn, but never achieved any kind of concurrency.
(I am probably giving the wrong options in the start line.)

PS: The experiments are done on my own PC, because my free account with [pythonanywhere] has only 1 worker. I assume 2 workers in Gunicorn will give concurrency for 2 requests.

What am I doing wrong?

Thank you !

You can't do it inside the web app. You need to do the async stuff outside, and an Always-on Task is the good place for it.

Thank you so much for your reply !

I still do not know how to achieve "concurrency" on my own PC, for a Flask app.

Please, what is the [name] of such an [Always-on] task that will provide [concurrency] services to a Flask application?

In my mind, Gunicorn is an [Always-on] task also. It is a server. It has to be [on] always. In their documentation, they mention different types of threads to use. I tried them. Those workers are not working (in my case). I don't know how to make them work. Please show me how. If I knew, I would not be here! Can you help me?

If there is no server that offers those capabilities (web-request concurrency for up to [n] clients), please let me know, with a clear "No server exists to achieve concurrency for a Flask app".

Is there a simple "out-of-the-box" server that will provide concurrency for up to [n] clients?

If Python/Flask + [SomeServer] does not have a solution to obtain web concurrency, please let me know.

Thank you so much for your time and your help !

Please, close this topic/thread.

Today, finally, as of a few minutes ago, Gunicorn is in a good mood, working as it is supposed to work. Concurrency with [n] workers to service [n] clients at the same time.

What changed ? I do not know.

For 4 days in a row, I tried many things and nothing worked (no concurrency).

Today, it works (nothing done differently from before).

So, if tomorrow it does not work again, I do not know how to fix it (a reboot is my best guess).

fjl was referring to the Always-on Task feature on PythonAnywhere. Our default web apps don't support Gunicorn yet, but you can try it using the API, as an experimental feature that is currently being developed; see (and adjust): this help page.

Thank you for writing.

I apologize; my knowledge is very low with Python and related programs. My ignorance prevents me from understanding the answers given to me.

If it helps anyone,

As far as I understand, Gunicorn solves many types of concurrency problems without making any change to your program. For example, taking care of [n] simultaneous client requests, where each request is a slow process (uploading a file or a lengthy database request).

Gunicorn is a web server for your application (Flask/Django/other WSGI apps).

Gunicorn recommends placing a "regular" web server in front of it (Apache, Nginx, etc.) when deploying.

For testing purposes, you do not need to have Apache or Nginx. Gunicorn is enough.

You do not have to make any changes to your application for this to work.

Example: Apache receives the request from the end user and dispatches it to Gunicorn. Gunicorn takes care of the "multi-tasking" of your application and sends the request to an instance or thread of your app that is not busy. You do not make any threads; Gunicorn makes them automatically. When your application responds, the response goes to Gunicorn, then to Apache, then back to the client. This is all transparent to your application. No changes at all! Beautiful!

As you can see, Apache is not handling the "multi-tasking" of your application. Gunicorn is taking care of it.

In other words, you can remove the slow I/O bottleneck of your app by running Gunicorn in front of it (Flask/Django/other WSGI application).
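For instance (a sketch only; the socket path and port are just illustrative), Gunicorn is bound to an address or Unix socket that the front web server forwards requests to:

$gunicorn -w 4 -b unix:/tmp/myapp.sock myapp:app

or, for testing without a front server:

$gunicorn -w 4 -b 127.0.0.1:8000 myapp:app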

When starting Gunicorn, you specify:

a) How many workers. Each worker (as I understand) is an independent copy of your application, running in totally separate memory (each worker is a different process). They recommend 1 or 2 workers per CPU in your machine. These workers are good for CPU-intensive apps. Gunicorn keeps all copies loaded all the time, for fast response. Your limit is how many CPUs you have available and having enough memory for all the workers to do whatever it is they do.

b) How many threads [n] for each worker. If [n] is more than 1, [n] threads are started for each worker. You have to specify the type of worker ("gthread" is recommended for threads). It is perfectly OK to say 1 worker, 4 threads (gthread) when starting Gunicorn. You may also start with 4 workers, 2 threads; this means 2 threads for each worker. It is all up to you to decide how many is "good" for your system. This model (multi-threaded) is recommended for I/O-bound websites/pages (slow uploads/downloads, database queries, lengthy local processes).

If you start with 1 worker and 10 threads, you remove the I/O bottleneck of your website for up to 10 concurrent visitors making slow, I/O-bound requests.
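As a concrete sketch (the numbers are only examples), the start line for that kind of setup would look like:

$gunicorn --workers 1 --threads 10 --worker-class gthread -b 127.0.0.1:5000 myapp:app

or, for the 4 workers / 2 threads each case mentioned above:

$gunicorn -w 4 --threads 2 --worker-class gthread -b 127.0.0.1:5000 myapp:app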

As an example (as far as I understand), pythonanywhere could deploy Gunicorn as part of the web-service chain, fixing the number of workers to [1, 2, 3, ...] according to the type of account, and leaving the number of threads [n] configurable by the account holder. This would solve many concurrency problems for all types of accounts. It would not hurt others, since the memory usage is charged to each account, and the CPU usage would be the same, since threads consume processor time one at a time. The end user of the website would always see things as if he were the only one accessing the website. Fast web applications for [n] simultaneous users! Even an account that has 2 or 3 workers could benefit to a high degree from configurable [n] threads per worker.

What do you think? Perhaps you wish to share your valuable knowledge. Please add a [post] below!

Thank you again,