How to optimize the performance of a Flask application


Introduction

Flask is a lightweight and flexible web framework for building small to medium-sized applications. It is commonly used in projects ranging from simple personal blogs to more complex applications, such as REST APIs, SaaS platforms, e-commerce websites, and data-driven dashboards.

However, as your application increases in traffic or complexity, you may notice performance bottlenecks. Whether you're building a content management system (CMS), an API for a mobile app, or a real-time data visualization tool, optimizing Flask performance is crucial to delivering a responsive and scalable user experience.

In this tutorial, you will explore techniques and best practices for optimizing the performance of a Flask application.

Prerequisites
  • A server running Ubuntu and a non-root user with sudo privileges and an enabled firewall. For instructions on how to set this up, please select your distribution from this list and follow our Getting Started with a Server guide. Please make sure you are running a supported version of Ubuntu.
  • A working knowledge of the Linux command line. For an introduction or refresher, you can check out this guide on the Linux Command Line Primer.
  • Basic understanding of Python programming.
  • Python 3.7 or higher is installed on your Ubuntu system. To learn how to run Python scripts on Ubuntu, you can refer to our tutorial on how to run Python scripts on Ubuntu.

Setting up your Flask environment

Ubuntu 24.04 comes with Python 3 by default. Open a terminal and run the following command to double-check your Python 3 installation:

root@ubuntu:~# python3 --version
Python 3.12.3

If Python 3 is already installed on your machine, the above command will return its version. If it is not installed, you can run the following command to install it:

root@ubuntu:~# sudo apt install python3

Next, you need to install the pip package installer on your system:

root@ubuntu:~# sudo apt install python3-pip

After installing pip, let's install Flask.

You will install Flask via pip. It is recommended to do this in a virtual environment to avoid interference with other packages on your system.

root@ubuntu:~# python3 -m venv myprojectenv
root@ubuntu:~# source myprojectenv/bin/activate
root@ubuntu:~# pip install Flask

Create a Flask application

The next step is to write the Python code for the Flask application. To create a new script, go to the directory of your choice:

root@ubuntu:~# cd ~/path-to-your-script-directory

Once you're in the directory, create a new Python file, app.py, and import Flask. Next, initialize a Flask application and create a root path.

root@ubuntu:~# nano app.py

This will open a blank text editor. Write your logic here or copy the code below:

from flask import Flask, jsonify, request
import time

app = Flask(__name__)

# Simulate a slow endpoint
@app.route('/slow')
def slow():
    time.sleep(2)  # to simulate a slow response
    return jsonify(message="This request was slow!")

# Simulate an intensive database operation
@app.route('/db')
def db_operation():
    # This is a dummy function to simulate a database query
    result = {"name": "User", "email": "[email protected]"}
    return jsonify(result)

# Serve a simple static page
@app.route('/')
def index():
    return "<h1>Welcome to the Sample Flask App</h1>"

if __name__ == '__main__':
    app.run(debug=True)

Now let's run the Flask application:

root@ubuntu:~# flask run

You can test the endpoints with the following curl commands:

Test the / endpoint (returns static content):

root@ubuntu:~# curl http://127.0.0.1:5000/
Output
<h1>Welcome to the Sample Flask App</h1>

Test the /slow endpoint (simulates a slow response):

root@ubuntu:~# time curl http://127.0.0.1:5000/slow

To check this slow endpoint, we use the time command in Linux. The time command is used to measure the execution time of a particular command or program. It provides three main pieces of information:

  1. Real Time: The actual time elapsed from the start to the end of the command.
  2. User Time: The amount of CPU time spent in user mode.
  3. System Time: The amount of CPU time spent in kernel mode.

This will help us measure the actual time taken by our slow endpoint. The output might look something like this:

Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow 0.00s user 0.01s system 0% cpu 2.023 total

This request takes about 2 seconds to respond, because the time.sleep(2) call simulates a slow response.
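You can also take this kind of measurement from inside Python itself. Below is a minimal, stdlib-only sketch of the idea: time a callable with `time.perf_counter()`, using a short `time.sleep()` as a stand-in for the slow endpoint (the `measure` helper is ours, not part of Flask):

```python
import time

def measure(fn):
    """Return (result, elapsed_seconds) for a callable."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed

# A 0.2-second sleep stands in for the /slow handler's delay
result, elapsed = measure(lambda: time.sleep(0.2) or "done")
print(f"took {elapsed:.2f}s")  # roughly 0.2s
```

This is the same number the `real` row of `time` reports: wall-clock elapsed time, which is what your users actually experience.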

Let's test the /db endpoint (simulates database operations):

root@ubuntu:~# curl http://127.0.0.1:5000/db
Output
{"email":"[email protected]","name":"User"}

By testing these endpoints using curl, you can verify that your Flask application is running correctly and that the responses are as expected.

In the next section, you will learn to optimize application performance using various techniques.

Use a production-ready WSGI server

Flask's built-in development server is not designed for production environments. To effectively handle concurrent requests, you should move to a production-ready WSGI server like Gunicorn.

Install and launch Gunicorn

Let's install Gunicorn.

root@ubuntu:~# pip install gunicorn

Run the Flask application using Gunicorn with 4 worker processes:

root@ubuntu:~# gunicorn -w 4 -b 0.0.0.0:8000 app:app
Output
[2024-09-13 18:37:24 +0530] [99925] [INFO] Starting gunicorn 23.0.0
[2024-09-13 18:37:24 +0530] [99925] [INFO] Listening at: http://0.0.0.0:8000 (99925)
[2024-09-13 18:37:24 +0530] [99925] [INFO] Using worker: sync
[2024-09-13 18:37:24 +0530] [99926] [INFO] Booting worker with pid: 99926
[2024-09-13 18:37:25 +0530] [99927] [INFO] Booting worker with pid: 99927
[2024-09-13 18:37:25 +0530] [99928] [INFO] Booting worker with pid: 99928
[2024-09-13 18:37:25 +0530] [99929] [INFO] Booting worker with pid: 99929
[2024-09-13 18:37:37 +0530] [99925] [INFO] Handling signal: winch
^C[2024-09-13 18:38:51 +0530] [99925] [INFO] Handling signal: int
[2024-09-13 18:38:51 +0530] [99927] [INFO] Worker exiting (pid: 99927)
[2024-09-13 18:38:51 +0530] [99926] [INFO] Worker exiting (pid: 99926)
[2024-09-13 18:38:51 +0530] [99928] [INFO] Worker exiting (pid: 99928)
[2024-09-13 18:38:51 +0530] [99929] [INFO] Worker exiting (pid: 99929)
[2024-09-13 18:38:51 +0530] [99925] [INFO] Shutting down: Master

Here are the benefits of using Gunicorn:

  • Concurrent request handling: Gunicorn allows multiple requests to be processed simultaneously using multiple worker processes.
  • Load Balancing: Balances incoming requests across worker processes and ensures optimal use of server resources.
  • Asynchronous workers: With asynchronous workers like gevent, it can efficiently execute long-running tasks without blocking other requests.
  • Scalability: Gunicorn can scale horizontally by increasing the number of worker processes to handle more concurrent requests.
  • Fault tolerance: Automatically replaces workers that are unresponsive or broken, ensuring high availability.
  • Production-Ready: Unlike the Flask development server, Gunicorn is optimized for production environments with better security, stability, and performance features.

By switching to Gunicorn for production, you can significantly improve the throughput and responsiveness of your Flask application, preparing it to efficiently handle real-world traffic.
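Instead of passing flags on the command line, Gunicorn can also read its settings from a Python config file. A minimal `gunicorn.conf.py` sketch is below; the "two workers per core, plus one" count is a common rule of thumb, not a Gunicorn requirement, so tune it for your workload:

```python
# gunicorn.conf.py -- start with: gunicorn -c gunicorn.conf.py app:app
import multiprocessing

bind = "0.0.0.0:8000"
# Rule of thumb: two workers per CPU core, plus one
workers = multiprocessing.cpu_count() * 2 + 1
# Restart a worker after this many requests to curb slow memory leaks
max_requests = 1000
# Kill and replace a worker that is silent for more than 30 seconds
timeout = 30
```

Keeping these settings in a file makes them easy to version-control and reuse across environments.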

Enable caching to reduce load

Caching is one of the best ways to improve Flask performance by reducing overhead. Here, you add Flask-Caching to cache the result of the /slow route.

Install and configure Flask-Caching with Redis

Install the necessary packages:

root@ubuntu:~# pip install Flask-Caching redis

Update app.py to cache the /slow route

Open the editor and update the app.py file with the following command:

root@ubuntu:~# nano app.py

from flask_caching import Cache

app = Flask(__name__)

# Configure Flask-Caching with Redis
app.config['CACHE_TYPE'] = 'redis'
app.config['CACHE_REDIS_HOST'] = 'localhost'
app.config['CACHE_REDIS_PORT'] = 6379
cache = Cache(app)

@app.route('/slow')
@cache.cached(timeout=60)
def slow():
    import time
    time.sleep(2)  # Simulate a slow response
    return jsonify(message="This request was slow!")

After the first request to /slow, subsequent requests are served from the cache within 60 seconds, bypassing the time.sleep() function. This reduces server load and increases response speed.

Note: For this tutorial, we are using localhost as the Redis host. However, in a production environment, it is recommended to use a managed Redis service such as DigitalOcean Managed Redis. This provides better scalability, reliability, and security for your storage needs. You can learn more about integrating DigitalOcean Managed Redis into a production-level application in this tutorial on Storage Using DigitalOcean Redis on the Application Platform.

 

To check if data is being cached, we run the following commands for the /slow endpoint.

This is the first request to the /slow endpoint. After this request completes, the result of the /slow route is cached.

root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow 0.00s user 0.01s system 0% cpu 2.023 total

This is a subsequent request to the /slow endpoint within 60 seconds:

root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow 0.00s user 0.00s system 0% cpu 0.015 total
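Under the hood, the `timeout=60` behavior comes down to a time-stamped key-value store: serve the saved value while it is fresh, recompute when it expires. Here is a stdlib-only sketch of that idea (Flask-Caching and Redis do this for you, with proper eviction and cross-process sharing; the `ttl_cache` decorator below is ours, for illustration only):

```python
import time
import functools

def ttl_cache(timeout):
    """Cache a no-argument function's result for `timeout` seconds."""
    def decorator(fn):
        store = {}  # holds "value" and "expires" keys after the first call
        @functools.wraps(fn)
        def wrapper():
            now = time.monotonic()
            if "value" in store and now < store["expires"]:
                return store["value"]           # cache hit: skip the slow work
            store["value"] = fn()               # cache miss: recompute
            store["expires"] = now + timeout
            return store["value"]
        return wrapper
    return decorator

calls = 0

@ttl_cache(timeout=60)
def slow():
    global calls
    calls += 1
    time.sleep(0.2)  # stands in for the 2-second delay
    return "This request was slow!"

slow()  # first call pays the sleep cost
slow()  # served from the cache; `calls` is still 1
```

The second call returns almost instantly, which is exactly the behavior the two `time curl` runs above demonstrate.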

Optimize database queries

Database queries can often become a performance bottleneck. In this section, you will simulate database query optimization using SQLAlchemy and connection pooling.

Simulating a database query with connection pooling

First, let's install SQLAlchemy.

root@ubuntu:~# pip install Flask-SQLAlchemy

Update app.py to configure connection pooling.

from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import text

# Simulate an intensive database operation
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
app.config['SQLALCHEMY_POOL_SIZE'] = 5  # Connection pool size
db = SQLAlchemy(app)

@app.route('/db1')
def db_operation_pooling():
    # Simulate a database query
    result = db.session.execute(text('SELECT 1')).fetchall()
    return jsonify(result=str(result))

Now, when we execute a curl request to the /db1 route, we should see the following output:

root@ubuntu:~# curl http://127.0.0.1:5000/db1
Output
{"result":"[(1,)]"}

You can significantly optimize the performance of your Flask application by implementing connection pooling in a production environment. Connection pooling allows the application to reuse existing database connections instead of creating new connections for each request. This reduces the overhead of creating new connections, resulting in faster response times and improved scalability.

The SQLALCHEMY_POOL_SIZE configuration we set earlier limits the number of connections in the pool. You should adjust this value in a production environment based on your specific needs and server capabilities. Additionally, you may want to consider other pooling options such as SQLALCHEMY_MAX_OVERFLOW to allow additional connections when the pool is full and SQLALCHEMY_POOL_TIMEOUT to specify how long a request should wait for a connection.

Remember, while our example uses SQLite for simplicity, in a real-world scenario you will likely use a more robust database like PostgreSQL or MySQL. These databases have their own connection pooling mechanisms that can be used in conjunction with SQLAlchemy's pooling for better performance.

With careful configuration and use of connection pooling, you can ensure that your Flask application performs database operations efficiently even under heavy load, thus significantly improving its overall performance.
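The core idea behind SQLALCHEMY_POOL_SIZE can be illustrated with a stdlib-only pool built from `queue.Queue` and `sqlite3`. This is a teaching sketch, not SQLAlchemy's implementation: the real pool also handles overflow connections, checkout timeouts, and stale-connection recycling.

```python
import sqlite3
import queue

class ConnectionPool:
    """Minimal fixed-size pool: connections are created once, then reused."""
    def __init__(self, db_path, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(sqlite3.connect(db_path, check_same_thread=False))

    def acquire(self):
        return self._pool.get()   # blocks if all connections are in use

    def release(self, conn):
        self._pool.put(conn)      # return the connection instead of closing it

pool = ConnectionPool(":memory:", size=5)
conn = pool.acquire()
result = conn.execute("SELECT 1").fetchall()
pool.release(conn)
```

Each request pays only the cost of a queue get/put rather than a full connection handshake, which is where the pooled setup earns its speedup.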

Enable Gzip compression

Compressing your responses can drastically reduce the amount of data transferred between your server and clients and improve performance.

Install and configure Flask-Compress

Let's install the Flask-Compress package.

root@ubuntu:~# pip install Flask-Compress

Next, let's update app.py to enable compression.

from flask_compress import Compress

# Enable Gzip compression for the Flask app.
# Responses are compressed before being sent to clients,
# reducing data transfer and improving performance.
Compress(app)

@app.route('/compress')
def compressed_route():
    return "<h1>Welcome to the optimized Flask app!</h1>"

This automatically compresses responses larger than 500 bytes, reducing the transmission time of large responses.

In a production environment, Gzip compression can significantly reduce the amount of data transferred between the server and clients, especially for text-based content such as HTML, CSS, and JavaScript.

This reduction in data transfer results in faster page load times, improved user experience, and reduced bandwidth costs. Additionally, many modern web browsers automatically support Gzip compression, making it a fully compatible optimization technique. By enabling Gzip compression, you can effectively improve the performance and scalability of your Flask application without having to make any changes on the client side.
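You can see why this pays off for text-based content using the standard library's `gzip` module directly. Repetitive markup, as HTML tends to be, compresses dramatically:

```python
import gzip

html = "<li>item</li>" * 500       # repetitive markup, as HTML often is
raw = html.encode("utf-8")
compressed = gzip.compress(raw)

# The compressed payload is a small fraction of the original size
print(len(raw), "->", len(compressed), "bytes")
```

Flask-Compress applies the same transformation automatically whenever the client's Accept-Encoding header indicates Gzip support.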

Offloading intensive tasks to Celery

For resource-intensive operations like sending emails or processing large datasets, it's best to offload them into background jobs using Celery. This prevents long-running tasks from blocking incoming requests.

Celery is a powerful distributed task queuing system that allows you to execute time-consuming tasks asynchronously. By offloading intensive operations to Celery, you can significantly improve the responsiveness and scalability of your Flask application. Celery works by delegating tasks to worker processes that can run on separate machines, allowing for better resource utilization and parallel processing.

Key benefits of using Celery include:

  1. Improved response times for user requests
  2. Better scalability and resource management
  3. Ability to perform complex, time-consuming tasks without blocking the main application
  4. Built-in support for task scheduling and retrying failed tasks
  5. Easy integration with various message brokers such as RabbitMQ or Redis

Using Celery, you can ensure that your Flask application remains responsive even when dealing with computationally intensive or I/O-bound tasks.

Configure Celery for background tasks

Let's install Celery.

root@ubuntu:~# pip install Celery

Next, let's update app.py to configure Celery for asynchronous tasks:

from celery import Celery

celery = Celery(app.name, broker='redis://localhost:6379/0')

@celery.task
def long_task():
    import time
    time.sleep(10)  # Simulate a long task
    return "Task Complete"

@app.route('/start-task')
def start_task():
    long_task.delay()
    return 'Task started'

In a separate terminal, launch the Celery worker:

root@ubuntu:~# celery -A app.celery worker --loglevel=info
Output
------------- celery@your-computer-name v5.2.7 (dawn-chorus)
--- ***** ----- 
-- ******* ---- Linux-x.x.x-x-generic-x86_64-with-glibc2.xx 2023-xx-xx
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app: app:0x7f8b8c0b3cd0
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. app.long_task
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Connected to redis://localhost:6379/0
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: searching for neighbors
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: all alone
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] celery@your-computer-name ready.

Now run a curl command to hit the /start-task route; the output will be as follows:

root@ubuntu:~# curl http://127.0.0.1:5000/start-task
Output
Task started

This will return "Task started" almost immediately, even if the background task is still running.

The start_task() function does two things:

  • Calls long_task.delay(), which starts the Celery task asynchronously. This means that the task is queued to run in the background, but the function does not wait for it to complete.
  • It immediately returns the string "Task started".

An important thing to note is that the actual long-running task (simulated by the 10-second sleep) is executed asynchronously by Celery. The Flask route does not wait for this task to complete before responding to the request.

So, when you hit this endpoint, you immediately get the "Task started" response, while the actual job continues to run in the background for 10 seconds.

After 10 seconds, when the background job is complete, the Celery worker log will show a message similar to this:

[2024-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Task app.long_task[task-id] received
[2024-xx-xx xx:xx:xx,xxx: INFO/ForkPoolWorker-1] Task app.long_task[task-id] succeeded in 10.xxxs: 'Task Complete'

This example shows how Celery improves the performance of a Flask application by executing long-running tasks asynchronously, keeping the main application responsive. The long-running task runs in the background, freeing the Flask application to handle other requests.
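The delay() pattern, respond now and work later, can be sketched without Celery using the standard library's `concurrent.futures`. Celery adds a message broker, separate worker processes, retries, and scheduling on top of this basic idea; the sketch below only illustrates the fire-and-forget shape of the API:

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)

def long_task():
    time.sleep(0.5)  # stands in for the 10-second job
    return "Task Complete"

def start_task():
    """Queue the work and return immediately, like long_task.delay()."""
    future = executor.submit(long_task)
    return "Task started", future

reply, future = start_task()   # returns right away
print(reply)                   # the caller is never blocked by the job
print(future.result())         # blocks only if you explicitly wait for it
```

The key property is the same in both cases: the caller gets its answer immediately, and the expensive work completes on its own schedule.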

In a production environment, running Celery typically involves the following:

  1. Using a robust message broker like RabbitMQ
  2. Using a persistent result backend (e.g., PostgreSQL)
  3. Managing workers with process control systems (e.g., supervisor)
  4. Implementing monitoring tools (e.g., Flower)
  5. Increased error handling and logging
  6. Using task prioritization
  7. Scaling with multiple workers on different machines
  8. Ensuring appropriate security measures
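Much of the list above is often captured in a separate `celeryconfig.py` module loaded with `config_from_object`. The sketch below uses real Celery setting names in the lowercase style, but the broker and backend URLs are placeholders you would replace with your own services:

```python
# celeryconfig.py -- load with: celery.config_from_object("celeryconfig")
broker_url = "amqp://user:password@rabbitmq-host:5672//"          # RabbitMQ broker (placeholder URL)
result_backend = "db+postgresql://user:password@db-host/results"  # persistent results (placeholder URL)

task_acks_late = True             # re-queue a task if its worker dies mid-run
task_time_limit = 300             # kill tasks that run longer than 5 minutes
worker_prefetch_multiplier = 1    # don't let one worker hoard queued tasks
worker_max_tasks_per_child = 100  # recycle workers to limit memory growth
```

Keeping these settings in one module makes it straightforward to run the same task code against different brokers in development and production.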

Conclusion

In this tutorial, you learned how to optimize your Flask application by implementing various performance enhancement techniques. By following these steps, you can improve the performance, scalability, and responsiveness of your Flask application, ensuring that it runs efficiently even under heavy loads.
