Solving Python’s everlasting problem of slow code

Python is truly my go-to programming language. Its ease of use, clean code and speed of development are unmatched, and I have often been able to prototype an idea very quickly. When I moved on to larger workloads, though, one problem emerged: execution speed.

Because Python enforces far fewer constraints than statically typed languages such as Go, Rust or C/C++, it is easier and faster to write code in Python, but that code will also not reach the same levels of speed.

So let me be clear from the start: there are ways of improving Python’s speed, but they all come with added complexity and their own downsides. If you don’t have to stick to Python, your best choice is probably to use a programming language such as Go.

Alternative interpreters/compilers

PyPy is an alternative to CPython, Python’s default interpreter. On average, PyPy is 4x faster than CPython! Not all libraries work with PyPy, though. A list of compatible libraries is available here.

Nuitka is arguably a favorite of mine. It first translates your Python code to C and then compiles it. This also includes a lot of clever optimizations which offer a speedup of well over 300%. You get a single binary which includes all of your program’s dependencies and can be easily distributed. The target computer doesn’t even need to have Python installed to run the program! At the same time, Nuitka offers greater compatibility than PyPy.
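To compare these options on your own machine, a small CPU-bound script can be run unchanged under each of them. The following is only a sketch: the workload (a naive prime counter) and the name count_primes are made up for illustration, and real programs may speed up more or less than this loop does.

```python
import sys
import time


def count_primes(limit: int) -> int:
    """Count primes below `limit` with a deliberately naive, loop-heavy check."""
    count = 0
    for candidate in range(2, limit):
        is_prime = True
        divisor = 2
        # Trial division up to the square root of the candidate.
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False
                break
            divisor += 1
        if is_prime:
            count += 1
    return count


start = time.perf_counter()
result = count_primes(50_000)
elapsed = time.perf_counter() - start
# sys.implementation.name reports which interpreter is running the script.
print(f"{sys.implementation.name}: {result} primes in {elapsed:.3f}s")
```

Saved as bench.py (a hypothetical file name), it could be run with python bench.py and pypy3 bench.py, or compiled first with python -m nuitka --onefile bench.py and then run as a binary.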

Solving the problem of concurrency

When it comes to concurrency (making progress on multiple tasks at the same time) there is a problem with Python: the Global Interpreter Lock (GIL). It allows only one thread per Python process to execute Python code at any given moment, so threads cannot spread CPU-bound work across several cores.
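The effect of the GIL can be observed with a rough experiment. This sketch runs a pure-Python loop twice, first one call after the other and then in two threads; count_down and the loop size are arbitrary choices for illustration. On a standard CPython build with the GIL enabled, the two timings should land in the same ballpark, since the threads take turns instead of running in parallel.

```python
import threading
import time


def count_down(n: int) -> None:
    # Pure-Python, CPU-bound loop; the running thread holds the GIL.
    while n > 0:
        n -= 1


def run_sequential(n: int, repeats: int) -> float:
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n: int, repeats: int) -> float:
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


sequential = run_sequential(2_000_000, 2)
threaded = run_threaded(2_000_000, 2)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

Exact numbers depend on the machine and the Python version, so treat the output as an indication and not as a benchmark.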

Let’s look at some example code:

from datetime import datetime

import httpx

start_time = datetime.now()
for _ in range(0, 10):
    httpx.get("https://httpbin.org/get")
print("Took: ", datetime.now() - start_time)

Here we send 10 HTTP GET requests to httpbin and also determine the time it takes.

The output: Took: 0:00:04.893904

On my machine it took nearly 5 seconds to send those requests and receive the responses. Let’s see if we can speed things up a little.

By running code asynchronously, we can make use of the waiting time that occurs throughout the program. Take web requests as an example: normally the program would send an HTTP request and then wait for the server to send back a response before going on to the next line of code. With asynchronous code, the next web request (or task) can be started while we are still waiting for the first response to arrive.

The main downside is that asynchronous code becomes harder to debug.

from datetime import datetime
import asyncio

import httpx


async def run():
    # A shared client reuses its connection pool across all requests.
    async with httpx.AsyncClient() as client:
        for _ in range(0, 10):
            # Each await suspends run() until this response has arrived.
            await client.get("https://httpbin.org/get")


start_time = datetime.now()
asyncio.run(run())
print("Took: ", datetime.now() - start_time)

By modifying our code to use asyncio, we reduce the time the program takes to complete all requests to 0:00:01.466089. The program now runs more than 3 times as fast.
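One caveat: the loop in the asyncio example awaits each request before sending the next one, so the requests are still issued one after another. The gain most likely comes from reusing one client connection, not from overlapping the waits. To actually overlap them, the coroutines can be handed to asyncio.gather. The sketch below shows the difference without touching the network: fake_request is a hypothetical stand-in that only sleeps, and the delay of 0.05 seconds is arbitrary.

```python
import asyncio
import time


async def fake_request(delay: float) -> float:
    """Hypothetical stand-in for an HTTP call: just waits for `delay` seconds."""
    await asyncio.sleep(delay)
    return delay


async def run_sequential(n: int, delay: float) -> list:
    # Awaiting inside the loop: each "request" finishes before the next starts.
    return [await fake_request(delay) for _ in range(n)]


async def run_concurrent(n: int, delay: float) -> list:
    # gather() schedules all coroutines at once, so their waiting time overlaps.
    return await asyncio.gather(*(fake_request(delay) for _ in range(n)))


def timed(coro) -> float:
    # Run the coroutine to completion and return the elapsed wall-clock time.
    start = time.perf_counter()
    asyncio.run(coro)
    return time.perf_counter() - start


sequential_time = timed(run_sequential(10, 0.05))
concurrent_time = timed(run_concurrent(10, 0.05))
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

With httpx, the same pattern would be to await asyncio.gather over the client.get(...) calls inside the async with block.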

Since one Python process can only ever run one thread of Python code at a time (because of the GIL), why not use multiple processes? That is exactly what the multiprocessing package allows us to do. While this speeds up the program, each additional process also takes up system resources, and exchanging data between the processes becomes an additional challenge.

Modifying our source code once again, this time to start multiple processes for completing the requests, we get:

from datetime import datetime
from multiprocessing import Pool, cpu_count

import httpx


def run():
    httpx.get("https://httpbin.org/get")


# The guard is required on platforms that start worker processes by
# re-importing this module (Windows, macOS).
if __name__ == "__main__":
    start_time = datetime.now()
    # Sets the number of processes running at the same time
    # equal to the number of cpu cores.
    pool = Pool(cpu_count())
    for _ in range(0, 10):
        pool.apply_async(run, [])
    pool.close()
    pool.join()
    print("Took: ", datetime.now() - start_time)

This brings our total down to 0:00:00.531039! That is only a little more than 1/3 of the time the asynchronous implementation took.
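As for exchanging data between processes, the simplest case is getting return values back from the workers. A minimal sketch, assuming the work can be expressed as a function of one argument: Pool.map sends each argument to a worker and collects the return values in input order. fake_request is a hypothetical stand-in that sleeps instead of calling httpbin, and the pool size of 4 is arbitrary.

```python
import time
from multiprocessing import Pool


def fake_request(request_id: int) -> tuple:
    """Hypothetical stand-in for an HTTP call; returns (id, simulated status)."""
    time.sleep(0.05)  # simulate waiting on the network
    return request_id, 200


def main() -> None:
    # Leaving the with block shuts the pool down; map() has already
    # collected every result by then.
    with Pool(processes=4) as pool:
        # map() pickles each argument, sends it to a worker process, and
        # pickles the return value back, preserving input order.
        results = pool.map(fake_request, range(10))
    print(results)


# The guard keeps worker processes from re-running main() when the module
# is re-imported under the "spawn" start method (Windows, macOS).
if __name__ == "__main__":
    main()
```

Arguments and return values have to be picklable, which is part of the extra complexity mentioned above.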

Originally published at https://quoorex.com on April 13, 2021.

Quoorex

Developer and full-time learner. Aspiring entrepreneur.
