Making your life easier with Python decorators, part two

2023-06-20 by Pauli Lohi

Perhaps, after reading the first part, you now feel knowledgeable and confident enough to write a decorator or two. But you may still have some doubts. Sure, in some very specific situations, using decorators seems like a good solution, but they still feel like a neat little trick instead of a proper tool worthy of being used in very serious production code. Even though many Python standard library decorators are well understood and commonly used, are custom-made decorators readable and understandable enough? And how about testing? Surely, using a lot of decorators will make testing a nightmare...

I have found these common misconceptions about Python decorators to be more often based on second-hand experience and "common knowledge" than facts. Don't get me wrong; I totally agree that creating complex solutions to simple problems is a real issue in software engineering. But that is more often caused by using the wrong tools for the problem at hand than by the tools themselves being poor. In this article, you will gain a deeper understanding of how Python functions and decorators work. Additionally, you will learn which kind of structures can be replaced by reusable decorators and some aspects to consider when writing tests for code that uses decorators. After reading it, you will be ready to write some production-grade code with decorators!

Diving deeper into the Python data model

Before getting our hands dirty with the real-world use case, let's go through a few more details about decorators and functions in the Python data model. As we already know, in Python, functions are just objects of a specific type called function. Like other Python objects, they also store their data in attributes. Let's create a function and take a look:

def sum_of_two(a, b):
    return a + b

If we put parentheses after the function name, we make a function call, executing the code inside the function body:

>>> sum_of_two(1, 2)
3

This is the most common way function objects are used, and for many programmers and projects, it is the only way they are ever interacted with. However, it's also possible to access other attributes of the object using attribute references. For example, the name of the function is stored in the __name__ attribute and can be accessed by placing a period and the attribute name after the function name:

>>> sum_of_two
<function sum_of_two at 0x7f523cb17280>
>>> sum_of_two.__name__
'sum_of_two'

When a function definition statement (identified by the def keyword) is executed, the Python interpreter creates a function object and automatically assigns values to a bunch of attributes. Even the actual compiled code of the function body is stored in an attribute called __code__. In addition to the data required to execute the function, some metadata, such as the name and the docstring of the function, is created and stored into the object. All of these attributes are readable, and most of them can even be changed during the execution of the program.
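For example, continuing with the sum_of_two function from above, a short sketch shows a few of these attributes being read and even reassigned at runtime:

```python
def sum_of_two(a, b):
    """Return the sum of a and b."""
    return a + b

# Metadata created automatically when the def statement was executed:
print(sum_of_two.__name__)        # sum_of_two
print(sum_of_two.__doc__)         # Return the sum of a and b.
print(type(sum_of_two.__code__))  # <class 'code'>

# Most of these attributes can also be reassigned while the program runs:
sum_of_two.__doc__ = "A replacement docstring"
print(sum_of_two.__doc__)         # A replacement docstring
```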

But why is this important, and what does it have to do with decorators? Most of the time, a decorator actually creates a completely new function that just wraps around the original. When the new function is created, it will also get new attributes created from scratch, regardless of the values of the original function's attributes.

Let's demonstrate:

def hello_decorator(original_func):
    def decorated_func(*args, **kwargs):
        """This is the decorated function"""
        print("Hello from decorator")
        return original_func(*args, **kwargs)
    return decorated_func

@hello_decorator
def sum_of_two(a, b):
    """This function returns the sum of a and b"""
    return a + b

Now if we check the name and the docstring of the decorated function, we can see that they match the decorator's inner function instead of the decorated function:

>>> sum_of_two.__name__
'decorated_func'
>>> sum_of_two.__doc__
'This is the decorated function'

Clearly, this is not the desired outcome. The decorator should extend the functionality of our function, not replace its metadata. This problem can be easily solved: just copy the attributes of the original function to the decorated one, like this:

def hello_decorator(original_func):
    def decorated_func(*args, **kwargs):
        print("Hello from decorator")
        return original_func(*args, **kwargs)

    decorated_func.__name__ = original_func.__name__
    decorated_func.__doc__ = original_func.__doc__

    return decorated_func

Of course, this solution needs to be applied to every decorator you write. Instead of going through the process of copy-pasting the same lines of code all around the codebase, a more generic solution would be better. Luckily, this is a problem so common that a solution is already included in the standard library! We can use @functools.wraps to copy the metadata from the original function to the decorated one:

import functools

def hello_decorator(original_func):
    @functools.wraps(original_func)
    def decorated_func(*args, **kwargs):
        print("Hello from decorator")
        return original_func(*args, **kwargs)
    return decorated_func

@hello_decorator
def sum_of_two(a, b):
    """This function returns the sum of a and b"""
    return a + b

Now the attributes of the sum_of_two function stay unchanged when the decorator is applied:

>>> sum_of_two.__name__
'sum_of_two'
>>> sum_of_two.__doc__
'This function returns the sum of a and b'

So which attributes does the @functools.wraps decorator copy? Inspecting its source code reveals that the following attributes are copied from the original function:

  • __doc__: the function's docstring
  • __name__: the function's name
  • __qualname__: the function's qualified name
  • __module__: the name of the module the function was defined in
  • __dict__: dictionary storing the object's arbitrary attributes
  • __annotations__: dictionary containing the (type) annotations of the function's parameters and return value

Copying these attributes is enough to make the decorated function look just like the original in almost every use case. The original version of the function is also stored in an attribute named __wrapped__ of the decorated function. This can be extremely useful when writing tests for the function.
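For instance, with the @functools.wraps version of @hello_decorator from above, the original function can be reached and called through __wrapped__, bypassing the decorator entirely:

```python
import functools


def hello_decorator(original_func):
    @functools.wraps(original_func)
    def decorated_func(*args, **kwargs):
        print("Hello from decorator")
        return original_func(*args, **kwargs)
    return decorated_func


@hello_decorator
def sum_of_two(a, b):
    """This function returns the sum of a and b"""
    return a + b


# __wrapped__ holds the original, undecorated function,
# so this call does not print "Hello from decorator":
result = sum_of_two.__wrapped__(1, 2)
print(result)  # 3
```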

Note: Interested in learning more about the Python data model? The official Python language reference explains the data model and call expressions in a detailed but easily approachable manner.

Decorators in action

Most of the time, instead of using a decorator, it's totally possible to just make a couple of function calls or to use a class or some other structure. However, there are some situations where using a decorator can absolutely be the best and most readable way to reuse code.

Consider the following scenario:

You are working with an external API that is not completely stable. Sometimes calling the API fails or times out, but waiting a few seconds and trying exactly the same API call again resolves the issue. This can be solved by simply wrapping the API call in a loop and retrying it every time an exception is raised, until either the maximum number of tries is reached or the call succeeds:

import time

MAX_TRIES = 5
RETRY_WAIT_SECONDS = 3

def fetch_data():
    current_try = 0
    while True:
        try:
            return call_external_api()
        except Exception as error:
            current_try += 1
            if current_try == MAX_TRIES:
                raise error
            time.sleep(RETRY_WAIT_SECONDS)

This solution works just fine, but it's not very reusable. If you have multiple functions where you need to call external APIs, you have to copy-paste the same structure into each function. You would also need to test that the retry functionality works correctly in each of the functions separately. A better solution would be to apply a retry decorator to each of the functions. Here's an implementation of a retry decorator:

import time
import functools


def retry(max_tries=5, wait_seconds=3):
    def decorator(original_func):
        @functools.wraps(original_func)
        def decorated_func(*args, **kwargs):
            current_try = 0
            while True:
                try:
                    return original_func(*args, **kwargs)
                except Exception as error:
                    current_try += 1
                    if current_try == max_tries:
                        raise error
                    time.sleep(wait_seconds)

        return decorated_func
    return decorator


@retry()
def fetch_data():
    return call_external_api()


@retry(wait_seconds=10)
def fetch_more_data():
    return call_other_external_api()

Note how using a decorator also allows us to easily parametrize the retry functionality! The decorator is quite useful already, but is it ready to be used in a serious production application? The function is now always executed again if any error occurs, which seems a bit too aggressive. For example, when calling an HTTP API, it would be better to retry only if a server error occurs: if the error is caused by the client (that's our code!), then it makes no sense to try again.

Let's implement a more specialized decorator for retrying functions that make HTTP requests using the popular requests library:

import time
import functools
from requests import HTTPError


def retry_http(max_tries=5, wait_seconds=3, http_codes=(500, 502, 503, 504)):
    def decorator(original_func):
        @functools.wraps(original_func)
        def decorated_func(*args, **kwargs):
            current_try = 0
            while True:
                try:
                    return original_func(*args, **kwargs)
                except HTTPError as error:
                    current_try += 1
                    code = error.response.status_code
                    if current_try == max_tries or code not in http_codes:
                        raise error
                    time.sleep(wait_seconds)

        return decorated_func
    return decorator

Now, if the decorated function raises an HTTPError, the decorator code will check if the HTTP response status code is Internal Server Error, Bad Gateway, Service Unavailable, or Gateway Timeout, and if so, will try to execute the function again. Here's a function that uses the requests library to fetch data from an HTTP API:

import requests

@retry_http()
def fetch_data():
    response = requests.get("api-endpoint-url")
    response.raise_for_status()
    return response.content

Please note that requests.get does not raise exceptions on non-success HTTP status codes by default; you have to manually call the raise_for_status method to check if the request was successful or not. This is a bit problematic because the decorated function is responsible for implementing that call. If the call is omitted, the retry functionality will not work. We'll get back to how to solve this issue later.
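To see this default behavior in isolation, we can construct a Response by hand, just like the unit tests later in this article do. This is a small sketch with no actual network call involved:

```python
from requests import HTTPError, Response

response = Response()
response.status_code = 500

# An error status code alone does not raise anything:
print(response.ok)  # False

# The exception only appears once raise_for_status is called explicitly:
caught = None
try:
    response.raise_for_status()
except HTTPError as error:
    caught = error

print(caught.response.status_code)  # 500
```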

Simplify your tests with decorators

In my experience, contrary to popular belief, using decorators can often make testing your code a lot easier. According to the principles of unit testing, code should be tested by isolating the smallest possible units of code (most commonly functions, classes, and methods) and validating the correctness of each unit separately. Moving logic from functions to decorators allows you to test the functions and the decorators separately. And it's definitely easier to write comprehensive tests for a small and simple unit of code than for a long and complex one.

Let's start by writing tests for the fetch_data function from the previous section. It's extremely simple, so we only need to test two things. Firstly, the function should return the content from the HTTP response if no errors occurred:

from requests import Response
from unittest import mock


def test_fetch_data_returns_response_content():
    mock_response = Response()
    mock_response.status_code = 200
    mock_response._content = b"foo"

    with mock.patch("requests.get", return_value=mock_response):
        assert fetch_data.__wrapped__() == mock_response.content

Because it's a unit test, the fetch_data function should be tested in total isolation. There should be no dependencies on external systems or code. That's why the requests.get function needs to be replaced with a mock function when running the test. This is done by calling mock.patch with a return_value argument containing a hard-coded return value. Please also note that the test calls the original, non-decorated fetch_data function. This is possible because the decorator uses @functools.wraps, which makes the non-decorated version of the function available through the __wrapped__ attribute.

The second thing we need to test is that the function raises an HTTPError if the HTTP response status code is an error code. Like in the first test, we replace the requests.get function with a mock version. This time the mock should always return a response with an HTTP 500 status code:

import pytest
from requests import HTTPError


def test_fetch_data_raises_on_error_status_code():
    mock_response = Response()
    mock_response.status_code = 500

    with mock.patch("requests.get", return_value=mock_response):
        with pytest.raises(HTTPError):
            fetch_data.__wrapped__()

Calling the function within a pytest.raises context ensures that an exception of the desired type, HTTPError, is raised within the block. Otherwise the test fails.
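As a side note, pytest.raises can also capture the raised exception for closer inspection: used with as, it yields an ExceptionInfo object whose value attribute holds the exception. Here is a sketch using an undecorated stand-in for fetch_data, so the example stays self-contained:

```python
import pytest
import requests
from requests import HTTPError, Response
from unittest import mock


def fetch_data():
    response = requests.get("api-endpoint-url")
    response.raise_for_status()
    return response.content


def test_fetch_data_error_details():
    mock_response = Response()
    mock_response.status_code = 500

    with mock.patch("requests.get", return_value=mock_response):
        with pytest.raises(HTTPError) as excinfo:
            fetch_data()

    # excinfo.value is the HTTPError instance that was raised:
    assert excinfo.value.response.status_code == 500
```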

Testing the @retry_http decorator requires a bit more work. It should meet the following specifications:

  1. The decorated function should be executed again if and only if the HTTP error code raised by the function is one of the codes specified in the decorator parameter http_codes
  2. The decorated function should wait the correct amount of time between the executions
  3. The decorated function should be executed again the correct number of times

Let's start with the last one:

import pytest
from unittest import mock
from requests import HTTPError, Response


mock_response = Response()
mock_response.status_code = 500


def test_retry_http_correct_try_count():
    max_tries = 10
    func = mock.Mock(side_effect=HTTPError(response=mock_response))
    decorated_func = retry_http(max_tries=max_tries)(func)

    with pytest.raises(HTTPError):
        decorated_func()

    assert func.call_count == max_tries

First, we create a mock function using the Mock class that always raises an HTTPError with status code 500. Then we decorate the function with the @retry_http decorator. Note that the decorator has to be applied using regular function call syntax instead of decorator syntax. This is because the function is not created using the def keyword, so decorator syntax can't be used. After applying the decorator, we call the function, expecting an HTTPError to be raised. Finally, we check that our mock function was called the correct number of times using call_count, a property provided by the Mock class.

The test works, but it takes nearly 30 seconds to run! That is, of course, caused by the wait between retries. Because we are writing a unit test, we should also replace the time.sleep function with a mock object:

@mock.patch("time.sleep")
def test_retry_http_correct_try_count(mock_sleep):
    max_tries = 10
    func = mock.Mock(side_effect=HTTPError(response=mock_response))
    decorated_func = retry_http(max_tries=max_tries)(func)

    with pytest.raises(HTTPError):
        decorated_func()

    assert func.call_count == max_tries

Now the test works correctly and only takes a few milliseconds to run! Let's move on to validating the second part of our specification:

@mock.patch("time.sleep")
def test_retry_http_correct_sleep_length(mock_sleep):
    wait_seconds = 10
    func = mock.Mock(side_effect=HTTPError(response=mock_response))
    decorated_func = retry_http(wait_seconds=wait_seconds)(func)

    with pytest.raises(HTTPError):
        decorated_func()

    for call in mock_sleep.mock_calls:
        assert call.args == (wait_seconds,)

It is enough to assert that time.sleep was called with the correct arguments: a unit test for our decorator function should not be responsible for validating that a function it depends on works correctly.
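If you prefer the built-in mock assertion helpers over a manual loop, Mock also provides assert_called_with, which checks the arguments of the most recent call, and assert_has_calls, which checks a sequence of calls. A tiny standalone sketch:

```python
import time
from unittest import mock

with mock.patch("time.sleep") as mock_sleep:
    # These calls hit the mock, not the real time.sleep:
    time.sleep(10)
    time.sleep(10)

    # Checks the arguments of the most recent call:
    mock_sleep.assert_called_with(10)
    # Checks a whole sequence of calls:
    mock_sleep.assert_has_calls([mock.call(10), mock.call(10)])
```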

The first item in the specification can be validated similarly to the other ones:

@mock.patch("time.sleep")
def test_retry_http_correct_code(mock_sleep):
    http_codes = [500]
    func = mock.Mock(side_effect=HTTPError(response=mock_response))
    decorated_func = retry_http(http_codes=http_codes)(func)

    with pytest.raises(HTTPError):
        decorated_func()

    assert func.call_count > 1


@mock.patch("time.sleep")
def test_retry_http_incorrect_code(mock_sleep):
    http_codes = [502]
    func = mock.Mock(side_effect=HTTPError(response=mock_response))
    decorated_func = retry_http(http_codes=http_codes)(func)

    with pytest.raises(HTTPError):
        decorated_func()

    assert func.call_count == 1

Now we have a reasonable test suite that can be used to verify that our retry decorator and the function fetching data from an external API work as expected. Is this good enough to be used in a real production application? Well, that, of course, depends on the application. But generally, I would say yes, it is.

Note: The test code in this section is written for the pytest testing framework, but the same principles can be applied to most other testing frameworks and tools too.

Unleashing the synergy between decorators and monkey patching

Even though our code could be considered ready for production already, there is still a flaw in the @retry_http decorator that opens up the possibility for errors or unexpected behavior: every time we make an HTTP request with requests, we have to remember to manually call raise_for_status. Otherwise, no HTTPError is raised, and the retry logic won't execute. One way to solve the issue would be to create wrappers for the HTTP request functions and use those instead of the original ones. But this solution would create another very similar problem: the programmer implementing new API calls would need to know about the wrapped functions and remember to use them. To minimize the possibility of a mistake, every HTTP request made inside a function decorated with @retry_http should always be automatically followed by a raise_for_status call.

Monkey patching to the rescue! It's a technique for dynamically altering the behavior of existing code, most commonly applied to third-party code. The description might sound complicated, but in our case, it's very simple. For example, we can replace the built-in print function with a different implementation just by assigning a new value to it using the = operator:

def better_print(*args, **kwargs):
    print("[BETTER PRINT]", *args, **kwargs)

print = better_print

Let's try it out:

>>> print("Hello world!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in better_print
  File "<stdin>", line 2, in better_print
  File "<stdin>", line 2, in better_print
  [Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded

The code crashes when better_print is executed: the name print now refers to better_print itself, so the call inside the function body leads to infinite recursion. We can solve this issue by binding another name to the original print function and using it inside our new function like this:

original_print = print

def better_print(*args, **kwargs):
    original_print("[BETTER PRINT]", *args, **kwargs)

print = better_print

Now when we call print, the output is prefixed as expected:

>>> print("Hello World!")
[BETTER PRINT] Hello World!

Note: Name resolution in Python always tries to find the name in the nearest enclosing scope first. If the name is not present in the current scope, the interpreter tries to find it in the next enclosing scope and finally in the global scope.
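A minimal illustration of this lookup order (the names here are, of course, made up for the example):

```python
x = "global"


def outer():
    x = "enclosing"

    def inner():
        # No local x here, so the nearest enclosing scope wins:
        return x

    return inner()


def no_enclosing():
    # No local or enclosing x, so the global one is found:
    return x


print(outer())         # enclosing
print(no_enclosing())  # global
```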

We can patch functions from the requests module with the same principle:

import requests

original_get = requests.get

def patched_get(*args, **kwargs):
    response = original_get(*args, **kwargs)
    response.raise_for_status()
    return response

requests.get = patched_get

Now if we make an HTTP request using requests.get that returns an error code, an HTTPError will be automatically raised:

>>> requests.get("endpoint-with-error")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in patched_get
  File "/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: endpoint-with-error

To avoid confusion, we should ensure that our monkey patch is enabled only during the execution of the decorated function. This can be achieved with the following steps:

  1. When execution of the decorated function starts, apply the monkey patch
  2. Execute the decorated function
  3. Revert the monkey patch
  4. Return the result from the decorated function

Let's implement a new decorator @patch_requests to perform these steps:

import requests
import functools

original_get = requests.get

def patched_get(*args, **kwargs):
    response = original_get(*args, **kwargs)
    response.raise_for_status()
    return response

def patch_requests(original_func):
    @functools.wraps(original_func)
    def decorated_func(*args, **kwargs):
        requests.get = patched_get
        result = original_func(*args, **kwargs)
        requests.get = original_get
        return result
    return decorated_func

Now we can apply the new decorator to our fetch_data function and omit the raise_for_status call from the function body:

@retry_http()
@patch_requests
def fetch_data():
    response = requests.get("api-endpoint-url")
    return response.content

At first glance, this seems to work correctly. The retry functionality works, and the function raises the correct exceptions without explicitly checking the response status code. But monkey patching can be tricky! Raising an exception interrupts the normal flow of execution and exits the function immediately. This means that the monkey patch will not get reverted, possibly causing catastrophic side effects elsewhere in the codebase. Let's fix the problem using a try-finally block:

def patch_requests(original_func):
    @functools.wraps(original_func)
    def decorated_func(*args, **kwargs):
        try:
            requests.get = patched_get
            result = original_func(*args, **kwargs)
            return result
        finally:
            requests.get = original_get
    return decorated_func

Looking pretty good! The finally block ensures that the patch always gets reverted, even if something unexpected happens. Now, let's combine it with the retry decorator. We want to ensure that HTTP error codes always raise an exception when using the retry decorator. Instead of copy-pasting the code from @patch_requests to @retry_http and creating a single, larger, and more complex decorator, we can just make the latter apply the former. Here's the complete code for the decorators:

import requests
import time
import functools
from requests import HTTPError

original_get = requests.get


def patched_get(*args, **kwargs):
    response = original_get(*args, **kwargs)
    response.raise_for_status()
    return response


def patch_requests(original_func):
    @functools.wraps(original_func)
    def decorated_func(*args, **kwargs):
        try:
            requests.get = patched_get
            result = original_func(*args, **kwargs)
            return result
        finally:
            requests.get = original_get
    return decorated_func


def retry_http(max_tries=5, wait_seconds=3, http_codes=(500, 502, 503, 504)):
    def decorator(original_func):
        @patch_requests
        @functools.wraps(original_func)
        def decorated_func(*args, **kwargs):
            current_try = 0
            while True:
                try:
                    return original_func(*args, **kwargs)
                except HTTPError as error:
                    current_try += 1
                    code = error.response.status_code
                    if current_try == max_tries or code not in http_codes:
                        raise error
                    time.sleep(wait_seconds)

        return decorated_func
    return decorator

And here's the code of the function:

@retry_http()
def fetch_data():
    response = requests.get("api-endpoint-url")
    return response.content

All done! With this solution, calling unstable external APIs with requests should be a lot more reliable. Of course, a similar solution can be useful for any operation that is not guaranteed to succeed all the time.

Note: Monkey patching is usually considered a very bad practice. When done carelessly, it can cause a lot of confusion and bugs and make the code difficult to follow in general. However, in my opinion, some cases like the example above can benefit a lot from monkey patching. It's a powerful tool that gives you a lot of flexibility without the need to create many abstractions. Also, using it together with decorators can help you avoid the worst issues by limiting the scope in which the monkey patch is applied.

Another note: In this example, only the get function from requests is patched. For real-world use, you probably want to patch the functions for other HTTP verbs too.
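Yet another note: the manual assign-and-revert steps inside @patch_requests could also be expressed with unittest.mock.patch, which we already used in the tests; it restores the original attribute automatically when the context exits, even on exceptions. This is an alternative sketch, not the implementation used above, and the current_get helper is hypothetical, existing only to demonstrate the patch scope:

```python
import functools
from unittest import mock

import requests

original_get = requests.get


def patched_get(*args, **kwargs):
    response = original_get(*args, **kwargs)
    response.raise_for_status()
    return response


def patch_requests(original_func):
    @functools.wraps(original_func)
    def decorated_func(*args, **kwargs):
        # mock.patch reverts the patch on exit, even when an exception
        # propagates out of the decorated function:
        with mock.patch("requests.get", new=patched_get):
            return original_func(*args, **kwargs)

    return decorated_func


# The patch is only visible while the decorated function is running:
@patch_requests
def current_get():
    return requests.get
```

Calling current_get() returns the patched version, while requests.get outside the decorated call still refers to the original function.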

Conclusion

By gaining a deeper understanding of how decorators and functions work in Python, you can leverage their capabilities to write easily readable, maintainable, and reusable code. Decorators can be especially useful in cases where other structures do not allow reusing code easily without creating heavy abstractions. Despite common misconceptions about their complexity, decorators can actually simplify the testing process by allowing you to split your code into smaller units. Happy decorating!
