reduce() in Python and why to avoid it

Trey Hunner
4 min. read · Python 3.8–3.12

Python's functools module has a function called reduce that I usually recommend avoiding.

What is the functools.reduce function?

The functools.reduce function looks a little bit like this:

not_seen = object()


def reduce(function, iterable, default=not_seen):
    """An approximation of the code for functools.reduce."""
    value = default
    for item in iterable:
        if value is not_seen:
            value = item
            continue
        value = function(value, item)
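    if value is not_seen:
        # Like the real reduce: an empty iterable with no starting value is an error
        raise TypeError("reduce() of empty iterable with no initial value")
    return value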

The reduce function is a bit complex. It's best understood with an example.

Performing arithmetic (a bad example)

>>> from functools import reduce
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> reduce(lambda x, y: x + y, numbers)

In the above example, we're calling reduce with two arguments:

  1. a function that adds two numbers together
  2. a list of numbers

When we call reduce with those arguments, it doesn't just add the first two numbers together. Instead, it adds all the numbers together:

>>> reduce(lambda x, y: x + y, numbers)
46

The function we passed as the first argument is called repeatedly to add up all of the numbers in this list. The reduce function first calls it on the first two items in numbers, then calls it again with that result and the third number as the two arguments, and so on until the list is exhausted.
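
To make that sequence of calls visible, here's a small sketch that records each pair of arguments reduce passes to the function. The add function and the calls list are names made up for this illustration; they aren't part of functools.

```python
from functools import reduce

numbers = [2, 1, 3, 4, 7, 11, 18]
calls = []  # hypothetical log of every (x, y) pair reduce hands to add


def add(x, y):
    """Add two numbers, remembering the arguments we were called with."""
    calls.append((x, y))
    return x + y


total = reduce(add, numbers)

print(calls)  # [(2, 1), (3, 3), (6, 4), (10, 7), (17, 11), (28, 18)]
print(total)  # 46
```

Each call's first argument is the running result from the previous call, and the second is the next item from the list.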

This is a bit of a silly example, because we have a function built into Python that can do this for us. The built-in sum function is both easier to understand and faster than using reduce:

>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> sum(numbers)
46
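
If you'd like to check the speed claim on your own machine, a rough timing sketch with the standard library's timeit module might look like this. The repetition count is an arbitrary choice, and the timings will vary by Python version and hardware, so no numbers are shown here.

```python
from functools import reduce
from timeit import timeit

numbers = [2, 1, 3, 4, 7, 11, 18]

# Time 100,000 runs of each approach (an arbitrary repetition count)
reduce_time = timeit(lambda: reduce(lambda x, y: x + y, numbers), number=100_000)
sum_time = timeit(lambda: sum(numbers), number=100_000)

print(f"reduce: {reduce_time:.3f}s")
print(f"sum:    {sum_time:.3f}s")
```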

Even multiplying numbers isn't a great example of reduce:

>>> from functools import reduce
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> reduce(lambda x, y: x * y, numbers)
33264

When multiplying, it's better to use the prod function in Python's math module (added in Python 3.8) because it's again faster and more readable than reduce:

>>> from math import prod
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> prod(numbers)
33264
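
One more practical difference shows up with empty iterables: sum and math.prod fall back to 0 and 1, while reduce raises a TypeError unless we give it a starting value as a third argument. A quick sketch of that behavior (the empty list here is just for illustration):

```python
from functools import reduce
from math import prod

empty = []

print(sum(empty))   # 0
print(prod(empty))  # 1

try:
    reduce(lambda x, y: x + y, empty)
except TypeError as error:
    # reduce has nothing to start from, so it refuses
    print(f"reduce failed: {error}")

# Passing a starting value as the third argument avoids the error
print(reduce(lambda x, y: x + y, empty, 0))  # 0
```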

Those two examples are silly uses of reduce. Not every reduce call can be replaced with a single line of code, though.

A more complex example

This deep_get function allows us to deeply query a nested dictionary of dictionaries:

from functools import reduce

def deep_get(mapping, key_tuple):
    """Deeply query dict-of-dicts from given key tuple."""
    return reduce(lambda acc, val: acc[val], key_tuple, mapping)

For example, here's a dictionary of dictionaries:

>>> webhook_data = {
...     "event_type": "subscription_created",
...     "content": {
...         "customer": {
...             "created_at": 1575397900,
...             "card_status": "card",
...             "subscription": {
...                 "status": "active",
...                 "created_at": 1575397900,
...                 "next_billing_at": 1577817100
...             }
...         }
...     }
... }

We might want to look up a key in this dictionary, then a key in the dictionary we get back, then a key in that dictionary, and finally one more key to get the value we're looking for:

>>> webhook_data["content"]["customer"]["subscription"]["status"]
'active'

Instead of doing this querying manually, we could make a tuple of strings representing these keys, and pass that tuple to our deep_get function so it can do the querying for us:

>>> status_key = ("content", "customer", "subscription", "status")
>>> deep_get(webhook_data, status_key)
'active'

This deep_get function works, and it is powerful. But it's also pretty complex.

from functools import reduce

def deep_get(mapping, key_tuple):
    """Deeply query dict-of-dicts from given key tuple."""
    return reduce(lambda acc, val: acc[val], key_tuple, mapping)

Personally, I find this deep_get function hard to understand. We've condensed quite a bit of logic into just one line of code.

I would much rather see this deep_get function implemented using a for loop:

def deep_get(mapping, key_tuple):
    """Deeply query dict-of-dicts from given key tuple."""
    value = mapping
    for key in key_tuple:
        value = value[key]
    return value

I find that for loop easier to understand than the equivalent reduce call.
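
A side benefit of the loop version is that it's straightforward to extend. As a sketch only (the deep_get_or_default name and its default parameter are hypothetical additions, not part of the function above), here's a variant that returns a fallback value when a key is missing:

```python
def deep_get_or_default(mapping, key_tuple, default=None):
    """Deeply query dict-of-dicts, returning default if the lookup fails."""
    value = mapping
    for key in key_tuple:
        try:
            value = value[key]
        except (KeyError, TypeError):
            # KeyError: the key is missing
            # TypeError: we hit a value that can't be indexed with this key
            return default
    return value


webhook_data = {"content": {"customer": {"subscription": {"status": "active"}}}}

found = deep_get_or_default(
    webhook_data, ("content", "customer", "subscription", "status")
)
missing = deep_get_or_default(
    webhook_data, ("content", "invoice", "status"), default="unknown"
)

print(found)    # active
print(missing)  # unknown
```

Adding the same behavior to the reduce version would mean pushing the try/except into a helper function, at which point the one-liner is no longer a one-liner.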

Don't re-invent the wheel

Even if you're familiar with functional programming techniques and you really like reduce, you might want to ask yourself:

Is the reduce call I'm about to use more efficient or less efficient than either a for loop or another tool included in Python?

For example, years ago, I saw this use of reduce in an answer to a programming question online:

>>> from functools import reduce
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> reduce(lambda accum, n: accum and n > 0, numbers, True)
True

This code checks whether all the numbers in a given list are greater than zero.

This code works but there's a better way to accomplish this task in Python.

The built-in all function in Python can accept a generator expression that performs the same task for us:

>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> all(n > 0 for n in numbers)
True

I find that all call easier to read, and it's also more efficient than the reduce call.

If we had many numbers and one of them was less than or equal to zero, the all function would return early (as soon as it found a number that doesn't match our condition), whereas reduce will always loop all the way to the end.
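
We can see that early return by counting how many items each approach actually consumes. In this sketch, count_items and the counter dictionaries are helpers invented for the demonstration, and the list has a negative number placed second so the difference is visible:

```python
from functools import reduce


def count_items(iterable, counter):
    """Yield items from iterable, tallying how many were consumed."""
    for item in iterable:
        counter["seen"] += 1
        yield item


numbers = [2, -1, 3, 4, 7, 11, 18]

all_counter = {"seen": 0}
all_result = all(n > 0 for n in count_items(numbers, all_counter))

reduce_counter = {"seen": 0}
reduce_result = reduce(
    lambda accum, n: accum and n > 0,
    count_items(numbers, reduce_counter),
    True,
)

print(all_result, all_counter["seen"])        # False 2
print(reduce_result, reduce_counter["seen"])  # False 7
```

Both approaches agree on the answer, but all stops at the second item while reduce walks through all seven.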

Try to avoid reinventing the wheel with reduce. Your code will often be more readable (and sometimes even more efficient) without functools.reduce.

Common reduce operations in Python

Here are some common reduction operations in Python as well as some tools included in Python that are often more efficient and more readable than an equivalent reduce call:

Operation           With functools.reduce              Without reduce
Sum all             reduce(lambda x, y: x+y, nums)     sum(nums)
Multiply all        reduce(lambda x, y: x*y, nums)     math.prod(nums)
Join strings        reduce(lambda s, t: s+t, strs)     "".join(strs)
Merge dictionaries  reduce(lambda g, h: g|h, cfgs)     ChainMap(*reversed(cfgs))
Set union           reduce(lambda s, t: s|t, sets)     set.union(*sets)
Set intersection    reduce(lambda s, t: s&t, sets)     set.intersection(*sets)

Some of these are built-in functions, some are methods on built-in objects, and some are in the standard library.
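
As a quick sanity check, here's a sketch comparing a few of these pairs on small sample inputs. The strs, cfgs, and sets values are made up for the illustration, and the dictionary | operator assumes Python 3.9 or later.

```python
from collections import ChainMap
from functools import reduce

strs = ["a", "b", "c"]
joined = "".join(strs)
print(reduce(lambda s, t: s + t, strs) == joined)  # True

# Later dictionaries win in both approaches (dict | dict needs Python 3.9+)
cfgs = [{"debug": False, "port": 80}, {"debug": True}]
merged = reduce(lambda g, h: g | h, cfgs)
chained = ChainMap(*reversed(cfgs))
print(merged == dict(chained))  # True

sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
union = set.union(*sets)
common = set.intersection(*sets)
print(reduce(lambda s, t: s | t, sets) == union)   # True
print(reduce(lambda s, t: s & t, sets) == common)  # True
```

Note that ChainMap doesn't build a new dictionary; it looks keys up in the original dictionaries in order, which is why the list is reversed to give the last one priority.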

Try to avoid functools.reduce

Python's reduce function (in the functools module) can implement a complex reduction operation with just a single line of code. But that single line of code is sometimes more confusing and less efficient than an equivalent for loop or another specialized reduction tool that's included with Python. So I usually recommend avoiding functools.reduce.

Series: Looping

Unlike JavaScript, C, Java, and many other programming languages, Python doesn't have traditional C-style for loops. Our for loops in Python don't have indexes.

This small distinction makes for some big differences in the way we loop in Python.
