Let's make a generator expression.
Here we have a list and a list comprehension that loops over that list:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> squares = [n**2 for n in numbers]
If we turn the square brackets ([ and ]) in that list comprehension into parentheses (( and )):
>>> squares = (n**2 for n in numbers)
This will turn our list comprehension into a generator expression.
List comprehensions give us back new lists. Generator expressions give us back new generator objects:
>>> squares
<generator object <genexpr> at 0x7fcb363347b0>
A generator object, unlike a list, doesn't have a length:
>>> len(squares)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: object of type 'generator' has no len()
If we try to index a generator object, to get its first item for example, we'll get an error:
>>> squares[0]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: 'generator' object is not subscriptable
You cannot index a generator.
The only thing we can really do with a generator is loop over it:
>>> for n in squares:
...     print(n)
...
4
1
9
16
49
121
324
It seems like generators have fewer features than lists. So why would we even want to use a generator expression?
The benefit of generators is that they are lazy iterables, meaning they don't do work until you start looping over them.
Right after we evaluate a generator expression a generator object will be made:
>>> squares = (n**2 for n in numbers)
>>> squares
<generator object <genexpr> at 0x7fd49a500900>
But up to this point this generator hasn't actually computed anything. It doesn't contain any values, unlike a list.
So if we change the number 4 in our list (at index 3) to the number 5:
>>> numbers
[2, 1, 3, 4, 7, 11, 18]
>>> numbers[3] = 5
>>> numbers
[2, 1, 3, 5, 7, 11, 18]
And then we loop over our generator object (using a list constructor, for loop, or any other form of looping), we'll see that the fourth item isn't 16, it's 25:
>>> list(squares)
[4, 1, 9, 25, 49, 121, 324]
Generators don't do work until the point that they're looped over.
And if you loop over a generator a second time it'll be empty:
>>> list(squares)
[]
Generator objects are lazy iterables and they are single-use iterables. Items are generated as we loop over a generator (that's what makes them lazy) and these items are consumed as we loop over the generator, meaning they aren't stored anywhere (that's what makes them single-use).
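One way to see both properties at once is to record when the work actually happens. The sketch below assumes a hypothetical helper called square (not part of the original example) that notes each number it is asked to square:

```python
calls = []  # records each number at the moment it gets squared


def square(n):
    """Hypothetical helper: square n and record that the work happened."""
    calls.append(n)
    return n**2


numbers = [2, 1, 3, 4, 7, 11, 18]
squares = (square(n) for n in numbers)

calls_before_looping = list(calls)  # snapshot: nothing has been squared yet
first_pass = list(squares)          # looping is what triggers the work
second_pass = list(squares)         # the generator is exhausted by now

print(calls_before_looping)  # []
print(first_pass)            # [4, 1, 9, 16, 49, 121, 324]
print(second_pass)           # []
```

Making the generator did no squaring at all (lazy), and the second pass found nothing left to consume (single-use).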
When all the items in a generator have been consumed (meaning we've fully looped over it) we say that it's exhausted.
That squares generator above was exhausted:
>>> list(squares)
[]
You don't necessarily need to fully exhaust generators as you loop over them.
If we were to start looping over a generator and then we stopped once a condition was met (n > 10 below):
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> squares = (n**2 for n in numbers)
>>> for n in squares:
...     print(n)
...     if n > 10:
...         break
...
4
1
9
16
If we then started looping again (using the list constructor in this case) our generator would start up where it left off before:
>>> list(squares)
[49, 121, 324]
Generators generate values as you loop over them.
Generator expressions are a comprehension-like syntax for creating new generator objects.
The only thing that one can do with a generator object is loop over it. Once you've looped over a generator object completely (i.e. you've exhausted it by consuming all the items within it) it doesn't really have a use anymore. Once a generator is exhausted it's empty forever.
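"Looping" here includes handing the generator to anything that loops for us. As a rough sketch (the variable names below are only illustrative), built-in functions like sum, max, and any each loop over the generator they're given, and the extra parentheses can be dropped when a generator expression is the sole argument to a function call:

```python
numbers = [2, 1, 3, 4, 7, 11, 18]

# Each call loops over a brand new generator object.
total = sum(n**2 for n in numbers)
largest = max(n**2 for n in numbers)

# any() stops looping at the first true value, so it only partially
# consumes the squares generator (it stops once it reaches 16).
squares = (n**2 for n in numbers)
has_big_square = any(sq > 10 for sq in squares)
leftover = list(squares)

print(total)           # 524
print(largest)         # 324
print(has_big_square)  # True
print(leftover)        # [49, 121, 324]
```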
There is one more thing we can do with generators (besides looping over them), though it's a little bit unusual to see.
All generators can be passed to the built-in next function.
The next function gives us the next item in a generator:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> squares = (n**2 for n in numbers)
>>> next(squares)
4
Generators keep track of the expression they need to evaluate on the iterable they're looping over and they keep track of where they are in the iterable.
If we call next on a generator repeatedly we'll get each individual item in the generator:
>>> next(squares)
1
>>> next(squares)
9
>>> next(squares)
16
>>> next(squares)
49
>>> next(squares)
121
>>> next(squares)
324
If we call next on a generator that's exhausted (it's been fully consumed) we'll get a StopIteration exception:
>>> next(squares)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
That StopIteration exception indicates that there are no more values in this generator (it's empty):
>>> list(squares)
[]
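If that exception is inconvenient, there are two common ways around it. The sketch below uses a shorter numbers list than the examples above to keep things brief: the built-in next function accepts an optional second argument to return instead of raising, and the exception can also be caught directly.

```python
numbers = [2, 1, 3]
squares = (n**2 for n in numbers)

first = next(squares, None)      # 4: the default is ignored while items remain
rest = list(squares)             # [1, 9]: this exhausts the generator
after_end = next(squares, None)  # None: the default is returned, no exception

# Without a default, we can catch StopIteration ourselves.
try:
    next(squares)
except StopIteration:
    outcome = "exhausted"
else:
    outcome = "got a value"

print(first, rest, after_end, outcome)
```

A default only helps when it can't be confused with a real item, so None is a reasonable choice here because a squared number is never None.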
Just as list comprehensions make new lists, generator expressions make new generator objects.
A generator is an iterable which doesn't actually contain or store values; it generates values as you loop over it.
This means generators are more memory efficient than lists because they don't need memory to hold all of their values at once. Instead they generate values on the fly as we loop over them.
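A rough way to see the difference is sys.getsizeof from the standard library. This is only a sketch: getsizeof reports the size of the list or generator object itself (not the integers inside the list), and the exact byte counts vary between Python versions, so none are shown here. The range of a million numbers is an assumption chosen to make the gap obvious.

```python
import sys

numbers = range(1_000_000)

squares_list = [n**2 for n in numbers]  # computes and stores every square now
squares_gen = (n**2 for n in numbers)   # stores no squares at all

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

# The list's size grows with the number of items; the generator's doesn't.
print(gen_size < list_size)  # True
```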
Generator expressions give us generators which are lazy single-use iterables.
List comprehensions make new lists. Generator expressions make new generator objects. Generators are iterators, which are lazy single-use iterables. Unlike lists, generators aren't data structures. Instead they do work as you loop over them.
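The claim that generators are iterators can be checked directly. As a small sketch: passing an iterator to the built-in iter function gives back the very same object, while passing a list gives back a separate iterator object. The Iterator class from collections.abc offers another way to check.

```python
from collections.abc import Iterator

numbers = [2, 1, 3, 4, 7, 11, 18]
squares = (n**2 for n in numbers)

# An iterator is its own iterator; a list is not.
generator_is_own_iterator = iter(squares) is squares
list_is_own_iterator = iter(numbers) is numbers

print(generator_is_own_iterator)      # True
print(list_is_own_iterator)           # False
print(isinstance(squares, Iterator))  # True
print(isinstance(numbers, Iterator))  # False
```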