What is an iterator?
Here we have a generator expression called squares:
>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
Generator expressions give us back generator objects:
>>> squares
<generator object <genexpr> at 0x7fe6b73fc120>
We can pass a generator object to the built-in next function to get just its next item:
>>> next(squares)
4
But what do you think would happen if we passed a list to the built-in next function?
>>> next(numbers)
If we pass a list to the built-in next function, we'll get a TypeError that says 'list' object is not an iterator:
>>> next(numbers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
We know what an iterable is. An iterable is anything that you're able to iterate over.
An iterator is the object that actually performs the iteration over an iterable.
From Python's perspective, an iterable is any object that can be passed to the built-in iter function to get an iterator from it:
>>> numbers = [2, 1, 3, 4]
>>> my_iterator = iter(numbers)
>>> my_iterator
<list_iterator object at 0x7f2114273e20>
Note that as Python programmers, we don't normally pass iterables to the iter function ourselves (Python uses this internally).
Once you have an iterator, you can pass it to the built-in next function to get just its next item:
>>> next(my_iterator)
2
>>> next(my_iterator)
1
>>> next(my_iterator)
3
>>> next(my_iterator)
4
But if you pass an iterator to the built-in next function and it's exhausted (it has no more items), you'll get a StopIteration exception:
>>> next(my_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
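When running out of items is expected rather than exceptional, the built-in next function also accepts an optional second argument, which is returned instead of raising StopIteration. A minimal sketch (the letters list and the None default are illustrative choices, not part of the examples above):

```python
letters = ["a", "b"]
letters_iterator = iter(letters)

first = next(letters_iterator, None)   # "a"
second = next(letters_iterator, None)  # "b"
third = next(letters_iterator, None)   # exhausted, so the default comes back

print(first, second, third)
```

Picking a default that could never be a real item keeps "no more items" distinguishable from an actual value.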
Iterators power all forms of iteration in Python:
Python's for loops are powered by iterators:
>>> numbers = [2, 1, 3, 4]
>>> for n in numbers:
...     print(n)
...
2
1
3
4
Comprehensions are powered by iterators:
>>> numbers = [2, 1, 3, 4]
>>> cubes = [n**3 for n in numbers]
Tuple unpacking is powered by iterators:
>>> numbers = [2, 1, 3, 4]
>>> first, *middle, last = numbers
Iterators even power the * operator used to unpack items into separate arguments in a function call:
>>> numbers = [2, 1, 3, 4]
>>> print(*numbers)
2 1 3 4
Anything that involves looping involves iterators under the hood.
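One way to see this for yourself is to wrap a list in an object that counts how often it is asked for an iterator. The NoisyList class below is a hypothetical helper written just for this sketch; it relies on the fact that the built-in iter function calls an object's __iter__ method:

```python
class NoisyList:
    """Hypothetical wrapper that counts each request for an iterator."""

    def __init__(self, items):
        self.items = list(items)
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter(self.items)


numbers = NoisyList([2, 1, 3, 4])

for n in numbers:                  # for loop
    pass
cubes = [n**3 for n in numbers]    # comprehension
first, *middle, last = numbers     # tuple unpacking
print(*numbers)                    # * in a function call

print(numbers.iter_calls)          # one iterator request per form of looping
```

Each of the four forms of iteration asks the object for an iterator exactly once, so the counter ends at 4.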
for loops
This means we could actually re-implement a for loop ourselves without a for loop.
This function with a for loop:
def for_loop(iterable):
    for item in iterable:
        print("Do something with", item)
Is the same as this function that uses a while loop:
def for_loop(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted
        else:
            print("Do something with", item)
In this function we're calling iter to get an iterator from the given iterable:
iterator = iter(iterable)
Then we're repeatedly calling next on the iterator, breaking out of our loop only when we get a StopIteration exception, and executing the body of our loop until StopIteration occurs:
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break  # Iterator exhausted
    else:
        print("Do something with", item)
Under the hood, Python's for loops actually handle a StopIteration exception every time the for loop ends.
This for_loop function works just as if we had implemented it with a for loop:
>>> for_loop([2, 1, 3, 4])
Do something with 2
Do something with 1
Do something with 3
Do something with 4
This is what for loops actually do under the hood in Python: they get an iterator from an iterable and repeatedly get the next item from it.
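The same iter-then-next pattern should work for any iterable, not just lists. As a rough sketch, the hypothetical collect function below gathers items without a for loop and is tried on a string, a dictionary, and a generator expression:

```python
def collect(iterable):
    """Gather every item using only iter() and next(), with no for loop."""
    iterator = iter(iterable)
    items = []
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted
        items.append(item)
    return items


print(collect("abc"))                       # strings yield characters
print(collect({"x": 1, "y": 2}))            # dictionaries yield their keys
print(collect(n**2 for n in [2, 1, 3, 4]))  # generators yield computed items
```

Nothing in collect depends on the type of the iterable; it only relies on iter returning an iterator and next eventually raising StopIteration.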
Generators can be passed to the built-in next function, which means generators are iterators:
>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> next(squares)
4
But we normally loop over generators, which means they're also iterables:
>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> for n in squares:
...     print(n)
...
4
1
9
16
So generators are both iterators and iterables.
This is actually true of all iterators in Python. Every iterator is also an iterable.
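One way to check this claim is sketched here with the Iterable and Iterator abstract base classes from the standard library's collections.abc module (they are not used elsewhere in this article; the variable names reuse the earlier examples):

```python
from collections.abc import Iterable, Iterator

numbers = [2, 1, 3, 4]
squares = (n**2 for n in numbers)

# A list is an iterable but not an iterator
print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))

# A generator object is both
print(isinstance(squares, Iterable), isinstance(squares, Iterator))

# Every Iterator counts as an Iterable
print(issubclass(Iterator, Iterable))
```

These isinstance checks are a convenience for exploring; the definition that matters is still whether an object works with the built-in iter and next functions.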
This is where things get a little weird (if they weren't weird enough already).
Every iterator is also an iterable.
And any iterable in Python can be passed to the built-in iter function.
That means that iterators, such as generator objects, can be passed to the built-in iter function.
What do you think we'll get if we pass an iterator to the iter function?
>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> iter(squares)
When you pass an iterable to the built-in iter function, it returns an iterator.
Since iterators are both iterables and iterators, they give you themselves back when asked for an iterator:
>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> iter(squares)
<generator object <genexpr> at 0x7fcf52c04a50>
>>> squares
<generator object <genexpr> at 0x7fcf52c04a50>
When you pass an iterator to the built-in iter function, it will give you itself back (which is a little bit weird).
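Comparing object identity makes the difference concrete. In this small sketch (the variable names are illustrative), a generator hands back the very same object, while a list builds a new, independent iterator on every call:

```python
numbers = [2, 1, 3, 4]
squares = (n**2 for n in numbers)

generator_returns_itself = iter(squares) is squares  # True
list_returns_itself = iter(numbers) is numbers       # False

iterator_a = iter(numbers)
iterator_b = iter(numbers)
next(iterator_a)  # advance only the first iterator

print(generator_returns_itself, list_returns_itself)
print(next(iterator_a), next(iterator_b))  # each iterator tracks its own position
```

This is why a list can be looped over many times, while looping over an iterator a second time picks up wherever the first loop stopped.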
All these rules about iterables and iterators are referred to as The Iterator Protocol.
The Iterator Protocol is the set of rules that makes all forms of looping work in Python. This is how looping works, from Python's perspective.
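The Iterator Protocol says that:
1. An iterable can be passed to the built-in iter function to get an iterator from it
2. An iterator can be passed to the built-in next function to get its next item
3. If there are no more items when an iterator is passed to the built-in next function, a StopIteration exception will be raised
4. All iterators are also iterables: they return themselves when passed to the built-in iter function
So an iterable is anything that you can iterate over. And an iterator is the object that actually performs the iteration over an iterable.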
But iterators are also iterables themselves. They're lazy iterables: they get consumed as you loop over them.
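Both properties, laziness and being consumed, can be observed by recording when work actually happens. In this sketch, the square function and the computed list are hypothetical helpers added only to log each computation:

```python
computed = []  # records which numbers have actually been squared


def square(n):
    computed.append(n)
    return n**2


numbers = [2, 1, 3, 4]
squares = (square(n) for n in numbers)

print(computed)            # [] because nothing has been computed yet
print(next(squares))       # 4, computed on demand
print(computed)            # [2]
remaining = list(squares)  # consumes everything that is left
print(remaining)           # [1, 9, 16]
print(list(squares))       # [] because the generator is now exhausted
```

No squaring happens until an item is requested, and once an item has been handed out the generator does not produce it again.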
Normally if you have a generator object (which is an iterator) you probably wouldn't call next on it repeatedly: you'd probably loop over it instead!
Just as sequences are the generic form of a list-like object in Python, iterators are the generic form of a generator-like object.
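An iterator doesn't have to be a generator. As a rough sketch, the hypothetical Countdown class below follows the rules described in this article by defining the two methods that the built-in iter and next functions call, __iter__ and __next__:

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # iterators return themselves

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that there are no more items
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(next(countdown))  # 3
print(list(countdown))  # [2, 1]
print(list(countdown))  # [] because the iterator is exhausted
```

In everyday code a generator function or generator expression is usually a shorter way to get the same behavior; the class form is shown only to make the protocol visible.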
Generator functions look like regular functions but they have one or more yield statements within them. Unlike regular functions, the code within a generator function isn't run when you call it! Calling a generator function returns a generator object, which is a lazy iterable.