What is an iterator?

Trey Hunner
5 min. read 4 min. video Python 3.8—3.12

Generator objects are iterators

Here we have a generator expression whose result we've assigned to the name squares:

>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)

Generator expressions give us back generator objects:

>>> squares
<generator object <genexpr> at 0x7fe6b73fc120>

We can pass a generator object to the built-in next function to get just its next item:

>>> next(squares)
4

But what do you think would happen if we passed a list to the built-in next function?

>>> next(numbers)

If we pass a list to the built-in next function, we'll get a TypeError that says 'list' object is not an iterator:

>>> next(numbers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator

We know what an iterable is. An iterable is anything that you're able to iterate over.

An iterator is the object that actually performs the iteration over an iterable.

Iterables are powered by iterators

From Python's perspective, an iterable is any object that can be passed to the built-in iter function to get an iterator from it:

>>> numbers = [2, 1, 3, 4]
>>> my_iterator = iter(numbers)
>>> my_iterator
<list_iterator object at 0x7f2114273e20>

Note that as Python programmers, we don't normally pass iterables to the iter function ourselves (Python uses this internally).

Once you have an iterator, you can pass it to the built-in next function to get just its next item:

>>> next(my_iterator)
2
>>> next(my_iterator)
1
>>> next(my_iterator)
3
>>> next(my_iterator)
4

But if you pass an iterator to the built-in next function and it's exhausted (it has no more items) you'll get a StopIteration exception:

>>> next(my_iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
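If catching StopIteration feels heavy for a one-off lookup, the built-in next function also accepts an optional second argument that is returned when the iterator is exhausted. Here is a minimal sketch of both behaviors; the names items, done, result, and raised are made up for illustration:

```python
numbers = [2, 1, 3, 4]
my_iterator = iter(numbers)

# Consume all four items from the iterator.
items = [next(my_iterator) for _ in range(4)]

# With a default (second argument), next returns the default
# instead of raising StopIteration on an exhausted iterator.
done = object()  # hypothetical sentinel value
result = next(my_iterator, done)

# Without a default, an exhausted iterator raises StopIteration.
try:
    next(my_iterator)
except StopIteration:
    raised = True
else:
    raised = False

print(items)           # [2, 1, 3, 4]
print(result is done)  # True
print(raised)          # True
```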

Iterators power all iteration

Iterators power all forms of iteration in Python:

Python's for loops are powered by iterators:

>>> numbers = [2, 1, 3, 4]
>>> for n in numbers:
...     print(n)
...
2
1
3
4

Comprehensions are powered by iterators:

>>> numbers = [2, 1, 3, 4]
>>> cubes = [n**3 for n in numbers]

Tuple unpacking is powered by iterators:

>>> numbers = [2, 1, 3, 4]
>>> first, *middle, last = numbers

Iterators even power the * operator used to unpack items into separate arguments in a function call:

>>> numbers = [2, 1, 3, 4]
>>> print(*numbers)
2 1 3 4

Anything that involves looping involves iterators under the hood.
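One rough way to see this for yourself is to make an iterable that counts how often it's asked for an iterator. The LoudList class and collect function below are hypothetical helpers written just for this sketch. The __iter__ method is what the built-in iter function calls on an object.

```python
class LoudList:
    """Hypothetical list wrapper that counts requests for an iterator."""

    def __init__(self, items):
        self.items = list(items)
        self.iter_calls = 0

    def __iter__(self):
        # The built-in iter function (and every looping construct) ends up here.
        self.iter_calls += 1
        return iter(self.items)


def collect(*args):
    """Hypothetical helper: return the positional arguments it was given."""
    return args


numbers = LoudList([2, 1, 3, 4])

for n in numbers:                   # for loop
    pass
cubes = [n**3 for n in numbers]     # comprehension
first, *middle, last = numbers      # tuple unpacking
args = collect(*numbers)            # * in a function call

print(numbers.iter_calls)  # 4
```

Each of the four constructs asked the object for an iterator once, without us ever calling iter ourselves.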

Python uses iterators to run for loops

This means we could actually re-implement a for loop ourselves without a for loop.

This function with a for loop:

def for_loop(iterable):
    for item in iterable:
        print("Do something with", item)

Is the same as this function that uses a while loop:

def for_loop(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted
        else:
            print("Do something with", item)

In this function we're calling iter to get an iterator from the given iterable:

    iterator = iter(iterable)

Then we're repeatedly calling next on the iterator, running the body of our loop for each item and breaking out of the loop only when next raises a StopIteration exception:

    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted
        else:
            print("Do something with", item)

Under the hood, Python's for loops catch a StopIteration exception every time a loop runs to completion.

This for_loop function works just as if we had implemented it with a for loop:

>>> for_loop([2, 1, 3, 4])
Do something with 2
Do something with 1
Do something with 3
Do something with 4

This is what for loops actually do under the hood in Python: they get an iterator from an iterable and repeatedly get the next item from it.
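Because the manual version relies only on iter and next, it should behave the same for any iterable, not just lists. The sketch below is a variation on the function above that collects items instead of printing them, so it can be compared against a real for loop. The names manual_loop and real_loop are invented for this example.

```python
def manual_loop(iterable):
    """Visit items the way a for loop would, using only iter and next."""
    seen = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted
        seen.append(item)
    return seen


def real_loop(iterable):
    """The same thing written with an actual for loop."""
    seen = []
    for item in iterable:
        seen.append(item)
    return seen


print(manual_loop([2, 1, 3, 4]))                # [2, 1, 3, 4]
print(manual_loop("hey"))                       # ['h', 'e', 'y']
print(manual_loop(n**2 for n in [2, 1, 3, 4]))  # [4, 1, 9, 16]
```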

Iterators are also iterables

Generators can be passed to the built-in next function, which means generators are iterators:

>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> next(squares)
4

But we normally loop over generators, which means they're also iterables:

>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> for n in squares:
...     print(n)
...
4
1
9
16

So generators are both iterators and iterables.

This is actually true of all iterators in Python. Every iterator is also an iterable.

Iterators are their own iterators

This is where things get a little weird (if they weren't weird enough already).

Every iterator is also an iterable. And any iterable in Python can be passed to the built-in iter function.

That means that iterators, such as generator objects, can be passed to the built-in iter function.

What do you think we'll get if we pass an iterator to the iter function?

>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> iter(squares)

When you pass an iterable to the built-in iter function, it returns an iterator.

Since iterators are both iterables and iterators, they give you themselves back when asked for an iterator:

>>> numbers = [2, 1, 3, 4]
>>> squares = (n**2 for n in numbers)
>>> iter(squares)
<generator object <genexpr> at 0x7fcf52c04a50>
>>> squares
<generator object <genexpr> at 0x7fcf52c04a50>

When you pass an iterator to the built-in iter function, it will give you itself back (which is a little bit weird).
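Comparing memory addresses in the REPL output works, but an identity check with the is operator makes the same point more directly. This small sketch also shows one consequence: since iter gives back the same object, anything that iterates over iter(squares) continues from wherever squares left off. The variable names are only illustrative.

```python
numbers = [2, 1, 3, 4]
squares = (n**2 for n in numbers)

# An iterator hands back itself when asked for an iterator.
same_object = iter(squares) is squares

# A list is not an iterator: each call to iter builds a new, independent iterator.
independent = iter(numbers) is not iter(numbers)

# Because iter(squares) is squares, consuming one consumes the other.
first = next(squares)
rest = list(iter(squares))

print(same_object)  # True
print(independent)  # True
print(first)        # 4
print(rest)         # [1, 9, 16]
```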

All these rules about iterables and iterators are referred to as The Iterator Protocol.

The Iterator Protocol

The Iterator Protocol is the set of rules that makes all forms of looping work in Python. This is how looping works, from Python's perspective.

The Iterator Protocol says that:

  1. An iterable is an object that can be passed to the built-in iter function to get an iterator from it
  2. An iterator can be passed to the built-in next function to get its next item
  3. When an exhausted iterator is passed to the built-in next function, a StopIteration exception will be raised
  4. Iterators are also iterables and they return themselves when passed to the built-in iter function
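To make the four rules concrete, here is a minimal sketch of a hand-written iterator. It assumes something the rules above don't spell out: the built-in iter and next functions work by calling an object's __iter__ and __next__ methods. The Countdown class is a made-up example, not something from the standard library.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Rule 4: an iterator returns itself when passed to iter.
        return self

    def __next__(self):
        # Rule 3: an exhausted iterator raises StopIteration.
        if self.current <= 0:
            raise StopIteration
        # Rule 2: otherwise, next gives back the next item.
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(iter(countdown) is countdown)  # True
print(next(countdown))               # 3
print(list(countdown))               # [2, 1]
print(next(countdown, "exhausted"))  # exhausted
```

In everyday code a generator function or generator expression is usually a shorter way to get the same behavior; the class form is shown here only because it exposes each rule separately.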

Iterators are lazy iterables that power all other iterables

So an iterable is anything that you can iterate over. And an iterator is the object that actually performs the iteration over an iterable.

But iterators are also iterables themselves. They're lazy iterables that get consumed as you loop over them.
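Here is a short sketch of what "consumed" means in practice. Looping over a list twice gives the same items both times, while looping over a generator object a second time gives nothing, because the first loop already used it up. The variable names are only illustrative.

```python
numbers = [2, 1, 3, 4]
squares = (n**2 for n in numbers)

# A list can be looped over again and again.
list_first_pass = [n for n in numbers]
list_second_pass = [n for n in numbers]

# A generator (an iterator) is single-use: the first pass exhausts it.
gen_first_pass = list(squares)
gen_second_pass = list(squares)

print(list_first_pass == list_second_pass)  # True
print(gen_first_pass)                       # [4, 1, 9, 16]
print(gen_second_pass)                      # []
```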

Normally if you have a generator object (which is an iterator) you probably wouldn't call next on it repeatedly: you'd probably loop over it instead!

Just as sequences are the generic form of a list-like object in Python, iterators are the generic form of a generator-like object.
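One way to check these categories in code is with the abstract base classes in the standard library's collections.abc module. This is a rough sketch rather than something you'd typically need in everyday code; the variable names on the left are only for illustration.

```python
from collections.abc import Iterable, Iterator, Sequence

numbers = [2, 1, 3, 4]
squares = (n**2 for n in numbers)

# A list is a sequence and an iterable, but not an iterator.
list_is_sequence = isinstance(numbers, Sequence)
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# A generator object is an iterator (and so also an iterable), but not a sequence.
gen_is_iterator = isinstance(squares, Iterator)
gen_is_iterable = isinstance(squares, Iterable)
gen_is_sequence = isinstance(squares, Sequence)

# The iterator we get from a list is an iterator too, even though it isn't a generator.
list_iterator_is_iterator = isinstance(iter(numbers), Iterator)

print(list_is_sequence, list_is_iterable, list_is_iterator)  # True True False
print(gen_is_iterator, gen_is_iterable, gen_is_sequence)     # True True False
print(list_iterator_is_iterator)                             # True
```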
