Docstrings in Python

Transcript:

Let's talk about docstrings in Python.

What is a docstring

We have a function called get_hypotenuse:

from math import sqrt


def get_hypotenuse(a, b):
    """Return right triangle hypotenuse, given its other two sides."""
    return sqrt(a**2 + b**2)

If we ask for help on this function:

>>> help(get_hypotenuse)

We'll see a friendly help message:

get_hypotenuse(a, b)
    Return right triangle hypotenuse, given its other two sides.

We wrote that message when we defined this function.

Docstrings must be the first statement in a function

The very first statement in the get_hypotenuse function is a triple-quoted string (often called a multi-line string, even when it fits on one line):

def get_hypotenuse(a, b):
    """Return right triangle hypotenuse, given its other two sides."""
    return sqrt(a**2 + b**2)

But that string somehow shows up as documentation.

Note that this isn't a comment. Comments look like this in Python:

def get_hypotenuse(a, b):
    # A comment
    return sqrt(a**2 + b**2)

That's a single-line comment. We don't have multi-line comments in Python.

This multi-line string we wrote acts as documentation:

    """
    Return right triangle hypotenuse, given its other two sides.
    """

This string is called a docstring. It acts as documentation for this function because of where it is and what it is.

If we were to move this string below our return statement:

def get_hypotenuse(a, b):
    return sqrt(a**2 + b**2)
    "Return right triangle hypotenuse, given its other two sides."

And then we asked for help on our get_hypotenuse function again:

>>> help(get_hypotenuse)
get_hypotenuse(a, b)

We'd see that there isn't any documentation for the function now.

The documentation string has to be the very first statement inside the function. It can't be part of a larger statement. For example, it can't be the right-hand side of an assignment:

from math import sqrt


def get_hypotenuse(a, b):
    z = """Return right triangle hypotenuse, given its other two sides."""
    return sqrt(a**2 + b**2)

It must be just one string on its own.

Triple quotes: convention or requirement?

What if we took this multi-line string and removed the triple quotes, making it a single-line string:

from math import sqrt


def get_hypotenuse(a, b):
    "Return right triangle hypotenuse, given its other two sides."
    return sqrt(a**2 + b**2)

If we ask for help on get_hypotenuse now, will we still see our docstring?

>>> help(get_hypotenuse)

We will!

get_hypotenuse(a, b)
    Return right triangle hypotenuse, given its other two sides.

PEP 257: the docstring style guide

It's very common to see triple quotes used for docstrings, but that's just a convention. The convention comes from PEP 257. Python Enhancement Proposal 257 is basically a style guide for how to write your docstrings in Python. PEP 257 says you should use triple quotes so that it's easy to expand a docstring onto multiple lines later.

Another reason I prefer to use multi-line strings for docstrings is that they stand out a little bit more: those triple quotes catch my eye.

PEP 257 also says to start your docstrings with a capital letter, use the imperative mood (meaning "return" instead of "returns"), end with a period, and a whole bunch of other stuff.

Take a look at PEP 257 if you're trying to figure out some best practice for writing your own docstrings in Python.
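Here's a rough sketch of what a longer docstring might look like when it follows PEP 257: a one-line summary, a blank line, and then the details, with the closing quotes on their own line. The extra description text is made up for illustration; only the summary line comes from our original function.

```python
import inspect
from math import sqrt


def get_hypotenuse(a, b):
    """Return right triangle hypotenuse, given its other two sides.

    The arguments a and b are the lengths of the two legs (the sides
    that meet at the right angle). The result is a float.
    """
    return sqrt(a**2 + b**2)


# inspect.getdoc returns the docstring with its indentation cleaned up.
summary_line = inspect.getdoc(get_hypotenuse).splitlines()[0]
print(summary_line)
print(get_hypotenuse(3, 4))
```

The triple quotes are what let the docstring grow past one line without changing its delimiters.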

The __doc__ attribute

So if the very first statement in any function is a string hanging out on its own, that's a docstring. Python will read docstrings and display them whenever you ask for help on that function.

In fact, Python will even attach the docstring to the function. Functions in Python have a __doc__ attribute, and this __doc__ attribute holds the documentation for that function:

>>> get_hypotenuse.__doc__
'Return right triangle hypotenuse, given its other two sides.'
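The __doc__ attribute also gives us a quick way to check the placement rules from earlier. This is just a sketch: the four function names below are made up for the comparison, and each one puts the same string in a different spot.

```python
from math import sqrt


def first_statement(a, b):
    """Return right triangle hypotenuse, given its other two sides."""
    return sqrt(a**2 + b**2)


def single_quoted(a, b):
    "Return right triangle hypotenuse, given its other two sides."
    return sqrt(a**2 + b**2)


def after_return(a, b):
    return sqrt(a**2 + b**2)
    "Return right triangle hypotenuse, given its other two sides."


def assigned(a, b):
    z = """Return right triangle hypotenuse, given its other two sides."""
    return sqrt(a**2 + b**2)


# Only a bare string that is the first statement becomes the docstring.
for function in (first_statement, single_quoted, after_return, assigned):
    print(f"{function.__name__}: {function.__doc__!r}")
```

The first two functions report the string as their __doc__, while the last two report None, because their string is either not the first statement or not standing on its own.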

Docstrings in classes

So if a function has a docstring, you'll see when you look at its source code that the very first statement is that string. The same rule applies to classes.

We have a class called Point:

class Point:
    """A 3-dimensional point."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

The very first statement in that Point class is a string hanging out on its own.

If we ask for help on this class, we'll see this docstring:

>>> help(Point)
class Point(builtins.object)
| Point(x, y, z)
|
| A 3-dimensional point.
|
| Methods defined here:
|
| __init__(self, x, y, z)
|   Initialize self. See help(type(self)) for accurate signature.
|

In fact, this is how the help function works in general.

If you ask for help on an object, Python will look for the docstring of that object.
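One way to see that lookup in action, as a rough sketch, is to capture the help text as a string instead of printing it. The help function is built on the standard library's pydoc module, and pydoc.render_doc returns roughly the same text that help would show (the title line differs a bit). The help_text name is just for this example.

```python
import pydoc


class Point:
    """A 3-dimensional point."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z


# Render the documentation as plain text rather than paging it to the screen.
help_text = pydoc.render_doc(Point, renderer=pydoc.plaintext)

print(Point.__doc__)
print("A 3-dimensional point." in help_text)
```

The class docstring shows up both on Point.__doc__ and inside the rendered help text.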

Modules have docstrings too

Even modules can have docstrings.

We have a module called hi.py that starts with a string:

"""Hi there!"""

name = "Trey"
x = 4

If we import this hi module and ask for help on it, we'll see the docstring for that module:

>>> import hi
>>> help(hi)
NAME
    hi - Hi there!
DATA
    name = 'Trey'
    x = 4
FILE
    /home/trey/hi.py
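We can also check a module docstring without creating a file. This sketch keeps the contents of hi.py in a string (HI_SOURCE is a made-up name), asks the ast module for the docstring, and then runs the code to show that Python stores the same string under the __doc__ name.

```python
import ast

# The contents of hi.py, held in a string so the example is self-contained.
HI_SOURCE = '''"""Hi there!"""

name = "Trey"
x = 4
'''

# ast.get_docstring returns the leading string of a module, class, or function.
module_doc = ast.get_docstring(ast.parse(HI_SOURCE))
print(module_doc)

# Running the module code stores its docstring under the __doc__ name.
namespace = {}
exec(compile(HI_SOURCE, "hi.py", "exec"), namespace)
print(namespace["__doc__"])
```

Both prints show Hi there!, the same text that help(hi) lists next to the module name.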

Summary

So if the very first statement in a function, a module, or a class is a string hanging out on its own, regardless of whether it's a multi-line string or not, that's a docstring.

That docstring will be read by Python, parsed by Python, and attached to the object to act as documentation for that object.

It's a best practice to use docstrings to document your code rather than using comments. Comments are ignored by Python, but docstrings are not.

And if you see a docstring in code, note that it has a special meaning: don't move it to the second statement in a function because that wouldn't make it a docstring anymore. Keep your docstrings as the first statement in your classes, modules, and functions so that Python will be able to read them.