Let's talk about Python's walrus operator.
We have a function called get_quantity that accepts a string argument which represents a number and a unit (either kilograms or grams):
import re
UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')
def get_quantity(string):
    match = UNITS_RE.search(string)
    if match:
        return (int(match.group('quantity')), match.group('units'))
    return int(string)
When we call this function with a number and a unit, it returns a tuple with 2 items: the number (converted to an integer) and the unit.
>>> get_quantity('4 kg')
(4, 'kg')
>>> get_quantity('4 g')
(4, 'g')
If we give this function a string representing just a number (no units), it will give us that number back converted to an integer:
>>> get_quantity('4')
4
This get_quantity function assumes that whatever we give to it is either the pattern (number and unit) or just an integer.
We're doing this using regular expressions, which are a form of pattern matching:
import re
UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')
We're not going to get into regular expressions right now.
Instead we're going to focus on these two lines of code (from get_quantity):
match = UNITS_RE.search(string)
if match:
On the first line, the match variable stores either a match object or None.
Then if match asks the question "did we get something that is truthy?"
A match object is truthy and None is falsey, so we're basically checking whether we got a match object to work with.
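As a quick aside, here's a minimal sketch of that truthiness check, reusing the same UNITS_RE pattern. The names matched and unmatched are made up for this illustration.

```python
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

# A successful search returns a match object, which is always truthy.
matched = UNITS_RE.search('4 kg')

# An unsuccessful search returns None, which is falsey.
unmatched = UNITS_RE.search('4')

print(bool(matched))      # True
print(bool(unmatched))    # False
print(unmatched is None)  # True
```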
Those two lines above (the assignment to match and the conditional check based on match) can actually be combined into one line of code.
We can take these two lines of code:
match = UNITS_RE.search(string)
if match:
And combine them into one line of code using an assignment expression (new in Python 3.8):
if match := UNITS_RE.search(string):
Before we had an assignment statement and a condition (that we were checking in our if statement).
Now we have both in one line of code.
import re
UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')
def get_quantity(string):
    if match := UNITS_RE.search(string):
        return (int(match.group('quantity')), match.group('units'))
    return int(string)
We're using the walrus operator, which is the thing that powers assignment expressions.
Assignment expressions allow us to embed an assignment statement inside of another line of code.
They use the walrus operator (:=):
if match := UNITS_RE.search(string):
This is different from a plain assignment statement (=), because an assignment statement has to stand on its own and can't be embedded inside another expression:
match = UNITS_RE.search(string)
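To make that difference concrete, here's a small sketch. The names count, total, and embedding_failed are made up for this illustration, and we use the built-in compile function only so we can show the syntax error without crashing the example.

```python
# An assignment expression evaluates to the value it assigns,
# so it can be embedded inside a larger expression.
total = (count := 3) + 1
print(count, total)  # 3 4

# A plain assignment is a statement with no value, so the same idea
# written with "=" doesn't even compile.
try:
    compile("if count = 3: pass", "<example>", "exec")
except SyntaxError:
    embedding_failed = True
else:
    embedding_failed = False

print(embedding_failed)  # True
```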
The := is called the walrus operator because it looks kind of like a walrus on its side: the colon looks sort of like eyes and the equal sign looks kind of like tusks.
Checking to see if we got a match object when using regular expressions in Python is a very common use of the walrus operator.
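For example, the walrus operator can be handy when we want to try several patterns in turn. This is just a sketch: the classify function and the WEIGHT_RE and VOLUME_RE patterns are hypothetical and made up for this illustration.

```python
import re

# Hypothetical patterns, made up for illustration.
WEIGHT_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')
VOLUME_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>l|ml)$')

def classify(string):
    """Return a (kind, quantity, units) tuple, or None if nothing matches."""
    # Each branch assigns the match object and checks it in one step.
    if match := WEIGHT_RE.search(string):
        return ('weight', int(match['quantity']), match['units'])
    elif match := VOLUME_RE.search(string):
        return ('volume', int(match['quantity']), match['units'])
    return None

print(classify('4 kg'))    # ('weight', 4, 'kg')
print(classify('250 ml'))  # ('volume', 250, 'ml')
print(classify('4'))       # None
```

Without assignment expressions, the second check would need to be nested inside an else block so that match could be reassigned before testing it.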
Another common use case for the walrus operator is in a while loop.
Specifically it's common to see a walrus operator used in a while loop that repeatedly assigns a value to a variable and then checks a condition based on that value.
With the walrus operator we can perform both of those actions at the same time.
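Here's a minimal sketch of a loop with that shape, using an in-memory io.StringIO object as a stand-in for a file. The stream contents and the lines list are made up for this illustration.

```python
import io

# An in-memory stand-in for a text file; the contents are made up.
stream = io.StringIO("first\nsecond\nthird\n")

lines = []
# readline() returns an empty (falsey) string at the end of the stream,
# so the loop condition both assigns to line and checks it.
while line := stream.readline():
    lines.append(line.rstrip("\n"))

print(lines)  # ['first', 'second', 'third']
```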
We have a function called compute_md5:
import hashlib
def compute_md5(filename):
    md5 = hashlib.md5()
    with open(filename, mode="rb") as f:
        while chunk := f.read(8192):
            md5.update(chunk)
    return md5.hexdigest()
This function takes a file name and gives us back the MD5 checksum of that file:
>>> compute_md5('units.py')
'b6a5563be535cb94a44d8aea5f9b0f8c'
We might use a function like this if we were trying to check for duplicate files or verify that a large file downloaded accurately.
We're not going to focus on the details of this function though.
What we care about is what the walrus operator is doing in this compute_md5 function, and what the alternative to the walrus operator would be here.
We're repeatedly reading eight kilobytes (8192 bytes) into the chunk variable:
while chunk := f.read(8192):
    md5.update(chunk)
The alternative to this is to assign to the chunk variable before our loop, check the value of chunk in our loop condition, and also assign to chunk at the end of each loop iteration:
chunk = f.read(8192)
while chunk:
    md5.update(chunk)
    chunk = f.read(8192)
I would argue that using an assignment expression makes this code more readable than the alternative because we've taken what was three lines of code and turned it into just one line.
In each iteration of our loop we're grabbing a chunk, checking its truthiness (to see whether we've reached the end of the file) and assigning that chunk to the chunk variable.
And we're doing all of this in just one line of code:
while chunk := f.read(8192):
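If we want to convince ourselves that the two loop styles behave the same way, we could compare them side by side. This is just a sketch: the md5_with_walrus and md5_without_walrus helpers are made up for this illustration, and they accept an already-open binary file object (here an in-memory io.BytesIO) instead of a file name.

```python
import hashlib
import io

def md5_with_walrus(f):
    """Hash a binary file object, assigning and checking chunk in one line."""
    md5 = hashlib.md5()
    while chunk := f.read(8192):
        md5.update(chunk)
    return md5.hexdigest()

def md5_without_walrus(f):
    """Hash a binary file object, assigning chunk before and inside the loop."""
    md5 = hashlib.md5()
    chunk = f.read(8192)
    while chunk:
        md5.update(chunk)
        chunk = f.read(8192)
    return md5.hexdigest()

# Made-up data that is large enough to span three 8192-byte chunks.
data = b"x" * 20_000

same = md5_with_walrus(io.BytesIO(data)) == md5_without_walrus(io.BytesIO(data))
print(same)  # True
```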
Assignment expressions use the walrus operator (:=).
Assignment expressions are a way of taking an assignment statement and embedding it in another line of code. I don't recommend using them unless they make your code more readable.