What's the best way to check whether one string contains another string in Python?
Let's say we have two strings, part and whole.
>>> part = "Python"
>>> whole = "I enjoy writing Python code."
We'd like to know whether part is a substring of whole.
That is, we'd like to know whether the entire part string appears within the whole string.
Python's strings have an index method and a find method that can look for substrings.
Folks new to Python often reach for these methods when checking for substrings, but there's an even simpler way to check for a substring.
Let's take a look at the index and find methods first and briefly discuss why I don't recommend using them and why I recommend using the in operator instead.
The index method
The string index method accepts a string and returns the index at which the given string was first found:
>>> whole.index(part)
16
If the given string was not found, a ValueError exception will be raised:
>>> whole.index("Java")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
To determine whether whole contains part, we could use exception handling along with the string index method:
>>> try:
...     whole.index(part)
...     found = True
... except ValueError:
...     found = False
...
>>> found
True
But raising and catching an exception can incur a performance cost, so we may want to avoid the index method for performance reasons alone.
Performance considerations aside, this approach also seems a bit too verbose. We're using 4 lines of code just to check whether one string contains another.
I've always found the behavior of the index method a bit surprising.
I recommend avoiding the string index method for containment checks.
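If we wanted to reuse the index approach, we'd likely end up wrapping it in a function like the sketch below. The name contains_via_index is made up for illustration; it isn't part of Python.

```python
def contains_via_index(whole, part):
    """Return True if part is in whole (hypothetical helper, for illustration only)."""
    try:
        whole.index(part)  # raises ValueError when part is missing
    except ValueError:
        return False
    return True


whole = "I enjoy writing Python code."
print(contains_via_index(whole, "Python"))  # True
print(contains_via_index(whole, "Java"))    # False
```

That's a fair amount of ceremony for a yes-or-no question.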
The find method
The string find method accepts a string and returns the index at which the given string was first found:
>>> whole.find("Python")
16
If the given string was not found, find returns -1 instead:
>>> whole.find("Java")
-1
To determine whether whole contains part, we could use the string find method and then make sure the returned index is greater than -1:
>>> found = whole.find(part) > -1
>>> found
True
That works, but it feels a bit awkward.
In particular, that -1 seems a bit too magical.
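One reason that -1 can trip people up: find returns an index, not a boolean, so testing its result for truthiness gives the wrong answer in two cases. Here's a short sketch (the example string and variable names are just for illustration):

```python
whole = "Python is fun."

# A match at the very start returns index 0, which is falsy.
print(whole.find("Python"))        # 0
print(bool(whole.find("Python")))  # False, even though the substring is there

# A missing substring returns -1, which is truthy.
print(whole.find("Java"))          # -1
print(bool(whole.find("Java")))    # True, even though the substring is missing

# Comparing against -1 explicitly gives the right answer.
found_python = whole.find("Python") != -1
found_java = whole.find("Java") != -1
print(found_python, found_java)    # True False
```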
The count method
What if we count the number of times a given substring is found?
Python's strings have a count method that accepts a substring and returns the number of (non-overlapping) times that substring was found in the string:
>>> count = whole.count("Python")
>>> count
1
This approach works (any count above zero means the substring is present), but it does a bit more than we need.
The index and find methods stop searching as soon as they find the first occurrence of a substring.
But with the count method, our code will always loop all the way through the string in order to count every occurrence.
This performance difference likely isn't a concern with smaller strings, but it is something to keep in mind.
The count method seems like the most readable approach so far, even if it may sometimes be less performant than using the find method.
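To turn the count into a yes-or-no answer, we can compare it against zero. A minimal sketch (the extra example strings are just for illustration):

```python
whole = "I enjoy writing Python code."

# Any count above zero means the substring appears at least once.
found = whole.count("Python") > 0
print(found)  # True

# count keeps scanning after the first match so it can tally every
# non-overlapping occurrence, which is more work than a yes/no check needs.
print("spam spam spam".count("spam"))  # 3
print("aaaa".count("aa"))              # 2
```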
But there's an even better way to check whether one string is a substring of another string.
The in operator
Python has an in operator that works with many data structures, including strings.
Here's in used to check whether a list contains a particular item:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> 5 in numbers
False
With strings, the in operator is used for checking whether one string contains another:
>>> message = "I enjoy writing Python code."
>>> "Python" in message
True
This is exactly the tool we've been seeking!
Python's in operator is typically the most idiomatic way to check whether one string contains another string.
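A few more cases are worth trying out: not in is the negated form, matching is case-sensitive, and the empty string counts as a substring of every string.

```python
message = "I enjoy writing Python code."

print("Python" in message)    # True
print("Java" not in message)  # True
print("python" in message)    # False: the check is case-sensitive
print("" in message)          # True: the empty string is in every string

# The result is a plain bool, so it reads naturally in a condition.
if "Python" in message:
    print("Found it!")
```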
What if you want to ignore capitalization while performing your string containment check?
Take these two strings:
>>> part = "python"
>>> whole = "I enjoy writing Python code."
The part string isn't within the whole string, but it would be if we could somehow disregard whether each letter is uppercase or lowercase.
>>> part in whole
False
To solve this problem, we could combine two tools:
- The in operator
- The casefold method (or the upper or lower methods if you prefer)
The casefold method lowercases a string and also handles certain Unicode characters that lower leaves alone (for example, the German ß becomes ss):
>>> part.casefold()
'python'
>>> whole.casefold()
'i enjoy writing python code.'
If we call casefold on our part and whole strings and then use the in operator, we'll essentially perform a case-insensitive containment check:
>>> part.casefold() in whole.casefold()
True
Many Python problems can't be solved with a single tool, but can be solved fairly simply by combining a couple of simple tools.
Combining the casefold method and the in operator to perform case-insensitive substring checks is just one such helpful mash-up.
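To see where casefold and lower differ, here's a small sketch using the German letter ß, which casefold converts to ss but lower leaves unchanged. The helper name contains_ignoring_case is hypothetical, not something built in to Python.

```python
def contains_ignoring_case(whole, part):
    """Case-insensitive substring check (hypothetical helper name)."""
    return part.casefold() in whole.casefold()


street = "Die Straße ist lang."

print("strasse" in street.lower())                # False: lower keeps the ß
print(contains_ignoring_case(street, "STRASSE"))  # True: casefold turns ß into ss
print(contains_ignoring_case("I enjoy writing Python code.", "python"))  # True
```

For plain ASCII text, lower and casefold give the same result, so either works there.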
What if you need to check whether one string contains another specifically at the beginning or the end of the string?
You can use the string startswith method to check whether one string starts with another string:
>>> whole = "I enjoy writing Python code."
>>> whole.startswith("You ")
False
>>> whole.startswith("I ")
True
You can use the string endswith method to check whether one string ends with another string:
>>> whole = "I enjoy writing Python code."
>>> whole.endswith("code?")
False
>>> whole.endswith("code.")
True
You can think of startswith as a prefix check and endswith as a suffix check.
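Both methods also accept a tuple of strings, which is handy when any one of several prefixes or suffixes is acceptable. A brief sketch (the filename is made up for illustration):

```python
whole = "I enjoy writing Python code."

# A tuple argument means "starts with (or ends with) any of these".
print(whole.startswith(("I ", "You ")))  # True
print(whole.endswith(("?", "!")))        # False

# A common use: checking a filename against several extensions.
filename = "report.csv"  # hypothetical filename
print(filename.endswith((".csv", ".tsv")))  # True
```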
What if you need something slightly more complex? What if you want to check whether a string contains a pattern that can't be described by a simple substring check?
For example, what if we wanted to check whether a string contained PyCon, some whitespace, and then a 4-digit number (e.g. PyCon 2025 or PyCon 3030)?
>>> message = "My first Python conference was PyCon 2014."
At this point, you'll probably want to reach for a regular expression.
Here we're using a regular expression to look for PyCon, one or more whitespace characters, 4 digit characters, and then a "word boundary":
>>> import re
>>> contains_pycon_year = bool(re.search(r"PyCon\s+\d{4}\b", message))
>>> contains_pycon_year
True
Regular expressions are powerful, but they're often confusing for new programmers as well as many experienced Python programmers. I recommend avoiding regular expressions whenever a simple containment check would suffice.
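If the same pattern needs checking in more than one place, one option is to compile it once and wrap it in a small function. This is only a sketch; the names PYCON_YEAR_RE and mentions_pycon_year are made up for illustration.

```python
import re

# PyCon, one or more whitespace characters, 4 digits, then a word boundary.
PYCON_YEAR_RE = re.compile(r"PyCon\s+\d{4}\b")


def mentions_pycon_year(message):
    """Return True if message contains "PyCon" followed by a 4-digit number."""
    return PYCON_YEAR_RE.search(message) is not None


print(mentions_pycon_year("My first Python conference was PyCon 2014."))  # True
print(mentions_pycon_year("I have never been to PyCon."))                 # False
print(mentions_pycon_year("PyCon 20145 is not a year."))                  # False
```

The last call shows what the word boundary is for: without \b, the first four digits of 20145 would count as a match.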
Use the in operator for substring checks
Need to check whether one string contains another in Python?
If all you need is a simple substring check, use the in operator.
>>> part = "Python"
>>> whole = "I enjoy writing Python code."
>>> part in whole
True
If you need a case-insensitive containment check, use the string casefold method:
>>> part = "Python"
>>> whole = "I enjoy writing Python code."
>>> part.casefold() in whole.casefold()
True
If you do need something more complex, you may need to reach for a regular expression.
But please stick to the in operator if you can.
Reaching for a regular expression prematurely can make your code harder to read with little benefit.