# Representing binary data with bytes

Python 3.8 through 3.12

Let's talk about the difference between strings and bytes in Python.

## Creating `bytes` objects in Python

Strings represent text (human language, that is). For example, here we have a string named `text`:

``````
>>> text = "hello"
``````

But there's another type that's closely associated with strings. Creating one looks like writing a string with a `b` prefix in front of it:

``````
>>> data = b"hello"
``````

That `b` is sort of like the `f` before an f-string or the `r` before a raw string. But that `b` doesn't actually make a string; it makes a `bytes` object:

``````
>>> data
b'hello'
>>> type(data)
<class 'bytes'>
``````
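
As a quick check that these really are two different types, here's a minimal sketch (the variable names are only for illustration) showing that Python won't silently mix a string with a `bytes` object:

```python
text = "hello"
data = b"hello"

# They look similar when written out, but they're different types.
print(type(text).__name__)  # str
print(type(data).__name__)  # bytes

# Python refuses to concatenate a string with a bytes object.
try:
    combined = text + data
except TypeError:
    combined = None
    print("can't concatenate str and bytes")
```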

## Strings represent text, `bytes` objects represent binary data

If we loop over a string in Python, we'll get back sub-strings representing each of the characters in that string:

``````
>>> text = "hello"
>>> list(text)
['h', 'e', 'l', 'l', 'o']
``````

What do you think we'll get if we loop over a `bytes` object?

``````
>>> data = b"hello"
>>> list(data)
``````

Since `bytes` objects represent binary data, when we loop over them we get back numbers (from `0` to `255`) representing each of the bytes in that binary data:

``````
>>> data = b"hello"
>>> list(data)
[104, 101, 108, 108, 111]
``````
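
The same idea applies to indexing. A small sketch, using the same `data` value as above: indexing a `bytes` object gives back a single number, while slicing gives back another `bytes` object.

```python
data = b"hello"

first = data[0]    # indexing gives a single number
prefix = data[:2]  # slicing gives another bytes object

print(first)   # 104
print(prefix)  # b'he'

# Every value we get from looping is an int between 0 and 255.
in_range = all(isinstance(n, int) and 0 <= n <= 255 for n in data)
print(in_range)  # True
```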

We can also do the opposite of this. We can take an iterable of numbers and turn it into a `bytes` object by passing it to the `bytes` constructor:

``````
>>> nums = [0, 65, 97, 255]
>>> bytes(nums)
b'\x00Aa\xff'
``````
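
Here's a short sketch of that round trip, along with what happens when a number doesn't fit in a single byte. The `too_big_rejected` flag is just a name made up for this example.

```python
nums = [0, 65, 97, 255]
data = bytes(nums)

# Converting back to a list gives us the original numbers.
round_trip = list(data)
print(round_trip == nums)  # True

# Each number must fit in a single byte (0 through 255).
too_big_rejected = False
try:
    bytes([256])
except ValueError:
    too_big_rejected = True

print(too_big_rejected)  # True
```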

## Where are `bytes` objects used in Python?

All data that comes from outside of our Python process starts as bytes. But if that data represents text (and Python knows it), Python will convert it to strings automatically. That's what happens when we open a file in the default text mode, for example.

If we use the `urllib` module in Python to make an HTTP request, the data that we get back is not represented as a string:

``````
>>> from urllib.request import urlopen
>>> response = urlopen(url)  # url: whichever address we're requesting
>>> data = response.read()
>>> data
b'Grace Jones\n'
>>> type(data)
<class 'bytes'>
``````

The data we get back is represented as a `bytes` object because it might not even represent text. After all, the response to an HTTP request can contain any data, even arbitrary binary data.

If we open up a file with the mode of `rb`, we're opening that file not in the default read-text mode, but instead in read-binary mode.

``````
>>> with open("avatar.jpg", mode="rb") as jpg_file:
...     jpg_data = jpg_file.read()
...
``````

So when we read from that file, the data that we get out of it will not be a string, it'll be a `bytes` object.

``````
>>> type(jpg_data)
<class 'bytes'>
``````

In fact, in this case, where we're opening a JPEG file, we get a `bytes` object with a lot of bytes in it, because it takes a lot of bytes to represent an image:

``````
>>> len(jpg_data)
1108051
``````
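
To see the two file modes side by side without needing an image on disk, here's a minimal sketch that writes a small temporary file and reads it back both ways. The `greeting.txt` file name is hypothetical, and the file lives in a throwaway temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "greeting.txt"  # hypothetical file name
    path.write_bytes(b"hello \xe2\x9c\xa8\n")

    # Read-binary mode hands us the raw bytes.
    with open(path, mode="rb") as binary_file:
        raw = binary_file.read()

    # Read-text mode decodes those bytes into a string for us.
    with open(path, mode="r", encoding="utf-8") as text_file:
        text = text_file.read()

print(type(raw).__name__)   # bytes
print(type(text).__name__)  # str
```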

## How to convert `bytes` into a string

If you end up with a `bytes` object in Python, and you know that that object represents text, you can turn it into a string by calling its `decode` method:

``````
>>> data = b"bytes! \xe2\x9c\xa8"
>>> data.decode()
'bytes! ✨'
``````
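
Decoding only works when the bytes really are text in the encoding we ask for. Here's a small sketch of what can go wrong, assuming some bytes that were encoded as Latin-1 rather than UTF-8 (the example value is made up):

```python
data = b"caf\xe9"  # "café" encoded as Latin-1, which is not valid UTF-8

# Decoding with the wrong encoding raises an exception.
try:
    data.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# If we know the real encoding, decoding works fine.
latin = data.decode("latin-1")
print(latin)  # café

# errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
lossy = data.decode("utf-8", errors="replace")
print(lossy == "caf\ufffd")  # True
```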

The `decode` method (without any arguments passed to it) uses a default character encoding of `utf-8`. Even if we know that the data we're working with uses that default character encoding of `utf-8`, it's considered a best practice to always specify the encoding of our bytes:

``````
>>> text = data.decode("utf-8")
>>> text
'bytes! ✨'
``````

As the Zen of Python says, "Explicit is better than implicit."
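
For bytes that came from an HTTP response, one way to find out which encoding to pass to `decode` is to check the charset declared in the response headers. This is a minimal sketch, assuming the server declares one. The fallback to `utf-8` is an assumption of this example, not a rule, and the `data:` URL is a stand-in so the code runs without a network connection.

```python
from urllib.request import urlopen

# A data: URL keeps this sketch offline; a real request would use an http(s) URL.
with urlopen("data:text/plain;charset=utf-8,Grace%20Jones%0A") as response:
    raw = response.read()
    # Use the declared charset if there is one (falling back is our assumption).
    charset = response.headers.get_content_charset() or "utf-8"

name = raw.decode(charset)
print(type(raw).__name__)   # bytes
print(type(name).__name__)  # str
```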

If for some reason you have a string that you want to turn into bytes, you can call the `encode` method on that string to encode it into bytes:

``````
>>> text.encode()
b'bytes! \xe2\x9c\xa8'
``````

Just like `decode`, the `encode` method defaults to using `utf-8`, but you could specify a different character encoding if you wanted to:

``````
>>> text.encode("utf-8")
b'bytes! \xe2\x9c\xa8'
>>> text.encode("utf-16-le")
b"b\x00y\x00t\x00e\x00s\x00!\x00 \x00('"
``````
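
The choice of encoding changes how many bytes we end up with, and not every encoding can represent every character. A short sketch using the same string as above (the `ascii_ok` flag is a name made up for this example):

```python
text = "bytes! ✨"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

print(len(text))   # 8 characters
print(len(utf8))   # 10 bytes
print(len(utf16))  # 16 bytes

# Decoding with the matching encoding gets the original string back.
print(utf16.decode("utf-16-le") == text)  # True

# ASCII has no way to represent the sparkles character.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(ascii_ok)  # False
```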

## Summary

Strings represent text-based data, while `bytes` objects represent binary data (such as images, video, or anything else you could represent on a computer).

Depending on what you use Python for, you may not encounter `bytes` objects very often. But when you do, the one thing you'll probably want to do with them is call their `decode` method to turn them into a string (assuming those bytes represent text).
