Learn how to manage your Python version and libraries.
Sometimes, when working with Python, you may want to create a class that is just a simple container for some data, so you just write a plain class. Well, Python developers thought ahead of you on that use case and created typing.NamedTuple and dataclasses.dataclass for you.
Picture this scenario: you just bought five new games on the Steam sale. At first, you are happy and thrilled to play, but then you realize that your library has over 200 games and that probably "Cities Skylines II" is not your top priority.
Instead of playing, you decide to write a Python script to organize your library, and you start like this:
class Game:
def __init__(self, name: str, price: float, rating: float):
self.name = name
self.price = price
self.rating = rating
Then you saw on a YouTube video that you could use a __repr__ method to print your objects in a more readable way, so you add this to your class:
def __repr__(self) -> str:
return f"""
{self.name.upper()}
Your code is looking good, so you proceed to program a Library class to hold all your games:
class Library:
def __init__(self):
self.games = []
def add_game(self, game: Game):
self.games.append(game)
Wouldn't it be nice to have a method that returns cool metrics about my library?
def get_cool_metrics(self) -> int:
return len(self.games)
What about the total price?
def get_cool_metrics(self) -> tuple[int, float]:
return len(self.games), sum(game.price for game in self.games)
And what about the total playtime?
But wait. That would imply adding an argument to the constructor of Game, updating the __repr__... And then the get_cool_metrics methods would return a tuple[int, float, float]? Why is everything so complicated?
Python dataclasses are a way to create classes that are just simple containers for data. Their syntax is simpler, and they have a lot of pre-implemented dunder methods, including the constructor.
You could use a dataclass to create the Game class.
from dataclasses import dataclass
@dataclass
class Game:
name: str
price: float
rating: float
And adding the playtime attribute is just as simple as adding a new line to the class definition.
And they could be used for the return type of the get_cool_metrics method, right?
While dataclasses are a great way to create simple classes, they are dynamic, or, in other words, mutable and slow. And what you really need is a C-like struct to hold data.
In Python, the most similar thing to a struct is a collections.namedtuple and a typing.NamedTuple.
You shouldn't use collections.namedtuple. The syntax is ugly, and it is not as powerful as the typing.NamedTuple, as it benefits from the Python classes syntax.
Now, we can create a "struct" to hold the cool metrics of our library.
from typing import NamedTuple
class CoolMetrics(NamedTuple):
total_games: int
total_price: float
total_playtime: float
As you can see, the syntax is very similar to the dataclass syntax. But remember, NamedTuple acts the same way as a Python tuple with named attributes. It is immutable and faster than a class.
Finally, our library method would look like this:
def get_cool_metrics(self) -> CoolMetrics:
total_games = len(self.games)
total_price = sum(game.price for game in self.games)
total_playtime = sum(game.playtime for game in self.games)
return CoolMetrics(total_games, total_price, total_playtime)
Now you can access the data by dot notation, and extend the CoolMetrics class and the get_cool_metrics method at will.
A NamedTuple is a simple container for data, a fancier Python tuple, or a C-like struct. It's immutable: you can't change the values after the object is created. In the case of the CoolMetrics object, why would you change any attribute?
On the other hand, a dataclass is a prettier way of writing a simple Python class that holds state. It offers a lot of pre-implemented magic methods, like __init__, __repr__, __eq__, and so on. It is mutable, and you can change the values at will. For example, I could easily update the playtime of my Game class by accessing the attribute.
In this article, we saw how to use NamedTuple and dataclass to create simple classes that hold data and their main differences. Now, go and play some video games, because your library won't ever shrink.