Five things in Python that I only learned this year

Published: 24 Jul 2024
Written by: Chun Fei Lung

Many people learn Python as their first or second programming language, but few bother to master it.

My experience with Python has its ups and downs

Python occupies a somewhat peculiar place in the programming language landscape. It is often touted as beginner-friendly and is used in introductory programming courses like Harvard University’s popular CS50: Introduction to Computer Science. But few people go on to become Python experts. I think there are a couple of reasons for that:

Python may be a good language for beginners, but it is not necessarily a good language for those who want to build mobile apps, web apps, or software that needs to be safe and performant. People who develop software professionally often move on to other programming languages like C#, Java, Rust, or JavaScript.
Those who continue to use Python as their main language usually fall into one of two categories. Some develop web applications, for which (memory) safety and performance are less of a concern. Others, like data scientists and analysts, prefer to focus on solving problems rather than bikeshedding endlessly about things like design patterns and architecture.

As a result, many people – myself included – rarely feel the need to learn more about Python than what they need to get the job done. And that’s a pity, because Python has quite a few features that allow you to write less and better code. In this article I describe a few of the features that I found useful, but didn’t learn about until much later.

Formatting strings

Formatting strings in Python via string concatenation can be quite cumbersome, because you will have to manually convert any variable that is not a string:

conversion_rate = 2.20371

# “TypeError: can only concatenate str (not "float") to str”
print('One euro used to be worth ' + conversion_rate + ' Dutch guilders')

# Prints “One euro used to be worth 2.20371 Dutch guilders”
print('One euro used to be worth ' + str(conversion_rate) + ' Dutch guilders')

Python 3.6 introduced formatted string literals (more commonly known as f-strings) which make it a lot easier to construct formatted strings:

# Prints “One euro used to be worth 2.20371 Dutch guilders”
print(f'One euro used to be worth {conversion_rate} Dutch guilders')

When numeric values need to be formatted in a specific way, e.g. with a certain number of decimals, you often see people write code like this:

# 1. Round `conversion_rate` to two decimals. This turns it into `2.2`
# 2. Convert `2.2` to a string so that we can manually add the dropped `0` back
# 3. Left align the `2.2` string, fill empty space with `0` up to 4 characters
rounded_rate = str(round(conversion_rate, 2)).ljust(4, '0')

# Prints “One euro used to be worth 2.20 Dutch guilders”
print(f'One euro used to be worth {rounded_rate} Dutch guilders')

There is no need to do this, because you can also use format specifications to format values directly! Format specifications can be used in f-strings and with format(); the % operator accepts a very similar printf-style syntax:

# All three lines print “One euro used to be worth 2.20 Dutch guilders”
print(f'One euro used to be worth {conversion_rate:.2f} Dutch guilders')
print('One euro used to be worth {:.2f} Dutch guilders'.format(conversion_rate))
print('One euro used to be worth %.2f Dutch guilders' % conversion_rate)
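
Format specifications can do more than fix the number of decimals. The sketch below shows a few other specifiers that come up often; the `population` and `share` values are made up purely for illustration:

```python
conversion_rate = 2.20371
population = 17811291   # made-up example value
share = 0.4567          # made-up example value

# A comma in the specification adds thousands separators
with_separator = f'{population:,}'          # '17,811,291'

# The % type multiplies by 100 and appends a percent sign
as_percentage = f'{share:.1%}'              # '45.7%'

# > right-aligns the value in a field that is 8 characters wide
right_aligned = f'{conversion_rate:>8.2f}'  # '    2.20'

# A leading 0 pads integers with zeroes up to the given width
zero_padded = f'{7:03d}'                    # '007'

print(with_separator, as_percentage, right_aligned, zero_padded)
```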

Dictionaries

Python dictionaries provide an easy way to store arbitrary values in a single data structure, which makes them quite versatile:

metro_line = {
    'destination': 'Isolatorweg',
    'number': 51,
    'color': '#008c45',
    'trains': ['CAF', 'Metropolis', 'Inneo'],
}

metro_line['origin'] = 'Gein' # Adding entries
metro_line['number'] = 50     # Updating entries
del metro_line['trains']      # Deleting entries

# Prints “{'destination': 'Isolatorweg', 'number': 50, 'color': '#008c45', 'origin': 'Gein'}”
print(metro_line)

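# Prints “M50 Isolatorweg”
# (reusing the outer quote type inside an f-string only works in Python 3.12+,
# so the outer quotes are double quotes here)
print(f"M{metro_line['number']} {metro_line['destination']}")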

Dictionary keys are usually simple values like strings or numbers. Not many people know that tuples can also be used as composite dictionary keys:

transit_line_types = {
    (12, 'Amstelstation'): 'tram',
    (40, 'Muiderpoortstation'): 'bus',
    (50, 'Isolatorweg'): 'metro',
}

# Prints “bus”
print(transit_line_types[40, 'Muiderpoortstation'])
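
Tuple keys work because tuples are immutable and therefore hashable. As a small sketch of what that means in practice (the `(53, 'Gaasperplas')` entry is a made-up example), membership tests take the whole tuple, keys can be unpacked while iterating, and a list cannot be used as a key:

```python
transit_line_types = {
    (12, 'Amstelstation'): 'tram',
    (40, 'Muiderpoortstation'): 'bus',
    (50, 'Isolatorweg'): 'metro',
}

# Membership tests compare the whole tuple
has_metro_50 = (50, 'Isolatorweg') in transit_line_types   # True
has_metro_51 = (51, 'Isolatorweg') in transit_line_types   # False

# Tuple keys can be unpacked directly in a loop or comprehension
descriptions = [
    f'{kind} {number} to {destination}'
    for (number, destination), kind in transit_line_types.items()
]

# Lists are mutable and therefore unhashable, so they cannot be keys
try:
    transit_line_types[[53, 'Gaasperplas']] = 'metro'
except TypeError as error:
    message = str(error)

print(descriptions[0])   # tram 12 to Amstelstation
print(message)           # exact wording depends on the Python version
```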

Dictionaries are often used to keep track of things. The following snippet shows a fairly common pattern for constructing such dictionaries using a for loop:

transit_lines = [
    {'number': 50, 'destination': 'Isolatorweg'},
    {'number': 50, 'destination': 'Gein'},
    {'number': 51, 'destination': 'Isolatorweg'},
    {'number': 51, 'destination': 'Centraal Station'},
]

destination_lines = {}
for line in transit_lines:
    if line['destination'] not in destination_lines:
        destination_lines[line['destination']] = []
    destination_lines[line['destination']].append(line['number'])

# Prints “{'Isolatorweg': [50, 51], 'Gein': [50], 'Centraal Station': [51]}”
print(destination_lines)

In this example destination_lines is a dictionary that contains the transit lines that serve each destination. The lines are stored in a list, which is initialised the first time a destination is added.

You can save yourself a bit of work by simply using a defaultdict, which is a special version of dict that automatically initialises dictionary entries with a value of a particular type (in this case list):

from collections import defaultdict

destination_lines = defaultdict(list)
for line in transit_lines:
    destination_lines[line['destination']].append(line['number'])

# Prints “defaultdict(<class 'list'>, {'Isolatorweg': [50, 51], 'Gein': [50], 'Centraal Station': [51]})”
print(destination_lines)

# Prints “{'Isolatorweg': [50, 51], 'Gein': [50], 'Centraal Station': [51]}”
print(dict(destination_lines))
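
One thing to keep in mind is that a defaultdict creates an entry as soon as a missing key is looked up with square brackets, even if you only read it. The sketch below illustrates this with a hypothetical `lines_per_destination` dictionary that uses `set` as its default type:

```python
from collections import defaultdict

lines_per_destination = defaultdict(set)
lines_per_destination['Gein'].add(50)
lines_per_destination['Gein'].add(50)   # duplicates are ignored by the set
lines_per_destination['Gein'].add(54)

# Merely reading a missing key creates it with the default value
count_before = len(lines_per_destination)      # 1
_ = lines_per_destination['Westwijk']          # creates an empty set
count_after = len(lines_per_destination)       # 2

# .get() and `in` do not create entries
missing = lines_per_destination.get('Noord')   # None
has_noord = 'Noord' in lines_per_destination   # False
count_final = len(lines_per_destination)       # still 2

print(dict(lines_per_destination))
```

If those accidental entries are a problem, convert the result to a regular dict once you are done building it.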

Counter is another useful dict subclass that is especially helpful when you need to count things, as it also includes methods that allow you to easily retrieve the sum of all counts (total(), available since Python 3.10) and the n most common elements:

from collections import Counter

destination_line_counts = Counter()
for line in transit_lines:
    destination_line_counts[line['destination']] += 1

# Prints “{'Isolatorweg': 2, 'Gein': 1, 'Centraal Station': 1}”
print(dict(destination_line_counts))
# Prints “4”
print(destination_line_counts.total())
# Prints “[('Isolatorweg', 2)]”
print(destination_line_counts.most_common(1))
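
The loop is not strictly necessary, because a Counter can also be built directly from any iterable. A minimal sketch, reusing the `transit_lines` list from earlier:

```python
from collections import Counter

transit_lines = [
    {'number': 50, 'destination': 'Isolatorweg'},
    {'number': 50, 'destination': 'Gein'},
    {'number': 51, 'destination': 'Isolatorweg'},
    {'number': 51, 'destination': 'Centraal Station'},
]

# Count the destinations in a single expression
destination_line_counts = Counter(line['destination'] for line in transit_lines)

# Without an argument, most_common() returns all entries, highest count first
ranking = destination_line_counts.most_common()

# Missing keys simply count as zero and are not added to the Counter
unknown = destination_line_counts['Westwijk']

print(ranking)   # [('Isolatorweg', 2), ('Gein', 1), ('Centraal Station', 1)]
print(unknown)   # 0
```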

Classes and value objects

Dictionaries are flexible, which makes them useful during prototyping and in situations where the keys are not known beforehand. In all other cases, it’s usually better to use a class, which allows you to read and modify values using dot notation:

class Line:
    def __init__(self, destination: str, number: int, color: str):
        self.destination = destination
        self.number = number
        self.color = color

metro_line = Line('Station Zuid', 52, '#00b2ec')

# Prints “Station Zuid”
print(metro_line.destination)

You can also add the @dataclass decorator to a class to turn it into a data class. This makes Python automatically generate methods such as __init__(), __repr__(), and __eq__():

from dataclasses import dataclass

@dataclass
class Line:
    destination: str
    number: int
    color: str

# Positional arguments still work, but now depend on the order of the fields
metro_line = Line('Station Zuid', 52, '#00b2ec')
# so let’s use named arguments instead
metro_line = Line(destination='Station Zuid', number=52, color='#00b2ec')

# Prints “Station Zuid”
print(metro_line.destination)
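
To make the generated methods a bit more tangible, here is a small sketch of what a data class gives you for free. The default color and the `stops` field are hypothetical additions for the sake of the example:

```python
from dataclasses import dataclass, field

@dataclass
class Line:
    destination: str
    number: int
    color: str = '#000000'                      # hypothetical default value
    stops: list = field(default_factory=list)   # hypothetical extra field

first = Line(destination='Station Zuid', number=52)
second = Line(destination='Station Zuid', number=52)

# __repr__() produces a readable description
description = repr(first)

# __eq__() compares field by field, not by identity
are_equal = first == second                 # True

# default_factory gives every instance its own list
share_stops = first.stops is second.stops   # False

print(description)
```

Note that mutable defaults such as lists have to go through `field(default_factory=...)`; writing `stops: list = []` raises a ValueError when the class is defined.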

Under the hood a Python class is basically a dictionary (side note: Actually, it’s two dictionaries: one for the class and one for the instance.) with some functions. Sadly, this makes it very easy to unintentionally add new attributes to an instance:

metro_line = Line(destination='Noord', number=52, color='#00b2ec')

metro_line.colour = '#0e4a92' # Typo in attribute

# Prints “{'destination': 'Noord', 'number': 52, 'color': '#00b2ec', 'colour': '#0e4a92'}”
print(metro_line.__dict__)

This is where __slots__ comes in. Setting this variable prevents Python from using a dict to store values. Not only does this help you catch typos, it’s also more memory-efficient:

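class Line:
    # __slots__ must be assigned with `=`; a bare annotation has no effect
    __slots__ = ('destination', 'number', 'color')

    def __init__(self, destination: str, number: int, color: str):
        self.destination = destination
        self.number = number
        self.color = color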

metro_line = Line(destination='Station Zuid', number=52, color='#00b2ec')

# “AttributeError: 'Line' object has no attribute 'colour'”
metro_line.colour = '#0e4a92'

If you use data classes and Python 3.10 or later, you can also simply add the slots=True parameter to the @dataclass decorator:

@dataclass(slots=True)
class Line:
    destination: str
    number: int
    color: str

One important thing to note is that data classes are mutable by default. If you need an immutable version of a data class, you can pass frozen=True to the @dataclass decorator or create a named tuple:

from typing import NamedTuple

class Line(NamedTuple):
    destination: str
    number: int
    color: str

metro_line = Line(destination='Station Zuid', number=52, color='#00b2ec')

# Prints “Station Zuid”
print(metro_line.destination)

# “AttributeError: can't set attribute”
metro_line.destination = 'Almere'
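
Immutable does not mean you can never derive a changed value. A minimal sketch of how named tuples handle this: `_replace()` returns a modified copy, and because a named tuple is still a tuple it can be unpacked and compared like one:

```python
from typing import NamedTuple

class Line(NamedTuple):
    destination: str
    number: int
    color: str

metro_line = Line(destination='Station Zuid', number=52, color='#00b2ec')

# _replace() returns a new instance and leaves the original untouched
rerouted = metro_line._replace(destination='Noord')

# A named tuple unpacks like any other tuple
destination, number, color = metro_line

# ... and compares equal to a plain tuple with the same values
equals_plain_tuple = metro_line == ('Station Zuid', 52, '#00b2ec')

# _asdict() converts it to a dictionary
as_dict = metro_line._asdict()

print(rerouted.destination, metro_line.destination)   # Noord Station Zuid
```

That last comparison is also a reason to prefer a frozen data class when you do not want instances to behave like tuples.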

Abstract base classes

Many contemporary programming languages might be described as Java-like. Python is an exception to that rule, not only because of its lack of curly braces and its use of significant indentation, but also because it actually predates Java and thus didn’t get the opportunity to “steal” from it.

For instance, Python has no dedicated interface construct. Instead, you’ll have to use an abstract base class (or simply ABC) to declare abstract methods that must be implemented by classes that inherit from it:

from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def greet(self):
        pass

class Dog(Animal):
    def greet(self):
        print('Awoo!')

    def is_good_boi(self):
        return True

# “TypeError: Can't instantiate abstract class Animal without an implementation for abstract method 'greet'”
Animal().greet()

# Prints “Awoo!”
Dog().greet()
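
The check happens when a class is instantiated, not when it is defined. The sketch below illustrates this with a hypothetical `Cat` class that forgets to implement `greet()`; here the methods return strings instead of printing them so that the results are easier to inspect:

```python
from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def greet(self) -> str:
        ...

class Dog(Animal):
    def greet(self) -> str:
        return 'Awoo!'

class Cat(Animal):   # hypothetical subclass that forgets greet()
    def purr(self) -> str:
        return 'Prrr'

# Defining Cat was fine, but creating an instance is not
try:
    Cat()
except TypeError:
    cat_rejected = True
else:
    cat_rejected = False

dog = Dog()
dog_greeting = dog.greet()
dog_is_animal = isinstance(dog, Animal)

print(cat_rejected, dog_greeting, dog_is_animal)   # True Awoo! True
```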

While Python does not have “real” interfaces, it does have something that you won’t find in languages like Java: multiple inheritance.

Multiple inheritance allows you to define classes with multiple base classes, which coincidentally is also an easy way to make it hard for everyone to reason about your code. But when used with restraint, multiple inheritance can actually be very useful. For instance, it allows us to implement traits and mixins:

class Person:
    def say(self, text: str):
        return text

class Mocking:
    def say(self, text: str):
        original_result = super().say(text)
        return self.__alternating_caps(original_result)

    def __alternating_caps(self, input_string):
        result = []
        for i, char in enumerate(input_string):
            if i % 2 == 0:
                result.append(char.upper())
            else:
                result.append(char.lower())
        return ''.join(result)

class AnnoyedAsian(Mocking, Person):
    pass

# Prints “WhErE ArE YoU ReAlLy fRoM?”
print(AnnoyedAsian().say('Where are you really from?'))
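
The reason `super()` in a mixin ends up calling `Person.say()` is the method resolution order (MRO), which follows the order of the base classes. A small sketch with two hypothetical mixins, `Shouting` and `Polite`, shows that swapping the base classes changes the result:

```python
class Person:
    def say(self, text: str) -> str:
        return text

class Shouting:   # hypothetical mixin
    def say(self, text: str) -> str:
        return super().say(text).upper()

class Polite:     # hypothetical mixin
    def say(self, text: str) -> str:
        return super().say(text) + ', please'

class PoliteShouter(Shouting, Polite, Person):
    pass

class ShoutingPolite(Polite, Shouting, Person):
    pass

# The MRO lists the classes in the order in which super() walks them
mro_names = [cls.__name__ for cls in PoliteShouter.__mro__]

# Shouting runs last here, so it also uppercases the suffix
first_result = PoliteShouter().say('hello')

# Polite runs last here, so the suffix stays lowercase
second_result = ShoutingPolite().say('hello')

print(mro_names)
print(first_result)    # HELLO, PLEASE
print(second_result)   # HELLO, please
```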

Argument passing

Functions in Python can generally be called using positional arguments, keyword arguments, or a combination thereof. Positional arguments take less effort to write, but can be harder to understand and are more error-prone.

As a programmer you can explicitly forbid the use of positional arguments by adding a * to the function signature. Any arguments that follow the * must be provided using keywords.

In the example below all arguments must be passed using keyword arguments:

def save_preferences(*, is_smoker: bool, has_tattoos: bool, is_hot: bool):
    pass

# “TypeError: save_preferences() takes 0 positional arguments but 3 were given”
save_preferences(False, False, True)

# This will work just fine
save_preferences(is_smoker=False, has_tattoos=False, is_hot=True)

In the next example the first argument can (but doesn’t have to) be positional, but the second argument must use a keyword:

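from enum import Enum, auto

class DatingPreference(Enum):
    NEW_FRIENDS = auto()
    SHORT_TERM_FUN = auto()
    SHORT_TERM_RELATIONSHIP = auto()
    HAVING_JEFFREY_INSIDE_OF_ME = auto()

def create_profile(name: str, *, looking_for: DatingPreference):
    pass

# “TypeError: create_profile() takes 1 positional argument but 2 were given”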
create_profile('Annie', DatingPreference.SHORT_TERM_RELATIONSHIP)

# This will work just fine
create_profile('Britta', looking_for=DatingPreference.NEW_FRIENDS)
create_profile(name='Jeff', looking_for=DatingPreference.SHORT_TERM_FUN)
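
The mirror image of * is /, which has been available since Python 3.8: any arguments that come before the / can only be passed positionally. The sketch below combines both in a hypothetical `create_match()` function that is not part of the examples above:

```python
def create_match(first: str, second: str, /, *, score: float = 0.0) -> str:
    # `first` and `second` are positional-only, `score` is keyword-only
    return f'{first} + {second} ({score:.0%})'

match = create_match('Troy', 'Abed', score=1.0)

# Passing positional-only parameters by keyword raises a TypeError
try:
    create_match(first='Troy', second='Abed')
except TypeError:
    keyword_rejected = True
else:
    keyword_rejected = False

# Passing the keyword-only parameter positionally raises one too
try:
    create_match('Troy', 'Abed', 1.0)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

print(match)   # Troy + Abed (100%)
```

Positional-only parameters are mostly useful when the parameter names carry no meaning for the caller, or when you want to keep the freedom to rename them later.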

Speaking of keyword arguments, if you already have a dictionary with keys that correspond with the keyword argument names, you can simply pass that dictionary to the function directly using **:

request = {
    'name': 'Dean',
    'looking_for': DatingPreference.HAVING_JEFFREY_INSIDE_OF_ME,
}

# These two statements are equivalent
create_profile(**request)
create_profile(name=request['name'], looking_for=request['looking_for'])
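
Unpacking is strict: every key in the dictionary must match a parameter name. The following sketch uses a simplified `create_profile()` that takes a plain string for `looking_for` (an assumption made to keep the example self-contained) and also shows the single * for unpacking positional arguments:

```python
def create_profile(name: str, *, looking_for: str) -> str:
    # Simplified stand-in that uses a string instead of an enum
    return f'{name} is looking for {looking_for}'

request = {'name': 'Dean', 'looking_for': 'new friends'}
profile = create_profile(**request)

# A key that does not match any parameter raises a TypeError
bad_request = {**request, 'age': 42}
try:
    create_profile(**bad_request)
except TypeError:
    extra_key_rejected = True
else:
    extra_key_rejected = False

# A single * unpacks a list or tuple into positional arguments
positional = ['Dean']
same_profile = create_profile(*positional, looking_for='new friends')

print(profile)   # Dean is looking for new friends
```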