Classes and objects#

Object-oriented programming: encapsulation#

Python is also an object-oriented language. Object-oriented programming is very useful and used in many libraries so it is very useful to understand how the simple object-oriented mechanisms work in Python.

For some problems, Object-oriented programming is a very efficient paradigm. Many libraries use it so it is worth understanding what is object oriented programming (POO) and when it is useful.

Concepts#

Object

An object is an entity that has a state and a behavior. Objects are the basic elements of object-oriented system.

Class

Classes are “families” of objects. A class is a pattern that describes how objects will be built.

Example: the weather stations#

Solution 0: a list of lists#

Let us suppose we have a set of weather stations that do measurements of wind speed and temperature. Suppose now one wants to compute some statistics on these data. A basic representation of a station will be an array of arrays: wind values and temperature values.

paris = [[10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3]]

# get wind when temperature is maximal
idx_max_temp = paris[1].index(max(paris[1]))
print(f"max temp is {paris[1][idx_max_temp]}°C at index {idx_max_temp} ")
print(f"wind speed at max temp = {paris[0][idx_max_temp]} km/h")
max temp is 5°C at index 1 
wind speed at max temp = 0 km/h

Comments on this solution#

Many problems:

  • if the number of measurements increases (e.g. having rainfall, humidity, …) the previous indexing will not be valid (what will paris[5] represent? wind, temperature, …, ?)

  • Code analysis is not (that) straightforward

Solution 1: a dict of lists#

We can use a dictionnary:

paris = {"wind": [10, 0, 20, 30, 20, 0], "temperature": [1, 5, 1, -1, -1, 3]}

# get wind when temperature is minimal
paris_temp = paris["temperature"]
idx_max_temp = paris_temp.index(max(paris_temp))

print(f"max temp is {paris_temp[idx_max_temp]}°C at index {idx_max_temp}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h

Comments#

  • Pro

    • More readable code (reading paris["temperature"] is clearer than paris[1]).

    • Less error prone code (i.e. using words as keys allow to not use index numbers that are easily mistaken and lead to code that is hard to read and debug)

  • Con

    • The code to compute the final result is not very readable

Solution 2: add functions#

paris = {"wind": [10, 0, 20, 30, 20, 0], "temperature": [1, 5, 1, -1, -1, 3]}


def max_temp(station: dict[str, list[float]]) -> float:
    """returns the maximum temperature available in the station"""
    return max(station["temperature"])


def arg_max_temp(station: dict[str, list[float]]) -> int:
    """returns the index of maximum temperature available in the station"""
    max_temperature = max_temp(station)
    return station["temperature"].index(max_temperature)


idx_max_temp = arg_max_temp(paris)

print(f"max temp is {max_temp(paris)}°C at index {arg_max_temp(paris)}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h

Comments#

  • Pro:

    • Adding functions leads to a code that is easier to read, hence easier to debug.

    • Testing functions can be done separately from the rest of the code.

    • The computation done on the second part depends upon the functions (i.e it depends on the function definitions not their implementations).

    • Adding function allows to reuse code: computing the max temperature is something one could want to do in other places.

  • Con

    • We rely on the fact that the dictionnaries have been built correctly (for example wind and temperature arrays have the same length).

Solution 3: init function#

Define a function that builds the station (delegate the generation of the station dictionnary to a function).

def build_station(wind: list[float], temp: list[float]) -> dict[str, list[float]]:
    """Build a station given wind and temp
    :param wind: (list) floats of winds
    :param temp: (list) float of temperatures
    """
    if len(wind) != len(temp):
        raise ValueError("wind and temperature should have the same size")
    return {"wind": list(wind), "temperature": list(temp)}


def max_temp(station: dict[str, list[float]]) -> float:
    """returns the maximum temperature available in the station"""
    return max(station["temperature"])


def arg_max_temp(station: dict[str, list[float]]) -> int:
    """returns the index of maximum temperature available in the station"""
    max_temperature = max_temp(station)
    return station["temperature"].index(max_temperature)


paris = build_station([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
idx_max_temp = arg_max_temp(paris)

print(f"max temp is {max_temp(paris)}°C at index {arg_max_temp(paris)}")
print(f"wind speed at max temp = {paris['wind'][idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h

Comments#

  • If the dedicated function build_station is used, the returned dictionary is well structured.

  • If one changes build_station, only max_temp and arg_max_temp have to be changed accordingly

  • We use a list comprehension to be able to have parameters wind and temp provided by any ordered iterable (e.g. see test_build_station_with_iterable wtih range)

  • BUT if we have a new kind of station, i.e. that holds only wind and humidity, we want to avoid to be able to use max_temp with it.

Solution 4: using a class#

We would like to “embed” the max_temp and the arg_max_temp in the “dictionnary station” in order to address the last point.

And here comes object-oriented programming !

A class defines a template used for building object. In our example, the class (named WeatherStation) defines the specifications of what is a weather station (i.e, a weather station should contain an array for wind speeds, named “wind”, and an array for temperatures, named “temp”). paris should now be an object that answers to these specifications. Is is called an instance of the class WeatherStation.

When defining the class, we need to define how to initialize the object (special “function” __init__).

class WeatherStation(object):
    """A weather station that holds wind and temperature

    :param wind: any ordered iterable
    :param temperature: any ordered iterable

    wind and temperature must have the same length.

    """

    def __init__(self, wind: list[float], temperature: list[float]):
        """initialize the weather station.
        Precondition: wind and temperature must have the same length.
                      ValueError is raised if this is not the case
        :param wind: any ordered iterable
        :param temperature: any ordered iterable"""
        self.wind = list(wind)
        self.temp = list(temperature)
        if len(self.wind) != len(self.temp):
            raise ValueError(
                "wind and temperature should have the same size"
                f" got len(wind)={len(self.wind)} vs "
                f" len(temp)={len(self.temp)}"
            )

    def max_temp(self) -> float:
        """returns the maximum temperature recorded in the station"""
        return max(self.temp)

    def arg_max_temp(self) -> int:
        """returns the index of (one of the) maximum temperature recorded in the station"""
        return self.temp.index(self.max_temp())


paris = WeatherStation([10, 0, 20, 30, 20, 0], [1, 5, 1, -1, -1, 3])
idx_max_temp = paris.arg_max_temp()

print(f"max temp is {paris.max_temp()}°C at index {paris.arg_max_temp()}")
print(f"wind speed at max temp = {paris.wind[idx_max_temp]} km/h")
max temp is 5°C at index 1
wind speed at max temp = 0 km/h

Comments#

  • The max_temp and the arg_max_temp are now part of the class WeatherStation. Functions attached to classes are named methods. Similary, wind and temp lists are also now part this class. Variables attached to classes are named members or attributes.

  • if max_temp method is called in many places, we can improve it by caching the result. This will not affect code the uses the class.

  • arg_max_temp method should be rewriten as we implicitelly check equality of floats.

An object (here paris) thus contains both attributes (holding data for example) and methods to access and/or process the data.

Exercise 15 (Try to code with class)

  • Add a method (perceived_temp) that takes as input a temperature and wind and return the perceived temperature, i.e. taking into account the wind chill effect.

  • Modify max_temp and arg_max_temp so that they take an additional optional boolean parameter (e.g. perceived default to False). If perceived is False, the methods have the same behaviour as before. If perceived is True, the temperatures to process are the perceived temperatures.

Comments :#

  • The wind array was changed to have different maximum temperatures for the air and perceived temperatures: for air temperatures, the max is 5°C (with a wind speed 50 km/h). For perceived temperatures, the max is 3°C (as the wind speed is 0).

  • It was a choice to set the apparent/perceived temperature to the air temperature if the wind speed is 0 so the tests were written with this in mind. Testing such choices allows to have clear inputs/outputs.

  • isinstance allows to test the type of an object (in this case, we test if apparent_temps is a list)

  • When testing boolean in if structures: use if perceived: rather than if perceived == True:. It is equivalent but clearer and shorter !

Coming next#

What if we now have a weather station that also measure humidity ?

Do we need to rewrite everything ?

What if we rewrite everything and we find a bug ?

Here comes inheritance