8. Python’s typing system

Author: Tue Nguyen

8.1. Outline

  • Why do we care about data types?

  • Built-in vs. user-defined types

  • A bird’s-eye view of Python’s typing system

  • Brief discussion of the common built-in types

  • How to check types in Python

8.2. Why data types?

  • A large part of programming is dealing with real-life data

  • But machines don’t understand real-life data

  • Thus, we need to represent them in ways that machines can understand through data types

  • A data type specifies

    • What kind of values it could represent

    • Operations we can perform on it

  • Consider some examples in Python

    • Real-life data: integers or whole numbers

      • Python type: int

      • Possible values: integers such as 0, 1, -5

      • Possible operations: addition, multiplication, subtraction, taking absolute value

    • Real-life data: text

      • Python type: str

      • Possible values: strings of text such as "Hello", "Good morning"

      • Possible operations: concatenation, substring, to uppercase, to lowercase

  • Different programming languages might define different types to represent the same kind of real-life data. Thus

    • Data types in Python are different from those of C/C++

    • However, the underlying principles are very similar
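The int and str examples above can be tried out directly. The following is a minimal sketch; the variable names are made up for illustration. It also shows that an operation a type does not define (subtracting a number from a string) is rejected.

```python
# int: values are whole numbers; operations include arithmetic and abs()
a = -5
b = 3
int_sum = a + b          # addition
int_product = a * b      # multiplication
int_abs = abs(a)         # absolute value

# str: values are text; operations include concatenation, slicing, case changes
greeting = "Good morning"
joined = "Hello" + ", " + greeting   # concatenation
first_word = greeting[:4]            # substring via slicing
shouted = greeting.upper()           # to uppercase
whispered = greeting.lower()         # to lowercase

# A type also rules out operations that make no sense for it
try:
    "Hello" - 1
    subtraction_supported = True
except TypeError:
    subtraction_supported = False

print(int_sum, int_product, int_abs)
print(joined, first_word, shouted, whispered, sep=" | ")
print(subtraction_supported)
```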

8.3. Built-in vs. user-defined types

We can divide data types into two major categories: built-in and user-defined

  • Built-in types are part of the language and are readily available once Python is installed

  • User-defined types are built by programmers (based on the built-in types) to suit their needs for more complex applications

This series focuses mostly on the built-in types and some user-defined types developed by professional programmers

  • You won’t learn to build your own user-defined types in this series

  • In fact, you rarely create user-defined types when working as a data analyst/scientist even at intermediate levels
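Although this series does not cover building your own types, a tiny sketch may make the distinction concrete. The Temperature class below is purely hypothetical and exists only for illustration; it is built on top of the built-in float type.

```python
class Temperature:
    """A hypothetical user-defined type that stores degrees Celsius."""

    def __init__(self, celsius):
        # The user-defined type is built on a built-in type (float)
        self.celsius = float(celsius)

    def to_fahrenheit(self):
        return self.celsius * 9 / 5 + 32


reading = Temperature(25)

print(type(3.14))                # a built-in type
print(type(reading))             # a user-defined type
print(reading.to_fahrenheit())
```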

8.4. A bird’s-eye view of Python’s typing system

The following picture illustrates a bird’s-eye view of Python’s typing hierarchy

  • Note that the diagram is not complete

  • I only show common types that, based on my experience, are relevant and important

  • The main purpose is to provide the fundamentals, not to overwhelm you with types that you will never use

8.5. Brief discussion of common built-in types

8.5.1. Unknown values

  • Motivation: used to represent the state of being unknown/undefined in both data kind and data value

    • Note that being undefined is NOT the same as being empty

    • An empty list still belongs to the list type

  • Real-life data: unknown values, missing values

  • Python implementation: NoneType
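The difference between "unknown" and "empty" can be sketched as follows; the variable names are illustrative only.

```python
missing_score = None   # the value is unknown
empty_scores = []      # a known value: a list that happens to have no elements

print(type(missing_score))
print(type(empty_scores))

# The idiomatic way to test for None is the "is" operator
print(missing_score is None)
print(empty_scores is None)

# An empty list is still a list, so list operations work on it
print(len(empty_scores))
```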

8.5.2. Truth values

  • Motivation: used to represent binary data

  • Real-life data: True/False, Yes/No, Success/Failure, Male/Female, Good/Bad

  • Python implementation: bool

8.5.3. Numbers

  • Real-life data: numeric values such as whole numbers, real numbers, or complex numbers

  • Python implementation: int, float, complex

  • Notes

    • In Python, the bool type is a sub-type of int

    • As you will see later, True and False behave like the integers 1 and 0
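A short sketch of the three numeric types and of the bool/int relationship noted above; the variable names are made up for illustration.

```python
whole = 7            # int
real = 7.5           # float
imaginary = 2 + 3j   # complex

print(type(whole), type(real), type(imaginary))

# bool is a sub-type of int, so True and False act as 1 and 0 in arithmetic
print(issubclass(bool, int))
print(True == 1, False == 0)
total_true = True + True + False

# A handy consequence: summing booleans counts how many are True
passed = [True, False, True, True]
num_passed = sum(passed)

print(total_true, num_passed)
```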

8.5.4. Sequences

  • Motivation: used to represent an ordered collection of elements under a single name

    • The order of the elements of the sequence matters.

    • We access an element of a sequence using the sequence name and the element’s position (or index)

  • Real-life data: a list of student scores, an array of customer names

  • Python implementation: list, tuple, range, str
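Access by position works the same way across all four sequence types. A minimal sketch with illustrative values:

```python
scores = [8.5, 9.0, 7.5]   # list
point = (3, 4)             # tuple
first_five = range(5)      # range: 0, 1, 2, 3, 4
name = "Alice"             # str

# Positions start at 0; negative positions count from the end
print(scores[0], scores[-1])
print(point[1])
print(first_five[2])
print(name[0], name[-1])

# len() reports how many elements a sequence holds
print(len(scores), len(point), len(first_five), len(name))
```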

8.5.5. Mappings

  • Motivation: used to represent a collection of key-value pairs

    • Elements are identified by key, not by position (dict preserves insertion order since Python 3.7, but values are still looked up by key)

    • We access an element of a mapping using the corresponding key

  • Real-life data: key="name" and value="John", key="age" and value=20

  • Python implementation: dict
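The name/age example above translates into a dict as in this minimal sketch; the "email" key is a made-up example of a key that is absent.

```python
person = {"name": "John", "age": 20}

# Look values up by key
print(person["name"])
print(person["age"])

# Check whether a key exists before relying on it
has_email = "email" in person

# get() returns None instead of raising an error when the key is absent
email = person.get("email")

print(has_email, email)
```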

8.5.6. Sets

  • Motivation: used to represent sets (as in mathematics)

    • A set is a collection of unique elements

    • There is no intrinsic order in a set

    • There is no way to access an element of a set by position or by key

    • We can check whether a value is in a set, or loop over all of its elements

  • Real-life data: a set of black-listed customers, a set of promotion codes

  • Python implementation: set, frozenset
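The properties listed above can be checked with a small sketch; the promotion codes are made-up values.

```python
# Duplicates collapse: "SPRING" is stored only once
promo_codes = {"SPRING", "SUMMER", "SPRING"}
num_codes = len(promo_codes)

# Membership checks are the main way to query a set
has_summer = "SUMMER" in promo_codes
has_winter = "WINTER" in promo_codes

# Sets do not support positions, so indexing raises TypeError
try:
    promo_codes[0]
    indexing_works = True
except TypeError:
    indexing_works = False

# frozenset is the immutable variant: it has no add() method
frozen_codes = frozenset(promo_codes)
can_add_to_set = hasattr(promo_codes, "add")
can_add_to_frozenset = hasattr(frozen_codes, "add")

print(num_codes, has_summer, has_winter, indexing_works)
print(can_add_to_set, can_add_to_frozenset)
```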

8.5.7. Callable

  • Motivation: used to represent objects that we can call to perform an action

  • Real-life data: a function that computes the square root of a number, a method that sorts elements of a list

  • Python implementation: builtin_function_or_method
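A brief sketch of callables, using the square-root function and list-sorting method mentioned above. Note that builtin_function_or_method covers built-ins only; a function you define yourself with def (such as the hypothetical square below) has the type function. The built-in callable() tells you whether an object can be called.

```python
import math

numbers = [3, 1, 2]

print(type(len))            # a built-in function
print(type(math.sqrt))      # a built-in function from the math module
print(type(numbers.sort))   # a built-in method of a list


def square(x):
    """A hypothetical function defined by us, for comparison."""
    return x * x


print(type(square))

# callable() checks whether an object can be called
print(callable(len), callable(square), callable(5))

# Calling performs the action
root = math.sqrt(16)
numbers.sort()
print(root, numbers)
```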

8.6. Get types

We use the type() function to get the type of a literal, variable, or expression

a) Ex 1: literals

# None literal
type(None)
NoneType
# Bool literal
type(True)
bool
# Integer literal
type(4)
int
# Real literal
type(1.23)
float
# Complex literal
type(1 + 3j)
complex
# String literal
type("Hello")
str
# List literal
type([1, 2, 3])
list
# Tuple literal
type((1, 2, 3))
tuple
# Set literal
type({1, 2, 3})
set
# Dictionary literal
type({"name": "Bob", "age": 20})
dict

b) Ex 2: variables

# Integer variable
x = 5
type(x)
int
# String variable
x = "Hello. How are you?"
type(x)
str
# List variable
scores = ["A", "B", "C"]
type(scores)
list

c) Ex 3: expressions

When the input of type() is an expression, Python will evaluate the expression and return the type of the result

# Int expression
type(2 + 3)
int
# Float expression
type(2.5 + 1.2)
float
# String expression
type("Hello." + " How are you")
str
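Because type() looks at the result of the expression, the type you get back can differ from the types of the operands. A minimal sketch with illustrative variable names:

```python
mixed = 2 + 3.0           # int + float gives a float
division = 10 / 2         # true division gives a float, even when the result is whole
floor_division = 10 // 2  # floor division of two ints gives an int
comparison = 3 > 5        # a comparison gives a bool

print(type(mixed).__name__)
print(type(division).__name__)
print(type(floor_division).__name__)
print(type(comparison).__name__)
```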

8.7. Check types

  • We use the isinstance() function to check if a literal, variable, or expression belongs to a certain type

  • This function returns either True or False

# We can use isinstance() function to check if
# a variable belongs to a specific data type
# Let's see its documentation (the ? prefix works in IPython/Jupyter; use help(isinstance) elsewhere)
?isinstance
# Check if True is a bool
isinstance(True, bool)
True
# Check if 100 is a bool
isinstance(100, bool)
False
# Check if 100 is an int
isinstance(100, int)
True
# Check if True is an int
# It is, because bool is a sub-type of int
isinstance(True, int)
True
# Now check if a variable is a bool
x = 3 > 5
print(x)
print(isinstance(x, bool))
False
True
# Check if a variable is a float
x = 10 / 3
print(x)
print(isinstance(x, float))
3.3333333333333335
True
# We can check if a variable belongs to one of several types
# Ex: check if x is a number (int or float)
x = 365 / 12
isinstance(x, (int, float))
True
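One consequence of bool being a sub-type of int is that isinstance() and an exact comparison on type() can disagree. The sketch below illustrates this; is_number is a hypothetical helper written for this example, not a built-in.

```python
flag = True

# isinstance() accepts sub-types, so a bool counts as an int
print(isinstance(flag, int))

# Comparing type() directly only matches the exact type
print(type(flag) is int)
print(type(flag) is bool)


def is_number(value):
    """Hypothetical helper: True for int or float values, but not for bool."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


print(is_number(3), is_number(2.5), is_number(True), is_number("3"))
```

Whether True should count as a number depends on the application; excluding it, as this helper does, is only one possible design choice.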

8.8. Flat vs. collection types

We can also informally divide Python’s built-in types into flat types and collection types

  • An object of a flat type can hold a single element only

  • An object of a collection type can hold potentially multiple elements

Within the scope of this series

  • Flat types include NoneType, bool, int, float, complex

  • Collection types include

    • Sequences: list, tuple, range, str

    • Mappings: dict

    • Sets: set, frozenset

Callable types such as builtin_function_or_method can be considered flat types. However, since they are so different from the ones mentioned above, I will not combine them in the “Data types” sections

a) Ex 1: flat types

# Int
x = 5
print(x)
5
# Float
x = 10.5
print(x)
10.5

b) Ex 2: collection types

# List
x = [1, 2, 3]
print(x)
[1, 2, 3]
# Get the first element of the list
x[0]
1
# Get the last element of the list
x[-1]
3
# Set
x = {1, 2, 3}
print(x)
{1, 2, 3}
# Check if 2 is in the set
2 in x
True
# Check if 10 is in the set
10 in x
False
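One informal way to see the flat/collection split is to ask an object how many elements it holds. In this sketch (with made-up values), len() works on every collection type but raises TypeError for a flat value such as an int.

```python
values = {
    "list": [1, 2, 3],
    "tuple": (1, 2),
    "str": "abc",
    "dict": {"name": "Bob", "age": 20},
    "set": {1, 2, 3},
}

# Every collection can report its number of elements
sizes = {label: len(value) for label, value in values.items()}
print(sizes)

# A flat value has no notion of length
try:
    len(5)
    int_has_len = True
except TypeError:
    int_has_len = False

print(int_has_len)
```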

8.9. Summary

Why data types?

  • We use data types to represent real-life data in ways that computers can understand

  • A data type specifies

    • Possible values it can represent

    • Possible operations on it

  • Different programming languages might define different types to represent the same kind of real-life data

Built-in vs. user-defined types

  • Built-in types are part of the language and are readily available once Python is installed

  • User-defined types are built by programmers (based on the built-in types) to suit their needs for more complex applications

Common built-in types in Python

  • For unknown data: NoneType

  • For binary data: bool

  • For numeric data: int, float, complex

  • For sequences: list, tuple, range, str

  • For key-value pair data: dict

  • For sets: set, frozenset

  • For actions: builtin_function_or_method

Get data types

  • Use the type() function to get the type of a literal, variable, or expression

Check data types

  • Use the isinstance() function to check if a literal, variable, or expression belongs to a certain type

Flat vs. collection types

  • An object of a flat type can hold a single element only

  • An object of a collection type can hold potentially multiple elements

  • Within the scope of this series

    • Flat types include NoneType, bool, int, float, complex

    • Collection types include

      • Sequences: list, tuple, range, str

      • Mappings: dict

      • Sets: set, frozenset

  • Callable types such as builtin_function_or_method can be considered flat types, but since they are very different from the rest, we will treat them separately