14. List type#

Author: Tue Nguyen

14.1. Outline#

  • What is a list?

  • Create a list

  • Index a list: read & write

  • Slice a list: read & write

  • Nested lists

  • Iterate through a list

  • Mutability

  • Copy a list

  • Operations on a list

  • Unpack a list

  • List comprehensions

14.2. What is a list?#

  • The list type is one of the 4 sequence types that we will learn in this series

  • Each element can be accessed using the list name and the element’s index

  • Python starts indexing from 0

  • Negative index means counting from right to left

  • Lists are mutable, meaning that they can be changed after they are created

14.3. Create a list#

  • To create a list we use []

  • Elements are separated by commas ,

# Init a list
x = [1, 2, 3, 4]

# Print value and type
print(x)
print(type(x))
[1, 2, 3, 4]
<class 'list'>
# A list can hold elements of different types
# including lists
x = [None, True, 10, 20.5, "ABC", [1, 2]]
print(x)
print(type(x))
[None, True, 10, 20.5, 'ABC', [1, 2]]
<class 'list'>
# If there are too many elements
# you can arrange them on multiple lines
x = ["Apple", "Banana", "Orange", "Peach",
    "Guava", "Watermelon", "Pineapple"]

print(x)
print(type(x))
['Apple', 'Banana', 'Orange', 'Peach', 'Guava', 'Watermelon', 'Pineapple']
<class 'list'>

14.4. Index a list#

  • Indexing a list means accessing one single element of a list

  • There are two types of indexing

    • Indexing (read): access an element and read its value, but do not modify it

    • Indexing (write): access an element, then modify it

14.4.1. Indexing (read)#

# Init a list
x = [1, 2, 3, 4, 5]
x
[1, 2, 3, 4, 5]
# Get the first element
x[0]
1
# Get the third element
x[2]
3
# Get the last element
x[-1]
5
# Get the second to last element
x[-2]
4
# If you index beyond the range of a list
# you will get an error: list index out of range
# Try x[100] to see it
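The out-of-range error mentioned above can be sketched as follows. This is only an illustration: the try/except pattern is not covered in this section, and the variable name message is made up for the example.

```python
x = [1, 2, 3, 4, 5]

# Indexing past the end of the list raises IndexError
try:
    x[100]
except IndexError as err:
    message = str(err)
    print("IndexError:", message)

# Valid indexes run from -len(x) to len(x) - 1
print(x[len(x) - 1])  # 5, same as x[-1]
print(x[-len(x)])     # 1, same as x[0]
```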

14.4.2. Indexing (write)#

# Init a list
x = [1, 2, 3, 4, 5]
x
[1, 2, 3, 4, 5]
# Change the third element to 99
x[2] = 99
x
[1, 2, 99, 4, 5]
# Change the last element to "Hello"
x[-1] = "Hello"
x
[1, 2, 99, 4, 'Hello']
# Delete first element
# Deletion is considered an indexing (write) because
# we access the element using its index, then modify (delete) it
del x[0]
x
[2, 99, 4, 'Hello']
# Be careful when using del to delete multiple elements 
# because after each del, the list is shorter by one element
# so you need to adjust the counting accordingly
# Let's see this example
del x[0], x[1]
x
[99, 'Hello']
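If the goal is to delete the elements at the original positions 0 and 1, two possible ways to avoid the shifting problem are sketched below. The names a, b, and c are illustrative, and .copy() (covered later in this section) is used only to start each attempt from the same content.

```python
x = [2, 99, 4, "Hello"]

# Option 1: delete both positions at once as a slice
a = x.copy()
del a[0:2]
print(a)  # [4, 'Hello']

# Option 2: delete from right to left, so the remaining positions don't shift
b = x.copy()
del b[1], b[0]
print(b)  # [4, 'Hello']

# For comparison: left to right without adjusting the index
c = x.copy()
del c[0], c[1]
print(c)  # [99, 'Hello']
```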

14.5. Slice a list#

  • Indexing is accessing one single element at a time

  • Slicing accesses a bunch of elements (called a slice) simultaneously

  • As with indexing, there are also 2 types of slicing

    • Slicing (read): access a slice, but do not modify it

    • Slicing (write): access a slice, then modify it

14.5.1. Slicing (read)#

  • To slice a list, we use the following syntax: x[start:stop:step]

    • x: the name of the list

    • start: starting position (will be included in the result)

    • stop: ending position (will be excluded from the result)

    • step: how far the index moves between two selected elements (step = 1 takes every element, step = 2 takes every 2nd element)

  • In many cases, we want a consecutive slice (step = 1). If so, we can use x[start:stop] instead

  • Slicing a list is a bit tricky, but you will get used to it through practice
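One way to read the syntax above: for in-range, non-negative indexes and a positive step, the slice picks the same elements as visiting the indexes start, start + step, ... and stopping before stop. The sketch below uses range() and a list comprehension, both introduced later, purely as an illustration.

```python
x = [10, 20, 30, 40, 50, 60]
start, stop, step = 1, 5, 2

# The slice itself
sliced = x[start:stop:step]

# The same elements, picked one index at a time: indexes 1 and 3
by_index = [x[i] for i in range(start, stop, step)]

print(sliced)    # [20, 40]
print(by_index)  # [20, 40]
```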

# Init a list
x = [1, 2, 3, 4, 5]
x
[1, 2, 3, 4, 5]
# Slice from the 1st to 3rd element
# Remember start is included, stop is NOT included
# and Python starts indexing at 0
x[0:3:1]
[1, 2, 3]
# We can omit step when step = 1
x[0:3]
[1, 2, 3]
# If we start from the beginning of the list
# we can omit the start
x[:3]
[1, 2, 3]
# Slice from 2nd to 3rd
x[1:3]
[2, 3]
# Slice from 2nd to last
# The last element has index = 4
# Thus we have to set stop = 5 
# because stop is NOT included
x[1:5]
[2, 3, 4, 5]
# However, if we slice to the end of the list
# we can omit stop
# So better way is
x[1:]
[2, 3, 4, 5]
# We can slice from the 1st to last element
# to make a shallow copy of x 
# (more on shallow copy later)
x[:]
[1, 2, 3, 4, 5]
# Slice from 1st to last
# but take every 2nd element (skip 1 element at a time)
x[::2]
[1, 3, 5]
# Skip 2 elements at a time
x[::3]
[1, 4]
# If we slice beyond the range of a list
# we won't get out-of-range error 
# Python will slice until the end of the list
x[1:1000]
[2, 3, 4, 5]
# How to slice from right to left
# We can use a negative step to do this
# But it might involve some ugly counting
# Thus, the better way is to slice from left to right as usual
# then reverse it using reversed()
# reversed() returns an iterator, not a list
# To get a list back, we perform typecasting using list()
# Ex: get 4th, 3rd, 2nd elements in that order
list(reversed(x[1:4]))
[4, 3, 2]
# If we don't use reversed(), then the solution is
x[3:0:-1]
[4, 3, 2]
type(reversed(x[1:4]))
list_reverseiterator
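A common special case of the negative step is reversing the whole list. The short sketch below shows it next to reversed(), plus what happens when start and stop are given in left-to-right order with a negative step. The variable names are illustrative.

```python
x = [1, 2, 3, 4, 5]

# Omitting start and stop with step = -1 walks the whole list backwards
rev_slice = x[::-1]
print(rev_slice)  # [5, 4, 3, 2, 1]

# Same result using reversed()
rev_func = list(reversed(x))
print(rev_func)   # [5, 4, 3, 2, 1]

# With a negative step, start must lie to the right of stop
# otherwise the slice is empty
empty = x[1:4:-1]
print(empty)      # []
```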
# We also need to distinguish between indexing and slicing
# even when there is only one element in the slice
# - Indexing returns an element of the list
# - Slicing returns part of the list, so a slice is a list
# Let's see the example below
print(x[0])
print(x[:1])
1
[1]

14.5.2. Slicing (write)#

# Init a list
x = [1, 2, 3, 4, 5]
x
[1, 2, 3, 4, 5]
# Replace 3 first elements with 100, 200, 300
x[:3] = [100, 200, 300]
x
[100, 200, 300, 4, 5]
# If the value on the RHS is longer/shorter than the slice
# the list will automatically expand/shrink accordingly
# Ex 1: replace the first 3 elements with 98, 99
x[:3] = [98, 99]
x
[98, 99, 4, 5]
# Ex 2: replace the last 2 elements with "A", "B", and "C"
x[-2:] = ["A", "B", "C"]
x
[98, 99, 'A', 'B', 'C']
# We can also delete a slice using del
# Ex: delete 2nd and 3rd elements
del x[1:3]
x
[98, 'B', 'C']
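Because the list expands or shrinks to fit the right-hand side, slice assignment can also insert or delete elements. The following sketch shows a few such cases, including the restriction that applies when the step is not 1.

```python
x = [1, 2, 3, 4, 5]

# The empty slice x[2:2] selects nothing, so assigning to it inserts
x[2:2] = ["A", "B"]
print(x)  # [1, 2, 'A', 'B', 3, 4, 5]

# Assigning an empty list to a slice deletes that slice
x[2:4] = []
print(x)  # [1, 2, 3, 4, 5]

# With a step other than 1, both sides must have the same length
x[::2] = [0, 0, 0]
print(x)  # [0, 2, 0, 4, 0]
```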

14.6. Nested lists#

# A list that contains another list is a nested list
x = [1, 2, ["A", "B", "C", "D"]]
print(x)
[1, 2, ['A', 'B', 'C', 'D']]
# The inner list is just a regular element of the outer list
# and can be accessed by indexing
x[-1]
['A', 'B', 'C', 'D']
# Since the inner list is a list itself
# we can perform indexing and slicing as usual
# such as get the 1st element
x[-1][0]
'A'
# Get last element
x[-1][-1]
'D'
# Get first 3 elements
x[-1][:3]
['A', 'B', 'C']
# Overwrite 2nd element with "Haha"
x[-1][1] = "Haha"
x
[1, 2, ['A', 'Haha', 'C', 'D']]
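Nested lists are often used to represent tables. As a small sketch, assume a hypothetical 2 x 3 grid stored as a list of rows; the first index then picks the row and the second index picks the position within that row.

```python
# A hypothetical grid with 2 rows and 3 values per row
grid = [[1, 2, 3],
        [4, 5, 6]]

print(len(grid))     # 2 (number of rows)
print(len(grid[0]))  # 3 (number of values in the first row)
print(grid[1][2])    # 6 (second row, third value)

# Overwrite the first value of the first row
grid[0][0] = 99
print(grid)          # [[99, 2, 3], [4, 5, 6]]
```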

14.7. Iterate through a list#

14.7.1. The for loop#

  • We use a for loop to iterate through the elements of a list

  • Each visit to an element is called an iteration

  • Collection types such as list, tuple, set are called iterable because we can iterate through their elements

  • The for loop will be discussed in detail in the “Control flows” section

# Init a list
x = [1, 2, 3, 4, 5]
x
[1, 2, 3, 4, 5]
# Iterate through each element and print out its squares
for e in x:
    print(e**2)
1
4
9
16
25

What happened?

  • In the first iteration, e was assigned the first element of x, which was 1

  • Thus, e**2 produced 1

  • After this iteration, Python noticed that it had NOT reached the end of the list, so it moved to the next iteration

  • This time, e was assigned the second element of x, which was 2

  • Thus, e**2 produced 4

  • This logic happened again and again until Python finished the fifth iteration

  • This time, Python noticed that it already reached the end of the list, and the loop ended

14.7.2. Notes on indentation#

  • Notice the indentation in the above example

  • print(e**2) is indented (shifted to the right) by one TAB

  • This indentation indicates that the indented block of code lies inside (NOT parallel to) the for loop

  • Thus, it will be executed in each iteration

  • Let’s consider two examples

# Ex 1
# In this example, both the assignment and print statement
# are indented in (inside the for loop)
# Thus, both statements are executed 5 times
# we can see that 5 outputs are printed out
for e in x:
    squared = e**2
    print(squared)
1
4
9
16
25
# Ex 2
# In this example, only the assignment is indented in
# so the assignment is executed 5 times
# However, the print statement is not inside the for loop
# but it lies parallel to the for loop
# Thus, it is executed only when the for loop ends
# At this stage, variable squared has the value 25
# and you see this value 25 being printed out only once
for e in x:
    squared = e**2
print(squared)
25

14.7.3. Notes on the temporary variable e#

  • You should think of variable e in the above example as a temporary variable used to access each element of the list x

  • Thus, you can use whatever name you want such as i, j, or element

  • However, remember to use the same name both in the for statement and within the for loop

# Ex 1
for i in x:
    print(e**2)
25
25
25
25
25

What happened?

  • Here, Python iterated through all the elements of x

  • In each iteration, i indeed took the value of the corresponding element of x

  • However, we mistakenly used e**2 instead of i**2 in print()

  • From the last example, we know that e still held the value 5

  • That’s why we see 25 being printed 5 times

# Ex 2
# Now, change e in print() to i
# We see everything works correctly
for i in x:
    print(i**2)
1
4
9
16
25

14.7.4. More examples with the for loop#

Ex 1: print out the following text

1 squared = 1
2 squared = 4
...
10 squared = 100
# Solution
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for e in x:
    print(f"{e} squared = {e**2}")
1 squared = 1
2 squared = 4
3 squared = 9
4 squared = 16
5 squared = 25
6 squared = 36
7 squared = 49
8 squared = 64
9 squared = 81
10 squared = 100

Ex 2: filtering values

Create a list x = [1, 2, -7, 8, -9, -4, 10, 30] and then

  • Print out only positive values of x

  • Print out only even values of x

  • Print out only positive and even values of x

# Init a list
x = [1, 2, -7, 8, -9, -4, 10, 30]
# Print out positive values
for e in x:
    if e > 0:
        print(e)
1
2
8
10
30
# Print out even values
for e in x:
    if e % 2 == 0:
        print(e)
2
8
-4
10
30
# Print out positive and even values
for e in x:
    if (e > 0) and (e % 2 == 0):
        print(e)
2
8
10
30

Ex 3: counting

Count the number of students who passed the exam from the list of scores [1, 2, 7, 9, 10, 10, 3, 4, 8] (assume that passing means score >= 4)

# Init scores
scores = [1, 2, 7, 9, 10, 10, 3, 4, 8]

# Init the number of students who passed
num_passed = 0

# Iterate through every score
for s in scores:
    if s >= 4:
        # If we find a pass, then increase num_passed by 1
        num_passed = num_passed + 1
        
# After the loop is done,
# let's check how many passes we have counted
num_passed
6
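The same count can also be written more compactly. The sketch below is one possible alternative, not a replacement for the loop above: it relies on True counting as 1 when summed. The name pass_rate is introduced here only for illustration.

```python
scores = [1, 2, 7, 9, 10, 10, 3, 4, 8]

# Each comparison gives True or False, and True counts as 1 in a sum
num_passed = sum(s >= 4 for s in scores)
print(num_passed)  # 6

# Fraction of students who passed
pass_rate = num_passed / len(scores)
print(round(pass_rate, 2))  # 0.67
```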

14.8. Mutability#

  • Lists are called mutable, meaning that we can change (mutate) them after they are created

  • The opposite of mutable is immutable

  • And all the flat types we have learned so far are immutable

14.8.1. Remark 1#

  • Mutability is a property of an object, not of a variable

  • When an object is created, it has a data type that stays with it forever

  • That data type specifies whether the object is mutable or immutable

  • A variable is just a label

    • We can paste the label on an immutable object

    • But we can then take the label away and paste it on a mutable object

    • Thus, when we say x is mutable, the correct interpretation should be that “the object that x is currently referring to is mutable”

  • Let’s consider some examples

# Python evaluates the RHS and comes up with 5 (an int)
# Since int type is immutable, the object holding 5 is immutable
# We don't say variable x is mutable
x = 5

# This is the ID of the object that x is pointing to
id(x)
140041610592672
# What if we re-assign x to another int?
# Python will point x to a different object
# and the old object (with the ID above) can be destroyed once nothing refers to it
x = 10

# You see x is pointing to another object with different ID
# So in fact, we did NOT replace value 5 in the old object with 10
# What we did is changing the location that x pointed to
# Thus, being able to change the value of x from 5 to 10 doesn't make x mutable
# Remember, mutability is a property of an OBJECT, not of a variable
id(x)
140041610592832
# Now re-assign x to a list
# Python evaluates the RHS and sees a list
# then a mutable object of type list is created
x = [1, 2, 3]

# Here is the ID of that object
id(x)
140041440673856
# Now, replace the 1st element with 99
x[0] = 99

# You see that the list content has changed
# but the ID stays the same
# This is what we meant when saying a list is mutable
print(x)
print(id(x))
[99, 2, 3]
140041440673856
# Now, append a new element to a list
x.append("A")

# The object still stays the same, only its content changes
print(x)
print(id(x))
[99, 2, 3, 'A']
140041440673856

14.8.2. Remark 2#

  • A mutable object is a box that can change what it stores inside

  • An immutable object is a box that is fixed: once created, it stays the same until being destroyed by the garbage collector

  • An object of flat types such as bool, int, or float is immutable

    • Once created, it holds the initial value until it dies

    • We cannot do things like .append() or x[0] = <new_value> as we saw with a list
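A short sketch of how Python rejects in-place changes on an immutable object such as an int, compared with a list. The try/except blocks and the errors list are used only to collect the error names for illustration.

```python
n = 5
errors = []

# int objects have no .append() method
try:
    n.append(1)
except AttributeError as err:
    errors.append(type(err).__name__)
    print("AttributeError:", err)

# int objects do not support item assignment either
try:
    n[0] = 9
except TypeError as err:
    errors.append(type(err).__name__)
    print("TypeError:", err)

# Both operations work on a list
x = [1, 2, 3]
x[0] = 99
x.append("A")
print(x)  # [99, 2, 3, 'A']
```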

14.8.3. Remark 3#

Question

  • Mutable types are more flexible, so why do we need immutable types?

  • Why don’t we mark int or float mutable?

Answer

  • Immutable offers some advantages

    • Readability

    • Performance

    • Security

  • Readability

    • When we see a float or tuple, we are confident that its value stays the same forever

    • But when we see a list, it’s harder to reason about its content because it might change at different stages

    • So we need to pay more attention to all possible scenarios to make sure our program will work as expected

  • Performance

    • Sometimes using immutable objects makes computation faster

    • This is because the program knows in advance what is inside the box

    • When dealing with mutable objects, some checking might be needed to see if the content in the box is allowed for the computation

    • The overhead from this checking procedure might hurt run-time performance

  • Security

    • In many cases, we know for sure that some values never change during the life cycle of a program (Ex: constants)

    • Using immutable types prevents these constants from being modified either accidentally or intentionally

    • Or when passing a list to a function, we might worry that the function will modify the original list, and this might cause unwanted effects when the list is used elsewhere later

    • So without looking at the whole code of the function, we will never be sure

    • But if we pass a float or a tuple to a function, we are completely safe

  • For a more technical discussion, see here
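To make the security point about passing a list to a function concrete, here is a minimal sketch. The function add_bonus and the variable names are hypothetical, and functions themselves are covered elsewhere in the series.

```python
def add_bonus(scores):
    # Hypothetical function that modifies its argument in place
    scores.append(100)
    return sum(scores)

my_scores = [7, 8, 9]
total = add_bonus(my_scores)
print(total)      # 124
print(my_scores)  # [7, 8, 9, 100], the caller's list was changed

# With a tuple, the same function fails loudly
# instead of silently modifying the caller's data
frozen_scores = (7, 8, 9)
try:
    add_bonus(frozen_scores)
except AttributeError as err:
    print("AttributeError:", err)

print(frozen_scores)  # (7, 8, 9)
```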

14.9. Copy a list#

14.9.1. Aliasing#

# Init a list
x = [1, 2, 3]

# See the value and ID
print(x)
print(id(x))
[1, 2, 3]
140041525099328
# Now perform the assignment y = x
# Many think we are COPYING x to y
# But it is NOT true
y = x
# We already knew that the above assignment won't create a new object
# After the assignment, both x and y are pointing to the same object
# Thus, y is considered an ALIAS (or a nickname) of x
# There is no copying here
print(id(x))
print(id(y))
print(x is y)
140041525099328
140041525099328
True
# Let's try to change 1st element of x to 99
x[0] = 99
# We see the change is reflected in both x and y
print(x)
print(y)
[99, 2, 3]
[99, 2, 3]
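A related distinction is worth a quick sketch: == compares the content of two lists, while is checks whether two names point to the same object. The list z below is an extra name added for the comparison.

```python
x = [1, 2, 3]
y = x          # alias: the same object
z = [1, 2, 3]  # a separate object with equal content

print(x == y, x is y)  # True True
print(x == z, x is z)  # True False

# A change made through x is visible through y, but not through z
x.append(4)
print(y)  # [1, 2, 3, 4]
print(z)  # [1, 2, 3]
```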

14.9.2. Shallow copying#

# Init a list
x = [1, 2, [88, 99]]

# Print values and ID
print(x)
print(id(x))
[1, 2, [88, 99]]
140041440301376
# Make a shallow copy using method .copy
y = x.copy()
# Verify x and y hold the same values
print(x)
print(y)
[1, 2, [88, 99]]
[1, 2, [88, 99]]
# But they are pointing to different objects
print(id(x))
print(id(y))
print(x is y)
140041440301376
140041440305024
False
# EXAMPLE 1
# Now try to change the 1st element of x
x[0] = True

# We see that only x changes, y stays the same
print(x)
print(y)
[True, 2, [88, 99]]
[1, 2, [88, 99]]
# EXAMPLE 2
# Now notice the last element of x is a list
# Try to change the 1st element of this list through x
x[-1][0] = False

# We see the change is reflected in both x and y
print(x)
print(y)
[True, 2, [False, 99]]
[1, 2, [False, 99]]
# EXAMPLE 3
# Now replace the last element of x with "Apple"
x[-1] = "Apple"

# We see that the change is reflected in x only
print(x)
print(y)
[True, 2, 'Apple']
[1, 2, [False, 99]]

What happened?

  • When we perform x = [1, 2, [88, 99]]

    • First Python creates 3 objects

      • Object A1 of type int to hold value 1

      • Object A2 of type int to hold value 2

      • Object A3 of type list to hold value [88, 99]

    • Then Python creates another object A of type list to hold references to A1, A2, and A3

    • Thus, each element of x is in fact a reference to an object, not value 1, 2, or [88, 99]

  • When we run x[0], we get 1 because Python looks at x[0] and sees a reference to A1, then Python finds the box A1, opens it, and returns the value inside, which is 1

  • When we perform y = x.copy(), Python creates another object, say B, and copies over all elements of A (the object that x is pointing to) to B, then Python points y to B

  • Thus y and x now contain the same references to A1, A2, A3

  • Notice that A1 and A2 are int objects and thus immutable

  • However, A3 is a list object and thus mutable

In EXAMPLE 1: we perform x[0] = True

  • An object is created, say A1b, to hold the value True

  • Then Python deletes the binding between x[0] with A1 (holding value 1) and establishes a new binding between x[0] and A1b (holding value True)

  • Thus, print(x) gives us [True, 2, [88, 99]]

  • However, y[0] is still referring to the box A1, and thus A1 is not destroyed by the garbage collector

  • And print(y) gives us [1, 2, [88, 99]]

In EXAMPLE 2: we perform x[-1][0] = False

  • x[-1] gives the reference to the box A3

  • This box is also a list itself, so it contains references to 2 other objects holding 88 and 99

  • Thus x[-1][0] gives us the reference to the object holding 88

  • And x[-1][0] = False updates the first element of A3, pointing it to a new object holding False

  • So A3 itself doesn’t change, but only its first element changes

  • Since both x[-1] and y[-1] are still pointing to A3, we can see the change is reflected in both x and y

In EXAMPLE 3: we perform x[-1] = "Apple"

  • Python creates a new object, say A3b, to hold "Apple"

  • Then it deletes the binding between x[-1] and A3 and establishes a new binding between x[-1] and A3b

  • A3 is not destroyed because y[-1] is still pointing to it

  • Thus, x now holds the references to A1b, A2, A3b (with the corresponding values True, 2, "Apple")

  • And y now holds the references to A1, A2, A3 (with the corresponding values 1, 2, [False, 99])

Remarks

  • Since x.copy() only copies the references to the underlying objects, it is called shallow copying

  • There are three ways to make a shallow copy

    • y = x.copy()

    • y = x[:]

    • y = list(x)
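The sketch below checks that the three ways listed above behave the same: each gives a new outer list with equal content, while the inner list is still shared with x. The standard library function copy.copy() is included as well, since it also makes a shallow copy.

```python
import copy

x = [1, 2, [88, 99]]

copies = [x.copy(), x[:], list(x), copy.copy(x)]

for c in copies:
    # Equal content, different outer object, same inner list object
    print(c == x, c is x, c[-1] is x[-1])  # True False True
```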

14.9.3. Deep copying#

We use deep copy when we want to make a copy that is completely independent of the original object

# Init a list
x = [1, 2, [88, 99]]
print(x)
print(id(x))
[1, 2, [88, 99]]
140041440440960
# We have to import module copy
import copy
# Now make a deep copy
y = copy.deepcopy(x)
# Verify that x and y have the same values
print(x)
print(y)
[1, 2, [88, 99]]
[1, 2, [88, 99]]
# But they are different objects
print(id(x))
print(id(y))
print(x is y)
140041440440960
140041440444096
False
# Now change 1st element of last element of x to False
x[-1][0] = False
# We see that only x changes, y stays the same
print(x)
print(y)
[1, 2, [False, 99]]
[1, 2, [88, 99]]
# Confirm that the change is made on x only
print(x)
print(y)
[1, 2, [False, 99]]
[1, 2, [88, 99]]

What happened?

  • When we run x = [1, 2, [88, 99]], 3 objects A1, A2, A3 are created to hold 1, 2, and [88, 99]

  • Then another object A is created to hold the references to A1, A2, and A3

  • When we run y = copy.deepcopy(x), Python creates 3 new objects B1, B2, B3 and then copies over the values inside A1, A2, A3 to B1, B2, and B3

  • Then Python creates a new object B to hold the references to B1, B2, and B3

  • Thus, x and y are now completely independent of each other

Remarks

  • Why don’t we just use deep copy every time?

  • Because deep copy takes more time and space

    • A reference is just an address to a memory slot

    • So it is much smaller in size compared to the actual value being stored in that memory address

    • Thus copying a reference (shallow) is much faster and takes up much less space than copying the actual value (deep)
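A rough way to see the time difference is to measure both kinds of copy with the standard timeit module. This is only a sketch on a made-up nested list: the actual numbers depend on the machine and the data, so no specific timings are shown.

```python
import copy
import timeit

# A hypothetical nested list: 1,000 inner lists of 10 ints each
x = [list(range(10)) for _ in range(1000)]

# Total time in seconds for 20 copies of each kind
shallow_time = timeit.timeit(lambda: x.copy(), number=20)
deep_time = timeit.timeit(lambda: copy.deepcopy(x), number=20)

print(f"shallow: {shallow_time:.4f} s")
print(f"deep:    {deep_time:.4f} s")
```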

14.10. Operations on a list#

We distinguish between two types of operations on a list

  • Regular operations: those that don’t change the list

  • Inplace operations: those that change the list

14.10.1. Regular operations#

# Init 2 lists
x = [2, 7, -3, 11.5, True, 2]
y = ["A", 1, 58.5, 94, [12, 20]]

print(x)
print(y)
[2, 7, -3, 11.5, True, 2]
['A', 1, 58.5, 94, [12, 20]]

Index and slice

# Index the 1st elements
x[0]
2
# Slice first 3 elements
x[:3]
[2, 7, -3]

Get number of elements

len(x)
6

Membership checking

# Is 2 in x?
2 in x
True
# Is 'A' in x?
"A" in x
False

Count number of occurrences

# How many 2 in x?
x.count(2)
2
# How many 2 in y
y.count(2)
0

Concatenate 2 lists

# Concat 2 lists returns a new list
# whose elements are from x and y
x + y
[2, 7, -3, 11.5, True, 2, 'A', 1, 58.5, 94, [12, 20]]
# However, this concatenation just returns the combination
# the original lists do not change
print(x)
print(y)
[2, 7, -3, 11.5, True, 2]
['A', 1, 58.5, 94, [12, 20]]
# If you want to save the result
# you need to assign it to a new variable
z = x + y
print(z)
[2, 7, -3, 11.5, True, 2, 'A', 1, 58.5, 94, [12, 20]]

Replicate a list

# Replicate twice
x * 2
[2, 7, -3, 11.5, True, 2, 2, 7, -3, 11.5, True, 2]
# Quickly make a list of 10 zeros
[0] * 10
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
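One caveat follows from the shallow copying discussion earlier: replication repeats references, so replicating a list that contains a list gives several references to the same inner list. The sketch below shows the effect and one possible workaround using a list comprehension (introduced later); the variable names are illustrative.

```python
# Replicating a nested list repeats the reference to the inner list
shared_rows = [[0, 0, 0]] * 2
shared_rows[0][0] = 1
print(shared_rows)    # [[1, 0, 0], [1, 0, 0]], both rows changed

# Building each row separately gives independent rows
separate_rows = [[0] * 3 for _ in range(2)]
separate_rows[0][0] = 1
print(separate_rows)  # [[1, 0, 0], [0, 0, 0]]
```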

Sort a list

  • We can only sort a list if its elements are comparable

  • Ex: all elements are numbers or all elements are strings

# x only contains int, float, and bool (sub-type of int)
# So x is sortable
# Try to sort x in ascending order
sorted(x)
[-3, True, 2, 2, 7, 11.5]
# Sort x in descending order
sorted(x, reverse=True)
[11.5, 7, 2, 2, True, -3]
# As previously, this operation just returned sorted values
# It doesn't change the original list (x stays the same)
x
[2, 7, -3, 11.5, True, 2]
# If you want to save the result, you might need an assignment
z = sorted(x)
print(x)
print(z)
[2, 7, -3, 11.5, True, 2]
[-3, True, 2, 2, 7, 11.5]
# We can sort a list of strings
sorted(["BOB", "JACK", "JOHN", "BELLE", "TIMON"])
['BELLE', 'BOB', 'JACK', 'JOHN', 'TIMON']
# But we cannot sort a list whose elements are not comparable
# Try with [1, 2, "A"] or [1, 2, [88, 99]]
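A sketch of what that failure looks like, together with one possible workaround: the key parameter of sorted() tells Python what to compare. Here key=str compares the elements as text; whether that ordering makes sense depends on the data, so treat it as an assumption of this example.

```python
mixed = ["A", 2, 1]

# int and str cannot be compared with <, so sorting fails
try:
    sorted(mixed)
except TypeError as err:
    error_name = type(err).__name__
    print("TypeError:", err)

# Compare the elements by their text form instead
by_text = sorted(mixed, key=str)
print(by_text)  # [1, 2, 'A']
```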

Reverse a list

# Recall x
x
[2, 7, -3, 11.5, True, 2]
# We use reversed() function to reverse a list
# However, it will return an iterator for better performance
reversed(x)
<list_reverseiterator at 0x7f5df051ee80>
# We will learn about iterator later
# For now just use typecasting to get back a list
list(reversed(x))
[2, True, 11.5, -3, 7, 2]

Get the index of the first occurrence

# Recall x, we see that 2 appears 2 times in x
# First occurrence at index 0
# Second occurrence at index 5
x
[2, 7, -3, 11.5, True, 2]
# Get index of 1st occurrence of 2 in x
x.index(2)
0
# Get index of 1st occurrence of True in x
x.index(True)
4
# However, you will get an error if the value is not in x
# Try with x.index(1000)
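Two possible ways to handle a value that might be missing are sketched below. The choice of None for "not found" is an assumption of this example. The last lines use the optional start argument of .index() to find the second occurrence.

```python
x = [2, 7, -3, 11.5, True, 2]

# Option 1: check membership first
if 1000 in x:
    pos = x.index(1000)
else:
    pos = None
print(pos)  # None

# Option 2: catch the ValueError
try:
    pos = x.index(1000)
except ValueError:
    pos = None
print(pos)  # None

# .index() also accepts a position to start searching from
second_pos = x.index(2, 1)
print(second_pos)  # 5
```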

14.10.2. Inplace operations#

  • These operations make some change to a list and the change is saved to the original list

  • These operations usually return None, which signals that the list itself was modified instead of a new list being returned

Sort a list

# Init a list
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# We use x.sort() to sort a list in ascending order
# and save this change to x instead of returning a sorted version
# A None is returned, but you cannot see it printed out
x.sort()
# Now verify that the change is saved to the original list
x
[-3, True, 2, 2, 7, 11.5]
# To see that None is returned, assign x.sort() to a variable
z = x.sort()
print(z)
None
# You can also sort x in descending order
x.sort(reverse=True)
x
[11.5, 7, 2, 2, True, -3]

Note: if you want to sort a list x and then use the name x for the sorted version (meaning you don’t care about the original version anymore), then you can use

  • x.sort()

  • Or x = sorted(x)

  • But not x = x.sort() because you will get None

Reverse a list

# Init
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# Reverse
x.reverse()
x
[True, 2, 11.5, -3, 7, 2]

Add one element to a list

# Init
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# Append 99 to x
x.append(99)
x
[2, 7, -3, 11.5, 2, True, 99]

Add multiple elements to a list

# Init
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# Add 2 elements to x
x.extend(["A", "B"])
x
[2, 7, -3, 11.5, 2, True, 'A', 'B']
# Note that if we use append in this case
# ["A", "B"] will be added as a single element
x.append(["A", "B"])
x
[2, 7, -3, 11.5, 2, True, 'A', 'B', ['A', 'B']]
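The += operator on a list behaves like .extend(): it changes the existing object. In contrast, x = x + [...] builds a new list and re-points x to it. The sketch below compares object IDs to show the difference; the helper variables are illustrative.

```python
x = [1, 2]
before = id(x)

# += extends the existing object in place
x += ["A", "B"]
same_after_iadd = id(x) == before
print(x)                # [1, 2, 'A', 'B']
print(same_after_iadd)  # True

# x + [...] creates a new list, then x is re-pointed to it
x = x + ["C"]
same_after_plus = id(x) == before
print(x)                # [1, 2, 'A', 'B', 'C']
print(same_after_plus)  # False
```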

Insert an element

# Init
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# Read the documentation for .insert()
?x.insert
# Now insert "A" to the first position
x.insert(0, "A")
x
['A', 2, 7, -3, 11.5, 2, True]
# Insert "B" to the 3rd position
x.insert(2, "B")
x
['A', 2, 'B', 7, -3, 11.5, 2, True]

Remove an element

# Init
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# We can remove by value
# Remove value 2 (only first occurrence is removed)
x.remove(2)
x
[7, -3, 11.5, 2, True]
# If the value is not in x, we will get an error
# Try with x.remove(1000)
# To avoid the error, use if
if 1000 in x:
    x.remove(1000)
# We can also remove an element by position using del
# Ex: remove the second element
del x[1]
x
[7, 11.5, 2, True]

Pop out an element and delete that element from the list

# Init
x = [2, 7, -3, 11.5, 2, True]
x
[2, 7, -3, 11.5, 2, True]
# See the documentation for x.pop
?x.pop
# Pop the last element, save it to variable z
z = x.pop()
print(x)
print(z)
[2, 7, -3, 11.5, 2]
True
# Pop the 1st element, save it to variable z
z = x.pop(0)
print(x)
print(z)
[7, -3, 11.5, 2]
2
# Pop the 2nd element, save it to variable z
z = x.pop(1)
print(x)
print(z)
[7, 11.5, 2]
-3

Clear all elements

# Init
x = [2, 7, -3, 11.5, 2, True]
print(x)
print(id(x))
[2, 7, -3, 11.5, 2, True]
140041440097728
# x.clear() will remove all elements
# but the object is still the old one (same ID)
x.clear()
print(x)
print(id(x))
[]
140041440097728
# It is different from x = []
# Assignment to [] creates a new object
# and we re-paste the label x on the new object
x = []
print(id(x))
140041440035584

14.11. Unpack a list#

# Suppose we have a list that contains information about a date
my_date = [2022, 4, 1]
my_date
[2022, 4, 1]
# We want to extract year, month, date to 3 separate variables
# The traditional way
y = my_date[0]
m = my_date[1]
d = my_date[2]
# However, we have a better alternative
y, m, d = my_date
# Verify that it works
print(y)
print(m)
print(d)
2022
4
1
  • Sometimes we want to extract some elements and ignore the others

  • We can use _ for the unwanted variables so that we don’t have to waste time thinking of names for them

# Get month and date only
_, m, d = my_date
print(m)
print(d)
4
1
# _ is actually a valid variable name
# That's why the above statement works
# Try to print it out and you will get the year
print(_)
2022
# Thus, if you don't want to use _
# you can use other names instead such as xxx
# However, _ is a widely accepted convention
# because it is so intuitive
xxx, m, d = my_date
print(xxx)
2022
# What if we want to get the year only?
y, _, _ = my_date
print(y)
2022
# However, a better way to ignore the rest is to use *_
# Ex: get year and ignore others
y, *_ = my_date
print(y)
2022
# The above statement unpacks first element of my_date into y
# and throws the rest as a list to _
# Let's verify it
print(_)
[4, 1]
# What if we care only about the date
*_, d = my_date
print(d)
print(_)
1
[2022, 4]
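Two further details about unpacking, sketched with illustrative names: without a starred name, the number of variables must match the number of elements, and a starred name may also sit in the middle.

```python
my_date = [2022, 4, 1]

# 2 names for 3 elements raises ValueError
try:
    y, m = my_date
except ValueError as err:
    print("ValueError:", err)

# A starred name in the middle collects everything between first and last
values = [10, 20, 30, 40, 50]
first, *middle, last = values
print(first)   # 10
print(middle)  # [20, 30, 40]
print(last)    # 50
```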

14.12. List comprehensions#

  • List comprehension is a short way to create a new list from an existing iterable

  • The traditional way is to use a for loop

  • Syntax

    • Without filtering: [expression for elem in a_list]

    • With filtering: [expression for elem in a_list if condition]

a) Ex 1: generate a list of squares

# Init a list 
x = [1, 2, 3, 4, 5]
x
[1, 2, 3, 4, 5]
# Method 1: use a for loop
# to generate a list of squares from x
squares = []
for i in x:
    squares.append(i**2)

squares
[1, 4, 9, 16, 25]
# Method 2: use list comp
squares = [i**2 for i in x]
squares
[1, 4, 9, 16, 25]

b) Ex 2: extract positive numbers

# Init a list 
x = [1, -2, 0, 7, 9, 11, 20, -3, 8]
x
[1, -2, 0, 7, 9, 11, 20, -3, 8]
# Method 1: extract positive numbers from x
pos_numbers = []
for i in x:
    if i > 0:
        pos_numbers.append(i)
        
pos_numbers
[1, 7, 9, 11, 20, 8]
# Method 2: use list comp
pos_numbers = [i for i in x if i > 0]
pos_numbers
[1, 7, 9, 11, 20, 8]

c) Ex 3: remove bad customers

# Init 2 lists
new_cust = ["John", "Jack", "Mary", "Belle", "Tom", "Ricky"]
bad_cust = ["Mary", "Ricky"]
# Method 1: use a for loop
# extract customers in new_cust
# who are not in bad_cust
good_cust = []
for cust in new_cust:
    if cust not in bad_cust:
        good_cust.append(cust)
        
good_cust
['John', 'Jack', 'Belle', 'Tom']
# Method 2: use list comp
good_cust = [cust for cust in new_cust if cust not in bad_cust]
good_cust
['John', 'Jack', 'Belle', 'Tom']
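The expression and the condition can be combined in one comprehension, and the source can be any iterable, not just a list. A brief sketch with illustrative names:

```python
x = [1, -2, 0, 7, 9, 11, 20, -3, 8]

# Keep the positive even numbers (20 and 8), then square them
result = [i**2 for i in x if i > 0 and i % 2 == 0]
print(result)   # [400, 64]

# The source can be another iterable such as range()
squares = [n**2 for n in range(1, 6)]
print(squares)  # [1, 4, 9, 16, 25]
```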

14.13. Summary#

What is a list?

  • A mutable sequence

Create a list

  • Using []. Ex: x = [1, 2, 3]

Index a list

  • Indexing means accessing one single element at a time

  • Python starts indexing at 0

  • Negative index means counting from the right

  • Indexing (read): access an element, read its value but do not modify it

    • x[0]: read 1st element

    • x[2]: read 3rd element

    • x[-1]: read the last element

    • x[-2]: read second to last element

  • Indexing (write): access an element, then modify it

    • x[0] = 1: overwrite 1st element with 1

    • x[-1] = 8: overwrite last element with 8

    • del x[2]: delete 3rd element

  • Will get an out-of-range error if we index beyond the range of the list

Slice a list

  • Slicing means accessing a bunch of elements (called a slice) simultaneously

  • Syntax

    • x[start:stop:step]: most general one

    • x[start:stop]: if step = 1

    • x[start:]: if step = 1 and slice until the end of x

    • x[:stop]: if step = 1 and slice from the beginning of x

  • Note that start is included but stop is NOT included

  • Slicing (read): access a slice, but do not modify it

    • x[:3]: slice first 3 elements

    • x[3:]: slice from 4th element to the end

    • x[-3:]: slice last 3 elements

    • x[1:5]: slice from 2nd to 5th elements

    • x[:]: slice the whole list (equivalent to x.copy())

    • x[::2]: slice from the beginning to end but skip 1 element between each jump

  • Slicing (write)

    • x[:3] = [1, 2, 3]: replace first 3 elements by 3 elements 1, 2, and 3

    • x[:3] = [1, 2]: replace first 3 elements by only 2 elements 1, 2 (the list will automatically shrink)

    • x[:3] = [1, 2, 3, 4]: replace first 3 elements by 4 elements 1, 2, 3, and 4 (the list will automatically expand)

    • del x[1:3]: delete 2nd and 3rd elements

Nested lists

  • A list that contains another list is a nested list

  • Ex: x = [1, 2, ["A", "B", "C", "D"]]

  • Access the inner list: x[-1]

  • Access the first element of the inner list: x[-1][0]

  • Access the last element of the inner list: x[-1][-1]

  • Slice first 3 elements of the inner list: x[-1][:3]

Iterate through a list

  • We use a for loop to iterate through the elements of a list

  • Each visit to an element is called an iteration

  • Collection types such as list, tuple, and set are called iterable because we can iterate through their elements

  • After : of the for statement, we need to indent in (TAB)

    • Statement inside the for loop will be evaluated in every iteration

    • Statement parallel to the for loop is considered outside the for loop

Mutability

  • Mutability is a property of an object, not of a variable

  • A variable name is just a label pasted on an object

    • We can take the label from one object and re-paste it on another object

    • Thus, we never say a variable is mutable or immutable

    • We can only talk about mutability when referring to objects and data types

  • The data type of an object is determined when the object is created and stays with the object forever

  • An object can be mutable or immutable, depending on its data type

    • Examples of mutable types: list, dict, set

    • Examples of immutable types: bool, int, float, str, tuple, frozenset

  • A mutable object is a box that can change what it stores inside

  • An immutable object is a box that is fixed: once created, it stays the same until being destroyed by the garbage collector

  • Pros and cons

    • Mutable types

      • Pros: flexible

      • Cons: less readable, insecure, sometimes slow

    • Immutable:

      • Pros: more readable, secure, and often faster

      • Cons: less flexible

Copy a list

  • For this section, to avoid lengthy wording

    • When I say x, it means the variable

    • When I say x_obj, it means the object that x is pointing to

  • Suppose x is a list

  • Aliasing: y = x

    • This is NOT copying

    • x and y now are pointing to the same object

  • Shallow copying

    • Three ways

      • y = x.copy()

      • y = x[:]

      • y = list(x)

    • Only the references to the elements of x_obj are copied to y_obj

    • So if x_obj contains a mutable element, and we change that element through x, the change will be reflected in both x and y

  • Deep copying

    • We have to import the module copy

    • Then run y = copy.deepcopy(x)

    • The above statement will NOT copy the references

    • It uses the references to look for the actual values in the memory, then creates objects to hold those copied values

    • Thus, y_obj now contains the references to the new objects

    • And x_obj and y_obj are completely independent of each other

  • Pros and cons

    • Shallow copying is faster and takes less memory

    • But deep copying avoids unwanted effects from mutability

Operations on a list

  • Two types of operations on a list

    • Regular operations: those that don’t change the list

    • Inplace operations: those that change the list

  • Regular operations

    • Index (read), slice (read)

    • Get number of elements: len(x)

    • Check membership: "A" in x

    • Count number of occurrences: x.count("A")

    • Concat 2 lists: x + y

    • Replicate a list: x * 3

    • Sort a list: sorted(x)

    • Reverse a list: reversed(x)

    • Get index of first occurrence: x.index("A")

  • Inplace operations

    • Sort a list: x.sort()

    • Reverse a list: x.reverse()

    • Append: x.append("A")

    • Extend: x.extend(["A", "B"])

    • Insert an element: x.insert(0, "A")

    • Remove an element by value: x.remove("A")

    • Remove an element by index: del x[1]

    • Pop out an element: val = x.pop(0)

    • Clear a list: x.clear()

Unpack a list

  • Suppose my_date = [2022, 4, 1]

  • Unpack all: y, m, d = my_date

  • Get year and month only: y, m, _ = my_date

  • Get year only: y, *_ = my_date

  • Get date only: *_, d = my_date

List comprehensions

  • Short way to create a new list from an existing iterable without writing an explicit for loop

  • Syntax 1 (without filtering): [expr for elem in a_list]

  • Syntax 2 (with filtering): [expr for elem in a_list if condition]

14.14. Practice#

To be updated