Python Objects Mutability and Recycling

Overview of Object Referencing and Mutation and Garbage Collection in Python

Objects are the core constructs of any OOP language, and each language defines its own syntax to create, update and destroy them. In Python every object has an identity, a type, and a value. Of these, only the value may change over time, and only if the object is mutable; the identity and type are fixed once the object is created.

In this article we’ll address these topics:

  • What are Variables in Python?
  • How to Copy an Object?
  • Python Garbage Collection

Variables Are Not Boxes

Variables are usually described as boxes or containers, a metaphor that hinders the understanding of reference variables in object-oriented languages.

Python variables are like reference variables in Java: you can think of them as labels with names, attached to objects.

Variables are not boxes holding the values, more like sticky notes on the object to reference it. Image from book Fluent Python

In the next example we modify the list referenced by “var_one”, by appending another item. When we print “var_two” we get the same list.

var_one = [1, 2, 3]
var_two = var_one
var_one.append(4)
print(var_two)
# [1, 2, 3, 4]

This means that var_two references the same list as var_one.

Nothing prevents an object from having several labels assigned to it, i.e. different variables referencing the same object.
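
To make the label metaphor concrete, here is a minimal sketch that contrasts mutating a shared object with rebinding a label. The names first, second and add_item are hypothetical and used only for illustration.

```python
def add_item(items, value):
    """Mutate the list passed in; the caller's label is bound to the same object."""
    items.append(value)


# Two labels on one list: a mutation is visible through both names.
first = [1, 2, 3]
second = first
first.append(4)
print(second)  # [1, 2, 3, 4]

# Rebinding moves the label "first" to a new object; "second" is untouched.
first = [10, 20]
print(second)  # [1, 2, 3, 4]

# A function parameter is just one more label on the same object.
add_item(second, 5)
print(first)   # [10, 20]
print(second)  # [1, 2, 3, 4, 5]
```

Assignment never copies the object; it only attaches another label to it.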

This leads us to another question: how do we check whether two variables refer to the same object, or merely to objects that are equal?

In Python, each object has an id, which the id(obj) function returns. Two variables that reference the same object report the same id. In CPython the id is the memory address of the object, and it is guaranteed to be unique during the object’s lifetime.

charles = {'name': 'Charles L. Dodgson', 'born': 1832}
lewis = charles
print(lewis is charles)
# True

print(id(charles), id(lewis))
# 4300473992 4300473992

lewis['balance'] = 950
print(charles)
# {'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}

alex = {'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}
print(alex == charles)
# True

print(alex is not charles)
# True
Charles and Lewis are bound to the same object, alex is bound to a separate object of equal value. Image from book Fluent Python

The is and is not operators compare the identity of two objects, while the id() function returns an integer representing that identity.

The == operator compares the values of objects, i.e. the data they hold, and this is what we usually care about.

The is operator cannot be overloaded, so Python does not need to look up and call a special method to evaluate it; it simply compares two identities. This tends to make it faster than the == operator, which can be overloaded.

Most built-in types override the __eq__ special method so that == compares values. The default __eq__ inherited from object compares identities.
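
As an illustration of how a user-defined class can take part in this, the sketch below overrides __eq__ so that == compares values while is keeps comparing identities. The Point class is hypothetical and not taken from the book.

```python
class Point:
    """Hypothetical value class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Compare by value; let Python try the other operand for unknown types.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 2)
p2 = Point(1, 2)   # separate object with an equal value
p3 = p1            # second label on the same object

print(p1 == p2)  # True
print(p1 is p2)  # False
print(p1 is p3)  # True
```

Note that a class that defines __eq__ without defining __hash__ becomes unhashable, which is usually what we want for a mutable value object.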

How to Copy an Object? Shallow Copy VS Deep Copy

Shallow copies are the easiest to make, but they may not be what we want.

A shallow copy creates a new outer container but fills it with references to the same items held by the original, i.e. we don’t create new embedded objects, we just reference the existing ones.

This saves memory and causes no problems if all the items are immutable, but if there are mutable items, it may lead to unpleasant surprises.

list_1 = [3, [66, 55, 44], (7, 8, 9)]
list_2 = list(list_1)
Python Tutor visualization

The visualization above shows that a shallow copy made with the list constructor creates a new outer list whose items reference the same objects as the items of the original list.

list_1.append(100)
list_1[1].remove(55)
print('list_1:', list_1)
# list_1: [3, [66, 44], (7, 8, 9), 100]
print('list_2:', list_2)
# list_2: [3, [66, 44], (7, 8, 9)]
list_2[1] += [33, 22]
list_2[2] += (10, 11)
print('list_1:', list_1)
# list_1: [3, [66, 44, 33, 22], (7, 8, 9), 100]
print('list_2:', list_2)
# list_2: [3, [66, 44, 33, 22], (7, 8, 9, 10, 11)]
Last state of execution of the example code

A few things to keep in mind:

  • In-place changes to an embedded mutable object are visible through every variable that references it. Immutable objects cannot be changed in place, so sharing them is harmless.
  • Tuples are immutable, so += on a tuple builds a new tuple and rebinds it, as list_2[2] does in the example, while list_1[2] still refers to the original tuple.
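
The two points above can be checked with the is operator. This is a small sketch using hypothetical names (inner_list, inner_tuple, original, shallow), not the variables from the example.

```python
inner_list = [66, 44]
inner_tuple = (7, 8, 9)
original = [inner_list, inner_tuple]
shallow = list(original)   # shallow copy: new outer list, same items

shallow[0] += [33]     # list: extended in place, still the shared object
shallow[1] += (10,)    # tuple: a new tuple is built and bound to shallow[1]

print(shallow[0] is inner_list)    # True
print(shallow[1] is inner_tuple)   # False
print(original)  # [[66, 44, 33], (7, 8, 9)]
print(shallow)   # [[66, 44, 33], (7, 8, 9, 10)]
```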

Deep Copy to The Rescue

Deep copies are duplicates that do not share references to embedded objects, i.e. when we deep copy a list or any other object, its embedded objects are copied as well, recursively.

The Python Standard Library has a module, copy, that implements both kinds of copy. It offers two functions: copy() for shallow copies and deepcopy() for deep copies.

import copy
 
class Bus:
    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = list(passengers)
   
    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)

bus1 = Bus(['Alice', 'Bill', 'Claire', 'David'])
bus2 = copy.copy(bus1)
bus3 = copy.deepcopy(bus1)

print(id(bus1), id(bus2), id(bus3))
# 4301498296 4301499416 4301499752

bus1.drop('Bill')

print(bus2.passengers)
# ['Alice', 'Claire', 'David']

print(id(bus1.passengers), id(bus2.passengers), id(bus3.passengers))
# 4302658568 4302658568 4302657800

print(bus3.passengers)
# ['Alice', 'Bill', 'Claire', 'David']

Sometimes a deep copy goes too deep. Objects may have cyclic references, which deepcopy() handles by remembering the objects it has already copied, or they may refer to external resources or singletons that should not be copied.

The solution is to implement our own __copy__ and __deepcopy__ special methods to control the copying behavior.
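
As a rough sketch of what such special methods could look like, the example below deep copies an object's data while continuing to share a resource that should not be duplicated. The Connection and Report classes are hypothetical; only copy.copy(), copy.deepcopy() and the __copy__ / __deepcopy__ protocol come from the standard library.

```python
import copy


class Connection:
    """Hypothetical stand-in for an external resource (socket, file, DB handle)."""


class Report:
    """Hypothetical class that owns its rows but shares one connection."""

    def __init__(self, rows, connection):
        self.rows = rows
        self.connection = connection

    def __copy__(self):
        # Shallow copy: new Report, same rows list, same connection.
        return type(self)(self.rows, self.connection)

    def __deepcopy__(self, memo):
        cls = type(self)
        new = cls.__new__(cls)
        # Register the copy first so cyclic references back to self resolve to it.
        memo[id(self)] = new
        new.rows = copy.deepcopy(self.rows, memo)  # duplicate the data
        new.connection = self.connection           # keep sharing the resource
        return new


conn = Connection()
report = Report([[1, 2], [3, 4]], conn)
shallow = copy.copy(report)
deep = copy.deepcopy(report)

print(shallow.rows is report.rows)     # True
print(deep.rows is report.rows)        # False
print(deep.rows[0] is report.rows[0])  # False
print(deep.connection is conn)         # True
```

Passing memo through to the nested deepcopy() call is what lets the standard machinery keep track of objects that were already copied.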

Garbage Collection

Objects in Python are never explicitly destroyed, as they can be in C++ with delete, for example; they become eligible for garbage collection when they are unreachable.

We do have the del statement, but it deletes references, never the actual objects.

a = [1, 2]
b = a
del a
print(b)
# [1, 2]
b = [3]
# Rebinding b removed the last reference to [1, 2]; it can now be garbage collected

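Python’s garbage collector will discard an object from memory once it has no references. In CPython the main mechanism is reference counting: an object is discarded as soon as the number of references to it drops to zero, and a supplementary cyclic garbage collector reclaims groups of objects that only reference each other.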
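
One way to observe this is to register a callback with weakref.finalize, which runs when the object is garbage collected. The sketch below assumes CPython, where the callback fires as soon as the last reference goes away; the gc.collect() call is only there as a nudge for other implementations. The Payload class and the events list are hypothetical.

```python
import gc
import weakref


class Payload:
    """Hypothetical class; plain lists and dicts cannot be weakly referenced."""


events = []

a = Payload()
b = a
# The callback holds no strong reference to the object itself.
weakref.finalize(a, events.append, "collected")

del a
print(events)  # []  (b still refers to the object)

b = None       # the last reference is gone
gc.collect()   # not needed on CPython
print(events)  # ['collected']
```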

Conclusion

Python tries, in its own way, to keep our code readable and efficient with built-in garbage collection, so we don't have to worry too much about objects staying in memory longer than they are needed. However, there are always more ways to optimize the performance of an app.

Here are a few resources that could help you better understand the inner workings of Python, so you can implement a fitting fix for your service.

Further Reading