Advanced Python: Understanding Deep Copy and Shallow Copy

In <span>Python</span>, an assignment statement (<span>obj_b = obj_a</span>) does not create a copy. It only binds a new name to the same object. So when you need a real copy of a mutable object (such as a list or dictionary) that you can modify without affecting the original, you have to create it explicitly and choose the right kind of copy.

Before learning about shallow copy and deep copy, we need to understand which data types are mutable and which are immutable.

1. Data Types

In <span>Python</span>, data types can be divided into mutable types (<span>mutable</span>) and immutable types (<span>immutable</span>). Understanding the differences and characteristics of these two types is crucial for writing efficient and maintainable code.

  • Immutable types: Once an object is created, its value cannot be changed. Any operation that appears to modify it creates a new object instead.

    • Integer (int)
    • Float (float)
    • String (str)
    • Tuple (tuple)

    # Integer (int)
    x = 10
    y = x
    x = 20 # Reassignment rebinds x; y still refers to 10
    print(y)  # Output: 10
    
    # Float (float)
    x = 3.14
    y = x
    x = 2.7 # Reassignment rebinds x; y still refers to 3.14
    print(y)  # Output: 3.14
    
    # String (str)
    s = "hello"
    s = s + " world" # Builds a new string and rebinds s to it
    print(s)  # Output: hello world
    
    # Tuple (tuple)
    t = (1, 2, 3)
    try:
        t[0] = 10 # Item assignment is not allowed on a tuple
    except TypeError as e:
        print(e)  # Output: 'tuple' object does not support item assignment

  • Mutable types: The object itself can be modified in place, and no new object is created.

    • List (list)
    • Dictionary (dict)
    • Set (set)
    • Bytearray (bytearray)

# List (list)
l1 = [1, 2, 3, 4]
l2 = l1
l1[0] = 10
print(l1, l2) # Output: [10, 2, 3, 4] [10, 2, 3, 4]

# Dictionary (dict)
d = {"a": 1, "b": 2}
c = d
d["a"] = 10 # Modify the value in the dictionary
d["c"] = 3 # Add a new key-value pair
print(c, d)  # Output: {'a': 10, 'b': 2, 'c': 3} {'a': 10, 'b': 2, 'c': 3}

# Set (set)
s = {1, 2, 3}
s2 = s
s.add(4)  # Add an element
s.remove(2)  # Remove an element
print(s, s2)  # Output: {1, 3, 4} {1, 3, 4}

# Bytearray (bytearray)
b = bytearray(b"hello")
b1 = b
b[0] = 72 # Modify the content of the bytearray
print(b, b1)  # Output: bytearray(b'Hello') bytearray(b'Hello')
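
The difference between the two groups can also be seen through object identity. The following is a small sketch (the variable names are made up for illustration): `+=` on a list extends the existing object, while `+=` on a tuple builds a new tuple and rebinds the name.

```python
# In-place modification keeps the same object; "modifying" an immutable
# value actually builds a new object and rebinds the name.
nums = [1, 2, 3]
list_id_before = id(nums)
nums += [4]                 # list += extends the existing list in place
list_same_object = id(nums) == list_id_before

point = (1, 2, 3)
old_point = point           # second name for the original tuple
point += (4,)               # tuples cannot change in place: a new tuple is built
tuple_same_object = point is old_point

print(list_same_object)   # True
print(tuple_same_object)  # False
print(old_point)          # (1, 2, 3)
```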

2. Shallow Copy and Deep Copy

For a “true” copy, we can use the <span>copy</span> module. However, there are important differences between shallow copy and deep copy when it comes to composite/nested objects (like nested lists or dictionaries) and custom objects:

  • Shallow copy: Copies one level only. It creates a new collection object and fills it with references to the same nested objects as the original. As a result, mutating a nested object through the copy also changes what the original sees.

  • Deep copy: A complete independent clone. It creates a new collection object and recursively fills it with copies of the nested objects found in the original object.

Assignment Operation

Assignment only binds a new name to the same object. For mutable types, a change made through one name is visible through the other.

l1 = [1, 2, 3, 4]
l2 = l1
l1[0] = 10
print(l1, l2) # Output: [10, 2, 3, 4] [10, 2, 3, 4]
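
It can help to separate mutating the shared object from rebinding one of the names. A minimal sketch, using illustrative names `a` and `b` that are not part of the example above:

```python
a = [1, 2, 3, 4]
b = a                  # b is another name for the same list
same_before = b is a

a[0] = 10              # mutation: visible through both names
a = [0, 0]             # rebinding: a now names a different list
same_after = b is a

print(same_before, same_after)  # True False
print(b)                        # [10, 2, 3, 4]
```

Only the mutation reaches `b`; rebinding `a` simply points that name at another object.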

Shallow Copy

One level deep. The copy is a new outer object, so changes at level 1 (replacing, adding, or removing elements) do not affect the original.

Use <span>copy.copy()</span> or object-specific copy functions/copy constructors.

import copy

l1 = [1, 2, 3, 4]
l2 = copy.copy(l1)
l1[0] = 10
print(l1, l2) # Output: [10, 2, 3, 4] [1, 2, 3, 4]
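
As a rough illustration of the object-specific alternatives to `copy.copy()`, the built-in types offer several equivalent ways to make a shallow copy. The names `original` and `settings` below are placeholders chosen for this sketch:

```python
import copy

original = [1, 2, 3, 4]
copies = [
    copy.copy(original),   # generic shallow copy
    original.copy(),       # list method
    list(original),        # copy constructor
    original[:],           # full slice
]
all_equal = all(c == original for c in copies)
all_distinct = all(c is not original for c in copies)

settings = {"a": 1, "b": 2}
settings_copy = dict(settings)   # settings.copy() behaves the same way
settings_copy["a"] = 10

print(all_equal, all_distinct)   # True True
print(settings)                  # {'a': 1, 'b': 2}
```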

However, for nested objects, modifications at level 2 or deeper still affect the original, because both outer lists refer to the same inner lists.

import copy

l1 = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
l2 = copy.copy(l1)

# Affects l2 as well: the inner lists are shared
l1[0][0] = -10
print(l1, l2, sep="\n")

Modifying level 2 of l1 also changes level 2 of l2, so both lists print the same contents:

[[-10, 2, 3, 4, 5], [6, 7, 8, 9, 10]] 
[[-10, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
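
The sharing can be checked directly with the `is` operator. In this sketch (with illustrative names `outer` and `shallow`), replacing an element at level 1 touches only the copy, while mutating an inner list shows up in both:

```python
import copy

outer = [[1, 2], [3, 4]]
shallow = copy.copy(outer)

outer_is_new = shallow is not outer        # the outer list is a new object
inner_is_shared = shallow[0] is outer[0]   # the inner lists are the same objects

shallow[1] = [30, 40]    # level 1: replaces a reference in the copy only
shallow[0].append(99)    # level 2: mutates the shared inner list

print(outer_is_new, inner_is_shared)  # True True
print(outer)    # [[1, 2, 99], [3, 4]]
print(shallow)  # [[1, 2, 99], [30, 40]]
```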

Deep Copy

A completely independent clone. Use <span>copy.deepcopy()</span>.

import copy

l1 = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
l2 = copy.deepcopy(l1)

# Will not affect other lists!
l1[0][0] = -10
print(l1, l2, id(l1), id(l2), sep="\n")

The copy is not affected at all, and the two lists are distinct objects with different ids (the exact values vary from run to run):

[[-10, 2, 3, 4, 5], [6, 7, 8, 9, 10]] 
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
1890075992128 
1890075985408
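
The same distinction applies to custom objects. The sketch below uses a hypothetical `Team` class, invented purely for illustration: `copy.copy()` creates a new instance whose attributes still refer to the original's objects, while `copy.deepcopy()` copies the attribute values as well.

```python
import copy

class Team:
    """Hypothetical example class, not taken from the article."""
    def __init__(self, name, members):
        self.name = name
        self.members = members   # a mutable list attribute

team = Team("core", ["ann", "bob"])
shallow = copy.copy(team)        # new Team, same members list
deep = copy.deepcopy(team)       # new Team, new members list

team.members.append("cy")

print(shallow.members)  # ['ann', 'bob', 'cy']
print(deep.members)     # ['ann', 'bob']
print(shallow.members is team.members, deep.members is team.members)  # True False
```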

3. Application Scenarios

In <span>Python</span>, the choice between deep copy (<span>deepcopy</span>) and shallow copy (<span>copy</span>) depends on whether a completely independent copy is needed.

  • Modifying data structures: a shallow copy is enough for flat (single-level) structures; use a deep copy for nested structures. <span>copy.deepcopy()</span> also handles circular references.
  • Function parameter passing: a function receives a reference to the caller's object, so modifying a mutable argument in place changes the caller's data. Working on a deep copy inside the function leaves the passed argument untouched.
  • Data sharing needs: a shallow copy shares sub-objects between the copy and the original; a deep copy provides complete isolation.
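
Two of these scenarios can be sketched briefly. The helper `normalize` and the sample data are hypothetical and serve only to illustrate defensive copying inside a function and deep-copying a structure that refers to itself:

```python
import copy

def normalize(rows):
    """Hypothetical helper: returns a sorted copy, leaves the argument untouched."""
    rows = copy.deepcopy(rows)   # work on a private, fully independent copy
    for row in rows:
        row.sort()
    return rows

data = [[3, 1, 2], [9, 7, 8]]
cleaned = normalize(data)
print(data)     # [[3, 1, 2], [9, 7, 8]]
print(cleaned)  # [[1, 2, 3], [7, 8, 9]]

# Circular reference: a list that contains itself.
loop = [1, 2]
loop.append(loop)
loop_copy = copy.deepcopy(loop)   # terminates; the cycle is reproduced in the copy
print(loop_copy[2] is loop_copy)  # True
print(loop_copy[2] is loop)       # False
```

Had `normalize` used a shallow copy here, `row.sort()` would have reordered the caller's inner lists as well.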
