Comprehensive Guide to Python Composite Data Types

Part One: Overview of Composite Data Types

1.1 What are Composite Data Types

Composite data types are data structures that group multiple data items together and provide methods for manipulating them. Unlike basic data types (integers, floats, strings, etc.), which each represent a single value, composite data types let us handle collections of data in a structured manner.

The main composite data types in Python are:

· List: An ordered, mutable sequence

· Tuple: An ordered, immutable sequence

· Dictionary: A mapping of key-value pairs

· Set: An unordered collection of unique elements

1.2 Why Composite Data Types are Needed

Imagine you need to manage student information for a class:

“`python

# Without using composite data types (cumbersome and hard to manage)
student1_name = "Zhang San"
student1_age = 18
student1_grade = 85
student2_name = "Li Si"
student2_age = 17
student2_grade = 92
# ... need to define multiple variables for each student

# Using composite data types (concise and easy to manage)
students = [
    {"name": "Zhang San", "age": 18, "grade": 85},
    {"name": "Li Si", "age": 17, "grade": 92},
    {"name": "Wang Wu", "age": 19, "grade": 78}
]

“`

The advantages of composite data types include:

· Data Organization: Grouping related data together

· Batch Operations: Performing operations on the entire dataset

· Code Conciseness: Reducing repetitive code

· Ease of Maintenance: Clear data structures that are easy to modify and extend
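
As a small illustration of the batch-operation point, the sketch below reuses the `students` list from the example above. The names `grades`, `average`, and `top_student` are introduced here for illustration only.

```python
# Same structure as the students list above
students = [
    {"name": "Zhang San", "age": 18, "grade": 85},
    {"name": "Li Si", "age": 17, "grade": 92},
    {"name": "Wang Wu", "age": 19, "grade": 78},
]

# One expression collects every grade; no per-student variables are needed
grades = [student["grade"] for student in students]
average = sum(grades) / len(grades)

# max() with a key function finds the record with the highest grade
top_student = max(students, key=lambda student: student["grade"])

print(average)              # 85.0
print(top_student["name"])  # Li Si
```

Adding a fourth student only means appending one more dictionary; the calculation code stays the same.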

Part Two: In-Depth Analysis of Lists

2.1 Basic Concepts of Lists

Lists are one of the most commonly used and flexible data structures in Python. They are ordered, mutable sequences that can contain elements of any type.

2.2 Creating Lists

“`python

# Creating an empty list
empty_list = []
empty_list2 = list()
# Creating a list with elements
fruits = ["apple", "banana", "orange"]
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True, [1, 2, 3]]  # Can contain different types of elements
# Using the list() constructor
chars = list("Python")  # ['P', 'y', 't', 'h', 'o', 'n']
numbers_range = list(range(5))  # [0, 1, 2, 3, 4]
print(fruits)        # ['apple', 'banana', 'orange']
print(numbers)       # [1, 2, 3, 4, 5]
print(mixed)         # [1, 'hello', 3.14, True, [1, 2, 3]]
print(chars)         # ['P', 'y', 't', 'h', 'o', 'n']
print(numbers_range) # [0, 1, 2, 3, 4]

“`

2.3 Accessing List Elements

“`python

# Index access (starting from 0)
fruits = ["apple", "banana", "orange", "grape", "mango"]
print(fruits[0])    # "apple" - first element
print(fruits[2])    # "orange" - third element
print(fruits[-1])   # "mango" - last element
print(fruits[-2])   # "grape" - second to last element
# Slicing (getting sublists)
print(fruits[1:3])   # ['banana', 'orange'] - from index 1 to 2 (excluding 3)
print(fruits[:3])    # ['apple', 'banana', 'orange'] - from start to index 2
print(fruits[2:])    # ['orange', 'grape', 'mango'] - from index 2 to end
print(fruits[::2])   # ['apple', 'orange', 'mango'] - every other element
print(fruits[::-1])  # ['mango', 'grape', 'orange', 'banana', 'apple'] - reverse the list
# Check if an element exists
if "apple" in fruits:
    print("Apple is in the fruit list")
if "watermelon" not in fruits:
    print("Watermelon is not in the fruit list")

“`

2.4 Modifying Lists

Lists are mutable and can be modified:

“`python

fruits = ["apple", "banana", "orange"]
# Modify a single element
fruits[1] = "blueberry"
print(fruits)  # ['apple', 'blueberry', 'orange']
# Modify a slice (can change the length of the list)
fruits[1:3] = ["banana", "grape", "mango"]
print(fruits)  # ['apple', 'banana', 'grape', 'mango']
# Adding elements
fruits.append("peach")          # Add a single element at the end
fruits.insert(1, "cherry")      # Insert an element at a specified position
fruits.extend(["watermelon", "pineapple"]) # Add multiple elements at the end
print(fruits)  # ['apple', 'cherry', 'banana', 'grape', 'mango', 'peach', 'watermelon', 'pineapple']
# Concatenating lists
more_fruits = ["kiwi", "strawberry"]
all_fruits = fruits + more_fruits
print(all_fruits)  # New list containing all fruits

“`

2.5 Deleting List Elements

“`python

fruits = ["apple", "banana", "orange", "grape", "mango"]
# Delete by index
del fruits[1]                # Delete the element at index 1
removed_fruit = fruits.pop(2) # Delete the element at index 2 and return it
last_fruit = fruits.pop()     # Delete the last element and return it
print(fruits)         # ['apple', 'orange']
print(removed_fruit)  # 'grape'
print(last_fruit)     # 'mango'
# Delete by value
fruits.remove("apple")  # Delete the first occurrence of "apple"
print(fruits)         # ['orange']
# Clear the list
fruits.clear()
print(fruits)         # []
# Delete the entire list
del fruits
# print(fruits)  # This will raise an error: NameError, fruits is not defined

“`

2.6 Common List Methods

“`python

fruits = ["apple", "banana", "orange", "banana", "grape"]
# Count occurrences
print(fruits.count("banana"))  # 2
# Find index
print(fruits.index("orange"))        # 2
print(fruits.index("banana", 2))     # 3 - start searching from index 2
# Sorting
numbers = [3, 1, 4, 1, 5, 9, 2]
numbers.sort()  # Ascending sort
print(numbers)  # [1, 1, 2, 3, 4, 5, 9]
numbers.sort(reverse=True)  # Descending sort
print(numbers)  # [9, 5, 4, 3, 2, 1, 1]
# Sorting a list of strings
fruits.sort()  # Sort in alphabetical order
print(fruits)  # ['apple', 'banana', 'banana', 'grape', 'orange']
fruits.sort(key=len)  # Sort by string length (stable: ties keep their current order)
print(fruits)  # ['apple', 'grape', 'banana', 'banana', 'orange']
# Reverse the list
fruits.reverse()
print(fruits)  # ['orange', 'banana', 'banana', 'grape', 'apple']
# Copying a list
fruits_copy = fruits.copy()
fruits_copy2 = fruits[:]  # Another way to copy

“`
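
Two details of the methods above are easy to miss: `sort()` works in place and returns `None`, and both `copy()` and `[:]` make shallow copies. The following sketch shows the difference, using the standard-library `copy.deepcopy` as one way to get a fully independent copy. The variable names are illustrative.

```python
import copy

numbers = [3, 1, 2]

# sorted() returns a new list; the original is left untouched
ordered = sorted(numbers)
print(ordered)   # [1, 2, 3]
print(numbers)   # [3, 1, 2]

# list.sort() sorts in place and returns None
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 3]

# copy() and [:] are shallow: nested lists are shared with the original
nested = [[1, 2], [3, 4]]
shallow = nested.copy()
deep = copy.deepcopy(nested)

nested[0].append(99)
print(shallow[0])  # [1, 2, 99] - the inner list is shared
print(deep[0])     # [1, 2] - fully independent copy
```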

2.7 List Comprehensions

List comprehensions provide a concise way to create lists:

“`python

# Basic syntax: [expression for item in iterable]
# Create a list of squares
squares = [x**2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
# List comprehension with condition
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
# List comprehension with multiple loops
pairs = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]
print(pairs)  # [(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
# Applying functions
words = ['hello', 'world', 'python']
upper_words = [word.upper() for word in words]
print(upper_words)  # ['HELLO', 'WORLD', 'PYTHON']
# Nested list comprehensions
matrix = [
    [1, 2, 3],
    [4, 5, 6], 
    [7, 8, 9]
]
flattened = [num for row in matrix for num in row]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

“`

2.8 Lists and Loops

“`python

# Iterating through a list
fruits = ["apple", "banana", "orange"]
# Directly iterating through elements
for fruit in fruits:
    print(f"I like to eat {fruit}")
# Iterating through indices and elements
for index, fruit in enumerate(fruits):
    print(f"Index {index}: {fruit}")
# Using while loop to iterate
i = 0
while i < len(fruits):
    print(fruits[i])
    i += 1
# Using lists in loops
numbers = [1, 2, 3, 4, 5]
# Filtering even numbers
evens = []
for num in numbers:
    if num % 2 == 0:
        evens.append(num)
print(evens)  # [2, 4]
# Transforming elements
squared = []
for num in numbers:
    squared.append(num ** 2)
print(squared)  # [1, 4, 9, 16, 25]

“`

2.9 Multi-Dimensional Lists

“`python

# Creating a two-dimensional list (matrix)
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
# Accessing elements
print(matrix[0][1])  # 2
print(matrix[2][0])  # 7
# Iterating through a two-dimensional list
print("Matrix content:")
for row in matrix:
    for element in row:
        print(element, end=' ')
    print()  # New line
# Creating a three-dimensional list
cube = [
    [
        [1, 2],
        [3, 4]
    ],
    [
        [5, 6], 
        [7, 8]
    ]
]
print(cube[0][1][0])  # 3
# Real-world application: Student grades table
students_grades = [
    ["Zhang San", 85, 92, 78],
    ["Li Si", 76, 88, 95],
    ["Wang Wu", 92, 79, 84]
]
print("\nStudent Grades Table:")
print("Name\tMath\tChinese\tEnglish")
for student in students_grades:
    for item in student:
        print(item, end='\t')
    print()

“`
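
When building a two-dimensional list programmatically rather than writing it out as above, a common pitfall is repeating the inner list with `*`. The sketch below contrasts that with a list comprehension and also shows one possible way to transpose a matrix with `zip()`. All variable names are illustrative.

```python
# Pitfall: * repeats references to the SAME inner list
shared_rows = [[0] * 3] * 2
shared_rows[0][0] = 1
print(shared_rows)  # [[1, 0, 0], [1, 0, 0]] - both rows changed

# A comprehension builds a separate inner list for each row
grid = [[0] * 3 for _ in range(2)]
grid[0][0] = 1
print(grid)         # [[1, 0, 0], [0, 0, 0]]

# Transposing a matrix with zip(): rows become columns
matrix = [[1, 2, 3], [4, 5, 6]]
transposed = [list(column) for column in zip(*matrix)]
print(transposed)   # [[1, 4], [2, 5], [3, 6]]
```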

Part Three: In-Depth Analysis of Tuples

3.1 Basic Concepts of Tuples

Tuples are similar to lists but are immutable: once a tuple is created, its elements cannot be reassigned, added, or removed. Tuples are typically used to store data that should not change.

3.2 Creating Tuples

“`python

# Creating an empty tuple
empty_tuple = ()
empty_tuple2 = tuple()
# Creating a tuple with elements
fruits = ("apple", "banana", "orange")
numbers = (1, 2, 3, 4, 5)
mixed = (1, "hello", 3.14, True)
# Tuple with a single element (note the comma)
single_element = (42,)  # This is a tuple
not_a_tuple = (42)      # This is an integer
# Using the tuple() constructor
chars = tuple("Python")  # ('P', 'y', 't', 'h', 'o', 'n')
numbers_range = tuple(range(5))  # (0, 1, 2, 3, 4)
print(fruits)        # ('apple', 'banana', 'orange')
print(single_element)  # (42,)
print(not_a_tuple)   # 42
print(chars)         # ('P', 'y', 't', 'h', 'o', 'n')

“`

3.3 Accessing Tuple Elements

“`python

# Accessing by index
fruits = ("apple", "banana", "orange", "grape", "mango")
print(fruits[0])   # "apple"
print(fruits[2])   # "orange" 
print(fruits[-1])  # "mango"
# Slicing
print(fruits[1:3])    # ('banana', 'orange')
print(fruits[:3])     # ('apple', 'banana', 'orange')
print(fruits[2:])     # ('orange', 'grape', 'mango')
print(fruits[::2])    # ('apple', 'orange', 'mango')
print(fruits[::-1])   # ('mango', 'grape', 'orange', 'banana', 'apple')
# Check if an element exists
if "apple" in fruits:
    print("Apple is in the tuple")

“`

3.4 Immutability of Tuples

“`python

fruits = ("apple", "banana", "orange")
# Attempting to modify a tuple (will raise an error)
# fruits[1] = "blueberry"  # TypeError: 'tuple' object does not support item assignment
# However, if a tuple contains mutable elements, those mutable elements can be modified
mixed_tuple = (1, [2, 3], 4)
mixed_tuple[1].append(5)  # Modify the list within the tuple
print(mixed_tuple)  # (1, [2, 3, 5], 4)
# Tuples can be reassigned
fruits = ("apple", "banana", "orange")
fruits = ("grape", "mango")  # This is legal, we created a new tuple
print(fruits)  # ('grape', 'mango')

“`
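
One practical consequence of the mutable-element case above is that a tuple is only hashable when everything inside it is hashable. The sketch below checks this with the built-in `hash()`; the flag name `hashable` is just for illustration.

```python
point = (1, 2)
print(hash(point) == hash((1, 2)))  # True - equal tuples hash equally

# A tuple that contains a list cannot be hashed
mixed_tuple = (1, [2, 3], 4)
try:
    hash(mixed_tuple)
except TypeError as error:
    hashable = False
    print(f"Cannot hash: {error}")
else:
    hashable = True

print(hashable)  # False
```

This is why a tuple such as `(1, [2, 3], 4)` cannot be used as a dictionary key or set element, while `(1, 2)` can.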

3.5 Common Tuple Operations

“`python

fruits = ("apple", "banana", "orange")
# Length
print(len(fruits))  # 3
# Count occurrences
numbers = (1, 2, 3, 2, 1, 2, 3, 4)
print(numbers.count(2))  # 3
# Find index
print(fruits.index("banana"))  # 1
# Concatenating tuples
tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
combined = tuple1 + tuple2
print(combined)  # (1, 2, 3, 4, 5, 6)
# Repeating tuples
repeated = tuple1 * 3
print(repeated)  # (1, 2, 3, 1, 2, 3, 1, 2, 3)
# Unpacking tuples
a, b, c = (1, 2, 3)
print(a, b, c)  # 1 2 3
# Star unpacking
first, *middle, last = (1, 2, 3, 4, 5)
print(first)   # 1
print(middle)  # [2, 3, 4]
print(last)    # 5

“`

3.6 Converting Between Tuples and Lists

“`python

# List to tuple
fruits_list = ["apple", "banana", "orange"]
fruits_tuple = tuple(fruits_list)
print(fruits_tuple)  # ('apple', 'banana', 'orange')
# Tuple to list
numbers_tuple = (1, 2, 3, 4)
numbers_list = list(numbers_tuple)
print(numbers_list)  # [1, 2, 3, 4]

“`

3.7 Use Cases for Tuples

“`python

# 1. Returning multiple values from a function
def get_user_info():
    name = "Alice"
    age = 30
    city = "New York"
    return name, age, city  # Actually returns a tuple
user_info = get_user_info()
print(user_info)  # ('Alice', 30, 'New York')
# Unpacking return values
name, age, city = get_user_info()
print(f"Name: {name}, Age: {age}, City: {city}")
# 2. As dictionary keys (because tuples are immutable)
locations = {
    (40.7128, -74.0060): "New York",
    (34.0522, -118.2437): "Los Angeles", 
    (51.5074, -0.1278): "London"
}
print(locations[(40.7128, -74.0060)])  # New York
# 3. Protecting data from modification
CONSTANTS = (3.14159, 2.71828, 1.41421)
# CONSTANTS[0] = 3.14  # This will raise an error, protecting the constant value
# 4. Formatting strings
person = ("Zhang San", 25, "Engineer")
message = "Name: %s, Age: %d, Occupation: %s" % person
print(message)  # Name: Zhang San, Age: 25, Occupation: Engineer

“`
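
When a tuple represents a record, as `person` does above, positional indexes can become hard to read. One option from the standard library is `collections.namedtuple`, sketched below. The `Person` type and its field names are assumptions made for this example.

```python
from collections import namedtuple

# Person is a hypothetical record type defined only for this example
Person = namedtuple("Person", ["name", "age", "occupation"])

person = Person("Zhang San", 25, "Engineer")
print(person.name)  # Zhang San
print(person[1])    # 25 - index access still works

# It unpacks like a plain tuple
name, age, occupation = person

# It is still immutable; _replace returns a new tuple
older = person._replace(age=26)
print(older.age)    # 26
print(person.age)   # 25
```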

Part Four: In-Depth Analysis of Dictionaries

4.1 Basic Concepts of Dictionaries

Dictionaries are mutable collections of key-value pairs. Each key must be unique and hashable, while values can be of any data type. Since Python 3.7, dictionaries are guaranteed to preserve insertion order; in earlier versions the order was not guaranteed.
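
To make the rules about keys and ordering concrete, here is a minimal sketch (Python 3.7+ assumed). The `inventory` dictionary and its contents are made up for illustration.

```python
inventory = {}
inventory["pear"] = 3
inventory["apple"] = 5
inventory["pear"] = 7   # Reassigning an existing key keeps its original position

print(list(inventory))  # ['pear', 'apple']
print(inventory)        # {'pear': 7, 'apple': 5}

# Keys must be hashable: a tuple works, a list does not
inventory[("fig", "dried")] = 2
try:
    inventory[["fig", "fresh"]] = 1
except TypeError:
    list_key_allowed = False
else:
    list_key_allowed = True
print(list_key_allowed)  # False
```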

4.2 Creating Dictionaries

“`python

# Creating an empty dictionary
empty_dict = {}
empty_dict2 = dict()
# Creating a dictionary with key-value pairs
person = {
    'name': 'Alice',
    'age': 30,
    'city': 'New York'
}
# Using the dict() constructor
person2 = dict(name='Bob', age=25, city='London')
person3 = dict([('name', 'Charlie'), ('age', 35), ('city', 'Paris')])
print(person)   # {'name': 'Alice', 'age': 30, 'city': 'New York'}
print(person2)  # {'name': 'Bob', 'age': 25, 'city': 'London'}
print(person3)  # {'name': 'Charlie', 'age': 35, 'city': 'Paris'}
# Creating a dictionary using fromkeys method
default_dict = dict.fromkeys(['name', 'age', 'city'], 'unknown')
print(default_dict)  # {'name': 'unknown', 'age': 'unknown', 'city': 'unknown'}

“`

4.3 Accessing Dictionary Elements

“`python

person = {'name': 'Alice', 'age': 30, 'city': 'New York'}
# Accessing values by key
print(person['name'])  # Alice
print(person['age'])   # 30
# Using get method to access (returns None or default value if key does not exist)
print(person.get('name'))      # Alice
print(person.get('country'))   # None
print(person.get('country', 'USA'))  # USA (default value)
# Checking if a key exists
if 'name' in person:
    print("Name key exists")
# Getting all keys, values, and key-value pairs
print(person.keys())    # dict_keys(['name', 'age', 'city'])
print(person.values())  # dict_values(['Alice', 30, 'New York'])  
print(person.items())   # dict_items([('name', 'Alice'), ('age', 30), ('city', 'New York')])

“`

4.4 Modifying Dictionaries

“`python

person = {'name': 'Alice', 'age': 30, 'city': 'New York'}
# Adding or modifying key-value pairs
person['age'] = 31           # Modify existing key
person['country'] = 'USA'    # Add new key
print(person)  # {'name': 'Alice', 'age': 31, 'city': 'New York', 'country': 'USA'}
# Using update method to merge dictionaries
person.update({'age': 32, 'job': 'Engineer'})
print(person)  # {'name': 'Alice', 'age': 32, 'city': 'New York', 'country': 'USA', 'job': 'Engineer'}
# Setting default values
person.setdefault('salary', 50000)  # If salary does not exist, set default value 50000
print(person['salary'])  # 50000
person.setdefault('age', 40)  # age already exists, will not modify
print(person['age'])  # 32

“`

4.5 Deleting Dictionary Elements

“`python

person = {'name': 'Alice', 'age': 30, 'city': 'New York', 'country': 'USA'}
# Deleting a specified key
del person['country']
print(person)  # {'name': 'Alice', 'age': 30, 'city': 'New York'}
# Using pop to delete and return value
age = person.pop('age')
print(age)     # 30
print(person)  # {'name': 'Alice', 'city': 'New York'}
# Using popitem to delete and return the last key-value pair (Python 3.7+ ordered)
last_item = person.popitem()
print(last_item)  # ('city', 'New York')
print(person)     # {'name': 'Alice'}
# Clearing the dictionary
person.clear()
print(person)  # {}
# Deleting the entire dictionary
del person
# print(person)  # Raises NameError: name 'person' is not defined

“`

4.6 Common Dictionary Methods

“`python

person = {'name': 'Alice', 'age': 30, 'city': 'New York'}
# Getting the length of the dictionary
print(len(person))  # 3
# Copying a dictionary
person_copy = person.copy()
print(person_copy)  # {'name': 'Alice', 'age': 30, 'city': 'New York'}
# Iterating through the dictionary
print("Iterating keys:")
for key in person:
    print(f"{key}: {person[key]}")
print("\nIterating key-value pairs:")
for key, value in person.items():
    print(f"{key}: {value}")
print("\nIterating values:")
for value in person.values():
    print(value)
# Dictionary comprehensions
squares = {x: x**2 for x in range(5)}
print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
# Reversing key-value pairs
reversed_dict = {v: k for k, v in person.items()}
print(reversed_dict)  # {'Alice': 'name', 30: 'age', 'New York': 'city'}

“`

4.7 Nested Dictionaries

“`python

# Nested dictionary
users = {
    'user1': {
        'name': 'Alice',
        'age': 30,
        'city': 'New York'
    },
    'user2': {
        'name': 'Bob', 
        'age': 25,
        'city': 'London'
    },
    'user3': {
        'name': 'Charlie',
        'age': 35, 
        'city': 'Paris'
    }
}
# Accessing nested dictionary
print(users['user1']['name'])  # Alice
print(users['user2']['age'])   # 25
# Modifying nested dictionary
users['user1']['age'] = 31
# Adding a new user
users['user4'] = {'name': 'David', 'age': 28, 'city': 'Berlin'}
# Iterating through nested dictionary
for user_id, user_info in users.items():
    print(f"User ID: {user_id}")
    for key, value in user_info.items():
        print(f"  {key}: {value}")
    print()
# Complex nested example
school = {
    'class1': {
        'teacher': 'Teacher Zhang',
        'students': {
            'S001': {'name': 'Xiao Ming', 'grade': 85},
            'S002': {'name': 'Xiao Hong', 'grade': 92}
        }
    },
    'class2': {
        'teacher': 'Teacher Li', 
        'students': {
            'S003': {'name': 'Xiao Gang', 'grade': 78},
            'S004': {'name': 'Xiao Li', 'grade': 88}
        }
    }
}
# Accessing nested data
print(school['class1']['students']['S001']['name'])  # Xiao Ming

“`
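
Nested dictionaries like the ones above are often built from flat records rather than typed out by hand. The sketch below shows two possible ways to do that, with `dict.setdefault` and with `collections.defaultdict`, plus a lookup that tolerates missing keys. The `records` data is invented for the example.

```python
from collections import defaultdict

# Flat records: (class name, student ID, grade); sample values for illustration
records = [
    ("class1", "S001", 85),
    ("class1", "S002", 92),
    ("class2", "S003", 78),
]

# setdefault creates the inner dictionary the first time a class is seen
by_class = {}
for class_name, student_id, grade in records:
    by_class.setdefault(class_name, {})[student_id] = grade
print(by_class)
# {'class1': {'S001': 85, 'S002': 92}, 'class2': {'S003': 78}}

# defaultdict(dict) does the same without the explicit setdefault call
grouped = defaultdict(dict)
for class_name, student_id, grade in records:
    grouped[class_name][student_id] = grade
print(dict(grouped) == by_class)  # True

# Safe lookup through several levels: chain get() with empty-dict defaults
missing = by_class.get("class9", {}).get("S999")
print(missing)  # None
```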

Part Five: In-Depth Analysis of Sets

5.1 Basic Concepts of Sets

Sets are collections of unordered, unique elements. Sets are mainly used for membership testing, eliminating duplicate elements, and performing mathematical set operations.

5.2 Creating Sets

“`python

# Creating an empty set
empty_set = set()
# empty_set = {}  # This creates an empty dictionary, not a set
# Creating a set with elements
fruits = {'apple', 'banana', 'orange'}
numbers = {1, 2, 3, 4, 5}
# Using the set() constructor
chars = set('python')  # {'p', 'y', 't', 'h', 'o', 'n'} (display order may vary)
numbers_range = set(range(5))  # {0, 1, 2, 3, 4}
# Creating a set from a list (removing duplicates)
numbers_list = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_numbers = set(numbers_list)
print(unique_numbers)  # {1, 2, 3, 4}
# Sets are unordered, so the display order below may differ between runs
print(fruits)      # {'apple', 'banana', 'orange'}
print(numbers)     # {1, 2, 3, 4, 5}
print(chars)       # {'p', 'y', 't', 'h', 'o', 'n'}

“`

5.3 Accessing Set Elements

“`python

fruits = {'apple', 'banana', 'orange'}
# Sets do not support indexing because they are unordered
# print(fruits[0])  # Raises an error: TypeError: 'set' object is not subscriptable
# Iterating through a set
for fruit in fruits:
    print(fruit)
# Checking if an element exists
if 'apple' in fruits:
    print("Apple is in the set")

“`

5.4 Modifying Sets

“`python

fruits = {'apple', 'banana'}
# Adding elements
fruits.add('orange')
print(fruits)  # {'apple', 'banana', 'orange'}
# Adding multiple elements
fruits.update(['grape', 'mango'])
print(fruits)  # {'apple', 'banana', 'orange', 'grape', 'mango'}
# Removing elements
fruits.remove('banana')  # Raises an error if the element does not exist
print(fruits)  # {'apple', 'orange', 'grape', 'mango'}
fruits.discard('apple')  # Does not raise an error if the element does not exist
print(fruits)  # {'orange', 'grape', 'mango'}
# Remove and return an arbitrary element (which one is unspecified, not truly random)
random_fruit = fruits.pop()
print(f"Removed fruit: {random_fruit}")
print(fruits)  # Remaining fruits
# Clearing the set
fruits.clear()
print(fruits)  # set()

“`

5.5 Set Operations

“`python

set1 = {1, 2, 3, 4, 5}
set2 = {4, 5, 6, 7, 8}
# Union
union = set1 | set2
union2 = set1.union(set2)
print(union)  # {1, 2, 3, 4, 5, 6, 7, 8}
# Intersection
intersection = set1 & set2
intersection2 = set1.intersection(set2)
print(intersection)  # {4, 5}
# Difference
difference = set1 - set2
difference2 = set1.difference(set2)
print(difference)  # {1, 2, 3}
# Symmetric difference (elements that appear in only one of the sets)
symmetric_difference = set1 ^ set2
symmetric_difference2 = set1.symmetric_difference(set2)
print(symmetric_difference)  # {1, 2, 3, 6, 7, 8}
# Subset and superset
subset = {1, 2}
print(subset.issubset(set1))  # True
print(set1.issuperset(subset))  # True
# Disjoint sets
disjoint = {1, 2}
disjoint2 = {3, 4}
print(disjoint.isdisjoint(disjoint2))  # True

“`
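
The operator and method forms shown above are not completely interchangeable: the methods accept any iterable, while the operators require both operands to be sets. The sketch below illustrates this, along with the in-place and comparison operators. The flag name `operator_accepts_list` is illustrative.

```python
set1 = {1, 2, 3}

# Methods accept any iterable
print(set1.union([3, 4]))               # {1, 2, 3, 4}
print(set1.intersection(range(2, 10)))  # {2, 3}

# Operators require both operands to be sets
try:
    set1 | [3, 4]
except TypeError:
    operator_accepts_list = False
else:
    operator_accepts_list = True
print(operator_accepts_list)  # False

# In-place variants modify the left-hand set
set1 |= {4, 5}
print(set1)  # {1, 2, 3, 4, 5}
set1 -= {1, 2}
print(set1)  # {3, 4, 5}

# Comparison operators express subset relations
print({3, 4} <= set1)    # True - subset
print({3, 4} < {3, 4})   # False - not a proper subset
```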

5.6 Common Set Methods

“`python

fruits = {'apple', 'banana', 'orange'}
# Length of the set
print(len(fruits))  # 3
# Copying a set
fruits_copy = fruits.copy()
print(fruits_copy)  # {'apple', 'banana', 'orange'}
# Set comprehensions
squares = {x**2 for x in range(5)}
print(squares)  # {0, 1, 4, 9, 16}
# Frozen set (immutable set)
frozen_fruits = frozenset(['apple', 'banana', 'orange'])
# frozen_fruits.add('grape')  # Raises AttributeError: frozenset has no add() method
print(frozen_fruits)  # frozenset({'apple', 'banana', 'orange'})

“`

5.7 Use Cases for Sets

“`python

# 1. Removing duplicates
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
unique_numbers = list(set(numbers))
print(unique_numbers)  # [1, 2, 3, 4] (set order is not guaranteed; use sorted() if order matters)
# 2. Membership testing (faster than lists)
large_set = set(range(1000000))
if 999999 in large_set:  # Fast membership test
    print("Element is in the set")
# 3. Data comparison
students_math = {'Alice', 'Bob', 'Charlie'}
students_physics = {'Bob', 'Charlie', 'David'}
# Students attending both courses
both = students_math & students_physics
print(both)  # {'Bob', 'Charlie'}
# Students attending only one course
only_one = students_math ^ students_physics
print(only_one)  # {'Alice', 'David'}
# All students attending courses
all_students = students_math | students_physics
print(all_students)  # {'Alice', 'Bob', 'Charlie', 'David'}
# Students attending only math course
only_math = students_math - students_physics
print(only_math)  # {'Alice'}

“`
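
Because sets are unordered, `list(set(...))` does not promise to keep the original order of the items. If the order matters, one common alternative relies on dictionary keys being unique and insertion-ordered (Python 3.7+). The sketch below uses made-up names.

```python
names = ["Charlie", "Alice", "Charlie", "Bob", "Alice"]

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(names))
print(unique_in_order)  # ['Charlie', 'Alice', 'Bob']

# sorted() gives a deterministic order when the original order does not matter
unique_sorted = sorted(set(names))
print(unique_sorted)    # ['Alice', 'Bob', 'Charlie']
```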

Part Six: Comprehensive Applications and Practical Examples

6.1 Text Word Frequency Statistics

“`python

def word_frequency(text):
    """Count the frequency of words in the text"""
    # Convert to lowercase and split words
    words = text.lower().split()
    
    # Count word frequency
    frequency = {}
    for word in words:
        # Remove punctuation
        word = word.strip('.,!?;:"')
        if word:
            frequency[word] = frequency.get(word, 0) + 1
    
    return frequency

def display_frequency(frequency, top_n=10):
    """Display word frequency statistics"""
    # Sort by frequency
    sorted_frequency = sorted(frequency.items(), key=lambda x: x[1], reverse=True)
    
    print("Word Frequency Statistics:")
    for i, (word, count) in enumerate(sorted_frequency[:top_n], 1):
        print(f"{i}. {word}: {count}")
# Test text
text = """
Python is an interpreted, high-level, general-purpose programming language. 
Created by Guido van Rossum and first released in 1991, Python's design philosophy 
emphasizes code readability with its notable use of significant whitespace. 
Its language constructs and object-oriented approach aim to help programmers 
write clear, logical code for small and large-scale projects.
"""
# Count word frequency
frequency = word_frequency(text)
display_frequency(frequency)
# Using collections.Counter for simplicity
from collections import Counter
words = text.lower().split()
words = [word.strip('.,!?;:"') for word in words if word.strip('.,!?;:"')]
word_counter = Counter(words)
print("\nUsing Counter:")
for word, count in word_counter.most_common(5):
    print(f"{word}: {count}")

“`
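
The `strip()` call used above only trims punctuation at the two ends of each whitespace-separated token. An alternative is to extract words with a regular expression, sketched below. The pattern is an assumption chosen for this example: it handles only ASCII letters and apostrophes, and the sample sentence is invented.

```python
import re
from collections import Counter

sample = "Python is great. Python's design is clear, isn't it?"

# Keep runs of letters and apostrophes; everything else separates words
tokens = re.findall(r"[a-z']+", sample.lower())
print(tokens)
# ['python', 'is', 'great', "python's", 'design', 'is', 'clear', "isn't", 'it']

counts = Counter(tokens)
print(counts.most_common(1))  # [('is', 2)]
```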

6.2 Student Grade Management System

“`python

class StudentManager:
    """Student Grade Manager"""
    
    def __init__(self):
        self.students = {}
    
    def add_student(self, student_id, name):
        """Add a student"""
        if student_id in self.students:
            print(f"Student ID {student_id} already exists")
        else:
            self.students[student_id] = {
                'name': name,
                'scores': {}
            }
            print(f"Successfully added student: {name}")
    
    def add_score(self, student_id, subject, score):
        """Add a score"""
        if student_id not in self.students:
            print(f"Student ID {student_id} does not exist")
            return
        
        self.students[student_id]['scores'][subject] = score
        print(f"Successfully added {subject} score for {self.students[student_id]['name']}: {score}")
    
    def get_student_info(self, student_id):
        """Get student information"""
        if student_id not in self.students:
            print(f"Student ID {student_id} does not exist")
            return None
        
        student = self.students[student_id]
        info = f"Student ID: {student_id}\nName: {student['name']}\nScores:"
        
        if student['scores']:
            for subject, score in student['scores'].items():
                info += f"\n  {subject}: {score}"
            
            # Calculate average score
            avg_score = sum(student['scores'].values()) / len(student['scores'])
            info += f"\nAverage Score: {avg_score:.2f}"
        else:
            info += "\n  No scores available"
        
        return info
    
    def get_subject_stats(self, subject):
        """Get subject statistics"""
        scores = []
        for student_id, student in self.students.items():
            if subject in student['scores']:
                scores.append(student['scores'][subject])
        
        if not scores:
            print(f"No scores available for subject {subject}")
            return
        
        stats = {
            'count': len(scores),
            'average': sum(scores) / len(scores),
            'max': max(scores),
            'min': min(scores)
        }
        
        print(f"Subject {subject} Statistics:")
        print(f"  Number of participants: {stats['count']}")
        print(f"  Average Score: {stats['average']:.2f}")
        print(f"  Highest Score: {stats['max']}")
        print(f"  Lowest Score: {stats['min']}")
    
    def list_all_students(self):
        """List all students"""
        if not self.students:
            print("No student information available")
            return
        
        print("All Student Information:")
        for student_id, student in self.students.items():
            print(f"  {student_id}: {student['name']}")
# Usage example
manager = StudentManager()
# Adding students
manager.add_student("S001", "Zhang San")
manager.add_student("S002", "Li Si")
manager.add_student("S003", "Wang Wu")
# Adding scores
manager.add_score("S001", "Math", 85)
manager.add_score("S001", "English", 92)
manager.add_score("S002", "Math", 78)
manager.add_score("S002", "English", 88)
manager.add_score("S003", "Math", 90)
# Viewing student information
print(manager.get_student_info("S001"))
print()
# Viewing subject statistics
manager.get_subject_stats("Math")
print()
# Listing all students
manager.list_all_students()

“`

6.3 Data Deduplication and Cleaning

“`python

def clean_data(data):
    """Data cleaning: whitespace stripping, null-value removal, deduplication"""
    print(f"Before cleaning: {len(data)} records")
    
    cleaned_data = []
    seen = set()
    for item in data:
        # Remove whitespace from both ends of strings
        if isinstance(item, str):
            item = item.strip()
        
        # Skip None, empty strings, and NULL markers (a numeric 0 is kept)
        if item is None or item in ("", "NULL", "null"):
            continue
        
        # Deduplicate after stripping so "   Apple   " and "Apple" count once
        if item not in seen:
            seen.add(item)
            cleaned_data.append(item)
    
    print(f"After cleaning: {len(cleaned_data)} records")
    return cleaned_data

def analyze_data(data):
    """Data analysis"""
    # Data type distribution
    type_count = {}
    for item in data:
        item_type = type(item).__name__
        type_count[item_type] = type_count.get(item_type, 0) + 1
    
    print("Data Type Distribution:")
    for data_type, count in type_count.items():
        print(f"  {data_type}: {count}")
    
    # Numeric data statistics
    numbers = [item for item in data if isinstance(item, (int, float))]
    if numbers:
        stats = {
            'count': len(numbers),
            'sum': sum(numbers),
            'average': sum(numbers) / len(numbers),
            'max': max(numbers),
            'min': min(numbers)
        }
        
        print("\nNumeric Data Statistics:")
        for stat, value in stats.items():
            print(f"  {stat}: {value}")
# Test data
test_data = [
    "Apple", "Banana", "apple", "BANANA", "Orange",
    "Apple", "", "   Apple   ", "NULL", "null", None,
    1, 2, 3, 2, 1, 4, 5, 3.14, 2.71
]
# Data cleaning
cleaned = clean_data(test_data)
print(f"\nCleaned data: {cleaned}")
# Data analysis
analyze_data(cleaned)

“`
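
The test data above contains spellings that differ only in case ("Apple" and "apple", "Banana" and "BANANA"). If those should count as duplicates, one possible approach is to compare on a normalized key while keeping the first spelling seen. This is a sketch under that assumption, with invented sample data.

```python
raw = ["Apple", "Banana", "apple", "BANANA", "  Orange "]

seen = set()
unique = []
for item in raw:
    key = item.strip().casefold()    # Comparison key: trimmed, case-insensitive
    if key not in seen:
        seen.add(key)
        unique.append(item.strip())  # Keep the first spelling encountered

print(unique)  # ['Apple', 'Banana', 'Orange']
```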

Part Seven: Debugging and Best Practices

7.1 Common Errors and Solutions

“`python

# Error 1: Iterating while modifying a list
fruits = ['apple', 'banana', 'orange', 'grape']
# Incorrect way (may skip elements or cause unexpected behavior)
# for fruit in fruits:
#     if fruit == 'banana':
#         fruits.remove(fruit)
# Correct way: create a copy or use list comprehension
fruits_copy = fruits.copy()
for fruit in fruits_copy:
    if fruit == 'banana':
        fruits.remove(fruit)
# Or use list comprehension
fruits = [fruit for fruit in fruits if fruit != 'banana']
# Error 2: Key does not exist in dictionary
person = {'name': 'Alice', 'age': 30}
# Incorrect way
# print(person['city'])  # KeyError
# Correct way
print(person.get('city', 'Unknown'))
# Error 3: Confusing set operations
set1 = {1, 2, 3}
set2 = {3, 4, 5}
# Incorrect: using wrong operator
# intersection = set1 and set2  # Evaluates to set2 ({3, 4, 5}), not the intersection
# Correct way
intersection = set1 & set2
print(intersection)  # {3}

“`
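
The same caution from Error 1 applies to dictionaries, where changing the size during iteration raises an exception instead of silently skipping items. The sketch below shows the failure and two ways around it; the `scores` data is invented for the example.

```python
# Incorrect: deleting keys while iterating raises RuntimeError
scores = {"Alice": 85, "Bob": 42, "Charlie": 58}
try:
    for name in scores:
        if scores[name] < 60:
            del scores[name]
except RuntimeError:
    raised = True
else:
    raised = False
print(raised)  # True

# Correct way 1: iterate over a snapshot of the keys
scores = {"Alice": 85, "Bob": 42, "Charlie": 58}
for name in list(scores):
    if scores[name] < 60:
        del scores[name]
print(scores)  # {'Alice': 85}

# Correct way 2: build a new dictionary with a comprehension
scores = {"Alice": 85, "Bob": 42, "Charlie": 58}
passed = {name: score for name, score in scores.items() if score >= 60}
print(passed)  # {'Alice': 85}
```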

7.2 Performance Considerations

“`python

import time
# List vs Set membership testing performance
large_list = list(range(1000000))
large_set = set(range(1000000))
# List membership testing (slower: elements are scanned one by one)
start_time = time.perf_counter()  # perf_counter() has higher resolution than time()
result = 999999 in large_list
end_time = time.perf_counter()
print(f"List membership testing time: {end_time - start_time:.6f} seconds")
# Set membership testing (faster: hash lookup)
start_time = time.perf_counter()
result = 999999 in large_set
end_time = time.perf_counter()
print(f"Set membership testing time: {end_time - start_time:.6f} seconds")
# Dictionary key testing (also fast: hash lookup)
large_dict = {i: i**2 for i in range(1000000)}
start_time = time.perf_counter()
result = 999999 in large_dict
end_time = time.perf_counter()
print(f"Dictionary key testing time: {end_time - start_time:.6f} seconds")

“`
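
Single-run timings like the ones above can be noisy. The standard-library `timeit` module repeats a statement many times, which usually gives a steadier comparison. The sketch below uses a smaller collection and 100 repetitions, both arbitrary choices for illustration; actual numbers depend on the machine.

```python
import timeit

size = 100_000
large_list = list(range(size))
large_set = set(large_list)
target = size - 1  # Worst case for the list: the last element

# Each lookup runs 100 times; timeit returns the total time in seconds
list_time = timeit.timeit(lambda: target in large_list, number=100)
set_time = timeit.timeit(lambda: target in large_set, number=100)

print(f"List: {list_time:.6f} s for 100 lookups")
print(f"Set:  {set_time:.6f} s for 100 lookups")
```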

7.3 Best Practices

1. Choose the right data structure

· For ordered, mutable elements: List

· For ordered, immutable elements: Tuple

· For key-value pairs: Dictionary

· For deduplication or set operations: Set

2. Use comprehensions to simplify code

· List comprehension: [x**2 for x in range(10)]

· Dictionary comprehension: {x: x**2 for x in range(5)}

· Set comprehension: {x**2 for x in range(5)}

3. Utilize built-in functions and methods

· len(), max(), min(), sum()

· sorted(), reversed()

· Specific methods for various data structures

4. Be mindful of mutability

· Lists, dictionaries, and sets are mutable

· Tuples, strings, and frozensets are immutable

5. Use type hints

```python

from typing import List, Dict, Tuple, Set

def process_data(numbers: List[int]) -> Dict[str, float]:
    # Function implementation
    pass

```
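
The type-hint skeleton above leaves the function body empty. One possible way to fill in a function with that signature is sketched below. The name `summarize` and the returned keys are assumptions chosen for illustration, not part of the original example.

```python
from typing import Dict, List


def summarize(numbers: List[int]) -> Dict[str, float]:
    """Return basic statistics; summarize is a hypothetical example function."""
    if not numbers:
        return {}
    return {
        "count": float(len(numbers)),
        "average": sum(numbers) / len(numbers),
        "max": float(max(numbers)),
        "min": float(min(numbers)),
    }


print(summarize([85, 92, 78]))
# {'count': 3.0, 'average': 85.0, 'max': 92.0, 'min': 78.0}
```

On Python 3.9 and later, the built-in generics `list[int]` and `dict[str, float]` can be used in place of the `typing` imports.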

Summary

Through this module, you have mastered:

1. Lists: Ordered, mutable sequences that support indexing, slicing, and various modification operations

2. Tuples: Ordered, immutable sequences suitable for protecting data and as dictionary keys

3. Dictionaries: Collections of key-value pairs that provide fast lookups and flexible data organization

4. Sets: Unordered collections of unique elements that support set operations and fast membership testing

These composite data types are fundamental to Python programming, and almost all Python programs will utilize them. Mastering their usage, characteristics, and applicable scenarios is crucial for writing efficient and clear Python code.

Next, you will learn about modularization and engineering, understanding how to organize larger Python projects, use standard libraries and third-party libraries, and handle exceptions.
