Master Pandas Core Usage in 30 Minutes: A Beginner's Guide

Hello everyone! Today we’re going to talk about a topic that many data analysis beginners find daunting: Pandas! Don’t be scared by the name. It’s not a real panda 🐼, but one of the most powerful data processing tools in Python.

Do you think Pandas is hard to learn? No worries! Follow my pace, and in 30 minutes you’ll go from “knowing nothing” to “knowing enough to get things done”! Let’s start this data analysis journey! ✨

1. Getting to Know Pandas: The Swiss Army Knife of Data Analysis 🔪

First, we need to import this powerful tool:

import pandas as pd
import numpy as np

# Create a simple DataFrame
df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Age': [18, 22, 20, 19],
    'Score': [85, 92, 78, 95]
})

print(df)

Look! This is a basic data table (DataFrame). Doesn’t it look a lot like an Excel spreadsheet? That’s right, Pandas was designed to allow you to handle data just like you would in Excel!
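
Before doing anything else with a new table, it can help to check its shape and column types. Here is a small sketch using the same sample data; the attributes shown (shape, columns, dtypes, index) are standard DataFrame attributes:

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Age': [18, 22, 20, 19],
    'Score': [85, 92, 78, 95]
})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # Column labels
print(df.dtypes)         # Data type of each column
print(df.index)          # Row labels (a RangeIndex by default)
```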

2. Basic Data Operations: CRUD 🔍

Viewing Data

# View the first few rows
print(df.head())  # Default shows the first 5 rows

# View basic information
df.info()  # Prints data types and non-null counts itself; it returns None, so no print() is needed

# View statistical summary
print(df.describe())  # Shows statistical information for numeric columns

Tip: These are the most commonly used “view data” methods in daily data analysis, so I suggest memorizing them! 😉
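
A few variations on these viewing methods, as a quick sketch on the same sample data: head() and tail() accept a row count, info() prints its report directly and returns None, and the result of describe() is itself a DataFrame you can index into.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Age': [18, 22, 20, 19],
    'Score': [85, 92, 78, 95]
})

print(df.head(2))  # First 2 rows
print(df.tail(2))  # Last 2 rows

# info() writes its report to the screen and returns None
result = df.info()
print(result)

# describe() returns a DataFrame, so single statistics can be looked up
mean_score = df.describe().loc['mean', 'Score']
print(mean_score)
```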

Selecting Data

# Select a single column
print(df['Age'])

# Select multiple columns
print(df[['Name', 'Score']])

# Conditional filtering
print(df[df['Score'] > 80])  # Filter students with scores greater than 80
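
Real filters often combine several conditions. The sketch below, on the same sample data, shows the usual pattern: wrap each condition in parentheses and join them with & (and) or | (or). It also shows loc and iloc, which select by label/condition and by position respectively.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Age': [18, 22, 20, 19],
    'Score': [85, 92, 78, 95]
})

# Both conditions must hold: score above 80 AND younger than 20
both = df[(df['Score'] > 80) & (df['Age'] < 20)]
print(both)

# Either condition may hold: score above 90 OR age at least 22
either = df[(df['Score'] > 90) | (df['Age'] >= 22)]
print(either)

# loc: filter rows and pick a column in one step
names = df.loc[df['Score'] > 80, 'Name']
print(names)

# iloc: select by position (here, the first two rows)
first_two = df.iloc[:2]
print(first_two)
```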

3. Data Processing: Making Data Obey 🎯

Adding New Columns

# Add a pass status column
df['Passed'] = df['Score'] >= 60
print(df)

# Add a rating column
df['Rating'] = df['Score'].apply(lambda x: 'A' if x >= 90 else 'B' if x >= 80 else 'C')
print(df)
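
Nested lambdas get hard to read once there are more than two or three levels. One alternative (a sketch, not the only way) is NumPy's np.select, which checks a list of conditions in order and takes the first match:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Score': [85, 92, 78, 95]
})

# Conditions are checked in order; the first one that is True wins
conditions = [df['Score'] >= 90, df['Score'] >= 80]
choices = ['A', 'B']

# Rows matching no condition get the default value
df['Rating'] = np.select(conditions, choices, default='C')
print(df)
```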

Data Statistics

# Calculate average score
print(f"Average Score: {df['Score'].mean():.2f}")

# Group by rating and calculate statistics
print(df.groupby('Rating')['Score'].agg(['count', 'mean']))
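
If you want the result columns to carry clearer names, groupby also supports "named aggregation". A small sketch follows; the output column names (students, avg_score, top_score) are made up for this example:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Score': [85, 92, 78, 95],
    'Rating': ['B', 'A', 'C', 'A']
})

# Each keyword is an output column: (source column, aggregation function)
summary = df.groupby('Rating').agg(
    students=('Name', 'count'),
    avg_score=('Score', 'mean'),
    top_score=('Score', 'max'),
)
print(summary)
```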

4. Data Cleaning: Handling Dirty Data 🧹

# Handle missing values
df['Score'] = df['Score'].fillna(df['Score'].mean())  # Fill missing values with the average

# Remove duplicate rows
df = df.drop_duplicates()

# Reset index
df = df.reset_index(drop=True)
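
The sample table has no missing values or duplicates, so these steps change nothing on it. To see them at work, here is a sketch on a small, deliberately dirty table (the data is invented for illustration). Note that this sketch drops duplicates before filling, so a repeated row does not count twice in the average:

```python
import numpy as np
import pandas as pd

# Invented dirty data: one duplicated row and one missing score
dirty = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hong', 'Xiao Hua'],
    'Score': [85.0, 92.0, 92.0, np.nan],
})

# Inspect the problems first
print(dirty.isna().sum())        # Missing values per column
print(dirty.duplicated().sum())  # Number of fully duplicated rows

# Remove duplicates, then renumber the rows from 0
clean = dirty.drop_duplicates().reset_index(drop=True)

# Fill the remaining missing score with the column average
clean['Score'] = clean['Score'].fillna(clean['Score'].mean())
print(clean)
```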

5. Practical Tips: Efficiency Hacks 💡

Data Sorting

# Sort by score in descending order
df_sorted = df.sort_values('Score', ascending=False)
print(df_sorted)
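
Two more sorting patterns that come up often, sketched below: sorting by several columns at once, and grabbing the top N rows with nlargest. The Class column is a hypothetical addition, included only so the multi-column sort has something to group on.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Class': ['A', 'B', 'A', 'B'],  # Hypothetical column for this example
    'Score': [85, 92, 78, 95]
})

# Sort by class (ascending), then by score within each class (descending)
by_class = df.sort_values(['Class', 'Score'], ascending=[True, False])
print(by_class)

# Top 2 students by score
top2 = df.nlargest(2, 'Score')
print(top2)
```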

Data Merging

# Create another DataFrame
df2 = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong'],
    'Sports Score': [92, 88]
})

# Merge data
df_merged = pd.merge(df, df2, on='Name', how='left')
print(df_merged)
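
With how='left', every row of the left table is kept, and students without a match get NaN in the new column. The sketch below compares this with how='inner' and uses the indicator parameter of pd.merge to show where each row came from:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong', 'Xiao Hua', 'Xiao Li'],
    'Score': [85, 92, 78, 95]
})
df2 = pd.DataFrame({
    'Name': ['Xiao Ming', 'Xiao Hong'],
    'Sports Score': [92, 88]
})

# Left join: keep all 4 students; indicator=True adds a '_merge' column
left = pd.merge(df, df2, on='Name', how='left', indicator=True)
print(left)

# Inner join: keep only students present in both tables
inner = pd.merge(df, df2, on='Name', how='inner')
print(inner)
```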

Mini Project: Score Analysis System 🎓

Let’s do a mini project using what we’ve learned:

def analyze_scores(df):
    # Basic statistics
    print(f"Class Average: {df['Score'].mean():.2f}")
    print(f"Highest Score: {df['Score'].max()}")
    print(f"Lowest Score: {df['Score'].min()}")
    
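    # Score distribution
    # right=False makes each bin include its lower edge, so a score of 60
    # counts as 'Pass' (matching the >= 60 pass rule); 101 keeps 100 in range
    bins = [0, 60, 70, 80, 90, 101]
    labels = ['Fail', 'Pass', 'Good', 'Excellent', 'Outstanding']
    df['Score Level'] = pd.cut(df['Score'], bins=bins, labels=labels, right=False)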
    
    # Count of each level
    grade_counts = df['Score Level'].value_counts()
    print("\nScore Distribution:")
    print(grade_counts)
    
    return df

# Run analysis
df = analyze_scores(df)
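
When binning scores, it is worth checking what happens exactly on the bin edges. By default pd.cut treats each bin as closed on the right, so 60 lands in (0, 60]; passing right=False closes bins on the left instead. A small sketch with a few boundary scores chosen for illustration:

```python
import pandas as pd

scores = pd.Series([59, 60, 90, 100])  # Boundary values picked for this demo
labels = ['Fail', 'Pass', 'Good', 'Excellent', 'Outstanding']

# Default (right=True): bins are (0, 60], (60, 70], ..., (90, 100]
right_closed = pd.cut(scores, bins=[0, 60, 70, 80, 90, 100], labels=labels)
print(list(right_closed))

# right=False: bins are [0, 60), [60, 70), ..., [90, 101)
left_closed = pd.cut(scores, bins=[0, 60, 70, 80, 90, 101],
                     labels=labels, right=False)
print(list(left_closed))
```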

Summary & Suggestions 🌟

These are the most commonly used operations in Pandas! Mastering them will be enough for most daily data analysis.
Practice often, especially data filtering and grouped statistics, since you will use them the most.
You can start with small projects, like analyzing your own spending records or study scores.

Data analysis isn’t hard; what’s hard is not starting! Open Python now and type out the code from this article! 💪

Bonus Tips 🎁

Make good use of the Tab completion feature; after typing df., press the Tab key to see all available methods
Remember that df.head(), df.info(), and df.describe() are the three most commonly used methods to view data
Use df.groupby() frequently; this is one of the core operations in data analysis

Alright, that’s today’s Pandas introductory tutorial! If you found it helpful, remember to like and save it! If you have any questions, feel free to discuss in the comments, and let’s improve together! 📚
