Hello everyone, I am Ziye. Today I want to share a super practical topic: how to achieve office automation using Python. In our daily work, we often need to handle a large number of Excel spreadsheets, Word documents, or repetitive file operations. Using Python to handle these tasks can greatly improve work efficiency and reduce human errors. Let's learn how to free our hands with Python!
## 1. Excel File Processing
Excel processing can be said to be the most common requirement in office automation, so let's start here:
```python
# Install necessary libraries
# pip install openpyxl pandas
import pandas as pd
# Read Excel file
def read_excel_example():
# Read sales data
df = pd.read_excel('sales.xlsx')
# Data statistics
total_sales = df['Amount'].sum()
monthly_sales = df.groupby('Month')['Amount'].sum()
# Create a new Excel file to save the statistics
with pd.ExcelWriter('sales_summary.xlsx') as writer:
monthly_sales.to_excel(writer, sheet_name='Monthly Statistics')
print(f"Total Sales: {total_sales}")
print("\nMonthly Sales Situation:")
print(monthly_sales)
Tip: Pandas can not only handle Excel but also CSV, databases, and various data sources, making it the Swiss Army knife of data processing!
2. Word Document Processing
Next, let’s see how to process Word documents:
# pip install python-docx
from docx import Document
from docx.shared import Pt
def create_report():
# Create a new document
doc = Document()
# Add title
doc.add_heading('Monthly Work Report', 0)
# Add paragraph
p = doc.add_paragraph('Summary of this month's work:')
p.add_run('\n1. Completed project planning').bold = True
p.add_run('\n2. Held team meeting')
p.add_run('\n3. Completed code review')
# Save document
doc.save('monthly_report.docx')
# Batch modify Word documents
def batch_modify_docs(directory):
import os
for filename in os.listdir(directory):
if filename.endswith('.docx'):
doc = Document(os.path.join(directory, filename))
# Modify document content
for paragraph in doc.paragraphs:
if 'Old Company Name' in paragraph.text:
paragraph.text = paragraph.text.replace('Old Company Name', 'New Company Name')
doc.save(os.path.join(directory, 'new_' + filename))
3. PDF File Processing
PDF processing is also a common office requirement:
# pip install PyPDF2
from PyPDF2 import PdfReader, PdfWriter
def pdf_operations():
# Merge PDFs
def merge_pdfs(pdf_list, output_filename):
merger = PdfWriter()
for pdf in pdf_list:
merger.append(pdf)
merger.write(output_filename)
merger.close()
# Extract text from PDF
def extract_text(pdf_file):
reader = PdfReader(pdf_file)
text = ""
for page in reader.pages:
text += page.extract_text()
return text
# Example usage
pdf_files = ['doc1.pdf', 'doc2.pdf']
merge_pdfs(pdf_files, 'merged.pdf')
4. Email Automation
Automatically sending emails is a great helper for increasing efficiency:
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
def send_automated_email():
# Email settings
sender = "[email protected]"
password = "your_password"
receiver = "[email protected]"
# Create email
msg = MIMEMultipart()
msg['From'] = sender
msg['To'] = receiver
msg['Subject'] = "Python Automated Email"
# Email body
body = "This is an automatically sent email by Python\nAttached is the sales report of this month"
msg.attach(MIMEText(body, 'plain'))
# Add attachment
with open('sales_summary.xlsx', 'rb') as f:
attachment = MIMEText(f.read(), 'base64', 'utf-8')
attachment["Content-Type"] = 'application/octet-stream'
attachment["Content-Disposition"] = 'attachment; filename="sales_summary.xlsx"'
msg.attach(attachment)
# Send email
with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
server.login(sender, password)
server.send_message(msg)
5. Batch File Processing
Batch processing files is also a common automation requirement:
import os
import shutil
def file_operations():
# Batch rename
def batch_rename(directory, old_ext, new_ext):
for filename in os.listdir(directory):
if filename.endswith(old_ext):
new_name = filename.replace(old_ext, new_ext)
os.rename(
os.path.join(directory, filename),
os.path.join(directory, new_name)
)
# Organize files by extension
def organize_files(directory):
for filename in os.listdir(directory):
ext = filename.split('.')[-1]
new_dir = os.path.join(directory, ext)
if not os.path.exists(new_dir):
os.makedirs(new_dir)
shutil.move(
os.path.join(directory, filename),
os.path.join(new_dir, filename)
)
Key Learning Points Summary
-
Excel Processing: Use pandas for data analysis and processing -
Word Processing: Use python-docx to create and modify documents -
PDF Operations: Use PyPDF2 for PDF file processing -
Email Automation: Use smtplib to send automated emails -
Batch File Processing: Use os and shutil for file operations
Practice Questions
-
Create a program that automatically counts data from all Excel files in a folder and generates a report -
Write a program that automatically generates a weekly work report in Word -
Implement a tool that automatically merges PDF files and extracts text -
Develop a program that automatically sends birthday greeting emails
Precautions
-
Be mindful of memory usage when processing large files -
Be cautious with email security settings when automating email sending -
Always back up before batch processing files -
Read documentation carefully before using third-party libraries
Python office automation can really save us a lot of time. I recommend starting with small tasks and gradually expanding to more complex automation scenarios. If you encounter problems, feel free to discuss in the comments!
Next time: We will learn about Python web scraping technology and see how to automatically obtain web data. Remember to follow me, and see you next time!