Python Automation for Office: A Tool to Free Your Hands

Hello everyone, I am Ziye. Today I want to share a super practical topic: how to achieve office automation using Python. In our daily work, we often need to handle a large number of Excel spreadsheets, Word documents, or repetitive file operations. Using Python to handle these tasks can greatly improve work efficiency and reduce human errors. Let's learn how to free our hands with Python!

## 1. Excel File Processing

Excel processing can be said to be the most common requirement in office automation, so let's start here:

```python
# Install necessary libraries
# pip install openpyxl pandas

import pandas as pd

# Read Excel file
def read_excel_example():
    # Read sales data
    df = pd.read_excel('sales.xlsx')
    
    # Data statistics
    total_sales = df['Amount'].sum()
    monthly_sales = df.groupby('Month')['Amount'].sum()
    
    # Create a new Excel file to save the statistics
    with pd.ExcelWriter('sales_summary.xlsx') as writer:
        monthly_sales.to_excel(writer, sheet_name='Monthly Statistics')
        
    print(f"Total Sales: {total_sales}")
    print("\nMonthly Sales Situation:")
    print(monthly_sales)

Tip: Pandas can not only handle Excel but also CSV, databases, and various data sources, making it the Swiss Army knife of data processing!

2. Word Document Processing

Next, let’s see how to process Word documents:

# pip install python-docx
from docx import Document
from docx.shared import Pt

def create_report():
    # Create a new document
    doc = Document()
    
    # Add title
    doc.add_heading('Monthly Work Report', 0)
    
    # Add paragraph
    p = doc.add_paragraph('Summary of this month's work:')
    p.add_run('\n1. Completed project planning').bold = True
    p.add_run('\n2. Held team meeting')
    p.add_run('\n3. Completed code review')
    
    # Save document
    doc.save('monthly_report.docx')

# Batch modify Word documents
def batch_modify_docs(directory):
    import os
    for filename in os.listdir(directory):
        if filename.endswith('.docx'):
            doc = Document(os.path.join(directory, filename))
            # Modify document content
            for paragraph in doc.paragraphs:
                if 'Old Company Name' in paragraph.text:
                    paragraph.text = paragraph.text.replace('Old Company Name', 'New Company Name')
            doc.save(os.path.join(directory, 'new_' + filename))

3. PDF File Processing

PDF processing is also a common office requirement:

# pip install PyPDF2
from PyPDF2 import PdfReader, PdfWriter

def pdf_operations():
    # Merge PDFs
    def merge_pdfs(pdf_list, output_filename):
        merger = PdfWriter()
        for pdf in pdf_list:
            merger.append(pdf)
        merger.write(output_filename)
        merger.close()
    
    # Extract text from PDF
    def extract_text(pdf_file):
        reader = PdfReader(pdf_file)
        text = ""
        for page in reader.pages:
            text += page.extract_text()
        return text

# Example usage
pdf_files = ['doc1.pdf', 'doc2.pdf']
merge_pdfs(pdf_files, 'merged.pdf')

4. Email Automation

Automatically sending emails is a great helper for increasing efficiency:

import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

def send_automated_email():
    # Email settings
    sender = "[email protected]"
    password = "your_password"
    receiver = "[email protected]"
    
    # Create email
    msg = MIMEMultipart()
    msg['From'] = sender
    msg['To'] = receiver
    msg['Subject'] = "Python Automated Email"
    
    # Email body
    body = "This is an automatically sent email by Python\nAttached is the sales report of this month"
    msg.attach(MIMEText(body, 'plain'))
    
    # Add attachment
    with open('sales_summary.xlsx', 'rb') as f:
        attachment = MIMEText(f.read(), 'base64', 'utf-8')
        attachment["Content-Type"] = 'application/octet-stream'
        attachment["Content-Disposition"] = 'attachment; filename="sales_summary.xlsx"'
        msg.attach(attachment)
    
    # Send email
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
        server.login(sender, password)
        server.send_message(msg)

5. Batch File Processing

Batch processing files is also a common automation requirement:

import os
import shutil

def file_operations():
    # Batch rename
    def batch_rename(directory, old_ext, new_ext):
        for filename in os.listdir(directory):
            if filename.endswith(old_ext):
                new_name = filename.replace(old_ext, new_ext)
                os.rename(
                    os.path.join(directory, filename),
                    os.path.join(directory, new_name)
                )
    
    # Organize files by extension
    def organize_files(directory):
        for filename in os.listdir(directory):
            ext = filename.split('.')[-1]
            new_dir = os.path.join(directory, ext)
            if not os.path.exists(new_dir):
                os.makedirs(new_dir)
            shutil.move(
                os.path.join(directory, filename),
                os.path.join(new_dir, filename)
            )

Key Learning Points Summary

  1. Excel Processing: Use pandas for data analysis and processing
  2. Word Processing: Use python-docx to create and modify documents
  3. PDF Operations: Use PyPDF2 for PDF file processing
  4. Email Automation: Use smtplib to send automated emails
  5. Batch File Processing: Use os and shutil for file operations

Practice Questions

  1. Create a program that automatically counts data from all Excel files in a folder and generates a report
  2. Write a program that automatically generates a weekly work report in Word
  3. Implement a tool that automatically merges PDF files and extracts text
  4. Develop a program that automatically sends birthday greeting emails

Precautions

  1. Be mindful of memory usage when processing large files
  2. Be cautious with email security settings when automating email sending
  3. Always back up before batch processing files
  4. Read documentation carefully before using third-party libraries

Python office automation can really save us a lot of time. I recommend starting with small tasks and gradually expanding to more complex automation scenarios. If you encounter problems, feel free to discuss in the comments!

Next time: We will learn about Python web scraping technology and see how to automatically obtain web data. Remember to follow me, and see you next time!

Leave a Comment