Python Automation Series: Batch Filter and Consolidate ‘Wallet’ Sales Records from Excel Files

Python Automation Series: Batch Filter and Consolidate 'Wallet' Sales Records from Excel FilesProject IntroductionProject Name: Batch Filter and Consolidate ‘Wallet’ Sales Records from Excel FilesLibraries Used:os, pandasProject Overview: This example is used to filter rows where the “Product Name” column is “Wallet” from all worksheets of Excel files in a specified folder and its subfolders, and consolidate this data into a new Excel file. Additionally, each qualifying data row will have two extra columns added to record its source file and worksheet name.Python Automation Series: Batch Filter and Consolidate 'Wallet' Sales Records from Excel FilesPython Automation Series: Batch Filter and Consolidate 'Wallet' Sales Records from Excel Files

Implementation Process

1. Import Necessary Libraries

import os
import pandas as pd

2. Function Definition: extract_wallet_rows

    • Read all worksheets of the given Excel file. For each worksheet, if the “Sales Date” column exists, remove the time information; then filter the rows where the “Product Name” column is “Wallet” and add the source file name and worksheet name as new columns for these rows. Finally, add the qualifying data to the<span><span>output_data</span></span> list.
def extract_wallet_rows(excel_file, output_data):    """Extract rows where 'Product Name' is 'Wallet' from the given Excel file, removing the time from the sales date."""    xls = pd.ExcelFile(excel_file)    for sheet_name in xls.sheet_names:        df = pd.read_excel(xls, sheet_name=sheet_name)
        # If 'Sales Date' column exists, remove time        if '销售日期' in df.columns:            df['销售日期'] = pd.to_datetime(df['销售日期']).dt.date
        # Filter rows where 'Product Name' is 'Wallet'        wallet_df = df[df['产品名称'] == '钱包']        if not wallet_df.empty:            # Add a column to record source file and worksheet            wallet_df['来源文件'] = os.path.basename(excel_file)            wallet_df['工作表'] = sheet_name            output_data.append(wallet_df)

3. Define Function: batch_extract_wallet_rows

    • Iterate through all Excel files in the specified folder and its subfolders, calling the<span><span>extract_wallet_rows</span></span> function to process each found Excel file. All qualifying data will be merged into a single DataFrame and saved to a new Excel file.
def batch_extract_wallet_rows(input_folder, output_file):    """Batch process all Excel files in the folder."""    all_wallet_data = []
    # Iterate through all Excel files in the folder    for root, dirs, files in os.walk(input_folder):        for file in files:            if file.endswith('.xlsx') or file.endswith('.xls'):                excel_file_path = os.path.join(root, file)                extract_wallet_rows(excel_file_path, all_wallet_data)
    if all_wallet_data:        # Combine all DataFrames        combined_df = pd.concat(all_wallet_data, ignore_index=True)        # Save to a new Excel file        combined_df.to_excel(output_file, index=False, engine='openpyxl')        print(f"Extracted data saved to {output_file}")    else:        print("No matching rows found.")

4. Main Program Logic

if __name__ == "__main__":    input_folder = "../../销售统计"  # Path to the folder containing Excel files    output_file = "./combined_wallets.xlsx"  # Output file path
    # Create output folder (if it doesn't exist)    os.makedirs(os.path.dirname(output_file), exist_ok=True)
    batch_extract_wallet_rows(input_folder, output_file)

5. Conclusion

That concludes this sharing session. If you have any questions or want to share your automation office experiences, our comment section is always open for you.Your every like and share is the greatest support and encouragement for us!

Want to learn more and get the complete code from this article? It’s simple, follow our official account and reply with the article title to get it immediately.

Thank you again for reading, and I look forward to seeing you in the next sharing session!

Python Automation Series: Batch Filter and Consolidate 'Wallet' Sales Records from Excel FilesFeatured ReadingPython Automation Series: Batch Filter and Consolidate 'Wallet' Sales Records from Excel FilesPython Automation Series: One-Click Replace Document KeywordsPython Automation Series: Rotate Specific Pages of PDF FilesPython Automation Series: Filter Content of Text (txt)Python Automation Series: Scrape Selected Financial NewsPython Automation Series: One-Click Rename Files, Easily Track Modification HistoryPython Automation Series: One-Click Remove Password from PDF Documents

Python Automation Series: Convert PDF Pages to Images

Python Automation Series: Rearrange the Order of PDF Pages

Python Automation Series: Crop Specific Areas of Specific Pages in PDF

Python Automation Series: Automatically Fill Student Information from Excel into a Word Document

Python Automation Series: Easily Merge Specific Pages of PDF Files

Python Automation Series: Automatically Fill Multiple Student Information from Excel into a Word Document

Python Automation Series: How to Easily Add a Uniform Watermark to PDF Files

Python Automation Series: Batch Image Watermark Adding Tool

Python Automation Series: One-Click Extract PPT Text to Word

Python Automation Series: Quickly Generate GIF Animations with Python Scripts

Python Automation Series: Easily Obtain Douban Top 250 Movie Data

Python Automation Series: Automatically Delete Duplicate Images and Keep the Latest Version

Python Automation Series: One-Click Batch Convert Excel Multi-Page Data to Word Documents

Python Automation Series: How to Use Python to Add Password Protection to PDF Files

Leave a Comment