Project IntroductionProject Name: Batch Filter and Consolidate ‘Wallet’ Sales Records from Excel FilesLibraries Used:os, pandasProject Overview: This example is used to filter rows where the “Product Name” column is “Wallet” from all worksheets of Excel files in a specified folder and its subfolders, and consolidate this data into a new Excel file. Additionally, each qualifying data row will have two extra columns added to record its source file and worksheet name.
Implementation Process
1. Import Necessary Libraries
import os
import pandas as pd
2. Function Definition: extract_wallet_rows
- Read all worksheets of the given Excel file. For each worksheet, if the “Sales Date” column exists, remove the time information; then filter the rows where the “Product Name” column is “Wallet” and add the source file name and worksheet name as new columns for these rows. Finally, add the qualifying data to the
<span><span>output_data</span></span>
list.
def extract_wallet_rows(excel_file, output_data): """Extract rows where 'Product Name' is 'Wallet' from the given Excel file, removing the time from the sales date.""" xls = pd.ExcelFile(excel_file) for sheet_name in xls.sheet_names: df = pd.read_excel(xls, sheet_name=sheet_name)
# If 'Sales Date' column exists, remove time if '销售日期' in df.columns: df['销售日期'] = pd.to_datetime(df['销售日期']).dt.date
# Filter rows where 'Product Name' is 'Wallet' wallet_df = df[df['产品名称'] == '钱包'] if not wallet_df.empty: # Add a column to record source file and worksheet wallet_df['来源文件'] = os.path.basename(excel_file) wallet_df['工作表'] = sheet_name output_data.append(wallet_df)
3. Define Function: batch_extract_wallet_rows
- Iterate through all Excel files in the specified folder and its subfolders, calling the
<span><span>extract_wallet_rows</span></span>
function to process each found Excel file. All qualifying data will be merged into a single DataFrame and saved to a new Excel file.
def batch_extract_wallet_rows(input_folder, output_file): """Batch process all Excel files in the folder.""" all_wallet_data = []
# Iterate through all Excel files in the folder for root, dirs, files in os.walk(input_folder): for file in files: if file.endswith('.xlsx') or file.endswith('.xls'): excel_file_path = os.path.join(root, file) extract_wallet_rows(excel_file_path, all_wallet_data)
if all_wallet_data: # Combine all DataFrames combined_df = pd.concat(all_wallet_data, ignore_index=True) # Save to a new Excel file combined_df.to_excel(output_file, index=False, engine='openpyxl') print(f"Extracted data saved to {output_file}") else: print("No matching rows found.")
4. Main Program Logic
if __name__ == "__main__": input_folder = "../../销售统计" # Path to the folder containing Excel files output_file = "./combined_wallets.xlsx" # Output file path
# Create output folder (if it doesn't exist) os.makedirs(os.path.dirname(output_file), exist_ok=True)
batch_extract_wallet_rows(input_folder, output_file)
5. Conclusion
That concludes this sharing session. If you have any questions or want to share your automation office experiences, our comment section is always open for you.Your every like and share is the greatest support and encouragement for us!
Want to learn more and get the complete code from this article? It’s simple, follow our official account and reply with the article title to get it immediately.
Thank you again for reading, and I look forward to seeing you in the next sharing session!
Featured Reading
Python Automation Series: One-Click Replace Document KeywordsPython Automation Series: Rotate Specific Pages of PDF FilesPython Automation Series: Filter Content of Text (txt)Python Automation Series: Scrape Selected Financial NewsPython Automation Series: One-Click Rename Files, Easily Track Modification HistoryPython Automation Series: One-Click Remove Password from PDF Documents
Python Automation Series: Convert PDF Pages to Images
Python Automation Series: Rearrange the Order of PDF Pages
Python Automation Series: Crop Specific Areas of Specific Pages in PDF
Python Automation Series: Automatically Fill Student Information from Excel into a Word Document
Python Automation Series: Easily Merge Specific Pages of PDF Files
Python Automation Series: Automatically Fill Multiple Student Information from Excel into a Word Document
Python Automation Series: How to Easily Add a Uniform Watermark to PDF Files
Python Automation Series: Batch Image Watermark Adding Tool
Python Automation Series: One-Click Extract PPT Text to Word
Python Automation Series: Quickly Generate GIF Animations with Python Scripts
Python Automation Series: Easily Obtain Douban Top 250 Movie Data
Python Automation Series: Automatically Delete Duplicate Images and Keep the Latest Version
Python Automation Series: One-Click Batch Convert Excel Multi-Page Data to Word Documents
Python Automation Series: How to Use Python to Add Password Protection to PDF Files