Detailed Method for Adding Tilted Watermark Images to PDFs in Python
In everyday office work and document processing, we often need to add watermarks or seals to PDF files to indicate a document's status or copyright information. This article analyzes in detail a Python script that adds a tilted image as a watermark to PDF files. The script can process PDF files in batches, automatically adds a specified image as the watermark, and allows the image size, brightness, and tilt angle to be adjusted.
Overview of Code Functionality
This code primarily implements the following functionalities:
- Resize the specified PNG image to 50mm × 20mm (width × height)
- Darken the image
- Rotate the image 45 degrees to serve as a watermark
- Add the processed image to a specified position on the first page of the PDF
- Support batch processing of all PDF files in a folder
- Automatically clean up temporary files after processing
The code mainly relies on <span>PyPDF2</span> for processing PDF files, <span>PIL</span> (Pillow) for image processing, and <span>reportlab</span> for generating temporary PDFs to place the images.
Core Function Analysis
Unit Conversion Function: mm_to_points
def mm_to_points(mm):
    """Convert millimeters to PDF points (1 inch = 25.4 mm = 72 points)"""
    return mm * 72 / 25.4
PDF files use “points” as the basic unit, while we are usually accustomed to measuring dimensions in millimeters. This helper function implements the conversion from millimeters to points based on the conversion relationship: 1 inch = 25.4 mm = 72 points.
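As a quick check of the conversion, the sketch below applies the same formula to the 50 mm × 20 mm watermark size used later in this article and to an A4 page (210 mm × 297 mm, chosen here only as a familiar reference size, not something the original code assumes).

```python
def mm_to_points(mm):
    """Convert millimeters to PDF points (1 inch = 25.4 mm = 72 points)."""
    return mm * 72 / 25.4

# Watermark size used later in the article: 50 mm x 20 mm
watermark_width_pt = mm_to_points(50)
watermark_height_pt = mm_to_points(20)

# A4 page, included only as a familiar reference size
a4_width_pt = mm_to_points(210)
a4_height_pt = mm_to_points(297)

print(round(watermark_width_pt, 2), round(watermark_height_pt, 2))  # 141.73 56.69
print(round(a4_width_pt, 2), round(a4_height_pt, 2))                # 595.28 841.89
```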
Main Function: add_tilted_image_to_pdf
This is the core function of the code, responsible for completing the image processing and adding the watermark to the PDF. We can understand it by breaking it down into several main steps:
1. Path Handling
if output_path is None:
    name, ext = os.path.splitext(pdf_path)
    output_path = f"{name}_modified{ext}"
If the output path is not specified, the function automatically appends the suffix “_modified” to the original filename as the output filename.
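The naming rule can be tried in isolation. The helper below, `default_output_path`, is a hypothetical name introduced for this illustration; it wraps the same `os.path.splitext` logic so the behavior on a few sample filenames is easy to see.

```python
import os

def default_output_path(pdf_path, suffix="_modified"):
    """Return pdf_path with suffix inserted before the file extension."""
    name, ext = os.path.splitext(pdf_path)
    return f"{name}{suffix}{ext}"

print(default_output_path("report.pdf"))      # report_modified.pdf
print(default_output_path("archive.v2.pdf"))  # archive.v2_modified.pdf
```

Only the last extension is split off, so a name such as "archive.v2.pdf" keeps its ".v2" part.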
2. PDF Reading and Initialization
reader = PdfReader(pdf_path)
writer = PdfWriter()
if len(reader.pages) == 0:
    raise ValueError("The PDF file has no pages")
first_page = reader.pages[0]
page_width = float(first_page.mediabox.width)
page_height = float(first_page.mediabox.height)
The code uses <span>PyPDF2</span>’s <span>PdfReader</span> to read the original PDF and prepares the output file with <span>PdfWriter</span>. Basic validation is performed to ensure the PDF contains at least one page, and the dimensions of the first page are obtained.
3. Image Processing Flow
This part is the most complex section of the code, responsible for processing the original image into the desired watermark style:
# Define image size (millimeters) and convert to points
target_width_mm = 50
target_height_mm = 20
target_width = mm_to_points(target_width_mm)
target_height = mm_to_points(target_height_mm)
# Open the image to be added and resize
with Image.open(image_path) as img:
    # Convert to RGBA so the image can serve as its own paste mask below
    resized_img = img.convert('RGBA').resize((int(target_width), int(target_height)), Image.Resampling.LANCZOS)
    # Adjust image brightness (darken)
    enhancer = ImageEnhance.Brightness(resized_img)
    darkened_img = enhancer.enhance(darkness_factor)
    # Create a rotation canvas and paste the image
    max_dim = int((target_width ** 2 + target_height ** 2) ** 0.5)
    rotated_img = Image.new('RGBA', (max_dim, max_dim), (255, 255, 255, 0))
    paste_x = (max_dim - int(target_width)) // 2
    paste_y = (max_dim - int(target_height)) // 2
    rotated_img.paste(darkened_img, (paste_x, paste_y), darkened_img)
    # Rotate the image (0 leaves it unrotated; use 45 for a 45-degree tilt)
    rotated_img = rotated_img.rotate(0, expand=True, resample=Image.Resampling.BICUBIC)
    # Save the rotated image to a temporary file
    with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as img_temp_file:
        rotated_img.save(img_temp_file, format='PNG')
        img_temp_path = img_temp_file.name
This code accomplishes:
- Resizing the image to the specified dimensions (50mm × 20mm)
- Using a brightness enhancer to darken the image (default darkness factor 0.7)
- Creating a sufficiently large transparent canvas to accommodate the rotated image
- Pasting the processed image at the center of the canvas
- Rotating the image (note: the code’s rotate(0) may be a typo; it should be rotate(45) for a 45-degree tilt)
- Saving the processed image to a temporary file
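To see why a square canvas whose side equals the image diagonal is "sufficiently large", the sketch below computes the axis-aligned bounding box of a rotated rectangle using plain trigonometry and compares it with the diagonal. It uses only the standard library; `rotated_bounding_box` is a hypothetical helper written for this illustration and is not part of the original code.

```python
import math

def mm_to_points(mm):
    return mm * 72 / 25.4

def rotated_bounding_box(width, height, angle_degrees):
    """Width and height of the axis-aligned box enclosing a rotated rectangle."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    return (width * cos_t + height * sin_t,
            width * sin_t + height * cos_t)

w, h = mm_to_points(50), mm_to_points(20)
diagonal = math.hypot(w, h)  # side length of the square canvas before truncation

for angle in (0, 30, 45, 90):
    box_w, box_h = rotated_bounding_box(w, h, angle)
    # Neither side of the bounding box can exceed the diagonal
    assert box_w <= diagonal + 1e-9 and box_h <= diagonal + 1e-9
    print(angle, round(box_w, 1), round(box_h, 1))
```

Since the centered image fits inside the square canvas at any angle, the canvas alone already prevents clipping. A side effect worth checking in practice: calling `rotate` with `expand=True` on that canvas should enlarge the final image further with transparent margins, and that larger size is what the later placement arithmetic sees.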
4. Generating PDF Page with Watermark
# Create a PDF page to place the rotated image
c = canvas.Canvas(temp_filename, pagesize=(page_width, page_height))
# Read the size of the rotated image
with Image.open(img_temp_path) as rotated_img:
    img_width, img_height = rotated_img.size
# Calculate placement coordinates (origin is the bottom-left corner):
# 10-15% of the page width in from the right edge, 20-30% of the page height up from the bottom
x = page_width - (page_width * random.uniform(0.1, 0.15)) - img_width
y = page_height * random.uniform(0.2, 0.3)
# Draw the image
c.drawImage(img_temp_path, x, y, width=img_width, height=img_height, mask='auto')
c.save()
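This part uses <span>reportlab</span> to create a temporary single-page PDF with the same dimensions as the original first page and draws the processed image on it. <span>reportlab</span> measures coordinates from the bottom-left corner of the page, so the image's lower edge lands 20% to 30% of the page height above the bottom, and its right edge sits 10% to 15% of the page width in from the right. Because both offsets come from <span>random.uniform</span>, the position varies slightly from file to file, which is intended to make the watermark harder to copy exactly.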
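The placement arithmetic can be separated from the drawing call and checked on its own. The sketch below mirrors the formula from the code above; `watermark_position` is a hypothetical helper, and the A4 page size and 152-point image width are assumed values chosen only for the demonstration.

```python
import random

def watermark_position(page_width, page_height, img_width, rng=random):
    """Pick the image's lower-left corner near the right edge of the page.

    Coordinates are measured from the bottom-left corner of the page,
    which is how reportlab's canvas places images by default.
    """
    right_margin = page_width * rng.uniform(0.1, 0.15)
    x = page_width - right_margin - img_width
    y = page_height * rng.uniform(0.2, 0.3)
    return x, y

rng = random.Random(42)           # seeded only to make the example repeatable
page_w, page_h = 595.28, 841.89   # A4 in points (assumed page size)
img_w = 152                       # assumed watermark width in points

x, y = watermark_position(page_w, page_h, img_w, rng)
print(round(x, 1), round(y, 1))
```

Passing a seeded `random.Random` instance makes the position reproducible for testing, while the default `rng=random` keeps the per-file variation described above.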
5. Merging PDFs and Cleaning Up Temporary Files
# Merge the original PDF and the PDF with the image
image_reader = PdfReader(temp_filename)
image_page = image_reader.pages[0]
first_page.merge_page(image_page)
writer.add_page(first_page)
# Add the other pages of the PDF
for page in reader.pages[1:]:
    writer.add_page(page)
# Save the result
with open(output_path, 'wb') as f:
    writer.write(f)
# Clean up temporary files
os.unlink(temp_filename)
os.unlink(img_temp_path)
Finally, the code merges the temporary PDF page with the watermark into the first page of the original PDF and adds the remaining pages, completing the generation of the final PDF. It also cleans up the previously created temporary files to avoid occupying system resources.
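One possible way to make the cleanup robust against errors in the earlier steps is to create both scratch files inside a `tempfile.TemporaryDirectory`, which removes its contents when the block exits, whether or not an exception was raised. This is an alternative sketch, not the approach used in the article's code; `with_scratch_files` and the file names are hypothetical.

```python
import os
import tempfile

def with_scratch_files(work):
    """Run work(img_path, pdf_path) with two scratch paths that are always removed."""
    # The directory and everything in it are deleted when the with-block
    # exits, including when work() raises an exception.
    with tempfile.TemporaryDirectory() as scratch_dir:
        img_path = os.path.join(scratch_dir, "watermark.png")
        pdf_path = os.path.join(scratch_dir, "overlay.pdf")
        return work(img_path, pdf_path)

seen = {}

def failing_work(img_path, pdf_path):
    # Stand-in for the image and PDF steps: writes a file, then fails
    with open(img_path, "wb") as f:
        f.write(b"placeholder")
    seen["img_path"] = img_path
    raise RuntimeError("simulated failure")

try:
    with_scratch_files(failing_work)
except RuntimeError:
    pass

print(os.path.exists(seen["img_path"]))  # False
```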
Batch Processing Implementation
The main program part of the code implements batch processing of all PDF files in the specified folder:
if __name__ == "__main__":
    for root, dirs, files in os.walk('PDF Batch Processing Cover with Seal (2)'):
        for file in files:
            print(file)
            if file.endswith('.pdf'):
                pdf_file = os.path.join(root, file)
                image_file = '7.png'
                # Output path equals the input path, so the original file is overwritten
                output_file = os.path.join(root, file)
                add_tilted_image_to_pdf(pdf_file, image_file, output_file)
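This code uses <span>os.walk</span> to traverse all files in the “PDF Batch Processing Cover with Seal (2)” folder and its subfolders, calling the <span>add_tilted_image_to_pdf</span> function on each PDF file. Because <span>output_file</span> is the same path as <span>pdf_file</span>, each PDF is overwritten in place with its watermarked version.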
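If overwriting the originals is not desired, one option is to mirror the folder structure into a separate output folder. The sketch below only plans the source and destination paths; `plan_batch` and the folder names are hypothetical, and the extension check is made case-insensitive, which differs from the original `endswith('.pdf')` test.

```python
import os
import tempfile

def plan_batch(input_root, output_root):
    """Yield (source, destination) pairs for every PDF under input_root."""
    for root, _dirs, files in os.walk(input_root):
        for file in sorted(files):
            if not file.lower().endswith(".pdf"):
                continue
            source = os.path.join(root, file)
            relative = os.path.relpath(source, input_root)
            yield source, os.path.join(output_root, relative)

# Demo on a throwaway folder tree
with tempfile.TemporaryDirectory() as tmp:
    input_root = os.path.join(tmp, "in")
    os.makedirs(os.path.join(input_root, "sub"))
    for name in ("a.pdf", os.path.join("sub", "B.PDF"), "notes.txt"):
        open(os.path.join(input_root, name), "wb").close()
    pairs = list(plan_batch(input_root, os.path.join(tmp, "out")))

for source, destination in pairs:
    print(os.path.relpath(source, input_root), "->", os.path.basename(destination))
```

In real use, each destination folder would need to exist before writing, for example via `os.makedirs(os.path.dirname(destination), exist_ok=True)`, followed by `add_tilted_image_to_pdf(source, '7.png', destination)`. The output folder should sit outside the input folder so that `os.walk` does not pick up already processed files.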
Usage Instructions and Precautions
- Environment Preparation: install the required dependencies:
  pip install pillow PyPDF2 reportlab
- File Path:
  - Replace the folder path “PDF Batch Processing Cover with Seal (2)” in the code with the actual folder path
  - Ensure the image file “7.png” exists in the current working directory, or specify the correct path
- Parameter Adjustments:
  - To adjust the image darkness, modify the <span>darkness_factor</span> parameter
  - To adjust the image size, modify the values of <span>target_width_mm</span> and <span>target_height_mm</span>
  - To adjust the image position, modify the ranges passed to <span>random.uniform</span>
- Precautions:
  - The image rotation angle in the code is set to 0 degrees, which may be a typo; to achieve a 45-degree tilt, change it to <span>rotate(45)</span>
  - Processing large PDF files may take longer and use more memory
  - Back up the original PDF files before running the batch script, because it writes each result over its input file
Function Extension Suggestions
Based on this code, we can implement some functional extensions:
- Support for custom watermark positions and rotation angles
- Implement adding watermarks to all pages of the PDF
- Add text watermark functionality
- Add batch processing progress display
- Support for more image formats and processing of encrypted PDF files
- Add logging functionality for tracking processing results
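As one illustration of the progress display and logging suggestions, the sketch below wraps a per-file function in a loop that logs a running count and records failures instead of stopping at the first one. `process_all` and `fake_process` are hypothetical names; in practice the callable would be a wrapper around `add_tilted_image_to_pdf`.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("watermark")

def process_all(pdf_paths, process_one):
    """Call process_one(path) for each path, logging progress and failures."""
    succeeded, failed = [], []
    total = len(pdf_paths)
    for index, path in enumerate(pdf_paths, start=1):
        try:
            process_one(path)
        except Exception:
            # log.exception also records the traceback for later inspection
            log.exception("[%d/%d] failed: %s", index, total, path)
            failed.append(path)
        else:
            log.info("[%d/%d] done: %s", index, total, path)
            succeeded.append(path)
    return succeeded, failed

def fake_process(path):
    # Stand-in for the real watermark call; fails on one file for illustration
    if path == "broken.pdf":
        raise ValueError("simulated unreadable PDF")

succeeded, failed = process_all(["a.pdf", "broken.pdf", "c.pdf"], fake_process)
print(len(succeeded), "succeeded,", len(failed), "failed")  # 2 succeeded, 1 failed
```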
Through this code, we can not only quickly implement the PDF watermark addition function but also gain a deeper understanding of the basic principles of PDF processing and image processing, laying the foundation for more complex document processing tasks.