Detailed Method for Adding Tilted Watermark Images to PDFs in Python

In everyday office work, PDF files often need a watermark or seal to indicate document status or copyright. This article walks through a Python script that adds a tilted image as a watermark to PDF files. The script batch-processes PDFs, stamps a specified image onto each one, and lets you adjust the image's size, brightness, and tilt angle.

Overview of Code Functionality

The code does the following:

Resize the specified PNG image to 50 mm × 20 mm (width × height)
Darken the image
Rotate the image 45 degrees to serve as a watermark
Add the processed image to a specified position on the first page of the PDF
Support batch processing of all PDF files in a folder
Delete the temporary files it creates along the way

The code mainly relies on PyPDF2 for processing PDF files, PIL (Pillow) for image processing, and reportlab for generating temporary PDFs to place the images.

Core Function Analysis

Unit Conversion Function: mm_to_points

```python
def mm_to_points(mm):
    """Convert millimeters to PDF points (1 inch = 25.4 mm = 72 points)"""
    return mm * 72 / 25.4
```

PDF files use “points” as the basic unit, while we are usually accustomed to measuring dimensions in millimeters. This helper function implements the conversion from millimeters to points based on the conversion relationship: 1 inch = 25.4 mm = 72 points.
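As a quick check of the conversion, the sketch below applies the formula to the 50 mm × 20 mm watermark size used later in this article. `points_to_mm` is a hypothetical inverse helper added here for illustration; it is not part of the original code.

```python
def mm_to_points(mm):
    """Convert millimeters to PDF points (1 inch = 25.4 mm = 72 points)."""
    return mm * 72 / 25.4


def points_to_mm(points):
    """Inverse conversion (hypothetical helper, not in the original code)."""
    return points * 25.4 / 72


# Watermark size used later in the article: 50 mm x 20 mm
width_pt = mm_to_points(50)
height_pt = mm_to_points(20)
print(f"{width_pt:.2f} x {height_pt:.2f} points")  # 141.73 x 56.69 points
```

The inverse is handy for reading page sizes: a page's dimensions in points can be converted back to millimeters to confirm the paper format.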

Main Function: add_tilted_image_to_pdf

This is the core function of the code, responsible for completing the image processing and adding the watermark to the PDF. We can understand it by breaking it down into several main steps:

1. Path Handling

```python
if output_path is None:
    name, ext = os.path.splitext(pdf_path)
    output_path = f"{name}_modified{ext}"
```

If the output path is not specified, the function automatically appends the suffix “_modified” to the original filename as the output filename.
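To see the naming rule in isolation, the following sketch wraps the same logic in a small function. The name `default_output_path` and the sample file paths are assumptions made for this example only.

```python
import os


def default_output_path(pdf_path, output_path=None):
    """Return output_path, or derive '<name>_modified<ext>' from pdf_path."""
    if output_path is None:
        name, ext = os.path.splitext(pdf_path)
        output_path = f"{name}_modified{ext}"
    return output_path


print(default_output_path("reports/contract.pdf"))  # reports/contract_modified.pdf
print(default_output_path("a.pdf", "b.pdf"))        # b.pdf
```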

2. PDF Reading and Initialization

```python
reader = PdfReader(pdf_path)
writer = PdfWriter()

if len(reader.pages) == 0:
    raise ValueError("The PDF file has no pages")

first_page = reader.pages[0]
page_width = float(first_page.mediabox.width)
page_height = float(first_page.mediabox.height)
```

The code uses PyPDF2’s PdfReader to read the original PDF and prepares the output file with PdfWriter. Basic validation is performed to ensure the PDF contains at least one page, and the dimensions of the first page are obtained.

3. Image Processing Flow

This part is the most complex section of the code, responsible for processing the original image into the desired watermark style:

```python
# Define image size (millimeters) and convert to points
target_width_mm = 50
target_height_mm = 20
target_width = mm_to_points(target_width_mm)
target_height = mm_to_points(target_height_mm)

# Open the image to be added and resize
with Image.open(image_path) as img:
    # Convert to RGBA so the image can serve as its own paste mask below
    resized_img = img.convert('RGBA').resize((int(target_width), int(target_height)), Image.Resampling.LANCZOS)

    # Adjust image brightness (darken)
    enhancer = ImageEnhance.Brightness(resized_img)
    darkened_img = enhancer.enhance(darkness_factor)

    # Create a square rotation canvas (side = image diagonal) and paste the image at its center
    max_dim = int((target_width ** 2 + target_height ** 2) ** 0.5)
    rotated_img = Image.new('RGBA', (max_dim, max_dim), (255, 255, 255, 0))
    paste_x = (max_dim - int(target_width)) // 2
    paste_y = (max_dim - int(target_height)) // 2
    rotated_img.paste(darkened_img, (paste_x, paste_y), darkened_img)

    # Rotate the image
    rotated_img = rotated_img.rotate(0, expand=True, resample=Image.Resampling.BICUBIC)

    # Save the rotated image to a temporary file
    with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as img_temp_file:
        rotated_img.save(img_temp_file, format='PNG')
        img_temp_path = img_temp_file.name
```

This code accomplishes:

Resizing the image to the specified dimensions (50mm × 20mm)
Using a brightness enhancer to darken the image (default darkness factor 0.7)
Creating a sufficiently large transparent canvas to accommodate the rotated image
Pasting the processed image at the center of the canvas
Rotating the image (note: the snippet calls rotate(0), which leaves the image unrotated and is probably a typo; use rotate(45) for a 45-degree tilt)
Saving the processed image to a temporary file
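The canvas side is set to the image's diagonal so that the picture still fits after rotation. The sketch below checks that reasoning with plain arithmetic: it computes the bounding box of a rotated rectangle and confirms that neither side ever exceeds the diagonal. The function name `rotated_bounding_box` and the sample angles are assumptions for illustration.

```python
import math


def rotated_bounding_box(width, height, angle_degrees):
    """Size of the axis-aligned box enclosing a width x height rectangle
    rotated by angle_degrees about its center."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    return (width * cos_t + height * sin_t,
            width * sin_t + height * cos_t)


width, height = 141, 56  # 50 mm x 20 mm in points, truncated to integers as in the code
diagonal = math.hypot(width, height)

for angle in (0, 30, 45, 90):
    box_w, box_h = rotated_bounding_box(width, height, angle)
    # The diagonal is an upper bound for both box sides at every angle
    assert box_w <= diagonal + 1e-9 and box_h <= diagonal + 1e-9
    print(f"{angle:>2} deg: {box_w:.1f} x {box_h:.1f} (diagonal {diagonal:.1f})")
```

Since the script also passes expand=True to rotate, Pillow enlarges the output as needed, so the diagonal-sized canvas mainly serves to keep the image centered with transparent padding around it.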

4. Generating PDF Page with Watermark

```python
# Create a PDF page to place the rotated image
c = canvas.Canvas(temp_filename, pagesize=(page_width, page_height))

# Read the size of the rotated image
with Image.open(img_temp_path) as rotated_img:
    img_width, img_height = rotated_img.size

# Calculate placement coordinates:
# right margin of 10-15% of the page width, 20-30% of the page height above the bottom
x = page_width - (page_width * random.uniform(0.1, 0.15)) - img_width
y = page_height * random.uniform(0.2, 0.3)

# Draw the image
c.drawImage(img_temp_path, x, y, width=img_width, height=img_height, mask='auto')
c.save()
```

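This part uses reportlab to create a single-page temporary PDF at temp_filename (a temporary path whose creation is not shown in this excerpt), sized to match the first page, and draws the processed image on it. reportlab measures coordinates from the bottom-left corner of the page, so x leaves a margin of 10-15% of the page width on the right, and y places the image 20-30% of the page height above the bottom edge. Because both values come from random.uniform, the position varies slightly from file to file, which is intended to make the watermark harder to reproduce exactly.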
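The position arithmetic can be tried on its own, without reportlab. The sketch below is a hypothetical helper (`watermark_position` is not in the original code) that takes an optional random generator, so a seeded `random.Random` gives repeatable coordinates while testing. The A4 page size and the 151-point image width are sample values.

```python
import random


def watermark_position(page_width, page_height, img_width, rng=random):
    """Return (x, y) of the image's lower-left corner, in points.

    x leaves a right margin of 10-15% of the page width;
    y sits 20-30% of the page height above the bottom edge.
    """
    x = page_width - page_width * rng.uniform(0.10, 0.15) - img_width
    y = page_height * rng.uniform(0.20, 0.30)
    return x, y


# A4 page in points; a seeded generator makes the result repeatable
rng = random.Random(42)
x, y = watermark_position(595.28, 841.89, 151, rng)
print(f"x={x:.1f}, y={y:.1f}")
```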

5. Merging PDFs and Cleaning Up Temporary Files

```python
# Merge the original PDF and the PDF with the image
image_reader = PdfReader(temp_filename)
image_page = image_reader.pages[0]

first_page.merge_page(image_page)
writer.add_page(first_page)

# Add the other pages of the PDF
for page in reader.pages[1:]:
    writer.add_page(page)

# Save the result
with open(output_path, 'wb') as f:
    writer.write(f)

# Clean up temporary files
os.unlink(temp_filename)
os.unlink(img_temp_path)
```

Finally, the code merges the temporary PDF page with the watermark into the first page of the original PDF and adds the remaining pages, completing the generation of the final PDF. It also cleans up the previously created temporary files to avoid occupying system resources.
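In the snippet above, the two os.unlink calls run only if every earlier step succeeds, so an exception during merging would leave the temporary files behind. One possible way to make the cleanup unconditional is a small context manager built on try/finally. This is a sketch rather than the original author's approach, and `temporary_path` is a hypothetical name.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_path(suffix):
    """Yield the path of a new temporary file and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed; callers reopen the file themselves
    try:
        yield path
    finally:
        # Runs on normal exit and when an exception propagates
        if os.path.exists(path):
            os.unlink(path)


with temporary_path(".png") as img_temp_path, temporary_path(".pdf") as temp_filename:
    # Save the rotated image to img_temp_path, build the overlay PDF at
    # temp_filename, then merge and write the output inside this block.
    print(os.path.exists(img_temp_path), os.path.exists(temp_filename))
```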

Batch Processing Implementation

The main program part of the code implements batch processing of all PDF files in the specified folder:

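```python
if __name__ == "__main__":
    # "dirs" avoids shadowing the built-in dir()
    for root, dirs, files in os.walk('PDF Batch Processing Cover with Seal (2)'):
        for file in files:
            print(file)
            # Match the extension case-insensitively (.pdf, .PDF)
            if file.lower().endswith('.pdf'):
                pdf_file = os.path.join(root, file)
                image_file = '7.png'
                # Same path as the input: the original file is overwritten
                output_file = os.path.join(root, file)
                add_tilted_image_to_pdf(pdf_file, image_file, output_file)
```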

This code uses os.walk to traverse all files in the “PDF Batch Processing Cover with Seal (2)” folder and its subfolders, calling add_tilted_image_to_pdf for each PDF file. Because output_file is the same path as pdf_file, each PDF is overwritten in place with its watermarked version.
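If overwriting the originals is not desired, the batch loop can write to separate files instead. The sketch below, using pathlib, only plans the input/output pairs; the function names `find_pdfs` and `plan_outputs` are assumptions for this example. It skips files whose names already end in `_modified`, so running it twice does not stamp the outputs again.

```python
from pathlib import Path


def find_pdfs(folder):
    """Yield every PDF under folder, skipping files that are already outputs."""
    for path in sorted(Path(folder).rglob("*")):
        is_pdf = path.is_file() and path.suffix.lower() == ".pdf"
        if is_pdf and not path.stem.endswith("_modified"):
            yield path


def plan_outputs(folder):
    """Pair each input PDF with a separate '<name>_modified<ext>' output path."""
    return [(pdf, pdf.with_name(f"{pdf.stem}_modified{pdf.suffix}"))
            for pdf in find_pdfs(folder)]


# Intended use with the article's function (not defined in this sketch):
# for pdf, out in plan_outputs('PDF Batch Processing Cover with Seal (2)'):
#     add_tilted_image_to_pdf(str(pdf), '7.png', str(out))
```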

Usage Instructions and Precautions

Environment Preparation: install the required libraries
```
pip install pillow PyPDF2 reportlab
```
File Path:

Replace the folder path “PDF Batch Processing Cover with Seal (2)” in the code with the actual folder path
Ensure the image file “7.png” exists in the current working directory or specify the correct path

Parameter Adjustments:

You can adjust the image darkness by modifying the darkness_factor parameter
To adjust the image size, modify the values of target_width_mm and target_height_mm
To adjust the image position, modify the parameter range of random.uniform
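One detail worth knowing when changing the size: the script resizes the image to one pixel per point, which corresponds to 72 pixels per inch and can look coarse in print. The sketch below shows how many pixels the same physical size would need at other resolutions. `mm_to_pixels` and the dpi values are assumptions for illustration; using them would also mean passing the size in points, not pixels, as the width and height when drawing the image.

```python
def mm_to_pixels(mm, dpi=72):
    """Pixel count for a physical length at a given resolution (hypothetical helper)."""
    return round(mm / 25.4 * dpi)


target_width_mm, target_height_mm = 50, 20

for dpi in (72, 150, 300):
    width_px = mm_to_pixels(target_width_mm, dpi)
    height_px = mm_to_pixels(target_height_mm, dpi)
    print(f"{dpi} dpi -> {width_px} x {height_px} px")
```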

Precautions:

The image rotation angle in the code is set to 0 degrees, which may be a typo; to achieve a 45-degree tilt, it should be changed to rotate(45)
Processing large PDF files may require a longer time and more memory
Back up the original PDF files before running the script, because the batch loop writes each result over its input file

Function Extension Suggestions

Based on this code, we can implement some functional extensions:

Support for custom watermark positions and rotation angles
Implement adding watermarks to all pages of the PDF
Add text watermark functionality
Add batch processing progress display
Support for more image formats and processing of encrypted PDF files
Add logging functionality for tracking processing results
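As a starting point for the progress display and logging ideas, the sketch below wraps any per-file function in a loop that logs a counter and collects failures. The name `process_batch`, the logger name, and the log format are all assumptions; `process_one` stands in for a call such as add_tilted_image_to_pdf.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("watermark")


def process_batch(pdf_paths, process_one):
    """Run process_one on each path, logging progress; return the failed paths."""
    failed = []
    total = len(pdf_paths)
    for index, path in enumerate(pdf_paths, start=1):
        try:
            process_one(path)
            logger.info("[%d/%d] done: %s", index, total, path)
        except Exception:
            # Keep going so one damaged file does not stop the whole batch
            logger.exception("[%d/%d] failed: %s", index, total, path)
            failed.append(path)
    return failed
```

Returning the list of failed paths makes it easy to retry or report them after the batch finishes.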

Working through this code gives you a quick way to add watermarks to PDFs and a practical look at the basics of PDF and image processing, which lays the foundation for more complex document processing tasks.