Automating Web Tasks with the Python Selenium Library

Selenium is a powerful web automation testing tool, also widely used for web scraping development. This article will detail the core usage methods of the Python Selenium library and provide practical code examples.

1. Installation and Environment Configuration

1.1 Installing Selenium

pip install selenium -U

1.2 Downloading Browser Drivers

Browser	Driver Download Link
Chrome	https://googlechromelabs.github.io/chrome-for-testing/
Firefox	https://github.com/mozilla/geckodriver/releases
Edge	https://developer.microsoft.com/microsoft-edge/tools/webdriver/

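Recommended practice: Place the driver on the system PATH or pass its path explicitly in code to avoid version conflicts. Selenium 4.6 and later also bundle Selenium Manager, which can resolve a matching driver automatically when no path is given.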

2. Basic Usage

2.1 Starting the Browser

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

# Recommended: Use Chrome-for-Testing to avoid version mismatches
service = Service(executable_path='chromedriver.exe')
options = Options()
options.add_argument('--start-maximized')   # Start maximized
driver = webdriver.Chrome(service=service, options=options)

2.2 Accessing a Web Page

driver.get('https://www.example.com')

2.3 Common Browser Control Interfaces

driver.maximize_window()          # Maximize

driver.set_window_size(1280, 720) # Set window size
driver.back()                     # Back
driver.forward()                  # Forward
driver.refresh()                  # Refresh

print(driver.current_url)         # Current URL
print(driver.title)               # Page title

2.4 Releasing Resources

driver.quit()  # Close all windows and release the process
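
Because a forgotten `quit()` leaves browser and driver processes running, it can help to tie the cleanup to a `with` block. The following is a minimal sketch, not part of Selenium itself: `managed_driver` and its `factory` argument are hypothetical names introduced here for illustration.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver via factory() and always call quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # Runs even if the body raised an exception
```

Assuming the `options` object from section 2.1, usage might look like `with managed_driver(lambda: webdriver.Chrome(options=options)) as driver: driver.get('https://www.example.com')`.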

3. Element Location (Using Unified Entry `find_element`)

from selenium.webdriver.common.by import By

driver.find_element(By.ID, 'element_id')
driver.find_element(By.NAME, 'element_name')
driver.find_element(By.CLASS_NAME, 'element_class')
driver.find_element(By.TAG_NAME, 'div')
driver.find_element(By.LINK_TEXT, 'Full Link Text')
driver.find_element(By.PARTIAL_LINK_TEXT, 'Partial Link Text')
driver.find_element(By.CSS_SELECTOR, 'div.content > p')
driver.find_element(By.XPATH, "//div[@id='content']/p")
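
Note that `find_element` raises `NoSuchElementException` when nothing matches, while `find_elements` (plural) returns an empty list. When an element is optional, a small wrapper along these lines can avoid a try/except; `find_optional` is a hypothetical helper name, not a Selenium API.

```python
def find_optional(driver, by, value):
    """Return the first matching element, or None when nothing matches."""
    matches = driver.find_elements(by, value)  # Returns [] instead of raising
    return matches[0] if matches else None
```

For example, `banner = find_optional(driver, By.ID, 'cookie-banner')` would yield `None` on pages without such a banner (the ID here is only an illustration).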

4. Element Interaction

search_box = driver.find_element(By.NAME, 'q')
search_box.send_keys('Python Selenium')  # Input
search_box.clear()                       # Clear
search_box.submit()                      # Submit

btn = driver.find_element(By.ID, 'submit')
btn.click()                              # Click

text = btn.text                          # Get text
href = btn.get_attribute('href')         # Get attribute
is_visible = btn.is_displayed()          # Is visible
is_enabled = btn.is_enabled()            # Is clickable
is_selected = btn.is_selected()          # Is selected (checkbox/radio)

5. Waiting Mechanism (Avoid `time.sleep`)

Type	Usage Scenario	Example
Forced Wait	Debugging Phase	`time.sleep(5)`
Implicit Wait	Global Setting	`driver.implicitly_wait(10)`
Explicit Wait	Precise Wait	See below

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for the element to be visible and clickable
element = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, 'login-btn'))
)

# Wait for the URL to contain a keyword
WebDriverWait(driver, 10).until(
    EC.url_contains('dashboard')
)
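
`WebDriverWait.until` accepts any callable that takes the driver and returns a truthy value once the condition holds, so custom conditions are possible when the built-in ones do not fit. The sketch below assumes a hypothetical condition named `element_count_at_least`; it is an illustration, not one of Selenium's `expected_conditions`.

```python
def element_count_at_least(locator, minimum):
    """Build a wait condition that succeeds once `minimum` elements match.

    `locator` is a (by, value) tuple; `minimum` should be at least 1.
    """
    def _condition(driver):
        elements = driver.find_elements(*locator)
        # WebDriverWait keeps polling while the return value is falsy
        return elements if len(elements) >= minimum else False
    return _condition
```

It could then be used as `items = WebDriverWait(driver, 10).until(element_count_at_least((By.CSS_SELECTOR, 'li.item'), 5))`, where the selector is only an example.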

6. Advanced Techniques

6.1 Executing JavaScript

# Scroll to the bottom of the page
driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')

# Highlight element
driver.execute_script("arguments[0].style.border='3px solid red'", element)

6.2 Handling iframes

iframe = driver.find_element(By.TAG_NAME, 'iframe')
driver.switch_to.frame(iframe)      # Enter iframe
driver.switch_to.default_content()  # Return to main document
driver.switch_to.parent_frame()     # Return to parent iframe

6.3 Handling Pop-ups (alert / confirm / prompt)

alert = driver.switch_to.alert
print(alert.text)      # Get text
alert.send_keys('OK')  # prompt input (must happen before the dialog is closed)
alert.accept()         # Confirm
# alert.dismiss()      # Or cancel instead; a dialog can only be closed once

6.4 File Upload

file_input = driver.find_element(By.NAME, 'file')
file_input.send_keys(r'C:\path\to\file.txt')  # Must be an absolute path

If the upload button is hidden by <input type="file" hidden>, you can first execute JS to remove the hidden attribute.
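
The attribute removal described above can be sketched as follows. `reveal_file_input` is a hypothetical helper name, and the sketch only handles the `hidden` attribute; inputs hidden through CSS would need a different tweak.

```python
def reveal_file_input(driver, file_input):
    """Remove the `hidden` attribute so send_keys() can reach the input."""
    driver.execute_script(
        "arguments[0].removeAttribute('hidden');",
        file_input,
    )
    return file_input
```

After that, `reveal_file_input(driver, file_input).send_keys(r'C:\path\to\file.txt')` uploads the file as in the example above.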

6.5 Dropdown Selection

from selenium.webdriver.support.ui import Select

select = Select(driver.find_element(By.ID, 'city'))
select.select_by_visible_text('Shanghai')
select.select_by_value('sh')
select.select_by_index(2)
select.deselect_all()  # Deselect all in multi-select

6.6 Taking Screenshots

driver.save_screenshot('screenshot.png')
png_bytes = driver.get_screenshot_as_png()     # Binary
base64_str = driver.get_screenshot_as_base64() # Base64

6.7 Switching Between Multiple Windows/Tabs

Scenario: After clicking a hyperlink to open a new tab, you need to switch context to continue operations.

original = driver.current_window_handle
driver.find_element(By.LINK_TEXT, 'Open New Tab').click()

WebDriverWait(driver, 10).until(lambda d: len(d.window_handles) > 1)
driver.switch_to.window(driver.window_handles[-1])  # Switch to new tab
# Business operations ...
driver.close()                       # Close new tab
driver.switch_to.window(original)    # Switch back to original tab
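
The order of `driver.window_handles` is not guaranteed to reflect the order in which windows were opened, so `window_handles[-1]` may not always be the new tab. A more defensive approach is to compare the handles before and after the click. `find_new_window` below is a hypothetical helper written for this article, not a Selenium function.

```python
def find_new_window(handles_before, handles_after):
    """Return the single handle that appears only in `handles_after`."""
    new_handles = set(handles_after) - set(handles_before)
    if len(new_handles) != 1:
        raise RuntimeError(
            f'Expected exactly one new window, found {len(new_handles)}'
        )
    return new_handles.pop()
```

In practice this means saving `before = driver.window_handles` ahead of the click and then calling `driver.switch_to.window(find_new_window(before, driver.window_handles))`.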

6.8 Executing CDP Commands (Chrome Only)

Use the Chrome DevTools Protocol (CDP) for capabilities that the native APIs do not cover, such as blocking resources or overriding the user agent:

# Block images to improve loading speed
driver.execute_cdp_cmd('Network.enable', {})
driver.execute_cdp_cmd('Network.setBlockedURLs', {'urls': ['*.jpg', '*.png']})

# Modify UA
driver.execute_cdp_cmd('Network.setUserAgentOverride', {
    'userAgent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)'
})

6.9 Downloading Files (Automatically Save to Specified Directory)

from pathlib import Path

download_dir = Path.cwd() / 'downloads'
download_dir.mkdir(exist_ok=True)

prefs = {
    'download.default_directory': str(download_dir.absolute()),
    'download.prompt_for_download': False,
    'safebrowsing.enabled': False
}
options.add_experimental_option('prefs', prefs)  # Set before creating the driver

After downloading, you can poll the directory until the file appears or times out.
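
One way to sketch that polling loop is shown below. It assumes the download directory is empty beforehand and that unfinished Chrome downloads carry the `.crdownload` suffix; `wait_for_download` and its parameters are hypothetical names chosen for illustration.

```python
import time
from pathlib import Path


def wait_for_download(directory, timeout=30.0, poll=0.5):
    """Return a finished file in `directory`, or raise TimeoutError."""
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        # Ignore Chrome's temporary files for downloads still in progress
        finished = sorted(
            p for p in directory.iterdir()
            if p.is_file() and p.suffix != '.crdownload'
        )
        if finished:
            return finished[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f'No finished download in {directory} after {timeout}s'
            )
        time.sleep(poll)
```

With the `download_dir` defined above, `path = wait_for_download(download_dir)` blocks until a file is complete or the timeout expires.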

6.10 Mouse Hover & Drag-and-Drop

from selenium.webdriver import ActionChains

menu = driver.find_element(By.ID, 'menu')
ActionChains(driver).move_to_element(menu).perform()  # Hover

src = driver.find_element(By.ID, 'draggable')
dst = driver.find_element(By.ID, 'droppable')
ActionChains(driver).drag_and_drop(src, dst).perform()

6.11 Keyboard Shortcuts & Clipboard

from selenium.webdriver.common.keys import Keys

# Ctrl + A → Ctrl + C
elm = driver.find_element(By.TAG_NAME, 'body')
elm.send_keys(Keys.CONTROL, 'a')
elm.send_keys(Keys.CONTROL, 'c')

# Use pyperclip to read/write system clipboard
import pyperclip
pyperclip.copy('Hello Selenium')
elm.send_keys(Keys.CONTROL, 'v')

6.12 Scrolling to Make Any Element Visible

elm = driver.find_element(By.ID, 'footer')
driver.execute_script('arguments[0].scrollIntoView({behavior: "smooth"});', elm)

6.13 Modifying/Removing Element Attributes (Bypassing Input Validation)

pwd = driver.find_element(By.ID, 'password')
driver.execute_script('arguments[0].removeAttribute("required")', pwd)

6.14 Getting Network Requests & Responses (with Chrome DevTools)

driver.execute_cdp_cmd('Network.enable', {})
logs = driver.get_log('performance')  # Requires the 'goog:loggingPrefs' capability, e.g. {'performance': 'ALL'}

Parsing logs can provide details of XHR/Fetch requests, enabling interface-level assertions.
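
As a rough sketch of that parsing step: each performance-log entry carries a JSON string in its `message` field, and the wrapped CDP event sits under a nested `message` key. The helper below filters `Network.responseReceived` events. `extract_responses` is a hypothetical name, and the field layout is an assumption about typical Chrome output that should be checked against your own logs.

```python
import json


def extract_responses(log_entries, resource_types=('XHR', 'Fetch')):
    """Collect (url, status) pairs from Chrome performance-log entries."""
    responses = []
    for entry in log_entries:
        # The CDP event is JSON-encoded inside the entry's 'message' string
        message = json.loads(entry['message'])['message']
        if message.get('method') != 'Network.responseReceived':
            continue
        params = message.get('params', {})
        if params.get('type') not in resource_types:
            continue
        response = params.get('response', {})
        responses.append((response.get('url'), response.get('status')))
    return responses
```

Assuming logging was enabled with `options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})` before the driver was created, `extract_responses(driver.get_log('performance'))` returns pairs that can be asserted on directly.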

6.15 Waiting for DOM Changes with `staleness_of`

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import staleness_of

old = driver.find_element(By.ID, 'content')
driver.find_element(By.ID, 'refresh-btn').click()
WebDriverWait(driver, 10).until(staleness_of(old))  # Wait for old node to become stale

7. Practical Examples

7.1 Automated Login Example

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# 1. Start the browser
service = Service('chromedriver.exe')
options = Options()
options.add_argument('--start-maximized')
driver = webdriver.Chrome(service=service, options=options)

try:
    # 2. Access the login page
    driver.get('https://example.com/login')

    # 3. Enter username and password
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, 'username'))
    ).send_keys('your_username')

    driver.find_element(By.ID, 'password').send_keys('your_password')

    # 4. Submit the form
    driver.find_element(By.XPATH, "//button[@type='submit']").click()

    # 5. Assert successful login
    WebDriverWait(driver, 10).until(
        EC.url_contains('dashboard')
    )
    print('Login successful, current title:', driver.title)

finally:
    # 6. Close the browser
    driver.quit()

7.2 Deep Integration with pytest (Fixture + Automatic Screenshot)

# conftest.py
import pytest
from selenium import webdriver

@pytest.fixture(scope='function')
def chrome():
    driver = webdriver.Chrome()
    yield driver
    driver.quit()

@pytest.hookimpl(tryfirst=True, hookwrapper=True)
def pytest_runtest_makereport(item):
    outcome = yield
    rep = outcome.get_result()
    if rep.when == 'call' and rep.failed:
        driver = item.funcargs.get('chrome')
        if driver:
            driver.save_screenshot(f'{item.nodeid}.png'.replace('::', '_'))

This hook automatically saves a screenshot when a test fails, which makes CI failures easier to trace.
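
A pytest node ID can contain path separators and, for parametrized tests, brackets, so replacing only `::` may still leave characters that are awkward in a filename. A stricter variant might look like the sketch below; `screenshot_name` is a hypothetical helper, not part of pytest or Selenium.

```python
import re


def screenshot_name(nodeid):
    """Turn a pytest node ID into a flat, filesystem-safe PNG filename."""
    # Collapse every run of characters outside [A-Za-z0-9._-] into one '_'
    safe = re.sub(r'[^A-Za-z0-9._-]+', '_', nodeid).strip('_')
    return f'{safe}.png'
```

Inside the hook, the call would then read `driver.save_screenshot(screenshot_name(item.nodeid))`.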

Mastering these Selenium techniques covers most web automation and web scraping needs. A few recommendations:

1. Prefer explicit waits over forced waits.

2. When locating elements, prioritize stable attributes such as ID and name.

3. When handling dynamically loaded content, set reasonable wait timeouts.

4. After completing operations, always call driver.quit() to release resources.

5. For complex web operations, combine the native API with JavaScript execution.
