Ansible Network Automation Beginner’s Guide (1) – The Comeback of CLI Challengers

Network Engineer Comeback Series – Part 1

Introduction: Do you remember those days when you were called in at 3 AM for a cutover? Do you remember the shaky hands while manually configuring 100 switches? Today, I want to tell you a secret – those days should come to an end!

🎯 Pain Points: Have You Experienced These Moments of Frustration?

😫 Daily Frustrations of Network Engineers

Scenario 1: The Nightmare of Repetitive Configurations

2 AM, only you and the sound of the keyboard in the office
"This is the 27th switch, the configuration is the same, why do I have to configure them one by one?"
"What if I make a mistake? Will my boss fire me tomorrow?"

Scenario 2: Anxiety of Change Management

"I just changed a VLAN, why is the entire office down?"
"Where's the backup configuration? How long will the rollback take?"
"Help, how do I remember that last week's configuration was like this!"

Scenario 3: Pressure of Audit Compliance

Auditor: "Please provide the current configuration status of all network devices"
You: "Okay, I will log in one by one to export, it will take about 3 days..."
Auditor: "What?! I need it now!"
You: "......😭"

If you have experienced any of the above, congratulations, you have come to the right place!

💡 The Comeback Begins: What is Ansible?

🤖 Ansible: The “Code Proxy” for Network Engineers

In simple terms, Ansible is a tool that allows you to manage network devices using code!!

Imagine this:

- You no longer need to SSH into each device one by one
- You no longer need to repeatedly input the same configuration commands
- You no longer need to worry about configuration consistency issues
- You can even let the code do all the work while you enjoy your coffee

This is the comeback that Ansible brings to network engineers!

🏗️ Core Philosophy of Ansible: Desired State Management

“I only care about the result, not the process” – This is the philosophy of Ansible

Traditional network engineer thinking: “I need to log into the device, input these commands, and then check the results…”
Ansible thinking: “I want this device to be in this state, you take care of it!”

For example:

# Traditional way (painful)
# ssh switch1
# configure terminal
# interface GigabitEthernet0/1
# switchport mode access
# switchport access vlan 10
# end
# exit
# ssh switch2
# ... (repeat 50 times)

# Ansible way (elegant)
- name: Configure access port VLAN
  cisco.ios.ios_l2_interfaces:
    config:
      - name: GigabitEthernet0/1
        access:
          vlan: 10

Got it? This is the transformation from “operator” to “architect”!

🧠 Core Concepts of Ansible: Knowledge Points Every Network Engineer Must Know

🎭 Role Division: Control Node vs Managed Node

Control Node: Your “command center”

- Where Ansible is installed
- Stores all Playbooks and configurations
- Your personal computer or server

Managed Node: The network devices being managed

- Switches, routers, firewalls
- No additional software installation required!
- Any device that supports SSH/API can be managed

Network Engineer Complaints: Finally, no need to install clients on every device! Those years of Agent pitfalls…

📋 Inventory: Device Directory

# inventory.ini
[core_switches]
core1 ansible_host=192.168.1.1
core2 ansible_host=192.168.1.2

[access_switches]
access1 ansible_host=192.168.1.10
access2 ansible_host=192.168.1.11

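[all:vars]
# network_cli is required so that network modules talk to the device CLI over SSH
ansible_connection=ansible.netcommon.network_cli
ansible_user=admin
# Placeholder only: keep real passwords in Ansible Vault
ansible_password=your_password
ansible_network_os=cisco.ios.ios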

Practical Tips:

- Group devices by function
- Centralize management of authentication information
- Support dynamic retrieval of device lists
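
The first two tips can also be expressed in a YAML inventory. The following is a minimal sketch, assuming the same hosts as inventory.ini above; the file name inventory.yml and the variable vault_ansible_password are hypothetical, and the password is assumed to be defined in an Ansible Vault file rather than in plain text.

```yaml
# inventory.yml (hypothetical YAML equivalent of inventory.ini)
all:
  vars:
    # Shared connection and authentication settings live in one place
    ansible_connection: ansible.netcommon.network_cli
    ansible_network_os: cisco.ios.ios
    ansible_user: admin
    ansible_password: "{{ vault_ansible_password }}"  # hypothetical Vault variable
  children:
    # Devices grouped by function
    core_switches:
      hosts:
        core1:
          ansible_host: 192.168.1.1
        core2:
          ansible_host: 192.168.1.2
    access_switches:
      hosts:
        access1:
          ansible_host: 192.168.1.10
        access2:
          ansible_host: 192.168.1.11
```

A playbook that targets access_switches then picks up the shared settings without repeating them per host.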

📖 Playbook: Your Automation Script

---
- name: Network Device Initialization Configuration
  hosts: access_switches
  gather_facts: no  # Network devices usually do not need to gather facts

  tasks:
    - name: Configure hostname
      cisco.ios.ios_hostname:
        config:
          hostname: "{{ inventory_hostname }}-SW"

    - name: Configure DNS servers
      cisco.ios.ios_system:
        name_servers:
          - 8.8.8.8
          - 8.8.4.4

    - name: Save configuration
      cisco.ios.ios_command:
        commands: write memory

Network Engineer Comeback Moment:

Previously: Manually configuring 50 devices = 2 hours + 1 bottle of eye drops
Now: Running 1 Playbook = 2 minutes + 1 cup of coffee ☕

🔧 Red Hat Ansible Automation Platform: Enterprise-Level Arsenal

🎁 Complete Package

Ansible Core: The core engine

- The heart of running Playbooks
- Provides basic automation capabilities

Automation Content Navigator: The next-generation tool

- Replaces the traditional ansible-playbook command
- Provides an interactive interface
- Supports containerized execution environments

Automation Execution Environment: Standardized runtime environment

- A container image that includes all dependencies
- Ensures consistency between development and production environments
- Solves the classic problem of “it works on my machine”

🎮 Detailed Explanation of Automation Content Navigator: The Next-Generation Automation Tool

Automation Content Navigator is a new top-level tool provided by Red Hat Ansible Automation Platform 2 for developing and testing Ansible Playbooks. The tool, invoked as ansible-navigator, replaces and extends the functionality of several earlier command-line tools, including ansible-playbook, ansible-inventory, and ansible-config.

🔧 Core Features

Interactive Interface:

- Provides a text-based user interface (TUI)
- Supports interactive inspection of playbook output
- Can disable interactive mode with the --mode stdout option

Environment Isolation:

- Separates the control node running Ansible from the automation execution environment
- Runs playbooks in containers
- Provides a complete working environment for automation code from deployment to production

📋 Detailed Subcommand Reference Table

The table below describes commonly used subcommands of automation content navigator:

Subcommand | Use Case | Example
run | Run a playbook | ansible-navigator run myplaybook.yml
images | Inspect execution environment images | ansible-navigator images
config | Explore the current Ansible configuration | ansible-navigator config
inventory | Explore an inventory | ansible-navigator inventory -i inventory.ini
welcome | Start at the welcome page with a quick reference | ansible-navigator welcome
collections | Explore the available Ansible collections | ansible-navigator collections
doc | Get documentation for modules and plugins | ansible-navigator doc cisco.ios.ios_command
lint | Check playbook syntax and style (requires ansible-lint) | ansible-navigator lint myplaybook.yml

💡 Practical Tips

Installation Configuration:

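# Install ansible-navigator (RHEL with the Ansible Automation Platform repository enabled;
# on other systems, pip install ansible-navigator is an alternative)
dnf install ansible-core ansible-navigator ansible-lint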

# Run in interactive mode
ansible-navigator

# Run in standard output mode
ansible-navigator run myplaybook.yml --mode stdout

VS Code Integration:

- Install the ansible-core and ansible-lint packages
- With the Ansible extension installed, VS Code automatically performs syntax and style checks on opened playbooks
- Provides a better development experience

Execution Environment Management:

# View available execution environment images
ansible-navigator images

# Use a specific execution environment
ansible-navigator run myplaybook.yml --eei my-custom-ee
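
Command-line options such as --mode and --eei can also be kept in a settings file so that every run uses the same execution environment. The following is a minimal sketch of an ansible-navigator.yml placed in the project directory; the image name my-custom-ee is the placeholder used above, and the chosen values are assumptions to adapt.

```yaml
# ansible-navigator.yml (project-level settings file)
---
ansible-navigator:
  execution-environment:
    enabled: true
    image: my-custom-ee   # placeholder image name
    pull:
      policy: missing     # pull only when the image is not present locally
  mode: stdout            # same effect as --mode stdout
  playbook-artifact:
    enable: false         # do not write a JSON artifact after each run
```

With this file in place, ansible-navigator run myplaybook.yml needs no extra options.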

🎯 Practical Applications for Network Engineers

For network engineers, Automation Content Navigator offers the following advantages:

1. Visual Execution: Real-time view of playbook execution process and results
2. Quick Debugging: Interactive interface makes problem identification more intuitive
3. Environment Consistency: Ensures consistency between development and production environments
4. Documentation Integration: Built-in documentation viewing feature for easy reference on module usage

Example Network Automation Workflow:

# 1. View network module documentation (ios_vlans is the current VLAN module)
ansible-navigator doc cisco.ios.ios_vlans

# 2. Check playbook syntax
ansible-navigator lint network_config.yml

# 3. Execute network configuration
ansible-navigator run network_config.yml --mode stdout

# 4. View execution environment
ansible-navigator images

💼 Real Application Scenarios

🚨 Automated Fault Recovery

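- name: Automatic Recovery from Network Fault
  hosts: all
  gather_facts: no
  tasks:
    - name: Check device connectivity from the control node
      wait_for:
        host: "{{ ansible_host }}"
        port: 22
        timeout: 10
      delegate_to: localhost
      register: connectivity_check
      ignore_errors: yes

    # A device that does not answer on port 22 cannot be reloaded over SSH,
    # so flag it for out-of-band recovery (console server, managed PDU).
    - name: Report device that is offline
      debug:
        msg: "{{ inventory_hostname }} is unreachable on port 22"
      when: connectivity_check is failed

    # needs_reload is a hypothetical flag you set (for example with -e) for
    # reachable devices that must be restarted.
    - name: Reload reachable device when requested
      cisco.ios.ios_command:
        commands:
          - command: reload
            prompt: confirm
            answer: "y"
      when: connectivity_check is succeeded and needs_reload | default(false) | bool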

📊 Automated Configuration Backup

- name: Daily Configuration Backup
  hosts: all
  gather_facts: no
  tasks:
    - name: Get current configuration
      cisco.ios.ios_command:
        commands: show running-config
      register: config_output

    # Write the file on the control node. now() replaces ansible_date_time,
    # which is not collected from network devices.
    - name: Save configuration to file
      copy:
        content: "{{ config_output.stdout[0] }}"
        dest: "/backup/{{ inventory_hostname }}-{{ now(fmt='%Y%m%dT%H%M%S') }}.cfg"
      delegate_to: localhost

🔒 Security Baseline Check

- name: Security Compliance Check
  hosts: all
  tasks:
    - name: Check password complexity
      cisco.ios.ios_command:
        commands: show running-config | include password
      register: password_check

    - name: Mark devices that do not meet requirements
      debug:
        msg: "{{ inventory_hostname }} password policy needs updating!"
      when: "'simple' in password_check.stdout[0]"  # stdout is a list, one entry per command

🛠️ Practical Cases

Case 1: Network Device Initialization Configuration ⭐⭐⭐

---
# Filename: network_device_init.yml
# Function: Batch initialization of basic configuration for network devices
# Applicable Scenario: New device onboarding, standardized deployment
# Author: FYCheung
# Version: 1.0

- name: Network Device Initialization Configuration
  hosts: network_devices
  gather_facts: no  # Network devices usually do not need to gather system facts

  # Define variables
  vars:
    dns_servers:
      - 8.8.8.8
      - 8.8.4.4
    ntp_servers:
      - pool.ntp.org
    snmp_community: "PublicRO_2024"
    location: "Data Center Building A"

  tasks:
    # ====================== 
    # Step 1: Configure basic management information
    # ====================
    - name: Configure device hostname
      cisco.ios.ios_hostname:
        config:
          hostname: "{{ inventory_hostname }}"
      notify: Save Configuration  # Trigger handler to save configuration

    - name: Configure login banner with device location information
      cisco.ios.ios_banner:
        banner: login
        text: |
          ==========================================
          Device: {{ inventory_hostname }}
          Location: {{ location }}
          Administrator: Network Team
          Contact: [email protected]
          ==========================================
          Unauthorized access is prohibited!
          ==========================================
        state: present

    # ====================
    # Step 2: Configure network services
    # ====================
    - name: Configure DNS servers
      cisco.ios.ios_system:
        name_servers: "{{ dns_servers }}"

    - name: Configure NTP servers
      cisco.ios.ios_ntp_global:
        config:
          logging: true
          servers:
            - server: "{{ ntp_servers[0] }}"

    - name: Configure SNMP
      cisco.ios.ios_snmp_server:
        config:
          communities:
            - name: "{{ snmp_community }}"
              ro: true
          contact: "[email protected]"
          location: "{{ location }}"

    # ====================
    # Step 3: Security hardening configuration
    # ====================
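    # crypto key generate is not stored in the running configuration, so this
    # task reports "changed" on every run.
    - name: Set domain name and generate RSA keys for SSH
      cisco.ios.ios_config:
        lines:
          - "ip domain-name company.com"
          - "crypto key generate rsa modulus 2048"

    # Allowing only SSH on the VTY lines is what disables Telnet access
    - name: Enable SSH and disable Telnet
      cisco.ios.ios_config:
        parents: "line vty 0 4"
        lines:
          - "transport input ssh"

    # Placeholder passwords: keep real secrets in Ansible Vault
    - name: Configure console password
      cisco.ios.ios_config:
        parents: "line console 0"
        lines:
          - "password Console@2024"
          - "login"

    - name: Configure enable secret
      cisco.ios.ios_config:
        lines:
          - "enable secret Enable@2024"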

    # ====================
    # Step 4: Basic interface configuration
    # ====================
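    # Assumes hostnames end in "_<last octet>", for example access_10
    - name: Configure management interface
      cisco.ios.ios_l3_interfaces:
        config:
          - name: Vlan1
            ipv4:
              - address: "192.168.1.{{ inventory_hostname.split('_')[-1] }}/24"

    # interface range commands are configuration-mode commands, so they go
    # through ios_config; they are applied on every run.
    - name: Reset all access interfaces to defaults
      cisco.ios.ios_config:
        lines:
          - "default interface range GigabitEthernet0/1 - 24"

    - name: Enable all interfaces
      cisco.ios.ios_config:
        parents: "interface range GigabitEthernet0/1 - 24"
        lines:
          - "no shutdown"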

    # ====================
    # Step 5: Verify configuration
    # ====================
    - name: Verify hostname configuration
      cisco.ios.ios_command:
        commands: show running-config | include hostname
      register: hostname_check

    - name: Display configuration result
      debug:
        msg: "{{ inventory_hostname }} hostname configuration: {{ hostname_check.stdout[0] }}"

    # ====================
    # Handler: Save configuration
    # ====================
  handlers:
    - name: Save Configuration
      cisco.ios.ios_command:
        commands: write memory
      listen: Save Configuration

# ====================
# Pitfall Guide
# ====================
# Pitfall 1: Some IOS versions may hang when generating RSA keys
# Solution: Raise ansible_command_timeout for the task that runs crypto key generate
# 
# Pitfall 2: Interfaces may need a few seconds to take effect after no shutdown
# Solution: Poll the interface state with ios_command (its wait_for option or an until loop) before continuing
# 
# Pitfall 3: Device model differences may cause command incompatibility
# Solution: Add conditional checks or use ios_facts to detect device type
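
Pitfall 2 can be handled by polling the device until the interface comes up. The following is a minimal sketch using a generic until loop; the interface name, the retry count, and the delay are assumed values, and the matched text assumes the usual IOS wording of show interfaces output.

```yaml
- name: Wait until the interface reports line protocol up
  cisco.ios.ios_command:
    commands:
      - show interfaces GigabitEthernet0/1   # assumed interface
  register: intf_state
  # Retry every 3 seconds, up to 10 times, until the text appears
  until: "'line protocol is up' in intf_state.stdout[0]"
  retries: 10
  delay: 3
```

If the interface never comes up, the task fails after the last retry, which stops later tasks from running against a port that is still down.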

Case 2: Batch Backup and Recovery of Network Configuration ⭐⭐⭐⭐

---
# Filename: network_backup_restore.yml
# Function: Backup and recovery of network configuration
# Applicable Scenario: Daily backups, pre-change backups, fault recovery
# Author: FYCheung
# Version: 2.0

- name: Network Configuration Backup and Recovery System
  hosts: network_devices
  gather_facts: no

  # Define variables
  vars:
    backup_dir: "/opt/network_backups"
    backup_retention_days: 30
    email_notification: "[email protected]"

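  tasks:
    # ansible_date_time is not collected from network devices, so take it
    # from the control node; the facts are stored for the current host.
    - name: Collect date and time facts on the control node
      setup:
        gather_subset:
          - min
      delegate_to: localhost
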
    # ====================== 
    # Function 1: Configuration Backup
    # ====================
    - name: Create backup directory
      file:
        path: "{{ backup_dir }}/{{ ansible_date_time.date }}"
        state: directory
        mode: '0755'
      delegate_to: localhost  # Create the directory on the control node
      run_once: true  # Only execute once per play

    - name: Generate backup filename
      set_fact:
        backup_file: "{{ backup_dir }}/{{ ansible_date_time.date }}/{{ inventory_hostname }}-{{ ansible_date_time.iso8601 }}.backup"

    - name: Perform configuration backup
      block:
        - name: Get full configuration
          cisco.ios.ios_command:
            commands:
              - "show running-config"
              - "show version"
              - "show vlan brief"
              - "show ip interface brief"
          register: config_backup

        - name: Save configuration to file
          copy:
            content: |
              ============================================
              Device: {{ inventory_hostname }}
              Backup Time: {{ ansible_date_time.iso8601 }}
              Operator: Ansible Automated Backup System
              ============================================

              = Device Version Information =
              {{ config_backup.stdout[1] }}

              = Complete Running Configuration =
              {{ config_backup.stdout[0] }}

              = VLAN Information =
              {{ config_backup.stdout[2] }}

              = Interface Status =
              {{ config_backup.stdout[3] }}
              ============================================
            dest: "{{ backup_file }}"
          delegate_to: localhost

        - name: Record backup log
          lineinfile:
            path: "{{ backup_dir }}/backup_log.csv"
            line: "{{ ansible_date_time.iso8601 }},{{ inventory_hostname }},{{ backup_file }},SUCCESS"
            create: yes
          delegate_to: localhost

      rescue:
        - name: Handle backup failure
          debug:
            msg: "{{ inventory_hostname }} configuration backup failed!"

        - name: Record failure log
          lineinfile:
            path: "{{ backup_dir }}/backup_log.csv"
            line: "{{ ansible_date_time.iso8601 }},{{ inventory_hostname }},FAILED,Backup Failed"
            create: yes
          delegate_to: localhost

    # ====================
    # Function 2: Configuration Recovery
    # ====================
    # restore_file_path must point to a plain IOS configuration file. The backup
    # report written above also holds headers and show output, so it cannot be loaded.
    - name: Restore configuration (requires specifying backup file)
      cisco.ios.ios_config:
        src: "{{ restore_file_path }}"
        save_when: modified
      when: restore_config | default(false) | bool
      register: restore_result

    - name: Verify recovery result
      debug:
        msg: "{{ inventory_hostname }} configuration recovery{{ ' successful' if restore_result.changed else ' unchanged' }}"
      when: restore_config | default(false) | bool

    # ====================
    # Function 3: Backup File Cleanup
    # ====================
    - name: Find expired backup files
      find:
        paths: "{{ backup_dir }}"
        age: "{{ backup_retention_days }}d"
        patterns: "*.backup"
        recurse: yes  # Backups live in per-date subdirectories
      register: old_backups
      delegate_to: localhost
      run_once: true

    - name: Delete expired backups
      file:
        path: "{{ item.path }}"
        state: absent
      loop: "{{ old_backups.files }}"
      delegate_to: localhost
      run_once: true

    - name: Generate cleanup report
      debug:
        msg: "{{ old_backups.files|length }} expired backup files cleaned up"
      when: old_backups.files|length > 0
      run_once: true

    # ====================
    # Function 4: Configuration Difference Comparison
    # ====================
    - name: Configuration difference check
      block:
        - name: Get current configuration
          cisco.ios.ios_command:
            commands: show running-config
          register: current_config

        - name: Read configuration from backup file
          slurp:
            src: "{{ backup_file }}"
          register: backup_config_content
          delegate_to: localhost

        - name: Compare configuration differences
          copy:
            content: |
              ============================================
              Configuration Difference Analysis Report
              Device: {{ inventory_hostname }}
              Analysis Time: {{ ansible_date_time.iso8601 }}
              ============================================

              Backup Configuration: {{ backup_file }}

              Note: Please manually compare the following configurations to find differences

              = Current Configuration Start =
              {{ current_config.stdout[0][:500] }}...

              = Backup Configuration Start =
              {{ (backup_config_content.content|b64decode)[:500] }}...
              ============================================
            dest: "{{ backup_dir }}/{{ inventory_hostname }}-diff-{{ ansible_date_time.date }}.txt"
          delegate_to: localhost

      when: compare_config is defined and compare_config|bool

    # ====================
    # Function 5: Email Notification
    # ====================
    - name: Send backup completion notification
      mail:
        host: smtp.company.com
        port: 587
        username: "[email protected]"
        password: email_password  # Placeholder; keep the real value in Ansible Vault
        to: "{{ email_notification }}"
        subject: "Network Configuration Backup Completed - {{ ansible_date_time.date }}"
        body: |
          Network configuration backup task has been completed

          Backup Date: {{ ansible_date_time.date }}
          Number of Backup Devices: {{ groups['network_devices']|length }}
          Backup Directory: {{ backup_dir }}/{{ ansible_date_time.date }}

          Please log in to the server to view detailed backup logs.

          This email is sent by the Ansible automation system.
      run_once: true
      delegate_to: localhost
      when: email_notification is defined

# ====================
# Pitfall Guide
# ====================
# Pitfall 1: Large device configurations may time out
# Solution: Increase the ansible_command_timeout parameter or retrieve the configuration in sections
# 
# Pitfall 2: Some devices have different output formats for show commands
# Solution: Extract key information from the registered output with filters such as regex_search or regex_findall
# 
# Pitfall 3: Restoring configurations may cause service interruptions
# Solution: Execute during maintenance windows or add configuration pre-checks
Case 3: Network Health Check and Monitoring ⭐⭐⭐⭐⭐

---
# Filename: network_health_check.yml
# Function: Health status check and monitoring of network devices
# Applicable Scenario: Daily inspections, preventive maintenance, fault alerts
# Author: FYCheung
# Version: 3.0

- name: Network Device Health Check and Monitoring System
  hosts: network_devices
  gather_facts: no

  # Define variables
  vars:
    health_report_dir: "/opt/network_health_reports"
    warning_cpu_threshold: 80    # CPU usage warning threshold
    warning_memory_threshold: 80  # Memory usage warning threshold
    warning_interface_threshold: 70  # Interface traffic threshold (Mbps)
    alert_email: "[email protected]"

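  tasks:
    # ansible_date_time is not collected from network devices, so take it
    # from the control node; the facts are stored for the current host.
    - name: Collect date and time facts on the control node
      setup:
        gather_subset:
          - min
      delegate_to: localhost
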
    # ====================== 
    # Step 1: Basic Connectivity Check
    # ====================
    - name: Test device connectivity
      wait_for:
        host: "{{ ansible_host }}"
        port: 22
        timeout: 10
      delegate_to: localhost
      register: connectivity_test
      ignore_errors: yes

    - name: Record connectivity status
      set_fact:
        connectivity_status: "{{ 'Abnormal' if connectivity_test is failed else 'Normal' }}"

    # ====================== 
    # Step 2: Device Hardware Status Check
    # ====================
    - name: Get device hardware information
      block:
        - name: Collect device information
          cisco.ios.ios_facts:
            gather_subset:
              - hardware
              - interfaces
              - config
          register: device_facts

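        # ios_facts does not return a CPU percentage, so read it from the CLI
        - name: Read five-minute CPU utilization
          cisco.ios.ios_command:
            commands: show processes cpu | include CPU utilization
          register: cpu_output

        - name: Check CPU usage
          set_fact:
            cpu_status: "{{ 'Warning' if (cpu_output.stdout[0] | regex_findall('five minutes: ([0-9]+)%') | first | int) > warning_cpu_threshold else 'Normal' }}"

        # Memory usage is derived from ansible_net_memtotal_mb and ansible_net_memfree_mb
        - name: Check memory usage
          set_fact:
            memory_status: "{{ 'Warning' if (100 * (device_facts.ansible_facts.ansible_net_memtotal_mb - device_facts.ansible_facts.ansible_net_memfree_mb) / device_facts.ansible_facts.ansible_net_memtotal_mb) > warning_memory_threshold else 'Normal' }}"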

      rescue:
        - name: Handle hardware check failure
          set_fact:
            cpu_status: "Check Failed"
            memory_status: "Check Failed"

    # ====================
    # Step 3: Interface Status Check
    # ====================
    - name: Get detailed interface information
      cisco.ios.ios_command:
        commands:
          - "show interfaces description"
          - "show interfaces counters errors"
          - "show ip interface brief"
      register: interface_info
      when: connectivity_status == "Normal"

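    - name: Analyze interface status
      set_fact:
        interface_issues: []
        interface_up_count: 0
        interface_down_count: 0

    # Count from "show ip interface brief" (third command): a line ending in
    # "up up" is fully up; a line ending in "down" has its protocol down.
    - name: Parse interface status
      set_fact:
        interface_up_count: "{{ interface_info.stdout[2].splitlines() | select('search', 'up +up *$') | list | length }}"
        interface_down_count: "{{ interface_info.stdout[2].splitlines() | select('search', 'down *$') | list | length }}"
      when: connectivity_status == "Normal"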

    - name: Count interface status
      debug:
        msg:
          - "Device {{ inventory_hostname }} interface status:"
          - "UP interfaces: {{ interface_up_count }}"
          - "DOWN interfaces: {{ interface_down_count }}"
      when: connectivity_status == "Normal"

    # ====================
    # Step 4: Network Service Check
    # ====================
    - name: Check critical network services
      block:
        - name: Check VTP status
          cisco.ios.ios_command:
            commands: show vtp status
          register: vtp_status
          when: "'switch' in device_facts.ansible_facts.ansible_net_model|lower"

        - name: Check STP status
          cisco.ios.ios_command:
            commands: show spanning-tree root
          register: stp_status

        - name: Check HSRP status
          cisco.ios.ios_command:
            commands: show standby brief
          register: hsrp_status
          ignore_errors: yes

      rescue:
        - name: Handle service check failure
          debug:
            msg: "{{ inventory_hostname }} some network service checks failed"

    # ====================
    # Step 5: Generate Health Report
    # ====================
    - name: Create health report directory
      file:
        path: "{{ health_report_dir }}"
        state: directory
      run_once: true
      delegate_to: localhost

    - name: Generate device health report
      template:
        src: health_report_template.j2
        dest: "{{ health_report_dir }}/{{ inventory_hostname }}_health_{{ ansible_date_time.date }}.html"
      delegate_to: localhost
      when: connectivity_status == "Normal"

    # ====================== 
    # Step 6: Alert Handling
    # ====================
    - name: Check alert conditions
      set_fact:
        alert_conditions: []

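    # Each condition is evaluated on its own, so only matching alerts are added
    - name: Collect alert conditions
      set_fact:
        alert_conditions: "{{ alert_conditions + [item.message] }}"
      when: item.triggered | bool
      loop:
        - message: "Device Offline"
          triggered: "{{ connectivity_status == 'Abnormal' }}"
        - message: "CPU Usage Too High"
          triggered: "{{ cpu_status == 'Warning' }}"
        - message: "Memory Usage Too High"
          triggered: "{{ memory_status == 'Warning' }}"
        - message: "Interface Abnormal"
          triggered: "{{ interface_down_count | int > 0 }}"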

    - name: Send alert email
      mail:
        host: smtp.company.com
        port: 587
        username: "[email protected]"
        password: email_password  # Placeholder; keep the real value in Ansible Vault
        to: "{{ alert_email }}"
        subject: "⚠️ Network Device Alert - {{ inventory_hostname }}"
        body: |
          ⚠️ Network Device Health Check Alert ⚠️

          Device Name: {{ inventory_hostname }}
          Device IP: {{ ansible_host }}
          Check Time: {{ ansible_date_time.iso8601 }}

          Alert Details:
          - Connectivity Status: {{ connectivity_status }}
          - CPU Status: {{ cpu_status }}
          - Memory Status: {{ memory_status }}
          - Interface Status: UP={{ interface_up_count }}, DOWN={{ interface_down_count }}

          Alert Conditions: {{ alert_conditions|join(', ') if alert_conditions|length > 0 else 'None' }}

          Please check the device status immediately!

          This email is sent by the Ansible automation monitoring system.
      when: alert_conditions|length > 0
      delegate_to: localhost

    # ====================== 
    # Step 7: Record Performance Data
    # ====================
    - name: Record performance data to CSV
      lineinfile:
        path: "{{ health_report_dir }}/performance_log.csv"
        line: "{{ ansible_date_time.iso8601 }},{{ inventory_hostname }},{{ ansible_host }},{{ connectivity_status }},{{ cpu_status }},{{ memory_status }},{{ interface_up_count }},{{ interface_down_count }}"
        create: yes
      delegate_to: localhost

    # ====================
    # Step 8: Generate Comprehensive Report
    # ====================
    - name: Generate comprehensive health report
      copy:
        content: |
          ============================================
          Network Device Health Check Comprehensive Report
          Generation Time: {{ ansible_date_time.iso8601 }}
          Total Number of Devices Checked: {{ groups['network_devices']|length }}
          ==============================================

          {% for host in groups['network_devices'] %}
          {{ host }}:
            - Connectivity: {{ hostvars[host].connectivity_status|default('Unknown') }}
            - CPU Status: {{ hostvars[host].cpu_status|default('Unknown') }}
            - Memory Status: {{ hostvars[host].memory_status|default('Unknown') }}
            - Interface Status: UP={{ hostvars[host].interface_up_count|default(0) }}, DOWN={{ hostvars[host].interface_down_count|default(0) }}
          {% endfor %}

          ==============================================
          Alert Device List:
          {% for host in groups['network_devices'] %}
          {% if hostvars[host].alert_conditions is defined and hostvars[host].alert_conditions|length > 0 %}
          - {{ host }}: {{ hostvars[host].alert_conditions|join(', ') }}
          {% endif %}
          {% endfor %}
          ==============================================

          For detailed reports, please check the HTML files in the {{ health_report_dir }} directory.
        dest: "{{ health_report_dir }}/summary_report_{{ ansible_date_time.date }}.txt"
      run_once: true
      delegate_to: localhost

---
# Report Template File (health_report_template.j2)




<!DOCTYPE html>
<html>
<head>
    <title>{{ inventory_hostname }} Health Check Report</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 20px; }
        .header { background: #f0f0f0; padding: 20px; border-radius: 5px; }
        .status-ok { color: green; font-weight: bold; }
        .status-warning { color: orange; font-weight: bold; }
        .status-error { color: red; font-weight: bold; }
        table { border-collapse: collapse; width: 100%; margin: 20px 0; }
        th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
        th { background-color: #f2f2f2; }
    </style>
</head>
<body>
    <div class="header">
        <h1>{{ inventory_hostname }} Health Check Report</h1>
        <p>Check Time: {{ ansible_date_time.iso8601 }}</p>
        <p>Device IP: {{ ansible_host }}</p>
    </div>

    <h2>System Status Overview</h2>
    <table>
        <tr><th>Check Item</th><th>Status</th><th>Details</th></tr>
        <tr>
            <td>Connectivity</td>
            <td class="{% if connectivity_status == 'Normal' %}status-ok{% else %}status-error{% endif %}">{{ connectivity_status }}</td>
            <td>Device SSH connection status</td>
        </tr>
        <tr>
            <td>CPU Usage</td>
            <td class="{% if cpu_status == 'Normal' %}status-ok{% else %}status-warning{% endif %}">{{ cpu_status }}</td>
            <td>Current CPU load status</td>
        </tr>
        <tr>
            <td>Memory Usage</td>
            <td class="{% if memory_status == 'Normal' %}status-ok{% else %}status-warning{% endif %}">{{ memory_status }}</td>
            <td>Current memory usage status</td>
        </tr>
    </table>

    <h2>Interface Status</h2>
    <table>
        <tr><th>Status Type</th><th>Count</th></tr>
        <tr><td>UP Interfaces</td><td>{{ interface_up_count }}</td></tr>
        <tr><td>DOWN Interfaces</td><td>{{ interface_down_count }}</td></tr>
    </table>

    <div style="margin-top: 30px;font-size: 12px;color: #666">
        Report generated by Ansible automation system | {{ ansible_date_time.iso8601 }}
    </div>
</body>
</html>



# ====================== 
# Pitfall Guide
# ======================
# Pitfall 1: Some devices do not support ios_facts module
# Solution: Use traditional command methods to obtain information, or add device type checks
# 
# Pitfall 2: Network latency may cause checks to time out
# Solution: Appropriately increase timeout parameters or use asynchronous execution mode
# 
# Pitfall 3: Email sending failures affect the entire task
# Solution: Place email sending in block/rescue to avoid affecting main check functionality
# 
# Pitfall 4: HTML template file path issues
# Solution: Use the template module and ensure the template file is in the correct templates directory

🎯 Pitfall Guide: Those Years I Stumbled

💥 Pitfall 1: SSH Authentication Issues

Symptoms:<span><span>"Failed to connect to the host via SSH"</span></span>Solution:

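# Check authentication information in the inventory file
[all:vars]
# CLI-based network modules need the network_cli connection plugin
ansible_connection=ansible.netcommon.network_cli
ansible_network_os=cisco.ios.ios
ansible_user=admin
ansible_password=your_password

# Or use SSH key authentication instead of ansible_password
ansible_ssh_private_key_file=~/.ssh/id_rsa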
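
Keeping the password in plain text in the inventory is fine for a lab but risky elsewhere. One possible alternative, shown here as a sketch, is to reference a variable stored in an Ansible Vault encrypted file. The file layout and the variable name vault_ansible_password are assumptions for illustration, not part of the original setup.

```yaml
# group_vars/all/vars.yml (hypothetical file layout)
ansible_user: admin
ansible_password: "{{ vault_ansible_password }}"   # resolved from the encrypted file below

# group_vars/all/vault.yml (hypothetical), containing a single line:
#   vault_ansible_password: your_password
# Encrypt it once with:
#   ansible-vault encrypt group_vars/all/vault.yml
```

Playbooks then need the vault password at run time, for example by passing --ask-vault-pass to ansible-playbook.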

💥 Pitfall 2: Permission Issues

Symptoms: "command requires privileged access"

Solution:

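# The provider argument is deprecated and no longer accepted by recent
# cisco.ios releases; use become with the enable method instead.
# Login credentials come from the inventory (ansible_user / ansible_password).
- name: Privileged execution example
  cisco.ios.ios_command:
    commands: show running-config
  become: true  # Enter privileged (enable) mode
  become_method: enable
  vars:
    ansible_become_password: enable_password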

💥 Pitfall 3: Special Characters in Network Devices

Symptoms: Parsing fails when the configuration contains special symbols

Solution:

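- name: Correctly handle special characters
  cisco.ios.ios_config:
    parents: interface GigabitEthernet0/1  # Example interface (assumed); adjust to your device
    lines:
      - "description \"Server Room - Zone A\""  # Escape inner quotes inside a double-quoted YAML string
      - "switchport access vlan 100"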

💥 Pitfall 4: Idempotency Failure

Symptoms: Every time the Playbook runs, it shows changed

Solution:

- name: Ensure configuration idempotency
  cisco.ios.ios_vlans:  # Resource module that replaces the deprecated ios_vlan
    config:
      - vlan_id: 100
        name: Management_VLAN
    state: merged  # Explicitly specify desired state
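
Another common cause of a task that always reports changed is passing abbreviated commands to cisco.ios.ios_config. The module compares the lines as text against the running configuration, so they should be written the way the device stores them. A minimal sketch, assuming a hypothetical interface GigabitEthernet0/1 and description text:

```yaml
# Not idempotent: abbreviations never match the text in the running-config
- name: Set description (reports changed on every run)
  cisco.ios.ios_config:
    parents: int gi0/1
    lines:
      - desc Uplink to core

# Idempotent: full commands, as they appear in "show running-config"
- name: Set description (changed only when needed)
  cisco.ios.ios_config:
    parents: interface GigabitEthernet0/1
    lines:
      - description Uplink to core
```

Running the playbook twice is a quick check: the second run should report ok rather than changed.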

🚀 Comeback Moment: From Manual to Automated Comparison

📊 Efficiency Comparison Table

Task Type                     | Manual Operation Time | Ansible Automated Time
Configure VLAN on 10 switches | 2 hours               | 3 minutes
Backup all devices            | 4 hours               | 5 minutes
Security baseline check       | 1 day                 | 10 minutes
Fault recovery                | 30 minutes            | 2 minutes

💰 Cost Savings

Time Cost:

- Save 20 hours of manual configuration time per week
- Save over 1000 hours of labor annually
- Equivalent to hiring 0.5 additional network engineers

Error Cost:

- Human error rate reduced by 90%
- Fault recovery time reduced by 80%
- System stability improved by 60%

🎓 Learning Path: Advanced Automation Route for Network Engineers

🥇 Stage 1: Basic Mastery (1-2 weeks)

- Learn the basics of YAML syntax
- Understand core concepts of Ansible
- Complete the first Hello World Playbook

✅ After completion, read: Part 2 “Playbook Practical Wizard”

🥈 Stage 2: Playbook Practical Expert (2-4 weeks)

- Deeply master Ansible inventory management skills
- Learn to write complex multi-vendor Playbooks
- Master privilege escalation and error handling mechanisms
- Understand idempotency and handlers usage

✅ Completion of this chapter will make you a Playbook practical expert

🥉 Stage 3: Enterprise Architect (1-2 months)

- Learn Git version control and team collaboration
- Master Automation Controller platform management
- Deepen troubleshooting and debugging skills
- Understand enterprise-level project structure and best practices
- Build complete automation workflows

✅ Published: Part 3 “Enterprise-Level Automation Architect”

🏆 Stage 4: Variable Wizard (2-4 weeks)

- Deeply understand Ansible variable management
- Master Facts collection and magic variables
- Learn survey functions and filter applications
- Implement dynamic, data-driven configuration
- Build intelligent automation systems

✅ Published: Part 4 “The Secret Weapon of Variable Wizards”

🎖️ Stage 5: Task Control Master (2-4 weeks)

- Learn loop control and conditional judgment
- Master block/rescue/always error handling
- Deeply understand workflow design
- Build enterprise-level error recovery mechanisms
- Implement intelligent decision-making and automation processes

✅ Published: Part 5 “Task Control Master: From Executor to Decision Maker”

🏆 Stage 6: DevOps Network Engineer (2-4 weeks)

- Learn Git version control and team collaboration
- Master CI/CD integration and automated testing
- Deeply understand DevOps culture and practices
- Build automation workflows and release management
- Implement containerized deployment and microservices integration

✅ Published: Part 6 “DevOps and Network Automation: From Operations to Engineers”

🏅 Stage 7: Network Automation Architect (3-6 months)

- Learn infrastructure awareness and intelligent device management
- Master advanced applications of the Jinja2 template engine
- Deeply understand platform-agnostic modules and vendor abstraction
- Build rolling updates and zero-downtime deployments
- Design enterprise-level automation architecture and solutions
- Implement AIOps and intelligent operations systems

✅ Published: Part 7 “Automating Network Management Tasks: From Engineer to Architect”

🎉 Summary: Your Comeback Journey Starts Today!

🌟 Core Takeaways

Technical Aspects:

- Mastered the core technologies of Ansible network automation
- Learned 3 practical Playbook cases
- Understood enterprise-level best practices

Mindset Aspects:

- Shifted from “operational thinking” to “architectural thinking”
- Shifted from “passive response” to “proactive prevention”
- Shifted from “repetitive labor” to “value creation”

🚀 Next Steps

1. Get Started Immediately: Install Ansible and set up a test environment
2. Take Small Steps: Start with simple configuration backups
3. Continuous Learning: Follow the Ansible official documentation and community
4. Share Experiences: Promote automation concepts within the team
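
As a starting point for the "simple configuration backups" step, a first playbook could look like the sketch below. The inventory group name switches and the backups directory are assumptions, and the inventory is expected to already define the connection settings for the devices.

```yaml
- name: Back up running configurations
  hosts: switches                # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Save the running-config to the control node
      cisco.ios.ios_config:
        backup: true
        backup_options:
          dir_path: ./backups    # hypothetical target directory
          filename: "{{ inventory_hostname }}.cfg"
```

Because the task only reads the configuration, it is a low-risk way to confirm that connectivity and credentials work before attempting any changes.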

📝 Network Engineer Comeback Declaration

“I am no longer a configuration worker, I am an automation architect!”

“My value lies not in the speed of typing, but in the ability to solve problems with code!”

“Let the repetitive work be done by machines, allowing me to focus on more creative tasks!”

🔮 Series Preview: The Complete Comeback Journey of Network Engineers

📚 Series Article Planning

Part 1: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (1) – The Comeback of CLI Challengers ✅ Published

Introduction to Ansible core concepts and enterprise-level platform

Part 2: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (2) – Playbook Practical Wizard ✅ Published

In-depth Playbook writing, unified management across multiple vendors, 3 enterprise-level practical cases

Part 3: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (3) – Enterprise-Level Automation Architect ✅ Published

Git version control, Automation Controller platform, troubleshooting expert skills

Part 4: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (4) – The Secret Weapon of Variable Wizards ✅ Published

In-depth understanding of variable management, Facts collection, intelligent configuration

Part 5: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (5) – Task Control Master: From Executor to Decision Maker ✅ Published

Loop control, conditional judgment, error handling, workflow design

Part 6: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (6) – DevOps and Network Automation ✅ Published

CI/CD integration, automated testing, release management

Part 7: From ‘Configuration Worker’ to ‘Hands-Off Manager’ (7) – Automating Network Management Tasks ✅ Published

Infrastructure awareness, Jinja2 templates, rolling updates, architect thinking

🚀 Continuous Advancement Path

After completing this series, you can further learn:

- Cloud-Native Network Automation: Kubernetes, service mesh
- AIOps and Machine Learning: Intelligent operations and predictive analytics
- Network Programming Development: Python, Go, Network API
- Security Automation: Zero trust, compliance automation
- Multi-Cloud Network Management: Cross-cloud platform network automation

💡 Learning Suggestions

Recommended Learning Order:

1. Part 1 → Establish basic understanding, learn what Ansible can do for you
2. Part 2 → Master core skills, able to write complex Playbooks
3. Part 3 → Enhance architectural thinking, design enterprise-level automation solutions
4. Part 4 → Deepen variable management, achieve intelligent configuration
5. Part 5 → Master task control, build robust automation systems
6. Part 6 → Integrate DevOps, become a full-stack automation expert
7. Part 7 → Reach architect level, design enterprise-level automation architecture

Practical Suggestions:

- Each article comes with complete practical cases; practicing them in a test environment is recommended
- Start with simple configuration management, then gradually move on to task control and intelligent scenarios
- Share learning outcomes within the team and promote the automation transformation
- Establish a personal learning lab and accumulate practical experience
- Focus on mastering the task control skills in Part 5, which are key to building enterprise-level automation systems

🎉 Congratulations, you have already started your network engineer comeback journey!

The complete series from “Configuration Worker” to “Hands-Off Manager” will help you become a true network automation expert!

This document is produced by the “Network Engineer Comeback Plan”, feel free to share it and let more network engineers embark on the automation journey!

If you find this article helpful, please like and bookmark it, your support is my motivation to continue creating!

If you have any questions or suggestions, feel free to leave a comment for discussion!

📧 Contact: [email protected]
🌐 Learning Community: Source Universe Station 13 – Network Automation Channel
📚 Related Resources: Red Hat official documentation, Ansible best practices guide

Copyright © 2025 FYCheung Network Engineer Comeback Plan. All rights reserved.
