How to Solve Backup Time Issues with Large Virtualization Capacity?

Whether the virtualization platform is domestic or foreign, full backups of it can grow very large. Requirements sometimes also call for off-site backups, which may need dedicated lines (leaving cost aside for now). In practice, a weekly full backup, for example, runs into various limits: the data volume is too large for the backup to finish, or the backup runs so long that it approaches business production hours. The full backup time also keeps growing. Is there a good solution or approach to this problem?

*The question comes from community member @JAGXU, Storage Operations Management. The following answers from community peers are shared for reference.

@XiaoGua, System Architect at a national joint-stock bank:

First, consider the case where the virtualization platform is large, with a single cloud platform or virtualization cluster exceeding 1000 VMs or even reaching 5000 VMs. In one case, a client had 7 cloud zones, the largest of which exceeded 5000 VMs, and used a cluster of backup appliances for its backups. For example, a single 300TB backup appliance can cover roughly 200-400 VMs, with daily backup jobs kept to around 200. Based on the virtual machines' data protection needs, the backup strategy is configured either as staggered weekly full backups with incremental backups every 1-3 days, or as staggered full backups every two weeks with no incremental backups; retention is 31 or 62 days. Each backup job is sized so that most finish within 30 minutes, while oversized VMs are handled separately. On this basis, 10-15 backup appliances can complete the backups for 5000 VMs. In addition, every virtualization host needs independent network ports pre-configured for backup traffic, and the access-layer or aggregation-layer switches need independent ports that connect directly to the backup appliances (routing backup traffic through the core network would cause congestion). Ideally, a separate backup network plane is built to carry backup traffic.

Second, oversized virtual machines (generally those over 2TB) need a separate backup strategy (avoiding concurrency in the same time period, configuring separate storage pools, etc.) to improve backup performance and shorten the backup window. For certain batches of large-capacity virtual machines, a dedicated appliance can be used, as long as the backups complete within 6 hours.

Third, on off-site backups and dedicated-line bandwidth: because virtual machine data is large in scale and highly redundant, off-site replication strategies need to be targeted. For example, virtual machines that do not need to be replicated can be backed up to a separate storage pool, which keeps replication strategies from becoming overly complex. A deduplication pool must also be configured so that data is replicated after deduplication, which saves bandwidth.
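
As a rough check on the sizing in the first point, the appliance count is simply the VM total divided by the per-appliance density, rounded up. The sketch below is illustrative arithmetic only; `appliances_needed` is a hypothetical helper, not part of any backup product. With the figures quoted above, 200-400 VMs per appliance works out to 13-25 appliances for 5000 VMs, so a 10-15 appliance cluster sits at the dense end of that range or somewhat beyond it (roughly 334-500 VMs per appliance).

```python
import math


def appliances_needed(total_vms: int, vms_per_appliance: int) -> int:
    """Appliances required, rounded up: a partly used appliance is still a whole unit."""
    if total_vms < 0 or vms_per_appliance <= 0:
        raise ValueError("total_vms must be >= 0 and vms_per_appliance must be > 0")
    return math.ceil(total_vms / vms_per_appliance)


TOTAL_VMS = 5000  # largest cloud zone in the case described above

# Density range quoted above: 200-400 VMs per 300TB appliance.
for density in (200, 400):
    print(f"{density} VMs/appliance -> {appliances_needed(TOTAL_VMS, density)} appliances")

# Inverse view: how many VMs each appliance carries in a 10-15 appliance cluster.
for count in (10, 15):
    print(f"{count} appliances -> {math.ceil(TOTAL_VMS / count)} VMs each")
```

Actual density depends on VM sizes, change rates, and retention, so the result is best treated as a starting estimate to validate against a pilot.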

@nkj2021, System Architect at a securities company:

To address the issues of large full backup capacity, long backup times, and potential interference with business production in a virtualization environment, the following solutions and ideas can be considered:

1. Due to the large data volume of full backups, consider performing full backups at a specific time each week (e.g., weekends) and incremental backups at other times. This ensures data integrity while reducing backup time and data volume. Using data compression technology during the backup process can significantly reduce the size of backup data, while deduplication technology can eliminate duplicate data, further reducing the space and time required for backups.

2. Schedule backup operations during off-peak business hours or maintenance windows to avoid interfering with production. Spreading backup operations across multiple time periods, rather than concentrating them in one, reduces the data volume of each run and shortens each backup.

3. Choosing high-performance storage devices can accelerate the read and write speed of backup data, improving backup efficiency. For off-site backup scenarios, using dedicated lines or high-bandwidth networks can ensure the rapid transmission of backup data, reducing transmission time.

4. Select mature backup software. Such software typically offers automation, scalability, and reliability, and lets you adjust backup strategies to actual needs, which improves backup efficiency.
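
The idea of spreading backups over several periods (point 2) can be sketched as a simple load-balancing problem. The example below is a minimal sketch, assuming VM sizes are known in advance: it assigns each VM's weekly full backup to the day with the least data scheduled so far, largest VM first, and treats every other day as an incremental. The function names and VM names are hypothetical, and a real backup product would apply its own scheduler.

```python
def stagger_full_backups(vm_sizes_gb: dict[str, float], days: tuple[str, ...]) -> dict[str, list[str]]:
    """Greedy balancing: place the largest VM first, always on the least-loaded day."""
    load = {day: 0.0 for day in days}
    plan: dict[str, list[str]] = {day: [] for day in days}
    for vm, size in sorted(vm_sizes_gb.items(), key=lambda item: item[1], reverse=True):
        day = min(days, key=lambda d: load[d])  # ties go to the earlier day
        plan[day].append(vm)
        load[day] += size
    return plan


def job_type(vm: str, today: str, plan: dict[str, list[str]]) -> str:
    """A VM gets a full backup on its assigned day and an incremental otherwise."""
    return "full" if vm in plan.get(today, []) else "incremental"


# Illustrative sizes in GB; all values are made up for the example.
vms = {"vm-a": 800, "vm-b": 500, "vm-c": 300, "vm-d": 200}
plan = stagger_full_backups(vms, ("Sat", "Sun"))
print(plan)
print(job_type("vm-a", "Sun", plan))
```

Passing all seven weekdays instead of only the weekend spreads the same full-backup volume over more, smaller windows.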

@Emei Mountain Practitioner, QA Engineer:

To address the issue of excessive backup time in virtualization environments, combining industry practices and technical solutions, here are comprehensive solutions and strategies:

1. Optimize backup strategies: Reduce the frequency and data volume of full backups.

1. Combine incremental/differential backups with full backups.

Adopt a strategy of “staggered weekly full backups + daily incremental backups” to reduce the frequency of full backups. For example, perform full backups during off-peak business hours (such as weekends) and back up only incremental data on weekdays.

For oversized virtual machines (e.g., over 2TB), configure separate backup strategies to avoid competing for resources with other tasks.

2. Data classification and streamlining.

Classify data based on importance: core business data is backed up frequently, while non-critical data is backed up infrequently or on demand.

Clean up redundant data: Identify useless data (such as temporary files, duplicate logs) to reduce the backup source capacity.
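
The classification rules in this section can be written down as a small decision function. This is only a sketch: the `VM` record and the policy names are hypothetical labels for illustration, and the 2TB threshold is the one quoted above.

```python
from dataclasses import dataclass

OVERSIZED_GB = 2048  # "over 2TB" threshold mentioned above


@dataclass(frozen=True)
class VM:
    name: str
    size_gb: float
    critical: bool  # True for core business data


def pick_policy(vm: VM) -> str:
    """Map a VM to a (hypothetical) backup policy name."""
    if vm.size_gb > OVERSIZED_GB:
        # Oversized VMs get their own schedule so they do not compete with other jobs.
        return "oversized-dedicated"
    if vm.critical:
        return "core-frequent"
    return "noncritical-infrequent"


for vm in (VM("erp-db", 4096, True), VM("web-01", 200, True), VM("test-07", 150, False)):
    print(vm.name, "->", pick_policy(vm))
```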

2. Technical means to enhance backup efficiency.

1. Data compression and deduplication technology.

Enable deduplication and compression at the source to reduce the amount of data transmitted and stored. For example, in the Dingjia case, bandwidth was saved through a deduplication pool, suitable for off-site backup scenarios.

Block-level incremental backups: Utilize the virtualization platform’s CBT (Changed Block Tracking) or storage snapshot technology to back up only the changed data blocks, shortening backup time.

2. Parallel backups and distributed architecture.

Use a backup appliance cluster to enhance concurrent processing capabilities through horizontal scaling. For example, a single backup appliance supports 200-400 VMs, and 5000 VMs can achieve efficient backups through a cluster of 10-15 devices.

Staged scheduling: Distribute backup tasks across multiple time periods to avoid resource contention.

3. High-performance hardware support.

Use 25G/100G network cards, all-flash storage, and other devices to enhance I/O performance and shorten read/write times.
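
To see why deduplication shrinks what is stored and transmitted, consider the toy block store below. It is a minimal sketch, assuming fixed-size 4KB blocks identified by SHA-256; commercial products use their own chunking and indexing, and hypervisor CBT tracks changed blocks directly rather than hashing them. `DedupStore` is a hypothetical class written for this illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size


class DedupStore:
    """Stores each distinct block once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def backup(self, data: bytes) -> tuple[list[str], int]:
        """Return the recipe (ordered digests) and the count of newly stored blocks."""
        recipe: list[str] = []
        new_blocks = 0
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                new_blocks += 1
            recipe.append(digest)
        return recipe, new_blocks

    def restore(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[digest] for digest in recipe)


store = DedupStore()
disk_v1 = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
disk_v2 = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE + b"A" * BLOCK_SIZE  # one block changed

recipe_v1, new_v1 = store.backup(disk_v1)
recipe_v2, new_v2 = store.backup(disk_v2)
print(new_v1, new_v2)
```

The first backup stores two blocks for three blocks of data, and the second stores only the one block that changed, which is the effect that saves both backup time and replication bandwidth.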

3. Architecture design and network optimization.

1. Independent backup network plane.

Configure independent network ports or dedicated networks for backup traffic to avoid competing for bandwidth with production business. For example, isolate backup traffic through aggregation layer switches to prevent core network congestion.

2. Off-site backup optimization.

Prioritize the transmission of incremental data and use deduplication technology to reduce dedicated line bandwidth usage.

Utilize CDP (Continuous Data Protection) technology to synchronize data at second-level granularity, shortening the recovery time window.
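
For off-site replication, a quick estimate of the dedicated-line bandwidth helps show what deduplication buys. The sketch below assumes decimal units (1 GB = 8000 megabits) and ignores protocol overhead; `required_mbps` is a hypothetical helper, and the input figures are made up for illustration.

```python
def required_mbps(changed_gb: float, dedup_ratio: float, window_hours: float) -> float:
    """Line rate (Mbit/s) needed to replicate the changed data within the window.

    dedup_ratio is logical size divided by transmitted size, so 5.0 means 5:1.
    """
    if dedup_ratio <= 0 or window_hours <= 0:
        raise ValueError("dedup_ratio and window_hours must be positive")
    wire_gb = changed_gb / dedup_ratio
    megabits = wire_gb * 8 * 1000
    return megabits / (window_hours * 3600)


# Illustrative: 2000 GB of daily changes, replicated within an 8-hour window.
print(round(required_mbps(2000, 1.0, 8), 1))  # without deduplication
print(round(required_mbps(2000, 5.0, 8), 1))  # with an assumed 5:1 ratio
```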

4. Backup software selection and compatibility.

1. Unified management of third-party backup software.

Select third-party tools that support multiple virtualization platforms (such as VMware, KVM, domestic virtualization) to achieve unified management in heterogeneous environments, avoiding the limitations of vendor-specific backup tools.

Prioritize software that supports agentless backups to reduce virtual machine resource consumption, but be cautious of snapshot dependency issues.

2. Application awareness and consistency assurance.

For virtual machines with databases, combine application consistency agents (such as Oracle RMAN) to ensure complete transaction logs during backups, avoiding data inconsistency caused by relying solely on virtualization snapshots.

5. Drills and operations management.

1. Regular backup validation and recovery drills.

Simulate failure scenarios to verify backup effectiveness, ensuring quick recovery in emergencies.

2. Automation and intelligent scheduling.

Utilize the automation strategies of backup software (such as scheduled tasks, load balancing) to reduce manual intervention and promptly detect anomalies through monitoring alerts.

6. Special considerations for domestic environments.

Compatibility adaptation:

Select backup solutions that support domestic chips (such as Hygon and Zhaoxin) and operating systems (such as Kylin and UnionTech UOS), and reduce invasiveness to virtualization platforms through “agentless + block-level backup”.

Offline archiving:

For data that needs to be preserved long-term, use Blu-ray libraries or tape libraries as cold storage media, forming a tiered storage system with online backups.

7. Conclusion.

In summary, the core of solving backup time issues lies in strategy optimization (such as incremental backups, data classification), technology upgrades (such as deduplication, parallel processing), and architecture design (such as independent networks, cluster deployment). For domestic environments, attention must also be paid to compatibility with domestic products and offline archiving solutions. In actual deployment, it is recommended to design hybrid strategies based on business needs and validate the effectiveness of the solutions through regular drills.

@Raise a Toast to the East Wind, System Engineer at a national joint-stock bank:

For situations where data volume is too large, backup time is too long, backup windows are insufficient, or may affect business, the following measures can generally be considered:

1. Exchange resources for time: use higher-spec backup devices (25G network cards, all-flash storage, more memory, etc.) to improve the write efficiency of backup data.

2. Exchange technology for time. After completing a full backup, try to use incremental backups for subsequent backups, or enable source-side deduplication, which will significantly reduce the time for each backup.

3. Split time windows. For data that does not need to be fully backed up at once, it can be split into multiple windows. If it is structured data like a database, consider splitting the database to reduce the amount of data backed up at once.

4. Data cleanup. Properly identify data to reduce unnecessary backups and streamline backup sources.
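
The trade-offs in points 2 and 3 can be estimated with simple arithmetic: backup duration is data size divided by sustained throughput, and the number of windows is that duration divided by the window length, rounded up. The sketch below is illustrative only; the helper names, the 500 MB/s throughput, and the 3% daily change rate are assumptions, not measured values.

```python
import math


def backup_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to move size_gb at a sustained throughput (decimal units)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput_mb_s must be positive")
    return size_gb * 1000 / throughput_mb_s / 3600


def windows_needed(size_gb: float, throughput_mb_s: float, window_hours: float) -> int:
    """How many backup windows of the given length the job must be split across."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return math.ceil(backup_hours(size_gb, throughput_mb_s) / window_hours)


FULL_GB = 30_000        # illustrative full-backup size
RATE_MB_S = 500         # assumed sustained throughput
WINDOW_HOURS = 6        # assumed nightly window

print(round(backup_hours(FULL_GB, RATE_MB_S), 2), "hours for a full backup")
print(windows_needed(FULL_GB, RATE_MB_S, WINDOW_HOURS), "windows if split")

incremental_gb = FULL_GB * 0.03  # assumed 3% daily change
print(round(backup_hours(incremental_gb, RATE_MB_S), 2), "hours for an incremental")
```

Under these assumptions the full backup cannot fit a single 6-hour window and has to be split or accelerated, while the daily incremental fits comfortably.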

