
Internet applications are increasingly widespread, and their user bases keep growing. As people depend more on online services, their expectations for availability and user experience rise as well. So how do we ensure that a service stays stable, uninterrupted, and reliable in production?
Suppose an educational or financial product suffers a fault in production: the losses could be significant. Because the architecture and business logic of such systems are complex, we test engineers verify service stability through unit, interface, integration, and performance testing. That is still far from enough, because errors can occur at any time and in any form, especially in distributed systems. Many companies have therefore begun to adopt chaos engineering (in China, Alibaba is the best-known practitioner; its chaos engineering practices are well documented online). Because chaos engineering takes sustained investment and accumulated experience, a testing department can start with fault injection drills that simulate online faults in advance and help prevent them, which keeps the initial cost low while still delivering clear benefit.
Alibaba gives priority to analyzing P1 and P2 faults and has mapped out fault profiles at the IaaS, PaaS, and SaaS layers, as shown in the figure below:
The industry currently offers many fault simulation tools, each with its own strengths and weaknesses in the functions and scenarios it supports. By comparison, chaosblade covers a rich set of features and scenarios and has an active community, so it is a reasonable tool to try first.
The scenarios chaosblade supports are listed in its documentation:
https://chaosblade-io.gitbook.io/chaosblade-help-zh-cn/
Next, we will conduct a simple fault simulation.
Download link:
https://github.com/chaosblade-io/chaosblade/releases
Unpack the archive and use it directly; no installation is required:
tar -zxvf chaosblade-0.9.0.linux-amd64.tar.gz
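After unpacking, a quick check confirms that the binary runs on the host. This is a minimal sketch that assumes the archive extracts to a directory named `chaosblade-0.9.0`; that directory name is an assumption based on the archive name, so adjust it to whatever `tar` actually created.

```shell
# Assumption: the tarball unpacked into ./chaosblade-0.9.0; adjust if different.
BLADE_HOME="${BLADE_HOME:-./chaosblade-0.9.0}"
BLADE="$BLADE_HOME/blade"

if [ -x "$BLADE" ]; then
    # Print the version to confirm the binary works on this host.
    "$BLADE" version
else
    echo "blade binary not found at $BLADE" >&2
fi
```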
CPU stress injection:
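A CPU load experiment can be sketched as follows. The binary path, the 80% target, and the 60-second timeout are illustrative assumptions, not the values used in the original drill; `--timeout` makes the experiment recover on its own.

```shell
# Assumption: path to the unpacked blade binary; adjust to your environment.
BLADE="${BLADE:-./chaosblade-0.9.0/blade}"
# Illustrative values: load the CPUs to about 80% and auto-recover after 60 s.
CPU_ARGS="create cpu fullload --cpu-percent 80 --timeout 60"

# Show the command that will be issued.
echo "$BLADE $CPU_ARGS"
if [ -x "$BLADE" ]; then
    # CPU_ARGS is left unquoted on purpose so the shell splits it into words.
    "$BLADE" $CPU_ARGS
fi
```

On success, blade prints a JSON response whose `result` field holds the experiment UID. CPU usage can be watched with a tool such as `top` while the experiment is active.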
Results as follows:
Memory stress injection:
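A memory load experiment might look like the sketch below. The binary path, the 80% target, and the timeout are illustrative assumptions rather than the values from the original drill.

```shell
# Assumption: path to the unpacked blade binary; adjust to your environment.
BLADE="${BLADE:-./chaosblade-0.9.0/blade}"
# Illustrative values: occupy RAM up to about 80% and auto-recover after 60 s.
MEM_ARGS="create mem load --mode ram --mem-percent 80 --timeout 60"

# Show the command that will be issued.
echo "$BLADE $MEM_ARGS"
if [ -x "$BLADE" ]; then
    # MEM_ARGS is left unquoted on purpose so the shell splits it into words.
    "$BLADE" $MEM_ARGS
fi
```

Memory usage can be checked with `free -m` before and during the experiment.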
Results as follows:
Disk stress injection:
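Disk I/O pressure can be sketched with the `disk burn` scenario. The binary path, the `/tmp` target directory, and the timeout are assumptions chosen for illustration; pick a path on the disk you actually want to stress.

```shell
# Assumption: path to the unpacked blade binary; adjust to your environment.
BLADE="${BLADE:-./chaosblade-0.9.0/blade}"
# Illustrative values: generate read and write load under /tmp for 60 s.
DISK_ARGS="create disk burn --read --write --path /tmp --timeout 60"

# Show the command that will be issued.
echo "$BLADE $DISK_ARGS"
if [ -x "$BLADE" ]; then
    # DISK_ARGS is left unquoted on purpose so the shell splits it into words.
    "$BLADE" $DISK_ARGS
fi
```

A tool such as `iostat -x 1` (from the sysstat package, if installed) shows the resulting disk utilization.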
Results as follows:
Network interface fault injection:
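A network delay experiment can be sketched as below. The interface name `eth0`, local port `8080`, the 3000 ms delay, and the timeout are all hypothetical values; substitute the interface and port of the service under test. Setting `--timeout` is a sensible precaution for network experiments, since a mistake could otherwise cut off remote access to the machine.

```shell
# Assumption: path to the unpacked blade binary; adjust to your environment.
BLADE="${BLADE:-./chaosblade-0.9.0/blade}"
# Hypothetical values: delay traffic on eth0 for local port 8080 by 3000 ms,
# and auto-recover after 60 s.
NET_ARGS="create network delay --time 3000 --interface eth0 --local-port 8080 --timeout 60"

# Show the command that will be issued.
echo "$BLADE $NET_ARGS"
if [ -x "$BLADE" ]; then
    # NET_ARGS is left unquoted on purpose so the shell splits it into words.
    "$BLADE" $NET_ARGS
fi
```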
Calling the service's API while the fault is active shows that its response time has increased significantly.
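Once the effect has been observed, an experiment that was not given a timeout should be revoked by hand. The sketch below assumes the binary path used here and uses a hypothetical placeholder for the UID; the real value is the one blade printed when the experiment was created.

```shell
# Assumption: path to the unpacked blade binary; adjust to your environment.
BLADE="${BLADE:-./chaosblade-0.9.0/blade}"
# Hypothetical placeholder: set EXPERIMENT_UID to the UID printed by `blade create`.
EXPERIMENT_UID="${EXPERIMENT_UID:-REPLACE_WITH_UID}"

if [ -x "$BLADE" ]; then
    # List the experiments that have been created, with their UIDs and status.
    "$BLADE" status --type create
    # Revoke the experiment only when a real UID has been supplied.
    if [ "$EXPERIMENT_UID" != "REPLACE_WITH_UID" ]; then
        "$BLADE" destroy "$EXPERIMENT_UID"
    fi
fi
```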
The above is only a basic demonstration of fault injection with chaosblade. Everyone is welcome to try other drills. Fault injection at the service layer deserves particular attention from testing colleagues, and I strongly recommend trying service-layer drills within your own company.