Strengthening System Resilience in the Cloud-Native Era

IT systems have evolved from standalone to centralized to distributed architectures, and operations drills and fault-simulation testing have grown steadily more complex along the way. In a complex distributed system, both the infrastructure and the application platform can fail unpredictably. Without knowing the root cause of a failure, we cannot prevent it from recurring. A more …

Building Unbreakable Database Services: Injecting 1000 Failures

Introduction: What happens when you inject 1000 failures into a YMatrix cluster? The ALOHA high-availability architecture introduced in version 5.0 was rigorously chaos-tested before its release. In addition to continuously refining key technologies, the YMatrix R&D team has also introduced advanced engineering methods and practices to ensure …