At three in the morning one day, our online service suddenly crashed.
After checking the logs, we discovered that our seemingly rock-solid monolithic application could not withstand high concurrency. The entire team stayed up all night, and we barely scraped through by temporarily scaling up the servers. That incident cemented my resolve to refactor the system.
Microservices.
This concept has been a hot topic in the tech community for years, and it was finally landing in our project. To be honest, when I first read Martin Fowler’s famous article on microservices architecture, I scoffed at it. Looking back now, I thoroughly deserved the reality check!
Why migrate to microservices?
The reason is simple—pain.
Our Python monolith had ballooned to 300,000 lines of code, and any small change felt like walking on thin ice. The dependencies between modules were so intricate that even senior developers could not guarantee a single modification wouldn’t trigger a chain reaction. Not to mention the outrageous deployment time… a complete build often took 40 minutes! Practically the perfect window to drink coffee, scroll through Twitter, and daydream.
But microservices are not a panacea.
I have seen too many teams blindly chase a “microservices architecture,” only to split one bloated monolith into a pile of bloated micro-applications, with network call overhead making overall performance even worse. That is not microservices; that is a “distributed monolith,” and it is far scarier than the monolith it replaced.
Defining service boundaries is the first hurdle.
Our team spent two full weeks mapping out our business processes and ultimately identified seven core services: user service, order service, payment service, inventory service, notification service, analytics service, and gateway service. Each service has a clear responsibility; high cohesion is the foundation of successful microservices.
Choosing the technology stack
After intense discussions, we finally settled on this technology stack:
- FastAPI: Performance comparable to Node.js and Go, supports asynchronous programming, and comes with built-in OpenAPI documentation generation
- RabbitMQ: Handles inter-service communication and supports various messaging patterns
- PostgreSQL: Main database storage
- Redis: Caching and distributed locks (a lock sketch follows this list)
- Docker & Kubernetes: Containerization and orchestration
- Prometheus & Grafana: Monitoring system
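Of that list, the Redis distributed lock is worth a concrete look. Here is a minimal sketch of the pattern, assuming the redis-py client; the key names, TTL, and token scheme are illustrative, not our production values. Acquire with an atomic SET NX EX; release with a Lua script that checks ownership.

```python
import uuid
from typing import Optional

import redis

r = redis.Redis(host="localhost", port=6379)

# Release only succeeds if the caller still owns the lock (token matches),
# so a worker whose lock already expired cannot delete someone else's lock.
RELEASE_LUA = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
end
return 0
"""

def acquire_lock(name: str, ttl_seconds: int = 10) -> Optional[str]:
    """Try to take the lock; return an ownership token, or None if held."""
    token = uuid.uuid4().hex
    # SET key token NX EX ttl: atomic "create if absent, with expiry".
    if r.set(name, token, nx=True, ex=ttl_seconds):
        return token
    return None

def release_lock(name: str, token: str) -> bool:
    return bool(r.eval(RELEASE_LUA, 1, name, token))
```

The Lua check is the important part: without it, a slow worker whose lock had already expired could delete a lock now held by someone else.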
The choice of FastAPI was actually quite controversial. Half the team wanted to use Django (“So many libraries!”) while the other half preferred Flask (“Lightweight!”). In the end, I decided on FastAPI, mainly due to its native support for Python 3.6+ type annotations and its astonishing performance—simple tests on my MacBook showed it was nearly five times faster than Flask!
```python
from fastapi import FastAPI, HTTPException

app = FastAPI()

# This code was initially written like this
@app.get("/users/{user_id}")
async def get_user(user_id: int):
    # `database` is the service's shared async DB client, defined elsewhere
    user = await database.fetch_one(
        "SELECT * FROM users WHERE id = :id",
        {"id": user_id}
    )
    return user

# But soon we discovered a problem...
# What if the user does not exist? Return None? Throw an exception?

# The improved version (UserResponse is our Pydantic model,
# user_service wraps the data access)
@app.get("/users/{user_id}", response_model=UserResponse)
async def get_user(user_id: int):
    user = await user_service.get_by_id(user_id)
    if not user:
        raise HTTPException(
            status_code=404,
            detail=f"User with ID {user_id} not found"
        )
    return UserResponse.from_orm(user)
```
Inter-service communication is another technical challenge.
Initially, we wanted to use RESTful APIs for all service calls—after all, it’s simple and direct. But we quickly found that this approach had many issues in certain scenarios. For example, after an order is created, it needs to notify the inventory service, payment service, and notification service. If we use synchronous calls, any slow response from one service can drag down overall performance.
Ultimately, we adopted a hybrid communication strategy:
- Synchronous communication: Using HTTP/REST, suitable for scenarios requiring immediate responses
- Asynchronous communication: Using RabbitMQ, suitable for scenarios that can be processed in the background (a publishing sketch follows this list)
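To make the asynchronous half concrete, here is a minimal sketch of publishing an “order created” event. It assumes aio-pika as the client library and an orders topic exchange; the article doesn’t pin down either, so treat the URL, exchange, and routing key as illustrative.

```python
import json

import aio_pika

async def publish_order_created(order_id: int, user_id: int) -> None:
    """Fire-and-forget event: inventory, payment, and notification
    services each consume it at their own pace."""
    connection = await aio_pika.connect_robust("amqp://guest:guest@rabbitmq/")
    async with connection:
        channel = await connection.channel()
        exchange = await channel.declare_exchange(
            "orders", aio_pika.ExchangeType.TOPIC
        )
        await exchange.publish(
            aio_pika.Message(
                body=json.dumps({"order_id": order_id, "user_id": user_id}).encode(),
                delivery_mode=aio_pika.DeliveryMode.PERSISTENT,  # survive broker restarts
            ),
            routing_key="order.created",
        )
```

In real code you would reuse one connection and channel per process; opening them per call just keeps the sketch self-contained.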
There’s an interesting story here… Our initial choice for the message queue was Kafka, because “in the era of big data, Kafka is standard.” What a grand reason! However, after deployment, we found that no one on the team could truly master such a complex system. After two weeks of painful struggle, we switched to the simpler RabbitMQ. The lesson: choose technology that matches the team’s skill level; do not blindly pursue high-end solutions.
The deployment environment is a combination of Docker and Kubernetes.
Each service is packaged into an independent Docker image and orchestrated through Kubernetes. The benefits of this approach are obvious: environment consistency, horizontal scaling, automatic recovery… but there are also many challenges. The learning curve for K8s is steep, and our operations team spent a whole month buried in the official documentation.
```yaml
# A typical K8s deployment configuration for a service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3  # Adjust dynamically based on load
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: our-registry.com/user-service:1.2.0
          ports:
            - containerPort: 8000
          env:
            - name: DB_HOST
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: db_host
          resources:
            limits:
              cpu: "0.5"
              memory: "512Mi"
            requests:
              cpu: "0.2"
              memory: "256Mi"
```
Pitfalls in Real Projects
Data consistency issues almost doomed the project.
In a monolithic application, we were used to relying on database transactions to ensure the atomicity of operations. But in a microservices architecture, data is scattered across multiple services, and cross-service transactions are nearly impossible to implement. To solve this problem, we adopted the eventual consistency model: data converges to a consistent state through message queues and compensation mechanisms.
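To make “compensation” concrete, here is a sketch of an inventory-side consumer, again assuming aio-pika and the hypothetical event names from the earlier sketch. When reserving stock fails, there is no distributed transaction to roll back; instead the consumer publishes a compensating event that the order service uses to cancel its own record.

```python
import json

import aio_pika

class StockUnavailable(Exception):
    """Raised when stock cannot be reserved for an order."""

async def reserve_stock(order_id: int) -> None:
    ...  # hypothetical: decrement stock inside the inventory service's own DB

async def consume_order_events() -> None:
    connection = await aio_pika.connect_robust("amqp://guest:guest@rabbitmq/")
    channel = await connection.channel()
    exchange = await channel.declare_exchange(
        "orders", aio_pika.ExchangeType.TOPIC
    )
    queue = await channel.declare_queue("inventory.order-events", durable=True)
    await queue.bind(exchange, routing_key="order.created")

    async with queue.iterator() as messages:
        async for message in messages:
            async with message.process():  # ack on success, reject on exception
                event = json.loads(message.body)
                try:
                    await reserve_stock(event["order_id"])
                except StockUnavailable:
                    # Compensation: the order service consumes this and
                    # cancels the order it already wrote locally.
                    await exchange.publish(
                        aio_pika.Message(body=message.body),
                        routing_key="order.cancelled",
                    )
```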
The fragility of service dependencies is also a major issue.
I remember one time when the payment service suddenly went down and took the entire order flow with it. Investigation showed that our service call chain was too long: order → payment → inventory → notification, with each link dependent on the previous one. When the payment service had issues, everything downstream failed. Since then, we have implemented the circuit breaker pattern and service degradation strategies to prevent cascading failures.
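The article doesn’t show our actual breaker, so the following is a minimal hand-rolled sketch of the pattern: after N consecutive failures the breaker opens and fails fast; once a cooldown passes, it lets a trial call through. A production version (or a library) would add finer half-open handling and per-endpoint state.

```python
import time
from typing import Any, Awaitable, Callable, Optional

class CircuitBreaker:
    """Fail fast once a downstream service looks unhealthy."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: Optional[float] = None

    def _is_open(self) -> bool:
        # Tripped, and still inside the cooldown window?
        return (
            self.opened_at is not None
            and time.monotonic() - self.opened_at < self.reset_timeout
        )

    async def call(
        self, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
    ) -> Any:
        if self._is_open():
            raise RuntimeError("circuit open: failing fast instead of calling")
        try:
            result = await func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # any success closes the breaker again
        self.opened_at = None
        return result
```

The order service would wrap its payment call, e.g. `await payment_breaker.call(payment_client.charge, order_id)` with `payment_client` being a hypothetical HTTP client, and treat the fail-fast RuntimeError as the trigger for a degraded path such as queuing the payment for retry.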
Once the functionality was online, performance monitoring became crucial.
We used Prometheus to collect performance metrics from each service and built visual dashboards with Grafana. Surprisingly, the initial bottleneck was not in the Python code itself, but in database queries. By adding appropriate indexes and optimizing queries, we reduced the API response time from an average of 320ms to 85ms.
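For the FastAPI services, the instrumentation can be as small as one middleware plus a mounted /metrics endpoint. Here is a sketch using the prometheus_client library; the metric name and labels are illustrative, not our exact dashboard schema.

```python
import time

from fastapi import FastAPI, Request
from prometheus_client import Histogram, make_asgi_app

app = FastAPI()

REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "Latency of HTTP requests",
    ["method", "path"],
)

@app.middleware("http")
async def record_latency(request: Request, call_next):
    start = time.monotonic()
    response = await call_next(request)
    # Real setups template the path (e.g. /users/{user_id}) to keep
    # label cardinality under control.
    REQUEST_LATENCY.labels(request.method, request.url.path).observe(
        time.monotonic() - start
    )
    return response

# Prometheus scrapes this endpoint on each service.
app.mount("/metrics", make_asgi_app())
```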
This refactoring took three months, reducing the codebase from 300,000 lines to approximately 180,000 lines (distributed across various services). Deployment time decreased from 40 minutes to an average of 3 minutes per service. Most importantly, we can now independently scale each service based on actual load.
Of course, microservices are not the end.
With the development of cloud-native technologies, we may consider a serverless architecture in the future to further reduce operational burdens. But regardless of how technology changes, understanding business needs, designing system architecture reasonably, and writing maintainable code will always be fundamental skills that never go out of style.
After all, technology is just a tool; solving problems is the goal.