In a modern distributed system, accurate time underpins almost every computation, yet it is often the most overlooked foundational capability. Log alignment, monitoring alerts, transaction consistency, container orchestration, certificate validation, and message delay calculations all depend on it, and a deviation of just a few seconds in system time can trigger a series of hard-to-locate production issues.
This article explains Linux time synchronization end to end: principles, tools, production rollout, architecture design, and troubleshooting. It is suitable for technical sharing sessions or internal training.
1. Why is Time Synchronization So Important?
In distributed systems, what matters most is that the clocks of all machines agree with each other, not merely that a single machine's clock is correct.
Problems Caused by Time Desynchronization
1. Log Misalignment
When troubleshooting, you may find that Service A calls Service B at 10:01, but Service B logs the request at 09:59. This leads to:
- Broken call chains
- TraceIDs that cannot be aligned
- Misaligned monitoring graphs
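The effect on a merged log can be sketched in a few lines of Python. The service names, timestamps, and the 2-minute skew below are invented for illustration; the point is only that sorting by local timestamps puts the callee before the caller until a known correction is applied.

```python
from datetime import datetime, timedelta

# Hypothetical log records: (host, local timestamp, message).
# Service B's clock is assumed to run 2 minutes behind Service A's.
records = [
    ("service-a", datetime(2024, 1, 1, 10, 1, 0), "call B"),
    ("service-b", datetime(2024, 1, 1, 9, 59, 0, 50000), "handle request from A"),
    ("service-a", datetime(2024, 1, 1, 10, 1, 0, 120000), "got response from B"),
]

def merge_by_timestamp(records, correction_by_host=None):
    """Sort records by timestamp, optionally adding a known per-host clock correction."""
    correction_by_host = correction_by_host or {}

    def corrected(record):
        host, timestamp, _ = record
        return timestamp + correction_by_host.get(host, timedelta(0))

    return [message for _, _, message in sorted(records, key=corrected)]

# Without correction the callee appears to run before the caller.
naive = merge_by_timestamp(records)
# With the (assumed known) 2-minute correction the causal order is restored.
fixed = merge_by_timestamp(records, {"service-b": timedelta(minutes=2)})
```

In practice the per-host correction is rarely known after the fact, which is why keeping the clocks synchronized in the first place is the more reliable fix.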
2. Distributed System Consistency Failures
For example:
- Redis keys expiring earlier or later than intended when expiration times are computed from skewed clocks (for example, absolute timestamps passed to EXPIREAT)
- ZooKeeper/Kafka session timeouts and elections misbehaving wherever the logic depends on time
- Distributed locks expiring early, so that two clients believe they hold the same lock
- Database transaction timeouts being judged incorrectly
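The distributed-lock case can be illustrated with a toy model. This is a sketch of a hypothetical lock whose expiry is decided by comparing a stored acquisition timestamp with each node's own wall clock; it is not how any particular product implements locking, and the TTL and skew values are made up.

```python
LOCK_TTL = 10.0  # seconds; illustrative value

def lock_expired(acquired_at, now, ttl=LOCK_TTL):
    """Return True if a lock stamped with acquired_at is past its TTL at time now."""
    return now - acquired_at >= ttl

# True elapsed time since the lock was taken: 6 seconds.
true_acquired, true_now = 1000.0, 1006.0

# Node A (the holder) has an accurate clock; Node B runs 5 seconds fast.
node_a_now = true_now
node_b_now = true_now + 5.0

holder_thinks_expired = lock_expired(true_acquired, node_a_now)  # False: 6 s < 10 s
waiter_thinks_expired = lock_expired(true_acquired, node_b_now)  # True: 11 s >= 10 s
```

Node A still considers itself the owner while Node B already treats the lock as free, so both can enter the critical section at once.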
3. Impact on Security Mechanisms
- JWT tokens rejected as “not yet valid” or “expired”
- HTTPS certificate validation failures (a common cause of browser certificate errors)
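The JWT failure mode comes from comparing the token's `nbf` (not before) and `exp` (expiry) claims with the verifier's local clock. The following is a minimal sketch of that comparison in plain Python, not the API of any JWT library; the function name, claim values, and the 3-second skew are illustrative assumptions.

```python
def check_time_claims(claims, now, leeway=0.0):
    """Validate the 'nbf' and 'exp' claims (Unix seconds) with a clock-skew tolerance."""
    nbf = claims.get("nbf")
    exp = claims.get("exp")
    if nbf is not None and now + leeway < nbf:
        return "not yet valid"
    if exp is not None and now - leeway >= exp:
        return "expired"
    return "ok"

claims = {"nbf": 1_700_000_000, "exp": 1_700_000_300}
verifier_now = 1_700_000_000 - 3  # verifier's clock is 3 seconds behind the issuer

strict = check_time_claims(claims, verifier_now)              # rejected
tolerant = check_time_claims(claims, verifier_now, leeway=5)  # accepted
```

A small leeway hides a few seconds of skew, but it is a mitigation rather than a substitute for synchronized clocks.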
4. Monitoring and Alert Anomalies
Prometheus/Grafana graphs show gaps, and skewed timestamps can even generate “phantom alerts”.
In short: Time synchronization is the foundation of reliability in production-level systems.
2. Linux Time Architecture
Linux maintains two clocks:
| Name | Type | Power Dependency | Usage |
|---|---|---|---|
| RTC (Real-Time Clock) | Battery-backed hardware clock on the motherboard | Keeps running when the machine is powered off | Initializes the system clock at startup |
| System Clock | Software clock maintained by the kernel | Lost on shutdown | The time actually used by applications |
At startup:
RTC → System Clock (synchronized once at boot)
Afterwards:
System Clock = Kernel Tick + NTP/Chrony calibration
Note:
- Time in containers should remain consistent with the host machine
- The system clock in virtual machines is more prone to drift
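Because the system clock is continuously calibrated, and may occasionally be stepped, by NTP or chrony, application code that measures durations is usually better off with a monotonic clock. The sketch below uses Python's standard `time` module; the `timed` helper is a hypothetical name introduced for this example.

```python
import time

def timed(fn):
    """Run fn and return (result, elapsed seconds) measured on the monotonic clock."""
    start = time.monotonic()  # never jumps when the wall clock is stepped
    result = fn()
    return result, time.monotonic() - start

result, elapsed = timed(lambda: sum(range(100_000)))

# Wall-clock time, suitable for timestamps; chrony/NTP may slew or step this value.
wall_clock_now = time.time()
```

Use `time.time()` for timestamps that must be comparable across machines, and `time.monotonic()` for timeouts and latency measurements on a single machine.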
3. Comparison of Mainstream Time Synchronization Tools
| Tool | Type | Advantages | Recommended Scenarios |
|---|---|---|---|
| chronyd (Recommended) | NTP Client/Server | High precision, fast convergence, works well in virtualized environments, keeps compensating for drift while offline | Enterprise-level production environments |
| ntpd | Traditional NTP daemon | Long history | Legacy systems only; not recommended for new projects |
| systemd-timesyncd | Lightweight SNTP client | Simple, lightweight | Lightweight systems without chrony (e.g., IoT devices, minimal VMs) |
| hwclock | RTC utility | Reads and sets the hardware clock (RTC) | Syncing the RTC and system clock around boot and shutdown |
Best choice for production environments: chrony (compatible, stable, high precision)
4. Chrony: The Preferred Solution for Enterprise-Level Time Synchronization
1. Installation
CentOS / Rocky Linux
yum install chrony -y
Ubuntu / Debian
apt install chrony -y
2. Configuration (/etc/chrony.conf)
Below is a typical enterprise configuration (on Debian/Ubuntu the file is /etc/chrony/chrony.conf):
# Upstream NTP servers, multiple can be configured
server ntp.aliyun.com iburst
server time1.cloud.tencent.com iburst
pool cn.pool.ntp.org iburst
# Allow clients in the local area network to synchronize (can be opened as needed for multiple data centers)
allow 192.168.0.0/16
allow 10.0.0.0/8
# Periodically copy the system time to the hardware clock (RTC)
rtcsync
# Time drift record file for automatic calibration
driftfile /var/lib/chrony/drift
# Keep serving time to clients (as stratum 10) even when upstream sources are unreachable
local stratum 10
3. Start the service
systemctl enable --now chronyd
4. Check Synchronization Status
Check overall quality:
chronyc tracking
Check synchronization sources:
chronyc sources -v
Key fields:
- Stratum: distance from a reference clock. 1 is the highest; normal values are usually between 2 and 4
- System time / Last offset: offset between the local clock and the time source, the smaller the better (`chronyc tracking` prints these in seconds)
- Ref time: time at which the last measurement from the source was processed
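For monitoring, the `chronyc tracking` output can be parsed into a dictionary. The sketch below works on a sample modeled on the format shown in the chrony documentation (the hostname is a placeholder), and the sign convention in `system_offset_seconds` (negative when the local clock is slow) is a choice made for this example.

```python
SAMPLE_TRACKING = """\
Reference ID    : CB00710F (ntp.example.net)
Stratum         : 3
Ref time (UTC)  : Fri Jan 27 09:49:17 2017
System time     : 0.000006523 seconds slow of NTP time
Last offset     : -0.000006747 seconds
RMS offset      : 0.000035822 seconds
Leap status     : Normal
"""

def parse_tracking(text):
    """Turn 'Key : value' lines from `chronyc tracking` into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(" : ")
        if separator:
            fields[key.strip()] = value.strip()
    return fields

def system_offset_seconds(fields):
    """Signed offset from the 'System time' line: negative if the local clock is slow."""
    parts = fields["System time"].split()
    magnitude = float(parts[0])
    return -magnitude if parts[2] == "slow" else magnitude

fields = parse_tracking(SAMPLE_TRACKING)
offset = system_offset_seconds(fields)
```

On a live host the same functions can be fed the output of `subprocess.run(["chronyc", "tracking"], capture_output=True, text=True).stdout`, and the resulting offset compared against an alert threshold of your choosing.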
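5. Force Immediate Calibration (by default the clock is not stepped in large jumps)
By default chronyd corrects the clock gradually by slewing (slightly speeding it up or slowing it down) rather than jumping. A step is only performed if the `makestep` directive in the configuration allows it; many distribution defaults ship `makestep 1.0 3`, which permits a step of more than 1 second only during the first three updates after startup. (The 1000-second figure sometimes quoted is ntpd's panic threshold, beyond which ntpd refuses to correct the clock at all.)
To force an immediate correction: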
chronyc makestep
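A rough calculation shows why gradual correction can take a long time for large offsets. The slew rates used here are assumptions to verify against your own setup: 500 ppm is the classic kernel slew limit associated with ntpd, and 83333.333 ppm is the default `maxslewrate` described in the chrony documentation.

```python
def slew_duration_seconds(offset_seconds, slew_rate_ppm):
    """Time needed to remove an offset when the clock runs fast or slow by slew_rate_ppm."""
    return abs(offset_seconds) * 1_000_000 / slew_rate_ppm

# 1 second of error at 500 ppm takes 2000 seconds (about 33 minutes) to remove.
one_second_at_500ppm = slew_duration_seconds(1.0, 500)

# 60 seconds of error at 500 ppm takes 120000 seconds (about 33 hours).
one_minute_at_500ppm = slew_duration_seconds(60.0, 500)

# The same 60 seconds at 83333.333 ppm takes roughly 720 seconds (about 12 minutes).
one_minute_at_chrony_default = slew_duration_seconds(60.0, 83333.333)
```

During that whole window the machine disagrees with its peers, which is the trade-off to weigh against the disruption of a sudden step.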
5. Building an Internal NTP Server for Enterprises (Recommended Architecture)
For large enterprises or multiple IDC data centers, the following architecture can be adopted:
National Time Center / Aliyun NTP / Tencent Cloud NTP
                        │
            Company Level NTP (Stratum 2)
              10.10.1.10 / 10.10.1.11
                        │
            ┌───────────┴───────────┐
            │                       │
Data Center A Secondary NTP   Data Center B Secondary NTP
       (Stratum 3)                   (Stratum 3)
            │                       │
All business servers, load balancers, databases, K8s nodes
Example configuration for enterprise NTP Server:
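# Upstream public sources
server ntp.aliyun.com iburst
server time.google.com iburst
# Keep serving clients if upstream is lost; a high stratum marks this as a fallback only
local stratum 10
# Allow internal clients to synchronize
allow 10.0.0.0/8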
This means:
- Secondary servers can continue to synchronize downwards
- All machines in the production environment rely only on internal NTP and do not query the public network directly
Advantages:
- Safer and more stable, less affected by public network fluctuations
- High time consistency within the same data center (deviation typically below 1 ms on a LAN)
- Reduces pressure on public NTP services
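The benefit of a nearby internal server follows from how NTP estimates the clock offset: it uses four timestamps from one request/response exchange and assumes the network delay is the same in both directions. The sketch below implements the standard offset and delay formulas; the timestamps are made-up values in seconds.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """
    t1: client send time (client clock)    t2: server receive time (server clock)
    t3: server send time (server clock)    t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2  # estimated server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)         # round-trip network delay
    return offset, delay

# Client is really 10 s behind the server; 5 ms each way, 1 ms server processing.
symmetric = ntp_offset_and_delay(0.000, 10.005, 10.006, 0.011)

# Same true offset, but 9 ms outbound and 1 ms back: the estimate is off by 4 ms.
asymmetric = ntp_offset_and_delay(0.000, 10.009, 10.010, 0.011)
```

The estimation error is bounded by half the round-trip delay, so a LAN server with a sub-millisecond round trip leaves far less room for error than a public server tens of milliseconds away.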
6. systemd-timesyncd (Commonly Used in Lightweight Systems)
Used for lightweight installations where chronyd is not available (e.g., minimal systems, IoT devices).
Check status:
timedatectl
Enable synchronization:
timedatectl set-ntp true
Note:
systemd-timesyncd is a simple SNTP client; do not use it as a replacement for chrony in production environments.
7. Common Time Synchronization Failures and Troubleshooting Methods
1. NTP Server Unreachable
Troubleshooting:
chronyc sources -v
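If a source is marked `^?` and its Reach column stays at 0, the server is not answering. Possible causes:
- UDP port 123 is blocked
- DNS resolution issues
- Rate limiting or access restrictions imposed by the public NTP server
Solution (on a machine that must accept NTP requests from clients; pure clients only need outbound UDP 123 to be allowed):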
firewall-cmd --add-port=123/udp --permanent
firewall-cmd --reload
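To tell a blocked port apart from a chrony misconfiguration, it can help to send a single NTP request by hand. The following is a minimal sketch of an SNTP-style probe using only the standard library; the function names are invented for this example, and `probe` needs network access to UDP port 123 of the server you pass in.

```python
import socket
import struct

NTP_EPOCH_OFFSET = 2_208_988_800  # seconds between 1900-01-01 and 1970-01-01

def build_request():
    """48-byte client request: LI=0, version=3, mode=3 (client) packed in the first byte."""
    return b"\x1b" + 47 * b"\x00"

def parse_transmit_time(packet):
    """Return the server's transmit timestamp (bytes 40-47) as Unix seconds."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32

def probe(server, timeout=3.0):
    """Send one request over UDP/123; raises OSError (e.g. a timeout) if nothing comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)
```

For example, `probe("10.10.1.10")` either returns the server's time as a Unix timestamp or raises: a timeout suggests the port is filtered or the server is down, while a name-resolution error points at DNS.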
2. Severe Time Drift in Virtual Machines
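Virtual machines can see unstable clock ticks when the hypervisor delays or deschedules their vCPUs.
Solution:
Kernel parameter adjustment (only when the hypervisor provides a stable TSC, because `tsc=reliable` turns off the kernel's TSC stability checks):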
grubby --update-kernel=ALL --args="tsc=reliable"
Use chrony (better than ntpd)
Chrony has many optimizations for virtualization.
3. Time Inconsistency in Containers (Docker/K8s)
Containers do not maintain their own clock; they share the host kernel's clock, so container time is determined by the host machine.
Recommendations:
- Configure chrony on the host machine
- Do not run chronyd inside containers
- All K8s nodes must connect to the same time source
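One way to verify that cluster nodes agree is to collect each node's measured offset and look at the spread. The sketch below assumes the offsets have already been gathered (for instance from `chronyc tracking` on each node); the node names, readings, and the 1 ms limit are hypothetical.

```python
def offset_spread(offsets_by_node):
    """Largest difference between any two nodes' measured offsets, in seconds."""
    values = list(offsets_by_node.values())
    return max(values) - min(values)

def nodes_outside(offsets_by_node, limit_seconds):
    """Names of nodes whose absolute offset exceeds the limit, sorted for stable output."""
    return sorted(
        node for node, offset in offsets_by_node.items() if abs(offset) > limit_seconds
    )

# Hypothetical readings in seconds, one per node.
offsets = {"node-1": 0.0002, "node-2": -0.0004, "node-3": 0.0350}

spread = offset_spread(offsets)          # about 35.4 ms between the extremes
outliers = nodes_outside(offsets, 0.001) # nodes more than 1 ms off
```

A single outlier like `node-3` is usually a sign that the node lost its time source or is pointed at a different one.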
4. Time is Wrong Again After Restart
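Reason: the hardware RTC is inaccurate, and the system clock is initialized from it at boot.
Write the current system time to the RTC:
hwclock --systohc
Set the system clock from the RTC: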
hwclock --hctosys
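Before deciding which direction to synchronize, it is useful to know how far the RTC has drifted from the system clock. This sketch reads the RTC through sysfs; it assumes the device is exposed as `/sys/class/rtc/rtc0/since_epoch` and that the RTC is kept in UTC, both of which should be checked on the target machine.

```python
import time
from pathlib import Path

def rtc_minus_system(rtc_path="/sys/class/rtc/rtc0/since_epoch", now=None):
    """Difference in seconds between the RTC reading and the system clock."""
    rtc_seconds = int(Path(rtc_path).read_text().strip())
    system_seconds = time.time() if now is None else now
    return rtc_seconds - system_seconds
```

A result of a few seconds after a long uptime is common for inexpensive RTC hardware; with `rtcsync` enabled in chrony the difference should stay close to zero.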
8. Summary of Best Practices for Production
✅ 1. Use chrony uniformly
Stable, fast, high precision, suitable for large-scale virtual machine scenarios.
✅ 2. Unified NTP source across multiple data centers
Keep time deviation between servers as small as possible (typically below 1 ms within a single data center).
✅ 3. Deploy enterprise-level NTP Server in core data centers
Reduce external network dependencies and improve security.
✅ 4. Focus on time synchronization in container clusters and virtualized environments
Avoid drift causing distributed issues.
✅ 5. Check NTP configuration after system upgrades
Some images and automation tools may overwrite configurations.
✅ 6. Use makestep to force calibration when the deviation is large
This avoids a long period of inconsistency while the clock is slowly slewed back.
Time synchronization is one of the most critical infrastructures in distributed systems. It may not be as visible as CPU or memory, but it determines the reliability baseline of the system.