
Practical Guide to Network Parameter Tuning in High-Concurrency Linux Scenarios
Table of Contents
- Introduction
- 1. Background: When Concurrent Connections Encounter Performance Bottlenecks
  - 1.1 Case Environment
  - 1.2 Initial Parameter Analysis
- 2. In-Depth Diagnosis: Connection Status and Kernel Parameters
  - 2.1 Connection Status Monitoring Techniques
    - Real-time TCP State Statistics
    - Half-Connection Special Check
  - 2.2 Key Parameter Interpretation
- 3. Tuning Solutions: From Parameters to Practice
  - 3.1 Connection Management Optimization
    - Resolving TIME_WAIT Accumulation
    - Shortening Connection Recycle Time
  - 3.2 Queue and Buffer Optimization
    - Expanding Connection Queue
    - Adjusting Memory Buffer
  - 3.3 Keepalive and Timeout Optimization
- 4. Verification and Monitoring
  - 4.1 Real-time Monitoring Script
    - Connection Status Dashboard
    - Kernel Alert Rules (Prometheus Example)
  - 4.2 Load Testing Recommendations
- 5. Pitfall Guide
  - 5.1 Common Misunderstandings
  - 5.2 Parameter Dependencies
- 6. Conclusion
Introduction
In high-concurrency network services, the Linux kernel's default network parameters often cannot keep up, leading to performance bottlenecks, connection timeouts, and even service crashes. Working from a real case, this article covers parameter interpretation, problem diagnosis, and optimization practice, showing how to tune Linux network parameters toward supporting millions of concurrent connections.
1. Background: When Concurrent Connections Encounter Performance Bottlenecks
1.1 Case Environment
- Server Configuration: 8 vCPU cores | 16 GB memory | 4 Gbps network bandwidth | 800,000 PPS
- Observed Anomalies:
  - `TIME_WAIT` connection accumulation (2,464)
  - `CLOSE_WAIT` connections present (4)
  - Occasional timeouts when establishing new connections
1.2 Initial Parameter Analysis
Original configuration viewed with `sysctl`:
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 131072
net.ipv4.ip_local_port_range = 1024 61999
Key Defects: Small half-connection queue, narrow port range, strict buffer limits.
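Before changing anything, it helps to snapshot the current values so each change can be compared against a baseline later; a minimal sketch (the parameter names are the ones discussed in this article):

```bash
# Snapshot the current values of the parameters discussed in this article
sysctl net.core.somaxconn \
       net.ipv4.tcp_max_syn_backlog \
       net.ipv4.tcp_max_tw_buckets \
       net.ipv4.ip_local_port_range \
       net.ipv4.tcp_rmem net.ipv4.tcp_wmem \
       net.ipv4.tcp_tw_reuse net.ipv4.tcp_fin_timeout
```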
2. In-Depth Diagnosis: Connection Status and Kernel Parameters
2.1 Connection Status Monitoring Techniques
Real-time TCP State Statistics
watch -n 1 "netstat -ant | awk '/^tcp/ {++S[\$NF]} END {for(a in S) print a, S[a]}'"
Output Example:
ESTABLISHED 790
TIME_WAIT 2464
SYN_RECV 32 # Focus on half-connections!
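On hosts carrying tens of thousands of sockets, `ss` is usually much faster than `netstat`; an equivalent per-state count is sketched below (note that `ss` prints state names such as ESTAB and TIME-WAIT):

```bash
# Count sockets per TCP state with ss (skip the header line)
ss -ant | awk 'NR > 1 {++s[$1]} END {for (k in s) print k, s[k]}'
```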
Half-Connection Special Check
# View SYN_RECV connection details
ss -ntp state syn-recv
# Monitor queue overflow
netstat -s | grep -i listen
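The same information can also be read as raw kernel counters with `nstat` from iproute2; a small sketch, assuming the standard TcpExt counter names:

```bash
# Show the accept-queue and SYN-drop counters, even when they are zero
nstat -az TcpExtListenOverflows TcpExtListenDrops
```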
2.2 Key Parameter Interpretation
| Parameter | Function | Default Value Issues |
| --- | --- | --- |
| `tcp_max_syn_backlog` | Half-connection queue length | 8192 (easily filled under burst traffic) |
| `somaxconn` | Full connection queue length | Must match the application's backlog parameter |
| `tcp_tw_reuse` | Quickly reuse TIME_WAIT ports | Off by default (leads to port exhaustion) |
| `tcp_rmem` / `tcp_wmem` | Read/write buffer size | Maximum only 6 MB (affects throughput) |
3. Tuning Solutions: From Parameters to Practice
3.1 Connection Management Optimization
Resolving TIME_WAIT Accumulation
echo "net.ipv4.tcp_tw_reuse = 1" >> /etc/sysctl.conf
echo "net.ipv4.tcp_max_tw_buckets = 262144" >> /etc/sysctl.conf
echo "net.ipv4.ip_local_port_range = 1024 65000" >> /etc/sysctl.conf
Shortening Connection Recycle Time
echo "net.ipv4.tcp_fin_timeout = 30" >> /etc/sysctl.conf
3.2 Queue and Buffer Optimization
Expanding Connection Queue
echo "net.ipv4.tcp_max_syn_backlog = 65535" >> /etc/sysctl.conf
echo "net.core.somaxconn = 65535" >> /etc/sysctl.conf
echo "net.core.netdev_max_backlog = 10000" >> /etc/sysctl.conf
Adjusting Memory Buffer
cat >> /etc/sysctl.conf <<EOF
net.ipv4.tcp_mem = 8388608 12582912 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
EOF
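For reference, the three `tcp_rmem`/`tcp_wmem` values are the minimum, default, and maximum buffer sizes in bytes, while `tcp_mem` is measured in pages (typically 4 KB each); `net.core.rmem_max`/`wmem_max` cap buffer sizes that applications request explicitly via setsockopt. A quick way to review the resulting limits once loaded:

```bash
# Review the buffer limits after the changes are loaded
sysctl net.ipv4.tcp_mem net.ipv4.tcp_rmem net.ipv4.tcp_wmem \
       net.core.rmem_max net.core.wmem_max
```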
3.3 Keepalive and Timeout Optimization
echo "net.ipv4.tcp_keepalive_time = 600" >> /etc/sysctl.conf
echo "net.ipv4.tcp_keepalive_intvl = 30" >> /etc/sysctl.conf
4. Verification and Monitoring
4.1 Real-time Monitoring Script
Connection Status Dashboard
#!/bin/bash
while true; do
clear
date
echo "---- TCP Status ----"
netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
echo "---- Half-Connection Queue ----"
ss -ltn | awk 'NR>1 {print "Listen Queue: Recv-Q=" $2 ", Send-Q=" $3}'
echo "---- Port Usage Rate ----"
echo "Used Ports: $(netstat -ant | grep -v LISTEN | awk '{print $4}' | cut -d: -f2 | sort -u | wc -l)/$((65000-1024))"
sleep 5
done
Kernel Alert Rules (Prometheus Example)
- alert: TCP_SYN_Dropped
  expr: increase(node_netstat_TcpExt_SyncookiesFailed{job="node"}[1m]) > 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "SYN queue overflow via failed syncookies (instance {{ $labels.instance }})"
4.2 Load Testing Recommendations
Use `wrk` to simulate high concurrency:
wrk -t16 -c10000 -d60s http://service:8080
Key Metrics to Monitor (see the combined watch sketch below):
- `SYN_RECV` count fluctuations
- Packet drop counts from `netstat -s`
- Memory usage (`free -m`)
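A simple way to keep an eye on all three at once while `wrk` runs, sketched here with a 2-second refresh (adjust the interval and grep patterns as needed):

```bash
# Watch half-connections, queue drops, and memory together during the load test
watch -n 2 '
  echo "SYN_RECV: $(ss -nt state syn-recv | tail -n +2 | wc -l)"
  echo "---- queue drops ----"
  netstat -s | grep -iE "overflow|drop"
  echo "---- memory ----"
  free -m
'
```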
5. Pitfall Guide
5.1 Common Misunderstandings
1. Blindly enabling `tcp_tw_recycle` will cause connection failures in NAT environments (the option was removed in Linux 4.12).
2. Excessively large buffers can cause OOM; size `tcp_mem` according to the machine's memory, for example by budgeting roughly one third of RAM (converted to 4 KB pages) for its upper limit:

# Calculate a memory budget for tcp_mem (in pages, 1 page = 4KB): ~1/3 of total RAM
echo $(( $(free -m | awk '/Mem:/ {print $2}') * 1024 / 4 / 3 ))
5.2 Parameter Dependencies
- `somaxconn` must be ≥ the application-layer `backlog`; for example, Nginx needs to be adjusted in step:
listen 80 backlog=65535;
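After reloading Nginx, the effective backlog can be read back from the listening socket, where it shows up as the Send-Q column; a quick check, assuming the listener is on port 80:

```bash
# On a LISTEN socket, Send-Q is the configured accept-queue limit (backlog)
ss -ltn 'sport = :80'
```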
6. Conclusion
Through the tuning practices in this article, we achieved:
1. 70% reduction in TIME_WAIT connections
2. Maximum concurrent connections increased to over 30,000
3. Network throughput doubled
Link: https://blog.csdn.net/weixin_44976692/article/details/147836227?spm=1001.2100.3001.7377&utm_medium=distribute.pc_feed_blog_category.none-task-blog-classify_tag-5-147836227-null-null.nonecase&depth_1-utm_source=distribute.pc_feed_blog_category.none-task-blog-classify_tag-5-147836227-null-null.nonecase