Optimization of File Descriptors and Connection Counts in Linux

In Linux system administration, file descriptors and connection counts are key parameters for performance tuning. A file descriptor is a handle that a process uses to access a resource, while the connection count directly bounds how many clients a high-concurrency application can serve. According to a report by Red Hat, optimizing file descriptors for high-concurrency servers can improve performance by 30%-50%.

1. Overview of File Descriptors and Connection Counts

1.1 What is a File Descriptor?

A file descriptor (FD) is a non-negative integer that the Linux kernel assigns to a process for accessing files, sockets, or devices. By default every process starts with three FDs: 0 (stdin), 1 (stdout), and 2 (stderr). Each newly opened file receives the lowest unused number.

Characteristics of FDs:

  • Range: 0 to the process's soft RLIMIT_NOFILE minus 1 (the soft limit commonly defaults to 1024).
  • Type: file fd, socket fd, pipe fd.
  • Lifecycle: automatically closed when the process exits.
  • Limitations: process-level and system-level.

FD embodies the philosophy of “everything is a file”.
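The characteristics above can be observed directly from a shell. The following is a minimal sketch, assuming a Linux system with /proc mounted; the scratch file from mktemp and the choice of descriptor 3 are arbitrary illustrations, not part of the original text.

```shell
#!/bin/sh
# List the descriptors held by the current shell; 0, 1 and 2 are normally present.
ls -l /proc/$$/fd

# Open a scratch file on descriptor 3 (the file is an arbitrary example).
tmpfile=$(mktemp)
exec 3>"$tmpfile"
echo "written through fd 3" >&3

# Child processes inherit open descriptors, so readlink can resolve fd 3
# through its own /proc/self/fd directory.
fd3_target=$(readlink /proc/self/fd/3)
echo "fd 3 -> $fd3_target"

# Close the descriptor explicitly; descriptors left open are how leaks start.
exec 3>&-
rm -f "$tmpfile"
```

Each entry under /proc/<PID>/fd is a symbolic link to the file, socket, or pipe behind that descriptor, which is the "everything is a file" idea made visible.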

1.2 What is Connection Count?

The connection count refers to the number of open network connections, mainly TCP (UDP is connectionless, but every UDP socket still occupies an FD). In high-concurrency scenarios such as web servers, the connection count is limited by FDs, because each connection occupies one socket FD.

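Common TCP Connection States:

  • ESTABLISHED: the handshake is complete and data can flow.
  • TIME_WAIT: the side that closed first waits (60 seconds on Linux) before the connection is fully released.
  • SYN_RECV: half-open connection; a SYN was received and the final ACK is still pending.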

Optimizing connection counts is central to high-concurrency systems.

1.3 Relationship Between File Descriptors and Connection Counts

The connection count depends on FDs, with each connection being a socket FD. Insufficient FDs can lead to “Too many open files” errors, affecting connections.

System Limitations:

  • /proc/sys/fs/file-max: global FD limit.
  • ulimit -n: process FD limit.

1.4 Importance of Optimization

Optimizing FDs and connection counts can:

  • Support high concurrency: web servers handling tens of thousands of connections.
  • Stability: avoid resource exhaustion and crashes.
  • Performance: reduce error handling overhead.
  • Security: limit process FDs to prevent abuse.
  • Cost: optimize resource utilization.

For example, Nginx inherits the common default soft limit of 1024 FDs unless the limit is raised, which is too low for high concurrency.

1.5 Typical Scenarios for Optimization

  • Web Servers: Nginx with high connection counts.
  • Databases: MySQL connection pools.
  • Cloud Environments: EC2 high-concurrency applications.
  • Game Servers: real-time connections.
  • IoT: multiple device connections.

1.6 Challenges of Optimization

  • Complex Configuration: process/system-level limitations.
  • Compatibility: differences across distributions.
  • Monitoring: real-time tracking of usage.
  • Risks: excessively high limits can lead to resource exhaustion.
  • Debugging: FD leaks are hard to locate.

1.7 Goals of Optimization

  • High Connections: support tens of thousands of concurrent connections.
  • Stability: avoid FD exhaustion.
  • Monitoring: real-time alerts.
  • Automation: script configuration.
  • Security: combined with permission control.

2. Principles of File Descriptors and Connection Counts

2.1 Principles of File Descriptors

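An FD is a per-process integer index into the process's descriptor table, whose entries point to open file objects in the kernel. When a file is opened, the kernel allocates an FD and returns it to the process.

Kernel structures and functions:

  • struct file: the kernel object representing an open file.
  • filp_open(): opens a file from inside the kernel.
  • fdget(): looks up the struct file behind an FD.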

Limitations:

  • RLIMIT_NOFILE: per-process soft and hard limits.
  • fs.nr_open: upper bound to which any process's RLIMIT_NOFILE can be raised.

Leaks: failing to close() leads to FD exhaustion.
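The relationship between the soft limit, the hard limit, and fs.nr_open can be checked from a shell. This is a minimal sketch, assuming Linux with /proc mounted; the function name can_raise_to and the target value 65535 are illustrative choices.

```shell
#!/bin/sh
# Report whether this shell could raise its soft nofile limit to a target value.
# The soft limit may go up to the hard limit, and the hard limit cannot be
# raised above fs.nr_open.
can_raise_to() {
    target=$1
    hard=$(ulimit -Hn)
    nr_open=$(cat /proc/sys/fs/nr_open)
    # Treat an "unlimited" hard limit as bounded by fs.nr_open.
    [ "$hard" = "unlimited" ] && hard=$nr_open
    [ "$target" -le "$hard" ] && [ "$target" -le "$nr_open" ]
}

echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn) nr_open=$(cat /proc/sys/fs/nr_open)"
if can_raise_to 65535; then
    echo "65535 fits within the current hard limit"
else
    echo "65535 needs a higher hard limit (root, limits.conf, or LimitNOFILE)"
fi
```

An unprivileged process can move its soft limit freely up to the hard limit, but only a privileged one can raise the hard limit itself.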

2.2 Principles of Connection Counts

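Connections are a subset of a process's socket FDs. On the server side, each accepted TCP connection receives its own FD from accept(); the listening socket holds one more.

  • Half-Connection: SYN queue (tcp_max_syn_backlog).
  • Full Connection: accept queue (somaxconn).

TIME_WAIT: the state a socket enters after an active close. The application has already released the FD by then, so TIME_WAIT sockets consume kernel memory and local ports rather than FDs.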

Parameters:

  • net.ipv4.tcp_max_tw_buckets: TIME_WAIT limit.
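To see how connections are distributed across these states, the output of ss can be tallied per state. The sketch below assumes the iproute2 ss tool is installed; count_tcp_states is a hypothetical helper name, and it reads ss output on standard input so it can also be fed saved output. Note that ss abbreviates state names (ESTAB, TIME-WAIT, SYN-RECV).

```shell
#!/bin/sh
# Tally TCP sockets per state from "ss -tan" output read on stdin.
# The first line is the column header, so it is skipped.
count_tcp_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Live usage, only attempted when ss is available.
command -v ss >/dev/null 2>&1 && ss -tan | count_tcp_states
```

A large TIME-WAIT count next to a small ESTAB count usually points to many short-lived connections rather than a shortage of FDs.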

2.3 Relationship Between FDs and Connections

Each connection is an FD, and high connections require high FD limits.

System FD Allocation:

  • fs.file-max: system-wide maximum number of open file handles.
  • fs.nr_open: maximum FD limit that a single process can be given.

2.4 Summary of Principles

FDs are resource handles, connection counts depend on FDs, and optimization requires adjusting limits and parameters.

3. Configuration of File Descriptors and Connection Counts

3.1 Viewing Current Limits

  • Process FDs:

    ulimit -n
    grep "Max open files" /proc/<PID>/limits
    
  • System FDs:

    cat /proc/sys/fs/file-max
    cat /proc/sys/fs/nr_open
    sysctl fs.file-max
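Limits are more useful when read next to actual usage. The following sketch prints, for one process, the number of open FDs followed by its soft and hard limits. It assumes Linux with /proc mounted and permission to inspect the target process; fd_usage is a hypothetical helper name.

```shell
#!/bin/sh
# Print "used soft hard" for the given PID.
fd_usage() {
    pid=$1
    # Every entry in /proc/<PID>/fd is one open descriptor.
    used=$(ls "/proc/$pid/fd" | wc -l)
    # Columns 4 and 5 of the "Max open files" row are the soft and hard limits.
    limits=$(awk '/^Max open files/ { print $4, $5 }' "/proc/$pid/limits")
    echo "$used $limits"
}

# Example: inspect the current shell.
fd_usage "$$"
```

Inspecting a process owned by another user generally requires root.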
    

3.2 Adjusting Process FDs

  1. Temporary:

    ulimit -Hn 65535  # hard limit first; raising it requires root
    ulimit -Sn 65535  # soft limit, which cannot exceed the hard limit
    
  2. Permanent (login sessions; applied by pam_limits at the next login, not to systemd services): Edit /etc/security/limits.conf:

    * soft nofile 65535
    * hard nofile 65535
    root soft nofile 65535
    root hard nofile 65535
    
  3. Service Level (systemd):

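    sudo systemctl edit nginx
    

    This opens a drop-in file (/etc/systemd/system/nginx.service.d/override.conf) without replacing the packaged unit. Add:

    [Service]
    LimitNOFILE=65535
    
    Then reload systemd and restart the service:

    sudo systemctl daemon-reload
    sudo systemctl restart nginx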
    

3.3 Adjusting System FDs

Edit /etc/sysctl.conf:

fs.file-max = 2097152
fs.nr_open = 2097152

Then apply the change:

sudo sysctl -p

3.4 Connection Count Optimization

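  1. TCP Parameters: Edit /etc/sysctl.conf:

    net.core.somaxconn = 65535
    net.ipv4.tcp_max_syn_backlog = 8192
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_tw_reuse = 1
    
    Do not set net.ipv4.tcp_tw_recycle: it breaks clients behind NAT and was removed in Linux 4.12. Note that tcp_tw_reuse only applies to outgoing connections, and tcp_fin_timeout shortens FIN_WAIT_2, not TIME_WAIT.

    Apply the change:

    sudo sysctl -p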
    
  2. Service Configuration (e.g., Nginx):

    worker_rlimit_nofile 65535;
    

3.5 Verifying Configuration

  • Check Process:

    cat /proc/<PID>/limits
    
  • Test Connections:

    ab -n 10000 -c 1000 http://localhost/
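Verification can also be scripted by comparing live kernel values against the ones written to /etc/sysctl.conf. This is a minimal sketch: check_sysctl is a hypothetical helper, the optional third argument (an alternative root directory) exists only to make the function easy to test, and the expected values are the ones used in this article.

```shell
#!/bin/sh
# Compare one sysctl key against an expected value.
# Returns 0 on match, 1 on mismatch, 2 when the key does not exist.
check_sysctl() {
    key=$1
    expected=$2
    root=${3:-/proc/sys}
    # A sysctl key maps to a path under /proc/sys with dots replaced by slashes.
    path="$root/$(echo "$key" | tr . /)"
    actual=$(cat "$path" 2>/dev/null) || { echo "MISSING $key"; return 2; }
    if [ "$actual" = "$expected" ]; then
        echo "OK   $key = $actual"
    else
        echo "DIFF $key = $actual (expected $expected)"
        return 1
    fi
}

status=0
check_sysctl fs.file-max 2097152 || status=1
check_sysctl net.core.somaxconn 65535 || status=1
echo "overall status: $status"
```

A non-zero overall status means at least one setting was not applied, which is worth checking before running a load test.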
    

4. Monitoring File Descriptors and Connection Counts

4.1 lsof

Usage:

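ls /proc/<PID>/fd | wc -l   # exact number of open FDs
lsof -p <PID> | wc -l       # also lists cwd, txt, and mem entries, so it overestimates
lsof -i :80  # port connections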

4.2 ss/netstat

Usage:

ss -tunap | wc -l
netstat -tunap | wc -l

4.3 Prometheus

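Monitoring: run node_exporter and build a Grafana dashboard on its node_filefd_allocated and node_filefd_maximum metrics.

Alerts: fire when FD usage (allocated / maximum) exceeds 80%.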

4.4 Script Monitoring

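#!/bin/bash
# Alert when the number of allocated file handles crosses a fixed threshold.
ALERT_EMAIL="admin@example.com"  # placeholder address; replace with your own
FD=$(awk '{ print $1 }' /proc/sys/fs/file-nr)  # first field: allocated handles
if [ "$FD" -gt 100000 ]; then
    mail -s "High FD Alert" "$ALERT_EMAIL" <<< "FD: $FD"
fi

Run it periodically from cron.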
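The "FD usage >80%" rule from section 4.3 can also be expressed in a script as a percentage instead of a fixed count. The sketch below reads /proc/sys/fs/file-nr, whose three fields are the allocated handles, the allocated but unused handles, and the fs.file-max value. The function names and the optional file argument are illustrative assumptions.

```shell
#!/bin/sh
# Print "allocated max percent" from a file-nr style file.
fd_system_usage() {
    file_nr=${1:-/proc/sys/fs/file-nr}
    awk '{ pct = ($3 > 0) ? $1 * 100 / $3 : 0
           printf "%s %s %d\n", $1, $3, pct }' "$file_nr"
}

# Succeed (exit status 0) when usage is at or above the threshold percentage.
fd_system_alert() {
    threshold=${1:-80}
    pct=$(fd_system_usage "$2" | awk '{ print $3 }')
    [ "$pct" -ge "$threshold" ]
}

if fd_system_alert 80; then
    echo "ALERT: system-wide FD usage is at or above 80%"
fi
```

A percentage keeps the check valid after fs.file-max is changed, whereas a fixed count has to be edited by hand.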

5. Case Analysis

5.1 Case 1: Nginx High Concurrency FD Exhaustion

Scenario: Nginx reports “too many open files”.

Diagnosis:

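lsof -p "$(pgrep -d, nginx)" | wc -l                   # lsof needs a comma-separated PID list
grep "Max open files" /proc/$(pgrep -o nginx)/limits   # limit of the running master, not of your shell

Optimization: Set LimitNOFILE=65535 in the systemd unit for Nginx and worker_rlimit_nofile 65535 in nginx.conf. limits.conf alone does not affect services started by systemd.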

Result: improved concurrency.

5.2 Case 2: Insufficient MySQL Connection Count

Scenario: MySQL “too many connections”.

Diagnosis:

SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';

Optimization: my.cnf:

[mysqld]
max_connections = 1000

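Increase nofile as well (LimitNOFILE in the systemd unit, or limits.conf when MySQL is not started by systemd), because every client connection and open table consumes an FD.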

Result: connections normalized.

5.3 Case 3: Custom Application FD Leak

Scenario: the application's open FD count (and with it memory use) grows continuously.

Diagnosis:

watch -n 1 "ls /proc/<PID>/fd | wc -l"
strace -f -p <PID> -e trace=open,openat,close

Optimization: add the missing close(fd) call on every code path, including error paths.

Result: leak fixed.
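When hunting a leak, it helps to know which kind of descriptor is accumulating before reading strace output. The following sketch groups a process's FDs by the type of their target, assuming Linux with /proc mounted and permission to inspect the process; fd_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Count a process's open descriptors by target type.
fd_summary() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Skip descriptors that were closed between listing and reading.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            socket:*)     echo socket ;;
            pipe:*)       echo pipe ;;
            anon_inode:*) echo anon_inode ;;
            *)            echo file ;;
        esac
    done | sort | uniq -c
}

# Example: summarize the current shell.
fd_summary "$$"
```

Running it twice a few minutes apart shows which category grows: a rising socket count suggests unclosed connections, while a rising file count suggests unclosed files.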

6. Common Problem Solutions

6.1 FD Exhaustion

Cause: high concurrency or an FD leak.

Solution: increase the nofile limit; if the count keeps growing under steady load, look for a leak instead.

6.2 Connection Backlog Full

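Cause: tcp_max_syn_backlog or somaxconn set too low.

Solution: tcp_max_syn_backlog=8192 and net.core.somaxconn=65535. The backlog passed to listen() by the application must be raised as well, since the kernel uses the smaller of the two values.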

6.3 TIME_WAIT Accumulation

Cause: high closure rate.

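Solution: tcp_tw_reuse=1 (helps outgoing connections only) and connection reuse through keep-alive or pooling. tcp_fin_timeout=15 shortens FIN_WAIT_2, not TIME_WAIT.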

6.4 Configuration Not Taking Effect

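Cause: the service was not restarted, or the limit was set in limits.conf for a service that systemd starts.

Solution: daemon-reload and restart, then confirm the new value in /proc/<PID>/limits.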

6.5 System Limits Too Low

Solution: file-max=2097152.

7. Optimization and Best Practices

7.1 Best Practices

  • Least Privilege: run services as non-root.
  • Monitoring Alerts: notify when FD >80%.
  • Automation: script checks for ulimit.
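The monitoring and automation practices above can be combined into a scan that lists every process close to its own limit. This is a sketch under the assumption of a Linux /proc filesystem; fd_hogs is a hypothetical helper, and processes the caller may not inspect are skipped silently (run it as root for full coverage).

```shell
#!/bin/sh
# Print "pid used soft percent" for every process whose FD usage is at or
# above the given percentage of its soft limit (default 80).
fd_hogs() {
    threshold=${1:-80}
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        # Skip processes whose descriptor directory we may not read.
        [ -r "$dir/fd" ] || continue
        used=$(ls "$dir/fd" 2>/dev/null | wc -l)
        soft=$(awk '/^Max open files/ { print $4 }' "$dir/limits" 2>/dev/null)
        # Skip processes that exited mid-scan or report a non-numeric limit.
        case $soft in ''|*[!0-9]*) continue ;; esac
        [ "$soft" -gt 0 ] || continue
        pct=$((used * 100 / soft))
        if [ "$pct" -ge "$threshold" ]; then
            echo "$pid $used $soft $pct"
        fi
    done
}

fd_hogs 80
```

Scheduling this from cron and alerting on any output gives a per-process counterpart to the system-wide check.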

7.2 Performance Optimization

  • epoll (kqueue on BSD): efficient event handling for many FDs in a single thread.
  • Connection Pool: reuse connections.

7.3 Security Considerations

  • Limit process FDs to prevent DoS.

7.4 Future Trends

  • io_uring: asynchronous I/O that reduces per-operation system call overhead; its registered files also avoid repeated FD lookups.

8. Conclusion

Optimizing file descriptors and connection counts in Linux is crucial for high-concurrency systems. Through ulimit, sysctl, and monitoring, it is possible to support tens of thousands of connections.
