Linux Web Service Log Statistics Commands

Table of Contents

  • Apache Log Statistics

  • Nginx Log Statistics

  • Web Service Status Statistics

  • Count Statistics

This article collects common statistics commands for analyzing Apache and Nginx access logs in day-to-day Linux operations.
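All of the field numbers used below ($1, $4, $7, ...) assume the combined log format that Apache and Nginx both write by default. A sample line, annotated with the fields the commands rely on:

# $1 = client IP, $4 = [timestamp, $7 = request path, $9 = status code, $10 = response bytes, $12 = start of the User-Agent
192.168.1.2 - - [21/Nov/2019:03:40:26 +0800] "GET /index.php HTTP/1.1" 200 5326 "-" "Mozilla/5.0 (X11; Linux x86_64)"

If your log format differs, adjust the field numbers accordingly.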

Apache Log Statistics

# List the top 20 IPs with the most visits today
[root@localhost httpd]# cut -d- -f 1 access_log | sort | uniq -c | sort -rn | head -20
# Check how many unique IPs visited today
[root@localhost httpd]# awk '{print $1}' access_log | sort | uniq | wc -l
# Check the total number of visits to a specific page
[root@localhost httpd]# cat access_log | grep "index.php" | wc -l
# Check how many pages each IP visited
[root@localhost httpd]# awk '{++S[$1]} END {for (a in S) print a,S[a]}' access_log
# Sort the number of pages visited by each IP in ascending order
[root@localhost httpd]# awk '{++S[$1]} END {for (a in S) print S[a],a}' access_log | sort -n
# Check which pages a specific IP visited
[root@localhost httpd]# grep "^192\.168\.1\.2 " access_log | awk '{print $1,$7}'
# Count unique visiting IPs excluding search engines (keeps only user agents that begin with "Mozilla)
[root@localhost httpd]# awk '{print $12,$1}' access_log | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
# Check how many unique IPs visited during the 03:00 hour on 21/Nov/2019
[root@localhost httpd]# awk '{print $4,$1}' access_log | grep "21/Nov/2019:03" | awk '{print $2}' | sort | uniq | wc -l
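Several of the counts above can also be produced in a single pass over the log. A minimal awk sketch, assuming the combined format shown earlier (awk coerces "-" byte counts to 0):

# One-pass summary: unique IPs, total requests, total bytes sent (sketch)
[root@localhost httpd]# awk '!seen[$1]++ {uniq++} {req++; bytes+=$10} END {print "unique IPs:", uniq; print "requests:", req; print "bytes sent:", bytes}' access_log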

Nginx Log Statistics

# List all unique client IPs
[root@localhost httpd]# awk '{print $1}' access_log | sort -n | uniq
# Check the top 100 most frequently accessed IPs
[root@localhost httpd]# awk '{print $1}' access_log | sort -n | uniq -c | sort -rn | head -n 100
# Check IPs that accessed more than 100 times
[root@localhost httpd]# awk '{print $1}' access_log | sort -n | uniq -c | awk '{if ($1 > 100) print $0}' | sort -rn
# Query detailed access statistics for a specific IP, sorted by access frequency
[root@localhost httpd]# grep '192\.168\.1\.2' access_log | awk '{print $7}' | sort | uniq -c | sort -rn | head -n 100
# Page access statistics: check the top 100 most frequently accessed pages
[root@localhost httpd]# awk '{print $7}' access_log | sort | uniq -c | sort -rn | head -n 100
# Page access statistics: check the top 100 most frequently accessed pages, excluding .php and .py
[root@localhost httpd]# grep -E -v "\.php|\.py" access_log | awk '{print $7}' | sort | uniq -c | sort -rn | head -n 100
# Page access statistics: check pages accessed more than 100 times
[root@localhost httpd]# cat access_log | cut -d ' ' -f 7 | sort | uniq -c | awk '{if ($1 > 100) print $0}'
# Page access statistics: check the most accessed pages in the last 1000 records
[root@localhost httpd]# tail -1000 access_log | awk '{print $7}' | sort | uniq -c | sort -nr
# Per-second request statistics: the top 100 time points by requests per second
[root@localhost httpd]# awk '{print $4}' access_log | cut -c14-21 | sort | uniq -c | sort -nr | head -n 100
# Per-minute request statistics: the top 100 time points by requests per minute
[root@localhost httpd]# awk '{print $4}' access_log | cut -c14-18 | sort | uniq -c | sort -nr | head -n 100
# Per-hour request statistics: the top 100 time points by requests per hour
[root@localhost httpd]# awk '{print $4}' access_log | cut -c14-15 | sort | uniq -c | sort -nr | head -n 100
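The per-hour counts above are easy to turn into a rough text histogram in one pass. A sketch, assuming the timestamp layout shown earlier; the scale of one * per 500 requests is an arbitrary choice:

# Hourly request histogram (sketch)
[root@localhost httpd]# awk '{c[substr($4,14,2)]++} END {for (h=0; h<24; h++) {hh=sprintf("%02d",h); printf "%s:00 %7d ", hh, c[hh]; for (i=0; i<c[hh]/500; i++) printf "*"; print ""}}' access_log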

Web Service Status Statistics

# Count website crawlers
[root@localhost httpd]# grep -E 'Googlebot|Baiduspider' access_log | awk '{print $1}' | sort | uniq
# Find accesses whose user agent is not a common browser
[root@localhost httpd]# cat access_log | grep -v -E 'MSIE|Firefox|Chrome|Opera|Safari|Gecko|Maxthon' | sort | uniq -c | sort -r -n | head -n 100
# Count subnet (/24) distribution
[root@localhost httpd]# cat access_log | awk '{print $1}' | awk -F'.' '{print $1"."$2"."$3".0"}' | sort | uniq -c | sort -r -n | head -n 200
# Count domains (assumes a log format that records the host in field 2; in the default combined format this field is "-")
[root@localhost httpd]# cat access_log | awk '{print $2}' | sort | uniq -c | sort -rn | more
# Count HTTP status codes
[root@localhost httpd]# cat access_log | awk '{print $9}' | sort | uniq -c | sort -rn | more
# URL access count statistics
[root@localhost httpd]# cat access_log | awk '{print $7}' | sort | uniq -c | sort -rn | more
# Count accesses to URLs that carry query strings
[root@localhost httpd]# cat access_log | awk '{print $7}' | grep -E '\?|&' | sort | uniq -c | sort -rn | more
# File traffic statistics (sum of bytes sent per URL)
[root@localhost httpd]# cat access_log | awk '{sum[$7]+=$10} END {for (i in sum) print sum[i],i}' | sort -rn | more
# File traffic statistics, counting only requests that returned 200
[root@localhost httpd]# grep ' 200 ' access_log | awk '{sum[$7]+=$10} END {for (i in sum) print sum[i],i}' | sort -rn | more
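The sum-by-key pattern used for file traffic extends to any field. For example, a sketch that groups request count and total bytes by HTTP status code:

# Requests and bytes sent, grouped by status code (sketch)
[root@localhost httpd]# awk '{hits[$9]++; bytes[$9]+=$10} END {for (s in hits) print s, hits[s], bytes[s]}' access_log | sort -k3 -rn
# Output columns: status code, request count, total bytes sent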

Count Statistics

# Check the number of visits to a specific page
[root@localhost httpd]# grep "/index.php" log_file | wc -l
# Check how many pages each IP visited
[root@localhost httpd]# awk '{++S[$1]} END {for (a in S) print a,S[a]}' log_file
# Sort the number of pages visited by each IP in ascending order
[root@localhost httpd]# awk '{++S[$1]} END {for (a in S) print S[a],a}' log_file | sort -n
# Check which pages a specific IP visited
[root@localhost httpd]# grep "^111\.111\.111\.111 " log_file | awk '{print $1,$7}'
# Count unique visiting IPs excluding search engines (keeps only user agents that begin with "Mozilla)
[root@localhost httpd]# awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
# Check how many unique IPs visited during the hour of 21/Jun/2018:14
[root@localhost httpd]# awk '{print $4,$1}' log_file | grep "21/Jun/2018:14" | awk '{print $2}' | sort | uniq | wc -l
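Rather than grepping one hour at a time, the same unique-IP count can be produced for every hour of the log in a single pass. A sketch:

# Unique visiting IPs per hour, for the whole log (sketch)
[root@localhost httpd]# awk '{hour=substr($4,2,14)} !seen[hour "|" $1]++ {c[hour]++} END {for (h in c) print h, c[h]}' log_file | sort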
# Count crawlers
[root@localhost httpd]# grep -E 'Googlebot|Baiduspider' /www/logs/access.2019-02-23.log | awk '{print $1}' | sort | uniq
# Find accesses whose user agent is not a common browser
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | grep -v -E 'MSIE|Firefox|Chrome|Opera|Safari|Gecko|Maxthon' | sort | uniq -c | sort -r -n | head -n 100
# IP statistics: top 10 IPs on a given day
[root@localhost httpd]# grep '23/May/2019' /www/logs/access.2019-02-23.log | awk '{print $1}' | sort | uniq -c | sort -r -n | head -n 10
   2206 219.136.134.13
   1497 182.34.15.248
   1431 211.140.143.100
   1431 119.145.149.106
   1427 61.183.15.179
   1427 218.6.8.189
   1422 124.232.150.171
   1421 106.187.47.224
   1420 61.160.220.252
   1418 114.80.201.18
# Count subnets (/24)
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | awk '{print $1}' | awk -F'.' '{print $1"."$2"."$3".0"}' | sort | uniq -c | sort -r -n | head -n 200
# Count domains (assumes a log format that records the host in field 2)
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | awk '{print $2}' | sort | uniq -c | sort -rn | more
# HTTP status codes
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | awk '{print $9}' | sort | uniq -c | sort -rn | more
# URL statistics
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | awk '{print $7}' | sort | uniq -c | sort -rn | more
# File traffic statistics (sum of bytes sent per URL)
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | awk '{sum[$7]+=$10} END {for (i in sum) print sum[i],i}' | sort -rn | more
# File traffic statistics, counting only requests that returned 200
[root@localhost httpd]# grep ' 200 ' /www/logs/access.2019-02-23.log | awk '{sum[$7]+=$10} END {for (i in sum) print sum[i],i}' | sort -rn | more
# Count accesses to URLs that carry query strings
[root@localhost httpd]# cat /www/logs/access.2019-02-23.log | awk '{print $7}' | grep -E '\?|&' | sort | uniq -c | sort -rn | more
# Find the slowest-running scripts (assumes the request time is logged as the last field, after the user agent)
[root@localhost httpd]# grep -v '0$' /www/logs/access.2019-02-23.log | awk -F '" ' '{print $4" "$1}' | awk '{print $1" "$8}' | sort -n -k 1 -r | uniq > /tmp/slow_url.txt
# Extract IP and URL in real time for a specific page
[root@localhost httpd]# tail -f /www/logs/access.2019-02-23.log | grep '/test.html' | awk '{print $1" "$7}'
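Several of these one-liners fit naturally into a small report script. A minimal sketch; the script name daily_report.sh and the default log path are hypothetical, so pass your own log as the first argument:

#!/bin/bash
# daily_report.sh - headline statistics for one access log (sketch)
LOG=${1:-/www/logs/access.log}
echo "== unique IPs =="
awk '{print $1}' "$LOG" | sort -u | wc -l
echo "== top 10 IPs =="
awk '{print $1}' "$LOG" | sort | uniq -c | sort -rn | head -n 10
echo "== top 10 URLs =="
awk '{print $7}' "$LOG" | sort | uniq -c | sort -rn | head -n 10
echo "== status codes =="
awk '{print $9}' "$LOG" | sort | uniq -c | sort -rn

# Usage:
[root@localhost httpd]# bash daily_report.sh /www/logs/access.2019-02-23.log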

Link: https://www.cnblogs.com/LyShark/p/12500145.html
