Hello everyone, this is Linux Enthusiast Engineer. In this article, I will share 20 hardcore commands for efficiently analyzing log files with the Shell. Over the May Day holiday, these commands got me through almost every log-troubleshooting task; they are simple but effective. Let's take a look!
Source: https://segmentfault.com/a/1190000009745139
I run a small website on Alibaba Cloud’s ECS, and I occasionally analyze my server logs to check the website’s traffic and see if there are any hackers causing damage! Therefore, I have collected and organized some server log analysis commands for everyone to try!
1. Check how many unique IPs accessed:
awk '{print $1}' log_file|sort|uniq|wc -l
2. Check how many times a specific page was accessed:
grep "/index.php" log_file | wc -l
3. Check how many pages each IP accessed:
awk '{++S[$1]} END {for (a in S) print a,S[a]}' log_file > log.txt
sort -n -t ' ' -k 2 log.txt # Further sort with sort
4. Sort the number of pages accessed by each IP in ascending order:
awk '{++S[$1]} END {for (a in S) print S[a],a}' log_file | sort -n
5. Check which pages a specific IP accessed:
grep '^111\.111\.111\.111' log_file | awk '{print $1,$7}'
6. Exclude search-engine spiders and count unique browser IPs (keeps only user agents beginning with "Mozilla"):
awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
7. Check how many unique IPs accessed during the hour of 14:00 on August 16, 2015:
awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}'| sort | uniq | wc -l
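To sanity-check the field positions, the same pipeline can be run on a couple of fabricated log lines (the IPs, timestamps, and file name below are made up for illustration):

```shell
# Fabricated combined-log fragments: $1 is the client IP, $4 the timestamp
printf '%s\n' \
  '1.1.1.1 - - [16/Aug/2015:14:01:00 +0800] "GET / HTTP/1.1" 200 512' \
  '2.2.2.2 - - [16/Aug/2015:14:59:59 +0800] "GET / HTTP/1.1" 200 512' \
  '1.1.1.1 - - [16/Aug/2015:15:00:00 +0800] "GET / HTTP/1.1" 200 512' > /tmp/sample.log

# Only the two 14:xx lines match, and they carry two distinct IPs
awk '{print $4,$1}' /tmp/sample.log | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l
# → 2
```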
8. View the top ten IP addresses:
awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10
uniq -c groups identical adjacent lines and prefixes each with its count
cat access.log|awk '{print $1}'|sort|uniq -c|sort -nr|head -10
cat access.log|awk '{counts[$11]+=1}; END {for(url in counts) print counts[url], url}'
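The uniq -c grouping and the awk associative-array form are interchangeable; a quick sketch on made-up IPs (hypothetical sample data, not a real log) shows both produce the same tallies:

```shell
# Hypothetical sample data: one IP per line, as "awk '{print $1}'" would emit
printf '1.2.3.4\n1.2.3.4\n5.6.7.8\n1.2.3.4\n' > /tmp/ips.txt

# uniq -c needs sorted input; it prints "count value"
sort /tmp/ips.txt | uniq -c | sort -nr

# The awk associative-array form gives the same tallies without pre-sorting
awk '{c[$1]++} END {for (ip in c) print c[ip], ip}' /tmp/ips.txt | sort -nr
# → 3 1.2.3.4
#   1 5.6.7.8
```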
9. The ten files or pages with the most accesses:
cat log_file|awk '{print $11}'|sort|uniq -c|sort -nr | head -10
Top 20 IPs with the highest traffic:
cat log_file|awk '{print $1}'|sort|uniq -c|sort -nr|head -20
awk '{print $1}' log_file |sort -n -r |uniq -c | sort -n -r | head -20
10. Access counts by subdomain based on referer, slightly inaccurate:
cat access.log | awk '{print $11}' | sed -e 's/http:\/\///' -e 's/\/.*//' | sort | uniq -c | sort -rn | head -20
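The sed step strips the leading http:// and then everything after the first remaining slash, leaving the bare domain; the idea can be checked on fabricated referer values (the example.com URLs below are made up):

```shell
# Hypothetical referer values as they might appear in field $11
printf 'http://blog.example.com/post/1\nhttp://blog.example.com/post/2\nhttp://www.example.com/\n' |
  sed -e 's/http:\/\///' -e 's/\/.*//' |   # keep only the domain part
  sort | uniq -c | sort -rn                # group and rank by count
```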
11. List the largest files by transfer size:
cat www.access.log |awk '($7~/\.php/){print $10 " " $1 " " $4 " " $7}'|sort -nr|head -100
12. List pages with output greater than 200000 bytes (approximately 200 KB) and their occurrence counts:
cat www.access.log |awk '($10 > 200000 && $7~/\.php/){print $7}'|sort -n|uniq -c|sort -nr|head -100
13. If the last column of the log records the page file transfer time, list the pages that took the longest to reach the client:
cat www.access.log |awk '($7~/\.php/){print $NF " " $1 " " $4 " " $7}'|sort -nr|head -100
14. List the pages that took the longest (over 60 seconds) and their occurrence counts:
cat www.access.log |awk '($NF > 60 && $7~/\.php/){print $7}'|sort -n|uniq -c|sort -nr|head -100
15. List files with transfer times exceeding 30 seconds:
cat www.access.log |awk '($NF > 30){print $7}'|sort -n|uniq -c|sort -nr|head -20
16. List the number of running processes for each process on the current server, sorted in descending order:
ps -ef | awk -F ' ' '{print $8 " " $9}' |sort | uniq -c |sort -nr |head -20
17. Check the current number of concurrent accesses in Apache:
Compare the difference with the MaxClients number in httpd.conf
netstat -an | grep ESTABLISHED | wc -l
18. You can use the following parameters to view data:
ps -ef|grep httpd|wc -l
1388
Counts the number of httpd processes; with Apache's process-per-request model, each request is served by its own process. Here 1388 httpd processes are running, and Apache adjusts this number automatically according to load.
netstat -nat|grep -i "80"|wc -l
4341
netstat -nat prints the system's current network connections, grep -i "80" extracts the ones involving port 80, and wc -l counts them. The number returned is the total number of connections touching port 80.
netstat -na|grep ESTABLISHED|wc -l
376
netstat -na prints the current network connections, grep ESTABLISHED keeps only the established ones, and wc -l counts them. The number returned is the total number of established connections.
netstat -nat | grep ESTABLISHED
Lists the detailed records of all established connections (append | wc -l to count them instead).
19. Output the connection count for each IP, as well as the total connection count for each state:
netstat -n | awk '/^tcp/ {n=split($(NF-1),array,":");if(n<=2)++S[array[1]];else++S[array[4]];++s[$NF];++N} END {for(a in S){printf("%-20s %s\n", a, S[a]);++I}printf("%-20s %s\n","TOTAL_IP",I);for(a in s) printf("%-20s %s\n",a, s[a]);printf("%-20s %s\n","TOTAL_LINK",N);}'
20. Other collections:
Analyze the log file for the top 20 URLs accessed on 2012-05-04 and sort them:
cat access.log |grep '04/May/2012'| awk '{print $11}'|sort|uniq -c|sort -nr|head -20
Query the IP addresses that accessed URLs containing www.abc.com:
cat access_log | awk '($11~/www\.abc\.com/){print $1}'|sort|uniq -c|sort -nr
Get the top 10 IP addresses with the highest access, and can also query by time:
cat linewow-access.log|awk '{print $1}'|sort|uniq -c|sort -nr|head -10
Query the log for the time period:
cat log_file | egrep '15/Aug/2015|16/Aug/2015' |awk '{print $1}'|sort|uniq -c|sort -nr|head -10
Analyze the IPs accessing “/index.php?g=Member&m=Public&a=sendValidCode” from 2015/8/15 to 2015/8/16 in descending order:
cat log_file | egrep '15/Aug/2015|16/Aug/2015' | awk '{if($7 == "/index.php?g=Member&m=Public&a=sendValidCode") print $1,$7}'|sort|uniq -c|sort -nr
($7~/\.php/) selects requests whose $7 field contains ".php"; combined with $NF (the transfer time) this lists the 100 most time-consuming PHP pages:
cat log_file |awk '($7~/\.php/){print $NF " " $1 " " $4 " " $7}'|sort -nr|head -100
List the pages that took the longest (over 60 seconds) and their occurrence counts:
cat access.log |awk '($NF > 60 && $7~/\.php/){print $7}'|sort -n|uniq -c|sort -nr|head -100
Count website traffic (G):
cat access.log |awk '{sum+=$10} END {print sum/1024/1024/1024}'
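The same summation can be checked on two fabricated lines whose $10 field is a byte count (here the sample sums to 3 MB, so the divisor is shortened accordingly):

```shell
# Fields a..i are placeholders; $10 is the response size in bytes
printf 'a b c d e f g h i 1048576\na b c d e f g h i 2097152\n' |
  awk '{sum+=$10} END {print sum/1024/1024 " MB"}'
# → 3 MB
```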
Count 404 responses:
awk '($9 ~/404/)' access.log | awk '{print $9,$7}' | sort
Count HTTP status:
cat access.log |awk '{counts[$(9)]+=1}; END {for(code in counts) print code, counts[code]}'
cat access.log |awk '{print $9}'|sort|uniq -c|sort -rn
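Both status-count variants can be exercised on fabricated lines with the code in field $9 (placeholder data, not a real log):

```shell
# Placeholder fields; $9 is the HTTP status code
printf 'x x x x x x x x 200\nx x x x x x x x 200\nx x x x x x x x 404\n' > /tmp/status.log
awk '{counts[$9]+=1}; END {for(code in counts) print code, counts[code]}' /tmp/status.log | sort
# → 200 2
#   404 1
```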
Concurrent requests per second:
watch "awk '{if($9~/200|30|404/)COUNT[$4]++}END{for( a in COUNT) print a,COUNT[a]}' log_file|sort -k 2 -nr|head -n10"
Bandwidth statistics:
cat apache.log |awk '{if($7~/GET/) count++}END{print "client_request="count}'
cat apache.log |awk '{BYTE+=$11}END{print "client_kbyte_out="BYTE/1024"KB"}'
Find the top 10 IPs with the highest access on a specific day:
cat /tmp/access.log | grep "20/Mar/2011" |awk '{print $1}'|sort |uniq -c|sort -nr|head
What the highest IP connections did on that day:
cat access.log | grep "10.0.21.17" | awk '{print $8}' | sort | uniq -c | sort -nr | head -n 10
Top 10 time periods with the most IP connections in hourly units:
awk -vFS="[:"] '{gsub("-.*","",$1);num[$2" "$1]++}END{for(i in num)print i,num[i]}' log_file | sort -n -k 3 -r | head -10
Find the minutes with the most accesses:
grep "20/Mar/2011" access.log | awk '{print $4}' | cut -c 14-18 | sort | uniq -c | sort -nr | head
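cut -c 14-18 works because in a timestamp field like "[20/Mar/2011:14:25:03" the characters at positions 14 through 18 are exactly HH:MM:

```shell
ts='[20/Mar/2011:14:25:03'   # hypothetical $4 value from an access log
echo "$ts" | cut -c 14-18
# → 14:25
```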
Get 5 minutes of logs:
# DATE_MINUTE / DATE_END_MINUTE are the start and end timestamps (in the log's
# "dd/Mon/yyyy:HH:MM" format); APACHE_LOG points at the log file
if [ "$DATE_MINUTE" != "$DATE_END_MINUTE" ]; then
    # If they differ, get the line number of the first start-timestamp match
    # and the last end-timestamp match, then print that range
    START_LINE=$(sed -n "/$DATE_MINUTE/=" "$APACHE_LOG" | head -n1)
    END_LINE=$(sed -n "/$DATE_END_MINUTE/=" "$APACHE_LOG" | tail -n1)
    sed -n "${START_LINE},${END_LINE}p" "$APACHE_LOG"
fi
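The key trick is that sed -n "/PATTERN/=" prints the line numbers of matching lines; head and tail then pick the first and last match, and a second sed prints the range. On a toy file (the file name and marker lines below are made up):

```shell
# Toy log: line 2 holds the start marker, line 5 the end marker
printf 'a\nstart\nb\nc\nend\nd\n' > /tmp/demo.log
START_LINE=$(sed -n '/start/=' /tmp/demo.log | head -n1)   # → 2
END_LINE=$(sed -n '/end/=' /tmp/demo.log | tail -n1)       # → 5
sed -n "${START_LINE},${END_LINE}p" /tmp/demo.log          # prints lines 2-5
```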
Check TCP connection status
netstat -nat |awk '{print $6}'|sort|uniq -c|sort -rn
netstat -n | awk '/^tcp/ {++S[$NF]};END {for(a in S) print a, S[a]}'
netstat -n | awk '/^tcp/ {++state[$NF]}; END {for(key in state) print key,"\t",state[key]}'
netstat -n | awk '/^tcp/ {++arr[$NF]};END {for(k in arr) print k,"\t",arr[k]}'
netstat -n |awk '/^tcp/ {print $NF}'|sort|uniq -c|sort -rn
netstat -ant | awk '{print $NF}' | grep -v '[a-z]' | sort | uniq -c
netstat -ant|awk '/:80/{split($5,ip,":");++S[ip[1]]}END{for (a in S) print S[a],a}' |sort -n
netstat -ant|awk '/:80/{split($5,ip,":");++S[ip[1]]}END{for (a in S) print S[a],a}' |sort -rn|head -n 10
awk 'BEGIN{printf ("http_code\tcount_num\n")} {COUNT[$9]++}END{for (a in COUNT) printf a"\t\t"COUNT[a]"\n"}' access.log
Find the top 20 IPs by request count (commonly used to find attack sources):
netstat -anlp|grep 80|grep tcp|awk '{print $5}'|awk -F: '{print $1}'|sort|uniq -c|sort -nr|head -n20
netstat -ant |awk '/:80/{split($5,ip,":");++A[ip[1]]}END{for(i in A) print A[i],i}' |sort -rn|head -n20
Use tcpdump to sniff access to port 80 and see which IP sends the most requests:
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr |head -20
Find connections in TIME_WAIT state:
netstat -n|grep TIME_WAIT|awk '{print $5}'|sort|uniq -c|sort -rn|head -n20
Find connections in SYN state:
netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | more
List processes by port:
netstat -ntlp | grep 80 | awk '{print $7}' | cut -d/ -f1
Check the number of connections and currently established connections (with $ip set to the server's address):
netstat -ant | grep $ip:80 | wc -l
netstat -ant | grep $ip:80 | grep EST | wc -l
Check IP access counts:
netstat -nat|grep ":80"|awk '{print $5}' |awk -F: '{print $1}' | sort| uniq -c|sort -n
Linux command to analyze the current connection status:
netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
watch "netstat -n | awk '/^tcp/ {++S[\$NF]} END {for(a in S) print a, S[a]}'"
# You can monitor continuously using watch
LAST_ACK 5 # Closing a TCP connection requires a close in each direction; each side sends a FIN to close its half. After sending its final FIN, a side stays in LAST_ACK until it receives the ACK for that FIN, at which point the connection is fully closed;
SYN_RECV 30 # Indicates the number of requests waiting to be processed;
ESTABLISHED 1597 # Indicates normal data transmission state;
FIN_WAIT1 51 # Indicates the server actively requests to close the TCP connection;
FIN_WAIT2 504 # Indicates the client interrupts the connection;
TIME_WAIT 1057 # Indicates the number of requests that have been processed and are waiting for timeout to end;
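The state-tally awk above can be verified against canned netstat-style lines (the addresses below are fabricated):

```shell
# Fabricated netstat-style output: the connection state is the last field
printf '%s\n' \
  'tcp 0 0 0.0.0.0:80 1.2.3.4:50000 ESTABLISHED' \
  'tcp 0 0 0.0.0.0:80 1.2.3.4:50001 TIME_WAIT' \
  'tcp 0 0 0.0.0.0:80 1.2.3.4:50002 ESTABLISHED' |
  awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}' | sort
# → ESTABLISHED 2
#   TIME_WAIT 1
```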
If you found this useful, remember to give it a like and a "see it"!