JSON processing is a routine part of operations and maintenance work: API responses, configuration files, and log analysis all involve it, and in K8s/Docker environments it is a daily task. This article introduces jq, a command-line JSON processing tool for Linux. For structured JSON data it is far better suited than grep and awk, which only see plain text lines. This is what we call specialization! Without further ado, let's get started!
1. Basic Operations
1.1 Formatting Output
# The most commonly used feature, formatting compressed JSON into a human-readable form:
# Beautifying API return results
curl -s https://api.opsnot.com/users | jq '.'
# Processing files
cat opsnot.json | jq '.'
1.2 Extracting Fields
# Extracting a single field
echo '{"name":"opsnot","age":30}' | jq '.name'
# Output: "opsnot"
# Extracting nested fields
echo '{"user":{"name":"opsnot"}}' | jq '.user.name'
# Array indexing
echo '["a","b","c"]' | jq '.[1]'
# Output: "b"
1.3 Removing Quotes
# By default, strings are quoted. Add the `-r` parameter to remove:
echo '{"domain":"opsnot.com"}' | jq -r '.domain'
# Output: opsnot.com (without quotes)
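A minimal sketch of why -r matters in scripts: without it, the JSON quotes become part of the captured shell value. The sample document and variable names here are made up for illustration.

```shell
# Sample data (hypothetical), same shape as the example above
json='{"domain":"opsnot.com"}'

# Without -r the captured value still carries the JSON double quotes
quoted=$(printf '%s' "$json" | jq '.domain')

# With -r the value is a plain string, ready for use in other commands
raw=$(printf '%s' "$json" | jq -r '.domain')

echo "without -r: $quoted"
echo "with -r:    $raw"
```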
2. Array Processing
2.1 Iterating Over Arrays
# User list, extracting all usernames
echo '[{"name":"tom"},{"name":"jerry"}]' | jq '.[].name'
# Using -r directly
kubectl get pods -o json | jq -r '.items[].metadata.name'
2.2 Filtering
# Filtering records where status is active
jq '.users[] | select(.status == "active")' users.json
# Filtering services with ports greater than 8000 (from opsnot monitoring system)
# Note: If port is a string, it needs to be converted to a number
jq '.services[] | select((.port | tonumber) > 8000)' config.json
# Multiple conditions
jq '.[] | select(.age > 25 and .city == "beijing")' data.json
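To make the multi-condition filter easy to try without a data.json file, here is a small self-contained sketch. The three records are invented sample data; only the select expression comes from the example above.

```shell
# Hypothetical sample records
data='[{"name":"tom","age":30,"city":"beijing"},
       {"name":"jerry","age":22,"city":"beijing"},
       {"name":"spike","age":41,"city":"shanghai"}]'

# Keep records that satisfy both conditions, then print only the name
matches=$(printf '%s' "$data" |
  jq -r '.[] | select(.age > 25 and .city == "beijing") | .name')

echo "$matches"
```

Only tom satisfies both conditions: jerry is too young and spike is in a different city.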
2.3 Array Slicing
# Taking the first 3 elements
echo '[1,2,3,4,5]' | jq '.[:3]'
# Taking the last 2 elements
echo '[1,2,3,4,5]' | jq '.[-2:]'
3. Practical Scenarios
3.1 Analyzing Docker Container Status
# Finding all running container names (removing leading slashes)
docker ps -q | xargs docker inspect | jq -r '.[].Name | ltrimstr("/")'
# Note: with a large number of containers, inspecting them one at a time avoids an overly long argument list
docker ps --format "{{.ID}}" | while read -r id; do
  docker inspect "$id" | jq -r '.[].Name | ltrimstr("/")'
done
# Finding containers occupying port 80
docker ps -q | xargs docker inspect | \
  jq -r '.[] | select(.NetworkSettings.Ports."80/tcp" != null) |
    .Name | ltrimstr("/")'
For more frequently used Docker commands, see two other articles: "Docker High-Frequency Command Practical Manual" and "Docker Inspect, a command worth dedicating a page in the family tree".
3.2 Handling Kubernetes Resources
# Listing all Pod names and IPs
kubectl get pods -o json | \
  jq -r '.items[] | "\(.metadata.name) \(.status.podIP)"'
# Finding Pods that are not in Running status
kubectl get pods -o json | \
jq -r '.items[] | select(.status.phase != "Running") | .metadata.name'
# Counting the number of Pods in each namespace (opsnot cluster inspection script)
kubectl get pods --all-namespaces -o json | \
  jq -r '.items | group_by(.metadata.namespace) |
    .[] | "\(.[0].metadata.namespace): \(length)"'
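The group-and-count pattern can be tried without a cluster. The sketch below feeds jq a hand-written document that only mimics the shape of `kubectl get pods -o json` output; the namespaces are invented. Note the `\( ... )` syntax, which is how jq interpolates values into a string.

```shell
# Hypothetical stand-in for `kubectl get pods --all-namespaces -o json`
pods='{"items":[
  {"metadata":{"name":"web-1","namespace":"default"}},
  {"metadata":{"name":"dns-1","namespace":"kube-system"}},
  {"metadata":{"name":"web-2","namespace":"default"}}]}'

# group_by sorts by the key and returns one sub-array per namespace;
# inside each group, .[0] gives the key and length gives the count
counts=$(printf '%s' "$pods" |
  jq -r '.items | group_by(.metadata.namespace) |
    .[] | "\(.[0].metadata.namespace): \(length)"')

echo "$counts"
```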
For more frequently used K8s commands, see two other articles: "K8s High-Frequency Command Practical Manual" and "Kubectl Describe, a powerful tool for troubleshooting k8s".
3.3 Log Analysis
# Extracting error messages from JSON formatted logs
cat app.log | jq -r 'select(.level=="ERROR") | .message'
# Listing requests with response times exceeding 1 second (response_time in milliseconds)
cat access.log | jq -r 'select(.response_time > 1000) |
  "\(.timestamp) \(.path) \(.response_time)ms"'
# Grouping and counting by status code (opsnot.com access logs, here you need to configure nginx log format as JSON format)
cat nginx.log | jq -s 'group_by(.status) |
  .[] | {status: .[0].status, count: length}'
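Log files are usually one JSON object per line, so -s (slurp) is what turns the stream into a single array that group_by can work on. A small sketch with three invented log lines:

```shell
# Hypothetical JSON-formatted access log, one object per line
logs='{"status":200,"path":"/"}
{"status":404,"path":"/missing"}
{"status":200,"path":"/about"}'

# -s collects all lines into one array; -r prints the summary as plain text
summary=$(printf '%s\n' "$logs" |
  jq -rs 'group_by(.status) | .[] | "\(.[0].status) \(length)"')

echo "$summary"
```

Because -s holds every record in memory, this approach suits moderately sized logs rather than GB-level files.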
3.4 API Data Extraction
# GitHub API - Listing repository star counts
curl -s https://api.github.com/users/opsnot/repos | \
  jq -r '.[] | "\(.name): \(.stargazers_count) stars"'
# Extracting all SSH clone URLs
curl -s https://api.github.com/users/opsnot/repos | \
jq -r '.[].ssh_url'
3.5 Constructing New JSON
# Reorganizing fields
jq '{username: .name, email: .contact.email}' users.json
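As a quick illustration of the field reorganization above, the sketch below applies the same object constructor to one invented record (the phone field is there only to show that unlisted fields are dropped).

```shell
# Hypothetical input record
user='{"name":"tom","contact":{"email":"tom@opsnot.com","phone":"000"}}'

# Build a new object: rename .name and lift .contact.email to the top level
reshaped=$(printf '%s' "$user" |
  jq -c '{username: .name, email: .contact.email}')

echo "$reshaped"
```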
# Batch generating configuration files
cat servers.json | jq -r '.[] |
  "Host \(.name)\n  HostName \(.ip)\n  User opsnot\n"' > ~/.ssh/config.d/auto
4. Advanced Techniques
4.1 Pipelining Operations
# Filtering first then extracting
jq '.users[] | select(.age > 30) | .name' data.json
# Multi-level processing
jq '.data.items[] | select(.price < 100) | {name, price}' products.json
4.2 Array Operations
# map - batch conversion
echo '[1,2,3]' | jq 'map(. * 2)'
# Output: [2,4,6]
# Array length
jq '.users | length' data.json
# Removing duplicates
jq '[.[] | .city] | unique' users.json
# Sorting
jq 'sort_by(.price)' products.json
4.3 Conditional Judgments
# if-then-else
jq '.[] | if .age >= 18 then "adult" else "minor" end' users.json
# Handling null values (opsnot data cleaning script)
jq '.email // "[email protected]"' users.json
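One detail of the `//` operator worth seeing in action: it substitutes the default when the left side is null, missing, or false. The records and the "unknown" placeholder below are invented for this sketch.

```shell
# Hypothetical records: a real email, an explicit null, and a missing field
users='[{"name":"tom","email":"tom@opsnot.com"},
        {"name":"jerry","email":null},
        {"name":"spike"}]'

# null and missing fields both fall through to the default
emails=$(printf '%s' "$users" | jq -r '.[] | .email // "unknown"')

# false is also replaced, which matters for boolean fields
flag=$(jq -rn 'false // "fallback"')

echo "$emails"
echo "$flag"
```

For boolean fields where false is a meaningful value, an explicit `if . == null then ... else . end` is the safer choice.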
4.4 String Processing
# Concatenating strings
jq '.name + "@opsnot.com"' users.json
# Splitting
echo '{"path":"/var/log/app.log"}' | jq -r '.path | split("/") | .[-1]'
# Regular expression matching
jq 'select(.email | test("@opsnot\\.com$"))' users.json
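A short sketch of regex matching with test. jq string literals follow JSON escaping rules, so a literal dot in the pattern has to be written as `\\.`. The two sample addresses are invented; the second one differs only at the dot position.

```shell
# Hypothetical records: only the first has a real dot before "com"
users='[{"email":"a@opsnot.com"},{"email":"b@opsnotXcom"}]'

# \\. inside the jq string becomes the regex \. (a literal dot)
matched=$(printf '%s' "$users" |
  jq -r '.[] | select(.email | test("@opsnot\\.com$")) | .email')

echo "$matched"
```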
5. Performance Optimization
5.1 Streaming Large Files
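# Processing GB-level logs: do not use jq -s, which reads all input into memory
# Note: a single top-level array is still parsed in full before .[] runs; line-delimited JSON or --stream avoids that
cat huge.json | jq -c '.[] | select(.error != null)'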
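When the input is one very large top-level array, jq's --stream mode can emit the elements one at a time instead of loading the whole array. The sketch below uses the `fromstream(1 | truncate_stream(inputs))` idiom from the jq manual on a tiny invented array, just to show the shape of the command.

```shell
# Hypothetical stand-in for a huge array of records
big='[{"id":1,"error":null},{"id":2,"error":"timeout"},{"id":3}]'

# --stream turns the document into path/value events;
# truncate_stream strips the outer array level and fromstream
# rebuilds each element, so only one element is held at a time
errors=$(printf '%s' "$big" |
  jq -cn --stream 'fromstream(1 | truncate_stream(inputs)) | select(.error != null)')

echo "$errors"
```

Streaming mode is noticeably slower per record than normal parsing, so it is mainly worth it when memory is the limiting factor.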
5.2 Reducing Pipeline Counts
# Bad practice
cat data.json | jq '.[]' | jq 'select(.age > 30)' | jq '.name'
# Good practice
jq '.[] | select(.age > 30) | .name' data.json
6. Common Pitfalls
6.1 Quoting Issues
# Shell variables are not expanded inside single quotes
user="opsnot"
jq ".name == \"$user\"" users.json # Works, but unsafe: the variable is pasted into the jq program text
# Recommended: Use --arg to avoid injection risks
jq --arg u "$user" '.name == $u' users.json
# Passing multiple variables
jq --arg name "$name" --arg domain "$domain" \
'.user = $name | .email = $name + "@" + $domain' config.json
# Passing numeric variables
jq --argjson port 8000 '.port == $port' config.json
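The injection risk is easy to demonstrate. In this sketch the hostile value and the user list are invented: when the value is pasted into the program text it changes the filter's logic, while --arg keeps it as plain string data.

```shell
# Hypothetical user list and a hostile "username"
users='[{"name":"opsnot"},{"name":"guest"}]'
user='x" or true or "'

# Unsafe: the filter becomes  .name == "x" or true or ""  and matches everything
unsafe=$(printf '%s' "$users" |
  jq "[.[] | select(.name == \"$user\")] | length")

# Safe: $u is compared as an ordinary string and matches nothing
safe=$(printf '%s' "$users" |
  jq --arg u "$user" '[.[] | select(.name == $u)] | length')

echo "unsafe matched: $unsafe, safe matched: $safe"
```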
6.2 Handling Empty Arrays
# Avoiding "Cannot iterate over null" when .items is missing or null (an empty array iterates without error)
jq '.items[]? | .name' data.json # Add ?
# Or provide default values
jq '.items // []' data.json
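A small sketch of what the `?` actually protects against. The error comes from iterating over null (or another non-iterable value), not from an empty array. The documents below are invented.

```shell
# Hypothetical document where the array is null
doc='{"items":null}'

# Plain iteration over null fails with a non-zero exit status
if printf '%s' "$doc" | jq '.items[]' >/dev/null 2>&1; then
  strict="ok"
else
  strict="failed"
fi

# With ? the error is suppressed and nothing is emitted
with_q=$(printf '%s' "$doc" | jq -c '[.items[]?]')

# An empty array never needed the ? in the first place
empty_ok=$(printf '%s' '{"items":[]}' | jq -c '[.items[]]')

echo "strict: $strict, with ?: $with_q, empty array: $empty_ok"
```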
6.3 Numbers and Strings
# When port number is a string, it needs to be converted
jq '.[] | select((.port | tonumber) > 8000)' config.json
# String to number
echo '{"count":"42"}' | jq '.count | tonumber'
# Number to string
echo '{"count":42}' | jq '.count | tostring'
6.4 Error Handling
# Validating if JSON is legal (with -e, a document that is just null or false also gives a non-zero exit status)
json='{"data": "test"}'
if echo "$json" | jq -e . >/dev/null 2>&1; then
  echo "Valid JSON"
else
  echo "Invalid JSON from opsnot.com API" >&2
  exit 1
fi
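One caveat with `jq -e .` as a validator: -e sets the exit status from the last output value, so the perfectly valid documents `null` and `false` are reported as failures. A sketch of an alternative, using the `empty` filter so that only parse errors count; the helper name is_valid_json is made up for this example.

```shell
# Hypothetical helper: succeeds when the argument parses as JSON.
# `jq empty` produces no output and fails only on a parse error.
# Note: an empty string contains no JSON text and is also accepted.
is_valid_json() {
  printf '%s' "$1" | jq empty >/dev/null 2>&1
}

# For comparison: -e treats a valid `null` document as a failure
dash_e_status=0
printf '%s' 'null' | jq -e . >/dev/null 2>&1 || dash_e_status=$?

if is_valid_json 'null'; then
  echo "null is valid JSON (jq -e exited with $dash_e_status)"
fi
```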
# Handling potentially missing fields
jq '.user.email // "[email protected]"' data.json
# Safely iterating arrays that may be missing or null
jq '.items[]?' data.json
7. Advanced Features
7.1 Custom Functions
# Defining functions to reuse logic
echo '{"users":[{"email":"alice"},{"email":"bob"}]}' | jq '
def add_domain: . + "@opsnot.com";
.users[].email | add_domain
'
# Functions with parameters
echo '{"prices":[10,20,30]}' | jq '
def multiply($n): . * $n;
.prices[] | multiply(1.1)
'
7.2 Recursive Processing
# Recursively searching for all name fields
jq '.. | .name? // empty' complex.json
# Recursively traversing tree structures
jq 'recurse(.children[]?) | .id' tree.json
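To see the traversal order of recurse, here is a sketch on a small invented tree. Each node is visited before its children (pre-order), and the `?` lets leaf nodes without a children field pass silently.

```shell
# Hypothetical tree: node 1 has children 2 and 3, node 2 has child 4
tree='{"id":1,"children":[{"id":2,"children":[{"id":4}]},{"id":3}]}'

# Collect the ids in visiting order
ids=$(printf '%s' "$tree" | jq -c '[recurse(.children[]?) | .id]')

echo "$ids"
```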
7.3 Readability of Complex Queries
# Splitting a complex query across several lines inside one single-quoted program
jq -n --slurpfile config config.json '
  $config[0] |
  .services[] |
  select(.enabled == true) |
  select(.env == "prod") |
  "\(.name):\(.port) # opsnot.com"
'
# Or as a one-liner, passing the file as an argument:
jq '.services[] | select(.enabled == true) | select(.env == "prod") | "\(.name):\(.port)"' config.json
7.4 Combination Techniques
In operations and maintenance practice, it is often necessary to combine usage:
# Finding the top 5 containers with the highest CPU usage (opsnot monitoring alarm)
docker stats --no-stream --format "{{json .}}" | \
  jq -R 'fromjson? | select(.CPUPerc != null)' | \
  jq -s 'sort_by(.CPUPerc | rtrimstr("%") | tonumber) |
    reverse | .[:5] |
    .[] | "\(.Name): \(.CPUPerc)"'
# Counting slow request URL distribution from ELB logs
cat elb.log | \
jq -r 'select(.response_time > 1) | .request_url' | \
sort | uniq -c | sort -rn | head -20
# Generating Prometheus batch monitoring configuration
cat servers.json | \
  jq -r '.[] |
    "  - job_name: \"\(.name)\"",
    "    static_configs:",
    "      - targets: [\"\(.ip):9100\"]",
    "        labels:",
    "          env: \"\(.env)\"",
    "    # by opsnot.com"'
8. Debugging Techniques
# Debugging complex expressions, viewing intermediate results
jq 'debug | .users[] | select(.age > 30)' data.json
# Gradually building complex queries
jq '.' data.json # First look at the overall structure
jq '.users' data.json # Then look at specific fields
jq '.users[]' data.json # Finally iterate through the array
jq '.users[].name' data.json # Ultimately extract
# Checking types
echo '{"port":"8080"}' | jq '.port | type' # Output: "string"
9. Performance and Limitations
9.1 Performance Optimization
# -c prints each result on a single line, which shrinks the output (it does not reduce jq's memory usage)
jq -c '.' large-file.json
# Avoiding repeated parsing of the same file: extract several values in one jq call
jq -r '.db.host, .app.port' config.json
# Streaming processing of line-delimited JSON logs: jq handles one record at a time
cat opsnot.log | jq -c 'select(.error)' > errors.log
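When a script needs several values from the same document, one jq call can return them together. A minimal sketch with an invented config; it assumes the values contain no whitespace, since read splits on it.

```shell
# Hypothetical configuration document
config='{"db":{"host":"db.internal"},"app":{"port":8080}}'

# Parse once and emit both values on a single line
pair=$(printf '%s' "$config" | jq -r '"\(.db.host) \(.app.port)"')

# Split the line into two shell variables
read -r db_host app_port <<EOF
$pair
EOF

echo "host=$db_host port=$app_port"
```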
9.2 Limitations of jq
By default jq loads each top-level JSON value fully into memory, so a single GB-level document is slow and memory-hungry; use --stream or split the file.
It does not support in-place modification of files; write the output to a new file and then replace the original.
Complex logic is less convenient to express than in a general-purpose programming language such as Python.
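For the in-place limitation, the usual workaround is to write to a temporary file and move it over the original only if jq succeeded. A sketch under invented file names and values, run in a throwaway directory so it touches nothing real:

```shell
# Work in a scratch directory (all names here are hypothetical)
tmpdir=$(mktemp -d)
file="$tmpdir/config.json"
printf '%s\n' '{"port":8000}' > "$file"

# Write the modified document to a temp file next to the original,
# then replace the original only when jq exits successfully
tmp=$(mktemp "$tmpdir/config.XXXXXX")
jq '.port = 9000' "$file" > "$tmp" && mv "$tmp" "$file"

new_port=$(jq '.port' "$file")
echo "port is now $new_port"

rm -rf "$tmpdir"
```

Redirecting straight back to the same file (`jq ... f.json > f.json`) would truncate it before jq reads it, which is why the temporary file is needed.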
9.3 Others
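# Python's built-in alternative for validation and pretty-printing (write a Python script for complex logic)
python3 -m json.tool < data.json
# Handling JSON Lines format
cat data.jsonl | jq -c '.'
# Note: jq stops at the first line that is not valid JSON; 2>/dev/null only hides the error message
cat data.jsonl | jq -c '.' 2>/dev/null
# To skip invalid lines and output only valid ones, read each line as raw text and parse it with fromjson?
jq -cR 'fromjson?' data.jsonl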
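A sketch of tolerant JSON Lines processing. With -R each line arrives as a raw string, and `fromjson?` parses it while silently dropping lines that fail, so one bad line does not stop the run. The three input lines are invented.

```shell
# Hypothetical JSON Lines input with one corrupt line in the middle
lines='{"ok":1}
not json
{"ok":2}'

# -R reads raw lines; fromjson? keeps only the lines that parse
valid=$(printf '%s\n' "$lines" | jq -cR 'fromjson?')

echo "$valid"
```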
Conclusion:
jq is centered around pipelines and filters: think of the data as flowing from left to right. Use it frequently in daily tasks, and break complex queries down into smaller steps. Remember to pass parameters with --arg, check results with -e, handle null values, and convert types where needed.
Best Practices:
- First use jq -e . to validate the JSON
- Use --arg to pass external variables and avoid injection
- For large files, process line-delimited JSON record by record or use --stream, and avoid -s
- When iterating arrays that may be missing or null, add ? to prevent errors
- Convert with tonumber before comparing string values with numbers
jq Parameter Quick Reference
# Common parameters
-r, --raw-output # Output raw strings, removing quotes
-c, --compact-output # Compressed output, not formatted
-e, --exit-status # Set exit code based on output, used for validation
-s, --slurp # Read the entire input stream as an array
-n, --null-input # Do not read input, start from null
-j, --join-output # Raw output like -r, but without a newline after each result
-S, --sort-keys # Sort object keys when outputting
--stream # Stream parsing of large JSON files
--seq # Use the application/json-seq format: an RS character precedes each output value (input is also read in this format)
# Passing parameters (recommended by opsnot.com)
--arg name value # Pass string variable: $name
--argjson name json # Pass JSON variable: $name
--slurpfile name file # Read file as array: $name
# Output control
-C, --color-output # Force color output (by default color is used only when writing to a terminal)
-M, --monochrome-output # Monochrome output
--tab # Use tab for indentation instead of spaces
--indent n # Set number of spaces for indentation
# Program files
-f file, --from-file file # Read jq program from file
-L directory # Add module search path
# Examples
jq -r '.name' data.json # Output raw string
jq -c '.' data.json # Compress to one line
jq -e '.error' data.json && echo "There is an error" # Validate field existence
cat file1.json file2.json | jq -s '.' # Merge multiple files
jq --arg env prod '.[$env]' config.json # Use variable to access property
This article is organized by opsnot.com; please indicate the source when reprinting.