1. Command Introduction and Principles
1.1 Introduction
tar (Tape ARchive) is the classic archiving tool on Linux and other Unix-like systems. Originally designed for tape backup, it is now widely used for file packaging, compression, and archive management. It packages multiple files or directories into a single archive file and can pass that archive through an external compressor such as gzip, bzip2, xz, or zstd.
1.2 Working Principle
- File Collection: Organizes multiple files and directories into a continuous byte stream
- Header Information: Creates metadata headers for each file (filename, permissions, timestamps, etc.)
- Data Storage: Stores file contents in order
- Compression Processing: Optional compression layer compresses the archived data
- Stream Output: Supports standard input and output, facilitating pipeline operations
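The layout described above can be observed directly. The following is a minimal sketch, assuming GNU tar and a POSIX shell; the names `work`, `hello.txt`, and `demo.tar` are made up for illustration. It archives one small file, reads the `ustar` magic string stored at byte offset 257 of the header, checks that the archive is padded to whole 512-byte blocks, and then sends an archive through a pipe instead of a file.

```shell
#!/bin/sh
# Scratch directory, removed automatically when the script exits
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

printf 'hello\n' > "$work/hello.txt"
tar -cf "$work/demo.tar" -C "$work" hello.txt

# Header information: the magic string lives at offset 257 of the first block
magic=$(dd if="$work/demo.tar" bs=1 skip=257 count=5 2>/dev/null)
echo "magic: $magic"

# Data storage: everything is written in 512-byte blocks
size=$(wc -c < "$work/demo.tar")
size=$((size))
echo "size: $size bytes, remainder mod 512: $((size % 512))"

# Stream output: write the archive to stdout and list it from stdin
members=$(tar -cf - -C "$work" hello.txt | tar -tf -)
echo "members via pipe: $members"
```

Because the archive is just a byte stream, the same pipe can feed a compressor, a network connection, or an encryption tool.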
2. Basic Syntax
tar [options] [archive file] [files or directories...]
Common Options
# Main operation modes (must choose one)
-c, --create # Create a new archive
-x, --extract # Extract files from an archive
-t, --list # List archive contents
-r, --append # Append files to an archive
-u, --update # Only add files newer than those in the archive
# Compression options
-z, --gzip # Use gzip compression/decompression (.tar.gz, .tgz)
-j, --bzip2 # Use bzip2 compression/decompression (.tar.bz2)
-J, --xz # Use xz compression/decompression (.tar.xz)
--zstd # Use zstd compression/decompression (.tar.zst)
-Z, --compress # Use compress compression/decompression (.tar.Z)
-a, --auto-compress # Automatically select compression method based on extension
# File operations
-f, --file=ARCHIVE # Specify archive filename
-v, --verbose # Show detailed processing information
-C, --directory=DIR # Change to specified directory
--exclude=PATTERN # Exclude files matching pattern
--exclude-from=FILE # Read exclude patterns from file
# Permissions and attributes
-p, --preserve-permissions # Preserve file permissions on extraction (default for root)
--same-owner # Try to preserve file ownership on extraction (default for root)
--no-same-owner # Extract files as the current user (default for ordinary users)
--no-same-permissions # Apply the user's umask when extracting (default for ordinary users)
# Other important options
--totals # Show total byte count after processing
--checkpoint # Show processing progress
--verify # Verify after writing to archive
--wildcards # Use wildcard pattern matching
-T, --files-from=FILE # Read filenames to process from file
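To show how several of these options combine, here is a small sketch assuming GNU tar and gzip are installed. The directory layout (`src/project/notes.txt`) is invented for the example. It uses `-C` so that member names are stored relative to the source directory, and `-a` so that the compressor is chosen from the `.tar.gz` suffix.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/project"
echo "sample" > "$work/src/project/notes.txt"

# -C changes directory before adding files, so members are named "project/..."
# -a selects gzip because the archive name ends in .tar.gz
tar -caf "$work/project.tar.gz" -C "$work/src" project

# Confirm the result really is a gzip stream, then list its members
gzip -t "$work/project.tar.gz" && echo "gzip stream OK"
listing=$(tar -tf "$work/project.tar.gz")
echo "$listing"
```

Storing relative names this way keeps the archive independent of where the source tree happened to live.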
3. Classic Usage Scenarios
3.1 Creating Archive Files
# Create an uncompressed tar archive
tar -cvf project.tar project/
# Create a gzip compressed archive
tar -czvf project.tar.gz project/
# Create a bzip2 compressed archive
tar -cjvf project.tar.bz2 project/
# Create an xz compressed archive (high compression ratio)
tar -cJvf project.tar.xz project/
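The trade-off between the formats above is easiest to see by comparing sizes on the same input. This is an illustrative sketch only, assuming GNU tar and gzip; bzip2 and xz are used only if they are installed. The sample data is synthetic, highly repetitive text, so real-world ratios will differ.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/project"

# Generate compressible sample data (2000 similar lines)
i=0
while [ "$i" -lt 2000 ]; do
  echo "line $i: the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$work/project/sample.txt"

cd "$work" || exit 1
tar -cf project.tar project/
tar -czf project.tar.gz project/
# Optional compressors: skipped quietly when the tool is missing
command -v bzip2 >/dev/null 2>&1 && tar -cjf project.tar.bz2 project/
command -v xz >/dev/null 2>&1 && tar -cJf project.tar.xz project/

# Print the size of every archive that was produced
for f in project.tar*; do
  printf '%-18s %s bytes\n' "$f" "$(wc -c < "$f")"
done
```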
3.2 Extracting Archive Files
# Extract tar archive
tar -xvf archive.tar
# Extract gzip compressed archive
tar -xzvf archive.tar.gz
# Extract bzip2 compressed archive
tar -xjvf archive.tar.bz2
# Extract to a specified directory
tar -xzvf archive.tar.gz -C /target/directory/
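Extraction does not have to restore the whole archive. The sketch below, assuming GNU tar, pulls out a single member and drops its leading directory with `--strip-components`; the file names are hypothetical.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/project/docs" "$work/dest"
echo "readme" > "$work/src/project/docs/readme.txt"
echo "code" > "$work/src/project/main.c"
tar -czf "$work/archive.tar.gz" -C "$work/src" project

# Extract only one member and remove the leading "project/" path component
tar -xzf "$work/archive.tar.gz" -C "$work/dest" \
    --strip-components=1 project/docs/readme.txt

# Show what actually landed in the target directory
find "$work/dest" -type f
```

The member name must be written exactly as it appears in the listing produced by `tar -t`.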
3.3 Viewing Archive Contents
# List archive contents
tar -tvf archive.tar
# List compressed archive contents
tar -tzvf archive.tar.gz
# Page through a long detailed listing (permissions, owner, size, date)
tar -tvf archive.tar | less
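Listing is also a convenient way to test whether a particular file is present before extracting anything. The following sketch assumes GNU tar, which detects the compression format on its own when reading from a regular file, so `-z` is omitted. The member names are invented for the example.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/project"
echo "a" > "$work/src/project/a.txt"
tar -czf "$work/archive.tar.gz" -C "$work/src" project

# tar -t exits non-zero when a requested member is not in the archive
if tar -tf "$work/archive.tar.gz" project/a.txt >/dev/null 2>&1; then
  found=yes
else
  found=no
fi

if tar -tf "$work/archive.tar.gz" project/missing.txt >/dev/null 2>&1; then
  missing_found=yes
else
  missing_found=no
fi

echo "project/a.txt present: $found"
echo "project/missing.txt present: $missing_found"
```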
3.4 Incremental Operations
# Add files to an existing archive
tar -rvf archive.tar newfile.txt
# Only add files newer than those in the archive
tar -uvf archive.tar project/
# Delete a member from an archive in place (uncompressed archives only;
# a compressed archive must be extracted and re-created instead)
tar --delete -f archive.tar newfile.txt
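Append and delete both modify the archive in place, which is why they only work on uncompressed archives. A short sketch, assuming GNU tar, makes the difference visible; the file names are placeholders.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

echo "a" > a.txt
echo "b" > b.txt

# Append to and delete from an uncompressed archive
tar -cf archive.tar a.txt
tar -rf archive.tar b.txt
after_append=$(tar -tf archive.tar)
tar --delete -f archive.tar a.txt
after_delete=$(tar -tf archive.tar)

# The same append against a gzip-compressed archive is rejected by tar
tar -czf archive.tar.gz a.txt
if tar -rzf archive.tar.gz b.txt 2>/dev/null; then
  compressed_append=allowed
else
  compressed_append=refused
fi

echo "after append: $after_append"
echo "after delete: $after_delete"
echo "append to .tar.gz: $compressed_append"
```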
4. Combining with Other Tools and Commands
4.1 Combining with find
# Find specific files and package them (+ passes many files to each tar invocation)
find . -name "*.log" -exec tar -rvf logs.tar {} +
# Generate file list using find
find /var/log -name "*.log" -mtime -7 > filelist.txt
tar -czvf recent_logs.tar.gz -T filelist.txt
# Exclude certain file types
find . -type f ! -name "*.tmp" | tar -czvf backup.tar.gz -T -
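Newline-separated file lists break on unusual file names. A more robust variant, sketched here under the assumption of GNU find and GNU tar, separates names with NUL bytes using `-print0` and `--null`. The sample file names are invented.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/logs"
echo "x" > "$work/logs/app server.log"
echo "y" > "$work/logs/db.log"
echo "z" > "$work/logs/notes.txt"

cd "$work" || exit 1
# --null must come before -T so the list is parsed as NUL-separated names
find logs -type f -name '*.log' -print0 | tar -czf logs.tar.gz --null -T -

listing=$(tar -tzf logs.tar.gz)
echo "$listing"
```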
4.2 Combining with ssh for Remote Operations
# Remote backup: locally package and transfer to remote server
tar -czf - /important/data | ssh user@remote "cat > /backup/backup.tar.gz"
# Remote restore: get and extract from remote server
ssh user@remote "tar -czf - /remote/data" | tar -xzf - -C /local/restore/
# List remote files as an archive stream without storing the archive locally
ssh user@remote "tar -czf - /path/to/files" | tar -tzvf -
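The ssh examples above all rely on one tar writing to stdout and another reading from stdin. The sketch below shows that producer/consumer pair on a single machine, with the ssh hop deliberately left out so it can be run without a remote host; the directory names are placeholders.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/sub" "$work/dst"
echo "data" > "$work/src/sub/file.txt"

# Producer writes the archive to stdout; consumer unpacks it from stdin.
# In the remote case, one side of this pipe would run inside ssh.
tar -cf - -C "$work/src" . | tar -xpf - -C "$work/dst"

# Compare the two trees to confirm the copy is complete
if diff -r "$work/src" "$work/dst" >/dev/null; then
  copy_status=identical
else
  copy_status=different
fi
echo "copy status: $copy_status"
```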
4.3 Combining with gpg for Encryption
# Create an encrypted archive
tar -czf - sensitive_data/ | gpg -c > backup.tar.gz.gpg
# Extract encrypted archive
gpg -d backup.tar.gz.gpg | tar -xzf -
# Use asymmetric encryption
tar -czf - data/ | gpg -e -r [email protected] > backup.tar.gz.gpg
4.4 Automating Usage in Scripts
#!/bin/bash
# Automated backup script
automated_backup() {
    local backup_dir="/backup"
    local source_dirs=("/etc" "/home" "/var/www")
    local timestamp=$(date +%Y%m%d_%H%M%S)
    # Create backup directory
    mkdir -p "$backup_dir"
    # Perform backup
    echo "Starting system backup..."
    tar -czpf "$backup_dir/backup_$timestamp.tar.gz" \
        --exclude="/home/*/.cache" \
        --exclude="/var/www/*/tmp" \
        "${source_dirs[@]}"
    # Verify backup
    if tar -tzf "$backup_dir/backup_$timestamp.tar.gz" > /dev/null; then
        echo "Backup successful: $backup_dir/backup_$timestamp.tar.gz"
        # Clean up old backups (keep the last 7 days)
        find "$backup_dir" -name "backup_*.tar.gz" -mtime +7 -delete
    else
        echo "Backup verification failed!"
        return 1
    fi
}
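Listing an archive only proves that it is readable. A stricter check compares the archive against the files on disk with `--compare` (`-d`). The following is a sketch of that idea, assuming GNU tar; `backup_and_verify` is a hypothetical helper, not part of the script above, and it runs against a scratch directory rather than real system paths.

```shell
#!/bin/sh
# Hypothetical helper: backup_and_verify SRC_DIR DEST_DIR
# Prints the archive path on success, returns non-zero on any failure.
backup_and_verify() {
  bv_src=$1
  bv_dest=$2
  bv_archive="$bv_dest/backup_$(date +%Y%m%d_%H%M%S).tar.gz"
  mkdir -p "$bv_dest" || return 1
  # Check tar's own exit status instead of assuming the archive was written
  if ! tar -czpf "$bv_archive" -C "$(dirname "$bv_src")" "$(basename "$bv_src")"; then
    echo "tar failed" >&2
    return 1
  fi
  # --compare re-reads the archive and checks each member against the disk
  if tar -dzf "$bv_archive" -C "$(dirname "$bv_src")"; then
    echo "$bv_archive"
  else
    echo "verification failed" >&2
    return 1
  fi
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/etc_copy"
echo "key=value" > "$work/etc_copy/app.conf"

result=$(backup_and_verify "$work/etc_copy" "$work/backup")
echo "archive written to: $result"
```

On a live system the comparison can report differences for files that changed while the backup was running, so a failure here calls for inspection rather than automatic deletion.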
5. Advanced Application Scenarios
5.1 Incremental Backup System
#!/bin/bash
# Incremental backup implementation
incremental_backup() {
    local full_backup="/backup/full_backup.tar.gz"
    local incremental_base="/backup/last_backup.time"
    local incremental_backup="/backup/incremental_$(date +%Y%m%d_%H%M%S).tar.gz"
    if [ ! -f "$full_backup" ]; then
        echo "Creating full backup..."
        tar -czf "$full_backup" --listed-incremental="$incremental_base" /data
    else
        echo "Creating incremental backup..."
        tar -czf "$incremental_backup" --listed-incremental="$incremental_base" /data
    fi
    echo "Backup completed"
}
# Timestamp-based incremental backup
timestamp_backup() {
    local last_run_file="/var/run/last_backup"
    local current_time=$(date +%s)
    if [ -f "$last_run_file" ]; then
        local last_time=$(cat "$last_run_file")
        # Find files modified since last backup
        # (-newermt compares against a timestamp; plain -newer expects a reference file)
        find /data -type f -newermt "@$last_time" > /tmp/changed_files
        if [ -s /tmp/changed_files ]; then
            tar -czf "/backup/changes_$current_time.tar.gz" -T /tmp/changed_files
        fi
    else
        # First run, create full backup
        tar -czf "/backup/full_$current_time.tar.gz" /data
    fi
    echo "$current_time" > "$last_run_file"
}
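The snapshot-file mechanism behind `--listed-incremental` is easier to follow as a complete round trip. This is a minimal sketch, assuming GNU tar; the file names (`data.snar`, `level0.tar.gz`, `level1.tar.gz`) are chosen for illustration. It takes a full (level 0) dump, adds one file, takes an incremental (level 1) dump, and restores both in order.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/data" "$work/restore"
echo "one" > "$work/data/a.txt"
sleep 1

# Level 0: the snapshot file does not exist yet, so everything is archived
tar -czf "$work/level0.tar.gz" \
    --listed-incremental="$work/data.snar" -C "$work" data

sleep 1
echo "two" > "$work/data/b.txt"

# Level 1: only what changed since the snapshot was written
tar -czf "$work/level1.tar.gz" \
    --listed-incremental="$work/data.snar" -C "$work" data
level1_members=$(tar -tzf "$work/level1.tar.gz")
echo "$level1_members"

# Restore in order; /dev/null as snapshot file tells tar to treat the
# archives as incremental without updating any snapshot
tar -xzf "$work/level0.tar.gz" --listed-incremental=/dev/null -C "$work/restore"
tar -xzf "$work/level1.tar.gz" --listed-incremental=/dev/null -C "$work/restore"
ls "$work/restore/data"
```

The `sleep` calls are only there to keep the timestamps clearly apart in a fast test run.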
5.2 Splitting Large Archives into Parts
#!/bin/bash
# Large file split archiving
split_archive() {
    local source_dir="$1"
    local part_size="100M" # Each part 100MB
    local base_name="large_archive"
    # Create split archive (split appends aa, ab, ... to the prefix)
    tar -czf - "$source_dir" | split -b "$part_size" - "${base_name}.tar.gz.part"
    echo "Archive split into ${base_name}.tar.gz.part*"
}
# Merging split archives
merge_archive() {
    local output_file="$1"
    local base_name="large_archive"
    # The parts are named ...partaa, ...partab, so the glob must end in part*
    cat "${base_name}".tar.gz.part* > "$output_file"
    echo "Archive merged into $output_file"
}
# Directly process split archives
process_split_archive() {
    local base_name="large_archive"
    # Directly extract split archive (no need to merge first)
    cat "${base_name}".tar.gz.part* | tar -xzf -
}
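A quick round trip confirms that splitting and reassembling leaves the data intact. The sketch below assumes GNU tar plus the standard `split`, `cat`, and `cmp` tools. It uses a deliberately small part size (8192 bytes) and random sample data so that several parts are produced; the names are placeholders.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src" "$work/out"
# Random data does not compress, so the archive stays larger than one part
head -c 20000 /dev/urandom > "$work/src/blob.bin"

cd "$work" || exit 1
tar -czf - src | split -b 8192 - large_archive.tar.gz.part

parts=$(ls large_archive.tar.gz.part* | wc -l)
parts=$((parts))
echo "parts created: $parts"

# The shell expands the glob in sorted order (partaa, partab, ...),
# which is the order split wrote them in
cat large_archive.tar.gz.part* | tar -xzf - -C out

if cmp -s src/blob.bin out/src/blob.bin; then
  roundtrip=ok
else
  roundtrip=corrupt
fi
echo "round trip: $roundtrip"
```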
5.3 Advanced Exclusion and Filtering
#!/bin/bash
# Smart backup exclusion
smart_backup() {
    local exclude_file="/etc/backup_excludes"
    # Create exclusion list. tar reads every line of this file as a pattern
    # (the file has no comment syntax), so the groups are described here:
    # cache and temporary files, compressed logs, version control
    # directories, and system-specific paths.
    cat > "$exclude_file" << 'EOF'
*.tmp
*.cache
__pycache__
node_modules
*.log.gz
*.log.bz2
.git
.svn
.hg
/proc
/sys
/dev
/tmp
/run
EOF
    # Perform backup (the backup directory itself is excluded as well)
    tar -czpf "/backup/smart_backup_$(date +%Y%m%d).tar.gz" \
        --exclude-from="$exclude_file" \
        --exclude="/var/cache" \
        --exclude="/backup" \
        /
}
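The effect of exclusion patterns is worth checking on a small tree before trusting them on a full system. This sketch assumes GNU tar; the sample project layout and the file name `excludes.txt` are invented. The exclusion options are placed before the paths they should apply to.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/app/node_modules/lib" "$work/app/src" "$work/app/.git"
echo "code" > "$work/app/src/main.js"
echo "dep" > "$work/app/node_modules/lib/index.js"
echo "tmp" > "$work/app/src/scratch.tmp"
echo "ref" > "$work/app/.git/HEAD"

# One pattern per line; every line is treated as a pattern
printf '%s\n' '*.tmp' 'node_modules' > "$work/excludes.txt"

tar -czf "$work/app.tar.gz" \
    --exclude-from="$work/excludes.txt" \
    --exclude='.git' \
    -C "$work" app

listing=$(tar -tzf "$work/app.tar.gz")
echo "$listing"
```

Excluding a directory name drops the directory and everything below it, so `node_modules` needs no trailing wildcard.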
6. Comparison with Other Commands
# tar vs zip
tar -czf archive.tar.gz dir/ # Preserves Linux permissions, better compression ratio
zip -r archive.zip dir/ # Cross-platform compatible, but does not preserve all attributes
# tar vs cpio
tar -czf archive.tar.gz dir/ # Simple syntax, widely used
find dir/ | cpio -ov > archive.cpio # More precise file control
# tar vs rsync
tar -czf backup.tar.gz /data # Creates a snapshot in time
rsync -av /data/ backup/ # Incremental sync, maintains directory structure
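The claim that tar preserves Linux permissions can be checked with a short experiment. This is a sketch assuming GNU tar; `run.sh` is a placeholder file. The mode is read from the first ten characters of the verbose listing and of `ls -l` after extraction with `-p`.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src" "$work/dst"
echo '#!/bin/sh' > "$work/src/run.sh"
chmod 750 "$work/src/run.sh"

tar -czf "$work/perm.tar.gz" -C "$work/src" run.sh

# Mode as recorded in the archive header
stored_mode=$(tar -tvzf "$work/perm.tar.gz" | cut -c1-10)

# -p restores the recorded mode instead of applying the current umask
tar -xpzf "$work/perm.tar.gz" -C "$work/dst"
restored_mode=$(ls -l "$work/dst/run.sh" | cut -c1-10)

echo "stored:   $stored_mode"
echo "restored: $restored_mode"
```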
7. Conclusion
By mastering the tar command in depth, you can build reliable data backup and archiving solutions. Whether for simple file packaging or complex enterprise-level backup systems, tar provides a powerful and flexible toolkit. Although modern tools like restic and borg offer better features in certain scenarios, the popularity and reliability of tar make it an indispensable foundational tool in the Linux environment.