The scripts used in this article are listed below and can be copied exactly as they appear here:
train_script.sh
#!/bin/bash
train=(
" ____ "
" _|____|____ "
" | _________ | "
" | _ _ | "
" |_| |_| |_| "
)
cols=$(tput cols)
train_width=0
for line in "${train[@]}"; do
    (( ${#line} > train_width )) && train_width=${#line}
done
clear
tput civis
trap 'tput cnorm; exit' SIGINT
position=0
direction=1
while true; do
    clear
    position=$((position + direction))
    if (( position >= cols - train_width )); then
        direction=-1
    elif (( position <= 0 )); then
        direction=1
    fi
    for line in "${train[@]}"; do
        printf "%${position}s%s\n" "" "$line"
    done
    sleep 0.1
done
tab.sh
#!/bin/bash
generate_random() {
    local min=$1
    local max=$2
    echo $(( RANDOM % (max - min + 1) + min ))
}
MIN=1
MAX=100
COUNT=1
while getopts "m:M:c:" opt; do
    case $opt in
        m) MIN=$OPTARG ;;
        M) MAX=$OPTARG ;;
        c) COUNT=$OPTARG ;;
        *) echo "USE: $0 [-m min] [-M max] [-c count]"; exit 1 ;;
    esac
done
for ((i=0; i<COUNT; i++)); do
    generate_random $MIN $MAX
done
In daily operations, we often copy Bash, Python, or Expect scripts from public accounts or websites and run them on Linux servers. Sometimes they fail with errors such as `command not found` or `syntax error`. This is usually because special characters were converted during the copy-and-paste process, for example:
- Inconsistent line endings between Windows and Linux
- Invisible characters (such as full-width spaces, BOM headers, carriage returns, etc.)
- HTML-encoded characters (such as `&nbsp;`, which is pasted as a non-breaking space instead of a normal space)
- Formatting errors (for example, tabs turning into multiple spaces)
This article will introduce several methods to help you quickly fix these issues.
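Before applying any single fix, it can help to count how many lines contain each kind of suspicious byte. The following is a minimal sketch, assuming bash and GNU grep; the function name `scan_paste_artifacts` and the sample file are made up for illustration and are not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper: count lines containing common copy-paste artifacts.
scan_paste_artifacts() {
    local file=$1
    local crlf nbsp tabs bom
    # LC_ALL=C makes grep match raw bytes instead of locale characters.
    crlf=$(LC_ALL=C grep -c $'\r$' "$file")          # lines ending in CR
    nbsp=$(LC_ALL=C grep -c $'\xc2\xa0' "$file")     # UTF-8 non-breaking space
    tabs=$(LC_ALL=C grep -c $'\t' "$file")           # tab characters
    # A UTF-8 BOM can only appear in the first three bytes of the file.
    bom=$(head -c 3 "$file" | LC_ALL=C grep -c $'^\xef\xbb\xbf')
    echo "crlf=$crlf nbsp=$nbsp tabs=$tabs bom=$bom"
}

# Build a small sample file containing one of each problem.
sample=$(mktemp)
printf '\xef\xbb\xbfecho\xc2\xa0hi\r\n\techo ok\n' > "$sample"
scan_paste_artifacts "$sample"
```

For the sample file this prints `crlf=1 nbsp=1 tabs=1 bom=1`. A count of zero everywhere suggests the problem lies elsewhere.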
1. Use `cat -A` to check file format
In the Linux terminal, you can use the `cat -A` command to reveal special characters. The table below lists the characters it makes visible:
Special Character | Display Method |
---|---|
Tab | Shown as `^I` (uppercase i) |
Line Feed | Shown as `$` at the end of the line; this is normal and must be preserved during processing |
Carriage Return | Shown as `^M` (usually found in Windows/DOS files) |
Abnormal Spaces | Shown as `M-BM-`; usually non-breaking spaces or other non-printable characters copied from web pages |
cat -A train_script.sh
The output for the script containing abnormal spaces is as follows:
[root@test ~]# cat -A train_script.sh
#!/bin/bash$
$
train=($
" M-BM- M-BM- ____ M-BM- M-BM- M-BM- M-BM- M-BM- "M-BM- $
" M-BM- _|____|____ M-BM- M-BM- "$
" | M-BM- _________ | M-BM- "$
" M-BM- | M-BM- _ M-BM- _ M-BM- | M-BM- M-BM- "$
" M-BM- |_| |_| |_| M-BM- M-BM- "$
)$
$
cols=$(tput cols)$
train_width=0$
for line in "${train[@]}"; do$
M-BM- M-BM- (( ${#line} > train_width )) && train_width=${#line}$
done$
$
clear$
tput civis$
$
trap 'tput cnorm; exit' SIGINT$
$
position=0$
direction=1$
$
while true; do$
M-BM- M-BM- clear$
M-BM- M-BM- position=$((position + direction))$
$
M-BM- M-BM- if (( position >= cols - train_width )); then$
M-BM- M-BM- M-BM- M-BM- direction=-1$
M-BM- M-BM- elif (( position <= 0 )); then$
M-BM- M-BM- M-BM- M-BM- direction=1$
M-BM- M-BM- fi$
$
M-BM- M-BM- for line in "${train[@]}"; do$
M-BM- M-BM- M-BM- M-BM- printf "%${position}s%s\n" "" "$line"$
M-BM- M-BM- done$
$
M-BM- M-BM- sleep 0.1$
done$
The output for the script containing tabs is as follows:
[root@test ~]# cat -A tab.sh
#!/bin/bash$
generate_random() {$
^Ilocal min=$1$
^Ilocal max=$2$
^Iecho $(( RANDOM % (max - min + 1) + min ))$
} $
MIN=1$
MAX=100$
COUNT=1$
while getopts "m:M:c:" opt; do$
^Icase $opt in$
^I^Im) MIN=$OPTARG ;;$
^I^IM) MAX=$OPTARG ;;$
^I^Ic) COUNT=$OPTARG ;;$
^I^I*) echo "USE: $0 [-m min] [-M max] [-c count]"; exit 1 ;;$
^Iesac$
done$
for ((i=0; i<COUNT; i++)); do$
^Igenerate_random $MIN $MAX$
done$
Command Explanation
`-A` is equivalent to combining the following three options:
- `-v` (show non-printing characters): displays non-printable characters in `^` and `M-` notation (for example, `^M` represents a carriage return), except for line feeds and tabs.
- `-E` (show line ends): adds a `$` symbol at the end of each line, marking where the line feed is.
- `-T` (show tabs): displays tabs as `^I` instead of actual tabs.
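To see these notations in isolation, you can feed `cat -A` a few known bytes. This is a small sketch assuming bash and GNU `cat`; the variable names are only for illustration.

```shell
#!/bin/bash
# Feed known bytes to cat -A to see how each one is rendered.
tab_cr=$(printf 'a\tb\r\n' | cat -A)      # a tab, then a CR before the line feed
nbsp=$(printf 'x\xc2\xa0y\n' | cat -A)    # UTF-8 non-breaking space between x and y
echo "$tab_cr"
echo "$nbsp"
```

The first line prints `a^Ib^M$` and the second prints `xM-BM- y$`. Note that the byte `a0` is rendered as `M-` followed by a space, so each non-breaking space appears as `M-BM-` plus one trailing space.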
2. Fixing Illegal Characters M-BM-
Use the od command to view the hexadecimal encoding of the file
od -t x1 -c train_script.sh
Command Explanation:
- `od` (Octal Dump): a binary file viewing tool that displays file content in octal by default; the format can be changed with options.
- `-t`: specifies the output format.
- `x1`: displays hexadecimal, one byte at a time.
- `-c`: also displays each byte as a character. Common control characters (such as line feeds and tabs) are shown as escape sequences (such as `\n`, `\t`), and other non-printable bytes as three-digit octal values.
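A short input makes the `od` columns easier to read than a whole script. The sketch below assumes bash and an `od` that accepts `-An` (which hides the offset column); the variable names are illustrative.

```shell
#!/bin/bash
# Dump a tiny known input: "A", a tab, a non-breaking space, a line feed.
# -An suppresses the offset column; -t x1 prints one hex byte per unit.
hex=$(printf 'A\t\xc2\xa0\n' | od -An -t x1)
echo "$hex"
# Collapse the spacing so the value is easy to compare in scripts.
compact=$(printf '%s' "$hex" | tr -d ' \n')
echo "$compact"
```

The compact form is `4109c2a00a`: `41` is `A`, `09` is the tab, `c2 a0` is the non-breaking space, and `0a` is the line feed.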
Example Output:
0000000 23 21 2f 62 69 6e 2f 62 61 73 68 0a 0a 74 72 61
# ! / b i n / b a s h \n \n t r a
0000020 69 6e 3d 28 0a 22 20 c2 a0 20 c2 a0 5f 5f 5f 5f
i n = ( \n " 302 240 302 240 _ _ _ _
0000040 20 c2 a0 20 c2 a0 20 c2 a0 20 c2 a0 20 c2 a0 22
302 240 302 240 302 240 302 240 302 240 "
0000060 c2 a0 0a 22 20 c2 a0 5f 7c 5f 5f 5f 5f 7c 5f 5f
302 240 \n " 302 240 _ | _ _ _ _ | _ _
0000100 5f 5f 20 c2 a0 20 c2 a0 20 22 0a 22 20 7c 20 c2
_ _ 302 240 302 240 " \n " | 302
0000120 a0 5f 5f 5f 5f 5f 5f 5f 5f 5f 20 7c 20 c2 a0 20
240 _ _ _ _ _ _ _ _ _ | 302 240
0000140 22 0a 22 20 c2 a0 20 7c 20 c2 a0 5f 20 c2 a0 5f
" \n " 302 240 | 302 240 _ 302 240 _
0000160 20 c2 a0 7c 20 c2 a0 20 c2 a0 20 22 0a 22 20 c2
302 240 | 302 240 302 240 " \n " 302
0000200 a0 20 7c 5f 7c 20 7c 5f 7c 20 7c 5f 7c 20 c2 a0
240 | _ | | _ | | _ | 302 240
0000220 20 c2 a0 22 0a 29 0a 0a 63 6f 6c 73 3d 24 28 74
............ (second half omitted)
From the output, we can see the byte pair `c2 a0` (`302 240` in octal), which corresponds to `M-BM-`. The bytes before it decode normally (see the second line of the output): `#!/bin/bash`, `\n` (a line feed), `train=(`, and another `\n`. The `c2 a0` pairs that follow have no printable form, so `od -c` shows them as octal values.
`c2 a0` (`M-BM-`) is the UTF-8 encoding of the non-breaking space (U+00A0), so we can use `sed` to replace it with a normal space:
sed -i 's/\xC2\xA0/ /g' train_script.sh
Command Explanation
- `-i` modifies the file in place (to keep the original file, remove `-i` and redirect the output to a new file with `>`).
- `\xC2\xA0` is the UTF-8 byte sequence, written in hexadecimal, of `U+00A0`; you can see the character by running `echo -e "aa\xC2\xA0aa"` in Linux.
- `/ /g` replaces every occurrence with a normal space.
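If you prefer a safety net, the same replacement can keep a backup and then verify that nothing was missed. This is a sketch on a throwaway file, assuming GNU sed and grep; forcing `LC_ALL=C` is an assumption made here so that matching is strictly byte-wise.

```shell
#!/bin/bash
# Work on a throwaway file so the example is safe to run anywhere.
demo=$(mktemp)
printf 'echo\xc2\xa0"hello"\n' > "$demo"

# -i.bak edits in place but keeps the original as "$demo.bak" (GNU sed).
LC_ALL=C sed -i.bak 's/\xC2\xA0/ /g' "$demo"

# Count the lines that still contain the two-byte sequence; 0 means clean.
remaining=$(LC_ALL=C grep -c $'\xc2\xa0' "$demo")
echo "remaining=$remaining"
cat "$demo"
```

After the run, `remaining=0` is printed and the file contains `echo "hello"` with a normal space, while the untouched original is still available as the `.bak` file.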
The script can now run normally
3. Fixing Illegal Characters `^I`
Using the experience from the previous script fix, first execute the command to check the encoding
od -t x1 -c tab.sh
Example Output:
0000000 23 21 2f 62 69 6e 2f 62 61 73 68 0a 67 65 6e 65
# ! / b i n / b a s h \n g e n e
0000020 72 61 74 65 5f 72 61 6e 64 6f 6d 28 29 20 7b 0a
r a t e _ r a n d o m ( ) { \n
0000040 09 6c 6f 63 61 6c 20 6d 69 6e 3d 24 31 0a 09 6c
\t l o c a l m i n = $ 1 \n \t l
0000060 6f 63 61 6c 20 6d 61 78 3d 24 32 0a 09 65 63 68
o c a l m a x = $ 2 \n \t e c h
0000100 6f 20 24 28 28 20 52 41 4e 44 4f 4d 20 25 20 28
o $ ( ( R A N D O M % (
0000120 6d 61 78 20 2d 20 6d 69 6e 20 2b 20 31 29 20 2b
m a x - m i n + 1 ) +
0000140 20 6d 69 6e 20 29 29 0a 7d 0a 4d 49 4e 3d 31 0a
m i n ) ) \n } \n M I N = 1 \n
0000160 4d 41 58 3d 31 30 30 0a 43 4f 55 4e 54 3d 31 0a
M A X = 1 0 0 \n C O U N T = 1 \n
......... (second half omitted)
From the output, we can see that the illegal character `^I` corresponds to `\t`, which is the tab character, so we can fix it with one of the following commands:
# A tab is often displayed as four columns, so replacing each tab with four spaces keeps the original script layout
sed 's/\x09/    /g' tab.sh    # prints the result; add -i to edit the file in place
or
sed -i 's/\t/    /g' tab.sh
or
expand -t 4 tab.sh > new_tab.sh
In the od output, `09` is the hexadecimal value, so why is there no octal value beneath it? Because hexadecimal `09` is the tab character, which `od -c` can show as the escape `\t` instead of an octal number.
Command Explanation
- `expand` is a command-line tool that converts tabs in a file to spaces.
- `-t` sets the tab stop width. With `-t 4`, each tab is padded with spaces up to the next multiple of four columns, so a tab at the start of a line becomes four spaces.
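The difference between `expand` and a fixed `sed` replacement only shows up when a tab is not at the start of a line. A quick sketch, assuming GNU sed (for `\t` in the pattern) and the standard `expand` tool; the variable names are illustrative:

```shell
#!/bin/bash
# A tab in the middle of a line: expand pads to the next tab stop,
# while sed inserts a fixed number of spaces regardless of the column.
with_expand=$(printf 'ab\tc\n' | expand -t 4)
with_sed=$(printf 'ab\tc\n' | sed 's/\t/    /g')
echo "expand: [$with_expand]"
echo "sed:    [$with_sed]"
```

Here `expand` produces `ab  c` (two spaces, to reach column 4), while `sed` produces `ab    c` (always four spaces). For indentation at the start of a line the two give the same result.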
The script can now run normally
bash new_tab.sh -m 1 -M 5000 -c 10
4. Other Methods
You can use cat to generate a new file and use sed for file replacement.
Generate a new file
cat -vT train_script.sh > new_train_script.sh
[!NOTE]
Consideration: Why not use `cat -A` here?
The content is as follows:
[root@test ~]# cat -vT space.sh.bak
#!/bin/bash
train=(
" M-BM- M-BM- ____ M-BM- M-BM- M-BM- M-BM- M-BM- "M-BM-
" M-BM- _|____|____ M-BM- M-BM- "
" | M-BM- _________ | M-BM- "
" M-BM- | M-BM- _ M-BM- _ M-BM- | M-BM- M-BM- "
" M-BM- |_| |_| |_| M-BM- M-BM- "
)
cols=$(tput cols)
train_width=0
for line in "${train[@]}"; do
M-BM- M-BM- (( ${#line} > train_width )) && train_width=${#line}
done
clear
tput civis
trap 'tput cnorm; exit' SIGINT
position=0
direction=1
while true; do
M-BM- M-BM- clear
M-BM- M-BM- position=$((position + direction))
M-BM- M-BM- if (( position >= cols - train_width )); then
M-BM- M-BM- M-BM- M-BM- direction=-1
M-BM- M-BM- elif (( position <= 0 )); then
M-BM- M-BM- M-BM- M-BM- direction=1
M-BM- M-BM- fi
M-BM- M-BM- for line in "${train[@]}"; do
M-BM- M-BM- M-BM- M-BM- printf "%${position}s%s\n" "" "$line"
M-BM- M-BM- done
M-BM- M-BM- sleep 0.1
done
Perform replacement
sed -i 's/M-BM-/ /g' new_train_script.sh
The script can now run normally
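One caveat with this approach: `cat -v` rewrites every byte above ASCII, not only non-breaking spaces, so a script that contains legitimate non-ASCII text (for example, comments or strings in another language) would be altered too. The sketch below illustrates this with an accented character; the sample string is made up.

```shell
#!/bin/bash
# cat -v rewrites every byte >= 0x80, not just non-breaking spaces.
# Here the two bytes of "é" (c3 a9) become the text "M-CM-)".
mangled=$(printf 'echo "caf\xc3\xa9"\n' | cat -vT)
echo "$mangled"
```

This prints `echo "cafM-CM-)"`, so the method is best kept for scripts that are otherwise plain ASCII.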
5. Other
Check for illegal `^I` characters:
vim script.sh
:set list
Enter the `:set list` command to check whether the file contains `^I` symbols. If so, you can delete them manually or use:
:%s/\t/ /g
# Here, \t is used because ^I represents the tab character
Content copied from Windows may carry Windows-style line endings, for example:
[root@test]# cat -A space.sh
#!/bin/bash^M$
^M$
train=(^M$
" ____ " ^M$
" _|____|____ "^M$
" | _________ | "^M$
" | _ _ | "^M$
" |_| |_| |_| "^M$
)^M$
^M$
cols=$(tput cols)^M$
train_width=0^M$
for line in "${train[@]}"; do^M$
(( ${#line} > train_width )) && train_width=${#line}^M$
done^M$
^M$
clear^M$
tput civis^M$
^M$
trap 'tput cnorm; exit' SIGINT^M$
^M$
position=0^M$
direction=1^M$
^M$
while true; do^M$
clear^M$
position=$((position + direction))^M$
^M$
if (( position >= cols - train_width )); then^M$
direction=-1^M$
elif (( position <= 0 )); then^M$
direction=1^M$
fi^M$
^M$
for line in "${train[@]}"; do^M$
printf "%${position}s%s\n" "" "$line"^M$
done^M$
^M$
sleep 0.1^M$
done^M$
The output shows that each line ends with `^M$` rather than `$`.
There are two common types of line endings:
- `^M$` → Windows line endings (CRLF, i.e., `\r\n`)
- `$` → Unix line endings (LF, i.e., `\n`)
You can test with the following commands
printf "Hello\r\nWorld\r\n" > win.txt # Write CRLF line endings
echo -e "Hello\nWorld" > unix.txt # Write LF line endings
# View output
cat -A win.txt
cat -A unix.txt
[!NOTE]
Consideration: Why does one command use `printf` and the other `echo`?
Of course, you can also use another way to check
vi win.txt
:set ff?
# or
[root@test ~]# file win.txt
win.txt: ASCII text, with CRLF line terminators
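When the check has to happen inside a script rather than interactively, searching for a carriage return at the end of a line works as well. This is a minimal sketch assuming bash and grep; `has_crlf` is a hypothetical helper name, and the temporary files stand in for `win.txt` and `unix.txt`.

```shell
#!/bin/bash
# Hypothetical helper: succeed if any line ends with a carriage return.
has_crlf() {
    LC_ALL=C grep -q $'\r$' "$1"
}

win=$(mktemp)
unix=$(mktemp)
printf 'Hello\r\nWorld\r\n' > "$win"    # CRLF line endings
printf 'Hello\nWorld\n' > "$unix"       # LF line endings

for f in "$win" "$unix"; do
    if has_crlf "$f"; then echo "CRLF"; else echo "LF"; fi
done
```

The loop prints `CRLF` for the first file and `LF` for the second.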
Fix Windows line endings
# Let vi reformat
vim script.sh
:set ff=unix
:wq
# Alternatively, you can check the encoding as shown above and replace with sed, but the vim method is simpler.
# Two other commands, dos2unix and unix2dos, must be installed first
# Windows --> Unix:
dos2unix filename
# Unix --> Windows:
unix2dos filename
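If neither tool is installed, the fixes from this article can be combined into a single `sed` pass. The following is a sketch under stated assumptions: GNU sed, byte-wise matching forced with `LC_ALL=C`, and a hypothetical helper name `clean_pasted_script`. It writes the cleaned script to standard output rather than editing in place, and it leaves tabs untouched.

```shell
#!/bin/bash
# Hypothetical helper: write a cleaned copy of a pasted script to stdout.
#   1s/^\xEF\xBB\xBF//  drop a UTF-8 BOM at the start of the file
#   s/\r$//             drop a carriage return at the end of each line
#   s/\xC2\xA0/ /g      turn non-breaking spaces into normal spaces
clean_pasted_script() {
    LC_ALL=C sed -e '1s/^\xEF\xBB\xBF//' -e 's/\r$//' -e 's/\xC2\xA0/ /g' "$1"
}

# A deliberately broken sample: BOM, CRLF endings, and a non-breaking space.
dirty=$(mktemp)
printf '\xef\xbb\xbf#!/bin/bash\r\necho\xc2\xa0"ok"\r\n' > "$dirty"
clean_pasted_script "$dirty" > "$dirty.clean"
bash "$dirty.clean"
```

Running the cleaned copy prints `ok`. Keeping the original file untouched makes it easy to compare the two versions with `cat -A` before replacing anything.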
I hope this tutorial on fixing special characters helps you! 🚀🚀!