How to Encrypt Chinese Text

When reading various encryption articles, you often see that the authors use numbers as examples for encryption. However, in real life, we communicate in Chinese. So how can we encrypt Chinese text?

In the article “Don’t Worry, No One Can Snoop on Our Chat Messages”, we encrypted a piece of Chinese text with the following code:

# Chinese for "Tonight at 8 PM, let's meet at the usual place"
msg = '今晚8点，老地方碰头'
encrypted_msg = rsa.encrypt(msg.encode(), public_key)

Here, msg.encode() converts the Chinese message into bytes data. Let’s take a look at what this bytes data looks like:

[Image: the bytes output of msg.encode()]
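As a rough text version of this step, the sketch below encodes the message and prints the result. It assumes the original message was the Chinese string 今晚8点，老地方碰头, which is what the ciphertext later in this article decodes to.

```python
# Assumption: this is the original Chinese message
# ("Tonight at 8 PM, let's meet at the usual place").
msg = '今晚8点，老地方碰头'

msg_bytes = msg.encode()  # str.encode() uses UTF-8 by default
print(msg_bytes)

# Each Chinese character (and the full-width comma) takes 3 bytes in UTF-8,
# while the ASCII digit 8 takes 1 byte: 9 * 3 + 1 = 28.
print(len(msg), len(msg_bytes))  # → 10 28
```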

It appears as a long run of hexadecimal escapes, which is hard to read. Now let’s convert this bytes data into a list:

[Image: the bytes data converted into a list of numbers]
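A minimal sketch of the conversion, again assuming the message is the Chinese string 今晚8点，老地方碰头 (the variable names are only illustrative):

```python
# Assumption: the original Chinese message.
msg = '今晚8点，老地方碰头'

numbers = list(msg.encode())  # a bytes object is a sequence of integers
print(numbers[:6])  # → [228, 187, 138, 230, 153, 154]

# Every element is a single byte, so it always lies in 0-255.
print(all(0 <= n <= 255 for n in numbers))  # → True
```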

The result is a list of numbers. Moreover, if you try this with different messages, you’ll find that the numbers always fall in the range 0-255, which corresponds to hexadecimal 00 to ff.

Now we can encrypt the numbers. Since 00 to ff covers exactly the 8-bit binary numbers, let’s assume the key is 45, whose binary value is 00101101. We XOR each number in this list with 45 to obtain a new list of numbers:

[Image: each number in the list XORed with 45]
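The XOR step might look like the sketch below. The message string is an assumption (the Chinese original that the ciphertext later in this article decodes to), and KEY is just an illustrative name for the key 45.

```python
KEY = 45  # 0b00101101

# Assumption: the original Chinese message.
numbers = list('今晚8点，老地方碰头'.encode())

encrypted_numbers = [n ^ KEY for n in numbers]
print(encrypted_numbers[:3])  # → [201, 150, 167]

# Worked example for the first byte:
#   228 = 0b11100100
#    45 = 0b00101101
#   XOR = 0b11001001 = 201
print(format(228 ^ KEY, '08b'))  # → 11001001
```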

Note that when you iterate with a for loop, you don’t need to convert the bytes data into a list beforehand. Iterating over the bytes data directly yields the same numbers, as shown in the image below:

[Image: iterating over the bytes data directly in a for loop]
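To illustrate, this short sketch compares both ways of looping. The message string is assumed to be the Chinese original of the example message.

```python
# Assumption: the original Chinese message.
msg_bytes = '今晚8点，老地方碰头'.encode()

via_list = [n ^ 45 for n in list(msg_bytes)]  # convert to a list first
direct = [n ^ 45 for n in msg_bytes]          # iterate over bytes directly

# Iterating over a bytes object already yields integers.
print(via_list == direct)  # → True
```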

Now, let’s convert this list of numbers back into bytes data:

[Image: the list of numbers converted back into bytes data]
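A possible version of this conversion, assuming the Chinese message 今晚8点，老地方碰头 and the key 45 from above:

```python
# Assumption: the original Chinese message, XORed with the key 45.
encrypted_numbers = [n ^ 45 for n in '今晚8点，老地方碰头'.encode()]

# bytes() accepts any iterable of integers in the range 0-255.
encrypted_bytes = bytes(encrypted_numbers)
print(encrypted_bytes)
print(len(encrypted_bytes))  # → 28
```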

It is important to note that not every bytes value can be decoded back into a string. The new ciphertext bytes are not valid UTF-8, so they cannot be decoded:

[Image: decoding the ciphertext bytes fails]
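The failure can be reproduced with a sketch like this one. It assumes the same Chinese message and key as before, and decode_failed is only an illustrative flag.

```python
# Assumption: the original Chinese message, XORed with the key 45.
encrypted_bytes = bytes(n ^ 45 for n in '今晚8点，老地方碰头'.encode())

decode_failed = False
try:
    encrypted_bytes.decode()  # tries to interpret the bytes as UTF-8
except UnicodeDecodeError as exc:
    decode_failed = True
    print('Cannot decode:', exc)
```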

At this point, we need to use base64 to encode it into a string:

[Image: the ciphertext bytes encoded with base64]
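A minimal sketch of the base64 step, assuming the Chinese message 今晚8点，老地方碰头 and the key 45:

```python
import base64

# Assumption: the original Chinese message, XORed with the key 45.
encrypted_bytes = bytes(n ^ 45 for n in '今晚8点，老地方碰头'.encode())

encoded = base64.b64encode(encrypted_bytes)
print(encoded)  # → b'yZany7S3FcqvlMKRocWtrMixncu7lMqPnciJmQ=='
```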

The b64encode method of base64 also returns bytes data, but this time it can be decoded into a regular string, because base64 output contains only ASCII characters. The regular string has exactly the same characters as its bytes representation:

[Image: the base64 bytes decoded into a regular string]
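The following sketch shows the decode step and compares the string with the bytes representation. The message string is an assumption, as before.

```python
import base64

# Assumption: the original Chinese message, XORed with the key 45.
encrypted_bytes = bytes(n ^ 45 for n in '今晚8点，老地方碰头'.encode())
encoded = base64.b64encode(encrypted_bytes)  # bytes

code = encoded.decode()  # str; works because base64 output is pure ASCII
print(code)  # → yZany7S3FcqvlMKRocWtrMixncu7lMqPnciJmQ==

# The bytes repr is the same text wrapped in b'...'.
print(repr(encoded) == "b'" + code + "'")  # → True
```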

Now, you can send this ciphertext to your friend via a public chat application.

Your friend just needs to reverse the entire process to recover the original message. Let’s write a piece of code:

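import base64

code = 'yZany7S3FcqvlMKRocWtrMixncu7lMqPnciJmQ=='

# Undo the base64 step to recover the encrypted bytes
encrypted_msg = base64.b64decode(code.encode())

# XOR every byte with the same key (45) to undo the encryption
bytes_list = [x ^ 45 for x in encrypted_msg]

# Rebuild the bytes data and decode it as UTF-8 text
msg_bytes = bytes(bytes_list)
msg = msg_bytes.decode()
print(msg)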

The output is shown in the image below:

[Image: output of the decryption code]

Successfully decrypted.
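Putting both directions together, the whole round trip could be wrapped in two small helpers. This is only a sketch: xor_encrypt and xor_decrypt are hypothetical names, the key is assumed to be a single integer in 0-255, and the message is assumed to be the Chinese string that the ciphertext above decodes to.

```python
import base64


def xor_encrypt(text: str, key: int) -> str:
    """Encode text as UTF-8, XOR every byte with key, return base64 text."""
    data = bytes(b ^ key for b in text.encode())
    return base64.b64encode(data).decode()


def xor_decrypt(code: str, key: int) -> str:
    """Reverse xor_encrypt: base64-decode, XOR with key, decode as UTF-8."""
    data = base64.b64decode(code)
    return bytes(b ^ key for b in data).decode()


# Assumption: the original Chinese message.
code = xor_encrypt('今晚8点，老地方碰头', 45)
print(code)                   # → yZany7S3FcqvlMKRocWtrMixncu7lMqPnciJmQ==
print(xor_decrypt(code, 45))  # → 今晚8点，老地方碰头
```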

The number 45 is the key. The XOR operation has a property:

A xor B xor B = A

For a number A, XORing it with another number B gives the ciphertext C. XORing C with B again recovers A.

We utilized this property to achieve encryption and decryption.
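The property is easy to check by brute force for every pair of 8-bit values. The sketch below does that and then works through one byte from the example (228 with the key 45).

```python
# Check (a ^ b) ^ b == a for all 8-bit values of a and b.
holds = all((a ^ b) ^ b == a for a in range(256) for b in range(256))
print(holds)  # → True

# One concrete case: plaintext byte 228, key 45.
a, b = 228, 45
c = a ^ b         # encrypt
print(c, c ^ b)   # → 201 228
```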

Some of you may wonder why we chose XOR here instead of encrypting by multiplying every number in the list by some value or adding some value to it. The reason is that to convert a list of numbers into bytes data, every number in the list must be in the range 0-255; otherwise an error is raised, as shown in the image below:

[Image: error when a number in the list falls outside 0-255]
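As an illustration of the error, the sketch below adds 45 to every byte instead of XORing it. The message is assumed to be the Chinese original of the example, and conversion_failed is only an illustrative flag.

```python
# Assumption: the original Chinese message; adding 45 instead of XORing.
added = [n + 45 for n in '今晚8点，老地方碰头'.encode()]
print(added[0])  # → 273, already outside 0-255

conversion_failed = False
try:
    bytes(added)
except ValueError as exc:
    conversion_failed = True
    print('Cannot build bytes:', exc)
```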

We cannot guarantee that the numbers after addition or multiplication will still be in the range 0-255, and more complex operations may not be reversible. XORing two 8-bit numbers always gives another 8-bit number and can be undone with the same key, so we chose the XOR algorithm.
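A quick check of this argument for the key 45, written as a small sketch (KEY, xored and added are illustrative names):

```python
KEY = 45

xored = [n ^ KEY for n in range(256)]
added = [n + KEY for n in range(256)]

# XOR keeps every value inside 0-255 ...
print(all(0 <= n <= 255 for n in xored))  # → True
# ... and maps the 256 byte values onto each other without collisions.
print(sorted(xored) == list(range(256)))  # → True
# Plain addition overflows the byte range.
print(max(added))  # → 300
```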

From this article, we can see that encrypting Chinese text is essentially encrypting numbers.
