1 Introduction
Recently, I encountered a situation in development where file uploads were done using Base64 encoding. I remember when I first learned about HTTP file uploads, they were done by directly uploading binary files with a content-type of multipart/form-data. We know that all transmissions over the network ultimately transmit binary streams, so there is no doubt that they are essentially the same. Then why do we need to convert to Base64 first? What are the differences between these two methods? Let’s analyze this together.
2 multipart/form-data Upload
First, let’s take a look at the multipart/form-data method. I created a simple example locally to observe HTTP multipart/form-data file uploads. The HTML code is as follows:
<!DOCTYPE html>
<html>
<head>
  <title>File Upload Example</title>
  <meta charset="UTF-8">
</head>
<body>
  <h1>File Upload Example</h1>
  <form action="/upload" method="POST" enctype="multipart/form-data">
    <label for="file">Choose File:</label>
    <input type="file" id="file" name="file"><br>
    <label for="tx">Description:</label>
    <input type="text" id="tx" name="remark"><br><br>
    <input type="submit" value="Upload">
  </form>
</body>
</html>
The page display is also quite simple.
After selecting a file and clicking upload, I checked the request information in debugging mode using the Edge browser’s F12. The request header is as follows:
In the request header, the Content-Type is multipart/form-data; boundary=----WebKitFormBoundary4TaNXEII3UbH8VKo. It can be a bit confusing at first, but it's not complicated: the request body is divided into multiple parts, one per form field, and the boundary string marks where each part begins. In this case the form has two fields, file and remark, so the body contains two parts. Each part is introduced by a delimiter line made of two hyphens followed by the boundary value, that is, ------WebKitFormBoundary4TaNXEII3UbH8VKo, and the delimiter and header lines end with CRLF (CR, carriage return, moves the cursor to the beginning of the current line; LF, line feed, ends the line and starts a new one). Note that the delimiter after the last part must end with two additional hyphens, "--". Now, let's take a look at the request body.
The first part is the file field, which has a Content-Type of image/png. The second part is the remark field, which does not declare a Content-Type and therefore defaults to text/plain; its value is the input "test" in this example. At this point, you might wonder where the uploaded image is. Don't worry: I suspect the browser's developer tools simply do not display binary content in the request body view. We can verify that the bytes are really there using the Fiddler packet capture tool.
We can see a string of garbled text in the first part. This is because the image is a binary file, and displaying it in text format naturally results in garbled text. This also confirms that binary files are indeed included in the request body. The backend uses the Spring Boot framework to receive files via MultipartFile, which parses each part of the request body to ultimately obtain the binary stream.
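import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
public class FileController {

    // @RequestParam binds values from multipart/form-data or
    // application/x-www-form-urlencoded request bodies.
    @PostMapping("/upload")
    public String upload(@RequestParam("file") MultipartFile file) {
        return "test";
    }
}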
Thus, the analysis of the multipart/form-data file upload method is complete. For official information on multipart/form-data, you can refer to RFC 7578 – Returning Values from Forms: multipart/form-data (ietf.org).
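To recap the body layout described in this section, here is a minimal sketch that assembles such a body by hand. The class name MultipartBodySketch, the file name demo.png, and the placeholder bytes are assumptions made for illustration; in practice the browser or an HTTP client library builds the body for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class MultipartBodySketch {
    static final String CRLF = "\r\n";

    // Builds a two-part body: a file part named "file" and a text part named "remark".
    static byte[] build(String boundary, String fileName, byte[] fileBytes, String remark) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Part 1: delimiter line, part headers, blank line, then the raw file bytes.
        write(out, "--" + boundary + CRLF);
        write(out, "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"" + CRLF);
        write(out, "Content-Type: image/png" + CRLF + CRLF);
        out.write(fileBytes, 0, fileBytes.length);
        write(out, CRLF);

        // Part 2: no Content-Type header, so the receiver treats it as text/plain.
        write(out, "--" + boundary + CRLF);
        write(out, "Content-Disposition: form-data; name=\"remark\"" + CRLF + CRLF);
        write(out, remark + CRLF);

        // Closing delimiter: the boundary with two extra hyphens at the end.
        write(out, "--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        String boundary = "----WebKitFormBoundary4TaNXEII3UbH8VKo";
        byte[] fakeImage = {(byte) 0x89, 'P', 'N', 'G'}; // placeholder bytes, not a real image
        byte[] body = build(boundary, "demo.png", fakeImage, "test");
        // ISO-8859-1 maps every byte to one character, so the layout can be printed as-is.
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```

Printing the result shows the same shape as the captured request: the file bytes sit unencoded between the part headers and the next delimiter.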
3 Base64 Upload
In an HTML form, a file can only be uploaded through multipart/form-data, which is a significant limitation when the rest of an API uses a different content type. So, is there a way around this limitation? Can a file be uploaded with another content type, such as application/json? Of course, yes. If the file is treated as a string, it is no different from any other ordinary parameter and can be sent with any content type, which makes file uploads much simpler. But the question is how to convert a binary stream into a string, and there are many pitfalls here. The common practice in the industry is to encode the binary stream into a string using Base64. But why use Base64 instead of converting the bytes directly to a string? Let's analyze this below.
3.1 Base64 Encoding Principle
Before analyzing the principle, let's first answer what Base64 encoding is. Base64 is just an encoding method, not an encryption algorithm, so anyone can decode Base64 text back to the original bytes. It simply maps arbitrary bytes, including non-printable ones, to printable characters according to fixed rules.
The rule is to split the bits of the input into groups of 6. Each group has a value from 0 to 63 and corresponds to one of 64 printable characters. Since each output character occupies one byte, every 6-bit group is in effect padded with two leading zeros, so 3 input bytes become 4 output bytes and the encoded text is longer than the input by 2/6 = 1/3. Since the least common multiple of 6 and 8 is 24, the input is processed in blocks of 3 bytes (24/8 = 3 bytes). If the last block is short, it is padded, and one "=" is appended for each padding byte (one or two).
This might sound abstract, so let's illustrate it with the following example. We will perform Base64 encoding on the ASCII string "AB\nC" (where \n is the line feed character, LF). Since there are 4 bytes in total, to reach a multiple of 3 we extend it to 6 bytes by padding 2 additional bytes at the end.
Table 3.1
After converting to binary, each 6-bit group is shown in a different color, and two leading zeros are added in front of each group to form a byte. The final Base64-encoded string is QUIKQw==. You can look up the Base64 alphabet table online; it is defined in RFC 4648.
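The grouping steps can also be written out in code. The following is a minimal sketch for illustration, not a replacement for a library encoder; the class name ManualBase64 is hypothetical, and the alphabet is the standard one from RFC 4648.

```java
import java.nio.charset.StandardCharsets;

final class ManualBase64 {
    // Standard Base64 alphabet: index 0..63 maps to one printable character.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encode(byte[] input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length; i += 3) {
            int remaining = Math.min(3, input.length - i);

            // Pack up to 3 bytes into one 24-bit block; missing bytes stay zero.
            int block = 0;
            for (int j = 0; j < 3; j++) {
                block <<= 8;
                if (j < remaining) {
                    block |= input[i + j] & 0xFF;
                }
            }

            // Emit four 6-bit groups; groups that hold only padding become '='.
            for (int k = 0; k < 4; k++) {
                if (k <= remaining) {
                    int index = (block >> (18 - 6 * k)) & 0x3F;
                    out.append(ALPHABET.charAt(index));
                } else {
                    out.append('=');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 'A' 'B' LF -> Q U I K, then 'C' plus two padding bytes -> Q w = =
        System.out.println(encode("AB\nC".getBytes(StandardCharsets.US_ASCII))); // prints QUIKQw==
    }
}
```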
We will verify our results by running a program.
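The original program is not reproduced here. A minimal check of this kind, assuming the JDK's java.util.Base64 and a hypothetical class name Base64Check, might look like this:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Check {
    // Encodes an ASCII string with the JDK's standard (padded) Base64 encoder.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    // Decodes Base64 text back to the original ASCII string.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String encoded = encode("AB\nC");
        System.out.println(encoded);                         // prints QUIKQw==
        System.out.println(decode(encoded).equals("AB\nC")); // prints true
    }
}
```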
The final result matches our previous deduction.
3.2 Purpose of Base64 Encoding
After discussing the principle, let's continue to explore why file uploads go through Base64 instead of a direct conversion to a string. A file's bytes are arbitrary: many byte values are not printable characters, and some byte sequences are not valid in the chosen text encoding at all, so decoding them as text replaces or drops them. Some systems also restrict special characters or give them special meanings. Directly converting to a plain string may therefore corrupt the data. For this reason, files are converted to Base64 text before uploading; a binary stream cannot safely be converted directly to a string.
In addition, compared to multipart/form-data, Base64 file uploads are more flexible. Because the file ultimately becomes a string, it is just another parameter field in the request and can be carried by any content type. In daily development, we often use the application/json format and place the file in a field of the request body, which is more convenient to work with than a separate multipart request.
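The corruption is easy to reproduce. The sketch below, with a hypothetical class name LossyStringDemo, sends the 8-byte PNG signature through a plain UTF-8 string and through Base64, then compares what comes back. It assumes UTF-8 as the text encoding; other encodings fail in different ways.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class LossyStringDemo {
    // Decodes the bytes as UTF-8 text and encodes the text back to bytes.
    static byte[] viaPlainString(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes the bytes as Base64 text and decodes the text back to bytes.
    static byte[] viaBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // The PNG file signature; 0x89 on its own is not valid UTF-8.
        byte[] pngSignature = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};

        // The invalid byte is replaced during decoding, so the round trip changes the data.
        System.out.println(Arrays.equals(pngSignature, viaPlainString(pngSignature))); // prints false
        // Base64 text contains only ASCII characters, so the round trip is exact.
        System.out.println(Arrays.equals(pngSignature, viaBase64(pngSignature)));      // prints true
    }
}
```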
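As an illustration of a file travelling as an ordinary JSON field, here is a minimal sketch. The class name JsonUploadSketch and the field names fileName, remark, and content are assumptions, and the JSON is concatenated by hand only to keep the example dependency-free; a real project would use a JSON library such as Jackson.

```java
import java.util.Base64;

final class JsonUploadSketch {
    // Builds a request body of the assumed shape
    // {"fileName": "...", "remark": "...", "content": "<Base64 text>"}.
    static String buildPayload(String fileName, String remark, byte[] fileBytes) {
        String content = Base64.getEncoder().encodeToString(fileBytes);
        // Hand-built JSON for illustration: fileName and remark are assumed to
        // contain no characters that need escaping. Base64 text never does.
        return "{\"fileName\":\"" + fileName + "\","
                + "\"remark\":\"" + remark + "\","
                + "\"content\":\"" + content + "\"}";
    }

    // What a receiver does after its JSON parser has extracted the "content" field.
    static byte[] decodeContent(String base64Content) {
        return Base64.getDecoder().decode(base64Content);
    }

    public static void main(String[] args) {
        byte[] fakeImage = {(byte) 0x89, 'P', 'N', 'G'}; // placeholder bytes, not a real image
        System.out.println(buildPayload("demo.png", "test", fakeImage));
        // prints {"fileName":"demo.png","remark":"test","content":"iVBORw=="}
    }
}
```

On the server side, the file then arrives as a plain string field that is decoded back to bytes, with no multipart parsing involved.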
4 Conclusion
Finally, let's summarize and compare the advantages and disadvantages of these two file upload methods.
(1) multipart/form-data transmits the raw binary stream, which is more efficient. Base64 requires encoding and decoding, which costs some CPU time, and the encoded payload is about one third larger than the original file.
(2) Base64 is not tied to a content type, which offers high flexibility, while multipart/form-data requires its own dedicated request format, which is less flexible.
As machine performance improves, the perceived difference in delay between transmitting a small file as a binary stream and as a string is not significant. Therefore, in most cases, we give more weight to flexibility, which leads to more frequent use of Base64 encoding.
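To put a number on the size cost, padded Base64 produces 4 characters for every started block of 3 input bytes. The sketch below, with a hypothetical class name Base64Overhead, compares that formula with what the JDK encoder actually produces for a few arbitrary sizes.

```java
import java.util.Base64;

final class Base64Overhead {
    // Padded Base64 emits 4 characters for every started block of 3 input bytes.
    static long encodedLength(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        int[] sizes = {4, 1_000, 1_000_000}; // sample sizes chosen for illustration
        for (int size : sizes) {
            int actual = Base64.getEncoder().encodeToString(new byte[size]).length();
            System.out.printf("%d raw bytes -> %d Base64 characters (formula: %d)%n",
                    size, actual, encodedLength(size));
        }
    }
}
```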