protozero is a lightweight, high-performance C++ library specifically designed for encoding and decoding Protocol Buffers (protobuf) data. Unlike the official Google Protobuf library, it does not rely on .proto file code generation, but instead achieves zero-copy parsing by directly manipulating the protobuf encoding format, making it suitable for scenarios with strict performance and memory requirements (such as embedded systems or high-concurrency services).
1. Introduction to Protozero
- Features:
  - High Performance: Reduces memory allocation and optimizes encoding/decoding speed through zero-copy techniques.
  - Lightweight: Only depends on the C++11 standard library, without needing to link against the Google Protobuf runtime.
  - Flexibility: Requires manual handling of field types and numbers, suitable for projects with stable protocols.
- Application Scenarios: Map data processing (Mapbox GL Native), real-time stream processing, embedded systems, etc.
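Because field numbers and wire types are handled by hand, it can help to see what they become on the wire. In the protobuf encoding, every field starts with a key computed as (field_number << 3) | wire_type. The sketch below uses only standard C++ and is not part of protozero; the names WireType, PersonField, and make_key are made up for illustration, and the field numbers are those of the Person message used in the examples further down.

```cpp
#include <cstdint>

// Wire types defined by the protobuf encoding (only the common ones).
enum class WireType : std::uint32_t {
    varint = 0,            // int32, int64, uint32, bool, enum, ...
    fixed64 = 1,           // fixed64, sfixed64, double
    length_delimited = 2,  // string, bytes, nested messages
    fixed32 = 5            // fixed32, sfixed32, float
};

// Hypothetical named field numbers for the Person message.
enum class PersonField : std::uint32_t { name = 1, id = 2, emails = 3 };

// Key that precedes every field on the wire.
constexpr std::uint32_t make_key(PersonField field, WireType type) {
    return (static_cast<std::uint32_t>(field) << 3) |
           static_cast<std::uint32_t>(type);
}

// Checked at compile time: the keys for name (string) and id (int32).
static_assert(make_key(PersonField::name, WireType::length_delimited) == 0x0A, "name key");
static_assert(make_key(PersonField::id, WireType::varint) == 0x10, "id key");
```

Naming the field numbers once, as in PersonField, is one way to keep the encoder and decoder from drifting apart.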
2. Installation and Configuration
Installation on Debian/Ubuntu:
# Install development package
sudo apt-get install libprotozero-dev
Source Compilation:
Download the source, either the Debian source package or the upstream mapbox/protozero repository on GitHub, then build and install it with CMake:
cmake -B build -DCMAKE_INSTALL_PREFIX=/usr/local
cmake --build build
sudo cmake --install build
3. Basic Usage Examples
Example 1: Encoding a Message
Assuming the protobuf message structure to be encoded is:
message Person {
string name = 1;
int32 id = 2;
repeated string emails = 3;
}
Encoding using protozero:
#include <protozero/pbf_writer.hpp>
#include <string>
int main() {
    // pbf_writer appends the encoded fields to a std::string
    std::string buffer;
    protozero::pbf_writer writer{buffer};
    // Add field: name (string, number=1)
    writer.add_string(1, "Alice");
    // Add field: id (int32, number=2)
    writer.add_int32(2, 12345);
    // Add array: emails (repeated string, number=3), one call per element
    writer.add_string(3, "alice@example.com");
    writer.add_string(3, "alice@example.org");
    // At this point, buffer contains the serialized protobuf data
    return 0;
}
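To make the output of the writer calls above less opaque, the following sketch hand-encodes the name and id fields according to the protobuf encoding rules: a key, then either a varint value or a length followed by the raw bytes. It is an illustration in plain standard C++, not protozero's implementation, and the function names are invented for this example.

```cpp
#include <cstdint>
#include <string>

// Append an unsigned integer as a varint (7 bits per byte, low bits first;
// the high bit of each byte says whether another byte follows).
void append_varint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Append a length-delimited field (wire type 2), e.g. a string.
void append_string_field(std::string& out, std::uint32_t field, const std::string& text) {
    append_varint(out, (field << 3) | 2u);
    append_varint(out, text.size());
    out.append(text);
}

// Append a varint field (wire type 0), e.g. a non-negative int32.
void append_varint_field(std::string& out, std::uint32_t field, std::uint64_t value) {
    append_varint(out, (field << 3) | 0u);
    append_varint(out, value);
}

// Hand-encode name = "Alice" (field 1) and id = 12345 (field 2).
std::string encode_person_by_hand() {
    std::string out;
    append_string_field(out, 1, "Alice");
    append_varint_field(out, 2, 12345);
    return out;
}
```

The result is ten bytes: 0A 05 followed by the five characters of "Alice", then 10 B9 60 for the id. A repeated field such as emails simply repeats the same key once per element.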
Example 2: Decoding a Message
#include <protozero/pbf_reader.hpp>
#include <iostream>
#include <string>
void decode_person(const std::string& data) {
    protozero::pbf_reader reader{data};
    while (reader.next()) {
        switch (reader.tag()) {
            case 1: // name
                std::cout << "Name: " << reader.get_string() << std::endl;
                break;
            case 2: // id
                std::cout << "ID: " << reader.get_int32() << std::endl;
                break;
            case 3: // emails (called once per element)
                std::cout << "Email: " << reader.get_string() << std::endl;
                break;
            default:
                reader.skip(); // every field must be either read or skipped
        }
    }
}
// Hypothetical helper, assumed to return the buffer built in Example 1
std::string get_encoded_data();
// Example call
int main() {
    std::string encoded_data = get_encoded_data();
    decode_person(encoded_data);
    return 0;
}
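The reader.skip() call in the default branch matters because the reader only knows where the next field begins once the current one has been consumed. The sketch below mimics that loop with plain standard C++ to show the mechanics: read a key, split it into field number and wire type, then either read the value or step over it. It is a simplified illustration under that assumption, not protozero's code, and all function names are made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Read one varint starting at pos and advance pos past it.
std::uint64_t read_varint(const std::string& data, std::size_t& pos) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        if (pos >= data.size()) {
            throw std::runtime_error("truncated varint");
        }
        const auto byte = static_cast<unsigned char>(data[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return value;
        }
    }
    throw std::runtime_error("varint too long");
}

// Advance pos past the payload of a field the caller does not want.
void skip_field(const std::string& data, std::size_t& pos, std::uint32_t wire_type) {
    std::uint64_t length = 0;
    switch (wire_type) {
        case 0: read_varint(data, pos); return;           // varint
        case 1: length = 8; break;                        // 64-bit
        case 2: length = read_varint(data, pos); break;   // length-delimited
        case 5: length = 4; break;                        // 32-bit
        default: throw std::runtime_error("unknown wire type");
    }
    if (length > data.size() - pos) {
        throw std::runtime_error("field runs past end of buffer");
    }
    pos += static_cast<std::size_t>(length);
}

// Return the first varint field numbered `wanted`, skipping everything else.
std::uint64_t find_varint_field(const std::string& data, std::uint32_t wanted) {
    std::size_t pos = 0;
    while (pos < data.size()) {
        const std::uint64_t key = read_varint(data, pos);
        const auto field = static_cast<std::uint32_t>(key >> 3);
        const auto wire_type = static_cast<std::uint32_t>(key & 0x7);
        if (field == wanted && wire_type == 0) {
            return read_varint(data, pos);
        }
        skip_field(data, pos, wire_type);
    }
    throw std::runtime_error("field not found");
}
```

Calling find_varint_field on the bytes for name = "Alice" and id = 12345 with wanted set to 2 steps over the name field and returns 12345. Without the skip step, the bytes of "Alice" would be misread as field keys.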
4. Advanced Features
Handling Nested Messages
If a message contains nested types (e.g., Person contains Address):
// Encoding nested message
std::string encode_address() {
    std::string addr_buffer;
    protozero::pbf_writer addr_writer{addr_buffer};
    addr_writer.add_string(1, "123 Main St"); // street
    addr_writer.add_string(2, "SF"); // city
    return addr_buffer;
}
// Add Address as a nested field of Person (number=4)
writer.add_message(4, encode_address());
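During decoding, use get_message() to obtain a reader for the nested message without copying its bytes (get_bytes() also works, but copies the payload into a std::string):
case 4: {
    protozero::pbf_reader addr_reader = reader.get_message();
    while (addr_reader.next()) {
        if (addr_reader.tag() == 1) {
            std::cout << "Street: " << addr_reader.get_string() << std::endl;
        } else {
            addr_reader.skip(); // fields that are not read must be skipped
        }
    }
    break;
}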
Error Handling
try {
    protozero::pbf_reader reader{data};
    while (reader.next()) {
        // Process fields; anything that is not read must be skipped
        reader.skip();
    }
} catch (const protozero::exception& e) {
    std::cerr << "Decode error: " << e.what() << std::endl;
}
5. Performance Optimization Suggestions
- Zero-Copy Design: Read length-delimited fields with get_view(), which returns a view into the input buffer, instead of get_string() or get_bytes(), which copy the payload into a new std::string.
- Field Preallocation: Preallocate memory for repeated fields (e.g., std::vector::reserve()).
- Avoid Type Confusion: Strictly match field types (e.g., get_int32() should only be used for int32).
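The type-matching advice is easiest to see with int32 and sint32, which share the varint wire type but map values differently: sint32 applies ZigZag encoding so that small negative numbers stay short. The sketch below implements that mapping in standard C++ and shows what a reader gets if it decodes a sint32 payload as int32. The function names are invented for illustration; in protozero the matching accessors would be get_sint32() and get_int32().

```cpp
#include <cstdint>

// ZigZag mapping used by sint32: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
constexpr std::uint32_t zigzag_encode32(std::int32_t value) {
    return (static_cast<std::uint32_t>(value) << 1) ^
           (value < 0 ? 0xFFFFFFFFu : 0u);
}

// Inverse mapping: the lowest bit selects the sign.
constexpr std::int32_t zigzag_decode32(std::uint32_t value) {
    return static_cast<std::int32_t>((value >> 1) ^ (~(value & 1u) + 1u));
}

// What a reader sees if a sint32 payload is decoded as if it were int32:
// the ZigZag step is never undone.
constexpr std::int32_t misread_sint32_as_int32(std::int32_t written) {
    return static_cast<std::int32_t>(zigzag_encode32(written));
}
```

Writing -1 as sint32 stores the varint 1, so a reader that treats the field as int32 reports 1 instead of -1, with no error raised.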
6. Frequently Asked Questions
- Field Number Errors: If the numbers do not match during encoding/decoding, it can lead to data corruption. It is recommended to define numbers using constants.
- Differences from Official Protobuf: protozero does not provide default values or enum names, which must be handled manually.
- Streaming Processing: Suitable for chunked decoding of large data (e.g., network transmission), provided each chunk holds one complete message, since a pbf_reader cannot resume in the middle of a field:
// Chunked reading of data; get_next_chunk() is a placeholder that fills
// chunk with one complete message and returns false at end of input
std::string chunk;
while (get_next_chunk(chunk)) {
    protozero::pbf_reader reader{chunk};
    // Decode this message
}
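For the chunked case, something has to mark where one message ends and the next begins, because the protobuf encoding itself carries no message boundary. One common convention, assumed here and not provided by protozero, is to prefix each message with its length as a varint. The sketch below pulls complete messages out of a growing receive buffer and leaves any partial message for the next call; the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Try to read a varint length prefix at `pos`. Returns false if the buffer
// ends before the varint is complete; `pos` is advanced only on success.
bool try_read_length(const std::string& buf, std::size_t& pos, std::uint64_t& length) {
    std::uint64_t value = 0;
    std::size_t p = pos;
    for (int shift = 0; shift < 64 && p < buf.size(); shift += 7) {
        const auto byte = static_cast<unsigned char>(buf[p++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            pos = p;
            length = value;
            return true;
        }
    }
    return false;
}

// Remove every complete frame from the front of `pending` and return the
// message bodies. Bytes of a trailing, incomplete frame stay in `pending`.
std::vector<std::string> take_complete_frames(std::string& pending) {
    std::vector<std::string> frames;
    std::size_t pos = 0;
    for (;;) {
        std::size_t body = pos;
        std::uint64_t length = 0;
        if (!try_read_length(pending, body, length) ||
            length > pending.size() - body) {
            break;  // need more data
        }
        frames.push_back(pending.substr(body, static_cast<std::size_t>(length)));
        pos = body + static_cast<std::size_t>(length);
    }
    pending.erase(0, pos);
    return frames;
}
```

Each returned string is a complete message and can be handed to a pbf_reader. A production version would also cap the accepted length so that a corrupt prefix cannot make the buffer grow without bound.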
Conclusion
protozero sacrifices convenience for extreme performance, making it suitable for replacing the official Protobuf library in high-load scenarios. When using it, ensure that the protocol is stable and strictly validate data types. For more details, refer to the official documentation.