FPZIP is a C/C++ library developed by Lawrence Livermore National Laboratory for the compression of multi-dimensional floating point arrays. It supports lossless compression of 1D, 2D, and 3D single precision (float) and double precision (double) arrays, and also allows lossy compression by specifying the number of precision bits to retain. The table below provides a quick overview of its core features:
| Feature | Supported Options |
|---|---|
| Compression Type | Lossless Compression / Lossy Compression |
| Data Dimension | 1D, 2D, 3D Arrays |
| Data Type | Single Precision (float), Double Precision (double) |
| Precision Control | Single Precision (float): 8, 16, 24, 32 bits; Double Precision (double): 16, 32, 48, 64 bits |
| Input/Output | Memory Buffers, Files |
🔧 Installing the FPZIP Library
On Debian/Ubuntu systems, you can install the FPZIP development library directly via the package manager:
```shell
sudo apt-get install libfpzip-dev
```
This command installs the libfpzip-dev package, which provides the header files and libraries needed to compile and link against FPZIP. When building your own program, link against the library, typically with -lfpzip.
💻 Using FPZIP for Compression and Decompression
The FPZIP library provides a simple C interface. Let’s look at a complete code example to see how to use it to compress and decompress a one-dimensional float array.
```c
#include <fpzip.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // Initialize example data: an array containing 1000 floats
    const int num_floats = 1000;
    const size_t raw_size = num_floats * sizeof(float);
    float* original_data = (float*)malloc(raw_size);

    // Estimate the maximum buffer size needed for compression and allocate memory
    size_t buffer_size = 1024 + raw_size;
    unsigned char* compressed_buffer = (unsigned char*)malloc(buffer_size);

    // Buffer that will receive the decompressed values
    float* decompressed_data = (float*)malloc(raw_size);

    if (!original_data || !compressed_buffer || !decompressed_data) {
        fprintf(stderr, "Out of memory\n");
        return 1;
    }
    for (int i = 0; i < num_floats; i++) {
        original_data[i] = (float)i / 10.0f; // Generate some example data
    }

    // Initialize compression structure
    FPZ* fpz = fpzip_write_to_buffer(compressed_buffer, buffer_size);
    fpz->type = FPZIP_TYPE_FLOAT; // Set data type to float
    fpz->prec = 0;                // 0 indicates lossless compression
    fpz->nx = num_floats;         // Set first dimension size
    fpz->ny = 1;                  // Set second dimension size
    fpz->nz = 1;                  // Set third dimension size
    fpz->nf = 1;                  // Set number of scalar fields

    // Write compression header
    if (!fpzip_write_header(fpz)) {
        fprintf(stderr, "Failed to write compression header\n");
        return 1;
    }

    // Compress data
    size_t compressed_size = fpzip_write(fpz, original_data);
    if (compressed_size == 0) {
        fprintf(stderr, "Compression failed\n");
        return 1;
    }
    fpzip_write_close(fpz); // End compression

    printf("Original size: %zu bytes, Compressed size: %zu bytes, Ratio: %.2f%%\n",
           raw_size, compressed_size,
           (double)compressed_size / (double)raw_size * 100.0);

    // Prepare for decompression
    FPZ* fpz_read = fpzip_read_from_buffer(compressed_buffer);

    // Read decompression header
    if (!fpzip_read_header(fpz_read)) {
        fprintf(stderr, "Failed to read decompression header\n");
        return 1;
    }

    // Decompress data
    size_t decompressed_count = fpzip_read(fpz_read, decompressed_data);
    if (decompressed_count == 0) {
        fprintf(stderr, "Decompression failed\n");
        return 1;
    }
    fpzip_read_close(fpz_read); // End decompression

    // Verify that the decompressed data is bit-for-bit identical (lossless mode)
    int success = 1;
    for (int i = 0; i < num_floats; i++) {
        if (original_data[i] != decompressed_data[i]) {
            success = 0;
            break;
        }
    }
    printf("Decompression data verification: %s\n", success ? "PASS" : "FAIL");

    // Free memory
    free(original_data);
    free(compressed_buffer);
    free(decompressed_data);
    return 0;
}
```
Core Logic of the Code Explained:
- Data Initialization: Create and initialize a one-dimensional floating point array `original_data`.
- Configure Compression Parameters: Set key compression parameters using the `FPZ` structure:
  - `type`: Specify the data type (`FPZIP_TYPE_FLOAT` or `FPZIP_TYPE_DOUBLE`).
  - `prec`: Control precision. `0` indicates lossless compression; you can also set it to 24 for lossy compression of floats.
  - `nx`, `ny`, `nz`: Define the size of each dimension of the array.
  - `nf`: The number of scalar fields; typically set to 1.
- Execute Compression: Call `fpzip_write_header` and `fpzip_write` to complete the compression.
- Execute Decompression: Use `fpzip_read_from_buffer`, `fpzip_read_header`, and `fpzip_read` to perform decompression.
- Resource Cleanup: After compression and decompression are complete, remember to call `fpzip_write_close` and `fpzip_read_close` for cleanup and to free dynamically allocated memory.
🛠️ Advanced Applications and Scenarios of FPZIP
After mastering the basic usage, let’s explore some advanced features and application scenarios of FPZIP.
- Controlling Compression Precision: FPZIP lets you trade precision for compression ratio. The `fpz->prec` field specifies the number of bits to retain; for single precision floating point numbers, for example, you can retain 24 bits for lossy compression. Discarding low-order bits generally improves the compression ratio, which is useful when the data tolerates small errors.
- Multi-Dimensional Array Compression: The true advantage of FPZIP lies in handling multi-dimensional arrays. Suppose you have a 100x100x100 3D single precision floating point array; you can set the parameters as follows:

  ```c
  fpz->type = FPZIP_TYPE_FLOAT;
  fpz->nx = 100;
  fpz->ny = 100;
  fpz->nz = 100;
  fpz->nf = 1;
  ```

  By correctly setting the dimensions, FPZIP can more effectively utilize the spatial correlation of the data, achieving better results than simple one-dimensional stream compression.
- HDF5 Filter Plugin: FPZIP can also be used as a compression filter for the HDF5 format. This is very convenient for handling the large-scale HDF5 data files commonly found in scientific computing. After installing and configuring the `fpzip_plugin`, you can directly enable FPZIP compression when creating HDF5 datasets, saving significant space at the storage and I/O levels.
💎 Summary
The FPZIP library provides an efficient, flexible, and reliable solution for compressing floating point array data. It is particularly suitable for handling multi-dimensional floating point data with spatial correlation, such as data from scientific computing, numerical simulations, or large numerical datasets. With its simple C interface, you can easily integrate compression functionality into your C or C++ programs.