Unlocking the Power of Data Compression in Rust

What Do Rust Compression Libraries Do?

Rust compression libraries fall into two broad categories: stream compressors and archivers. Stream compressors take a stream of bytes and emit a shorter stream of compressed bytes, while archivers serialize multiple files and directories into a single file. Some formats, like .zip, both collect and compress files.

The Best Compression Libraries for Rust

When it comes to choosing a compression library, there are many solutions with different trade-offs in terms of runtime, CPU and memory usage, compression ratio, and safety features like checksums. We’ll focus on lossless compression formats, excluding image, audio, and video-related lossy compression formats.
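
To make the "safety features like checksums" point concrete, here is a minimal sketch of Adler-32, the checksum the zlib container appends to its compressed data (RFC 1950). The function name `adler32` is chosen for illustration only; real libraries ship their own, faster implementations.

```rust
/// Adler-32 as defined for the zlib container (RFC 1950).
/// Illustrative helper, not taken from any particular crate.
fn adler32(data: &[u8]) -> u32 {
    const MOD_ADLER: u32 = 65_521; // largest prime below 2^16
    let mut a: u32 = 1;
    let mut b: u32 = 0;
    for &byte in data {
        a = (a + u32::from(byte)) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    (b << 16) | a
}

fn main() {
    let payload = b"Wikipedia";
    let checksum = adler32(payload);
    println!("adler32 = {checksum:#010x}");

    // A single flipped bit changes the checksum, which is how a
    // decoder can notice that the data was corrupted.
    let mut corrupted = payload.to_vec();
    corrupted[0] ^= 0x01;
    assert_ne!(adler32(&corrupted), checksum);
}
```

A checksum like this costs a little extra CPU time and a few bytes of output, which is the trade-off mentioned above: formats without one are slightly faster but cannot detect corruption on their own.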

Stream Compression Libraries for Rust

Our benchmark covers the following stream compression libraries:

  • DEFLATE/zlib: An older algorithm combining LZ77-style dictionary matching with Huffman coding, available in three framings: plain (raw), gzip, and zlib.
  • Snappy: Google’s LZ77-family compressor, open-sourced in 2011, offering fast runtime with a fair compression ratio.
  • LZ4: Another speed-focused algorithm in the LZ77 family, relying solely on dictionary matching.
  • Zstandard: Facebook’s 2016 algorithm for real-time applications, offering fast compression and decompression with zlib-level or better compression ratios.
  • LZMA: An asymmetric algorithm trading runtime for higher compression, often used for Linux distribution package formats.
  • Zopfli: A zlib-compatible compression algorithm trading a long runtime for a superior compression ratio, useful for reducing network traffic when data is compressed once and served many times.
  • Brotli: Google’s extension of the LZ77-based compression algorithm with second-order context modeling, giving it an edge in compressing hard-to-compress data streams.
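
Several of the algorithms above build on LZ77-style dictionary matching: repeated byte sequences are replaced by back-references into data already seen. The following is a toy sketch of that idea only. The `Token` type, the window size, and the brute-force search are assumptions made for illustration and do not reflect how any of the listed libraries encode their output.

```rust
/// One token of a toy LZ77 stream: a raw byte, or a back-reference
/// into the output produced so far.
#[derive(Debug, PartialEq)]
enum Token {
    Literal(u8),
    Match { offset: usize, len: usize },
}

const WINDOW: usize = 4096; // how far back the encoder looks (illustrative)
const MIN_MATCH: usize = 3; // shorter matches are emitted as literals

fn lz77_encode(input: &[u8]) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let start = pos.saturating_sub(WINDOW);
        let (mut best_len, mut best_offset) = (0, 0);
        // Brute-force search for the longest earlier match.
        for cand in start..pos {
            let mut len = 0;
            while pos + len < input.len() && input[cand + len] == input[pos + len] {
                len += 1;
            }
            if len > best_len {
                best_len = len;
                best_offset = pos - cand;
            }
        }
        if best_len >= MIN_MATCH {
            tokens.push(Token::Match { offset: best_offset, len: best_len });
            pos += best_len;
        } else {
            tokens.push(Token::Literal(input[pos]));
            pos += 1;
        }
    }
    tokens
}

fn lz77_decode(tokens: &[Token]) -> Vec<u8> {
    let mut out = Vec::new();
    for token in tokens {
        match *token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { offset, len } => {
                // Copy byte by byte so a match may overlap its own output.
                let from = out.len() - offset;
                for i in 0..len {
                    let byte = out[from + i];
                    out.push(byte);
                }
            }
        }
    }
    out
}

fn main() {
    let input = b"abcabcabcabc";
    let tokens = lz77_encode(input);
    println!("{} bytes -> {} tokens: {:?}", input.len(), tokens.len(), tokens);
    assert_eq!(lz77_decode(&tokens), input);
}
```

Real implementations replace the brute-force search with hash tables or chains and then pack the tokens into bits. LZ4 and Snappy stop roughly there, while DEFLATE, Zstandard, and Brotli add an entropy-coding stage on top.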

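Here’s an example of zlib-framed DEFLATE compression with the flate2 crate:

use std::io::Write; // brings `write_all` into scope
use flate2::write::ZlibEncoder;
use flate2::Compression;

// Compress into an in-memory Vec<u8> at the default level.
let mut encoder = ZlibEncoder::new(Vec::new(), Compression::default());
encoder.write_all(b"Hello, world!").unwrap();
// `finish` completes the stream and returns the buffer.
let compressed_data = encoder.finish().unwrap();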
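
The three DEFLATE variants differ only in the framing around the compressed data. As a rough sketch, the gzip and zlib containers can be told apart by their first bytes (RFC 1952 and RFC 1950). The `Container` enum and `sniff_container` function are hypothetical names, and the check is a heuristic: a raw DEFLATE stream has no header, so it can only be reported as "raw or unknown", and in rare cases its first bytes could pass the zlib test by coincidence.

```rust
#[derive(Debug, PartialEq)]
enum Container {
    Gzip,
    Zlib,
    RawOrUnknown,
}

/// Guess the framing of a DEFLATE-based stream from its first bytes.
fn sniff_container(data: &[u8]) -> Container {
    match data {
        // gzip: magic bytes 0x1f 0x8b, then compression method 8 (DEFLATE).
        [0x1f, 0x8b, 0x08, ..] => Container::Gzip,
        // zlib: the low nibble of the first byte is 8 (DEFLATE), and the
        // first two bytes, read as a big-endian u16, are a multiple of 31.
        [cmf, flg, ..]
            if (*cmf & 0x0f) == 8
                && ((u16::from(*cmf) << 8) | u16::from(*flg)) % 31 == 0 =>
        {
            Container::Zlib
        }
        _ => Container::RawOrUnknown,
    }
}

fn main() {
    // 0x78 0x9c is a commonly seen zlib header.
    println!("{:?}", sniff_container(&[0x78, 0x9c, 0x00]));
    println!("{:?}", sniff_container(&[0x1f, 0x8b, 0x08, 0x00]));
}
```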

Archiving Libraries for Rust

Our benchmark also covers the following archiving libraries:

  • tar: The venerable Tape ARchiver, delegating compression to stream compressors like gzip (DEFLATE), bzip2 (Burrows-Wheeler transform), and xz (LZMA).
  • zip: A widely supported format with an initial release in 1989.
  • rar: A younger format with less file overhead and slightly better compression, popular on file sharing services.

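Here’s an example of creating an in-memory tar archive with the tar crate:

use tar::{Builder, Header};

// In-memory data needs a header that carries at least its size.
let data = b"Hello, world!";
let mut header = Header::new_gnu();
header.set_size(data.len() as u64);
header.set_mode(0o644);
let mut archive = Builder::new(Vec::new());
archive.append_data(&mut header, "hello.txt", &data[..]).unwrap();
let tar_data = archive.into_inner().unwrap();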
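
To show what an archiver actually serializes, here is a standard-library-only sketch of the classic ustar layout: a 512-byte header per file, the file contents padded to a multiple of 512 bytes, and two zero blocks at the end. The helper names `ustar_header` and `tar_single_file` are made up for this example. It handles only short names and regular files, and its output is not claimed to match the tar crate byte for byte.

```rust
const BLOCK: usize = 512;

/// Build one ustar header block for a regular file (illustrative subset).
fn ustar_header(name: &str, size: u64) -> [u8; BLOCK] {
    assert!(name.len() < 100, "longer names need the prefix field, omitted here");
    assert!(size < (1u64 << 33), "the 11-digit octal size field holds less than 8 GiB");
    let mut h = [0u8; BLOCK];
    h[..name.len()].copy_from_slice(name.as_bytes()); // name: bytes 0..100
    h[100..108].copy_from_slice(b"0000644\0"); // mode
    h[108..116].copy_from_slice(b"0000000\0"); // uid
    h[116..124].copy_from_slice(b"0000000\0"); // gid
    h[124..136].copy_from_slice(format!("{:011o}\0", size).as_bytes()); // size, octal
    h[136..148].copy_from_slice(b"00000000000\0"); // mtime
    h[148..156].copy_from_slice(b"        "); // checksum field counts as 8 spaces
    h[156] = b'0'; // typeflag: regular file
    h[257..263].copy_from_slice(b"ustar\0"); // magic
    h[263..265].copy_from_slice(b"00"); // version
    // The checksum is the plain sum of all header bytes.
    let sum: u32 = h.iter().map(|&b| u32::from(b)).sum();
    h[148..156].copy_from_slice(format!("{:06o}\0 ", sum).as_bytes());
    h
}

/// Serialize a single file into a complete tar archive.
fn tar_single_file(name: &str, data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&ustar_header(name, data.len() as u64));
    out.extend_from_slice(data);
    // Pad the contents to a whole number of 512-byte blocks.
    let padding = (BLOCK - data.len() % BLOCK) % BLOCK;
    out.resize(out.len() + padding, 0);
    // End-of-archive marker: two all-zero blocks.
    out.resize(out.len() + 2 * BLOCK, 0);
    out
}

fn main() {
    let archive = tar_single_file("hello.txt", b"Hello, world!");
    println!("archive is {} bytes", archive.len());
}
```

The fixed per-file overhead visible here is why tar archives of many tiny files are much larger than their contents until a stream compressor is applied on top.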

Benchmarking Rust Compression Libraries

To benchmark these libraries, we used a variety of files, ranging from highly compressible to very hard to compress. The results are presented in six tables, showcasing the performance of each library.

Note: The benchmark results are omitted for brevity.
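
For readers who want to run a comparison of their own, a measurement loop could look roughly like the sketch below. It is not the harness behind the omitted tables. The `measure` helper, the `noise` generator, and the run-length encoder `rle` used as a stand-in compressor are all assumptions for illustration; in practice you would pass a closure wrapping a real library instead.

```rust
use std::time::{Duration, Instant};

/// Outcome of compressing one input once.
struct Measurement {
    input_len: usize,
    output_len: usize,
    elapsed: Duration,
}

impl Measurement {
    /// Compression ratio: values above 1.0 mean the output got smaller.
    fn ratio(&self) -> f64 {
        self.input_len as f64 / self.output_len as f64
    }
}

fn measure<F: Fn(&[u8]) -> Vec<u8>>(compress: F, input: &[u8]) -> Measurement {
    let start = Instant::now();
    let output = compress(input);
    let elapsed = start.elapsed();
    Measurement { input_len: input.len(), output_len: output.len(), elapsed }
}

/// Stand-in compressor: run-length encoding as (count, byte) pairs.
fn rle(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Hard-to-compress input from a small xorshift32 generator.
fn noise(len: usize) -> Vec<u8> {
    let mut state: u32 = 0x9E37_79B9;
    (0..len)
        .map(|_| {
            state ^= state << 13;
            state ^= state >> 17;
            state ^= state << 5;
            state as u8
        })
        .collect()
}

fn main() {
    let inputs = [("zeros", vec![0u8; 10_000]), ("noise", noise(10_000))];
    for (name, data) in &inputs {
        let m = measure(rle, data);
        println!("{name}: ratio {:.2}, {:?}", m.ratio(), m.elapsed);
    }
}
```

A single timed run is noisy, so a serious benchmark would repeat each measurement and also time decompression, but the two inputs already show the spread between highly compressible and hard-to-compress data.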
