bzip3 - Man Page

an efficient statistical file compressor and spiritual successor to bzip2

Synopsis

bzip3 [ -BbcdefhjkVvtz ] [ filenames ... ]

bz3cat is equivalent to bzip3 -dc

bunzip3 is equivalent to bzip3 -d

Description

Compress or decompress a file using run-length encoding and Lempel-Ziv prediction, followed by the Burrows-Wheeler transform and arithmetic coding. bzip3, like its ancestor bzip2, excels at compressing text or source code.

The command-line options are deliberately very similar to those of bzip2, but they are not identical.

Outside batch mode, bzip3 accepts at most two file names, which may be interspersed with flags. By default, bzip3 will not overwrite existing files; to allow overwriting, use the -f flag.

If no file names are specified, bzip3 will compress from standard input to standard output, refusing to output binary data to a terminal. The -e flag (encode) is implied.

With no file names, bunzip3 (or, equivalently, bzip3 -d) decompresses data from standard input to standard output, refusing to read from a terminal.

If two files are specified, the first one is used in place of standard input, and the second one is used in place of standard output.

If a single file is specified together with the -c flag, bzip3 will read from that file and write to standard output instead. Otherwise, when decoding, bzip3 derives the output file name by removing the .bz3 extension; if the input file name lacks that extension, an error is reported. When encoding, the output file name is formed by appending the .bz3 extension to the input file name.
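The file name rules above can be sketched in POSIX shell. The function bz3_output_name is a hypothetical helper written for this illustration, not something shipped with bzip3; it only mimics the naming rule as this page describes it. The second half runs the real tool in a scratch directory and is skipped when bzip3 is not installed.

```shell
# Hypothetical helper (not part of bzip3) that mirrors the naming rule above.
bz3_output_name() {
    mode=$1
    name=$2
    case $mode in
        encode)
            printf '%s\n' "$name.bz3" ;;
        decode)
            case $name in
                *.bz3) printf '%s\n' "${name%.bz3}" ;;
                *) echo "error: $name does not end in .bz3" >&2
                   return 1 ;;
            esac ;;
        *)
            echo "error: mode must be encode or decode" >&2
            return 2 ;;
    esac
}

bz3_output_name encode notes.txt        # prints notes.txt.bz3
bz3_output_name decode notes.txt.bz3    # prints notes.txt

# The same rules applied to the real tool, in a scratch directory.
if command -v bzip3 >/dev/null 2>&1; then
    tmp=$(mktemp -d)
    printf 'hello, bzip3\n' > "$tmp/notes.txt"
    bzip3 -e < "$tmp/notes.txt" > "$tmp/piped.bz3"          # no file names
    bzip3 -e "$tmp/notes.txt" "$tmp/explicit.bz3"           # input, then output
    bzip3 -e "$tmp/notes.txt"                               # writes notes.txt.bz3
    bzip3 -dc "$tmp/notes.txt.bz3" > "$tmp/roundtrip.txt"   # -c: to stdout
    cmp -s "$tmp/notes.txt" "$tmp/roundtrip.txt" && echo "round trip OK"
fi
```

Only flags documented on this page are used. The round-trip check relies on the input file being kept after compression, which is the default described under -k.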

Options

-B --batch

Enable batch mode. By default, bzip3 reports an error if more than two files are passed, and the two files specified are always treated as input and output. Batch mode makes bzip3 treat every file as input, so for example bzip3 -Bd *.bz3 will decompress all .bz3 files in the current directory.

-b --block N

Set the block size to N mebibytes. The minimum is 1 MiB, the maximum is 511 MiB.

-c --stdout

Force writing output data to the standard output if one file is specified.

-d --decode

Force decompression.

-e/-z --encode

Force compression (default behaviour).

-f --force

Overwrite existing files.

-h --help

Display a help message and exit.

-j --jobs N

Set the number of parallel worker threads, each of which processes one block.

-k --keep

Keep (don't delete) the input files. Set by default, provided only for compatibility with other compressors.

-v --verbose

Set verbose output mode to see compression statistics.

-V --version

Display version information and exit.

-t --test

Verify the validity of compressed blocks.

--

Treat all subsequent arguments as file names, even if they start with a dash. This is so you can handle files with names beginning with a dash.
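As a rough illustration of two of the options above, the following sketch compresses a file whose name begins with a dash by using --, then decompresses several archives at once with -B. All file names are made up for the example, and the block does nothing beyond creating a scratch directory when bzip3 is not installed.

```shell
tmp=$(mktemp -d)
printf 'alpha\n' > "$tmp/a.txt"
printf 'beta\n'  > "$tmp/b.txt"
printf 'dash\n'  > "$tmp/-dash.txt"

if command -v bzip3 >/dev/null 2>&1; then
    (
        cd "$tmp" || exit 1
        # One input each; the .bz3 name is derived from the input name.
        bzip3 -e a.txt
        bzip3 -e b.txt
        # Without "--", "-dash.txt" would be read as a cluster of flags.
        bzip3 -e -- -dash.txt
        # Remove the originals so that decompression has nothing to overwrite.
        rm -- a.txt b.txt -dash.txt
        # Batch mode: every argument is treated as an input file.
        # The "./" prefix keeps the expanded names from starting with a dash.
        bzip3 -Bd ./*.bz3
        ls
    )
else
    echo "bzip3 not found; skipping"
fi
```

Prefixing a name with ./ is a general shell idiom that sidesteps the problem -- solves; either approach should work here.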

File Format

Blocks longer than 64 bytes are compressed, and the compressed data is written to the file; shorter blocks are stored as literal blocks. The file format has a constant overhead of 9 bytes per file and from 9 to 17 bytes per block. Random data is coded so that expansion is generally under 0.8%.
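The overhead figures above lend themselves to a quick estimate. The sketch below uses a hypothetical helper, bz3_overhead_bounds, and assumes that the input is cut into blocks of exactly the chosen block size, with the last block possibly shorter; this page does not state how blocks are cut, so treat the block count as an approximation.

```shell
# Hypothetical helper: lower and upper bounds on format overhead, in bytes.
# Assumption: the input is split into blocks of exactly block_mib MiB each.
bz3_overhead_bounds() {
    size_bytes=$1
    block_mib=$2
    block_bytes=$(( block_mib * 1024 * 1024 ))
    # Ceiling division: number of blocks needed to cover the input.
    blocks=$(( (size_bytes + block_bytes - 1) / block_bytes ))
    # 9 bytes per file, plus 9 to 17 bytes per block.
    min=$(( 9 + 9 * blocks ))
    max=$(( 9 + 17 * blocks ))
    echo "$blocks blocks: $min to $max bytes of overhead"
}

# A 100 MiB input with the default 16 MiB block size.
bz3_overhead_bounds $(( 100 * 1024 * 1024 )) 16
# prints: 7 blocks: 72 to 128 bytes of overhead
```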

bzip3 uses 32-bit CRC to ensure that the decompressed version of a file is identical to the original. This guards against corruption of the compressed data.
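The check can also be requested explicitly with the -t flag described under Options. The sketch below assumes that bzip3 -t exits with a non-zero status when verification fails; this page does not document exit codes, so that part is an assumption. File names are illustrative.

```shell
tmp=$(mktemp -d)
printf 'integrity check\n' > "$tmp/data.txt"

if command -v bzip3 >/dev/null 2>&1; then
    # Two file names: input first, output second.
    bzip3 -e "$tmp/data.txt" "$tmp/data.bz3"
    # -t verifies the compressed blocks.
    # Assumption: a failed check is reported through the exit status.
    if bzip3 -t "$tmp/data.bz3"; then
        echo "archive passed verification"
    else
        echo "archive failed verification"
    fi
else
    echo "bzip3 not found; skipping"
fi
```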

Memory Management

The -b flag sets the block size in mebibytes (MiB). The default is 16 MiB. Compression and decompression memory usage can be estimated as:

      6 x block size

Larger block sizes usually give rapidly diminishing returns. It is also important to appreciate that the decompression memory requirement is set at compression time by the choice of block size. In general, try to use the largest block size that memory constraints allow, since that maximises the compression achieved. Compression and decompression speed are virtually unaffected by block size.
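The estimate above is easy to turn into numbers. The helper bz3_memory_mib below is hypothetical and simply applies the 6 x block size rule of thumb, together with the 1 to 511 MiB range given for the -b option; it says nothing about any additional memory bzip3 may use.

```shell
# Hypothetical helper: estimated memory use in MiB for a block size in MiB.
bz3_memory_mib() {
    block_mib=$1
    # Range taken from the description of -b: 1 to 511 MiB.
    if [ "$block_mib" -lt 1 ] || [ "$block_mib" -gt 511 ]; then
        echo "block size must be between 1 and 511 MiB" >&2
        return 1
    fi
    echo $(( 6 * block_mib ))
}

bz3_memory_mib 16     # default block size: prints 96
bz3_memory_mib 511    # largest block size: prints 3066
```

Because the decompressor needs the same amount, a file compressed with -b 511 calls for roughly 3 GiB on the receiving machine as well.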

Author

Kamila Szewczyk, kspalaiologos@gmail.com.

https://github.com/kspalaiologos/bzip3

Thanks to: Ilya Grebnov, Benjamin Strachan, Caleb Maclennan, Ilya Muravyov, package maintainers - Leah Neukirchen, Grigory Kirillov, Maciej Barc, Robert Schutz, Petr Pisar, and others. Also everyone who sent patches, helped with portability problems, encouraged me to work on bzip3 and lent me machines for performance tests.

See Also

bzip2(1), bz3less(1), bz3more(1), bz3grep(1), bunzip3(1)

Referenced By

bz3grep(1), bz3less(1), bz3more(1), bz3most(1).

The man page bz3cat(1) is an alias of bzip3(1).

17 July 2024 version v1.4.0