ddr_lzma - Man Page
Data de/compression plugin for dd_rescue
Synopsis
-L /path/to/libddr_lzma.so[=option[:option[:...]]]
-L lzma[=option[:option[:...]]]
Description
About
XZ is a compressed file format that uses the LZMA2 compression algorithm, a successor to LZMA (the Lempel-Ziv-Markov chain algorithm). LZMA2 fixes some quirks of the highly successful LZMA algorithm, which provides high compression ratios and is known for its superior decompression speeds. Compared to e.g. LZO, it is optimized for much better compression ratios at the cost of compression speed. Its lowest preset levels tend to beat the highest gzip and lowest bzip2 levels in both compression ratio and speed.
This plugin has been written for dd_rescue and uses the plugin interface from it. See the dd_rescue(1) man page for more information on dd_rescue.
Options
Options are passed using the dd_rescue option passing syntax: the name of the plugin (lzma) is optionally followed by an equal sign (=) and a list of options separated by colons (:). The lzma plugin accepts most of the common options of the xz utility plus some additional ones. See the Examples section below.
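The option syntax can be illustrated with a small Python sketch. The function parse_plugin_spec is a hypothetical helper written for this page; it is not dd_rescue's actual parser and only shows how a string such as lzma=z:preset=9 breaks down into a plugin name and its options.

```python
def parse_plugin_spec(spec):
    """Split 'name=opt:opt=value' into (name, {opt: value_or_None}).

    Hypothetical helper for illustration only; not dd_rescue code.
    """
    # Everything before the first '=' is the plugin name.
    name, _, rest = spec.partition("=")
    options = {}
    # Options are separated by ':'; empty items are skipped.
    for item in filter(None, rest.split(":")):
        key, sep, value = item.partition("=")
        # Flags such as 'z' or 'mt' carry no value.
        options[key] = value if sep else None
    return name, options


print(parse_plugin_spec("lzma=z:preset=9"))
print(parse_plugin_spec("lzma"))
```

Flags without a value (z, d, mt) map to None here, while options such as preset=9 keep their value as a string.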
Compression or decompression
The lzma dd_rescue plugin (subsequently referred to as just ddr_lzma, which reflects the variable part of the filename libddr_lzma.so) chooses compression or decompression mode automatically if one of the input/output files has an [lt]xz suffix; otherwise you can specify the z or d parameter on the command line to select compression or decompression, respectively.
The parameter mt tells ddr_lzma to do de/compression in multithreaded mode. This may speed up processing by using all cores of the CPU. If the number of cores cannot be detected, only one will be used. You can also pass mt=N to explicitly specify the number of threads to be used.
Note that liblzma prior to 5.2.0 does not support multithreaded compression, and liblzma prior to 5.4.0 does not support multithreaded decompression. With such versions, the parameter is ignored.
The plugin also supports the parameter bench[mark]; if it is specified, the plugin will output some information about CPU usage.
If you only want to check the integrity of an xz-compressed file, you can use test (or just t); if the data is corrupted, a message about it is printed to the console.
Pass check=XXX to select the integrity checksum algorithm, where XXX is one of CRC32, CRC64, SHA256 or NONE. If NONE is specified, no integrity checksum is calculated while compressing. By default, the plugin calculates CRC32.
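The four check types correspond to integrity checks defined by the xz format. The following sketch uses Python's standard lzma module (which is built on liblzma) rather than the plugin itself; the CHECKS table and the helper names are assumptions made for this illustration. Note that Python's own default check is not necessarily the same as the plugin's.

```python
import lzma

# Map the names used by the check= parameter to Python's liblzma constants.
CHECKS = {
    "CRC32": lzma.CHECK_CRC32,
    "CRC64": lzma.CHECK_CRC64,
    "SHA256": lzma.CHECK_SHA256,
    "NONE": lzma.CHECK_NONE,
}


def compress_with_check(data, name):
    """Compress data into an .xz stream using the named integrity check."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, check=CHECKS[name])


def stored_check(blob):
    """Return the check ID recorded in an .xz stream."""
    dec = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    dec.decompress(blob)
    return dec.check


payload = b"dd_rescue " * 1000
for name in CHECKS:
    blob = compress_with_check(payload, name)
    # Every variant decompresses to the same data; only the stored check differs.
    assert lzma.decompress(blob) == payload
    print(name, len(blob), stored_check(blob) == CHECKS[name])
```

Stronger checks make the compressed stream slightly larger, since the checksum is stored with each block.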
To limit memory usage when decoding, use memlimit=XXX, where XXX is the memory limit for decoding. The usual suffixes k, M, G are supported. Values below 1M or above the machine's memory size will be rejected.
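A decoder memory limit can be tried out with Python's standard lzma module, which exposes liblzma's memlimit for decompression. This is a sketch, not the plugin's code: parse_memlimit is a hypothetical helper, it assumes that k, M and G are binary multiples (1024-based), and it omits the check against the machine's memory size.

```python
import lzma

# Assumption: suffixes are powers of 1024; the plugin may interpret them differently.
SUFFIXES = {"k": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_memlimit(text):
    """Turn '64M' into a byte count; reject values below 1M (hypothetical helper)."""
    factor = SUFFIXES.get(text[-1:])
    number = text[:-1] if factor else text
    limit = int(number) * (factor or 1)
    if limit < (1 << 20):
        raise ValueError("memlimit below 1M")
    return limit


def decompress_limited(blob, limit_text):
    """Decompress blob, failing with lzma.LZMAError if the limit is too small."""
    dec = lzma.LZMADecompressor(memlimit=parse_memlimit(limit_text))
    return dec.decompress(blob)


data = b"x" * 100000
blob = lzma.compress(data)  # Python's default preset uses an 8 MiB dictionary
print(decompress_limited(blob, "64M") == data)
try:
    decompress_limited(blob, "1M")
except lzma.LZMAError as err:
    print("limit too small:", err)
```

The limit that is needed depends on the dictionary size chosen at compression time, so data compressed with higher presets needs a larger memlimit to decode.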
Compression presets
The parameter preset=X selects the compression preset, where X is an integer from 0 to 9 inclusive. The default value is 5. The effect on compression speed and ratio is significant; see the xz documentation. Note that decompression speed is always very good with lzma, even without multithreading. You can append an e to the preset level to use more CPU (but not more memory) in an attempt to compress the data better. This corresponds to the --extreme flag in xz. Use levels 0, 1 or 2 if you want compression ratios better than the highest gzip (or low bzip2) levels at better speed. Multithreading speeds things up further, though you need to test it on your system, see below. Use the lzo plugin for the highest speed and lowest memory usage with very modest compression.
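To get a feeling for the presets on your own data, you can compare compressed sizes with Python's standard lzma module, which uses the same liblzma presets. This is a sketch rather than a benchmark of the plugin; compressed_sizes is a name chosen for this illustration, and the sample levels are kept low because high presets need several hundred megabytes of memory for compression.

```python
import lzma


def compressed_sizes(data, levels=(0, 3, 6), extreme=False):
    """Return {preset: compressed size in bytes} for the given preset levels."""
    # PRESET_EXTREME corresponds to appending 'e' to the preset (xz --extreme).
    flag = lzma.PRESET_EXTREME if extreme else 0
    return {lvl: len(lzma.compress(data, preset=lvl | flag)) for lvl in levels}


sample = b"The quick brown fox jumps over the lazy dog. " * 2000
print("normal :", compressed_sizes(sample))
print("extreme:", compressed_sizes(sample, extreme=True))
```

The results depend strongly on the input, so run the comparison on data that is representative of what you intend to store.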
Bugs/Limitations
Maturity
The plugin is new as of dd_rescue 1.99.17. Do not yet rely on data saved with ddr_lzma as the only backup for valuable data. The options may also still change in the future. After the events around xz/liblzma in March 2024, some additional reviews should be done on this code before passing untrusted compressed files to it.
Compressed data is more sensitive to data corruption than plain data. Note that the checksums in the xz file format do NOT allow errors to be corrected, because later bytes depend on earlier ones. Checksums just allow a rather reliable detection of data corruption.
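The detect-but-not-correct behaviour can be demonstrated with Python's standard lzma module (built on liblzma). The sketch below flips a single byte in a compressed stream; is_intact is a hypothetical helper for this illustration and is comparable in spirit to an integrity test, not a reimplementation of the plugin's test option.

```python
import lzma


def is_intact(blob):
    """Return True if blob decompresses cleanly, False if liblzma reports an error."""
    try:
        lzma.decompress(blob)
        return True
    except lzma.LZMAError:
        return False


data = bytes(range(256)) * 64
good = lzma.compress(data, check=lzma.CHECK_CRC32)

# Flip one byte in the middle of the compressed stream.
bad = bytearray(good)
bad[len(bad) // 2] ^= 0xFF

print("original intact: ", is_intact(good))
print("corrupted intact:", is_intact(bytes(bad)))
```

The corruption is reported, but the original data cannot be reconstructed from the damaged stream.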
Unlike ddr_lzo, no work has been invested to cleanly deal with holes (sparse files) or to (partially) recover from corrupted compressed data. liblzma may recover in some cases, but don't count on it.
When using multithreading, you may hit bugs. Depending on your system, you might run into missing function symbols on decoder initialization or a decoder memlimit that is always set to 1 byte. Test the mt option before relying on it.
Examples
- dd_rescue -L lzma=z:preset=9 infile outfile
compresses data from infile into outfile with compression preset 9.
- dd_rescue -L lzma=d:mt:memlimit=1234 infile infile2
will decompress infile to infile2 in multithreaded mode with memory limit equal to 1234 Megabytes.
- dd_rescue -L lzma infile.xz outfile
will decompress infile.xz into outfile.
See Also
dd_rescue(1), liblzma documentation
Author
Dmitrii Ivanov <dsivanov_9@edu.hse.ru>
Credits
The liblzma library and algorithm have been written by the Tukaani Project
https://xz.tukaani.org/xz-utils/
Copyright
This plugin is under the same license as dd_rescue: The GNU General Public License (GPL) v2 or v3 - at your option.
History
ddr_lzma plugin was first introduced with dd_rescue 1.99 (Nov 2024).