Scroll to navigation

XZ(1) XZ Utils XZ(1)

NAME

xz, unxz, xzcat, lzma, unlzma, lzcat - Compress or decompress .xz and .lzma files

SYNOPSIS

xz [option...] [file...]

COMMAND ALIASES

unxz is equivalent to xz --decompress.
xzcat is equivalent to xz --decompress --stdout.
lzma is equivalent to xz --format=lzma.
unlzma is equivalent to xz --format=lzma --decompress.
lzcat is equivalent to xz --format=lzma --decompress --stdout.

When writing scripts that need to decompress files, it is recommended to always use the name xz with appropriate arguments (xz -d or xz -dc) instead of the names unxz and xzcat.

DESCRIPTION

xz is a general-purpose data compression tool with command line syntax similar to gzip(1) and bzip2(1). The native file format is the .xz format, but the legacy .lzma format used by LZMA Utils and raw compressed streams with no container format headers are also supported.

xz compresses or decompresses each file according to the selected operation mode. If no files are given or file is -, xz reads from standard input and writes the processed data to standard output. xz will refuse (display an error and skip the file) to write compressed data to standard output if it is a terminal. Similarly, xz will refuse to read compressed data from standard input if it is a terminal.

Unless --stdout is specified, files other than - are written to a new file whose name is derived from the source file name:

  • When compressing, the suffix of the target file format (.xz or .lzma) is appended to the source filename to get the target filename.
  • When decompressing, the .xz or .lzma suffix is removed from the filename to get the target filename. xz also recognizes the suffixes .txz and .tlz, and replaces them with the .tar suffix.

If the target file already exists, an error is displayed and the file is skipped.

Unless writing to standard output, xz will display a warning and skip the file if any of the following applies:

  • File is not a regular file. Symbolic links are not followed, and thus they are not considered to be regular files.
  • File has more than one hard link.
  • File has setuid, setgid, or sticky bit set.
  • The operation mode is set to compress and the file already has a suffix of the target file format (.xz or .txz when compressing to the .xz format, and .lzma or .tlz when compressing to the .lzma format).
  • The operation mode is set to decompress and the file doesn't have a suffix of any of the supported file formats (.xz, .txz, .lzma, or .tlz).

After successfully compressing or decompressing the file, xz copies the owner, group, permissions, access time, and modification time from the source file to the target file. If copying the group fails, the permissions are modified so that the target file doesn't become accessible to users who didn't have permission to access the source file. xz doesn't support copying other metadata like access control lists or extended attributes yet.

Once the target file has been successfully closed, the source file is removed unless --keep was specified. The source file is never removed if the output is written to standard output.

Sending SIGINFO or SIGUSR1 to the xz process makes it print progress information to standard error. This has only limited use since when standard error is a terminal, using --verbose will display an automatically updating progress indicator.

Memory usage

The memory usage of xz varies from a few hundred kilobytes to several gigabytes depending on the compression settings. The settings used when compressing a file determine the memory requirements of the decompressor. Typically the decompressor needs 5 % to 20 % of the amount of memory that the compressor needed when creating the file. For example, decompressing a file created with xz -9 currently requires 65 MiB of memory. Still, it is possible to have .xz files that require several gigabytes of memory to decompress.

Especially users of older systems may find the possibility of very large memory usage annoying. To prevent uncomfortable surprises, xz has a built-in memory usage limiter, which is disabled by default. While some operating systems provide ways to limit the memory usage of processes, relying on it wasn't deemed to be flexible enough (for example, using ulimit(1) to limit virtual memory tends to cripple mmap(2)).

The memory usage limiter can be enabled with the command line option --memlimit=limit. Often it is more convenient to enable the limiter by default by setting the environment variable XZ_DEFAULTS, for example, XZ_DEFAULTS=--memlimit=150MiB. It is possible to set the limits separately for compression and decompression by using --memlimit-compress=limit and --memlimit-decompress=limit. Using these two options outside XZ_DEFAULTS is rarely useful because a single run of xz cannot do both compression and decompression and --memlimit=limit (or -M limit) is shorter to type on the command line.

If the specified memory usage limit is exceeded when decompressing, xz will display an error and decompressing the file will fail. If the limit is exceeded when compressing, xz will try to scale the settings down so that the limit is no longer exceeded (except when using --format=raw or --no-adjust). This way the operation won't fail unless the limit is very small. The scaling of the settings is done in steps that don't match the compression level presets, for example, if the limit is only slightly less than the amount required for xz -9, the settings will be scaled down only a little, not all the way down to xz -8.

Concatenation and padding with .xz files

It is possible to concatenate .xz files as is. xz will decompress such files as if they were a single .xz file.

It is possible to insert padding between the concatenated parts or after the last part. The padding must consist of null bytes and the size of the padding must be a multiple of four bytes. This can be useful, for example, if the .xz file is stored on a medium that measures file sizes in 512-byte blocks.

Concatenation and padding are not allowed with .lzma files or raw streams.

OPTIONS

Integer suffixes and special values

In most places where an integer argument is expected, an optional suffix is supported to easily indicate large integers. There must be no space between the integer and the suffix.

Multiply the integer by 1,024 (2^10). Ki, k, kB, K, and KB are accepted as synonyms for KiB.
Multiply the integer by 1,048,576 (2^20). Mi, m, M, and MB are accepted as synonyms for MiB.
Multiply the integer by 1,073,741,824 (2^30). Gi, g, G, and GB are accepted as synonyms for GiB.

The special value max can be used to indicate the maximum integer value supported by the option.

Operation mode

If multiple operation mode options are given, the last one takes effect.

Compress. This is the default operation mode when no operation mode option is specified and no other operation mode is implied from the command name (for example, unxz implies --decompress).
Decompress.
Test the integrity of compressed files. This option is equivalent to --decompress --stdout except that the decompressed data is discarded instead of being written to standard output. No files are created or removed.
Print information about compressed files. No uncompressed output is produced, and no files are created or removed. In list mode, the program cannot read the compressed data from standard input or from other unseekable sources.
The default listing shows basic information about files, one file per line. To get more detailed information, use also the --verbose option. For even more information, use --verbose twice, but note that this may be slow, because getting all the extra information requires many seeks. The width of verbose output exceeds 80 characters, so piping the output to, for example, less -S may be convenient if the terminal isn't wide enough.
The exact output may vary between xz versions and different locales. For machine-readable output, --robot --list should be used.

Operation modifiers

Don't delete the input files.
Since xz 5.4.0, this option also makes xz compress or decompress even if the input is a symbolic link to a regular file, has more than one hard link, or has the setuid, setgid, or sticky bit set. The setuid, setgid, and sticky bits are not copied to the target file. In earlier versions this was only done with --force.
This option has several effects:
  • If the target file already exists, delete it before compressing or decompressing.
  • Compress or decompress even if the input is a symbolic link to a regular file, has more than one hard link, or has the setuid, setgid, or sticky bit set. The setuid, setgid, and sticky bits are not copied to the target file.
  • When used with --decompress --stdout and xz cannot recognize the type of the source file, copy the source file as is to standard output. This allows xzcat --force to be used like cat(1) for files that have not been compressed with xz. Note that in future, xz might support new compressed file formats, which may make xz decompress more types of files instead of copying them as is to standard output. --format=format can be used to restrict xz to decompress only a single file format.
Write the compressed or decompressed data to standard output instead of a file. This implies --keep.
Decompress only the first .xz stream, and silently ignore possible remaining input data following the stream. Normally such trailing garbage makes xz display an error.
xz never decompresses more than one stream from .lzma files or raw streams, but this option still makes xz ignore the possible trailing data after the .lzma file or raw stream.
This option has no effect if the operation mode is not --decompress or --test.
Disable creation of sparse files. By default, if decompressing into a regular file, xz tries to make the file sparse if the decompressed data contains long sequences of binary zeros. It also works when writing to standard output as long as standard output is connected to a regular file and certain additional conditions are met to make it safe. Creating sparse files may save disk space and speed up the decompression by reducing the amount of disk I/O.
-S .suf, --suffix=.suf
When compressing, use .suf as the suffix for the target file instead of .xz or .lzma. If not writing to standard output and the source file already has the suffix .suf, a warning is displayed and the file is skipped.
When decompressing, recognize files with the suffix .suf in addition to files with the .xz, .txz, .lzma, or .tlz suffix. If the source file has the suffix .suf, the suffix is removed to get the target filename.
When compressing or decompressing raw streams (--format=raw), the suffix must always be specified unless writing to standard output, because there is no default suffix for raw streams.
Read the filenames to process from file; if file is omitted, filenames are read from standard input. Filenames must be terminated with the newline character. A dash (-) is taken as a regular filename; it doesn't mean standard input. If filenames are given also as command line arguments, they are processed before the filenames read from file.
This is identical to --files[=file] except that each filename must be terminated with the null character.

Basic file format and compression options

Specify the file format to compress or decompress:
This is the default. When compressing, auto is equivalent to xz. When decompressing, the format of the input file is automatically detected. Note that raw streams (created with --format=raw) cannot be auto-detected.
Compress to the .xz file format, or accept only .xz files when decompressing.
Compress to the legacy .lzma file format, or accept only .lzma files when decompressing. The alternative name alone is provided for backwards compatibility with LZMA Utils.
Compress or uncompress a raw stream (no headers). This is meant for advanced users only. To decode raw streams, you need use --format=raw and explicitly specify the filter chain, which normally would have been stored in the container headers.
Specify the type of the integrity check. The check is calculated from the uncompressed data and stored in the .xz file. This option has an effect only when compressing into the .xz format; the .lzma format doesn't support integrity checks. The integrity check (if any) is verified when the .xz file is decompressed.
Supported check types:
Don't calculate an integrity check at all. This is usually a bad idea. This can be useful when integrity of the data is verified by other means anyway.
Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet).
Calculate CRC64 using the polynomial from ECMA-182. This is the default, since it is slightly better than CRC32 at detecting damaged files and the speed difference is negligible.
Calculate SHA-256. This is somewhat slower than CRC32 and CRC64.
Integrity of the .xz headers is always verified with CRC32. It is not possible to change or disable it.
Don't verify the integrity check of the compressed data when decompressing. The CRC32 values in the .xz headers will still be verified normally.
Do not use this option unless you know what you are doing. Possible reasons to use this option:
  • Trying to recover data from a corrupt .xz file.
  • Speeding up decompression. This matters mostly with SHA-256 or with files that have compressed extremely well. It's recommended to not use this option for this purpose unless the file integrity is verified externally in some other way.
-0 ... -9
Select a compression preset level. The default is -6. If multiple preset levels are specified, the last one takes effect. If a custom filter chain was already specified, setting a compression preset level clears the custom filter chain.
The differences between the presets are more significant than with gzip(1) and bzip2(1). The selected compression settings determine the memory requirements of the decompressor, thus using a too high preset level might make it painful to decompress the file on an old system with little RAM. Specifically, it's not a good idea to blindly use -9 for everything like it often is with gzip(1) and bzip2(1).
-0 ... -3
These are somewhat fast presets. -0 is sometimes faster than gzip -9 while compressing much better. The higher ones often have speed comparable to bzip2(1) with comparable or better compression ratio, although the results depend a lot on the type of data being compressed.
-4 ... -6
Good to very good compression while keeping decompressor memory usage reasonable even for old systems. -6 is the default, which is usually a good choice for distributing files that need to be decompressible even on systems with only 16 MiB RAM. (-5e or -6e may be worth considering too. See --extreme.)
-7 ... -9
These are like -6 but with higher compressor and decompressor memory requirements. These are useful only when compressing files bigger than 8 MiB, 16 MiB, and 32 MiB, respectively.
On the same hardware, the decompression speed is approximately a constant number of bytes of compressed data per second. In other words, the better the compression, the faster the decompression will usually be. This also means that the amount of uncompressed output produced per second can vary a lot.
The following table summarises the features of the presets:

Preset DictSize CompCPU CompMem DecMem
-0 256 KiB 0 3 MiB