
BORG-COMPRESSION(1) borg backup tool BORG-COMPRESSION(1)

NAME

borg-compression - Details regarding compression

DESCRIPTION

It is no problem to mix different compression methods in one repo: deduplication is done on the source data chunks, not on the compressed or encrypted data.

If some specific chunk was once compressed and stored into the repo, creating another backup that also uses this chunk will not change the stored chunk. So if you use different compression specs for the backups, whichever stores a chunk first determines its compression. See also borg recreate.
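As a sketch (REPO and ARCHIVE are placeholder names; check borg recreate --help for the exact --recompress semantics of your borg version), already-stored chunks can be rewritten with a new compression spec:

```shell
# Rewrite the archive so that already-stored chunks are recompressed
# with the new spec; without recreate, existing chunks keep whatever
# compression they were first stored with.
borg recreate --recompress -C zstd,10 REPO::ARCHIVE
```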

Compression is lz4 by default. If you want something else, you have to specify what you want.

Valid compression specifiers are:

none
  Do not compress.

lz4
  Use lz4 compression. Very high speed, very low compression. (default)

zstd[,L]
  Use zstd ("zstandard") compression, a modern wide-range algorithm. If you do not explicitly give the compression level L (ranging from 1 to 22), it will use level 3. Archives compressed with zstd are not compatible with borg < 1.1.4.

zlib[,L]
  Use zlib ("gz") compression. Medium speed, medium compression. If you do not explicitly give the compression level L (ranging from 0 to 9), it will use level 6. Giving level 0 (which means "no compression", but still has zlib protocol overhead) is usually pointless; better use "none" compression instead.

lzma[,L]
  Use lzma ("xz") compression. Low speed, high compression. If you do not explicitly give the compression level L (ranging from 0 to 9), it will use level 6. Giving levels above 6 is pointless and counterproductive: it does not compress better due to the buffer size used by borg, but it wastes lots of CPU cycles and RAM.

auto,C[,L]
  Use a built-in heuristic to decide per chunk whether to compress or not. The heuristic tries lz4 to check whether the data is compressible. For incompressible data, it will not use compression (uses "none"). For compressible data, it uses the given C[,L] compression - with C[,L] being any valid compression specifier.

obfuscate,SPEC,C[,L]
  Use compressed-size obfuscation to make fingerprinting attacks based on the observable stored chunk size more difficult. Note:
  • You must combine this with encryption, or it won't make any sense.
  • Your repo size will be bigger, of course.
  • A chunk is limited by the constant MAX_DATA_SIZE (cur. ~20MiB).

The SPEC value determines how the size obfuscation works:

1..6
  Relative random reciprocal size variation (multiplicative)

  Size will increase by a factor, relative to the compressed data size. Smaller factors are used often, larger factors rarely.

Available factors:

1:     0.01 ..        100
2:     0.1  ..      1,000
3:     1    ..     10,000
4:    10    ..    100,000
5:   100    ..  1,000,000
6: 1,000    .. 10,000,000


Example probabilities for SPEC 1:

90    %   0.01 ..   0.1
 9    %   0.1  ..   1
 0.9  %   1    ..  10
 0.09 %  10    .. 100


110..123
  Randomly sized padding up to the given size (additive)

110: 1kiB (2 ^ (SPEC - 100))
...
120: 1MiB
...
123: 8MiB (max.)


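For the additive mode, the maximum padding follows directly from the 2 ^ (SPEC - 100) formula above; a quick illustrative shell check:

```shell
# Maximum padding in bytes for additive obfuscation SPEC values:
# limit = 2 ^ (SPEC - 100)
for spec in 110 120 123; do
  echo "$spec: $((2 ** (spec - 100))) bytes"
done
```

This prints 1024 (1 kiB) for SPEC 110, 1048576 (1 MiB) for 120, and 8388608 (8 MiB) for 123, matching the table.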

Examples:

borg create --compression lz4 REPO::ARCHIVE data
borg create --compression zstd REPO::ARCHIVE data
borg create --compression zstd,10 REPO::ARCHIVE data
borg create --compression zlib REPO::ARCHIVE data
borg create --compression zlib,1 REPO::ARCHIVE data
borg create --compression auto,lzma,6 REPO::ARCHIVE data
borg create --compression auto,lzma ...
borg create --compression obfuscate,110,none ...
borg create --compression obfuscate,3,auto,zstd,10 ...
borg create --compression obfuscate,2,zstd,6 ...


AUTHOR

The Borg Collective

2023-12-02