NAME
wzip - lossy data compression and denoising
SYNOPSIS
wzip [ -c | -d | -dn | -hdn ] num sf
DESCRIPTION
This manual page documents the wzip command.
wzip is a program for LOSSY data compression and denoising. It reads from
STDIN and writes to STDOUT. In compression mode the input is a sequence of
ASCII floating-point values, and num is the number of these values. The output
is a sequence of small integers, most of them zero in typical applications,
which is well suited to further compression with a standard lossless
compression program such as gzip.
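As an illustration of preparing input for compression mode, the following Python sketch formats a list of numbers as ASCII floating-point text and builds the argument list for wzip -c num sf as given in the SYNOPSIS. The helper names format_input and compress_argv are hypothetical, and writing one value per line is an assumption, since this page does not specify the separator between values.

```python
def format_input(values):
    """Return (num, text) for a list of numbers.

    Assumption: values are written one per line; this page only says
    the input is a sequence of ASCII floating-point values.
    """
    text = "".join(f"{float(v)!r}\n" for v in values)
    return len(values), text


def compress_argv(num, sf):
    """Build the argument list for compression mode: wzip -c num sf."""
    return ["wzip", "-c", str(num), str(sf)]


num, text = format_input([1, 2.5, -3e-2])
argv = compress_argv(num, 0.5)
```

The text could then be passed to the program on STDIN, for example with subprocess.run(argv, input=text, text=True, capture_output=True), and the captured output handed to gzip.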
The program can also be used for denoising. In this case both input and output
are sequences of ASCII floating-point values.
The scale factor sf determines the strength of compression or denoising.
A higher scale factor means heavier compression and stronger denoising. Four
times the standard deviation of the noise content is a good starting value. If
the noise level is unknown, 5 percent of the overall signal amplitude can serve
as a first estimate of a suitable scale factor.
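The two rules of thumb for the scale factor can be sketched in Python as follows. The function names are hypothetical, and "overall signal amplitude" is interpreted here as the difference between the largest and smallest value, which is an assumption rather than something this page defines.

```python
import statistics


def scale_factor_from_noise(noise_samples):
    """Starting value: four times the standard deviation of the noise."""
    return 4.0 * statistics.pstdev(noise_samples)


def scale_factor_from_amplitude(signal):
    """Fallback: 5 percent of the overall signal amplitude.

    Assumption: amplitude is taken as max(signal) - min(signal).
    """
    return 0.05 * (max(signal) - min(signal))
```

The noise samples might come from a flat region of the data where only noise is present; either result is only a first guess to be adjusted by inspecting the output.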
If the noise content of the input data is strongly non-Gaussian, as with
Poisson noise, the input data should first be transformed so that the noise is
approximately Gaussian-distributed. Poisson-distributed input values, for
example raw counts per channel in EDX or XPD, can be transformed by applying
y:=2.0*sqrt(x+0.25109) to each data point x. The back transformation of a
processed value x is y:=(x/2)^2. The summand 0.25109 compensates for the bias
caused by the asymmetry of the Poisson distribution.
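The transformation pair for Poisson-distributed counts can be written as a short Python sketch. The function names are hypothetical; the formulas and the constant 0.25109 are taken directly from the text above.

```python
import math

OFFSET = 0.25109  # summand given in the text


def to_gaussian_like(x):
    """Forward transformation applied to each raw count before wzip."""
    return 2.0 * math.sqrt(x + OFFSET)


def from_gaussian_like(y):
    """Back transformation applied to each value after wzip."""
    return (y / 2.0) ** 2
```

With the formulas exactly as stated, a round trip without any processing in between returns x + 0.25109 rather than x; the text attributes the summand to bias compensation, so the back transformation is left without a subtraction here.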
Invoking the program without any options writes usage examples to STDERR.
OPTIONS
Exactly one option must be given.
- -c
- Compression. Reads num ASCII floating-point values
from STDIN and writes a sequence of integers with high redundancy to
STDOUT.
- -d
- Decompression. Reads from STDIN and writes a sequence of
num ASCII floating-point values to STDOUT. Because the compression is
lossy, these values approximate the original data but do not reproduce it
exactly.
- -dn
- Denoising. Reads num ASCII floating-point values
from STDIN and writes a sequence of num ASCII floating-point values
to STDOUT. These values approximate the original data with the noise reduced.
- -hdn
- Denoising with hard thresholding instead of wavelet
shrinkage. Isolated noise peaks may pass through untouched in this mode. On
the other hand, there is much less impact on the signal slope.
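The difference between wavelet shrinkage (soft thresholding) and hard thresholding can be illustrated on a handful of coefficients. This Python sketch shows only the two generic thresholding rules; which wavelet wzip uses and how it derives a threshold from sf are not stated on this page, so the threshold t and the coefficient values below are purely illustrative assumptions.

```python
def soft_threshold(c, t):
    """Wavelet shrinkage: move coefficient c toward zero by t."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0


def hard_threshold(c, t):
    """Hard thresholding: keep c unchanged if |c| > t, else zero it."""
    return c if abs(c) > t else 0.0


coeffs = [0.2, -0.4, 3.0, -5.0]  # illustrative coefficients
t = 1.0                          # illustrative threshold
soft = [soft_threshold(c, t) for c in coeffs]  # [0.0, 0.0, 2.0, -4.0]
hard = [hard_threshold(c, t) for c in coeffs]  # [0.0, 0.0, 3.0, -5.0]
```

Hard thresholding leaves the surviving coefficients at full size, which is consistent with the smaller impact on the signal slope, while a noise coefficient that happens to exceed the threshold also survives at full size and can show up as an isolated peak.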
SEE ALSO
Donoho, D.L.; Johnstone, I.M.: Adapting to unknown smoothness via wavelet
shrinkage, technical report 425, Department of Statistics, Stanford
University, Stanford, June 1993,
ftp://playfair.stanford.edu/pub/donoho/ausws.ps.Z
Franzen, A.: Compression of process data with a wavelet method, steel res. 69
(1998), No. 1, pp. 28/30
Franzen, A.: Non-linear denoising with wavelet transformation, Z. Metallkd. 89
(1998), No. 4, pp. 297/302
AUTHOR
This manual page was written by Andreas Franzen <anfra@debian.org>, for
the Debian GNU/Linux system (but may be used by others).
Copyright (C) 1997 Andreas Franzen, placed under the GNU General Public License,
see the file copyright for details.