The zlibc package allows transparent on-the-fly uncompression of gzipped files.
Your programs will be able to access any compressed file, just as if it were
uncompressed. Zlibc transparently uncompresses the data from these files
as soon as they are read, just as a compressed filesystem would do. No kernel
patch, no recompilation of these executables and no recompilation of the
libraries is needed.
It is not (yet) possible to execute compressed files with zlibc. However, there
is another package, called tcx, which is able to uncompress executables on the
fly. On the other hand, tcx isn't able to uncompress data files on the fly.
Fortunately, both zlibc and tcx may coexist on the same machine without
problems.
Zlibc can be found at the following places (and their mirrors):
ftp://zlibc.linux.lu/zlibc-0.9k.tar.gz
ftp://www.tux.org/pub/knaff/zlibc/zlibc-0.9k.tar.gz
ftp://ibiblio.unc.edu/pub/Linux/libs/compression/zlibc-0.9k.tar.gz
ftp://ftp.gnu.org/gnu/zlibc/compression/zlibc-0.9k.tar.gz
Before reporting a bug, make sure that it has not yet been fixed in the Alpha
patches which can be found at:
http://zlibc.linux.lu/
http://www.tux.org/pub/knaff/zlibc
These patches are named zlibc-version-ddmm.taz, where version stands for the
base version, dd for the day and mm for the month. Due to a lack of space, I
usually leave only the most recent patch.
There is a zlibc mailing list at zlibc @ www.tux.org . Please send all bug
reports to this list. You may subscribe to the list by sending a message with
'subscribe zlibc @ www.tux.org' in its body to majordomo @ www.tux.org . (N.B.
Please remove the spaces around the "@" both times. I left them
there in order to fool spambots.) Announcements of new zlibc versions will
also be sent to the list, in addition to the linux announce newsgroups. The
mailing list is archived at
http://www.tux.org/hypermail/zlibc/latest
If you install zlibc on Linux, make sure that your shared loader
(ld-linux.so.1/ld.so) understands LD_PRELOAD (ld.so 1.8.5 or more recent is
best).
Type ./configure. This runs the GNU autoconf script, which configures the
`Makefile' and the `config.h' file. You may pass compile-time configuration
options to ./configure; see section Compile-time configuration for details.
Type make to compile zlibc.
Type make install to install zlibc and associated programs to their final
target.
To use this module, set the environment variable LD_PRELOAD to point to the
object. Example (sh syntax):
LD_PRELOAD=/usr/local/lib/uncompress.so
export LD_PRELOAD
or (csh syntax):
setenv LD_PRELOAD /usr/local/lib/uncompress.so
On Linux, use /lib/uncompress.so instead of /usr/local/lib/uncompress.so .
You might want to put these lines in your `.profile' or `.cshrc' in order to
have the uncompressing functions available all the time. Compress your files
using gzip and enjoy.
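A login script may also run on machines where zlibc is not installed, and a
loader that is pointed at a missing preload object typically prints a warning
for every program it starts. The following is a minimal sketch (sh syntax), not
part of zlibc, that sets LD_PRELOAD only when the library is actually present.
The helper name first_existing is hypothetical; the two paths are the install
locations mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that names an existing
# regular file; return non-zero if none of them does.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# /lib is the Linux location, /usr/local/lib the default elsewhere.
if zlibc_lib=$(first_existing /lib/uncompress.so /usr/local/lib/uncompress.so); then
    LD_PRELOAD=$zlibc_lib
    export LD_PRELOAD
fi
```

If neither file exists, LD_PRELOAD is left untouched.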
For security reasons, the dynamic loader disregards environment variables such
as LD_PRELOAD when executing set-uid programs.
However, on Linux, you can use zlibc with set-uid programs too, by using one of
the two methods described below:
You may put the path to `uncompress.so' into `/etc/ld.so.preload' instead of
using LD_PRELOAD.
WARNING: If you use `/etc/ld.so.preload', be sure to install
`uncompress.so' on your root filesystem, for instance in /lib, as is done by
the default configuration. Using a directory which is not available at boot
time, such as /usr/local/lib, will cause trouble at the next reboot!
You should also take care to remove zlibc from `/etc/ld.so.preload' when
installing a new version. First test it out using LD_PRELOAD, and only if
everything is ok, put it back into `/etc/ld.so.preload'.
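Since a stale entry in `/etc/ld.so.preload' affects every program on the
system, it may be worth checking the file after each change. The sketch below
is not part of zlibc; it assumes the usual whitespace-separated list format of
that file, and missing_preloads is a hypothetical helper name. It only checks
that the listed objects exist, not that they live on the root filesystem.

```shell
#!/bin/sh
# Hypothetical helper: print every absolute path listed in the preload
# file $1 that does not exist. Entries without a leading slash are
# skipped, since the loader resolves those through its search path.
missing_preloads() {
    # The unquoted expansion splits the file contents on whitespace.
    for lib in $(cat "$1"); do
        case $lib in
            /*) [ -e "$lib" ] || printf '%s\n' "$lib" ;;
        esac
    done
}

# Example: check the system-wide file if it is present and readable.
if [ -r /etc/ld.so.preload ]; then
    missing_preloads /etc/ld.so.preload
fi
```

No output means that every absolute entry points at an existing file.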
If you have a version of ld.so which is more recent than 1.9.0, you can set
LD_PRELOAD to just contain the basename of `uncompress.so' without the
directory. In that case, the file is found as long as it is in the shared
library path (which usually contains `/lib' and `/usr/lib'). Because the
search is restricted to the library search path, this also works for set-uid
programs.
Example (sh syntax):
LD_PRELOAD=uncompress.so
export LD_PRELOAD
or (csh syntax):
setenv LD_PRELOAD uncompress.so
The advantage of this approach over `ld.so.preload' is that zlibc can more
easily be switched off in case something goes wrong.
Once zlibc is installed, simply compress your biggest datafiles using gzip. Your
programs are now able to uncompress these files on the fly whenever they need
them.
After compressing your datafiles, you also need to change any potential symbolic
links pointing to them. Let's suppose that `x' is a symlink to `tstfil':
> echo 'this is a test' >tstfil
> ln -s tstfil x
> ls -l
total 1
-rw-r--r-- 1 alknaff sirac 15 Feb 25 19:40 tstfil
lrwxrwxrwx 1 alknaff sirac 8 Feb 25 19:40 x -> tstfil
After compressing it, you'll see the following listing:
> gzip tstfil
> ls -l
total 1
pr--r--r-- 1 alknaff sirac 15 Feb 25 19:40 tstfil
lrwxrwxrwx 1 alknaff sirac 8 Feb 25 19:40 x -> tstfil
`tstfil' is now shown as a pipe by zlibc in order to warn programs that they
cannot seek in it. Zlibc still shows it with its old name, and you can
directly look at its contents:
> cat tstfil
this is a test
However, `tstfil' is not yet accessible using the symbolic link:
> cat x
cat: x: No such file or directory
In order to make `tstfil' accessible using the link, you have to destroy the
link, and remake it:
> rm x
/bin/rm: remove `x'? y
> ln -s tstfil x
> ls -l
total 1
pr--r--r-- 1 alknaff sirac 15 Feb 25 19:40 tstfil
lrwxrwxrwx 1 alknaff sirac 8 Feb 25 19:44 x -> tstfil
> cat x
this is a test
If you try to compress datafiles which have hard links pointing to them, gzip
refuses to compress them.
> echo 'this is a test' >tstfil
> ln tstfil x
> ls -li
total 2
166 -rw-r--r-- 2 alknaff sirac 15 Feb 25 19:46 tstfil
166 -rw-r--r-- 2 alknaff sirac 15 Feb 25 19:46 x
> gzip tstfil
gzip: tstfil has 1 other link -- unchanged
Thus you need to remove these hard links first, and remake them after
compressing the file.
> rm x
/bin/rm: remove `x'? y
> gzip tstfil
> ln tstfil x
> ls -li
total 2
167 pr--r--r-- 2 alknaff sirac 15 Feb 25 19:46 tstfil
167 pr--r--r-- 2 alknaff sirac 15 Feb 25 19:46 x
> cat x
this is a test
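When compressing many datafiles at once, it can help to find the ones with
extra hard links beforehand, since gzip leaves those unchanged. The following
sketch is not part of zlibc; has_extra_links is a hypothetical helper name, and
the check relies on the standard `-links' test of find. Run it before
compressing (or with zlibc switched off), so that the real files are examined.

```shell
#!/bin/sh
# Hypothetical helper: succeed if regular file $1 has more than one hard link.
has_extra_links() {
    # -prune keeps find from descending if $1 happens to be a directory.
    [ -n "$(find "$1" -prune -type f -links +1)" ]
}

# Report regular files in the current directory that have extra hard
# links and would therefore be left unchanged by gzip.
for f in ./*; do
    if [ -f "$f" ] && has_extra_links "$f"; then
        printf '%s has extra hard links\n' "$f"
    fi
done
```

Each reported file needs its extra links removed before compression and remade
afterwards, as shown in the session above.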
Usually, programs don't make system calls directly, but instead call a library
function which performs the actual system calls. For instance, to open a file,
the program first calls the open library function, and then this function
makes the actual syscall. Zlibc overrides the open function and other related
functions in order to do the uncompression on the fly.
If the open system call fails because the file doesn't exist, zlibc constructs
the filename of a compressed file by appending .gz to the filename supplied by
the user program. If this compressed file exists, it is opened and piped
through gunzip, and the descriptor of the read end of this pipe is returned to
the caller.
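The lookup order described above can be imitated in a few lines of shell. This
is only an illustration of the fallback logic, not the way zlibc is implemented
(zlibc does this inside the overridden library functions). The name
open_virtual is hypothetical; the .gz extension and the gzip -dc command are
the defaults named in this document.

```shell
#!/bin/sh
# Hypothetical helper: write the contents of file $1 to stdout, falling
# back to the compressed file when $1 itself does not exist.
# $2 is the compressed-file extension (default .gz).
open_virtual() {
    name=$1
    ext=${2:-.gz}
    if [ -e "$name" ]; then
        # The file exists as named: read it directly.
        cat "$name"
    elif [ -e "$name$ext" ]; then
        # Fall back to the compressed file and pipe it through the
        # uncompressor.
        gzip -dc < "$name$ext"
    else
        echo "$name: No such file or directory" >&2
        return 1
    fi
}
```

For instance, after gzip tstfil, calling open_virtual tstfil reads
`tstfil.gz' and prints the uncompressed text.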
In some cases, the compressed file is first uncompressed into a temporary file,
and a read descriptor for this file is passed to the caller. This is necessary
if the caller wants to call lseek on the file or mmap it. A description of
data files for which using a temporary file is necessary can be given in the
configuration files `/usr/local/etc/zlibc.conf' (`/etc/zlibc.conf' on
Linux) and `~/.zlibrc'. (Actually, the location of the system-wide
configuration file depends on the settings of sysconfdir and prefix during
./configure; see section Compile-time configuration.) See section
Configuration files, for a detailed description of their syntax.
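A pipe cannot be rewound, which is why a seekable copy is sometimes needed. The
sketch below illustrates the temporary-file approach in shell; it is not
zlibc's own code. The name uncompress_to_temp is hypothetical, the standard
TMPDIR variable is used here only for illustration, and the widely available
(but not strictly POSIX) mktemp command is assumed.

```shell
#!/bin/sh
# Hypothetical helper: uncompress $1 into a fresh temporary file and
# print that file's name. The caller is responsible for deleting it.
uncompress_to_temp() {
    tmpfile=$(mktemp "${TMPDIR:-/tmp}/zlibc-sketch.XXXXXX") || return 1
    if gzip -dc < "$1" > "$tmpfile"; then
        printf '%s\n' "$tmpfile"
    else
        # Uncompression failed: do not leave a partial file behind.
        rm -f "$tmpfile"
        return 1
    fi
}
```

The resulting file is an ordinary regular file, so a program can seek in it or
map it into memory.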
Many user programs try to check the existence of a given file by other system
calls before actually opening it. That's why zlibc also overrides these system
calls. If for example the user program tries to stat a file, this call is also
intercepted.
The compressed file, which exists physically on the disk, is also called 'the
real file', and the uncompressed file, whose existence is only simulated by
zlibc is called 'the virtual file'.
The behavior of zlibc can be tailored using configuration files or environment
variables. This customization should normally not be needed, as the
compiled-in defaults are already pretty complete.
Environment variables come in two kinds: switch variables have a boolean value
and can only be turned on or off, whereas string variables can have arbitrary
strings as values.
Each switch variable represents a flag which can be turned on or off. If its
value is on or 1, the flag is turned on; if its value is off or 0, the flag is
turned off. All other values are ignored. If the same flag can be turned on or
off using config files, the environment variable always takes priority.
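The rule for switch values can be written down as a small shell function. This
is a sketch of the rule as stated here, not zlibc's actual parser; parse_switch
is a hypothetical name, and matching the values case-sensitively is an
assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the effective state of a switch.
# $1 is the raw value of the variable, $2 the state that stays in
# effect (for example from a config file) when $1 is not recognised.
parse_switch() {
    case $1 in
        on|1)  echo on ;;
        off|0) echo off ;;
        *)     echo "$2" ;;   # any other value is ignored
    esac
}
```

For example, parse_switch 1 off yields on, whereas parse_switch yes off yields
off because yes is not a recognised value.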
If this variable is turned on, informational messages are printed on many
operations of zlibc. Moreover, error messages are printed in order to point
out errors in the configuration files, if any. If this variable is turned off,
errors are silently ignored.
If this variable is turned on, and if the user program tries to unlink a virtual
(uncompressed) file, zlibc translates this call into unlinking the real file.
If this variable is turned off, unlink calls on virtual files are ignored.
If this variable is turned on, zlibc is switched off.
If this variable is turned on, the readdir function shows the real (compressed)
files instead of the virtual (uncompressed) files.
These variables have a string value, which represents a file, a directory or a
command.
This is the name of the directory where the temporary uncompressed files are
put. The default is /tmp.
This is the extension which is appended to a virtual file name in order to
obtain the real (compressed) file name. The default is .gz.
This is the name of the program to be invoked to uncompress the data. The
default is gzip -dc.
This is the name of an additional configuration file. If this variable is
defined and if the corresponding file exists, the configuration described in
this file overrides the configurations in `~/.zlibrc' and in
`/usr/local/etc/zlibc.conf' (`/etc/zlibc.conf' on Linux).
It is possible to operate zlibc entirely without configuration files. In this
case, it uses the compiled-in defaults. These are generated at compile-time
from the `zlibrc.sample' file. This file has the same syntax as the
configuration files described above (see section Configuration files). If
you want to change the compiled-in defaults of zlibc, edit that file, and
remake.
Before it can be compiled, zlibc must be configured using the GNU autoconf
script ./configure. In most circumstances, running ./configure without any
parameters is enough. However, you may customize zlibc using various options
to ./configure. The following options are supported:
Prefix used for any directories used by zlibc. By default, this is `/usr/local'.
Zlibc is installed in `$prefix/lib' and looks for its system-wide configuration
file in `$prefix/etc'. Man pages are installed in `$prefix/man', info pages in
`$prefix/info', etc. On Linux, if you use zlibc via `/etc/ld.so.preload', you
should use `/' as the prefix instead of the default `/usr/local', so that the
library ends up in `/lib' on the root filesystem.
Directory containing the system-wide configuration file `zlibc.conf'. By
default, this is derived from prefix (see above).
Disables run-time configuration via environment variables and via the
configuration files. This may be needed in high-security environments.
Disables run-time configuration via environment variables.
Tells zlibc not to use the /proc filesystem to find out the commandline of the
programs for which it runs, even if a working /proc is detected.
Tells zlibc to use the /proc filesystem to find out the commandline of the
programs for which it runs, even if no working /proc is detected.
Uses extension as the filename extension of compressed files. The default is
.gz.
Allows compressed filename extensions of at most length characters to be set
via run-time configuration. The default is 5.
Uses directory to store the uncompressed files. The default is /tmp.
Defines how the program for uncompressing files should be invoked. This command
should read the compressed file from stdin, and output the uncompressed data
to stdout. The default is gzip -dc.
In addition to the above-listed options, the standard GNU autoconf options
apply. Type ./configure --help to get a complete list of these.