NAME
clfdomainsplit - split Common-Log Format web logs based on domain name
SYNOPSIS
clfdomainsplit [--help] [-i input] [-d defaultfile] [-c cfg-file] [-o directory]
DESCRIPTION
The clfdomainsplit program splits large CLF (Common Log Format) web logs
based on domain name, so that a separate log analysis pass can be run for each
domain hosted on your server.
OVERVIEW
The input parameter specifies the file to read (default is standard input).
The defaultfile parameter specifies where entries go if they have no domain
name (either the entry has an IP address for the server, or it has no server
name at all because the URL is relative to the root of the web server). By
default such entries are printed on standard error.
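The test for "no domain" described above can be sketched as follows. This is
only an illustration, not the program's actual code: the helper name
server_name is hypothetical, and it assumes that the URL field of a log entry
is either an absolute URL (http://host/path) or a path relative to the server
root.

```python
import ipaddress
from urllib.parse import urlsplit


def server_name(url):
    """Return the host name in url, or None if the entry has no domain.

    Hypothetical helper: None means the entry would go to the default file.
    """
    host = urlsplit(url).hostname  # None for relative URLs such as "/index.html"
    if not host:
        return None
    try:
        ipaddress.ip_address(host)  # succeeds only for IPv4/IPv6 literals
    except ValueError:
        return host  # not an IP address, so treat it as a domain name
    return None  # the server was addressed by IP address
```

Entries for which this returns None correspond to the ones that the
defaultfile parameter collects.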
The cfg-file parameter specifies the rules for deciding which host names
count as the same domain. For example, www.coker.com.au belongs in the same
file as coker.com.au and abc.coker.com.au because domain names ending in .au
have three major components. The domain names www.workbenelux.nl and
workbenelux.nl belong in the same file because domain names ending in .nl have
two major components (as do .com and .gov), whereas anything ending in .va
belongs to the same organization. The rules are of the form
number:pattern, where number is the number of domain parts that are
significant (2 for .com) and pattern is matched by a simple string comparison.
The default rules are:
- 2:com
- 2:nl
- 3:au
- 3:uk
If no config file is specified, the program looks for
/etc/clfdomainsplit.cfg. Comments in the config file start with #. Note
that the first matching rule is used, so the order of the rules matters.
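The way number:pattern rules group host names can be sketched as follows.
This is a minimal illustration under stated assumptions, not the program's
implementation: the function names are hypothetical, the pattern is assumed
to be compared against the end of the host name, text after # is assumed to
be a comment anywhere on a line, and the result for a host that matches no
rule (None here) is a placeholder because the behaviour is not documented
above.

```python
DEFAULT_RULES = "2:com\n2:nl\n3:au\n3:uk\n"  # the default rules listed above


def parse_rules(text):
    """Parse number:pattern lines, skipping blank lines and # comments."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: # starts a comment
        if not line:
            continue
        count, pattern = line.split(":", 1)
        rules.append((int(count), pattern.strip()))
    return rules


def domain_key(host, rules):
    """Return the significant part of host according to the first matching rule."""
    host = host.lower().rstrip(".")
    labels = host.split(".")
    for count, pattern in rules:
        # Assumption: a rule matches when the host name ends in ".pattern".
        if host == pattern or host.endswith("." + pattern):
            return ".".join(labels[-count:])  # keep the last `count` parts
    return None  # no rule matched; real behaviour is not documented here
```

With the default rules, www.coker.com.au and abc.coker.com.au both map to
coker.com.au, and www.workbenelux.nl maps to workbenelux.nl. Because the loop
returns on the first match, an earlier rule always overrides a later one for
the same pattern.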
The directory parameter specifies where the output files are created
(default is the current directory). I recommend that you use a directory for
this and nothing else, as you never know how many files may be created!
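When the program is driven from a script, the options above can be assembled
as in the following sketch. Only the option letters come from the SYNOPSIS;
the function names are hypothetical, and any file or directory names passed
in are the caller's own.

```python
import subprocess


def build_command(input_log=None, default_file=None, cfg_file=None, out_dir=None):
    """Build the clfdomainsplit argument list, omitting options left at their defaults."""
    cmd = ["clfdomainsplit"]
    options = (("-i", input_log), ("-d", default_file),
               ("-c", cfg_file), ("-o", out_dir))
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_split(**kwargs):
    """Run clfdomainsplit; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(**kwargs), check=True)
```

Passing a dedicated output directory as out_dir keeps the generated files
apart from everything else, in line with the recommendation above.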
EXIT STATUS
0 No errors
1 Bad parameters
AUTHOR
This program, its manual page, and the Debian package were written by Russell
Coker <russell@coker.com.au>.
SEE ALSO
clfsplit(1),
clfmerge(1)