NAME
globus-rls-server - Replica Location Server
SYNOPSIS
globus-rls-server [ -B update_bf_int ] [ -b maxbackoff ] [ -C rlscertfile ]
[ -c conffile ] [ -d ] [ -e rli_expire_int ] [ -F update_factor ]
[ -f maxfreethreads ] [ -I true|false ] [ -i idletimeout ] [ -K rlskeyfile ]
[ -L loglevel ] [ -l true|false ] [ -M maxconnections ] [ -m maxthreads ]
[ -N ] [ -o update_buftime ] [ -p pidfiledir ] [ -r true|false ]
[ -S rli_expire_stale ] [ -s startthreads ] [ -t timeout ] [ -U myurl ]
[ -u update_ll_int ] [ -v ]
DESCRIPTION
The RLS server globus-rls-server supports both a Local Replica Catalog
(LRC) server, which manages Logical FileName (LFN) to Physical FileName (PFN)
mappings in a database, and a Replica Location Index (RLI) server, which
manages mappings of LFNs to LRC servers. globus-rls-server may be
configured as either an LRC or RLI server, or both. Both LRCs and RLIs may be
configured to send updates to other RLIs (using globus-rls-admin(8)).
Clients wishing to locate one or more physical filenames associated with a logical
filename may first contact an RLI server, which will return a list of LRCs
that may know about the LFN. The LRC servers are then contacted in turn to
find the physical filenames. Note that RLI information may be out of date, so
clients should be prepared to get a negative response when contacting an LRC
(or no response at all if the LRC server is unavailable).
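The two-step lookup described above can be sketched as follows. This is only an illustration of the client-side control flow: the dictionaries stand in for real servers, and every URL, LFN, and function name here is hypothetical rather than part of any RLS client API.

```python
# Hypothetical in-memory stand-ins for one RLI and two LRCs. A real client
# would contact RLS servers over the network instead.
RLI_INDEX = {
    "lfn://example/data1": ["rls://lrc-a.example.org", "rls://lrc-b.example.org"],
}
LRC_CATALOGS = {
    # lrc-a is still listed by the RLI, but no longer holds the mapping.
    "rls://lrc-a.example.org": {},
    "rls://lrc-b.example.org": {
        "lfn://example/data1": ["gsiftp://host-b.example.org/data1"],
    },
}


def query_lrc(lrc_url, lfn):
    """Return the PFNs an LRC knows for lfn; raise ConnectionError if it is down."""
    if lrc_url not in LRC_CATALOGS:
        raise ConnectionError(lrc_url)
    return LRC_CATALOGS[lrc_url].get(lfn, [])


def locate(lfn):
    """Ask the RLI for candidate LRCs, then query each LRC in turn."""
    pfns = []
    for lrc_url in RLI_INDEX.get(lfn, []):
        try:
            # An empty list is the "negative response" case (stale RLI data).
            pfns.extend(query_lrc(lrc_url, lfn))
        except ConnectionError:
            continue  # LRC unavailable: skip it and try the next one
    return pfns
```

The point of the sketch is that both a negative answer and an unreachable LRC are treated as normal outcomes rather than errors.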
globus-rls-server uses syslog(3) to log errors and other information
(facility LOG_DAEMON) when it's running in normal (daemon) mode. If the -d
option (debug) is specified then log messages are written to stdout.
LRC to RLI Updates
Two methods exist for LRC or RLI servers to inform RLI servers of their LFNs. By
default the list of LFNs is sent from the source to the RLI. This can be time
consuming if the number of LFNs is large, but it gives the RLI an exact list
of the LFNs known to the LRC, which allows wildcard searching of the RLI.
Alternatively Bloom filters may be sent. These are highly compressed summaries
of the LFNs, but they do not allow wildcard searching and they generate more
"false positives" when querying an RLI. Bloom filters are described in more
detail below. The program globus-rls-admin(8) can be
used to manage the list of RLIs that an LRC or RLI server sends updates to,
including partitioning LFNs amongst multiple RLI servers.
A softstate algorithm is used for updates: periodically the source server sends
its state (LFN information) to the RLI servers it updates. The RLI servers add
these LFNs to their index, or update a timestamp if the LFNs were already
known. RLI servers expire information about (LFN, LRC) mappings that haven't
been updated for a period longer than the softstate update interval.
Options that can be configured to control the softstate algorithm when a source
server updates an RLI by sending LFNs include:
- rli_expire_int (seconds)
- How often an RLI server will check for stale entries in its
database.
- rli_expire_stale (seconds)
- How old an entry must be in an RLI database before it's
considered stale. This value should be no smaller than
update_ll_int. Note that if the LRC server is responding this value is
not used; instead the value of update_ll_int or
update_bf_int is retrieved from the LRC server, multiplied by 1.2,
and used as the value for rli_expire_stale.
- update_bf_int (seconds)
- Interval between RLI updates when using Bloom filters.
- update_ll_int (seconds)
- Interval between RLI updates when using LFN lists for
softstate updates.
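The way these options interact can be sketched as below. This is a simplified model under stated assumptions, not the server's implementation: the function names and the dictionary layout are hypothetical, and only the 1.2 multiplier and the "longer than" comparison come from the text above.

```python
def effective_stale_age(rli_expire_stale, lrc_update_interval=None):
    """Seconds after which an RLI entry counts as stale.

    lrc_update_interval is update_ll_int or update_bf_int as retrieved from
    the LRC, or None if the LRC did not respond (then rli_expire_stale applies).
    """
    if lrc_update_interval is None:
        return rli_expire_stale
    return lrc_update_interval * 1.2


def expire_stale(entries, now, stale_age):
    """Keep only entries refreshed within the last stale_age seconds.

    entries maps (lfn, lrc) pairs to the timestamp of their last update.
    An RLI would run this pass once every rli_expire_int seconds.
    """
    return {key: ts for key, ts in entries.items() if now - ts <= stale_age}
```

With an LRC that reports a 900 second update interval, for example, entries become stale after roughly 1080 seconds regardless of the locally configured rli_expire_stale.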
Updates to an LRC (new LFNs or deleted LFNs) normally don't propagate to RLI
servers until the next softstate update (controlled by update_ll_int
and update_bf_int). However, by enabling "immediate update"
mode an LRC will send updates to an RLI within update_buftime seconds.
Immediate updates are enabled by setting update_immediate to true. If
updates are done with LFN lists then only the LFNs that have been added to or
deleted from the source server are sent; if Bloom filters are used then the
entire Bloom filter is sent.
When immediate updates are enabled the interval between softstate updates is
multiplied by update_factor as long as no updates have failed (source
and RLI are considered to be in sync). This can greatly reduce the number of
softstate updates a source needs to send to an RLI. Incremental updates are
buffered by the source server until either 100 updates have accumulated (when
LFN lists are used), or update_buftime seconds have passed since the
last update.
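The buffering rule and the stretched softstate interval can be modeled roughly as follows. This is a sketch under assumptions: the class and function names are hypothetical, time is passed in explicitly to keep the example deterministic, and the real server's bookkeeping is certainly more involved.

```python
class UpdateBuffer:
    """Buffers incremental LFN changes until a flush condition is met."""

    MAX_PENDING = 100  # count threshold quoted above for LFN-list updates

    def __init__(self, update_buftime, now):
        self.update_buftime = update_buftime
        self.pending = []
        self.last_flush = now

    def add(self, change, now):
        """Queue a change; return the batch to send if a flush is due, else None."""
        self.pending.append(change)
        return self.flush_if_due(now)

    def flush_if_due(self, now):
        full = len(self.pending) >= self.MAX_PENDING
        timed_out = now - self.last_flush >= self.update_buftime
        if self.pending and (full or timed_out):
            batch, self.pending = self.pending, []
            self.last_flush = now
            return batch
        return None


def softstate_interval(update_int, update_factor, immediate, in_sync):
    """Interval between full softstate updates to one RLI, in seconds."""
    if immediate and in_sync:
        return update_int * update_factor
    return update_int
```

For instance, softstate_interval(86400, 10, True, True) yields 864000, whereas a failed update (in_sync False) drops the interval back to 86400. The update_factor value of 10 is only an example input, not a documented default.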
A Bloom filter is an array of bits. Each LFN is hashed multiple times and the
corresponding bits in the Bloom filter are set. Querying an RLI to verify if
an LFN exists is done by performing the same hashes, and checking if the bits
in the filter are on. If any bit is off then the LFN is known not to exist;
if they're all on then all that's known is that the LFN probably exists. The
size of the Bloom filter (as a multiple of the number of LFNs) and the number
of hash functions control the false positive rate. The default values of 10 and
3 give a false positive rate of approximately 1%. The advantage of Bloom
filters is their efficiency. For example, if the LRC has 1,000,000 LFNs in its
database, of average length 20 bytes, then 20,000,000 bytes must be sent to an
RLI during a softstate update (assuming no partitioning). The RLI server must
perform 1,000,000 updates to its database to create new (LFN, LRC) mappings, or
update timestamps on existing entries. With Bloom filters only 1,250,000 bytes
are sent (10 x 1,000,000 bits / 8), and there are no database operations on
the RLI (Bloom filters are maintained entirely in memory). In one comparison, a
1,000,000 LFN update took 20 minutes when sending all the LFNs, and less than
1 second using a Bloom filter. However, as noted before, wildcard
searches of an RLI are not supported with Bloom filters.
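A minimal Bloom filter along the lines described above might look like the sketch below. The hashing scheme (SHA-256 salted with the hash index) is an assumption chosen for the example and is not the hash used by globus-rls-server. The class and function names are hypothetical, and the false positive estimate uses the standard idealized formula rather than measured behavior.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, num_lfns, ratio=10, numhash=3):
        # ratio and numhash mirror lrc_bloomfilter_ratio / lrc_bloomfilter_numhash.
        self.nbits = max(1, num_lfns * ratio)
        self.numhash = numhash
        self.bits = bytearray((self.nbits + 7) // 8)

    def _positions(self, lfn):
        # Assumed hashing scheme: one SHA-256 digest per hash index.
        for i in range(self.numhash):
            digest = hashlib.sha256(f"{i}:{lfn}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def add(self, lfn):
        for pos in self._positions(lfn):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, lfn):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(lfn))


def filter_size_bytes(num_lfns, ratio=10):
    """Bytes sent per update: ratio bits per LFN, 8 bits per byte."""
    return num_lfns * ratio // 8


def expected_false_positive_rate(ratio=10, numhash=3):
    """Idealized estimate (1 - e^(-k/ratio))^k for a filter filled to capacity."""
    return (1.0 - math.exp(-numhash / ratio)) ** numhash
```

filter_size_bytes(1_000_000) reproduces the 1,250,000 byte figure above, and the idealized estimate for the default values comes out between 1% and 2%, the same order as the approximate rate quoted in the text.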
The options that control Bloom filter updates are:
- rli_bloomfilter true|false
- RLI servers must have this set to accept Bloom filter
updates.
- rli_bloomfilter_dir none|default|pathname
- If not "none", Bloom filters are saved in this directory and read at
startup. See CONFIGURATION for details.
- lrc_bloomfilter_numhash N
- Number of hash functions, an integer from 1 to 8. The
default is 3.
- lrc_bloomfilter_ratio N
- Size of the Bloom filter as a multiple of the number of
LFNs in the LRC database. Too small a value will generate too many false
positives; too large a value wastes memory and network bandwidth.
Note that an LRC server can update some RLIs with Bloom filters, and others with
LFNs. However, an RLI server can only be updated using one method, and an RLI
acting as a source for updates can only send the type of updates that it
receives.
OPTIONS
- -b maxbackoff
- Maximum time, in seconds, that globus-rls-server
will attempt to reopen the socket it listens on after an I/O error.
- -C rlscertfile
- Name of X.509 certificate file that identifies the server.
Sets environment variable X509_USER_CERT.
- -c conffile
- Name of configuration file for server. The default is
$GLOBUS_LOCATION/etc/globus-rls-server.conf if the environment
variable GLOBUS_LOCATION is set, else
/etc/globus-rls-server.conf.
- -d
- Enable debugging. Server will not detach from controlling
terminal and log messages will be written to stdout rather than syslog.
For additional logging verbosity set loglevel (see -L option) to higher
values.
- -e rli_expire_int
- Interval (seconds) at which an RLI server should expire
stale entries.
- -F update_factor
- If update_immediate mode is on, and the source
server is in sync with an RLI server (an LRC and RLI are synced if there
have been no failed updates since the last full softstate update), then
the interval between RLI updates for this server (update_ll_int)
is multiplied by update_factor.
- -f maxfreethreads
- Maximum number of idle threads server will leave running.
Excess threads are terminated.
- -I true|false
- Turns LRC to RLI immediate update mode on or off. Default
is false.
- -i idletimeout
- Seconds after which idle client connections are timed
out.
- -K rlskeyfile
- Name of X.509 key file. Sets environment variable
X509_USER_KEY.
- -L loglevel
- Sets log level. By default this is 0, which means only
errors will be logged. Higher values mean more verbose logging. Level 1
causes logging of major events (e.g. start of full softstate update), 2
includes medium level events (e.g. writing pending updates to an RLI), 3
enables all tracing. Level 4 includes all the SQL commands executed by the
server.
- -l true|false
- Configure whether server is an LRC server. Default is
false.
- -M maxconnections
- Maximum number of active connections. Should be small
enough to prevent server from running out of open file descriptors.
Default is 100.
- -m maxthreads
- Maximum number of threads server will start up to support
simultaneous requests.
- -N
- Disable authentication checking. Intended for debugging.
Clients should use the URL RLSN://host to disable authentication on
the client side.
- -o update_buftime
- Softstate updates are buffered until either the buffer is
full or this much time has elapsed since the last update. Default is 30
seconds.
- -p pidfiledir
- Directory where pid file should be written.
- -r true|false
- Configure whether server is an RLI server. Default is
false.
- -S rli_expire_stale
- Interval after which entries in the RLI database are
considered stale (presumably because they were deleted in the LRC). Stale
entries are not returned in queries.
- -s startthreads
- Number of threads to start up initially.
- -t timeout
- Timeout (in seconds) for calls to other RLS servers (e.g. for
LRC calls to send an update to an RLI). A value of 0 disables timeouts.
The default is 30 seconds.
- -U myurl
- URL for this server.
- -u update_ll_int
- Interval (in seconds) between LFN-list LRC to RLI
updates.
- -v
- Show version and exit.
SIGNALS
The server will reread its configuration file if it receives a HUP signal. It
will wait for all current requests to complete and shut down cleanly if sent an
INT, QUIT or TERM signal.
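The signal behaviour described above follows a common daemon pattern, which can be sketched as below. This is not the server's code: the ServerState class and the flag names are hypothetical, and the sketch assumes a Unix platform where SIGHUP and SIGQUIT exist.

```python
import signal


class ServerState:
    """Flags a main loop would poll between requests (hypothetical)."""

    def __init__(self):
        self.reload_requested = False
        self.shutdown_requested = False


state = ServerState()


def handle_signal(signum, frame):
    if signum == signal.SIGHUP:
        # Reread the configuration file at the next convenient point.
        state.reload_requested = True
    elif signum in (signal.SIGINT, signal.SIGQUIT, signal.SIGTERM):
        # Let in-flight requests finish, then exit cleanly.
        state.shutdown_requested = True


def install_handlers():
    for sig in (signal.SIGHUP, signal.SIGINT, signal.SIGQUIT, signal.SIGTERM):
        signal.signal(sig, handle_signal)
```

Setting flags in the handler and acting on them in the main loop is what allows current requests to complete before a shutdown or reload takes effect.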
CONFIGURATION
If the configuration file is not specified on the command line (see the -c
option) then the server looks for it in
$GLOBUS_LOCATION/etc/globus-rls-server.conf, or
/etc/globus-rls-server.conf if GLOBUS_LOCATION is not set.
Most command line options may also be set in the configuration file; however,
command line options always override items found in the configuration file.
The configuration file is a sequence of lines consisting of a keyword,
whitespace, and a value. Comments begin with a # and end with a newline.
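The lookup order, file format, and override rule described above can be sketched as follows. The helper names are hypothetical, and two details are assumptions rather than documented behavior: everything after a # is dropped even inside a value, and when a keyword other than acl is repeated the last value is used.

```python
import os


def default_conf_path(environ=None):
    """Default configuration file location when -c is not given."""
    environ = os.environ if environ is None else environ
    globus_location = environ.get("GLOBUS_LOCATION")
    if globus_location:
        return os.path.join(globus_location, "etc", "globus-rls-server.conf")
    return "/etc/globus-rls-server.conf"


def parse_conf(text):
    """Parse 'keyword value' lines; values are collected per keyword in lists."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: # always starts a comment
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        settings.setdefault(keyword, []).append(value)
    return settings


def effective(settings, cli_options, keyword, default=None):
    """Command line options take precedence over the configuration file."""
    if keyword in cli_options:
        return cli_options[keyword]
    values = settings.get(keyword)
    return values[-1] if values else default  # assumption: last value wins
```

A short usage example with a made-up configuration fragment:

```python
sample = """
# example only
lrc_server true
port 39281   # listen port
acl .*: all
"""
settings = parse_conf(sample)
port = effective(settings, {"port": "40000"}, "port")  # command line wins
```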
- acl user: permission [permission]
- user is a regular expression matching distinguished
names (or local usernames if a gridmap file is used) of users allowed to
make calls to the server. permission is one or more of lrc_read,
lrc_update, rli_read, rli_update, admin,
stats, and all. There may be multiple acl entries;
the first match found is used to determine a user's privileges. The
admin privilege is necessary to update an LRC's list of RLIs to
send updates to. The stats privilege allows a client to read
performance statistics.
- A gridmap file may also be used to map DNs to local
usernames, which in turn are matched against the regular expressions in
the acl list to determine the user's permissions.
- acl entries may be a combination of DNs and local
usernames. If a DN is not found in the gridmap file then it is used to
search the acl list.
- authentication true|false
- Enable or disable GSI authentication. The default is true.
If authentication is enabled clients should use the URL scheme
"rls:" to connect to the server; if disabled, they should use
"rlsn:".
- db_pwd password
- Password to use to connect to MySQL server, default is
changethis.
- db_user databaseuser
- Username to use to connect to MySQL server, default is
dbperson.
- idletimeout seconds
- Seconds after which idle connections are closed, default is
900.
- loglevel N
- Sets loglevel to N (default is 0). Higher levels mean more
verbosity.
- lrc_bloomfilter_numhash N
- Number of hash functions to use in Bloom filters. The
default is 3. Possible values are 1 to 8. This value, in conjunction with
lrc_bloomfilter_ratio, will determine the number of false positives
that may be expected when querying an RLI that is updated via Bloom
filters. The default values of 3 and 10 give a false positive rate of
approximately 1%.
- lrc_bloomfilter_ratio N
- Sets ratio of Bloom filter size (in bits) to number of LFNs
in the LRC catalog. Only meaningful if Bloom filters are used to update an
RLI. The default is 10.
- lrc_dbname database
- Name of LRC database, default is lrcdb.
- lrc_server true|false
- True if LRC server, default is false.
- maxbackoff seconds
- Max seconds to wait before retrying listen in the event of
an I/O error, default is 300.
- maxfreethreads N
- Maximum number of idle threads; excess threads are killed.
Default is 5.
- maxconnections N
- Maximum number of simultaneous connections. Default is
100.
- maxthreads N
- Maximum number of threads running at one time, default is
30.
- myurl URL
- URL of server. Default is
rls://<hostname>:port
- odbcini filename
- Sets environment variable ODBCINI. If not specified, and
ODBCINI is not already set, then defaults to
$GLOBUS_LOCATION/var/odbc.ini.
- pidfiledir directory
- Directory where pid file should be written, default is
/var/run.
- port N
- Port server listens on, default is 39281.
- result_limit limit
- Sets the maximum number of results returned by a query. If
a query request includes a limit greater than this value an error
(GLOBUS_RLS_BADARG) is returned. If the query request has no limit
specified then at most result_limit records are returned by a
query. A value of zero means no limit; this is the default.
- rli_bloomfilter true|false
- If true then only Bloom filter updates are accepted from
source servers, otherwise full LFN lists are accepted. Note that if Bloom
filters are enabled then the RLI does not support wildcarded queries.
- rli_bloomfilter_dir none|default|pathname
- If an RLI is configured to accept Bloom filters
(rli_bloomfilter true) then Bloom filters may be saved to this directory
after updates. This directory is scanned when an RLI server starts up and
is used to initialize Bloom filters for each LRC that updated the RLI.
This option is useful when it is desired that the RLI recover its data
immediately after a restart rather than wait for LRCs to send another
update. If the LRCs are updating frequently this option is unnecessary,
and may be wasteful in that each Bloom filter is written to disk after
each update.
- If rli_bloomfilter_dir is set to the string
"none" then Bloom filters are not saved to disk; this is the
default. If "default" then the default directory is used, which
is $GLOBUS_LOCATION/var/rls-bloomfilters if GLOBUS_LOCATION is set, else
/tmp/rls-bloomfilters. Any other string is used as the directory name
unchanged. The Bloom filter files in this directory have the name of the
URL of the LRC that sent the Bloom filter, with slashes (/) changed to
percent signs (%), and ".bf" appended.
- rli_dbname database
- Name of RLI database, default is rlidb.
- rli_expire_int seconds
- Interval between RLI expirations of stale entries. Default
is 28800 seconds.
- rli_expire_stale seconds
- Interval after which entries in the RLI database are
considered stale (presumably because they were deleted in the LRC).
Default is 86400 seconds. Stale RLI entries are not returned in
queries.
- rli_server true|false
- True if RLI server, default is false.
- rlscertfile filename
- Name of X.509 certificate file identifying server, set by
setting environment variable X509_USER_CERT.
- rlskeyfile filename
- Name of X.509 key file for server, set by setting
environment variable X509_USER_KEY.
- startthreads N
- Number of threads to start initially, default is
3.
- timeout seconds
- Timeout (in seconds) for calls to other RLS servers (e.g. for
LRC calls to send an update to an RLI).
- update_bf_int seconds
- Interval between RLI updates when the RLI is updated by
Bloom filters. The default is 900 seconds.
- update_buftime N
- RLI updates are buffered until either the buffer is full or
this much time has elapsed since the last update. Default is 30
seconds.
- update_factor N
- If update_immediate mode is on, and the source
server is in sync with an RLI server (a source and RLI are synced if there
have been no failed updates since the last full softstate update), then
the interval between RLI updates for this server (update_ll_int)
is multiplied by update_factor.
- update_immediate true|false
- Turn LRC to RLI immediate mode updates on or off. Default
is false.
- update_ll_int seconds
- Seconds between LFN-list softstate updates, default is
86400 seconds.
- update_retry seconds
- Seconds to wait before a source server will retry to
connect to an RLI server that it needs to update. Default is 300.
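The acl matching rule described earlier in this section (gridmap lookup first, then the first matching acl entry decides the permissions) can be sketched as below. The function names are hypothetical, and two points are assumptions: the user field is split from the permissions at the last colon, and the regular expression may match anywhere in the identity rather than having to match it in full.

```python
import re


def parse_acl(value):
    """Split an acl value 'user: permission [permission]' into (regex, permissions)."""
    user, _, perms = value.rpartition(":")  # assumption: split at the last colon
    return re.compile(user.strip()), set(perms.split())


def permissions_for(dn, acl_entries, gridmap=None):
    """Return the permission set of the first acl entry matching the caller.

    If the DN is found in the gridmap its local username is matched,
    otherwise the DN itself is matched against the acl list.
    """
    identity = (gridmap or {}).get(dn, dn)
    for pattern, perms in acl_entries:
        if pattern.search(identity):  # assumption: unanchored match
            return perms
    return set()
```

A usage example with made-up names:

```python
acl = [parse_acl("alice: lrc_read lrc_update"), parse_acl(".*: lrc_read")]
gridmap = {"/O=Grid/CN=Alice Example": "alice"}
perms = permissions_for("/O=Grid/CN=Alice Example", acl, gridmap)
```

Because the first match wins, the order of acl lines matters: a catch-all pattern such as ".*" placed first would hide every entry after it.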
FILES
- $GLOBUS_LOCATION/etc/globus-rls-server.conf
- Default configuration file.
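Besides the configuration file, an RLI with rli_bloomfilter_dir set also writes Bloom filter files, using the directory and naming rules given under CONFIGURATION. Those rules can be sketched as follows. The function names and the example URL are hypothetical, and the sketch only builds path strings without touching the filesystem.

```python
import os


def bloomfilter_dir(setting, environ=None):
    """Resolve rli_bloomfilter_dir (none|default|pathname) to a directory or None."""
    environ = os.environ if environ is None else environ
    if setting == "none":
        return None  # Bloom filters are not saved to disk
    if setting == "default":
        globus_location = environ.get("GLOBUS_LOCATION")
        if globus_location:
            return os.path.join(globus_location, "var", "rls-bloomfilters")
        return "/tmp/rls-bloomfilters"
    return setting  # any other string is used as the directory name unchanged


def bloomfilter_filename(lrc_url):
    """File name for the filter sent by lrc_url: slashes become %, plus '.bf'."""
    return lrc_url.replace("/", "%") + ".bf"
```

For a made-up LRC URL such as rls://lrc.example.org:39281, the resulting file name would be rls:%%lrc.example.org:39281.bf.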