NAME
pullnews - Pull news from multiple news servers and feed it to another
SYNOPSIS
pullnews [-hnqRx] [-b fraction] [-c config] [-C width] [-d level]
    [-f fraction] [-F fakehop] [-g groups] [-G newsgroups] [-H headers]
    [-k checkpt] [-l logfile] [-m header_pats] [-M num] [-N timeout] [-O]
    [-p port] [-P hop_limit] [-Q level] [-r file] [-s to-server[:port]]
    [-S max-run] [-t retries] [-T connect-pause] [-w num]
    [-z article-pause] [-Z group-pause] [from-server ...]
REQUIREMENTS
The "Net::NNTP" module must be installed. This module is available as
part of the libnet distribution and comes with recent versions of Perl. For
older versions of Perl, you can download it from <http://www.cpan.org/>.
DESCRIPTION
pullnews reads a config file in the running user's home directory
(normally called ~/.pullnews) and connects to the upstream servers given
there as a reader client. By default, it connects to all servers listed in
the configuration file, but you can limit pullnews to specific servers by
listing them on the command line as a whitespace-separated list of server
names (each one a from-server in the synopsis). For each server it connects
to, it pulls over articles and feeds them to the destination server via the
IHAVE or POST commands. This means that the system pullnews is run on must
have feeding access to the destination news server.
pullnews is designed for very small sites that do not want to bother
setting up traditional peering and is not meant for handling large feeds.
OPTIONS
- -b fraction
- Backtrack on server numbering reset. Specify the proportion (0.0 to 1.0)
of a group's articles to pull when the server's article number is less
than our high for that group. When fraction is 1.0, pull all the
articles on a renumbered server. The default is to do nothing.
- -c config
- Normally, the config file is stored in ~/.pullnews for the user
running pullnews. If -c is given, config will be used
as the config file instead. This is useful if you're running
pullnews as a system user on an automated basis out of cron rather
than as an individual user.
See "CONFIG FILE" below for the format of this file.
- -C width
- Use width characters per line for the progress table. The default
value is 50.
- -d level
- Set the debugging level to the integer level; more debugging output
will be logged as this increases. The default value is 0.
- -f fraction
- This changes the proportion of articles to get from each group to
fraction and should be in the range 0.0 to 1.0 (1.0 being the
default).
- -F fakehop
- Prepend fakehop as a host to the Path: header of articles fed.
- -g groups
- Specify a collection of groups to get. groups is a list of
newsgroups separated by commas (only commas, no spaces). Each group must
be defined in the config file, and only the remote hosts that carry those
groups will be contacted. Note that this is a simple list of groups, not a
wildmat expression, and wildcards are not supported.
- -G newsgroups
- Add the comma-separated list of groups newsgroups to each server in
the configuration file (see also -g and -w).
- -h
- Print a usage message and exit.
- -H headers
- Remove these named headers (colon-separated list) from fed articles.
- -k checkpt
- Checkpoint (save) the config file every checkpt articles (default
is 0, that is to say at the end of the session).
- -l logfile
- Log progress/stats to logfile (default is "stdout").
- -m header_pats
- Feed an article based on header matching. The argument is a number of
whitespace-separated tuples (each tuple being a colon-separated header and
regular expression). For instance:
-m "Hdr1:regexp1 !Hdr2:regexp2"
specifies that the article will be passed only if the "Hdr1:"
header matches "regexp1" and the "Hdr2:" header does
not match "regexp2".
- -M num
- Specify the maximum number of articles (per group) to process. The default
is to process all new articles. See also -f.
- -n
- Do nothing but read articles: do not feed articles downstream, write no
rnews file, and do not update the config file.
- -N timeout
- Specify the timeout length, as timeout seconds, when establishing
an NNTP connection.
- -O
- Use an optimized mode: pullnews checks whether the article already
exists on the downstream server before downloading it. This may help with
huge articles or a slow link to upstream hosts.
- -p port
- Connect to the destination news server on a port other than the default of
119. This option does not change the port used to connect to the source
news servers.
- -P hop_limit
- Restrict feeding an article based on the number of hops it has already
made. Count the hops in the Path: header (hop_count), feeding the article
only when hop_limit is "+num" and hop_count is more than num; or hop_limit
is "-num" and hop_count is less than num.
- -q
- Print out less status information while running.
- -Q level
- Set the quietness level ("-Q 2" is equivalent to
"-q"). The higher this value, the less gets logged. The default
is 0.
- -r file
- Rather than feeding the downloaded articles to a destination server,
instead create a batch file that can later be fed to a server using
rnews. See rnews(1) for more information about the batch
file format.
- -R
- Be a reader (use MODE READER and POST commands) to the downstream server.
The default is to use the IHAVE command.
- -s to-server[:port]
- Normally, pullnews will feed the articles it retrieves to the news
server running on localhost. To connect to a different host, specify a
server with the -s flag. You can also specify the port with this
same flag or use -p.
- -S max-run
- Specify the maximum time max-run in seconds for pullnews to
run.
- -t retries
- The maximum number (retries) of attempts to connect to a server
(see also -T). The default is 0.
- -T connect-pause
- Pause connect-pause seconds between connection retries (see also
-t). The default is 1.
- -w num
- Set each group's high water mark (last received article number) to
num. If num is negative, calculate Current+num
instead (i.e. get the last num articles). Therefore, a num
of 0 will re-get all articles on the server; whereas a num of
"-0" will get no old articles, setting the water mark to
Current (the most recent article on the server).
- -x
- If the -x flag is used, an Xref: header is added to any article
that lacks one. It can be useful for instance if articles are fed to a
news server which has xrefslave set in inn.conf.
- -z article-pause
- Sleep article-pause seconds between articles. The default is
0.
- -Z group-pause
- Sleep group-pause seconds between groups. The default is 0.
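The rules given above for -w, -P, and -m can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the code pullnews itself runs (pullnews is a Perl script). The function names are hypothetical, and several details are assumptions that the text above does not settle: one hop is counted per "!" separator in the Path: header, header names are compared case-insensitively, a missing header is treated as empty, a pattern may match anywhere in the header value, and Python's regular expression syntax stands in for Perl's.

```python
import re


def high_water_mark(arg: str, current: int) -> int:
    """Resolve a -w argument against the most recent article number.

    The argument stays a string so that "-0" can be told apart from "0".
    """
    if arg.startswith("-"):
        # Negative form means Current+num: "-5" gives current - 5,
        # and "-0" gives current itself (no old articles).
        return current + int(arg)
    return int(arg)


def passes_hop_limit(path: str, hop_limit: str) -> bool:
    """Apply a -P limit ("+num" or "-num") to the body of a Path: header.

    Assumption: one hop per "!" separator; the real script may count
    hops differently.
    """
    hop_count = path.count("!")
    if hop_limit.startswith("+"):
        return hop_count > int(hop_limit[1:])
    if hop_limit.startswith("-"):
        return hop_count < int(hop_limit[1:])
    raise ValueError("hop_limit must look like +num or -num")


def passes_header_pats(headers: dict, header_pats: str) -> bool:
    """Apply a -m argument such as "Hdr1:regexp1 !Hdr2:regexp2".

    Assumptions: case-insensitive header names, a missing header counts
    as an empty string, and the pattern may match anywhere in the value.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    for item in header_pats.split():
        negate = item.startswith("!")
        name, _, pattern = (item[1:] if negate else item).partition(":")
        matched = re.search(pattern, lowered.get(name.lower(), "")) is not None
        if matched == negate:
            # A plain tuple failed to match, or a "!" tuple did match.
            return False
    return True
```

For instance, high_water_mark("-0", 500) yields 500, whereas high_water_mark("0", 500) yields 0, which mirrors the distinction drawn in the -w entry.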
CONFIG FILE
The config file for pullnews is divided into blocks, one block for each
remote server to connect to. A block begins with the host line, which must
have no leading whitespace and contains just the hostname of the remote
server, optionally followed by authentication details (username and password
for that server). Authentication details can also be provided for the
downstream server by adding a host line for it in the configuration file
with no newsgroups to fetch.
Following the host line should be one or more newsgroup lines which start with
whitespace followed by the name of a newsgroup to retrieve. Only one newsgroup
should be listed on each line.
pullnews will update the config file to include the time the group was
last checked and the highest numbered article successfully retrieved and
transferred to the destination server. It uses this data to avoid doing
duplicate work the next time it runs.
The full syntax is:
<host> [<username> <password>]
        <group> [<time> <high>]
        <group> [<time> <high>]
where the <host> line must not have leading whitespace and the
<group> lines must.
A typical configuration file would be:
# Format group date high
data.pa.vix.com
        rec.bicycles.racing 908086612 783
        rec.humor.funny 908086613 18
        comp.programming.threads
nnrp.vix.com pull sekret
        comp.std.lisp
Note that an earlier run of pullnews has filled in details about the last
articles downloaded from the two rec.* groups. The two comp.* groups were
just added by the user and have not yet been checked.
The nnrp.vix.com server requires authentication, and pullnews will use the
username "pull" and the password "sekret".
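As an illustration of the format described above, here is a minimal Python sketch of a reader for such a file. It is not how pullnews parses its config (pullnews is a Perl script); the function name parse_pullnews_config and the dictionary layout are hypothetical. Treating lines that start with "#" as comments is an assumption drawn from the typical file shown above, and group lines are assumed to be indented as the syntax requires.

```python
SAMPLE = """\
# Format group date high
data.pa.vix.com
        rec.bicycles.racing 908086612 783
        rec.humor.funny 908086613 18
        comp.programming.threads
nnrp.vix.com pull sekret
        comp.std.lisp
"""


def parse_pullnews_config(text: str) -> list:
    """Return one dict per host block, each with its list of groups."""
    servers = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines and "#" lines carry no data.
        if not stripped or stripped.startswith("#"):
            continue
        fields = line.split()
        if not line[0].isspace():
            # Host line: <host> [<username> <password>]
            current = {"host": fields[0], "username": None,
                       "password": None, "groups": []}
            if len(fields) >= 3:
                current["username"], current["password"] = fields[1], fields[2]
            servers.append(current)
        else:
            # Group line: <group> [<time> <high>]
            if current is None:
                raise ValueError("group line before any host line")
            group = {"name": fields[0], "time": None, "high": None}
            if len(fields) >= 3:
                group["time"], group["high"] = int(fields[1]), int(fields[2])
            current["groups"].append(group)
    return servers


servers = parse_pullnews_config(SAMPLE)
```

In this sketch a group that has never been checked, such as comp.std.lisp, simply carries None for both its time and its high water mark.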
FILES
- pathbin/pullnews
- The Perl script itself used to pull news from upstream servers and feed it
to another news server.
- ~/.pullnews
- The default config file. It is in the running user's home directory.
HISTORY
pullnews was written by James Brister for INN. The documentation was
rewritten in POD by Russ Allbery <rra@stanford.edu>.
Geraint A. Edwards greatly improved pullnews, adding 16 new recognized
flags, fixing some bugs, and integrating the backupfeed contrib script by
Kai Henningsen, which added 6 more flags.
$Id: pullnews.pod 9504 2013-07-08 19:28:15Z iulius $
SEE ALSO
incoming.conf(5), rnews(1).