.\"
.\" World Wide Web Package
.\" WWW.3
.\"
.\" Copyright (C) 1998 Paul J. Lucas
.\"
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program; if not, write to the Free Software
.\" Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
.\"
.\" ---------------------------------------------------------------------------
.\" define code-start macro
.de cS
.sp
.nf
.RS 5
.ft CW
.ta .5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i 5i 5.5i
..
.\" define code-end macro
.de cE
.ft 1
.RE
.fi
.if !'\\$1'0' .sp
..
.\" ---------------------------------------------------------------------------
.tr ~
.TH \f3WWW\f1 3 "February 12, 2000" "WWW"
.SH NAME
WWW \- World Wide Web Package
.SH SYNOPSIS
.ft CW
.nf
extract_description( \f2FILE\fP )
extract_meta( \f2FILE\fP, \f2NAME\fP )
hyperlink( \f2LIST\fP )
.fi
.ft 1
.SH DESCRIPTION
This package provides utility functions for the World Wide Web
that extract descriptions or meta information from files
and hyperlink text.
.SH SUBROUTINES
The following Perl subroutines are defined and available:
.IP "\f(CWextract_description( \f2FILE\fP )\f1"
Extracts a description from an HTML or plain text file given by the
.I FILE
name;
.I FILE
should be an absolute path.
The first \f(CW$description::chars\f1 (default: 2048) characters are read.
If the file name ends in one of the extensions
\f(CWhtm\f1, \f(CWhtml\f1, or \f(CWshtml\f1,
it is presumed to be an HTML file;
if the file name ends in \f(CWtxt\f1, it is presumed to be a plain text file.
Other extensions are not recognized and no description is returned for them.
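.IP ""
The extension test above can be sketched as follows.
This is only an illustration, not the package's actual code:
the \f(CWfile_type\f1 subroutine and the \f(CW%type_of\f1 table
are hypothetical names, and matching the extension case-sensitively
is an assumption.
.cS
# Map each recognized extension to a presumed file type.
my %type_of = (
    htm => 'html', html => 'html', shtml => 'html',
    txt => 'text',
);

# Return 'html', 'text', or undef for an unrecognized extension.
sub file_type {
    my $file = shift;
    my $dot = rindex( $file, '.' );
    return undef if $dot < 0;       # no extension at all
    return $type_of{ substr( $file, $dot + 1 ) };
}
.cE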
.IP ""
For HTML files, first,
if a \f(CWMETA\f1 description element
or its Dublin Core equivalent is found,
then the words specified as the value of the \f(CWCONTENT\f1 attribute
are returned as the description.
.IP ""
Otherwise, all HTML comments, text between
\f(CW