.TH "HXWLS" "1" "10 Jul 2011" "7.x" "HTML-XML-utils"

.de d \" begin display
.sp
.in +4
.nf
.ft CR
.CDS
..
.de e \" end display
.CDE
.in -4
.fi
.ft R
.sp
..

.SH NAME
hxwls \- list links in an HTML file
.SH SYNOPSIS
.B hxwls
.RB "[\| " \-l " \|]"
.RB "[\| " \-t " \|]"
.RB "[\| " \-r " \|]"
.RB "[\| " \-h " \|]"
.RB "[\| " \-a " \|]"
.RB "[\| " \-b
.IR " base" " \|]"
.RI "[\| " file " \|]"
.SH DESCRIPTION
.LP
The
.B hxwls
command reads an HTML file (standard input by default) and prints
all links it finds on standard output. Unless
.B \-r
is given, relative URLs are converted to absolute URLs.
.SH OPTIONS
The following options are supported:
.TP 10
.B \-l
Produce a long listing. Instead of just the URI,
.B hxwls
prints three columns: the element name, the value of the REL
attribute, and the target URI.
.TP
.B \-t
Produce a tuple listing.
.B hxwls
prints four columns: the URI of the document itself, the element name,
the value of the REL attribute, and the target URI.
.TP
.B \-r
Print relative URLs as they are, without converting them to absolute
URLs.
.TP
.BI \-b " base"
Use
.I base
as the initial base URL. If the document contains a <base> element,
that element overrides the \-b option.
.TP
.B \-h
Output as HTML. Each link is written as an <a> element.
.TP
.B \-a
Convert any IRIs (Internationalized Resource Identifiers) to
ASCII-only URIs. This causes any non-ASCII characters in the path of a
URI to be encoded as %-escaped octets and non-ASCII characters in the
domain name as punycode. (Punycode encoding is only available if
.B hxwls
is compiled with libidn support.)
.SH OPERANDS
The following operand is supported:
.TP 10
.I file
The name or the URL of an HTML file. If absent, standard input is read instead.
.SH "DIAGNOSTICS"
The following exit values are returned:
.TP 10
.B 0
Successful completion.
.TP
.B > 0
An error occurred while parsing the HTML file.
.B hxwls
tries to recover from the error and still produces output.
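.SH EXAMPLES
The following invocations are an illustrative sketch of how the options
described above combine, not a transcript of a real run. The file name
.I index.html
and the base URL
.I http://example.org/
are placeholders chosen for illustration, and the output depends
entirely on the document that is read.
.d
# List every link target found in a local file
hxwls index.html
# Long listing, resolving relative links against a given base
hxwls \-l \-b http://example.org/ index.html
# Read from standard input and keep relative URLs as written
hxwls \-r < index.html
.e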
.SH "SEE ALSO"
.BR asc2xml (1),
.BR hxnormalize (1),
.BR hxnum (1),
.BR xml2asc (1)