NAME
html2ps - convert HTML to PostScript
SYNOPSIS
html2ps [ -2cdDFghHLnORtTuUv ] [ -b URL ] [ -C string ] [ -e encoding ]
[ -f file[:file[:...]] ] [ -i num ] [ -k file ] [ -l lang ] [ -m num ]
[ -M num ] [ -N num ] [ -o file ] [ -r path ] [ -s num ] [ -S string ]
[ -W string ] [ -x num ] [ URL|file ]
DESCRIPTION
The program html2ps converts HTML to PostScript. The HTML code can be
retrieved from one or more URLs or local files, specified as parameters on
the command line. If no parameter is given, html2ps reads from standard input.
Note: To avoid unnecessary network traffic, one can rebuild an already generated
PostScript file with new options. This is done by running html2ps with the new
options, and with the old PostScript file as input (not applicable for all
options).
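The two-step workflow in the note can be sketched as a pair of command lines. The Python snippet below only builds the argument lists and prints them; nothing is executed. The URL, the file names, and the helper html2ps_command are made up for illustration, and the assumption that -2 and -L are among the options that can be applied to an already generated PostScript file is not confirmed by this page.

```python
import shlex

def html2ps_command(inputs, output=None, flags=""):
    """Build an html2ps argument list (hypothetical helper, runs nothing)."""
    cmd = ["html2ps"]
    if flags:
        cmd.append("-" + flags)   # bundled single-letter options, e.g. "2L"
    if output:
        cmd += ["-o", output]     # -o file: write the PostScript code to file
    cmd += list(inputs)           # URLs or local files to convert
    return cmd

# First run: retrieve the document once and convert it.
first = html2ps_command(["http://example.org/doc.html"], output="doc.ps")

# Second run: use the old PostScript file as input, with new options.
# Assumption: -2 and -L can be applied when reformatting.
second = html2ps_command(["doc.ps"], output="doc-2up.ps", flags="2L")

print(shlex.join(first))
print(shlex.join(second))
```

The second command reads doc.ps instead of the original URL, so no network access is needed for the reformatting step.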
OPTIONS
All options have a short (case-sensitive) form and a long (case-insensitive) form.
- -2 --twoup
- Two column (2-up) output. The default is one column per
page.
- -b URL --base URL
- Use URL as a base to expand relative references for
in-line images. This is useful if you have downloaded a document to a
local file. The URL should then be the URL of the original
document.
- -c --check
- Check the syntax of the HTML file (using an external syntax
checker). The default is not to check the syntax.
- -C string --toc string
- Generate a table of contents (ToC). The value should be a
string consisting of one of the letters 'f', 'h', or 't', optionally
combined with the letter 'b':
- b
- The ToC will be printed first. This requires that
Ghostscript is installed.
- f
- The ToC will be generated from the links in the converted
document.
- h
- The ToC will be generated from headings and titles in the
converted documents. Note that if the document author for some strange
reason has chosen to use some other means to represent the headings than
the HTML elements H1,...,H6, you are out of luck!
- t
- The ToC will be generated from links having the attribute
rev=TOC in the converted document.
- -d --debug
- Generate debugging information. You should always use this
option when reporting problems with html2ps.
- -D --dsc --DSC
- Generate DSC compliant PostScript. This requires
Ghostscript and can take quite some time to do. Note that a PostScript
file generated with this option cannot be used as input to html2ps for
reformatting later.
- -e encoding --encoding encoding
- The document encoding. Currently recognized values are
ISO-8859-1, EUC-JP, SHIFT-JIS, and ISO-2022-JP (other EUC-xx encodings may
also work). The default is ISO-8859-1.
- -f file[:file[:...]] --rcfile file[:file[:...]]
- A colon-separated list of configuration file names to use
instead of the default personal configuration file $HOME/.html2psrc.
Definitions made in one file override definitions in previous files (the
last file in the list has the highest precedence). An empty file name (as in
':file', 'file1::file3', or 'file:') expands to the default personal
file. The environment variable HTML2PSPATH specifies the directories in
which to search for these files. (Note: this option is only meant to be
used on the command line, not in a configuration file.)
- -F --frame
- Draw a frame around the text on each page. The default is
to not draw a frame.
- -g --grayscale
- Convert colour images to grayscale images. Note that the
PostScript file will be smaller when the images are converted to
grayscale. The default is to generate colour images.
- -h --help
- Show usage information.
- -H --hyphenate
- Hyphenate the text. This requires TeX hyphenation pattern
files.
- -i num --scaleimage num
- Scale in-line images by a factor num. The default
is 1.
- -k file --cookie file
- Enable cookie support, using a Netscape-formatted cookie
file (requires libwww-perl).
- -l lang --language lang
- Specifies the language of the document (overrides any
LANG attribute of the BODY element). The language should be given
according to RFC1766 (ftp://ftp.nordu.net/rfc/rfc1766.txt) and ISO 639
(http://www.w3.org/WAI/ER/IG/ert/iso639.htm).
- -L --landscape
- Generate code for printing in landscape mode. The default
is portrait mode.
- -m num --scalemath num
- Scale mathematical formulas by a factor num. The
default is 1.
- -M num --mainchapter num
- Specifies the start number for automatic numbering of
headings (by setting the seq-number parameter). The default is 1.
- -n --number
- Insert page numbers. The default is to not number the
pages.
- -N num --startno num
- Specifies the starting page number. The default is 1.
- -o file --output file
- Write the PostScript code to file. The default is to
write to standard output.
- -O --original
- Use PostScript original images if they exist. For example,
if a document contains an image figure.gif, and an encapsulated PostScript
file named figure.ps exists in the same directory, that file will be used
instead. This only works for documents read as local files. Note: if the
PostScript file is large or contains bitmap images, this must be combined
with the -D option. In HTML 4.0 this can be achieved in a much better way
with:
<OBJECT data="figure.ps"
type="application/postscript">
<OBJECT data="figure.gif" type="image/gif">
<PRE>[Maybe some ASCII art for text browsers]</PRE>
</OBJECT>
</OBJECT>
- -r path --rootdir path
- When a document is read from a local file, this value
specifies a base directory for resolving relative links starting with
"/". Typically, this should be the directory where your web
server's home page resides.
- -R --xref
- Insert cross references at every link that points to a
document within the set of converted documents.
- -s num --scaledoc num
- Scale the entire document by a factor num. The
default is 1.
- -S string --style string
- This option complements/overrides definitions made in the
configuration files. The string must follow the configuration file
syntax. (Note: this is only supposed to be used on the command line, not
in a configuration file.)
- -t --titlepage
- Generate a title page. The default is to not generate
one.
- -T --text
- Text mode, ignore images. The default is to include the
images.
- -u --underline
- Underline text that constitutes a hypertext link. The
default is to not underline.
- -U --colour
- Produce colour output for text and background, when
specified. The default is black text on white background (mnemonic: coloUr
;-).
- -v --version
- Print information about the current version of
html2ps.
- -W string --web string
- Process a web of documents by recursively retrieving and
converting documents that are referenced with hyperlinks. When dealing with
remote documents it will of course be necessary to impose restrictions, to
avoid downloading the entire web... The value should be a string
consisting of one of the letters 'a', 'b', 'l', 'r', or 's', optionally
combined with a combination of the letters 'p', 'L', and a positive
integer:
- a
- Follow all links.
- b
- Follow only links to within the same directory, or below,
as the start document.
- l
- Follow only links specified with "<LINK
rel=NEXT>" in the document.
- p
- Prompt for each remote document. This mode will
automatically be entered after the first 50 documents.
- r
- Follow only relative links.
- s
- Follow only links to within the same server as the start
document.
- L
- With this option, the order in which the documents are
processed will be: first all top level documents, then the documents
linked to from these etc. For example, if the document A has links to B
and C, and B has a link to D, the order will be A-B-C-D. By default, each
document will be followed by the first document it links to etc; so the
default order for the example is A-B-D-C.
- #
- A positive integer giving the number of recursive levels.
The default is 4 (when the option is present).
- -x num --duplex num
- Generate PostScript code for single or double sided
printing. There is no default; valid values are:
- 0
- Single sided.
- 1
- Double sided.
- 2
- Double sided, opposite page reversed (tumble mode).
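The two processing orders described for the 'L' flag of -W appear to correspond to a level-by-level (breadth-first) walk and a depth-first walk of the link graph. The following Python sketch reproduces the A, B, C, D example from that description. The LINKS table and both function names are illustrative only and say nothing about how html2ps is implemented internally.

```python
from collections import deque

# Link graph from the example: A links to B and C, and B links to D.
LINKS = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

def depth_first(start, links):
    """Default order: each document is followed by the first one it links to."""
    order, seen = [], set()

    def visit(doc):
        if doc in seen:
            return
        seen.add(doc)
        order.append(doc)
        for target in links.get(doc, []):
            visit(target)

    visit(start)
    return order

def level_order(start, links):
    """Order with 'L': all documents on one level before the next level."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        doc = queue.popleft()
        order.append(doc)
        for target in links.get(doc, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

print("-".join(depth_first("A", LINKS)))   # A-B-D-C
print("-".join(level_order("A", LINKS)))   # A-B-C-D
```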
BUGS
(This is incomplete.)
The CELLSPACING attribute of the TABLE element is not implemented as described
in the specification; instead the value of the CELLPADDING attribute is
increased by half the value of CELLSPACING.
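As a small worked example of the CELLSPACING behaviour described above, the padding that ends up being used can be computed as follows. The function name is hypothetical, and the unit of the values is whatever the HTML attributes specify.

```python
def effective_cellpadding(cellpadding, cellspacing):
    """CELLPADDING increased by half the value of CELLSPACING."""
    return cellpadding + cellspacing / 2

# CELLPADDING=2 with CELLSPACING=4 is treated as a padding of 2 + 4/2.
print(effective_cellpadding(2, 4))   # 4.0
```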
Rendering HTML tables well is a non-trivial task. For "real" tables,
that is, representations of tabular data, html2ps usually generates reasonably
good output. When tables are used for layout purposes, the result varies from
good to useless. This is because a table cell is never broken across pages. So
if a table contains a cell with a lot of content, the entire table may have to
be scaled down in size in order to make this cell fit on a single page.
Sometimes this may even result in unreadable output.
Page breaks are occasionally done in bad places: for example directly after a
(long) heading, and before the last line in a paragraph.
ENVIRONMENT
- HTML2PSPATH
- This variable specifies the directories to search for
configuration files. It should be a colon-separated list of directory
names. Use a dot '.' to denote the current directory. An empty directory
name (as in ':dir', 'dir1::dir3', or 'dir:') will expand to the directory
where the global configuration file is. The default value is '.:', that
is: search the current directory first, and then the global one.
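The expansion rule for HTML2PSPATH can be illustrated with a short Python sketch. The function name and the GLOBAL_DIR value are hypothetical: the actual location of the global configuration file depends on the installation and is not given here.

```python
def search_dirs(value, global_dir):
    """Expand an HTML2PSPATH-style value into an ordered directory list.

    An empty entry stands for the directory of the global configuration file.
    """
    return [part if part else global_dir for part in value.split(":")]

# Hypothetical location of the global configuration file.
GLOBAL_DIR = "/usr/local/lib/html2ps"

print(search_dirs(".:", GLOBAL_DIR))          # default value
print(search_dirs("dir1::dir3", GLOBAL_DIR))  # empty entry in the middle
```

With the default value '.:' the result lists the current directory first and the global directory second, which matches the search order described above.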
FILES
- $HOME/.html2psrc
- User configuration file, see html2psrc(5).
SEE ALSO
html2psrc(5), perl(1), setlocale(3), strftime(3), weblint(1)
VERSION
This manpage describes html2ps version 1.0 beta7.
AVAILABILITY
http://user.it.uu.se/~jan/html2ps.html
AUTHOR
Jan Karrman (jan@it.uu.se)