NAME¶
sgml-spell-checker - SGML spell checker
SYNOPSIS¶
nsgmls -l yourdoc.sgml | sgml-spell-checker [option]...
DESCRIPTION¶
sgml-spell-checker is a tool that you can use to automatically
spell-check your SGML documents. One of the advantages of this tool over some
other SGML-aware spell checkers is that it scans your documents in the form in
which the SGML parser actually sees them, which means it is not line-based,
system entities are resolved, marked sections are treated appropriately, etc.
Also, this tool can be made aware of particular DTDs, in the sense that it knows
not to spell-check the content of elements that do not represent
human-language text, such as <programlisting> in DocBook. An exclusion
list for the DocBook DTD is included; others can be added trivially.
The input to sgml-spell-checker is the text representation of your SGML
document's Element Structure Information Set as generated by nsgmls (from SP or
OpenSP; sometimes installed under the name onsgmls). In other words, you need
to pipe the output of nsgmls into sgml-spell-checker as shown in the synopsis.
Pass nsgmls the options you need, such as -c to search more catalogs, -i to
include a marked section, or additional source files. Do not forget the -l
option, or you won't get any file or line references for the misspellings.
The second part of the pipe takes a couple of options; see below. Note that if
the language of the document does not match your system's locale settings, you
need to use the --language option.
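
Because the parser may be installed as either nsgmls or onsgmls, a wrapper script can pick whichever one is available. The following is a minimal sketch under that assumption; the function name pick_parser is hypothetical and not part of sgml-spell-checker.

```shell
# Print the name of the first SGML parser found on PATH.
# pick_parser is a hypothetical helper, not part of sgml-spell-checker.
pick_parser() {
    for candidate in nsgmls onsgmls; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Neither name was found; report on stderr and signal failure.
    echo "neither nsgmls nor onsgmls found" >&2
    return 1
}
```

It could then be used in place of the literal parser name, for instance as "$(pick_parser)" -l yourdoc.sgml | sgml-spell-checker --language=en.
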
The output of sgml-spell-checker is a list of the words that are misspelled
(in the opinion of aspell), together with file name and line number. Note that
the line number designates where the element that contains the word started,
not where the word actually is. So most likely you will have to search a few
lines below the indicated location.
OPTIONS¶
- --debug
- Debug mode. Generates lots of output not of interest to the
normal user.
- --language=language
- Sets the language of the document. (The format depends on
the aspell installation, but something like en or
en_US should work.) By default the language is taken from the
system locale settings.
- --suggestions
- Shows correction suggestions for misspelled words.
- --dictionary=file
- Uses an additional aspell dictionary file. This
option may be used multiple times.
- --dtd=dtd
- Uses the exclusion list for the specified DTD (e.g.,
docbook).
- --help
- Shows a brief help message, then exits.
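
Since --dictionary may be given multiple times, a script that works with a varying set of dictionary files can generate the repeated options. This is only a sketch; the function name build_dictionary_options is hypothetical.

```shell
# Print one --dictionary=FILE option per line for every file name given.
# build_dictionary_options is a hypothetical helper for illustration only.
build_dictionary_options() {
    for dict in "$@"; do
        printf '%s\n' "--dictionary=$dict"
    done
}
```

The output can be placed on the sgml-spell-checker command line through an unquoted command substitution, e.g. sgml-spell-checker $(build_dictionary_options mydict1.aspell mydict2.aspell). This relies on the shell's word splitting, so it only works for file names without whitespace or wildcard characters.
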
EXAMPLES¶
nsgmls -l -D . mydoc.sgml | \
sgml-spell-checker --language=en --dtd=docbook \
--dictionary=mydict1.aspell --dictionary=mydict2.aspell
(You can enter this command all on one line without the backslashes, or on
several lines with the backslashes.)
NOTES¶
Read the aspell documentation about how to set up the appropriate
dictionaries. In case you're having trouble interpreting the aspell
documentation, here's how to make an aspell dictionary file from a flat
word list:
rm -f mydict1.aspell # aspell won't overwrite existing files
aspell --language-tag=xx create master ./mydict1.aspell < mywordlist.txt
Watch the slashes. aspell likes to see a slash in the name or it will
search some default location.
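
In a script, the slash rule can be enforced by prefixing bare file names with ./ before passing them to aspell. A minimal sketch; the function name with_slash is hypothetical.

```shell
# Print the given file name, prefixed with ./ if it contains no slash,
# so that the result always names an explicit location.
# with_slash is a hypothetical helper for illustration only.
with_slash() {
    case "$1" in
        */*) printf '%s\n' "$1" ;;    # already contains a slash
        *)   printf './%s\n' "$1" ;;  # bare name: anchor it to the current directory
    esac
}
```

The result can be used wherever the dictionary file name is passed to aspell, e.g. "$(with_slash mydict1.aspell)" in place of ./mydict1.aspell.
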
BUGS¶
This program should be able to identify the language from the document (e.g.,
<book lang="de">), but aspell doesn't handle changing the language on the fly.
AUTHOR¶
Peter Eisentraut (peter_e@gmx.net)