NAME
Locale::Po4a::Sgml - convert SGML documents from/to PO files
DESCRIPTION
The goal of the po4a (PO for anything) project is to ease translation (and,
more interestingly, the maintenance of translations) using gettext tools in
areas where they were not expected, such as documentation.
Locale::Po4a::Sgml is a module to help the translation of documentation in the
SGML format into other [human] languages.
This module uses nsgmls to parse the SGML files. Make sure it is
installed. Also make sure that the DTDs of the SGML files are installed on the
system.
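The first prerequisite can be checked before running the module by looking for the parser on PATH. The sketch below is illustrative Python and not part of po4a; the helper name find_sgml_parser is hypothetical, and the fallback name onsgmls (the OpenSP variant) is an assumption about how the parser may be packaged on a given system.

```python
import shutil
from typing import Optional, Sequence


def find_sgml_parser(
    candidates: Sequence[str] = ("nsgmls", "onsgmls"),
) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, or None.

    The candidate names are assumptions; adjust them to your system.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    parser = find_sgml_parser()
    if parser is None:
        print("No SGML parser found; install nsgmls before using this module.")
    else:
        print(f"SGML parser found at {parser}")
```

This only verifies that a parser executable exists. It does not check that the DTDs used by the documents are installed.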
OPTIONS ACCEPTED BY THIS MODULE
- debug
- Space separated list of keywords indicating which part you
want to debug. Possible values are: tag, generic, entities and refs.
- verbose
- Give more information about what's going on.
- translate
- Space separated list of extra tags (besides the ones provided
by the DTD) whose content should form an extra msgid.
- section
- Space separated list of extra tags (besides the ones provided
by the DTD) containing other tags, some of them being of category
translate.
- indent
- Space separated list of tags which increase the indentation
level.
- verbatim
- The layout within those tags should not be changed. The
paragraph won't get wrapped, and no extra indentation space or new line
will be added for cosmetic purposes.
- empty
- Tags not needing to be closed.
- ignore
- Tags ignored and treated as plain character data by po4a,
which means that they can be part of a msgid. For example, <b>
is a good candidate for this category, since putting it in the translate
category would create msgids that are not whole sentences, which is bad.
- attributes
- A space separated list of attributes that need to be
translated. You can specify an attribute by its name (for example,
"lang"), but you can also prefix it with a tag hierarchy to
specify that the attribute will only be translated when it is inside the
specified tag. For example: <bbb><aaa>lang specifies that the
lang attribute will only be translated if it is in an <aaa> tag
which is in a <bbb> tag. The tag names are actually regular
expressions, so you can also write things like <aaa|bbb>lang to only
translate lang attributes that are in an <aaa> or a <bbb>
tag.
- qualify
- A space separated list of attributes for which the
translation must be qualified by the attribute name. Note that this
setting automatically adds the given attribute into the 'attributes' list
too.
- force
- Proceed even if the DTD is unknown or if nsgmls finds
errors in the input file.
- include-all
- By default, msgids containing only one entity (like
'&version;') are skipped for the translator's comfort. Activating this
option prevents this optimisation. It can be useful if the document
contains a construction like
"<title>&Aacute;</title>", even though I doubt such
things ever happen...
- ignore-inclusion
- Space separated list of entities that won't be inlined. Use
this option with caution: it may cause nsgmls (used internally) to add
tags and render the output document invalid.
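To make the tag-hierarchy syntax of the attributes option more concrete, here is a minimal sketch in Python of how such a specification could be matched against the currently open tags. It is not the module's actual implementation: the function names are hypothetical, and it assumes that the listed tags must be the innermost open tags, directly nested, and that each tag pattern must match a whole tag name.

```python
import re
from typing import List, Tuple


def parse_attribute_spec(spec: str) -> Tuple[List[str], str]:
    """Split a spec such as '<bbb><aaa>lang' into (['bbb', 'aaa'], 'lang')."""
    tag_patterns = re.findall(r"<([^>]+)>", spec)
    attribute = re.sub(r"<[^>]+>", "", spec)
    return tag_patterns, attribute


def attribute_is_translated(spec: str, open_tags: List[str], attr: str) -> bool:
    """Decide whether `attr` matches `spec` given the open tags.

    `open_tags` lists the open tags from outermost to innermost.
    Assumption: the tags in the spec must be the innermost open tags.
    """
    tag_patterns, wanted = parse_attribute_spec(spec)
    if attr != wanted:
        return False
    if len(tag_patterns) > len(open_tags):
        return False
    # Compare each pattern with the corresponding innermost open tag.
    tail = open_tags[len(open_tags) - len(tag_patterns):]
    return all(
        re.fullmatch(pattern, tag) is not None
        for pattern, tag in zip(tag_patterns, tail)
    )


if __name__ == "__main__":
    print(attribute_is_translated("<bbb><aaa>lang", ["doc", "bbb", "aaa"], "lang"))
    print(attribute_is_translated("<aaa|bbb>lang", ["doc", "ccc"], "lang"))
```

A bare attribute name such as "lang" carries no tag patterns, so in this sketch it matches in any context.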
STATUS OF THIS MODULE
The result is perfect. I.e., the generated documents are exactly the same as
the originals. But there are still some problems:
- The error output of nsgmls is redirected to /dev/null,
which is clearly bad. I don't know how to avoid that.
The problem is that I have to "protect" the conditional inclusions
(i.e. the "<! [ %foo [" and "]]>" stuff) from
nsgmls. Otherwise nsgmls eats them, and I don't know how to restore them
in the final document. To prevent that, I rewrite them to
"{PO4A-beg-foo}" and "{PO4A-end}".
The problem with this is that the "{PO4A-end}" markers and such that I add are
invalid in the document (they are not in a <p> tag or so).
Everything works well with nsgmls's output redirected that way, but it will
prevent us from detecting that the document is badly formatted.
- It works only with the DebianDoc and DocBook DTDs.
Adding support for a new DTD should be very easy. The mechanism is the
same for every DTD; you just have to give a list of the existing tags and
some of their characteristics.
I agree, this needs some more documentation, but it is still considered as
beta, and I hate to document stuff which may/will change.
- Warning: support for DTDs is quite experimental. I did not
read any reference manual to find the definition of every tag. I added
tag definitions to the module until it worked for some documents I found on
the net. If your document uses more tags than mine, it won't work. But as I
said above, fixing that should be quite easy.
I tested DocBook against the SAG (System Administrator Guide) only, but
this document is quite big, and should use most of the DocBook
features.
For DebianDoc, I tested some of the manuals from the DDP, but not all
yet.
- In case of file inclusion, the string references of messages in
PO files (i.e. lines like "#: en/titletoc.sgml:9460") will be
wrong.
This is because I preprocess the file to protect the conditional inclusions
(i.e. the "<! [ %foo [" and "]]>" stuff) and
some entities (like &version;) from nsgmls, because I want them to appear
verbatim in the generated document. For that, I make a temp copy of the
input file and make all the changes I want to it before passing it to
nsgmls for parsing.
For this to work, I replace the entities asking for a file inclusion with the
content of the given file (so that I can also protect what needs protecting in
the subfile). But nothing is done so far to correct the references (i.e.,
filename and line number) afterward. I'm not sure what the best thing to
do is.
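The protection of conditional inclusions described above can be illustrated with a short sketch. This is illustrative Python, not the module's own code: the function names protect and restore are hypothetical, the regular expression is a simplification, and the sketch naively treats every "]]>" as the end of a conditional inclusion (a real document may also contain other marked sections).

```python
import re

# Start of a conditional inclusion, e.g. "<![ %foo; [" or "<! [ %foo [".
_OPEN = re.compile(r"<!\s*\[\s*%([\w.-]+);?\s*\[")
# Placeholder written in place of the opening delimiter.
_BEGIN_MARK = re.compile(r"\{PO4A-beg-([\w.-]+)\}")


def protect(sgml: str) -> str:
    """Hide conditional inclusions from the parser behind placeholders."""
    protected = _OPEN.sub(lambda m: "{PO4A-beg-" + m.group(1) + "}", sgml)
    return protected.replace("]]>", "{PO4A-end}")


def restore(text: str) -> str:
    """Turn the placeholders back into conditional inclusion delimiters.

    Assumption: the delimiters are re-emitted in one canonical spelling,
    so the original whitespace inside them is not preserved.
    """
    restored = _BEGIN_MARK.sub(lambda m: "<![ %" + m.group(1) + "; [", text)
    return restored.replace("{PO4A-end}", "]]>")


if __name__ == "__main__":
    source = "<![ %foo; [<p>Only in the foo variant.</p>]]>"
    hidden = protect(source)
    print(hidden)
    print(restore(hidden))
```

Because the placeholders are plain text outside of any paragraph tag, a validating parser is expected to complain about them, which is the reason given above for discarding the error output.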
AUTHORS
This module is an adapted version of sgmlspl (SGML postprocessor for the SGMLS
and NSGMLS parsers) which was:
Copyright (c) 1995 by David Megginson <dmeggins@aix1.uottawa.ca>
The adaptation for po4a was done by:
Denis Barbier <barbier@linuxfr.org>
Martin Quinson (mquinson#debian.org)
COPYRIGHT AND LICENSE
Copyright (c) 1995 by David Megginson <dmeggins@aix1.uottawa.ca>
Copyright 2002, 2003, 2004, 2005 by SPI, inc.
This program is free software; you may redistribute it and/or modify it under
the terms of the GPL (see the COPYING file).