NAME
"LaTeXML::Core::Document" - represents an XML document under
construction.
DESCRIPTION
A "LaTeXML::Core::Document" represents an XML document being
constructed by LaTeXML, and also provides the methods for constructing it. It
extends LaTeXML::Common::Object.
LaTeXML will have digested the source material, resulting in a
LaTeXML::Core::List (from a LaTeXML::Core::Stomach) of LaTeXML::Core::Box
objects, LaTeXML::Core::Whatsit objects and sublists. At this stage, a document
is created and it is responsible for `absorbing' the digested material.
Generally, the LaTeXML::Core::Box and LaTeXML::Core::List objects create text
nodes, whereas the LaTeXML::Core::Whatsit objects create "XML" document
fragments, elements and attributes according to the defining
LaTeXML::Core::Definition::Constructor.
Most document construction occurs at a current insertion point where material
will be added, and which moves along with the inserted material. The
LaTeXML::Common::Model, derived from various declarations and the document
type, is consulted to determine whether an insertion is allowed and when
elements may need to be automatically opened or closed in order to carry out a
given insertion. For example, an open "subsection" element will typically be
closed automatically when an attempt is made to open a "section" element.
In the methods described here, the term $qname is used for XML qualified names.
These are tag names with a namespace prefix. The prefix should be one
registered with the current Model, for use within the code. This prefix is not
necessarily the same as the one used in any DTD, but should be mapped to the
Namespace URI that was registered for the DTD.
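As a small illustration of the prefix:localname form, the following sketch
splits a qualified name into its two parts. The helper split_qname is
hypothetical and not part of LaTeXML, and "ltx:section" is used only as an
example value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaTeXML): split a qualified name of the
# form prefix:localname into its prefix and local name.
sub split_qname {
    my ($qname) = @_;
    if ($qname =~ /^([^:]+):(.+)$/) {
        return ($1, $2);
    }
    return (undef, $qname);    # no prefix present
}

my ($prefix, $local) = split_qname('ltx:section');
print "prefix=$prefix local=$local\n";
```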
Arguments named $node are XML::LibXML nodes.
The methods here are grouped into three sections covering basic access to the
document, insertion methods at the current insertion point, and less commonly
used, lower-level, document manipulation methods.
Accessors
- "$doc = $document->getDocument;"
- Returns the "XML::LibXML::Document" currently being
constructed.
- "$model = $document->getModel;"
- Returns the "LaTeXML::Common::Model" that represents the
document model used for this document.
- "$node = $document->getNode;"
- Returns the node at the current insertion point during
construction. This node is considered still to be `open'; any insertions
will go into it (if possible). The node will be an
"XML::LibXML::Element", "XML::LibXML::Text" or,
initially, "XML::LibXML::Document".
- "$node = $document->getElement;"
- Returns the closest ancestor to the current insertion point that is an
Element.
- "@nodes = $document->getChildElements($node);"
- Returns a list of the child elements, if any, of the $node.
- "$node = $document->getLastChildElement($node);"
- Returns the last child element of the $node, if it has one, else
undef.
- "$node = $document->getFirstChildElement($node);"
- Returns the first child element of the $node, if it has one, else
undef.
- "@nodes = $document->findnodes($xpath,$node);"
- Returns a list of nodes matching the given $xpath expression. The
context node for $xpath is $node, if given, otherwise it is the
document element.
- "$node = $document->findnode($xpath,$node);"
- Returns the first node matching the given $xpath expression. The
context node for $xpath is $node, if given, otherwise it is the
document element.
- "$qname = $document->getNodeQName($node);"
- Returns the qualified name (localname with namespace prefix) of the given
$node. The namespace prefix mapping is the code mapping of the current
document model.
- "$boolean = $document->canContain($tag,$child);"
- Returns whether an element $tag can contain a child $child. $tag and
$child can be nodes, qualified names of nodes (prefix:localname), or one
of a set of special symbols "#PCDATA", "#Comment",
"#Document" or "#ProcessingInstruction".
- "$boolean = $document->canContainIndirect($tag,$child);"
- Returns whether an element $tag can contain a child $child either
directly, or after automatically opening one or more autoOpen-able
elements.
- "$boolean = $document->canContainSomehow($tag,$child);"
- Returns whether an element $tag can contain a child $child either
directly, or after automatically opening one or more autoOpen-able
elements.
- "$boolean = $document->canHaveAttribute($tag,$attrib);"
- Returns whether an element $tag can have an attribute named $attrib.
- "$boolean = $document->canAutoOpen($tag);"
- Returns whether an element $tag is able to be automatically opened.
- "$boolean = $document->canAutoClose($node);"
- Returns whether the node $node can be automatically closed.
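The accessors above can be combined in small query helpers. The following is a
minimal sketch, assuming $document is a LaTeXML::Core::Document (or any object
offering the same methods); the helper names qnames_matching and
comment_allowed_here are hypothetical.

```perl
use strict;
use warnings;

# Sketch: return the qualified names of all nodes matching $xpath.
# $context is optional; when it is omitted, findnodes is documented to
# start from the document element.
sub qnames_matching {
    my ($document, $xpath, $context) = @_;
    my @nodes = defined $context
        ? $document->findnodes($xpath, $context)
        : $document->findnodes($xpath);
    return map { $document->getNodeQName($_) } @nodes;
}

# Sketch: ask the document model whether a comment may be placed in the
# element nearest to the current insertion point.
sub comment_allowed_here {
    my ($document) = @_;
    return $document->canContain($document->getElement, '#Comment');
}
```

A caller might guard an insertion with such a check, for example by calling
insertComment only when comment_allowed_here($document) is true.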
Construction Methods
These methods are the ones most commonly used for constructing documents. They
generally operate by creating new material at the current insertion point.
That point is initially just the document itself, but it moves along to follow
any new insertions. These methods also adapt to the document model so as to
automatically open or close elements when that is required for the pending
insertion and allowed by the document model (see Tag).
- "$xmldoc = $document->finalize;"
- This method finalizes the document by cleaning up various temporary
attributes, and returns the XML::LibXML::Document that was
constructed.
- "@nodes = $document->absorb($digested);"
- Absorb the $digested object into the document at the current insertion
point according to its type. Various of the other methods are invoked
as needed, and document nodes may be automatically opened or closed
according to the document model.
This method returns the nodes that were constructed. Note that the nodes may
include children of other nodes, and nodes that may already have been
removed from the document (see filterChildren and filterDeletions). Also,
text insertions are often merged with existing text nodes; in such cases,
the whole text node is included in the result.
- "$document->insertElement($qname,$content,%attributes);"
- This is a shorthand for creating an element $qname (with given
attributes), absorbing $content from within that new node, and then
closing it. The $content must be digested material, either a single box,
or an array of boxes, which will be absorbed into the element. This method
returns the newly created node, although it will no longer be the current
insertion point.
- "$document->insertMathToken($string,%attributes);"
- Insert a math token (XMTok) containing the string $string with the given
attributes. Useful attributes would be name, role, font. Returns the newly
inserted node.
- "$document->insertComment($text);"
- Insert, and return, a comment with the given $text into the current
node.
- "$document->insertPI($op,%attributes);"
- Insert, and return, a ProcessingInstruction into the current node.
- "$document->openText($text,$font);"
- Open a text node in font $font, performing any required automatic opening
and closing of intermediate nodes (including those needed for font changes)
and inserting the string $text into it.
- "$document->openElement($qname,%attributes);"
- Open an element, named $qname and with the given attributes. This will be
inserted into the current node while performing any required automatic
opening and closing of intermediate nodes. The new element is returned, and
also becomes the current insertion point. An error (fatal if in
"Strict" mode) is signalled if there is no allowed way to insert
such an element into the current node.
- "$document->closeElement($qname);"
- Close the closest open element named $qname including any intermediate
nodes that may be automatically closed. If that is not possible, signal an
error. The closed node's parent becomes the current node. This method
returns the closed node.
- "$boolean = $document->isOpenable($qname);"
- Check whether it is possible to open a $qname element at the current
insertion point.
- "$node = $document->isCloseable($qname);"
- Check whether it is possible to close a $qname element, returning the node
that would be closed if possible, otherwise undef.
- "$document->maybeCloseElement($qname);"
- Close a $qname element, if it is possible to do so. Returns the closed
node if it was found, else undef.
- "$document->addAttribute($key=>$value);"
- Add the given attribute to the node nearest to the current insertion point
that is allowed to have it. This does not change the current insertion
point.
- "$document->closeToNode($node);"
- This method closes all children of $node until $node becomes the insertion
point. Note that it closes any open nodes, not only autoCloseable
ones.
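The open, absorb and close sequence that insertElement is described as
abbreviating can be written out as follows. This is only a sketch under stated
assumptions: the helper name insert_wrapped is hypothetical, $content is
assumed to be a single piece of digested material, and the real insertElement
may handle more cases (such as an array of boxes).

```perl
use strict;
use warnings;

# Sketch (hypothetical helper): open an element, absorb digested content
# into it, then close it again and return the new node.
sub insert_wrapped {
    my ($document, $qname, $content, %attributes) = @_;

    # The new element becomes the current insertion point.
    my $node = $document->openElement($qname, %attributes);

    # Absorbed material is inserted at the current insertion point,
    # that is, inside the element just opened.
    $document->absorb($content);

    # Close the nearest open $qname element; afterwards its parent is
    # the current node again.
    $document->closeElement($qname);

    return $node;
}
```

Inside a constructor this might be called as
insert_wrapped($document, 'ltx:note', $box, class => 'example'), where the
element name and attribute are illustrative values only.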
Internal Insertion Methods
These are described as an aid to understanding the code; they should rarely,
if ever, be used outside this module.
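Two of the methods described in this section, floatToElement and setNode, are
documented as being used together: float to a suitable point, do some work
there, then restore the insertion point. A minimal sketch of that pattern
follows; the helper name with_floated_element is hypothetical, and the use of
eval to restore the insertion point after an error is a design choice of this
sketch rather than something LaTeXML prescribes.

```perl
use strict;
use warnings;

# Sketch (hypothetical helper): run $code with the insertion point floated
# to the nearest place that accepts a $qname element, then restore it.
sub with_floated_element {
    my ($document, $qname, $code) = @_;

    # floatToElement returns the previous insertion point, or undef
    # (leaving the insertion point unchanged) if no suitable point exists.
    my $saved = $document->floatToElement($qname);
    return unless defined $saved;

    # Run the caller's code, remembering any error so that the insertion
    # point can be restored before the error is re-thrown.
    my @result = eval { $code->($document) };
    my $error  = $@;
    $document->setNode($saved);
    die $error if $error;

    return @result;
}
```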
- "$document->setNode($node);"
- Sets the current insertion point to be $node. This should be rarely
used, if at all; the construction methods of the document generally maintain
the notion of insertion point automatically. This may be useful to allow
insertion into a different part of the document, but you probably want to
set the insertion point back to the previous node afterwards.
- "$string = $document->getInsertionContext($levels);"
- For debugging, return a string showing the context of the current
insertion point; that is, the string of the nodes leading up to it. If
$levels is defined, show only that many nodes.
- "$node = $document->find_insertion_point($qname);"
- This internal method is used to find the appropriate point, relative to
the current insertion point, that an element with the specified $qname can
be inserted. That position may require automatic opening or closing of
elements, according to what is allowed by the document model.
- "@nodes = getInsertionCandidates($node);"
- Returns a list of elements where an arbitrary insertion might take place.
Roughly this is a list starting with $node, followed by its parent and the
parent's siblings (in reverse order), followed by the grandparent and its
siblings (in reverse order).
- "$node = $document->floatToElement($qname);"
- Finds the nearest element at or preceding the current insertion point (see
"getInsertionCandidates"), that can accept an element $qname; it
moves the insertion point to that point, and returns the previous
insertion point. Generally, after doing whatever you need at the new
insertion point, you should call "$document->setNode($node);"
to restore the insertion point. If no such point is found, the insertion
point is left unchanged, and undef is returned.
- "$node = $document->floatToAttribute($key);"
- This method works the same as "floatToElement", but finds the
nearest element that can accept the attribute $key.
- "$node = $document->openText_internal($text);"
- This is an internal method, used by "openText", that assumes the
insertion point has been appropriately adjusted.
- "$node = $document->openMathText_internal($text);"
- This internal method appends $text to the current insertion point, which
is assumed to be a math node. It checks for math ligatures and carries out
any combinations called for.
- "$node = $document->closeText_internal();"
- This internal method closes the current node, which should be a text node.
It carries out any text ligatures on the content.
- "$node = $document->closeNode_internal($node);"
- This internal method closes any open text or element nodes starting at the
current insertion point, up to and including $node. Afterwards, the parent
of $node will be the current insertion point. It condenses the tree to
avoid redundant font switching elements.
- "$document->afterOpen($node);"
- Carries out any afterOpen operations that have been recorded (using
"Tag") for the element name of $node.
- "$document->afterClose($node);"
- Carries out any afterClose operations that have been recorded (using
"Tag") for the element name of $node.
Document Modification
The following methods are used to perform various sorts of modification and
rearrangements of the document, after the normal flow of insertion has taken
place. These may be needed after an environment (or perhaps the whole
document) has been completed and one needs to analyze what it contains to
decide on the appropriate representation.
- "$document->setAttribute($node,$key,$value);"
- Sets the attribute $key to $value on $node. This method is preferred over
the direct LibXML one, since it takes care of decoding namespaces (if $key
is a qname), and also manages recording of xml:id's.
- "$document->recordID($id,$node);"
- Records the association of the given $node with the $id, which should be
the "xml:id" attribute of the $node. Usually this association
will be maintained by the methods that create nodes or set
attributes.
- "$document->unRecordID($id);"
- Removes the node associated with the given $id, if any. This might be
needed if a node is deleted.
- "$document->modifyID($id);"
- Adjusts $id, if needed, so that it is unique. It does this by appending a
letter and incrementing until it finds an id that is not yet associated
with a node.
- "$node = $document->lookupID($id);"
- Returns the node, if any, that is associated with the given $id.
- "$document->setNodeBox($node,$box);"
- Records the $box (being a Box, Whatsit or List), that was (presumably)
responsible for the creation of the element $node. This information is
useful for determining source locations, original TeX strings, and so
forth.
- "$box = $document->getNodeBox($node);"
- Returns the $box that was responsible for creating the element $node.
- "$document->setNodeFont($node,$font);"
- Records the font object that encodes the font that should be used to
display any text within the element $node.
- "$font = $document->getNodeFont($node);"
- Returns the font object associated with the element $node.
- "$node = $document->openElementAt($point,$qname,%attributes);"
- Opens a new child element in $point with the qualified name $qname and
with the given attributes. This method is not affected by, nor does it
affect, the current insertion point. It does manage namespaces, xml:id's
and associating a box, font and locator with the new element, as well as
running any "afterOpen" operations.
- "$node = $document->closeElementAt($node);"
- Closes $node. This method is not affected by, nor does it affect, the
current insertion point. However, it does run any "afterClose"
operations, so any element that was created using the lower-level
"openElementAt" should be closed using this method.
- "$node = $document->appendClone($node,@newchildren);"
- Appends clones of @newchildren to $node. This method modifies any ids
found within @newchildren (using "modifyID"), and fixes up any
references to those ids within the clones so that they refer to the
modified id.
- "$node = $document->wrapNodes($qname,@nodes);"
- This method wraps the @nodes in a new element with qualified name $qname;
that new node replaces the first of @nodes. The remaining nodes in @nodes
must be following siblings of the first one.
NOTE: Does this need multiple nodes? If so, perhaps some kind of movenodes
helper? Otherwise, what about attributes?
- "$node = $document->unwrapNodes($node);"
- Unwrap the children of $node, by replacing $node by its children.
- "$node = $document->replaceNode($node,@nodes);"
- Replace $node by @nodes; presumably they are some sort of descendant
nodes.
- "$node = $document->renameNode($node,$newname);"
- Rename $node to the tagname $newname; equivalently replace $node by a new
node with name $newname and copy the attributes and contents. It is
assumed that $newname can contain those attributes and contents.
- "@nodes = $document->filterDeletions(@nodes);"
- This function is useful with "$doc->absorb($box)", when you
want to filter out any nodes that have been deleted and no longer appear
in the document.
- "@nodes = $document->filterChildren(@nodes);"
- This function is useful with "$doc->absorb($box)", when you
want to filter out any nodes that are children of other nodes in
@nodes.
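The two filters are typically applied to the result of absorb, and
openElementAt is meant to be paired with closeElementAt. A minimal sketch of
both patterns follows; the helper names absorb_top_level and
append_closed_element are hypothetical, and $document is assumed to be a
LaTeXML::Core::Document or an object with the same methods.

```perl
use strict;
use warnings;

# Sketch (hypothetical helper): absorb $box and keep only those created
# nodes that are still in the document and are not children of other
# nodes in the result.
sub absorb_top_level {
    my ($document, $box) = @_;
    my @nodes = $document->absorb($box);
    @nodes = $document->filterDeletions(@nodes);
    return $document->filterChildren(@nodes);
}

# Sketch (hypothetical helper): add an empty child element to $point
# without touching the current insertion point. The element is closed
# with closeElementAt so that any afterClose operations are run.
sub append_closed_element {
    my ($document, $point, $qname, %attributes) = @_;
    my $new = $document->openElementAt($point, $qname, %attributes);
    $document->closeElementAt($new);
    return $new;
}
```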
AUTHOR
Bruce Miller <bruce.miller@nist.gov>
COPYRIGHT
Public domain software, produced as part of work done by the United States
Government & not subject to copyright in the US.