.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.29)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{
.    if \nF \{
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "STAG-IR 1p"
.TH STAG-IR 1p "2016-05-29" "perl v5.22.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
stag\-ir.pl \- information retrieval using a simple relational index
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 2
\&  stag\-ir.pl \-r person \-k social_security_no \-d Pg:mydb myrecords.xml
\&  stag\-ir.pl \-d Pg:mydb \-q 999\-9999\-9999 \-q 888\-8888\-8888
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Indexes stag nodes (\s-1XML\s0 elements) in a simple relational database
structure: each node is keyed by its \s-1ID,\s0 and stored as an \s-1XML\s0 blob
.PP
Imagine you have a very large file of data, in a stag compatible
format such as \s-1XML.\s0 You want to index all the elements of type
\&\fBperson\fR; each person can be uniquely identified by
\&\fBsocial_security_no\fR, which is a direct subnode of \fBperson\fR
.PP
The first thing to do is to build the index, which will be stored
in the database mydb
.PP
.Vb 1
\&  stag\-ir.pl \-r person \-k social_security_no \-d Pg:mydb myrecords.xml
.Ve
.PP
You can then use the index to retrieve \fBperson\fR nodes by
their social security number
.PP
.Vb 1
\&  stag\-ir.pl \-d Pg:mydb \-q 999\-9999\-9999 > some\-person.xml
.Ve
.PP
You can export using different stag formats
.PP
.Vb 1
\&  stag\-ir.pl \-d Pg:mydb \-q 999\-9999\-9999 \-w sxpr > some\-person.sxpr
.Ve
.PP
You can retrieve multiple nodes (although these need to be rooted to
make a valid file)
.PP
.Vb 1
\&  stag\-ir.pl \-d Pg:mydb \-q 999\-9999\-9999 \-q 888\-8888\-8888 \-top personset
.Ve
.PP
Or you can use a list of IDs from a file (newline delimited)
.PP
.Vb 1
\&  stag\-ir.pl \-d Pg:mydb \-qf my_ss_nmbrs.txt \-top personset
.Ve
.SS "\s-1ARGUMENTS\s0"
.IX Subsection "ARGUMENTS"
\fI\-d \s-1DB_NAME\s0\fR
.IX Subsection "-d DB_NAME"
.PP
This database will be used for storing the stag nodes
.PP
The name can be a logical name or \s-1DBI\s0 locator or DBStag shorthand \-
see DBIx::DBStag
.PP
The database must already exist
.PP
\fI\-clear\fR
.IX Subsection "-clear"
.PP
Deletes all data from the relation type (specified with \fB\-r\fR) before loading
.PP
\fI\-insertonly\fR
.IX Subsection "-insertonly"
.PP
Does not check if the \s-1ID\s0 in the file exists in the db \- will always
attempt an \s-1INSERT \s0(and will fail if \s-1ID\s0 already exists)
.PP
This is the fastest way to load data (only one \s-1SQL\s0 operation per node
rather than two) but is only safe if there is no existing data
.PP
(Default is clobber mode \- existing data with same \s-1ID\s0 will be replaced)
.PP
\fI\-newonly\fR
.IX Subsection "-newonly"
.PP
If there is already data in the specified relation in the db, and the
\&\s-1XML\s0 being loaded specifies an \s-1ID\s0 that is already in the db, then this
node will be ignored
.PP
(Default is clobber mode \- existing data with same \s-1ID\s0 will be replaced)
.PP
\fI\-transaction_size\fR
.IX Subsection "-transaction_size"
.PP
A commit will be performed every n UPDATEs/INSERTs (and at the end)
.PP
Default is autocommit
.PP
Note that if you are using \fB\-insertonly\fR with transactions, and the input
file contains an \s-1ID\s0 already in the database, then the transaction will
fail because this script will try to insert a duplicate \s-1ID\s0
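.PP
As a hedged illustration (a sketch that assumes \fB\-transaction_size\fR takes
the number n as its argument, and reuses the example database and file names
from the \s-1DESCRIPTION\s0), a bulk load that commits every 1000 nodes might
look like this
.PP
.Vb 2
\&  stag\-ir.pl \-transaction_size 1000 \-newonly \e
\&    \-r person \-k social_security_no \-d Pg:mydb myrecords.xml
.Ve
.PP
Here \fB\-newonly\fR is chosen instead of \fB\-insertonly\fR so that an \s-1ID\s0
already in the database is ignored rather than causing a failed \s-1INSERT\s0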
.PP
\fI\-r RELATION-NAME\fR
.IX Subsection "-r RELATION-NAME"
.PP
This is the name of the stag node (\s-1XML\s0 element) that will be stored in
the index; for example, with the \s-1XML\s0 below you may want to use the
node name \fBperson\fR and the unique key \fBid\fR
.PP
.Vb 9
\&  <person_set>
\&    <person>
\&      <id>...</id>
\&    </person>
\&    <person>
\&      <id>...</id>
\&    </person>
\&    ...
\&  </person_set>
.Ve
.PP
This flag should only be used when you want to store data
.PP
\fI\-k UNIQUE-KEY\fR
.IX Subsection "-k UNIQUE-KEY"
.PP
This node will be used as the unique/primary key for the data
.PP
This node should be nested directly below the node that is being
stored in the index \- if it is more than one level below, specify a path
.PP
This flag should only be used when you want to store data
.PP
\fI\-u UNIQUE-KEY\fR
.IX Subsection "-u UNIQUE-KEY"
.PP
Synonym for \fB\-k\fR
.PP
\fI\-create\fR
.IX Subsection "-create"
.PP
If specified, this will create a table for the relation name specified
with \fB\-r\fR; you should use this the first time you index a relation
.PP
\fI\-idtype \s-1TYPE\s0\fR
.IX Subsection "-idtype TYPE"
.PP
(optional)
.PP
This is the \s-1SQL\s0 datatype for the unique key; it defaults to \s-1VARCHAR\s0(255)
.PP
If you know that your id is an integer, you can specify \s-1INTEGER\s0 here
.PP
If your id is always an 8\-character field you can do this
.PP
.Vb 1
\&  \-idtype \*(AqCHAR(8)\*(Aq
.Ve
.PP
This option only makes sense when combined with the \fB\-create\fR option
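.PP
For example, a first\-time load might look like the following sketch; it
assumes the ids in the \fBperson\fR \s-1XML\s0 shown under \fB\-r\fR are integers, and
it reuses the example database and file names from the \s-1DESCRIPTION\s0
.PP
.Vb 2
\&  stag\-ir.pl \-create \-idtype INTEGER \-r person \-k id \e
\&    \-d Pg:mydb myrecords.xml
.Ve
.PP
Later loads into the same relation would presumably omit \fB\-create\fR and
\&\fB\-idtype\fR, since the table already exists by then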
.PP
\fI\-p \s-1PARSER\s0\fR
.IX Subsection "-p PARSER"
.PP
This can be the name of a stag supported format (xml, sxpr, itext) \-
\&\s-1XML\s0 is assumed by default
.PP
It can also be a module name \- this module is used to parse the input
file into a stag stream; see Data::Stag::BaseGenerator for details
on writing your own parsers/event generators
.PP
This flag should only be used when you want to store data
.PP
\fI\-q QUERY-ID\fR
.IX Subsection "-q QUERY-ID"
.PP
Fetches the relation/node with unique key value equal to query-id
.PP
Multiple IDs can be passed by specifying \fB\-q\fR multiple times
.PP
This flag should only be used when you want to query data
.PP
\fI\-top NODE-NAME\fR
.IX Subsection "-top NODE-NAME"
.PP
If this is specified in conjunction with \fB\-q\fR or \fB\-qf\fR then all the
query result nodes will be nested inside a node with this name (i.e. this
provides a root for the resulting document tree)
.PP
\fI\-qf QUERY-FILE\fR
.IX Subsection "-qf QUERY-FILE"
.PP
This is a file of newline\-separated IDs; this is useful for querying
the index in batch
.PP
\fI\-keys\fR
.IX Subsection "-keys"
.PP
This will write a list of all primary keys in the index
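.PP
As a sketch only: assuming the keys are written one per line to standard
output, and that \fB\-keys\fR needs no options beyond those used in the query
examples in the \s-1DESCRIPTION,\s0 the list could be saved and fed back through
\&\fB\-qf\fR to dump every indexed node (the file names here are hypothetical)
.PP
.Vb 2
\&  stag\-ir.pl \-d Pg:mydb \-keys > all_ids.txt
\&  stag\-ir.pl \-d Pg:mydb \-qf all_ids.txt \-top personset > everyone.xml
.Ve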
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Data::Stag
.PP
For more complex stag to database mapping, see DBIx::DBStag and the
scripts
.PP
stag\-db.pl uses file \s-1DBM\s0 indexes
.PP
stag\-storenode.pl is for storing fully normalised stag trees
.PP
selectall_xml