.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.29)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{
.    if \nF \{
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "XML_GREP 1p"
.TH XML_GREP 1p "2016-08-04" "perl v5.22.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes way too many mistakes in technical
.\" documents.
.if n .ad l
.nh
.SH "NAME"
xml_grep \- grep XML files looking for specific elements
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  xml_grep [options] <files>
.Ve
.PP
or
.PP
.Vb 1
\&  xml_grep <xpath expression> <files>
.Ve
.PP
By default you can just give \f(CW\*(C`xml_grep\*(C'\fR an XPath expression and
a list of files, and get an \s-1XML\s0 file with the result.
.PP
This is equivalent to writing
.PP
.Vb 1
\&  xml_grep \-\-group_by_file file \-\-pretty_print indented \-\-cond <xpath expression> <files>
.Ve
.SH "OPTIONS"
.IX Header "OPTIONS"
.IP "\fB\-\-help\fR" 4
.IX Item "--help"
brief help message
.IP "\fB\-\-man\fR" 4
.IX Item "--man"
full documentation
.IP "\fB\-\-Version\fR" 4
.IX Item "--Version"
display the tool version
.IP "\fB\-\-root\fR <cond>" 4
.IX Item "--root <cond>"
look for and return \s-1XML\s0 chunks matching <cond>
.Sp
if neither \f(CW\*(C`\-\-root\*(C'\fR nor \f(CW\*(C`\-\-files\*(C'\fR is used then the element(s) that
trigger the \f(CW\*(C`\-\-cond\*(C'\fR option is (are) used.
If \f(CW\*(C`\-\-cond\*(C'\fR is not used then all elements matching the \f(CW\*(C`\-\-root\*(C'\fR
condition are returned
.Sp
several \f(CW\*(C`\-\-root\*(C'\fR can be provided
.IP "\fB\-\-cond\fR <cond>" 4
.IX Item "--cond <cond>"
return the chunks (or file names) only if they contain elements matching <cond>
.Sp
several \f(CW\*(C`\-\-cond\*(C'\fR can be provided (in which case they are \s-1OR\s0'ed)
.IP "\fB\-\-files\fR" 4
.IX Item "--files"
return only file names (do not generate an \s-1XML\s0 output)
.Sp
usage of this option precludes using any of the options that define the \s-1XML\s0 output:
\&\f(CW\*(C`\-\-root\*(C'\fR, \f(CW\*(C`\-\-encoding\*(C'\fR, \f(CW\*(C`\-\-wrap\*(C'\fR, \f(CW\*(C`\-\-group_by_file\*(C'\fR
or \f(CW\*(C`\-\-pretty_print\*(C'\fR
.IP "\fB\-\-count\fR" 4
.IX Item "--count"
return only the number of matches in each file
.Sp
usage of this option precludes using any of the options that define the \s-1XML\s0 output:
\&\f(CW\*(C`\-\-root\*(C'\fR, \f(CW\*(C`\-\-encoding\*(C'\fR, \f(CW\*(C`\-\-wrap\*(C'\fR, \f(CW\*(C`\-\-group_by_file\*(C'\fR
or \f(CW\*(C`\-\-pretty_print\*(C'\fR
.IP "\fB\-\-strict\fR" 4
.IX Item "--strict"
without this option parsing errors are reported to \s-1STDOUT\s0 and the file is skipped
.IP "\fB\-\-date\fR" 4
.IX Item "--date"
when on (the default) the wrapping element gets a \f(CW\*(C`date\*(C'\fR attribute that gives
the date the tool was run.
.Sp
with \f(CW\*(C`\-\-nodate\*(C'\fR this attribute is not added, which can be useful if you need
to compare two runs.
.IP "\fB\-\-encoding\fR <encoding>" 4
.IX Item "--encoding <encoding>"
encoding of the \s-1XML\s0 output (utf\-8 by default)
.IP "\fB\-\-nb_results\fR <nb>" 4
.IX Item "--nb_results <nb>"
output only <nb> results
.IP "\fB\-\-by_file\fR" 4
.IX Item "--by_file"
output only <nb> results by file
.IP "\fB\-\-wrap\fR <tag>" 4
.IX Item "--wrap <tag>"
wrap the \s-1XML\s0 result in the provided tag (defaults to 'xml_grep')
.Sp
If wrap is set to an empty string (\f(CW\*(C`\-\-wrap \*(Aq\*(Aq\*(C'\fR) then the \s-1XML\s0 result is
not wrapped at all.
.IP "\fB\-\-nowrap\fR" 4
.IX Item "--nowrap"
same as using \f(CW\*(C`\-\-wrap \*(Aq\*(Aq\*(C'\fR: the \s-1XML\s0 result is not wrapped.
.IP "\fB\-\-descr\fR <string>" 4
.IX Item "--descr <string>"
attributes of the wrap tag (defaults to \f(CW\*(C`version="" date=""\*(C'\fR)
.IP "\fB\-\-group_by_file\fR <optional_tag>" 4
.IX Item "--group_by_file <optional_tag>"
wrap the results for each file in a separate element. By default that element
is named \f(CW\*(C`file\*(C'\fR. It has an attribute named \f(CW\*(C`filename\*(C'\fR that gives the
name of the file.
.Sp
the short form of this option is \fB\-g\fR
.IP "\fB\-\-exclude\fR <condition>" 4
.IX Item "--exclude <condition>"
same as using \f(CW\*(C`\-v\*(C'\fR in grep: the elements that match the condition are
excluded from the result, the input file(s) is (are) otherwise unchanged
.Sp
the short form of this option is \fB\-v\fR
.IP "\fB\-\-pretty_print\fR <optional_style>" 4
.IX Item "--pretty_print <optional_style>"
pretty print the output using XML::Twig styles ('\f(CW\*(C`indented\*(C'\fR', '\f(CW\*(C`record\*(C'\fR'
or '\f(CW\*(C`record_c\*(C'\fR' are probably what you are looking for)
.Sp
if the option is used but no style is given then '\f(CW\*(C`indented\*(C'\fR' is used
.Sp
the short form of this option is \fB\-s\fR
.IP "\fB\-\-text_only\fR" 4
.IX Item "--text_only"
Displays the text of the results, one per line.
.IP "\fB\-\-html\fR" 4
.IX Item "--html"
Allow \s-1HTML\s0 input; files are converted using HTML::TreeBuilder
.IP "\fB\-\-Tidy\fR" 4
.IX Item "--Tidy"
Allow \s-1HTML\s0 input; files are converted using HTML::Tidy
.SS "Condition Syntax"
.IX Subsection "Condition Syntax"
A condition is an XPath-like expression, as allowed by XML::Twig to trigger handlers.
.PP
examples:
\&'para'
\&'para[@compact=\*(L"compact\*(R"]'
\&'*[@urgent]'
\&'*[@urgent=\*(L"1\*(R"]'
\&'para[\fIstring()\fR=\*(L"\s-1WARNING\*(R"\s0]'
.PP
see XML::Twig for a more complete description of the syntax
.PP
options are processed by Getopt::Long so they can start with '\-' or '\-\-' and can
be abbreviated (\f(CW\*(C`\-r\*(C'\fR instead of \f(CW\*(C`\-\-root\*(C'\fR for example)
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
\&\fBxml_grep\fR does a grep on \s-1XML\s0 files. Instead of using regular expressions
it uses XPath expressions (in fact the subset of XPath supported by XML::Twig).
.PP
The results can be the names of the files or \s-1XML\s0 elements containing matching
elements.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
XML::Twig, Getopt::Long
.SH "LICENSE"
.IX Header "LICENSE"
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
.SH "AUTHOR"
.IX Header "AUTHOR"
Michel Rodriguez