.\" Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.if !\nF .nr F 0
.if \nF>0 \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    if !\nF==2 \{\
.        nr % 0
.        nr F 2
.    \}
.\}
.\" ========================================================================
.\"
.IX Title "Data::Munge 3pm"
.TH Data::Munge 3pm "2017-11-20" "perl v5.26.1" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Data::Munge \- various utility functions
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\& use Data::Munge;
\& 
\& my $re = list2re qw/f ba foo bar baz/;
\& # $re = qr/bar|baz|foo|ba|f/;
\& 
\& print byval { s/foo/bar/ } $text;
\& # print do { my $tmp = $text; $tmp =~ s/foo/bar/; $tmp };
\& 
\& foo(mapval { chomp } @lines);
\& # foo(map { my $tmp = $_; chomp $tmp; $tmp } @lines);
\& 
\& print replace(\*(AqApples are round, and apples are juicy.\*(Aq, qr/apples/i, \*(Aqoranges\*(Aq, \*(Aqg\*(Aq);
\& # "oranges are round, and oranges are juicy."
\& print replace(\*(AqJohn Smith\*(Aq, qr/(\ew+)\es+(\ew+)/, \*(Aq$2, $1\*(Aq);
\& # "Smith, John"
\& 
\& my $trimmed = trim "  a b c "; # "a b c"
\& 
\& my $x = \*(Aqbar\*(Aq;
\& if (elem $x, [qw(foo bar baz)]) { ... }
\& 
\& my $contents = slurp $fh;  # or: slurp *STDIN
\& 
\& eval_string(\*(Aqprint "hello world\e\en"\*(Aq);  # says hello
\& eval_string(\*(Aqdie\*(Aq);  # dies
\& eval_string(\*(Aq{\*(Aq);    # throws a syntax error
\& 
\& my $fac = rec {
\&   my ($rec, $n) = @_;
\&   $n < 2 ? 1 : $n * $rec\->($n \- 1)
\& };
\& print $fac\->(5);  # 120
\& 
\& if ("hello, world!" =~ /(\ew+), (\ew+)/) {
\&   my @captured = submatches;
\&   # @captured = ("hello", "world")
\& }
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This module defines a few generally useful utility functions. I got tired of
redefining them or working around their absence, so I wrote this module.
.SS "Functions"
.IX Subsection "Functions"
.IP "list2re \s-1LIST\s0" 4
.IX Item "list2re LIST"
Converts a list of strings to a regex that matches any of the strings.
Especially useful in combination with \f(CW\*(C`keys\*(C'\fR. Example:
.Sp
.Vb 2
\& my $re = list2re keys %hash;
\& $str =~ s/($re)/$hash{$1}/g;
.Ve
.Sp
This function takes special care to get several edge cases right:
.RS 4
.IP "\(bu" 4
Empty list: An empty argument list results in a regex that doesn't match
anything.
.IP "\(bu" 4
Empty string: An argument list consisting of a single empty string results in a
regex that matches the empty string (and nothing else).
.IP "\(bu" 4
Prefixes: The input strings are sorted by descending length to ensure longer
matches are tried before shorter matches. Otherwise \f(CW\*(C`list2re(\*(Aqab\*(Aq, \*(Aqabcd\*(Aq)\*(C'\fR
would generate \f(CW\*(C`qr/ab|abcd/\*(C'\fR, which (on its own) can never match \f(CW\*(C`abcd\*(C'\fR
(because \f(CW\*(C`ab\*(C'\fR is tried first, and it always succeeds where \f(CW\*(C`abcd\*(C'\fR could).
.RE
.RS 4
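.Sp
As a rough illustration of the edge cases listed above (a sketch that relies
only on the behavior documented here; the sample strings are made up):
.Sp
.Vb 1
\& use Data::Munge;
\& 
\& # Prefixes: the longer alternative is tried first.
\& my $re = list2re(\*(Aqab\*(Aq, \*(Aqabcd\*(Aq);
\& print "$1\en" if \*(Aqabcdef\*(Aq =~ /($re)/;    # abcd
\& 
\& # Empty list: the resulting regex never matches.
\& my $never = list2re();
\& print "no match\en" if \*(Aqanything\*(Aq !~ $never;
\& 
\& # Single empty string: only the empty string matches.
\& my $empty = list2re(\*(Aq\*(Aq);
\& print "empty only\en" if \*(Aq\*(Aq =~ /\eA$empty\ez/ && \*(Aqx\*(Aq !~ /\eA$empty\ez/;
.Ve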
.RE
.IP "byval \s-1BLOCK SCALAR\s0" 4
.IX Item "byval BLOCK SCALAR"
Takes a code block and a value, runs the block with \f(CW$_\fR set to that value,
and returns the final value of \f(CW$_\fR. The global value of \f(CW$_\fR is not
affected. \f(CW$_\fR isn't aliased to the input value either, so modifying \f(CW$_\fR
in the block will not affect the passed-in value. Example:
.Sp
.Vb 3
\& foo(byval { s/!/?/g } $str);
\& # Calls foo() with the value of $str, but all \*(Aq!\*(Aq have been replaced by \*(Aq?\*(Aq.
\& # $str itself is not modified.
.Ve
.Sp
Since perl 5.14 you can also use the \f(CW\*(C`/r\*(C'\fR flag:
.Sp
.Vb 1
\& foo($str =~ s/!/?/gr);
.Ve
.Sp
But \f(CW\*(C`byval\*(C'\fR works on all versions of perl and is not limited to \f(CW\*(C`s///\*(C'\fR.
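.Sp
For instance, a block may combine several in-place operations in one step.
A small sketch (the variable names are only illustrative):
.Sp
.Vb 3
\& my $raw = "  Hello, World  ";
\& my $key = byval { s/^\es+//; s/\es+$//; tr/A\-Z/a\-z/ } $raw;
\& # $key is "hello, world"; $raw still has its spaces and capitals.
.Ve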
.IP "mapval \s-1BLOCK LIST\s0" 4
.IX Item "mapval BLOCK LIST"
Works like a combination of \f(CW\*(C`map\*(C'\fR and \f(CW\*(C`byval\*(C'\fR; i.e. it behaves like
\&\f(CW\*(C`map\*(C'\fR, but \f(CW$_\fR is a copy, not aliased to the current element, and the return
value is taken from \f(CW$_\fR again (it ignores the value returned by the
block). Example:
.Sp
.Vb 4
\& my @foo = mapval { chomp } @bar;
\& # @foo contains a copy of @bar where all elements have been chomp\*(Aqd.
\& # This could also be written as chomp(my @foo = @bar); but that\*(Aqs not
\& # always possible.
.Ve
.IP "submatches" 4
.IX Item "submatches"
Returns a list of the strings captured by the last successful pattern match.
Normally you don't need this function because this is exactly what \f(CW\*(C`m//\*(C'\fR
returns in list context. However, \f(CW\*(C`submatches\*(C'\fR also works in other contexts
such as the \s-1RHS\s0 of \f(CW\*(C`s//.../e\*(C'\fR.
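.Sp
A short sketch of that use inside a substitution (the date string is just
sample input):
.Sp
.Vb 3
\& my $date = "2017\-11\-20";
\& $date =~ s{(\ed+)\-(\ed+)\-(\ed+)}{ join "/", reverse submatches() }e;
\& # $date is now "20/11/2017"
.Ve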
.IP "replace \s-1STRING, REGEX, REPLACEMENT, FLAG\s0" 4
.IX Item "replace STRING, REGEX, REPLACEMENT, FLAG"
.PD 0
.IP "replace \s-1STRING, REGEX, REPLACEMENT\s0" 4
.IX Item "replace STRING, REGEX, REPLACEMENT"
.PD
A clone of JavaScript's \f(CW\*(C`String.prototype.replace\*(C'\fR. It works almost the same
as \f(CW\*(C`byval { s/REGEX/REPLACEMENT/FLAG } STRING\*(C'\fR, but with a few important
differences. \s-1REGEX\s0 can be a string or a compiled \f(CW\*(C`qr//\*(C'\fR object. \s-1REPLACEMENT\s0
can be a string or a subroutine reference. If it's a string, it can contain the
following replacement patterns:
.RS 4
.IP "$$" 4
Inserts a '$'.
.IP "$&" 4
Inserts the matched substring.
.IP "$`" 4
Inserts the substring preceding the match.
.IP "$'" 4
Inserts the substring following the match.
.ie n .IP "$N  (where N is a digit)" 4
.el .IP "\f(CW$N\fR  (where N is a digit)" 4
.IX Item "$N (where N is a digit)"
Inserts the substring matched by the Nth capturing group.
.IP "${N}  (where N is one or more digits)" 4
.IX Item "${N} (where N is one or more digits)"
Inserts the substring matched by the Nth capturing group.
.RE
.RS 4
.Sp
Note that these aren't variables; they're character sequences interpreted by
\&\f(CW\*(C`replace\*(C'\fR.
.Sp
If \s-1REPLACEMENT\s0 is a subroutine reference, it's called with the following
arguments: First the matched substring (like \f(CW$&\fR above), then the contents of
the capture buffers (as returned by \f(CW\*(C`submatches\*(C'\fR), then the offset where the
pattern matched (like \f(CW\*(C`$\-[0]\*(C'\fR, see \*(L"@\-\*(R" in perlvar), then the \s-1STRING.\s0 The return
value will be inserted in place of the matched substring.
.Sp
Normally only the first occurrence of \s-1REGEX\s0 is replaced. If \s-1FLAG\s0 is present, it
must be \f(CW\*(Aqg\*(Aq\fR and causes all occurrences to be replaced.
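.Sp
A sketch of the subroutine form, based on the argument order described above
(the input string and the output format are invented for illustration):
.Sp
.Vb 5
\& my $out = replace("a1b22", qr/(\ed+)/, sub {
\&   my ($match, $digits, $offset, $string) = @_;
\&   sprintf "[%d:%s]", $offset, $digits;
\& }, \*(Aqg\*(Aq);
\& # $out is "a[1:1]b[3:22]"
.Ve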
.RE
.IP "trim \s-1STRING\s0" 4
.IX Item "trim STRING"
Returns \fI\s-1STRING\s0\fR with all leading and trailing whitespace removed. Like
\&\f(CW\*(C`length\*(C'\fR, it returns \f(CW\*(C`undef\*(C'\fR if the input is \f(CW\*(C`undef\*(C'\fR.
.IP "elem \s-1SCALAR, ARRAYREF\s0" 4
.IX Item "elem SCALAR, ARRAYREF"
Returns a boolean value telling you whether \fI\s-1SCALAR\s0\fR is an element of
\&\fI\s-1ARRAYREF\s0\fR or not. Two scalars are considered equal if they're both \f(CW\*(C`undef\*(C'\fR,
if they're both references to the same thing, or if they're both not references
and \f(CW\*(C`eq\*(C'\fR to each other.
.Sp
This is implemented as a linear search through \fI\s-1ARRAYREF\s0\fR that terminates
early if a match is found (i.e. \f(CW\*(C`elem \*(AqA\*(Aq, [\*(AqA\*(Aq, 1 .. 9999]\*(C'\fR won't even look
at elements \f(CW\*(C`1 .. 9999\*(C'\fR).
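.Sp
The equality rules above can be illustrated as follows (a sketch; the values
are arbitrary):
.Sp
.Vb 5
\& my $ref = [];
\& print "undef found\en"    if elem(undef, [1, undef]);   # both undef
\& print "same ref found\en" if elem($ref, [[], $ref]);    # same reference
\& print "other ref\en"      if elem([], [[]]);            # not printed: different arrays
\& print "string eq\en"      if elem("1.0", [1]);          # not printed: "1.0" ne "1"
.Ve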
.IP "eval_string \s-1STRING\s0" 4
.IX Item "eval_string STRING"
Evaluates \fI\s-1STRING\s0\fR just like \f(CW\*(C`eval\*(C'\fR, but doesn't catch exceptions. Caveat: unlike
with \f(CW\*(C`eval\*(C'\fR, the code runs in an empty lexical scope:
.Sp
.Vb 3
\& my $foo = "Hello, world!\en";
\& eval_string \*(Aqprint $foo\*(Aq;
\& # Dies: Global symbol "$foo" requires explicit package name
.Ve
.Sp
That is, the eval'd code can't see variables from the scope of the
\&\f(CW\*(C`eval_string\*(C'\fR call.
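.Sp
Two consequences, sketched under the assumption that the script runs in
package \f(CW\*(C`main\*(C'\fR: data can still be passed through fully qualified package
variables, and exceptions have to be caught by the caller (the variable name
is illustrative):
.Sp
.Vb 2
\& our $greeting = "Hello, world!\en";
\& eval_string \*(Aqprint $main::greeting\*(Aq;   # prints the greeting
\& 
\& eval { eval_string \*(Aqdie "boom\en"\*(Aq; 1 }
\&   or print "caught: $@";                # caught: boom
.Ve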
.IP "slurp \s-1FILEHANDLE\s0" 4
.IX Item "slurp FILEHANDLE"
Reads and returns all remaining data from \fI\s-1FILEHANDLE\s0\fR as a string, or
\&\f(CW\*(C`undef\*(C'\fR if it hits end-of-file. (Interaction with non-blocking filehandles is
currently not well defined.)
.Sp
\&\f(CW\*(C`slurp $handle\*(C'\fR is equivalent to \f(CW\*(C`do { local $/; scalar readline $handle }\*(C'\fR.
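.Sp
Because only the remaining data is returned, anything already read from the
handle is not included. A sketch using an in-memory filehandle (the sample
data is made up):
.Sp
.Vb 4
\& open my $fh, \*(Aq<\*(Aq, \e"line 1\enline 2\enline 3\en" or die "open: $!";
\& my $first = <$fh>;        # "line 1\en"
\& my $rest  = slurp $fh;    # "line 2\enline 3\en"
\& close $fh;
.Ve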
.IP "rec \s-1BLOCK\s0" 4
.IX Item "rec BLOCK"
Creates an anonymous sub as \f(CW\*(C`sub BLOCK\*(C'\fR would, but supplies the called sub
with an extra argument that can be used to recurse:
.Sp
.Vb 6
\& my $code = rec {
\&   my ($rec, $n) = @_;
\&   $rec\->($n \- 1) if $n > 0;
\&   print $n, "\en";
\& };
\& $code\->(4);
.Ve
.Sp
That is, when the sub is called, an implicit first argument is passed in
\&\f(CW$_[0]\fR (all normal arguments are moved one up). This first argument is a
reference to the sub itself. This reference could be used to recurse directly
or to register the sub as a handler in an event system, for example.
.Sp
A note on defining recursive anonymous functions: Doing this right is more
complicated than it may at first appear. The most straightforward solution
using a lexical variable and a closure leaks memory because it creates a
reference cycle. Starting with perl 5.16 there is a \f(CW\*(C`_\|_SUB_\|_\*(C'\fR constant that is
equivalent to \f(CW$rec\fR above, and this is indeed what this module uses (if
available).
.Sp
However, this module works even on older perls by falling back to either weak
references (if available) or a \*(L"fake recursion\*(R" scheme that dynamically
instantiates a new sub for each call instead of creating a cycle. This last
resort is slower than weak references but works everywhere.
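.Sp
For comparison, the two hand-written variants mentioned above might look like
this (a sketch in plain Perl without this module; \f(CW$leaky\fR and \f(CW$clean\fR
are illustrative names):
.Sp
.Vb 4
\& # Closure over a lexical: $leaky refers to the sub and the sub captures
\& # $leaky, so the reference cycle keeps both alive.
\& my $leaky;
\& $leaky = sub { my $n = shift; $n < 2 ? 1 : $n * $leaky\->($n \- 1) };
\& 
\& # perl 5.16+: _\|_SUB_\|_ refers to the running sub, so no cycle is created.
\& use feature \*(Aqcurrent_sub\*(Aq;
\& my $clean = sub { my $n = shift; $n < 2 ? 1 : $n * _\|_SUB_\|_\->($n \- 1) };
\& print $clean\->(5), "\en";   # 120
.Ve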
.SH "AUTHOR"
.IX Header "AUTHOR"
Lukas Mai, \f(CW\*(C`<l.mai at web.de>\*(C'\fR
.SH "COPYRIGHT & LICENSE"
.IX Header "COPYRIGHT & LICENSE"
Copyright 2009\-2011, 2013\-2015 Lukas Mai.
.PP
This program is free software; you can redistribute it and/or modify it
under the terms of either: the \s-1GNU\s0 General Public License as published
by the Free Software Foundation; or the Artistic License.
.PP
See http://dev.perl.org/licenses/ for more information.