.\" Automatically generated by Pod::Man 2.25 (Pod::Simple 3.16)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.ie \nF \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    nr % 0
.    rr F
.\}
.el \{\
.    de IX
..
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Paranoid::Input 3pm"
.TH Paranoid::Input 3pm "2011-04-13" "perl v5.14.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Paranoid::Input \- Paranoid input functions
.SH "VERSION"
.IX Header "VERSION"
\&\f(CW$Id:\fR Input.pm,v 0.20 2011/04/13 22:01:43 acorliss Exp $
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  use Paranoid::Input;
\&
\&  FSZLIMIT  = 64 * 1024;
\&  LNSZLIMIT = 2 * 1024;
\&
\&  $rv = slurp($filename, \e@lines);
\&
\&  $rv = sip($filename, \e@lines);
\&  $rv = sip($filename, \e@lines, 1);
\&  $rv = tail($filename, \e@lines);
\&  $rv = tail($filename, \e@lines, \-100);
\&  $rv = tail($filename, \e@lines, \-100, 1);
\&  $rv = closeFile($filename);
\&
\&  addTaintRegex("telephone", qr/\e(\ed{3}\e)\es+\ed{3}\-\ed{4}/);
\&  $rv = detaint($userInput, "login", \e$val);
\&
\&  $rv = stringMatch($input, @strings);
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This module provides safer routines for input activities such as reading
files and detainting user input.
.PP
The \fBsip\fR and \fBtail\fR functions keep open file handles.  Even so, they're
specifically built to be safe for use in \fBfork\fR scenarios.  You can begin a
tail or sip in a parent, fork children, and all processes can independently
continue sipping with no confusion between processes.  This is possible
because we check whether the current \s-1PID\s0 matches the \s-1PID\s0 in effect when the
file was opened.  If not, we reopen the file and seek to the same position so
we can pick up where we left off.
.PP
The \fBslurp\fR function isn't affected by this since it reads entire files in a
single call; no filehandles are kept open between calls.
.PP
All file-reading functions use and obey \fBflock\fR.
.PP
\&\fBaddTaintRegex\fR is only exported if this module is used with the \fB:all\fR target.
.SH "SUBROUTINES/METHODS"
.IX Header "SUBROUTINES/METHODS"
.SS "\s-1FSZLIMIT\s0"
.IX Subsection "FSZLIMIT"
The value returned/set by this lvalue function is the maximum file size that
will be read into memory.  This affects functions like \fBslurp\fR (documented
below).  Unless explicitly set this defaults to 16KB.
.SS "\s-1LNSZLIMIT\s0"
.IX Subsection "LNSZLIMIT"
The value returned/set by this lvalue function is the maximum line length
supported by functions like \fBsip\fR (documented below).  Unless explicitly set
this defaults to 2KB.
.SS "slurp"
.IX Subsection "slurp"
.Vb 1
\&  $rv = slurp($filename, \e@lines);
.Ve
.PP
This function allows you to read a text file in its entirety into memory, 
the lines of which are placed into the passed array reference.  This function 
will only read files up to \fB\s-1FSZLIMIT\s0\fR in size.  Flocking is used (with 
\&\fB\s-1LOCK_SH\s0\fR) and the read is a blocking read.
.PP
An optional third argument sets a boolean flag which, if true, determines if
all lines are automatically chomped.  If chomping is enabled this will strip
both \s-1UNIX\s0 and \s-1DOS\s0 line separators.
.PP
The return value is false if the read was unsuccessful or the file's size
exceeded \fB\s-1FSZLIMIT\s0\fR.  In the latter case the array reference will still be
populated with what was read.  The reason for the failure can be retrieved
from \fBParanoid::ERROR\fR.
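.PP
As a minimal sketch of the call pattern described above: the path is
hypothetical, and reading the error through a \fBParanoid::ERROR()\fR function
call is an assumption about how that value is exposed.
.PP
.Vb 10
\&  use Paranoid;
\&  use Paranoid::Input;
\&
\&  FSZLIMIT = 64 * 1024;    # allow files up to 64KB to be read
\&
\&  my @lines;
\&  if (slurp("/tmp/example.conf", \e@lines, 1)) {    # hypothetical path
\&      print scalar(@lines), " chomped lines read\en";
\&  } else {
\&      # @lines may still hold partial content if FSZLIMIT was exceeded
\&      warn "slurp failed: ", Paranoid::ERROR(), "\en";    # assumed accessor
\&  }
.Ve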
.SS "sip"
.IX Subsection "sip"
.Vb 2
\&    $rv = sip($filename, \e@lines);
\&    $rv = sip($filename, \e@lines, 1);
.Ve
.PP
This function allows you to read a text file into memory in chunks, the 
lines of which are placed into the passed array reference.  The chunks are 
read in at up to \fB\s-1FSZLIMIT\s0\fR in size at a time.  Like \fBslurp\fR, file locking
is used and autochomping is also supported.
.PP
This function returns true if there was input read, but if any or all of the
input splits into lines greater than \fB\s-1LNSZLIMIT\s0\fR it will discard that input
and return \-1 (which is still technically boolean true).
.PP
We care about line lengths here because it's very likely that line boundaries
will not fall neatly along our chunk boundaries, so we need to take trailing
portions of unterminated lines and store them to be joined with the remainder
from the next sip.
.PP
When sip reaches the end of the file it does not close the file; you're
required to close it explicitly with \fBcloseFile\fR.  This is done intentionally
to allow the process to continue to effectively \fBtail\fR a growing file.
Unlike the \fBtail\fR function provided here, though, it does not perform any
additional checks to see if the file you're reading was truncated or replaced.
.PP
An optional third argument tells sip whether or not to chomp all the read
lines before returning.
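.PP
A hedged sketch of chunked reading follows.  The log path is hypothetical.
The loop assumes that \fBsip\fR returns false once no further input is
available and that each call repopulates the array with only the newly read
lines; neither is stated explicitly on this page.
.PP
.Vb 10
\&  use Paranoid::Input;
\&
\&  my $file = "/var/log/example.log";    # hypothetical path
\&  my (@lines, $rv);
\&
\&  while ($rv = sip($file, \e@lines, 1)) {
\&      if ($rv == \-1) {
\&          # input split into lines longer than LNSZLIMIT was discarded
\&          warn "over\-long line in $file; chunk discarded\en";
\&          next;
\&      }
\&      print "$_\en" foreach @lines;
\&  }
\&  closeFile($file);    # sip never closes the file itself
.Ve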
.SS "tail"
.IX Subsection "tail"
.Vb 3
\&    $rv = tail($filename, \e@lines);
\&    $rv = tail($filename, \e@lines, \-100);
\&    $rv = tail($filename, \e@lines, \-100, 1);
.Ve
.PP
The only difference between this function and \fBsip\fR is that tail opens the
file and immediately seeks to the end.  If an optional third argument is
passed it will seek backwards to extract and return that number of lines (if
possible).  Memory use can reach \fB\s-1LNSZLIMIT\s0\fR multiplied by that number, so
choose it with that in mind.
.PP
This function returns true if the file is successfully opened, regardless of
whether any new input was there to be read.  It only returns false if there 
was a problem opening or reading the file.
.PP
Pass the third argument only on the first tail of a file.  Continuing to pass
it on subsequent calls will cause the number of lines returned to be truncated
to fit within that limit.
.PP
Like \fBsip\fR, one must explicitly close a file with \fBcloseFile\fR.
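.PP
A rough sketch of following a growing file under the rules above.  The path,
the poll count, and the sleep interval are all illustrative, and the code
assumes each call repopulates the array with only the newly read lines.
.PP
.Vb 10
\&  use Paranoid::Input;
\&
\&  my $file = "/var/log/example.log";    # hypothetical path
\&  my @lines;
\&
\&  # First call only: ask for the last 10 lines
\&  tail($file, \e@lines, \-10) or die "cannot tail $file\en";
\&  print @lines;
\&
\&  # Later calls omit the line count and pick up new input
\&  for (1 .. 3) {
\&      sleep 5;
\&      tail($file, \e@lines) or last;
\&      print @lines;
\&  }
\&  closeFile($file);    # tail never closes the file itself
.Ve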
.SS "closeFile"
.IX Subsection "closeFile"
.Vb 1
\&  $rv = closeFile($filename);
.Ve
.PP
This function closes any open file descriptors that may have been opened via
\&\fBsip\fR or \fBtail\fR for the named file.  This returns the value of the \fBclose\fR
function if the file was open, otherwise it returns true.
.SS "addTaintRegex"
.IX Subsection "addTaintRegex"
.Vb 1
\&  addTaintRegex("telephone", qr/\e(\ed{3}\e)\es+\ed{3}\-\ed{4}/);
.Ve
.PP
This adds a regular expression which can be used by name to detaint user input
via the \fBdetaint\fR function.  This allows you to overwrite the internally
provided regexes as well as your own.
.SS "detaint"
.IX Subsection "detaint"
.Vb 1
\&  $rv = detaint($userInput, "login", \e$val);
.Ve
.PP
This function populates the passed reference with the detainted input from the
first argument.  The second argument specifies the type of data in the first
argument, and is used to validate the input before detainting.  The following
data types are currently known:
.PP
.Vb 10
\&  alphabetic            ^([a\-zA\-Z]+)$
\&  alphanumeric          ^([a\-zA\-Z0\-9]+)$
\&  email                 ^([a\-zA\-Z][\ew\e.\e\-]*\e@
\&                        (?:[a\-zA\-Z0\-9][a\-zA\-Z0\-9\e\-]*\e.)*
\&                        [a\-zA\-Z0\-9]+)$
\&  filename              ^[/ \ew\e\-\e.:,@\e+]+\e[?$
\&  fileglob              ^[/ \ew\e\-\e.:,@\e+\e*\e?\e{\e}\e[\e]]+\e[?$
\&  hostname              ^((?:[a\-zA\-Z0\-9][a\-zA\-Z0\-9\e\-]*\e.)*
\&                        [a\-zA\-Z0\-9]+)$
\&  ipaddr                ^(?:\ed+\e.){3}\ed+$
\&  netaddr               ^(?:\ed+\e.){3}\ed+(?:/(?:\ed+|
\&                        (?:\ed+\e.){3}\ed+))?$
\&  login                 ^([a\-zA\-Z][\ew\e.\e\-]*)$
\&  nometa                ^([^\e\`\e$\e!\e@]+)$
\&  number                ^([+\e\-]?[0\-9]+(?:\e.[0\-9]+)?)$
.Ve
.PP
If the first argument fails to match against these regular expressions the
function will return 0.  If the string passed is either undefined or a
zero-length string it will also return 0.  And finally, if you attempt to use
an unknown (or unregistered) data type it will also return 0, and log an error
message in \fBParanoid::ERROR\fR.
.PP
\&\fB\s-1NOTE\s0\fR:  This is a small alteration from previous behavior.  In previous
versions, if an undef or zero-length string was passed, or if the data type was
unknown, the code would croak.  That was, perhaps, a tad overzealous on my
part.
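.PP
For illustration, a short sketch of validating a value before use.  The input
string is made up; it matches the \fBlogin\fR pattern listed above.
.PP
.Vb 10
\&  use Paranoid::Input;
\&
\&  my $userInput = "jdoe";    # hypothetical user\-supplied value
\&  my $login;
\&
\&  if (detaint($userInput, "login", \e$login)) {
\&      print "accepted login: $login\en";
\&  } else {
\&      # no match, empty or undefined input, or an unknown data type
\&      warn "input rejected as a login name\en";
\&  }
.Ve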
.SS "stringMatch"
.IX Subsection "stringMatch"
.Vb 1
\&  $rv = stringMatch($input, @strings);
.Ve
.PP
This function does a multiline case insensitive regex match against the 
input for every string passed for matching.  This does safe quoted matches 
(\eQ$string\eE) for all the strings, unless the string is a perl Regexp 
(defined with qr//) or begins and ends with /.
.PP
\&\fB\s-1NOTE\s0\fR: this performs a study on the input in the hope that matching against
a large number of regexes will be faster.  This may not always be the case.
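.PP
A small sketch mixing a plain string with a compiled regex.  The sample input
is invented, and the code assumes the return value is true when at least one
of the supplied strings matches, which this page does not spell out.
.PP
.Vb 8
\&  use Paranoid::Input;
\&
\&  my $input = "Warning: disk nearly full\enERROR: write failed\en";
\&
\&  # Plain strings are matched quoted; qr// objects are used as regexes
\&  if (stringMatch($input, "write failed", qr/time[d ]?out/)) {
\&      print "input matched at least one pattern\en";
\&  }
.Ve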
.SH "DEPENDENCIES"
.IX Header "DEPENDENCIES"
.IP "o" 4
.IX Item "o"
Fcntl
.IP "o" 4
.IX Item "o"
Paranoid
.IP "o" 4
.IX Item "o"
Paranoid::Debug
.SH "BUGS AND LIMITATIONS"
.IX Header "BUGS AND LIMITATIONS"
If you fork a process that's already opened a file with \fBsip\fR or \fBtail\fR, a
new file descriptor will be opened for the child process.  But what may be
less obvious is that with a newly opened file descriptor you will be starting
back from the beginning (or end, in the case of \fBtail\fR) of the file, rather
than from wherever you were before the fork.
.SH "AUTHOR"
.IX Header "AUTHOR"
Arthur Corliss (corliss@digitalmages.com)
.SH "LICENSE AND COPYRIGHT"
.IX Header "LICENSE AND COPYRIGHT"
This software is licensed under the same terms as Perl itself.
Please see http://dev.perl.org/licenses/ for more information.
.PP
(c) 2005, Arthur Corliss (corliss@digitalmages.com)