.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.42)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "HTML::HTML5::Entities 3pm"
.TH HTML::HTML5::Entities 3pm "2022-10-13" "perl v5.34.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
HTML::HTML5::Entities \- drop\-in replacement for HTML::Entities
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\& use HTML::HTML5::Entities;
\&
\& my $enc = encode_entities(\*(Aqfish & chips\*(Aq);
\& print "$enc\en"; # fish &amp; chips
\&
\& my $dec = decode_entities($enc);
\& print "$dec\en"; # fish & chips
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This is a drop-in replacement for HTML::Entities, providing the character
entities defined in \s-1HTML5.\s0 Some caveats:
.IP "\(bu" 4
The implementation is pure Perl, and hence slower in some cases, especially
when decoding.
.IP "\(bu" 4
It will not work in Perl < 5.8.1.
.SS "Functions"
.IX Subsection "Functions"
.ie n .IP """decode_entities($string, ...)""" 4
.el .IP "\f(CWdecode_entities($string, ...)\fR" 4
.IX Item "decode_entities($string, ...)"
This routine replaces \s-1HTML\s0 entities found in the \f(CW$string\fR with the
corresponding Unicode character. If multiple strings are provided as
arguments, each is decoded separately and the same number of strings is
returned.
.Sp
If called in void context the arguments are decoded in-place.
.Sp
This routine is exported by default.
.ie n .IP """_decode_entities($string, \e%entity2char)""" 4
.el .IP "\f(CW_decode_entities($string, \e%entity2char)\fR" 4
.IX Item "_decode_entities($string, %entity2char)"
.PD 0
.ie n .IP """_decode_entities($string, \e%entity2char, $expand_prefix)""" 4
.el .IP "\f(CW_decode_entities($string, \e%entity2char, $expand_prefix)\fR" 4
.IX Item "_decode_entities($string, %entity2char, $expand_prefix)"
.PD
This will in-place replace \s-1HTML\s0 entities in \f(CW$string\fR.
The \&\f(CW%entity2char\fR hash must be provided. Named entities not found in the
\&\f(CW%entity2char\fR hash are left alone. Numeric entities are always expanded.
.Sp
If \f(CW$expand_prefix\fR is \s-1TRUE\s0 then entities without trailing \*(L";\*(R" in
\&\f(CW%entity2char\fR will even be expanded as a prefix of a longer unrecognized name.
.Sp
.Vb 3
\& $string = "foo&nbspbar";
\& _decode_entities($string, { nb => "@", nbsp => "\exA0" }, 1);
\& print $string; # will print "foo bar"
.Ve
.Sp
This routine is exported by default.
.ie n .IP """encode_entities($string)""" 4
.el .IP "\f(CWencode_entities($string)\fR" 4
.IX Item "encode_entities($string)"
.PD 0
.ie n .IP """encode_entities($string, $unsafe_chars)""" 4
.el .IP "\f(CWencode_entities($string, $unsafe_chars)\fR" 4
.IX Item "encode_entities($string, $unsafe_chars)"
.PD
This routine replaces unsafe characters in \f(CW$string\fR with their entity
representation. A second argument can be given to specify which characters
to consider unsafe (i.e., which to escape). This may be a regular expression.
.Sp
If called in void context the string is encoded in-place.
.Sp
This routine is exported by default.
.ie n .IP """encode_entities_numeric($string)""" 4
.el .IP "\f(CWencode_entities_numeric($string)\fR" 4
.IX Item "encode_entities_numeric($string)"
This routine works just like encode_entities, except that the replacement
entities are always numeric.
.Sp
This routine is not exported by default.
.ie n .IP """num_entity($string)""" 4
.el .IP "\f(CWnum_entity($string)\fR" 4
.IX Item "num_entity($string)"
Given a single-character string, encodes it as a numeric entity.
.Sp
This routine is not exported by default.
.PP
The following functions cannot be exported. They behave the same as the
exportable functions.
.ie n .IP """HTML::Entities::decode($string, ...)""" 4
.el .IP "\f(CWHTML::Entities::decode($string, ...)\fR" 4
.IX Item "HTML::Entities::decode($string, ...)"
.PD 0
.ie n .IP """HTML::Entities::encode($string)""" 4
.el .IP "\f(CWHTML::Entities::encode($string)\fR" 4
.IX Item "HTML::Entities::encode($string)"
.ie n .IP """HTML::Entities::encode($string, $unsafe_characters)""" 4
.el .IP "\f(CWHTML::Entities::encode($string, $unsafe_characters)\fR" 4
.IX Item "HTML::Entities::encode($string, $unsafe_characters)"
.ie n .IP """HTML::Entities::encode_numeric($string)""" 4
.el .IP "\f(CWHTML::Entities::encode_numeric($string)\fR" 4
.IX Item "HTML::Entities::encode_numeric($string)"
.ie n .IP """HTML::Entities::encode_numeric($string, $unsafe_characters)""" 4
.el .IP "\f(CWHTML::Entities::encode_numeric($string, $unsafe_characters)\fR" 4
.IX Item "HTML::Entities::encode_numeric($string, $unsafe_characters)"
.ie n .IP """HTML::Entities::encode_numerically($string)""" 4
.el .IP "\f(CWHTML::Entities::encode_numerically($string)\fR" 4
.IX Item "HTML::Entities::encode_numerically($string)"
.ie n .IP """HTML::Entities::encode_numerically($string, $unsafe_characters)""" 4
.el .IP "\f(CWHTML::Entities::encode_numerically($string, $unsafe_characters)\fR" 4
.IX Item "HTML::Entities::encode_numerically($string, $unsafe_characters)"
.PD
.SS "Variables"
.IX Subsection "Variables"
.ie n .IP "$HTML::HTML5::Entities::hex" 4
.el .IP "\f(CW$HTML::HTML5::Entities::hex\fR" 4
.IX Item "$HTML::HTML5::Entities::hex"
This variable controls whether numeric entities will use hexadecimal or
decimal notation. It is \s-1TRUE\s0 (hexadecimal) by default, but can be set to
\&\s-1FALSE.\s0
.Sp
It only affects the encoding functions. Decoding always understands both
notations.
.ie n .IP "%HTML::HTML5::Entities::char2entity" 4
.el .IP "\f(CW%HTML::HTML5::Entities::char2entity\fR" 4
.IX Item "%HTML::HTML5::Entities::char2entity"
.PD 0
.ie n .IP "%HTML::HTML5::Entities::entity2char" 4
.el .IP "\f(CW%HTML::HTML5::Entities::entity2char\fR" 4
.IX Item "%HTML::HTML5::Entities::entity2char"
.PD
These contain the mappings from all characters to the corresponding entities
(and vice versa, respectively). These variables may be exported.
.Sp
Note that \f(CW%char2entity\fR is a more conservative set of mappings, intended
to be safe for serialising strings to \s-1HTML4, HTML5\s0 and \s-1XHTML 1\s0.x. And for
hysterical raisins, \f(CW%entity2char\fR does not include the leading ampersands,
while \f(CW%char2entity\fR does.
.SH "BUGS"
.IX Header "BUGS"
Please report any bugs to .
.SH "SEE ALSO"
.IX Header "SEE ALSO"
HTML::Entities, HTML::HTML5::Parser, HTML::HTML5::Writer.
.SH "AUTHOR"
.IX Header "AUTHOR"
Toby Inkster .
.SH "COPYRIGHT AND LICENCE"
.IX Header "COPYRIGHT AND LICENCE"
.SS "Encoding and Decoding Functions"
.IX Subsection "Encoding and Decoding Functions"
Copyright (c) 1995\-2006 by Gisle Aas.
.PP
Copyright (c) 2012 by Toby Inkster.
.PP
This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.
.SS "Entity Tables"
.IX Subsection "Entity Tables"
Copyright (c) 2004\-2007 by Apple Computer Inc, Mozilla Foundation, and
Opera Software \s-1ASA.\s0
.PP
Copyright (c) 2007\-2011 by Wakaba .
.PP
Copyright (c) 2009\-2012 by Toby Inkster .
.SH "DISCLAIMER OF WARRANTIES"
.IX Header "DISCLAIMER OF WARRANTIES"
\&\s-1THIS PACKAGE IS PROVIDED \*(L"AS IS\*(R" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.\s0