.\" Automatically generated by Pod::Man 4.10 (Pod::Simple 3.35) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is >0, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{\ . if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{\ . nr % 0 . nr F 2 . \} . 
\}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Data::StreamDeserializer 3pm"
.TH Data::StreamDeserializer 3pm "2018-11-01" "perl v5.28.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Data::StreamDeserializer \- non\-blocking deserializer.
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 2
\& my $sr = new Data::StreamDeserializer
\&     data => $very_big_dump;
\&
\& ... somewhere
\&
\& unless($sr\->next) {
\&     # deserialization hasn\*(Aqt been done yet
\& }
\&
\& ...
\&
\& if ($sr\->next) {
\&     # deserialization has been done
\&
\&     ...
\&     if ($sr\->is_error) {
\&         printf "%s\en", $sr\->error;
\&         printf "Unparsed string tail: %s\en", $sr\->tail;
\&     }
\&
\&     my $result = $sr\->result;             # first deserialized object
\&     my $result = $sr\->result(\*(Aqfirst\*(Aq);  # the same
\&
\&     my $results = $sr\->result(\*(Aqall\*(Aq);   # all deserialized objects
\&                                           # (ARRAYREF)
\& }
\&
\&
\& # stream deserializer
\& $sr = new Data::StreamDeserializer;
\&
\& while(defined (my $block = read_next_data_block)) {
\&     $sr\->next($block);
\&     ...
\& }
\& $sr\->next(undef);  # eof signal
\& until ($sr\->next) {
\&     ... do something
\& }
\& # all data were parsed
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Sometimes you need to deserialize a lot of data. Doing it with 'eval'
(or Safe\->reval, etc.) can take too much time, and if your code runs
inside an event loop such blocking can be unacceptable. Using this
module you can deserialize a stream progressively and do something else
between deserialization iterations.
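.PP
The chunked, event\-loop\-friendly usage described above can be sketched
as follows. This is a minimal illustration, assuming the module is
installed; \fBdo_other_work\fR is a hypothetical placeholder for one
iteration of your event loop.
.PP
```perl
use strict;
use warnings;
use Data::StreamDeserializer;

# The dump might arrive from a socket; here it is a literal string.
my $dump = "[ 1, 2, { 'key' => 'value' } ]";

# A small block_size keeps each next() call short.
my $dsr = Data::StreamDeserializer->new(data => $dump, block_size => 8);

# next() parses at most block_size bytes per call and returns TRUE
# once all data are parsed or an error is found, so other work can be
# interleaved between calls.
until ($dsr->next) {
    # do_other_work();   # hypothetical: run one event-loop iteration
}

if ($dsr->is_error) {
    printf "error: %s\n", $dsr->error;
} else {
    my $result = $dsr->result;   # arrayref: [ 1, 2, { key => 'value' } ]
}
```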
.SS "Recognized statements"
.IX Subsection "Recognized statements"
\fI\s-1HASHES\s0\fR
.IX Subsection "HASHES"
.PP
.Vb 1
\& { something }
.Ve
.PP
\fI\s-1ARRAYS\s0\fR
.IX Subsection "ARRAYS"
.PP
.Vb 1
\& [ something ]
.Ve
.PP
\fI\s-1REFS\s0\fR
.IX Subsection "REFS"
.PP
.Vb 3
\& \e something
\& \e[ ARRAY ]
\& \e{ HASH }
.Ve
.PP
\fIRegexps\fR
.IX Subsection "Regexps"
.PP
.Vb 1
\& qr{something}
.Ve
.PP
\fI\s-1SCALARS\s0\fR
.IX Subsection "SCALARS"
.PP
.Vb 4
\& "something"
\& \*(Aqsomething\*(Aq
\& q{something}
\& qq{something}
.Ve
.SH "METHODS"
.IX Header "METHODS"
.SS "new"
.IX Subsection "new"
Creates a new deserializer. It accepts a few named arguments:
.PP
\fIblock_size\fR
.IX Subsection "block_size"
.PP
The number of bytes that will be parsed in each 'next' cycle. The
default value is 512 bytes.
.PP
\fIdata\fR
.IX Subsection "data"
.PP
If all the data to deserialize are available before the object is
constructed, you can pass them via this argument.
.PP
\&\fB\s-1NOTE\s0\fR: you must not call part, or next with arguments, if
you used this argument.
.SS "block_size"
.IX Subsection "block_size"
Sets/gets the field of the same name.
.SS "part"
.IX Subsection "part"
Appends a part of the input data to deserialize. If there is no
argument (or the argument is \fBundef\fR), the deserializer will know
that no more data will follow.
.SS "next"
.IX Subsection "next"
Parses the next block_size bytes. Returns \fB\s-1TRUE\s0\fR if an error
was detected or all input data were parsed.
.SS "next_object"
.IX Subsection "next_object"
The same as next, but returns \fBtrue\fR after a new object has been
found. Drops previous results.
.PP
For example, if you have the string:
.PP
.Vb 1
\& $str = "1, 2, [ 0, 1 ], { \*(Aqa\*(Aq => \*(Aqb\*(Aq }";
.Ve
.PP
You can extract the objects one by one:
.PP
.Vb 1
\& my $dsr = new Data::StreamDeserializer data => $str;
\&
\& 1 until $dsr\->next_object;
\& my $first = $dsr\->result;   # scalar: 1
\&
\& 1 until $dsr\->next_object;
\& my $second = $dsr\->result;  # scalar: 2
\&
\& 1 until $dsr\->next_object;
\& my $third = $dsr\->result;   # arrayref: [ 0, 1 ]
\&
\& 1 until $dsr\->next_object;
\& my $fourth = $dsr\->result;  # hashref: { \*(Aqa\*(Aq => \*(Aqb\*(Aq }
.Ve
.SS "skip_divider"
.IX Subsection "skip_divider"
If you have a string:
.PP
.Vb 1
\& Object Object Object
.Ve
.PP
(there are no dividers between the objects), you can call skip_divider
after fetching each object.
.PP
Example:
.PP
.Vb 1
\& $str = "1 2 [ 0, 1 ]{ \*(Aqa\*(Aq => \*(Aqb\*(Aq }";
\&
\& my $dsr = new Data::StreamDeserializer data => $str;
\&
\& 1 until $dsr\->next_object;
\& my $first = $dsr\->result;   # scalar: 1
\&
\& $dsr\->skip_divider;
\&
\& 1 until $dsr\->next_object;
\& my $second = $dsr\->result;  # scalar: 2
\&
\& $dsr\->skip_divider;
\& 1 until $dsr\->next_object;
\& my $third = $dsr\->result;   # arrayref: [ 0, 1 ]
.Ve
.PP
\&\fBImportant\fR: you can't skip dividers inside a nested object. The
function will croak if you call it at a point that isn't between
objects.
.SS "is_error"
.IX Subsection "is_error"
Returns \fB\s-1TRUE\s0\fR if an error was detected.
.SS "error"
.IX Subsection "error"
Returns the error string.
.SS "tail"
.IX Subsection "tail"
Returns the unparsed data.
.SS "result"
.IX Subsection "result"
Returns the result of parsing. By default the function returns only the
first parsed object.
.PP
You can call the function with the argument \fB'all'\fR to get all
parsed objects. In this case the function returns an
\&\fB\s-1ARRAYREF\s0\fR.
.SS "is_done"
.IX Subsection "is_done"
Returns \fB\s-1TRUE\s0\fR if all input data were processed or an error
was found.
If you haven't called part without arguments, and haven't called next or
next_object with \fBundef\fR, the function can return \fB\s-1TRUE\s0\fR
only if an error occurred.
.SH "PRIVATE METHODS"
.IX Header "PRIVATE METHODS"
.SS "_push_error"
.IX Subsection "_push_error"
Pushes an error onto the deserializer's error stack.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Data::StreamSerializer
.SH "BENCHMARKS"
.IX Header "BENCHMARKS"
This module is written almost entirely in \s-1XS/C,\s0 so depending on
the block size it works somewhat faster or slower than CORE::eval.
.PP
You can try a few scripts in the \fBbenchmark/\fR directory. There are a
few test arrays in this directory.
.PP
Here are a few test results from my system.
.SS "Array which contains 100 hashes:"
.IX Subsection "Array which contains 100 hashes:"
It works faster than \fBeval\fR:
.PP
.Vb 5
\& $ perl benchmark/ds_vs_eval.pl \-n 1000 \-b 512 benchmark/tests/01_100x10
\& 38296 bytes were read
\& First deserializing by eval... done
\& First deserializing by Data::DeSerializer... done
\& Check if deserialized objects are same... done
\&
\& Starting 1000 iterations for eval... done (3.755 seconds)
\& Starting 1000 iterations for Data::StreamDeserializer...
done (3.059 seconds)
\&
\& Eval statistic:
\&     1000 iterations were done
\&     maximum deserialization time: 0.0041 seconds
\&     minimum deserialization time: 0.0035 seconds
\&     average deserialization time: 0.0036 seconds
\&
\& StreamDeserializer statistic:
\&     1000 iterations were done
\&     75000 SUBiterations were done
\&     512 bytes in one block in one iteration
\&     maximum deserialization time: 0.0045 seconds
\&     minimum deserialization time: 0.0028 seconds
\&     average deserialization time: 0.0029 seconds
\&     average subiteration time: 0.00004 seconds
.Ve
.SS "Array which contains 1000 hashes:"
.IX Subsection "Array which contains 1000 hashes:"
It works more slowly than \fBeval\fR:
.PP
.Vb 5
\& $ perl benchmark/ds_vs_eval.pl \-n 1000 \-b 512 benchmark/tests/02_1000x10
\& 355623 bytes were read
\& First deserializing by eval... done
\& First deserializing by Data::DeSerializer... done
\& Check if deserialized objects are same... done
\&
\& Starting 1000 iterations for eval... done (43.920 seconds)
\& Starting 1000 iterations for Data::StreamDeserializer... done (71.668 seconds)
\&
\& Eval statistic:
\&     1000 iterations were done
\&     maximum deserialization time: 0.0490 seconds
\&     minimum deserialization time: 0.0416 seconds
\&     average deserialization time: 0.0426 seconds
\&
\& StreamDeserializer statistic:
\&     1000 iterations were done
\&     689000 SUBiterations were done
\&     512 bytes in one block in one iteration
\&     maximum deserialization time: 0.0773 seconds
\&     minimum deserialization time: 0.0656 seconds
\&     average deserialization time: 0.0690 seconds
\&     average subiteration time: 0.00010 seconds
.Ve
.PP
As you can see, one block is parsed in a very short time, so you can
increase the block_size value to reduce the total parsing time.
.PP
If \fBblock_size\fR is equal to the string size, the module works twice
as fast as eval:
.PP
.Vb 5
\& $ perl benchmark/ds_vs_eval.pl \-n 1000 \-b 355623 benchmark/tests/02_1000x10
\& 355623 bytes were read
\& First deserializing by eval...
done \& First deserializing by Data::DeSerializer... done \& Check if deserialized objects are same... done \& \& Starting 1000 iterations for eval... done (44.456 seconds) \& Starting 1000 iterations for Data::StreamDeserializer... done (19.702 seconds) \& \& Eval statistic: \& 1000 iterations were done \& maximum deserialization time: 0.0474 seconds \& minimum deserialization time: 0.0423 seconds \& average deserialization time: 0.0431 seconds \& \& StreamDeserializer statistic: \& 1000 iterations were done \& 1000 SUBiterations were done \& 355623 bytes in one block in one iteration \& maximum deserialization time: 0.0179 seconds \& minimum deserialization time: 0.0168 seconds \& average deserialization time: 0.0171 seconds \& average subiteration time: 0.01705 seconds .Ve .SH "AUTHOR" .IX Header "AUTHOR" Dmitry E. Oboukhov, .SH "COPYRIGHT AND LICENSE" .IX Header "COPYRIGHT AND LICENSE" Copyright (C) 2011 by Dmitry E. Oboukhov .PP This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available. .SH "VCS" .IX Header "VCS" The project is placed in my git repo. See here: