.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "DBD::SQLite::Cookbook 3pm"
.TH DBD::SQLite::Cookbook 3pm 2024-01-10 "perl v5.38.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
DBD::SQLite::Cookbook \- The DBD::SQLite Cookbook
.SH DESCRIPTION
.IX Header "DESCRIPTION"
This is the DBD::SQLite cookbook.
.PP
It is intended to provide a place to keep a variety of functions and
formulas for use in callback APIs in DBD::SQLite.
.SH "AGGREGATE FUNCTIONS"
.IX Header "AGGREGATE FUNCTIONS"
.SS Variance
.IX Subsection "Variance"
This is a simple aggregate function which returns the sample variance of
its input values. It is adapted from an example implementation in pysqlite.
.PP
.Vb 1
\&    package variance;
\&
\&    sub new { bless [], shift; }
\&
\&    sub step {
\&        my ( $self, $value ) = @_;
\&
\&        push @$self, $value;
\&    }
\&
\&    sub finalize {
\&        my $self = $_[0];
\&
\&        my $n = @$self;
\&
\&        # Variance is NULL unless there is more than one row
\&        return undef unless $n > 1;
\&
\&        my $mu = 0;
\&        foreach my $v ( @$self ) {
\&            $mu += $v;
\&        }
\&        $mu /= $n;
\&
\&        my $sigma = 0;
\&        foreach my $v ( @$self ) {
\&            $sigma += ($v \- $mu)**2;
\&        }
\&        $sigma = $sigma / ($n \- 1);
\&
\&        return $sigma;
\&    }
\&
\&    # NOTE: If you use an older DBI (< 1.608),
\&    # use $dbh\->func(..., "create_aggregate") instead.
\&    $dbh\->sqlite_create_aggregate( "variance", 1, \*(Aqvariance\*(Aq );
.Ve
.PP
The function can then be used as:
.PP
.Vb 3
\&    SELECT group_name, variance(score)
\&    FROM results
\&    GROUP BY group_name;
.Ve
.SS "Variance (Memory Efficient)"
.IX Subsection "Variance (Memory Efficient)"
A more efficient variance function, optimized for memory usage at the
expense of precision:
.PP
.Vb 1
\&    package variance2;
\&
\&    sub new { bless {sum => 0, count=>0, hash=> {} }, shift; }
\&
\&    sub step {
\&        my ( $self, $value ) = @_;
\&        my $hash = $self\->{hash};
\&
\&        # by truncating and hashing, we can consume many more data points
\&        $value = int($value); # change depending on need for precision
\&                              # use sprintf for arbitrary fp precision
\&        if (exists $hash\->{$value}) {
\&            $hash\->{$value}++;
\&        } else {
\&            $hash\->{$value} = 1;
\&        }
\&        $self\->{sum} += $value;
\&        $self\->{count}++;
\&    }
\&
\&    sub finalize {
\&        my $self = $_[0];
\&
\&        # Variance is NULL unless there is more than one row
\&        return undef unless $self\->{count} > 1;
\&
\&        # calculate avg
\&        my $mu = $self\->{sum} / $self\->{count};
\&
\&        my $sigma = 0;
\&        while (my ($h, $v) = each %{$self\->{hash}}) {
\&            $sigma += (($h \- $mu)**2) * $v;
\&        }
\&        $sigma = $sigma / ($self\->{count} \- 1);
\&
\&        return $sigma;
\&    }
.Ve
.PP
The function can then be used as:
.PP
.Vb 3
\&    SELECT group_name, variance2(score)
\&    FROM results
\&    GROUP BY group_name;
.Ve
.SS "Variance (Highly Scalable)"
.IX Subsection "Variance (Highly Scalable)"
A third variance implementation, designed for arbitrarily large data sets:
.PP
.Vb 1
\&    package variance3;
\&
\&    sub new { bless {mu=>0, count=>0, S=>0}, shift; }
\&
\&    sub step {
\&        my ( $self, $value ) = @_;
\&        $self\->{count}++;
\&        my $delta = $value \- $self\->{mu};
\&        $self\->{mu} += $delta/$self\->{count};
\&        $self\->{S} += $delta*($value \- $self\->{mu});
\&    }
\&
\&    sub finalize {
\&        my $self = $_[0];
\&
\&        # Variance is NULL unless there is more than one row
\&        return undef unless $self\->{count} > 1;
\&
\&        return $self\->{S} / ($self\->{count} \- 1);
\&    }
.Ve
.PP
The function can then be used as:
.PP
.Vb 3
\&    SELECT group_name, variance3(score)
\&    FROM results
\&    GROUP BY group_name;
.Ve
.SH SUPPORT
.IX Header "SUPPORT"
Bugs should be reported via the CPAN bug tracker at
.PP
.SH "TO DO"
.IX Header "TO DO"
.IP \(bu 4
Add more and varied cookbook recipes, until we have enough to
turn them into a separate CPAN distribution.
.IP \(bu 4
Create a series of test scripts that validate the cookbook recipes.
.SH AUTHOR
.IX Header "AUTHOR"
Adam Kennedy
.SH COPYRIGHT
.IX Header "COPYRIGHT"
Copyright 2009 \- 2012 Adam Kennedy.
.PP
This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.
.PP
The full text of the license can be found in the
LICENSE file included with this module.