.\" Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.if !\nF .nr F 0
.if \nF>0 \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    if !\nF==2 \{\
.        nr % 0
.        nr F 2
.    \}
.\}
.\" ========================================================================
.\"
.IX Title "Spreadsheet::ParseXLSX 3pm"
.TH Spreadsheet::ParseXLSX 3pm "2018-04-26" "perl v5.26.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Spreadsheet::ParseXLSX \- parse XLSX files
.SH "VERSION"
.IX Header "VERSION"
version 0.27
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  use Spreadsheet::ParseXLSX;
\&
\&  my $parser = Spreadsheet::ParseXLSX\->new;
\&  my $workbook = $parser\->parse("file.xlsx");
\&  # see Spreadsheet::ParseExcel for further documentation
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This module is an adaptor for Spreadsheet::ParseExcel that reads \s-1XLSX\s0 files.
For documentation about the various data that you can retrieve from these
classes, please see Spreadsheet::ParseExcel,
Spreadsheet::ParseExcel::Workbook, Spreadsheet::ParseExcel::Worksheet,
and Spreadsheet::ParseExcel::Cell.
.SH "METHODS"
.IX Header "METHODS"
.SS "new(%opts)"
.IX Subsection "new(%opts)"
Returns a new parser instance. Takes a hash of parameters:
.IP "Password" 4
.IX Item "Password"
Password to use for decrypting encrypted files.
.ie n .SS "parse($file, $formatter)"
.el .SS "parse($file, \f(CW$formatter\fP)"
.IX Subsection "parse($file, $formatter)"
Parses an \s-1XLSX\s0 file. Parsing errors throw an exception. \f(CW$file\fR can be
either a filename or an open filehandle. Returns a
Spreadsheet::ParseExcel::Workbook instance containing the parsed data.
The \f(CW$formatter\fR argument is an optional formatter class as described in
Spreadsheet::ParseExcel.
.SH "INCOMPATIBILITIES"
.IX Header "INCOMPATIBILITIES"
This module returns data using classes from Spreadsheet::ParseExcel, so for
the most part, it should just be a drop-in replacement. That said, there are a
couple of areas where the data returned is intentionally different:
.IP "Colors" 4
.IX Item "Colors"
In Spreadsheet::ParseExcel, colors are represented by integers which index
into the color table, and you have to use
\&\f(CW\*(C`Spreadsheet::ParseExcel\->ColorIdxToRGB\*(C'\fR in order to get the
actual value out.
In Spreadsheet::ParseXLSX, while the color table still exists, cells are also
allowed to specify their color directly rather than going through the color
table. In order to avoid confusion, I normalize all color values in
Spreadsheet::ParseXLSX to their string \s-1RGB\s0 format (\f(CW"#0088ff"\fR). This
affects the \f(CW\*(C`Fill\*(C'\fR, \f(CW\*(C`BdrColor\*(C'\fR, and \f(CW\*(C`BdrDiag\*(C'\fR properties of formats, and the
\&\f(CW\*(C`Color\*(C'\fR property of fonts. Note that the default color is represented by
\&\f(CW\*(C`undef\*(C'\fR (the same thing that \f(CW\*(C`ColorIdxToRGB\*(C'\fR would return).
.IP "Formulas" 4
.IX Item "Formulas"
Spreadsheet::ParseExcel doesn't support formulas. Spreadsheet::ParseXLSX
provides basic formula support by returning the text of the formula as part of
the cell data. You can access it via \f(CW\*(C`$cell\->{Formula}\*(C'\fR. Note that
formula cell values are still available only if they were explicitly provided
when the spreadsheet was written.
.SH "BUGS"
.IX Header "BUGS"
.IP "Large spreadsheets may cause segfaults on perl 5.14 and earlier" 4
.IX Item "Large spreadsheets may cause segfaults on perl 5.14 and earlier"
This module internally uses XML::Twig, which makes it potentially subject to
Bug #71636 for XML-Twig: Segfault with medium-sized document
on perl versions 5.14 and below (the underlying bug with perl weak references
was fixed in perl 5.15.5). The larger and more complex the spreadsheet, the
more likely it is to be affected, but the actual size at which it segfaults is
platform-dependent. On a 64\-bit perl with 7.6 GB of memory, it was seen on
spreadsheets of about 300 MB and above. You can work around this by adding
\&\f(CWXML::Twig::_set_weakrefs(0)\fR to your code before parsing the spreadsheet,
although this may have other consequences such as memory leaks.
.ie n .IP "Worksheets without the ""dimension"" tag are not supported" 4
.el .IP "Worksheets without the \f(CWdimension\fR tag are not supported" 4
.IX Item "Worksheets without the dimension tag are not supported"
.PD 0
.IP "Intra-cell formatting is discarded" 4
.IX Item "Intra-cell formatting is discarded"
.IP "Shared formulas are not supported" 4
.IX Item "Shared formulas are not supported"
.PD
Shared formula support will require an actual formula parser and quite a bit of
custom logic, since the only thing stored in the document is the formula for
the base cell; updating the cell references in the formulas in the rest of the
cells is handled by the application. Values for these cells are still handled
properly.
.PP
In addition, there are still a few areas which are not yet implemented (the
\&\s-1XLSX\s0 spec is quite large). If you run into any of those, bug reports are quite
welcome.
.PP
Please report any bugs to GitHub Issues.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Spreadsheet::ParseExcel: The equivalent, for \s-1XLS\s0 files.
.PP
Spreadsheet::XLSX: An older, less robust and less featureful implementation.
.SH "SUPPORT"
.IX Header "SUPPORT"
You can find documentation for this module with the perldoc command.
.PP
.Vb 1
\&    perldoc Spreadsheet::ParseXLSX
.Ve
.PP
You can also look for information at:
.IP "\(bu" 4
MetaCPAN
.Sp
.IP "\(bu" 4
\&\s-1RT: CPAN\s0's request tracker
.Sp
.IP "\(bu" 4
GitHub
.Sp
.IP "\(bu" 4
\&\s-1CPAN\s0 Ratings
.Sp
.SH "SPONSORS"
.IX Header "SPONSORS"
Parts of this code were paid for by
.IP "Socialflow" 4
.IX Item "Socialflow"
.SH "AUTHOR"
.IX Header "AUTHOR"
Jesse Luehrs
.SH "COPYRIGHT AND LICENSE"
.IX Header "COPYRIGHT AND LICENSE"
This software is Copyright (c) 2016 by Jesse Luehrs.
.PP
This is free software, licensed under:
.PP
.Vb 1
\&  The MIT (X11) License
.Ve