'\" t
.\"     Title: dumppdf
.\"    Author: Jakub Wilk
.\" Generator: DocBook XSL Stylesheets v1.79.1
.\"      Date: 12/30/2018
.\"    Manual: PDFMiner Manual
.\"    Source: dumppdf
.\"  Language: English
.\"
.TH "DUMPPDF" "1" "12/30/2018" "dumppdf" "PDFMiner Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
dumppdf \- dumps internal contents of PDF files
.SH "SYNOPSIS"
.HP \w'\fBdumppdf\fR\ 'u
\fBdumppdf\fR [\fIoption\fR...] \fIfile\fR...
.SH "DESCRIPTION"
.PP
\fBdumppdf\fR
dumps the internal contents of a PDF file in pseudo\-XML format\&. This program is primarily for debugging purposes, but it\*(Aqs also possible to extract some meaningful contents, such as images\&.
.SH "OPTIONS"
.PP
\fB\-a\fR
.RS 4
Dump all the objects\&. By default, only the document trailer is printed\&.
.RE
.PP
\fB\-i \fR\fB\fIobjno[,objno,\&...]\fR\fR
.RS 4
Specifies the PDF object IDs to display\&. Either comma\-separated IDs or multiple
\fB\-i\fR
options are accepted\&.
.RE
.PP
\fB\-p \fR\fB\fIpageno\fR\fR\fB\fI[,pageno,\&...]\fR\fR
.RS 4
Specifies a comma\-separated list of the page numbers to be extracted\&. Page numbers start at one\&. By default, all pages are extracted\&.
.RE
.PP
\fB\-r\fR, \fB\-b\fR, \fB\-t\fR
.RS 4
Specifies the output format of stream contents\&. Because the contents of stream objects can be very large, they are omitted when none of the options above is specified\&.
.sp
With the
\fB\-r\fR
option, the
\(lqraw\(rq
stream contents are dumped without decompression\&. With the
\fB\-b\fR
option, the decompressed contents are dumped as a binary blob\&. With the
\fB\-t\fR
option, the decompressed contents are dumped in a text format similar to the output of
\fBrepr()\fR\&. When the
\fB\-r\fR
or
\fB\-b\fR
option is given, no stream header is displayed, which makes it easier to save the output to a file\&.
.RE
.PP
\fB\-T\fR
.RS 4
Show the table of contents\&.
.RE
.PP
\fB\-P \fR\fB\fIpassword\fR\fR
.RS 4
Provides the user password to access PDF contents\&.
.RE
.PP
\fB\-d\fR
.RS 4
Increase the debug level\&.
.RE
.SH "EXAMPLES"
.PP
Dump all the headers and contents, except stream objects:
.sp
.if n \{\
.RS 4
.\}
.nf
$ \fBdumppdf\fR \-a test\&.pdf
.fi
.if n \{\
.RE
.\}
.PP
Dump the table of contents:
.sp
.if n \{\
.RS 4
.\}
.nf
$ \fBdumppdf\fR \-T test\&.pdf
.fi
.if n \{\
.RE
.\}
.PP
Extract a JPEG image:
.sp
.if n \{\
.RS 4
.\}
.nf
$ \fBdumppdf\fR \-r \-i6 test\&.pdf > image\&.jpeg
.fi
.if n \{\
.RE
.\}
.sp
.SH "SEE ALSO"
.PP
\fBpdf2txt\fR(1)
.SH "AUTHORS"
.PP
\fBJakub Wilk\fR <\&jwilk@debian\&.org\&>
.RS 4
Wrote this manual page for the Debian system\&.
.RE
.PP
\fBYusuke Shinyama\fR <\&yusuke@cs\&.nyu\&.edu\&>
.RS 4
Author of PDFMiner and its original HTML documentation\&.
.RE