PDFTK(1) | General Commands Manual | PDFTK(1) |
NAME¶
pdftk - A handy tool for manipulating PDF
SYNOPSIS¶
pdftk <input PDF files | - | PROMPT>
[ input_pw <input PDF owner passwords | PROMPT> ]
[ <operation> <operation arguments> ]
[ output <output filename | - | PROMPT> ]
[ encrypt_40bit | encrypt_128bit ]
[ allow <permissions> ]
[ owner_pw <owner password | PROMPT> ]
[ user_pw <user password | PROMPT> ]
[ flatten ] [ compress | uncompress ]
[ keep_first_id | keep_final_id ] [ drop_xfa ]
[ verbose ] [ dont_ask | do_ask ]
<operation> may be empty, or:
[ cat | shuffle | burst |
generate_fdf | fill_form |
background | multibackground |
stamp | multistamp |
dump_data | dump_data_utf8 |
dump_data_fields | dump_data_fields_utf8 |
update_info | update_info_utf8 |
attach_files | unpack_files ]
DESCRIPTION¶
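If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a simple tool for doing everyday things with PDF documents. Use it to merge, split, rotate, encrypt, and decrypt PDFs; fill and flatten forms; apply watermarks and stamps; report and update metadata; attach and unpack files; and repair damaged PDFs where possible.
OPTIONS¶
A summary of options is included below.
- --help, -h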
- Show summary of options.
- <input PDF files | - | PROMPT>
- A list of the input PDF files. If you plan to combine these
PDFs (without using handles) then list files in the order you want them
combined. Use - to pass a single PDF into pdftk via stdin. Input
files can be associated with handles, where a handle is a single,
upper-case letter written as <handle>=<filename>, for example A=even.pdf.
- [input_pw <input PDF owner passwords | PROMPT>]
- Input PDF owner passwords, if necessary, are associated
with files by using their handles:
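As a sketch of how a handle ties a password to a file, the helper below joins an encrypted PDF with an unencrypted one. The function name and the file names in the usage comment are hypothetical; the handle and input_pw syntax follow the EXAMPLES section.

```shell
# Hypothetical helper: join a password-protected PDF with a second PDF.
# $1 = encrypted input, $2 = its owner password, $3 = second input, $4 = output.
join_with_password() {
    # Handle A labels the encrypted file so that input_pw can refer to it.
    pdftk A="$1" "$3" input_pw A="$2" cat output "$4"
}

# Usage (placeholder names):
#   join_with_password secured.pdf foopass 2.pdf 3.pdf
```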
- [<operation> <operation arguments>]
- If this optional argument is omitted, then pdftk runs in
'filter' mode. Filter mode takes only one PDF input and creates a new PDF
after applying all of the output options, like encryption and compression.
- cat [<page ranges>]
- Catenates pages from input PDFs to create a new PDF. Page
order in the new PDF is specified by the order of the given page ranges.
Page ranges are described like this:
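A minimal sketch of the range forms, not the full grammar: a range is a page number or a begin-end pair, the keyword end stands for the final page, and a leading handle (as in A1-7) selects which input the pages come from. The helper name and file names are hypothetical.

```shell
# Hypothetical helper: assemble a new PDF from ranges of two inputs.
# $1 = first input, $2 = second input, $3 = output.
combine_ranges() {
    # A1-7   : pages 1 through 7 of the file with handle A
    # B1-5   : pages 1 through 5 of the file with handle B
    # A8-end : page 8 through the final page of handle A
    pdftk A="$1" B="$2" cat A1-7 B1-5 A8-end output "$3"
}

# Usage (placeholder names):
#   combine_ranges in1.pdf in2.pdf combined.pdf
```

Pages appear in the output in the order the ranges are listed, so swapping two ranges on the command line swaps them in the result.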
- shuffle [<page ranges>]
- Collates pages from input PDFs to create a new PDF. Works like the cat operation except that it takes one page at a time from each page range to assemble the output PDF. If one range runs out of pages, it continues with the remaining ranges. Ranges can use all of the features described above for cat, like reverse page ranges, multiple ranges from a single PDF, and page rotation. This feature was designed to help collate PDF pages after scanning paper documents.
- burst
- Splits a single, input PDF document into individual pages.
Also creates a report named doc_data.txt which is the same as the
output from dump_data. If the output section is omitted,
then PDF pages are named: pg_%04d.pdf, e.g.: pg_0001.pdf, pg_0002.pdf,
etc. To name these pages yourself, supply a printf-styled format string
via the output section. For example, if you want pages named:
page_01.pdf, page_02.pdf, etc., pass output page_%02d.pdf to pdftk.
Encryption can be applied to the output by appending output options such
as owner_pw, e.g.:
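The sketch below combines the two points above: a printf-style name pattern passed via output, followed by an output option that encrypts each page. The helper name is hypothetical and the password is a placeholder.

```shell
# Hypothetical helper: split a PDF into encrypted single-page files.
# $1 = input PDF, $2 = owner password.
burst_encrypted() {
    # page_%02d.pdf is a printf-style pattern: page_01.pdf, page_02.pdf, ...
    pdftk "$1" burst output page_%02d.pdf owner_pw "$2"
}

# Usage (placeholder names):
#   burst_encrypted in.pdf foopass
```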
- generate_fdf
- Reads a single, input PDF file and generates an FDF file suitable for fill_form out of it to the given output filename or (if no output is given) to stdout. Does not create a new PDF.
- fill_form <FDF data filename | XFDF data filename | - | PROMPT>
- Fills the single input PDF's form fields with the data from
an FDF file, XFDF file or stdin. Enter the data filename after
fill_form, or use - to pass the data via stdin, like so:
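A minimal sketch of both forms, one reading the data from a file and one reading it from stdin. The helper names and file names are hypothetical.

```shell
# Hypothetical helpers; all file names are placeholders.

# Fill form fields from a data file.
# $1 = form PDF, $2 = FDF or XFDF file, $3 = output PDF.
fill_from_file() {
    pdftk "$1" fill_form "$2" output "$3"
}

# Same, but read the form data from stdin.
# $1 = form PDF, $2 = output PDF.
fill_from_stdin() {
    pdftk "$1" fill_form - output "$2"
}

# Usage:
#   fill_from_file form.pdf data.fdf form.filled.pdf
#   fill_from_stdin form.pdf form.filled.pdf < data.fdf
```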
- background <background PDF filename | - | PROMPT>
- Applies a PDF watermark to the background of a single input
PDF. Pass the background PDF's filename after background like so:
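As a sketch, the helper below wraps the invocation and accepts the operation name as a parameter, since the multibackground, stamp, and multistamp operations take their PDF filename in the same position. The helper name and file names are hypothetical.

```shell
# Hypothetical helper: apply a watermark or stamp PDF to an input PDF.
# $1 = operation, $2 = input PDF, $3 = watermark or stamp PDF, $4 = output PDF.
apply_overlay() {
    case "$1" in
        background|multibackground|stamp|multistamp) ;;
        *) echo "unsupported operation: $1" >&2; return 1 ;;
    esac
    pdftk "$2" "$1" "$3" output "$4"
}

# Usage (placeholder names):
#   apply_overlay background in.pdf back.pdf out.pdf
```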
- multibackground <background PDF filename | - | PROMPT>
- Same as the background operation, but applies each page of the background PDF to the corresponding page of the input PDF. If the input PDF has more pages than the background PDF, then the final background page is repeated across these remaining pages in the input PDF.
- stamp <stamp PDF filename | - | PROMPT>
- This behaves just like the background operation except it overlays the stamp PDF page on top of the input PDF document's pages. This works best if the stamp PDF page has a transparent background.
- multistamp <stamp PDF filename | - | PROMPT>
- Same as the stamp operation, but applies each page of the stamp PDF to the corresponding page of the input PDF. If the input PDF has more pages than the stamp PDF, then the final stamp page is repeated across these remaining pages in the input PDF.
- dump_data
- Reads a single, input PDF file and reports various statistics, metadata, bookmarks (a/k/a outlines), and page labels to the given output filename or (if no output is given) to stdout. Non-ASCII characters are encoded as XML numerical entities. Does not create a new PDF.
- dump_data_utf8
- Same as dump_data except that the output is encoded as UTF-8.
- dump_data_fields
- Reads a single, input PDF file and reports form field statistics to the given output filename or (if no output is given) to stdout. Non-ASCII characters are encoded as XML numerical entities. Does not create a new PDF.
- dump_data_fields_utf8
- Same as dump_data_fields except that the output is encoded as UTF-8.
- update_info <info data filename | - | PROMPT>
- Changes the metadata stored in a single PDF's Info
dictionary to match the input data file. The input data file uses the same
syntax as the output from dump_data. Non-ASCII characters should be
encoded as XML numerical entities. This does not change the metadata
stored in the PDF's XMP stream, if it has one. For example:
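Because the data file shares its syntax with dump_data output, one plausible workflow is to export the current metadata, edit the file by hand, and apply it back. The helper names and file names below are hypothetical.

```shell
# Hypothetical helpers; all file names are placeholders.

# Write the PDF's current metadata report to a text file.
# $1 = input PDF, $2 = data file to create.
export_info() {
    pdftk "$1" dump_data output "$2"
}

# Apply an edited data file to the Info dictionary of a new PDF.
# $1 = input PDF, $2 = edited data file, $3 = output PDF.
apply_info() {
    pdftk "$1" update_info "$2" output "$3"
}

# Usage:
#   export_info in.pdf in.info
#   (edit in.info in a text editor)
#   apply_info in.pdf in.info out.pdf
```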
- update_info_utf8 <info data filename | - | PROMPT>
- Same as update_info except that the input is encoded as UTF-8.
- attach_files <attachment filenames | PROMPT> [to_page <page number | PROMPT>]
- Packs arbitrary files into a PDF using PDF's file
attachment features. More than one attachment may be listed after
attach_files. Attachments are added at the document level unless
the optional to_page option is given, in which case the files are
attached to the given page number (the first page is 1, the final page is
end). For example:
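A minimal sketch that attaches any number of files to one page. The helper name and file names are hypothetical; the variables are global because POSIX sh has no local keyword.

```shell
# Hypothetical helper: attach files to a given page of a PDF.
# $1 = input PDF, $2 = output PDF, $3 = page number (or "end"),
# remaining arguments = files to attach.
attach_to_page() {
    src=$1
    dst=$2
    page=$3
    shift 3
    # "$@" now holds only the attachment filenames.
    pdftk "$src" attach_files "$@" to_page "$page" output "$dst"
}

# Usage (placeholder names):
#   attach_to_page in.pdf out.pdf 6 table1.html table2.html
```

Leaving out to_page and its argument attaches the files at the document level instead.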
- unpack_files
- Copies all of the attachments from the input PDF into the
current folder or to an output directory given after output. For
example:
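A minimal sketch with an explicit output directory. The helper name and paths are hypothetical, and creating the directory beforehand is an assumption: the text above does not say whether pdftk creates it.

```shell
# Hypothetical helper: extract all attachments into a directory.
# $1 = input PDF, $2 = output directory.
unpack_to_dir() {
    # Assumption: make sure the directory exists before pdftk writes to it.
    mkdir -p "$2" || return 1
    pdftk "$1" unpack_files output "$2"
}

# Usage (placeholder names):
#   unpack_to_dir report.pdf ./atts
```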
- [output <output filename | - | PROMPT>]
- The output PDF filename may not be set to the name of an input filename. Use - to output to stdout. When using the dump_data operation, use output to set the name of the output data file. When using the unpack_files operation, use output to set the name of an output directory. When using the burst operation, you can use output to control the resulting PDF page filenames (described above).
- [encrypt_40bit | encrypt_128bit]
- If an output PDF user or owner password is given, output PDF encryption strength defaults to 128 bits. This can be overridden by specifying encrypt_40bit.
- [allow <permissions>]
- Permissions are applied to the output PDF only if an
encryption strength is specified or an owner or user password is given. If
permissions are not specified, they default to 'none,' which means all of
the following features are disabled.
- Printing: Top Quality Printing
- DegradedPrinting: Lower Quality Printing
- ModifyContents: Also allows Assembly
- Assembly
- CopyContents: Also allows ScreenReaders
- ScreenReaders
- ModifyAnnotations: Also allows FillIn
- FillIn
- AllFeatures: Allows the user to perform all of the above, and top quality printing.
- [owner_pw <owner password | PROMPT>]
- [user_pw <user password | PROMPT>]
- If an encryption strength is given but no passwords are supplied, then the owner and user passwords remain empty, which means that the resulting PDF may be opened and its security parameters altered by anybody.
- [compress | uncompress]
- These are only useful when you want to edit PDF code in a text editor like vim or emacs. Remove PDF page stream compression by applying the uncompress filter. Use the compress filter to restore compression.
- [flatten]
- Use this option to merge an input PDF's interactive form fields (and their data) with the PDF's pages. Only one input PDF may be given. Sometimes used with the fill_form operation.
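As a sketch of the combination with fill_form, the helper below fills a form and flattens it in one pass, so the output carries the data as page content and not as editable fields. The helper name and file names are hypothetical.

```shell
# Hypothetical helper: fill a form and merge the fields into the pages.
# $1 = form PDF, $2 = FDF or XFDF data file, $3 = output PDF.
fill_and_flatten() {
    # flatten is an output option, so it follows the output filename.
    pdftk "$1" fill_form "$2" output "$3" flatten
}

# Usage (placeholder names):
#   fill_and_flatten form.pdf data.fdf form.flat.pdf
```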
- [keep_first_id | keep_final_id]
- When combining pages from multiple PDFs, use one of these options to copy the document ID from either the first or final input document into the new output PDF. Otherwise pdftk creates a new document ID for the output PDF. When no operation is given, pdftk always uses the ID from the (single) input PDF.
- [drop_xfa]
- If your input PDF is a form created using Acrobat 7 or
Adobe Designer, then it probably has XFA data. Filling such a form using
pdftk yields a PDF with data that fails to display in Acrobat 7 (and 6?).
The workaround solution is to remove the form's XFA data, either before
you fill the form using pdftk or at the time you fill the form. Using this
option causes pdftk to omit the XFA data from the output PDF form.
- [verbose]
- By default, pdftk runs quietly. Append verbose to the end and it will speak up.
- [dont_ask | do_ask]
- Depending on the compile-time settings (see
ASK_ABOUT_WARNINGS), pdftk might prompt you for further input when it
encounters a problem, such as a bad password. Override this default
behavior by adding dont_ask (so pdftk won't ask you what to do) or
do_ask (so pdftk will ask you what to do).
EXAMPLES¶
- Collate scanned pages
- pdftk A=even.pdf B=odd.pdf shuffle A B output collated.pdf
- Decrypt a PDF
- pdftk secured.pdf input_pw foopass output unsecured.pdf
- Encrypt a PDF using 128-bit strength (the default), withhold all permissions (the default)
- pdftk 1.pdf output 1.128.pdf owner_pw foopass
- Same as above, except password 'baz' must also be used to open output PDF
- pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz
- Same as above, except printing is allowed (once the PDF is open)
- pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing
- Join in1.pdf and in2.pdf into a new PDF, out1.pdf
- pdftk in1.pdf in2.pdf cat output out1.pdf
- Remove 'page 13' from in1.pdf to create out1.pdf
- pdftk in1.pdf cat 1-12 14-end output out1.pdf
- Apply 40-bit encryption to output, revoking all permissions (the default). Set the owner PW to 'foopass'.
- pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass
- Join two files, one of which requires the password 'foopass'. The output is not encrypted.
- pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf
- Uncompress PDF page streams for editing the PDF in a text editor (e.g., vim, emacs)
- pdftk doc.pdf output doc.unc.pdf uncompress
- Repair a PDF's corrupted XREF table and stream lengths, if possible
- pdftk broken.pdf output fixed.pdf
- Burst a single PDF document into pages and dump its data to doc_data.txt
- pdftk in.pdf burst
- Burst a single PDF document into encrypted pages. Allow low-quality printing
- pdftk in.pdf burst owner_pw foopass allow DegradedPrinting
- Write a report on PDF document metadata and bookmarks to report.txt
- pdftk in.pdf dump_data output report.txt
- Rotate the first PDF page to 90 degrees clockwise
- pdftk in.pdf cat 1E 2-end output out.pdf
- Rotate an entire PDF document to 180 degrees
- pdftk in.pdf cat 1-endS output out.pdf
NOTES¶
The pdftk home page permalink is:
AUTHOR¶
Sid Steward (sid.steward at pdflabs dot com) maintains pdftk. Please email him with questions or bug reports. Include pdftk in the subject line to ensure successful delivery. Thank you.
October 28, 2010 |