dumppdf - Man Page
dumppdf – extract PDF structure in XML format
Synopsis
dumppdf [-h] [--version] [--debug] [--extract-toc | --extract-embedded EXTRACT_EMBEDDED] [--page-numbers PAGE_NUMBERS [PAGE_NUMBERS ...]] [--pagenos PAGENOS] [--objects OBJECTS] [--all] [--password PASSWORD] [--outfile OUTFILE] [--raw-stream | --binary-stream | --text-stream] files [files ...]
Options
Positional Arguments
- files
One or more paths to PDF files.
Optional Arguments
- -h, --help
Show a help message and exit.
- --version, -v
Show program’s version number and exit.
- --debug, -d
Use debug logging level.
- --extract-toc, -T
Extract the structure of the document outline (table of contents).
- --extract-embedded EXTRACT_EMBEDDED, -E EXTRACT_EMBEDDED
Extract embedded files into the directory EXTRACT_EMBEDDED.
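The synopsis lists --extract-toc and --extract-embedded as alternatives, so a single invocation uses at most one of them. The following is a minimal Python sketch of assembling such a command line. The helper `extraction_args`, the file name `report.pdf`, and the directory name `attachments` are hypothetical placeholders, and the value passed to --extract-embedded is treated here as an output directory name. The helper only builds the argument list and does not run dumppdf.

```python
from typing import List, Optional


def extraction_args(files: List[str],
                    extract_toc: bool = False,
                    extract_embedded: Optional[str] = None) -> List[str]:
    """Build a dumppdf argument list for one of the two extraction modes.

    Hypothetical helper: it assembles arguments and never executes dumppdf.
    """
    if not files:
        raise ValueError("at least one PDF path is required")
    if extract_toc and extract_embedded is not None:
        # The synopsis shows the two options as alternatives.
        raise ValueError(
            "--extract-toc and --extract-embedded cannot be combined")
    args = ["dumppdf"]
    if extract_toc:
        args.append("--extract-toc")
    if extract_embedded is not None:
        args += ["--extract-embedded", extract_embedded]
    # Positional file paths go last.
    return args + list(files)


# Placeholder inputs, for illustration only.
toc_cmd = extraction_args(["report.pdf"], extract_toc=True)
embedded_cmd = extraction_args(["report.pdf"], extract_embedded="attachments")
```

Either list could then be handed to `subprocess.run(..., check=True)` on a system where dumppdf is installed.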
Parser
Used during PDF parsing.
- --page-numbers PAGE_NUMBERS [PAGE_NUMBERS ...]
A space-separated list of page numbers to parse.
- --pagenos PAGENOS, -p PAGENOS
A comma-separated list of page numbers to parse. Included for legacy applications; use --page-numbers for more idiomatic argument entry.
- --objects OBJECTS, -i OBJECTS
A comma-separated list of object numbers to extract.
- --all, -a
Extract the structure of all objects.
- --password PASSWORD, -P PASSWORD
The password to use for decrypting the PDF file.
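The two page-selection options differ only in how the list is written: --pagenos takes one comma-separated value, while --page-numbers takes several space-separated values. The sketch below converts a legacy value into the newer form. The helper names and `report.pdf` are hypothetical. Placing the file path before --page-numbers is a precaution taken here because the option accepts a variable number of values, not documented behaviour of dumppdf.

```python
from typing import List


def pagenos_to_page_numbers(pagenos: str) -> List[str]:
    """Convert a legacy --pagenos value such as "1,3,5" into
    the equivalent --page-numbers arguments (hypothetical helper)."""
    numbers = [part.strip() for part in pagenos.split(",") if part.strip()]
    if not numbers:
        raise ValueError("no page numbers given")
    for number in numbers:
        if not number.isdigit():
            raise ValueError(f"not a page number: {number!r}")
    return ["--page-numbers", *numbers]


def parser_command(pdf_path: str, pagenos: str) -> List[str]:
    # The file path comes first so that the variable-length
    # page list cannot absorb it.
    return ["dumppdf", pdf_path, *pagenos_to_page_numbers(pagenos)]


# Placeholder file name and page list, for illustration only.
cmd = parser_command("report.pdf", "1, 3,5")
```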
Output
Used during output generation.
- --outfile OUTFILE, -o OUTFILE
Path to the file where output is written, or “-” (default) to write to stdout.
- --raw-stream, -r
Write stream objects without encoding.
- --binary-stream, -b
Write stream objects with binary encoding.
- --text-stream, -t
Write stream objects as plain text.
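The three stream options are listed as alternatives in the synopsis, so at most one applies per run. One way to respect that when scripting is to expose a single mode parameter, as in the sketch below. The names `STREAM_FLAGS` and `output_args`, and the file names `dump.xml` and `report.pdf`, are hypothetical, and nothing here executes dumppdf.

```python
from typing import List, Optional

# Maps a short mode name (chosen for this sketch) to the documented flag.
STREAM_FLAGS = {
    "raw": "--raw-stream",
    "binary": "--binary-stream",
    "text": "--text-stream",
}


def output_args(outfile: str = "-",
                stream_mode: Optional[str] = None) -> List[str]:
    """Build the output-related arguments (hypothetical helper).

    A single stream_mode parameter means that two stream flags can
    never be emitted together.
    """
    args = ["--outfile", outfile]
    if stream_mode is not None:
        if stream_mode not in STREAM_FLAGS:
            raise ValueError(f"unknown stream mode: {stream_mode!r}")
        args.append(STREAM_FLAGS[stream_mode])
    return args


# Dump all objects, with streams as plain text, into a placeholder file.
cmd = ["dumppdf", "--all", *output_args("dump.xml", "text"), "report.pdf"]
```

Calling `output_args()` with no arguments yields `--outfile -`, which matches the documented default of writing to stdout.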