python3-html2text - Man Page

manual page for python3-html2text 2024.2.26

Description

usage: python3-html2text [-h] [--default-image-alt DEFAULT_IMAGE_ALT] [--pad-tables] [--no-wrap-links] [--wrap-list-items] [--wrap-tables] [--ignore-emphasis] [--reference-links] [--ignore-links] [--ignore-mailto-links] [--protect-links] [--ignore-images] [--images-as-html] [--images-to-alt] [--images-with-size] [-g] [-d] [-e] [-b BODY_WIDTH] [-i LIST_INDENT] [-s] [--escape-all] [--bypass-tables] [--ignore-tables] [--single-line-break] [--unicode-snob] [--no-automatic-links] [--no-skip-internal-links] [--links-after-para] [--mark-code] [--decode-errors DECODE_ERRORS] [--open-quote OPEN_QUOTE] [--close-quote CLOSE_QUOTE] [--version] [--include-sup-sub] [filename] [encoding]

positional arguments

filename encoding
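
As a minimal sketch of how the positional arguments combine with a few of the options listed on this page, the following Python snippet assembles an argument list and runs it only if the program is found on PATH and the input file exists. The helper name `build_command` and the sample file name `page.html` are hypothetical; the flags themselves are taken from the usage line above.

```python
import os
import shutil
import subprocess


def build_command(filename, encoding=None, body_width=None,
                  single_line_break=False, ignore_links=False):
    """Assemble an argv list for python3-html2text (hypothetical helper)."""
    cmd = ["python3-html2text"]
    if single_line_break:
        # The option text notes that --single-line-break requires --body-width=0.
        if body_width not in (None, 0):
            raise ValueError("--single-line-break requires --body-width=0")
        body_width = 0
        cmd.append("--single-line-break")
    if body_width is not None:
        cmd.append(f"--body-width={body_width}")
    if ignore_links:
        cmd.append("--ignore-links")
    # Positional arguments come last: filename, then the optional encoding.
    cmd.append(filename)
    if encoding is not None:
        cmd.append(encoding)
    return cmd


if __name__ == "__main__":
    cmd = build_command("page.html", encoding="utf-8", single_line_break=True)
    print(" ".join(cmd))
    # Run only when the program is installed and the sample file exists.
    if shutil.which(cmd[0]) and os.path.exists("page.html"):
        subprocess.run(cmd, check=True)
```

Building the list in one place keeps the `--single-line-break` constraint from being forgotten when the two options are set in different parts of a script.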

options

-h, --help

show this help message and exit

--default-image-alt DEFAULT_IMAGE_ALT

The default alt string for images with missing ones

--pad-tables

pad the cells to equal column width in tables

--no-wrap-links

don't wrap links during conversion

--wrap-list-items

wrap list items during conversion

--wrap-tables

wrap tables

--ignore-emphasis

don't include any formatting for emphasis

--reference-links

use reference style links instead of inline links

--ignore-links

don't include any formatting for links

--ignore-mailto-links

don't include mailto: links

--protect-links

protect links from line breaks by surrounding them with angle brackets

--ignore-images

don't include any formatting for images

--images-as-html

Always write image tags as raw html; preserves `height`, `width` and `alt` if possible.

--images-to-alt

Discard image data, only keep alt text

--images-with-size

Write image tags with height and width attrs as raw html to retain dimensions

-g, --google-doc

convert an html-exported Google Document

-d, --dash-unordered-list

use a dash rather than a star for unordered list items

-e, --asterisk-emphasis

use an asterisk rather than an underscore for emphasized text

-b, --body-width BODY_WIDTH

number of characters per output line, 0 for no wrap

-i, --google-list-indent LIST_INDENT

number of pixels Google indents nested lists

-s, --hide-strikethrough

hide strike-through text; only relevant when -g is also specified

--escape-all

Escape all special characters. Output is less readable, but avoids corner case formatting issues.

--bypass-tables

Format tables in HTML rather than Markdown syntax.

--ignore-tables

Ignore table-related tags (table, th, td, tr) while keeping rows.

--single-line-break

Use a single line break after a block element rather than two line breaks. NOTE: Requires --body-width=0

--unicode-snob

Use Unicode throughout the document

--no-automatic-links

Do not use automatic links wherever applicable

--no-skip-internal-links

Do not skip internal links

--links-after-para

Put links after each paragraph instead of at the end of the document

--mark-code

Mark program code blocks with [code]...[/code]

--decode-errors DECODE_ERRORS

What to do in case of decode errors. 'ignore', 'strict' and 'replace' are acceptable values

--open-quote OPEN_QUOTE

The character used to open quotes

--close-quote CLOSE_QUOTE

The character used to close quotes

--version

show program's version number and exit

--include-sup-sub

Include the sup and sub tags
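
The three values accepted by --decode-errors share their names with Python's standard codec error handlers. Assuming the option value is handed to `bytes.decode` unchanged (an assumption about the implementation, not something this page states), the sketch below shows how each handler treats input that is not valid in the chosen encoding. The helper name `decode_with` and the sample bytes are illustrative only.

```python
def decode_with(raw: bytes, errors: str, encoding: str = "utf-8") -> str:
    """Decode raw bytes using one of Python's codec error handlers."""
    return raw.decode(encoding, errors=errors)


# 0xE9 is "é" in Latin-1 but is not a valid UTF-8 sequence in this position.
sample = b"caf\xe9 au lait"

for handler in ("ignore", "replace", "strict"):
    try:
        print(handler, "->", repr(decode_with(sample, handler)))
    except UnicodeDecodeError as exc:
        # 'strict' refuses to continue past the first undecodable byte.
        print(handler, "-> raised UnicodeDecodeError:", exc.reason)
```

Under that assumption, 'ignore' silently drops the offending byte, 'replace' substitutes U+FFFD for it, and 'strict' stops with an error, which is the safest choice when silent data loss would be worse than a failed conversion.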

Referenced By

The man pages html2text(1) and python-html2text(1) are aliases of python3-html2text(1).

July 2024 python3-html2text 2024.2.26