unftrip - Man Page
normalize Unicode text
Synopsis
unftrip [--ascii] [--encoding=VAL] [--unicode-version] [OPTION]… [FILE]
Description
unftrip reads Unicode text from FILE, or from stdin if no file is given, and writes it to stdout according to a specified Unicode normalization form (see UAX #15).
If no normalization form is specified, the character stream is left intact.
Invalid byte sequences in the input are reported on stderr and replaced by the Unicode replacement character (U+FFFD) in the output.
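The replacement behaviour described above can be illustrated with Python's own UTF-8 decoder. This is only a sketch of the general technique, not unftrip's implementation; the sample bytes are made up for the example, and the number of U+FFFD characters emitted per malformed sequence may differ between decoders.

```python
# Illustration only: uses Python's decoder, not unftrip.
# The byte 0xFF can never occur in well-formed UTF-8.
raw = b"caf\xc3\xa9 \xff ok"

# errors="replace" substitutes U+FFFD for each undecodable sequence.
text = raw.decode("utf-8", errors="replace")

print(" ".join(f"U+{ord(ch):04X}" for ch in text))
```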
Normalization
- --nfc
Normalization Form C (NFC), canonical decomposition followed by canonical composition.
- --nfd
Normalization Form D (NFD), canonical decomposition.
- --nfkc
Normalization Form KC (NFKC), compatibility decomposition followed by canonical composition.
- --nfkd
Normalization Form KD (NFKD), compatibility decomposition.
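To see how the four forms differ, the following sketch applies each of them to a two-character string using Python's standard unicodedata module. It does not invoke unftrip; the sample string (a precomposed é followed by the ﬁ ligature) is chosen for the example.

```python
import unicodedata

# U+00E9 (é, precomposed) followed by U+FB01 (ﬁ ligature).
s = "\u00e9\ufb01"

# Apply each normalization form defined in UAX #15.
forms = {name: unicodedata.normalize(name, s)
         for name in ("NFC", "NFD", "NFKC", "NFKD")}

for name, out in forms.items():
    print(f"{name:5}", " ".join(f"U+{ord(ch):04X}" for ch in out))
```

The canonical forms (NFC, NFD) keep the ligature, while the compatibility forms (NFKC, NFKD) replace it with the letters f and i. The decomposing forms (NFD, NFKD) split é into e followed by U+0301.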
Arguments
- FILE (absent=-)
The input file. Reads from stdin if unspecified.
Options
- -a, --ascii
Output the input text as newline (U+000A) separated Unicode scalar values written in the US-ASCII charset.
- -e VAL, --encoding=VAL
Input encoding, must be one of UTF-8, UTF-16, UTF-16LE, UTF-16BE, ASCII or latin1. If unspecified, the encoding is guessed. The output encoding is the same as the input encoding, except for ASCII and latin1, where UTF-8 is output.
- --unicode-version
Output supported Unicode version.
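The --ascii and --encoding descriptions above can be made concrete with a short Python sketch. Both function names are hypothetical, and the "U+XXXX" notation is an assumption made for illustration; the exact text unftrip prints may differ.

```python
def ascii_dump(text: str) -> str:
    """Render each scalar value on its own line using US-ASCII only."""
    # Assumption: "U+XXXX" hex notation; unftrip's exact format may differ.
    return "\n".join(f"U+{ord(ch):04X}" for ch in text)


def output_encoding(input_encoding: str) -> str:
    """Rule stated under --encoding: output matches the input encoding,
    except that ASCII and latin1 input produce UTF-8 output."""
    if input_encoding in ("ASCII", "latin1"):
        return "UTF-8"
    return input_encoding


print(ascii_dump("a\u00e9\U0001F600"))
print(output_encoding("latin1"), output_encoding("UTF-16BE"))
```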
Common Options
- --help[=FMT] (default=auto)
Show this help in format FMT. The value FMT must be one of auto, pager, groff or plain. With auto, the format is pager or plain whenever the TERM env var is dumb or undefined.
- --version
Show version information.
Exit Status
unftrip exits with one of the following values:
- 0
no error occurred
- 1
a command line parsing error occurred
- 2
the input text was malformed
- 0
on success.
- 123
on indiscriminate errors reported on standard error.
- 124
on command line parsing errors.
- 125
on unexpected internal errors (bugs).
Bugs
This program is distributed with the Uunf OCaml library. See http://erratique.ch/software/uunf for contact information.