ascii - Man Page

report character aliases

Synopsis

ascii [-t] [-s] [-a] [-d] [-x] [-o] [-b] [-h] [-v] [char-alias]

Options

Called with no options, ascii behaves like "ascii -h".  Options are as follows:

-t

Script-friendly mode, emits only ISO/decimal/hex/octal/binary encodings of the character.

-s

Parse multiple characters.  Convenient way of parsing strings.

-a

Print in vertical aspect (4 columns by 16 rows) rather than 16x4. This option combines only with -d -o -x -b and must precede them.

-d

ASCII table in decimal.

-x

ASCII table in hex.

-o

ASCII table in octal.

-b

ASCII table in binary.

-h,  -?

Show summary of options and a simple ASCII table.

-v

Show version of program.
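
As an illustration of the encodings that the -t option reports, the following Python sketch computes the ISO/ECMA code-table position and the decimal, hex, octal and binary forms of a character. It is not the utility's implementation; the function name encodings and the zero-padding of each field are assumptions made for this example, and the real program's output layout may differ.

```python
def encodings(ch: str) -> dict:
    """Return the numeric encodings of a single 7-bit ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    return {
        # ISO/ECMA position: column/row in a code table with 16 rows.
        "iso": f"{code // 16}/{code % 16}",
        "decimal": str(code),
        "hex": f"{code:02X}",
        "octal": f"{code:03o}",
        "binary": f"{code:08b}",
    }


if __name__ == "__main__":
    # 'A' is code 65: column 4, row 1 of the code table.
    print(encodings("A"))
```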

Description

Characters in the ASCII set can have many aliases, depending on context. A character’s possible names include:

This utility accepts command-line strings and tries to interpret them as one of the above.  When it finds a value, it prints all of the names of the character.  The constructs in the following list can be used to specify character values.  If an argument could be interpreted in two or more ways, the names of every character it might denote are printed.

character

Any character not described by one of the following conventions represents the character itself.

^character

A caret followed by a character.

\[abfnrtv0]

A backslash followed by certain special characters (abfnrtv).

mnemonic

An ASCII teletype mnemonic.

hexadecimal

A hexadecimal (hex) sequence consists of one or two case-insensitive hex digit characters (0123456789abcdef). To ensure hex interpretation use one of the prefixes h, 0x, x, or \x.

decimal

A decimal sequence consists of one, two or three decimal digit characters (0123456789). To ensure decimal interpretation use one of the prefixes d, 0d, or \d.

octal

An octal sequence consists of one, two or three octal digit characters (01234567).  To ensure octal interpretation use one of the prefixes 0o, o, or \o.

bit pattern

A bit pattern (binary) sequence consists of one to eight binary digit characters (01).  To ensure bit interpretation use one of the prefixes 0b, b, or \b.

ISO/ECMA code

An ISO/ECMA code sequence consists of one or two decimal digit characters, a slash, and one or two decimal digit characters.

name

An official ASCII or (unofficial) slang name.

:class:

A named POSIX character class.
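
The way a single argument can have several readings can be sketched in Python. This is a simplified model written for this page, not the utility's own parser: the function name interpretations and the ESCAPES table are hypothetical, the limit of 0 to 127 is an assumption, and prefixes (such as 0x or \d), mnemonics, names and POSIX classes are left out.

```python
import re

# C-style backslash escapes and the character code each one denotes.
ESCAPES = {"a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11, "0": 0}


def interpretations(arg: str) -> dict:
    """Map each plausible reading of arg to a character code (0-127)."""
    found = {}

    def note(kind: str, code: int) -> None:
        if 0 <= code <= 127:  # assumption: out-of-range values are dropped
            found[kind] = code

    if len(arg) == 1:
        note("character", ord(arg))
    if len(arg) == 2 and arg[0] == "^":
        c = ord(arg[1].upper())
        if 0x3F <= c <= 0x5F:
            # Caret notation flips bit 6: ^@ -> 0, ^A -> 1, ^? -> 127.
            note("caret", c ^ 0x40)
    if len(arg) == 2 and arg[0] == "\\" and arg[1] in ESCAPES:
        note("escape", ESCAPES[arg[1]])

    # An unprefixed digit string is tried in every base whose digits fit.
    for kind, digits, base in (
        ("hex", r"[0-9a-fA-F]{1,2}", 16),
        ("decimal", r"[0-9]{1,3}", 10),
        ("octal", r"[0-7]{1,3}", 8),
        ("binary", r"[01]{1,8}", 2),
    ):
        if re.fullmatch(digits, arg):
            note(kind, int(arg, base))

    # ISO/ECMA code: column/row, so 4/1 means 4 * 16 + 1 = 65.
    m = re.fullmatch(r"([0-9]{1,2})/([0-9]{1,2})", arg)
    if m and int(m.group(2)) < 16:
        note("iso", int(m.group(1)) * 16 + int(m.group(2)))

    return found


if __name__ == "__main__":
    # "10" is valid hex, decimal, octal and binary, giving four readings.
    print(interpretations("10"))
```

In this model the argument 10 yields the codes 16, 10, 8 and 2, which is why a prefix is needed to pin down a single base.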

The slang names recognized and printed out are from a rather comprehensive list that first appeared on USENET in early 1990 and has been continuously updated since.  Mnemonics recognized and printed include the official ASCII set, some official ISO names (where those differ) and a few common-use alternatives (such as NL for LF). HTML/SGML entity names are also printed when applicable.  All comparisons are case-insensitive, and dashes are mapped to spaces. Any unrecognized arguments or out of range values are silently ignored.  Note that the -s option will not recognize "long" names, as it cannot differentiate them from other parts of the string.
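
The comparison rule (case-insensitive, dashes mapped to spaces, unrecognized names ignored) can be sketched in Python as follows. The name table here is a hypothetical stand-in built from Unicode character names for printable ASCII; it is not the utility's own list of official and slang names, and the functions normalize and lookup are invented for this example.

```python
import unicodedata


def normalize(name: str) -> str:
    """Fold case and map dashes to spaces before comparing names."""
    return name.lower().replace("-", " ")


# Stand-in table (an assumption): Unicode names of printable ASCII characters.
NAMES = {normalize(unicodedata.name(chr(c))): c for c in range(0x20, 0x7F)}


def lookup(name: str):
    """Return the character code for name, or None if it is unrecognized."""
    return NAMES.get(normalize(name))


if __name__ == "__main__":
    # All three spellings normalize to "exclamation mark".
    for spelling in ("EXCLAMATION MARK", "exclamation-mark", "Exclamation Mark"):
        print(spelling, lookup(spelling))
```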

For correct results, be careful to stringize or quote shell metacharacters in arguments (especially backslash).
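
When the command line is built by a script, the quoting can be delegated to a library. The Python sketch below uses the standard shlex.quote function to protect an argument from a POSIX shell; the helper name command_line is hypothetical, and the sketch only builds the command string without running it.

```python
import shlex


def command_line(arg: str) -> str:
    """Build an ascii invocation with arg quoted for a POSIX shell."""
    return "ascii " + shlex.quote(arg)


if __name__ == "__main__":
    print(command_line("\\n"))   # backslash is protected by single quotes
    print(command_line("^["))    # caret and bracket are quoted as well
    print(command_line("0x41"))  # plain alphanumerics pass through unchanged
```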

This utility is particularly handy for interpreting cc(1)'s ugly octal "invalid-character" messages, or when coding anything to do with serial communications.  As a side effect it serves as a handy base-converter for random 8-bit values.

Author

Eric S. Raymond esr@thyrsus.com; November 1990 (home page at http://www.catb.org/~esr/). Ioannis E. Tambouras ioannis@debian.org added command options and minor enhancements.  Brian J. Ginsbach ginsbach@sgi.com fixed several bugs and expanded the man page. David N. Welton davidw@efn.org added the -s option. Matej Vela corrected the ISO names.  Dave Capella contributed the idea of listing HTML/SGML entities.

Info

2024-05-27