hxwls - Man Page

list links in an HTML file

Synopsis

hxwls [ -l ] [ -t ] [ -r ] [ -h ] [ -a ] [ -b base ] [ file ]

Description

The hxwls command reads an HTML file (standard input by default) and writes every link it finds to standard output.
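As a rough illustration of what listing the links in an HTML file involves, the following Python sketch collects attribute values from start tags. It is not how hxwls is implemented. The names LinkLister and list_links are hypothetical, and treating only href and src attributes as links is an assumption made for this example; hxwls may recognize a different set of elements and attributes.

```python
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect href and src attribute values in document order.

    Assumption: only these two attributes count as links here.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def list_links(html_text):
    """Return the link targets found in html_text, unresolved."""
    parser = LinkLister()
    parser.feed(html_text)
    parser.close()
    return parser.links


sample = '<p><a href="a.html">A</a> <img src="pic.png"> <a name="x">no link</a></p>'
for link in list_links(sample):
    print(link)
```

The sketch prints the targets exactly as written in the document, which is closest to what the -r option describes; resolving them against a base URL is a separate step.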

Options

The following options are supported:

-l

Produce a long listing. Instead of just the URI, hxwls prints three columns: the element name, the value of the REL attribute, and the target URI.

-t

Produce a tuple listing. hxwls prints four columns: the URI of the document itself, the element name, the value of the REL attribute, and the target URI.

-r

Print relative URLs as they are, without converting them to absolute URLs.

-b base

Use base as the initial base URL. If the document contains a <base> element, that element overrides the -b option.

-h

Output as HTML. The output will be listed in the form of <a> elements.

-a

Convert any IRIs (Internationalized Resource Identifiers) to ASCII-only URIs. This causes any non-ASCII characters in the path of a URI to be encoded as %-escaped octets and non-ASCII characters in the domain name as punycode. (Punycode encoding is only available if hxwls is compiled with libidn support.)
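The base URL handling (-b, -r) and the IRI conversion (-a) described above can be sketched in Python with the standard library. This is an illustration under stated assumptions, not the code hxwls uses. The function names effective_base, resolve and iri_to_uri are hypothetical, the example URLs are invented, Python's built-in "idna" codec stands in for libidn, and only the path is %-escaped (query and fragment are left untouched).

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit


def effective_base(option_base, document_base):
    """A <base> element in the document overrides the -b value."""
    return document_base if document_base else option_base


def resolve(base, href):
    """Turn a relative URL into an absolute one (what -r switches off)."""
    return urljoin(base, href)


def iri_to_uri(iri):
    """Convert an IRI to an ASCII-only URI, roughly as -a describes.

    Assumptions: no userinfo in the authority; the "idna" codec is used
    for punycode; existing %-escapes in the path are left alone.
    """
    parts = urlsplit(iri)
    netloc = parts.netloc
    if parts.hostname:
        # Non-ASCII labels in the domain name become punycode.
        netloc = parts.hostname.encode("idna").decode("ascii")
        if parts.port is not None:
            netloc += ":" + str(parts.port)
    # Non-ASCII path characters become %-escaped UTF-8 octets.
    path = quote(parts.path, safe="/%:@!$&'()*+,;=")
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


base = effective_base("http://example.org/dir/page.html", None)
print(resolve(base, "../img/a.png"))   # http://example.org/img/a.png
print(resolve(base, "other.html"))     # http://example.org/dir/other.html
print(iri_to_uri("http://bücher.example/café"))
# http://xn--bcher-kva.example/caf%C3%A9
```

Keeping resolution and ASCII conversion as separate functions mirrors the way -r and -a are independent options.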

Operands

The following operand is supported:

file

The name or the URL of an HTML file. If absent, standard input is read instead.

Diagnostics

The following exit values are returned:

0

Successful completion.

> 0

An error occurred while parsing the HTML file. hxwls will try to correct the error and produce output anyway.
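Because a non-zero exit value can still be accompanied by usable output, a calling script may want to keep the output and report the status separately instead of aborting. The Python sketch below shows one way to do that. The function name run_link_lister is hypothetical, and the demonstration uses a stand-in command (a small Python one-liner that prints one line and exits with status 1) instead of hxwls itself, so that it runs without the tool installed.

```python
import subprocess
import sys


def run_link_lister(cmd, html_text):
    """Run cmd with html_text on standard input.

    Returns (output_lines, exit_status). A non-zero status is reported,
    not raised, so the caller can still use whatever was printed.
    """
    result = subprocess.run(cmd, input=html_text, capture_output=True, text=True)
    return result.stdout.splitlines(), result.returncode


# Stand-in for a run that hit a parse error but still printed a link.
stand_in = [
    sys.executable,
    "-c",
    "import sys; sys.stdin.read(); print('http://example.org/'); sys.exit(1)",
]

links, status = run_link_lister(stand_in, "<a href='http://example.org/'>x")
if status > 0:
    print("warning: exit status", status, file=sys.stderr)
for link in links:
    print(link)
```

With hxwls installed, the same function could be called as run_link_lister(["hxwls"], html_text), since the command reads standard input when no file operand is given.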

See Also

asc2xml(1), hxnormalize(1), hxnum(1), xml2asc(1)

Referenced By

hxcopy(1).

10 Jul 2011 7.x HTML-XML-utils