urxvt-matcher - Man Page

match strings in terminal output and change their rendition

Description

Uses per-line display filtering (on_line_update) to underline text matching a certain pattern and make it clickable. When the text is clicked with the mouse button specified in the matcher.button resource (default 2, the middle button), the program specified in the matcher.launcher resource is started with the matched text as its first argument. If matcher.launcher is unset, the url-launcher resource is used, and failing that, sensible-browser. The default configuration is suitable for matching URLs and launching a web browser, like the former "mark-urls" extension.

The default pattern to match URLs can be overridden with the matcher.pattern.0 resource, and additional patterns can be specified with further numbered resources (matcher.pattern.1, matcher.pattern.2, and so on), in a manner similar to the "selection" extension. The launcher can also be overridden on a per-pattern basis with the correspondingly numbered matcher.launcher resource.

It is possible to activate the most recently seen match or a list of matches from the keyboard. Simply bind a keysym to "matcher:last" or "matcher:list" as seen in the example below.

The matcher:select action enables a mode in which it is possible to iterate over the matches using the keyboard and either activate them or copy them to the clipboard. While the mode is active, normal terminal input/output is suspended and the following bindings are recognized:

Up

Search for a match upwards.

Down

Search for a match downwards.

Home

Jump to the topmost match.

End

Jump to the bottommost match.

Escape

Leave the mode and return to the point where search was started.

Enter

Activate the current match.

y

Copy the current match to the clipboard.

It is also possible to cycle through the matches using a key combination bound to the matcher:select action.

Example: load and use the matcher extension with defaults.

    URxvt.perl-ext:           default,matcher

Example: use a custom configuration.

    URxvt.url-launcher:       sensible-browser
    URxvt.keysym.C-Delete:    matcher:last
    URxvt.keysym.M-Delete:    matcher:list
    URxvt.matcher.button:     1
    URxvt.matcher.pattern.1:  \\bwww\\.[\\w-]+\\.[\\w./?&@#-]*[\\w/-]
    URxvt.matcher.pattern.2:  \\B(/\\S+?):(\\d+)(?=:|$)
    URxvt.matcher.launcher.2: gvim +$2 $1
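
To see what the second custom pattern and its launcher do, the following Python sketch applies the same regular expression (with the X-resource backslash doubling undone) to a compiler-style message and fills in the launcher template. The helper name expand_launcher and the sample line are hypothetical. The sketch assumes that the launcher string is split on whitespace and that each $N is replaced by the corresponding capture group; this is an illustration, not a description of the extension's internals. Python's re module is close to, but not identical to, the Perl regex engine that urxvt uses.

```python
import re

# pattern.2 from the example above, with the X-resource backslash doubling undone.
PATTERN_2 = re.compile(r"\B(/\S+?):(\d+)(?=:|$)")
LAUNCHER_2 = "gvim +$2 $1"


def expand_launcher(template, match):
    """Hypothetical helper: replace $N in each word with capture group N."""
    def group_for(ref):
        return match.group(int(ref.group(1)))
    return [re.sub(r"\$(\d+)", group_for, word) for word in template.split()]


# Made-up line of terminal output containing a path and a line number.
line = "/usr/src/foo.c:42: error: expected ';'"
m = PATTERN_2.search(line)
argv = expand_launcher(LAUNCHER_2, m) if m else []
print(argv)
```

Under these assumptions, the first capture group is the file path and the second is the line number, so the editor would be asked to open the file at the reported line.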

Regex encoding/wide character matching

urxvt stores all text as Unicode, in a special encoding that uses one character/code point per column. For various reasons, the regular expressions are matched directly against this encoding, which means there are a few things you need to keep in mind:

X resources/command line arguments are locale-encoded

The regexes taken from the command line or resources will be converted from locale encoding to Unicode. This can change the number of code points per character.

Wide characters are column-padded with $urxvt::NOCHAR

Wide characters (such as kanji and sometimes tabs) are padded with a special character value ($urxvt::NOCHAR). As a result, constructs such as \w or . match only part of a wide character: each matches just its first "column", and \w does not match the $urxvt::NOCHAR padding that follows.

That means you have to incorporate $urxvt::NOCHAR into parts of regexes that may match wide characters. For example, to match \w+ you might want to use [\w$urxvt::NOCHAR]+ instead, and to match a single character (.) you might want to use .$urxvt::NOCHAR* instead.
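
The effect of the padding can be reproduced outside the terminal. The Python sketch below builds a string in which each wide character is followed by a placeholder standing in for $urxvt::NOCHAR, then compares the plain patterns with the padding-aware ones. The placeholder value U+FFFF and the sample text are assumptions made for illustration, and Python's re module stands in for Perl's regex engine.

```python
import re

# Stand-in for $urxvt::NOCHAR; the real value is defined by urxvt (assumed here).
NOCHAR = "\uffff"

# One code point per column: each double-width kanji is followed by padding.
line = "漢" + NOCHAR + "字" + NOCHAR + " ok"

# Plain \w+ stops at the padding, so each kanji becomes a separate match.
plain_words = re.findall(r"\w+", line)

# Adding the padding to the character class keeps the wide characters together.
padded_words = re.findall(r"[\w" + NOCHAR + r"]+", line)

# A bare . covers only the first column; . followed by NOCHAR* covers the whole character.
first_column = re.match(r".", line).group()
whole_char = re.match(r"." + NOCHAR + r"*", line).group()

# ascii() keeps the output printable regardless of the terminal's encoding.
print(ascii(plain_words))
print(ascii(padded_words))
print(len(first_column), len(whole_char))
```

In this simulation the plain pattern splits the two kanji into separate matches, while the padding-aware class returns them as one word that spans all four columns.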

Info

2024-07-20 9.31 RXVT-UNICODE