unicode_grapheme_break - Man Page
Unicode grapheme cluster boundary rules
Synopsis
#include <courier-unicode.h>
unicode_grapheme_break_info_t unicode_grapheme_break_init(void);
int unicode_grapheme_break_next(unicode_grapheme_break_info_t handle, char32_t c);
void unicode_grapheme_break_deinit(unicode_grapheme_break_info_t handle);
int unicode_grapheme_break(char32_t a, char32_t b);
Description
These functions implement the Unicode grapheme cluster breaking algorithm. Call unicode_grapheme_break_init() to initialize the algorithm; it returns an opaque handle. Each subsequent call to unicode_grapheme_break_next() passes this handle and the next character in a sequence of Unicode characters, and returns a non-zero value if there is a grapheme break before that character. unicode_grapheme_break_deinit() releases all resources used by the grapheme breaking handle; the unicode_grapheme_break_info_t handle is no longer valid after this call.
The first call to unicode_grapheme_break_next() always returns a non-zero value, as per the GB1 rule (there is always a break at the start of text).
unicode_grapheme_break() is a simplified interface that returns a non-zero value if there is a grapheme break between two Unicode characters a and b. This is equivalent to calling unicode_grapheme_break_init(), followed by two calls to unicode_grapheme_break_next() (first with a, then with b), and finally unicode_grapheme_break_deinit(), then returning the result of the second call to unicode_grapheme_break_next().
See Also
TR-29[1], courier-unicode(7), unicode_convert_tocase(3), unicode_line_break(3), unicode_word_break(3).
Author
Sam Varshavchik
Notes
Referenced By
courier-unicode(7), unicode_uc(3), unicode_wb_init(3).
The man pages unicode_grapheme_break_deinit(3), unicode_grapheme_break_init(3) and unicode_grapheme_break_next(3) are aliases of unicode_grapheme_break(3).