unicode::canonical - Man Page
unicode canonical normalization and denormalization
Synopsis
#include <courier-unicode.h>
constexpr int decompose_flag_qc=UNICODE_DECOMPOSE_FLAG_QC;
constexpr int decompose_flag_compat=UNICODE_DECOMPOSE_FLAG_COMPAT;
constexpr int compose_flag_removeunused=UNICODE_COMPOSE_FLAG_REMOVEUNUSED;
constexpr int compose_flag_oneshot=UNICODE_COMPOSE_FLAG_ONESHOT;
void decompose_default_reallocate(std::u32string &string, const std::vector<std::tuple<size_t, size_t>> &list);
void decompose(std::u32string &string, int flags=0, const std::function<void (std::u32string &, const std::vector<std::tuple<size_t, size_t>>)> &reallocate=decompose_default_reallocate);
void compose_default_callback(unicode_composition_t &compositions);
void compose(std::u32string &string, int flags=0, const std::function<void (unicode_composition_t &)> &cb=compose_default_callback);
Description
These functions implement the C++ interface for the Unicode Canonical Decomposition and Composition[1]. See the description of the underlying unicode_canonical(3) C library API for more information. C++ specific notes:
The C++ decomposition reallocate callback receives a single vector of offset and size tuples instead of two separate arrays or vectors. unicode::decompose_default_reallocate() is the C++ version of the default reallocate callback, and it receives the same tuple vector parameter. The C++ interface uses a std::u32string to represent a Unicode text string, and unicode::decompose_default_reallocate() resizes that string.
Like the C callback, the C++ one gets called 0 or more times.
unicode::compose() takes care of initializing, applying, and deinitializing the unicode_composition_t object used for composition. The callback receives a reference to the unicode_composition_t object, which the callback should not modify in any way.
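The shape of a custom reallocate callback can be sketched without the library itself. The sketch below assumes that each tuple holds an offset into the string followed by the number of additional characters needed at that offset, and that the callback only has to grow the string while the library fills in the new characters afterwards. Both points are assumptions here; unicode_canonical(3) is the authority on them. The names tuple_list, extra_chars(), grow_only_reallocate() and self_check() are hypothetical and are not part of courier-unicode.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical alias for the tuple vector type shown in the synopsis.
using tuple_list = std::vector<std::tuple<std::size_t, std::size_t>>;

// Hypothetical helper: adds up the second member of every tuple.
// Assumption: the second member is the number of characters to add.
std::size_t extra_chars(const tuple_list &list)
{
    std::size_t total = 0;

    for (const auto &entry : list)
        total += std::get<1>(entry);
    return total;
}

// Hypothetical stand-in for a reallocate callback. It only grows the
// string by the requested total; it does not move or insert characters.
// Assumption: growing the string is all that the callback must do.
void grow_only_reallocate(std::u32string &string, const tuple_list &list)
{
    string.resize(string.size() + extra_chars(list));
}

// Exercises the sketch on its own, with made-up offsets and sizes.
void self_check()
{
    std::u32string text = U"abc";
    tuple_list list{
        std::make_tuple(std::size_t{0}, std::size_t{1}),
        std::make_tuple(std::size_t{2}, std::size_t{2})
    };

    assert(extra_chars(list) == 3);
    grow_only_reallocate(text, list);
    assert(text.size() == 6);                 // 3 original + 3 requested
    assert(text.compare(0, 3, U"abc") == 0);  // existing text is kept
}
```

With courier-unicode installed, a callback of this shape would presumably be passed as the third argument, for example unicode::decompose(text, 0, grow_only_reallocate), in place of the default unicode::decompose_default_reallocate.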
See Also
Author
Sam Varshavchik
Notes
1. Unicode Canonical Decomposition and Composition
   https://www.unicode.org/reports/tr15/tr15-50.html
Referenced By
courier-unicode(7), unicode_canonical(3).
The man pages unicode::compose(3), unicode::compose_default_callback(3), unicode::decompose(3) and unicode::decompose_default_reallocate(3) are aliases of unicode::canonical(3).