unigen-hangul - Man Page
Generate Hangul syllables from a Johab 6/3/1 Unifont hex file
Synopsis
unigen-hangul -i hangul-base.hex -o hangul-syllables.hex
Description
unigen-hangul generates Hangul syllables from an input Unifont .hex file encoded in Johab 6/3/1 format. By default, the output is the Unicode Hangul Syllables range, U+AC00..U+D7A3. Options let the user set the starting code point of the output Unifont .hex file and give hexadecimal start-end ranges of Hangul Jamo code points for each of the three syllable positions. The valid ranges are:
- Range
Hangul
- 1100-115E
Hangul Jamo initial consonants (choseong)
- A960-A97C
Hangul Jamo Extended-A initial consonants (choseong)
- 1161-11A7
Hangul Jamo medial vowels and diphthongs (jungseong)
- D7B0-D7C6
Hangul Jamo Extended-B medial vowels and diphthongs (jungseong)
- 11A8-11FF
Hangul Jamo final consonants (jongseong)
- D7CB-D7FB
Hangul Jamo Extended-B final consonants (jongseong)
A single code point, or 0 to omit, can be specified instead of a range. A starting code point one position before the start of a valid range for a Hangul jamo series (choseong, jungseong, and/or jongseong) will first use a blank glyph for that jamo, and then cycle through the remaining valid code points for the respective choseong, jungseong, or jongseong. A range can span modern and ancient jamo, and can even extend into the Hangul Jamo Extended-A and Hangul Jamo Extended-B ranges.
For example, the option
-j3 11A7-D7FB
will first use no jongseong (because U+11A7 is one before the start of the Hangul Jamo jongseong code points), then loop through the jongseong in the Hangul Jamo range of U+11A8 through U+11FF, and then loop through the jongseong in the Hangul Jamo Extended-B range of U+D7CB through U+D7FB.
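The ordering described for this option can be sketched in shell. This only illustrates the documented behavior and is not the program's own code: the function name list_jongseong is hypothetical, the valid ranges are taken from the table above, and only U+11A7 is treated as the blank case.

```shell
#!/bin/sh
# Hypothetical helper: list the jongseong choices implied by a -j3 range.
# Assumes the valid jongseong ranges 11A8-11FF and D7CB-D7FB from the table.
# Simplification: only U+11A7 (one before U+11A8) is mapped to "blank".
list_jongseong() {
    cp=$((0x$1))
    end=$((0x$2))
    while [ "$cp" -le "$end" ]; do
        if [ "$cp" -eq $((0x11A7)) ]; then
            echo "blank"                  # no jongseong
        elif [ "$cp" -ge $((0x11A8)) ] && [ "$cp" -le $((0x11FF)) ]; then
            printf 'U+%04X\n' "$cp"       # Hangul Jamo jongseong
        elif [ "$cp" -ge $((0xD7CB)) ] && [ "$cp" -le $((0xD7FB)) ]; then
            printf 'U+%04X\n' "$cp"       # Hangul Jamo Extended-B jongseong
        fi                                # anything else is skipped
        cp=$((cp + 1))
    done
}

# 1 blank + 88 (11A8-11FF) + 49 (D7CB-D7FB) = 138 entries
list_jongseong 11A7 D7FB | wc -l
```

Under these assumptions the range yields 138 jongseong choices, the first of which is the blank.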
Options
- Option
Function
- -h, --help
Print a help message and exit.
- -all
Generate all Hangul syllables, using all modern and ancient Hangul in the Unicode range U+1100..U+11FF, assigned code points in the Extended-A range of U+A960..U+A97C, and assigned code points in the Extended-B range of U+D7B0..U+D7FF. WARNING: this will generate over 1,600,000 syllables in a 115 megabyte Unifont .hex format file. The default is to output only the 11,172 modern Hangul syllables.
- -c code_point
Starting code point in hexadecimal for output file.
- -j1 start-end
Choseong (jamo 1) start-end range in hexadecimal.
- -j2 start-end
Jungseong (jamo 2) start-end range in hexadecimal.
- -j3 start-end
Jongseong (jamo 3) start-end range in hexadecimal.
- -i input_file
Unifont hangul-base.hex formatted input file.
- -o output_file
Unifont .hex format output file.
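The syllable counts quoted for the default and for -all can be checked with a little shell arithmetic. This is a sketch under stated assumptions: the helper name range_count is hypothetical, the ranges are those listed in the Description, each set is assumed to include one extra "no jongseong" case, and output code points are assumed to be assigned consecutively.

```shell
#!/bin/sh
# Hypothetical helper: number of code points in an inclusive hex range.
range_count() {
    echo $(( 0x$2 - 0x$1 + 1 ))
}

# Modern Hangul: 19 choseong, 21 jungseong, 27 jongseong plus "no jongseong".
modern=$(( $(range_count 1100 1112) * $(range_count 1161 1175) * ($(range_count 11A8 11C2) + 1) ))
echo "modern syllables: $modern"

# Last default output code point, assuming consecutive assignment from U+AC00.
printf 'last code point: U+%04X\n' $(( 0xAC00 + modern - 1 ))

# -all: every range in the table, again assuming one "no jongseong" case.
cho=$(( $(range_count 1100 115E) + $(range_count A960 A97C) ))
jung=$(( $(range_count 1161 11A7) + $(range_count D7B0 D7C6) ))
jong=$(( $(range_count 11A8 11FF) + $(range_count D7CB D7FB) + 1 ))
echo "all syllables: $(( cho * jung * jong ))"
```

With these assumptions the script reports 11172 modern syllables ending at U+D7A3, and 1608528 syllables for the full set, which agrees with the "over 1,600,000" figure given for -all.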
Examples
unigen-hangul -c 1 -j3 11AB-11AB \
-i hangul-base.hex -o nieun-only.hex
This command generates Hangul syllables using all modern choseong and jungseong, and only the jongseong nieun (Unicode code point U+11AB). The output Unifont .hex file will contain code points starting at 1. Specifying "-j3 11AB" instead of "-j3 11AB-11AB" has the same effect.
This next example is a series of syllable sets suggested by Ho-Seok Ee for preliminary syllable alignment checking of modern Hangul.
- The first command generates all modern syllables containing no jongseong (final consonant), starting at Unifont hexadecimal glyph location 0x1000; selecting a jongseong value that is out of range (U+1160 in this case) will use a blank filler in place of the jongseong.
- The second command generates all modern syllables containing jongseong Nieun (U+11AB), which has a horizontal line extending across the lower portion of a syllable, starting at Unifont hexadecimal glyph location 0x2000.
- The third command generates all modern Hangul syllables containing jongseong Rieul (U+11AF), starting at Unifont hexadecimal glyph location 0x3000.
- The fourth command generates all modern Hangul syllables containing choseong (initial consonant) Rieul (U+1105), starting at Unifont hexadecimal glyph location 0x4000.
Here is the command sequence:
unigen-hangul -c 1000 -j1 1100-1112 -j2 1161-1175 -j3 1160 \
-i hangul-base.hex > hangul-prep.hex
unigen-hangul -c 2000 -j1 1100-1112 -j2 1161-1175 -j3 11AB \
-i hangul-base.hex >> hangul-prep.hex
unigen-hangul -c 3000 -j1 1100-1112 -j2 1161-1175 -j3 11AF \
-i hangul-base.hex >> hangul-prep.hex
unigen-hangul -c 4000 -j1 1105 -j2 1161-1175 -j3 11A8-11C2 \
-i hangul-base.hex >> hangul-prep.hex
The resulting .hex file can then be examined with hexdraw, unihex2bmp, etc.
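Because all four runs are written to the same file, it can be useful to confirm that the chosen -c starting points leave enough room between sets. The sketch below estimates the span each run occupies; the helper name output_span is hypothetical, and it assumes one glyph per choseong/jungseong/jongseong combination written to consecutive code points.

```shell
#!/bin/sh
# Hypothetical helper: first and last output code point for one run, given
# the -c start (hex) and the number of choseong, jungseong, and jongseong
# choices. Assumes glyphs occupy consecutive code points.
output_span() {
    first=$((0x$1))
    last=$(( first + $2 * $3 * $4 - 1 ))
    printf '%04X-%04X\n' "$first" "$last"
}

output_span 1000 19 21 1     # no jongseong
output_span 2000 19 21 1     # jongseong U+11AB only
output_span 3000 19 21 1     # jongseong U+11AF only
output_span 4000 1 21 27     # choseong U+1105, jongseong 11A8-11C2
```

Under these assumptions the spans are 1000-118E, 2000-218E, 3000-318E, and 4000-4236, so the four sets do not overlap.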
Files
Unifont .hex files in Johab 6/3/1 encoding. See unifont-johab631(5) for a description of the input file structure. This program uses functions contained in the file unihangul-support.c.
See Also
bdfimplode(1), hex2bdf(1), hex2otf(1), hex2sfd(1), hexbraille(1), hexdraw(1), hexkinya(1), hexmerge(1), johab2syllables(1), johab2ucs2(1), unibdf2hex(1), unibmp2hex(1), unibmpbump(1), unicoverage(1), unidup(1), unifont(5), unifont-johab631(5), unifont-viewer(1), unifont1per(1), unifontchojung(1), unifontksx(1), unifontpic(1), unigencircles(1), unigenwidth(1), unihex2bmp(1), unihex2png(1), unihexfill(1), unihexgen(1), unihexpose(1), unihexrotate(1), unijohab2html(1), unipagecount(1), unipng2hex(1)
Author
unigen-hangul was written by Paul Hardy.
License
unigen-hangul is Copyright © 2023 Paul Hardy.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Bugs
No known bugs exist.