unigen-hangul - Man Page

Generate Hangul syllables from a Johab 6/3/1 Unifont hex file

Synopsis

unigen-hangul -i hangul-base.hex -o hangul-syllables.hex

Description

unigen-hangul generates Hangul syllables from an input Unifont .hex file encoded in Johab 6/3/1 format.  By default, the output is the Unicode Hangul Syllables range, U+AC00..U+D7A3.  Options let the user specify a starting code point for the output Unifont .hex file and, in hexadecimal, the starting and ending Hangul Jamo code points for each jamo position.  The valid ranges are:

Range        Hangul

1100-115E    Hangul Jamo initial consonants (choseong)
A960-A97C    Hangul Jamo Extended-A initial consonants (choseong)
1161-11A7    Hangul Jamo medial vowels and diphthongs (jungseong)
D7B0-D7C6    Hangul Jamo Extended-B medial vowels and diphthongs (jungseong)
11A8-11FF    Hangul Jamo final consonants (jongseong)
D7CB-D7FB    Hangul Jamo Extended-B final consonants (jongseong)

A single code point, or 0 to omit that jamo, can be specified instead of a range. A starting code point one position before the first valid code point of a Hangul jamo series (choseong, jungseong, or jongseong) first uses a blank glyph for that jamo and then cycles through the remaining valid code points of that series. A range can span modern and ancient jamo, and can even extend from the Hangul Jamo block into the Hangul Jamo Extended-A and Hangul Jamo Extended-B ranges.

For example,

-j3 11A7-D7FB

will first use no jongseong (because U+11A7 is one before the start of Hangul Jamo jongseong code points), then loop through jongseong in the Hangul Jamo range of U+11A8 through U+11FF, and then loop through jongseong in the Hangul Jamo Extended-B range of U+D7CB through U+D7FB.
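The expansion described above can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the C code in unihangul-support.c; the names JONGSEONG_BLOCKS and expand_jamo_range are hypothetical, and the block boundaries are copied from the range table in this page.

```python
# Assigned jongseong blocks, copied from the range table in this page.
JONGSEONG_BLOCKS = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]


def expand_jamo_range(start, end, blocks):
    """Model a start-end jamo range; None stands for a blank glyph."""
    sequence = []
    # One position before the first valid code point selects a blank jamo first.
    if start == blocks[0][0] - 1:
        sequence.append(None)
    # Then walk each assigned block, clipped to the requested range.
    for low, high in blocks:
        first = max(start, low)
        last = min(end, high)
        sequence.extend(range(first, last + 1))  # empty if first > last
    return sequence


# Model of "-j3 11A7-D7FB": blank, then U+11A8..U+11FF, then U+D7CB..U+D7FB.
seq = expand_jamo_range(0x11A7, 0xD7FB, JONGSEONG_BLOCKS)
print(len(seq))
```

Under this model the range yields 138 jongseong variations: one blank, 88 from the Hangul Jamo block, and 49 from Hangul Jamo Extended-B.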

Options

Option

Function

-h, --help

Print a help message and exit.

-all

Generate all Hangul syllables, using all modern and ancient Hangul in the Unicode range U+1100..U+11FF, assigned code points in the Extended-A range of U+A960..U+A97C, and assigned code points in the Extended-B range of U+D7B0..U+D7FF. WARNING: this will generate over 1,600,000 syllables in a 115-megabyte Unifont .hex format file.  The default is to output only the 11,172 modern Hangul syllables.
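The two syllable counts can be checked with simple arithmetic. The sketch below uses the standard Unicode Hangul composition formula for the modern set. The -all figure is only an estimate: it assumes the jamo ranges from the table in this page plus one "no jongseong" case, which may differ from what the program actually iterates over. All function and variable names are hypothetical.

```python
HANGUL_SYLLABLES_BASE = 0xAC00
MODERN_CHOSEONG = 19    # U+1100..U+1112
MODERN_JUNGSEONG = 21   # U+1161..U+1175
MODERN_JONGSEONG = 28   # no final, plus U+11A8..U+11C2


def syllable_code_point(cho, jung, jong):
    """Unicode composition with zero-based indices; jong 0 means no final."""
    return (HANGUL_SYLLABLES_BASE
            + (cho * MODERN_JUNGSEONG + jung) * MODERN_JONGSEONG + jong)


def span(low, high):
    """Number of code points in an inclusive range."""
    return high - low + 1


modern_total = MODERN_CHOSEONG * MODERN_JUNGSEONG * MODERN_JONGSEONG
last_modern = syllable_code_point(18, 20, 27)

# Estimate for -all (assumption: table ranges only, plus a blank jongseong).
all_choseong = span(0x1100, 0x115E) + span(0xA960, 0xA97C)
all_jungseong = span(0x1161, 0x11A7) + span(0xD7B0, 0xD7C6)
all_jongseong = 1 + span(0x11A8, 0x11FF) + span(0xD7CB, 0xD7FB)
all_total = all_choseong * all_jungseong * all_jongseong

print(modern_total, hex(last_modern), all_total)
```

The modern total is 19 x 21 x 28 = 11,172, ending at U+D7A3. Under the stated assumption the -all estimate is 124 x 94 x 138 = 1,608,528, which is consistent with "over 1,600,000".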

-c code_point

Starting code point in hexadecimal for output file.

-j1 start-end

Choseong (jamo 1) start-end range in hexadecimal.

-j2 start-end

Jungseong (jamo 2) start-end range in hexadecimal.

-j3 start-end

Jongseong (jamo 3) start-end range in hexadecimal.

-i input_file

Input file in Unifont hangul-base.hex format.

-o output_file

Unifont .hex format output file.

Examples

unigen-hangul -c 1 -j3 11AB-11AB \
     -i hangul-base.hex -o nieun-only.hex

This command generates Hangul syllables using all modern choseong and jungseong, and only the jongseong nieun (Unicode code point U+11AB). The output Unifont .hex file will contain code points starting at 1. Instead of "-j3 11AB-11AB", the shorter "-j3 11AB" also suffices.

This next example is a series of syllable sets suggested by Ho-Seok Ee for preliminary syllable alignment checking of modern Hangul.

Here is the command sequence:

unigen-hangul -c 1000 -j1 1100-1112 -j2 1161-1175 -j3 1160 \
     -i hangul-base.hex  >  hangul-prep.hex

unigen-hangul -c 2000 -j1 1100-1112 -j2 1161-1175 -j3 11AB \
     -i hangul-base.hex  >> hangul-prep.hex

unigen-hangul -c 3000 -j1 1100-1112 -j2 1161-1175 -j3 11AF \
     -i hangul-base.hex  >> hangul-prep.hex

unigen-hangul -c 4000 -j1 1105 -j2 1161-1175 -j3 11A8-11C2 \
     -i hangul-base.hex  >> hangul-prep.hex
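As a sanity check that the four sets do not collide in hangul-prep.hex, the following sketch estimates how many glyphs each command emits and which output code points it occupies. It assumes one glyph per jamo combination, consecutive output code points starting at the -c value, and that every code point in each range is valid; the helper names are hypothetical.

```python
def span(text):
    """Size of a hexadecimal 'start-end' range; a single code point counts as 1."""
    start, _, end = text.partition("-")
    return int(end or start, 16) - int(start, 16) + 1


def output_span(c, j1, j2, j3):
    """Return (glyph count, first code point, last code point) for one command."""
    count = span(j1) * span(j2) * span(j3)
    first = int(c, 16)
    return count, first, first + count - 1


# The -c, -j1, -j2, and -j3 arguments of the four commands above.
commands = [
    ("1000", "1100-1112", "1161-1175", "1160"),
    ("2000", "1100-1112", "1161-1175", "11AB"),
    ("3000", "1100-1112", "1161-1175", "11AF"),
    ("4000", "1105", "1161-1175", "11A8-11C2"),
]
spans = [output_span(*args) for args in commands]
for count, first, last in spans:
    print(f"{count:4d} glyphs at {first:04X}..{last:04X}")
```

Under these assumptions the first three sets hold 399 glyphs each and the fourth holds 567, and each set ends well before the next -c value begins.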

The resulting .hex file can then be examined with hexdraw, unihex2bmp, etc.
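For a quick look without those tools, a line of a Unifont .hex file can also be rendered as text. The sketch below relies only on the general .hex layout (a hexadecimal code point, a colon, then 32 or 64 hexadecimal digits for a glyph 8 or 16 pixels wide and 16 rows tall). The function name is hypothetical, and the sample line is a synthetic test pattern, not a real Unifont glyph.

```python
def render_hex_line(line):
    """Return (code point, list of 16 text rows) for one Unifont .hex line."""
    code, _, bitmap = line.strip().partition(":")
    if len(bitmap) not in (32, 64):
        raise ValueError("expected 32 or 64 hexadecimal digits")
    digits_per_row = len(bitmap) // 16      # 2 digits = 8 px, 4 digits = 16 px
    width = digits_per_row * 4
    rows = []
    for i in range(0, len(bitmap), digits_per_row):
        value = int(bitmap[i:i + digits_per_row], 16)
        bits = format(value, "0{}b".format(width))
        rows.append(bits.replace("0", ".").replace("1", "#"))
    return int(code, 16), rows


# Synthetic 16x16 test pattern (not an actual glyph from any font).
sample = "1000:" + "0000" * 4 + "3FFC" + "0004" * 3 + "0000" * 8
code_point, rows = render_hex_line(sample)
print("U+{:04X}".format(code_point))
print("\n".join(rows))
```

Looping this over every line of hangul-prep.hex gives a rough text preview of each generated syllable.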

Files

Unifont .hex files in Johab 6/3/1 encoding.  See unifont-johab631(5) for a description of the input file structure.  This program uses functions contained in the file unihangul-support.c.

See Also

bdfimplode(1), hex2bdf(1), hex2otf(1), hex2sfd(1), hexbraille(1), hexdraw(1), hexkinya(1), hexmerge(1), johab2syllables(1), johab2ucs2(1), unibdf2hex(1), unibmp2hex(1), unibmpbump(1), unicoverage(1), unidup(1), unifont(5), unifont-johab631(5), unifont-viewer(1), unifont1per(1), unifontchojung(1), unifontksx(1), unifontpic(1), unigencircles(1), unigenwidth(1), unihex2bmp(1), unihex2png(1), unihexfill(1), unihexgen(1), unihexpose(1), unihexrotate(1), unijohab2html(1), unipagecount(1), unipng2hex(1)

Author

unigen-hangul was written by Paul Hardy.

License

unigen-hangul is Copyright © 2023 Paul Hardy.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

Bugs

No known bugs exist.

30 July 2023