Package tesseract-tools
Training tools for Tesseract
https://github.com/tesseract-ocr/tesseract
The tesseract-tools package contains command-line tools for training the Tesseract OCR engine.
Version: 5.3.4
See also: tesseract.
General Commands | |
ambiguous_words | generate sets of words Tesseract is likely to find ambiguous |
classifier_tester | test a character classifier on training-format data (for the *legacy tesseract* engine) |
cntraining | character normalization training for Tesseract |
combine_lang_model | generate starter traineddata |
combine_tessdata | combine/extract/overwrite/list/compact Tesseract data |
dawg2wordlist | convert a Tesseract DAWG to a wordlist |
lstmeval | evaluation program for LSTM-based networks |
lstmtraining | training program for LSTM-based networks |
merge_unicharsets | simple tool to merge two or more unicharsets |
mftraining | feature training for Tesseract |
set_unicharset_properties | set properties of the unichars |
shapeclustering | shape clustering training for Tesseract |
text2image | generate OCR training pages |
unicharset_extractor | read box or plain text files to extract the unicharset |
wordlist2dawg | convert a wordlist to a DAWG for Tesseract |
File Formats | |
unicharambigs | Tesseract unicharset ambiguities |
unicharset | character properties file used by tesseract(1) |