An Efficient Character Segmentation Algorithm for Offline Handwritten Uighur Scripts Based on Grapheme Analysis
- DOI
- 10.2991/aiie-16.2016.31
- Keywords
- computer application; Uighur language; handwriting recognition; character segmentation; grapheme
- Abstract
Cursive offline handwritten Uighur scripts contain many small, random writing strokes, which complicates character segmentation. To address this, this paper proposes a new, efficient character segmentation algorithm based on the analysis of graphemes (parts of a character). First, using dot-stroke detection and component analysis, a handwritten Uighur word is over-segmented into three types of strokes: dot, affix, and main strokes. Second, the main strokes are over-segmented while the dot strokes are clustered, so that a main grapheme queue and an additional grapheme queue are constructed, respectively. Finally, the best hypothesis for the character sequence is selected by analyzing the graphemes' shapes and recognition results. Experimental results, with a character segmentation accuracy of 93.09% and a recall of 97.67%, verify the validity of the proposed algorithm.
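The first step of the abstract (splitting a word image into dot, affix, and main strokes) can be illustrated with a rough sketch. This is not the authors' method: the abstract does not give the classification rule, so the area-based rule, the function names `connected_components` and `classify_strokes`, and the thresholds `dot_max_area` and `main_min_ratio` are all assumptions made purely for illustration, as is the toy image `WORD`.

```python
from collections import deque


def connected_components(grid):
    """Return the 8-connected foreground components of a binary grid.

    Each component is a list of (row, col) pixels; components are
    returned in row-major order of their first pixel.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def classify_strokes(grid, dot_max_area=2, main_min_ratio=0.5):
    """Label each component 'dot', 'affix' or 'main' by pixel count.

    Hypothetical rule (an assumption, not taken from the paper):
    tiny components are dots, components at least `main_min_ratio`
    times the largest one are main strokes, the rest are affixes.
    """
    components = connected_components(grid)
    if not components:
        return []
    largest = max(len(pixels) for pixels in components)
    labels = []
    for pixels in components:
        area = len(pixels)
        if area <= dot_max_area:
            kind = "dot"
        elif area >= main_min_ratio * largest:
            kind = "main"
        else:
            kind = "affix"
        labels.append((kind, area))
    return labels


# Toy binary "word" image (1 = ink): a dot, a short affix, and a main stroke.
WORD = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
]
```

Calling `classify_strokes(WORD)` returns one `(kind, area)` pair per connected component. A real system would likely rely on richer cues than area alone, such as position relative to the baseline or stroke shape, which the abstract does not detail.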
- Copyright
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY  - CONF
AU  - Yamei Xu
AU  - Panpan Du
PY  - 2016/11
DA  - 2016/11
TI  - An Efficient Character Segmentation Algorithm for Offline Handwritten Uighur Scripts Based on Grapheme Analysis
BT  - Proceedings of the 2016 2nd International Conference on Artificial Intelligence and Industrial Engineering (AIIE 2016)
PB  - Atlantis Press
SP  - 129
EP  - 133
SN  - 1951-6851
UR  - https://doi.org/10.2991/aiie-16.2016.31
DO  - 10.2991/aiie-16.2016.31
ID  - Xu2016/11
ER  -