Visually and phonologically similar characters in incorrect simplified Chinese words


Title	Visually and phonologically similar characters in incorrect simplified Chinese words
Authors	Liu, Chao-Lin; Lai, Min Hua; Chuang, Yi-Hsuan; Lee, C.-Y. (劉昭麟; 賴敏華; 莊怡軒)
Contributor	Department of Computer Science (資科系)
Keywords	Chinese lexicon; Chinese text; Contributing factor; Similarity measure; Visual similarity; Computational linguistics; Speech; Errors
Date	2010
Upload time	29-Jun-2015 17:55:03 (UTC+8)
Abstract	Visually and phonologically similar characters are major contributing factors for errors in Chinese text. By defining appropriate similarity measures that consider extended Cangjie codes, we can identify visually similar characters within a fraction of a second. Relying on the pronunciation information noted for individual characters in Chinese lexicons, we can compute a list of characters that are phonologically similar to a given character. We collected 621 incorrect Chinese words reported on the Internet, and analyzed the causes of these errors. 83% of these errors were related to phonological similarity, and 48% of them were related to visual similarity between the involved characters. Generating the lists of phonologically and visually similar characters, our programs were able to contain more than 90% of the incorrect characters in the reported errors.
Relation	Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, Volume 2, 2010, Pages 739-747, 23rd International Conference on Computational Linguistics, Coling 2010; Beijing; China; 23 August 2010 to 27 August 2010; Code 86715
Type	conference

dc.contributor	Department of Computer Science (資科系)	-
dc.creator (Authors)	Liu, Chao-Lin; Lai, Min Hua; Chuang, Yi-Hsuan; Lee, C.-Y.	-
dc.creator (Authors)	劉昭麟; 賴敏華; 莊怡軒	-
dc.date (Date)	2010	-
dc.date.accessioned	29-Jun-2015 17:55:03 (UTC+8)	-
dc.date.available	29-Jun-2015 17:55:03 (UTC+8)	-
dc.date.issued (Upload time)	29-Jun-2015 17:55:03 (UTC+8)	-
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/76131	-
dc.description.abstract (Abstract)	Visually and phonologically similar characters are major contributing factors for errors in Chinese text. By defining appropriate similarity measures that consider extended Cangjie codes, we can identify visually similar characters within a fraction of a second. Relying on the pronunciation information noted for individual characters in Chinese lexicons, we can compute a list of characters that are phonologically similar to a given character. We collected 621 incorrect Chinese words reported on the Internet, and analyzed the causes of these errors. 83% of these errors were related to phonological similarity, and 48% of them were related to visual similarity between the involved characters. Generating the lists of phonologically and visually similar characters, our programs were able to contain more than 90% of the incorrect characters in the reported errors.	-
dc.relation (Relation)	Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference, Volume 2, 2010, Pages 739-747, 23rd International Conference on Computational Linguistics, Coling 2010; Beijing; China; 23 August 2010 to 27 August 2010; Code 86715	-
dc.subject (Keywords)	Chinese lexicon; Chinese text; Contributing factor; Similarity measure; Visual similarity; Computational linguistics; Speech; Errors	-
dc.title (Title)	Visually and phonologically similar characters in incorrect simplified Chinese words	-
dc.type (Type)	conference	en
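
Illustrative note: the abstract describes finding visually similar characters by comparing Cangjie codes. The record does not give the paper's extended codes or its similarity measure, so the following is only a minimal sketch under stated assumptions: it uses a handful of standard (not extended) Cangjie codes and substitutes Python's `difflib.SequenceMatcher` ratio as a stand-in measure. The names `CANGJIE`, `code_similarity`, `visually_similar`, and the threshold value are hypothetical and not taken from the paper.

```python
from difflib import SequenceMatcher

# Small illustrative table of standard Cangjie codes.
# Assumption: the paper uses *extended* codes, which are not reproduced here.
CANGJIE = {
    "日": "A",
    "月": "B",
    "明": "AB",
    "昌": "AA",
    "晶": "AAA",
    "林": "DD",
    "森": "DDD",
}


def code_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the Cangjie codes of two characters.

    Uses difflib's ratio as a generic stand-in, not the paper's measure.
    """
    return SequenceMatcher(None, CANGJIE[a], CANGJIE[b]).ratio()


def visually_similar(char: str, threshold: float = 0.6) -> list[str]:
    """Characters in the toy table scoring at least `threshold`, best first."""
    scored = [
        (code_similarity(char, other), other)
        for other in CANGJIE
        if other != char
    ]
    # Sort by descending score, breaking ties by character for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored if score >= threshold]


if __name__ == "__main__":
    print(visually_similar("昌"))
    print(visually_similar("林"))
```

A real system would need a full code table and a measure tuned to character structure; this sketch only shows the general shape of a code-based lookup.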