Chinese segmentation problem

何萬順

Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/2091

DC Field	Value	Language
dc.creator	何萬順	zh_TW
dc.date	1995-02	-
dc.date.accessioned	2008-09-19T06:42:47Z	-
dc.date.available	2008-09-19T06:42:47Z	-
dc.date.issued	2008-09-19T06:42:47Z	-
dc.identifier.uri	https://nccur.lib.nccu.edu.tw/handle/140.119/2091	-
dc.description.abstract	The correct segmentation of a sentence into words is essential in the computerized analysis of natural languages, and the generation of a sentence involves the proper composition of individually separate words. Unlike most languages written in phonetic alphabets, Chinese texts do not indicate word boundaries, as spacing remains constant between ideographic characters. We demonstrate the problem of segmenting a written Chinese sentence into words in the context of machine translation and present some of the previous partial solutions: pre-editing, "maximal matching", frequency priority, and other heuristic strategies. Furthermore, we propose the use of domain-specific frequency and a "no-window" principle in the implementation of the maximal matching strategy, in combination with other heuristic rules, as a more thorough scheme for Chinese segmentation.	en
dc.format.extent	1533406 bytes	-
dc.format.mimetype	application/pdf	-
dc.language	zh_TW	en
dc.language.iso	en_US	-
dc.relation	文山評論，1，79-90	en
dc.title	Chinese segmentation problem	en
dc.type	article	en
item.grantfulltext	open	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.fulltext	With Fulltext	-
item.cerifentitytype	Publications	-
item.openairetype	article	-
item.languageiso639-1	en_US	-
Appears in Collections:	Journal Articles

Files in This Item:

File	Description	Size	Format
B01.pdf		1.5 MB	Adobe PDF
