Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/111687
DC Field | Value | Language
dc.contributor | Department of Computer Science (資訊科學系) | zh_Tw
dc.creator | 劉昭麟 | zh_TW
dc.creator | Liu, Chao-Lin | en_US
dc.creator | Huang, Chih-Kai | en_US
dc.creator | Wang, Hongsu | en_US
dc.creator | Bol, Peter K. | en_US
dc.date | 2015-12 | en_US
dc.date.accessioned | 2017-08-09T09:27:07Z | -
dc.date.available | 2017-08-09T09:27:07Z | -
dc.date.issued | 2017-08-09T09:27:07Z | -
dc.identifier.uri | http://nccur.lib.nccu.edu.tw/handle/140.119/111687 | -
dc.description.abstract | Person names and location names are essential building blocks for identifying events and social networks in historical documents written in literary Chinese. We pioneer research on algorithmically recognizing named entities in literary Chinese for historical studies, using language-model-based and conditional-random-field-based methods, and extend our work to mining the document structures of historical documents. Practical evaluations were conducted with texts extracted from more than 220 volumes of local gazetteers (Difangzhi). Difangzhi is a huge collection, and the single most important one, containing information about officers who served in local government in Chinese history. Our methods performed very well on these realistic tests. Thousands of names and addresses were identified from the texts. A good portion of the extracted names match the biographical information currently recorded in the China Biographical Database (CBDB) of Harvard University, and many others can be verified by historians and will become new additions to CBDB. © 2015 IEEE. | en_US
dc.format.extent | 212 bytes | -
dc.format.mimetype | text/html | -
dc.relation | Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015, 1629-1638 | en_US
dc.subject | Computational linguistics; Data mining; History; Natural language processing systems; Random processes; Conditional random field; Digital humanities; Document structure; Harvard University; Historical documents; Language model; Pattern based method; Text mining; Big data | en_US
dc.title | Mining local gazetteers of literary Chinese with CRF and pattern based methods for biographical information in Chinese history | en_US
dc.type | conference |
dc.identifier.doi | 10.1109/BigData.2015.7363931 |
dc.doi.uri | http://dx.doi.org/10.1109/BigData.2015.7363931 |
item.fulltext | With Fulltext | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.grantfulltext | restricted | -
item.openairetype | conference | -
item.cerifentitytype | Publications | -
Appears in Collections: Conference Papers (會議論文)