Title THE NCCU CORPUS OF SPOKEN CHINESE: MANDARIN, HAKKA, AND SOUTHERN MIN
Authors Chui, Kawai; Lai, Huei-ling
徐嘉慧; 賴惠玲
Contributor Department of English
Date 2008-12
Upload time 7-Jan-2014 17:28:00 (UTC+8)
Abstract In Taiwan, most people speak Mandarin, Southern Min, or Hakka. Not only are the three Chinese dialects undergoing linguistic changes, but the population of Southern Min and Hakka is also diminishing. The NCCU Corpus of Spoken Chinese is thus a project of language documentation whereby open online access to Mandarin, Hakka, and Southern Min data is provided for non-profit-making research. As a language documentation project, the NCCU spoken corpus focuses on collecting and archiving spoken forms of various types. It consists of three sub-corpora, namely the Corpus of Spoken Mandarin, the Corpus of Spoken Hakka, and the Corpus of Spoken Southern Min. The three corpora share a common scheme for the collection of spoken data, mostly in the form of spontaneous face-to-face conversations. The infrastructure of the corpus is designed in a simple yet user-friendly way, so that data can be processed efficiently in the database, and users can browse the spoken data directly from the web. We hope that our work can encourage more people to engage in building up spoken corpora from different perspectives and for different purposes.
Relation 臺灣語言學期刊 (Taiwan Journal of Linguistics), 6(2), 119-144
Type article
DOI http://dx.doi.org/10.6519/TJL.2008.6(2).5
dc.contributor Department of English [en_US]
dc.creator (Authors) Chui, Kawai; Lai, Huei-ling [en_US]
dc.creator (Authors) 徐嘉慧; 賴惠玲 [zh_TW]
dc.date (Date) 2008-12 [en_US]
dc.date.accessioned 7-Jan-2014 17:28:00 (UTC+8)
dc.date.available 7-Jan-2014 17:28:00 (UTC+8)
dc.date.issued (Upload time) 7-Jan-2014 17:28:00 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/63350
dc.format.extent 766139 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US
dc.relation (Relation) 臺灣語言學期刊, 6(2), 119-144 [en_US]
dc.relation (Relation) Taiwan Journal of Linguistics, 6(2), 119-144 [en_US]
dc.title (Title) THE NCCU CORPUS OF SPOKEN CHINESE: MANDARIN, HAKKA, AND SOUTHERN MIN [en_US]
dc.type (Type) article [en]
dc.identifier.doi (DOI) 10.6519/TJL.2008.6(2).5
dc.doi.uri (DOI) http://dx.doi.org/10.6519/TJL.2008.6(2).5