Title A Statistical Analysis of Chinese Writing Style in Xin Qingnian 新青年 (New Youth Magazine) in the 1910s and 1920s
Author 余清祥
Yue, Jack C.
Lin, Curtis
Contributor Department of Statistics
Keywords Text Mining ; Xin Qingnian (New Youth Magazine) ; Writing Style ; Exploratory Data Analysis ; Species Diversity
Date 2020
Upload time 25-Jun-2021 10:16:25 (UTC+8)
Abstract Big Data has been very popular in many fields since IBM first introduced the term in 2010. As the amount of available information grows, the ability to process it efficiently has become a necessary skill. Information can appear in various forms, such as texts and pictures, which normally are not in a digital format. Texts, for example, have no specific format and must be structured before they can be analysed. The structuring process depends on the study’s objective and the data’s attributes. In this study, we propose a procedure for analysing Chinese texts, based on Tukey’s Exploratory Data Analysis and the concept of species diversity. To evaluate the proposed approach, we use articles from the New Youth Magazine published from 1915 to 1926. We find that the proposed method can trace the change in Chinese writing style through the texts of the New Youth Magazine. It seems that more characters are common in classical (or literary) Chinese, whereas more two-character and multi-character words are common in modern Chinese.
Relation Rivista degli Studi Orientali, Vol.XCII, No.4, pp.167-181
Type article
dc.contributor Department of Statistics
dc.creator (Author) 余清祥
dc.creator (Author) Yue, Jack C.
dc.creator (Author) Lin, Curtis
dc.date (Date) 2020
dc.date.accessioned 25-Jun-2021 10:16:25 (UTC+8)
dc.date.available 25-Jun-2021 10:16:25 (UTC+8)
dc.date.issued (Upload time) 25-Jun-2021 10:16:25 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/135886
dc.format.extent 658247 bytes
dc.format.mimetype application/pdf
dc.relation (Relation) Rivista degli Studi Orientali, Vol.XCII, No.4, pp.167-181
dc.subject (Keywords) Text Mining ; Xin Qingnian (New Youth Magazine) ; Writing Style ; Exploratory Data Analysis ; Species Diversity
dc.title (Title) A Statistical Analysis of Chinese Writing Style in Xin Qingnian 新青年 (New Youth Magazine) in the 1910s and 1920s
dc.type (Type) article