Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/120596
Title: An Internet Slang Translator Based on Decision Tree and Bi-gram Language Model
Original title: 基於決策樹與二元語言模型的網路用語轉譯系統
Authors: Yang, Heng-Li (楊亨利)
Huang, Hung-Chang (黃泓彰)
Lin, Qing-Feng (林青峰)
Contributor: Department of Management Information Systems (資管系)
Keywords: Internet slang (網路用語); Internet buzzwords (網路流行語); text normalization (文字正規化); decision tree (決策樹); bi-gram language model (語言模型)
Date: Mar-2015
Upload date: 17-Oct-2018
Abstract: In text mining of Chinese content, Internet slang lowers the accuracy of text segmentation. Translating Internet slang into formal Chinese helps segmentation and also reveals the information implicit in the slang. To translate Internet slang, this study first collected slang definitions and web texts. The slang was then categorized, and a translation method, based mainly on a decision tree and a bi-gram language model, was developed for each category. The translator was then implemented. It correctly detected and translated 81 percent of the Internet slang in web texts, with a precision of 90 percent. The study concludes that the proposed methods are applicable to Internet slang translation.
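Abstract (translated from the Chinese): The Internet slang and buzzwords found in web articles obstruct text analysis designed for formal Chinese; translating Internet slang into formal Chinese helps recover more usable information. To do so, this study collected definitions of Internet slang and web articles, categorized the slang, and applied translation methods based on a decision tree and a language model suited to each category. The translation system detects and translates about 81% of Internet slang, with a translation precision of about 90%; the proposed system based on a decision tree and a language model should therefore be suitable for translating Internet slang.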
Relation: Journal of e-Business (電子商務學報), Vol. 17, No. 1, pp. 25-48
Type: article
DOI: http://dx.doi.org/10.6188/JEB.2015.17(1).02
Appears in Collections: Conference Papers (會議論文)

Files in This Item:
File: 25-48.pdf
Size: 1.26 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.