Please use this identifier to cite or link to this item:
https://ah.lib.nccu.edu.tw/handle/140.119/112468
Field | Value |
---|---|
Title | Distributed keyword vector representation for document categorization |
Authors | Hsieh, Yu Lun; Liu, Shih Hung; Chang, Yung Chun; Hsu, Wen-Lian |
Contributor | Department of Computer Science |
Keywords | Artificial intelligence; Neural networks; Vectors; Comprehensive performance evaluation; Context information; Document categorization; Document Representation; Information explosion; Similarity measure; Vector representations; word embedding; Vector spaces |
Date | Feb-2016 |
Upload time | 31-Aug-2017 |
Abstract | In the age of information explosion, efficiently categorizing the topic of a document can assist our organization and comprehension of the vast amount of text. In this paper, we propose a novel approach, named DKV, for document categorization using distributed real-valued vector representation of keywords learned from neural networks. Such a representation can project rich context information (or embedding) into the vector space, and subsequently be used to infer similarity measures among words, sentences, and even documents. Using a Chinese news corpus containing over 100,000 articles and five topics, we provide a comprehensive performance evaluation to demonstrate that by exploiting the keyword embeddings, DKV paired with support vector machines can effectively categorize a document into the predefined topics. Results demonstrate that our method can achieve the best performances compared to several other approaches. |
Relation | TAAI 2015 - 2015 Conference on Technologies and Applications of Artificial Intelligence, 245-251 |
Type | conference |
DOI | http://dx.doi.org/10.1109/TAAI.2015.7407126 |

Appears in Collections: Conference Papers
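
The abstract mentions that keyword embeddings can be used to infer similarity among documents. The record does not describe how DKV does this, so the following is only a minimal sketch of the general idea under stated assumptions: the keyword vectors are invented toy values, a document vector is taken as the plain average of its keyword vectors, and similarity is measured with cosine similarity. The names `KEYWORD_VECTORS`, `document_vector`, and `cosine_similarity` are hypothetical, and the support vector machine classifier named in the abstract is not shown.

```python
import math

DIM = 3  # dimensionality of the toy embeddings (an assumption for this sketch)

# Toy keyword embeddings; the values are invented for illustration only.
# In practice such vectors would be learned from a corpus by a neural model.
KEYWORD_VECTORS = {
    "election": [0.9, 0.1, 0.0],
    "parliament": [0.8, 0.2, 0.1],
    "match": [0.1, 0.9, 0.0],
    "goal": [0.0, 0.8, 0.2],
}


def document_vector(tokens):
    """Average the vectors of the known keywords in a tokenized document."""
    vectors = [KEYWORD_VECTORS[t] for t in tokens if t in KEYWORD_VECTORS]
    if not vectors:
        # No known keyword: fall back to the zero vector.
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two reference documents and one query document, already tokenized.
politics_doc = document_vector(["election", "parliament", "the"])
sports_doc = document_vector(["match", "goal"])
query_doc = document_vector(["parliament", "vote"])

sim_politics = cosine_similarity(query_doc, politics_doc)
sim_sports = cosine_similarity(query_doc, sports_doc)
print("closer to politics" if sim_politics > sim_sports else "closer to sports")
```

In a pipeline like the one the abstract outlines, document vectors of this kind would serve as input features for a classifier rather than being compared directly.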
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
index.html | | 209 B | HTML2 |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.