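Extraction of semantic annotation document using text mining techniques in cloud computing environment (雲端運算服務環境下運用文字探勘於語意註解網頁文件分析之研究)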


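Title	雲端運算服務環境下運用文字探勘於語意註解網頁文件分析之研究 (Extraction of semantic annotation document using text mining techniques in cloud computing environment)
Author	黃孝文
Contributors	楊建民 (advisor); 黃孝文
Keywords	Cloud computing; Document classification; Semantic annotation; Naive Bayes classifier
Date	2009
Upload time	11-Oct-2011 16:50:28 (UTC+8)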
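Abstract	With the rapid growth of the Internet, the datasets that data mining and text mining must analyze keep getting larger. Running the analysis on a single machine is constrained by that machine's memory and computing power: computation time rises sharply, and the size of the dataset that can be analyzed is limited. Semantic annotation extracts the important content of a document and highlights its topics, which strengthens the results of data mining and text mining. Data mining, text mining, and semantic annotation all involve large-scale data processing. Cloud computing balances the load by distributing the work across every machine in a computing cluster, which speeds up computation and storage and lowers overall risk.
This study uses Hadoop to implement a cloud text mining platform for distributed text mining and result analysis. The empirical analysis uses the Reuters dataset (Reuters-21578), which contains 21,578 news documents, divided into a training set and a test set according to the Mod Apte split. Document classification consists of four parts: data preprocessing, which converts data formats; semantic annotation, which adds more detailed links and descriptions to document content; the classifiers (naive Bayes and complementary naive Bayes), which produce the predictive models; and the evaluator, which assesses the classification results. The Reuters dataset is processed by stopword removal, appended with semantic annotation data, and grouped by document vocabulary length before the two classifiers are trained. Classification accuracy on the test set serves as the empirical result.
Based on the experiments, the study examines how stopword removal, semantic annotation, the classification algorithm, and document vocabulary length affect classification accuracy: (1) removing stopwords prevents high-frequency stopwords from harming the classification predictions; (2) semantic annotation, as a way of obtaining metadata, improves document classification; (3) the complementary naive Bayes classifier reduces the misclassification caused by skewed data; (4) documents with a longer vocabulary length dominate the predictions to some degree, which causes misclassification and lowers accuracy. Adjusting these factors improves classification accuracy.
The proposed system runs document classification algorithms in a cloud computing environment, so large datasets can be analyzed more quickly. Using semantic annotation as a source of metadata gives the model-building process more information to analyze and improves the accuracy of machine judgments. Documents can also be converted into Semantic Web documents for retrieval by Semantic Web search engines. Future work should add data from sites with large amounts of unstructured data, such as Twitter or Facebook, so that the platform can analyze larger datasets; it should also consider how the concentration of the class distribution affects classification accuracy, and implement better-performing classification algorithms to improve the overall results.
Businesses that perform data mining and text mining now need to handle large-scale datasets. The computational resources of individual servers are often limited and not efficient enough for analytical jobs. If these jobs run on cloud computing clusters, results can be obtained quickly on a large dataset without "out of memory" problems. In this study, a series of experiments is conducted to measure and analyze the accuracy of classification algorithms implemented on Hadoop using the Reuters-21578 dataset. The text mining process consists of four stages: (1) data preprocessing, (2) semantic annotation, (3) classifier, and (4) evaluator. Reuters-21578 was divided into a training set and a test set based on the Mod Apte split, processed by stopword removal, appended with semantic annotations as metadata, and split into several subsets according to document size.
The experiments outline several issues to consider when conducting text mining. According to the experimental results, stopword removal, semantic annotation, the choice of classification algorithm, and document size all affect classification accuracy. First, stopword removal prevents common words from becoming noise that harms the classification result. Second, semantic annotation, as extra information, can improve the result. Third, the complementary naive Bayes algorithm can address the decision boundary problem that naive Bayes cannot handle. Fourth, long documents can dominate the classification results. Fifth, the class imbalance problem can cause a drop in classification accuracy. Text mining results can be improved by adjusting the factors identified above.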
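The classification steps named in the abstract (stopword removal, a naive Bayes classifier, and a complementary naive Bayes classifier) can be illustrated with a minimal single-machine sketch. It is not the thesis's Hadoop implementation: the stopword list, the four training documents, and all function names are hypothetical, and the complementary variant is simplified to scoring each class by the term counts of all other classes, without the weight normalization used in fuller formulations.

```python
import math
from collections import Counter, defaultdict

# Toy stopword list (an assumption for illustration, not the thesis's list).
STOPWORDS = {"the", "a", "of", "to", "in", "and", "is"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def train(docs):
    """docs: list of (label, text). Returns (term counts per class, doc counts per class, vocabulary)."""
    term_counts = defaultdict(Counter)
    class_docs = Counter()
    for label, text in docs:
        class_docs[label] += 1
        term_counts[label].update(tokenize(text))
    vocab = {w for counts in term_counts.values() for w in counts}
    return term_counts, class_docs, vocab

def classify_nb(text, term_counts, class_docs, vocab):
    """Multinomial naive Bayes with add-one smoothing: log prior plus summed log P(word | class)."""
    total_docs = sum(class_docs.values())
    best, best_score = None, -math.inf
    for c in class_docs:
        denom = sum(term_counts[c].values()) + len(vocab)
        score = math.log(class_docs[c] / total_docs)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are skipped
                score += math.log((term_counts[c][w] + 1) / denom)
        if score > best_score:
            best, best_score = c, score
    return best

def classify_cnb(text, term_counts, class_docs, vocab):
    """Simplified complement naive Bayes: pick the class whose complement explains the text worst."""
    best, best_score = None, -math.inf
    for c in class_docs:
        complement = Counter()
        for other, counts in term_counts.items():
            if other != c:
                complement.update(counts)
        denom = sum(complement.values()) + len(vocab)
        score = 0.0
        for w in tokenize(text):
            if w in vocab:
                score -= math.log((complement[w] + 1) / denom)
        if score > best_score:
            best, best_score = c, score
    return best

# A deliberately skewed toy training set: three "earn" documents, one "grain" document.
TRAIN = [
    ("earn", "profit rose in the quarter"),
    ("earn", "net profit and dividend"),
    ("earn", "quarter dividend declared"),
    ("grain", "wheat harvest"),
]
model = train(TRAIN)

if __name__ == "__main__":
    for sample in ("wheat harvest rose", "profit dividend"):
        print(sample, "->", classify_nb(sample, *model), classify_cnb(sample, *model))
```

Because the complementary variant estimates its term weights from every class except the candidate one, a rare class such as `grain` is scored against the much larger pool of `earn` counts, which is the intuition behind using it on skewed data. On this tiny example both classifiers agree; it shows the mechanics only and says nothing about the accuracy figures reported in the thesis.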
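Description	Master's thesis; National Chengchi University (國立政治大學); Graduate Institute of Management Information Systems (資訊管理研究所); 97356012; 98
Source	http://thesis.lib.nccu.edu.tw/record/#G0097356012
Type	thesis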

dc.contributor.advisor	楊建民	zh_TW
dc.contributor.author (Authors)	黃孝文	zh_TW
dc.creator (Author)	黃孝文	zh_TW
dc.date (Date)	2009	en_US
dc.date.accessioned	11-Oct-2011 16:50:28 (UTC+8)	-
dc.date.available	11-Oct-2011 16:50:28 (UTC+8)	-
dc.date.issued (Upload time)	11-Oct-2011 16:50:28 (UTC+8)	-
dc.identifier (Other Identifiers)	G0097356012	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/51559	-
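dc.description (Description)	Master's (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	Graduate Institute of Management Information Systems (資訊管理研究所)	zh_TW
dc.description (Description)	97356012	zh_TW
dc.description (Description)	98	zh_TW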
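dc.description.tableofcontents
Acknowledgements
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  Section 1 Research Background
  Section 2 Research Motivation
  Section 3 Research Objectives
  Section 4 Research Structure
Chapter 2 Literature Review
  Section 1 Data Mining and Text Mining
    2.1.1 Data Mining
    2.1.2 Text Mining
    2.1.3 Differences between Data Mining and Text Mining
    2.1.4 Naive Bayes Classifier
  Section 2 Cloud Computing
    2.2.1 Definition of Cloud Computing
    2.2.2 Existing Cloud Computing Services
  Section 3 The MapReduce Programming Model
    2.3.1 Google's Distributed Database BigTable
    2.3.2 Research Applying the MapReduce Programming Model to Data Mining
  Section 4 Frameworks Implementing the MapReduce Architecture
    2.4.1 Hadoop
    2.4.2 DisCo
    2.4.3 Other Options for Using Cloud Computing Platforms
  Section 5 Current Developments in the Semantic Web
    2.5.1 Linked Data
    2.5.2 Facebook
    2.5.3 Wolfram Alpha
  Section 6 Semantic Annotation
    2.6.1 Categories of Semantic Annotation
    2.6.2 Development of Semantic Annotation
    2.6.3 Applications of Semantic Annotation in Data Mining
Chapter 3 Research Method
  Section 1 Research Design
  Section 2 Text Dataset Reuters 21578
    3.2.1 Document Format of the Reuters Dataset
    3.2.2 Skewed-Data Characteristics of the Reuters Dataset
  Section 3 Data Preprocessing
    3.3.1 Tokenization
    3.3.2 Stop Words Removal
    3.3.3 Stemming
  Section 4 Semantic Annotation
  Section 5 Classifier
    3.4.1 Naive Bayes Classifier
    3.4.2 Complementary Naive Bayes Classifier
    3.4.3 Implementation of the Naive Bayes and Complementary Naive Bayes Classifiers
  Section 6 Evaluator
  Section 7 Platform Setup
Chapter 4 Research Results
  Section 1 Effect of Stop Word Removal on Classification Accuracy
  Section 2 Effect of Semantic Annotation on Classification Accuracy
  Section 3 Comparison of Classification Accuracy between the Naive Bayes and Complementary Naive Bayes Classifiers
  Section 4 Effectiveness of the Complementary Naive Bayes Classifier in Eliminating the Decision Boundary Problem Caused by Skewed Data
  Section 5 Effect of Document Vocabulary Length on Classification Accuracy
  Section 6 Avoiding Model Overfitting
  Section 7 Effect of Class Distribution Uniformity on Classification Results
  Section 8 Summary of Conclusions from Each Experimental Stage
Chapter 5 Conclusions and Suggestions
  Section 1 Conclusions
  Section 2 Future Research Directions
References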
dc.language.iso	en_US	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0097356012	en_US
dc.subject (Keywords)	Cloud computing (雲端運算)	zh_TW
dc.subject (Keywords)	Document classification (文件分類)	zh_TW
dc.subject (Keywords)	Semantic annotation (語意註解)	zh_TW
dc.subject (Keywords)	Naive Bayes classifier (簡單貝氏分類器)	zh_TW
dc.title (Title)	雲端運算服務環境下運用文字探勘於語意註解網頁文件分析之研究	zh_TW
dc.title (Title)	Extraction of semantic annotation document using text mining techniques in cloud computing environment	en_US
dc.type (Type)	thesis	en
