Please use this identifier to cite or link to this item: https://ah.nccu.edu.tw/handle/140.119/80326


Title: 由食譜資料探勘料理特徵樣式
Mining Cuisine Patterns from Recipe Dataset
Authors: 呂耀茹
Contributors: 沈錳坤
呂耀茹
Keywords: Big data
Data mining
Recipes
Cuisine
Date: 2015
Issue Date: 2016-01-04 16:58:11 (UTC+8)
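Abstract: In recent years, more and more people have been cooking for themselves for health reasons, which has also driven the growth of recipe community websites. Although data mining has become popular as big data has drawn attention, there has been little research on mining and analyzing large-scale recipe data.
This study collects recipe data from the well-known international recipe websites Allrecipes.com, Food.com, and Yummly.com, and mines the ingredient patterns and characteristics of the world's major cuisines, including cuisine flavor, commonly used ingredients, distinctive ingredients, core ingredients, ingredient pairing relationships, similarity and clustering among cuisines, and automatic cuisine classification.
For data preprocessing, this thesis proposes a method for resolving ingredient synonyms that combines an ingredient lexicon with a connected-component labeling algorithm. To mine the ingredient patterns and characteristics of cuisines, the study uses network analysis, association rules, Phi, PMI, and related methods to analyze each cuisine's distinctive ingredients, core ingredients, and ingredient pairing patterns. In addition, the thesis clusters cuisines by ingredient similarity using hierarchical clustering, in contrast to the usual grouping of cuisines by geographic location. It also proposes a hierarchical classification technique that automatically determines a recipe's cuisine from its ingredients.
Mining the patterns and characteristics of the world's cuisines from the large volume of user-generated data on recipe websites helps reveal the style and features of each cuisine, and the results can be applied to data management and search on recipe websites.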
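Illustrative note (not part of the thesis record): the abstract names Phi and PMI among the measures used to find a cuisine's distinctive ingredients. The following is a minimal Python sketch of how such scores could be computed from a 2x2 contingency table, assuming "Phi" refers to the phi coefficient and that each recipe is a (cuisine label, ingredient set) pair. The function name `pmi_and_phi` and the toy recipes are hypothetical; the thesis's actual definitions and data may differ.

```python
import math

def pmi_and_phi(recipes, ingredient, cuisine):
    """Score how strongly an ingredient is associated with a cuisine.

    recipes: list of (cuisine_label, set_of_ingredients) pairs (assumed layout).
    Returns (PMI in bits, phi coefficient).
    """
    n = len(recipes)
    # 2x2 contingency counts: ingredient present/absent vs. cuisine match/mismatch
    n11 = sum(1 for c, ings in recipes if c == cuisine and ingredient in ings)
    n10 = sum(1 for c, ings in recipes if c != cuisine and ingredient in ings)
    n01 = sum(1 for c, ings in recipes if c == cuisine and ingredient not in ings)
    n00 = n - n11 - n10 - n01

    # Pointwise mutual information: log2( P(x, y) / (P(x) * P(y)) )
    p_xy = n11 / n
    p_x = (n11 + n10) / n
    p_y = (n11 + n01) / n
    pmi = math.log2(p_xy / (p_x * p_y)) if p_xy > 0 else float("-inf")

    # Phi coefficient of the same 2x2 table
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    phi = (n11 * n00 - n10 * n01) / denom if denom else 0.0
    return pmi, phi

# Hypothetical toy data, for illustration only
toy_recipes = [
    ("Italian", {"basil", "tomato", "olive oil"}),
    ("Italian", {"tomato", "garlic"}),
    ("Japanese", {"soy sauce", "rice"}),
    ("Japanese", {"soy sauce", "garlic"}),
]

tomato_pmi, tomato_phi = pmi_and_phi(toy_recipes, "tomato", "Italian")
garlic_pmi, garlic_phi = pmi_and_phi(toy_recipes, "garlic", "Italian")
```

In this toy data, "tomato" appears only in Italian recipes and scores high on both measures, while "garlic" is spread evenly across cuisines and scores zero on both.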
Reference: [1] Rakesh Agrawal and Ramakrishnan Srikant, Fast Algorithms for Mining Association Rules, International Conference on Very Large Data Bases, VLDB, 1994.
[2] Yong-Yeol Ahn, Sebastian E. Ahnert, James P. Bagrow, and Albert-László Barabási, Flavor Network and the Principles of Food Pairing, Scientific Reports, Vol. 1, 2011.
[3] Florian Beil, Martin Ester, and Xiaowei Xu, Frequent Term-based Text Clustering. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002.
[4] Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python, O'Reilly Media, Inc., 2009.
[5] Stephen P. Borgatti, Centrality and Network Flow, Social Networks, Vol. 27 No.1, 2005.
[6] Corrado Boscarino, N. J. Koenderink, V. Nedović, and J. L. Top, Automatic extraction of ingredient's substitutes. ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication. ACM, 2014.
[7] L. Breiman, Random Forests, Machine Learning, Vol. 45, 2001.
[8] Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E. Leiserson, Introduction to Algorithms (the 2nd Edition), McGraw-Hill, 2001.
[9] Karam Gouda and Mohammed Zaki, Efficiently Mining Maximal Frequent Itemsets, IEEE International Conference on Data Mining, 2001.
[10] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001.
[11] Anna Huang, Similarity Measures for Text Document Clustering, Sixth New Zealand Computer Science Research Student Conference, Christchurch, New Zealand, 2008.
[12] James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela H. Byers, Big Data: the Next Frontier for Innovation, Competition, and Productivity, McKinsey & Company, 2011.
[13] Rada Mihalcea, Courtney Corley, and Carlo Strapparava, Corpus-based and Knowledge-based Measures of Text Semantic Similarity, AAAI, 2006.
[14] Trung Duc Nguyen, Diep Thi-Ngoc Nguyen, and Yasushi Kiyoki, A Regional Food's Features Extraction Algorithm and Its Application, International Workshop on Multimedia for Cooking & Eating Activities, 2013.
[15] Tore Opsahl, Filip Agneessens, and John Skvoretz, Node Centrality in Weighted Networks: Generalizing Degree and Shortest Paths, Social Networks Vol. 32, 2010.
[16] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, 1993.
[17] Carlos N. Silla Jr., and Alex A. Freitas, A Survey of Hierarchical Classification across Different Application Domains, Data Mining and Knowledge Discovery, Vol. 22, 2011.
[18] Han Su, Ting-Wei Lin, Cheng-Te Li, Man-Kwan Shan, and Janet Chang, Automatic Recipe Cuisine Classification by Ingredients, ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, 2014.
[19] Aixin Sun, Ee-Peng Lim, and Wee-Keong Ng, Performance Measurement Framework for Hierarchical Text Classification, Journal of the American Society for Information Science and Technology, Vol. 54, 2003.
[20] Chun-Yuen Teng, Yu-Ru Lin, and Lada A. Adamic, Recipe Recommendation Using Ingredient Networks, ACM Web Science Conference, 2012.
[21] Kristin M. Tolle, D. Stewart W. Tansley, and Anthony J. Hey, The Fourth Paradigm: Data-Intensive Scientific Discovery [Point of View], Proceedings of the IEEE, Vol. 99, 2011.
[22] Lav R. Varshney, Florian Pinel, Kush R. Varshney, Debarun Bhattacharjya, Angela Schörgendorfer, and Yi-Min Chee, A Big Data Approach to Computational Creativity, arXiv preprint arXiv:1311.1213, 2013.
[23] Kush R. Varshney, Lav R. Varshney, Jun Wang, and Daniel Myers, Flavor Pairing in Medieval European Cuisine: A study in Cooking with Dirty Data, International Joint Conference on Artificial Intelligence Workshops, 2013.
[24] Liping Wang, Qing Li, Na Li, Guozhu Dong, and Yu Yang, Substructure Similarity Measurement in Chinese Recipes. International Conference on World Wide Web, 2008.
[25] Yan Xu, Gareth Jones, JinTao Li, Bin Wang, and ChunMing Sun, A Study on Mutual Information-Based Feature Selection for Text Categorization, Journal of Computational Information Systems, Vol. 3, 2007.
[26] Gephi: https://gephi.org
[27] LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
[28] Phi wiki introduction, retrieved June 20 2015 from the World Wide Web https://en.wikipedia.org/wiki/Phi.
[29] Stanford Parser. http://nlp.stanford.edu/software/lex-parser
[30] SVM wiki introduction, retrieved June 18 2015 from the World Wide Web https://en.wikipedia.org/wiki/Support_vector_machine
[31] Weka: http://www.cs.waikato.ac.nz/ml/weka/
Description: Master's degree
National Chengchi University
In-service Master's Program, Department of Computer Science
102971008
Source URI: http://thesis.lib.nccu.edu.tw/record/#G0102971008
Data Type: thesis
Appears in Collections: [In-service Master's Program, Department of Computer Science] Theses

Files in This Item:

File          Size      Format
100801.pdf    5256Kb    Adobe PDF


All items in 學術集成 are protected by copyright, with all rights reserved.

