Title 世界城市的概念輪廓與連結:以Flickr Tags為例
The World Cities Concept Profiling And Concatenation: A Case Study On Flickr Tags
Author 曹期鈞
Tsao, Chi Chun
Contributors 劉吉軒
Liu, Jyi Shane
曹期鈞
Tsao, Chi Chun
Keywords Social network
Tagging systems
Tag types (POS)
Stemming
Tag co-occurrence
Date 2013
Uploaded 10-Feb-2014 14:56:30 (UTC+8)
Abstract Flickr, an online photo album service, emerged in an era of rapidly growing social networks and steadily increasing Internet bandwidth and speed. Flickr provides a set of APIs through which users and interested researchers can draw on the data Flickr collects, and the topics it covers, to observe how social networks change.

A social network consists of nodes and the links between them. Common network models fall into two structures, one-mode and two-mode. This study adopts a two-mode network, which contains two kinds of nodes and is built from cities and tags, as the basis for a tag analysis method. The method links the tags that Flickr users collect and assign to photos with the concept profiles of world cities: it extracts city semantics from the tags assigned to photos on Flickr, handles part-of-speech (POS) tagging, stemming, and noise removal, and derives each city's concept profile from the resulting tag rankings.

In addition, the study uses Flickr tag data to compile the top 100 tags for 41 cities and then selects the top 10 tags for each city, so that related cities can be grouped and compared. Tag co-occurrence is used to measure how strongly tags are associated with a city and to identify the tags that two cities share, which reveals the tag combinations that connect one city to another. Finally, the experimental results are compared with tag clouds generated from an analysis of Flickr's own Popular Tags; the 85% agreement between the two supports the reliability of the results.
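The pipeline described in the abstract (clean the tags, rank them per city, then compare cities by the tags they share) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the toy PHOTOS data, the NOISE stop list, the crude plural-stripping rule that stands in for a real stemmer such as Porter's, and the use of Jaccard overlap as the co-occurrence score. The record does not state which co-occurrence formula the thesis actually uses.

```python
from collections import Counter

# Hypothetical toy data: city -> one list of raw tags per photo.
PHOTOS = {
    "paris": [["Eiffel", "towers", "night"], ["eiffel", "tower", "2013"],
              ["Louvre", "museum"]],
    "london": [["Tower", "bridge", "night"], ["museums", "bigben"],
               ["towers", "IMG"]],
    "tokyo": [["temple", "night"], ["Temples", "sakura"]],
}

# Assumed stop list of uninformative tags (camera prefixes and the like).
NOISE = {"img", "dsc", "photo"}


def normalize(tag):
    """Lowercase a tag, drop noise, and strip a trailing plural 's'.

    The plural rule is a crude stand-in for a real stemmer.
    Returns None when the tag should be discarded.
    """
    t = tag.strip().lower()
    if not t.isalpha() or t in NOISE:
        return None
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return t


def city_tag_counts(photos):
    """Two-mode (city, tag) edge weights: photos of a city carrying each tag."""
    counts = {}
    for city, photo_tags in photos.items():
        counter = Counter()
        for tags in photo_tags:
            # A set counts each cleaned tag at most once per photo.
            counter.update({n for n in map(normalize, tags) if n})
        counts[city] = counter
    return counts


def top_tags(counts, k):
    """Top-k tags per city, ties broken alphabetically for determinism."""
    return {
        city: [t for t, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for city, c in counts.items()
    }


def shared_tags(top, a, b):
    """Tags that appear in the top lists of both cities."""
    return sorted(set(top[a]) & set(top[b]))


def jaccard(top, a, b):
    """Overlap of two cities' top-tag sets, used here as a co-occurrence score."""
    sa, sb = set(top[a]), set(top[b])
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    top3 = top_tags(city_tag_counts(PHOTOS), 3)
    print(top3)
    print(shared_tags(top3, "paris", "london"), jaccard(top3, "paris", "london"))
```

On this toy data with k = 3, Paris and London share only the tag "tower", so their score is 1/5 = 0.2, while Paris and Tokyo share nothing. Raising k to the thesis's top 10 or top 100 would follow the same pattern on real Flickr data.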
Description Master's
National Chengchi University
Department of Computer Science
98971020
102
Source http://thesis.lib.nccu.edu.tw/record/#G0098971020
Type thesis
dc.contributor.advisor 劉吉軒 [zh_TW]
dc.contributor.advisor Liu, Jyi Shane [en_US]
dc.contributor.author (Authors) 曹期鈞 [zh_TW]
dc.contributor.author (Authors) Tsao, Chi Chun [en_US]
dc.creator (Author) 曹期鈞 [zh_TW]
dc.creator (Author) Tsao, Chi Chun [en_US]
dc.date (Date) 2013 [en_US]
dc.date.accessioned 10-Feb-2014 14:56:30 (UTC+8)
dc.date.available 10-Feb-2014 14:56:30 (UTC+8)
dc.date.issued (Upload time) 10-Feb-2014 14:56:30 (UTC+8)
dc.identifier (Other Identifiers) G0098971020 [en_US]
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/63708
dc.description (Description) 碩士 (Master's) [zh_TW]
dc.description (Description) 國立政治大學 (National Chengchi University) [zh_TW]
dc.description (Description) 資訊科學學系 (Department of Computer Science) [zh_TW]
dc.description (Description) 98971020 [zh_TW]
dc.description (Description) 102 [zh_TW]
dc.description.tableofcontents Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Data
1.4 Research Contributions
1.5 Thesis Organization
Chapter 2 Literature Review
2.1 Evolution of Social Network Analysis
2.2 Social Network Analysis Models
2.2.1 Node
2.2.2 Degree Centrality
2.2.3 Closeness Centrality
2.2.4 Betweenness Centrality
2.3 Flickr Tags Analysis
2.3.1 Tag Cloud
2.3.2 Overview of the Flickr API Architecture
2.3.3 Flickr API Graphic Classification
2.3.4 Flickr Tag Clusters Analysis
2.3.5 Word Sense Analysis
2.3.5.1 POS Analysis
2.3.5.2 Polysemy and Synonymy Analysis
2.4 Flickr City Selection Analysis
2.4.1 Global Network Connectivity (GNC)
2.4.2 Supplementary Analysis Using Travel Websites
2.4.3 Other Considerations
2.5 Term Frequency Analysis
2.5.1 TF-IDF Analysis
2.5.2 Tag Co-occurrence
Chapter 3 Research Method and System Architecture
3.1 Research Framework
3.1.1 Data Sources
3.1.1.1 Data Retrieval with the Flickr API
3.1.1.2 Selection of the 41 Cities
3.1.1.3 Regional Grouping
3.2 Flickr Data Processing Steps
3.2.1 Language Issues
3.2.2 Stemming
3.2.3 Merging Synonyms
3.2.4 Noise Suppression
3.2.5 Duplicate Cities
3.2.6 Filtering Threshold
3.3 Research Tools
3.3.1 Introduction to NodeXL
3.3.2 Introduction to Gephi
3.3.3 Introduction to Stanford-Postagger
3.4 Database Construction
3.5 Weighting Analysis
Chapter 4 Experimental Results and Evaluation
4.1 Flickr Experimental Data
4.1.1 Sampling of Experimental Data
4.1.2 POS Analysis of Flickr Tags
4.1.3 Flickr Tag Frequency Statistics
4.1.3.1 Analysis of the 41 Cities
4.1.3.2 Analysis by Total Tag Count
4.1.4 Summary of Flickr Tag Frequency Statistics
4.2 Analysis of Selected Key Photos
4.2.1 Flickr Regional Grouping
4.2.1.1 Analysis of Cities in America
4.2.1.2 Analysis of Cities in Europe
4.2.1.3 Analysis of Cities in Asia
4.2.1.4 Analysis of Cities in Oceania
4.2.1.5 Summary of Regional Grouping
4.3 Tag Co-occurrence Experiments
4.3.1 Building the Undirected Co-occurrence Network
4.3.2 Tag Co-occurrence Analysis of All Flickr Cities
4.3.3 Tag Co-occurrence Analysis of Flickr America
4.3.4 Tag Co-occurrence Analysis of Flickr Europe
4.3.5 Tag Co-occurrence Analysis of Flickr Asia
4.3.6 Tag Co-occurrence Analysis of Flickr Oceania
4.3.7 Tag Cloud Analysis of All Flickr Cities
4.3.8 Summary of the Tag Co-occurrence Experiments
4.4 Experimental Summary
Chapter 5 Conclusions and Future Directions
5.1 Conclusions
5.2 Research Limitations
5.3 Future Research Directions
References
Appendix A
[zh_TW]
dc.format.extent 7370351 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0098971020 [en_US]
dc.subject (Keywords) 社會網路 (Social network) [zh_TW]
dc.subject (Keywords) 標記系統 (Tagging systems) [zh_TW]
dc.subject (Keywords) 標籤類型 (Tag types) [zh_TW]
dc.subject (Keywords) 詞幹分析 (Stemming analysis) [zh_TW]
dc.subject (Keywords) 字詞共現指標 (Tag co-occurrence index) [zh_TW]
dc.subject (Keywords) Social network [en_US]
dc.subject (Keywords) Tagging systems [en_US]
dc.subject (Keywords) POS [en_US]
dc.subject (Keywords) Stemming [en_US]
dc.subject (Keywords) Tag co-occurrence [en_US]
dc.title (Title) 世界城市的概念輪廓與連結:以Flickr Tags為例 [zh_TW]
dc.title (Title) The World Cities Concept Profiling And Concatenation: A Case Study On Flickr Tags [en_US]
dc.type (Type) thesis [en]