Tour Planning Using Landmark Photo Matching and Intelligent Character Recognition (以圖文辨識為基礎的旅遊路線規劃輔助工具) | Academic Output


Title	Tour Planning Using Landmark Photo Matching and Intelligent Character Recognition (以圖文辨識為基礎的旅遊路線規劃輔助工具)
Author	Huang, Cheng Ming (黃政明)
Contributors	Advisors: Liao, Wen Hung (廖文宏); Chen, Sheng Chih (陳聖智). Author: Huang, Cheng Ming (黃政明)
Keywords	mobile devices (行動裝置); landmark photo matching (地標辨識); intelligent character recognition (智慧型文字辨識)
Date	2011
Uploaded	30-Oct-2012 10:48:42 (UTC+8)
Abstract	The role of the smartphone has extended from simple voice communication to that of a multi-purpose everyday tool. Most smartphones now include a camera module, which is regarded as standard equipment. Users can easily photograph objects, scenes, or text of interest and build their own image databases. Travel apps are a common example of software that integrates content with multiple on-device sensors. On a mobile platform, an image recognition service can greatly help tourists learn about well-known landmarks, buildings, and other meaningful objects or text during a trip. However, building an efficient image recognition system for smartphones is challenging, both because image search and pattern recognition algorithms are complex and because the resources available on mobile platforms are limited. Striking a balance between accuracy and computational cost is therefore the central issue. Toward this goal, this thesis designs and develops mobile visual search and intelligent character recognition systems for mobile platforms. Specifically, it integrates two global feature descriptors for efficient image search and develops an intelligent character recognition engine that handles both printed and handwritten text. Experimental results show that top-3 accuracy reaches 80% in both the visual search and the intelligent character recognition tasks.
Description	Master's (碩士); National Chengchi University (國立政治大學); Master's Program in Digital Content (數位內容碩士學位學程); 98462011; 100
Source	http://thesis.lib.nccu.edu.tw/record/#G0098462011
Type	thesis

dc.contributor.advisor	廖文宏; 陳聖智	zh_TW
dc.contributor.advisor	Liao, Wen Hung; Chen, Sheng Chih	en_US
dc.contributor.author (Author)	黃政明	zh_TW
dc.contributor.author (Author)	Huang, Cheng Ming	en_US
dc.creator (Author)	黃政明	zh_TW
dc.creator (Author)	Huang, Cheng Ming	en_US
dc.date (Date)	2011	en_US
dc.date.accessioned	30-Oct-2012 10:48:42 (UTC+8)	-
dc.date.available	30-Oct-2012 10:48:42 (UTC+8)	-
dc.date.issued (Uploaded)	30-Oct-2012 10:48:42 (UTC+8)	-
dc.identifier (Other identifier)	G0098462011	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/54364	-
dc.description (Description)	Master's (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	Master's Program in Digital Content (數位內容碩士學位學程)	zh_TW
dc.description (Description)	98462011	zh_TW
dc.description (Description)	100	zh_TW
dc.description.tableofcontents	
1. Introduction
2. Related Work
2.1. Visual Search
2.1.1. Content-based Image Retrieval on PC
2.1.2. Mobile Image Search
2.1.3. Mobile Image Search Apps now
2.2. Intelligent Character Recognition
2.2.1. Printed Character Recognition
2.2.2. Handwritten Character Recognition
2.2.3. Mobile Character Recognition and Apps
3. Case Studies
3.1. HuayuNavi
3.2. iConference
4. Proposed Methodology
4.1. System Flowchart
4.2. Image Descriptors
4.2.1. Weighted Gist Descriptor
4.2.2. Average ENN Descriptor
4.2.3. Information Fusion
4.3. Intelligent Character Recognition
4.3.1. Feature Extraction
4.3.2. Recognition
5. Performance Evaluation
5.1. Data Collection
5.1.1. Visual Search Dataset
5.1.1.1. Field Study
5.1.1.2. Taiwan Landmark Image Set
5.1.2. Intelligent Character Recognition Dataset
5.2. Experimental Results
5.2.1. Visual Search
5.2.1.1. Different Parameters in ENN
5.2.1.2. Comparison of Individual and Hybrid Approaches
5.2.2. Intelligent Character Recognition
5.2.3. Routing Implementation on Mobile Platform
6. Comparative Analysis
6.1. Benchmark Database – Oxford Buildings Dataset
6.2. Experiments on Oxford Building Dataset
6.3. Other Approaches on Oxford Dataset
7. Conclusion and Future Work
8. References	zh_TW
dc.language.iso	en_US	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0098462011	en_US
dc.subject (Keywords)	行動裝置	zh_TW
dc.subject (Keywords)	地標辨識	zh_TW
dc.subject (Keywords)	智慧型文字辨識	zh_TW
dc.subject (Keywords)	mobile devices	en_US
dc.subject (Keywords)	landmark photo matching	en_US
dc.subject (Keywords)	intelligent character recognition	en_US
dc.title (Title)	以圖文辨識為基礎的旅遊路線規劃輔助工具	zh_TW
dc.title (Title)	Tour Planning Using Landmark Photo Matching and Intelligent Character Recognition	en_US
dc.type (Type)	thesis	en
