Title 雲端筆記之混合式文字切割與辨識
Segmentation and recognition of mixed characters for cloud-based notes
Author 王冠智
Wang, Guan Jhih
Contributors 廖文宏
Liao, Wen Hung
王冠智
Wang, Guan Jhih
Keywords stroke filter (文字結構濾波)
text font discrimination (字體分類)
text segmentation
noise removal (消除雜訊)
Date 2011
Upload time 1-Nov-2012 13:58:31 (UTC+8)
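Abstract Character recognition is one of the common applications of computer vision. As recognition accuracy has gradually improved, many new services have appeared. This thesis addresses the main problem of note-management software, text segmentation, and proposes two new methods for classifying printed and handwritten Chinese characters. We first filter out the highlight markings commonly found in note documents, then apply a stroke filter with a new kernel to obtain the text regions of the document; the new kernel substantially reduces the computation time of the original kernel. The thesis also uses the stroke filter response as a feature for distinguishing printed from handwritten text: because the stroke filter returns an energy response according to stroke structure, the more regular printed characters can be told apart from handwriting. In addition, the Sobel operator combined with different angle ranges is used for typeface recognition. The experimental results confirm that the proposed text segmentation and typeface classification methods are effective for processing the information in note documents.
Character recognition is an important and practical application of computer vision. With the advance of this technology, more and more services embedding text recognition functionality have become available. However, segmentation is still the central issue in many situations. In this thesis, we tackle the character segmentation problem in note-taking and note-management applications. We propose novel methods for the discrimination of handwritten and machine-printed Chinese characters. First, we perform noise removal using heuristics and apply a stroke filter with modified kernels to efficiently compute the bounding box for the text area. The responses of the stroke filter also serve as clues for differentiating machine-printed and handwritten texts. They are further enhanced using an SVM-based classifier that employs aggregated directional responses of edge detectors as input. Experimental results have validated the efficacy of the proposed approaches in terms of text localization and style recognition.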
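The "aggregated directional responses of edge detectors" mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the record does not give the actual kernels, angle ranges, or SVM settings, so the function name `orientation_histogram`, the 15-degree default bin width, and the `min_magnitude` cutoff are all assumptions made for illustration. The sketch applies the standard 3x3 Sobel kernels, folds each gradient direction into [0, 180) degrees, and accumulates gradient magnitude per angle bin; a vector of this kind is one plausible input for a classifier such as an SVM.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (GX) and vertical (GY) gradients.
GX = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
GY = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def orientation_histogram(image, bin_width=15, min_magnitude=1.0):
    """Return a normalised histogram of gradient orientations in [0, 180).

    image: list of equal-length rows of grey levels.
    bin_width: width of each angle bin in degrees (assumed; must divide 180).
    min_magnitude: gradients weaker than this are skipped (assumed cutoff).
    """
    if 180 % bin_width != 0:
        raise ValueError("bin_width must divide 180")
    bins = [0.0] * (180 // bin_width)
    height, width = len(image), len(image[0])
    # Skip the one-pixel border so the 3x3 window always fits.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            magnitude = math.hypot(gx, gy)
            if magnitude < min_magnitude:
                continue
            # Opposite gradient directions describe the same orientation,
            # so fold the angle into [0, 180).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against rounding up to exactly 180.
            bins[int(angle // bin_width) % len(bins)] += magnitude
    total = sum(bins)
    return [b / total for b in bins] if total else bins


if __name__ == "__main__":
    # A vertical step edge: all gradient energy falls into the first bin.
    step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
    print(orientation_histogram(step))
```

Weighting each bin by gradient magnitude rather than by pixel count is a design choice of this sketch; it lets strong stroke edges dominate over faint texture.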
References [1] K. Jung, K. I. Kim, and A. K. Jain, "Text information extraction in images and video: a survey," Pattern Recognition, vol. 37, pp. 977-997, 2004.
[2] R. Smith, D. Antonova, and D.-S. Lee, "Adapting the Tesseract open source OCR engine for multilingual OCR," in Proceedings of the International Workshop on Multilingual OCR, 2009, pp. 1-8.
[3] Q. Liu, C. Jung, and Y. Moon, "Text segmentation based on stroke filter," in Proceedings of the 14th Annual ACM International Conference on Multimedia, 2006, pp. 129-132.
[4] X. Li, W. Wang, Q. Huang, W. Gao, and L. Qing, "A hybrid text segmentation approach," in Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on, 2009, pp. 510-513.
[5] I. Daubechies, Ten Lectures on Wavelets, 1992.
[6] Y.-C. Su, C.-W. Lee, and Y.-H. Yang, "The fast stroke filter," in Machine Learning and Cybernetics (ICMLC), 2010 International Conference on, 2010, pp. 3044-3049.
[7] Z. Wu, X. Fang, and C. Wang, "A comprehensive video text localization system based on stroke filter," in Wireless Communications & Signal Processing, 2009. WCSP 2009. International Conference on, 2009, pp. 1-4.
[8] L. Liu, D. Zhang, and J. You, "Detecting Wide Lines Using Isotropic Nonlinear Filtering," Image Processing, IEEE Transactions on, vol. 16, pp. 1584-1595, 2007.
[9] M. Huang, M. Yang, F. Liu, and E.-H. Wu, "Stroke extraction in cartoon images using edge-enhanced isotropic nonlinear filter," in Proceedings of the 9th ACM SIGGRAPH Conference on Virtual-Reality Continuum and its Applications in Industry, 2010, pp. 33-38.
[10] G. Aghajari and J. Shanbehzadeh, "A Text Localization Algorithm in Color Image via New Projection Profile," in International MultiConference of Engineers and Computer Scientists, 2010, pp. 1486-1489.
[11] Q. Ye, Q. Huang, W. Gao, and D. Zhao, "Fast and robust text detection in images and video frames," Image Vision Comput., vol. 23, pp. 565-576, 2005.
[12] M. Pontil and A. Verri, "Support vector machines for 3D object recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 20, pp. 637-646, 1998.
[13] A. Gionis, P. Indyk, and R. Motwani, "Similarity Search in High Dimensions via Hashing," in Proceedings of the 25th International Conference on Very Large Data Bases, 1999, pp. 518-529.
[14] E. Kavallieratou, S. Stamatatos, and H. Antonopoulou, "Machine-Printed from Handwritten Text Discrimination," in Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition, 2004, pp. 312-316.
[15] L. F. da Silva, A. Conci, and A. Sanchez, "Automatic Discrimination between Printed and Handwritten Text in Documents," in Computer Graphics and Image Processing (SIBGRAPI), 2009 XXII Brazilian Symposium on, 2009, pp. 261-267.
[16] S. N. Srihari, Y.-C. Shin, V. Ramanaprasad, and D.-S. Lee, "Name and Address Block Reader system for tax form processing," in Proceedings of the Third International Conference on Document Analysis and Recognition, vol. 1, 1995, pp. 5-10.
[17] R. Duda and P. Hart, Pattern Classification and Scene Analysis, 1996.
[18] Y. Zheng, H. Li, and D. Doermann, "Machine Printed Text and Handwriting Identification in Noisy Document Images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, pp. 337-353, 2004.
[19] G. G. Rajput, R. Horakeri, and S. Chandrakant, "Printed and Handwritten Mixed Kannada Numerals Recognition Using SVM," International Journal on Computer Science and Engineering, p. 5, 2010, pp. 1622-1626.
[20] S. Chanda, K. Franke, and U. Pal, "Structural handwritten and machine print classification for sparse content and arbitrary oriented document fragments," in Proceedings of the 2010 ACM Symposium on Applied Computing, 2010, pp. 18-22.
[21] H. Freeman, "On the Encoding of Arbitrary Geometric Configurations," Electronic Computers, IRE Transactions on, vol. EC-10, pp. 260-268, 1961.
[22] J. K. Guo and M. Y. Ma, "Separating handwritten material from machine printed text using hidden Markov models," in Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on, 2001, pp. 439-443.
[23] R. Kandan, N. K. Reddy, K. R. Arvind, and A. G. Ramakrishnan, "A robust two level classification algorithm for text localization in documents," in Proceedings of the 3rd International Conference on Advances in Visual Computing, Part II, 2007, pp. 96-105.
[24] S. Suzuki and K. Abe, "Topological structural analysis of digitized binary images by border following," Computer Vision, Graphics, and Image Processing, vol. 30, pp. 32-46, 1985.
Description Master's degree (碩士)
National Chengchi University (國立政治大學)
Department of Computer Science (資訊科學學系)
99753003
100
Source http://thesis.lib.nccu.edu.tw/record/#G0099753003
Type thesis
dc.contributor.advisor 廖文宏 [zh_TW]
dc.contributor.advisor Liao, Wen Hung [en_US]
dc.contributor.author (Authors) 王冠智 [zh_TW]
dc.contributor.author (Authors) Wang, Guan Jhih [en_US]
dc.creator (Author) 王冠智 [zh_TW]
dc.creator (Author) Wang, Guan Jhih [en_US]
dc.date (Date) 2011 [en_US]
dc.date.accessioned 1-Nov-2012 13:58:31 (UTC+8)
dc.date.available 1-Nov-2012 13:58:31 (UTC+8)
dc.date.issued (Upload time) 1-Nov-2012 13:58:31 (UTC+8)
dc.identifier (Other Identifiers) G0099753003 [en_US]
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/55134
dc.description (Description) Master's degree (碩士) [zh_TW]
dc.description (Description) National Chengchi University (國立政治大學) [zh_TW]
dc.description (Description) Department of Computer Science (資訊科學學系) [zh_TW]
dc.description (Description) 99753003 [zh_TW]
dc.description (Description) 100 [zh_TW]
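dc.description.tableofcontents 1. Research Background and Objectives
2. Related Work
2.1. Text Filters
2.1.1. Stroke Filter
2.1.2. Text Edge Filter
2.1.3. Daubechies Wavelets
2.1.4. Summary
2.2. Typeface Classification
2.2.1. Classification Based on Typeface Characteristics
2.2.2. Classification Using Machine Learning
2.2.3. Summary
2.3. System Workflow
3. Note Text Filtering
3.1. Non-Text Information Processing
3.2. Applying Text Filters
3.2.1. Daubechies Wavelets
3.2.2. Text Edge Filter
3.2.3. Stroke Filter
3.2.4. Summary of Text Filters
4. Text Style Classification
4.1. Data Collection
4.2. Histogram Observations
4.3. Histogram Training
4.4. Test Results
5. Integrated Results
6. Conclusions and Directions for Future Work
7. References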

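List of Figures
Figure 1-1: Left, the process of selecting the paper region; right, the binarization result
Figure 1-2: Usage illustration of the mobile edition of 蒙恬名片王
Figure 1-3: Example of handwriting input on a smartphone
Figure 1-4: Full content of a scanned meeting-minutes document
Figure 1-5: Result of searching for 三人 in Evernote; the yellow regions contain the query text
Figure 1-6: Full result of applying the Google Docs document recognition service to Figure 1-4
Figure 1-7: Partial result of recognizing Figure 1-4 with OneNote
Figure 1-8: How Pleco and WorldDictionary are used
Figure 2-1: Quantities computed by the stroke filter
Figure 2-2: Direction estimation results of the fast stroke filter
Figure 2-3: Illustration of the improved stroke filter and its parameter settings
Figure 2-4: Yellow points are the computation origin, red points are text edge points, green points are possible text edge points, and black indicates that computation stops; left, the original labelling; right, the relabelling after finding the direction of maximum strength
Figure 2-5: Illustration of the Isotropic Nonlinear Filter mask applied to each region
Figure 2-6: Left, the image to be filtered; right, the two-level D4 wavelet result
Figure 2-7: Example of typeface classification using characteristics of printed English text
Figure 2-8: System flowchart of this study
Figure 3-1: Test image used in this chapter
Figure 3-2: Flowchart of non-text information processing
Figure 3-3: (a) Result of applying the Canny edge detector to Figure 3-1; red regions mark highlighting drawn beneath the text and orange regions mark highlighting drawn around the text. (b) Detection result of the proposed method; the contours outlined in white are the detected segments that match this study's definition of a highlight stroke
Figure 3-4: (a) Result after applying contour finding twice; part of the character "市" in the fourth line is lost because it touches a highlight mark. (b) Result after removing the highlight marks
Figure 3-5: Partition of the HH channel of the two-level D4 wavelet result
Figure 3-6: Illustration of the mask improved with reference to text structure
Figure 3-7: (a) Result using the grayscale image and the new mask; (b) result using the value (V) channel
Figure 3-8: The improved kernel proposed in this study
Figure 3-9: Width and height data of the three kernels
Figure 3-10: Results of the three kernels: (a) original kernel, (b) improved kernel, (c) kernel proposed in this study
Figure 3-11: Result of removing the highlight strokes and applying the proposed stroke filter to Figure 3-1
Figure 4-1: Examples of two printed typefaces; top row, 標楷體; bottom row, 新細明體
Figure 4-2: Printed-text data samples; top two rows, 標楷體; bottom two rows, 新細明體
Figure 4-3: Part of the handwritten data
Figure 4-4: Bar chart of bin widths for non-proportional dimensionality reduction
Figure 4-5: Bar chart of bin widths for proportional dimensionality reduction
Figure 4-6: Line chart of the ratio changes for an angle difference of 5
Figure 4-7: Line chart of the ratio changes for an angle difference of 15
Figure 4-8: (a) Computation with kernel 1; (b) computation with kernel 2
Figure 5-1: Results of thresholding the noise in Figure 3-11; left, threshold 50; right, threshold 100. White points are regions containing text information; on the right, some handwritten information has been filtered out
Figure 5-2: Result of text segmentation using text contours
Figure 5-3: Result after noise filtering and text merging
Figure 5-4: Result using the stroke filter with proportional dimensionality reduction; purple boxes mark printed text and yellow boxes mark handwriting
Figure 5-5: Result using Sobel kernel 1 with an angle difference of 15
Figure 5-6: Results of applying the proposed typeface classification to different images; (a) recognition using the stroke filter, (b) recognition using Sobel kernel 1
Figure 6-1: Left, the result of manual cropping; right, part of the result produced by automatic segmentation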

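List of Tables
Table 2-1: Features used in [18] to describe text information
Table 2-2: Results of various machine-learning approaches to typeface classification
Table 3-1: Computation time of different kernels applied to Figure 3-1
Table 4-1: Number of samples for the two text styles
Table 4-2: Histograms (256 dimensions) produced by 標楷體 characters of different sizes and weights
Table 4-3: Histograms (256 dimensions) produced by 新細明體 characters of different sizes and weights
Table 4-4: Histograms (256 dimensions) for different handwriting styles
Table 4-5: Histogram statistics using 100 dimensions
Table 4-6: Ratio changes by which the angle difference is multiplied
Table 4-7: SVM test results for each method
Table 5-1: Typeface recognition statistics for block segmentation using the stroke filter and Sobel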
dc.language.iso en_US
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0099753003 [en_US]
dc.subject (Keywords) stroke filter (文字結構濾波) [zh_TW]
dc.subject (Keywords) typeface classification (字體分類) [zh_TW]
dc.subject (Keywords) noise removal (消除雜訊) [zh_TW]
dc.subject (Keywords) stroke filter [en_US]
dc.subject (Keywords) text font discrimination [en_US]
dc.subject (Keywords) text segmentation [en_US]
dc.subject (Keywords) noise removal [en_US]
dc.title (Title) 雲端筆記之混合式文字切割與辨識 [zh_TW]
dc.title (Title) Segmentation and recognition of mixed characters for cloud-based notes [en_US]
dc.type (Type) thesis [en]