dc.contributor.advisor | 劉昭麟 | zh_TW |
dc.contributor.advisor | Liu, Chao Lin | en_US |
dc.contributor.author (作者) | 賴敏華 | zh_TW |
dc.contributor.author (作者) | Lai, Min Hua | en_US |
dc.creator (作者) | 賴敏華 | zh_TW |
dc.creator (作者) | Lai, Min Hua | en_US |
dc.date (日期) | 2008 | en_US |
dc.date.accessioned | 17-Sep-2009 14:05:13 (UTC+8) | - |
dc.date.available | 17-Sep-2009 14:05:13 (UTC+8) | - |
dc.date.issued (上傳時間) | 17-Sep-2009 14:05:13 (UTC+8) | - |
dc.identifier (其他 識別碼) | G0095753023 | en_US |
dc.identifier.uri (URI) | https://nccur.lib.nccu.edu.tw/handle/140.119/32700 | - |
dc.description (描述) | Master's | zh_TW |
dc.description (描述) | National Chengchi University | zh_TW |
dc.description (描述) | Department of Computer Science | zh_TW |
dc.description (描述) | 95753023 | zh_TW |
dc.description (描述) | 97 | zh_TW |
dc.description.abstract (摘要) | 本論文為提供一個能輔助學生學習英漢翻譯與漢英翻譯時,搜尋翻譯例句的環境。我們的平行語料是利用網路上可取得的文件,如:英語教學網站、學習單等,利用人工擷取中英文對照的句子。標記化語料庫中記錄了中文句、英文句、中文句斷詞後的結果、中文句的詞性標記、中文句結構樹以及英文結構樹等資訊。 使用者輸入的查詢句可包含中文句、英文句及中英文混合句。我們的系統會依據不同的搜尋功能,針對查詢句作前處理,如:斷詞、詞性擷取、結構樹分析、詞性還原、查詢句的詞彙擴展等,再與標記化語料庫作比對,最後提供與查詢句相似的中英文對照句子給使用者,讓使用者在學習翻譯時,有更多類似句可以參考。 我們的系統不容易使用正規的方式評估;為了評估系統的效能,我們記錄各個搜尋功能,在不同門檻值下所得到的類似句句數,並利用NIST及BLEU來評估本系統所提供的類似句品質;另外我們透過問卷調查請受試者勾選本系統所提供類似句。問卷調查結果顯示受試者對於本系統所提供的類似句共識度並不高;本系統在提供10句類似句中,僅有1.6句的類似句是受試者認為有幫助的。 | zh_TW |
dc.description.abstract (摘要) | I present an environment for searching related translations between Chinese and English. A parallel, tagged corpus was constructed from text material obtained from the Internet, including English teaching websites and public learning sheets. The corpus contains both the English and the Chinese sentences, the segmentation of the Chinese strings, the POS tags of the Chinese words, and the syntactic structures of the English and Chinese sentences. The user can query the system by entering a Chinese sentence, an English sentence, or any pattern that mixes Chinese and English. The query sentence is preprocessed according to the search function the user selects, and the results of preprocessing are used to search the tagged corpus. The search results are reference sentences that are related to the query sentence. A formal evaluation of the system is not easy. I evaluated the system by entering a set of selected queries. For those tests, I recorded and compared the number of reference sentences the system returned, and evaluated the quality of the reference sentences by their BLEU and NIST scores against some standard translations. In addition, I evaluated the system with the help of human subjects, who were asked to choose the useful sentences from the reference sentences returned by the system. Experimental results indicated that agreement among the human subjects was not high, and that the subjects found only about 1.6 of every 10 reference sentences useful. | en_US |
dc.description.tableofcontents | Chapter 1 Introduction; 1.1 Research Background and Motivation; 1.2 Research Methods and Results; 1.3 Thesis Organization; Chapter 2 Literature Review; 2.1 Computer-Assisted Item Generation Systems; 2.2 Computer-Assisted Language Learning; 2.3 Example-Based Translation; 2.4 Treebank Search; 2.5 Summary; Chapter 3 Corpus Sources and Preprocessing; 3.1 Corpus Collection and Preprocessing; 3.2 Corpus Annotation; 3.2.1 Annotating the Chinese Corpus; 3.2.2 Annotating the English Corpus; 3.3 Corpus Analysis; 3.4 Dictionaries; Chapter 4 Search Functions; 4.1 Searching with Chinese Queries; 4.1.1 Search Based on Chinese Words; 4.1.2 Search Based on Chinese Part-of-Speech Tags; 4.1.3 Search Based on Chinese Structure Trees; 4.2 Searching with English Queries; 4.2.1 Search Based on English Words; 4.2.2 Search Based on English Structure Trees; 4.3 Mixed Chinese-English Search; Chapter 5 System Interface; Chapter 6 System Performance Evaluation; 6.1 Sources of Experimental Data; 6.2 Experimental Design; 6.3 Experimental Results and Analysis; 6.3.1 Number of Similar Sentences Returned by Each Search Function (Five Basic English Sentence Patterns); 6.3.2 Number of Similar Sentences Returned by Each Search Function (Junior High School Review Handouts); 6.3.3 Evaluation with NIST and BLEU; 6.3.4 Evaluation by General Human Subjects; Chapter 7 Conclusion and Future Work; 7.1 Conclusion; 7.2 Future Work; References; Appendix I Academia Sinica Balanced Corpus POS Tags; Appendix II Penn Treebank Tags; Appendix III Number of Similar Sentences per Search Function at Different Thresholds; Appendix IV NIST and BLEU Results of the Search Functions at Specific Thresholds | zh_TW |
dc.format.extent | 62218 bytes | - |
dc.format.extent | 169396 bytes | - |
dc.format.extent | 137751 bytes | - |
dc.format.extent | 154827 bytes | - |
dc.format.extent | 224335 bytes | - |
dc.format.extent | 380584 bytes | - |
dc.format.extent | 343028 bytes | - |
dc.format.extent | 425264 bytes | - |
dc.format.extent | 118032 bytes | - |
dc.format.extent | 1217728 bytes | - |
dc.format.extent | 229666 bytes | - |
dc.format.extent | 163252 bytes | - |
dc.format.extent | 408824 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.language.iso | en_US | - |
dc.source.uri (資料來源) | http://thesis.lib.nccu.edu.tw/record/#G0095753023 | en_US |
dc.subject (關鍵詞) | Computer-assisted language instruction | zh_TW |
dc.subject (關鍵詞) | Sentence pattern search | zh_TW |
dc.subject (關鍵詞) | Example-sentence-based teaching | zh_TW |
dc.subject (關鍵詞) | Computer-assisted translation | zh_TW |
dc.title (題名) | 電腦輔助漢英與英漢翻譯例句搜尋服務 | zh_TW |
dc.title (題名) | A Computer Assisted Environment for Searching Related Translations between Chinese and English | en_US |
dc.type (資料類型) | thesis | en |
dc.relation.reference (參考文獻) | [1] Eric Brill, A Simple Rule-Based Part of Speech Tagger. Proceedings of the Third Conference on Applied Natural Language Processing, 152-155, 1992. | zh_TW |
dc.relation.reference (參考文獻) | http://bulba.sdsu.edu/jeanette/thesis/PennTags.html [Last visited on 18 September 2008] | zh_TW |
dc.relation.reference (參考文獻) | [2] Y.-F. Chang and D. L. Schallert, The Design for a Collaborative System of English as Foreign Language Composition Writing of Senior High School Students in Taiwan. Proceedings of the Fifth IEEE International Conference on Advanced Learning Technologies, 774-775, 2005. | zh_TW |
dc.relation.reference (參考文獻) | [3] G. Doddington, Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. Proceedings of the Second International Conference on Human Language Technology Research, 138-145, 2002. | zh_TW |
dc.relation.reference (參考文獻) | [4] Z. Dong and Q. Dong, HowNet, 2000. http://www.keenage.com [Last visited on 26 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [5] The Stanford Parser: A statistical parser (version 1.6), | zh_TW |
dc.relation.reference (參考文獻) | http://www-nlp.stanford.edu/software/lex-parser.shtml [Last visited on 26 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [6] J. Kakegawa, H. Kanda, E. Fujioka, M. Itami and K. Itoh, Diagnostic Processing of Japanese for Computer-Assisted Second Language Learning. Proceedings of the Thirty-Eighth Annual Meeting of the Association for Computational Linguistics, 537-546, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [7] O. Knutsson, T. C. Pargman and K. S. Eklundh, Transforming Grammar Checking Technology into a Learning Environment for Second Language Writing. Proceedings of the HLT-NAACL 2003 Workshop on Building Educational Applications Using Natural Language Processing, Volume 2, 38-45, 2003. | zh_TW |
dc.relation.reference (參考文獻) | [8] C.-L. Liu, C.-H. Wang, and Z.-M. Gao, Using Lexical Constraints to Enhance the Quality of Computer-Generated Multiple-Choice Cloze Items. International Journal of Computational Linguistics and Chinese Language Processing, Volume 10, Number 3, 303-328, 2005. | zh_TW |
dc.relation.reference (參考文獻) | [9] C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, the MIT Press, 1999. | zh_TW |
dc.relation.reference (參考文獻) | [10] G. A. Miller, R. Beckwith, C. Fellbaum, D. Gross and K. Miller, Introduction to WordNet: An On-line Lexical Database. International Journal of Lexicography, Volume 3, Number 4, 235-244, 1990. http://wordnet.princeton.edu/doc/ [Last visited on 26 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [11] R. Mitkov and L. A. Ha, Computer-Aided Generation of Multiple-Choice Tests. Proceedings of the HLT-NAACL 2003 Workshop on Building Educational Applications Using Natural Language Processing, Volume 2, 17-22, 2003. | zh_TW |
dc.relation.reference (參考文獻) | [12] S. B. Needleman and C. D. Wunsch, A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins. Journal of Molecular Biology, Volume 48, Number 3, 443-453, 1970. | zh_TW |
dc.relation.reference (參考文獻) | [13] L. Nygaard and J. B. Johannessen, SearchTree – A User-friendly Treebank Search Interface. Proceedings of the 3rd Workshop on Treebanks and Linguistic Theories, 183-189, 2004. | zh_TW |
dc.relation.reference (參考文獻) | [14] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, BLEU: A Method for Automatic Evaluation of Machine Translation. Proceedings of the Fortieth Annual Meeting of the Association for Computational Linguistics, 311-318, 2002. | zh_TW |
dc.relation.reference (參考文獻) | [15] S. Sato, CTM: an example-based translation aid system. Proceedings of the Fourteenth Conference on Computational Linguistics, Volume 4, 1259-1268, 1992. | zh_TW |
dc.relation.reference (參考文獻) | [16] M. Shimohata, E. Sumita, and Y. Matsumoto, Retrieving meaning-equivalent sentences for example-based rough translation. Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, Volume 3, 50-56, 2003. | zh_TW |
dc.relation.reference (參考文獻) | [17] M. Volk, J. Lundborg, and M. Mettler, A Search Tool for Parallel Treebanks. Proceedings of the Linguistic Annotation Workshop (LAW) on Association for Computational Linguistics, 85-92, 2007. | zh_TW |
dc.relation.reference (參考文獻) | [18] G.R.S. Weir and G. Lepouras, English Assistant: A Support Strategy for On-Line Second Language Learning. Proceedings of the Second IEEE International Conference on Advanced Learning Technologies, 125-126, 2001. | zh_TW |
dc.relation.reference (參考文獻) | [19] X.-H. Zhou, Application of English Cohesion Theory in the Teaching of Writing to Chinese Graduate Students. Journal of US-China Education Review, Volume 4, Number 7, 31-37, 2007. | zh_TW |
dc.relation.reference (參考文獻) | [20] 中研院中文句結構樹資料庫檢索系統,http://turing.iis.sinica.edu.tw/treesearch/ [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [21] 中研院中文斷詞系統,http://ckipsvr.iis.sinica.edu.tw/ [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [22] 中研院平衡語料庫詞類標記集,http://ckipsvr.iis.sinica.edu.tw/category_list.doc [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [23] 中研院現代漢語語料庫一詞泛讀,http://140.109.150.65/cwordframe.html [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [24] 呂明欣,國小國語科測驗卷電腦輔助出題系統,碩士論文,國立政治大學。台灣,台北,2007。 | zh_TW |
dc.relation.reference (參考文獻) | [25] 林仁祥及劉昭麟。國小國語科測驗卷出題輔助系統,2007台灣網際網路研討會論文集,論文光碟。台灣,台北,2007。 | zh_TW |
dc.relation.reference (參考文獻) | [26] 唐建輝,大滿貫複習講義 英語(全),翰林出版事業股份有限公司。2008。 | zh_TW |
dc.relation.reference (參考文獻) | [27] 旋元佑文法,http://tw.myblog.yahoo.com/jw!GFGhGimWHxN4wRWXG1UDIL_XSA--/ [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [28] 基礎英文1200句,http://hk.geocities.com/cnlyhhp/eng.htm [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [29] 國民中學學習資源網,http://140.111.34.172/teacool/new_page_2.htm [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [30] 教育部委託宜蘭縣發展九年一貫課程建置語文學習領域(英語)國中教科書補充資料暨題庫建置計畫,http://140.111.66.37/english/ [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [31] 教育部國民教育司,http://www.edu.tw/EJE [Last visited on 24 June 2008] | zh_TW |
dc.relation.reference (參考文獻) | [32] 陳佳吟、柯明憲、吳紫葦及張俊盛,電腦輔助英文文法出題系統,第十七屆自然語言與語音處理研討會論文集。台灣,台南,2005。 | zh_TW |
dc.relation.reference (參考文獻) | [33] 劉吉軒、洪培鈞及李金瑛,以英語寫作輔助為目的之語料庫語句檢索方法,第十九屆自然語言與語音處理研討會論文集,5-19。台灣,台北,2007。 | zh_TW |
dc.relation.reference (參考文獻) | [34] 賴世雄,文法從頭學,長春藤有聲出版有限公司。2007。 | zh_TW |