Title 英文技術文獻中動詞與其受詞之中文翻譯的語境效用
Collocational Influences on the Chinese Translations of English Verbs and Their Objects in Technical Documents
Author 莊怡軒
Chuang, Yi Hsuan
Contributors 劉昭麟
Liu, Chao Lin
莊怡軒
Chuang, Yi Hsuan
Keywords Machine translation
Feature evaluation
Natural language processing
Date 2010
Upload time 5-Oct-2011 14:43:52 (UTC+8)
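Abstract This study uses English-Chinese parallel corpora to look for translation patterns between English and Chinese, taking English and Chinese verb-noun collocations (V-N collocations) as the object of observation. We analyze two corpora separately: an English-Chinese parallel corpus of patent sentences and the English-Chinese bilingual e-book edition of Scientific American. Treating documents that are translations of each other as a single unit, we observe specific structures and collocations in the two languages and build translation models that reflect real-world language data.
  We use a technical term list to segment technical terms in the parallel corpora, then parse the sentences to obtain dependency trees, and use the dependency tree structure together with a near-synonym dictionary to extract English-Chinese V-N collocations. From these collocations we build translation models for English verbs and nouns. Our system can recommend translations according to different models, and we compare the effectiveness of these models. Finally, we ran an experiment in which Chinese speakers translated English verbs and compared their translations with those of our models; the results show that our translation models gave better recommendations than the human subjects.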
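The extraction step described above can be illustrated with a minimal sketch. It assumes the parser output is available as Stanford-style typed dependency strings such as `dobj(reduces-3, noise-4)`, and that a verb and its object are linked by the `dobj` relation; the record does not state which relations the thesis actually used, and the function name `extract_vn_pairs` and the sample sentence are hypothetical.

```python
import re

# Matches one Stanford-style typed dependency, e.g. "dobj(reduces-3, noise-4)".
DEP_PATTERN = re.compile(r"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")


def extract_vn_pairs(dependency_lines, relation="dobj"):
    """Return (verb, noun) pairs for every dependency with the given relation.

    Assumption: the governor of the relation is the verb and the dependent
    is its object noun. Lines that do not match the pattern are skipped.
    """
    pairs = []
    for line in dependency_lines:
        match = DEP_PATTERN.match(line.strip())
        if match is None:
            continue
        rel, head, _head_idx, dependent, _dep_idx = match.groups()
        if rel == relation:
            pairs.append((head.lower(), dependent.lower()))
    return pairs


# Hypothetical parse of "The filter reduces noise".
example = [
    "det(filter-2, The-1)",
    "nsubj(reduces-3, filter-2)",
    "root(ROOT-0, reduces-3)",
    "dobj(reduces-3, noise-4)",
]
vn_pairs = extract_vn_pairs(example)
print(vn_pairs)
```

A full pipeline would also lemmatize the verb and align each English pair with its Chinese counterpart; both steps are omitted here.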
In our investigation, we are interested in English verb-noun collocations (V-N collocations) and the corresponding usage in Chinese. To discover English-Chinese V-N collocations, a rich corpus is needed; therefore, we obtained one million English-Chinese parallel patent sentence pairs and seven years of the bilingual edition of Scientific American as two corpora to analyze. We trained translation models to find the usage of V-N collocations in English and Chinese. Given an English V-N collocation and the corresponding Chinese information, our system can recommend proper translations of the English verb or object in the collocation according to the translation models.
We experimented with ten formulas to train our models on the two corpora, and observed similar trends in the analyses. Preliminary comparisons of the translation quality of human subjects and our system indicated that our system could offer better recommendations for the translation tasks.
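As a rough illustration of how a collocation-aware recommender might work, the sketch below estimates the relative frequency of each Chinese verb given an English verb and its object, and falls back to the verb alone when the pair was never seen. This is not one of the ten formulas from the thesis, which this record does not list; the function names, the fallback strategy, and the toy training tuples are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train(aligned_collocations):
    """Count Chinese verb translations per English (verb, noun) pair and per verb."""
    by_pair = defaultdict(Counter)
    by_verb = defaultdict(Counter)
    for en_verb, en_noun, zh_verb in aligned_collocations:
        by_pair[(en_verb, en_noun)][zh_verb] += 1
        by_verb[en_verb][zh_verb] += 1
    return by_pair, by_verb


def recommend(model, en_verb, en_noun, k=3):
    """Return up to k (Chinese verb, relative frequency) pairs, best first.

    Uses counts conditioned on the (verb, noun) pair when available,
    otherwise counts for the verb alone, otherwise an empty list.
    """
    by_pair, by_verb = model
    counts = by_pair.get((en_verb, en_noun)) or by_verb.get(en_verb) or Counter()
    total = sum(counts.values())
    return [(zh, n / total) for zh, n in counts.most_common(k)]


# Toy, made-up training tuples: (English verb, English object, Chinese verb).
toy_data = [
    ("reduce", "noise", "降低"),
    ("reduce", "noise", "降低"),
    ("reduce", "noise", "減少"),
    ("reduce", "size", "縮小"),
]
model = train(toy_data)
print(recommend(model, "reduce", "noise"))  # conditioned on the object
print(recommend(model, "reduce", "cost"))   # unseen pair, falls back to the verb
```

Conditioning on the object is what lets the same English verb receive different Chinese translations in different collocations; the verb-only fallback simply keeps the recommender from returning nothing for unseen pairs.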
Description Master's
National Chengchi University
Department of Computer Science
98753004
99
Source http://thesis.lib.nccu.edu.tw/record/#G0098753004
Type thesis
Advisor 劉昭麟
Liu, Chao Lin
Identifier G0098753004
URI http://nccur.lib.nccu.edu.tw/handle/140.119/51320
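Table of contents Chapter 1 Introduction
Chapter 2 Literature Review
Chapter 3 Patent Corpus Sources and Construction of the Technical Term List
Chapter 4 Corpus Preprocessing and Construction of the Near-Synonym Dictionary
Chapter 5 Translation Model Formulas
Chapter 6 Building Translation Models from the Patent Sentence Corpus
Chapter 7 Building Translation Models from the Scientific American Corpus
Chapter 8 Human Subject Experiment
Chapter 9 Conclusions and Future Work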
Language en_US