Title Computer Aided Item Translation for the Trends in International Mathematics and Science Study (電腦輔助試題翻譯:以國際數學與科學教育成就趨勢調查為例)
Author Lu, Ming-Shin (呂明欣)
Contributors Liu, Chao-Lin (劉昭麟), advisor
Chen, Kung (陳恭), advisor
Lu, Ming-Shin (呂明欣)
Keywords Natural language processing
Computer-aided learning
Controlled-language
Item translation
Machine translation
TIMSS
Date 2006
Upload time 18-Sep-2009 18:26:39 (UTC+8)
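Abstract (translated from Chinese) The Trends in International Mathematics and Science Study (TIMSS) test is set centrally by the International Association for the Evaluation of Educational Achievement. Before the test can be administered to and understood by primary and secondary school students in Taiwan, the original English items require a great deal of manual discussion and translation time. To improve the consistency and efficiency of translation, we designed a translation-assistance system tailored to test items. The system takes item documents in different formats and performs parse-based phrase extraction and dictionary lookup. Through its user interface, translators choose suitable translations for phrases and words and adjust word order. The system also provides access to commonly used online translation services, review of previously translated similar sentences, and insertion and deletion of words. To improve the accuracy of word choice, we record translators' word-selection actions so that they can review similar sentences they handled before, and we reorder candidate translations according to the selection frequencies recorded by the system, word-frequency statistics from science journal corpora, and statistical Chinese-English word alignment and language models. Using past test items as experimental data, we divided the items into 6 categories by grade and subject, combined them with 4 word-selection strategies, and compared online translation systems with our system using the BLEU and NIST translation evaluation metrics. The results show that our system outperformed the online translation systems in every experimental group.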
Test items used in the Trends in International Mathematics and Science Study (TIMSS) are designed by the International Association for the Evaluation of Educational Achievement to help education researchers measure students' competence in science and mathematics. Translating the English items into Chinese demands a lot of work. We therefore offer a computer-aided translation environment to improve the consistency and efficiency of the translation process.
     Through the user interface, translators can input test items in different document formats, use phrase analysis and dictionary lookup to find alternative phrase translations, and adjust word order. Users may also obtain translations from the on-line services provided by Google and Yahoo, look up previously translated items that contain specific word patterns, and so on. To select appropriate Chinese translations for English words, we considered users' past selections, word frequencies in relevant corpora, and other language-related information in parallel corpora.
     We used test items from TIMSS 1999 and TIMSS 2003 to evaluate the effectiveness of our translation environment. Translations recommended by our system were compared with the actual Chinese translations of the test data, and similarity was measured with the BLEU and NIST metrics. Experimental results indicate that our system performed better than or comparably to the Google and Yahoo on-line translation systems.
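The BLEU comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names `ngrams` and `bleu` are hypothetical, and the sketch assumes a single reference translation, pre-tokenized input, n-grams up to length 4, and no smoothing, so any n-gram order with zero matches gives a score of 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference, sentence-level BLEU in [0, 1] (illustrative only)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Modified precision: clip each candidate n-gram count by its reference count.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # assumption: no smoothing in this sketch
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical example sentences, tokenized on whitespace.
reference = "the area of the triangle is ten".split()
candidate = "the area of the triangle is twelve".split()
print(bleu(candidate, reference))
```

A higher score means the candidate shares more n-grams with the reference. For Chinese output, the choice of tokenization (characters or segmented words) changes the score and would have to be fixed before systems are compared.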
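Table of contents
     Chapter 1 Introduction
     Chapter 2 Literature Review
     Chapter 3 System Requirements and Functional Architecture
     Chapter 4 Techniques for Performing Translation
     Chapter 5 Methods for Adjusting Translation Word Choice
     Chapter 6 Evaluation of System Translation Efficiency
     Chapter 7 Conclusions and Future Work
     References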
Description Master's
National Chengchi University
Department of Computer Science
94753007
95
Source http://thesis.lib.nccu.edu.tw/record/#G0094753007
Type thesis
Other identifiers G0094753007
URI https://nccur.lib.nccu.edu.tw/handle/140.119/36381
Language en_US