dc.contributor.advisor | 季延平<br>諶家蘭 | zh_TW |
dc.contributor.advisor | Chi, Yen Ping<br>Seng, Jia Lang | en_US |
dc.contributor.author (Authors) | 吳思宏 | zh_TW |
dc.contributor.author (Authors) | Wu, Szu-Hung | en_US |
dc.creator (作者) | 吳思宏 | zh_TW |
dc.creator (作者) | Wu, Szu-Hung | en_US |
dc.date (日期) | 2007 | en_US |
dc.date.accessioned | 18-Sep-2009 20:14:10 (UTC+8) | - |
dc.date.available | 18-Sep-2009 20:14:10 (UTC+8) | - |
dc.date.issued (上傳時間) | 18-Sep-2009 20:14:10 (UTC+8) | - |
dc.identifier (Other Identifiers) | G0094356038 | en_US |
dc.identifier.uri (URI) | https://nccur.lib.nccu.edu.tw/handle/140.119/36943 | - |
dc.description (描述) | 碩士 | zh_TW |
dc.description (描述) | 國立政治大學 | zh_TW |
dc.description (描述) | 資訊管理研究所 | zh_TW |
dc.description (描述) | 94356038 | zh_TW |
dc.description (描述) | 96 | zh_TW |
dc.description.abstract (摘要) | 現今由於之前企業併購熱潮,使得企業到底價值多少?企業是否能夠還有前景?這些問題不僅僅是投資者所關心的問題,也同樣是會計師及企業評價者所關心的問題。又現今已邁入知識經濟時代,企業已從過去以土地、廠房、設備等固定資產來產生企業價值,轉而以服務、品牌、專利等無形資產為主要的企業價值時,企業的價值又要如何來估算。而這些問題都一再的顯示出“企業評價”的重要性。在進行企業評價之前,企業評價模型中之資料項的取得更是關係著最後評價結果的好壞。在企業評價資料項中,可分為財務性及非財務性。財務性資料項由於定義清楚,所以在資料的收集上較非財務性資料容易。但我們發現過往之資料收集方式並不足以應用在企業評價非財務性資料項的收集上,且現行大多採用人工處理資料的方式,不僅耗費大量時間及成本,又因人工輸入而有資料輸入錯誤之風險,使得資料的正確性大幅降低。故本研究提出一自動化擷取年報中企業評價非財務性資料項之方法,希望藉此方法達到簡化資料收集過程,提高資料的正確性。 | zh_TW |
dc.description.abstract (摘要) | In the wake of the recent wave of mergers and acquisitions, more and more people are asking how much a business is worth and whether it still has good prospects. These questions concern not only investors but also accountants and business valuators. In today's knowledge-based economy, businesses no longer create value mainly from fixed assets such as land, factories, and equipment, but increasingly from intangible assets such as services, brands, and patents, which raises the question of how a business's value should be estimated. These questions all show the importance of business valuation. Before a valuation is carried out, obtaining the data items used in the valuation model is critical to the quality of the final result. Business valuation data items are either financial or non-financial. Because financial data items are clearly defined, they are easier to collect than non-financial ones. Past data collection methods, however, are not adequate for collecting non-financial data items, and most valuators currently process the data manually. Manual processing costs considerable time and money, and manual entry errors lower the correctness of the data. This thesis therefore proposes an approach that automatically extracts non-financial business valuation data items from annual reports using data extraction techniques, with the aim of simplifying data collection and improving data correctness. | en_US |
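dc.description.tableofcontents | Chapter 1 Introduction<br>1.1 Research Background<br>1.2 Research Problem<br>1.3 Research Objectives<br>1.4 Research Limitations<br>1.5 Research Process<br>1.6 Thesis Structure<br>Chapter 2 Literature Review<br>2.1 Business Valuation<br>2.2 Information Extraction<br>2.3 Chinese Word Segmentation<br>2.4 PDF Document Format<br>2.4.1 PDF Data Structure<br>2.4.2 PDF Physical Structure<br>2.4.3 PDF Logical Structure<br>2.5 Extracting PDF Tables<br>2.6 Non-Financial Information and Annual Reports<br>2.6.1 Non-Financial Information<br>2.6.2 Non-Financial Information in Annual Reports<br>Chapter 3 Research Model<br>3.1 PDF Document Format Conversion<br>3.1.1 PDF to HTML<br>3.1.2 PDF Tables to Excel<br>3.2 Preprocessing Program<br>3.3 Main Information Extraction Program<br>3.3.1 List of Non-Financial Data Items Related to Business Valuation<br>3.3.2 Keywords for Non-Financial Data Items Related to Business Valuation<br>3.4 Word Segmentation Program<br>3.5 Excel Extraction Program<br>3.6 Data Integration Program<br>Chapter 4 Prototype System Design<br>4.1 Prototype System Environment and Architecture<br>4.2 Database System<br>4.3 Data Sources<br>4.4 Prototype System Design and Implementation<br>4.4.1 Back-End Programs<br>4.4.2 Front-End System Interface<br>4.4.3 Back-End Program Interface<br>Chapter 5 Experimental Design and Validation<br>5.1 Experimental Design<br>5.2 Test Samples<br>5.3 Experimental Results and Verification<br>5.4 Comparison with Similar Studies<br>5.4.1 Similar Study 1<br>5.4.2 Similar Study 2<br>5.4.3 Overall Comparison<br>Chapter 6 Discussion and Conclusion<br>6.1 Research Findings<br>6.2 Findings on Keyword Construction<br>6.3 Conclusion<br>6.4 Future Research Directions<br>References<br>Appendix: List of the Top 50 Companies by Capital in 2006 | en_US |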
dc.format.extent | 50309 bytes | - |
dc.format.extent | 86409 bytes | - |
dc.format.extent | 92293 bytes | - |
dc.format.extent | 231554 bytes | - |
dc.format.extent | 294247 bytes | - |
dc.format.extent | 380584 bytes | - |
dc.format.extent | 1328604 bytes | - |
dc.format.extent | 284161 bytes | - |
dc.format.extent | 227844 bytes | - |
dc.format.extent | 84850 bytes | - |
dc.format.extent | 61376 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.format.mimetype | application/pdf | - |
dc.language.iso | en_US | - |
dc.source.uri (資料來源) | http://thesis.lib.nccu.edu.tw/record/#G0094356038 | en_US |
dc.subject (關鍵詞) | 企業評價 | zh_TW |
dc.subject (關鍵詞) | 資訊擷取 | zh_TW |
dc.subject (關鍵詞) | Portable Document Format (PDF) | zh_TW |
dc.subject (關鍵詞) | 資訊檢索 | zh_TW |
dc.subject (關鍵詞) | 斷詞 | zh_TW |
dc.subject (關鍵詞) | Business valuation | en_US |
dc.subject (關鍵詞) | Data extraction | en_US |
dc.subject (關鍵詞) | Portable Document Format (PDF) | en_US |
dc.subject (關鍵詞) | Information Retrieval | en_US |
dc.subject (關鍵詞) | Word Segmentation | en_US |
dc.title (題名) | 轉換年報資料以擷取企業評價模型之非財務性資料項 | zh_TW |
dc.title (題名) | A Transformation Approach to Extract Annual Report for Non-Financial Category in Business Valuation | en_US |
dc.type (資料類型) | thesis | en |
dc.relation.reference (參考文獻) | 1. Abdou, S. & Savoy, J. (2008) “Searching in MEDLINE: Query expansion and manual indexing evaluation,” Information Processing & Management, Vol. 44(2), 781-789. | zh_TW |
dc.relation.reference (參考文獻) | 2. Anjewierden, A. (2001). “AIDAS: Incremental Logical Structure Discovery in PDF Documents”, Sixth International Conference on Document Analysis and Recognition (pp. 0374-0378), Seattle, WA, USA. | zh_TW |
dc.relation.reference (參考文獻) | 3. Chang, Chia-Hui, Kayed, M., Girgis, M. R., & Shaalan, K. F. (2006). “A Survey of Web Information Extraction Systems,” IEEE Transactions on Knowledge and Data Engineering, pp. 1411-1428. | zh_TW |
dc.relation.reference (參考文獻) | 4. Chen, Keh-Jiann, Ma, Wei-Yun (2002). “Unknown Word Extraction for Chinese Documents”, Proceedings of Coling 2002 (pp.169-175). Taipei, Taiwan. | zh_TW |
dc.relation.reference (參考文獻) | 5. Chien, Lee-Feng (1995). “Fast and Quasi-Natural Language Search for Gigabytes of Chinese Texts”. Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval (pp.112-120). Seattle, Washington, United States. | zh_TW |
dc.relation.reference (參考文獻) | 6. Chien, Lee-Feng (1997). “PAT-Tree-Based Keyword Extraction for Chinese Information Retrieval”, Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 50-58). Philadelphia, Pennsylvania, United States. | zh_TW |
dc.relation.reference (參考文獻) | 7. Embley, D. W., Fuhr, N., Klas, C. P. & Roelleke, T. (1999) “Ontology Suitability for Uncertain Extraction of Information from Multi-Record Web Documents,” ADI’99 Proceedings. | zh_TW |
dc.relation.reference (參考文獻) | 8. Esposito, F., Ferilli, S., Basile, T. M. A., & Mauro, N. D. (2005). “Intelligent Document Processing”, Eighth International Conference on Document Analysis and Recognition (pp. 1100-1104), Seoul, Korea. | zh_TW |
dc.relation.reference (參考文獻) | 9. Kuechler, W. L. (2007). “Business applications of unstructured text,” Communications of the ACM, Vol. 50(10). | zh_TW |
dc.relation.reference (參考文獻) | 10. Kwok, Thomas, & Nguyen, Thao. (2006). “An Automatic Method to Extract Data from an Electronic Contract Composed of a Number of Documents in PDF Format”, The 3rd IEEE International Conference on Enterprise Computing, E-Commerce, and E-Services (pp. 33-37), Monte Carlo Resort, Las Vegas, Nevada, USA. | zh_TW |
dc.relation.reference (參考文獻) | 11. Hassan, T., & Baumgartner, R. (2005) “Intelligent Wrapping from PDF Documents”, Proceedings of the 1st International Workshop on Representation and Analysis of Web Space, Czech | zh_TW |
dc.relation.reference (參考文獻) | 12. Ma, Wei-Yun, & Chen, Keh-Jiann (2003) “A Bottom-up Merging Algorithm for Chinese Unknown Word Extraction”, Proceedings of ACL, Second SIGHAN Workshop on Chinese Language Processing (pp. 31-38). | zh_TW |
dc.relation.reference (參考文獻) | 13. Mansour, N., Haraty, R. A., Daher, W. & Houri, M. (2008) “An auto-indexing method for Arabic text,” Information Processing & Management, In Press. | zh_TW |
dc.relation.reference (參考文獻) | 14. Meng, I-Heng. (2002). Design and Study of Semantic Discovery Methods for Extracting Knowledge from Free Text Information. Unpublished master’s thesis, National Chiao-Tung University. Hsinchu, Taiwan. | zh_TW |
dc.relation.reference (參考文獻) | 15. Melnik, S., Raghavan, S., Yang, B., & Garcia-Molina, H. (2001). “Building a Distributed Full-Text Index for the Web”. Proceedings of the 10th international conference on World Wide Web (pp. 396-406), Hong Kong. | zh_TW |
dc.relation.reference (參考文獻) | 16. Morrison, P. J. (2008) “Tagging and searching: Search retrieval effectiveness of folksonomies on the World Wide Web,” Information Processing & Management, In Press. | zh_TW |
dc.relation.reference (參考文獻) | 17. Pen, Chih-Jen. (2001). LIEF: An Algorithm for Learning Information Extraction Rules from Unstructured Documents. Unpublished master’s thesis, National Sun Yat-sen University, Kaohsiung, Taiwan. | zh_TW |
dc.relation.reference (參考文獻) | 18. Ramel, J.-Y., Crucianu, M., Vincent, N., & Faure, C. (2003). Detection, Extraction and Representation of Tables, Seventh International Conference on Document Analysis and Recognition (ICDAR`03), 1, pp. 374-378, Edinburgh, Scotland. | zh_TW |
dc.relation.reference (參考文獻) | 19. Riloff, E., & Lehnert, W. (1994) “Information Extraction as A Basis for High-precision Text Classification,” ACM Transactions on Information Systems, 12(3), 296-333. | zh_TW |
dc.relation.reference (參考文獻) | 20. Smith, G.V.(1988). Corporate Valuation: a Business and Professional Guide. NY: John Wiley & Sons, 72-85. | zh_TW |
dc.relation.reference (參考文獻) | 21. Song Y., & Zhang W. (2005). “Research on PDF Documents Information Extraction System Based on XML”, New Technology of Library and Information Service, 9, 10-13, China. | zh_TW |
dc.relation.reference (參考文獻) | 22. Sweeney, S., Crestani, F., & Losada, D. E. (2008) “‘Show me more’: Incremental length summarisation using novelty detection,” Information Processing & Management, Vol. 44(2), 663-686. | zh_TW |
dc.relation.reference (參考文獻) | 23. Tsai, Yu-Fang, & Chen, Keh-Jiann (2003) “Reliable and Cost-Effective Pos-Tagging”, Proceedings of ROCLING XV (pp. 161-174). | zh_TW |
dc.relation.reference (參考文獻) | 24. Tsai, Yu-Fang, & Chen, Keh-Jiann (2003) “Context-rule Model for POS Tagging”, Proceedings of PACLIC 17 (pp. 146-151). | zh_TW |
dc.relation.reference (參考文獻) | 25. Tseng, Yi-Feng. (2005). The Mining and Extraction of Primary Informative Blocks and Data Objects from Systematic Web Pages, Unpublished master’s thesis, National Cheng-Kung University, Tainan, Taiwan. | zh_TW |
dc.relation.reference (參考文獻) | 26. Vechtomova, O., & Karamuftuoglu, M. (2008) “Lexical cohesion and term proximity in document ranking,” Information Processing & Management, In Press. | zh_TW |
dc.relation.reference (參考文獻) | 27. Wu, Chung-Hsien, Yeh, Jui-Feng, & Lai, Yu-Sheng (2006) “Semantic Segment Extraction and Matching for Internet FAQ Retrieval,” IEEE Transactions on Knowledge and Data Engineering, pp. 930-940. | zh_TW |
dc.relation.reference (參考文獻) | 28. Zhai, Y. & Liu, B. (2006) “Structured Data Extraction from the Web Based on Partial Tree Alignment,” IEEE Transactions on Knowledge and Data Engineering, pp. 1614-1628. | zh_TW |
dc.relation.reference (參考文獻) | 29. 王文廷(2005)。Free-DOM:萃取鬆散文件中的重要資訊並結構化之方法。台灣大學資訊工程學研究所碩士論文。未出版,台北市。 | zh_TW |
dc.relation.reference (參考文獻) | 30. 池千駒(1998)。運用財務性,非財務性資訊建立我國上巿公司財務預警模式。成功大學會計學系碩士論文,未出版,台南市。 | zh_TW |
dc.relation.reference (參考文獻) | 31. 吳啟銘(2001)。企業評價個案實證分析(初版)。台北市:智勝文化出版社。 | zh_TW |
dc.relation.reference (參考文獻) | 32. 吳俊儀(2005)。網頁資訊擷取系統應用於電腦零組件名詞擷取之研究。國防管理學院資訊研究所碩士論文。未出版,台北市。 | zh_TW |
dc.relation.reference (參考文獻) | 33. 林千翔(2005)。基於特製隱藏式馬可夫模型之中文斷詞研究。中央大學資訊工程研究所碩士論文。未出版,中壢市。 | zh_TW |
dc.relation.reference (參考文獻) | 34. 張漢傑(2007)。破解財務危機。台北市:梅林文化事業有限公司。 | zh_TW |
dc.relation.reference (參考文獻) | 35. 陳怡雯(2003)。企業財務危機預警模式-非財務指標之運用。真理大學財經研究所碩士論文。未出版,台北縣。 | zh_TW |
dc.relation.reference (參考文獻) | 36. 黃燕萍(1999)。中文社會新聞文件資訊擷取。雲林科技大學資訊管理研究所碩士論文。未出版,雲林。 | zh_TW |
dc.relation.reference (參考文獻) | 37. 董振東、董強(2001)。面向信息處理的詞匯語義研究中的若干問題,語言文字應用,第三期,pp.27-32。 | zh_TW |
dc.relation.reference (參考文獻) | 38. 維基百科。檢索GOOGLE。線上檢索日期:2006年11月30日。網址:http://zh.wikipedia.org/wiki/Google | zh_TW |
dc.relation.reference (參考文獻) | 39. GOOGLE。為什麼使用GOOGLE。線上檢索日期:2006年11月30日。網址:http://www.google.com/intl/zh-TW/why_use.html | zh_TW |