Title 本體論為基礎的統計資訊整合-以政府公開資訊為例
Ontology-Based Statistical Data Integration for Open Government
Author 梁世麒
Liang, Shih Chi
Contributors 胡毓忠
Hu, Yuh Jong
梁世麒
Liang, Shih Chi
Keywords Semantic Web
Open Government
Date 2012
Upload time 1-Feb-2013 17:06:46 (UTC+8)
Abstract To deepen democratic values, governments that spend tax revenue and national resources on policy are expected to publish statistical data that explains their policy implementation and lets the public monitor its performance. Data published by government agencies spans many domains and formats, so a single source is rarely enough to produce value-added information; the value behind the data emerges only when multiple sources are merged and cross-referenced. Users must therefore collect, cross-reference, and compare statistics from different sources before they can consolidate them into useful information. Once the government opens its data, the volume also grows quickly, which makes manual collection and comparison increasingly difficult. Dynamically extracting meaningful content from different sources is therefore a considerable challenge. This study applies Semantic Web technology to address this difficulty by supporting integrated queries over diverse data on a single platform. On this platform, users can choose a specific data dimension or unit of measurement as the integration condition and query for specific or unspecified objects; the integrated results then increase the value of the data itself. The ultimate goal of this study is to provide a systematic method for meaningfully extracting, integrating, and reusing statistical data published by the government.
References
[1] 葉俊榮、許宗力. 政府資訊公開制度之研究 [A Study of the Government Information Disclosure System]. Taipei: Research, Development and Evaluation Commission, Executive Yuan (1996).
[2] Berners-Lee, T. Putting government data online. http://www.w3.org/DesignIssues/GovData.html, 2009.
[3] Bizer, C. D2RQ - treating non-RDF databases as virtual RDF graphs. In ISWC (2004).
[4] Davies, A., and Lithwick, D. Government 2.0 and access to information: 2. Recent developments in proactive disclosure and open data in the United States and other countries. Ottawa, Canada: Library of Parliament.
[5] Han, L., Finin, T., Parr, C., Sachs, J., and Joshi, A. RDF123: a mechanism to transform spreadsheets to RDF. Tech. rep., University of Maryland, Baltimore County, August 2007.
[6] Harris, S., and Seaborne, A. SPARQL 1.1 Query Language. http://www.w3.org/TR/sparql11-query/, 2010.
[7] Hausenblas, M., Halb, W., Raimond, Y., Feigenbaum, L., and Ayers, D. SCOVO: Using statistics on the web of data. In ESWC (2009).
[8] Tennison, J. The RDF Data Cube vocabulary. W3C Working Draft, 2012.
[9] Kämpgen, B., and Harth, A. Transforming statistical linked data for use in OLAP systems. In I-SEMANTICS (2011).
[10] Langegger, A., and Wöß, W. SemWIQ - semantic web integrator and query engine. In GI Jahrestagung (2) (2008).
[11] Langegger, A., and Wöß, W. XLWrap - querying and integrating arbitrary spreadsheets with SPARQL. In ISWC (2009).
[12] Orszag, P. R. Open government directive. Executive Office of the President, Office of Management and Budget, Memorandum for the Heads of Executive Departments and Agencies, Washington, DC, December 8.
[13] Prud'hommeaux, E., and Seaborne, A. SPARQL query language for RDF. W3C Recommendation 4 (2008), 1-106.
[14] Tauberer, J. Open data is civic capital: Best practices for 'open government data'. Version 1.1, 20 July, 2009.
[15] HM Treasury. Putting the frontline first: smarter government. Cm 7753.
Description Master's
National Chengchi University
Department of Computer Science
98971016
101
Source http://thesis.lib.nccu.edu.tw/record/#G0098971016
Type thesis
dc.contributor.advisor 胡毓忠 [zh_TW]
dc.contributor.advisor Hu, Yuh Jong [en_US]
dc.contributor.author (Authors) 梁世麒 [zh_TW]
dc.contributor.author (Authors) Liang, Shih Chi [en_US]
dc.creator (Author) 梁世麒 [zh_TW]
dc.creator (Author) Liang, Shih Chi [en_US]
dc.date (Date) 2012 [en_US]
dc.date.accessioned 1-Feb-2013 17:06:46 (UTC+8)
dc.date.available 1-Feb-2013 17:06:46 (UTC+8)
dc.date.issued (Upload time) 1-Feb-2013 17:06:46 (UTC+8)
dc.identifier (Other Identifiers) G0098971016 [en_US]
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/56916
dc.description (Description) Master's [zh_TW]
dc.description (Description) National Chengchi University [zh_TW]
dc.description (Description) Department of Computer Science [zh_TW]
dc.description (Description) 98971016 [zh_TW]
dc.description (Description) 101 [zh_TW]
dc.description.tableofcontents [zh_TW]
Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Algorithms
1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Chapter Overview
2 Research Background
2.1 Information Disclosure Law and Government Information Disclosure
2.2 Reuse of Government Information
2.2.1 United States
2.2.2 United Kingdom
2.2.3 Australia
2.2.4 Taiwan
3 Related Work
3.1 Integrating Public-Sector Statistical Data with Semantic Web and OLAP Technologies
3.2 Semantic Integrated Query Engine
4 Modeling the Integration Method
4.1 Mapping and Transformation of Source Data
4.2 Integrated Knowledge Base
4.3 Query Rewriting and Execution
4.4 Reorganization of Query Results
5 Integration Process and Experimental Platform Design
6 Platform Implementation and Validation
6.1 Building the Multidimensional Integration Platform
6.2 Test Examples
6.2.1 Example 1: Number of tourists entering Taiwan from Asian and European countries in January 2010
6.2.2 Example 2: Comparison of the number of manufacturing establishments and the annual growth rate of the productivity index, 2009 and 2010
6.2.3 Example 3: Comparison of the annual growth rate of the productivity index and employee wages, 2011 and 2012
7 Conclusions and Future Work
References
dc.language.iso en_US
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0098971016 [en_US]
dc.subject (Keywords) Semantic Web [en_US]
dc.subject (Keywords) Open Government [en_US]
dc.title (Title) 本體論為基礎的統計資訊整合-以政府公開資訊為例 [zh_TW]
dc.title (Title) Ontology-Based Statistical Data Integration for Open Government [en_US]
dc.type (Type) thesis [en]