Title Crafting the Balance between Big Data Analytics Utility and Protection in the Semantic Data Cloud (語意式資料雲上如何來巧控海量資料分析效益與保護間的平衡)
Author 胡毓忠
Contributor Department of Computer Science
Keywords Decentralized Social Web; Privacy-Aware Social Web; Personally Identifiable Information (PII); WebID; Semantics-enabled Policy; Big Data Analytics; Statistical Disclosure Control (SDC)
Date 2013
Upload time 8-Dec-2017 15:08:05 (UTC+8)
Abstract This project studies privacy-preserving WebID analytics on the decentralized Social Web. We first argue why personal data management should rely on open, decentralized control rather than closed, centralized control. We then present a policy-aware architecture in which a data owner hand-picks a trusted data controller to mask the personally identifiable information (PII) and other sensitive social relationships of his or her WebID, so that only anonymized RDF(S) linked datasets are available for analytics. We adopt RHadoop, an integration of R and Hadoop, for effective hybrid WebID analytics of large-scale social network linked datasets. Finally, we design and implement three types of machine-executable, semantics-enabled policies for controlling WebID datasets: access control policies, data handling policies, and data releasing policies. These policies invoke the RHadoop analytics modules and balance data utility against personal data protection on the privacy-aware Social Web. This part of the work was published at the IEEE International Conference on Web Intelligence 2014, Warsaw, Poland.

We have also completed a draft of a second paper, Propagation Control Services for WebID Analytics on the Decentralized Social Web, which we plan to submit to a related international conference or edited book. The draft extends the published paper and analyzes, from the viewpoint of propagation control services, the relationships among the primary stakeholders in WebID analytics: the data owner, the data controller, and the data user. The three policy types above are enforced by data controllers to enable access control, data handling, and data releasing actions on the WebID datasets. Policy enforcement should be accountable and transparent at the data controllers so that they can provide WebID propagation control services. Each data controller enforces a data handling policy to anonymize massive numbers of WebIDs, while the super data controller enforces access control and data releasing policies to ensure that data owners receive privacy-preserving WebID analytics services. Finally, we show how the three policy types can invoke WebID propagation control services at the data controllers of an information value chain to resolve the conflict between WebID protection and utility.

For the detailed objectives, literature review, research methods and steps, conclusions, and future work of this project, Crafting the Balance between Big Data Analytics Utility and Protection in the Semantic Data Cloud (MOST 102-2221-E-004-014), please refer to the paper published at the IEEE International Conference on Web Intelligence 2014, Privacy-Preserving WebID Analytics on the Decentralized Policy-Aware Social Web (https://dl.acm.org/citation.cfm?id=2682811), and to the paper under submission, Propagation Control Services for WebID Analytics on the Decentralized Social Web. The master's thesis by Jhao-Siang Sun (孫肇祥), Using R and Hadoop/MapReduce for FOAF-based Social Network Analytics, completed in ROC academic year 103, is also one of the results of this project.
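The interplay of the three policy types described in the abstract can be illustrated with a small sketch. It is not the project's implementation: the project uses RDF(S) datasets, semantics-enabled policies, and RHadoop, whereas this example uses plain Python tuples, keyed hashing (HMAC) as a stand-in anonymization technique, and a minimum-count suppression rule as a stand-in for statistical disclosure control. All names (`TRIPLES`, `PII_PREDICATES`, `SECRET_KEY`, `handle_data`, `release`, `run_analytics`) and the sample WebIDs are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical toy data: (subject, predicate, object) triples in a FOAF-like vocabulary.
TRIPLES = [
    ("https://alice.example/#me", "foaf:name", "Alice"),
    ("https://alice.example/#me", "foaf:mbox", "mailto:alice@example.org"),
    ("https://alice.example/#me", "foaf:knows", "https://bob.example/#me"),
    ("https://alice.example/#me", "foaf:knows", "https://carol.example/#me"),
    ("https://bob.example/#me", "foaf:name", "Bob"),
    ("https://bob.example/#me", "foaf:knows", "https://carol.example/#me"),
]

PII_PREDICATES = {"foaf:name", "foaf:mbox"}  # assumption: which predicates carry PII
SECRET_KEY = b"controller-secret"            # assumption: known only to the data controller


def pseudonym(webid: str) -> str:
    """Map a WebID URI to a stable pseudonym using a keyed hash."""
    digest = hmac.new(SECRET_KEY, webid.encode("utf-8"), hashlib.sha256).hexdigest()
    return "anon:" + digest[:12]


def handle_data(triples):
    """Data handling policy: drop PII triples and pseudonymize WebID URIs."""
    anonymized = []
    for s, p, o in triples:
        if p in PII_PREDICATES:
            continue  # PII never leaves the data controller
        obj = pseudonym(o) if p == "foaf:knows" else o
        anonymized.append((pseudonym(s), p, obj))
    return anonymized


def degree_histogram(anon_triples):
    """Analytics step: how many WebIDs have each foaf:knows out-degree."""
    out_degree = Counter(s for s, p, _ in anon_triples if p == "foaf:knows")
    return Counter(out_degree.values())


def release(histogram, min_count):
    """Data releasing policy: suppress cells with fewer than min_count WebIDs."""
    return {degree: n for degree, n in histogram.items() if n >= min_count}


def run_analytics(user, allowed_users, triples, min_count=1):
    """Access control policy first, then data handling, analytics, and releasing."""
    if user not in allowed_users:
        raise PermissionError(f"{user} is not allowed to run WebID analytics")
    return release(degree_histogram(handle_data(triples)), min_count)
```

Raising `min_count` trades utility for protection: with the toy data every histogram cell holds a single WebID, so any threshold above 1 suppresses the whole result.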
Relation Project period: 2013/08/01~2014/09/30
102-2221-E-004-014
Type report
dc.date.accessioned 8-Dec-2017 15:08:05 (UTC+8)
dc.date.available 8-Dec-2017 15:08:05 (UTC+8)
dc.date.issued (upload time) 8-Dec-2017 15:08:05 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/115090
dc.format.extent 3381114 bytes
dc.format.mimetype application/pdf