Title 關聯式文本探勘資訊探索實驗平台設計─以「二二八事件臺灣本地新聞史料彙編」為例
Designing an Experiment Platform for Information Exploration with Relational Text Mining: A Case Study with the Taiwan 228 - event News Archive
Authors 劉吉軒
Liu, Jyi-shane
甯格致
薛化元
蔡銘峰
Contributor Department of Computer Science (資科系)
Keywords Historical Text Mining ; Social Network Analysis ; Information Retrieval ; Critical Discourse Analysis
Date 2014-12
Upload time 22-Apr-2020 15:39:17 (UTC+8)
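Abstract For humanities researchers, arguing a meaningful idea or viewpoint often involves combing through large volumes of text, filtering out the information relevant to the research goal, and accumulating many clues into a more clearly outlined context; these steps in turn often lead the research question to be redefined, focused, and deepened. Digital humanities research uses the data-processing and computing power of computers to help researchers look for answers in data in entirely new ways, as if through a movable, adjustable lens that allows flexible inspection from micro, macro, or other viewpoints, in order to analyze large amounts of humanities data, explore humanities issues, and interpret humanities phenomena [1]. Many current digital humanities studies that apply information technology to historical texts take the following approach to information exploration: they first fix on the topical concepts or phenomena represented by particular words, then search for and match them in the historical texts, and use statistical, quantitative analysis to observe, in terms of counts and proportions, how prominent or how different those concepts or phenomena are, thereby verifying some hypotheses or obtaining fragmentary new information [2][3]. This approach has been described as a laborious, single-target "hunt and peck" search, or as "slicing", which cuts the data into cross-sections and compares them in sequence to find trends or anomalies [4][5]. Large text collections often conceal topical information rich in meaning. Their intricate relationships and overlapping layers of cause and effect call for advances in information technology, so that effective relation mining can present the contextual picture concretely and help humanities researchers interpret and discover. Mining phenomena from a single viewpoint can highlight the significance of particular topical information, but it usually overlooks the importance of relational information and contextual structure. This study therefore adopts the spirit of qualitative analysis from earlier social and psychological research, which values the relative significance of each individual entity representing a person, event, or thing. It further considers the local situation in which each entity is related to the entities around it, and treats the relations among entities as an expression of social position. It then compares entities of the same type horizontally for similarities and differences, or compares vertically the contextual patterns that emerge under different environments, times, and places. The idea is to first uncover the more meaningful key entities, then observe the relations each entity has, and then apply a consolidating view that merges information horizontally and threads through it vertically. The aim is to use a multi-level view of the data dimensions for deeper information extraction and relation exploration, helping researchers reach more meaningful findings and results.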
This study integrates methods from computer science and the social sciences and, from a historian's perspective, views historical text as embedding a miniature social system. The task involves extracting relations among entities from text and performing structural analysis of the constructed entity-relationship network. One primary goal is to find the key-role actor and reveal its social position, which may be defined by certain incidents, words, or behaviors. A further goal is to find other actors with similar social positions and identify the underlying community. Finally, an abstract social role can be characterized to provide insight into the social system constructed from the text. We develop an experimental platform, PARTEX, which provides text analytic tools and allows exploratory observation of the relational structure among entities. On our preprocessed and imported document collections, with historians' input of key conceptual words as focal issues, the platform has been used to identify entity relations and construct the embedded social system. Discourse perspectives of position, demand, emotion, and action are investigated with contextual parameters of boundary, association type, and relational strength. Both a visual representation of the discourse-oriented social system and quantitative measures are presented for analytic comparison. This study hopes to provide an effective text analytic tool and contribute to discovering historical implications. We intend to further improve the platform through repeated use and testing, and to validate the approach by fostering fruitful research results.
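The pipeline the abstract describes (extract relations among entities, build a network, find the key actor, then find actors in a similar position) can be illustrated with a minimal sketch. It is not PARTEX's implementation, which this record does not describe. It assumes, purely for illustration, that two entities are related when they appear in the same passage, that the key actor is the entity with the highest weighted degree, and that similar social position means overlapping neighbor sets (Jaccard similarity). The passages, entity names, and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_network(passages, entities):
    """Link two entities whenever both occur in the same passage.

    Assumption: the passage acts as the "boundary", and the edge weight
    (number of shared passages) stands in for "relational strength".
    """
    weights = defaultdict(int)
    for passage in passages:
        text = passage.lower()
        present = sorted(e for e in entities if e in text)
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1
    return dict(weights)


def neighbor_sets(weights):
    """Map each entity to the set of entities it is linked to."""
    adj = defaultdict(set)
    for a, b in weights:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def key_actor(weights):
    """Return the entity with the largest weighted degree (ties: alphabetical)."""
    strength = defaultdict(int)
    for (a, b), w in weights.items():
        strength[a] += w
        strength[b] += w
    return max(sorted(strength), key=lambda e: strength[e])


def position_similarity(adj, a, b):
    """Jaccard overlap of the two neighbor sets, ignoring a and b themselves."""
    na = adj[a] - {b}
    nb = adj[b] - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical toy data, not taken from the news archive.
passages = [
    "The committee sent its demands to the governor.",
    "Students petitioned the governor for reform.",
    "The governor ordered the police to disperse the crowd.",
    "The committee and the students held a joint meeting.",
]
entities = ["governor", "committee", "students", "police"]

network = build_cooccurrence_network(passages, entities)
adj = neighbor_sets(network)
print(key_actor(network))
print(position_similarity(adj, "committee", "students"))
```

In this toy network the committee and the students share the same remaining neighbor, so they score as occupying the same position, while the police, linked only to the governor, score lower against either of them.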
Relation Symposiums of the Fifth International Conference of Digital Archives and Digital Humanities, pp. 533-540, Academia Sinica, TAIWAN
Type conference
dc.contributor Department of Computer Science (資科系)
dc.creator (Author) 劉吉軒
dc.creator (Author) Liu, Jyi-shane
dc.creator (Author) 甯格致
dc.creator (Author) 薛化元
dc.creator (Author) 蔡銘峰
dc.date (Date) 2014-12
dc.date.accessioned 22-Apr-2020 15:39:17 (UTC+8)
dc.date.available 22-Apr-2020 15:39:17 (UTC+8)
dc.date.issued (Upload time) 22-Apr-2020 15:39:17 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/129361
dc.format.extent 1473162 bytes
dc.format.mimetype application/pdf
dc.relation (Relation) Symposiums of the Fifth International Conference of Digital Archives and Digital Humanities, pp. 533-540, Academia Sinica, TAIWAN
dc.subject (Keywords) Historical Text Mining ; Social Network Analysis ; Information Retrieval ; Critical Discourse Analysis
dc.title (Title) 關聯式文本探勘資訊探索實驗平台設計─以「二二八事件臺灣本地新聞史料彙編」為例
dc.title (Title) Designing an Experiment Platform for Information Exploration with Relational Text Mining: A Case Study with the Taiwan 228 - event News Archive
dc.type (Type) conference