Title 應用文字探勘於影評文章自動摘要之研究
A Study on Application of Text Mining for Automatic Text Summarization of Film Review
Author 鄧亦安
Teng, I An
Contributors 楊建民
Teng, I An
Keywords Text mining
Film review summary
Automatic text summarization
Date 2016
Uploaded 20-Jul-2016 17:15:39 (UTC+8)
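Abstract With the rise of the Internet, people facing a difficult choice not only rely on word of mouth but also search online by keyword for the information they need. In the flood of massive data, however, integrating that information quickly is a major challenge. Summaries of film review articles can help people learn about a film before going to the cinema and confirm that it is one they are interested in.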
In the era of Big Data, there is far too much information on the web for readers to take in, and summarizing the essential information quickly is a challenge. People who want to see a movie face the same situation: before choosing a film, they search for related information, but film reviews are scattered across many websites. Automatic text summarization can efficiently extract the important information and condense the main concepts of the reviews, so that readers can grasp the key ideas of all the reviews and save time.
This research presents a multi-concept, extractive film review summary for readers. It generates the summary from film reviews on PIXNET, the most popular blog platform, using an extraction-based method combined with clustering. For a specific film, the K-Means algorithm clusters the review sentences by their features, and statistical measures together with the WWA method weight the sentences so that representative sentences can be chosen. In the last step, the clusters are compared with templates to decide the order of the classified sentences, and the representative sentences from each cluster are assembled into the summary.
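The clustering and selection steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names are hypothetical, sentences are split on whitespace (Chinese blog text would first need a word segmenter), plain term counts stand in for the sentence features, and distance to the cluster centroid stands in for the statistical and WWA weighting, whose details the abstract does not give.

```python
import math
from collections import Counter


def vectorize(sentences):
    """Turn each sentence into a term-count vector over a shared vocabulary."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    return [[float(Counter(toks)[w]) for w in vocab] for toks in tokenized]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iter=50):
    """Plain K-Means; the first k vectors seed the centroids (assumption,
    chosen so the sketch is deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def summarize(sentences, k):
    """Return one representative sentence per non-empty cluster:
    the sentence closest to its cluster centroid."""
    vectors = vectorize(sentences)
    labels, centroids = kmeans(vectors, k)
    summary = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            best = min(members, key=lambda i: distance(vectors[i], centroids[c]))
            summary.append(sentences[best])
    return summary


reviews = [
    "acting acting great",
    "music loud",
    "acting great",
    "music loud score",
]
print(summarize(reviews, 2))
```

With two clusters, the toy input yields one sentence about the acting and one about the music, which mirrors the multi-concept idea: each cluster contributes one sentence to the summary.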
In experiments on five movies, the summarization method raised the average similarity between the generated film review summaries and the template summaries to 21.19%. This shows that automatic film review summarization can extract the important sentences from the reviews. In addition, ordering the clusters by comparison with templates lets the sentence clusters be listed in sequence to form a movie review that saves readers' time and is easy to comprehend.
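The evaluation compares each generated summary with a template summary and averages the similarity over the movies. The abstract does not name the similarity measure, so the sketch below assumes cosine similarity over term counts; the function names and whitespace tokenization are likewise illustrative assumptions, not the thesis's method.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:  # an empty text matches nothing
        return 0.0
    return dot / (norm_a * norm_b)


def average_similarity(pairs):
    """Mean similarity over (generated summary, template summary) pairs."""
    return sum(cosine_similarity(s, t) for s, t in pairs) / len(pairs)


# Hypothetical toy data: one pair per movie.
pairs = [
    ("great acting strong plot", "strong plot great acting"),
    ("loud music", "quiet drama"),
]
print(average_similarity(pairs))
```

Averaging per-movie scores in this way produces a single figure comparable in kind to the reported average similarity, although the actual value depends on the measure and tokenization used.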
Description Master's
Type thesis
