Title 基於 RGBD 影音串流之肢體表情語言表現評估
Estimation and Evaluation of Body Language Using RGBD Data
Author 吳怡潔 (Wu, Yi Chieh)
Advisor 廖文宏 (Liao, Wen Hung)
Keywords 肢體語言; RGBD Kinect 感測器; 表現評估; 聲音處理; 模式分類
Body language; RGBD Kinect sensor; performance evaluation; audio processing; pattern classification
Date 2013
Date uploaded 3-Nov-2014 10:11:57 (UTC+8)
Abstract 本論文基於具備捕捉影像深度的RGBD影音串流裝置-Kinect感測器,在簡報場域中,作為擷取簡報者肢體動作、表情、以及語言表現模式的設備。首先我們提出在特定時段內的表現模式,可以經由大眾的評估,而具有喜歡/不喜歡的性質,我們將其分別命名為Period of Like(POL)以及Period of Dislike(POD)。論文中並以三種Kinect SDK所提供的影像特徵:動畫單元、骨架關節點、以及3D臉部頂點,輔以35位評估者所提供之評估資料,以POD/POL取出的特徵模式,分析是否具有一致性,以及是否可用於未來預測。最後將研究結果開發應用於原型程式,期許這樣的預測系統,能夠為在簡報中表現不佳而困擾的人們,提點其優劣之處,以作為後續改善之依據。
In this thesis, we capture body movements, facial expressions, and voice data of subjects in a presentation scenario using an RGBD-capable Kinect sensor. The acquired videos were assessed by a group of reviewers to indicate their preferences for, or aversions to, the presentation style. We denote these two classes of ratings as Period of Like (POL) and Period of Dislike (POD), respectively. We then employ three types of image features, namely animation units (AU), skeletal joints, and 3D face vertices, to analyze the consistency of the evaluation results and the ability to classify unseen footage based on the training data supplied by 35 evaluators. Finally, we develop a prototype program that helps users identify their strengths and weaknesses during a presentation so that they can improve their skills accordingly.
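The following is a minimal, hypothetical sketch (not the implementation described in the thesis) of the classification idea outlined above: fixed-length windows of frame-to-frame skeletal-joint position differences are flattened into feature vectors and used to train a binary POL/POD classifier. The window length, joint count, synthetic stand-in data, and the use of an SVM via scikit-learn are all assumptions made for illustration; in practice the joint positions would come from the Kinect SDK skeletal-tracking stream and the labels from the reviewers' annotations.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_JOINTS = 20   # the Kinect v1 SDK skeleton exposes 20 joints
WINDOW = 30     # frames per evaluation window (about 1 s at 30 fps)

def window_features(joints):
    """joints: (WINDOW, N_JOINTS, 3) array of 3D joint positions.
    Returns the flattened frame-to-frame position differences."""
    return np.diff(joints, axis=0).reshape(-1)

# Synthetic stand-in for labelled windows (1 = POL, 0 = POD); real input
# would be joint positions logged from the Kinect skeletal-tracking stream,
# labelled according to the reviewers' like/dislike annotations.
rng = np.random.default_rng(0)
X = np.stack([window_features(rng.normal(size=(WINDOW, N_JOINTS, 3)))
              for _ in range(200)])
y = rng.integers(0, 2, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```

Analogous feature vectors could be built from the animation units and 3D face vertices mentioned above; the choice of classifier here is only a placeholder for whatever learning method the thesis actually evaluates.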
References [1] Rachael E. Jack, “Facial expressions of emotion are not culturally universal”, PNAS online, 2012.
[2] Alex Pentland, “Honest Signals: How They Shape Our World”, The MIT Press, August 2008.
[3] Oxford Dictionaries,
http://www.oxforddictionaries.com/definition/english/body-language.
[4] Wikipedia contributors, “Kinect”, Wikipedia, The Free Encyclopedia,
http://en.wikipedia.org/w/index.php?title=Kinect&oldid=612754262 (accessed June 29, 2014).
[5] MSDN, “Face Tracking”,
http://msdn.microsoft.com/en-us/library/jj130970.aspx.
[6] MSDN, “Tracking Users with Kinect Skeletal Tracking”,
http://msdn.microsoft.com/en-us/library/jj131025.aspx.
[7] Microsoft, “Pre-order the Kinect for Windows v2 sensor”,
http://www.microsoft.com/en-us/kinectforwindows/Purchase/developer-sku.aspx.
[8] M. E. Hoque, M. Courgeon, B. Mutlu, J-C. Martin, R. W. Picard, “MACH: My Automated Conversation coacH”, in Proceedings of the 15th International Conference on Ubiquitous Computing (UbiComp), September 2013.
[9] S. Feese, B. Arnrich, G. Tröster, B. Meyer, K. Jonas, “Automatic Clustering of Conversational Patterns from Speech and Motion Data”, Measuring Behavior 2012.
[10] Nick Morgan, “7 Surprising Truths about Body Language”,
http://www.forbes.com/sites/nickmorgan/2012/10/25/7-surprising-truths-about-body-language/.
[11] Alex Pentland, “Honest Signals: How They Shape Our World”, The MIT Press, pp. 3-4, August 2008.
[12] Marco Pasch, Monica Landoni, “Building Corpora of Bodily Expressions of Affect”, Measuring Behavior 2012.
[13] Xsens MVN suit, http://www.xsens.com/products/xsens-mvn/.
[14] Wouter van Teijlingen, Egon L. van den Broek, Reinier Könemann, John G.M. Schavemaker, “Towards Sensing Behavior Using the Kinect”, Measuring Behavior 2012.
[15] MSDN, “Using the Kinect as an Audio Device”,
http://msdn.microsoft.com/en-us/library/jj883682.aspx.
[16] Posner MI, “Timing the Brain: Mental Chronometry as a Tool in Neuroscience”, PLoS Biol 3(2): e51, doi:10.1371/journal.pbio.0030051, 2005,
http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0030051.
[17] Ken Goldberg, Siamak Faridani, Ron Alterovitz, “A New Derivation and Dataset for Fitts' Law of Human Motion”, Technical Report No. UCB/EECS-2013-171, October 22, 2013, http://www.tele-actor.net/fitts-dataset/.
[18] FFmpeg, https://www.ffmpeg.org/.
[19] w3schools, “HTML
Description Master's thesis
國立政治大學 (National Chengchi University)
資訊科學學系 (Department of Computer Science)
101971004
102
Source http://thesis.lib.nccu.edu.tw/record/#G0101971004
Type thesis
Identifier G0101971004
URI http://nccur.lib.nccu.edu.tw/handle/140.119/70998
Table of contents Chapter 1 Introduction 1
1.1 Research motivation 1
1.2 Thesis organization 4
Chapter 2 Related Work 5
2.1 Literature review 5
2.2 Survey of tools 7
Chapter 3 Methodology 11
3.1 Basic idea 11
3.2 Preliminary study 14
3.2.1 Recording audio/video and storing the associated features 14
3.2.2 Video format conversion 16
3.2.3 Extracting the audio track, analyzing its features, and detecting specific audio events 16
3.2.4 Presenting the videos with web technologies and prototyping the user interface and required functions 21
3.3 Research framework design 22
3.3.1 Problem statement 22
3.3.2 Research framework 22
3.3.3 Analysis tools 23
3.4 Goal setting 24
Chapter 4 Experiments and Analysis of Results 25
4.1 Experimental procedure 25
4.1.1 Small-scale pilot evaluation stage 25
4.1.2 Self-evaluation stage by the experimenters 27
4.1.3 Large-scale formal evaluation stage 28
4.2 Analysis items 29
4.2.1 Animation unit feature patterns 29
4.2.2 Skeletal joint position difference feature patterns 31
4.2.3 3D face vertex position difference feature patterns 32
4.2.4 Audio event hit rate 33
4.3 Usability analysis 34
4.3.1 Examining the degree of consensus on animation unit feature patterns 35
4.3.2 Examining the degree of consensus on skeletal joint position difference feature patterns 38
4.3.3 Examining the degree of consensus on 3D face vertex position difference feature patterns 40
4.3.4 Examining the degree of consensus on audio events 41
4.3.5 Can liked/disliked behavior patterns be learned and predicted by a machine? 43
Chapter 5 Applications of the Research Results 55
5.1 Applications based on the research results 55
5.2 Application examples 56
Chapter 6 Conclusions and Future Work 59
6.1 Conclusions 59
6.2 Future research directions 59
References 61
Appendix 65
Format 9841690 bytes, application/pdf
Language en_US