Title 非監督式學習下高風險行為青少年探究
Unsupervised learning of adolescent risk-taking study
Author 李承軒
Contributors 周珮婷
李承軒
Keywords Unsupervised learning
Agglomerative hierarchical clustering
Data cloud geometry tree
Risk-taking
Date 2017
Upload time 24-Jul-2017 11:59:12 (UTC+8)
Abstract This study applied two unsupervised clustering algorithms and cross-checked their results to characterize high-risk behavior among adolescents. The first algorithm was the data cloud geometry tree (DCG-tree), which works along two dimensions, temperature and time: through filtering by temperature and automatic detection of the time axis, the differences between clusters increased as the temperature was lowered. The second algorithm was agglomerative hierarchical clustering, a simple, fast, and practical method. The risk-taking data were divided into a continuous part and a categorical part, both parts were clustered, and hypothesis tests were conducted to verify whether the between-cluster differences were large. Judged by the number of significant variables, hierarchical clustering performed better, which suggests larger between-cluster differences. Judged by the variance ratio, however, the DCG-tree showed a larger ratio, and hence larger between-cluster differences, in the special clusters, whereas hierarchical clustering showed large between-cluster differences only at the first split. Finally, the special clusters were compared with the non-special clusters; the special clusters had higher risk values and were therefore presumed to contain high-risk adolescents. Observations that appeared in the special clusters of both algorithms were taken as the target group of high-risk adolescents, and the demographic data of this target group were summarized.
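The hierarchical-clustering side of the workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The average linkage, the distance that adds Euclidean distance on continuous fields to Hamming distance on categorical fields, the toy records, and the function names mixed_distance, agglomerative, and variance_ratio are all assumptions made for illustration. The DCG-tree is not sketched here.

```python
from itertools import combinations
from math import sqrt


def mixed_distance(a, b):
    """Assumed distance: Euclidean on numeric fields plus Hamming on categorical fields."""
    num_a, cat_a = a
    num_b, cat_b = b
    euclid = sqrt(sum((x - y) ** 2 for x, y in zip(num_a, num_b)))
    hamming = sum(x != y for x, y in zip(cat_a, cat_b))
    return euclid + hamming


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns clusters as lists of indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(mixed_distance(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair; combinations() guarantees i < j.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def variance_ratio(values, clusters):
    """Between-cluster sum of squares over total sum of squares for one variable."""
    grand = sum(values) / len(values)
    total = sum((v - grand) ** 2 for v in values)
    between = 0.0
    for c in clusters:
        mean_c = sum(values[i] for i in c) / len(c)
        between += len(c) * (mean_c - grand) ** 2
    return between / total if total else 0.0


# Hypothetical toy records: ((continuous risk score,), (categorical answer,))
records = [
    ((1.0,), ("no",)),
    ((1.2,), ("no",)),
    ((0.9,), ("no",)),
    ((5.0,), ("yes",)),
    ((5.3,), ("yes",)),
]
clusters = agglomerative(records, 2)
scores = [r[0][0] for r in records]
ratio = variance_ratio(scores, clusters)
```

A ratio close to 1 means that most of the variation in the variable lies between clusters rather than within them, which is the sense in which the abstract speaks of large between-cluster differences.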
References Abbas, O. A. (2008). Comparisons between data clustering algorithms. Int. Arab J. Inf. Technol., 5(3), 320-325.
Ahmad, A., & Dey, L. (2007). A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set. Pattern Recognition Letters, 28(1), 110-118.
Fushing, H., & McAssey, M. P. (2010). Time, temperature, and data cloud geometry. Physical Review E, 82(6), 061110.
Fushing, H., Wang, H., VanderWaal, K., McCowan, B., & Koehl, P. (2013). Multi-scale clustering by building a robust and self correcting ultrametric topology on data points. PLoS ONE, 8(2), e56259.
Hamming, R. W. (1950). Error detecting and error correcting codes. Bell System Technical Journal, 29, 147-160. doi: 10.1002/j.1538-7305.1950.tb00463.x
Jia, H., Cheung, Y.-m., & Liu, J. (2016). A new distance metric for unsupervised learning of categorical data. IEEE Transactions on Neural Networks and Learning Systems, 27(5), 1065-1079.
Murtagh, F., & Legendre, P. (2011). Ward's hierarchical clustering method: clustering criterion and agglomerative algorithm. arXiv preprint arXiv:1111.6285.
Description Master's thesis
National Chengchi University
Department of Statistics
104354017
Source http://thesis.lib.nccu.edu.tw/record/#G0104354017
Type thesis
dc.contributor.advisor 周珮婷
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/111305
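dc.description.tableofcontents Chapter 1 Introduction
Chapter 2 Literature Review
Chapter 3 Data Description
Chapter 4 Methods
Section 1 Algorithms
Section 2 Distance Measures
Section 3 Hypothesis Tests
Chapter 5 Analysis and Results
Section 1 DCG-TREE
Section 2 HIERARCHICAL CLUSTERING
Section 3 Cross-Comparison
Section 4 Between-Group Variance Ratio
Section 5 Comparison of Special and Non-Special Clusters
Chapter 6 Conclusion
Section 1 Findings
Section 2 Discussion and Recommendations
Chapter 7 References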
dc.format.extent 6064353 bytes
dc.format.mimetype application/pdf