Publications - Periodical Articles


Title Residual permutation tests for feature importance in machine learning
Author Huang, Po-Hsien (黃柏僩)
Contributor Department of Psychology (心理系)
Keywords feature importance; machine learning; permutation test
Date 2025-08
Date Uploaded 24-Sep-2025 09:38:54 (UTC+8)
Abstract Psychological research has traditionally relied on linear models to test scientific hypotheses. However, the emergence of machine learning (ML) algorithms has opened new opportunities for exploring variable relationships beyond linear constraints. To interpret the outcomes of these ‘black-box’ algorithms, various tools for assessing feature importance have been developed. However, most of these tools are descriptive and do not facilitate statistical inference. To address this gap, our study introduces two versions of residual permutation tests (RPTs), designed to assess the significance of a target feature in predicting the label. The first variant, RPT on Y (RPT-Y), permutes the residuals of the label conditioned on features other than the target. The second variant, RPT on X (RPT-X), permutes the residuals of the target feature conditioned on the other features. Through a comprehensive simulation study, we show that RPT-X maintains empirical Type I error rates under the nominal level across a wide range of ML algorithms and demonstrates appropriate statistical power in both regression and classification contexts. These findings suggest the utility of RPT-X for hypothesis testing in ML applications.
Relation British Journal of Mathematical and Statistical Psychology
Type article
DOI https://doi.org/10.1111/bmsp.70009
URI https://nccur.lib.nccu.edu.tw/handle/140.119/159629
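The RPT-X procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes linear least-squares conditional models (in which case the test reduces to a permutation test on the partial correlation between the target feature and the label) and an absolute-correlation test statistic. The function name `rpt_x_linear` and all defaults are hypothetical.

```python
import numpy as np

def rpt_x_linear(X, y, j, n_perm=999, seed=0):
    """Sketch of an RPT-X-style test for feature j of X.

    Permutes the residuals of feature j conditioned on the other
    features, and compares the resulting statistic (here, the absolute
    correlation with the label's residuals) against the observed one.
    """
    rng = np.random.default_rng(seed)
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # design matrix with intercept
    # Least-squares residuals of the target feature and of the label,
    # each conditioned on the remaining features
    r_x = X[:, j] - A @ np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    r_y = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    stat = abs(np.corrcoef(r_x, r_y)[0, 1])
    # Permuting r_x breaks any residual association with the label
    # while preserving the dependence of feature j on the other features
    hits = sum(
        abs(np.corrcoef(rng.permutation(r_x), r_y)[0, 1]) >= stat
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)  # permutation p-value
```

A feature that truly contributes to the label beyond the other features should yield a small p-value, while an irrelevant feature's p-value should be roughly uniform; the paper's version applies the same residual-permutation idea with general ML models rather than linear fits.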