Title Topological Risk-Landscape in Metric-Free Categorical Database
Author 周珮婷
Chou, Elizabeth P.; Hsieh, Fushing; Kao, Hong-Wei
Contributor Department of Statistics
Keywords Behavioral risk factor surveillance system (BRFSS); bipartite network heatmap; categorical exploratory data analysis (CEDA); complex system; conditional entropy
Date 2024-05
Upload time 2024-07-17
Abstract The Entropy-based Categorical Exploratory Data Analysis (CEDA) paradigm is elaborately refined to algorithmically explore the intricate high-order directional associative relational patterns within the heterogeneous chronic disease dynamics captured by the Behavioral Risk Factor Surveillance System (BRFSS) database. Operating on this imbalanced categorical dataset represented fully by its metric-free high-dimensional histogram, our algorithms conduct data-driven computations to investigate chronic disease mechanisms across four sub-populations along the age-axis, culminating in comprehensive systemic understandings. Upon this categorical data-world, CEDA first recognizes the category-oriented 1D histogram as the simplest form of a piece of explainable information. Then, utilizing Kolmogorov’s randomness-proper-based reliability check, CEDA identifies and confirms collectives of 1D histograms as major feature-categories of varying orders within each sub-population. The binary memberships of these confirmed major feature-categories are then arranged into a subject-vs-feature-category bipartite network heatmap, revealing serial horizontal and vertical blocks framed by clusters of similar subjects characterized by individual-risk-landscapes (IRL) against clusters of structurally dependent major feature-categories. Based on such block-series, sub-population-specific disease mechanisms emerge as collective high-order interacting effects, elucidating directional associative relationships from study subjects’ topological neighborhoods to response-categories. Notably, the topological individual-risk-landscape offers profound insights into complex system dynamics and simultaneously exposes atypical subjects as explainable errors across all Machine Learning classifiers.
Relation IEEE Access, Vol. 12, pp. 66296-66318
Type article
DOI https://doi.org/10.1109/ACCESS.2024.3398416
dc.contributor Department of Statistics
dc.creator (Author) 周珮婷
dc.creator (Author) Chou, Elizabeth P.; Hsieh, Fushing; Kao, Hong-Wei
dc.date (Date) 2024-05
dc.date.accessioned 2024-07-17
dc.date.available 2024-07-17
dc.date.issued (Upload time) 2024-07-17
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/152335
dc.format.extent 107 bytes
dc.format.mimetype text/html
dc.relation (Relation) IEEE Access, Vol. 12, pp. 66296-66318
dc.subject (Keywords) Behavioral risk factor surveillance system (BRFSS); bipartite network heatmap; categorical exploratory data analysis (CEDA); complex system; conditional entropy
dc.title (Title) Topological Risk-Landscape in Metric-Free Categorical Database
dc.type (Type) article
dc.identifier.doi (DOI) 10.1109/ACCESS.2024.3398416
dc.doi.uri (DOI) https://doi.org/10.1109/ACCESS.2024.3398416