Title: dataSDA and ggESDA: two R packages for exploratory symbolic data analysis
Author: 吳漢銘 (Wu, Han-Ming)
Contributor: Department of Statistics
Date: 2023-08
Upload time: 26-Dec-2024 13:27:49 (UTC+8)

Abstract: Exploratory Data Analysis (EDA) serves as a preliminary yet essential tool for summarizing the main characteristics of a dataset before appropriate statistical modeling can be applied. EDA often employs traditional graphical techniques such as boxplots, histograms, and scatterplots, supplemented by various dimension reduction methods and computer-aided interactive functionalities. Over the years, collected data have become increasingly large and complex. Data descriptions have moved beyond single-value representations to encompass intervals, histograms, and distributions. These are examples of so-called symbolic data. In response to this development, we have created two R packages: dataSDA and ggESDA. The dataSDA package collects a diverse range of symbolic datasets and offers a comprehensive set of functions that facilitate the conversion of traditional data into the symbolic data format. These datasets can serve as benchmarks for evaluating symbolic data analysis methods. In addition, the package implements various R functions for computing symbolic descriptive statistics. The ggESDA package extends ggplot2 to offer a variety of plots specifically designed for exploratory symbolic data analysis. We will discuss how ggESDA is implemented and demonstrate its utility through the analysis of two real symbolic datasets found in dataSDA.
Venue: The 25th International Conference on Computational Statistics, The European Regional Section of the IASC
Type: conference
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/154913
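
The abstract mentions converting traditional data into symbolic form and computing symbolic descriptive statistics. As a rough illustration of that idea only, the sketch below aggregates classical single-valued observations into one [min, max] interval per group, then computes a sample mean and variance for the resulting interval-valued variable under the common assumption that values are uniformly spread within each interval. It is written in Python for self-containment and is not the dataSDA API; the function names `to_intervals`, `interval_mean`, and `interval_variance` and the toy data are hypothetical.

```python
from collections import defaultdict


def to_intervals(records):
    """Aggregate (group, value) pairs into one (min, max) interval per group."""
    grouped = defaultdict(list)
    for group, value in records:
        grouped[group].append(value)
    return {group: (min(vals), max(vals)) for group, vals in grouped.items()}


def interval_mean(intervals):
    """Sample mean of interval data: the average of the interval midpoints."""
    n = len(intervals)
    return sum((a + b) / 2 for a, b in intervals) / n


def interval_variance(intervals):
    """Sample variance of interval data, assuming a uniform spread in each interval.

    Computed as E[X^2] - (E[X])^2, where each interval [a, b] contributes
    (a^2 + a*b + b^2) / 3 to E[X^2] and (a + b) / 2 to E[X].
    """
    n = len(intervals)
    sum_squares = sum(a * a + a * b + b * b for a, b in intervals)
    sum_endpoints = sum(a + b for a, b in intervals)
    return sum_squares / (3 * n) - sum_endpoints ** 2 / (4 * n * n)


# Hypothetical classical data: (group label, observed value).
records = [("A", 1.0), ("A", 3.0), ("B", 2.0), ("B", 6.0), ("C", 5.0), ("C", 5.0)]

intervals = to_intervals(records)
values = list(intervals.values())
print(intervals)
print(interval_mean(values), interval_variance(values))
```

A group with a single distinct value, such as "C" here, becomes a degenerate interval and contributes exactly as a classical observation would, so both statistics reduce to their usual classical counterparts when every interval has zero width.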