Title dataSDA and ggESDA: two R packages for exploratory symbolic data analysis
Author Wu, Han-Ming (吳漢銘)
Contributor Department of Statistics
Date 2023-08
Upload time 26-Dec-2024 13:27:49 (UTC+8)
Abstract Exploratory Data Analysis (EDA) is a preliminary yet essential tool for summarizing the main characteristics of a dataset before appropriate statistical modeling can be applied. EDA typically employs traditional graphical techniques such as boxplots, histograms, and scatterplots, supplemented by dimension reduction methods and computer-aided interactive features. Over the years, collected data have become increasingly large and complex. Data descriptions have moved beyond single-value representations to encompass intervals, histograms, and distributions, which are examples of so-called symbolic data. In response to this development, we have created two R packages: dataSDA and ggESDA. The dataSDA package collects a diverse range of symbolic datasets and offers a comprehensive set of functions for converting traditional data into symbolic data formats. These datasets can serve as benchmarks for evaluating symbolic data analysis methods. In addition, the package implements various R functions for computing symbolic descriptive statistics. The ggESDA package extends ggplot2 to offer a variety of plots specifically designed for exploratory symbolic data analysis. We will discuss how ggESDA is implemented and demonstrate its utility through the analysis of two real symbolic datasets found in dataSDA.
Relation The 25th International Conference on Computational Statistics, The European Regional Section of the IASC
Type conference
dc.contributor Department of Statistics
dc.creator (Author) 吳漢銘
dc.creator (Author) Wu, Han-Ming
dc.date (Date) 2023-08
dc.date.accessioned 26-Dec-2024 13:27:49 (UTC+8)
dc.date.available 26-Dec-2024 13:27:49 (UTC+8)
dc.date.issued (Upload time) 26-Dec-2024 13:27:49 (UTC+8)
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/154913
dc.format.extent 186 bytes
dc.format.mimetype text/html
dc.relation (Relation) The 25th International Conference on Computational Statistics, The European Regional Section of the IASC
dc.title (Title) dataSDA and ggESDA: two R packages for exploratory symbolic data analysis
dc.type (Type) conference