Title: Generalized linear model prediction combined with spline and binning method (結合spline及分箱方式之廣義線性模型預測)
Authors: 楊翔宇
Yang, Shiang-Yu
Contributors: 黃子銘
Huang, Tzee-Ming
Yang, Shiang-Yu
Keywords: Nonparametric method
Piecewise polynomial
Variable selection
Knot selection
WOE of binning
Binning method
Date: 2021
Issue Date: 2021-08-05 10:12:41 (UTC+8)
Abstract: In daily life we constantly encounter large datasets, and most of them mix categorical and continuous variables. For data of this type, we propose a method that lightly preprocesses the explanatory variables and then uses the processed variables for model construction and prediction, with good results.
The proposed method is applied to bank credit card default payment data. The response variable is the payment status in the following months: "1" denotes a user who defaulted and "0" denotes a user who did not. The fitted model describes the association between credit card users' basic information and their default behavior, and it can be used to estimate the probability that a user will default in the future, helping banks monitor customers and reduce the risk of losses.
Reference: [1] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth and Brooks, Monterey, CA, 1984.
[2] C. de Boor. A Practical Guide to Splines. Springer Verlag, New York, 1978.
[3] J. F. Gamble. Asbestos and colon cancer: A weight-of-the-evidence review. Environmental Health Perspectives, 102:1038-1050, 1994.
[4] I. Guyon and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157-1182, 2003.
[5] T. K. Ho. Random decision forests. In Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 1), pages 278-282, Montreal, Que., Canada, 1995. IEEE Computer Society.
[6] T. M. Huang. A knot selection algorithm for splines in logistic regression. In Proceedings of the 2020 3rd International Conference on Mathematics and Statistics, pages 29-33, New York, NY, USA, 2020. Association for Computing Machinery.
[7] J. Jinot and S. Bayard. Respiratory health effects of passive smoking: EPA’s weight-of-evidence analysis. Journal of Clinical Epidemiology, 47(4):339-349, 1994.
[8] R. Kerber. ChiMerge: Discretization of numeric attributes. In Proceedings of the Tenth National Conference on Artificial Intelligence, AAAI’92, pages 123-128. AAAI Press, 1992.
[9] N. Shaltout, M. Elhefnawi, A. Rafea, and A. Moustafa. Information gain as a feature selection method for the efficient classification of influenza based on viral hosts. Lecture Notes in Engineering and Computer Science, 1:625-631, 2014.
[10] D. Weed. Weight of evidence: A review of concept and methods. Risk Analysis: An Official Publication of the Society for Risk Analysis, 25:1545-1557, 2005.
[11] G. Zeng. A necessary condition for a good binning algorithm in credit scoring. Applied Mathematical Sciences, 8:3229-3242, 2014.
Description: Master's thesis
Source URI:
Data Type: thesis
Appears in Collections: [Department of Statistics] Theses and Dissertations

Files in This Item:

File: 400801.pdf
Size: 1738Kb
Format: Adobe PDF

All items in 學術集成 are protected by copyright, with all rights reserved.
