Title 賽局理論與學習模型的實證研究
An empirical study of game theory and learning model
Author 陳冠儒
Chen, Kuan Lu
Contributors 余清祥
Yue, Ching Syang
陳冠儒
Chen, Kuan Lu
Keywords prisoner's dilemma
repeated game
learning model
Monte Carlo simulation
goodness-of-fit
Date 2010
Uploaded 5-Oct-2011 14:31:58 (UTC+8)
Abstract Game theory generally assumes rational decision-making. For a one-shot game, the equilibrium (or optimal strategy) can usually be derived theoretically. When a game is repeated, however, the equilibrium need not be unique, and theoretical derivation alone may not find every equilibrium. In the Prisoner's Dilemma (PD), for example, defection is the theoretical equilibrium of the one-shot game, but if players can benefit from cooperating across rounds of a repeated game, defection may no longer be the best choice. In recent years, economists have used game experiments, which resemble statistical experimental design, to explore the gap between theory and actual behavior in repeated games, and have used learning models to describe players' decisions. Learning models are mostly judged by the size of their estimation and prediction errors, but such results are likely to be data dependent. To address this weakness in model selection, this study adapts the model selection and residual testing procedures of regression analysis and evaluates learning models using empirical data together with Monte Carlo simulation.
The empirical data come from repeated PD games under four experimental settings; the players were undergraduate students at National Chengchi University in Taiwan. Four learning models are compared: the Reinforcement Learning (RL) model, the Extended Reinforcement Learning (ERL) model, the Belief Learning (BL) model, and the Experience-Weighted Attraction (EWA) model. The empirical and simulation analyses show that the RL model is better suited to describing the PD data, in terms of both smaller errors and goodness-of-fit. In addition, players' behavior was not consistent, either across experimental settings or within a group, and separating the players into different sets reduced the estimation errors.
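To make the idea of a learning model evaluated by Monte Carlo simulation concrete, the following is a minimal sketch, not the thesis's implementation. It assumes a basic propensity-based reinforcement rule (each action's propensity grows by the payoff it earned, and actions are chosen with probability proportional to propensity), a hypothetical payoff matrix, and hypothetical function names; none of these values are taken from the record.

```python
import random

# Hypothetical PD payoffs (own action, opponent action) -> own payoff.
# These numbers are illustrative assumptions, not the thesis's settings.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


def choose(propensities, rng):
    """Pick an action with probability proportional to its propensity."""
    total = propensities["C"] + propensities["D"]
    return "C" if rng.random() < propensities["C"] / total else "D"


def play_repeated_pd(rounds, rng, initial=1.0):
    """Simulate one repeated PD between two reinforcement learners.

    Returns the list of (action1, action2) pairs, one per round.
    """
    q1 = {a: initial for a in ACTIONS}
    q2 = {a: initial for a in ACTIONS}
    history = []
    for _ in range(rounds):
        a1, a2 = choose(q1, rng), choose(q2, rng)
        # Reinforce only the action actually played, by the payoff earned.
        q1[a1] += PAYOFF[(a1, a2)]
        q2[a2] += PAYOFF[(a2, a1)]
        history.append((a1, a2))
    return history


def cooperation_rates(replications, rounds, seed=0):
    """Monte Carlo estimate of the per-round cooperation rate."""
    rng = random.Random(seed)
    counts = [0] * rounds
    for _ in range(replications):
        for t, (a1, a2) in enumerate(play_repeated_pd(rounds, rng)):
            counts[t] += (a1 == "C") + (a2 == "C")
    return [c / (2 * replications) for c in counts]


if __name__ == "__main__":
    rates = cooperation_rates(replications=2000, rounds=20)
    print([round(r, 3) for r in rates])
```

Because defection earns more than cooperation against either opponent action in this assumed matrix, its propensity tends to grow faster. Simulated per-round rates like these could then be compared with observed cooperation rates, for example through an error measure or a goodness-of-fit test.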
Description Master's
National Chengchi University
Graduate Institute of Statistics
98354008
99
Source http://thesis.lib.nccu.edu.tw/record/#G0098354008
Type thesis
Advisor 余清祥
Yue, Ching Syang
URI http://nccur.lib.nccu.edu.tw/handle/140.119/51203
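Table of contents Chapter 1 Introduction
Section 1 Research Motivation
Section 2 Research Objectives
Chapter 2 Literature Review
Section 1 Learning Models
Section 2 Monte Carlo Simulation
Chapter 3 Data and Research Methods
Section 1 Data Description
Section 2 Research Methods
Chapter 4 Empirical Data Analysis
Section 1 The Necessity of Learning Models
Section 2 Scenario Data: RM
Section 3 Scenario Data: WH
Section 4 Scenario Data: WH(c)
Section 5 Scenario Data: WH(p)
Section 6 Sensitivity Analysis
Chapter 5 Conclusions and Suggestions
Section 1 Conclusions
Section 2 Future Work and Suggestions
References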
Language en_US