Title Statistical Machine Learning for Theory-building and Causal Analysis (統計機器學習:理論建構與因果分析)
Author Chou, Ping (周平)
Contributors Chuang, Hao-Chun (莊皓鈞)
Chou, Yen-Chun (周彥君)
Chou, Ping (周平)
Keywords Theory building
statistical machine learning
double machine learning
statistical and causal inference
human-system interaction
Date 2023
Upload time 1-Feb-2024 10:56:09 (UTC+8)
Abstract Problem-solving and decision support are central concerns of management research, and Machine Learning (ML), in particular black-box algorithms dedicated to function fitting, serves as a pivotal tool for the predictive and prescriptive analytics that decision optimization requires. However, management researchers tend to overlook ML's potential as a methodology for empirical research that answers the central question of theory building: "Is it valid or true?" Recent developments in statistical ML and model-agnostic interpretation have made these algorithms capable tools for statistical and causal inference. For descriptive theory development, which relies on association, recent articles in leading management journals have begun to advocate ML for pattern discovery and theory building. Nevertheless, these studies offer no statistical tests for variable or interaction importance and do not systematically integrate existing statistical ML techniques. For normative theory development, which relies on causality, the econometrics literature has devised ML-based causal effect estimators, but these estimators lack empirical application in information and operations management research. To fill these gaps, this dissertation comprises two essays. The first essay articulates how statistical ML supports explanatory and confirmatory theory building with associations. We propose an analysis protocol based on Random Forest (RF) for abductive and deductive theory building, equipped with statistical tests for the significance of variables and their interactions. The protocol helps uncover hidden patterns in the data and test associations. Based on extensive simulations, we identify the contexts in which ML outperforms, and those in which it complements, the econometric models commonly used for empirical data analysis. The second essay introduces Double Machine Learning (DML), a general framework for ML-based causal inference, for theory building. In addition to numerically assessing the statistical properties and robustness of DML, we use it to empirically analyze human-system interaction in semi-automated replenishment decision-making in a retail supply chain. Relying on the statistical validity of DML for dynamic panel data, we uncover robust causal patterns that indicate the contexts in which managers' interventions in algorithmic decisions improve inventory performance. Together, the two essays integrate, validate, and apply advanced ML techniques for theory building, and contribute to the methodological literature on ML-based inference in management and to the empirical literature on operations management and human-system interaction in decision-making.
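Description 博士 (Doctoral)
國立政治大學 (National Chengchi University)
資訊管理學系 (Department of Management Information Systems)
108356508
Source http://thesis.lib.nccu.edu.tw/record/#G0108356508
Type thesis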
dc.contributor.advisor 莊皓鈞; 周彥君 [zh_TW]
dc.contributor.advisor Chuang, Hao-Chun; Chou, Yen-Chun [en_US]
dc.contributor.author (Authors) 周平 [zh_TW]
dc.contributor.author (Authors) Chou, Ping [en_US]
dc.creator (Author) 周平 [zh_TW]
dc.creator (Author) Chou, Ping [en_US]
dc.date (Date) 2023 [en_US]
dc.date.accessioned 1-Feb-2024 10:56:09 (UTC+8)
dc.date.available 1-Feb-2024 10:56:09 (UTC+8)
dc.date.issued (Upload time) 1-Feb-2024 10:56:09 (UTC+8)
dc.identifier (Other Identifiers) G0108356508 [en_US]
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/149466
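dc.description (Description) 博士 (Doctoral) [zh_TW]
dc.description (Description) 國立政治大學 (National Chengchi University) [zh_TW]
dc.description (Description) 資訊管理學系 (Department of Management Information Systems) [zh_TW]
dc.description (Description) 108356508 [zh_TW]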
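dc.description.abstract (Abstract) Solving managerial problems and supporting decisions are core concerns of management research. Machine learning (ML) models, especially black-box algorithms widely used for function estimation, play a key role in the predictive and prescriptive analytics that decision optimization requires. Even so, the usefulness of ML in empirical research for verifying whether a theory is valid or true has not drawn the attention of most management researchers. With advances in statistical ML and model-agnostic interpretation techniques, ML has become a powerful tool for statistical and causal inference. For descriptive theory building, which explores associations among variables, a few articles in top management journals have recently advocated using ML for pattern discovery and theory building. These studies, however, do not statistically test the importance of variables and interactions, nor do they systematically integrate existing statistical ML techniques. For normative theory building, which analyzes causal relationships among variables, the econometrics literature has developed ML-based estimators, but empirical applications of these techniques in information and operations management remain scarce. To fill these gaps, this dissertation comprises two essays. The first essay explains how statistical ML supports explanatory and confirmatory theory building based on associations. It proposes a Random Forest-based analysis protocol that supports deductive and abductive theory building and statistically tests the importance of variables and interactions, thereby detecting hidden patterns in the data and testing associations among variables. Through extensive simulation experiments, the essay identifies the settings in which ML performs better and those in which it complements traditional statistical methods. The second essay introduces Double Machine Learning (DML), a general ML-based technique for causal inference. Besides using simulations to verify the validity and robustness of DML for causal inference, the essay applies it to an empirical analysis of inventory management in a retail supply chain, examining the human-system interaction between the recommendations of a semi-automated decision support system and managers' adjustments. From dynamic panel data, the essay uncovers several robust causal patterns that indicate the contexts in which managers' modifications of system decisions may improve inventory performance. In sum, both essays integrate, validate, and apply state-of-the-art ML for theory building, contributing to the methodological literature on ML-based statistical inference and to empirical research on operations management and human-system interaction. [zh_TW, English translation]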
dc.description.abstract (Abstract) Problem-solving and decision support are central concerns of the management literature, and Machine Learning (ML), in particular black-box algorithms dedicated to function fitting, serves as a pivotal tool for the predictive and prescriptive analytics that underpin decision optimization. However, management researchers tend to overlook ML’s potential as a methodology for empirical research, which addresses the central question of theory building: “Is it valid or true?” Recent developments in statistical ML and model-agnostic interpretation have made these algorithms viable tools for statistical and causal inference. For descriptive theory development, which relies on association, recent works in leading management journals have begun using ML for pattern discovery and theory building. Nevertheless, they stop short of offering statistical tests for variable and interaction importance and of systematically integrating the existing statistical ML techniques. For causality, which underpins normative theory, the econometrics literature has devised ML-based causal effect estimators, but empirical applications in information and operations management research are lacking. To fill these gaps, this dissertation comprises two essays. The first essay articulates the use of statistical ML for explanatory and confirmatory theory building with associations. We propose an analysis protocol based on Random Forest (RF) for abductive and deductive theory building, equipped with statistical tests for the significance of variables and their interactions. The protocol helps uncover patterns and test associations. Based on extensive simulations, we shed light on the contexts in which ML outperforms, and those in which it complements, the econometric models prevalently used for empirical data analysis. In the second essay, we introduce Double Machine Learning (DML), a general framework for ML-based causal inference, for theory building. In addition to numerically assessing the statistical properties and robustness of DML, we use DML to empirically analyze human-system interaction in the context of semi-automated replenishment decision-making. Relying on the proven statistical validity of DML for dynamic panel data, we uncover robust causal patterns for the contexts in which managers’ interventions in algorithmic decisions improve inventory performance. The two essays are dedicated to integrating, validating, and employing sophisticated ML techniques for theory building, and they contribute to the methodological literature on ML-based inference in management and to the literature on empirical operations management and human-system interaction in decision-making. [en_US]
dc.description.tableofcontents [zh_TW]
Section 1 Introduction
Section 2 Supervised Machine Learning for Theory Building and Testing: Opportunities in Operations Management
Section 2.1 Introduction
Section 2.2 Machine Learning in Empirical OM
Section 2.3 Machine Learning for Theory Building
Section 2.3.1 Pattern Discovery for Theorizing
Section 2.3.2 Abductive Reasoning
Section 2.4 Machine Learning for Theory Testing
Section 2.4.1 Moving Beyond Linearity and Low-Dimensionality
Section 2.4.2 Illustrations of RF for Theory Testing
Section 2.4.2.1 CIs and Predictive Significance
Section 2.4.2.2 Linear Hypothesis
Section 2.4.2.3 Nonlinear Hypothesis
Section 2.5 Extended Applications
Section 2.6 Concluding Remarks
Section 2.7 Appendix
Section 2.7.1 Recent OM Studies with the Use of Supervised ML
Section 2.7.2 Machine Learning Algorithms
Section 2.7.3 Jackknife CIs and Permutation Test
Section 2.7.3.1 Jackknife-After-Bootstrap and Infinitesimal Jackknife
Section 2.7.3.2 Bias-Variance Analysis
Section 2.7.3.3 Permutation Test with Random Forest
Section 2.7.4 Linear Regression Modeling
Section 3 Understanding Human-Algorithm Interaction in Ordering Decisions: A Computational Intelligence Approach
Section 3.1 Introduction
Section 3.2 Literature Review
Section 3.3 Methodology
Section 3.3.1 Causal Inference and Estimators
Section 3.3.2 Double Machine Learning
Section 3.3.3 Simulation Study
Section 3.3.3.1 Cross-sectional Data and ATE
Section 3.3.3.2 Longitudinal Data and CATE
Section 3.4 Research Context
Section 3.4.1 The Case Company
Section 3.4.2 Outcome and Treatment
Section 3.4.3 Contexts and Moderators
Section 3.4.3.1 Inventory-related Contexts
Section 3.4.3.2 Supplier-related Contexts
Section 3.4.3.3 Demand-related Contexts
Section 3.5 Empirical Analysis
Section 3.5.1 Model and Pre-processing
Section 3.5.2 Main Analysis
Section 3.5.2.1 Inventory-related Contexts
Section 3.5.2.2 Supplier-related Contexts
Section 3.5.2.3 Demand-related Contexts
Section 3.5.2.4 Robustness Check
Section 3.6 Exploratory Theory Development
Section 3.7 Discussion
Section 3.7.1 Conclusions and Contributions
Section 3.7.2 Limitations and Future Research
Section 3.8 Appendix
Section 3.8.1 Variables for DML Orthogonalization
Section 3.8.2 Z-test for Robustness Test
References
dc.format.extent 2376870 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0108356508 [en_US]
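dc.subject (Keywords) 理論建構 (theory building) [zh_TW]
dc.subject (Keywords) 統計機器學習 (statistical machine learning) [zh_TW]
dc.subject (Keywords) 雙重機器學習 (double machine learning) [zh_TW]
dc.subject (Keywords) 統計與因果推論 (statistical and causal inference) [zh_TW]
dc.subject (Keywords) 人機互動 (human-system interaction) [zh_TW]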
dc.subject (Keywords) Theory building [en_US]
dc.subject (Keywords) statistical machine learning [en_US]
dc.subject (Keywords) double machine learning [en_US]
dc.subject (Keywords) statistical and causal inference [en_US]
dc.subject (Keywords) human-system interaction [en_US]
dc.title (Title) 統計機器學習:理論建構與因果分析 [zh_TW]
dc.title (Title) Statistical Machine Learning for Theory-building and Causal Analysis [en_US]
dc.type (Type) thesis [en_US]
dc.relation.reference (參考文獻) Agrawal, A., Gans, J., & Goldfarb, A. (2020). How to win with machine learning. Harvard Business Review. Allcott, H., Braghieri, L., and Gentzkow, M. (2020). The welfare effects of social media. American Economic Review, 110(3), 629-676. Alley, M., Biggs, M., Hariss, R., Herrmann, C., Li, M., and Perakis, G. (2020). Pricing for heterogeneous products: Analytics for ticket reselling. Working Paper, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3360622. Anand, G., Larson, E. C., & Mahoney, J. T. (2020). Thomas Kuhn on paradigms. Production and Operations Management, 29(7), 1650-1657. Ang, E., Kwasnick, S., Bayati, M., Plambeck, E. L., & Aratow, M. (2016). Accurate emergency department wait time prediction. Manufacturing & Service Operations Management, 18(1), 141-156. Athey, S. and Imbens, G. W. (2019). Machine learning methods that economists should know about. Annual Review of Economics, 11, 685-725. Athey, S., Tibshirani, J., and Wager, S. (2019). Generalized random forest. The Annals of Statistics, 47(2), 1148-1178. Austin, P. C. (2012). Using ensemble-based methods for directly estimating causal effects: An investigation of tree-based G-computation. Multivariate Behavioral Research, 47(1), 115-135. Bastani, H., Zhang, D., and Heng, Z. (2021). Applied machine learning in operations management. Working Paper, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3736466. Baumer, E. P., Mimno, D., Guha, S., Quan, E., and Gay, G. K. (2017). Comparing grounded theory and topic modeling: Extreme divergence or unlikely convergence? Journal of the Association for Information Science and Technology, 68(6), 1397-1410. Bertolini, M., Mezzogori, D., Neroni, M., and Zammori, F. (2021). Machine learning for industrial applications: A comprehensive literature review. Expert Systems with Applications, 175, 114820. Bertsimas, D., Kallus, N., & Hussain, A. (2016). Inventory management in the era of big data. 
Production and Operations Management, 25(12), 2006-2009. Biau, G. (2012). Analysis of a random forest model. Journal of Machine Learning Research, 13, 1063-1095. Bica, I., Alaa, A., & Van Der Schaar, M. (2020, November). Time series deconfounder: Estimating treatment effects over time in the presence of hidden confounders. In International Conference on Machine Learning (pp. 884-895). PMLR. Bojer, C. S. and Meldgaard, J. P. (2021). Kaggle forecasting competitions: An overlooked learning opportunity. International Journal of Forecasting, 37(2), 587-603. Brand, J. E., Xu, J., Koch, B., and Geraldo, P. (2021). Uncovering sociological effect heterogeneity using tree-based machine learning. Sociological Methodology, in press. Breiman, L. (2001). Random forest. Machine Learning, 45(1), 5-32. Briggs, D. C. (2004). Causal inference and the Heckman model. Journal of Educational and Behavioral Statistics, 29(4), 397-420. Brusco, M. J., Singh, R., Cradit, J. D., and Steinley, D. (2017). Cluster analysis in empirical OM research: Survey and recommendations. International Journal of Operations & Production Management. 37(3), 300-320. Carlile, P. R., & Christensen, C. M. (2005). The cycles of theory building in management research. Division of Research, Harvard Business School. https://www.hbs.edu/ris/Publication%20Files/05-057.pdf Caro, F., & de Tejada Cuenca, A. S. (2023). Believing in analytics: Managers’ adherence to price recommendations from a DSS. Manufacturing & Service Operations Management, 25(2), 524-542. Caro, F., & Gallien, J. (2010). Inventory management of a fast-fashion retail network. Operations Research, 58(2), 257-273. Caro, F., Colliard, J. E., Katok, E., Ockenfels, A., Stier-Moses, N., Tucker, C., & Wu, D. J. (2022). Call for papers—Management Science Special Issue on the Human-Algorithm Connection. Management Science, 68(1), 7-8. Caro, F., Gallien, J., Díaz, M., García, J., Corredoira, J. M., Montes, M., ... & Correa, J. (2010). 
Zara uses operations research to reengineer its global distribution process. Interfaces, 40(1), 71-84. Chakraborty, B. and Moodie, E. E. (2013). Statistical methods for dynamic treatment regimes (Vol. 2). Springer, New York. Chandrasekaran, A., de Treville, S., and Browning, T. (2020). Editorial: Intervention-based research (IBR) – What, where, and how to use it in operations management. Journal of Operations Management, 66(4), 370-378. Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1-C68. Chou, P., Chuang, H. H. C., Chou, Y. C., & Liang, T. P. (2022). Predictive analytics for customer repurchase: Interdisciplinary integration of buy till you die modeling and machine learning. European Journal of Operational Research, 296(2), 635-651. Chou, Y. C., Chuang, H. H. C., Chou, P., & Oliva, R. (2023). Supervised machine learning for theory building and testing: Opportunities in operations management. Journal of Operations Management, 69(4), 643-675. Choudhury, P., Allen, R. T., & Endres, M. G. (2021). Machine learning for pattern discovery in management research. Strategic Management Journal, 42(1), 30-57. Christopher, M., & Lee, H. (2004). Mitigating supply chain risk through improved confidence. International Journal of Physical Distribution & Logistics Management, 34(5), 388-396. Chuang, H. H. C., & Oliva, R. (2015). Inventory record inaccuracy: Causes and labor effects. Journal of Operations Management, 39, 63-78. Chuang, H. H. C., Chou, Y. C., & Oliva, R. (2021). Cross‐item learning for volatile demand forecasting: An intervention with predictive analytics. Journal of Operations Management, 67(7), 828-852. Chuang, H. H., Oliva, R., and Heim, G. R. (2019). Examining the link between retailer inventory leanness and operational efficiency: Moderating roles of firm size and demand uncertainty. 
Production and Operations Management, 28(9), 2338-2364. Coleman, T., Peng, W. and Mentch, L. (2022). Scalable and efficiency hypothesis testing with random forest. Journal of Machine Learning Research, 23, 1-35. Cook, T. D. and Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Houghton Mifflin, Boston. Cui, R., Gallino, S., Moreno, A., & Zhang, D. J. (2018). The operational value of social media information. Production and Operations Management, 27(10), 1749-1769. Cui, R., Li, M., & Zhang, S. (2022). AI and Procurement. Manufacturing & Service Operations Management, 24(2), 691-706. DeHoratius, N., Mersereau, A. J., & Schrage, L. (2008). Retail inventory management when records are inaccurate. Manufacturing & Service Operations Management, 10(2), 257-277. Dombi, J., Jónás, T., & Tóth, Z. E. (2018). Modeling and long-term forecasting demand in spare parts logistics businesses. International Journal of Production Economics, 201, 1-17. Domnici, F., Bargagli-Stoffi, F. J., & Mealli, F. (2021). From controlled to undisciplined data: Estimating causal effects in the era of data science using a potential outcome framework. Harvard Data Science Review, 3(3), 1-34. Duke, K. E., & Amir, O. (2023). The importance of selling formats: When integrating purchase and quantity decisions increases sales. Marketing Science, 42(1), 87-109. Edmondson, A. C. and McManus, S. E. (2007). Methodological fit in management field research. Academy of Management Review, 32(4), 1155-1179. Efron, B. (1992). Jackknife-after-bootstrap standard errors and influence functions. Journal of the Royal Statistical Society: Series B (Methodological), 54(1), 83-111. Efron, B. (2014). Estimation and accuracy after model selection. Journal of the American Statistical Association, 109(507), 991-1007. Efron, B. and Hastie, T. (2016). Computer age statistical inference: Algorithms, evidence, and data science. Cambridge University Press, Cambridge, UK. 
Elmaghraby, W., Jank, W., Zhang, S., & Karaesmen, I. Z. (2015). Sales force behavior, pricing information, and pricing decisions. Manufacturing & Service Operations Management, 17(4), 495-510. Eroglu, C., Williams, B. D., & Waller, M. A. (2013). The backroom effect in retail operations. Production and Operations Management, 22(4), 915-923. Ertekin, N. (2018). Intermediate and long-term benefits of in-store return experience. Production and Operations Management, 27(1), 121-142. Feng, H., Wu, Q., Muthuraman, K., & Deshpande, V. (2015). Replenishment policies for multi‐product stochastic inventory systems with correlated demand and joint‐replenishment costs. Production and Operations Management, 24(4), 647-664. Ferreira, K. J., Lee, B. H. A., & Simchi-Levi, D. (2016). Analytics for an online retailer: Demand forecasting and price optimization. Manufacturing & Service Operations Management, 18(1), 69-88. Fildes, R., Goodwin, P., & Lawrence, M. (2006). The design features of forecasting support systems and their effectiveness. Decision Support Systems, 42(1), 351-361. Fiore, A. M., Yah, X., & Yoh, E. (2000). Effects of a product display and environmental fragrancing on approach responses and pleasurable experiences. Psychology & Marketing, 17(1), 27-54. Fisher, M., & Raman, A. (2010). The new science of retailing: How analytics are transforming the supply chain and improving performance. Harvard Business Review Press. Fong, Y., Huang, Y., Gilbert, P. B., and Permar, S. R. (2017). Changpt: Threshold regression model estimation and inference. BMC Bioinformatics, 18(1), 1-7. Geva, T., & Saar‐Tsechansky, M. (2021). Who is a better decision maker? Data‐driven expert ranking under unobserved quality. Production and Operations Management, 30(1), 127-144. Glaeser, C. K., Fisher, M., & Su, X. (2019). Optimal retail location: Empirical methodology and application to practice: Finalist–2017 M&SOM practice-based research competition. 
Manufacturing & Service Operations Management, 21(1), 86-102. Glynn, A. N., & Quinn, K. M. (2010). An introduction to the augmented inverse propensity weighted estimator. Political Analysis, 18(1), 36-56. Gneiting, T., Balabdaoui, F., & Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(2), 243-268. Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. (2015). Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics, 24(1), 44-65. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT press. Guajardo, J. A. (2019). How do usage and payment behavior interact in rent‐to‐own business models? Evidence from developing economies. Production and Operations Management, 28(11), 2808-2822. Gulli, A., & Pal, S. (2017). Deep learning with Keras. Packt Publishing Ltd. Haans, R. F. J., Pieters, C., and He, Z. (2016). Thinking about U: Theorizing and testing U- and inverted U-shaped relationships in strategy research. Strategic Management Journal, 37(7), 1177-1195. Hanson, N. R. (1958). Patterns of discovery. Cambridge University Press, Cambridge, UK. Hastie, T. and Tibshirani, R. (1990). Generalized additive models. Chapman and Hall. Hatt, T., & Feuerriegel, S. (2021). Sequential deconfounding for causal inference with unobserved confounders. arXiv preprint arXiv:2104.09323. Hirano, K., & Imbens, G. W. (2004). The propensity score with continuous treatments. Applied Bayesian modeling and causal inference from incomplete-data perspectives, 226164, 73-84. Ho, T. H., Lim, N., Reza, S., and Xia, X. (2017). Causal inference models in operations management. Manufacturing & Service Operations Management, 19(4), 509-525. Hoffman, M., Kahn, L. B., & Li, D. (2018). Discretion in hiring. The Quarterly Journal of Economics, 133(2), 765-800. Hofman, J. 
M., Sharma, A., and Watts, D. J. (2017). Prediction and explanation in social systems. Science, 355, 486-488. Huang, M. G. (2009). Real options approach-based demand forecasting method for a range of products with highly volatile and correlated demand. European Journal of Operational Research, 198(3), 867-877. Huang, N., Yan, Z., and Yin, H. (2021). Effects of online-offline service integration on e-health providers: A quasi-natural experiment. Production and Operations Management, 30(8), 2359-2378. Huynh-Thu, V. A., Saeys, Y., Wehenkel, L., and Geurts, P. (2012). Statistical interpretation of machine learning-based feature importance scores for biomarker discovery. Bioinformatics, 28(13), 1766-1774. Ilk, N., Shang, G., & Goes, P. (2020). Improving customer routing in contact centers: An automated triage design based on text analytics. Journal of Operations Management, 66(5), 553-577. Imdahl, C., Hoberg, K., & Schmidt, W. (2021). Targeted automation of order decisions using machine learning. Available at SSRN 3822131. James, G., Witten, D., Hastie, T., and Tibshirani, R. (2021). An introduction to statistical learning with applications in R (2nd ed.). Springer, New York. Janitza, S., Celik, E., and Boulesteix, A. (2018). A computationally fast variable importance test for random forests for high-dimensional data. Advances in Data Analysis and Classification. 12(4), 885-915. Kamalzadeh, H., Ahuja, V., Hahsler, M., & Bowen, M. E. (2021). An analytics‐driven approach for optimal individualized diabetes screening. Production and Operations Management, 30(9), 3161-3191. Karlinsky-Shichor, Y., & Netzer, O. (2023). Automating the b2b salesperson pricing decisions: A human-machine hybrid approach. Marketing Science. Keen, P. G. (1980). Decision support systems: a research perspective. In Decision support systems: Issues and challenges: Proceedings of an international task force meeting (pp. 23-44). Kelleher, J. D., Namee, B. M., and D’Arcy, A. (2020). 
Fundamentals of machine learning for predictive data analytics: Algorithms, worked examples, and case studies (2nd ed.). The MIT Press, Cambridge, MA. Kennedy, E. H. (2022). Semiparametric doubly robust targeted double machine learning: A review. arXiv preprint arXiv:2203.06469. Kesavan, S., & Kushwaha, T. (2020). Field experiment on the profit implications of merchants’ discretionary power to override data-driven decision-making tools. Management Science, 66(11), 5182-5190. Keskin, B. B., Bott, G. J., & Freeman, N. K. (2021). Cracking sex trafficking: Data analysis, pattern recognition, and path prediction. Production and Operations Management, 30(4), 1110-1135. Ketokivi, K. and Mantere, S. (2021). What warrants our claims? A methodological evaluation of argument structure. Journal of Operations Management, 67(6): 755-776. Ketzenberg, M. E., Abbey, J. D., Heim, G. R., & Kumar, S. (2020). Assessing customer return behaviors through data analytics. Journal of Operations Management, 66(6), 622-645. Khosrowabadi, N., Hoberg, K., & Imdahl, C. (2022). Evaluating human behaviour in response to AI recommendations for judgemental forecasting. European Journal of Operational Research, 303(3), 1151-1167. Kleinberg, J., Ludwig, J., & Mullainathan, S. (2016). A guide to solving social problems with machine learning. Harvard Business Review, 8, 2. Knaus, M. C. (2021). A double machine learning approach to estimate the effects of musical practice on student’s skills. Journal of the Royal Statistical Society Series A: Statistics in Society, 184(1), 282-300. Knaus, M. C. (2022). Double machine learning-based programme evaluation under unconfoundedness. The Econometrics Journal, 25(3), 602-627. Kuhn, T. (1970). The structure of scientific revolutions (2nd ed.). University of Chicago Press, Chicago. Lainder, A. D., & Wolfinger, R. D. (2022). 
Forecasting with gradient boosted trees: Augmentation, tuning, and cross-validation strategies: Winning solution to the M5 uncertainty competition. International Journal of Forecasting, 38(4), 1426-1433. Lantzy, S., & Anderson, D. (2020). Can consumers use online reviews to avoid unsuitable doctors? Evidence from RateMDs.com and the Federation of State Medical Boards. Decision Sciences, 51(4), 962-984. Lawrence, M., Goodwin, P., O'Connor, M., & Önkal, D. (2006). Judgmental forecasting: A review of progress over the last 25 years. International Journal of Forecasting, 22(3), 493-518. Lee, B. K., Lessler, J., & Stuart, E. A. (2010). Improving propensity score weighting using machine learning. Statistics in Medicine, 29(3), 337-346. Leoni, V. and Nilsson, W. (2021). Dynamic pricing and revenues of Airbnb listings: Estimating heterogeneous causal effects. International Journal of Hospitality Management, 95, 102914. Lepenioti, K., Bousdekis, A., Apostolou, D., & Mentzas, G. (2020). Prescriptive analytics: Literature review and research challenges. International Journal of Information Management, 50, 57-70. Lewis, G., & Syrgkanis, V. (2020). Double/debiased machine learning for dynamic treatment effects via g-estimation. arXiv preprint arXiv:2002.07285. Li, K. C. (1997). Nonlinear confounding in high-dimensional regression. The Annals of Statistics, 25(2), 577-612. Li, M., & Li, T. (2022). AI automation and retailer regret in supply chains. Production and Operations Management, 31(1), 83-97. Lin, M., Lucas Jr., H. C., and Shmueli, G. (2013). Too big to fail: Large samples and the p-value problem. Information Systems Research, 24(4), 906-917. Liu, S., Duffy, A. H., Whitfield, R. I., & Boyle, I. M. (2010). Integration of decision support systems to improve decision support performance. Knowledge and Information Systems, 22, 261-286. Lu, G., Ding. X. D., Peng, D. X., and Chuang. H. H. (2018). 
Addressing endogeneity in operations management research: Recent developments, common problems, and directions for future research. Journal of Operations Management, 64, 53-64. MacKinnon, J. G., Nielsen, M. Ø., & Webb, M. D. (2023). Cluster-robust inference: A guide to empirical practice. Journal of Econometrics, 232(2), 272-299. Mantere, S. and Ketokivi, M. (2013). Reasoning in organization science. Academy of Management Review, 38(1), 70-89. McAlexander, R. J. and Mentch, L. (2020). Predictive inference with random forest: A new perspective on classical analyses. Research and Politics, January-March, 1-7. Mcauliffe, W. H. B. (2015). How did abduction get confused with inference to the best explanation? Transactions to the Charles S. Peirce Society, 51(3), 300-319. Medeiros, M. C. and Mendes, E. F. (2016) L1-regularization of high-dimensional time-series models with non-Gaussian and heteroskedastic errors. Journal of Econometrics, 191(1), 255-271. Meinshausen, N. (2006). Quantile regression forests. Journal of Machine Learning Research, 7, 983-999. Mentch, L. and Hooker, G. (2016) Quantifying uncertainty in random forest via confidence intervals and hypothesis tests. Journal of Machine Learning Research, 17(1), 841-881. Mintzberg, H. (1979). An emerging strategy of” direct” research. Administrative Science Quarterly, 24(4), 582-589. Mohr, L. B. (1982). Explaining organizational behavior. Jossey-Bass, San Francisco, CA. Montoya, R., & Gonzalez, C. (2019). A hidden Markov model to detect on-shelf out-of-stocks using point-of-sale data. Manufacturing & Service Operations Management, 21(4), 932-948. Moodie, E. E., Chakraborty, B., and Kramer, M. S. (2012). Q‐learning for estimating optimal dynamic treatment rules from observational data. Canadian Journal of Statistics, 40(4), 629-645. Moodie, E. E., Dean, N., and Sun, Y. R. (2014). Q-learning: Flexible learning about useful utilities. Statistics in Biosciences, 6(2), 223-243. Mullainathan, S., & Spiess, J. (2017). 
Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106. Nembrini, S. Konig, I. R., and Wright, M. N. (2018). The revival of the Gini importance? Bioinformatics, 34(21), 3711-3718. Nenova, Z. D., & May, J. H. (2016). Determining an optimal hierarchical forecasting model based on the characteristics of the data set. Journal of Operations Management, 44, 62-68. Neslin, S. A., & Schneider Stone, L. G. (1996). Consumer inventory sensitivity and the postpromotion dip. Marketing Letters, 7, 77-94. Norrman, A., & Jansson, U. (2004). Ericsson's proactive supply chain risk management approach after a serious sub‐supplier accident. International Journal of Physical Distribution & Logistics Management, 34(5), 434-456. Oh, HK, Oliva, R. (2021). Better together? How managers can complement algorithms, Working Paper. Oliva, R. (2019). Intervention as a research strategy. Journal of Operations Management, 65(7), 710-724. Oliva, R., & Watson, N. (2009). Managing functional biases in organizational forecasts: A case study of consensus forecasting in supply chain planning. Production and Operations Management, 18(2), 138-151. Oliva, R., Abdulla, H., Gonçalves, P. (2022). Do managers overreact when in backlog? Evidence from a serial supply chain experiment. Manufacturing & Service Operations Management, in press. https://doi.org/10.1287/msom.2021.1072. Oreskes, N., Shrader-Frechette, K., and Belitz, K. (1994). Verification, validation, and confirmation of numerical models in earth science. Science, 263, 641-646. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., ... & Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32. Peirce, C. S. (1878). Deduction, induction, and hypothesis. Popular Science Monthly, 13, 470-482. Peirce, C. S. (1955). Philosophical writings of Peirce. Dover Publications, New York. Peirce, C. S. (1965). 
Collected papers of Charles Sanders Pierce Vol. 5. In Hartshorne C, P Weiss (Eds.), Pragmatism and pragmaticism. Harvard University Press, Cambridge, MA. Poole, M. S., Van de Ven, A. H., Dooley, K., and Holmes, M. E. (2000). Organizational change and innovation processes: Theory and methods for research. Oxford University Press, New York. Prak, D., Teunter, R., & Syntetos, A. (2017). On the calculation of safety stocks when demand is forecasted. European Journal of Operational Research, 256(2), 454-461. Queenan, C., Cameron, K., Snell, A., Smalley, J., & Joglekar, N. (2019). Patient heal thyself: Reducing hospital readmissions with technology‐enabled continuity of care and patient activation. Production and Operations Management, 28(11), 2841-2853. Quenouille, M. H. (1949). Problems in plane sampling. The Annals of Mathematical Statistics, 20(3), 355-375. Rajaram, K., & Tang, C. S. (2001). The impact of product substitution on retail merchandising. European Journal of Operational Research, 135(3), 582-601. Richards, G. (2017). Warehouse management: a complete guide to improving efficiency and minimizing costs in the modern warehouse. Kogan Page Publishers. Rozeboom, W. W. (1997). Good science is abductive, not hypothetico-deductive. In Harlow LL, SA Mulaik, JH Steiger (Eds.), What if there is no significance test? Lawrence Erlbaum Assoc., Mahwah, NJ. Rubin, D. (2005). Causal inference using potential outcomes. Journal of American Statistical Association, 100(469), 322-331. Sachs, A. L., & Minner, S. (2014). The data-driven newsvendor with censored demand observations. International Journal of Production Economics, 149, 28-36. Scornet, E., Biau, G., and Vert, J. P. (2015). Consistency of random forest. The Annals of Statistics, 43, 1716-1741. Semenova, V., Goldman, M., Chernozhukov, V., & Taddy, M. (2023). Inference on heterogeneous treatment effects in high‐dimensional dynamic panels under weak dependence. Quantitative Economics, 14(2), 471-510. 
Shi, Y., Wang, T., & Alwan, L. C. (2020). Analytics for cross‐border e‐commerce: Inventory risk management of an online fashion retailer. Decision Sciences, 51(6), 1347-1376. Shin, H., Collier, D. A., & Wilson, D. D. (2000). Supply management orientation and supplier/buyer performance. Journal of Operations Management, 18(3), 317-333. Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289-310. Shrestha, Y. R., He, V. F., Puranam, P., and von Krogh, G. (2021). Algorithm supported inducting for building theory: How can we use prediction models to theorize? Organization Science, 32(3), 856-880. Silver, M. S. (1991). Decisional guidance for computer-based decision support. MIS Quarterly, 105-122. Simon, H. A. (1981). The sciences of the artificial (2nd ed.). The MIT Press, Cambridge, MA. Singh, J. B., Kumar, M. V., Chandwani, R., and Varkkey, B. (2020). Machine learning and grounded theory: New opportunities for mixed-design research. AMCIS 2020 Proceedings. Song, P., Wang, Q., Liu, H., and Li, Q. (2021). The value of buy-online-and-pickup-in-store in omni-channel: Evidecne from customer usage data. Production and Operations Management, 29(4), 995-110. Sroginis, A., Fildes, R., & Kourentzes, N. (2023). Use of contextual and model-based information in adjusting promotional forecasts. European Journal of Operational Research, 307(3), 1177-1191. Strauss, A. and Corbin, J. M. (1997). Grounded theory in practice. SAGE, Thousand Oaks, CA. Sun, J., Zhang, D. J., Hu, H., & Van Mieghem, J. A. (2022). Predicting human discretion to adjust algorithmic prescription: A large-scale field experiment in warehouse operations. Management Science, 68(2), 846-865. Sun, L., Zheng, X., Jin, Y., Jiang, M., & Wang, H. (2019). Estimating promotion effects using big data: A partially profiled LASSO model with endogeneity correction. Decision Sciences, 50(4), 816-846. Sutton, R. I. and Staw, B. M. (1995). What theory is not. 
Administrative Science Quarterly, 40(3), 371-384. Tan, T. F., & Staats, B. R. (2020). Behavioral drivers of routing decisions: Evidence from restaurant table assignment. Production and Operations Management, 29(4), 1050-1070. Tao, Y. and Wang, L. (2017). Adaptive contrast weighted learning for multi‐stage multi‐treatment decision‐making. Biometrics, 73(1), 145-155. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. Tikhonov, A. N. (1963). Solution of incorrectly formulated problems and the regularization method. Soviet Math, 4, 1035-1038. Van Der Laan, M. J., & Rubin, D. (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1). Van Donselaar, K. H., Gaur, V., Van Woensel, T., Broekmeulen, R. A., & Fransoo, J. C. (2010). Ordering behavior in retail stores and implications for automated replenishment. Management Science, 56(5), 766-784. Vansteelandt, S., & Dukes, O. (2022). Assumption-lean inference for generalised linear model parameters. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(3), 657-685. Wager, S. and Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forest. Journal of the American Statistical Association, 113(523), 1228-1242. Wager, S., Hastie, T., and Efron, B. (2014). Confidence intervals for random forest: The Jackknife and the infinitesimal Jackknife. Journal of Machine Learning Research, 15, 1625-1651. Wagner, H. M. (2002). And then there were none. Operations Research, 50(1), 217-226. Weick, K. E. (1989). Theory construction as disciplined imagination. Academy of Management Review, 14(4), 516-531. Wright, M. N. and Ziegler, A. (2017). ranger: A fast implementation of random forests for high dimensional data in C++ and R. Journal of Statistical Software, 77, 1-17. Yang, J. C., Chuang, H. C., & Kuan, C. M. (2020). 
Double machine learning with gradient boosting and its application to the Big N audit quality effect. Journal of Econometrics, 216(1), 268-283. Zepeda, E. D., Nyaga, G. N., and Young, G. J. (2016). Supply chain risk management and hospital inventory: Effects of system affiliation. Journal of Operations Management, 44, 30-47. Zhang, M., & Luo, L. (2023). Can consumer-posted photos serve as a leading indicator of restaurant survival? Evidence from Yelp. Management Science, 69(1), 25-50. Zhao, Q. and Hastie, T. (2019). Causal interpretations of black-box models. Journal of Business & Economic Statistics, 39(1), 272-281. Zivich, P. N., & Breskin, A. (2021). Machine learning for causal inference: on the use of cross-fit estimators. Epidemiology (Cambridge, Mass.), 32(3), 393.