Title: 基於市場相對價格型態之股價預測模型 (Stock price prediction based on relative price patterns)
Author: Tseng, Yu-Chan (曾祐展)
Advisor: Peng, Yan-Tsung (彭彥璁)
Keywords: Deep Learning; Multi-scale Analysis; Convolutional Neural Networks; Vision Transformer; Time Series
Date: 2025
Upload time: 1-Sep-2025 16:18:22 (UTC+8)

Abstract:
With the explosive growth of financial-market information and increasingly rapid price volatility, investors must analyse massive, complex data within extremely short time frames to formulate effective and timely trading strategies. Although traditional technical analysis can infer market trends via price patterns and technical indicators, it often falls short when confronted with the need for multi-scale data integration and real-time decision-making. This study proposes an innovative intelligent trading system that leverages deep learning to fuse multi-scale candlestick (K-line) charts, thereby enhancing both the accuracy and timeliness of market-trend prediction.

The research first maps price sequences at multiple time resolutions into image representations and employs three classification frameworks to extract features and predict market direction: a traditional linear Logistic Regression (LR) model, a Convolutional Neural Network (CNN), and a Transformer-based Vision Transformer (ViT). To verify the advantage of image-based approaches, raw numerical OHLCV sequences are also fed directly into a Long Short-Term Memory (LSTM) network for comparison, while a dual moving-average crossover strategy serves as a traditional baseline. Two non-overlapping one-year periods are used for out-of-sample evaluation, enabling assessment of model robustness and temporal transferability. Empirical results show that the proposed multi-scale intelligent trading system delivers stable performance across all model architectures, significantly improving predictive accuracy and trading profitability. By successfully integrating multidimensional market information, the system provides investors with a practical and reliable decision-support tool.

Description: Master's thesis
National Chengchi University (國立政治大學)
In-service Master's Program, Department of Computer Science (資訊科學系碩士在職專班)
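The abstract describes mapping price sequences at multiple time resolutions into images. The thesis's own preprocessing code is not reproduced on this record page; as a minimal sketch under assumed, standard conventions, coarser-scale OHLCV bars can be derived from finer ones before each scale is rendered as a candlestick chart (the function name and bar layout below are illustrative, not taken from the thesis):

```python
def aggregate_ohlcv(bars, group_size):
    """Merge consecutive fine-grained OHLCV bars into coarser bars.

    Each bar is a dict with keys: open, high, low, close, volume.
    Standard aggregation: first open, max high, min low, last close,
    summed volume. Illustrative helper, not code from the thesis.
    """
    coarse = []
    # Drop a trailing partial group so every coarse bar covers group_size bars.
    for i in range(0, len(bars) - len(bars) % group_size, group_size):
        chunk = bars[i:i + group_size]
        coarse.append({
            "open": chunk[0]["open"],
            "high": max(b["high"] for b in chunk),
            "low": min(b["low"] for b in chunk),
            "close": chunk[-1]["close"],
            "volume": sum(b["volume"] for b in chunk),
        })
    return coarse

# Example: five daily bars aggregated into one weekly bar.
daily = [
    {"open": 10, "high": 12, "low": 9,  "close": 11, "volume": 100},
    {"open": 11, "high": 13, "low": 10, "close": 12, "volume": 120},
    {"open": 12, "high": 14, "low": 11, "close": 13, "volume": 90},
    {"open": 13, "high": 13, "low": 10, "close": 10, "volume": 110},
    {"open": 10, "high": 11, "low": 8,  "close": 9,  "volume": 80},
]
weekly = aggregate_ohlcv(daily, 5)
```

Running the two scales through the same chart renderer then yields the multi-scale image inputs the abstract refers to.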
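The dual moving-average crossover baseline named in the abstract is the standard rule: signal long when the short-window average crosses above the long-window average, and short (or exit) on the opposite cross. A minimal sketch, with illustrative window lengths (the thesis's exact parameters are not stated on this page):

```python
def sma(prices, window):
    """Simple moving average; None until enough history exists."""
    return [
        sum(prices[i - window + 1:i + 1]) / window if i >= window - 1 else None
        for i in range(len(prices))
    ]

def crossover_signals(prices, short_win, long_win):
    """+1 on a golden cross (short SMA crosses above long SMA),
    -1 on a death cross, 0 otherwise."""
    s, l = sma(prices, short_win), sma(prices, long_win)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        if s[i - 1] is None or l[i - 1] is None:
            continue  # not enough history yet
        if s[i - 1] <= l[i - 1] and s[i] > l[i]:
            signals[i] = 1
        elif s[i - 1] >= l[i - 1] and s[i] < l[i]:
            signals[i] = -1
    return signals

# A downtrend followed by an uptrend triggers one golden cross.
sigs = crossover_signals([5, 4, 3, 2, 3, 4, 5, 6], short_win=2, long_win=4)
```

Because the rule uses only past prices, it makes a fair traditional baseline against the learned models.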
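For the direction-classification setup, each sample is typically labeled by whether the next close rises, and the two non-overlapping one-year test periods imply strictly chronological splits with no shuffling. A sketch under those assumptions (helper names are illustrative; the thesis's exact labeling horizon is not given here):

```python
def direction_labels(closes):
    """Label each bar 1 if the next close is higher, else 0.
    The final bar has no successor and is dropped."""
    return [1 if closes[i + 1] > closes[i] else 0
            for i in range(len(closes) - 1)]

def chronological_splits(samples, test_len):
    """Hold out the last two non-overlapping blocks of test_len samples
    as separate out-of-sample periods; train on everything before them."""
    train = samples[:-2 * test_len]
    test_a = samples[-2 * test_len:-test_len]  # earlier test year
    test_b = samples[-test_len:]               # later test year
    return train, test_a, test_b

labels = direction_labels([100, 102, 102, 105, 101])
train, test_a, test_b = chronological_splits(list(range(10)), test_len=2)
```

Evaluating on two disjoint periods, as the abstract does, separates genuine robustness from luck in a single market regime.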
Student ID: 111971004
Identifier: G0111971004
Source: http://thesis.lib.nccu.edu.tw/record/#G0111971004
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/159292
Type: thesis
Format: application/pdf (4046628 bytes)

Table of Contents:
Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Contents v
List of Figures vi
List of Tables viii
Chapter 1 Introduction 1
  1.1 Research Background 1
  1.2 Research Motivation and Objectives 3
Chapter 2 Literature Review 5
  2.1 Applications of Financial Time-Series Classification 5
  2.2 Research Methods 13
Chapter 3 Methodology 22
  3.1 Dataset 22
  3.2 Data Processing 24
  3.3 Model Architecture 28
  3.4 Experimental Environment and Training Parameters 32
Chapter 4 Experimental Results 34
  4.1 Evaluation Metrics 34
  4.2 Experimental Results 36
  4.3 Trading Performance 45
  4.4 Research Limitations and Future Directions 55
Chapter 5 Conclusion 57
References 59
