Title: 人工智慧輔助之專利分類檢索方法 (AI-assisted Patent Classification and Retrieval Method)
Author: Lin, Yen-Ting (林宴葶)
Advisor: Peng, Yan-Tsung (彭彥璁)
Keywords: International Patent Classification (IPC); Multi-label Classification; Transformer Models; Data Augmentation; Model Interpretability; SHapley Additive exPlanations (SHAP)
Date: 2026
Uploaded: 2-Feb-2026 13:54:17 (UTC+8)
Abstract (Chinese): 隨著專利申請量快速攀升,依賴人工標註的國際專利分類(IPC)系統面臨效率與一致性挑戰。自動化分類雖已廣泛應用深度學習模型,但在多標籤與長尾分布樣本的情境下仍具高度困難。本研究聚焦 IPC G06 類(計算、推算、計數)專利,針對資料不平衡與語意多樣性所造成之分類困難,探討結合語言模型改寫生成的資料擴增策略,並系統性比較三種 Transformer-based 語言模型(BERT、T5、DeepSeek)於中文專利多標籤分類任務的表現。為克服大型模型的運算限制,DeepSeek 採用 LoRA 微調、4-bit 量化與混合精度訓練等記憶體優化技術,使其能在資源受限環境下運行。實驗以 2021 至 2025 年臺灣公開中文專利為資料來源,主要以 2024 年樣本為主體,並搭配歷年資料進行訓練。結果顯示,BERT 在 Micro-F1 上略優於 DeepSeek,兩者均顯著優於 T5。BERT 在中小型運算環境中達到效能與效率的平衡,而 DeepSeek 展現出在硬體受限情境下仍具實務應用上的可行性。綜上所述,本研究不僅驗證了大型語言模型於專利分類的實務可行性,亦透過資料擴增與模型解釋分析(SHAP),強化了模型對長尾樣本的敏感度與可解釋性,對未來智慧審查系統具實際應用價值。
Abstract (English): The growing number of patent applications has created efficiency and consistency challenges for the manually labeled International Patent Classification (IPC) system. This study focuses on the IPC G06 category (Computing, Calculating, Counting) and examines the effects of data imbalance and semantic diversity on classification performance, using a language-model-based paraphrasing data augmentation strategy. Three Transformer-based language models—BERT, T5, and DeepSeek—are compared for Chinese patent multi-label classification. To address the computational limits of large models, DeepSeek employs LoRA fine-tuning, 4-bit quantization, and mixed-precision training to improve memory efficiency. Experiments on 2021–2025 Taiwanese patent data centered on 2024 samples show that BERT slightly outperforms DeepSeek in Micro-F1, while both surpass T5. BERT achieves a strong balance between accuracy and efficiency, whereas DeepSeek demonstrates high feasibility under resource constraints. The results confirm the practicality of large language models for patent classification and show that integrating data augmentation with interpretability analysis using SHapley Additive exPlanations (SHAP) enhances model transparency and sensitivity to long-tailed samples, providing practical insights for the development of intelligent patent examination systems.
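The abstract reports its model comparison in Micro-F1, which pools true positives, false positives, and false negatives across all labels before computing a single F1 score. A minimal sketch of this metric (plain Python with hypothetical IPC-style label sets; not the thesis's actual evaluation code):

```python
# Micro-F1 for multi-label classification: pool TP/FP/FN over all labels,
# then compute one precision/recall/F1. Frequent labels dominate the pooled
# counts, which is why long-tail classes need augmentation to register.

def micro_f1(gold, pred):
    """gold, pred: lists of label sets, one set per document."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but wrong
        fn += len(g - p)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example with illustrative G06 subclass labels:
gold = [{"G06F", "G06N"}, {"G06Q"}, {"G06N", "G06V"}]
pred = [{"G06F"},         {"G06Q"}, {"G06N", "G06F"}]
print(round(micro_f1(gold, pred), 3))  # → 0.667
```

Because every label decision contributes equally to the pooled counts, Micro-F1 rewards accuracy on common subclasses; a Macro-F1 variant, averaging per-label F1 scores, would weight long-tail subclasses more heavily.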
References:
[1] H.-F. Yu, K. Zhong, I. S. Dhillon, W.-C. Wang, and Y. Yang, "X-BERT: Extreme multi-label text classification using bidirectional encoder representations from transformers," 2019. [Online]. Available: https://www.amazon.science/publications/x-bert-extreme-multilabel-text-classification-using-bidirectional-encoder-representations-from-transformers
[2] I. Chalkidis, E. Fergadiotis, P. Malakasiotis, and I. Androutsopoulos, "Large-scale multi-label text classification on EU legislation," in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, Jul. 2019, pp. 6314–6322. [Online]. Available: https://aclanthology.org/P19-1636/
[3] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, "XLNet: Generalized autoregressive pretraining for language understanding," in Advances in Neural Information Processing Systems, 2019, pp. 5753–5763. [Online]. Available: https://papers.nips.cc/paper/2019/hash/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html
[4] X. Liu, F. Li, and W. Xiao, "Measuring linguistic complexity in Chinese: An information-theoretic approach," Humanities and Social Sciences Communications, vol. 11, 2024. [Online]. Available: https://doi.org/10.1057/s41599-024-03510-7
[5] C.-R. Huang and K.-J. Chen, "Issues and topics in Chinese natural language processing," in Readings in Chinese Natural Language Processing, ser. Journal of Chinese Linguistics Monograph Series, no. 9, C.-R. Huang, K.-J. Chen, and B. K. T'sou, Eds., 1996, pp. 1–22. [Online]. Available: https://www.researchgate.net/publication/313641679_Issues_and_topics_in_Chinese_natural_language_processing
[6] A. H. Roudsari, J. Afshar, W. Lee, and S. Lee, "PatentNet: Multi-label classification of patent documents using deep learning based language understanding," Scientometrics, vol. 127, no. 1, pp. 207–231, 2022. [Online]. Available: https://link.springer.com/article/10.1007/s11192-021-04179-4
[7] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, "Language models are few-shot learners," in Advances in Neural Information Processing Systems (NeurIPS), 2020. [Online]. Available: https://arxiv.org/abs/2005.14165
[8] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, "Exploring the limits of transfer learning with a unified text-to-text transformer," Journal of Machine Learning Research, vol. 21, no. 140, pp. 1–67, 2020. [Online]. Available: http://jmlr.org/papers/v21/20-074.html
[9] J.-S. Lee and J. Hsiang, "Patent classification by fine-tuning BERT language model," World Patent Information, vol. 61, p. 101965, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0172219019300742
[10] H. Bekamiri, D. S. Hain, and R. Jurowetzki, "PatentSBERTa: A deep NLP based hybrid model for patent distance and classification using augmented SBERT," Technological Forecasting and Social Change, vol. 206, p. 123536, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0040162524003329
[11] S. M. Lundberg and S.-I. Lee, "A unified approach to interpreting model predictions," arXiv preprint arXiv:1705.07874, 2017. [Online]. Available: https://arxiv.org/abs/1705.07874
[12] World Intellectual Property Organization (WIPO), "Guide to the International Patent Classification (IPC)," 2024, accessed 2025-01-13. [Online]. Available: https://www.wipo.int/classifications/ipc/en/guide/
[13] Y.-H. Tseng, C.-J. Lin, and Y.-I. Lin, "Text mining techniques for patent analysis," Information Processing & Management, vol. 43, no. 5, pp. 1216–1247, 2007. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0306457306002020
[14] R. Krestel, R. Chikkamath, C. Hewel, and J. Risch, "A survey on deep learning for patent analysis," World Patent Information, vol. 65, p. 102035, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S017221902100017X
[15] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems, 2017, pp. 5998–6008. [Online]. Available: https://arxiv.org/abs/1706.03762
[16] S. Raschka, "Understanding encoder and decoder architectures in transformers," 2024, accessed 2025-07-02. [Online]. Available: https://magazine.sebastianraschka.com/p/understanding-encoder-and-decoder
[17] "深度學習中 Encoder 與 Decoder 的區別與聯繫" [The differences and connections between encoders and decoders in deep learning], 2023, accessed 2025-07-02. [Online]. Available: https://article.baoyueai.com/archives/13363
[18] DeepSeek-AI, "DeepSeek-LLM: Advancing open-source Chinese large language models," 2024, accessed 2025-06-26. [Online]. Available: https://arxiv.org/abs/2401.02954
[19] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018. [Online]. Available: https://arxiv.org/abs/1810.04805
[20] J. Wei and K. Zou, "EDA: Easy data augmentation techniques for boosting performance on text classification tasks," arXiv preprint arXiv:1901.11196, 2019. [Online]. Available: https://arxiv.org/abs/1901.11196
[21] R. Sennrich, B. Haddow, and A. Birch, "Improving neural machine translation models with monolingual data," arXiv preprint arXiv:1511.06709, 2015. [Online]. Available: https://arxiv.org/abs/1511.06709
[22] S. Kobayashi, "Contextual augmentation: Data augmentation by words with paradigmatic relations," arXiv preprint arXiv:1805.06201, 2018. [Online]. Available: https://arxiv.org/abs/1805.06201
[23] Y. Xu, Y. Lin, Y. Nie, M. Zhou, J. Li, Z. Yang, and J. Zhou, "ChineseBERT: Chinese pretraining enhanced by glyph and Pinyin information," arXiv preprint arXiv:2106.16038, 2021. [Online]. Available: https://arxiv.org/abs/2106.16038
[24] Z. Wang, "Linguistic knowledge in data augmentation for natural language processing: An example on Chinese question matching," arXiv preprint arXiv:2111.14709, 2021. [Online]. Available: https://arxiv.org/abs/2111.14709
[25] K. M. Yoo, D. Park, J. Kang, S.-W. Lee, and W. Park, "GPT3Mix: Leveraging large-scale language models for text augmentation," arXiv preprint arXiv:2104.08826, 2021. [Online]. Available: https://arxiv.org/abs/2104.08826
[26] C. Whitehouse, M. Choudhury, and A. F. Aji, "LLM-powered data augmentation for enhanced cross-lingual performance," arXiv preprint arXiv:2305.14288, 2023. [Online]. Available: https://arxiv.org/abs/2305.14288
Description: Master's thesis
National Chengchi University
Department of Computer Science
112971022
Source: http://thesis.lib.nccu.edu.tw/record/#G0112971022
Type: thesis
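The SHAP analysis cited in the abstract attributes each prediction to input features via Shapley values. As a self-contained illustration of the underlying idea (a toy scoring function over hypothetical keyword features, not the thesis's pipeline, which applies the `shap` library to Transformer inputs), the exact Shapley value of a feature averages its marginal contribution over all coalitions of the remaining features:

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values: phi_i = sum over coalitions S not containing i of
    |S|! * (n-|S|-1)! / n! * (value(S + {i}) - value(S)).
    Exponential in n, so only feasible for toy-sized feature sets."""
    n = len(features)
    phis = {}
    for i in features:
        others = [f for f in features if f != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += w * (value(set(S) | {i}) - value(set(S)))
        phis[i] = phi
    return phis

# Toy "model": 2 points for 'neural', 1 for 'patent', plus a +1 bonus
# only when both appear together (an interaction term).
def score(present):
    v = 0.0
    if "neural" in present: v += 2.0
    if "patent" in present: v += 1.0
    if {"neural", "patent"} <= present: v += 1.0
    return v

phi = shapley_values(["neural", "patent"], score)
print(phi)  # → {'neural': 2.5, 'patent': 1.5}
```

Note that the attributions sum to `score(all features) - score(empty set)` (the efficiency property), and the interaction bonus is split evenly between the two features; SHAP exploits approximations of this computation to make it tractable for models with many input tokens.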
dc.identifier (Other Identifiers) G0112971022en_US
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/161483-
dc.description.tableofcontents Chapter 1 Introduction: 1.1 Research Background; 1.2 Research Motivation and Objectives. Chapter 2 Literature Review: 2.1 Background on IPC and Multi-label Patent Classification; 2.2 Transformer Architecture Types and Application Differences; 2.3 Data Augmentation Techniques. Chapter 3 Methodology: 3.1 Research Design Philosophy and Principles; 3.2 Research Workflow Overview. Chapter 4 Experimental Design: 4.1 Data Preprocessing and Encoding Strategy; 4.2 Comparative Experiments on Semantics-oriented Data Augmentation Models; 4.3 Dataset Version Design and Data Augmentation Strategy; 4.4 Model Selection and Training Design; 4.5 Model Evaluation and Inference-latency Test Design; 4.6 Model Interpretability Analysis. Chapter 5 Experimental Results and Analysis: 5.1 Model Performance and Resource Comparison; 5.2 Impact of Data Augmentation Strategies; 5.3 SHAP Interpretability Analysis. Chapter 6 Conclusion: 6.1 Research Contributions and Conclusions; 6.2 Suggestions for Future Research. References
dc.format.extent 1972172 bytes
dc.format.mimetype application/pdf