Title: 人工智慧輔助之專利分類檢索方法 (AI-assisted Patent Classification and Retrieval Method)
Author: Lin, Yen-Ting (林宴葶)
Advisor: Peng, Yan-Tsung (彭彥璁)
Keywords: International Patent Classification (IPC); Multi-label Classification; Transformer Models; Data Augmentation; Model Interpretability; SHapley Additive exPlanations (SHAP)
Date: 2026
Uploaded: 2-Feb-2026 13:54:17 (UTC+8)

Abstract:
The rapidly growing number of patent applications has created efficiency and consistency challenges for the manually labeled International Patent Classification (IPC) system, and although deep learning models are already widely used for automated classification, the task remains highly difficult for multi-label samples with long-tailed label distributions. This study focuses on the IPC G06 category (Computing; Calculating; Counting), examines the classification difficulties caused by data imbalance and semantic diversity, applies a language-model-based paraphrasing data augmentation strategy, and systematically compares three Transformer-based language models (BERT, T5, and DeepSeek) on Chinese patent multi-label classification. To address the computational limits of large models, DeepSeek is trained with LoRA fine-tuning, 4-bit quantization, and mixed-precision training to improve memory efficiency. Experiments on published Taiwanese Chinese-language patents from 2021 to 2025, centered on 2024 samples and supplemented with data from the other years, show that BERT slightly outperforms DeepSeek in Micro-F1, while both clearly surpass T5. BERT achieves a strong balance between accuracy and efficiency in small-to-medium computing environments, whereas DeepSeek remains practically feasible under hardware constraints. The results confirm the practicality of large language models for patent classification and show that integrating data augmentation with interpretability analysis using SHapley Additive exPlanations (SHAP) enhances model transparency and sensitivity to long-tailed samples, providing practical insights for the development of intelligent patent examination systems.
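The augmentation strategy described in the abstract rewrites existing patent abstracts with a language model so that rare classes gain additional training text. A minimal sketch of that idea, assuming a generic instruction-tuned generator served through the Hugging Face transformers pipeline; the checkpoint, prompt, and sampling settings are illustrative assumptions, not the generator the thesis actually used:

```python
# Paraphrase-based augmentation sketch for long-tail IPC classes.
# Assumption: a small instruction-tuned model (Qwen/Qwen2.5-0.5B-Instruct)
# stands in for whatever paraphrasing model the thesis employed.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

def paraphrase(abstract: str, n: int = 2) -> list[str]:
    """Generate n reworded variants of a patent abstract."""
    prompt = f"請改寫以下專利摘要，保留技術內容：{abstract}\n改寫："
    outputs = generator(prompt, max_new_tokens=128, num_return_sequences=n,
                        do_sample=True, temperature=0.9)
    # The pipeline returns the prompt plus the continuation; keep only the latter.
    return [o["generated_text"][len(prompt):].strip() for o in outputs]

# Augmented copies inherit the original sample's label set, so minority
# classes gain training examples without any manual relabeling.
variants = paraphrase("一種利用卷積神經網路進行瑕疵檢測之方法")
print(variants)
```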
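For the multi-label setup itself, the sketch below shows how a BERT encoder can emit independent per-label decisions and how Micro-F1 pools them, assuming the Hugging Face transformers and scikit-learn APIs. The label set, example text, dummy gold labels, and untrained classification head are placeholders, not the thesis's trained model:

```python
# Multi-label IPC classification sketch: sigmoid per label, Micro-F1 scoring.
import torch
from sklearn.metrics import f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["G06F", "G06N", "G06Q", "G06T", "G06V"]  # illustrative IPC subclasses

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-chinese",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # selects BCE-with-logits loss
)
model.eval()

texts = ["一種基於卷積神經網路之影像瑕疵檢測方法"]
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits

# Multi-label: each label is an independent sigmoid decision, so a patent
# may receive several IPC codes at once (unlike softmax/argmax).
preds = (torch.sigmoid(logits) > 0.5).int().numpy()

# Micro-F1 pools true/false positives over every (sample, label) pair, so
# frequent and rare labels contribute in proportion to their support.
y_true = [[1, 0, 0, 1, 0]]  # dummy gold labels for the single example
print(f1_score(y_true, preds, average="micro", zero_division=0))
```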
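The memory-optimization recipe the abstract attributes to the DeepSeek run (LoRA fine-tuning, 4-bit quantization, mixed-precision training) maps onto the peft and bitsandbytes libraries roughly as follows. The checkpoint name, target modules, and hyperparameters are assumptions for illustration, not the thesis's reported configuration:

```python
# LoRA + 4-bit quantization + mixed precision for a large decoder model.
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForSequenceClassification,
                          BitsAndBytesConfig, TrainingArguments)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,               # weights stored in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)

model = AutoModelForSequenceClassification.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",  # assumed checkpoint
    num_labels=5,
    problem_type="multi_label_classification",
    quantization_config=bnb_config,
    device_map="auto",
)
model.config.pad_token_id = model.config.eos_token_id  # decoder models need a pad token
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trained

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    bf16=True,                      # mixed-precision training
    learning_rate=2e-4,
    num_train_epochs=3,
)
```

Freezing the quantized base weights and training only the small LoRA matrices is what lets a 7B-scale model fit and fine-tune on the resource-constrained hardware the abstract mentions.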
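Finally, the SHAP interpretability analysis can be sketched with the shap library's text explainer over a transformers pipeline. The checkpoint here stands in for the thesis's fine-tuned classifier and is an assumption:

```python
# Token-level attribution sketch with SHAP on a text-classification pipeline.
import shap
from transformers import pipeline

# Placeholder checkpoint; in practice this would be the fine-tuned classifier.
clf = pipeline("text-classification", model="bert-base-chinese", top_k=None)

explainer = shap.Explainer(clf)  # auto-wraps the pipeline with a text masker
shap_values = explainer(["一種基於深度學習之影像辨識方法"])

# Per-token contributions toward each class score: positive values push the
# prediction toward that label, negative values push it away, which is how
# sensitivity to rare (long-tail) classes can be inspected.
shap.plots.text(shap_values)
```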
References:
[1] H.-F. Yu, K. Zhong, I. S. Dhillon, W.-C. Wang, and Y. Yang, “X-BERT: Extreme multi-label text classification using bidirectional encoder representations from transformers,” 2019. [Online]. Available: https://www.amazon.science/publications/x-bert-extreme-multilabel-text-classification-using-bidirectional-encoder-representations-from-transformers
[2] I. Chalkidis, E. Fergadiotis, P. Malakasiotis, and I. Androutsopoulos, “Large-scale multi-label text classification on EU legislation,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, Jul. 2019, pp. 6314–6322. [Online]. Available: https://aclanthology.org/P19-1636/
[3] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for language understanding,” in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2019, pp. 5753–5763. [Online]. Available: https://papers.nips.cc/paper/2019/hash/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html
[4] X. Liu, F. Li, and W. Xiao, “Measuring linguistic complexity in Chinese: An information-theoretic approach,” Humanities and Social Sciences Communications, vol. 11, 2024. [Online]. Available: https://doi.org/10.1057/s41599-024-03510-7
[5] C.-R. Huang and K.-J. Chen, “Issues and topics in Chinese natural language processing,” in Readings in Chinese Natural Language Processing, ser. Journal of Chinese Linguistics Monograph Series, no. 9, C.-R. Huang, K.-J. Chen, and B. K. T’sou, Eds. Journal of Chinese Linguistics, 1996, pp. 1–22. [Online]. Available: https://www.researchgate.net/publication/313641679_Issues_and_topics_in_Chinese_natural_language_processing
[6] A. H. Roudsari, J. Afshar, W. Lee, and S. Lee, “PatentNet: Multi-label classification of patent documents using deep learning based language understanding,” Scientometrics, vol. 127, no. 1, pp. 207–231, 2022. [Online]. Available: https://link.springer.com/article/10.1007/s11192-021-04179-4
[7] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language models are few-shot learners,” in Advances in Neural Information Processing Systems (NeurIPS), 2020. [Online]. Available: https://arxiv.org/abs/2005.14165
[8] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” Journal of Machine Learning Research, vol. 21, no. 140, pp. 1–67, 2020. [Online]. Available: http://jmlr.org/papers/v21/20-074.html
[9] J.-S. Lee and J. Hsiang, “Patent classification by fine-tuning BERT language model,” World Patent Information, vol. 61, p. 101965, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0172219019300742
[10] H. Bekamiri, D. S. Hain, and R. Jurowetzki, “PatentSBERTa: A deep NLP based hybrid model for patent distance and classification using augmented SBERT,” Technological Forecasting and Social Change, vol. 206, p. 123536, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0040162524003329
[11] S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” arXiv preprint arXiv:1705.07874, 2017. [Online]. Available: https://arxiv.org/abs/1705.07874
[12] World Intellectual Property Organization (WIPO), “Guide to the International Patent Classification (IPC),” 2024. [Online]. Available: https://www.wipo.int/classifications/ipc/en/guide/ (accessed 2025-01-13).
[13] Y.-H. Tseng, C.-J. Lin, and Y.-I. Lin, “Text mining techniques for patent analysis,” Information Processing & Management, vol. 43, no. 5, pp. 1216–1247, 2007. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0306457306002020
[14] R. Krestel, R. Chikkamath, C. Hewel, and J. Risch, “A survey on deep learning for patent analysis,” World Patent Information, vol. 65, p. 102035, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S017221902100017X
[15] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, 2017, pp. 5998–6008. [Online]. Available: https://arxiv.org/abs/1706.03762
[16] S. Raschka, “Understanding encoder and decoder architectures in transformers,” 2024. [Online]. Available: https://magazine.sebastianraschka.com/p/understanding-encoder-and-decoder (accessed 2025-07-02).
[17] “深度學習中encoder 與decoder 的區別與聯繫” [Differences and connections between encoders and decoders in deep learning], 2023. [Online]. Available: https://article.baoyueai.com/archives/13363 (accessed 2025-07-02).
[18] DeepSeek-AI, “DeepSeek-LLM: Advancing open-source Chinese large language models,” 2024. [Online]. Available: https://arxiv.org/abs/2401.02954 (accessed 2025-06-26).
[19] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018. [Online]. Available: https://arxiv.org/abs/1810.04805
[20] J. Wei and K. Zou, “EDA: Easy data augmentation techniques for boosting performance on text classification tasks,” arXiv preprint arXiv:1901.11196, 2019. [Online]. Available: https://arxiv.org/abs/1901.11196
[21] R. Sennrich, B. Haddow, and A. Birch, “Improving neural machine translation models with monolingual data,” arXiv preprint arXiv:1511.06709, 2015. [Online]. Available: https://arxiv.org/abs/1511.06709
[22] S. Kobayashi, “Contextual augmentation: Data augmentation by words with paradigmatic relations,” arXiv preprint arXiv:1805.06201, 2018. [Online]. Available: https://arxiv.org/abs/1805.06201
[23] Y. Xu, Y. Lin, Y. Nie, M. Zhou, J. Li, Z. Yang, and J. Zhou, “ChineseBERT: Chinese pretraining enhanced by glyph and pinyin information,” arXiv preprint arXiv:2106.16038, 2021. [Online]. Available: https://arxiv.org/abs/2106.16038
[24] Z. Wang, “Linguistic knowledge in data augmentation for natural language processing: An example on Chinese question matching,” arXiv preprint arXiv:2111.14709, 2021 (v3, 5 Sep 2022). [Online]. Available: https://arxiv.org/abs/2111.14709
[25] K. M. Yoo, D. Park, J. Kang, S.-W. Lee, and W. Park, “GPT3Mix: Leveraging large-scale language models for text augmentation,” arXiv preprint arXiv:2104.08826, 2021. [Online]. Available: https://arxiv.org/abs/2104.08826
[26] C. Whitehouse, M. Choudhury, and A. F. Aji, “LLM-powered data augmentation for enhanced cross-lingual performance,” arXiv preprint arXiv:2305.14288, 2023 (v2, 22 Oct 2023). [Online]. Available: https://arxiv.org/abs/2305.14288
Description: Master's thesis, National Chengchi University, Department of Computer Science (student ID 112971022)
Source: http://thesis.lib.nccu.edu.tw/record/#G0112971022
Type: thesis
Identifier: G0112971022
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/161483

Table of Contents:
Chapter 1: Introduction
  1.1 Research Background
  1.2 Research Motivation and Objectives
Chapter 2: Literature Review
  2.1 Background on IPC and Multi-label Patent Classification
  2.2 Transformer Model Architecture Types and Their Application Differences
  2.3 Data Augmentation Techniques
Chapter 3: Research Methods
  3.1 Research Design Philosophy and Principles
  3.2 Overview of the Research Workflow
Chapter 4: Experimental Design
  4.1 Data Preprocessing and Encoding Strategy
  4.2 Comparison of Semantics-Oriented Data Augmentation Models
  4.3 Dataset Version Design and Data Augmentation Strategies
  4.4 Model Selection and Training Design
  4.5 Model Evaluation and Inference Latency Test Design
  4.6 Model Interpretability Analysis
Chapter 5: Experimental Results and Analysis
  5.1 Model Performance and Resource Comparison
  5.2 Impact of Data Augmentation Strategies
  5.3 SHAP Interpretability Analysis
Chapter 6: Conclusion
  6.1 Research Contributions and Conclusions
  6.2 Suggestions for Future Research
References
