Citation Information

Title 基於Transformer之多任務學習用於清代奏摺斷句斷詞命名實體識別
Text Segmentation and Name Entity Recognition for Memorials from the Qing Dynasty with Transformer-based Multitask Learning
Author 薛卉吟
Xue, Hui-Yin
Contributors 蔡瑞煌; 黃瀚萱 (advisors)
Tsaih, Rua-Huan; Huang, Hen-Hsen (advisors)
薛卉吟 (author)
Xue, Hui-Yin (author)
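Keywords 清代奏摺 (Qing dynasty memorials)
斷詞斷句 (word and sentence segmentation)
命名實體識別 (named entity recognition)
多任務學習 (multitask learning)
自然語言處理 (natural language processing)
機器學習 (machine learning)
古文 (classical Chinese)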
Memorial
Qing Dynasty
Transformer
BERT
Sentence segmentation
Word segmentation
Named entity recognition
Multitask learning
Classical Chinese
NLP
Date 2021
Upload time 1-Dec-2021 14:30:04 (UTC+8)
Abstract Memorials (奏摺) are valuable historical sources for research on policy implementation and the development of legal institutions in the Qing dynasty. Although the Qing palace memorials and Grand Council memorials held by the National Palace Museum have been digitized by image scanning, they are still not widely used in academia. One reason is that historians must spend a great deal of time determining sentence boundaries, word boundaries, and word meanings in classical Chinese. The use of natural language processing (NLP) tools for analyzing classical Chinese remains an emerging topic in the digital humanities community: few useful NLP tools exist for classical Chinese, the performance of artificial intelligence (AI) models varies with the dynasty of the training data, and no suitable NLP tool exists for analyzing Qing memorials. To address these challenges, this study explores transformer-based single-task learning (STL) and multitask learning (MTL) models that address three tasks for classical Chinese, handled simultaneously in the MTL setting: sentence segmentation, word segmentation, and a joint task of part-of-speech (POS) tagging and named entity recognition (NER). The proposed tagging scheme accordingly has three parts: (1) BOE-format tags for sentence segmentation, (2) BIES-format tags for word segmentation, and (3) joint tags for POS and NER. To evaluate the proposal, this study focuses on memorials from the reign of the Yong-zheng (雍正) emperor and builds a dataset of Qing palace memorials annotated by Chinese-language professionals according to the new tagging scheme. The results show that MTL performs significantly better than STL on both sentence segmentation and word segmentation, while the two methods show no significant difference on the joint POS+NER task. The model's sentence segmentation output can help beginners read memorials, and its word segmentation and POS tagging output can help scholars identify word meanings and reduce the chance of misreading them.
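As an illustration of the character-level tagging mentioned in the abstract, the following Python sketch converts a word-segmented sentence into BIES tags and back. BIES is taken in its common sense (B = begin, I = inside, E = end, S = single-character word). The abstract does not define the BOE letters, so the reading used here (B = first character of a sentence, O = other, E = last character) is an assumption, as are all function names and the example phrase, which is not drawn from the thesis dataset.

```python
from typing import List


def bies_tags(words: List[str]) -> List[str]:
    """Return one BIES tag per character of a word-segmented sentence."""
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # skip empty strings defensively
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first char B, last char E, everything between I
            tags.extend(["B"] + ["I"] * (len(word) - 2) + ["E"])
    return tags


def words_from_bies(chars: str, tags: List[str]) -> List[str]:
    """Rebuild the word list from characters and their BIES tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word closes on E or S
            words.append(current)
            current = ""
    if current:  # tolerate a trailing word that was never closed
        words.append(current)
    return words


def boe_tags(sentences: List[str]) -> List[str]:
    """ASSUMED BOE reading: B = first char, E = last char, O = other.

    Tagging a one-character sentence as E is a further assumption.
    """
    tags: List[str] = []
    for sentence in sentences:
        if not sentence:
            continue
        if len(sentence) == 1:
            tags.append("E")
        else:
            tags.extend(["B"] + ["O"] * (len(sentence) - 2) + ["E"])
    return tags


# Hypothetical example phrase (not taken from the thesis dataset).
words = ["軍機處", "奏摺", "至"]
word_tags = bies_tags(words)  # ['B', 'I', 'E', 'B', 'E', 'S']
restored = words_from_bies("".join(words), word_tags)  # same as `words`
```

Because every task assigns exactly one tag per character, the three tag sequences stay aligned with the input text, which is what allows a single model to predict them together.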
Description Master's
National Chengchi University
Department of Management Information Systems
108356036
Source http://thesis.lib.nccu.edu.tw/record/#G0108356036
Type thesis
dc.contributor.advisor 蔡瑞煌; 黃瀚萱
dc.contributor.advisor Tsaih, Rua-Huan; Huang, Hen-Hsen
dc.identifier (Other Identifiers) G0108356036
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/138003
dc.description.tableofcontents 1 INTRODUCTION
2 PREVIOUS WORKS
2.1 Qing Palace Memorials of National Palace Museum
2.2 Chinese Text Classification Tasks
2.3 Bidirectional Encoder Representations from Transformers
2.4 RNN-based Multi-Task Learning
2.5 Bidirectional Gated Recurrent Unit
3 EXPERIMENT DESIGN
3.1 Models
3.2 Input X
3.3 Output Tags
3.3.1 Sentence Segmentation Tags
3.3.2 Word Segmentation Tags
3.3.3 Joint Tags of POS and NER
3.3.4 Example
3.4 Dataset
3.4.1 Data Collection for Qing’s Dataset
3.4.2 Data Labeling for the Qing’s Dataset
3.4.3 Statistical Description of the Qing’s Dataset
3.5 Experiment Environment
3.7 Evaluation
4 EXPERIMENTS
4.1 Preprocessing
4.2 Training
4.3 Evaluation
4.4 Comparisons
4.4.1 Residual Connection
4.4.2 Compare with Other Models
4.4.3 Compare with Other Chinese NLP Tools
4.4.4 Different Tagging Scheme of POS+NER
4.4.5 Different Granularity of Word Segmentation
4.5 Discussion
5 CONCLUSION
REFERENCES
APPENDIX
Chinese Version of Interview and Feedback
dc.format.extent 1466792 bytes
dc.format.mimetype application/pdf
dc.identifier.doi (DOI) 10.6814/NCCU202101726