Title: Symptom name recognition in a Mandarin Chinese healthcare dataset (辨識華語健康照護資料集內的症狀)
Author: Hsu, Yi-Chun (徐宜君)
Contributors: Chang, Yu-Yun (張瑜芸); Hsu, Yi-Chun (徐宜君)
Keywords: Biomedical NER; Symptom name recognition; Linguistic analysis
Date: 2024
Uploaded: 4 September 2024, 16:03:17 (UTC+8)
Abstract: With the growing need for biomedical named entity recognition (NER), this study targets the extraction of medical entities from a Mandarin Chinese healthcare dataset, focusing on symptoms because they are comparatively difficult to recognize. The study aims to identify which composition forms of symptoms are difficult to recognize and to examine how four linguistic features (radicals, POS tags, semantic roles, and causative verbs) affect symptom recognition across composition forms. The method consists of revising the original symptom annotations according to their composition forms and applying two models, a conditional random field (CRF) and BERT-CRF, combined with the four features. The results show that symptoms of the composition forms "modifier and head" and "modifier and theme and head" were harder to recognize. Two features, POS tags at the syntactic level and semantic roles at the semantic level, may influence recognition performance. In conclusion, this study provides a linguistic analysis of the symptom recognition task based on the composition forms of symptoms and on features at four linguistic levels, which can be used to evaluate performance on the task. The task may also help improve the efficiency of doctor-patient communication.
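The feature-based setup described in the abstract can be illustrated with a minimal sketch of per-character feature extraction for a CRF tagger. This is not the thesis's implementation: the radical table, the causative-verb list, the semantic-role names, the `SYMP` label, and the function names are all assumptions chosen for illustration.

```python
from typing import Dict, List

# Toy radical lookup (illustrative assumption); a real system would load a full dictionary.
RADICALS: Dict[str, str] = {"我": "戈", "頭": "頁", "很": "彳", "痛": "疒"}
# Hypothetical causative-verb list, for illustration only.
CAUSATIVES = {"讓", "使", "令"}


def char_features(chars: List[str], pos: List[str], roles: List[str], i: int) -> Dict[str, object]:
    """Build one feature dict for the character at position i."""
    feats: Dict[str, object] = {
        "char": chars[i],
        "radical": RADICALS.get(chars[i], "UNK"),
        "pos": pos[i],                        # POS tag of the word containing the character
        "role": roles[i],                     # semantic role of the word (assumed labels)
        "causative": chars[i] in CAUSATIVES,  # True if the character is a listed causative verb
        "BOS": i == 0,
        "EOS": i == len(chars) - 1,
    }
    # Add a one-character context window on each side.
    if i > 0:
        feats["prev_char"] = chars[i - 1]
        feats["prev_pos"] = pos[i - 1]
    if i < len(chars) - 1:
        feats["next_char"] = chars[i + 1]
        feats["next_pos"] = pos[i + 1]
    return feats


def sentence_features(chars: List[str], pos: List[str], roles: List[str]) -> List[Dict[str, object]]:
    """Feature dicts for every character in one sentence."""
    return [char_features(chars, pos, roles, i) for i in range(len(chars))]


# Hypothetical example: tags, roles, and BIO labels are made up for illustration.
chars = ["我", "頭", "很", "痛"]
pos = ["Nh", "Na", "Dfa", "VH"]
roles = ["experiencer", "theme", "degree", "head"]
labels = ["O", "B-SYMP", "I-SYMP", "I-SYMP"]
X = sentence_features(chars, pos, roles)
```

A list of such dicts per sentence, paired with a BIO label sequence, is the input shape accepted by CRF toolkits such as sklearn-crfsuite; whether the thesis used that toolkit is not stated in this record.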
Description: Master's; National Chengchi University; Graduate Institute of Linguistics; 109555003
Source: http://thesis.lib.nccu.edu.tw/record/#G0109555003
Type: thesis
Advisor: Chang, Yu-Yun (張瑜芸)
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/153552
Table of contents:
Chapter 1 Introduction
  1.1 Research background
    1.1.1 Biomedical NER
    1.1.2 Symptom recognition
  1.2 Research motivation and gaps
  1.3 Research purposes and questions
    1.3.1 Research purposes
    1.3.2 Research questions
  1.4 Significance of the research
  1.5 Organization of the research
Chapter 2 Literature Review
  2.1 The ambiguities between symptoms and medical signs, symptoms and diseases
  2.2 The annotation of symptoms in Mandarin Chinese
  2.3 Theoretical frameworks
    2.3.1 Semantic roles
    2.3.2 The connection between radicals and Chinese characters
    2.3.3 Patients’ explanations of symptoms through a causal relationship
    2.3.4 Features for symptom recognition in previous studies
  2.4 Models applied on biomedical NER tasks
Chapter 3 Methodology
  3.1 Dataset
  3.2 Symptom annotation
  3.3 Feature extraction
  3.4 Proposed models
    3.4.1 CRF model
    3.4.2 BERT-CRF model
  3.5 Evaluation metrics
  3.6 Procedures
Chapter 4 Results
  4.1 Model performance
  4.2 The performance of each composition form
  4.3 Symptom recognition using ChatGPT
Chapter 5 Discussion of Error Analysis
  5.1 Error analysis from the perspective of four linguistic features
    5.1.1 An error case of the form “head”
    5.1.2 An error case of the form “modifier and head”
    5.1.3 An error case of the form “theme and head”
    5.1.4 An error case of the form “modifier and theme and head”
    5.1.5 Other potential factors in error cases
  5.2 Limitations
Chapter 6 Conclusions
References
File size: 1287937 bytes
File format: application/pdf