Title Revisiting the automatic prediction of lexical errors in Mandarin
Author Wan, I-Ping (萬依萍); Allassonnière-Tang, Marc
Contributor Graduate Institute of Linguistics (語言所)
Keywords speech errors; Mandarin; frequency gap; position-in-utterance; Levenshtein edit distance
Date 2024-12
Upload time 29-Apr-2024 13:41:08 (UTC+8)
Abstract Speech errors provide cues for explaining the process of word retrieval. For example, speech errors are less likely to occur with high-frequency words, since these words already receive a high level of activation. The current analysis develops existing findings in two ways. First, instead of considering the overall frequency of words in the entire corpus, we consider the gap in frequency between sequential pairs of words. We hypothesize that speech errors are more likely to occur if the target has a much lower frequency than its preceding word. Second, we use word embedding methods to quantify the semantic distance between sequential pairs of words. We hypothesize that speech errors are more likely to occur with words that have a large semantic distance from their preceding context. We also consider the potential effects of the phonetic distance between sequential pairs of words and of the position of words within utterances. The results from a Mandarin corpus of speech errors show that word frequency and semantic distance between sequential pairs of words can be used to predict the occurrence of speech errors with an accuracy above the majority baseline.
Relation Linguistics Vanguard, Vol.10, No.1, pp.527-535
Type article
DOI https://doi.org/10.1515/lingvan-2023-0036
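
The four predictors named in the abstract (frequency gap, semantic distance, phonetic distance, position-in-utterance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not state how each measure was computed, so the log-frequency difference, the cosine distance over embeddings, the edit distance over transcription strings, and the normalized position are all assumptions made here for illustration. The names `pair_features`, `freq`, `vectors`, and `phones` are hypothetical, as is any data passed to them.

```python
import math


def levenshtein(a, b):
    """Levenshtein edit distance between two sequences (e.g. phoneme strings)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def cosine_distance(u, v):
    """1 - cosine similarity; assumed here as the semantic distance measure."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def pair_features(prev_word, target, index, utterance_length,
                  freq, vectors, phones):
    """Features for one sequential word pair (hypothetical helper).

    freq, vectors and phones are dictionaries keyed by word; their contents
    would come from a corpus and an embedding model, neither specified here.
    """
    return {
        # Positive when the target is rarer than the preceding word.
        "freq_gap": math.log(freq[prev_word]) - math.log(freq[target]),
        "semantic_distance": cosine_distance(vectors[prev_word], vectors[target]),
        "phonetic_distance": levenshtein(phones[prev_word], phones[target]),
        # Position of the target scaled to [0, 1] within the utterance.
        "position": index / (utterance_length - 1) if utterance_length > 1 else 0.0,
    }


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return max(counts.values()) / len(labels)
```

A classifier trained on such features would count as informative, in the sense used in the abstract, only if its accuracy exceeds the value returned by `majority_baseline` for the same labels.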
dc.contributor Graduate Institute of Linguistics (語言所)
dc.creator (Author) Wan, I-Ping (萬依萍)
dc.creator (Author) Wan, I-Ping; Allassonnière-Tang, Marc
dc.date (Date) 2024-12
dc.date.accessioned 29-Apr-2024 13:41:08 (UTC+8)
dc.date.available 29-Apr-2024 13:41:08 (UTC+8)
dc.date.issued (Upload time) 29-Apr-2024 13:41:08 (UTC+8)
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/150959
dc.format.extent 116 bytes
dc.format.mimetype text/html
dc.relation (Relation) Linguistics Vanguard, Vol.10, No.1, pp.527-535
dc.subject (Keywords) speech errors; Mandarin; frequency gap; position-in-utterance; Levenshtein edit distance
dc.title (Title) Revisiting the automatic prediction of lexical errors in Mandarin
dc.type (Type) article
dc.identifier.doi (DOI) 10.1515/lingvan-2023-0036
dc.doi.uri (DOI) https://doi.org/10.1515/lingvan-2023-0036