Title: Exploring the Semantic Representations of Text in Subspaces of Latent Space: A Case Study on Color
Creator: Hsiao, Shun-Wen (蕭舜文); Lo, Yung-Fu (羅永富)
Contributor: Department of Management Information Systems (資管系)
Key Words: Latent Space; Semantic Representation; Concept Subspace; Projection Optimization; NLP
Date: 2024-12
Date Issued: 12-Mar-2025 10:22:06 (UTC+8)
Summary: Language models like BERT have advanced the representation of textual semantics in high-dimensional latent spaces, enabling numerous natural language processing applications. However, their capacity to represent domain-specific concepts, such as "color," remains underexplored. This study investigated the semantic representation of text in the color concept subspace of the latent space. Using BERT-generated embeddings of nearly 1,000 color names from the XKCD color survey, we identified limitations in BERT’s ability to cluster perceptually similar colors. To address this, we proposed a supervised learning approach that projects embeddings into a color-specific subspace, isolating and enhancing color semantics. Qualitative and quantitative evaluations demonstrated that the method improves semantic clustering. Moreover, the approach is general: beyond color, it offers a way to explore and disentangle semantic subspaces for other domain-specific concepts, contributing to the understanding and manipulation of latent space structures in language models.
Relation: Proceedings of the IEEE International Conference on Big Data, IEEE, pp. 8765-8767
Type: conference
DOI: https://doi.org/10.1109/BigData62323.2024.10825707
dc.contributor: Department of Management Information Systems (資管系)
dc.creator (author): 蕭舜文; 羅永富
dc.creator (author): Hsiao, Shun-Wen; Lo, Yung-Fu
dc.date (date): 2024-12
dc.date.accessioned: 12-Mar-2025 10:22:06 (UTC+8)
dc.date.available: 12-Mar-2025 10:22:06 (UTC+8)
dc.date.issued (upload time): 12-Mar-2025 10:22:06 (UTC+8)
dc.identifier.uri (URI): https://nccur.lib.nccu.edu.tw/handle/140.119/156149
dc.format.extent: 114 bytes
dc.format.mimetype: text/html
dc.relation (relation): Proceedings of the IEEE International Conference on Big Data, IEEE, pp. 8765-8767
dc.subject (keywords): Latent Space; Semantic Representation; Concept Subspace; Projection Optimization; NLP
dc.title (title): Exploring the Semantic Representations of Text in Subspaces of Latent Space: A Case Study on Color
dc.type (type): conference
dc.identifier.doi (DOI): 10.1109/BigData62323.2024.10825707
dc.doi.uri (DOI): https://doi.org/10.1109/BigData62323.2024.10825707
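
Illustrative sketch (not the authors' implementation): the summary describes a supervised projection of text embeddings into a color-specific subspace, but this record gives no model details. The snippet below shows one simple way such a projection could be learned, assuming a linear map fitted by ridge regression onto RGB targets. The synthetic "embeddings" stand in for BERT outputs, and every name here (make_synthetic_data, fit_projection, the dimensions, the noise level) is hypothetical.

```python
import numpy as np


def make_synthetic_data(n=400, d=32, noise=0.05, seed=0):
    """Build toy embeddings that mix a 3-D color signal with unrelated factors.

    Assumption: real inputs would be BERT embeddings of color names paired
    with RGB values; here both are simulated.
    """
    rng = np.random.default_rng(seed)
    colors = rng.uniform(0.0, 1.0, size=(n, 3))   # RGB targets in [0, 1]
    other = rng.normal(size=(n, 5))               # non-color semantics
    color_mix = rng.normal(size=(3, d))           # how color enters the embedding
    other_mix = rng.normal(size=(5, d))           # how other factors enter
    embeddings = (
        colors @ color_mix
        + other @ other_mix
        + noise * rng.normal(size=(n, d))
    )
    return embeddings, colors


def fit_projection(embeddings, targets, ridge=1e-3):
    """Fit W (d x k) minimizing ||embeddings @ W - targets||^2 + ridge * ||W||^2."""
    d = embeddings.shape[1]
    gram = embeddings.T @ embeddings + ridge * np.eye(d)
    return np.linalg.solve(gram, embeddings.T @ targets)


embeddings, colors = make_synthetic_data()
train, test = slice(0, 300), slice(300, 400)

# Learn the projection on the training split only.
W = fit_projection(embeddings[train], colors[train])

# Project held-out embeddings into the 3-D color subspace.
projected = embeddings[test] @ W
mse = float(np.mean((projected - colors[test]) ** 2))

# Baseline: always predict the mean training color.
baseline = float(np.mean((colors[test] - colors[train].mean(axis=0)) ** 2))
print(f"held-out MSE in color subspace: {mse:.5f} (mean-color baseline: {baseline:.5f})")
```

In this toy setting the projected points can be compared directly with RGB distances, which is the kind of check one might use to see whether perceptually similar colors end up close together; the paper's own objective and evaluation may differ.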