Title | Exploring the Semantic Representations of Text in Subspaces of Latent Space: A Case Study on Color |
Creator | 蕭舜文; 羅永富 (Hsiao, Shun-Wen; Lo, Yung-Fu) |
Contributor | Department of Management Information Systems (資管系) |
Key Words | Latent Space; Semantic Representation; Concept Subspace; Projection Optimization; NLP |
Date | 2024-12 |
Date Issued | 12-Mar-2025 10:22:06 (UTC+8) |
Summary | Language models like BERT have advanced the representation of textual semantics in high-dimensional latent spaces, enabling numerous natural language processing applications. However, their capacity to represent domain-specific concepts, such as "color," remains underexplored. This study investigated the semantic representation of text in a color concept subspace of the latent space. Using BERT-generated embeddings of nearly 1,000 color names from the XKCD color survey, we identified limitations in BERT’s ability to cluster perceptually similar colors. To address this, we proposed a supervised learning approach that projects embeddings into a color-specific subspace, isolating and enhancing color semantics. Qualitative and quantitative evaluations demonstrated the method’s effectiveness in improving semantic clustering. Moreover, the approach is general: beyond the concept of color, it opens the possibility of exploring and disentangling semantic subspaces for other domain-specific concepts, contributing to the understanding and manipulation of latent space structures in language models. |
Relation | Proceedings of the IEEE International Conference on Big Data, IEEE, pp. 8765-8767 |
Type | conference |
DOI | https://doi.org/10.1109/BigData62323.2024.10825707 |
dc.contributor | Department of Management Information Systems (資管系) | - |
dc.creator (Author) | 蕭舜文; 羅永富 | - |
dc.creator (Author) | Hsiao, Shun-Wen; Lo, Yung-Fu | - |
dc.date (Date) | 2024-12 | - |
dc.date.accessioned | 12-Mar-2025 10:22:06 (UTC+8) | - |
dc.date.available | 12-Mar-2025 10:22:06 (UTC+8) | - |
dc.date.issued (Upload time) | 12-Mar-2025 10:22:06 (UTC+8) | - |
dc.identifier.uri (URI) | https://nccur.lib.nccu.edu.tw/handle/140.119/156149 | - |
dc.format.extent | 114 bytes | - |
dc.format.mimetype | text/html | - |
dc.relation (Relation) | Proceedings of the IEEE International Conference on Big Data, IEEE, pp. 8765-8767 | - |
dc.subject (Keywords) | Latent Space; Semantic Representation; Concept Subspace; Projection Optimization; NLP | - |
dc.title (Title) | Exploring the Semantic Representations of Text in Subspaces of Latent Space: A Case Study on Color | - |
dc.type (Type) | conference | - |
dc.identifier.doi (DOI) | 10.1109/BigData62323.2024.10825707 | - |
dc.doi.uri (DOI) | https://doi.org/10.1109/BigData62323.2024.10825707 | - |