dc.contributor | Department of Computer Science | |
dc.creator (Author) | 黃瀚萱 | |
dc.creator (Author) | Huang, Hen-Hsen | |
dc.creator (Author) | Lin, Wei-Rou | |
dc.creator (Author) | Chen, Hsin-Hsi | |
dc.date (Date) | 2020-06 | |
dc.date.accessioned | 4-Jun-2021 14:45:27 (UTC+8) | - |
dc.date.available | 4-Jun-2021 14:45:27 (UTC+8) | - |
dc.date.issued (Upload time) | 4-Jun-2021 14:45:27 (UTC+8) | - |
dc.identifier.uri (URI) | http://nccur.lib.nccu.edu.tw/handle/140.119/135531 | - |
dc.description.abstract (Abstract) | This paper introduces visual story ordering, a challenging task in which images and text are ordered in a visual story jointly. We propose a neural network model based on the reader-processor-writer architecture with a self-attention mechanism. A novel bidirectional decoder is further proposed with bidirectional beam search. Experimental results show the effectiveness of the approach. The information gained from multimodal learning is presented and discussed. We also find that the proposed embedding narrows the distance between images and their corresponding story sentences, even though we do not align the two modalities explicitly. As it addresses a general issue in generative models, the proposed bidirectional inference mechanism applies to a variety of applications. | |
dc.format.extent | 1744655 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.relation (Relation) | Proceedings of the 2020 International Conference on Multimedia Retrieval (ICMR ’20), Association for Computing Machinery, pp.326-330 | |
dc.subject (Keywords) | Multimodal modeling ; temporal information ordering ; sentence ordering ; visual-semantic representation | |
dc.title (Title) | Visual Story Ordering with a Bidirectional Writer | |
dc.type (Resource type) | conference | |
dc.identifier.doi (DOI) | 10.1145/3372278.3390735 | |
dc.doi.uri (DOI) | https://doi.org/10.1145/3372278.3390735 | |