Visual Story Ordering with a Bidirectional Writer | Academic Output | NCCU Academic Hub

Academic Output - Conference Papers

Title	Visual Story Ordering with a Bidirectional Writer
Authors	黃瀚萱; Huang, Hen-Hsen; Lin, Wei-Rou; Chen, Hsin-Hsi
Contributor	Department of Computer Science (資科系)
Keywords	Multimodal modeling ; temporal information ordering ; sentence ordering ; visual-semantic representation
Date	2020-06
Upload time	4-June-2021 14:45:27 (UTC+8)
Abstract	This paper introduces visual story ordering, a challenging task in which images and text are ordered in a visual story jointly. We propose a neural network model based on the reader-processor-writer architecture with a self-attention mechanism. A novel bidirectional decoder is further proposed with bidirectional beam search. Experimental results show the effectiveness of the approach. The information gained from multimodal learning is presented and discussed. We also find that the proposed embedding narrows the distance between images and their corresponding story sentences, even though we do not align the two modalities explicitly. As it addresses a general issue in generative models, the proposed bidirectional inference mechanism applies to a variety of applications.
Relation	Proceedings of the 2020 International Conference on Multimedia Retrieval (ICMR ’20), Association for Computing Machinery, pp.326-330
Type	conference
DOI	https://doi.org/10.1145/3372278.3390735

dc.contributor	Department of Computer Science (資科系)
dc.creator (Author)	黃瀚萱
dc.creator (Author)	Huang, Hen-Hsen
dc.creator (Author)	Lin, Wei-Rou
dc.creator (Author)	Chen, Hsin-Hsi
dc.date (Date)	2020-06
dc.date.accessioned	4-June-2021 14:45:27 (UTC+8)
dc.date.available	4-June-2021 14:45:27 (UTC+8)
dc.date.issued (Upload time)	4-June-2021 14:45:27 (UTC+8)
dc.identifier.uri (URI)	https://ah.lib.nccu.edu.tw/item?item_id=154638
dc.description.abstract (Abstract)	This paper introduces visual story ordering, a challenging task in which images and text are ordered in a visual story jointly. We propose a neural network model based on the reader-processor-writer architecture with a self-attention mechanism. A novel bidirectional decoder is further proposed with bidirectional beam search. Experimental results show the effectiveness of the approach. The information gained from multimodal learning is presented and discussed. We also find that the proposed embedding narrows the distance between images and their corresponding story sentences, even though we do not align the two modalities explicitly. As it addresses a general issue in generative models, the proposed bidirectional inference mechanism applies to a variety of applications.
dc.format.extent	1744655 bytes
dc.format.mimetype	application/pdf
dc.relation (Relation)	Proceedings of the 2020 International Conference on Multimedia Retrieval (ICMR ’20), Association for Computing Machinery, pp.326-330
dc.subject (Keywords)	Multimodal modeling ; temporal information ordering ; sentence ordering ; visual-semantic representation
dc.title (Title)	Visual Story Ordering with a Bidirectional Writer
dc.type (Type)	conference
dc.identifier.doi (DOI)	10.1145/3372278.3390735
dc.doi.uri (DOI)	https://doi.org/10.1145/3372278.3390735