Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/135531
Title: Visual Story Ordering with a Bidirectional Writer
Authors: 黃瀚萱
Huang, Hen-Hsen
Lin, Wei-Rou
Chen, Hsin-Hsi
Contributors: Department of Computer Science (資科系)
Keywords: Multimodal modeling; temporal information ordering; sentence ordering; visual-semantic representation
Date: Jun-2020
Uploaded: 4-Jun-2021
Abstract: This paper introduces visual story ordering, a challenging task in which images and text are ordered in a visual story jointly. We propose a neural network model based on the reader-processor-writer architecture with a self-attention mechanism. A novel bidirectional decoder is further proposed with bidirectional beam search. Experimental results show the effectiveness of the approach. The information gained from multimodal learning is presented and discussed. We also find that the proposed embedding narrows the distance between images and their corresponding story sentences, even though we do not align the two modalities explicitly. As it addresses a general issue in generative models, the proposed bidirectional inference mechanism applies to a variety of applications.
Relation: Proceedings of the 2020 International Conference on Multimedia Retrieval (ICMR ’20), Association for Computing Machinery, pp.326-330
Type: conference
DOI: https://doi.org/10.1145/3372278.3390735
Appears in Collections: Conference Papers (會議論文)

Files in This Item:
File: 289.pdf
Size: 1.7 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.