Video Description Using Learning Multiple Features

Xin Xu; Chunping Liu; Haibin Liu; Yi Ji; Zhaohui Wang

doi:10.2991/itim-17.2017.34



Corresponding Author

Xin Xu

Available Online August 2017.

DOI: 10.2991/itim-17.2017.34
Keywords: Video description, SIFT flow, VGG-16, mean pooling, LSTM
Abstract: Generating descriptions for open-domain videos is a major challenge in computer vision because of their complex dynamics. In this paper, we propose a video description model based on multiple features. During encoding, we exploit two complementary features: a spatial feature extracted from the raw frames by a VGG-16 model, and a temporal feature extracted from SIFT flow images by a fine-tuned VGG-16 model. During decoding, we additionally feed in a mean-pooled feature that represents the video as a whole. A two-layer LSTM model then generates the sentence describing the video. We evaluate several variants of our model on the MSVD dataset using the METEOR metric. The experimental results show that our model is beneficial for generating descriptions of videos.
Copyright: © 2017, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
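
The mean pooling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mean_pool`, the toy 4-dimensional vectors, and the choice of plain Python lists are assumptions made for illustration. The abstract does not say which VGG-16 layer supplies the frame features; if a fully connected layer such as fc7 were used, each vector would have 4096 dimensions.

```python
from typing import List, Sequence


def mean_pool(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one video-level vector."""
    if not frame_features:
        raise ValueError("need at least one frame feature vector")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frame feature vectors must have the same length")
    n = len(frame_features)
    # Element-wise mean over the time (frame) axis.
    return [sum(f[i] for f in frame_features) / n for i in range(dim)]


# Hypothetical toy input: three frames, 4-dimensional features each.
frames = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
video_feature = mean_pool(frames)
print(video_feature)  # [2.0, 1.0, 2.0, 2.0]
```

Averaging over frames discards temporal order, which is presumably why the abstract describes this feature as holistic and pairs it with the separate SIFT flow based temporal feature.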


Volume Title: Proceedings of the 2017 International Conference on Information Technology and Intelligent Manufacturing (ITIM 2017)
Series: Advances in Intelligent Systems Research
Publication Date: August 2017
ISBN: 978-94-6252-365-4
ISSN: 1951-6851

Cite this article


TY  - CONF
AU  - Xin Xu
AU  - Chunping Liu
AU  - Haibin Liu
AU  - Yi Ji
AU  - Zhaohui Wang
PY  - 2017/08
DA  - 2017/08
TI  - Video Description Using Learning Multiple Features
BT  - Proceedings of the 2017 International Conference on Information Technology and Intelligent Manufacturing (ITIM 2017)
PB  - Atlantis Press
SP  - 137
EP  - 140
SN  - 1951-6851
UR  - https://doi.org/10.2991/itim-17.2017.34
DO  - 10.2991/itim-17.2017.34
ID  - Xu2017/08
ER  -
