Query-adaptive Video Summarization via Quality-aware Relevance Estimation

Bibliographic Details
Title: Query-adaptive Video Summarization via Quality-aware Relevance Estimation
Authors: Arun Balajee Vasudevan, Luc Van Gool, Anna Volokitin, Michael Gygli
Contributors: Liu, Qiong; Lienhart, Rainer; Wang, Haohong; Chen, Sheng-Wei Kuan-Ta; Boll, Susanne; Chen, Yi-Ping Phoebe; Friedland, Gerald; Li, Jia; Yan, Shuicheng
Source: ACM Multimedia
Publication Year: 2017
Subject Terms: FOS: Computer and information sciences, Information retrieval, Web search query, Computer Science - Computation and Language, Artificial neural network, cs.MM, Computer science, Computer Vision and Pattern Recognition (cs.CV), Frame (networking), Computer Science - Computer Vision and Pattern Recognition, cs.CL, 020207 software engineering, 02 engineering and technology, Automatic summarization, Multimedia (cs.MM), 0202 electrical engineering, electronic engineering, information engineering, Selection (linguistics), Embedding, 020201 artificial intelligence & image processing, Relevance (information retrieval), Computation and Language (cs.CL), cs.CV, Computer Science - Multimedia
Description: Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied. We address this problem by posing query-relevant summarization as a video frame subset selection problem, which lets us optimise for summaries which are simultaneously diverse, representative of the entire video, and relevant to a text query. We quantify relevance by measuring the distance between frames and queries in a common textual-visual semantic embedding space induced by a neural network. In addition, we extend the model to capture query-independent properties, such as frame quality. We compare our method against previous state of the art on textual-visual embeddings for thumbnail selection and show that our model outperforms them on relevance prediction. Furthermore, we introduce a new dataset, annotated with diversity and query-specific relevance labels. On this dataset, we train and test our complete model for video summarization and show that it outperforms standard baselines such as Maximal Marginal Relevance.
ACM Multimedia 2017
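Illustrative note (not part of the original record): the description names Maximal Marginal Relevance (MMR) as a standard baseline. The sketch below shows how such a baseline could greedily pick frames that are relevant to a query yet not redundant with each other. It is not the authors' model. The function names `cosine` and `mmr_select`, the trade-off weight `lam`, and the toy 2-D vectors are all assumptions made for illustration; in practice the vectors would come from a learned textual-visual embedding.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_select(frame_embs, query_emb, k, lam=0.5):
    """Greedy MMR: pick up to k frame indices.

    Each step chooses the frame maximizing
        lam * relevance(frame, query) - (1 - lam) * max similarity to chosen frames.
    `lam` is a hypothetical trade-off weight (1.0 = relevance only).
    """
    relevance = [cosine(f, query_emb) for f in frame_embs]
    selected = []
    candidates = set(range(len(frame_embs)))

    def score(i):
        # Redundancy is the highest similarity to any already selected frame.
        redundancy = max(
            (cosine(frame_embs[i], frame_embs[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance[i] - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        # Sorting makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy embeddings (assumed): frames 0 and 1 are near-duplicates, frame 2 differs.
frames = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
query = [1.0, 0.2]
summary = mmr_select(frames, query, k=2, lam=0.5)
```

With `lam=0.5` the second pick skips the near-duplicate frame in favour of the dissimilar one; with `lam=1.0` the selection reduces to ranking by relevance alone.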
Language: English
Open Access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7222e8816600880c64867b4c66b06852
http://arxiv.org/abs/1705.00581
Rights: OPEN
Accession Number: edsair.doi.dedup.....7222e8816600880c64867b4c66b06852
Database: OpenAIRE