This paper presents a system for compact and intuitive video summarisation aimed at both high-end professional production environments and small-screen portable devices. To represent large amounts of information in the form of a video key-frame summary, this paper studies the narrative grammar of comics, and using its universal and intuitive rules, lays out visual summaries in an efficient and user-centered way. In addition, the system exploits visual attention modelling and rapid serial visual presentation to generate highly compact summaries on mobile devices. A robust real-time algorithm for key-frame extraction is presented. The system ranks importance of key-frame sizes in the final layout by balancing the dominant visual representability and discovery of unanticipated content utilising a specific cost function and an unsupervised robust spectral clustering technique. A final layout is created using an optimisation algorithm based on dynamic programming. Algorithm efficiency and robustness are demonstrated by comparing the results with a manually labelled ground truth and with optimal panelling solutions.
Copyright © 2007 J. Ćalić and N. W. Campbell. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION
The conventional paradigm to bridge the semantic gap between low-level information extracted from digital videos and the user’s need to meaningfully interact with large multimedia databases in an intuitive way is to learn and model the way different users link perceived stimuli and their meaning [1]. This widespread approach attempts to uncover the underpinning processes of human visual understanding and thus often fails to achieve reliable results, unless it targets a narrow application context or only a certain type of video content. The work presented in this paper makes a shift towards more user-centered summarisation and browsing of large video collections by augmenting the user’s interaction with the content rather than learning the way users create related semantics.
In order to create an effortless and intuitive interaction with the overwhelming extent of information embedded in video archives, we propose two systems for generation of compact video summaries in two different scenarios. The first system targets high-end users such as broadcasting production professionals, exploiting the universally familiar narrative structure of comics to generate easily readable visual summaries. When browsing video archives in a mobile application scenario, the visual summary is generated using a model of human visual attention. The salient information extracted from the attention model is exploited to lay out an optimal presentation of the content on a device with a small display, whether it is a mobile phone, handheld PC, or PDA. Being defined as “spatially juxtaposed images in deliberate sequence intended to convey information”...
Research article: Compact Visualisation of Video Summaries (16 pages). Source: www.cs.bris.ac.uk