Tuesday, October 23, 2012

Efficient Layout of Comic-Like Video Summaries

In order to represent large amounts of information in the form of a video key-frame summary, this paper studies the narrative grammar of comics and, using its universal and intuitive rules, lays out visual summaries in an efficient and user-centered way. The system ranks importance of key-frame sizes in the final layout by balancing the dominant visual representability and discovery of unanticipated content, utilizing a specific cost function and an unsupervised robust spectral clustering technique. A final layout is created using an optimization algorithm based on dynamic programming. Algorithm efficiency and robustness are demonstrated by comparing the results with the optimal panelling solutions.

Index Terms—Reverse story boarding, video representation, video summarization.

I. INTRODUCTION

In order to enable intuitive access to large image and video archives, the main challenge of systems for video summarization and browsing is to achieve a good balance between removal of redundant sections of video and representative coverage of the video summary. Zhuang et al. [1] proposed an unsupervised clustering method based on HSV color features, where the frame closest to the cluster center is chosen as the key frame representative for a given video shot. Utilizing cluster-validity analysis, Hanjalic and Zhang [2] remove the visual content redundancy among video frames using an unsupervised procedure. An interesting approach introduced by DeMenthon et al. [3] represents the video sequence as a curve in a high-dimensional space, and the summary is represented by the set of salient points on that curve. Recently, Wah et al. [4] exploited a normalized cut algorithm to globally and optimally partition the graph representation into video clusters and describe the evolution and perceptual importance of a video segment.

This work makes a shift towards more user-centered summarization and browsing of large video collections by augmenting interaction rather than learning the way users create related semantics. In order to create an effortless and intuitive interaction with the overwhelming extent of information embedded in video archives, we propose a system that exploits the universally familiar narrative structure of comics to generate easily readable visual summaries. Being defined as “spatially juxtaposed images in deliberate sequence intended to convey information” [5], comics are the most prevalent medium that expresses meaning through a sequence of spatially structured images.
Exploiting this concept, the proposed system follows the narrative structure of comics, linking the temporal flow of video sequence with the spatial...

Manuscript received May 1, 2006; revised September 6, 2006 and November 22, 2006. This work was supported in part by the ICBR project within the 3C Research, Digital Media and Communications Innovation Centre. This paper was recommended by Associate Editor E. Izquierdo. The authors are with the Department of Computer Science, University of Bristol, Bristol BS8 1RZ, U.K. (e-mail: janko@cs.bris.ac.uk). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSVT.2007.897466
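
The introduction describes the approach of Zhuang et al. [1] as clustering frames on color features and taking the frame closest to each cluster center as the key frame. The sketch below illustrates only that selection rule. It is not the method of this paper, nor a reproduction of [1]: the plain k-means loop, the evenly spaced initial centers, and the names `kmeans` and `key_frames` are all assumptions made for illustration, and each frame is assumed to be already reduced to a short feature vector (for example, an HSV histogram).

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(features, centroids):
    """Label each frame with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(f, centroids[c]))
            for f in features]


def kmeans(features, k, iterations=20):
    """Plain k-means. Initial centers are evenly spaced frames (an assumption)."""
    step = len(features) / k
    centroids = [list(features[int(i * step)]) for i in range(k)]
    for _ in range(iterations):
        labels = assign(features, centroids)
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(features, centroids)


def key_frames(features, k):
    """Return, in temporal order, the frame index nearest each cluster center."""
    centroids, labels = kmeans(features, k)
    selected = []
    for c, centre in enumerate(centroids):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            selected.append(min(members, key=lambda i: dist(features[i], centre)))
    return sorted(selected)


# Hypothetical two-dimensional features for six frames forming two visual groups.
frames = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0],
          [5.0, 5.0], [5.1, 5.0], [5.2, 5.0]]
print(key_frames(frames, 2))
```

With these toy features the middle frame of each group is selected, since it lies nearest the group's mean. A real system would compute the feature vectors from decoded frames and would need some way to choose `k`, which this sketch leaves to the caller.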

Website: epubs.surrey.ac.uk | Filesize: -
No of Page(s): 6
