Multi-View Video Summarization Using Bipartite Matching Constrained Optimum-Path Forest Clustering


1166 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 17, NO. 8, AUGUST 2015

Sanjay K. Kuanar, Kunal B. Ranga, and Ananda S. Chowdhury, Member, IEEE

Abstract—The task of multi-view video summarization is to efficiently represent the most significant information from a set of videos captured for a certain period of time by multiple cameras. The problem is highly challenging because of the huge size of the data, the presence of many unimportant frames with low activity, inter-view dependencies, and significant variations in illumination. In this paper, we propose a graph-theoretic solution to the above problems. A semantic feature in the form of a visual bag of words and visual features like color, texture, and shape are used to model shot-representative frames after temporal segmentation. Gaussian entropy is then applied to filter out frames with low activity. Inter-view dependencies are captured via bipartite graph matching. Finally, the optimum-path forest algorithm is applied for the clustering purpose. Subjective as well as objective evaluations clearly indicate the effectiveness of the proposed approach.

Index Terms—Bipartite matching, Gaussian entropy, multi-view video summarization, optimum-path forest, visual bag of words.

I. INTRODUCTION

Increasing demand for security and traffic monitoring in recent years has led to the deployment of multiple video cameras with overlapping fields of view at public places like banks, ATMs, and road junctions. The surveillance/monitoring systems simultaneously record a set of videos capturing various events. Multi-view video summarization techniques can be applied for obtaining significant information from these videos in a short time [4], [16].
Application areas in which multi-view video summarization can be of immense help include investigative analysis of post-accident scenarios, close scrutiny of traffic patterns, and prompt recognition of suspicious events and activities like theft and robbery at public places. Many works on the summarization of monocular (single-view) videos can be found in the literature [1], [3], [5]–[13]. However, multi-view video summarization poses certain challenges distinct from the mono-view case. The size of the multi-view video data collected by a surveillance camera system for even a few hours can be very large. Moreover, since these videos are captured by fixed camera systems, much of the recorded content is uninteresting, which makes useful information extraction more difficult [2]. Thirdly, since all cameras capture the same scene from different viewpoints, these videos have a large amount of (inter-view) statistical dependency [4]. So, correlations among videos captured from multiple views need to be properly modeled for obtaining an informative and compact summary. Finally, the individual views can suffer from significant variations in illumination. So, mono-view video summarization approaches may not necessarily work well for the multi-view problem [4], [16], [41], [42].

Manuscript received March 21, 2015; revised May 23, 2015; accepted May 30, 2015. Date of publication June 10, 2015; date of current version July 15, 2015. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Changsheng Xu. The authors are with the Department of Electronics and Telecommunications Engineering, Jadavpur University, Kolkata, India (e-mail: sanjay.kuanar@gmail.com; kunalranga@gmail.com; aschowdhury@etce.jdvu.ac.in). Color versions of one or more of the figures in this paper are available online.
In this paper, we propose a graph-based solution for multi-view video summarization where correlations among different views are established via bipartite graph matching [43], [44] and clustering of the high-dimensional video data is performed using the optimum-path forest (OPF) algorithm [34], [35]. After temporal segmentation, semantic as well as visual features (e.g., color, texture, and shape) are used for modeling shot-representative frames. Gaussian entropy is used to filter out frames with low activity.

II. RELATED WORK

We start this section with some representative mono-view video summarization methods. For a comprehensive review of this subject area, please see [1], [5]. Jiang et al. [6] developed an automatic video summarization algorithm following a set of consumer-oriented guidelines as high-level semantic rules. Hannon et al. [7] used time-stamped opinions from social networking sites for generating summaries of soccer matches. Semantically important information from a set of user-inputted keyframes was used by Han et al. [8] for producing video summaries. A generic framework of a user attention model through multiple sensory perceptions was employed by Ma et al. [9]. Since our proposed method is based on graph algorithms, we now mention some graph-based summarization approaches. Dynamic Delaunay clustering with information-theoretic pre-sampling was proposed by Kuanar et al. [11]. Lu et al. [12] developed a graph optimization method that computes an optimal video skim in each scene via dynamic programming. In another approach, Peng et al. [13] showed that highlighted events can be detected using an effective similarity metric for video clips. Temporal graph analysis was carried out by Ngo et al. [14] to effectively encapsulate information for video summarization.

Though several works can be found in the area of mono-view video summarization, very little work has thus far been reported on summarizing multi-view videos. The paucity of research papers in this highly important area has primarily motivated us to investigate the problem. Fu et al. [4] presented a multi-view video summarization technique using random walks applied on spatio-temporal shot graphs. A hypergraph-based representation is used for capturing correlations among different views. Clique expansion is then used to convert the hypergraph into the spatio-temporal graph. Random walks are applied on the spatio-temporal graph for the final clustering. The authors in [4] have mentioned that the graph building process is intrinsically complex and consumes most of the overall processing time. Moreover, short walks may end up in local network neighborhoods [15], producing erroneous clustering results. Li et al. [16] presented another method for abstracting multiple key frames from video datasets using a Support Vector Machine (SVM) and rough sets. However, the performance of an SVM depends on the choice of the kernel and its design parameters [17]. Recently, Ou et al. [42] proposed a low-complexity online multi-view video summarization method running on wireless video sensors to save compression and transmission power while keeping critical information.

We now highlight the contributions of the current work from the viewpoints of both multimedia and graph-theoretic pattern clustering. From the multimedia standpoint, our method produces a more accurate multi-view summary compared to [4], [16], [42] and a faster summary compared to [4]. The salient features of the work are stated below.

1. We use a novel combination of features, namely, color, texture, visual bag of words, and Tamura.
While color and texture comprise the regular visual features, the visual bag of words helps in handling different lighting conditions and the three Tamura features help in handling orientations. This choice of features, which improves the final clustering process, has not been reported in any similar work.

2. In sharp contrast to [4], we develop a compact spatio-temporal graph on which the clustering algorithm is to be applied. Specifically, intra-view redundancy removal, which is equivalent to order reduction of the above spatio-temporal graph, is achieved through Gaussian entropy.

3. Unlike the related existing approaches, we capture the correlations among multiple views in a more accurate and efficient manner using Maximum Cardinality Minimum Weight (MCMW) bipartite matching [43], [44].

4. The unsupervised Optimum-Path Forest (OPF) algorithm [34] is used for the first time in the field of multimedia for rapid clustering of high-volume and somewhat high-dimensional data, obviating the need for any dimensionality reduction technique. To further improve the performance of the OPF algorithm, the match sets from the MCMW matching are imposed as constraints in its input adjacency matrix.

From the point of view of graph-theoretic pattern clustering, MCMW bipartite matching-constrained OPF has, to the best of our knowledge, not been applied before.

III. PROPOSED METHOD

In this section, we provide a detailed description of our method. In Fig. 1, the four main components of our method, namely, video pre-processing, unimportant frame elimination, multi-view correlation, and shot clustering, are shown.

Fig. 1. Flowchart showing various components of our method.

A. Video Preprocessing

The preprocessing part of a video consists of two steps, namely, (i) shot detection and representation, and (ii) feature extraction. These steps are described below.

1) Shot Detection and Representation: Shot boundary detection or temporal segmentation [18]–[21] is carried out first.
To parse the multiple views in our problem, we apply a motion-based shot boundary detection method [22]. Various schemes exist to represent the detected shots using a single key frame or a set of key frames [23]–[25]. However, for large datasets, like surveillance videos, which contain a large number of shots, comparing every pair of frames within a shot becomes computationally prohibitive. Moreover, many shots from the multi-view videos are static in nature with no camera motion. Hence, for representing a shot, we use the middle frame, as it captures the general view of the overall shot content [23].

2) Feature Extraction: We employ both visual and semantic features for content modeling. Within the visual category, color, edge, and shape features are used. We obtain the color histogram in the HSV color space (16 ranges of H, 4 ranges of S, and 4 ranges of V) because it is found to be more resilient to noise and also robust to small changes of the camera position [7], [26]. Since the global color histogram alone is incapable of preserving spatial information, we use texture features as well. These texture features are extracted using the edge histogram descriptor [27]. A video frame is first sub-divided into 16 blocks. Then the local edge histograms for each of these blocks are obtained. Edges are broadly grouped into five bins: vertical, horizontal, 45° diagonal, 135° diagonal, and isotropic. So, the texture of a frame is represented by an 80-dimensional (16 blocks × 5 bins) feature vector. In addition to the color and edge information, we extract three pixel-level Tamura features [28], as they correlate very strongly with human perception. These three features denote coarseness, contrast, and directionality for the neighborhood of the pixels.

In a multi-view video, the representative frame of a shot from one view and that from the adjoining shot of a different view can be considered as partially rotated versions in which the other visual features remain the same. In order to distinguish such frames, Tamura features are used. Multi-view videos are often found to suffer from significant variations in illumination. In such cases, a single event simultaneously captured by different views will be erroneously treated as different events, since the views are visually dissimilar. To deal with this type of situation, we also consider semantic features between the shots. An event is modeled as a semantic object. Semantic similarity between documents is addressed using the Bag of Words model [29]. To capture the semantic similarity of the events in the multi-view videos, we apply the visual bag of words (BoVW) model to our problem [30], [31]. Visual words are obtained by applying K-means clustering on SIFT features [32] extracted from all the shot-representative frames. Each visual word is represented by a cluster. The ith visual word appears in a shot if there exist some SIFT feature points of the shot-representative frame within the ith cluster. A shot is represented by a vector (w1, ..., wK), where wi represents the normalized frequency of the ith visual word and K is the total number of visual words/clusters. Hence, we define a vocabulary as a set of centroids, where every centroid represents a word. We consider 500,000 SIFT features for this work and group them into 100 or 1000 clusters based on the duration and heterogeneity of the videos. After combining all the features, a frame is eventually represented by a 439-dimensional feature vector (256 for color, 80 for texture, 3 for Tamura, and 100 for the visual bag of words).
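To make the feature-modeling step concrete, the sketch below assembles a 439-dimensional descriptor for one synthetic frame. The 16×4×4 HSV split, the gradient-based edge binning, the random "vocabulary", and the zeroed Tamura component are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def color_hist(hsv):
    # 16x4x4 HSV quantization -> 256-bin normalized histogram (assumed split).
    h, _ = np.histogramdd(hsv.reshape(-1, 3), bins=(16, 4, 4),
                          range=[(0, 360), (0, 1), (0, 1)])
    h = h.ravel()
    return h / max(h.sum(), 1)

def edge_hist(gray, grid=4):
    # 16 blocks x 5 orientation bins = 80-D edge histogram descriptor.
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180
    bins = np.full(gray.shape, 4)                      # isotropic by default
    strong = mag > mag.mean()
    bins[strong & (np.abs(ang - 90) < 22.5)] = 0       # vertical
    bins[strong & ((ang < 22.5) | (ang > 157.5))] = 1  # horizontal
    bins[strong & (np.abs(ang - 45) < 22.5)] = 2       # 45-degree diagonal
    bins[strong & (np.abs(ang - 135) < 22.5)] = 3      # 135-degree diagonal
    bh, bw = gray.shape[0] // grid, gray.shape[1] // grid
    out = []
    for i in range(grid):
        for j in range(grid):
            blk = bins[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].ravel()
            h = np.bincount(blk, minlength=5).astype(float)
            out.append(h / max(h.sum(), 1))
    return np.concatenate(out)

def bovw_hist(desc, vocab):
    # Nearest visual word per SIFT-like descriptor, normalized word counts.
    d = np.linalg.norm(desc[:, None, :] - vocab[None, :, :], axis=2)
    h = np.bincount(d.argmin(axis=1), minlength=len(vocab)).astype(float)
    return h / max(h.sum(), 1)

# Synthetic stand-ins for one shot-representative frame and its descriptors.
hsv = np.dstack([rng.random((64, 64)) * 360,
                 rng.random((64, 64)), rng.random((64, 64))])
gray = rng.random((64, 64))
vocab = rng.random((100, 128))   # stand-in for K-means centroids over SIFT
bovw = bovw_hist(rng.random((150, 128)), vocab)
tamura = np.zeros(3)             # coarseness, contrast, directionality (stub)

frame_vector = np.concatenate([color_hist(hsv), edge_hist(gray), tamura, bovw])
```

The four pieces concatenate to 256 + 80 + 3 + 100 = 439 dimensions, matching the descriptor length used throughout the paper.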
B. Unimportant Shot Elimination Using Gaussian Entropy

A careful scrutiny of the multi-view videos reveals a lot of redundancy in the contents of the individual views, as they are mostly captured by static cameras. In this step, we remove the unimportant (low- or no-activity) shots using the Gaussian entropy fusion model [4]. The importance of a shot is expressed as the interaction of its feature sets, as in (1) and (2), where I(s) and I(f) denote the information content of a shot and that of a set of features, respectively. A well-known measure of information content is entropy. We accordingly modify equation (2) into (3), where f_k^i is the kth feature set for the ith shot-representative frame and H(f_k^i) denotes the entropy of that feature set. The entropy of a shot is calculated by taking into consideration the color, texture, shape, and BoVW features. Hence, the number of feature sets F is in our case set to 4. The Gaussian entropy of a shot is then expressed as in (4).

Fig. 2. Gaussian entropy scores for frames in the Office1 video. Blue dots show unimportant (low-activity) frames with low scores and green dots show important (high-activity) frames with high scores.

The representative frames of the various shots are sorted in ascending order of Gaussian entropy. We then eliminate those shots whose representative frames have entropy scores below a certain threshold, chosen experimentally for a given dataset. In Fig. 2, we show the distribution of the frames according to their Gaussian entropy scores for view 1 of the Office1 video. Here the lowest entropy value was found to be 7.20, and the threshold was set experimentally.

C. Multi-View Correlation Using Bipartite Matching

The use of bipartite matching for solving correspondence problems in various contexts, such as shape matching and object recognition, is well known in the area of computer vision [43], [44]. The general underlying principle is to model the correspondence problem as an optimal assignment problem.
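Stepping back to the filtering stage of Section III-B, a minimal sketch of entropy-based shot elimination is shown below. The simple sum fusion of per-feature-set entropies is a simplifying assumption for illustration; the paper's Gaussian entropy fusion (1)–(4) may weight the feature sets differently.

```python
import math

def shannon_entropy(hist):
    # Entropy of a normalized feature histogram (0 log 0 := 0).
    return -sum(p * math.log(p) for p in hist if p > 0)

def shot_score(feature_sets):
    # Simplified fusion: sum the entropies of the F feature sets
    # (color, texture, shape/Tamura, BoVW -> F = 4 in the paper).
    return sum(shannon_entropy(h) for h in feature_sets)

def filter_shots(shots, threshold):
    # Keep only shots whose fused entropy score clears the threshold.
    return [name for name, feats in shots if shot_score(feats) >= threshold]

flat = [0.25, 0.25, 0.25, 0.25]      # high-activity (high-entropy) stand-in
peaked = [0.97, 0.01, 0.01, 0.01]    # low-activity (low-entropy) stand-in
shots = [("busy", [flat] * 4), ("static", [peaked] * 4)]
kept = filter_shots(shots, threshold=1.0)  # "static" falls below threshold
```

As in the paper, low-activity shots concentrate their feature histograms into few bins, score low, and are dropped before graph construction.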
For this work, the problem of finding the correlation among multiple views is modeled as a Maximum Cardinality Minimum Weight (MCMW) matching problem in bipartite graphs. A key advantage of using bipartite matching is its inherent scalability, which arises from its polynomial (cubic) time complexity [10]. We assume that the similarity between shots across multiple views can be measured through the similarity of the key frames extracted from the corresponding shots. Let us represent any two overlapping views of a multi-view video by V_a and V_b, with n_a and n_b as their respective numbers of shot-representative frames (each frame being a point in the 439-dimensional feature space). So, we can write V_a = {f_1^a, ..., f_{n_a}^a} and V_b = {f_1^b, ..., f_{n_b}^b}, where f_i^a represents the ith shot-representative frame in view a. To capture the similarity between these views, we construct a bipartite graph with two disjoint vertex sets representing the two views. The vertices of the graph are the shot-representative frames of the views which pass through the Gaussian entropy based filtering stage. The edge set is denoted by E, where e_ij denotes the edge connecting the vertices f_i^a and f_j^b. The edge weight between f_i^a and f_j^b is computed as the Euclidean distance between these two points in the 439-dimensional space. After applying the MCMW algorithm, we obtain a match set whose elements

indicate the actual key frame correspondences between the two views V_a and V_b. The time complexity of the MCMW algorithm is O(n^3), where n = max(n_a, n_b). We can apply this MCMW bipartite matching algorithm between every pair of views and obtain similar correspondences. So, for N views, we need to apply the MCMW bipartite matching algorithm N(N-1)/2 times. The overall complexity of the matching process is N(N-1)/2 times O(n^3), which remains practical since N is typically small (e.g., N = 4 for the Office1 dataset).

D. Shot Clustering by OPF

We apply the OPF algorithm in an unsupervised manner [34] for clustering the key frames. We choose this method for its rapid clustering of high-volume and somewhat high-dimensional data, obviating the need for any dimensionality reduction technique. The method is also not constrained by the number or the form of the clusters [35], [36]. OPF is based on the Image Foresting Transform (IFT) [33]. An appropriate graph is constructed on which OPF is to be applied. The node set contains the shot-representative frames retained after the Gaussian entropy based filtering. For every frame there is a feature vector. Node pairs appearing in any of the N(N-1)/2 MCMW bipartite match sets (for N views) form the edges of the graph. So, bipartite matching is used to refine the adjacency relation for OPF in the form of the following constraint: a node pair (s, t) is adjacent if (s, t) appears in some MCMW match set, and non-adjacent otherwise (5). Let d(s, t) be the adjacent edge weight, i.e., the distance between two adjacent frames s and t in the feature space (6). The graph also has node weights in the form of a probability density function (pdf) that can characterize the relevant clusters. The pdf can be estimated using a Parzen-window [36] approach with a Gaussian kernel,

rho(s) = (1 / (sqrt(2*pi*sigma^2) * |A(s)|)) * sum_{t in A(s)} exp(-d(s, t)^2 / (2*sigma^2))  (7)

where A(s) is the adjacency set of node s, sigma = d_f / 3 [34], and d_f is the maximum arc weight in the graph. This choice of sigma considers all adjacent nodes for density computation, since a Gaussian function covers most samples within 3*sigma.
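The MCMW matching that supplies these match sets can be sketched with SciPy's Hungarian-method solver, which returns exactly a maximum-cardinality minimum-weight assignment on a rectangular cost matrix. The frame descriptors here are random stand-ins for the 439-D vectors of two views.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(1)
# Stand-ins for the 439-D shot-representative frames of two views
# (5 surviving frames in view A, 7 in view B, after entropy filtering).
view_a = rng.random((5, 439))
view_b = rng.random((7, 439))

# Edge weights: Euclidean distances between every cross-view frame pair.
cost = cdist(view_a, view_b)

# Hungarian method: maximum-cardinality minimum-weight bipartite matching,
# solved in polynomial (cubic) time.
rows, cols = linear_sum_assignment(cost)
match_set = list(zip(rows.tolist(), cols.tolist()))
```

Every frame of the smaller side is matched to a distinct frame of the larger side. For N views the pairwise matching runs N(N-1)/2 times; with N = 4 that is 6 runs.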
For the clustering process, we require the concept of a path and a connectivity function denoting the strength of association of any node in the path to its root. A path pi_t is a sequence of adjacent nodes starting from a root and ending at a node t, with <t> being a trivial path and pi_s . <s, t> the concatenation of pi_s and the arc (s, t). We need to maximize the connectivity f(pi_t) for all nodes t, where

f(<t>) = rho(t) if t belongs to R, and rho(t) - delta otherwise;
f(pi_s . <s, t>) = min{f(pi_s), rho(t)}  (8)

In (8), delta is a small positive constant and R is a root set with one element for each maximum of the pdf. Initially, all nodes define trivial paths. Higher values of delta reduce the number of maxima. We set the parameters following [37]. The IFT algorithm maximizes f such that the optimum paths form an optimum-path forest. This algorithm is implemented using the code made available by the authors of [34], and its time complexity follows that reported in [34].

IV. EXPERIMENTAL RESULTS AND DISCUSSIONS

Four multi-view datasets with thirty videos in total, along with the corresponding ground truths from [4], [42], are used for the experiments, which are carried out on a desktop PC with an Intel(R) Core(TM) i processor and 8 GB of DDR2 memory. Table I shows the information on the experimental data.

TABLE I. DATASET INFORMATION

For the multi-view summaries generated by our method, please visit: result/multi-view-video-summarization. The F-measure [8] is used as the objective measure, while Informativeness [4] and Visual pleasantness [38] are used as the subjective measures for performance evaluation. For the BoVW model, we experimentally choose 100 words for the Office1 and Office Lobby datasets and 1000 words for the more complex Campus and BL-7F datasets to achieve the best performance. Fig. 3 shows the F-measure for various choices of K for the four datasets.

Fig. 3. Variations in F-measure values with the number of words (K) for the different datasets.
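As a concrete illustration of the clustering of Section III-D, here is a simplified, pure-Python OPF sketch: it estimates a Gaussian density over each node's adjacency, lets density maxima become roots, and propagates labels by maximizing f(pi_s . <s, t>) = min{f(pi_s), rho(t)}. The normalization constant of (7) is folded into the density and the delta handling is simplified, so this is a didactic sketch rather than the implementation of [34].

```python
import heapq
import math

def opf_cluster(points, adjacency, sigma):
    # points: list of coordinate tuples; adjacency: list of neighbor lists.
    n = len(points)

    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b]))

    # Gaussian (Parzen-like) density over each node's neighbors.
    rho = [sum(math.exp(-d2(s, t) / (2 * sigma ** 2)) for t in adjacency[s])
           / max(len(adjacency[s]), 1) for s in range(n)]

    delta = 1e-9
    val = [r - delta for r in rho]   # trivial-path values rho(t) - delta
    label = [-1] * n
    heap = [(-val[s], s) for s in range(n)]
    heapq.heapify(heap)
    next_label = 0
    while heap:
        negv, s = heapq.heappop(heap)
        if -negv < val[s]:
            continue                  # stale queue entry
        if label[s] == -1:            # s becomes the root of a new cluster
            label[s] = next_label
            next_label += 1
            val[s] = rho[s]           # roots get the un-penalized value
        for t in adjacency[s]:
            v = min(val[s], rho[t])   # connectivity along pi_s . <s, t>
            if v > val[t]:
                val[t] = v
                label[t] = label[s]
                heapq.heappush(heap, (-v, t))
    return label

# Two well-separated point groups, adjacency restricted within each group
# (as the MCMW match sets would restrict it across views).
pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
adj = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]
labels = opf_cluster(pts, adj, sigma=1.0)
```

Each connected, high-density region grows one optimum-path tree, so the two groups receive two distinct cluster labels.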
A. Validation of the Components in the Solution Pipeline

To show the successive improvements from the various components in our solution pipeline, we use a simple base method (A) where the K-means clustering algorithm is applied to the visual features (color, edge, and shape) for obtaining the multi-view summary. K-means is chosen because of its low computational overhead in clustering high-dimensional data [39]. To show the utility of OPF, we apply both K-means [39] and OPF clustering to the visual features (denoted as B). The F-measure values in Table II clearly

indicate that B performs significantly better than A for all four datasets. We next show the effectiveness of the semantic features by applying the OPF algorithm on the combined (visual bag of words plus visual) feature set (denoted as C). Table II clearly demonstrates that C has a better score than B. This is mainly because of the poor recall score of B, as its summary contains more false negatives (more missed frames) compared to that of C. These results show that the use of semantic features can lead to a significant increase in the number of salient events detected in the summaries.

TABLE II. COMPARATIVE PERFORMANCE ANALYSIS WITH VARIOUS APPROACHES

Fig. 4. Representative frames of strongly correlated shots, as detected by our method, across pairs of different views for the Office1 dataset.

We next demonstrate the effectiveness of the MCMW bipartite matching. We call the method where MCMW bipartite matching is added to C method D. Thus, D represents our complete solution. From a comparison of the measures for C and D in Table II, it becomes evident that the incorporation of bipartite matching significantly improves the quality of the results. In Fig. 4, we present the strongly correlated shots, with their representative middle frames, across different pairs of views for the Office1 video. Here we notice that most of the strongly correlated shots show the same activity simultaneously recorded by the different views. We finally demonstrate the utility of the Gaussian entropy fusion model for the elimination of unimportant frames. For that purpose, we use D without the Gaussian entropy model. As shown in Table II, the precision of D is much higher than that of this variant; the drop in precision is attributed to the presence of unimportant shots as false positives in the generated summary.
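The objective comparisons above and in the following tables rest on precision, recall, and their harmonic mean. A minimal computation of the F-measure from event counts (the variable names are illustrative, and the counts below are made up for the example):

```python
def f_measure(matched, detected, ground_truth):
    # matched: events correctly detected; detected: events in the summary;
    # ground_truth: events annotated in the reference summary.
    precision = matched / detected
    recall = matched / ground_truth
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 9 of 10 summarized events are correct,
# out of 12 ground-truth events.
score = f_measure(9, 10, 12)   # precision 0.9, recall 0.75
```

A summary with many false positives (unimportant shots retained) lowers precision, while missed events lower recall; the F-measure penalizes both, which is why it is used to rank the pipeline variants A through D.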
B. Comparison With Some Mono-View Methods

We first compare our method with some mono-view video summarization methods, as in [4]. Table III reveals that there is a lot of redundancy (simultaneous presence of most of the events) in the summaries obtained from all the mono-view strategies. Furthermore, there exist significant temporal overlaps among the summarized multi-view shots in these methods. The proposed multi-view summarization method has much less redundancy and captures most of the important events that were missing in the mono-view summaries. This is because a mono-view method fails to capture the correlation between multiple views, as it simply coalesces the videos from multiple views into a single combined video.

C. Comparison With the State-of-the-Art Multi-View Methods

We now compare our method with four state-of-the-art multi-view methods, namely, [4], [16], [41], [42], based on the available results and ground truths. Out of these four methods, [41] uses a learning strategy. Note that the methods in [2], [40] use video data from multiple cameras mainly for the purpose of

tracking and are hence left out of the comparison. In [4], the authors capture the correlations among the different views using a hypergraph-based representation, which was found to be intrinsically complex and slow. The Office1 video is a case in point. We found that, on average, the MCMW bipartite matching algorithm for any pair of views takes less than 1 min to complete. For the Office1 video, there are 4 views. This requires altogether 6 MCMW matchings, which take less than 5 min to complete. The feature extraction takes 3 min and obtaining the Gaussian entropy scores for the shots takes around 2 min. So, the processing time for our method is about 10 min, as compared to the reported value of 15 min in [4], before a clustering algorithm is applied. The execution time of the OPF clustering in our work is about the same as that of the random-walks-based clustering in [4]. Hence, we say that the proposed method is almost 30% faster than that of [4].

TABLE III. PERFORMANCE COMPARISON WITH TWO MONO-VIEW AND THREE MULTI-VIEW METHODS

TABLE IV. PERFORMANCE COMPARISON WITH A LEARNING-BASED MULTI-VIEW METHOD [41]

TABLE V. STATISTICAL DATA SHOWING SUBJECTIVE COMPARISON WITH [4]

Next, we show the comparisons using the objective measure in Table III and Table IV. Table III shows that the precision of our method, as well as that of [4], is 100% for the Office1 and Office Lobby datasets and somewhat lower for the Campus video dataset. The presence of the fence in the Campus video has caused this reduction in precision. Still, for the Campus video dataset, the precision of our method is about 6% better than that of [4]. For all three datasets, we obtain a better recall than [4]. Overall, the F-measure of our method clearly surpasses that of [4] for all three datasets. The same table also demonstrates the superiority of the proposed method over [16] in terms of a much higher F-measure value.
Note that the method of [16] is frame-based, and more than one key frame is used to describe a single shot. So, in that work, more than one key frame may be detected for one video shot (one event shot). In contrast, our method as well as that of [4] are both shot-based, and only a single key frame is used to describe a single shot. That is why the number of detected events in the case of [16] is higher.

Fig. 5. Representative frames of some events detected by our method which were missed by other approaches.

Table III also exhibits the superiority of the proposed method over [42] for the Office, Lobby, and BL-7F datasets in terms of a much higher F-measure. Table IV shows that, although we have not used any learning strategy in our method, our results are quite comparable to those of [41], which uses a metric learning approach. For the subjective evaluation, we conducted a user study with 10 volunteers. Summaries generated by the proposed method and by [4] for all the datasets were shown to the volunteers. We asked them to assign a score between 1 and 5 (1 indicates the worst score and 5 the best score) for Informativeness and Visual pleasantness for each summary. From the user study results in Table V, it is evident that most of our summaries are more informative compared to those of [4]. The maximum increase in Informativeness is 18% for the challenging Campus video. Similarly, a comparison

of the Visual pleasantness values indicates that for four out of five videos we obtain a better result. The maximum improvement is about 12% for the Office Lobby video. We now point out some of the relevant events which we could correctly detect but which were reported as missed (false negatives) by [4]. In the Office1 video, the event of a member pulling a thick book from the shelf, which is the 36th shot from the second view, is preserved by our summary [Fig. 5(a)]. The event from the 4th view of the Campus video, which captures a bus moving from right to left outside the fence, is preserved by our summary [Fig. 5(b)]. Similarly, another event from the 3rd view of the Office Lobby video, where a woman wearing a white coat walks across the lobby towards the gate without interrupting the man playing with the baby, is captured by our method [Fig. 5(c)]. So, the summary can play an important role in sophisticated video surveillance tasks like event and activity detection [45]. The reasons we outperform three multi-view approaches and stay close to the learning-based one are that we use i) a very strong set of features to handle complex issues like lighting conditions and varying orientations, ii) a compact spatio-temporal graph with intra-view redundancy removed by Gaussian entropy, iii) an efficient and accurate way of capturing the correlation among multiple views via MCMW bipartite matching, and iv) OPF clustering to handle high-volume and somewhat high-dimensional data. Finally, to statically represent our multi-view summaries, we introduce a View-board, as illustrated in Fig. 6. The representative middle frames of the summarized shots are assembled along the timeline across multiple views.

Fig. 6. View-board of the multi-view summary. Representative frames of the summarized shots arranged in temporal order.
Each shot is associated with a number (shown inside a box) that indicates the view to which the shot belongs.

V. CONCLUSION AND FUTURE WORK

We have presented a novel framework for the multi-view video summarization problem using bipartite matching constrained OPF. The problem of capturing the inherent correlation between the multiple views was modeled as an MCMW matching problem in bipartite graphs. OPF clustering is finally used for summarizing the multi-view videos. Performance comparisons show marked improvement over several existing mono-view and multi-view summarization algorithms. In the future, we will focus on integrating a more extensive set of video features. We also plan to work with long-duration multi-view surveillance videos to demonstrate the scalability of our approach.

ACKNOWLEDGMENT

The authors would like to thank Prof. R. W. Robinson of the University of Georgia, Athens, GA, USA, and Prof. A. X. Falcão of the University of Campinas, Campinas, Brazil, for helpful discussions on bipartite graph matching and OPF clustering.

REFERENCES

[1] B. T. Truong and S. Venkatesh, "Video abstraction: A systematic review and classification," ACM Trans. Multimedia Comput., Commun., Appl., vol. 3, no. 1, pp. 1–37, 2007.
[2] C. De Leo and B. S. Manjunath, "Multicamera video summarization and anomaly detection from activity motifs," ACM Trans. Sensor Netw., vol. 10, no. 2, pp. 1–30, 2014, Article 27.
[3] A. S. Chowdhury, S. Kuanar, R. Panda, and M. N. Das, "Video storyboard design using Delaunay graphs," in Proc. IEEE Int. Conf. Pattern Recog., Nov. 2012.
[4] Y. Fu, Y. Guo, Y. Zhu, F. Liu, C. Song, and Z. H. Zhou, "Multi-view video summarization," IEEE Trans. Multimedia, vol. 12, no. 7, Nov. 2010.
[5] A. G. Money and H. W. Agius, "Video summarization: A conceptual framework and survey of the state of the art," J. Vis. Commun. Image Representation, vol. 19, no. 2, 2008.
[6] W. Jiang, C. Cotton, and A. C. Loui, "Automatic consumer video summarization by audio and visual analysis," in Proc.
IEEE Int. Conf. Multimedia Expo, Jul. 2011.
[7] J. Hannon, K. McCarthy, J. Lynch, and B. Smyth, "Personalized and automatic social summarization of events in video," in Proc. Int. Conf. Intell. User Interfaces, 2011.
[8] B. Han, J. Hamm, and J. Sim, "Personalized video summarization with human in the loop," in Proc. IEEE Workshop Appl. Comput. Vis., 2011.
[9] Y. F. Ma, X. S. Hua, L. Lu, and H. J. Zhang, "A generic framework of user attention model and its application in video summarization," IEEE Trans. Multimedia, vol. 7, no. 5, Oct. 2005.
[10] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. Delhi, India: Prentice-Hall of India.
[11] S. K. Kuanar, R. Panda, and A. S. Chowdhury, "Video key frame extraction through dynamic Delaunay clustering with a structural constraint," J. Vis. Commun. Image Representation, vol. 24, no. 7, 2013.
[12] S. Lu, I. King, and M. R. Lyu, "Video summarization by video structure analysis and graph optimization," in Proc. IEEE Int. Conf. Multimedia Expo, 2004.
[13] Y. Peng and C. W. Ngo, "Clip-based similarity measure for query-dependent clip retrieval and video summarization," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 5, May 2006.
[14] C. W. Ngo, Y. F. Ma, and H. J. Zhang, "Video summarization and scene detection by graph modeling," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 2, Feb. 2005.
[15] A. Z. Broder, A. R. Karlin, P. Raghavan, and E. Upfal, "Trading space for time in undirected S-T connectivity," SIAM J. Comput., vol. 23, 1994.

[16] P. Li, Y. Guo, and H. Sun, "Multi-keyframe abstraction from videos," in Proc. IEEE Int. Conf. Image Process., Brussels, Belgium, Sep. 2011.
[17] A. Ben-Hur and J. Weston, "A user's guide to support vector machines," Methods Molecular Biol., Data Mining Tech. Life Sci., vol. 609.
[18] C. Cotsaces, N. Nikolaidis, and I. Pitas, "Video shot detection and condensed representation: A review," IEEE Signal Process. Mag., vol. 23, no. 2, Mar.
[19] H. Zhang, A. Kankanhalli, and S. W. Smoliar, "Automatic partitioning of full-motion video," Multimedia Syst., vol. 1, no. 1.
[20] D. A. Adjeroh and M. C. Lee, "Robust and efficient transform domain video sequence analysis: An approach from the generalized color ratio model," J. Vis. Commun. Image Representation, vol. 8, no. 2.
[21] R. Zabih, J. Miller, and K. Mai, "A feature-based algorithm for detecting and classifying production effects," Multimedia Syst., vol. 7, no. 2.
[22] A. Amel, B. Abdessalem, and M. Abdellatif, "Video shot boundary detection using motion activity descriptor," J. Telecommun., vol. 2, no. 1.
[23] Z. Rasheed and M. Shah, "Detection and representation of scenes in videos," IEEE Trans. Multimedia, vol. 7, no. 6, Dec.
[24] V. T. Chasanis, A. C. Likas, and N. P. Galatsanos, "Scene detection in videos using shot clustering and sequence alignment," IEEE Trans. Multimedia, vol. 11, no. 1, Jan.
[25] P. P. Mohanta, S. K. Saha, and B. Chanda, "A heuristic algorithm for video scene detection using shot cluster sequence analysis," in Proc. Indian Conf. Vis. Graph. Image Process., 2010.
[26] G. Paschos, "Perceptually uniform color spaces for color texture analysis: An empirical evaluation," IEEE Trans. Image Process., vol. 10, no. 6, Jun.
[27] B. S. Manjunath, J. R. Ohm, V. V. Vasudevan, and A. Yamada, "Color and texture descriptors," IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 6, Jun.
[28] H. Tamura, S. Mori, and T. Yamawaki, "Textural features corresponding to visual perception," IEEE Trans. Syst., Man, Cybern., vol. SMC-8, no. 6, Jun.
[29] D. Bollegala, Y. Matsuo, and M. Ishizuka, "A web search engine based approach to measure semantic similarity between words," IEEE Trans. Knowl. Data Eng., vol. 23, no. 7, Jul.
[30] J. Sivic and A. Zisserman, "Video Google: A text retrieval approach to object matching in videos," in Proc. IEEE Int. Conf. Comput. Vis., Oct. 2003, vol. 2.
[31] Y. G. Jiang, J. Yang, C. W. Ngo, and A. G. Hauptmann, "Representations of keypoint-based semantic concept detection: A comprehensive study," IEEE Trans. Multimedia, vol. 12, no. 1, Jan.
[32] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2.
[33] A. X. Falcão, J. Stolfi, and R. A. Lotufo, "The image foresting transform: Theory, algorithms, and applications," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 1, Jan.
[34] L. M. Rocha, F. A. M. Cappabianco, and A. X. Falcão, "Data clustering as an optimum path forest problem with applications in image analysis," Int. J. Imag. Syst. Technol., vol. 19, no. 2.
[35] J. P. Papa, F. A. M. Cappabianco, and A. X. Falcão, "Optimizing optimum-path forest classification for huge datasets," in Proc. Int. Conf. Pattern Recog., 2010.
[36] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. New York, NY, USA: Wiley-Interscience, 2001, vol. 2.
[37] F. A. M. Cappabianco, A. X. Falcão, C. L. Yasuda, and J. K. Udupa, "Brain tissue MR-image segmentation via optimum-path forest clustering," J. Comput. Vis. Image Understanding, vol. 116.
[38] J. Sasongko, C. Rohr, and D. Tjondronegoro, "Efficient generation of pleasant video summaries," in Proc. TRECVID BBC Rushes Summarization Workshop, ACM Multimedia, New York, NY, USA, 2008.
[39] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proc. Berkeley Symp. Math. Statist. Probability, 1967.
[40] X. Zhu, J. Liu, J. Wang, and H. Lu, "Key observation selection-based effective video synopsis for camera network," Machine Vis. Appl., vol. 25.
[41] Y. Fu, "Multi-view metric learning for multi-view video summarization," CoRR, vol. abs/, 2014. [Online].
[42] S. H. Ou, C. H. Lee, V. S. Somayazulu, Y. K. Chen, and S. Y. Chien, "On-line multi-view video summarization for wireless video sensor network," IEEE J. Sel. Topics Signal Process., vol. 9, no. 1, Feb.
[43] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and object recognition using shape contexts," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 4, Apr.
[44] A. Shokoufandeh and S. Dickinson, "Applications of bipartite matching to problems in object recognition," in Proc. ICCV Workshop Graph Algorithms Comput. Vis., 1999.
[45] P. Napoletano, G. Boccignone, and F. Tisato, "Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy," CoRR, vol. abs/, 2014. [Online].

Sanjay K. Kuanar received the M.E. degree in electronics and telecommunication engineering from Jadavpur University, Kolkata, India, in 2007, and is currently working towards the Ph.D. degree at Jadavpur University. His current research interests include pattern recognition, multimedia analysis, and computer vision.

Kunal B. Ranga received the B.E. degree in computer science and engineering from the Government Engineering College Bikaner, Bikaner, India, in 2007, and is currently working towards the M.E. degree at Jadavpur University, Kolkata, India. His current research interests include pattern recognition, multimedia analysis, and computer vision.

Ananda S. Chowdhury (M'01) received the Ph.D. degree in computer science from the University of Georgia, Athens, GA, USA, in July. He is currently an Associate Professor with the Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata, India, where he leads the Imaging, Vision and Pattern Recognition group. He was a Post-Doctoral Fellow with the Department of Radiology and Imaging Sciences, National Institutes of Health, Bethesda, MD, USA, from 2007 to. He has authored or coauthored more than forty-five papers in leading international journals and conferences, in addition to a monograph in the Springer Advances in Computer Vision and Pattern Recognition Series. His current research interests include computer vision, pattern recognition, biomedical image processing, and multimedia analysis. Dr. Chowdhury is a member of the IEEE Computer Society, the IEEE Signal Processing Society, and the IAPR TC on Graph-Based Representations. His Erdős number is 2.


UNIVERSAL SPATIAL UP-SCALER WITH NONLINEAR EDGE ENHANCEMENT

UNIVERSAL SPATIAL UP-SCALER WITH NONLINEAR EDGE ENHANCEMENT UNIVERSAL SPATIAL UP-SCALER WITH NONLINEAR EDGE ENHANCEMENT Stefan Schiemenz, Christian Hentschel Brandenburg University of Technology, Cottbus, Germany ABSTRACT Spatial image resizing is an important

More information

A Novel Approach towards Video Compression for Mobile Internet using Transform Domain Technique

A Novel Approach towards Video Compression for Mobile Internet using Transform Domain Technique A Novel Approach towards Video Compression for Mobile Internet using Transform Domain Technique Dhaval R. Bhojani Research Scholar, Shri JJT University, Jhunjunu, Rajasthan, India Ved Vyas Dwivedi, PhD.

More information

An FPGA Implementation of Shift Register Using Pulsed Latches

An FPGA Implementation of Shift Register Using Pulsed Latches An FPGA Implementation of Shift Register Using Pulsed Latches Shiny Panimalar.S, T.Nisha Priscilla, Associate Professor, Department of ECE, MAMCET, Tiruchirappalli, India PG Scholar, Department of ECE,

More information

FAST SPATIAL AND TEMPORAL CORRELATION-BASED REFERENCE PICTURE SELECTION

FAST SPATIAL AND TEMPORAL CORRELATION-BASED REFERENCE PICTURE SELECTION FAST SPATIAL AND TEMPORAL CORRELATION-BASED REFERENCE PICTURE SELECTION 1 YONGTAE KIM, 2 JAE-GON KIM, and 3 HAECHUL CHOI 1, 3 Hanbat National University, Department of Multimedia Engineering 2 Korea Aerospace

More information

Abstract 1. INTRODUCTION. Cheekati Sirisha, IJECS Volume 05 Issue 10 Oct., 2016 Page No Page 18532

Abstract 1. INTRODUCTION. Cheekati Sirisha, IJECS Volume 05 Issue 10 Oct., 2016 Page No Page 18532 www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 5 Issue 10 Oct. 2016, Page No. 18532-18540 Pulsed Latches Methodology to Attain Reduced Power and Area Based

More information

Motion Re-estimation for MPEG-2 to MPEG-4 Simple Profile Transcoding. Abstract. I. Introduction

Motion Re-estimation for MPEG-2 to MPEG-4 Simple Profile Transcoding. Abstract. I. Introduction Motion Re-estimation for MPEG-2 to MPEG-4 Simple Profile Transcoding Jun Xin, Ming-Ting Sun*, and Kangwook Chun** *Department of Electrical Engineering, University of Washington **Samsung Electronics Co.

More information

Bit Rate Control for Video Transmission Over Wireless Networks

Bit Rate Control for Video Transmission Over Wireless Networks Indian Journal of Science and Technology, Vol 9(S), DOI: 0.75/ijst/06/v9iS/05, December 06 ISSN (Print) : 097-686 ISSN (Online) : 097-5 Bit Rate Control for Video Transmission Over Wireless Networks K.

More information

Analysis of Packet Loss for Compressed Video: Does Burst-Length Matter?

Analysis of Packet Loss for Compressed Video: Does Burst-Length Matter? Analysis of Packet Loss for Compressed Video: Does Burst-Length Matter? Yi J. Liang 1, John G. Apostolopoulos, Bernd Girod 1 Mobile and Media Systems Laboratory HP Laboratories Palo Alto HPL-22-331 November

More information

INTRA-FRAME WAVELET VIDEO CODING

INTRA-FRAME WAVELET VIDEO CODING INTRA-FRAME WAVELET VIDEO CODING Dr. T. Morris, Mr. D. Britch Department of Computation, UMIST, P. O. Box 88, Manchester, M60 1QD, United Kingdom E-mail: t.morris@co.umist.ac.uk dbritch@co.umist.ac.uk

More information

Temporal Error Concealment Algorithm Using Adaptive Multi- Side Boundary Matching Principle

Temporal Error Concealment Algorithm Using Adaptive Multi- Side Boundary Matching Principle 184 IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 Temporal Error Concealment Algorithm Using Adaptive Multi- Side Boundary Matching Principle Seung-Soo

More information

Robust Transmission of H.264/AVC Video using 64-QAM and unequal error protection

Robust Transmission of H.264/AVC Video using 64-QAM and unequal error protection Robust Transmission of H.264/AVC Video using 64-QAM and unequal error protection Ahmed B. Abdurrhman 1, Michael E. Woodward 1 and Vasileios Theodorakopoulos 2 1 School of Informatics, Department of Computing,

More information

DELTA MODULATION AND DPCM CODING OF COLOR SIGNALS

DELTA MODULATION AND DPCM CODING OF COLOR SIGNALS DELTA MODULATION AND DPCM CODING OF COLOR SIGNALS Item Type text; Proceedings Authors Habibi, A. Publisher International Foundation for Telemetering Journal International Telemetering Conference Proceedings

More information

AN IMPROVED ERROR CONCEALMENT STRATEGY DRIVEN BY SCENE MOTION PROPERTIES FOR H.264/AVC DECODERS

AN IMPROVED ERROR CONCEALMENT STRATEGY DRIVEN BY SCENE MOTION PROPERTIES FOR H.264/AVC DECODERS AN IMPROVED ERROR CONCEALMENT STRATEGY DRIVEN BY SCENE MOTION PROPERTIES FOR H.264/AVC DECODERS Susanna Spinsante, Ennio Gambi, Franco Chiaraluce Dipartimento di Elettronica, Intelligenza artificiale e

More information

Music Similarity and Cover Song Identification: The Case of Jazz

Music Similarity and Cover Song Identification: The Case of Jazz Music Similarity and Cover Song Identification: The Case of Jazz Simon Dixon and Peter Foster s.e.dixon@qmul.ac.uk Centre for Digital Music School of Electronic Engineering and Computer Science Queen Mary

More information

... A Pseudo-Statistical Approach to Commercial Boundary Detection. Prasanna V Rangarajan Dept of Electrical Engineering Columbia University

... A Pseudo-Statistical Approach to Commercial Boundary Detection. Prasanna V Rangarajan Dept of Electrical Engineering Columbia University A Pseudo-Statistical Approach to Commercial Boundary Detection........ Prasanna V Rangarajan Dept of Electrical Engineering Columbia University pvr2001@columbia.edu 1. Introduction Searching and browsing

More information

Constant Bit Rate for Video Streaming Over Packet Switching Networks

Constant Bit Rate for Video Streaming Over Packet Switching Networks International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Constant Bit Rate for Video Streaming Over Packet Switching Networks Mr. S. P.V Subba rao 1, Y. Renuka Devi 2 Associate professor

More information

White Paper : Achieving synthetic slow-motion in UHDTV. InSync Technology Ltd, UK

White Paper : Achieving synthetic slow-motion in UHDTV. InSync Technology Ltd, UK White Paper : Achieving synthetic slow-motion in UHDTV InSync Technology Ltd, UK ABSTRACT High speed cameras used for slow motion playback are ubiquitous in sports productions, but their high cost, and

More information

Improved Error Concealment Using Scene Information

Improved Error Concealment Using Scene Information Improved Error Concealment Using Scene Information Ye-Kui Wang 1, Miska M. Hannuksela 2, Kerem Caglar 1, and Moncef Gabbouj 3 1 Nokia Mobile Software, Tampere, Finland 2 Nokia Research Center, Tampere,

More information

WITH the rapid development of high-fidelity video services

WITH the rapid development of high-fidelity video services 896 IEEE SIGNAL PROCESSING LETTERS, VOL. 22, NO. 7, JULY 2015 An Efficient Frame-Content Based Intra Frame Rate Control for High Efficiency Video Coding Miaohui Wang, Student Member, IEEE, KingNgiNgan,

More information

New Approach to Multi-Modal Multi-View Video Coding

New Approach to Multi-Modal Multi-View Video Coding Chinese Journal of Electronics Vol.18, No.2, Apr. 2009 New Approach to Multi-Modal Multi-View Video Coding ZHANG Yun 1,4, YU Mei 2,3 and JIANG Gangyi 1,2 (1.Institute of Computing Technology, Chinese Academic

More information

A Combined Compatible Block Coding and Run Length Coding Techniques for Test Data Compression

A Combined Compatible Block Coding and Run Length Coding Techniques for Test Data Compression World Applied Sciences Journal 32 (11): 2229-2233, 2014 ISSN 1818-4952 IDOSI Publications, 2014 DOI: 10.5829/idosi.wasj.2014.32.11.1325 A Combined Compatible Block Coding and Run Length Coding Techniques

More information

Robust Transmission of H.264/AVC Video Using 64-QAM and Unequal Error Protection

Robust Transmission of H.264/AVC Video Using 64-QAM and Unequal Error Protection Robust Transmission of H.264/AVC Video Using 64-QAM and Unequal Error Protection Ahmed B. Abdurrhman, Michael E. Woodward, and Vasileios Theodorakopoulos School of Informatics, Department of Computing,

More information

Using enhancement data to deinterlace 1080i HDTV

Using enhancement data to deinterlace 1080i HDTV Using enhancement data to deinterlace 1080i HDTV The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher Andy

More information

Research Article Design and Analysis of a High Secure Video Encryption Algorithm with Integrated Compression and Denoising Block

Research Article Design and Analysis of a High Secure Video Encryption Algorithm with Integrated Compression and Denoising Block Research Journal of Applied Sciences, Engineering and Technology 11(6): 603-609, 2015 DOI: 10.19026/rjaset.11.2019 ISSN: 2040-7459; e-issn: 2040-7467 2015 Maxwell Scientific Publication Corp. Submitted:

More information

Video summarization based on camera motion and a subjective evaluation method

Video summarization based on camera motion and a subjective evaluation method Video summarization based on camera motion and a subjective evaluation method Mickaël Guironnet, Denis Pellerin, Nathalie Guyader, Patricia Ladret To cite this version: Mickaël Guironnet, Denis Pellerin,

More information