, pp.120-124
http://dx.doi.org/10.14257/astl.2017.146.21

Shot Transition Detection Scheme: Based on Correlation Tracking Check for MB-Based Video Sequences

Mona A. M. Fouad 1 and Ahmed Mokhtar A. Mansour 2
1 National Telecommunication Institute, Cairo, Egypt
2 Nile Innovations, Cairo, Egypt
mfouad@nti.sci.eg, ahmedmokhtar_adu@yahoo.com

Abstract. Shot transition detection is important for surveillance systems and video retrieval. Video coding standards reduce the video size by encoding the differences between similar regions in successive frames. In this work, the search process that locates these similar regions is used in a novel way to identify abrupt shot transitions in video sequences. The proposed scheme has been developed, and the conducted experiments show that all (100%) of the abrupt shot transitions present in the examined video sequences were successfully detected. Further work will consider a fully automatic scheme that detects all types of shot transitions.

Keywords: Motion compensation, matched macroblocks, shot transition, surveillance systems, video codec, video retrieval, video abstraction and summarization.

1 Introduction

Consecutive frames in a video sequence are highly similar unless there is a transition. This transition can be gradual or abrupt. Shot transition detection is needed for many applications such as surveillance systems, video summarization and abstraction. Several studies already address this problem. This paper considers the detection of abrupt transitions based on a new and simple idea: it reuses the search process that locates the matched macroblock in the reference frame while the frames of the video sequence are being encoded.
According to the recommendations for standard video codecs produced by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), the standard video bitstream hierarchy is composed of layers: the video sequence layer, the group of pictures layer, the picture layer, the slice layer, the macroblock (MB) layer, and the block layer [1]. The video codec encodes video sequences frame by frame. The frame under consideration is first partitioned into a sequence of macroblocks of small fixed size. The correlated content of each MB is transformed into the frequency domain, which converts it into uncorrelated content that can be encoded separately. Further, instead of transmitting the information of identical MBs in consecutive frames, the matched MBs are located in the previous and/or following frame, and only the differences and their positions are encoded [2]. The search process examines the MB to be encoded in the current (target) frame against its corresponding MB in the reference frame and the surrounding MBs within a small area (sliding window).

ISSN: 2287-1233 ASTL Copyright 2017 SERSC

The proposed shot transition scheme is applied at the MB layer. The distortion between the corresponding MBs of two successive frames is measured using the absolute mean difference (other measures could be used). The position of the MB in the previous frame is then shifted by a single pixel to the left, then up, and so on through all eight neighboring positions (a 3x3 sliding window). This means that MB X in the current frame is compared with each of nine MBs in the previous frame (Fig. 1 shows an example). This process is repeated for all MBs in the examined frames, although a few of them may be enough to determine whether the current and previous frames belong to different shots (scenes).

Current frame:    X

Previous frame:   2 3 4
                  1 0 5
                  8 7 6

Fig. 1. Positions of the compared MBs in the current frame (X) and in the previous frame (0 to 8).

The distortion between two MBs in the current and the previous frames is denoted as D. The nine distortion values are denoted as D0, D1, D2, D3, D4, D5, D6, D7, and D8. If the current and the previous frames belong to two different shots, D0 should satisfy two criteria. First, it is greater than a certain threshold (t1). Second, the variance of D0 and the other eight values is less than a threshold (t2), which is the condition v < t2 used in Fig. 2 and agrees with the low variances reported for abrupt transitions in Fig. 4: no neighboring position offers a noticeably better match, so all nine distortions are similarly high. The remaining pairs of corresponding MBs are examined in the same way. If more than seventy percent of the examined MBs are classified as 'not correlated', the current and the previous frames belong to different shots and an abrupt transition exists.
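As an illustration of the distortion check described above, the following Python sketch computes the nine values D0 to D8 for one macroblock. It is a minimal sketch under stated assumptions, not the authors' implementation: the absolute mean difference is taken to be the mean of the absolute pixel differences, frames are plain lists of rows of grey levels, the position numbering follows one reading of Fig. 1 (1 = left, 3 = up, 5 = right, 7 = down), and pixels outside the frame are replaced by the nearest border pixel. The names `amd`, `distortion_window` and `OFFSETS` are hypothetical.

```python
# Offsets (dy, dx) of the MB in the previous frame for positions 0..8.
# Assumed reading of Fig. 1: 2 3 4 / 1 0 5 / 8 7 6, with rows growing downward.
OFFSETS = [(0, 0), (0, -1), (-1, -1), (-1, 0), (-1, 1),
           (0, 1), (1, 1), (1, 0), (1, -1)]


def amd(cur, prev, row, col, dy, dx, size=16):
    """Mean absolute difference between the MB of `cur` at (row, col)
    and the MB of `prev` displaced by (dy, dx) pixels."""
    last_r, last_c = len(prev) - 1, len(prev[0]) - 1
    total = 0
    for r in range(row, row + size):
        for c in range(col, col + size):
            # Clamp to the frame; this border handling is an assumption.
            pr = min(max(r + dy, 0), last_r)
            pc = min(max(c + dx, 0), last_c)
            total += abs(cur[r][c] - prev[pr][pc])
    return total / (size * size)


def distortion_window(cur, prev, row, col, size=16):
    """Return [D0, D1, ..., D8] for the MB whose top-left pixel is (row, col)."""
    return [amd(cur, prev, row, col, dy, dx, size) for dy, dx in OFFSETS]


if __name__ == "__main__":
    import random

    random.seed(0)
    prev = [[random.randrange(256) for _ in range(48)] for _ in range(48)]
    # Same content moved one pixel to the right, so the match lies one pixel
    # to the left in the previous frame (position 1 in Fig. 1).
    cur = [line[-1:] + line[:-1] for line in prev]
    d = distortion_window(cur, prev, 16, 16)
    print(d.index(min(d)), d[1])  # D1 is exactly 0 for this synthetic shift
```

In the synthetic example, a value smaller than D0 appears at a neighboring position, which is the same-shot situation discussed in the text: the matched macroblock is nearby.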
On the contrary, if the considered frames belong to the same shot, the distortion D0 of the pair of corresponding MBs is less than t1 and is the smallest among the neighbors, or there is at least one value that is less than D0, indicating that the matched macroblock is nearby. Further, if the distortion check is continued by moving the MB in the target frame two pixels (and so on) away from the initial position, the correlation is tracked within a small window. This idea could be used to find a direct path to the matched macroblock for motion compensation in video coding.

2 Shot Transition Detection Scheme

The proposed abrupt shot transition scheme operates at the macroblock layer. At this layer, the frame is subdivided into non-overlapping macroblocks, each of fixed size 16x16 pixels. The pseudo code of the proposed scheme is given in Fig. 2, showing the main steps to detect an abrupt shot transition based on the correlation tracking check. As seen in the pseudo code, the first MB in the current frame is
compared with its corresponding MB in the previous frame; the MB in the previous frame is then shifted by a single pixel in each of the eight directions, and the considered MB of the current frame is compared with the MB at each of these positions. Each comparison estimates the distortion between the examined pair of MBs within the sliding window of size 3x3 pixels using the absolute mean difference.

program AbruptShotTransitionDetection (Output)
{Read the first two successive frames}
const t1 = 5.0; t2 = 10.0;
var counter, t3, v, D0, D1, D2, D3, D4, D5, D6, D7, D8;
begin
  {Partition each frame into non-overlapped MBs of size 16x16; no_mbs is the number of MBs}
  counter = 0;
  t3 = no_mbs * 0.7;
  repeat {for all MBs of the examined frames}
    D0 = d(c0, p0); {d is the absolute mean difference between the two corresponding MBs}
    D1 = d(c0, p1);
    D2 = d(c0, p2);
    D3 = d(c0, p3);
    D4 = d(c0, p4);
    D5 = d(c0, p5);
    D6 = d(c0, p6);
    D7 = d(c0, p7);
    D8 = d(c0, p8);
    v = variance(D0, D1, D2, D3, D4, D5, D6, D7, D8);
    if (D0 > t1) and (v < t2) then
      {the MB 'c0' is a candidate to be in the first frame of a new shot}
      counter = counter + 1;
  end {end of repeat};
  if counter > t3 then
    {current frame belongs to a new shot and an abrupt shot transition is detected};
end.

Fig. 2. The pseudo code of the proposed Abrupt Transition Detection.

3 Experiments and Results

Preliminary evaluation of the proposed scheme is performed on two video sequences of 70,000 frames containing 65 abrupt transitions. All the transitions have been successfully detected (Fig. 3 shows two examples of abrupt transitions).
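The decision rule of Fig. 2 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: it uses the thresholds t1 = 5.0 and t2 = 10.0 and the 70% ratio from Fig. 2, takes the variance over all nine values of the window, and uses the sample variance from the standard library. The two sample windows are the first two groups of values listed in Fig. 4, read here as 3x3 windows in row-major order with D0 at the centre; that reading, and the names `not_correlated` and `abrupt_transition`, are assumptions.

```python
from statistics import variance  # sample variance (divides by n - 1)

T1 = 5.0     # distortion threshold t1 of Fig. 2
T2 = 10.0    # variance threshold t2 of Fig. 2
RATIO = 0.7  # fraction of MBs used for t3 in Fig. 2


def not_correlated(window, t1=T1, t2=T2):
    """window: nine AMDs of a 3x3 window in row-major order.
    D0 is assumed to be the centre (fifth) value."""
    d0 = window[4]
    return d0 > t1 and variance(window) < t2


def abrupt_transition(windows, ratio=RATIO):
    """windows: one 3x3 window per examined MB of the frame pair."""
    counter = sum(1 for w in windows if not_correlated(w))
    return counter > ratio * len(windows)


# First two value groups of Fig. 4, read as 3x3 windows (an assumption).
at_cut = [50.91, 51.23, 51.3, 49.29, 49.96, 50.25, 49.68, 50.1, 49.98]
same_shot = [16.098, 12.414, 13.789, 7.9336, 0.1563, 8.668,
             12.133, 11.254, 14.949]

print(not_correlated(at_cut))     # True: D0 = 49.96 > t1, variance below t2
print(not_correlated(same_shot))  # False: D0 = 0.1563 is below t1
```

With these inputs the sample variance of the second window comes out at about 23.1, in line with the value 23.106 given in the caption of Fig. 4, which is why the sample variance is used in this sketch.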
Advanced Science and Technology Letters

Fig. 3. Two successive frames in the video sequences Wedding and Horse Racing.

The absolute mean differences between an MB in the current frame and its corresponding MB, as well as the surrounding eight MBs, in the previous frame at an abrupt transition differ from those at a non-abrupt transition (Fig. 4 shows two examples). The variance of these values within the 3x3 window is also illustrated in Fig. 4.

50.91 51.23 51.3 49.29 49.96 50.25 49.68 50.1 49.98
16.098 12.414 13.789 7.9336 0.1563 8.668 12.133 11.254 14.949
50.965 50.574 49.191 48.93 48.477 47.281 (c) 47.629 47.297 46.371
21.098 17.059 14.176 12.961 13.957 14.254 8.6016 3.5859 5.4727 11.188 12.32 10.234 10.211 14.156 19.777 19.578 19.445 21.41 24.805 28.965 27.406 29.121 31.816 35.324 38.469 (d)

Fig. 4. AMDs of a pair of MBs within a 3x3 window at (c) an abrupt transition (variances = 0.486, 2.339) and (d) a non-abrupt transition (variances = 23.106, 84.76).

4 Related Work

Shot transition detection has been addressed in many studies over the last two decades. Intensity and color information analysis is the focus of many works [3]-[8]. Color and texture analysis is used in [3]. The variance distribution of edge frames in gradual transitions is analyzed in [4]. A support vector machine classifier is trained, using the color histograms of adjacent frames, to locate transitions and identify their types in [5]. Feature vectors of color moments for corresponding blocks in two successive frames are examined to detect abrupt transitions in [6]. Abrupt shot
transition detection in night-time video sequences is evaluated based on the detection of sudden changes of intensity due to the appearance of vehicle lights in surveillance systems [7]. The intensity of adjacent frames is analyzed to detect abrupt transitions in [8]. Recently, attention has turned to the use of compressed data, rather than decoded frames, as in [9]-[10]. The DC and AC coefficients are examined along successive frames to detect shot transitions in [9]. Analysis of the macroblock types of inter-coded frames is the core of the gradual transition detection in [10].

5 Conclusion and Further Work

This work developed a new scheme for abrupt shot transition detection in video sequences that reuses the search for the best-matched macroblock in the reference frame, which standard video codecs perform during encoding. The scheme was evaluated on a set of video sequences and all cut transitions were successfully detected. The preliminary results encourage extending this work into a fully automatic scheme that detects all types of transitions.

References

1. G. J. Sullivan, J.-R. Ohm and T. Wiegand, Overview of the High Efficiency Video Coding (HEVC) Standard, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, (2012) December, pp. 1649-1669.
2. B. G. Haskell and A. Puri, MPEG Video Compression Basics, In: Chiariglione L. (eds) The MPEG Representation of Digital Media, Springer, New York, NY, (2012), pp. 7-38.
3. W. Ngo, T. C. Pong and R. T. Chin, Detection of gradual transitions through temporal slice analysis, IEEE Conference on Computer Vision and Pattern Recognition ICVPR, Computer Society, (1999).
4. H. W. Yoo, H. J. Ryoo and D. S. Jang, Gradual shot boundary detection using localized edge blocks, In: Multimed Tools & Applications, vol. 28, no. 3, (2006), pp. 283-300.
5. V. Chasanis, A. Likas and N.
Galatsanos, Simultaneous detection of abrupt cuts and dissolves in videos using support vector machines, Pattern Recognition Letters, Elsevier, vol. 30, (2009), pp. 55-65.
6. S. A. Angadi and Vilas Naik, A Shot Boundary Detection Technique Based on Local Color Moments in YCBCR Color Space, In: Natarajan Meghanathan, et al. (Eds): SIPM, FCST, ITCA, WSE, ACSIT, CS & IT 06, (2012), pp. 57-65.
7. S. Padmavathi and G. Abirami, Detection of Abrupt Transitions in Night Time Video for Illumination Enhancement, In: Suresh L., Panigrahi B. (eds) Proceedings of the International Conference on Soft Computing Systems. Advances in Intelligent Systems and Computing, Springer, New Delhi, vol. 397, (2016).
8. A. Carlos Sousa e Santos and H. Pedrini, Shot Boundary Detection for Video Temporal Segmentation based on the Weber Local Descriptor, In: IEEE International Conference on Systems, Man, and Cybernetics (SMC), Canada, (2017).
9. B.-L. Yeo and B. Liu, Rapid Scene Analysis on Compressed Video, In: IEEE Trans. on Circuits and Systems for Video Technology, vol. 5, no. 5, (1995).
10. M. A. Fouad, F. M. Bayoumi, Hoda M. Onsi and M. G. Darwish, Real-time shot transition detection in compressed MPEG video streams, In: Journal of Electronic Imaging, SPIE, vol. 17, no. 2, (2008), pp. 1-16.