880 IEEE Transactions on Consumer Electronics, Vol. 58, No. 3, August 2012

Multi-Directional Spatial Error Concealment Using Adaptive Edge Thresholding

Hadi Asheri, Hamid R. Rabiee, Senior Member, IEEE, Nima Pourdamghani, Mohammad Ghanbari, Fellow, IEEE

Abstract: In this paper, a new method is presented for spatial concealment of missing areas in image and video signals transmitted over error-prone infrastructures. The concealment process is performed in three steps. First, a novel technique is used to estimate the significant edges of missing areas after performing a directional edge analysis on the correctly received neighboring blocks of the missing areas. This technique uses the moments of the neighboring edge magnitudes to obtain an adaptive threshold for rejecting non-significant directions. Second, based on the predicted significant directions, an approximation is obtained for each missing pixel. Finally, for each pixel, we compute a weighted average by using two edge correspondence measures as weighting factors. Moreover, the adaptive thresholding of the edges lets us include as many edges as necessary in the interpolation process. Experiments show that the proposed method outperforms previous state-of-the-art methods on both subjective and objective measures.

Index Terms: Multi-directional spatial error concealment, image and video concealment, directional edge analysis, adaptive edge thresholding, directional entropy.

I. INTRODUCTION

Transmission of image and video signals over error-prone channels is susceptible to bit erasures and packet losses. Error concealment uses spatial or temporal redundancy to extend correctly received areas of images and video signals into areas corrupted by transmission errors [1]. Spatial error concealment methods take advantage of the underlying spatial redundancy between the pixels of the corrupted area and those of the correctly received neighboring area in the same frame.
In recent years, various methods have been proposed for spatial error concealment. Transform-domain interpolation methods include [2]-[6]. In [2]-[4], correctly received areas are represented as a linear combination of basis functions in the transform domain, and an estimate of the missing area is obtained by error minimization through selecting the most significant basis functions. Maximally smooth recovery and projection onto convex sets have been proposed in [5] and [6], respectively. Other methods perform error concealment in the spatial domain. Weighted averaging [7] is a simple and fast method for smooth images that has been employed in the H.264/AVC reference model [8]. All of the above-mentioned methods either implicitly or explicitly ignore edge information and hence fail to faithfully restore junction edges, which are fundamental components of images. To resolve this problem, the methods proposed in [9]-[13] apply edge-related information in the interpolation process. A Hough-transform-based directional method has been proposed in [14]. To obtain a low-complexity solution, the method proposed in [15] utilizes the intra-prediction information of the coded H.264/AVC bit-stream. Gaussian process regression, as a robust model, has been applied in a Bayesian framework in [16]. The method proposed in [17] utilizes the Bayesian estimation framework along with DCT pyramids to reconstruct the missing macro-blocks.

The authors are with the AICTC Research Center, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran (e-mail: hadi.asheri@gmail.com, rabiee@sharif.edu, pourdamghani@ce.sharif.edu, ghan@essex.ac.uk). M. Ghanbari is with the School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK (e-mail: ghan@essex.ac.uk).

Contributed Paper. Manuscript received 05/31/12. Current version published 09/25/12. Electronic version published 09/25/12.

0098-3063/12/$20.00 © 2012 IEEE
As the properties of video sequences and images vary with their content, it seems rational to combine different error/loss concealment methods to achieve better performance. The classification-based method of [18] uses this concept to introduce a hybrid method for spatial error concealment.

The rest of this paper is organized as follows. In Section II, the proposed method is described. Simulation results are shown in Section III. Finally, Section IV is devoted to conclusions.

II. THE PROPOSED METHOD

It is known that ignoring significant edge directions during concealment produces overly smooth interpolations that blur or even erase edges in the missing macro-blocks. To avoid this problem, we propose a three-step interpolation process. First, a novel technique is used to estimate the significant edges of missing areas after performing a directional edge analysis on the correctly received neighboring blocks of the missing areas. Second, an approximation is obtained along the direction of each significant edge. Finally, for each pixel, we compute a weighted average by using two edge correspondence measures as weighting factors. The strength of each significant edge, i.e. its magnitude, is used as the first weighting measure. The similarity of the two boundary pixels located at the ends of each significant direction is used as the second weighting measure.

A. Extracting the Significant Edges

Consider a missing macro-block, $B_m$, and its correctly received neighboring macro-blocks. For each pixel $p(x, y)$ of the neighboring boundary of $B_m$, the edge gradients, obtained by filtering the image with a kernel (denoted by $*$), are defined as:

$$g_h(x, y) = S_h * p(x, y), \tag{1}$$

$$g_v(x, y) = S_v * p(x, y), \tag{2}$$
where $S_h$ and $S_v$ are the horizontal and vertical Sobel kernels, respectively. These kernels are defined as:

$$S_h = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad S_v = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}. \tag{3}$$

For each pixel $p(x, y)$, the magnitude and direction components of the gradient are obtained by Eq. (4) and Eq. (5), respectively.

$$G(x, y) = \sqrt{g_h^2(x, y) + g_v^2(x, y)} \tag{4}$$

$$\theta(x, y) = \tan^{-1}\left(\frac{g_v(x, y)}{g_h(x, y)}\right) \tag{5}$$

As some of these edges are caused by noise or insignificant differences in intensity, the next step is to eliminate the weak edges. For this purpose, we employ a two-pass edge thresholding mechanism. First, we apply a fixed minimum threshold $T_m$ to remove extremely weak edges, which are mostly caused by noise. The set of valid edges, $E_c$, is therefore defined as:

$$E_c = \{\, G(x, y) \angle \theta(x, y) \mid G(x, y) > T_m \,\} \tag{6}$$

Second, in order to retain significant edges, an adaptive threshold based on the content of neighboring areas is required. To this end, we introduce an adaptive threshold, $\tau_a$, as:

$$\tau_a = \mu + c\,\sigma, \tag{7}$$

in which $\mu$ and $\sigma$ are the mean and standard deviation of the edge magnitudes in the correctly received neighboring areas, respectively. Note that the two parts of $\tau_a$ are specified based on the edge content of the boundary areas around the missing blocks. The parameter $c$ is a normalizing factor that is determined as follows. First, the members of $E_c$ are quantized into a predefined number of direction levels. Then, for each level, the direction of the edge with maximum magnitude is selected as the representative direction $\theta_i$. Finally, the magnitudes of the representative edges are normalized to form $E_q$ as follows:

$$E_q = \left\{ \varepsilon_i \,\middle|\, \varepsilon_i = \frac{G_i}{\sum_{j=1}^{n} G_j},\ i = 1, 2, \ldots, n \right\} \tag{8}$$

where $n$ is the number of quantization levels. Therefore, it is reasonable to compute the coefficient $c$ as:

$$c = 1 - \frac{1}{\log_2(n)} \sum_{\varepsilon_i \in E_q} \varepsilon_i \log_2\left(\frac{1}{\varepsilon_i}\right) \tag{9}$$

The sum in Eq. (9) is the entropy of the normalized edge magnitudes, and a small edge entropy corresponds to a small number of significant edges.
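The two-pass thresholding of Eqs. (6)-(9) can be illustrated with a short Python sketch. It is not the authors' implementation. It reads Eq. (7) as the mean plus $c$ times the standard deviation, and Eq. (9) as one minus the entropy of the normalized representative magnitudes divided by $\log_2(n)$. The function names, the default of eight direction levels, the use of the population standard deviation, the computation of the mean and standard deviation over the valid edges only, and the folding of orientations into $[0, \pi)$ are all assumptions made for illustration.

```python
import math

SOBEL_H = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_V = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edge(img, x, y):
    """Gradient magnitude and orientation at interior pixel img[y][x].

    The kernels are applied as a correlation (no kernel flip).
    """
    gh = gv = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            v = img[y + dy][x + dx]
            gh += SOBEL_H[dy + 1][dx + 1] * v
            gv += SOBEL_V[dy + 1][dx + 1] * v
    # Orientation folded into [0, pi): an assumed convention for Eq. (5).
    return math.hypot(gh, gv), math.atan2(gv, gh) % math.pi


def adaptive_threshold(edges, t_min, n_levels=8):
    """edges: (magnitude, orientation) pairs taken from the boundary pixels.

    Returns (tau_a, c, edges above tau_a). n_levels is an assumed value.
    """
    # First pass: drop edges at or below the fixed minimum threshold.
    valid = [(g, th) for g, th in edges if g > t_min]
    if not valid:
        return None, None, []
    # Strongest magnitude within each quantised direction level.
    rep = [0.0] * n_levels
    for g, th in valid:
        k = min(int(th / math.pi * n_levels), n_levels - 1)
        rep[k] = max(rep[k], g)
    total = sum(rep)
    eps = [g / total for g in rep]  # normalised representative magnitudes
    # Entropy of the normalised magnitudes; empty levels contribute zero.
    entropy = sum(e * math.log2(1.0 / e) for e in eps if e > 0)
    c = 1.0 - entropy / math.log2(n_levels)
    mags = [g for g, _ in valid]
    mu = sum(mags) / len(mags)
    sigma = math.sqrt(sum((g - mu) ** 2 for g in mags) / len(mags))
    # Second pass: adaptive threshold, mean plus c standard deviations.
    tau_a = mu + c * sigma
    return tau_a, c, [(g, th) for g, th in valid if g > tau_a]
```

Under these assumptions, a boundary with a single dominant direction gives an entropy near zero, so $c$ approaches 1 and the threshold rises to about $\mu + \sigma$. A boundary whose representative magnitudes are spread evenly over all levels gives $c$ near 0, and the threshold falls to about $\mu$.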
Consequently, to preserve more edges, $c$ should have a relatively small value; otherwise, $c$ should have a high value. This means that the value of $c$ is adapted to the content of the neighboring blocks. As a consequence, the threshold value, $\tau_a$, can be automatically selected based on the content of the correctly received neighboring boundary blocks of the missing macro-block. Finally, the set of all significant edges, $E_t$, for the missing block can be defined as:

$$E_t = \{\, G(x, y) \angle \theta(x, y) \in E_c \mid G(x, y) > \tau_a \,\} \tag{10}$$

B. Interpolation

In order to reconstruct each pixel, $\hat{p}(x, y)$, of the missing macro-block, a multi-directional interpolation is performed as defined by Eq. (11).

$$\hat{p}(x, y) = \sum_{i=1}^{|E_t|} g_i\, p_i^{*}, \tag{11}$$

$$g_i = \frac{G_i}{\sum_{j=1}^{n} G_j} \tag{12}$$

where $|E_t|$ is the number of significant edges, i.e. the cardinality of $E_t$, $g_i$ is the first weighting factor, and $G_i$ is the magnitude of the corresponding edge in $E_t$. As shown in Fig. 1, the approximated value, $p_i^{*}$, along the $i$th significant edge is obtained by:

$$p_i^{*} = w_{d_i}\, \frac{d_1 p_2 + d_2 p_1}{d_1 + d_2} \tag{13}$$

where the second weighting factor, $w_{d_i}$, is a similarity measure between the two boundary pixels, $p_1$ and $p_2$, along the direction of the $i$th significant edge, and $d_1$ and $d_2$ are the distances of the given pixel from the boundary pixels $p_1$ and $p_2$. The weighting factor $w_{d_i}$ is defined as:

$$w_{d_i} = \exp\left(-\frac{1}{\lambda}\,|p_1 - p_2|\right) \tag{14}$$

where $\lambda$ is a scaling factor.

Fig. 1. The directional interpolation.

In fact, the coefficient $w_{d_i}$ implies that edges with two similar end points are more likely to be the correct edges through the missing blocks. The block diagram of the proposed Multi-Directional Interpolation (MDI) method is shown in Fig. 2.

III. EXPERIMENTAL RESULTS

We have tested the performance of our method on various standard images. The error patterns are shown in Fig. 3 and the size of corrupted macro-blocks is 16 × 16. We have
compared our work, the so-called Multi-Directional Interpolation (MDI), with the popular Weighted Averaging (WAVG) [7] method and with state-of-the-art methods such as Frequency Selective Extrapolation (FSE) [4], Fast-FSE (FFSE) [3], and Fine Directional Interpolation (FDI) [11]. Note that FSE and FFSE both suffer from high computational complexity. Since the implementation of [4] was not available, for FSE we make use of [3] instead of [4]. Notice that this implementation provides the same quality, but increases the computational complexity of [4]. Here, the performance of the proposed method is shown on three images of Fig. 4. In our experiments, we have set the minimum threshold $T_{min}$ to 0.1 and the scaling factor of Eq. (14) to 10.

Fig. 2. Block diagram of the proposed Multi-Directional Interpolation with adaptive edge threshold (MDI).

Fig. 3. Two examined error patterns: (a) checker-board, (b) consecutive.

Fig. 4. The original test images.

Tables I and II show the objective performance of all the competing methods in Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Measure (SSIM) for the checker-board error pattern of Fig. 3(a). The last columns of these tables denote the improvement of the proposed MDI method compared to the best performance among the competing methods. As illustrated, MDI consistently performs better, and shows an average PSNR improvement of 1.05 dB and an SSIM improvement of 1.5 dB. Subjective performance of all the methods for the first frame of the foreman sequence is shown in Fig. 5. As illustrated, MDI shows a better reconstruction of the significant edges of the missing blocks. Fig. 6 presents the error-concealed images for the mobile sequence. In general, the mobile frames are less smooth than the foreman frames, and the missing blocks of the mobile frames are more difficult to reconstruct. As illustrated, the reconstruction produced by MDI is more pleasant than those of the other competing methods.

TABLE I
PERFORMANCE COMPARISON IN PSNR FOR CHECKER PATTERN.

Image PER WAVG FSE FFSE FDI MDI
Bridge 0.36 35.9 35.5 35.7 36.1
Container 0.36 36.6 36.5 36.7 36.8
Foreman 0.36 36.8 38.1 38.2 39
House 0.33 38.4 38.2 38.1 39
Lena 0.41 36.4 36.7 36.7 37.1
Mother 0.36 37.8 37.4 37.4 38.5
Average 0.4 37 37.1 37.1 37.8

TABLE II
PERFORMANCE COMPARISON IN SSIM FOR CHECKER PATTERN.

Image      PER   WAVG  FSE   FFSE  FDI   MDI
Bridge     0.36  86.8  85.3  86    85    87.6
Container  0.36  88    87.9  88.4  87.2  89.2
Foreman    0.36  89.1  92    92.1  92.2  93.8
House      0.33  89.4  91.4  91.2  89.7  91.9
Lena       0.41  87.5  90.6  90.8  90.1  92.1
Mother     0.36  91    90.6  90.9  90.9  92.8
Average    0.4   88.6  89.6  89.9  89.2  91.2

In image and video communications, it is common to lose multiple macro-blocks in one row of macro-blocks. In order to verify the performance of the proposed method in this case, we have also tested it with the consecutive error pattern shown in Fig. 3(b). Tables III and IV present the objective results of the competing methods for the consecutive error pattern. In this case, MDI shows an average PSNR improvement of 0.4 dB and an SSIM improvement of 0.4 dB. Subjective results of the competing methods for the consecutive error pattern are shown in Fig. 7 and Fig. 8. As illustrated, compared to other methods, MDI-concealed frames show better reconstruction of significant edges while producing fewer artifacts.
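For readers who want to reproduce the kind of objective score reported in the PSNR tables, the following Python sketch computes PSNR between an original and a concealed greyscale frame. The paper does not state the peak value or whether the error is measured over the whole frame or only over the missing blocks; the sketch assumes 8-bit images (peak 255) and a whole-frame mean squared error, and the function name `psnr` is chosen here for illustration.

```python
import math


def psnr(reference, concealed, peak=255.0):
    """PSNR in dB between two equally sized greyscale images (lists of rows).

    Assumes a whole-frame mean squared error and an 8-bit peak by default.
    """
    sq_err = 0.0
    count = 0
    for ref_row, con_row in zip(reference, concealed):
        for r, c in zip(ref_row, con_row):
            sq_err += (r - c) ** 2
            count += 1
    mse = sq_err / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 1 dB corresponds to an error that is lower by a factor of $10^{0.1} \approx 1.26$, i.e. roughly 21% less mean squared error.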
TABLE III
PERFORMANCE COMPARISON IN PSNR FOR CONSECUTIVE PATTERN.

Image      PER %  WAVG  FSE   FFSE  FDI   MDI
Bridge     0.23   37.8  35.6  35.6  37.6  38.1
Container  0.23   37    36.3  36.2  37    37.8
Foreman    0.23   37.4  36.7  37.2  38.2  39.5
House      0.23   40.9  38    38.3  40.8  40.9
Lena       0.27   38.1  38    37.9  38.2  38.6
Mother     0.23   39.5  37.6  38.2  39.7  40.1
Average           38.5  37    37.2  38.6  39.2

TABLE IV
PERFORMANCE COMPARISON IN SSIM FOR CONSECUTIVE PATTERN.

Image      PER %  WAVG  FSE   FFSE  FDI   MDI
Bridge     0.23   90.2  87.7  88.3  89.6  90.2
Container  0.23   91.1  90    90.3  90.6  91.4
Foreman    0.23   91.4  92.6  92.8  92    93.7
House      0.23   92.7  91.4  91.8  92.5  93.4
Lena       0.27   90.5  92    92.1  91.1  92.6
Mother     0.23   93.2  91.9  92.4  93    93.8
Average    0.2    91.5  90.9  91.3  91.5  92.5

Finally, Fig. 9 and Fig. 10 compare the computational complexity of all the examined methods. From these figures, however, it can be observed that the proposed method increases the overall concealment time.

Fig. 5. Subjective comparison of foreman with checker pattern.

Fig. 6. Subjective comparison of mobile with checker pattern.

IV. CONCLUSION

High-quality and efficient error concealment of images and video frames is an important task in many emerging multimedia communication systems. In this paper, we proposed a simple yet effective method to compensate for macro-blocks lost over noisy channels. We introduced the concept of hypothesizing significant edges for the missing blocks by using the correctly received contents of the neighboring blocks. Moreover, the use of adaptive thresholding based on the edge-related content of the image has enabled the proposed algorithm to adaptively select significant edges for interpolation and to produce subjectively pleasant results. Experimental results showed that using content-adaptive threshold values results in better interpolations both for smooth images with significant edges and for images with high detail. We compared our work with both popular methods such as WAVG and state-of-the-art methods such as FDI, FSE and FFSE. The results showed a noticeable improvement of the proposed method (MDI) for both checker-board and consecutive error patterns based on objective and subjective measures.
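As a supplementary illustration of the interpolation step of Section II-B, the following Python sketch combines the directional estimates for a single missing pixel. It is a sketch under stated assumptions rather than the authors' implementation. Each candidate direction is given as a tuple of edge magnitude, the two boundary pixel values, and the two distances. The directional estimate follows the distance-weighted form of Eq. (13), and the similarity weight is taken to be a decaying exponential of the absolute boundary-pixel difference divided by a scaling factor. Dividing by the sum of the weights, so that the result is a true weighted average, is an assumption of this sketch, as are all function and parameter names. Tracing each direction through the block to find the boundary pixels is not shown.

```python
import math


def directional_estimate(p1, p2, d1, d2):
    """Blend of the two boundary pixels; the nearer pixel gets more weight."""
    return (d1 * p2 + d2 * p1) / (d1 + d2)


def similarity_weight(p1, p2, scale):
    """Close to 1 for similar boundary pixels, decaying as they differ."""
    return math.exp(-abs(p1 - p2) / scale)


def conceal_pixel(candidates, scale=10.0):
    """candidates: iterable of (G_i, p1, p2, d1, d2), one per direction.

    Returns the weighted average of the directional estimates, with each
    weight being edge magnitude times boundary-pixel similarity.
    """
    num = 0.0
    den = 0.0
    for g, p1, p2, d1, d2 in candidates:
        w = g * similarity_weight(p1, p2, scale)
        num += w * directional_estimate(p1, p2, d1, d2)
        den += w
    if den == 0.0:
        raise ValueError("no usable direction for this pixel")
    # Normalising by the weight sum is an assumption of this sketch.
    return num / den
```

With a single direction whose boundary pixels are 100 and 200 at distances 1 and 3, the estimate is 125, three quarters of the way towards the nearer pixel. When a second direction joins two very different boundary pixels, its similarity weight is close to zero and it barely affects the result.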
Fig. 7. Subjective comparison for consecutive error pattern.

Fig. 8. Subjective comparison for consecutive error pattern.

Fig. 9. Complexity comparison for checker-board error pattern.

Fig. 10. Complexity comparison for consecutive error pattern.

REFERENCES

[1] Y. Wang and Q. F. Zhu, "Error control and concealment for video communication: A review," Proceedings of the IEEE, vol. 86, no. 5, pp. 974-997, 1998.
[2] A. Kaup, K. Meisinger, and T. Aach, "Frequency selective signal extrapolation with applications to error concealment in image communication," AEU-International Journal of Electronics and Communications, vol. 59, no. 3, pp. 147-156, 2005.
[3] J. Seiler and A. Kaup, "Fast orthogonality deficiency compensation for improved frequency selective image extrapolation," Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, 2008, pp. 781-784.
[4] J. Seiler and A. Kaup, "Complex-valued frequency selective extrapolation for fast image and video signal extrapolation," Signal Processing Letters, IEEE, vol. 17, no. 11, pp. 949-952, Nov. 2010.
[5] Y. Wang, Q. F. Zhu, and L. Shaw, "Maximally smooth image recovery in transform coding," Communications, IEEE Transactions on, vol. 41, no. 10, pp. 1544-1551, 2002.
[6] B. J. Yun, M.-H. Park, and H.-D. Hong, "POCS-based error concealment using interlayer correlation and features of neighbor blocks in multilayer video coding," Consumer Electronics, International Conference on, 2010, pp. 385-386.
[7] P. Salama, N. B. Shroff, and E. J. Delp, "Error concealment in encoded video streams," Signal Recovery Techniques for Image and Video Compression and Transmission, pp. 199-233, 1998.
[8] Y. K. Wang, M. M. Hannuksela, V. Varsa et al., "The error concealment feature in the H.26L test model," Proc. ICIP, 2002, pp. 729-732.
[9] X. Chen, Y. Chung, and C. Bae, "Dynamic multi-mode switching error concealment algorithm for H.264/AVC video applications," Consumer Electronics, IEEE Transactions on, vol. 54, no. 1, pp. 154-162, 2008.
[10] M. Kim, H. Lee, and S. Sull, "Spatial error concealment for H.264 using sequential directional interpolation," Consumer Electronics, IEEE Transactions on, vol. 54, no. 4, pp. 1811-1818, 2008.
[11] W. Kim, J. Koo, and J. Jeong, "Fine directional interpolation for spatial error concealment," IEEE Transactions on Consumer Electronics, vol. 52, no. 3, pp. 1050-1056, 2006.
[12] M. Ma, O. C. Au, S. H. G. Chan et al., "Edge-directed error concealment," Circuits and Systems for Video Technology, IEEE Transactions on, vol. 20, no. 3, pp. 382-395, 2010.
[13] H. R. Rabiee and R. Kashyap, "Error concealment of still image and video streams with multi-directional recursive non-linear filters," IEEE International Conference on Image Processing, 1996.
[14] H. Gharavi and S. Gao, "Spatial interpolation algorithm for error concealment," Acoustics, Speech and Signal Processing, IEEE International Conference on, 2008, pp. 1153-1156.
[15] S. V. Chapaneri and J. J. Rodriguez, "Low complexity error concealment scheme for intra-frames in H.264/AVC," Image Processing, IEEE International Conference on, 2009, pp. 925-928.
[16] H. Asheri, H. R. Rabiee, N. Pourdamghani et al., "A Gaussian process regression framework for spatial error concealment with adaptive kernels," Pattern Recognition (ICPR), 2010 20th International Conference on, 2010, pp. 4541-4544.
[17] G. Zhai, X. Yang, W. Lin et al., "Bayesian error concealment with DCT pyramid," Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, 2010, pp. 1366-1369.
[18] M. Chen, Y. Zheng, and M. Wu, "Classification-based spatial error concealment for visual communications," EURASIP Journal on Applied Signal Processing, vol. 5, pp. 1-17, 2006.

Biographies

Hadi Asheri received his M.Sc. in Artificial Intelligence from Sharif University of Technology, Tehran, Iran in 2010 and his B.Sc. in Computer Engineering from University of Kerman, Kerman, Iran. He is currently a research assistant at the Digital Media Laboratory, Department of Computer Engineering, Sharif University of Technology. His current research interests include statistical signal processing, computer vision and machine learning.

Hamid R. Rabiee (SM'07) received his B.S. and M.S. degrees (with great distinction) in Electrical Engineering from CSULB, USA, his EEE in Electrical and Computer Engineering from USC, USA, and his Ph.D. in Electrical and Computer Engineering from Purdue University, West Lafayette, USA in 1996. From 1993 to 1996 he was a Member of Technical Staff at AT&T Bell Laboratories. From 1996 to 1999 he worked as a Senior Software Engineer at Intel Corporation. He was also with PSU, OGI and OSU universities as an adjunct professor of Electrical and Computer Engineering from 1996 to 2000. In September 2000, he joined Sharif University of Technology (SUT), Tehran, Iran.
He is the founder of Sharif University's Advanced Information and Communication Technology Research Center (AICTC), Sharif University Advanced Technologies Incubator (SATI), Sharif Digital Media Laboratory (DML) and Mobile Value Added Services (MVAS) laboratories. He is currently an Associate Professor of Computer Engineering at Sharif University of Technology, an Adjunct Professor of Computer Science at UNB, Canada, and the Director of AICTC, DML and MVAS. He has been the initiator and director of national and international level projects in the context of the UNDP International Open Source Network (IOSN) and Iran's National ICT Development Plan. He has received numerous awards and honors for his industrial, scientific and academic contributions, has acted as chairman in a number of national and international conferences, and holds three patents.

Nima Pourdamghani received his M.Sc. in Artificial Intelligence from Sharif University of Technology, Tehran, Iran in 2010 and his B.Sc. in Computer Engineering from Sharif University of Technology, Tehran, Iran. His current research interests include computer vision and machine learning.

Mohammad Ghanbari (M'78-SM'97-F'01) is a Professor of Video Networking in the Department of Computing and Electronic Systems, University of Essex, United Kingdom. He is best known for his pioneering work on two-layer video coding for ATM networks, now known as SNR scalability in the standard video codecs, which earned him the Fellowship of IEEE in 2001. He has registered for eleven patents and published more than 450 technical papers on various aspects of video networking. He was the co-recipient of the A.H. Reeves prize for the best paper published in the 1995 proceedings of IEE in the theme of digital coding. He was also co-investigator of the European MOSAIC project studying the subjective assessment of picture quality, which resulted in ITU-R Recommendation 500.
He is the co-author of Principles of Performance Engineering, a book published by IEE Press in 1997, and the author of Video Coding: An Introduction to Standard Codecs, a book also published by IEE Press in 1999, which received the Rayleigh prize as the best book of year 2000 by IEE. His latest book, Standard Codecs: Image Compression to Advanced Video Coding, was published by IEE in 2003. He has been an organizing member of several international conferences and workshops. He was the general chair of the 1997 international workshop on Packet Video and Guest Editor of the 1997 IEEE Transactions on Circuits and Systems for Video Technology Special Issue on Multimedia Technology and Applications. He served as Associate Editor of IEEE Transactions on Multimedia (IEEE-T-MM) from 1998 to 2004 and represented the University of Essex as one of the six UK academic partners in the Virtual Centre of Excellence in Digital Broadcasting and Multimedia. He is a Fellow of IEEE, a Fellow of IET and a Chartered Engineer (CEng).