Fast MBAFF/PAFF Motion Estimation and Mode Decision Scheme for H.264

Ju-Heon Seo, Sang-Mi Kim, Jong-Ki Han, Nonmember

Abstract-- In H.264, MBAFF (macroblock adaptive frame/field) and PAFF (picture adaptive frame/field) coding are used to enhance the coding efficiency for interlaced video sequences. However, the computational complexity of the ME module when MBAFF and PAFF are used is very high. Therefore, increasing the speed of the MBAFF and PAFF modules is an important issue in constructing an efficient H.264 encoder. In this paper, we propose three efficient algorithms to reduce the complexity of the ME (Motion Estimation) and MD (Mode Decision) modules using MBAFF and PAFF. The simulation results show that the proposed scheme reduces the computational complexity while the generated bit rate and the image quality remain unchanged.

Index Terms-- H.264, Motion estimation, Mode decision, MBAFF coding, PAFF coding.

I. INTRODUCTION

H.264/AVC is an international video coding standard approved by ITU-T as Recommendation H.264 and by ISO/IEC as MPEG-4 Part 10 AVC. The scheme has been designed to provide a technical solution appropriate for broadcast, storage devices, and conversational services over wireless networks, VOD, or multimedia streaming services [1]. In H.264, new coding tools have been adopted, such as rate-constrained coder control, variable block size motion estimation with small block sizes, quarter-sample-accurate motion compensation, context-adaptive entropy coding, multiple reference frame motion estimation, directional spatial prediction for intra coding, weighted prediction, picture adaptive frame/field (PAFF) coding [1][2], and macroblock adaptive frame/field (MBAFF) coding [1][3]. Owing to these coding tools, H.264 offers a significant gain in compression efficiency compared with previous standards.

Ju-Heon Seo is with the Department of Information and Communications Engineering, Information and Telecommunications Research Institute, Sejong University, Seoul, South Korea (e-mail: crazydreamer99@msn.com). Sang-Mi Kim is with the Department of Information and Communications Engineering, Information and Telecommunications Research Institute, Sejong University, Seoul, South Korea (e-mail: sangmikim82@teramail.com). Jong-Ki Han (corresponding author) is with the Department of Information and Communications Engineering, Information and Telecommunications Research Institute, Sejong University, Seoul, South Korea (e-mail: hjk@sejong.ac.kr). This study was supported by a grant of the Seoul R&D program (10557).

Among the techniques adopted in H.264, efficient coding with MBAFF and PAFF is very important for achieving high picture quality and a high compression ratio for interlaced video sequences. A frame of an interlaced video sequence with moving objects or camera motion consists of two fields scanned at different time instants. The two fields of a frame can be coded jointly (i.e., frame coding) or separately (i.e., field coding). Traditionally, pictures with high motion are preferably coded by field coding and pictures with low motion by frame coding. When a sequence consists of pictures that are each dominated by high motion or by low motion at the picture level, using the PAFF technique can increase the coding performance. On the other hand, when a picture contains some MBs with high motion and others with low motion, the MBAFF coding technique, which adapts at the macroblock level, will enhance the coding performance. The coding efficiency can thus be improved by adaptive coding modes instead of a fixed coding mode. Since MBAFF and PAFF coding is one of the most computationally expensive parts of the H.264 encoding system, increasing its speed is a very important issue in constructing an efficient H.264 encoder.
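
The distinction between frame coding and field coding can be illustrated with a minimal sketch. It assumes the usual convention that the top field holds the even-numbered rows of a frame and the bottom field the odd-numbered rows. The function names and the toy 32x16 array are illustrative only and are not taken from the paper or from the JM software.

```python
import numpy as np

def split_fields(frame):
    """Return (top_field, bottom_field): even rows and odd rows of the frame."""
    return frame[0::2, :], frame[1::2, :]

def merge_fields(top, bottom):
    """Interleave two fields back into one frame (inverse of split_fields)."""
    rows = top.shape[0] + bottom.shape[0]
    frame = np.empty((rows, top.shape[1]), dtype=top.dtype)
    frame[0::2, :] = top
    frame[1::2, :] = bottom
    return frame

# Toy frame: 32 rows by 16 columns, filled with distinct sample values.
frame = np.arange(32 * 16, dtype=np.int32).reshape(32, 16)
top, bottom = split_fields(frame)

# Frame coding would process `frame` as a whole; field coding would
# process `top` and `bottom` as two separate 16-row pictures.
rebuilt = merge_fields(top, bottom)
```

In this toy layout, row 1 of the top field is row 2 of the frame, which is why fast vertical motion between the two scan instants makes adjacent frame rows poorly correlated and favours field coding.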

Various algorithms have been proposed to reduce the computational complexity of MBAFF and PAFF coding [4][5][6]. In [4], a fast PAFF and MBAFF mode prediction that uses motion detection and the statistics of motion vectors (MVs) was studied. In [5], a new data structure called MBG (Macro Block Group) for progressive and interlaced video was proposed for MBAFF coding in H.264. A fast decision scheme [6] using motion activity decides the coding mode of a picture between the frame and field types for PAFF coding in an H.264 encoder.

In the conventional H.264 encoder [7], MBAFF coding is performed before the PAFF process. First, the current picture is encoded as a frame, where each MB is compressed in frame MB mode or field MB mode according to the properties of the pixels in the MB. If the RDcost of the current MB in frame MB mode is smaller than that in field MB mode, the MB is coded in frame mode; otherwise it is coded in field mode. This process is called MBAFF. After the current picture has been encoded as a frame with MBAFF, it is compressed again in field picture coding mode, where the frame is split into a top field and a bottom field. If the RDcost of the picture encoded by frame coding with MBAFF is smaller than that resulting from field picture coding of the top and bottom fields, the picture is coded in frame coding mode using MBAFF. Otherwise, the frame is separated into top and bottom fields, and each field is compressed in field coding mode. This process is called PAFF.

In this paper, we propose a modified MBAFF and PAFF scheme to reduce the complexity of the motion estimation (ME) and mode decision (MD) modules. This paper is organized as follows. Section 2 describes the ME and MD scheme using MBAFF and PAFF coding adopted in the H.264 codec [7]. In Section 3, we propose an efficient algorithm for MBAFF and PAFF coding. Computer simulation results for the proposed algorithm are presented in Section 4. The conclusion is given in Section 5.

II. MBAFF AND PAFF IN H.264

The H.264 standard [7] allows three picture coding modes: frame MB coding, field MB coding, and field picture coding. The flowchart of MBAFF and PAFF is shown in Fig. 1 [3]. In MBAFF coding, the selection between frame and field coding is made at the MB level, where an input super MB (whose size is 32x16) can be coded as two frame MBs or two field MBs based on RDcost. In PAFF, an input frame can be coded by the frame picture coding mode incorporating MBAFF or by the field picture coding mode.

Fig. 1. Flowchart of MBAFF and PAFF coding for a frame picture

In MBAFF coding, a super MB can be encoded after it is split into a top frame MB and a bottom frame MB, as shown in Fig. 2 (a). Alternatively, it can be processed after being split into two field MBs (a top field MB and a bottom field MB), as in Fig. 2 (b). Note that the size of every kind of MB (top frame MB, bottom frame MB, top field MB, bottom field MB) is 16x16, except for a super MB (32x16). The coding order is modified because of the use of super MBs, as described in Fig. 3.

Fig. 2. Structure of a super MB in MBAFF coding. (a) A super MB can be split into two frame MBs, (b) A super MB can be split into two field MBs.

Fig. 3. Coding order modified due to using super MBs in MBAFF coding.

In H.264 [7], a multi-pass approach is used to estimate MVs and optimal modes for a frame picture. Figure 4 shows the flowchart of the conventional process using MBAFF and PAFF coding, where mcost and RDcost are defined in (1) and (2).

$\mathrm{mcost} = \mathrm{SAD} + \lambda_{motion} \cdot \mathrm{Rate}(MV - PMV)$   (1)

$\mathrm{RDcost} = \mathrm{SSD} + \lambda_{mode} \cdot \mathrm{Rate}_{mode}$   (2)

Fig. 4. Conventional ME and MD scheme using MBAFF and PAFF in H.264

In the inter frame prediction procedure, the reference pictures for frame coding are previously reconstructed frames, whereas for field coding the reference data are previously reconstructed fields. If we consider only one reference picture for simplicity, examples of reference pictures are shown in Fig. 5, Fig. 6, and Fig. 7.

Fig. 5 shows the ME process and reference data used for frame MBs in a super MB in MBAFF coding, where the nearest previously reconstructed frame is used as the reference picture for both the top frame MB and the bottom frame MB. Fig. 6 shows the ME process applied to field MBs when the super MB is encoded with top/bottom field MBs in MBAFF coding, where the last two previously reconstructed field pictures are used as the reference data. Fig. 7 represents the ME process and reference data used in field picture coding. When an MB in a top field picture is encoded in the field picture coding procedure, two previously reconstructed fields are used as the reference pictures, as shown in Fig. 7 (a). On the other hand, when an MB in a bottom field is compressed, only one previous field is used as a reference picture, as shown in Fig. 7 (b).

Fig. 5. Reference data used for top frame MB and bottom frame MB in frame picture coding using MBAFF (a) Reference data for a top frame MB in a super MB, (b) Reference data for a bottom frame MB in a super MB

Fig. 6. Reference data used for top field MB and bottom field MB in frame picture coding using MBAFF (a) Reference data for a top field MB in a super MB, (b) Reference data for a bottom field MB in a super MB

Fig. 7. Reference data used for field pictures in PAFF coding (a) Reference data for a MB in a top field picture, (b) Reference data for a MB in a bottom field picture

III. PROPOSED ALGORITHMS FOR MBAFF AND PAFF

In this paper, we propose three schemes to reduce the computational complexity of the ME and MD modules for MBAFF and PAFF. First, because field MB coding and field picture coding use the same reference data and their processes are very similar, the results (MVs and modes) of field MB coding in MBAFF can be used as predictive information for field picture coding. Second, exploiting the similarity between the search ranges of the top MB and the bottom MB in a super MB can further reduce the complexity of the ME and MD process. Last, the RDcost calculated in INTRA coding for a super MB can be reused for INTRA coding in top/bottom field coding. The MVs reused in proposed algorithms A and B have integer-sample accuracy, and the refinement in A and B proceeds to quarter-sample accuracy by the same method as in the conventional scheme.

A. Efficient ME scheme for a MB in a top field picture (proposed A)

Figures 6(a) and 7(a) show that the optimal MVs of the top field MB in a super MB and of an MB in a top field picture are estimated over the same area. This fact can be used to simplify the ME process for an MB in the top field picture. First, the MVs of the top field MB in a super MB are estimated during the MBAFF process by minimizing mcost in (1). Then, the estimated MVs and mode information are used as predictive MVs (PMVs) in the ME procedure for an MB in the top field picture.
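
The reuse step can be sketched as follows. This is a minimal integer-sample illustration under stated assumptions, not the JM implementation: `mv_rate` is a crude stand-in for Rate(MV - PMV) in (1), the value of `lam`, the search radii, and the synthetic random reference field are all hypothetical, and sub-sample refinement is omitted.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def mv_rate(mv, pmv):
    """Hypothetical proxy for Rate(MV - PMV): L1 length of the MV difference."""
    return abs(mv[0] - pmv[0]) + abs(mv[1] - pmv[1])

def search(cur, ref, y0, x0, center, radius, pmv, lam=4):
    """Minimise mcost = SAD + lam * rate over a square window around `center`.

    Returns (best_mv, best_cost, number_of_candidates_checked)."""
    n = cur.shape[0]
    best_mv, best_cost, checked = None, None, 0
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            y, x = y0 + dy, x0 + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue  # candidate block falls outside the reference field
            cost = sad(cur, ref[y:y + n, x:x + n]) + lam * mv_rate((dy, dx), pmv)
            checked += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost, checked

# Synthetic reference field and a 16x16 block displaced by a known vector.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64))
y0, x0, true_mv = 24, 24, (3, -2)
cur = ref[y0 + true_mv[0]:y0 + true_mv[0] + 16,
          x0 + true_mv[1]:x0 + true_mv[1] + 16].copy()

# Full search, as done once for the top field MB of the super MB.
full_mv, full_cost, full_n = search(cur, ref, y0, x0, (0, 0), 8, (0, 0))

# Field picture coding reuses that MV as the PMV and only refines around it.
fast_mv, fast_cost, fast_n = search(cur, ref, y0, x0, full_mv, 1, full_mv)
```

In this toy setting the refinement visits 9 candidates instead of 289 and returns the same vector, which is the kind of saving the scheme relies on when both searches operate on the same pixels and the same reference fields.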
To increase the accuracy of the MVs estimated for an MB in the top field picture, a refinement process is performed over a narrow search region. Since both ME procedures are similar to each other, the complexity of ME can be reduced significantly without any additional degradation in the reconstructed images.

To calculate the computational complexity, we denote the number of MBs in a frame picture as $M$. If the frame is split into top and bottom fields, the number of MBs in each field is $M/2$. When the MVs for all MBs in a top field picture are estimated by the conventional scheme, the computational complexity is $O(0.5M \cdot 2 \cdot C_{ME})$, since the number of MBs in the top field is $M/2$ and MVs are searched over both the top and bottom fields of the reference frame. Here $C_{ME}$ is the complexity of ME performed for one MB. On the other hand, when the proposed scheme A is used, the computational complexity becomes $O(0.5M \cdot 2 \cdot C_{R})$, where $O(C_{R})$ denotes the complexity of MV refinement.

B. Efficient ME scheme for a bottom field MB in MBAFF coding (proposed B)

Figure 6 shows the procedures and reference data of the ME process for a top field MB and a bottom field MB in MBAFF coding. To simplify the procedure, we first estimate an optimal MV of the top field MB over the full search range. Then, the estimated MV of the top field MB is used as a PMV for the bottom field MB in the same super MB. Owing to the similarity of the two MBs, the refinement of the ME process can be performed over a very narrow region. As in Section 3-A, the complexity of the conventional ME for bottom field MBs in MBAFF is $O(0.5M \cdot 2 \cdot C_{ME})$, while the proposed scheme B reduces the burden to $O(0.5M \cdot 2 \cdot C_{R})$.

C. Efficient intra coding scheme for MBs in field picture coding (proposed C)

Figure 8 shows the neighboring pixels used in intra prediction coding for a top/bottom field MB in a super MB and for MBs in a top/bottom field picture during field picture coding.
Since the pixel values in the top/bottom field MBs in a super MB are equal to those in the MBs in the top/bottom field, respectively, the RDcosts calculated in the 16x16 Intra and 4x4 Intra coding processes for the top/bottom MBs in a super MB can be reused in Intra coding for MBs in the top/bottom fields, respectively. Based on the calculated RDcost of a super MB, we can simply decide an optimal intra coding mode for MBs during the field picture coding procedure.

Fig. 8. Neighborhood pixels used in Intra Prediction Coding for (a) a top field MB and a bottom field MB in a super MB using MBAFF coding, and (b) MBs in a top field picture and a bottom field picture during field picture coding
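
The intra cost reuse described above amounts to caching. The sketch below is only an illustration under stated assumptions: `intra_rdcost` is a placeholder cost and not the JM intra RDcost, the cache keyed by field parity and MB position is a hypothetical data structure, and any difference in the neighboring pixels available for prediction is ignored.

```python
import numpy as np

INTRA_MODES = ("INTRA16x16", "INTRA4x4")
evaluations = 0  # counts calls to the (normally expensive) intra cost routine

def intra_rdcost(mb, mode):
    """Placeholder for the encoder's intra RDcost; NOT the JM computation."""
    global evaluations
    evaluations += 1
    penalty = 0 if mode == "INTRA16x16" else 64
    return int(np.abs(mb - mb.mean()).sum()) + penalty

def field_mb(frame, parity, r, c):
    """16x16 MB at MB position (r, c) of the top (parity 0) or bottom (1) field."""
    field = frame[parity::2, :]
    return field[16 * r:16 * r + 16, 16 * c:16 * c + 16]

def mbaff_pass(frame, cache):
    """Evaluate intra costs for both field MBs of every super MB and store them."""
    rows, cols = frame.shape[0] // 32, frame.shape[1] // 16
    for r in range(rows):
        for c in range(cols):
            for parity in (0, 1):
                mb = field_mb(frame, parity, r, c)
                cache[(parity, r, c)] = {m: intra_rdcost(mb, m) for m in INTRA_MODES}

def field_picture_intra_mode(cache, parity, r, c):
    """Field picture coding: pick the best intra mode from cached costs only."""
    costs = cache[(parity, r, c)]
    return min(costs.items(), key=lambda item: item[1])  # (mode, cost)

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(64, 32))

cache = {}
mbaff_pass(frame, cache)
n_after_mbaff = evaluations

# No further intra cost evaluations are needed for the field pictures.
decisions = {key: field_picture_intra_mode(cache, *key) for key in cache}
```

The key works for both passes because the top (or bottom) field MB of the super MB at position (r, c) covers the same samples as the MB at position (r, c) of the top (or bottom) field picture.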

To perform the MD process for MBs in field picture coding, the conventional scheme considers 7 modes (Inter 16x16, Inter 16x8, Inter 8x16, Inter P8x8, Intra 16x16, Intra 4x4, and SKIP), while the proposed method calculates RDcosts for only 5 modes (Inter 16x16, Inter 16x8, Inter 8x16, Inter P8x8, and SKIP) by reusing the Intra RDcosts calculated in frame picture coding. Thus, the computational complexity of MD using the proposed scheme for each field picture is approximately $O(0.5M \cdot C_{MD} \cdot 5/7)$, where $O(C_{MD})$ denotes the complexity required to decide an optimal mode for an MB. That of the conventional MD algorithm is $O(0.5M \cdot C_{MD})$.

The computational complexities of the paths for ME and MD in the H.264 encoder are summarized in Table 1. The entire complexity to complete ME and MD for a frame picture is

Conventional H.264: $4.5\,O(M \cdot C_{ME}) + 3\,O(M \cdot C_{MD})$   (3)

Proposed algorithm: $2.5\,O(M \cdot C_{ME}) + 2.7\,O(M \cdot C_{MD}) + 2\,O(M \cdot C_{R})$   (4)

where note that $O(C_{ME}) \gg O(C_{R})$.

Table 1. Complexity comparison between the conventional and the proposed schemes

IV. SIMULATION RESULTS

Computer simulations using video sequences were performed to evaluate the proposed algorithm. The test sequences are football and bicycle, which are interlaced video sequences. The images are encoded by the JM10.0 codec [7]. In this test, the GOP structure is IPPPP. The H.264 encoding parameters are listed in Table 2.

Table 2. Conditions of coding parameters

The PSNRs (Peak Signal to Noise Ratios) of the images encoded at various bit rates are shown in Fig. 9. The PSNRs of the encoded images are evaluated with respect to the original images. The test images are encoded by full search, a conventional scheme [4], and the proposed scheme. As shown in Fig. 9, the PSNRs of the images encoded by the proposed scheme are almost equal to those obtained by the full search scheme. This implies that using the proposed scheme does not degrade the image quality compared with the conventional scheme.

Fig. 9. Comparison of rate-distortion curves between the conventional H.264 encoder, Y. Qu's scheme [4], and the proposed scheme for (a) football, (b) bicycle sequence

To compare the computational complexity of the proposed method with those of the conventional schemes, the CPU times consumed by the ME module and by the total encoding process for various sequences are shown in Fig. 10 and Fig. 11, respectively, in units of msec/frame. As shown in these figures, the proposed scheme requires much less computing time than the conventional schemes. From the results in Fig. 10 and Fig. 11, we can see that the proposed method performs well for interlaced video sequences.

Fig. 10. Comparison of consumed CPU times between the conventional H.264 encoder, Y. Qu's scheme [4], and the proposed scheme for football and bicycle sequences

Fig. 11. Comparison of consumed CPU times between the conventional H.264 encoder, Y. Qu's scheme [4], and the proposed scheme for football and bicycle sequences

In Table 3, the reduction ratio of the computational complexity of each algorithm is evaluated by

$\dfrac{(\text{CPU time consumed by H.264 [7]}) - (\text{CPU time consumed by a scheme})}{\text{CPU time consumed by H.264 [7]}} \times 100$   (5)

Table 3. Performance comparisons between the conventional and the proposed schemes for football sequence.

From this table, we can see that the computational complexity of the encoder using the proposed scheme is much lower than that of the conventional H.264 encoder, while the PSNRs of the encoded images and the generated bit rates are maintained. The proposed scheme can skip some paths in the ME and MD process, which is why it reduces the computational complexity.

V. CONCLUSION

We have proposed an efficient scheme to estimate the motion vectors and to decide the coding mode in an H.264 encoder. Since the proposed ME and MD utilize the correlation between field MB coding and field picture coding for interlaced fields, the proposed scheme can reduce the computational complexity while the bit rate and the image quality remain unchanged. Computer simulations show that the proposed algorithm significantly reduces the computational complexity compared with conventional schemes.

REFERENCES

[1] T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560-576, July 2003.
[2] L. Wang, K. Panusopone, R. Gandhi, Y. Yu, and A. Luthra, "Adaptive frame/field coding for JVT video coding," JVT-B071, Geneva, Jan. 2002.
[3] L. Wang, R. Gandhi, K. Panusopone, Y. Yu, and A. Luthra, "MB-level adaptive frame/field coding for JVT," JVT-B106, Geneva, Jan. 2002.
[4] Y. Qu, G. Li, and Y. He, "A fast MBAFF mode prediction strategy for H.264/AVC," IEEE ICOSP 2004, vol. 2, pp. 1195-1198, Aug. 2004.
[5] G. Li and Y. He, "An adaptive macroblock-group coding algorithm for progressive and interlaced video," IEEE ISCAS 2004, vol. 3, pp. iii-969-972, May 2004.
[6] P. Yin, A. M. Tourapis, and J. Boyce, "Fast decision on picture adaptive frame/field coding for H.264," Ed.: Andrew G. Tescher, Proc. SPIE, vol. 5960 (2005), pp. 2092-2099.
[7] JVT Reference Software version JM10.0, http://iphome.hhi.de/suehring/tml/download/old_jm/jm10.zip