IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 10, NO. 1, JANUARY 2008


A Highly Efficient VLSI Architecture for H.264/AVC CAVLC Decoder

Heng-Yao Lin, Student Member, IEEE, Ying-Hong Lu, Bin-Da Liu, Fellow, IEEE, and Jar-Ferr Yang, Fellow, IEEE

Abstract: In this paper, an efficient algorithm is proposed to improve the decoding efficiency of the context-based adaptive variable length coding (CAVLC) procedure. Due to the data dependency among symbols in the decoding flow, the CAVLC decoder requires large computation time, which dominates the overall decoder system performance. To expedite its decoding speed, the critical path in the CAVLC decoder is first analyzed and then reduced by forwarding the adaptive detection for succeeding symbols. With a shortened critical path, the CAVLC architecture is further divided into two segments, which can be easily implemented by a pipeline structure. Consequently, the overall performance is effectively improved. In the hardware implementation, a low-power combined LUT and a single output buffer have been adopted to reduce the area as well as the power consumption without affecting the decoding performance. Experimental results show that the proposed architecture outperforms other recent designs: it reduces power consumption by approximately 40% and achieves three times the decoding speed of the original decoding procedure suggested in the H.264 standard. The maximum frequency can be larger than 210 MHz, which can easily support the real-time requirement for resolutions higher than the HD1080 format.

Index Terms: Context-based adaptive variable length coding (CAVLC), H.264/AVC, variable length coding.

I. INTRODUCTION

In recent years, with the rapid growth of multimedia and communication techniques, multimedia systems have become indispensable. However, rich multimedia services result in problems of limited data storage and transmission bandwidth.
There are several important video coding standards that have been developed to effectively compress multimedia information while maintaining a high quality. The Moving Picture Experts Group (MPEG) was established to build compression techniques, such as MPEG-1 [1], MPEG-2 [2], and MPEG-4 [3], for enabling many important multimedia services. At the same time, the International Telecommunication Union (ITU) also established a series of video compression standards, such as H.261 [4], H.263 [5], H.263+, and H In 2003, the Joint Video Team (JVT), consisting of experts from MPEG and ITU, approved a new video standard, H.264/AVC, with many advanced features to achieve effective video compression [6].

Manuscript received January 29, 2007; revised July 5, This work was supported in part by the National Science Council of Taiwan, R.O.C., under Grants NSC E and E The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Anna Hac. The authors are with the Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan, R.O.C. (lhy92@spic.ee.ncku.edu.tw; y_h_lu@novatek.com.tw; bdliu@mail.ncku.edu.tw; jfyang@ee.ncku.edu.tw). Digital Object Identifier /TMM

However, the computational complexity of the H.264/AVC encoders and decoders is dramatically increased. Variable length coding (VLC) is a well-known lossless entropy coding technique that is widely adopted in image/video compression standards. However, since the codeword length is variable, the boundary of the next codeword cannot be determined until the previous codeword is decoded, which limits decoding throughput. In addition, entropy coding is normally based on pre-defined tables of variable-length codes, so adaptation to the actual input symbol statistics is difficult. Thus, Jeon et al.
proposed a joint use of adaptive codebook selection and dynamic codeword re-association to achieve better coding performance than pre-defined Huffman tables [7]. Lakhani also suggested that run-length coding should encode the run-length of subsequent zeros instead of preceding zeros of nonzero AC coefficients [8]. To achieve the maximum compression ratio under reasonable hardware cost, the context-based adaptive variable length coding (CAVLC) method, which includes all of the above-mentioned features, is adopted in the latest H.264/AVC standard. The CAVLC algorithm uses the trend among AC coefficients in each block to predict the next codeword. The prediction mechanism can significantly improve decoding performance. Compared to the previous entropy coding method, the CAVLC introduces the context model concept to model symbol probability more accurately so that the compression ratio can be further increased. However, the cost of the high compression ratio achieved by the CAVLC coder is high computational complexity. Thus, developing an efficient CAVLC decoder implementation is of practical importance.

Several recent design studies on efficient VLC decoding hardware [9]–[13] have emerged. These architectures can be mainly classified into two groups: tree-based and parallel decoding approaches. In traditional VLC decoding algorithms, one level of the coding tree is searched per operation, so the throughput rate is limited by the search depth. Although the tree-based method is simple, it is not suitable for real-time processing applications. Therefore, parallel VLC decoding approaches are mostly adopted in VLSI hardware designs. How to implement the CAVLC decoding hardware has been of great interest in recent years [14]–[25]. To realize the H.264 decoder, a decoding hardware design to parse the CAVLC codeword was proposed in [14], [15]. In [16], the VLSI implementation of the CAVLC decoder for H.264/AVC was presented.
Moreover, several low-cost, high-performance VLSI architectures for realizing the CAVLC decoder were proposed in [17]–[19]. Nevertheless, the proposed methods mainly focused on improving look-up table processing speed. Moon proposed an efficient algorithm based on arithmetic operations instead of

memory access [20], which was further matured by Kim [21]. However, these schemes are not well suited to hardware implementation, since a straightforward implementation of the arithmetic algorithms requires a large number of operative resources. The designs in [22]–[24] reduce the number of clock cycles through skipping methods and multiple-symbol decoding to speed up processing.

In this paper, after analyzing the CAVLC coding algorithm, we propose an efficient decoding architecture to deal with the critical path of CAVLC decoders. The decoder critical path delay can be divided into two pipelined stages to improve the processing speed. Finally, a low-cost and low-power CAVLC decoder integrated with recently developed techniques is synthesized while maintaining the designed fast decoding performance.

The rest of this paper is organized as follows. Section II addresses the basic concept of the decoding procedure of the CAVLC framework. In Section III, the algorithm-level optimization that fits the hardware design and speeds up decoding is presented. The designed CAVLC decoder is described in detail in Section IV. The simulation results and synthesized circuits, which demonstrate the performance improvement of the proposed CAVLC decoder, are presented in Section V. In Section VI, the conclusion is drawn.

Fig. 1. Reverse scan magnitudes and their transmitted bitstream.

TABLE I. CAVLC DECODING PROCEDURE FOR THE EXAMPLE DEPICTED IN FIG. 1

II. CONTEXT-BASED ADAPTIVE VLC PRINCIPLES

In H.264/AVC, the context-based adaptive variable length coding (CAVLC) technique is designed to encode the quantized residual coefficients of 4×4 (and 2×2) blocks. It makes effective use of several characteristics of block-based video compression. After prediction, transformation, and quantization, the quantized DCT coefficients are mostly zeros.
The CAVLC uses run-level coding to compactly represent strings of zeros. Since the tendencies of Run and Level are only weakly correlated, the CAVLC encodes Run and Level information separately, with better adaptation, to achieve better coding efficiency. In transformed blocks, the nonzero high-frequency coefficients after the zig-zag scan are often sequences of ±1. The CAVLC signals the number of these high-frequency ±1 coefficients by using trailing 1s (T1s) to achieve a compact representation. The remaining nonzero coefficients can be effectively encoded by using different VLC tables according to the decoding status. The CAVLC encoder adaptively selects the best-matched VLC table to improve the coding performance. The choice of look-up table depends on the number of encoded coefficients and the magnitude of the nonzero coefficients. The CAVLC uses this mechanism, called context-based adaptive (CA), to achieve better coding performance than the traditional fixed VLC table. In a 4×4 block, the levels (magnitudes) of nonzero coefficients tend to be higher near the DC coefficient. Hence, the CAVLC encodes them in reverse order, from higher to lower frequencies. By using another CA concept, the CAVLC adaptively selects the VLC look-up table for the next level parameter depending on recently coded level magnitudes.

In the CAVLC encoder, the quantized coefficients are zig-zag scanned and then encoded by five syntax elements. These syntax elements are defined as follows.

a) coeff_token: Both the number of all nonzero coefficients (total_coeff) and the number of trailing ones are encoded using this syntax element.
b) Sign of T1s: This syntax element encodes the sign bit of each trailing one in reverse zig-zag scan order.
c) Level: The value of each nonzero coefficient, except for the trailing ones, is encoded using this syntax element.
d) total_zeros: This syntax element encodes the total number of zero coefficients preceding the last nonzero coefficient in zig-zag scanned order.
e) run_before: This syntax element encodes the number of successive zero coefficients preceding each nonzero coefficient in reverse zig-zag scanned order.

A. CAVLC Decoding Procedure

Fig. 1 shows an example of the reverse scan values of a 4×4 block and its associated transmitted CAVLC bitstream from the encoder point of view. The CAVLC decoder decodes the CAVLC bitstream into syntax elements. Table I lists the parsed syntax elements and decoded information bits of the CAVLC bitstream shown in Fig. 1. The corresponding 4×4 block, given by the reverse zig-zag scan of magnitudes shown in Fig. 1, is finally reconstructed. The details of the CAVLC process can be found in the H.264 standard [6].

According to the H.264/AVC standard, Sign of T1s and Levels are decoded using simple arithmetic operations since these two syntax elements are encoded by regular VLC codes. However, the other syntax elements, such as coeff_token, total_zeros, and run_before, are encoded by content-dependent VLC codes. Decoding the content-dependent symbols is mostly realized by multiple look-up tables.
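The five syntax elements defined above can be illustrated with a short sketch. The function name `cavlc_syntax_elements`, the dictionary layout, and the sample block are hypothetical and are not taken from Fig. 1 or Table I; the sketch only derives the symbol values from a zig-zag ordered coefficient list and does not perform any VLC table look-up.

```python
def cavlc_syntax_elements(scan):
    """Derive the values of the five CAVLC syntax elements (illustrative only).

    `scan` holds the quantized coefficients of one block in zig-zag order.
    """
    nonzero_idx = [i for i, c in enumerate(scan) if c != 0]
    total_coeff = len(nonzero_idx)
    rev = nonzero_idx[::-1]  # reverse scan: highest frequency first

    # Trailing ones: up to three consecutive +/-1 values at the high-frequency end.
    trailing_ones = 0
    for i in rev[:3]:
        if abs(scan[i]) != 1:
            break
        trailing_ones += 1

    t1_signs = ["+" if scan[i] > 0 else "-" for i in rev[:trailing_ones]]
    levels = [scan[i] for i in rev[trailing_ones:]]

    # Zeros located before the last nonzero coefficient in scan order.
    total_zeros = nonzero_idx[-1] + 1 - total_coeff if nonzero_idx else 0

    # run_before is signalled in reverse order while zeros remain;
    # the lowest-frequency coefficient never needs one.
    run_before = []
    zeros_left = total_zeros
    for hi, lo in zip(rev, rev[1:]):
        if zeros_left == 0:
            break
        run = hi - lo - 1
        run_before.append(run)
        zeros_left -= run

    return {
        "total_coeff": total_coeff,
        "trailing_ones": trailing_ones,
        "t1_signs": t1_signs,
        "levels": levels,
        "total_zeros": total_zeros,
        "run_before": run_before,
    }


# Hypothetical block in zig-zag order (not the block of Fig. 1).
example = cavlc_syntax_elements([0, 3, 0, 1, -1, -1, 0, 1] + [0] * 8)
```

For this hypothetical block the sketch reports five nonzero coefficients, three trailing ones, and three zeros before the last nonzero coefficient, which mirrors the separation of Run and Level information described above.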

B. Real-Time Requirement

Generally speaking, the number of clock cycles required for each block depends on the block coefficients. However, in CAVLC, the average number of symbols is about twice that of the previous VLC method. To guarantee that our architecture is suitable for most resolutions and satisfies the real-time requirement for most applications, the worst case, which requires the most processing time, should be considered when analyzing the real-time requirement. The CAVLC decoder operating frequencies that satisfy the real-time requirements for various video formats (resolutions) are listed in Table II.

TABLE II. REAL-TIME REQUIREMENT FOR DIFFERENT VIDEO RESOLUTIONS

III. HARDWARE-ORIENTATED ALGORITHM OPTIMIZATION

In most applications, such as broadcasting and video conferencing, real-time processing is the most important issue. In the VLC decoding procedure, the next symbol cannot be processed until the current symbol is decoded, due to the data dependency in variable length codes. The system performance is mainly limited by the critical path. In the CAVLC decoder, the critical path occurs in decoding the level information, since this syntax element is decoded based on arithmetic operations. Moreover, two parameters, the current code length and suffixlength, must be obtained to decode the subsequent level coefficients. Unfortunately, the suffixlength value, which is processed at the end of the level decoding procedure, cannot be decided until the level coefficient is resolved. Therefore, the original decoding procedure cannot be divided into a pipeline structure. In order to improve overall performance, a modified suffixlength detector (MSD) algorithm is presented to advance the suffixlength computation ahead of the determination of the level coefficient. In addition to the shortened critical path, the level decoding process can be realized with a pipeline structure.
The MSD algorithm is described in detail in the following subsections.

A. Original Level Decoding Process

The level coefficient includes the sign and the magnitude of each remaining nonzero coefficient in the 4×4 block. The code for each level is composed of a prefix part (level_prefix) and a suffix part (level_suffix) as

$$\text{Level code} = \underbrace{0 \cdots 0\,1}_{\text{prefix part}}\ \underbrace{x \cdots x}_{\text{suffix part}} \tag{1}$$

where the number of leading zeros in the prefix part is represented as level_prefix, and level_suffix contains the value of the suffix part. Unlike Exp-Golomb entropy coding, in which the leading zeros determine the number of trailing information bits after the first 1 bit, there is no relationship between level_prefix and level_suffix in the Level bitstream. The level_prefix also carries level information. In the decoding procedure, these variables are used to compute the level value.

Fig. 2. Level decoding process in the H.264 standard.

Fig. 2 shows the flow diagram of the level decoding procedure, which can be divided into two stages [6]. The decoding process in the first stage starts from level_prefix decoding, which is physically a first-1 detector that counts the number of zeros preceding the first 1 bit. After the level_prefix value has been decided, the size of the suffix part, represented as levelsuffixsize, is examined. If the variable levelsuffixsize is equal to 0, the syntax element level_suffix is inferred to be equal to 0. Otherwise, the level_suffix based on the corresponding suffixlength is obtained. The variable suffixlength, ranging from 0 to 6, is used to decide the corresponding magnitude range of level_suffix. The probability model entity used in the level code can be

modified depending on the previous level information, such that each model can be identified by a unique suffixlength value, i.e., the context-based adaptive feature in CAVLC. A large suffixlength value is suitable for higher magnitude levels, whereas a small suffixlength value is appropriate for levels with lower magnitudes. In general, the levelsuffixsize value is equal to the variable suffixlength, with the exception of the Escape_code, and determines the codeword representative range in level_suffix. Afterward, the intermediate levelcode value is obtained from the level_prefix and level_suffix values and transferred to the second stage.

In the second stage, levelcode is refined and applied to calculate the level value in the level processing unit (LPU). There are two special conditions that cause levelcode to be incremented, as follows. The first special condition occurs when level_prefix is equal to 15 and suffixlength is equal to zero. The Escape_code adopted for a large CAVLC level value occurs in this condition, and the variable levelcode should be increased by 15. The second special condition appears if the current symbol is the first Level symbol and the number of trailing ones in this block is less than 3. In this case the absolute value of the first Level symbol cannot be 1, since otherwise the symbol would have been merged into the TrailingOnes index. Hence, the levelcode, which has been modified to improve compression efficiency in this condition, should be increased by 2 in the decoding procedure. After checking the two special conditions, the levelcode value is completely decoded and transferred into the LPU, where the level value is derived from levelcode. The equation of the LPU is as follows:

$$\text{level} = \begin{cases} (\text{levelcode} + 2) \gg 1, & \text{levelcode even} \\ (-\text{levelcode} - 1) \gg 1, & \text{levelcode odd} \end{cases} \tag{2}$$

From this equation, it is obvious that if levelcode is even, the level value is positive, and if levelcode is odd, the level value is negative.
Finally, the absolute value of the level is checked against the boundary defined in Table III. If the magnitude exceeds the boundary, the suffixlength value must be updated to decode the next Level symbol. For example, if the current suffixlength is equal to 2 and the input bitstream is 000101, the prefix part is (000) and the suffix part is (01). Therefore, level_prefix and level_suffix are equal to 3 and 1, respectively. As shown in Fig. 2, following the calculation procedure in the first part, the output pattern levelcode is equal to 13 and is then submitted to the second part. By substituting the levelcode into (2) in the second part, the final level value is equal to -7. After checking the level value against Table III, the suffixlength for the next Level symbol should be incremented by 1.

TABLE III. THRESHOLD VALUE FOR DETERMINING NEXT SUFFIXLENGTH

B. Critical Path Improvement in the Decoding Level

In the level code decoding procedure, two parameters must be derived and delivered to decode the next piece of information. One is the current codeword length, and the other is the suffixlength value. Generally speaking, the VLC decoding bottleneck occurs at the codeword boundary: the starting point of the next codeword remains unknown until the current codeword length is decoded. In the level decoder, the Level symbol length is defined in the following equation:

$$\text{codeword length} = \text{level\_prefix} + 1 + \text{levelsuffixsize} \tag{3}$$

In the original level decoding procedure shown in Fig. 2, level_prefix and levelsuffixsize are available in the first stage; thus, the codeword length can be obtained in the first stage. Although the length is derived in the first stage of the decoding procedure, the decoding process cannot proceed to the next symbol immediately, since the suffixlength, which identifies the magnitude range of the next symbol, is not updated until the end of the second stage. The suffixlength updating step depends on the complete level calculation after the LPU in the second stage.
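The original two-stage flow can be sketched in a few lines to reproduce the worked example. This is an illustrative sketch, not the authors' hardware: the function name `decode_level` and its arguments are hypothetical, the bitstream is modelled as a string of '0'/'1' characters, and the suffix sizes for level_prefix values of 14 and 15 are assumed to follow the 2003 edition of the standard.

```python
def decode_level(bits, suffix_length, first_level_with_t1_lt3=False):
    """Decode one Level symbol; returns (level, codeword_length, next_suffix_length)."""
    # Stage 1: first-1 detector, suffix size, and intermediate levelcode.
    level_prefix = bits.index("1")
    if level_prefix == 14 and suffix_length == 0:
        level_suffix_size = 4       # assumed from the 2003 standard text
    elif level_prefix == 15:
        level_suffix_size = 12      # Escape_code, assumed from the 2003 standard text
    else:
        level_suffix_size = suffix_length
    start = level_prefix + 1
    length = start + level_suffix_size          # codeword length, as in (3)
    level_suffix = int(bits[start:length], 2) if level_suffix_size else 0
    level_code = (min(15, level_prefix) << suffix_length) + level_suffix

    # Stage 2: special conditions, then the LPU mapping of (2).
    if level_prefix == 15 and suffix_length == 0:
        level_code += 15
    if first_level_with_t1_lt3:
        level_code += 2
    if level_code % 2 == 0:
        level = (level_code + 2) >> 1
    else:
        level = (-level_code - 1) >> 1

    # suffixlength update: only possible once the level value is known.
    next_sl = suffix_length if suffix_length else 1
    if abs(level) > (3 << (next_sl - 1)) and next_sl < 6:
        next_sl += 1
    return level, length, next_sl


# Worked example from the text: suffixlength = 2, bitstream 000101.
level, length, next_sl = decode_level("000101", 2)
```

In this sketch the suffixlength update is the last statement and reads `level`, which is exactly the dependency that keeps the next symbol from starting before the second stage finishes.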
Furthermore, the LPU calculates the level value from the levelcode, which is derived from level_prefix and level_suffix. The data dependency in the original decoding process leads to an unavoidably long critical path. In order to improve decoding performance, a new Level decoding algorithm is proposed to break the data dependency and reduce the critical path. The threshold value in the suffixlength detector is defined in Table III, which can be described by the following threshold function:

$$|\text{level}[i]| > 3 \cdot 2^{\,\text{suffixlength} - 1} \tag{4}$$

where level[i] represents the level value. If condition (4) is satisfied, suffixlength is increased by 1. Since the level value depends on levelcode as shown in (2), we can rewrite (4) as

$$\begin{cases} (\text{levelcode} + 2) \gg 1 > 3 \cdot 2^{\,\text{suffixlength} - 1}, & \text{level}[i] > 0 \\ (\text{levelcode} + 1) \gg 1 > 3 \cdot 2^{\,\text{suffixlength} - 1}, & \text{level}[i] < 0 \end{cases} \tag{5}$$

Since levelcode changes with different conditions, the proposed algorithm is explained in three cases: Normal mode, Escape_code mode, and TrailingOnes mode.

1) Normal Mode With SuffixLength > 0: First, we consider the variable levelsuffixsize larger than 0, which also means that the suffixlength is not equal to 0. Simultaneously, we also ignore

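the special conditions in the second stage that will increment the levelcode value. Thus, levelcode can be replaced with the combination of level_prefix and level_suffix as in the following equation:

$$\text{levelcode} = (\text{level\_prefix} \ll \text{suffixlength}) + \text{level\_suffix} \tag{6}$$

By substituting (6) into (5), the upper equation in (5) is rewritten as

$$\frac{\text{level\_prefix} \cdot 2^{\,\text{suffixlength}} + \text{level\_suffix} + 2}{2} > 3 \cdot 2^{\,\text{suffixlength} - 1} \tag{7}$$

when level[i] is positive. After arranging terms, we can represent (7) as

$$\text{level\_prefix} > 3 - \frac{\text{level\_suffix} + 2}{2^{\,\text{suffixlength}}} \tag{8}$$

Since level[i] is positive, levelcode must be even. In (6), if levelcode is even, the level_suffix value must be even and equal to 0, 2, ..., 2^suffixlength - 2, depending on the levelsuffixsize, i.e., suffixlength. Thus, the last term of (8) ranges between 0 and 1 as

$$0 < \frac{\text{level\_suffix} + 2}{2^{\,\text{suffixlength}}} \le 1 \tag{9}$$

The level_prefix value must be an integer. Therefore, for any level_suffix value, (8) is equivalent to the following condition:

$$\text{level\_prefix} > 2 \tag{10}$$

Similarly, when level[i] is negative, the lower equation in (5) can be rewritten as

$$\frac{\text{level\_prefix} \cdot 2^{\,\text{suffixlength}} + \text{level\_suffix} + 1}{2} > 3 \cdot 2^{\,\text{suffixlength} - 1} \tag{11}$$

$$\text{level\_prefix} > 3 - \frac{\text{level\_suffix} + 1}{2^{\,\text{suffixlength}}} \tag{12}$$

Since level[i] is negative, levelcode must be odd, and the level_suffix value can be 1, 3, ..., 2^suffixlength - 1. Then, we can represent (12) as

$$\text{level\_prefix} > 2 \tag{13}$$

It is obvious that no matter whether the Level symbol is positive or negative, if the level_prefix value is larger than 2, the syntax element suffixlength must be incremented by 1 to decode the next Level symbol.

2) Normal Mode With SuffixLength = 0: If suffixlength is equal to 0, there is no level_suffix value. The levelcode can be described as

$$\text{levelcode} = \text{level\_prefix} \tag{14}$$

In the H.264 standard, when suffixlength is equal to 0, suffixlength is updated to 1 to decode the next level of information. Moreover, if the absolute level value is larger than the threshold value for suffixlength equal to 1, defined in Table III, which is equal to 3, suffixlength is set equal to 2 directly. Under this condition, we can rewrite (5) as

$$\frac{\text{level\_prefix} + 2}{2} > 3 \;\Longleftrightarrow\; \text{level\_prefix} > 4 \tag{15}$$

for an even levelcode and

$$\frac{\text{level\_prefix} + 1}{2} > 3 \;\Longleftrightarrow\; \text{level\_prefix} > 5 \tag{16}$$

for an odd levelcode. Because an even level_prefix larger than 4 is at least 6, we can summarize (15) and (16) as follows:

$$\text{level\_prefix} > 5 \tag{17}$$

If level_prefix is larger than 5, suffixlength is incremented by 2; otherwise, suffixlength is incremented by 1.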
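The two Normal mode results can be checked by brute force. The sketch below is not part of the paper: the helper names are hypothetical, the reference update rule is taken from the description in Section III-A, and the enumeration is restricted to level_prefix values that stay inside Normal mode (at most 13 when suffixlength is 0 and at most 14 otherwise, which is an assumption about where the escape-related codes begin).

```python
def level_from_code(level_code):
    """LPU mapping: even levelcode gives a positive level, odd a negative one."""
    if level_code % 2 == 0:
        return (level_code + 2) >> 1
    return (-level_code - 1) >> 1


def next_sl_reference(level, suffix_length):
    """suffixlength update that needs the fully decoded level value."""
    sl = suffix_length if suffix_length else 1
    if abs(level) > (3 << (sl - 1)) and sl < 6:
        sl += 1
    return sl


def next_sl_from_prefix(level_prefix, suffix_length):
    """Prefix-only update for Normal mode, following the derived thresholds."""
    if suffix_length == 0:
        return 2 if level_prefix > 5 else 1
    if suffix_length < 6 and level_prefix > 2:
        return suffix_length + 1
    return suffix_length


def check_normal_mode():
    """Compare both update rules over every Normal mode codeword."""
    for sl in range(7):
        max_prefix = 13 if sl == 0 else 14  # assumed Normal mode range
        for prefix in range(max_prefix + 1):
            for suffix in range(1 << sl):
                level = level_from_code((prefix << sl) + suffix)
                if next_sl_reference(level, sl) != next_sl_from_prefix(prefix, sl):
                    return False
    return True


all_match = check_normal_mode()
```

The prefix-only rule never reads level_suffix or the level value, which is what allows the update to be evaluated as soon as the first-1 detector has produced level_prefix.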
3) Escape Code Mode: Next, the two special cases, which modify the levelcode value and cause suffixlength to be incremented, are considered in the threshold detection conditions. First, if Escape_code is adopted for the level value, the level value is very large and definitely larger than the threshold value. The syntax element suffixlength should be updated for the next Level symbol.

4) TrailingOnes Mode: Another special case occurs when the current symbol is the first Level symbol and the number of trailing ones is less than 3. In this case, levelcode should be increased by 2 in the decoding procedure. In the H.264 standard, the variable suffixlength is initialized as follows: if there are more than 10 residual coefficients and the TrailingOnes value is less than 3, suffixlength is set equal to 1; otherwise, suffixlength is set equal to 0. Thus, the possible suffixlength for the first Level symbol is 0 or 1. Under this condition, if suffixlength is equal to 0, levelcode can be given by

$$\text{levelcode} = \text{level\_prefix} + 2 \tag{18}$$

which can be substituted into (5) to solve for the threshold function as

$$\text{level\_prefix} > 3 \tag{19}$$

On the other hand, if suffixlength is equal to 1, levelcode can be described as

$$\text{levelcode} = (\text{level\_prefix} \ll 1) + \text{level\_suffix} + 2 \tag{20}$$

Substituting (20) into (5), we get the following result:

$$\text{level\_prefix} > 1 \tag{21}$$

5) Summary: The results of all conditions mentioned above are summarized in Fig. 3. The modified suffixlength detector (MSD) input signal is level_prefix instead of the level value. The suffixlength value for decoding the next Level symbol can be obtained from the current symbol information and a recent

level_prefix value. For this reason, the suffixlength detector can be used after the level_prefix unit in the first stage. The modified decoding procedure, using the MSD, is shown in Fig. 4.

Fig. 3. Proposed MSD decoding procedure.

Fig. 4. Modified Level decoding process.

With the proposed decoding process, the MSD can be integrated into the first stage and the level decoding path can be reduced. Moreover, since the two parameters utilized for decoding the next codeword can be derived in the first stage, the decoding process can decode the next symbol serially after completing the first stage. The level decoder critical path delay can be further improved with a pipeline structure. Deciding the Level value can be performed in the second stage in the following clock cycle. In the meantime, the first stage of the pipeline structure can directly decode the next Level symbol. Following the example described in Section III-A, the current suffixlength is equal to 2 and the level_prefix is equal to 3. After checking the detecting function in Fig. 3, the suffixlength

should be incremented by 1 for decoding the next Level symbol, which can be detected in the first stage of the level decoding process. The result is the same as that of the original level decoding process.

IV. DESIGN OF CAVLC DECODING ARCHITECTURE

Based on the proposed algorithm and the CAVLC decoding flow, a highly effective VLSI architecture for the CAVLC decoder (CAVLD) was designed. Fig. 5 shows the proposed CAVLC decoder architecture, which is mainly composed of a Combined_LUTs decoder, a sign decoder, a Level VLC decoder, a Barrel shifter, a Control unit, and an Output buffer. The Combined_LUTs decoder is the most important VLC component, which supports the coeff_token, total_zeros, and run_before decoding processes. In addition to the Level VLC decoder, the remaining functional blocks should be considered when designing an effective CAVLC decoder. In this research, several techniques related to hardware implementation are introduced to reduce the hardware cost and power consumption, and to increase the data throughput rate. A more detailed description of the realization of these function units is provided in the following subsections.

Fig. 5. Overview of the proposed CAVLD architecture.

TABLE IV. MERGED CATEGORIES IN THE PROPOSED LPCLUT

A. Low-Power Combined Look-Up Table (LPCLUT)

The VLC tables use the most area and consume the most power of the chip if the syntax elements coeff_token, total_zeros, and run_before in the CAVLC decoder are each decoded with their own look-up tables (LUTs). In [25], a low-power CAVLC decoder architecture with prefix-predecoding and table partitioning methods was presented to reduce the power consumption of LUTs without affecting the overall performance. However, adding a latch before each sub-table to prevent the unselected tables from consuming unnecessary dynamic power requires extra area overhead.
To overcome the drawbacks of conventional methods, the correlation of unstructured VLC tables for different syntax elements is analyzed. The detailed procedures for decoding these three symbols are discussed in the following. The first VLC symbol to be decoded in the CAVLC decoder is coeff_token. As defined in the coding standard [6], there are five sub-tables for decoding this symbol. The choice of table depends on the number of nonzero coefficients in the neighboring blocks. The sub-table used to decode total_zeros is indexed by the total_coeff value, except when total_coeff equals the maximum block size. The total_zeros table is partitioned into 15 sub-tables for luminance and three sub-tables for chrominance. During the run_before decoding flow, the zerosleft symbol is first checked to ascertain whether the number of zero coefficients should be decoded or not. If zerosleft is larger than zero, the run_before symbol is decoded by searching the corresponding table. The decoding procedure is terminated when zerosleft is equal to zero or the decoded run_before symbol indicates the last coefficient. The run_before table is divided into seven sub-tables, and only one of them is selected for the decoding procedure based on the zerosleft value. In the CAVLC decoding procedure, each symbol is decoded by a specific decoding unit, which traditionally requires three individual LUTs in a hardware implementation. Because of the CAVLC decoding flow, the coeff_token, total_zeros, and run_before symbols never occur at the same time, so we suggest a combined LUT that can decode all three symbols in the proposed CAVLD architecture. Using VLC features such as bit arrangement, value, and code length, the sub-tables within different syntax elements are grouped into multiple categories. Each category can be combined with another category from a different, disjoint symbol table.
For example, the first total_zeros VLC table is merged with the first coeff_token table due to the variance of codeword distribution. The combined table has fewer latches than two separate tables, which results in a smaller hardware cost. The merged categories from the coeff_token, total_zeros, and run_before symbols in the combined LUT are listed in Table IV. The number in parentheses indicates the bits required for the latch. There are fifteen VLC tables and one fixed length coding (FLC) table in the architecture of one combined LUT, as shown in Fig. 6. Control

unit outputs enable signals to activate the latches, and only one of these latches is enabled during the decoding process.

Fig. 6. Proposed architecture of the LPCLUT.

B. Throughput Improvement in Decoding Sign of T1s

In the H.264 standard, the number of trailing ones can be any value from 0 to 3. For each trailing one, the sign is coded with a single bit. Therefore, there are at most three bits to represent the trailing_one_sign_flag, in which the codeword 0 expresses positive one and the codeword 1 expresses negative one. Generally speaking, due to data dependencies, one cycle is needed to decode one symbol in a simple VLC decoder. Since the trailing_one_sign_flag is not a VLC symbol, there are no data dependencies between trailing_one_sign_flag elements. The throughput (symbols/cycle) of the CAVLC decoder can be improved by decoding multiple sign symbols per cycle. The proposed architecture for decoding the sign of trailing ones is shown in Fig. 7. In the trailing_one_sign_flag decoding process, the controller simultaneously reads three bits from a barrel shifter. The syntax element Num_Trail, which is decoded in coeff_token, is used to indicate the number of trailing ones that should be decoded in the block. Decoded symbols are then stored in the level register for reconstructing the 4×4 block in the next step. Since the occurrence probability of each number of trailing ones varies with the quantization parameter (QP), the performance improvement versus QP is shown in Table V. The normal VLC decoder for trailing_one_sign_flag spends about two cycles; however, the proposed method can decode all trailing_one_sign_flag elements in one clock cycle, achieving an approximately 88% performance improvement.

C. Output Buffer

After the CAVLC decoding, the quantized transform residues are obtained in the output buffer and transmitted to inverse quantization for further processing.
Conventionally, all level/run symbols of a 4×4 block are first decoded and stored in buffers. Then, the level/run symbols are converted into one 16-element buffer and transferred to a 4×4 block of quantized transform residues according to the inverse scan. Normally, some clock cycles are required to reconstruct and reorder the residual coefficients. The next CAVLC decoding procedure cannot be initiated until the whole CAVLC decoding process has been completed. Thus, an extra output buffer is usually introduced to increase the decoding performance. In [17], an output buffer using the double-stack architecture with a block pipelining scheme was proposed to speed up the data transition between the CAVLC decoder and the inverse quantization (IQ). However, the additional output buffer takes more hardware area and power to store the level/run symbols. In this paper, only a single output buffer is adopted in the proposed CAVLC decoder, which keeps the same decoding speed as the double-stack architecture.

Fig. 7. Proposed architecture for trailing_one_sign_flag.

In the CAVLC decoding procedure, the 16 coefficients of the output buffer are first reset to zero, and the nonzero coefficients are decoded and written into the level buffer during the Sign of T1s and Level cycles. In the total_zeros and run_before cycles, the position index of the first nonzero coefficient in reverse order is total_coeff + total_zeros. Then, the other nonzero coefficients are consecutively moved to their corresponding positions in the buffer whenever run_before symbols are decoded. Since the quantized transform residues are stored in the buffer immediately after the CAVLC decoding procedure, the run_before buffer is no longer needed. Moreover, during the reordering step, only one coefficient is moved to its corresponding position at a time, and it does not overwrite the other nonzero coefficients. At the same time, a zero value is written into the original position.
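The in-place reordering described above can be sketched as follows. This is an illustrative software model under stated assumptions, not the authors' circuit: the function name `reorder_in_place` is hypothetical, the buffer is 0-indexed (so the first move targets index total_coeff + total_zeros - 1), and the decoded levels are assumed to be written so that the lowest-frequency coefficient sits at index 0 before any move takes place.

```python
def reorder_in_place(levels, total_zeros, run_before):
    """Place decoded levels at their zig-zag positions using one 16-entry buffer.

    `levels` are in decoding order (reverse scan, highest frequency first);
    `run_before` holds only the values that were actually decoded.
    """
    total_coeff = len(levels)
    buf = [0] * 16                       # reset: all 16 entries are zero
    for i, value in enumerate(levels):   # Sign of T1s / Level cycles
        buf[total_coeff - 1 - i] = value

    zeros_left = total_zeros
    runs = iter(run_before)
    for i in range(total_coeff):
        if zeros_left == 0:
            break                        # remaining coefficients are already in place
        src = total_coeff - 1 - i
        dst = src + zeros_left           # always above every coefficient not yet moved
        buf[dst] = buf[src]
        buf[src] = 0                     # a zero is written into the original position
        if i < total_coeff - 1:          # the last coefficient has no run_before
            zeros_left -= next(runs)
    return buf


# Hypothetical block: levels 1, -1, 3 in decoding order, two zeros in total.
result = reorder_in_place([1, -1, 3], total_zeros=2, run_before=[1, 1])
```

Because each destination index lies below the previously moved coefficient and at or above every coefficient still waiting, a move never overwrites a nonzero value, which is why no separate run_before buffer or second coefficient buffer is needed in this model.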
If there are no more zeros left (i.e., $\sum \text{run\_before} = \text{total\_zeros}$), it is unnecessary to decode any further run_before symbols, and the remaining coefficients in the level buffer are already located in their corresponding positions without any movement. Therefore, only one level buffer is required. Thus, the cycle count, hardware area, and power consumption can all be reduced at the same time with one output buffer.

TABLE V. PERFORMANCE IMPROVEMENT VERSUS THE NUMBER OF TRAILING ONES VARIED WITH QP

TABLE VI. CRITICAL PATH IMPROVEMENT

Fig. 8. Proposed architecture with a single output buffer.

TABLE VII. HARDWARE COST PROFILE FOR DIFFERENT FUNCTIONAL BLOCK (GATE COUNT)

Fig. 9. Example of data movement in a single output buffer.

The output buffer architecture is shown in Fig. 8. Since level coefficient values can be represented with 13 bits, a 16-entry, 13-bit-wide memory is adopted to store the level coefficients and to buffer the reconstructed quantized coefficients. As an example, the data movement in the output buffer is illustrated in Fig. 9. Following the decoding procedure shown in Table I, six nonzero coefficients are successively stored in the level buffer during the Sign of T1s and Level cycles. After the total_zeros cycle, the last coefficient, 1, is directly moved to the location index total_coeff + total_zeros, and its original position in the level buffer is filled with 0. Moreover, zero_left is initialized to the value of total_zeros. When the first run_before value is decoded, since zero_left is equal to 2, the possible run_before value lies between 0 and 2. Hence, the second nonzero coefficient in the reverse scan, -1, is moved into the corresponding position without overwriting the other nonzero coefficients. Then, after the second run_before value is decoded, the sum of the run_before values is equal to total_zeros. The remaining coefficients are located in the correct positions and can be transmitted directly to the next functional block.

V. SIMULATION AND VERIFICATION

The proposed hardware architecture was synthesized with a 0.18-μm CMOS standard cell-based library for performance evaluation. The experimental results for the critical path of the optimized algorithm are shown in Table VI. When the modified suffixlength detector (MSD) is shifted to the first stage, the proposed architecture shortens the critical path delay by 45% compared with the original level decoder. Moreover, by forwarding the MSD, the proposed architecture can be easily implemented using a pipeline structure. The critical path delay is further reduced to one-third of that of the original level decoder,

which allows the maximum working frequency to be about 213 MHz. In contrast with the worst case described above, the processing capability of the proposed architecture can easily guarantee the real-time requirement for resolutions higher than the 1080 HD ( ) video format.

Fig. 10. Power consumption MHz.

TABLE VIII. SIMULATION RESULTS AND COMPARISONS OF H.264 CAVLC DECODERS

The hardware cost profile of the individual functional blocks of the CAVLC decoders, in terms of gate count, is shown in Table VII. There are four kinds of CAVLC decoders. The original CAVLC decoder is implemented as in the H.264 standard. Low Power LUTs [25] introduces table partitioning and prefix predecoding to reduce power consumption. Since extra latches are adopted to prevent the unselected tables from consuming unnecessary dynamic power, the areas for coeff_token, total_zeros, and run_before are larger than the original ones. Because the prefix predecoding method is used to reduce the table size, the coeff_token area does not increase as much as those of total_zeros and run_before. Low Power Combined LUT (LPCLUT) combines the tables within coeff_token, total_zeros, and run_before. Since the number of latches for low power LUTs can be further reduced by the combining procedure, the LPCLUT requires less hardware than three separate tables. The analysis shows that the proposed combined LUT method reduces area by 11% on average. Moreover, the proposed architecture with LPCLUT and a single output buffer is labeled Improved LPCLUT, in which only half the register size is required. Although the self-controller in the output buffer is more complex than the original one, the hardware saving in the output buffer part reaches 33% compared with the double buffer architecture. Consequently, the overall hardware cost is about 19% less than that of the previous CAVLD architecture.

Fig. 10 shows the experimental results for the average power consumption of the proposed Improved LPCLUT CAVLC decoder and the original criterion. A previous low power design [25], marked as LP LUTs, is also included for evaluation of power consumption. Three sequences with four different quantization parameters were applied to verify the low power architecture. Since the quantized coefficients become smaller as the QP gets larger, the power reduction declines for large QP. The sequence Mobile, which is the most complex sequence, shows a larger power reduction than the other sequences. The results show that the proposed architecture reduces the power consumption by approximately 40% on average.

A comparison of the hardware cost and processing speed of the proposed design with other existing designs is shown in Table VIII. The proposed architecture has the minimum area compared to all other designs except the one suggested in [16]. It is noted that the area of the output buffer unit used to reconstruct the CAVLD elements into a 4×4 residual block in zig-zag scan order is not included in [16]. In [17], extra memory is required for the IDS (Interleave Double Stacks) buffer. In addition, it can only decode one symbol per cycle. The proposed design outperforms the method addressed in [23] in terms of both hardware cost and operation speed while providing comparable throughput. Alle's design [19] achieves slightly higher performance due to superior CMOS technology; however, it requires roughly three times the area.

VI. CONCLUSION

A high-performance CAVLC decoding architecture for the H.264 decoder is proposed in this paper. To improve overall performance, the CAVLC decoding process was precisely analyzed.

An effective algorithm for decoding Level symbols was proposed to improve the suffixlength detector. The critical path was then reduced by forwarding the suffixlength detector to the first stage in the Level decoding procedure. With the improved algorithm, the proposed architecture was implemented by using a pipeline structure, which triples the decoding speed of the conventional decoding procedure suggested in the H.264 standard. Using parallel realization in all syntax element decoders, the proposed CAVLD architecture can decode more than one syntax element per cycle. Moreover, the VLC tables within the different syntax elements were combined into one combined LUT, where the latches were added to avoid unnecessary dynamic power consumption. A single output buffer was used in the hardware implementation. The hardware cost and power consumption were reduced without affecting the decoding performance. Experimental results show that the proposed architecture achieves approximately 19% area reduction and 40% power savings compared to conventional CAVLC decoding methods while maintaining triple decoding speed. The maximum frequency of the proposed architecture is 213 MHz, which is fast enough to decode the 1080 HD ( ) video format.

REFERENCES

[1] Information Technology-Coding of Moving Pictures and Associated Audio for Digital Storage Media at Up to About 1.5 Mbit/s: Video, in ISO/IEC (MPEG-1 Video), ISO/IEC Std
[2] Information Technology-Generic Coding of Moving Pictures and Associated Audio Information: Video, in ITU-T Rec. H.262 (MPEG-2 Video), ISO/IEC Std
[3] Information Technology-Generic Coding of Audio-Visual Objects Part 2: Visual, in ISO/IEC (MPEG-4 Video), ISO/IEC Std
[4] Video codec for audiovisual services at p×64 kbits/s, in ITU-T Recommend. H.261 Version
[5] Video Coding for Low Bitrate Communication, in ITU-T Recommend. H.263, 1995, Version 1, Version 2 Sep
[6] Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC AVC), in Joint Video Team, Mar. 2003, Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-G050.
[7] B. Jeon, J. Park, and J. Jeong, "Huffman coding of DCT coefficients using dynamic codeword assignment and adaptive codebook selection," Signal Process. Image Commun., vol. 12, no. 3, pp , Jun
[8] G. Lakhani, "Optimal Huffman coding of DCT blocks," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 4, pp , Apr
[9] S. M. Lei and M. T. Sun, "An entropy coding system for digital HDTV applications," IEEE Trans. Circuits Syst. Video Technol., vol. 1, no. 1, pp , Mar
[10] D. S. Ma, J. F. Yang, and J. Y. Lee, "Programmable and parallel variable-length decoder for video systems," IEEE Trans. Consum. Electron., vol. 39, no. 3, pp , Jun
[11] B. J. Shieh, Y. S. Lee, and C. Y. Lee, "A new approach of group-based VLC codec system with full table programmability," IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 2, pp , Feb
[12] R. Hashemian, "Design and hardware implementation of a memory efficient Huffman decoding," IEEE Trans. Consum. Electron., vol. 40, no. 3, pp , Aug
[13] J. Nikara, S. Vassiliadis, J. Takala, and P. Liuha, "Multiple-symbol parallel decoding for variable length codes," IEEE Trans. Very Large Scale Integrat. Syst., vol. 12, no. 7, pp , Jul
[14] H. Y. Kang, K. A. Jeong, J. Y. Bae, Y. S. Lee, and S. H. Lee, "MPEG4 AVC/H.264 decoder with scalable bus architecture and dual memory controller," in Proc. IEEE Int. Symp. Circuits Syst., May 2004, pp
[15] S. H. Wang, W. H. Peng, Y. He, G. Y. Lin, C. Y. Lin, S. C. Chang, C. N. Wang, and T. Chiang, "A platform-based MPEG-4 advanced video coding (AVC) decoder with block level pipelining," in Proc. IEEE Int. Conf. Inform. Commun. Security, Huhehaote, China, Dec. 2003, pp
[16] D. Wu, W. Gao, M. Hu, and Z. Ji, "A VLSI architecture design of CAVLC decoder," in Proc. IEEE Int. Conf. ASIC, Beijing, China, Oct. 2003, vol. 2, pp
[17] H. C. Chang, C. C. Lin, and J. I. Guo, "A novel low-cost high-performance VLSI architecture for MPEG-4 AVC/H.264 CAVLC decoding," in Proc. IEEE Int. Symp. Circuits Syst., Kobe, Japan, May 2005, pp
[18] Y. M. Lin and P. Y. Chen, "An efficient implementation of CAVLC for H.264/AVC," in Proc. Int. Conf. Innovative Comput. Inform. Contr., Beijing, China, Aug. 2006, pp
[19] M. Alle, J. Biswas, and S. K. Nandy, "High performance VLSI architecture design for H.264 CAVLC decoder," in Proc. IEEE 17th Int. Conf. Application-Specific Systems, Architectures Processors, Steamboat Springs, CO, Sep. 2006, pp
[20] Y. H. Moon, G. Y. Kim, and J. H. Kim, "An efficient decoding of CAVLC in H.264/AVC video coding standard," IEEE Trans. Consum. Electron., vol. 51, no. 3, pp , Aug
[21] Y.-H. Kim, Y.-J. Yoo, J. Shin, B. Choi, and J. Paik, "Memory-efficient H.264/AVC CAVLC for fast decoding," IEEE Trans. Consum. Electron., vol. 52, no. 3, pp , Aug
[22] S.-Y. Tseng and T.-W. Hsieh, "A pattern-search method for H.264/AVC CAVLC decoding," in 2006 IEEE Int. Conf. Multimedia Expo, Toronto, ON, Canada, Jul. 2006, pp
[23] G.-S. Yu and T.-S. Chang, "A zero-skipping multi-symbol CAVLC decoder for MPEG-4 AVC/H.264," in Proc. Int. Symp. Circuits Syst., Island of Kos, Greece, May 2006, pp
[24] Y.-N. Wen, G.-L. Wu, S.-J. Chen, and Y.-H. Hu, "Multiple-Symbol parallel CAVLC decoder for H.264/AVC," in Proc. APCCAS IEEE Asia Pacific Conf. Circuits Syst., Singapore, Dec. 2006, pp
[25] H. Y. Lin, Y. H. Lu, B. D. Liu, and J. F. Yang, "Low power design of H.264 CAVLC decoder," in Proc. IEEE Int. Symp. Circuits Syst., Island of Kos, Greece, May 2006, pp

Heng-Yao Lin (S'06) was born in Tainan, Taiwan, R.O.C., in He received the B.S. and M.S. degrees in electrical engineering from National Cheng Kung University, Tainan, in 2001 and 2003, respectively. He is currently pursuing the Ph.D. degree in electrical engineering at National Cheng Kung University, Tainan. His major research interests include fast algorithms, low power designs, and VLSI architectures for H.264/AVC and multimedia applications.

Ying-Hung Lu received the B.S. and M.S. degrees in electrical engineering from National Cheng Kung University, Tainan, Taiwan, R.O.C., in 2003 and 2005, respectively. Since 2005, he has been with NOVATEK Microelectronics Corp., Hsinchu, Taiwan. His current research interests include low power designs and VLSI architectures for H.264/AVC and multimedia applications.

Bin-Da (Brian) Liu (S'79-M'82-SM'95-F'06) received the B.S., M.S., and Ph.D. degrees, all in electrical engineering, from the National Cheng Kung University, Tainan, Taiwan, R.O.C., in 1973, 1975, and 1983, respectively. From 1975 to 1977, he was Electrical Officer in the Combined Service Forces. Since 1977, he has been on the faculty of the National Cheng Kung University, where he is currently Distinguished Professor in the Department of Electrical Engineering and Director of the SoC Research Center. During , he was a Visiting Assistant Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. During , he was the Director of Electrical Laboratories, National Cheng Kung University. He was the Associate Chair of the Electrical Engineering Department during and the Chair during He has been a Consultant of the Chip Implementation Center, National Applied Research Laboratories, and the VLSI Educational Program, Ministry of Education, Taiwan, since 1995 and 1997, respectively. He has published more than 240 technical papers. He also contributed chapters in the books Neural Networks and Systolic Array Design (Singapore: World Scientific, 2002, D. Zhang, Ed.), Accuracy Improvements in Linguistic Fuzzy Modeling (Heidelberg, Germany: Springer-Verlag, 2003, J. Casillas, O. Cordón, F. Herrera, and L. Magdalena, Eds.), and VLSI Handbook (Boca Raton, FL: CRC, 2006, W. K. Chen, Ed.). His current research interests include low power circuits, neural network circuits, sensory and biomedical circuits, and VLSI implementation of fuzzy/neural circuits and audio/video signal processors.

Dr. Liu is on the Board of Directors of Taiwan IC Design Society and IEEE Tainan Section, and a member of Phi Tau Phi, Taiwan SOC Consortium, International Union of Radio Science, Chinese Fuzzy Systems Association, Chinese Institute of Electrical Engineering (CIEE), and the Institute of Electronics, Information and Communication Engineers (IEICE). He was the Chair of IEEE Circuits and Systems Society-Taipei Chapter during and the Vice President of Region 10, IEEE Circuits and Systems Society, during He served as a CAS Associate Editor for the IEEE Circuits and Devices Magazine during , and an Associate Editor for the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS PART I: REGULAR PAPERS during He received the Dragon Distinguished Paper Award from the Acer Foundation in 1991, 1997, and 2004, the Best Paper Award from the CIEE in 1995 and 2002, the Golden Silicon Award from the Macronix Foundation in 2001, 2002, 2003, and 2006, the MPC Chip Design Award from the National Chip Implementation Center in 2002, 2003, and 2004, the Low Power Design Contest Award from the ACM/IEEE in 2003, the Shen Wen-Zen Memorial Paper Award from the Taiwan IC Design Society in 2004, the Outstanding Electrical Engineering Professor Award from the CIEE in 2004, the Lam Research Thesis Award from Lam Research Corporation in 2005, the Best Paper Award from the Fourth Regional Inter-University Postgraduate Electrical and Electronics Engineering Conference in 2006, the Best Paper Award from 2006 IEEE Asia-Pacific Conference on Circuits and Systems, and the Research Award from the National Science Council annually since He organized the Taiwan Student VLSI Design Contest from 1998, to Since 1992 he has served as a Member of the Steering Committee of VLSI Design/CAD Symposium and served as the General Chair in He served as a member of the Program Committee for many international conferences, including 1998 and 1999 IEEE Workshop on VLSI Signal Processing Systems, 1998 and 2000 IEEE Asia Pacific Conference on Circuits and Systems, 1999 to 2001 IEEE Asia Pacific Conference on ASICs, 1997 to 2003 International Symposium on VLSI Technology, Systems, and Applications, 2005 to 2007 International Symposium on VLSI Design, Automation and Test, 2006 and 2007 IEEE International Conference on Multimedia and Expo, IEEE TENCON and IEEE International Conference on Systems, Man, and Cybernetics. He was a member of International Advisory Committee of the 2003 IEEE International Conference on Neural Networks & Signal Processing, and the International Steering Committee of the IEEE Asia-Pacific Conference on Circuits and Systems from 2001 to He also served as the Technical Program Chair of the 2003 Workshop on Consumer Electronics, the General Co-Chair of the First and Second International Meeting on Microsensors and Microsystems in 2003 and 2006, the General Chair of the 2004 IEEE Asia-Pacific Conference on Circuits and Systems, and the Meeting Chair of the IEEE 9th International Workshop of Cellular Neural Networks and Their Applications in He is currently serving as an Associate Editor for the IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, the IEEE TRANSACTIONS ON FUZZY SYSTEMS, and the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS.

Jar-Ferr Yang (S'84-M'88-SM'98-F'07) was born in Keelung, Taiwan, R.O.C., on September 15, He received his B.S. degree from the Chung-Yuan Christian University, Taiwan, in 1977, the M.S. degree from the National Taiwan University, Taiwan, in 1979, and the Ph.D. degree from the University of Minnesota, Minneapolis, in 1988, all in electrical engineering. He was an Instructor in the Chinese Naval Engineering School for his Navy ROTC service in As an Assistant Researcher, he worked in the Data Transmission and Network Design Research Group, Telecommunication Laboratories, during From 1982 to 1984, he was an Adjunct Lecturer in the Chung-Yuan Christian University. From 1984 to 1988, he received the Government Study Abroad Scholarship, which supported his advanced study at the University of Minnesota. In 1988, he joined the National Cheng Kung University as an Associate Professor and was promoted to Full Professor and Distinguished Professor in 1994 and 2004, respectively. He was the Chairman of the Center for Computer and Communication Research, National Cheng Kung University, from 1997 to In 2002, he was a Visiting Scholar at the Department of Electrical Engineering, University of Washington, Seattle. Currently, he is the Chairperson of the Graduate Institute of Computer and Communication Engineering and the Director of the Electrical and Information Technology Center.

Dr. Yang was selected as a speaker in the Distinguished Lecturer Program by the IEEE Circuits and Systems Society in He was the Technical Program Co-chair of the 2004 IEEE Asia Pacific Conference on Circuits and Systems and the 9th 2005 IEEE International Workshop on Cellular Neural Networks and Their Applications. From 2004 to 2006, he was Chair of IEEE Signal Processing Society, Tainan Chapter, and Treasurer of IEEE Tainan Section. In 2006, he joined the Chair Committee of the IEEE Signal Processing Society. During , he was the Secretary (Chair-elected) of IEEE Multimedia Systems and Applications Technical Committee in the IEEE Circuits and Systems Society. Currently, he is an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY and IEEE Circuits and Devices Magazine. He is an Associate Editor of the Journal of Applied Signal Processing and was a Guest Editor of the Special Issue on Advanced Video Technologies and Applications for H.264/AVC and Beyond in this journal. During , he is an Editorial Board Member of IET Signal Processing. He has published over 75 journal and 120 conference papers. His teaching and research areas primarily include video, audio, and speech signal processing and coding, adaptive signal processing, and digital life system design and integration. He is a Fellow of IEEE for his contributions to fast algorithms and efficient realization of video and audio coding.

THE new video coding standard H.264/AVC [1] significantly

THE new video coding standard H.264/AVC [1] significantly 832 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 9, SEPTEMBER 2006 Architecture Design of Context-Based Adaptive Variable-Length Coding for H.264/AVC Tung-Chien Chen, Yu-Wen

More information

An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions

An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions 1128 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 10, OCTOBER 2001 An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions Kwok-Wai Wong, Kin-Man Lam,

More information

An Efficient Reduction of Area in Multistandard Transform Core

An Efficient Reduction of Area in Multistandard Transform Core An Efficient Reduction of Area in Multistandard Transform Core A. Shanmuga Priya 1, Dr. T. K. Shanthi 2 1 PG scholar, Applied Electronics, Department of ECE, 2 Assosiate Professor, Department of ECE Thanthai

More information

WITH the demand of higher video quality, lower bit

WITH the demand of higher video quality, lower bit IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 8, AUGUST 2006 917 A High-Definition H.264/AVC Intra-Frame Codec IP for Digital Video and Still Camera Applications Chun-Wei

More information

Fast thumbnail generation for MPEG video by using a multiple-symbol lookup table

Fast thumbnail generation for MPEG video by using a multiple-symbol lookup table 48 3, 376 March 29 Fast thumbnail generation for MPEG video by using a multiple-symbol lookup table Myounghoon Kim Hoonjae Lee Ja-Cheon Yoon Korea University Department of Electronics and Computer Engineering,

More information

ALONG with the progressive device scaling, semiconductor

ALONG with the progressive device scaling, semiconductor IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 57, NO. 4, APRIL 2010 285 LUT Optimization for Memory-Based Computation Pramod Kumar Meher, Senior Member, IEEE Abstract Recently, we

More information

Module 8 VIDEO CODING STANDARDS. Version 2 ECE IIT, Kharagpur

Module 8 VIDEO CODING STANDARDS. Version 2 ECE IIT, Kharagpur Module 8 VIDEO CODING STANDARDS Lesson 27 H.264 standard Lesson Objectives At the end of this lesson, the students should be able to: 1. State the broad objectives of the H.264 standard. 2. List the improved

More information

Visual Communication at Limited Colour Display Capability

Visual Communication at Limited Colour Display Capability Visual Communication at Limited Colour Display Capability Yan Lu, Wen Gao and Feng Wu Abstract: A novel scheme for visual communication by means of mobile devices with limited colour display capability

More information

FAST SPATIAL AND TEMPORAL CORRELATION-BASED REFERENCE PICTURE SELECTION

FAST SPATIAL AND TEMPORAL CORRELATION-BASED REFERENCE PICTURE SELECTION FAST SPATIAL AND TEMPORAL CORRELATION-BASED REFERENCE PICTURE SELECTION 1 YONGTAE KIM, 2 JAE-GON KIM, and 3 HAECHUL CHOI 1, 3 Hanbat National University, Department of Multimedia Engineering 2 Korea Aerospace

More information

Selective Intra Prediction Mode Decision for H.264/AVC Encoders

Selective Intra Prediction Mode Decision for H.264/AVC Encoders Selective Intra Prediction Mode Decision for H.264/AVC Encoders Jun Sung Park, and Hyo Jung Song Abstract H.264/AVC offers a considerably higher improvement in coding efficiency compared to other compression

More information

Chapter 2 Introduction to

Chapter 2 Introduction to Chapter 2 Introduction to H.264/AVC H.264/AVC [1] is the newest video coding standard of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The main improvements

More information

THE USE OF forward error correction (FEC) in optical networks

THE USE OF forward error correction (FEC) in optical networks IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 8, AUGUST 2005 461 A High-Speed Low-Complexity Reed Solomon Decoder for Optical Communications Hanho Lee, Member, IEEE Abstract

More information

Implementation of Memory Based Multiplication Using Micro wind Software

Implementation of Memory Based Multiplication Using Micro wind Software Implementation of Memory Based Multiplication Using Micro wind Software U.Palani 1, M.Sujith 2,P.Pugazhendiran 3 1 IFET College of Engineering, Department of Information Technology, Villupuram 2,3 IFET

More information

A low-power portable H.264/AVC decoder using elastic pipeline

A low-power portable H.264/AVC decoder using elastic pipeline Chapter 3 A low-power portable H.64/AVC decoder using elastic pipeline Yoshinori Sakata, Kentaro Kawakami, Hiroshi Kawaguchi, Masahiko Graduate School, Kobe University, Kobe, Hyogo, 657-8507 Japan Email:

More information

SCALABLE video coding (SVC) is currently being developed

SCALABLE video coding (SVC) is currently being developed IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 7, JULY 2006 889 Fast Mode Decision Algorithm for Inter-Frame Coding in Fully Scalable Video Coding He Li, Z. G. Li, Senior

More information

COMP 249 Advanced Distributed Systems Multimedia Networking. Video Compression Standards

COMP 249 Advanced Distributed Systems Multimedia Networking. Video Compression Standards COMP 9 Advanced Distributed Systems Multimedia Networking Video Compression Standards Kevin Jeffay Department of Computer Science University of North Carolina at Chapel Hill jeffay@cs.unc.edu September,

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005.

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005. Wang, D., Canagarajah, CN., & Bull, DR. (2005). S frame design for multiple description video coding. In IEEE International Symposium on Circuits and Systems (ISCAS) Kobe, Japan (Vol. 3, pp. 19 - ). Institute

More information

WITH the rapid development of high-fidelity video services

WITH the rapid development of high-fidelity video services 896 IEEE SIGNAL PROCESSING LETTERS, VOL. 22, NO. 7, JULY 2015 An Efficient Frame-Content Based Intra Frame Rate Control for High Efficiency Video Coding Miaohui Wang, Student Member, IEEE, KingNgiNgan,

More information

Comparative Study of JPEG2000 and H.264/AVC FRExt I Frame Coding on High-Definition Video Sequences

Comparative Study of JPEG2000 and H.264/AVC FRExt I Frame Coding on High-Definition Video Sequences Comparative Study of and H.264/AVC FRExt I Frame Coding on High-Definition Video Sequences Pankaj Topiwala 1 FastVDO, LLC, Columbia, MD 210 ABSTRACT This paper reports the rate-distortion performance comparison

More information

Hardware study on the H.264/AVC video stream parser

Hardware study on the H.264/AVC video stream parser Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 5-1-2008 Hardware study on the H.264/AVC video stream parser Michelle M. Brown Follow this and additional works

More information

Video coding standards

Video coding standards Video coding standards Video signals represent sequences of images or frames which can be transmitted with a rate from 5 to 60 frames per second (fps), that provides the illusion of motion in the displayed

More information

176 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 2, FEBRUARY 2003

176 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 2, FEBRUARY 2003 176 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 2, FEBRUARY 2003 Transactions Letters Error-Resilient Image Coding (ERIC) With Smart-IDCT Error Concealment Technique for

More information

LUT Optimization for Memory Based Computation using Modified OMS Technique

LUT Optimization for Memory Based Computation using Modified OMS Technique LUT Optimization for Memory Based Computation using Modified OMS Technique Indrajit Shankar Acharya & Ruhan Bevi Dept. of ECE, SRM University, Chennai, India E-mail : indrajitac123@gmail.com, ruhanmady@yahoo.co.in

More information

REDUCING DYNAMIC POWER BY PULSED LATCH AND MULTIPLE PULSE GENERATOR IN CLOCKTREE

REDUCING DYNAMIC POWER BY PULSED LATCH AND MULTIPLE PULSE GENERATOR IN CLOCKTREE Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 5, May 2014, pg.210

More information

Design of Memory Based Implementation Using LUT Multiplier

Design of Memory Based Implementation Using LUT Multiplier Design of Memory Based Implementation Using LUT Multiplier Charan Kumar.k 1, S. Vikrama Narasimha Reddy 2, Neelima Koppala 3 1,2 M.Tech(VLSI) Student, 3 Assistant Professor, ECE Department, Sree Vidyanikethan

More information

A High-Performance Parallel CAVLC Encoder on a Fine-Grained Many-core System

A High-Performance Parallel CAVLC Encoder on a Fine-Grained Many-core System A High-Performance Parallel CAVLC Encoder on a Fine-Grained Many-core System Zhibin Xiao and Bevan M. Baas VLSI Computation Lab, ECE Department University of California, Davis Outline Introduction to H.264

More information

Express Letters. A Novel Four-Step Search Algorithm for Fast Block Motion Estimation
