Adaptive Down-Sampling Video Coding

Ren-Jie Wang (a), Ming-Chen Chien (a, b), and Pao-Chi Chang* (a)
(a) Dept. of Communication Engineering, National Central Univ., Jhongli, Taiwan; (b) Dept. of Electrical Engineering, Chin Min Institute of Technology, Miaoli, Taiwan

ABSTRACT

Down-sampling coding, which sub-samples the image and encodes the smaller-sized images, is one of the solutions for raising image quality when the available rate is insufficient. In this work, we propose an Adaptive Down-Sampling (ADS) coding scheme for H.264/AVC. The overall system distortion can be analyzed as the sum of the down-sampling distortion and the coding distortion. The down-sampling distortion is mainly the loss of high-frequency components, which depends strongly on the spatial difference. The coding distortion can be derived from classical rate-distortion theory. For a given rate and video sequence, the optimum down-sampling resolution-ratio can be derived by minimizing the system distortion based on the models of the two distortions. This optimal resolution-ratio is used in both the down-sampling and up-sampling processes of the ADS coding scheme. As a result, the rate-distortion performance of ADS coding is consistently higher than that of fixed-ratio coding or H.264/AVC, by 2 to 4 dB at low to medium rates.

Keywords: Down-sampling, Video Coding, H.264, High-Definition Video

1. INTRODUCTION

Multimedia applications on various devices and heterogeneous networks have become popular, and the demand for High-Definition (HD) video is increasing. HD video usually requires much higher bitrates than traditional video. However, the rates provided by networks, especially wireless networks, are not always adequate for high-quality video. Under such conditions, down-sampling coding, which down-samples the image and encodes the smaller-sized images, is one of the solutions for raising image quality. In general, it yields better performance than full-size video coding at low to medium rates.

The literature on down-sampling coding focuses on either the down-sampling or the up-sampling procedure.
In the up-sampling procedure, super-resolution techniques can be applied at the decoder of down-sampling coding to raise the rate-distortion performance [1]. However, the computational complexity is very high. Fast super-resolution algorithms also exist [2], but their degraded R-D performance significantly reduces the advantage of super-resolution. To improve the coding performance, video frames can be classified into different types based on their visual characteristics and then processed with conditional super-resolution [3]. However, the coding efficiency improvement obtained by adjusting the up-sampling procedure is limited, and the improvement varies with the video content.

We therefore focus on the down-sampling procedure for better and more stable performance. It has been observed that the optimum down-sampling ratio depends on the bitrate [4]. Namely, the superior performance over traditional coding exists only in a certain range of bitrates. A variable down-sampling ratio may extend this range. For still images, an adaptive down-sampling ratio has been applied to JPEG image compression [5].

Sampling-rate selection schemes for video coding have been proposed in recent years. An adaptive down-sampling mode decision in the encoder was proposed in [6]. The modes, which include different directions and sizes, are determined by the features of the block contents. This method also provides better R-D performance than regular video coding. However, blocking effects appear because different blocks have different distortion characteristics. Moreover, modifying the coding loop breaks syntax conformance with the video coding standard. Another related discussion of sampling-rate selection concerns resolution transcoding with an up-sampling procedure [7]. The method estimates the bitrates for different resolution versions and then selects the largest resolution that satisfies the bitrate constraint. However, this strategy may not reach the minimum distortion, and only five resolution versions can be used.
*pcchang@vaplab.ce.ncu.edu.tw; phone 886 3 47151; fax 886 3 49187; vaplab.ce.ncu.edu.tw
Multimedia on Mobile Devices 2010, edited by Reiner Creutzburg, David Akopian, Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 754, 7540P, 2010 SPIE-IS&T, CCC code: 077-786X/10/$18, doi: 10.1117/1.84057

In this work, we propose an Adaptive Down-Sampling (ADS) coding scheme for H.264/AVC video coding. Furthermore, the optimum down-sampling resolution-ratio is derived based on models of the down-sampling distortion and the coding distortion. The architecture and distortion analysis are discussed in Section 2, which also presents the derivation of the optimum resolution-ratio. Section 3 provides simulation results. Finally, Section 4 provides the conclusions.

2. ADAPTIVE DOWN-SAMPLING CODING

2.1 Architecture of adaptive down-sampling coding

The architecture of the proposed ADS coding is shown in Figure 1. The original frame $F_{\mathrm{high}}$ is first down-sampled to $DF_{\mathrm{low}}$, which is then encoded by a conventional video coder at bitrate $R$. At the decoder side, the video sequence is decoded as $RDF_{\mathrm{low}}$ and then up-sampled to the original resolution as $URDF_{\mathrm{high}}$. The resolution-ratio $a$ is defined as the ratio of the down-sampled frame area to the original frame area. For instance, when down-sampling from CIF to QCIF, $a$ is set to 0.25.

Figure 2 shows the performance comparison between the down-sampling scheme (circles) and the original H.264 (triangles). There is a cross-over rate: the down-sampling scheme outperforms the original H.264 below this rate but is worse above it. We also observe that a small $a$ results in a high PSNR gain but a low cross-over rate. Therefore, it is extremely important to find a suitable $a$ for a given rate.

Figure 1. Down-sampling coding scheme with adaptive resolution-ratio. [Block diagram: original frame in a GOP, F_high -> down-sampling with adaptive resolution-ratio -> DF_low -> encoding -> transmission with bitrate constraint R -> decoding -> RDF_low -> up-sampling with adaptive resolution-ratio -> reconstructed frame of F_high, URDF_high.]

Figure 2. Performance comparison between the down-sampling scheme and the original H.264, sequence Riverbed.
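The relation between the resolution-ratio and the frame dimensions can be illustrated with a short Python sketch. It assumes that the ratio is an area ratio, so that each dimension scales by its square root, and that the down-sampled dimensions are rounded to a multiple of the 16-pixel macroblock size. The rounding rule and the name `downsampled_size` are illustrative assumptions, not details given in the paper.

```python
import math


def downsampled_size(width, height, a, multiple=16):
    """Return (width, height) of the down-sampled frame for area ratio a.

    a is the down-sampled area divided by the original area, 0 < a <= 1.
    Rounding to a multiple of 16 is an assumption made for this sketch.
    """
    if not 0.0 < a <= 1.0:
        raise ValueError("a must lie in (0, 1]")
    scale = math.sqrt(a)  # an area ratio a scales each dimension by sqrt(a)
    w = max(multiple, int(round(width * scale / multiple)) * multiple)
    h = max(multiple, int(round(height * scale / multiple)) * multiple)
    return w, h
```

For example, `downsampled_size(352, 288, 0.25)` maps a CIF frame to QCIF dimensions.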

2.2 Optimization of resolution-ratio

Two types of distortion are involved in the down-sampling coding scheme. Both are related to the resolution-ratio under a bitrate constraint. The first distortion is created by down-sampling and is denoted $D_{\mathrm{down}}$. It decreases as the resolution-ratio increases and depends strongly on the characteristics of the video sequence. The second distortion is generated by the quantization in the encoder and is denoted $D_{\mathrm{coding}}$. It increases with the resolution-ratio and also depends on the bitrate and the characteristics of the video sequence. Figure 3 shows the coding distortion at various rates as well as the down-sampling distortion versus the resolution-ratio. It clearly shows that the two distortions have opposite trends as the resolution-ratio increases. It is therefore of interest to find the optimum resolution-ratio that minimizes the overall distortion.

Figure 3. Coding distortion at various rates and down-sampling distortion versus the resolution-ratio (Riverbed).

The coding distortion is mainly quantization noise, whereas the down-sampling distortion is mainly the loss of high-frequency components. Since the two have different causes, we propose an additive noise model:

$$D(R, a, \mathrm{seq}) = D_{\mathrm{coding}}(R, a, \mathrm{seq}) + D_{\mathrm{down}}(a, \mathrm{seq}) \qquad (1)$$

where $D$ represents the average distortion per pixel for a given video sequence seq. Figure 4 compares the experimentally measured total distortion with the additive distortion model. The two are very close at medium to high rates. They differ at low rates, but the trend, i.e., the location of the minimum mean square error (MSE), remains similar.

Figure 4. Comparison between the total distortion and the additive distortion model (Riverbed).

Because the down-sampling distortion $D_{\mathrm{down}}$ is mainly the loss of high-frequency components, which depends strongly on the spatial difference, a reasonable model for the down-sampling distortion is

$$D_{\mathrm{down}} = \sigma_s^2 \left( \frac{1}{a} - 1 \right) \qquad (2)$$

where $\sigma_s^2$ is the variance in the pixel domain, i.e., the spatial difference, of a frame in the video sequence. Figure 5 shows the MSE versus the resolution-ratio $a$ for various video sequences. Although the MSE varies significantly between sequences, the distortion model, shown as the dashed line, is quite accurate.

Figure 5. Down-sampling distortion modeling.

The coding distortion can be derived from classical Rate-Distortion (R-D) theory [8]. An R-D model for a macroblock, which includes the transform, quantization, inverse quantization, and inverse transform, has been proposed in [9] as

$$D_{\mathrm{ref}} = \sigma^2 \, 2^{-\gamma R} \qquad (3)$$

where $\sigma^2$ is the temporal difference of frames in the video sequence and $\gamma R$ represents the number of bits used to represent a pixel on average. In this work, the proposed resolution-rate-distortion model, which takes the resolution-ratio into consideration, is based on (3):

$$D_{\mathrm{coding}} = \sigma^2 \, 2^{-\gamma R / a} \qquad (4)$$

The coding distortion increases with $a$. For small $a$, the number of effective pixels per frame after down-sampling decreases, so the average number of bits allocated to each pixel increases. The coding distortion therefore decreases because of the relatively high number of bits per pixel. Figure 6 shows the MSE versus the resolution-ratio together with the distortion model at high rates and low rates, respectively. In either case, the mismatch between the actual MSE and the model is small.
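The two distortion models can be written as a pair of small Python functions. This is a minimal sketch that assumes the coding model is a base-2 exponential in the bits per pixel of the down-sampled frame. The names `d_down`, `d_coding`, `var_s`, `var_t`, and `gamma` are chosen here for illustration and do not come from the paper.

```python
def d_down(a, var_s):
    """Down-sampling distortion: var_s * (1/a - 1), zero when a == 1."""
    if not 0.0 < a <= 1.0:
        raise ValueError("a must lie in (0, 1]")
    return var_s * (1.0 / a - 1.0)


def d_coding(rate, a, var_t, gamma):
    """Coding distortion: var_t * 2**(-gamma * rate / a).

    rate is in bits per pixel of the original frame, so rate / a is the
    bits per pixel available after down-sampling (assumed interpretation).
    """
    if not 0.0 < a <= 1.0:
        raise ValueError("a must lie in (0, 1]")
    return var_t * 2.0 ** (-gamma * rate / a)
```

Evaluating both functions over a range of `a` reproduces the opposite trends described above: `d_down` falls and `d_coding` rises as `a` approaches 1.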

Figure 6. Coding distortion model at low and high bitrates (Riverbed).

The overall distortion is obtained by combining (2) and (4):

$$D = \sigma^2 \, 2^{-\gamma R / a} + \sigma_s^2 \left( \frac{1}{a} - 1 \right) \qquad (5)$$

For a given rate and video sequence, the optimum resolution-ratio is found by setting the first derivative to zero:

$$\frac{dD}{da} = \sigma^2 \, 2^{-\gamma R / a} \, \ln 2 \cdot \frac{\gamma R}{a^2} - \sigma_s^2 \, \frac{1}{a^2} = 0 \qquad (6)$$

Solving (6) for $a$, i.e., solving $2^{-\gamma R / a} = \sigma_s^2 / (\sigma^2 \, \gamma R \ln 2)$, gives the optimum resolution-ratio

$$a_{\mathrm{opt}} = \frac{-\gamma R}{\log_2 \left( \dfrac{\sigma_s^2}{\sigma^2} \cdot \dfrac{1}{\ln 2} \cdot \dfrac{1}{\gamma R} \right)} \qquad (7)$$

Although (7) looks complicated, the relationship between the optimum resolution-ratio and the bitrate is nearly linear, as shown in Figure 7. This optimal resolution-ratio $a_{\mathrm{opt}}$ is used in both the down-sampling and up-sampling processes of the ADS coding scheme.

Figure 7. Optimum resolution-ratio at various rates.

3. SIMULATIONS

We implemented the proposed algorithm on the JM 12.4 H.264/AVC codec and assessed its performance on sample HD video sequences. The first 5 frames of the test sequences Riverbed, Station, and Rush-hour, at a frame rate of 5 fps, are used in the experiments, as shown in Figure 8. Each sequence is coded with the structure IPPP and GOP size 5. Both RDO and the fast motion search algorithm UMHexagonS are enabled, as listed in Table 1. Bi-cubic interpolation is chosen for both the down-sampling and up-sampling filters.

The PSNR comparison of the down-sampling schemes and the original H.264 is presented in Figure 9. It shows that down-sampling video coding with a fixed resolution-ratio of 0.5 outperforms regular H.264 video coding only at bitrates below the cross-over rate of 8 Mbps, and performs poorly at high bitrates. On the other hand, the proposed algorithm, which uses the optimum resolution-ratio, performs better than both the original H.264 and the fixed-ratio down-sampling scheme at all tested bitrates. Moreover, the proposed algorithm achieves up to 2 dB gain in PSNR over regular H.264 coding at low to medium rates.

Frames of Riverbed coded by the two methods are presented in Figure 10. The upper-right corners are shown for easy comparison of the details. As the figure shows, the proposed scheme improves the visual quality significantly by eliminating the blocking effect, which is critical for HD video.

Table 1. Encoding parameter list in JM 12.4

Parameter                     Value
Profile                       Baseline
Resolution                    HD (1920x1056)
Test sequence                 Riverbed, Station, Rush-hour
FramesToBeEncoded             first 5 frames
FrameRate                     5 frames per second
RDO                           On (high-complexity)
Fast motion estimation        UMHexagonS
Number of reference frames    1 frame
Search range for HD           ±18
Block mode                    all modes
Subpixel motion estimation    On
GOP                           5 (IPPP~~)

Figure 8. Three types of test sequences: Riverbed, Station, and Rush-hour.
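Since the distortion models are expressed as MSE while the results are reported as PSNR, a short conversion helper makes the link between the two explicit. This is a minimal sketch assuming 8-bit video with a peak value of 255; the function name `psnr` is illustrative and does not refer to any JM routine.

```python
import math


def psnr(mse, peak=255.0):
    """Convert a mean square error to PSNR in dB (8-bit peak assumed)."""
    if mse < 0.0:
        raise ValueError("mse must be non-negative")
    if mse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

Under this definition, halving the MSE raises the PSNR by about 3 dB, which gives a feel for the size of the reported gains.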

Figure 9. Rate-distortion performance of the coding scheme with adaptive resolution-ratio.

Figure 10. Subjective comparison of the upper-right corner of Riverbed at 1.7 Mbps. [Panels: ADS coding; original H.264.]

4. CONCLUSIONS

We have proposed a down-sampling coding scheme with an adaptive resolution-ratio that performs efficiently for high-definition video. For a given rate and video sequence, the optimal resolution-ratio is determined by minimizing the system distortion. This optimal resolution-ratio is used in both the down-sampling and up-sampling processes of the ADS coding scheme. Compared with a fixed resolution-ratio, the proposed scheme has better R-D performance at low as well as high bitrates. Compared with the original H.264 (without the down-sampling process), the proposed scheme achieves a PSNR gain of 2 dB to 4 dB at medium and low bitrates. The system is suitable for heterogeneous networks and variable-bitrate environments.

REFERENCES

[1] R. Molina, A. K. Katsaggelos, L. D. Alvarez, and J. Mateos, "Towards a new video compression scheme using super-resolution," Proc. SPIE-IS&T Electronic Imaging, Visual Communications and Image Processing 6077, 0601-0613 (2006).
[2] G. M. Callicó, R. P. Llopis, S. López, J. F. López, A. Núñez, R. Sethuraman, and R. Sarmiento, "Low-cost super-resolution algorithms implementation over a HW/SW video compression platform," EURASIP Journal on Applied Signal Processing archive, Papers 2006, 37 (2006).

[3] D. Barreto, L. D. Alvarez, R. Molina, A. K. Katsaggelos, and G. M. Callicó, "Region-based super-resolution for compression," Springer Science+Business Media, Multidim. Syst. Sign. Process., Papers 18, 59-81 (2007).
[4] C. A. Segall, M. Elad, P. Milanfar, R. Webb, and C. Fogg, "Improved High-Definition Video by Encoding at an Intermediate Resolution," Proc. of the SPIE Conference on Visual Communications and Image Processing 5308, 1007-1018 (2004).
[5] A. Bruckstein, M. Elad, and R. Kimmel, "Down Scaling for Better Transform Compression," IEEE Trans. on Image Processing, Papers 12(9), 1132-1144 (2003).
[6] V. A. Nguyen, Y. P. Tan, and W. S. Lin, "Adaptive Downsampling/Upsampling for Better Video Compression at Low Bit Rate," in Proc. IEEE International Symposium on Circuits and Systems (ISCAS), 1624-1627 (2008).
[7] H. Y. Shu and L. P. Chau, "The Realization of Arbitrary Downsizing Video Transcoding," IEEE Transactions on Circuits and Systems for Video Technology, Papers 16(4), 540-546 (2006).
[8] T. Berger, [Rate Distortion Theory], Englewood Cliffs, NJ: Prentice-Hall, (1984).
[9] Z. H. He, Y. F. Liang, L. L. Chen, I. Ahmad, and D. P. Wu, "Power-Rate-Distortion analysis for wireless video communication under energy constraints," IEEE Trans. Circuits Syst. Video Technol., Papers 15(5), 645-658 (2005).