Efficient Implementation of Neural Network Deinterlacing


Guiwon Seo, Hyunsoo Choi and Chulhee Lee
Dept. Electrical and Electronic Engineering, Yonsei University, 34 Shinchon-dong Seodeamun-gu, Seoul -749, Korea Rep.

ABSTRACT

Interlaced scanning has been widely used in most broadcasting systems. However, it produces undesirable artifacts such as jagged patterns, flickering, and line twitter. Moreover, most recent TV monitors use flat panel display technologies such as LCD or PDP, and these monitors require progressive formats. Consequently, the conversion of interlaced video into progressive video is required in many applications, and a number of deinterlacing methods have been proposed. Recently, deinterlacing methods based on neural networks have been proposed with good results. On the other hand, with high resolution video content such as HDTV, the amount of video data to be processed is very large. As a result, processing time and hardware complexity become important issues. In this paper, we propose an efficient implementation of neural network deinterlacing using a polynomial approximation of the sigmoid function. Experimental results show that these approximations provide equivalent performance with a considerable reduction of complexity. This implementation of neural network deinterlacing can be efficiently incorporated in a HW implementation.

Keywords: neural networks, deinterlacing, HW implementation, polynomial approximation.

1. INTRODUCTION

Most broadcasting systems have employed interlaced scanning, which makes it possible to reduce bandwidth while doubling the frame rate []. However, interlaced scanning causes undesirable artifacts such as jagged patterns, flickering, and line twitter [9]. These artifacts may degrade video quality. Furthermore, most recent TV monitors use flat panel technologies such as LCD or PDP.
Consequently, conversions between interlaced and progressive video sequences are required in many applications. Because of the importance of this conversion, a number of deinterlacing methods have been proposed [-6]. These techniques can be roughly classified into two categories: intra-field methods and inter-field methods. Intra-field deinterlacing methods use only the pixel values of the current field. Although their performance might not be optimal, they have been widely used since their requirements for memory and computing power are manageable, which matters when large amounts of video data must be processed in real time. Inter-field deinterlacing algorithms use information from adjacent fields to fill in the missing lines. Although they provide better performance than intra-field deinterlacing, they are computationally more complex. Moreover, in motion-compensated deinterlacing methods, inaccurate motion estimation may cause artifacts that degrade perceptual video quality even when the overall PSNR is good.

Recently, several neural network deinterlacing methods have been proposed with promising results [3-6]. If a neural network deinterlacing method uses a number of fields as input, it can be classified as inter-field deinterlacing. However, since motion-compensated deinterlacing methods are computationally expensive, neural network deinterlacing emerges as a promising solution.

With the advancement of display and transmission technologies, high resolution video programs such as HDTV have become widely available. With such high resolution video content, the amount of video data to be processed in real time is very large. Consequently, processing time and HW complexity are important issues when deinterlacing is performed at the display monitor. A problem with neural network deinterlacing methods is that they require sigmoid functions, which are expensive to implement.
In this paper, we propose an efficient implementation of neural network deinterlacing methods using a polynomial approximation of the sigmoid function. We tested a number of polynomial functions and analyzed their performance. Experimental results show that these approximations provide equivalent performance with a considerable reduction of complexity. This implementation of neural network deinterlacing can be efficiently incorporated in a HW implementation.

Image Processing: Algorithms and Systems VII, edited by Jaakko T. Astola, Karen O. Egiazarian, Nasser M. Nasrabadi, Syed A. Rizvi, Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 745, 7459 9 SPIE-IS&T CCC code: 77-786X/9/$8 doi:.7/.857

2. NEURAL NETWORKS

Figure 1 shows a diagram of a multilayer neural network. Although there is one hidden layer in Figure 1, there can be multiple hidden layers. At each node, we compute the following summation:

$$net_j = \sum_{i=1}^{d} x_i w_{ji} + w_{j0} = \sum_{i=0}^{d} x_i w_{ji} = \mathbf{w}_j^t \mathbf{x} \tag{1}$$

$$y_j = f(net_j) \tag{2}$$

where $\mathbf{x}$ is the input vector (with $x_0 = 1$, so that $w_{j0}$ acts as the bias) and $y_j$ is the output of the node. As can be seen in equation (2), to compute $y_j$ we have to evaluate the activation function $f(x)$. Different types of activation function can be used, subject to some constraints. First, the activation function must be nonlinear. If the activation function is linear, the whole neural network reduces to a linear function. Second, the activation function should have a saturation property. In other words, it has maximum and minimum output values. Third, the activation function needs to be continuous and smooth: it is desirable that the activation function and its derivative are continuous. The following sigmoid function is widely used as an activation function and satisfies all of these properties:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{3}$$

Fig. 1. Diagram of multilayer neural networks (input layer, hidden layer, output layer).

2.1 Back-propagation algorithm

The back-propagation algorithm is the most widely used method for training neural networks [7]. It is used to find optimal weight vectors for a given data set. In the back-propagation algorithm, we first compute the error between target values and output values as follows:

$$E = \frac{1}{2} \sum_k (t_k - o_k)^2 \tag{4}$$

where $t_k$ is a target value and $o_k$ is an output value. A weight vector is modified so that the error is reduced as follows:

$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} \tag{5}$$

where $\eta$ is a learning rate.

2.2 Sigmoid Approximation

In the previous section, we explained the activation function and the sigmoid function. However, the sigmoid function includes an exponential function. When an algorithm is implemented in hardware, the exponential function is an expensive operation. In this paper, we aim to approximate the sigmoid function with a polynomial function. In particular, we approximate the sigmoid function with a quadratic function. The quadratic function can be expressed as follows:

$$y = a_0 + a_1 x + a_2 x^2 \tag{6}$$

To obtain the optimal coefficients, we look for the polynomial coefficients that maximize the correlation coefficient between the sigmoid function and the quadratic function. We uniformly sampled an interval (-4) and generated, points (x values). Using the generated data points, we constructed two vectors: $x$ and $x^2$. Since constant terms cannot affect the correlation, we can ignore the constant coefficient $a_0$ when maximizing the correlation. Then, we computed the output values of the quadratic function and constructed an output vector $y'$:

$$y' = w_1 x + w_2 x^2 = [w_1 \; w_2]\,[x \; x^2]^T = W^T D \tag{7}$$

We also computed the corresponding output values of the sigmoid function and constructed another vector, $y_{sgm}$. Finally, we maximize the correlation between $y_{sgm}$ and $y'$. Using the optimization method in [8], the optimal weight vector which maximizes the correlation coefficient satisfies:

$$\Sigma_D^{-1} \Sigma_Q W = \rho W \tag{8}$$

where $Q = E(y_{sgm} D)$, $\Sigma_Q = Q Q^T$ and $\Sigma_D = D D^T$. After finding the optimal weight vector, we apply linear fitting to minimize the MSE as follows:

$$y = a_0 + k y' = a_0 + k w_1 x + k w_2 x^2 \tag{9}$$

where $a_1 = k w_1$ and $a_2 = k w_2$. Using linear fitting we can find $a_0$ and $k$. Since the quadratic function cannot provide the saturation property of the sigmoid function, we approximate the sigmoid function with a linear function when $x$ is larger than a threshold (Fig. 2).
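The two-step fit of equations (7) to (9), first maximizing the correlation and then applying a linear fit, can be sketched in Python with NumPy. This is a minimal sketch under assumptions that are not taken from the paper: the sampling interval [0, 4], the 1000 sample points, the mean-centering used to form the second-order statistics, and the function name `fit_quadratic_to_sigmoid` are all illustrative choices.

```python
import numpy as np


def fit_quadratic_to_sigmoid(x_max=4.0, n_points=1000):
    """Fit y = a0 + a1*x + a2*x**2 to the sigmoid on [0, x_max].

    x_max and n_points are illustrative assumptions, not values from the paper.
    """
    x = np.linspace(0.0, x_max, n_points)
    y_sgm = 1.0 / (1.0 + np.exp(-x))

    # D stacks the two regressors x and x**2 (shape: 2 x n_points).
    D = np.stack([x, x ** 2])

    # Centering is an assumption: constants do not change the correlation,
    # so the means are removed before forming the second-order statistics.
    Dc = D - D.mean(axis=1, keepdims=True)
    yc = y_sgm - y_sgm.mean()

    sigma_d = Dc @ Dc.T / n_points      # covariance of the regressors
    q = Dc @ yc / n_points              # cross-covariance with the sigmoid
    sigma_q = np.outer(q, q)

    # W is the eigenvector of inv(sigma_d) @ sigma_q with the largest eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(sigma_d, sigma_q))
    W = eigvecs[:, np.argmax(eigvals.real)].real

    # Linear fitting y_sgm ~ a0 + k * y' fixes the scale k and the offset a0.
    y_prime = W @ D
    k, a0 = np.polyfit(y_prime, y_sgm, 1)
    return a0, k * W[0], k * W[1]


a0, a1, a2 = fit_quadratic_to_sigmoid()
```

With these assumed settings the result coincides with an ordinary least-squares quadratic fit, because the direction of maximum correlation followed by a linear fit is the least-squares solution. The sign of the eigenvector does not matter, since the fitted scale k absorbs it.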
The coefficients of the linear function were selected so that the overall function is continuous and differentiable. The approximation function for $x > 0$ is given by

$$y = \begin{cases} a_0 + a_1 x + a_2 x^2 & \text{for } x < x_{th} \\ b_0 + b_1 x & \text{for } x \ge x_{th} \end{cases} \tag{10}$$
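The continuity and differentiability requirement determines the linear coefficients from the quadratic ones. The following worked step is derived here from that requirement and is not a formula quoted from the paper. Matching the slopes and then the values of the two pieces at $x = x_{th}$ gives

$$b_1 = a_1 + 2 a_2 x_{th}, \qquad b_0 = a_0 + a_1 x_{th} + a_2 x_{th}^2 - b_1 x_{th} = a_0 - a_2 x_{th}^2 .$$

If the threshold is placed at the vertex of the parabola, $x_{th} = -a_1 / (2 a_2)$, then $b_1 = 0$ and the linear piece becomes a constant, which is one way to obtain the saturation behaviour.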

Fig. 2. Threshold of quadratic functions and its substituting linear function.

For $x < 0$, we can take advantage of the anti-symmetry of the sigmoid function, $f(-x) = 1 - f(x)$. The final form of the proposed function is as follows:

$$y = \begin{cases} b_0 + b_1 x & \text{for } x \ge x_{th} \\ a_0 + a_1 x + a_2 x^2 & \text{for } 0 \le x < x_{th} \\ a_0 + a_1 x - a_2 x^2 & \text{for } -x_{th} < x < 0 \\ (1 - b_0) + b_1 x & \text{for } x \le -x_{th} \end{cases} \tag{11}$$

It is noted that $a_1 > 0$, $a_2 < 0$ and $a_0 = 0.5$. The sigmoid and the proposed function are shown in Figure 3. As can be seen, the sigmoid and the approximation function are very similar, and the error between the two functions is very small.

Fig. 3. Sigmoid function, proposed function and error between two functions (curves: Sigmoid, Approximation, Error; horizontal axis: x).
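The piecewise function can be sketched in Python as below. The coefficient values and the threshold are hypothetical illustration values chosen so that the parabola peaks at 1 when x = 4; they are not the coefficients obtained in the paper, and the function name `proposed_activation` is likewise an assumption. The mirroring step relies on a0 = 0.5, as the text notes.

```python
import numpy as np


def proposed_activation(x, a0=0.5, a1=0.25, a2=-1.0 / 32.0, x_th=4.0):
    """Piecewise quadratic/linear stand-in for the sigmoid.

    The default coefficients and threshold are hypothetical illustration
    values, not the coefficients obtained in the paper.
    """
    x = np.asarray(x, dtype=np.float64)

    # Linear segment chosen so that value and slope match at x_th.
    b1 = a1 + 2.0 * a2 * x_th
    b0 = a0 - a2 * x_th ** 2

    ax = np.abs(x)
    # Evaluate the x >= 0 branch on |x| ...
    pos = np.where(ax < x_th, a0 + a1 * ax + a2 * ax ** 2, b0 + b1 * ax)
    # ... and mirror it for x < 0 using the anti-symmetry f(-x) = 1 - f(x).
    return np.where(x >= 0.0, pos, 1.0 - pos)


# Compare against the exact sigmoid on a dense grid.
x = np.linspace(-6.0, 6.0, 1201)
err = proposed_activation(x) - 1.0 / (1.0 + np.exp(-x))
max_abs_err = np.max(np.abs(err))
```

Only multiplications, additions and comparisons remain in the evaluation, which is the property that makes this kind of function attractive for hardware.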

3. NEURAL NETWORK DEINTERLACING

Deinterlacing is the process of filling in the missing lines of interlaced video sequences. Because of its importance, a number of deinterlacing methods have been proposed by many researchers. In particular, there are several deinterlacing methods based on neural networks [3-6]. In neural network deinterlacing methods, the input neurons are obtained from field pixels and the outputs of the neural network are the desired pixel values. There are two categories of neural network deinterlacing methods: intra-field methods and inter-field methods. Intra-field methods use only the pixels of the current field, while inter-field methods may use pixels from several fields, including the current, previous and next fields.

In [3], Plaziac proposed an intra-field deinterlacing algorithm using neural networks. The inputs and outputs of the method are shown in Figure 4. As can be seen in Figure 4, it computes 3 output neurons using 3 input neurons. The method used a 3 layer neural network with 6 hidden neurons.

Fig. 4. Deinterlacing method using single field (legend: input of neural network, output of neural network).

Fig. 5. Deinterlacing method using previous and next field (panels: previous frame, current frame, next frame; legend: input of neural network, output of neural network).
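How such a three-layer network maps field pixels to an estimated pixel value can be sketched with equations (1) to (3). This is only an illustration: the layer sizes, the random untrained weights, the normalization of pixels to [0, 1] and the function names are assumptions, not the configuration of [3] or [5].

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(a, w_hidden, w_out, activation=sigmoid):
    """Forward pass of a three-layer network (input, hidden, output).

    a        : input vector of normalized field pixels, shape (n_in,)
    w_hidden : hidden weights incl. bias column, shape (n_hidden, n_in + 1)
    w_out    : output weights incl. bias column, shape (n_out, n_hidden + 1)
    """
    x = np.concatenate(([1.0], a))          # x_0 = 1 carries the bias weight
    y = activation(w_hidden @ x)            # net_j = w_j^t x, y_j = f(net_j)
    y = np.concatenate(([1.0], y))
    return activation(w_out @ y)


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 12, 16, 1           # hypothetical layer sizes
w_hidden = rng.normal(scale=0.1, size=(n_hidden, n_in + 1))
w_out = rng.normal(scale=0.1, size=(n_out, n_hidden + 1))
pixels = rng.uniform(0.0, 1.0, size=n_in)   # stand-in for field pixels
estimate = forward(pixels, w_hidden, w_out)
```

The `activation` argument is passed in explicitly so that a cheaper approximation can replace the sigmoid without touching the rest of the forward pass.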

In [5], Choi proposed an inter-field deinterlacing method. The inputs and outputs of the method are shown in Figure 5. The inputs are extracted from the previous, current and next fields. From the previous and next fields, 5 reference pixels are taken. From the current field, it takes pixels. The network has three layers with 6 hidden neurons. The input and output neurons can be expressed using vectors as follows:

$$A = [a_1, a_2, \ldots, a_N]^T, \qquad B = [b] \tag{12}$$

where $A$ is the input vector, with $N$ denoting the number of input pixels, and $B$ is the output vector. It is reported that inter-field deinterlacing methods show better performance than intra-field deinterlacing methods [5]. Thus, in this paper, we selected the inter-field deinterlacing method [5] to verify the proposed activation function as a replacement for the sigmoid function.

4. EXPERIMENTAL RESULTS

Experiments were performed to test the proposed function. In the experiments, we implemented the neural network deinterlacing method in [5]. Along with the proposed activation function, we tested the sigmoid function (3) and the hyperbolic tangent function, which are frequently used as activation functions. These functions satisfy the nonlinearity and saturation properties. They have been widely used in many applications and their performance has been satisfactory. We tested the neural network deinterlacing algorithm using nine QCIF video sequences. To evaluate the performance of the activation functions, we first converted the progressive video sequences to interlaced video sequences. After applying the neural network deinterlacing algorithm, we computed PSNRs against the original progressive video sequences. Table 1 shows the performance comparison.

Table 1. Average result PSNRs (dB)

Format  Video               Sigmoid  Hyperbolic Tangent  Proposed
QCIF    Coastguard          33.95    33.94               34.
QCIF    Container           38.7     4.6                 38.98
QCIF    Foreman             34.      34.7                34.4
QCIF    Hall & Monitor      37.7     38.83               37.55
QCIF    Mobile              3.93     33.99               33.34
QCIF    Mother & Daughter   4.67     4.6                 4.94
QCIF    Silent              38.7     36.95               38.53
QCIF    Stefan              5.5      5.3                 5.48
QCIF    Table               3.5      3.5                 3.65
        Average             34.93    35.4                35.

Table 1 is obtained by averaging trials with different initial weights. We used the first frames of four video sequences (Coastguard, Foreman, Mobile and Hall & Monitor). The experimental results show that the proposed function provides performance comparable to the other activation functions.
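The evaluation protocol described above (drop lines from a progressive frame, deinterlace, compare with the original via PSNR) can be sketched as follows. The line-averaging interpolator is only a stand-in for the neural network, and the function names, the 8-bit peak value of 255, the QCIF luma size and the field parity convention are assumptions made for this illustration.

```python
import numpy as np


def to_field(frame, parity):
    """Keep every second line of a progressive frame (parity 0 or 1)."""
    return frame[parity::2]


def line_average_deinterlace(field, parity):
    """Stand-in deinterlacer: fill each missing line with the mean of its
    vertical neighbours (or copy the single neighbour at the border)."""
    h = 2 * field.shape[0]
    out = np.zeros((h,) + field.shape[1:], dtype=np.float64)
    out[parity::2] = field
    for r in range(1 - parity, h, 2):
        above = out[r - 1] if r - 1 >= 0 else out[r + 1]
        below = out[r + 1] if r + 1 < h else out[r - 1]
        out[r] = 0.5 * (above + below)
    return out


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames."""
    diff = np.asarray(ref, dtype=np.float64) - np.asarray(test, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(144, 176)).astype(np.float64)  # QCIF-sized luma
field = to_field(frame, parity=0)
restored = line_average_deinterlace(field, parity=0)
score_db = psnr(frame, restored)
```

Replacing `line_average_deinterlace` with a trained network evaluated under each activation function would give per-frame scores of the kind averaged in the table.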

Table 2 shows the average processing time of the three activation functions (Core Quad CPU.4 GHz). The processing time was measured for the test phase only; the training time was excluded. It can be seen that the proposed function is the fastest of the three activation functions. It takes 73% of the processing time of the sigmoid function and 53% of that of the hyperbolic tangent function. The processing time of the proposed activation function can be further reduced when it is implemented in hardware.

Table 2. Average computation time of deinterlacing test

                        Sigmoid  Hyperbolic Tangent  Proposed
Computation time (sec)  .58      5.83                8.4

Figure 6 shows a frame-by-frame PSNR comparison of the sigmoid and proposed activation functions for the coastguard video sequence. It shows that the proposed activation function provides almost identical performance to the sigmoid function.

Fig. 6. Frame by frame comparison for coastguard sequence ( frames) (vertical axis: PSNR (dB); horizontal axis: frame number; curves: Sigmoid, Proposed).

5. CONCLUSIONS

In this paper, we proposed a new activation function for neural networks. Although existing activation functions ensure good neural network performance, evaluating them is time consuming because the activation function is computed in every iteration. The proposed function is built from quadratic polynomial and linear pieces, and it is piecewise defined, continuous and smooth. We applied this function to neural network deinterlacing. Experimental results show that the proposed function runs in a shorter time with similar quality. Replacing the activation function with the proposed polynomial function appears to be very useful for hardware implementation.

ACKNOWLEDGEMENT

This research was supported by the MKE (Ministry of Knowledge Economy) under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Assessment) (IITA- 8-(C9-8-)), Korea.

REFERENCES

[1] Renxiang Li, Bing Zeng and Ming L. Liou, "Reliable motion detection/compensation for interlaced sequences and its applications to deinterlacing," IEEE Trans. Circuits and Systems for Video Technology, (), 3-9 ().
[2] Taeuk Jeong, Younghie Kim, Kwanghoon Sohn and Chulhee Lee, "Deinterlacing with selective motion compensation," Optical Engineering, 45(7), 77, (6).
[3] Nathalie Plaziac, "Image interpolation using neural networks," IEEE Trans. Image Processing, 8(), 647-65 (999).
[4] Xianglin Wang and Yeong Taeg Kim, "An edge direction based neural network interpolator for video deinterlacing," IEEE Int. Conf. Neural Networks & Signal Processing, 5-8 (3).
[5] Hyunsoo Choi, Eunjae Lee and Chulhee Lee, "Neural Network Deinterlacing Using Multiple Fields," LNCIS, 345, 97-975 (6).
[6] Hyunsoo Choi and Chulhee Lee, "Neural Network Deinterlacing Using Multiple Fields and Field-MESs," IJCNN 7, International Joint Conference on, 869-87 (7).
[7] Richard O. Duda, Peter E. Hart and David G. Stork, [Pattern Classification Second Edition], Wiley Interscience, pp. 8-335 ().
[8] Document 6Q/4: "A new method for objective measure of video quality using wavelet transform," ITU-R/SG 6/WP 6Q, Republic of Korea (Sep. 3, ).
[9] Ohjae Kwon, Kwanghoon Sohn and Chulhee Lee, "Deinterlacing using directional interpolation and motion compensation," IEEE Trans. Consumer Electronics, 49(), 98-3 (3).