IMPROVED SUBSTITUTION FOR ERRONEOUS LTP-PARAMETERS IN A SPEECH DECODER. Jari Makinen, Janne Vainio, Hannu Mikkola, Jani Rotola-Pukkila

Speech and Audio Systems Laboratory, Nokia Research Center, Tampere, Finland, jari.m.makinen@nokia.com

ABSTRACT

In digital mobile communication systems, there is a need for error concealment techniques to reduce the subjective effects of bit errors that have not been eliminated by channel coding. This contribution presents a new approach for concealment of long-term prediction (LTP) parameters. It can be applied to any speech codec in which a long-term prediction system is implemented. The proposed method takes into account the diverse nature of the speech signal and uses the speech parameter history to find an optimal substitution for the LTP parameters. Based on extensive expert listening, it can be concluded that the proposed method improves subjective speech quality compared to traditional methods. This method is used as a part of the error concealment of the 3GPP AMR-WB codec [1].

1. INTRODUCTION

Error concealment techniques are necessary when transmission errors occur and channel coding is not able to correct all of them. Speech parameters determined by modern highly compressed source coding algorithms are vulnerable to channel errors, and hence residual bit errors in the speech parameters can result in annoying artifacts in the reconstructed speech. The long-term prediction (LTP) parameters usually include pitch and gain parameters [8]. When an error is detected in a speech decoder, the erroneous speech parameters, including the LTP parameters, are substituted. Traditional LTP parameter error concealment methods [2, 3, 4] are introduced in this article, and their deficiencies are revealed. The proposed LTP parameter error concealment method attempts to avoid these deficiencies. Basically, two new aspects are included in the proposed method. Firstly, the unreliable information in an erroneous frame is exploited in error concealment. Secondly, the proposed method utilizes inter- and intra-frame correlation for error concealment.
It is well known that the residual redundancy, especially inter-frame correlation, can be exploited at the receiver to enhance the disturbed signal. The proposed error concealment for LTP parameters focuses on the enhancement of speech quality rather than on the minimization of a bit or symbol error rate.

2. IMPROVED LTP-PARAMETER SUBSTITUTION

The idea behind the LTP lag and gain substitution methods is to conceal the LTP parameters with values causing minimal degradation in speech quality. In the proposed method, this is done by analyzing the characteristics of the decoded speech sequence. When a bad frame is detected, the previous sets of speech parameters and the current corrupted parameters are analyzed in order to find an improved LTP parameter concealment.

2.1 Background for proposed LTP parameter substitution

2.1.1 Classification

In error concealment, a bad frame can be classified as either corrupted or lost. A frame is a lost frame if it has not been received at the decoder side. Lost frames are very typical e.g. in packet-switched networks, where speech frames are transmitted in packets through the transmission channel and some packets are never received or arrive too late. A lost frame contains no information about the originally transmitted speech frame. A frame is a corrupted frame if it is received at the decoder side but the error detection mechanism has classified it as an erroneous frame. These frames contain unreliable speech information. Typically, corrupted frames are associated with circuit-switched networks. In many cases the bit error rate (BER) per corrupted frame is very small when the channel is relatively good. According to testing in a GSM speech traffic channel with 3% frame error rate (FER), 6% of the LTP-lag values in erroneous frames are still correct. Additionally, more than 7% of the LTP-lags are correct when FER is .2%.
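The distinction between lost and corrupted frames can be sketched in a few lines of Python. This is only an illustration of the classification described above: the names `FrameType`, `classify_frame`, `received` and `bfi` are assumptions introduced here and do not come from any codec specification.

```python
from enum import Enum


class FrameType(Enum):
    GOOD = "good"            # received, no error detected
    CORRUPTED = "corrupted"  # received, but flagged by error detection
    LOST = "lost"            # never received, or arrived too late


def classify_frame(received: bool, bfi: bool) -> FrameType:
    """Classify a frame as in the text (hypothetical helper).

    received: True if the frame reached the decoder in time.
    bfi:      bad frame indication set by the error detection mechanism.
    """
    if not received:
        # Nothing arrived, so there is no parameter information at all.
        return FrameType.LOST
    if bfi:
        # Parameters are present but unreliable.
        return FrameType.CORRUPTED
    return FrameType.GOOD
```

Keeping the two bad-frame types apart matters because only a corrupted frame carries parameter bits that a concealment method could still try to use.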
2.1.2 Deficiencies in prior LTP parameter substitution methods

Traditionally, LTP parameter substitution during erroneous speech frames is always done with the same algorithm, regardless of the characteristics of the decoded speech sequence [2, 3, 4]. The LTP-lag is substituted by the last good LTP-lag value with slight modification, and the LTP-gain is substituted by a slightly degraded last good LTP-gain value. If an erroneous speech frame is detected during a voiced speech sequence, the concealment procedure works efficiently in this way, because a voiced speech signal is stationary and the last good LTP parameters give a good estimate of the transmitted parameters. On the other hand, an unvoiced speech signal is non-stationary: the LTP parameters vary a lot and the inter-frame correlation is therefore low. In this case, traditional concealment procedures [2, 3, 4] usually generate a periodic signal. When a bad frame is detected, traditional concealment methods use only the last good speech parameters in each frame, which generates a periodic signal sequence in the middle of a non-stationary speech signal. This can produce an annoying artifact that does not belong to the original speech. To avoid this, the concealment procedure for LTP parameters needs to be different for non-stationary signals.

Traditionally, a frame is classified as a bad frame when the bad frame indication (BFI) is set by a cyclic redundancy check (CRC) [5] or other error detection mechanisms in the channel decoding process. Error detection mechanisms are used to detect errors among the most important bits. Traditionally, these bits are not used for decoding when the frame is classified as a bad frame. However, the frame may have only a few erroneous bits; BFI is still set, and the whole frame is discarded even though the rest of the bits are correct. This means that a lot of correct information is thrown away. A long enough CRC detects erroneous frames very well, but it does not give an estimate of the bit error rate (BER) in the frame.

2.2 The proposed methods

The block diagram of the proposed method is shown in Figure 1. The last good speech parameters are used to determine the proper error concealment procedure for the LTP parameters. The proposed method uses different schemes for lost and corrupted frames. When the frame is classified as corrupted, the LTP information of the erroneous frame is used in addition to the last good speech parameters.
However, for the LTP-gain, the information of erroneous frames is not used. The LTP-lag and LTP-gain error concealment methods are based on inter- and intra-frame correlation, especially the correlation of the LTP parameters.

2.2.1 LTP-lag concealment for corrupted frames

LTP-lag concealment for corrupted frames attempts to avoid the deficiencies discussed in Chapter 2.1.2. The basic concealment idea for a possibly corrupted LTP-lag is that, according to adaptive criteria, the LTP-lag information from the erroneous frame is used for decoding even though the frame is classified as a bad frame. The criterion that determines whether the LTP-lag from the erroneous frame is possibly correct is calculated from past values of the LTP parameters. If the LTP-lag meets the criteria, it is used in decoding. Otherwise, the lag is classified as very corrupted and the LTP-lag concealment procedure for lost frames is performed (explained in Chapter 2.2.2). The criterion for using the LTP-lag from the erroneous frame is adapted according to the last good LTP parameters. The correlation and characteristics of the LTP-lag and LTP-gain (demonstrated in Figures 2 and 3) have an important role in adapting the selection criteria. The decision criteria for using the LTP lag from a corrupted frame need to be very strict, especially in the case of a voiced speech signal. Such criteria can be designed easily, because the LTP-lag is very stable during a voiced sequence and the LTP-gain is usually relatively high. This means that a corrupted LTP-lag can be detected, and a correct lag value recognized, with high probability. On the other hand, e.g. in the case of a non-stationary speech signal, the decision criteria do not need to be so strict and the LTP-lag may be allowed to have larger variation. For an unvoiced speech signal, a small error in the LTP lag does not usually generate audible artifacts. Lag variation can be considered advisable for non-stationary speech sequences. Even when the bits of the LTP-lag information are corrupted, they can be approved by the decision criterion.
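The idea of an adaptive acceptance criterion can be illustrated with a minimal Python sketch. It is not the authors' criterion: the function name `accept_corrupted_lag`, the two-regime structure and all threshold values (`voiced_gain`, `strict_tol`, `loose_margin`) are assumptions chosen only to show a strict test for voiced speech and a looser one for non-stationary speech.

```python
def accept_corrupted_lag(t_bf, lag_history, gain_history,
                         voiced_gain=0.5, strict_tol=10, loose_margin=5):
    """Decide whether the lag decoded from a corrupted frame looks plausible.

    t_bf:         lag decoded from the bad frame.
    lag_history:  lags of previous good frames, oldest first.
    gain_history: LTP gains of previous good frames, oldest first.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    last_lag = lag_history[-1]
    # High recent gains suggest a voiced (stationary) segment with a stable lag.
    voiced = gain_history[-1] > voiced_gain and gain_history[-2] > voiced_gain
    if voiced:
        # Strict criterion: the received lag must stay close to the last good lag.
        return abs(t_bf - last_lag) < strict_tol
    # Looser criterion: accept anything within the recent lag range plus a margin.
    low = min(lag_history) - loose_margin
    high = max(lag_history) + loose_margin
    return low < t_bf < high
```

If the function returns False, the lag would be treated as badly corrupted and the lost-frame procedure would be used instead.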
2.2.2 LTP-lag concealment for lost frames

When the frame is classified as a lost frame, or as a corrupted frame in which the LTP information is badly corrupted, the substituted LTP-lag is calculated from the last good speech parameters. First, the characteristics of the current decoded speech signal are analyzed, and the LTP-lag is calculated based on that. Basically, the analysis algorithm recognizes the stationarity of the speech signal by using the correlation of adjacent LTP parameters (Figures 2 and 3). For a stationary speech signal, the LTP parameter procedure is done as in [2, 3, 4]. For a speech signal with non-stationary characteristics, the error concealment is done by using a weighted median of the previous good LTP lag values, to which an adaptive random contribution is added. The limits for the adaptive randomization are calculated from the past LTP parameters. The proposed algorithm decides whether the lag is expected to be relatively constant or whether it should fluctuate.

The proposed and traditional methods are compared to the error-free case in Figures 4, 5 and 6. Figures 4 and 5 illustrate the improvement that can be achieved in the parametric domain. In Figure 4, the concealment without adaptivity creates an audible artifact in the middle of an unvoiced speech sequence. In Figure 5, the LTP-lag concealed with the traditional method creates an annoying "bing" artifact, whereas the proposed LTP-lag concealment operates smoothly. Figure 6 shows the difference between the concealment methods in the reconstructed speech signal when the LTP parameters are concealed as in Figure 5.

2.2.3 LTP-gain concealment for lost and corrupted frames

Received speech parameters are also analyzed when error concealment is done for the LTP-gain. Simple substitution with previous values of the LTP-gain with slight degradation in the middle of a non-stationary sequence may cause audible artifacts. In the case of a non-stationary speech signal, the LTP-gain is not stable, but fluctuates over the whole range (typically 0.0 to 1.2). Figures 2 and 3 demonstrate this characteristic of the LTP-gain. Because of this characteristic, the gain concealment can be improved. In the new method, random variation between proper limits is generated when gain concealment is done for a non-stationary speech signal. Especially when the last good LTP-gain is high, it is important to do the concealment according to the decision criterion (Chapter 2.4.3). In the proposed method, the LTP-gain is also concealed by using a weighted median of the previous LTP gains and adding random variation to it according to adaptive threshold limits. As in the case of LTP-lag concealment, the adaptive decision thresholds used in the LTP-gain concealment are calculated from the past values of LTP.

2.3 Figures

Figure 1. Block diagram of the proposed bad frame handling. (Blocks: RX, speech parameters, switch, parameter history, analyser, parameter concealment, switch, decoder.)

Figure 2. The speech sequence 'viiniä' has very stationary characteristic. (Curves: LTP-lag and LTP-gain versus time.)

Figure 3. The speech sequence 'exhibition' has mostly non-stationary sequences. (Curves: LTP-lag and LTP-gain versus time.)

Figure 4. LTP-lag concealment for bad frames. Prior LTP-lag concealment algorithms create an audible artifact in the middle of an unvoiced speech sequence. The proposed concealment uses the previous set of LTP parameters, and so error concealment can be improved. (Curves: improved concealment, prior-art concealment, error-free; axes: lag in samples and BFI versus subframe n.)

Figure 5. LTP-lag concealment for bad frames. Prior LTP-lag concealment algorithms create an annoying "bing" artifact. The proposed LTP-lag concealment eliminates the artifact. (Curves: improved concealment, prior-art concealment, error-free; axes: lag in samples versus subframe n.)
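Figure 6. Difference in the reconstructed speech signal when LTP-lag error concealment is done as in Figure 5: (a) speech in an error-free channel, (b) prior-art lag concealment, (c) improved lag concealment. (Axes: amplitude versus samples.)

2.4 Equations

2.4.1 Updated LTP-lag when the frame is a corrupted frame

When BFI is set, the LTP lag of the bad frame ($T_{bf}$) is decoded according to the index, as is done when BFI is not set. $T_{bf}$ is used if it is inside the LTP feature criteria. The LTP feature criteria have the following conditions. If one of these conditions is true, $T_{bf}$ is used for updating the LTP-lag:

$$T_{dif} < 10 \ \text{and}\ T_{min} - 5 < T_{bf} < T_{max} + 5$$
$$g(n-1) > 0.5 \ \text{and}\ g(n-2) > 0.5 \ \text{and}\ T(n-1) - 10 < T_{bf} < T(n-1) + 10$$
$$g_{min} < 0.4 \ \text{and}\ g(n-1) = g_{min} \ \text{and}\ T_{min} < T_{bf} < T_{max}$$
$$T_{dif} < 70 \ \text{and}\ T_{min} < T_{bf} < T_{max}$$
$$T_{mean} < T_{bf} < T_{max}$$

Here $T(n-1)$ is the LTP lag from the previous good frame, $T_{dif} = T_{max} - T_{min}$, $T_{max} = \max(T_{buf})$, $T_{min} = \min(T_{buf})$, $T_{buf}$ is the lag buffer, $g_{min} = \min(g_{buf})$, $g(n)$ is the LTP gain of the current frame, $g(n-1)$ is the LTP gain of the previous good frame, $g(n-2)$ is the LTP gain of the frame before the previous good frame, and $T_{mean} = \mathrm{average}(T_{buf})$.

The LTP lag value for the current frame is defined as follows:

$$T(n) = \begin{cases} T_{bf}, & \text{if at least one condition is true} \\ \dfrac{T_{max} + T_{max1} + T_{max2}}{3} + RND(T_{max} - T_{max2}), & \text{otherwise} \end{cases}$$

where $T_{max} = \max(T_{buf})$, $T_{max1}$ is the second largest value in $T_{buf}$, $T_{max2}$ is the third largest value in $T_{buf}$, and $RND(x)$ is a random value generated in the range $[-x/2, x/2]$. If none of these LTP feature criteria conditions are true, $T_{bf}$ is not used and the lag is calculated from the LTP history as is done in section 2.4.2.

2.4.2 Updated LTP-lag when the frame is a lost frame

The usability of the LTP lag from the last good frame is defined as follows. (It estimates whether the lag is most probably very close to the transmitted lag, so that its usage should not introduce any bad artifacts.) The last good lag is considered usable if

$$g_{min} > 0.5 \ \text{and}\ T_{dif} < 10$$

or

$$g(n-1) > 0.5 \ \text{and}\ g(n-2) > 0.5,$$

where $g_{min} = \min(g_{buf})$, $g(n-1)$ is the LTP gain of the previous good frame and $g(n-2)$ is the LTP gain of the frame before the previous good frame.

The LTP lag value for the current frame is defined as follows:

$$T(n) = \begin{cases} T(n-1), & \text{if the last good lag is usable} \\ \dfrac{T_{max} + T_{max1} + T_{max2}}{3} + RND(T_{max} - T_{max2}), & \text{otherwise} \end{cases}$$

where $T(n-1)$ is the LTP lag from the previous good frame, $T_{max} = \max(T_{buf})$, $T_{max1}$ is the second largest value in $T_{buf}$, $T_{max2}$ is the third largest value in $T_{buf}$, and $RND(x)$ is a random value generated in the range $[-x/2, x/2]$. The LTP-lag is also updated with this algorithm when the frame is a corrupted frame and the LTP feature criteria are false.

2.4.3 LTP-gain concealment

When BFI is set, the updated LTP-gain is calculated using the following gain concealment rule.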
The usability of the LTP gain from the last good frame is defined by two variables as follows:

ain _, >.5and ( = and ( >.9 =, >.5 and ( = 2, <.5 and ( = 3,.5 = averae( = ( = ( =

where $g(n-1)$ is the LTP gain of the previous good frame, $g(n-2)$ is the LTP gain of the frame before the previous good frame, $g(n-3)$ is the LTP gain of the frame second before the previous good frame, Fr is the order of the frame, and RND(..) is a random value generated in the range [,].

The LTP gain value for the current frame is defined as follows:

[ ( 2 + ( 3 ], ain _ = and Fr = 2 = + (.. *(, = = 2 RND ain _ and Fr RND(..*(, ain _ = and Fr = 3 + RND(.. *(, ain _ = and Fr = 4 = ( (,,, = and = 2and = 3and ain _ ain _ ain _ = = =
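The lag substitution for lost frames (Sections 2.2.2 and 2.4) can be sketched in Python as follows. This is a hedged reading, not the authors' implementation: the function names `rnd` and `conceal_lost_lag`, the simplified stationarity test (two recent gains above an assumed threshold of 0.5) and the use of the mean of the three largest lags as the "weighted median" are all assumptions made for illustration. Only the RND(x) range [-x/2, x/2] is taken directly from the text.

```python
import random


def rnd(x, rng=random):
    """Random value in [-x/2, x/2], following the RND(x) definition in the text."""
    return rng.uniform(-x / 2.0, x / 2.0)


def conceal_lost_lag(lag_history, gain_history, rng=random, voiced_gain=0.5):
    """Substitute the LTP lag for a lost frame (illustrative sketch).

    lag_history:  lags of previous good frames, oldest first (at least 3 values).
    gain_history: LTP gains of previous good frames, oldest first.
    voiced_gain:  assumed threshold separating stationary from non-stationary.
    """
    last_lag = lag_history[-1]
    # Simplified stationarity test: high recent gains suggest a stable lag.
    stationary = gain_history[-1] > voiced_gain and gain_history[-2] > voiced_gain
    if stationary:
        # Voiced speech: reusing the last good lag is a good estimate.
        return last_lag
    # Non-stationary speech: take the three largest lags in the history ...
    t_max2, t_max1, t_max = sorted(lag_history)[-3:]
    # ... and add a random contribution bounded by their spread.
    return (t_max + t_max1 + t_max2) / 3.0 + rnd(t_max - t_max2, rng)
```

Passing a seeded `random.Random` instance as `rng` makes the random contribution reproducible, which is convenient when comparing concealment variants on the same error pattern.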

3. CONCLUSIONS

Based on extensive expert listening, it can be concluded that the proposed methods improve subjective speech quality compared to traditional methods. The LTP-lag error concealment in particular performed very well. On the other hand, the proposed error concealment for the LTP-gain has some disadvantages, even though in some cases it was better than the traditional methods [2, 3, 4]. The proposed LTP-lag concealment is implemented in AMR-WB [1] and was tested carefully in the 3GPP AMR-WB selection phase listening tests [6, 7].

4. REFERENCES

[1] 3G TS 26.191, AMR-WB speech codec, Error concealment of lost frames, release 5 2.
[2] 3G TS 26.091 3GPP, AMR speech codec, Error concealment of lost frames, version 3.. 1999.
[3] GSM 06.61 Digital cellular telecommunications system (Phase 2), Substitution and muting of lost frames for full rate speech traffic channels, version 5.. 1996.
[4] TIA/EIA/IS-641-A, TDMA Cellular/PCS - Radio Interface - Enhanced Full-Rate Speech Codec, 1996.
[5] Raymond Steele, Mobile Radio Communication, IEEE Press 1994.
[6] 3GPP TSG-S4, AMR-WB Selection Test Plan, Version. 2
[7] 3GPP TSG-S4, AMR-WB Selection Process Plan, Version. 2
[8] A.M. Kondoz, Digital speech coding for low bit rate communication system, John Wiley & Sons 2.