IMPROVED SUBSTITUTION FOR ERRONEOUS LTP-PARAMETERS IN A SPEECH DECODER. Jari Makinen, Janne Vainio, Hannu Mikkola, Jani Rotola-Pukkila

Speech and Audio Systems Laboratory, Nokia Research Center, Tampere, Finland, jari.m.makinen@nokia.com

ABSTRACT

In digital mobile communication systems, error concealment techniques are needed to reduce the subjective effects of bit errors that have not been eliminated by channel coding. This contribution presents a new approach to the concealment of long-term prediction (LTP) parameters. It can be applied to any speech codec in which a long-term prediction system is implemented. The proposed method takes into account the diverse nature of the speech signal and uses the speech parameter history to find an optimal substitution for the LTP parameters. Based on extensive expert listening, it can be concluded that the proposed method improves subjective speech quality compared to traditional methods. This method is used as a part of the error concealment of the 3GPP AMR-WB codec [1].

1. INTRODUCTION

Error concealment techniques are necessary if transmission errors occur and channel coding is not able to correct all of them. Speech parameters determined by modern, highly compressed source coding algorithms are vulnerable to channel errors, and hence residual bit errors in the speech parameters can result in annoying artifacts in the reconstructed speech. The long-term prediction (LTP) parameters usually include pitch and gain parameters [8]. When an error is detected in a speech decoder, the erroneous speech parameters, including the LTP parameters, are substituted. Traditional LTP parameter error concealment methods [2, 3, 4] are introduced in this article, and their deficiencies are pointed out. The proposed LTP parameter error concealment method attempts to avoid these deficiencies. Two new aspects are included in the proposed method. Firstly, the unreliable information in an erroneous frame is exploited in error concealment. Secondly, the proposed method utilizes inter- and intra-frame correlation for error concealment.
It is well known that residual redundancy, especially inter-frame correlation, can be exploited at the receiver to enhance the disturbed signal. The proposed error concealment for LTP parameters focuses on the enhancement of speech quality rather than on the minimization of a bit or symbol error rate.

2. IMPROVED LTP-PARAMETER SUBSTITUTION

The idea behind the LTP lag and gain substitution methods is to conceal the LTP parameters with values that cause minimal degradation in speech quality. In the proposed method, this is done by analyzing the characteristics of the decoded speech sequence. When a bad frame is detected, the previous sets of speech parameters and the current corrupted parameters are analyzed in order to find an improved LTP parameter concealment.

2.1 Background for proposed LTP parameter substitution

2.1.1 Classification

In error concealment, a bad frame can be classified as corrupted or lost. A frame is a lost frame if it has not been received at the decoder side. Lost frames are very typical, e.g., in packet-switched networks, where speech frames are transmitted in packets through the transmission channel and some packets are never received or arrive too late. A lost frame contains no information about the originally transmitted speech frame. A frame is a corrupted frame if it has been received at the decoder side but the error detection mechanism has classified it as erroneous. These frames contain unreliable speech information. Typically, corrupted frames are related to circuit-switched networks. In many cases the bit error rate (BER) per corrupted frame is very small when the channel is relatively good. According to testing in a GSM speech traffic channel with 3% of frame error rate (FER), 6% of the LTP-lag values in erroneous frames are still correct. Additionally, more than 7% of the LTP-lags are correct when FER is.2%.
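The lost/corrupted distinction above can be sketched in a few lines. This is only an illustration: the names `FrameType` and `classify_frame`, and the convention that a missing payload is passed as `None`, are assumptions of this sketch and do not come from any codec specification.

```python
from enum import Enum
from typing import Optional


class FrameType(Enum):
    GOOD = "good"
    CORRUPTED = "corrupted"
    LOST = "lost"


def classify_frame(payload: Optional[bytes], bfi: bool) -> FrameType:
    """Classify one frame at the decoder input (hypothetical helper).

    payload: received frame bits, or None if nothing arrived in time.
    bfi:     bad frame indication set by the error detection mechanism.
    """
    if payload is None:
        # Never received, or received too late: no information at all.
        return FrameType.LOST
    if bfi:
        # Received, but flagged as erroneous: the bits are unreliable,
        # yet many of them may still be correct.
        return FrameType.CORRUPTED
    return FrameType.GOOD
```

A decoder built along these lines would route `CORRUPTED` frames to a concealment path that may still inspect the received LTP lag, and `LOST` frames to a path that relies on parameter history alone.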
2.1.2 Deficiencies in prior LTP parameter substitution methods

Traditionally, LTP parameter substitution during erroneous speech frames is always done with the same algorithm, regardless of the characteristics of the decoded speech sequence [2, 3, 4]. The LTP-lag is substituted by the last good LTP-lag value with slight modification, and the LTP-gain is substituted by a slightly degraded version of the last good LTP-gain value. If an erroneous speech frame is detected during a voiced speech sequence, this concealment procedure is efficient, because a voiced speech signal is stationary and the last good LTP parameters give a good estimate of the transmitted parameters. An unvoiced speech signal, on the other hand, is non-stationary: the LTP parameters vary a lot, and the inter-frame correlation is therefore low. In this case, traditional concealment procedures [2, 3, 4] usually generate a periodic signal. When a bad frame is detected, traditional concealment methods use only the last good speech parameters in each frame. This inserts a periodic signal sequence into the middle of a non-stationary speech signal and can generate an annoying artifact that does not belong to the original speech. To avoid this, the concealment procedure for LTP parameters needs to be different for non-stationary signals.

Traditionally, a frame is classified as a bad frame when the bad frame indication (BFI) is set by a cyclic redundancy check (CRC) [5] or another error detection mechanism in the channel decoding process. Error detection mechanisms are used to detect errors among the most important bits. Traditionally, these bits are not used for decoding when the frame is classified as a bad frame. However, the frame may have only a few erroneous bits; BFI is nevertheless set, and the whole frame is discarded even though the rest of the bits are correct. This means that a lot of correct information is thrown away. A long enough CRC detects erroneous frames very well, but it does not give an estimate of the bit error rate (BER) in the frame.

2.2 The proposed methods

The block diagram of the proposed method is shown in Figure 1. The last good speech parameters are used to determine the proper error concealment procedure for the LTP parameters. The proposed method uses different schemes for lost and corrupted frames. When the frame is classified as corrupted, the LTP information of the erroneous frame is used in addition to the last good speech parameters.
However, for the LTP-gain, the information of erroneous frames is not used. The LTP-lag and LTP-gain error concealment methods are based on inter- and intra-frame correlation, especially the correlation of the LTP parameters.

2.2.1 LTP-lag concealment for corrupted frames

LTP-lag concealment for corrupted frames attempts to avoid the deficiencies discussed in Chapter 2.1.2. The basic idea is that, subject to adaptive criteria, the LTP-lag information from the erroneous frame is used for decoding even though the frame is classified as a bad frame. The criterion that determines whether the LTP-lag from the erroneous frame is possibly correct is calculated from past values of the LTP parameters. If the LTP-lag meets the criteria, it is used in decoding. Otherwise, the lag is classified as very corrupted, and the LTP-lag concealment procedure for lost frames is performed (explained in Chapter 2.2.2). The criterion for using the LTP-lag from an erroneous frame is adapted according to the last good LTP parameters. The correlation and characteristics of the LTP-lag and LTP-gain (demonstrated in Figures 2 and 3) play an important role in adapting the selection criteria. The decision criteria for using the LTP lag from a corrupted frame need to be very strict, especially in the case of a voiced speech signal. Such criteria can be designed easily, because the LTP-lag is very stable during a voiced sequence and the LTP-gain is usually relatively high. This means that a corrupted LTP-lag can be detected, and a correct lag value recognized, with high probability. On the other hand, in the case of a non-stationary speech signal, for example, the decision criteria do not need to be so strict, and the LTP-lag may be allowed to have larger variation. For an unvoiced speech signal, a small error in the LTP lag does not usually generate audible artifacts. Lag variation can even be considered desirable for non-stationary speech sequences. Even when the bits of the LTP-lag information are corrupted, the lag can still be approved by the decision criterion.
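The adaptive acceptance test described above might be sketched as follows. The function name, the history layout (oldest value first, last element taken from the previous good frame), and the numeric thresholds are all assumptions of this sketch: the thresholds follow one reading of the LTP feature criteria in the equations section and should be treated as illustrative, not as normative values.

```python
from statistics import mean
from typing import Sequence


def accept_corrupted_lag(t_bf: int,
                         lag_hist: Sequence[int],
                         gain_hist: Sequence[float]) -> bool:
    """Return True if the lag decoded from a corrupted frame looks usable.

    t_bf:      LTP lag decoded from the bad frame.
    lag_hist:  lags of previous good frames, oldest first (assumed layout).
    gain_hist: LTP gains of previous good frames, oldest first.
    """
    t_min, t_max = min(lag_hist), max(lag_hist)
    t_dif = t_max - t_min            # spread of the lag history
    t_mean = mean(lag_hist)
    t_prev = lag_hist[-1]            # lag of the previous good frame
    g_prev, g_prev2 = gain_hist[-1], gain_hist[-2]
    g_min = min(gain_hist)

    return (
        # Stable lag history: accept only a narrow window around it.
        (t_dif < 10 and t_min - 5 < t_bf < t_max + 5)
        # High recent gains (voiced): stay close to the last good lag.
        or (g_prev > 0.5 and g_prev2 > 0.5
            and t_prev - 10 < t_bf < t_prev + 10)
        # Low, falling gain: any lag inside the history range is fine.
        or (g_min < 0.4 and g_prev == g_min and t_min < t_bf < t_max)
        # Moderately varying history: lag must lie inside its range.
        or (t_dif < 70 and t_min < t_bf < t_max)
        # Widely varying history: accept lags in the upper part of the range.
        or (t_mean < t_bf < t_max)
    )
```

With a stable, high-gain history the window is narrow, so an outlying lag is rejected, while a widely varying, low-gain history tolerates much larger deviations. This mirrors the strict-for-voiced, relaxed-for-unvoiced behaviour described in the text.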
2.2.2 LTP-lag concealment for lost frames

When the frame is classified as a lost frame, or as a corrupted frame whose LTP information is badly corrupted, the substituted LTP-lag is calculated from the last good speech parameters. First, the characteristics of the current decoded speech signal are analyzed, and the LTP-lag is calculated on that basis. The analysis algorithm recognizes the stationarity of the speech signal by using the correlation of adjacent LTP parameters (Figures 2 and 3). For a stationary speech signal, the LTP parameter procedure is done as in [2, 3, 4]. For a speech signal with non-stationary characteristics, the error concealment uses a weighted median of the previous good LTP lag values, to which an adaptive random contribution is added. The limits for the adaptive randomization are calculated from the past LTP parameters. The proposed algorithm decides whether the lag is expected to be relatively constant or whether it should fluctuate. The proposed and traditional methods are compared to the error-free case in Figures 4, 5 and 6. Figures 4 and 5 illustrate the improvement that can be achieved in the parametric domain. In Figure 4, the concealment without adaptivity creates an audible artifact in the middle of an unvoiced speech sequence. In Figure 5, the LTP-lag concealed with the traditional method creates an annoying "bing" artifact, whereas the proposed LTP-lag concealment operates smoothly. Figure 6 shows the difference between the concealment methods in the reconstructed speech signal when the LTP parameters are concealed as in Figure 5.

2.2.3 LTP-gain concealment for lost and corrupted frames

Received speech parameters are also analyzed when error concealment is done for the LTP-gain. Simple substitution with slightly degraded previous values of the LTP-gain in the middle of a non-stationary sequence may cause audible artifacts. In the case of a non-stationary speech signal, the LTP-gain is not stable but fluctuates over the whole range (typically 0.0 to 1.2). Figures 2 and 3 demonstrate this characteristic of the LTP-gain. Because of this characteristic, the gain concealment can be improved. In the new method, random variation between proper limits is generated when gain concealment is done for a non-stationary speech signal. Especially when the last good LTP-gain is high, it is important to do the concealment according to the decision criterion (Chapter 2.4.3). In the proposed method, the LTP-gain is likewise concealed by taking a weighted median of the previous LTP gains and adding random variation to it according to adaptive threshold limits. As in the case of LTP-lag concealment, the adaptive decision thresholds used in the LTP-gain concealment are calculated from the past values of the LTP parameters.

2.3 Figures

Figure 1 Block diagram of the proposed bad frame handling. (Blocks: RX, Switch, Speech parameters, Parameter history, Analyser, Parameter concealment, Switch, Decoder.)

Figure 2 The speech sequence 'viiniä' has a very stationary characteristic. (Curves: LTP-lag, LTP-gain; segments labelled 'vii' and 'niä'; horizontal axis: Time.)

Figure 3 The speech sequence 'exhibition' has mostly non-stationary sequences. (Curves: LTP-lag, LTP-gain; segments labelled 'exhi' and 'bition'; horizontal axis: Time.)

Figure 4 Prior LTP-lag concealment algorithms create an audible artifact in the middle of an unvoiced speech sequence. The proposed concealment uses the previous set of LTP parameters, and so error concealment can be improved. (Panel title: LTP-lag concealment for bad frames; curves: BFI, Improved concealment, Prior-art concealment, Error-free; axes: Lag (samples) versus Subframe (n).)

Figure 5 Prior LTP-lag concealment algorithms create an annoying "bing" artifact. The proposed LTP-lag concealment eliminates the artifact. (Panel title: LTP-lag concealment for bad frames; curves: Improved concealment, Prior-art concealment, Error-free; axes: Lag (samples) versus Subframe (n).)
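Figure 6 Difference in the reconstructed speech signal when LTP-lag error concealment is done as in Figure 5: (a) speech in an error-free channel, (b) prior-art lag concealment, (c) improved lag concealment. (Horizontal axis of each panel: Samples.)

2.4 Equations

2.4.1 Updated LTP-lag when the frame is a corrupted frame

When BFI is set, the LTP lag of the bad frame, $T_{bf}$, is decoded from its index just as it is when BFI is not set. $T_{bf}$ is used if it is inside the LTP feature criteria. The LTP feature criteria consist of the following conditions. If one of these conditions is true, $T_{bf}$ is used for updating the LTP-lag:

\[
\mathit{lag} =
\begin{cases}
1, & T_{dif} < 10 \ \text{and}\ T_{min} - 5 < T_{bf} < T_{max} + 5 \\
1, & g_p(n-1) > 0.5 \ \text{and}\ g_p(n-2) > 0.5 \ \text{and}\ T(n-1) - 10 < T_{bf} < T(n-1) + 10 \\
1, & g_{min} < 0.4 \ \text{and}\ g_p(n-1) = g_{min} \ \text{and}\ T_{min} < T_{bf} < T_{max} \\
1, & T_{dif} < 70 \ \text{and}\ T_{min} < T_{bf} < T_{max} \\
1, & T_{mean} < T_{bf} < T_{max} \\
0, & \text{otherwise}
\end{cases}
\]

where $T(n-1)$ is the LTP lag from the previous good frame, $T_{dif} = T_{max} - T_{min}$, $T_{max} = \max(T_{buf})$, $T_{min} = \min(T_{buf})$, $T_{mean} = \mathrm{average}(T_{buf})$, $T_{buf}$ is the lag buffer, $g_{min} = \min(g_{buf})$ with $g_{buf}$ the gain buffer, $g_p(n)$ is the LTP gain of the current frame, $g_p(n-1)$ is the LTP gain of the previous good frame, and $g_p(n-2)$ is the LTP gain of the frame before the previous good frame.

The LTP lag value for the current frame is defined as follows:

\[
T =
\begin{cases}
T_{bf}, & \mathit{lag} = 1 \\
\dfrac{T_{max} + T_{max1} + T_{max2}}{3} + \mathrm{RND}(T_{max} - T_{max2}), & \mathit{lag} = 0
\end{cases}
\]

where $T_{max} = \max(T_{buf})$, $T_{max1}$ is the second largest value in $T_{buf}$, $T_{max2}$ is the third largest value in $T_{buf}$, and $\mathrm{RND}(x)$ is a random value generated in the range $[-x/2, x/2]$.

If none of these LTP feature criteria conditions is true, $T_{bf}$ is not used and the lag is calculated from the LTP history as it is done in section 2.4.2.

2.4.2 Updated LTP-lag when the frame is a lost frame

The usability of the LTP lag from the last good frame ($\mathit{lag}_{t-1}$) is defined as follows. It estimates whether the lag is most probably very close to the transmitted lag, so that its usage should not introduce any bad artifacts.

\[
\mathit{lag}_{t-1} =
\begin{cases}
1, & g_{min} > 0.5 \ \text{and}\ T_{dif} < 10 \\
1, & g_p(n-1) > 0.5 \ \text{and}\ g_p(n-2) > 0.5 \\
0, & \text{otherwise}
\end{cases}
\]

where $g_{min} = \min(g_{buf})$, $g_p(n-1)$ is the LTP gain of the previous good frame, and $g_p(n-2)$ is the LTP gain of the frame before the previous good frame.

The LTP lag value for the current frame is defined as follows:

\[
T =
\begin{cases}
T(n-1), & \mathit{lag}_{t-1} = 1 \\
\dfrac{T_{max} + T_{max1} + T_{max2}}{3} + \mathrm{RND}(T_{max} - T_{max2}), & \mathit{lag}_{t-1} = 0
\end{cases}
\]

where $T(n-1)$ is the LTP lag from the previous good frame, $T_{max} = \max(T_{buf})$, $T_{max1}$ is the second largest value in $T_{buf}$, $T_{max2}$ is the third largest value in $T_{buf}$, and $\mathrm{RND}(x)$ is a random value generated in the range $[-x/2, x/2]$.

The LTP-lag is also updated with this algorithm when the frame is a corrupted frame and the LTP feature criteria are false.

2.4.3 LTP-gain concealment

When BFI is set, the updated LTP-gain is calculated using the following gain concealment rule.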
The usability of the last LTP gain from the last good frame is defined by the variables and gain_ as follows:

gain_, >.5 and ( = and ( >.9 =, >.5 and ( = 2, <.5 and ( = 3,.5 = average( = ( = ( = (

where ( is the LTP gain of the previous good frame, ( 2 is the LTP gain of the frame before the previous good frame, ( 3 is the LTP gain of the frame second before the previous good frame, Fr is the order of the frame, and RND (.. is a random value generated in the range [,].

The LTP gain value for the current frame is defined as follows:

[ ( 2 + ( 3 ], gain_ = and Fr = 2 = + (.. *(, = = 2 RND gain_ and Fr RND(..*(, gain_ = and Fr = 3 + RND(.. *(, gain_ = and Fr = 4 = ( (,,, = and = 2and = 3and gain_ gain_ gain_ = = =
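Returning to the lag rule for lost frames given earlier in this section, a minimal sketch of it is shown below. The function name, the history layout (oldest first), the stationarity thresholds and the use of the mean of the three largest stored lags as the "weighted median" are assumptions of this sketch rather than a definitive implementation.

```python
import random
from typing import Optional, Sequence


def conceal_lost_lag(lag_hist: Sequence[int],
                     gain_hist: Sequence[float],
                     rng: Optional[random.Random] = None) -> float:
    """Substitute the LTP lag for a lost frame (illustrative only).

    lag_hist / gain_hist: values from previous good frames, oldest first;
    at least three lags and two gains are required.
    """
    rng = rng if rng is not None else random.Random()
    t_dif = max(lag_hist) - min(lag_hist)

    # Stationary (voiced-like) history: the last good lag is a good guess.
    stationary = (
        (min(gain_hist) > 0.5 and t_dif < 10)
        or (gain_hist[-1] > 0.5 and gain_hist[-2] > 0.5)
    )
    if stationary:
        return float(lag_hist[-1])

    # Non-stationary history: average the three largest stored lags and
    # add a random offset RND(x), drawn uniformly from [-x/2, x/2].
    t_max, t_max1, t_max2 = sorted(lag_hist, reverse=True)[:3]
    spread = t_max - t_max2
    offset = rng.uniform(-spread / 2, spread / 2)
    return (t_max + t_max1 + t_max2) / 3 + offset
```

For a stable history the function simply repeats the last good lag, which matches the traditional behaviour for voiced speech. For an unstable history the returned lag fluctuates from call to call, which avoids inserting a strictly periodic segment into unvoiced speech.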

3. CONCLUSIONS

Based on extensive expert listening, it can be concluded that the proposed methods improve subjective speech quality compared to traditional methods. The LTP-lag error concealment in particular performed very well. The proposed error concealment for the LTP-gain, on the other hand, has some disadvantages, even though in some cases it was better than the traditional methods [2, 3, 4]. The proposed LTP-lag concealment is implemented in AMR-WB [1], and it was tested carefully in the 3GPP AMR-WB selection phase listening tests [6, 7].

4. REFERENCES

[1] 3G TS 26.191, AMR-WB speech codec, Error concealment of lost frames, release 5 2.
[2] 3G TS 26.091 3GPP, AMR speech codec, Error concealment of lost frames, version 3.., 1999.
[3] GSM 06.61, Digital cellular telecommunications system (Phase 2), Substitution and muting of lost frames for full rate speech traffic channels, version 5.., 1996.
[4] TIA/EIA/IS-641-A, TDMA Cellular/PCS-Radio Interface Enhanced Full-Rate Speech Codec, 1996.
[5] Raymond Steele, Mobile Radio Communication, IEEE Press, 1994.
[6] 3GPP TSG-S4, AMR-WB Selection Test Plan, Version. 2
[7] 3GPP TSG-S4, AMR-WB Selection Process Plan, Version. 2
[8] A.M. Kondoz, Digital speech coding for low bit rate communication system, John Wiley & Sons 2.