IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 6, SEPTEMBER 2000

Robust Joint Source-Channel Coding for Image Transmission Over Wireless Channels

Jianfei Cai and Chang Wen Chen

Abstract: In this letter, we present a fixed-length robust joint source-channel coding (RJSCC) scheme for transmitting images over wireless channels. The system integrates a joint source-channel coding (JSCC) scheme with all-pass filtering source shaping to enable robust image transmission. In particular, we incorporate both the transition probability and the bit error rate of a bursty channel model into an end-to-end rate-distortion (R-D) function to achieve an optimum tradeoff between source coding accuracy and channel error protection under a fixed transmission rate. Experimental results show that the proposed scheme achieves not only high peak signal-to-noise ratio (PSNR) performance but also excellent perceptual quality, especially when channel mismatch occurs.

Index Terms: Bit allocation, image communication, joint source-channel coding, source reshaping, wireless communications.

I. INTRODUCTION

Recently, Ruf and Modestino [1] proposed a fixed-length joint source-channel coding (JSCC) scheme for image transmission over additive white Gaussian noise (AWGN) channels, in which different channel codes are applied to different bits according to their respective importance to the reconstructed image. To extend this scheme to wireless transmission, it is necessary to consider the bursty characteristics of wireless channels. It is also desirable to improve the performance of the fixed-length coding scheme so that it becomes competitive with algorithms that employ variable-length coding.

In this paper, we propose a fixed-length robust joint source-channel coding (RJSCC) scheme. This scheme is based on an optimal joint source and channel coding (OJSCC) scheme developed for generalized Gaussian sources [2] and on all-pass filtering source reshaping.
We derive general R-D functions for three channel models: binary symmetric channels (BSC) and AWGN channels as memoryless channels, and Gilbert-Elliott channels (GEC) [3] as bursty channels. The idea of source reshaping is motivated by the robust quantization (RQ) scheme presented in [4]. However, there are several fundamental differences. First, channel coding and optimal bit allocation were not considered in [4]. Second, the bursty characteristics of the channel were not addressed. Third, we show that source reshaping is applicable to cases beyond the Gaussian distribution, including the limiting case of the uniform distribution.

The contribution of this paper is twofold. First, we derive an explicit R-D function based on the fixed-length RJSCC scheme for general wireless channels modeled by finite-state Markov processes. Second, the integration of all-pass filtering with the OJSCC scheme enables wireless image transmission to achieve not only better PSNR performance but also better perceptual quality. Compared with state-of-the-art JSCC schemes, the proposed scheme outperforms many of them, especially when channel mismatch occurs.

Manuscript received June 28, 1999; revised March 16, 2000. This research was supported by the University of Missouri Research Board under Grant URB-98-142. This paper was recommended by Associate Editor F. Pereira. The authors are with the Department of Electrical Engineering, University of Missouri-Columbia, Columbia, MO 65211 USA. Publisher Item Identifier S 1051-8215(00)07563-7.

Fig. 1. SNR performance for 3 bits/sample encoding of memoryless sources.

II. AN OJSCC FOR MEMORYLESS AND BURSTY CHANNELS

We have recently developed an OJSCC scheme for generalized Gaussian distribution (GGD) sources over memoryless and bursty channels [2]. The scheme is similar to the approach proposed by Ruf and Modestino [1]. However, we consider a full range of GGD shape factors as well as the bursty characteristics of a channel.
This scheme enables us to conduct an extensive study of the behavior of OJSCC under different source distributions and channel conditions. The study facilitates the integration of all-pass filtering with OJSCC to develop the RJSCC scheme.

Consider the exponential decay rate, or shape factor, of a GGD source. The source is Gaussian when the shape factor equals two and Laplacian when it equals one. A GGD with a small shape factor provides a useful model for broad-tailed processes. Notice that for a very large shape factor, the distribution tends to a uniform distribution [5]. In wavelet image coding, the transformed coefficients are shown to be distributed as generalized Gaussian with shape factors usually less than two [6].

Fig. 1 summarizes the study reported in [2], showing the SNR performance of the OJSCC system for coding memoryless uniform, Gaussian, Laplacian, and GGD-0.5 sources at 3 bits/sample over a BSC, where GGD-0.5 denotes GGD data with a shape factor of 0.5. An interesting observation is that the larger the shape factor of the input data, the better the SNR performance of the OJSCC scheme. Other coding rates and other channel models produce similar observations [2]. Therefore, if a source with a smaller shape factor can be reshaped into a source with a larger shape factor, improved transmission performance can be obtained.

It is this observation that motivates us to develop the RJSCC scheme, an integration of the OJSCC scheme with all-pass filtering source reshaping. All-pass filtering is able to shape a wide range of input sources into Gaussian-distributed sources based on, intuitively, the central limit theorem. Notice that if we could design a filter to shape input sources into uniformly distributed sources, we could achieve even better performance. However, it is nontrivial to design such a filter, since it would have to perform a nonlinear mapping. Therefore, the all-pass filtering method is adopted for its implementation advantage.

The study reported in [2] also shows that, for Gaussian sources, the performance of uniform quantization and that of optimal quantization are nearly the same for all three channel models. Therefore, the complexity of quantization is greatly reduced in the proposed RJSCC scheme since, after all-pass filtering, all input sources become Gaussian distributed and uniform quantization can be employed.

1051-8215/00$10.00 © 2000 IEEE

Fig. 2. System structure.

III. SYSTEM DESCRIPTION

The proposed RJSCC system is shown in Fig. 2. First, an input image is decorrelated using the discrete wavelet transform (DWT). The transform also decomposes the original image source into many subsources so that high compression efficiency can be achieved. Then, all-pass filtering is applied to reshape each subsource into an approximately Gaussian distribution. Each sample of these reshaped subsources is then mapped into an index by an n-bit uniform quantizer designed for memoryless Gaussian sources.
The output of the quantizer is fed into the channel encoder, where unequal error protection (UEP) is applied. The receiver essentially performs the inverse process.

A. Image-Coding Structures

Three image-coding systems are considered in this research:

1) System A: An input image is decomposed into 13 hierarchical subbands using the DWT with the Daubechies 9/7 biorthogonal filterbanks. Each subband is treated as a subsource, so there are 13 subsources in total.

2) System B: An input image is decomposed into 16 hierarchical subbands using the DWT, so there are 16 subsources in total.

3) System C: The hierarchical decomposition is the same as in System B. However, each subband is further partitioned into subsources whose size equals that of the low-frequency subband (LFS). Therefore, there are 1024 subsources in total.

System A is the same as the A-RQ system in [4]. System B and System C are the same as the 16-UT and 1024-GG systems in [1], respectively. We adopt these systems for comparison purposes.

B. All-Pass Filtering

The principles of all-pass filtering have been addressed in detail in [7]. Our extensive study [2] also shows that, for the same variance, the performance of OJSCC for a Gaussian source is better than that for any other GGD source with a shape factor less than two. Therefore, for these sources, we can improve performance by applying all-pass filtering source reshaping. For GGD sources with a shape factor greater than two, all-pass filtering may increase the mean square error (MSE). However, the application of all-pass filtering still improves the perceptual quality of the reconstructed images [4]. This is because all-pass filtering spreads the error energy over many transformed coefficients. The total energy of the distortion caused by transmission errors is unchanged, but the noise energy is now distributed over many coefficients; hence, the perceptual advantage is dramatic [4].
There are many ways to implement all-pass filtering. We adopt the binary phase spectra of pseudonoise signals [4] because pseudonoise signals, such as m-sequences or quasi-m-arrays, can be easily generated and appear random to human perception. Furthermore, numerous such sequences and arrays are available. For image coding, the phase spectra of 2-D quasi-m-arrays are employed. As these arrays can be generated in advance and quantized to two values, +1 and -1, prefiltering and postfiltering are straightforward and can be implemented via the sequential operations of FFT, spectral multiplication, and inverse FFT.
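The prefiltering and postfiltering steps described above can be sketched in a few lines. What follows is a minimal one-dimensional illustration, not the implementation used in the experiments: a naive DFT stands in for the FFT, a seeded pseudo-random sequence of +1 and -1 stands in for the binary phase spectrum of a quasi-m-array, and the names dft, binary_phase_spectrum, and allpass are hypothetical.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(N^2) DFT; a real system would call an FFT routine instead."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1.0 / n if inverse else 1.0
    return [
        scale * sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        for k in range(n)
    ]


def binary_phase_spectrum(n, seed=0):
    """Hypothetical +/-1 spectrum with H[k] == H[n - k], so real input stays real.

    Assumption: a seeded pseudo-random generator replaces the quasi-m-array
    phase spectra mentioned in the text.
    """
    rng = random.Random(seed)
    h = [1] * n
    for k in range(1, n // 2 + 1):
        h[k] = h[n - k] = rng.choice((1, -1))
    return h


def allpass(x, h):
    """DFT, multiply by the +/-1 spectrum, inverse DFT."""
    shaped = [s * hk for s, hk in zip(dft(x), h)]
    return [v.real for v in dft(shaped, inverse=True)]


n = 64
h = binary_phase_spectrum(n)

# Every H[k] is +1 or -1, so H[k]**2 == 1: the same filter serves as
# prefilter and postfilter, and the signal energy is preserved.
rng = random.Random(1)
x = [rng.expovariate(1.0) * rng.choice((1, -1)) for _ in range(n)]  # Laplacian-like
y = allpass(x, h)
x_back = allpass(y, h)

# An isolated error (modeled as a unit impulse) is spread over many samples
# by the postfilter instead of staying concentrated in one coefficient.
impulse = [0.0] * n
impulse[10] = 1.0
spread = allpass(impulse, h)
```

Since each spectral multiplier is +1 or -1, applying the same filter twice restores the input, so x_back agrees with x up to rounding. The unit impulse comes out with the same total energy but a peak below one, which is the energy-spreading effect described in this subsection.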
TABLE I. PSNR RESULTS OF THREE IMAGE-CODING STRUCTURES FOR 512 × 512 LENA OVER BSC WITH 0.5 BPP

C. R-D Function

Following [1], we can define the R-D function for memoryless channels as (1), whose terms are: the number of subsources; the total number of pixels in the original image; the number of pixels in each subsource; the distortion caused by quantizing each subsource with its allocated number of bits; and, for each bit of each subsource, the bit-error sensitivity and the equivalent channel bit-error rate after channel coding.

When a finite-state Markov model [3] is adopted to model the wireless channel, the general R-D function for wireless channels can be derived as (2), whose additional terms are: the number of channel states; the probability of the channel staying in each state; the bit-error rate for each state; and the equivalent bit-error rate of each bit of each subsource in each state after channel coding. For BSC or AWGN models, there is only one state, so (2) becomes exactly the same as (1). For the GEC model, there are two states: Good and Bad. The model is described by the transition probabilities from one state to the other, the BERs in the Good and Bad states, and the probabilities of staying in the Good and Bad states. Substituting these two-state quantities into (2) yields the R-D function for the GEC model.

The overall rate (bits/sample) can be written as (3), in terms of the channel code rate assigned to each bit of each subsource.

Suppose the channel condition is known. Then the equivalent bit-error rates are determined by the channel codes, while the quantization distortion and the bit-error sensitivities are determined by the quantization scheme.

As indicated in Section II, uniform quantization can be employed in the proposed RJSCC system. More precisely, it can be called Gaussian-optimized uniform quantization, as all subsources have now been reshaped into Gaussian sources. For an n-bit quantizer, consider the quantization step and the resulting quantization error.
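As a rough illustration of the Gaussian-optimized uniform quantizer just introduced, the sketch below evaluates the mean squared quantization error of an n-bit uniform quantizer on a zero-mean, unit-variance Gaussian source and searches for the best step. Several details are assumptions made for this sketch only: the midrise layout with reconstruction points at cell centers, the exhaustive grid search, the sign convention in quantize, and all function names.

```python
import math


def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cell_error(a, b, c):
    """Integral of (x - c)^2 * pdf(x) over [a, b]; b may be math.inf."""
    def antiderivative(x):
        if math.isinf(x):
            return 1.0 + c * c
        return (1.0 + c * c) * cdf(x) + (2.0 * c - x) * pdf(x)
    return antiderivative(b) - antiderivative(a)


def quantization_mse(step, n_bits):
    """MSE of an n-bit midrise uniform quantizer on a unit-variance Gaussian."""
    cells = 2 ** (n_bits - 1)          # magnitude cells on each side of zero
    total = 0.0
    for j in range(cells):
        lower = j * step
        upper = math.inf if j == cells - 1 else (j + 1) * step
        total += cell_error(lower, upper, (j + 0.5) * step)
    return 2.0 * total                 # the density is symmetric about zero


def best_step(n_bits, grid=0.001, max_step=4.0):
    """Exhaustive one-dimensional search for the MSE-minimizing step."""
    candidates = [grid * i for i in range(1, int(max_step / grid) + 1)]
    return min(candidates, key=lambda s: quantization_mse(s, n_bits))


def quantize(x, step, n_bits):
    """Sign-magnitude index: top bit is the sign, lower bits the magnitude.

    Assumption: a sign bit of 1 marks a negative sample.
    """
    cells = 2 ** (n_bits - 1)
    magnitude = min(int(abs(x) / step), cells - 1)
    sign = 1 if x < 0 else 0
    return (sign << (n_bits - 1)) | magnitude


step3 = best_step(3)
mse3 = quantization_mse(step3, 3)
```

A subsource with a different variance would be scaled to unit variance before quantization, which is one reason the variances appear in the side information. The only free parameter in this sketch is the step size.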
It is clear that such uniform quantization is a one-dimensional optimization problem: find the optimal quantization step so that the quantization error is minimized. For each quantization index, let the highest-order bit denote the sign bit and the remaining bits, from the 0th upward, denote magnitude bits. We can derive the bit-error sensitivity for all bits as (4). The details of the derivation of (2) and (4) can be found in [2]. Therefore, the remaining unknown parameters are the number of quantization bits for each subsource and the channel code rate for each bit of each subsource. The optimal values of these parameters can be obtained using the well-known bit-allocation algorithm of Westerink et al. [8] to minimize the total distortion.

We would like to point out that the RJSCC system shown in Fig. 2 is flexible. Various scalar quantization schemes and channel codes can be employed in the system without changing the analytical form of the R-D function.

IV. EXPERIMENTAL RESULTS

The monochrome Lena and Goldhill images with 8 bits/pixel are used as the test images. We use rate-compatible punctured convolutional (RCPC) codes [9] to provide UEP for coded image transmission, because RCPC codes can easily change coding rates without changing the basic codec structure. The available channel code rates of the RCPC codes (for the memory and puncture period used in our experiments) are {1/1, 8/9, 4/5, 2/3, 4/7, 1/2, 4/9, 4/10, 4/11, 1/3, 4/13, 4/15, 1/4}. Other channel-coding techniques, including combinations of several channel coding schemes, can also be applied.

A. Results with BSC

Table I shows the PSNR results over 20 trials of the three image-coding structures with RJSCC for the Lena image at 0.5 bpp. Notice that both average and worst PSNRs are shown to demonstrate the robustness of the proposed scheme over different trials. System C produces the best performance. This is because the System C coding structure introduces more subsources, so that bits can be better allocated between source coding and channel coding. The comparisons among A-RJSCC, A-OJSCC, and A-RQ [4], all using System A, are tabulated in Tables II and III.
A-RJSCC outperforms A-OJSCC in all cases. Comparing A-RJSCC with A-RQ, we find that A-RJSCC performs slightly worse than A-RQ in the noise-free case. However, A-RJSCC outperforms A-RQ by up to 5.86 dB under moderate and high BER channel conditions. The reconstructed images of Lena and Goldhill at 0.5 bpp with various channel BERs are shown in Fig. 3. These images show that the perceptual quality is still quite good for highly corrupted channels with BER = 0.1.

TABLE II. PSNR (IN DECIBELS) OF TRANSMITTING 512 × 512 LENA OVER BSC

TABLE III. PSNR (IN DECIBELS) OF TRANSMITTING 512 × 512 GOLDHILL OVER BSC

Fig. 3. Reconstructed images using A-RJSCC coded at 0.5 bpp. (a), (b) Lena at two channel BERs. (c), (d) Goldhill at two channel BERs.

Table IV summarizes the comparison of the proposed scheme with several of the best systems, reported in [4] (A-RQ), [6] [S/C-SUB(D)], and [10] (SPIHT/RCPC/CRC), respectively. The proposed C-RJSCC is better than S/C-SUB(D) and A-RQ, but worse than SPIHT/RCPC/CRC. We also show the results of RJSCC/CRC, where RCPC/CRC channel codes are adopted instead of RCPC codes. Notice that the SPIHT/RCPC/CRC scheme [10] is based on known and fixed BER channels. If there is a sufficient mismatch between the design BER and the actual BER, that system will perform poorly. The fixed-length coding in the RJSCC system does not produce catastrophic error propagation. Table V shows an example in which the channel mismatch is moderate: the actual BER ranges from 0.01 to 0.05, while all schemes use a design BER of 0.01. These results demonstrate that the C-RJSCC scheme degrades gracefully as the channel BER increases and is therefore resilient to channel mismatch. In such a case, C-RJSCC outperforms both schemes proposed in [10] and [11].

TABLE IV. COMPARISON OF DIFFERENT SYSTEMS FOR 512 × 512 LENA OVER BSC

TABLE V. PSNR FOR LENA AT 0.5 BPP WITH DESIGNED BER = 0.01

B. Results with AWGN

For AWGN channels, we use BPSK as the modulation scheme and 8-bit soft decision for the Viterbi algorithm. We have compared C-RJSCC with Ruf and Modestino's 1024-GG reported in [1], both with the same image-coding structure of System C. The results are shown in Table VI for coding rates of 1 and 0.5 bpp and for a range of channel SNRs. The experiments show that C-RJSCC outperforms 1024-GG by 0.42 to 0.73 dB. We also compare B-RJSCC with Ruf and Modestino's 16-UT reported in [1], both with the same image-coding structure of System B. The experiments show that B-RJSCC outperforms 16-UT by 2 to 3 dB.

C. Results with GEC

A particular GEC model is adopted to generate bursty errors. With the channel parameters chosen for this model, the average BER is 0.0248. For this channel, we compare three optimal bit-allocation strategies:

1) Strategy A: The optimization is based on our proposed R-D function shown in (2).

2) Strategy B: The channel is treated as a BSC with the average BER of the GEC model, and the optimization is based on the R-D function shown in (1).

3) Strategy C: The channel is treated as a BSC with the BER of the Bad state, and the optimization is based on the R-D function shown in (1).

Table VII shows the PSNR results of Lena coded at 0.5 bpp by A-RJSCC and C-RJSCC for the different strategies. The results show that the optimal design based on the fully considered R-D function, Strategy A, outperforms the other two strategies based on the simple R-D function, not only on average but also in the worst case among 20 trials. This demonstrates that our proposed R-D function better characterizes wireless channels, because the channel transition probability is incorporated.

D. Side Information

Certain side information must be reliably transmitted, including the mean of the LFS and the variance of each subsource. Suppose we use 16 bits to quantize each of the mean and the variance of the LFS and 8 bits to quantize the variance of each other subsource. The overhead is then about 128 bits for System A, or less than 0.0005 bpp. For System C, the overhead is about 0.0313 bpp. In this research, we have assumed that the overhead can be reliably transmitted over the noisy channel with appropriate channel coding. Compared with Ruf and Modestino's OJSCC system [1], the side information is reduced.
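The overhead figures above follow from simple counting. The sketch below assumes that the LFS mean and the LFS variance each take 16 bits, that every other subsource contributes one 8-bit variance, and that the image is 512 × 512; the function names are hypothetical.

```python
def side_info_bits(num_subsources, lfs_bits=16, other_bits=8):
    """Overhead in bits: LFS mean and variance, plus one variance per other subsource."""
    return 2 * lfs_bits + other_bits * (num_subsources - 1)


def overhead_bpp(num_subsources, width=512, height=512):
    """Side-information overhead expressed in bits per pixel."""
    return side_info_bits(num_subsources) / (width * height)


system_a_bits = side_info_bits(13)      # System A: 13 subsources
system_a_bpp = overhead_bpp(13)
system_c_bpp = overhead_bpp(1024)       # System C: 1024 subsources
```

Under these assumptions the count gives 128 bits for System A and roughly 0.0313 bpp for System C, in line with the figures quoted in this subsection. The saving relative to [1] comes from the fields that are absent from this count.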
The reduction arises because the shape factor of each subsource must be sent as overhead in the scheme of [1]. In the case of RJSCC, however, all subsources become Gaussian distributed after all-pass filtering, so there is no need to send shape factors.

TABLE VI. PSNR RESULTS FOR LENA OVER AWGN CHANNELS

TABLE VII. PSNR RESULTS FOR LENA AT 0.5 BPP OVER THE BURSTY CHANNEL

V. CONCLUSION

In this paper, a robust joint source-channel coding scheme is proposed for transmitting images over wireless channels. Experimental results show that RJSCC is able to achieve very good PSNR performance and perceptual quality with modest complexity. We also show that the proposed scheme is more robust to channel mismatches. This is particularly useful because wireless channels often fluctuate over a modest range of channel conditions due to fading and multipath.

REFERENCES

[1] M. J. Ruf and J. W. Modestino, "Operational rate-distortion performance for joint source and channel coding of images," IEEE Trans. Image Processing, vol. 8, pp. 305-320, Mar. 1999.
[2] J. Cai and C. W. Chen, "Operational rate-distortion design for joint source-channel coding over noisy channels," in Proc. IEEE WCNC '99, New Orleans, LA, Sept. 1999.
[3] H. S. Wang and N. Moayeri, "Finite-state Markov channel - A useful model for radio communication channels," IEEE Trans. Veh. Technol., vol. 44, pp. 163-171, Feb. 1995.
[4] Q. Chen and T. R. Fischer, "Image coding using robust quantization for noisy digital transmission," IEEE Trans. Image Processing, vol. 7, pp. 496-505, Apr. 1998.
[5] N. Farvardin and V. Vaishampayan, "Optimal quantizer design for noisy channels: An approach to combined source-channel coding," IEEE Trans. Inform. Theory, vol. 33, pp. 827-838, Nov. 1987.
[6] N. Tanabe and N. Farvardin, "Subband image coding using entropy-coded quantization over noisy channels," IEEE J. Select. Areas Commun., vol. 10, pp. 926-943, June 1992.
[7] A. C. Popat and K. Zeger, "Robust quantization of memoryless sources using dispersive FIR filters," IEEE Trans. Commun., vol. 40, pp. 1670-1674, Nov. 1992.
[8] R. H. Westerink, J. Biemond, and D. E. Boekee, "An optimal bit allocation algorithm for subband coding," in Proc. ICASSP '88, 1988, pp. 757-760.
[9] J. Hagenauer, "Rate-compatible punctured convolutional codes and their applications," IEEE Trans. Commun., vol. 36, pp. 389-400, Apr. 1988.
[10] P. G. Sherwood and K. Zeger, "Progressive image coding for noisy channels," IEEE Signal Processing Lett., vol. 4, pp. 189-191, July 1997.
[11] H. Li and C. W. Chen, "Bi-directional synchronization and hierarchical error correction for robust image transmission," in Proc. SPIE Visual Communication and Image Processing '99, San Jose, CA, Jan. 1999, pp. 63-72.