Discriminatory Lossy Source Coding: Side Information Privacy


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 59, NO. 9, SEPTEMBER 2013, p. 5665

Discriminatory Lossy Source Coding: Side Information Privacy

Ravi Tandon, Member, IEEE, Lalitha Sankar, Member, IEEE, and H. Vincent Poor, Fellow, IEEE

Abstract: A lossy source coding problem is studied in which a source encoder communicates with two decoders, one with and one without correlated side information, under an additional constraint on the privacy of the side information at the uninformed decoder. Two cases of this problem arise depending on the availability of the side information at the encoder. The set of all feasible rate-distortion-equivocation tuples is characterized for each case. The difference between the informed and uninformed cases, and the advantages of encoder side information for enhancing privacy, are highlighted for a binary symmetric source with erasure side information and Hamming distortion.

Index Terms: Discriminatory coding, equivocation, Heegard–Berger problem, information privacy, informed and uninformed encoders, Kaspi problem, lossy source coding, side information.

I. INTRODUCTION

INFORMATION sources often need to be made accessible to multiple legitimate users simultaneously, some of whom may have correlated side information obtained from other sources or from prior interactions. A natural question arises in this context: can the source publish (encode) its data in a discriminatory manner such that the uninformed user does not infer the side information, i.e., it is kept private, while still providing utility (fidelity) to both users? Two cases can arise depending on whether the encoder is informed or uninformed, i.e., whether or not it has access to the correlated side information.
This question was addressed from a strictly rate-fidelity viewpoint by Heegard and Berger in [1], henceforth referred to as the Heegard–Berger problem, for the uninformed case, and by Kaspi [2], henceforth referred to as the Kaspi problem, for the informed case; in both works the corresponding rate-distortion functions for a discrete and memoryless source pair were determined. Using equivocation as the privacy metric, we address the question posed above using the source network models of [1] and [2] with an additional constraint on the side information privacy at the decoder without access to it, i.e., decoder 1 (see Fig. 1).

Manuscript received May 26, 2011; revised May 28, 2012; accepted February 18, 2013. Date of publication April 23, 2013; date of current version August 14, 2013. This work was supported in part by the National Science Foundation under Grants CNS-09-05086, CNS-09-05398, and CCF-10-16671, in part by the Air Force Office of Scientific Research under Grant FA9550-09-1-0643, and in part by a fellowship from the Princeton University Council on Science and Technology. This paper was presented at the 2011 IEEE Global Communications Conference. R. Tandon was with Princeton University, Princeton, NJ 08544 USA. He is now with the Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA 24061 USA (e-mail: tandonr@vt.edu). L. Sankar was with Princeton University, Princeton, NJ 08544 USA. She is now with the Department of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, AZ 85287 USA (e-mail: lalithasankar@asu.edu). H. V. Poor is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: poor@princeton.edu). Communicated by S. Diggavi, Associate Editor for Shannon Theory. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIT.2013.2259613

Fig. 1. Source network model.
When additional privacy/security constraints are included, it is not clear a priori that the same coding schemes as for the original Heegard–Berger and Kaspi problems can achieve the set of all rate-distortion-equivocation tuples. We prove here that the encoding scheme for the Heegard–Berger problem achieves the set of all feasible rate-equivocation pairs for the desired fidelity requirements at the two decoders. Informally speaking, the Heegard–Berger coding scheme combines a rate-distortion code with a conditional quantize-and-bin code, which is revealed to both decoders. Our proof exploits the fact that, conditioned on what is decodable by decoder 1, i.e., the rate-distortion code, the additional information intended for decoder 2, i.e., the conditional quantize-and-bin bin index, is asymptotically independent of the side information (see Fig. 1). Observing that the generation of the conditional quantize-and-bin bin index is analogous to the Slepian–Wolf binning scheme, we prove this independence property for both Slepian–Wolf [3] and Wyner–Ziv [4] encoding. Next, we prove a similar independence property for the Heegard–Berger coding scheme, which in turn allows us to demonstrate the optimality of this scheme for the problem studied in this paper. While this orthogonality has been alluded to in [4], we present formal proofs here for all the encoding schemes considered in this paper. On the other hand, for the informed encoder case, we present a modified coding scheme (vis-à-vis the Kaspi scheme) which achieves the set of all feasible rate-equivocation pairs for the desired fidelity requirements at the two decoders. The Kaspi coding scheme exploits the encoder side information (see Fig. 1) via a combination of a rate-distortion code, intended for decoder 1, and a conditional rate-distortion code, intended for decoder 2, which is then revealed to both decoders.

However, for ease of proving both the orthogonality of the encoder index intended for decoder 2 from the side information and the equivocation computation, we present a two-step encoding scheme in which the first step is the same as in the Kaspi problem, while in the second step we first choose the codeword intended for decoder 2 and then bin it. We prove that the resulting conditional bin index is asymptotically independent of the side information. While we do not prove it here, it is worth noting that the conditional rate-distortion index of the Kaspi encoding scheme should also be orthogonal to the side information. The last part of our paper focuses on a specific source model: a binary equiprobable source with erased side information (with erasure probability p_e) and Hamming distortion constraints. For this source pair, we focus on the rate-distortion-equivocation (RDE) tradeoffs for both the uninformed and informed cases. For the uninformed encoder case, we prove that the maximal equivocation is independent of the fidelity requirement at decoder 2, i.e., the only information leaked about the side information is a direct consequence of the distortion requirement at decoder 1. We also explicitly characterize the RDE tradeoff for this problem over the space of all achievable distortion pairs. Our results clearly demonstrate the optimality of the Heegard–Berger encoding scheme from both rate and equivocation standpoints. In contrast, for the informed encoder case, we explicitly demonstrate the usefulness of encoder side information. We first prove that the set of distortion pairs for which perfect equivocation is achievable at decoder 1 is strictly larger than that for the uninformed case.
We prove this by showing that the informed encoder uses the side information via a single description which satisfies the distortion constraints at both decoders while simultaneously achieving perfect privacy at decoder 1. Furthermore, we also demonstrate that access to side information leads to a tradeoff between rate and equivocation: to guarantee a desired equivocation, the minimal rate required can be strictly larger than the rate-distortion function for the original Kaspi problem. It is worth noting that while the Heegard–Berger and Kaspi rate-distortion functions have been explicitly evaluated in [5] and [6] for the binary source model, our main focus here is on understanding the effect of an additional privacy constraint on the rate-distortion tradeoffs with and without encoder side information. Our approach thus enables us to identify the cases in which (e.g., the Kaspi problem with privacy constraints) the rate-distortion region differs from that of the original problem (without privacy constraints). The problem of source coding with equivocation constraints has gained attention recently (e.g., [7]–[19]), as have problems of generating and sharing secret keys over noisy or noiseless channels (e.g., [16], [19], [20] and the references therein) when side information is available at one or more decoders. In contrast to these papers, in which the focus is on an external eavesdropper, we address the problem of privacy leakage to a legitimate user, i.e., we seek to understand whether the encoding at the source can discriminate between legitimate users with and without access to correlated side information.

The paper is organized as follows. In Section II, we present the system model. In Section III, we first prove the asymptotic independence of the bin index and the decoder side information in the Slepian–Wolf and Wyner–Ziv source coding problems.
Subsequently, in Section IV, we establish the rate-equivocation tradeoff regions for both the uninformed and informed cases. In Section V, we characterize the achievable RDE tradeoff for a specific source pair (X, Y) in which X is binary and Y results from passing X through an erasure channel. We conclude in Section VI.

II. SYSTEM MODEL

We consider a source network with a single encoder which observes all (informed case) or a part (uninformed case) of a discrete, memoryless bivariate source (X^n, Y^n) and communicates a description of X^n over a finite-rate link to decoders 1 and 2, which reconstruct X^n at distortions D_1 and D_2, respectively; decoder 2 has access to Y^n, and an equivocation about Y^n is required at decoder 1. The network is shown in Fig. 1, where the two cases with and without side information at the encoder correspond to the switch being in the closed and open positions, respectively. Without the equivocation constraint at decoder 1, the problems with the switch in the open and closed positions are the Heegard–Berger and Kaspi problems, for which the sets of feasible (R, D_1, D_2) tuples were characterized by Heegard and Berger [1] and Kaspi [2], respectively. We seek to characterize the set of all achievable (R, D_1, D_2, E) tuples for both problems.

Formally, let p(x, y) denote the bivariate source with random variables X and Y. Furthermore, let X̂_1 and X̂_2 denote the reconstruction alphabets at decoders 1 and 2, respectively, and let d_1 and d_2 be the distortion measures associated with the reconstruction of X at decoders 1 and 2, respectively. Let s take the values 0 and 1 to denote the open and closed switch positions, respectively. An (n, 2^{nR}) code for this network consists of an encoder and two decoders. The expected distortion at decoder j, for j = 1, 2, is E[d_j(X^n, X̂_j^n)], and the equivocation rate is (1/n)H(Y^n | J), where J = f(X^n) for the case of an uninformed encoder and J = f(X^n, Y^n) for the case of an informed encoder.

Definition 1: The rate-distortion-equivocation tuple (R, D_1, D_2, E) is achievable for the above source network if there exists an (n, 2^{nR}) code with

  (1/n) log |J| <= R + epsilon,  (1)
  E[d_j(X^n, X̂_j^n)] <= D_j + epsilon, j = 1, 2,  (2), (3)
  (1/n) H(Y^n | J) >= E - epsilon  (4)

for all epsilon > 0 and sufficiently large n. Let R denote the set of all achievable (R, D_1, D_2, E) tuples. We assume that the code is known at both the encoder (source) and the decoders (users).

III. RELATED OBSERVATIONS

In order to develop the main results of our paper, we present some related observations for specific well-known source coding problems. We specifically focus on two such problems, namely the Slepian–Wolf [3] and Wyner–Ziv [4] problems, in which the decoder has access to side information correlated with the sequence observed at the encoder. In the context of lossless source coding, Slepian and Wolf [3] studied the problem of losslessly communicating one part X of a bivariate source (X, Y) to a single decoder which has access to Y, and proved that a minimal rate of H(X|Y) is needed. On the other hand, [4] studied the problem of lossily communicating a part X of a bivariate source, subject to a fidelity criterion, to a single decoder which has access to Y, and proved that the minimum rate is

  R_WZ(D) = min I(X; U | Y),

where the minimization is over all distributions p(u|x) and deterministic functions g such that U - X - Y forms a Markov chain and E[d(X, g(U, Y))] <= D. In both of the aforementioned problems, the coding index communicated is chosen to exploit the side information at the decoder. In the lemmas that follow, we prove that in both cases the optimal encoding is such that the coding index is asymptotically independent of the side information at the decoder.

A. Slepian–Wolf Coding: Independence of Bin Index and Side Information

Consider a pair of independent and identically distributed (i.i.d.) source sequences (X^n, Y^n) generated according to p(x, y). In the Slepian–Wolf problem, the encoder sends an index J which is a function of X^n. It is required that X^n be recovered losslessly from (J, Y^n) at the decoder. This requirement, along with Fano's inequality, implies that

  H(X^n | J, Y^n) <= n epsilon_n,  (5)

where epsilon_n -> 0 as n -> infinity. Using this property, we shall prove the following lemma.
Lemma 1: For a sequence of Slepian–Wolf codes with rates not exceeding H(X|Y) + delta, for any delta > 0, and with asymptotically vanishing error probability at the decoder, we have

  lim_{n -> infinity} (1/n) I(J; Y^n) <= delta.

Proof: We have the following sequence of steps:

  I(J; Y^n) = H(J) - H(J | Y^n)
  <= n(H(X|Y) + delta) - H(J | Y^n)
  = n(H(X|Y) + delta) - H(J, X^n | Y^n) + H(X^n | J, Y^n)
  <= n(H(X|Y) + delta) - H(X^n | Y^n) + n epsilon_n
  = n(delta + epsilon_n),  (10)

where (10) follows from (5). Normalizing (10) by n and taking n -> infinity, we have the proof of the lemma.

Remark 1: Lemma 1 captures the intuition that it suffices to encode only that part of X^n that is asymptotically independent of the decoder side information Y^n. Furthermore, zero leakage can be approached by choosing delta to be arbitrarily small.

B. Wyner–Ziv Coding: Independence of Bin Index and Side Information

Consider a pair of i.i.d. sources (X^n, Y^n) generated according to p(x, y). In the Wyner–Ziv problem, the encoder sends an output J which is a function of X^n. It is required that X^n be recovered in a lossy fashion from (J, Y^n). Here, instead of Fano's inequality, we use the exact converse proof of Wyner and Ziv [4], in which it is shown that

  H(J | Y^n) >= n R_WZ(D).  (11)

We shall now prove the following claim.

Lemma 2: For a sequence of Wyner–Ziv codes with rates not exceeding R_WZ(D) + delta, we have

  lim_{n -> infinity} (1/n) I(J; Y^n) <= delta.

Proof: We have the following inequalities:

  I(J; Y^n) = H(J) - H(J | Y^n)
  <= n(R_WZ(D) + delta) - H(J | Y^n)
  <= n(R_WZ(D) + delta) - n R_WZ(D)
  = n delta,

where the last inequality follows from (11), as in the converse of Wyner and Ziv [4]. Normalizing by n and taking n -> infinity, we have the proof of the lemma. One can choose delta arbitrarily small to achieve near-zero leakage of side information.

In the next two subsections, we illustrate these properties for two related encoding schemes, which act as key ingredients of the equivocation proofs for the problems considered in this paper.

C. Quantize and Bin: Uninformed Encoder Case

We first present an encoding scheme in which a source sequence X^n is quantized to a sequence U^n. The encoder then transmits a bin index so that, using the bin index and the side information Y^n, the quantized sequence can be losslessly reconstructed. For a class of such encoding schemes (described below), we show that the bin index is asymptotically independent of the side information.
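The independence guaranteed by Lemmas 1 and 2 is asymptotic. The following sketch computes the per-symbol leakage I(J; Y^n)/n exactly at a small blocklength for purely random binning of a doubly symmetric binary source; the source parameters, blocklength, and rate margin are our own illustrative choices, not values from the paper, and random binning here stands in for an optimized Slepian–Wolf code.

```python
# Exact per-symbol leakage I(J; Y^n)/n for random binning of a doubly
# symmetric binary source: X uniform, Y = X xor Bern(q).  Lemma 1 says the
# leakage vanishes as n grows when the rate approaches H(X|Y); at a small
# blocklength it is still visibly positive.
import itertools
import math

import numpy as np

rng = np.random.default_rng(0)
n, q = 10, 0.11                     # blocklength and crossover probability

def hb(p):
    """Binary entropy in bits."""
    if p == 0 or p == 1:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

rate = hb(q) + 0.1                  # H(X|Y) + delta, as in Lemma 1
M = 2 ** math.ceil(n * rate)        # number of bins
seqs = np.array(list(itertools.product([0, 1], repeat=n)))
bins = rng.integers(M, size=len(seqs))          # random bin index J(x^n)

# p(y^n | x^n) depends only on the Hamming distance between the sequences.
dist = (seqs[:, None, :] != seqs[None, :, :]).sum(axis=2)
p_y_given_x = q ** dist * (1 - q) ** (n - dist)

p_x = np.full(len(seqs), 2.0 ** -n)
p_jy = np.zeros((M, len(seqs)))     # joint distribution of (J, Y^n)
for i, j in enumerate(bins):
    p_jy[j] += p_x[i] * p_y_given_x[i]

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

H_J = entropy(p_jy.sum(axis=1))
H_Y = entropy(p_jy.sum(axis=0))
H_JY = entropy(p_jy.ravel())
leak = (H_J + H_Y - H_JY) / n       # I(J; Y^n)/n in bits per symbol
print(f"H(X|Y) = {hb(q):.3f}, rate = {rate:.3f}, leakage = {leak:.3f}")
```

At n = 10 the computed leakage need not be small; the lemma only bounds it by delta in the blocklength limit, which the sketch makes easy to explore by varying n.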
More formally, consider an input distribution p(u|x):

i.e., U - X - Y forms a Markov chain. The encoder generates M_1 = 2^{n(I(X;U) + delta)} i.i.d. sequences, enumerated as U^n(1), ..., U^n(M_1), where each sequence is generated according to the product distribution governed by the marginal p(u). The encoder partitions the space of typical X^n-sequences into M_1 disjoint regions, denoted by B(1), ..., B(M_1), selected so that each X^n in B(m) is jointly typical with U^n(m) and Pr{X^n not in the union of the B(m)} <= epsilon_n, where epsilon_n -> 0 as n -> infinity [21]. Note that there is a one-to-one correspondence between the codewords U^n(m) and the partition regions B(m). Next, for ease of notation, denote M_2 = 2^{n(I(U;X|Y) + 2 delta)}. The encoder bins the M_1 sequences into M_2 bins as follows: upon observing X^n, the encoder finds the partition B(m) in which the sequence falls and transmits the bin index J corresponding to the sequence U^n(m). This encoding scheme implies the decodability of the sequence U^n(m) as follows: upon receiving the bin index, the uncertainty at the decoder about U^n(m) is reduced. In particular, having the bin index, it knows that there are only M_1/M_2 = 2^{n(I(U;Y) - delta)} possible sequences that could have resulted in this bin index. It then uses joint typicality decoding with Y^n to decode the correct sequence (the probability of decoding error goes to zero as n -> infinity by standard arguments, as in the channel coding theorem [23]). The above argument and Fano's inequality imply that for this quantize-and-bin encoding scheme the following holds:

  H(U^n(m) | J, Y^n) <= n epsilon_n,  (16)

where epsilon_n -> 0 as n -> infinity. We now state our main claim in the following lemma.

Lemma 3: For the quantize-and-bin scheme presented above, we have

  lim_{n -> infinity} (1/n) I(J; Y^n) <= 2 delta.

Proof: We prove this lemma by the following sequence of inequalities:

  I(J; Y^n) = H(J) - H(J | Y^n)
  <= n(I(U;X|Y) + 2 delta) - H(J | Y^n)
  <= n(I(U;X|Y) + 2 delta) - H(U^n(m) | Y^n) + H(U^n(m) | J, Y^n)
  <= n(I(U;X|Y) + 2 delta) - H(U^n(m) | Y^n) + n epsilon_n,  (21)

where (21) follows from (16). Normalizing by n and taking n -> infinity, we have the proof of the lemma. To complete the proof, we prove

  (1/n) H(U^n(m) | Y^n) >= I(U;X|Y) - delta_n,  (25)

with delta_n -> 0 as n -> infinity, in Appendix A.

D. Quantize and Bin: Informed Encoder Case

We now present the informed encoder case, in which both X^n and Y^n are available at the encoder. The encoder jointly compresses them to a sequence U^n and transmits a bin index to the decoder. It is required that, using the bin index and the side information Y^n, the sequence U^n can be losslessly reconstructed.
For this encoding scheme, we show that the bin index is asymptotically independent of the side information. More formally, consider an arbitrary input distribution p(u|x, y). The encoder generates M_1 = 2^{n(I(U;X,Y) + delta)} i.i.d. sequences, enumerated as U^n(1), ..., U^n(M_1), where each sequence is generated according to the product distribution governed by p(u). The encoder partitions the space of typical (X^n, Y^n)-pairs into M_1 disjoint regions, denoted by B(1), ..., B(M_1), selected so that each pair in B(m) is jointly typical with U^n(m) and the probability of falling outside the union of the B(m) is at most epsilon_n. Note that there is a one-to-one correspondence between the codewords U^n(m) and the partition regions B(m). As in the previous subsection, denote M_2 = 2^{n(I(U;X|Y) + 2 delta)}. The encoder bins the M_1 sequences into M_2 bins as follows: upon observing (X^n, Y^n), the encoder finds the partition B(m) in which the pair falls and transmits the bin index J corresponding to the sequence U^n(m). This encoding scheme implies the decodability of the sequence U^n(m) as follows: upon receiving the bin index, the decoder's uncertainty about U^n(m) is reduced. In particular, having the bin index, it knows that there are only M_1/M_2 possible sequences that could have resulted in this bin index. It then uses joint typicality decoding with Y^n to decode the correct sequence (the probability of decoding error goes to zero as n -> infinity by standard arguments, as in the channel coding theorem [23]). The above argument and Fano's inequality imply that for this quantize-and-bin encoding scheme the following holds:

  H(U^n(m) | J, Y^n) <= n epsilon_n,  (27)

where epsilon_n -> 0 as n -> infinity. We now state the following lemma.

Lemma 4: For the quantize-and-bin scheme with encoder side information presented above, we have

  lim_{n -> infinity} (1/n) I(J; Y^n) <= 2 delta.

The proof of this lemma follows from that of Lemma 3 by replacing X^n with (X^n, Y^n).
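The binning used throughout this section can be made concrete with a linear-code toy example in which the syndrome of a Hamming code serves as the bin index: 3 syndrome bits replace 7 source bits whenever the side information differs from the source in at most one position. The [7,4] Hamming code and the one-flip side information model are our own illustrative choices, not constructions from this paper.

```python
# Toy Slepian-Wolf binning via syndromes of the [7,4] Hamming code.
# The bin index of x is its 3-bit syndrome H x^T; with side information y
# differing from x in at most one position, the decoder recovers x exactly
# from (bin index, y).
import itertools

import numpy as np

# Parity-check matrix of the [7,4] Hamming code; column i is the binary
# expansion of i + 1, so a single flip is located by its syndrome.
H = np.array([[int(b) for b in format(i + 1, "03b")] for i in range(7)]).T

def bin_index(x):
    """Bin index (syndrome) of a length-7 binary word."""
    return tuple(H @ x % 2)

def decode(s, y):
    """Recover x from its bin index s and side information y (<= 1 flip)."""
    e_syn = (np.array(s) + H @ y) % 2           # syndrome of the error y - x
    x_hat = y.copy()
    if e_syn.any():                             # nonzero syndrome: one flip
        pos = int("".join(map(str, e_syn)), 2) - 1
        x_hat[pos] ^= 1
    return x_hat

# Exhaustive check over all x and all side information with <= 1 flip.
ok = True
for bits in itertools.product([0, 1], repeat=7):
    x = np.array(bits)
    for flip in range(-1, 7):                   # -1 means "no flip"
        y = x.copy()
        if flip >= 0:
            y[flip] ^= 1
        ok &= bool((decode(bin_index(x), y) == x).all())
print(ok)  # True: the 3-bit bin index suffices in place of all 7 bits
```

The syndrome plays exactly the role of the bin index J above: it is a many-to-one function of the source word, and the side information singles out the correct word within the bin.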

IV. MAIN RESULTS

A. Uninformed Encoder With Side Information Privacy

Theorem 1: For a fixed target distortion pair (D_1, D_2), the set of achievable (R, E) tuples is given by

  R >= I(X; U_1) + I(X; U_2 | U_1, Y),  (28a)
  E <= H(Y | U_1)  (28b)

for some distribution p(u_1, u_2 | x) such that there exist functions X̂_1(U_1) and X̂_2(U_1, U_2, Y) for which E[d_j(X, X̂_j)] <= D_j, for j = 1, 2.

We prove Theorem 1 in Appendix B. Here, we present a sketch of the achievability proof. The main idea is to show that the Heegard–Berger encoding scheme achieves the rate-equivocation tradeoff. In particular, the encoding used in the Heegard–Berger scheme is as follows: the source X^n is quantized to (U_1^n, U_2^n) at a total rate of I(X; U_1) + I(X; U_2 | U_1). A useful way to interpret this is as follows: the rate of the first quantization is I(X; U_1), and the rate of the second layer conditioned on the first is I(X; U_2 | U_1). The encoder sends the first quantization uncoded, which requires a rate of I(X; U_1), and it is received by both decoders. Recall that we are interested in the leakage of Y^n at decoder 1. Hence, the per-symbol uncertainty of Y^n given the first layer is H(Y | U_1). To transmit the second quantization layer, the encoder performs binning (as in the quantize-and-bin scheme with an uninformed encoder) to reduce the rate of transmission to I(X; U_2 | U_1, Y). From Lemma 3, if U_2^n can be losslessly reconstructed at the decoder, then the bin index is asymptotically independent of the decoder side information. This is the key to the equivocation proof: the second part of the encoder output does not leak any information at decoder 1. Hence, the per-symbol uncertainty about Y^n given the encoder output is H(Y | U_1).

B. Informed Encoder With Side Information Privacy

Theorem 2: For a fixed target distortion pair (D_1, D_2), the set of achievable (R, E) tuples is given by

  R >= I(X, Y; U_1) + I(X, Y; U_2 | U_1, Y),  (29a)
  E <= H(Y | U_1)  (29b)

for some distribution p(u_1, u_2 | x, y) such that there exist functions X̂_1(U_1) and X̂_2(U_1, U_2, Y) for which E[d_j(X, X̂_j)] <= D_j, for j = 1, 2.

The proof of Theorem 2 follows from that of Theorem 1 by replacing X with (X, Y). Here, we present a sketch of the achievability proof. The main idea is to show that the Kaspi encoding scheme achieves the rate-equivocation tradeoff.
In particular, the encoding used in the Kaspi scheme is as follows: the source pair (X^n, Y^n) is quantized to (U_1^n, U_2^n) at a total rate of I(X, Y; U_1) + I(X, Y; U_2 | U_1). The difference from the previous uninformed case is that the encoder also has access to Y^n. A useful way to interpret this is as follows: the rate of the first quantization is I(X, Y; U_1), and the rate of the second layer conditioned on the first is I(X, Y; U_2 | U_1). The encoder sends the first quantization uncoded, which requires a rate of I(X, Y; U_1), and it is received by both decoders. Recall that we are interested in the leakage of Y^n at decoder 1. Hence, the per-symbol uncertainty of Y^n given the first layer is H(Y | U_1). To transmit the second quantization layer, the encoder performs binning (as in the quantize-and-bin scheme with an informed encoder) to reduce the rate of transmission to I(X, Y; U_2 | U_1, Y). From Lemma 4, if U_2^n can be losslessly reconstructed at the decoder, the bin index is asymptotically independent of the decoder side information. Hence, the per-symbol uncertainty about Y^n given the encoder output is H(Y | U_1).

V. RESULTS FOR A BINARY SOURCE WITH ERASED SIDE INFORMATION

In this section, we illustrate the main results of Theorems 1 and 2. In particular, we consider the following pair of correlated sources: X is binary and uniform, and

  Y = X w.p. 1 - p_e,  Y = e (erasure) w.p. p_e,

and we consider the Hamming distortion metric for both decoders and for both the informed and uninformed cases.

A. Uninformed Case

We are interested in the RDE tradeoff, given as

  R(D_1, D_2) = min I(X; U_1) + I(X; U_2 | U_1, Y),  (30)
  E(D_1, D_2) = max H(Y | U_1),  (31)

where the rate and equivocation computations are over all random variables (U_1, U_2) that satisfy the Markov chain relationship (U_1, U_2) - X - Y and for which there exist functions X̂_1(U_1) and X̂_2(U_1, U_2, Y) satisfying

  E[d_1(X, X̂_1)] <= D_1,  (32)
  E[d_2(X, X̂_2)] <= D_2.  (33)

Let h_b(.) denote the binary entropy function, h_b(x) = -x log x - (1 - x) log(1 - x), defined for x in [0, 1]. The (D_1, D_2) region for this case is partitioned into four regimes, as shown in Fig. 2, and the RDE tradeoff is given in closed form separately in each regime. In Fig. 3, we have plotted the resulting rate and equivocation for representative parameter values.

Remark 2: This example shows that the equivocation does not depend on the distortion achieved by decoder 2, which has

access to side information, but rather depends only on the distortion achieved by the uninformed decoder 1.

Fig. 2. Partition of the (D_1, D_2) region: uninformed encoder case.

Fig. 3. Illustration of the rate-equivocation tradeoff.

1) Upper Bound on E: For one range of distortion pairs, we use the trivial upper bound in (34)-(35). For the remaining distortion pairs, we use the sequence of steps (36a)-(36j), where (36d) follows from a direct verification that the step holds whenever X is uniform, Y is an erased version of X, and (U_1, U_2) - X - Y forms a Markov chain.

2) Lower Bound on R: 1)-3) In the first three regimes, we use the corresponding known lower bounds (the bound for the second regime is from [22]). 4) In the fourth regime, we show that (37) holds. To this end, consider an arbitrary (U_1, U_2) such that (U_1, U_2) - X - Y is a Markov chain and there exist functions X̂_1 and X̂_2 satisfying the distortion constraints. Now consider the sequence of equalities in (38a). For the first term appearing in (38a), we have the steps (39a)-(39g); we also have (40a)-(40d), which together imply (41).

Now consider the following sequence of inequalities for the last term in (38a): the chain (42a)-(42f), where (42) follows from (41). Using (39g) and (42), we can lower bound (38) and arrive at (37).

3) Coding Scheme: 1) In the first regime, the tradeoff is trivial. 2) In the second regime, we use the following coding scheme: the encoder sends only one description U, with the quantization noise independent of X; it can be verified that this meets the stated rate. Decoder 2 estimates X by X̂_2 as follows: X̂_2 = Y if Y is not erased, and X̂_2 = U if Y = e; the resulting distortion at decoder 2 meets its constraint. 3) In the third regime, we use the following coding scheme: the encoder again sends only one description U, with the quantization noise independent of X; it can be verified that this meets the stated rate. Decoder 1 estimates X as X̂_1 = U, which leads to a distortion of D_1. Decoder 2 estimates X by X̂_2 as follows: X̂_2 = Y if Y is not erased, and X̂_2 = U if Y = e. Hence, as long as the resulting distortion does not exceed D_2, the fidelity requirement of decoder 2 is satisfied. 4) In the fourth regime, we use the following coding scheme: we select two descriptions U_1 and U_2, whose quantization noises are independent of each other and also independent of X. At the uninformed decoder, the estimate is X̂_1 = U_1, so that the desired distortion D_1 is achieved. At the decoder with side information, the estimate is X̂_2 = Y if Y is not erased, and X̂_2 = U_2 if Y = e. It is straightforward to check that the rate required by this scheme matches the stated lower bound on R and that the distortion constraints are met. This completes the proof of the achievability part.

Fig. 4. Partition of the (D_1, D_2) region: informed encoder case.

B. Informed Encoder

For this case, the RDE tradeoff is given by (43) and (44), where the joint distribution of (U_1, U_2) with (X, Y) can be arbitrary. As in the previous subsection, we partition the space of admissible distortion pairs; for simplicity, we denote these partitions by (45)-(49). These partitions are illustrated in Fig. 4.
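For intuition about these distortion regimes, it helps to recall the conditional rate-distortion function of this source pair when the side information is available to both encoder and decoder: rate is spent only on the p_e-fraction of erased symbols, on which a Bern(1/2) source is described at distortion D/p_e. This is a standard computation offered for intuition, not one of the paper's regime formulas.

```python
# Conditional rate-distortion function of a uniform binary source given
# erased side information at both terminals:
#   R_{X|Y}(D) = p_e * (1 - h_b(D / p_e))  for 0 <= D <= p_e / 2,
# and zero beyond, since guessing on erasures already achieves D = p_e / 2.
import math

def hb(p):
    """Binary entropy in bits."""
    if p == 0 or p == 1:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def cond_rd(D, p_e):
    """Conditional rate-distortion R_{X|Y}(D) in bits per source symbol."""
    if D >= p_e / 2:
        return 0.0                  # zero rate: guess on erasures
    return p_e * (1 - hb(D / p_e))

p_e = 0.3
for D in (0.0, 0.05, 0.10, 0.15):
    print(f"D = {D:.2f}: R = {cond_rd(D, p_e):.4f}")
```

At D = 0 the formula recovers the Slepian–Wolf rate H(X|Y) = p_e, and it decreases to zero at D = p_e/2, bracketing the regimes in which the distortion constraints are active.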
We provide a partial characterization of the optimal tradeoff as a function of the distortion pair. In particular, we establish a tight characterization for all values of the distortion pair, with one exception noted below. This characterization reveals the benefit of the encoder side-information: for a given rate, to achieve the same distortion pair, the maximum possible equivocation is in general larger for the informed case than for the uninformed case.

1) In the first regime, the region is trivial, since both decoders can satisfy their distortion constraints in a way that also yields the maximum equivocation; i.e., we have (50) and (51).

2) In the second regime, we use the same proof as in the corresponding partition of the uninformed case to show (52) and (53).

Decoder 2 forms its estimate using its side information when it is not erased and the description otherwise, which yields the stated distortion at decoder 2. Decoder 1 forms its estimate directly from the description, which yields the stated distortion; therefore, the fidelity requirements are met as long as the stated condition holds.

Fig. 5. Illustration of the conditional distribution used in this regime.

3) The tradeoff for this case is given by (54) and (55). This case differs from the uninformed encoder case in that, for the same rate, we can achieve the maximum equivocation together with a nontrivial distortion for decoder 1. The converse proof is straightforward; the interesting aspect of this regime is the coding scheme, which utilizes the side information at the encoder in a nontrivial manner. To achieve this tradeoff, we send only one description to both decoders; the conditional distribution used to generate the codewords is illustrated in Fig. 5. The rate and the equivocation of this scheme are given by (56)-(67); hence, this scheme achieves the optimal tradeoff.

We now informally describe the intuition behind this coding scheme. Since the encoder has access to the side-information, it uses the fact that whenever the side information is not erased, no additional rate is required to satisfy the requirement of decoder 2; i.e., for the fraction of time in which there is no erasure, decoder 2 is guaranteed to recover the source exactly. However, this yields a large distortion at decoder 1 (since decoder 1 does not have access to the side information). In the remaining, erased fraction of time, the encoder describes the source with some distortion, which contributes to the distortion at both decoders. To summarize, the net distortion at decoder 2 is the erased fraction times the description distortion, whereas the distortion at decoder 1 is lowered relative to the trivial estimate. Furthermore, by construction, the description is independent of the side information, which results in the maximal equivocation at decoder 1.

a) For this case, the tradeoff is given as the set of pairs in (68) and (69), where the parameter belongs to the stated range.
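One way to read the informed-encoder intuition above in executable form: the encoder spends its description only on the erased positions and emits a fresh independent bit elsewhere (keeping the transmitted codeword independent of the side information), decoder 1 decodes from the codeword alone, and decoder 2 overrides it with the side information whenever available. This is a speculative Monte Carlo sketch of that reading; the Bernoulli(1/2) source and all parameter values are illustrative assumptions:

```python
import random

random.seed(1)
eps, D, n = 0.25, 0.1, 200_000   # illustrative parameter values

err1 = err2 = 0
for _ in range(n):
    x = random.random() < 0.5                # source bit
    erased = random.random() < eps           # encoder sees the erasure pattern
    if erased:
        u = x ^ (random.random() < D)        # describe X with distortion D
    else:
        u = random.random() < 0.5            # fresh bit, independent of (X, Y)
    err1 += u != x                           # decoder 1 decodes from U alone
    err2 += (u if erased else x) != x        # decoder 2 overrides U with Y = X

print(err1 / n, err2 / n)   # near eps*D + (1 - eps)/2 = 0.4 and eps*D = 0.025
```

The sketch matches the bookkeeping in the text: decoder 2 pays ε·D, while decoder 1 pays ε·D on erased positions plus a coin-flip penalty on the positions where the codeword carries no information about the source.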
We now describe the coding scheme that achieves this region: we send one description at the stated rate. The conditional distribution used to generate the codewords is illustrated in Fig. 6, and the parameters describing this distribution are chosen such that (70) and (71) hold.

Fig. 6. Illustration of the conditional distribution for this regime.

The estimate at decoder 1 is then created as stated. At decoder 2, the estimate uses the side information when it is not erased and the description otherwise, which yields the stated distortion. Since the worst-case distortion for decoder 2, for a fixed parameter, is as given, the fidelity requirements at both decoders can be satisfied as long as the stated condition holds. By direct calculation, it can be shown that the resulting tradeoff is as stated above. Compared with all the previous cases, the proof of optimality of the above coding scheme is nontrivial and is relegated to the Appendix.

We remark here that in this regime the tradeoff between rate and privacy can be observed in a precise manner. First, note that one extremal choice of the parameter yields the operating point of the uninformed encoder case. Next, as the parameter decreases toward 0, the equivocation increases, albeit at the cost of a higher rate. This phenomenon does not occur when the encoder does not have side information. Finally, when the parameter lies in the remaining range, we obtain a lower equivocation by increasing the rate. This phenomenon appears counterintuitive and can be explained as follows: this range of the parameter corresponds to a coding scheme in which we give more weight to the side-information when describing the source to decoder 1. Such a scheme can be regarded as the solution to the problem in which the encoder is interested in revealing the side information to decoder 1 while simultaneously satisfying the fidelity requirement at decoder 1. While this is a feasible solution, it may not be a desirable coding scheme when the privacy of the side information at decoder 1 is of primary concern; thus, there exists a set of rate-equivocation operating points from which one can choose. In Fig. 7, we show the achievable tradeoff for a representative choice of parameters.

a) For this case, the pairs in (72) and (73) are achievable, where the parameter satisfies the stated constraint. The coding scheme that achieves this tradeoff is similar to the one used in the previous case, with the exception that the range of the parameter is different. The question of optimality of the tradeoff for this regime remains unresolved.

Fig. 7. Illustration of the rate-equivocation tradeoff with an informed encoder.

Remark 3: The RDE tradeoffs developed for the binary symmetric source with erased side information, for the two cases in which the encoder is either informed or uninformed, are closely related to and inspired by the rate-distortion regions (without the additional privacy constraint) developed for these problems by Perron et al. in [5] and [6]. Specifically, for the case in which the encoder is uninformed, the schemes achieving the rate-distortion region suffice to achieve the set of all RDE tuples. In contrast, when the encoder is informed, there are specific differences between the rate-distortion region in [5] and [6] and the RDE region developed here: an informed encoder allows the use of strategies that exploit the side information to reduce the leakage at the uninformed decoder. This is because, when the encoder has access to the side-information, it can use it to generate the auxiliary codeword by adapting it to be as orthogonal to the side information as possible, so that it leaks minimal information about Y at decoder 1.

VI. CONCLUDING REMARKS

We have determined the rate-distortion-equivocation region for a source coding problem with two decoders, in which only one of the decoders has correlated side information and it is desired to keep this side information private from the uninformed decoder. We have studied two cases of this problem depending on the availability of side information at the encoder. We have proved that the Heegard–Berger and the Kaspi coding schemes are optimal even with an additional privacy constraint, for the uninformed and the informed encoder cases, respectively.
We have illustrated our results for a binary symmetric source with erased side information and Hamming distortion, which clearly highlight the difference between the informed and uninformed cases and the advantages of encoder side information for enhancing privacy. Future work includes generalizations to multiple decoders as well as to continuous-valued sources.

APPENDIX

A) Proof of (25): Here, we prove the inequality stated in (25) via the sequence of steps (74)-(78), which further simplifies as shown in (79)-(84) at the bottom of the page, where the residual term vanishes asymptotically.

B) Proof of Theorem 1: Converse: We now formally develop lower bounds on the rate and equivocation achievable for the uninformed encoder case. Let the codeword denote the output of the encoder. We show that, given a code for this problem, there exists a choice of auxiliary distribution such that the rate and equivocation of the system are bounded as in (85). We have the sequence of inequalities (86)-(93), where (88) follows from the fact that the source is memoryless, so that the current source symbol is independent of the past, together with the stated Markov chain relationship; (91) follows from a suitable definition of auxiliary random variables; (92) follows from (28), with the quantities therein defined in (94a)-(94c);

and (93) follows from standard arguments invoking the convexity of the function defined in (28) (see [23, Ch. 10], [4]).

For the same code, we can upper bound the achievable equivocation as in (95a)-(95d), where (95d) follows from the concavity of the equivocation (logarithm) function defined in (28).

Remark 4: It is worth noting that the minimal rate and the maximal equivocation need not be achieved by the same distribution; they simply denote the lower and upper bounds, respectively, on the achievable rate and equivocation. This is reflected in Theorem 1, where we develop the set of all rate-equivocation tuples achievable for a desired distortion pair.

Achievability: We briefly summarize the Heegard–Berger coding scheme [1]. Fix a joint distribution; we describe the encoding functions explicitly as follows. First, generate a collection of first-layer sequences i.i.d. according to the corresponding marginal. For every such sequence, generate a collection of second-layer sequences i.i.d. according to the corresponding conditional. The encoder bins the second-layer sequences into bins. Upon observing a source sequence, the encoder first searches for a jointly typical first-layer sequence (the choice of the codebook size ensures that at least one such sequence exists with high probability). Next, the encoder searches for a jointly typical second-layer sequence (again, the codebook size ensures that at least one such sequence exists with high probability). The encoder sends the first-layer index together with the bin index of the chosen second-layer sequence.

We remark that in the scheme presented above, decoder 2, having access to its side information, can correctly decode with high probability. In particular, we have from Fano's inequality that (96) holds, where the correction term vanishes as the blocklength grows. The equivocation claim for this scheme is equivalent to showing (97), which we prove by the sequence of inequalities (98)-(111).
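The binning step described above can be illustrated with a toy random-coding sketch: codewords are assigned to bins by their index, the encoder announces only the bin, and the decoder with side information disambiguates within the bin by picking the codeword closest in Hamming distance to its side information. The code sizes, the index-mod binning, and the minimum-distance decoder are illustrative stand-ins for the typicality-based construction in the text:

```python
import random

random.seed(2)
n_bits, n_codewords, n_bins = 32, 256, 64   # toy sizes, not rate-matched

# Random codebook; codeword i goes to bin i mod n_bins (4 codewords per bin).
codebook = [[random.randint(0, 1) for _ in range(n_bits)]
            for _ in range(n_codewords)]

def decode(bin_idx, side_info):
    """Within the announced bin, return the codeword closest to the side info."""
    candidates = [c for i, c in enumerate(codebook) if i % n_bins == bin_idx]
    return min(candidates,
               key=lambda c: sum(a != b for a, b in zip(c, side_info)))

idx = 37                                     # codeword chosen by the encoder
# Side information: a lightly corrupted copy of that codeword.
side = [b ^ (random.random() < 0.05) for b in codebook[idx]]

decoded = decode(idx % n_bins, side)
print(decoded == codebook[idx])              # correct with high probability
```

Announcing only the bin index saves rate; the side information pays for the disambiguation, exactly the role it plays in the scheme above.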
3) Proof of Equivocation: For the scheme presented above, we will show that (112) holds, where (104) follows from (96), (109) follows by the same arguments as in the proof of (25) in Appendix A, and (112) follows from the Markov chain relationship. Normalizing (112) by the blocklength, we have (113); finally, taking the limit, we obtain (97). Note that the slack term can be chosen arbitrarily close to 0.

D) Converse Proof for the Remaining Region: We start with a simple lower bound on the rate, (114), and an upper bound on the equivocation, (115).
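The converse that follows relies on a time-sharing argument via Jensen's inequality: the rate is convex and the equivocation is concave in the underlying parameter, so a mixture distribution can only help. Both directions can be checked numerically for the binary entropy function h; the rate form 1 − h(p) below is an illustrative stand-in for the actual rate expression:

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

a, b, lam = 0.1, 0.4, 0.3          # two parameter values and a mixing weight
mix = lam * a + (1 - lam) * b

# Concavity of h: the mixture's equivocation dominates the average,
# so time-sharing can only increase the equivocation.
assert h(mix) >= lam * h(a) + (1 - lam) * h(b)

# Correspondingly, a rate term of the form 1 - h(p) is convex in p,
# so the mixture uses a rate at most as large as the average.
assert 1 - h(mix) <= lam * (1 - h(a)) + (1 - lam) * (1 - h(b))
```

This is exactly the Jensen step used in the proof: symmetrizing the input distribution cannot increase the rate nor decrease the equivocation.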

We will now use the distortion constraint of decoder 1 alone to simultaneously lower bound the rate and upper bound the equivocation. Consider an arbitrary admissible conditional distribution. For this distribution, we have (116)-(119); these four quantities characterize the bounds in (114) and (115) exactly, and also the achievable distortion. Now consider a new distribution with the conditional probabilities permuted as indicated. It is straightforward to verify that the distortion, rate, and equivocation terms are the same for both distributions. Next, define a third distribution as a mixture of the two. We now note that the rate is convex and the equivocation is concave in the underlying parameter; by Jensen's inequality, this implies that the mixture distribution uses a rate that is at most as large and leads to an equivocation that is at least as large when compared with both of the original distributions. Hence, it suffices to consider input distributions of this symmetric form, which can be written explicitly. To satisfy the distortion constraint, the parameter must lie in the stated range. Note also that, for a fixed parameter, this scheme yields the stated distortion at decoder 2 and, over the allowed parameter range, the stated worst-case distortion. This implies that, as long as the fidelity constraint of decoder 2 is met, this regime yields the stated tradeoff.

REFERENCES

[1] C. Heegard and T. Berger, "Rate distortion when side information may be absent," IEEE Trans. Inf. Theory, vol. 31, no. 7, pp. 727–733, Nov. 1985.
[2] A. Kaspi, "Rate-distortion function when side-information may be present at the decoder," IEEE Trans. Inf. Theory, vol. 40, no. 6, pp. 2031–2034, Nov. 1994.
[3] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, no. 4, pp. 471–480, Jul. 1973.
[4] A. D. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. 22, no. 1, pp.
1–10, Jan. 1976.
[5] E. Perron, S. N. Diggavi, and I. E. Telatar, "On the role of encoder side-information in source coding for multiple decoders," in Proc. IEEE Int. Symp. Inf. Theory, Seattle, WA, USA, Jul. 2006, pp. 331–335.
[6] E. Perron, S. N. Diggavi, and I. E. Telatar, "Lossy source coding with Gaussian or erased side-information," in Proc. IEEE Int. Symp. Inf. Theory, Seoul, Korea, Jul. 2009, pp. 1035–1039.
[7] D. Gündüz, E. Erkip, and H. V. Poor, "Lossless compression with security constraints," in Proc. IEEE Int. Symp. Inf. Theory, Toronto, ON, Canada, 2008, pp. 111–115.
[8] L. Grokop, A. Sahai, and M. Gastpar, "Discriminatory source coding for a noiseless broadcast channel," in Proc. IEEE Int. Symp. Inf. Theory, Adelaide, Australia, 2005, pp. 77–81.
[9] J. Villard and P. Piantanida, "Secure lossy source coding with side information at the decoders," in Proc. 48th Annu. Allerton Conf. Commun., Control, Comput., Monticello, IL, USA, Sep. 2010, pp. 733–739.
[10] J. Villard and P. Piantanida, "Secure multiterminal source coding with side information at the eavesdropper," IEEE Trans. Inf. Theory, vol. 59, no. 6, pp. 3668–3692, Jun. 2013.
[11] R. Tandon, S. Mohajer, and H. V. Poor, "Cascade source coding with erased side information," in Proc. IEEE Int. Symp. Inf. Theory, St. Petersburg, Russia, Aug. 2011, pp. 2944–2948.
[12] R. Tandon, L. Sankar, and H. V. Poor, "Multi-user privacy: The Gray–Wyner system and generalized common information," in Proc. IEEE Int. Symp. Inf. Theory, St. Petersburg, Russia, Aug. 2011, pp. 563–567.
[13] R. Tandon, S. Ulukus, and K. Ramchandran, "Secure source coding with a helper," IEEE Trans. Inf. Theory, vol. 59, no. 4, pp. 2178–2187, Apr. 2013.
[14] L. Sankar, S. R. Rajagopalan, and H. V. Poor, "Utility-privacy tradeoffs in databases: An information-theoretic approach," IEEE Trans. Inf. Forensics Security (Special Issue on Privacy and Trust Management in Cloud and Distributed Systems), vol. 8, no. 6, pp. 838–852, Jun. 2013.
[15] V. Prabhakaran and K.
Ramchandran, "On secure distributed source coding," in Proc. IEEE Inf. Theory Workshop, Tahoe, CA, USA, Sep. 2007, pp. 442–447.
[16] P. Cuff, "Using a secret key to foil an eavesdropper," in Proc. 48th Annu. Allerton Conf. Commun., Control, Comput., Monticello, IL, USA, Sep. 2010, pp. 1405–1411.
[17] H. Yamamoto, "A rate-distortion problem for a communication system with a secondary decoder to be hindered," IEEE Trans. Inf. Theory, vol. 34, no. 4, pp. 835–842, Jul. 1988.
[18] N. Merhav and E. Arikan, "The Shannon cipher system with a guessing wiretapper," IEEE Trans. Inf. Theory, vol. 45, no. 6, pp. 1860–1866, Sep. 1999.
[19] N. Merhav, "Shannon's secrecy system with informed receivers and its application to systematic coding for wiretapped channels," IEEE Trans. Inf. Theory, vol. 54, no. 6, pp. 2723–2734, Jun. 2008.
[20] A. Khisti, S. Diggavi, and G. Wornell, "Secret-key generation using correlated sources and channels," IEEE Trans. Inf. Theory, vol. 58, no. 2, pp. 652–670, Feb. 2012.
[21] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Orlando, FL, USA: Academic, 1982.
[22] T. Weissman and S. Verdú, "The information lost in erasures," IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5030–5058, Nov. 2008.
[23] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. New York, NY, USA: Wiley, 2006.

Ravi Tandon (S'03–M'09) received the B.Tech. degree in electrical engineering from the Indian Institute of Technology (IIT), Kanpur, in 2004 and the Ph.D. degree in electrical and computer engineering from the University of Maryland, College Park, in 2010. From 2010 until 2012, he was a Postdoctoral Research Associate with Princeton University. In 2012, he joined Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, where he is currently a Research Assistant Professor in the Department of Electrical and Computer Engineering.
His research interests are in the areas of network information theory, communication theory for wireless networks, and information-theoretic security. Dr. Tandon is a recipient of the Best Paper Award at the Communication Theory Symposium of the 2011 IEEE Global Communications Conference.

Lalitha Sankar (S'92–M'07) received the B.Tech. degree from the Indian Institute of Technology, Bombay, the M.S. degree from the University of Maryland, and the Ph.D. degree from Rutgers University in 2007. She is presently an Assistant Professor in the Electrical, Computer, and Energy Engineering Department at Arizona State University. Prior to this, she was an Associate Research Scholar at Princeton University. Following her doctorate, Dr. Sankar was a recipient of a three-year Science and Technology Teaching Postdoctoral Fellowship from the Council on Science and Technology at Princeton University. Prior to her doctoral studies, she was a Senior Member of Technical Staff at AT&T Shannon Laboratories. Her research interests include information privacy and secrecy in distributed and cyber-physical systems, wireless communications, and network information theory and its applications to modeling and studying large data systems. For her doctoral work, she received the 2007–2008 Electrical Engineering Academic Achievement Award from Rutgers University. She received the IEEE Globecom 2011 Best Paper Award for her work on privacy of side-information in multi-user data systems.

H. Vincent Poor (S'72–M'77–SM'82–F'87) received the Ph.D. degree in EECS from Princeton University in 1977. From 1977 until 1990, he was on the faculty of the University of Illinois at Urbana-Champaign. Since 1990 he has been on the faculty at Princeton, where he is the Michael Henry Strater University Professor of Electrical Engineering and Dean of the School of Engineering and Applied Science. Dr. Poor's research interests are in the areas of stochastic analysis, statistical signal processing, and information theory, and their applications in wireless networks and related fields such as social networks and smart grid.
Among his publications in these areas are the recent books Smart Grid Communications and Networking (Cambridge University Press, 2012) and Principles of Cognitive Radio (Cambridge University Press, 2013). Dr. Poor is a member of the National Academy of Engineering and the National Academy of Sciences, a Fellow of the American Academy of Arts and Sciences, an International Fellow of the Royal Academy of Engineering (U.K.), and a Corresponding Fellow of the Royal Society of Edinburgh. He is also a Fellow of the Institute of Mathematical Statistics, the Acoustical Society of America, and other organizations. In 1990, he served as President of the IEEE Information Theory Society, and in 2004–07 he served as the Editor-in-Chief of these TRANSACTIONS. He received a Guggenheim Fellowship in 2002 and the IEEE Education Medal in 2005. Recent recognition of his work includes the 2010 IET Ambrose Fleming Medal, the 2011 IEEE Eric E. Sumner Award, the 2011 IEEE Information Theory Paper Award, and honorary doctorates from Aalborg University, the Hong Kong University of Science and Technology, and the University of Edinburgh.