Image Steganalysis: Challenges. Jiwu Huang, China. BUCHAREST 2017
Acknowledgement: Members of my team: Dr. Weiqi Luo and Dr. Fangjun Huang (Sun Yat-sen Univ., China); Dr. Bin Li, Dr. Shunquan Tan, and Mr. Jishen Zeng (Shenzhen Univ., China)
Outline: Steganography in Images; Steganalysis vs. Steganography; Challenges in Steganalysis
Steganography: What is steganography? Undetectable
Steganography: Steganography vs. Cryptography? Cryptography: plaintext M → encryption $E_k(M)$ → ciphertext "@2*$#&(*%7*" = ? → plaintext. Unreadable!
Steganography: Steganography and Cryptography? Steganography: plaintext M + cover → hiding → stego → plaintext M. Nothing!
Steganography: secret message → encryption (cryptography) → encrypted bits → hiding in a cover image (steganography) → insecure channel. Security: 1) difficulty of finding stego images; 2) difficulty of extracting the hidden bits
Steganography: Security for steganography. Secure steganography is undetectable: stego images can be identified only with a probability close to random guessing. Imperceptibility is neither necessary nor sufficient: imperceptible does not mean undetectable, and perceptible does not mean detectable
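Steganography: Security for steganography. K-L distance (relative entropy) as security (Cachin 2004): $P_X$: distribution of covers; $P_Y$: distribution of stegos; $D(P_X \| P_Y) = \sum_x P_X(x) \log \frac{P_X(x)}{P_Y(x)} \le \varepsilon$ means $\varepsilon$-secure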
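The relative entropy between cover and stego distributions can be estimated from samples. The following is a minimal sketch, assuming a toy binary alphabet (for example pixel LSBs) and made-up sample values; the helper names `empirical_distribution` and `kl_divergence` are illustrative and not from the talk.

```python
import math
from collections import Counter

def empirical_distribution(samples, alphabet):
    """Relative frequency of each symbol of `alphabet` in `samples`."""
    counts = Counter(samples)
    total = len(samples)
    return {a: counts[a] / total for a in alphabet}

def kl_divergence(p, q):
    """D(p || q) in bits; infinite if q has zero mass where p does not."""
    d = 0.0
    for symbol, pa in p.items():
        if pa == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        qa = q.get(symbol, 0.0)
        if qa == 0.0:
            return math.inf
        d += pa * math.log2(pa / qa)
    return d

# Toy data (assumed): LSBs of cover pixels vs. LSBs after embedding.
cover_lsbs = [0, 0, 0, 1, 0, 1, 0, 0]   # P_X: P(0) = 0.75, P(1) = 0.25
stego_lsbs = [0, 1, 0, 1, 1, 0, 1, 0]   # P_Y: P(0) = 0.50, P(1) = 0.50

p_x = empirical_distribution(cover_lsbs, [0, 1])
p_y = empirical_distribution(stego_lsbs, [0, 1])
epsilon = kl_divergence(p_x, p_y)
print(f"D(P_X || P_Y) = {epsilon:.4f} bits")
```

A value of zero would mean the two empirical distributions coincide; for real images the distributions live in a very high-dimensional space, which is why this measure is hard to evaluate in practice.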
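Steganography: Security for steganography. Maximum Mean Discrepancy (MMD)-based security (Fridrich 2008): $\mathrm{MMD}[\mathcal{F}, X, Y] = \sup_{f \in \mathcal{F}} \left( \frac{1}{n} \sum_{i=1}^{n} f(x_i) - \frac{1}{n} \sum_{i=1}^{n} f(y_i) \right)$, where $x_i$, $y_i$ are samples of $P_X$, $P_Y$. What function $f$ should be selected?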
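When the function class is the unit ball of a reproducing kernel Hilbert space, the MMD has a closed form in terms of kernel evaluations. The sketch below assumes a Gaussian kernel with a hand-picked width `gamma` and uses the biased estimator of the squared MMD; the kernel choice, the parameter value, and the toy feature values are assumptions for illustration, not choices stated in the talk.

```python
import math

def gaussian_kernel(a, b, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for equal-length feature tuples."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of MMD^2 between samples xs ~ P_X and ys ~ P_Y."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, gamma) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy 1-D features (assumed values): covers, a weak embedding, a strong one.
covers = [(0.0,), (0.1,), (0.2,)]
stego_weak = [(0.05,), (0.15,), (0.25,)]
stego_strong = [(2.0,), (2.1,), (2.2,)]

print("weak embedding:  ", mmd_squared(covers, stego_weak))
print("strong embedding:", mmd_squared(covers, stego_strong))
```

The estimate grows as the stego features move away from the cover features, so a smaller value indicates a scheme that is harder to detect under the chosen kernel.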
Steganography: Security for steganography. Steganalyzer's ROC-based security (Memon 2003): security is judged from the steganalyzer's ROC curve (TP rate vs. FP rate)
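An ROC curve can be traced by sweeping a threshold over the detector's output scores. The sketch below assumes the steganalyzer outputs a scalar score (higher means "more likely stego") and summarizes the curve by its area; using the area as the summary is an assumption made here for illustration, and the score values are invented.

```python
def roc_points(cover_scores, stego_scores):
    """(FP rate, TP rate) pairs obtained by lowering the decision threshold."""
    thresholds = sorted(set(cover_scores) | set(stego_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tp_rate = sum(s >= t for s in stego_scores) / len(stego_scores)
        fp_rate = sum(s >= t for s in cover_scores) / len(cover_scores)
        points.append((fp_rate, tp_rate))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy scores (assumed): one detector separates the classes, one cannot.
good_detector = area_under_curve(roc_points([0.1, 0.2], [0.8, 0.9]))
blind_detector = area_under_curve(roc_points([0.5, 0.5], [0.5, 0.5]))
print(good_detector, blind_detector)
```

An area near 0.5 means the TP rate tracks the FP rate, so the detector does no better than guessing and the scheme is secure against it; an area near 1 means the scheme is reliably detected.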
Steganalysis vs. Steganography 12
Steganalysis vs. Steganography Steganalysis 13
Steganalysis vs. Steganography. Early targeted approaches: attacking LSB-based steganography; attacking OutGuess, MB, F5, YASS. Advanced universal approaches: Image Quality Features (Memon et al.); Calibration-Based Features (Fridrich et al.); Moment-Based Features (Farid et al.); Correlation-Based Features (Moulin, Sullivan, Shi)
Steganalysis vs. Steganography: Cat & Mouse Game (targeted approaches). Steganography: LSB-based (J-steg); Model-based (MB1); Compensation-based (MB2). Steganalysis: Histogram-based (Chi-square), RS; Overfitting; Block artifacts due to many changes
Steganalysis vs. Steganography: Cat & Mouse Game (universal approaches). Steganography: matrix coding; wet paper codes; STC (syndrome-trellis codes); distortion functions. Steganalysis: high-dimensional features + machine learning. Adaptive+ Adaptive+
Steganalysis vs. Steganography: Advanced universal approaches (testing). $Y_i$: feature vector. Classifier: mapping from $Y_i$ to a class label: 0, natural image; 1, stego image
Steganalysis vs. Steganography: Advanced universal approaches (training). Classifiers: SVM, Fisher linear discriminant, neural network, ensemble, others.
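Of the classifiers listed, the Fisher linear discriminant is the simplest to write out. The following is a minimal sketch for two-dimensional feature vectors, assuming invented training features; real steganalysis features have thousands of dimensions and would need a linear-algebra library rather than the hand-coded 2x2 inverse used here.

```python
def mean(vectors):
    """Component-wise mean of a list of 2-D vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(2)]

def scatter(vectors, m):
    """2x2 scatter matrix of `vectors` around mean `m`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - m[0], v[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def train_fld(cover_feats, stego_feats):
    """Return (w, b): projection w = Sw^-1 (m1 - m0) and threshold b."""
    m0, m1 = mean(cover_feats), mean(stego_feats)
    s0, s1 = scatter(cover_feats, m0), scatter(stego_feats, m1)
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold halfway between the projected class means.
    b = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, b

def classify(w, b, y):
    """Map feature vector y to 0 (natural image) or 1 (stego image)."""
    return 1 if w[0] * y[0] + w[1] * y[1] > b else 0

# Toy training features (assumed values).
cover_train = [(1.0, 2.0), (1.5, 1.8), (0.8, 2.3), (1.2, 2.1)]
stego_train = [(3.0, 3.5), (3.4, 3.1), (2.8, 3.8), (3.2, 3.4)]
w, b = train_fld(cover_train, stego_train)
print(classify(w, b, (1.1, 2.0)), classify(w, b, (3.1, 3.3)))
```

Training fits `w` and `b` on labeled features; testing then only needs the projection and threshold, which is the mapping from feature vector to class label described on the testing slide.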
Statistical features: natural images vs. stego images. Steganalysis wins over steganography
Statistical features: stego images vs. natural images. Steganography wins over steganalysis
Contest between steganography and steganalysis: statistical model of natural images
Challenges in Steganalysis: Statistical model of natural images. Is there such a universal statistical model? It seems difficult to answer. If yes, how to model it? At the least, texture regions are not easy to model
Challenges in Steganalysis: How to find the traces of a steganographic scheme? Existing features have limited performance: objective image quality measures, calibration-based features, statistical moments, gray-level co-occurrence matrix based features, Markov process based features, SPAM (subtractive pixel adjacency matrix), SRM (spatial rich model). SRM: generating features (Fridrich 2012)
SRM: 1) Computing residuals using different submodels: $R_{ij} = \hat{X}_{ij}(\mathcal{N}_{ij}) - c X_{ij}$. 2) Truncation and quantization with quantization step $q$ and threshold $T$: $R_{ij} \leftarrow \mathrm{trunc}_T(\mathrm{round}(R_{ij}/q))$. 3) Using co-occurrence matrices for feature extraction. Fridrich et al., Rich Models for Steganalysis of Digital Images, IEEE T-IFS, 7(3): 868-882, 2012
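The three SRM steps can be sketched on a tiny image. This is a simplified illustration, not the full rich model: it assumes a single first-order horizontal submodel (the predictor is the right-hand neighbour, with c = 1), q = 1, T = 2, and second-order co-occurrences of horizontally adjacent residuals, whereas the published model combines many submodels and uses higher-order co-occurrences. The pixel values are invented.

```python
from collections import Counter

def first_order_residual(image):
    """R[i][j] = X[i][j+1] - X[i][j]: right neighbour as predictor, c = 1."""
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in image]

def quantize_truncate(residual, q=1.0, T=2):
    """Apply trunc_T(round(R / q)) element-wise."""
    def qt(r):
        # Python's round() rounds exact halves to the nearest even integer.
        return max(-T, min(T, round(r / q)))
    return [[qt(r) for r in row] for row in residual]

def cooccurrence(residual):
    """Normalized counts of horizontally adjacent residual pairs."""
    counts = Counter()
    for row in residual:
        for a, b in zip(row, row[1:]):
            counts[(a, b)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

# Toy 2x4 grayscale image (assumed values).
image = [[10, 12, 11, 15],
         [20, 20, 19, 25]]

residual = quantize_truncate(first_order_residual(image), q=1.0, T=2)
features = cooccurrence(residual)
print(residual)
print(features)
```

With threshold T, each residual takes one of 2T + 1 values, so the co-occurrence bins of all submodels together are what drive the feature dimension into the tens of thousands.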
Performance of SRM under different datasets: SRM + ensemble classifier. Image dataset: downsampled from high-resolution raw images. Downsampling type: bicubic, bilinear, lanczos2, nearest. 5000 images for training, 5000 for testing. Steganographic algorithm: S-UNIWARD, 0.4 bpp
Challenges in Steganalysis: How to find the traces of a steganographic scheme? How to design efficient features? To achieve better performance: a methodology for constructing features; compact features; low computational load
Challenges in Steganalysis: Mismatch between training and testing databases. The mismatch issue: the datasets for training and testing have different properties. Steganalyzers should be robust. In the steganalysis community: downsampling leaves traces
Downsampling effect on steganalysis: SRM + ensemble classifier. Image dataset: downsampled from high-resolution raw images. 5000 images for training, 5000 for testing. Steganographic algorithm: S-UNIWARD, 0.4 bpp
Challenges in Steganalysis: Does deep learning work? Great success in pattern recognition. Does DL work in recognizing imperceptible differences? It works on some easy forensics issues. However, how about steganalysis? No good progress has been seen in spatial-domain steganalysis. Recent progress in JPEG-domain steganalysis
Deep learning in JPEG steganalysis. JPEG steganographic algorithm: J-UNIWARD, 0.4 bpp. Training: 800K images from ImageNet. Testing: 200K images from ImageNet.
Method                          Accuracy
DCTR features (8000 dims)       0.65
PHARM features (12600 dims)     0.68
Proposed deep network           0.75
Zeng et al., Large-scale JPEG image steganalysis using hybrid deep-learning framework, submitted to IEEE Trans. on IFS
Challenges in Steganalysis: Dataset for training. Golden rule in machine learning: the number of data samples should be at least 20-50 times the total number of parameters. Steganalysis in the spatial domain: high-dimensional features vs. small dataset. Taking SRM as an example:
SRM: very high feature dimension vs. small training dataset. Features: 34671. Image database: 10K; training images: 5K. JPEG steganographic algorithm: J-UNIWARD, 0.4 bpp. Fridrich et al., Rich Models for Steganalysis of Digital Images, IEEE T-IFS, 7(3): 868-882, 2012
DNN: very large number of network parameters vs. small training dataset. Parameters: 100K. Image database: 10K; training images: 5K. JPEG steganographic algorithm: J-UNIWARD, 0.4 bpp. Xu et al., Structural Design of Convolutional Neural Networks for Steganalysis, IEEE SPL, 23(5): 708-712, 2016.
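The gap between the 20-50x rule and the datasets quoted above can be worked out directly. The sketch below uses only the numbers from the slides (34671 SRM features, 100K network parameters, 5K training images) and assumes, as the slides do, that the feature dimension stands in for the parameter count of the classifier; the function names are illustrative.

```python
def required_samples(num_parameters, low_factor=20, high_factor=50):
    """Sample range suggested by the 20-50x rule of thumb."""
    return num_parameters * low_factor, num_parameters * high_factor

def shortfall(num_parameters, training_images, low_factor=20):
    """How many times too small the training set is, at the lower bound."""
    return num_parameters * low_factor / training_images

TRAINING_IMAGES = 5_000
models = {
    "SRM (feature dimension)": 34_671,
    "CNN (network parameters)": 100_000,
}

for name, n in models.items():
    low, high = required_samples(n)
    gap = shortfall(n, TRAINING_IMAGES)
    print(f"{name}: rule suggests {low:,} to {high:,} samples; "
          f"{TRAINING_IMAGES:,} available, about {gap:.0f}x too few")
```

Even at the lower bound of the rule, both settings fall short by more than two orders of magnitude, which is the point the two slides make.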
Challenges in Steganalysis: Dataset for training. Uncompressed images are not commonly used in practice. Collecting a usable raw dataset requires a great deal of manpower
Conclusions: Steganography is a good means of secret communication, and thus steganalysis is important. There has been some progress in the laboratory. There are still many challenges. Deep learning may be useful in JPEG steganalysis