steganalysis

7/27/2019 Steganalysis


A Seminar Report on

Steganalysis

Prepared by : Pratixa I Mistry

Roll No. : 110420704003

Class : MEEC

Semester : 3 rd Semester

Year : 2 nd Year

Guided by : Prof. Chirag N Paunwala

Department

Of

Electronics & Communication Engineering

Sarvajanik College of Engineering & Technology

Dr R.K. Desai Road,

Athwalines, Surat - 395001, India



STEGANALYSIS

A Seminar Report

Submitted by

Ms. PRATIXA I MISTRY

Enrollment Number

(110420704003)

in partial fulfillment for the award of the degree

of

MASTER OF ENGINEERING

IN

ELECTRONICS & COMMUNICATION

Year- 2012-13

At

SARVAJANIK COLLEGE OF ENGINEERING & TECHNOLOGY

DR.R.K. DESAI ROAD,

ATHWALINES SURAT-395001, INDIA



Abstract

Steganography and steganalysis are important topics in information hiding.

Steganography refers to the technology of hiding data into digital media without drawing

any suspicion, while steganalysis is the art of detecting the presence of steganography.

Steganalysis is a relatively new branch of research. While steganography deals

with techniques for hiding information, the goal of steganalysis is to detect and/or

estimate potentially hidden information from observed data with little or no knowledge

about the steganography algorithm or its parameters. It is fair to say that steganalysis is

both an art and a science. The art of steganalysis plays a major role in the selection of features or characteristics a typical stego message might exhibit, while the science helps

in reliably testing the selected features for the presence of hidden information.

Steganalysis has gained prominence in national security and forensic sciences

since detection of hidden messages can lead to the prevention of disastrous security

incidents. Steganalysis is a very challenging field because of the scarcity of knowledge

about the specific characteristics of the cover media (an image, an audio or video file) that

can be exploited to hide information and detect the same. The approaches adopted for

steganalysis also sometimes depend on the underlying steganography algorithm(s) used.



Acknowledgement

I would like to thank Prof. Chirag N Paunwala for supervising my seminar and guiding me throughout its duration. He has always been supportive and eager to help. His great experience has helped me immensely with the difficulties and delays that I faced in my seminar.

My dear colleagues have also helped me directly or indirectly. Finally, I thank my parents for their constant support and Almighty God for providing the strength to complete the work.

Ms. Pratixa I Mistry.



Table of Contents

Abstract

Acknowledgement

1 Introduction

1.1 Motivation:

1.2 Types of Steganalysis:

1.3 Basic Model:

1.4 Evaluation Criteria:

1.4.1 Criteria for Steganography:

1.4.2 Criteria for Steganalysis:

2 Literature review

2.1 LSB Matching Steganalysis:

2.2 LSB Steganalysis:

2.3 JSteg steganalysis:

2.4 Universal (blind) image Steganalysis using Blockiness:

2.4.1 Comparing The Calibrated image against the original image:

2.4.2 Blockiness:

2.5 Universal (blind) image Steganalysis:

2.5.1 Parameterized Run-Length Representations:

2.6 Universal JPEG steganalysis:

2.6.1 Statistical models and information hiding:

2.6.2 Feature Extraction:

2.6.3 Feature Based JPEG Steganalysis using Neighboring Joint Density Based Features:

2.7 Improving Steganographic Security:

REFERENCES



LIST OF FIGURES

Figure 1.1 The model of steganography and steganalysis [4]

Figure 1.2 Confusion matrix [4]

Figure 1.3 ROC curve [4]

Figure 2.1 The flow chart of extracting feature

Figure 2.2 (a) Cover Image in RGB Colour Model (b) Stego-Image in RGB Colour Model [6]

Figure 2.3 (a) Cover Image in HSI Colour Model (b) Stego-Image in HSI Colour Model [6]

Figure 2.4 The histogram of a clean 8-bit JPEG image [1]

Figure 2.5 The histogram of an 8-bit stegogramme produced using JSteg [1]

Figure 2.6 Comparing the histogram of the calibrated image with that of the cover image [1]

Figure 2.7 Graphical representation of the blockiness algorithm [1]

Figure 2.8 Example of run length histograms of original and stego Lena [8]

Figure 2.9 (a): quantization RLHs with Q = 4; (b): difference RLHs with =2 [8]

Figure 2.10 The DCT neighboring joint density probabilities and the difference between the cover and the steganograms [9]



1 Introduction

Cryptography is often used to protect the secrecy of information by making messages illegible. However, indecipherable messages may raise an opponent's suspicion and may lead the opponent to disrupt that channel of communication. Therefore, steganography has taken on a role in information security [1].

Steganography refers to the technique of hiding information in digital media in order to conceal the existence of the information. The media with and without hidden information are called stego media and cover media, respectively [1]. Steganography can serve both legal and illegal interests. For example, civilians may use it to protect privacy, while terrorists may use it to spread terrorist information. Compared with digital watermarking, another branch of information hiding, steganography places more emphasis on preserving the secrecy of the information than on making the hidden information robust to attacks.

Image steganography can be done in two ways: spatial steganography and JPEG steganography. Spatial steganography methods share the approach of directly changing image pixel values to hide data. The embedding rate is often measured in bits per pixel (bpp). JPEG is the common format of images produced by digital cameras, scanners, and other photographic image capture devices, so hiding secret information in JPEG images may provide better camouflage. Most steganographic schemes embed data into the nonzero alternating current (AC) discrete cosine transform (DCT) coefficients of JPEG images, because the majority of the DCT energy is concentrated in the low frequencies (the DC coefficient) and less in the higher frequencies (the AC coefficients). Modifying AC coefficients therefore affects image quality less than modifying DC coefficients. As a result, the embedding rate of JPEG steganography is often evaluated in bits per nonzero AC DCT coefficient (bpac).

Steganalysis is the art of deterring covert communications while avoiding affecting innocent ones. Its basic requirement is to determine accurately whether a secret message is hidden in the medium under test [1]. Further requirements may include judging the type of steganography, estimating the rough length of the message, or even extracting the hidden message. Steganography and steganalysis are in a hide-and-seek game [5]: they try to defeat each other and also develop with each other.

Based on the medium used to embed the message, steganography is classified into three basic types: image steganography, audio steganography and video steganography. Digital images have a high degree of redundancy in representation and pervasive applications in daily life, which makes them appealing for hiding data. As a result, the past decade has seen growing interest in research on image steganography and image steganalysis [3,5].

1.1 Motivation:

Image steganalysis is the science of analyzing images to discover methods of detecting hidden messages and data within the images.



On the steganography side, this is important for improving the algorithms that implement steganography. By exposing the flaws of an algorithm, its designer can further improve it to make it more difficult to detect whether data is hidden in the images.

Steganalysis is also especially important for security, namely for monitoring a user's communication with the outside world. In the age of the internet, images are sent via email or posted on websites. Detecting whether data is hidden in the images allows the monitor to further analyze the suspicious images in order to find the hidden message. Steganalysis is very important to international security, as interest grows in whether terrorist organizations use steganographic techniques to communicate with each other. For this reason, steganalysis is taken very seriously in security applications.

1.2 Types of Steganalysis:

Based on the ultimate outcome of the effort, steganalysis can be classified into two categories [3]:

Passive steganalysis: Detect the presence or absence of a hidden message in a stego signal, and identify the stego embedding algorithm.

Active steganalysis: Estimate the embedded message length, estimate the locations of the hidden message, estimate the secret key used in embedding, estimate some parameters of the stego embedding algorithm, and extract the hidden message.

The methods described in this report are for image steganalysis and are of the passive type.

1.3 Basic Model:

The issue in steganography and steganalysis is often modeled by the prisoner's problem [4], which involves three parties, as illustrated in figure 1.1. Alice and Bob are two prisoners who collaborate to hatch an escape plan while their communications are monitored by a warden, Wendy. Using a data embedding method $\Psi(\cdot)$, Alice hides the secret information $m$ in a cover medium $X$ with a key $k_1$. Generation of an innocuous-looking stego medium $Y$ can be described as $Y = \Psi(X, m, k_1)$. On the receiver's side, the medium obtained by Bob, denoted by $\hat{Y}$, is passed to a data extraction method $\Phi(\cdot)$ to extract the information $\hat{m}$ with a key $k_2$. The extraction process may be described as $\hat{m} = \Phi(\hat{Y}, k_2)$. The steganographic scheme should ensure $\hat{m} = m$. Although public key steganographic schemes are considered in some of the literature, the private key steganographic scheme, where $k_1 = k_2$ is assumed, remains the most common scenario in a steganographic system. Wendy can be active or passive, depending on how she examines the media in transmission. If she makes $\hat{Y} \neq Y$ in order to foil all possible covert communications between Alice and Bob, she is called an active warden. If she only takes action when $Y$ is found suspicious, she is a passive warden. In the passive warden case, which is the main focus of this report, the steganographic method is considered broken once Wendy can differentiate $Y$ from $X$.



Note that this model only aims to explain the concepts of steganography and steganalysis, not to detail how they are conducted in practice.

Figure 1.1 The model of steganography and steganalysis[4]

1.4 Evaluation Criteria:

In order to evaluate the performance of various kinds of steganographic and steganalytic methods reasonably, it is necessary to define criteria acceptable to the majority. Moreover, the evaluation criteria may also point to the right direction for improving the techniques.

1.4.1 Criteria for Steganography:

Three common requirements, security, capacity, and imperceptibility, may be used to rate the performance of steganographic techniques [4].

Security: Steganography may suffer from many active or passive attacks, corresponding to Wendy acting as an active or passive warden in the prisoner's problem. If, in the presence of some steganalytic systems, the existence of the secret message can only be estimated with a probability no higher than random guessing, the steganography may be considered secure under those steganalytic systems. Otherwise it is considered insecure.

Capacity: To be useful in conveying secret messages, the hiding capacity provided by steganography should be as high as possible. It may be given as an absolute measurement (such as the size of the secret message) or as a relative value (called the data embedding rate, such as bits per pixel, bits per nonzero DCT coefficient, or the ratio of the secret message to the cover medium).

Imperceptibility: Stego images should not have severe visual artifacts. If the resulting stego image appears innocuous enough, this requirement can be considered well satisfied, since the warden does not have the original cover image for comparison.

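[Figure 1.1 block diagram: Alice applies Data Embedding to the Cover Image x, the Secret Message m and the Secret Key k1 to produce the Stego Image y; y crosses the Channel, where Wendy performs Steganalysis; Bob applies Data Extraction to the Stego Image y with the Secret Key k2 to recover the Secret Message m.]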



1.4.2 Criteria for Steganalysis:

The main goal of steganalysis is to identify whether or not a suspected medium is embedded with secret data, in other words, to determine whether the medium under test belongs to the cover class or the stego class. If a certain steganalytic method is used to steganalyze a suspicious medium, there are four possible outcomes [4].

True positive (TP): meaning that a stego medium is correctly classified as stego.

False negative (FN): meaning that a stego medium is wrongly classified as cover.

True negative (TN): meaning that a cover medium is correctly classified as cover.

False positive (FP): meaning that a cover medium is wrongly classified as stego.

Confusion Matrix:

When a steganalytic method is applied to a testing data set, which may consist of cover and stego media, the results can be summarized in a 2x2 confusion matrix [4], as illustrated in figure 1.2.

                          True type
Detected type     Stego image               Cover image
Stego image       True positives (TPs)      False positives (FPs)
Cover image       False negatives (FNs)     True negatives (TNs)
Sum by column     Number of stego images    Number of cover images

$$\text{TP Rate} = \frac{TP}{TP + FN} \qquad (1\text{-}1)$$

$$\text{FP Rate} = \frac{FP}{FP + TN} \qquad (1\text{-}2)$$

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \qquad (1\text{-}3)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (1\text{-}4)$$

Figure 1.2 Confusion matrix [4]
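As a small illustration of equations 1-1 to 1-4, the four rates can be computed directly from the confusion-matrix counts. The function name and the example counts below are hypothetical and are not taken from [4].

```python
def steganalysis_metrics(tp, fn, tn, fp):
    """Rates of equations 1-1 to 1-4 computed from confusion-matrix counts."""
    return {
        "tp_rate": tp / (tp + fn),                  # equation 1-1
        "fp_rate": fp / (fp + tn),                  # equation 1-2
        "accuracy": (tp + tn) / (tp + fn + tn + fp),  # equation 1-3
        "precision": tp / (tp + fp),                # equation 1-4
    }


# Hypothetical test set: 100 stego images and 100 cover images.
metrics = steganalysis_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the function assumes the test set contains at least one stego image and one cover image, otherwise a denominator is zero.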



Receiver Operating Characteristic (ROC) Curve:

The performance of a steganalytic classifier may be visualized by an ROC curve [4], in which the true positive rate is plotted on the vertical axis and the false positive rate is plotted on the horizontal axis (see Figure 1.3). The larger the area under the ROC curve (AUC), the better the performance of the steganalytic method. For example, it can be observed from Figure 1.3 that the performance of ROC curve C is better than B, and B is better than A.

Figure 1.3 ROC curve[4]
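The construction of an ROC curve and its AUC can be sketched as follows. This is a minimal sketch, assuming the detector outputs a numeric score per image (larger meaning "more likely stego"); the function names and the example scores are hypothetical.

```python
def roc_points(scores, labels):
    """Return ROC points (FP rate, TP rate) by sweeping a decision threshold.

    scores: detector outputs, larger means "more likely stego" (assumption).
    labels: 1 for stego, 0 for cover.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with the same score are accepted together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical detector scores for three stego (1) and three cover (0) images.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

An AUC of 1 corresponds to a detector that separates the two classes perfectly, while 0.5 corresponds to random guessing.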



2 Literature review

Steganalysis can be regarded as a two-class pattern classification problem that aims to determine whether a medium under test is a cover medium or a stego one. According to its field of application, it can be divided into specific methods and universal methods [4].

A specific steganalytic method fully utilizes knowledge of a targeted steganographic technique and may only be applicable to that kind of steganography. A universal steganalytic method can be used to detect several kinds of steganography. Universal methods usually do not require knowledge of the details of the embedding operations, so they are also called blind methods. Some methods can be considered "semi-universal", meaning that they can reliably detect many JPEG steganographic schemes but may not be effective against spatial steganography.

2.1 LSB Matching Steganalysis:

LSB matching is a minor modification of LSB steganography. Instead of replacing the LSBs of the cover image pixels, LSB matching randomly adds 1 to or subtracts 1 from a pixel value if its LSB does not match the message bit.
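The embedding rule can be sketched as follows. This is a minimal sketch under stated assumptions: the message is written into the first pixels in raster order (a real scheme would normally select pixels with a secret key), pixels are 8-bit, and the function name is hypothetical.

```python
import numpy as np


def lsb_matching_embed(pixels, bits, rng):
    """Embed bits into the first len(bits) pixels by LSB matching (+/-1).

    Sequential pixel selection is a simplifying assumption of this sketch.
    """
    stego = np.asarray(pixels, dtype=np.int64).copy()
    flat = stego.ravel()  # view into stego, so edits below modify stego
    for k, bit in enumerate(bits):
        if (flat[k] & 1) != bit:
            step = rng.choice([-1, 1])  # random +1 or -1
            if flat[k] == 0:
                step = 1                # stay inside the 8-bit range
            elif flat[k] == 255:
                step = -1
            flat[k] += step
    return stego.astype(np.uint8)


rng = np.random.default_rng(1)
cover = rng.integers(0, 256, size=(8, 8))
message = rng.integers(0, 2, size=32)
stego = lsb_matching_embed(cover, message, rng)
print(np.array_equal(stego.ravel()[:32] & 1, message))
```

Unlike LSB replacement, a pixel with an even value may become smaller as well as larger, which is why the pairs of values produced by LSB replacement do not appear.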

In [2,10] the authors focus on image steganalysis using higher-order image statistics based on neighborhood information of pixels (NIP) to distinguish stego images from original ones. They subtract the gray values of adjacent pixels to capture neighborhood information. Adjacent pixels in a neighborhood carry more local information about the image itself and are suitable for steganalysis. Inspired by these ideas, they developed their image steganalysis feature sets based on NIP.

Let $a_{x,y}$ be the gray value of the image pixel at coordinates $(x, y)$. Its neighbor set is $N(x,y) = \{a_{x-1,y}, a_{x,y+1}, a_{x+1,y}, a_{x,y-1}\}$, whose elements are the gray values around pixel $(x, y)$. The gray value of the center pixel is subtracted from that of each neighbor, and the differences are thresholded with $T$. More details are presented as follows.

The differenced and thresholded set $DS(x,y)$ for a neighbor set $N(x,y)$ is

$$DS(x,y) = \{DS_1(x,y), DS_2(x,y), DS_3(x,y), DS_4(x,y)\}$$

$$DS_1 = Tsh(a_{x-1,y} - a_{x,y}), \quad DS_2 = Tsh(a_{x,y+1} - a_{x,y}), \quad DS_3 = Tsh(a_{x+1,y} - a_{x,y}), \quad DS_4 = Tsh(a_{x,y-1} - a_{x,y})$$

Here $Tsh(\cdot)$ denotes thresholding when the input is larger than $T$ or smaller than $-T$, defined as follows:

$$Tsh(x) = \begin{cases} T, & x > T \\ -T, & x < -T \\ x, & \text{otherwise} \end{cases}$$

After thresholding, the elements of $DS$ take values from $-T$ to $T$, so $DS$ has $(2T+1)^4$ possible states for any single pixel, because each of its four elements can take any of the $(2T+1)$ values. Although thresholding reduces the number of possible states of $DS$, and $T$ can be set to a very small number, $(2T+1)^4$ states are still too many for computing a histogram of $DS$. Hence states that are equivalent under rotation are combined: in principle, states that are rotations of one another are mapped to the same value. Each $DS(x,y)$ is mapped to a code $c(x,y) \in [1, (2T+1)^4]$ such that rotation-equivalent states are identically and uniquely coded.

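After coding, the histogram $H$ of the coded $DS(x,y)$ is calculated as the feature set:

$$H = (h_1, h_2, \ldots, h_{(2T+1)^4}), \qquad h_i = \sum_{x,y} \delta(c(x,y), i), \qquad i = 1, 2, \ldots, (2T+1)^4 \qquad (2\text{-}1)$$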

Although the dimensionality of $H$ at this step equals $(2T+1)^4$, some bins of $H$ are always zero because of this special encoding method. These definite zero bins can be identified by a simple analysis. Removing the redundant zero bins yields a feature set denoted $F$, whose dimensionality is lower than that of $H$. The flowchart of feature extraction is presented in figure 2.1.

Figure 2.1 The flow chart of extracting feature
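The steps of figure 2.1 can be sketched as follows. This is only a sketch and not the implementation of [2]: it assumes that "rotation" means a cyclic rotation of the four neighbors around the center pixel, and instead of coding into $[1, (2T+1)^4]$ and then deleting the zero bins it assigns consecutive codes to the rotation classes directly. The function names are hypothetical.

```python
import itertools

import numpy as np


def rotation_code_table(T):
    """Map each 4-tuple in [-T, T]^4 to a code shared by its cyclic rotations."""
    table, next_code = {}, 0
    for s in itertools.product(range(-T, T + 1), repeat=4):
        canon = min(s[k:] + s[:k] for k in range(4))  # canonical rotation
        if canon not in table:
            table[canon] = next_code
            next_code += 1
        table[s] = table[canon]
    return table, next_code


def nip_feature(img, T=3):
    """Normalized histogram of rotation-invariant codes of thresholded differences."""
    a = np.asarray(img, dtype=np.int64)
    c = a[1:-1, 1:-1]  # centre pixels (image border is skipped)
    # Neighbours in cyclic order: (x-1, y), (x, y+1), (x+1, y), (x, y-1).
    diffs = [a[:-2, 1:-1] - c, a[1:-1, 2:] - c, a[2:, 1:-1] - c, a[1:-1, :-2] - c]
    ds = np.stack([np.clip(d, -T, T) for d in diffs], axis=-1).reshape(-1, 4)
    table, n_codes = rotation_code_table(T)
    hist = np.zeros(n_codes)
    for s in map(tuple, ds.tolist()):
        hist[table[s]] += 1
    return hist / hist.sum()


table, n_codes = rotation_code_table(3)
print(n_codes)
```

Under this assumption, T = 3 gives 2401 raw states that collapse into 616 rotation classes, which agrees with the feature dimensionality of 616 quoted in this section.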

They ran experiments on the BOWS2 image database. The performance of the feature sets is assessed by their detection rate on test samples, using the true positive (TP) rate, the true negative (TN) rate, and the average rate (AR) to compare detection performance.

Considering the trade-off between preserving the neighbor structure and keeping the feature dimensionality low, T = 3 is an acceptable setting. Besides the previously defined neighbor set, the neighbors of pixel $(x, y)$ can also be defined as the set of adjacent pixels on the diagonal and mirror diagonal, $\{a_{x-1,y-1}, a_{x+1,y-1}, a_{x+1,y+1}, a_{x-1,y+1}\}$, and the NIP feature can be extracted with the same procedure as described. The dimensionality of this type of NIP feature is also 616.

The results given in [2] show that the TN and TP rates for LSB matching are higher for the horizontal and vertical NIP, and that the results are better at 0.25 bpp than at 0.15 bpp embedding.

In equation 2-1, $\delta(x, y) = 1$ if $x = y$ and $\delta(x, y) = 0$ if $x \neq y$.

[Figure 2.1 flow chart: Image, Neighboring pixels difference, Thresholding with T, Rotation invariant coding, Calculate normalized histogram, NIP feature set.]



2.2 LSB Steganalysis:

LSB steganography replaces the LSBs of randomly selected pixels in the cover image with the secret message bits. The selection of pixels may be determined by a secret key.

In [6] the authors developed a steganalysis algorithm based on RGB to HSI colour model conversion. It was tested on a stego-image database obtained by implementing various RGB least significant bit steganographic algorithms.
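The key-driven LSB replacement described at the start of this section can be sketched as follows. This is a minimal sketch, assuming 8-bit pixels and using the secret key as the seed of a pseudo-random permutation; the function names are hypothetical and the scheme is not the one implemented in [6].

```python
import numpy as np


def lsb_replace_embed(pixels, bits, key):
    """Replace the LSBs of key-selected pixels with the message bits."""
    stego = np.asarray(pixels, dtype=np.uint8).copy()
    flat = stego.ravel()  # view into stego
    # The key seeds the pixel selection (an assumption of this sketch).
    idx = np.random.default_rng(key).permutation(flat.size)[:len(bits)]
    flat[idx] = (flat[idx] & 0xFE) | np.asarray(bits, dtype=np.uint8)
    return stego


def lsb_extract(stego, n_bits, key):
    """Read the message back from the same key-selected pixels."""
    flat = np.asarray(stego, dtype=np.uint8).ravel()
    idx = np.random.default_rng(key).permutation(flat.size)[:n_bits]
    return flat[idx] & 1


rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
message = rng.integers(0, 2, size=64)
stego = lsb_replace_embed(cover, message, key=1234)
print(np.array_equal(lsb_extract(stego, 64, key=1234), message))
```

Each pixel changes by at most one gray level, which is why the cover and stego images are hard to tell apart visually in the RGB domain.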

Commonly used colour models include HSI, HSV, and RGB. Any colour model can be converted to another using mathematical expressions. In the HSI model, the Hue, Saturation and Intensity values are each derived from all three of the R, G and B values, so any change in the red, green or blue value is reflected in all the values of the HSI colour model. The RGB to HSI conversion can be derived mathematically with respect to the normalized values of RGB, and the expressions are shown below.

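$$H = \cos^{-1}\left\{ \frac{\frac{1}{2}\left[(R - G) + (R - B)\right]}{\left[(R - G)^2 + (R - B)(G - B)\right]^{1/2}} \right\} \qquad (2\text{-}2)$$

$$S = 1 - \frac{3}{(R + G + B)}\left[\min(R, G, B)\right] \qquad (2\text{-}3)$$

$$I = \frac{1}{3}(R + G + B) \qquad (2\text{-}4)$$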
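Equations 2-2 to 2-4 can be evaluated per pixel as in the sketch below. Two details are assumptions that are not stated in the text: a small constant `eps` guards against division by zero for gray pixels, and the hue is reflected to $2\pi - H$ when B > G, which is the usual convention for extending the inverse cosine to the full circle. The function name is hypothetical.

```python
import numpy as np


def rgb_to_hsi(rgb):
    """Convert normalized RGB (values in [0, 1]) to H (radians), S and I."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-12  # assumption: avoids 0/0 for gray or black pixels
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    h = np.arccos(np.clip(num / den, -1.0, 1.0))       # equation 2-2
    h = np.where(b > g, 2.0 * np.pi - h, h)            # assumed convention
    s = 1.0 - 3.0 * np.minimum(np.minimum(r, g), b) / (r + g + b + eps)  # 2-3
    i = (r + g + b) / 3.0                              # equation 2-4
    return h, s, i


# Pure green: hue 120 degrees, full saturation, intensity 1/3.
h, s, i = rgb_to_hsi([0.0, 1.0, 0.0])
print(np.degrees(h), s, i)
```

Because every output depends on all three inputs, a one-level LSB change in any colour channel perturbs H, S and I together, which is the property the method of [6] relies on.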

In the proposed model, any given input image is converted to the HSI colour model, and by careful observation the stego image can be differentiated from the cover image, as shown in figures 2.2 and 2.3.

The original and stego-image of the Ace picture are shown in the RGB colour model in figure 2.2, where it is difficult to differentiate the cover image from the stego-image by visual perception. When the same images are converted to the HSI colour model, visual distortion can be seen in the top rows of the Ace stego-image, as shown in figure 2.3.

The proposed method was tested only on stego-images generated by LSB steganography algorithms. Input images were chosen from various categories such as natural sceneries, birds and animals. The images given as input were generated with the authors' own Stego Image Generator (SIG) tool.



Figure 2.2 (a) Cover Image in RGB Colour Model (b) Stego-Image in RGB Colour Model [6]

Figure 2.3 (a) Cover Image in HSI Colour Model (b) Stego-Image in HSI Colour Model [6]

2.3 JSteg steganalysis:

JSteg embeds secret information into a cover image by successively replacing the LSBs of the quantized DCT coefficients (other than 0 and 1) with secret message bits.

As discussed in [1], the JSteg algorithm introduces Pairs of Values (PoVs) as a result of sequential bit-flipping. These PoVs can be illustrated by extracting all of an image's AC DCT coefficients and tallying their frequencies of occurrence. Splitting the values into bins and centering the display on a specified range of values yields a plot referred to as a histogram.


For a clean image, we expect the histogram to show a linear distribution of the frequencies of the DCT coefficients on either side of zero. As the values have not been altered by any embedding process, there is a clear structure to the values that provides a characteristic for detecting steganography (see figure 2.4).

Figure 2.4 The histogram of a clean 8-bit JPEG image [1] .

Figure 2.4 shows the histogram of a clean 8-bit JPEG image. As expected, the values increase in frequency in a linear fashion towards zero, and decrease after it. If we compare this histogram with that of a JSteg stegogramme at 80% embedding capacity, we can see how important a role the PoVs play in steganalysis.

Figure 2.5 The histogram of an 8-bit stegogramme produced using JSteg [1].

Figure 2.5 shows that the PoVs created by JSteg's bit-flipping methodology are apparent in the stegogramme's histogram. All of the values (except 0 and 1, in which JSteg does not embed) can be paired with their neighboring values because their frequencies of occurrence have become very similar. For example, the value -2 occurs roughly as often as the value -1, and similarly the value 2 occurs roughly as often as the value 3. This trait is characteristic of a stegogramme created by bit-flipping.
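The equalization of frequencies within each pair of values can be reproduced on synthetic data. The sketch below is an illustration only: it uses a rounded Laplacian sample as a stand-in for quantized DCT coefficients instead of reading a real JPEG file, embeds at full capacity, and uses hypothetical function names.

```python
import numpy as np


def jsteg_like_embed(coeffs, bits):
    """Sequentially replace the LSB of every coefficient other than 0 and 1.

    coeffs is a 1-D integer array standing in for quantized DCT coefficients.
    """
    out = np.asarray(coeffs, dtype=np.int64).copy()
    usable = np.flatnonzero((out != 0) & (out != 1))[:len(bits)]
    payload = np.asarray(bits[:len(usable)], dtype=np.int64)
    out[usable] = (out[usable] & ~1) | payload  # two's-complement LSB replace
    return out


def pov_imbalance(coeffs, k):
    """Relative frequency gap inside the pair of values (2k, 2k + 1)."""
    c = np.asarray(coeffs)
    a = int(np.sum(c == 2 * k))
    b = int(np.sum(c == 2 * k + 1))
    return abs(a - b) / max(a + b, 1)


rng = np.random.default_rng(0)
cover = np.rint(rng.laplace(0.0, 3.0, size=200_000)).astype(np.int64)
stego = jsteg_like_embed(cover, rng.integers(0, 2, size=cover.size))
for k in (1, -1):  # the pairs (2, 3) and (-2, -1)
    print(k, pov_imbalance(cover, k), pov_imbalance(stego, k))
```

In the cover sample the two values of a pair occur with clearly different frequencies, while after embedding the gap shrinks to the level of statistical noise, which is the effect visible in figure 2.5.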



2.4 Universal (blind) image Steganalysis using Blockiness:

Perhaps the most important aspect of blind steganalysis is ensuring that we can derive an estimate of the cover image that is as accurate as possible. The attacks that follow this procedure often compare the data in the estimated cover image with that of the suspect image, so it is imperative that the estimate is as sound as possible so as not to obscure the results.

One of the most famous approaches for creating an estimate of the cover image is the model proposed by Jessica Fridrich in [1,7], known as JPEG calibration. The method takes advantage of the fact that most stego-systems encode the message data in the transform domain during the compression procedure to produce JPEG stegogrammes. Given that the JPEG compression algorithm operates on 8x8 blocks of the image, and that it is within these blocks that the encoding of the message operates, we can estimate the cover work by introducing a new block structure and comparing the result with the suspect image. A large difference suggests that the suspect image is a stegogramme, whereas a small difference typically indicates that the image is innocent.

The calibration process decompresses the suspect image, removes 4 pixels from each side, and then recompresses the result using the same quantisation table. Visually, and technically (by measures such as PSNR), the calibrated image is still very close to the suspect image. However, by cropping the image and recompressing it, we effectively break the block structure of the suspect image, because the second compression does not take the first into account.

2.4.1 Comparing The Calibrated image against the original image:

Perhaps the most effective way of determining how similar the images are is to overlay the plots of their histograms. Figure 2.6 shows the histograms of the cover image, the calibrated image, and the stegogramme. The stegogramme was created by embedding a message at 50% capacity using the F4 steganographic algorithm.

As Figure 2.6 shows, the histograms of the cover image and the calibrated image are very close together, meaning that the calibration process has been successful: it has created an image with roughly the same statistical properties as the original cover image, even though we had no access to the cover image at any point. Compare these two histograms against that of the stegogramme, and note that the histogram of the stegogramme is much more distant from those of the cover image and the calibrated image. If we were to eliminate the plot for the cover image in Figure 2.6, we would be left with two rather different plots. We could guess from this information that the suspect image is a stegogramme, but as the calibration process relies heavily on the information in the quantized DCT coefficients, this histogram cannot be used alone to make a final decision. The author illustrates a blind steganalytic method for evaluating the probability that the image is a stegogramme.



Figure 2.6 Comparing the histogram of the calibrated image with that of the cover image [1].

2.4.2 Blockiness:

An estimate of the cover image was derived in the previous part. We now need to find a statistical property that differs between the calibrated image and the suspect image, so that we can determine the probability that the image is a stegogramme.

One of the strongest methods for achieving this is known as Blockiness, which takes advantage of the fact that JPEG-driven stego-systems encode the message data in the same 8x8 blocks that are used for compression. The method is defined best by Dongdong Fu in [1,7,11], where it is stated that:

"[Blockiness] defines the sum of spatial discontinuities along the boundary of all 8x8 blocks of JPEG images".

Essentially, the logic behind Blockiness is that a stegogramme will contain a different set of coefficients across the boundaries of each 8x8 block from that of a clean image. We can therefore total the sums over the boundaries column-wise and row-wise for both a suspect image and a clean image (or our calibrated image) and then calculate the difference between the two. A large difference suggests that the image is a stegogramme, whilst a small difference is probably due to compression and therefore reflects a clean image. The formula for calculating the Blockiness of an image is shown in equation 2-5.

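$$B = \sum_{i=1}^{\lfloor (M-1)/8 \rfloor} \sum_{j=1}^{N} \left| g_{8i,j} - g_{8i+1,j} \right| + \sum_{j=1}^{\lfloor (N-1)/8 \rfloor} \sum_{i=1}^{M} \left| g_{i,8j} - g_{i,8j+1} \right| \qquad (2\text{-}5)$$

where $g_{i,j}$ is the value of the pixel at coordinates $(i, j)$ in an $M \times N$ grayscale image.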

To express this in graphical terms, consider Figure 2.7. It first shows the boundaries of the 8x8 blocks in (a), and then shows what these values look like in the spatial domain in (b). The red lines indicate the columns whose indices are multiples of 8, and the yellow lines represent their neighboring columns, whose indices are a multiple of 8 plus 1. For each boundary, the values in the yellow column are subtracted from those in the red column. Similarly, the values in the green rows are subtracted from those in the blue rows. The absolute values of these differences are then summed over both directions to yield the blockiness value.

Figure 2.7 Graphical representation of the blockiness algorithm[1].
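The blockiness measure of equation 2-5 can be sketched in a few lines. The sketch assumes the standard form of the measure, with absolute pixel differences summed across every 8x8 block boundary of a grayscale image held as a 2-D array; the function name is hypothetical.

```python
import numpy as np


def blockiness(gray):
    """Sum of absolute differences across all 8x8 block boundaries."""
    g = np.asarray(gray, dtype=np.int64)  # avoid uint8 wrap-around
    M, N = g.shape
    # 0-based index of the first row/column after each boundary,
    # i.e. rows 8i+1 and columns 8j+1 in the 1-based notation of eq. 2-5.
    rows = np.arange(8, M, 8)
    cols = np.arange(8, N, 8)
    across_rows = np.abs(g[rows - 1, :] - g[rows, :]).sum()
    across_cols = np.abs(g[:, cols - 1] - g[:, cols]).sum()
    return int(across_rows + across_cols)


# A 16x16 image with a step of height 10 exactly on the vertical boundary.
step = np.zeros((16, 16), dtype=np.uint8)
step[:, 8:] = 10
print(blockiness(step))  # 16 rows x |0 - 10| = 160
```

In a calibration-based detector, this value would be computed for both the suspect image and its calibrated version, and the difference between the two used as the feature.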

2.5 Universal (blind) image Steganalysis:

In [8] the authors propose a simple but effective method for blind image steganalysis based on run-length histogram analysis. Higher-order statistics of the characteristic functions of three types of image run-length histograms are selected as features. Their method is described below.

For a given image, a run-length matrix p(i, j) is defined as the number of runs with pixels of gray level i and run length j. For a run-length matrix p(i, j), let M be the number of gray levels and N be the maximum run length. The image run-length histogram (RLH) can be defined as a vector:

$$H(j) = \sum_{i=1}^{M} p(i, j), \qquad 1 \le j \le N$$

For example, after LSB data embedding, the values of some image pixels will be increased or decreased by one. These changes directly influence the image RLH. A concurrent change occurs: long runs in the image break into short runs, leading to a smaller number of long runs and a larger number of short runs. As a result, the image RLH shrinks. Although there are also cases in which short runs are combined into a long run, these combinations are much less frequent than the splitting of long runs because of the spatial correlation of natural images. Figure 2.8 shows the RLHs of the Lena image before and after data hiding, where the shrinkage is clearly seen.

Figure 2.8 Example of run length histograms of original and stego Lena [8]. Horizontal axis: Run Length; vertical axis: Log(Number of Run Length + 1).
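The shrinkage effect can be reproduced on a toy example. The sketch below is a minimal illustration, not the procedure of [8]: it counts runs along the horizontal direction only, the function name `run_length_histogram` is hypothetical, and a single changed pixel stands in for LSB embedding.

```python
def run_length_histogram(img):
    """Return {run length: number of runs}, summed over all gray levels.

    Runs are counted along each row (horizontal direction only).
    """
    hist = {}
    for row in img:
        if not row:
            continue
        run = 1
        for prev, cur in zip(row, row[1:]):
            if cur == prev:
                run += 1
            else:
                hist[run] = hist.get(run, 0) + 1
                run = 1
        hist[run] = hist.get(run, 0) + 1  # close the last run of the row
    return hist


cover = [[100] * 8]                                  # one run of length 8
stego = [[100, 100, 100, 101, 100, 100, 100, 100]]   # one pixel changed by +1
print(run_length_histogram(cover))
print(run_length_histogram(stego))
```

A single changed pixel replaces the run of length 8 with runs of length 3, 1 and 4, which is the splitting of long runs described above.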

2.5.1 Parameterized Run-Length Representations:

The two run-length representations defined below produce many more long runs in the image run-length matrix than the traditional run-length matrix does. As a result, the shrinkage of the corresponding RLH caused by data hiding becomes much more obvious, and hence the RLH is more sensitive to data embedding, as can be seen in Figure 2.9.

For natural images, the number of short runs is significantly larger than the number of long runs in an image RLH. The maximal run length is usually very limited compared to the range of possible length values (see Figure 2.9). In order to make the shrinkage of the image RLH more obvious, and so make the RLH more sensitive to data embedding, the authors define two new run-length representations. Both are variations of the traditional run-length matrix p(i, j) that group pixels into the same run using different rules and parameters.

Quantization Run-length Representation:

First, apply intensity quantization to the image plane using a quantization step factor Q. Then calculate the RLH of the quantized image matrix. For example, for a 256 gray-level image with Q = 2, we get a new image matrix whose intensity values range from 0 to 127. Hence, the number of long runs in this new image RLH increases compared to the original image RLH, because each pair of neighboring intensities falls into the same run. Obviously, the larger Q is, the more long runs we can expect. The traditional image run-length matrix is just the special case of the quantization run-length matrix with Q = 1.
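A small sketch of the quantization run-length idea follows. It assumes that quantization is floor division by Q, which matches the 0 to 127 range quoted for Q = 2 but is not confirmed by the source; the helper names `quantize` and `run_lengths` are hypothetical.

```python
def quantize(img, q):
    """Map each gray value v to v // q (floor division is an assumption)."""
    return [[v // q for v in row] for row in img]


def run_lengths(row):
    """Lengths of the maximal runs of equal values in one row."""
    lengths = []
    run = 0
    prev = None
    for v in row:
        if run and v == prev:
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 1
        prev = v
    if run:
        lengths.append(run)
    return lengths


row = [124, 125, 125, 126, 127, 127]
print(run_lengths(row))                    # traditional runs (Q = 1)
print(run_lengths(quantize([row], 2)[0]))  # runs after quantization with Q = 2
```

With Q = 2 the six pixels collapse into two runs of length 3, whereas the unquantized row has four short runs.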

Difference Run-length Representation:

A run in this type of representation is defined as a string of pixels whose inter-pixel absolute intensity difference along a direction does not exceed a given threshold. Thus, a string of consecutive pixels with small intensity differences forms a single run. For example, for a string of 4 image pixels with intensities of 124, 125, 125 and 126, the corresponding traditional run-length matrix entries are p(124, 1), p(125, 2) and p(126, 1), while the corresponding difference run-length matrix entry for a threshold of 2 is p(124, 4). Similarly, the larger the threshold is, the more long runs we can obtain. When the threshold is 0, the difference run-length matrix is simply the traditional image run-length matrix.
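The worked example above can be checked with a short sketch. Two details are assumptions made for illustration only: a pixel joins the current run when it differs from the previous pixel by at most the threshold, and each run is labeled with the intensity of its first pixel. The function name `difference_runs` is hypothetical.

```python
def difference_runs(row, threshold):
    """Return (first intensity, run length) pairs for one row.

    A pixel extends the current run when its absolute difference to the
    previous pixel is at most `threshold` (an assumed reading of the rule).
    """
    if not row:
        return []
    runs = []
    start, length = row[0], 1
    for prev, cur in zip(row, row[1:]):
        if abs(cur - prev) <= threshold:
            length += 1
        else:
            runs.append((start, length))
            start, length = cur, 1
    runs.append((start, length))
    return runs


pixels = [124, 125, 125, 126]
print(difference_runs(pixels, 0))  # traditional runs
print(difference_runs(pixels, 2))  # a single run starting at 124
```

With a threshold of 0 the function reduces to ordinary run-length coding, which mirrors the special case stated in the text.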

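Figure 2.9 (a) Quantization RLHs with Q = 4; (b) difference RLHs with a difference threshold of 2 [8]. Horizontal axis in both panels: Run Length; vertical axis: Log(Number of Run Length + 1).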

The authors chose a commonly used image database, the CorelDraw database, for their experiments. In total, 1142 images from CorelDraw version 11 CD #4 were collected as the original images. Six sets of stego images were then generated using six different stego-algorithms, covering both spatial-domain and transform-domain embedding.

2.6 Universal JPEG steganalysis:

In [9], the authors propose a blind image steganalysis method based on intra-block and inter-block neighboring joint density.

The neighboring joint densities on both the intra-block and inter-block level are extracted from the DCT coefficient array. After the feature space has been constructed, a binary classifier such as an SVM is used for training and classification (the classifier itself is not discussed here).

2.6.1 Statistical models and information hiding:

A model probability density function (PDF) can characterize the statistical behavior of a signal. For multimedia signals, the Generalized Gaussian distribution (GGD) is often used. The GGD can be applied to model the distribution of Discrete Cosine Transform (DCT) coefficients, wavelet transform coefficients, pixel differences, etc. Thus, it might be used in video and geometry compression, watermarking, etc. The GGD is also known in economics as the Generalized Error Distribution (GED). The probability density function of a continuous random variable following the GGD takes the form:

$$p(x; \alpha, \beta) = \frac{\beta}{2\alpha\,\Gamma(1/\beta)} \exp\left(-\left(\frac{|x|}{\alpha}\right)^{\beta}\right) \tag{2-7}$$

$$\Gamma(z) = \int_{0}^{\infty} e^{-t}\, t^{z-1}\, dt, \qquad z > 0 \tag{2-8}$$

where $\Gamma(z)$ is the Gamma function, the scale parameter $\alpha$ models the width of the PDF peak, and the shape parameter $\beta$ models the shape of the distribution.

There exists a dependency between the compressed DCT coefficients and their neighbors. Information hiding will modify the neighboring joint density of the DCT coefficients. Let the left or upper adjacent DCT coefficient be denoted by the random vector $X_1$ and the right or lower adjacent DCT coefficient by the random vector $X_2$; let $X = (X_1, X_2)$. When hidden data are embedded in the compressed DCT domain of JPEG images by any steganographic algorithm, the neighboring joint probability density of the DCT coefficients is affected, and these changes are helpful for steganalysis.
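Returning to the GGD density of equation 2-7, it can be evaluated directly with the standard library. The sketch below assumes the usual symbols, with `alpha` as the scale and `beta` as the shape parameter; the function name `ggd_pdf` is hypothetical.

```python
import math


def ggd_pdf(x, alpha, beta):
    """Generalized Gaussian density with scale alpha > 0 and shape beta > 0."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x) / alpha) ** beta))


# beta = 1 gives a Laplacian, beta = 2 a Gaussian with variance alpha**2 / 2.
for beta in (0.5, 1.0, 2.0):
    step = 0.001
    # Crude Riemann sum over [-60, 60] as a check that the density integrates to about 1.
    area = sum(ggd_pdf(k * step, 1.0, beta) for k in range(-60000, 60001)) * step
    print(beta, round(area, 3))
```

Smaller values of `beta` give a sharper peak and heavier tails, which is why low shape values are typically used for DCT coefficients.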

The change in joint density due to message embedding is shown by the following example. Figure 2.10 shows the cover image, the F5-embedded image and the Steghide-embedded image, each with its neighboring joint density: (a) the compressed DCT neighboring joint density of the cover, (b) the neighboring joint density distribution of an F5 steganogram carrying some hidden data, and (c) the neighboring joint density distribution of a Steghide steganogram carrying some hidden data. From the figure it is clear that the neighboring joint density is approximately symmetric about the origin. Panel (d) shows the difference between the neighboring joint density of each steganogram and that of the cover image. Thus, embedding a message modifies the neighboring joint density.

Figure 2.10 The DCT neighboring joint density probabilities and the difference between the cover and the steganograms [9]: (a) a cover and neighboring joint density; (b) the F5 steganogram and neighboring joint density; (c) the Steghide steganogram and neighboring joint density; (d) the difference of the neighboring joint density.

Neighboring Joint Density Features:

Information hiding modifies the neighboring joint density. When messages are embedded in the compressed DCT domain of JPEG images by any steganographic algorithm, the neighboring joint probability density of the DCT coefficients is affected, which gives a way to perform steganalysis.

2.6.2 Feature Extraction:

The neighboring joint density features are extracted from the DCT coefficient array on both the intra-block and the inter-block level, as shown below.

Let F denote the compressed DCT coefficient array of a JPEG image, which consists of $M \times N$ blocks $F_{ij}$ ($i = 1, 2, \ldots, M$; $j = 1, 2, \ldots, N$), each of size $8 \times 8$. The intra-block neighboring joint density matrix in the horizontal direction, $NJ_{1h}$, and the matrix in the vertical direction, $NJ_{1v}$, are constructed as follows:

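$$NJ_{1h}(x, y) = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{m=1}^{8} \sum_{n=1}^{7} \delta\left(c_{ijmn} = x,\; c_{ijm(n+1)} = y\right)}{56\,MN} \tag{2-9}$$

$$NJ_{1v}(x, y) = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \sum_{m=1}^{7} \sum_{n=1}^{8} \delta\left(c_{ijmn} = x,\; c_{ij(m+1)n} = y\right)}{56\,MN} \tag{2-10}$$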

where $c_{ijmn}$ stands for the compressed DCT coefficient located at the $m$-th row and the $n$-th column of block $F_{ij}$, and $\delta = 1$ if and only if its arguments are satisfied (otherwise $\delta = 0$). For computational efficiency, the intra-block neighboring joint density features $NJ_1$ are calculated by:

$$NJ_{1}(x, y) = \frac{NJ_{1h}(x, y) + NJ_{1v}(x, y)}{2} \tag{2-11}$$

Here the values of x and y are in the range $[-6, +6]$, so $NJ_1$ has $13 \times 13 = 169$ features. Similarly, the inter-block neighboring joint density matrix in the horizontal direction, $NJ_{2h}$, and the matrix in the vertical direction, $NJ_{2v}$, are constructed as follows:

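$$NJ_{2h}(x, y) = \frac{\sum_{m=1}^{8} \sum_{n=1}^{8} \sum_{i=1}^{M} \sum_{j=1}^{N-1} \delta\left(c_{ijmn} = x,\; c_{i(j+1)mn} = y\right)}{64\,M(N-1)} \tag{2-12}$$

$$NJ_{2v}(x, y) = \frac{\sum_{m=1}^{8} \sum_{n=1}^{8} \sum_{i=1}^{M-1} \sum_{j=1}^{N} \delta\left(c_{ijmn} = x,\; c_{(i+1)jmn} = y\right)}{64\,(M-1)N} \tag{2-13}$$

The inter-block neighboring joint density $NJ_2$ is calculated by:

$$NJ_{2}(x, y) = \frac{NJ_{2h}(x, y) + NJ_{2v}(x, y)}{2} \tag{2-14}$$

Similarly, the values of x and y are in the range $[-6, +6]$, and $NJ_2$ has 169 features. Hence 169 features are extracted from each of the intra-block and inter-block neighboring joint densities, giving 338 features in total from the DCT coefficient array.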
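The feature extraction above can be sketched as follows. This is an illustrative reading of equations 2-9 to 2-14, not the code of [9]: the DCT array is assumed to be given as a nested list `blocks[i][j][m][n]` of already decoded, quantized coefficients with at least 2 x 2 blocks, and all function names are hypothetical.

```python
def _pair_density(pairs, norm, lim):
    """Normalized count of each (x, y) pair with x, y in [-lim, lim]."""
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    rng = range(-lim, lim + 1)
    return [counts.get((x, y), 0) / norm for x in rng for y in rng]


def intra_block(blocks, lim=6):
    """NJ1: neighbors inside each 8x8 block (equations 2-9 to 2-11)."""
    M, N = len(blocks), len(blocks[0])
    flat = [blocks[i][j] for i in range(M) for j in range(N)]
    horiz = [(b[m][n], b[m][n + 1]) for b in flat for m in range(8) for n in range(7)]
    vert = [(b[m][n], b[m + 1][n]) for b in flat for m in range(7) for n in range(8)]
    h = _pair_density(horiz, 56 * M * N, lim)
    v = _pair_density(vert, 56 * M * N, lim)
    return [(hv + vv) / 2 for hv, vv in zip(h, v)]


def inter_block(blocks, lim=6):
    """NJ2: same position in adjacent blocks (equations 2-12 to 2-14)."""
    M, N = len(blocks), len(blocks[0])
    cells = [(m, n) for m in range(8) for n in range(8)]
    horiz = [(blocks[i][j][m][n], blocks[i][j + 1][m][n])
             for i in range(M) for j in range(N - 1) for m, n in cells]
    vert = [(blocks[i][j][m][n], blocks[i + 1][j][m][n])
            for i in range(M - 1) for j in range(N) for m, n in cells]
    h = _pair_density(horiz, 64 * M * (N - 1), lim)
    v = _pair_density(vert, 64 * (M - 1) * N, lim)
    return [(hv + vv) / 2 for hv, vv in zip(h, v)]


def njd_features(blocks, lim=6):
    """Concatenate the 169 intra-block and 169 inter-block features."""
    return intra_block(blocks, lim) + inter_block(blocks, lim)


# Toy input: 2 x 2 blocks of all-zero coefficients.
zero_blocks = [[[[0] * 8 for _ in range(8)] for _ in range(2)] for _ in range(2)]
print(len(njd_features(zero_blocks)))
```

Coefficient pairs outside the range [-6, +6] are simply not counted, so each half of the feature vector sums to at most 1.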

2.6.3 Feature Based JPEG Steganalysis using Neighboring Joint Density Based Features:

From the intra-block neighboring joint density 169 features are extracted, and from the inter-block neighboring joint density another 169, so that in total 338 distinguishing statistics are available for steganalysis. After the features are extracted from both stego and clean images, they are given to a binary classifier such as an SVM for training. After training is completed, the features from test images are given to the classifier for classification.



2.7 Improving Steganographic Security:

There are some factors that may influence steganographic security, such as the number of changed pixels/coefficients, the properties of cover images, etc. In the following, some techniques for making steganography less detectable are discussed [4], with respect to the basic model (Figure 1.1).

Increasing the Embedding efficiency:

If cover images do not need to be modified at all to convey secret information, the warden certainly cannot differentiate cover images from stego images. Therefore, if the probability of modifying the image is lower, the embedding changes to the image are reduced, and the security of the steganographic method may increase. The embedding efficiency is defined as the number of embedded bits per embedding change. Hence, increasing the embedding efficiency is a possible way to enhance steganographic security.
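The definition of embedding efficiency can be made concrete by exhaustive counting. The sketch below compares plain LSB replacement with a small syndrome-based scheme that hides 2 bits in 3 LSBs while changing at most one of them. The second scheme is an illustration chosen here and is not described in this report; all function names are hypothetical, and cover LSBs and message bits are assumed to be uniformly distributed.

```python
from itertools import product


def lsb_replacement_efficiency():
    """Embedded bits per change when each message bit overwrites one LSB."""
    bits = changes = 0
    for cover_lsb, message_bit in product((0, 1), repeat=2):
        bits += 1
        changes += cover_lsb != message_bit
    return bits / changes


def syndrome(lsbs):
    """XOR of the 1-based positions whose LSB is 1."""
    s = 0
    for pos, bit in enumerate(lsbs, start=1):
        if bit:
            s ^= pos
    return s


def embed(lsbs, message):
    """Embed a 2-bit message (0..3) in three LSBs, changing at most one."""
    flip = syndrome(lsbs) ^ message
    out = list(lsbs)
    if flip:
        out[flip - 1] ^= 1
    return out


def syndrome_efficiency():
    """Embedded bits per change, averaged over all covers and messages."""
    bits = changes = 0
    for lsbs in product((0, 1), repeat=3):
        for message in range(4):
            stego = embed(lsbs, message)
            bits += 2
            changes += sum(a != b for a, b in zip(lsbs, stego))
    return bits / changes


print(lsb_replacement_efficiency(), syndrome_efficiency())
```

The receiver recovers the message by computing `syndrome` on the stego LSBs, so the same payload is carried with fewer changes per bit than with plain LSB replacement.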

Reducing the Embedding Distortion:

Increasing the embedding efficiency can reduce the embedding changes to the image. However, it cannot guarantee that the distortion to the image is minimized. If not all of the coefficients are used for carrying data, Alice has the freedom to select for modification the coefficients whose resultant distortions after data embedding are the smallest. In this way, the stego image will be close to the cover image perceptually and statistically, thus enhancing the steganographic security.

Selecting Proper Cover Images:

In some scenarios, Alice has the freedom to select the least suspicious stego images for conveying secret information. The better images can be chosen according to the available knowledge of a potential steganalyzer.



REFERENCES

[1] Philip Bateman, "Image Steganography and Steganalysis", Thesis of Master of Science in Security Technologies & Applications, University of Surrey, United Kingdom, August 2008.

[2] Qingxiao Guan, Jing Dong, and Tieniu Tan, "An Effective Image Steganalysis Method Based on Neighborhood Information of Pixels", 18th IEEE International Conference on Image Processing, 2011.

[3] R. Chandramouli, "A mathematical framework for active steganalysis", Springer Multimedia Systems, vol. 9, pp. 303-311, 2003.

[4] Bin Li, Junhui He, Jiwu Huang, Yun Qing Shi, "A Survey on Image Steganography and Steganalysis", International Journal of Information Hiding and Multimedia Signal Processing, Vol. 2, No. 2, April 2011.

[5] Niels Provos, Peter Honeyman, "Hide and Seek: An Introduction to Steganography", Journal of IEEE Computer Society, vol. 03, June 2003.

[6] P. Thiyagarajan, G. Aghila and V. Prasanna Venkatesan, "Steganalysis using Colour Model Conversion", Signal & Image Processing: An International Journal (SIPIJ), Vol. 2, No. 4, December 2011.

[7] Dongdong Fu, Yun Q. Shi, Dekun Zou, Guorong Xuan, "JPEG Steganalysis Using Empirical Transition Matrix in Block DCT Domain", IEEE Conference of Multimedia Signal Processing, pp. 310-313, Oct. 2006.

[8] Jing Dong, Tieniu Tan, "Blind Image Steganalysis based on Run-length Histogram Analysis", 15th IEEE International Conference on Image Processing, 2008.

[9] Arun R, Nithin Ravi S and Thiruppathi K, "Intra Block and Inter Block Neighboring Joint Density Based Approach for JPEG steganalysis", International Journal on Soft Computing (IJSC), Vol. 3, No. 2, May 2012.

[10] Vajiheh Sabeti, Shadrokh Samavi, Shahram Shirani, "An adaptive LSB matching steganography based on octonary complexity measure", Springer Science+Business Media, LLC, 2012.

[11] Kanchan Patil, Ravindra Gupta, Gajendra Singh, "Digital Image Steganalysis Schemes for Breaking Steganography", International Journal of Computer Applications (IJCA), 2012.
