
Copy-Paste Forgery Detection

• Exposing Digital Forgeries by Detecting Duplicated Image Regions (A. Popescu and H. Farid)

• Detection of Copy-Move Forgery in Digital Images (J. Fridrich, D. Soukal, and J. Lukas)

Samah Ghazawi

Overview

• Copy-Paste Forgery.
• Copy-Paste Forgery Detection.
• Possible Approaches.
• Exhaustive approach.
• Autocorrelation.
• Exact block matching.
• PCA - Principal component analysis.
• Copy-Paste Forgery Detection Using PCA.
• DCT - Discrete cosine transform.
• Copy-Paste Forgery Detection Using DCT.

Copy-Paste Forgery

• A copy-paste attack is one in which a part of the image is copied and pasted somewhere else in the same image, with the intent to cover an important image feature.

Copy-Paste Forgery Detection

• Given an image with MN pixels (an image of size M×N), our task is to determine whether it contains duplicated regions of unknown location and shape.

Possible Approaches

• Exhaustive approach (1).
• Exhaustive approach (2).
• Autocorrelation approach.
• Exact block matching.

Exhaustive approach (1)

• Examining every possible pair of regions would have exponential complexity in the number of image pixels.

Such an approach is obviously computationally prohibitive.

Exhaustive approach (2)

• In this method, the image and its circularly shifted version are overlaid, looking for closely matching image segments.

• Let us assume that $x_{ij}$ is the pixel value of a grayscale image of size M×N at the position i, j.

• The following differences are examined:

$$|x_{ij} - x_{i+k \bmod M,\; j+l \bmod N}|$$

where $k = 0, 1, \dots, M-1$ and $l = 0, 1, \dots, N-1$, for all i and j.

• Comparing $x_{ij}$ with its cyclical shift [k,l] is the same as comparing $x_{ij}$ with its cyclical shift [k’,l’], where k’=M–k and l’=N–l.

• For each shift [k,l], the differences are calculated and thresholded with a small threshold t.

• The total computational requirements are proportional to $(MN)^2$: there are MN shifts, and each one compares MN pixels.
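The shift-and-compare search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 7×7 demo image, and the `min_matches` cut-off are hypothetical and do not come from the papers. The two nested loops over shifts, each comparing every pixel, are what make the cost grow with the square of the pixel count.

```python
def make_demo_image():
    """7x7 image with unique pixel values, then a 2x2 patch copied by the shift (3, 2)."""
    image = [[7 * i + j for j in range(7)] for i in range(7)]
    for di in range(2):
        for dj in range(2):
            image[3 + di][2 + dj] = image[di][dj]
    return image


def matching_pixels(image, k, l, t):
    """Positions (i, j) whose value differs from the pixel at the cyclic shift [k, l] by at most t."""
    M, N = len(image), len(image[0])
    return [
        (i, j)
        for i in range(M)
        for j in range(N)
        if abs(image[i][j] - image[(i + k) % M][(j + l) % N]) <= t
    ]


def exhaustive_search(image, t=0, min_matches=4):
    """Try the cyclic shifts and keep those with at least min_matches close pixels."""
    M, N = len(image), len(image[0])
    found = {}
    # Shift [k, l] pairs the same pixels as [M-k, N-l], so k only needs to run to M // 2.
    for k in range(M // 2 + 1):
        for l in range(N):
            if (k, l) == (0, 0):
                continue  # the zero shift matches everything trivially
            hits = matching_pixels(image, k, l, t)
            if len(hits) >= min_matches:
                found[(k, l)] = hits
    return found


suspects = exhaustive_search(make_demo_image())
print(suspects)
```

On the demo image the only reported shift is (3, 2), with the four pixels of the copied patch as matches. On a real image a threshold t > 0 and a much larger `min_matches` would be needed.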

Autocorrelation

• The autocorrelation function measures how strongly the original image correlates with a shifted copy of itself, for different shifts.

• The original and copied segments will introduce peaks in the autocorrelation for the shifts that correspond to the copied-moved segments.

• Autocorrelation, when computed directly for the image itself, would have very large peaks at the image corners and their neighborhoods.

Autocorrelation

• Natural images contain most of their power in low-frequencies.

• The autocorrelation is therefore computed not from the image directly, but from its high-pass filtered version.

Autocorrelation

• Assuming the minimal size of a copied-moved segment is K, the autocorrelation copy-move detection method consists of the following steps:

1. Apply the Marr high-pass filter to the tested image.
2. Compute the autocorrelation r of the filtered image.
3. Remove half of the autocorrelation, because the autocorrelation is symmetric.

Autocorrelation

4. Set r = 0 in the neighborhood of the two remaining corners of the entire autocorrelation.
5. Find the maximum of r, identify the shift vector, and examine the shift using the exhaustive method. This is now computationally efficient because we do not have to perform the exhaustive search for many different shift vectors.

6. If the detected area is larger than K, finish, else repeat Step 5 with the next maximum of r.
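Steps 1 to 5 can be sketched as follows. Several things here are assumptions made for illustration: a 4-neighbour Laplacian with wrap-around stands in for the Marr filter, the autocorrelation is computed by brute force rather than with an FFT, and `exclude_radius`, the function names, and the small demo image are hypothetical. The sketch only ranks candidate shifts; each candidate would still have to be checked with the exhaustive comparison for that single shift.

```python
def high_pass(image):
    """4-neighbour Laplacian with wrap-around (a simple stand-in for the Marr filter)."""
    M, N = len(image), len(image[0])
    return [
        [
            4 * image[i][j]
            - image[(i - 1) % M][j] - image[(i + 1) % M][j]
            - image[i][(j - 1) % N] - image[i][(j + 1) % N]
            for j in range(N)
        ]
        for i in range(M)
    ]


def autocorrelation(image):
    """Circular autocorrelation r[k][l] = sum over i, j of x[i][j] * x[i+k][j+l]."""
    M, N = len(image), len(image[0])
    return [
        [
            sum(image[i][j] * image[(i + k) % M][(j + l) % N]
                for i in range(M) for j in range(N))
            for l in range(N)
        ]
        for k in range(M)
    ]


def candidate_shifts(image, exclude_radius=1):
    """Filter, correlate, keep roughly half the shifts, blank the corners, rank the peaks."""
    M, N = len(image), len(image[0])
    r = autocorrelation(high_pass(image))
    ranked = []
    for k in range(M // 2 + 1):  # r[k][l] equals r[-k][-l], so half of the shifts suffice
        for l in range(N):
            # Shifts close to zero (the corners of r) always correlate strongly: skip them.
            if max(min(k, M - k), min(l, N - l)) <= exclude_radius:
                continue
            ranked.append((r[k][l], (k, l)))
    ranked.sort(reverse=True)
    return [shift for _, shift in ranked]


demo = [[(3 * i * i + 5 * j + i * j) % 11 for j in range(5)] for i in range(5)]
shifts_to_examine = candidate_shifts(demo)
```

The first entry of `shifts_to_examine` is the strongest peak away from the corners. If the area found for that shift is too small, the next entry is tried, as in step 6.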

Autocorrelation

• Although this method is simple and does not have a large computational complexity, it often fails to detect the forgery unless the size of the forged area is at least ¼ of the linear image dimensions.

Exact Block Matching

• Given an image of size M×N:
• Divide the image into overlapping square blocks of size B×B pixels.
• Copy the pixel values from each block into a vector of size $B^2$ (the block vector).

• Build an array A with (M–B+1)(N–B+1) rows, in which each row is one block vector.

• Lexicographically order the rows of matrix A.

[Figure: the array of block vectors and the same array after ordering]

• Go through all rows of the ordered matrix and look for two consecutive rows that are identical.

• Complexity: $O(MN \log_2(MN))$ steps for the lexicographic ordering.
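The block-matching steps above can be sketched as follows. The function names and the 7×7 demo image are hypothetical; the sketch relies on Python comparing tuples lexicographically, so a plain `sort()` gives the required ordering of the block vectors.

```python
def make_demo_image():
    """7x7 image with unique pixel values, then a 2x2 patch copied from (0, 0) to (3, 2)."""
    image = [[7 * i + j for j in range(7)] for i in range(7)]
    for di in range(2):
        for dj in range(2):
            image[3 + di][2 + dj] = image[di][dj]
    return image


def block_vectors(image, B):
    """One (vector, position) entry per overlapping BxB block, position = top-left corner."""
    M, N = len(image), len(image[0])
    return [
        (tuple(image[i + di][j + dj] for di in range(B) for dj in range(B)), (i, j))
        for i in range(M - B + 1)
        for j in range(N - B + 1)
    ]


def exact_block_matches(image, B):
    """Sort the block vectors and report consecutive rows that are identical."""
    rows = block_vectors(image, B)
    rows.sort()  # lexicographic on the block vector, ties broken by position
    return [
        (pos_a, pos_b)
        for (vec_a, pos_a), (vec_b, pos_b) in zip(rows, rows[1:])
        if vec_a == vec_b
    ]


matches = exact_block_matches(make_demo_image(), B=2)
print(matches)
```

For the demo image with B = 2 the array has (7–2+1)(7–2+1) = 36 rows, and the only identical pair is the block at (0, 0) and its copy at (3, 2).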

Results

• Exact match analysis did not show any exactly matching blocks.

Results

• If the forged image had been saved as JPEG, the vast majority of identical blocks would have disappeared, because the match would become only approximate and not exact.

And differently…

• Ordered array of what ?!

Principal Component Analysis – PCA

Discrete Cosine Transform - DCT

• PCA of a data set X.
• DCT of a data set X.

Principal Component Analysis - PCA

• Mainstay of modern data analysis.
• Used abundantly in all forms of analysis, from neuroscience to computer graphics.
• Simple, non-parametric method of extracting relevant information from confusing data sets.
• Reduces a complex data set to a lower dimension to reveal the sometimes hidden, simplified structure that underlies it.
• Computes the most meaningful basis to re-express a noisy, garbled data set.

Principal Components

• Orthogonal directions of greatest variance in data.

• First principal component is the direction of greatest variability (variance) in the data.

• Second is the next orthogonal (uncorrelated) direction of greatest variability. So first remove all the variability along the first component, and then find the next direction of greatest variability.

• And so on …

• Principal components with larger associated variances represent interesting dynamics, while those with lower variances represent noise.

• The principal components are orthogonal.

Dimensionality Reduction

Can ignore the components of lesser significance.

• You do lose some information, but if the eigenvalues are small, you don’t lose much.
• n dimensions in original data.
• Calculate n eigenvectors and eigenvalues.
• Choose only the first p eigenvectors, based on their eigenvalues.
• Final data set has only p dimensions.

• The number of measurement types is the dimension of the data set.

• Each data sample is a vector in m dimensional space.

• Naïve Basis: B is the identity matrix I, X = IX.

• PCA asks: Is there another basis, which is a linear combination of the original basis, that best re-expresses our data set?

• Assuming linearity simplifies the problem by (1) restricting the set of potential bases and (2) formalizing the implicit assumption of continuity in a data set.

Let The Math Begin

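• Let X and Y be m×n matrices related by a linear transformation P.
• X is the original data set.
• Y is a re-representation of that data set.
• $p_i$ are the rows of P, $x_i$ are the columns of X, and $y_i$ are the columns of Y.

$$PX = Y$$

$$PX = \begin{bmatrix} p_1 \\ \vdots \\ p_m \end{bmatrix} \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}, \qquad Y = \begin{bmatrix} p_1 \cdot x_1 & \cdots & p_1 \cdot x_n \\ \vdots & \ddots & \vdots \\ p_m \cdot x_1 & \cdots & p_m \cdot x_n \end{bmatrix}$$

• The row vectors $\{p_1, \dots, p_m\}$ in this transformation will become the principal components of X.

• Covariance matrix (for zero-mean measurements):

$$C_X = \frac{1}{n-1} X X^T$$

• The diagonal terms of $C_X$ are the variances of particular measurement types.
• The off-diagonal terms of $C_X$ are the covariances between measurement types.

• Find some orthonormal matrix P where Y = PX such that $C_Y = \frac{1}{n-1} Y Y^T$ is diagonalized.

$$C_Y = \frac{1}{n-1} (PX)(PX)^T = \frac{1}{n-1} P (X X^T) P^T = \frac{1}{n-1} P A P^T, \qquad A = X X^T$$

• A is symmetric.
• A is diagonalized by an orthogonal matrix of its eigenvectors:

$$A = E D E^T$$

• D is a diagonal matrix.
• E is a matrix of eigenvectors of A arranged as columns.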

PCA - Now comes the trick

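• Select the matrix P to be a matrix where each row $p_i$ is an eigenvector of $X X^T$.

• Therefore $P = E^T$ and $P^{-1} = P^T$.
• Therefore

$$C_Y = \frac{1}{n-1} P (X X^T) P^T = \frac{1}{n-1} P (P^T D P) P^T = \frac{1}{n-1} (P P^{-1}) D (P P^{-1}) = \frac{1}{n-1} D$$

where $C_Y = \frac{1}{n-1} Y Y^T$ is the covariance matrix of Y = PX and D is the diagonal matrix of eigenvalues of $X X^T$.

• The principal components of X are the eigenvectors of $X X^T$ (or the rows of P).

• The i-th diagonal value of $C_Y$ is the variance of X along $p_i$.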
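A small numeric check of this result, as a sketch for the two-dimensional case only. It assumes the covariance is normalized by n – 1 and uses the closed-form eigen-decomposition of a symmetric 2×2 matrix; the function name `pca_2d` and the sample points are hypothetical.

```python
import math


def pca_2d(samples):
    """PCA of 2-D samples [(x, y), ...]: returns (variances, components), largest variance first."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Entries of the covariance matrix [[a, b], [b, c]] of the mean-centred data.
    a = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    c = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    centre, spread = (a + c) / 2, math.hypot((a - c) / 2, b)
    var_1, var_2 = centre + spread, centre - spread
    # Eigenvector for the larger eigenvalue; the second component is orthogonal to it.
    if b != 0:
        v = (b, var_1 - a)
    else:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    p_1 = (v[0] / norm, v[1] / norm)
    p_2 = (-p_1[1], p_1[0])
    return (var_1, var_2), (p_1, p_2)


# Points on the line y = x: all of the variance lies along the diagonal direction.
variances, components = pca_2d([(0, 0), (1, 1), (2, 2), (3, 3)])
print(variances, components)
```

For these points the first component points along the diagonal and carries all of the variance, while the variance along the second component is zero, so dropping it loses nothing.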

Copy-Paste Forgery Detection Using PCA

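• Using PCA, compute the new $N_t$-dimensional representation, $a_i$, of each b pixel image block, $i = 1, \dots, N_b$, where $N_b$ is the number of blocks.

• The value of $N_t$ is chosen to satisfy:

$$1 - \epsilon = \frac{\sum_{i=1}^{N_t} \lambda_i}{\sum_{i=1}^{b} \lambda_i}$$

Where:
• $\lambda_i$: eigenvalues as computed by the PCA.
• b: number of pixels per block.
• $\epsilon$: fraction of the ignored variance along the principal axes.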
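One way to apply this criterion in practice is sketched below. Because the eigenvalues are discrete, the equality can rarely be met exactly, so this sketch assumes the smallest number of components that keeps at least a fraction 1 – ε of the total variance. The function name and the example eigenvalues are hypothetical.

```python
def choose_num_components(eigenvalues, epsilon):
    """Smallest count of leading eigenvalues that keeps at least (1 - epsilon) of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    kept = 0.0
    for count, value in enumerate(ordered, start=1):
        kept += value
        if kept >= (1.0 - epsilon) * total:
            return count
    return len(ordered)


# Four eigenvalues with total variance 10: ignoring 25% of it needs two components.
print(choose_num_components([6, 2, 1, 1], epsilon=0.25))
```

With ε = 0 every component is kept, and larger values of ε give shorter block representations at the cost of coarser matching.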

• For color images:
(1) analyze each color channel separately, or
(2) build a single color block of size 3b pixels.

• Build an $N_b \times b$ matrix whose rows are given by the component-wise quantized coordinates:

$$\lfloor a_i / Q \rfloor$$

Where, Q: number of quantization bins.

• Sort the rows of the above matrix in lexicographic order to yield a matrix S.
• $s_i$ denote the rows of S.
• $(x_i, y_i)$ denote the position of the block's image coordinates (top-left corner) that corresponds to $s_i$.

• For every pair of rows $s_i$ and $s_j$ from S such that $|i - j| < N_n$, place the pair of coordinates $(x_i, y_i)$ and $(x_j, y_j)$ onto a list.

Where, $N_n$: number of neighboring rows to search in the lexicographically sorted matrix.

• For all elements in this list, compute their offsets, defined as:

$$\begin{cases} (x_i - x_j,\; y_i - y_j) & \text{if } x_i - x_j > 0 \\ (x_j - x_i,\; y_j - y_i) & \text{if } x_i - x_j < 0 \\ (0,\; |y_i - y_j|) & \text{if } x_i = x_j \end{cases}$$

• Discard all pairs of coordinates with an offset frequency less than $N_f$.

• Discard all pairs whose offset magnitude, $\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$, is less than $N_d$.

Where, $N_f$: minimum frequency threshold. $N_d$: minimum offset threshold.

• From the remaining pairs of blocks build a duplication map by constructing a zero image of the same size as the original, and coloring all pixels in a duplicated region with a unique grayscale intensity value.
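The pairing, offset filtering, and map-building steps can be sketched as below. It starts from already quantized block representations, so the PCA and quantization are not repeated here. The assumptions are: positions are treated as (row, column) of the top-left corner, an offset and its negation are counted as the same offset, and each surviving offset gets its own label 1, 2, ... as its "intensity". All function names and the toy input are hypothetical.

```python
import math
from collections import Counter


def normalized_offset(p, q):
    """Offset between two positions, with the sign fixed so that (d, e) and (-d, -e) coincide."""
    d, e = p[0] - q[0], p[1] - q[1]
    if d < 0 or (d == 0 and e < 0):
        d, e = -d, -e
    return d, e


def duplicated_pairs(rows, positions, n_n, n_f, n_d):
    """rows[i] is the quantized representation of the block whose top-left corner is positions[i]."""
    order = sorted(range(len(rows)), key=lambda i: (rows[i], positions[i]))
    # Pair every row with its neighbours in the sorted matrix: |a - b| < n_n.
    pairs = [
        (positions[order[a]], positions[order[b]])
        for a in range(len(order))
        for b in range(a + 1, min(a + n_n, len(order)))
    ]
    frequency = Counter(normalized_offset(p, q) for p, q in pairs)
    return [
        (p, q)
        for p, q in pairs
        if frequency[normalized_offset(p, q)] >= n_f
        and math.hypot(p[0] - q[0], p[1] - q[1]) >= n_d
    ]


def duplication_map(shape, pairs, block_side):
    """Zero image in which every block of a surviving pair is filled with its offset's label."""
    height, width = shape
    out = [[0] * width for _ in range(height)]
    labels = {}
    for p, q in pairs:
        label = labels.setdefault(normalized_offset(p, q), len(labels) + 1)
        for row_0, col_0 in (p, q):
            for row in range(row_0, row_0 + block_side):
                for col in range(col_0, col_0 + block_side):
                    out[row][col] = label
    return out


rows = [(1, 1), (2, 2), (1, 1), (2, 2), (9, 9), (7, 3)]
positions = [(0, 0), (0, 1), (5, 5), (5, 6), (2, 0), (3, 3)]
pairs = duplicated_pairs(rows, positions, n_n=2, n_f=2, n_d=2)
dup_map = duplication_map((8, 8), pairs, block_side=2)
```

In the toy input only the offset (5, 5) occurs twice, so the two pairs sharing it survive and the map marks the source and destination regions with the same label.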

Results

Discrete Cosine Transform - DCT

• Important to numerous applications in science and engineering, from lossy compression of audio and images, to spectral methods for the numerical solution of partial differential equations.

• There are eight standard DCT variants, of which four are common.

• Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers.

• Transforms an image from the spatial domain to the frequency domain.

• Helps separate the image into parts of differing importance.

• Expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies.

[Figure: the DCT produces the coefficient array F(u,v)]

Let The Math Begin – Again !

• Given an image N by M, the basic operation of the DCT is as follows:
• f(i,j) is the intensity of the pixel in row i and column j.
• F(u,v) is the DCT coefficient in row u and column v of the DCT matrix.
• For JPEG images the DCT input is usually an 8 by 8 (or 16 by 16) array of integers.
• This array contains each image window’s respective color band pixel levels.

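• For a 2D N by M image, the 2D DCT is defined as:

$$F(u,v) = \sqrt{\frac{2}{N}} \sqrt{\frac{2}{M}} \, \Lambda(u) \Lambda(v) \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \cos\left[\frac{\pi u}{2N}(2i+1)\right] \cos\left[\frac{\pi v}{2M}(2j+1)\right] f(i,j)$$

where,

$$\Lambda(\xi) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$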
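As a sketch, the orthonormal 2D DCT (type II) can be evaluated directly from its double sum. The function name `dct2` is hypothetical, and the direct sum is only meant to illustrate the definition: it is far slower than the fast transforms used in practice.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of an N-by-M block, evaluated directly from the double sum."""
    N, M = len(block), len(block[0])

    def scale_factor(index):
        return 1 / math.sqrt(2) if index == 0 else 1.0

    scale = math.sqrt(2 / N) * math.sqrt(2 / M)
    return [
        [
            scale * scale_factor(u) * scale_factor(v) * sum(
                block[i][j]
                * math.cos(math.pi * u * (2 * i + 1) / (2 * N))
                * math.cos(math.pi * v * (2 * j + 1) / (2 * M))
                for i in range(N)
                for j in range(M)
            )
            for v in range(M)
        ]
        for u in range(N)
    ]


# A flat 8x8 block has all of its energy in the DC coefficient F(0, 0).
coefficients = dct2([[100] * 8 for _ in range(8)])
print(round(coefficients[0][0], 6))
```

For the flat block of value 100 the DC coefficient is 800 and every other coefficient vanishes up to rounding error, which is why smooth image regions need so few coefficients.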

DCT basis functions

• Computationally easier to implement and more efficient to regard the DCT as a set of basis functions.

• Image is partitioned into 8 x 8 regions — The DCT input is an 8 x 8 array of integers.

• Why 8 x 8?!
• The output array of DCT coefficients contains integers; these can range from -1024 to 1023.

Copy-Paste Forgery Detection Using DCT

• Scan the image from the upper left corner to the lower right corner while sliding a B×B block.

• Calculate the DCT transform for each block.
• Quantize the DCT coefficients and store them as one row in a matrix A.
• The quantization steps are calculated from a user-specified parameter Q.
• Lower values of the Q-factor produce more matching blocks, possibly some false matches.

• Lexicographically sort the rows of A.
• If two consecutive rows of the sorted matrix A are found to be identical, store the positions of the matching blocks in a separate list.
• The coordinates of the upper left pixel of a block can be taken as its position.

• Let (i1, i2) and (j1, j2) be the positions of the two matching blocks.

• The shift vector s between the two matching blocks is calculated as s = (s1, s2) = (i1 – j1, i2 – j2)

• Because the shift vectors –s and s correspond to the same shift, the shift vectors s are normalized, if necessary, by multiplying by –1 so that s1 ≥ 0.

• For each matching pair of blocks, increment the normalized shift vector counter C by one:

C(s1, s2) = C(s1, s2) + 1

• The shift vector counter C is initialized to zero.
• The counter C indicates the frequencies with which different normalized shift vectors occur.

• Find all normalized shift vectors s(1), s(2), …, s(K) whose occurrence exceeds a user-specified threshold T: C(s(r)) > T for all r = 1, …, K.
• Larger values of T may cause the algorithm to miss some not-so-closely matching blocks, while too small a value of T may introduce too many false matches.

• Look at the mutual positions of each matching block pair and output a specific block pair only if there are many other matching pairs in the same mutual position.

• For all normalized shift vectors, the matching blocks that contributed to that specific shift vector are colored with the same color and thus identified as segments that might have been copied and moved.
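The shift-vector bookkeeping can be sketched as follows, starting from a list of matching block positions (the DCT, quantization, and sorting are assumed to have run already). The function names and the toy list of matches are hypothetical, and the tie-break for s1 = 0 (making s2 non-negative) is an assumption added so that every shift has a single normalized form.

```python
from collections import Counter


def normalize_shift(p, q):
    """Shift s = p - q, multiplied by -1 if necessary so that s1 >= 0."""
    s1, s2 = p[0] - q[0], p[1] - q[1]
    if s1 < 0 or (s1 == 0 and s2 < 0):
        s1, s2 = -s1, -s2
    return s1, s2


def count_shifts(matches):
    """Counter C: how often each normalized shift vector occurs among the matching pairs."""
    return Counter(normalize_shift(p, q) for p, q in matches)


def blocks_by_frequent_shift(matches, T):
    """Group the matching pairs by normalized shift, keeping only shifts with C(s) > T."""
    C = count_shifts(matches)
    grouped = {shift: [] for shift, count in C.items() if count > T}
    for p, q in matches:
        shift = normalize_shift(p, q)
        if shift in grouped:
            grouped[shift].append((p, q))
    return grouped


matches = [((10, 10), (2, 3)), ((2, 4), (10, 11)), ((11, 10), (3, 3)), ((5, 5), (6, 9))]
segments = blocks_by_frequent_shift(matches, T=2)
print(segments)
```

In the toy list three pairs share the shift (8, 7) and survive T = 2, while the isolated pair with shift (1, 4) is dropped as a likely false match. Each key of `segments` would be painted with its own color in the output image.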

Results

• The elliptic area on the orange hat has been copied to two other locations, and three different shift vectors have been correctly found.

Results

[Figure: false matches]

Results

DCT VS. Exact Block Matching

References

• Exposing Digital Forgeries by Detecting Duplicated Image Regions (A. Popescu and H. Farid)

• Detection of Copy-Move Forgery in Digital Images (J.Fridrich, D. Soukal, and J. Lukas)

• Wikipedia: https://en.wikipedia.org/wiki/Discrete_cosine_transform

• http://www.cs.cf.ac.uk/Dave/Multimedia/node231.html

• A Tutorial On Principal Component Analysis - Derivation, Discussion and Singular Value Decomposition (Jon Shlens)

• http://mipav.cit.nih.gov/pubwiki/index.php/Autocorrelation_Coefficients

• http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.4543&rep=rep1&type=pdf

THANKS
