8/11/2019 Real-Time CompressedDomain Video Watermarking Resistance to Geometric Distortions
Real-Time Compressed-Domain Video Watermarking Resistance to Geometric Distortions
Liyun Wang, Hefei Ling, Fuhao Zou, and Zhengding Lu
Huazhong University of Science and Technology, China
Digital watermarking is an important technique for copyright protection that embeds copyright information into digital works. In video watermarking, the watermark can be added either to uncompressed data or to compressed video streams. Practical video storage and distribution systems, such as video on demand (VoD) services, store and transmit video sequences in compressed format. In these cases, the watermark should be embedded into the compressed video data to avoid fully decoding and re-encoding the video.
The geometric attacks in videos can be eas-
ily implemented by using a nonlinear editor.
However, they are difficult to handle because
they can desynchronize the watermark infor-
mation. Besides resisting geometric distor-
tions, most video watermarking applications
require the watermark to be embedded and
detected in real time. This article focuses on
the video watermarking used for copyright
protection in VoD applications, where both
real-time performance and resistance to geo-
metric distortions are important require-
ments. (See the Related Work in Digital
Watermarking sidebar for previous research.)
Because the histogram shape of the low-frequency subband of the discrete wavelet transform (DWT) is invariant to rotation, scaling, and other geometric distortions, we propose a method that embeds the watermark into the histogram bins of frames in the one-level DWT domain. The video data are partially decoded to obtain block discrete cosine transform (DCT) coefficients, which are subsequently used to construct the one-level DWT. To lower the computational complexity, we use a fast intertransformation between the one-level DWT and block DCTs. Thus, our method can resist many geometric distortions and meet the real-time requirement.
The main contributions of this work are as
follows. First, we have proposed a geometrically
invariant watermarking method by exploiting
the fact that the histogram shape of the low-
frequency subband in DWT domain is insensi-
tive to various geometric distortions. Second,
we use a fast intertransformation to obtain
the DWT coefficients directly from the com-
pressed data instead of using the traditional
method that first decompresses the block
DCTs of frames into pixel data and then applies
DWT to these data. Thus, we significantly re-
duce the computational cost and meet the
real-time requirement.
Basic Principles
Compared with the DCT, the wavelet transform is closer to the human visual system (HVS) because it splits the input image into several frequency bands that can be processed independently. The DWT also causes fewer visual artifacts than the DCT because the wavelet transform does not decompose the image into blocks for processing [1].
In addition, an image's histogram in the spatial domain is approximately invariant to geometric attacks. We can extend this invariance property to the low-frequency subband of the DWT domain in order to design a geometric-invariant watermarking method.
Most compressed video data are stored as
block DCT coefficients and motion vectors.
Multimedia in Forensics, Security, and Intelligence
A proposed real-time video watermarking scheme is transparent and robust to geometric distortions, including rotation with cropping, scaling, aspect ratio change, frame dropping, and swapping.
1070-986X/12/$31.00 © 2012 IEEE. Published by the IEEE Computer Society.
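Equation 2 gives us only one block of the image X, so the whole image can be expressed as

$$
X =
\begin{bmatrix}
B_1 & 0 & \cdots & 0 \\
0 & B_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & B_1
\end{bmatrix}^{-1}_{LS \times LS}
\begin{bmatrix}
C_{11} & C_{12} & \cdots & C_{1M} \\
C_{21} & C_{22} & \cdots & C_{2M} \\
\vdots & \vdots & \ddots & \vdots \\
C_{L1} & C_{L2} & \cdots & C_{LM}
\end{bmatrix}
\begin{bmatrix}
B_1^{T} & 0 & \cdots & 0 \\
0 & B_1^{T} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & B_1^{T}
\end{bmatrix}^{-1}_{MS \times MS}
\tag{3}
$$

The three matrices on the right of Equation 3 are denoted as $B_4$, $C_{part}$, and $B_5$, respectively.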
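The block-diagonal form can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation. Because Equation 2 is not reproduced in this excerpt, it assumes that $B_1$ is the $S \times S$ orthonormal DCT-II matrix, so that the two outer matrices act as block-diagonal inverse-DCT operators. The function names `dct_matrix`, `block_dcts`, and `image_from_block_dcts` are illustrative only.

```python
import numpy as np

def dct_matrix(s):
    """Orthonormal DCT-II matrix D (s x s), so that C = D @ x @ D.T for a block x."""
    n = np.arange(s)
    d = np.sqrt(2.0 / s) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * s))
    d[0, :] = np.sqrt(1.0 / s)  # first row uses the smaller scale factor
    return d

def block_dcts(image, s):
    """Grid of s x s block DCTs (the role of C_part), computed block by block."""
    d = dct_matrix(s)
    out = np.empty(image.shape, dtype=float)
    for r in range(0, image.shape[0], s):
        for c in range(0, image.shape[1], s):
            out[r:r + s, c:c + s] = d @ image[r:r + s, c:c + s] @ d.T
    return out

def image_from_block_dcts(c_part, s):
    """Rebuild the whole image with two block-diagonal matrix products."""
    rows, cols = c_part.shape
    b1_inv = np.linalg.inv(dct_matrix(s))
    left = np.kron(np.eye(rows // s), b1_inv)      # assumed role of B_4
    right = np.kron(np.eye(cols // s), b1_inv.T)   # assumed role of B_5
    return left @ c_part @ right

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.uniform(0, 255, size=(16, 24))       # L = 2, M = 3 blocks of size S = 8
    rebuilt = image_from_block_dcts(block_dcts(img, 8), 8)
    print(np.allclose(rebuilt, img))
```

The point of the sketch is only that a blockwise inverse transform can be written as one matrix product over the whole frame, which is what makes a direct DCT-to-DWT mapping possible.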
We can also compute the one-level DWT coefficients of image X. Here, the DWT is taken using the Haar wavelet, which is the simplest possible wavelet. It is both separable and symmetric and can be expressed in matrix form as

$$R = H X Q^{T} \tag{4}$$

where $H$ and $Q$ are transformation matrices. For the Haar wavelet transform, $H$ contains the Haar basis functions, $h_k(z)$. They are defined over the continuous, closed interval $z \in [0, 1]$ for $k = 0, 1, 2, \ldots, LS - 1$, where $LS = 2^{e}$. To generate $H$, we define the integer $k$ such that $k = 2^{p} + q - 1$, where $0 \le p \le e - 1$, $q = 0$ or $1$ for $p = 0$, and $1 \le q \le 2^{p}$ for $p \ne 0$. Then the Haar basis functions are

$$h_0(z) = h_{00}(z) = \frac{1}{\sqrt{LS}}, \quad z \in [0, 1] \tag{5}$$

and

$$h_k(z) = h_{pq}(z) = \frac{1}{\sqrt{LS}} \begin{cases} 2^{p/2}, & (q-1)/2^{p} \le z \ \ldots \end{cases}$$
To evaluate the invariance of the histogram shape in the DWT domain to geometric distortions, we compute the ratio between the numbers of low-frequency DWT coefficients in each pair of successive bins, denoted by A(k):

$$A(k) = \frac{G(k+1)}{G(k)}, \quad 1 \le k \le L_g - 1 \tag{10}$$

where $G$ is the histogram vector and $L_g$ is the number of bins. We took one frame (of size 720 × 480) from the carriage test video as an example. We implemented four typical geometric distortions: rotation, scaling, aspect ratio change, and cropping. Figure 1 shows that the relative relations in the number of DWT coefficients between two neighboring bins are relatively stable under these geometric distortions. This means the histogram shape is approximately invariant to various geometric distortions, which implies that if we embed the watermark based on the relative relations, we can expect the watermark to resist those geometric distortions.
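The ratio A(k) defined above can be computed in a few lines. This is a minimal sketch under stated assumptions: the function name `bin_ratios`, the use of equal-width bins over a fixed value range, and the NaN convention for empty bins are illustrative choices, not details given in the article.

```python
import numpy as np

def bin_ratios(coeffs, num_bins, value_range):
    """Return A(k) = G(k+1) / G(k) for k = 1 .. num_bins - 1.

    coeffs      : low-frequency DWT coefficients of one frame (any shape)
    num_bins    : number of histogram bins, L_g
    value_range : (low, high) range covered by the histogram (assumed here)
    """
    counts, _ = np.histogram(np.ravel(coeffs), bins=num_bins, range=value_range)
    counts = counts.astype(float)
    ratios = np.full(num_bins - 1, np.nan)   # NaN where G(k) is zero
    nonzero = counts[:-1] > 0
    ratios[nonzero] = counts[1:][nonzero] / counts[:-1][nonzero]
    return ratios

if __name__ == "__main__":
    coeffs = np.array([0.5, 0.5, 1.5, 2.5, 2.5, 2.5, 2.5])
    print(bin_ratios(coeffs, 3, (0.0, 3.0)))
    # Replicating every coefficient (a crude stand-in for upscaling) multiplies
    # all bin counts by the same factor, so the ratios do not change.
    print(bin_ratios(np.repeat(coeffs, 4), 3, (0.0, 3.0)))
```

The second call illustrates why a count ratio is a plausible carrier: an operation that changes every bin count by roughly the same factor leaves the ratios roughly unchanged.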
Proposed Scheme
In this section, we will introduce our video
watermarking algorithm. We first describe
the watermark embedding and then present
the watermark detection.
Watermark Embedding
Because the watermark-embedding process is performed in the one-level DWT domain, the compressed video should be partially decoded to obtain the 2D block DCT coefficients of the frame's luminance. For P- and B-frames, each interblock should be updated by adding its reference block in the I- or P-frames. For intrablocks in P-frames, no updating is necessary.
Figure 2 shows the watermark-embedding process. In one video sequence, F consecutive frames are chosen to form a basic carrier unit for watermark embedding, which we call the watermark minimal sequence (WMS). For each frame in one WMS, we compute the DWT coefficients directly from the block DCTs. Then, we embed the watermark into the histogram bins calculated from the low-frequency subband of the DWT domain.
The binary watermark is denoted as $W = \{w_i \mid i = 1, 2, \ldots, L_w\}$. Each bit of $W$ is either 1 or 0, and $L_w$ is the watermark length. The watermark $W$ is divided into $F$ equal-sized segments, each of which is embedded, in order, into the histogram shape of one frame in each WMS. The steps of the embedding process are as follows.
Step 1. For each WMS, we calculate the
one-level DWT coefficient matrix of every
frame from the block DCTs using Equation 8.
Step 2. We compute the histogram shape
in the DWT domain and acquire the number of
coefficients in each bin.
Figure 1. The effect of the geometric distortions on the histogram shape. We implemented four typical geometric distortions: (a) rotation, (b) scaling, (c) aspect ratio change, and (d) cropping. The relative relations in the number of discrete wavelet transform (DWT) coefficients among groups of two neighboring bins are relatively stable under these geometric distortions.
[Plots: A(k) versus bin index (nBINs, 5 to 30), each comparing the original frame with a distorted one: 20° rotation, scaling of 0.8, 10% cropping, and aspect ratio change to 4:3.]
January-March 2012
For the low-frequency subband coefficients of every frame, the average value is calculated as $V$. An embedding range $U = [(1 - \lambda)V, (1 + \lambda)V]$ is determined, where $\lambda$ is a parameter set to 0.6. The histogram vector produced is denoted by $G_q = \{g_q(j) \mid j = 1, 2, \ldots, L_g\}$, $1 \le q \le F$, where $g_q(j)$ is the number of coefficients in the $j$th bin of the $q$th frame. In order to embed all the bits, $L_g$ should be not less than $2L_w/F$.
Step 3. We embed each watermark bit into two neighboring bins by reassigning the number of coefficients in the two bins. Let $E_1$ and $E_2$ be two consecutive bins in the extracted histogram vector. These bins include $g_q(j)$ and $g_q(j+1)$ coefficients, respectively. We control the relative relation of the two bins in order to embed one bit of information:

$$
\begin{cases}
g_q(j)/g_q(j+1) \ge T, & \text{if } w_i = 1 \\
g_q(j+1)/g_q(j) \ge T, & \text{if } w_i = 0
\end{cases}
\tag{11}
$$

where $T$ is a threshold that controls the number of modified coefficients. We select the threshold by considering the watermark robustness performance and the embedding distortion.
One bit is embedded in two consecutive bins as follows. First, consider the case when $w_i$ is 1. If $g_q(j)/g_q(j+1) \ge T$, no operation is needed. Otherwise, if $g_q(j)/g_q(j+1) < T$, $n_1$ randomly selected coefficients are moved from $E_2$ to $E_1$ so that $g_q(j)'/g_q(j+1)' \ge T$. Similarly, if $w_i$ is 0 and $g_q(j+1)/g_q(j) < T$, $n_2$ randomly selected coefficients are moved from $E_1$ to $E_2$ so that $g_q(j+1)''/g_q(j)'' \ge T$. The coefficients are reassigned according to the following rule:

$$
\begin{cases}
c_{1m}(i) = c_1(i) - M, & 1 \le i \le n_1 \\
c_{2m}(j) = c_2(j) + M, & 1 \le j \le n_2
\end{cases}
\tag{12}
$$

where $c_1(i)$ and $c_2(j)$ are the $i$th and $j$th selected coefficients in $E_2$ and $E_1$, respectively, and $M$ is the bin width. The signs in Equation 12 assume that $E_2$ covers the value interval immediately above $E_1$. The modified $c_{1m}(i)$ and $c_{2m}(j)$ belong to $E_1$ and $E_2$, respectively. Requiring $(g_q(j) + n_1)/(g_q(j+1) - n_1) \ge T$ when $w_i = 1$, and $(g_q(j+1) + n_2)/(g_q(j) - n_2) \ge T$ when $w_i = 0$, gives

$$
\begin{cases}
n_1 = \dfrac{T\,g_q(j+1) - g_q(j)}{1 + T} \\[2ex]
n_2 = \dfrac{T\,g_q(j) - g_q(j+1)}{1 + T}
\end{cases}
\tag{13}
$$

We repeat this procedure until all the watermark bits assigned to the corresponding frame in one WMS are embedded.
Step 4. The modified differences of all
DWT coefficients are inversely transformed
to the modified differences of block DCT
[Figure 2 flowchart: F continuous video frames are selected; the block DCT data (after DQT) are converted from block DCTs to DWT; a histogram is extracted from the low-frequency subband coefficients of the DWT; the coefficients are modified according to the watermark W; the difference of the low-frequency subband of the DWT (all the coefficients are zeros except the modified coefficients) is converted from DWT to block DCTs; and the resulting difference of block DCT data is added to the block DCT data to give the watermarked frames (block DCT data).]
Figure 2. The watermark-embedding process. For each frame in one watermark minimal sequence (WMS), we compute the discrete wavelet transform (DWT) coefficients directly from the block discrete cosine transform (DCT) coefficients.
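Steps 2 and 3 of the embedding procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: bins are taken to be equal-width intervals of width M starting at the lower end of the embedding range, the range parameter (0.6 in the text) is called `lam`, the coefficient count to move is rounded up so the ratio condition is met, and all function names are hypothetical.

```python
import math
import numpy as np

def embedding_range(coeffs, lam=0.6):
    """Embedding range around the mean V of the low-frequency coefficients."""
    v = float(np.mean(coeffs))
    return (1.0 - lam) * v, (1.0 + lam) * v

def coefficients_to_move(g_j, g_j1, bit, threshold):
    """Number of coefficients to move between two neighboring bins.

    g_j, g_j1 : counts in the lower bin E1 and the upper bin E2
    bit       : watermark bit (1 means E1/E2 >= T, 0 means E2/E1 >= T)
    """
    if bit == 1:
        need = (threshold * g_j1 - g_j) / (1.0 + threshold)
    else:
        need = (threshold * g_j - g_j1) / (1.0 + threshold)
    return max(0, math.ceil(need))  # nothing to do if the ratio already holds

def embed_bit(coeffs, low, width, j, bit, threshold, rng):
    """Embed one bit into bins j and j + 1 (0-based) by shifting coefficients."""
    out = np.array(coeffs, dtype=float)
    start = low + j * width
    in_e1 = np.flatnonzero((out >= start) & (out < start + width))
    in_e2 = np.flatnonzero((out >= start + width) & (out < start + 2 * width))
    n = coefficients_to_move(in_e1.size, in_e2.size, bit, threshold)
    if n > 0 and bit == 1:
        out[rng.choice(in_e2, size=n, replace=False)] -= width  # E2 -> E1
    elif n > 0:
        out[rng.choice(in_e1, size=n, replace=False)] += width  # E1 -> E2
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coeffs = np.array([1.0, 2.0] + [float(v) for v in range(10, 20)])
    marked = embed_bit(coeffs, low=0.0, width=10.0, j=0, bit=1, threshold=3.0, rng=rng)
    lower = np.count_nonzero(marked < 10)
    upper = np.count_nonzero(marked >= 10)
    print(lower, upper, lower / upper)
```

In the demonstration, the lower bin starts with 2 coefficients and the upper bin with 10, so 7 coefficients are moved and the final count ratio equals the threshold of 3. Each moved coefficient changes by exactly one bin width, which is the source of the embedding distortion that the threshold trades off against robustness.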
IEEE MultiMedia
extracted watermark is. If the bit error rate (BER) is smaller than the BER threshold, we can successfully extract the watermark in the WMS. Otherwise, we conclude that no watermark is hidden in the WMS. We then slide the window onto the next frame to form a new WMS.
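The threshold test and the sliding window can be sketched as below. The per-frame bit extraction is not described in this excerpt, so the sketch assumes it has already been done and takes the extracted bits as input. The names `bit_error_rate` and `find_watermarked_windows`, and the idea of comparing against a known reference watermark, are assumptions made for illustration.

```python
def bit_error_rate(extracted, reference):
    """Fraction of positions where the extracted bits differ from the reference."""
    if len(extracted) != len(reference):
        raise ValueError("bit sequences must have the same length")
    errors = sum(e != r for e, r in zip(extracted, reference))
    return errors / len(reference)

def find_watermarked_windows(frame_bits, reference, frames_per_wms, ber_threshold=0.23):
    """Slide a window of frames_per_wms frames, one frame at a time.

    frame_bits : list with one list of extracted bits per frame
    reference  : the full watermark, split evenly across the frames of a WMS
    Returns the start indices of windows whose BER is below the threshold.
    """
    hits = []
    for start in range(len(frame_bits) - frames_per_wms + 1):
        window = frame_bits[start:start + frames_per_wms]
        candidate = [bit for frame in window for bit in frame]
        if bit_error_rate(candidate, reference) < ber_threshold:
            hits.append(start)
    return hits

if __name__ == "__main__":
    reference = [1, 0, 1, 1]
    frame_bits = [[0, 0], [1, 0], [1, 1], [0, 1]]   # two bits per frame
    print(find_watermarked_windows(frame_bits, reference, frames_per_wms=2))
```

With a 60-bit watermark and a threshold of 0.23, as used later in the experiments, a window would be accepted when at most 13 bits are wrong.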
Computational Complexity Analysis
A watermarking scheme's real-time performance degrades as its computational complexity grows. Most video watermarking applications require real-time performance, so the watermarking algorithm's complexity should be as low as possible.
In our proposed algorithm, the whole watermark-embedding process includes partial decoding (which consists of variable-length decoding and dequantization), the intertransformation between the DWT coefficients and block DCTs, coefficient modification, and partial encoding. Here we focus on the computational cost of the intertransformation because it accounts for much of the time required for the watermark embedding.
We compared our method based on intertransformation with the traditional method using the IDCT and DWT. For the sake of convenience, suppose the size of one frame is $N \times N$. Both $A_1$ and $A_2$ are sparse matrices, which contributes to significant savings in computational cost. Our fast method costs $O(N^2)$ time to compute the DWT coefficients and transform the watermarked DWT coefficients back to the block DCTs, while the traditional method costs $O(N^2\sqrt{N})$ time. This means our method is asymptotically $\sqrt{N}$ times faster than the traditional method, which gives our proposed watermarking scheme a substantial savings in computational cost.
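To give a feel for the size of this gap, the short sketch below evaluates the two growth terms for a given frame size. It ignores the constant factors hidden by the O-notation, so the numbers indicate scaling only, not measured running time; the function name `asymptotic_speedup` is illustrative.

```python
import math

def asymptotic_speedup(n):
    """Ratio of the traditional growth term N^2 * sqrt(N) to the fast term N^2."""
    traditional = n ** 2 * math.sqrt(n)
    fast = n ** 2
    return traditional / fast

if __name__ == "__main__":
    for n in (256, 512, 1024):
        print(n, round(asymptotic_speedup(n), 1))
```

For a frame of 1,024 × 1,024 samples the ratio of the two terms is 32, and it grows with the square root of the frame side.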
Experiments and Discussion
To test our proposed scheme, we used the four video sequences shown in Figure 4, all of which are widely used for video watermarking tests. The first two sequences are MPEG-2 encoded at 6 megabits per second (Mbps), and the other two are MPEG-1 encoded at 2.2 and 1.5 Mbps, respectively. In the experiments, the length of the embedded watermark is 60 bits. The threshold T is set to 3, and the BER threshold is set to 0.23.
Figure 4. Four test video sequences. The (a) mobile and (b) carriage sequences are MPEG-2 encoded at 6 megabits per second (Mbps), and the (c) Paris and (d) farmer sequences are MPEG-1 encoded at 2.2 and 1.5 Mbps, respectively.
We can objectively assess the visual quality by measuring the peak signal-to-noise ratio (PSNR) of the watermarked frames compared with the original frames. The average PSNR values of the four watermarked video sequences are 39, 40, 39, and 48 decibels (dB), respectively. All the values are higher than 37 dB. Perceptually, the original video and the watermarked video are visually indistinguishable. This implies that the watermarking scheme achieves visual transparency.
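For reference, PSNR between an original and a watermarked frame can be computed as in the sketch below. It uses the standard definition with a peak value of 255, which assumes 8-bit luminance samples; the article does not state the exact settings used, and the function name `psnr` is illustrative.

```python
import numpy as np

def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal shape."""
    original = np.asarray(original, dtype=np.float64)
    watermarked = np.asarray(watermarked, dtype=np.float64)
    mse = np.mean((original - watermarked) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

if __name__ == "__main__":
    frame = np.zeros((4, 4))
    print(psnr(frame, frame + 1.0))  # every sample off by one gray level
```

A sequence-level figure such as those quoted above would then be the average of this value over all frames.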
Estimation of robustness
We implemented a video watermarking attack tool named VBMark, based on the VirtualDub program, to modify the watermarked videos with various types of attacks. We considered the watermarking scheme to be robust if the computed BER is less than the BER threshold (0.23).
In practice, rotations with cropping applied to video signals are slight, with rotation angles of no more than 5 degrees. Even when the rotation angle gradually increases to 35 degrees, the BER values remain less than the BER threshold. In our experiment, we also applied several scale factors (0.7, 0.9, 1.2, and 1.5) to the test video signals.
Frame aspect ratio changes convert the size
of the target video frame. Digital frames come
in several aspect ratios. The most common are
4:3, 11:9, and 16:9. We applied aspect ratios
to the test video signals that differed from
their original ratio.
Table 1 shows the experimental results for each robustness measure: rotation with cropping, scaling, frame aspect ratio changes, and Gaussian low-pass filtering. In each case, the BER values are less than the BER threshold, indicating that our algorithm is robust to these attacks.
Because the watermark is embedded repeatedly, in one WMS in each GOP, it can be detected successfully even if only one WMS is left. That means a frame-dropping attack is not a threat to our proposed method. In our experiment, we dropped two frames and borrowed two frames from the next GOP. As Table 1 shows, the BER is equal to zero, which means we can still extract the watermark correctly.
Frame swapping involves switching the order of frames randomly within one GOP. However, too many frame swaps will degrade video quality. Therefore, we swapped frames three times during our experiments. This caused no significant changes within one GOP, because three swaps do not cause much difference in the temporal frequency domain. Thus, as Table 1 shows, our scheme is robust against frame swapping.
In our last robustness test, we measured robustness against file-format conversion, which is significant because a video's file format is easily changed by some software tools. Our test watermarked video sequences were originally in MPEG-2 or MPEG-1 format. We converted them into the MPEG-4, DivX, Xvid, and H.264 formats and extracted the watermark. The BER is close to zero (see Table 1), which suggests that our algorithm is robust against common file-format conversions.
Robustness Performance Comparison
We compared our scheme to the algorithm proposed by Yulin Wang and Alan Pearmain (Wang's algorithm), which is a typical video watermarking algorithm resistant to geometric attacks [3]. Our method proved robust to the
Table 1. Experimental results for four watermarked sequences (average bit error rate, %).

Attack                              Mobile  Carriage  Paris  Farmer
Rotation with cropping (1°)         0.0     0.0       0.0    0.0
Rotation with cropping (2°)         0.0     0.0       0.0    0.0
Rotation with cropping (5°)         0.0     0.0       0.0    0.0
Rotation with cropping (10°)        0.0     0.0       0.0    0.0
Rotation with cropping (15°)        0.0     0.0       0.0    0.0
Rotation with cropping (20°)        0.0     0.0       0.0    0.0
Rotation with cropping (25°)        1.7     0.0       0.0    0.0
Rotation with cropping (30°)        3.3     0.0       1.7    1.7
Rotation with cropping (35°)        5.0     0.0       3.3    3.3
Scaling to 0.7                      1.7     0.0       3.3    3.3
Scaling to 0.9                      0.0     0.0       0.0    1.7
Scaling to 1.2                      0.0     0.0       0.0    0.0
Scaling to 1.5                      0.0     0.0       0.0    0.0
Aspect to 4:3                       0.0     0.0       0.0    0.0
Aspect to 11:9                      0.0     0.0       0.0    0.0
Aspect to 16:9                      0.0     0.0       0.0    0.0
Gaussian low-pass filtering         0.0     0.0       0.0    3.3
Frame dropping                      0.0     0.0       0.0    0.0
Frame swapping                      0.0     0.0       0.0    0.0
MPEG-4 compression (3,000 Kbps)     0.0     0.0       0.0    1.7
Xvid compression (2,500 Kbps)       0.0     0.0       0.0    1.7
DivX compression (2,000 Kbps)       1.7     1.7       1.7    3.3
H.264 compression (1,000 Kbps)      3.3     4.6       3.3    6.7
attacks listed in Tables 1 and 2, and it outperforms Wang's algorithm. For rotation (RST) attacks, the average BER of our method was much lower than the BER threshold, even when the rotation angle was up to 35 degrees. The average error rate for Wang's algorithm was 2.13 percent for a 1 degree rotation, but it failed when the rotation angle was larger than 2 degrees.
Both algorithms are robust to rescaling. However, our method can resist an aspect ratio attack, with an average BER of approximately 0 percent, but Wang's method cannot resist this attack.
As for video compression, format conversion, and other specialized attacks, both methods can resist format conversion and frame dropping. However, Wang's algorithm is sensitive to frame swapping while our method is not.
Real-Time Performance
In our final measure, we used the normal decoding as a baseline to check whether the watermark embedding and detection can achieve real-time performance rates. We considered the watermarking process acceptable if the consumed time is less than that of the normal decoding, because the process can then finish by the time the decoding ends.
We compared our process with the DEW [4] and Wang's [3] algorithms. DEW has demonstrated excellent performance in real time, but it is vulnerable to geometric distortions. The DEW algorithm also has a low complexity because it embeds the watermark by shifting the end of block (EOB) marker, which avoids re-encoding.
In the real-time experiments, we used the three schemes to embed the same watermark into the carriage video sequence and then attempted to detect the watermark. Figure 5 shows the consumed time of each method and the normal decoding time of the test video. It shows that all methods meet the real-time requirement. The DEW algorithm consumed only 353 and 324 ms, respectively, during the watermark embedding and detection, while Wang's algorithm took 1,362 and 1,521 ms, respectively. Our proposed algorithm took about 1,032 and 857 ms for the watermark embedding and detection, respectively, which is still less than half of the normal decoding time, so both processes can meet the real-time requirement. The DEW algorithm outperformed our method because it doesn't take into consideration resistance to the geometric distortions.
Security Analysis
In our proposed scheme, we achieve robustness using the invariance of the histogram shape of the low-frequency subband coefficients of the one-level DWT domain. The watermark embedding is designed by modulating the ratio between the numbers of low-frequency DWT coefficients in each pair of successive bins. Assuming the bin width is set to an appropriate
Figure 5. Time consumed during the watermark embedding and detection process for the carriage sequence. The experiment was done using a PC with a 2.8-GHz CPU and 512 Mbytes of DDR2 memory.
Table 2. Robustness performance comparison with Wang's method [3] (average bit error rate, %).

Attack                          Our method  Wang's method*
Rotation with cropping (1°)     0.0         2.13
Rotation with cropping (2°)     0.0         -
Rotation with cropping (5°)     0.0         -
Rotation with cropping (10°)    0.0         -
Rotation with cropping (15°)    0.0         -
Scale to 0.7                    1.7         0.0
Scale to 0.9                    0.0         0.0
Aspect to 11:9                  0.0         -
Aspect to 4:3                   0.0         -
Format conversion               0.0         0.0
Frame dropping                  0.0         0.0
Frame swapping                  0.0         -
* A dash (-) indicates that the watermark detection failed.
[Figure 5 bar chart: consumed time (ms, 0 to 4,000) for watermark embedding, watermark detection, and normal decoding; series: our method, Wang's method, DEW, and full decoding.]
value and it is unknown, an adversary attempting to remove the watermark by modifying coefficients randomly will fail. Because random modifications are unlikely to significantly change the number of low-frequency DWT coefficients in each bin, the relative relations of each two successive bins will be unchanged. This implies that the watermark will still be extracted correctly in the watermark-detection process. Consequently, our scheme remains secure against this kind of attack.
Conclusion
In this work, we presented a real-time video
watermarking scheme with high robustness in
the compressed domain. Although we only
tested our proposed scheme on video in the
MPEG-1 and MPEG-2 format, it is suitable for
other DCT-based compressed videos such as
MPEG-4 and H.264 because the DWT domain
could be directly acquired from block DCTs
of any size. This algorithm can be used for
data hiding in many applications such as
authentication and copyright protection. In
the future, we will consider other attacks such as camera capturing. We will also adapt
the algorithm for video data in MPEG-4 and
H.264 formats.
Acknowledgments
This work is supported by the National Science
Foundation of China under grants 60873226
and 60803112, the Fundamental Research
Funds for the Central Universities, and the
Wuhan Youth Science and Technology Chen-
guang Program.
References
1. X.Y. Wang and H. Zhao, "A Novel Synchronization Invariant Audio Watermarking Scheme Based on DWT and DCT," IEEE Trans. Signal Processing, vol. 54, no. 12, 2006, pp. 4835-4840.
2. B.J. Davis and S.H. Nawab, "The Relationship of Transform Coefficients for Differing Transforms and/or Differing Subblock Sizes," IEEE Trans. Signal Processing, vol. 52, no. 5, 2004, pp. 1458-1461.
3. Y. Wang and A. Pearmain, "Blind MPEG-2 Video Watermarking Robust Against Geometric Attacks: A Set of Approaches in DCT Domain," IEEE Trans. Image Processing, vol. 15, no. 6, 2006, pp. 1536-1543.
4. G.C. Langelaar and R.L. Lagendijk, "Optimal Differential Energy Watermarking of DCT Encoded Images and Video," IEEE Trans. Image Processing, vol. 10, no. 1, 2001, pp. 148-158.
Liyun Wang is a PhD student in computer science at the Huazhong University of Science and Technology, China. Her research interests include digital fingerprinting and digital rights management. Wang has a BE in computer science from Huazhong University of Science and Technology. Contact her at
Hefei Ling is an associate professor in the College of Computer Science at the Huazhong University of Science and Technology, China. His research interests include copy and near-duplicate detection, digital watermarking and fingerprinting, and content security and protection. Ling has a PhD in computer science from the Huazhong University of Science and Technology. He is a member of IEEE. Contact him at [email protected] (corresponding author).
Fuhao Zou is an associate professor in the College of Computer Science at the Huazhong University of Science and Technology, China. His research interests include copy and near-duplicate detection. Zou has a PhD in computer science from the Huazhong University of Science and Technology. Contact him at
Zhengding Lu is a professor at the Huazhong University of Science and Technology, China. His research interests include distributed computing, distributed database systems, heterogeneous system integration, and information security. Lu has a PhD in computer science from the Huazhong University of Science and Technology. Contact him at [email protected].