audio digital watermarking for copyright protection · audio digital watermarking for copyright...

AUDIO DIGITAL WATERMARKING FOR COPYRIGHT PROTECTION

Tiberiu MUNTEANa,b, Eric GRIVELa, Ioan NAFORNITAb, Mohamed NAJIMa a. Equipe Signal & Image, UMR-LAP/CNRS, ENSEIRB, Université Bordeaux I,

BP 99, F-33 402 Talence Cedex, France b. Electronics and Telecommunications Faculty, "Politehnica" University of Timişoara

2 Bd. V. Pârvan, 1900 Timişoara,

Abstract. When dealing with copyright protection for audio, watermarking can be considered to label the digital medium, by hiding information into the data. To make the watermark imperceptible and robust against some signal processing, we propose in this paper a new method based on frequential substitution, for which the bit error rate is lower than those obtained with existing methods. For such a purpose, frequential masking threshold is first estimated. Then a BPSK modulation is carried out with an adaptive carrier frequency that depends on the original signal features. To retrieve hidden information, a secret key is generated during the watermark process. Some of the watermarking models are also presented for allowing us to clearly establish our method in watermarking context. Keywords: digital watermarking, spread spectrum.

I. INTRODUCTION

Nowadays, the reproduction of a digital medium such as image, audio, speech and text can be easily completed by using CD writers. However, the drawback of this technological improvement is the illicit distribution of copies, at very low cost and without any deterioration of the quality. This phenomenon has been increasing in an alarming rate, more particularly with audio signals due to MP3 or MPEG-2 AAC coding. A great deal of attention is currently paid to the approaches aiming at protecting the author rights. For instance, an international consortium Secure Digital Music Initiative (SDMI) was created to support researches in that field (http://www.sdmi.org.).

Among the solutions proposed to control this media distribution, inserting hidden information in the original medium can be considered and used for ownership evidence, Cf. figure 1.

Figure 1. Watermarking system

Several techniques for data hiding in audio [1] [3] [7] [8] have been developed since the pioneering works presented by Bender [2]. In order to be effective, the so-called watermarks must satisfy the following constraints:

• be usually robust to intelligent attacks and signal processing such as filtering and resampling [5];

• be statistically invisible and minimally perceptible so that the original signal becomes imperceptibly degraded;

• be inserted into the original signal or in a compressed representation of original signal to avoid a decompression/compression operation if the signal is stored in a compressed form.

There is a compromise to find between the imperceptibility of the inserted information and its robustness against signal processing. This can be achieved for instance by adjusting the power of the inserted information. Cox [5] proposed to embed watermarks in perceptually significant components of the medium to avoid its removal by signal processing operations such as filtering.

Various watermarking schemes have been considered in the literature. On the one hand, the original signal is required to detect whether a watermark is embedded in the analyzed signal or not. On the other hand, the so-called blind and asymmetrical watermarking schemes [6] require a private key to embed the extra information and a public key for the watermark detection.

Two classes of approaches have emerged in the literature; firstly, the temporal addition of the watermark is considered in [3][13][8]. Secondly, frequential substitution is used [7].

Marked signal

Attacks

Original signal Watermark

Watermarking

Distribution

Audition Watermark detection

In [1], Bassia et al. propose to modify the temporal representation of the original signal, by changing the least significant bits of the values representing the audio signal samples. However, the watermarking is neither robust to re-quantification nor to a random change of the least significant bits of marked the signal.

When using echo data hiding [9] [12], the embedded extra information is the delayed audio signal with the same statistical and perceptual features. Nevertheless a trade off must be found to choose the echo that both prevents perceptible distortion of the signal and increases the robustness. Instead of introducing an echo, inserting additive noise can be considered. For instance, Boney et al. [3] and Swanson et al. [13] propose to generate the watermark by using a Pseudo-Noise (PN) sequence. To make the watermark inaudible, the white noise like sequence is filtered by using perceptual frequency masking. The filtered watermark is then weighted by a temporal mask to eliminate the temporal artifacts that eventually appeared. However, the watermark detection, based on a correlation measure between the watermark and the watermarked signal, is completed providing the original signal is available.

Gomes et al. [8] use the concept of asymmetrical watermarking introduced by Furon et al. [6]. They propose to impose a cyclostationnarity property to the watermark. First, the cyclostationnary signature is generated and its spectrum is spread by means of an interleaver. Then, the spread watermark, frequency-shaped by using the psycho acoustical model, is added to the original audio signal. Watermarking detection only requires cycle frequency of the cyclostationnary signature.

An alternative approach, first presented by Garcia [7], is based on frequential substitution. The author proposes to remove the frequential components of the signal that are under the frequential masking threshold and replace them by the audio watermark's ones. While Cox et al. [5] provides experimental results for image watermarking, Garcia et al. [7] propose to use spread spectrum techniques to improve the robustness of the watermarking. Spread spectrum is carried out by multiplying the watermark binary signal by a pseudo noise sequence. A Binary Phase Shift Keying (BPSK) modulation makes it possible to shift the central lobe of the spread spectrum at a pre-determined carrier frequency. However, when the substituted frequential components of the signal do not necessary correspond to the watermark central lobe frequential components, inserting and identifying the watermark are problematic.

Here, our purpose is to improve the bit error rate (BER) of watermark recuperation. For this reason, we propose an improved audio watermarking procedure by spectral substitution based on the techniques presented in [3], [13] and [7]. Our contribution consists in using a BPSK modulation, whose carrier frequency varies and depends

on the original signal features. In addition, inserting the watermark is only completed when the analyzed frames satisfy one criterion.

The article is organized as follows: part 2 deals with watermarking paradigms. In part 3, we present spread spectrum techniques. In part 4, we provide a detailed description of proposed algorithm, part 5 provides the results obtained based on bit error rate. The conclusions and the perspectives are provided in 6th part.

II. WATERMARKING PARADIGMS

i. Watermarking as a communication system

Watermarking techniques were historically developed to allow the identification of the owner of a marked paper. Nowadays, digital watermarking techniques have been developed for a similar use, i.e. copyright protection of digital documents. However, as the research field has matured, watermarking is considered as a way to send information, much like steganography, and its use is no longer restricted to copyright protection.

At the beginning, preserving the perceptual quality of the original data (sound, image or video) was the main criterion for the marking systems.

Testing the robustness of watermarking systems and comparing their performances were not an easy task. Gradually, tools have been developed, especially for images [14] and more recently for sound. A great part of research in the watermarking field aims at finding algorithms more robust against all sorts of attacks. It should be noted that these attacks have also been studied, in a game-like approach of watermarking and attacks [18].

It became obvious that, in copyright protection applications, retrieving the inserted information, i.e. the watermark, is as important as preserving an acceptable quality of the marked work. The impossibility to retrieve the mark implies the impossibility to protect the work and means the loss of that work. Consequently, designing a system that transmits the inserted information with the lowest error rate has become an important goal in watermarking.

Viewing the copyright protection under this angle makes it possible to consider the watermarking systems as the classical telecommunication systems, whose the single goal is to send the information from a source to a receiver with as few errors as possible, Cf. figure 2.

Figure 2. watermarking as communication system

The watermark is considered as the information that must be transmitted in a known channel represented by the original work. On the one hand, it is obvious that knowing the channel makes it easier to complete the channel coding of the watermark. On the other hand, preserving the perceptual quality of the original work imposes very strict limits for the power of the signal that is send through the channel.

The analysis of information transmission can be based on the transmission of a single bit of information, i.e. the insertion of a single bit, because sending more bits can be reduced to a multiplex problem. When a method is developed to send one bit, a classical multiplex technique can be used to send as many bits as necessary.

As it is presented in figure 2, the equivalent transmission channel in a watermarking system is made up of a known additive noise, i.e. the original sound, and an unknown alteration of the marked signal caused by attacks.

The robustness of such a system can be characterized by the BER during the retrieval of the inserted information. Consequently, designing a robust system means optimizing a communication system, by taking into account the BER as performance criterion, by improving the detection of the signal sent through partially known channels.

ii. Blind versus non-blind watermarking systems

Detecting binary signals requires a “partition” of the received signal space into two subsets of data respectively representing a binary one or a binary zero. This partition must be known for coding and for decoding processes. It may depend on the original signal or not.

When the partition depends on the original signal, the original signal must be sent through a safe channel in order to complete the watermark detection and retrieval. Such a system is known as non-blind system.

The partition can be built once for all and inserted in coding and decoding devices if it does not depend on the original signal. Such systems are called blind systems.

a. Non-blind systems: usually, it is easier to design a non-blind system. The detection of binary information is carried out by using a threshold that was preliminary transmitted in a safe manner. The drawback of this approach is that the detection protocol depends on the existence of the safe channel. Realizing the safe transmission is time consuming and not always possible.

b. Blind systems: unlike non-blind systems, blind systems use a predefined partition of the signal space to detect binary signals. As long as that partition must fit any marked signal, its design is not an easy task. The brand new developments of such watermarking techniques [16] [17] [19].are based on the work of Costa dealing with the transmission of signals in a partially known channel [15]. Unfortunately the original design implies the building of huge random dictionaries for coding and decoding process. Even if these dictionaries are generated once for all, extracting the inserted information can become a very difficult operation, needing to find the most similar dictionary entry to a given received signal. Blind watermarking systems using structured dictionaries were proposed but they are not yet robust enough against quite simple attacks such us noise addition or signal magnitude modification [16].

The algorithm that we propose here is a non blind one and diminishes the amount of information that needs to be transmitted through a safe channel.

III. SPREAD SPECTRUM

Spread spectrum techniques are used in radio communication to prevent the interception of messages. They make it possible to modify the representation of information to be inserted, by spreading its spectrum on a much broader frequency band. As a noise-like signal is obtained, it is more difficult to discriminate the hidden information and to remove it from the marked signal. To recover the original signal, the receiver requires knowledge of the spreading process. In this paper, we take advantage of two approaches for spreading the spectrum of a narrow band signal: (A1) multiplying the low frequency binary signal to

transmit by a binary pseudo-random sequence, having a much higher frequency. The resulting

Attacks

Retrieved watermark

Original signal Watermark

Channel

Channel coding PM

Modulator

Demodulator

spectrum has the bandwidth equal to the sum of the bandwidths of multiplied signals.

(A2) frequency hopping: the signal to transmit modulates a carrier with a random time-varying frequency. The resulting bandwidth depends on both the values of the carrier frequencies and the bandwidth of original signal.

As long as the watermarking can be seen as a communication system that sends watermark information through a very noisy channel, using spread spectrum techniques is effective to improve the information recuperation at the receiver.

Let the watermark nw be a binary sequence having the bit duration of wT and the pseudo random sequence, np ,

having the bit duration of pT much shorter than wT . By

the multiplication of nw and np , the spectral bandwidth of resulting signal increases from de wT2 to pT2 .

Using a BPSK modulation of a carrier frequency if , we obtain a signal described by:

)2cos(22

2sin2)( tfPpwpw

tfPtW innnn

i ππ

π =

+=

where P is the power of the carrier.

It should be noted the pseudo random sequence np and

the carrier frequency if are required to spread the spectrum at emission and to despread it at the receiver. A bit-synchronization between the insertion and the recuperation of the transmitted signal is therefore necessary to ensure a correct despreading of the spectrum. In the proposed algorithm we use an hybrid spreading technique by combining (A1) and (A2).

IV. PRESENTATION OF THE WATERMARKING ALGORITHM

i. Methods improving watermarking insertion

The watermark is inserted by substituting spectral components of the original signal that lie under the psycho acoustical mask M(f) , by spectral components of spread watermark [7] :

<≥

=M(f)(f)if PαW(f) M(f)(f)P if S(f)

(f)SS

Sw

where (f)Sw is the watermarked signal spectrum, S(f) is the spectrum of analyzed frame of original signal, W(f) is the spread watermark, (f)PS is the power spectrum

density of original signal, α is a parameter for adjusting the power of inserted spectral components.

We propose to adapt the watermark spreading to the spectral characteristics of the original signal by two methods:

(M1) the insertion is only performed for the frames that allow the substitution of a certain number of spectral components. The minimum number of spectral components is imposed by the bandwidth of the central lobe of the spectrum of the spread watermark W(t) ;

(M2) the carrier frequency if of the BPSK modulation varies and depends on the spectrum of original signal. An analysis of original signal spectrum enables us to choose the carrier frequency if the analyzed frame is to be marked or a pre-established value if the frame is not to be watermarked. These values will compose a secrete key, whose dimension will be equal to the number of frames in the watermarked signal.

We propose two criteria to choose the carrier frequencies:

(C1) the first one consists in finding the broadest frequency band in the original spectrum that has every spectral component substitutable. If its bandwidth is larger than an established threshold, the frame is marked and the carrier frequency is the mid-frequency of that frequency band.

(C2) the second one consists in finding the highest ratio between the number of substitutable components in a given bandwidth and the bandwidth. The carrier frequency is equal to the mid frequency of the spectral band that has highest ratio.

Fig. 3 : two frames and the corresponding carrier frequencies (represented by arrows)

Note that when using watermarking method, the watermark bit rate is variable.

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 104-20

0

20

40

60

80

f[Hz]

[dB]

Spectre à marquer, son masque et la frequence porteuse

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2x 104

-20

0

20

40

60

80

f[Hz]

[dB]

Spectre à marquer, son masque et la frequence porteuse

Spectrum to mark, its mask and carrier frequency

Spectrum to mark, its mask and carrier frequencyf[Hz]

f[Hz]

ii. Pseudo random sequence generation

The pseudo random sequence used to spread the watermark is computed using the secrete key.

Fig. 4 : algorithm completed here for pseudo random sequence generation

Each value of the secrete key is used as a seed to generate a short random sequence having the same length as the analyzed frame. When concatenating the short random sequences and randomly permuting the resulted sequence [10], we obtain the pseudo random sequence that will be used to spread the watermark.

iii. Watermark retrieval

Recuperating the mark requires knowledge of the pseudo random sequence, of the secrete key, and also a synchronization with the insertion. For the methods based on spread spectrum, synchronization remains a problem to solve. One way to improve synchronization consists in inserting known binary sequences that are only used to synchronize the recuperation algorithm [11]. When retrieving watermarks, the secrete key provides the carrier frequencies. Then, we consider the spectral components under the psycho acoustical mask. Indeed, we assume that the masking curve of the original signal can be estimated from the watermarked signal. At that stage, the extracted spectral components are then dispread to find the binary sequence wn.

V. TESTS, RESULTS AND CONCLUSIONS

Five tests have been performed with various audio signals sampled at 44.1KHz. The watermark information is a string of 50 binary values, whose bit duration is 10ms. The pseudo random sequence has a bit rate of 1000b/s. Informal subjective acoustical tests proved watermark imperceptibility. In addition, the rate of inserted watermark bits has been considered. In Fig 5, BER Variation reflect the non stationary nature of the original signal.

Fig. 5: rate of inserted watermark bits

It should be noted that the algorithm is robust against imperceptible white noise addition since the BER does not vary significantly. When studying robustness against imperceptible additive colored noise, obtained by shaping a white noise with perceptual mask (to approximate a simulation dealing with compression-decompression of watermarked signal), BER values still lie under 5%. One can also study the relationship between the marked frame ratio and the BER when using (C1) and (C2). Cf. Fig. 6. BER depends on the ratio of marked frames. Note that when computing [7], BER is around 0.06.

Fig. 6: BER of watermark retrieval (a) opera, (b) pop

VI. CONCLUSIONS

The algorithm presented here is effective for non blind protocols. This required a secret key, which is generated from the consecutive carrier frequencies. The improvement we propose in this paper makes it possible

wn

Original signal

W(t)

cos(2πfit)

pn Secrete key Finding carrier

frequencies pn

generator

0.35 0.4 0.45 0.5 0.55 0.6 0.650

0.005

0.01

0.015

0.02

BER

0.35 0.4 0.45 0.5 0.55 0.6 0.650

0.005

0.01

0.015

0.02

BER

C2

C1

marked frame ratio

marked frame ratio

C1

C2

0 2 4 6 8 10 12 140

20

40

60

80

100

rate[bit/s]

pop music ( BER=0.017 solid and BER=0.003 dashed )

0 2 4 6 8 10 12 140

20

40

60

80

100

t [s]

rate[bit/s]

opera music ( BER=0.011 solide and BER=0.003 dashed )

to ensure robustness since the BER is lower. We intend to use this watermarking method to speech signals.

VII. REFERENCES

[1] P. Bassia, I. Pitas. “Robust Audio Watermarking in Time Domain”. EUSIPCO 1998, 8-11 Sept., Patras, Greece, pp. 25-28.

[2] W. Bender, D. Gruhl, N. Morimoto. “Techniques for Data Hiding, Technical Report”, MIT Media Lab, 1994.

[3] L. Boney, A. Tewfik, K. Hamdy. “Digital Watermarks for Audio Signals”. 1996 IEEE Int. Conf. on Multimedia Computing and Systems, Hirochima, Japan 1996.

[4] Q. Cheng, J. Sorensen. “Spread Spectrum Signaling for Speech Watermarking”,. ICASSP 2001.

[5] I. J Cox., J. Killian, T. Leighton, T. Shamoon, “A secure, robust watermark for multimedia”, in Workshop on Information Hiding", Newton Institute, University of Cambridge, May 1996.

[6] T. Furon, P. Duhamel, “An Asymetric Public Detection Watermarking Technique”, Proc of the 3rd Int. Work of Information Hiding, Dresden, September 1999.

[7] R. A. Garcia, “Digital Watermarking of Audio Signals Using a Psycho-acoustic Auditory Model and Spread Spectrum Theory”, 107th AES Convention, Sept. 1999.

[8] L. de C. T. Gomes, M. Mboup, M. Bonnet, N. Moreau, “Cyclostationarity-Based Audio Watermarking with Private and Public Hidden Data”, 109th AES Convention, Sept. 22-25, 2000.

[9] D. Gruhl, A. Lu, W. Bender. „Echo Hiding. Proceedings of the Workshop of Information Hiding”, Cambridge, England, May 1996.

[10] D. Knuth . “The Art of Computer Programming”. vol.II, Addison-Wesley, 1981.

[11] N. Moreau, P. Dymarski, L. de C. T. Gomes. “Tatouage audio: Une réponse à une attaque désynchronisante “, CORESA 2000 (In french).

[12] H. O. Oh, J. W. Seok, J. W. Hong, D. H. Youn. “New Echo Embedding Technique For Robust And Imperceptible Audio Watermarking”. ICASSP 2001.

[13] M. D. Swanson, B. Zhu, A.H. Twefik, L. Boney, “Robust Audio Watermarking Using Perceptual Masking”, IEEE Proceedings on Signal Processing, Vol. n°66, pp. 337-355, 1998.

[14] F. A. P. Petitcolas, M. G. Kuhn. Stirmark software. www.cl.cam.ac.uk/~fapp2/watermarking

[15] M.H.M Costa. Writing on dirty paper, IEEE Trans. Inform. Theory, 29(3):439-441,May 1983

[16] J. J. Eggers, J. K. Su, B. Girod, “A Blind Watermarking Scheme Based on Structured Codebooks”, Secure Images and Image Authentication, IEE Colloquium, London, UK, April 2000, pp. 4/1- 4/6

[17] B. Chen. “Design and Analysis of Digital Watermarking, Information Embedding and Data Hiding Systems”, PhD Thesis, MIT, Cambridge, MA, 2000.

[18] P. Moulin, J. A. O’Sullivan, “Information-theoretic Analysis of Watermarking”. Proceedings ICASSP, Istanbul, Turkey, June 2000.

[19] A. S. Cohen, “Information Theoretic Analysis of Watermarking Systems”, PhD Thesis, MIT, September 2001.

audio digital watermarking for copyright protection · audio digital watermarking for copyright...

Documents