mylingo content management - wikileaks cinema... · uploading the le to and storing on a remote le...
TRANSCRIPT
myLINGO Content Management
Prepared by: Ray Migneco
Chief Application Engineer
Oladas, Inc.
Document version 2.0
October 25, 2013
Oladas, Inc. 1 myLINGO Content Management
Contents
1 Overview 3
1.1 Summary of Invention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Audio File Encryption 4
2.1 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.1 Content Transport for Remote Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Content Transport for Local Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.3 Content Playback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3 Content Management 7
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Content Usage Limitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3.1 Fixed Expiration after Purchase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3.2 Fixed Expiration after Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.3 Expiration based on Viewed Content . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.4 Limit By Release Date . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.5 Remotely Disabling Content Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3.6 Limit Content Access by Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4 Forensic Watermarking 14
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.3 Stage 1: Verance Watermark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.4 Stage 2: Forensic Watermark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.4.1 Embedding Client Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.4.2 Binary Embedding Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.4.3 Decoding Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.4.4 Real-time vs Offline Watermarking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 References 21
Oladas, Inc. 2 myLINGO Content Management
1 Overview
The core functionality of the myLINGO application requires the handling of copyrighted multimedia at mul-
tiple stages, including data transmission and playback. As such, the application architecture will include
provisions for protecting the intellectual property owner at multiple stages to discourage and prevent unau-
thorized usage and copying of copyrighted material.
1.1 Summary of Invention
In accordance with the disclosed subject matter, systems and methods are described for:
• Content encryption and decryption
• Management of content usage
• Forensic watermarking of content
An overview of the process to manage these inventions is shown in Figure 1. In this embodiment, encrypted
files representing the dubbed soundtracks are generated on a personal computer. The decrypted files are then
stored on a remote file server. A user requests a file using the myLINGO application and it is downloaded to
the device and stored in the encrypted format. Rental management provisions when the downloaded content
can be accessed and when it should be disabled and removed from the device. For playback, the file is
securely decrypted and watermarked with client information to prevent and discourage unauthorized copies.
The watermarked audio is then sent to the devices output during playback.
1.2 Organization
The organization of this document is as follows:
Section 2 of this document describes the process in which an dubbed soundtrack is locally encrypted using
a digital key, stored on a remote file server, downloaded by a client application and decrypted locally on the
client application. The audio file is played back without ever storing a copy of the unencrypted file on the
client device.
Section 3 describes the provisions for managing downloaded content on the user’s device with myLINGO.
This includes methods for managing content access through release dates, expiration dates and usage time.
Section 4 describes a process for forensic watermarking. This process involves embedding the audio signal
with an imperceptible digital signal that provides information about the user, which can be use to identify
the source of unauthorized usage of copyrighted material.
Oladas, Inc. 3 myLINGO Content Management
Generate Encrypted
Soundtrack
Store Encrypted
Soundtracks
Remote file server Personal computer
Store rented soundtracks locally
Rental file management
Secure file decryption for playback.
NO LOCAL STORAGE
Watermark user specific data
Mobile device running myLINGO application
Download soundtrack to mobile device
Watermarked audio sent to mobile device output
Upload encrypted file to server
Apply Verance Watermark
Master Audio File
Figure 1: Overview of intellectual property management for the myLINGO application architecture. Audiofiles are encrypted and uploaded to a secure file server. Encrypted tracks are then downloaded to the clientsmobile device where they are managed for usage. Files are securely decrypted using a private key andwatermarked using client information before playback.
2 Audio File Encryption
The client-server architecture used by the myLINGO application requires that dubbed audio soundtracks be
uploaded to remote fileserver where they can then be downloaded to the client’s mobile device. During the
transport of content files, the protected material is vulnerable unauthorized users in the following states:
• Uploading the file to and storing on a remote file server
• Downloading the soundtrack file to the client from the remote file server
• While the soundtrack file is stored on the users device and accessed for playback.
This section summarizes the approach for protecting the files during these different states.
2.1 Approach
The myLINGO client-server architecture and application design employs 128-bit file encryption/decryption to
protect the dubbed audio soundtracks. This encryption adheres to the Advanced Encryption Standard (AES)
Oladas, Inc. 4 myLINGO Content Management
which was established by the National Institute of Standards and Technology (need ref) [9]. This approach
allows a soundtrack file to be encrypted and decrypted with a private key to restrict access to the file.
2.2 Implementation
The following will address how the encryption protects unauthorized access to the file during each vulnerable
state as discussed in the section overview.
2.2.1 Content Transport for Remote Storage
Dubbed audio soundtracks must be uploaded to and stored on a remote server in order be accessed by client
devices. This process creates points of vulnerability during the data transport and storage. To prevent
unauthorized access, dubbed audio soundtracks are protected using 128-bit AES encryption before being
uploaded. Figure 2 demonstrates one embodiment of this approach, where a personal computer, located on
a secure, private network performs the file encryption. In this embodiment, a 256-bit private key is used to
generate the encrypted file. Once this process is complete, the encrypted file can be securely uploaded and
stored on the file server. A database running on the file server manages access to the encrypted files.
Encryption occurs on a personal computer located on a private, secure network
Remote file server hosts encrypted audio files
Encrypted file is uploaded to the database
Audio file
128-bit Encryption using
Advanced Encryption Standard
256-bit private key
Encrypted Audio File
Database manages access to the encrypted files
Figure 2: File encryption process on personal computer using a 256-bit key and 128-bit AES encryption. Fileis then uploaded to a remote file server.
2.2.2 Content Transport for Local Storage
When the client wishes to download a dubbed audio soundtrack through the myLINGO application, the
application sends a request to the remote file server. If the request is approved, the client receives a temporary
URL from which it can download the encrypted file. An embodiment of this process is shown in Figure 3
where the client mobile device requests the URL from the server. The downloaded file is then stored in its
encrypted form on the client mobile device.
Oladas, Inc. 5 myLINGO Content Management
Mobile device requests encrypted audio file
Server authenticates request and returns URL which the device uses to download the encrypted file
Mobile device stores encrypted files downloaded from server
Remote file server hosts encrypted audio files
Database manages access to the encrypted files
Figure 3: Mobile device-server interaction for requesting and downloading an encrypted audio soundtrack.
2.2.3 Content Playback
In order to access content with myLINGO, the downloaded files must be decrypted before playback. The
myLINGO application stores a copy of the same 256-bit private key used to encrypt the audio file. The
key is hard-coded into the applications program files and never made available to the user. When the file is
decrypted, the raw, uncompressed audio samples are loaded into the application’s run-time memory as shown
in Figure 4. This instance of the decrypted audio file persists only as long as the application is using the
soundtrack and it is never stored on the client mobile devices storage drive.
Decrypted audio file is stored in device memory and ready for playback. The application never saves the decrypted file to disk
128-bit Decryption using Advanced
Encryption Standard
256-bit private key is hard-coded into the application program files
Encrypted audio file is loaded from the mobile device's hard disk
Figure 4: Managing encrypted file playback on the client mobile device.
Oladas, Inc. 6 myLINGO Content Management
3 Content Management
3.1 Overview
myLINGO provides content to end-users as a “consumable” item which is purchased through the mobile device
platform’s respective application store (i.e. iOS App Store or Google Play). In this regard, “consumable”
purchases are available to users on a limited basis. As such, myLINGO employs provisions for managing
“consumable” content once it has been downloaded to the user’s device. The approach used is flexible, so it
can satisfy the requirements set forth by content creators and theater exhibitors whom may seek to manage
where and when the content can be consumed.
3.2 Content Usage Limitation
myLINGO is capable of managing content for a number of scenario’s, including:
1. Fixed expiration time after the content has been purchased and downloaded to the clients device (e.g.
“You have two weeks to view the purchased content before it is automatically deleted”)
2. Fixed expiration after content has been used by the client (e.g. “After synchronization begins, you will
have 4 hours to use this content”).
3. Expiration after a certain percentage of the content has been viewed (e.g. “If the user has viewed 80%
of the content, it should be removed”)
4. Limiting access to the content until its official release date. This can be used in situations where the
user wants to download a soundtrack in advance of an up-coming release, but the content provider wish
to restrict access until the official release date.
5. Disabling content after a final expiration date. This can be used for disabling/removing content at a
certain date, for example, if content is no longer available in theaters.
6. Location-based access to content. This can be used to prevent content access in a venue or theater
which does not support the myLINGO application.
3.3 Implementation
myLINGO implements the content limitations by managing a database stored locally on the client’s device.
This database is populated with entries for the relevant fields when content is downloaded to the user’s device
and when the content is accessed by the application. The database also is used to populate the available
content within the application (Figure 5) so that if it is removed from the device, the application will not
function.
The limitations in Section 3.2 an be implemented using the following database fields:
• PurchaseDate: The date/time when the content is downloaded
• ExpirationTime: The time duration for which the content can exist on the user’s device (i.e. specified
by Sony as 14 days)
• ContentAccessed : A boolean (“True/False”) value indicating whether or not the user has accessed
purchased content
Oladas, Inc. 7 myLINGO Content Management
(a) (b)
Figure 5: Content view (a) populated by the locally managed database. Synchronization view (b) that ispresented if downloaded content can be accessed
• AccessDate: The date/time when the user first accesses the content
• UsageTime: The amount of time the user has to use the content once it has been accessed (i.e. specified
by Sony as 4 hours).
• ReleaseDate: The date/time on which the content can first be accessed
• ExpirationDate: The date/time at which the content must be removed from the device, regardless of
usage.
• FirstPlayDate: The date/time at which the user begins viewing the content regardless of prior usage
• LastPlayDate: The date/time at which the user last viewed the content regardless of prior usage
• ContentDuration: The duration or length of the content.
This section will overview how the above database fields are used to impose the content limitations in
SEction 3.2.
3.3.1 Fixed Expiration after Purchase
In this case, content must be automatically removed from the myLINGO application after a certain amount
of time. When the content is downloaded to the application, two required fields are populated in the database
PurchaseDate and ExpirationTime. ExpirationTime is a parameter that must be downloaded from myLINGOs
servers along with the content and the value must be specified by the content provider. When the user opens
their content library (See Figure 5) each content item is automatically scanned to determine if the difference
between the current date/time and PurchaseDate exceeds ExpirationTime. If the condition is met, the content
is not selectable by the user and is automatically deleted. Otherwise, the user can still access it. Figure 6
shows the process.
Oladas, Inc. 8 myLINGO Content Management
3.3.2 Fixed Expiration after Usage
In this case, the content must expire after a specific amount of time once it has been accessed. myLINGO will
handle such a situation with three database fields ContentAccessed, AccessDate, UsageTime. ContentAccessed
is boolean value initially set to FALSE, AccessDate is a date/time parameter and UsageTime indicates the
amount of time the user has to view the content. UsageTime is a parameter that must be downloaded with
the content whose exact value is specified by the content provider. When the user opens the content for
synchronization, ContentAccessed is set to TRUE while AccessDate is set to the current date/time. Should
the user attempt to access the track again, the difference between the current date/time is compared to
AccessDate to determine if it exceeds UsageTime. Figure 7 shows the process.
3.3.3 Expiration based on Viewed Content
In this embodiment, the content provider may wish to disable the content after the user has viewed a certain
percentage of it. For example, once a user views more than 80% of the content, it should be considered
“consumed” and automatically disabled on the mobile device. This can be accomplished using two date/time
entries in the database FirstPlayDate, LastPlayDate. When the user first begins viewing the content , First-
PlayDate is updated to the current date/time and as they view the content, the LastPlayDate parameter
is continuously updated using a software timer. In subsequent attempts to access the content, the different
between the LastPlayDate and FirstPlayDate is compared to a threshold (i.e. 80%) multiplied by the Con-
tentDuration parameter. If the difference exceeds this product, access is disabled. The process is shown in
Figure 8.
3.3.4 Limit By Release Date
In this embodiment, the content cannot be accessed until its official release date which is specified by the
content. Therefore, myLINGO must manage a ReleaseDate parameter in the database. This is a date/time
parameter that must be downloaded with the content. When a myLINGO user attempts to access a purchased
feature in the application, the current date and time is compared to ReleaseDate. If the current date/time
occurs after ReleaseDate, the user is allowed to access the track. Figure 9 outlines this process.
3.3.5 Remotely Disabling Content Access
In certain embodiments, content providers may want to disable a content item at a certain date, perhaps
coinciding with the end of the theatrical run of the film. I this situation, myLINGO will download an
ExpirationDate parameter along with the content. The application will simply determine if the current
date/time exceeds ExpirationDate and remove the content item if the condition is true. Figure 10 diagrams
this process.
3.3.6 Limit Content Access by Location
Content providers and theater exhibitors may wish to limit usage of content downloaded to myLINGO based
on geographical location. In one such embodiment, content usage may be restricted to theaters owned by
a certain exhibitor. For this application, myLINGO will make use of the GPS (global positioning system)
API (application programmer interface) provided by the user’s mobile phone and determine if the user is
within the desired proximity of an affiliated theater via the users current GPS coordinates. The myLINGO
application can either manage a list of affiliated theater locations that is downloaded with the content, or
Oladas, Inc. 9 myLINGO Content Management
IS CURRENTDATE -
PURCHASEDATE >= EXPIRATIONTIME ?
Routine begins when user launches
applicationContent Management Database
(Locally stored)
End routine
Content is enabled and viewable
Content is deleted from the user's mobile device
YES
NO
For each downloaded content item:
Read CURRENTDATERead PURCHASEDATERead EXPIRATIONTIME
Figure 6: Logic demonstrating how a track is automatically disabled and deleted after a pre-determinedexpiration time.
communicate with a remote database to determine which theaters are supported. Figure 11 diagrams this
process.
Oladas, Inc. 10 myLINGO Content Management
IF(CONTENTACCESSED)
Routine begins when user launches
applicationContent Management Database
(Locally stored)
End routine
Update ACCESSDATE in Database to CURRENTDATE
Update CONTENTACCESSED in Database to TRUE
Content is enabled and viewable
YES
NO
For each downloaded content item:
Read CURRENTDATERead CONTENTACCESSED
Read ACCESSDATERead USAGETIME
IF(CURRENTDATE - ACCESSDATE) >=
USAGETIME
YESSoundtrack is deleted from the
user's mobile device
NO
Content Management Database(Locally stored)
Figure 7: Logic demonstrating how a track is automatically disabled and deleted after a pre-determined usagetime.
Oladas, Inc. 11 myLINGO Content Management
IS LASTPLAYDATE - FIRSTPLAYDATE
>= THRESHOLD * CONTENTDURATION ?
Routine begins when user attempts to access content
Content Management Database(Locally stored)
End routine
Content is enabled and viewable
Content is disabled on the mobile device
YES
NO
For selected content item:
Read FIRSTPLAYDATERead LASTPLAYDATE
READ CONTENTDURATION
Update FIRSTPLAYDATE to current date/time
Update LASTPLAYDATE to current date/time using a
software timer
Figure 8: Logic demonstrating how the amount of content viewed can be measured and disabled from thedevice..
IS CURRENTDATE >= RELEASEDATE ?
Routine begins when user attempts to
access content in playlist
Content Management Database(Locally stored)
End routine
Content is enabled and viewable
Content is not accessible and user is provided with a notification indicating the release date
NO
YES
For selected content item
Read CURRENTDATERead RELEASEDATE
Figure 9: Process showing how content access is limited by the pre-determined release date.
Oladas, Inc. 12 myLINGO Content Management
IS CURRENTDATE >= EXPIRATIONDATE ?
Routine begins when user launches
applicationContent Management Database
(Locally stored)
End routine
Content is enabled and viewable
Soundtrack is deleted from the user's mobile device
YES
NO
For each downloaded content item:
Read CURRENTDATERead EXPIRATIONDATE
Figure 10: Methodology for remotely disabling a track with an expiration date parameter.
Is user within proximity of supported theater?
Routine begins when user accesses content
in playlist viewDatabase of Supported Theater
Locations (can be located Remotely OR locally on device)
End routine
Determine user's current GPS coordinates
YES
NO
Query list of supported theaters for selected
content
Content is disabled and notification sent to the user
User proceeds to content
Figure 11: Location-based access to content. Client device compares users current GPS coordinates to amaster list of supported theater locations. List may be located locally on the device or remotely accessed overthe network.
Oladas, Inc. 13 myLINGO Content Management
4 Forensic Watermarking
4.1 Overview
While digital audio files used by the myLINGO application are encrypted for transmission and decrypted
for playback, the copyrighted material is still vulnerable during the playback process should the content be
recorded by another device. Thus, provisions are required for protecting this content during audio playback
to discourage unauthorized reproduction and provide a means of identifying the source of said copies.
Through forensic watermarking, myLINGO provides this security measure. Digital watermarking is used
extensively by the multimedia industry to embed imperceptible digital signals in various digital media formats
(e.g. film, audio, photos) to discourage unauthorized reproduction and help intellectual property owners
identify the source(s) of unauthorized copies. The myLINGO application utilizes digital audio watermarking
to embed imperceptible digital information about the client and/or client device in the audio soundtrack.
Using a decoding scheme, the user information can be retrieved from the audio to reveal the source of
unauthorized copies.
While there a number of techniques utilized for digital audio watermarking (see [1]-[6]), the primary goals
are to embed a signal that is 1) imperceptible and 2) robust to a number of attacks, or modifications, that
could remove, damage or destroy the embedded watermark. These attacks include common signal process-
ing techniques (e.g. resampling, re-quantization, time and/or frequency stretching) and audio compression
(MPEG 1 Layer 3, MPEG 2 AAC).
This section overviews the watermarking technique employed by the myLINGO application, the various
embodiments of which it is implemented in the application and the decoding scheme for recovering the data
embedded via watermarking.
4.2 Implementation
The myLINGO application will employ a two-stage watermarking approach.
4.3 Stage 1: Verance Watermark
Before the movie soundtrack file is encrypted and uploaded to myLINGOs servers, a Verance watermark is
applied to the copy of the movies soundtrack provided by the intellectual property owner as shown in Figure 1.
This watermark prevents un-authorized redistribution of the intellectual property due to its standardization
with playback devices and software. Content embedded with a Verance watermark is not viewable by DVD
players, laptops and other consumer electronics products. Oladas will obtain this watermark for content used
with myLINGO via a license with Verance.
4.4 Stage 2: Forensic Watermark
The forensic audio watermarking method utilized by myLINGO is based on a time-spread digital audio
watermarking technique proposed by Ko et al [2]. In this method, the audio signal is sub-divided into non-
overlapping segments and each segment is convolved with either one of two kernel functions, or filters. Each
kernel function represents a binary 1 or 0 by specifying the amount of delay time for the echo. Furthermore,
the kernel functions are generated with a pseudo noise (PN) sequence as shown in Figure 12. The PN sequence
acts as a private key used at the decoding stage to recover the watermarked signal. This prevents any third
party without the key from recovering the watermarked signal.
Oladas, Inc. 14 myLINGO Content Management
0 0.5 1 1.5 2 2.5 3
0
0.2
0.4
0.6
0.8
1
Time (msec)
Am
plitu
de
Echo Kernel for 1 millisecond Delay
Figure 12: Example binary filter kernels generated with pseudo noise sequences. In this example, a binary 0could be encoded with the 1 millisecond delay filter and binary 1 could be encoded using the 2 milliseconddelay filter.
This algorithm was chosen for the implementation for several reasons:
• The kernel filters can be constructed such that the resulting echo is imperceptible when the kernel filter
is convolved with the audio signal.
• The PN sequence acts as a private key that discourages reverse engineering the process to recover or
remove the watermarks.
• This also protects client user information from unauthorized third parties.
• The watermarked audio signal can be computed in real-time on many current mobile devices.
• The method satisfies the requirements specified in Section 9.4.6.1.3 of the Digital Cinema Specification
as specified by Digital Cinema Initiatives, LLC [7].
Oladas, Inc. 15 myLINGO Content Management
4.4.1 Embedding Client Data
The forensic watermark should provide a form of identification that is unique to the user. In one embodiment,
this data may consist of the Universal Device Identifier (UDID) which is specific to the user’s phone. However,
some mobile device platforms do not provide access to this number and it has no direct connection to the
user who own’s the device.
In another embodiment, myLINGO will employ a client-server architecture where the user (client) requests
a verification code to utilize the application. This process is shown in Figure 13. The first step requires the
user to submit their name and mobile phone number over the network to a remote server. The server then
processes this request and sends a verification code back to the client device via SMS (text message). The
user then enters the code from their SMS application back into the myLINGO application for verification.
If the remote server approves the verification code, the user will be allowed to access paid content through
myLINGO. If it is not approved, they will not be able to access the paid content until they successfully verify
their phone number. This is shown in Figure 14.
Upon successful verification, the watermarking payload will consist of the following:
• user’s name
• user’s mobile phone number
• current date & time during (relative to the watermarking process)
The name and number are also saved in the default settings for myLINGO which cannot be accessed or
modified by the user. These settings are made permanent for as long as the application persist’s on the user’s
mobile device. Thus, the forensic watermarking approach presented here provides a way of embedding the
audio signal with imperceptible information that can be used to trace the source of un-authorized content
usage.
Oladas, Inc. 16 myLINGO Content Management
Submit
John Doe
(123)456-7890
Full Name
Mobile Phone
SMS App
Your verification code is 123456. Please enter it in
myLINGO
Client requests verification code from remote server
Server returns verificaiton code in the form of a SMS (text) message
Figure 13: Process for requesting a verification code for the myLINGO application.
Submit
123456
Client sends verification code back to server for confirmation
Enter your verification code
Remote server verifies code and approval sent back to the application.
If it is approved, the submitted information is used in the forensic watermark. If it is not verified, the user cannot access paid content
User verified OR not verified
Figure 14: Verification approval process for accessing paid content.
Oladas, Inc. 17 myLINGO Content Management
s1[n] s2[n] s3[n] sN[n]
Output buffer is written directly to the device audio
output hardware or re-encoded and encrypted for
local storage
Terms
s[n]: audio signalk[n]: binary kernel filterw[n]: watermarked audio signal∗: convolution operatorN: number frames dividing s[n]i = 1, 2, …, N
Convert to binary
representation1 0 0
Character to watermark from client data
1 0 0
Binary data
Binary kernel selection
Convolutionsi[n] ∗ k[n] = w[n]
k[n]
Write result to output buffer
wi[n]
Input audio signal subdivided into non-overlapping frames
Each bit is sequentially processed with the audio stream.
Figure 15: Overview of the embedding stage for watermarking client data.
4.4.2 Binary Embedding Process
The watermarking process embeds the client data by converting each character in the data into its binary
representation and encoding each bit into a small section, or frame, of audio via a convolution operation.
These frames are non-overlapping and have equal length so that the audio stream is encoded in a consistent
manner. The embedding process is detailed in Figure 15.
For an example of this process, consider watermarking the decimal number “7” into the audio signal.
The first step involves obtaining a binary representation of 7 which is 101 using 3 bits. The first bit to be
embedded is 1, which is done by convolving the first audio frame with the filter kernel for binary 1. The next
step involves convolving the subsequent audio frame with the filter kernel for binary 0. Finally, the the third
audio segment is convolved with the filter kernel for binary 1. The binary kernel selection step in Figure 15
determines which filter to use depending on the incoming bit.
Because switching filter kernels rapidly may cause ambiguity in the decoding process, in certain embod-
iments myLINGO will use redundancy to improve the reliability of the encoding. In this way, the previous
example could be modified so that the first 8 segments would be convolved with the binary 1 kernel, the next
8 segments with the binary 0 kernel and so on.
As per Section 9.4.6.1.1. of the Digital Cinema System Specification, myLINGO will employ watermark
embedding at intervals no greater than 15 minutes in length to ensure the audio soundtrack is sufficiently
Oladas, Inc. 18 myLINGO Content Management
covered [7].
Furthermore, the integrity of the watermarking signal may be compromised if the stereo channel configura-
tion is tampered with. For example, stripping off one of the channels and/or creating a mono (single channel)
mix of the track. In certain embodiments, myLINGO will alternate between watermarking the client data on
the left and right channels so that a complete data stream is only available on one channel at a particular
time.
4.4.3 Decoding Process
To retrieve the embedded signal from the watermarked soundtrack, the audio must be processed in non-
overlapping frames as it was during the encoding stage. This process involves application of Cepstral Pro-
cessing to each frame of the watermarked audio. Cepstral Processing can be thought of as a quasi-source
separation step to determine the contributions from the original audio signal and of the binary kernel filter.
The real cepstrum, or the real part of the complex cepstrum, is used here and is mathematically defined as:
DFT (•)→ log |•| → IDFT (•) (1)
where DFT (•) indicates the Discrete Fourier Transform of the argument, log |•| is the logarithm of the
absolute value of the argument and IDFT (•)is the inverse Discrete Fourier Transform of the argument [8].
However, the binary kernel filters were created with a PN sequence that will obscure the filters echo
contribution when examining the cepstrum. To de-spread the echo, the cepstrum must be cross correlated
with the PN sequence used to watermark the signal [2]. The delay imposed by the Kernel can be found in
the resulting cross correlation by search for the maximum peak in this signal (see Figure 17). As shown in
Figure 16, applying this process to each frame in the watermarked audio will recover the delay values which
were used to encode the binary data.
The decoding process is not implemented on the client device running the myLINGO application. Rather,
it is programmed to run on a personal computer when required.
4.4.4 Real-time vs Offline Watermarking
In one embodiment, the clients mobile device will have enough computational power to watermark the sound-
track audio as it is being played back. In this sense, the algorithm is employed in a real-time scenario where
the latency incurred from watermarking algorithm does not introduce a perceptible delay during playback on
the client device.
In one embodiment, the soundtrack is watermarked before playback to avoid real-time processing. The
encrypted, compressed audio file is first decrypted into the applications memory and sequentially decompressed
for processing small blocks of audio. The linear pulse-code modulated (PCM) samples are modified with the
appropriate echo kernels and these samples are then compressed and saved in the applications memory. In this
embodiment, a decrypted version of the soundtrack is never written to the client mobile devices storage drive.
Rather the watermarking, compression/decompression and encryption/decryption occurs entirely within the
applications memory which persists only during run-time.
Oladas, Inc. 19 myLINGO Content Management
Terms
w[n]: watermarked audio signalc[n]: cepstrum of audio segmentp[n]: pseudo noise sequenced[n]: de-spread echo signalDFT(⦁): Discrete Fourier Transformlog�⦁����������� ��IDFT(⦁): Inverse Discrete Fourier Transform⨂: Correlation operationN: number frames dividing s[n]i = 1, 2, …, N
w1[n] w2[n] w3[n] wN[n]
Cepstrum Analysis
DFT(⦁) log�⦁� IDFT(⦁)
Watermarked audio signal is subdivided into non-overlapping frames
⨂
p[n]
pseudo noise sequence
di[n]binary output
ci[n]Locate peak in di[n] to
determine delay
Figure 16: Overview of the process for decoding the watermarked signal to recover binary data.
0 0.5 1 1.5 2 2.5 3 3.5 4−1.5
−1
−0.5
0
0.5
1
1.5
Time (msec)
Am
plitu
de
De−spreading the Echo
Figure 17: Peak selection for the de-spread cepstrum. The time associated with the peak value indicates thebinary number watermarked.
Oladas, Inc. 20 myLINGO Content Management
5 References
1 Pan, Jeng-Shyang, Hsiang-Cheh Huang, and Lakhmi C. Jain, eds. Intelligent watermarking techniques.
Vol. 7. World scientific, 2004.
2 Ko, Byeong-Seob, Ryouichi Nishimura, and Yiti Suzuki. ”Time-spread echo method for digital audio
watermarking.” Multimedia, IEEE Transactions on 7.2 (2005): 212-221.
3 Bassia, Paraskevi, Ioannis Pitas, and Nikos Nikolaidis. ”Robust audio watermarking in the time do-
main.” Multimedia, IEEE Transactions on 3.2 (2001): 232-241.
4 Wei, Foo Say, and Dong Qi. ”Audio watermarking of stereo signals based on echo-hiding method.”
Information, Communications and Signal Processing, 2009. ICICS 2009. 7th International Conference
on. IEEE, 2009.
5 Lie, Wen-Nung, and Li-Chun Chang. ”Robust and high-quality time-domain audio watermarking based
on low-frequency amplitude modification.” Multimedia, IEEE Transactions on 8.1 (2006): 46-59.
6 Lee, Sang-Kwang, and Yo-Sung Ho. ”Digital audio watermarking in the cepstrum domain.” Consumer
Electronics, IEEE Transactions on 46.3 (2000): 744-750.
7 Digital Cinema Specification Version 1.2. Digital Cinema Initiatives, LLC. 2012.
8 Quatieri, Thomas F. Discrete-time speech signal processing. Pearson Education India, 2002.
9 Announcing the Advanced Encryption Standard (AES). Federal Information Processing Standards Pub-
lication 197. 2001 (online) http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
Oladas, Inc. 21 myLINGO Content Management