

A target advertisement system based on TV viewer’s profile reasoning

Jeongyeon Lim & Munjo Kim & Bumshik Lee &

Munchurl Kim & Heekyung Lee & Han-kyu Lee

Published online: 27 January 2007
© Springer Science + Business Media, LLC 2007

Abstract Traditional broadcasting services such as terrestrial, satellite and cable broadcasting have been unidirectional mass media that do not take TV viewers’ preferences into account. Recently, rich media streaming has become possible over broadband networks. Furthermore, since bidirectional communication is possible, personalcasting, such as personalized streaming services, has been emerging; it takes into account users’ preferences for content genres, viewing times, actors/actresses and so on. Accordingly, personal media is becoming an important means of content provision in addition to traditional broadcasting as mass media. In this paper, we introduce a user profile reasoning method for TV viewers. The user profile reasoning is made in terms of genre preference and TV viewing times for groups of TV viewers of different genders and ages. TV viewing history data is used to train the proposed user profile reasoning algorithm, which enables target advertisement for different age/gender groups. To show the effectiveness of the proposed method, we present extensive experimental results using real TV usage history.

Multimed Tools Appl (2008) 36:11–35
DOI 10.1007/s11042-006-0079-2

J. Lim (*) · M. Kim · B. Lee · M. Kim
Information and Communications University, 119 Munji Street, Yuseong-gu, Daejeon 305-732, Korea
e-mail: [email protected]

M. Kim
e-mail: [email protected]

B. Lee
e-mail: [email protected]

M. Kim
e-mail: [email protected]

H. Lee · H. Lee
Electronics and Telecommunications Research Institute, Daejeon, Korea

H. Lee
e-mail: [email protected]

H. Lee
e-mail: [email protected]


Keywords Target advertisement · Personalcasting · User profile reasoning

1 Introduction

With the rapid growth of the Internet, Internet broadcasting and webcasting have become well-known services. In particular, IPTV is expected to be one of the principal services in broadband networks [2]. However, the current broadcasting environment serves the general public and leaves viewers to consume TV programs passively. Advanced broadcasting environments therefore call for research on personalized broadcasting. For example, current unidirectional advertising delivers advertisement contents to TV viewers according to the popularity of TV programs, the viewing rates, the age groups of TV viewers, and the time bands in which the TV programs are broadcast. From a customization perspective, this is not an efficient way to provide useful information to TV viewers. If a TV viewer does not need particular advertisement contents, that information is wasted on the viewer. Therefore, target advertisement is expected to be one of the important services in personalized broadcasting environments. Current research on target advertisement classifies TV viewers into clustered groups with similar preferences. Digital TV collaborative filtering estimates a user’s favourite advertisement contents from the usage history [1, 4, 5]. In these studies, TV viewers are required to provide profile information such as gender, job, and age to the service providers via a PC or a Set-Top Box (STB) connected to a digital TV. Based on this explicit information, tailored advertisement contents are provided to the TV viewers. However, TV viewers may be reluctant to expose their private information to service providers for fear of its misuse. In that case, it is difficult to provide an appropriate target advertisement service.

In this paper, we utilize only implicit information from the TV usage history, such as the viewing date, viewing time, and genres of TV programs. We design a multi-stage classifier as a profile reasoning algorithm for TV viewers. The proposed multi-stage classifier is trained with real TV program usage history data of 2,522 people. We also develop a target advertisement system based on the TV viewers’ profile reasoning algorithm. The target advertisement system selects relevant commercials and provides them to the targeted groups. This paper is organized as follows: Section 2 presents the architecture of our target advertisement system with possible application scenarios; Section 3 describes our proposed profile reasoning algorithm for TV viewers, which classifies unknown TV viewers into an appropriate gender–age group; Section 4 addresses a commercial selection method for target advertisement; extensive experimental results are then provided and analyzed for the profile reasoning performance; and finally we conclude our work in Section 6.

2 Architecture of proposed target advertisement system

In the proposed target advertisement service system, there are three major entities: a content provider, advertisement companies, and TV viewers. The system consists of the following modules: a profile reasoning module that infers a TV viewer’s profile by analyzing the TV usage history, a broadcasting transmission module that recommends services based on the inferred result, and a user interface module that protects the TV viewers’ profiles. The terminals at the TV viewers’ side send limited information together with their TV usage history to the service provider (the target advertisement system) and receive the selected commercials recommended by the target advertisement service system. Figure 1 shows the architecture of our proposed target advertisement system. It consists of three agents: an inference agent for TV viewer profiles, which contains the profile reasoning module; a content provision agent, which contains a module that selects appropriate TV commercials for the targeted TV viewers and a transmission module for TV program contents; and a user interface agent, which consists of an input interface module and a TV usage history transmission module.

In Fig. 1, the profile inference agent receives TV program usage history data, such as TV program titles, genres, channels, viewing time bands, and viewing days of the week, from the user interface agent. Using this information, the profile reasoning module of the profile inference agent infers the TV viewers’ profiles from their preferred genres and TV viewing time bands for groups of different genders and ages, and the inference results are sent to the content provision agent. Based on the profile inference results, the advertisement content selection module of the content provision agent selects appropriate commercial contents for the unknown target TV viewers. The selected commercial contents can be distributed by the broadcasting station together with TV program contents or VoD (Video on Demand). The user interface agent provides a GUI that enables TV viewers to consume contents or related data at the TV terminal. It runs on the STB (Set-Top Box), which enables the TV viewers to consume the recommended TV commercial contents together with TV programs from the content provider agent. While the TV viewers watch TV programs, the user interface agent stores the usage data of the TV programs being watched in the TV usage history DB of the STB through the input interface module. Depending on the level of information provision for TV program consumption, the stored information is divided into TV usage information and private information. Only a limited amount of information about TV program consumption is transmitted to the profile inference agent through the TV usage history transmission module, which makes it possible to infer the TV viewers’ profiles.

Fig. 1 Target advertisement system architecture (components shown: User Interface Agent, Profile Inference Agent, Content Provider Agent, Set-Top Box, Broadcasting Station, Advertisement Company, TV Usage History DB, Ad content DB, TV Anytime Metadata DB)

3 Proposed profile reasoning algorithm

In this section, we describe a multi-stage classifier for the proposed profile reasoning algorithm, and explain how to extract feature vectors in order to train the multi-stage classifier.

3.1 Analysis of features depending on user profiles

The feature vector for the profile reasoning algorithm is obtained from the TV usage history. In this paper, we use TV program usage history data for male and female TV viewers of different ages, provided by AC Nielsen Korea. The TV usage history has the fields shown in Table 1. It was recorded for 2,522 people (male: 1,243; female: 1,279) from December 2002 to May 2003. The TV programs are categorized into eight genres: News, Information, Drama&Movie, Entertainment, Sports, Education, Child, and Miscellaneous. The usage history data were collected from six broadcasting channels. One channel is dedicated to education, and the others provide TV programs in all genres. Figure 2 shows the weekday TV viewing time bands of male and female TV viewers, obtained from the usage history data. In Fig. 2, the y-axis indicates the portion of the total TV watching time that falls in each of the time bands on the x-axis. As shown in Fig. 2, the watching time bands differ among TV viewers of different genders and ages. In the morning, the portion of TV viewing time of viewers in their 50s and 60s is relatively higher than that of the other age groups. Children (the 0s group) and teenagers mainly watch TV from 5 to 9 P.M. because TV programs for children, such as Comics and Drama, are usually broadcast after school. Males in their 20s to 40s usually have less time to watch TV during the daytime than others, so we can guess that they usually watch TV at night. The total TV watching time of males and females in their 20s is the lowest, and that of viewers in their 60s is the highest for both genders.

Table 1 Fields and description of TV usage history DB

| Field name | Description |
|---|---|
| id | TV viewer’s ID |
| profile | TV viewer’s gender and age group |
| date | A date of watching TV program |
| dayofweek | A day of the week for TV program |
| subscstart_t | Beginning time point of watching TV |
| subscend_t | Ending time point of watching TV |
| programstart_t | Scheduled beginning time of TV program |
| programend_t | Scheduled ending time of TV program |
| title | Title of TV program |
| channel | Channel of TV program (six channels) |
| genre | Genre of TV program (eight genres) |

The TV programs are scheduled by the broadcasting stations, and the channels have similar schedules except for one specific channel (EBS: Education Broadcasting System). For example, the five broadcasting companies serve News programs during 8∼9 P.M., and the 10∼11 P.M. time band is prime time for TV drama in Korea. We can therefore guess that users’ genre preferences are affected by the program schedules of the broadcasting companies. In addition, the longer the TV watching time, the more varied the watched TV program genres.

Fig. 2 TV viewing time of each gender and age group: a male TV viewing time, b female TV viewing time (x-axis: time bands 1~3, 5~7, 9~11, 13~15, 17~19, 21~23; y-axis: portion of the total TV watching time, 0 to 0.3; one curve per age group, 0s to 60s)


Figure 3 shows the characteristics of the TV program consumption patterns of male and female TV viewers. The values on the y-axis are the genre probabilities obtained by counting the number of watched TV programs in each genre. In Fig. 3a and b, both genders show similar genre preferences, but the degree of preference differs. For example, female TV viewers tend to favour Drama&Movie contents over News contents, whereas male TV viewers prefer News contents to contents in other genres. Therefore, we use genre preference to discriminate TV viewers into different gender–age groups.

Fig. 3 Genre preferences by the genre probability using the number of watched TV genre: a averaged male genre preference, b averaged female genre preference (x-axis: genres News, Info, Drama, Entertain, Sports, Education, Child, Misc; y-axis: genre probability, 0 to 0.4; one series per age group, 0s to 60s)

Also, a user’s actions such as channel hopping exhibit different characteristics depending on age and gender, even when TV viewers of different ages and genders watch the same TV program contents. Figure 4 shows the genre probabilities of TV program contents estimated from the time consumed on each TV program genre relative to the total TV watching time. The overall shapes of the graphs look similar to those in Fig. 3, in which the genre preference for each gender–age group was measured as the ratio of the number of watched TV programs in each genre to the total number of watched TV programs in all genres.

As shown in Figs. 3 and 4, the two genre probabilities, based on watching time and on the number of watched programs, can be used as discriminatory features to distinguish TV viewers into different gender–age groups. By analyzing the TV viewers’ preferences in detail, we can achieve high prediction accuracy when reasoning the gender–age group of an unknown TV viewer from his/her usage history data of TV program consumption.

Finally, specific channel information for channels dedicated to education, games, music, stocks and news can be an important key for reasoning the TV viewer’s gender–age group. As described above, we take into account how many times TV program contents have been consumed in each genre, how long TV program contents have been consumed in each genre, the average TV watching time, and how many times the TV viewers have watched TV program contents on each channel.

Fig. 4 Genre preferences by the genre probability using the occupied time of watched TV genre: a averaged male genre preference, b averaged female genre preference (x-axis: genres News, Info, Drama, Entertain, Sports, Education, Child, Misc; y-axis: genre probability, 0 to 0.4; one series per age group, 0s to 60s)


3.2 Feature extraction

To reason the TV viewer’s gender and age, we use as feature vector elements the number of times each genre was watched, the watching time of each genre, the average watching time, and the total occupied time on each channel, in order to distinguish groups of TV viewers.

Before we compute the feature vector elements, uncertain history data are removed according to the following conditions:

$$\frac{D_c}{D_p} \ge T_{Th} \quad \text{and} \quad \sum_{m \in D_o} N_m \ge C_{Th}$$

where $D_p$ and $D_c$ are the total duration and the total watching time of the TV program content, respectively, and $T_{Th}$ is a threshold value that is compared with the ratio of $D_c$ to $D_p$. With the first condition, TV program contents that were consumed only for a short period of time are excluded from the training data of the usage history, because the consumption time is too short compared to the total length of the TV program content. The second condition excludes the usage history data of TV viewers who seldom watched TV. If the total number $\sum_{m \in D_o} N_m$ of TV watching events during a certain observation period $D_o$ is less than a predefined threshold $C_{Th}$, the usage history of that TV viewer is also excluded from the training data. For the usage history data that satisfy the two conditions, we calculate the feature values described in Table 2.
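The two filtering conditions can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `ViewingRecord` type, the field names, and the threshold values are hypothetical, and the sketch assumes that the second condition counts the records that survive the first condition, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class ViewingRecord:
    # Hypothetical record type; field names are illustrative only.
    viewer_id: str
    watched_minutes: float   # D_c: time the viewer actually watched the program
    program_minutes: float   # D_p: scheduled duration of the program


def filter_history(records, t_th, c_th):
    """Keep records that satisfy both conditions.

    Condition 1: D_c / D_p >= t_th (drop programs watched only briefly).
    Condition 2: the viewer has at least c_th remaining records in the
    observation period (drop viewers who seldom watched TV).
    """
    kept = [
        r for r in records
        if r.program_minutes > 0
        and r.watched_minutes / r.program_minutes >= t_th
    ]
    # Count the remaining viewing events per viewer (assumption, see lead-in).
    counts = {}
    for r in kept:
        counts[r.viewer_id] = counts.get(r.viewer_id, 0) + 1
    return [r for r in kept if counts[r.viewer_id] >= c_th]


if __name__ == "__main__":
    history = [
        ViewingRecord("a", 54, 60),
        ViewingRecord("a", 48, 60),
        ViewingRecord("a", 3, 60),    # too short, fails condition 1
        ViewingRecord("b", 60, 60),   # viewer b has too few records
    ]
    for record in filter_history(history, t_th=0.1, c_th=2):
        print(record)
```

With the hypothetical thresholds above, only the two long viewing records of viewer "a" remain.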

In Table 2, $GC_{i,k,a}$ is the frequency with which TV viewer k in gender–age group a watched genre i during a pre-determined period, and $GT_{i,k,a}$ is the consumption time of genre i by TV viewer k in group a during the period. $CT_{k,a}$ is the consumption time of TV viewer k in group a during the period, and $C_{j,k,a}$ is the consumption time of channel j by TV viewer k in group a during the period. I and J are the total numbers of genres and channels, respectively. Using the feature values and equations in Table 2, we generate a feature vector for each TV viewer for each day of the week. The feature vector is laid out as in Table 3 and has 23 feature values. The first eight elements are the genre probabilities based on the number of counts (GPRC), and the next eight elements are the genre probabilities based on the amount of consumption time (GPRT), for all eight genres. The 17th element is the average viewing time (AVT), and the last six elements are the channel probabilities based on the amount of consumption time (CPR) for the six channels. We compute the feature vectors for all TV viewers and also calculate the group vector of the feature vectors for each gender–age group. The group vector is the mean vector of the feature vectors of a gender–age group, so the group vectors are representative vectors for their respective groups. The profile inference agent in Fig. 1 maintains a look-up table with the group vectors for the gender–age groups. The multi-stage classifier (MSC) infers a TV viewer’s profile from his/her feature vectors by comparing them with the group vectors in the look-up table. We compute the feature vectors from the usage history data for Monday to Friday only, because most gender–age groups have similar viewing patterns on the weekend.

Table 2 Types and the number of feature values

| Type of feature value | Equation | Number |
|---|---|---|
| Genre probability based on the number of counts (GPRC) | $GPRC_{i,k,a} = GC_{i,k,a} / \sum_{i=1}^{I} GC_{i,k,a}$ | 8 |
| Genre probability based on the amount of consumption time (GPRT) | $GPRT_{i,k,a} = GT_{i,k,a} / \sum_{i=1}^{I} GT_{i,k,a}$ | 8 |
| Average viewing time (AVT) | $AVT_{k,a} = CT_{k,a} / TotTime$ | 1 |
| Channel probability based on the amount of consumption time (CPR) | $CPR_{j,k,a} = C_{j,k,a} / \sum_{j=1}^{J} C_{j,k,a}$ | 6 |
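As a rough illustration of how the 23-element feature vector and the group vectors could be assembled from per-viewer totals, consider the following Python sketch. The function names and input layout are hypothetical, and `tot_time` is assumed here to be the length of the observation period, since the text does not define TotTime.

```python
def normalize(values):
    """Divide each value by the sum of all values (all zeros if the sum is 0)."""
    total = sum(values)
    if total == 0:
        return [0.0 for _ in values]
    return [v / total for v in values]


def feature_vector(genre_counts, genre_minutes, channel_minutes,
                   total_minutes, tot_time):
    """Build [GPRC (8), GPRT (8), AVT (1), CPR (6)] for one viewer and one day.

    genre_counts    -- GC: number of programs watched per genre (8 values)
    genre_minutes   -- GT: consumption time per genre (8 values)
    channel_minutes -- C:  consumption time per channel (6 values)
    total_minutes   -- CT: total consumption time of the viewer
    tot_time        -- TotTime: assumed length of the observation period
    """
    gprc = normalize(genre_counts)
    gprt = normalize(genre_minutes)
    avt = total_minutes / tot_time
    cpr = normalize(channel_minutes)
    return gprc + gprt + [avt] + cpr


def group_vector(vectors):
    """Mean vector of the feature vectors of one gender-age group."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


if __name__ == "__main__":
    vec = feature_vector(
        genre_counts=[2, 0, 2, 0, 0, 0, 0, 0],
        genre_minutes=[30, 0, 90, 0, 0, 0, 0, 0],
        channel_minutes=[60, 60, 0, 0, 0, 0],
        total_minutes=120,
        tot_time=240,
    )
    print(len(vec), vec)
```

The element order follows Table 3 (indices 1 to 8 GPRC, 9 to 16 GPRT, 17 AVT, 18 to 23 CPR).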

3.3 The first stage classifier

The first stage classifier uses a metric that measures the similarity between a feature vector and all group vectors for a specific day of the week. The similarity between two vectors is calculated from the vector correlation (VC) and the normalized Euclidean distance (ED). The VC value is obtained from (1) [6].

$$\mathrm{VC}(x, y) = \cos\theta = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} = \frac{\sum_{i=1}^{m} x_i y_i}{\sqrt{\sum_{i=1}^{m} x_i^2}\,\sqrt{\sum_{i=1}^{m} y_i^2}} \tag{1}$$

However, the vector correlation measures only the angle between two vectors. That is, it does not take into account the distance between the two vectors.

The normalized Euclidean distance uses the variances as the normalization term of the Euclidean distance. The variances are obtained from the feature values in the feature vectors of a specific gender–age group. Equation (2) shows the normalized Euclidean distance.

$$\mathrm{ED}(x, y) = \sqrt{\sum_{i=1}^{m} \frac{(x_i - y_i)^2}{s_{i,g}^2}} \tag{2}$$

In (2), g indicates a specific gender–age group. The normalized Euclidean distance measures only the distance between two vectors. We therefore propose a novel method to measure the distance between two vectors, which considers the distance and the correlation between the feature vector and the group vectors at the same time. The VC value between the input feature vector and each group vector is used as a weight in computing the GVC value between the input feature vector under test and each feature vector in the gender–age group. Likewise, the ED value between the input feature vector and each group vector is used as a weight in computing the GED value between the input feature vector and each feature vector in the gender–age group. The novel vector distance metric between two vectors, $V_i$ and $V_t$, is shown in (3).

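$$\begin{aligned}
\mathrm{Dist}(V_i, V_t) &= \mathrm{GVC}(V_i, V_t) + \mathrm{GED}(V_i, V_t) \\
\mathrm{GVC}(V_i, V_t) &= (1 - W_{I,v}) \times (1 - \mathrm{VC}(V_i, V_t)) \\
\mathrm{GED}(V_i, V_t) &= W_{I,E} \times \mathrm{ED}(V_i, V_t)
\end{aligned} \tag{3}$$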

Table 3 Feature vector

| Index | 1∼8 | 9∼16 | 17 | 18∼23 |
|---|---|---|---|---|
| Feature values | GPRC | GPRT | AVT | CPR |


In (3), i ∈ I, where I is the index of a specific group. Also, $W_{I,v} = \mathrm{VC}(G_I, V_t)$ and $W_{I,E} = \mathrm{ED}(G_I, V_t)$, where $G_I$ is the group feature vector of group I. That is, $W_{I,v}$ and $W_{I,E}$ are the vector correlation and the normalized Euclidean distance between the group feature vector $G_I$ and $V_t$. In addition, $V_i$ is the ith feature vector of group I in the look-up table, and $V_t$ is the feature vector of the TV viewer whose profile (gender and age) is to be inferred. Figure 5 shows how the first stage classifier measures the vector distance by (3). In Fig. 5, the feature vector $V_t$ of TV viewer A is shown in the bottom box. The vector distances between TV viewer A and the groups are listed in ascending order, as shown in Fig. 5.
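Equations (1) to (3) can be sketched together in Python as below. This is only an illustrative reading of the formulas: the function and parameter names are hypothetical, the vectors are assumed to be non-zero, and the per-feature variances of the group are assumed to be positive.

```python
import math


def vector_correlation(x, y):
    """Eq. (1): cosine of the angle between x and y (both assumed non-zero)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def normalized_euclidean(x, y, variances):
    """Eq. (2): Euclidean distance with each squared difference divided by
    the variance of that feature within the group (assumed positive)."""
    return math.sqrt(
        sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances))
    )


def novel_distance(v_i, v_t, group_vec, variances):
    """Eq. (3): Dist = GVC + GED for one stored vector v_i of a group.

    The weights W_v and W_E are computed between the group vector and the
    input vector v_t, as described in the text.
    """
    w_v = vector_correlation(group_vec, v_t)
    w_e = normalized_euclidean(group_vec, v_t, variances)
    gvc = (1.0 - w_v) * (1.0 - vector_correlation(v_i, v_t))
    ged = w_e * normalized_euclidean(v_i, v_t, variances)
    return gvc + ged


if __name__ == "__main__":
    group = [0.0, 1.0]
    stored = [0.0, 1.0]
    viewer = [1.0, 0.0]
    print(novel_distance(stored, viewer, group, variances=[1.0, 1.0]))
```

A vector that coincides with both the stored vector and the group vector gets distance 0, and the distance grows as either the angle or the normalized distance increases.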

3.4 The second stage classifier

The second stage classifier is constructed with the k-NN (k-Nearest Neighbour) method. The k-NN method takes as input the k smallest vector distances obtained from the first stage classifier. However, the traditional k-NN method makes its decision by considering only which neighbours are the k nearest in ascending order of distance; it does not use the distance values themselves in the classification. The second stage classifier in this paper therefore adopts the weighted-distance k-NN, which considers the distance values of the k nearest neighbours [7]. The equation for the weighted-distance k-NN (WDK) of a specific group I is shown in (4).

$$\mathrm{WDK}(I) = \frac{\sum_{i \in I} 1/\mathrm{VDT}(i)}{\sum_{I=1}^{N} \sum_{j=1}^{k} 1/\mathrm{VDT}(j, G_I)} \tag{4}$$

In (4), i ∈ I, where I is the index of a group, and k is the k value of k-NN. VDT(i) is the ith vector distance value among the k smallest vector distances, N is the total number of gender–age groups, and VDT(j, $G_I$) is a vector distance value of group $G_I$ among the k neighbours selected for k-NN. Using (4), we build the weighted-distance k-NN table for the gender–age groups from the k vector distances. Figure 6 shows an example of how to compute the similarity between an unknown TV viewer and each gender–age group by the weighted-distance k-NN method. In Fig. 6, the seven smallest vector distances are selected (k=7). The sum of the inverses of these vector distances (55.2) is then calculated as a normalization value, which leads to the weighted k-NN. We calculate the normalized inverses (weighted-distance k-NN values) of the vector distances for all gender–age groups (G1, G2, G3 and G4). Notice that the seven neighbours comprise two from G1, three from G2, one from G3 and one from G4. The corresponding normalized inverses of the vector distances are 0.5, 0.416, 0.051, and 0.032 for G1, G2, G3 and G4, respectively.

Fig. 5 Example of the first stage classifier (the feature values of viewer A, e.g. News (0.35), Child (0.2), are compared with the look-up table entries of groups G1, G2, ..., G14, and the resulting vector distances, e.g. 0.001, 0.015, 0.53, are listed in ascending order in the vector distance table)
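The Fig. 6 example can be reproduced with a short Python sketch of the weighted-distance k-NN in (4). The function name is hypothetical, and the assignment of the distances 0.115, 0.135 and 0.145 to G2 and of 0.355 to G3 is inferred from the reported values, since the figure states explicitly only the distances used for G1 and G4.

```python
def weighted_distance_knn(neighbours):
    """Eq. (4): for each group, sum the inverse distances of its members
    among the k nearest neighbours and divide by the sum of all k inverse
    distances.

    neighbours -- list of (group_label, vector_distance) for the k nearest
                  stored vectors; distances are assumed to be positive.
    """
    total_inverse = sum(1.0 / dist for _, dist in neighbours)
    inverse_by_group = {}
    for group, dist in neighbours:
        inverse_by_group[group] = inverse_by_group.get(group, 0.0) + 1.0 / dist
    return {g: inv / total_inverse for g, inv in inverse_by_group.items()}


if __name__ == "__main__":
    # The k = 7 smallest vector distances of the Fig. 6 example.
    fig6 = [
        ("G1", 0.051), ("G2", 0.115), ("G1", 0.125), ("G2", 0.135),
        ("G2", 0.145), ("G3", 0.355), ("G4", 0.563),
    ]
    scores = weighted_distance_knn(fig6)
    for group in sorted(scores):
        print(group, round(scores[group], 3))
    print("best group:", max(scores, key=scores.get))
```

The computed values agree with the reported 0.5, 0.416, 0.051 and 0.032 to within rounding, and G1 obtains the maximum weighted-distance k-NN value.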

3.5 The third stage classifier

After the second stage classifier, we obtain an inferred TV viewer profile for each weekday, based on the maximum of the weighted-distance k-NN values in the table for that day.

The third stage classifier builds the majority rule table from the maximum weighted-distance k-NN values and the corresponding gender–age groups for the weekdays. The normalized majority rule (NMR) values are then calculated by combining the maximum weighted-distance k-NN values over the weekdays. The normalized majority rule value is calculated by (5).

$$\mathrm{NMR}(I) = \frac{\max\{\mathrm{WDKT}(d) \mid d \in D\}}{\sum_{d=1}^{D} \max\{\mathrm{WDKT}(d) \mid d \in D\}} \tag{5}$$

In (5), I is the index of the inferred gender–age group for the weekdays, D denotes the weekdays from Monday to Friday, and WDKT(d) is a value in the weighted-distance k-NN table for day d of the week.

The third stage classifier assigns the unknown TV viewer to the gender–age group that has the maximum NMR value, as shown in Fig. 7.
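The third stage can be illustrated with the numbers of the Fig. 7 example. The sketch below follows the worked example in the figure, where the daily maxima of the days won by the same group are summed and divided by the sum of all five daily maxima; the function name is hypothetical, and the pairing of values with particular weekdays is not needed for the computation.

```python
def normalized_majority_rule(daily_winners):
    """Combine the per-day results of the second stage.

    daily_winners -- list of (group_label, max_wdk_value), one entry per
                     weekday, where max_wdk_value is the maximum
                     weighted-distance k-NN value of that day.
    Returns a dict mapping each winning group to its NMR value.
    """
    total = sum(value for _, value in daily_winners)
    summed = {}
    for group, value in daily_winners:
        summed[group] = summed.get(group, 0.0) + value
    return {group: s / total for group, s in summed.items()}


if __name__ == "__main__":
    # Maximum WD k-NN values of the five weekdays in the Fig. 7 example.
    fig7 = [
        ("M10s", 0.4772), ("M10s", 0.4687), ("M10s", 0.4593),
        ("M0s", 0.732), ("M0s", 0.682),
    ]
    nmr = normalized_majority_rule(fig7)
    for group in sorted(nmr):
        print(group, round(nmr[group], 4))
    print("inferred profile:", max(nmr, key=nmr.get))
```

Although M10s wins on three of the five days, M0s has the larger NMR value because its daily maxima are higher, which matches the inference result in Fig. 7.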

Fig. 7 Example of the third stage classifier. The majority rule table (Mon to Fri) contains the maximum WD k-NN values 0.4772, 0.4687 and 0.4593 for M10s and 0.732 and 0.682 for M0s, with $\sum_{d=1}^{D} \max\{\mathrm{WDKT}(d) \mid d \in D\} = 2.8192$. NMR(M10s) = (0.4772 + 0.4687 + 0.4593)/2.8192 = 0.4984 and NMR(M0s) = (0.732 + 0.682)/2.8192 = 0.5016. The inference result is “Male 0s”.

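Fig. 6 Example of the second stage classifier (k = 7). Vector distances in ascending order: 0.051, 0.115, 0.125, 0.135, 0.145, 0.355, 0.563. Normalization value: $\sum_{I=1}^{N} \sum_{j=1}^{k} \mathrm{VDT}(j, G_I)^{-1} \approx 55.2$. For example, WDK(1) = $(0.051^{-1} + 0.125^{-1})/55.2$ and WDK(4) = $0.563^{-1}/55.2$. Resulting WD k-NN values: G1 0.5, G2 0.416, G3 0.051, G4 0.032.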


The majority rule table in Fig. 7 contains the maximum values of the weighted-distance k-NN tables and the inference results of the second stage classifier. Since the inference value of ‘Male 0s’ is larger than that of ‘Male 10s’, the inference result is ‘Male 0s’. Figure 8 shows the architecture of the multi-stage classifier for user profile inference as described in this section.

4 Target advertisement contents selection method

In this section, we explain how to select target advertisement contents based on the TV viewer’s profile inference. The target advertisement contents are selected by a target advertisement selection method that utilizes preference values for advertisement contents from the Korea Broadcasting Advertising Corporation (KOBACO).

4.1 Target advertisement contents selection method

In this section, we describe how to select advertisement contents based on the inference result for the TV viewer’s profile (gender and age). In order to select advertisement contents, it is necessary to know the preferences for advertisement contents. In this paper, we utilize a survey result from KOBACO to obtain the TV viewers’ preferences for celebrity endorsers, advertising types, and advertising items for the gender–age groups [3]. The survey results are shown in Tables 4, 5 and 6. In Table 4, the TV viewers’ preference for celebrity endorsers is given as a percentage. The preference values for advertising types and advertising items in Tables 5 and 6 are obtained from the pre-classified lists, and the values are up to 6. Using the preference information from KOBACO, the celebrity endorsers, advertising types, and advertising items are assigned to the TV viewers’ preferred TV viewing time bands, as shown in Fig. 9. The numbers in Fig. 9 represent the order of the preferred TV viewing time bands. Time band 1, from 18 to 24, is the most preferred viewing time, and time band 2, from 6 to 12, is the second most preferred viewing time. Time bands 3 and 4 are defined in the same way.
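One simple way to use such preference values is to rank the available commercials by the preference of the inferred group for the endorser appearing in each commercial. The Python sketch below assumes this ranking rule for illustration only; the text does not spell out the scoring rule at this point. The preference values are the top three entries for two groups in Table 4, while the ad inventory, the identifiers, and the function name are hypothetical.

```python
# Top-3 celebrity endorser preferences (%) for two groups, taken from Table 4.
ENDORSER_PREFERENCE = {
    "M10s": {"Jeon, JH": 22.4, "Kwon, SW": 12.4, "Lee, HL": 8.0},
    "F30s": {"Kwon, SW": 11.7, "Lee, YA": 11.6, "Lee, HL": 4.8},
}

# Hypothetical ad inventory: (ad identifier, endorser appearing in the ad).
AD_INVENTORY = [
    ("ad-001", "Lee, HL"),
    ("ad-002", "Jeon, JH"),
    ("ad-003", "Lee, YA"),
]


def rank_ads(group, inventory, preference):
    """Order ads by the inferred group's preference for each ad's endorser.

    Endorsers without a listed preference for the group score 0. The sort
    is stable, so ads with equal scores keep their inventory order.
    """
    group_prefs = preference.get(group, {})
    scored = [
        (group_prefs.get(endorser, 0.0), ad_id)
        for ad_id, endorser in inventory
    ]
    scored.sort(key=lambda item: -item[0])
    return [ad_id for _, ad_id in scored]


if __name__ == "__main__":
    for inferred_group in ("M10s", "F30s"):
        print(inferred_group, rank_ads(inferred_group, AD_INVENTORY,
                                       ENDORSER_PREFERENCE))
```

The same pattern could be applied to the advertising type and advertising item preferences of Tables 5 and 6, or restricted to the preferred viewing time bands.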

Fig. 8 Architecture of the multi-stage classifier (MSC) (within the profile inference agent, training data from the user interface agent pass through feature vector extraction into look-up tables for Mon to Fri; for testing data, the 1st stage classifier (novel vector distance) produces vector distance tables, the 2nd stage classifier (WD k-NN metric) produces WD k-NN tables, and the 3rd stage classifier (normalized majority rule) outputs the profile inference)


Table 4 Preference information about celebrity endorser from KOBACO (preference in %)

| Rank | M10s | M20s | M30s | M40s | Over M50s | F10s | F20s | F30s | F40s | Over F50s |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Jeon, JH 22.4 | Jeon, JH 24.4 | Lee, HL 12.5 | Lee, HL 11.3 | Lee, YA 9.5 | Jeon, JH 16.8 | Jeon, JH 15.1 | Kwon, SW 11.7 | Lee, YA 11.2 | Lee, YA 9.7 |
| 2 | Kwon, SW 12.4 | Lee, HL 11.2 | Lee, YA 12.2 | Lee, YA 8.9 | Lee, HL 8.7 | Kwon, SW 12.0 | Kwon, SW 13.8 | Lee, YA 11.6 | Kwon, SW 7.7 | Kwon, SW 7.1 |
| 3 | Lee, HL 8.0 | Lee, YA 6.6 | Jeon, JH 11.4 | Jeon, JH 7.7 | Ahn, SK 3.2 | Kang, DW 9.8 | Lee, YA 6.8 | Lee, HL 4.8 | Ahn, SK 5.3 | Lee, HL 4.1 |
| 4 | Kim C 4.4 | Song, HK 4.6 | Song, HK 5.2 | Song, HK 4.0 | Kim, HJ 3.0 | Won B 5.9 | Lee, HL 4.5 | Jeon, JH 4.2 | Chae, SL 4.5 | Chae, SL 3.7 |
| 5 | Lee, YA 3.6 | Kwon, SW 3.4 | Ahn, SK 3.4 | Ahn, SK 3.3 | Choi, BA 2.8 | Rain 5.6 | Kang, DW 3.8 | Rain 4.0 | Song, HK 4.4 | Kim, JE 3.4 |
| 6 | Song, HK 3.6 | Kim C 2.5 | Kwon, SW 2.9 | Kwon, SW 2.9 | Kim, JE 2.5 | Lee, NY 4.2 | Song, HK 3.2 | Song, HK 3.9 | Kim, JE 4.2 | Ahn, SK 3.4 |
| 7 | Rain 2.6 | Kim, JE 2.1 | Han, SK 2.6 | Kim, JE 2.3 | Ko, DS 2.3 | Lee, YA 3.1 | Jang, DK 2.9 | Jang, DK 3.9 | Jeon, JH 3.9 | Kim, HJ 3.4 |
| 8 | Han, YS 2.3 | Jung, WS 2.1 | Kim, JE 2.5 | Kim, NJ 2.0 | Jeon, JH 1.7 | Lee, HL 3.1 | Lee, NY 2.9 | Kim, JE 3.5 | Jang, DK 3.9 | Kim, HA 2.6 |
| 9 | Lee, NY 2.1 | Han, YS 1.8 | Kim, NJ 2.2 | Jeon, IH 1.9 | Chae, SL 1.7 | Song, HK 2.8 | Rain 2.7 | Ahn, SK 3.2 | Lee, HL 3.8 | Song, HK 2.4 |
| 10 | Boa 2.1 | Lee, NY 1.6 | Song, YA 1.7 | Choi, MS 1.9 | Song, HK 1.5 | Kim C 2.8 | Won B 2.7 | Lee, MY 2.6 | Jeon, IH 2.9 | Ko, DS 2.2 |

Table 5 Preference information about advertising types from KOBACO

Advertising type      M10s  M20s  M30s  M40s  M50s  F10s  F20s  F30s  F40s  F50s
Humour                4.8   4.8   4.6   4.3   4.3   4.8   4.8   4.7   4.5   4.3
Tradition/humanism    3.8   4.2   4.3   4.3   4.4   3.9   4.5   4.4   4.5   4.4
Children entry        3.8   3.9   4.1   4.0   3.9   4.2   4.4   4.4   4.4   4.2
Consumer entry        3.6   3.8   3.9   3.9   3.8   3.7   4.0   4.0   4.1   4.0
Animal entry          3.9   3.7   3.6   3.6   3.6   3.9   4.0   3.8   3.8   3.7
Animation/comic       4.0   3.7   3.6   3.4   3.1   4.1   3.8   3.9   3.9   3.3
Celebrity entry       2.9   2.9   2.9   3.1   3.2   2.9   3.0   3.1   3.2   3.2
Entertainer entry     4.4   4.0   3.7   3.6   3.6   4.5   4.1   3.9   3.8   3.7
Foreign star entry    3.9   3.7   3.3   3.2   3.1   3.8   3.4   3.2   3.1   3.0
Sexual perception     2.8   3.3   3.0   2.9   2.6   2.5   2.6   2.5   2.3   2.3
Comparison ad         2.8   3.0   3.0   2.9   2.9   2.5   2.6   2.7   2.7   2.8
Image emphasis ad     2.8   3.0   3.1   3.0   3.1   2.7   2.9   3.0   3.1   3.0
Product emphasis ad   2.8   3.0   3.1   3.1   3.1   2.7   3.0   3.1   3.3   3.1
Curiosity             3.2   3.2   2.9   2.8   2.7   3.1   3.1   2.8   2.7   2.6

Table 6 Preference information about advertising items from KOBACO

Advertising item      M10s  M20s  M30s  M40s  M50s  F10s  F20s  F30s  F40s  F50s
Drink                 4.0   3.6   3.3   3.3   3.2   4.1   3.8   3.6   3.5   3.2
Cookie                4.1   3.4   3.2   3.1   3.0   4.3   3.8   3.5   3.5   3.1
Food                  3.9   3.5   3.3   3.3   3.1   3.9   3.7   3.7   3.6   3.4
Alcohol               2.8   3.6   3.5   3.5   3.4   2.9   3.4   3.3   3.2   2.9
Household             2.8   3.1   3.0   3.0   3.0   3.8   4.0   3.9   3.9   3.7
Cosmetic              2.5   3.0   2.7   2.8   2.7   3.9   4.5   4.1   4.0   3.7
Car                   3.4   4.2   4.3   4.0   3.7   3.1   3.6   3.6   3.5   3.1
Medical supplies      2.6   3.0   3.2   3.5   3.5   2.7   3.2   3.6   3.7   3.6
Home appliance        3.0   3.5   3.4   3.4   3.3   3.2   3.8   4.1   4.0   3.9
Computer              4.3   4.2   3.9   3.6   3.0   4.1   3.7   3.7   3.6   2.9
Cell/mobile phone     4.7   4.5   4.1   3.8   3.4   5.0   4.5   4.0   3.7   3.2
Department store      3.0   3.2   3.0   3.0   2.9   3.5   3.7   3.7   3.6   3.4
Furniture             2.4   2.7   2.7   2.7   2.7   2.9   3.3   3.4   3.4   3.1
Clothes               3.5   3.6   3.0   3.0   2.9   4.3   4.3   3.9   3.7   3.4
Finance               2.1   2.8   3.1   3.2   3.0   2.4   3.0   3.4   3.4   3.0
Study book            2.3   2.2   2.5   2.6   2.2   2.7   2.6   3.5   3.0   2.0

Fig. 9 Example of classification of celebrity endorser, advertising types, and advertising items based on the preferred TV viewing time (clock face marked at 6, 12, 18 and 24 h)

Time band  Endorser       Ad types       Ad items
1          1st to 3rd     1st to 4th     1st to 4th
2          4th to 6th     5th to 8th     5th to 8th
3          7th to 9th     9th to 11th    9th to 12th
4          10th to 11th   12th to 14th   13th to 16th

5 Experimental results

In this section, we present the experimental results of the profile reasoning algorithm with the multistage classifier and the implementation result of a prototype target advertisement system.

5.1 Experimental result of profile reasoning

The experiment for the profile reasoning algorithm is conducted with real TV usage history data from AC Nielsen Korea. The TV usage history data was recorded from 2,522 people (male: 1,243; female: 1,279) from December 2002 to May 2003. For the experiment, the TV usage history data is divided into training data and testing data. The training data is a random selection of 70% (1,764 people) of the total TV usage history, and the remaining 30% (758 people) is used as the testing data. That is, the training data is the viewing information about TV program contents of 1,764 people over 6 months, and the testing data is the TV usage data of 758 people over the same 6 months. For a more reliable evaluation, we created eight different pairs of training and testing data. The threshold values are set to CTh=30 and TTh=0.1 in order to remove outliers from the TV usage history data when computing the feature vectors from the training data. Figure 10 shows the experimental results for the gender–age groups obtained by the proposed multistage classifier (MSC), Euclidean distance (ED) and vector correlation (VC) methods. As shown in Fig. 10, the average accuracy of the proposed multistage classifier is higher than that of the single classifiers using only the ED or only the VC measure. For the male TV viewers, the average accuracy of the proposed multistage classifier in Fig. 10a is about 15% higher than that of the other methods, because the male groups have distinct genre or channel preferences at different ages.

For better understanding of the experimental results, we model genre consistency as shown in Fig. 11. For the genre consistency model, we use the feature vectors GPRC and GPRT. If the location of the preference on Genre 1 in Fig. 11 moves to ① or ②, the preference on Genre 1 has increased or decreased. A move of Genre 1 to ③ means that the TV viewer likes the genre much more than other genres, because TV watching is concentrated on Genre 1 and less time is spent on the other genres. If Genre 1 moves to ④, the TV viewer frequently watches TV program contents in Genre 1, but the watching times are very short.
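The idea behind the GPRC-GPRT plane can be sketched with a small example. It rests on an assumption that should be checked against the definitions given earlier in the paper: GPRC is taken here as a genre's share of the viewing count and GPRT as its share of the viewing time. The function name `genre_ratios` and the viewing log are hypothetical.

```python
from collections import defaultdict


def genre_ratios(viewing_log):
    """Return {genre: (count_share, time_share)} for (genre, minutes) events."""
    counts = defaultdict(int)
    minutes = defaultdict(float)
    for genre, duration in viewing_log:
        counts[genre] += 1
        minutes[genre] += duration
    total_count = sum(counts.values())
    total_minutes = sum(minutes.values())
    # Assumed definitions: GPRC = count share, GPRT = time share.
    return {
        genre: (counts[genre] / total_count, minutes[genre] / total_minutes)
        for genre in counts
    }


# Hypothetical log: News is watched often but briefly, Drama rarely but long.
log = [("News", 5), ("News", 5), ("News", 5), ("News", 5),
       ("Drama", 60), ("Sports", 20)]
ratios = genre_ratios(log)
for genre, (gprc, gprt) in ratios.items():
    print(f"{genre}: GPRC={gprc:.2f}, GPRT={gprt:.2f}")
```

In this toy log, News has a large count share and a small time share, which corresponds to the frequent but short viewing that the text associates with ④. A genre whose two shares grow or shrink together stays near the diagonal of the plane.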

Figure 12 shows the genre consumption consistency (GCC) for all gender–age groups. In Fig. 12, the male 0s group likes to watch TV program contents in the Child genre.

Fig. 10 Experimental results of the accuracy by MSC, ED and VC. a Average accuracy for male groups (%), M0s to M60s. b Average accuracy for female groups (%), F0s to F60s (vertical axis from 50 to 100%)

The male 10s group prefers to watch contents in the Drama&Movies genre. The male 20s group likes Entertainment program contents. The male 30s group mostly likes the News genre. The male 40s∼50s groups prefer similar genres such as Information, News and Drama&Movies. On the other hand, the male 60s group can be easily distinguished because its members stick to a specific channel. From Figs. 10 and 12, it can be noted that the results of the proposed MSC for the male 0s∼20s groups show a pattern in average accuracy similar to that of the ED alone. Since the genres in the GPRC-GPRT plane are located along the diagonal axis for the male 0s∼20s groups, the VC value is no longer effective; instead, the ED value becomes an effective discriminatory measure. The average accuracy of the MSC for the male 30s group is relatively low. Even though the average accuracy with the ED alone is high, the VC value seems to disturb the discriminatory power of the MSC when used in conjunction with the ED. The GCC of the male 30s group tends to move along the diagonal axis. For the male 40s∼60s groups, the average accuracy curve of the proposed MSC in Fig. 10 looks similar to that of the VC. In this case, the VC value becomes an effective measure for discrimination. The locations of different genres are somewhat different for the male 40s∼60s groups.
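The observation that a correlation-type measure can lose discriminatory power where a distance measure remains effective can be illustrated in general terms. The sketch below is not the paper's derivation: it assumes, for illustration only, that VC behaves like the cosine of the angle between two feature vectors, and the two profiles are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumed stand-in for VC)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors pointing in the same direction
# but with different magnitudes.
profile_a = [0.10, 0.20, 0.30]
profile_b = [0.05, 0.10, 0.15]

similarity = cosine_similarity(profile_a, profile_b)
distance = euclidean_distance(profile_a, profile_b)
print(f"cosine similarity: {similarity:.3f}")
print(f"Euclidean distance: {distance:.3f}")
```

The two profiles are indistinguishable by direction alone, while their Euclidean distance is clearly non-zero. A measure of this kind therefore cannot separate groups whose feature vectors differ mainly in magnitude.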

For the female groups, it is difficult to distinguish the age groups because they have similar GCC in the GPRC-GPRT plane. In Fig. 12, the genre distribution of the female 0s group is similar to that of the male 0s group; these groups can then be distinguished by channel preference. The GCCs of the female 20s∼60s groups are similarly distributed, so the performance for the female groups is not better than that for the male groups, as shown in Fig. 10. The genre distribution of the female 10s group is similar to that of the male 10s group, and the GCC of the female 10s group is distributed toward ③. So, for this group, the accuracy of the proposed MSC is not higher than that of the ED and VC methods in Fig. 10b. The accuracy of the proposed MSC for the female 30s∼50s groups is slightly higher than that of the ED and VC methods. The accuracy curve of the female 30s∼50s groups is similar to that of the ED method because the distribution of the genres is similar in the GPRC-GPRT plane. For the female 60s group, the proposed MSC shows much better results than the ED and VC methods, as shown in Fig. 10b, because the test data is distributed in the directions of ③ and ④ with low density.

Fig. 11 Genre consumption consistency (GCC) model (axes GPRC and GPRT, each from 0 to 0.4; the markers 1 to 4 around Genre 1 indicate the moves discussed in the text)

Fig. 12 Distribution of genre preference in all groups

Table 7 shows the average accuracy of the multistage classifier (MSC), ED and VC. Each accuracy in Table 7 is the average over the eight different pairs of training and testing data.

5.2 The implementation result of the prototype target advertisement system

We show the implementation result of the prototype target advertisement system based on the profile reasoning algorithm and the target advertisement content selection method. For the target advertisement system, we used 28 free advertisement contents from NGTV (http://www.ngtv.net). Figure 13 shows the implementation result of the prototype target advertisement system. In Fig. 13, the feature vectors of a 20-year-old man are extracted by the user interface agent. The extracted feature vector is sent to the profile inference agent, where the profile (gender and age) of the TV viewer is inferred by the MSC. The profile inference agent classifies the extracted feature vector into M20s. Next, the celebrity endorser, advertising type and advertising item are obtained from the preference table of advertisement contents based on the profile inference result. The target advertisement content in Fig. 13 is the advertisement content for a 20-year-old man; that is, the celebrity endorser is ‘Lee, HL’, the advertising type is ‘Entertainer Entry’, and the advertising item is ‘Cell/Mobile Phone’.
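The last step of this pipeline, choosing an advertisement once the profile and its preferred attributes are known, can be sketched as follows. This is a minimal sketch under stated assumptions and not the prototype's code: the `Ad` record, the `select_ad` function, the matching rule (count of matching attributes) and the three catalog entries are all hypothetical. Only the preferred attributes for M20s are taken from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    title: str
    endorser: str
    ad_type: str
    item: str


def select_ad(preferred, catalog):
    """Return the catalog entry that matches the most preferred attributes."""
    def matches(ad):
        # Assumed rule: one point per attribute equal to the preferred value.
        return sum([
            ad.endorser == preferred["endorser"],
            ad.ad_type == preferred["ad_type"],
            ad.item == preferred["item"],
        ])
    return max(catalog, key=matches)


# Preferred attributes reported in the text for the inferred M20s profile.
m20s_preference = {
    "endorser": "Lee, HL",
    "ad_type": "Entertainer Entry",
    "item": "Cell/Mobile Phone",
}

# Hypothetical catalog entries; the real system used 28 contents from NGTV.
catalog = [
    Ad("ad-01", "Jeon, JH", "Humour", "Drink"),
    Ad("ad-02", "Lee, HL", "Entertainer Entry", "Cell/Mobile Phone"),
    Ad("ad-03", "Lee, HL", "Humour", "Car"),
]

chosen = select_ad(m20s_preference, catalog)
print(chosen.title)
```

With this toy catalog the second entry is chosen because it matches all three preferred attributes. A real catalog would also need a rule for ties and for advertisements that match nothing.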

Fig. 13 Implementation result of the prototype target advertisement system

Table 7 Experimental result of multistage classifier

Gender and age group   Accuracy (%)
                       VC      ED      MSC
Male 0s                76.69   71.96   88.21
Female 0s              67.14   67.86   89.28
Male 10s               66.89   71.28   89.29
Female 10s             67.76   68.75   65.79
Male 20s               60.72   62.95   86.50
Female 20s             68.18   73.58   78.49
Male 30s               63.69   72.42   78.80
Female 30s             63.15   66.88   76.17
Male 40s               59.82   64.51   86.59
Female 40s             69.71   64.83   72.25
Male 50s               54.86   64.58   82.86
Female 50s             60.86   60.20   67.91
Male 60s               65.76   67.94   89.77
Female 60s             56.90   50.86   86.61
Avg. accuracy          64.54   66.81   80.17
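The per-group comparison discussed in Section 5.1 can be read directly from Table 7. The short script below is a convenience sketch: the numbers are copied from Table 7, while the helper name `msc_gain` and the notion of "gain over the better single classifier" are introduced here for illustration.

```python
# Accuracy (%) per group as (VC, ED, MSC), copied from Table 7.
table7 = {
    "Male 0s": (76.69, 71.96, 88.21), "Female 0s": (67.14, 67.86, 89.28),
    "Male 10s": (66.89, 71.28, 89.29), "Female 10s": (67.76, 68.75, 65.79),
    "Male 20s": (60.72, 62.95, 86.50), "Female 20s": (68.18, 73.58, 78.49),
    "Male 30s": (63.69, 72.42, 78.80), "Female 30s": (63.15, 66.88, 76.17),
    "Male 40s": (59.82, 64.51, 86.59), "Female 40s": (69.71, 64.83, 72.25),
    "Male 50s": (54.86, 64.58, 82.86), "Female 50s": (60.86, 60.20, 67.91),
    "Male 60s": (65.76, 67.94, 89.77), "Female 60s": (56.90, 50.86, 86.61),
}


def msc_gain(row):
    """Difference between the MSC accuracy and the better of VC and ED."""
    vc, ed, msc = row
    return msc - max(vc, ed)


gains = {group: round(msc_gain(row), 2) for group, row in table7.items()}
not_improved = [group for group, gain in gains.items() if gain <= 0]
largest_gain_group = max(gains, key=gains.get)

print("MSC not better than both single classifiers:", not_improved)
print("largest gain:", largest_gain_group, gains[largest_gain_group])
```

On these figures the female 10s group is the only one for which the MSC does not exceed both single classifiers, and the female 60s group shows the largest gain, in line with the discussion of Fig. 10b.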

6 Conclusion

In this paper, we presented a TV viewer profile reasoning method that utilizes TV viewers' TV usage history data, and introduced a target advertisement system. The feature vectors are computed from the TV usage history data and are used to infer TV viewers' profiles with the proposed multistage classifier. The accuracy of the multistage classifier is about 80%, which is higher than that of the other two methods: Euclidean distance and vector correlation. We also proposed the target advertisement system, which provides target advertisement contents based on the inferred TV viewer's profile and the preference values for advertisement contents. Through the proposed target advertisement system, it is expected that TV viewers can watch their preferred advertisement contents and that advertisement content providers can achieve more efficient advertising by providing appropriate advertisement contents to their target customers.

References

1. Bozios T, Lekakos G, Skoularidou V, Chorianopoulos K (2001) Advanced techniques for personalised advertising in a digital TV environment: the iMEDIA system. Proceedings of the E-business and E-work conference, pp 1025–1031

2. Katsaros D, Manolopoulos Y (2004) Broadcast program generation for webcasting. Data Knowl Eng 49(1):1–21

3. Korea Broadcasting Advertising Corporation (2005) Media & consumer research 2004: survey on consumer pattern. Retrieved September 03, 2005, from http://www.kobaco.co.kr/kor/infor-mation/study-data/studydata_research_annual.asp

4. Miyahara K, Pazzani MJ (2004) Collaborative filtering with the simple Bayesian classifier. Proceedings of the Sixth Pacific Rim international conference on artificial intelligence PRICAI 2000, pp 679–689

5. Shahabi C, Faisal A, Kashani FB, Faruque J (2000) INSITE: a tool for interpreting users interaction with a web space. Proceedings of the 26th international conference on very large databases, pp 635–638

6. Yu Z, Zhou X (2004) TV3P: an adaptive assistant for personalized TV. IEEE Trans Consum Electron 50(1):393–399

7. Yuan W, Liu J, Zhou HB (2004) An improved KNN method and its application to tumor diagnosis. Proceedings of the 3rd international conference on machine learning and cybernetics, pp 2836–2841

Munjo Kim received his B.E. degree in Mechatronics from Tongmyong University, Korea, in 2004, and his M.E. degree from the School of Engineering at the Information and Communications University. In 2006, he joined Samsung Electronics, Korea. His research interests include information inference for multimedia broadcasting, TV-Anytime, digital multimedia broadcasting and MPEG-7/21.

Jeongyeon Lim received the B.E. and M.E. degrees in Information and Communications Engineering from Chungnam National University, Korea, in 1999 and 2001, respectively, and the Ph.D. degree from the Information and Communications University (ICU), Korea, in 2007. Her research interests include intelligent and interactive multimedia, multimedia information processing and MPEG-4/7/21/A.

Munchurl Kim received his B.E. degree in electronics from Kyungpook National University, Korea, in 1989, and his M.E. and Ph.D. degrees in electrical and computer engineering from the University of Florida, Gainesville, Florida, in 1992 and 1996, respectively. After graduation, he joined the Electronics and Telecommunications Research Institute, where he was involved in MPEG-4/7 standardization and the development of broadcasting media technology. In 2001, he joined the School of Engineering at the Information and Communications University in Daejeon, Korea, where he is an associate professor. His research interests include MPEG-4/7/21/A/E, video compression/communications, intelligent and interactive multimedia, and pattern recognition.

Bumshik Lee received the B.S. degree in electrical engineering from Korea University in 2000 and the M.S. degree from the School of Engineering at the Information and Communications University, Korea, in 2006. Since August 2006, he has been a Ph.D. candidate at the same university. Between 2000 and 2004, he worked on the development of wireless and wired communications equipment such as cdma2000 1xEV-DO and VDSL. His research interests include advanced video coding such as H.264|AVC, Scalable Video Coding, content-based image retrieval and pattern recognition.

Han-kyu Lee received his B.E. and M.E. degrees in electronics from Kyungpook National University, Korea, in 1994 and 1996, respectively. In 1996, he joined the Electronics and Telecommunications Research Institute, where he is the team leader of the Personalized Broadcasting Research Team. In 2002, he joined the Information and Communications University (ICU), Korea, where he is a Ph.D. candidate. His research interests include image processing and analysis, personalized broadcasting and standardization.

Heekyung Lee received her B.E. degree in electronics from Yeungnam University, Korea, in 1999, and her M.E. degree from the School of Engineering at the Information and Communications University. She has been with the Electronics and Telecommunications Research Institute since 2002. Her research interests include MPEG-7, TV-Anytime, digital broadcasting and personalized broadcasting.
