[ieee 2013 international conference on cyberworlds (cw) - yokohama, japan (2013.10.21-2013.10.23)]...
TRANSCRIPT
Improving Structured Content Design of Digital SignageEvolutionarily Through Utilizing Viewers’ Involuntary Behaviors
Ken Nagao and Issei Fujishiro
Graduate School of Science and TechnologyKeio University
Yagami, Kohoku-ku, Yokohama, Kanagawa 223-8522, Japan{nagao, fuji}@fj.ics.keio.ac.jp
Abstract—Digital signage has been getting more popularitydue to the recent development of underlying hardware technol-ogy and improvement in installing environments. Any signageis required to make its content more attractive to viewers byevaluating the current attractiveness on the fly, in order todeliver the message from the sender more effectively. However,most previous methods for signage require time to reflectthe viewers’ evaluations. In this paper, we present a novel,viewer-adaptive digital signage displaying academic conferenceposters, which automatically learns what structure and contentare being attractive through the viewers’ involuntary behaviors,and makes them more adapted to the viewers. This systemtakes away the current gap between viewers’ evaluationsand the content actually displayed on digital signage, andmakes efficient mutual transmission of information betweenthe cyberworld and the reality.
Keywords-digital signage; image processing; genetic algorith-m; human behavior recognition
I. INTRODUCTION
Recent developments of underlying hardware technology
and improvements in installing environments have facilitated
the dissemination of digital signage. There can be found
many advantages to use digital signage compared to tradi-
tional paper signage. Especially, digital sigange can interact
with viewers and thus has recently attracted much attention
as a new interaction device from the field of computer vision.
In order to deliver the original message effectively, any
signage has to provide “attractive” content to its potential
viewers. Therefore, it is important to survey viewers’ eval-
uations towards the signage and modify the appearance of
the content accordingly. In general, such survey and modifi-
cation are carried out by humans because it is necessary
to allow for the subjective matters as well as aesthetic
issues. This imposes them tedious tasks, which brings on the
gap between viewers’ evaluations and the content actually
displayed on digital signage. Direct interaction between
viewers and digital signage can be a solution for this, while
most current systems rely on viewers’ voluntary interactions
with digital signage. In order to remove the gap, we propose
an automatic method for the survey and modification.
Our method utilizes viewers’ involuntary behaviors in
front of a digital signage. For example, a viewer may change
his head angle for reading each part of the content. If the part
was convincing he would nod or if it was unacceptable he
would shrug his shoulders. Such involuntary behaviors come
directly from his emotion, and hence, if the digital signage
system can capture and recognize them, it can automatically
estimate his actual feeling towards each part of the content.
In addition to this, the gaze trajectory of the viewer suggests
an ideal structure of the content to the system.
Not only the survey but also the modification of the
content should be done automatically. Our method utilizes
an idea of genetic algorithm (GA): mutation and crossover.
Through the evolutionary algorithm, the system can auto-
matically learn “what structure and content are attractive”
for the modification. In the system, “content designs” are
regarded as “individuals” in the GA.
With these novel functionalities, the system takes away the
gap, and produces a new comfortable style of transmitting
information. In our previous position paper [1] and poster
[2], we have proposed the basic idea, while they do not
include practical algorithms, nor does not take into account
learning attractive structure of a content. Through the course
of this study, we have recently extended our method so
as to include the utilization of viewers’ attention point
trajectories for learning attractive structure of a content. In
order to prove the effectiveness of the proposed method, we
have developed a system for automatically making academic
conference posters more adapted to viewers by utilizing
viewers’ involuntary behaviors. Figure 1 shows an example
of adaptive conference poster. The proposed system in actual
use can be seen in Figure 2, and Figure 3 illustrates how
the viewers’ behaviors are recognized.
II. RELATED WORK
So far, direct questions to viewers, such as interviews or
questionnaires, have been commonly used for evaluating a
content of digital signage [3]. One major drawback in such
direct questions is that much human labor is required for
gathering and analyzing the sufficient evaluation data. In
addition to this, it is necessary to check up the validity of the
answers because they do not always tell their actual feeling
in such situations.
2013 International Conference on Cyberworlds
978-1-4799-2245-1/13 $31.00 © 2013 IEEE
DOI 10.1109/CW.2013.68
108
In many academic fields, there have been various attempts
to detect human behaviors. Some works achieved to recog-
nize human behaviors in a non-contact manner. For example,
there are many known attempts to recognize human faces in
video [5] [6], and Microsoft KinectTMenables non-contact
motion captures in an inexpensive way. The reliability of the
motion captured by Microsoft KinectTMis verified through
previous attempts of research and development [7].
These methods can gather viewers’ behaviors without
making them aware of the system, and thus have been
becoming to be used for content evaluation of digital sig-
nage. Some digital signage systems have recently started to
use a fixed video camera for gathering viewers’ evaluations
towards the content. Those systems mainly gather demo-
graphics segmentation, such as gender, age and viewing
time [8]. However, those data are not necessarily related to
the viewers’ feelings, and these systems cannot distinguish
whether the viewer has negative or positive feeling towards
the content. In addition, those systems cannot detect where
the viewers are looking at, and employed evaluations are
not enough to analyze which section of the content is
attractive or not. There are many known works to utilize
gaze trajectory for analyzing the structure of a Web page
[9], but there are few research attempts towards the content
of digital signage as far as we know.
The question on the attractive content design has been
a difficult question because it involves many subjective
issues encompassing from psychology to other disciplines.
There have been some research challenges into tackling this
problem. For example, Singh and Bhattacharya proposed an
algorithm to improve the aesthetics of Web interface utilizing
a GA [10]. For the evaluation in the GA, they used more
than ten geometrical features of page layout. However, the
approach does not take user’s evaluation to the Web interface
into primary account, and they concluded that it is difficult
to define the cases in which the approach works well.
Moreover, in the case of digital signage, it should be
considered that local tastes towards the content may be
different depending on the place where it is installed. There
are some known digital signage systems which enable inter-
action with viewers through utilizing interactive devices such
as touch panels. This makes it possible for the systems to
directly show them contents which they want to see, while
the systems have to wait for the explicit interaction from
them, which would not work efficiently.
Considering pros and cons of these methods, we have
proposed an idea of making digital signage more attractive
utilizing viewers’ involuntary behaviors [1] [2]. To the best
of our knowledge, this was the only one attempt to utilize
viewers’ involuntary behaviors for automatically making
digital signage more attractive. We took advantages of a
GA, and presented a method which evolutionarily produces
attractive content designs. However, the users of the system
are required to preliminarily define a hierarchical content
structure they want to show, and modification towards digital
signage is limited in terms of structure, while the defined
structure may not fit local viewers’ evaluations. On the top of
that, the position paper and poster do not include sufficiently
pragmatic algorithms.
In order to ameliorate these problems, our extended
method utilizes viewers’ involuntary behaviors not only for
evaluating each section of the content but also for learning
an ideal, overall structure of the content through utilizing
viewers’ attention point trajectories.
(a) A randomly-generated initial design in a first generation.
(b) An evolved design in the seventh generation.
Figure 1. An example of adaptive conference poster.
Figure 2. A snapshot of the proposed system in actual use.
III. APPROACH
As shown in Figure 4, the main processing flow in our
system is composed of two steps: (a) evaluation step and
(b) modification step. Hereafter, we will refer to a person
109
Figure 4. The processing flow of the proposed method. First, based on the defined content by a sender, the system randomly creates a first generation ofcontent designs. (a) The system displays each content design of the same generation in turn. When each content design is displayed, the system recognizesviewers’ involuntary behaviors and surveys the evaluations. (b) After all content designs are evaluated, they are sorted in an evaluation order. Then thesystem breeds new content designs through mutation and crossover, considering the evaluations. Also, some highly-ranked content designs are inherited tothe next generation. After generating new content designs of the next generation, the system displays them again for re-evaluations. Repeating these steps,the system evolutionarily produces more attractive content designs. If an acceptable content design is reached, the evolution terminates.
Figure 3. An example of recognition of viewers’ behaviors. The systemuses Microsoft KinectTMand Kinect for Windows SDK for motion captureand face recognition.
who uses this system for delivering his messages to viewers
simply as “sender”.
Firstly, a sender is required to give the system what he
wants to show to the public in a certain format. Using these
contents, the system randomly creates some content designs,
which constitute a first generation of content designs.
Next, the system starts to display them to the public
in turn. While each content design is being displayed, the
system recognizes the viewers’ involuntary behaviors in
front of the digital signage for evaluating each section and
the sectional structure of each content design. Without the
need to wait for the viewers’ voluntary interactions, the
system can gather evaluations from a number of viewers.
In addition to this, those behaviors come directly from their
actual feelings towards the content designs, and thus it is not
necessary to check up the validity of the evaluation. “Invol-
untary behaviors” include gaze-related information such as
head pose and pointing fingers. Therefore, our system can
understand which section of the content is attractive from
them. Also, how the viewer’s attention point moves tells the
system the relationships among the sections.
Using these analysis results as a guide, the system gen-
erates the next generation of content designs in two ways:
mutation and crossover. The next generation also includes
the top-ranked content designs in the current generation.
After that, the system displays the new content designs once
again for viewers to re-evaluate them. Repeating these steps,
the system evolutionarily produces more attractive content
designs, and reaches an ideal content design which fits local
tastes. This comes from an idea of GA, which is useful for
solving problems whose structure is not well understood and
for which no exact solution is found, such as the problem
of seeking “what structure and content are attractive”. Also,
GA has an ability to overwhelm manual designs.
Through the proposed method, the system automatically
learns “what structure and content are attractive” and mod-
ifies the content based on the evaluations, so the senders
do not need to analyze the local tastes, search for attractive
designs, nor modify the content manually.
Figure 1 illustrates one example of evolved content de-
signs. In the evolved content design, each of the figures in
the poster is independently adjusted in terms of size, and
each section is positioned based on its relationship with the
others. Also, some sections are modified for its attractiveness
in the evolved content design.
110
A. Defining a Content
First, the sender is required to make it clear what he
wants to show to the public. Here, the sender needs to
divide the content into smaller sections. For example, an
academic conference poster can actually be divided into
“title”; “authors”; “section titles”; “content of sections”; and
“reference figures”. Hereafter, we will simply refer to the
overall design of the content as content design, and small
content sections defined in this step as sections.
Each section is required to have two kinds of properties:
content properties and graphical properties. Content prop-
erties are what the sender wants to tell to the public in
the corresponding section. For example, in the section of
“authors”, content properties are specified as the names of
the authors. On the other hand, graphical properties indicate
how to decorate the section, such as font; font color; frame
color; and size of the section. Based on these properties, the
system creates the first generation of content designs.
B. Evaluation Step
In this step, the system displays the content designs of the
same generation to the public in turn. Here, the content
design will switch to another when a fixed time duration
runs over and at that point no viewer is looking at it.
While each content design is being displayed, the system
gathers the viewers’ evaluation towards the content design
by recognizing their involuntary behaviors, and analyzing
the content design.
1) Recognizing Viewers’ Involuntary Behaviors: In front
of digital signage, the viewers behave in various ways. It is
important that the system recognizes the behaviors without
making the viewers aware of the system, or their behaviors
would be biased. Viewers’ behaviors should be “involuntary”
and come directly from their actual feelings. Therefore, in
addition to the necessity of capturing the behaviors in a non-
contact way, the device should be hidden from the viewers.
In order to make these behaviors easy to understand, we
define two types of behaviors: behaviors indicating viewers’
attention point (attention pointer) and behaviors indicating
viewers’ feeling (feeling indicator).
Attention pointers are behaviors such as changing head
angle and pointing gesture. Recognizing these behaviors and
considering the position where the viewer is standing and the
size of digital signage, the system can estimate which section
the viewer is looking at. In addition to this, the trajectory of
the viewer’s attention point tells the system the relationships
among the sections.
On the other hand, feeling indicators are gestures such
as folding arms; nodding; and shrugging shoulders, which
are deeply related to the viewers’ emotions. Finding the
relationships between these behaviors and feelings has been
studied extensively [4]. Our system can learn the meaning
of the behaviors referring to such a kind of database of our
own that indicates the relationships.
When recognizing these behaviors, the system needs to
take the diversities of the viewers into account. One point is
a difference in moving speed. For example, when the viewer
is a senior, he behaves slowly, while most students behave
quickly. The system needs to consider an average motion
speed of a viewer, and changes the weight of evaluation
according to the motion speed. Another point is a difference
in culture. Meanings of viewers’ behaviors can be different
depending on cultures. Considering that previous research
on the detailed meaning of gestures has limited capability
because human affection is a subjective matter as mentioned
before, we herein focus only on one culture, Japanese, and
then classify the meanings just according to whether the
gesture has positive or negative meaning.
2) Surveying Evaluations of Each Section: Based on
these behaviors, in each content design, the system surveys
and analyzes the gathered evaluations of each section in two
aspects: how eye-catching and unconvincing the section is.
The former one is defined as the total of time duration
when viewers are looking at the section, no matter how they
behave. This time is given to each section respectively and
normalized by the size of the section. This indicates how
eye-catching the section is, without reference to viewers’
feelings towards its content properties. We will refer to this
evaluation as eye-catching evaluation of the section. The
latter one is defined as the total of time duration when the
viewers are looking at the section with negative feeling. In
the same way as eye-catching evaluation, this time is given
to each section and normalized by the size. We will refer to
this evaluation as unconvincing evaluation of the section.
3) Surveying Evaluations of Overall Structure: Unlike
the previous method [1] [2], the new system does not require
the sender to define the relationships between the sections of
the content, which are important for the layout of the content
design. One advantage of this is that it reduces human labors.
The sender only needs to define what he wants to show to
the public, and does not need to analyze the structure. The
other point is that there may be a more ideal structure rather
than the one the sender thinks the content has. For example,
in the case of academic conference posters, the relationships
between a section “system overview” and reference images
can be different. If there were professional people in the
field of the content in a place where the digital signage
is installed, they would find the relationships between the
section “system overview” and a flowchart image, while
if there were nonprofessional people, they would find the
relationships between the section and execution example
images. An appropriate structure depends on a situation
where digital signage is set, and the system needs to have
an allowance for this.
The system surveys the sectional structure evaluations
utilizing the trajectory of a viewer’s attention point. A viewer
changes his attention point while he reads the content of
digital signage, and the trajectory is based on the roles
111
and meanings of the sections. Therefore, by recognizing the
trajectory, the system can estimate the relationships between
sections. Figure 5 shows an example of relationships be-
tween sections found by the system, connecting each section
with a section which was estimated to have the biggest
relationship.
Figure 5. Based on the trajectories of viewers’ attention point, the systemestimates the structure of the content design.
C. Modification Step
After all content designs are evaluated, they are sorted
in an evaluation order based on the time duration each
content design was looked at without unconvincing feelings.
Then, the system breeds new content designs utilizing the
idea of GA. In this step, the system needs to consider the
relationships which the system has analyzed. In addition to
this, the system should not change the property values of
each section in an equally-weighted manner, because the
weight of information each section has can be different.
For example, in the case of academic conference posters,
changing the appearance of the title extensively affects the
look and feel of the content design more than changing that
of section title. Keeping such information inertia in mind,
our system makes modifications to sections with heavy
information such as the title smaller than other sections with
light information.
The modification is performed in two ways: mutation and
crossover. Figure 6 shows examples of the breeding.
1) Mutation: In this step, the system breeds some new in-
dividuals by changing the property values of each section in
the highest-ranked content design. In the system, mutation is
not executed at random. Here, the system takes into account
the evaluations of each section and sectional structure.
The system uses eye-catching evaluation and unconvinc-ing evaluation for the guide of mutation to each section. A
section which got a high value of eye-catching evaluation is
a section which the viewers found worth looking at. There-
fore, to emphasize the section is important for making the
content design more attractive, and thus the system randomly
changes its graphical properties. The system does not need to
consider what kind of change should be done to the graphical
properties for attractiveness, because unattractive individuals
will be exterminated later.
A section which got a high value of unconvincing eval-
uation is a section whose content property is thought to be
unconvincing by the viewers, and thus content properties
of the section should be changed for making the content
easier to understand, especially in the case the section
got a high value of eye-catching evaluation. In the same
way as before, the system does not need to consider what
kind of change should be done to the content properties
for making the section more convincing, and the system
randomly selects the value from the pre-defined values of
the content properties.
In addition to these modifications to each section, the
system modifies the layout of content design based on the
analysis about the relationships of sections. The system puts
up a layout where mutually-related sections are set closer.
In Figure 6, the section “system overview” is emphasized
because it gathered much attention from the viewers. On
the other hand, the sentences of the section “approach” are
replaced with other pre-defined sentence patterns because
the previous sentences were thought to be unconvincing by
the viewers. In addition to these, the sections which were
thought to have the relationships were set closer.
2) Crossover: In addition to those by mutation, the
system breeds new individuals by crossover. From highly-
ranked content designs, the system selects two content
designs for crossover, and in crossover, the system breeds
new content designs by combining property values of each
section in the two content designs.
In Figure 6, the system breeds a new content design whose
properties come from the two previous content designs.
IV. ALGORITHM
In this section, each of the steps is detailed by describing
its algorithms.
Figure 6. Examples of mutation and crossover of designs.
112
A. Defining a Content
Each section is given its section number. When the total
number of section is n, each section is given one sectionnumber from 1 to n respectively. Here, the sender defines
what values the content properties can take, because there
exist various ways to express the message and the current
system does not have natural language processing capability
for producing new sentences. On the other hand, values
of graphical properties are given to each by the system
automatically.
Giving random values to graphical properties and random-
ly selecting the pre-defined values for content properties of
each section, the system creates a fixed number of initial
content designs. Layout of each content design is also
randomly set, while there is a constraint which requires the
section “title” and “authors” to be set at the top of a poster so
that each content design satisfies minimal requirements for
a layout of an academic conference poster. These generated
content designs constitute the first generation of content
designs. The sender can define the total number N of the
content designs of the same generation. Here, the system
gives each content design one content design number from
1 to N respectively.
B. Evaluation Step
For capturing the viewers’ behaviors, our system uses
Microsoft KinectTMand Kinect for Windows SDK 1.7, which
allow us to capture various behaviors such as head pose;
facial expression; and gestures of multiple viewers involun-
tarily (See Figure 3).
By categorizing these behaviors as attention pointers or
feeling indicators, the system gathers eye-catching evalua-
tions and unconvincing evaluations. These evaluations are
approximated by Gaussian kernels. From attention pointers,
the system detects where the viewers are looking at in the
content design, and regard these as the center of Gaussian
distribution respectively. As you can see in Figure 7, the
system gives values to the content design based on the
distributions and accumulates the values until the content
design is changed. Meanings and weights of the values
are defined by feeling indicators and considered in the two
aspects as mentioned above. Therefore, when evaluations are
finished, for each content design, there are two heightmaps:
one indicates eye-catching evaluation and the other indi-
cates unconvincing evaluation. Then, in each heightmap,
the system distributes and normalizes the values to each
section based on the sizes and positions of the sections (See
Figure 7). Here, let Ej,k be the total amount of eye-catching
evaluation which has been given to the k-th section of the
j-th content design. In a similar way, let U j,k be the total
amount of unconvincing evaluation which has been given to
the k-th section of the j-th content designs.
In addition, the system recognizes the trajectories of
viewers’ attention points and estimates the relationships
Figure 7. An example of gathering evaluation. The system uses Gaussiandistributions and accumulates each evaluation by viewers until displayingthe content design finishes. There is the other heightmap which indicatesthe other evaluation.
between sections. Let F vx (x = 1 · · ·n) be a value which
indicates how long a viewer has been focusing on the x-th
section, where v indicates the identifier of the viewer given
by the system. Initially, when one content design starts to be
displayed, for any v, F v1 , F
v2 , · · · , F v
n are set to zero. When
the viewer v is looking at the f -th section, F vf increases
according to the looking time, and when the section is not
looked at, it decreases slowly until it comes to zero as time
goes. In this way, the system memorizes which sections the
viewer was looking at before. For example, if F vf was the
biggest among F v1 , F
v2 , · · · , F v
n , this means that the viewer
v was looking mainly at the f -th section before.
On the other hand, each section has values which indicate
the relationships with the other sections. Let Rj,kx (x =
1 · · ·n) be the values which the k-th section of the j-
th content design has. When some viewer is looking at
the f -th section when the j-th content design is dis-
played, Rj,f1 , Rj,f
2 , · · · , Rj,fn increase correspondingly based
on F1, F2, · · · , Fn the viewer has, and accumulate the values
by every viewer:
for any v ΔRj,fx = F v
x (x �= f, x = 1 · · ·n)From these values, the system estimates the degrees
of the relationships among sections. For example, when
displaying the number j content design is finished, if
max{Rj,f1 , · · · , Rj,f
n } is Rj,fm , this means that the f -th
section has the biggest relationship with the m-th section
in the j-th content design. Figure 8 illustrates an example
of calculating the relationships among the sections.
C. Modification Step
A probability that the modification of graphical property
is applied to a section (let pj,k be the probability of the
k-th section of the j-th content design) corresponds to the
calculated eye-catching evaluation value of each section:
pj,k =Ej,k
max{Ej,1, Ej,2, · · · , Ej,n} × aggressiveness,
113
Figure 8. An example of calculating the relationships among the sections,where viewer 1 is looking at the 3rd section of the j-th content design.The values which indicate the relationships between the 3rd section andthe other sections will increase according to the values indicating whichsection viewer 1 was looking at before mostly.
where aggressiveness means how aggressive modifications
are applied. This value can be defined by the sender. On
the other hand, considering both eye-catching evaluation and
unconvincing evaluation, a probability that the modification
of content property is applied to the section (let p′j,k be
the probability in same way as pj,k) can be formulated as
follows:
p′j,k =U j,kpj,k
max{U j,1, U j,2, · · · , U j,n} .
Here, the system creates a new layout based on
the relationships the system estimates. Firstly, the
system puts up a fixed number of layout candidates,
which are rearranged randomly, while they satisfy
minimal requirements for a layout of an academic
conference poster as mentioned before. Next,
for every layout candidate, the system calculates
d =∑n
k=1 distance(k−th section,mk−th section).Here, mk-th section is the section which has the
biggest relationship with k-th section (therefore,
Rj,kmk
is equal to max{Rj,k1 , · · · , Rj,k
n }), and
distance(k−th section,mk−th section) means Euclidean
distance between the center of k-th section and that of
mk-th section. After all d are calculated, the system finds
out the layout which has the smallest value of d. In this
layout, mutually-related sections get closer, and thus the
system selects the layout for the modification so that a
viewer can easily find out important sections which are
related to the section he is looking at.
In addition to those by mutation, the system breeds
new individuals by crossover. From highly-ranked content
designs, the system selects two content designs for crossover.
The possibility of values in these combination is based on the
eye-catching evaluation and unconvincing evaluation of each
content design so that sections with eye-catching appearance
and convincing contents are selected for the crossover. The
system compares each value in the same sections. For
example, when the system crossovers two content designs, aand b, in the section k, the probability that graphical property
in each content design is used is in the ratio of Ea,k to Eb,k.
V. RESULTS AND EVALUATION
For an empirical evaluation of the system, we made up
some sets of three content designs: (a) a content design
whose layout and properties were set by a man; (b) one
from the first generation; and (c) one from the seventh
generation where N is six and the time interval of each
content design is ten minutes. In this case, five to ten non-
professional students attended as viewers in each content
design. The reason why N is so small is that content designs
of academic conference posters are confined within a few
variety of designs and there could be similar designs if Nwas large. Figure 9 shows example results of (a), (b) and
(c).
In order to compare attractiveness, convincingness and
layout of (a), (b) and (c), we showed these produced content
designs to thirty-two subjects. As was the case with the
viewers, these subjects were non-professional students. As
we mentioned above, there are different sets of (a), (b) and
(c), and thus these people did not always see the same set
of (a), (b) and (c). After this, we asked them how they feel
about the three content designs. The results are tabulated in
Table 1.
(a) (b) (c)
Figure 9. An example of content designs used for evaluations: (a) onemade by a man; (b) one from the first generation; and (c) one from theseventh generation.
Table 1. Result of user evaluation with the three designs (a)-(c),whose examples are shown in Figure 9. These results show howmany subjects thought (c) is better than (a), and (c) is better than(b) in terms of attractiveness, convincingness and layout.
(c) is better
than (a)
(c) is better
than (b)
Attractiveness 53% 78%
Convincingness 40% 69%
Mutually-related layout 50% 69%
In the results, in every case, about 70% of the subjects
thought evolved content designs are better than the initial
content designs. This suggests most of the content designs
were improved through the method. In addition to this, about
114
half of the subjects thought evolved content designs are
better than the one by the man, which indicates the system
has an ability to overwhelm manual designs. On the other
hand, as you can see, in some cases some subjects thought
(b) is better than (c). In most of the cases, these subjects
answered this was because they preferred “simple” content
designs. For example, in Figure 10, we can easily identify
the relationships among sections in the evolved design (right)
because of the frames, while some subjects thought these
frames as annoying and preferred the initial design (left)
even though the layout of the initial design was randomly
set and did not consider the relationships among the sections.
This gives us a useful suggestion to consider the weight of
“way to change graphics” and individual preferences.
Figure 10. An example of a case where an initial content design (left) isthought to be better than an evolved content design (right).
VI. CONCLUSION
In this paper, we presented a novel pragmatic method for
surveying, evaluating and modifying the content design as
well as the structure of the content design, which can
be differentiated from the position paper [1] and poster
[2]. This method realizes the digital signage system which
automatically learns what structure and content of digital
sigange are attractive and makes the content of the digital
signage more adapted to the viewers, through utilizing the
viewers’ involuntary behaviors and the idea of GA. The
system takes away the current gap which lies between the
viewers’ evaluations and the content actually displayed on
digital signage. We proved the feasibility of the method
empirically through the development of a viewer-adaptive
digital signage for displaying academic conference posters.
One major problem in our method is the timing of chang-
ing the content design. In the method, the content design
will switch to another within the same generation when a
fixed time duration runs over. However, this is inefficient if
there were a small number of people, and we need to seek a
better way to switch the content design. Besides, we limited
the usage of digital signage only for displaying academic
conference posters, but we will be able to generalize the
method towards a variety of applications. Especially, the
proposed method enables the digital signage system to learn
the structure of the content, which paves the way for the
system to handle various contents. As for behaviors, we
just bisected involuntary behaviors, but we will be able to
categorize those behaviors into multiple kinds.
ACKNOWLEDGMENT
The work is supported in part by a Grant-in-Aid for the
Leading Graduate School Program for “Science for De-
velopment of Super Mature Society” from the Ministry
of Education, Culture, Sports, Science and Technology in
Japan.
REFERENCES
[1] K. Nagao and I. Fujishiro, “Making digital signage adaptivethrough a genetic algorithm – Utilizing viewers’ involuntarybehaviors –, Proceedings of the INSTICC International Con-ference on Computer Vision Theory and Applications, Vol. 2,pp. 54–59, 2013.
[2] K. Nagao and I. Fujishiro, “Understanding viewers involun-tary behaviors for adaptive digital signage, presented as posterat ACM SAP2013, and abstract available from IEEE Xplorer.
[3] F. Alt, S. Schneegaß, A. Schmidt, J. Muller, and N. Memarovic,“How to evaluate public displays,” Proceedings of the 2012 In-ternational Symposium on Pervasive Displays, Article No. 17,2012.
[4] H. Gunes and M. Piccardi, “A bimodal face and body gesturedatabase for automatic analysis of human nonverbal affectivebehavior,” Proceedings of the 18th International Conferenceon Pattern Recognition, pp. 1148–1153, 2006.
[5] S. Gutta, J. R. J. Huang, P. Jonathon, and H. Wechsler,“Mixture of experts for classification of gender, ethnic origin,and pose of human faces,” IEEE Transactions on NeuralNetworks, Vol. 11, No. 4, pp. 948–960, 2000.
[6] M. Pantic and L. J. M. Rothkrantz, “Automatic analysis offacial expressions: the state of the art,” IEEE Transactions onPattern Analysis and Machine Intelligence, Vol. 22, No. 12,pp. 1424–1445, 2000.
[7] R. A. Clark, Y.-H. Pua, K. Fortin, C. Ritchie, K. E. Webster, L.Denehy, and A. L. Bryant, “Validity of the Microsoft Kinectfor assessment of postural control,” Gait & Posture, Vol. 36,No. 3, pp. 372–377, 2012.
[8] J. Muller, J. Exeler, M. Buzeck, and A. Kruger, “Reflec-tiveSigns: Digital signs that adapt to audience attention,”Proceedings of the 7th International Conference on PervasiveComputing, pp. 17–24, 2009.
[9] J. H. Goldberg, M. J. Stimson, M. Lewenstein, N. Scott, andA. M. Wichansky, “Eye tracking in web search tasks: designimplications,” Proceedings of the 2002 Symposium on Eyetracking research & applications, pp. 51–58, 2002.
[10] N. Singh and S. Bhattacharya, “A GA-based approach toimprove web page aesthetics,” Proceedings of the First Interna-tional Conference on Intelligent Interactive Technologies andMultimedia, pp. 29–32, 2010.
115