
1541-1672/12/$31.00 © 2012 IEEE. IEEE Intelligent Systems. Published by the IEEE Computer Society.

Artificial Game Player

An Artificial Player for a Language Game

Giovanni Semeraro, Marco de Gemmis, Pasquale Lops, and Pierpaolo Basile, University of Bari Aldo Moro

A knowledge infusion process gives a system the linguistic and cultural knowledge—normally the prerogative of human beings—required to play a complex language game.

In a famous scene in The Matrix, Neo and Trinity are preparing to escape from the roof of a building in a helicopter. Neo asks Trinity whether she is able to pilot the helicopter, and she replies, "Not yet." She uses her cell phone to ask for a flying program for that helicopter, and a skills file is directly inoculated into her brain, letting them take off. Even though this futuristic scenario involves humans rather than machines, it suggests one of the most challenging tasks for AI: to find a way to automatically introduce knowledge into systems that perform tasks requiring human-level intelligence.

We define knowledge infusion (KI) as the process of providing a system with background knowledge that gives it a deeper understanding of the information it deals with. Because many sources of world knowledge have become available in recent years on the Web (such as Wikipedia), we realize KI by extracting "knowledge gems" from the unstructured information stored in those sources to create a memory of world facts that an intelligent system can exploit for the task at hand. Similarly, humans offload increasingly more cognitive processing onto cognitive technology, such as Google searches, when they look for an item on the Web instead of in their own brains.1

We selected language games as a benchmark for the proposed KI process because no fixed set of rules is sufficient to define the game play: solving the game depends exclusively on the system's background knowledge. We developed an artificial player called Ottho (On the Tip of My Thought) to solve a language game that demands knowledge of a broad range of topics, such as movies, literature, and proverbs. We created Ottho's memory through a KI process that adopts natural-language processing (NLP) techniques to build a knowledge base, which a reasoning mechanism exploits to play the game and propose possible solutions.

The Guillotine Game

We tested Ottho on the game Guillotine, a segment featured on the Italian game show L'eredità. In Guillotine, a single player is given a set of five words as clues; the five clue words are unrelated, but each is strongly linked to the same other word, which is the game's unique solution. For example, given the five clues sin, Newton, doctor, pie, and New York, the solution would be apple: in popular Christian art, an apple is often presented as the forbidden fruit eaten by Adam and Eve and thus the symbol of original sin; Newton supposedly discovered gravity because an apple fell on his head; "an apple a day keeps the doctor away" is a proverb; apples are a common pie ingredient; and New York City is called the "Big Apple."

September/October 2012, www.computer.org/intelligent

Once the five clues are given, the player has one minute to come up with the right answer. In that time, the player must perform a complex memory retrieval task on his or her knowledge concerning the meanings of thousands of words and their contextual relations.2

On the Tip of My Thought

To build the memory of our system, we drew on the following knowledge sources that would help a human solve the puzzle:

• an encyclopedia, the Italian version of Wikipedia;

• a dictionary, the De Mauro Paravia Italian online dictionary (no longer available);

• compound words—groups of words that often go together to have a specific meaning—retrieved from the IntraText Digital Library (www.intratext.com/bsi/listapolirematiche/indalfa.htm) and the online dictionary Tesoro della Lingua Italiana delle Origini (http://ovipc44.ovi.cnr.it/Tliopoli);

• proverbs and aphorisms from the Italian version of Wikiquote;

• descriptions of Italian movies crawled from the Internet Movie Database (www.imdb.com);

• Italian songs crawled from OnlyLyrics (www.onlylyrics.com); and

• book titles retrieved from several websites.

In previous work, we modeled each source as a term-term matrix whose cells represented the degree of correlation between one term in a row and one in a column, according to specific heuristics.3 In this article, we propose a novel strategy based on the adaptive control of thought theory, according to which information in humans' long-term memory is encoded as cognitive units (CUs) that form an interconnected network.4 Cognitive units are the core of the system (see Figure 1).

In the KI process, knowledge sources are considered to be repositories of CUs. The information from the knowledge sources is in textual form, so we regard a CU as the textual description of a concept. Because applying a common reasoning mechanism requires a general way to represent concepts, we adopted a CU model that does not depend on the specific knowledge source. We represent each CU by two fields:

• head, or words identifying the concept represented by the CU, and

• body, or words describing the CU.

In a more compact way, CU = [head | body]. As an example, from the Wikipedia page providing the description of the concept "artificial intelligence" (http://en.wikipedia.org/wiki/artificial_intelligence), we can build a possible CU:

CU = [artificial intelligence | intelligence computer science agent McCarthy reasoning …]

Words can be associated with a weight representing how important they are for the concept, resembling the bag-of-words approach to information retrieval (IR).
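As an illustrative sketch (not the authors' implementation), the [head | body] model with weighted words could be represented as follows; the class and field names, and the weights in the example, are hypothetical:

```python
# Hypothetical sketch of the CU = [head | body] model; names and
# weights are illustrative, not taken from the Ottho implementation.
from dataclasses import dataclass


@dataclass
class CognitiveUnit:
    head: dict[str, float]  # words identifying the concept, with weights
    body: dict[str, float]  # words describing the concept, with weights

    def words(self) -> dict[str, float]:
        """All weighted words of the CU; head weights win on clashes."""
        return {**self.body, **self.head}


# The "artificial intelligence" example from the text, with made-up weights.
cu = CognitiveUnit(
    head={"artificial": 1.0, "intelligence": 1.0},
    body={"computer": 0.8, "science": 0.8, "agent": 0.6,
          "mccarthy": 0.6, "reasoning": 0.5},
)
```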

This model has two main advantages. First, it does not depend on the types of knowledge sources, with the exception of the strategy for selecting words in the head and body, which is completely incorporated into the knowledge extractor module. That module runs basic NLP operations on the descriptions provided by the knowledge sources and stores the resulting CUs in separate repositories. Second, this model represents CUs as structured textual documents, letting us adopt an IR model for retrieving relevant CUs associated with queries.

Figure 1. Ottho (On the Tip of My Thought) system architecture. The system crawls the Web and extracts knowledge, stored in the form of cognitive units (CUs). To come up with an answer in the Guillotine game, the retrieval module queries the CU repositories and passes the answers to the reasoning module.

KI starts from a set of given words (clues for the Guillotine game), which trigger the reasoning step. The knowledge retrieval module queries the repositories of CUs to retrieve the most appropriate pieces of knowledge related to the clues. We adopt vector space as the retrieval model, use the term frequency-inverse document frequency (TFIDF) weighting scheme to generate the scores for words within CUs, and compute relevance as the cosine similarity between keywords and CUs. The clues and retrieved CUs are then passed to the reasoning module, which produces a list of the most informative words related to all clues, representing the candidate solutions.
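A minimal, self-contained sketch of this retrieval step, using the standard TFIDF and cosine formulas rather than the authors' code; the toy CU texts are illustrative:

```python
# Sketch of TFIDF scoring plus cosine relevance over toy CU bodies
# (illustrative data, not the real repositories).
import math
from collections import Counter

cus = [
    "isaac newton physicist gravitation apple".split(),
    "newton unit force mechanics".split(),
    "sin christianity apple adam eve".split(),
]

def idf(term, corpus):
    """Inverse document frequency: log(N / df), 0 for unseen terms."""
    df = sum(term in doc for doc in corpus)
    return math.log(len(corpus) / df) if df else 0.0

def tfidf(tokens, corpus):
    """Bag-of-words vector weighted by term frequency times idf."""
    return {t: f * idf(t, corpus) for t, f in Counter(tokens).items()}

def cosine(u, v):
    """Cosine similarity between two sparse word-weight vectors."""
    norm = lambda w: math.sqrt(sum(x * x for x in w.values())) or 1.0
    return sum(u[t] * v.get(t, 0.0) for t in u) / (norm(u) * norm(v))

# Querying with the clue "newton" ranks the Newton CUs above the sin CU.
clue = tfidf(["newton"], cus)
scores = [cosine(clue, tfidf(doc, cus)) for doc in cus]
```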

Knowledge Sources as CU Repositories

We use different heuristics to define the head and body of each CU, as follows:

• From the encyclopedia, we define one CU for each Wikipedia article. The head contains the page title. For the body, we implemented a strategy for selecting the most informative words, using regular expressions to identify Wikipedia categories, words formatted in bold, section titles, and internal page links.

• From the dictionary, we define one CU for each meaning of a lemma. The head contains the lemma and its synonyms, whereas the body contains the words in the definition of that specific meaning. Polysemous words result in one CU for each possible meaning.

• For the compound forms, we define one CU for each example. The head is empty, whereas the body contains the sequence of words in the compound form.

• For the proverbs and aphorisms, we define one CU for each quote. The head contains the author (empty if none), whereas the body contains the words in the quote.

• For songs, movies, and books, we define one CU for each example. The head contains the author (empty if none), whereas the body contains the words in the title.

The knowledge extraction process produced a total of 743,192 CUs: 584,527 for the encyclopedia; 126,741 for the dictionary; 10,744 compound forms; 11,257 proverbs and aphorisms; and 9,923 songs, movies, and books.

Reasoning Mechanism

Because words and their meanings are stored in the mind in a network-like structure,4 we adopted the spreading activation network (SAN) model as Ottho's reasoning mechanism, consisting of a network of searched nodes.5 In the network for Guillotine, nodes represent words or CUs, and links denote associations between them obtained from the CU repositories.

The method for building the SAN for a run of the game is as follows. We assume that M knowledge sources (KS1, …, KSM) have been modeled as CU repositories and that five clues (k1, …, k5) are provided. Initially, source nodes N1, …, N5 are added to the network. Each Ni is labeled with the clue word ki. We populate the SAN by adding CUs related to clues. For each clue ki, a search is performed in KSm (m = 1, …, M) to retrieve a list of relevant CUs. At the end of this process, we get M lists of pairs Li^m correlated with ki:

Li^m = [(CU1, wi1), …, (CUh, wih)],

where CUj is the jth CU retrieved from KSm, and wij is the cosine similarity value between ki and CUj. The maximum length h of Li^m must be defined in order to control the size of the SAN. Essentially, we take at most M * h CUs related to clue ki, which are the ones with the highest similarity values.

For the sake of simplicity, this example is based on only two clues, newton and sin, and two knowledge sources, Dictionary and Wikipedia. Suppose that the queries "newton" and "sin" return seven CUs with the following similarity values:

Lnewton^Wikipedia = [(CU14, 0.92), (CU16, 0.75)]
Lnewton^Dictionary = [(CU7, 0.72)]
Lsin^Wikipedia = [(CU2, 0.65), (CU125, 0.41)]
Lsin^Dictionary = [(CU24, 0.55), (CU25, 0.54)]

The text descriptions of the retrieved CUs are as follows:

CU14 = [isaac(1.34) newton(1.55) | physicist(1.74) gravitation(1.66) apple(1.52)]
CU16 = [newton(1.55) | unit(0.77) force(0.65) mechanics(0.35)]
CU7 = [newton(1.87) | unit(1.02) force(0.75)]
CU2 = [sin(1.93) | Christianity(1.62) apple(1.45) Adam(1.65) Eve(1.64)]
CU125 = [original(0.55) sin(1.93) | movie(0.97) cuba(0.75)]
CU24 = [sin(1.54) | transgression(0.54) divine(0.45) law(0.44)]
CU25 = [sin(1.54) | commit(0.32) sin(0.29) principle(0.25) person(0.14)]

For each pair (CUj, wij) in Li^m, we add one node labeled with the CU identifier to the SAN and link it to the source node ki. Edges are oriented from the source nodes to the CU nodes and labeled with wij. At this stage of the SAN building process, edges represent associations between clues and CUs, whereas similarity values measure the strength of those relationships. Once all the pairs have been processed, for each CU node we include in the SAN word nodes labeled with the keywords (other than clues) contained in its head and body, and we create links from the CU node toward its word nodes. Figure 2 depicts the SAN for our example.

The spreading activation strategy consists of iterations called pulses. Each node Ni has an associated activation value Ai(p) at iteration p. A threshold F determines whether a node is fired—that is, whether it can spread its activation value over the SAN. At every pulse, each fired node propagates its own activation value to its neighbors as a function of its current activation value, the weights of the edges that connect it with its neighbors, and a decay factor D that limits the propagation of the activation value through the network. The activation values of its neighbors are updated accordingly.

The clues trigger this spreading process. At pulse p = 1, the system initializes the SAN by setting all activation values Ai(p) to 0 except the clues, whose activation value is set to 1. Then, the clues are fired and spread their activation values to their neighbors—that is, the CU nodes. For each link connecting the fired node Ni to the target node Nj, we update the activation value of Nj according to the following function:

Aj(p + 1) = Aj(p) + D wij Ai(p).  (1)

In the second and final pulse, only fired CU nodes will spread their activation values in the same way as in the first pulse. The final result is the activation value for each word node in the SAN at termination time.

We include the labels of the most active nodes in the candidate solution list (CSL). In the example, the activation value for the node apple is updated by two CU nodes (CU2 and CU14), so it is deemed a good candidate solution. Unit and force are also potential solutions, but the weights on their incoming edges are lower than those for apple.
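The two-pulse process on the example network can be sketched as follows. This is an illustrative reimplementation of Equation 1 using the weights above; the parameter values F = 0.55 and D = 0.9 are chosen from the ranges the article reports, not taken from the authors' code:

```python
# Sketch of two-pulse spreading activation implementing
# A_j(p+1) = A_j(p) + D * w_ij * A_i(p) on the article's example SAN.
from collections import defaultdict

def spreading_activation(clue_edges, cu_edges, F=0.55, D=0.9):
    A = defaultdict(float)
    # Pulse 1: clues start with activation 1 and fire toward their CUs.
    for clue in clue_edges:
        A[clue] = 1.0
    for clue, targets in clue_edges.items():
        for cu, w in targets:
            A[cu] += D * w * A[clue]
    # Pulse 2: only CU nodes above the firing threshold F spread further.
    for cu, words in cu_edges.items():
        if A[cu] >= F:
            for word, w in words:
                A[word] += D * w * A[cu]
    # Candidate solution list: word nodes ranked by final activation.
    vocab = {w for ws in cu_edges.values() for w, _ in ws}
    return sorted(vocab, key=lambda w: -A[w])

# Clue-to-CU edges (cosine weights) and CU-to-word edges (body/head
# weights) taken from the worked example in the text.
clue_edges = {
    "newton": [("CU14", 0.92), ("CU16", 0.75), ("CU7", 0.72)],
    "sin": [("CU2", 0.65), ("CU125", 0.41), ("CU24", 0.55), ("CU25", 0.54)],
}
cu_edges = {
    "CU14": [("isaac", 1.34), ("physicist", 1.74),
             ("gravitation", 1.66), ("apple", 1.52)],
    "CU16": [("unit", 0.77), ("force", 0.65), ("mechanics", 0.35)],
    "CU7": [("unit", 1.02), ("force", 0.75)],
    "CU2": [("christianity", 1.62), ("apple", 1.45),
            ("adam", 1.65), ("eve", 1.64)],
    "CU125": [("original", 0.55), ("movie", 0.97), ("cuba", 0.75)],
    "CU24": [("transgression", 0.54), ("divine", 0.45), ("law", 0.44)],
    "CU25": [("commit", 0.32), ("principle", 0.25), ("person", 0.14)],
}
csl = spreading_activation(clue_edges, cu_edges)
# apple, reinforced by both CU14 and CU2, should rank first
```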

Experimental Evaluation

Our experimental goal is to measure Ottho's ability to solve the game—that is, to determine whether the solution appears in CSLk, which is the list of labels of the k most active nodes in the SAN for that game. We rank nodes in the SAN in descending order according to their activation level at termination time, and we include the labels of the top-k nodes in CSLk.

For k = 1, Ottho is required to find the solution in only one attempt, as human players must, whereas k > 1 lets us evaluate whether Ottho might find the solution in a reasonable number of attempts (that is, whether the solution is in the SAN, even though it might be low in the list).

Figure 2. The spreading activation network (SAN) after two expansion stages. Clues newton and sin are connected to the CUs returned by the knowledge retrieval module, which are connected to the words contained in their head and body. The spreading process associates an activation value to every node (darker nodes are those with a higher level of activation).

Ranking nodes might be a good strategy for determining candidate solutions, but more sophisticated techniques could be required to select a unique answer from among them. We used two datasets in our experiments:

• TV_GAME includes 266 guillotines (sets of clues) and solutions attempted by human players during the TV show.

• BOARD_GAME includes 150 guillotines and solutions contained in the official board game, which resembles the TV game.

We defined a performance measure P@k for evaluating the precision of the system when k answers are provided:

P@k = #SOLVED_GAMES / N,  (2)

where N is the cardinality of the dataset, and #SOLVED_GAMES is the number of guillotines for which the solution is in CSLk. P@1 measures the percentage of games solved in a single attempt, whereas P@50 measures the percentage of solutions found in CSL50. Given a guillotine g from the datasets, we build the corresponding SAN by adopting two parameters to control its size:

• h, the maximum number of relevant CUs retrieved by querying a knowledge source with a clue, and

• t, the threshold for TFIDF associated with words in CUs.
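The P@k measure of Equation 2 amounts to the following check; the pairing of candidate lists with solutions in the example is illustrative:

```python
# Sketch of Eq. 2: a game counts as solved at k if its solution
# appears among the k most active nodes of its SAN.
def precision_at_k(games, k):
    """games: list of (ranked candidate list, solution) pairs."""
    solved = sum(1 for csl, solution in games if solution in csl[:k])
    return solved / len(games)

# Two toy games: the first is solved at k = 1, the second only at k = 3.
games = [(["apple", "unit", "force"], "apple"),
         (["unit", "force", "apple"], "apple")]
```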

Next, we activate the reasoning mechanism to build CSLk, given the firing threshold F and the decay factor D. (Different settings for these parameters determine different activation levels for nodes at termination time, so CSLk can vary accordingly.) If the solution for g occurs in CSLk, then Ottho has solved the game. We compute P@k by counting all the guillotines solved in the dataset.

The first experiment evaluates whether the knowledge contained in the sources included in Ottho would let the system determine the solution, regardless of its position in the ranking. For this purpose, we call a game solvable if the solution occurs in the SAN for that game, even if it doesn't appear in the CSL. Figure 3 shows the percentage of solvable games with different parameters controlling the size of the SAN. The maximum value for h is 10, because for higher values the large size of the SAN (more than 3,000 nodes for h = 25) doesn't allow Ottho to provide the answer in 1 minute. The maximum value for t is 0.5 to limit the aggressiveness of the feature selection. The results let us define an upper bound for the system's precision.

A single CU for each clue is sufficient to obtain acceptable results (over 60 percent solvable games observed on both datasets). Setting h = 5 or 10 lets the system solve about 90 percent of the games, but the size of the SAN doubles or triples respectively compared to h = 1. This leads to noisy networks in which it is more difficult for the spreading process to rank the solutions first (or even include them in the CSL at all). When we apply feature selection to tackle this problem, as expected, we wind up with fewer solvable games, because we are pruning words that might be solutions. With t = 0.1 and h = 10, the percentage of solvable games is 77 on BOARD_GAME and 73 on TV_GAME, which is the best performance with feature selection. This result is better than that obtained by the SAN built with a single CU retrieved for the clue and without feature selection (h = 1, t = 0), also taking into account the size of the SAN. In general, we observe better performance on BOARD_GAME compared to TV_GAME. A reasonable explanation for the difference is that guillotines in BOARD_GAME are meant just for fun, whereas those in TV_GAME are intended to challenge the contestants of the TV show.

Figure 3. Percentage of solvable games versus SAN size. Different settings for h and t generate different numbers of nodes in the SAN. The data recovered from the figure:

Setting        Solvable (Board)  Solvable (TV)  Nodes (Board)  Nodes (TV)
h=10, t=0      91%               91%            1,725          1,675
h=10, t=0.1    77%               73%            291            290
h=10, t=0.2    74%               68%            186            189
h=10, t=0.3    73%               66%            146            154
h=10, t=0.4    71%               63%            123            132
h=10, t=0.5    71%               61%            103            113
h=5, t=0       89%               87%            1,161          1,172
h=5, t=0.1     62%               63%            163            165
h=5, t=0.2     59%               56%            97             99
h=5, t=0.3     58%               51%            75             79
h=5, t=0.4     57%               49%            63             67
h=5, t=0.5     56%               48%            53             58
h=1, t=0       71%               64%            535            558
h=1, t=0.1     37%               32%            48             48
h=1, t=0.2     31%               24%            25             24
h=1, t=0.3     27%               21%            17             17
h=1, t=0.4     26%               19%            13             14
h=1, t=0.5     26%               18%            11             12

The second experiment evaluates the precision of the system by varying the parameters that control both the size of the SAN and the reasoning process. We computed P@1 to evaluate Ottho in the same task as human players, and P@50 to assess whether a CSL containing a manageable number of potential solutions might be an accurate starting point for more refined strategies.

We vary the firing threshold F between 0.55 and 1 to avoid letting nodes with very low activation values introduce noise into the network and perturb the CSL. To confirm the hypothesis that lowering the firing threshold causes noise in the network, we performed the experiment with F < 0.55 and observed poor results. According to the literature, the decay factor D varies from 0.4 to 0.9.

We compare our results with two baselines, which we computed using the Web search engines Google and Yahoo as knowledge sources. For each engine, we submit one query for each clue in a game; then we include the related search terms at the bottom of the results page in the list of candidate solutions for that game. We consider the search engine to have solved the game whenever the solution occurs in that list. The precision of a search engine's results on a dataset is the ratio between the number of games solved and the total number of games in that dataset; Figure 4 shows the results. Both P@1 and P@50 do not change for all settings of F and D, so long as F < D. The same is observed for F > D. This is why, given the size of the SAN, the graphs plot two values for P@1 and P@50, one for F < D and the other for F > D.

For both datasets, the best results for P@1 are obtained when F < D, whereas the best results for P@50 are obtained when F > D. When F < D, it is simpler to fire nodes, because the threshold F that determines whether a node is fired is low, whereas the decay factor D is high. This means that most of a node's activation value is propagated through the network. When F > D, a node must receive contributions from several other nodes before being activated, because its activation level is high, whereas the low decay factor limits the activation's propagation. A possible explanation for why we observe the best values for P@50 when F > D is that nodes receiving contributions from more than one node are more likely to be game solutions because they are activated by more than one clue. This is in line with the fact that a good candidate solution is the one related to as many clues as possible.

Related Work in Language Games

A notable example of language games is Jeopardy!, an American quiz show demanding knowledge and quick recall and covering different topics such as history, politics, pop culture, and science. Jeopardy! has a unique answer-and-question format in which contestants are presented with clues in the form of answers and must phrase their responses in the form of questions. Jeopardy! poses a grand challenge for a computing system because of the varying subject matter and because the clues involve analyzing the kinds of complexities at which humans excel and computers traditionally do not. IBM created Watson, a question-answering system that successfully challenged human Jeopardy! players.1 The knowledge sources for Watson included both unstructured content (encyclopedias, dictionaries, thesauri, and so on) and semistructured and structured content such as databases, taxonomies, and ontologies (such as DBPedia, WordNet, and Yago).

Another challenging language game is Who Wants to Be a Millionaire? Players must prove their knowledge of popular culture by answering a series of multiple-choice questions. Researchers have shown that a system able to mine answers from the Web plays the game about as well as people do.2

Solving crossword puzzles is another popular language game. The first attempt in the literature to solve them with a computer is Proverb, which exploits libraries of past crossword puzzles.3 WebCrow is the first solver for Italian crosswords.4 WebCrow exploits two main sources: Web documents and, unlike Ottho, previously solved games.

References

1. D. Ferrucci et al., "Building Watson: An Overview of the DeepQA Project," AI Magazine, vol. 31, no. 3, 2010, pp. 59–79.
2. S.K. Lam et al., "1 Billion Pages = 1 Million Dollars? Mining the Web to Play 'Who Wants to Be a Millionaire?'" Proc. 19th Conf. Uncertainty in Artificial Intelligence, Morgan Kaufmann, 2003, pp. 337–345.
3. M.L. Littman, G.A. Keim, and N. Shazeer, "A Probabilistic Approach to Solving Crossword Puzzles," Artificial Intelligence, no. 134, 2002, pp. 23–55.
4. M. Ernandes, G. Angelini, and M. Gori, "WebCrow: A Web-Based System for Crossword Solving," Proc. 20th Nat'l Conf. AI (AAAI 05), AAAI Press/MIT Press, 2005, pp. 1412–1417.

The trend observed for P@1 and P@50 is the same for both datasets, and the results obtained for BOARD_GAME outperform those obtained for TV_GAME, as Figure 4 shows. The Web baselines outperform Ottho for P@1, whereas Ottho outperforms the Web baselines for P@50, and Ottho obtains the best results when h = 5 and t = 0.5.

The results of the two experiments highlight that Ottho's best performance in terms of solvable games (obtained with h = 10, t = 0.1) corresponds to very poor results in terms of P@50 (0.15 for BOARD_GAME, 0.09 for TV_GAME). This means that Ottho can potentially solve a high percentage of games (about 90 percent), but only rarely will the solutions be in the first 50 positions of the CSL. Hence, the best strategy—allowing a trade-off between the percentage of solvable games, the manageability of the SAN, and the likelihood of finding the solutions in CSLs of up to 50 items—corresponds to a configuration of the system with a low number of CUs paired with an aggressive feature selection.

Figure 4. Precision for the (a) BOARD_GAME and (b) TV_GAME datasets. Google and Yahoo baselines are represented by dashed lines. Guillotines in BOARD_GAME are easier than those in TV_GAME.

We should evaluate Ottho's performance by taking into account the difficulty of the game for human players. For this reason, we performed a small experiment involving 10 human players, selected according to the availability sampling strategy (nine of them have MSc degrees). Each player was required to solve 10 guillotines, five selected at random from each of the two datasets. (We intentionally kept the workload for each player low to avoid cognitive overload.) The results provide us with a kind of "human accuracy," measured as the ratio between the number of solved guillotines and the number of attempts. Human accuracy is 0.20 on TV_GAME and 0.34 on BOARD_GAME—higher than P@1 but significantly lower than P@50, corroborating our hypothesis that CSL50 is a good starting point for finding a game's unique solution.

This article discussed a process for knowledge infusion into systems that perform tasks requiring human-level intelligence. We implemented the process for solving a complex language game, but it has great potential for other, more practical applications. For example, Ottho could be adopted as a query expansion mechanism for IR systems. In this scenario, query terms would be the clues provided to Ottho, and k expansion terms would be the keywords in CSLk. The main advantage would be the definition of a conceptual rather than a statistical correlation among expansion terms and query terms.
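A sketch of that query expansion scenario; this is hypothetical glue code, not an existing Ottho API:

```python
# Sketch: expand an IR query with the top-k keywords from the CSL
# Ottho produces when the query terms are supplied as clues.
def expand_query(query_terms, csl, k):
    """Append the top-k CSL keywords that are not already query terms."""
    extra = [w for w in csl[:k] if w not in query_terms]
    return list(query_terms) + extra

# Clues as query terms, the reasoning module's CSL as expansion source.
expanded = expand_query(["newton", "sin"], ["apple", "unit", "sin", "force"], 3)
```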

Another possible application is dealing with the overspecialization of content-based recommender systems, which can only recommend items similar to those already rated by a user. Ottho might implement a strategy for diversifying recommendations: given an item i liked by a user, keywords describing that item (such as terms in the plot of a movie) could be provided to Ottho, and keywords in the resulting CSLk could be used to select new items to recommend. These items would be potentially interesting but likely unexpected, since they would be only indirectly related to i. In the same way, Ottho could be adopted by news filtering systems for discovering related articles.6 We are continuing experiments in these domains to confirm the effectiveness of the approach.

Acknowledgments

We thank www.laghigliottina.it for providing the TV_GAME dataset, and Carlo Strapparava for his valuable comments.

References

1. L. Carr and S. Harnad, "Offloading Cognition onto the Web," IEEE Intelligent Systems, vol. 26, no. 1, 2011, pp. 33–39.
2. M. Spitzer, The Mind within the Net: Models of Learning, Thinking, and Acting, MIT Press, 2000.
3. G. Semeraro et al., "On the Tip of My Thought: Playing the Guillotine Game," Proc. 21st Int'l Joint Conf. Artificial Intelligence, Morgan Kaufmann, 2009, pp. 1543–1548.
4. J.R. Anderson, "A Spreading Activation Theory of Memory," J. Verbal Learning and Verbal Behavior, vol. 22, no. 3, 1983, pp. 261–295.
5. A.M. Collins and E.F. Loftus, "A Spreading Activation Theory of Semantic Processing," Psychological Review, vol. 82, no. 6, 1975, pp. 407–428.
6. X. Wu et al., "News Filtering and Summarization on the Web," IEEE Intelligent Systems, vol. 25, no. 5, 2010, pp. 68–76.

The Authors

Giovanni Semeraro is an associate professor of computer science at the University of Bari Aldo Moro and head of the Semantic Web Access and Personalization Research Group. His research interests include AI; recommender systems; intelligent information mining, retrieval, and filtering; semantic and social computing; machine learning; natural language processing; the Semantic Web; and personalization. Semeraro has an MSc in computer science from the University of Bari. Contact him at [email protected].

Marco de Gemmis is an assistant professor of computer science at the University of Bari Aldo Moro. His research interests include natural-language processing, machine-learning techniques for text categorization, information retrieval, and personalized information filtering. De Gemmis has a PhD in computer science from the University of Bari. Contact him at [email protected].

Pasquale Lops is an assistant professor of computer science at the University of Bari Aldo Moro. His research interests include recommender systems, machine learning, user modeling, and the semantic-adaptive social Web. Lops has a PhD in computer science from the University of Bari. Contact him at [email protected].

Pierpaolo Basile is a research assistant in computer science at the University of Bari Aldo Moro. His research interests include natural-language processing, word sense disambiguation, information retrieval, and personalized information filtering. Basile has a PhD in computer science from the University of Bari. Contact him at [email protected].
