


Articles

10 AI MAGAZINE

Modern educational software is open ended and flexible, allowing students to build scientific models and examine properties of the models by running them and analyzing the results (Amershi and Conati 2006; Cocea, Gutierrez-Santos, and Magoulas 2008). Such exploratory learning environments (ELEs) are distinguished from more traditional e-learning systems in that students can build models from scratch by choosing objects from a depository, modifying the objects, and using these modified objects to construct new objects. They are also becoming increasingly prevalent in developing countries where access to teachers is limited. Such ELEs provide a rich educational experience for students and are generally used in classes too large for teachers to monitor all students and provide assistance when needed. Thus, there is a need to develop techniques that recognize and visualize students' activities in a way that supports students in their work and contributes to

Copyright © 2015, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

Plan Recognition for Exploratory Learning Environments Using

Interleaved Temporal Search

Oriel Uzan, Reuth Dekel, Or Seri, Ya’akov (Kobi) Gal

■ This article presents new algorithms for inferring users' activities in a class of flexible and open-ended educational software called exploratory learning environments (ELEs). Such settings provide a rich educational environment for students, but challenge teachers to keep track of students' progress and to assess their performance. This article presents techniques for recognizing students' activities in ELEs and visualizing these activities to students. It describes a new plan-recognition algorithm that takes into account repetition and interleaving of activities. This algorithm was evaluated empirically using two ELEs for teaching chemistry and statistics used by thousands of students in several countries. It was able to outperform the state-of-the-art plan-recognition algorithms when compared to a gold standard that was obtained by a domain expert. We also show that visualizing students' plans improves their performance on new problems when compared to an alternative visualization that consists of a step-by-step list of actions.



SUMMER 2015 11

their learning by using the software. Such tools can provide support for teachers and education researchers in analyzing and assessing students' use of the software.

Students' interactions with ELEs are complex, as we illustrate using concepts from an ELE for teaching the basics of chemistry. Students can engage in exploratory activities involving trial and error, such as searching for the right pair of chemicals to combine in order to achieve a desired reaction. They can repeat activities indefinitely in pursuit of a goal or subgoal, such as adding varying amounts of an active compound to a solution until a desired molarity is achieved. Finally, students can interleave between activities, such as preparing a solution for a new experiment while waiting for the results of a current experiment. These aspects make plan recognition a challenging task in ELEs.

This article presents a plan-recognition algorithm that works from the bottom up, matching students' actions to a predefined grammar according to heuristics that are informed by the way students use ELEs. The algorithm works offline, decomposing a student's entire interaction with the software into hierarchies of interdependent plans that best describe the student's work. It was evaluated in an extensive empirical study that involved seven different types of problems and 68 instances of students' interactions in two different ELEs for chemistry and statistics education. These ELEs varied widely in the type of interaction they entailed from students. The hierarchy of activities that was outputted by the algorithm was compared to a gold standard that was generated by a domain expert. The algorithm was able to achieve comparable or better recognition performance than the different state-of-the-art algorithms that were designed separately for each of the ELEs. It executed in reasonable time on real-world logs of students' sessions, despite the exponential worst-case complexity of the algorithm.

The second part of this article describes a study that demonstrates the benefit of visualizing plans to students. Students were shown a visualization generated by the ELE of the plan for an expert's solution to a statistics problem. Another group of students was presented with an ordered list of the actions composing the solution to the problem. Such a list represents the sole option currently available for extracting log activities from the software post hoc. Both groups of students were subsequently asked to use the statistics ELE to solve new problems that were gradually more difficult than the example problem and required students to generalize mathematical concepts. Students' performance for the new problems was analyzed using several measures, including the length of interaction time with the ELE, the number of actions performed on the ELE, and the ratio of extraneous actions representing mistakes. The results showed that students who were presented with the plan visualization outperformed those students who were presented with the list of activities for all of these measures.

These contributions demonstrate the benefit of applying novel plan-recognition technologies toward intelligent analysis of students' interactions in open-ended and flexible software. Such technologies can potentially support teachers in their understanding of student behavior as well as students in their problem solving and lead to advances in automatic recognition in other exploratory domains.

Related Work

Early approaches to plan recognition have assumed a goal-oriented agent whose activities were consistent with its knowledge base and that formed a single encompassing plan (Kautz 1987; Lochbaum 1998). We refer the reader to Carberry (2001) for a detailed account of these approaches and focus this section on recent works that capture some of the qualities of exploratory domains, such as extraneous actions or mistakes and interleaving of activities. Avrahami-Zilberbrand and Kaminka (2005) handled temporal and free-order constraints among actions by using tree structures. They provided methods for plan recognition that traverse the tree in a manner that is temporally consistent with the observations while making minimal commitments about matching actions to the grammar. Pynadath and Wellman (2000) developed a probabilistic grammar for modeling agents' plans that also included their beliefs about the environment. These techniques were evaluated using synthetic data and did not allow for interleaving plans (all reordering among plan constituents was restricted to local permutation among constituent actions). Geib and Goldman (2009) presented a probabilistic online plan-recognition algorithm that builds all possible plans incrementally with each new observation. This work maintains all possible explanations for matching future unseen actions by the agent. As we show in the article, naively applying this approach to exploratory domains is unfeasible.

Gal et al. (2012) proposed two algorithms for inferring students' plans using an ELE for statistics education. One of these algorithms used heuristics to match students' actions to the grammar. Another reduced the plan-recognition task to a constraint satisfaction process. This approach assumes a nonrecursive grammar and cannot recognize students' plans in cases where students engage in indefinite repetition, as in physics or chemistry ELEs. Other works (Amir and Gal 2013) allow for recursive grammars in their plan-recognition algorithm for an ELE for chemistry education but make greedy choices about how to match students' actions to the grammar. In contrast, our algorithm makes more informed choices about matching actions to the grammar that are inferred by students' sequential and interleaving



Figure 1. Snapshot of VirtualLabs.

interaction styles. Also, none of these past approaches have been shown to work for more than a single ELE, whereas one of our algorithms is shown to generalize to two ELEs for chemistry and statistics education that are significantly different in their design and interaction style with the student.

Finally, we mention works that use recognition techniques to model students' activities in intelligent tutoring systems (ITSs) (VanLehn et al. 2005; Conati, Gertner, and VanLehn 2002; Roll, Aleven, and Koedinger 2010; Koedinger et al. 1997). Such systems coach students during their problem solving, providing support with proven learning gains. Conati, Gertner, and VanLehn (2002) used online plan-recognition algorithms to infer students' plans to solve a problem in an educational software for teaching physics by comparing their actions to a set of predefined possible plans. The number of possible plans grows exponentially in the types of domains we consider, making it unfeasible to apply this approach.

Actions and Plans

In this section we provide the basic definitions that are required for formalizing the plan-recognition problems in ELEs. Throughout the article we will use an ELE called VirtualLabs to demonstrate our approach. VirtualLabs allows students to design and carry out their own experiments for investigating chemical processes (Yaron et al. 2010) by simulating the conditions and effects that characterize scientific inquiry in the physical laboratory. We use the following problem, called Oracle, as a running example:

Given four substances A, B, C, and D that react in a way that is unknown, design and perform virtual lab experiments to determine the correct reaction between these substances.

The flexibility of VirtualLabs affords two classes of solution strategies to this problem (and many variations within each). The first strategy mixes all four solutions together and infers the reactants by inspecting the resulting solution. The second strategy mixes pairs of solutions until a reaction is obtained. A snapshot of a student's interaction with VirtualLabs when solving the Oracle problem is shown in figure 1.

We make the following definitions: We use the term basic actions to define rudimentary operations that cannot be decomposed. For example, the basic "Mix Solution" action (MS1[s = 1, d = 3]) describes a pour from flask ID 1 to flask ID 3. The output of a student's interaction with an ELE (and the input to the plan-recognition algorithm described in the next section) is a sequence of basic-level actions representing students' activities, also referred to as a log. Complex actions describe higher-level, more abstract activities that can be decomposed into subactions, which can be basic actions or complex actions themselves. For example, the complex action MSD[s = 6 + 8, d = 2] represents separate pours from flasks ID 6 and 8 to flask ID 2.
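To make these definitions concrete, a log can be represented as a plain sequence of basic actions. The sketch below is our own illustration (the class and field names are assumptions, not the paper's implementation); it encodes the seven mix actions that appear as the leaves of figure 3:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BasicAction:
    name: str   # action type, e.g. "MS" for the basic Mix Solution action
    s: int      # source flask ID
    d: int      # destination flask ID
    index: int  # order of appearance in the log

# The student's log used in the running example (leaves of figure 3).
log = [
    BasicAction("MS", 1, 3, 1), BasicAction("MS", 6, 2, 2),
    BasicAction("MS", 5, 4, 3), BasicAction("MS", 8, 2, 4),
    BasicAction("MS", 4, 3, 5), BasicAction("MS", 1, 3, 6),
    BasicAction("MS", 2, 7, 7),
]
```

A recognizer then consumes such a sequence and groups the actions under complex actions, as described next.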

A recipe for a complex action specifies the sequence of actions required for fulfilling the complex action. Figure 2 presents a set of basic recipes for VirtualLabs. In our notation, complex actions are underlined, while basic actions are not.

Recipe a in the figure, called mix to same destination (MSD), represents the activity of pouring from two source flasks (s1 and s2) to the same destination flask d. Recipe b, called mix through intermediate flask (MIF), represents the activity of pouring from one source flask (s1) to a destination flask (d2) through an intermediate flask (d1). Recipes can be recursive, capturing activities that students can repeat indefinitely. For example, the constituent actions of the complex action MSD in recipe a decompose into two separate MSD actions. In turn, each of these actions can itself represent a mix to same-destination action, an intermediate-flask pour (by applying recipe c), or a basic mix action, which is the base-case recipe for the recursion (recipe d). Recipe parameters also specify the type and volume of the chemicals in the mix, as well as temporal constraints between constituents, which we omit for brevity. More generally, the four basic recipes in the figure can be permuted to create new recipes, by replacing MSD on the right side of the first two recipes with MIF or MS. An example of a derivation is the following recipe for creating an intermediate flask out of a complex mix to same destination action and a basic mix solution action.

MIF[s1, d2] → MSD[s1, d1], MS[d1, d2] (1)
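To see the grammar mechanically, the four recipes of figure 2 can be read as context-free productions over action types, dropping parameters and temporal constraints. The following sketch is our own illustrative encoding, not the paper's implementation:

```python
# Recipes from figure 2, parameters omitted: the left side derives the right side.
RULES = [
    ("MSD", ["MSD", "MSD"]),  # (a) mix to same destination
    ("MIF", ["MSD", "MSD"]),  # (b) mix through intermediate flask
    ("MSD", ["MIF"]),         # (c)
    ("MSD", ["MS"]),          # (d) base case: a basic mix
]

def can_derive(symbol, tokens):
    """Return True if `symbol` can derive the given sequence of action types."""
    if tokens == [symbol]:  # an action type derives itself
        return True
    for lhs, rhs in RULES:
        if lhs != symbol:
            continue
        if len(rhs) == 1 and can_derive(rhs[0], tokens):
            return True
        if len(rhs) == 2:
            # try every split point between the two constituents
            for k in range(1, len(tokens)):
                if can_derive(rhs[0], tokens[:k]) and can_derive(rhs[1], tokens[k:]):
                    return True
    return False
```

Under these rules a single basic mix already fulfills an MSD (recipe d), and any longer run of mixes into one flask can be derived by recursive applications of recipe a, which is exactly the indefinite repetition discussed above.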

Planning is defined as the process by which students use recipes to compose basic and complex actions toward completing tasks using the ELE. Formally, a plan is a tree of basic and complex actions, such that each complex action is decomposed into subactions that fulfill a recipe for some task. A set of nodes N in a plan is said to fulfill a recipe RC if there exists a one-to-one matching between the constituent actions in RC and their parameters to

nodes in N. For example, the nodes MSD[s = 6 + 8, d = 2] and MS7[s = 2, d = 7] fulfill the mixing through an intermediate flask recipe shown in equation 1. Each time a recipe for a complex action is fulfilled in a plan, there is an edge from the complex action to its subactions, representing the recipe constituents.

Figure 3 shows part of a plan describing part of a student's interaction when solving the Oracle problem. The leaves of the trees are the actions from the student's log and are labeled by their order of appearance in the log. A node representing a complex action is labeled by a pair of indices indicating its earliest and latest constituent actions in the plan. For example, the node labeled with the complex action MSD[s = 1 + 5, d = 3] includes the activities for pouring two solutions from flasks ID 1 and ID 5 to flask ID 3. The pour from flask ID 5 to 3 is an intermediate flask pour (MIF[s = 5, d = 3]) from flask ID 5 to ID 3 through flask ID 4.

In a plan, the constituent subactions of complex actions may interleave with other actions. In this way, the plan captures the exploratory nature of students' learning strategies. Formally, we say that two ordered complex actions interleave if at least one of the subactions of the first action occurs after some subaction of the second action. For example, in figure 3 the complex actions MSD[s = 6 + 8, d = 2] and MIF[s = 5, d = 3] interleave.
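This ordering condition is easy to state over the log indices of each action's constituents; a minimal sketch (the function and its input convention are our own, not the paper's):

```python
def interleaves(first, second):
    """first/second: sets of log indices of the subactions of two complex
    actions, where `first` is the one whose earliest subaction comes first.
    They interleave if some subaction of `first` occurs after some
    subaction of `second`."""
    return max(first) > min(second)

# Figure 3: MSD[s=6+8,d=2] has subactions at log indices {2, 4};
# MIF[s=5,d=3] has subactions at {3, 5}. MS4 (index 4) falls between
# MS3 (index 3) and MS5 (index 5), so the two complex actions interleave.
```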

Plan Recognition

The plan-recognition problem in ELEs is defined as follows: Given a set of temporally ordered actions representing a student's complete interaction sequence with the software, and a set of recipes, output the plan that correctly describes the student's interaction with the software. By "correct," we mean that the plan outputted by the algorithm completely agrees with the plan outputted by a domain expert who was given the same set of inputs. We expand on

Figure 2. Recipes in VirtualLabs for Mix to Same Destination (MSD) and Mix to Intermediate Flask (MIF) Actions.

(a) MSD[s1 + s2, d] → MSD[s1, d], MSD[s2, d]
(b) MIF[s1, d2] → MSD[s1, d1], MSD[d1, d2]
(c) MSD[s, d] → MIF[s, d]
(d) MSD[s, d] → MS[s, d]


this further in the Empirical Methodology section. This problem has been shown to be NP-hard (Gal et al. 2012).

The algorithm we designed to solve the offline plan-recognition problem, called plan recognition through interleaved sequential matching (PRISM), provides a trade-off between the following two complementary aspects of students' interactions with ELEs.

First, students generally solve problems in a sequential fashion, by which we mean that actions that are (temporally) closer to each other are more likely to relate to the same subgoal. For example, in figure 3 the basic actions {MS1, MS5} are more likely to fulfill the recipe MSD[s1 + s2, d] → MS[s1, d], MS[s2, d] than the basic actions {MS1, MS6} because they appear closer together in the log. Second, students may interleave between activities relating to different subgoals. For example, in figure 3, the node MS4[s = 8, d = 2] (which is a constituent action of the complex action MSD[s = 6 + 8, d = 2]) occurs in between the nodes representing the constituent actions of the complex action MIF[s = 5, d = 3].

Before presenting the algorithm, we first make the following definitions. Let P denote the current plan at an intermediate step of the algorithm. The frontier of P is all nodes in the plan that do not have parents. For example, the frontier of the plan shown in figure 3 includes the nodes MS6 and MS7.

Let RC denote a recipe for a complex action C. We say that a set of nodes N is frontier compatible with a recipe RC if N is a subset of the frontier of P and N fulfills RC in P. Intuitively, the nodes in a frontier-compatible set of RC may be used to fulfill the recipe and add C to the plan. In figure 3, the set of nodes F = {MSD[s = 6 + 8, d = 2], MS7[s = 2, d = 7]} is frontier compatible with the mix intermediate flask recipe of equation 1.
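The frontier is just the set of parentless nodes, so it can be maintained with a parent map; a sketch under our own naming, using the mid-recognition state of figure 3:

```python
def frontier(nodes, parent):
    """Nodes of the current plan P that have no parent yet."""
    return [n for n in nodes if parent.get(n) is None]

# Figure 3 mid-recognition: MS1..MS5 are already covered by complex
# actions, so the parentless nodes are the two top-level complex
# actions plus the leaves MS6 and MS7.
nodes = ["MSD[1+5,3]", "MSD[6+8,2]", "MIF[5,3]",
         "MS1", "MS2", "MS3", "MS4", "MS5", "MS6", "MS7"]
parent = {"MS1": "MSD[1+5,3]", "MS2": "MSD[6+8,2]", "MS3": "MIF[5,3]",
          "MS4": "MSD[6+8,2]", "MS5": "MIF[5,3]",
          "MIF[5,3]": "MSD[1+5,3]"}
f = frontier(nodes, parent)
```

Consistent with the text, MS6 and MS7 are on the frontier, while the covered leaves MS1 through MS5 are not.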

A match for recipe RC in P is a tuple MC = (C, N, i, j) such that RC is a recipe for completing action C, N is a set of nodes that is frontier compatible with RC in P, and i, j are delimiters specifying the indices corresponding to the earliest and latest actions in N. For example, the tuple (MIF[s = 6 + 8, d = 7], F, 2, 7) is a match for the recipe of equation 1. For brevity, we omit the frontier-compatible set when referring to matches and write MC = (MIF[s = 6 + 8, d = 7], 2, 7).

We define a function FINDMATCH(RC, P, D) that returns the set of all matches for RC in P, where D is in the frontier-compatible set of RC. For example, consider the call

FINDMATCH(RC, P, MSD[s = 6 + 8, d = 2])

where RC is the intermediate flask recipe of equation 1, and P is the plan from figure 3. This call will result in a set containing the single match that was presented earlier: (MIF[s = 6 + 8, d = 7], F, 2, 7).

The main functions composing the algorithm are shown in figure 4. The algorithm uses a global queue for storing potential matches for updating the student's plan. The queue is sorted lexicographically by the first and second indices in the delimiters of the matches. The core of the algorithm is the CONSIDERMATCH(MC) function. This function begins by popping from the queue the first match with indices between i, j (line 3). The function then recursively updates the plan with all of the matches in P whose delimiters lie between i and j (line 5).
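The lexicographic queue ordering can be realized directly with a heap keyed on the delimiter pair; a minimal sketch using Python's heapq (match labels abbreviate the article's notation):

```python
import heapq

# Push a few of the example matches keyed by their (i, j) delimiters.
queue = []
for name, i, j in [("MSD[s=1+1,d=3]", 1, 6), ("MSD[s=6+8,d=2]", 2, 4),
                   ("MSD[s=1+4,d=3]", 1, 5), ("MIF[s=5,d=3]", 3, 5)]:
    heapq.heappush(queue, ((i, j), name))

# Matches pop sorted first by i, then by j: (1,5) < (1,6) < (2,4) < (3,5).
order = []
while queue:
    order.append(heapq.heappop(queue)[1])
```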

After all inner matches are exhausted, the algorithm checks whether MC itself can be added to the plan (line 8), which is possible if the set of nodes N

Figure 3. A Partial Plan for a Student’s Log.

The leaves of the plan refer to the basic actions in the student's plan (numbered in order of appearance in the logs); other nodes in the plan refer to complex activities performed by the student.

[Figure 3 content: complex nodes MSD[s=1+5,d=3] (1,5), MSD[s=6+8,d=2] (2,4), and MIF[s=5,d=3] (3,5), over the leaf actions MS1[s=1,d=3], MS2[s=6,d=2], MS3[s=5,d=4], MS4[s=8,d=2], MS5[s=4,d=3], MS6[s=1,d=3], MS7[s=2,d=7].]


is still frontier compatible with RC. If MC can be added to the plan, the function ADDMATCHTOPLAN(MC) is called, which adds (1) a node to the plan with label C and delimiters i and j, and (2) edges from C to all of the nodes in N as they appear in the plan. At this point, all of the matches in the queue involving nodes represented in N are obsolete because C was added to the plan, so they are no longer in the frontier. Therefore we remove them from the queue (line 18). The final step is a call to the function EXTENDMATCH(MC) to update the queue with new matches in which C is a constituent.
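The removal step of line 18 amounts to filtering the queue by the nodes that just gained a parent; a sketch under our own match representation (the names are ours, not the paper's):

```python
from collections import namedtuple

# A queued match: complex action label, frontier-compatible node set, delimiters.
Match = namedtuple("Match", ["action", "nodes", "i", "j"])

def remove_obsolete(queue, covered):
    """Drop matches whose frontier-compatible set touches nodes that just
    gained a parent: those nodes left the frontier, so the match is dead."""
    return [m for m in queue if not (set(m.nodes) & covered)]

queue = [Match("MIF[s=6,d=7]", {"MS2", "MS7"}, 2, 7),
         Match("MIF[s=8,d=7]", {"MS4", "MS7"}, 4, 7),
         Match("MIF[s=5,d=3]", {"MS3", "MS5"}, 3, 5)]
# After adding MSD[s=6+8,d=2], its constituents MS2 and MS4 are covered,
# so both MIF matches that rely on them are removed.
queue = remove_obsolete(queue, {"MS2", "MS4"})
```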

To demonstrate the algorithm, suppose that the plan is initialized to include only the leaves shown in figure 3, corresponding to the student's log. The initial queue will contain the following matches, sorted by their delimiters from left to right as described earlier:

(MSD[s = 1 + 4, d = 3], 1, 5), (MSD[s = 1 + 1, d = 3], 1, 6),
(MSD[s = 6 + 8, d = 2], 2, 4), (MIF[s = 6, d = 7], 2, 7),
(MIF[s = 5, d = 3], 3, 5), (MIF[s = 8, d = 7], 4, 7),
(MSD[s = 1 + 4, d = 3], 5, 6)

The first match to be popped out from the queue in CONSIDERMATCH(*, (1, 7)) is (MSD[s = 1 + 4, d = 3], 1, 5).

The function performs a recursive call (line 5) to update the plan with matches involving actions occurring within the delimiters of 1 and 5. The next match to be popped off the queue is (MSD[s = 6 + 8, d = 2], 2, 4). The frontier-compatible set of this match is {MS2[s = 6, d = 2], MS4[s = 8, d = 2]}.

At this point, the queue subset is empty because there are no matches between the delimiters 2, 4, so the function skips to line 8. Because the nodes (MS2, MS4) in the frontier-compatible set of the match have no parents, we can add the MSD[s = 6 + 8, d = 2] complex action to the plan, by calling ADDMATCHTOPLAN(MSD[s = 6 + 8, d = 2], 2, 4). This function also removes the following matches, which involve MS2 or MS4 in their frontier-compatible set, from the queue (line 18):

(MIF[s = 6, d = 7], 2, 7), (MIF[s = 8, d = 7], 4, 7), (MSD[s = 6 + 8, d = 2], 2, 4)

Finally, in line 10, the function EXTENDMATCH(MSD[s = 6 + 8, d = 2], 2, 4) is called to find all matches that involve the action MSD[s = 6 + 8, d = 2] in their frontier-compatible set and add them to the queue (line 27). The only such match is (MIF[s = 6 + 8, d = 7], 2, 7).

Empirical Methodology

We evaluated the algorithm on real data consisting of students' interactions. To demonstrate the scalability of the PRISM algorithm we evaluated it on two different ELEs: the VirtualLabs system as well as an ELE for teaching statistics and probability called TinkerPlots (Konold and Miller 2004) that is used worldwide in elementary schools and colleges. In TinkerPlots, students build models of stochastic events, run the models to generate data, and analyze the results. It is an extremely flexible application, allowing for data to be modeled, generated, and analyzed in many ways using an open-ended interface. There are two key differences between TinkerPlots and VirtualLabs. First, in TinkerPlots, recipes are question dependent and include ideal solutions to specific problems. Second, the TinkerPlots grammar is not recursive.

For VirtualLabs, we used four problems intended to teach different types of experimental and analytic techniques in chemistry, taken from the curriculum of introductory chemistry courses using VirtualLabs in the United States. One of these was the


Figure 4. Main Functions of the PRISM Algorithm.


Oracle problem that was described earlier. Another, called coffee, required students to add the right amount of milk to cool a cup of coffee down to a desired temperature. The third problem, called unknown acid, required students to determine the concentration level and Ka level of an unknown solution. The fourth problem, called dilution, required students to create a solution of a base compound with a specific desired volume and concentration. For TinkerPlots, we used two problems for teaching probability to students in grades 8 through 12. The first problem, called ROSA, required students to build a model that samples the letters A, O, R, and S and to compute the probability of generating the name ROSA using the model. The second problem, called rain, required students to build a model for the weather on a given day and compute the probability that it will rain on each of the next four consecutive days. In contrast to VirtualLabs, recipes in TinkerPlots are question dependent and describe possible solution strategies for solving each problem.

We compared PRISM to the best algorithms from the literature for each ELE: (1) the algorithm of Gal et al. (2012) for TinkerPlots, which reduces the plan-recognition problem to a CSP and is complete, and (2) the algorithm of Amir and Gal (2013) for VirtualLabs, which uses heuristics to match recipes to actions in the log and is incomplete. For each problem instance, a domain expert was given the plans outputted by PRISM and the other algorithm, as well as the student's log. We consider the inferred plan to be "correct" if the domain expert agrees with the complex and basic actions at each level of the plan hierarchy that is outputted by the algorithm. The outputted plan represents the student's solution process using the software.

To illustrate how plans were presented to domain experts, figure 5 shows the visualization of the final plan for the log actions in the leaves of figure 3. The visualization groups all trees in the student's plans as children to a single root node "Solve Oracle problem." Complex nodes are labeled with information about the chemical reactions that occurred during the activities described by the nodes.

For example, the node labeled A + B + C + D → A + B + D represents an activity of mixing four solutions together, which resulted in a chemical reaction that consumed substance C. The coloring of the labels indicates the type of chemical reaction that has occurred.

Table 1 summarizes the performance of the PRISM algorithm according to accuracy and run time of the algorithm (in seconds on a commodity Core i7 computer). The column SoA (state of the art) refers to the appropriate algorithm from the literature for each problem. For VirtualLabs, this algorithm was suggested by Amir and Gal (2013); for TinkerPlots, this algorithm was suggested by Gal et al. (2012). All of the reported results were averaged over the different instances in each problem. As shown in the table, for VirtualLabs, the PRISM algorithm was able to recognize significantly more plans than did the state of the art (p = 0.001, using a proportion-based Z test). The state-of-the-art algorithm was not able to recognize 10 plan instances (3 for Oracle; 3 for unknown acid; 3 for coffee; and 1 for dilution). In addition, the PRISM algorithm was able to recognize all of the plans in VirtualLabs that were correctly recognized by the state of the art. In TinkerPlots, PRISM failed to recognize 2 plans (1 for ROSA; 1 for rain). The instances that the algorithms failed to recognize are false negatives that represent bad matches in the plan-recognition process. However, the fact that PRISM achieved comparable or better performance than both state-of-the-art algorithms speaks well for its performance.

Figure 5. Visualization of Oracle Plan.

[Figure 5 content: a tree rooted at "Solve Problem" whose complex nodes carry reaction labels such as {A+B+C+D} → {A+B+D} and {B+C+D} → {A+B+D}, over nodes for the mixed substances A+B+C, D, C+D, B, A+B, C, D, C, A, B.]

We conclude this section by discussing the limitations of PRISM. First, PRISM is not a complete plan-recognition algorithm, as attested by the fact that it failed to recognize 2 out of 64 instances in TinkerPlots. Second, it was significantly slower than the state-of-the-art approaches. This is not surprising given its worst-case complexity. However, PRISM is designed to run offline after the completion of the student's interaction. Therefore an average run time of 30 seconds is a low price to pay given the significant increase in performance and its ability to generalize across different ELEs.

Visualizing Plans to Students

In this section we demonstrate the benefits of plan-recognition technology in education, in that visualizing expert solutions to students improves their performance on new problems. Presenting such "worked examples" to students has already been shown to provide effective instructional strategies for teaching complex problem-solving skills (Van Merriënboer 1997) and is widely accepted as an effective learning technique (Catrambone 1994). Our hypothesis was that showing plans of sample problems in TinkerPlots will improve students' performance on new problems when compared to a default presentation method that consisted of an ordered list of their activities. Such a list is the sole visualization technique currently available for TinkerPlots. Specifically, we expected students who were shown plans to be able to solve new problems that required generalization of mathematical concepts more quickly, with fewer mistakes, and committing fewer redundant actions when compared to students who were shown the list.

We used the following problem, called COIN, to visualize worked examples to students:

A fair coin with a side of "0" and a side of "1" is tossed three times. What is the average expected sum of the tosses?

The all-purpose sampling object in TinkerPlots is called a sampler. A sampler is an object into which the user can place random devices, including spinners and mixers, in order to create a stochastic model of the world. Examples of basic actions in TinkerPlots include adding or running a sampler. Complex actions in TinkerPlots represent higher-level activities such as flipping a coin or solving the COIN problem.

Figure 6 shows a snapshot of the TinkerPlots desktop when solving the COIN problem. The sampler mechanism shown on the left contains a coin with the elements 0 and 1. The parameter "number of draws" in the sampler is set to 3 to represent the three tosses. The parameter "number of repetitions" in the sampler is set to 2,000 so that the number of samples will produce a representative sample. When a sampler is run, it generates data according to the distribution defined by the parameters of its model. The results of this sample are shown on the right. Also shown is the formula window, which is used to compute the sum of the three tosses for each instance.
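The stochastic model that the sampler encodes can be reproduced outside TinkerPlots. The following sketch is ours, not part of TinkerPlots or the study materials; the function name and defaults are illustrative. It simulates a fair 0/1 coin drawn three times per repetition and estimates the average sum, which analytically is 3 x 0.5 = 1.5.

```python
import random

def run_sampler(draws=3, repetitions=2000, seed=0):
    """Simulate a TinkerPlots-style sampler: a fair 0/1 coin drawn
    `draws` times per repetition; return the per-repetition sums."""
    rng = random.Random(seed)
    return [sum(rng.randint(0, 1) for _ in range(draws))
            for _ in range(repetitions)]

sums = run_sampler()
average_sum = sum(sums) / len(sums)
# Analytically, E[sum of 3 fair 0/1 tosses] = 3 * 0.5 = 1.5;
# the sample average should lie close to that value.
```

With 2,000 repetitions the standard error of the average is about 0.02, which is why a sample of this size "will produce a representative sample" in the sense the text intends.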

Table 1. Results of the PRISM Algorithm.

                Number of   PRISM      PRISM         SoA        SoA
                Instances   Accuracy   Run Time (s)  Accuracy   Run Time (s)
Virtual Labs
  Camping            2       100%        1.107        100%       0.548
  Coffee             9       100%       21.075         67%       2.781
  Dilution           4       100%        1.720         75%       0.437
  Oracle             6       100%        9.675         50%       2.689
  Unknown Acid       7       100%       53.561         57%       4.441
TinkerPlots
  Earrings          10        90%       27.514        100%       1.004
  Rosa              25       100%        4.430        100%       0.049
  Rain              18       100%        6.334        100%      38.576
  Seatbelts         11      90.9%        6.064        100%       1.121

Using the PRISM algorithm, we created a visualization of a worked example of a solution to the COIN problem. Figure 7 shows part of this visualization as a tree of basic actions (leaf nodes) and complex actions (parent nodes). The root of the plan (Correct_Solution) decomposes into three constituent complex actions for creating the coin model (Create_Coin_Model), running the model (Generate_Results), and computing the average sum of tosses (Compute_Average_Sum). In turn, the complex action Create_Coin_Model decomposes into the basic action of adding a sampler (add_sampler), the complex action of creating a coin (Create_Coin), and the basic actions of setting the number of times the coin should be tossed (set_draws_in_sampler) and the number of times the experiment should be repeated (set_repetitions). The leaves of the tree correspond to the students' action sequences recorded by TinkerPlots. Actions appear in left-to-right order in a way that is consistent with the temporal order of students' activities. For expository convenience, only part of the plan is shown in the figure (for example, the Compute_Sum complex action is not presented). The plan visualization was introduced to the students using an interactive tool that allowed them to drill down to reveal the constituent actions of each node in the tree.
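The tree structure described above can be captured with a small recursive data type. The sketch below is an illustration of ours, not the article's implementation; PlanNode and render are hypothetical names. Complex actions are internal nodes, basic actions are leaves, children are kept in temporal order, and the indented outline mimics the drill-down view shown to students.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PlanNode:
    """A node in a plan tree: complex actions have children,
    basic actions are leaves. Children are in temporal order."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)

    def render(self, depth=0):
        """Return an indented outline of the subtree, one node per line."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines

# Part of the COIN plan, with node names as they appear in figure 7.
plan = PlanNode("Correct_Solution", [
    PlanNode("Create_Coin_Model", [
        PlanNode("add_sampler"),
        PlanNode("Create_Coin", [
            PlanNode("add_device_to_sampler"),
            PlanNode("Create_Coin_Side", [PlanNode("change_element_in_device")]),
            PlanNode("Create_Coin_Side", [PlanNode("change_element_in_device")]),
        ]),
        PlanNode("set_draws_in_sampler"),
        PlanNode("set_repetitions"),
    ]),
    PlanNode("Generate_Results"),
    PlanNode("Compute_Average_Sum"),
])
print("\n".join(plan.render()))
```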

The list visualization presented students with a bulleted list of the basic actions performed to solve the example problem. This visualization is obtained from a linear sequence of temporally ordered actions. It represents the default support that is currently available to students using the software. A partial list of these actions is shown in figure 8.


Figure 6. Solving the COIN Problem Using TinkerPlots.

Figure 7. Tree Visualization of the Expert Solution to the COIN Problem.

Participants

The study involved 61 first-year undergraduate students enrolled in a statistics and probability course for engineering majors at Ben-Gurion University. The study was carried out during the middle of the semester, after the students had acquired basic knowledge of probability theory and taken a midterm. All students were given a home exercise to familiarize themselves with the TinkerPlots software. In total, 32 students were presented with a plan solution, and 29 were presented with a list solution.

The study was conducted in a designated lab in which each student was situated in front of a computer. In the first part of the study, all students were provided with the COIN problem description in writing, and the expert solution to the problem was subsequently presented to them using the list or plan visualization, depending on their assigned condition. In addition, all students were shown an (identical) snapshot of the TinkerPlots desktop following the solution procedure. The students were asked several comprehension questions about the solution: whether (and why) the interaction shown to them constitutes a correct solution to the COIN problem; to indicate the role in the solution of one of the actions in the interaction; and to explain the solution to a friend using free text. These questions served two purposes: first, to confirm that the students comprehended the solution; and second, to compare students' self-explanations of the solution to the COIN problem across the two experimental conditions. Students were allocated up to 20 minutes to complete this part of the study.

In the second part of the study, students were asked to solve two new problems in sequence using TinkerPlots. Students were allocated up to 30 minutes to complete this part of the study. The problems were taken from the curriculum of an introductory course in probability. We wanted the problems to be nontrivial but still possible for the majority of students to solve in the allocated 30 minutes. We chose problems whose solutions exhibited similar concepts (reasoning about combinations and events in sample space). The test problem, called DICE, did not explicitly mention the notion of expectation, and it required students to reason about the disjunction of complex events that relate to the sample space.

John and Mary compete in a dice-tossing game. They take turns tossing a die and sum the results of their tosses. The winner is the first to accumulate more than 10 points. Compute the probability that (1) John will win after two rounds of the game; (2) either John or Mary will win after two rounds of the game.
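The DICE answers can be checked by exhaustive enumeration. The sketch below is ours, not part of the study materials, and it assumes one natural reading of the problem: each player tosses one die per round, a player wins after two rounds if his or her own two tosses sum to more than 10, and no one can win after round one because a single toss never exceeds 10.

```python
from itertools import product

# All equally likely outcomes of two rounds: (j1, m1, j2, m2),
# where j* are John's tosses and m* are Mary's tosses.
outcomes = list(product(range(1, 7), repeat=4))

john_wins = sum(1 for j1, m1, j2, m2 in outcomes if j1 + j2 > 10)
either_wins = sum(1 for j1, m1, j2, m2 in outcomes
                  if j1 + j2 > 10 or m1 + m2 > 10)

p_john = john_wins / len(outcomes)      # 108/1296 = 1/12
p_either = either_wins / len(outcomes)  # 207/1296 = 23/144
```

Under this reading, P(John wins) = 1/12 and, since the two players' totals are independent, P(either wins) = 1 - (11/12)^2 = 23/144.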

We collected the TinkerPlots log from each student's interaction as well as a snapshot of their desktop recording during the activity.

Results

We hypothesized that students assigned to the plan condition would exhibit better performance than students assigned to the list condition when using TinkerPlots to solve the test problems. We measured student performance using the following metrics: the length of interaction (in minutes); the total number of actions in an interaction; and the ratio of redundant actions in an interaction, which comprise mistakes and exogenous actions that do not play a part in the student's solution.

We provide a description of students' performance for the DICE problem. All of the results we report below were statistically significant (p = 0.04) using a nonparametric two-tailed Mann-Whitney test. We found that the plan visualization significantly improved students' performance across all measures. These results are summarized in table 2.
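For readers unfamiliar with the test, the Mann-Whitney U statistic compares two independent samples without assuming normality. The sketch below is a minimal self-contained implementation using the normal approximation (reasonable for samples of about 30, as in this study); the data are synthetic interaction times for illustration only, NOT the study's raw measurements.

```python
import math

def mann_whitney_u(xs, ys):
    """Two-tailed Mann-Whitney U test, normal approximation.
    Returns (U, p). U counts pairs (x, y) with x > y, ties as 0.5."""
    u = sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)
    n1, n2 = len(xs), len(ys)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    # Phi(|z|) via the error function, then two-tailed p-value.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return u, p

# Synthetic interaction times (minutes), illustration only.
plan_times = [6.1, 7.5, 8.0, 8.8, 9.3, 10.2, 10.9, 11.5]
list_times = [9.8, 11.2, 12.0, 12.9, 13.6, 14.4, 15.1, 16.3]
u, p = mann_whitney_u(plan_times, list_times)
```

In practice one would use a library routine (for example, scipy.stats.mannwhitneyu), which also applies continuity and tie corrections.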

Specifically, the average interaction length of the students in the plan condition (AVG = 9.01 minutes, SD = 3.92) was significantly shorter than the average interaction length for students in the list condition (AVG = 12.47 minutes, SD = 6.6). Additional analysis reveals that the total number of actions in a student's interaction in the plan condition (AVG = 39.65, SD = 11.79) was significantly lower than the total number of actions in the list condition (AVG = 57.069, SD = 32.628). The ratio of redundant actions in the plan condition (28 percent) was significantly lower than the ratio of redundant actions for students in the list condition (46 percent). On the second test problem, although the total number of actions in the plan condition (AVG = 47, SD = 12.01) was slightly higher than in the list condition (AVG = 43.66, SD = 18.11), the ratio of redundant actions in the plan condition (33 percent) was still lower than the ratio of redundant actions in the list condition (43 percent). This indicates that visualizing the hierarchical aspect of the example facilitates students' ability to generalize mathematical and structural concepts across new problems (in our case, the use of expectation to reason about events in the sample space).

Figure 8. List Visualization of the Expert Solution to the COIN Problem.

- Add device 1 to sampler 1
- Change the label of element 0 in device 1 in sampler 1 to "0"
- Change the label of element 1 in device 1 in sampler 1 to "1"
- Set the repetitions in sampler 1 to "1500"
- Set the draws in sampler 1 to "3"
- Run sampler 1 and generate columns Draw1, Draw2, and Draw3
- Add an attribute named "Sum" to table 2
- Edit the formula of attribute "Sum" in table 2 to Draw1+Draw2+Draw3
- Add plot 3
- Drag the attribute "Sum" to plot 3
- Divide the values in plot 3
- Average on plot 3 of attribute "Sum"

Table 2. Performance Measures on the DICE Problem for Students in the Plan and List Conditions.

        Time (Minutes)   Number of Actions   Redundancy
Plan         9.01              39.65            28%
List        12.47              57.06            46%

Finally, there were striking differences in the way students explained the solution to the COIN problem based on their respective visualization condition. Overall, 75 percent of the students in the plan condition used and referred to subgoals when describing the solution to the COIN problem, as compared to 66 percent of the students in the list condition. In our study, subgoals represent higher-level activities such as generating and running a sampler, projecting the results to a plot, and computing the average sum of a random variable. These activities recur in all three of the problems in the study, and recognizing and internalizing these concepts may have contributed to the success of the plan visualization. The students in the list condition were far less likely to use such concepts when describing the problem.

Conclusion and Future Work

This article proposed new algorithms for recognizing students' plans in exploratory learning environments, which are open-ended and flexible educational software. Our algorithm is shown to outperform (or perform comparably with) state-of-the-art plan-recognition algorithms on two different ELEs for teaching chemistry and statistics. It is also the first recognition algorithm that is shown to generalize successfully to several ELEs. We also demonstrated that using hierarchical visualizations of expert solutions positively affects students' problem solving in ELEs. We are currently extending these results in several directions. First, we are designing plan-recognition algorithms for ELEs that do not depend on a predefined grammar. Second, we are designing intelligent tutors in ELEs that use plan recognition to guide their interactions with students. Finally, we are designing automatic methods for aggregate analysis of students' activities based on students' plans.

Acknowledgements

This research is supported in part by EU grant FP7-ICT-2011-9 no. 600854 and by Israel Science Foundation grant no. 1276/12. Reuth Dekel was supported in part by a grant from the Israeli Chief Scientist for advancing the role of women in science.

References

Amershi, S., and Conati, C. 2006. Automatic Recognition of Learner Groups in Exploratory Learning Environments. In Intelligent Tutoring Systems, Lecture Notes in Computer Science, Volume 4053, 463-471. Berlin: Springer. dx.doi.org/10.1007/11774303_46

Amir, O., and Gal, Y. 2013. Plan Recognition and Visualization in Exploratory Learning Environments. ACM Transactions on Interactive Intelligent Systems 3(3): Article 20.

Avrahami-Zilberbrand, D., and Kaminka, G. 2005. Fast and Complete Symbolic Plan Recognition. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI). Amsterdam: Morgan Kaufmann.

Carberry, S. 2001. Techniques for Plan Recognition. User Modeling and User-Adapted Interaction 11(1-2): 31-48. dx.doi.org/10.1023/A:1011118925938

Catrambone, R. 1994. Improving Examples to Improve Transfer to Novel Problems. Memory and Cognition 22(5): 606-615. dx.doi.org/10.3758/BF03198399

Cocea, M.; Gutierrez-Santos, S.; and Magoulas, G. 2008. The Challenge of Intelligent Support in Exploratory Learning Environments: A Study of the Scenarios. In Proceedings of the First International Workshop in Intelligent Support for Exploratory Environments, CEUR Workshop Proceedings Volume 381. Aachen, Germany: RWTH Aachen University.

Conati, C.; Gertner, A.; and VanLehn, K. 2002. Using Bayesian Networks to Manage Uncertainty in Student Modeling. User Modeling and User-Adapted Interaction 12(4): 371-417. dx.doi.org/10.1023/A:1021258506583

Gal, Y.; Reddy, S.; Shieber, S.; Rubin, A.; and Grosz, B. 2012. Plan Recognition in Exploratory Domains. Artificial Intelligence 176(1): 2270-2290. dx.doi.org/10.1016/j.artint.2011.09.002

Geib, C., and Goldman, R. 2009. A Probabilistic Plan Recognition Algorithm Based on Plan Tree Grammars. Artificial Intelligence 173(11): 1101-1132. dx.doi.org/10.1016/j.artint.2009.01.003

Kautz, H. A. 1987. A Formal Theory of Plan Recognition. Ph.D. Dissertation, University of Rochester Department of Computer Science, Rochester, NY.

Koedinger, K. R.; Anderson, J. R.; Hadley, W. H.; and Mark, M. A. 1997. Intelligent Tutoring Goes to School in the Big City. International Journal of Artificial Intelligence in Education 8(1): 30-43.

Konold, C., and Miller, C. 2004. TinkerPlots: Dynamic Data Exploration 1.0. Sheldon, WI: Key Curriculum Press.

Lochbaum, K. E. 1998. A Collaborative Planning Model of Intentional Structure. Computational Linguistics 24(4): 525-572.

Pynadath, D., and Wellman, M. 2000. Probabilistic State-Dependent Grammars for Plan Recognition. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, 507-514. San Francisco: Morgan Kaufmann.

Roll, I.; Aleven, V.; and Koedinger, K. R. 2010. The Invention Lab: Using a Hybrid of Model Tracing and Constraint-Based Modeling to Offer Intelligent Support in Inquiry Environments. In Intelligent Tutoring Systems, Lecture Notes in Computer Science Volume 6094, 115-124. Berlin: Springer. dx.doi.org/10.1007/978-3-642-13388-6_16

Van Merriënboer, J. J. 1997. Training Complex Cognitive Skills: A Four-Component Instructional Design Model for Technical Training. Englewood Cliffs, NJ: Educational Technology Publications.

VanLehn, K.; Lynch, C.; Schulze, K.; Shapiro, J. A.; Shelby, R. H.; Taylor, L.; Treacy, D. J.; Weinstein, A.; and Wintersgill, M. C. 2005. The Andes Physics Tutoring System: Lessons Learned. International Journal of Artificial Intelligence in Education 15(3): 147-204.

Yaron, D.; Karabinos, M.; Lange, D.; Greeno, J.; and Leinhardt, G. 2010. The ChemCollective: Virtual Labs for Introductory Chemistry Courses. Science 328(5978): 584. dx.doi.org/10.1126/science.1182435

Oriel Uzan is a master's student in the Department of Information Systems Engineering at Ben-Gurion University. He completed his BSc in engineering at Ben-Gurion University in 2011. His research interests include plan recognition and visualization in exploratory learning environments.

Reuth Dekel is a master's student in the Department of Information Systems Engineering at Ben-Gurion University. She completed her BSc in computer science at Ben-Gurion University in 2012. Her thesis focuses on novel algorithms for online plan recognition.

Or Seri is a graduate student in the Department of Brain and Cognitive Science at Ben-Gurion University. She completed her BSc in math, computer science, and political science at Bar-Ilan University in 2004. Her thesis focuses on visualizing expert solutions to students in e-learning.

Ya'akov (Kobi) Gal leads the human-computer decision-making research group in the Department of Information Systems Engineering at Ben-Gurion University. He received his PhD from Harvard University in 2006. He is a recipient of the Wolf Foundation's 2014 Krill Prize for young Israeli scientists and a Marie Curie International Fellowship for 2010, and is a two-time recipient of Harvard University's Derek Bok Award for excellence in teaching, as well as the School of Engineering and Applied Science's outstanding teacher award.
