Articles

76 AI MAGAZINE

Using Analogy to Cluster Hand-Drawn Sketches for Sketch-Based Educational Software

Maria D. Chang, Kenneth D. Forbus

Copyright © 2014, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602

One of the major challenges to building intelligent educational software is determining what kinds of feedback to give learners. Useful feedback makes use of models of domain-specific knowledge, especially models that are commonly held by potential students. To empirically determine what these models are, student data can be clustered to reveal common misconceptions or common problem-solving strategies. This article describes how analogical retrieval and generalization can be used to cluster automatically analyzed hand-drawn sketches incorporating both spatial and conceptual information. We use this approach to cluster a corpus of hand-drawn student sketches to discover common answers. Common answer clusters can be used for the design of targeted feedback and for assessment.

Sketching and drawing are valuable tools for communicating conceptual and spatial information. When people communicate spatial ideas with each other, drawings and diagrams are highly effective because they lighten working memory load and make spatial inference easier (Larkin and Simon 1987). Visual representations may also be helpful for communicating abstract ideas, even when those ideas are not literally about space (for example, reasoning about probability [Cheng 2011]). Essentially, drawings provide externalized symbol systems that facilitate spatial reasoning, which can be applied to a variety of domains.

Sketching is especially useful for learning and instruction in spatially rich subjects, like science, technology, engineering, and mathematics (that is, STEM fields). For science education, sketching can be used to increase engagement, improve learning, and encourage encoding across different representations (Ainsworth, Prain, and Tytler 2011). Drawing and sketching have the potential to play critical roles in science education, especially considering the importance of spatial skills in STEM disciplines. Data from more than 50 years of psychological research indicate that spatial skills are stable predictors of success in STEM fields (Wai, Lubinski, and Benbow 2009). Individuals with greater spatial skills are more


likely to earn advanced STEM degrees and attain STEM careers. As such, it is important that science education make use of spatial tools, like sketching, for teaching the next generation of STEM professionals, educators, and researchers, as well as for more informed citizens.

Advances in intelligent tutoring systems have opened the possibility of creating educational software that can support sketching and take advantage of the many benefits it has to offer (for example, Valentine et al. [2012]). However, building intelligent sketching software is challenging because it requires software that understands sketches in humanlike ways. The noisy nature of sketches makes them difficult to interpret. Consequently, assessing the quality of a student's sketch requires a considerable amount of spatial and conceptual reasoning. With the exception of advanced design sketches, most sketches are rough approximations of spatial information. They are rarely drawn to scale and often require multimodal cues (for example, gestures, speech) to facilitate understanding. For example, a sketch of a map might contain various shapes that represent different landmarks. The shapes may look nothing like the actual landmarks, but may be denoted as landmarks by labels. No one has trouble understanding that a blob can represent something that looks physically different; such visual information is processed with a grain of salt. Somehow, people are able to focus on the spatially and conceptually important information in the sketch and, for the most part, ignore irrelevant information. Building software that can achieve this level of understanding from a sketch is a major challenge for the artificial intelligence community.

Qualitative representations are a good match for sketched data because they carve continuous visual information (for example, two-dimensional location) into discrete categories and relationships (for example, round, right of, and others). These representations enable software systems to reason about sketches using the same structured representations that are hypothesized to be used by people.

Since comparison is prevalent in instruction, a model of visual comparison that incorporates conceptual information is also important for sketch understanding. Analogical comparison using structure mapping (Gentner 1983) allows structured descriptions to be compared to each other to evaluate how similar the two descriptions are. The structure-mapping model of analogy can be used to compare sketches to each other, highlighting qualitative similarities and differences, while adhering to constraints and biases that are supported by psychological research. Computational models of structure mapping have been used to simulate cognitive phenomena (Gentner and Forbus 2011), solve physics problems (Klenk and Forbus 2009; Lockwood and Forbus 2009), and compare text passages to Jeopardy clues for question answering (Murdock 2011). Structure mapping enabled these systems to make more humanlike comparisons. In educational software, such as Sketch Worksheets (Yin et al. 2010), structure mapping generates comparisons that can be used to assess a student's sketch by comparing it to a predefined solution.

A major challenge in any intelligent tutoring system is determining how to coach students. When designing feedback, instructors must hypothesize what will be hard and what will be easy for students. Such hypotheses are not always data driven and can be inaccurate (Nathan, Koedinger, and Alibali 2001). Consequently, most successful intelligent tutoring systems incorporate detailed cognitive models of the task being taught. Building cognitive models requires research on novice misconceptions and strategies (Anderson et al. 1995). Some systems also model the strategies of human tutors, such as intervention techniques and tutoring dialogue (VanLehn et al. 2007). Creating cognitive models for both correct knowledge and common misconceptions for an entire domain is difficult. By contrast, specific exercises can have easily defined misconceptions that can be identified without a full analysis of the domain. By analyzing the work of multiple students on an example, common models (some of which may be misconceptions) can be mined from the data. Although there has been work devoted to assessing student knowledge through sketches (Jee et al. 2009; Kindfield 1992) and mining information about students from learning data (for example, from hand-coded sketches [Worsley and Blikstein 2011]), we are unaware of any efforts to combine automatic sketch understanding and educational data mining. This paper describes an approach for using analogical reasoning over hand-drawn sketches to detect common student answers.

Our hypothesis is that analogical generalization can be used to generate meaningful clusters of hand-drawn sketches. We compare analogical generalization to a k-means clustering algorithm and evaluate its performance on a set of labeled (that is, clustered by hand) student sketches. The resulting clusters from the experiments can be inspected to identify the key characteristics of each cluster. These characteristics can be used to identify student misconceptions and to design targeted feedback for students.

Background

Structure mapping is the comparison mechanism of our clustering approach. Here we summarize the computational models for analogical matching, retrieval, and generalization that we use. We then describe Sketch Worksheets, which is our sketch-based educational software system used to collect and encode hand-drawn sketches.


Structure Mapping

The structure-mapping engine (SME) (Falkenhainer, Forbus, and Gentner 1989) is a computational model of analogy that compares two structured descriptions, a base and a target, and computes one or more analogical mappings between them. Each mapping contains a set of correspondences to indicate which items in the base correspond to which items in the target, a structural evaluation score to measure match quality, and a set of candidate inferences, which are statements that are true in the base and hypothesized to be true in the target.

Three constraints are imposed on the mapping process to prevent all potential mappings from being computed and to account for certain psychological phenomena. The mapping process begins by matching identical relations to each other. This constraint is referred to as identicality. Nonidentical relations may end up corresponding to each other in the final match, but only if their correspondence is suggested by higher-order matches. The mapping process also adheres to the 1:1 constraint, which ensures that items in one description can have at most one corresponding item in the other description. Parallel connectivity requires that arguments to matching relations also match. These constraints make analogical matching tractable.

Importantly, SME has a bias for mappings with greater systematicity, which means that it prefers mappings with systems of shared relations, rather than many shared isolated facts. In other words, given two mappings, one with many matching but disconnected (that is, shallow, lower-order) relations, and another with many matching interconnected (that is, deep, higher-order) relations, SME will prefer the latter. The systematicity preference in SME captures the way people perform analogical reasoning. People prefer analogies that have systems of shared relations and are more likely to make analogical inferences that follow from matching causal systems, rather than shallow matches (Clement and Gentner 1991).
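These constraints can be illustrated with a toy matcher. This is not SME itself (SME builds local match hypotheses and merges them in parallel); it is a minimal sketch, assuming facts encoded as nested tuples, that enforces identicality, parallel connectivity, and the 1:1 constraint, and approximates the systematicity bias by trying the deepest expression pairs first.

```python
def depth(expr):
    """Nesting depth: entities are 0, relations are 1 + max argument depth."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + max((depth(a) for a in expr[1:]), default=0)

def align(base, target, corr):
    """Extend the entity correspondences corr (or fail) so that base
    aligns with target under identicality, parallel connectivity, and 1:1."""
    if isinstance(base, tuple) and isinstance(target, tuple):
        if base[0] != target[0] or len(base) != len(target):
            return None                        # identicality constraint
        for b_arg, t_arg in zip(base[1:], target[1:]):
            corr = align(b_arg, t_arg, corr)   # parallel connectivity
            if corr is None:
                return None
        return corr
    if isinstance(base, tuple) or isinstance(target, tuple):
        return None                            # entity vs. relation mismatch
    if corr.get(base, target) != target:
        return None                            # 1:1 constraint (base side)
    if any(t == target and b != base for b, t in corr.items()):
        return None                            # 1:1 constraint (target side)
    return {**corr, base: target}

def match(base_facts, target_facts):
    """Greedy mapping that tries deep expression pairs first, so systems
    of interconnected relations fix the correspondences before shallow,
    isolated facts get a chance to conflict with them."""
    corr, score, pairs = {}, 0, []
    candidates = sorted(((b, t) for b in base_facts for t in target_facts),
                        key=lambda bt: -depth(bt[0]))
    for b, t in candidates:
        extended = align(b, t, corr)
        if extended is not None:
            corr, score = extended, score + depth(b)
            pairs.append((b, t))
    return corr, score, pairs

# Classic flow analogy: the deep causal structure fixes the
# entity correspondences beaker->coffee and vial->cube.
base = [("cause", ("greater", "p-beaker", "p-vial"), ("flow", "beaker", "vial"))]
target = [("cause", ("greater", "t-coffee", "t-cube"), ("flow", "coffee", "cube"))]
corr, score, _ = match(base, target)
```

Once the deep match fixes beaker->coffee and vial->cube, a shallow fact such as ("flat", "vial") can only match a target fact about cube; pairing it with a fact about coffee would violate the 1:1 constraint and is rejected.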

To create sketch clusters, we use SAGE (sequential analogical generalization engine), an extension of SEQL (Kuehne et al. 2000) that computes probabilities for expressions during generalization and retrieves structured descriptions using analogical retrieval. Generalizations are created by incrementally introducing exemplars into a generalization context. Each generalization context consists of a case library that includes both exemplars and generalizations. For each new exemplar, the most similar exemplar or generalization in the generalization context is found via analogical retrieval using MAC/FAC (Forbus, Gentner, and Law 1995). MAC/FAC computes content vectors that measure the relative frequency of occurrence of relations and attributes in structured representations. It finds the maximum dot product of the vector for the exemplar with the vectors of everything in the generalization context. This step may retrieve up to three items and is analogous to a bag-of-words approach to similarity, albeit with predicates. These items are then compared to the exemplar using SME and the item with the highest structural evaluation score (that is, the most similar case) is returned.
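The MAC stage can be sketched as follows. This is a simplification of MAC/FAC: only functor counts go into the content vector, and the follow-on FAC stage (re-ranking the survivors with SME) is omitted.

```python
from collections import Counter

def content_vector(case):
    """Count predicate and attribute occurrences in a structured case.
    Entity names are ignored: only functors survive, which is why
    'dog bites man' and 'man bites dog' get identical vectors."""
    counts = Counter()
    def walk(expr):
        if isinstance(expr, tuple):
            counts[expr[0]] += 1
            for arg in expr[1:]:
                walk(arg)
    for fact in case:
        walk(fact)
    return counts

def dot(u, v):
    return sum(u[k] * v[k] for k in u.keys() & v.keys())

def mac_stage(probe, library, n=3):
    """MAC: a cheap, nonstructural filter returning up to n cases whose
    content vectors have the highest dot product with the probe."""
    probe_vector = content_vector(probe)
    ranked = sorted(library,
                    key=lambda case: -dot(probe_vector, content_vector(case)))
    return ranked[:n]
```

Because only relative frequencies of functors are recorded, structurally different cases with the same predicates are indistinguishable at this stage; that is exactly the weakness the FAC stage (and, later in this article, the k-means baseline comparison) exposes.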

The reminding returned by MAC/FAC is either an exemplar or an existing generalization. If the similarity between the reminding and the exemplar is above a predefined assimilation threshold, then they are merged together. If the best match is a generalization, the new exemplar is added to it. If the best match is another exemplar, then the two are combined into a new generalization. If there is no best match above the assimilation threshold, then the new exemplar is added directly to the case library for that generalization context. It will remain an exemplar in the generalization context until it is joined with a new exemplar or until there are no more exemplars to add.

The resulting generalizations contain generalized facts and entities. Each fact in a generalization is assigned a probability, which is based on its frequency of occurrence in the exemplars included in the generalization. For example, a fact that is true in only half of the exemplars would be assigned a probability of 0.5. Thus, entities become more abstract, in that facts about them "fade" as their probability becomes lower.
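The assimilation loop and the fact probabilities can be sketched as below. This is a deliberate simplification: cases are flat sets of facts, and a Jaccard overlap stands in for both MAC/FAC retrieval and SME's structural evaluation score; SAGE proper keeps structured facts and generalizes entities as well.

```python
from collections import Counter

class Generalization:
    def __init__(self, facts):
        self.n = 1                    # number of assimilated exemplars
        self.counts = Counter(facts)

    def merge(self, facts):
        self.n += 1
        self.counts.update(facts)

    def probabilities(self):
        # a fact true in half of the exemplars gets probability 0.5
        return {fact: c / self.n for fact, c in self.counts.items()}

def similarity(facts, item):
    """Jaccard overlap: a crude stand-in for SME's structural score."""
    a = set(facts)
    b = set(item.counts) if isinstance(item, Generalization) else set(item)
    return len(a & b) / max(len(a | b), 1)

def sage(cases, threshold=0.6):
    """Incrementally assimilate each case into a case library holding a
    mix of exemplars and generalizations, as SAGE does."""
    library = []
    for facts in cases:
        best, best_sim = None, threshold
        for item in library:
            s = similarity(facts, item)
            if s > best_sim:
                best, best_sim = item, s
        if best is None:
            library.append(facts)              # stays a lone exemplar
        elif isinstance(best, Generalization):
            best.merge(facts)                  # exemplar joins generalization
        else:                                  # two exemplars form a new one
            merged = Generalization(best)
            merged.merge(facts)
            library[library.index(best)] = merged
    return library
```

Running this over fact sets for a corpus of sketches yields a library whose generalizations are the clusters, with per-fact probabilities indicating each cluster's defining criteria.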

Sketch Worksheets

Sketch Worksheets are built within CogSketch (Forbus et al. 2011), our domain-independent sketch understanding system. Each sketch worksheet includes a problem statement, a predefined solution sketch, and a workspace where the student sketches his or her candidate answer. As part of the authoring process, the worksheet author describes the problem, sketches an ideal solution, and labels elements in the solution with concepts that he or she selects from an OpenCyc-derived knowledge base.1 CogSketch analyzes the solution sketch by computing qualitative spatial and conceptual relations between items in the sketch. Spatial relations that are automatically computed include topological relations (for example, intersection, containment) and positional relations (for example, right of, above), all of which are domain independent. Conceptual relations include relations selected by the worksheet author and can be a combination of domain-general relationships and domain-specific relationships. The worksheet author can then peruse the representations computed by CogSketch and identify which facts are important for capturing the correctness of the sketch. For each such fact, the author includes a piece of advice that should be given to the student if that fact does not hold in the student's sketch. When a student asks for feedback, SME is used to compare the student's sketch to the solution sketch. The candidate inferences, which represent differences between the sketches, are examined to see if there are any important facts among them. If there are, the advice associated with that important fact is included as part of the feedback. By identifying important facts and associating advice with them, the worksheet author creates relational criteria for determining the correctness of the sketch.

Worksheet authors can also identify which drawn elements have quantitative location criteria by defining quantitative ink constraints, which define a tolerance region for a particular drawn element. Which elements of a solution sketch correspond to elements of a student's sketch is determined through the correspondences that SME computes. If the student's drawn element falls outside of the tolerance region, it is considered incorrect. If it falls within the tolerance region, it is considered correct. This allows the author to set criteria about the absolute location of drawn elements.
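The check itself can be sketched as a bounding-box containment test; CogSketch's actual geometry is not specified here, so this is an assumed minimal form.

```python
def meets_ink_constraint(element_bbox, tolerance_region):
    """True if a drawn element's bounding box lies entirely inside the
    tolerance region defined by a quantitative ink constraint.
    Boxes are (left, top, right, bottom) in sketch coordinates,
    with the y axis growing downward."""
    el, et, er, eb = element_bbox
    tl, tt, tr, tb = tolerance_region
    return el >= tl and et >= tt and er <= tr and eb <= tb
```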

The difference between these two criterion types can be illustrated by two different worksheet exercises. Consider a worksheet that asks a student to draw the solar system. The exact location of the sun does not matter, as long as it is contained by the orbit rings of other planets. In other words, its location is constrained relative to other drawn entities. A worksheet author would capture this by marking a containment fact as important and associating a natural language advice string with that fact. Alternatively, consider a worksheet that asks a student to identify the temporal lobe on a diagram of the human brain. The absolute location of the drawing denoting the temporal lobe would be important. For an element whose location is constrained relative to an absolute frame of reference (for example, a background image), quantitative ink constraints are necessary.

Sketch Worksheets have been used in experiments on spatial reasoning as well as classroom activities in geoscience (at Northwestern University and Carleton College) and elementary school biology. The sketches used in the experiments described in this paper were collected using sketch worksheets.

Clustering Through Analogical Generalization

Clustering is achieved by performing analogical generalization over student sketches. The clustering algorithm adds the sketches in random order, using the SAGE algorithm mentioned above. A single generalization context is used, that is, it operates unsupervised, because the goal is to see what clusters emerge.

Encoding

A major challenge to clustering sketches is choosing how to encode the information depicted in each sketch. Each sketch contains a wealth of spatial information, not all of it relevant for any particular situation.2 In order to highlight visually and conceptually salient attributes and relationships, we harness information explicitly entered by the student and the worksheet author. More specifically, we filter the underlying representations in each sketch based on the following principles: conceptual information is critical, quantitative ink constraints must constrain analogical mappings, and worksheet authoring should guide spatial and conceptual elaboration.

Conceptual Information

Every sketch worksheet comes equipped with a subset of concepts from an OpenCyc-derived knowledge base. This subset contains the concepts that may be used in the worksheet and is selected by the worksheet author to limit the conceptual scope of the exercise. These concepts are applied by students to elements of their drawing through CogSketch's conceptual labeling interface. This is useful for education because the mapping between shapes and entities is often one to many. While visual relationships are computed automatically by CogSketch, conceptual relationships are entered by sketching arrows or annotations and labeling them appropriately, through the same interface. Thus, the conceptual labels constitute the students' expression of their model of what is depicted. Consequently, conceptual information is always encoded for generalization.

Quantitative Ink Constraints Limit Matches

Another type of information that is entered explicitly by the worksheet author is quantitative ink constraints. Recall that quantitative ink constraints define a tolerance region relative to an absolute frame of reference (for example, a background image). Quantitative ink constraints are defined for entities whose absolute position matters.

When encoding information about entities for which there are quantitative ink constraints, the encoding algorithm computes their position with respect to the tolerance regions, to determine if the entity's location meets the constraint or not. If it does not, we further encode how the constraint was violated (for example, too wide, too narrow) and include that information in the encoding.

Furthermore, each entity that is evaluated with respect to a quantitative ink constraint is associated with that constraint as a location-specific landmark. This association limits the possible analogical mappings by ensuring that entities associated with one landmark cannot map to entities that are associated with a different landmark. This also ensures that entities cannot be generalized across different location-specific landmarks. This approach for using quantitative constraints to limit the analogical mappings has been shown to lead to sketch comparisons that provide more accurate feedback to students (Chang and Forbus 2012).
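One plausible encoding for this is sketched below. The two violation predicates are the ones that appear in this article's figure 2; everything else (the landmark fact, the reading of a constraint as requiring the ink to span its region horizontally, the function name) is an assumption for illustration.

```python
def encode_with_ink_constraint(entity, bbox, landmark, region):
    """Encode an entity relative to its quantitative ink constraint.
    bbox and region are (left, top, right, bottom); landmark names the
    constraint, so entities under different landmarks cannot be mapped
    to one another, or generalized together, during matching."""
    left, _, right, _ = bbox
    region_left, _, region_right, _ = region
    facts = [("constrainedBy", entity, landmark)]   # landmark association
    if left > region_left:      # ink stops short of the region's left bound
        facts.append(("lessThanQuantInkLeftBound", entity, landmark))
    if right < region_right:    # ink stops short of the right bound
        facts.append(("lessThanQuantInkRightBound", entity, landmark))
    return facts
```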

Spatial and Conceptual Elaboration

Worksheet authors can also specify a subset of the visual relationships computed by CogSketch as important. For example, the core of Earth must be inside its mantle. Some conceptual information can also be marked as important, for example, that one layer of rock is younger than another. All facts marked as important by the worksheet author, whether spatial or conceptual, are always included in the encoding for generalization.

Evaluation

To evaluate our clustering algorithm we used a set of fault identification worksheets (for example, figure 1) submitted by students taking an undergraduate geoscience course at Northwestern University. There were 28 sketches in total, spanning three different fault identification exercises. A gold standard was created by hand-clustering the sketches for each exercise separately, taking into account visual features of the sketch (for example, location of drawn entities), what type of entities were included in the sketch, and whether or not the sketch would be considered correct by an instructor. We then ran our generalization algorithm on the unlabeled data for each exercise to evaluate how well the clusters it produced match the gold standard. Because clusters may differ depending on the order in which sketches are selected, we repeated the clustering over 10 iterations. We collected three measures from the resulting clusters: purity, precision, and recall. Purity is a measure of the quality of a set of clusters, defined as the ratio of correctly classified exemplars across clusters to the total number of exemplars. The precision of a cluster is the proportion of exemplars in the cluster that are classified correctly, and the recall of a cluster is the proportion of exemplars that are correctly included in the cluster. Precision and recall were computed with respect to the gold standard.

Figure 1. Four Example Student Sketches. The bottom two are grouped together and considered nearly correct. The top two occupy their own single-member clusters.

Table 1. Clustering Measures for Analogical Generalization (SAGE) and k-Means Clustering (Without Analogy).

                        SAGE    k-means
Sketch Group 1
  Number of Clusters    6.7     6
  Purity**              0.90    0.72
  Precision**           0.86    0.56
  Recall*               0.85    0.56
Sketch Group 2
  Number of Clusters    5.5     4
  Purity**              0.94    0.74
  Precision**           0.98    0.61
  Recall**              0.82    0.59
Sketch Group 3
  Number of Clusters    6.1     4
  Purity**              0.96    0.83
  Precision**           0.99    0.67
  Recall*               0.80    0.68

All measures are averaged over 10 random restart iterations of the clustering procedure. Asterisks indicate the probability associated with independent samples t-tests between SAGE and k-means measures: ** p < 0.001, * p < 0.005.
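Under these definitions, the three measures can be computed by crediting each cluster with its majority gold-standard label; the article does not spell out its exact scoring procedure, so this is a sketch of one standard way to do it.

```python
from collections import Counter

def cluster_metrics(clusters, gold):
    """clusters: list of lists of exemplar ids; gold: id -> hand label.
    Each cluster is credited with its majority gold label. Returns
    overall purity and per-cluster precision/recall."""
    total = sum(len(c) for c in clusters)
    correct = 0
    per_cluster = []
    for cluster in clusters:
        votes = Counter(gold[i] for i in cluster)
        label, hits = votes.most_common(1)[0]
        gold_size = sum(1 for lab in gold.values() if lab == label)
        correct += hits
        per_cluster.append({"precision": hits / len(cluster),
                            "recall": hits / gold_size})
    return correct / total, per_cluster
```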

k-Means Clustering

To explore the impact of relational structure on generalization behavior, we also compared our approach to a nonstructural way of ascertaining similarity. Specifically, we used the MAC/FAC content vectors (described above) as a cruder, nonrelational form of similarity. While content vectors are still sensitive to the presence of relationships, since those predicates are included in them, content vectors only contain relative frequency information. In other words, "man bites dog" is the same as "dog bites man." We used k-means clustering on the same data, where each mean was the content vector of a sketch and the distance measure between means was the inverse dot product of the content vectors being compared. The more overlap between the content vectors (that is, the more overlap between attributes and relationships), the greater the similarity and the smaller the distance. For each k-means clustering process we supplied k by counting the number of labeled clusters. In this sense, the k-means clustering approach had a slight advantage over analogical generalization. The k-means clustering algorithm was also repeated 10 times, since the initial k means can affect the makeup of clusters.
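A minimal version of this baseline is sketched below; the initialization (the content vectors of k randomly chosen sketches) and the fixed iteration budget are assumptions, since the article does not give those details.

```python
import random
from collections import Counter

def dot(u, v):
    return sum(u[k] * v[k] for k in u.keys() & v.keys())

def mean_vector(vectors):
    """Componentwise average of a set of content vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({k: c / len(vectors) for k, c in total.items()})

def kmeans(vectors, k, iters=20, seed=0):
    """k-means over content vectors. Similarity is the dot product, so
    minimizing the 'inverse dot product' distance means assigning each
    vector to the mean with which it has the largest dot product."""
    rng = random.Random(seed)
    means = [Counter(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iters):
        assignment = [max(range(k), key=lambda j: dot(v, means[j]))
                      for v in vectors]
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:               # keep the old mean if a cluster empties
                means[j] = mean_vector(members)
    return assignment
```

Because the vectors record only predicate frequencies, sketches sharing the same relations and attributes land in the same cluster regardless of which entities participate in them, which is exactly the nonstructural behavior this baseline is meant to expose.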

Results

Table 1 shows the average purity, precision, and recall for each approach across the three worksheet groups, averaged over 10 iterations of each approach. Analogical generalization outperformed k-means without analogy for clustering in all measures. Since purity is often high when there are many clusters, it is important to consider the precision and recall measures as well.

We used independent samples t-tests to test for significant differences between purity, precision, and recall for each sketch group separately (for a total of nine comparisons). Each measure was significantly higher for analogical generalization than for k-means clustering (p < 0.005, Bonferroni corrected for nine comparisons).

Figure 2 shows two sketches that were frequently generalized together. This cluster indicates a common sketching behavior exhibited by students. The high-probability facts in the generalization indicate the defining criteria for the cluster. Most of the high-probability facts in this generalization are concept membership attributes. Other facts refer to the direction of the sketched diagonal arrows in the sketch. These facts were already considered in the feedback design of this worksheet. However, the three high-probability facts shown in figure 2 indicate the potential for more targeted feedback. These facts indicate that three of the four marker beds failed quantitative ink constraints in specific ways. The bold horizontal arrows imposed on the figure point to two marker beds that map to each other in an analogical mapping. Both of these marker beds fall short of the left bounds of their quantitative ink constraints (see first fact in figure 2). Similarly, two other marker beds (unmarked) fall short of the right bounds of the quantitative ink constraints. Without knowing that multiple students would exhibit this common behavior, a worksheet author would have no reason to include targeted feedback about it. Indeed, the authors of these worksheets were surprised to discover that multiple students had engaged in the same interpretation of the image and consequently the same erroneous sketching behavior. Given that multiple students commit this error, targeted feedback about the horizontal extent of marker beds would have been helpful, for example, "Marker bed regions are not just near the fault; they can extend to the edges of the image." This is the type of information that can be leveraged from clustering student data. In turn, this information can be used to design exercises with advice for students that can have a greater impact than the feedback created by the worksheet authors a priori.


Related Work

Many researchers have explored misconceptions in domains like algebra, geometry (Anderson et al. 1995), and physics (VanLehn et al. 2007). Each of these research programs answers important questions about the structure of knowledge during learning. These answers have shaped the coaching strategies of various tutoring systems.

Many sketch understanding systems exist, but most stick to a single domain because they use sketch recognition (Lee et al. 2007; de Silva et al. 2007; Valentine et al. 2012). No other sketch understanding systems use structure mapping as a model for comparison. Despite this, it may still be possible to apply similar clustering techniques to those systems.

Figure 2. Two Sketches That Are Frequently Clustered Together and Three High-Probability Facts from Their Generalization. The horizontal block arrows point to the drawn entity that is referenced in the first fact. That entity falls short of the left bound of its quantitative ink constraint.

(lessThanQuantInkLeftBound (GenEntFn 5 1 Faults-1) Object-438)
(lessThanQuantInkRightBound (GenEntFn 3 1 Faults-1) Object-440)
(lessThanQuantInkRightBound (GenEntFn 4 1 Faults-1) Object-436)

Discussion and Future Work

This article describes a method for clustering sketches to detect common answer patterns. We used models of human analogical processing to cluster hand-drawn sketches completed by undergraduate geoscience students. The analogical clustering approach significantly outperformed a k-means clustering algorithm.

This technique can be used to mine common answer patterns from sketches so that they can be used for assessment or for designing targeted feedback. Instructors may use this technique to discover the distribution of answer patterns in their classrooms, some of which may be prevalent misconceptions. This approach enables common answer detection in a data-driven (but tightly scoped) manner, without requiring a cognitive analysis of the entire domain or even the entire task.

One of the limitations to this approach is the understandability of the facts used to describe generalizations. As discussed above, high-probability facts can be used to understand the defining criteria of a cluster. For an instructor to easily interpret these facts would require familiarity with the knowledge representations used therein. However, it can be argued that the instructors may not need those explicit facts. Instead, they can simply view a prototypical member of the cluster and decide on the defining criteria for themselves. With this technique, rather than looking at all the sketches submitted by students, an instructor can inspect only as many sketches as there are clusters.

In the future we plan to continue refining encoding procedures of sketches. The procedures used in this experiment are domain general, but there are likely cases where different filters on conceptual and/or spatial information will be needed. We may also be able to learn more about common problem-solving strategies by including an analysis of a sketch's history in our encoding. This would allow us to create clusters based on sketching behaviors over time, rather than only on the final state of a student's sketch. We also have not yet integrated shape and edge level representations into this encoding procedure (Lovett et al. 2012), as these are only now starting to be integrated into our sketch worksheets. We also plan to add clustering to the grading utilities built into sketch worksheets. Most importantly, we plan to extend our experiments to more sketches in more STEM domains. This will involve continued collaboration with STEM experts and educators at the K–12 and university levels. We anticipate that using a varied corpus of sketches will enable us to converge on encoding procedures that will scale up and become powerful tools for designing environments with impactful instructional feedback.

Acknowledgements

Thanks to Brad Sageman and Andrew Jacobson and their students for providing us with expertise, advice, and data. This work was supported by the Spatial Intelligence and Learning Center (SILC), an NSF Science of Learning Center (Award Number SBE-1041707).

Notes

1. www.cyc.com/platform/opencyc.

2. For example, a student might draw a planet above the sun versus below the sun, a visually salient difference that doesn't matter in most orbital diagrams.

References

Ainsworth, S.; Prain, V.; and Tytler, R. 2011. Drawing to Learn in Science. Science 333(6046): 1096–1097. dx.doi.org/10.1126/science.1204153

Anderson, J. R.; Corbett, A. T.; Koedinger, K. R.; and Pelletier, R. 1995. Cognitive Tutors: Lessons Learned. The Journal of the Learning Sciences 4(2): 167–207. dx.doi.org/10.1207/s15327809jls0402_2

Chang, M. D., and Forbus, K. D. 2012. Using Quantitative Information to Improve Analogical Matching Between Sketches. In Proceedings of the Twenty-Fourth Conference on Innovative Applications of Artificial Intelligence, 2269–2274. Menlo Park, CA: AAAI Press.

Cheng, P. C.-H. 2011. Probably Good Diagrams for Learning: Representational Epistemic Recodification of Probability Theory. Topics in Cognitive Science 3(3): 475–498. dx.doi.org/10.1111/j.1756-8765.2009.01065.x

Clement, C. A., and Gentner, D. 1991. Systematicity as a Selection Constraint in Analogical Mapping. Cognitive Science 15(1): 89–132. dx.doi.org/10.1207/s15516709cog1501_3

de Silva, R.; Bischel, T. D.; Lee, W.; Peterson, E. J.; Calfee, R. C.; and Stahovich, T. 2007. Kirchhoff's Pen: A Pen-Based Circuit Analysis Tutor. In Proceedings of the Fourth Eurographics Workshop on Sketch-Based Interfaces and Modeling, 75–82. New York: Association for Computing Machinery. dx.doi.org/10.1145/1384429.1384447

Falkenhainer, B.; Forbus, K.; and Gentner, D. 1989. The Structure-Mapping Engine: Algorithm and Examples. Artificial Intelligence 41(1): 1–63. dx.doi.org/10.1016/0004-3702(89)90077-5

Forbus, K.; Gentner, D.; and Law, K. 1995. MAC/FAC: A Model of Similarity-Based Retrieval. Cognitive Science 19(2): 141–205. dx.doi.org/10.1207/s15516709cog1902_1

Forbus, K. D.; Usher, J.; Lovett, A.; Lockwood, K.; and Wetzel, J. 2011. CogSketch: Sketch Understanding for Cognitive Science Research and for Education. Topics in Cognitive Science 3(4): 648–666. dx.doi.org/10.1111/j.1756-8765.2011.01149.x

Gentner, D. 1983. Structure-Mapping: A Theoretical Framework for Analogy. Cognitive Science 7(2): 155–170. dx.doi.org/10.1207/s15516709cog0702_3

Gentner, D., and Forbus, K. 2011. Computational Models of Analogy. Wiley Interdisciplinary Reviews: Cognitive Science 2(3): 266–276. dx.doi.org/10.1002/wcs.105

Jee, B.; Gentner, D.; Forbus, K. D.; Sageman, B.; and Uttal, D. H. 2009. Drawing on Experience: Use of Sketching to Evaluate Knowledge of Spatial Scientific Concepts. In Proceedings of the Thirty-First Annual Conference of the Cognitive Science Society, 2499–2504. Wheat Ridge, CO: Cognitive Science Society.

Kindfield, A. C. H. 1992. Expert Diagrammatic Reasoning in Biology. In Reasoning with Diagrammatic Representations: Papers from the AAAI Spring Symposium, 41–46. AAAI Technical Report SS-92-02. Palo Alto, CA: AAAI Press.

Klenk, M., and Forbus, K. 2009. Analogical Model Formulation for Transfer Learning in AP Physics. Artificial Intelligence 173(18): 1615–1638. dx.doi.org/10.1016/j.artint.2009.09.003

Kuehne, S.; Forbus, K.; Gentner, D.; and Quinn, B. 2000. SEQL: Category Learning as Progressive Abstraction Using Structure Mapping. In Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society, 770–775. Wheat Ridge, CO: Cognitive Science Society.

Larkin, J. H., and Simon, H. A. 1987. Why a Diagram Is (Sometimes) Worth Ten Thousand Words. Cognitive Science 11(1): 65–99. dx.doi.org/10.1111/j.1551-6708.1987.tb00863.x

Lee, W.; de Silva, R.; Peterson, E. J.; Calfee, R. C.; and Stahovich, T. F. 2007. Newton's Pen — A Pen-Based Tutoring System for Statics. In Proceedings of the Eurographics Workshop on Sketch-Based Interfaces and Modeling, 59–66. New York: Association for Computing Machinery. dx.doi.org/10.1145/1384429.1384445

Lockwood, K., and Forbus, K. 2009. Multimodal Knowledge Capture from Text and Diagrams. In Proceedings of the Fifth International Conference on Knowledge Capture, 65–72. New York: Association for Computing Machinery. dx.doi.org/10.1145/1597735.1597747

Lovett, A.; Kandaswamy, S.; McLure, M.; and Forbus, K. 2012. Evaluating Qualitative Models of Shape Representation. Paper presented at the Twenty-Sixth International Workshop on Qualitative Reasoning, Playa Vista, CA, July 16–18.

Murdock, J. 2011. Structure Mapping for Jeopardy! Clues. InProceedings of the Nineteenth International Conference on Case-Based Reasoning, 6–10. Berlin: Springer.

Nathan, M.; Koedinger, K. R.; and Alibali, M. 2001. Expert Blind Spot: When Content Knowledge Eclipses Pedagogical Content Knowledge. Paper presented at the 2001 Annual Meeting of the American Educational Research Association, Seattle, WA, April 10–14.

Valentine, S.; Vides, F.; Lucchese, G.; Turner, D.; Rim, H.; Li, W.; Linsey, J.; and Hammond, T. 2012. Mechanix: A Sketch-Based Tutoring System for Statics Courses. In Proceedings of the Twenty-Fourth Conference on Innovative Applications of Artificial Intelligence, 2253–2260. Menlo Park, CA: AAAI Press.

VanLehn, K.; Graesser, A. C.; Jackson, G. T.; Jordan, P.; Olney, A.; and Rosé, C. P. 2007. When Are Tutorial Dialogues More Effective Than Reading? Cognitive Science 31(1): 3–62. dx.doi.org/10.1080/03640210709336984

Wai, J.; Lubinski, D.; and Benbow, C. P. 2009. Spatial Ability for STEM Domains: Aligning over 50 Years of Cumulative Psychological Knowledge Solidifies Its Importance. Journal of Educational Psychology 101(4): 817–835. dx.doi.org/10.1037/a0016127

Worsley, M., and Blikstein, P. 2011. What's an Expert? Using Learning Analytics to Identify Emergent Markers of Expertise Through Automated Speech, Sentiment, and Sketch Analysis. In Proceedings of the Fourth Annual Conference on Educational Data Mining, 235–240. Boston: International Educational Data Mining Society.

Yin, P.; Forbus, K. D.; Usher, J.; Sageman, B.; and Jee, B. 2010. Sketch Worksheets: A Sketch-Based Educational Software System. In Proceedings of the Twenty-Second Conference on Innovative Applications of Artificial Intelligence, 1871–1876. Menlo Park, CA: AAAI Press.

Maria D. Chang is a Ph.D. candidate in computer science at Northwestern University. Her research interests include knowledge representation and reasoning, human-computer interaction, and the use of artificial intelligence in educational software. She is a member of the Qualitative Reasoning Group and the Spatial Intelligence and Learning Center.

Kenneth D. Forbus is the Walter P. Murphy Professor of Computer Science and a professor of education at Northwestern University. His research interests include qualitative reasoning, analogical reasoning and learning, spatial reasoning, sketch understanding, natural language understanding, cognitive architecture, reasoning system design, intelligent educational software, and the use of AI in interactive entertainment. He is a Fellow of AAAI, the Cognitive Science Society, and the Association for Computing Machinery.
