Probabilistic Reasoning for Assembly-Based 3D Modeling

Siddhartha Chaudhuri∗ Evangelos Kalogerakis∗ Leonidas Guibas Vladlen Koltun

Stanford University

Figure 1: 3D models created with our assembly-based 3D modeling tool.

Abstract

Assembly-based modeling is a promising approach to broadening the accessibility of 3D modeling. In assembly-based modeling, new models are assembled from shape components extracted from a database. A key challenge in assembly-based modeling is the identification of relevant components to be presented to the user. In this paper, we introduce a probabilistic reasoning approach to this problem. Given a repository of shapes, our approach learns a probabilistic graphical model that encodes semantic and geometric relationships among shape components. The probabilistic model is used to present components that are semantically and stylistically compatible with the 3D model that is being assembled. Our experiments indicate that the probabilistic model increases the relevance of presented components.

CR Categories: I.3.5 [Computing Methodologies]: Computer Graphics—Computational Geometry and Object Modeling;

Keywords: data-driven 3D modeling, probabilistic reasoning, probabilistic graphical models

∗S. Chaudhuri and E. Kalogerakis contributed equally to this work.


1 Introduction

What remains hard is modeling. The structure inherent in three-dimensional models is difficult for people to grasp and difficult too for user interfaces to reveal and manipulate. Only the determined model three-dimensional objects, and they rarely invent a shape at a computer, but only record a shape so that analysis or manufacturing can proceed. The grand challenges in three-dimensional graphics are to make simple modeling easy and to make complex modeling accessible to far more people.

— Robert F. Sproull [1990]

Providing easy-to-use tools for the creation of detailed three-dimensional content is a key challenge in computer graphics. With the accessibility of comprehensive game development environments, individual programmers and small teams can build and deploy realistic computer games and virtual worlds [Epic Games 2011; Unity Technologies 2011]. Yet the creation of compelling three-dimensional content to populate such worlds remains out of reach for most developers, who lack 3D modeling expertise.

A promising approach to 3D modeling is assembly-based modeling, in which new models are assembled from pre-existing components. The set of components can be designed specifically for this purpose or derived from a repository of shapes [Funkhouser et al. 2004; Kraevoy et al. 2007; Maxis Software 2008; Chaudhuri and Koltun 2010]. The advantage of assembly-based modeling is that users do not need to specify new geometry from scratch; modeling reduces to selection and placement of components.

© ACM, (2011). This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Graphics 30(4), July 2011.

A key challenge in assembly-based 3D modeling is the identification of relevant components to be presented to the user. Unless the modeling task is fixed in advance, an assembly-based modeling tool must use a large and varied database of components. The interface thus needs to provide effective mechanisms for identifying the most relevant available components at any stage of the modeling process. Previous work used either text search or geometric matching to retrieve shapes and components [Funkhouser et al. 2004; Shin and Igarashi 2007; Lee and Funkhouser 2008; Chaudhuri and Koltun 2010]. Such techniques do not take into account the semantic and stylistic relationships between the current model and the components in the database. In this paper, we present an approach that studies a model library to learn how shapes are put together, and uses this knowledge to suggest semantically and stylistically relevant components at each stage of the modeling process. Our work aims to support open-ended 3D modeling [Talton et al. 2009; Chaudhuri and Koltun 2010].

We use a probabilistic graphical model called a Bayesian network [Pearl 1988; Koller and Friedman 2009] to represent semantic and stylistic relationships between components in a shape database. When new models are assembled, inference in the Bayesian network is used to derive a relevance ranking on categories of components, and on individual components within each category. For example, when a user begins a modeling session by placing an airplane fuselage, the probabilistic model identifies components in the repository that are more likely to be adjacent to a fuselage, such as wings, stabilizers, and engines. The presented components continue to be dynamically updated as the current model is constructed. Thus when the user augments the fuselage with an engine from a military jet, the probabilistic model decreases the probability of propellers and increases the probability of rockets and missiles. After the user adds two wings, the probabilistic model lowers the probability of wings, since the presence of more than two wings on a jet is unlikely.

To evaluate the effectiveness of the presented approach, we have developed an assembly-based 3D modeling tool. Our modeling interface is inspired by the successful Spore creation tools and uses the probabilistic model to dynamically update the presented components. Experiments with this interface indicate that the probabilistic model produces more relevant suggestions than a static presentation of components or a purely geometric approach.

In summary, this paper introduces a probabilistic model of shape structure that incorporates both semantics and geometric style. The model is trained on a shape database and is used to present relevant shape components during a 3D modeling session. The experimental results demonstrate that the model significantly increases the relevance of presented components.

2 Prior Work

Assembly-based 3D modeling was pioneered by Funkhouser et al. [2004], whose Modeling by Example system allows users to retrieve source models using text- or shape-based search. Components are then cut out of some of the retrieved models and glued onto the current one. Subsequent research has investigated sketch-based retrieval of components [Shin and Igarashi 2007; Lee and Funkhouser 2008]. In all these approaches, the user must search for each specific component. This is less appropriate for open-ended 3D modeling, when the composition of the model is not specified in advance and the user can benefit from a variety of suggestions [Chaudhuri and Koltun 2010].

Kraevoy et al. [2007] describe an assembly-based modeling tool in which the user can load a small set of compatible shapes and interchange parts between them. The method relies on a compatible segmentation of all shapes and does not handle heterogeneous shape libraries.

Chaudhuri and Koltun [2010] describe a data-driven technique for presenting components that can augment a given shape. The approach is purely geometric and does not take into account the semantics of components. Our results show that the incorporation of semantic relationships increases the relevance of presented components.

Fisher and Hanrahan [2010] describe a probabilistic model of spatial context for 3D model search. Given a query bounding box placed by the user in a 3D scene, their approach returns database shapes that are appropriate in the context of the surrounding scene. Our probabilistic model is substantially different from that of Fisher and Hanrahan and is designed for interactive shape modeling.

Our use of probabilistic graphical models is influenced by the work of Merrell et al. [2010], who generate residential building layouts using a Bayesian network trained on architectural programs. The probabilistic model of Merrell et al. is designed specifically to represent architectural programs. We introduce a probabilistic model for the structure of general 3D shapes. Our model represents both component categories and a variable number of individual components, and augments existence and adjacency relations with symmetry and geometric style.

The interface of our assembly-based modeling tool is inspired by the Spore creature creator, which allows creatures to be assembled from components such as heads, legs, and arms [Maxis Software 2008]. The Spore creature creator has been used to create over one hundred million creature models in the first year after release. While our interface closely follows the successful approach of Spore, the presented categories and parts are dynamically updated in light of the current model, in order to present the most relevant components from a large heterogeneous library. Furthermore, we do not require specially crafted components, but extract them semi-automatically from a shape library.

3 Overview

Our approach consists of two stages, illustrated in Figure 2. The first is an offline preprocessing stage in which a probabilistic model is trained on a shape database. The second is an interactive stage in which inference in the probabilistic model is used to present relevant shape components during a 3D modeling session.

Preprocessing. The input to the preprocessing stage is a repository of segmented and labeled shapes. Each label represents a component category, such as head, arm, and torso. The labeling is hierarchical and each component may contain subparts with associated subcategories; for example, torsos are composed of lower torso and upper torso. We use the technique of Kalogerakis et al. [2010] to produce the segmentation and labeling from a set of training examples. Thus we preprocess a shape repository into components semi-automatically, relieving the chore of manually segmenting each model and labeling all components.

Given the set of segmented components, we further cluster them based on geometric style. For each category, the components are partitioned into a set of clusters with similar geometric feature vectors. The feature vectors incorporate descriptors such as shape diameter [Shapira et al. 2009], curvature, shape context [Mori et al. 2001], and PCA-based parameters (Appendix A). An example of this clustering is illustrated in Figure 2, where arms with a roughly humanoid appearance are grouped into one cluster, while thin and alien arms are grouped into another. The clustering allows the probabilistic model to operate not only on the semantic labels, but also on the geometric appearance of components.

Figure 2: Overview of our approach. The preprocessing stage (top) begins with a library of models, segmented and labeled using the technique of Kalogerakis et al. [2010]. The components extracted from the models are further clustered by geometric style. A Bayesian network is then learned that encodes probabilistic dependencies between labels, geometric styles, part adjacencies, number of parts from each category, and symmetries. The figure shows a subset from a real network learned from a library of creature models. The runtime stage (bottom) performs probabilistic inference in the learned Bayesian network to generate ranked lists of category labels and components within each category, customized for the currently assembled model.

Given the set of category labels and style clusters within each label, the method learns a probabilistic model that encodes dependencies between labels, style clusters, part adjacencies, numbers of parts from each category, and symmetries.

Runtime. During an interactive 3D modeling session, our methodperforms probabilistic inference in the learned model to generate aranked list of components that may be useful for augmenting thecurrent shape. The ranking reflects compatibility with componentsthat are already part of the assembled shape.

In order to evaluate the relevance of the suggestions produced by the learned model, we have developed a tool for assembly-based 3D modeling. The modeling interface is illustrated in Figure 4 and in the supplementary video. On the left, the interface presents tabs with semantic labels such as torso, head, and arm. The tabs can expand hierarchically to show subcategories, such as lower torso, upper torso, and belt. When a user selects a tab, the interface shows a list of parts that belong to the associated category. The user can select a component and drag it onto the current model. When a component is added in this way, the lists of categories and components are re-ranked to reflect their semantic and stylistic compatibility with the current 3D model. Using the probabilistic model, the interface also estimates whether the new component should have a symmetric counterpart and computes the symmetry plane. The interface supports sliding any component along the surface of the model, as well as translation, rotation, scaling, duplication, and gluing, thus assisting the assembly of a 3D model from chosen components.

4 Probabilistic Model

Our probabilistic model is designed for the purpose of recommending components that can augment a given 3D model. The probabilistic model is trained on a set of segmented and labeled shapes. Each shape is represented by the existence, adjacencies, geometric style, symmetries, and number of components from each category, as described below. These attributes are represented as random variables. Each shape is thus treated as a sample from a joint probability distribution $P(X)$, where $X$ contains the following sets of variables:

Existence $E = \{E_l\}$, where $E_l \in \{0, 1\}$ represents whether a component from category $l \in L$ exists in a shape, where $L$ is the set of component labels in the repository. There is one such random variable for each label.

Cardinality $N = \{N_l\}$, where $N_l \in \{0\} \cup \mathbb{Z}^+$ represents the number of components from category $l \in L$ in a shape. There is also one such random variable for each label. For tractable inference, we limit the range to $\{0, 1, 2, 3, 4, 5+\}$, where $5+$ represents the case of 5 or more existing components from the category. Note that the random variables $E$ represent information about the existence of components, whereas the cardinality variables $N$ make it specific in terms of their exact number. The random variables $E$ are still useful in our model as an intermediate representation, since the number of components cannot be directly observed in the user’s incomplete shape. In addition, certain dependencies of components are more compactly expressed in terms of their existence rather than their exact cardinality (e.g., a table top exists when table legs exist, regardless of their exact number), which helps decrease the number of parameters in our model.

Adjacency $A = \{A_{l,l'}\}$, where $A_{l,l'} \in \{0, 1\}$ represents whether a component from category $l \in L$ is adjacent to a component from category $l' \in L$ in a shape. There is one such random variable for each pair of distinct labels that appear adjacent in at least one of the shapes in the repository.

Style $S = \{S_{s,l}\}$, where $S_{s,l} \in \{0, 1\}$ represents whether a component of geometric style $s \in S_l$ exists in a shape, where $S_l$ is the set of clusters for category $l$. There is one such random variable for each style cluster within each category.

Symmetry $R = \{R_{l,l'}\}$, where $R_{l,l'} \in \{0, 1\}$ represents whether a component from category $l \in L$ has a symmetric counterpart with respect to a component from category $l' \in L$. For example, an arm has a symmetric counterpart with respect to the torso. There is one such random variable for each distinct pair of labels that have such a symmetry relationship in at least one of the shapes in the repository.
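To make the five variable sets concrete, the following minimal Python sketch encodes one segmented shape as an assignment to $E$, $N$, $A$, $S$, and $R$. The `Component` record, the tuple keys, and the function names are hypothetical and chosen only for illustration; they are not taken from the implementation used in our experiments.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    label: str   # category label, e.g. "arm"
    style: int   # index of the style cluster within the category


def cap(count, limit=5):
    """Map a raw count to the truncated range {0, 1, 2, 3, 4, '5+'}."""
    return count if count < limit else "5+"


def encode_shape(components, adjacent_pairs, symmetric_pairs, labels):
    """Return one training sample as a dict {variable key: value}.

    adjacent_pairs and symmetric_pairs are sets of (label, label') tuples.
    Style, adjacency, and symmetry variables that are absent from the dict
    are understood to be 0 (an assumption of this sketch).
    """
    counts = Counter(c.label for c in components)
    x = {}
    for l in labels:
        x[("E", l)] = int(counts[l] > 0)   # existence
        x[("N", l)] = cap(counts[l])       # truncated cardinality
    for c in components:
        x[("S", c.style, c.label)] = 1     # style cluster present
    for l, l2 in adjacent_pairs:
        x[("A", l, l2)] = 1                # adjacency observed in the shape
    for l, l2 in symmetric_pairs:
        x[("R", l, l2)] = 1                # symmetry relation observed
    return x


# Hypothetical creature with one torso and two arms of the same style.
shape = [Component("torso", 0), Component("arm", 1), Component("arm", 1)]
x = encode_shape(shape, {("arm", "torso")}, {("arm", "torso")},
                 ["torso", "arm", "wing"])
```

Under these assumptions, each shape in the repository yields one such dictionary, and the collection of dictionaries plays the role of the training samples drawn from $P(X)$.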

The probabilistic model encodes the joint distribution $P(X)$. The purpose of the model is to support estimation of compatibility between each component from the repository and a given 3D model. The given model is the current 3D model observed during the runtime stage. The given model imposes specific values on some random variables in $X$. For example, if the current model contains a component from category $l$ and style $s$, the corresponding random variables $E_l$ and $S_{s,l}$ are set to 1. Similarly, if a component with label $l$ is adjacent to a component with label $l'$, the corresponding random variable $A_{l,l'}$ is set to 1. Such variables are said to be observed. The rest of the random variables are unobserved. In addition, the cardinality random variables are never observed, since we do not know the final number of components in the assembled shape; although unobserved, cardinality variables are important for querying our model, as explained below.

By performing inference in the probabilistic model, we will compute probability distributions on some of the unobserved variables given the observed ones, and thus estimate the compatibility of various components with the currently assembled shape.
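As a rough illustration of how the current model fixes the observed variables, the sketch below builds the evidence assignment from a partially assembled shape. The function names and the tuple-based variable keys are assumptions made for this example, not part of our implementation.

```python
def build_evidence(current_parts, adjacent_pairs):
    """Build the observed assignment X_e = e for the current model.

    current_parts: list of (label, style) tuples already in the assembly.
    adjacent_pairs: set of (label, label') tuples that touch in the assembly.
    Cardinality variables are deliberately left out: they are never observed.
    """
    evidence = {}
    for label, style in current_parts:
        evidence[("E", label)] = 1          # a part of this category exists
        evidence[("S", style, label)] = 1   # a part of this style exists
    for l, l2 in adjacent_pairs:
        evidence[("A", l, l2)] = 1          # these categories are adjacent
    return evidence


def current_counts(current_parts):
    """Number of parts per category in the current model (n_l in the text)."""
    counts = {}
    for label, _ in current_parts:
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical session: a fuselage with one military-style engine attached.
parts = [("fuselage", 0), ("engine", 2)]
evidence = build_evidence(parts, {("engine", "fuselage")})
counts = current_counts(parts)
```

The counts are kept separately because they enter the query (as lower bounds on the cardinality variables) rather than the evidence.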

Specifically, the compatibility of a component $i$ from category $l_i$ and style $s_i$ is evaluated based on (a) how likely its category $l_i$ is to be adjacent to any of the categories $L' \subseteq L$ existing in the shape, (b) how likely the shape is to include more parts of category $l_i$, and (c) how likely the shape is to include components of style $s_i$. Thus, we need to evaluate, for each component $i$ in the repository, the probability of the following expression:

$$\left( \bigvee_{l' \in L'} (A_{l_i,l'} = 1) \right) \wedge (N_{l_i} > n_{l_i}) \wedge (S_{s_i,l_i} = 1), \tag{1}$$

where $n_{l_i}$ is the number of components from category $l_i$ in the current shape. The probability of this query can be evaluated using the conditional probability distribution $P(X_q \mid X_e = e)$, where $X_q \subseteq X$ is the set of the random variables involved in the above query, $X_e \subseteq X$ is the set of observed random variables, and $e$ are their observed values. We define the compatibility score of component $i$ as

$$\mathrm{comp}(i) = \sum_{q \in Q} P(X_q = q \mid X_e = e), \tag{2}$$

where $Q$ is the set of instantiations of $X_q$ that satisfy Expression 1. The compatibility score measures the suitability of component $i$ for augmenting the currently assembled shape.
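The compatibility score can be approximated from weighted samples of the conditional distribution. The sketch below assumes that such samples are already available as `(assignment, weight)` pairs, that cardinalities are stored as integers 0 to 5 (with 5 standing for 5+), and that adjacency keys are stored in the queried order; all names are hypothetical.

```python
def satisfies_expression_1(sample, label, style, present_labels, n_current):
    """Check the three conjuncts of Expression 1 on one sampled assignment."""
    adjacent = any(sample.get(("A", label, l2), 0) == 1
                   for l2 in present_labels)
    more_parts = sample.get(("N", label), 0) > n_current
    has_style = sample.get(("S", style, label), 0) == 1
    return adjacent and more_parts and has_style


def compatibility(weighted_samples, label, style, present_labels, n_current):
    """Estimate comp(i): normalized weight of samples satisfying Expression 1."""
    total = sum(w for _, w in weighted_samples)
    if total == 0:
        return 0.0
    hit = sum(w for s, w in weighted_samples
              if satisfies_expression_1(s, label, style,
                                        present_labels, n_current))
    return hit / total


# Two hypothetical weighted samples conditioned on an existing fuselage.
samples = [
    ({("A", "wing", "fuselage"): 1, ("N", "wing"): 2, ("S", 0, "wing"): 1}, 0.6),
    ({("A", "wing", "fuselage"): 1, ("N", "wing"): 0, ("S", 0, "wing"): 0}, 0.4),
]
score_no_wings = compatibility(samples, "wing", 0, ["fuselage"], 0)
score_two_wings = compatibility(samples, "wing", 0, ["fuselage"], 2)
```

In this toy setting the score of a wing component drops to zero once two wings are present, since no sample has more than two wings.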

The compatibility score is expressed in terms of the conditional probability distribution $P(X_q \mid X_e = e)$. From the definition of conditional probability, we have:

$$P(X_q \mid X_e = e) = \frac{P(X_q, X_e = e)}{P(X_e = e)}, \tag{3}$$

which can be computed from the joint distribution $P(X)$ by summing over all possible assignments to the unobserved random variables:

$$P(X_q \mid X_e = e) = \frac{\sum_{u} P(X_q, X_e = e, X_u = u)}{\sum_{u,q} P(X_q = q, X_e = e, X_u = u)}, \tag{4}$$

where $X_u = X - X_e - X_q$ is the set of random variables that are neither query nor evidence. However, an explicit summation of this form is generally intractable, since the number of possible assignments is in general exponential in the number of variables. The exponential complexity can be reduced by assuming that all random variables in $X$ are independent. However, this assumption is generally incorrect. For example, if the user’s shape contains a quadruped torso, it is likely that the shape would have other quadruped parts, such as legs, a head, and a tail. The existence, number, adjacencies, and style of these parts are dependent on the existence and style of the torso.
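The explicit summation can be written out directly for a toy problem, which also shows why it does not scale: the loop visits $2^k$ assignments for $k$ unobserved binary variables. The joint distribution and its numbers below are invented for illustration only.

```python
from itertools import product


def conditional(joint, variables, query, evidence):
    """P(query | evidence) by explicit summation over unobserved variables.

    joint: function mapping a full assignment (dict) to its probability.
    The loop enumerates every assignment to the non-evidence variables,
    so its cost grows exponentially with their number.
    """
    hidden = [v for v in variables if v not in evidence]
    numerator = denominator = 0.0
    for values in product([0, 1], repeat=len(hidden)):
        x = dict(evidence)
        x.update(zip(hidden, values))
        p = joint(x)
        denominator += p                      # sums over both u and q
        if all(x[v] == val for v, val in query.items()):
            numerator += p                    # sums over u only
    return numerator / denominator


def toy_joint(x):
    """Hypothetical joint: leg and tail each depend on the torso."""
    p_torso = 0.5
    p_leg = 0.9 if x["torso"] else 0.2
    p_tail = 0.6 if x["torso"] else 0.1
    return (p_torso
            * (p_leg if x["leg"] else 1 - p_leg)
            * (p_tail if x["tail"] else 1 - p_tail))


variables = ["torso", "leg", "tail"]
p_leg_given_torso = conditional(toy_joint, variables, {"leg": 1}, {"torso": 1})
p_torso_given_leg = conditional(toy_joint, variables, {"torso": 1}, {"leg": 1})
```

With three variables the enumeration is trivial, but a network with hundreds of variables would require a number of terms far beyond what can be evaluated interactively.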

Instead of manually specifying which random variables are probabilistically dependent on which others, we learn the factorization of the joint distribution from data. This factorization makes the computation of the compatibility score tractable. The factorized distribution is represented with a directed graphical model called a Bayesian network [Pearl 1988; Koller and Friedman 2009]. A Bayesian network represents the random variables and their conditional dependencies using a directed acyclic graph. Nodes represent random variables and directed edges represent conditional dependencies between the corresponding variables. Each random variable is associated with a probability function that takes as input the set of values of the node’s parent variables and outputs the probability of this random variable given the input values. Since our random variables are discrete, these probability functions are represented as Conditional Probability Tables (CPTs). The procedure for learning the connectivity of the network and the entries of all the CPTs is described in Section 5. Information on the learned Bayesian networks used in our experiments is provided in Table 1.

Inference. Given a learned Bayesian network, the joint distribution can be expressed as

$$P(X) = \prod_{x \in X} P(x \mid \pi(x)),$$

where $\pi(x)$ is the set of parent variables of $x$. Inference in the Bayesian network can be used to answer conditional probability queries such as Equation 3. Since exact inference in probabilistic graphical models is NP-hard, we use approximate inference. We had experimented with loopy belief propagation [Frey and MacKay 1997], but with limited success, as it often returned poor approximations to the query distributions. Instead, we implemented a likelihood-weighted sampling technique that stochastically samples the joint distribution and weighs each sample based on the likelihood of the observed nodes accumulated throughout the process [Fung and Chang 1990].
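A minimal sketch of likelihood-weighted sampling on a three-node network is given below. The network, its CPT entries, and the function names are hypothetical; the sketch only illustrates the general technique and is not the implementation used in our experiments.

```python
import random

# Each node maps to (parents, cpt); cpt maps parent values to P(node = 1).
NETWORK = {
    "torso": ((), {(): 0.5}),
    "leg": (("torso",), {(0,): 0.2, (1,): 0.9}),
    "tail": (("torso",), {(0,): 0.1, (1,): 0.6}),
}
ORDER = ["torso", "leg", "tail"]   # a topological order of the graph


def weighted_sample(evidence, rng):
    """Draw one sample; evidence nodes are clamped and contribute a weight."""
    x, weight = {}, 1.0
    for node in ORDER:
        parents, cpt = NETWORK[node]
        p1 = cpt[tuple(x[p] for p in parents)]
        if node in evidence:
            x[node] = evidence[node]
            weight *= p1 if x[node] == 1 else 1.0 - p1
        else:
            x[node] = 1 if rng.random() < p1 else 0
    return x, weight


def estimate(query, evidence, n_samples=20000, seed=0):
    """Estimate P(query | evidence) as a ratio of accumulated weights."""
    rng = random.Random(seed)
    hit = total = 0.0
    for _ in range(n_samples):
        x, w = weighted_sample(evidence, rng)
        total += w
        if all(x[v] == val for v, val in query.items()):
            hit += w
    return hit / total


p_torso_given_leg = estimate({"torso": 1}, {"leg": 1})
```

For this toy network the exact answer is $0.45 / 0.55 \approx 0.82$, so the estimate can be checked against explicit enumeration; on the full network only the sampled estimate is practical.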

Category ranking. While the compatibility score determines the ranked order of individual components, the part categories also need to be ranked as a whole, so that category labels can be presented in order of relevance by the interface. We achieve this by evaluating a compatibility score similar to Equation 2 that ignores the part-specific style term. The score of a part category $l$ is defined as

$$\mathrm{comp}(l) = \sum_{q \in Q'} P(X_q = q \mid X_e = e), \tag{5}$$

where $Q'$ is the set of instantiations of $X_q$ that satisfy

$$\left( \bigvee_{l' \in L'} (A_{l,l'} = 1) \right) \wedge (N_l > n_l), \tag{6}$$

where $L' \subseteq L$ is the set of categories of components in the current shape, and $n_l$ is the number of existing components from category $l$.

Initial ranking. The assembled shape is initially empty. Therefore none of the random variables in $X$ are observed initially. In this case, we rank the parts and part categories by querying the prior probabilities $P(E_{l_i} = 1, S_{s_i,l_i} = 1)$ for each part $i$ with label $l_i$ and style $s_i$, and $P(E_l = 1)$ for each label $l$. The resulting category ranking is determined by the frequency of occurrence of the categories in the repository: the most popular categories appear first. For example, for a repository of creatures, the top categories are torsos, legs, and heads. Similarly, the initial ranking of parts is derived from the frequency of occurrence of style clusters in the repository.
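Since the initial ranking reduces to frequencies of occurrence, it can be illustrated without any inference. The sketch below estimates the priors by counting over a small, invented repository; using empirical frequencies as a stand-in for queries to the learned network is an assumption of this example, as are all names.

```python
from collections import Counter


def initial_rankings(shapes):
    """Rank categories and (label, style) clusters by how many shapes use them.

    shapes: list of shapes, each a list of (label, style) tuples.
    A category or cluster is counted at most once per shape, matching the
    binary existence and style variables. Ties are broken alphabetically.
    """
    label_freq, style_freq = Counter(), Counter()
    for parts in shapes:
        label_freq.update({label for label, _ in parts})
        style_freq.update(set(parts))
    categories = sorted(label_freq, key=lambda l: (-label_freq[l], l))
    clusters = sorted(style_freq, key=lambda ls: (-style_freq[ls], ls))
    return categories, clusters


# Hypothetical repository of three creatures.
repository = [
    [("torso", 0), ("leg", 0), ("leg", 0)],
    [("torso", 0), ("head", 1)],
    [("torso", 1), ("leg", 0)],
]
categories, clusters = initial_rankings(repository)
```

Here torsos appear in every shape and are therefore listed first, followed by legs and heads.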

Symmetry. The probabilistic model also provides information on whether a chosen component should have a symmetric counterpart. Our implementation supports planar reflective symmetry, which is among the most prevalent in real-world objects [Podolak et al. 2006]. For a chosen component $i$ with label $l_i$, we evaluate, for each observed category $l'$ in the current shape, the probability $P(R_{l_i,l'} = 1 \mid X_e = e)$, and take the observed category $l_{\max}$ that maximizes this probability. If this maximal probability is greater than 0.5, the interface generates the symmetric counterpart of $i$ about the symmetry plane of the component with label $l_{\max}$. If there are multiple such components, we select the largest one. If there are multiple symmetry planes for that component, we select the one that maximizes the projection of the assembled shape onto it. Symmetry planes are computed using the approach of Simari et al. [2006].
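The decision rule for generating a symmetric counterpart can be sketched as follows. The input dictionary is assumed to hold the inferred probabilities $P(R_{l_i,l'} = 1 \mid X_e = e)$ for each observed category; the function name and the example values are hypothetical, and the selection of the largest component and of the symmetry plane is not covered.

```python
def symmetry_reference(prob_symmetric, threshold=0.5):
    """Return the observed category l_max to mirror about, or None.

    prob_symmetric: dict mapping each observed category l' to the inferred
    probability that the chosen part is symmetric with respect to l'.
    A counterpart is generated only if the best probability exceeds 0.5.
    """
    if not prob_symmetric:
        return None
    l_max = max(prob_symmetric, key=prob_symmetric.get)
    return l_max if prob_symmetric[l_max] > threshold else None


# Hypothetical probabilities for an arm dragged onto a creature.
reference = symmetry_reference({"torso": 0.8, "head": 0.3})
```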

5 Training

We now describe the offline procedure for learning the entries of the conditional probability tables for the Bayesian network, as well as the structure of the network. The input to this procedure is a set of shapes from a repository. We assume that a small set of manually segmented and labeled models is provided as a training set for a supervised segmentation and labeling procedure that propagates the segmentations and labelings to the rest of the dataset [Kalogerakis et al. 2010]. The sizes of the training sets used in our experiments are reported in Table 1.

We modified the segmentation and labeling algorithm of Kalogerakis et al. [2010] to support hierarchical labeling. For each class of shapes, the modified algorithm learns a conditional random field (CRF) at each level of the segmentation hierarchy, and generates a sequence of increasingly refined segmentations of each shape by successively applying the CRFs in order. The set of geometric descriptors is the same as in the original algorithm. The geometric descriptors are computed at surface samples rather than mesh faces to accommodate polygon soups. A JointBoost classifier [Torralba et al. 2007] is used to output the probability distribution of labels over each sample. Then the distribution is assigned to each mesh face, by averaging the probabilities of labels over its nearest point samples. The unary and pairwise terms of the CRFs are defined as in the original algorithm.

[Figure 3 plot. Horizontal axis: first principal component; vertical axis: second principal component.]

Figure 3: Clusters of head parts, based on a Gaussian mixture model. The feature space is visualized by projection onto the two principal axes.

Given a segmented and labeled dataset, the method performs clustering to identify groups of geometrically similar parts within each part category. The method then learns the structure and parameters of the Bayesian network. The balance of this section describes these two steps in more detail.

Style clustering. For each component $i$ we compute a geometric feature vector $z_i$. In order to capture a variety of geometric attributes, this feature vector incorporates shape diameter, curvature, shape context, and PCA descriptors (Appendix A). The clustering is based on a Gaussian mixture model in the feature space of $z$, so that the mixture is a superposition of Gaussian distributions that have their own covariance structure. The number of Gaussians of the mixture is estimated by penalizing the complexity of the fitted mixture model. Specifically, for each part category $l$, we maximize the following objective function with respect to the number and parameters of the mixture model:

$$J_l = \sum_{i=1}^{m_l} \log \left( \sum_{s=1}^{k_l} w_{s,l} \cdot \mathcal{N}(z_i; \mu_{s,l}, \Sigma_{s,l}) \right) - \frac{1}{2} P \log(m_l), \tag{7}$$

where $m_l$ is the number of parts in category $l$, $k_l$ is the number of Gaussians of the mixture for $l$, $\mathcal{N}(z; \mu_{s,l}, \Sigma_{s,l})$ is a Gaussian with mean $\mu_{s,l}$ and covariance matrix $\Sigma_{s,l}$, $w_{s,l}$ is the corresponding mixing coefficient, and $P$ is the number of parameters in the mixture model. The second term penalizes model complexity based on the Bayesian Information Criterion [Schwarz 1978].

The objective $J_l$ is maximized as follows. Starting from $k_l = 2$, we run expectation-maximization (EM) to estimate the parameters $w_{s,l}$, $\mu_{s,l}$, and $\Sigma_{s,l}$. If the number of parts $m_l$ is less than twice the number of parameters, we restrict the covariance matrix to be diagonal to prevent overfitting. EM is initialized with centers computed by $k$-means. We run EM separately for $k_l = 3, 4, \ldots$. If there is no increase in the objective function for 5 successive repetitions, we stop and accept the fitting computed by the EM iteration with the highest objective function value. The clusters are defined as follows: each part $i$ is assigned to the cluster $s$ that corresponds to the Gaussian that maximizes $w_{s,l} \cdot \mathcal{N}(z_i; \mu_{s,l}, \Sigma_{s,l})$. Figure 3 illustrates the clustering of some of the head components from the creature dataset.
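The penalized objective and the hard cluster assignment can be illustrated on one-dimensional features. The sketch below assumes scalar features and scalar variances, takes the mixture parameters as given rather than fitting them with EM, and uses invented data; it shows only how the objective trades likelihood against the number of parameters.

```python
import math


def gaussian(z, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var at z."""
    return math.exp(-0.5 * (z - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def objective(data, mixture):
    """Penalized log-likelihood: log-likelihood minus 0.5 * P * log(m).

    mixture: list of (weight, mean, variance) triples.
    In one dimension P = 3k - 1: k means, k variances, k - 1 free weights.
    """
    m = len(data)
    log_lik = sum(
        math.log(sum(w * gaussian(z, mu, var) for w, mu, var in mixture))
        for z in data)
    n_params = 3 * len(mixture) - 1
    return log_lik - 0.5 * n_params * math.log(m)


def assign(z, mixture):
    """Index of the Gaussian that maximizes weight times density at z."""
    return max(range(len(mixture)),
               key=lambda s: mixture[s][0] * gaussian(z, *mixture[s][1:]))


# Hypothetical 1-D features forming two well-separated groups.
data = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
one_cluster = [(1.0, 2.5, 6.26)]
two_clusters = [(0.5, 0.0, 0.01), (0.5, 5.0, 0.01)]
j_one = objective(data, one_cluster)
j_two = objective(data, two_clusters)
```

For this data the two-component mixture attains the higher objective despite its larger penalty, so it would be preferred; the full procedure additionally fits each candidate mixture with EM in the multi-dimensional feature space.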

Bayesian network learning. Given training data $D$, our method learns the graph structure $G$ and parameters $\Theta$ of the Bayesian network by maximizing the Bayesian Information Criterion score

$$\log P(D \mid G) \approx \log P(D \mid G, \Theta) - \frac{1}{2} v \log(n), \tag{8}$$

where $v$ is the number of independent parameters in the network and $n$ is the number of shapes in the dataset. The first term is the log-likelihood of the data under the parameters $\Theta$ and the second term penalizes model complexity.
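For binary variables the score can be computed directly from counts. The following sketch assumes all variables are binary, uses maximum-likelihood CPT entries for $\Theta$, and works on an invented dataset; the function name and data layout are hypothetical.

```python
import math
from collections import Counter


def bic_score(data, parents):
    """BIC score of a network structure over binary variables.

    data: list of dicts {variable: 0 or 1}, one per shape.
    parents: dict mapping each variable to a tuple of its parent variables.
    CPT entries are set to their maximum-likelihood estimates from counts.
    """
    n = len(data)
    log_lik, v = 0.0, 0
    for var, pa in parents.items():
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        marginal = Counter(tuple(row[p] for p in pa) for row in data)
        for (pa_values, _), count in joint.items():
            log_lik += count * math.log(count / marginal[pa_values])
        v += 2 ** len(pa)   # one free parameter per parent configuration
    return log_lik - 0.5 * v * math.log(n)


# Hypothetical data in which legs occur exactly when a torso occurs.
data = [{"torso": 1, "leg": 1}] * 20 + [{"torso": 0, "leg": 0}] * 20
with_edge = bic_score(data, {"torso": (), "leg": ("torso",)})
without_edge = bic_score(data, {"torso": (), "leg": ()})
```

In this example the structure with the edge from torso to leg scores higher, because the gain in likelihood outweighs the penalty for the extra parameter.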

Structure learning is NP-hard and exact algorithms are super-exponential in the number of variables. Following Merrell et al. [2010], we use a local search heuristic that explores the space of network structures by adding, removing, and flipping edges.
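One simple form of such a local search is greedy hill climbing over acyclic edge sets. The sketch below is a generic illustration under that assumption; the exact heuristic, move ordering, and stopping rule used in our experiments are not specified here, and the score function in the example is a stand-in rather than the Bayesian Information Criterion.

```python
from itertools import permutations


def is_acyclic(variables, edges):
    """Check acyclicity by repeatedly removing nodes without incoming edges."""
    remaining, edges = set(variables), set(edges)
    while remaining:
        sources = {v for v in remaining if not any(b == v for _, b in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(a, b) for a, b in edges if a not in sources}
    return True


def neighbors(variables, edges):
    """Yield every acyclic graph one edge addition, removal, or flip away."""
    for a, b in permutations(variables, 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                    # remove the edge
            flipped = (edges - {(a, b)}) | {(b, a)}   # flip the edge
            if is_acyclic(variables, flipped):
                yield flipped
        elif (b, a) not in edges:
            added = edges | {(a, b)}                  # add the edge
            if is_acyclic(variables, added):
                yield added


def hill_climb(variables, score, edges=frozenset()):
    """Apply improving moves until no neighbor has a higher score."""
    edges = frozenset(edges)
    best = score(edges)
    improved = True
    while improved:
        improved = False
        for candidate in neighbors(variables, edges):
            s = score(candidate)
            if s > best:
                edges, best, improved = frozenset(candidate), s, True
                break
    return edges


# Stand-in score: negative distance to a hypothetical target structure.
target = {("a", "b"), ("b", "c")}
learned = hill_climb(["a", "b", "c"], lambda e: -len(set(e) ^ target))
```

Because each accepted move strictly increases the score and the number of graphs is finite, the search terminates, although in general only at a local optimum.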

To assist the structure learning procedure, we enforce the existence of a set of edges that represent obvious conditional dependencies. For each category $l$, we establish links from the existential node $E_l$ to the cardinality node $N_l$, as well as to the style nodes $\{S_{s,l}\}_{s \in S_l}$ and the symmetry and adjacency nodes involving this category. If category $l$ has subcategories, we also establish links to the existential nodes of the subcategories.
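The enforced edges for one category can be enumerated mechanically. The sketch below assumes tuple-based node names and takes the relevant adjacency and symmetry nodes as an explicit list; both conventions are hypothetical.

```python
def forced_edges(label, styles, relation_nodes, subcategories):
    """Edges leaving the existence node E_l that are fixed before the search.

    styles: style cluster indices of the category.
    relation_nodes: adjacency and symmetry nodes that involve the category.
    subcategories: labels of the category's subcategories, if any.
    """
    e = ("E", label)
    edges = [(e, ("N", label))]                          # cardinality node
    edges += [(e, ("S", s, label)) for s in styles]      # style nodes
    edges += [(e, node) for node in relation_nodes]      # A and R nodes
    edges += [(e, ("E", sub)) for sub in subcategories]  # subcategory nodes
    return edges


# Hypothetical torso category with two styles and two subcategories.
edges = forced_edges(
    "torso", [0, 1],
    [("A", "arm", "torso"), ("R", "arm", "torso")],
    ["upper torso", "lower torso"])
```

These edges would be kept fixed while the local search adds, removes, and flips the remaining ones.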

6 Modeling Interface

To evaluate the presented approach to suggestion generation, we have implemented an assembly-based 3D modeling interface. Categories of suggested parts are shown on the left, with individual parts arranged within categories (Figure 4). The categories and the parts within each category are ordered based on their compatibility with the current shape, as computed by the probabilistic model. The order of the categories and parts is automatically updated when the composition of the assembled shape changes. The user can select a part and drag it onto the assembly area on the right, where it is snapped and glued to the current shape [Schmidt and Singh 2010].

After adding a component, the user can further adjust its position, orientation, and scale. During these transformations, glued connections are maintained so that the part slides, rotates and scales on the surface of the rest of the assembly. This reduces the need for painstaking 6-DOF manipulation of parts. This snap-dragging can be disabled when desired, which is useful for assemblies with narrow, irregular or hard-edged boundaries.

When the user drags in a part that has a symmetric counterpart, the counterpart is automatically generated and transformed with respect to the corresponding symmetry plane. This makes it easier to position pairs of legs, arms, wings, and wheels. The user can delink the symmetric pair and position each part individually. Parts can also be duplicated, reflected and deleted. To support confident exploration, the interface provides unlimited undo/redo functionality. The interface also provides a search box that supports textual search for category labels, but participants in our experiments almost never used this functionality, preferring visual exploration of part categories.

The modeling interface is further demonstrated in the supplementary video. In our experiments, new users became proficient with the tool after 5 minutes of demonstration and 5 additional minutes of hands-on exploration. All models in this paper were created after this minimal training period.

Figure 4: Modeling interface. The model is assembled from presented parts.

7 Evaluation

In this section, we describe an experimental evaluation of the relevance of the components presented by the probabilistic model. The goal of the experiments was to evaluate the relevance of the presented components during open-ended 3D modeling.

7.1 Experimental Setup

The presentation of components using the probabilistic model was compared to two alternative approaches. The first alternative was a static ordering of categories and parts. The ordering is based on the prior probabilities of categories and style clusters in the training data. Thus more frequently used categories and geometric styles appear first. Static ordering is the approach used in the Spore creature creator [Maxis Software 2008], where the order of presented categories and components does not adapt to the current model. The second alternative was the purely geometric suggestion generation approach of Chaudhuri and Koltun [2010]. We have modified the published technique to use the same segmentations as in the other two conditions, thus the set of components was identical and only the order of presentation varied across conditions. The same modeling interface was used in all conditions, so the components suggested by the approach of Chaudhuri and Koltun [2010] were also presented by category.

To evaluate the relevance of components presented by the three approaches, we recruited 42 volunteers from the student body of a computer science department in a research university. The students were recruited through departmental mailing lists. Most of the participants had little or no prior exposure to 3D modeling. Each participant was given a 5-10 minute orientation and was then asked to perform four modeling tasks. The tasks were of two types:

Toy. A neighborhood toy company asked you to design a children's toy that they will manufacture. Create such a toy using the provided application.

Creature. A neighborhood game company asked you to design a large number of creatures for an upcoming fantasy game. Create one such creature using the provided application.

For the Toy assignment, the tool used a large model library with 491 models obtained from the Digimation ModelBank and Archive collections. The library included models of creatures, aircraft, watercraft, and furniture. For the Creature assignment, the tool used a smaller library with 173 creature models. Table 1 provides further information on the two datasets.

[Figure 5 plots. Curves: static ordering, probabilistic model, geometric approach. Left panels: % of chosen categories vs. category ranking. Right panels: % of chosen components vs. component ranking.]

Figure 5: Results for the Toy task (top) and Creature task (bottom). The plots on the left show the cumulative distribution of categories from which parts were chosen, as a function of the ranking of the category at the time of selection. The plots on the right show the cumulative distribution of components used by participants, as a function of the ranking of the component within its category at the time of selection. The probabilistic model presented more relevant categories and components than the static ordering or the geometric approach of Chaudhuri and Koltun [2010].

                                    Generic   Creature
# of database shapes                    491        173
# of hand-segmented shapes              180         64
# of categories                          62         24
# of unique components                 3702       1598
# of style clusters                     248        115
# of nodes in Bayesian network          595        265
# of edges in Bayesian network         1224        632

Table 1: Datasets used in the evaluation.

Each participant performed two toy tasks and two creature tasks. Each task was performed in a randomly chosen condition: components presented by the probabilistic model, static ordering, or components suggested by the approach of Chaudhuri and Koltun. The modeling interface was identical in all conditions. The conditions were not disclosed to the participants, although some differences were apparent: in the static condition the order of the presented components never changed, and in the geometric condition the suggestion generation process took much longer due to the intensive computational demands of that technique. Several participants performed fewer than four tasks due to time constraints. In total, 141 modeling tasks were completed. The average modeling time per session with the probabilistic model was 21.4 minutes, the median was 19 minutes, the shortest was 7 minutes, and the longest was 56 minutes. The average modeling time was 25.4 minutes with the static ordering and 34.3 minutes with the approach of Chaudhuri and Koltun.

[Figure 6 plots. Bars: static ordering, probabilistic model, geometric approach. Top: % of created models vs. number of components (1-4, 5-8, 9-12, 13-16, 17+). Bottom: % of created models vs. number of source models (1, 2-4, 5-7, 8-10, 11+).]

Figure 6: Number of components used in a single assembled model (top) and number of library models that the employed components originate from (bottom).

The speed of component presentation by the probabilistic model compared favorably with the approach of Chaudhuri and Koltun. We ran parallel modeling sessions during the evaluation, using an eight-core 2.53 GHz Xeon workstation and a quad-core 3.2 GHz Core i7 workstation. Inference in the Bayesian network used a single core and completed in 2.8 seconds on average. The approach of Chaudhuri and Koltun used all available cores and required roughly 90 seconds to generate suggestions.

7.2 Results

To evaluate the relevance of components presented by the three approaches, we analyzed logs generated by the application for all modeling sessions. Relevance was evaluated both for the ordering of categories and for the ordering of parts within categories. Figure 5 (left) presents the cumulative distribution of categories from which parts were chosen, as a function of the ranking of the category at the time of selection. Thus for the Toy task, 64.6% of the chosen components were selected from the first 2 categories in the probabilistic model condition, compared to 23.5% in the static condition and 21.3% in the geometric condition. For the Creature task, 75.05% of the components were chosen from the first 2 categories presented by the probabilistic model, compared to 43.9% in the static condition and 46.4% in the geometric condition. Figure 5 (right) presents the cumulative distribution of the components used by participants, as a function of the ranking of the component within its category at the time of selection. For both tasks, the probabilistic model outperformed the static ordering and the geometric approach.
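The cumulative distributions described above can be computed from selection logs along the following lines. This is a minimal sketch: it assumes each log entry has already been reduced to the 1-based rank of the chosen category (or component) at the time of selection, and the function name `cumulative_rank_distribution` is hypothetical rather than part of the evaluated tool.

```python
def cumulative_rank_distribution(selected_ranks, max_rank):
    """Percentage of selections whose rank was at most k, for k = 1..max_rank.

    selected_ranks: list of 1-based ranks, one per logged selection
    (assumed log format). Entry k-1 of the result is the percentage of
    selections made from the first k presented items.
    """
    if not selected_ranks:
        raise ValueError("at least one logged selection is required")
    total = len(selected_ranks)
    distribution = []
    for k in range(1, max_rank + 1):
        hits = sum(1 for rank in selected_ranks if rank <= k)
        distribution.append(100.0 * hits / total)
    return distribution


if __name__ == "__main__":
    # Four selections: one from the top-ranked item, two from the second, one from the fifth.
    print(cumulative_rank_distribution([1, 2, 2, 5], max_rank=5))
```

Reading off the second entry of such a curve gives the share of selections made from the first 2 presented categories, which is the kind of figure quoted for each condition.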

Figure 6 (top) shows the distribution of the number of components used by participants in their models. On average, participants used 9.9 components for 3D models created with the probabilistic model. Figure 6 (bottom) shows the distribution of the number of library models that served as sources of components for each assembled 3D model. On average in the probabilistic model condition, the components used in a single assembled model originate from 5.7 library models. The average number of components per assembled model and the average number of source library models per assembled model were similar across the three conditions.

Figures 1 and 7 show 3D models created by participants with the probabilistic model. The models were created entirely with the assembly-based 3D modeling interface.

[Figure 7 images. Per-model modeling times labeled in the figure range from 12 to 56 minutes.]

Figure 7: Top: models created by participants with our assembly-based 3D modeling tool. Bottom: models shown above and in Figure 1, with individual components and modeling times.

8 Discussion

We have described a probabilistic reasoning approach to the presentation of components in assembly-based 3D modeling. Our approach learns a probabilistic graphical model that encodes conditional dependencies between shape components. The model operates on both semantics and geometric style. During an interactive modeling session, inference in the probabilistic model is used to present relevant components.

The probabilistic model can be augmented in a number of ways. The model represents conditional dependencies in a directed graph. This captures directed dependencies in the data: when a fuselage from a commercial jet is added, it "activates" the existence of two wings and their adjacency to the fuselage, as well as the existence of two or four engines adjacent to the wings. However, certain relationships can be alternatively modeled by undirected links, which are more appropriate for representing mutual constraints; for example, a fuselage from a commercial jet is unlikely to co-exist with missiles. Future work could investigate the use of undirected graphical models, such as Markov random fields, or partially directed models such as chain graphs. In addition, random variables that represent existence information could be omitted in favor of a tree-based representation for the CPDs at the cardinality variables. Also, the presented model uses discrete variables, which simplify both learning and inference; however, geometric style can be more precisely modeled with continuous variables, which can further increase the relevance of presented parts. The model can also be augmented to support a greater variety of symmetries, as well as spatial and functional relationships between components.

Assembly-based 3D modeling can also benefit from developments in consistent shape segmentation [Golovinskiy and Funkhouser 2009; Kalogerakis et al. 2010], from improved techniques for gluing and cutting components [Sharf et al. 2006; Schmidt and Singh 2010], as well as from integration with sketch-based interfaces for manipulating detailed geometry [Nealen et al. 2005]. This will further advance the accessibility of three-dimensional content creation.

Acknowledgments

We are grateful to Aaron Hertzmann, Sergey Levine, Jonathan Laserson, Suchi Saria, and Philipp Krahenbuhl for their comments on this paper, and Daphne Koller for helpful discussions. Chris Platz and Hadidjah Chamberlin assisted in the preparation of figures and the supplementary video. Niels Joubert narrated the video. This work was supported in part by NSF grants SES-0835601, CCF-0641402, and FODAVA-0808515, and by KAUST Global Collaborative Research.

References

CHAUDHURI, S., AND KOLTUN, V. 2010. Data-driven suggestions for creativity support in 3d modeling. In Proc. SIGGRAPH Asia, ACM.

EPIC GAMES. 2011. Unreal Development Kit. http://www.udk.com.

FISHER, M., AND HANRAHAN, P. 2010. Context-based search for 3d models. In Proc. SIGGRAPH Asia, ACM.

FREY, B., AND MACKAY, D. 1997. A revolution: belief propagation in graphs with cycles. In Advances in Neural Information Processing Systems, 479–485.

FUNG, R. M., AND CHANG, K.-C. 1990. Weighting and integrating evidence for stochastic simulation in Bayesian networks. In Proc. Conference on Uncertainty in Artificial Intelligence, North-Holland Publishing Co.

FUNKHOUSER, T., KAZHDAN, M., SHILANE, P., MIN, P., KIEFER, W., TAL, A., RUSINKIEWICZ, S., AND DOBKIN, D. 2004. Modeling by example. In Proc. SIGGRAPH, ACM.

GOLOVINSKIY, A., AND FUNKHOUSER, T. 2009. Consistent segmentation of 3d models. Computers & Graphics 33, 3, 262–269.

KALOGERAKIS, E., HERTZMANN, A., AND SINGH, K. 2010. Learning 3D mesh segmentation and labeling. In Proc. SIGGRAPH, ACM.

KOLLER, D., AND FRIEDMAN, N. 2009. Probabilistic Graphical Models: Principles and Techniques. The MIT Press.

KRAEVOY, V., JULIUS, D., AND SHEFFER, A. 2007. Model composition from interchangeable components. In Proc. Pacific Graphics, IEEE Computer Society.

LEE, J., AND FUNKHOUSER, T. 2008. Sketch-based search and composition of 3d models. In Proc. Eurographics Workshop on Sketch-Based Interfaces and Modeling.

MAXIS SOFTWARE. 2008. Spore. http://www.spore.com.

MERRELL, P., SCHKUFZA, E., AND KOLTUN, V. 2010. Computer-generated residential building layouts. In Proc. SIGGRAPH Asia, ACM.

MORI, G., BELONGIE, S., AND MALIK, J. 2001. Shape contexts enable efficient retrieval of similar shapes. In Proc. IEEE Computer Vision and Pattern Recognition, vol. 1, 723–730.

NEALEN, A., SORKINE, O., ALEXA, M., AND COHEN-OR, D. 2005. A sketch-based interface for detail-preserving mesh editing. In Proc. SIGGRAPH, ACM.

PEARL, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.

PODOLAK, J., SHILANE, P., GOLOVINSKIY, A., RUSINKIEWICZ, S., AND FUNKHOUSER, T. 2006. A planar-reflective symmetry transform for 3D shapes. In Proc. SIGGRAPH, ACM.

SCHMIDT, R., AND SINGH, K. 2010. Drag, drop, and clone: An interactive interface for surface composition. Tech. Rep. CSRG-611, Department of Computer Science, University of Toronto.

SCHWARZ, G. 1978. Estimating the dimension of a model. The Annals of Statistics, 2, 461–464.

SHAPIRA, L., SHALOM, S., SHAMIR, A., ZHANG, R. H., AND COHEN-OR, D. 2009. Contextual part analogies in 3D objects. International Journal of Computer Vision 89, 2-3, 309–326.

SHARF, A., BLUMENKRANTS, M., SHAMIR, A., AND COHEN-OR, D. 2006. SnapPaste: an interactive technique for easy mesh composition. Visual Computer 22, 9, 835–844.

SHIN, H., AND IGARASHI, T. 2007. Magic canvas: interactive design of a 3d scene prototype from freehand sketches. In Proc. Graphics Interface.

SIMARI, P., KALOGERAKIS, E., AND SINGH, K. 2006. Folding meshes: hierarchical mesh segmentation based on planar symmetry. In Proc. Eurographics Symposium on Geometry Processing.

SPROULL, R. F. 1990. Parts of the frontier are hard to move. Computer Graphics 24, 2, 9.

TALTON, J. O., GIBSON, D., YANG, L., HANRAHAN, P., AND KOLTUN, V. 2009. Exploratory modeling with collaborative design spaces. In Proc. SIGGRAPH Asia, ACM.

TORRALBA, A., MURPHY, K. P., AND FREEMAN, W. T. 2007. Sharing visual features for multiclass and multiview object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 29, 5, 854–869.

UNITY TECHNOLOGIES. 2011. Unity. http://unity3d.com.

A Geometric Descriptors for Components

For each shape component $C$ from source mesh $M$ in the repository, we extract a $(30 + |L|)$-dimensional feature vector containing: a) mean and variance of the mean curvature over the surface of $C$ (the curvature is estimated at multiple scales over neighborhoods of point samples of increasing radii: 1%, 2%, 5% and 10% relative to the median geodesic distance between all pairs of point samples on the surface of $M$); b) mean and variance of the shape diameter over the surface of $C$, and of its logarithmized versions w.r.t. normalizing parameters 1, 2, 4 and 8; c) the following entries, derived from the singular values $\{s_1, s_2, s_3\}$ of the covariance matrix of sample positions on the surface of $C$: $s_1/\sum_i s_i$, $s_2/\sum_i s_i$, $s_3/\sum_i s_i$, $(s_1+s_2)/\sum_i s_i$, $(s_1+s_3)/\sum_i s_i$, $(s_2+s_3)/\sum_i s_i$, $s_1/s_2$, $s_1/s_3$, $s_2/s_3$, $s_1/s_2 + s_1/s_3$, $s_1/s_2 + s_2/s_3$, $s_1/s_3 + s_2/s_3$; d) surface area and volume of $C$, divided by total surface area and volume of $M$; e) for each label $l \in L$, the average geodesic distance between points on $C$ and points on $M$ with label $l$, measuring the proximity of $C$ to siblings with label $l$. Finally, we denoise the data by retaining only the principal components capturing 70% of the variance in the data.
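The twelve entries in item c) can be written out explicitly as in the sketch below. It assumes the three singular values have already been computed elsewhere (for example with a standard SVD routine) and are strictly positive; the function name `singular_value_features` is illustrative and not taken from the paper.

```python
def singular_value_features(s1, s2, s3):
    """Return the twelve ratio features of item c) for singular values s1, s2, s3.

    Assumes all three values are strictly positive, so every ratio is defined.
    """
    if min(s1, s2, s3) <= 0:
        raise ValueError("singular values must be strictly positive")
    total = s1 + s2 + s3
    return [
        # Each value, and each pair of values, normalized by the sum.
        s1 / total,
        s2 / total,
        s3 / total,
        (s1 + s2) / total,
        (s1 + s3) / total,
        (s2 + s3) / total,
        # Pairwise ratios.
        s1 / s2,
        s1 / s3,
        s2 / s3,
        # Sums of pairwise ratios.
        s1 / s2 + s1 / s3,
        s1 / s2 + s2 / s3,
        s1 / s3 + s2 / s3,
    ]


if __name__ == "__main__":
    print(singular_value_features(4.0, 2.0, 1.0))
```

These ratios are unchanged when all three singular values are multiplied by a common factor, so they describe the proportions of a component (elongated, flat, or compact) independently of its overall size.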
