

TAXONOMY FOR 3D CONTENT-BASED OBJECT RETRIEVAL METHODS

IJRRAS 14 (2) ● February 2013 ● www.arpapress.com/Volumes/Vol14Issue2/IJRRAS_14_2_20.pdf

Hanan ElNaghy¹, Safwat Hamad² & M. Essam Khalifa³
¹Teaching Assistant, ²Assistant Professor, ³Professor
Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
¹[email protected], ²[email protected], ³[email protected]

ABSTRACT

The use of three-dimensional (3D) image and model databases on the Internet is growing in both number and size. The emergence of 3D media is directly related to the emergence of 3D acquisition technologies. Indeed, recent advances in 3D scanner acquisition and 3D graphics rendering technologies have boosted the creation of 3D model archives for several application domains. The development of efficient search mechanisms is therefore required for the effective retrieval of 3D objects from large repositories. Over recent years, a large number of competing techniques have been developed for content-based retrieval of 3D objects, and consequently more work is needed on surveying and comparing these techniques. In this paper, we provide a comprehensive survey of recent methods for content-based 3D object retrieval, guided by a proposed taxonomy based on different shape descriptors. Finally, a performance comparison is carried out among selected methods based on qualitative measures.

Keywords: 3D Model Retrieval, 3D Object Retrieval, 3D Shape Matching, 3D Shape Description, Content-Based Similarity Search, Shape Analysis.

1. INTRODUCTION

The number of 3D geometric models available in online repositories is growing dramatically. Examples include the Protein Data Bank [1], which stores the 3D atomic coordinates for 29,000 protein molecules; the National Design Repository [2], which stores 3D Computer-Aided Design (CAD) models for tens of thousands of mechanical parts; and the Princeton Shape Database [3], which stores polygonal surface models for 36,000 everyday objects crawled from the Web. Since graphics hardware is getting faster and 3D scanning hardware is getting cheaper, there is every reason to believe that demand for and supply of 3D models will continue to increase, leading to an online environment in which 3D models are as plentiful as images, videos, and audio files today.

Large digital repositories of 3D models help create demand for search engines that are able to retrieve the data of interest and for data mining algorithms to discover relationships among them. Text annotation is almost always helpful, but content-based methods are often required to discover novel geometric relationships in the data. For example, a mechanical engineer might use a search engine to find a particular CAD model in a parts catalogue based on its 3D shape characteristics; a doctor might use an automatic classification system to aid diagnosis of a disease from the shapes of afflicted organs; and a palaeontologist might use shape analysis to link similar 3D models scanned from animal skeletons of different species.

3D object retrieval is the process in which 3D objects are retrieved from a database based on an object query using some measure of similarity. Matching is the process of determining how similar two shapes are, often by computing a distance. A complementary process is indexing: building a data structure to speed up the search. Note that "indexing" is also used as a term for the identification of features in models, or in multimedia documents in general. Retrieval is the process of searching and delivering the query results. Matching and indexing are often part of the retrieval process. At a conceptual level, a typical 3D object retrieval framework, as illustrated in Figure 1, consists of a database with an index structure created offline and an online query engine. Each 3D model has to be identified with a shape descriptor, providing a compact overall description of the shape. To efficiently search a large collection online, an indexing data structure and searching algorithm should be available. The online query engine computes the query descriptor, and models similar to the query model are retrieved by matching their descriptors to the query descriptor through the index structure of the database. The similarity between two descriptors is quantified by a dissimilarity measure. Finally, the retrieved models can be visualized.
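At this level of abstraction, the offline-index/online-query framework can be sketched as a toy engine. This is a hypothetical illustration, not a system from the paper: the class and method names are invented, and plain Euclidean distance stands in for whatever dissimilarity measure a concrete method defines.

```python
import numpy as np

class RetrievalEngine:
    """Toy sketch of the generic framework of Figure 1: descriptors are
    extracted offline and stored in an index; at query time the query
    descriptor is matched against the index with a dissimilarity measure
    (Euclidean distance here, purely as an example)."""

    def __init__(self):
        self.index = {}  # model id -> descriptor vector (the "index structure")

    def add_model(self, model_id, descriptor):
        # Offline stage: store the precomputed shape descriptor.
        self.index[model_id] = np.asarray(descriptor, dtype=float)

    def query(self, descriptor, top_k=3):
        # Online stage: rank all stored models by dissimilarity to the query.
        q = np.asarray(descriptor, dtype=float)
        ranked = sorted(self.index.items(),
                        key=lambda kv: np.linalg.norm(kv[1] - q))
        return [model_id for model_id, _ in ranked[:top_k]]

engine = RetrievalEngine()
engine.add_model("a", [0.0, 0.0])
engine.add_model("b", [1.0, 0.0])
engine.add_model("c", [5.0, 5.0])
nearest = engine.query([0.9, 0.0], top_k=1)
```

A real system would replace the linear scan with a spatial index (e.g. a tree structure) to speed up the search, exactly as the indexing step above describes.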

In this paper, we classify and compare a large part of the current state-of-the-art 3D content-based object retrieval methods, giving a contrasting assessment of the different approaches. Our aim is to introduce a novel taxonomy for 3D object retrieval approaches in order to provide a road map for possible research in the 3D object retrieval domain.


The paper is structured in the following manner. The following section contains a review of related work. Section 3 provides an overview of our proposed taxonomy and the different 3D object retrieval methods it contains. A comparison between these methods is presented in Section 4. Finally, considerations and conclusions are drawn in Section 5, followed by a discussion of topics for future work in Section 6.

2. RELATED WORK

Recently, a number of researchers have investigated the specific problem of content-based 3D object retrieval. An extensive amount of literature can also be found in the related fields of computer vision, object recognition, geometric modelling, computer-aided design and engineering. Surveys of this literature have been provided by Besl and Jain [4], Loncaric [5], Campbell and Flynn [6] and Mamic and Bennamoun [7]. For an overview of 2D shape matching methods we refer the reader to the paper by Veltkamp [8]. Unfortunately, most 2D methods do not generalize directly to 3D model matching: 3D shapes do not need to be segmented from a background, exhibit no projective deformation, and have no direct boundary parameterization.

The recent Ph.D. dissertations by Min [9], Kazhdan [10] and Vranić [11] focus on 3D object retrieval. They provide an introduction to 3D shape matching and a detailed description of their new object retrieval and querying methods. Goodall [12] provides an in-depth analysis of a range of 3D shape descriptors, assessing their suitability for general-purpose and specific retrieval tasks using the Princeton Shape Benchmark and real-world museum objects, evaluated with a variety of performance metrics. The survey by Cardone et al. [13] primarily focuses on shape similarity methods suitable for comparing CAD models in the context of product design and manufacturing applications. Shilane et al. [14] compare 12 shape matching methods with respect to processing time, storage requirements and discriminative power using the Princeton Shape Benchmark. Biasotti et al. [15] present a framework for evaluating 3D shape classification performance.

Many taxonomies have been proposed for classifying 3D retrieval methods from different perspectives. Bustos et al. [16] propose a taxonomy for 3D retrieval methods focusing on feature-based methods. The report by Icke [17] proposes a taxonomy that divides 3D retrieval methods into those that work directly on the 3D model (model based) and those that work on a number of 2D projections of the 3D model (view based). The taxonomy proposed by Iyer et al. [18] provides an extensive overview of 3D shape searching techniques especially relevant for CAD and engineering. Del Bimbo and Pala [19] present a comparative analysis of a few 3D retrieval approaches, comparing them with respect to robustness against deformations, ability to capture an object's structural complexity, and the resolution of the models. They classify 3D retrieval methods according to the principles under which shape representations are derived, resulting in four main categories of approaches: primitive based, statistics based, geometry based, and view based.

[Figure 1: 3D Object Retrieval Framework. Offline: the 3D model database feeds descriptor extraction and index construction, producing a descriptor index data structure. Online: an example model query is formulated, its query descriptors are extracted and matched against the index, matching model IDs are fetched from the database, and the query results are visualized.]


3. PROPOSED TAXONOMY FOR 3D OBJECT RETRIEVAL METHODS

Each 3D model, in a typical 3D object retrieval framework, has to be identified with a shape descriptor, providing a compact overall description of the model. The simplified representation of the 3D object given by the descriptor tries to carry most of the important features of the model while being easier to handle, store and compare than the object directly. A complete shape descriptor is a representation that can be used to fully reconstruct the original object.

In this section we provide a taxonomy for 3D object retrieval methods based on the representation of the shape descriptor, with a detailed description of many individual methods sorted according to our classification. In this classification, the retrieval methods are divided into five broad categories: (1) View Based Methods, (2) Graph Based Methods, (3) Geometry Based Methods, (4) Statistics Based Methods and (5) General Techniques. Figure 2 shows a more detailed categorization of 3D object retrieval methods as proposed by our taxonomy. Note that the classes of these methods are not completely disjoint. For instance, a graph-based shape descriptor may be extended to describe shape properties not related to topology. In such cases each method is classified according to the most characteristic aspect of its representation.

3.1 View Based Methods

View based methods are based on the property that similar 3D objects have similar appearances from one or more camera positions or viewing angles. In the real world, spatial objects, apart from means of physical interaction, are recognized by humans in the way they are visually perceived. Therefore, a natural approach is to consider 2D projections of spatial objects for similarity estimation. The problem of 3D retrieval is thereby reduced to one in two dimensions, where techniques from image retrieval can be applied.

The main idea of view based similarity methods is that multiple images of the 3D objects in a database are captured from several camera positions around each object. The images are often processed to either produce a binary representation [20] or to extract boundary contours [3]. Images of the query object, or a sketch produced by a user, are processed in a similar manner, and matching is thereby reduced to assessing the similarity between the views of the query object and those of the models in the database. The main challenge of view based techniques is obtaining a sufficient number of views to describe all possible aspects of a model while containing the associated storage cost.

[Figure 2: 3D Object Retrieval Methods' Taxonomy]
- View Based: Silhouette; Depth Buffer; Light Field; Spin Images; SIFT
- Graph Based: Skeletal Graphs; Reeb Graphs; B-Rep Graph; Spectral Graph
- Geometry Based: Weighted Point Set; Geometrical Moments; Shape Spectrum; Extended Gaussian Image; Canonical 3D Morphing; Volumetric Error; Heat Kernel Signatures
- Statistics Based: Shape Histograms; 3D Shape Contexts; Shape Distributions; Geometric Hashing; Spatial Maps
- General: Relevance Feedback; Bag of Features


One advantage of view-based descriptors is that it is straightforward to design query interfaces where the query consists of one or more sketches of a model from different views [3] [20]. On the other hand, retrieval based on views presents two main problems. First, self-occlusions may impair the completeness of the representation. Second, producing a large number of views can degrade the efficiency of the retrieval process itself.

For a more detailed discussion of view-based 3D object retrieval methods, it is convenient to further classify them into five subcategories: (1) Silhouette, (2) Depth Buffer, (3) Light Field, (4) Spin Images and (5) Scale Invariant Feature Transform (SIFT). An overview of each of these five categories, together with the solutions implemented under each, is covered in the following subsections.

3.1.1 Silhouette

A silhouette is an image of a person, object or scene consisting of the outline and a basically featureless interior, with the silhouetted object usually being dark (as in a shadow). Silhouettes thus contain the boundary of a shape from one viewpoint and indicate the region of a 2D image that contains the projection of the visible points of the 3D object. In order to represent a 3D shape, a collection of silhouettes must be generated and stored. This can be seen as a more economical representation compared to model based representations. A common use of this representation is object classification, where one view (silhouette) of a 3D shape is matched against a database of objects represented as collections of silhouettes. However, the problem with this representation is that, in theory, different 3D shapes might have the same set of silhouette images.

Heczko et al. propose a method called the silhouette descriptor [21] that describes 3D objects using their silhouettes generated by parallel projections. PCA is first applied to normalize the 3D objects, which are then scaled into a unit cube aligned with the principal axes. Three contour images are then computed using parallel projections such that each projection plane is orthogonal to one of the principal axes. Descriptors are acquired by concatenating Fourier descriptors of the three resulting contours. To obtain such descriptors, a silhouette is sampled by placing a certain number of equally spaced sequential points on the contour and taking the Euclidean distance between the image centre and the consecutive contour points as the sampling values. These sampling values in turn constitute the input to the Fourier transform. The concatenation of the magnitudes of the low-frequency Fourier coefficients of the three contour images then gives the silhouette object description. Through PCA pre-processing, the silhouette descriptor is pose and scale invariant. Figure 3 illustrates the contour images of a car object. The retrieval effectiveness of this descriptor is studied experimentally in [11]. Furthermore, the idea of using projected images for 3D object retrieval has also been implemented by employing various distance functions on image pairs resulting from rendering the object from certain directions [22]. The distance functions may be based on circularity measures of the projections or on distances between vectors of Fourier magnitudes. This method can be directly applied to industrial part retrieval and inspection systems where different geometric representations are used. Ansary et al. [23] propose a method called Adaptive Views Clustering (AVC) based on characteristic view similarity. The goal of this method is to provide an optimal selection of 2D views from a 3D model and a probabilistic Bayesian method for 3D object retrieval from these views. Other similar methods are introduced in [24] and [25].

Figure 3: Silhouettes of a 3D model. From left to right, the viewing direction is parallel to the first, second, and third principal axes of the model. Equidistant sampling points are marked along the contour. [16]
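The contour-sampling and Fourier steps described above can be sketched as follows. This is a minimal illustration, not the implementation from [21]: the function names, the number of retained coefficients, and the synthetic circular contour are assumptions.

```python
import numpy as np

def contour_fourier_descriptor(contour_points, num_coeffs=16):
    """Sketch of the per-contour step of the silhouette descriptor [21]:
    treat the Euclidean distances from the image centre to equally spaced
    contour points as a 1D signal, and keep the magnitudes of its
    low-frequency Fourier coefficients. Magnitudes are invariant to the
    choice of starting point on the contour (a circular shift only
    changes the phase of the spectrum)."""
    centre = contour_points.mean(axis=0)
    distances = np.linalg.norm(contour_points - centre, axis=1)
    spectrum = np.fft.fft(distances)
    return np.abs(spectrum[:num_coeffs])

def silhouette_descriptor(contours, num_coeffs=16):
    """Concatenate the Fourier descriptors of the three principal-axis
    projections into a single feature vector."""
    return np.concatenate(
        [contour_fourier_descriptor(c, num_coeffs) for c in contours])

# Example: a circular contour sampled at 128 equally spaced points.
theta = np.linspace(0, 2 * np.pi, 128, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
desc = silhouette_descriptor([circle, circle, circle])
```

Because only coefficient magnitudes are kept, the descriptor does not change when the sampling starts at a different point on the same contour, which matches the invariance the method relies on.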

3.1.2 Depth Buffer

The depth buffer descriptor [21] is a significant image-based descriptor that begins with the same arrangement as the silhouette descriptor: the model is oriented and scaled into the canonical unit cube. For each principal axis, two grey-scale images are generated by means of parallel projection, producing six images rather than the three silhouettes of the silhouette descriptor described in the previous section. Each pixel carries an 8-bit grey value representing the distance between the 3D model and the sides of the unit cube, measured


along the corresponding direction perpendicular to the viewing plane. The depth buffer feature vector is then constructed from the k first low-frequency coefficients of each image, after transforming the six images with the standard 2D discrete Fourier transform. An illustration of this method is given in Figure 4, where the Fourier transforms of a car model are visualized alongside the respective depth buffer renderings of the same model. The idea of depth or z-buffers from the field of computer graphics is manifest in the six rendered images extracted during the application of the depth buffer descriptor. This shape descriptor is further discussed in [11]; it achieves the highest retrieval performance in the study of Bustos et al. [26] and in the study of [27] with the Princeton Shape Benchmark.
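The transform-and-truncate step can be sketched as follows. This is illustrative only: the exact selection and count of low-frequency coefficients in [21] may differ, and the random images merely stand in for the six unit-cube depth renderings.

```python
import numpy as np

def depth_buffer_descriptor(depth_images, k=8):
    """Sketch of the depth-buffer descriptor [21]: each of the six depth
    images is transformed with a 2D DFT, and the magnitudes of a k x k
    block of the lowest-frequency coefficients are concatenated into one
    feature vector. The k x k corner selection is an illustrative
    assumption."""
    features = []
    for img in depth_images:
        spectrum = np.fft.fft2(img)
        # The top-left corner of the unshifted spectrum holds the
        # lowest (non-negative) frequencies.
        features.append(np.abs(spectrum[:k, :k]).ravel())
    return np.concatenate(features)

# Six synthetic 64x64 "depth images" standing in for the renderings.
rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(6)]
vec = depth_buffer_descriptor(images, k=8)
```

Keeping only low-frequency magnitudes makes the vector compact (here 6 × 64 = 384 values) and insensitive to fine pixel-level noise in the renderings.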

Ohbuchi et al. [28] introduce a shape descriptor called the Multiple Orientation Depth Fourier Descriptor, which works on polygon-soup 3D models. The models are made invariant to translation and scaling, and a set of 42 depth-buffer images (a kind of range image) of the 3D model is computed. These images are rendered from 42 viewpoints in order to approximately and discretely cover all possible view aspects of the model. For each viewpoint a shape feature vector is then computed; the shape feature for each view is the rotationally invariant generic Fourier descriptor for 2D images developed by Zhang et al. [29]. The set of 42 feature vectors comprises the multiple-orientation shape descriptor for the 3D model. Similarity is calculated as the sum of the minimum distances between views of one object and all views of the other. This descriptor performs slightly better than the Absolute-Angle Distance (AAD) descriptor in the comparison by Ohbuchi et al. [30]; in the experimental results presented by Tangelder and Veltkamp [31], using a test database containing 1,213 models, it obtains better retrieval results, but with more processing time, than the D2 shape distribution approach [32], which will be explained later in Section 3.4.3. Bustos et al. [16] conclude that the depth buffer has good retrieval capability and is able to outperform other descriptors on their benchmarking database according to their own experimental results.
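The sum-of-minimum-distances comparison can be sketched as below. This is a simplified, hypothetical rendering of the matching step in [28]: each model is reduced to a matrix of per-view feature vectors, and for each view of one model the closest view of the other is found.

```python
import numpy as np

def multi_view_distance(views_a, views_b):
    """Sketch of view-set comparison in the spirit of the Multiple
    Orientation Depth Fourier Descriptor [28]: for each feature vector
    of model A, find the distance to the closest feature vector of
    model B, and sum these minima. Note this simple form is asymmetric
    in its arguments."""
    # Pairwise Euclidean distances: shape (num_views_a, num_views_b).
    dists = np.linalg.norm(views_a[:, None, :] - views_b[None, :, :], axis=2)
    return dists.min(axis=1).sum()
```

Matching each view against all views of the other model is what removes the need to align the two models' viewpoint numbering before comparing them.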

Figure 4: Depth buffer-based feature vector. The first row of images shows the depth buffers of a car model. Darker pixels indicate that the distance between view plane and object is smaller than at brighter pixels. The second row shows coefficient magnitudes of the 2D Fourier transform of the six images. [16]

3.1.3 Light Field

A light field descriptor is proposed by Chen et al. [33], where two models are considered similar if they look similar from all viewing angles. Given a 3D model made invariant to translation and scaling, they create a light field containing a set of silhouettes obtained from parallel projections of the 3D object. 20 views of a 3D model are extracted by placing a camera on each vertex of a dodecahedron that is assumed to entirely surround the object, with the object located at the dodecahedron's centre (Figure 5). This camera system is oriented consistently: all cameras point towards the dodecahedron's centre, and a unique definition is applied for the camera up-vector. Each view (silhouette) is obtained as a binary image (colour or grey-level information of the object surface is disregarded). Since views taken from opposite vertices of the dodecahedron are identical, only 10 views are sufficient to represent a 3D object with enough accuracy (Figure 6). For each view, the image content is represented using a combination of 35 Zernike moments [34] and 10 Fourier descriptors [35], all included in a 45-dimensional feature vector that represents the light field descriptor of the view, thereby reducing the two-dimensional information to a one-dimensional feature vector. To measure the similarity between two 3D models, the camera system of one model is rotated with respect to the other so that all 60 possible orientations of both camera systems are covered, generating a set of corresponding image pairs. The minimum over these orientations of the sum of distances between corresponding image pairs is then taken as the dissimilarity between the two 3D objects.
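The rotation-minimizing comparison can be sketched as follows. This is hypothetical: the 60 camera-system orientations are assumed to be supplied as index permutations of the 10 per-view feature vectors, which is one possible encoding, not the paper's.

```python
import numpy as np

def lightfield_distance(desc_a, desc_b, rotations):
    """Sketch of the light-field comparison [33]: each model is a set of
    10 per-view 45-dimensional vectors; 'rotations' enumerates candidate
    alignments of the two camera systems as index permutations. The
    dissimilarity is the minimum, over rotations, of the summed
    per-view distances."""
    best = np.inf
    for perm in rotations:
        # Distance under this alignment: pair view i of A with
        # view perm[i] of B and sum the per-pair distances.
        total = np.linalg.norm(desc_a - desc_b[perm], axis=1).sum()
        best = min(best, total)
    return best
```

Minimizing over all alignments is what makes the comparison rotation invariant: two identical models always admit at least one alignment with distance zero.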


Figure 5: Extraction of the light field descriptor for a chair model. [158]
Figure 6: Silhouette images of a chair. [33]

A multistage filter-and-refinement process can be applied to speed up the light field algorithm so that it can be used for online retrieval. This acceleration is achieved through early rejection of non-relevant models that exhibit a distance greater than the mean distance between the query object and all other objects in the database. Shilane et al. [14] concluded that the light field method is outstanding and able to outperform other methods, but at the cost of much more processing time.

3.1.4 Spin Images

Johnson and Hebert [36] propose a 3D descriptor that uses sets of so-called spin images to characterize 3D objects. By design, this descriptor is invariant to rotation and translation. A spin image is a 2D histogram computed at a chosen point on the surface of a model; for a mesh model, a spin image can be computed for every vertex. A surface normal is estimated at each vertex, which is taken as the oriented point. The contributing points are those within a maximum distance D of the oriented point whose normals make an angle with the oriented point's normal that lies within an allowed range. A 2D histogram is then calculated from the perpendicular distances to the surface normal and to the tangent plane at the oriented point. This histogram can also be treated as an image.

The authors suggest applying bilinear filtering to the spin images in order to reduce the impact of noise. Scaling invariance is provided by normalizing the distance range to unit length. Figures 7 and 8 illustrate the spin image generation process. Due to the potentially high storage and computation overhead when cross-comparing all spin images of two objects, and the presence of redundant information among close or symmetrically related spin images, Johnson and Hebert [36] suggest compressing the set of an object's spin images using dimension reduction. Although this method can be used to recognize models in a cluttered 3D scene, it is very difficult to apply to 3D shape matching due to the complexity of its representation.
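The α/β histogram construction can be sketched as below. This is a simplified illustration of [36]: the support-angle test on normals and the bilinear filtering step are omitted, and all parameter names are ours.

```python
import numpy as np

def spin_image(points, normals, idx, bins=16, max_dist=1.0):
    """Sketch of spin-image construction [36] at one oriented point p:
    for every surface point x, compute beta (signed distance from x to
    the tangent plane at p) and alpha (perpendicular distance from x to
    the normal line through p), then accumulate a 2D histogram over
    (alpha, beta). The result is invariant to rotation and translation
    because alpha and beta are defined relative to p and its normal."""
    p, n = points[idx], normals[idx]          # n assumed unit length
    d = points - p
    beta = d @ n                               # distance to tangent plane
    alpha = np.sqrt(np.maximum((d * d).sum(axis=1) - beta**2, 0.0))
    hist, _, _ = np.histogram2d(
        alpha, beta, bins=bins,
        range=[[0.0, max_dist], [-max_dist, max_dist]])
    return hist
```

Each vertex of a mesh yields one such image, so a model is described by a set of spin images, which is exactly what motivates the compression and clustering schemes discussed next.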

De Alarcon et al. [37] use the spin image method for 3D object retrieval. For each 3D model, represented as a polygon mesh, they generate a large number of spin images and then, using a Self Organizing Map (SOM) algorithm, produce a reduced set of spin images per model. Furthermore, they cluster these spin images using the k-means clustering algorithm in order to provide an indexing mechanism for their database. This technique reduces the number of required descriptor comparisons by checking only the spin image prototypes. The authors report experimental results on a small database.


Figure 7: Building a spin image as a histogram of distances α and β of points in some neighbourhood with respect to basis point p. [159]
Figure 8: Selected spin images generated from a 3D model. [160]

Assfalg et al. [38] also suggest spin image post-processing techniques that help reduce the number of spin images used to describe each object. In particular, spin images are interpreted as grey-scale images, which can be efficiently described by a low-dimensional region-based description scheme from the Content-Based Image Retrieval (CBIR) domain. Orthogonally, fuzzy clustering is proposed to reduce the number of spin images to a smaller number of prototypes, over which a sum-of-cluster-distances function is suggested.

3.1.5 Scale Invariant Feature Transform (SIFT)

Scale Invariant Feature Transform (SIFT) [39] is a method for image feature generation that transforms an image into a large collection of local feature vectors, each of which is invariant to image translation, scaling and rotation, and partially invariant to illumination changes and affine or 3D projection. SIFT consists of four major stages: (1) scale-space peak selection, (2) keypoint localization, (3) orientation assignment and (4) keypoint descriptor. In the first stage, all scales and image locations are searched to identify potential interest points that are invariant to scale and orientation. In the second stage, keypoints are selected based on measures of their stability. The third stage assigns one or more orientations to each keypoint location based on local image gradient directions. The final stage builds a local image descriptor for each keypoint based on the image gradients in its local neighbourhood, which are then transformed into a representation used for object recognition. This method can be used for 3D object retrieval by applying the SIFT algorithm to images rendered from the 3D models to be compared, producing thousands of local visual features per model.

Ke and Sukthankar [40] present PCA-SIFT, an alternative representation for the local image descriptors of the SIFT algorithm. Instead of using SIFT's smoothed weighted histograms, Principal Components Analysis (PCA) is applied to the normalized gradient patch in order to encode the salient aspects of the image gradient in the feature point's neighbourhood. In the authors' experiments, PCA-based local descriptors proved more distinctive, more robust to image deformations, and more compact than the standard SIFT representation.

A number of methods have been presented using the SIFT technique for 3D object retrieval. Li et al. [41] apply SIFT to

extract 2D view features from six orthogonal directions of a 3D model where Continuous Principal Component

Analysis (CPCA) is used to implement pose alignment for capturing the six orthogonal 2D views. The 2D view

feature vector dimension is reduced by computing and storing the eigenspace. Then, transform-based features are

extracted using the radial integral transform and the spherical integration transform. A weighted sum of the 2D view and

transform-based feature similarities is used for comparing a query model with the database models.

Hua et al. [42] use multi-view SIFT features for 3D model retrieval. First the 3D model is projected from multiple

directions to obtain 2D depth images. Second, the SIFT features are extracted from these 2D images. Then, the

codebook is generated with the k-means clustering algorithm, sized according to the proportion of SIFT features of

each shape type relative to the whole set of SIFT features. Finally, a simplified vector is generated by clustering all the

SIFT features associated with a model into the form of a histogram. To compare a query model with the existing

models, the distance between their simplified vectors is calculated. Also, Sfikas et al. [43] use SIFT descriptors based on

panoramic views for 3D object retrieval where a range of image queries representing partial views of 3D objects are


generated. Generally, SIFT-based 3D object retrieval methods achieve satisfactory retrieval performance for both

articulated and rigid models.
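As an illustration of the codebook-based comparison scheme described above (e.g., the general scheme of Hua et al. [42]), the following sketch clusters local descriptors into visual words and compares the resulting histograms. It is a minimal sketch, not the authors' implementation: the descriptor dimensionality, codebook size, k-means initialization and the L1 histogram distance are all illustrative choices, and random vectors stand in for real SIFT descriptors.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Plain k-means over local descriptors -> k 'visual word' centres."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign every descriptor to its nearest centre
        d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bof_histogram(descriptors, centers):
    """Quantize one model's descriptors against the codebook -> normalized histogram."""
    d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)
    return hist / hist.sum()

# toy stand-ins for SIFT descriptors extracted from rendered views of two models
rng = np.random.default_rng(1)
model_a = rng.normal(size=(200, 8))
model_b = rng.normal(size=(150, 8))
codebook = build_codebook(np.vstack([model_a, model_b]), k=16)
ha = bof_histogram(model_a, codebook)
hb = bof_histogram(model_b, codebook)
distance = np.abs(ha - hb).sum()    # L1 distance between the two histograms
```

Comparing a query against a database then reduces to ranking models by this histogram distance.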

3.2 Graph Based Methods

Graph based methods attempt to extract a geometric meaning from a 3D shape using a graph showing how shape

components are linked together. A graph can more easily represent the structure of an object that is made up of, or can

be decomposed into, several meaningful parts such as the body and the limbs of objects modelling animals.

Correspondences between two models can then be established using subgraph isomorphism techniques, which

simultaneously define the correspondences between the nodes of the two graph representations, and give the quality

of the correspondences; each graph node may also carry properties such as a local shape descriptor. Graph

comparison costs increase with graph size, and graphs can be matched using either exact or inexact

matching techniques. While exact matching looks for noiseless matches between two graphs, inexact matching results

in a measure of similarity between the graphs even in the presence of noise. Graphs are a natural choice to capture

model topology but they often involve complex extraction algorithms and matching strategies. Besides the difficulty

of constructing consistent graphs, computing the matching subgraph (subgraph isomorphism) is known to be

computationally inefficient. However, the major advantage in representing 3D models as topological graphs is that it

allows representation at multiple levels of detail and facilitates matching of local geometry.
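The subgraph-correspondence idea can be illustrated on toy graphs. The brute-force routine below tests whether some node-induced subgraph of a "body with limbs" graph is isomorphic to a small query graph; real systems rely on far more scalable (often inexact) matchers, so this is only a didactic sketch with invented node names.

```python
from itertools import permutations

def subgraph_isomorphic(big_edges, big_nodes, small_edges, small_nodes):
    """Brute-force node-induced subgraph isomorphism: does some subset of
    big's nodes induce a graph isomorphic to small? (viable only for tiny graphs)"""
    big = {frozenset(e) for e in big_edges}
    small = {frozenset(e) for e in small_edges}
    for cand in permutations(big_nodes, len(small_nodes)):
        m = dict(zip(small_nodes, cand))       # small node -> big node
        # induced check: every pair must be an edge in small iff its image is in big
        if all((frozenset((m[a], m[b])) in big) == (frozenset((a, b)) in small)
               for a in small_nodes for b in small_nodes if a != b):
            return m
    return None

# toy 'animal' graph: a torso with two arms and two legs ending in feet
body_nodes = ["torso", "arm1", "arm2", "leg1", "leg2", "foot1", "foot2"]
body_edges = [("torso", "arm1"), ("torso", "arm2"), ("torso", "leg1"),
              ("torso", "leg2"), ("leg1", "foot1"), ("leg2", "foot2")]

match = subgraph_isomorphic(body_edges, body_nodes, [(0, 1), (1, 2)], [0, 1, 2])
no_tri = subgraph_isomorphic(body_edges, body_nodes,
                             [(0, 1), (1, 2), (0, 2)], [0, 1, 2])  # triangle: absent
```

A chain of three nodes (a "limb") is found, while a triangle is not, since the body graph is a tree; the exponential cost of the enumeration is exactly why the text calls subgraph isomorphism computationally inefficient.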

Graph based methods can be divided into four broad categories according to the type of graph used: (1) Skeletal

graphs, (2) Reeb graphs, (3) B-rep graphs and (4) Spectral graphs.

3.2.1 Skeletal graphs

Regarded as intuitive object descriptions, skeletons derived from solid objects can capture essential information

about an object's structure. This can be seen in applications including object analysis, compression and animation.

Skeletal graph-based techniques compute the 'skeleton' of a model and convert it into a skeletal graph as its shape

descriptor. The concept of a skeleton was proposed by Blum [44]. A skeleton in 2D is the medial axis, while in 3D it is

the medial surface. In order to use skeletons for 3D object retrieval, suitable skeletonization algorithms and skeleton

similarity functions have to be defined. Skeletonization can be performed by various methods such as distance

transform [45], thinning [46], or Voronoi-based methods [47]. Additionally, curve skeletonization [48] methods have

been proposed to convert a 3D model into a medial axis type representation. The skeletal graph (Figure 9) stores the

various entities obtained after skeletonization in a graph data structure. Advantages of skeletal graph-based methods

are that the graphs are topology preserving and are smaller in size than B-Rep graphs that will be discussed later.

Hence, they can be used for subgraph isomorphism at a very low computational cost. Additionally, local part

attributes can be stored for a more accurate comparison. It is important to note that many 3D engineering models are

not amenable to skeletonization by thinning. A skeletal graph shape descriptor is proposed by Sundar et al. [49]

encoding both geometric and topological features of a 3D model. First, a distance transform-based thinning algorithm

[50] is applied using a thinness parameter to extract the skeletal points of a given 3D object after being voxelized.

Next, a minimum spanning tree algorithm is applied in order to connect the skeletal points in an undirected acyclic

shape graph. Finally, a hierarchical graph structure can be obtained by using different magnitudes of the thinness

parameter where every segment of the original skeleton is represented by a node in this graph. Each node carries two

signature vectors. The first one is a geometrical signature vector which encodes the radial distribution around the

segment, while the second one is a topological signature vector which encodes the topology of the subtrees rooted at

the node and is stored for efficient indexing. The skeletal graph approach is shown in Figure 10.

Figure 9 Skeletal Graph based representation of shape. [18]


Figure 10 Skeletal graph matching with colours showing the node-to-node correspondence based upon the topology and radial

distance about the edge. [49]

In this approach, the similarity between two given 3D models can be measured by comparing their hierarchical

skeletal graphs using graph matching. Moreover, the main advantages of this approach are its support of partial

matching and its suitability for matching articulated 3D objects.
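A rough sketch of this pipeline (voxel model → distance transform → skeletal points → undirected acyclic graph) can be written with SciPy. The "thinness" rule below, keeping voxels whose distance value is close to the local maximum, is a simplified stand-in for the distance transform-based thinning of [50], and the L-shaped solid is an invented toy model.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, maximum_filter
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def skeletal_points(voxels, thinness=0.9):
    """Approximate skeletal voxels: keep voxels whose distance-to-surface is
    close to the local maximum (a stand-in for the paper's thinness parameter)."""
    dist = distance_transform_edt(voxels)
    ridge = (dist >= thinness * maximum_filter(dist, size=3)) & voxels
    return np.argwhere(ridge), dist[ridge]      # point coordinates + local radii

def skeleton_tree(points):
    """Connect skeletal points into an undirected acyclic graph via an MST."""
    dists = squareform(pdist(points.astype(float)))
    return minimum_spanning_tree(dists)          # sparse matrix of tree edges

# toy model: an L-shaped solid, voxelized
vol = np.zeros((12, 12, 4), dtype=bool)
vol[2:10, 2:4, 1:3] = True
vol[2:4, 2:10, 1:3] = True
pts, radii = skeletal_points(vol)
tree = skeleton_tree(pts)                        # len(pts) - 1 tree edges
```

The returned radii correspond to the geometric (radial) information that Sundar et al. attach to skeleton segments, while the spanning tree plays the role of the undirected acyclic shape graph.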

Iyer et al. [51] and Lou et al. [52] use thinning as the skeletonization method. A 3D model is converted into a voxel

model and then into a thinned skeleton. The thinned skeleton is converted into a skeletal graph through a skeleton

marching algorithm. The skeletal graph consists of nodes, edges, and loops. Each edge in the skeleton translates into

an independent geometric entity, while each loop represents a hole in the 3D model, thereby giving a shape to the

model in physical space. Skeletal graphs preserve the geometry and topology of the model and are considerably

smaller than the B-Rep graphs and insensitive to minor perturbations in shape. Additionally, feature vectors such as

moment invariants, geometry and voxelization parameters, and graph parameters are stored in the database. The

search process consists of feature-vector searching as well as skeletal graph-based searching. A database of 150

models was used to test similarities between models. However, thinning does not yield intuitive skeletons for many

engineering components, especially for shell-like parts. In order to overcome the disadvantages of the skeleton-based

approach, they also describe a multi-step search approach that uses both global feature vectors and the skeletal graphs

in different steps.

Kim et al. [53] present a method for reducing 3D solid models into skeleton graphs based on medial axis transform

and dilation. Triangulated models of the objects are initially voxelized with certain resolution, and skeletons are

obtained based on the rank of the voxels in the model. The rank of a voxel is defined as the number of layers of

voxels surrounding it. For example, in Figure 11, pixel A has a rank of 0, while pixel B has a rank of 1. Hence, the

skeleton of this image consists of pixels A and B. These pixels are called nodes. For each pixel in the skeleton, k

dilations are performed around a pixel with rank k, starting with the pixels having highest rank K. Dilations are then

performed successively for all pixels in the skeleton thereby reconstructing the object. However, care is taken not to

dilate any node pixels of lower rank that are already a part of the dilations of a pixel with higher rank. This method is

directly extended to 3D voxel models. The graph is finally obtained by the union of nodes for which dilation was

performed. Each node now holds information on all the voxels that were formed during dilation and its own rank. If

two nodes have common voxels then the voxels are assigned to the node with a higher degree while performing the

union operation. In order to reduce the size of the graph, a method is presented to merge nodes based on a threshold

rank and size of the nodes. Experiments on a few models are presented showing a considerable decrease in the size of

the graphs. This method is prone to problems with voxelization resolution. Different graphs may be produced by

varying the resolution during voxelization. Even if the same resolution is applied to all models as a remedy, the

graphs generated may not reflect the perception of the user.

Figure 11 An illustration of the rank of a voxel during dilation. [18]
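The rank computation illustrated in Figure 11 can be emulated with successive binary erosions: a voxel of rank k is one that survives exactly k erosions, i.e. is surrounded by k layers of set voxels. The sketch below is an interpretation of that definition, not Kim et al.'s implementation.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def voxel_ranks(voxels):
    """Rank of a voxel = number of erosions it survives, i.e. the number of
    layers of set voxels surrounding it (background voxels get rank -1)."""
    rank = np.full(voxels.shape, -1, dtype=int)
    current, k = voxels.copy(), 0
    while current.any():
        rank[current] = k            # survivors of k erosions have rank >= k
        current = binary_erosion(current)
        k += 1
    return rank

# toy example: a solid 5x5x5 cube embedded in a 7x7x7 grid
cube = np.zeros((7, 7, 7), dtype=bool)
cube[1:6, 1:6, 1:6] = True
ranks = voxel_ranks(cube)
centre_rank = ranks[3, 3, 3]   # the centre is surrounded by two layers
```

Dilation-based reconstruction then proceeds from the highest-rank voxels outwards, as described in the text.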


Nagasaka et al. [54] convert a 3D model into a voxel model and subsequently convert it into a line skeleton based on

distance transform. A distance measure, DS, denoting the distance of a voxel from the surface, is defined for all the

voxels in the model. The voxels with the maximum DS values constitute the skeleton and are defined by three types of

geometries: lines, circular rings, and triangles. These skeletons are linked to each other with or without a distance for

connected and disconnected skeletons, respectively. Each skeleton is associated with a set of nine attributes including

the DS distribution, volume, and link strength. The similarities among 36 casting work pieces are compared based on

their skeletons, where each skeleton is represented as a tree with 126 leaves. A back-propagation based neural

network is trained using these attributes to classify the objects into a set of three distinct groups (called master data

objects). All the other models in the database are classified into one of these three groups. Although some of the

models have reasonable similarity estimates, others have similarity estimates that are counter-intuitive.

3.2.2 Reeb graphs

Reeb [55] defines a skeleton structure, called the Reeb graph, which is represented by a graph of interconnected nodes

based upon a suitable continuous scalar function on an object. Three types of scalar functions have been used, namely

Height function, Curvature function, and Geodesic distance. Geodesic distance has been used in many applications

because it provides invariance against rotation and robustness against noise and small perturbations. The function is

integrated over the whole body to make it invariant to the starting point and is also normalized to achieve scale

invariance. Mathematically, the Reeb graph is defined as the quotient space of a shape S and a quotient function f.

Biasotti et al. [56] compare Reeb graphs obtained by using different quotient functions f and highlight how the choice

of f determines the final matching result. For instance, the integral geodesic distance as quotient function is especially

suited for articulated objects, while the distance to the barycentre should be preferred if the aim is to distinguish

different poses of an articulated object. Reeb graphs were first introduced into Computer Graphics by

Shinagawa et al. [57], and were initially limited to Morse mapping functions. Moreover, their computation

requires a priori knowledge of the object genus. Reeb graphs can be efficiently used for surface analysis and

understanding, simplification, similarity evaluation and 3D object retrieval.

Hilaga et al. [58] use Multi-resolution Reeb Graphs (MRGs) for representing the topology of 3D shapes as an

extension of the original Reeb graphs to be used for comparing 3D models. A 3D model is divided into a number of

levels based on the value of the scalar function where the geodesic distance distribution is utilized as the continuous

function. A node of the Reeb graph represents a connected component in a particular region, and adjacent nodes are

linked by an edge if the corresponding connected components of the object contact each other. For topology

comparison, the similarity between nodes is estimated based on the similarity of the node attributes, while a coarse-to-

fine strategy is adopted for topological correspondence comparison. The coarse-to-fine strategy is believed to help

avoid the combinatorial explosion of the topology matching. Experiments are performed on 230 models, where the

search took on average 12 s for each model, corresponding to about 0.05 s per similarity measurement. This method

is well suited for articulated shapes.
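The level-set construction underlying such a graph can be sketched on a toy mesh graph: quantize a scalar function into intervals, take the connected components inside each interval as graph nodes, and link components joined by a mesh edge that crosses interval borders. The scalar values, the union-find bookkeeping and the "two fingers" toy mesh below are all illustrative; a real MRG additionally uses the integrated geodesic-distance function and multiple resolution levels.

```python
import numpy as np

def reeb_like_graph(edges, f, n_levels):
    """Coarse Reeb-style graph from a vertex graph (edge list) and a scalar
    function f: nodes = per-level connected components, links = crossing edges."""
    level = np.minimum((f - f.min()) / (np.ptp(f) + 1e-12) * n_levels,
                       n_levels - 1).astype(int)
    parent = list(range(len(f)))           # union-find over mesh vertices
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]; x = parent[x]
        return x
    for a, b in edges:                      # merge vertices within one level
        if level[a] == level[b]:
            parent[find(a)] = find(b)
    comp = {v: find(v) for v in range(len(f))}
    nodes = sorted(set(comp.values()))
    links = set()
    for a, b in edges:                      # edges crossing level borders
        if level[a] != level[b]:
            links.add(tuple(sorted((comp[a], comp[b]))))
    return nodes, links

# toy mesh: a path that splits into two 'fingers'
edges = [(0, 1), (1, 2), (2, 3), (2, 4), (3, 5), (4, 6)]
height = np.array([0.0, 0.2, 0.4, 0.6, 0.6, 0.9, 0.9])   # scalar function
nodes, links = reeb_like_graph(edges, height, n_levels=3)
```

The two fingers appear as two separate components in the top level, both linked to the single component below them, which is the branching that topology matching exploits.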

Chen and Ouhyoung [59] extended the MRG approach proposed by Hilaga et al. to handle practical parts. The claim

was that the original MRG-based method needed some modifications for accurate determination of the search key. To

achieve this objective, the paper recommends adequate pre-processing by resampling the original triangulated models

to split large triangles into smaller triangles before determining the MRGs. A database of 445 models was used for

testing the system. The paper reported the average time for comparing two models as 0.08 s, compared to the 0.05 s

reported by Hilaga et al. The extra computation time may be attributed to the extra costs of pre-processing the models

for better accuracy.

Bespalov et al. [60] investigate the application of Hilaga's method to solid models by applying it to a large number

of engineering parts. They found that for solid models, minor changes in topology may result in significant

differences in similarity. Since for solid models topological insensitivity is important, they conclude that the Reeb

graph technique requires some improvements. Bespalov et al. [61] present preliminary research on a modification of

Hilaga's method, which computes a scale-space decomposition of a shape, represented as a rooted undirected tree

instead of a Reeb graph. This reduces the problem of comparing two 3D models to computing a matching among the

corresponding rooted trees.

Tung and Schmitt [62] propose to augment a multiresolution Reeb graph with geometrical attributes like volume,

cords and curvature of the surface part associated with a node of the Reeb graph. In the context of human body

matching, they argue that without geometric information, an arm could be matched to a leg because they are

topologically equivalent. Also, their approach supports partial matching.

In summary, the MRG-based approach has the following advantages: (1) it works for non-orientable, non-closed, and

non-manifold surfaces, (2) it is position, orientation, and scale independent, (3) it considers local and topological

similarity while comparing the 3D objects and (4) it adopts a hierarchical strategy to reduce the search space through

multiple levels of resolution. Major limitations of an MRG based approach are the following: (1) it is affected by


surface connectivity, (2) it is more sensitive to geometry than topology, (3) it is not amenable to sub-graph matching,

(4) it does not always represent the skeleton and (5) it produces vertices of different density. Chen and Ouhyoung [59]

conclude that using a hierarchical medial axis-based method would be more useful in overcoming some of these

problems, while Bespalov et al. [60] recommend using better scalar functions that also take into account the topology

of the object.

3.2.3 B-Rep graph matching

In the engineering domain, 3D models are often represented as Boundary Representations (B-Reps), which are

inherently graph-based model descriptions. A B-rep describes a model in terms of its vertices, edges and faces, where the

model is represented as a graph of bounded B-Spline surfaces. In contrast to the facets of meshes, the faces

of a B-rep may be represented as free-form surfaces. The nodes of the graph represent the set of bounding surfaces

while the edges represent the intersecting curves between corresponding surfaces. These representations tend to be

large and complicated even for simple shapes, thereby posing a challenge for graph matching algorithms. A number

of approximate algorithms based on heuristics and randomization are often employed to determine the best match

between graphs. One approach converts the B-Rep graph into a Model Signature Graph, which is essentially a map of

the B-Rep graph, but with different node attributes that describe the geometric properties of the corresponding

surfaces in the manufacturing context.

El-Mehalawi and Miller [63] use the attributed graph matching approach to compare CAD models of engineering

parts in the STEP format. The models are converted from the STEP format to attributed graphs whose nodes contain

geometric attributes that represent the surfaces of the STEP model. The graph matching process and the experimental

results are described by El-Mehalawi and Miller [64]. The paper takes an inexact graph matching approach that

avoids the combinatorial problems associated with exact matching. Similarity measures are generated using an

inexact graph matching algorithm based on integer programming. However, most of the models tested in the paper

have small sizes and moderate complexity. Two parts with on the order of 200 surfaces were also compared

successfully. However, the times taken for the comparisons were not presented.

The B-rep graph based approaches are especially relevant for the CAD/CAM community, but are difficult to apply to

models of natural shapes like humans and animals. To the best of our knowledge only Zuckerberger et al. [65] apply

an approach similar to B-rep graphs to content-based retrieval suitable for natural shapes. They decompose the surface

of a model into patches classified as similar to a sphere, a cylinder, a cone or a plane, and identify adjacent patches to

build a graph representation of the model.

3.2.4 Spectral graph theory

Spectral graph theory is a branch of mathematics that relates the Eigenvalue spectra of the adjacency matrix of graphs

with other geometric invariants of the graph [66]. Chung [67] proposes a refined version of the graph spectra, which is

based on the Laplacian matrix of the graph and correlates with the graph invariants better than the spectra of the

original adjacency matrix. The graph spectra obtained for different graphs are then compared using various distance

measures. These measures are intuitive only when the objects to be compared have graphs of the same size. However,

when the sizes of the graphs are not the same, the spectra have different lengths thereby making a comparison

difficult. McWherter and Regli [68] overcame this problem by padding constant values to the smaller of the two

graph spectra to make them have the same lengths.
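The padding trick can be written down directly: compute the Laplacian spectra of two graphs and pad the shorter spectrum with a constant before taking an L2 distance. The adjacency matrices below are toy examples, and the default padding constant of 0.0 is one of the options mentioned later in the text.

```python
import numpy as np

def laplacian_spectrum(adj):
    """Eigenvalues of the graph Laplacian L = D - A, sorted ascending."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.sort(np.linalg.eigvalsh(lap))

def spectral_distance(spec_a, spec_b, pad_value=0.0):
    """L2 distance between two spectra; the shorter one is padded with a
    constant so both arrays have equal length, as in McWherter and Regli."""
    n = max(len(spec_a), len(spec_b))
    a = np.pad(spec_a, (0, n - len(spec_a)), constant_values=pad_value)
    b = np.pad(spec_b, (0, n - len(spec_b)), constant_values=pad_value)
    return np.linalg.norm(a - b)

# toy graphs: a triangle versus a path of four nodes
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
path = np.diag([1.0, 1, 1], 1) + np.diag([1.0, 1, 1], -1)
d = spectral_distance(laplacian_spectrum(tri), laplacian_spectrum(path))
```

The smallest Laplacian eigenvalue of any graph is 0, and graphs of different sizes become directly comparable once the spectra share a common length.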

McWherter et al. [69] make use of a specialized graph structure called the Model Signature Graph (MSG) constructed

from the B-Rep representation [70]. MSGs are essentially attributed graphs with the vertices representing the faces of

the model and the vertex attributes describing the qualities of the face. These attributes include type of surface (flat,

curved, etc.), relative size, topological identifier of the faces (planar, conical, etc.), underlying geometric

representation of the surfaces (type of function), the surface area, and a set of surface normals or aspects for the face.

Similarly, edges between surfaces are described with attributes such as their topological identifiers,

concavity/convexity, geometric representation, and the length of the curve.

Peabody et al. [71] also use the frequency histograms of these attributes in comparing two models. The type

histogram for a given solid model essentially represents the frequency of the 13 types of surfaces and eight types of

curves found in the ACIS solid modeller. The MSG essentially is as complex in topology as the original B-Rep

structure. Various graph properties extracted from MSGs, such as vertex and edge counts, maximum, minimum,

mean, mode, and standard deviation of vertex degrees, as well as graph diameter, are also used to compare similarities

between models. In addition, McWherter et al. [72] employ spectral graph theory to describe the topology of the

MSGs. The distance between graph spectra is called Eigen distance. The Type Histogram together with all the graph

invariants is referred to as the Invariant Topology Vector (ITV), which has 33 features. Similar models are

conjectured to have similar ITVs and, hence, the distance between ITVs provides an approximate measure of the

similarity between models. The distance measures are calculated based on the L2 norm. Variable-size graphs are dealt


with by 'truncating' the Eigenvalue spectrum or 'padding' it with a set of constant values (of 0.0, 1.0 or 2.0) in order

to maintain a constant size of the Eigenvalue array for all models. However, it is important to note that the adjacency

matrix of the MSG represents certain properties, which cannot be completely captured by the Eigenvalues. This

method is suitable for complicated graph structures, which are not amenable to graph matching algorithms that are at

best NP-hard.

McWherter et al. [73] provide a method for comparing the substructure of the solid models. The approach consists of

partitioning the MSG into two or more subgraphs by removing the edges that cross the partitions with the aim of

separating the highly connected components. Partitioning is continued iteratively while also indexing the Eigenvalues

of each of the components at each stage. This is a step towards obtaining refined similarity measures that describe a

solid model in terms of its local properties. However, optimal partitioning of the graph is again an NP-hard problem.

While the similarity metrics used in these papers can be computationally fast and seem very useful for pruning the

search space in large databases, the metrics do not seem to be satisfactory in producing refined similarity measures

that can finely distinguish 3D models.

3.3 Geometry Based Methods

Geometry based methods make use of certain geometric parameters and ratios of the 3D object as shape descriptors.

Some examples of these parameters are volume, surface area, curvature or other kinds of numerical descriptions

extracted from the shape. Examples of extracted ratios are surface area to volume ratio, compactness (non-

dimensional ratio of the volume squared over the cube of the surface area), crinkliness (surface area of the model

divided by the surface area of a sphere having the same volume as the model), convex hull features, bounding box

aspect ratio, and Euler numbers. All these parameters and ratios can characterize the object either globally or

locally. Most global methods are computationally efficient but do not allow partial matching; in contrast,

local methods are less computationally efficient but can be used for partial matching as well.
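Two of the global ratios defined above, compactness (volume squared over the cube of the surface area) and crinkliness (surface area relative to the equal-volume sphere), can be computed directly from a closed triangle mesh. The sketch below uses the divergence theorem for the volume and a unit cube as a toy model; the function names are illustrative.

```python
import numpy as np

def mesh_area_volume(vertices, faces):
    """Surface area and (unsigned) volume of a closed triangle mesh.
    Volume sums the signed tetrahedra spanned by each face and the origin."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    cross = np.cross(b - a, c - a)
    area = 0.5 * np.linalg.norm(cross, axis=1).sum()
    volume = np.einsum('ij,ij->i', a, np.cross(b, c)).sum() / 6.0
    return area, abs(volume)

def global_ratios(vertices, faces):
    area, vol = mesh_area_volume(vertices, faces)
    compactness = vol**2 / area**3                       # definition in the text
    r = (3.0 * vol / (4.0 * np.pi)) ** (1.0 / 3.0)       # equal-volume sphere radius
    crinkliness = area / (4.0 * np.pi * r**2)
    return compactness, crinkliness

# unit cube: 8 vertices (index = 4x + 2y + z), 12 consistently oriented triangles
V = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
F = np.array([[0, 1, 3], [0, 3, 2], [4, 6, 7], [4, 7, 5], [0, 4, 5], [0, 5, 1],
              [2, 3, 7], [2, 7, 6], [0, 2, 6], [0, 6, 4], [1, 5, 7], [1, 7, 3]])
comp, crink = global_ratios(V, F)    # cube: area 6, volume 1
```

A sphere has crinkliness 1 by construction, so any other shape scores above 1; such scalars form cheap global feature vectors but, as noted above, cannot support partial matching.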

Global methods aim to capture the characteristics of the 3D object as a whole. There are a great number of approaches

proposed in order to describe the global shape of an object.

Local methods consider the local properties of 3D object around the neighbourhoods of points on the surface.

Curvature is an example of a local property that has also been used within the context of global methods. In that

context, such local properties are treated as a collection and all together they form a global descriptor for the object.

Some methods do not merge the local properties to form a global descriptor, therefore they are suited for partial

matching. Also, they can provide better descriptions of shapes because they capture more detailed information about

the object at the expense of computation time. These methods have been mostly used for object recognition in

cluttered environments and surface registration problems. Some of them have also been applied to the 3D object retrieval

problem. These methods do not require prior pose normalization of the models.

In this section, we discuss a number of geometry based techniques that are applied to 3D object retrieval classified

according to the geometric descriptor used.

3.3.1 Volumetric Error

These techniques are based on the observation that different objects occupy the volume in different ways. This cannot

be captured by a simple volume difference technique. Two objects might have the same total volume, though they are

not similar. Because of the nature of the comparison, all 3D objects have to be pose normalized before processing.

Some of these techniques are presented here.

Kaku et al. [74] propose a method based on the OBBTree data structure described by Gottschalk. First, all 3D

models in their database are pose normalized. Next, each 3D model is represented by a binary tree, where each node

in this tree represents the centre of an Oriented Bounding Box (OBB). Finally, the similarity measure between two 3D

models is a weighted combination of two similarity measures. The first similarity measure is based on the sum of

differences of the corresponding nodes on the trees, while the second one is based on the original models' aspect

ratios. This method outperformed the D2 shape distribution function [32] in the retrieval experiments conducted

by the authors.

Ichida et al. [75] present a system called ActiveCube with an interactive user interface for 3D object retrieval where

all the models, including the query model, are represented as voxels. Using cubes with 5 cm sides, the user generates the

query model, which is recognized automatically by the system in real time. In this system, the similarity measure

between two 3D models is derived by extracting the intersections between the voxel representations of the models.
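A minimal version of such a voxel-intersection similarity is the overlap (intersection over union) of two aligned occupancy grids; the sketch below illustrates the idea and is not the ActiveCube system's actual measure.

```python
import numpy as np

def voxel_similarity(a, b):
    """Overlap similarity of two aligned boolean voxel grids:
    intersection over union, 1.0 for identical occupancy."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

# toy grids: a 4x4x4 cube and the same cube shifted by one voxel
a = np.zeros((8, 8, 8), dtype=bool); a[2:6, 2:6, 2:6] = True
b = np.zeros((8, 8, 8), dtype=bool); b[3:7, 2:6, 2:6] = True
sim = voxel_similarity(a, b)    # overlap 3*4*4 voxels over a union of 80
```

Like all volumetric-error measures, this presumes the two models have already been pose normalized into the same grid.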

Sánchez-Cruz and Bribiesca [76] present a method capable of transforming one 3D object into another, where the

computation of this transformation is used as a dissimilarity measure between the two voxelized 3D objects. This

method relates the volumetric error between two objects to a transportation distance that counts the number of voxels

to be moved and how far to change one object into another one. The main disadvantage of this method is the high cost

of computing the transportation distance since generally most 3D objects possess a large number of voxels. Another


geometry-based similarity approach was proposed by Novotni and Klein [77], where a volumetric error is calculated

between one object and a sequence of offset hulls of another object for the purpose of 3D object retrieval. The main

disadvantage of this approach is that the applied dissimilarity measure is not symmetric and disobeys the triangle

inequality.

3.3.2 Weighted Point Set

These methods generate a set of 3D points from the 3D object. The points are weighted in some manner. Different

similarity measures have been proposed to match these point sets.

The weighted point set method [78] compares two 3D objects represented by polyhedral meshes. A shape signature of

the 3D object is defined as a set of points that consists of weighted salient points from the object. The authors propose

three different ways of generating a weighted point set, given a pose normalized 3D polygon model, which is placed

in a 3D grid. Each non-empty grid cell contains one salient point. The selection of the salient point and its weight is

done in various ways: (1) pick the point in each cell that has the highest Gaussian curvature, and assign the curvature

value as the weight for the point, (2) pick the area-weighted mean of the vertices in the cell as the point, and a

measure of facet normal variation as the weight, (3) compute the centre of mass of all vertices in the cell and assign 1

to the weight. The similarity measure they use is a variation of the earth mover's distance which, unlike that measure,

satisfies the triangle inequality. The authors report better results compared to the shape distribution

methods proposed by Osada et al. [32] on their own database.
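Variant (3) above, one centre-of-mass point with weight 1 per non-empty grid cell, is the simplest to sketch. The grid resolution and the random toy vertices below are illustrative choices.

```python
import numpy as np
from collections import defaultdict

def weighted_point_set(vertices, cells=8):
    """Variant (3) from the text: place the (pose-normalized) model in a grid
    and take, per non-empty cell, the centre of mass of its vertices, weight 1."""
    lo = vertices.min(axis=0)
    span = np.ptp(vertices, axis=0).max() + 1e-12       # uniform cell size
    idx = np.floor((vertices - lo) / span * cells).clip(0, cells - 1).astype(int)
    buckets = defaultdict(list)
    for v, i in zip(vertices, map(tuple, idx)):
        buckets[i].append(v)
    points = np.array([np.mean(vs, axis=0) for vs in buckets.values()])
    weights = np.ones(len(points))
    return points, weights

# toy model: 500 random surface samples in the unit cube
rng = np.random.default_rng(0)
pts, w = weighted_point_set(rng.random((500, 3)), cells=4)
```

Variants (1) and (2) differ only in which point is kept per cell (highest Gaussian curvature, or area-weighted mean) and in the weight assigned (curvature value, or facet-normal variation).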

Shamir et al. [79] propose a shape descriptor consisting of a hierarchy of weighted point sets, representing spherical

shape approximations such that the point sets to be matched are obtained for each 3D model by decomposing the

model into a coarse-to-fine hierarchy of spheres. The point sets, therefore, consist of sphere radii and associated

centres and can be matched by a custom coarse-to-fine algorithm involving exhaustive search on a coarse level and

graph matching techniques on finer levels in the multiresolution representation.

Dey et al. [80] propose a 3D object descriptor consisting of weighted 3D points. First, the 3D object is decomposed

into its components given by a point sample. Then, each component is represented by a point in the weighted point

set, where the weight of the point encodes the volume of the corresponding component. Finally, the weighted point

sets are matched by a measure that disobeys the triangle inequality.

Also, Funkhouser et al. [81] presented a weighted point shape matching descriptor based on the sum of squared

distances for models aligned in the same coordinate system. Squared Euclidean distance transforms of the models are

used to support the efficiency of the sum of squared distances' computation, while the standard Singular Value

Decomposition (SVD) technique is applied to support efficient retrieval through converting the shape descriptors to a

low-dimensional subspace. The main advantage of this descriptor is that it supports partial matching by associating

weights to the points in the user-selected part of the query model. This shape descriptor was shown to outperform the

3D harmonics approach [3] in addition to the radial extent function [82] through the experimental results conducted

using the Princeton benchmark [14].
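The sum-of-squared-distances idea can be sketched for two aligned voxel models using the Euclidean distance transform, as the text describes: each occupied voxel of one model contributes its squared distance to the nearest occupied voxel of the other. The symmetrized form and the toy grids below are illustrative simplifications.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def ssd_dissimilarity(a, b):
    """Sum-of-squared-distances matching of two aligned boolean voxel models,
    computed efficiently via the Euclidean distance transform (EDT)."""
    dist_to_b = distance_transform_edt(~b)   # distance field to b's occupied voxels
    dist_to_a = distance_transform_edt(~a)
    return (dist_to_b[a] ** 2).sum() + (dist_to_a[b] ** 2).sum()

# toy models: a small cube, itself, and a shifted copy
a = np.zeros((10, 10, 10), dtype=bool); a[2:5, 2:5, 2:5] = True
b = a.copy()
same = ssd_dissimilarity(a, b)       # identical models -> zero dissimilarity
c = np.roll(a, 2, axis=0)
shifted = ssd_dissimilarity(a, c)    # misaligned copy -> positive dissimilarity
```

Precomputing the distance fields once per model is what makes the pairwise comparisons cheap; partial matching then amounts to weighting the query voxels before summation.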

3.3.3 Geometrical Moments

3D moments are a popular type of descriptor that has received the attention of several researchers. The usage of

moments as a means of description has a tradition in image retrieval and classification. Thus, moments have been

used in some of the first attempts to define feature vectors for 3D object retrieval. Statistical moments are scalar

values that describe a distribution f. Parameterized by their order, moments represent a spectrum from coarse-level to

detailed information of the given distribution [83]. In the case of 3D objects, an object may be regarded as a

distribution $f(x, y, z)$ over $\mathbb{R}^3$, and the moment of order $n = i + j + k$ in continuous form can be given as:

$$M_{ijk} = \iiint x^i \, y^j \, z^k \, f(x, y, z) \; dx \, dy \, dz \qquad (1)$$

It is well known that the complete (infinite) set of moments uniquely describes a distribution and vice versa. In its discrete form, objects are taken as finite point sets $P$ in 3D, and the moment formula becomes $M_{ijk} = \frac{1}{|P|} \sum_{p \in P} p_x^i \, p_y^j \, p_z^k$. Because moments are not invariant with respect to translation, rotation, and scale of the considered

distribution, appropriate normalization should be applied before moment calculation. When given as a polygon mesh,

candidates for input to moment calculation are the mesh vertices, the centres of mass of triangles, or other object

points sampled by some scheme. A FV can then be constructed by concatenating several moments, for example, all

moments of order up to some n.
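The discrete moment formula and the concatenated feature vector can be sketched as follows. This is a minimal illustration: translation and scale are normalized as the text recommends, while rotation normalization (e.g. by PCA alignment) is omitted.

```python
import numpy as np

def moment(points, i, j, k):
    """Discrete geometrical moment M_ijk of a 3D point set: the average of
    x^i * y^j * z^k over all sample points (discrete form of Eq. (1))."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.mean(x**i * y**j * z**k)

def moment_feature_vector(points, max_order=3):
    """FV built by concatenating all moments of order n = i + j + k up to
    max_order, after normalizing for translation and scale."""
    p = points - points.mean(axis=0)                 # centroid to origin
    p = p / np.mean(np.linalg.norm(p, axis=1))       # unit average radius
    fv = []
    for n in range(1, max_order + 1):
        for i in range(n + 1):
            for j in range(n - i + 1):
                fv.append(moment(p, i, j, n - i - j))
    return np.array(fv)

# Usage: the input could be mesh vertices, triangle centroids, or sampled points.
pts = np.random.default_rng(0).normal(size=(1000, 3))
fv = moment_feature_vector(pts)    # 3 + 6 + 10 = 19 moments for max_order = 3
```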

Studies that employ moments as descriptors for 3D retrieval include Paquet and Rioux [84], where moments are calculated from the centres of mass of all triangles, weighted by the triangle mass, as well as Vranić and


Saupe [85] where moments are calculated for object points sampled uniformly with a ray-based scheme, and Paquet

et al. [83] where moments are calculated from the centres of mass (centroids) of all object faces. Vranić and Saupe

[85] compare the retrieval performance of ray-based with centroid-based moments and conclude that the former is

more effective. Also, Zhang and Chen [86] propose efficient ways to compute the signed volume and moments of 3D

polygon meshes. Another publication that proposed the usage of moments for 3D object retrieval is Elad et al. [87].

Here, the authors uniformly sample a certain number of points from the object's surface for moment calculation.

Special to their analysis is the usage of relevance feedback to adjust the distance function employed on their moment-

based descriptor. While in most systems a static distance function is employed, here it is proposed to interactively

adapt the metric. A user performs an initial query using a feature vector of several moments under the Euclidean

norm. She marks relevant and irrelevant objects in a prefix of the complete ranking. Then, via solving a quadratic

optimization problem, weights are calculated that reflect the feedback so that, in the new ranking using the weighted

Euclidean distance, relevant and irrelevant objects (according to the user input) are discriminated by a fixed distance

threshold. The user is allowed to iterate through this process until a satisfactory end result is obtained. The authors

conclude that this process is suited to improve search effectiveness.
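The relevance-feedback loop just described can be illustrated with a small sketch. The actual method of Elad et al. [87] solves a quadratic optimization problem for the weights; the `feedback_weights` function below is a simplified, hypothetical stand-in that merely conveys the idea of re-weighting the Euclidean metric from marked relevant and irrelevant results.

```python
import numpy as np

def weighted_euclidean(a, b, w):
    """Weighted Euclidean distance d_w(a, b) = sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return np.sqrt(np.sum(w * (a - b) ** 2))

def feedback_weights(query, relevant, irrelevant, eps=1e-6):
    """Heuristic stand-in for the quadratic program of [87]: emphasize feature
    dimensions on which relevant results stay close to the query while
    irrelevant ones differ."""
    rel = np.mean((relevant - query) ** 2, axis=0)    # small where relevant agree
    irr = np.mean((irrelevant - query) ** 2, axis=0)  # large where irrelevant differ
    w = irr / (rel + eps)
    return w / w.sum()                                # normalized weight vector

# Usage with hypothetical moment-based feature vectors:
rng = np.random.default_rng(1)
q = rng.normal(size=8)
rel = q + 0.01 * rng.normal(size=(5, 8))   # marked relevant: near the query
irr = rng.normal(size=(5, 8))              # marked irrelevant: unrelated
w = feedback_weights(q, rel, irr)
```

In the new ranking, distances are computed with `weighted_euclidean(q, x, w)`, and the user may iterate the marking step until satisfied.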

3.3.4 Shape Spectrum

Zaharia and Prêteux [88] present a descriptor for 3D object retrieval proposed within the MPEG-7 framework for

multimedia content description. The descriptor reflects curvature properties of 3D objects. The shape spectrum

descriptor is defined as the distribution, over points on the surface of a 3D object, of the shape index, which is a function of the two principal curvatures. The shape index is a scaled version of the angular coordinate of a polar representation

of the principal curvature vector, and it is invariant with respect to rotation, translation and scale by construction. It is

a local geometrical attribute of a 3D surface. Figure 12 illustrates some elementary shapes with their corresponding

shape index values. The 3D shape spectrum descriptor is a continuous function and for use with polygonal models,

the descriptor is estimated. This descriptor is sensitive to topological changes meaning that objects differing only in

some pose (e.g. a human with arms out and arms by the sides) will be treated differently. Experiments conducted by

the authors with this descriptor on several 3D databases quantitatively show good retrieval results.
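The shape index underlying the spectrum can be computed directly from the two principal curvatures. The sketch below uses the convention that maps the index to [0, 1] (as in the MPEG-7 descriptor; Koenderink's original formulation maps to [-1, 1]); estimating the curvatures on a mesh is assumed to have been done elsewhere, and the bin count is an illustrative choice.

```python
import numpy as np

def shape_index(k1, k2):
    """Shape index in [0, 1] from the two principal curvatures.
    Invariant to rotation, translation and scale by construction.
    Undefined for umbilical points (k1 == k2); planar patches are commonly
    excluded or given a dedicated bin in the shape spectrum."""
    k1, k2 = np.maximum(k1, k2), np.minimum(k1, k2)   # enforce k1 >= k2
    return 0.5 - (1.0 / np.pi) * np.arctan2(k1 + k2, k1 - k2)

def shape_spectrum(k1, k2, bins=100):
    """Shape spectrum: histogram of shape index values over surface points."""
    s = shape_index(np.asarray(k1, float), np.asarray(k2, float))
    hist, _ = np.histogram(s, bins=bins, range=(0.0, 1.0), density=True)
    return hist

# A saddle point (k1 = -k2) sits at the middle of the scale.
s_saddle = shape_index(1.0, -1.0)
spectrum = shape_spectrum([1.0, 2.0, -1.0], [0.5, 2.0, -1.0])
```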

Also, Shum et al. [89] study the idea of using surface curvature as a description for 3D models. First, a regularly

tessellated spherical mesh representation is built having approximate uniform distribution with known connectivity

among mesh nodes. Next, for each node of the fitted mesh, the averaged curvature information is computed

depending on the neighbour nodes. Finally, the task of comparing two 3D models is done through comparing their

two curvature distributions generated from deformed meshes. They used a distance function as a shape metric

between two objects that proved to be stable under noise. The main advantage of this approach is its rotation invariance; the main drawback is that the quality of approximation of a polyhedral or free-form surface depends on the number of patches chosen. The authors' experiments showed the method to be robust provided a sufficiently fine tessellation is adopted.

Figure 12 Shape index values for some elementary shapes. [88]

3.3.5 Extended Gaussian Image

The distribution of the normals of the polygons that form a 3D object can be used to describe its global shape. One

way to represent this distribution is using the Extended Gaussian Image (EGI) [90] [91]. The EGI is a mapping from

the 3D object to the Gaussian sphere such that the 3D model is represented by a spherical function (Figure 13). To

compute the EGI of a 3D object, the normal vectors of all polygons of the 3D objects are mapped onto the respective

point of the Gaussian sphere that has the same normal as the polygon. To build a descriptor from this mapping, the

Gaussian sphere is partitioned into R × C cells (by using R different longitudes and C−1 different latitudes), where

each cell corresponds to a range of normal orientations. The number of normals mapped to a cell gives the value of that cell. All cell values are collected into an R × C matrix, which is called the signature of the 3D object. The similarity between two object signatures $a$ and $b$ is given by the cell-wise comparison $\sum_{i=1}^{R} \sum_{j=1}^{C} \left( |a_{ij}| - |b_{ij}| \right)$ [91]. Retrieval


performance studies were performed in Kazhdan et al. [92] and Funkhouser et al. [3]. Also, its performance was

evaluated in recognition of aligned human head models in Ip and Wong [91]. The extended Gaussian image has

several important properties that make it useful for shape analysis and matching. First, it is invariant to translation.

Second, the EGI scales and rotates with the model in the 3D space. Third, for convex models the EGI is an invertible

representation. Fourth, EGI is reported to have high capability to distinguish between man-made objects and natural

objects [14].
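A minimal sketch of building an EGI signature, assuming per-face unit normals and areas are available. The longitude/latitude binning and the L1-style cell-wise comparison below are illustrative choices rather than the exact formulas of [90][91].

```python
import numpy as np

def egi_signature(normals, areas, R=8, C=16):
    """Extended Gaussian Image: bin face normals on the Gaussian sphere into
    an R x C matrix (R longitude bands x C latitude bands here), each cell
    accumulating the total area of the faces whose normals fall in it."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    theta = np.arccos(np.clip(n[:, 2], -1.0, 1.0))     # latitude in [0, pi]
    phi = np.arctan2(n[:, 1], n[:, 0]) + np.pi         # longitude in [0, 2*pi]
    row = np.minimum((phi / (2 * np.pi) * R).astype(int), R - 1)
    col = np.minimum((theta / np.pi * C).astype(int), C - 1)
    sig = np.zeros((R, C))
    np.add.at(sig, (row, col), areas)                  # area-weighted votes
    return sig / sig.sum()                             # normalize for scale

def egi_dissimilarity(a, b):
    """Simple cell-wise dissimilarity between two signatures (the exact
    comparison formula of [91] is not reproduced here)."""
    return np.abs(a - b).sum()

# Usage: the six unit-area faces of an axis-aligned cube.
cube_normals = np.array([[1., 0, 0], [-1, 0, 0], [0, 1, 0],
                         [0, -1, 0], [0, 0, 1], [0, 0, -1]])
sig = egi_signature(cube_normals, np.ones(6))
```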

The Complex Extended Gaussian Image (CEGI) is a generalization of the EGI, proposed by Kang et al. in [93].

Rather than just voting with a real value equal to the area of the triangle, this method votes with a complex number

whose amplitude is equal to the area of the triangle and whose complex phase is equal to the normal distance of the

triangle from the origin. This approach results in a representation of a 3D model that rotates and scales with the 3D

model, and which exhibits a simple phase-shift when acted on by translation. Thus, it is particularly well suited for

applications in which one would like to register two similar models in different poses, as the challenge of solving for

the optimal rotation and translation can be decoupled by first solving for the optimal rotation using the complex norm

of the CEGI, and then separately solving for the optimal translation, effectively decomposing a 6D optimization

problem into two independent 3D optimization problems.

Figure 13 Mapping from object normals to the Gaussian sphere. [16]

3.3.6 Canonical 3D Morphing (Projection)

The idea behind the projection based methods is that the energy required to morph or transform a 3D object into

another could be a measure of similarity between these two objects. In the context of 3D object retrieval, every model

in the database is morphed into a canonical space or shape (e.g., a sphere) and the amount of energy required to do

this morphing is used as a descriptor for that model and during retrieval the descriptors are compared. There are

different ways to define this energy. This section presents some of these methods.

Zaharia and Prêteux have developed several successive versions of the Hough Transform for use in 3D. The original

development [94] produced the Optimised 3D Hough Transform Descriptor (O3DHTD), followed in later work [95] by the

Canonical 3D Hough Transform Descriptor (C3DHTD). The Hough Transform morphs an object into Hough Space

using an accumulator which gathers evidence of how similar the query is to the reference. For each object, a look up

table is generated to perform this mapping. Similarity matching is performed by comparing the tables treated as

histograms. The 3D Hough Transform requires calculating a Hough Transform (HT) from all possible orientations of

the x, y and z axes from views down each axis, however this number can be reduced by taking into account the fact

that some pairs of orientation are equivalent, and that other views can be generated through a simple geometric

transform. This culminated in the O3DHTD based on three views. The C3DHTD reduced this to a single HT by

defining the object in such a way that all views become equivalent. Retrieval experiments were conducted, contrasting

the proposed descriptor with the shape spectrum and Extended Gaussian Image descriptors, attributing best

performance to the C3DHT descriptor. The largest disadvantage of using a Hough Transform is that it requires a large

amount of processing to provide a comparison as the computationally expensive part of populating the accumulator

cannot be pre-computed. It also requires normalization for rotation, scale and translation.

Leifman et al. [96] describe a sphere projection algorithm. They first normalize the pose of the models in their

database to ensure invariance to similarity transformations. The energy that is required to morph a model into its

bounding sphere with radius $R$ is defined as $E = \int_{S} F \cdot \mathrm{dist}(s) \, ds$, where $F$ is the applied force and $\mathrm{dist}$ is the distance between


the object surface and the bounding sphere. The force is assumed to be constant for all points on the surface and along

the distance it's applied. Therefore, the energy is proportional to the distance between the sphere and the model's

surface. The authors reported experimental results on a database of 1068 random objects gathered from the Internet.

258 objects were manually categorized into 17 classes (people, missiles, cars and such). Their experiments gave better

results compared to shape moments [97] and shape distributions [32] on most classes, except for those that do not have a common global shape, simply because this method captures only global shape properties.
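Under the constant-force assumption, the morphing energy reduces to the distance from each surface sample to the bounding sphere. The histogram layout below is an illustrative choice, not the exact descriptor of [96]; pose normalization is reduced here to centring at the centroid.

```python
import numpy as np

def morphing_energy_profile(points, bins=64):
    """Sketch of a sphere-morphing descriptor: with constant force, the energy
    to push a surface sample out to the bounding sphere of radius R is
    proportional to its distance R - |p| from the sphere.  A histogram of
    these normalized distances serves as the descriptor."""
    p = points - points.mean(axis=0)          # translation normalization
    r = np.linalg.norm(p, axis=1)
    R = r.max()                               # bounding sphere radius
    d = (R - r) / R                           # distance surface -> sphere, in [0, 1)
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 1.0), density=True)
    return hist

# Usage with a random point cloud standing in for surface samples:
pts = np.random.default_rng(2).normal(size=(500, 3))
desc = morphing_energy_profile(pts)
```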

Yu et al. [98] propose a similar method which is also based on the idea of morphing the models into a sphere. They

generate a distance map from the object to the bounding sphere. Pose normalization techniques are used to put the models

in a canonical coordinate system before calculating the distance map. They also apply the Fast Fourier Transform

(FFT) on these maps in order to handle the possible misalignments even after the pose normalization. The similarity

measure they use is a weighted normalized Euclidean distance of the Fourier transformed maps. The authors report

good experimental results on a database of 52 models divided into 34 categories. No comparisons with other methods

are given.

3.3.7 Heat Kernel Signatures (HKS)

Lately, diffusion geometry has been used extensively for 3D object retrieval [99] [100] [101] [102] [103] [104].

Diffusion geometry depends on the heat diffusion equation which governs the conduction of heat u on the surface X,

$$\left( \Delta_X + \frac{\partial}{\partial t} \right) u = 0 , \qquad (2)$$

where $\Delta_X$ denotes the positive semi-definite Laplace–Beltrami operator, a Riemannian equivalent of the Laplacian. The solution $u(x, t)$ of the heat equation with the initial condition $u(x, 0) = u_0(x)$ (and respective boundary conditions if $X$ has a boundary) describes the amount of heat on the surface at point $x$ in time $t$. The fundamental solution of (2) with a point heat distribution $u_0(x) = \delta(x - y)$ as initial condition is called the heat kernel and is denoted by $K_t(x, y)$ (Figure 14). The heat kernel is invariant under isometric transformations and stable under

small perturbations to the isometry. In addition, the heat kernel fully characterizes 3D objects up to an isometry and

represents increasingly global properties of the 3D objects with increasing time [105] [106] [107].

Figure 14 Values of $K_t(x, x)$ mapped on the shape (left) and values of $K_t(x, y)$ for three different choices of y (marked with black dots in the three rightmost figures). The value t = 1024 is used. Hotter colors represent smaller values. [152]

Sun et al. [108] propose using the diagonal of the heat kernel as a local descriptor, referred to as the Heat Kernel

Signatures (HKS). For each point $x$ on the shape, its heat kernel signature is an n-dimensional descriptor vector of the form

$$p(x) = c(x) \left( K_{t_1}(x, x), \ldots, K_{t_n}(x, x) \right) , \qquad (3)$$

where $c(x)$ is chosen in such a way that $\| p(x) \|_2 = 1$.

The SHREC benchmark [109] verifies that the HKS is currently a state-of-the-art feature descriptor. The HKS

descriptor has several advantages, which make it applicable for 3D object retrieval applications. First, HKS is

deformation-invariant (Figure 15 (b), left). Second, it captures information about the neighbourhood of a point on

the shape at a scale defined by $t$. It captures differential information in a small neighbourhood of $x$ for small $t$, and global information about the shape for large values of $t$. Thus, the n-dimensional feature descriptor vector p(x) can be

seen as analogous to the multi-scale feature descriptors used in the computer vision community. Third, for small

scales t, the HKS descriptor takes into account local information, which makes topological noise have only local



effect (Figure 15 (b), right). Finally, the computation of the HKS descriptor relies on the computation of the first

Eigenfunctions and Eigenvalues of the Laplace-Beltrami operator, which can be done efficiently and across different

shape representations.
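The eigendecomposition-based computation of the HKS can be sketched as follows: the diagonal heat kernel value is $k_t(x, x) = \sum_i e^{-\lambda_i t} \, \phi_i(x)^2$. The toy graph Laplacian below is a hypothetical stand-in for a cotangent mesh Laplace-Beltrami operator, and the time samples are arbitrary illustrative choices.

```python
import numpy as np

def heat_kernel_signature(evals, evecs, times):
    """HKS per vertex from the first eigenpairs of the Laplace-Beltrami
    operator: k_t(x, x) = sum_i exp(-lambda_i * t) * phi_i(x)^2.
    evals: (m,) eigenvalues; evecs: (num_vertices, m) eigenfunctions sampled
    at the vertices; times: (n,) diffusion times t_1..t_n."""
    hks = np.stack([(np.exp(-evals * t) * evecs**2).sum(axis=1) for t in times],
                   axis=1)                      # shape (num_vertices, n)
    # Normalize each vertex's descriptor to unit length, as in Eq. (3).
    return hks / np.linalg.norm(hks, axis=1, keepdims=True)

# Usage: the Laplacian of a 3-cycle graph stands in for a mesh operator
# (a real pipeline would use cotangent weights and a sparse eigensolver).
L = np.diag([2., 2., 2.]) - (np.ones((3, 3)) - np.eye(3))
evals, evecs = np.linalg.eigh(L)
sig = heat_kernel_signature(evals, evecs, times=np.array([0.1, 1.0, 10.0]))
```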

On the other hand, a disadvantage of the HKS is its dependence on the global scale of the shape; for that reason a scale-invariant version of the HKS, called Scale-Invariant HKS (SI-HKS), is proposed in [110] using local scale normalization based on the properties of the Fourier transform. In addition, a Volumetric Heat Kernel Signature (VHKS) was proposed in [111], where the idea of the heat kernel signature is extended to volumes. The two approaches (HKS and SI-HKS) are distinguished by their discriminativity and efficient computability.

Figure 15 An RGB visualization of the first three components of the HKS descriptor of different shapes (figure (a)) and transformations of the same shape (figure (b)). Four leftmost shapes are approximately isometric, and the descriptor appears to be invariant to these deformations. The rightmost shape has different topology (hand and leg are glued at point marked with black dot). Note that though the descriptor changes in the place of the topological change, the discrepancy is localized. [152]

Another approach related to HKS is Wave Kernel Signature (WKS) [112] which arises from studying the Schrödinger

equation governing the dissipation of quantum mechanical particles on the geometric surface. It is based on the

Laplace–Beltrami operator and carries a physical interpretation. In contrast to the HKS, the WKS clearly separates

influences of different frequencies, treating all frequencies equally in order to solve the problem of poor feature

localization of the heat kernel descriptor. Appropriate parameterization of the WKS is determined by a theoretical

stability analysis aiming at features which are both highly informative yet robust to nonisometric perturbations of the

shape. Experimental results confirm that due to a better separation of scales and a better access to fine scale

information, the WKS allows for substantially more accurate feature matching than the HKS. Even with strongly

perturbed data, the WKS can still correctly detect feature correspondences.

3.4 Statistics Based Methods

Statistics based techniques sample points on the surface of the 3D model and extract characteristics from the sampled

points. These characteristics are organized in the form of histograms or distributions representing their frequency of

occurrence. Usually, the obtained histogram is represented as a feature vector where each coordinate value

corresponds to a bin of the histogram. Therefore, the term “histogram based methods” is used interchangeably with

the term “statistics based methods” in the literature. Similarity is determined by a comparison of histograms by means of

a distance function. The accuracy and effectiveness of histogram-based techniques depend on the number of sampled

points. A larger number of sampled points results in higher accuracy. However, efficiency is inversely related to

the number of sampled points.

An overview of most of the statistics based approaches proposed in the literature will be covered in this section.

3.4.1 Shape Histograms

Motivated by the challenge of applying shape matching techniques to protein matching, Ankerst

et al. [113] developed three different methods for representing 3D models in terms of the distribution of surface

points as a function of distance from the centre of mass and spherical angle. When only the distance from the surface

is used the Shells descriptor is obtained, when only the spherical angle is used the Sectors descriptor is obtained, and

when both are used the combined descriptor (Spiderweb) is obtained. Shape histograms are based on a partitioning of

space in which 3D models reside. The complete space is decomposed into disjoint cells, which correspond to the bins

of the histograms. As a pre-processing step, 3D models are made invariant to translation by moving the origin to the

centroid. Figure 16 illustrates these space partitioning techniques.


a) Shells.

The space is decomposed into concentric shells around the centre point where the 3D model is represented by a

one-dimensional histogram, giving the distribution of distances of surface points from the centre of mass. This

representation is translation and rotation invariant since the distance of a point from the centre of mass does not

change when the model is translated or rotated about its centre. However, it requires normalization for scale.

The Shells descriptor is a specific instance of an approach for obtaining rotation-invariant representations of 3D

models, and can be generalized to obtain a two-dimensional rotation-invariant representation with improved

retrieval performance.

b) Sectors.

The space is decomposed into sectors that emerge from the centre point of the model where the 3D model is

represented by a spherical histogram, giving the distribution of surface points as a function of spherical angle.

This representation scales and rotates with the model and exhibits no information loss when the initial model is

star-shaped.

c) Spiderweb (Combined).

Combining the Shells and Sectors representation, Ankerst et al. provide a shape descriptor that represents a 3D

model by a collection of spherical functions using the intersection of shells and sectors. Each spherical function is

obtained by intersecting the model with a thin spherical shell centred at the origin and then computing the Sectors

representation of the intersection. This combined descriptor represents more detailed information and has higher

dimensionality than pure shell models and pure sector models. Since the resolution of the decomposition is a

parameter, the number of dimensions can be tailored for a particular application. The resultant descriptor gives

rise to a three-dimensional representation that rotates with the model.
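The Shells and Sectors decompositions can be sketched directly from a set of surface samples. Bin counts and the scale normalization by maximum radius are illustrative choices; the combined Spiderweb descriptor would index bins by both quantities at once.

```python
import numpy as np

def shells_descriptor(points, n_shells=32):
    """Shells histogram: distribution of surface-point distances from the
    centre of mass.  Translation and rotation invariant; distances are
    divided by the maximum radius to normalize for scale."""
    p = points - points.mean(axis=0)          # move origin to the centroid
    r = np.linalg.norm(p, axis=1)
    hist, _ = np.histogram(r / r.max(), bins=n_shells, range=(0.0, 1.0))
    return hist / hist.sum()                  # relative frequencies

def sectors_descriptor(points, n_lon=8, n_lat=4):
    """Sectors histogram: distribution of surface points over spherical angle."""
    p = points - points.mean(axis=0)
    r = np.maximum(np.linalg.norm(p, axis=1), 1e-12)
    theta = np.arccos(np.clip(p[:, 2] / r, -1.0, 1.0))
    phi = np.arctan2(p[:, 1], p[:, 0]) + np.pi
    i = np.minimum((phi / (2 * np.pi) * n_lon).astype(int), n_lon - 1)
    j = np.minimum((theta / np.pi * n_lat).astype(int), n_lat - 1)
    hist = np.zeros((n_lon, n_lat))
    np.add.at(hist, (i, j), 1)
    return (hist / hist.sum()).ravel()

# Usage: the Shells descriptor is unchanged by a rotation of the point set.
pts = np.random.default_rng(3).normal(size=(400, 3))
a = 0.5
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
```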

Figure 16 Shells and sectors as basic space decompositions for shape histograms. In each of the 2D

examples, a single bin is marked. [113]

These descriptors were evaluated by Shilane et al. [14], who showed that the combined descriptor performed quite

well, whilst the shells descriptor performed worst out of the descriptors evaluated. The sectors descriptor gave better

performance than the D2 shape distribution that will be explained in the next section, but still significantly worse than

the combined descriptor. A discrete version of the Shape Histogram that measures the occupancy of voxel cells has

also been used as a descriptor in [114], [115] and [116].

A similar search technique for mechanical parts using histograms was proposed in [117]. As a pre-processing step, the

models are normalized into a canonical form and voxelized. The 3D space is divided into axis-parallel, equally sized

partitions. Each of these partitions is assigned to one or several bins in a histogram depending on the specific

similarity model. By scaling the number of partitions, the dimensionality of the feature vector is controlled. However,

there is a trade-off between dimensionality and accuracy of representation.

3.4.2 3D Shape Contexts

Körtgen et al. [118] combine the work on Shape Histograms [119] with Shape Contexts [120] and provide a set of

descriptors called 3D Shape Contexts which are applied for 3D object retrieval and matching. The shape context of a

point p is defined as a coarse histogram of the relative coordinates of the remaining surface points. The bins of the

histogram are defined by the overlay of concentric shells around the centroid of the model and sectors emerging from

the centroid. This histogram is known as the shape context. Matching consists of a local matching stage and a global

matching stage. In the local matching stage, for all points p the best matching point q is found on the other shape. In

the global matching stage, correspondences between similar sample points on the two shapes are found.

Compared to the other methods presented in this paper, matching 3D shape contexts is less efficient, efficient indexing is

not straightforward, and the obtained dissimilarity measure does not obey the triangle inequality.


3.4.3 Shape Distributions

Shape Distribution techniques are not directly based on measurements extracted from the 3D models but on the

distributions of those measurements. This section will cover some of these techniques.

Shape distributions are carefully introduced and compared by Osada et al. [32] where the shape of a 3D object is

characterized as a probability distribution (histogram) sampled from a shape function. The primary step of this

technique is the selection of the shape function. Some of the shape functions studied by the authors include the distribution of angles formed by three random points on the surface of a 3D object, the distribution of Euclidean distances between the object's centroid and random points on the surface, and other properties such as area and volume measurements between randomly selected surface points. In this work, the similarity measure between two 3D models

is defined by a metric that calculates the distance between distributions (e.g., Minkowski distances). Retrieval

experiments yielded that the best retrieval results were achieved using the D2 distance function (distance between

pairs of points on the surface, see Figure 17) and using the L1 norm of the probability density histograms which are

normalized by aligning the mean of each of the two histograms to be compared.
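The D2 distribution and the L1 comparison reported as the best combination can be sketched as follows. The points are assumed to be pre-sampled uniformly from the surface, and normalizing distances by their mean stands in for the mean alignment of the histograms described in [32]; pair count and bin count are illustrative.

```python
import numpy as np

def d2_distribution(points, n_pairs=100000, bins=64, rng=None):
    """D2 shape distribution: histogram of Euclidean distances between
    random pairs of surface points, normalized by the mean distance."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    d = d / d.mean()                           # align distribution means
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 3.0), density=True)
    return hist

def l1_distance(h1, h2):
    """L1 norm between two D2 histograms, the best-performing comparison
    reported by Osada et al."""
    return np.abs(h1 - h2).sum()

# Usage with random point clouds standing in for surface samples:
pts_a = np.random.default_rng(4).normal(size=(2000, 3))
pts_b = np.random.default_rng(5).normal(size=(2000, 3)) * 2.0
h_a = d2_distribution(pts_a, rng=0)
h_b = d2_distribution(pts_b, rng=0)
```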

Ohbuchi et al. [121] describe a method that computes a number of statistics along the principal axes of the 3D models. The method works on polygon mesh models. First, they align the models with respect to their principal axes and

for each of these axes they compute the following histograms along the axes: (1) the moment of inertia about the axis,

(2) the average distance to surfaces from the axis, (3) the variance of distance to surfaces from the axis. This

procedure generates 9 feature vectors which are concatenated to form one feature vector per model. They use

Euclidean distance and Elastic-matching distance for similarity comparisons. Their experiments show that the method

performs well on rotationally symmetric models only.

Figure 17 D2 shape distributions of five tanks (gray curves) and six cars (black curves). [32]

For comparing 3D CAD solid models, the D2 descriptor is modified by Ip et al. [122] where the D2 shape function

[32] is used and shape distributions are generated. Three separate distributions are used to represent one model. The

first distribution (IN) is calculated for all pairs of points where the line connecting them lies inside the model. The

second distribution (OUT) considers pairs of points where the line connecting them lies outside the model. The third

distribution (MIXED) is calculated for all pairs of points where the line connecting them passes both inside and

outside the model. The dissimilarity measure between two 3D models is defined by a weighted combination of their

dissimilarity for the D2, IN, OUT and MIXED distributions. This method can only be applied to volume models since

it requires classifying a line segment as inside or outside the model. This approach was extended by Ip et al. [123] in

order to automatically categorize a large model database, given a categorization on a number of training examples

from the database.

Rea et al. [124] present a descriptor characterizing the concavities of a 3D model, where the difference between the object's shape distribution and the shape distribution of its convex hull is used for encoding the object's features. Ohbuchi et

al. [30] propose another variation of the D2 shape distribution function [32]. Two 2D histograms Angle-Distance

(AD) and Absolute Angle-Distance (AAD) are generated, taking into account surface orientation. The AD histogram

is computed by measuring both the distance between a pair of random points and the angle formed by the surfaces on

which the pair of points is located. On the other hand, the AAD histogram is computed as the absolute value of the

inner product of the surface normal vectors to increase robustness. The AD histogram is suitable for models that have

properly oriented surfaces with a consistent representation, while the AAD histogram is more suitable for models

having inconsistently oriented surfaces. In the authors' experiments, this approach was found to outperform the D2 shape distribution function at about 1.5 times the computational cost. Ohbuchi et al.

[125] improved this method by a multi-resolution approach where a number of alpha-shapes are computed at different


scales, and for each alpha-shape the Absolute Angle-Distance descriptor is computed. In the authors' experiments, this approach was found to outperform the Angle-Distance descriptor [30], at the cost of the high processing time required for computing the alpha-shapes. Also, Liu et al. [126] investigate another extension

of the D2 shape distribution function, where they use a thickness histogram to estimate thickness of a 3D model from

all directions.
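The AAD variant described above can be sketched as a 2D histogram over random point pairs, assuming unit surface normals are available for the sampled points; pair and bin counts are illustrative choices.

```python
import numpy as np

def aad_histogram(points, normals, n_pairs=50000, d_bins=32, a_bins=8, rng=None):
    """Absolute Angle-Distance (AAD) 2D histogram: for random point pairs,
    bin the pair distance against the absolute inner product of the two
    surface normals (robust to inconsistently oriented surfaces)."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    d = d / d.max()                                            # scale to [0, 1]
    a = np.clip(np.abs(np.sum(normals[i] * normals[j], axis=1)), 0.0, 1.0)
    hist, _, _ = np.histogram2d(d, a, bins=(d_bins, a_bins),
                                range=((0.0, 1.0), (0.0, 1.0)))
    return hist / n_pairs                                      # relative frequencies

# Usage with random points and random unit normals standing in for a mesh:
pts = np.random.default_rng(6).normal(size=(1000, 3))
nrm = np.random.default_rng(7).normal(size=(1000, 3))
nrm /= np.linalg.norm(nrm, axis=1, keepdims=True)
h = aad_histogram(pts, nrm, rng=0)
```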

Rea et al. [127] propose a Surface Partitioning Spectrum (SPS) distribution, particularly designed for retrieval and

indexing of 3D CAD models. In this work, both the geometry and topology of 3D models are encoded in a single 2D

graph (SPS), and the number of maximal connected regions is measured against a range of tolerance values between 0° and 360°, such that two adjacent faces are connected if and only if the angle between their normals is less than the

tolerance value. The similarity between two 3D models is evaluated using the SPS with a neural network. Pu et al.

[128] represent the 3D model by a series of slices along certain directions, so that the shape matching between two 3D models is transformed into measuring the similarity between series of 2D slices. In this method, a two-dimensional D2 shape distribution function is used to measure the similarity between 2D slices.

The main advantage of using Shape Distributions for 3D object retrieval is their effective distinction of 3D models

belonging to broad categories, while the main disadvantage is their poor performance when discriminating

between 3D models having similar general shape properties but different detailed shape properties.

3.4.4 Geometric hashing

In this technique, originated by Lamdan and Wolfson [129], a 3D object is parsed into basic geometric features such

as surface points. From the point set, a set of basis points is chosen and the coordinates of all the remaining points

with respect to that basis are calculated and are then stored in a histogram for each coordinate set. This is repeated for

all basis combinations, and the resulting histogram is stored in a hash table. This is called geometric hashing. The

objects are indexed based on the hashing and, in turn, used for matching with the query model. The bin that receives

the highest number of votes for the query model indicates the set of models that is similar to the query model.
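The offline and online stages of geometric hashing can be sketched as follows. Surface feature points, the cell size, and the use of a single query basis are illustrative simplifications (a robust implementation votes over many query bases); the model and query data are hypothetical.

```python
import numpy as np
from collections import defaultdict
from itertools import permutations

def basis_frame(p0, p1, p2):
    """Orthonormal frame defined by an ordered triplet of basis points."""
    e1 = (p1 - p0) / np.linalg.norm(p1 - p0)
    v = p2 - p0
    e2 = v - (v @ e1) * e1
    e2 = e2 / np.linalg.norm(e2)
    return p0, np.stack([e1, e2, np.cross(e1, e2)])

def build_hash_table(models, cell=0.25):
    """Offline stage: for every model and every ordered basis triplet, vote
    for the quantized basis-relative coordinates of the remaining points."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j, k in permutations(range(len(pts)), 3):
            origin, frame = basis_frame(pts[i], pts[j], pts[k])
            for m in range(len(pts)):
                if m not in (i, j, k):
                    key = tuple(np.floor(frame @ (pts[m] - origin) / cell).astype(int))
                    table[key].append(name)
    return table

def best_match(table, pts, cell=0.25):
    """Online stage: express the query in one of its own bases and tally
    votes; the model with the most votes is reported as most similar."""
    origin, frame = basis_frame(pts[0], pts[1], pts[2])
    votes = defaultdict(int)
    for p in pts[3:]:
        key = tuple(np.floor(frame @ (p - origin) / cell).astype(int))
        for name in table.get(key, []):
            votes[name] += 1
    return max(votes, key=votes.get) if votes else None

# Usage: a rigidly transformed copy of the stored model produces the same
# basis-relative coordinates, hence the same hash keys.
model_pts = np.random.default_rng(8).random((6, 3))
table = build_hash_table({"part-A": model_pts})
a = 0.7
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
query_pts = model_pts @ Rz.T + np.array([5.0, -2.0, 1.0])
```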

Leibowitz et al. [130] used geometric hashing for comparing protein molecules for multiple structural alignment and

core detection in an ensemble of protein molecules. Here, the 3D object consists of the set of data points given by the atomic coordinates. It was observed that the geometric hashing implementation is memory intensive. Wolfson

and Rigoutsos [131] described the 3D implementation aspects of the geometric hashing algorithm. Gueziec et al.

[132] discussed the use of geometric hashing in 3D medical image registration, where the crest lines in images are the

main geometric features used for hashing.

3.4.5 Spatial Maps

A limitation of statistical approaches is that they do not take into account how local features are spatially distributed

over the model surface. For this purpose, spatial map representations have been proposed to capture the spatial

location of an object. The map entries correspond to physical locations or sections of the object, and are arranged in a

manner that preserves the relative positions of the features in an object. Spatial maps are in general not invariant to

rotations, except for specially designed maps. Therefore, typically pose normalization is done first.

Vranić et al. [133] introduce a ray-based descriptor that describes a surface by associating to each ray from the origin

the distance to the last point of intersection of the ray with the model. For this spherical extent function its spherical

harmonics are computed, which form a Fourier basis on a sphere much like the familiar sine and cosine do on a line

or a circle. Their method requires pose normalization to provide rotational invariance. In Assfalg [134], a method is

proposed for describing the shapes of 3D objects whose surface is a simply connected region. The 3D object is deformed until it is a function on the sphere. Then, information about surface curvature is projected onto a 2D map

which is used as the descriptor of the object shape. Recently, curvature correlograms have been proposed [135] to

capture the spatial distribution of curvature values on the object surface.

Kazhdan et al. [136] present a general approach based on spherical harmonics to transform rotation dependent shape

descriptors into rotation independent ones. Their method is applicable to any shape descriptor defined either as

a collection of spherical functions or as a function on a voxel grid. In the latter case a collection of spherical functions

is obtained from the function on the voxel grid by restricting the grid to concentric spheres. From the collection of

spherical functions they compute a rotation invariant descriptor by (1) decomposing the function into its spherical

harmonics, (2) summing the harmonics within each frequency, and (3) computing the L2-norm for each frequency

component. The resulting shape descriptor is a 2D histogram indexed by radius and frequency, which is invariant to

rotations about the centre of mass. This approach offers an alternative to pose normalization, because their

method obtains rotation invariant shape descriptors. Indeed, their experimental results show that, in general, the obtained rotation independent shape descriptors perform better than the corresponding normalized descriptors obtained using the conventional PCA approach. Their experiments include the ray based spherical

harmonic descriptor proposed by Vranić et al. [133]. Finally, note that their approach generalizes the method to

compute the voxel-based spherical harmonics shape descriptor, described by Funkhouser et al. [3], which is defined

as a binary function on the voxel grid. Note that Kazhdan et al. [136] use the negatively exponentiated Euclidean

distance transform of the surface of a 3D model to compute the 3D voxel grid.
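
The three steps above can be sketched numerically for a single spherical shell. This minimal sketch uses SciPy's `sph_harm` and a simple midpoint quadrature grid, both assumptions of the illustration rather than the implementation of [136]:

```python
import numpy as np
from scipy.special import sph_harm  # sph_harm(m, l, azimuth, polar)

def sh_energies(f, theta, phi, w, l_max):
    """Rotation-invariant signature of a sampled spherical function:
    project f onto each harmonic Y_l^m, then take the squared L2-norm
    within each frequency band l (steps (1)-(3), one shell only)."""
    energies = []
    for l in range(l_max + 1):
        e = 0.0
        for m in range(-l, l + 1):
            # numerical projection <f, Y_l^m> with quadrature weights w
            c = np.sum(w * f * np.conj(sph_harm(m, l, theta, phi)))
            e += abs(c) ** 2
        energies.append(e)
    return np.array(energies)

# midpoint quadrature grid on the unit sphere
n_t, n_p = 128, 64
t = np.linspace(0.0, 2.0 * np.pi, n_t, endpoint=False)  # azimuth samples
p = (np.arange(n_p) + 0.5) * np.pi / n_p                 # polar samples
theta, phi = np.meshgrid(t, p)
w = np.sin(phi) * (2.0 * np.pi / n_t) * (np.pi / n_p)    # area element
```

Because a rotation only redistributes energy within each degree l, the vector of per-degree energies is unchanged when the sampled function is rotated, which is exactly what makes the resulting descriptor rotation invariant.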

Novotni and Klein [137] present a method to compute 3D Zernike descriptors from voxelized models as natural

extensions of spherical harmonics based descriptors. 3D Zernike descriptors capture object coherence in the radial

direction as well as in the direction along a sphere. Both 3D Zernike descriptors and spherical harmonics based

descriptors achieve rotation invariance. However, since the latter descriptors treat the concentric shells independently, they do not capture object coherence in the radial direction (Figure 18). The limited experiments comparing

spherical harmonics and 3D Zernike moments performed by Novotni and Klein show similar results for a class of

planes, but better results for the 3D Zernike descriptor for a class of chairs.

Figure 18 Spherical harmonics do not distinguish models that differ by a rotation of an interior part. [136]

Vranić [82] argues that voxelization is not a good idea, because many fine details are lost in the voxel grid.

Therefore, he introduces the radial extent function as a spherical harmonics shape descriptor based on functions

defined on concentric shells around the centre of a model, and compares it with the voxel-based spherical harmonics

shape descriptor proposed by Funkhouser et al. [3]. Also, Vranić et al. accomplish pose normalization using the so-

called continuous PCA algorithm [133], which calculates sums of integrals over all triangles in a mesh model, instead

of using the conventional PCA algorithm. The experimental results from the paper show that the continuous PCA is

better than the conventional PCA and better than the weighted PCA, which takes into account the differing sizes of

the triangles of a mesh. Also it is shown that the shell-based spherical harmonic descriptor outperforms the voxel-based spherical harmonic descriptor proposed by Funkhouser et al. [3]. Liu et al. [138] point out that Vranić's method

is unstable, because noise can shift the centre of the concentric shells. As an alternative they propose a method based

on spherical harmonics using Delta functions defined by the distance to the centre of the model. Their experimental

results using their own database of 740 models show better performance for their method than the method by Vranić

[82].

Ricard et al. [139] introduce a 3D Angular Radial Transform (ART) shape descriptor. The angular radial transform expresses a 3D object, given in spherical coordinates, as the product of a basis function along the angular

and two radial basis functions along the radial directions. The shape descriptor consists of an array of the ART

coefficients of these basis functions. Since the ART coefficients are invariant to rotation around the z-axis, their

method requires only alignment of the z-axis to the first principal axis computed by the PCA. Their experimental

results show that for the Princeton Shape Benchmark, methods based on spherical harmonics are better, and for a

database provided by Renault consisting of 5000 models of car parts, the results of both methods are similar.

The spatial map based approaches show good retrieval results. But a drawback of these methods is that partial

matching is not supported, because they do not encode the relation between the features and parts of an object.

Furthermore, these methods provide no feedback to the user about why 3D objects match.


3.5 General Techniques

General techniques are those which can be applied to a 3D object retrieval method in order to enhance the overall

retrieval process. Usually, these techniques are integrated independently with an existing 3D object retrieval

descriptor which proved to have good retrieval performance in order to improve its retrieval results.

Among a wide range of general 3D object retrieval methods that have been developed, Relevance Feedback method

and Bag of Features method are selected to be discussed in this section.

3.5.1 Relevance Feedback

Relevance Feedback (RF) provides a convenient interactive way to retrieve semantically similar 3D objects by

allowing the user to include his perceptual feedback in the search. It is an iterative search technique that bridges the

semantic gap between high-level user intention and low-level data representation by iterating the following three

stages. First, the system retrieves similar 3D models and presents them to the user in descending order of similarity.

Next, the user provides feedback regarding the relevance of some of the current retrieval results. Finally, the system

uses these examples to learn and improve the performance in the next 3D object retrieval iteration, as demonstrated in

Figure 19.

Figure 19 Filtering out geometrically similar, but semantically dissimilar, models (query model at the top-left). [140]
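
The three-stage loop above can be illustrated with a Rocchio-style query update, a classic relevance feedback rule from text retrieval. The descriptor vectors, the Euclidean ranking, and the weights alpha, beta and gamma are illustrative assumptions of this sketch, not the specific algorithms of [140]-[143]:

```python
import numpy as np

def retrieve(query, db, k=5):
    """Stage 1: rank database descriptors by Euclidean distance to the query."""
    d = np.linalg.norm(db - query, axis=1)
    return np.argsort(d)[:k]

def rocchio(query, db, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Stage 3: move the query toward descriptors the user marked relevant
    and away from those marked irrelevant (stage 2 supplies the marks)."""
    q = alpha * query
    if len(relevant):
        q += beta * db[relevant].mean(axis=0)
    if len(irrelevant):
        q -= gamma * db[irrelevant].mean(axis=0)
    return q
```

Repeating retrieve, feedback and rocchio pulls the query vector toward the region of descriptor space the user considers semantically relevant.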

A relevance feedback algorithm based on both supervised and unsupervised feature extraction techniques was presented in [140]; it builds upon some of the best known techniques in information retrieval and combines them in a new, completely automatic manner, so as to outperform the existing techniques. Atmosukarto et

al. also apply relevance feedback [141] in their proposed method which depends on combining various feature types

for 3D object retrieval. It performs query processing based on known relevant and irrelevant objects in the query, and

computes the similarity of an object with the query using pre-computed rankings of the objects, without computations in high-dimensional feature spaces. Experimental results show that the feature combination method significantly improves

the retrieval performance of individual feature types.

Leng and Qin present a relevance feedback mechanism [142], which effectively exploits the strengths of different feature vectors and addresses the problems of small sample size and asymmetry. In [143], a relevance feedback

technique is proposed which relies on the assumption that similarity may emerge from the inhibition of differences,

i.e., from the lack of diversity with respect to the shape properties taken into account. To this end, a user is provided

with a variety of shape descriptors, each analysing different shape properties. Then the user expresses his multilevel

relevance judgments, which correspond to his concept of similarity among the retrieved objects. Finally, the system

inhibits the role of the shape properties that do not reflect the user's idea of similarity. The feedback technique is based on a simple scaling procedure, which requires neither a priori learning nor parameter optimization.

The main advantage of the Relevance Feedback technique is that it addresses the subjectivity of similarity: 3D object matching is inherently subjective and depends on the human viewer, as objects carry semantics and are not mere geometric entities.

3.5.2 Bag of Features

The bag-of-features approach is inspired originally by the bag-of-words approach in text retrieval, which

characterizes a text document by a histogram of word occurrences. The Bag-of-Features (BoF) approach

is employed by Liu et al. [144] for partial match retrieval of 3D models. The BoF approach integrates many local

features or descriptors into a single feature vector, ignoring the position of each feature. As the method uses a single-resolution local feature, it fails to capture the global geometric shape of the models. The BoF approach is one

of the most popular and powerful methods to compute distance among sets, or bags of features in the field of object

recognition for 2D images [145] [146] [147] [148]. Typically, the approach encodes a given local feature into one of

several hundreds to thousands of visual words by using a visual codebook. The visual codebook is often generated by

performing k-means clustering on the set of local features, setting k to the size of the vocabulary. Then, for each image, a histogram of visual words, having the size of the dictionary, is created through vector quantization of local features. The

histogram then becomes the feature vector for the image. Note that the locations of the local features in the image are

not considered. A bicycle is a bicycle regardless of its position, orientation, or scale in the image.
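
The codebook pipeline above can be sketched minimally as follows, with a small plain k-means clusterer standing in for a production one; the deterministic farthest-point initialization and the normalized histogram are assumptions of this sketch:

```python
import numpy as np

def kmeans(features, k, n_iter=20):
    """Plain Lloyd's k-means. Initialization: the first feature, then the
    feature farthest from the centres chosen so far (cheap and deterministic)."""
    centres = [features[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centres], axis=0)
        centres.append(features[np.argmax(d)])
    centres = np.array(centres)
    for _ in range(n_iter):
        # assign each feature to its nearest centre, then recompute centres
        labels = np.argmin(np.linalg.norm(features[:, None] - centres[None], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = features[labels == j].mean(axis=0)
    return centres

def bof_histogram(features, codebook):
    """Vector-quantize each local feature to its nearest visual word and
    accumulate word frequencies; feature locations are ignored."""
    words = np.argmin(np.linalg.norm(features[:, None] - codebook[None], axis=2), axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # normalize across differing feature counts
```

The normalized histogram then serves as the single feature vector of the model, and two models are compared by a distance between their histograms.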

Ohbuchi et al. [149] use the Bag-of-Features approach as part of their method. The method, named Bag-of-Features SIFT (BF-SIFT), employs the Scale Invariant Feature Transform (SIFT) by Lowe [150]. The SIFT algorithm is

applied to a set of multiple view depth images rendered from the 3D model to be compared, producing thousands of

local visual features per model. To compute distance, the method employs the Bag-of-Features (BoF) approach that

fuses all the local features into a single feature vector. The BoF approach vector-quantizes local features into visual

words, and accumulates the frequency of the words into a histogram. The quantizer, or the codebook, is learned a

priori by using a large set of local features extracted from the kind of models to be retrieved, e.g., the models in the

database. The integration of thousands of local features into the feature vector reduced the cost of feature storage and

the cost of feature comparison. An improved algorithm was proposed by Furuya and Ohbuchi [151], which extracts a much larger number of local visual features by sampling each depth image densely and randomly. The method uses the

GPU for SIFT feature extraction and an efficient randomized decision tree for encoding SIFT features into visual

words. This method achieves better retrieval results for highly articulated models and geometrically detailed models.

In [152], an HKS-based bag-of-features approach, named Shape Google, is introduced, merging the Heat Kernel Signature method and the Bag-of-Features method into one technique. This approach is shown to achieve state-of-the-art results in deformable shape retrieval.

4. COMPARISON

Clearly, each 3D object retrieval method has its advantages and drawbacks; each of them, however, is suitable within

a particular application context. There is no one descriptor that performs best in all situations. While some generally

show higher performance than others overall, for specific cases other descriptors may perform much better. Recently,

great work has been done to present different experimental comparisons between a number of 3D object retrieval

methods using currently available benchmark databases based on different evaluation requirements [14], [11], [19]

and [153]. The most commonly used benchmark is the Princeton Shape Benchmark, a publicly available database of 3D

models, software tools, and a standardized set of experiments, for comparing 3D shape matching algorithms. The

database contains 1,814 models collected from the World Wide Web and classified by humans according to function

and form. Shilane et al. [14] find that the lightfield descriptor [33] is the most discriminating of the 12 shape descriptors tested, but at higher storage and computational costs than most other descriptors, while Vranić [11] shows

that the spherical harmonics approach using the negatively exponentiated Euclidean distance transform [136]

significantly outperforms all other approaches. On the other hand, Bronstein et al. [109] verify that among their

compared feature description algorithms the best results were achieved by heat kernel-based (HKS) methods.

Comparing the surveyed methods is a difficult task since the amount of technical details given in the original

literature varies between different methods and most of the authors employ individually compiled benchmarks when

empirically evaluating retrieval precision. We recognize that (re)producing an objective analytic and experimental comparison of the wealth of 3D object retrieval methods is a tremendous task that is beyond our resources.

Instead of presenting an intensive experimental comparison, we present a simple qualitative comparison based on

choosing the methods that were technically described best in the literature for each one of the four categories

illustrated in our proposed taxonomy (e.g. the Light Field and Spin Images methods for the view based category). We omitted

descriptors from the comparison, in cases where the technical description in the original sources was insufficient for

this comparison. In Table 1, some of the surveyed methods in Section III are evaluated with respect to several requirements [31] of content based 3D object retrieval that are explained in this section, such as: (1) shape

representation requirements, (2) efficiency, (3) discrimination abilities, (4) ability to perform partial matching, (5)

robustness, and (6) necessity of pose normalization. Finally, the advantages and limitations of the several approaches

in content based 3D object retrieval are discussed.

4.1 3D Object Representation Requirements

Each 3D object retrieval system accepts certain types of object representations. Generally, a 3D object can be

represented at different levels of abstraction. The first level is a set of points in 3D space; this point-based representation is just raw data and therefore lacks structure, but is sufficient for visualization purposes. In

2D images, this corresponds to the pixels. Point Clouds [154] and Range Images are good examples of point based

representations. The second level of abstraction is the boundary of the object (Surface Representations); in the case of 3D shapes, the boundaries are surfaces. In 2D, this corresponds to curves. A polygon mesh is a very popular surface

representation for 3D models because of its simplicity. The third level of abstraction is to think of the object in terms

of the volume it occupies. This kind of abstraction is known as Volumetric (Solid) Representations. In the case of 2D

shapes, this corresponds to the area. Voxels, Constructive Solid Geometry (CSG) and Octree are examples of

Volumetric (Solid) Representations [17].

4.2 Efficiency

Efficiency in general describes the extent to which time or effort is well used for the intended task or purpose. A 3D

object retrieval system should provide efficient methods for descriptor extraction, indexing, and database retrieval

(comparison). Since for “query by example” the shape descriptor is computed online, it is reasonable to require that

the descriptor computation and extraction is fast enough for interactive querying. Efficient indexing search structures

are needed to support efficient retrieval because it is inefficient to sequentially match all objects in the database with

the query object. This is needed due to the rapid growth of 3D databases as 3D scanning and 3D modelling become

commonplace. In databases consisting of millions of objects, each with hundreds of thousands of voxels or triangles, which need to be automatically described and searched, efficiency becomes mandatory.

In Table 1 the efficiency is qualified as slow, medium or fast based on database retrieval efficiency, where fast

efficiency supports interactive retrieval, i.e. answering queries within a second, and medium efficiency supports

almost interactive retrieval, i.e. answering queries within a few seconds, using a database containing thousands of

models.

4.3 Discriminative Power

A descriptor should capture properties that discriminate 3D objects well. However, the judgment of the similarity of

the shapes of two 3D objects is somewhat subjective, depending on the user preference or the application at hand. For

example, for CAD models, topological properties such as the number of holes in a model are often more important

than minor differences in shape. On the contrary, if a user searches for models looking roughly similar, then the

existence of a small hole in the model may be of no importance to the user. To provide effective retrieval, the system,

given a query, is supposed to return the most relevant objects from the database in the first rankings, and to hold back

irrelevant objects from this ranking. Therefore, it needs to implement discrimination methods to distinguish between

similar and dissimilar objects.

4.4 Partial Matching

In contrast to global matching, partial matching finds a 3D object of which a part is similar to a part of another. Partial

matching can be applied if 3D shape models are not complete in order to facilitate the automatic reconstruction of a

3D model from partial models obtained by e.g. laser scanning, where models may have to be reconstructed from

thousands of partial models [155]. Another application is the search inside "3D scenes" containing an instance of the

query object. Also, this feature can potentially give the user flexibility towards the matching problem, if parts of

interest of an object can be selected or weighted by the user. Partial matching is a much harder problem than global

matching, since it needs to search for and define the sub-parts prior to measuring similarities. Graph based methods are in general applicable to partial matching, as are other methods such as the spin images, shape spectrum, and 3D shape contexts methods previously explained in Section III. The weighted point method

developed by Funkhouser et al. [81], where the user selects a part of an object as a query, is a first powerful tool for

part-in-whole matching. This allows a user to search for shape models containing an instance of the query object.

Based on local feature correspondences, Funkhouser and Shilane [156] perform partial matching by backtracking

using a priority queue.

In Table 1, it is observed that all graph based methods support partial matching since these methods allow

representation of the 3D object at multiple levels of detail.

4.5 Robustness and Sensitivity

Robustness can be defined as the ability of a 3D object retrieval system to provide effective retrieval results despite

abnormalities in the query object or the 3D objects in the database. It is often desirable that a shape descriptor is

insensitive to noise and small extra features, and robust against arbitrary topological degeneracies, e.g. if it is obtained

by laser scanning. The shape descriptor must be stable with respect to small changes in shape; therefore, small changes in a shape should result in small changes in the shape descriptor. On the other hand, if large changes in the shape of the object result in very small changes in the shape descriptor, then the shape descriptor is considered insensitive. Poor

sensitivity will lead to poor discriminative abilities. Also, if a model is given in multiple levels-of-detail,

representations of different levels should not differ significantly from the original model.

4.6 Pose Normalization

To support searching 3D objects from heterogeneous databases, where the objects may be arbitrarily scaled and

oriented in their respective coordinate systems, 3D object retrieval methods must follow one of two approaches. The

first one is to normalize the models before calculating their shape descriptors, while the other is to employ shape

descriptors that are already invariant to different types of transformation. Most methods to date advocate the use of

the first approach by applying the continuous Principal Components Analysis (PCA) method for pose normalization

[157] since designing a shape descriptor that is invariant to different types of transformations is not an easy task. On

the other hand, Novotni and Klein [137], for example, favour the second approach by using rotation invariant shape

descriptors that need no pose normalization. Shape normalization and invariance techniques are more thoroughly

described by Kazhdan [10].
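
The first approach can be illustrated with conventional point-based PCA normalization; the continuous PCA of [157] instead integrates over the mesh triangles, so this simpler sketch only conveys the idea:

```python
import numpy as np

def pca_normalize(points):
    """Conventional PCA pose normalization of an (n, 3) point set:
    translate to the centroid, rotate into the principal frame, and
    scale so the mean distance to the origin is 1. The sign of each
    principal axis remains ambiguous and is usually fixed afterwards
    by an additional moment test."""
    p = points - points.mean(axis=0)             # translation invariance
    cov = p.T @ p / len(p)
    eigval, eigvec = np.linalg.eigh(cov)         # ascending eigenvalues
    p = p @ eigvec[:, ::-1]                      # largest variance -> x-axis
    return p / np.linalg.norm(p, axis=1).mean()  # scale invariance
```

After normalization, two copies of the same model that differ only by a similarity transform yield the same descriptor input, up to the remaining per-axis sign flips.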

Table 1 Comparison between 3D Object Retrieval Methods

View Based:
Light Field ([21], [35], [158] and [13]): shape model: mesh; efficiency: slow; discriminative power: high; partial matching: no; robustness: high; normalization required: yes.
Spin Images ([36], [159] and [160]): shape model: point cloud; efficiency: medium; discriminative power: medium; partial matching: yes; robustness: medium; normalization required: no.

Graph Based:
Skeletal Graph ([44], [48], [49], [51], [52], [53] and [54]): shape model: volume; efficiency: medium; discriminative power: medium; partial matching: yes; robustness: medium; normalization required: no.
Reeb Graph ([55], [57], [58], [59], [60], [61] and [62]): shape model: volume; efficiency: medium; discriminative power: medium; partial matching: yes; robustness: medium; normalization required: no.
B-Rep Graph Matching ([63], [64] and [65]): shape model: solid; efficiency: medium; discriminative power: medium; partial matching: yes; robustness: medium; normalization required: no.

Geometric Based:
Volumetric Error ([74], [96], [75], [77] and [161]): shape model: volume; efficiency: fast; discriminative power: medium; partial matching: no; robustness: high; normalization required: yes.
Weighted Point Set ([78], [79], [80] and [81]): shape model: mesh; efficiency: fast; discriminative power: high; partial matching: only [81]; robustness: high; normalization required: yes.
Extended Gaussian Image ([91], [92], [13], [89] and [94]): shape model: all models; efficiency: fast; discriminative power: low; partial matching: no; robustness: high; normalization required: yes.
Heat Kernel Signatures (HKS) ([152], [108], [110], [111] and [112]): shape model: all models; efficiency: fast; discriminative power: high; partial matching: yes; robustness: high; normalization required: no.

Statistics Based:
Shape Distributions ([122], [123], [124], [127], [126], [128] and [162]): shape model: all models; efficiency: fast; discriminative power: low; partial matching: no; robustness: high; normalization required: only [121] and [128].
Spatial Maps ([135], [136], [133], [137], [138], [139] and [153]): shape model: all models; efficiency: fast; discriminative power: high; partial matching: no; robustness: high; normalization required: except [3], [133] and [137].

*References indicate which papers provide the indicated property. If no reference is indicated the property is valid in general.

5. CONCLUSION

3D object retrieval and analysis tools are proving useful in a number of application domains, most notably computer

graphics, Virtual Reality (VR), mechanical Computer-Aided Design (CAD), Computer Aided Manufacturing (CAM),

medicine, molecular biology, military applications, and entertainment. Looking forward, these tools will be

increasingly important, as 3D data acquisition hardware becomes more of a commodity, and more people begin to

make and use 3D models in their everyday lives. It will then be easier to find and retrieve an existing model made by

someone else than it will be to make a new one from scratch yourself. New methods will soon be deployed to perform

robust feature detection, partial shape matching, part decomposition, and eventually complete semantic labelling. 3D

models will then provide not only raw 3D geometry and surface attributes but also knowledge of their composition, how they move, and how they are used, providing the keys to making 3D data a more useful part of the information

revolution.

In this paper, a variety of recently proposed 3D object retrieval methods have been surveyed, guided by a taxonomy based on the representation of the shape descriptor used in each method. Four broad categories have been considered: view based methods, graph based methods, geometry based methods and statistics based methods. Finally, a

qualitative evaluation of selected methods is presented. Based on results of the comparative evaluation, the following

considerations have been derived. View-based approaches appear to be superior to the other solutions. However, Spin

Image signatures and Light Field have very different performance for models with different resolutions, the former

showing superior performance with high-resolution 3D models, and the latter performing better for low-resolution

models. Also, SIFT based methods achieve good retrieval performance especially for highly articulated 3D objects.

On the other hand, graph based methods have a limited discriminating power, because only topology is taken into

account. To improve discriminative abilities, most authors apply graph based matching in combination with other

methods. For graph based methods, minor changes in topology may result in significant differences in similarity.

Large graph sizes affect efficiency and similar models need not have similar graphs. Hence, these methods are less

robust than other methods. Advantages of graph based methods are that no pose normalization is required, and that a

graph based structure is suited to implement partial matching since its structure allows representation at multiple

levels of detail. Geometry based methods vary. For example, the volumetric error approach implemented by

Novotni et al. [116] allows fast matching, but uses a dissimilarity measure that does not obey the triangle inequality.

The weighted point matching approach developed by Funkhouser et al. [81] efficiently supports part-in-whole

matching, while Heat Kernel Signatures (HKS) method is generally characterized by its high discriminativity and

efficient computability.

Statistics-based solutions never provide better performance than the other solutions, probably because they do not take into account the spatial distribution of shape features over the model surface. Exceptions are spatial maps, which show good retrieval results although partial matching is not supported, and geometric hashing, which is useful for exact matching between shapes but incurs very high storage requirements.

6. FUTURE TRENDS

Based on the survey of diverse 3D object retrieval techniques described in this paper, we have identified some important topics that should be addressed in research on content-based description and retrieval of 3D objects in

order to improve the overall process of 3D object retrieval. These major topics are outlined in the following.

Better 3D Object Representations. 3D object representations should be designed keeping in mind the human

perception of shape/similarity [163] [164]. Interdisciplinary research with the cognitive sciences will help yield better 3D object representations. Depending on the application, it may be more desirable to obtain a quick, information

preserving 3D object representation than a time-consuming, exact representation. However, it is also undesirable to

obtain an oversimplified representation.

Boosting Efficiency. System efficiency can be increased by using parallel or distributed computing methods for

compute-intensive tasks. Efficient indexing is also required. The vantage method [165] may be applied to compute an

efficient index structure for pseudo-metrics that require much computing time. Also, clustering of d-dimensional

feature vectors [120] can be applied for efficient indexing. For graph descriptors the development of an efficient

indexing method is a major research topic. For instance, Sebastian et al. [166] describe a promising approach for

indexing shock graphs.

Improving Discriminative Power. Relevance feedback techniques reduce the semantic gap between the user's notion of similarity and the system's notion of similarity and improve the system's discriminative power [87] [167] [97]

[168]. Similarity is a subjective measure and differs from user to user. The 3D object representation and

corresponding similarity measure should be customizable. This can be accomplished using hierarchical or multi-step

search strategies [169] [170]. Discriminative power can also be enhanced by developing interfaces that help the user

compose queries that accurately represent the user's intent or by developing active learning strategies for automatic

annotation [171]. Currently, a large gap exists between engineering and research in the psychological sciences.

Studies done by cognitive psychologists will be useful in understanding user intent, thereby improving the

discriminative power of different 3D object retrieval methods. For example, Corney et al. [172] used human input to

optimize parameters for improving search performance.

Enhancing Benchmarks. New benchmarks should be created with a larger number of classified meshes, expanded

annotations, and subdivisions focusing on specific 3D object matching tasks such as: partial versus complete objects,

manifold versus polygon-soup meshes, and domain specific meshes such as proteins, CAD models, and architectural

objects. As more and more data gets added to the benchmarks, it will become possible to consider multi-classifiers

that take into account both geometric and non-geometric attributes of 3D models (e.g., colour, texture, scene graph

structure, textual annotation, etc.). Comparison of different 3D object retrieval methods using publicly available

benchmark databases, containing models classified into categories of similar shapes, has already led to

valuable comparisons of existing shape matching methods. In future work, to evaluate the power of newly developed

3D object retrieval methods, researchers should compare their methods against other state-of-the-art 3D object

retrieval methods using these publicly available benchmarks.

Domain-specific 3D Object Retrieval. Since 3D object retrieval is currently applied in diverse areas, retrieval systems should be specially designed for particular domains in order to make retrieval effective and satisfactory for specialists in those domains. Applications in these areas are driven by different requirements and constraints. For example, while a designer might consider two components similar, a manufacturing engineer might suggest two totally different manufacturing processes for them; likewise, it makes a great difference whether a user is looking for industrial mechanical parts or for molecular structures. Therefore, coupling domain knowledge with 3D object retrieval techniques will be imperative for the widespread use of 3D object retrieval systems. Additionally, to make it easier for industries to implement domain-specific systems, tight integration with specialized systems (e.g., CAD/CAM/PLM) will be necessary. Application-driven benchmark databases are also highly needed. For example, Bioinformatics has become a large area of research, and databases of protein structures such as the Protein Data Bank [1] are required for comparing protein structures based on their geometric shape. In general, there are many domains where semantics and ontology play an important role but which remain largely unexplored.

Combining 3D Object Retrieval Methods. Since each approach has its own advantages and drawbacks, combining different approaches may produce more powerful 3D object retrieval methods. For example, spatial map based approaches in the statistics-based category offer fast computation, discriminative ability, and robustness, capabilities that are largely complementary to those of the graph-based category, which is distinguished by its support for partial matching and by requiring no normalization. Similarly, combining geometry-based and topology-based approaches may yield better retrieval methods. The mixed methods proposed in [173], [174], and [11] combine different 3D object retrieval methods. The most significant is that of Vranić [11], who investigates a number of hybrid shape matching methods obtained by combining different 3D object retrieval approaches. In his framework, similarity is measured using a weighted combination of the similarities of the separate methods, where the weight of each method is proportional to the dimension of its feature vector. His experiments show that, in terms of overall performance, a combination of the depth buffer-based descriptor [11], the silhouette-based descriptor [11], and the ray-based descriptor [133] performs best.
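The dimension-proportional weighting scheme described above can be sketched as a small distance function. This is a minimal illustration only: it assumes the per-descriptor feature vectors have already been extracted, and the L1 distance used here is an assumption for the sketch, not necessarily the exact metric used in Vranić's framework.

```python
def combined_distance(query_feats, target_feats):
    """Hybrid distance between two models described by several feature
    vectors (one per descriptor). Each descriptor is weighted by the
    dimension of its feature vector, mirroring the scheme in [11];
    the per-descriptor metric (L1 here) is an illustrative assumption."""
    dims = [len(f) for f in query_feats]
    total = float(sum(dims))
    score = 0.0
    for q, t, d in zip(query_feats, target_feats, dims):
        # Per-descriptor L1 distance between corresponding feature vectors.
        dist = sum(abs(a - b) for a, b in zip(q, t))
        # Weight proportional to this descriptor's feature-vector dimension.
        score += (d / total) * dist
    return score
```

In a retrieval setting, this combined score would be computed between the query and every database model, with results ranked by ascending distance; higher-dimensional descriptors dominate the ranking by construction.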

7. REFERENCES

[1] Helen M. Berman et al., "The Protein Data Bank," Nucleic Acids Research, vol. 28, no. 1, pp. 235 - 242, January 2000.

[2] William C. Regli and Daniel M. Gaines, "A Repository of Designs for Process and Assembly Planning," Computer Aided Design, vol. 29, no. 12, pp. 895 - 905, December 1997.

[3] Thomas Funkhouser et al., "A Search Engine for 3D Models," ACM Transactions on Graphics (TOG), vol. 22, no. 1, pp. 83 - 105, January 2003.

[4] Paul J. Besl and Ramesh C. Jain, "Three-Dimensional Object Recognition," ACM Computing Surveys (CSUR), vol. 17, no. 1, pp. 75 - 145, March 1985.

[5] Sven Loncaric, "A Survey of Shape Analysis Techniques," Pattern Recognition (PR), vol. 31, no. 8, pp. 983 - 1001, August 1998.


[6] Richard J. Campbell and Patrick J. Flynn, "A Survey Of Free-Form Object Representation and Recognition Techniques," Computer Vision and Image Understanding, vol. 81, no. 2, pp. 166 - 210, February 2001.

[7] George Mamic and Mohammed Bennamoun, "Representation and Recognition of 3D Free-Form Objects," Digital Signal Processing, vol. 12, no. 1, pp. 47 - 67, January 2002.

[8] Remco C. Veltkamp and Michiel Hagedoorn, "State-of-the-art in shape matching," Principles of Visual Information Retrieval, pp. 87 - 119, 2001.

[9] Patrick Min, "A 3D Model Search Engine," Princeton University, PhD thesis 2004.

[10] Michael M. Kazhdan, "Shape Representations and Algorithms for 3D Model Retrieval," Department of Computer Science, Princeton University, PhD thesis 2004.

[11] Dejan V. Vranić, "3D Model Retrieval," University of Leipzig, Germany, PhD thesis 2004.

[12] Simon Goodall, "3-D Content-Based Retrieval and Classification with Applications to Museum Data," Faculty of Engineering, Science and Mathematics School of Electronics and Computer Science, University of Southampton, PhD thesis 2007.

[13] Antonio Cardone, S. K. Gupta, and Mukul Karnik, "A Survey of Shape Similarity Assessment Algorithms for Product Design and Manufacturing Applications," Journal of Computing and Information Science in Engineering (JCISE), vol. 3, no. 2, pp. 109 - 118, June 2003.

[14] Philip Shilane, Patrick Min, Michael Kazhdan, and Thomas Funkhouser, "The Princeton Shape Benchmark," Proceedings of the International Conference on Shape Modeling and Applications (SMI '04), pp. 167 - 178, June 2004.

[15] S. Biasotti, D. Giorgi, S. Marini, M. Spagnuolo, and B. Falcidieno, "A Comparison Framework for 3D Object Classification Methods," Proceedings of the Multimedia Content Representation, Classification and Security Conference (MRCS '06), vol. 4105, pp. 314 - 321, 2006.

[16] Benjamin Bustos, Daniel A. Keim, Dietmar Saupe, Tobias Schreck, and Dejan V. Vranić, "Feature-Based Similarity Search in 3D Object Databases," ACM Computing Surveys (CSUR), vol. 37, no. 4, pp. 345 - 387, December 2005.

[17] Ilknur Icke, "Content Based 3D Shape Retrieval: A Survey of State of the Art," CUNY, The Graduate Center, PhD Program in Computer Science, 2nd Exam Part 1, 2004.

[18] Natraj Iyer, Subramaniam Jayanti, Kuiyang Lou, Yagnanarayanan Kalyanaraman, and Karthik Ramani, "Three-Dimensional Shape Searching: State-of-the-Art Review and Future trends," Computer Aided Design (CAD), vol. 37, no. 5, pp. 509 - 530, April 2005.

[19] Alberto Del Bimbo and Pietro Pala, "Content-Based Retrieval of 3D Models," ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP '06), vol. 2, no. 1, pp. 20 - 43, February 2006.

[20] Jobst Loffler, "Content-based Retrieval of 3D Models in Distributed Web Databases by Visual Shape Information," Proceedings of the Fourth International Conference on Information Visualisation, vol. IV, pp. 82 - 87, July 2000.

[21] M. Heczko, D. Keim, D. Saupe, and D. Vranic, "Methods for Similarity Search on 3D Databases," Datenbank-Spektrum (DBSK), vol. 2, no. 2, pp. 54 - 63, 2002, (In German).

[22] Jeong-Jun Song and Forouzan Golshani, "3D Object Retrieval by Shape Similarity," Proceedings of the 13th International Conference on Database and Expert Systems Applications (DEXA '02), pp. 851 - 860, September 2002.

[23] Tarik Filali Ansary, Mohamed Daoudi, and Jean-Philipe Vandeborre, "A Bayesian 3-D Search Engine Using Adaptive Views Clustering," Proceedings of the IEEE Transactions on Multimedia, vol. 9, no. 1, pp. 78 - 88, January 2007.

[24] Tarik Filali Ansary, Jean-Philippe Vandeborre, Said Mahmoudi, and Mohamed Daoudi, "A Bayesian Framework for 3D Models Retrieval Based on Characteristic Views," Proceedings of the Second International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT '04), pp. 139 - 146, September 2004.

[25] Christopher M. Cyr and Benjamin B. Kimia, "A Similarity-Based Aspect-Graph Approach to 3D Object Recognition," International Journal of Computer Vision (IJCV), vol. 57, no. 1, pp. 5 - 22, April 2004.

[26] Benjamin Bustos, Daniel Keim, Dietmar Saupe, Tobias Schreck, and Dejan Vranić, "An Experimental Effectiveness Comparison of Methods for 3D Similarity Search," International Journal on Digital Libraries (JODL), vol. 6, no. 1, pp. 39 - 54, 2006.


[27] Philip Nathan Shilane, "Shape Distinction for 3D Object Retrieval," Computer Science, Princeton University, PhD thesis 2008.

[28] Ryutarou Ohbuchi, Masatoshi Nakazawa, and Tsuyoshi Takei, "Retrieving 3D shapes based on their appearance," Proceedings of the Fifth ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR '03), pp. 39 - 45, November 2003.

[29] Dengsheng Zhang and Guojun Lu, "Shape-Based Image Retrieval Using Generic Fourier Descriptor," Signal Processing: Image Communication, vol. 17, no. 10, pp. 825 - 848, November 2002.

[30] Ryutarou Ohbuchi, Takahiro Minamitani, and Tsuyoshi Takei, "Shape-Similarity Search of 3D Models by Using Enhanced Shape Functions," Proceedings of the Theory and Practice of Computer Graphics (TPCG '03), pp. 97 - 104, June 2003.

[31] Johan W. H. Tangelder and Remco C. Veltkamp, "A Survey of Content Based 3D Shape Retrieval Methods," Multimedia Tools and Applications (MTA), vol. 39, no. 3, pp. 441 - 471, September 2008.

[32] Robert Osada, Thomas Funkhouser, Bernard Chazelle, and David Dobkin, "Shape Distributions," ACM Transactions on Graphics (TOG), vol. 21, no. 4, pp. 807 - 832, October 2002.

[33] Ding Yun Chen, Xiao Pei Tian, Yu Te Shen, and Ming Ouhyoung, "On Visual Similarity Based 3D Model Retrieval," Computer Graphics Forum (CGF), vol. 22, no. 3, pp. 223 – 232, September 2003.

[34] Dengsheng Zhang and Guojun Lu, "An Integrated Approach to Shape Based Image Retrieval," Proceedings of Fifth Asian Conference on Computer Vision (ACCV), pp. 652 - 657, 2002.

[35] Dengsheng Zhang and Guojun Lu, "A Comparative Study of Fourier Descriptors for Shape Representation and Retrieval," Proceedings of Fifth Asian Conference on Computer Vision (ACCV), vol. 22, pp. 646 - 651, 2002.

[36] Andrew E. Johnson and Martial Hebert, "Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 21, no. 5, pp. 433 - 449, May 1999.

[37] Pedro A. De Alarcón, Alberto D. Pascual-montano, and José M. Carazo, "Spin Images and Neural Networks for Efficient Content-Based Retrieval in 3D Object Databases," Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR '02), pp. 225 - 234, July 2002.

[38] Jürgen Assfalg, Alberto Del Bimbo, and Pietro Pala, "Retrieval of 3D Objects by Visual Similarity," Proceedings of the Sixth ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR '04), pp. 77 - 83, October 2004.

[39] David G. Lowe, "Object Recognition from Local Scale-Invariant Features," Proceedings of the Seventh IEEE International Conference on Computer Vision (ICCV), vol. 2, pp. 1150 - 1157, September 1999.

[40] Yan Ke and Rahul Sukthankar, "PCA-SIFT: A More Distinctive Representation for Local Image Descriptors," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '04), pp. 506 - 513, 2004.

[41] Pengjie Li, Huadong Ma, and Anlong Ming, "3D Model Retrieval Using 2D View and Transform- Based Features," Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing (PCM'10) : Part I, pp. 449 - 460, 2010.

[42] Shungang Hua, Qiuxin Jiang, and Qing Zhong, "3D Model Retrieval Based on Multi-View SIFT Feature," Communication Systems And Information Technology: Lecture Notes in Electrical Engineering, vol. 100, pp. 163 - 169, June 2011.

[43] Konstantinos Sfikas, Ioannis Pratikakis, and Theoharis Theoharis, "3D Object Retrieval via Range Image Queries Based On SIFT Descriptors on Panoramic Views," Proceedings of the 5th Eurographics conference on 3D Object Retrieval (3DOR'12), pp. 9 - 15, 2012.

[44] H. Blum, "A Transformation for Extracting Descriptors of Shape," Models for the Perception of Speech and Visual Forms, pp. 362 - 380, 1967.

[45] Gunilla Borgefors, "Distance Transformations in Arbitrary Dimensions," Computer Vision, Graphics, and Image Processing, vol. 27, no. 3, pp. 321 - 345, September 1984.

[46] Louisa Lam, Seong Whan Lee, and Ching Y. Suen, "Thinning Methodologies-A Comprehensive Survey," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 14, no. 9, pp. 869 - 885, September 1992.

[47] R. L. Ogniewicz and O. Kübler, "Hierarchical Voronoi Skeletons," Pattern Recognition, vol. 28, no. 3, pp. 343 - 359, March 1995.

[48] Gabriella Sanniti di Baja and Stina Svensson, "A New Shape Descriptor for Surfaces in 3D Images," Pattern Recognition Letters, vol. 23, no. 6, pp. 703 - 711, April 2002.


[49] H. Sundar, D. Silver, N. Gagvani, and S. Dickinson, "Skeleton Based Shape Matching and Retrieval," Proceedings of the International Conference on Shape Modeling and Applications (SMI '03), pp. 130 - 139, May 2003.

[50] Nikhil Gagvani and Deborah Silver, "Parameter Controlled Volume Thinning," Graphical Models and Image Processing (CVGIP), vol. 61, no. 3, pp. 149 - 164, May 1999.

[51] N. Iyer, Y. Kalyanaraman, K. Lou, S. Jayanti, and K. Ramani, "A Reconfigurable 3D Engineering Shape Search System Part I: Shape Representation," Proceedings of ASME DETC’ 03 Computers and Information in Engineering Conference (CIE), September 2003.

[52] Kuiyang Lou et al., "A Reconfigurable 3D Engineering Shape Search System Part II: Database Indexing, Retrieval and Clustering," Proceedings of ASME DETC’ 03 Computers and Information in Engineering Conference (CIE), September 2003.

[53] Duck H. Kim, Il D. Yun, and Sang U. Lee, "Graph Representation by Medial Axis Transform for 3D Image Retrieval," Proceedings of Three-Dimensional Image Capture and Applications IV conference (SPIE '01), vol. 4298, no. 223, pp. 223 - 230, January 2001.

[54] Y. Nagasaka, M. Nakamura, and T. Murakami, "Extracting and Learning Geometric Features based on a Voxel Mapping Method for Manufacturing Design," Proceedings of the Third International Conference Intelligent Processing and Manufacturing of Materials (IPPM), pp. 1 - 10, 2001.

[55] G. Reeb, "Sur les Points Singuliers d'une Forme de Pfaff Completement Integrable ou d'une Fonction Numerique (On the singular points of a completely integrable Pfaff form or of a numerical function)," Comptes Rendus Acad. Sciences, vol. 222, pp. 847 - 849, 1946.

[56] Silvia Biasotti et al., "3D Shape Matching through Topological Structures," Proceedings of Discrete Geometry for Computer Imagery Conference (DGCI '03), no. 2886, pp. 194 - 203, 2003.

[57] Yoshihisa Shinagawa, Tosiyasu L. Kunii, and Yannick L. Kergosien, "Surface Coding Based on Morse Theory," IEEE Computer Graphics and Applications (CGA), vol. 11, no. 5, pp. 66 - 78, September 1991.

[58] Masaki Hilaga, Yoshihisa Shinagawa, Taku Kohmura, and Tosiyasu L. Kunii, "Topology Matching for Fully Automatic Similarity Estimation of 3D Shapes," Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '01), pp. 203 - 212, August 2001.

[59] Ding-Yun Chen and Ming Ouhyoung, "A 3D Object Retrieval System Based on Multi-Resolution Reeb Graph," Proceedings of Computer Graphics Workshop (CG), June 2002.

[60] Dmitriy Bespalov, William C. Regli, and Ali Shokoufandeh, "Reeb Graph Based Shape Retrieval for CAD," Proceedings of 23rd Computers and Information in Engineering Conference (CIE '03), vol. 1, September 2003.

[61] Dmitriy Bespalov, Ali Shokoufandeh, William C. Regli, and Wei Sun, "Scale-Space Representation of 3D Models and Topological Matching," Proceedings of the Eighth ACM Symposium on Solid Modeling and Applications (SM '03), pp. 208 - 215, 2003.

[62] Tony Tung and Francis Schmitt, "Augmented Reeb Graphs for Content-Based Retrieval of 3D Mesh Models," Proceedings of International Conference on Shape Modeling and Applications (SMI '04), pp. 157 - 166, June 2004.

[63] Mohamed El-mehalawi and R. Allen Miller, "A Database System of Mechanical Components based on Geometric and Topological Similarity. Part I: Representation," Computer-Aided Design (CAD), vol. 35, no. 1, pp. 83 - 94, January 2003.

[64] Mohamed El-mehalawi and R. Allen Miller, "A Database System of Mechanical Components based on Geometric and Topological Similarity. Part II: Indexing, Retrieval, Matching, and Similarity Assessment," Computer-Aided Design (CAD), vol. 35, no. 1, pp. 95 - 105, January 2003.

[65] Emanoil Zuckerberger, Ayellet Tal, and Shymon Shlafman, "Polyhedral Surface Decomposition with Applications," Computers and Graphics (CG), vol. 26, no. 5, pp. 733 - 743, October 2002.

[66] Dragoš M. Cvetković, Michael Doob, and Horst Sachs, Spectra of Graphs: Theory and Application. New York: Academic Press, 1979.

[67] Fan R. K. Chung, "Spectral Graph Theory," Regional Conference Series in Mathematics American Mathematical Society, vol. 92, no. 92, p. 207, 1997.

[68] D. McWherter and William C. Regli, "An Approach to Indexing Databases of Solid Models," Mathematics and Computer Science, Drexel University, Philadelphia, Technical Report DU-MCS-01-02, 2001.

[69] David McWherter, Mitchell Peabody, William C. Regli, and Ali Shokoufandeh, "Solid Model Databases: Techniques and Empirical Results," Journal of Computing and Information Science in Engineering (JCISE '01), vol. 1, no. 4, pp. 300 - 310, December 2001.


[70] Tien-Lung Sun, Chuan-Jun Su, Richard J. Mayer, and Richard A. Wysk, "Shape Similarity Assessment of Mechanical Parts Based on Solid Models," ASME Design for Manufacturing Conference Symposium on Computer Integrated Concurrent Design, pp. 953 - 962, September 1995.

[71] Mitchell Peabody and William C. Regli, "Clustering Techniques for Databases of CAD Models," Department of Mathematical and Computer Science, Drexel University, Philadelphia, Technical Report DU-MCS-01-01, 2001.

[72] D. McWherter, M. Peabody, and William C. Regli, "An Approach to Indexing Databases of Graphs," Department of Mathematical and Computer Science, Drexel University, Philadelphia, Technical Report 2001.

[73] D. McWherter, M. Peabody, William C. Regli, and A. Shokoufandeh, "Transformation Invariant Shape Similarity Comparison of Solid Models," ASME DETC, Computers and Information in Engineering Conference, September 2001.

[74] Keitaro Kaku, Yoshihiro Okada, and Koichi Niijima, "Similarity Measure Based on OBBTree for 3D Model Search," International Conference on Computer Graphics, Imaging and Visualization (CGIV '04), vol. 1, pp. 46 - 51, July 2004.

[75] Hiroyasu Ichida, Yuichi Itoh, Yoshifumi Kitamura, and Fumio Kishino, "Interactive Retrieval of 3D Virtual Shapes using Physical Objects," Proceedings of the IEEE Virtual Reality 2004 (VR '04), March 2004.

[76] Hermilo Sánchez-Cruz and Ernesto Bribiesca, "A Method of Optimum Transformation of 3D Objects Used As a Measure of Shape Dissimilarity," Image and Vision Computing (IVC), vol. 21, no. 12, pp. 1027 - 1036, 2003.

[77] National Taiwan University. (2003) 3D Model Retrieval System. [Online]. http://3d.csie.ntu.edu.tw/~dynamic/cgi-bin/DatabaseII_v1.8.

[78] J. W. Tangelder and R. C. Veltkamp, "Polyhedral Model Retrieval using Weighted Point Sets," Proceedings of the International Conference on Shape Modeling and Applications (SMI '03), vol. 3, no. 1, pp. 119 - 129, May 2003.

[79] Ariel Shamir, Andrei Scharf, and Daniel Cohen-Or, "Enhanced Hierarchical Shape Matching for Shape Transformation," International Journal of Shape Modeling (IJSM '03), vol. 9, no. 2, pp. 203 - 222, August 2003.

[80] Tamal K. Dey, Joachim Giesen, and Samrat Goswami, "Shape Segmentation and Matching with Flow Discretization," Workshop on Algorithms and Data Structures (WADS), no. 2748, pp. 25 - 36, 2003.

[81] Thomas Funkhouser et al., "Modeling by Example," ACM Transactions on Graphics (TOG), vol. 23, no. 3, pp. 649 - 660, August 2004.

[82] Dejan V. Vranic, "An Improvement of Rotation Invariant 3D-Shape Based on Functions on Concentric Spheres," Proceedings of the IEEE International Conference on Image Processing (ICIP '03), vol. 2, pp. 757 - 760, September 2003.

[83] Eric Paquet, Marc Rioux, Anil Murching, Thumpudi Naveen, and Ali Tabatabai, "Description of Shape Information for 2-D and 3-D Objects," Signal Processing: Image Communication, vol. 16, no. 1-2, pp. 103 - 122, September 2000.

[84] Eric Paquet and Marc Rioux, "The MPEG-7 Standard and the Content-based Management of Three-dimensional Data: A Case Study," Proceedings of the IEEE International Conference on Multimedia Computing Systems (ICMCS '99), vol. 1, pp. 9375 - 9380, June 1999.

[85] Dejan V. Vranić and Dietmar Saupe, "3D Model Retrieval with Spherical Harmonics and Moments," Proceedings of the 23rd DAGM Symposium for Pattern Recognition, pp. 392 - 397, September 2001.

[86] Cha Zhang and Tsuhan Chen, "Efficient Feature Extraction For 2D/3D Objects In Mesh Representation," Proceedings of the International Conference on Image Processing (ICIP '01), vol. 3, pp. 935 - 938, October 2001.

[87] Michael Elad, Ayellet Tal, and Sigal Ar, "Content Based Retrieval of VRML Objects: An Iterative and Interactive Approach," Proceedings of the Sixth Eurographics Workshop on Multimedia 2001, pp. 107 - 118, 2002.

[88] Titus Zaharia and Françoise Prêteux, "Three-Dimensional Shape-Based Retrieval within the MPEG-7 Framework," Proceedings of the SPIE Conference on Nonlinear Image Processing and Pattern Analysis XII, vol. 4304, pp. 133 - 145, January 2001.

[89] Heung-Yeung Shum, Martial Hebert, and Katsushi Ikeuchi, "On 3D Shape Similarity," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '96), pp. 526 - 531, June 1996.

[90] Berthold K. P. Horn, "Extended Gaussian Images," Proceedings of the IEEE (PIEEE), vol. 72, no. 12, pp. 1671 - 1686, December 1984.

[91] Horace H. S. Ip and William Y. F. Wong, "3D Head Models Retrieval based on Hierarchical Facial Region Similarity," Proceedings of the 15th International Conference on Vision Interface, pp. 314 - 319, 2002.

[92] Michael Kazhdan, Bernard Chazelle, David Dobkin, Thomas Funkhouser, and Szymon Rusinkiewicz, "A Reflective Symmetry Descriptor for 3D Models," Algorithmica, vol. 38, no. 1, pp. 201 - 225, 2003.

[93] Sing Bing Kang and Katsushi Ikeuchi, "Determining 3-D Object Pose using the Complex Extended Gaussian Image," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '91), pp. 580 - 585, June 1991.

[94] Titus Zaharia and Françoise Prêteux, "Hough Transform-Based 3D Mesh Retrieval," Proceedings of the International Society for Optical Engineering (SPIE '01), vol. 4476, pp. 175 - 185, July 2001.

[95] Titus Zaharia and Françoise Prêteux, "Shape-Based Retrieval of 3D Mesh Models," Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '2002), vol. 1, pp. 437 - 440, August 2002.

[96] George Leifman, Sagi Katz, Ayellet Tal, and Ron Meir, "Signatures of 3D Models for Retrieval," 4th Israel Korea Bi-National Conference on Geometric Modeling and Computer Graphics, pp. 159 - 163, 2003.

[97] Michael Elad, Ayellet Tal, and Sigal Ar, "Directed Search In A 3D Objects Database Using SVM," Hewlett Packard Laboratories, Technical Report HPL-2000-20R1 2000.

[98] Meng Yu, Indriyati Atmosukarto, Wee Kheng Leow, Zhiyong Huang, and Rong Xu, "3D Model Retrieval with Morphing-Based Geometric and Topological Feature Maps," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '03), vol. 2, pp. 656 - 661, June 2003.

[99] Raif M. Rustamov, "Laplace-Beltrami Eigenfunctions for Deformation Invariant Shape Representation," Proceedings of the Fifth Eurographics Symposium on Geometry Processing (SGP '07), pp. 225 - 233, June 2007.

[100] Maks Ovsjanikov, Jian Sun, and Leonidas Guibas, "Global Intrinsic Symmetries of Shapes," Computer Graphics Forum (CGF), vol. 27, no. 5, pp. 1341 - 1348, September 2008.

[101] Diana Mateus, Radu Horaud, David Knossow, Fabio Cuzzolin, and Edmond Boyer, "Articulated Shape Matching Using Laplacian Eigenfunctions and Unsupervised Point Registration," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1 - 8, June 2008.

[102] Mona Mahmoudi and Guillermo Sapiro, "Three-Dimensional Point Cloud Recognition via Distributions of Geometric Distances," Graphical Models, vol. 71, no. 1, pp. 22 - 31, January 2009.

[103] Alexander M. Bronstein, Michael M. Bronstein, Ron Kimmel, Mona Mahmoudi, and Guillermo Sapiro, "A Gromov-Hausdorff Framework with Diffusion Geometry for Topologically-Robust Non-rigid Shape Matching," International Journal of Computer Vision (IJCV), vol. 89, no. 2-3, pp. 266 - 286, September 2010.

[104] Dan Raviv, Michael M. Bronstein, Guillermo Sapiro, Alexander M. Bronstein, and Ron Kimmel, "Diffusion Symmetries of Non-Rigid Shapes," Proceedings of the Eighth International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT '10), 2010.

[105] Elton P. Hsu, Stochastic Analysis on Manifolds. American Mathematical Society, 2002.

[106] Ronald R. Coifman and Stéphane Lafon, "Diffusion Maps," Applied and Computational Harmonic Analysis, vol. 21, no. 1, pp. 5 - 30, July 2006.

[107] Stéphane S. Lafon, "Diffusion Maps and Geometric Harmonics," Yale University, PhD thesis 2004.

[108] Jian Sun, Maks Ovsjanikov, and Leonidas Guibas, "A Concise and Provably Informative Multi-Scale Signature Based On Heat Diffusion," Proceedings of the Seventh Eurographics Symposium on Geometry Processing (SGP '09), pp. 1383 - 1392, 2009.

[109] A. M. Bronstein et al., "SHREC 2010: Robust Feature Detection and Description Benchmark," Proceedings of the Eurographics Workshop on 3D Object Retrieval (3DOR '10), pp. 79 - 86, 2010.

[110] Michael M. Bronstein, "Scale-Invariant Heat Kernel Signatures for Non-Rigid Shape Recognition," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '10), pp. 1704 - 1711, June 2010.

[111] Dan Raviv, Michael M. Bronstein, Alexander M. Bronstein, and Ron Kimmel, "Volumetric Heat Kernel Signatures," Proceedings of the ACM Workshop on 3D Object Retrieval (3DOR '10), October 2010.

[112] Mathieu Aubry, Ulrich Schlickewei, and Daniel Cremers, "The Wave Kernel Signature: A Quantum Mechanical Approach to Shape Analysis," Proceedings of IEEE International Conference on Computer Vision (ICCV) - Workshop on Dynamic Shape Capture and Analysis (4DMOD), pp. 1626 - 1633, November 2011.


[113] Mihael Ankerst, Gabi Kastenmüller, Hans Peter Kriegel, and Thomas Seidl, "3D Shape Histograms for Similarity Search and Classification in Spatial Databases," Proceedings of the Sixth International Symposium on Large Spatial Databases (SSD '99), vol. 1651, pp. 207 - 226, July 1999.

[114] Daniel A. Keim, "Efficient Geometry-based Similarity Search of 3D Spatial Databases," Proceedings of the International Conference on Management of Data (SIGMOD '99), vol. 28, no. 2, pp. 419 - 430, June 1999.

[115] Hans Peter Kriegel, Stefan Brecheisen, Peer Kröger, Martin Pfeifle, and Matthias Schubert, "Using Sets of Feature Vectors for Similarity Search on Voxelized CAD Objects," Proceedings of the ACM International Conference on Management of Data (SIGMOD '03), pp. 587 - 598, June 2003.

[116] Marcin Novotni and Reinhard Klein, "A Geometric Approach to 3D Object Comparison," Proceedings of the International Conference on Shape Modeling and Applications (SMI '01), pp. 167 - 175, May 2001.

[117] Hans Peter Kriegel et al., "Effective Similarity Search on Voxelized CAD Objects," Proceedings of the Eighth International Conference on Database Systems for Advanced Applications (DASFAA '03), pp. 27 - 36, March 2003.

[118] Marcel Körtgen, Gil-Joo Park, Marcin Novotni, and Reinhard Klein, "3D Shape Matching with 3D Shape Contexts," Proceedings of the Seventh International Conference in Central Europe on Computer Graphics and Visualization (WSCG '03), pp. 34 - 43, April 2003.

[119] Mihael Ankerst, Gabi Kastenmüller, Hans Peter Kriegel, and Thomas Seidl, "Nearest Neighbor Classification in 3D Protein Databases," Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology (ISMB '99), pp. 34 - 43, August 1999.

[120] Serge Belongie, Jitendra Malik, and Jan Puzicha, "Shape Matching and Object Recognition Using Shape Contexts," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI '02), vol. 24, no. 4, pp. 509 - 522, April 2002.

[121] Ryutarou Ohbuchi, Tomo Otagiri, Masatoshi Ibato, and Tsuyoshi Takei, "Shape-Similarity Search of Three-Dimensional Models Using Parameterized Statistics," Proceedings of the Tenth Pacific Conference on Computer Graphics and Applications (PG '02), pp. 265 - 274, October 2002.

[122] Cheuk Yiu Ip, Daniel Lapadat, Leonard Sieger, and William C. Regli, "Using Shape Distributions to Compare Solid Models," Proceedings of the Seventh ACM Symposium on Solid Modeling and Applications (SMA '02), pp. 273 - 280, June 2002.

[123] Cheuk Yiu Ip, William C. Regli, Leonard Sieger, and Ali Shokoufandeh, "Automated Learning of Model Classifications," Proceedings of the Eighth ACM Symposium on Solid Modeling and Applications (SM '03), pp. 322 - 327, June 2003.

[124] Heather J. Rea, R. Sung, Jonathan R. Corney, and D.E.R. Clark, "Identifying 3d Object Features using Shape Distributions," Proceedings of the 18th International Conference on Computer Aided Production Engineering (CAPE '03), March 2003.

[125] Ryutarou Ohbuchi and Tsuyoshi Takei, "Shape-Similarity Comparison of 3D Models Using Alpha Shapes," Proceedings of the eleventh Pacific Conference on Computer Graphics and Applications (PG '03), pp. 293 - 302, October 2003.

[126] Yi Liu, Jiantao Pu, Hongbin Zha, Weibin Liu, and Yusuke Uehara, "Thickness Histogram and Statistical Harmonic Representation for 3D Model Retrieval," Proceedings of the Second International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT '04), pp. 896 - 903, September 2004.

[127] Heather J. Rea, D.E.R. Clark, Jonathan R. Corney, and Nick K. Taylor, "A Surface Partitioning Spectrum (SPS) for Retrieval and Indexing of 3D CAD Models," Proceedings of the Second International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT '04), pp. 167 - 174, September 2004.

[128] Jiantao Pu et al., "3D Model Retrieval Based On 2D Slice Similarity Measurements," Proceedings of the Second International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT '04), pp. 95 - 101, September 2004.

[129] Y. Lamdan and H. J. Wolfson, "Geometric Hashing: A General and Efficient Model-Based Recognition Scheme," Proceedings of the Second International Conference on Computer Vision (ICCV), pp. 238 - 249, December 1988.

[130] Nathaniel Leibowitz, Zipora Y Fligelman, Ruth Nussinov, and Haim J. Wolfson, "Multiple Structural Alignment and Core Detection by Geometric Hashing," Proceedings of the Seventh International Conference on Intelligent Systems in Molecular Biology (ISMB), pp. 169–177, August 1999.

[131] Haim J. Wolfson and Isidore Rigoutsos, "Geometric Hashing: An Overview," Proceedings of the IEEE International Conference on Computational Science and Engineering (CSE), vol. 4, no. 4, pp. 10 - 21, October 1997.


[132] André P. Guéziec, Xavier Pennec, and Nicholas Ayache, "Medical Image Registration Using Geometric Hashing," Proceedings of the IEEE International Conference on Computational Science and Engineering (CSE), vol. 4, no. 4, pp. 29 - 41, October-December 1997.

[133] Dejan V. Vranic, D. Saupe, and J. Richter, "Tools for 3D-Object Retrieval: Karhunen–Loève Transform and Spherical Harmonics," Proceedings of the IEEE Fourth Workshop on Multimedia Signal Processing, pp. 293 - 298, October 2001.

[134] J. Assfalg, Alberto Del Bimbo, and Pietro Pala, "Curvature Maps for 3D CBR," Proceedings of the International Conference on Multimedia and Expo (ICME’03), vol. 2, July 2003.

[135] Gianni Antini, Stefano Berretti, Alberto Del Bimbo, and Pietro Pala, "Retrieval of 3D Objects Using Curvature Correlograms," Proceedings of the International Conference on Multimedia and Expo (ICME’05), pp. 735 - 738, July 2005.

[136] Michael Kazhdan, Thomas Funkhouser, and Szymon Rusinkiewicz, "Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors," Proceedings of the ACM SIGGRAPH Symposium on Geometry Processing (SGP '03), June 2003.

[137] Marcin Novotni and Reinhard Klein, "3D Zernike Descriptors for Content Based Shape Retrieval," Proceedings of the Eighth ACM Symposium on Solid Modeling and Applications (SM '03), pp. 216 - 225, June 2003.

[138] Yi Liu et al., "A Robust Method for Shape-Based 3D Model Retrieval," Proceedings of the Twelfth Pacific Conference on Computer Graphics and Applications (PG '04), pp. 3 - 9, October 2004.

[139] Julien Ricard, David Coeurjolly, and Atilla Baskurt, "Generalizations of Angular Radial Transform for 2D and 3D Shape Retrieval," Pattern Recognition Letters (PRL), vol. 26, no. 14, pp. 2174 - 2186, October 2005.

[140] George Leifman, Ron Meir, and Ayellet Tal, "Semantic-Oriented 3D Shape Retrieval using Relevance Feedback," The Visual Computer (VC), vol. 21, no. 8 - 10, pp. 865 - 875, 2005.

[141] Indriyati Atmosukarto, Wee Kheng Leow, and Zhiyong Huang, "Feature Combination and Relevance Feedback for 3D Model Retrieval," Proceedings of the 11th International Multimedia Modelling Conference (MMM '05), pp. 334 - 339, January 2005.

[142] Biao Leng and Zheng Qin, "A Powerful Relevance Feedback Mechanism for Content-Based 3D Model Retrieval," Multimedia Tools and Applications (MTA), vol. 40, no. 1, pp. 135 - 150, October 2008.

[143] D. Giorgi, P. Frosini, M. Spagnuolo, and B. Falcidieno, "3d Relevance Feedback via Multilevel Relevance Judgements," The Visual Computer (VC), vol. 26, no. 10, pp. 1321 - 1338, October 2010.

[144] Yi Liu, Hongbin Zha, and Hong Qin, "Shape Topics: A Compact Representation and New Algorithms for 3D Partial Shape Retrieval," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), vol. 2, pp. 2025 - 2032, 2006.

[145] Josef Sivic and Andrew Zisserman, "Video Google: A Text Retrieval Approach to Object Matching in Videos," Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV '03), vol. 2, October 2003.

[146] Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, and Cédric Bray, "Visual Categorization with Bags of Keypoints," Proceedings of the European Conference on Computer Vision (ECCV '04), pp. 59 - 74, 2004.

[147] R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman, "Learning Object Categories from Google's Image Search," Proceedings of the Eleventh IEEE International Conference on Computer Vision (ICCV '05), vol. 2, pp. 1816 - 1823, October 2005.

[148] J. Winn, A. Criminisi, and T. Minka, "Object Categorization by Learned Universal Visual Dictionary," Proceedings of the Eleventh IEEE International Conference on Computer Vision (ICCV '05), vol. 2, pp. 1800 - 1807, October 2005.

[149] Ryutarou Ohbuchi, Kunio Osada, Takahiko Furuya, and Tomohisa Banno, "Salient Local Visual Features for Shape-Based 3D Model Retrieval," Proceedings of the International Conference on Shape Modeling and Applications (SMI '08), pp. 93 - 102, June 2008.

[150] David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision (IJCV), vol. 60, no. 2, November 2004.

[151] Takahiko Furuya and Ryutarou Ohbuchi, "Dense Sampling and Fast Encoding For 3D Model Retrieval Using Bag-Of-Visual Features," Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR '09), no. 26, pp. 1 - 8, July 2009.

[152] Alexander M. Bronstein, Michael M. Bronstein, Leonidas J. Guibas, and Maks Ovsjanikov, "Shape Google: Geometric Words and Expressions for Invariant Shape Retrieval," ACM Transactions on Graphics (TOG), vol. 30, no. 1, January 2011.

[153] AIM@SHAPE. [Online]. http://www.aim-at-shape.net/

[154] Mattia Natali, Silvia Biasotti, Giuseppe Patanè, and Bianca Falcidieno, "Graph-Based Representations of Point Clouds," Graphical Models, vol. 73, no. 5, September 2011.

[155] Marc Levoy et al., "The Digital Michelangelo Project: 3D Scanning of Large Statues," Proceedings of the Annual Conference on Computer Graphics (SIGGRAPH), pp. 131–144, 2000.

[156] Thomas A. Funkhouser and Philip Shilane, "Partial Matching of 3D Shapes with Priority-Driven Search," Proceedings of the Fourth Eurographics Symposium on Geometry Processing (SGP '06), pp. 131 - 142, June 2006.

[157] Maria Petrou and Panagiota Bosdogianni, Image Processing: The Fundamentals, 1st ed. New York, USA: John Wiley & Sons, Inc., August 1999.

[158] Yu-Te Shen, Ding-Yun Chen, Xiao-Pei Tian, and Ming Ouhyoung, "3D Model Search Engine Based on Lightfield Descriptors," Proceedings of Eurographics 2003, 2003.

[159] A. E. Johnson and M. Hebert, "Surface Matching for Object Recognition in Complex Three-Dimensional Scenes," Image and Vision Computing (IVC), vol. 16, no. 9-10, pp. 635 - 651, July 1998.

[160] Andrew Edie Johnson, "Spin-Images: A Representation for 3D Surface Matching," Ph.D. thesis, Robotics Institute, Carnegie Mellon University, 1997.

[161] Princeton Shape Retrieval and Analysis group. (2007) 3D Model search engine. [Online]. http://shape.cs.princeton.edu/

[162] Eric Paquet and Marc Rioux, "Nefertiti: A Query by Content System for Three-Dimensional Model and Image Databases Management," Image and Vision Computing (IVC), vol. 17, no. 2, pp. 157 - 166, February 1999.

[163] Steven Yantis, Visual Perception: Essential Readings. Philadelphia, PA: Psychology Press, 2001.

[164] David Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W.H. Freeman, 1982.

[165] Jules Vleugels and Remco C. Veltkamp, "Efficient Image Retrieval through Vantage Objects," Pattern Recognition (PR), vol. 35, no. 1, pp. 69 - 80, January 2002.

[166] Thomas B. Sebastian, Philip N. Klein, and Benjamin B. Kimia, "Shock-Based Indexing into Large Shape Databases," Proceedings of the Seventh European Conference on Computer Vision-Part III (ECCV '02), vol. 2352, pp. 731 - 746, May 2002.

[167] J. J. Rocchio, "Relevance Feedback in Information Retrieval," The SMART Retrieval System: Experiments in Automatic Document Processing, pp. 313 - 323, 1971.

[168] Raimondo Schettini, Gianluigi Ciocca, and Isabella Gagliardi, "Content-Based Color Image Retrieval with Relevance Feedback," Proceedings of the International Conference on Image Processing (ICIP), vol. 3, pp. 75 - 79, October 1999.

[169] Kuiyang Lou, S. Prabhakar, and Karthik Ramani, "Content-Based Three-Dimensional Engineering Shape Search," Proceedings of the 20th International Conference on Data Engineering (ICDE '04), pp. 754 - 765, March 30 - April 2, 2004.

[170] Kuiyang Lou, Natraj Iyer, Subramaniam Jayanti, Yagnanarayanan Kalyanaraman, and Karthik Ramani, "Supporting Effective and Efficient Three-Dimensional Shape Retrieval," Proceedings of the Tools and Methods for Competitive Engineering Conference (TMCE '04), pp. 199 - 210, April 2004.

[171] Cha Zhang and Tsuhan Chen, "An Active Learning Framework for Content-Based Information Retrieval," IEEE Transactions on Multimedia (TMM), vol. 4, no. 2, pp. 260 - 268, June 2002.

[172] Jonathan R. Corney, Heather J. Rea, Doug Clark, John Pritchard, and Roddy Macleod, "Coarse Filters for Shape Matching," IEEE Computer Graphics and Applications (CGA), vol. 22, no. 3, pp. 65 - 74, May/June 2002.

[173] Benjamin Bustos, Daniel Keim, Dietmar Saupe, Tobias Schreck, and Dejan Vranic, "Automatic Selection and Combination of Descriptors for Effective 3D Similarity Search," Proceedings of the IEEE Sixth International Symposium on Multimedia Software Engineering (ISMSE '04), December 2004.

[174] Benjamin Bustos, Daniel A. Keim, Dietmar Saupe, Tobias Schreck, and Dejan V. Vranic, "Using Entropy Impurity for Improved 3D Object Similarity Search," Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '04), pp. 1303 - 1306, June 2004.