
From Angular Manifolds to the Integer Lattice: Guaranteed Orientation Estimation with Application to Pose Graph Optimization

Luca Carlone, Member Andrea Censi, Member

Abstract: Pose graph optimization from relative measurements is challenging because of the angular component of the poses: the variables live on a manifold product with nontrivial topology, and the likelihood function is non-convex and has many local minima. Due to these issues, iterative solvers are not robust to large amounts of noise. This paper describes a global estimation method, called MOLE2D, for the estimation of the nodes’ orientation in a pose graph. We demonstrate that the original nonlinear optimization problem on the manifold product is equivalent to an unconstrained quadratic optimization problem on the integer lattice. Exploiting this insight, we show that, in general, the maximum likelihood estimate alone cannot be considered a reliable estimator. Therefore, MOLE2D returns a set of point estimates, for which we can derive precise probabilistic guarantees. Experiments show that the method is able to tolerate extreme amounts of noise, far above all noise levels of sensors used in applications. Using MOLE2D’s output to bootstrap the initial guess of iterative pose graph optimization methods improves their robustness and makes them avoid local minima even for high levels of noise.

Index Terms: pose graph optimization, SLAM, mobile robots, orientation estimation, SO(2) manifold, multi-hypothesis estimation, integer quadratic programming

I. INTRODUCTION

A pose graph is a model commonly used to formalize the Simultaneous Localization and Mapping (SLAM) problem [1], [2]. Each node in the graph represents the pose of a mobile robot at a given time. Two nodes share an edge if a relative measurement between the two poses is available. Relative measurements might be obtained by means of proprioceptive sensors (e.g., wheel odometry) or exteroceptive-sensor-based techniques (e.g., scan matching or visual odometry). The objective of pose graph optimization is to estimate the poses that maximize the likelihood of the relative measurements. Once the poses have been estimated, it is possible to construct a map of the environment by placing all measurements in the same global coordinate frame. Finding the global minimum of the negative log-likelihood is difficult mainly because of the angular component: nodes’ orientations belong to a product of manifolds that have a nontrivial topology ($SO(2)^n$ or $SO(3)^n$, with $n$ the number of observable poses). In fact, if the orientations were known, pose optimization would be a linear problem [3]. With unknown orientations, the problem is nonlinear and non-convex, with multiple local minima even in simple instances [4].

The state-of-the-art techniques for pose optimization, such as g2o [5] and Toro [6], are iterative optimization methods that minimize a cost function starting from an initial guess. None can guarantee convergence to a global minimum, and it is observed that they get trapped in local minima in the presence of large measurement noise [6]. This paper focuses on the problem of estimating the nodes’ orientations from pairwise relative angular measurements in the planar case (orientations in SO(2)), which is of interest in several application scenarios, ranging from domestic environments to factory floors. We describe MOLE2D, a multi-hypothesis global optimization method that does not suffer from local minima, even with extreme amounts of noise, and has precise probabilistic guarantees. Experiments show that, once the nodes’ orientations have been estimated with our method, they can be used as a first guess for traditional solvers to obtain a robust and accurate solution for pose graph optimization.

L. Carlone (corresponding author) is with the College of Computing, Georgia Institute of Technology, Atlanta, GA, USA. [email protected]

A. Censi is with the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, USA. [email protected]

Related work in robotics: The formulation of SLAM as an optimization problem on a graph traces back to Lu and Milios [1]. Gutmann and Konolige [7] describe how to build a pose graph in incremental fashion from laser scan measurements. A large amount of subsequent work focuses on speeding up computation. Duckett et al. [8] use a Gauss-Seidel relaxation to minimize residual errors. Konolige [9] describes a reduction scheme to improve efficiency of nonlinear optimization. Thrun and Montemerlo [2] describe a conjugate gradient-based optimization that enables large-scale estimation. Frese et al. [10] propose a multilevel relaxation that considerably reduces the computation time by applying a multi-grid algorithm. Olson et al. [11] propose an alternative parametrization for the problem, which entails several advantages in terms of computation and robustness. Grisetti et al. [6] extend this framework, proposing a method (Toro) that is based on stochastic gradient descent and uses a tree-based parametrization. Kaess et al. [12]–[14] present an elegant formalization of SLAM using a Bayes tree model and investigate incremental estimation techniques. Several recent papers focus on the manifold structure of the problem: the domain of the problem is a product of manifolds SE(2) or SE(3), and this aspect requires a suitable treatment when using iterative optimization techniques [15], or closed-form problem-specific methods [16]–[18]. Kuemmerle et al. [5] describe the g2o framework for solving general optimization problems with variables belonging to manifolds. Olson and Agarwal [19], and Sünderhauf and Protzel [20], [21] extend this framework, with the purpose of increasing robustness to outliers. Rosen et al. [22] propose the use of a trust-region method to enhance convergence of iterative optimization. The idea of improving convergence by bootstrapping iterative solvers with suitable estimates has appeared in different forms in the literature [23]–[25]. The theoretical analysis of the problem is slightly behind applications; see Knuth and Barooah [26], Huang et al. [27], and Wang et al. [4].

Related work in other fields: Other fields outside of robotics deal with problems formally equivalent to pose graph optimization, such as attitude synchronization [28] and calibration of camera networks [29]. Often these problems are formulated in a distributed setting, where a set of agents must estimate a local state (pose, position, orientation, etc.) based on inter-agent measurements (relative distance, relative bearing, etc.). For example, Barooah and Hespanha [3], [30] consider the problem of estimating positions of robots in a team from relative position measurements, assuming known orientations. Knuth and Barooah [31] focus on distributed computation. The case in which the nodes’ positions have to be estimated from bearing measurements was pioneered by Stanfield [32] and further developed in more recent work [33]–[35]. Another common setup is the one in which nodes’ positions are estimated from pairwise distance measurements [36]–[39].

Relation with previous work: Previous work by the first author and colleagues [25], [40] proposed a fast approximate method for computing orientations in a pose graph, and then solving for the translations given the orientations. Those works motivated this paper, but it follows a different route. This paper presents a formal treatment of the orientation-only estimation problem; rather than proposing an approximation, we care about finding a guaranteed orientation estimate. In hindsight, the results of this paper allow us to conclude that the rounding operation proposed to solve the wraparound problem in [25] is only a heuristic to solve a quadratic integer program, which does not necessarily lead to the optimal solution [41]. In more detail, the proof of equivalence to a quadratic integer program, the MOLE2D algorithm, and the experimental results are original and have not been published in previous work.

Paper outline: Our results derive from the joint application of graph theory, differential geometry, and integer programming. Section II recalls the necessary preliminaries.

Section III introduces the usual maximum likelihood formalization for the orientation estimation problem, with extra care to assumptions and problem symmetries. Section IV proves that the maximum likelihood orientation estimation problem, with domain $SO(2)^n$ (where $n$ is the number of observable nodes), is equivalent to an unconstrained, quadratic, integer optimization problem on $\mathbb{Z}^\ell$, where $\ell$ is the number of cycles.

Section V shows that the maximum likelihood orientation estimator may suffer from a bias, which intuitively corresponds to choosing the wrong number of windings for one or more loops in the graph. For this reason, it is not necessarily the best choice for the problem at hand.

Sections VI and VII describe the MOLE2D algorithm (Multi-hypothesis Orientation-from-Lattice Estimation in 2D), which returns a set of hypotheses for the nodes’ orientations. It has precise probabilistic guarantees: at least one hypothesis is “close” to the actual nodes’ orientations within a given confidence level. The algorithm returns a small set of plausible hypotheses if the “frame of reference”, described by the cycle basis matrix, is chosen appropriately. Choosing the minimum cycle basis minimizes the expected number of hypotheses.

Section VIII discusses the performance of MOLE2D on standard SLAM datasets. For the case of orientation estimation, we explore the trade-off in performance implied by the choice of the cycle basis matrix used by MOLE2D. The results confirm the theoretical predictions. In common problem instances the set of estimates contains a single element, because the problem is very well constrained. For the case of pose optimization, we show that simply substituting the orientation estimate computed by MOLE2D in place of the odometric initial guess greatly enhances the robustness to noise in g2o.

Datasets, source code, and the technical report [42] with extended proofs are available at [43].

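TABLE I
SYMBOLS USED IN THIS PAPER

| Group | Symbol | Meaning |
|---|---|---|
| Graph | $G = (V, E)$ | Directed graph |
| | $m$ | Number of edges |
| | $n + 1$ | Number of nodes |
| | $n$ | Number of observable variables |
| | $V$ | Vertex set; $\lvert V \rvert = n + 1$ |
| | $E$ | Edge set; $\lvert E \rvert = m$ |
| | $e = (i, j) \in E$ | Edge between nodes $i$ and $j$ |
| | $\ell$ | Number of cycles; $\ell = m - n$ |
| | $w_{ij}$ | Weight associated to edge $(i, j)$ |
| | $A \in \mathbb{R}^{(n+1) \times m}$ | Incidence matrix of $G$ |
| | $A \in \mathbb{R}^{n \times m}$ | Reduced incidence matrix of $G$ |
| Cycle bases | $C_G$ | The set of all cycle bases for $G$ |
| | $c \in \{-1, 0, +1\}^m$ | Row vector describing a circuit |
| | $W(c, w)$ | Weight of a cycle $c$ |
| | $W(C, w)$ | Weight of a cycle basis $C$ |
| | $C \in \mathbb{Z}^{\ell \times m}$ | Matrix describing a cycle basis |
| | $C_\perp, C_T$ | Ordering of $C$ according to a spanning tree $T$ |
| | MCB | Minimum cycle basis for a given weight function |
| | FCB | Fundamental cycle basis built from a spanning tree |
| Geometry of angles | $SO(2)$ | 2D rotation matrices |
| | $\mathrm{Exp} : \mathbb{R} \to SO(2)$ | Exponential map |
| | $\mathrm{Log} : SO(2) \to P(\mathbb{R})$ | Logarithmic map |
| | $\mathrm{Log}_0 : SO(2) \to \mathbb{R}$ | Principal logarithmic map |
| | $\langle \cdot \rangle_{2\pi} : \mathbb{R} \to (-\pi, +\pi]$ | $2\pi$ modulus operation |
| Orientation estimation (intrinsic formalization) | $r_i^\circ \in SO(2)$ | Unknown orientation of node $i$ |
| | $r_i \in SO(2)$ | Optimization variables for orientation of node $i$ |
| | $d_{ij} \in SO(2)$ | Relative orientation measurement |
| | $\epsilon_{ij} \in SO(2)$ | Measurement error |
| | $\varepsilon_{ij} \in \mathbb{R}$ | Gaussian noise producing $\epsilon_{ij}$ |
| | $\sigma_{ij}$ | Standard deviation of $\varepsilon_{ij}$ |
| Orientation estimation (in $(-\pi, +\pi]$ coordinates) | $\theta^\circ \in (-\pi, +\pi]^n$ | Unknown orientations |
| | $\theta \in (-\pi, +\pi]^n$ | Optimization variables for orientations |
| | $\delta \in (-\pi, +\pi]^m$ | Relative orientation measurements |
| | $P_\delta \in \mathbb{R}^{m \times m}$ | Measurement covariance |
| Mixed-integer formalization in $k$ and $x$ | $x \in \mathbb{R}^n$ | Real-valued optimization variables for orientations |
| | $k \in \mathbb{Z}^m$ | Regularization vector |
| | $\theta^{\star|k}$ | Estimate of $\theta^\circ$ given $k \in \mathbb{Z}^m$ |
| Reduced formalization in cycle space | $\gamma \in \mathbb{Z}^\ell$ | Integer vector living on the cycles |
| | $\hat{\gamma} \in \mathbb{Z}^m$ | Estimator for $\gamma$ |
| | $x^{\star|\gamma}$ | Real-valued estimate of $\theta^\circ$ given $\gamma$ |
| | $\theta^{\star|\gamma} = \langle x^{\star|\gamma} \rangle_{2\pi}$ | Wrapped estimate of $\theta^\circ$ given $\gamma$ |
| | $\theta^\star = \theta^{\star|\gamma^\star}$ | Max. likelihood estimate of $\theta^\circ$ |
| | $\Gamma$ | Estimates for $\gamma$ returned by INTEGER-SCREENING |
| | $\Theta$ | Estimates of $\theta^\circ$ returned by MOLE2D |
| Miscellanea | $\lvert S \rvert$ | Cardinality of the set $S$ |
| | $I_n$ | $n \times n$ identity matrix |
| | $0_n$ ($1_n$) | Column vector of zeros (ones) of dimension $n$ |
| | $\lfloor \cdot \rfloor$ | Floor operator |
| | $\chi^2_{\ell,\alpha}$ | Quantile of the $\chi^2$ distribution with $\ell$ degrees of freedom and upper tail probability equal to $\alpha$ |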


II. PRELIMINARIES

A. Computational graph theory

Chen [44] is a popular reference for computational graph theory. Our notation (Table I) is compatible with Kavitha et al. [45], from which we take most results about cycle bases. A directed graph $G$ is a pair $(V, E)$, where the vertices or nodes $V$ are a finite set of elements, and $E \subset V \times V$ is the set of edges. Each edge is an ordered pair $e = (i, j)$. We say that $e$ is incident on nodes $i$ and $j$, leaves node $i$, called tail, and is directed towards node $j$, called head. A weighted graph also has a nonnegative weight $w_{ij}$ associated to any edge $e = (i, j)$. The numbers of nodes and edges are denoted by $n + 1$ and $m$. Our graph has $n + 1$ nodes, rather than $n$, because only $n$ independent variables will be observable, and this choice will simplify the notation later.

The incidence matrix $A$ of a directed graph is an $(n+1) \times m$ matrix with elements in $\{-1, 0, +1\}$ that exhaustively describes the graph topology. Each column of $A$ corresponds to an edge and has exactly two non-zero elements. For the column corresponding to edge $e = (i, j)$, there is a $-1$ on the $i$-th row and a $+1$ on the $j$-th row (Figure 1).

The reduced incidence matrix is obtained from the incidence matrix by removing one row. Without loss of generality, in this paper we assume that it is the first row, which corresponds to the first node that is set to the origin of the reference frame. If the incidence matrix has dimensions $(n+1) \times m$, the reduced incidence matrix has dimensions $n \times m$.
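As an illustration of the two definitions above, the following minimal Python sketch builds the incidence matrix from an edge list and obtains the reduced incidence matrix by dropping the first row. The function name `incidence_matrix`, the 0-based node indexing, and the 4-node example graph are assumptions made for this sketch; they do not come from the paper.

```python
def incidence_matrix(num_nodes, edges):
    """Return the num_nodes x len(edges) incidence matrix as a list of rows.

    `edges` is a list of (tail, head) pairs with 0-based node indices.
    """
    A = [[0] * len(edges) for _ in range(num_nodes)]
    for col, (tail, head) in enumerate(edges):
        A[tail][col] = -1   # the edge leaves its tail
        A[head][col] = +1   # and is directed towards its head
    return A


# Hypothetical example: a triangle 0 -> 1 -> 2 -> 0 plus a pendant edge 2 -> 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
A_full = incidence_matrix(4, edges)
A_reduced = A_full[1:]      # remove the row of the first node

for row in A_full:
    print(row)
```

Every column of `A_full` sums to zero, which is why one row is redundant and can be removed without losing information about the topology.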

A spanning tree is a subgraph with $n$ edges that contains all the nodes in the graph. For a given spanning tree, the edges of the original graph that do not belong to the spanning tree are called chords. A cycle is a subgraph in which every node appears in an even number of edges. A circuit is a cycle in which every node appears in exactly two edges. Once the $m$ edges have been given an arbitrary order, a directed circuit can be described by a vector of $m$ elements, in which the $e$-th element is $+1$ or $-1$ if edge $e$ is traversed respectively forwards (from tail to head) or backwards, and $0$ if it does not appear. Therefore, a circuit can be represented by a vector in $\{-1, 0, +1\}^m$. Because an edge can be traversed twice or more by a cycle, a cycle is represented by a vector $c \in \mathbb{Z}^m$.

If the graph is weighted, the weight of a cycle $c \in \mathbb{Z}^m$ is the sum of the weights of the edges that it traverses:

$$W(c, w) = \sum_{(i,j) \in E} w_{ij}\, |c_{ij}|.$$

$$A = \begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & +1 \\
+1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & +1 & -1 & 0 & 0 & 0 & +1 & 0 & 0 \\
0 & 0 & +1 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & +1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & +1 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & +1 & -1 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & +1 & -1
\end{pmatrix}$$

$$C_1 = \begin{pmatrix}
+1 & +1 & +1 & +1 & +1 & +1 & 0 & +1 & +1 \\
+1 & +1 & 0 & 0 & 0 & 0 & -1 & +1 & +1
\end{pmatrix}
\qquad
C_2 = \begin{pmatrix}
+1 & +1 & 0 & 0 & 0 & 0 & -1 & +1 & +1 \\
0 & 0 & +1 & +1 & +1 & +1 & +1 & 0 & 0
\end{pmatrix}$$

Fig. 1. To illustrate the definition of cycle basis, we use a toy graph with vertex set $V = \{A, \ldots, H\}$ and edge set $E = \{1, \ldots, 9\}$. We assume that each edge has unitary weight. The topology is entirely described by the incidence matrix $A$ of the graph, while the reduced incidence matrix is obtained from it by deleting the first row. A spanning tree is given by edges $\{1, 2, 3, 4, 5, 6, 8\}$; the corresponding chords are edge 7 and edge 9 (reported as dashed lines in the figure). $C_1$ is a cycle basis matrix for the graph, whose first circuit includes edges $\{1, 2, 3, 4, 5, 6, 8, 9\}$ and second circuit includes edges $\{1, 2, 7, 8, 9\}$. $C_2$ is a minimum cycle basis, whose first circuit includes edges $\{1, 2, 7, 8, 9\}$ and second circuit includes edges $\{3, 4, 5, 6, 7\}$.

A cycle basis of a graph is a minimal set of circuits such that any cycle in the graph can be written as a combination of the circuits in the basis. We define $C_G$ as the set of all cycle bases of the graph. The number of independent circuits in the cycle basis is called the cyclomatic number and is equal to $\ell = m - n$. A cycle basis matrix is a matrix $C \in \mathbb{Z}^{\ell \times m}$ such that each row $c^{(t)}$ describes one of the circuits (Figure 1):

$$C = \begin{pmatrix} c^{(1)} \\ \vdots \\ c^{(\ell)} \end{pmatrix} \in \mathbb{Z}^{\ell \times m}.$$

In order to simplify the notation, and without loss of generality, we order the edges such that the first $n$ edges belong to a spanning tree $T$ and the remaining $\ell$ edges are chords with respect to $T$. This allows us to write any cycle basis matrix $C$ as

$$C = (C_T \;\; C_\perp), \tag{1}$$

where $C_T \in \mathbb{Z}^{\ell \times n}$ contains the columns in $C$ corresponding to edges in $T$, and $C_\perp \in \mathbb{Z}^{\ell \times \ell}$ contains the columns in $C$ corresponding to chords with respect to $T$.

The weight of a cycle basis is the sum of the cycles’ weights:

$$W(C, w) = \sum_{t=1}^{\ell} W(c^{(t)}, w).$$

The minimum cycle basis (MCB) is a basis (not necessarily unique) that has minimum weight:

$$\mathrm{MCB}(G, w) \doteq \arg\min_{C \in C_G} W(C, w).$$

In the paper we only consider the weight function $w$ that associates to each edge the variance of the corresponding measurement. Therefore, we use the notation “MCB” omitting the dependence on the graph and on the weight function, and implying that we consider a minimum uncertainty cycle basis. Similarly, when we talk about minimum spanning tree, we refer to a minimum uncertainty spanning tree.
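The definitions of this subsection can be checked on the toy graph of Figure 1 with a short Python sketch. The edge list below is read off the columns of the incidence matrix in the figure (nodes A to H numbered 0 to 7), and unit weights are used as in the figure caption; the helper names `net_flow`, `cycle_weight`, and `basis_weight` are assumptions of this sketch.

```python
# Edges of the toy graph, read off the incidence matrix of Figure 1
# (nodes A..H are numbered 0..7; edge k is stored at index k - 1).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 2), (6, 7), (7, 0)]
num_nodes = 8

C1 = [[+1, +1, +1, +1, +1, +1, 0, +1, +1],
      [+1, +1, 0, 0, 0, 0, -1, +1, +1]]
C2 = [[+1, +1, 0, 0, 0, 0, -1, +1, +1],
      [0, 0, +1, +1, +1, +1, +1, 0, 0]]


def net_flow(cycle):
    """Incidence matrix times the cycle vector: signed in-minus-out count per node."""
    flow = [0] * num_nodes
    for (tail, head), c in zip(edges, cycle):
        flow[tail] -= c
        flow[head] += c
    return flow


def cycle_weight(cycle, weights):
    """W(c, w): sum of the edge weights times the absolute traversal counts."""
    return sum(w * abs(c) for w, c in zip(weights, cycle))


def basis_weight(basis, weights):
    """W(C, w): sum of the weights of the circuits in the basis."""
    return sum(cycle_weight(c, weights) for c in basis)


unit = [1] * len(edges)
ell = len(edges) - (num_nodes - 1)      # cyclomatic number, m - n
print(ell, basis_weight(C1, unit), basis_weight(C2, unit))
```

Each row of both matrices has zero net flow at every node, as expected for a circuit, and the basis described as minimum in the caption has the smaller total weight under unit weights.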

B. Modulus algebra

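The map $\langle \cdot \rangle_{2\pi}$ is a function from $\mathbb{R}$ to the interval $(-\pi, +\pi]$:

$$\langle \cdot \rangle_{2\pi} : \mathbb{R} \to (-\pi, +\pi],$$

which can be written explicitly as

$$\langle \omega \rangle_{2\pi} \doteq \omega + 2\pi \left\lfloor \frac{\pi - \omega}{2\pi} \right\rfloor, \tag{2}$$

where $\lfloor \cdot \rfloor$ is the floor operator. For a given $\omega$, there exists only one integer $k_\omega$, which we call regularization term, such that $\langle \omega \rangle_{2\pi} = \omega + 2\pi k_\omega$, and it is given by $k_\omega \doteq \left\lfloor \frac{\pi - \omega}{2\pi} \right\rfloor$. An equivalent definition for $k_\omega$ is

$$k_\omega = \arg\min_{k \in \mathbb{Z}} |\omega + 2\pi k|,$$

and it also holds that

$$|\langle \omega \rangle_{2\pi}| = \min_{k \in \mathbb{Z}} |\omega + 2\pi k|. \tag{3}$$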


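The modulus is not distributive with respect to addition, but it holds that

$$\langle \omega_1 + \omega_2 \rangle_{2\pi} = \langle \langle \omega_1 \rangle_{2\pi} + \langle \omega_2 \rangle_{2\pi} \rangle_{2\pi}. \tag{4}$$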
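The modulus operation and its regularization term translate directly into code. The following Python sketch implements Eq. (2) literally; the function names `regularization_term` and `wrap` are assumptions of this sketch, and floating-point rounding can matter for inputs that sit exactly on the boundary.

```python
import math

TWO_PI = 2.0 * math.pi


def regularization_term(omega):
    """Integer k such that omega + 2*pi*k lies in (-pi, +pi], as in Eq. (2)."""
    return math.floor((math.pi - omega) / TWO_PI)


def wrap(omega):
    """The 2*pi modulus of omega, with values in (-pi, +pi]."""
    return omega + TWO_PI * regularization_term(omega)


print(wrap(1.5 * math.pi))     # approximately -pi/2
print(wrap(-math.pi))          # the boundary -pi is mapped to +pi
```

Properties (3) and (4) can then be spot-checked numerically, for example by comparing `abs(wrap(w))` with the minimum of `abs(w + TWO_PI * k)` over a range of integers `k`.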

C. Differential geometry of angles

The exponential map $\mathrm{Exp} : \mathbb{R} \to SO(2)$ for the manifold SO(2) is a map from the tangent space at the identity $\mathfrak{so}(2) \simeq \mathbb{R}$ to the manifold. This map is surjective but not bijective.

The logarithmic map $\mathrm{Log} : SO(2) \to P(\mathbb{R})$ is the right inverse of the exponential map, and it maps an angle in SO(2) to all elements in the tangent space that have the same exponential. Here, “$P(\mathbb{R})$” denotes the power set of $\mathbb{R}$. Note that the fact that the exponential map is not invertible is an intrinsic property and does not depend on a particular parametrization.

The logarithmic map satisfies the Lie group property

$$\mathrm{Log}(s^{-1}) = -\mathrm{Log}(s) \tag{5}$$

and the Abelian property

$$\mathrm{Log}(s_1 s_2) = \mathrm{Log}(s_1) + \mathrm{Log}(s_2). \tag{6}$$

The principal logarithm map $\mathrm{Log}_0 : SO(2) \to \mathbb{R}$ is a bijective function that chooses one particular element on the tangent space, namely, the closest to the origin. A property of the principal logarithm is that it can be written with the map $\langle \cdot \rangle_{2\pi}$:

$$\mathrm{Log}_0(s) = \langle \mathrm{Log}(s) \rangle_{2\pi}. \tag{7}$$

If SO(2) is parametrized by angular coordinates in $(-\pi, +\pi]$, then the coordinate version of Exp is simply the modulus $\langle \cdot \rangle_{2\pi}$, while the principal logarithm maps a rotation matrix to the corresponding angle of rotation in $(-\pi, +\pi]$.
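A concrete way to read these maps is through 2x2 rotation matrices. The Python sketch below is only an illustration: the names `exp_map`, `log0`, and `compose` are assumptions, and `math.atan2` is used as a stand-in for the principal logarithm (it agrees with the $(-\pi, +\pi]$ convention except possibly at the boundary point when signed zeros are involved).

```python
import math


def exp_map(angle):
    """Exponential map: a real number on the tangent space -> 2x2 rotation matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def log0(rotation):
    """Principal logarithm: rotation matrix -> angle of rotation."""
    return math.atan2(rotation[1][0], rotation[0][0])


def compose(r1, r2):
    """Product of two 2x2 rotation matrices."""
    return [[sum(r1[i][k] * r2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


x = 7.0                          # a tangent-space value outside (-pi, +pi]
print(log0(exp_map(x)))          # about 7 - 2*pi: Exp is not injective
print(log0(compose(exp_map(0.3), exp_map(0.4))))   # about 0.7, cf. Eq. (6)
```

The first printed value shows that two different tangent-space values (7 and 7 minus one full turn) have the same exponential, which is why Log returns a set and the principal logarithm has to pick one representative.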

D. Wrapped Gaussian distribution on the circle

The wrapped Gaussian distribution on the circle is the generalization of a Gaussian distribution [46], [47], in the sense that it is the solution of the heat equation on the circle, and has several other analogous properties. It can be obtained by applying the exponential map to a Gaussian variable that lives on the tangent space:

$$\epsilon = \mathrm{Exp}(\varepsilon), \quad \text{with } \varepsilon \sim \mathcal{N}(0, \sigma^2), \tag{8}$$

where $\sigma$ is the scaling parameter for the wrapped distribution. The probability density function for a wrapped Gaussian $W_{\sigma^2} : SO(2) \to \mathbb{R}_+$ can be written as

$$W_{\sigma^2}(\epsilon) = \frac{1}{\sigma\sqrt{2\pi}} \sum_{k=-\infty}^{+\infty} \exp\left( \frac{-(\mathrm{Log}_0(\epsilon) + 2\pi k)^2}{2\sigma^2} \right).$$

Note that $\mathrm{Log}_0$ returns a value in $(-\pi, +\pi]$.

A wrapped Gaussian may behave very differently from a Gaussian density. For instance, as the noise increases, a Gaussian density would tend pointwise to 0, while the wrapped Gaussian distribution tends to the uniform distribution on the circle: $\lim_{\sigma^2 \to \infty} W_{\sigma^2} = 1/(2\pi)$. Other properties are instead maintained, such as the closure with respect to convolution [47].

Lemma 1. If $s_1 \sim W_{\sigma_1^2}$ and $s_2 \sim W_{\sigma_2^2}$, then $s_1 s_2 \sim W_{\sigma_1^2 + \sigma_2^2}$.
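The density of the wrapped Gaussian can be evaluated numerically by truncating the infinite sum. The Python sketch below works in angle coordinates; the truncation rule (keeping terms within roughly six standard deviations), the function names, and the midpoint-rule integration are assumptions chosen for illustration, not part of the paper.

```python
import math


def wrapped_gaussian_pdf(angle, sigma):
    """Density of the wrapped Gaussian at `angle` (radians), truncating the series."""
    # Assumption: terms beyond about six standard deviations are negligible.
    k_max = int(math.ceil(6.0 * sigma / (2.0 * math.pi))) + 1
    total = sum(math.exp(-((angle + 2.0 * math.pi * k) ** 2) / (2.0 * sigma ** 2))
                for k in range(-k_max, k_max + 1))
    return total / (sigma * math.sqrt(2.0 * math.pi))


def integrate_over_circle(sigma, samples=2000):
    """Midpoint-rule integral of the density over (-pi, +pi]."""
    step = 2.0 * math.pi / samples
    return step * sum(wrapped_gaussian_pdf(-math.pi + (i + 0.5) * step, sigma)
                      for i in range(samples))


print(integrate_over_circle(0.5))                               # close to 1
print(wrapped_gaussian_pdf(1.0, 10.0), 1.0 / (2.0 * math.pi))   # nearly equal
```

The second line illustrates the limit behaviour discussed above: for a large scaling parameter the density is practically flat and close to the uniform value on the circle.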

III. MAXIMUM LIKELIHOOD ORIENTATION ESTIMATION

We introduce the standard formulation for maximum-likelihood estimation of the orientations given relative measurements. Let $G$ be a directed graph with $n + 1$ nodes. Each node has an unknown orientation $r_i^\circ \in SO(2)$. The set of $m$ edges $E$ corresponds to the available measurements. For any edge $(i, j) \in E$ we observe the relative orientation

$$d_{ij} = (r_i^\circ)^{-1} r_j^\circ\, \epsilon_{ij} \in SO(2), \tag{9}$$

where $\epsilon_{ij}$ is a random variable on SO(2) representing measurement noise, and assumed to be distributed according to a wrapped Gaussian with scaling parameter $\sigma_{ij} > 0$. We refer to $\sigma_{ij}^2$ as the variance, as it parametrizes the Gaussian noise before wrapping (Section II-D). The first formalization of the problem is intrinsic on the manifold SO(2).

Problem 1 (Intrinsic formulation of maximum likelihood orientation estimation in the absolute frame). Given the set of relative observations $\{d_{ij}\} \in SO(2)^m$, for $(i, j) \in E$, and the corresponding variances $\sigma_{ij}^2 > 0$, find the set of minimizers $S_1 \subset SO(2)^{n+1}$ that satisfies

$$S_1 = \arg\min_{\{r_i\} \in SO(2)^{n+1}} \sum_{(i,j) \in E} -\log W_{\sigma_{ij}^2}\!\left( d_{ij}^{-1} r_i^{-1} r_j \right). \tag{10}$$

The symbol $r_i$ denotes the optimization variable associated to the orientation of the $i$-th node, while $r_i^\circ$ is the true value.

Remark 2 (Use of the wrapped Gaussian). By modeling the noise as a wrapped Gaussian in Problem 1 we made explicit an assumption often kept implicit in related work focusing on the optimization perspective (e.g., [5], [15]), for which one just needs to define an objective function without necessarily defining a generative model for the noise. For small measurement errors, assuming that the objective function is quadratic on the tangent space implies that the noise is distributed according to a wrapped Gaussian (Lemma 6). Other modeling possibilities include the Von Mises or Bingham distributions [48]. We use the wrapped Gaussian for angular measurements for the same reasons we use the Gaussian in Euclidean space: for theoretical reasons, as it is privileged among all distributions because it maximizes entropy for a given variance; for simplicity, as it has the semigroup property (Lemma 1) that allows for finite-dimensional parametrization; for practical reasons, as linear(ized) models play well with Gaussian distributions and sometimes allow for closed forms.

Remark 3 (Independence assumption). Problem 1 assumes that the relative orientation observations are independent. This is violated if the same sensor measurement is used for computing two relative orientation observations, as happens in scan matching. In general, assuming uncorrelated measurements leads to estimators which are still consistent though not optimal. We make the independence assumption mainly for notational simplicity, to write the likelihood as a sum of terms each using only one measurement. This is convenient as we go through multiple reformulations of the problem. Once we get to vectorial notations, we can easily accommodate correlated measurements; for instance, Problem 5 is valid also for a nondiagonal covariance matrix.


A. Observability and symmetries

All optimization problems in this paper are posed as finding a set of minimizers, rather than the “optimal solution”. The set of minimizers is indicated as $S_i$ for Problem $i$. Only in some cases will we be able to conclude that there is a unique solution, and hence that $S_i$ has only one element. We need to be careful about keeping track of the set of minimizers, because the first part of the paper (Section IV) consists in transforming one problem to another, sometimes changing the domain or introducing extra variables. To facilitate bookkeeping, we use the concept of symmetry: a symmetry of an optimization problem is an invertible transformation of the unknown variables that preserves the value of the objective function. Speaking of symmetries is a formal way of speaking of unobservability from the algebraic/geometric point of view. Problem 1 has one symmetry that corresponds to the well-known fact that the absolute orientations are not observable from only relative measurements: the relative measurements do not change if the nodes’ orientations are rotated by the same amount.

Formally, for any rotation matrix $s \in SO(2)$, the objective function (10) is invariant to the invertible function

$$f_s : SO(2)^{n+1} \to SO(2)^{n+1}, \qquad \{r_i\} \mapsto \{s\, r_i\}.$$

To avoid this ambiguity, and following standard convention, we fix the orientation of the first node to the arbitrary value $r_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. The problem can then be restated using only $n$ variables.
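The invariance of the objective under a common rotation can be checked numerically. The following Python sketch writes the cost (10) in angle coordinates, generates measurements according to Eq. (9), and compares the cost before and after rotating every node by the same amount. The 4-node loop, the noise level, the truncation `k_max`, and all function names are assumptions made for this illustration.

```python
import math
import random


def wrap(omega):
    """The 2*pi modulus, with values in (-pi, +pi]."""
    return omega + 2.0 * math.pi * math.floor((math.pi - omega) / (2.0 * math.pi))


def neg_log_wrapped_gaussian(angle, sigma, k_max=3):
    """-log of the wrapped Gaussian density, series truncated to |k| <= k_max."""
    a = wrap(angle)
    total = sum(math.exp(-((a + 2.0 * math.pi * k) ** 2) / (2.0 * sigma ** 2))
                for k in range(-k_max, k_max + 1))
    return -math.log(total / (sigma * math.sqrt(2.0 * math.pi)))


def cost(theta, edges, delta, sigma):
    """Objective (10) in angle coordinates; theta has one entry per node."""
    return sum(neg_log_wrapped_gaussian(theta[j] - theta[i] - d, s)
               for (i, j), d, s in zip(edges, delta, sigma))


random.seed(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]          # hypothetical 4-node loop
sigma = [0.1] * len(edges)
truth = [random.uniform(-math.pi, math.pi) for _ in range(4)]
# Measurements as in Eq. (9): true relative angle plus wrapped Gaussian noise.
delta = [wrap(truth[j] - truth[i] + random.gauss(0.0, s))
         for (i, j), s in zip(edges, sigma)]

shift = 1.234                                      # rotate every node by the same amount
rotated = [wrap(t + shift) for t in truth]
print(cost(truth, edges, delta, sigma), cost(rotated, edges, delta, sigma))
```

The two printed costs agree up to rounding, which is the numerical counterpart of the symmetry $f_s$; fixing the first node removes this one-parameter family of equivalent solutions.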

Problem 2. (Intrinsic formulation of maximum likelihood orientation estimation) Given a set of relative observations $\{d_{ij}\} \in SO(2)^m$, for $(i, j) \in E$, and the variances $\sigma_{ij}^2 > 0$, find the set of minimizers $S_2 = \{r_i^\star\} \subset SO(2)^n$ that satisfies

$$S_2 = \arg\min_{\{r_i\} \in SO(2)^n} \sum_{(i,j) \in E} -\log W_{\sigma_{ij}^2}\!\left( d_{ij}^{-1} r_i^{-1} r_j \right), \tag{11}$$

having fixed $r_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.

The function has a minimum, because it is defined on thecompact set SO(2)n and is bounded from below. The minimumis unique in a noiseless setup and for a connected graph.

Proposition 4. (Observability of orientations) In the noiselesscase, |S2|= 1 if and only if the graph is connected.

The proof is straightforward and proceeds by looking at themeasurements along a spanning tree, which uniquely identifythe optimal solution [42]. Henceforth the graph G is assumedto be connected. It has not been known whether the solutionis unique in the noisy case; we will show that this is true withprobability 1 (Proposition 13).

B. Choosing coordinates

As a first step, we make a choice of coordinates for the nodes’ orientations and measurements. We include this passage in the Problem Statement section since it leads to the problem formulation that is commonly adopted in the literature.

We use the coordinates $\theta^\circ \in (-\pi,+\pi]^n$ for the true unknown orientations, the coordinates $\theta \in (-\pi,+\pi]^n$ for the optimization variables, and the coordinates $\delta \in (-\pi,+\pi]^m$ for the relative measurements. Formally, these are defined as the principal logarithm of quantities that live on $SO(2)$:

$$\theta_i^\circ \doteq \mathrm{Log}_0(r_i^\circ), \qquad \theta_i \doteq \mathrm{Log}_0(r_i), \qquad \delta_{ij} \doteq \mathrm{Log}_0(d_{ij}). \tag{12}$$

The vector $\theta^\circ$ is the “true” and unknown quantity that we want to estimate. We now want to write the objective function (11) as a least-squares cost, depending on the coordinates $\theta$ and $\delta$. It is only possible to write the likelihood as a quadratic function if $\sigma_{ij}$ is small enough; as it grows, the likelihood of the measurement tends to a constant that cannot be represented as a quadratic function.

Assumption 5. The uncertainty of a single measurement does not “spill over” the $\pm\pi$ boundaries: $3\sigma_{ij} \ll \pi$.

This assumption is typically respected in robotics because the measurements are more precise than the given threshold.

Moreover, with a preliminary transformation, we can assume without loss of generality that Assumption 5 is satisfied: because the convolution of two wrapped Gaussians is still a wrapped Gaussian (Lemma 1), we just need to replace one edge with a large variance $\sigma_{ij}^2$ with two edges with smaller variances whose sum is $\sigma_{ij}^2$.

With this assumption we can write the likelihood as a quadratic function (but, still, with the modulus operation).

Lemma 6 (Quadratic approximation in coordinates). If Assumption 5 is satisfied, then

$$-\log W_{\sigma_{ij}^2}\!\left(d_{ij}^{-1} r_i^{-1} r_j\right) \simeq \frac{1}{2\sigma_{ij}^2} \left| \langle \theta_j - \theta_i - \delta_{ij} \rangle_{2\pi} \right|^2 + c, \tag{13}$$

with the constant $c$ equal to $\log(\sigma_{ij}\sqrt{2\pi})$.

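Proof: From the definition of $d_{ij}$ in (9) it follows that

\begin{align*}
-\log W_{\sigma_{ij}^2}\!\left(d_{ij}^{-1} r_i^{-1} r_j\right)
&= -\log\left( \frac{1}{\sigma_{ij}\sqrt{2\pi}} \sum_{k=-\infty}^{+\infty} \exp\left( -\frac{\left(\mathrm{Log}_0(d_{ij}^{-1} r_i^{-1} r_j) + 2\pi k\right)^2}{2\sigma_{ij}^2} \right) \right) \\
&\simeq -\log\left( \frac{1}{\sigma_{ij}\sqrt{2\pi}} \exp\left( -\frac{\mathrm{Log}_0^2(d_{ij}^{-1} r_i^{-1} r_j)}{2\sigma_{ij}^2} \right) \right) && \text{(The other summands are negligible for } 3\sigma_{ij} \ll \pi\text{.)} \\
&\simeq \frac{1}{2\sigma_{ij}^2} \left| \mathrm{Log}_0(d_{ij}^{-1} r_i^{-1} r_j) \right|^2 + c.
\end{align*}

The rest of the proof consists of algebraic manipulation based on properties that we have already introduced:

\begin{align*}
\left| \mathrm{Log}_0(d_{ij}^{-1} r_i^{-1} r_j) \right|^2
&= \left| \langle \mathrm{Log}(d_{ij}^{-1} r_i^{-1} r_j) \rangle_{2\pi} \right|^2 && \text{(Using the property of } \mathrm{Log}_0 \text{ in (7).)} \\
&= \left| \langle \mathrm{Log}(d_{ij}^{-1}) + \mathrm{Log}(r_i^{-1}) + \mathrm{Log}(r_j) \rangle_{2\pi} \right|^2 && \text{(Using the property of } SO(2) \text{ in (6).)} \\
&= \left| \langle -\mathrm{Log}(d_{ij}) - \mathrm{Log}(r_i) + \mathrm{Log}(r_j) \rangle_{2\pi} \right|^2 && \text{(Using the property of } SO(2) \text{ in (5).)} \\
&= \left| \langle -\langle \mathrm{Log}(d_{ij}) \rangle_{2\pi} - \langle \mathrm{Log}(r_i) \rangle_{2\pi} + \langle \mathrm{Log}(r_j) \rangle_{2\pi} \rangle_{2\pi} \right|^2 && \text{(Using the property (4).)} \\
&= \left| \langle -\mathrm{Log}_0(d_{ij}) - \mathrm{Log}_0(r_i) + \mathrm{Log}_0(r_j) \rangle_{2\pi} \right|^2 = \left| \langle \theta_j - \theta_i - \delta_{ij} \rangle_{2\pi} \right|^2. && \text{(Using the property (7) and the definition (12).)}
\end{align*}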
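The approximation in Lemma 6 can be checked numerically. The sketch below compares a truncated wrapped-Gaussian negative log-density with the quadratic expression (13); the function names, the truncation at 20 terms on each side, and the choice $\sigma = 0.2$ are assumptions made only for illustration.

```python
import math

def wrap(a):
    """Map an angle to the interval (-pi, +pi]."""
    return a + 2.0 * math.pi * math.floor((math.pi - a) / (2.0 * math.pi))

def neg_log_wrapped_gaussian(e, sigma, terms=20):
    """-log of a wrapped Gaussian density, truncating the sum to |k| <= terms."""
    total = sum(math.exp(-(e + 2.0 * math.pi * k) ** 2 / (2.0 * sigma ** 2))
                for k in range(-terms, terms + 1))
    return -math.log(total / (sigma * math.sqrt(2.0 * math.pi)))

def quadratic_approx(e, sigma):
    """Right-hand side of (13): wrapped squared residual plus the constant c."""
    c = math.log(sigma * math.sqrt(2.0 * math.pi))
    return wrap(e) ** 2 / (2.0 * sigma ** 2) + c

sigma = 0.2  # 3 * sigma = 0.6, well below pi
residuals = [0.0, 0.5, -1.0, 2.0 * math.pi + 0.5]
gaps = [abs(neg_log_wrapped_gaussian(e, sigma) - quadratic_approx(e, sigma))
        for e in residuals]
print(max(gaps))
```

For residuals close to $\pm\pi$ two summands become comparable and the gap grows toward $\log 2$, which is one way to see why the approximation is stated under Assumption 5.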

Using the lemma we state Problem 2 as quadratic minimization.


Problem 3 (Angular coordinates formulation of maximum likelihood orientation estimation). Given the observations $\delta_{ij} \in (-\pi,+\pi]$, for $(i,j) \in \mathcal{E}$, and the variances $\sigma_{ij}^2 > 0$, find the set of minimizers $S_3 \subset (-\pi,+\pi]^n$ that satisfies

$$S_3 = \arg\min_{\theta \in (-\pi,+\pi]^n} \sum_{(i,j) \in \mathcal{E}} \frac{1}{\sigma_{ij}^2} \left| \langle \theta_j - \theta_i - \delta_{ij} \rangle_{2\pi} \right|^2, \tag{14}$$

having fixed the first node’s orientation to $\theta_0 = 0$. For clarity, the constant term in (13) is omitted in (14).

1) From Problem 3 to Problem 2: The conversion between the two solution sets is just a change of coordinates:

$$S_2 = \mathrm{Exp}(S_3), \qquad S_3 = \mathrm{Log}_0(S_2).$$

2) Symmetries of Problem 3: If Assumption 5 is satisfied, Problem 3 is just a restatement in different coordinates of Problem 2, so they have the same symmetries. In the noiseless case, the solution is unique (Proposition 4). We still do not know what happens in the noisy case, but we can anticipate (Proposition 13) that, for general data, the solution is unique.

Besides the modulus operation, Problem 3 might appear to be similar to a linear estimation problem, but this is not the case: due to the nonlinearity, the cost function is non-convex and has several local minima (see Figure 2 in [42]).
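The non-convexity can be seen already on a single edge. The sketch below evaluates the objective (14) for a hypothetical one-edge graph and compares the cost at a midpoint with the average cost at the two endpoints; the function name `cost` and all numeric values are illustrative assumptions.

```python
import math

def wrap(a):
    """Map an angle to the interval (-pi, +pi]."""
    return a + 2.0 * math.pi * math.floor((math.pi - a) / (2.0 * math.pi))

def cost(theta, edges, delta, sigma2):
    """Objective (14); theta[0] is the fixed reference node."""
    return sum(wrap(theta[j] - theta[i] - d) ** 2 / s2
               for (i, j), d, s2 in zip(edges, delta, sigma2))

# One edge (0, 1) with measurement 3.0 rad and unit variance (made-up values)
edges, delta, sigma2 = [(0, 1)], [3.0], [1.0]

def f(t):
    return cost([0.0, t], edges, delta, sigma2)

a, b = -3.0, 3.0
midpoint_value = f(0.5 * (a + b))    # cost at theta_1 = 0
chord_value = 0.5 * (f(a) + f(b))    # average of the endpoint costs
print(midpoint_value, chord_value)
```

A convex function would satisfy `midpoint_value <= chord_value`; here the wrap-around makes both endpoints nearly optimal while the midpoint is far from the measurement.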

IV. ESTIMATION ON SO(2): FROM ANGLES TO INTEGERS

This section shows that the nonlinear, non-convex, constrained Problem 3 is equivalent to an unconstrained quadratic integer optimization problem. We will convert Problem 3, which is defined on $(-\pi,+\pi]^n$, where $n$ is the number of observable nodes, through a series of intermediate formulations, until we arrive at Problem 7, which is defined on the integers $\mathbb{Z}^\ell$, where $\ell$ is the number of cycles in the graph. A schematic representation of the relations among the problems presented in this paper is shown in Table II.

The final result of this section, Theorem 15, says that the solution of Problem 3 is almost surely unique, and that we can obtain such a solution by solving Problem 7.

There are several intermediate reformulations. The set $S_i$ is the set of minimizers of the $i$-th problem. At each step we keep track of the size of this set by characterizing its symmetry group. This is important because some of the intermediate formulations, namely Problem 4 to Problem 6, have multiple solutions. These are not ambiguities of the original problem, but rather artifacts of our choice of using a redundant representation.

Section IV-A formulates Problem 4, whose optimization variable $x$ is defined on $\mathbb{R}^n$, hence not constrained to $(-\pi,+\pi]^n$ as in Problem 3. Here, we work on a larger domain, and hence we introduce an additional ambiguity, which is that each entry of $x$ is determined only modulo $2\pi$.

Section IV-B formulates Problem 5, which is defined on $(x,k) \in \mathbb{R}^n \times \mathbb{Z}^m$. The new variable $k \in \mathbb{Z}^m$ can be interpreted as a vector of regularization terms that we introduce to keep track of the “angular excess” on each edge. The same section also shows that, given the value of $k$, the value of $x$ can be recovered in closed form using linear estimation: in fact, if we knew $k$, the problem would be linear.

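TABLE II
RELATIONS AMONG THE PROBLEMS DEFINED IN THIS PAPER

problem | variables | solution set | symmetry group
Problem 1 | $\{r_i\} \in SO(2)^{n+1}$ | $S_1$ | $SO(2)$
  ↓ Fixing reference frame
Problem 2 | $\{r_i\} \in SO(2)^n$ | $S_2$ | none
  ↓ Choice of coordinates ($\mathrm{Log}_0$ ↓, ↑ $\mathrm{Exp}$)
Problem 3 | $\theta \in (-\pi,+\pi]^n$ | $S_3$ | none
  ↓ Real-valued parametrization (↑ $\varphi^3_4$)
Problem 4 | $x \in \mathbb{R}^n$ | $S_4$ | $\mathbb{Z}^n$
  ↓ Using regularization terms (↑ $\varphi^4_5$)
Problem 5 | $(x,k) \in \mathbb{R}^n \times \mathbb{Z}^m$ | $S_5$ | $\mathbb{Z}^n$
  ↓ Error function separability (↑ $\varphi^5_6$)
Problem 6 | $k \in \mathbb{Z}^m$ | $S_6$ | $\mathbb{Z}^n$
  ↓ Minimal parametrization (↑ $\varphi^6_7$)
Problem 7 | $\gamma \in \mathbb{Z}^\ell$ | $S_7$ | none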

Section IV-C formulates Problem 6, which is defined only on $k \in \mathbb{Z}^m$. The insight is that the integer and the real part of the problem can be solved separately in a two-stage procedure.

Section IV-D formulates Problem 7, which is defined on an integer variable $\gamma \in \mathbb{Z}^\ell$. While $k$ lives on the edges, $\gamma$ lives on the cycles of the graph and it is a minimal parametrization.

Section IV-E puts together the chain of implications, and shows that the solution set $S_7$ can be mapped surjectively to $S_3$, and that we can easily compute $S_3$ once we know $S_7$.

A. Real-valued formulation

The first step is the reformulation of Problem 3 as an unconstrained optimization problem for real variables $x \in \mathbb{R}^n$.

Problem 4 (Real-valued formulation of maximum likelihood orientation estimation). Given the observations $\delta_{ij} \in (-\pi,+\pi]$, $(i,j) \in \mathcal{E}$, and the corresponding variances $\sigma_{ij}^2 > 0$, find the set of minimizers $S_4 \subset \mathbb{R}^n$ that, having fixed $x_0 = 0$, satisfies

$$S_4 = \arg\min_{x \in \mathbb{R}^n} \sum_{(i,j) \in \mathcal{E}} \frac{1}{\sigma_{ij}^2} \left| \langle x_j - x_i - \delta_{ij} \rangle_{2\pi} \right|^2. \tag{15}$$

1) Symmetries of Problem 4: Note that this problem has more solutions than Problem 3. This is an artifact of the real-valued parametrization. In particular, if $x \in \mathbb{R}^n$ is a solution, then $x - 2\pi p$ is also a solution, for any integer vector $p \in \mathbb{Z}^n$.

2) From Problem 4 to Problem 3: While Problem 4 has multiple solutions due to the symmetry, they are all equivalent when projected down to the manifold using the map

$$\varphi^3_4 : \mathbb{R}^n \to (-\pi,+\pi]^n, \qquad x \mapsto \langle x \rangle_{2\pi}.$$

Proposition 7. $\varphi^3_4(S_4) = S_3$.


Proof: Using property (4) and noticing that Problem 3 is a constrained version of Problem 4, we can easily demonstrate that Problem 3 and Problem 4 attain the same optimal objective. We use the notation $J(x)$ to denote the value of the objective function of the two problems for a given vector $x$. Now, we have to show that (i) for any $x^\star \in S_4$, the variable $\langle x^\star \rangle_{2\pi}$ is in $S_3$, and (ii) for any solution $\theta^\star \in S_3$ there exists at least one $x^\star \in S_4$ such that $\langle x^\star \rangle_{2\pi} = \theta^\star$. The implication (i) is a direct consequence of the fact that the problems attain the same optimal objective: for any $x^\star \in S_4$, property (4) assures that $J(x^\star) = J(\langle x^\star \rangle_{2\pi})$, hence $\langle x^\star \rangle_{2\pi}$ attains the optimal objective and satisfies the constraints of Problem 3; therefore $\langle x^\star \rangle_{2\pi} \in S_3$. Regarding the implication (ii), we note that, for any solution $\theta^\star \in S_3$, we can simply pick $x^\star = \theta^\star$, which implies $\langle x^\star \rangle_{2\pi} = \langle \theta^\star \rangle_{2\pi} = \theta^\star$ (the modulus operation produces no effect since $\theta^\star \in (-\pi,+\pi]^n$). An extended version of this proof can be found in [42].

Using this result, we can solve the unconstrained Problem 4, obtaining a solution $x^\star \in \mathbb{R}^n$, and then compute an optimal solution of Problem 3 as $\theta^\star \doteq \langle x^\star \rangle_{2\pi} \in (-\pi,+\pi]^n$.
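The $2\pi$ symmetry of Problem 4 and the projection $\varphi^3_4$ can be illustrated with a few lines of code. This is a sketch on a made-up triangle graph; `wrap`, `cost`, and the numeric values are assumptions for illustration, and index 0 is the fixed reference node.

```python
import math

def wrap(a):
    """Map an angle to the interval (-pi, +pi]."""
    return a + 2.0 * math.pi * math.floor((math.pi - a) / (2.0 * math.pi))

def cost(x, edges, delta, sigma2):
    """Objective (15); x[0] is the fixed reference node."""
    return sum(wrap(x[j] - x[i] - d) ** 2 / s2
               for (i, j), d, s2 in zip(edges, delta, sigma2))

edges = [(0, 1), (1, 2), (2, 0)]
delta = [0.4, 0.5, -1.0]       # hypothetical measurements
sigma2 = [0.01, 0.02, 0.015]   # hypothetical variances

x = [0.0, 0.45, 0.95]          # some real-valued candidate
p = [0, 2, -1]                 # integer shift (reference node left untouched)
shifted = [xi - 2.0 * math.pi * pi for xi, pi in zip(x, p)]

# Projecting the shifted vector back to (-pi, +pi] recovers the same angles
theta = [wrap(xi) for xi in shifted]
print(cost(x, edges, delta, sigma2), cost(shifted, edges, delta, sigma2), theta)
```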

B. Mixed-integer formulation

Problem 4 is an optimization problem in real variables, but its residual errors are still nonlinear and difficult to minimize. We now get to the core idea of this paper: instead of solving a non-convex problem in real variables, we choose to solve a convex (quadratic) problem in mixed (integer and real) variables. The “trick” is that one can get rid of the modulus in the expression of the residuals: by using the property (3), the terms in the error function (15) can be written as

$$\left| \langle x_j - x_i - \delta_{ij} \rangle_{2\pi} \right|^2 = \min_{k_{ij} \in \mathbb{Z}} \left| x_j - x_i - \delta_{ij} + 2\pi k_{ij} \right|^2.$$
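This identity is easy to test numerically. The sketch below compares the wrapped squared residual with a brute-force minimum over a finite window of integers; the window size `span` and the test values are assumptions, chosen so that the minimizing integer is always inside the window.

```python
import math

def wrap(a):
    """Map an angle to the interval (-pi, +pi]."""
    return a + 2.0 * math.pi * math.floor((math.pi - a) / (2.0 * math.pi))

def min_over_k(a, span=5):
    """Brute-force min over k in [-span, span] of (a + 2*pi*k)^2."""
    return min((a + 2.0 * math.pi * k) ** 2 for k in range(-span, span + 1))

samples = [-7.0, -3.5, -0.2, 0.0, 2.9, 4.0, 9.3]
gaps = [abs(wrap(a) ** 2 - min_over_k(a)) for a in samples]
print(max(gaps))
```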

We reformulate the problem in vector notation by making use of the reduced incidence matrix $A$ of the graph (Section II-A). Suppose that there is an ordering of the edges from 1 to $m$, so that the measurements can be written as an $m$-dimensional vector $\delta \in (-\pi,+\pi]^m$. Define accordingly the regularization vector $k = (k_1 \cdots k_m)^T \in \mathbb{Z}^m$, and the measurement covariance $P_\delta = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_m^2) \in \mathbb{R}^{m \times m}$.

Problem 5 (Mixed-integer formulation of maximum likelihood orientation estimation). Given the vector $\delta \in (-\pi,+\pi]^m$ and the diagonal positive definite matrix $P_\delta \in \mathbb{R}^{m \times m}$, find the set of minimizers $S_5 \subset \mathbb{R}^n \times \mathbb{Z}^m$ that satisfies

$$S_5 = \arg\min_{(x,k) \in \mathbb{R}^n \times \mathbb{Z}^m} \left\| A^T x - \delta + 2\pi k \right\|^2_{P_\delta^{-1}}. \tag{16}$$

This is a mixed-integer convex program [49], as the objective is quadratic in both continuous and discrete variables.

1) From Problem 5 to Problem 4: Because $k$ is a slack variable, we obtain $S_4$ from $S_5$ with the projection map

$$\varphi^4_5 : \mathbb{R}^n \times \mathbb{Z}^m \to \mathbb{R}^n, \qquad (x,k) \mapsto x.$$

Proposition 8. $\varphi^4_5(S_5) = S_4$.

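Proof: We show how to transform Problem 4 into Problem 5. Using property (3) we can rewrite (15) as

$$\min_{x \in \mathbb{R}^n} \sum_{(i,j) \in \mathcal{E}} \frac{1}{\sigma_{ij}^2} \min_{k_{ij} \in \mathbb{Z}} \left| x_j - x_i - \delta_{ij} + 2\pi k_{ij} \right|^2,$$

which corresponds to

$$\min_{x \in \mathbb{R}^n,\; k_{ij} \in \mathbb{Z},\; (i,j) \in \mathcal{E}} \sum_{(i,j) \in \mathcal{E}} \frac{1}{\sigma_{ij}^2} \left| x_j - x_i - \delta_{ij} + 2\pi k_{ij} \right|^2. \tag{17}$$

Therefore, the $k_{ij}$ are only slack variables that implicitly represent the modulus operator, and finding the optimal $x$ for (17) is the same as finding the optimal $x$ for Problem 4.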

2) Symmetries of Problem 5: Note that we are now working on a larger space $\mathbb{R}^n \times \mathbb{Z}^m$: we are overparametrizing the problem in order to make the corresponding cost function quadratic. Therefore, we might have enlarged the number of solutions. In fact, we introduced the following symmetry. For any vector $p \in \mathbb{R}^n$ such that $A^T p$ is an integer vector, the following transformation leaves the error function invariant:

$$(x,k) \mapsto (x - 2\pi p,\; k + A^T p). \tag{18}$$

This is the only symmetry because $A^T$ has full column rank.

3) Solving for $x$ given $k$: Before further manipulations, we note that, if $k$ is known, the problem reduces to unconstrained minimization in $x \in \mathbb{R}^n$ of the quadratic function $\left\| A^T x - \delta + 2\pi k \right\|^2_{P_\delta^{-1}}$. Call $x^{\star|k}$ the optimal $x$ for a fixed $k$:

$$x^{\star|k} = (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} (\delta - 2\pi k). \tag{19}$$
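Equation (19) is a weighted least-squares solution and can be sketched with NumPy. The reduced incidence matrix below encodes a hypothetical triangle graph with edges (0,1), (1,2), (2,0) and node 0 removed; the function name `x_given_k` and all numbers are illustrative assumptions.

```python
import numpy as np

# Reduced incidence matrix (n = 2 rows, m = 3 columns): A.T @ x stacks x_j - x_i
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
P_delta = np.diag([0.01, 0.02, 0.015])   # hypothetical measurement variances

def x_given_k(A, P_delta, delta, k):
    """Closed-form minimizer (19) of ||A^T x - delta + 2*pi*k||^2 for fixed k."""
    W = np.linalg.inv(P_delta)
    return np.linalg.solve(A @ W @ A.T, A @ W @ (delta - 2.0 * np.pi * k))

# Noise-free check: with delta = A^T theta and k = 0 the estimate equals theta
theta = np.array([0.5, 1.0])
x_noise_free = x_given_k(A, P_delta, A.T @ theta, np.zeros(3))

# For arbitrary data the gradient of the quadratic cost vanishes at x_given_k
delta = np.array([0.4, 0.5, -1.0])
k = np.array([1.0, 0.0, -2.0])
x_opt = x_given_k(A, P_delta, delta, k)
gradient = A @ np.linalg.inv(P_delta) @ (A.T @ x_opt - delta + 2.0 * np.pi * k)
print(x_noise_free, gradient)
```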

C. Separating the integer-valued and the real-valued problems

This section shows that the cost function (16) is separable into two terms, enabling a two-stage optimization in which the cost is first optimized with respect to $k$ and then the optimal choice of $x$ is computed in closed form.

The separability result in the following lemma involves a cycle basis matrix $C$ of the graph $G$. The result is valid for any cycle basis; in Section VI-C we discuss the algorithmic implications of the choice of a particular cycle basis matrix.

Lemma 9. For any given cycle basis matrix $C$, minimizing the cost (16) is the same as minimizing

$$\left\| x - x^{\star|k} \right\|^2_{(A P_\delta^{-1} A^T)} + \left\| 2\pi C k - C \delta \right\|^2_{(C P_\delta C^T)^{-1}}, \tag{20}$$

where $x^{\star|k}$ is a function of $k$ and is given in (19).

Proof: The proof consists of straightforward algebraic manipulations. For compactness, we name the matrices

$$X \doteq A^T (A P_\delta^{-1} A^T)^{-1} A, \qquad Y \doteq C^T (C P_\delta C^T)^{-1} C.$$

Expanding the right-hand side of (16), and neglecting the terms that do not depend on the optimization variables, we obtain

$$f_1(x,k) \doteq \|x\|^2_{A P_\delta^{-1} A^T} - 2\delta^T P_\delta^{-1} A^T x + 4\pi k^T P_\delta^{-1} A^T x + 4\pi^2 \|k\|^2_{P_\delta^{-1}} - 4\pi \delta^T P_\delta^{-1} k. \tag{21}$$


To show that minimizing $f_1(x,k)$ in (21) is the same as minimizing (20), we rewrite the latter as

\begin{align*}
&\|x\|^2_{A P_\delta^{-1} A^T} - 2\delta^T P_\delta^{-1} A^T x + 4\pi k^T P_\delta^{-1} A^T x \\
&\quad + \|\delta\|^2_{P_\delta^{-1} X P_\delta^{-1}} + 4\pi^2 \|k\|^2_{P_\delta^{-1} X P_\delta^{-1}} - 4\pi \delta^T P_\delta^{-1} X P_\delta^{-1} k \\
&\quad + 4\pi^2 \|k\|^2_Y - 4\pi \delta^T Y k + \|\delta\|^2_Y. \tag{22}
\end{align*}

Because $\|\delta\|^2_{P_\delta^{-1} X P_\delta^{-1}}$ and $\|\delta\|^2_Y$ are constant and do not depend on $x$ and $k$, minimizing (22) is the same as minimizing

\begin{align*}
f_2(x,k) \doteq{}& \|x\|^2_{A P_\delta^{-1} A^T} - 2\delta^T P_\delta^{-1} A^T x + 4\pi k^T P_\delta^{-1} A^T x \\
& + 4\pi^2 \|k\|^2_{P_\delta^{-1} X P_\delta^{-1}} - 4\pi \delta^T P_\delta^{-1} X P_\delta^{-1} k + 4\pi^2 \|k\|^2_Y - 4\pi \delta^T Y k. \tag{23}
\end{align*}

Comparing (21) and (23), one concludes that the first three terms in $f_1(x,k)$ and $f_2(x,k)$ coincide, and it only remains to show equality for the last terms. Rewrite (23) as

\begin{align*}
&\|x\|^2_{A P_\delta^{-1} A^T} - 2\delta^T P_\delta^{-1} A^T x + 4\pi k^T P_\delta^{-1} A^T x \\
&\quad + 4\pi^2 k^T (P_\delta^{-1} X P_\delta^{-1} + Y) k - 4\pi \delta^T (P_\delta^{-1} X P_\delta^{-1} + Y) k. \tag{24}
\end{align*}

Since $P_\delta$ is positive definite, the technical result of Lemma 22 (in the appendix) implies that

$$P_\delta^{-1} = P_\delta^{-1} A^T (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} + C^T (C P_\delta C^T)^{-1} C.$$

Hence $P_\delta^{-1} = P_\delta^{-1} X P_\delta^{-1} + Y$, and (24) becomes

$$\|x\|^2_{A P_\delta^{-1} A^T} - 2\delta^T P_\delta^{-1} A^T x + 4\pi k^T P_\delta^{-1} A^T x + 4\pi^2 k^T P_\delta^{-1} k - 4\pi \delta^T P_\delta^{-1} k,$$

which can be easily seen to coincide with (21). Since the objective functions $f_1(x,k)$ and $f_2(x,k)$ coincide, problems (16) and (20) have the same solutions.

A consequence of writing the error function as in (20) is that a separability principle holds: we can obtain the maximum likelihood solution using a two-stage approach in which we first estimate $k$, and then estimate $x$ given $k$. This aspect is formalized later, in Proposition 10. Intuitively, the cost function (20) comprises two terms: the first can be made equal to zero by choosing $x = x^{\star|k}$, and the second does not depend on $x$ and can be minimized by working on $k$. Since we already have a closed-form expression for $x$ given $k$, the only problem that we have to solve is finding $k$.
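The decomposition can be verified numerically on a small instance. The sketch below checks the matrix identity used in the proof and compares the costs (16) and (20) at a random point; the triangle graph, its cycle basis matrix `C`, the variances, and the random seed are assumptions for illustration.

```python
import numpy as np

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])      # reduced incidence matrix of a triangle
C = np.array([[1.0, 1.0, 1.0]])       # its single cycle; C @ A.T == 0
P = np.diag([0.01, 0.02, 0.015])      # hypothetical covariance P_delta
W = np.linalg.inv(P)
H = A @ W @ A.T
S = np.linalg.inv(C @ P @ C.T)

# Identity from the proof: P^-1 = P^-1 X P^-1 + Y
X = A.T @ np.linalg.inv(H) @ A
Y = C.T @ S @ C
identity_holds = np.allclose(W, W @ X @ W + Y)

# Compare (16) and (20) at a random (x, k)
rng = np.random.default_rng(0)
x = rng.normal(size=2)
k = rng.integers(-2, 3, size=3).astype(float)
delta = rng.uniform(-np.pi, np.pi, size=3)

r = A.T @ x - delta + 2.0 * np.pi * k
cost_16 = r @ W @ r
x_star = np.linalg.solve(H, A @ W @ (delta - 2.0 * np.pi * k))
d = x - x_star
c = 2.0 * np.pi * C @ k - C @ delta
cost_20 = d @ H @ d + c @ S @ c
print(identity_holds, cost_16, cost_20)
```

In this instance the two costs agree exactly, because the constant terms in $\delta$ that the proof sets aside appear on both sides.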

Problem 6 (Integer formulation of maximum likelihood orientation estimation in edge space). Given the vector $\delta \in (-\pi,+\pi]^m$, the diagonal positive definite matrix $P_\delta \in \mathbb{R}^{m \times m}$, and a cycle basis matrix $C \in \mathbb{Z}^{\ell \times m}$, find the set of minimizers $S_6 \subset \mathbb{Z}^m$ that satisfies

$$S_6 = \arg\min_{k \in \mathbb{Z}^m} \left\| C k - \tfrac{1}{2\pi} C \delta \right\|^2_{(C P_\delta C^T)^{-1}}. \tag{25}$$

Notice that (25) is the same as the second summand in (20); we only rearranged the $2\pi$ term.

1) From Problem 6 to Problem 5: Proposition 10 assures that solving Problem 6 is the same as solving Problem 5, if we convert the solutions using the map

$$\varphi^5_6 : \mathbb{Z}^m \to \mathbb{R}^n \times \mathbb{Z}^m, \qquad k \mapsto (x^{\star|k}, k).$$

Proposition 10. $\varphi^5_6(S_6) = S_5$.

Proof: Since $A P_\delta^{-1} A^T$ is positive definite, the first term in (20) is non-negative. This implies that, for any $k$, the minimum is attained for $x = x^{\star|k}$ (which annihilates the first summand in the objective function). Moreover, the second summand in (20) does not depend on $x$.

Summarizing the chain of implications presented so far, we conclude that for any solution $k^\star$ of Problem 6 we can obtain a solution $(x^\star, k^\star)$ of Problem 5. Moreover, from $x^\star$ we can easily obtain the solution of our original problem (Problem 3) by applying the modulus operation to $x^\star$.

2) Symmetries of Problem 6: Because (16) and (20) are completely equivalent, they have the same symmetries. However, it is interesting to find the symmetries of Problem 6 directly. Notice that the reorganization of the terms made the term $Ck$ explicit. Because $C P_\delta C^T$ is positive definite, the only symmetries are described by the kernel of $C$. For any integer vector $q \in \ker C$, this transformation does not change the value of the objective function:

$$k \mapsto k + q, \qquad \text{for } q \in \ker C \cap \mathbb{Z}^m.$$

Recall that $C$ is a full-row-rank $\ell \times m$ matrix, where $\ell$ is the dimension of the cycle space. Its kernel $\ker C$ thus has dimension $m - \ell$, which is equal to $n$. Because $A^T$ is an orthogonal complement of $C$, it provides a basis for $\ker C$. Therefore, any $q \in \ker C \cap \mathbb{Z}^m$ can be written as $q = A^T p$, for some $p \in \mathbb{Z}^n$. Therefore, the symmetry is

$$k \mapsto k + A^T p, \qquad \text{for } p \in \mathbb{Z}^n, \tag{26}$$

which confirms the symmetry in (18).

D. From k towards a minimal parameterization γ

The cycle basis matrix $C$ is a “fat” $\ell \times m$ matrix, because the number of cycles $\ell$ is much less than the number of edges $m$. Therefore, there are an infinite number of $k^\star$ such that the product $C k^\star$ attains the minimum of (25). (This family of solutions is described by symmetry (26).) Consequently, we have an infinite number of optimal orientation estimates $x^{\star|k^\star}$. Fortunately, the next proposition assures that the infinite cardinality of solutions is an artifact created when passing from $SO(2)$ to the reals. In particular, all vectors $k$ having the same product $Ck$ lead to orientation estimates $x^{\star|k}$ that differ by integer multiples of $2\pi$, and they give equivalent solutions.

Proposition 11. Two solutions $k_1, k_2$ satisfying $C k_1 = C k_2$ are equivalent up to $2\pi$ multiples: $\langle x^{\star|k_1} \rangle_{2\pi} = \langle x^{\star|k_2} \rangle_{2\pi}$.

Proof: Recall that $x^{\star|k_1} = (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} (\delta - 2\pi k_1)$ and similarly $x^{\star|k_2} = (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} (\delta - 2\pi k_2)$. Define $\delta^{\star|k_1} = A^T x^{\star|k_1}$ and $\delta^{\star|k_2} = A^T x^{\star|k_2}$. If the orientation of the first node is set to zero, $\delta^{\star|k_1}$ and $\delta^{\star|k_2}$ uniquely identify $x^{\star|k_1}$ and $x^{\star|k_2}$, since $x^{\star|k_i}$ can be rewritten as an integer-valued linear combination of $\delta^{\star|k_i}$, $i = 1, 2$. In general, we have $x^{\star|k_i} = D \delta^{\star|k_i}$, $i = 1, 2$, where $D$ is a left integer pseudoinverse of $A^T$. Therefore, determining $\delta^{\star|k_i}$ is the same as determining $x^{\star|k_i}$. The difference $\delta^{\star|k_2} - \delta^{\star|k_1}$ can be written as $2\pi A^T (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} (k_1 - k_2)$, and by Lemma 22 this is equal to $2\pi (k_1 - k_2) - 2\pi P_\delta C^T (C P_\delta C^T)^{-1} (C k_1 - C k_2)$. The second term is zero because $C k_1 = C k_2$, giving $\delta^{\star|k_2} - \delta^{\star|k_1} = 2\pi (k_1 - k_2)$. Therefore, elements of $\delta^{\star|k_1}$ and $\delta^{\star|k_2}$ only differ by multiples of $2\pi$; then, $x^{\star|k_2} - x^{\star|k_1} = D(\delta^{\star|k_2} - \delta^{\star|k_1}) = 2\pi D (k_1 - k_2)$, and since $D$ is integer, $x^{\star|k_1}$ and $x^{\star|k_2}$ also differ by multiples of $2\pi$.
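Proposition 11 can be illustrated on a small instance. In the sketch below, two different integer vectors with the same product $Ck$ yield real-valued estimates that differ by integer multiples of $2\pi$; the triangle graph, the variances, and the particular `k1`, `k2` are assumptions for illustration.

```python
import numpy as np

def wrap(a):
    """Map angles to the interval (-pi, +pi], elementwise."""
    return a + 2.0 * np.pi * np.floor((np.pi - a) / (2.0 * np.pi))

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])      # reduced incidence matrix of a triangle
C = np.array([[1.0, 1.0, 1.0]])       # cycle basis matrix
W = np.linalg.inv(np.diag([0.01, 0.02, 0.015]))
delta = np.array([0.4, 0.5, -1.0])    # hypothetical measurements

def x_given_k(k):
    """Closed-form estimate (19) for a fixed integer vector k."""
    return np.linalg.solve(A @ W @ A.T, A @ W @ (delta - 2.0 * np.pi * k))

k1 = np.array([2.0, -1.0, 0.0])
k2 = np.array([0.0, 0.0, 1.0])        # different from k1, but C @ k1 == C @ k2

x1, x2 = x_given_k(k1), x_given_k(k2)
shift = (x2 - x1) / (2.0 * np.pi)     # expected to be an integer vector
same_angles = np.allclose(wrap(x1 - x2), 0.0, atol=1e-9)
print(shift, same_angles)
```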

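Using this result we can write the final formulation of the problem using only the variable $\gamma = Ck \in \mathbb{Z}^\ell$, because all $k$ producing the same $\gamma = Ck$ are equivalent. (The vector $\gamma$ is integer because both $C$ and $k$ are integer.)

Problem 7 (Integer formulation of maximum likelihood orientation estimation in cycle space). Given the vector $\delta \in (-\pi,+\pi]^m$, the diagonal positive definite matrix $P_\delta \in \mathbb{R}^{m \times m}$, and a cycle basis matrix $C \in \mathbb{Z}^{\ell \times m}$, find the set of minimizers $S_7 \subset \mathbb{Z}^\ell$ that satisfies

$$S_7 = \arg\min_{\gamma \in \mathbb{Z}^\ell} \left\| \gamma - \hat{\gamma} \right\|^2_{P_\gamma^{-1}} \tag{27}$$

with $\hat{\gamma} = \tfrac{1}{2\pi} C \delta$, and $P_\gamma = \tfrac{1}{4\pi^2} C P_\delta C^T$.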

1) From Problem 7 to Problem 6: Given a $\gamma$, there is a simple way to compute a $k$ satisfying $Ck = \gamma$, assuming that the rows of $C$ are ordered appropriately as in (1).

Lemma 12. Given a vector $\gamma \in \mathbb{Z}^\ell$ and a cycle basis matrix written as $C = (C_T \;\; C_\perp)$, an integer solution to $Ck = \gamma$ can be computed as

$$k = \begin{pmatrix} 0_n \\ C_\perp^{-1} \gamma \end{pmatrix}. \tag{28}$$

Proof: From Liebchen [50, Lemma 3 and Theorem 7] it follows that $C_\perp$ is invertible and $\det(C_\perp) = \pm 1$. Moreover, because $C_\perp$ is an integer matrix with unitary determinant, necessarily $C_\perp^{-1}$ is itself integer (see Schrijver [51], or just recall that the inverse is the adjugate matrix divided by the determinant). Therefore, $C_\perp^{-1} \gamma$ is an integer vector. Finally, we can show that $Ck = (C_T \;\; C_\perp)\,(0_n^T \;\; (C_\perp^{-1} \gamma)^T)^T = C_\perp (C_\perp^{-1} \gamma) = \gamma$.

We use the notation

$$C^\dagger = \begin{pmatrix} 0_n \\ C_\perp^{-1} \end{pmatrix} \tag{29}$$

to remark that $C^\dagger$ is a right (integer) pseudoinverse of $C$.

2) Symmetries of Problem 7: The objective function (27) is convex, so it would be tempting to just say that there is only one minimum. However, we should be careful because the intuitions of convex optimization often fail in integer programming. For example, Figure 3 in [42] shows a case in which a convex objective function has two integer solutions.

What we can say is that this cannot happen for general data.

Proposition 13. $|S_7| = 1$ with probability 1.

Proof: The set $S_7$ is the set of minimizers of (27), which has an objective function of the form $\|\gamma - \mu\|^2_P$, with $P \in \mathbb{R}^{\ell \times \ell}$ a positive definite matrix, and $\mu \in \mathbb{R}^\ell$ a random variable which depends on the measurements, and can be seen to be a Gaussian vector. Consider the set of values $M \subset \mathbb{R}^\ell$ such that if $\mu \in M$, then $\|\gamma - \mu\|^2_P$ has multiple minimizers for $\gamma$. Then it is easy to show that the set $M$ has measure 0 in $\mathbb{R}^\ell$. (For example, we can see how, for an infinitesimal perturbation of the mean of the parabola in Figure 3 in [42], the solution set goes from $\{0, 1\}$ to either $\{0\}$ or $\{1\}$.) To demonstrate this claim, consider a generic $\mu \in M$. Since $\mu \in M$, there exist $v \geq 2$ discrete values $\{\gamma_1, \ldots, \gamma_v\} \in \mathbb{Z}^\ell$ such that

$$\left\| \gamma_1 - \mu \right\|^2_P = \cdots = \left\| \gamma_v - \mu \right\|^2_P. \tag{30}$$

If we fix $\{\gamma_1, \ldots, \gamma_v\}$, and take $\mu$ as the independent variable, the constraints in (30) define an algebraic variety of dimension at most $\ell - 1$, which has measure 0 in $\mathbb{R}^\ell$.
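A one-dimensional instance makes the measure-zero argument concrete. The sketch below lists the integer minimizers of the parabola $(\gamma - \mu)^2$ for a mean exactly halfway between two integers and for slightly perturbed means; the function name, search window, and tolerance are assumptions for illustration.

```python
def integer_minimizers(mu, search=range(-5, 6), tol=1e-12):
    """Integers in `search` whose cost (g - mu)^2 is within tol of the minimum."""
    costs = {g: (g - mu) ** 2 for g in search}
    best = min(costs.values())
    return [g for g in search if costs[g] - best <= tol]

tie = integer_minimizers(0.5)                  # mean exactly between 0 and 1
pushed_up = integer_minimizers(0.5 + 1e-6)     # tiny perturbation upward
pushed_down = integer_minimizers(0.5 - 1e-6)   # tiny perturbation downward
print(tie, pushed_up, pushed_down)
```

Only the single value $\mu = 0.5$ in the unit interval produces a tie, which mirrors the statement that ties occur on a set of measure zero.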

E. Inception

Let us link Problem 7 with the other problems presented so far. From a solution $\gamma$ of Problem 7 we can find a solution $k$ of Problem 6 using the formula (28). We call this map $\varphi^6_7$:

$$\varphi^6_7 : \mathbb{Z}^\ell \to \mathbb{Z}^m, \qquad \gamma \mapsto C^\dagger \gamma.$$
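The map $\varphi^6_7$ can be sketched with integer arithmetic. The example assumes a hypothetical graph with four nodes, three spanning-tree edges listed first, and two chords listed last, so that the last $\ell$ columns of `C` form $C_\perp$; the matrix entries and the vector `gamma` are made up for illustration.

```python
import numpy as np

# Cycle basis matrix for edges ordered as: (0,1), (1,2), (2,3) | (2,0), (3,1)
C = np.array([[1, 1, 0, 1, 0],
              [-1, 0, 1, -1, 1]])
ell, m = C.shape
n = m - ell

C_perp = C[:, n:]                                   # chord columns
det = np.linalg.det(C_perp)                         # expected to be +1 or -1
C_perp_inv = np.rint(np.linalg.inv(C_perp)).astype(int)

# Right integer pseudoinverse (29): zeros on tree edges, C_perp^-1 on chords
C_dagger = np.vstack([np.zeros((n, ell), dtype=int), C_perp_inv])

gamma = np.array([3, -2])
k = C_dagger @ gamma                                # formula (28)
print(k, C @ k)
```

The map returns a single representative $k$ for each $\gamma$, with zeros on all spanning-tree edges.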

However, there are far fewer $\gamma \in \mathbb{Z}^\ell$ than $k \in \mathbb{Z}^m$; therefore, using this map on $S_7$ we will not be able to cover all of $S_6$: $\varphi^6_7(S_7) \subsetneq S_6$. Likewise, mapping further with $\varphi^5_6$ we still lose solutions: $\varphi^5_6 \circ \varphi^6_7(S_7) \subsetneq S_5$, where the symbol $\circ$ denotes the function composition operator. Another step is not enough, as we still have $\varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(S_7) \subsetneq S_4$. We have to go four levels deep. When we finally arrive back at Problem 3, we do obtain all solutions.

Proposition 14 (Inception). $\varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(S_7) = S_3$.

Proof: We first prove the relation $\varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(S_7) \subseteq S_3$. Consider a $\gamma^\star \in S_7$. By construction, $k^\star = \varphi^6_7(\gamma^\star)$ attains a minimum of Problem 6, since $\gamma$ is only a slack variable that substitutes $Ck$. Therefore, $k^\star \in S_6$. Then, applying in sequence Proposition 10, Proposition 8, and Proposition 7, we can easily conclude that $\varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(\gamma^\star) \in S_3$.

Now, we want to prove $\varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(S_7) \supseteq S_3$, which, together with the relation proved in the previous paragraph, demonstrates Proposition 14. Pick a $\theta^\star \in S_3$. As shown in the proof of Proposition 7, $x^\star = \theta^\star$ also belongs to $S_4$, and it is such that (a) $\varphi^3_4(x^\star) = \theta^\star$. We have already seen in Section IV-B that introducing $k$ is a way to implicitly represent the modulus. If we choose $k^\star = \lfloor (\pi 1_m - A^T x^\star + \delta)/2\pi \rfloor$ (compare with equations (2) and (16)), then the pair $(x^\star, k^\star)$ solves Problem 5; moreover, $(x^\star, k^\star)$ is such that $\varphi^4_5(x^\star, k^\star) = x^\star$, which, using (a), implies (b) $\varphi^3_4 \circ \varphi^4_5(x^\star, k^\star) = \theta^\star$. According to Proposition 10, if $(x^\star, k^\star) \in S_5$, then $k^\star \in S_6$, and then, from (b), we get (c) $\varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6(k^\star) = \theta^\star$. Finally, if $k^\star$ is an optimal solution for Problem 6, we can pick $\gamma^\star = C k^\star$ and guarantee that $\gamma^\star$ is an optimal solution for Problem 7 (there is only a change of variables between the two problems), i.e., $\gamma^\star \in S_7$. Concluding, for a given $\theta^\star \in S_3$, we found a $\gamma^\star \in S_7$ such that $\theta^\star = \varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(\gamma^\star)$.

The following theorem, whose proof is a direct consequence of Propositions 13 and 14, summarizes what we have learned.

Theorem 15 (Mixed-integer maximum likelihood estimation on $SO(2)$). The solution of Problem 3 is almost surely unique, and, given the solution $\gamma^\star$ of

$$\gamma^\star = \arg\min_{\gamma \in \mathbb{Z}^\ell} \left\| \gamma - \tfrac{1}{2\pi} C \delta \right\|^2_{P_\gamma^{-1}},$$

the solution of Problem 3 can be computed as

$$\theta^\star = \left\langle (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} \left( \delta - 2\pi C^\dagger \gamma^\star \right) \right\rangle_{2\pi}, \tag{31}$$

with $C^\dagger \doteq \begin{pmatrix} 0_n \\ C_\perp^{-1} \end{pmatrix}$.

Equation (31) is a closed-form expression for the mapping $\varphi^3_4 \circ \varphi^4_5 \circ \varphi^5_6 \circ \varphi^6_7(\gamma^\star)$, hence the result is a more explicit version of Proposition 14.

In contrast with iterative optimization techniques, the proposed computation of $\theta^{\star|\gamma^\star}$ does not suffer from local minima, assuming that we are able to compute $\gamma^\star$.
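The whole chain in Theorem 15 can be sketched end to end on a toy instance. The example below uses a hypothetical triangle graph with a single cycle, so that the integer problem can be solved by brute force over a small window; this enumeration is an illustrative shortcut rather than a solver from the paper, and all numeric values are made up.

```python
import numpy as np

def wrap(a):
    """Map angles to the interval (-pi, +pi], elementwise."""
    return a + 2.0 * np.pi * np.floor((np.pi - a) / (2.0 * np.pi))

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])       # reduced incidence matrix, node 0 fixed
C = np.array([[1.0, 1.0, 1.0]])        # cycle basis matrix (one cycle)
n, ell = A.shape[0], C.shape[0]
P_delta = np.diag([0.01, 0.01, 0.01])

theta_true = np.array([2.5, -2.5])     # hypothetical ground truth
eps = np.array([0.02, -0.01, 0.015])   # fixed "noise" for reproducibility
delta = wrap(A.T @ theta_true + eps)   # wrapped relative measurements

# Problem 7: minimize ||gamma - gamma_hat||^2 weighted by P_gamma^-1
gamma_hat = C @ delta / (2.0 * np.pi)
P_gamma_inv = np.linalg.inv(C @ P_delta @ C.T / (4.0 * np.pi ** 2))

def cost(g):
    r = g - gamma_hat
    return float(r @ P_gamma_inv @ r)

candidates = [np.array([float(g)]) for g in range(-3, 4)]
gamma_star = min(candidates, key=cost)

# Equation (31): map gamma_star back to wrapped orientations
C_dagger = np.vstack([np.zeros((n, ell)), np.linalg.inv(C[:, n:])])
W = np.linalg.inv(P_delta)
x_star = np.linalg.solve(A @ W @ A.T,
                         A @ W @ (delta - 2.0 * np.pi * C_dagger @ gamma_star))
theta_star = wrap(x_star)
print(gamma_star, theta_star)
```

Brute-force enumeration is only viable here because $\ell = 1$; the discussion that follows explains why larger instances need a different strategy.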

Two further challenges stand in the way. First, Problem 7 is NP-hard [49]. Several algorithms have been proposed in the literature to solve integer quadratic programming (see, e.g., [41], [52], [53] and references therein); however, because of the hardness of the problem, one cannot expect to solve large-scale problems exactly and quickly; for example, Chang and Golub [54] report computational times above 10 seconds in problem instances with fewer than 50 variables. Similar numerical results are reported by Jazaeri et al. [53]. Second, as explained in the next section, computing only the maximum likelihood estimate does not guarantee an accurate orientation estimate.

V. LIMITATIONS OF THE MAXIMUM LIKELIHOOD ESTIMATE

Section IV has shown that the maximum likelihood estimation problem on the orientations, which lie on the product manifold $SO(2)^n$, is equivalent to an optimization problem in the variable $\gamma$ on the integer lattice $\mathbb{Z}^\ell$. This section shows that, if the maximum likelihood solution $\gamma^\star$ is different from the “true” value $\gamma^\circ$, then there is a bias in the estimate of $x$ (Lemma 16), which raises the mean square estimation error. To define the “true” value of $\gamma$, rewrite the measurements $\delta$ as

$$\delta = \langle A^T \theta^\circ + \varepsilon \rangle_{2\pi},$$

where $\theta^\circ$ is the “true” value of the nodes’ orientations. From (2) the measurement model can be written as

$$\delta = A^T \theta^\circ + 2\pi k^\circ + \varepsilon, \tag{32}$$

where $k^\circ$ is the “true” regularization vector given by $k^\circ \doteq \lfloor (\pi 1_m - A^T \theta^\circ - \varepsilon)/2\pi \rfloor$. Define the “true” value of $\gamma$ as

$$\gamma^\circ \doteq C k^\circ.$$

For a given $\gamma$, define the real-valued estimate $x^{\star|\gamma}$ as

$$x^{\star|\gamma} \doteq (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} (\delta - 2\pi C^\dagger \gamma) \in \mathbb{R}^n, \tag{33}$$

and the corresponding (wrapped) estimate $\theta^{\star|\gamma}$ as

$$\theta^{\star|\gamma} \doteq \langle x^{\star|\gamma} \rangle_{2\pi} \in (-\pi,+\pi]^n.$$

We call $x^{\star|\gamma^\star}$ the real-valued maximum likelihood estimate (it is $x^{\star|\gamma}$ computed for $\gamma^\star$). Comparing with (31), the reader can notice that the real-valued maximum likelihood estimate is simply the maximum likelihood orientation estimate before applying the modulus. We can give a full characterization of the distribution of the real-valued maximum likelihood estimator.

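Lemma 16. The real-valued maximum likelihood estimator $x^{\star|\gamma^\star}$ can be written as

\begin{align}
x^{\star|\gamma^\star} ={}& \theta^\circ + \tag{34} \\
& 2\pi p + \tag{35} \\
& (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} \varepsilon + \tag{36} \\
& 2\pi (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} C^\dagger (\gamma^\circ - \gamma^\star), \tag{37}
\end{align}

where the term (34) is the true value of the orientations; the term (35) contains some integer vector $p$ and has no effect once the modulus is applied to $x^{\star|\gamma^\star}$; the term (36) contains the noise which would appear even if the problem were linear; and, finally, the term (37) contains an additional bias, which is proportional to the mismatch between $\gamma^\star$ and $\gamma^\circ$. In particular, if $\gamma^\star = \gamma^\circ$, then the vector $x^{\star|\gamma^\star}$ is Normally distributed with mean $\theta^\circ + 2\pi p$ and covariance matrix $(A P_\delta^{-1} A^T)^{-1}$.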

Proof: Substitute $\delta$ from (32) in the expression of the real-valued estimator to obtain

$$x^{\star|\gamma} = (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} \left( A^T \theta^\circ + \varepsilon + 2\pi (k^\circ - C^\dagger \gamma) \right).$$

Rewrite $k^\circ = C^\dagger \gamma^\circ + k_\perp$, so as to separate $k^\circ$ into two vectors: the first is $C^\dagger \gamma^\circ$, which, by construction, satisfies $C(C^\dagger \gamma^\circ) = \gamma^\circ$. The second belongs to the kernel of $C$, and thus satisfies $C k_\perp = 0_\ell$. Note that $k_\perp$ is an integer vector, since both $k^\circ$ and $C^\dagger \gamma^\circ$ are integer vectors by construction. Because of Proposition 11, the term $k_\perp$ only adds multiples of $2\pi$ to the estimate:

$$x^{\star|\gamma} = (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} \left( A^T \theta^\circ + \varepsilon + 2\pi C^\dagger (\gamma^\circ - \gamma) \right) + 2\pi p,$$

with $p = D k_\perp$, for some integer matrix $D$. Multiplying out the parentheses gives the desired result for $\gamma = \gamma^\star$.

One consequence of this result is that the maximum likelihood estimate of $x$ computed from $\gamma^\star$ is not necessarily the one with the minimum estimation error, because one cannot guarantee in general that the optimal solution $\gamma^\star$ of Problem 7 is such that $\gamma^\star = \gamma^\circ$. Section V-B in [42] shows a simple example in which the maximum likelihood estimator fails to retrieve $\gamma^\circ$. Consequently, in order to attain a small estimation error, one should look for $\gamma^\circ$, instead of simply computing $\gamma^\star$.

VI. A MULTI-HYPOTHESIS ESTIMATOR FOR γ◦

This section shows how to find a set of candidates for $\gamma^\circ$. Section VI-A describes how to derive an estimator for $\gamma^\circ$ which allows building a confidence set. Section VI-B describes the INTEGER-SCREENING algorithm, which finds a set of integer vectors $\Gamma$ containing $\gamma^\circ$ with desired probability. Section VI-C discusses the influence of the cycle basis matrix in the construction of $\Gamma$ and proves that the minimum cycle basis matrix is the choice minimizing the expected size of the set $\Gamma$.

A. An estimator of γ◦

From the knowledge of a cycle basis matrix $C$, the measurements $\delta$, and the covariance matrix $P_\delta$ we can design an estimator $\hat{\gamma}$ for the unknown integer vector $\gamma^\circ$.

Proposition 17. The real-valued estimator

$$\hat{\gamma} \doteq \tfrac{1}{2\pi} C \delta \tag{38}$$

is a Normally distributed estimator with mean $\gamma^\circ$ and covariance matrix

$$P_\gamma \doteq \tfrac{1}{4\pi^2} C P_\delta C^T. \tag{39}$$

Proof: Multiply both sides of (32) by $C$ to obtain

$$C \delta = C (A^T \theta^\circ + 2\pi k^\circ + \varepsilon).$$

By Lemma 21, the term $C A^T$ is equal to zero. Reordering the remaining terms, we get $C k^\circ = \tfrac{1}{2\pi} C \delta - \tfrac{1}{2\pi} C \varepsilon$. Using the definition $\gamma^\circ = C k^\circ$ given in Section V, this implies $\gamma^\circ = \tfrac{1}{2\pi} C \delta - \tfrac{1}{2\pi} C \varepsilon$, from which the claim follows.

The availability of this estimator allows the computation of the set $\Gamma$, as described in the following section.
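The identity behind Proposition 17 can be checked on a toy instance. The sketch below builds wrapped measurements from a hypothetical ground truth and a fixed noise vector, computes $k^\circ$ and $\gamma^\circ$ from their definitions, and compares $\hat{\gamma} - \gamma^\circ$ with $\tfrac{1}{2\pi} C \varepsilon$; the graph and all values are assumptions for illustration.

```python
import numpy as np

def wrap(a):
    """Map angles to the interval (-pi, +pi], elementwise."""
    return a + 2.0 * np.pi * np.floor((np.pi - a) / (2.0 * np.pi))

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])       # reduced incidence matrix of a triangle
C = np.array([[1.0, 1.0, 1.0]])        # cycle basis matrix, C @ A.T == 0

theta_true = np.array([2.5, -2.5])     # hypothetical true orientations
eps = np.array([0.02, -0.01, 0.015])   # fixed noise sample
unwrapped = A.T @ theta_true + eps

delta = wrap(unwrapped)                                     # measurements
k_true = np.floor((np.pi - unwrapped) / (2.0 * np.pi))      # "true" k
gamma_true = C @ k_true                                     # "true" gamma
gamma_hat = C @ delta / (2.0 * np.pi)                       # estimator (38)

estimation_error = gamma_hat - gamma_true
print(k_true, gamma_true, gamma_hat, estimation_error)
```

The error depends only on the noise, not on the orientations, which is what makes $\hat{\gamma}$ usable for building confidence sets around integer candidates.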

Algorithm 1: INTEGER-SCREENING

 1  input:
 2      a real vector γ̂ ∈ R^ℓ
 3      a positive definite matrix P_γ ∈ R^{ℓ×ℓ}
 4      a confidence level α ∈ (0, 1)
 5  precondition: γ̂ ∼ N(γ°, P_γ)
 6  output: a subset of Z^ℓ containing γ° with probability at least α
 7  variables:
 8      U^(k) ⊆ {1, …, ℓ}       # Indices determined at iter. k
 9      R^(k) ⊆ {1, …, ℓ}       # Indices that are still ambiguous at iter. k
10      ⟨ζ_R(k), P_R(k)⟩         # Current estimate of γ°_i, i ∈ R^(k)
11      I_i^(k) ⊂ R              # Current confidence interval for γ°_i, i = {1, …, ℓ}
12      Γ_i^(k) ⊂ Z              # Current admissible values for γ°_i, i = {1, …, ℓ}
13
14  R^(1) ← {1, …, ℓ}
15  ⟨ζ_R(1), P_R(1)⟩ ← ⟨γ̂, P_γ⟩
16  for k in {1, 2, …}:
17      # Compute confidence sets using marginal probabilities
18      U^(k) = ∅
19      for i in R^(k):
20          η = α^(1/ℓ)
21          b_i ← sqrt(P_ii^(k) χ²_{1,η}), with η = α^(1/ℓ)
22          I_i^(k) ← [ζ_i^(k) − b_i, ζ_i^(k) + b_i]
23          Γ_i^(k) ← I_i^(k) ∩ Z
24          if |Γ_i^(k)| = 1:    # Check if the set contains a single integer
25              U^(k) ← U^(k) ∪ {i}
26      end
27      # Preserve the previous confidence sets
28      for i in {1, …, ℓ} \ R^(k):
29          I_i^(k) ← I_i^(k−1)
30          Γ_i^(k) ← Γ_i^(k−1)
31      end
32      # Update the set of indices that are still ambiguous
33      R^(k+1) ← R^(k) \ U^(k)
34      if U^(k) = ∅:            # Break if there is no progress
35          break
36      # Otherwise condition on the elements that we determined
37      ⟨ζ_R(k+1), P_R(k+1)⟩ ← CONDITION(⟨ζ_R(k), P_R(k)⟩, γ_U(k) = Γ_U(k))
38  end
39  Γ ← Γ_1^(k) × Γ_2^(k) × ⋯ × Γ_ℓ^(k)
40  return Γ

The function CONDITION(⟨ζ, P⟩, γ_i = u_i) computes the conditional Normal distribution parametrized in its mean ζ and covariance P.

B. The INTEGER-SCREENING algorithm

The INTEGER-SCREENING algorithm (Algorithm 1) computes a set of integer vectors containing $\gamma^\circ$ with a user-specified probability. The algorithm is based on two ideas: marginalization and conditioning. We use marginalization to exclude non-plausible values for the elements of $\gamma^\circ$. Since $\hat{\gamma} \sim \mathcal{N}(\gamma^\circ, P_\gamma)$, the marginal distribution of the $i$-th element $\hat{\gamma}_i$ is also Gaussian, with mean $\gamma_i^\circ$. Once a confidence level is chosen, we have a confidence interval for each element, and if the interval contains only one integer, then we can uniquely determine that element. Once we are sure of the value of one element, say $\gamma_i^\circ = u_i$, by conditioning on $\gamma_i^\circ = u_i$ we can shrink the uncertainty on the others. These two ideas suggest an iterative algorithm that looks for elements of $\gamma^\circ$ that can be determined unambiguously, and then uses those constraints to further shrink the uncertainty on the remaining elements.

The input to Algorithm 1 consists of the real vector $\hat{\gamma}$, the positive definite matrix $P_\gamma$, and the confidence level $\alpha$. It is assumed that $\hat{\gamma}$ is a Normally distributed estimator of $\gamma^\circ \in \mathbb{Z}^\ell$ and that $P_\gamma$ is its covariance. The output of the algorithm is a set $\Gamma$ of integer vectors whose probability of containing $\gamma^\circ$ is no smaller than the user-specified parameter $\alpha$.

Throughout the execution, the set $\mathcal{U}^{(k)}$ contains the indices of the elements of $\gamma^\circ$ that are uniquely identified at iteration $k$, and, conversely, the set $\mathcal{R}^{(k)}$ contains the indices that are still ambiguous. At the beginning (line 14), $\mathcal{R}^{(1)} = \{1,\dots,\ell\}$, as no element has been identified.

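We use $\gamma^\circ_{\mathcal{U}^{(k)}}$ to indicate the subvector of $\gamma^\circ$ at the indices given by $\mathcal{U}^{(k)}$, and $\gamma^\circ_{\mathcal{R}^{(k)}}$ to denote the elements of $\gamma^\circ$ at the indices given by $\mathcal{R}^{(k)}$. The algorithm updates two variables $\zeta_{\mathcal{R}^{(k)}}$ and $P_{\mathcal{R}^{(k)}}$, preserving the invariant

$$\zeta_{\mathcal{R}^{(k)}} \sim \mathcal{N}(\gamma^\circ_{\mathcal{R}^{(k)}}, P_{\mathcal{R}^{(k)}}),$$

i.e., they describe a Normally distributed estimator of the elements $\gamma^\circ_{\mathcal{R}^{(k)}}$ that have not been identified yet. The invariant holds at the beginning, as the variables are initialized to $\zeta_{\mathcal{R}^{(1)}} = \hat{\gamma}$ and $P_{\mathcal{R}^{(1)}} = P_\gamma$ (line 15).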

At a generic iteration $k$, the algorithm computes the confidence set for each $\gamma_i^\circ$, $i \in \mathcal{R}^{(k)}$. Since $\zeta_{\mathcal{R}^{(k)}}$ is Normally distributed, each component $\zeta_i^{(k)}$ is also Normally distributed, with mean $\gamma_i^\circ$ and variance given by the $i$-th diagonal element $P_{ii}^{(k)}$ of the covariance. Therefore, with probability $\eta$, it holds that

$$\gamma_i^\circ \in \left[\zeta_i^{(k)} - b,\ \zeta_i^{(k)} + b\right] \doteq I_i^{(k)},$$

where $b = \sqrt{P_{ii}^{(k)} \chi^2_{1,\eta}}$, and $\chi^2_{1,\eta}$ is the quantile of the $\chi^2$ distribution with 1 degree of freedom at probability level $\eta$, i.e., a $\chi^2_1$ variable does not exceed $\chi^2_{1,\eta}$ with probability $\eta$ (lines 21–22). Note that the confidence interval depends on $\eta = \alpha^{1/\ell}$. The relation between $\alpha$ and $\eta$ is justified by Proposition 18.

Then, the algorithm computes all the integers within the interval $I_i^{(k)}$, obtaining $\Gamma_i^{(k)} = I_i^{(k)} \cap \mathbb{Z}$ (line 23). If the set $\Gamma_i^{(k)}$ contains a single integer, say $u_i$, then with probability $\eta$ it holds that $\gamma_i^\circ = u_i$, and the algorithm adds the index $i$ to the set of uniquely determined elements $\mathcal{U}^{(k)}$ (line 25).
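The marginal screening step (lines 20–23 of Algorithm 1) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and $\chi^2_{1,\eta}$ is assumed to be the value that a $\chi^2_1$ variable does not exceed with probability $\eta$, computed from the identity that a $\chi^2_1$ variable is the square of a standard Normal variable.

```python
import math
from statistics import NormalDist


def chi2_1_quantile(eta):
    """Return q such that P(X <= q) = eta for X ~ chi-square with 1 dof.

    Since X = Z^2 with Z standard Normal, q is the square of the
    (1 + eta) / 2 quantile of Z.
    """
    z = NormalDist().inv_cdf((1.0 + eta) / 2.0)
    return z * z


def admissible_integers(zeta_i, var_i, alpha, ell):
    """Integers inside [zeta_i - b, zeta_i + b], as in lines 20-23."""
    eta = alpha ** (1.0 / ell)                     # per-component level
    b = math.sqrt(var_i * chi2_1_quantile(eta))    # interval half-width
    return list(range(math.ceil(zeta_i - b), math.floor(zeta_i + b) + 1))


# A tight marginal isolates one integer; a loose one leaves several.
print(admissible_integers(0.97, 0.0015, 0.99, 1))
print(admissible_integers(0.97, 0.25, 0.99, 1))
```

A component is added to the set of determined indices exactly when the returned list has length one.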


After checking all sets $\Gamma_i^{(k)}$, $i \in \mathcal{R}^{(k)}$ (line 26), we have the set $\mathcal{U}^{(k)}$, which contains the indices of all the elements of $\gamma^\circ$ that we uniquely determined at the current iteration. Clearly, these indices can be removed from the ones that are still ambiguous (line 33). Moreover, we can exploit this information to infer the value of the remaining elements of $\gamma^\circ$, by computing the density $P(\zeta_{\mathcal{R}^{(k)}} \mid \zeta_{\mathcal{U}^{(k)}} = \Gamma_{\mathcal{U}^{(k)}})$, which is the conditional density of the elements that are still ambiguous, given the elements uniquely determined (line 37). Since the original density is Gaussian, the conditional density is also Gaussian, with mean $\zeta_{\mathcal{R}^{(k+1)}}$ and covariance $P_{\mathcal{R}^{(k+1)}}$. Therefore, at the end of the iteration, we have a unique value for the elements in $\mathcal{U}^{(k)}$ and a probabilistic description (i.e., mean and covariance) of the elements in $\mathcal{R}^{(k+1)}$. Since the conditioning may have shrunk the uncertainty on some element, we proceed to the next iteration. If the set $\mathcal{U}^{(k)}$ is empty, we are not able to make any progress and the loop exits (line 35).

Notice that when conditioning on some component of $\gamma^\circ$ we reduce the size of the mean vector and the covariance matrix. At iteration $k$, it holds that $\zeta_{\mathcal{R}^{(k)}} \in \mathbb{R}^{|\mathcal{R}^{(k)}|}$ and $P_{\mathcal{R}^{(k)}} \in \mathbb{R}^{|\mathcal{R}^{(k)}| \times |\mathcal{R}^{(k)}|}$.

The algorithm performs at most $K \leq \ell$ iterations because, at each iteration, at least one additional element of $\gamma^\circ \in \mathbb{Z}^\ell$ is determined. After the algorithm stops we have a collection of confidence sets $\Gamma_i^{(K)} \subset \mathbb{Z}$, $i \in \{1,\dots,\ell\}$. If each element $\gamma_i^\circ$ belongs to $\Gamma_i^{(K)}$, then it must hold that

$$\gamma^\circ \in \Gamma_1^{(K)} \times \Gamma_2^{(K)} \times \dots \times \Gamma_\ell^{(K)} \doteq \Gamma,$$

where $\times$ denotes the Cartesian product of sets (line 39). Proposition 18 bounds the probability of $\gamma^\circ \in \Gamma$.
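The whole screening loop can be sketched as follows. This is a compact illustration under stated assumptions rather than a reference implementation: CONDITION is realized with the standard Gaussian conditioning formulas applied one component at a time (which is equivalent to conditioning on all determined components jointly), matrices are plain nested lists, and all function names are hypothetical.

```python
import itertools
import math
from statistics import NormalDist


def half_width(var, eta):
    # sqrt(P_ii * chi2_{1,eta}); the chi-square quantile is the square
    # of the (1 + eta) / 2 quantile of a standard Normal variable.
    return math.sqrt(var) * NormalDist().inv_cdf((1.0 + eta) / 2.0)


def condition_on(zeta, P, j, u):
    """Condition N(zeta, P) on component j being equal to u, then drop j."""
    keep = [i for i in range(len(zeta)) if i != j]
    new_zeta = [zeta[i] + P[i][j] / P[j][j] * (u - zeta[j]) for i in keep]
    new_P = [[P[i][k] - P[i][j] * P[j][k] / P[j][j] for k in keep]
             for i in keep]
    return new_zeta, new_P


def integer_screening(gamma_hat, P_gamma, alpha):
    ell = len(gamma_hat)
    eta = alpha ** (1.0 / ell)
    R = list(range(ell))                   # indices still ambiguous
    zeta, P = list(gamma_hat), [row[:] for row in P_gamma]
    Gamma = [None] * ell                   # admissible integers per index
    while R:
        U = []                             # (position in R, integer value)
        for pos, i in enumerate(R):
            b = half_width(P[pos][pos], eta)
            lo, hi = math.ceil(zeta[pos] - b), math.floor(zeta[pos] + b)
            Gamma[i] = list(range(lo, hi + 1))
            if len(Gamma[i]) == 1:
                U.append((pos, Gamma[i][0]))
        if not U:                          # no progress: stop
            break
        # Condition on the determined elements, highest position first,
        # so that the remaining positions stay valid after each removal.
        for pos, u in sorted(U, reverse=True):
            zeta, P = condition_on(zeta, P, pos, u)
            del R[pos]
    return list(itertools.product(*Gamma))


# Two strongly correlated components: the second is ambiguous at first
# and becomes unique after conditioning on the first.
print(integer_screening([1.02, 0.2], [[0.01, 0.028], [0.028, 0.09]], 0.99))
```

In the example the first pass fixes the first component at 1 and leaves two candidates for the second; conditioning reduces the variance of the second component from 0.09 to 0.0116, and the second pass isolates the value 0.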

Proposition 18 (Correctness of INTEGER-SCREENING). The integer vector $\gamma^\circ$ is in the set $\Gamma$ returned by Algorithm 1 with probability no smaller than $\alpha$.

Proof: The complete proof is given in [42]. The proof proceeds as follows. By construction, the sets $\mathcal{U}^{(1)}, \dots, \mathcal{U}^{(k)}, \mathcal{R}^{(k+1)}$ are disjoint and, at the end of each iteration $k \in \{1,\dots,K\}$, it holds that

$$|\mathcal{U}^{(1)}| + \dots + |\mathcal{U}^{(k)}| + |\mathcal{R}^{(k+1)}| = \ell.$$

At iteration $k$, the intervals $I_i^{(k)}$ are built from the marginals of a Normal distribution. Therefore, the technical result of Lemma 23 (in the Appendix) guarantees that

$$P\Big(\bigwedge_{i \in \{1,\dots,\ell\}} \gamma_i^\circ \in I_i^{(k)}\Big) \geq \eta^{|\mathcal{U}^{(1)}| + \dots + |\mathcal{U}^{(k)}| + |\mathcal{R}^{(k+1)}|} = \eta^\ell,$$

which leads to the desired result for $\eta^\ell = \alpha$.

C. Optimal choice of the cycle basis matrix

So far we have not discussed how to choose the cycle basis matrix $C$. We can show that the minimum cycle basis matrix MCB is theoretically the best choice for constructing the set $\Gamma$ because it makes the estimator $\hat{\gamma} = \frac{1}{2\pi}\mathrm{MCB}\,\delta$ a minimum-variance estimator.

Proposition 19 (Optimal choice of cycle basis matrix). Choosing $C$ to be the minimum (uncertainty) cycle basis matrix MCB makes $\hat{\gamma}$ a minimum variance unbiased estimator of $\gamma^\circ$ within the class of estimators $\{\hat{\gamma} = \frac{1}{2\pi}C\delta : C \in \mathcal{C}_G\}$.

Proof: We already know that $\hat{\gamma}$ is unbiased. For an unbiased estimator, the variance is equal to the mean square error. For a Normally distributed estimator, the mean square error is proportional to the trace of the covariance matrix. Therefore, we have to show that choosing $C = \mathrm{MCB}$ minimizes $\mathrm{Trace}(P_\gamma) = \frac{1}{4\pi^2}\mathrm{Trace}(CP_\delta C^T)$. The $t$-th term on the main diagonal of $CP_\delta C^T$ is $c_t P_\delta c_t^T = \sum_{(i,j) \in c_t} \sigma_{ij}^2$, where the notation $(i,j) \in c_t$ means "for all edges that belong to the $t$-th cycle". The previous expression coincides with the weight of the cycle $c_t$ under the weight function $w : (i,j) \to \sigma_{ij}^2$. Therefore, the trace of $CP_\delta C^T$ is equal to the sum of the weights of the cycles in the cycle basis, which by definition is $W(C, w)$, and $\mathrm{Trace}(P_\gamma) = \frac{1}{4\pi^2}W(C, w)$. Therefore the minimum cycle basis, which minimizes $W(C, w)$, also minimizes $\mathrm{Trace}(P_\gamma)$.

A minimum variance estimator $\hat{\gamma}$ keeps the confidence set $\Gamma$ built by Algorithm 1 small. In Algorithm 1 the diagonal elements of $P_\gamma$ define the widths of the confidence intervals; since the minimum cycle basis matrix minimizes the sum of the diagonal elements of $P_\gamma$ (i.e., its trace), it also minimizes the overall width of the confidence intervals used by the INTEGER-SCREENING algorithm. Therefore, it enables the determination of a small set $\Gamma$ of admissible integer vectors.
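The identity at the core of the proof of Proposition 19, namely that $4\pi^2\,\mathrm{Trace}(P_\gamma)$ equals the total weight of the cycle basis, can be checked numerically. The sketch below uses a hypothetical five-edge graph (a square with one diagonal), made-up variances, and illustrative function names; $P_\delta$ is assumed diagonal and the cycle vectors have entries in $\{-1, 0, 1\}$.

```python
import math

# Hypothetical graph: square 0-1-2-3 plus the diagonal (0,2).
# Edge order: (0,1), (1,2), (2,3), (3,0), (0,2); variances in rad^2.
variances = [0.01, 0.01, 0.04, 0.04, 0.005]

# Two valid cycle bases; each row is the signed edge vector of a cycle.
basis_small = [[1, 1, 0, 0, -1],    # triangle 0-1-2
               [0, 0, 1, 1, 1]]     # triangle 0-2-3
basis_large = [[1, 1, 0, 0, -1],    # triangle 0-1-2
               [1, 1, 1, 1, 0]]     # outer square 0-1-2-3


def trace_P_gamma(C, variances):
    """Trace of C P_delta C^T / (4 pi^2) for a diagonal P_delta."""
    total = sum(c * c * v for row in C for c, v in zip(row, variances))
    return total / (4 * math.pi ** 2)


def basis_weight(C, variances):
    """W(C, w): sum over cycles of the variances of their edges."""
    return sum(v for row in C for c, v in zip(row, variances) if c != 0)


for name, C in (("small", basis_small), ("large", basis_large)):
    print(name, basis_weight(C, variances), trace_P_gamma(C, variances))
```

The basis made of the two triangles has the smaller weight, and therefore the smaller trace, because it avoids traversing the four noisy outer edges in a single cycle.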

VII. THE MOLE2D ORIENTATION ESTIMATION ALGORITHM

We can now summarize the findings presented so far in a single algorithm that computes a (multi-hypothesis) estimate of the robot orientations, which we call MOLE2D (Multi-hypothesis Orientation-from-Lattice Estimation in 2D).

Pseudocode for MOLE2D is given in Algorithm 2. The input to the algorithm is the reduced incidence matrix $A$ (which describes the graph), the measurements $\delta$, their covariance $P_\delta$, and a parameter $\alpha$ which gives the desired confidence level. The output is the set of estimates $\Theta$.

The first step (line 9) is the computation of a cycle basis of the graph. Any cycle basis will do, but Section VI-C tells us that the choice of the cycle basis can be improved if informed by the covariance $P_\delta$, and the best choice is the minimum cycle basis matrix of the graph. Without loss of generality, we assume that the rows of the cycle basis are ordered according to $(C_T \;\; C_\perp)$, as described in (1).

The next step (lines 11–13) consists in the computation of a set $\Gamma$ which contains $\gamma^\circ$ with confidence $\alpha$ using the INTEGER-SCREENING algorithm, described in Section VI-B.

The "for" loop in line 14 computes, for each integer vector $\gamma \in \Gamma$, the corresponding real-valued estimate $x^{\star|\gamma}$ (line 15), and obtains the wrapped estimate $\theta^{\star|\gamma}$ by applying the modulus to $x^{\star|\gamma}$ (line 16). The collection of wrapped estimates is then returned in the set $\Theta$.

In the rest of this section we show that the MOLE2D algorithm overcomes the two limitations of the maximum likelihood estimator of Section IV (computational efficiency and the need for an assessment of the resulting estimate). In particular, in Section VII-A we show that one of the orientation hypotheses returned by the algorithm is "close" to $\theta^\circ$ (with desired probability); then, in Section VII-B we show that the algorithm includes only worst-case polynomial operations.

Algorithm 2: MOLE2D

     1  input:
     2    reduced incidence matrix $A$ (topology of the graph)
     3    measurements $\delta$
     4    covariance $P_\delta$
     5    confidence level $\alpha \in (0, 1)$
     6  output: set of estimates of $\theta^\circ$
     7
     8  # Compute a cycle basis matrix in canonical form
     9  $C = (C_T \;\; C_\perp) \leftarrow$ COMPUTE-CYCLE-BASIS$(A, P_\delta)$
    10  # Compute set $\Gamma$, containing admissible vectors for $\gamma^\circ$
    11  $\hat{\gamma} \leftarrow \frac{1}{2\pi} C \delta$
    12  $P_\gamma \leftarrow \frac{1}{4\pi^2} C P_\delta C^T$
    13  $\Gamma \leftarrow$ INTEGER-SCREENING$(\hat{\gamma}, P_\gamma, \alpha)$
    14  for $\gamma$ in $\Gamma$:
    15    $x^{\star|\gamma} = (A P_\delta^{-1} A^T)^{-1} A P_\delta^{-1} \left(\delta - 2\pi \begin{pmatrix} 0_n \\ C_\perp^{-1}\gamma \end{pmatrix}\right)$
    16    $\theta^{\star|\gamma} = \langle x^{\star|\gamma} \rangle_{2\pi}$
    17  end
    18  return $\Theta = \{\theta^{\star|\gamma}\}$

The function COMPUTE-CYCLE-BASIS$(A, P_\delta)$ computes a cycle basis for the graph, possibly informed by the covariance matrix $P_\delta$.
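Lines 15–16 of Algorithm 2 amount to a weighted least-squares solve followed by a wrap. The sketch below works through a toy triangle graph under several assumptions that are not taken from the paper: an edge $(i,j)$ is assumed to measure $\theta_j - \theta_i$, node 0 is the anchor removed from $A$, the wrap $\langle\cdot\rangle_{2\pi}$ is assumed to map to $[-\pi, \pi]$, and the helper names are hypothetical.

```python
import math


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def mole2d_hypothesis(A, var_delta, delta, chord_integers):
    """Estimate for one hypothesis; chord_integers plays C_perp^{-1} gamma."""
    n, m = len(A), len(delta)
    k = [0.0] * n + list(chord_integers)     # zeros on the tree edges
    y = [d - 2 * math.pi * ki for d, ki in zip(delta, k)]
    w = [1.0 / v for v in var_delta]         # diagonal of P_delta^{-1}
    M = [[sum(A[i][e] * w[e] * A[j][e] for e in range(m)) for j in range(n)]
         for i in range(n)]
    rhs = [sum(A[i][e] * w[e] * y[e] for e in range(m)) for i in range(n)]
    x = solve(M, rhs)                        # line 15
    theta = [math.remainder(xi, 2 * math.pi) for xi in x]   # line 16
    return x, theta


# Edges (0,1), (1,2) form the spanning tree; (0,2) is the chord.
A = [[1.0, -1.0, 0.0],
     [0.0, 1.0, 1.0]]
delta = [2.03, 2.95, 5.04 - 2 * math.pi]     # noisy, chord measurement wrapped
# With C = [1, 1, -1] we have C_perp = [-1], so gamma = 1 gives -1 on the chord.
x, theta = mole2d_hypothesis(A, [0.01, 0.01, 0.04], delta, [-1.0])
print(x, theta)
```

For dense or large problems one would use a sparse linear solver instead of the small elimination routine above, in line with the remark in Section VII-B that explicit matrix inversions are avoided in practice.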

A. Assessment of the estimator

This section evaluates the quality of our estimator. More precisely, we want to assess how the set of estimates $\Theta = \{\theta^{\star|\gamma}\}$, which is the output of Algorithm 2, relates to $\theta^\circ$. The assessment of the estimator is given in the following proposition.

Proposition 20. Consider the set of estimators $\Theta$ returned by Algorithm 2. Then, with probability no smaller than $\alpha$, one of the estimators in $\Theta$, say $\theta^{\star|\gamma_j} \in \Theta$, satisfies the following statements: (i) the real-valued estimator $x^{\star|\gamma_j}$ is Normally distributed with mean $\theta^\circ + 2\pi p$ and covariance matrix $(AP_\delta^{-1}A^T)^{-1}$; (ii) the wrapped estimator $\theta^{\star|\gamma_j} = \langle x^{\star|\gamma_j} \rangle_{2\pi}$ is distributed according to a wrapped Gaussian with mean $\theta^\circ$ and covariance matrix $(AP_\delta^{-1}A^T)^{-1}$.

Proof: The statements follow from Lemma 16 and Proposition 18. Lemma 16 assures that the event $\gamma^\circ \in \Gamma$ is the same as the event "at least one real-valued (respectively, wrapped) orientation estimate is distributed according to a Gaussian (respectively, wrapped Gaussian)", and Proposition 18 assures that such an event happens with probability no smaller than $\alpha$.

A straightforward consequence of Proposition 20 is that, in the case $|\Gamma| = 1$, the single estimate contained in $\Theta$ is distributed according to a wrapped Gaussian around $\theta^\circ$. Therefore, in the case $|\Gamma| = 1$, we can draw conclusions that are typical of linear estimators and are very rare in nonlinear estimation problems (e.g., Normality). In the experimental section we will show that most of the problem instances that constitute a benchmark for state-of-the-art approaches to SLAM satisfy the condition $|\Gamma| = 1$. Therefore, in common problem instances, our multi-hypothesis estimator returns a single guaranteed orientation estimate.

B. Complexity of MOLE2D

The complexity of computing $C$ (line 9) heavily depends on the choice of the cycle basis. We consider four possible choices, listed here from computationally cheap to expensive:

FCBo: This is the fundamental cycle basis built from the odometric spanning tree. Call $T_o$ the odometric spanning tree, which is also a spanning path for the graph. Each cycle of FCB($T_o$) comprises a chord in the graph with respect to $T_o$, say $(i,j)$, and the unique path in $T_o$ from node $i$ to node $j$. The construction of FCBo implies a complexity $O(n\,\ell)$: the odometric spanning path can be considered a given of the problem, and the complexity reduces to filling in the matrix $C \in \mathbb{R}^{\ell \times m}$, which has at most $n+1$ nonzero elements in each row;

FCBm: This is the fundamental cycle basis built from the minimum uncertainty spanning tree. FCBm requires the computation of the minimum spanning tree, which amounts to $O(m+n)$, and the construction of the matrix $C$ ($O(nm)$), therefore the overall cost is $O(nm)$;

MCBa: A $(2\nu-1)$-approximation of the minimum cycle basis is computed using the algorithm proposed by Kavitha et al. [55], which implies a complexity $O(n^{3+\frac{2}{\nu}})$;

MCB: The minimum uncertainty cycle basis is computed using the method by Mehlhorn and Michail [56], which implies a complexity $O(m^3)$.

The computation of the estimator $\hat{\gamma}$ (line 11) requires at most $\ell m$ operations, while the covariance matrix requires $\ell^2 m$ operations (exploiting the fact that $P_\delta$ is diagonal). For the INTEGER-SCREENING algorithm (line 13), the worst case is when only one index is added to the set of uniquely determined elements at each iteration. Conditioning has cubic complexity for general matrices ($O(\ell^3)$). Therefore, in the worst case, the complexity is $O(\ell^4)$ (the algorithm performs at most $\ell$ conditioning steps). Finally, to obtain $\Theta$ one has to compute $x^{\star|\gamma}$ and $\theta^{\star|\gamma}$ for each $\gamma \in \Gamma$. The expression of $x^{\star|\gamma}$ is given in (33) and contains the two matrix inverses $C_\perp^{-1}$ and $(AP_\delta^{-1}A^T)^{-1}$; in practice, one would not perform these matrix inversions, but would rather solve two linear systems, with overall complexity $O(m^3)$ (recall that $\ell < m$ and $n \leq m$). The complexity of computing $\theta^{\star|\gamma}$ from $x^{\star|\gamma}$ (i.e., applying the modulus operation) is $O(n)$.

Assuming that the cardinality of $\Gamma$ does not grow with the problem dimension, the worst-case complexity of Algorithm 2 amounts to $O(m^4)$. The cardinality of $\Gamma$ depends on the size of the loops and on the measurement uncertainty rather than on the problem dimension: we already observed in the proof of Proposition 19 that the diagonal elements of $P_\gamma$ (which determine $\Gamma$) are, up to the constant factor $\frac{1}{4\pi^2}$, the sums of the variances of the measurements along each cycle.

The worst-case complexity is a poor indicator of the actual complexity of the algorithm, for two main reasons. First, the matrices involved in the various steps of the algorithm are sparse, therefore the computation of $\theta^{\star|\gamma}$ (line 14) has a complexity that is far below the upper bound. Second, if we are careful about the choice of the cycle basis matrix from which $P_\gamma$ is computed, the INTEGER-SCREENING algorithm is able to compute a small set $\Gamma$ in few iterations, therefore the average complexity is essentially that of doing one conditioning.


VIII. EXPERIMENTAL EVALUATION

This section presents an experimental analysis of the proposed approach and its application to pose graph optimization.

Section VIII-A discusses the performance of the MOLE2D algorithm in the problem of orientation estimation. The objective is to evaluate how important the choice of the cycle basis matrix is in practice, what the cardinality of the set of candidate vectors $\Gamma$ is in real applications (recall that $|\Gamma| = |\Theta|$), and how fast the algorithm is on common problem instances. Section VIII-B discusses the use of the orientation estimate produced by MOLE2D as the initial guess for g2o.

The experiments used three standard datasets:

INTEL: This dataset, acquired at the Intel Research Lab in Seattle, includes odometry and range-finder data. Relative pose constraints are derived from scan matching. Data processing details are given in previous work [25];

MITb: This dataset was acquired at the MIT Killian Court. Data processing details are given in previous work [40];

M3500: This dataset was created by Olson et al. [11].

To test MOLE2D in more challenging scenarios, we created other datasets by adding extra Gaussian noise (with standard deviation $\sigma$) to the M3500 orientation measurements. These new datasets are called M3500a, M3500b, M3500c, with $\sigma = 0.1, 0.2, 0.3$ rad, respectively.

A. Effect of different cycle bases on orientation estimation

Here we only consider the orientation measurements in the pose graph and the corresponding covariance matrix. For MOLE2D we chose the confidence level $\alpha = 0.99$. The tests were performed on a desktop computer with an Intel i7 processor at 3.4 GHz. For the computation of the cycle basis matrices, we used Michail's C++ implementation [57], and we chose $\nu = 2$ for the $(2\nu-1)$-approximation [55]. The rest of MOLE2D is implemented in Matlab.

In the scenarios INTEL, MITb, and M3500, the INTEGER-SCREENING algorithm is able to identify a single possible value for $\gamma^\circ$, regardless of the choice of the cycle basis matrix (Table III, last column). In the scenarios characterized by extreme noise levels (M3500a–c), the choice of the cycle basis truly matters. If one uses the fundamental cycle basis FCBo, the size of $\Gamma$ is too large to be tractable; the explosion of $|\Gamma|$ is partially mitigated by the use of FCBm, which, however, fails to produce a reasonably small number of vectors in $\Gamma$ in the scenario M3500c. Using a minimum cycle basis gives a small cardinality of $\Gamma$ (1, 3, and 16 for the cases M3500a, M3500b, and M3500c, respectively), with no observed difference between the exact minimum cycle basis MCB and the approximation MCBa.

While the minimum cycle bases have better performance, they are more expensive to compute (Table IV).

As predicted by Proposition 19, the minimum cycle bases minimize the number of iterations in the INTEGER-SCREENING (Table III, second column). Moreover, the minimum cycle bases are able to determine most of the components of $\gamma^\circ$ (e.g., 95%) in the first iteration (Table III, third column), and they require inverting lower-density matrices (Table III, fourth column). All these elements provide a computational advantage when using the minimum cycle bases in the INTEGER-SCREENING (Table V, third column). The minimum cycle basis matrices are sparser, and this also constitutes an advantage in the computation of $\hat{\gamma}$ and $P_\gamma$ (Table V, second column). Finally, since the minimum cycle bases produce a smaller set of hypotheses $\Gamma$, they require solving a smaller number of linear systems for computing $\Theta$ from $\Gamma$ (Table V, fourth column).

In conclusion, if the noise is moderate, FCBm offers a good compromise between performance and computational effort. For extreme noise, the approximate minimum cycle basis matrix MCBa is a choice that assures a performance similar to that of MCB while being cheaper to compute.

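TABLE III
PERFORMANCE OF INTEGER-SCREENING FOR DIFFERENT CYCLE BASES

| dataset | cycle basis | K | iteration | u (%) | d (%) | $\lvert\Gamma\rvert$ |
|---|---|---|---|---|---|---|
| INTEL | FCBo | 1 | | 100.00 | n/a | 1 |
| INTEL | FCBm | 1 | | 100.00 | n/a | 1 |
| INTEL | MCBa | 1 | | 100.00 | n/a | 1 |
| INTEL | MCB | 1 | | 100.00 | n/a | 1 |
| MITb | FCBo | 2 | iter. 1 | (80.00) | (25.78) | (16) |
| | | | iter. 2 | 20.00 | n/a | 1 |
| MITb | FCBm | 1 | | 100.00 | n/a | 1 |
| MITb | MCBa | 1 | | 100.00 | n/a | 1 |
| MITb | MCB | 1 | | 100.00 | n/a | 1 |
| M3500 | FCBo | 5 | iter. 1 | (52.92) | (1.69) | ($>10^{100}$) |
| | | | iter. 2 | (21.54) | (15.61) | ($>10^{100}$) |
| | | | iter. 3 | (20.06) | (100.00) | ($>10^{50}$) |
| | | | iter. 4 | (5.12) | (100.00) | (972) |
| | | | iter. 5 | 0.36 | n/a | 1 |
| M3500 | FCBm | 2 | iter. 1 | (98.62) | (2.41) | ($>10^{9}$) |
| | | | iter. 2 | 1.38 | n/a | 1 |
| M3500 | MCBa | 2 | iter. 1 | (99.95) | (0.48) | (3) |
| | | | iter. 2 | 0.05 | n/a | 1 |
| M3500 | MCB | 2 | iter. 1 | (99.95) | (0.44) | (3) |
| | | | iter. 2 | 0.05 | n/a | 1 |
| M3500a | FCBo | 6 | | − | − | $>10^{30}$ |
| M3500a | FCBm | 3 | | − | − | 8 |
| M3500a | MCBa | 2 | | − | − | 1 |
| M3500a | MCB | 2 | | − | − | 1 |
| M3500b | FCBo | 29 | | − | − | $>10^{40}$ |
| M3500b | FCBm | 4 | | − | − | 27 |
| M3500b | MCBa | 3 | | − | − | 3 |
| M3500b | MCB | 3 | | − | − | 3 |
| M3500c | FCBo | 9 | | − | − | $>10^{100}$ |
| M3500c | FCBm | 7 | | − | − | $>10^{4}$ |
| M3500c | MCBa | 4 | | − | − | 16 |
| M3500c | MCB | 4 | | − | − | 16 |

K: number of iterations of INTEGER-SCREENING, including details of each iteration when significant (statistics of intermediate iterations are reported in parentheses; in cells with "−" details are omitted for brevity); $u \doteq |\mathcal{U}^{(k)}|/\ell$: percentage of elements uniquely determined at each iteration; d: density of the matrix to be inverted to compute the conditional Gaussian in line 37. The density is the number of non-zero elements over the total number of elements (in percentage); $\lvert\Gamma\rvert$: cardinality of $\Gamma$ at iteration $k$. The symbol "n/a" denotes that no matrix inversion is needed (all elements of $\gamma^\circ$ have been determined).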

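TABLE IV
COMPUTATION TIME FOR CYCLE BASIS MATRICES (SECONDS)

| dataset | n | m | FCBo | FCBm | MCBa | MCB |
|---|---|---|---|---|---|---|
| INTEL | 1228 | 1505 | ≤0.01 | ≤0.01 | 0.09 | 0.20 |
| MITb | 808 | 828 | ≤0.01 | ≤0.01 | 0.01 | 0.01 |
| M3500 | 3500 | 5599 | ≤0.01 | 0.30 | 1.11 | 1.54 |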


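TABLE V
COMPUTATION TIME FOR MOLE2D (SECONDS)

| dataset | cycle basis | Computation of $\hat{\gamma}$ and $P_\gamma$ (lines 11–12) | INTEGER-SCREENING (line 13) | Computation of $\Theta$ from $\Gamma$ (lines 14–16) | Total |
|---|---|---|---|---|---|
| INTEL | FCBo | 0.07 | 0.04 | ≤0.01 | 0.10 |
| INTEL | FCBm | ≤0.01 | 0.03 | ≤0.01 | 0.04 |
| INTEL | MCBa | ≤0.01 | 0.05 | ≤0.01 | 0.05 |
| INTEL | MCB | ≤0.01 | 0.04 | ≤0.01 | 0.04 |
| MITb | FCBo | ≤0.01 | 0.04 | ≤0.01 | 0.04 |
| MITb | FCBm | ≤0.01 | 0.03 | ≤0.01 | 0.03 |
| MITb | MCBa | ≤0.01 | 0.03 | ≤0.01 | 0.03 |
| MITb | MCB | ≤0.01 | 0.03 | ≤0.01 | 0.03 |
| M3500 | FCBo | 0.72 | 0.79 | ≤0.01 | 1.52 |
| M3500 | FCBm | ≤0.01 | 0.47 | ≤0.01 | 0.47 |
| M3500 | MCBa | ≤0.01 | 0.21 | ≤0.01 | 0.22 |
| M3500 | MCB | ≤0.01 | 0.21 | ≤0.01 | 0.22 |
| M3500a | FCBo | 0.72 | 0.80 | ($\Gamma$ too large to continue) | |
| M3500a | FCBm | ≤0.01 | 0.36 | 0.04 | 0.4 |
| M3500a | MCBa | ≤0.01 | 0.21 | ≤0.01 | 0.22 |
| M3500a | MCB | ≤0.01 | 0.21 | ≤0.01 | 0.22 |
| M3500b | FCBo | 0.71 | 1.19 | ($\Gamma$ too large to continue) | |
| M3500b | FCBm | ≤0.01 | 0.51 | 0.12 | 0.64 |
| M3500b | MCBa | ≤0.01 | 0.23 | 0.03 | 0.26 |
| M3500b | MCB | ≤0.01 | 0.23 | 0.03 | 0.26 |
| M3500c | FCBo | 0.72 | 0.72 | ($\Gamma$ too large to continue) | |
| M3500c | FCBm | ≤0.01 | 0.48 | ($\Gamma$ too large to continue) | |
| M3500c | MCBa | ≤0.01 | 0.23 | 0.15 | 0.38 |
| M3500c | MCB | ≤0.01 | 0.23 | 0.14 | 0.37 |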

B. Robustness of MOLE2D-based pose graph optimization

This section shows how the use of MOLE2D can improve pose graph optimization, by comparing Toro [6], g2o [5], and g2o bootstrapped with the orientation estimate provided by MOLE2D; this last algorithm is denoted with MOLE2D+g2o.

The objective function in pose graph optimization is referred to as the $\chi^2$ cost (see equation (2) in [15]), and it is a generalization of the cost in Problem 3 to include translation errors. Convergence to a global minimum of the cost function corresponds to small $\chi^2$ costs, while convergence to local minima leads to larger $\chi^2$ values.

In MOLE2D+g2o we use the MOLE2D algorithm to compute the nodes' orientations from relative orientation measurements, and then we use this estimate as the initial guess of the orientations for g2o; the translation guess is the one from odometry. Following the recommendation of the previous section, we used the approximate minimum cycle basis matrix within the MOLE2D algorithm. If MOLE2D returns more than one hypothesis, we run MOLE2D+g2o for each possible initial guess and choose the one that achieves the smallest cost.
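The selection among multiple hypotheses described above can be sketched as a best-of-several loop. The `optimize` callable below is a hypothetical stand-in for one run of an iterative solver from a given initial guess; it is not the g2o interface, and it is assumed to return a pair (estimate, $\chi^2$ cost).

```python
def best_of_hypotheses(hypotheses, optimize):
    """Run `optimize` from every initial guess and keep the cheapest result.

    `hypotheses` is a non-empty iterable of initial guesses (e.g. the
    orientation hypotheses in Theta); `optimize` is a hypothetical callable
    returning (estimate, chi2_cost) for one initial guess.
    """
    results = [optimize(h) for h in hypotheses]
    return min(results, key=lambda r: r[1])


# Stand-in optimizer for illustration: the "cost" is a simple quadratic.
def fake_optimize(h):
    return (h, (h - 2) ** 2)


print(best_of_hypotheses([0, 1, 2, 3], fake_optimize))
```

The cost of this strategy grows linearly with the number of hypotheses, which is one reason a small set $\Gamma$ is desirable.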

The results are summarized in Figures 2 and 3. Figure 2 shows a graphical representation of the estimated pose graph for one run in each scenario with the corresponding $\chi^2$ value. Figure 3 shows the sample distribution of the $\chi^2$ cost for the M3500 data for 100 realizations of the orientation noise.

The results confirm what is already known about Toro and g2o: for low noise g2o converges to a smaller $\chi^2$ cost than Toro, since Toro's approximations prevent reaching the minimum.

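[Figure 2: estimated pose graphs, one panel per dataset and method. The $\chi^2$ values reported under the panels are:]

| dataset | Toro | g2o | MOLE2D+g2o |
|---|---|---|---|
| INTEL | (a) $\chi^2 = 1.01 \cdot 10^5$ | (b) $\chi^2 = 2.17 \cdot 10^2$ | (c) $\chi^2 = 2.17 \cdot 10^2$ |
| MITb | (d) $\chi^2 = 2.72 \cdot 10^2$ | (e) $\chi^2 = 5.10 \cdot 10^6$ | (f) $\chi^2 = 4.11 \cdot 10^1$ |
| M3500 | (g) $\chi^2 = 2.49 \cdot 10^2$ | (h) $\chi^2 = 1.38 \cdot 10^2$ | (i) $\chi^2 = 1.38 \cdot 10^2$ |
| M3500a | (j) $\chi^2 = 2.05 \cdot 10^3$ | (k) $\chi^2 = 1.13 \cdot 10^6$ | (l) $\chi^2 = 9.12 \cdot 10^2$ |
| M3500b | (m) $\chi^2 = 7.83 \cdot 10^3$ | (n) $\chi^2 = 1.50 \cdot 10^6$ | (o) $\chi^2 = 2.05 \cdot 10^3$ |
| M3500c | (p) $\chi^2 = 2.87 \cdot 10^4$ | (q) $\chi^2 = 1.08 \cdot 10^9$ | (r) $\chi^2 = 2.55 \cdot 10^3$ |

Fig. 2. Estimated pose graphs and corresponding $\chi^2$ values. The first column reports the results obtained from Toro. The second shows the results obtained from g2o. The third column reports the results obtained by bootstrapping g2o with the orientation estimate of MOLE2D (MOLE2D+g2o).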

[Figure 3: box plots of $\log(\chi^2)$ (vertical axis, ticks at 10, 15, 20) for Toro, g2o, and MOLE2D+g2o, grouped by noise level $\sigma \in \{0.05, 0.1, 0.2, 0.3\}$ on the horizontal axis.]

Fig. 3. Distribution of the $\chi^2$ cost for 100 trials and different levels of noise ($\sigma \in \{0.05, 0.1, 0.2, 0.3\}$) in the M3500 dataset for Toro, g2o and MOLE2D+g2o. g2o was set to perform 20 iterations. The plot is drawn using Matlab's boxplot. The red line is the median; the blue box margins are the 25th and 75th percentiles $q_{0.25}$ and $q_{0.75}$; the black whiskers are at $q_{0.25} - w$ and $q_{0.75} + w$, where $w = 1.5(q_{0.75} - q_{0.25})$. All points outside of this range are considered outliers and plotted individually with red crosses.


This happens in the "easy" scenarios (e.g., INTEL or M3500 in Figure 2) and in several runs with $\sigma = 0.05$ rad (first column in Figure 3). In scenarios with moderate noise, Toro has better convergence properties: generally, gradient methods are known to have a larger basin of convergence [6]. Both methods fail for larger noise as they get stuck in local minima (e.g., MITb, M3500a–c in Figure 2). Local minima correspond to incorrect wraparounds in long loops (Figure 2m).

The results show that combining MOLE2D with g2o gives a method that is both robust and precise. The third column in Figure 2 shows that this bootstrapping greatly improves the robustness of the iterative solver. In all cases the combination of MOLE2D and g2o attains the smallest observed $\chi^2$ value and, visually, qualitatively more correct maps. This finding is confirmed by the statistics in Figure 3, where in all cases MOLE2D+g2o is able to reach the smallest $\chi^2$ cost. The median number of orientation hypotheses reported by MOLE2D was $|\Gamma| = 1$ for $\sigma = 0.05$ rad and $\sigma = 0.1$ rad, while it was $|\Gamma| = 3$ for $\sigma = 0.2$ rad and $|\Gamma| = 48$ for $\sigma = 0.3$ rad.

IX. CONCLUSIONS

This paper considered the problem of estimating the orientations of nodes in a pose graph from relative orientation measurements. The reformulation of maximum likelihood orientation estimation in terms of quadratic integer programming allows us to conclude that the maximum likelihood estimate is almost surely unique, and gives us a way to compute a global solution. Starting from this observation we devised a multi-hypothesis estimator that enables efficient computation and has guaranteed performance (at least one of the computed estimates is guaranteed to be "close" to the actual orientation of the nodes). We illustrated the theoretical derivation with numerical experiments on real and simulated data. Finally, we showed that the proposed approach can be used to bootstrap state-of-the-art techniques for pose graph optimization and allows a remarkable boost in their performance, extending their applicability. Future work includes the analysis of the estimation problem in a 3D setup, which is nontrivial because SO(3) is not Abelian. A second line of research consists in deriving probabilistic guarantees on the pose estimate (the results of this paper only guarantee the quality of the orientation estimate).

APPENDIX

Lemma 21 (Orthogonal complements [58]). For a connected graph $G$, the transpose of the cycle basis matrix $C^T$ is an orthogonal complement of the transpose of the reduced incidence matrix $A^T$, in the sense that: 1) $(A^T \;\; C^T)$ is a square matrix of full rank; and 2) $CA^T = 0_{\ell \times n}$.

Lemma 22. Given a cycle basis matrix $C$ and a reduced incidence matrix $A$ of a connected graph $G$, for any symmetric positive definite matrix $P$, it holds that

$$P^{-1}A^T(AP^{-1}A^T)^{-1}AP^{-1} + C^T(CPC^T)^{-1}C = P^{-1}.$$

Proof: Because $P$ is symmetric and positive definite, there exist two symmetric and positive definite matrices $N$ and $M$ such that $M^2 = P$, $N^2 = P^{-1}$, and $N = M^{-1}$.

Following Meyer [59, equation (5.13.3)], the orthogonal projector of $NA^T$ is $NA^T(ANNA^T)^{-1}AN$ and the orthogonal projector of $MC^T$ is $MC^T(CMMC^T)^{-1}CM$. Because $N$ and $M$ are full rank and $C^T$ is an orthogonal complement of $A^T$ (Lemma 21), $NA^T$ is also an orthogonal complement of $MC^T$. Meyer [59, equation (5.13.6)] describes a relation between the projectors of two matrices that are orthogonal complements. For our matrices, the relation is

$$NA^T(ANNA^T)^{-1}AN + MC^T(CMMC^T)^{-1}CM = I_m,$$

where $I_m$ is the identity of size $m$. Pre- and post-multiplying by $N$ and recalling that $M^2 = P$ and $N^2 = P^{-1}$ we get

$$P^{-1}A^T(AP^{-1}A^T)^{-1}AP^{-1} + NMC^T(CPC^T)^{-1}CMN = P^{-1}.$$

The result follows by substituting $N = M^{-1}$.

Lemma 23. Let $x \in \mathbb{R}^n$ be Normally distributed with mean $\mu$ and covariance matrix $P$. Given the confidence intervals

$$I_i = \left[\mu_i - \sqrt{P_{ii}\chi^2_{1,\eta}},\ \mu_i + \sqrt{P_{ii}\chi^2_{1,\eta}}\right], \quad i \in \{1,\dots,n\},$$

then $P(x_1 \in I_1 \wedge \dots \wedge x_n \in I_n) \geq \eta^n$.

Proof: The lemma can be seen as a direct consequence of Theorem 1 in [60] (see also [61]).
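The bound of Lemma 23 can be probed empirically with a small Monte Carlo experiment. The sketch below is an illustration only, not part of the proof: it considers a zero-mean bivariate Normal with unit variances, and $\chi^2_{1,\eta}$ is assumed to be the value that a $\chi^2_1$ variable does not exceed with probability $\eta$, so that each interval has marginal coverage $\eta$. The function name and parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def joint_coverage(rho, eta, trials=20000, seed=0):
    """Empirical P(x1 in I1 and x2 in I2) for correlation coefficient rho."""
    rng = random.Random(seed)
    # Half-width sqrt(P_ii * chi2_{1,eta}) with P_ii = 1.
    b = NormalDist().inv_cdf((1.0 + eta) / 2.0)
    hits = 0
    for _ in range(trials):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1 - rho * rho) * z2   # corr(x1, x2) = rho
        hits += (abs(x1) <= b) and (abs(x2) <= b)
    return hits / trials


eta = 0.9
for rho in (0.0, 0.5, 0.9):
    print(rho, joint_coverage(rho, eta), ">=", eta ** 2)
```

For independent components the joint coverage is close to $\eta^2$, and it increases with the correlation, in agreement with the inequality.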

REFERENCES

[1] F. Lu and E. Milios, “Globally consistent range scan alignment for environment mapping,” Autonomous Robots, vol. 4, pp. 333–349, 1997.

[2] S. Thrun and M. Montemerlo, “The GraphSLAM algorithm with applications to large-scale mapping of urban structures,” Int. J. Robot. Res., vol. 25, pp. 403–429, 2006.

[3] P. Barooah and J. Hespanha, “Estimation on graphs from relative measurements,” Control Systems Magazine, vol. 27, no. 4, pp. 57–74, 2007.

[4] H. Wang, G. Hu, S. Huang, and G. Dissanayake, “On the Structure of Nonlinearities in Pose Graph SLAM,” in Robotics: Science and Systems, 2012.

[5] R. Kuemmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, “g2o: A general framework for graph optimization,” in Int. Conf. on Robotics and Automation, pp. 3607–3613, 2011.

[6] G. Grisetti, C. Stachniss, and W. Burgard, “Non-linear constraint network optimization for efficient map learning,” IEEE Trans. on Intelligent Transportation Systems, vol. 10, no. 3, pp. 428–439, 2009.

[7] J. Gutmann and K. Konolige, “Incremental mapping of large cyclic environments,” in Proc. of the IEEE Int. Symp. CIRA, pp. 318–325, 1999.

[8] T. Duckett, S. Marsland, and J. Shapiro, “Fast, on-line learning of globally consistent maps,” Autonomous Robots, vol. 12, no. 3, pp. 287–300, 2002.

[9] K. Konolige, “Large-scale map-making,” in Proc. of the AAAI National Conf. on Artificial Intelligence, pp. 457–463, 2004.

[10] U. Frese, P. Larsson, and T. Duckett, “A multilevel relaxation algorithm for simultaneous localization and mapping,” IEEE Trans. on Robotics, vol. 21, no. 2, pp. 196–207, 2005.

[11] E. Olson, J. Leonard, and S. Teller, “Fast iterative alignment of pose graphs with poor estimates,” in Int. Conf. on Robotics and Automation, pp. 2262–2269, 2006.

[12] M. Kaess, A. Ranganathan, and F. Dellaert, “iSAM: incremental smoothing and mapping,” IEEE Trans. on Robotics, vol. 24, no. 6, pp. 1365–1378, 2008.

[13] M. Kaess, H. Johannsson, R. Roberts, V. Ila, J. Leonard, and F. Dellaert, “iSAM2: Incremental smoothing and mapping with fluid relinearization and incremental variable reordering,” in Int. Conf. on Robotics and Automation, pp. 3281–3288, 2011.

[14] M. Kaess, H. Johannsson, R. Roberts, V. Ila, J. Leonard, and F. Dellaert, “iSAM2: incremental smoothing and mapping using the bayes tree,” Int. J. Robot. Res., vol. 31, pp. 217–236, 2012.

[15] G. Grisetti, R. Kuemmerle, C. Stachniss, U. Frese, and C. Hertzberg, “Hierarchical optimization on manifolds for online 2D and 3D mapping,” in Int. Conf. on Robotics and Automation, pp. 273–278, 2010.


[16] D. L. Rizzini, “A closed-form constraint networks solver for maximumlikelihood mapping,” in European Conf. on Mobile Robots, pp. 223–228,2010.

[17] G. Dubbelman, I. Esteban, and K. Schutte, “Efficient trajectory bending with applications to loop closure,” in Int. Conf. on Intelligent Robots and Systems, pp. 1–7, 2010.

[18] G. Dubbelman, P. Hansen, B. Browning, and M. Dias, “Orientation only loop-closing with closed-form trajectory bending,” in Int. Conf. on Robotics and Automation, pp. 815–821, 2012.

[19] E. Olson and P. Agarwal, “Inference on networks of mixtures for robust robot mapping,” in Robotics: Science and Systems, 2012.

[20] N. Sünderhauf and P. Protzel, “Switchable constraints for robust pose graph SLAM,” in Int. Conf. on Intelligent Robots and Systems, pp. 1879–1884, 2012.

[21] N. Sünderhauf and P. Protzel, “Towards a robust back-end for pose graph SLAM,” in Int. Conf. on Robotics and Automation, pp. 1254–1261, 2012.

[22] D. Rosen, M. Kaess, and J. Leonard, “An incremental trust-region method for robust online sparse least-squares estimation,” in Int. Conf. on Robotics and Automation, pp. 1262–1269, 2012.

[23] F. Dellaert and A. Stroupe, “Linear 2D localization and mapping for single and multiple robots,” in Int. Conf. on Robotics and Automation, pp. 688–694, 2002.

[24] M. Bosse and R. Zlot, “Keypoint design and evaluation for place recognition in 2D lidar maps,” Robotics and Autonomous System J., vol. 57, no. 12, pp. 1211–1224, 2009.

[25] L. Carlone, R. Aragues, J. Castellanos, and B. Bona, “A linear approximation for graph-based simultaneous localization and mapping,” in Robotics: Science and Systems, 2011.

[26] J. Knuth and P. Barooah, “Error growth in position estimation from noisy relative pose measurements,” Robotics and Autonomous System J., vol. 61, no. 3, pp. 229–244, 2013.

[27] S. Huang, Y. Lai, U. Frese, and G. Dissanayake, “How far is SLAM from a linear least squares problem?,” in Int. Conf. on Intelligent Robots and Systems, pp. 3011–3016, 2010.

[28] J. Thunberg, E. Montijano, and X. Hu, “Distributed attitude synchronization control,” in IEEE Conf. on Decision and Control, pp. 1962–1967, 2011.

[29] R. Tron and R. Vidal, “Distributed image-based 3-D localization of camera sensor networks,” in IEEE Conf. on Decision and Control, pp. 901–908, 2009.

[30] P. Barooah, “Estimation and control with relative measurements: Algorithms and scaling laws,” Ph.D. Thesis, Center for Control, Dynamical-systems, and Computation, University of California Santa Barbara, 2007.

[31] J. Knuth and P. Barooah, “Distributed collaborative 3D pose estimation of robots from heterogeneous relative measurements: an optimization on manifold approach,” submitted to Robotica, preprint available online: http://plaza.ufl.edu/knuth/Publications/, 2013.

[32] R. Stanfield, “Statistical theory of DF finding,” Journal of IEE, vol. 94, no. 5, pp. 762–770, 1947.

[33] W. Foy, “Position-location solutions by Taylor-series estimation,” IEEE Trans. on Aerospace and Electronic Systems, vol. AES-12, no. 2, pp. 187–194, 1976.

[34] G. Mao, B. Fidan, and B. Anderson, “Wireless sensor network localization techniques,” Computer Networks, vol. 51, no. 10, pp. 2529–2553, 2007.

[35] G. Piovan, I. Shames, B. Fidan, F. Bullo, and B. Anderson, “On frame and orientation localization for relative sensing networks,” Automatica, vol. 49, no. 1, pp. 206–213, 2011.

[36] P. Biswas, T. Liang, K. Toh, T. Wang, and Y. Ye, “Semidefinite programming approaches for sensor network localization with noisy distance measurements,” IEEE Transactions on Automation Science and Engineering, vol. 3, no. 4, pp. 360–371, 2006.

[37] G. Calafiore, L. Carlone, and M. Wei, “A distributed technique for localization of agent formations from relative range measurements,” IEEE Trans. on Systems, Man and Cybernetics, Part A, vol. 42, no. 5, pp. 1083–4427, 2012.

[38] T. Eren, D. Goldenberg, W. Whiteley, Y. Yang, A. Morse, B. Anderson, and P. Belhumeur, “Rigidity, computation, and randomization in network localization,” in IEEE INFOCOM, vol. 4, pp. 2673–2684, 2004.

[39] I. Shames, B. Fidan, and B. Anderson, “Minimization of the effect of noisy measurements on localization of multi-agent autonomous formations,” Automatica, vol. 45, no. 4, pp. 1058–1065, 2009.

[40] L. Carlone, R. Aragues, J. Castellanos, and B. Bona, “A first-order solution to simultaneous localization and mapping with graphical models,” in Int. Conf. on Robotics and Automation, pp. 1764–1771, 2011.

[41] A. Hassibi and S. Boyd, “Integer parameter estimation in linear models with applications to GPS,” IEEE Trans. on Signal Processing, vol. 46, no. 11, pp. 2938–2952, 1998.

[42] L. Carlone and A. Censi, “From angular manifolds to the integer lattice: Guaranteed orientation estimation with application to pose graph optimization,” ArXiv preprint: http://arxiv.org/abs/1211.3063, 2012.

[43] L. Carlone and A. Censi, “From angular manifolds to the integer lattice: Guaranteed orientation estimation with application to pose graph optimization,” supplementary material, http://www.lucacarlone.com/index.php/resources/research/mole2d, 2013.

[44] W. Chen, Graph Theory and Its Engineering Applications. Advanced Series in Electrical and Computer Engineering, 1997.

[45] T. Kavitha, C. Liebchen, K. Mehlhorn, D. Michail, R. Rizzi, T. Ueckerdt, and K. Zweig, “Cycle bases in graphs: Characterization, algorithms, complexity, and applications,” Computer Science Rev., vol. 3, no. 4, pp. 199–243, 2009.

[46] G. S. Chirikjian and A. B. Kyatkin, Engineering applications of noncommutative harmonic analysis. CRC Press, 2001.

[47] G. Chirikjian, Stochastic Models, Information Theory, and Lie Groups, Volume 1: Classical Results and Geometric Methods. Applied and Numerical Harmonic Analysis, Birkhäuser, 2009.

[48] K. V. Mardia and P. E. Jupp, Directional Statistics. John Wiley & Sons, 1999.

[49] G. Nemhauser and L. Wolsey, Integer and combinatorial optimization. Wiley New York, 1988.

[50] C. Liebchen, “Finding short integral cycle bases for cyclic timetabling,” Lecture Notes in Computer Science, vol. 2832, pp. 715–726, 2003.

[51] A. Schrijver, Combinatorial Optimization: Polyhedra and Efficiency. Springer, 2003.

[52] H. Lenstra, “Integer programming with a fixed number of variables,” Math. Oper. Res., vol. 8, pp. 538–548, 1983.

[53] S. Jazaeri, A. Amiri-Simkooei, and M. Sharifi, “Fast integer least-squares estimation for GNSS high-dimensional ambiguity resolution using lattice theory,” Journal of Geodesy, vol. 86, no. 2, pp. 123–136, 2012.

[54] X. Chang and G. Golub, “Solving ellipsoid-constrained integer least squares problems,” SIAM J. Matrix Analysis Applications, pp. 1071–1089, 2009.

[55] T. Kavitha, K. Mehlhorn, and D. Michail, “New approximation algorithms for minimum cycle bases of graphs,” Lecture Notes in Computer Science, vol. 4393, pp. 512–523, 2007.

[56] K. Mehlhorn and D. Michail, “Minimum cycle bases: Faster and simpler,” ACM Transactions on Algorithms, vol. 6, no. 1, pp. 1–13, 2009.

[57] D. Michail, “Minimum cycle basis library.” http://www.mpi-inf.mpg.de/~michail/subpages/mcb/doxygen/.

[58] W. Russell, D. Klein, and J. Hespanha, “Optimal estimation on the graph cycle space,” in American Control Conf., pp. 1918–1924, 2010.

[59] C. Meyer, Matrix Analysis and Applied Linear Algebra. SIAM, 2000.

[60] Z. Sidak, “Rectangular confidence regions for the means of multivariate Normal distributions,” Journal of the American Statistical Association, vol. 62, no. 318, pp. 626–633, 1967.

[61] O. Dunn, “Estimation of the means of dependent variables,” Annals of Mathematical Statistics, vol. 29, no. 4, pp. 1095–1111, 1958.

Luca Carlone is a postdoctoral research fellow at the College of Computing, Georgia Institute of Technology. He held a postdoctoral position at the Politecnico di Torino (2012), and was visiting researcher at the University of California Santa Barbara (2011) and at the University of Zaragoza (2010). He received his M.Sc. degrees (summa cum laude) in Mechatronic Engineering from Politecnico di Torino, and in Automation Engineering from Politecnico di Milano in 2008, and a Ph.D. from Politecnico di Torino in 2012. His research interests include nonlinear and distributed estimation applied to mobile robotics.

Andrea Censi is a Research Scientist and Principal Investigator at the Massachusetts Institute of Technology. Previously, he was a Visiting Scholar at the AI Lab at the University of Zurich. He received the B.Sc. and M.Sc. degrees (summa cum laude) in control engineering and robotics from Sapienza University of Rome, Italy, in 2005 and 2007, respectively, and a Ph.D. in Control & Dynamical Systems from the California Institute of Technology in 2012. He is broadly interested in perception and decision making problems for natural and artificial embodied agents.