
Math 689-001: Computational Topology, Spring 2015

Instructor: Thomas Wanner

Time & Place: Thursdays, 7:20-10pm, EH 4106

Textbook: H. Edelsbrunner, J. Harer, Computational Topology, American Mathematical Society, Providence, RI, 2010.

Shameless Advertisement: math.gmu.edu/~wanner/m689.pdf

Algebraic Topology and Homology

Algebraic topology provides quantitative information on complex and often irregular objects:

• The information is invariant under transformations which do not require cutting or gluing of the object.

• Homology groups measure the complexity of the object in any dimension.

• Betti numbers, torsion coefficients, and the Euler characteristic are coarser measures of this information.

Betti Numbers

The Betti number β0 counts the number of connected components of the structure.

How many distinct connected regions does the maze have?

Even for a simple maze such as the one above, determining the number of connected components is not always obvious.
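As a rough illustration of what counting β0 involves, the following Python sketch counts the connected regions of a small binary image by flood fill. The grid `maze` is a made-up example, not the maze from the slide, and the choice of 4-connectivity (cells are neighbors only across a shared edge) is an assumption.

```python
def count_components(grid):
    """Count the 4-connected regions of cells equal to 1 (this is beta_0)."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = set()
    components = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            # Found a cell of a region not visited yet: flood-fill the region.
            components += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return components


# Hypothetical 4x4 image; 1 marks an occupied cell.
maze = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
print(count_components(maze))  # 4
```

Each cell is visited once, so the cost is linear in the image size. The higher Betti numbers do not admit such a simple traversal.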

Betti Numbers

The Betti number β1 counts the number of independent tunnels created by the structure. This equals the number of loops

• which cannot be shrunk to a point and

• which cannot be morphed into each other.

(a) β1 = 0    (b) β1 = 2    (c) β1 = 6

Similarly, the Betti number β2 counts the number of closed regions created by the structure.

Three-Dimensional Cahn-Hilliard Example

Even for relatively small three-dimensional microstructures the Betti numbers have to be determined computationally:

This isosurface has Betti numbers β0 = 1, β1 = 1701, β2 = 0.

Fast computation?

The Euler Characteristic

Another topological invariant which can be associated with objects is the Euler characteristic χ. If the object is given as a union of unit squares on an integer lattice (black and white digital image) one has

χ = N0 − N1 + N2

where N0 is the number of occupied lattice points, N1 the number of occupied edges, and N2 the number of occupied squares.

N0 = 24, N1 = 32, N2 = 10

χ = 2
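The counting formula χ = N0 − N1 + N2 translates directly into code. The sketch below assumes that the image is given as a collection of closed unit squares, each named by the integer coordinates of its lower-left corner. The function name and the sample shapes are illustrative and are not the figure from the slide.

```python
def euler_characteristic(squares):
    """Return (chi, (N0, N1, N2)) for a union of closed unit squares.

    `squares` holds the (i, j) lower-left corners of the occupied squares.
    """
    squares = set(squares)
    vertices, edges = set(), set()
    for i, j in squares:
        vertices.update([(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)])
        # Each edge is stored as (lower/left endpoint, upper/right endpoint),
        # so an edge shared by two squares is counted only once.
        edges.update([
            ((i, j), (i + 1, j)),          # bottom
            ((i, j + 1), (i + 1, j + 1)),  # top
            ((i, j), (i, j + 1)),          # left
            ((i + 1, j), (i + 1, j + 1)),  # right
        ])
    n0, n1, n2 = len(vertices), len(edges), len(squares)
    return n0 - n1 + n2, (n0, n1, n2)


# A 3x3 block of squares with the center removed (an annulus-like ring).
ring = [(i, j) for i in range(3) for j in range(3) if (i, j) != (1, 1)]
print(euler_characteristic(ring))      # (0, (16, 24, 8))
print(euler_characteristic([(0, 0)]))  # (1, (4, 4, 1))
```

The ring has one component and one hole, which is consistent with χ = 0.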

Additivity Property of the Euler Characteristic

Determining the Euler characteristic is computationally easy, since it is an additive functional: If X and Y are two sets, then the Euler characteristic of their union X ∪ Y satisfies

χ(X ∪ Y) = χ(X) + χ(Y) − χ(X ∩ Y)

χ(X) = 1, χ(Y) = 1, χ(X ∩ Y) = 2

χ(X) = 0, χ(Y) = 1, χ(X ∩ Y) = 0

χ(X) = 1, χ(Y) = 1, χ(X ∩ Y) = 0
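The additivity formula can be checked on a small example. The sketch below is an illustration under stated assumptions: each set is stored as the collection of all lattice cells (vertices, edges, squares) it contains, so that unions and intersections become plain set operations. The helper names `closure` and `chi` and the two L-shaped pieces X and Y are hypothetical. They were chosen so that χ(X) = 1, χ(Y) = 1, and χ(X ∩ Y) = 2, as in one of the cases listed above.

```python
def closure(squares):
    """All cells (dim, key) contained in the given closed unit squares."""
    cells = set()
    for i, j in squares:
        cells.add((2, (i, j)))
        # ("h", a, b): horizontal edge from (a, b) to (a + 1, b)
        # ("v", a, b): vertical edge from (a, b) to (a, b + 1)
        for e in (("h", i, j), ("h", i, j + 1), ("v", i, j), ("v", i + 1, j)):
            cells.add((1, e))
        for v in ((i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)):
            cells.add((0, v))
    return cells


def chi(cells):
    """Euler characteristic: alternating count of cells by dimension."""
    return sum((-1) ** dim for dim, _ in cells)


# Two L-shaped pieces that together form a 3x3 ring with the center missing.
X = closure([(0, 0), (0, 1), (0, 2), (1, 0)])
Y = closure([(1, 2), (2, 0), (2, 1), (2, 2)])

print(chi(X), chi(Y), chi(X & Y), chi(X | Y))  # 1 1 2 0
assert chi(X | Y) == chi(X) + chi(Y) - chi(X & Y)
```

Here X and Y meet in two disjoint edges, so χ(X ∩ Y) = 2, and the formula gives χ(X ∪ Y) = 1 + 1 − 2 = 0 for the ring.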

Betti Numbers or the Euler Characteristic?

Which topological invariant is more appropriate for applications?

• The easy computability of the Euler characteristic is one of the main reasons for its frequent use in applications.

• In contrast, the computation of the Betti numbers is more involved, but it does provide valuable additional information, as they measure global connectivity in different dimensions. They cannot be computed by local information alone.

• From the Betti numbers, the Euler characteristic can easily be recovered via

χ = β0 − β1 + β2 − β3 + . . .
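As a worked instance of this formula, the Betti numbers quoted earlier for the three-dimensional Cahn-Hilliard isosurface (β0 = 1, β1 = 1701, β2 = 0) give the following value. The calculation assumes that all higher Betti numbers of the surface vanish.

```latex
\chi = \beta_0 - \beta_1 + \beta_2 = 1 - 1701 + 0 = -1700
```

The converse does not work: the single number χ = −1700 does not determine the three Betti numbers.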

Efficient methods for the computation of Betti numbers have been developed only in the last decade!

Point Cloud Data and Distance Complexes

For experimental or numerical data given as a collection of points in Euclidean space, one can associate a number of discrete graphs or complexes which measure distance information. These are based on the intersections of ε-disks centered at the data points.
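One such construction is the Rips complex that appears in the excerpt that follows. The minimal sketch below builds its edges and triangles for a planar point set. It assumes the convention that a set of points spans a simplex when all pairwise distances are at most ε; other sources scale the parameter differently. The function name `rips_complex` is illustrative.

```python
from itertools import combinations
from math import dist


def rips_complex(points, eps):
    """Return the edges and triangles of the Rips complex at scale eps."""
    n = len(points)
    # Pairs (i, j) with i < j whose points lie within distance eps.
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= eps
    }
    edges = sorted(close)
    # A triangle is included as soon as all three of its edges are present.
    triangles = [
        t for t in combinations(range(n), 3)
        if all(pair in close for pair in combinations(t, 2))
    ]
    return edges, triangles


corners = [(0, 0), (1, 0), (1, 1), (0, 1)]  # corners of a unit square
for eps in (0.5, 1.0, 1.5):
    edges, triangles = rips_complex(corners, eps)
    print(eps, len(edges), len(triangles))
```

For the four corners of the unit square the complex grows from four isolated points (ε = 0.5) to a closed loop of four edges (ε = 1.0), and the loop is filled in once the diagonals appear (ε = 1.5).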


Figure 2. A fixed set of points [upper left] can be completed to a Čech complex $C_\varepsilon$ [lower left] or to a Rips complex $R_\varepsilon$ [lower right] based on a proximity parameter ε [upper right]. This Čech complex has the homotopy type of the ε/2 cover ($S^1 \vee S^1 \vee S^1$), while the Rips complex has a wholly different homotopy type ($S^1 \vee S^2$).

stored as a graph and reconstituted instead of storing the entire boundary operator needed for a Čech complex. This virtue, that coarse proximity data on pairs of nodes determines the Rips complex, is not without cost. The penalty for this simplicity is that it is not immediately clear what is encoded in the homotopy type of R. In general, it is neither a subcomplex of $E^n$ nor does it necessarily behave like an n-dimensional space at all (Figure 2).

1.4. Which ε? Converting a point cloud data set into a global complex (whether Rips, Čech, or other) requires a choice of parameter ε. For ε sufficiently small, the complex is a discrete set; for ε sufficiently large, the complex is a single high-dimensional simplex. Is there an optimal choice for ε which best captures the topology of the data set? Consider the point cloud data set and a sequence of Rips complexes as illustrated in Figure 3. This point cloud is a sampling of points on a planar annulus. Can this be deduced? From the figure, it certainly appears as though an ideal choice of ε, if it exists, is rare: by the time ε is increased so as to remove small holes from within the annulus, the large hole distinguishing the annulus from the disk is filled in.

2. Algebraic topology for data

Algebraic topology offers a mature set of tools for counting and collating holes and other topological features in spaces and maps between them. In the context of high-dimensional data, algebraic topology works like a telescope, revealing objects and features not visible to the naked eye. In what follows, we concentrate on homology for its balance between ease of computation and topological resolution. We

From Barcodes: The Persistent Topology of Data by Robert Ghrist (2008).

Rips Complexes for Various Distances

For different values of the radius one obtains different Rips complexes. Which complex captures the correct topological information of the underlying data set, in this case an annulus?


Figure 3. A sequence of Rips complexes for a point cloud data set representing an annulus. Upon increasing ε, holes appear and disappear. Which holes are real and which are noise?

assume a rudimentary knowledge of homology, as is to be found in, say, Chapter 2 of [15].

Despite being both computable and insightful, the homology of a complex associated to a point cloud at a particular ε is insufficient: it is a mistake to ask which value of ε is optimal. Nor does it suffice to know a simple ‘count’ of the number and types of holes appearing at each parameter value ε. Betti numbers are not enough. One requires a means of declaring which holes are essential and which can be safely ignored. The standard topological constructs of homology and homotopy offer no such slack in their strident rigidity: a hole is a hole no matter how fragile or fine.

2.1. Persistence. Persistence, as introduced by Edelsbrunner, Letscher, and Zomorodian [12] and refined by Carlsson and Zomorodian [22], is a rigorous response to this problem. Given a parameterized family of spaces, those topological features which persist over a significant parameter range are to be considered as signal with short-lived features as noise. For a concrete example, assume that $R = (R_i)_1^N$ is a sequence of Rips complexes associated to a fixed point cloud for an increasing sequence of parameter values $(\varepsilon_i)_1^N$. There are natural inclusion maps

$$R_1 \overset{\iota}{\hookrightarrow} R_2 \overset{\iota}{\hookrightarrow} \cdots \overset{\iota}{\hookrightarrow} R_{N-1} \overset{\iota}{\hookrightarrow} R_N \tag{2.1}$$

Instead of examining the homology of the individual terms $R_i$, one examines the homology of the iterated inclusions $\iota : H_* R_i \to H_* R_j$ for all $i < j$. These maps reveal which features persist.

As a simple example, persistence explains why Rips complexes are an acceptable approximation to Čech complexes. Although no single Rips complex is an especially faithful approximation to a single Čech complex, pairs of Rips complexes ‘squeeze’ the appropriate Čech complex into a manageable hole.

Persistent Homology

Persistent homology encodes the structure of the Rips complexes as the radius increases. It keeps track of components, loops, etc., and how they merge as ε increases. In this way, features which are visible at a variety of scales can be isolated.
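For connected components, this bookkeeping can be sketched in a few lines. The Python example below computes the dimension-zero intervals only: every point is born at ε = 0, and a component dies at the length of the edge that merges it into another one. It assumes the convention that two points are joined once their distance is at most ε. The name `h0_barcode` and the sample points are made up for illustration. Loops and higher-dimensional features need the full boundary-matrix algorithm, which this sketch does not attempt.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Birth-death intervals of connected components as eps grows."""
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Process all candidate edges in order of increasing length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj            # two components merge ...
            bars.append((0.0, length))  # ... so one of them dies here
    # Whatever has not died persists forever.
    bars.extend((0.0, inf) for _ in range(n - len(bars)))
    return bars


# Two well-separated pairs of points on a line.
cloud = [(0, 0), (1, 0), (10, 0), (11, 0)]
print(h0_barcode(cloud))
# [(0.0, 1.0), (0.0, 1.0), (0.0, 9.0), (0.0, inf)]
```

The two short intervals record the merging inside each pair. The interval of length 9 is the feature that is visible over a wide range of scales: the data consists of two clusters.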


which come into existence at parameter $t_i$ and which persist for all future parameter values. The torsional elements correspond to those homology generators which appear at parameter $r_j$ and disappear at parameter $r_j + s_j$. At the chain level, the Structure Theorem provides a birth-death pairing of generators of C (excepting those that persist to infinity).

2.3. Barcodes. The parameter intervals arising from the basis for $H_*(C; F)$ in Equation (2.3) inspire a visual snapshot of $H_k(C; F)$ in the form of a barcode. A barcode is a graphical representation of $H_k(C; F)$ as a collection of horizontal line segments in a plane whose horizontal axis corresponds to the parameter and whose vertical axis represents an (arbitrary) ordering of homology generators. Figure 4 gives an example of barcode representations of the homology of the sampling of points in an annulus from Figure 3 (illustrated in the case of a large number of parameter values $\varepsilon_i$).

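[Figure: barcodes for H0, H1, and H2, each drawn against the parameter ε]

Figure 4. [bottom] An example of the barcodes for $H_*(R)$ in the example of Figure 3. [top] The rank of $H_k(R_{\varepsilon_i})$ equals the number of intervals in the barcode for $H_k(R)$ intersecting the (dashed) line $\varepsilon = \varepsilon_i$.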

Theorem 2.3 yields the fundamental characterization of barcodes.

Theorem 2.4 ([22]). The rank of the persistent homology group $H_k^{i \to j}(C; F)$ is equal to the number of intervals in the barcode of $H_k(C; F)$ spanning the parameter interval $[i, j]$. In particular, $H_*(C_*^i; F)$ is equal to the number of intervals which contain $i$.
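Read as a recipe, the theorem says that ranks can be obtained from a barcode by counting intervals. The short sketch below does exactly that. It assumes that a barcode is given as a list of (birth, death) pairs and that each interval is half-open, [birth, death), so a feature no longer counts at its death parameter. The function name and the sample barcode are hypothetical.

```python
from math import inf


def persistent_rank(barcode, i, j):
    """Number of intervals [birth, death) that span the range [i, j]."""
    return sum(1 for birth, death in barcode if birth <= i and j < death)


# Hypothetical barcode in one fixed dimension k.
bars = [(0.0, 1.0), (0.0, 9.0), (0.0, inf), (2.0, 3.0)]

print(persistent_rank(bars, 0.5, 0.5))  # 3: the Betti number at parameter 0.5
print(persistent_rank(bars, 0.5, 2.5))  # 2: features alive from 0.5 through 2.5
```

Taking i = j recovers the ordinary Betti number at that parameter value, which matches the second sentence of the theorem.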

A barcode is best thought of as the persistence analogue of a Betti number. Recall that the kth Betti number of a complex, $\beta_k := \operatorname{rank}(H_k)$, acts as a coarse numerical measure of $H_k$. As with $\beta_k$, the barcode for $H_k$ does not give any information about the finer structure of the homology, but merely a continuously

Course Content

The course develops the theoretical background for these concepts. It will be self-contained; only basic knowledge of groups is required.

• Geometric Topology: Graphs, surfaces, triangulations, complexes

• Algebraic Topology: Homology, cohomology, duality, Morse theory

• Persistent Topology: Persistent homology, spectral sequences, stability

If interested, students can use a variety of existing, free software packages to test the developed concepts, but no programming will be part of the course. The course grade is based on homework, participation, and final student projects or presentations. Maybe we can even find some use for the department’s 3D printers!
