computational topology: a new way of viewing data › 2017 › 11 › ... · 2017-11-29 · 2. saba...

Computational Topology: A New Way Of Viewing DataTHANOS GENTIMIS LSU 3/3/2015

What do you think?

Mobius Band …a way to confuse ants

Sometimes I can’t tell a coffee mug from a donut

The hairy ball theorem …no use trying to get the perfect haircut …

… unless your head is shaped like a donut.

Algebraic TopologyIs fun!

The betti numbers!Betti numbers are the invariants of interest.

Betti0 counts the number of connected components

Betti1 counts the number of non-trivial loops

Betti2 counts the number of non-trivial voids

Graphs Courtesy of Dmitriy Morozov

Betti0=3Betti1=0Betti2=0



Data has shape! And shape matters!Raw Data Point Cloud

The “shape” of this point-cloud can help us understand our data!Can we “quantify” that shape? Can we compute it?Can we use this shape to:

Find anomalies?Maybe Classify data?

Topological Data Analysis pipeline:We assume that the data lies in a topological space.

We take measurements, sampling that space.

We try to reconstruct it by using an approximation.

We compute the invariants to understand it.


Graphs Courtesy of Vin De Silva

Triangulations - Mesh

Given a sampling of a space we can try to partially reconstruct it by “triangulation”, using Cech Complexes, Rips Complexes or Alpha Shapes.

Simplicial ComplexA simplicial complex is the union of simple pieces (simplices) i.e. vertices, edges, triangles etc.

Two simplices must intersect at a simplex or not at all.

Cech ComplexStart with a point cloud X in some Euclidean space.

Fix a parameter r>0.

Consider balls of radius r around each point covering X.

The Cech Complex is the nerve of this cover.

The Nerve theorem guarantees similar

topologies!

Proximity measures

Given a set of nodes with an attribute vector.

A metric on the space on nodes can be defined as the weighted sum of distances of their attributes:

One can then create a simplicial complex representing those nodes and use TDA!

Applications!

-10

1

-0.500.51

-0.5

0

0.5

1

w(t)w(t+1)

w(t

+ 2

)

Signal Processing

0 0.05 0.1 0.15 0.2-1

-0.5

0

0.5

1

time(s)

w(t

)

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14

Be

tti 0

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14

Be

tti 1

Signal in time domain 3D delay embeddingBarcodes Betti numbers

Choosing delays and embedding dimension Algebraic topology

1

2

( )

( )

( )

w t

w t

w t

Wheeze detection

Non-wheeze signals

Wheeze signals

0

1

1

0

0

1

1

1

Coverage Hole detection

Coverage Hole detection.


Graphs Courtesy of Harish Chintakunta

The coverage area of the sensors is

approximated by a simplicial complex.

“Holes in coverage” become non-zero

Betti1

Analysis of a socio-plexClusters based on

feature “closeness”

What is the meaning of loops? Voids?

Simplifying Social Networks

Movie Courtesy of Adam Wilkerson

The socioplex is reduced by a “strong homotopy collapse

The core contains the PI’sThe periphery is mostly graduate students!

Gait Recognition

The video sequence of a human captured from the side is analyzed.

The silhouettes are joinedand a simplicial structure is created

The persistent betti numbers can beused to distinguish between walkingand running.

Graphs Courtesy of Rocio Gonzalez-Diaz

Gene Periodicity

Graphs Courtesy of Jose Perea

The gene expression is measured as a time series

Snippets of that series are compared to discover similaritiesand repetitions.

Their topological invariantsare computed and they are givena periodicity score!

ConclusionsIn TDA you don’t go in with your data and a question. You go in with just your data and look for the questions!

We really don’t know how many applications it has … we find new uses in different unconnected areas.

It is a combination of Mathematics, Computer Science, Statistics, Engineering, but ultimately it needs experts on the specific field to analyze the metadata.

Useful links

Jose Perea (Gene Expression Periodicity) https://fds.duke.edu/db/aas/math/joperea

Harish Chintakinta (Coverage Hole Detection) http://www4.ncsu.edu/~hkchinta/index.html

Dmitiry Morozov (Cyclic Coordinates, Dionysus) http://www.mrzv.org/

Vin De Silva (Theory and applications) http://pages.pomona.edu/~vds04747/public/index.html

Vidit Nanda (Theory and applications, Perseus) http://www.sas.upenn.edu/~vnanda/

Gunnar Carlesson (Theory and applications, Ayasdi,Javaplex) http://math.stanford.edu/~gunnar/

Rocio Gonzalez(Gait recognition) http://ma1.eii.us.es/miembros/rogodi/XSLT.asp?xml=rogodi.xml&xsl=profesores.xsl

https://fds.duke.edu/db/aas/math/joperea

http://www4.ncsu.edu/~hkchinta/index.html

http://www.mrzv.org/

http://pages.pomona.edu/~vds04747/public/index.html

http://www.sas.upenn.edu/~vnanda/

http://math.stanford.edu/~gunnar/

http://ma1.eii.us.es/miembros/rogodi/XSLT.asp?xml=rogodi.xml&xsl=profesores.xsl

Bibliography 1. Harish Chintakunta, Thanos Gentimis, Rocio Gonzalez-Diaz, Maria-Jose Jimenez, Hamid Krim,

Information Theoretic Persistent Barcodes, Pattern Recognition Journal, Elsevier, 2014

2. Saba Emrani, Thanos Gentimis and Hamid Krim, Persistent Homology Of Delay Embeddings, IEEE Signal Processing Letters,Vol.21(April 2014), No:14

3. Thanos Gentimis, Harish Chintakunta, Hamid Krim, Introduction to Computational Topology, Signal Processing Magazine, to appear

4. Graig Bell, Thanos Gentimis, Directed Filtrations, preprint submitted to Computational Geometry

5. Maria Bampasidou, Thanos Gentimis, Modeling Collaborations with Persistent Homology,preprint available on arxiv

Thanks!

History Of Computational TopologyComputational work of Frosini, Ferri, and collaborators in Bologna, Italy.

Doctoral thesis of Robins at Boulder, Colorado.

Bio-geometry project of Edelsbrunner, Harer at Duke, North Carolina.

General Framework provided by Carlesson, Zomorodian in Stanford, California.

Popularization by Ghrist, De Silva Upenn, Pennsylvania.

computational topology: a new way of viewing data › 2017 › 11 › ... · 2017-11-29 · 2. saba...

Documents