Introduction to topological data analysis
Ippei Obayashi
Advanced Institute for Materials Research, Tohoku University
Jan. 12, 2018
I. Obayashi (AIMR (Tohoku U.)) Introduction to TDA Jan. 12, 2018 1 / 32
Persistent homology
Topological Data Analysis (TDA)
▶ Data analysis methods using topology from mathematics
▶ Characterize the shape of data quantitatively
⋆ By using connected components, rings, cavities, etc.
Persistent homology (PH) is a main tool of TDA
▶ The key idea is “homology” from mathematics
▶ Gives a good descriptor for the shape of data (called a persistence diagram)
Rapidly developed in the 21st century
▶ Mathematical theories
▶ Software
▶ Applications to materials science, sensor networks, phylogenetic networks, etc.
Example 1
These images are classified into two groups (the left 4 images and the right 4 images). Can you find the characteristic shape that distinguishes the two groups?
Shapes around the blue dots are “typical” of the left images, and shapes around the red dots are typical of the right images.
Example 2
Atomic configurations of amorphous silica (SiO2) and liquid silica. Can you find the difference?
From Y. Hiraoka, et al., PNAS 113(26):7035-40 (2016)
Persistence diagrams can capture the difference clearly.
Homology
Connected components, rings, and cavities are mathematically formalized by homology.
Algebra is used to formalize such geometric structures.
There are many types of holes, and they are characterized by “dimension”.
[Figure: four example shapes with hole counts (dim 1: 1, dim 2: 0), (dim 1: 0, dim 2: 1), (dim 1: 1, dim 2: 0), and (dim 1: 2, dim 2: 1)]
1-dim hole: you can see the inside from the outside. 2-dim hole: you cannot see the inside.
How to count rings
How many rings (holes) are there in the tetrahedron skeleton?
Four?
But if you look at the tetrahedron from above, the number of rings is three.
What happened?
[Figure: the four rings of the tetrahedron skeleton, labeled (1), (2), (3), and (4)]
We consider the addition of rings. Then (1) + (2) + (3) = (4), since two arrows with opposite directions cancel when added. This means that the four rings are not linearly independent. We can formalize the number of linearly independent rings by linear algebra.
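The linear dependence of the four rings can be checked directly. The following is a minimal sketch, not part of the original slides: the vertex numbering 0 to 3 (vertex 3 being the apex seen from above), the edge orientation, and the helper names `rank` and `ring` are assumptions made for illustration. Each ring is written as a vector of signed edge coefficients, so an edge traversed in opposite directions cancels.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Tetrahedron skeleton: 4 vertices, 6 edges, each edge oriented small -> large.
edges = list(combinations(range(4), 2))


def ring(a, b, c):
    """Oriented ring a -> b -> c -> a as a vector of signed edge coefficients."""
    vec = [0] * len(edges)
    for u, v in ((a, b), (b, c), (c, a)):
        if (u, v) in edges:
            vec[edges.index((u, v))] += 1   # edge followed along its orientation
        else:
            vec[edges.index((v, u))] -= 1   # edge followed against its orientation
    return vec


# Three rings around the apex (vertex 3) and the outer ring (4).
rings = [ring(0, 1, 3), ring(1, 2, 3), ring(2, 0, 3), ring(0, 1, 2)]

# (1) + (2) + (3): shared edges appear with opposite signs and cancel.
total = [a + b + c for a, b, c in zip(rings[0], rings[1], rings[2])]
print(total == rings[3])  # True
print(rank(rings))        # 3
```

The rank is three, which matches the count obtained when looking at the tetrahedron from above.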
Persistent homology
Characterizing the shape of data is a difficult problem
▶ Especially for 3D data
Homology is one possible tool for that purpose, but homology discards too much detail about the shape of data
▶ Homology can only count the number of holes
We want more information about the shape of data in an easy-to-use form
Computational homology was proposed in the 20th century, but it is sensitive to noise
→ Use an increasing sequence (called a filtration)
r-Ball model
[Figure: a point cloud with a very small hole, a medium hole, and a large hole]
Input data is a set of points (called a point cloud)
The points themselves have no “hole”, but there are some hole-like structures
Put a disc of radius r on each point
There are three holes
▶ Homology can detect the number of holes
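As a rough illustration of counting holes at a fixed radius, the sketch below builds a Vietoris-Rips complex and computes Betti numbers over Z/2. This is an assumption: the Vietoris-Rips complex is a common combinatorial stand-in for the union of discs, not the exact union, and the slides do not say which complex is used. The function names and the eight-point test cloud are hypothetical.

```python
import math
from itertools import combinations


def rank_gf2(columns):
    """Rank over Z/2 of a matrix given as columns, each a set of row indices."""
    basis = {}  # highest set bit -> reduced column stored as an int bitmask
    rank = 0
    for col in columns:
        bits = sum(1 << i for i in col)
        while bits:
            pivot = bits.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = bits
                rank += 1
                break
            bits ^= basis[pivot]
    return rank


def betti_numbers(points, r):
    """(b0, b1) of the Vietoris-Rips complex: discs of radius r overlap iff distance <= 2r."""
    n = len(points)
    edges = [e for e in combinations(range(n), 2)
             if math.dist(points[e[0]], points[e[1]]) <= 2 * r]
    edge_index = {e: k for k, e in enumerate(edges)}
    triangles = [t for t in combinations(range(n), 3)
                 if all(p in edge_index for p in combinations(t, 2))]
    rank_d1 = rank_gf2([set(e) for e in edges])
    rank_d2 = rank_gf2([{edge_index[p] for p in combinations(t, 2)}
                        for t in triangles])
    b0 = n - rank_d1                    # connected components
    b1 = len(edges) - rank_d1 - rank_d2  # independent rings
    return b0, b1


# Hypothetical point cloud: 8 points evenly spaced on the unit circle.
points = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
          for k in range(8)]

print(betti_numbers(points, 0.3))  # (8, 0): discs do not touch yet
print(betti_numbers(points, 0.5))  # (1, 1): neighbours connect, one ring
print(betti_numbers(points, 1.1))  # (1, 0): the ring is filled in
```

The three radii show why a single choice of r is fragile: the ring is visible only in a middle range of radii.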
Filtration
By increasing the radius r gradually, many holes appear and disappear. The theory of PH gives a mathematically proper pairing of the radii of appearance and disappearance.
[Figure: as the radius increases, a hole appears, is divided into two holes, one hole disappears, and then the other hole disappears; each hole has a birth radius and a death radius]
Persistence diagram
The pairs are called birth-death pairs. The pairs are visualized as a scatter plot of (birth, death) points on the (x, y)-plane.
[Figure: a filtration in which a hole appears, is divided into two holes, and the two holes disappear one after another, shown with the birth and death radii of each hole]
This diagram visualizes 1-dimensional persistent homology. It is called a persistence diagram.
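The pairing of appearance and disappearance can be sketched with the standard boundary-matrix reduction over Z/2. This is an illustrative implementation under stated assumptions, not the algorithm of any particular software package: the input format (a list of `(radius, simplex)` entries in which every face precedes the simplex) and the small square example with hand-assigned radii are hypothetical.

```python
def persistence_pairs(filtration):
    """Return finite (dimension, birth, death) pairs with birth < death.

    filtration: list of (radius, simplex), simplex a sorted tuple of vertex
    ids, ordered so that every face comes before the simplex itself.
    Classes that never die are not reported in this sketch.
    """
    index = {simplex: i for i, (_, simplex) in enumerate(filtration)}
    reduced = {}  # lowest (largest) row index -> reduced boundary column
    pairs = []
    for j, (death, s) in enumerate(filtration):
        # Boundary of s over Z/2: the set of its codimension-1 faces.
        col = {index[s[:k] + s[k + 1:]] for k in range(len(s))} if len(s) > 1 else set()
        # Standard reduction: cancel the lowest entry while it is already used.
        while col and max(col) in reduced:
            col ^= reduced[max(col)]
        if col:
            i = max(col)          # simplex i created the class that s destroys
            reduced[i] = col
            birth = filtration[i][0]
            if birth < death:
                pairs.append((len(s) - 2, birth, death))
    return pairs


# Hypothetical example: a square whose ring is born when the 4th edge
# appears (radius 1.0) and dies when the diagonal and triangles appear (1.4).
filtration = [
    (0.0, (0,)), (0.0, (1,)), (0.0, (2,)), (0.0, (3,)),
    (1.0, (0, 1)), (1.0, (1, 2)), (1.0, (2, 3)), (1.0, (0, 3)),
    (1.4, (0, 2)),
    (1.4, (0, 1, 2)), (1.4, (0, 2, 3)),
]

for dim, birth, death in persistence_pairs(filtration):
    print(dim, birth, death)
```

Plotting the dimension-1 pairs as points gives the 1-dimensional persistence diagram; here it contains the single point (1.0, 1.4).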
We can apply PH to data of any dimension.
▶ Practical for 2D and 3D
▶ Because it is difficult to understand high-dimensional “holes”
▶ Since it is hard to characterize the shape of 3D data, the application to 3D data is especially useful
We can apply PH to various kinds of increasing sequences
▶ We can apply PH to data other than point clouds
▶ Bitmap data
▶ PH is useful for 3D bitmap data such as X-ray CT data
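For bitmap data, one common increasing sequence is the sublevel-set filtration, in which pixels are added in order of increasing value. The sketch below is an illustration under several assumptions (0-dimensional persistence only, 4-neighbour connectivity, a union-find structure in which the younger component dies on a merge); the slides do not specify how bitmap filtrations are built, and the function name and the 3x3 image are hypothetical.

```python
def sublevel_persistence_0d(image):
    """Finite 0-dim (birth, death) pairs of the sublevel-set filtration of a 2D image."""
    h, w = len(image), len(image[0])
    order = sorted((image[y][x], y, x) for y in range(h) for x in range(w))
    parent = {}  # union-find forest over pixels already added
    birth = {}   # root pixel -> value at which its component was born

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    pairs = []
    for value, y, x in order:
        p = (y, x)
        parent[p] = p
        birth[p] = value
        for q in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if q not in parent:
                continue  # outside the image or not added yet
            a, b = find(p), find(q)
            if a == b:
                continue
            if birth[a] > birth[b]:
                a, b = b, a            # a is the older component
            if birth[b] < value:
                pairs.append((birth[b], value))  # younger component dies here
            parent[b] = a
    return sorted(pairs)


# Hypothetical image: three dark basins (0, 1, 2) separated by a ridge of 5s.
image = [
    [0, 5, 1],
    [5, 5, 5],
    [2, 5, 9],
]
print(sublevel_persistence_0d(image))  # [(1, 5), (2, 5)]
```

The component born at value 0 never dies, so it has no finite pair in this sketch.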
Mathematics of PH
PH is related to various fields
Algebraic topology
Representation theory
Computational geometry
Combinatorics
Probability theory
Statistics
Various studies about fundamental theories are important
Amorphous Silica
What is glass?
Not liquid, not solid, but something in-between
Atomic configuration looks random
But it maintains rigidity
We need a further geometric understanding of atomic configurations
Combination with statistics/machine learning
[Figure: flowchart connecting the following items]
Data (point clouds, images, etc.)
Persistence diagrams
Machine learning
▶ PCA
▶ Regression
▶ Classification
Characteristic geometric patterns in data
Additional information
Visualize
Inverse analysis
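One simple way to feed persistence diagrams into machine learning methods such as PCA or regression is to turn each diagram into a fixed-length vector. The sketch below counts birth-death pairs on a square grid. This particular vectorization, the function name `diagram_to_vector`, and the sample pairs are assumptions for illustration; the slides do not state which vectorization is used.

```python
import math


def diagram_to_vector(pairs, lo, hi, bins):
    """Count (birth, death) pairs on a bins x bins grid over [lo, hi) x [lo, hi).

    The grid is flattened row by row, with death indexing the rows and
    birth indexing the columns. Pairs outside the grid are ignored.
    """
    width = (hi - lo) / bins
    vec = [0] * (bins * bins)
    for birth, death in pairs:
        i = math.floor((birth - lo) / width)  # column: birth bin
        j = math.floor((death - lo) / width)  # row: death bin
        if 0 <= i < bins and 0 <= j < bins:
            vec[j * bins + i] += 1
    return vec


# Hypothetical diagram with three birth-death pairs.
pairs = [(0.1, 0.9), (0.1, 0.8), (0.6, 0.7)]
vec = diagram_to_vector(pairs, lo=0.0, hi=1.0, bins=2)
print(vec)  # [0, 0, 2, 1]
```

Because every diagram maps to a vector of the same length, a collection of diagrams becomes an ordinary data matrix. A weight learned for a grid cell points back to a region of the diagram, which is the starting point for the inverse analysis step.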
Software
For practical data analysis using PH, analysis software is important.
I will introduce Homcloud.
Software for PH
Various analysis software packages have been developed, each for its own purpose and interest
Gudhi
dipha, phat, ripser
Eirene
RIVET
JavaPlex
Perseus
Dionysus...
Homcloud
Focus on applications, especially to materials science
▶ Data analysis for molecular dynamics simulations
▶ Images from electron microscopy, 3D images from X-ray CT
We can compute persistence diagrams from various sources (point clouds, 2D/3D bitmap data)
Inverse analysis
Homcloud as a platform for the development of new methods
Getting an idea → Writing code and trying it → If it works, we consider a background theory
We can quickly introduce such a new idea into data analysis
▶ Collaborators can also use the idea quickly
Try ideas found in papers by other researchers
I both develop the software and analyze data
▶ Mainly data from materials science
⋆ Provided by collaborators
▶ Dogfooding
▶ Do not implement unused functionality
▶ Collaborators also use Homcloud
Implemented mainly in Python
▶ Python is often used for data science
Homcloud Demo
Future plan of Homcloud
Better user interface
Performance improvement
Implement new methods
▶ In parallel with theoretical research
To be published this winter
▶ http://www.wpi-aimr.tohoku.ac.jp/hiraoka_labo/homcloud.html
If you want to use Homcloud, please contact us: [email protected]
Wrap up
Persistent homology enables us to analyze the shape of data quantitatively and effectively by using the power of the mathematical theory of topology
▶ A persistence diagram is a good descriptor for the shape of data
▶ Application to 3D data is most effective, in my opinion
There are many applications
▶ We mainly apply persistent homology to materials science
▶ Meteorology
▶ Brain science, life science, etc.
The combination of theoretical research, software development, and applications is important