
A Parallel Delaunay algorithm for CGAL

David Millman

Advisor: Sylvain Pion

July 26th 2007

Goal

To create a parallel implementation of Delaunay Triangulation in R3 with CGAL for shared memory parallel machines using OpenMP.

Motivation

• Delaunay’s many uses
  – Meshing in finite element theory
  – Computational biology
  – Geometric modeling
  – Anything that can be done with a Voronoi diagram

• Multi-processor systems
  – More common
  – Multi-core systems

Motivation (cont.)

• Big data sets
  – Robust algorithms to mesh billions of points
  – Sequentially, CGAL
    • 1 processor and 16GB RAM: 10 million points in ~120 seconds, using 5.5GB RAM
  – Blandford, Blelloch, Kadow ‘06
    • 64 processors and 200GB RAM: 1 billion points in 5512 seconds, using 197GB

Tools

• CGAL - Computational Geometry Algorithms Library, www.cgal.org

• OpenMP - API for shared memory parallel programming www.openmp.org

• Capricorne
  – 2 quad-core processors (8 cores)
  – 16GB RAM

CGAL Delaunay Algorithm

• Locate

• Find Conflict Region

• Create New Cells

• Remove invalid cells

Steps to Parallelization

• Compact Container

• Locate

• Find Conflict Region

• Create New Cells

Locks

• OpenMP provides
  – Test lock
  – Wait lock

• Priority lock
  – Lock and priority pair
  – Test lock
  – Priority lock

CGAL Locks

• Omp_lock_traits
  – Export types
    • Lock_type
    • Priority_type
  – Constants
    • max_num_threads
    • is_parallel
  – Static functions to handle omp functions
    • static void set_num_threads(int i)
    • static size_t get_num_threads()
    • static void wait_lock(Lock_type* lock)
  – Priority lock
    • bool priority_lock(Priority_type p)
    • bool test_lock(Priority_type p)
    • void unset_lock()
    • bool is_priority(Priority_type p) const

• Omp_empty_lock_traits
  – Same interface
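
The slides list the priority lock's member functions but not their exact semantics. The sketch below is one plausible reading, not the talk's implementation: the lock stores the priority of its holder (0 means free), test_lock tries once without blocking, and priority_lock waits only while the holder has a lower priority, backing off (returning false) when the holder outranks the caller. The class name Priority_lock_sketch, the use of unsigned for Priority_type, and the use of std::atomic in place of OpenMP locks are all assumptions made to keep the example self-contained.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical priority lock: holder_ is 0 when free, otherwise the
// (non-zero) priority of the thread that currently owns the lock.
class Priority_lock_sketch {
    std::atomic<unsigned> holder_{0};

public:
    // Non-blocking attempt: succeeds only if the lock is currently free.
    bool test_lock(unsigned p) {
        unsigned expected = 0;
        return holder_.compare_exchange_strong(expected, p);
    }

    // Assumed semantics: wait while a lower-priority thread holds the lock,
    // give up immediately if a higher-priority thread holds it.
    bool priority_lock(unsigned p) {
        for (;;) {
            unsigned expected = 0;
            if (holder_.compare_exchange_weak(expected, p)) return true;
            if (expected == p) return true;   // we already hold it
            if (expected > p) return false;   // holder outranks us: back off
            // Lower-priority holder (or spurious failure): spin and retry.
        }
    }

    void unset_lock() { holder_.store(0); }

    // True if the lock is currently held with priority p.
    bool is_priority(unsigned p) const { return holder_.load() == p; }
};
```

Because a thread never waits on a holder with higher priority, two threads cannot wait on each other in a cycle under this reading; the thread that backs off would release its other locks and retry later.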

Compact Container

• STL-like container
• Pointers to 4-byte aligned objects
• Iterators are not invalidated during insert and delete
• Memory

[Figure: memory blocks and the free list; labels: Free List, n, n]

MT-Compact Container

• Each thread maintains its own free list
  – Insert
  – Delete
  – Allocate

• Only lock for allocation

• Size formula: $\text{size} = \text{capacity} - \sum_{i} |\text{freeList}_i|$

• Memory

[Figure: memory blocks and the free list; labels: Free List, n, NT * n, where NT = number of threads]

MT-Compact Container (cont.)

• Old:
  – Compact_container<T, Allocator = Default_allocator>

• New:
  – Compact_container<T, Allocator = Default_allocator, Lock_type = Omp_empty_lock_traits>
  – No new functions
  – Free list array is a boost array parameterized on lock_traits::max_num_threads
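
As a rough illustration of the per-thread free lists, the sketch below keeps one free list per thread id, takes a lock only when a new block is allocated, and computes the size as the capacity minus the total length of all free lists. It is a simplified stand-in, not CGAL's Compact_container: the name Mt_pool_sketch, the explicit tid argument (with OpenMP it might come from omp_get_thread_num()), the block size, and the use of std::mutex and std::array instead of OpenMP locks and a boost array are all assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical pool with one free list per thread.
// T must be default-constructible and assignable in this sketch.
template <class T, std::size_t MaxThreads = 8, std::size_t BlockSize = 16>
class Mt_pool_sketch {
    std::vector<std::unique_ptr<T[]>> blocks_;       // owned memory blocks
    std::array<std::vector<T*>, MaxThreads> free_;   // free_[tid]: slots owned by thread tid
    std::mutex alloc_mutex_;                         // taken only when allocating a block
    std::size_t capacity_ = 0;

    // Allocates one block and hands all of its slots to thread tid.
    void refill(std::size_t tid) {
        std::lock_guard<std::mutex> guard(alloc_mutex_);
        blocks_.emplace_back(new T[BlockSize]);
        T* base = blocks_.back().get();
        for (std::size_t i = 0; i < BlockSize; ++i) free_[tid].push_back(base + i);
        capacity_ += BlockSize;
    }

public:
    // Insert and erase touch only the calling thread's free list: no lock.
    T* insert(std::size_t tid, const T& value) {
        if (free_[tid].empty()) refill(tid);
        T* slot = free_[tid].back();
        free_[tid].pop_back();
        *slot = value;
        return slot;
    }

    void erase(std::size_t tid, T* slot) { free_[tid].push_back(slot); }

    std::size_t capacity() const { return capacity_; }

    // size = capacity - sum of all free list lengths.
    // Only meaningful while no other thread is inserting or erasing.
    std::size_t size() const {
        std::size_t free_total = 0;
        for (const auto& list : free_) free_total += list.size();
        return capacity_ - free_total;
    }
};
```

Existing pointers stay valid across inserts and erases because blocks are never moved or freed while the pool is alive.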

Locate point p

• Start at some cell, c

• Determine a face, f, of c that p lies outside of

• Repeat with the adjacent cell that shares f with c

• Continue until p is contained in the current cell

[Figure: walk from cell c toward p; labels: c, x, y, z]
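
The walk can be sketched in a few lines. To keep the example short it uses triangles in the plane rather than tetrahedra in R3, plain double arithmetic rather than exact predicates, and a hypothetical Triangle record with explicit neighbor indices; none of these are CGAL's data structures.

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Point2 { double x, y; };

struct Triangle {
    std::array<int, 3> v;    // vertex indices, counterclockwise
    std::array<int, 3> nbr;  // nbr[i]: triangle across the edge opposite v[i], -1 if none
};

// > 0 if p lies to the left of the directed line a -> b, < 0 if to the right.
double orient2d(const Point2& a, const Point2& b, const Point2& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Walks from triangle `start` toward p. Returns the index of a triangle
// containing p, or -1 if the walk leaves the triangulation.
int locate_sketch(const std::vector<Point2>& pts,
                  const std::vector<Triangle>& tris,
                  int start, const Point2& p) {
    int c = start;
    while (c != -1) {
        const Triangle& t = tris[c];
        int next = c;
        for (int i = 0; i < 3; ++i) {
            const Point2& a = pts[t.v[(i + 1) % 3]];
            const Point2& b = pts[t.v[(i + 2) % 3]];
            if (orient2d(a, b, p) < 0) {  // p is outside the edge opposite v[i]
                next = t.nbr[i];          // step to the cell sharing that edge
                break;
            }
        }
        if (next == c) return c;          // p is outside no edge: found
        c = next;
    }
    return -1;
}
```

In three dimensions the same loop would test the four faces of a tetrahedron with a 3D orientation predicate.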

MT-Locate

• Same steps as Locate, but we must lock and unlock the vertices of the cells, to avoid the cell being destroyed.


Find Conflict Region

• Initialize c to be the cell containing p

• If p is inside the circumsphere of the vertices of c, mark c as in conflict

• Expand to the adjacent cells until the whole conflict region is found
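
In R3 the conflict test for a single cell is an in-sphere predicate on its four vertices. The sketch below evaluates the standard lifted 4x4 determinant in plain double arithmetic and multiplies by the cell's orientation so that the vertex order does not matter. It is an illustration only: a robust implementation would use exact or filtered predicates, and the names Point3 and in_conflict_sketch are hypothetical.

```cpp
#include <array>
#include <cassert>

using Point3 = std::array<double, 3>;

inline Point3 sub(const Point3& a, const Point3& b) {
    return {{a[0] - b[0], a[1] - b[1], a[2] - b[2]}};
}

// Determinant of the 3x3 matrix with rows u, v, w.
inline double det3(const Point3& u, const Point3& v, const Point3& w) {
    return u[0] * (v[1] * w[2] - v[2] * w[1])
         - u[1] * (v[0] * w[2] - v[2] * w[0])
         + u[2] * (v[0] * w[1] - v[1] * w[0]);
}

// Sign gives the orientation of the tetrahedron (a, b, c, d).
inline double orient3d(const Point3& a, const Point3& b,
                       const Point3& c, const Point3& d) {
    return det3(sub(b, a), sub(c, a), sub(d, a));
}

// True if p lies strictly inside the sphere through a, b, c, d
// (non-robust: floating-point round-off is ignored).
bool in_conflict_sketch(const Point3& a, const Point3& b, const Point3& c,
                        const Point3& d, const Point3& p) {
    const Point3 u[4] = {sub(a, p), sub(b, p), sub(c, p), sub(d, p)};
    double s[4];
    for (int i = 0; i < 4; ++i)
        s[i] = u[i][0] * u[i][0] + u[i][1] * u[i][1] + u[i][2] * u[i][2];
    // 4x4 determinant with rows (u[i], s[i]), expanded along the last column.
    const double det = -s[0] * det3(u[1], u[2], u[3])
                     +  s[1] * det3(u[0], u[2], u[3])
                     -  s[2] * det3(u[0], u[1], u[3])
                     +  s[3] * det3(u[0], u[1], u[2]);
    return -det * orient3d(a, b, c, d) > 0.0;
}
```

For the unit tetrahedron with vertices (0,0,0), (1,0,0), (0,1,0), (0,0,1), the sphere's center (0.5, 0.5, 0.5) is in conflict and the point (2, 2, 2) is not.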

MT-Find Conflict Region

• Once again, same steps, but we must lock and unlock vertices to avoid deadlocks
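
One way to lock the vertices of a cell without risking deadlock is to acquire them with non-blocking attempts and release everything taken so far as soon as one attempt fails. The sketch below shows that all-or-nothing pattern. It is an assumption about how such locking could look, not the talk's code: vertex ownership is a std::atomic<int> holding a thread id (kFree when unlocked), and the function names are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kFree = -1;  // owner value of an unlocked vertex

// Marks every vertex as unlocked.
void init_free(std::vector<std::atomic<int>>& owner) {
    for (auto& o : owner) o.store(kFree);
}

// Tries to lock the four vertices of `cell` for thread `tid`.
// Vertices already held by `tid` are accepted as they are. If any vertex is
// held by another thread, the vertices taken during this call are released
// again and false is returned, so the caller never waits while holding locks.
bool try_lock_cell(std::vector<std::atomic<int>>& owner,
                   const std::array<std::size_t, 4>& cell, int tid) {
    std::array<std::size_t, 4> taken{};
    std::size_t n_taken = 0;
    for (std::size_t v : cell) {
        int expected = kFree;
        if (owner[v].compare_exchange_strong(expected, tid)) {
            taken[n_taken++] = v;
        } else if (expected != tid) {
            for (std::size_t i = 0; i < n_taken; ++i) owner[taken[i]].store(kFree);
            return false;
        }
    }
    return true;
}

// Releases every vertex currently held by `tid` (end of an insertion).
void unlock_all(std::vector<std::atomic<int>>& owner, int tid) {
    for (auto& o : owner) {
        int expected = tid;
        o.compare_exchange_strong(expected, kFree);
    }
}
```

A thread whose attempt fails would typically release its remaining locks and retry the point later, which matches the test-lock style of access listed on the Locks slide.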

Create New Cell

• Remove cells which are in conflict creating a hole

• Triangulate the hole with a star

MT-Create New Cell

The same as Create New Cell, except that the locks are released at the end.

• Remove cells which are in conflict creating a hole

• Triangulate the hole with a star

TDS

• Vertex base
  – Old: TDS_vertex_base<TDS>
  – New: TDS_vertex_base<TDS, LT=Omp_empty_lock_traits>
  – Private derivation of Priority_lock
  – Functions for locking, unlocking, etc.

• Cell base
  – No changes

• TDS
  – Added functions to help with locking and unlocking
    • priority_lock_cell, priority_lock_mirror_vertex
    • is_locked (vertex and cell)
    • lock (vertex and cell)
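
The combination of a lock-traits template parameter with an empty default and private derivation from the lock can be sketched as follows. The idea illustrated is that sequential code instantiates the vertex with an empty lock whose operations do nothing, so it typically pays no space or time cost, while parallel code swaps in a real lock through the same interface. All names ending in _sketch are hypothetical, and std::atomic stands in for the OpenMP-based lock.

```cpp
#include <atomic>
#include <cassert>

// Empty traits: every lock operation trivially succeeds.
struct Empty_lock_traits_sketch {
    static constexpr bool is_parallel = false;
    struct Priority_lock {
        bool test_lock(unsigned) { return true; }
        void unset_lock() {}
    };
};

// Locking traits with the same interface, backed by an atomic.
struct Atomic_lock_traits_sketch {
    static constexpr bool is_parallel = true;
    class Priority_lock {
        std::atomic<unsigned> holder_{0};  // 0 = free, else holder's priority
    public:
        bool test_lock(unsigned p) {
            unsigned expected = 0;
            return holder_.compare_exchange_strong(expected, p);
        }
        void unset_lock() { holder_.store(0); }
    };
};

// Vertex base privately derived from the lock chosen by the traits class.
template <class LT = Empty_lock_traits_sketch>
class Vertex_base_sketch : private LT::Priority_lock {
    using Lock = typename LT::Priority_lock;
    void* cell_ = nullptr;  // stand-in for the vertex's incident cell handle

public:
    bool try_lock(unsigned p) { return Lock::test_lock(p); }
    void unlock() { Lock::unset_lock(); }
    void* cell() const { return cell_; }
};
```

With the empty traits, the private base class is empty, so compilers can apply the empty base optimization and the vertex stays the size of its data members.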

Triangulation_3 and Delaunay_3

• Triangulation_3
  – parallel_locate(Point p, Vertex start)
    • vertex as hint
    • cell returned is locked
  – error_vertex
    • query and access functions (similar to infinite vertex)

• Delaunay_3
  – parallel_insert(Iterator begin, Iterator end, int num_threads)
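
The slides do not show how parallel_insert distributes the points. A minimal sketch of one possible driver, assuming the points are in a random-access container and each single-point insertion is itself thread-safe, is an OpenMP parallel loop over the index range. The function name parallel_insert_sketch, the callback parameter insert_one, and the dynamic schedule with chunk size 64 are assumptions; when compiled without OpenMP the loop simply runs sequentially.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical driver: calls insert_one(points[i]) for every point, sharing
// the index range among up to num_threads OpenMP threads.
// insert_one must be safe to call concurrently (for example, by locking the
// region of the triangulation that it modifies).
template <class Point, class InsertOne>
void parallel_insert_sketch(const std::vector<Point>& points,
                            int num_threads, InsertOne insert_one) {
    const long n = static_cast<long>(points.size());
#ifdef _OPENMP
#pragma omp parallel for num_threads(num_threads) schedule(dynamic, 64)
#endif
    for (long i = 0; i < n; ++i) {
        insert_one(points[i]);
    }
    (void)num_threads;  // unused when compiled without OpenMP
}
```

A usage example with a stand-in for the real insertion, counting the processed points with an atomic:

```cpp
std::vector<int> pts(1000, 1);
std::atomic<long> inserted{0};
parallel_insert_sketch(pts, 4, [&](int p) { inserted += p; });
// inserted.load() is now 1000
```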

Compact Container Results

Compact Container Results (cont.)

Locate Results

Locate Results (cont.)

Delaunay Results

Delaunay Results (cont.)

Results Summary

• Compact Container

• Locate

• Delaunay

Future work

• Optimize

• Optimize

• Optimize

• Optimize

• Parallel mesh refinement

• Mesh compression

Thank you

• INRIA, NSF, REUSSI, Sylvain Pion, Chee Yap, and everyone responsible for putting this program together.
