TRANSCRIPT
The SKA Science Data Processor and Regional Centres
Paul Alexander
Leader of the Science Data Processor Consortium
SKA: A Leading Big Data Challenge for the 2020 Decade
[Figure: data flow from the antennas, through Digital Signal Processing (DSP), to a High Performance Computing (HPC) facility, over 10s to 1000s of km]
Transfer from antennas to DSP:
  2020: 5,000 PBytes/day
  2030: 100,000 PBytes/day
To process in the HPC facility:
  2020: 50 PBytes/day
  2030: 10,000 PBytes/day
HPC processing:
  2023: 250 PFlop
  2033+: 25 EFlop
Standard interferometer
[Figure: signal path from the sky image through Detect & amplify, Digitise & delay, Correlate, Process (calibrate, grid, FFT) and Integrate, for an astronomical signal (EM wave) arriving from direction s on a baseline B]
• Visibility: V(B) = ⟨E₁ E₂*⟩ = ∫ I(s) exp(iω B·s/c) dΩ
• Resolution determined by the maximum baseline: θ_max ≈ λ / B_max
• Field of View (FoV) determined by the size of each dish: θ_dish ≈ λ / D
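As a quick illustration of those two scalings, here is a minimal sketch (not SDP code) that evaluates them for assumed, SKA1-like numbers: a 21 cm observing wavelength, a 150 km maximum baseline and 15 m dishes.

```python
# Illustrative evaluation of theta_max ~ lambda/B_max and theta_dish ~ lambda/D.
# All numbers are assumptions for the example, not SKA specifications.
import math

wavelength = 0.21        # metres (21 cm line, assumed)
b_max = 150e3            # metres (assumed maximum baseline, ~150 km)
dish_diameter = 15.0     # metres (assumed dish size)

theta_max = wavelength / b_max          # synthesised-beam resolution, radians
theta_fov = wavelength / dish_diameter  # primary-beam field of view, radians

rad_to_arcsec = 180.0 / math.pi * 3600.0
print(f"Resolution    ~ {theta_max * rad_to_arcsec:.2f} arcsec")
print(f"Field of view ~ {theta_fov * 180.0 / math.pi:.2f} deg")
```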
Image formation is computationally intensive
• Images are formed by an inverse algorithm similar to that used in MRI
• Typical image sizes are up to 30k x 30k x 64k voxels
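For a sense of scale, a back-of-envelope sizing of such a cube, assuming 4-byte (single-precision) voxels; this is an illustrative assumption, not an official SDP figure.

```python
# Rough sizing of the quoted maximum image cube, assuming float32 voxels.
n_x, n_y, n_chan = 30_000, 30_000, 64_000
bytes_per_voxel = 4  # assumption: single precision

voxels = n_x * n_y * n_chan
size_pb = voxels * bytes_per_voxel / 1e15
print(f"{voxels:.2e} voxels  ->  ~{size_pb:.0f} PBytes for a full cube")
```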
Work Flows: Imaging Processing Model
[Work-flow diagram, flattened below; hardware components: UV data store, UV processors, imaging processors]
• Correlator output passes through RFI excision and phase rotation
• Major cycle:
  – Subtract the current sky model from the visibilities using the current calibration model
  – Grid the UV data, e.g. W-projection
  – Image the gridded data
  – Deconvolve the imaged data (minor cycle)
  – Solve for the telescope and image-plane calibration model
  – Update the current sky model
  – Update the calibration model
• Output: astronomical quality data
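The control flow of that loop, reduced to a structural sketch: the numerical steps below are toy stand-ins, and only the ordering of the major and minor cycles follows the diagram.

```python
# Structural sketch of the major/minor-cycle imaging loop (not SDP code).
import numpy as np

def subtract_sky_model(vis, sky_model, cal):
    """Subtract the current sky model from the visibilities (stand-in)."""
    return vis - cal * sky_model.sum()

def grid_and_image(vis, npix=256):
    """Grid the UV data and FFT to the image plane (toy version)."""
    grid = np.zeros((npix, npix), dtype=complex)
    grid[npix // 2, npix // 2] = vis.mean()          # toy gridding
    return np.abs(np.fft.ifft2(grid))

def deconvolve(dirty, n_minor=10, gain=0.1):
    """Minor cycle: CLEAN-like peak subtraction (toy version)."""
    model = np.zeros_like(dirty)
    residual = dirty.copy()
    for _ in range(n_minor):
        peak = np.unravel_index(np.argmax(residual), residual.shape)
        model[peak] += gain * residual[peak]
        residual[peak] -= gain * residual[peak]
    return model, residual

def solve_calibration(vis, sky_model):
    """Solve for a (scalar) calibration term against the sky model (stand-in)."""
    return 1.0

vis = np.random.default_rng(0).normal(size=1024) + 1.0   # fake UV data
sky_model = np.zeros((256, 256))
cal = 1.0

for major in range(3):                                   # major cycles
    residual_vis = subtract_sky_model(vis, sky_model, cal)
    dirty = grid_and_image(residual_vis)
    component_model, _ = deconvolve(dirty)               # minor cycles
    sky_model += component_model                         # update sky model
    cal = solve_calibration(vis, sky_model)              # update calibration
```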
One SDP, Two Telescopes
Ingest rates:
  SKA1_Low: 500 GB/s
  SKA1_Mid: 1000 GB/s
In total we eventually need to deploy a system that provides close to 0.25 EFlop of processing.
Scope of the SDP
The SDP System
Multiple instances (1…N) of the "process data" component, some real time and some batch.
Illustrative Computing Requirements
• ~250 PetaFLOP system
• ~200 PetaByte/s aggregate BW to fast working memory
• ~80 PetaByte Storage
• ~1 TeraByte/s sustained write to storage
• ~10 TeraByte/s sustained read from storage
• ~ 10000 FLOPS/byte read from storage
• ~2 Bytes/Flop memory bandwidth
SDP High-Level Architecture
What does SDP do?
The SDP is coupled to the rest of the telescope. We try to make the coupling as loose as possible, but there are some time-critical aspects.
For each observation:
• Controlled by a scheduling block
• Run a real-time (RT) process to ingest data and feed back information
• Schedule batch processing for later
• Must manage resources so that the SDP keeps up on a timescale of approximately a week
The work therefore splits into real-time activity and batch activity.
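A toy sketch of that split; the class, the queue and the weekly budget below are assumptions for illustration, not the SDP control design.

```python
# Sketch: a scheduling block drives one real-time ingest process and defers a
# batch job, which is drained later against a ~weekly resource budget.
from dataclasses import dataclass
from collections import deque

@dataclass
class SchedulingBlock:
    obs_id: str
    ingest_rate_gbs: float       # real-time ingest rate, GB/s
    batch_cost_pflop_h: float    # estimated batch processing cost

batch_queue = deque()

def run_realtime_ingest(sb):
    # Real-time: ingest data and feed information back to the telescope.
    print(f"[RT] ingesting {sb.obs_id} at {sb.ingest_rate_gbs} GB/s")
    batch_queue.append(sb)       # schedule batch processing for later

def drain_batch_queue(weekly_budget_pflop_h):
    # Batch: process queued observations within the weekly resource budget.
    used = 0.0
    while batch_queue and used + batch_queue[0].batch_cost_pflop_h <= weekly_budget_pflop_h:
        sb = batch_queue.popleft()
        used += sb.batch_cost_pflop_h
        print(f"[batch] processed {sb.obs_id} ({sb.batch_cost_pflop_h} PFLOP-h)")

run_realtime_ingest(SchedulingBlock("obs-001", 500.0, 1200.0))
run_realtime_ingest(SchedulingBlock("obs-002", 1000.0, 800.0))
drain_batch_queue(weekly_budget_pflop_h=1500.0)
```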
SDP Architecture and Architectural Drivers
• How do we achieve this functionality?
• An architecture that delivers at scale
• Some key considerations:
  – The SDP is required to evolve over time to support the changing work flows needed to deliver the science
  – Different science experiments have different computational costs and resource demands
    • Use this for load balancing
    • Expect the system to evolve towards more demanding experiments during the operational lifetime
Data Driven Architecture
• Smaller FFT size at the cost of data duplication
• Further data parallelism via spatial indexing (UVW space)
• Used to balance memory bandwidth per node
• Some overlap regions on the target grids are needed
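A sketch of what that data-driven partitioning could look like: visibilities are binned per frequency slice and coarse UV cell, and samples near cell edges are duplicated into the overlap regions. The partitioning scheme and all numbers below are illustrative assumptions, not the SDP design.

```python
# Sketch of data-driven partitioning of visibilities (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
n_vis = 100_000
u = rng.uniform(-1000.0, 1000.0, n_vis)     # u coordinate (wavelengths)
v = rng.uniform(-1000.0, 1000.0, n_vis)     # v coordinate (wavelengths)
chan = rng.integers(0, 64, n_vis)           # frequency channel index

n_freq_slices = 8                            # data parallelism in frequency
n_uv_cells = 4                               # coarse spatial indexing per axis
overlap = 50.0                               # overlap margin in wavelengths

partitions = {}                              # (freq_slice, iu, iv) -> sample indices
cell = 2000.0 / n_uv_cells
for key_f in range(n_freq_slices):
    in_slice = (chan // (64 // n_freq_slices)) == key_f
    for iu in range(n_uv_cells):
        for iv in range(n_uv_cells):
            lo_u, hi_u = -1000.0 + iu * cell, -1000.0 + (iu + 1) * cell
            lo_v, hi_v = -1000.0 + iv * cell, -1000.0 + (iv + 1) * cell
            # Samples inside the cell plus the overlap margin; edge samples
            # are duplicated into neighbouring partitions.
            sel = (in_slice
                   & (u >= lo_u - overlap) & (u < hi_u + overlap)
                   & (v >= lo_v - overlap) & (v < hi_v + overlap))
            partitions[(key_f, iu, iv)] = np.nonzero(sel)[0]

dup = sum(len(ix) for ix in partitions.values()) / n_vis
print(f"{len(partitions)} partitions, duplication factor ~{dup:.2f}x")
```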
How do we get performance and manage data volume?
Approach: build on Big Data concepts
• A "data driven", graph-based processing approach is receiving a lot of attention
• Inspired by Hadoop, but adapted for our complex data flow
• [Diagram: comparison of the graph-based approach with Hadoop]
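To make "graph-based processing" concrete, here is a minimal sketch of executing an explicit data-flow DAG. The task names are illustrative stand-ins; this is not the SDP execution engine.

```python
# Minimal data-flow DAG execution: each task runs once its inputs exist.
from graphlib import TopologicalSorter

def ingest():        return "raw visibilities"
def calibrate(vis):  return f"calibrated({vis})"
def grid(vis):       return f"gridded({vis})"
def image(grid_):    return f"image({grid_})"

# Each node lists the nodes whose outputs it consumes.
graph = {
    "ingest": set(),
    "calibrate": {"ingest"},
    "grid": {"calibrate"},
    "image": {"grid"},
}
funcs = {"ingest": ingest, "calibrate": calibrate, "grid": grid, "image": image}

results = {}
for node in TopologicalSorter(graph).static_order():
    inputs = [results[dep] for dep in graph[node]]
    results[node] = funcs[node](*inputs)
print(results["image"])
```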
Execution Engine: hierarchical – Processing Level 1
• Cluster and data distribution controller
• Relatively low data rate and cadence of messaging
• Staging: aggregation of data products
• Hands off to Processing Level 2 via static data distribution, exploiting the inherent parallelism
Execution Engine: hierarchical – Processing Level 2
• Data island with a shared file store across the island
• Worker nodes: task-level parallelism to achieve scaling
• Process controller (duplicated for resilience)
• Cluster manager, e.g. Mesos-like
Data flow for an observation
Execution Framework
What do we need?
• Processing Level 1:
  – A custom framework to provide scale-out
• Processing Level 2:
  – Many similarities to Big Data frameworks
  – Need to support multiple execution frameworks:
    • Streaming processing – possibly Apache STORM
    • Work flows with relatively large task size, or where data-driven load balancing is straightforward – possibly development of prototype code
    • "Big Data" work flows needing load balancing and resilience – possibly something like Apache SPARK (see the sketch after this list)
  – The framework manages tasks:
    • Tasks need to be pure functions
    • Most tasks are domain specific, but need supporting infrastructure:
      – Processing libraries
      – Interfaces to the execution framework
      – Performance tuning on target hardware
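A minimal sketch of how Level-2 task parallelism might be expressed on Apache Spark, one of the candidate frameworks above. The gridding task and chunk layout are assumptions for illustration; the point is that each task is a pure function mapped over distributed chunk IDs and reduced into one result (requires a pyspark installation).

```python
# Sketch: pure gridding tasks mapped over visibility chunks on Spark.
import numpy as np
from pyspark.sql import SparkSession

def grid_chunk(chunk_id: int) -> np.ndarray:
    """Pure function: grid one chunk of visibilities onto a small UV grid."""
    rng = np.random.default_rng(chunk_id)      # stand-in for reading the chunk
    grid = np.zeros((64, 64))
    iu = rng.integers(0, 64, 1000)
    iv = rng.integers(0, 64, 1000)
    np.add.at(grid, (iu, iv), 1.0)             # toy gridding of 1000 samples
    return grid

spark = SparkSession.builder.appName("sdp-spark-sketch").getOrCreate()
sc = spark.sparkContext

n_chunks = 128
summed_grid = (
    sc.parallelize(range(n_chunks), numSlices=16)  # distribute chunk IDs
      .map(grid_chunk)                             # run the pure task per chunk
      .reduce(lambda a, b: a + b)                  # combine partial grids
)
print("non-zero grid cells:", int((summed_grid > 0).sum()))
spark.stop()
```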
What do we need?
• Skills and work – e.g. for SPARK:
  – Need to develop for high throughput:
    • A High Throughput data analytics framework (HTDAF)
    • New data models
    • Links to external processes
    • Memory management between the framework and processes
  – Likely to have significant applicability elsewhere
  – The shared file system needs to be very intelligent about data placement / management, or be tightly coupled with the HTDAF
  – Integration of in-memory file system / object store systems with the cluster file system / object store
  – Software development in data analytics, distributed processing systems, JVM, …
Database services
• Tasks running within the processing framework need metadata and must communicate updates / results
  – High-performance distributed database
  – Prototyping using REDIS
  – Needs to interface to the processing requirements
  – Schemas
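A small sketch of the kind of metadata traffic this implies, using REDIS as named on the slide. A running Redis instance is assumed; the key names and fields are illustrative, not an SDP schema.

```python
# Sketch: tasks share calibration metadata through Redis (illustrative keys).
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# One task publishes a calibration-solution update as a hash plus a notification.
r.hset("obs:001:calibration", mapping={
    "solver": "gain-cal",
    "timestamp": "2016-01-01T00:00:00Z",
    "quality": "0.97",
})
r.publish("calibration-updates",
          json.dumps({"obs": "001", "table": "obs:001:calibration"}))

# Another task reads the current solution metadata.
print(r.hgetall("obs:001:calibration"))
```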
File / object store
• Multiple file / object stores, one per "data island"
  – Integration into the execution framework
  – Resource allocation integrated with the cluster management system
  – Complex resource management
  – Migration to / interface with the long-term persistent store (Hierarchical Storage Manager) is likely to be separate
Data lifecycle management
Manage the data lifecycle across the cluster and through time:
• Strong link to the overall processing and observing schedule
• Performance is key
• Determine initial data placements
• Manage data movement and migration
• Local high performance is obtained by a lightweight layer (e.g. policies) on top of a file system
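A sketch of such a lightweight, policy-driven placement layer; the tier names, sizes and thresholds below are illustrative assumptions.

```python
# Sketch: placement decisions driven by the processing schedule, then handed
# to the underlying file system / storage tiers (all names illustrative).
from dataclasses import dataclass

@dataclass
class DataProduct:
    name: str
    size_tb: float
    needed_within_days: float    # from the processing/observing schedule

def placement_policy(p: DataProduct) -> str:
    """Map urgency to a storage tier."""
    if p.needed_within_days <= 1:
        return "fast-tier"          # in-memory / NVMe close to the workers
    if p.needed_within_days <= 7:
        return "bulk-tier"          # cluster file system
    return "persistent-tier"        # hierarchical storage manager / archive

for prod in [
    DataProduct("obs-001-uvdata", 400.0, 0.5),
    DataProduct("obs-001-images", 30.0, 5.0),
    DataProduct("obs-000-archive", 120.0, 90.0),
]:
    print(f"{prod.name}: place on {placement_policy(prod)}")
```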
Hardware Platform
Complex Network
Performance Prototype Platform
Example tests:
• automated deployment from containers (github)
• infrastructure management and orchestration
• simulate streamed visibility data from the correlator to N ingest nodes
• basic calibration on ingest
• calibration and imaging with suitable modelling across 2 compute islands
• data products written to preservation (staging)
• basic quality assurance metrics visible
• distributed database performance
• some automated built-in testing at various levels
• LMC interface prototyping (platform management and telemetry)
• Execution Framework – Big Data prototyping (Apache Spark, etc.)
• scheduling

• Significant £400k STFC/UK investment in the hardware prototype
• Currently being installed in Cambridge
Control layer
Complex functionality
– Resilience is required here
– Interface to the cluster manager for resource management
– Prototyping around adding functionality to OpenStack
– Built around various open-source products integrated into a broader framework
Control and system components
Delivering Data
Delivery System
• User Interfaces
• Delivery System Service Manager
• Delivery System Interfaces Manager
• Tiered Data Delivery Manager
Science data catalogue and Prepare Science Product
– This component manages the catalogue of data products and how they are assembled for distribution, including data model conversion if needed
Domain specific components
Not discussed in detail here:
• Some performance tuning
• Interfacing
• I/O system?
Other system Software
Compute System Software
• Compute Operating System software
  – The O/S will be based on a widely-adopted community version of Linux such as Scientific Linux (Fermi Lab, et al.) or CentOS (CERN, et al.)
• Middleware
  – A messaging layer based on TANGO's messaging layer and/or open-source messaging layers (like ZeroMQ) is possible with effort
• Interface to TANGO
  – TANGO is the selected product for overall monitor and control; interfacing of other components to TANGO will be required, e.g. for monitor information from the cluster manager
• Middleware stack
  – Based on widely adopted versions of community-supported software that will mature up to deployment, like OpenStack. Effort will be required during the construction period to develop enhancements and SDP-specific customisation, and to maintain those enhancements as the hardware and software requirements change. Close collaboration and convergence with large contributors (like CERN) in the open-source community could lead to reduced long-term maintenance effort.
Other activities
• Expect the SDP to be delivered by a consortium with strong links to, and involvement from, SKAO and academic partners
– Consortium building and leadership
– Management
– Software Engineering and System Engineering
– Integration and acceptance
– Some on-going support activities after construction
Archive Estimates SKA1
Archive growth rate: ~45 – 120 PBytes/yr
Larger including discovery space: ~300 PBytes/yr
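Compounded over a first decade of operations, and assuming these rates simply hold constant (a rough assumption for illustration only):

```python
# Back-of-envelope cumulative archive size from the quoted growth rates.
years = 10
low, high, with_discovery = 45, 120, 300   # PBytes per year, from the slide
print(f"After {years} yr: {low*years}-{high*years} PBytes "
      f"(~{with_discovery*years/1000:.0f} EBytes including discovery space)")
```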
Data Products
• Standard products:
  • calibrated multi-dimensional sky images
  • time-series data
  • catalogued data for discrete sources (global sky model) and pulsar candidates
• No derived products from science teams
Further Processing and Science Extraction at Regional Centres
Regional Centres
• Data distribution and access for astronomers
• Provide regional data centres and High Performance Computing services
• Software development
• Support for key science teams
• Technology development and engineering support
• Significant compute in their own right
• In the UK, likely to be supported by layering on top of a coordinated compute infrastructure – working title UK-T0
Global Headquarters
South African Site
Australian Site
Regional Centre 1
Regional Centre 2
Regional Centre 3
• Data Analytics support
• Science analysis
• Source extraction
• Bayesian analysis
• Joining of catalogues (see the sketch after this list)
• Visualisation
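As one concrete example of the analysis listed above, joining of catalogues can be done by positional cross-matching; a minimal sketch using astropy (an assumed tool choice, with random, illustrative source positions):

```python
# Sketch: cross-match two source catalogues by sky position with astropy.
import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord

rng = np.random.default_rng(2)
cat_a = SkyCoord(ra=rng.uniform(0, 10, 500) * u.deg,
                 dec=rng.uniform(-5, 5, 500) * u.deg)
cat_b = SkyCoord(ra=rng.uniform(0, 10, 800) * u.deg,
                 dec=rng.uniform(-5, 5, 800) * u.deg)

# For every source in catalogue A, find its nearest neighbour in catalogue B.
idx, sep2d, _ = cat_a.match_to_catalog_sky(cat_b)
matched = sep2d < 10 * u.arcsec          # accept matches closer than 10 arcsec
print(f"{matched.sum()} of {len(cat_a)} sources matched within 10 arcsec")
```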
END