

©1998-2003, Michael Aivazis

DANSE Software Architecture

Challenges and opportunities for the next generation of data analysis software

Michael Aivazis

California Institute of Technology

DANSE Software Workshop

September 3-8, 2003


Outline

Ontology

Introduction to the architectural elements:

components and their parts

data transport: ports and pipes

Implementation issues:

analysis, modeling, simulation

software engineering

collaborative development

distributed computing

user interfaces

graphics

Status


Ontology

DANSE is a software architecture:

a specification of the organization of the software system

a description of the crucial structural elements and their interfaces

a specification for the possible collaborations of these elements

a strategy for the composition of structural and behavioral elements

DANSE is multi-layered:

flexibility

complexity management

robustness under evolutionary pressures

Contemplate the disconnect between a remote, parallel computation and your ability to control it from your laptop

[Layer diagram: application-general, application-specific, framework, computational engines]


User stereotypes

Visiting scientist: occasional user of prepackaged and specialized analysis tools

Instrument specialist: author of prepackaged specialized tools

Expert scientist: prospective author/reviewer of PRL paper

Analysis expert: author of analysis, modeling or simulation software

Software integrator: responsible for extending software with new technology

Framework maintainer: responsible for maintaining and extending the DANSE infrastructure


Example dataflow diagram

[Dataflow diagram. Components: NeXusReader, Selector (three instances), Bckgrnd, Energy, NeXusWriter. Labeled inputs and data: filename, instrument info, raw counts, times, time interval, energy bins, filename.]

A picture that’s worth a thousand questions…
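A dataflow of this kind can be read as components wired output to input. The following is a minimal sketch of that idea, not DANSE or pyre code: the `Component` class, its `connect` and `push` methods, the stage functions, and the file name `run42.nxs` are all hypothetical stand-ins, and the "reader" returns made-up counts instead of reading a NeXus file.

```python
# Hypothetical sketch: none of these names come from DANSE or pyre.

class Component:
    """A named processing step with one input and one output."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.downstream = []   # components fed by this one's output

    def connect(self, other):
        """Wire this component's output to another component's input."""
        self.downstream.append(other)
        return other           # allows chaining: a.connect(b).connect(c)

    def push(self, data, results):
        """Process data, then forward the result downstream."""
        out = self.func(data)
        if not self.downstream:
            results[self.name] = out   # terminal component: record its output
        for component in self.downstream:
            component.push(out, results)


# Stand-ins for the reader, selector, background and writer stages.
reader = Component("reader", lambda filename: [5, 7, 9, 11])    # fake raw counts
selector = Component("selector", lambda counts: counts[1:3])    # keep an interval
background = Component("background", lambda counts: [c - 1 for c in counts])
writer = Component("writer", lambda counts: list(counts))

reader.connect(selector).connect(background).connect(writer)

results = {}
reader.push("run42.nxs", results)   # hypothetical file name
```

After the call, `results` holds the output of the terminal component only; intermediate data passes from stage to stage without the caller seeing it.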


Component schematic

[Schematic of a component. Labels: name, input ports, output ports, properties, control, component core.]


Component anatomy

Core: encapsulation of computational engines; middleware that manages the interaction between the framework and codes written in low-level languages

Harness: an intermediary between a component’s core and the external world

framework services: control, port deployment

core services: deployment, launching, teardown


Component core

Three-tier encapsulation of access to computational engines:

engine

bindings

facility implementation by extending abstract framework services

Cores enable the lowest integration level available: suitable for integrating large codes that interact with one another by exchanging complex data structures

UI: text editor

[Figure: a core. Labels: facility, bindings, custom code.]


Computational engines

Normal engine life cycle:

deployment: staging, instantiation, static initialization, dynamic initialization, resource allocation

launching: input delivery, execution control, hauling of output

teardown: resource de-allocation, archiving, execution statistics

Exceptional events:

core dumps, resource allocation failures

diagnostics: errors, warnings, informational messages

monitoring: debugging information, self consistency checks

Distributed computing

Parallel processing
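The life cycle above (deployment, launching, teardown, plus handling of exceptional events) can be sketched as a small driver that always tears the engine down, even when execution fails. This is an illustration under assumptions, not the pyre implementation: the `Engine` class, the `run` function, and the logger name are hypothetical, and the "computation" is a plain average.

```python
import logging

log = logging.getLogger("engine")   # hypothetical logger name


class Engine:
    """Hypothetical engine wrapper exposing the three life cycle stages."""

    def __init__(self):
        self.stages = []                 # records which stages ran, in order

    def deploy(self):
        self.stages.append("deploy")     # staging, initialization, allocation

    def launch(self, data):
        self.stages.append("launch")     # input delivery and execution
        return sum(data) / len(data)     # stand-in computation

    def teardown(self):
        self.stages.append("teardown")   # de-allocation, archiving


def run(engine, data):
    """Drive an engine through its life cycle; teardown always happens."""
    engine.deploy()
    try:
        return engine.launch(data)
    except Exception as error:
        # Exceptional event: emit a diagnostic instead of crashing the caller.
        log.error("engine failed: %s", error)
        return None
    finally:
        engine.teardown()


ok = Engine()
mean = run(ok, [1.0, 2.0, 3.0])

bad = Engine()
failed = run(bad, [])   # empty input raises ZeroDivisionError inside launch
```

Putting teardown in a `finally` clause is one way to guarantee resource de-allocation after a failure; a real engine that dumps core would need the check to live in a separate supervising process.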


Flexibility through the use of scripting

Scripting enables us to:

organize large numbers of user tunable parameters

allow the runtime environment to discover new capabilities without the need for recompilation or relinking

compose computations at runtime

The interpretive environment: Python

is a modern object oriented language

is robust, portable, mature, well supported, well documented

is easily extensible

supports rapid application development

has been extended to support parallel programming

has no measurable impact on either performance or scalability
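As an illustration of discovering capabilities and composing a computation at runtime, the sketch below resolves callables from names held in user settings, using the standard library's `importlib`. The `load` helper, the `module:attribute` naming convention, and the settings keys are assumptions made for this example; the standard `statistics` and `math` modules stand in for analysis code.

```python
import importlib


def load(spec):
    """Resolve a 'module:attribute' string to a Python object at runtime."""
    module_name, _, attribute = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# User tunable parameters; in practice these might come from a file or the
# command line. The keys and the naming convention are hypothetical.
settings = {
    "reducer": "statistics:mean",
    "transform": "math:sqrt",
}

reducer = load(settings["reducer"])
transform = load(settings["transform"])

# Compose the computation from whatever the settings named.
result = transform(reducer([4.0, 16.0, 28.0]))
```

Changing a value in `settings` swaps in a different capability without touching, recompiling or relinking the calling code.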


Pyre architecture

[Architecture diagram. Labels: framework, facility, component, bindings, custom code, library, extension, core, service, requirement, implementation, package, python, FORTRAN/C/C++.]

The integration framework is a set of co-operating abstract services


Component harness

The harness:

collects and delivers user configurable parameters

interacts with the data transport mechanisms

guides the core through the various stages of its lifecycle

provides monitoring services

Parallelism and distributed computing are achieved by specialized harness implementations

The harness enables the second level of integration:

adding constraints makes code interaction more predictable

provides complete support for an application-generic interface
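A minimal sketch of the harness role described above: it collects user settings, merges them with the defaults the core declares, rejects anything the core did not declare, and keeps a simple monitoring record. The `Core` and `Harness` classes, their methods, and the parameter names are hypothetical and do not reproduce pyre's interfaces.

```python
class Core:
    """Hypothetical core: scales and shifts its input."""

    defaults = {"scale": 1.0, "offset": 0.0}   # declared tunable parameters

    def configure(self, **params):
        self.params = params

    def execute(self, values):
        scale, offset = self.params["scale"], self.params["offset"]
        return [v * scale + offset for v in values]


class Harness:
    """Hypothetical harness standing between a core and its callers."""

    def __init__(self, core):
        self.core = core
        self.log = []   # simple monitoring record

    def configure(self, overrides):
        # Constraint: only parameters the core declares may be set.
        unknown = set(overrides) - set(self.core.defaults)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        params = {**self.core.defaults, **overrides}
        self.core.configure(**params)
        self.log.append(("configured", params))

    def run(self, values):
        result = self.core.execute(values)
        self.log.append(("executed", len(values)))
        return result


harness = Harness(Core())
harness.configure({"scale": 2.0})
output = harness.run([1.0, 2.0])
```

Rejecting undeclared parameters is one small example of how added constraints make the interaction between codes more predictable.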


Data transport

[Diagram. Labels: output port, data pipe, input port.]


Ports and pipes

Ports further enable the physical decoupling of components by encapsulating data exchange

Runtime connectivity implies a two-stage negotiation process:

when the connection is first established, the io ports exchange abstract descriptions of their requirements

appropriate encoding and decoding takes place during data flow

Pipes are data transport mechanisms chosen for efficiency:

intra-process or inter-process

components need not be aware of the location of their neighbors

Standardized data types obviate the need for a complicated runtime typing system:

meta-data in a format that is easy to parse (XML)

tables

histograms
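The two-stage negotiation can be sketched as follows: at connection time the ports compare descriptions, and during data flow the payload is encoded on the way out and decoded on the way in. All names here (`OutputPort`, `InputPort`, `connect`, the `kind` field) are hypothetical, and JSON is used only as a convenient stand-in encoding; the slides name XML for meta-data.

```python
import json


class OutputPort:
    """Hypothetical output port that advertises the kind of data it emits."""

    def __init__(self, kind):
        self.kind = kind          # e.g. "histogram" or "table"
        self.peer = None

    def describe(self):
        return {"kind": self.kind}

    def send(self, payload):
        # Stage two: encode on the way out; the peer decodes on the way in.
        message = json.dumps({"kind": self.kind, "data": payload})
        self.peer.receive(message)


class InputPort:
    """Hypothetical input port that lists the kinds of data it accepts."""

    def __init__(self, accepts):
        self.accepts = set(accepts)
        self.received = []

    def receive(self, message):
        self.received.append(json.loads(message)["data"])


def connect(out_port, in_port):
    """Stage one: compare descriptions before any data flows."""
    description = out_port.describe()
    if description["kind"] not in in_port.accepts:
        raise TypeError(f"input port does not accept {description['kind']!r}")
    out_port.peer = in_port


source = OutputPort("histogram")
sink = InputPort(accepts=["histogram", "table"])
connect(source, sink)
source.send({"bins": [0, 1, 2], "counts": [3, 5]})
```

Because the sender only ever hands an encoded message to its peer, the same ports could sit in one process or in two, which is the sense in which components need not know where their neighbors are.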


Component implementation strategy

Write engine:

custom code, third party libraries

modularize by providing explicit support for life cycle management

implement handling of exceptional events

Construct python bindings:

select entry points to expose

Integrate into framework:

construct object oriented veneer

extend and leverage framework services

Cast as a component:

provide object that implements component interface

describe user configurable parameters

provide meta data that specify the IO port characteristics

code custom conversions from standard data streams into lower level data structures

All steps are well localized!
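The last step, casting an engine as a component, might look like the sketch below. Everything in it is assumed for illustration: `engine_rebin` is a pure-Python stand-in for compiled engine code, `Rebinner` is a hypothetical component, and the row format used for histograms is invented here. The point is where each concern lives: parameter and port descriptions are class-level metadata, and the conversion between the standard stream and the engine's flat array happens in one place.

```python
from array import array


def engine_rebin(counts, width):
    """Stand-in for compiled engine code: sums adjacent groups of bins."""
    return array("d", (sum(counts[i:i + width])
                       for i in range(0, len(counts), width)))


class Rebinner:
    """Hypothetical component wrapping the engine."""

    # Metadata a framework could inspect without running the component.
    properties = {"width": {"type": int, "default": 2}}
    inputs = {"histogram": "rows of {'bin': int, 'count': float}"}
    outputs = {"histogram": "rows of {'bin': int, 'count': float}"}

    def __init__(self, **settings):
        default = self.properties["width"]["default"]
        self.width = settings.get("width", default)

    def __call__(self, rows):
        # Convert the standard row format into the flat array the engine wants.
        counts = array("d", (row["count"] for row in rows))
        rebinned = engine_rebin(counts, self.width)
        # Convert the engine result back into the standard row format.
        return [{"bin": i, "count": c} for i, c in enumerate(rebinned)]


rows = [{"bin": i, "count": float(c)} for i, c in enumerate([1, 2, 3, 4, 5])]
rebinned = Rebinner(width=2)(rows)
```

With five input bins and a width of two, the last output bin holds the single leftover count.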


Application-general user interface

The natural interface: good correspondence between logical and physical descriptions of the computation

Multiple views of the computation:

data flow

control flow

deployment of distributed components

Dynamic interface generation from component supplied specifications

High end graphics:

data plotting in two and three dimensions

three dimensional visualizations
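Dynamic interface generation can be illustrated with a command line interface built from a specification that a component publishes about itself. This sketch uses the standard `argparse` module as a stand-in for a graphical form; the `SPEC` dictionary, its layout, and the property names are hypothetical.

```python
import argparse

# Hypothetical specification a component might publish about itself.
SPEC = {
    "name": "energy",
    "properties": [
        {"name": "bins", "type": int, "default": 100, "help": "number of energy bins"},
        {"name": "emin", "type": float, "default": 0.0, "help": "lower energy bound"},
        {"name": "emax", "type": float, "default": 50.0, "help": "upper energy bound"},
    ],
}


def build_parser(spec):
    """Generate a command line interface from a component specification."""
    parser = argparse.ArgumentParser(prog=spec["name"])
    for prop in spec["properties"]:
        parser.add_argument(
            f"--{prop['name']}",
            type=prop["type"],
            default=prop["default"],
            help=prop["help"],
        )
    return parser


parser = build_parser(SPEC)
options = parser.parse_args(["--bins", "25", "--emax", "80"])
```

The same specification could drive a form or a wizard instead; the interface code never needs to know which component it is describing.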


Visual Programming Environment

Data flow paradigm appears natural:

usability problems are focused on knowledge of what is possible

used by many commercial and open source tools

Improvements:

decouple UI from diagram logic

interface: use OpenGL!; collaborative; interesting and relevant research

diagram logic: thin, reusable component; scripting; multi-layered control

development can use existing solutions as a guide of what not to do

many modules already available in pyre

enable distributed programming

Target for prototype: early 2004


User interface prototypes - I


User interface prototypes - II


Distributed computing

Enabled by design: careful orchestration of the interactions between the computational engines and their environment

Issues:

authentication

security

remote deployment, control and monitoring

Two-pronged implementation strategy:

use currently available, robust (but primitive) technologies to extend the framework

initial authentication and deployment based on ssh

authentication and security using pyre services

User interfaces are harder to write correctly!
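For the ssh-based first prong, a deployment step could start as simply as building the command that launches a script on a remote host. The sketch below only constructs the argument list and does not open a connection. The `remote_command` helper, the host name, the user name, and the script name are hypothetical; `BatchMode=yes` is a standard OpenSSH option that disables interactive password prompts.

```python
import shlex


def remote_command(host, script, args, user=None):
    """Build the argv for launching a script on a remote host via ssh."""
    target = f"{user}@{host}" if user else host
    # Quote the remote command line so the remote shell sees the same words.
    remote = shlex.join(["python3", script, *args])
    return ["ssh", "-o", "BatchMode=yes", target, remote]


# Hypothetical host, user and script names.
argv = remote_command(
    "cluster.example.org", "run_engine.py", ["--bins", "25"], user="danse"
)
# To actually launch: subprocess.run(argv, check=True)
```

Passing the command as a list rather than a single string avoids a second round of quoting on the local side.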


Application-specific interfaces

User interface should:

be simple, easy to use

treat the data analysis process at the right level of abstraction

have reasonable defaults for most choices

Forms or wizards?

As users get more familiar they become more demanding:

transparent access to the underlying application-generic interface

access to the scripts?


Status

Leveraging pyre: large number of existing services

component cores: integration strategy well understood

component harnesses: support for user configurable properties is complete

data transport

User interfaces: two application-generic prototypes under construction

Missing pieces:

engines for analysis, simulation and modeling

application-specific UI

tests, of all sorts…

documentation

training materials