TRANSCRIPT
Software Development at the LMU - Munich
Ideas, Vision, Commitment?
ObsPy
• Python toolbox for seismologists
• Goal: facilitate rapid application development for seismology
• Modular, extensible architecture
• Waveform data: GSE2/GSE1, MiniSEED, SAC, SEG-Y, SH Q/ASCII, SEISAN, IRIS TSPAIR & SLIST, WAV
• Inventory data: Dataless SEED, XML-SEED, RESP
• Data request clients: ArcLink/WebDC, IRIS DHI/Fissures, SeisHub, IRIS Web Service, NERIES Web Service
• Signal processing: filters, triggers, instrument correction, rotation, array analysis, beamforming
• Plotting: spectrograms, beachballs and waveforms
• Waveform indexer
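The modular, per-format architecture listed above can be sketched as a small reader registry with a single dispatching read() entry point. All names below are simplified, hypothetical stand-ins for illustration, not ObsPy's actual internals:

```python
# Sketch of a plugin-style format registry: each file-format reader
# registers itself, and read() dispatches on the requested format.
READERS = {}

def register(fmt):
    """Decorator registering a reader function for one file format."""
    def wrap(func):
        READERS[fmt] = func
        return func
    return wrap

@register("MSEED")
def read_mseed(path):
    # placeholder: a real reader would parse the file into Trace objects
    return {"format": "MSEED", "path": path}

@register("SAC")
def read_sac(path):
    return {"format": "SAC", "path": path}

def read(path, fmt):
    """Dispatch to the reader registered for the given format."""
    try:
        return READERS[fmt](path)
    except KeyError:
        raise ValueError("unsupported format: %s" % fmt)

stream = read("event.mseed", "MSEED")
```

New formats are added by registering another reader, without touching the dispatch code, which is the extensibility property the slide claims.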
ObsPy
• Open source (LGPL or GPL)
• 6 core developers
• Platform independent (Win, Mac, Linux) and tested
• Test-driven development (TDD), currently ~500 unit tests (http://tests.obspy.org)
• Reliance on well-known third-party libraries (numpy, scipy, matplotlib)
• Reusing well-established code, e.g. libmseed, GSE UTI
• Automated build of API documentation from source
• Binary distributions: PyPI (http://pypi.python.org) & Windows installer
• Source code & community webpage containing tutorials, installation instructions, ticket system, mailing lists
http://www.obspy.org
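The TDD practice mentioned above comes down to small, isolated unit tests; a minimal example in the style of a standard-library unittest suite (the tested helper is a hypothetical stand-in, not real ObsPy code):

```python
# A minimal unit test as used in test-driven development: define the
# expected behaviour first, then make the implementation satisfy it.
import io
import unittest

def seconds_between(start, end):
    """Toy helper under test: length of a time window in seconds."""
    return end - start

class WindowTestCase(unittest.TestCase):
    def test_positive_window(self):
        self.assertEqual(seconds_between(10.0, 40.0), 30.0)

    def test_empty_window(self):
        self.assertEqual(seconds_between(5.0, 5.0), 0.0)

suite = unittest.TestLoader().loadTestsFromTestCase(WindowTestCase)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```

A continuous test server such as the one linked on the slide simply runs suites like this on every change and reports failures.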
Databases and Archiving
Problem:
• Real-time systems are often too restrictive to do "advanced science"
• Common databases are not flexible enough to support fast-developing, parameter-rich fields in the geosciences (e.g., volcanology, computational seismology, local observatories)
• Missing multi-user or inter-institutional links between different databases
➯ Our solution: SeisHub
SeisHub
Native, document-centric XML database
• RESTful web service (HTTP, HTTPS)
• Standard relational database as back-end
• Both worlds: SQL for querying and manipulating data, plus any standard connected to XML, e.g. XSLT or XSD
• Not restricted to seismology at all
Extended to a "classical" seismic database:
• Index of local, file-based waveform archive (MiniSEED, GSE2, SAC, ...)
• Meta data: gaps, overlaps, quality and timing information
• Waveform previews (30 s)
• XML resource types for handling inventory data (XML-SEED) and events (QuakeML based)
• Remote waveform access (ArcLink)
SeisHub
Technical details
• Python-based, standalone web service
• Platform independent, open source (GPL)
• Implementation of various web protocols, like HTTP, SSH, SFTP
• Plug-in architecture: dynamic discovery and loading of modules and support for Python .egg files

Development remarks:
• Test-driven development; ca. 250 test cases so far
• Well-documented source code
• Subversion
• Trac: ticket system and project wiki
SeisHub
Database design: data storage
• Primary data → file system
  • Continuous waveform archive (MiniSEED, GSE2, SAC, ...)
  • Other data (GeoTIFF, GPS time series, etc.) via the file system
• Meta data → web service on top of an XML/relational database hybrid
  • Data is packed into an XML document → the data structure lives within the document; no need for a predefined database schema
  • XML resources are archived in a BLOB field
  • Only searchable values are indexed
  • Pointers to primary data
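The hybrid storage idea above can be sketched with sqlite: the XML resource goes into a BLOB column unchanged, while a few searchable values are extracted into an index table. Table, column and element names here are made up for illustration:

```python
# Hybrid XML/relational storage: full document as BLOB, searchable
# values duplicated into an index table for querying.
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (id INTEGER PRIMARY KEY, xml BLOB)")
conn.execute("CREATE TABLE idx (resource_id INTEGER, key TEXT, value TEXT)")

doc = "<event><magnitude>4.2</magnitude><region>Bavaria</region></event>"

# store the full document unchanged ...
cur = conn.execute("INSERT INTO resources (xml) VALUES (?)", (doc,))
rid = cur.lastrowid

# ... and index only the searchable values extracted from it
root = ET.fromstring(doc)
for key in ("magnitude", "region"):
    conn.execute("INSERT INTO idx VALUES (?, ?, ?)",
                 (rid, key, root.findtext(key)))

# query by indexed value, get the original XML document back
row = conn.execute("""SELECT r.xml FROM resources r
                      JOIN idx i ON i.resource_id = r.id
                      WHERE i.key = 'region' AND i.value = 'Bavaria'"""
                   ).fetchone()
```

Because the structure lives inside the document, adding a new field to future documents needs no schema migration, only (optionally) a new index.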
SeisHub
Database design: data access
• HTTP/HTTPS: REST web service
  • XML documents have a fixed resource identifier (URL)
  • Data transformation via XSL stylesheets on request (?output=...)
  • Data validation via schemas (XML Schema, RelaxNG, Schematron) on resource upload
  • Document properties such as related meta data or indexes
• SFTP: XML documents mapped into a virtual file system
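A client addresses such a resource by its fixed URL, optionally appending the ?output=... transformation parameter. The path layout below follows the PUT example later in this document; the host name and stylesheet name are assumptions:

```python
# Build the fixed REST URL of an XML resource, with an optional
# ?output=... parameter selecting a server-side stylesheet.

def resource_url(host, package, resource_type, name, output=None):
    """Compose http://host/xml/<package>/<type>/<name>[?output=...]."""
    url = "http://%s/xml/%s/%s/%s" % (host, package, resource_type, name)
    if output:
        url += "?output=" + output
    return url

url = resource_url("dummy.geophysik.uni-muenchen.de:8080",
                   "seismology", "event", "quake_2010_001",
                   output="stylesheet1")
```

The same URL without the parameter returns the stored XML verbatim; with it, the server applies the named XSL transformation before responding.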
SeisHub
Database design: indexing & searching
• Indexing
  • Generated using an XPath expression, a type and additional options
  • Simple creation + reindexing via web interface
  • Various built-in types (datetime, boolean, numeric, double, float, etc.)
  • ProcessorIndex: custom processing
• Searching
  • XPath-like query on the XML catalog object (restricted to indexes)
  • SQL on the database object
• Mapper: predefined queries & output format bound to a fixed URL
• FileSystemResource: integrates a file system directory (read only)
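Index generation from an XPath expression plus a type, as described above, can be sketched with ElementTree's limited XPath support and a plain dict standing in for the index table; all names and documents are illustrative only:

```python
# Build a typed index over a collection of XML documents by evaluating
# one XPath-like expression per document.
import xml.etree.ElementTree as ET

def build_index(documents, xpath, cast=str):
    """Map each document id to the value found at `xpath`,
    converted via `cast` (the index's declared type)."""
    index = {}
    for doc_id, xml in documents.items():
        node = ET.fromstring(xml).find(xpath)
        if node is not None and node.text:
            index[doc_id] = cast(node.text)
    return index

docs = {
    1: "<event><origin><lat>48.15</lat></origin></event>",
    2: "<event><origin><lat>46.20</lat></origin></event>",
}
lat_index = build_index(docs, "origin/lat", cast=float)

# a search restricted to indexes is then a lookup over these values
hits = [i for i, v in lat_index.items() if v > 47.0]
```

Reindexing after a schema change is just rerunning build_index with a new expression; the stored documents themselves stay untouched.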
SeisHub
Advantages
Technical:
• Sharing data over the network without firewall problems (HTTP/HTTPS)
• License free, open source, internet standards
• Platform independent; the most basic client is a standard browser
• XML:
  • Data validation on upload (XML schemas)
  • Data transformation on request (XML stylesheets)
  • Querying: SQL or XPath

Scientist:
• May modify their data, provided as XML documents, at any time without corrupting the underlying database
• May dynamically add or delete search indexes, schemas and stylesheets
# -*- coding: utf-8 -*-
import httplib
import os
import glob
import base64
from obspy.core import UTCDateTime

USER = "xxxxx"
PASS = "xxxxx"
SERVERNAME = 'dummy.geophysik.uni-muenchen.de:8080'
URL = '/xml/seismology/event/%s?format=seiscomp'

now = UTCDateTime()
auth = 'Basic ' + (base64.encodestring(USER + ':' + PASS)).strip()
filelook = "/home/seisdata/.seiscomp3/log/events/%04d/%02d/*/*/*.last.xml" % \
    (now.year, now.month)
#filelook = "/home/seisdata/.seiscomp3/log/events/%04d/*/*/*/*.last.xml" % (now.year)

for filename in glob.glob(filelook):
    # read data + remove namespace because SeisHub doesn't like namespaces
    data = open(filename, 'rt').read()
    data = data.replace(
        'xmlns="http://geofon.gfz-potsdam.de/ns/seiscomp3-schema/0.6"', '')
    # construct and send the request
    webservice = httplib.HTTP(SERVERNAME)
    webservice.putrequest("PUT", URL % (os.path.basename(filename)))
    webservice.putheader('Authorization', auth)
    webservice.putheader("Host", "localhost")
    webservice.putheader("User-Agent", "Python PUT")
    webservice.putheader("Content-type", "text/xml; charset=\"UTF-8\"")
    webservice.putheader("Content-length", "%d" % len(data))
    webservice.endheaders()
    webservice.send(data)
    # get the response
    statuscode, statusmessage, header = webservice.getreply()
Import of QuakeML via XSLT
Use case (specific): e-Tomo Europe
Evolutionary tomography of the contiguous European continent. "Evolutionary" means (dis-)continuous improvement of tomographic model(s) using waveform inversion as more receiver stations and earthquakes become available.
e-Tomo workflow (1/2)
• Retrieve earthquake data from archive in near real time (ObsPy)
• Retrieve earthquake source parameters (or estimate)
• Run forward simulations with various solvers and models (axisem, ses3d, seissol)
• Store real and synthetic data; visualization (e.g., SeisHub, Google Earth)
• System for interactive and automatic data mining (ObsPy/SeisHub) – phase, amplitude fits:
  • choose different simulations (method, earth model)
  • apply interactively adjustable filters to simulations, apply measures of coherence and display them graphically
  • interactive selection of a specific region and display of synthetics
• Graphical representation of phase and amplitude information (e.g., GIS, Google Earth)
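The workflow steps above can be sketched as a chain of stage functions over a shared context; every stage name mirrors a box on the slide, and all implementations are placeholder stubs, not the real ObsPy/SeisHub calls:

```python
# Toy pipeline: each stage reads from and writes into a shared dict,
# mirroring the data flow between the workflow boxes.

def retrieve_waveforms(ctx):
    ctx["waveforms"] = "archive data (via ObsPy)"

def retrieve_source_parameters(ctx):
    ctx["source"] = "event parameters (or estimate)"

def run_forward_simulation(ctx):
    ctx["synthetics"] = "synthetic data (axisem/ses3d/seissol)"

def store_and_visualize(ctx):
    ctx["stored"] = True

def data_mining(ctx):
    ctx["fits"] = "phase & amplitude fits"

PIPELINE = [retrieve_waveforms, retrieve_source_parameters,
            run_forward_simulation, store_and_visualize, data_mining]

context = {}
for stage in PIPELINE:
    stage(context)
```

Structuring the workflow this way lets individual stages (e.g., a different forward solver) be swapped without touching the rest of the chain.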
e-Tomo workflow (2/2)
• Recalculation of all earthquake source parameters with the best model
• Subsurface model updates, new versions
• Quantifiable quality improvement
• Event database with improved locations, STFs and mechanisms
• If applicable: calculation of kernels for new data (receivers or events)
END
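The iterative part of this workflow amounts to updating the subsurface model while the quality improvement stays quantifiable, i.e. above some threshold. A toy sketch with invented misfit numbers (the stopping criterion and values are assumptions, not from the source):

```python
# Iterate model updates until the misfit improvement falls below a
# threshold; MISFITS holds toy per-version misfit values.

def update_model(model_version, misfits):
    """One update step: next model version and its (toy) misfit."""
    return model_version + 1, misfits[model_version + 1]

MISFITS = [1.00, 0.60, 0.45, 0.42, 0.41]  # invented misfit per version
THRESHOLD = 0.05                          # minimum improvement to continue

version, misfit = 0, MISFITS[0]
while version + 1 < len(MISFITS):
    new_version, new_misfit = update_model(version, MISFITS)
    if misfit - new_misfit < THRESHOLD:   # improvement no longer quantifiable
        break
    version, misfit = new_version, new_misfit
```

With these toy numbers the loop stops after version 2, where the improvement drops to 0.03 (below the 0.05 threshold).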
Vision:
Observatory scale:
• Connecting different observatories together (client/server application)
• Developing more web services (e.g., quality control, array techniques etc.) using ObsPy as the underlying signal toolbox

Computational seismology:
• Flexible storage of ALL simulation-relevant data in a database (e.g., earth model, code used)
• Storage of both synthetic and real data in the same archive; searchable fields for the methods and models mentioned above
• Web service for estimating different scenarios (e.g., PGV for different regions), attached to a GIS
• Storage and interactive search of synthetic seismograms of "virtual stations" up to a supported precision
ObsPy:
• No direct funding! Partial funding out of VERCE
• Open source project: small entry barriers; make people feel responsible for modules; focus on students/young researchers
• Problem driven
• Automated testing increases productivity and sustainability
• More active input from the community is needed.

SeisHub:
• No funding! At present no active developer working on it. Partial funding out of VERCE possible.
• The already strong connection to the Geophysical Observatory will guarantee sustainability, at least at a lower level.
• Open source project! Documentation and active involvement of the community needed