Evolving inversion methods in Geophysics with Cloud Computing – a case study of an eScience
collaboration
Mudge, Chandrasekhar, Heinson, Thiel
Prof J Craig Mudge FTSE
University of Adelaide, Australia
School of Computer Science / School of Earth Sciences
7th IEEE eScience Conference, Stockholm, December 2011
Two South Australian successes in geology
1. Hot rocks for geothermal energy - 95% of investment is in South Australia
2. Olympic Dam - BHP Billiton - world's fourth largest copper deposit, fifth largest gold deposit and the largest uranium deposit.
[email protected] IEEE eScience 2011
Outline
1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal – EGS in South Australia
5. Results and Lessons learned
6. Future work
Cloud service provider owns and operates the infrastructure
and innovates to keep technology leading edge, handle software upgrades, and
steadily reduce energy costs
Google, The Dalles, Oregon
Microsoft Azure, Chicago
A no-machines Lab
eScience enabled by cloud computing
Seed funding from
-- Department of Mines www.pir.sa.gov.au
-- MSFT Research Jim Gray Seed Grant
Started June 2010
Our three cloud service providers
1. Amazon Web Services
2. Microsoft Azure
Now adding government-funded eResearch clouds, which will run OpenStack (NASA and Rackspace)
Magnetotelluric (MT) imaging
1. Using the magnetic and electric fields of the earth, MT imaging determines the resistivity structure of a sub-surface area of interest.
2. It goes deeper (a hundred or so km) than seismic (<2 km) but does not have the same resolution.
3. Applications
   1. mineral exploration
   2. water management in mining
   3. geothermal exploration
   4. carbon storage
   5. aquifer research and management
   6. earthquake and volcano studies
CO2 in depleted gas field
(Heinson and Mudge, 2010)
Data logging by University of Adelaide Geophysics on a geothermal site – Paralana, SA, Australia
[Flowchart: inversion loop. Steps: start; compute model's MT response; compare model response to observed data; compute sensitivity matrix; locally improve model misfit; locally improve model smoothness; finish. Yes/no decisions: required misfit? can locally improve misfit? smooth enough? can locally improve smoothness? > max iterations?]
Inversion iterations: compute model response, compare with observed data
Searching the solution space
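The iteration above (compute the model response, compare it with the observed data, update, stop at the required misfit or the iteration cap) can be sketched on a toy problem. This is a minimal illustration only, not the inversion code used in the project: the linear forward model `G`, the plain gradient-descent update, and the `step` and `smooth` parameters are all assumptions. In this linear toy case `G` also plays the role of the sensitivity matrix.

```python
# Toy sketch of an iterative inversion loop (illustrative assumptions only:
# a linear forward model and a gradient-descent update; real MT inversion
# uses a nonlinear forward solver and a more elaborate update).

def forward(G, m):
    """Compute the model response d_pred = G m."""
    return [sum(g * x for g, x in zip(row, m)) for row in G]

def rms_misfit(d_pred, d_obs):
    """Root-mean-square difference between model response and observed data."""
    n = len(d_obs)
    return (sum((p - o) ** 2 for p, o in zip(d_pred, d_obs)) / n) ** 0.5

def invert(G, d_obs, m0, target=1e-3, max_iter=5000, step=0.05, smooth=0.0):
    """Iterate until the required misfit is reached or max_iter is exceeded."""
    m = list(m0)
    for it in range(max_iter):
        d_pred = forward(G, m)                 # compute model response
        misfit = rms_misfit(d_pred, d_obs)     # compare with observed data
        if misfit <= target:                   # required misfit reached?
            return m, misfit, it
        resid = [p - o for p, o in zip(d_pred, d_obs)]
        # Gradient of 0.5*||G m - d||^2 is G^T r (G acts as the sensitivity matrix).
        grad = [sum(G[i][j] * resid[i] for i in range(len(G)))
                for j in range(len(m))]
        # Gradient of the roughness penalty 0.5*smooth*sum((m[j+1]-m[j])^2).
        for j in range(len(m) - 1):
            diff = m[j + 1] - m[j]
            grad[j] -= smooth * diff
            grad[j + 1] += smooth * diff
        m = [x - step * g for x, g in zip(m, grad)]  # local improvement step
    return m, rms_misfit(forward(G, m), d_obs), max_iter

if __name__ == "__main__":
    G = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    d_obs = [2.0, 3.0, 5.0]                    # generated from m = [2, 3]
    model, misfit, iters = invert(G, d_obs, [0.0, 0.0], target=1e-6, step=0.2)
    print(model, misfit, iters)
```

Setting `smooth` above zero trades data fit for a smoother model, which mirrors the misfit and smoothness branches of the flowchart.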
Performance analysis beyond speedup
Sequential
Parallel
Examples of recent performance analysis
1. The effect of the FORTRAN compiler and its optimisation settings has been worth exploring: a 3X speedup from the Intel Visual Fortran Composer XE 2011 for Windows.
2. “Steal time” - time lost due to the hypervisor’s management of a virtual machine – Netflix have analysed their Amazon experience extensively.
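The quantities on this slide can be computed with a few lines of code. The sketch below is a hedged illustration, not the tooling used in the project: the function names are hypothetical, and `steal_fraction` assumes a Linux guest whose `/proc/stat` "cpu" line carries the steal counter as its eighth value.

```python
# Hypothetical helpers for basic performance analysis on a cloud VM.

def speedup(t_sequential, t_parallel):
    """Speedup = sequential run time / parallel run time."""
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, workers):
    """Parallel efficiency = speedup / number of workers."""
    return speedup(t_sequential, t_parallel) / workers

def steal_fraction(before, after):
    """Fraction of CPU time stolen between two 'cpu' lines from /proc/stat.

    Assumed field order after the 'cpu' label:
    user nice system idle iowait irq softirq steal guest guest_nice.
    """
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    delta = [y - x for x, y in zip(b, a)]
    total = sum(delta[:8])   # user..steal; guest time is already counted in user
    return delta[7] / total if total else 0.0

if __name__ == "__main__":
    print(speedup(120.0, 30.0), efficiency(120.0, 30.0, 8))
    before = "cpu 100 0 50 800 10 0 0 40 0 0"
    after = "cpu 300 0 150 1440 20 0 0 90 0 0"
    print(steal_fraction(before, after))
```

A steal fraction that stays high during a run suggests the measured speedup is being limited by the hypervisor rather than by the program.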
Results and learnings
1. “No-machines” works
2. Speedup has led to 100% adoption in MT research
3. First results of monitoring fluid injection in EGS reservoirs using magnetotellurics (MT) – promising since seismic does not indicate fluid flow, and MT is low cost
4. Taking chunks of FORTRAN is achievable in a timely manner
5. Capability building – a true eScience partnership
6. Our Web Services user interactions took the same amount of programming effort as parallelising
eScience in the cloud - observations of a veteran of the computer industry (my own views, not those of my co-authors in this eScience paper)
1. Web Services (giving interoperability of historic proportion between disparate services) could have been adopted faster in eScience
2. Cloud computing will speed up the use of web services, because the cloud makes it natural to interact using web services (service orientation, discovery, interoperability)
Lessons learned – HPC programming
1. MapReduce (Hadoop) is the programming model that best matches the data centre as the computer. However, because it requires rewriting existing programs, the first wave of benefits comes from simpler parallelism – parameter sweeps, Monte Carlo simulation, job-level parallelism, etc.
2. The second wave of benefits will be new algorithms and rewrites using MapReduce
3. Nevertheless, the first wave in cloud-based bioinformatics (matching short reads against a reference genome) did use MapReduce
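A parameter sweep is the simplest of these forms of parallelism: the existing program runs unchanged, once per parameter value, and the runs are independent. The sketch below is an assumption-laden illustration. `run_inversion` is a hypothetical stand-in for one complete run, its misfit curve is invented, and a local thread pool stands in for what would be separate cloud instances or jobs.

```python
from concurrent.futures import ThreadPoolExecutor

def run_inversion(smoothing):
    """Stand-in for one independent inversion run; returns (parameter, misfit).

    The misfit formula is a made-up curve with its minimum at 0.3.
    """
    misfit = (smoothing - 0.3) ** 2 + 0.1
    return smoothing, misfit

def sweep(values, workers=4):
    """Run one job per parameter value and return the best (parameter, misfit)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order; the jobs share no state.
        results = list(pool.map(run_inversion, values))
    return min(results, key=lambda r: r[1])

if __name__ == "__main__":
    print(sweep([0.1, 0.2, 0.3, 0.4, 0.5]))
```

Because no job depends on another, the same pattern scales by adding machines, with no rewrite of the underlying program.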
Lessons learned - Azure
1. Why was Azure much harder to migrate to than predicted?
Answer:
- We came from a non-.NET environment
- Azure younger than Amazon (2 years)
- Virtual Machine in Beta
- Deployment times of 20 minutes vs 20 seconds slow debugging
- Azure designed for long-running applications, e.g., ecommerce, more than for scientific
2. However, we persist.
- Warehouse-sized data centre – operating system is robust and rich, e.g., hot swap of patches
- Benefits of
Future work 1 of 2
1. Inversion on demand, available to colleagues and explorers world-wide, wrapped in workflow (persistence, provenance, partial runs, ...)
2. National/international collaboration building on a national Geophysics Virtual Lab
- access to disparate data (seismic, borehole images, gravity, magnetic, ...) built by AuScope using results of the GeoSciML Interoperability Working Group
[Diagram: Integrated Virtual Labs. Societal need: Sustainable Energy Policy. Integrated virtual laboratories: Energy Exploration Integrated Virtual Laboratory, Environment Virtual Laboratory. Virtual laboratories: Virtual Geophysical Laboratory, National Borehole Laboratory, Virtual Geodesy Laboratory, Virtual Earth Observation Laboratory, Virtual Oceans Laboratory. Virtual libraries: Geophysics, Borehole, Geodesy, Land cover, Marine, each with Processing Services, Data and Middleware. Modelling & analytic tools.]
Dr Robert Woodcock and Dr Lesley [email protected]
Future work 2 of 2
3. Explore statistical machine learning to detect interesting patterns
4. Exploring solution space using Evolutionary Algorithms implemented on thousands of processors in the cloud (Brad Alexander)
5. Promulgate security best practices
6. Following the success of speedup, model size has become the limiter for our geophysicists
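The evolutionary-algorithm idea in item 4 can be illustrated with a small elitist evolution strategy. This is a sketch under stated assumptions, not the planned implementation: the `misfit` function is a toy that compares directly against a known `target` model (a real run would compare the model response with observed data), and the population size, mutation width `sigma` and seed are arbitrary choices.

```python
import random

def misfit(model, target):
    """Toy misfit: squared distance to a known target model."""
    return sum((m - t) ** 2 for m, t in zip(model, target))

def evolve(target, pop_size=20, generations=100, sigma=0.3, seed=0):
    """Elitist (mu + lambda) search: keep the best of parents plus mutated children."""
    rng = random.Random(seed)
    n = len(target)
    pop = [[rng.uniform(-5, 5) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Each child is a Gaussian perturbation of one parent.
        children = [[g + rng.gauss(0, sigma) for g in parent] for parent in pop]
        # Parents compete with children, so the best misfit never gets worse.
        pop = sorted(pop + children, key=lambda m: misfit(m, target))[:pop_size]
        history.append(misfit(pop[0], target))
    return pop[0], history

if __name__ == "__main__":
    best, history = evolve([1.0, -2.0, 0.5])
    print(best, history[-1])
```

Each candidate's misfit is evaluated independently of the others, which is why this style of search maps naturally onto many processors.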
Acknowledgements
Brad Alexander
Gordon Bell
Pinaki Chandrasekhar
Dennis Gannon
Graham Heinson
Tony Hey
Ed Lazowska
Stephan Thiel
Summary
1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal – EGS in South Australia
5. Lessons learned
6. Future work
Thanks and questions
www.cloudinnovation.com.au
+61 417 679 266
+1 650 224 2111
Security best practices
1. Certifications
2. Physical security
3. Secure services
4. Data privacy via encryption
5. Backups
6. Constant monitoring
7. External review
8. Compare yours with Google, Amazon, Azure