Tackling the Big Data Challenges in E&P
Source: c311515.r15.cf1.rackcdn.com/teradata.pdf
TRANSCRIPT
Teradata Confidential
What if you could…
• perform all E&P analytical activities through a web browser?
• work collaboratively on a single instance of the data?
• guide the science rather than drive the PC?
• guarantee data custodianship based on standards?
• know who knew what, and when?
[Figure labels: Seismic Volumes + Reservoir Models + Well Sensors]
What is inhibiting this?
• perform all E&P analytical activities through a web browser?
• work collaboratively on a single instance of the data?
• guide the science rather than drive the PC?
• guarantee standards-based data custodianship?
• know who knew what, and when?
• browser capabilities
• network bandwidth
• data access
• data volumes
• compute loads
• poor data model/structures
• 70% of time managing data
• analytical compartmentalisation
• application-centric view
• file/transfer formats rule
• no community ownership
• no granularity
• temporally-enabled data governance
What is out there to help?
RDBMS
appliances
Data Warehouse
“High Performance XXXX”
“Big Data Solution”
Why can’t I make sense of it all?
RDBMS
appliances
Data Warehouse
“High Performance XXXX”
“Big Data Solution”
• “I am old”
• “I am an enterprise architect, not a web programmer!”
• “I am a web programmer, not an enterprise architect!”
• “I see red and twitch involuntarily whenever I hear Big Data”
• I have all of these things and I’m still confused
Even with all these pieces, no one has yet managed to bring all necessary data to bear on business questions in a useful timeframe.
Am I building it properly? (A lesson from history)
[Chart: growth from 1980 to 2010, on a logarithmic axis from 10^3 to 10^15, of subsurface volumes (bytes), disk capacity (bytes), network speed (bps), transistors on a CPU, interconnect speed (bps) and data transfer (bps).]
Am I building it properly?
[Chart: subsurface volumes (bytes) compared with data transfer (bps).]
Questioning power is not keeping up with the potential value of the answer.
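A back-of-envelope sketch makes the gap concrete: how long does it take simply to move a subsurface volume before anyone can ask a question of it? The 1 PB volume and the sustained 10 Gbps link are illustrative assumptions, not values read from the chart, and `transfer_days` is a hypothetical helper.

```python
def transfer_days(volume_bytes: float, link_bps: float) -> float:
    """Days needed to move volume_bytes over a link of link_bps (bits per second)."""
    seconds = volume_bytes * 8 / link_bps  # bytes -> bits, then divide by link rate
    return seconds / 86_400                # seconds in a day


# Assumed figures for illustration only: 1 PB over a sustained 10 Gbps link.
days = transfer_days(1e15, 10e9)
print(f"{days:.1f} days")
```

Under these assumptions the transfer alone takes a little over nine days, before any analysis starts.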
Application-centric approach
The quest to integrate decision making led to:
  • Application Service Provision
  • Collaborative Visualisation
  • Virtualized workstations
  • Proprietary file formats
• No long-term stewardship of:
  • File formats
  • Decision milestones
  • Data dependencies
  • Information management
Analytical Compartments v Data Flow
[Figure labels, top to bottom: Action, Decision, Insight, Knowledge, Information, Data]
[Figure labels: Wells, Seismic, Borehole Sensors, Subsurface Models]
[Figure labels: Decision Support, Knowledge Discovery, Data Retention]
E&P Data Usage Modes
Caveat: These numbers are half-baked estimates. Error is ~50%.
[Bar chart: Subsurface Data Volumes (Petabytes), axis 0 to 50, by company: Shell, BP, Aramco, Qatar O&G, Statoil, Total, Repsol, Gazprom, Maersk, Centrica, ENI, GDF-Suez, British Gas, NPD, DONG, KOC, Lukoil, OMV, RWE, Gupco. Legend: Could be "Operationalized"; Could be used in "Knowledge Discovery"; Data of low business value.]
[Figure labels: Decision Support, Knowledge Discovery, Data Retention]
The E&P analytical landscape
[Chart: Exploration, Logistics, Production, Distribution, Core and Borehole, and Refining plotted by Data Latency (years, months, days, hrs) against Data Volume / yr (Gigabytes, Terabytes, Petabytes).]
The E&P analytical landscape
Many other industries are curing their “big data” problems:
• Analytical Integration
• Massive data volumes
• Mixed workloads
• Query Concurrency
Approximate volumes in Terabytes:
• Production Sensor: 500
• Production Seismic: 2000
• Exploration Seismic: 5000
• Derived and Duplicated Seismic: 5000
• Geological Interpretation: 100
• Reservoir Models: 1000
• Asset and Logistics: 500
• Trading: 50
• Retail and Marketing: 10
• ERP: 100
• Other: 660
[Figure labels: Exploration, Logistics, Production, Distribution, Refining]
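The approximate volumes quoted on this slide can be tallied to see how much of the landscape is seismic. The sketch below only sums the slide's own figures; the variable names are illustrative, and the "seismic" grouping (any category with "Seismic" in its name) is an assumption made for this example.

```python
# Approximate volumes in terabytes, copied from the slide.
volumes_tb = {
    "Production Sensor": 500,
    "Production Seismic": 2000,
    "Exploration Seismic": 5000,
    "Derived and Duplicated Seismic": 5000,
    "Geological Interpretation": 100,
    "Reservoir Models": 1000,
    "Asset and Logistics": 500,
    "Trading": 50,
    "Retail and Marketing": 10,
    "ERP": 100,
    "Other": 660,
}

total_tb = sum(volumes_tb.values())
# Assumed grouping: every category whose name contains "Seismic".
seismic_tb = sum(v for name, v in volumes_tb.items() if "Seismic" in name)
seismic_share = seismic_tb / total_tb

print(f"Total: {total_tb} TB")
print(f"Seismic: {seismic_tb} TB ({seismic_share:.0%})")
```

On these figures the three seismic categories account for roughly four fifths of the 14,920 TB total.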
Integrated Operations Analytics: Wellfield Data Warehousing in a supermajor
Current approach
• Carry out $100M+ seismic survey every three years to re-image producing reservoir
• Spend 2-3 months reprocessing data and reincorporate into workflow
• Hope geoscientists and engineers can control reservoir flows at the weekly scale based on imaging from the year+ timeframe
Ideal approach
• Install seafloor seismic imaging array and stimulate with in-reservoir tectonic events, and supply-vessel based airguns
• Spend 2-3 days reprocessing data and reincorporate into workflow
• Allow geoscientists and engineers to respond to HSE and production issues to see how the reservoir is evolving in a right-time manner!
Shortened Timeframe
How to do a rapid comparison between “new” and everything ever seen, and make rapid decisions on it.
[Figure labels: Decision Support, Knowledge Discovery, Data Retention]
Best of Breed Big Data Architecture
MR/Hadoop (~5 concurrent users)
• Fast Loading
• Data Assimilation
• Online Archival
SQL-MR (~25 concurrent users)
• Data Discovery
• Knowledge Generation
• Pattern Detection
Data Warehouse (~100+ concurrent users)
• Decision Support
• Data Dependencies
• Predictive Analytics
ETL
[Figure labels: Decision Support, Knowledge Discovery, Data Retention]
[Figure labels: Seismic Imaging, Seismic Modelling, Reservoir Characterisation, Reservoir Modelling, Operations, Reservoir Monitoring]
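The three tiers can be written down as a small lookup, which makes the division of labour explicit. This is a minimal sketch: the `Tier` class and `tier_for` function are hypothetical names for illustration, and only the tier names, concurrency figures and workloads come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    concurrent_users: str  # quoted as on the slide, e.g. "~25"
    workloads: tuple


TIERS = (
    Tier("MR/Hadoop", "~5",
         ("Fast Loading", "Data Assimilation", "Online Archival")),
    Tier("SQL-MR", "~25",
         ("Data Discovery", "Knowledge Generation", "Pattern Detection")),
    Tier("Data Warehouse", "~100+",
         ("Decision Support", "Data Dependencies", "Predictive Analytics")),
)


def tier_for(workload: str) -> Tier:
    """Return the tier that lists the given workload; raise KeyError if none does."""
    for tier in TIERS:
        if workload in tier.workloads:
            return tier
    raise KeyError(f"no tier lists workload {workload!r}")


print(tier_for("Pattern Detection").name)
```

A real deployment would route on far more than a workload label; the point here is only that each workload on the slide belongs to exactly one tier.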
MapReduce/Hadoop
• Is not a database
  • HDFS
  • No schema, indexes, optimizer
  • No high availability, security
• Not high performance
• Not a data warehouse
  • No integrated data, no history
  • Severe data skew
• Not mature technology
  • Early open source
  • A few ISV tools integrating
  • Single points of failure
• Not a cloud technology, per se
Uses
• Seismic Processing
• Trace Sorting
• 1D/2D filtering and transformation
• Online Seismic Archiving
• Repurposing WITSML and other sensor feeds
It is a great place to “get to know” your data.
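Trace sorting is a natural fit for the map/shuffle/reduce shape. The sketch below imitates that shape in a single Python process; it does not use the Hadoop API. The record fields `cdp` and `offset`, and all function names, are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_trace(trace):
    """Map step: emit (gather key, trace). The key is the assumed 'cdp' field."""
    yield trace["cdp"], trace


def reduce_gather(cdp, traces):
    """Reduce step: order the traces of one gather by the assumed 'offset' field."""
    return cdp, sorted(traces, key=lambda t: t["offset"])


def sort_traces(traces):
    # Shuffle: group mapped records by key, as a framework would do
    # between the map and reduce phases.
    groups = defaultdict(list)
    for trace in traces:
        for key, value in map_trace(trace):
            groups[key].append(value)
    return [reduce_gather(cdp, groups[cdp]) for cdp in sorted(groups)]


# Tiny made-up input: four traces belonging to two gathers.
shots = [
    {"cdp": 2, "offset": 300, "samples": [0.1, 0.2]},
    {"cdp": 1, "offset": 200, "samples": [0.0, 0.3]},
    {"cdp": 2, "offset": 100, "samples": [0.4, 0.1]},
    {"cdp": 1, "offset": 100, "samples": [0.2, 0.2]},
]
for cdp, gather in sort_traces(shots):
    print(cdp, [t["offset"] for t in gather])
```

Each gather is independent of the others, which is what lets the reduce step run in parallel across a cluster.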
SQL-MapReduce
Uses:
• Pattern matching
• Feature extraction
• “Spot the difference”
• Statistical investigations:
  • Clustering
  • Likelihoods
  • Associations
• “Fail fast” hypothesis reduction:
  • Seismic Modelling
  • Reservoir Modelling frameworks
• Is a database
  • MPP database
  • schema, indexes, optimizer
  • security
• Is high performance
• Not a data warehouse
  • No integrated data but it can talk to a DW via SQL
• Not mature technology
  • A few ISV tools integrating
  • Single points of failure
Can enable AaaS
How do I describe a pattern in SQL-MR?
How do I find everything that “looks like” a channel in my 10 Tb 3D (inc. pre-stack) image volume?
Simple partitioning: in-trace (vertical) analytics and adjacent trace analytics
Broader pattern matching: how do I state spatial relationships, meso-scale textures? Easy with SQL-MR!
[Figure: 2D profile through a 3D volume showing a cross-section across a filled channel.]
1. Capping boundary: Strong, broadly convex reflection (nPath)
2. Basal boundary: Very strong, grossly concave reflection with local minima (nPath)
3. Overburden facies: laterally moderately continuous with high amplitude stratigraphy (nPath)
4. Incised facies: laterally continuous (low variance) and high amplitude stratigraphy (nPath)
5. Internal facies: moderate variance, poor lateral continuity and steeply-dipping beds (statistical description)
The Power of SQL-MR: Java MR functions perform the standard textural descriptions (reflectivity, variance, etc.) and SQL asks the questions: above, inside, below.
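The split between "describe" and "ask" can be sketched in plain Python. This is not SQL-MR or the nPath function itself: it only mimics the idea of computing simple textural descriptors per window and then matching an ordered pattern (capping boundary above internal facies above basal boundary) down a single trace. The thresholds, the symbol letters and the function names are all assumptions made for illustration.

```python
import re
from statistics import mean, pvariance


def describe(window):
    """Textural descriptors for one window of amplitudes."""
    return {
        "amplitude": mean(abs(a) for a in window),
        "variance": pvariance(window),
    }


def symbol(desc):
    """Classify a window. Thresholds are hypothetical, chosen for this toy data."""
    if desc["amplitude"] >= 0.9:
        return "B"  # very strong reflection: basal boundary candidate
    if desc["amplitude"] >= 0.6:
        return "C"  # strong reflection: capping boundary candidate
    if desc["variance"] >= 0.02:
        return "I"  # moderate variance: internal facies candidate
    return "O"      # anything else


def find_channels(trace, size=4):
    """Return the symbol string and the (first, last) window indices of each match."""
    windows = [trace[i:i + size] for i in range(0, len(trace) - size + 1, size)]
    symbols = "".join(symbol(describe(w)) for w in windows)
    # Ordered pattern: capping boundary, one or more internal windows, basal boundary.
    matches = [(m.start(), m.end() - 1) for m in re.finditer(r"CI+B", symbols)]
    return symbols, matches


# Made-up trace, shallow to deep: background, cap, two internal windows, base, background.
trace = (
    [0.1, 0.1, 0.1, 0.1]
    + [0.7, -0.7, 0.7, -0.7]
    + [0.3, -0.2, 0.4, -0.3] * 2
    + [1.0, -1.0, 1.0, -1.0]
    + [0.1, 0.1, 0.1, 0.1]
)
symbols, matches = find_channels(trace)
print(symbols, matches)
```

The descriptor functions play the role the slide gives to the Java MR functions, and the pattern plays the role of the SQL question; lateral relationships between adjacent traces are left out of this sketch.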
The Active Data Warehouse in E&P
Integrated Single Instance: Remove scaling and complexity barriers
[Figure labels: Sensor Data, Asset Data, Logging Data, LWD/LWP reporting]
Decision Control Systems, Asset Management + real time analytics = Datamining while Drilling
Current Problems:
• Lack of integration
• Performance barriers
• Poor scaling to large data volumes and higher complexity
• Cannot provide answers to big strategic questions
Operationalised DSS
• Master Store for all data
• Data is stored in a manner that allows it to be useful
Features
• Fully-integrated
• Highly relational
• Fully-decomposed
• ACID compliant
• Enterprise Grade
• Establish workflows to integrated operations
• Integrate Subsurface insights with Asset and Maintenance actions
• Link production activities via Logical Data Models to other areas of the business
• Permanent Seismic Monitoring
• Fracturation Models and regimes
• Hydrocarbon Accounting
Integrated Operations
Seismic Imaging
Reservoir Characterisation
Reservoir Modelling
Reservoir Monitoring
Asset Management
Seismic Archiving
Hydrocarbon Accounting
Enabling the E&P cloud
End-to-end workflow management
• guarantee data custodianship based on standards
• know who knew what, and when?
• Data dependency management
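"Know who knew what, and when" amounts to keeping a time-stamped record of which version of a dataset each person had seen. The sketch below is one minimal way to picture that, not a description of any Teradata feature; the class names, user and dataset names, and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    user: str
    dataset: str
    version: int
    at: datetime


class AccessLog:
    """Append-only record of which user saw which dataset version, and when."""

    def __init__(self):
        self._events = []

    def record(self, user, dataset, version, at):
        self._events.append(AccessEvent(user, dataset, version, at))

    def known_version(self, user, dataset, as_of):
        """Latest version the user had seen at or before as_of, or None."""
        seen = [
            e for e in self._events
            if e.user == user and e.dataset == dataset and e.at <= as_of
        ]
        return max(seen, key=lambda e: e.at).version if seen else None


# Made-up history: one geoscientist sees two versions of a reservoir model.
log = AccessLog()
log.record("geo_a", "reservoir_model", 1, datetime(2012, 3, 1))
log.record("geo_a", "reservoir_model", 2, datetime(2012, 6, 1))
print(log.known_version("geo_a", "reservoir_model", datetime(2012, 4, 15)))
```

Because events are only ever appended, the question can be asked for any past date, which is what lets a decision be traced back to the data version that was visible when it was made.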