data warehousing & business intelligence 5 years from now

Data Warehousing & Business Intelligence 5 years from now. Presentation to Helsinki TDWI Meeting, December 15th, 2009. Martin Willcox

Upload: teradata-corporation

Post on 28-Jul-2015


TRANSCRIPT

Page 1: Data Warehousing & Business Intelligence 5 Years From Now

Data Warehousing & Business Intelligence 5 years from now

Presentation to Helsinki TDWI Meeting
December 15th, 2009
Martin Willcox

Page 2: Data Warehousing & Business Intelligence 5 Years From Now


Disclaimer

• The views expressed are those of the author and do not necessarily reflect those of the Teradata Corporation in all cases.

• Predicting the future is notoriously difficult and subject to error!

• The engineering plans outlined in this presentation are subject to change; they are not firm commitments, either to develop the features in question, or to provide them within a specified timeframe.


Page 3: Data Warehousing & Business Intelligence 5 Years From Now

3 Teradata Confidential

The information explosion and the data warehouse - lies, damn lies & statistics?

“Winter Corp primary research... shows a consistent trend since 1988: the size of the largest data warehouse we validate triples approximately every two years.”
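To put the quoted trend in terms of this talk's five-year horizon, here is a minimal sketch of the compound growth it implies. The function name and defaults are illustrative assumptions; only the "triples approximately every two years" figure comes from the quote.

```python
def projected_growth(years: float, factor: float = 3.0, period_years: float = 2.0) -> float:
    """Size multiplier after `years`, if size grows by `factor` every `period_years`."""
    return factor ** (years / period_years)


if __name__ == "__main__":
    # Tripling every two years compounds to roughly 15-16x over five years.
    for horizon in (2, 5, 10):
        print(f"{horizon} years: about {projected_growth(horizon):.1f}x")
```

If the trend held, the largest validated warehouse would therefore be roughly fifteen times larger five years on; this is an extrapolation, not a figure from the source.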

Source: http://www.b-eye-network.com/view/7188

Page 4: Data Warehousing & Business Intelligence 5 Years From Now


Decreasing storage unit costs will continue to drive increasing digitisation


Source: IBM Research

“Around the year 2000 the price of storage dropped to a point where it became cheaper to store data on computer disks than on paper. In fact, this probably was a great turning point in the history of the development of western civilisation... now the digitisation of text is not only of interest for sharing and analysis, but it is also more economical”

Source: Physical Database Design; Lightstone, Teorey & Nadeau

Page 5: Data Warehousing & Business Intelligence 5 Years From Now

The Evolution of Detail Data

1x
• Invoice summary, bill summary, POS transaction summary
• 10-20 buckets of detail per customer per month
• A few months of history available
• 200-500 items of detail
• What do customer revenue and expenses look like over time?

10x
• Transaction detail, bill line-item detail, POS SKU level detail
• 100’s of detail transactions per customer per month
• A year of history available
• 2,000 to 5,000 items of detail
• What can I learn from specific customer transactions that will allow me to stimulate revenue?

100x +
• Sensor data, RFID data, GPS location data, web behavior
• 1000’s of event details per month or even day
• 2+ years of history available
• 10,000 to 1M+ items of detail
• How do overall customer behaviors relate to revenue related activities?

Page 6: Data Warehousing & Business Intelligence 5 Years From Now

[Figure: location update signalling sequence between BSC, MSC/VLR (new), HLR and MSC/VLR (old)]

• 07/28/02 10:09:10.134 PM: DTAP Location Update Request
• 07/28/02 10:09:10.516 PM: GSM.98 MAP Send Parameter (old LAI, TMSI)
• 07/28/02 10:09:11.057 PM: GSM.98 MAP Return Result (IMSI)
• 07/28/02 10:09:11.611 PM: GSM.98 MAP Update Location
• 07/28/02 10:09:12.187 PM: GSM.98 MAP Cancel Location
• 07/28/02 10:09:12.492 PM: GSM.98 MAP Cancel Location Acknowledge
• 07/28/02 10:09:13.256 PM: GSM.98 MAP Insert Subscriber Data
• 07/28/02 10:09:13.780 PM: GSM.98 MAP Insert Subscriber Data Ack
• 07/28/02 10:09:14.322 PM: GSM.98 MAP Update Location Ack
• 07/28/02 10:09:15.084 PM: DTAP Location Update Accept
• 07/28/02 10:09:15.676 PM: DTAP TMSI Allocation Complete

Increasing Sophistication, Complexity & Information

• GSM: 11 signaling messages
• GPRS: 91 signaling messages
• 3G: 200+ signaling messages

Page 7: Data Warehousing & Business Intelligence 5 Years From Now

Example new data type: Geospatial (web, text, audio, image…)

Page 8: Data Warehousing & Business Intelligence 5 Years From Now

Analytical Archive

• Legal and Regulatory Compliance requirements are mainstream

• Organizations need to move from off-line model
> 1½ day or longer response to queries
> Queries limited to only small subset of entire history
> No business value gained from the history data
> Impacting other back-up systems
> Requires lengthy data retrieval, conversion and archival efforts

• Towards an Analytical Archive on-line environment
> Near instantaneous response to any access requests
> Able to perform queries on full set of history data at any time
> Enables full business analysis of all archive data

Page 9: Data Warehousing & Business Intelligence 5 Years From Now

Moore’s law is enabling the production of very small, rugged computing systems…


Smart dust: self-contained, millimeter-scale sensing and communication platforms for massively distributed sensor networks, that contain sensors, computational ability, bi-directional wireless communications, and a power supply.

Page 10: Data Warehousing & Business Intelligence 5 Years From Now


Sensor technology in action

Need historical data (pattern identification and forecasting); integration of this data with other data (correlation); and near real-time capture and analysis of data (to support preventative diagnosis).

Page 11: Data Warehousing & Business Intelligence 5 Years From Now


Computing performance for I/O intensive operations has been limited by storage

[Figure: disk drive generations, labelled 2GB at 7,200 RPM, 9GB at 10,000 RPM, 36GB at 7,200 RPM, 73GB at 15,000 RPM and 36GB at 10,000 RPM]

Storage densities are increasing in line with Moore’s law, but disk access times are improving much more slowly.

Page 12: Data Warehousing & Business Intelligence 5 Years From Now

[Figure: two database servers, each reading data from storage]

Mechanical Rotation and Seek Limit HDD Speed

22X Faster on Typical Data Warehouse Workloads

Page 13: Data Warehousing & Business Intelligence 5 Years From Now


Prediction #1

Declining unit cost of storage
+ increased competition
+ increased complexity of products, services and processes
+ new sources, types of data
+ increased regulation
+ increased digitization and new sensor technology
= sustained growth in size of most data warehouses for the foreseeable future.

Page 14: Data Warehousing & Business Intelligence 5 Years From Now


Green IT is no longer just for the environmentalists…

“On the current path, in 5 years the cost of energy to power the data center will be higher than the cost of the IT equipment that it powers.”

Source: Gartner Data Center Conference 2007

Page 15: Data Warehousing & Business Intelligence 5 Years From Now


Count on energy prices increasing substantially…

Carbon price ($ per tonne of CO2):

• Proposed by US Congress: 12 (rising to 20 by 2020)
• European Trading Scheme, current: 22
• Required to make onshore wind generation profitable without subsidy: 38
• Required to stabilize atmospheric levels of CO2 at 450 ppm (average temperature rise of 2 degrees Celsius): 40 (rising to 80 by 2050)
• Required to make offshore wind generation profitable without subsidy: 136
• Required to make solar generation profitable without subsidy: 196

Figures taken from “Good policy, and bad”, The Economist, 3rd December 2009.

Page 16: Data Warehousing & Business Intelligence 5 Years From Now


SSD technology to the rescue? Not on its own…

Enterprise SSD compared with Enterprise 15K HDD:

• IOPs (4 KB): SSD 10^5; HDD 10^2; ratio 150X
• Sequential Read BW (MB/s): SSD >450; HDD >150; ratio 3X
• Random 80/20 BW (MB/s): SSD >450; HDD 20; ratio 22X
• Avg. Random I/O latency: SSD microseconds (10^-6); HDD milliseconds (10^-3); ratio >1000
• Active Power: SSD 10W; HDD 17W; ratio 60%
• Typ. Capacity: SSD 300 GB; HDD 450 GB; ratio 67%

Page 17: Data Warehousing & Business Intelligence 5 Years From Now


…but the technology will help if applied selectively…

Homogeneous system, based on 15k RPM enterprise class HDD
• # of 15k HDD drives: 160
• Storage capacity: 160 * 450GB = 72,000GB
• Storage power: 160 * 17W = 2,720W

Heterogeneous system, 1 : 1 : 2.5 ratio of SSD : 15k HDD : 7.5k HDD
• # of SSD drives: 60; storage capacity: 60 * 300GB = 18,000GB
• # of 15k HDD drives: 40; storage capacity: 40 * 450GB = 18,000GB
• # of 7.5k SATA HDD drives: 36; storage capacity: 36 * 1,000GB = 36,000GB
• Storage power: (60 * 10W) + (40 * 17W) + (36 * 12.9W) = 1,744W (a 36% reduction)
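The capacity and power arithmetic on this page can be reproduced with a short sketch. The drive counts, capacities and wattages are the ones quoted on the slide; the `Tier` class and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    drives: int
    capacity_gb: float   # capacity per drive
    active_watts: float  # active power per drive


def totals(tiers):
    """Return (total capacity in GB, total active power in W) for a set of tiers."""
    capacity = sum(t.drives * t.capacity_gb for t in tiers)
    power = sum(t.drives * t.active_watts for t in tiers)
    return capacity, power


def power_reduction(before, after):
    """Fractional power saving of `after` relative to `before`."""
    return 1 - totals(after)[1] / totals(before)[1]


# Figures as quoted on the slide.
HOMOGENEOUS = [Tier("15k HDD", 160, 450, 17.0)]
HETEROGENEOUS = [
    Tier("SSD", 60, 300, 10.0),
    Tier("15k HDD", 40, 450, 17.0),
    Tier("7.5k SATA HDD", 36, 1000, 12.9),
]

if __name__ == "__main__":
    for label, system in (("homogeneous", HOMOGENEOUS), ("heterogeneous", HETEROGENEOUS)):
        capacity, power = totals(system)
        print(f"{label}: {capacity:,.0f} GB, {power:,.1f} W")
    print(f"power reduction: {power_reduction(HOMOGENEOUS, HETEROGENEOUS):.0%}")
```

Both configurations provide the same 72,000GB, so the comparison isolates the power difference.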

Page 18: Data Warehousing & Business Intelligence 5 Years From Now


…and empirical evidence is that heterogeneous storage will be a good compromise in many cases

Data Temperature Demographic, 7 day trace

[Chart: cumulative share of IO (0% to 100%) plotted against available space (0% to 100%)]

• 43% of the IO directed at hottest 1.5% of data (1% of cylinders)
• 85% of the IO directed at hottest 15% of data (10% of cylinders)
• 94% of the IO directed at hottest 30% of data (20% of cylinders)
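A data temperature profile of this kind can be derived from per-cylinder IO counts by ranking cylinders from hottest to coldest and accumulating. The sketch below illustrates the idea; the function name and the sample counts are hypothetical and are not taken from the trace behind the slide.

```python
def io_share_of_hottest(io_counts, fraction):
    """Share of total IO that falls on the hottest `fraction` of storage units."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ranked = sorted(io_counts, reverse=True)  # hottest first
    hottest = round(len(ranked) * fraction)   # number of units to include
    total = sum(ranked)
    return sum(ranked[:hottest]) / total if total else 0.0


# Hypothetical IO counts for ten cylinders (synthetic, for illustration only).
SAMPLE_COUNTS = [850, 90, 20, 10, 10, 5, 5, 4, 3, 3]

if __name__ == "__main__":
    for fraction in (0.1, 0.2, 0.5):
        share = io_share_of_hottest(SAMPLE_COUNTS, fraction)
        print(f"hottest {fraction:.0%} of cylinders: {share:.0%} of IO")
```

A steep curve like this is what makes a small tier of fast storage worthwhile: most of the IO can be served from a small fraction of the space.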

Page 19: Data Warehousing & Business Intelligence 5 Years From Now


Where does the space go?

2.9x more spinning disk than data – and Teradata enjoys lowest disk : data ratios according to last Gartner research published in this area.

RAID 1 is a very space intensive insurance / performance mechanism…

SSD technology may reduce the RDE burden still further?

TD uses value-based compression; actual mileage depends on data demographics. More aggressive compression schemes typically trade space for performance; there is lots of current research in this space.

“Fractured mirror” concept may see a revival in the fortunes of “software RAID” mechanisms, with logical, rather than physical, “mirrors”.

Page 20: Data Warehousing & Business Intelligence 5 Years From Now


Multi-core CPU technology affords interesting new power saving possibilities…

…but in a multi-user environment these are likely to yield less dramatic savings.

Page 21: Data Warehousing & Business Intelligence 5 Years From Now


Greatest opportunity still lies in better management of information assets

“Delivering Business Intelligence to Network Rail – A Strategic Approach”

Christopher Stanley, Presentation to Gartner BI Summit, Den Haag, January 20-22, 2009.

Page 22: Data Warehousing & Business Intelligence 5 Years From Now


Prediction #2

Lots of “technology fixes” will be applied to “green IT”:
• Intelligent power management;
• New storage technologies;
• Enhanced data compression & “S/W RAID”;
• Smart data centre design & location.

But shortening the length of the “digital shadow” through increased information consolidation and integration will achieve more than these “silver bullets” can.

Page 23: Data Warehousing & Business Intelligence 5 Years From Now

Copyright Teradata © 2009 – All rights Reserved

The problem with traditional Business Intelligence

[Figure: a business process shown as a chain of six process steps]

User has a key decision to make here…

…Industry’s default response: stop what you are doing, logon to a different BI tool / application, run a report and study it, logoff / logon again…

Much traditional BI is predicated on the assumption that knowledge workers are sitting in a head office in front of a PC, making forward-looking decisions and with time to develop and test hypotheses by comparing “forecast” versus “actuals” in static reports…

Page 24: Data Warehousing & Business Intelligence 5 Years From Now


…not this knowledge worker!

Page 25: Data Warehousing & Business Intelligence 5 Years From Now


Pervasive BI: right data to the right actor, at the right time and via the right channel

Page 26: Data Warehousing & Business Intelligence 5 Years From Now


Information evolution and “Active Data Warehousing”

• STAGE 1, REPORTING: WHAT happened? Primarily Batch with Pre-defined Queries
• STAGE 2, ANALYZING: WHY did it happen? Increase in Ad Hoc Queries
• STAGE 3, PREDICTING: WHAT will happen? Analytical Modeling Grows
• STAGE 4, OPERATIONALIZING: What IS happening? Continuous Update and Time-Sensitive Queries Gain Importance
• STAGE 5, ACTIVE WAREHOUSING: What do I WANT to happen? Event-based Triggering Takes Hold

Workload types shown: Batch, Ad Hoc, Analytics, Continuous Update, Event-based Triggering

“More than 85% of the eBay analytical workload is new & unknown” (Oliver Ratzesberger, eBay, October 2008)

Page 27: Data Warehousing & Business Intelligence 5 Years From Now


Event-based processing: event relationships versus rules

• Conjunctions
> All the specified events happened
• Sequencing
> The events happened in the specified order
• Disjunction (any n)
> Any n of the m specified events happened
• Temporal association
> Events happened within n time units of each other
• Negation
> An event did not happen within a deadline
• Aggregation
> Collections of events following sliding window semantics
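The six event relationships listed above can each be expressed as a small predicate over a time-stamped event stream. The following is a minimal sketch rather than any product's API: the `Event` class, the function names and the sample stream are all illustrative assumptions, and the temporal association check simplifies by using only the first occurrence of each event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    time: float  # seconds since an arbitrary start


def first_times(events):
    """Map each event name to the time of its first occurrence."""
    seen = {}
    for event in sorted(events, key=lambda e: e.time):
        seen.setdefault(event.name, event.time)
    return seen


def conjunction(events, names):
    """All the specified events happened."""
    return set(names) <= set(first_times(events))


def sequence(events, names):
    """The events happened in the specified order (other events may intervene)."""
    ordered = iter(sorted(events, key=lambda e: e.time))
    return all(any(e.name == name for e in ordered) for name in names)


def disjunction(events, names, n):
    """Any n of the m specified events happened."""
    seen = first_times(events)
    return sum(1 for name in set(names) if name in seen) >= n


def temporal_association(events, names, window):
    """First occurrences of the events fall within `window` time units of each other."""
    seen = first_times(events)
    if not set(names) <= set(seen):
        return False
    times = [seen[name] for name in names]
    return max(times) - min(times) <= window


def negation(events, name, deadline):
    """The event did not happen by the deadline."""
    return not any(e.name == name and e.time <= deadline for e in events)


def sliding_count(events, name, now, window):
    """Aggregation: count occurrences in the window (now - window, now]."""
    return sum(1 for e in events if e.name == name and now - window < e.time <= now)


# Hypothetical stream of customer events, for illustration only.
STREAM = [Event("login", 0), Event("browse", 5), Event("quote", 20), Event("browse", 25)]

if __name__ == "__main__":
    print(sequence(STREAM, ["login", "quote"]))      # login precedes quote
    print(negation(STREAM, "purchase", deadline=60))  # no purchase within 60s
    print(sliding_count(STREAM, "browse", now=30, window=10))
```

Rules of this kind are combinations of such predicates, which is why the slide contrasts event relationships with simple single-event rules.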

Page 28: Data Warehousing & Business Intelligence 5 Years From Now


Some events can only be detected inside the data warehouse…

• Inside the Data Warehouse
> When integrating data detects the event
> When analysis of data detects the event
> When KPI thresholds vary considerably by time period

• Inside the Application
> When a business process detects the event first
> When using a BAM or BPM solution
> Simple analysis of non-integrated data detects the event first

[Figures: Workflows & Applications alongside the ADW, shown once for each case]

Page 29: Data Warehousing & Business Intelligence 5 Years From Now


…and intelligent responses to externally detected events will anyway require the EDW

[Figure: elements shown are Event Streams, Complex Event Processing, Modeling & Simulation, and the EDW (History, Today, Real time, Analytic)]

Possible Actions: Pricing, Inventory, Distribution, Capital, New supplier, Rebalance staffing, Buy more/less

Page 30: Data Warehousing & Business Intelligence 5 Years From Now


Active Enterprise Intelligence at ABN AMRO: Better Web Advertising

There are currently > 50 different proposals. These are a few of them:

Within 2 seconds of a customer positively identifying him/herself, the best matching proposition (out of over 50 propositions) is computed and presented, based on real-time customer information

365 days per year (24x7)

175,000 contacts / day

> 63 million personalizations / year
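The annual figure follows directly from the daily contact volume. A small sketch of the arithmetic, with illustrative names and only the slide's two inputs (175,000 contacts per day, 365 days per year):

```python
CONTACTS_PER_DAY = 175_000   # from the slide
DAYS_PER_YEAR = 365          # from the slide (24x7 operation)
SECONDS_PER_DAY = 24 * 60 * 60


def personalizations_per_year(contacts_per_day=CONTACTS_PER_DAY, days=DAYS_PER_YEAR):
    """One personalization per contact, every day of the year."""
    return contacts_per_day * days


def average_contacts_per_second(contacts_per_day=CONTACTS_PER_DAY):
    """Average arrival rate, ignoring daily peaks."""
    return contacts_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{personalizations_per_year():,} personalizations per year")
    print(f"{average_contacts_per_second():.2f} contacts per second on average")
```

The average rate is about two contacts per second; peak rates are not given in the source and would be higher.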

Page 31: Data Warehousing & Business Intelligence 5 Years From Now


Active Enterprise Intelligence at ABN AMRO: Better Web Advertising

Click through rates for the three examples shown: 5.5%, 1.1% and 4.0%

Click through rate for banners (benchmark): 0.2% or 1 in 500

Page 32: Data Warehousing & Business Intelligence 5 Years From Now


Also Used on Inbound Call Dashboard

• Recommendations linked to overall targets
• Updated daily!
• Linked to:
> Total SOW
> Total SOI
> Credit position
• Potential indicator: input for contact-type
> S = Service oriented
> V = Sales oriented
> E = Efficiency oriented
• Housebank indicator
• Strategic goal
• Outside in: customer perspective

Page 33: Data Warehousing & Business Intelligence 5 Years From Now


Active Enterprise Intelligence in Banking: Next Best Activity on Call Center

Inbound:
• Doubled sales (+122%) to high potential customers
• > 24% decline in average handling time for low potential customers, with no negative influence on customer satisfaction

Outbound:
• Increased conversion rates by 15%

Page 34: Data Warehousing & Business Intelligence 5 Years From Now


EDW please phone home (or “tech vendor actually eats its own dog food”)

• System status and health
> Major incidents
– Disk shutdown, reset, etc.
> Change control
– Patches, new hardware…
> Transfer types
– TVI/Customer Care Link
– Manual

• Sent to secure portal
> Extensive security
> No GUI interface

Page 35: Data Warehousing & Business Intelligence 5 Years From Now


How fast can you fix it? Only as quickly as you know there is a problem!

• Priority 1 events: under 10 minutes
• Change control: under 6 hours

[Figure: components shown are Customer System, Secure portal, ELT, ADW, Priority scheduler, Teradata@YourService, Customer Services and Customer; interval labels are seconds, 1min, 1-5min, 1-5min and every 6 hours]

Page 36: Data Warehousing & Business Intelligence 5 Years From Now


Note that traditional reporting doesn’t go away… “BI is dead, long live BI!”

Page 37: Data Warehousing & Business Intelligence 5 Years From Now

Prediction #3

“Pervasive” and “event-driven” BI finally come of age:

• Data warehouse supports thousands / tens of thousands of concurrent users, a very diverse mix of queries and multiple Service Level Goals (SLGs);

• Closed-loop integration of operational, analytical processes;

• Majority of data warehouse calls are from SOA applications, not dedicated, “traditional” BI tools; in many cases, users won’t even know that they are interacting with the data warehouse;

• Information management becomes the critical issue in the face of this increased complexity; governance, lineage and validation in particular will have to improve to support widespread SOA deployment.

Page 38: Data Warehousing & Business Intelligence 5 Years From Now

Market has spoken – and verdict is that the MPP “appliance” platform is best fit for DW…

[Timeline, 1980 to 2010. Entries shown: 1st Teradata implementation goes live at Wells Fargo; IBM DB2 Parallel Edition; Oracle Exadata; Netezza; DATAllegro; Aster Data; Vertica; Greenplum; Kognitio (WhiteCross); NeoView]

Not all MPP platforms are created equal! Caveat emptor!

Teradata = best technology + best processes + best people

Page 39: Data Warehousing & Business Intelligence 5 Years From Now


…but (public) cloud computing is changing the way that many services are delivered

• Essential characteristics
> On-demand self-service
> Resource pooling [virtualization]
> Rapid elasticity
> Measured Service [pay per use]

• Service models
> Software-as-a-Service (SaaS)
> Platform-as-a-Service (PaaS)
> Infrastructure-as-a-Service (IaaS)

• Deployment models
> Private cloud
> Public cloud
> Hybrid cloud

Source: Draft NIST Working Definition of Cloud Computing, 8-21-09, version 15, http://csrc.nist.gov/groups/SNS/cloud-computing/index.html

“Nearly 90% of organisations expect to maintain or grow use of SaaS in 2009."

Source: Gartner User Survey, November 2008

Page 40: Data Warehousing & Business Intelligence 5 Years From Now


Traditional data exploration architectures

[Figure: Data Warehouse base tables, with separate sandboxes, marts, etc.]

• Data Moved across the Enterprise
• Lack of Security
• Stale
• Control Issues
> Security
> Privacy
> Completeness

Page 41: Data Warehousing & Business Intelligence 5 Years From Now


Agile Analytics or “Private Cloud Computing”

• Inside the existing EDW

• An internal private cloud
> Dynamic mart provisioning
> Self service, multi-tenant
> Virtualized, chargeback

• Enablers
> Support for self-service provisioning
> Advanced Workload Management
> Information governance, to prevent proliferation of obsolete & redundant data.

[Figure: Enterprise Data Warehouse holding base tables and sandboxes, marts, etc., with Active Workload Management, on the EDW Server & Storage]

Page 42: Data Warehousing & Business Intelligence 5 Years From Now


Use cases for internal analytic clouds

• Dependent virtual marts
> Small applications
> Prototyping
> User education
> Short term projects
> Mart consolidation

• Extremely private data
> Healthcare
> Payroll
> HR
> Etc.

• Proof of Concept
> Demos, function test

• Development
> Easy access to real data

• Power user sand box
> Research, discovery
> Trial and error
> Hypothesis testing

• Testing
> Quality Assurance
> New features
> Application upgrades

Page 43: Data Warehousing & Business Intelligence 5 Years From Now

Prediction #4

Public cloud computing will continue to evolve; private cloud computing will come of age; virtualization with everything

• Departmental data marts will migrate to the public cloud, making information management (even more) complex;

• Public cloud computing infrastructures will evolve, but performance, security and privacy issues will prevent widespread adoption for data warehousing – “appliances” continue to dominate;

• Private cloud computing / analytical sandboxing becomes an industry-standard best practice;

• Continued focus on virtualization of private computing resources to drive higher levels of hardware utilization and efficiency.

Page 44: Data Warehousing & Business Intelligence 5 Years From Now


Hadoop

• Hadoop is
> A parallel programming framework: open source implementation of Map/Reduce concept
> A file system (HDFS)
> A batch job and task dispatching system
> Massively parallel
> An Apache open source project

• Hadoop is NOT
> A database
– No indexes, transactions, recovery journals, SQL
> A data warehouse
– Not subject oriented, nonvolatile, time variant, integrated data
> A commercial-off-the-shelf BI or ETL tool
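For readers unfamiliar with the Map/Reduce concept mentioned above, the following single-process sketch shows the three phases (map, shuffle, reduce) on a word count. It is plain Python and deliberately uses no Hadoop API; the function names are illustrative assumptions, and a real Hadoop job would distribute the map and reduce tasks across nodes.

```python
from collections import defaultdict


def map_words(line):
    """Map phase: emit a (key, value) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all values by key, sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_counts(key, values):
    """Reduce phase: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of records."""
    mapped = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(run_job(["big data big", "data warehouse"]))
```

Note that the programmer supplies procedural map and reduce functions; there is no declarative query language, optimizer or schema involved, which is the contrast the "Hadoop is NOT" list draws.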

Page 45: Data Warehousing & Business Intelligence 5 Years From Now


Benefits of Hadoop

• Support for 100s to 1000s of server nodes
> Extreme scalability
> Commodity hardware = low costs

• Data analyzed where it is stored
> Low or no data movement

• Easy integration of developer tools
> Java, grep, python, etc.

• IT programmers do parallel processing
> Wow!

• Batch programming
> Complex multi-step processing

Page 46: Data Warehousing & Business Intelligence 5 Years From Now


Somewhat Equivalent Terminology

Hadoop term: Teradata term

• single namespace: single image
• Map tasks: AMP SQL execution
• Reduce tasks: AMP SQL aggregation
• Partition, shuffle and sort: row redistribution
• intermediate data: spool
• HDFS Master Name Node: Parsing Engine
• HDFS Slave Data Node: AMP nodes
• HDFS replicated data: fallback
• Job Tracker: dispatcher
• Task Tracker: AMPs
• map functions: UDFs, in-database functions, push-down
• key: primary key
• Hash Partitioner: hash buckets
• rack or cluster: clique
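Several rows in this comparison (partition and shuffle versus row redistribution, Hash Partitioner versus hash buckets) rest on the same idea: a hash of the key decides which parallel unit owns a row. The sketch below illustrates that idea generically; it is not the hashing algorithm of either Hadoop or Teradata, and the function and field names are hypothetical.

```python
import hashlib


def bucket_for(key: str, num_buckets: int) -> int:
    """Deterministically map a key to one of `num_buckets` buckets."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def distribute(rows, key_field, num_units):
    """Assign each row to a parallel unit according to the hash of its key."""
    units = [[] for _ in range(num_units)]
    for row in rows:
        units[bucket_for(str(row[key_field]), num_units)].append(row)
    return units


if __name__ == "__main__":
    # Hypothetical rows keyed by customer_id, spread over four units.
    rows = [{"customer_id": i, "amount": i * 10} for i in range(20)]
    for unit_number, unit_rows in enumerate(distribute(rows, "customer_id", 4)):
        print(unit_number, [row["customer_id"] for row in unit_rows])
```

Because the mapping is deterministic, all rows with the same key land on the same unit, which is what allows joins and aggregations on that key to proceed without further data movement.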

Page 47: Data Warehousing & Business Intelligence 5 Years From Now


No Equivalent Function

Hadoop Unique
• varying node counts (extreme scale)
• tools: grep, Pig, Python, etc.
• automatic re-execution on failure
• cloud based
• process non-relational data
• InputFormats
• InputSplit

Teradata Unique
• Cost based optimizer
• Primary and secondary indexes
• multi-table joins
• FastLoad, MultiLoad, FastExport
• BI query tools & spreadsheets
• pre-execution query cost estimate
• role based security admin.
• Referential Integrity
• Views, schemas, data types
• row and table locking
• spatial and temporal types
• historical data persistence

Page 48: Data Warehousing & Business Intelligence 5 Years From Now


When to Use Which Infrastructure

Hadoop
• Complex processes
• Multi-step processes
• 1,000+ nodes required
• Can’t move the data
• Extensive text parsing
• Householding analysis

“It depends”
• Data mining
• Simple reports
• Data cleansing

Teradata
• Iterative discovery
• Drill down, OLAP
• Dashboards
• Business user querying
• Integrated subject areas
• Multidimensional views
• Trends over time
• Visualization tools
• Text analysis tools

Page 49: Data Warehousing & Business Intelligence 5 Years From Now


Prediction #5

Hadoop will complement, rather than replace, the RDBMS

• Continued adoption of Hadoop for implementation of process-parallel (as opposed to data-parallel) applications by early-adopters who can justify the required investment in skilled resources, and can retain them;

• Limited adoption for general data management unless Hadoop evolves to include traditional DBMS functionality (locking, logging, transactions, recovery, schema support, support for declarative rather than procedural programming, etc.).

Page 50: Data Warehousing & Business Intelligence 5 Years From Now

Available now…

Presentation to Helsinki TDWI Meeting
December 15th, 2009
Martin Willcox

Page 51: Data Warehousing & Business Intelligence 5 Years From Now


Teradata Reduces Data Center Burden

[Chart: watts per equipment sq. ft. (0 to 5000), 1992 to 2008. Industry Standard Equipment Power Density Increase, according to ASHRAE (series: 1U & Blade; 2U & Greater; Storage Servers), compared with Teradata Equipment Power Density (series: TD Servers; TD Storage) for platforms 5200, 5255, 5300, 5380, 5400, 5450, 5500 and 5555]

• Teradata has lower IT equipment power density than industry average
• Lowers the demands on data center cooling
• Results in energy and floor space savings

Page 52: Data Warehousing & Business Intelligence 5 Years From Now


Teradata Metric for Platform Efficiency: PPW

• Teradata is now using a method for measuring the energy efficiency of its data warehouse systems

• PPW = Platform energy efficiency metric: Performance per Watt
> Based on TPerf, Teradata’s traditional measure of data warehouse performance capacity
> Calculated by dividing TPerf by the total electrical power (in kilowatts) measured for a complete platform system
– Teradata node cabinets
– Teradata Enterprise Storage cabinets
– BYNET switch hardware

PPW = Teradata Data Warehouse Performance (TPerf) / Electrical Power Consumed for System (kW)

Page 53: Data Warehousing & Business Intelligence 5 Years From Now


Teradata Platform Efficiency

• Teradata has delivered significant Platform Efficiency improvements over the last 6 years
> Plotting the PPW for six generations of products demonstrates the enhancements

[Chart: average TPerf/kW (axis 0.0 to 12.0) by platform generation (GCA dates): 53xx (2002-2003), 54xx (2005-2006), 55xx (2007-2008). Bar values shown: 5.42, 9.59, 3.04, 8.60, 4.98]

Performance per Watt improvement: 215%

Performance per Watt improvement: 93%
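The metric and the improvement percentages can be expressed in a few lines. This sketch assumes that the two quoted improvements compare the chart values 3.04 and 4.98 against 9.59; the transcript does not say which bar belongs to which platform, so that pairing is an inference from the arithmetic, and the function names are illustrative.

```python
def performance_per_watt(tperf: float, power_kw: float) -> float:
    """PPW as defined in the talk: TPerf divided by total electrical power in kW."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    return tperf / power_kw


def improvement_pct(old_ppw: float, new_ppw: float) -> float:
    """Percentage improvement of new_ppw over old_ppw."""
    return (new_ppw - old_ppw) / old_ppw * 100


if __name__ == "__main__":
    # Assumed pairing of chart values (see the note above).
    print(f"3.04 -> 9.59: {improvement_pct(3.04, 9.59):.0f}% improvement")
    print(f"4.98 -> 9.59: {improvement_pct(4.98, 9.59):.0f}% improvement")
```

Under that assumption the results round to the 215% and 93% quoted on the slide.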

Page 54: Data Warehousing & Business Intelligence 5 Years From Now


World’s Fastest Data Warehouse, Leveraging Teradata Technology

• First 100% enterprise class Solid State Drive analytic appliance
> 4 million IOPS* per rack
– Competition claims 1 million IOPS
> Loads 7TB+ an hour per rack
– Competition claims 5TB/hr

• SSD is 150 times faster than HDD for typical data warehouse work
> 55,000 I/Os per second per drive
> Latency up to 1000X faster

• Data protection for high availability

* I/O per second at 8KB I/O size

[Figure: 100% Solid State Drive Appliance]

Page 55: Data Warehousing & Business Intelligence 5 Years From Now


Elastic Mart Builder…

Page 56: Data Warehousing & Business Intelligence 5 Years From Now


…for simple, self-service provisioning in support of “private cloud” deployments

[Screenshot: Elastic Mart Builder, with numbered steps 1 to 5 against the Enterprise Data Warehouse. Panels shown: Create Elastic Mart (UserName, e.g. MySpecialID; Password; Perm space, e.g. 500MB; Create); Upload File (User Name; Password; File name, e.g. Myfile.csv; type CSV; Upload); Verify Data (Table Name; columns Col1, Col2, Col3 with sample values; Type: Integer, Date, Char, Decimal; PI); Import. Elastic Marts shown: POC, Sandbox, Virtual mart]

Page 57: Data Warehousing & Business Intelligence 5 Years From Now


Teradata Express for Amazon Public Cloud

• Free Teradata software
> Teradata 13 + SLES10
> 1TB disk limit
> Non-production use

• Runs in Amazon Web Services
> Cloud computing Leader
– 100,000+ servers for rent
> Working with Teradata Engineering
> Amazon is migrating to a Teradata EDW

[Figure: elements shown are EBS and data sources]

Page 58: Data Warehousing & Business Intelligence 5 Years From Now

Coming soon…

Presentation to Helsinki TDWI Meeting
December 15th, 2009
Martin Willcox

Page 59: Data Warehousing & Business Intelligence 5 Years From Now


[Figure: nine nodes, each with two HBAs, sharing three classes of storage: High Performance (20 MB/Sec), SSD (150 MB/Sec) and High Capacity (6 MB/Sec)]

TVS: automatic migration of hot / cold data across heterogeneous storage devices (2H2010 / 1H2011)

Page 60: Data Warehousing & Business Intelligence 5 Years From Now

Teradata Takes Appliances to the Next Step

[Figure: elements shown are a centralized, single, integrated Active Enterprise Data Warehouse (SSD, enterprise HDD and fat HDD storage); appliance architecture flexibility with DSS, Extreme Performance and Extreme Data appliances; Teradata Data Mover, Replication Services, Dual Load and ETL Partners; Analytical Ecosystem Management]

Page 61: Data Warehousing & Business Intelligence 5 Years From Now

Teradata Takes Appliances to the Next Step

[Figure: elements shown are a centralized, single, integrated Active Data Warehouse (SSD, enterprise HDD and fat HDD storage); appliance architecture flexibility with DSS, Extreme Performance and Extreme Data appliances; Analytical Ecosystem Management]

For All Your Analytical Needs!

• “Purpose Built” Platform Family

• Architecture Flexibility

• Products to fit your needs

• Active Data Warehousing as the goal

Page 62: Data Warehousing & Business Intelligence 5 Years From Now


Teradata 13.10 - and beyond (continued focus on performance, scalability, mixed-workload management)

• Native temporal support (TD 13.10, 2H2010);
• Enhanced, automated compression (TD 13.10 & TD 14, 2H2011);
• Enhanced BAR, Replication (TD 14)
> Re-architect these components;
> Improved performance, reliability;
> Improved integration with 3rd party products.
• Eliminate planned downtime / “always on” (TD 14)
> Improved handling of hardware failures / re-start elimination;
> Seamlessly re-submit queries impacted by system failure;
> Online expansion and upgrades;
> Enhanced FALLBACK for less storage-intensive data availability and improved query performance.

Page 63: Data Warehousing & Business Intelligence 5 Years From Now


Teradata 14 - and beyond

• Simplified deployment and maintenance / “autonomic computing” (TD 14)
> Automated Physical Layout (indexes, etc);
> Automated background compression;
> Automated Statistics Collection;
> Automated Checksums;
> Self-Healing File System.

• Virtualization configurations.