teradata.ppt

Teradata Past, Present and Future Todd Walter CTO – Teradata Labs

Upload: kumar-swamy-molugu

Post on 05-Sep-2015


TRANSCRIPT

  • Teradata Past, Present and Future
    Todd Walter, CTO, Teradata Labs

    Copyright Teradata 2007-2009 All rights Reserved

    09/2009

  • Teradata Company Highlights

    Founded 1979, West LA
    First product to market, 1984
    First terabyte system, 1987
    Acquired by AT&T and merged with acquired NCR, 1992
    Tri-vested as part of NCR, 1997
    Teradata Corporation (re)launched October 1, 2007

    Global leader in enterprise data warehousing
      EDW/ADW database technology
      Analytic solutions
      Consulting services

    Positioned in Gartner's Leaders Quadrant in data warehousing since 1999

    Top 10 U.S. publicly traded software company
      S&P 500 member
      Listed NYSE: TDC
      NYSE Arca Tech 100
      2007: $1.7B revenue

    Global presence and world-class customer list
      More than 850 customers
      More than 2,000 installations
      5,500+ associates


  • Continuous (R)evolution

    Hardware
    + Database
    + Consulting
    + Data models and reports
    + Analytic applications

  • Continuous (R)evolution

    Sell the HW, give everything else away
    Sell the SW with some HW to run on
    Sell solving business problems and technology to solve them
    Sell applications with consulting, SW and HW inside

  • Continuous (R)evolution

    90% R&D, 10% integration: 80286
    70% R&D, 30% integration: i486
    20% R&D, 80% integration: Pentium
    10% R&D, 90% integration: Xeon Quad Core


  • Scale

    Every dimension of the technology must scale to meet today's requirements:
    data, data model complexity, users, performance, queries, data loading

    What is a big data warehouse?
      Total spinning disk: 2.5 petabytes
      Big table: 150 billion rows
      Number of tables: 300,000
      Inserts/updates per day: 5 billion records
      Identified users: 100,000
      Queries per day: 5 million
      Data turnover rate: 1 TB per 5 seconds
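
    To put the daily figures on this slide in per-second terms, here is a small arithmetic sketch. It assumes the load is spread evenly over 24 hours (real workloads peak, so these are averages only) and treats 1 TB as 1000 GB; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily figures quoted on the slide.
queries_per_day = 5_000_000
changes_per_day = 5_000_000_000  # inserts/updates

# Average rates, assuming an even spread across the day.
queries_per_second = queries_per_day / SECONDS_PER_DAY
changes_per_second = changes_per_day / SECONDS_PER_DAY

# "1 TB per 5 seconds" as a per-second rate (decimal units assumed).
turnover_gb_per_second = 1000 / 5

print(f"{queries_per_second:.0f} queries per second on average")
print(f"{changes_per_second:.0f} inserts/updates per second on average")
print(f"{turnover_gb_per_second:.0f} GB per second of data turnover")
```

    Even as averages, these work out to dozens of queries and tens of thousands of row changes every second.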

  • The Problem

    Operational systems: Accts. Payable, Accts. Receivable, Invoicing, Sales/Orders, Finance G/L, Customer Support, HR, Payroll, Purchasing, Order Fulfillment, Manufacturing, Inventory

    Decision makers: Marketing, Supply Chain, Finance, Risk Management, Maintenance, Sales, Operations, Inventory, Call Center

    Proliferation of data marts has resulted in fragmented data, higher costs, poor decisions

  • The EDW Solution

    Operational systems: Accts. Payable, Accts. Receivable, Invoicing, Sales/Orders, Finance G/L, Customer Support, HR, Payroll, Purchasing, Order Fulfillment, Manufacturing, Inventory

    Enterprise Data Warehouse (EDW)

    Decision makers: Marketing, Supply Chain, Finance, Risk Management, Maintenance, Sales, Operations, Inventory, Call Center

    Integrated data provides consistency of data, lower costs, better decisions

  • Active Enterprise Intelligence

    An Obvious Trend: More Speed, More Users (from days to seconds)

    Strategic Intelligence and Operational Intelligence
      Enterprise Data Warehouse
      BI tools & reports
      Analysis & visualization
      Predictive analytics
      EDW enterprise integration
      Mixed workload management
      SOA, BPMS, IDEs
      Portals/composite applications

  • Active Enterprise Intelligence enabled by an Active Data Warehouse

    [Figure: Teradata Warehouse serving Strategic Intelligence and Operational Intelligence; connected parties: Suppliers, Customers, Call Center, Logistics]

    Active Events
    Active Access
    Active Enterprise Integration
    Active Availability
    Active Workload Management
    Active Load

  • Active Enterprise Intelligence in Retail: Detecting Retail Fraud

    Situation: Thieves make copies of cash register receipts, walk into the store, pick up merchandise, and return items for cash.

    Problem: Associates in the returns department did not have historical POS receipt retrieval access to verify against previously returned receipts or to do returns without receipts.

    Solution: Associates query Teradata to quickly check if a return has already occurred on that receipt number. Also used by analysts to understand and prevent excessive returns.

    Impact (for a 500-store chain):
      100% ROI in 5 months
      Stopped a crime ring on the first day of rollout
      Cost savings have been huge
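
    The receipt check described in the Solution can be sketched as a lookup before each return is accepted. This is a minimal illustration only: Python's built-in sqlite3 stands in for the Teradata warehouse, and the table and column names (returns, receipt_no, item_sku, returned_at) and function names are hypothetical, not taken from the customer's system.

```python
import sqlite3


def open_store():
    """Create an in-memory stand-in for the returns history (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE returns (receipt_no TEXT, item_sku TEXT, returned_at TEXT)"
    )
    return conn


def already_returned(conn, receipt_no, item_sku):
    """True if this item has already been returned against this receipt."""
    row = conn.execute(
        "SELECT COUNT(*) FROM returns WHERE receipt_no = ? AND item_sku = ?",
        (receipt_no, item_sku),
    ).fetchone()
    return row[0] > 0


def process_return(conn, receipt_no, item_sku, returned_at):
    """Accept the return only if no earlier return exists; record it if accepted."""
    if already_returned(conn, receipt_no, item_sku):
        return False  # duplicate receipt: refuse and flag for review
    conn.execute(
        "INSERT INTO returns VALUES (?, ?, ?)", (receipt_no, item_sku, returned_at)
    )
    conn.commit()
    return True


conn = open_store()
print(process_return(conn, "R-1001", "SKU-1", "2009-09-01"))  # first return
print(process_return(conn, "R-1001", "SKU-1", "2009-09-02"))  # copied receipt
```

    The point of the design is that the check runs against shared, current history rather than a store-local copy, so a copied receipt is caught wherever it is presented.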

  • Active Enterprise Intelligence in Retail: Single View of the Customer Across All Channels

    Situation: Needed to add a Web channel for selling shoes.

    Problem: Too much time and cost to keep multiple customer systems synchronized. Realized they needed just one customer database, not one more for the Web in addition to the Call Center and POS/Store databases.

    Solution: Adopted an ADW strategy, moved all customer data to one Teradata system, revised data models to cover all channels, added a web channel for commerce, used web services, and added TASM to handle multiple workload types.

    Impact:
      1M tactical hits to the EDW per day from the POS, Call Center, and Web with 0.11 sec response time
      Runs simultaneously with back-office BI, reports, and ETL workloads
      Eliminated all other customer data systems

  • Change is Fast and Getting Faster

    New Challenges for Database Technology


  • What is the Measure of a Great Architecture?

    Handle huge changes of underlying technologies and dependent components while continuing to deliver the key value proposition.


  • Processor Roadmap: CPU power radically increasing

    [Figure: process generations of 90nm, 65nm, 45nm, 32nm and 22nm on a timeline marked 2003, 2005, 2007, 2009, 2011, moving from Hyper-Threading to dual core to multi core; SPECint2000 performance from 2000 to 2008+, single-core versus dual/multi-core performance, with labels 2004 and 5X. Source: Intel Corporation]

  • What Does Shared Nothing Mean?

    1985: Every hardware part, every line of software pure shared nothing
    1995: Multiple units of parallelism sharing CPU, memory
    2004: Multiple units of parallelism sharing multiple cores, memory
    2009: Multiple units of parallelism sharing the same physical spindles but still not sharing data
    Future: Multiple units of parallelism in virtual machines/cloud, not even knowing what physical machine they are on or sharing

  • Teradata MPP Server Architecture

    Nodes: incrementally scalable to 1024 nodes
    Operating system: Linux, Windows, Unix
    Storage: independent I/O; scales per node
    BYNET interconnect: fully scalable bandwidth
    Connectivity: fully scalable; channel (ESCON/FICON); LAN, WAN
    Server management: one console to view the entire system

    [Figure: SMP Node1, SMP Node2, SMP Node3 and SMP Node4 joined by dual BYNET interconnects, with server management]

  • Shared Nothing - Dividing the Work

    Virtual processors (vprocs) do the work
    Two types
      AMP: owns and operates on the data
      PE: handles SQL and external interaction
    Configure multiple vprocs per hardware node
      Take full advantage of SMP CPU and memory
    Each vproc has many threads of execution
      Many operations executing concurrently
      Each thread can do work for any user, transaction
    Software is equivalent regardless of configuration
      No user changes as system grows from small SMP to huge MPP

  • Shared Nothing - Dividing the Work

    Basis of Teradata scalability
    Each AMP owns an equal slice of the disk
      Only that AMP reads that slice
    No single point of control for any operation
      I/O, buffers, locking, logging, dictionary
      Nothing centralized
      Exponential communication costs avoided

    [Figure: AMPs, each with its own logs, locks, buffers and I/O; chart of coordination cost against number of nodes, with the Teradata curve labeled]
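
    The idea that each unit of parallelism works only on the slice it owns, with no central point touching the rows, can be sketched as a scatter-gather aggregate. This is a toy model under stated assumptions, not Teradata's implementation: threads stand in for AMPs, plain lists stand in for disk slices, and the function names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def amp_partial_sum(rows):
    """Work done by one 'AMP' over the rows it alone owns: (sum, row count)."""
    return sum(amount for _, amount in rows), len(rows)


def parallel_average(slices):
    """Average an amount column by merging per-slice partial results."""
    # One worker per slice; no worker ever reads another worker's rows.
    with ThreadPoolExecutor(max_workers=max(1, len(slices))) as pool:
        partials = list(pool.map(amp_partial_sum, slices))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None


# Three hypothetical slices of (key, amount) rows.
slices = [
    [("a", 10.0), ("b", 20.0)],
    [("c", 30.0)],
    [("d", 40.0), ("e", 50.0), ("f", 60.0)],
]
print(parallel_average(slices))
```

    Only the small partial results are combined at the end, which is why coordination cost stays low as the number of slices grows.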

  • Teradata Data Distribution

    Rows automatically distributed evenly by hash partitioning
      Even distribution results in scalable performance
      Done in real time as data are loaded, appended, or changed
    Hash map defined and maintained by the system
      2**32 hash codes, 64K buckets distributed to AMPs
    Primary Index (PI) column(s) are hashed
      Hash is always the same for the same values
    No reorgs, repartitioning, space management

    [Figure: rows of Table A, Table B and Table C pass their Primary Index through the Teradata parallel hash function to produce a RowHash (hash bucket) stored with the data fields, and are spread across AMP1, AMP2, AMP3, AMP4 ... AMPn]
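
    The hash-to-bucket-to-AMP path on this slide can be sketched as follows. The 2**32 hash space and 64K buckets come from the slide; everything else is an assumption for illustration: MD5 stands in for Teradata's hash function (which is not reproduced here), the bucket is taken from the high-order 16 bits, and the bucket-to-AMP map is a simple round-robin.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 64 * 1024  # 64K hash buckets, per the slide


def row_hash(primary_index_value):
    """32-bit hash of the PI value (MD5 is a stand-in, not Teradata's function)."""
    digest = hashlib.md5(repr(primary_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # 0 .. 2**32 - 1


def bucket_of(hash_code):
    """Map a 32-bit hash code to one of 64K buckets (high 16 bits, assumed)."""
    return hash_code >> 16


def build_hash_map(num_amps):
    """Assign every bucket to an AMP; round-robin is an illustrative choice."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for(primary_index_value, hash_map):
    """The same PI value always lands on the same AMP."""
    return hash_map[bucket_of(row_hash(primary_index_value))]


hash_map = build_hash_map(num_amps=4)
rows_per_amp = Counter(amp_for(customer_id, hash_map) for customer_id in range(40_000))
print(sorted(rows_per_amp.items()))
```

    Because rows are located by hashing rather than by a maintained range layout, adding data does not require a reorganization; changing the number of AMPs means reassigning buckets in the map, not rehashing the values.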

  • Disk Capacity Exploding with Little Increase in Performance

    [Chart; workload: random I/O, 48K block, 80% read]

  • Platform Change

    Focus used to be
      Optimization of expensive CPU cycles
      Micro-management of precious disk space
    Now
      Manage I/O
      Balance CPU power to the I/O capacity
      Find new ways to optimize I/O, trading for CPU use as necessary
      Pulling 2.5GB/sec per node continuous
    Discontinuity coming
      SSDs become price competitive and reliable

  • File System

    Teradata wrote a new rule book
      Old one written by IBM 35 years ago, used by all mainstream DBMSs today, except Teradata
    File system built of raw slices
    Rows stored in blocks
      Variable length
      Grow and shrink on demand
    Rows located dynamically
      May be moved to reclaim space, defrag
    Maximum block size is configurable
      System default or per table
      8K to 128K
      Change dynamically
    Indexes are just rows in tables
    Has evolved from direct management of single spindles to completely virtualized storage, not even knowing spindle location

  • Workload Management Evolution

    1984: pure timeshare
    1987: 4 priorities, defined by user
    1995: multiple priorities in multiple partitions
    2000: weighted workload groups
    2004: queuing, reserved resources, focus on tactical work
    2009: visualization and detailed workgroup management
    Future: set service level goals, our job to deliver

  • Active Data Warehouse: Active Workload Management

    Manage workloads
    Reduce server congestion
    Dynamically adjust in-flight task priority
    Turn the dial, change priorities
    Fast active access queries
    Performance, performance, performance
    Get maximum throughput

    [Figure: speed-limit signs reading 10, 25, 60 and 75 over the workloads Active Events, Active Access, Query and Reporting, and Active Load]

  • TASM Reporting/Monitoring - 13.10

  • Availability Requirements

    [Figure: availability requirements by user type, spanning Strategic Intelligence to Operational Intelligence]

    Users: IT, finance, planners, power users, data miners; executives, middle managers, marketing; category managers, line managers, service managers; operational employees; suppliers, B2B; consumers

    Availability levels: Business Critical, Mission Critical, Dual Active

  • Always ON: An Elusive Challenge

    Unplanned downtime
      Hardware faults
      Software faults
      Hangs
    Planned downtime
      Software upgrade
      Hardware upgrade
      Data center maintenance
    Disasters
      Multi-component failures
      Building disasters
      Area disasters
    And optimize resource value to the business
    And avoid hidden costs and surprises
      E.g., major performance variations
    Major opportunity for research, but must be holistic
      Reaches far beyond core database

  • Real Time Operational Actions

    Strategic Intelligence, Operational Intelligence

  • Real Time Customer Management

    Strategic Intelligence, Operational Intelligence

    4. Is this customer approaching the predicted loss rate for their segment?
    5. What offers are available for this customer?

  • That's a Wrap!

    Business requires a new level of decision making
      Many more decisions by many more people, much faster
      Current representation of the state of the enterprise
    Data warehouse must evolve to support the requirements of Active Enterprise Intelligence
    Technology must evolve to deal with the new requirements
      Rich area for research and innovation
      Change view of what data warehouse/BI means
    Teradata driving an aggressive roadmap to meet real business requirements


    Retail fraud is a $16B-a-year problem in the USA alone. With web receipts and better copying capabilities, thieves can make multiple copies of a single receipt and make multiple returns for cash or other merchandise. Or they can bring back shoplifted items and try to exchange them for cash.

    The problem is that associates in the returns department often don't have access to past sales information and can't easily keep track of returned merchandise. This is especially problematic if the policy is to accept returns without receipts.

    So the solution is straightforward: hook up the point-of-sale systems so that, within seconds, the Teradata data warehouse is updated with sales, return, exchange, and void data, and provide the returns department with the entire history of purchases by that customer, so they can ensure that a sold product can only be returned once. The impact? Huge, according to one Teradata customer who has already built this system. They stopped a crime ring on the first day of their rollout, a group that had defrauded the company of thousands of dollars. They saw a 100% payback on their investment in just 5 months, and continue to reap the benefits of this example use of Active Enterprise Intelligence.

    In this chart, we have 3 different disk drive sizes, and you can see that, per generation, disk drive bandwidth hasn't increased very much. As disk capacities get larger (36 GB to 73 GB to 146 GB), the performance-per-capacity ratio (capacity vs. disk bandwidth, on the right side of the chart) declines significantly.

    The key metric on this slide is performance per capacity (MB/sec/GB).

    Look at this slide! Capacity is doubling, but throughput per unit of capacity is diminishing! If you fill all the drives up with data, you will not have enough I/O or bandwidth!

    Choosing twice as much storage capacity in a configuration, but not increasing the number of physical disks (to keep I/O constant), will result in performance degradation.
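
    The performance-per-capacity metric (MB/sec/GB) is simple to compute. The sketch below uses the three drive capacities named in the notes, but the bandwidth figures are assumed placeholder values chosen only to show a nearly flat trend; they are not measurements from the chart.

```python
# (capacity in GB, sustained bandwidth in MB/s).
# Capacities come from the notes; the bandwidth numbers are ASSUMED for illustration.
ASSUMED_DRIVES = [(36, 50.0), (73, 55.0), (146, 60.0)]


def perf_per_capacity(capacity_gb, bandwidth_mb_s):
    """Performance per capacity in MB/sec per GB stored."""
    return bandwidth_mb_s / capacity_gb


ratios = [perf_per_capacity(cap, bw) for cap, bw in ASSUMED_DRIVES]
for (cap, bw), ratio in zip(ASSUMED_DRIVES, ratios):
    print(f"{cap:>4} GB drive at {bw:.0f} MB/s -> {ratio:.2f} MB/s per GB")


def config_ratio(num_disks, capacity_gb, bandwidth_mb_s):
    """Same metric for a whole configuration of identical disks."""
    return (num_disks * bandwidth_mb_s) / (num_disks * capacity_gb)


# Doubling capacity per disk without adding disks halves the ratio
# when per-disk bandwidth stays the same.
print(config_ratio(100, 73, 55.0), config_ratio(100, 146, 55.0))
```

    Under any bandwidth numbers that grow more slowly than capacity, the ratio falls with each generation, which is the argument for sizing configurations by spindle count and I/O rather than by raw capacity.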

    Assuming workloads are categorized, this illustration shows speed limits, which are actually resource limits, for each workload. Each workload is allowed to consume a limited amount of resources at any given time to ensure other workloads get their rightful share.

    Dynamic Resource Prioritization

    Inside every fully utilized active data warehouse, there's a major turf battle going on. Each job in the database is engaged in an ongoing struggle for more and more resources for its own work, often competing against other diverse activities. In most databases, these me-first conflicts result in short, resource-light queries falling victim to the heavier jobs. Those batch fraud-detection reports and long-running market share analysis queries essentially take ownership of the database and all it has to give. But Teradata Database lets your specific business needs determine how your precious database resources are divided. Once a definition for equitable sharing of database assets is in place, it automatically controls what percent of the CPU and disk I/O those batch reports and complex queries, as well as those vulnerable short queries, will receive. When there's a handful of users on the system, Teradata Database spreads available resources out relative to the priorities and assignments that have been made to those particular users, without a single sub-second of CPU being wasted.

    Teradata Database has made job scheduling and prioritization of the work a core competency since 1988. And recently, that technology has deepened and matured, offering even more flexibility. Teradata's Priority Scheduler can be used to ensure that the event-driven work coming from the web is allowed to cut into line to grab the CPU it needs to get that promotion back to the client quickly. For example, if the tactical query that comes up with that promotion returns an answer in 1 second when running alone in the database, that same query, if armed with a high Teradata Database priority, can maintain a similar turnaround even if multiple complex inventory adjustment queries begin executing at the same time. For the active data warehouse, it will be critical to keep more resource-hungry complex queries from dominating the resources in the system, starving out the shorter tactical work. Teradata's Dynamic Workload Manager will play a big role in enabling favored work to be as near to real time as it needs to be.
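
    The idea of dividing CPU by priority, with nothing left idle when only some workloads are running, can be illustrated with a simple proportional-share model. This is a generic sketch, not the algorithm used by Priority Scheduler or Dynamic Workload Manager; the workload names and weights are hypothetical.

```python
def cpu_shares(weights, active):
    """Split 100% of CPU among the active workloads in proportion to weight.

    Workloads that are not currently active get nothing, and their share is
    redistributed to the active ones, so no capacity is left unused.
    """
    live = {name: w for name, w in weights.items() if name in active and w > 0}
    total = sum(live.values())
    if total == 0:
        return {}
    return {name: 100.0 * w / total for name, w in live.items()}


# Hypothetical weights: tactical web queries are favored over batch reports.
weights = {"tactical": 50, "reporting": 30, "batch": 20}

print(cpu_shares(weights, active={"tactical", "reporting", "batch"}))
print(cpu_shares(weights, active={"tactical", "batch"}))
```

    With all three workloads running, the split follows the weights directly; when reporting is idle, tactical and batch divide its share between them in the same 50:20 proportion.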

    While no two-dimensional drawing can accurately portray such complex issues, this graphic frames the discussion around when to move to mission critical and dual active solutions. In general, the type of users often correlates with the population of users. For example, we know that the consumer population for many industries can mean tens of thousands to millions of possible users via the internet. Similarly, for some industries, the population of supplier employees who access your data warehouse can be enormous, maybe not always in concurrent users but certainly in potential users. At the other end of the spectrum, planning, analysis, and power users tend to be a small community, albeit an influential one. In the middle of the graphic we see overlaps of many kinds because line managers (category managers, sales managers, service managers, etc.) often bounce between strategic decisions and operational decisions, with probably more time spent on the operational tasks.

    Business critical is not a well-defined term in our industry. It tends to mean anything less than mission critical. These users can often tolerate downtime, from a few hours to perhaps even an entire day. But many data warehouse sites have become so dependent on the EDW that they have hardened the server, software, and procedures to a mission critical level. This means the executives realize that so many decisions are made daily from BI-tool-based reporting that they are willing to fund the project to increase system availability. Mission critical can begin in the EDW and certainly extends all the way to the end of the graphic. These clients understand that large populations of front-line users will demand 24x7 data availability. With operational employees you MIGHT be able to tolerate a 10-20 minute outage every month. It depends very much on the business use of the EDW.

    As the EDW evolves to larger populations and more operational ACTIVE tasks, outages become increasingly expensive, so additional investments in availability become mandatory. In some cases, an active data warehouse becomes so critical to the operational employee that it is necessary to step up to a dual active configuration. This is particularly true in retail, with hundreds of concurrent employees and suppliers using the data, but it may also occur with large call centers or sales staff. Finally, we hope it is obvious that when consumers gain access to the data warehouse, it is typically for eCommerce purchasing. No downtime is tolerated in this case because the loss of revenue cannot be tolerated.

    Problem: Lack of ability to track customer gaming behavior and comp redemption. No mechanism to communicate or react to specific behaviors and trends.

    Solution: Player Contact System. When a patron swipes his or her card at a casino, that information is sent to Teradata. The player profile is accessed and it is determined whether the casino should make personal contact with that player. This allows Harrah's to provide real-time offers to customers at each gaming point. It also enables Harrah's to track the redemption of any comp provided to a guest as the comp is redeemed or partially redeemed, so that they do not over-comp guests.

    Future: Marketing At The Slots initiative. This implementation has a BusinessWorks process receiving inbound card swipes from the Slot Data System and building an EDW query. It then makes a Request/Reply call to Teradata to solicit and compile an XML message, which is then published back out on the TIB for consumption by other applications. This will drive CRM to a new real-time level, allowing interaction with customers while they are gaming.
