Data Warehouse. Group 5: Kacie Johnson, Summer Bird, Washington Farver, Jonathan Wright, Mike Muchane

Upload: wesley-daniels

Post on 04-Jan-2016

TRANSCRIPT

Page 1: Data Warehouse. Group 5 Kacie Johnson Summer Bird Washington Farver Jonathan Wright Mike Muchane

Data Warehouse

Group 5
Kacie Johnson
Summer Bird
Washington Farver
Jonathan Wright
Mike Muchane

Page 2: Outline

I. Data warehouse definition and integrated technologies
II. OLAP and OLTP
III. The concept of data warehousing
IV. How data warehouses are used by companies
V. History of data warehousing
VI. Advantages and disadvantages
VII. Future applications

Page 3: Definition

A data warehouse is a logical collection of information, gathered from many different operational databases, used to create business intelligence that supports business analysis activities and decision-making tasks.

Page 4: Business Intelligence

Business intelligence usually refers to the information that is available for the enterprise to make decisions on. A data warehousing (or data mart) system is the backend, or infrastructural, component for achieving business intelligence.

Page 5: Data Mart

A database that has the same characteristics as a data warehouse, but is usually smaller and is focused on the data for one division or one workgroup within an enterprise.
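The relationship between a warehouse and a data mart can be sketched in a few lines. This is only an illustration, assuming a hypothetical `warehouse_sales` table with a `division` column and a hypothetical `mart_<division>` naming scheme; none of these names come from the slides.

```python
import sqlite3

def build_data_mart(conn, division):
    """Copy one division's rows from the warehouse into its own, smaller table."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not division.isalnum():
        raise ValueError("division must be alphanumeric")
    mart = f"mart_{division}"  # hypothetical naming scheme
    conn.execute(f"DROP TABLE IF EXISTS {mart}")
    conn.execute(f"CREATE TABLE {mart} (product TEXT, amount REAL)")
    conn.execute(
        f"INSERT INTO {mart} SELECT product, amount "
        "FROM warehouse_sales WHERE division = ?",
        (division,),
    )
    return mart

# Illustrative warehouse contents (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (division TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("retail", "widget", 120.0),
        ("retail", "gadget", 80.0),
        ("wholesale", "widget", 900.0),
    ],
)

mart_name = build_data_mart(conn, "retail")
mart_rows = conn.execute(
    f"SELECT product, amount FROM {mart_name} ORDER BY product"
).fetchall()
```

The mart holds only the retail rows, so queries from that workgroup never touch the other divisions' data.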

Page 6: Data Mining Tools

Data mining tools are software tools used to query information in a data warehouse. They consist of:

1. Query-and-reporting tools
2. Intelligent agents
3. Multidimensional analysis (MDA) tools
4. Statistical tools

Page 7: OLAP

A data warehouse uses OLAP (On-Line Analytical Processing) to collect, organize, and make data available for the purpose of analysis, giving management the ability to access and analyze information about its business. This type of data can be called "informational data".

Page 8: OLTP

Most data is collected to handle a company's ongoing business. This type of data can be called "operational data". The systems used to collect operational data are referred to as OLTP (On-Line Transaction Processing).
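A rough way to see the difference between the two workloads: OLTP writes one small transaction at a time, while OLAP reads across many rows to summarize them. The sketch below uses a hypothetical `orders` table and, for brevity, runs both kinds of query against the same in-memory database; in practice the analytical query would typically run against the warehouse rather than the operational system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

def record_order(conn, region, amount):
    """OLTP-style work: a short transaction that writes a single row."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )
    return cur.lastrowid

def revenue_by_region(conn):
    """OLAP-style work: a read-only aggregate that scans many rows."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())

# Operational data arrives one order at a time (made-up values).
for region, amount in [("east", 10.0), ("west", 25.0), ("east", 5.0)]:
    record_order(conn, region, amount)

# Informational data is derived by summarizing what was collected.
totals = revenue_by_region(conn)
```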

Page 9: Data Warehouse Is…

Subject Oriented
Integrated
Time Variant
Nonvolatile Collection of Data for Management's Decisions
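"Time variant" and "nonvolatile" are commonly read as: every row is tied to a point in time, and loaded rows are not updated or deleted afterwards. A minimal sketch of that reading, using a hypothetical `customer_history` table and made-up dates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_history (customer TEXT, city TEXT, as_of TEXT)"
)

def load_snapshot(conn, as_of, rows):
    """Nonvolatile: append a dated snapshot; earlier rows are never changed."""
    with conn:
        conn.executemany(
            "INSERT INTO customer_history VALUES (?, ?, ?)",
            [(customer, city, as_of) for customer, city in rows],
        )

def city_as_of(conn, customer, when):
    """Time variant: answer the question as of a given date."""
    # ISO-8601 date strings sort correctly as plain text.
    row = conn.execute(
        "SELECT city FROM customer_history "
        "WHERE customer = ? AND as_of <= ? ORDER BY as_of DESC LIMIT 1",
        (customer, when),
    ).fetchone()
    return row[0] if row else None

load_snapshot(conn, "2015-01-31", [("acme", "Boston")])
load_snapshot(conn, "2015-02-28", [("acme", "Denver")])

earlier_city = city_as_of(conn, "acme", "2015-02-15")
latest_city = city_as_of(conn, "acme", "2015-12-31")
```

An operational system would usually overwrite the customer's city in place; here both values survive, so history can still be queried.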

Page 10: Building Blocks

Source Data
Data Staging
Data Storage
Information Delivery
Metadata
Management and Control
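One way to picture how these building blocks fit together is as a small pipeline. The sketch below is an assumption-laden toy: the field names, the cleaning rules, and the functions `stage`, `load`, and `deliver` are all hypothetical, and "management and control" is reduced to keeping track of rejected rows.

```python
from datetime import date

# Source data: rows as they might arrive from two operational systems
# (hypothetical field names and formats).
source_rows = [
    {"system": "billing", "customer": " Acme ", "amount": "100.50"},
    {"system": "orders", "customer": "acme", "amount": "20"},
    {"system": "orders", "customer": "Zenith", "amount": "bad-value"},
]

def stage(rows):
    """Data staging: clean and standardize rows, setting aside ones that fail."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append({
                "customer": row["customer"].strip().lower(),
                "amount": float(row["amount"]),
                "source": row["system"],
            })
        except ValueError:
            rejected.append(row)  # kept for review instead of silently dropped
    return clean, rejected

def load(storage, rows, metadata):
    """Data storage: append staged rows and record metadata about the load."""
    storage.extend(rows)
    metadata["rows_loaded"] = metadata.get("rows_loaded", 0) + len(rows)
    metadata["last_load"] = date.today().isoformat()

def deliver(storage):
    """Information delivery: total amount per customer."""
    totals = {}
    for row in storage:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals

storage, metadata = [], {}
clean, rejected = stage(source_rows)
load(storage, clean, metadata)
report = deliver(storage)
```

Staging is where the two spellings of the same customer are reconciled, which is what lets the delivery step report a single total for it.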

Page 12: Design of DW

Integration: facilitates an overview and analysis in the data warehouse

Separation: operations used for reporting, decision support, analysis and controlling

Page 13: Dimensions and Measures

Dimensions: categorize each item in a data set into non-overlapping regions.

Measures: properties that can be summed or averaged using pre-computed aggregates.
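A small sketch can make the two terms concrete. Below, `region` and `quarter` act as dimensions and `sales` as the measure; the fact rows and the `precompute` helper are made up for illustration. Each fact falls under exactly one dimension key, and sums and counts are computed once so that totals and averages can be read back later without rescanning the facts.

```python
from collections import defaultdict

# Illustrative fact rows (made-up values).
facts = [
    {"region": "east", "quarter": "Q1", "sales": 100.0},
    {"region": "east", "quarter": "Q2", "sales": 150.0},
    {"region": "west", "quarter": "Q1", "sales": 200.0},
    {"region": "west", "quarter": "Q1", "sales": 50.0},
]

def precompute(facts, dimensions, measure):
    """Return {dimension-value tuple: (sum, count)} for one measure."""
    agg = defaultdict(lambda: [0.0, 0])
    for row in facts:
        key = tuple(row[d] for d in dimensions)  # each row maps to one key
        agg[key][0] += row[measure]
        agg[key][1] += 1
    return {key: (total, count) for key, (total, count) in agg.items()}

# Aggregates computed ahead of time, at two levels of detail.
by_region = precompute(facts, ["region"], "sales")
by_region_quarter = precompute(facts, ["region", "quarter"], "sales")

# Sums and averages are then simple lookups.
west_total, west_count = by_region[("west",)]
west_average = west_total / west_count
```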

Page 14: Types of Data Warehouse

Financial
Insurance
Human Resources
Global
Data Mining/Data Mining and Exploration
Telecommunications

Page 15: Before DW

Executives and decision makers could not get critical information that already existed in the organization

The available data was exceedingly difficult to get ("data in jail")

Only a fraction of the data captured, processed and stored was actually available ("data poor")

Page 16: DW In Companies

Validation: where users validate what they already believe to be true (45%)

Tactical Reporting: where the user uses the data for tactical reasons (40%)

Exploration: where the user searches for knowledge not already known (15%)

Page 17: Why the volume of data is exploding

DWs carry historical data
DWs carry detailed data
DWs carry data for which there is no known need
DWs carry eCommerce data

Page 18: Advantages

Cuts costs
Boosts revenues
Saves time
Better customer service
Avoids old data
Queries or reports without impacting the performance of the operational systems
Combines related data from separate sources
Increased data consistency
Improves access to a wide variety of data

Page 19: Disadvantages

Can complicate business processes.
Data warehousing can have a learning curve that may be too long for impatient firms.
Can require a great deal of "maintenance."
The cost to capture data, clean it up, and deliver it.
Inability to adapt quickly to changing business conditions or requirements.

Page 20: Future Developments

Development of parallel DB servers with improved query engines will make it possible to access huge databases in much less time.

Another new technology is data warehouses that allow for the mixing of traditional numbers, text and multimedia. The availability of improved tools for data visualization (business intelligence) will allow users to see things that could never be seen before.

Page 21: Any Questions?