utilizing data warehousing as a business...
Post on 18-Jul-2020
0 Views
Preview:
TRANSCRIPT
Agenda
• Introduction• History of DW• Advantages & Disadvantages of Data Warehouse• Example dimension of a DW (SAP BI)
Introduction• According to W.H.Inmon:
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process
• Data warehousing is the process of creating, populating and querying a data warehouse.
• It is a framework for deriving information from data.
• Data mining is the process of identifying and interpreting intrinsic patterns in data to solve a business problem.
• A vital discovery that propelled the development of data warehousing was the fundamental differences between operational (transaction processing) systems and information (decision support) systems.
Characteristic Operational Systems (OLTP) Informational Systems (OLAP)
Primary Purpose Run the business on a current basis Support managerial decision making
Type of Data Real time based on current data Snapshots and predictions
Primary Users Clerks, salespersons, administrators Managers, analysts, customers
Scope Narrow, planned, and simple updates and queries Broad, complex queries and analysis
Design Goal Performance throughput, availability Ease of flexible access and use
Database concept Complex simple
Time-focus Point in time Period of time
Normalization High Low
Volume Many - constant updates and queries on one or a few table rows
Periodic batch updates and queries requiring many or all rows
Differences between Operational Systems & Informational Systems
Data warehousing has evolved rapidly since its inception. Here is the story timeline of data warehousing:
• 1970’s – Operational systems (such as data processing) were not able to handle large and frequent requests for data analyses. Data stored was in mainframe files and static databases. A request was processed from recorded tapes for specific queries and data gathering. This proved to be time consuming and an inconvenience.
• 1980’s – Real time computer applications became decentralized. Relational models and database management systems started emerging and becoming the wave. Retrieving data from operational databases was still a problem because of “islands of data.”
• 1990’s – Data warehousing emerged as a feasible solution to optimize andmanipulate data both internally and externally to allow business’ to make accurate decisions
History of DW
Aspects that supported the accelerated development of Data Warehouses:
• Improvements in database technology– The beginning of relational data models and relational
database management systems (RDBMS)• Advances in computer hardware
– The abundant use of affordable storage and other architectures
• The importance of end-users in information systems– The development of interfaces allowing easier use of systems
for end users• Advances in middleware products
– Enabled enterprise database connectivity across heterogeneous platforms
History of DW
Data Warehousing
Operational data
Information
Sales
Financial
Inventory
Operational Systems Data Warehouse
Customer
Geography
Product
Organized by processesor tasks
Organized by subject
A subject oriented approach
Why is a DW necessary?• Warehousing data outside the operational systems
– Data warehousing is needed for processing as OLAP queries in operational databases would substantially degrade the performance of operational tasks
• Integrating data from more than one operational system– Decision support needs consolidation (such as aggregation
and summarization) of data from heterogeneous sources; and operational databases contain only detailed raw data.
• Data is mostly volatile
• Data saved for longer periods than in transaction systems
Advantages of data warehouse
• Time Saving• Efficiency• Complete Documentation• Data Integration
Limitations
•High Cost•Complexity
Opportunities and Challenges for Data Warehousing
The opportunities and challenges for data warehousing are mainly reflected in four aspects:-
• Data Quality• Business Intelligence• E-business and the Internet• Other trends
Data Warehouse ProcessDATA SOURCES STAGING AREA DATA WAREHOUSE DECISION SUPPORT
Application Databases
Application Databases
Packaged application/ERP
Data
Packaged application/ERP
Data
Desktop DataDesktop Data
External DataExternal Data
Web-based DataWeb-based Data
______________________________________________________
______________________________________________________
INCOME ANNUAL REPORT
___ ___ ____ _____ ___ __
___ ___ ____ _____ ___ __
___ ___ ____ _____ ___ __
INCOME ANNUAL REPORT
___ ___ ____ _____ ___ __
___ ___ ____ _____ ___ __
___ ___ ____ _____ ___ __
Reports
EIS
OLAP
Statistical & Financial Analysis
EXTRACTIONTRANSFORMING
CLEANINGAGGREGATION
DATA WAREHOUSE
DATA MARTS
ETL process
External data sources
Extract Extract
AdjustedRaw data
DataWarehouse
Internal data sources
Staging
Area
Met
adat
a
Loading
Transformation
Extraction
Available DW products• Cognos• Crystal Decisions• Hyperion• IBM• Information Builders• Microsoft• Microstrategy• MIK• MIS AG• NCR• Oracle• SAP• SAS
Example dimension of a DW (SAP BI)
• Users: 460• Number of reports (queries): 400• Number of InfoCubes: 25• Data volume per day: 2 - 2.5 million datasets• Data since 2000• Data source: SAP R/3, SAP SEM 3.0, SAP CRM 3.0 and flat
files• Hardware:
– 11 system boards– 44 processors– 45 GB RAM– 2,000 GB hard drive capacity
Example dimension of a DW (SAP BI)
Summary• Decision support is an emerging, rapidly growing
subarea of databases.• Involves the creation of large, consolidated data
repositories called data warehouses.• Warehouses exploited using sophisticated analysis
techniques: complex SQL queries and OLAP “multidimensional” queries (influenced by both SQL and spreadsheets).
• New techniques for database design, indexing, view maintenance, and interactive querying need to be supported.
top related