PROJECT REPORT ON DATA WAREHOUSING SUBMITTED BY: SANA ALVI (24)

Post on 19-Jul-2015

TABLE OF CONTENTS:

ABSTRACT OF DATA WAREHOUSING

INTRODUCTION OF DATA WAREHOUSING

BACKGROUND OF DATA WAREHOUSING

ARCHITECTURE OF DATA WAREHOUSING

ADVANTAGES OF DATA WAREHOUSING

CONCLUSION

REFERENCES

ABSTRACT

Presently, large enterprises rely on database systems to manage their data and information. These databases are useful for conducting daily business transactions. However, tight competition in the marketplace has led to the concept of data mining, in which data are analyzed to derive effective business strategies and discover better ways of carrying out business. In order to perform data mining, regular databases must be converted into so-called informational databases, also known as data warehouses. A data warehouse is used for transforming an operational database into an informational warehouse useful for decision makers to conduct data analysis, prediction, and forecasting.

A data warehouse can be thought of as the place where the information systems department puts the data that is to be turned into information. Data warehouses are information repositories specialized in supporting decision making.

INTRODUCTION

Nowadays, almost every enterprise uses a database to store its vital data and information (Pant & Hsu, 1995). For instance, dynamic websites, accounting information systems, payroll systems, and stock management systems all rely on internal databases as containers to store and manage their data. Competition in the marketplace has led business managers and directors to seek new ways to increase their profit and market power by improving their decision-making processes. In this sense, the idea of the data warehouse and data mining was born (Zhu & Davidson, 2007).

Data warehousing is the process of collecting data from operational functional databases, transforming them, and then archiving them into a special data repository called a data warehouse, with the goal of producing accurate and timely management information (Preeti, Srikantha, & Suryakant, 2011). Data mining, in turn, is the process of discovering trends and patterns in a data warehouse that are useful for data analysis (Kantardzic, 2003).
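The collect, transform, and archive steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the cited sources: the `sales` and `monthly_sales` tables, their columns, and the function name `build_warehouse` are all assumptions made for the example, and an in-memory SQLite database stands in for both the operational database and the warehouse.

```python
import sqlite3


def build_warehouse(operational_rows):
    """Extract sales rows, transform them into monthly totals, and load the result."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical operational table: one row per individual sale.
    cur.execute("CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL)")
    cur.executemany("INSERT INTO sales VALUES (?, ?, ?)", operational_rows)

    # Hypothetical warehouse table: one row per month and product.
    cur.execute("CREATE TABLE monthly_sales (month TEXT, product TEXT, total REAL)")

    # Extract: read the raw transactions from the operational table.
    rows = cur.execute("SELECT sale_date, product, amount FROM sales").fetchall()

    # Transform: standardize product names and aggregate amounts by month.
    totals = {}
    for sale_date, product, amount in rows:
        key = (sale_date[:7], product.strip().lower())  # 'YYYY-MM', cleaned name
        totals[key] = totals.get(key, 0.0) + amount

    # Load: archive the aggregated rows in the warehouse table.
    cur.executemany(
        "INSERT INTO monthly_sales VALUES (?, ?, ?)",
        [(month, product, total) for (month, product), total in sorted(totals.items())],
    )
    conn.commit()

    result = cur.execute(
        "SELECT month, product, total FROM monthly_sales ORDER BY month, product"
    ).fetchall()
    conn.close()
    return result
```

Calling `build_warehouse([("2015-07-01", "Pen ", 2.0), ("2015-07-15", "pen", 3.0)])` merges the two differently spelled product names into a single monthly row, which is the kind of standardization a warehouse load typically performs.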

A typical university often comprises many subsystems crucial for its internal processes and operations. Examples of such subsystems include the student registration system, the payroll system, the accounting system, the course management system, the staff system, and many others. In essence, all these systems are connected to many underlying distributed databases that are employed for everyday transactions and processes.

BACKGROUND OF DATA WAREHOUSING

Operational Database: An operational database is a regular database meant to run the business on a current basis and support everyday transactions and processes (Inmon, 1992).

Informational Database: An informational database is a special type of database designed to support decision making, based on historical point-in-time data and prediction data, for complex queries and data mining applications. A data warehouse is an example of an informational database.
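The difference between the two kinds of database can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not an implementation from the cited literature: the class names `OperationalStore` and `InformationalStore` and their methods are hypothetical. The operational store keeps only the current value of each record, while the informational store keeps dated snapshots so that point-in-time questions can be answered.

```python
from datetime import date


class OperationalStore:
    """Holds only the current state; every update overwrites the old value."""

    def __init__(self):
        self.current = {}

    def update(self, key, value):
        self.current[key] = value


class InformationalStore:
    """Keeps dated snapshots so that historical values remain queryable."""

    def __init__(self):
        self.history = []  # list of (as_of_date, key, value) tuples

    def snapshot(self, as_of, operational):
        # Copy the current operational state, stamped with the snapshot date.
        for key, value in operational.current.items():
            self.history.append((as_of, key, value))

    def value_as_of(self, key, as_of):
        # Return the value from the latest snapshot taken on or before as_of.
        candidates = [(d, v) for d, k, v in self.history if k == key and d <= as_of]
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[0])[1]
```

After two snapshots of a changing balance, the operational store can report only the latest figure, whereas `value_as_of` can still return what the balance was on an earlier date.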

Data Mining: In its broader sense, data mining is a knowledge discovery process that uses a blend of statistical, machine learning, and artificial intelligence techniques to detect trends and patterns in large data sets, often stored in a data warehouse. The purpose of data mining is to discover new facts about data that are helpful for decision makers (Tan, Steinbach, & Kumar, 2005).
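As a very small illustration of trend detection, the sketch below fits a straight line to a series of historical values by ordinary least squares and extrapolates one period ahead. It is one simple statistical technique chosen for illustration, not the method of any cited author, and the function names `linear_trend` and `forecast_next` are assumptions made for the example.

```python
def linear_trend(values):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # Covariance of x and y divided by variance of x gives the slope.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_next(values):
    """Extrapolate the fitted line one period past the last observation."""
    slope, intercept = linear_trend(values)
    return intercept + slope * len(values)
```

A positive slope indicates an upward trend in the historical data, and `forecast_next` gives a rough estimate for the following period. Real data mining tools use far richer models, but the pattern of learning from historical data is the same.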

ARCHITECTURE OF DATA WAREHOUSING

The warehouse architecture may include (Gupta & Mumick, 1995):

• Process Architecture

• Data Model Architecture

• Technology Architecture

• Information Architecture

• Resource Architecture

ADVANTAGES OF DATA WAREHOUSING

• Integrating data from multiple sources

• Performing new types of analyses

• Reducing the cost of accessing historical data

• Standardizing data across the organization, a "single version of the truth"

• Improving turnaround time for analysis and reporting

• Sharing data and allowing others to easily access data

• Supporting ad hoc reporting and inquiry

• Reducing the development burden on IS/IT

• Removing informational processing load from transaction-oriented databases

CONCLUSION

A data warehouse is a collection of a wide variety of corporate data that is organised and made available to end users for decision-making purposes. It is used by managers and knowledge workers who require access to this data for analysing the business and planning its future. Data warehouses have come into use because of the availability of high-capacity mass storage. Data warehousing is a leading and reliable technology used today by companies for planning, forecasting, and management, for example resource planning and financial forecasting and control.

Data warehouses need considerable maintenance, and a support team of qualified professionals is needed to take care of the issues that arise after deployment, including data extraction, data loading, network management, training and communication, query management, and other related tasks.

REFERENCES

Anahory, S., & Murray, D. (1997). Data Warehousing in the Real World: A Practical Guide for Building Decision Support Systems. Addison Wesley Professional.

Chatziantoniou, D., & Ross, K. (2000). Querying Multiple Features. VLDB (pp. 11-22). Wiley.

Chen, B., Chen, L., Lin, Y., & Ramakrishnan, R. (2005). Prediction cubes (pp. 982–993).

Devlin, B., & Murphy, P. T. (1988). An architecture for a business and information system. IBM Systems Journal, 27(1), 27-55.

Gray, J., et al. (1997). Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. Data Mining and Knowledge Discovery, 1(1).

Gupta, A., & Mumick, I. S. (1995, June). Maintenance of materialized views: Problems, techniques, and applications. IEEE Data Engineering Bulletin, 18(2), 3-18.

Hackney, D. (1997). Understanding and Implementing Successful Data Marts. Addison-Wesley.

Husemann, B., Lechtenborger, J., & Vossen, G. (2000). Conceptual data warehouse (pp. 22-30). Stockholm.

Inmon, W. H. (1991). Third Wave Processing: Database Machines and Decision Support System. Canada: ACM.

Inmon, W. H. (1992). Building the Data Warehouse. New York: John Wiley.

Inmon, W. H. (1995). What is a Data Warehouse? Prism, 1(1), 50-62.

Inmon, W. H. (1999). Building the Operational Data Store. London: John Wiley & Sons.

Inmon, W. H., Imhoff, C., & Sousa, R. (2001). Corporate Information Factory.

Widom, J. (1995). Research Problems in Data Warehousing. CIKM.

Hoffer, J. A., Prescott, M., & Topi, H. (2008). Modern Database Management. Prentice Hall.

Kantardzic, M. (2003). Data Mining: Concepts, Models, Methods, and Algorithms. London: John Wiley & Sons.

Kantardzic, M. (2011). Data Mining: Concepts, Models, Methods, and Algorithms. John Wiley & Sons.

Kimball, R. (1996). The Data Warehouse Toolkit. Wiley Computer Publishing.

Kimball, R. (2000). Backward in Time. Intelligent Enterprise Magazine, 3-15.

Laberge, R. (2011). The Data Warehouse Mentor: Practical Data Warehouse and Business Intelligence. McGraw-Hill Osborne Media.

Lehner, W., Albrecht, J., & Wedekind, H. (1998). Normal forms for multidimensional. International Conference on Scientific and Statistical (pp. 63-72). Capri: ACM.

Malvestuto, F. M. (1988). Existence of extensions and product extensions for discrete. Discrete Mathematics, 69, 61–77.

Mendelzon, A., & Hurtado, C. (2004). OLAP.

Mertz, & Kimball, R. (2000). The Data Webhouse Toolkit: Building the Web-Enabled (pp. 20-43). Chichester: John Wiley & Sons.

Goebel, M., & Gruenwald, L. (1999, June). A Survey of Data Mining and Knowledge Discovery Software Tools. SIGKDD Explorations, 1(1), 20-33. ACM.

Microsoft Access Home Page. (n.d.). Retrieved from http://office.microsoft.com/en-us/access/default.aspx

Pant, S., & Hsu, C. (1995). Strategic Information Systems Planning: A Review. Information Resources Management Association, (pp. 21-24). Atlanta.

Power, D. (2000). What are the advantages and disadvantages of Data Warehouses? DSS news, 1(7), 1-22.

Preeti, S., Srikantha, R., & Suryakant, P. (2011). Optimization of Data Warehousing System: Simplification in Reporting and Analysis. International Journal of Computer Applications, 9, 33-37.

Golfarelli, M., & Rizzi, S. (2009). Data Warehouse Design: Modern Principles and Methodologies. New York: McGraw-Hill Osborne.

Sagiv, Y. (1983, January 9). A characterization of globally consistent databases and their access. 266–286.

Sagiv, Y. (1991). Evaluation of queries in independent database schemes. 38, 120–161.

Sen, A., & Sinha, A. P. (2005). A Comparison of Data warehousing. ACM, 79-84.

Stephen, R. (1998). Building the Data Warehouse.

Tan, P.-N., Steinbach, M., & Kumar, V. (2005). Introduction to Data Mining. New York: Addison Wesley.

Zhu, X., & Davidson, I. (2007). Knowledge Discovery and Data Mining: Challenges and Realities (pp. 30-41). New York: Hershey.