![Page 1: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/1.jpg)
Introduction to data mining
![Page 2: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/2.jpg)
Literature
![Page 3: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/3.jpg)
Data mining in commerce
• About 13 million customers per month contact the West Coast customer service call center of the Bank of America.
• In the past, each caller would have listened to the same marketing advertisement, whether or not it was relevant to the caller’s interests.
• Chris Kelly, vice president and director of database marketing: “rather than pitch the product of the week, we want to be as relevant as possible to each customer”
• Thus, based on individual customer profiles, the customer can be informed of new products that may be of greatest interest.
• Data mining helps to identify the type of marketing approach for a particular customer, based on the customer’s individual profile.
![Page 4: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/4.jpg)
Recommendation systems
![Page 5: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/5.jpg)
Why mine data – commercial viewpoint
• Lots of data is being collected
– Web data, e-commerce
– purchases at department/grocery stores
– bank/credit card transactions
• Computers have become cheaper and more powerful
• Competitive pressure is strong
– provide better, customized services
R. Grossman, C. Kamath, V. Kumar, “Data Mining for Scientific and Engineering Applications”
![Page 6: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/6.jpg)
Why mine data – scientific viewpoint
• Data is collected and stored at enormous speeds (GB/hour)
– remote sensors on a satellite
– telescopes scanning the skies
– microarrays generating gene expression data
– scientific simulations generating terabytes of data
• Traditional techniques are infeasible for raw data
• Data mining may help scientists
– in classifying and segmenting data
– in hypothesis formation
R. Grossman, C. Kamath, V. Kumar, “Data Mining for Scientific and Engineering Applications”
![Page 7: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/7.jpg)
Data mining in bioinformatics
• Brain tumors represent the most deadly cancer among children
• Gene expression database for pediatric brain tumors was built, in an effort to develop more effective treatment.
![Page 8: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/8.jpg)
• Clearly, a lot of data is being collected.
• However, what is being learned from all this data? What knowledge are we gaining from all this information?
• “we are drowning in information but starved for knowledge”
• The problem today is not that there is not enough data. Rather, the problem is that there are not enough trained human analysts available who are skilled at translating all of this data into knowledge.
![Page 9: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/9.jpg)
• Data mining is the process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques.
(www.gartner.com)
• Data mining is an interdisciplinary field bringing together techniques from machine learning, pattern recognition, statistics, databases, and visualization to address the issue of information extraction from large databases.
(Peter Cabena, Pablo Hadjinian, Rolf Stadler, Jaap Verhees, and Alessandro Zanasi, Discovering Data Mining: From Concept to Implementation, Prentice Hall, Upper Saddle River, NJ, 1998.)
![Page 10: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/10.jpg)
• The growth in this field has been fueled by several factors:
– growth in data collection
– storing of the data in data warehouses
– availability of increased access to data from the Web
– competitive pressure to increase market share
– development of data mining software suites
– tremendous growth in computing power and storage capacity
![Page 11: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/11.jpg)
Need for human direction of DM
• Don’t believe software vendors who advertise their analytical software as a plug-and-play, out-of-the-box application that provides solutions without the need for human interaction!
• Data mining is not a product that can be bought; it is a discipline that must be mastered!
![Page 12: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/12.jpg)
• It is easy to do data mining badly.
• Software always gives some result.
• A little knowledge is especially dangerous
– e.g. analysis carried out on unpreprocessed data can lead to erroneous conclusions, and the models can be way off
– if deployed, the errors can lead to very expensive failures
• The costly errors stem from the black-box approach.
![Page 13: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/13.jpg)
Data mining trap
• If we try hard enough, we always find some patterns.
• However, they may be just a matter of chance. They don’t have to be characteristic of the process that generates the data.
• Derogatory sport definition of data mining:
Data mining means sorting through a huge volume of data, extracting decision rules that seem to favor one team over another, but without regard to whether or not there is any cause-and-effect relationship. Data mining is the equivalent of sitting a huge number of monkeys down at keyboards, and then reporting on the monkeys who happened to type actual words.
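The trap can be demonstrated with a small simulation. The sketch below is only an illustration under assumed settings (20 samples, 500 features, all drawn as independent random noise; none of these numbers come from the lecture): even though no feature has any real relationship to the target, the strongest of the 500 correlations looks impressive.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

random.seed(0)                      # fixed seed so the run is repeatable
n_samples, n_features = 20, 500     # assumed sizes, chosen for illustration

# Target and features are pure noise: there is nothing real to discover.
target = [random.gauss(0, 1) for _ in range(n_samples)]
features = [[random.gauss(0, 1) for _ in range(n_samples)]
            for _ in range(n_features)]

correlations = [pearson(f, target) for f in features]
best = max(correlations, key=abs)
print(f"strongest correlation among {n_features} random features: {best:.2f}")
```

With this many candidate features and so few samples, some feature will correlate strongly with the target by chance alone, which is why a pattern has to be validated on data that was not used to find it.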
![Page 14: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/14.jpg)
• Instead, apply a “white-box” methodology.
• i.e. understand the algorithms and statistical model structures underlying the software
• The white-box approach is the reason why you are attending this lecture (apart from the fact that the lecture is compulsory).
![Page 15: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/15.jpg)
Data mining as a process
• One of the fallacies associated with DM is that DM represents an isolated set of tools
• Instead, DM should be viewed as a process
• The process is standardized: the CRISP-DM framework (http://www.crisp-dm.org/)
– Cross-Industry Standard Process for Data Mining
– developed in 1996 by analysts from DaimlerChrysler, SPSS, and NCR
– provides a nonproprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research unit
![Page 16: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/16.jpg)
CRISP-DM
[Figure: the CRISP-DM process cycle, with the starting point marked “starts here”]
![Page 17: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/17.jpg)
1. Business understanding phase
– formulate the project objectives and requirements
2. Data understanding phase
– collect the data
– use EDA (exploratory data analysis) to familiarize yourself with the data
– evaluate the quality of the data
3. Data preparation phase
– prepare the final data set from the initial raw data; this phase is very labor intensive
– select the cases and variables you want to analyze
– perform transformations of variables, if needed
– clean the raw data so they are ready for modeling tools
![Page 18: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/18.jpg)
4. Modeling phase
– select and apply appropriate modeling techniques
– calibrate model settings to optimize results
– often, several different techniques may be used
– if necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique
5. Evaluation phase
– evaluate models for quality and effectiveness
– establish whether some important facet of the business or research problem has not been accounted for sufficiently
![Page 19: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/19.jpg)
6. Deployment phase
– make use of the models created
– examples of deployment:
  • a report
  • implementing a parallel DM process in another department
![Page 20: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/20.jpg)
CRISP-DM example
The case study investigated patterns in the warranty claims for DaimlerChrysler automobiles.
• Business understanding
– Objectives: reduce costs associated with warranty claims and improve customer satisfaction
– Specific business problems can be formulated:
  • Are there interdependencies among warranty claims?
  • Are past warranty claims associated with similar claims in the future?
Jochen Hipp and Guido Lindner, Analyzing warranty claims of automobiles: an application description following the CRISP–DM data mining process, in Proceedings of the 5th International Computer Science Conference (ICSC ’99), pp. 31–40, Hong Kong, December 13–15, 1999
![Page 21: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/21.jpg)
• Data understanding
– use of DaimlerChrysler’s Quality Information System (QUIS)
– it contains information on over 7 million vehicles and is about 40 gigabytes in size
– QUIS contains production details about how and where a particular vehicle was constructed, plus warranty claim information
– researchers stressed the fact that the database was entirely unintelligible to domain nonexperts
  • experts from different departments had to be located and consulted, a task that turned out to be rather costly
![Page 22: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/22.jpg)
• Data preparation
– the QUIS DB did not contain all the information needed for modeling purposes
– e.g. the variable “number of days from selling date until first claim” had to be derived from the appropriate date attributes
– researchers then turned to DM software, where they ran into a common roadblock: data format requirements varied from algorithm to algorithm
  • the result was further exhaustive preprocessing of the data
– researchers mention that the data preparation phase took much longer than they had planned
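Deriving a variable such as “number of days from selling date until first claim” can be sketched as follows. The record layout and field names (`vehicle_id`, `selling_date`, `claim_dates`) are hypothetical and chosen only for illustration; they are not the QUIS schema.

```python
from datetime import date

# Hypothetical vehicle records; the field names are assumptions.
vehicles = [
    {"vehicle_id": "A", "selling_date": date(1998, 3, 1),
     "claim_dates": [date(1999, 1, 10), date(1998, 9, 15)]},
    {"vehicle_id": "B", "selling_date": date(1998, 5, 20),
     "claim_dates": []},  # no warranty claim so far
]

def days_to_first_claim(record):
    """Days from selling date until the earliest claim, or None if no claim."""
    if not record["claim_dates"]:
        return None
    first_claim = min(record["claim_dates"])
    return (first_claim - record["selling_date"]).days

# Attach the derived attribute to every record.
for record in vehicles:
    record["days_to_first_claim"] = days_to_first_claim(record)
    print(record["vehicle_id"], record["days_to_first_claim"])
```

Vehicles without any claim need an explicit representation (here `None`); how such missing values are encoded is one of the format requirements that tends to differ between algorithms.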
![Page 23: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/23.jpg)
• Modeling
– to investigate dependencies, researchers used
  • Bayesian networks
  • association rule mining
– the details of the results are confidential, but we can get a general idea of the dependencies uncovered by the models
  • a particular combination of construction specifications doubles the probability of encountering an automobile electrical cable problem
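A statement like “a combination of specifications doubles the probability of a problem” corresponds to an association rule with a lift of 2. The sketch below computes support, confidence, and lift for one rule. All counts are invented for illustration, since the actual figures from the study are confidential.

```python
# Invented counts, for illustration only.
n_vehicles = 1000        # all vehicles considered
n_combination = 200      # vehicles built with the specification combination
n_problem = 100          # vehicles with an electrical cable problem
n_both = 40              # vehicles with the combination AND the problem

# Rule: combination -> cable problem
support = n_both / n_vehicles            # P(combination and problem)
confidence = n_both / n_combination      # P(problem | combination)
baseline = n_problem / n_vehicles        # P(problem)
lift = confidence / baseline             # how much the combination raises the risk

print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.1f}")
```

A lift above 1 means the problem is more likely among vehicles with the combination than in the fleet overall; a lift near 1 means the rule carries no information, however high its confidence.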
![Page 24: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/24.jpg)
• Evaluation
– The researchers were disappointed that the association rule models were found to be lacking in effectiveness and to fall short of the objectives set for them in the business understanding phase
  • “In fact, we did not find any rule that our domain experts would judge as interesting.”
– To account for this, the researchers point to the “legacy” structure of the database, in which automobile parts were categorized by garages and factories for historic or technical reasons and which was not designed for data mining.
– They suggest redesigning the database to make it more amenable to knowledge discovery.
![Page 25: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/25.jpg)
• Deployment
– It was a pilot project, without intention to deploy any large-scale models from the first iteration.
– Product: a report describing lessons learned from this project
  • e.g. a change of the structure of the database (new variables, different categorization of automobile parts)
![Page 26: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/26.jpg)
Lessons learned
• uncovering hidden nuggets of knowledge in databases is a rocky road
• intense human participation and supervision is required at every stage of the data mining process
• there is no guarantee of positive results
![Page 27: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/27.jpg)
Connection to other fields
[Diagram: Data Mining at the center, connected to Statistics, Database systems, Visualization, Machine learning, and Pattern recognition]
![Page 28: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/28.jpg)
Machine learning
• A subfield of artificial intelligence.
• A discipline concerned with the design and development of algorithms that allow computers to evolve behavior based on experience.
– experience: empirical data, such as from sensors or databases
– evolve behavior: usually through a search for patterns in data
• Its goal is similar to that of DM; DM uses algorithms from ML.
![Page 29: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/29.jpg)
Pattern recognition
• The problem of searching for patterns is a fundamental one, with a long and successful history.
• For instance, the extensive astronomical observations of Tycho Brahe in the 16th century allowed Johannes Kepler to discover the empirical laws of planetary motion, which in turn provided a springboard for the development of classical mechanics.
![Page 30: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/30.jpg)
Pattern recognition
• the automatic discovery of regularities in data through the use of computer algorithms, and the use of these regularities to take actions such as classifying the data into different categories
![Page 31: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/31.jpg)
Pattern recognition
[Figure: example data and the patterns found in them]
• Pattern: if a train has 2 wagons, it goes to the left
![Page 32: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/32.jpg)
More real patterns
• face detection
![Page 33: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/33.jpg)
Connection to other fields
[Diagram: Data Mining at the center, connected to Statistics, Database systems, Visualization, Machine learning, and Pattern recognition]
![Page 34: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/34.jpg)
Iris Sample Data Set
• Many of the exploratory data techniques are illustrated with Fisher’s Iris Plant data set.
– From the statistician Ronald Fisher, mid-1930s
– Can be obtained from the UCI Machine Learning Repository http://www.ics.uci.edu/~mlearn/MLRepository.html
based on WEKA tutorial
![Page 35: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/35.jpg)
Fisher, R.A. (1936). "The Use of Multiple Measurements in Taxonomic Problems". Annals of Eugenics 7: 179–188, http://digital.library.adelaide.edu.au/coll/special//fisher/138.pdf.
iris setosa iris versicolor iris virginica
Contains flower dimension measurements on 50 samples of each species.
![Page 36: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/36.jpg)
Data mining terminology
• The four iris dimensions are termed attributes, input attributes, or features
• The three iris species are termed classes or output attributes
• Each example of an iris is termed a sample, instance, object, or data point
These dimensions were measured:
• sepal length
• sepal width
• petal length
• petal width
Measurements were taken on these iris species:
• setosa
• versicolor
• virginica
![Page 37: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/37.jpg)
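| Inst. | Sepal Length | Sepal Width | Petal Length | Petal Width | Species |
|---|---|---|---|---|---|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 4.9 | 3 | 1.4 | 0.2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5 | 3.6 | 1.4 | 0.2 | setosa |

The four measurement columns are the input attributes (numerical). Species is the output attribute (nominal), i.e. the class. Each row is one sample.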
![Page 38: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/38.jpg)
Statistics
• statistical analysis
– summary statistics (mean, median, standard deviation)
• Exploratory Data Analysis (EDA)
– A preliminary exploration of the data to better understand its characteristics.
– Created by statistician John Tukey
– A nice online introduction can be found in Chapter 1 of the NIST Engineering Statistics Handbook http://www.itl.nist.gov/div898/handbook/index.htm
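As a minimal sketch of summary statistics, the snippet below uses Python's standard `statistics` module on the sepal lengths of the five setosa samples listed in the iris table. Five values are far too few for a real analysis; the point is only to show the three measures side by side.

```python
import statistics

# Sepal lengths of the five setosa samples from the iris table.
sepal_lengths = [5.1, 4.9, 4.7, 4.6, 5.0]

mean = statistics.mean(sepal_lengths)      # arithmetic average
median = statistics.median(sepal_lengths)  # middle value of the sorted data
stdev = statistics.stdev(sepal_lengths)    # sample standard deviation (n - 1)

print(f"mean={mean:.2f} median={median:.2f} stdev={stdev:.3f}")
```

`statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would be the choice if the five values were treated as the whole population.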
![Page 39: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/39.jpg)
EDA
• Helps to select the right tool for preprocessing or analysis
• People can recognize patterns not captured by data analysis tools
• In EDA, as originally defined by Tukey
– The focus was on visualization
– Clustering and anomaly detection were viewed as exploratory techniques
– In data mining, clustering and anomaly detection are major areas of interest, and not thought of as just exploratory
• In EDA, a human makes and validates hypotheses
– while in DM, the computer makes and validates hypotheses
![Page 40: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/40.jpg)
[Figure: plot of the iris data with the species setosa, virginica, and versicolor labeled]
![Page 41: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/41.jpg)
[Figure: plot of the iris data with the species setosa, versicolor, and virginica labeled]
![Page 42: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/42.jpg)
[Figure: scatter plot of sepal width against sepal length]
![Page 43: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/43.jpg)
Connection to other fields
[Diagram: Data Mining at the center, connected to Statistics, Database systems, Visualization, Machine learning, and Pattern recognition]
![Page 44: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/44.jpg)
Visualization
• Can reveal hypotheses
![Page 45: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/45.jpg)
Connection to other fields
[Diagram: Data Mining at the center, connected to Statistics, Database systems, Visualization, Machine learning, and Pattern recognition]
![Page 46: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/46.jpg)
Data warehouse
• A data warehouse is a repository of an organization's electronically stored data.
• Data warehouses are designed to facilitate reporting and analysis.
• Technology:
– relational database system
– multidimensional database system
![Page 47: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/47.jpg)
Data warehousing
• the process of constructing and using a data warehouse
• Data warehousing is the coordinated, periodic copying of data from various sources, both inside and outside the enterprise, into an environment optimized for analytical and informational processing.
![Page 48: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/48.jpg)
Data warehousing includes
• business intelligence tools
• tools to extract, transform, and load data
• tools to manage and retrieve metadata
![Page 49: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/49.jpg)
Business intelligence tools
• a type of application software designed to report, analyze, and present data
• they include
– reporting and querying software
  • “Tell me what happened.”
  • tools that extract, sort, summarize, and present selected data
– OLAP (On-Line Analytical Processing)
  • “Tell me what happened and why.”
– data mining
  • “Tell me what might happen.” (predict)
  • “Tell me something interesting.” (relationships)
![Page 50: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/50.jpg)
OLAP
• Query and report data is typically presented in row after row of two-dimensional data.
• OLAP: “Tell me what happened and why.”
• To support this type of processing, OLAP operates against multidimensional databases.
![Page 51: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/51.jpg)
Example: Iris data
• We show how the attributes petal length, petal width, and species type can be converted to a multidimensional array
– First, we discretize the petal width and length to have categorical values: low, medium, and high
– We get the following table (note the count attribute)
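The discretization and counting step might look like the sketch below. The cut points are assumptions chosen for illustration (the slides do not state them), and the flower list is a small made-up sample rather than the full 150-flower data set.

```python
from collections import Counter

# Assumed cut points: (upper bound of "low", upper bound of "medium").
PETAL_LENGTH_CUTS = (2.5, 5.0)
PETAL_WIDTH_CUTS = (0.75, 1.75)

def discretize(value, low_upper, medium_upper):
    """Map a numeric value to 'low', 'medium', or 'high'."""
    if value < low_upper:
        return "low"
    if value < medium_upper:
        return "medium"
    return "high"

# (petal length, petal width, species): a small illustrative sample.
flowers = [
    (1.4, 0.2, "setosa"), (1.4, 0.2, "setosa"), (1.3, 0.2, "setosa"),
    (1.5, 0.2, "setosa"), (1.4, 0.2, "setosa"),
    (4.5, 1.5, "versicolor"), (4.7, 1.4, "versicolor"),
    (6.0, 2.5, "virginica"), (5.1, 1.9, "virginica"),
]

# Count flowers per (length level, width level, species) combination.
counts = Counter(
    (discretize(length, *PETAL_LENGTH_CUTS),
     discretize(width, *PETAL_WIDTH_CUTS),
     species)
    for length, width, species in flowers
)

for (length_level, width_level, species), count in sorted(counts.items()):
    print(length_level, width_level, species, count)
```

Each distinct combination becomes one row of the table, and the number of flowers falling into it becomes the count attribute.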
![Page 52: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/52.jpg)
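[Table: counts of flowers for each combination of discretized petal length, petal width, and species type]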
![Page 53: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/53.jpg)
• Slices of the multidimensional array are shown by the following cross-tabulations
[Tables: one cross-tabulation for each species: Setosa, Versicolor, Virginica]
![Page 54: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/54.jpg)
Creating a Multidimensional Array
• There are two key steps in converting tabular data into a multidimensional array.
1. Identify which attributes are to be the dimensions and which attribute is to be the target attribute whose values appear as entries in the multidimensional array.
  • The attributes used as dimensions must have discrete values
  • The target value is typically a count or continuous value
2. Find the value of each entry in the multidimensional array by summing the values (of the target attribute) or count of all objects that have the attribute values corresponding to that entry.
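The two steps can be sketched with plain nested lists. The dimensions are petal length level, petal width level, and species; the target attribute is the count. The counts in the table below are illustrative assumptions, not values taken from the slides.

```python
LEVELS = ["low", "medium", "high"]
SPECIES = ["setosa", "versicolor", "virginica"]

# Tabular data: (petal length, petal width, species, count).
# The counts are illustrative assumptions.
table = [
    ("low", "low", "setosa", 46),
    ("low", "medium", "setosa", 2),
    ("medium", "low", "setosa", 2),
    ("medium", "medium", "versicolor", 43),
    ("medium", "high", "versicolor", 3),
    ("medium", "high", "virginica", 3),
    ("high", "medium", "versicolor", 2),
    ("high", "medium", "virginica", 3),
    ("high", "high", "versicolor", 2),
    ("high", "high", "virginica", 44),
]

# Step 1: the dimensions are length, width, and species; the target is count.
# cube[i][j][k] holds the entry for LEVELS[i], LEVELS[j], SPECIES[k].
cube = [[[0 for _ in SPECIES] for _ in LEVELS] for _ in LEVELS]

# Step 2: sum the target attribute into the entry each row corresponds to.
for length, width, species, count in table:
    i, j, k = LEVELS.index(length), LEVELS.index(width), SPECIES.index(species)
    cube[i][j][k] += count

total = sum(sum(sum(row) for row in plane) for plane in cube)
print("entries:", len(LEVELS) * len(LEVELS) * len(SPECIES), "total:", total)
```

Combinations that never occur in the table simply keep the entry 0, which is why multidimensional arrays built this way are often sparse.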
![Page 55: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/55.jpg)
OLAP Operations: Data Cube
• The key operation of OLAP is the formation of a data cube.
• A data cube is a multidimensional representation of data, together with all possible aggregates.
![Page 56: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/56.jpg)
• By all possible aggregates, we mean the aggregates that result by selecting a proper subset of the dimensions and summing over all remaining dimensions.
• For example, if we choose the species type dimension of the Iris data and sum over all other dimensions, the result will be a one-dimensional array with three entries, each of which gives the number of flowers of each type.
![Page 57: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/57.jpg)
Data Cube Example
• Consider a data set that records the sales of products at a number of company stores at various dates.
• This data can be represented as a 3-dimensional array
• There are 3 two-dimensional aggregates (3 choose 2), 3 one-dimensional aggregates, and 1 zero-dimensional aggregate (the overall total)
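The count of aggregates can be checked with a short sketch: for every proper subset of the three dimensions, keep those dimensions and sum over the rest. The sales figures, product names, and store names are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "store", "date")

# Invented sales cells: (product, store, date) -> units sold.
sales = {
    ("shirt", "north", "2024-01-01"): 10,
    ("shirt", "south", "2024-01-01"): 5,
    ("lamp", "north", "2024-01-01"): 3,
    ("lamp", "north", "2024-01-02"): 7,
    ("shirt", "south", "2024-01-02"): 4,
}

def aggregate(cells, keep):
    """Keep the dimensions named in `keep` and sum over all the others."""
    positions = [DIMENSIONS.index(name) for name in keep]
    result = defaultdict(int)
    for key, value in cells.items():
        result[tuple(key[p] for p in positions)] += value
    return dict(result)

# Every proper subset of the dimensions (sizes 0, 1, and 2) gives one aggregate.
aggregates = {
    keep: aggregate(sales, keep)
    for size in range(len(DIMENSIONS))
    for keep in combinations(DIMENSIONS, size)
}

for keep, values in aggregates.items():
    print(keep or "(overall total)", values)
```

The empty subset yields the zero-dimensional aggregate, whose single entry is the overall total.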
![Page 58: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/58.jpg)
• The following table shows one of the two-dimensional aggregates, along with two of the one-dimensional aggregates, and the overall total
![Page 59: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/59.jpg)
OLAP Operations
• Various operations are defined on the data cube:
– Slicing/dicing: selecting a group/subgroup of cells from the entire multidimensional array by specifying a specific value for one or more dimensions.
– Roll-up and drill-down: changing the granularity of the data
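A minimal sketch of slicing and dicing on a cube stored as a dictionary follows. It assumes one common convention: a slice fixes a single value on each chosen dimension, while a dice restricts chosen dimensions to a set of allowed values. The cell counts are illustrative assumptions.

```python
DIMENSIONS = ("petal_length", "petal_width", "species")

# Illustrative cube cells: (length level, width level, species) -> count.
cells = {
    ("low", "low", "setosa"): 46,
    ("low", "medium", "setosa"): 2,
    ("medium", "low", "setosa"): 2,
    ("medium", "medium", "versicolor"): 43,
    ("medium", "high", "versicolor"): 3,
    ("medium", "high", "virginica"): 3,
    ("high", "medium", "versicolor"): 2,
    ("high", "medium", "virginica"): 3,
    ("high", "high", "versicolor"): 2,
    ("high", "high", "virginica"): 44,
}

def slice_cube(cube, fixed):
    """Keep cells whose value equals the given one on every fixed dimension."""
    return {
        key: value for key, value in cube.items()
        if all(key[DIMENSIONS.index(dim)] == wanted
               for dim, wanted in fixed.items())
    }

def dice_cube(cube, allowed):
    """Keep cells whose value lies in the allowed set on every listed dimension."""
    return {
        key: value for key, value in cube.items()
        if all(key[DIMENSIONS.index(dim)] in values
               for dim, values in allowed.items())
    }

setosa_slice = slice_cube(cells, {"species": "setosa"})
large_petals = dice_cube(cells, {
    "petal_length": {"medium", "high"},
    "species": {"versicolor", "virginica"},
})
print(sum(setosa_slice.values()), sum(large_petals.values()))
```

Both operations only select cells; they never change the stored values, so they can be chained freely with aggregation.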
![Page 60: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/60.jpg)
The End
![Page 61: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/61.jpg)
OLAP Operations: Roll-up and Drill-down
• Attribute values often have a hierarchical structure.
– Each date is associated with a year, month, and week.
– A location is associated with a continent, country, state (province, etc.), and city.
– Products can be divided into various categories, such as clothing, electronics, and furniture.
• Note that these categories often nest and form a tree or lattice
– A year contains months, which contain days
– A country contains states, which contain cities
![Page 62: Introduction to data mining](https://reader036.vdocuments.us/reader036/viewer/2022062323/56816855550346895dde6d79/html5/thumbnails/62.jpg)
OLAP Operations: Roll-up and Drill-down
• This hierarchical structure gives rise to the roll-up and drill-down operations.
– For sales data, we can aggregate (roll up) the sales across all the dates in a month.
– Conversely, given a view of the data where the time dimension is broken into months, we could split the monthly sales totals (drill down) into daily sales totals.
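Roll-up and drill-down along the time hierarchy can be sketched as follows. The daily sales figures are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Invented daily sales totals.
daily_sales = {
    date(2024, 1, 5): 100,
    date(2024, 1, 20): 50,
    date(2024, 2, 3): 70,
}

def roll_up_to_months(daily):
    """Aggregate daily totals into (year, month) totals."""
    monthly = defaultdict(int)
    for day, amount in daily.items():
        monthly[(day.year, day.month)] += amount
    return dict(monthly)

def drill_down_to_days(daily, year, month):
    """Return the daily totals that make up one monthly total."""
    return {
        day: amount for day, amount in daily.items()
        if (day.year, day.month) == (year, month)
    }

monthly_sales = roll_up_to_months(daily_sales)
january_days = drill_down_to_days(daily_sales, 2024, 1)
print(monthly_sales)
print(january_days)
```

Drill-down is only possible because the daily figures are still stored: a monthly total on its own cannot be split back into days.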