Data Analysis for the New Millennium: Data Mining, Genetic Algorithms, and Visualization
by
Bruce L. Golden
RH Smith School of Business
University of Maryland
IMC Knowledge Management Seminar
April 15, 1999
Focus
Connection between data and knowledge
Examples of data analysis from late 1980s
Contrast with data analysis in late 1990s
Introduce techniques
• MDS and Sammon maps
• neural networks and SOMs
• decision trees
• genetic algorithms
Illustrate the power of visualization
Data analysis as a strategic asset
Conclusion
Setting the Stage
“Data is information devoid of context, information is data in context
and knowledge is information with causal links. The more structure is
added to a pool of information, the more we can talk about knowledge.”
(California Management Review, Winter 1999)
How can we systematically discover knowledge from data?
data → information → knowledge
Modeling Salinity Dynamics in the Chesapeake Bay
The Goal: Construct multiple regression models that accurately
describe the dynamics of salinity in the Maryland portion of the Bay
The work was done in late 1989/early 1990
Motivation:
• Salinity exerts a major influence over the survival and distribution
of many fish species in the Chesapeake Bay
• Maryland was the nation’s leader in oyster production several
decades ago
• MDNR’s oyster production rebuilding program relied on predicting
salinity levels for various areas in the Bay
Model Building
Transformation of key variables
Extensive screening for independent variables
Used stepwise regression in SPSS/PC
Data collected at 34 stations by USEPA from 1984 to 1989
36,258 water samples collected at different depths (9 variables
per sample)
Constructed 10 models in all (bottom data / total data)
Upper, Middle, Lower, Entire Bay, Lower Tributaries
Four key independent variables
Day, Depth, Latitude, Longitude
Model Results
Six independent variables in each model
Model assumptions are not violated
R² values range from 0.56 to 0.81
Entire Bay Model
R² = 0.649
As depth increases, salinity increases (the Depth coefficient is positive)
Salinity = 199.839 - 1.151 Day1 + 1.161 Day2 + 0.283 Depth - 4.863 Latitude - 1.543 Longitude - 13.402 Longitude1
Salinity Modeling Summary
The regression models were validated using new (1990) data
involving 7,000 observations
These regression models can be used to predict salinity for a
location on the Bay at a specified depth and date
In 1991, we applied neural networks to the same problem
To our surprise, the neural network models predicted salinity
levels more accurately than the regression models in 90% of
the cases
The Problem with Linear Regression
“But we all know the world is nonlinear.” (Harold Hotelling, 1948)
[Figure: Linear Regression Shortcomings: Nonlinear Data. The predicted regression line is plotted against the true regression line. (Cabena et al., 1998)]
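A minimal sketch of the shortcoming, using made-up data rather than the salinity data: a straight line is fitted by ordinary least squares to points that lie exactly on the curve y = x². The helper names `fit_line` and `r_squared` are illustrative, not taken from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_resid / ss_total


# Illustrative data only: a perfectly deterministic but nonlinear relationship.
xs = [float(x) for x in range(-5, 6)]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

Because the data are symmetric about x = 0, the fitted line is flat and explains essentially none of the variation, even though y is fully determined by x.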
Neural Network Configuration
[Figure: network diagram. Output node: Salinity Level. Middle layer: Hidden Nodes. Input nodes: Station Number, Depth, Latitude, Longitude, Date, Longitude * Depth.]
Neural Networks
Neural networks are computer programs designed to recognize
patterns and learn “like” the human brain
They are versatile and have been used to perform prediction and
classification
The key is to iteratively determine the “best” weights for the links
connecting the nodes
Drawback: It is difficult to explain/interpret the results (same is true
for regression)
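The idea of iteratively determining the weights can be sketched with a very small network trained by batch gradient descent. Everything specific here is an assumption made for illustration: one input, three tanh hidden nodes, a squared-error loss, made-up training data, and the function name `train_tiny_network`. It is not the configuration used for the salinity models.

```python
import math
import random


def train_tiny_network(xs, ys, hidden=3, epochs=2000, lr=0.05, seed=0):
    """Batch gradient descent for a 1-input network with one hidden layer."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden weights
    b = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden biases
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output weights
    c = 0.0                                              # output bias
    n = len(xs)
    losses = []
    for _ in range(epochs):
        grad_w = [0.0] * hidden
        grad_b = [0.0] * hidden
        grad_v = [0.0] * hidden
        grad_c = 0.0
        loss = 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w[j] * x + b[j]) for j in range(hidden)]
            err = c + sum(v[j] * h[j] for j in range(hidden)) - y
            loss += err * err / n
            grad_c += 2 * err / n
            for j in range(hidden):
                grad_v[j] += 2 * err * h[j] / n
                back = 2 * err * v[j] * (1 - h[j] ** 2) / n  # chain rule through tanh
                grad_w[j] += back * x
                grad_b[j] += back
        losses.append(loss)  # mean squared error before this epoch's update
        # Move every weight a small step against its gradient.
        c -= lr * grad_c
        for j in range(hidden):
            w[j] -= lr * grad_w[j]
            b[j] -= lr * grad_b[j]
            v[j] -= lr * grad_v[j]
    return losses


# Made-up nonlinear target, y = x^2 on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]
losses = train_tiny_network(xs, ys)
print(f"loss at first epoch: {losses[0]:.4f}, at last epoch: {losses[-1]:.4f}")
```

The falling loss is the whole point of the sketch. The trained weights themselves carry no obvious meaning, which is the interpretability drawback noted above.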
Visualization
Psychologists claim that more than 80% of the information
we absorb is received visually (Cabena et al., 1997)
Data is often highly multidimensional
Mapping from three or more dimensions to two dimensions
is not easy
Flattening the Earth
“Would you tell me, please, which way I ought to go from here?” asked Alice.
“That depends a good deal on where you want to get to,” said the Cat.
(Lewis Carroll)
Map Projections
We use map projections to represent a spherical Earth on a
flat surface
Two map projections of the world can look quite different
All map projections distort reality in some ways -- shape, area,
distance, angles, etc.
Equivalent projections preserve area
Conformal projections preserve angles
No projection can be both conformal and equivalent
Bottom line: map projections are extremely useful, but offer
compromise solutions
Visual Clustering (Segmentation) Methods
Multi-Dimensional Scaling (MDS)
Sammon Mapping
Self-Organizing Maps
Euclidean distance (more or less) is used as a similarity measure
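As a rough illustration of how such methods judge a two-dimensional layout, the sketch below computes pairwise Euclidean distances and Sammon's stress, which is the quantity a Sammon map tries to minimize. The example points are made up, and the function names are illustrative only.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def sammon_stress(high_dim, low_dim):
    """Sammon's stress: 0 means every pairwise distance is preserved exactly."""
    total = 0.0
    weighted_error = 0.0
    for i, j in combinations(range(len(high_dim)), 2):
        d_high = euclidean(high_dim[i], high_dim[j])
        d_low = euclidean(low_dim[i], low_dim[j])
        total += d_high
        weighted_error += (d_high - d_low) ** 2 / d_high
    return weighted_error / total


# Made-up 3-D observations and the 2-D layout obtained by dropping the third coordinate.
flat_points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
raised_points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)]
layout = [(0, 0), (1, 0), (0, 1), (1, 1)]

stress_flat = sammon_stress(flat_points, layout)      # points already lie in a plane
stress_raised = sammon_stress(raised_points, layout)  # one point sticks out of the plane
```

The flat configuration maps to two dimensions with zero stress. The second one cannot be flattened this way without distorting the distances to the raised point, so its stress is positive.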
Self-Organizing Maps (SOMs)
Developed by Teuvo Kohonen in early 1980s
Observations are mapped onto a two-dimensional hexagonal grid
Related to MDS and Sammon maps, but ensures better spacing
Colors are used to indicate clusters
Software: SOM_PAK (Public domain, WWW), Viscovery (Eudaptics,
Austria)
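A minimal SOM training loop, written as a sketch rather than as a description of SOM_PAK or Viscovery. For brevity it assumes a rectangular grid instead of the hexagonal grid mentioned above, a Gaussian neighbourhood, and linearly decaying learning rate and radius. The data and all names are made up.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid node whose weight vector is closest (Euclidean) to observation x."""
    return min(weights, key=lambda node: sum((wk - xk) ** 2 for wk, xk in zip(weights[node], x)))


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    rng = random.Random(seed)
    # Start each grid node at a randomly chosen observation.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1 - frac), 0.5)  # neighbourhood shrinks as well
        for x in rng.sample(data, len(data)):    # visit observations in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                # Pull the node toward x; nodes near the winner move the most.
                for k in range(len(w)):
                    w[k] += lr * influence * (x[k] - w[k])
    return weights


# Made-up observations forming two well-separated groups.
data = [(0.0, 0.1, 0.2), (0.2, 0.0, 0.1), (0.1, 0.2, 0.0),
        (9.8, 10.0, 10.1), (10.1, 9.9, 10.0), (10.0, 10.2, 9.9)]
weights = train_som(data)
placement = {obs: best_matching_unit(weights, obs) for obs in data}
```

`placement` maps each observation to a grid cell. Plotting those cells, and coloring them by cluster, gives the kind of map described above.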
Country Risk Data
Goal: Look at risks involved in investing in stock markets around the world
Source: Wall Street Journal of June 26, 1997
52 countries, 20 variables
The article clusters countries into five groups of approximately equal size
those most similar to the U.S.
other developed countries
mature and emerging markets
newly emerging markets
frontier markets
Country Risk Data (continued)
Nine clusters is a better representation of the data than five clusters
Component maps and cluster summary statistics help explain why
Numerous other applications in finance and economics
Picking Mutual Funds with SOMs
Source: Morningstar 1997 data on 500 mutual funds
Among the most successful funds, historically
Approximately 15 variables
Categories: World Stocks, International Bonds, Large & Midsize Stocks,
Small Cap Stocks, Emerging Markets, All Funds
Diversification: invest in funds that are in different clusters
Current SOM Projects
Direct Mail Response
observations -- hundreds of thousands of customers
variables -- customer history with firm, age, zip code
goal -- identify clusters of customers for direct mail promotion
Profit Opportunities in Telecommunications Worldwide
observations -- approximately 200 countries
variables -- socio-economic measures, teledensity measures
goal -- identify clusters of countries in which demand for
wireless services may be high
Flattening the Earth and SOMs: Connections
There is an art and science to each
Each is based on sophisticated mathematics
When you move from many dimensions to two dimensions, you lose
important details
On the other hand, visualization generates insights and impact
Data Mining and Knowledge Management
Two types of organizational knowledge
Explicit Knowledge
• databases
• reports
• manuals
Tacit Knowledge
• in employees’ heads
• learned from experience
• not yet codified
Data mining attempts to convert some of this tacit knowledge
into explicit knowledge
Decision Trees
Given a table of data
potential customers are the rows
independent variables and dependent variable are the columns
Decision trees are used for classification, prediction, or estimation of the dependent variable
Accuracy is typically less than 100%
One popular approach -- information theory
maximize information gain at each split
limit the number of splits
software: C4.5
Another popular approach -- statistics
software: CART, CHAID
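To make "maximize information gain at each split" concrete, the sketch below computes entropy and information gain for a single split. The counts are the yes/no totals from the widget-buyer tree in this deck (11 yes and 9 no overall, splitting into 1 yes/6 no and 10 yes/3 no). The function names are illustrative and are not taken from C4.5.

```python
import math


def entropy(yes, no):
    """Entropy in bits of a node holding `yes` positive and `no` negative cases."""
    total = yes + no
    result = 0.0
    for count in (yes, no):
        if count:  # a count of zero contributes nothing
            p = count / total
            result -= p * math.log2(p)
    return result


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    n = sum(parent)
    weighted = sum((sum(child) / n) * entropy(*child) for child in children)
    return entropy(*parent) - weighted


root = (11, 9)                  # (yes, no) before the split
by_residence = [(1, 6), (10, 3)]  # (yes, no) in each branch after the split
gain = information_gain(root, by_residence)
print(f"information gain of the first split: {gain:.3f} bits")
```

A tree builder in this style would evaluate the gain for every candidate split and keep the largest one.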
A Decision Tree for Widget Buyers
Root: 11 yes, 9 no
• Residence not NY: 1 yes, 6 no (leaf: No)
• Residence NY: 10 yes, 3 no
  • Age <= 35: 8 yes (leaf: Yes)
  • Age > 35: 2 yes, 3 no (leaf: No)
Rule 1. If residence not NY, then not a widget buyer
Rule 2. If residence NY and age > 35, then not a widget buyer
Rule 3. If residence NY and age <= 35, then a widget buyer
Accuracy = (# classified correctly) / (total #) = (6 + 3 + 8) / 20 = 0.85
Adapted from (Dhar & Stein, 1997)
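The three rules and the accuracy figure can be checked with a short sketch. The assignment of leaf counts to rules (1 yes/6 no for Rule 1, 2 yes/3 no for Rule 2, 8 yes for Rule 3) is our reading of the tree, and the function name `predict_buyer` is illustrative.

```python
def predict_buyer(residence, age):
    """Apply Rules 1-3 from the slide to one potential customer."""
    if residence != "NY":
        return False  # Rule 1
    if age > 35:
        return False  # Rule 2
    return True       # Rule 3


# (yes, no) counts at each leaf, with the prediction the rules make there.
leaves = [
    {"rule": 1, "yes": 1, "no": 6, "predict_yes": False},
    {"rule": 2, "yes": 2, "no": 3, "predict_yes": False},
    {"rule": 3, "yes": 8, "no": 0, "predict_yes": True},
]

# A leaf predicting "yes" gets its yes cases right; one predicting "no" gets its no cases right.
correct = sum(leaf["yes"] if leaf["predict_yes"] else leaf["no"] for leaf in leaves)
total = sum(leaf["yes"] + leaf["no"] for leaf in leaves)
accuracy = correct / total
print(f"{correct} of {total} classified correctly, accuracy = {accuracy:.2f}")
```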
Evolutionary Algorithms / Genetic Algorithms
Developed by John Holland in the late 1960s / early 1970s
Speed up evolution a millionfold or so on the computer
Simple, elegant, powerful idea
A Simple Genetic Algorithm
1. Start with a randomly generated population of n chromosomes
(candidate solutions to a problem)
2. Calculate the fitness f(x) of each chromosome x in the population
3. Repeat the following steps until n offspring have been created
a. Randomly select a pair of parent chromosomes from
the current population
b. Crossover (mate) the pair at a randomly chosen point to
form two offspring
c. Randomly mutate the two offspring and add the resulting
chromosomes to the population
d. Calculate the fitness of the resulting chromosomes
4. Let the n fittest chromosomes survive to the next generation
5. Go to Step 3 (repeat for 50 generations)
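The steps above can be sketched in a few lines. The bit-string encoding, the mutation rate, the population size, and the toy fitness function `count_ones` are assumptions chosen for illustration. The talk does not specify them.

```python
import random


def simple_ga(fitness, length=20, n=30, generations=50, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Step 1: random population of n bit-string chromosomes.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    best_per_generation = []
    for _ in range(generations):  # Step 5: repeat for a fixed number of generations
        offspring = []
        while len(offspring) < n:  # Step 3: create n offspring
            mother, father = rng.sample(population, 2)  # 3a: random pair of parents
            point = rng.randint(1, length - 1)          # 3b: single-point crossover
            children = (mother[:point] + father[point:],
                        father[:point] + mother[point:])
            for child in children:
                # 3c: flip each gene with a small probability.
                offspring.append([1 - g if rng.random() < mutation_rate else g
                                  for g in child])
        offspring = offspring[:n]
        # Steps 2, 3d and 4: score parents and offspring, keep the n fittest.
        population = sorted(population + offspring, key=fitness, reverse=True)[:n]
        best_per_generation.append(fitness(population[0]))
    return population[0], best_per_generation


def count_ones(chromosome):
    """Toy fitness: the more 1-bits, the fitter the chromosome."""
    return sum(chromosome)


best, history = simple_ga(count_ones)
print(f"best fitness after {len(history)} generations: {history[-1]}")
```

Because the n fittest of parents and offspring always survive, the best fitness in `history` can never decrease from one generation to the next.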
Financial Investment Example
Five sectors
1. Financial services
2. Health care
3. Utilities
4. Technology
5. Consumer
Two parents (each number is the percent invested in one sector; the third number is the percent invested in sector 3)
Parent 1: 21 24 15 10 30
Parent 2: 10 16 18 31 25
Financial Investment Example (continued)
Crossover
Parent 1: 21 24 15 10 30
Parent 2: 10 16 18 31 25
Offspring before normalization: 21 16 15 31 25 and 10 24 18 10 30
After normalization and rounding, we obtain two offspring
19 15 14 29 23 and 11 26 19 11 33
Mutation
19 15 14 29 23 becomes 19 15 23 29 14
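The normalization and mutation steps can be reproduced with a short sketch. It takes the two offspring before normalization to be (21, 16, 15, 31, 25) and (10, 24, 18, 10, 30), which is our reading of the slide. The slide does not say how rounding is forced to sum to 100, so the largest-remainder rule used here is an assumption, as is treating mutation as a swap of two sector allocations.

```python
def normalize_to_100(weights):
    """Rescale integer weights to whole percentages that sum to exactly 100."""
    total = sum(weights)
    shares = [w * 100 // total for w in weights]    # round everything down first
    remainders = [w * 100 % total for w in weights]
    shortfall = 100 - sum(shares)
    # Hand the leftover points to the entries with the largest remainders.
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:shortfall]:
        shares[i] += 1
    return shares


def swap_mutation(chromosome, i, j):
    """Return a copy with the allocations at positions i and j exchanged."""
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


child_1 = normalize_to_100([21, 16, 15, 31, 25])
child_2 = normalize_to_100([10, 24, 18, 10, 30])
mutant = swap_mutation(child_1, 2, 4)  # exchange sectors 3 and 5 (0-based positions 2 and 4)
print(child_1, child_2, mutant)
```

Swapping two entries keeps the portfolio summing to 100, so a mutated chromosome needs no further normalization.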
Crossover Illustration for Decision Trees
[Figure: two parent decision trees, Parent 1 and Parent 2. Split variables shown: Railroad, Area, Natl_prod (twice), Birth_rt, Inf_mor_rt. Split thresholds shown: 2090, 1138910, 14, 1045, 886.3, 57. Leaf classes: Low, Medium, High.]
Crossover Illustration (continued)
[Figure: the two decision trees produced by crossover, Child 1 and Child 2. Split variables shown: Railroad, Area, Birth_rt, Natl_prod (twice), Inf_mor_rt. Split thresholds shown: 2090, 1138910, 14, 1045, 886.3, 57. Leaf classes: Low, Medium, High.]
Mutation Illustration
[Figure: decision tree illustrating mutation. Split variables shown: Railroad, Natl_prod, Inf_mor_rt, Highways, Elect_prod. Split thresholds shown: 2090, 886.3, 57, 208350, 33. Leaf classes: Low, Medium, High.]
GA Applications
Financial Data Analysis
• State Street Global Advisors
• Advanced Investment Technologies
• Barclays Global Investors
• PanAgora Asset Management
• Fidelity Funds
Operations and Supply Chain Management
• General Motors
• Volvo
• Cemex
Engineering Design
• General Electric
• Boeing
Data Analysis Then and Now
Late 1980s
Linear methods
Large data sets
Few dimensions
Ask specific questions
Search for information
Late 1990s
Nonlinear methods
Massive data sets
Highly multi-dimensional
What can we infer?
Search for knowledge
Data Analysis as a Strategic Asset
Competitive advantage
Sustainable over a period of time
Examples
BT Labs
Enterprise Rent-A-Car
Dupont
Concluding Remarks
Corporate data is more plentiful than ever before
Companies are becoming more serious about mining that data
Powerful software tools are widely available
Many companies already view data analysis/data mining as a strategic
asset
Data analysis/data mining is a key area within knowledge management
Exciting opportunities exist for collaborative research
Recommended Books
Berry & Linoff, Data Mining Techniques for Marketing, Sales, and Customer Support, Wiley (1997)
Deboeck & Kohonen, Visual Explorations in Finance with SOMs, Springer (1998)
Dhar & Stein, Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall (1997)
Mitchell, An Introduction to Genetic Algorithms, MIT Press (1996)
Monmonier, How to Lie with Maps (second edition), University of Chicago Press (1996)