An Introduction to Data Mining in Institutional Research
![Page 1: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/1.jpg)
An Introduction to Data Mining in Institutional Research
Dr. Thulasi Kumar
Director of Institutional Research
University of Northern Iowa
![Page 2: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/2.jpg)
AIR/SPSS Professional Development Series
Background
Covering a variety of topics
Up-to-date information at www.airweb.org
![Page 3: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/3.jpg)
Copyright 2003-4, SPSS Inc.
Common Questions
1. Will I be able to get copies of the slides after the event?
2. Is this web seminar being taped so I or others can view it after the fact?
3. Can I ask questions during this event?
![Page 4: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/4.jpg)
Common Questions
1. Will I be able to get copies of the slides after the event? Yes
2. Is this web seminar being taped so I or others can view it after the fact? Yes
3. Can I ask questions during this event? Yes
![Page 5: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/5.jpg)
Today’s Agenda
Data Mining Overview
- History
- How it compares to other analytic techniques
Phases in the Data Mining Process
Applications of Data Mining in Institutional Research
Data Mining solutions
Question and Answer
![Page 6: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/6.jpg)
The Evolution of Data Analysis

| Evolutionary Step | Business Question | Enabling Technologies | Product Providers | Characteristics |
| --- | --- | --- | --- | --- |
| Data Collection (1960s) | "What was my total revenue in the last five years?" | Computers, tapes, disks | IBM, CDC | Retrospective, static data delivery |
| Data Access (1980s) | "What were unit sales in New England last March?" | Relational databases (RDBMS), Structured Query Language (SQL), ODBC | Oracle, Sybase, Informix, IBM, Microsoft | Retrospective, dynamic data delivery at record level |
| Data Warehousing & Decision Support (1990s) | "What were unit sales in New England last March? Drill down to Boston." | On-line analytic processing (OLAP), multidimensional databases, data warehouses | SPSS, Comshare, Arbor, Cognos, Microstrategy, NCR | Retrospective, dynamic data delivery at multiple levels |
| Data Mining (Emerging Today) | "What’s likely to happen to Boston unit sales next month? Why?" | Advanced algorithms, multiprocessor computers, massive databases | SPSS/Clementine, Lockheed, IBM, SGI, SAS, NCR, Oracle, numerous startups | Prospective, proactive information delivery |

Source: SPSS BI
![Page 7: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/7.jpg)
What is Data Mining?
The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques (The Gartner Group).
The exploration and analysis of large quantities of data in order to discover meaningful patterns and rules (Berry and Linoff).
The nontrivial extraction of implicit, previously unknown, and potentially useful information from data (Frawley, Piatetsky-Shapiro, and Matheus).
![Page 8: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/8.jpg)
Differences between Statistics and Data Mining

| Statistics | Data Mining |
| --- | --- |
| Confirmatory | Exploratory |
| Small data sets/File-based | Large data sets/Databases |
| Small number of variables | Large number of variables |
| Deductive | Inductive |
| Numeric data | Numeric and non-numeric data |
| Clean data | Data cleaning |

![Page 9: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/9.jpg)
Paradigm Shift
Traditional IR Work:
Data file => Descriptive/Regression Analysis => Tabulations/Reports
Data Mining Driven IR Work:
Database => Data Mining (Visualization, Association, Clustering, Predictive Modeling) => Immediate Actions
Historical Predictive
Source: Jing Luan, Cabrillo College, CA
![Page 10: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/10.jpg)
Data Mining is not…
OLAP
Data Warehousing
Data Visualization
SQL
Ad Hoc Queries
Reporting
![Page 11: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/11.jpg)
Data Mining Roots and Algorithms
Statistics: Distributions, mathematics, etc.
Machine Learning: Computer science, heuristics and induction algorithms
Artificial Intelligence: Emulating human intelligence
Neural Networks: Biological models, psychology and engineering
![Page 12: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/12.jpg)
Data Mining is…
Predictive Modeling
- Linear/Logistic Regression
- Neural Networks
- Decision Trees
Clustering
- Kohonen Neural Networks Clustering
- K-Means Clustering
- Nearest Neighbor Clustering
![Page 13: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/13.jpg)
Data Mining is… (cont’d)
Segmentation
- Decision Trees
- Neural Networks
- Predictive Modeling
Affinity Analysis
- Association Rule
- Sequence Generators
Example decision tree: Credit ranking (1=default)
Root: Bad 52.01% (n=168), Good 47.99% (n=155); Total 323 (100.00%)
Split on Paid Weekly/Monthly (P-value=0.0000, Chi-square=179.6665, df=1)
- Weekly pay: Bad 86.67% (n=143), Good 13.33% (n=22); Total 165 (51.08%)
  Split on Age Categorical (P-value=0.0000, Chi-square=30.1113, df=1)
  - Young (< 25); Middle (25-35): Bad 90.51% (n=143), Good 9.49% (n=15); Total 158 (48.92%)
  - Old (> 35): Bad 0.00% (n=0), Good 100.00% (n=7); Total 7 (2.17%)
- Monthly salary: Bad 15.82% (n=25), Good 84.18% (n=133); Total 158 (48.92%)
  Split on Age Categorical (P-value=0.0000, Chi-square=58.7255, df=1)
  - Young (< 25): Bad 48.98% (n=24), Good 51.02% (n=25); Total 49 (15.17%)
    Split on Social Class (P-value=0.0016, Chi-square=12.0388, df=1)
    - Management; Clerical: Bad 0.00% (n=0), Good 100.00% (n=8); Total 8 (2.48%)
    - Professional: Bad 58.54% (n=24), Good 41.46% (n=17); Total 41 (12.69%)
  - Middle (25-35); Old (> 35): Bad 0.92% (n=1), Good 99.08% (n=108); Total 109 (33.75%)
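The chi-square figures in the tree on this slide can be checked by hand. The sketch below is an illustration rather than the software's actual procedure: it assumes the first split (weekly pay versus monthly salary) is scored on the 2x2 table of Bad/Good counts. Under that assumption the likelihood-ratio chi-square comes out close to the reported 179.67, while the more familiar Pearson statistic gives roughly 162.3, which suggests the tree reports the likelihood-ratio form. The function names are hypothetical.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def likelihood_ratio_chi_square(table):
    """G = 2 * sum(observed * ln(observed / expected)); zero cells add nothing."""
    expected = expected_counts(table)
    total = 0.0
    for obs_row, exp_row in zip(table, expected):
        for observed, exp in zip(obs_row, exp_row):
            if observed > 0:
                total += observed * math.log(observed / exp)
    return 2.0 * total

def pearson_chi_square(table):
    """Pearson chi-square = sum((observed - expected)^2 / expected)."""
    expected = expected_counts(table)
    return sum(
        (observed - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for observed, exp in zip(obs_row, exp_row)
    )

# Rows: weekly pay, monthly salary. Columns: Bad, Good (counts from the slide).
pay_split = [[143, 22], [25, 133]]
g_stat = likelihood_ratio_chi_square(pay_split)
pearson = pearson_chi_square(pay_split)
print(f"likelihood-ratio chi-square: {g_stat:.2f}")
print(f"Pearson chi-square:          {pearson:.2f}")
```

A tree builder of this kind would typically compute such a statistic for every candidate predictor and split on the one with the strongest association with the target.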
![Page 14: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/14.jpg)
Phases in the DM Process: CRISP-DM
www.crisp-dm.org
• Business Understanding
• Data Understanding
• Data Preparation
• Modeling
• Evaluation
• Deployment
![Page 15: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/15.jpg)
CRISP-DM
Business Understanding: Understand the project objectives and identify the data mining problem
Data Understanding: Capture, understand, and explore your data for quality issues
Data Preparation: Clean data, merge data, derive attributes, etc.
Modeling: Select the data mining techniques and build the model
Evaluation: Evaluate the results and approve models
Deployment: Put models into practice; plan monitoring and maintenance
![Page 16: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/16.jpg)
Data at the heart of the Predictive Enterprise
Behavioral data: Orders, Transactions, Payment history, Usage history
Descriptive data: Attributes, Characteristics, Self-declared info, (Geo)demographics
Attitudinal data: Opinions, Preferences, Needs, Desires
Interaction data: Offers, Results, Context, Click streams, Notes
Source: SPSS BI
![Page 17: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/17.jpg)
Data Mining Applications
Institutional Effectiveness
Which students make greatest use of institutional services?
What courses provide high full-time equivalent students (FTES) and allow better use of space?
What are the patterns in course taking?
What courses tend to be taken as a group?
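The course-grouping questions above are the kind of thing association analysis addresses. As a rough illustration only, the sketch below scores course pairs by support, confidence, and lift. The enrollment records, course codes, and the `pair_rules` helper are all hypothetical, and a real study would use a full association-rule algorithm on actual registration data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical enrollment records: one set of course codes per student.
enrollments = [
    {"MATH101", "PHYS101", "ENGL101"},
    {"MATH101", "PHYS101"},
    {"MATH101", "CHEM101"},
    {"ENGL101", "HIST101"},
    {"MATH101", "PHYS101", "CHEM101"},
]

def pair_rules(transactions, min_support=0.4):
    """Return {(lhs, rhs): (support, confidence, lift)} for frequent course pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of students taking both courses
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]     # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # versus chance
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules

rules = pair_rules(enrollments)
for (lhs, rhs), (support, confidence, lift) in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 indicates the two courses appear together more often than their individual popularity would predict.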
![Page 18: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/18.jpg)
Data Mining Applications (cont’d)
Enrollment Management
Who are our best students?
Where do our students come from?
Who is most likely to return for another semester?
Who is most likely to fail or drop out?
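Questions such as who is most likely to drop out are predictive modeling problems. The sketch below fits a small logistic regression by plain gradient descent to show the idea. Everything in it is an assumption made for illustration: the student records are invented, the two predictors (first-term GPA and credit hours attempted) are arbitrary choices, and the helper names are hypothetical. A production model would be built and validated in a statistics or data mining package.

```python
import math

# Hypothetical records: (first-term GPA, credit hours attempted, dropped_out)
students = [
    (3.8, 15, 0), (3.5, 16, 0), (3.2, 14, 0), (3.0, 15, 0),
    (2.9, 12, 0), (2.2, 9, 0), (3.1, 9, 1), (2.4, 12, 1),
    (2.1, 9, 1), (1.8, 12, 1), (1.5, 6, 1), (2.6, 9, 1),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def feature_stats(rows):
    """Mean and standard deviation of each predictor column."""
    stats = []
    for col in zip(*[row[:-1] for row in rows]):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        stats.append((mean, std))
    return stats

def scale(features, stats):
    return [(f - mean) / std for f, (mean, std) in zip(features, stats)]

def fit_logistic(rows, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the logistic log-loss."""
    stats = feature_stats(rows)
    weights = [0.0] * len(stats)
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for *features, label in rows:
            x = scale(features, stats)
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = sigmoid(z) - label
            grad_b += error
            for k, xi in enumerate(x):
                grad_w[k] += error * xi
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias, stats

def dropout_risk(model, gpa, credits):
    """Estimated probability of dropping out for one student."""
    weights, bias, stats = model
    x = scale((gpa, credits), stats)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

model = fit_logistic(students)
high_risk = dropout_risk(model, 1.6, 6)
low_risk = dropout_risk(model, 3.7, 16)
print(f"low GPA, light load:  {high_risk:.2f}")
print(f"high GPA, full load:  {low_risk:.2f}")
```

The fitted probabilities can be used to rank students for early intervention, though a model this small is only a demonstration of the mechanics.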
![Page 19: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/19.jpg)
Data Mining Applications (cont’d)
Marketing
Who is most likely to respond to our new campaign?
Which type of marketing/recruiting works best?
Where should we focus our advertising and recruiting?
![Page 20: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/20.jpg)
Data Mining Applications (cont’d)
Alumni
What are the different types/groups of alumni?
Who is likely to pledge, for how much, and when?
Where and on whom should we focus our fundraising drives?
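Finding types or groups of alumni is a clustering task. The following is a minimal K-means sketch under stated assumptions: the alumni records are invented, the two attributes (years since graduation and number of gifts in the last five years) are illustrative, and the starting centroids are chosen deterministically so the run is repeatable. In practice the attributes would be standardized first so that no single one dominates the distance.

```python
import math

# Hypothetical alumni: (years since graduation, gifts in the last five years)
alumni = [
    (2, 0), (3, 1), (4, 0), (5, 1),
    (15, 3), (18, 4), (20, 5), (16, 4),
    (35, 5), (40, 4), (38, 5), (42, 5),
]

def assign(points, centroids):
    """Index of the nearest centroid for each point (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]

def k_means(points, k, max_iterations=50):
    # Deterministic start: take k points spaced evenly through the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # assignments have stopped changing
        centroids = updated
    return centroids, assign(points, centroids)

centroids, labels = k_means(alumni, k=3)
for center, size in zip(centroids, (labels.count(c) for c in range(3))):
    print(f"cluster centre {center} with {size} alumni")
```

Each resulting cluster can then be profiled by an analyst, for example as recent graduates who rarely give versus long-standing regular donors.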
![Page 21: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/21.jpg)
Data Mining Applications in Institutional Research

| Task | Technique | Example applications |
| --- | --- | --- |
| Categorize your students | Classification | Cafeteria meal planning; student housing planning |
| Predict student retention/alumni donations | Neural Nets/Regression | Identify high-risk students; estimate/predict alumni contribution; predict new student application rate |
| Group similar students | Segmentation | Course planning; academic scheduling; identify student preferences for clubs and social organizations |
| Identify courses that are taken together | Association | Faculty teaching load estimation; course planning; academic scheduling |
| Find patterns and trends over time | Sequence | Predict alumni donation; predict potential demand for library resources |

![Page 22: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/22.jpg)
Data Mining with Clementine
Industry-leading workbench for data mining
Comprehensive range of tools for all stages of the data mining process
Pioneered visual approach for maximum productivity
Multiple modeling techniques to predict future events
![Page 23: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/23.jpg)
Summary
Successful data mining strategy involves:
- Well defined goals, project objectives, and questions
- Sufficient and relevant data
- Careful consideration and selection of software and analysts (tech and domain expert)
- Support from senior administrators (VPs and the President)
DM provides a set of tools, techniques and a standardized process.
Need domain expertise in institutional research to build, test, validate, and deploy models.
DM does not build models automatically. Analysts do.
![Page 24: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/24.jpg)
Next Steps: Data Mining Resources
http://www.kdnuggets.com/
http://www.dmhe.org/
http://www.uni.edu/instrsch/dm/index.html
http://www.spss.com/data_mining/
![Page 25: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/25.jpg)
Questions?
![Page 26: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/26.jpg)
Next Steps: Webcasts and White Papers
December 12th, 2pm Moving Beyond the Basics: Data Mining for Institutional Research
Information at www.spss.com/airseries3
Visit www.spss.com/airseries2 to download a copy of the SPSS Data Mining Tips Guide
![Page 27: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/27.jpg)
For more information
www.spss.com
www.airweb.org
Complete the evaluation form and tell us what you thought of today’s webcast
![Page 28: An Introduction to Data Mining in Institutional Research](https://reader033.vdocuments.us/reader033/viewer/2022042813/54b418be4a79599e1f8b4710/html5/thumbnails/28.jpg)
THANK YOU!
Survey also at: http://www.airweb.org/page.asp?page=217&meetingid=0010