predict student behavior to increase retention

Online seminar presented by Jing Luan, Ph.D., Cabrillo College, and Bob Valencic, SPSS Inc. August 22, 2002

Page 1: Predict student behavior to increase retention

Predict student behavior to increase retention

Online seminar presented by:
Jing Luan, Ph.D., Cabrillo College
Bob Valencic, SPSS Inc.

August 22, 2002

Page 2: Predict student behavior to increase retention

Seminar agenda

Business issues in higher education
How to predict student behavior and increase retention?
Data mining concepts
Data mining methods
Case studies
Getting started on data mining
Q&A

Page 3: Predict student behavior to increase retention

Higher education business issues

Institutional effectiveness
  Student learning outcome assessment
Enrollment management
  Achieving optimum attraction, retention and persistence goals
Marketing
  Increasing competition for students

Alumni

How can data mining help?

Page 4: Predict student behavior to increase retention

Institutional effectiveness

Which students make greatest use of institutional services?

What courses provide high full-time equivalent students (FTES) and allow better use of space?

What are the patterns in course taking? What courses tend to be taken as a group?

Getting to know your students

Page 5: Predict student behavior to increase retention

Enrollment management

Who are our best students?
Where do our students come from?
Who is most likely to return for another semester?
Who is most likely to fail or drop out?

Helping your students succeed

Page 6: Predict student behavior to increase retention

Marketing

Who is most likely to respond to our new campaign?

Which type of marketing/recruiting works best?

Where should we focus our advertising and recruiting?

Making the best use of tight budgets

Page 7: Predict student behavior to increase retention

Alumni

What are the different types/groups of alumni?

Who is likely to pledge, for how much, and when?

Where and on whom should we focus our fundraising drives?

Continuing the relationship

Page 8: Predict student behavior to increase retention

Our focus today: Predicting student behavior

Acquiring new students
Retaining students
Increasing persistence to and beyond graduation

Page 9: Predict student behavior to increase retention

Data mining defined

“The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques.”

The Gartner Group

Page 10: Predict student behavior to increase retention

Another definition

“Simply put, data mining is used to discover patterns and relationships in your data in order to help you make better business decisions.”

Robert Small, Two Crows

Page 11: Predict student behavior to increase retention

CRISP-DM

Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment

Page 12: Predict student behavior to increase retention

Two types of data mining

Supervised
  Purpose: For classification and estimation
  Algorithms: C5.0, C&RT, Neural Network, etc.

Unsupervised
  Purpose: For clustering and association
  Algorithms: Kohonen, Kmeans, TwoStep, GRI, etc.

Page 13: Predict student behavior to increase retention

Algorithm vs. model

Algorithm: A technical term describing a specific mathematically driven data mining function

Model: A set of representative rules, behaviors or characteristics against which data are analyzed to find similarities

Page 14: Predict student behavior to increase retention

Neural networks

Often used as a synonym for machine learning, though it is one technique among many
Identifies complex relations
Somewhat difficult to interpret
Long computation times

Figure: network diagram showing an input layer, a hidden layer, and an output.
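To make the input layer, hidden layer, and output concrete, here is a minimal sketch of a single forward pass through such a network. The student record, the weights, and the function names are all hypothetical; a real tool would learn the weights from data rather than have them typed in.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Pass one record through input -> hidden -> output layers."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)))
        for weights in hidden_weights
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

# Hypothetical student record: [units attempted / 20, GPA / 4.0]
student = [0.6, 0.75]
# Hypothetical weights; a real network would learn these from data.
hidden_weights = [[0.8, -0.4], [-0.3, 0.9]]
output_weights = [1.2, -0.7]

score = forward(student, hidden_weights, output_weights)
print(f"Score between 0 and 1: {score:.3f}")
```

The hidden layer is what lets the model pick up complex relations, and it is also why the result is harder to explain than a decision tree rule.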

Page 15: Predict student behavior to increase retention

Figure: decision tree for credit ranking (1 = default). Each node lists Bad and Good cases as % (n); the node total is its share of all 323 cases.

Credit ranking: Bad 52.01% (168), Good 47.99% (155), total 100.00% (323)
  Split on Paid Weekly/Monthly (P-value=0.0000, Chi-square=179.6665, df=1)
  Weekly pay: Bad 86.67% (143), Good 13.33% (22), total 51.08% (165)
    Split on Age Categorical (P-value=0.0000, Chi-square=30.1113, df=1)
    Young (< 25); Middle (25-35): Bad 90.51% (143), Good 9.49% (15), total 48.92% (158)
    Old (> 35): Bad 0.00% (0), Good 100.00% (7), total 2.17% (7)
  Monthly salary: Bad 15.82% (25), Good 84.18% (133), total 48.92% (158)
    Split on Age Categorical (P-value=0.0000, Chi-square=58.7255, df=1)
    Young (< 25): Bad 48.98% (24), Good 51.02% (25), total 15.17% (49)
      Split on Social Class (P-value=0.0016, Chi-square=12.0388, df=1)
      Management; Clerical: Bad 0.00% (0), Good 100.00% (8), total 2.48% (8)
      Professional: Bad 58.54% (24), Good 41.46% (17), total 12.69% (41)
    Middle (25-35); Old (> 35): Bad 0.92% (1), Good 99.08% (108), total 33.75% (109)
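The chi-square value on the first split (weekly pay: Bad 143, Good 22; monthly salary: Bad 25, Good 133) can be checked by hand. The sketch below computes a likelihood-ratio chi-square from those four counts. That the tool used the likelihood-ratio form rather than Pearson's is an assumption inferred from the numbers, not something the slides state; the function name is illustrative.

```python
import math

def likelihood_ratio_chi_square(table):
    """G statistic for a contingency table given as rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:
                continue  # a zero cell contributes nothing to the sum
            expected = row_totals[i] * col_totals[j] / grand_total
            g += observed * math.log(observed / expected)
    return 2.0 * g

# Rows: weekly pay, monthly salary. Columns: Bad, Good (counts from the tree).
pay_split = [[143, 22], [25, 133]]
g_stat = likelihood_ratio_chi_square(pay_split)
print(f"G = {g_stat:.4f}")
```

The result should land close to the 179.6665 shown for the Paid Weekly/Monthly split. A large value like this is why the tree chose pay frequency as its first split.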

Decision trees

Easy to interpret
- income < $40K
  - job > 5 yrs then yes
  - job < 5 yrs then no
- income > $40K
  - high debt then no
  - low debt then yes
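Because a decision tree reduces to nested if/then rules, the example above translates almost line for line into code. This is a sketch only: the function name is hypothetical, and the handling of the boundary cases (income of exactly $40K, exactly 5 years on the job) is an assumption, since the slide leaves them open.

```python
def approve(income, years_on_job, high_debt):
    """Return True for "yes" and False for "no" under the slide's rules.

    Assumption: the slide does not cover the boundaries, so an income of
    exactly 40,000 follows the higher-income branch and exactly 5 years
    on the job counts as "no".
    """
    if income < 40_000:
        return years_on_job > 5   # lower income: decided by job tenure
    return not high_debt          # higher income: decided by debt level

print(approve(30_000, years_on_job=6, high_debt=True))
print(approve(50_000, years_on_job=1, high_debt=True))
```

Each path from the top of the tree to a leaf is one readable rule, which is the sense in which trees are easy to interpret.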

Page 16: Predict student behavior to increase retention

Apriori

Discovers events that occur together
Often called ‘market basket’ analysis
Example: What groups of classes do certain students take in the same semester that may impact facilities and course scheduling?
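The course-scheduling question can be illustrated by counting which pairs of classes show up together in the same semester. The sketch below is a simplified pair count with a support threshold, not the full Apriori algorithm (which also prunes candidates and builds larger item sets). The enrollments and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support):
    """Return course pairs whose share of baskets is at least min_support."""
    pair_counts = Counter()
    for courses in baskets:
        # sorted() gives each pair one canonical order; set() drops repeats
        for pair in combinations(sorted(set(courses)), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }

# Hypothetical enrollments: one list of courses per student per semester.
semester_enrollments = [
    ["MATH 4", "PHYS 2", "ENGL 1A"],
    ["MATH 4", "PHYS 2"],
    ["ENGL 1A", "HIST 17"],
    ["MATH 4", "PHYS 2", "HIST 17"],
]

pairs = frequent_pairs(semester_enrollments, min_support=0.5)
for pair, support in sorted(pairs.items()):
    print(pair, f"support={support:.2f}")
```

Pairs with high support are candidates for coordinated scheduling, for example avoiding time conflicts between the two classes.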

Page 17: Predict student behavior to increase retention

Kohonen network

Seeks to describe dataset in terms of natural clusters of cases

Example – identify similar groups of students
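How a Kohonen network finds natural clusters can be sketched with a very small one-dimensional map. Everything below is illustrative: the student records (unit load and GPA, both scaled to 0-1), the two-unit map, the learning schedule, and the function names are assumptions, and production tools use larger grids and more careful tuning.

```python
def nearest_unit(units, row):
    """Index of the unit whose weights are closest to the record."""
    def squared_distance(weights):
        return sum((w - x) ** 2 for w, x in zip(weights, row))
    return min(range(len(units)), key=lambda u: squared_distance(units[u]))

def train_kohonen(data, n_units=2, epochs=20, start_rate=0.5):
    """Train a tiny one-dimensional Kohonen map (n_units >= 2)."""
    dims = len(data[0])
    lows = [min(row[d] for row in data) for d in range(dims)]
    highs = [max(row[d] for row in data) for d in range(dims)]
    # Deterministic start: spread the units evenly between the data extremes.
    units = [
        [lo + (hi - lo) * u / (n_units - 1) for lo, hi in zip(lows, highs)]
        for u in range(n_units)
    ]
    for epoch in range(epochs):
        rate = start_rate * (1 - epoch / epochs)   # learning rate decays
        radius = 1 if epoch < epochs // 2 else 0   # neighborhood shrinks
        for row in data:
            winner = nearest_unit(units, row)
            for u, weights in enumerate(units):
                gap = abs(u - winner)
                if gap > radius:
                    continue
                pull = rate if gap == 0 else rate / 2  # neighbors move less
                for d in range(dims):
                    weights[d] += pull * (row[d] - weights[d])
    return units

# Hypothetical students: [unit load scaled 0-1, GPA scaled 0-1]
students = [[0.2, 0.3], [0.25, 0.35], [0.3, 0.25],
            [0.8, 0.9], [0.85, 0.8], [0.9, 0.85]]
units = train_kohonen(students)
clusters = [nearest_unit(units, s) for s in students]
print(clusters)
```

No outcome column is supplied, which is what makes this unsupervised: the groups come from similarity among the records alone.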

Page 18: Predict student behavior to increase retention

Predicting student persistence

Case study using Clementine®

Page 19: Predict student behavior to increase retention

Examining data

Page 20: Predict student behavior to increase retention

Clustering using TwoStep

Page 21: Predict student behavior to increase retention

Building models for persistence in streams

A node is being executed (notice the red arrows denoting the flow of data).

Page 22: Predict student behavior to increase retention

Seeing the work of neural thinking

Graphic display showing an ANN learning the data.

Page 23: Predict student behavior to increase retention

Results of neural node

These are the outputs of the Neural Networks. Overall accuracy and significance of features (left). Predicted number of policies using fresh data vs. known data (above).

Page 24: Predict student behavior to increase retention

Examining C5.0

The control panel of the C5.0 node (Expert)

Page 25: Predict student behavior to increase retention

Results of C5.0 node

View the prediction by individual records (PNXT vs. $C-PNXT).

View the overall prediction accuracy.

Page 26: Predict student behavior to increase retention

Comparing C&RT and C5.0

Use the Analysis node to examine the difference in accuracy for C&RT and C5.0.

Page 27: Predict student behavior to increase retention

Which one is better: C&RT or C5.0?

C5.0 has an accuracy rate of 66.3% and C&RT 63.7%.

They agree 72% of the time.
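Accuracy and agreement are two different measurements, and both are simple to compute once each model's predictions sit next to the known outcomes. The sketch below uses ten hypothetical students; the lists and function names are illustrative and do not reproduce the seminar's 66.3%, 63.7%, or 72% figures.

```python
def accuracy(predicted, actual):
    """Share of records where the prediction matches the known outcome."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def agreement(first, second):
    """Share of records where two models make the same prediction."""
    return sum(a == b for a, b in zip(first, second)) / len(first)

# Hypothetical outcomes for ten students (1 = persisted, 0 = did not).
actual     = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
c50_preds  = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
cart_preds = [1, 0, 0, 0, 0, 1, 1, 1, 1, 1]

print(f"C5.0 accuracy: {accuracy(c50_preds, actual):.0%}")
print(f"C&RT accuracy: {accuracy(cart_preds, actual):.0%}")
print(f"Agreement:     {agreement(c50_preds, cart_preds):.0%}")
```

Two models can agree with each other more often than either one is correct, because they may make the same mistakes on the same records.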

Page 28: Predict student behavior to increase retention

Visualizing Results

Page 29: Predict student behavior to increase retention

Visualizing Results

Page 30: Predict student behavior to increase retention

Scoring new data

Moment of truth. The most powerful feature of data mining is to use learned “rules” to predict (score) using fresh data for business purposes. Shown here is the change of dataset to a fresh data set unseen by Clementine before now.
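The two stages, learning a rule from known outcomes and then scoring fresh records, can be sketched with the simplest possible model: a single cutoff on one field. The data, field meaning, and function names below are hypothetical, and a one-cutoff rule stands in for the far richer models Clementine builds.

```python
def learn_threshold(records):
    """Pick the unit-load cutoff that best separates known outcomes.

    records: list of (units, persisted) pairs with persisted True/False.
    The learned rule is "predict persistence when units >= cutoff".
    """
    def hits(cutoff):
        return sum((units >= cutoff) == persisted for units, persisted in records)

    candidates = sorted({units for units, _ in records})
    return max(candidates, key=hits)

def score(cutoff, new_units):
    """Apply the learned rule to records whose outcomes are unknown."""
    return [units >= cutoff for units in new_units]

# Hypothetical known data: (units enrolled, returned next term?)
known = [(3, False), (6, False), (7, True), (9, True), (12, True), (15, True)]
cutoff = learn_threshold(known)

# Hypothetical fresh data the rule has never seen.
fresh = [4, 8, 13]
predictions = score(cutoff, fresh)
print(cutoff, predictions)
```

Keeping the learning step and the scoring step separate mirrors the workflow on the slide: the model is built once on known data, then reused on each new data set.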

Page 31: Predict student behavior to increase retention

Using models to score new data

Model results
Scored results

Page 32: Predict student behavior to increase retention

Additional case study

How best to identify future transfer students so the college can groom them?

What can a community college do to increase transfer rates?

Using decision tree models, the top rule for successful transfers was: taking more than 12 units, taking fewer than 5 non-transfer courses, and taking at least one math course.

Predicting the behavior of transfer students
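Once a rule like the one above is known, flagging likely transfer students is a filter over student records. The sketch below is illustrative only: the record fields and function name are hypothetical, and the slide does not say whether the 12 units are per term or cumulative, so the code simply compares whatever unit count is supplied.

```python
def matches_top_rule(units, non_transfer_courses, math_courses):
    """True when a student fits the top transfer rule quoted on the slide."""
    return units > 12 and non_transfer_courses < 5 and math_courses >= 1

# Hypothetical student records; the field names are illustrative only.
students = [
    {"id": "A", "units": 15, "non_transfer_courses": 2, "math_courses": 1},
    {"id": "B", "units": 12, "non_transfer_courses": 0, "math_courses": 2},
    {"id": "C", "units": 18, "non_transfer_courses": 6, "math_courses": 1},
    {"id": "D", "units": 14, "non_transfer_courses": 1, "math_courses": 0},
]

likely_transfers = [
    s["id"] for s in students
    if matches_top_rule(s["units"], s["non_transfer_courses"], s["math_courses"])
]
print(likely_transfers)
```

Students who miss only one condition, such as student D with no math course, are natural targets for counseling.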

Page 33: Predict student behavior to increase retention

Getting started

Company stability and customer feedback
User interface
Scalability
Server/Client
Modeling capacities
Learning curve
Join a listserv, such as CLUG
Cost

Evaluate data mining software

Page 34: Predict student behavior to increase retention

Getting started

Determine business needs
Determine technology infrastructure and management support
Identify mining area and business problems
Determine data source(s)
Invite an expert to jump start
Pilot test mining results
CRISP-DM and Real-time data mining, Knowledge Discovery in Databases (KDD)

Develop a data mining plan for your institution

Page 35: Predict student behavior to increase retention

Want to Learn More?

Full training course descriptions at: www.spss.com/training

Contact us or one of our other data mining experts by calling 800-543-5815.

Check out the Knowledge Management/Data Mining Discussion Group: http://www.kdl1.com/kmdm

Obtain the book, “Knowledge Management – Building A Competitive Advantage in Higher Education,” published by Jossey-Bass: http://josseybass.com/cda/product/0,,0787962910,00.html

Bob Valencic [email protected]
Jing Luan [email protected]

Page 36: Predict student behavior to increase retention

Thank you!

Predict student behavior to increase retention

2nd Annual Public Sector Roadshow
October 15 in Washington, D.C.

www.spss.com/psroadshow