cqads data analysis workshops...cqads data analysis workshops at cqads, we know that your ideal...

5
CQADS Data Analysis Workshops At CQADS, we know that your ideal decision-making environment re- quires an in-depth understanding of your data. To help you make the most out of your decisions, we offer a variety of work- shops targeted to all learning levels, on topics ranging from data collection to data analysis and visualization. Selected workshops are offered at Carleton University on a regular ba- sis. On-site sessions can also be ar- ranged. Customizable sessions (topic- and level-specific) containing multiple workshops are available, as are spot- lights on specific methods. For pricing information, availability, and other inquiries, please contact [email protected]. For the workshop schedule, or to join our mailing list, please visit cqads.carleton.ca/workshops. Information about Carleton’s collabo- rative Data Science Masters program can be found at carleton.ca/cuids. INTRODUCTORY DATA SCIENCE Introduction to Analytics – Preparing and Visualizing Data Dead Simple Data Science – Data Analysis with Everyday Tools (coming soon) Elements of Data Visualization – Principles and Execution (coming soon) DATA SCIENCE AND DATA ANALYSIS Mining for Information Gold – Data Science Concepts The Mechanics of Uncertainty – Bayesian Analysis (coming soon) Spotlight on Big Data – The MapReduce Paradigm (coming soon) SPECIAL TOPICS IN MACHINE LEARNING Classification Methods I and II Clustering Algorithms I and II Text Mining I and II (coming soon) Recommender Systems and Social Network Analysis (coming soon) STATISTICAL ANALYSIS WITH R Visual and Descriptive Statistics (coming soon) Univariate Analysis (coming soon) Multivariate Analysis (coming soon) Time Series Analysis (coming soon) HANDS-ON DATA DISCOVERY Data Processing and Visualization with R Advanced Data Visualization with R (coming soon) Basic Data Crunching with Python (coming soon) Data Collection and Web Scraping (coming soon) Adventures in Distributed Computing with Spark and HO (coming soon) The Data Scientist’s Toolbox Putting It All Together – Analyzing Datasets, From A to Z (coming soon) , Centre for Quantitative Analysis and Decision Support - Carleton University Herzberg Laboratories - Colonel By Drive - Ottawa ON, KS B6 [email protected]

Upload: others

Post on 28-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CQADS Data Analysis Workshops...CQADS Data Analysis Workshops At CQADS, we know that your ideal decision-making environment re-quires an in-depth understanding of your data. To help

CQADS Data Analysis WorkshopsAt CQADS, we know that your idealdecision-making environment re-quires an in-depth understanding ofyour data.

To help you make the most out of yourdecisions, we offer a variety of work-shops targeted to all learning levels,on topics ranging from data collectionto data analysis and visualization.

Selected workshops are offered atCarleton University on a regular ba-sis. On-site sessions can also be ar-ranged.

Customizable sessions (topic- andlevel-specific) containing multipleworkshops are available, as are spot-lights on specific methods.

For pricing information, availability,and other inquiries, please [email protected].

For the workshop schedule, or to joinour mailing list, please visitcqads.carleton.ca/workshops.

Information about Carleton’s collabo-rative Data Science Masters programcan be found at carleton.ca/cuids.

INTRODUCTORY DATA SCIENCEIntroduction to Analytics – Preparing and Visualizing DataDead Simple Data Science – Data Analysis with Everyday Tools (coming soon)

Elements of Data Visualization – Principles and Execution (coming soon)

DATA SCIENCE AND DATA ANALYSISMining for Information Gold – Data Science ConceptsThe Mechanics of Uncertainty – Bayesian Analysis (coming soon)

Spotlight on Big Data – The MapReduce Paradigm (coming soon)

SPECIAL TOPICS IN MACHINE LEARNINGClassification Methods I and IIClustering Algorithms I and IIText Mining I and II (coming soon)

Recommender Systems and Social Network Analysis (coming soon)

STATISTICAL ANALYSIS WITH RVisual and Descriptive Statistics (coming soon)

Univariate Analysis (coming soon)

Multivariate Analysis (coming soon)

Time Series Analysis (coming soon)

HANDS-ON DATA DISCOVERYData Processing and Visualization with RAdvanced Data Visualization with R (coming soon)

Basic Data Crunching with Python (coming soon)

Data Collection and Web Scraping (coming soon)

Adventures in Distributed Computing with Spark and H2O (coming soon)

The Data Scientist’s ToolboxPutting It All Together – Analyzing Datasets, From A to Z (coming soon)

2

3

45

2

4,5

Centre for Quantitative Analysis and Decision Support − Carleton University4332 Herzberg Laboratories − 1125 Colonel By Drive − Ottawa ON, K1S 5B6

[email protected]

Page 2: CQADS Data Analysis Workshops...CQADS Data Analysis Workshops At CQADS, we know that your ideal decision-making environment re-quires an in-depth understanding of your data. To help

Introduction to AnalyticsPreparing and Visualizing Data

DESCRIPTIONAn introduction to the notions that must be mastered prior to, andafter, embarking on data analysis, along with a discussion of com-mon challenges and pitfalls.Participants will be introduced to various methods of data visual-ization, and to some intrinsic limitations of data, as well as someeasily avoidable pre- and post-analysis mistakes.

INTENDED AUDIENCEIndividuals with an interest in quantitative methods who wish tograsp the larger and broader “behind-the-scenes” issues that affectanalysts’ decisions and results (this version of the course containsvery little technical details and is not intended for experienceddata analysts).

PRE-REQUISITESThis is not a workshop on formulas, computation or any specificanalytical methods. Mathematical and/or statistical sophisticationis not required.“Dencity” by Fathom – Previous page: “Health and Wealth of Nations, 2012” by the Gapminder Foundation

SUGGESTED DURATION1 day

OUTLINE1. Data Science Universals

1.1 Ethics in the Data Science Context1.2 Guiding Principles of Analytics1.3 Analytics Workflow

2. Data Clean-up and Validation2.1 Data Quality2.2 Missing Data2.3 Anomalous Observations

3. Visualization: Seeing Data in a Different Light3.1 Pre-Analysis Use vs. Post-Analysis Use3.2 Simple Graphical Methods3.3 Useful Representations of Complex Data3.4 Design Suggestions3.5 Data Visualization Hall of Fame

4. Odds and Ends4.1 Biases, Fallacies, and Pitfalls4.2 What’s the Big Deal About Big Data?4.3 What We Didn’t Talk About

Data Processing and Visualization with RDESCRIPTIONAn introductory experience in data manipulation and visualization using R.This workshop is a companion to Introduction to Analytics.

GOALBy the end of the course, participants will be able to process some simpledatasets using R.

INTENDED AUDIENCEIndividuals who have some conceptual understanding of data science, andwho would like to gain some practical experience using R.

IT REQUIREMENTSParticipants must provide their own computers with wireless connectivity.

PRE-REQUISITESParticipation in the workshop Introduc-tion to Analytics is not required, but fa-miliarity with its contents is expected.Some knowledge of the data scienceconcepts introduced in Mining for Infor-mation Gold could prove useful.

SUGGESTED DURATION1/2 day

OUTLINE1. R Basics2. Data Manipulation3. Data Cleaning4. Simple Data Visualization

Page 2 of 5 CQADS Data Analysis Workshops – Catalogue

Page 3: CQADS Data Analysis Workshops...CQADS Data Analysis Workshops At CQADS, we know that your ideal decision-making environment re-quires an in-depth understanding of your data. To help

Mining for Information GoldData Science Concepts

DESCRIPTIONAn introduction to the fundamental data science concepts involvedin data analysis, with a detailed discussion of three common ana-lytical concepts – classification, clustering and association rules.

The application of these concepts will be illustrated through somesimple toy examples, along with a discussion of common chal-lenges and pitfalls.

GOALBy the end of the course, participants will be able to appreciatethe functionality of different types of data science concepts andrecognize opportunities for their application when presented withreal-world data.

INTENDED AUDIENCEIndividuals who wish to understand the functionality and capabili-ties offered by different data science concepts and methods, evenif they won’t be the ones implementing them.

PRE-REQUISITESThis workshop requires very little mathematical or computer pro-gramming knowledge. Some experience with quantitative ideas isassumed.

SUGGESTED DURATION1 dayPersonal file

OUTLINE1. Fundamental Notions

1.1 Introduction1.2 Data Mining vs Real Mining: An Analogy1.3 The Structure of Data1.4 The Most Basic Technique of All1.5 Learning - Supervised vs. Unsupervised

2. Association Rules2.1 Useful Applications2.2 A Simple Algorithm (brute force)2.3 Comments and Notes2.4 Case Study: Danish Medical Dataset

3. Classification3.1 Useful Applications3.2 A Simple Algorithm (decision tree)3.3 Comments and Notes3.4 Case Study: Deepwater Horizon Oil Spill

4. Clustering4.1 Useful Applications4.2 A Simple Algorithm (k-means)4.3 Comments and Notes4.4 Case Study: OKCupid

5. Data Mining Issues and Challenges5.1 Overfitting5.2 Unrepresentative Data5.3 Curse of Dimensionality5.4 Technical Limitations for Big Data5.5 The Wrong Tool for the Job5.6 Non-Transferable Assumptions

The Data Scientist’s ToolboxGetting Technical

DESCRIPTIONAn introductory experience in data science using R. This workshop is acompanion to Mining for Information Gold.

GOALBy the end of the course, participants will be able to analyze some simpledatasets using R.

INTENDED AUDIENCEIndividuals who have some conceptual understanding of data science, andwho would like to gain some practical experience using R.

IT REQUIREMENTSParticipants must provide their own computers with wireless connectivity.

PRE-REQUISITESParticipation in the workshops Miningfor Information Gold and Data Process-ing and Visualization with R is not re-quired, but familiarity with their contentsis assumed.

SUGGESTED DURATION1/2 day

OUTLINE1. Association Rules2. Decision Trees3. k-Means4. Overfitting5. The Curse of Dimensionality

CQADS Data Analysis Workshops – Catalogue Page 3 of 5

Page 4: CQADS Data Analysis Workshops...CQADS Data Analysis Workshops At CQADS, we know that your ideal decision-making environment re-quires an in-depth understanding of your data. To help

Special Topics in Machine LearningClassification Methods I and II

DESCRIPTIONA continuation of the introduction of data science concepts startedin Mining for Information Gold, with a detailed discussion of foursupplemental classification methods/concepts commonly used bydata scientists.The application of these concepts will be illustrated through somesimple toy examples, along with a discussion of common chal-lenges and pitfalls.

GOALBy the end of the course, participants will be able to further appre-ciate the functionality of increasingly sophisticated classificationmethods and recognize opportunities for their application whenpresented with real-world data.

INTENDED AUDIENCEIndividuals who wish to understand the functionality and capabili-ties offered by different classification methods, even if they won’tbe the ones implementing them.

This workshop is also designed for individuals who want to increasetheir data science competency by learning different data sciencetechniques.“Interactive NBA Shot Charts” by Todd W. Schneider

PRE-REQUISITESFamiliarity with the data science concepts intro-duced in Mining for Information Gold is assumed.

Some experience with mathematical optimizationwould be beneficial, but is not necessary.

SUGGESTED DURATION1 day each

OUTLINEClassification Methods I

1. Artificial Neural Networks2. Support Vector Machines3. Naıve Bayes Classifiers4. Logistic Regression

Classification Methods II5. Bayesian Trees6. Conditional Inference Trees7. Nearest Neighbour Classifiers8. Ensemble Learning

The Data Scientist’s ToolboxClassification with R

DESCRIPTIONA continuation of the introductory experience extracting patterns and knowl-edge from real datasets using R. This workshop is a companion to Classifi-cation Methods I and II.

GOALBy the end of the course, participants will be able to solve classificationproblems using R and increasingly sophisticated data science methods.

INTENDED AUDIENCEIndividuals who have some conceptual understanding of classification meth-ods, and who would like to gain some practical experience using R.

IT REQUIREMENTSParticipants must provide their own computers with wireless connectivity.

PRE-REQUISITESFamiliarity with the appropriate methodsdiscussed in Classification Methods I orII is required.

SUGGESTED DURATION1/2 day each

OUTLINEClassification Methods I

1. Artificial Neural Networks2. Support Vector Machines3. Naıve Bayes Classifiers4. Logistic Regression

Classification Methods II5. Bayesian Trees6. Conditional Inference Trees7. Nearest Neighbour Classifiers8. Ensemble Learning

Page 4 of 5 CQADS Data Analysis Workshops – Catalogue

Page 5: CQADS Data Analysis Workshops...CQADS Data Analysis Workshops At CQADS, we know that your ideal decision-making environment re-quires an in-depth understanding of your data. To help

Special Topics in Machine LearningClustering Algorithms I and II

DESCRIPTIONA continuation of the introduction of data science concepts startedin Mining for Information Gold, with a detailed discussion of foursupplemental clustering methods/concepts commonly used bydata scientists.The application of these concepts will be illustrated through somesimple toy examples, along with a discussion of common chal-lenges and pitfalls.

GOALBy the end of the course, participants will be able to further ap-preciate the functionality of increasingly sophisticated clusteringalgorithms and recognize opportunities for their application whenpresented with real-world data.

INTENDED AUDIENCEIndividuals who wish to understand the functionality and capabili-ties offered by different clustering algorithms, even if they won’t bethe ones implementing them.

This workshop is also designed for individuals who want to increasetheir data science competency by learning different data sciencetechniques.Personal file

PRE-REQUISITESFamiliarity with the data science concepts intro-duced in Mining for Information Gold is assumed.

Some experience with mathematical optimizationwould be beneficial, but is not necessary.

SUGGESTED DURATION1 day each

OUTLINEClustering Algorithms I

1. Hierarchical Clustering2. Density-Based Algorithms3. Spectral Clustering4. Clustering Validation

Clustering Algorithms II5. Expectation-Maximization Clustering6. Affinity Propagation Clustering7. Fuzzy Clustering8. Clustering Streaming Data

The Data Scientist’s ToolboxClustering with R

DESCRIPTIONA continuation of the introductory experience extracting patterns and knowl-edge from real datasets using R. This workshop is a companion to ClusteringAlgorithms I and II.

GOALBy the end of the course, participants will be able to solve clustering problemsusing R and increasingly sophisticated data science methods.

INTENDED AUDIENCEIndividuals who have some conceptual understanding of clustering algo-rithms, and who would like to gain some practical experience using R.

IT REQUIREMENTSParticipants must provide their own computers with wireless connectivity.

PRE-REQUISITESFamiliarity with the appropriate methodsdiscussed in Clustering Algorithms I or IIis required.

SUGGESTED DURATION1/2 day each

OUTLINEClustering Algorithms I

1. Hierarchical Clustering2. Density-Based Algorithms3. Spectral Clustering4. Clustering Validation

Clustering Algorithms II5. EM Clustering6. Affinity Propagation Clustering7. Fuzzy Clustering8. Clustering Streaming Data

CQADS Data Analysis Workshops – Catalogue Page 5 of 5