data science dianna xu bryn mawr college 1. the course 300-level computer science elective cs majors...
Data Science
Dianna Xu
Bryn Mawr College
The Course
• 300-level Computer Science elective
• CS majors and minors
• Pre-reqs: CS1, CS2 (Data Structures), Discrete Math and Linear Algebra
• Unstructured data and exploratory data analysis
• Iterative processes that require programming skills and knowledge of algorithms
200-level Data Visualization Course
• Taught Spring 2014 at Haverford
  – Stats basics, linear regression
  – Clustering
  – Baby network analysis – PageRank
  – Visualization with D3
• 50% overlap of students
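As a reminder of what the PageRank unit covers, here is a minimal power-iteration sketch. It is not taken from the course materials: the function name `pagerank`, the damping factor of 0.85, and the four-page toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict mapping node -> list of out-links."""
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}  # start from the uniform distribution

    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread evenly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}

        # Each node passes its damped rank equally to the nodes it links to.
        for u in nodes:
            outs = links.get(u, [])
            for v in outs:
                new[v] += damping * rank[u] / len(outs)

        change = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if change < tol:  # stop once the ranks have settled
            break
    return rank


# Hypothetical toy graph: page D links out but nothing links to it.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The ranks form a probability distribution over pages, so they sum to 1; a page with no in-links keeps only the teleportation share `(1 - damping) / n`.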
Topics
• Statistical methods (2-3 weeks)
  – basics, regression analysis, Bayesian methods
• Machine learning background (2-3 weeks)
  – multivariate regression and logistic regression
• Dimensionality reduction (3 weeks)
  – PCA, SVD, Kernel PCA, other non-linear methods
• Topological data analysis (2-3 weeks)
  – manifold learning, intro to TDA
• Network analysis (2-3 weeks)
  – collaborative filtering, community detection
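To show how the PCA and SVD items in the dimensionality reduction unit relate, here is a small NumPy sketch that computes PCA from the SVD of the centered data matrix. It is an illustration only, not course code: the helper `pca_svd` and the synthetic data set are assumptions.

```python
import numpy as np


def pca_svd(X, n_components):
    """PCA via SVD of the centered data matrix (rows are samples)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Squared singular values give the variance along each direction.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    # Project the centered data onto the leading directions.
    scores = X_centered @ components.T
    return scores, components, ratio[:n_components]


rng = np.random.default_rng(0)
# Synthetic data (hypothetical): 200 points lying close to a line in 3-D.
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, -1.0]]) + 0.05 * rng.normal(size=(200, 3))

scores, components, ratio = pca_svd(X, n_components=2)
print("explained variance ratio:", ratio)
print("first principal direction:", components[0])
```

Because the data were generated along a single direction, nearly all of the variance should land on the first component, and that component should align (up to sign) with the generating direction.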
Project-Oriented
• Students will team up for a semester-long project on data analysis for local "data clients"
  – Faculty members who have "real life" data sets and research questions
  – Library, registrar and institutional research
Data Sets
• Digital Du Chemin – repertory of polyphonic songs from 16th-century France
• Dark Reactions – chemical experiments with associated reactants and results
• Maine athletes
• Anil's bio data?
Machine Learning Modules
• Focused on the process of doing (good) machine learning, i.e.
  – (step 1) Pose a problem in the language of machine learning
  – (step 2) Gather data
  – (step 3) Choose a potential method for solving the problem
  – (step 4) Set up an experiment to properly evaluate your method
  – (step 5) Evaluate experiment and possibly go to step 2 or 3
• Possible toolsets:
  – iPython notebook
  – Scikit-learn
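The five steps above can be sketched with scikit-learn roughly as follows. This is an assumed example rather than one of the course modules: the iris data set, the logistic regression model, and the 75/25 split are placeholders chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: pose the problem. Here: classify a flower's species from four
# measurements (a supervised, multi-class classification task).

# Step 2: gather data. The bundled iris data set stands in for real data.
X, y = load_iris(return_X_y=True)

# Step 3: choose a candidate method. Scaling plus logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Step 4: set up the experiment. Hold out a test set that is never used
# while choosing or tuning the model, and cross-validate on the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Step 5: evaluate. If the cross-validation scores look poor, go back to
# step 2 or 3 before touching the held-out test set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print("cross-validation accuracy per fold:", cv_scores)
print("held-out test accuracy:", test_accuracy)
```

Keeping the test set out of the loop until the end is what makes the final accuracy an honest estimate of how the method generalizes.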
iPython Notebook Modules
http://occam.olin.edu/sites/default/files/DataScienceMaterials/machine_learning_lecture_1/Machine%20Learning%20Lecture%201.html
http://occam.olin.edu/sites/default/files/DataScienceMaterials/machine_learning_lecture_2/Machine%20Learning%20Lecture%202.html
Open-source, Python-based package for machine learning.
Principal strength is a consistent API and enforcement of "good" machine learning practices
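Assuming the package described here is scikit-learn (the toolset named earlier in the deck), the "consistent API" point can be illustrated as below: three unrelated classifiers are trained and scored through the same `fit`, `predict` and `score` methods. The choice of classifiers and data set is an assumption for the sake of the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three different model families, all exposing the same estimator interface.
estimators = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

accuracies = {}
for name, estimator in estimators.items():
    estimator.fit(X_train, y_train)            # same training call
    predictions = estimator.predict(X_test)    # same prediction call
    accuracies[name] = estimator.score(X_test, y_test)  # same scoring call
    print(f"{name}: {accuracies[name]:.3f}")
```

Because the loop body never depends on which model it holds, swapping in a different method (step 3 of the process) needs only a one-line change.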