Data Science Incubator. This morning context: a data science environment data science studio...

Data Science Incubator

Upload: mervyn-moore

Post on 28-Dec-2015



TRANSCRIPT

Page 1: Data Science Incubator. This morning Context: A Data Science Environment Data Science Studio Pilot Incubator Program Discussion

Data Science Incubator

Page 2:

This morning

• Context: A Data Science Environment
• Data Science Studio
• Pilot Incubator Program
• Discussion

Page 3:

A 5-year, $37.8 million cross-institutional collaboration

Page 4:

Establish a virtuous cycle

• 6 working groups, each with 3-6 faculty from each institution

Page 5:

Pilot Program Organizers

• Andrew Whitaker, Research Scientist
• Dan Halperin, Director of Research, Scalable Data Analytics
• Jake Vanderplas, Director of Research, Physical Sciences
• Bill Howe, Associate Director

Page 6:

The Data Science Studio

• An open collaborative research space
• A resident data science team
– Permanent staff of ~5 data scientists – applied research and development
– ~15-20 data science fellows (research scientists, visitors, postdocs, students)
• How to Engage:
– Drop-in open workspace
– Studio “Office Hours”
– Incubation Program

…plus seminars, sponsored lunches, workshops, bootcamps, joint proposals...

Page 7:

6th floor Physics Astronomy Building

A partnership among …

• Provost
• UW Libraries
• Physics, Astronomy, Arts & Sciences
• eScience Institute

Page 8:

Estimated Timeline:
• Design Phase Jan-June
• Construction June – Sep
• Target: October 1, 2014

Page 9:

Incubator Program Overview

• Goal: Create watercooler opportunities and scale our efforts by co-locating collaborations from different fields in the studio

• Protocol: ~1-page proposals for 1-quarter, on-site data science collaborations with us

• What we're looking for: Projects where fruitful collaboration is possible, with potential for significant impact, and that have sustained engagement

• This meeting: Pilot program for Spring Quarter to inform full launch Fall 2014.


http://data.uw.edu/incubator

Page 10:

Spring Incubator Pilot Program Logistics

• Applications due online 3/10
• Each proposal identifies a Project Lead (PL)
– The person doing the work, not the thesis advisor
• Incubator participants join the studio 2 days/week
– Days decided collectively by participants and team
• Pilot program operates out of Sieg 326
• Milestones at 3, 6, 9 weeks
– blog posts + demo, visualization, IPython notebook, dataset, GitHub repo, preliminary results, etc.
• Networking/poster session during 9th week

Page 11:

Areas of interest

• scalable data management and analytics
• learning and predictive models
• interactive visualization
• parallel algorithms
• code review, publishing, and reproducibility
• online teaching materials, tutorials

Page 12:

A Live SeaFlow Dashboard

[Figure: SeaFlow instrument schematic. Labels: Laser, Microscope Objective, Pinhole Lens, Nozzle, d1, d2, FSC (Forward scatter), Orange fluo, Red fluo]

Francois Ribalet

Jarred Swalwell

Ginger Armbrust

Page 13:

SeaFlow Ambitions

• SeaFlow is a huge success! NSF wants one on every R/V

Page 14:

SeaFlow Ambitions

• Underway biology should enable adaptive sampling - a sort of “holy grail”
• How can remote collaborators participate?
• What about citizen science?

“Wait! We saw a population change between P3 and P4!”

“Let’s go back!”

Page 15:

A Live SeaFlow Dashboard

Is the instrument working?

Where is the ship? What is it doing?

What phytoplankton populations are we seeing?

Page 16:

The AscotDB Project


• A multi-year collaboration between UW Astronomy and UW Computer Science researchers and students

• ASCOT = the AStronomy COllaborative Toolkit

• Goal: Provide an interactive and collaborative environment for analysis of astronomical data.

Page 17:

The AscotDB Project


• Interacting browser-based widgets for generating database queries & associated visualization.

• The resulting visualizations can be shared with collaborators through a browser URL
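One common way to make a visualization shareable through a URL is to serialize the widget settings into the query string. The sketch below only illustrates that general idea and is not how AscotDB is implemented; the base URL, the function names, and the parameter names (`table`, `ra_min`, `ra_max`) are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL, chosen for illustration only (not an AscotDB address).
BASE_URL = "https://example.org/view"


def state_to_url(state):
    """Serialize a flat dict of widget settings into a shareable URL."""
    # Sorting the items makes the URL deterministic for a given state.
    return f"{BASE_URL}?{urlencode(sorted(state.items()))}"


def url_to_state(url):
    """Recover the widget settings (as strings) from a shared URL."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical widget state: a table name and a coordinate range.
state = {"table": "stars", "ra_min": "10.0", "ra_max": "20.0"}
shared_url = state_to_url(state)
restored = url_to_state(shared_url)
print(shared_url)
print(restored == state)
```

A collaborator who opens the URL would get the same settings back, so the same query and view could in principle be regenerated on their side.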

Page 18:

Pilot cohort desiderata

• good clustering
• alignment with sponsor and program goals
• new directions, new questions
• availability, engagement, commitment
• “do only what we can only do together”
– with apologies to Dijkstra
• clarity and shovel-readiness
• capacity for measurable outcomes

Page 19:

Spring Schedule

• 3/10: Proposals due
• 3/14: Follow-up requests
• 3/21: Pilot participants notified
• 3/31: Spring program start date
• 4/21: First milestone
• 5/12: Second milestone
• 6/2: Third milestone
• 6/6: Poster/networking event
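The three milestone dates above line up with the 3-, 6-, and 9-week marks counted from the 3/31 start date. A minimal sketch of that arithmetic, assuming the year is 2014 (the schedule gives only month and day; the year is inferred from the Fall 2014 launch mentioned earlier), with hypothetical names throughout:

```python
from datetime import date, timedelta

# Assumption: the pilot ran in spring 2014; the slides list only month/day.
PROGRAM_START = date(2014, 3, 31)
MILESTONE_WEEKS = (3, 6, 9)


def milestone_dates(start, weeks):
    """Return the date that falls w whole weeks after start, for each w."""
    return [start + timedelta(weeks=w) for w in weeks]


milestones = milestone_dates(PROGRAM_START, MILESTONE_WEEKS)
for week, day in zip(MILESTONE_WEEKS, milestones):
    print(f"Week {week} milestone: {day.month}/{day.day}")
```

Under that assumption the loop prints 4/21, 5/12, and 6/2, matching the first, second, and third milestones in the schedule.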
