sfbayacm acm data science camp 2015 10 24


Upload: greg-makowski

Post on 21-Apr-2017


TRANSCRIPT

Page 1: SFbayACM ACM Data Science Camp 2015 10 24

• 8:15 arrive, network, register for tutorial and camp
• 8:50-10:50 Tutorial: Introduction to R for Machine Learning
• 11:00 Camp Kickoff
• Sponsors: ACM SIGKDD, PayPal, UCSC
• 11:25 Keynote: Spark for Data Science, Big & Small
• 12:25 Propose Sessions
   Ask for a “show of hands for interest” → Room Size
• 1:15 Lunch, post Session Matrix
• 2:00 Session 1 : (50 min for session, 10 min break)
• 5:00 Session 4
• 6:00 Session Summary

Page 2: SFbayACM ACM Data Science Camp 2015 10 24

◦ 8:50 – 10:50am by
   ▪ Joseph Rickert (Program Manager, Microsoft)
   ▪ Robert Horton (Data Scientist, Microsoft)
◦ Rapid introduction to the R language – in depth enough to build machine learning models
   ▪ RandomForest, kernlab, caret
◦ Exploratory analysis, visualize, clustering, classification
◦ How to find R help and additional resources
◦ Big data capabilities of Microsoft’s RRE distribution of R

Page 3: SFbayACM ACM Data Science Camp 2015 10 24
Page 4: SFbayACM ACM Data Science Camp 2015 10 24

Morning Tutorial Starts Now

Page 5: SFbayACM ACM Data Science Camp 2015 10 24

An ACM SF Bay Area Professional Chapter Event Saturday, October 24, 2015

SFbayACM.org/event/silicon-valley-data-science-camp-2015

WiFi: conference Password: (none)

Twitter Tag #DSCAMP

Page 6: SFbayACM ACM Data Science Camp 2015 10 24

Association for Computing Machinery (ACM)

◦ Principal technical, educational, scientific society for computing professionals world-wide
   ▪ Chapter representing SF Bay Area since 1957
◦ Membership/volunteer led, local dues only $20/yr
◦ Members get discounts with publishers, conferences
◦ Produces monthly free meetings
   ▪ 3rd Wed on General Computing topics
   ▪ 4th Mon on Data Science
◦ Details at www.SFbayACM.org
   ▪ Suggest, Volunteer, Donate: [email protected]

Page 7: SFbayACM ACM Data Science Camp 2015 10 24

• 10 Year Anniversary of Data Science SIG
• Monday night, November 30 at eBay, San Jose
   ◦ Online Controlled Experiments: Lessons from Running A/B/n Tests for 12 Years
   ◦ Ronny Kohavi, Distinguished Engineer & General Manager, Analysis & Experimentation, Microsoft

Page 8: SFbayACM ACM Data Science Camp 2015 10 24

• Scala Professional Development Seminar
   ◦ Date: Sat, Nov 7, 8am-5pm
   ◦ Location: PayPal Town Hall (here)
   ◦ Speaker: Cay Horstmann, Computer Science, San Jose State University
   ◦ Author of “Scala for the Impatient”
   ◦ Interactive crash course into this language
   ◦ Bring your laptop (w/ Scala pre-loaded)
   ◦ Presentation / lab format

Q) What is Scala?
A) Object Oriented Meets Functional
http://www.scala-lang.org/

Page 9: SFbayACM ACM Data Science Camp 2015 10 24

• How many have been to an un-conference?
• Goals and context of the un-conference
   ◦ Informal
   ◦ Share enthusiasm, curiosity, knowledge, questions
   ◦ Participate, make it happen!
   ◦ Share responsibility (i.e. leave session room after 50 min)
   ◦ Encourage session note takers to blog & share at end
   ◦ http://www.campsite.org/list/733
   ◦ Respect others – questions & brainstorms are “safe”
   ◦ Have FUN!

Twitter Tag #DSCAMP

Page 10: SFbayACM ACM Data Science Camp 2015 10 24

◦ Greg Makowski – DS SIG & Conference Chair
◦ Bill Bruns – SF Bay ACM Chair
◦ Stephen McInerney – DS SIG
◦ Steve Lazarus – web registration
   ▪ Seeking replacement before retirement
◦ Greg Weinstein – general
◦ Liana Ye – volunteers, food, registration
◦ Liz Fraley – ACM Treasurer

[Organizer photos: Bill, Liana, Greg W, Liz, Steve, Greg M, Stephen]

Page 11: SFbayACM ACM Data Science Camp 2015 10 24


Page 12: SFbayACM ACM Data Science Camp 2015 10 24

• SIGKDD: ACM SIG on Knowledge Discovery and Data Mining.
   ◦ Home of data miners, data scientists, and analytics professionals
• KDD: the premier conference of the field
   ◦ Research Track, Industry/Government Track, Industry Practice Expo, Tutorials, Workshops, Invited Talks, Panels, KDD Cups

Page 13: SFbayACM ACM Data Science Camp 2015 10 24

Expect 2,000 – 2,500 attendees
KDD Cup competition has been going since 2009

Page 14: SFbayACM ACM Data Science Camp 2015 10 24

• General Chairs
   ◦ Balaji Krishnapuram (IBM)
   ◦ Mohak Shah (Bosch, USA)
• Program Committee Chairs
   ◦ Alex Smola (CMU)
   ◦ Charu Aggarwal (IBM)
• Industry Chairs
   ◦ Rajeev Rastogi (Amazon)
   ◦ Dou Shen (Baidu)

Page 15: SFbayACM ACM Data Science Camp 2015 10 24

Shipeng Yu – Associate GC
David Hazel, Derek Young – Web Chairs
Ron Bekkerman – Social Network Chair
Romer Rosales – Proceedings Chair
Hanghang Tong, Vishy Vishwanathan – Tutorials Chairs
Andrei Broder – Panels Chair
Quoc Le, Zhi-Hua Zhou – Workshops Chairs
Shou-De Lin – KDD Cup co-chair
Gabor Melli, Ankur Teredesai – Media & Publicity Chairs
Ying Li – Treasurer
Joaquin Quinonero Candela, Olivier Chapelle – Local Arrangements Chairs
Sofus Macskassy – Student Travel Awards Chair

Page 16: SFbayACM ACM Data Science Camp 2015 10 24

2505 Augustine Drive, Santa Clara, CA 95054 (near Freeway 101 off Great American Parkway)

http://www.ucsc-extension.edu/

◦ UCSC Extension offers professional technology courses for software, hardware, IT and Web professionals. Over 100 courses are available for enrollment each quarter.

◦ The certificate program in “Database and Data Analytics” is the fastest-growing certificate in UCSC Extension. Courses cover big data, data science and database applications.

Annual Sponsor

Page 17: SFbayACM ACM Data Science Camp 2015 10 24

Thank PayPal for use of the location
Soren Archibald

www.KDnuggets.com
A primary hub for data mining
Co-marketing sponsor
Gregory Piatetsky-Shapiro

Page 18: SFbayACM ACM Data Science Camp 2015 10 24

STRONG FOUNDATION STRONG MOMENTUM

169 Million Active Customer Accounts

$8 Billion Revenue

4 Billion Payment Transactions

+19 Million Active Customer Accounts Gained in 2014

+17% Total Revenue Growth YoY

+24% Payment Transactions Growth YoY

$235 Billion Total Payment Volume

+25% Total Payment Volume Growth YoY

Page 19: SFbayACM ACM Data Science Camp 2015 10 24

© 2014 PayPal Inc. All rights reserved. Confidential and proprietary.

KEY ENABLER OF OUR BUSINESS

SUPPORTS THE PAYPAL BRAND PROMISE

MAKES PAYPAL UNIQUE

Invest in Growth & Innovation

Improve Experience & Increase Revenue Simultaneously

Lowest Loss Rates

Secure

Customer Champion

Simple

Onboard Underserved Merchants

New Markets, Multiple Funding Types

Enroll Users Easily

Ongoing Innovation

Page 20: SFbayACM ACM Data Science Camp 2015 10 24


Strong Foundation

Strong Front Door

11.5 MILLION PAYMENTS processed daily by PayPal

Next-level encryption on every PayPal transaction

PayPal never shares financial information with merchants

PayPal always verifies a person’s identity for payments

24/7 data analytics combined with human oversight to accurately and quickly spot suspicious activity

Constant innovation to advance our machine learning/data mining techniques

Seller and buyer protection offered for eligible transactions

Security & Fraud Services

Consistently ranked among the top in consumer trust & security


[Chart: Consumers Trust PayPal to Help Protect Their Information. Percentage of consumers who trust these companies to protect their financial data and private information such as passwords or birthday. Series: Financial Information, Consumer Privacy. Source: Javelin Strategy & Research, Gang of Five: Apple, Google, Amazon, Facebook, and PayPal-eBay: Threat of the Mobile Wallet Disruptors, 2013.]

Industry Engagement

Founding member of the FIDO alliance

PayPal chairs the DMARC initiative to reduce phishing attacks against all Internet users

PayPal has been doing tokenization for 15+ years, securely storing customers’ financial information in the cloud.

Page 21: SFbayACM ACM Data Science Camp 2015 10 24

• Joseph Bradley is a Spark Committer working on MLlib at Databricks
• Ph.D. in Machine Learning from Carnegie Mellon University in 2013
• Spark allows fast, iterative analysis on laptop & cluster
• Spark DataFrames allow data manipulation through an API inspired by R & Python pandas
• ML Pipelines facilitate ML workflows and model tuning
• SparkR provides an API for R users to work with distributed data
• Initial PMML support to export models to other tools
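
The pipeline idea mentioned in the keynote summary (chain several stages, fit them in order, then tune a parameter by refitting the whole chain) can be illustrated with a small plain-Python sketch. This is not Spark's API: the class names Standardize, ThresholdClassifier and Pipeline and the helper tune_threshold are hypothetical, chosen only to show the pattern on a toy one-dimensional data set.

```python
from statistics import mean, pstdev


class Standardize:
    """Stage that learns mean and standard deviation, then rescales values."""

    def fit(self, xs, ys=None):
        self.mu = mean(xs)
        self.sigma = pstdev(xs) or 1.0  # fall back to 1.0 for constant input
        return self

    def transform(self, xs):
        return [(x - self.mu) / self.sigma for x in xs]


class ThresholdClassifier:
    """Final stage: predicts 1 when a value exceeds a fixed threshold."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def fit(self, xs, ys=None):
        return self  # nothing to learn in this toy model

    def transform(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]


class Pipeline:
    """Fits each stage in order, feeding it the previous stage's output."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, xs, ys=None):
        for stage in self.stages:
            xs = stage.fit(xs, ys).transform(xs)
        return self

    def transform(self, xs):
        for stage in self.stages:
            xs = stage.transform(xs)
        return xs


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def tune_threshold(xs, ys, candidates):
    """Tiny grid search: refit the whole pipeline per candidate, keep the best.

    Scored on the training data for brevity; real tuning would use held-out
    data or cross-validation.
    """
    scored = []
    for t in candidates:
        model = Pipeline([Standardize(), ThresholdClassifier(t)]).fit(xs, ys)
        scored.append((accuracy(model.transform(xs), ys), t))
    best_score, best_t = max(scored)
    return best_t, best_score


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]
    print(tune_threshold(xs, ys, [-1.0, 0.0, 1.0]))
```

Treating the whole chain as one object means the tuning loop refits the preprocessing together with the model, which is the main convenience a pipeline abstraction offers.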

Page 22: SFbayACM ACM Data Science Camp 2015 10 24

Keynote Starts Now

Page 23: SFbayACM ACM Data Science Camp 2015 10 24


Page 24: SFbayACM ACM Data Science Camp 2015 10 24

WiFi: conference Password: (none)

Page 25: SFbayACM ACM Data Science Camp 2015 10 24

[Venue map labels]
Town Square A: main auditorium, largest sessions, summary session
Town Square C
Coffee, Food, Sponsors
Bathrooms
Entrance
Registration, Join ACM
Courtyard: eat lunch
Fireside A
Fireside B
Fireside C
Fireside D
Powwow
Talk Soup
Stairs

WiFi: conference Password: (none) www.SFbayACM.org

Page 26: SFbayACM ACM Data Science Camp 2015 10 24

WiFi: conference Password: (none) www.SFbayACM.org

Page 27: SFbayACM ACM Data Science Camp 2015 10 24

• Write a topic on a sheet of paper
   ◦ Facilitator's name
• 60 seconds per suggestion!
   ◦ Ask for people to show hands for interest, count
   ◦ Ask for a time keeper (50 minutes for a session)
   ◦ Ask for a blogger, note taker or person to report
   ◦ http://www.campsite.org/list/733
• Based on the amount of interest, pick a session location and one of the 4 time frames
• Pick what to attend per session:
   ◦ 2:00 3:00 4:00 5:00
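
The placement steps above (count hands, then pick a room and one of the four time frames) could be sketched roughly as follows. This is only an illustration under stated assumptions: the room names come from the venue map in this deck, but their size ordering after Town Square A, the function names slot_times and build_matrix, and the rule "most popular topics fill the largest room first" are all hypothetical, not the organizers' actual procedure.

```python
from datetime import datetime, timedelta

# Hypothetical room list, ordered largest to smallest. Only "Town Square A is
# the largest" is stated in the deck; the rest of the ordering is assumed.
ROOMS = ["Town Square A", "Town Square C", "Fireside A", "Fireside B"]

SESSION_MINUTES = 50
BREAK_MINUTES = 10
NUM_SLOTS = 4


def slot_times(first_start="14:00"):
    """Return (start, end) strings per slot: 50 min session, 10 min break."""
    start = datetime.strptime(first_start, "%H:%M")
    slots = []
    for _ in range(NUM_SLOTS):
        end = start + timedelta(minutes=SESSION_MINUTES)
        slots.append((start.strftime("%H:%M"), end.strftime("%H:%M")))
        start = end + timedelta(minutes=BREAK_MINUTES)
    return slots


def build_matrix(proposals, rooms=ROOMS):
    """Place proposals by descending show-of-hands count.

    proposals is a list of (topic, hands) pairs. The most popular topics fill
    the largest room across all time slots first, then the next room, and so
    on. Returns a dict mapping (slot_start, room) to topic.
    """
    times = [start for start, _ in slot_times()]
    ranked = sorted(proposals, key=lambda p: p[1], reverse=True)
    cells = [(t, room) for room in rooms for t in times]
    if len(ranked) > len(cells):
        raise ValueError("more proposals than room/time cells")
    return {cell: topic for cell, (topic, _) in zip(cells, ranked)}


if __name__ == "__main__":
    # Made-up proposals and hand counts, for illustration only.
    demo = [("Deep learning", 40), ("Spark MLlib", 55), ("R tips", 12)]
    for (start, room), topic in build_matrix(demo).items():
        print(start, room, topic)
```

A real session matrix would also need to avoid scheduling two popular topics against each other, which this sketch does not attempt.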

WiFi: conference Password: (none)

Twitter Tag #DSCAMP

Page 28: SFbayACM ACM Data Science Camp 2015 10 24

Session Proposals Start Now

Page 29: SFbayACM ACM Data Science Camp 2015 10 24

Concurrent Sessions 1-3 for the Camp

Page 30: SFbayACM ACM Data Science Camp 2015 10 24

Concurrent Sessions 4-6 for the Camp