microsoft technologies for data science 201601

Post on 22-Jan-2018


Microsoft Technologies for Data Science

Mark Tabladillo, Ph.D.

Senior Data Scientist

LogicBlox/Predictix

Networking

Interactive

http://www.bizjournals.com/atlanta/subscriber-only/2015/07/31/top-tech-employers-2015.html

Name                    Atlanta   Georgia     Total
McKesson Corporation      3,455     3,525    36,868
Verizon Wireless          3,525     4,839        NA
Lockheed Martin           5,800     5,800    24,000
Cox Enterprises Inc.      7,484     7,685    50,000
AT&T Inc.                16,794    21,084   250,790

Terms Definition

Data Science

Machine Learning

Data Mining

Applied Statistics

the automated or semi-automated process of discovering patterns in data

Applied scientific method
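The definition above (automated discovery of patterns in data) can be illustrated with a minimal sketch. It is not taken from the slides: the basket data, the function name `frequent_pairs`, and the `min_support` threshold are all assumptions made up for illustration. It simply counts which pairs of items co-occur often.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that co-occur in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair has a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

patterns = frequent_pairs(transactions, min_support=3)
print(patterns)
```

Lowering `min_support` surfaces weaker patterns; real data mining tools add measures such as confidence and lift on top of raw counts.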

http://www.kdnuggets.com/polls/2015/analytics-data-mining-data-science-software-used.html

http://products.office.com/en-us/excel

http://www.microsoft.com/en-us/server-cloud/products/sql-server/

http://pytools.codeplex.com/

http://azure.microsoft.com/en-us/services/hdinsight/

http://www.revolutionanalytics.com/

Technology Choices

SQL SERVER ANALYSIS SERVICES    Enterprise; Business Intelligence
EXCEL ADD-IN FOR SSAS           Office 365; Office 2013 or Higher x64
SEMANTIC SEARCH                 Enterprise; Business Intelligence; Standard; Web; Express with Advanced Services
MICROSOFT AZURE ML              Free (Size Limited); Paid (Web Service): Experiment + Query
F#                              Open Source

http://download.microsoft.com/download/F/C/2/FC21C981-4351-4434-A78A-3384CA7515BF/SQL_Server_2016_Deeper_Insights_Across_Data_White_Paper.pdf

[Diagram labels: SS, SQL, AS, NoSQL]

Data mining add-in for business analysts

• Ease of use

• Rich data mining

• Scalable

Semantic Search components (diagram labels):

• Input columns: Varchar, NVarchar
• Input documents: Office, PDF
• iFilters
• Full-Text Keyword Index ("FTI")
• Semantic Database
• Semantic Key Phrase Index, or Tag Index ("TI")
• Semantic Document Similarity Index ("DSI")
• Rowset Output with Scores
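The idea of a document similarity index that returns rows with scores can be sketched in a few lines. This is an assumption-laden illustration, not SQL Server's actual algorithm: it uses plain bag-of-words cosine similarity, and the documents and the function names `term_vector`, `cosine_similarity`, and `similar_documents` are hypothetical.

```python
import math
from collections import Counter

def term_vector(text):
    """Lowercase bag-of-words term counts for one document."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similar_documents(docs, source_id):
    """Rank every other document by similarity to docs[source_id]."""
    source = term_vector(docs[source_id])
    scored = [
        (doc_id, cosine_similarity(source, term_vector(text)))
        for doc_id, text in docs.items()
        if doc_id != source_id
    ]
    # Highest score first, like a rowset ordered by score.
    return sorted(scored, key=lambda row: row[1], reverse=True)

# Hypothetical documents, invented for illustration only.
docs = {
    "a": "sql server data mining",
    "b": "data mining with sql server analysis services",
    "c": "functional programming in f sharp",
}
ranked = similar_documents(docs, "a")
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

A production index would precompute these vectors and weight key phrases rather than raw words, which is roughly the role the tag index and similarity index play in the diagram.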

Simplified Chinese

British English

Portuguese

Chinese (Hong Kong SAR, PRC)

Spanish

Chinese (Singapore)

Chinese (Macau SAR)

Time in Seconds vs. Number of Documents

(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)

http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf

http://download.microsoft.com/download/3/B/9/3B9FBA69-8AAD-4707-830F-6C70A545C389/Introducing_Azure_Machine_Learning.pdf

http://datacamp.com

https://github.com/jakevdp/sklearn_pycon2015

http://azure.microsoft.com/en-us/services/machine-learning/

                          Mutable            Immutable
Classic Open Source       Java               Scala
.NET (Now Open Source)    C#, C++, VB.NET    F#
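The mutable versus immutable distinction in the table above can be shown with a short sketch. Python is not one of the languages on the slide and is used here only for illustration; the class names `MutablePoint` and `ImmutablePoint` are hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, replace

@dataclass
class MutablePoint:
    x: float
    y: float

@dataclass(frozen=True)
class ImmutablePoint:
    x: float
    y: float

# Mutable style: the existing object is changed in place.
p = MutablePoint(1.0, 2.0)
p.x = 10.0

# Immutable style: assignment is rejected...
q = ImmutablePoint(1.0, 2.0)
try:
    q.x = 10.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# ...so an "update" builds a new value and leaves the original untouched.
q2 = replace(q, x=10.0)
print(p, q, q2, mutated)
```

Immutable values are easier to share safely across parallel computations, which is one reason functional languages such as F# make them the default.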

Conference: http://www.kdd.org/

http://www.kdnuggets.com/2015/09/free-data-science-books.html

https://channel9.msdn.com/Blogs/Windows-Azure

https://mva.microsoft.com/

http://blogs.technet.com/b/machinelearning/

http://social.msdn.microsoft.com/forums/azure/en-US/home?forum=MachineLearning

http://sqlserverdatamining.com

http://marktab.net

http://curah.microsoft.com/342704/azure-machine-learning-videos-february-2015
