Apache MADlib (Incubating)

Oct 2016

User Survey Results


Received ~40 responses from 27 different companies


Summary (1)

• ~50% of respondents have 1 year or less of MADlib use

• Fraud detection is the most common use case

• Regression (various), clustering and random forest are the most commonly used MADlib algorithms

• Gradient boosting is the most commonly requested new algorithm


Summary (2)

• Users prefer new algorithms over improvements to existing algorithms by a 2:1 margin

• Improved documentation/examples and better performance are the biggest concerns

• The most common other tools used by respondents are R, Spark and Python (and associated libraries)


Q1


Q2


Q3


Q4 - Top Use Cases


Q4 - Other Use Cases


Q4 - Use Cases

Stemmed, stop words removed


Q5 - Frequently Used Algorithms


Q6 - Top Requested Features

*Note that there is an R interface called PivotalR: https://cran.r-project.org/web/packages/PivotalR/


Q6 - Other Requested Features



Q6 - Requested Features

All responses, stemmed, stop words removed


Q7 - Main Concerns


Q7 - Main Concerns

All responses, stemmed, stop words removed


Q8 - Other Tools Used

+Several others...