Data Science, Data & Dashboards Design

Data Science, Data & Dashboards Koo Ping Shung [email protected]/[email protected]

Upload: koo-ping-shung

Post on 13-Dec-2014

Category:

Data & Analytics


DESCRIPTION

An overview of data science questions, types of data, and some guidelines for designing dashboards.

TRANSCRIPT

Page 1: Data Science, Data & Dashboards Design

Data Science, Data & Dashboards

Koo Ping Shung

[email protected]/[email protected]

Page 2: Data Science, Data & Dashboards Design

Acknowledgement

Materials for these slides are adapted from:

Week 3 materials of Coursera’s Data Scientist Toolbox module (https://www.coursera.org/course/datascitoolbox).

Information Dashboard Design (2006) by Stephen Few.

Page 3: Data Science, Data & Dashboards Design

Data Science Questions

Descriptive
Exploratory
Inferential
Predictive
Causal
Mechanistic

Page 4: Data Science, Data & Dashboards Design

Data Science Questions (I)

Descriptive
  Descriptive statistics on a data set.
  Sets the ground for further analysis.

Exploratory
  Getting familiar with the data.
  Finding some initial/preliminary patterns in the data.

Inferential
  Using a small sample to say something about the population.
  Central Limit Theorem – how much is enough?
  Choosing an unbiased sample.
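
The descriptive and inferential questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is synthetic (a hypothetical skewed distribution), the helper names `describe` and `estimate_population_mean` are made up for this example, and the interval uses the normal approximation that the Central Limit Theorem justifies for reasonably large samples.

```python
import random
import statistics

def describe(values):
    """Descriptive question: summarize the data set at hand."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

def estimate_population_mean(sample, z=1.96):
    """Inferential question: estimate the population mean from a sample.

    Returns (mean, lower, upper) for an approximate 95% interval, assuming
    the sample mean is roughly normal (Central Limit Theorem).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / n ** 0.5
    return mean, mean - z * std_error, mean + z * std_error

# Hypothetical skewed population; the seed makes the random draw unbiased
# but repeatable.
rng = random.Random(42)
population = [rng.expovariate(1 / 50) for _ in range(100_000)]

small_sample = rng.sample(population, 30)
large_sample = rng.sample(population, 3000)

print(describe(small_sample))
for name, sample in [("n=30", small_sample), ("n=3000", large_sample)]:
    mean, low, high = estimate_population_mean(sample)
    print(f"{name}: mean={mean:.1f}, interval=({low:.1f}, {high:.1f})")
```

The larger sample gives a much narrower interval, which is one way to think about "how much is enough": enough that the interval is tight enough for the business question.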

Page 5: Data Science, Data & Dashboards Design

Data Science Questions (II)

Predictive
  Using Xs to predict the value of Y.
  Accuracy depends on getting the ‘right’ data.
  Parsimony – using the fewest Xs to predict Y accurately.

Causal (Stats & OR)
  What is the change in A when B changes? (+ve/-ve)
  Supported by some hypothesis.

Mechanistic (Stats & OR)
  Exact changes in variables leading to exact changes in other variables.
  Can keep throwing new variables in, but how much is enough?
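
As a rough illustration of the predictive question and of parsimony, the sketch below fits linear models with one, two and three Xs and keeps the smallest one whose validation error is close to the best. Everything here is an assumption for illustration: the data are synthetic (only `x1` actually drives `y`), the 5% tolerance is arbitrary, and `fit_ols`, `predict`, `rmse` and `pick` are hypothetical helper names, not part of the slides.

```python
import math
import random

def fit_ols(rows, targets):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented system [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, row):
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))

def rmse(coefs, rows, targets):
    errors = [(predict(coefs, r) - y) ** 2 for r, y in zip(rows, targets)]
    return math.sqrt(sum(errors) / len(errors))

def pick(row, cols):
    """Keep only the chosen X columns of a row."""
    return tuple(row[c] for c in cols)

# Synthetic data: x2 and x3 are pure noise with respect to y.
rng = random.Random(0)
data = []
for _ in range(300):
    x1, x2, x3 = (rng.uniform(0, 10) for _ in range(3))
    data.append(((x1, x2, x3), 3.0 + 2.0 * x1 + rng.gauss(0, 1)))
train, valid = data[:200], data[200:]

candidates = [(0,), (0, 1), (0, 1, 2)]  # ordered from fewest to most Xs
scores = {}
for cols in candidates:
    coefs = fit_ols([pick(r, cols) for r, _ in train], [y for _, y in train])
    scores[cols] = rmse(coefs, [pick(r, cols) for r, _ in valid],
                        [y for _, y in valid])

best = min(scores.values())
# Parsimony: the first (smallest) model within 5% of the best validation error.
chosen = next(cols for cols in candidates if scores[cols] <= 1.05 * best)
print(scores, "chosen columns:", chosen)
```

Note that this only addresses prediction. A good fit says nothing by itself about the causal or mechanistic questions, which need a supporting hypothesis.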

Page 6: Data Science, Data & Dashboards Design

The Process

1. Business Objectives & Questions
2. Collecting & Preparing Data
3. Exploratory Data Analysis
4. Build Mathematical Models (Train & Validate)
5. Select the Mathematical Model to be used
6. Deployment of Mathematical Model in IT Systems (if needed)
7. Continuous Validation of Model to ensure acceptable Predictive Power

Phases: Preparation, Model Building, Implementation
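
The last step of the process, continuous validation, can be sketched as a simple monitoring loop. This is a minimal sketch under assumptions, not a prescribed method: the deployed model is a hypothetical `predict` function, new data arrive in batches of (x, y) pairs, and "acceptable predictive power" is taken to mean that the batch RMSE stays within a chosen multiple (`tolerance`) of the RMSE seen at validation time.

```python
import math

def rmse(actuals, predictions):
    """Root mean squared error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared) / len(squared))

def monitor(batches, predict, baseline_rmse, tolerance=1.5):
    """Yield (batch_index, batch_rmse, acceptable) for each new batch."""
    for i, batch in enumerate(batches):
        xs, ys = zip(*batch)
        err = rmse(ys, [predict(x) for x in xs])
        yield i, err, err <= tolerance * baseline_rmse

# Hypothetical deployed model and two batches of fresh observations.
def deployed_model(x):
    return 2 * x

batches = [
    [(1, 2.1), (2, 3.9)],  # behaves like the validation data
    [(1, 4.0), (2, 6.5)],  # the relationship has drifted
]

results = list(monitor(batches, deployed_model, baseline_rmse=0.1))
for index, err, acceptable in results:
    status = "ok" if acceptable else "revisit the model"
    print(f"batch {index}: rmse={err:.2f} -> {status}")
```

When a batch is flagged, the process loops back to the earlier steps: check the data, rebuild and reselect the model.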

Page 7: Data Science, Data & Dashboards Design

Types of Data

Data are generally classified into two types: structured or unstructured data.

Structured data
  Generally captured by source systems in tabular format.
  Each row is an observation and each column represents a variable/characteristic.
  Easily understood by both computers and humans for processing.

Unstructured data
  Each row may represent a document/file/listing.
  There are no variable types.
  More processing is needed to understand and analyze each observation.

Note that the analysis of each type of data, structured and unstructured, is very different.
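
To make the difference concrete, here is a small sketch with made-up data. The structured rows can be analyzed directly by column, while the unstructured documents first need extra processing (here, a deliberately naive tokenizer) before anything can be counted. The field names, the review texts and the `tokenize` helper are all hypothetical.

```python
import re
from collections import Counter

# Structured: each row is an observation, each key is a variable.
structured = [
    {"customer_id": 1, "age": 34, "spend": 100.0},
    {"customer_id": 2, "age": 51, "spend": 50.0},
    {"customer_id": 3, "age": 27, "spend": 150.0},
]
average_spend = sum(row["spend"] for row in structured) / len(structured)

# Unstructured: each "row" is a document with no variables defined.
unstructured = [
    "Great service, will come again!",
    "Service was slow. Food was great.",
]

def tokenize(text):
    """Naive tokenizer: lowercase the text and keep runs of letters."""
    return re.findall(r"[a-z']+", text.lower())

# Extra processing step: turn each document into word counts.
term_counts = [Counter(tokenize(doc)) for doc in unstructured]

print("average spend:", average_spend)
print("word counts for document 2:", dict(term_counts[1]))
```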

Page 8: Data Science, Data & Dashboards Design

Types of Data

Data Tables
Relational Databases
JSON & XML
  JSON – JavaScript Object Notation
  XML – Extensible Markup Language
Textual – Tweets, Blogs, Emails, Reviews
Visual – Videos & Pictures
Audio – Music, Sound, Speech
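
As a small illustration of the JSON and XML formats listed above, the sketch below reads the same two records from each format into a list of rows using Python's standard `json` and `xml.etree.ElementTree` modules. The records and field names are invented for the example.

```python
import json
import xml.etree.ElementTree as ET

json_text = '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'

xml_text = """<people>
  <person><name>Alice</name><age>30</age></person>
  <person><name>Bob</name><age>25</age></person>
</people>"""

# JSON maps directly onto lists and dictionaries, with types preserved.
rows_from_json = json.loads(json_text)

# XML element text is always a string, so types are converted explicitly.
root = ET.fromstring(xml_text)
rows_from_xml = [
    {"name": person.findtext("name"), "age": int(person.findtext("age"))}
    for person in root.findall("person")
]

print(rows_from_json)
print(rows_from_xml)
```

Once both are in the same row-and-column shape, they can be treated like any other data table.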

Page 9: Data Science, Data & Dashboards Design

Big Data

Surrounded by data.
3 Vs of Big Data: Velocity, Volume, Variety.
Data capturing is much easier with the growth of technology.
Relevant data is more important.
It is the role of the Data Scientist to propose which data to capture.
Weighing Value vs Costs (Capture & Maintenance).

Page 10: Data Science, Data & Dashboards Design

Building Dashboards

Role of dashboard – Strategic, Analytical or Operational
Type of data
Domain
Type of Measurement – ruler, listing
Update frequency
Access rights
Interactivity
Mechanism of display – Text, Graphics or mixture
Portability – Mobile? PC?

Page 11: Data Science, Data & Dashboards Design

Dashboards Guidelines

Try to stay within a single screen.
Use as much of the ‘real estate’ as possible, but put in only relevant information.
Provide context – How good or bad? Where are we?
Avoid too much detail and precision.
Choose the right display.
  Go back to the Biz Qn – Pie Chart or Bar Chart?
Highlight important info – make sure it stands out.
Do not clutter with unnecessary ‘ornaments’.
Watch the colors.
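
Two of the guidelines above, "provide context" and "avoid too much detail and precision", can be illustrated with a tiny formatting helper. This is only a sketch: the function name `format_kpi`, the 90% and 100% thresholds, and the status labels are assumptions chosen for the example, not rules from the source material.

```python
def format_kpi(name, value, target, unit=""):
    """Render one dashboard figure with context and without excess precision."""
    ratio = value / target
    # Hypothetical thresholds: at or above target is good, within 10% is a warning.
    if ratio >= 1.0:
        status = "GOOD"
    elif ratio >= 0.9:
        status = "WATCH"
    else:
        status = "BAD"
    # Round to whole units and a whole percentage; decimals add clutter here.
    return f"{name}: {value:,.0f}{unit} ({ratio:.0%} of target) [{status}]"

print(format_kpi("Monthly revenue", 1234567.891, 1500000, " USD"))
# prints: Monthly revenue: 1,234,568 USD (82% of target) [BAD]
```

The raw number alone (1234567.891) answers neither "how good or bad?" nor "where are we?"; the comparison with the target does.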

Page 12: Data Science, Data & Dashboards Design

Koo Ping Shung

[email protected]/[email protected]

Thank you!