Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Upload: amberly-short

Post on 01-Jan-2016


Page 1: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Dr.Shaikh Shaffi Ahamed Ph.D.,Associate Professor

Dept. of Family & Community Medicine

Page 2: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Statistics is the science of conducting studies to collect, organize, summarize, analyze, present, interpret and draw conclusions from data.

Data: any values (observations or measurements) that have been collected.

Page 3: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

What Is Statistics?

1. Collecting Data, e.g., Sample, Survey, Observe, Simulate

2. Characterizing Data, e.g., Organize/Classify, Count, Summarize

3. Presenting Data, e.g., Tables, Charts, Statements

4. Interpreting Results, e.g., Infer, Conclude, Specify Confidence

Why? Data Analysis, Decision-Making


Page 4: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Biostatistics

(1) Statistics arising out of biological sciences, particularly from the fields of medicine and public health.

(2) The methods used in dealing with statistics in the fields of medicine, biology and public health for planning and conducting investigations in these branches and for analyzing the data that arise from them.

Page 5: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

BASIC CONCEPTS

Data: Set of values of one or more variables recorded on one or more observational units (singular: datum)

Categories of data
1. Primary data: observation, questionnaire, interviews & survey
2. Secondary data: census, medical records, registry

Sources of data
1. Routinely kept records
2. Surveys (census)
3. Experiments
4. External source

Page 6: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Datasets and Data Tables

Dataset: Data for a set of variables collected on a group of persons.

Data Table: A dataset organized into a table, with one column for each variable and one row for each person.

Page 7: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Typical Data Table

OBS  AGE  BMI   FFNUM  TEMP (°F)  GENDER  EXERCISE LEVEL  QUESTION
1    26   23.2  0      61.0       0       1               1
2    30   30.2  9      65.5       1       3               2
3    32   28.9  17     59.6       1       3               4
4    37   22.4  1      68.4       1       2               3
5    33   25.5  7      64.5       0       3               5
6    29   22.3  1      70.2       0       2               2
7    32   23.0  0      67.3       0       1               1
8    33   26.3  1      72.8       0       3               1
9    32   22.2  3      71.5       0       1               4
10   33   29.1  5      63.2       1       1               4
11   26   20.8  2      69.1       0       1               3
12   34   20.9  4      73.6       0       2               3
13   31   36.3  1      66.3       0       2               5
14   31   36.4  0      66.9       1       1               5
15   27   28.6  2      70.2       1       2               2
16   36   27.5  2      68.5       1       3               3
17   35   25.6  143    67.8       1       3               4
18   31   21.2  11     70.7       1       1               2
19   36   22.7  8      69.8       0       2               1
20   33   28.1  3      67.8       0       2               1

Page 8: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Definitions for Variables

• AGE: Age in years

• BMI: Body mass index, weight/height² in kg/m²

• FFNUM: The average number of times eating “fast food” in a week

• TEMP: High temperature for the day

• GENDER: 1- Female 0- Male

• EXERCISE LEVEL: 1- Low 2- Medium 3- High

• QUESTION: What is your satisfaction rating for this Biostatistics session?

1- Very Satisfied 2- Somewhat Satisfied 3- Neutral 4- Somewhat dissatisfied 5- Dissatisfied
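The coded columns in the data table only become readable once the definitions above are applied. A minimal sketch of that decoding step follows, using the first three rows of the table. The names COLUMNS, RAW_ROWS and decode are hypothetical and chosen for illustration only.

```python
# Coding schemes copied from the variable definitions on this slide.
GENDER = {0: "Male", 1: "Female"}
EXERCISE_LEVEL = {1: "Low", 2: "Medium", 3: "High"}
QUESTION = {
    1: "Very Satisfied",
    2: "Somewhat Satisfied",
    3: "Neutral",
    4: "Somewhat dissatisfied",
    5: "Dissatisfied",
}

# One column per variable (names are illustrative, not from the slides).
COLUMNS = ("OBS", "AGE", "BMI", "FFNUM", "TEMP",
           "GENDER", "EXERCISE_LEVEL", "QUESTION")

# First three rows of the typical data table: one row per person.
RAW_ROWS = [
    (1, 26, 23.2, 0, 61.0, 0, 1, 1),
    (2, 30, 30.2, 9, 65.5, 1, 3, 2),
    (3, 32, 28.9, 17, 59.6, 1, 3, 4),
]


def decode(row):
    """Turn one coded row into a dict with readable category labels."""
    record = dict(zip(COLUMNS, row))
    record["GENDER"] = GENDER[record["GENDER"]]
    record["EXERCISE_LEVEL"] = EXERCISE_LEVEL[record["EXERCISE_LEVEL"]]
    record["QUESTION"] = QUESTION[record["QUESTION"]]
    return record


decoded = [decode(row) for row in RAW_ROWS]
for record in decoded:
    print(record)
```

The numeric codes for GENDER, EXERCISE LEVEL and QUESTION are labels, not quantities, which is why they are mapped back to text rather than averaged.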

Page 9: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Types of variables and data

• When collecting or gathering data, we collect data from individual cases on particular variables.

• A variable is a unit of data collection whose value can vary.

• Variables can be classified into types according to the level of mathematical scaling that can be carried out on the data.

• There are four types of data or levels of measurement:

1. Nominal
2. Ordinal
3. Interval
4. Ratio

Page 10: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Scales of Measurement

Page 11: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Terminology
• Categorical variables
• Quantity variables
• Nominal variables
• Ordinal variables
• Binary data
• Discrete and continuous data
• Interval and ratio variables
• Qualitative and quantitative traits/characteristics of data

Page 12: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Categorical data

The objects being studied are grouped into categories based on some qualitative trait.

The resulting data are merely labels or categories.

Page 13: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Examples of categorical data

• Eye color: blue, brown, black, green, etc.
• Smoking status: smoker, non-smoker
• Attitudes towards the death penalty: strongly disagree, disagree, neutral, agree, strongly agree.

Page 14: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Categorical data
• Nominal data
• Ordinal data

Page 15: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Nominal data

A type of categorical data in which objects fall into unordered categories.

Studies measuring nominal data must ensure that the categories are mutually exclusive and that the system of measurement is exhaustive.

Variables that have only two possible responses, e.g. Yes or No, are known as dichotomies.

Page 16: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Examples of Nominal data

• Type of car: BMW, Mercedes, Lexus, Toyota, etc.
• Ethnicity: White British, Afro-Caribbean, Asian, Arab, Chinese, other, etc.
• Smoking status: smoker, non-smoker

Page 17: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Binary Data

A type of categorical data in which there are only two categories.

Examples:
• Smoking status: smoker, non-smoker
• Attendance: present, absent
• Result of an exam: pass, fail
• Status of student: undergraduate, postgraduate

Page 18: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Ordinal data

• Ordinal data is data that comprises categories that can be rank ordered.

• As with nominal data, the distance between each category cannot be calculated, but the categories can be ranked above or below each other.

Page 19: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Examples of Ordinal Data:

Grades in exam: A+, A, B+, B, C+, C, D+, D, and Fail.

Degree of illness: none, mild, moderate, acute, chronic.

Opinion of students about stats classes: very unhappy, unhappy, neutral, happy, ecstatic!

Page 20: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Examples: Nominal data (Binary) & Ordinal data

What is your gender? (please tick)

Male

Female

Did you enjoy the teaching session? (please tick)

Yes

No

What is your level of satisfaction with the new curriculum at the medical school? (please tick)

Very satisfied

Somewhat satisfied

Neutral

Somewhat dissatisfied

Very dissatisfied

Page 21: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

QUANTITATIVE DATA

The objects being studied are ‘measured’ based on some quantitative trait.

The resulting data are a set of numbers.

Examples:
• Pulse rate
• Height
• Age
• Exam marks
• Time to complete a statistics test
• Number of cigarettes smoked

Page 22: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Quantitative data
• Discrete
• Continuous

Page 23: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Discrete Data

Only certain values are possible (there are gaps between the possible values). Implies counting.

Continuous Data

Theoretically, with a fine enough measuring device, any value is possible (no gaps between the possible values). Implies measuring.

Page 24: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Discrete data -- Gaps between possible values (e.g., Number of Children)

Continuous data -- Theoretically, no gaps between possible values (e.g., Hb)

Page 25: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Examples of Discrete Data:
• Number of children in a family
• Number of students passing a stats exam
• Number of crimes reported to the police
• Number of bicycles sold in a day

Generally, discrete data are counts. We would not expect to find 2.2 children in a family, or 88.5 students passing an exam, or 127.2 crimes being reported to the police, or half a bicycle being sold in one day.

Page 26: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Examples of Continuous Data:
• Age (in years)
• Height (in cm)
• Weight (in kg)
• Sys. BP, Hb, etc.

Generally, continuous data come from measurements.

Page 27: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Variables
• Category: Nominal, Ordinal
• Quantity: Discrete (counting), Continuous (measuring)

Page 28: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Interval variables

Examples:
• Fahrenheit temperature scale: zero is arbitrary, so 40 degrees is not twice as hot as 20 degrees.
• IQ tests: there is no such thing as zero IQ, so an IQ of 120 is not twice as intelligent as an IQ of 60.

Question: Can we assume that attitudinal data represent real, quantifiable measured categories (i.e. that ‘very happy’ is twice as happy as plain ‘happy’, or that ‘very unhappy’ means no happiness at all)? “Statisticians not in agreement on this”.
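The Fahrenheit example can be checked numerically. The sketch below is illustrative only: it re-expresses 40 and 20 degrees Fahrenheit in kelvin (a scale with a true zero, brought in here as an assumption for comparison) and shows that the apparent "twice as hot" ratio does not survive the change of scale. The function name fahrenheit_to_kelvin is hypothetical.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert a Fahrenheit temperature to kelvin."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


# Ratio of the raw Fahrenheit readings: looks like "twice as hot".
ratio_fahrenheit = 40.0 / 20.0

# Ratio of the same two temperatures on a scale with a true zero.
ratio_kelvin = fahrenheit_to_kelvin(40.0) / fahrenheit_to_kelvin(20.0)

print("Ratio in Fahrenheit:", ratio_fahrenheit)
print("Ratio in kelvin:    ", round(ratio_kelvin, 3))

# A ratio variable behaves differently: changing the unit of income
# (riyals vs. thousands of riyals) leaves the ratio unchanged.
ratio_income_sr = 20000.0 / 10000.0
ratio_income_thousands = 20.0 / 10.0
print("Income ratios:", ratio_income_sr, ratio_income_thousands)
```

Because the Fahrenheit zero is arbitrary, only differences between readings carry meaning; ratios do not.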

Page 29: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Ratio variables
• Can be discrete or continuous data.
• The distance between any two adjacent units of measurement (intervals) is the same and there is a meaningful zero point.

Examples:
• Income: someone earning SR20,000 earns twice as much as someone who earns SR10,000.
• Height
• Weight
• Age

Page 30: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Hierarchical data order

• These levels of measurement can be placed in hierarchical order.

Page 31: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Hierarchical data order

• Nominal data is the least complex and gives a simple measure of whether objects are the same or different.

• Ordinal data maintains the principles of nominal data but adds a measure of order to what is being observed.

• Interval data builds on ordinal data by allowing us to measure the distance between objects.

• Ratio data adds to interval data by including an absolute zero.
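One way to picture this hierarchy is as a cumulative list: each level keeps the properties of the levels below it and adds one more. The sketch below is illustrative only. The names LEVELS, ADDED_PROPERTY and properties_of are hypothetical, and the property labels paraphrase the points on this slide.

```python
# Levels of measurement, from least to most informative.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# What each level adds on top of the levels below it (paraphrased).
ADDED_PROPERTY = {
    "nominal": "same or different",
    "ordinal": "rank order",
    "interval": "distance between values",
    "ratio": "absolute zero",
}


def properties_of(level):
    """Return every property available at the given level of measurement."""
    position = LEVELS.index(level)
    return [ADDED_PROPERTY[name] for name in LEVELS[: position + 1]]


for level in LEVELS:
    print(f"{level:<9}", ", ".join(properties_of(level)))
```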

Page 32: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

QUANTITATIVE DATA to QUALITATIVE DATA

• wt. (in Kg.): under wt., normal & over wt.
• Ht. (in cm.): short, medium & tall
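Converting a quantitative measurement into qualitative categories amounts to choosing cut-off points. The sketch below does this for height. The cut-offs (160 cm and 175 cm) and the three sample heights are hypothetical values chosen only for illustration; they do not come from the slides.

```python
# Hypothetical cut-offs in cm, for illustration only.
SHORT_BELOW_CM = 160.0
TALL_FROM_CM = 175.0


def height_category(height_cm):
    """Map a height in cm (quantitative) to an ordered category (qualitative)."""
    if height_cm < SHORT_BELOW_CM:
        return "short"
    if height_cm < TALL_FROM_CM:
        return "medium"
    return "tall"


# Made-up example measurements.
heights_cm = [152.0, 168.5, 181.0]
categories = [height_category(h) for h in heights_cm]
print(list(zip(heights_cm, categories)))
```

The resulting categories are ordinal: they can be ranked, but the exact distances between the original measurements are lost.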

Page 33: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

CLINIMETRICS

Clinimetrics is a science in which qualities are converted to meaningful quantities by using a scoring system.

Examples:

(1) Apgar score: based on appearance, pulse, grimace, activity and respiration; used for neonatal prognosis.

(2) Smoking Index: no. of cigarettes, duration, filter or not, whether pipe, cigar, etc.

(3) APACHE (Acute Physiology and Chronic Health Evaluation) score: to quantify the severity of the condition of a patient.
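As an illustration of a scoring system, the sketch below totals an Apgar score. It assumes the usual convention, not stated on the slide, that each of the five components is rated 0, 1 or 2, giving a total between 0 and 10. The function name apgar_score and the example ratings are hypothetical.

```python
def apgar_score(appearance, pulse, grimace, activity, respiration):
    """Sum five component ratings (each assumed to be 0, 1 or 2)."""
    components = {
        "appearance": appearance,
        "pulse": pulse,
        "grimace": grimace,
        "activity": activity,
        "respiration": respiration,
    }
    for name, value in components.items():
        if value not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1 or 2, got {value!r}")
    return sum(components.values())


# Made-up ratings for one newborn, for illustration only.
total = apgar_score(appearance=2, pulse=2, grimace=1, activity=2, respiration=2)
print("Apgar total:", total)
```

Five qualitative observations are thus turned into a single number that can be tabulated and compared.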

Page 34: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Data types – important?

• Why do we need to know what type of data we are dealing with?

• The data type or level of measurement influences the type of statistical analysis techniques that can be used when analysing data.

Page 35: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Frequency Distributions

What is a frequency distribution? A frequency distribution is an organization of raw data in tabular form, using classes (or intervals) and frequencies.

What is a frequency count? The frequency or the frequency count for a data value is the number of times the value occurs in the data set.

Page 36: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Frequency Distributions
• Data distribution: pattern of variability
  - the center of a distribution
  - the ranges
  - the shapes
• Simple frequency distributions
• Grouped & ungrouped frequency distributions

Page 37: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Categorical or Qualitative Frequency Distributions

What is a categorical frequency distribution?

A categorical frequency distribution represents data that can be placed in specific categories, such as gender, blood group, hair color, etc.

Page 38: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Categorical or Qualitative Frequency Distributions -- Example

Example: The blood types of 25 blood donors are given below. Summarize the data using a frequency distribution.

AB  B   A   O   B
O   B   O   A   O
B   O   B   B   B
A   O   AB  AB  O
A   B   AB  O   A

Page 39: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Categorical Frequency Distribution for the Blood Types -- Example Continued

Note: The classes for the distribution are the blood types.
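The frequency table for this example can be produced by counting each blood type. The sketch below uses the 25 donor values from the example; the percent column and the variable names are additions for illustration.

```python
from collections import Counter

# Blood types of the 25 donors, row by row as listed in the example.
blood_types = (
    "AB B A O B "
    "O B O A O "
    "B O B B B "
    "A O AB AB O "
    "A B AB O A"
).split()

n = len(blood_types)
counts = Counter(blood_types)

# One row per class (blood type): class, frequency, percent.
table = [(group, counts[group], 100.0 * counts[group] / n)
         for group in ("A", "B", "O", "AB")]

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for group, freq, pct in table:
    print(f"{group:<6}{freq:>10}{pct:>8.0f}%")
print(f"{'Total':<6}{n:>10}")
```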

Page 40: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Quantitative Frequency Distributions -- Ungrouped

What is an ungrouped frequency distribution?

An ungrouped frequency distribution simply lists the data values with the corresponding frequency counts with which each value occurs.

Page 41: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Quantitative Frequency Distributions -- Ungrouped -- Example

Example: The at-rest pulse rates for 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.

Page 42: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Quantitative Frequency Distributions -- Ungrouped -- Example Continued

Note: The (ungrouped) classes are the observed values themselves.
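A short sketch of the ungrouped tabulation for the pulse-rate example follows. It assumes the 16 readings are 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60 and 58, as read from the example slide; each distinct observed value becomes its own class.

```python
from collections import Counter

# At-rest pulse rates of the 16 athletes (assumed reading of the example).
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64,
               53, 54, 54, 55, 57, 55, 60, 58]

counts = Counter(pulse_rates)

# Ungrouped distribution: one class per observed value, in ascending order.
table = sorted(counts.items())

print(f"{'Pulse rate':<12}{'Frequency':>9}")
for value, freq in table:
    print(f"{value:<12}{freq:>9}")
print(f"{'Total':<12}{len(pulse_rates):>9}")
```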

Page 43: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Example of a simple frequency distribution (ungrouped)

5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1

X    f
9    3
8    2
7    2
6    1
5    4
4    4
3    3
2    3
1    3

Σf = 25

Page 44: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Relative Frequency Distribution

Note: The relative frequency for a class is obtained by computing f/n.

Page 45: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Example of a simple frequency distribution

5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1

X    f    rel f
9    3    .12
8    2    .08
7    2    .08
6    1    .04
5    4    .16
4    4    .16
3    3    .12
2    3    .12
1    3    .12

Σf = 25    Σ rel f = 1.0

Page 46: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Cumulative Frequency and Cumulative Relative Frequency

Note: Table with relative and cumulative relative frequencies.

Page 47: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Example of a simple frequency distribution (ungrouped)

5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1

X    f    cf    rel f    rel. cf
9    3    3     .12      .12
8    2    5     .08      .20
7    2    7     .08      .28
6    1    8     .04      .32
5    4    12    .16      .48
4    4    16    .16      .64
3    3    19    .12      .76
2    3    22    .12      .88
1    3    25    .12      1.0

Σf = 25    Σ rel f = 1.0
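The four columns of this table (f, cf, rel f, rel. cf) can all be derived from the 25 raw scores. The sketch below accumulates from the highest value downwards, matching the ordering used in the table; the variable names are chosen for illustration.

```python
from collections import Counter

# The 25 raw scores from the example.
data = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
        7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(data)
counts = Counter(data)

rows = []
cumulative = 0
for value in sorted(counts, reverse=True):   # 9 down to 1, as in the table
    f = counts[value]
    cumulative += f                          # cf: running total of f
    rows.append((value, f, cumulative,
                 round(f / n, 2),            # rel f = f / n
                 round(cumulative / n, 2)))  # rel. cf = cf / n

print(f"{'X':>2}{'f':>4}{'cf':>5}{'rel f':>8}{'rel cf':>8}")
for value, f, cf, rel_f, rel_cf in rows:
    print(f"{value:>2}{f:>4}{cf:>5}{rel_f:>8.2f}{rel_cf:>8.2f}")
```

The last cumulative frequency equals n and the last cumulative relative frequency equals 1.0, which is a quick check that no value was missed.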

Page 48: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Quantitative Frequency Distributions -- Grouped

What is a grouped frequency distribution? A grouped frequency distribution is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval.

Page 49: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Tabulate the hemoglobin values of 30 adult male patients listed below

Patient No  Hb (g/dl)   Patient No  Hb (g/dl)   Patient No  Hb (g/dl)
1           12.0        11          11.2        21          14.9
2           11.9        12          13.6        22          12.2
3           11.5        13          10.8        23          12.2
4           14.2        14          12.3        24          11.4
5           12.3        15          12.3        25          10.7
6           13.0        16          15.7        26          12.5
7           10.5        17          12.6        27          11.8
8           12.8        18          9.1         28          15.1
9           13.2        19          12.9        29          13.4
10          11.2        20          14.6        30          13.1

Page 50: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Frequency distribution of 30 adult male patients by Hb

Hb (g/dl)      No. of patients
9.0 – 9.9      1
10.0 – 10.9    3
11.0 – 11.9    6
12.0 – 12.9    10
13.0 – 13.9    5
14.0 – 14.9    3
15.0 – 15.9    2
Total          30
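The grouped table can be reproduced by assigning each Hb value to a class of width 1 g/dl. In the sketch below the class is identified by the whole-number part of the value, which is one simple way to do the grouping and not necessarily how the table was originally built.

```python
import math
from collections import Counter

# Hb values (g/dl) of the 30 patients, in patient-number order.
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# Class k covers k.0 to k.9, so the class index is the floor of the value.
counts = Counter(math.floor(value) for value in hb)

lowest, highest = min(counts), max(counts)
table = [(f"{k}.0 - {k}.9", counts.get(k, 0))
         for k in range(lowest, highest + 1)]

print(f"{'Hb (g/dl)':<14}{'No. of patients':>15}")
for label, freq in table:
    print(f"{label:<14}{freq:>15}")
print(f"{'Total':<14}{len(hb):>15}")
```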

Page 51: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

DIAGRAMS/GRAPHS

Categorical data

--- Bar diagram (one or two groups)

--- Pie diagram

Continuous data

--- Histogram

--- Frequency polygon (curve)

--- Stem-and-leaf plot

--- Box-and-whisker plot

Page 52: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Two-dimensional graphs: Basic Set-Up

Page 53: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Histograms

Page 54: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Frequency Polygons

Page 55: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Example data

68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12

Page 56: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Stem and leaf plot

Stem-and-leaf of Age   N = 60
Leaf Unit = 1.0

 6   1   122269
19   2   1223344555777788888
11   3   00111226688
13   4   2223334567999
 5   5   01127
 4   6   3458
 2   7   49
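The plot above can be rebuilt from the 60 ages in the example data: the tens digit is the stem, the units digit is the leaf, and the first column is the number of leaves on each stem. The sketch below is a minimal illustration with names chosen for this example.

```python
from collections import defaultdict

# The 60 ages from the example data slide.
ages = [68, 63, 42, 27, 30, 36, 28, 32,
        79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31,
        28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21,
        16, 24, 64, 47, 23, 22, 43, 27,
        49, 28, 23, 19, 11, 52, 46, 31,
        30, 43, 49, 12]

# Sort first so the leaves on each stem come out in ascending order.
stems = defaultdict(list)
for age in sorted(ages):
    stems[age // 10].append(age % 10)   # stem = tens digit, leaf = units digit

lines = []
for stem, leaves in sorted(stems.items()):
    leaf_text = "".join(str(leaf) for leaf in leaves)
    lines.append(f"{len(leaves):2d}   {stem}   {leaf_text}")

print(f"Stem-and-leaf of Age   N = {len(ages)}")
print("\n".join(lines))
```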

Page 57: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Bar Graphs

[Figure: bar graph. Title: The distribution of risk factors among cases with cardiovascular diseases. X axis: Risk factor (Smo, Alc, Chol, DM, HTN, NoExer, F-H). Y axis: Number (0 to 25). Bar labels: 9, 12, 20, 16, 12, 8, 20.]

• The height of each bar indicates the frequency

• Frequency on the Y axis and categories of the variable on the X axis

• The bars should be of equal width and should not touch each other

Page 58: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Bar chart

[Figure: bar chart. Title: HIV cases enrolment in USA by gender. X axis: Year (1986 to 1992). Y axis: Enrollment (hundred), 0 to 12. Series: Men, Women.]

Page 59: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Stacked bar chart

[Figure: stacked bar chart. Title: HIV cases enrollment in USA by gender. X axis: Year (1986 to 1992). Y axis: Enrollment (thousands), 0 to 18. Series: Women, Men.]

Page 60: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Grouped Bar Graph

Page 61: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Pie diagram

A pie diagram depicts the percentage represented by each alternative as a slice of a circular pie; the larger the slice, the greater the percentage.

Page 62: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Pie Chart

[Figure: pie chart. Title: The prevalence of different degrees of hypertension in the population. Categories: Mild, Moderate, Severe. Slice labels: 10%, 20%, 70%.]

• Circular diagram – total 100%

• Divided into segments, each representing a category

• Decide adjacent category

• The amount for each category is proportional to the slice of the pie

Page 63: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine
Page 64: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

General rules for designing graphs

• A graph should have a self-explanatory legend

• A graph should help the reader to understand the data

• Axes labeled, units of measurement indicated

• Scales are important. Start with zero (otherwise // break)

• Avoid graphs with a three-dimensional impression; they may be misleading (the reader visualizes them less easily)

Page 65: Dr.Shaikh Shaffi Ahamed Ph.D., Associate Professor Dept. of Family & Community Medicine

Thank You