Post on 18-Aug-2020


DESCRIPTIVE STATISTICS

MATHS 4TH ESO

JOSÉ JAIME NOGUERA


INTRODUCTION

Statistics is used to collect, organize, analyze and present data.

• POPULATION: the whole group of entities (individuals) that you want to study.

• SAMPLE: a small subset of the population that represents the entire population.


EXAMPLE

Spanish general election:

WHICH PARTY ARE YOU GOING TO VOTE FOR?

– POPULATION: all Spanish citizens over 18.

– SAMPLE: the people you actually ask the question.


Random Variables

A random variable or statistical variable is the characteristic that we want to study in the population.

According to the answer to your question, you can classify random variables as:

• QUALITATIVE: the answer is not a number.

• QUANTITATIVE: the answer is a number.

– DISCRETE: the answer is an integer.

– CONTINUOUS: the answer is a decimal number.


Examples

• We want to study the number of brothers or sisters of a population.

– Discrete quantitative variable.

• We want to study the hair colour of a population.

– Qualitative variable.

• We want to study the height of a population.

– Continuous quantitative variable.


Organizing data: frequency tables and charts


A complete example for a discrete quantitative variable

Study: we ask 25 students how many brothers and sisters they have. The answers are:

1, 3, 0, 1, 2, 2, 0, 1, 2, 1, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1, 3, 2, 2, 1, 1


YOU NEED TO KNOW

• N = the total number of data values.

• $x_i$ = the i-th distinct value of the variable.

• $f_i$ = absolute frequency: the number of times that $x_i$ appears in the answers.

• $h_i$ = relative frequency: $h_i = \dfrac{f_i}{N}$

• $F_i$ = absolute cumulative frequency: $F_i = f_1 + f_2 + \dots + f_i$

• $H_i$ = relative cumulative frequency: $H_i = h_1 + h_2 + \dots + h_i = \dfrac{F_i}{N}$

• $H_i$ as % = relative cumulative frequency as a percentage: $H_i \cdot 100\,\%$


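FREQUENCY TABLE

1, 3, 0, 1, 2, 2, 0, 1, 2, 1, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1, 3, 2, 2, 1, 1

| $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|
| 0 | 5 | 5/25 = 0.2 | 5 | 5/25 = 0.2 | 0.2 · 100 = 20% |
| 1 | 12 | 12/25 = 0.48 | 5 + 12 = 17 | 17/25 = 0.68 | 0.68 · 100 = 68% |
| 2 | 6 | 6/25 = 0.24 | 5 + 12 + 6 = 23 | 23/25 = 0.92 | 0.92 · 100 = 92% |
| 3 | 2 | 2/25 = 0.08 | 5 + 12 + 6 + 2 = 25 | 25/25 = 1 | 1 · 100 = 100% |

N = 25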
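The table can also be built programmatically. The following is a minimal sketch in Python, not part of the original slides; the function name `frequency_table` and the row layout are illustrative choices.

```python
from collections import Counter
from itertools import accumulate

# The 25 answers from the brothers-and-sisters example.
data = [1, 3, 0, 1, 2, 2, 0, 1, 2, 1, 2, 1, 1,
        0, 1, 0, 1, 0, 1, 1, 3, 2, 2, 1, 1]


def frequency_table(values):
    """Return rows (x_i, f_i, h_i, F_i, H_i), sorted by x_i."""
    n = len(values)
    counts = Counter(values)              # absolute frequencies f_i
    xs = sorted(counts)
    fs = [counts[x] for x in xs]
    cumulative = list(accumulate(fs))     # absolute cumulative frequencies F_i
    return [(x, f, f / n, F, F / n) for x, f, F in zip(xs, fs, cumulative)]


rows = frequency_table(data)
for x, f, h, F, H in rows:
    print(f"{x}  {f:2d}  {h:.2f}  {F:2d}  {H:.2f}  {H * 100:.0f}%")
```

Each row holds the same five quantities as the table; the percentage column is just $H_i$ multiplied by 100 when printing.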


CHARTS

[Figure: BAR CHART. Absolute frequency (vertical axis, 0 to 14) for each value 0, 1, 2, 3 (horizontal axis).]


[Figure: FREQUENCY POLYGON. Absolute frequency (vertical axis, 0 to 14) for each value 0, 1, 2, 3 (horizontal axis).]


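[Figure: PIE CHART. Sectors: 0 (20%), 1 (48%), 2 (24%), 3 (8%). The angle of each sector is $h_i \cdot 360°$; for example, for the value 1: 0.48 · 360° = 172.8°.]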
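As an illustration of the sector-angle rule, here is a short Python sketch that computes $h_i \cdot 360°$ for every value. The function name `sector_angles` is hypothetical, and the frequencies are the ones from the frequency table of this example.

```python
# Absolute frequencies from the frequency table (x_i -> f_i).
frequencies = {0: 5, 1: 12, 2: 6, 3: 2}


def sector_angles(freqs):
    """Map each value x_i to its pie-chart sector angle in degrees (h_i * 360)."""
    n = sum(freqs.values())
    return {x: f / n * 360 for x, f in freqs.items()}


angles = sector_angles(frequencies)
for x, angle in angles.items():
    print(f"x = {x}: {angle:.1f} degrees")
```

The angles always add up to 360° because the relative frequencies add up to 1.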


A complete example for a continuous quantitative variable

If the variable takes too many different values $x_i$, we have to group the values into class intervals.

Example: we know the weights of 30 students:

52 63 71 68 72 69

73 81 53 80 71 72

77 61 83 78 55 60

73 53 66 90 80 96

67 70 82 83 71 61


Choosing the length (or amplitude) of the intervals

Here we have several options:

• If the problem states the number of intervals, for instance "group the data into 6 intervals":

$\text{length} = \dfrac{\text{Max. value} - \text{Min. value}}{6}$

• If the problem says nothing:

$\text{length} = \dfrac{\text{Max. value} - \text{Min. value}}{\sqrt{N}}$


In our case the problem says nothing about the number of intervals, so we divide the range by the square root of N:

$\text{length} = \dfrac{96 - 52}{\sqrt{30}} = 8.03$

The length should be an integer, and we always round up: 8.03 → length = 9

Choosing where the first interval starts can also be confusing. Sometimes the intervals are centered on the data, but we will simply start at the minimum value of our data.
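The two choices above (interval length and starting point) can be sketched in Python as follows. This is only an illustration under the rules stated on this slide: the function names `interval_length` and `interval_edges` are hypothetical, and rounding up is applied in both cases.

```python
import math

# The 30 weights from the example.
weights = [52, 63, 71, 68, 72, 69, 73, 81, 53, 80, 71, 72, 77, 61, 83,
           78, 55, 60, 73, 53, 66, 90, 80, 96, 67, 70, 82, 83, 71, 61]


def interval_length(values, n_intervals=None):
    """Interval length, rounded up to an integer.

    Divides the range by n_intervals when it is given,
    otherwise by the square root of the number of data values.
    """
    spread = max(values) - min(values)
    divisor = n_intervals if n_intervals is not None else math.sqrt(len(values))
    return math.ceil(spread / divisor)


def interval_edges(values, length):
    """Edges starting at the minimum, extended until the maximum is covered."""
    edges = [min(values)]
    while edges[-1] < max(values):
        edges.append(edges[-1] + length)
    return edges


length = interval_length(weights)
edges = interval_edges(weights, length)
print(length, edges)
```

Consecutive pairs of edges give the class intervals used in the rest of the example.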


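| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | | | | | | |
| [61,70) | | | | | | |
| [70,79) | | | | | | |
| [79,88) | | | | | | |
| [88,97] | | | | | | |

Pay attention! The last interval, [88,97], is closed at both ends.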


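The class mark is the midpoint of each interval:

| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | (52 + 61)/2 = 56.5 | | | | | |
| [61,70) | 65.5 | | | | | |
| [70,79) | 74.5 | | | | | |
| [79,88) | 83.5 | | | | | |
| [88,97] | 92.5 | | | | | |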


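The absolute frequency $f_i$ is the number of data values that fall in each interval:

| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | 56.5 | 5 | | | | |
| [61,70) | 65.5 | 7 | | | | |
| [70,79) | 74.5 | 10 | | | | |
| [79,88) | 83.5 | 6 | | | | |
| [88,97] | 92.5 | 2 | | | | |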


The complete frequency table (values rounded to two decimals):

| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | 56.5 | 5 | 5/30 = 0.17 | 5 | 5/30 = 0.17 | 17% |
| [61,70) | 65.5 | 7 | 0.23 | 5 + 7 = 12 | 12/30 = 0.4 | 40% |
| [70,79) | 74.5 | 10 | 0.33 | 22 | 0.73 | 73% |
| [79,88) | 83.5 | 6 | 0.2 | 28 | 0.93 | 93% |
| [88,97] | 92.5 | 2 | 0.07 | 30 | 1 | 100% |

N = 30
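The counting step for grouped data can be sketched in Python as below. This is an illustrative helper, not part of the original slides: `grouped_table` is a hypothetical name, every interval is treated as [low, high) except the last one, which is closed on the right.

```python
# The 30 weights and the interval edges from the example.
weights = [52, 63, 71, 68, 72, 69, 73, 81, 53, 80, 71, 72, 77, 61, 83,
           78, 55, 60, 73, 53, 66, 90, 80, 96, 67, 70, 82, 83, 71, 61]
edges = [52, 61, 70, 79, 88, 97]


def grouped_table(values, edges):
    """Return rows (low, high, class_mark, f_i, F_i) for each class interval."""
    rows = []
    cumulative = 0
    last = len(edges) - 2                      # index of the last interval
    for i, (low, high) in enumerate(zip(edges, edges[1:])):
        if i == last:
            f = sum(low <= v <= high for v in values)   # closed interval
        else:
            f = sum(low <= v < high for v in values)    # half-open interval
        cumulative += f
        rows.append((low, high, (low + high) / 2, f, cumulative))
    return rows


table = grouped_table(weights, edges)
for low, high, mark, f, F in table:
    print(f"[{low},{high})  mark={mark}  f={f}  F={F}")
```

The relative columns $h_i$ and $H_i$ follow by dividing $f_i$ and $F_i$ by N = 30.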


[Figure: HISTOGRAM. Absolute frequency (vertical axis, 0 to 12) over the class intervals with edges 52, 61, 70, 79, 88, 97 (horizontal axis).]


Exercise

In a clothing store, the number of garments sold per day is:

a) Make a frequency table grouping the data into 6 class intervals.

b) Draw the proper chart.


Statistical Parameters


STATISTICAL CONCENTRATION PARAMETERS

They are also known as CENTRAL TENDENCY MEASURES:

• MEAN: $\bar{x} = \dfrac{x_1 \cdot f_1 + x_2 \cdot f_2 + \dots + x_n \cdot f_n}{N} = \dfrac{\sum_{i=1}^{n} x_i \cdot f_i}{N}$

• MODE: Mo is the $x_i$ with the greatest $f_i$.

• MEDIAN: Me is the value in the middle of the data when they are sorted in order.


Discrete quantitative variable

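• Using the data of our previous example (number of brothers and sisters):

$\bar{x} = \dfrac{0 \cdot 5 + 1 \cdot 12 + 2 \cdot 6 + 3 \cdot 2}{25} = 1.2$

Mo = 1 (because its $f_i$ is the greatest one)

Me = 1

Because 50% of N = 25 is 0.5 · 25 = 12.5, the first $F_i$ greater than or equal to 12.5 is $F_2 = 17$, which corresponds to $x_2 = 1$.

| $x_i$ | $f_i$ | $F_i$ |
|---|---|---|
| 0 | 5 | 5 |
| 1 | 12 | 17 |
| 2 | 6 | 23 |
| 3 | 2 | 25 |

N = 25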
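The three central tendency measures can be computed directly from the pairs $(x_i, f_i)$. The sketch below is illustrative only; the function names are hypothetical, and the median uses the basic cumulative-frequency rule described on this slide.

```python
# (x_i, f_i) pairs from the frequency table, sorted by x_i.
table = [(0, 5), (1, 12), (2, 6), (3, 2)]


def mean(table):
    """Weighted mean: sum of x_i * f_i divided by N."""
    n = sum(f for _, f in table)
    return sum(x * f for x, f in table) / n


def mode(table):
    """The x_i with the greatest absolute frequency f_i."""
    return max(table, key=lambda row: row[1])[0]


def median(table):
    """The x_i of the first cumulative frequency F_i that reaches 50% of N."""
    n = sum(f for _, f in table)
    cumulative = 0
    for x, f in table:
        cumulative += f
        if cumulative >= 0.5 * n:
            return x


print(mean(table), mode(table), median(table))
```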


Continuous quantitative variable

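• Using the data of our previous example (weights of 30 students), with the class marks as $x_i$:

$\bar{x} = \dfrac{56.5 \cdot 5 + 65.5 \cdot 7 + 74.5 \cdot 10 + 83.5 \cdot 6 + 92.5 \cdot 2}{30} = 72.4$

Mo = 74.5, or modal interval = [70,79)

Me = 74.5, or median class interval = [70,79)

Because 50% of N = 30 is 0.5 · 30 = 15, the first $F_i$ greater than or equal to 15 is $F_3 = 22$, which corresponds to $x_3 = 74.5$.

| Intervals | $x_i$ | $f_i$ | $F_i$ |
|---|---|---|---|
| [52,61) | 56.5 | 5 | 5 |
| [61,70) | 65.5 | 7 | 12 |
| [70,79) | 74.5 | 10 | 22 |
| [79,88) | 83.5 | 6 | 28 |
| [88,97] | 92.5 | 2 | 30 |

N = 30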


STATISTICAL POSITION PARAMETERS

• QUARTILES: the points that divide the sorted data into four equal parts:

– $Q_1$: first quartile. 25% of the data lie below $Q_1$.

– $Q_2$ = Me: second quartile. 50% of the data lie below $Q_2$ = Me.

– $Q_3$: third quartile. 75% of the data lie below $Q_3$.

• PERCENTILES: $P_k$ is the value below which k% of the data lie.


Discrete quantitative variable

• Using the data of our previous example:

| $x_i$ | $f_i$ | $F_i$ |
|---|---|---|
| 0 | 5 | 5 |
| 1 | 12 | 17 |
| 2 | 6 | 23 |
| 3 | 2 | 25 |

N = 25

• $Q_1$ → 0.25 · 25 = 6.25. The first $F_i$ greater than or equal to 6.25 is $F_2 = 17$, which corresponds to $x_2 = 1$. Hence $Q_1 = 1$.

• $Q_2$ → 0.5 · 25 = 12.5. The first $F_i$ greater than or equal to 12.5 is $F_2 = 17$, which corresponds to $x_2 = 1$. Therefore $Q_2 = 1$ = Me.

• $Q_3$ → 0.75 · 25 = 18.75. The first $F_i$ greater than or equal to 18.75 is $F_3 = 23$, which corresponds to $x_3 = 2$. Then $Q_3 = 2$.

• $P_{95}$ → 0.95 · 25 = 23.75. The first $F_i$ greater than or equal to 23.75 is $F_4 = 25$, which corresponds to $x_4 = 3$. Hence $P_{95} = 3$.
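Quartiles are just the percentiles 25, 50 and 75, so one function covers every case above. The following Python sketch applies the cumulative-frequency rule from this slide; the name `percentile` is an illustrative choice.

```python
# (x_i, f_i) pairs from the frequency table, sorted by x_i.
table = [(0, 5), (1, 12), (2, 6), (3, 2)]


def percentile(table, k):
    """x_i of the first cumulative frequency F_i >= k% of N (basic rule)."""
    n = sum(f for _, f in table)
    target = k / 100 * n
    cumulative = 0
    for x, f in table:
        cumulative += f
        if cumulative >= target:
            return x
    return table[-1][0]   # fallback in case of floating-point round-off near k = 100


q1, q2, q3 = (percentile(table, k) for k in (25, 50, 75))
p95 = percentile(table, 95)
print(q1, q2, q3, p95)
```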


Continuous quantitative variable

• Using the data of our previous example:

• $Q_1$ → 0.25 · 30 = 7.5. The first $F_i$ greater than or equal to 7.5 is $F_2 = 12$, which corresponds to $x_2 = 65.5$. Hence $Q_1 = 65.5$.

• $Q_2$ → 0.5 · 30 = 15. The first $F_i$ greater than or equal to 15 is $F_3 = 22$, which corresponds to $x_3 = 74.5$. Therefore $Q_2$ = Me = 74.5.

• $Q_3$ → 0.75 · 30 = 22.5. The first $F_i$ greater than or equal to 22.5 is $F_4 = 28$, which corresponds to $x_4 = 83.5$. Then $Q_3 = 83.5$.

• $P_{30}$ → 0.30 · 30 = 9. The first $F_i$ greater than or equal to 9 is $F_2 = 12$, which corresponds to $x_2 = 65.5$. Hence $P_{30} = 65.5$.

| Intervals | $x_i$ | $f_i$ | $F_i$ |
|---|---|---|---|
| [52,61) | 56.5 | 5 | 5 |
| [61,70) | 65.5 | 7 | 12 |
| [70,79) | 74.5 | 10 | 22 |
| [79,88) | 83.5 | 6 | 28 |
| [88,97] | 92.5 | 2 | 30 |

N = 30


Box and whisker plot


Discrete quantitative variable

• We know that:

– Minimum value = 0

– $Q_1$ = 1

– $Q_2$ = Me = 1

– $Q_3$ = 2

– Maximum value = 3


Continuous quantitative variable

• We know that:

– Minimum value = 52

– $Q_1$ = 65.5

– $Q_2$ = Me = 74.5

– $Q_3$ = 83.5

– Maximum value = 97


Improving the quartile calculation

Once you know the basics of quartiles, let's look at a special case:

• Calculate the quartiles of:

1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5

The frequency table is:


| $x_i$ | $f_i$ | $F_i$ |
|---|---|---|
| 1 | 3 | 3 |
| 2 | 1 | 4 |
| 3 | 2 | 6 |
| 4 | 3 | 9 |
| 5 | 3 | 12 |

N = 12

If you want to calculate the median:

• $Q_2$ → 0.5 · 12 = 6. The first $F_i$ greater than or equal to 6 is $F_3 = 6$, which corresponds to $x_3 = 3$. Therefore $Q_2 = 3$ = Me.

But this is not accurate: looking at the sorted data, the median should be the average of the two middle values (the 6th and the 7th):

1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5

$Me = \dfrac{3 + 4}{2} = 3.5$

To solve this drawback, we simply calculate the median (or any other quartile) as $\dfrac{x_i + x_{i+1}}{2}$ when we find an $F_i$ exactly equal to 0.5 · N (or k% of N). In all other cases we calculate the quartiles as usual. From now on we will calculate the quartiles as explained in this slide.
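The improved rule can be sketched in Python as follows. This is an illustration, not the author's code: `quantile` is a hypothetical name, k is assumed to be an integer percentage, and `Fraction` is used only so that the "exactly equal" test is not affected by floating-point error.

```python
from fractions import Fraction

# (x_i, f_i) pairs for the data 1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5.
table = [(1, 3), (2, 1), (3, 2), (4, 3), (5, 3)]


def quantile(table, k):
    """Quantile for an integer percentage k, using the improved rule.

    If some F_i equals k% of N exactly, return the average of x_i and x_(i+1);
    otherwise return the x_i of the first F_i greater than k% of N.
    """
    n = sum(f for _, f in table)
    target = Fraction(k, 100) * n          # exact value of k% of N
    cumulative = 0
    for i, (x, f) in enumerate(table):
        cumulative += f
        if cumulative == target and i + 1 < len(table):
            return (x + table[i + 1][0]) / 2
        if cumulative >= target:
            return x
    return table[-1][0]


print(quantile(table, 25), quantile(table, 50), quantile(table, 75))
```

For the brothers-and-sisters table no $F_i$ matches 6.25, 12.5 or 18.75 exactly, so the function falls back to the basic rule there.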


Dispersion (spread) Parameters

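• Range: R = Max. value − Min. value

• Average deviation: $D_{\bar{x}} = \dfrac{f_1 |x_1 - \bar{x}| + f_2 |x_2 - \bar{x}| + \dots + f_n |x_n - \bar{x}|}{N}$

• Variance: $\sigma^2 = \dfrac{f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + \dots + f_n (x_n - \bar{x})^2}{N}$

• Standard deviation: $\sigma = \sqrt{\sigma^2}$

• Coefficient of variation: $CV = \dfrac{\sigma}{\bar{x}}$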


Example

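For the brothers-and-sisters data we know that:

$\bar{x} = 1.2$

• Range: R = 3 − 0 = 3

• Average deviation: $D_{\bar{x}} = \dfrac{5\,|0 - 1.2| + 12\,|1 - 1.2| + 6\,|2 - 1.2| + 2\,|3 - 1.2|}{25} = 0.67$

• Variance: $\sigma^2 = \dfrac{5\,(0 - 1.2)^2 + 12\,(1 - 1.2)^2 + 6\,(2 - 1.2)^2 + 2\,(3 - 1.2)^2}{25} = 0.72$

• Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{0.72} = 0.85$

• Coefficient of variation: $CV = \dfrac{\sigma}{\bar{x}} = \dfrac{0.85}{1.2} = 0.71$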
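The same calculations can be checked with a few lines of Python. This is a minimal sketch with an illustrative function name, `spread_parameters`; it assumes the mean is not zero, since the coefficient of variation divides by it.

```python
import math

# (x_i, f_i) pairs from the brothers-and-sisters frequency table.
table = [(0, 5), (1, 12), (2, 6), (3, 2)]


def spread_parameters(table):
    """Range, average deviation, variance, standard deviation and CV."""
    n = sum(f for _, f in table)
    mean = sum(x * f for x, f in table) / n
    xs = [x for x, _ in table]
    variance = sum(f * (x - mean) ** 2 for x, f in table) / n
    std = math.sqrt(variance)
    return {
        "range": max(xs) - min(xs),
        "average_deviation": sum(f * abs(x - mean) for x, f in table) / n,
        "variance": variance,
        "std": std,
        "cv": std / mean,          # assumes mean != 0
    }


result = spread_parameters(table)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```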


Exercise

• Calculate the spread parameters for the weight data of the previous example, where:

$\bar{x} = 72.4$


Interpreting the spread measures

A: $\bar{x} = 3$, $\sigma = 1.03$, $CV = \dfrac{\sigma}{\bar{x}} = 0.34$

B: $\bar{x} = 3$, $\sigma = 1.68$, $CV = \dfrac{\sigma}{\bar{x}} = 0.56$

C: $\bar{x} = 30$, $\sigma = 16.8$, $CV = \dfrac{\sigma}{\bar{x}} = 0.56$

• A and B have the same mean, but the data dispersion is greater in B because its $\sigma$ is greater (and so is its CV).

• The CV is useful for comparing two sets of data whose units are different.

• In A and B the CV carries the same information as $\sigma$ because the units are the same (1, 2, 3, 4, 5). If we compare B and C, however, $\sigma$ is not useful: it suggests that the data dispersion is greater in C ($\sigma$ is greater), but that is not true because the units are different, so we have to use the CV. In fact, the data spread in B and C is the same (because they have the same CV).
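The point about B and C can be checked numerically: multiplying every value by 10 multiplies both the mean and $\sigma$ by 10, so the CV does not change. The sketch below uses made-up data (the transcript does not include the data behind A, B and C), and the function name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Hypothetical data on a 1-5 scale, and the same data on a 10-50 scale.
b = [1, 1, 2, 3, 4, 5, 5]
c = [10 * x for x in b]

print(pstdev(b), pstdev(c))                                   # sigma grows tenfold
print(coefficient_of_variation(b), coefficient_of_variation(c))  # CV stays the same
```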


Dispersion diagrams


If we have pairs of data $(x_i, y_i)$ and we plot them as points, we obtain a dispersion diagram (also called a scatter plot).

Example:


Correlation

• If the point cloud lies near a straight line, there is linear correlation.

• If it lies near another type of curve, there is correlation, but it is nonlinear.

• If the point cloud is not near any curve, there is no correlation.


Example

• In a laboratory, we give some mice three medicines, A, B and C, in order to cure a disease. We plot the quantity of substance (X axis) against the number of dead mice (Y axis), obtaining one dispersion diagram for each substance:

[Figure: three dispersion diagrams, labelled A, B and C.]

• In A there is positive linear correlation: as X increases, Y tends to increase. This is not a good choice because the medicine kills the mice.

• In B there is negative linear correlation: as X increases, Y decreases. B is a good medicine because it reduces the number of dead mice.

• In C there is no correlation. C has nothing to do with the disease.
