
The Normal distribution

March 2015

QUANTITATIVE PROBLEM SOLVING (MEI)

QUANTITATIVE REASONING (MEI)

LEVEL 3 CERTIFICATE

Topic Exploration Pack

H866/H867

We will inform centres about any changes to the specification. We will also publish changes on our website. The latest version of our specification will always be the one on our website (www.ocr.org.uk) and this may differ from printed versions.

Copyright © 2015 OCR. All rights reserved.

Copyright OCR retains the copyright on all its publications, including the specifications. However, registered centres for OCR are permitted to copy material from this specification booklet for their own internal use.

Oxford Cambridge and RSA Examinations is a Company Limited by Guarantee. Registered in England. Registered company number 3484466.

Registered office: 1 Hills Road Cambridge CB1 2EU

OCR is an exempt charity.

Quantitative Problem Solving (MEI) Topic Exploration Pack Quantitative Reasoning (MEI)

Contents

Introduction

Part A Properties of the Normal curve

Part B Calculation of z-scores

Part C Testing for Normality

Activity 2 Comparing distributions - Teacher Guidance

This Topic Exploration Pack should accompany the OCR resource ‘The Normal Distribution’ learner activities, which you can download from the OCR website.

This activity offers an opportunity for maths skills development.


Introduction

There are three aspects of the Normal distribution that need to be taught in Introduction to Quantitative Reasoning:

Firstly, learners need to recognise and use the basic properties of the Normal distribution. These

are that:

• it forms a bell-shaped curve;

• it is symmetrical about the mean, which equals the mode, although learners should

understand that real data samples are rarely perfectly symmetrical;

• 68% of the data items lie within 1 standard deviation of the mean, 95% lie within 2 standard

deviations and 99.7% lie within 3 standard deviations.

Secondly, learners need to be able to calculate z-scores, understand that a z-score represents a

number of standard deviations from the mean and use z-scores to make comparisons.

Finally, they should be able to interpret a Normal probability plot when testing for Normality using

statistical software. This will only require them to recognise that a straight line plot indicates

Normality.
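The 68%, 95% and 99.7% figures quoted above are rounded values for a perfect Normal curve. As a minimal sketch (the function name `proportion_within` is ours, not part of the OCR materials), they can be checked with the error function from Python's standard library:

```python
import math

def proportion_within(k: float) -> float:
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.1%}")
```

The printed values are slightly different from the rounded figures (for example, a little over 68% for one standard deviation), which is a useful reminder that the rule is an approximation for teaching purposes.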

Prior Knowledge

Students will need to be familiar with:

• Mean, median and mode, which should already have been covered by all students at

GCSE.

• The variability of sample data. The GCSE subject content includes, at both Foundation and

Higher, the limitations of sampling and the fact that empirical, unbiased samples tend

towards theoretical probability distributions with increasing sample size.

• Constructing and interpreting frequency charts from grouped and continuous data. All

students should already be able to interpret, analyse and compare the distributions of data

sets through appropriate graphical representation involving discrete, continuous and grouped

data. However, only those who study for Higher GCSE are required to construct frequency

charts from grouped continuous data.

• The difference between discrete and continuous variables. It is worth discussing the fact that

all recorded data is discrete because we have to measure to a given level of accuracy.

• The concept of standard deviation as a measure of spread, which is not covered at GCSE

and therefore needs to be taught.


Teaching points/Misconceptions

Many measurements, such as height and weight, approximately follow a Normal distribution and so

students do not generally find it difficult to recognise the shape of the Normal distribution and

identify where it might occur; it falls within their experience that most measurements will be

towards the middle and a few will be at the extremes.

Students are not required to calculate standard deviation in the examination. However, in order to

understand what standard deviation is, it would be useful for them to know how it is calculated. Its

calculation as the (square root of the) average of the squared differences from the mean is not

difficult to introduce and will show students that it is a fairly natural and transparent way to measure

spread.
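The calculation described above can be sketched in a few lines of Python. The data set is made up purely for illustration, and the variable names are ours. The sketch divides by the number of data items, as in the description above; some calculators and spreadsheets also offer a version that divides by one less than the number of items.

```python
import math

# Made-up sample for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
deviations = [x - mean for x in data]      # how far each item is from the mean
total_deviation = sum(deviations)          # the signed differences cancel out
squared = [d ** 2 for d in deviations]     # squaring removes the signs
variance = sum(squared) / len(data)        # mean of the squared differences
standard_deviation = math.sqrt(variance)   # back in the original units

print(mean, total_deviation, variance, standard_deviation)
# prints: 5.0 0.0 4.0 2.0
```

Printing `total_deviation` shows students directly why the plain differences cannot be averaged: they always sum to zero.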

Some students may believe that all z-scores lie between −3 and 3. This view can be discouraged

by drawing sketches showing that values fall outside this range. Others may believe that z-scores

can take any values, failing to consider that most real distributions will have natural limitations - for

example, length cannot be negative and test scores cannot be greater than 100%.


Part A Properties of the Normal curve

At the start of the topic, the idea of distribution will need to be revised. This could be done by using

the Standards Units card sets S5 and S6:

• S5 Card Set A; ask students in pairs to group the charts in as many ways as they can (they

might group by symmetry, skew, averages, range, total frequency or others they think of).

• S6 Cards sets A and B; ask students to match the frequency graphs with the statements.

Alternatively, students could generate data and compare the resulting distributions.

For example:

• Roll one six-sided dice 50 times and record the scores (rectangular).

• Roll a single dice until they get a six and count the number of rolls it takes - ask

several pairs to do this and collate the data as it takes a while (geometric).

• Roll 2 dice 50 times and record the sum each time (triangular; the Normal shape only starts to emerge as more dice are summed).

• Roll ten six-sided dice 30 times and record the number of 2s (binomial).

Many students will find the geometric distribution in example 2 very counter-intuitive; teachers

may wish to allow time to discuss this in more detail and explore the idea that probabilities can

often lead to unexpected outcomes.
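If dice are in short supply, the four experiments above can be simulated. The following is a rough sketch using only Python's standard library; the seed, function names and text-based charts are our own choices rather than anything specified in the OCR materials.

```python
import random
from collections import Counter

rng = random.Random(2015)  # fixed seed so that a run can be repeated

def roll() -> int:
    """Score on one fair six-sided dice."""
    return rng.randint(1, 6)

def rolls_until_six() -> int:
    """Number of rolls needed to get the first six."""
    count = 1
    while roll() != 6:
        count += 1
    return count

single = Counter(roll() for _ in range(50))
waits = Counter(rolls_until_six() for _ in range(50))
sums = Counter(roll() + roll() for _ in range(50))
twos = Counter(sum(1 for _ in range(10) if roll() == 2) for _ in range(30))

def show(title: str, counts: Counter) -> None:
    """Print a simple text frequency chart."""
    print(title)
    for score in sorted(counts):
        print(f"{score:>3} {'#' * counts[score]}")

show("One dice, 50 rolls", single)
show("Rolls until a six, 50 trials", waits)
show("Sum of two dice, 50 rolls", sums)
show("Number of 2s among ten dice, 30 trials", twos)
```

Changing the seed, or removing it, gives a different sample each time, which supports the discussion of how real samples vary around the theoretical shape.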

The accompanying Excel workbook ‘The Normal distribution’ (worksheet ‘Distributions’) will create

these graphs for you. Enter the frequencies for an experiment in one of the blue columns.

Or, if you have data you can use, you could compare real distributions that might interest your

students.


Students should understand that standard deviation is the measure of spread most commonly

used by statisticians because it:

• relates to the mean,

• can be manipulated algebraically,

• takes into account all the data,

• is not too sensitive to outlying values.

They do not need to know the formulae or to calculate standard deviation but it will help them to

understand if the method is explained. This could be done by generating some class data (for

example, ask them to measure their pulse rates over 1 minute).

Then ask the following questions, getting the class to do the calculations with their data (in small

groups of 6-8 if possible):

(a) We want to find out how far the data is, on average, from the mean. Can you suggest a good way to start?

Answer: Subtract the mean from each item of data to find how far each measurement is from the mean.

(b) We want to find the average difference - how could we do that?

Answer: Find the mean difference by adding up the differences and dividing by the number of data items. (Students will find that the total is always zero - explore why.)

(c) What could we do to avoid the total being zero?

Answer: Either ignore the negative signs or square the differences.

(d) Which method would be better?

Answer: This is genuinely debatable, but the absolute values are not generally used because, algebraically, the modulus sign is difficult to manipulate. Standard deviation is the name of the measure which uses the squared values and which is most commonly used.

(e) Finish by explaining that the mean of the squared differences is called the variance and that the square root of this value is called the standard deviation. We take the square root because the standard deviation needs to be in the same units as the original data, not in squared units, which is what the variance is measured in (because we squared the differences).

The Normal distribution could be introduced by showing the Normal distribution in fish populations presented by Marcus du Sautoy.

Students can then do Activity 1, which involves them collecting data that is approximately Normally distributed and finding out what proportion of the data lies within 1, 2 and 3 standard deviations of the mean. There are many sets of data you could get your class to collect such as heights, estimates, or length of time breath can be held. However, they will need about 100 results to do this activity properly so they need to collect a large number of measurements quickly. Some practical ideas suggested by the Centre for Innovation in Mathematics Teaching are:

1. Lengths of leaves. Evergreen bushes such as laurel are useful - though make sure all the leaves are from the same year's growth.

2. Weights of crisp packets. Borrow a box of crisps from a canteen and weigh each packet accurately on a balance from the science laboratory.

3. Pieces of string. Look at 10cm on a ruler and then take a ball of string and try to cut 100 lengths of 10cm by guessing. Measure the lengths of all the pieces in mm.

4. Weights of apples. If anyone has apple trees in their garden they are bound to have large quantities in the autumn.


5. Size of pebbles on a beach. Geographers often look at these to study the movement of

beaches. Use a pair of callipers then measure on a ruler.

6. Game of bowls. Make a line with a piece of rope on the grass about 20 metres away. Let

everyone have several goes at trying to land a tennis ball on the line. Measure how far each

ball is from the line.

Students could work in groups, or as a whole class. The main teaching points are that:

• the graph forms a bell-shaped curve,

• the graph is symmetrical about the mean, and the mean and mode are roughly the same,

• roughly 68% of the data items lie within 1 standard deviation of the mean, 95% lie within 2

standard deviations and 99.7% lie within 3 standard deviations,

• real data samples are rarely perfectly symmetrical and the percentages from even a large

sample may not be exact,

• there are tests which statisticians use to check whether a set of data is close enough to

assume that it is Normal.

Students could use the Excel workbook ‘The Normal distribution’ (worksheet ‘Activity 1 Data’) to

plot their data and to calculate the mean and standard deviation. If you use the spreadsheet, enter

the raw data in the first green column. Then decide on some (equal) class intervals and enter the

upper bounds in the second green column. You may prefer them to group and plot the data by

hand, use a scientific calculator to find the mean and standard deviation, or set up their own

spreadsheet to plot their data and calculate the values.
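For classes that prefer code to a spreadsheet, the check at the heart of Activity 1 might look something like the sketch below. The function name `proportions_within` is ours, and the 100 "leaf lengths" are simulated stand-ins, not real measurements; students would substitute their own data.

```python
import random
import statistics

def proportions_within(data, max_k=3):
    """Proportion of the data lying within 1, 2, ..., max_k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # divides by the number of items
    return {
        k: sum(1 for x in data if abs(x - mean) <= k * sd) / len(data)
        for k in range(1, max_k + 1)
    }

# Simulated stand-in for 100 leaf lengths in mm (an assumption for illustration).
rng = random.Random(1)
sample = [rng.gauss(100, 5) for _ in range(100)]

for k, proportion in proportions_within(sample).items():
    print(f"within {k} standard deviation(s): {proportion:.0%}")
```

With only 100 values the proportions will usually be close to, but not exactly, 68%, 95% and 99.7%, which matches the teaching point that real samples are rarely perfect.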


Part B Calculation of z-scores

You could now watch the video clip Against All Odds: Normal Calculations which uses the context of a club for tall people to compare the entry requirements for men and women by calculating z-scores. The suggested sections to use are:

00:21 to 04:00 revises the Normal distribution and introduces the Beanstalks. At 04:00 you could stop and discuss with the group how many standard deviations away from the mean 5’10” is and develop the idea of a z-score being $\frac{\text{value} - \text{mean}}{\text{standard deviation}}$, often written $\frac{x - \mu}{\sigma}$.

06:16 to 09:00 explains the calculation of z-scores. You may wish to stop the clip at 09:00 and

explain the use of tables to find exact percentages before continuing with the video. Students will

not be examined on this, but it is difficult to interpret z-scores without understanding that exact

percentages can be found for any z-score.

09:00 to 10:41 explains the use of z-scores to compare values from different distributions and

interprets the values for the male and female Beanstalks.
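To illustrate how z-scores allow values from different distributions to be compared, here is a short sketch. The heights, means and standard deviations are hypothetical figures in inches chosen for illustration; they are not taken from the video. `NormalDist` from Python's standard library plays the role of the tables mentioned above.

```python
from statistics import NormalDist

def z_score(value: float, mean: float, sd: float) -> float:
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical figures in inches (assumptions, not data from the clip).
men_z = z_score(74, mean=70, sd=3)
women_z = z_score(70, mean=64.5, sd=2.5)

for label, z in (("men", men_z), ("women", women_z)):
    below = NormalDist().cdf(z)  # proportion of a Normal distribution below this z-score
    print(f"{label}: z = {z:.2f}, about {below:.1%} of the distribution lies below")
```

With these made-up figures the women's entry height has the larger z-score, so it is the more demanding requirement even though the height itself is lower.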

Students could then do Activity 2 (see Activity 2 - Teacher Guidance below and Excel workbook

‘The Normal distribution’ (worksheet ‘Activity 2 Dice’)). Given two dice games, one where you

score the sum of three dice and one where you score the sum of 5 dice, in which game are you

more likely to score 10 or more? This activity requires the collection of discrete data but the

distributions will approximate Normal distributions.
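For teachers who want the exact theoretical values behind this activity, the distribution of the total on n fair dice can be built up one dice at a time. This is a sketch with function names of our own choosing; it is not part of the accompanying workbook.

```python
import math
from collections import Counter

def sum_distribution(n_dice: int) -> dict:
    """Exact probabilities for the total shown by n fair six-sided dice."""
    dist = {0: 1.0}
    for _ in range(n_dice):
        nxt = Counter()
        for total, p in dist.items():
            for face in range(1, 7):
                nxt[total + face] += p / 6
        dist = nxt
    return dict(dist)

def summarise(n_dice: int, threshold: int = 10):
    """Mean, standard deviation, z-score of the threshold, and P(total >= threshold)."""
    dist = sum_distribution(n_dice)
    mean = sum(t * p for t, p in dist.items())
    sd = math.sqrt(sum((t - mean) ** 2 * p for t, p in dist.items()))
    z = (threshold - mean) / sd
    p_at_least = sum(p for t, p in dist.items() if t >= threshold)
    return mean, sd, z, p_at_least

for n in (3, 5):
    mean, sd, z, p = summarise(n)
    print(f"{n} dice: mean {mean:.1f}, sd {sd:.2f}, z for 10 = {z:.2f}, P(10 or more) = {p:.3f}")
```

The exact probabilities give a ratio of roughly 1.6, which sits within the "one and a half to two times" estimate reached with z-scores in the Activity 2 teacher guidance.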


Part C Testing for Normality

The data collected in Activities 1 and 2 may have formed a symmetrical, bell-shaped curve when plotted or they may not: when we take even a large random sample, there is always a chance that our sample is not itself distributed Normally, even if the population is. Students may have questioned whether or not their frequency graphs were Normal and this is a very good question to ask.

Statisticians often need to test whether their data is Normal in order to decide whether the

assumptions they are making are correct and it is not always possible to do this by plotting the

sample data. They have therefore developed statistical methods and software to check data for

Normality. There are several ways in which this can be done and students will not be examined on

these methods. What they need to do is to be able to interpret a Normal probability plot. The

simple rule is that, if the plot is close to a straight line, the data may be considered to be Normally

distributed. If the plot does not resemble a straight line, then the data cannot be considered to be

Normally distributed. This is because the probability plots essentially plot observed probabilities

against those expected if the distribution were Normal, so a perfect straight line would indicate that

all the data was exactly as expected.
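The idea behind a probability plot can be sketched without specialist software. Note the assumptions: this version plots the sorted data against the values expected from a Normal distribution (a quantile-based variant, rather than plotting probabilities), it uses (i − 0.5)/n as the plotting position, and it uses a correlation coefficient as a rough measure of how straight the plot is. The function names and simulated samples are ours.

```python
import math
import random
from statistics import NormalDist, fmean

def probability_plot_points(data):
    """Pairs of (expected standard Normal value, observed value) for the sorted data."""
    n = len(data)
    ordered = sorted(data)
    expected = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, ordered))

def straightness(points) -> float:
    """Correlation coefficient of the plotted points; values near 1 suggest a straight line."""
    xs, ys = zip(*points)
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Simulated samples for illustration only.
rng = random.Random(3)
normal_sample = [rng.gauss(50, 10) for _ in range(200)]
skewed_sample = [rng.expovariate(1 / 10) for _ in range(200)]

print("Normal sample:", round(straightness(probability_plot_points(normal_sample)), 3))
print("Skewed sample:", round(straightness(probability_plot_points(skewed_sample)), 3))
```

The points returned by `probability_plot_points` can be pasted into a spreadsheet and plotted as a scatter graph; the Normal sample should lie close to a straight line while the skewed sample curves away from it.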

[Figure: two Normal probability plots, one labelled "Normally distributed" and one labelled "Not Normally distributed"]

To put this in context, you could show the clip Against All Odds: Checking Assumption of Normality, which is about egg sizes and hen weights on a chicken farm. The clip refers to box-plots, which students should be familiar with. It also refers to ‘bin-sizes’, which we usually call ‘class-intervals’.


Students could then undertake Activity 3, using the Excel workbook ‘The Normal distribution’

(worksheet ‘Plots’) to check whether their data from Activities 1 and 2, or any other data they wish

to collect or make up, are distributed Normally.

Activity 2 Comparing distributions - Teacher Guidance

Initially, you could just pose the problem: “Given two dice games, one where you score the sum of

three dice and one where you score the sum of 5 dice, in which game are you more likely to score

10 or more? How much more likely?” It might be fairly obvious that you are more likely to get 10 or

more with 5 dice but how much more likely is not so easy to answer. Twice as likely? Three times?

The group might suggest rolling some dice and seeing what happens, which is a good place to

start.

1. Split the class into two, one group rolling three fair six-sided dice and the other group rolling

5 of the same dice. Each group will need to roll their dice at least 100 times so it would be

best, if you have enough dice, to have more than one sub-group doing each. Agree with

them how they are going to collect the data (tally chart including all possible scores).

2. See which group had more scores that were 10 or higher, but also enter the data for each

group into the spreadsheet and calculate the mean and standard deviation for each group.

Plot the data using the spreadsheet and compare the distributions. They should approximate

those below:

3. Discuss the difference between the theoretical distributions and the real data. Explain that we

are going to use the real data to estimate the mean and standard deviation of the theoretical

distributions because we don’t know how to work them out exactly (this is not difficult, but is beyond the scope of Core Maths; it could be set as a challenge for small n).

4. Ask the class to find how many standard deviations away from the mean of each distribution 10 is (that is, find the z-scores for the value 10). The theoretical z-scores are −0.17 for three dice and −1.96 for five dice.

5. Taking the proportions of scores above these z-scores as roughly 55% for three dice and 97.5% for five dice, $\frac{97.5}{55} \approx 1.77$, so a score of 10 or more is one and a half to two times as likely with 5 dice as with 3.

Is this borne out by their experimental results?

OCR Resources: the small print

OCR’s resources are provided to support the teaching of OCR specifications, but in no way constitute an endorsed teaching method that is required by the Board, and the decision to use them lies with the individual teacher. Whilst every effort is made to ensure the accuracy of the content, OCR cannot be held responsible for any errors or omissions within these resources. We update our resources on a regular basis, so please check the OCR website to ensure you have the most up to date version.

© OCR 2015 - This resource may be freely copied and distributed, as long as the OCR logo and this message remain intact and OCR is acknowledged as the originator of this work.

OCR acknowledges the use of the following content: Thumbs up and down icons: alexwhite/Shutterstock.com

Please get in touch if you want to discuss the accessibility of resources we offer to support delivery of our qualifications: [email protected]


OCR customer contact centre, General qualifications. Telephone 01223 553998. Facsimile 01223 552627. Email [email protected]