Analysis

Upload: minty

Post on 22-Feb-2016



TRANSCRIPT

Page 1: Analysis

Analysis

Page 2: Analysis

Start by describing the features you see in the data.

Page 3: Analysis

Starting point

Overall visual non-numerical comparisons:

• Overlap

• Shift

• Unusual features

Page 4: Analysis

Starting point

Overlap: I notice there is a lot of overlap between the boys’ and girls’ foot lengths.

Page 5: Analysis

Starting point

Shift: The boys’ foot lengths are shifted further up the scale than the girls’ foot lengths.

Page 6: Analysis

Starting point

Unusual features: There is an unusually low foot length of 15cm in the girls’ data. I suspect that this value is a mistake, as it seems too low in comparison with the other data for girls.

OR: One of the girls has a recorded foot length far shorter than any other girl.

Page 7: Analysis

After the initial overall visual non-numerical comparisons:

• Make more detailed comparative descriptions of the features including use of summary statistics and specific observation values where appropriate.

• Reflect and perhaps comment on some of the features using “I wonder . . .” and “I expect . . .” type statements, i.e., comment on any inferential thoughts.

Page 8: Analysis

Comparative descriptions of the features, including use of summary statistics

The median foot length of the boys (25cm) is 3cm longer than the median foot length of the girls (22cm).

The mean foot length of the boys (25.5cm) is 2.1cm longer than the mean foot length of the girls (23.4cm).

Page 9: Analysis

Comparative descriptions of the features, including use of summary statistics

The range of foot lengths for the boys (9cm) is the same as the range of foot lengths for the girls (9cm) if we ignore the unusual value.

Also the interquartile range for the foot lengths for the boys (3cm) is the same as for the girls (3cm).

Page 10: Analysis

Comparative descriptions of the features, including use of summary statistics

The most common foot length for the boys was 25cm, but for the girls it was 22cm and 23cm.

In all of these comparisons, the boys’ foot lengths seem to be about 2 to 3cm longer than the girls’.

Page 11: Analysis

Comparative descriptions of the features, including use of summary statistics

The median foot length for the boys is the same as the upper quartile (UQ) for the girls (25cm).
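The comparisons of centre on these slides can be sketched in a few lines of Python. This is only an illustration: the two lists are hypothetical foot lengths made up for the example (the slides do not give the raw data), and `centre_differences` is a helper name invented here.

```python
from statistics import mean, median

# Hypothetical right foot lengths in cm (illustrative only, NOT the slides' data).
boys_cm = [23, 24, 24, 25, 25, 25, 26, 27, 27, 28]
girls_cm = [21, 21, 22, 22, 22, 23, 23, 24, 25, 26]


def centre_differences(group_a, group_b):
    """Return (difference in medians, difference in means) as group_a minus group_b."""
    return median(group_a) - median(group_b), mean(group_a) - mean(group_b)


median_diff, mean_diff = centre_differences(boys_cm, girls_cm)
print(f"The boys' median is {median_diff}cm longer than the girls' median.")
print(f"The boys' mean is {mean_diff:.1f}cm longer than the girls' mean.")
```

Each printed sentence follows the pattern used on the slides: state both summary values and the size of the difference, in context.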

Page 12: Analysis

Make comparisons

• Between the groups (e.g., overlap, shift, spread and shape statements)

• Within each group (e.g., unusual observations)

Page 13: Analysis

Overlap

Be aware of sampling variation:

• Sampling alone can produce shifts.

• These shifts are small in large samples.

• They can be large in small samples.
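A small simulation can make the point about sampling variation concrete. The sketch below draws two samples from the same population, so any shift between their medians is produced by sampling alone. The population (normal, mean 24cm, standard deviation 2cm), the sample sizes and the function names are all assumptions chosen for illustration, not values from the slides.

```python
import random
from statistics import median, pstdev


def median_shift(rng, n):
    """Difference in medians of two samples of size n drawn from the SAME population."""
    # Assumed population: normal with mean 24cm and sd 2cm (hypothetical).
    sample_a = [rng.gauss(24, 2) for _ in range(n)]
    sample_b = [rng.gauss(24, 2) for _ in range(n)]
    return median(sample_a) - median(sample_b)


def typical_shift(n, repeats=200, seed=1):
    """Standard deviation of the median shift over many repeated pairs of samples."""
    rng = random.Random(seed)
    return pstdev(median_shift(rng, n) for _ in range(repeats))


small_sample_shift = typical_shift(10)
large_sample_shift = typical_shift(1000)
print(f"Typical shift with samples of 10:   {small_sample_shift:.2f}cm")
print(f"Typical shift with samples of 1000: {large_sample_shift:.2f}cm")
```

Because both samples come from one population, the true shift is zero. The typical size of the observed shift should be noticeably larger for the samples of 10 than for the samples of 1000.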

Page 14: Analysis

Overlap

There is some overlap of the boxes, but the median of the girls’ foot lengths is outside the boys’ box, and the median of the boys’ foot lengths is the same as the UQ of the girls’ foot lengths.

Page 15: Analysis

Overlap

OR There is some overlap for the middle 50% of the boys’ right foot lengths and the middle 50% of the girls’ right foot lengths.

Page 16: Analysis

Shift

The boys’ values are shifted to the right of the girls’ values for the minimum, LQ, median, UQ and maximum foot lengths.

Page 17: Analysis

Shift

The middle 50% of the boys’ foot lengths (the box) is shifted much further along the scale than the middle 50% of the girls’ foot lengths.

Page 18: Analysis

Spread

The spread of the boys’ foot lengths is the same as the spread of the girls’ foot lengths, i.e., the range is 9cm in both cases and the IQR is 3cm for both.

Page 19: Analysis

Spread

The middle 50% of the boys have a right foot measuring between 24cm and 27cm (IQR = 3cm), whereas the middle 50% of the girls are between 22cm and 25cm (IQR = 3cm). This means that the foot lengths for these boys vary by about the same amount as these girls’ do.

Page 20: Analysis

Spread

I expect that the boys’ and girls’ foot length distributions back in the two populations have similar variability.

Page 21: Analysis

Note:

The range should not be used, as it tends to be an unstable estimate of the population spread. The range is highly likely to vary greatly from sample to sample for samples of these sizes. The range is also prone to be severely affected by the occasional extreme observation. This is why we use other, more resistant measures of spread such as the IQR. The IQR is not disturbed by the presence of a few very large or very small observations.
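The difference in resistance between the range and the IQR can be checked with a short sketch. The data below are hypothetical values invented for the example, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one common classroom convention and an assumption here (other quartile methods can give slightly different values).

```python
from statistics import median


def quartiles(values):
    """Lower and upper quartiles as medians of the lower and upper halves.

    Expects at least two values; with an odd count the middle value is left out.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range: upper quartile minus lower quartile."""
    lq, uq = quartiles(values)
    return uq - lq


# Hypothetical girls' foot lengths in cm (illustrative only).
girls_cm = [21, 22, 22, 22, 22, 23, 23, 24, 25, 25, 25, 27]
with_extreme = girls_cm + [15]  # add one unusually low value

print("Range:", value_range(girls_cm), "->", value_range(with_extreme))
print("IQR:  ", iqr(girls_cm), "->", iqr(with_extreme))
```

For this made-up data, the single extreme value doubles the range (6cm to 12cm) while the IQR stays at 3cm.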

Page 22: Analysis

From the dot plot: Some of the boys have bigger right foot lengths than some of the girls, and vice versa.

Page 23: Analysis

Shape

The shape of the distributions is not clear from the dot plots, but each appears to be unimodal, as would be expected, and maybe slightly skewed to the right, as indicated by the box plots.

To get a more accurate view, we would need to increase the sample size.

Page 24: Analysis

Shape

OR: The sample distribution for the boys’ foot lengths is roughly symmetrical with a mound around 24 to 27cm, i.e., unimodal.

The sample distribution for the girls’ foot lengths shows a large mound around 22 to 24cm.

Page 25: Analysis

Shape

I wonder if boys’ and girls’ foot length distributions back in the two populations are roughly symmetric and unimodal. I expect so for a body measurement such as foot length for both girls and boys.

Page 26: Analysis

Unusual value

I notice one of the girls has a foot length (15cm) far smaller than any other girl. I worry that this may be a mistake. It could be a measurement or recording mistake, or perhaps this girl is much younger than 13 years. I wouldn’t expect a 13 year-old girl to have a foot size this small. I need to check her other measurements such as age, height etc. to further investigate this extreme value.

Page 27: Analysis

Gaps and clusters

I notice the dots are stacked on whole numbers. This is because the foot lengths are measured to the nearest cm.

Page 28: Analysis

Gaps and clusters

There is a gap in the girls’ group at 28cm and gaps in the boys’ group at 22cm and 29cm.

Page 29: Analysis

Gaps and clusters

Boys’ and girls’ foot length distributions back in the two populations would not have gaps at these same values. The gaps appear in the sample because the sample is small.

Page 30: Analysis

Sampling

If a new random sample of 24 13-year-old boys and a new random sample of 22 13-year-old girls were taken, I would expect the plots to look different because of sampling variability. With these sample sizes, I would expect each IQR spread to change slightly and each box to be slightly further down or up the scale.

Page 31: Analysis

I wonder:

• if, when I repeated this sampling process many times, the boys’ foot lengths would, just about always, be shifted further up the scale than the girls’

• if boys tend to have a greater foot length than girls back in the two populations

• if the median foot length of boys really is greater than that of girls back in the two populations

Page 32: Analysis

Conclusion

I notice that the distance between the medians is greater than 1/3 of the “overall visible spread”.

Page 33: Analysis

Conclusion

I am going to claim that the right foot lengths of 13 year-old New Zealand boys tend to be longer than the right foot lengths of 13 year-old New Zealand girls back in the two populations. I am prepared to make this call because, in my data, the distance between the boys’ and the girls’ median foot lengths is big relative to the overall visible spread. To make this call, with sample sizes of around 30, the difference between the two foot length medians needs to be more than about 1/3 of the overall visible spread. This is true for my data.
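The 1/3 check can be written out as a short sketch using the summary values quoted in this presentation (medians of 25cm and 22cm, boxes of 24cm to 27cm and 22cm to 25cm). The slides do not define “overall visible spread”; the code assumes it means the distance from the lowest lower quartile to the highest upper quartile of the two boxes. The function names are made up for the example.

```python
def overall_visible_spread(box_a, box_b):
    """Distance from the lowest LQ to the highest UQ of two (LQ, UQ) boxes.

    This definition is an assumption; the slides do not spell it out.
    """
    return max(box_a[1], box_b[1]) - min(box_a[0], box_b[0])


def can_make_call(median_a, median_b, box_a, box_b, fraction=1 / 3):
    """True if the distance between medians exceeds `fraction` of the spread."""
    distance_between_medians = abs(median_a - median_b)
    spread = overall_visible_spread(box_a, box_b)
    return distance_between_medians > fraction * spread


# Summary values quoted in the slides (cm).
boys_box, boys_median = (24, 27), 25
girls_box, girls_median = (22, 25), 22

spread = overall_visible_spread(boys_box, girls_box)  # 27 - 22 = 5
call = can_make_call(boys_median, girls_median, boys_box, girls_box)
print(f"Distance between medians: {abs(boys_median - girls_median)}cm")
print(f"Overall visible spread:   {spread}cm")
print(f"Make the call?            {call}")  # 3 > 5/3, so True
```

Under that assumed definition, the distance between the medians (3cm) is well over one third of the overall visible spread (5cm), which agrees with the call made on the slide. The `fraction` parameter is left adjustable because the 1/3 guideline is stated only for sample sizes of around 30.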

Page 34: Analysis

Conclusion

I don’t believe that the pattern in my data of the boys tending to have longer foot lengths than the girls is just due to who happened to be randomly selected in the girls’ group and who happened to be randomly selected in the boys’ group, i.e., I don’t believe this data pattern has just happened by chance. I am prepared to claim that this pattern in the data is real, i.e., that this pattern persists back in the two populations.

Page 35: Analysis

Notes:

We use ‘… right foot length …’ because the investigative question asks about the right foot length.

Page 36: Analysis

Notes:

When using statistics, there is always the possibility that the calls (decisions) that we make are wrong, i.e., we are making calls in the face of uncertainty. For example, we want to make a call on who tends to be taller (back in the two populations), 13 year-old boys or 13 year-old girls. We may make the call that it’s 13 year-old boys when in fact it’s girls who tend to be taller. Or, we may not want to make a call even though boys tend to be taller than girls.

Page 37: Analysis

Explanatory

I expected that boys tend to have bigger feet than girls back in the populations, and the information I collected (my data) supports this belief. I can’t think of any factor other than gender which can explain the difference in foot size.

Page 38: Analysis

Notes:

In this explanatory element we ask ourselves if our conclusion makes sense with what we know, i.e., whether our contextual knowledge matches our conclusions. We must try to think of other factors which may lead to alternative explanations when measuring foot lengths. These suggestions should also be present in the conclusion.