state9 project work statistics pdf may 24 2012-5-25 pm 1 8 meg evozi 2


SMK. ST. FRANCIS CONVENT (M), 88000 KOTA KINABALU, SABAH

ADDITIONAL MATHEMATICS PROJECT WORK 2013
STATISTICS OF STUDENTS’ SCORES IN AN EXAMINATION

NAME : Noor Shazlien bt. M. Jamal
CLASS : 5 Pure Science 1
I/C NUMBER :
EXAM REGISTRATION NO :

Contents

1. PART 1: Importance of data analysis in daily life; the types of measure of central tendency and of measure of dispersion
2. PART 2: Raw data; frequency distribution table; measures of central tendency: i) Mean, ii) Mode, iii) Median
3. PART 3: Measure of dispersion: interquartile range
4. PART 4: Problem solving
5. REFLECTION
6. REFERENCES


PART 1

Importance of data analysis in daily life

Data analysis is a process used to transform, remodel and revise certain information (data) with a view to reaching a conclusion for a given situation or problem. It can be carried out by different methods according to needs and requirements.

For example, a school principal may want to know whether there is a relationship between students’ performance on the district writing assessment and their socioeconomic levels. In other words, do students who come from lower socioeconomic backgrounds perform lower, as we are led to believe? Or are there other variables responsible for the variance in writing performance? A simple correlation analysis will help describe the students’ performance and help explain the relationship between performance and socioeconomic level.

Analysis does not have to involve complex statistics. Data analysis in schools involves collecting data and using that data to improve teaching and learning. Interestingly, principals and teachers have it pretty easy. In most cases, the collection of data has already been done. Schools regularly collect attendance data, transcript records, discipline referrals, quarterly or semester grades, norm- and criterion-referenced test scores, and a variety of other useful data. Rather than complex statistical formulas and tests, it is generally simple counts, averages, percents, and rates that educators are interested in.

There are many benefits of data analysis; the most important ones are as follows. Data analysis helps in structuring the findings from different sources of data collection, such as survey research. It is also very helpful in breaking a macro problem into micro parts. Data analysis acts like a filter when it comes to extracting meaningful insights from a huge data set. Every researcher has to sort through the huge pile of data that he or she has collected before reaching a conclusion on the research question. Mere data collection is of no use to the researcher, so data analysis proves to be crucial in this process. It provides a meaningful base for critical decisions and helps to create a complete dissertation proposal.

One of the most important uses of data analysis is that it helps to keep human bias away from the research conclusion through proper statistical treatment. With the help of data analysis, a researcher can filter both qualitative and quantitative data for an assignment or writing project. Thus, it can be said that data analysis is of utmost importance for both the research and the researcher. To put it another way, data analysis is as important to a researcher as it is for a doctor to diagnose the patient's problem before giving any treatment.


The types of Measure of Central Tendency and of Measure of Dispersion

Central tendency gets at the typical score on the variable, while dispersion gets at how much variety there is in the scores. When describing the scores on a single variable, it is customary to report on both the central tendency and the dispersion. Not all measures of central tendency and not all measures of dispersion can be used to describe the values of cases on every variable. What choices you have depend on the variable’s level of measurement.

Mean
The mean is what in everyday conversation is called the average. It is calculated by simply adding the values of all the valid cases together and dividing by the number of valid cases.

$$\bar{x} = \frac{\sum x}{N} \quad \text{or} \quad \bar{x} = \frac{\sum fx}{\sum f}$$

The mean is an interval/ratio measure of central tendency. Its calculation requires that the attributes of the variable represent a numeric scale.

Mode
The mode is the attribute of a variable that occurs most often in the data set.

For grouped data, we can estimate the mode by finding the modal class and drawing a histogram of the modal class and the two classes adjacent to it. Two lines are drawn diagonally across the modal bar, each joining one of its top corners to the nearest top corner of the adjacent bar on the opposite side. The value on the horizontal axis at which the two lines intersect is the mode.

Median
The median is a measure of central tendency. It identifies the value of the middle case when the cases have been placed in order or in line from low to high. The middle of the line is as far from being extreme as you can get.

$$m = L + \left(\frac{\frac{N}{2} - F}{f_m}\right) C$$

There are as many cases in line in front of the middle case as behind the middle case. The median is the attribute used by that middle case. When you know the value of the median, you know that at least half the cases had that value or a higher value, while at least half the cases had that value or a lower value.

Range
The distance between the minimum and the maximum is called the range. The larger the value of the range, the more dispersed the cases are on the variable; the smaller the value of the range, the less dispersed (the more concentrated) the cases are on the variable.

Range = maximum value – minimum value

Interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses the middle 50%, the IQR is not affected by outliers or extreme values.

$$Q_1 = L + \left(\frac{\frac{N}{4} - F}{f}\right) C$$

$$Q_3 = L + \left(\frac{\frac{3N}{4} - F}{f}\right) C$$

$$\text{IQR} = Q_3 - Q_1$$

Standard Deviation
The standard deviation tells you the approximate average distance of cases from the mean. This is easier to comprehend than the squared distance of cases from the mean. The standard deviation is directly related to the variance.

If you know the value of the variance, you can easily figure out the value of the standard deviation. The reverse is also true. If you know the value of the standard deviation, you can easily calculate the value of the variance. The standard deviation is the square root of the variance:

$$\sigma = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}$$

PART 2

1. March Additional Mathematics test scores for 30 students.

Student  Marks    Student  Marks    Student  Marks
1        55       11       78       21       87
2        60       12       80       22       88
3        63       13       81       23       88
4        65       14       82       24       89
5        73       15       83       25       90
6        74       16       84       26       90
7        75       17       84       27       90
8        75       18       86       28       91
9        76       19       86       29       93
10       77       20       86       30       95

2. Frequency distribution table:

Marks     Tally      Frequency
55 - 59   |          1
60 - 64   ||         2
65 - 69   |          1
70 - 74   ||         2
75 - 79   |||||      5
80 - 84   ||||| |    6
85 - 89   ||||| ||   7
90 - 94   |||||      5
95 - 99   |          1
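The tallying step can also be checked programmatically. The following is a minimal sketch, assuming the 30 raw marks listed in this part and classes of width 5 starting at 55; the names `class_lower_limit` and `frequency_table` are illustrative and not part of the original project.

```python
from collections import Counter

# Raw marks of the 30 students, as listed in the raw data table.
marks = [55, 60, 63, 65, 73, 74, 75, 75, 76, 77,
         78, 80, 81, 82, 83, 84, 84, 86, 86, 86,
         87, 88, 88, 89, 90, 90, 90, 91, 93, 95]

CLASS_WIDTH = 5         # each class covers 5 marks, e.g. 55 - 59
FIRST_LOWER_LIMIT = 55  # lower limit of the first class

def class_lower_limit(mark):
    """Return the lower limit of the class that contains `mark`."""
    offset = (mark - FIRST_LOWER_LIMIT) // CLASS_WIDTH
    return FIRST_LOWER_LIMIT + offset * CLASS_WIDTH

counts = Counter(class_lower_limit(m) for m in marks)

# (lower limit, upper limit, frequency) for every class from 55 - 59 to 95 - 99
frequency_table = [(low, low + CLASS_WIDTH - 1, counts.get(low, 0))
                   for low in range(FIRST_LOWER_LIMIT, 100, CLASS_WIDTH)]

for low, high, f in frequency_table:
    print(f"{low} - {high}  {'|' * f:<8} {f}")
```

Printing the tally as repeated strokes mirrors the hand-drawn table, so a miscounted class shows up immediately.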

a) I) Mean
The mean mark of the 30 students can be found by using the formula:

$$\bar{x} = \frac{\sum fx}{\sum f}$$

Marks     Midpoint, x   Frequency, f   fx
55 - 59   57            1              57
60 - 64   62            2              124
65 - 69   67            1              67
70 - 74   72            2              144
75 - 79   77            5              385
80 - 84   82            6              492
85 - 89   87            7              609
90 - 94   92            5              460
95 - 99   97            1              97
                        ∑f = 30        ∑fx = 2435

From the table,
∑f = 30
∑fx = 2435

Therefore,

$$\bar{x} = \frac{2435}{30} = 81.1667$$
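As a cross-check of the arithmetic, here is a short sketch that rebuilds the midpoints and the two sums from the frequency table. It assumes the class limits and frequencies given in this part; the variable names are illustrative only.

```python
# (lower limit, upper limit, frequency) for each class in the table
classes = [(55, 59, 1), (60, 64, 2), (65, 69, 1), (70, 74, 2), (75, 79, 5),
           (80, 84, 6), (85, 89, 7), (90, 94, 5), (95, 99, 1)]

sum_f = 0
sum_fx = 0
for low, high, f in classes:
    x = (low + high) / 2   # class midpoint
    sum_f += f
    sum_fx += f * x

mean = sum_fx / sum_f      # mean of grouped data: sum(fx) / sum(f)
print(sum_f, sum_fx, round(mean, 4))
```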

II) Mode

The modal class is 85 - 89, i.e., more students scored marks in this class than in any other class.

To find the mode mark, we draw a histogram of the modal class and the two classes adjacent to the modal class.

From the histogram, the mode mark is 86.5
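The crossing-lines construction on the histogram has an algebraic equivalent: the two diagonals meet at L + d1 / (d1 + d2) × C, where d1 and d2 are the differences between the modal frequency and the frequencies of the classes before and after it. The sketch below applies that interpolation to the frequencies in the table. It is offered only as a cross-check; the function name `grouped_mode` is an assumption, and a value read from a hand-drawn histogram can differ slightly from the computed one.

```python
def grouped_mode(lower_boundary, f_before, f_modal, f_after, class_width):
    """Interpolated mode of grouped data (intersection of the two diagonals)."""
    d1 = f_modal - f_before   # excess over the class before the modal class
    d2 = f_modal - f_after    # excess over the class after the modal class
    return lower_boundary + d1 / (d1 + d2) * class_width

# Modal class 85 - 89: lower boundary 84.5, frequency 7,
# neighbours 80 - 84 (frequency 6) and 90 - 94 (frequency 5).
mode = grouped_mode(84.5, 6, 7, 5, 5)
print(round(mode, 4))
```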

III) Median

Method 1 – By using formula

The median is the value at the centre of a set of data.

The median mark for the 30 students can be obtained by using the formula:

$$m = L + \left(\frac{\frac{N}{2} - F}{f_m}\right) C$$

where
L = lower boundary of median class,
N = total frequency,
F = cumulative frequency before the median class,
f_m = frequency of median class,
C = class interval size.

Marks     Lower boundary   Upper boundary   Frequency, f   Cumulative frequency
55 - 59   54.5             59.5             1              1
60 - 64   59.5             64.5             2              3
65 - 69   64.5             69.5             1              4
70 - 74   69.5             74.5             2              6
75 - 79   74.5             79.5             5              11
80 - 84   79.5             84.5             6              17
85 - 89   84.5             89.5             7              24
90 - 94   89.5             94.5             5              29
95 - 99   94.5             99.5             1              30
                                            ∑f = 30

From the table, the median class is the class containing the (30 ÷ 2)th = 15th value, which is 80 - 84.

L = 79.5, f_m = 6, N = 30, F = 11, C = 69.5 - 64.5 = 5

$$m = 79.5 + \left(\frac{\frac{30}{2} - 11}{6}\right)(5)$$

$$m = 82.8333$$
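The search for the median class and the substitution into the formula can be sketched as a short loop. This assumes the lower boundaries and frequencies from the table above; the variable names follow the symbols L, F, f_m and C used in the formula.

```python
# (lower boundary, frequency) for each class, in ascending order
classes = [(54.5, 1), (59.5, 2), (64.5, 1), (69.5, 2), (74.5, 5),
           (79.5, 6), (84.5, 7), (89.5, 5), (94.5, 1)]
C = 5                              # class interval size
N = sum(f for _, f in classes)     # total frequency
target = N / 2                     # position of the median

F = 0                              # cumulative frequency before the current class
for L, f_m in classes:
    if F + f_m >= target:          # the median falls inside this class
        median = L + (target - F) / f_m * C
        break
    F += f_m

print(L, F, f_m, round(median, 4))
```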

Method 2 – By drawing an ogive

Ogive

An ogive is a graph constructed by plotting the cumulative frequency of a set of data against the corresponding upper boundary of each class.

The median and the interquartile range of a set of data can also be estimated from its ogive.

b) Mean, x̄ = 81.2; Median, m = 82.8333 (or 82.75); Mode = 86.5

Of the above measures of central tendency, the mean is the most suitable because the raw data contain no extreme values and the marks are clustered together, whereas the mode and the median do not take all the values in the data into account, which decreases their accuracy as measures of central tendency.


PART 3

Measure of Dispersion is a measurement to determine how far the values of data in a set of data are spread out from its average value.

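a) I) The interquartile range

Method I – By using formula

$$Q_1 = L + \left(\frac{\frac{N}{4} - F}{f_m}\right) C$$

Q1 class = class containing the (30 × ¼)th = 7.5th value = 75 - 79

L = 74.5, f_m = 5, N = 30, F = 6, C = 5

$$Q_1 = 74.5 + \left(\frac{\frac{30}{4} - 6}{5}\right)(5) = 76.0$$

$$Q_3 = L + \left(\frac{\frac{3N}{4} - F}{f_m}\right) C$$

Q3 class = class containing the (30 × ¾)th = 22.5th value = 85 – 89

L = 84.5, f_m = 7, N = 30, F = 17, C = 5

$$Q_3 = 84.5 + \left(\frac{\frac{3(30)}{4} - 17}{7}\right)(5) = 88.4286$$

Therefore the interquartile range, Q3 – Q1 = 88.4286 – 76.0

= 12.4286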
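Both quartiles use the same interpolation, with only the target position changing, so one helper can serve for both. The sketch below assumes the class boundaries and frequencies from the Part 2 table; `grouped_quantile` is a hypothetical name chosen for illustration.

```python
def grouped_quantile(classes, position, class_width):
    """Interpolate the value at a cumulative position in grouped data.

    classes: list of (lower boundary, frequency) in ascending order.
    position: e.g. N/4 for Q1, 3N/4 for Q3.
    """
    cumulative = 0  # F: cumulative frequency before the current class
    for lower, f in classes:
        if f > 0 and cumulative + f >= position:
            return lower + (position - cumulative) / f * class_width
        cumulative += f
    raise ValueError("position lies beyond the total frequency")

classes = [(54.5, 1), (59.5, 2), (64.5, 1), (69.5, 2), (74.5, 5),
           (79.5, 6), (84.5, 7), (89.5, 5), (94.5, 1)]
N = sum(f for _, f in classes)

q1 = grouped_quantile(classes, N / 4, 5)
q3 = grouped_quantile(classes, 3 * N / 4, 5)
iqr = q3 - q1
print(round(q1, 4), round(q3, 4), round(iqr, 4))
```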

Method II – By using ogive

From the ogive,
Interquartile range = 88.5 – 76.25

= 12.25

II) The Standard deviation

Method I

Marks     Midpoint, x   Frequency, f   fx          fx²
55 - 59   57            1              57          3249
60 - 64   62            2              124         7688
65 - 69   67            1              67          4489
70 - 74   72            2              144         10368
75 - 79   77            5              385         29645
80 - 84   82            6              492         40344
85 - 89   87            7              609         52983
90 - 94   92            5              460         42320
95 - 99   97            1              97          9409
                        ∑f = 30        ∑fx = 2435  ∑fx² = 200495

$$\bar{x} = \frac{2435}{30} = 81.1667$$

$$\sigma = \sqrt{\frac{200495}{30} - \left(\frac{2435}{30}\right)^2} = 9.7539$$

Method II

Marks     Midpoint, x   Frequency, f   x - x̄      (x - x̄)²   f(x - x̄)²
55 - 59   57            1              -24.1667   584.0294   584.0294
60 - 64   62            2              -19.1667   367.3624   734.7248
65 - 69   67            1              -14.1667   200.6954   200.6954
70 - 74   72            2              -9.1667    84.0284    168.0568
75 - 79   77            5              -4.1667    17.3614    86.8069
80 - 84   82            6              0.8333     0.6944     4.1663
85 - 89   87            7              5.8333     34.0274    238.1917
90 - 94   92            5              10.8333    117.3604   586.8019
95 - 99   97            1              15.8333    250.6934   250.6934
                        ∑f = 30                              ∑f(x - x̄)² = 2854.1666

$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{2854.1666}{30}} = 9.7539$$
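The two methods are algebraically the same quantity, so they should agree when the unrounded mean is used. The following sketch, which assumes the midpoints and frequencies from the tables above, computes both and compares them; the variable names are illustrative.

```python
import math

# (midpoint x, frequency f) for each class
table = [(57, 1), (62, 2), (67, 1), (72, 2), (77, 5),
         (82, 6), (87, 7), (92, 5), (97, 1)]

n = sum(f for _, f in table)
sum_fx = sum(f * x for x, f in table)
sum_fx2 = sum(f * x * x for x, f in table)
mean = sum_fx / n

# Method I: square root of (sum(fx^2) / sum(f) - mean^2)
sd_method_1 = math.sqrt(sum_fx2 / n - mean ** 2)

# Method II: square root of (sum(f (x - mean)^2) / sum(f))
sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in table)
sd_method_2 = math.sqrt(sum_sq_dev / n)

print(sum_fx2, round(sum_sq_dev, 4), round(sd_method_1, 4), round(sd_method_2, 4))
```

Keeping the mean unrounded matters here: squaring a mean that has already been rounded to four decimal places can shift the result of Method I in the fourth decimal place.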

b) The standard deviation gives a measure of dispersion of the data about the mean. A direct analogy would be that of the interquartile range, which gives a measure of dispersion about the median. However, the standard deviation is generally more useful than the interquartile range as it includes all data in its calculation. The interquartile range is totally dependent on just two values and ignores all the other observations in the data. Because the standard deviation uses every value, its accuracy is reduced only if an extreme value is present in the data. Since the marks do not contain any extreme value, the standard deviation gives a better measure than the interquartile range.

PART 4

a) If the teacher adds 3 marks for each student in the class for the commitment and discipline shown, the new marks for the 30 students are:

Student  Marks    Student  Marks    Student  Marks
1        58       11       81       21       90
2        63       12       83       22       91
3        66       13       84       23       91
4        68       14       85       24       92
5        76       15       86       25       93
6        77       16       87       26       93
7        78       17       87       27       93
8        78       18       89       28       94
9        79       19       89       29       96
10       80       20       89       30       98

Modal class = 90 – 94

New frequency distribution table:

Marks     Lower boundary   Midpoint, x   Frequency, f   C. Frequency   fx          fx²
55 - 59   54.5             57            1              1              57          3249
60 - 64   59.5             62            1              2              62          3844
65 - 69   64.5             67            2              4              134         8978
70 - 74   69.5             72            0              4              0           0
75 - 79   74.5             77            5              9              385         29645
80 - 84   79.5             82            4              13             328         26896
85 - 89   84.5             87            7              20             609         52983
90 - 94   89.5             92            8              28             736         67712
95 - 99   94.5             97            2              30             194         18818
                                         ∑f = 30                       ∑fx = 2505  ∑fx² = 212125

New mean:

$$\bar{x} = \frac{2505}{30} = 83.5$$

New mode = 90.75

New median

The 15th value lies where the cumulative frequency first reaches 15, so the median class = 85 – 89 (L = 84.5, F = 13, f_m = 7).

$$m = 84.5 + \left(\frac{\frac{30}{2} - 13}{7}\right)(5) = 85.9286$$

The class interval remains the same: class interval = 5

New interquartile range

Q1 class = class containing the (30 × ¼)th = 7.5th value = 75 – 79

$$Q_1 = 74.5 + \left(\frac{\frac{30}{4} - 4}{5}\right)(5) = 78.0$$

Q3 class = class containing the (30 × ¾)th = 22.5th value = 90 – 94

$$Q_3 = 89.5 + \left(\frac{\frac{3(30)}{4} - 20}{8}\right)(5) = 91.0625$$

Therefore the interquartile range, Q3 – Q1 = 91.0625 – 78.0

= 13.0625

New standard deviation:

$$\sigma = \sqrt{\frac{212125}{30} - 83.5^2} = 9.9289$$
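For comparison, the effect of adding a constant can also be checked directly on the ungrouped marks. The sketch below uses Python's standard `statistics` module on the 30 raw marks from Part 2. It is an illustration on raw data, not the grouped-data method used in this project, and the helper `iqr` is a hypothetical name.

```python
import statistics

marks = [55, 60, 63, 65, 73, 74, 75, 75, 76, 77,
         78, 80, 81, 82, 83, 84, 84, 86, 86, 86,
         87, 88, 88, 89, 90, 90, 90, 91, 93, 95]
new_marks = [m + 3 for m in marks]   # every student receives 3 extra marks

def iqr(values):
    """Interquartile range from the quartile cut points of the raw values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

measures = [("mean", statistics.mean),
            ("median", statistics.median),
            ("standard deviation", statistics.pstdev),
            ("interquartile range", iqr)]

for name, func in measures:
    print(f"{name}: {func(marks):.4f} -> {func(new_marks):.4f}")
```

On raw data the mean and median rise by exactly 3 while the standard deviation and interquartile range stay the same. Grouped-data results can differ from this pattern because the shifted marks are sorted back into the same fixed classes, which changes the class frequencies.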

b)

Marks     Lower boundary   Midpoint, x   Frequency, f   C. Frequency   fx          fx²
55 - 59   54.5             57            1              1              57          3249
60 - 64   59.5             62            1              2              62          3844
65 - 69   64.5             67            2              4              134         8978
70 - 74   69.5             72            0              4              0           0
75 - 79   74.5             77            5              9              385         29645
80 - 84   79.5             82            4              13             328         26896
85 - 89   84.5             87            7              20             609         52983
90 - 94   89.5             92            8              28             736         67712
95 - 99   94.5             97            3              31             291         28227
                                         ∑f = 31                       ∑fx = 2602  ∑fx² = 221534

New mean:

$$\bar{x} = \frac{2602}{31} = 83.9355$$

New standard deviation:

$$\sigma = \sqrt{\frac{221534}{31} - \left(\frac{2602}{31}\right)^2} = 10.0545$$

The modal class, the median class and the quartile classes are not affected by the addition of the new mark.
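The sums in the 31-student table can be recomputed from the midpoints and frequencies alone. The sketch below does that under the assumption that the frequencies are exactly those shown in the table for part b); `grouped_mean_sd` is an illustrative helper, not something taken from the original project.

```python
import math

def grouped_mean_sd(table):
    """Return (sum f, sum fx, sum fx^2, mean, standard deviation) for grouped data.

    table: list of (midpoint x, frequency f).
    """
    n = sum(f for _, f in table)
    sum_fx = sum(f * x for x, f in table)
    sum_fx2 = sum(f * x * x for x, f in table)
    mean = sum_fx / n
    sd = math.sqrt(sum_fx2 / n - mean ** 2)
    return n, sum_fx, sum_fx2, mean, sd

# Midpoints and frequencies after one more mark is added to the class 95 - 99
table_b = [(57, 1), (62, 1), (67, 2), (72, 0), (77, 5),
           (82, 4), (87, 7), (92, 8), (97, 3)]

n, sum_fx, sum_fx2, mean, sd = grouped_mean_sd(table_b)
print(n, sum_fx, sum_fx2, round(mean, 4), round(sd, 4))
```

Recomputing the column totals this way is a quick guard against addition slips, which have a large effect on the standard deviation because ∑fx² appears under the square root.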

REFLECTION

REFERENCES


Blog A Form 4 Sasbadi by Pua Kim Teck

Preston Additional Mathematics Form 4 & 5 reference book by Tan Li Lan

Fokus Ungu Matematik Tambahan Form 4 & 5 by Wong Teck Sing

Additional Mathematics Form 4 Text Book.

Mathematics Form 4 Text Book.

http://www.worldteacherspress.com/

http://www.sagepub.com/upm-data/43350_4.pdf

http://geminigeek.com/blog/archives/2004/08/addmath-project-tips-1

http://nutriweb.org.my/

http://www.tips.com.my/article.php

http://www.heartscan.com.my/
