SMK. ST. FRANCIS CONVENT (M), 88000 KOTA KINABALU, SABAH
ADDITIONAL MATHEMATICS PROJECT WORK 2013
STATISTICS OF STUDENTS’ SCORES IN AN EXAMINATION
NAME : Noor Shazlien bt. M. Jamal
CLASS : 5 Pure Science 1
I/C NUMBER :
EXAM REGISTRATION NO :
No  Contents
1   PART 1
    Importance of data analysis in daily life
    The types of Measure of Central Tendency and of Measure of Dispersion
2   PART 2
    Raw data
    Frequency distribution table
    Measure of central tendency: i) Mean ii) Mode iii) Median
3   PART 3
    Measure of Dispersion
    - Interquartile range
4   PART 4
    Problem solving
5   REFLECTION
6   REFERENCES
PART 1
Importance of data analysis in daily life
Data analysis is a process used to transform, remodel and revise information (data) in order to reach a conclusion about a given situation or problem. It can be carried out by different methods according to needs and requirements.
For example, a school principal may want to know whether there is a relationship between students’ performance on the district writing assessment and their socioeconomic levels. In other words, do students who come from lower socioeconomic backgrounds perform lower, as we are led to believe? Or are there other variables responsible for the variance in writing performance? A simple correlation analysis will help describe the students’ performance and help explain the relationship between performance and socioeconomic level.
Analysis does not have to involve complex statistics. Data analysis in schools involves collecting data and using that data to improve teaching and learning. Interestingly, principals and teachers have it pretty easy. In most cases, the collection of data has already been done. Schools regularly collect attendance data, transcript records, discipline referrals, quarterly or semester grades, norm- and criterion-referenced test scores, and a variety of other useful data. Rather than complex statistical formulas and tests, it is generally simple counts, averages, percentages, and rates that educators are interested in.
There are many benefits of data analysis; the most important ones are as follows. Data analysis helps in structuring the findings from different sources of data collection, such as survey research. It is also very helpful in breaking a macro problem into micro parts. Data analysis acts like a filter when it comes to acquiring meaningful insights out of a huge data set. Every researcher has to sort out the huge pile of data that he or she has collected before reaching a conclusion on the research question. Mere data collection is of no use to the researcher; data analysis proves to be crucial in this process. It provides a meaningful base for critical decisions. It helps to create a complete dissertation proposal.
One of the most important uses of data analysis is that it helps to keep human bias away from the research conclusion with the help of proper statistical treatment. With the help of data analysis, a researcher can filter both qualitative and quantitative data for an assignment writing project. Thus, it can be said that data analysis is of utmost importance for both the research and the researcher. To put it another way, data analysis is as important to a researcher as it is important for a doctor to diagnose the problem of the patient before giving him any treatment.
The types of Measure of Central Tendency and of Measure of Dispersion
Central tendency gets at the typical score on the variable, while dispersion gets at how much variety there is in the scores. When describing the scores on a single variable, it is customary to report on both the central tendency and the dispersion. Not all measures of central tendency and not all measures of dispersion can be used to describe the values of cases on every variable. What choices you have depend on the variable’s level of measurement.
Mean
The mean is what in everyday conversation is called the average. It is calculated by simply adding the values of all the valid cases together and dividing by the number of valid cases.
$$\bar{x} = \frac{\sum x}{N} \quad \text{or} \quad \bar{x} = \frac{\sum fx}{\sum f}$$
The mean is an interval/ratio measure of central tendency. Its calculation requires that the attributes of the variable represent a numeric scale.
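Mode
The mode is the attribute of a variable that occurs most often in the data set.
For grouped data, we can estimate the mode from a histogram. We find the modal class and draw its bar together with the bars of the two classes adjacent to it. We then join the top-left corner of the modal bar to the top-left corner of the following bar, and the top-right corner of the modal bar to the top-right corner of the preceding bar. The value on the horizontal axis directly below the intersection of the two lines is taken as the mode.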
Median
The median is a measure of central tendency. It identifies the value of the middle case when the cases have been placed in order or in line from low to high. The middle of the line is as far from being extreme as you can get.
$$m = L + \left(\frac{\frac{N}{2} - F}{f_m}\right) C$$
There are as many cases in line in front of the middle case as behind the middle case. The median is the attribute used by that middle case. When you know the value of the median, you know that at least half the cases had that value or a higher value, while at least half the cases had that value or a lower value.
Range
The distance between the minimum and the maximum is called the range. The larger the value of the range, the more dispersed the cases are on the variable; the smaller the value of the range, the less dispersed (the more concentrated) the cases are on the variable.
Range = maximum value – minimum value
Interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses the middle 50%, the IQR is not affected by outliers or extreme values.
$$Q_1 = L + \left(\frac{\frac{N}{4} - F}{f}\right) C$$
$$Q_3 = L + \left(\frac{\frac{3N}{4} - F}{f}\right) C$$
IQR = Q3 - Q1
Standard Deviation
The standard deviation tells you the approximate average distance of cases from the mean. This is easier to comprehend than the squared distance of cases from the mean. The standard deviation is directly related to the variance.
If you know the value of the variance, you can easily figure out the value of the standard deviation. The reverse is also true. If you know the value of the standard deviation, you can easily calculate the value of the variance. The standard deviation is the square root of the variance.
$$\sigma = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}$$
PART 2
1. March Additional Mathematics test scores for 30 students.
Student  Marks     Student  Marks     Student  Marks
1        55        11       78        21       87
2        60        12       80        22       88
3        63        13       81        23       88
4        65        14       82        24       89
5        73        15       83        25       90
6        74        16       84        26       90
7        75        17       84        27       90
8        75        18       86        28       91
9        76        19       86        29       93
10       77        20       86        30       95
2. Frequency distribution table:
Marks     Tally       Frequency
55 - 59   |           1
60 - 64   ||          2
65 - 69   |           1
70 - 74   ||          2
75 - 79   |||||       5
80 - 84   ||||| |     6
85 - 89   ||||| ||    7
90 - 94   |||||       5
95 - 99   |           1
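The tallying step can be checked with a short script. This is only a sketch: the function name `frequency_table` and its parameters are illustrative and do not come from the project, and it assumes every class is 5 marks wide starting at 55.

```python
# Raw marks of the 30 students, copied from the raw data table.
marks = [55, 60, 63, 65, 73, 74, 75, 75, 76, 77,
         78, 80, 81, 82, 83, 84, 84, 86, 86, 86,
         87, 88, 88, 89, 90, 90, 90, 91, 93, 95]

CLASS_WIDTH = 5   # each class covers 5 marks, e.g. 55 - 59
FIRST_LOWER = 55  # lower limit of the first class


def frequency_table(values, first_lower=FIRST_LOWER, width=CLASS_WIDTH, n_classes=9):
    """Return a list of ((lower, upper), frequency) pairs (hypothetical helper)."""
    counts = [0] * n_classes
    for v in values:
        index = (v - first_lower) // width  # class that the mark falls in
        counts[index] += 1
    return [((first_lower + i * width, first_lower + i * width + width - 1), c)
            for i, c in enumerate(counts)]


table = frequency_table(marks)
for (lower, upper), f in table:
    print(f"{lower} - {upper}: {f}")
```

The printed frequencies should agree with the Frequency column of the table.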
a) I) Mean
The mean mark of the 30 students can be found by using the formula:
$$\bar{x} = \frac{\sum fx}{\sum f}$$
Marks     Midpoint, x   Frequency, f   fx
55 - 59   57            1              57
60 - 64   62            2              124
65 - 69   67            1              67
70 - 74   72            2              144
75 - 79   77            5              385
80 - 84   82            6              492
85 - 89   87            7              609
90 - 94   92            5              460
95 - 99   97            1              97
                        ∑f = 30        ∑fx = 2435
From the table,
∑f = 30
∑fx = 2435
Therefore,
$$\bar{x} = \frac{2435}{30} = 81.1667$$
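As a quick check of the arithmetic, the grouped mean can be recomputed from the midpoints and frequencies. The variable names below are illustrative only.

```python
# Class midpoints and frequencies from the table.
midpoints = [57, 62, 67, 72, 77, 82, 87, 92, 97]
frequencies = [1, 2, 1, 2, 5, 6, 7, 5, 1]

sum_f = sum(frequencies)                                     # total frequency
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))  # sum of f times x
mean = sum_fx / sum_f

print(sum_f, sum_fx, round(mean, 4))  # 30 2435 81.1667
```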
II) Mode
The modal class is 85 - 89, i.e., the class with the highest frequency: more students (7) scored in this range than in any other class.
To find the mode mark, we draw the modal class and the two classes adjacent to the modal class in a histogram.
From the histogram, the mode mark is 86.5
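The crossing-lines construction on the histogram corresponds to the standard interpolation formula mode = L + d1 / (d1 + d2) × C, where d1 and d2 are the differences between the modal frequency and the frequencies of the classes before and after it. The sketch below applies that formula; the function name `grouped_mode` is illustrative and is not part of the project.

```python
def grouped_mode(lower_boundary, f_modal, f_before, f_after, class_size):
    """Estimate the mode of grouped data by interpolation (hypothetical helper)."""
    d1 = f_modal - f_before  # excess over the class before the modal class
    d2 = f_modal - f_after   # excess over the class after the modal class
    return lower_boundary + d1 / (d1 + d2) * class_size


# Modal class 85 - 89: lower boundary 84.5, frequency 7,
# neighbouring frequencies 6 (80 - 84) and 5 (90 - 94).
mode = grouped_mode(84.5, 7, 6, 5, 5)
print(round(mode, 4))
```

The formula gives a value of about 86.17. A reading taken from a hand-drawn histogram depends on how accurately the lines are drawn, so a small difference from the graphical value is to be expected.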
III) Median
Method 1 – By using formula
The median is the value at the centre of a set of data.
The median mark for the 30 students can be obtained by using the formula:
$$m = L + \left(\frac{\frac{N}{2} - F}{f_m}\right) C$$
where
L = lower boundary of the median class,
N = total frequency,
F = cumulative frequency before the median class,
fm = frequency of the median class,
C = class interval size.
Marks     Lower boundary   Upper boundary   Frequency, f   Cumulative frequency
55 - 59   54.5             59.5             1              1
60 - 64   59.5             64.5             2              3
65 - 69   64.5             69.5             1              4
70 - 74   69.5             74.5             2              6
75 - 79   74.5             79.5             5              11
80 - 84   79.5             84.5             6              17
85 - 89   84.5             89.5             7              24
90 - 94   89.5             94.5             5              29
95 - 99   94.5             99.5             1              30
                                            ∑f = 30
From the table, the median is the 30 ÷ 2 = 15th value, so the median class is 80 - 84.
L = 79.5
fm = 6
Total frequency, N = 30
F = 11
C = 69.5 - 64.5 = 5
$$m = 79.5 + \left(\frac{\frac{30}{2} - 11}{6}\right)(5)$$
$$m = 82.8333$$
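The same calculation can be sketched in code, including the step of locating the median class from the cumulative frequencies. The function name `grouped_median` is an assumption made for illustration.

```python
from itertools import accumulate

lower_boundaries = [54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5]
frequencies = [1, 2, 1, 2, 5, 6, 7, 5, 1]
CLASS_SIZE = 5


def grouped_median(lower_boundaries, frequencies, class_size):
    """m = L + ((N/2 - F) / fm) * C; assumes at least one non-zero frequency."""
    half = sum(frequencies) / 2                 # N / 2
    cumulative = list(accumulate(frequencies))
    for i, cum in enumerate(cumulative):
        if cum >= half:                         # first class reaching N / 2
            before = cumulative[i - 1] if i > 0 else 0  # F
            return lower_boundaries[i] + (half - before) / frequencies[i] * class_size
    raise ValueError("frequency table is empty")


median = grouped_median(lower_boundaries, frequencies, CLASS_SIZE)
print(round(median, 4))  # 82.8333
```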
Method 2 – By drawing an ogive
Ogive
An ogive is a graph constructed by plotting the cumulative frequency of a set of data against the corresponding upper boundary of each class.
The median and the interquartile range of a set of data can also be estimated from its ogive.
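The points of the ogive, and a reading taken from it, can be sketched as follows. The sketch assumes the plotted points are joined by straight lines, in which case the reading matches the formula method. A smooth hand-drawn curve will usually give a slightly different value. The function name `read_ogive` is illustrative.

```python
from itertools import accumulate

upper_boundaries = [59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5, 99.5]
frequencies = [1, 2, 1, 2, 5, 6, 7, 5, 1]

# Ogive points (upper boundary, cumulative frequency), starting from the
# lowest boundary 54.5 with cumulative frequency 0.
points = [(54.5, 0)] + list(zip(upper_boundaries, accumulate(frequencies)))


def read_ogive(points, cum_freq):
    """Mark at which a straight-line ogive reaches the given cumulative frequency."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < cum_freq <= y1:
            return x0 + (cum_freq - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("cumulative frequency lies outside the ogive")


median_from_ogive = read_ogive(points, 30 / 2)
print(round(median_from_ogive, 4))  # 82.8333
```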
b) Mean, x̄ = 81.2    Median, m = 82.8333 (or 82.75)    Mode = 86.5
Of the above measures of central tendency, the mean is the most suitable. The raw data contain no extreme values and the marks are clustered together, so the mean is not distorted. The mean also takes every value in the data into account, whereas the mode and the median do not, which makes them less accurate as measures of central tendency for this data.
PART 3
Measure of Dispersion is a measurement to determine how far the values of data in a set of data are spread out from its average value.
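a) I) The interquartile range
Method I – By using formula
$$Q_1 = L + \left(\frac{\frac{N}{4} - F}{f_m}\right) C$$
Q1 is the 30 × ¼ = 7.5th value, so the Q1 class is 75 - 79.
L = 74.5, fm = 5, total frequency N = 30, F = 6, C = 5
$$Q_1 = 74.5 + \left(\frac{\frac{30}{4} - 6}{5}\right)(5)$$
$$Q_1 = 76.0$$
$$Q_3 = L + \left(\frac{\frac{3N}{4} - F}{f_m}\right) C$$
Q3 is the 30 × ¾ = 22.5th value, so the Q3 class is 85 – 89.
L = 84.5, fm = 7, N = 30, F = 17, C = 5
$$Q_3 = 84.5 + \left(\frac{\frac{3(30)}{4} - 17}{7}\right)(5)$$
$$Q_3 = 88.4286$$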
Therefore the Interquartile range, Q3 – Q1 = 88.4286 – 76.0
= 12.4286
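Both quartiles use the same interpolation with a different target position, so one helper can cover them. This is a sketch under that assumption, and the name `grouped_quantile` is hypothetical.

```python
lower_boundaries = [54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5]
frequencies = [1, 2, 1, 2, 5, 6, 7, 5, 1]
CLASS_SIZE = 5


def grouped_quantile(lower_boundaries, frequencies, class_size, fraction):
    """Q = L + ((fraction * N - F) / f) * C for the class containing the target."""
    target = fraction * sum(frequencies)  # e.g. N / 4 for Q1
    before = 0                            # cumulative frequency F so far
    for boundary, f in zip(lower_boundaries, frequencies):
        if f > 0 and before + f >= target:
            return boundary + (target - before) / f * class_size
        before += f
    raise ValueError("fraction must lie between 0 and 1")


q1 = grouped_quantile(lower_boundaries, frequencies, CLASS_SIZE, 0.25)
q3 = grouped_quantile(lower_boundaries, frequencies, CLASS_SIZE, 0.75)
iqr = q3 - q1
print(round(q1, 4), round(q3, 4), round(iqr, 4))  # 76.0 88.4286 12.4286
```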
Method II – By using ogive
From the ogive,
Interquartile range = 88.5 – 76.25
= 12.25
II) The standard deviation
Method I
Marks     Midpoint, x   Frequency, f   fx     fx²
55 - 59   57            1              57     3249
60 - 64   62            2              124    7688
65 - 69   67            1              67     4489
70 - 74   72            2              144    10368
75 - 79   77            5              385    29645
80 - 84   82            6              492    40344
85 - 89   87            7              609    52983
90 - 94   92            5              460    42320
95 - 99   97            1              97     9409
∑f = 30, ∑fx = 2435, ∑fx² = 200495
$$\bar{x} = \frac{2435}{30} = 81.1667$$
$$\sigma = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{200495}{30} - \left(\frac{2435}{30}\right)^2} = 9.7539$$
Method II
Marks     Midpoint, x   Frequency, f   x − x̄       (x − x̄)²    f(x − x̄)²
55 - 59   57            1              -24.1667    584.0294    584.0294
60 - 64   62            2              -19.1667    367.3624    734.7248
65 - 69   67            1              -14.1667    200.6954    200.6954
70 - 74   72            2              -9.1667     84.0284     168.0568
75 - 79   77            5              -4.1667     17.3614     86.8069
80 - 84   82            6              0.8333      0.6944      4.1663
85 - 89   87            7              5.8333      34.0274     238.1917
90 - 94   92            5              10.8333     117.3604    586.8019
95 - 99   97            1              15.8333     250.6934    250.6934
∑f = 30, ∑f(x − x̄)² = 2854.1666
$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{2854.1666}{30}} = 9.7539$$
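The two methods are algebraically the same formula, so recomputing both from the midpoints and frequencies is a useful check on the table arithmetic. The variable names in this sketch are illustrative.

```python
from math import sqrt

midpoints = [57, 62, 67, 72, 77, 82, 87, 92, 97]
frequencies = [1, 2, 1, 2, 5, 6, 7, 5, 1]

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Method I: square root of (sum of f*x^2 / sum of f) minus mean squared.
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
sd_method_1 = sqrt(sum_fx2 / n - mean ** 2)

# Method II: square root of (sum of f*(x - mean)^2 / sum of f).
sum_sq_dev = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))
sd_method_2 = sqrt(sum_sq_dev / n)

print(sum_fx2, round(sum_sq_dev, 4))
print(round(sd_method_1, 4), round(sd_method_2, 4))
```

If the two printed standard deviations differ, one of the table columns contains an arithmetic slip.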
b) The standard deviation gives a measure of the dispersion of the data about the mean. A direct analogy is the interquartile range, which gives a measure of dispersion about the median. However, the standard deviation is generally more useful than the interquartile range because it includes all the data in its calculation. The interquartile range depends on just two values, Q1 and Q3, and ignores all the other observations in the data; its advantage is that it is not distorted when an extreme value is present. Since the marks do not contain any extreme value, the standard deviation gives a better measure of dispersion than the interquartile range.
PART 4
a) If the teacher adds 3 marks for each student in class for their commitment and discipline shown,
The new marks for the 30 students:
Student  Marks     Student  Marks     Student  Marks
1        58        11       81        21       90
2        63        12       83        22       91
3        66        13       84        23       91
4        68        14       85        24       92
5        76        15       86        25       93
6        77        16       87        26       93
7        78        17       87        27       93
8        78        18       89        28       94
9        79        19       89        29       96
10       80        20       89        30       98
Modal class = 90 – 94
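New frequency distribution table:
Marks     Lower boundary   Midpoint, x   Frequency, f   C. frequency   fx     fx²
55 - 59   54.5             57            1              1              57     3249
60 - 64   59.5             62            1              2              62     3844
65 - 69   64.5             67            2              4              134    8978
70 - 74   69.5             72            0              4              0      0
75 - 79   74.5             77            5              9              385    29645
80 - 84   79.5             82            4              13             328    26896
85 - 89   84.5             87            7              20             609    52983
90 - 94   89.5             92            8              28             736    67712
95 - 99   94.5             97            2              30             194    18818
∑f = 30, ∑fx = 2505, ∑fx² = 212125
New mean,
$$\bar{x} = \frac{2505}{30} = 83.5$$
New mode,
New mode = 90.75
New median
The median is the 30 ÷ 2 = 15th value. The cumulative frequency is 13 at the end of class 80 – 84 and 20 at the end of class 85 – 89, so the median class is 85 – 89, with L = 84.5, F = 13 and fm = 7.
$$m = 84.5 + \left(\frac{\frac{30}{2} - 13}{7}\right)(5) = 85.9286$$
The class interval remains the same.
Class interval = 5
New interquartile range
Q1 is the 30 × ¼ = 7.5th value, so the Q1 class is 75 – 79, with L = 74.5, F = 4 and fm = 5.
$$Q_1 = 74.5 + \left(\frac{\frac{30}{4} - 4}{5}\right)(5) = 78.0$$
Q3 is the 30 × ¾ = 22.5th value. The cumulative frequency is 20 at the end of class 85 – 89, so the Q3 class is 90 – 94, with L = 89.5, F = 20 and fm = 8.
$$Q_3 = 89.5 + \left(\frac{\frac{3(30)}{4} - 20}{8}\right)(5) = 91.0625$$
Therefore the interquartile range, Q3 – Q1 = 91.0625 – 78.0
= 13.0625
New standard deviation,
$$\sigma = \sqrt{\frac{212125}{30} - 83.5^2} = 9.9289$$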
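The whole of part (a) can be recomputed from the raw marks as a check. This is a sketch only: it assumes the class boundaries stay fixed at 55 - 59, 60 - 64, and so on, and the helper name `quantile` is illustrative.

```python
from math import sqrt

original_marks = [55, 60, 63, 65, 73, 74, 75, 75, 76, 77,
                  78, 80, 81, 82, 83, 84, 84, 86, 86, 86,
                  87, 88, 88, 89, 90, 90, 90, 91, 93, 95]
new_marks = [m + 3 for m in original_marks]  # teacher adds 3 marks each

WIDTH = 5
LOWER_LIMITS = list(range(55, 100, WIDTH))          # 55, 60, ..., 95
LOWER_BOUNDARIES = [l - 0.5 for l in LOWER_LIMITS]  # 54.5, ..., 94.5
MIDPOINTS = [l + 2 for l in LOWER_LIMITS]           # 57, ..., 97

# Regroup the new marks into the same classes.
frequencies = [0] * len(LOWER_LIMITS)
for mark in new_marks:
    frequencies[(mark - 55) // WIDTH] += 1


def quantile(fraction):
    """Interpolated value at position fraction * N in the grouped data."""
    target = fraction * sum(frequencies)
    before = 0  # cumulative frequency before the current class
    for boundary, f in zip(LOWER_BOUNDARIES, frequencies):
        if f > 0 and before + f >= target:
            return boundary + (target - before) / f * WIDTH
        before += f
    raise ValueError("fraction must lie between 0 and 1")


n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, MIDPOINTS)) / n
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, MIDPOINTS))
sd = sqrt(sum_fx2 / n - mean ** 2)
median = quantile(0.5)
q1, q3 = quantile(0.25), quantile(0.75)

print(frequencies)
print(mean, round(median, 4), round(q1, 4), round(q3, 4), round(sd, 4))
```

Adding 3 to every raw mark shifts the raw data without changing its spread. The grouped standard deviation still changes slightly because the marks fall into the fixed classes differently.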
b)
Marks     Lower boundary   Midpoint, x   Frequency, f   C. frequency   fx     fx²
55 - 59   54.5             57            1              1              57     3249
60 - 64   59.5             62            1              2              62     3844
65 - 69   64.5             67            2              4              134    8978
70 - 74   69.5             72            0              4              0      0
75 - 79   74.5             77            5              9              385    29645
80 - 84   79.5             82            4              13             328    26896
85 - 89   84.5             87            7              20             609    52983
90 - 94   89.5             92            8              28             736    67712
95 - 99   94.5             97            3              31             291    28227
∑f = 31, ∑fx = 2602, ∑fx² = 221534
New mean,
$$\bar{x} = \frac{2602}{31} = 83.9355$$
New standard deviation,
$$\sigma = \sqrt{\frac{221534}{31} - \left(\frac{2602}{31}\right)^2} = 10.0545$$
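When one more mark is added, the totals ∑f, ∑fx and ∑fx² can simply be updated instead of rebuilding the whole table. The sketch below assumes the extra student falls in class 95 - 99 (midpoint 97), which is what the table's last row suggests. The variable names are illustrative.

```python
from math import sqrt

midpoints = [57, 62, 67, 72, 77, 82, 87, 92, 97]
frequencies_a = [1, 1, 2, 0, 5, 4, 7, 8, 2]  # frequencies from part (a)

# Totals before the extra student is added.
n = sum(frequencies_a)
sum_fx = sum(f * x for f, x in zip(frequencies_a, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies_a, midpoints))

extra_midpoint = 97  # assumption: the new mark lies in class 95 - 99

# Update each total with the single new observation.
n_new = n + 1
sum_fx_new = sum_fx + extra_midpoint
sum_fx2_new = sum_fx2 + extra_midpoint ** 2

mean_new = sum_fx_new / n_new
sd_new = sqrt(sum_fx2_new / n_new - mean_new ** 2)

print(n_new, sum_fx_new, sum_fx2_new)
print(round(mean_new, 4), round(sd_new, 4))
```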
The modal class, the median class and the quartile classes are not affected by adding the new mark.
REFLECTION
REFERENCES
www.worldteacherspress.com
www.heartscan.com.my
www.tips.com.my/article.php
http://nutriweb.org.my
http://geminigeek.com/blog/archives/2004/08/addmath-project-tips-1
http://www.sagepub.com/upm-data/43350_4.pdf
Blog A Form 4 Sasbadi by Pua Kim Teck
Preston Additional Mathematics Form 4 & 5 reference book by Tan Li Lan
Fokus Ungu Matematik Tambahan Form 4 & 5 by Wong Teck Sing
Additional Mathematics Form 4 Text Book.
Mathematics Form 4 Text Book.