MELJUN CORTES - Types of Data
![Page 1: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/1.jpg)
Lesson 13 - 1
Year 1
CS113/0401/v1
LESSON 13: TYPES OF DATA
Qualitative
– Not usually numeric
– No particular order
– Examples: Colour, Types of Materials
Quantitative
– Numeric
– Ordered
– Measurable
– Continuous, e.g. Length, Age, Weight
– Discrete, e.g. Shoe size, Number of people
![Page 2: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/2.jpg)
DATA TABULATION (1)
First stage in making raw data understandable
RAW DATA
Number of sheets of listing paper used by each of 120 jobs
17 24 11 14 18 17 7 5 21 6 11 18
22 14 6 17 14 8 12 13 27 12 18 9
14 18 14 13 21 8 27 9 11 16 27 21
14 11 19 7 10 29 17 12 14 19 12 9
23 17 24 7 13 14 17 21 8 17 19 24
26 2 5 18 14 16 7 16 28 13 14 8
19 27 9 18 8 24 19 7 13 14 16 19
11 17 23 12 25 16 15 10 21 18 14 11
9 14 28 20 12 16 10 8 9 11 22 10
17 9 18 12 24 8 7 16 5 20 7 10
Not easily digested!
![Page 3: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/3.jpg)
DATA TABULATION (2)
Tabulate in (discrete) categories
| Category (No. of sheets used) | Tally | Frequency |
|---|---|---|
| 0 - | 1 | 1 |
| 5 - | 1111 1111 1111 1111 1111 1 | 26 |
| 10 - | 1111 1111 1111 1111 1111 1111 1111 11 | 37 |
| 15 - | 1111 1111 1111 1111 1111 1111 1 | 31 |
| 20 - | 1111 1111 1111 1 | 16 |
| 25 - | 1111 1111 | 9 |
| Total | | 120 |

Frequency distribution table
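The tabulation can be sketched in Python. This is a minimal illustration, assuming the 120 values are as read from the raw-data slide (numbers that were run together have been split by hand) and that the classes are the integer ranges 0-4, 5-9, ..., 25-29; the names `SHEETS`, `CLASS_WIDTH` and `tabulate` are made up for the example.

```python
from collections import Counter

# Sheets used by each of the 120 jobs, as read from the raw-data slide.
# Splitting the run-together numbers is an assumption of this sketch.
SHEETS = [
    17, 24, 11, 14, 18, 17, 7, 5, 21, 6, 11, 18,
    22, 14, 6, 17, 14, 8, 12, 13, 27, 12, 18, 9,
    14, 18, 14, 13, 21, 8, 27, 9, 11, 16, 27, 21,
    14, 11, 19, 7, 10, 29, 17, 12, 14, 19, 12, 9,
    23, 17, 24, 7, 13, 14, 17, 21, 8, 17, 19, 24,
    26, 2, 5, 18, 14, 16, 7, 16, 28, 13, 14, 8,
    19, 27, 9, 18, 8, 24, 19, 7, 13, 14, 16, 19,
    11, 17, 23, 12, 25, 16, 15, 10, 21, 18, 14, 11,
    9, 14, 28, 20, 12, 16, 10, 8, 9, 11, 22, 10,
    17, 9, 18, 12, 24, 8, 7, 16, 5, 20, 7, 10,
]

CLASS_WIDTH = 5  # assumed classes: 0-4, 5-9, ..., 25-29


def tabulate(values, width):
    """Return {lower class limit: frequency} for equal-width integer classes."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


table = tabulate(SHEETS, CLASS_WIDTH)
for lower, freq in table.items():
    print(f"{lower:>2} - {lower + CLASS_WIDTH - 1:<2}  {freq}")
print("Total ", sum(table.values()))
```

The counts produced this way agree with the Frequency column of the table, which is a useful check on the reading of the raw data.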
![Page 4: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/4.jpg)
FREQUENCY DISTRIBUTION (1)
Raw data
Raw data are collected data which have not been organized numerically.
Array
An array is an arrangement of raw numerical data in ascending or descending order of magnitude. The difference between the largest and smallest number is called the range of the data.
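As a small illustration of an array and its range, the sketch below sorts the first eight values from the raw-data slide; the variable names are chosen for the example only.

```python
# First eight values from the raw-data slide, used as a short sample.
raw = [17, 24, 11, 14, 18, 17, 7, 5]

ascending = sorted(raw)                  # array in ascending order
descending = sorted(raw, reverse=True)   # array in descending order
data_range = max(raw) - min(raw)         # largest minus smallest

print(ascending)   # [5, 7, 11, 14, 17, 17, 18, 24]
print(data_range)  # 19
```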
![Page 5: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/5.jpg)
FREQUENCY DISTRIBUTION (2)
Frequency distribution
When summarizing a large amount of raw data it is often useful to distribute the data into classes or categories and to determine the number of individuals belonging to each class, called the class frequency.
![Page 6: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/6.jpg)
EXAMPLE
A set of 100 students is obtained from an alphabetical listing of a university record.
Their weights, ranging from 60 kg to 74 kg, are tabulated.
![Page 7: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/7.jpg)
EXAMPLE
| Mass (kilograms) | Number of Students |
|---|---|
| 60 - 62 | 5 |
| 63 - 65 | 18 |
| 66 - 68 | 42 |
| 69 - 71 | 27 |
| 72 - 74 | 8 |
| Total | 100 |

The first class or category, for example, consists of masses from 60 to 62 kg and is indicated by the symbol 60 - 62. Since 5 students have masses belonging to this class, the corresponding class frequency is 5.
Data organized and summarized as in the above frequency distribution are often called grouped data.
![Page 8: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/8.jpg)
CLASS INTERVAL
A symbol defining a class, such as 60 - 62, is called a class interval. The end numbers, 60 and 62, are called the class limits.
The smaller number 60 is the lower class limit and the larger number 62 is the upper class limit.
![Page 9: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/9.jpg)
CLASS MARK
A class mark is the midpoint of the class interval and is obtained by adding the lower and upper class limits and dividing by two
In the previous examples, the class mark of the interval 60 - 62 is (60 + 62) / 2 = 61
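The same calculation can be applied to every class of the mass table. A minimal sketch, with `classes` and `class_mark` as illustrative names:

```python
# Class limits (lower, upper) from the mass example, in kilograms.
classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]


def class_mark(lower, upper):
    """Midpoint of a class interval: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


marks = [class_mark(lower, upper) for lower, upper in classes]
print(marks)  # [61.0, 64.0, 67.0, 70.0, 73.0]
```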
![Page 10: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/10.jpg)
MEDIAN (1)
The median of a set of numbers arranged in order of magnitude is the middle value or the arithmetic mean of the two middle values.
Example 1: The set of numbers
3, 4, 4, 5, 6, 8, 8, 8, 10
For an odd number of data the median occurs at position
(N + 1) / 2
= (9 + 1) / 2
= 10 / 2
= 5th position
Therefore the median = 6
![Page 11: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/11.jpg)
MEDIAN (2)
Example 2: The set of numbers
5, 5, 7, 9, 11, 12, 15, 18
For an even number of data the median is the average of the two middle values
The median
= (value at position 4 + value at position 5) / 2
= (9 + 11) / 2
= 10
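Both cases can be handled by one short function. This is a sketch rather than a library routine; the name `median` is chosen for the example, and Python's standard `statistics.median` gives the same results.

```python
def median(values):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the value at position (N + 1) / 2, counting from 1.
        return ordered[mid]
    # Even N: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 4, 4, 5, 6, 8, 8, 8, 10]))    # 6
print(median([5, 5, 7, 9, 11, 12, 15, 18]))    # 10.0
```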
![Page 12: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/12.jpg)
MEDIAN (1)
For grouped data the median, obtained by interpolation, is given by
$$\text{Median} = L_1 + \left( \frac{\dfrac{N}{2} - (\sum f)_1}{f_{\text{median}}} \right) c$$
Where
$L_1$ = lower class boundary of the median class (i.e. the class containing the median).
$N$ = number of items in the data (i.e. total frequency)
![Page 13: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/13.jpg)
MEDIAN (2)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
$f_{\text{median}}$ = frequency of median class
$c$ = size of median class interval
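The interpolation formula can be tried on the mass table from the earlier example. The sketch below assumes class boundaries half a unit either side of the class limits (59.5 - 62.5, 62.5 - 65.5, and so on), which the slides do not state; `grouped_median` is an illustrative name.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data by linear interpolation within the median class.

    boundaries  -- list of (lower boundary, upper boundary), in ascending order
    frequencies -- class frequencies in the same order
    """
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2
    below = 0  # sum of frequencies of all classes lower than the current one
    for (lower, upper), f in zip(boundaries, frequencies):
        if below + f >= half:
            # L1 + ((N/2 - sum of lower frequencies) / f_median) * c
            return lower + (half - below) / f * (upper - lower)
        below += f
    raise ValueError("median class not found")


# Mass example; the +/- 0.5 boundaries are an assumption of this sketch.
boundaries = [(59.5, 62.5), (62.5, 65.5), (65.5, 68.5), (68.5, 71.5), (71.5, 74.5)]
students = [5, 18, 42, 27, 8]

print(round(grouped_median(boundaries, students), 2))  # 67.43
```

Here the median class is 66 - 68, with 23 students in the lower classes, so the estimate is 65.5 + (50 - 23) / 42 × 3.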
![Page 14: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/14.jpg)
MEDIAN OF A GROUPED FREQUENCY DISTRIBUTION
Draw a Cumulative Frequency Diagram
Search for the middle value on the cumulative frequency axis and read off the corresponding value on the x axis
This is the median
![Page 15: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/15.jpg)
MEDIAN FROM A CUMULATIVE FREQUENCY DIAGRAM
![Page 16: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/16.jpg)
MODE (1)
The mode of a set of numbers is that value which occurs with the greatest frequency, i.e. it is the most common value. The mode may not exist, and even if it does exist it may not be unique.
Example: The set
2, 2, 5, 7, 9, 9, 9, 10, 11, 12, 18
has mode 9
Example: The set
3, 5, 8, 10, 12, 15, 16
has no mode
![Page 17: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/17.jpg)
MODE (2)
Example: The set
2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9
has two modes, 4 and 7, and is called bimodal
A distribution having only one mode is called unimodal
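The three examples can be checked with a short function that returns every mode. This is a sketch: the name `modes` is made up, and treating a set in which every value occurs once as having no mode follows the example above rather than any fixed library convention.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the greatest frequency, in order.

    An empty list means "no mode": either the data are empty or every
    value occurs exactly once (the convention used in the examples).
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([2, 2, 5, 7, 9, 9, 9, 10, 11, 12, 18]))  # [9]
print(modes([3, 5, 8, 10, 12, 15, 16]))              # []
print(modes([2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9]))      # [4, 7]
```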
![Page 18: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/18.jpg)
MODE OF A FREQUENCY DISTRIBUTION
Ungrouped data
Mode is the x value which has the highest value of ƒ
Grouped data
Can’t find mode, only the modal class
![Page 19: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/19.jpg)
MODAL CLASS
| x | f |
|---|---|
| 51 - 55 | 12 |
| 55 - 60 | 16 |
| 61 - 65 | 10 |

55 - 60 is the modal class
We don’t know x values before grouping, so we can’t find the mode exactly
N.B. Actual mode might not even be in this class
![Page 20: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/20.jpg)
MODE (1)
Where a frequency curve has been constructed to fit grouped data, the mode will be the value (or values) of x corresponding to the maximum point (or points) on the curve. From a frequency distribution or histogram the mode can be obtained from the following formula:
$$\text{Mode} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$
![Page 21: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/21.jpg)
MODE (2)
Where
$L_1$ = lower class boundary of modal class (i.e. class containing the mode).
$\Delta_1$ = excess of modal frequency over frequency of the next lower class
$\Delta_2$ = excess of modal frequency over frequency of the next higher class
$c$ = size of modal class interval
![Page 22: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/22.jpg)
GROUPED MODE FROM HISTOGRAM (1)
Can only ESTIMATE
Assume mode is in Modal Class
![Page 23: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/23.jpg)
GROUPED MODE FROM HISTOGRAM (2)
Calculation
$$\begin{aligned}
\text{Mode estimate} &= 25 + 5 \times \frac{40}{40 + 64} \\
&= 25 + 5 \times \frac{40}{104} \\
&= 25 + 1.9 \\
&= 26.9
\end{aligned}$$
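The calculation above can be reproduced with the grouped-mode formula. In this sketch the numbers are read from the worked calculation (lower boundary 25, class size 5, excesses 40 and 64); the function name `grouped_mode` and its parameter names are illustrative.

```python
def grouped_mode(lower_boundary, delta1, delta2, class_size):
    """Estimate the mode of grouped data.

    delta1 -- modal frequency minus frequency of the next lower class
    delta2 -- modal frequency minus frequency of the next higher class
    """
    if delta1 + delta2 == 0:
        raise ValueError("modal class must exceed at least one neighbour")
    return lower_boundary + delta1 / (delta1 + delta2) * class_size


estimate = grouped_mode(25, 40, 64, 5)
print(round(estimate, 1))  # 26.9
```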
![Page 24: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/24.jpg)
ARITHMETIC MEAN (1)
The arithmetic mean, or the mean, of a set of N numbers X1, X2, X3, ..., XN is denoted by $\bar{X}$ and is defined as
$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$
![Page 25: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/25.jpg)
ARITHMETIC MEAN (2)
Eight numbers: 7, 21, 13, 17, 23, 18, 9, 20
Add them = 128
Divide by 8 = 16
This is the arithmetic mean
It is the most common definition of “average”
It only works with quantitative data
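The same two steps in Python, as a minimal sketch with illustrative variable names:

```python
numbers = [7, 21, 13, 17, 23, 18, 9, 20]

total = sum(numbers)          # add them
mean = total / len(numbers)   # divide by how many there are

print(total)  # 128
print(mean)   # 16.0
```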
![Page 26: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/26.jpg)
ARITHMETIC MEAN (3)
If the numbers X1, X2, X3, ..., Xn occur ƒ1, ƒ2, ƒ3, ..., ƒn times respectively, the arithmetic mean is
$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_n X_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum_{i=1}^{n} f_i X_i}{\sum_{i=1}^{n} f_i}$$
![Page 27: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/27.jpg)
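MEAN OF A FREQUENCY DISTRIBUTION
| Age (x) | Frequency (ƒ) | ƒx |
|---|---|---|
| 17 | 3 | 51 |
| 18 | 8 | 144 |
| 19 | 14 | 266 |
| 20 | 21 | 420 |
| 21 | 24 | 504 |
| 22 | 13 | 286 |
| 23 | 7 | 161 |
| 24 | 6 | 144 |
| 25 | 3 | 75 |
| 26 | 1 | 26 |
| Total | Σƒ = 100 | Σƒx = 2077 |

Mean age = 2077 / 100 = 20.77
(rounded to nearest integer, 21)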
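The ƒx column and the two totals can be rebuilt from the ages and frequencies. A minimal sketch, with variable names chosen for the example:

```python
ages = list(range(17, 27))                    # x: 17, 18, ..., 26
freqs = [3, 8, 14, 21, 24, 13, 7, 6, 3, 1]    # f for each age

fx = [f * x for x, f in zip(ages, freqs)]     # the fx column
sum_f = sum(freqs)
sum_fx = sum(fx)
mean_age = sum_fx / sum_f

print(sum_f, sum_fx)   # 100 2077
print(mean_age)        # 20.77
```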
![Page 28: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/28.jpg)
HISTOGRAMS (1)
Only used for quantitative data
A histogram is like a bar chart, but with no gaps between bars and with a calibrated horizontal axis
Order of bars depends on value and on horizontal scale
![Page 29: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/29.jpg)
HISTOGRAMS (2)
![Page 30: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/30.jpg)
HISTOGRAMS (3)
![Page 31: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/31.jpg)
AREA IN HISTOGRAMS
![Page 32: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/32.jpg)
CUMULATIVE FREQUENCY DIAGRAMS (1)
Table 1:
| Lines of Code | No. of Programs |
|---|---|
| 100 - | 3 |
| 125 - | 12 |
| 150 - | 24 |
| 175 - | 42 |
| 200 - | 51 |
| 225 - | 39 |
| 250 - | 30 |
| 275 - | 21 |
| 300 - | 12 |
| 325 - 349 | 6 |
![Page 33: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/33.jpg)
CUMULATIVE FREQUENCY DIAGRAMS (2)
Table 2:
| Lines of Code (less than) | Cumulative Frequency |
|---|---|
| 100 | 0 |
| 125 | 3 |
| 150 | 15 |
| 175 | 39 |
| 200 | 81 |
| 225 | 132 |
| 250 | 171 |
| 275 | 201 |
| 300 | 222 |
| 325 | 234 |
| 350 | 240 |
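Table 2 follows from Table 1 by keeping a running total. The sketch below rebuilds it and then reads the median off the cumulative frequency diagram numerically. Joining the plotted points with straight lines is an assumption (a hand-drawn smooth curve would give a slightly different reading), and `read_off` is an illustrative name.

```python
from itertools import accumulate

# Table 1: class lower limits and number of programs in each class.
lower_limits = [100, 125, 150, 175, 200, 225, 250, 275, 300, 325]
programs = [3, 12, 24, 42, 51, 39, 30, 21, 12, 6]
CLASS_WIDTH = 25

# "Less than" points: nothing lies below 100, then one point per class.
less_than = [lower_limits[0]] + [lo + CLASS_WIDTH for lo in lower_limits]
cumulative = [0] + list(accumulate(programs))


def read_off(target, xs, cum):
    """x value at which the cumulative frequency reaches target.

    Assumes straight lines between neighbouring points of the diagram.
    """
    points = list(zip(xs, cum))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1:
            if c1 == c0:
                return x0
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the cumulative range")


median_loc = read_off(cumulative[-1] / 2, less_than, cumulative)
print(list(zip(less_than, cumulative)))
print(round(median_loc, 1))  # 219.1
```

The middle value on the cumulative frequency axis is 240 / 2 = 120, which falls between the points (200, 81) and (225, 132).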
![Page 34: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/34.jpg)
CUMULATIVE FREQUENCY DIAGRAMS (3)
Cumulative Frequency Curve
[Figure: cumulative frequency (vertical axis, 0 to 240) plotted against lines of code, less than (horizontal axis, 0 to 350)]
![Page 35: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/35.jpg)
STANDARD DEVIATION (1)
The Standard Deviation of a set of N numbers X1, X2, ..., XN is denoted by S.D. and is defined by
$$\text{S.D.} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}}$$
Where
$\bar{X}$ = Arithmetic Mean
$N$ = Total number of elements in the set
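A direct translation of this definition, as a sketch: the function name is illustrative, the divisor is N as in the formula above, and the data are the eight numbers used earlier for the arithmetic mean.

```python
from math import sqrt


def standard_deviation(values):
    """Square root of the mean squared deviation from the arithmetic mean."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of an empty set is undefined")
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


numbers = [7, 21, 13, 17, 23, 18, 9, 20]   # mean is 16
print(round(standard_deviation(numbers), 2))  # 5.41
```

The squared deviations from 16 add up to 234, so the result is the square root of 234 / 8 = 29.25.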
![Page 36: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/36.jpg)
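STANDARD DEVIATION (2) (GROUPED DATA)
If X1, X2, ..., Xn occur with frequencies ƒ1, ƒ2, ..., ƒn respectively, the standard deviation can be written as
$$\text{S.D.} = \sqrt{\frac{\sum_{j=1}^{n} f_j (X_j - \bar{X})^2}{\sum_{j=1}^{n} f_j}}$$
or
$$\text{S.D.} = \sqrt{\frac{\sum f_i X_i^2}{\sum f_i} - \left( \frac{\sum f_i X_i}{\sum f_i} \right)^2}$$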
![Page 37: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/37.jpg)
Question 6 c) NCC 1/93
On test, the actual access times for 50 hard disc drives were distributed as follows:
| Time (ms) | 22.6 | 22.7 | 22.8 | 22.9 | 23.0 | 23.1 | 23.2 | 23.3 |
|---|---|---|---|---|---|---|---|---|
| No. of Drives | 1 | 3 | 6 | 10 | 14 | 9 | 5 | 2 |

Calculate the mean access time and the standard deviation.
![Page 38: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/38.jpg)
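Alternative Question 6c
| x | f | fx | fx² |
|---|---|---|---|
| 22.6 | 1 | 22.6 | 510.76 |
| 22.7 | 3 | 68.1 | 1545.87 |
| 22.8 | 6 | 136.8 | 3119.04 |
| 22.9 | 10 | 229.0 | 5244.10 |
| 23.0 | 14 | 322.0 | 7406.00 |
| 23.1 | 9 | 207.9 | 4802.49 |
| 23.2 | 5 | 116.0 | 2691.20 |
| 23.3 | 2 | 46.6 | 1085.78 |
| Total | 50 | 1149.0 | 26405.24 |
| Marks | | 2 [1] | 2 [1] |

(1 mark for each total)
$$\text{Mean} = \frac{1149}{50} = 22.98 \quad [1]$$
$$\text{S.D.} = \sqrt{\frac{\sum f x^2}{\sum f} - (\bar{X})^2} = \sqrt{\frac{26405.24}{50} - (22.98)^2} = 0.156 \quad [1]$$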
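The working above can be checked in a few lines. This sketch uses the shortcut form of the grouped standard deviation, as the model answer does; the variable names are illustrative.

```python
from math import sqrt

times = [22.6, 22.7, 22.8, 22.9, 23.0, 23.1, 23.2, 23.3]   # x, in ms
drives = [1, 3, 6, 10, 14, 9, 5, 2]                        # f

sum_f = sum(drives)
sum_fx = sum(f * x for x, f in zip(times, drives))
sum_fx2 = sum(f * x * x for x, f in zip(times, drives))

mean = sum_fx / sum_f
sd = sqrt(sum_fx2 / sum_f - mean ** 2)

print(sum_f, round(sum_fx, 1), round(sum_fx2, 2))  # 50 1149.0 26405.24
print(round(mean, 2))                              # 22.98
print(round(sd, 3))                                # 0.156
```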
![Page 39: MELJUN CORTES -Types of Data](https://reader033.vdocuments.us/reader033/viewer/2022061205/5481a4585806b5f2048b4574/html5/thumbnails/39.jpg)
VARIANCE
The variance of a set of data is defined as the square of the standard deviation and is thus given by (S.D.)², i.e.
$$\text{Variance} = (\text{S.D.})^2 = \frac{\sum_{j=1}^{n} f_j (X_j - \bar{X})^2}{\sum_{j=1}^{n} f_j}$$
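As an illustration, the variance of the age distribution from the mean-of-a-frequency-distribution slide can be computed from this formula. The function name `grouped_variance` is an assumption of the sketch.

```python
from math import sqrt


def grouped_variance(values, freqs):
    """Frequency-weighted mean of squared deviations from the mean."""
    total = sum(freqs)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / total


ages = list(range(17, 27))                    # 17, 18, ..., 26
freqs = [3, 8, 14, 21, 24, 13, 7, 6, 3, 1]

variance = grouped_variance(ages, freqs)
sd = sqrt(variance)

print(round(variance, 4))  # 3.5971
print(round(sd, 2))        # 1.9
```

Squaring `sd` returns the variance, which is the relationship stated above.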