prelim (pstat)
TRANSCRIPT
-
8/6/2019 Prelim (Pstat)
Definition of Statistics
Introduction
Statistics is a set of tools used to organize and analyze data. Data
must either be numeric in origin or transformed by researchers into
numbers. For instance, statistics could be used to analyze
percentage scores English students receive on a grammar test: the
percentage scores ranging from 0 to 100 are already in numeric
form. Statistics could also be used to analyze grades on an essay by
assigning numeric values to the letter grades, e.g., A=4, B=3, C=2,
D=1 and F=0.
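As a minimal sketch of the transformation described above, the letter-to-number mapping can be written as a lookup table. The list of essay grades is hypothetical and only serves to illustrate the idea.

```python
# Numeric values assigned to letter grades, as in the text (A=4 ... F=0).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# Hypothetical essay grades for one class (made-up data).
essay_grades = ["A", "B", "B", "C", "F"]

# Transform the letters into numbers, then describe the group with a mean.
numeric = [GRADE_POINTS[g] for g in essay_grades]
mean_grade = sum(numeric) / len(numeric)
print(numeric, mean_grade)  # [4, 3, 3, 2, 0] 2.4
```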
Employing statistics serves two purposes: (1) description and (2)
prediction. Statistics are used to describe the characteristics of
groups. These characteristics are referred to as variables.
-
STATISTICS
Statistics is the science of the collection, organization, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments.
It is a branch of applied mathematics concerned with the collection and interpretation of quantitative data and the use of probability theory to estimate population parameters.
It is the science of making effective use of numerical data relating to groups of individuals or experiments.
It is the practice of collecting, organizing, describing, and analyzing data to draw conclusions that can be applied to a given purpose.
-
Importance of Statistics
In Computer Engineering:
To make systems (programs) with no flaws, for ISO certifications and process verifications.
To study repetitive operations in order to set standards.
For the data interpretation that is frequently needed in computer engineering.
Example: Engineers write a data-interpretation program that is easy for people to use and, at the same time, has no flaws.
-
In Astronomy:
Astronomy is one of the oldest branches of statistical study. It deals with the measurement of the distances, sizes, masses, and densities of heavenly bodies by means of observations. Errors are unavoidable during these measurements, so the most probable measurements are found by using statistical methods.
Example: An astronomer found 10 heavenly bodies that looked alike. After studying all 10, he found that 8 of them had the same characteristics.
-
In Mathematics:
Statistics is a branch of applied mathematics. A large number of statistical methods, such as probability, averages, dispersion, and estimation, are used in mathematics, and different techniques of pure mathematics, such as integration, differentiation, and algebra, are used in statistics.
Example: Some problems in mathematics require analysis, interpretation, and explanation before they can be solved.
-
In Economics:
Statistics plays an important role in economics. In economic research, statistical methods are used for collecting and analyzing data and for testing hypotheses. The relationship between supply and demand is studied by statistical methods; imports and exports, the inflation rate, and per capita income are problems that require a good knowledge of statistics.
Example: A sole proprietor wants to know which product is most in demand today and what that product costs.
-
In Health and Medicine:
Medical statistics deals with the application of biostatistics to medicine and the health sciences, including epidemiology, public health, forensic medicine, and clinical research. It provides data on indicators of the nation's health, such as health inequalities, morbidity rates, smoking, drinking and drug use, and abortion statistics.
Example: In pharmacological research, statistics is used to summarize experimental data (descriptive statistics) in terms of central tendency (mean or median) and variability (standard deviation, standard error of the mean, confidence interval, or range), but more importantly it enables us to conduct hypothesis testing.
-
Basic Statistical Terms
VARIABLE - is a quantity that may assume any of a set of values.
Examples are monthly income, average grade, volume, price, and so forth.
CONSTANT - is a quantity that does not change its value.
Example: The mathematical symbol π (the Greek letter pi) is a constant because its value does not change; it is always approximately equal to 3.1416. Likewise, the equivalence of 2.54 centimeters to an inch is a constant.
-
TYPES OF DATA
UNGROUPED (or RAW) DATA - are data that are not organized in any specific way. They are simply the collection of data as they are gathered. While some computations may be made on this kind of data, analysis and interpretation are difficult until the data are organized.
INTERVALS         NON-CUMULATIVE    CUMULATIVE
Lower    Upper    Abs.     Rel.     Abs.     Rel.
1.00     1.50     24       .169     24       .169
1.50     2.00     24       .169     48       .338
2.00     2.50     21       .148     69       .486
-
PRIMARY DATA - are measured and gathered by the researcher who publishes them.
SECONDARY DATA - are republished by another researcher or agency.
POPULATION - is the entire collection of all possible observations of a particular characteristic of interest.
Example: the population of the grades of all students who took an entrance examination; the monthly incomes of all employees in the government.
SAMPLE - is a representative set of observations that reflects the characteristics of the whole, that is, the population from which it is taken.
-
PARAMETER - is any statistical characteristic of a population.
Example: the Mean and the Standard Deviation. Thus, we say the population Mean is a parameter of the population.
STATISTIC - is any statistical characteristic of a sample, such as the Mean and the Standard Deviation. The Sample Mean is a sample statistic.
FREQUENCY DISTRIBUTION - is a tabulation of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way the table summarizes the distribution of values in the sample.
-
For example, the heights of the students in a class could be
organized into the following frequency table.
Height range      Number of students    Cumulative number
4.5-5.0 feet      25                    25
5.0-5.5 feet      35                    60
5.5-6.0 feet      20                    80
6.0-6.5 feet      20                    100
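A frequency table like the one above could be built from raw observations roughly as follows. This is only a sketch: the list of heights is made up, and treating each class as including its lower limit but not its upper limit (except for the last class) is an assumption, since the slides do not say how a value on a class limit is counted.

```python
from itertools import accumulate

# Hypothetical raw heights in feet (illustrative values, not from the slides).
heights = [4.6, 4.9, 5.1, 5.2, 5.4, 5.6, 5.8, 6.1, 6.3, 5.0]

# Class intervals [lower, upper); the last class also includes its upper limit.
classes = [(4.5, 5.0), (5.0, 5.5), (5.5, 6.0), (6.0, 6.5)]

def frequency_table(values, intervals):
    """Return (interval, frequency, cumulative frequency) rows."""
    counts = []
    for index, (lower, upper) in enumerate(intervals):
        is_last = index == len(intervals) - 1
        count = sum(
            1 for v in values
            if lower <= v < upper or (is_last and v == upper)
        )
        counts.append(count)
    cumulative = list(accumulate(counts))
    return list(zip(intervals, counts, cumulative))

table = frequency_table(heights, classes)
for (lower, upper), freq, cum in table:
    print(f"{lower:.1f}-{upper:.1f} feet  {freq:>3}  {cum:>3}")
```

The last cumulative number always equals the total number of observations, which is a quick check that no value fell outside the classes.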
-
LEVELS / SCALES OF MEASUREMENT
Statistical operations on numerical values depend upon the nature of such values. Numerical values may be categorized by level of measurement, namely nominal, ordinal, interval, and ratio (Siegel & Castellan, 1988).
NOMINAL LEVEL - is the crudest form of measurement. Numbers or symbols are used for the purpose of categorizing items into groups. The categories are mutually exclusive; that is, being in one category automatically excludes being in another.
Example:
Sex: M - Male, F - Female
Faculty Tenure: 1 - Tenured
-
ORDINAL LEVEL - is an improvement on the nominal level. Data are ranked from "bottom to top" or "low to high", so statements of the kind "greater than" or "less than" may be made here.
Example:
Class Standing: (Excellent, Good, Poor)
Teacher's Evaluation: 1 - Poor, 2 - Fair, 3 - Good, 4 - Very Good
-
INTERVAL LEVEL - possesses the properties of the nominal and ordinal levels. The distances between any two numbers on the scale are known, but the scale does not have a true zero point.
Example: Consider the I.Q. scores of four students: 90, 150, 85, and 145. Here we can say that the difference between 90 and 150 is the same as the difference between 85 and 145, but we cannot claim that the student who scored 150 is 150/90 (about 1.67) times as intelligent as the student who scored 90.
RATIO LEVEL - possesses all the properties of the nominal, ordinal, and interval levels and, in addition, has an absolute zero point. Data can be classified and placed in a proper order, and we can compare the magnitudes of these data.
Example: Age, income, exam scores, performance, ratings, grades of students, and tuition fees are examples of ratio variables.
-
Sampling Techniques
Sampling Technique or Sampling Plan
Is the procedure of gathering sampling units from the population.
Is the method of selecting a sample of size n from a universe of size N such that each member of the population has an equal chance of being included in the sample and all possible combinations of size n have an equal chance of being selected as the sample.
Is the technique of drawing a sample from the population. Sampling is applied when not all elements of the population are available or when the population size is too large.
-
A. Probability Sampling Technique
Is a sampling technique wherein each population unit has an equal chance of being drawn or selected as a member of the sample.
Is a sampling technique in which the probability of getting any particular sample may be calculated.
1. Random Sampling
Is a basic type of probability sampling. Using this technique, each individual in the population has an equal chance of being drawn into the sample, and all possible combinations of size n have an equal chance of being selected as the sample.
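A simple random sample can be sketched with Python's standard `random` module. The population of 50 numbered members, the sample size, and the seed are all hypothetical; the seed is there only so the illustration is reproducible.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is only for a reproducible illustration
    return rng.sample(population, n)

# Hypothetical population: members numbered 1..N with N = 50.
population = list(range(1, 51))
sample = simple_random_sample(population, n=5, seed=2019)
print(sample)
```

`random.sample` draws without replacement, so no member can appear twice in the sample.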
-
1.1. Lottery Sampling or Raffle Sampling
The lottery sampling method is usually carried out by assigning a number to each member of the population. The numbered items are placed in a container and thoroughly mixed, and elements are drawn as needed.
1.2. Table of Random Numbers
The selection of each member of the population is left entirely to chance, and every member of the population has an equal chance of being chosen.
1.2.1. Direct Selection Method
Is used when there are only a few sample units to be selected.
-
1.2.2. Remainder Method
Is used whenever the direct selection method cannot be applied. There are two ways of conducting the remainder method:
1. When the number taken from the Table of Random Numbers is subtracted from the upper limit within which this number falls, the remainder is the sample unit.
2. When the upper limit of the set is subtracted from the number taken from the Table of Random Numbers and yields a number equal to or less than N, the remainder is the sample unit.
-
2. Systematic Sampling
Used to select the members of the sample from a large population. With this method, every kth element of the population is picked as a member of the sample.
The most common form of systematic sampling is an equal-probability method, in which every kth element in the frame is selected, where k, the sampling interval (sometimes known as the skip), is calculated as:
k = N / n
where n is the sample size and N is the population size.
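The procedure can be sketched as follows. Two details are assumptions that the slides do not state: the interval is taken as the integer part of N / n, and the starting point is chosen at random within the first interval. The frame of 100 numbered units is hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every k-th element after a random start, with k = N // n."""
    N = len(population)
    k = N // n  # sampling interval (the "skip"); integer division is an assumption
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start within the first interval
    return [population[start + i * k] for i in range(n)]

population = list(range(1, 101))  # hypothetical frame, N = 100
sample = systematic_sample(population, n=10, seed=1)
print(sample)
```

With N = 100 and n = 10 the interval is k = 10, so consecutive sampled units are always 10 positions apart.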
-
3. Stratified Sampling
In this technique, the set of interest is divided into groups or aggregates from which the actual sampling is done. The population is subdivided into at least two different subpopulations (or strata) whose members share the same characteristics, and then the elements of the sample are drawn from each stratum proportionately.
Determining sample size
SLOVIN'S FORMULA
n = N / (1 + N e^2)
where:
n = sample size
N = population size
e = margin of error
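As a worked sketch, Slovin's formula and a proportional split of the resulting sample across strata could look like this. The stratum names and sizes are made up, rounding the sample size up to a whole unit is an assumption, and the simple rounding used for the allocation is not guaranteed to add up to exactly n for every input.

```python
import math

def slovin_sample_size(N, e):
    """Slovin's formula n = N / (1 + N * e**2), rounded up to a whole unit."""
    return math.ceil(N / (1 + N * e ** 2))

def proportional_allocation(strata_sizes, n):
    """Share n among strata in proportion to stratum size (simple rounding)."""
    N = sum(strata_sizes.values())
    return {name: round(n * size / N) for name, size in strata_sizes.items()}

# Hypothetical strata sizes; the names and counts are made up for illustration.
strata = {"first_year": 500, "second_year": 300, "third_year": 200}
n = slovin_sample_size(sum(strata.values()), e=0.05)
allocation = proportional_allocation(strata, n)
print(n, allocation)
```

For N = 1000 and e = 0.05 the formula gives 1000 / 3.5, about 285.7, so 286 units are sampled and split 143, 86, and 57 across the three strata.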
-
4. Cluster or Area Sampling
Is a sampling technique wherein groups or clusters, instead of individuals, are randomly chosen.
This involves dividing the population into non-overlapping clusters.
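A minimal sketch of the idea, assuming the clusters are already defined: whole clusters are drawn at random and every member of a chosen cluster enters the sample. The cluster names and members below are hypothetical.

```python
import random

# Hypothetical non-overlapping clusters (e.g., areas), each listing its members.
clusters = {
    "area_1": ["a1", "a2", "a3"],
    "area_2": ["b1", "b2"],
    "area_3": ["c1", "c2", "c3", "c4"],
    "area_4": ["d1", "d2"],
}

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {name: clusters[name] for name in chosen}

selected = cluster_sample(clusters, num_clusters=2, seed=7)
print(selected)
```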
-
B. Non-Probability Sampling Technique
Is a sampling technique wherein the sample units do not have equal chances of being drawn.
Is a sampling technique wherein members of the sample are drawn from the population based on the judgment of the researchers.
Examples of non-probability sampling include:
a. Convenience or Haphazard Sampling - Members of the population are chosen based on their relative ease of access. Sampling friends, co-workers, or shoppers at a single mall are all examples of convenience sampling.
b. Snowball Sampling - The first respondent refers a friend. The friend also refers a friend, and so on.
-
c. Judgmental Sampling or Purposive Sampling - The researcher chooses the sample based on who they think would be appropriate for the study. This is used primarily when there is a limited number of people who have expertise in the area being researched.
d. Deviant Case - Get cases that substantially differ from the dominant pattern (a special type of purposive sample).
e. Case Study - The research is limited to one group, often with a similar characteristic or of small size.
f. Ad Hoc Quotas - A quota is established (say, 65% women) and researchers are free to choose any respondent they wish as long as the quota is met.
-
Collection of Data
Any statistical investigation must be based on accurate data. To ensure the accuracy of data, one must know the right sources and methods of collecting them.
1. Primary data
Refer to information gathered directly from an original source, or based on direct or first-hand experience.
Examples: First-person accounts, autobiographies, and diaries.
-
2. Secondary data
Refer to information taken from published or unpublished data that were previously gathered by other individuals or agencies.
Examples: Published books, newspapers, magazines, biographies, business reports, and the like.
-
Methods used in the collection of data
The direct or interview method
This is a method of person-to-person exchange between the interviewer and the interviewee. The interview method provides consistent and more precise information since clarification may be given by the interviewee.
The indirect or questionnaire method
Written responses are given to prepared questions. A questionnaire is a list of questions intended to elicit answers to the problems of a study. This method is inexpensive and can cover a wide area in a shorter span of time.
-
The registration method
This method of gathering information is enforced by certain laws. Examples are the registration of births, deaths, motor vehicles, marriages, and licenses. The advantage of this method is that information is kept systematized.
The observation method
In this method, the investigator observes the behavior of persons or organizations and their outcomes. It is usually used when the subjects cannot talk or write.
The experiment method
This method is used when the objective is to determine the cause-and-effect relationship of certain phenomena under controlled conditions. Scientific researchers usually use the experiment method.
-
Types of questions
Structured question
This is a type of question that leaves only one way or a few alternative ways of answering it. Here are some examples of this type of question.
Unstructured or open-ended questions
As the name suggests, these are questions which can be answered in many ways. Probing questions or questions that seek to elicit reasons are normally of this type.
-
Presenting Data
Textual form
Presenting data in paragraph form.
Example:
There are 104 students enrolled in the College of Education: 54 students are English majors, 25 are Mathematics majors, and 25 are Filipino majors.
-
Tabular form
The data are placed in a table.
Example:
Age Interval (yrs)    Frequency
15-19                 13
20-24                 15
25-29                 20
30-34                 10
-
Graphical presentation
The data are presented in diagrammatic form. The graphical representation of data makes the reading more interesting, less time-consuming, and easily understandable.
Kinds of graphical presentation:
Pie Chart
A pie chart is a way of summarizing a set of categorical data. It is a circle which is divided into segments. Each segment represents a particular category.
-
Bar Chart
A bar chart is a way of summarizing a set of categorical data. It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient form. It displays the data using a number of rectangles of the same width, each of which represents a particular category, for example, an age group or a religious affiliation. The length (and hence area) of each rectangle is proportional to the number of cases in the category it represents.
-
Histogram
A histogram is a way of summarizing data that are measured on an interval scale (either discrete or continuous). It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient form. It divides up the range of possible values in a data set into classes or groups.
-
Frequency distribution
Frequency distribution is a tabulation of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way the table summarizes the distribution of values in the sample.
Raw data is a term for data collected at the source which has not been subjected to processing or any other manipulation; it is also known as primary data.
An array is a systematic arrangement of objects, usually in rows and columns.
-
The range of a sample (or a data set) is a measure of the spread or the dispersion of the observations. It is the difference between the largest and the smallest observed value of some quantitative characteristic and is very easy to calculate.
The class mark is the average of the values of the class limits for a given class. A class mark is also called a mid-value or central value and is usually denoted by x.
When data are continuous, class boundaries are used. We get the class boundaries by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit.
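The three quantities just defined can be computed as below. This is a sketch under one assumption: the 0.5 adjustment fits class limits stated in whole units, such as the age classes 15-19 and 20-24 used earlier. The list of ages is hypothetical.

```python
def data_range(values):
    """Range = largest observed value minus smallest observed value."""
    return max(values) - min(values)

def class_mark(lower_limit, upper_limit):
    """Class mark = average of the two class limits."""
    return (lower_limit + upper_limit) / 2

def class_boundaries(lower_limit, upper_limit, half_unit=0.5):
    """Boundaries: subtract 0.5 from the lower limit, add 0.5 to the upper."""
    return lower_limit - half_unit, upper_limit + half_unit

ages = [15, 18, 22, 24, 27, 29, 31, 34]  # hypothetical observations
print(data_range(ages))          # 19
print(class_mark(15, 19))        # 17.0
print(class_boundaries(15, 19))  # (14.5, 19.5)
```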
-
Frequency polygons are a graphical device for understanding the shapes of distributions. They serve the same purpose as histograms, but are especially helpful in comparing sets of data. Frequency polygons are also a good choice for displaying cumulative frequency distributions. To create a frequency polygon, start just as for histograms, by choosing a class interval. Then draw an X-axis representing the values of the scores in your data. Mark the middle of each class interval with a tick mark, and label it with the middle value represented by the class. Draw the Y-axis to indicate the frequency of each class. Place a point in the middle of each class interval at the height corresponding to its frequency. Finally, connect the points. You should include one class interval below the lowest value in your data and one above the highest value.
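The points to be connected can be computed before any drawing is done. The sketch below uses the class limits and frequencies from the age table in the tabular-form example, assumes all classes have the same width, and adds one empty class on each side as the steps above recommend. The function name is a hypothetical helper.

```python
def frequency_polygon_points(class_limits, frequencies):
    """Return (midpoint, frequency) pairs, padded with one empty class per side."""
    width = class_limits[1][0] - class_limits[0][0]  # assumes equal class widths
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))

# Class limits and counts from the age table shown earlier in the slides.
limits = [(15, 19), (20, 24), (25, 29), (30, 34)]
counts = [13, 15, 20, 10]
points = frequency_polygon_points(limits, counts)
print(points)
```

The two padded points have a frequency of 0, which is what brings the polygon back down to the X-axis at both ends.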
-
Cumulative Frequency
The cumulative frequency corresponding to a particular value is the sum of all the frequencies up to and including that value.
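A running total is all that is needed here. The sketch below reuses the class frequencies from the age table in the tabular-form example; the relative cumulative column is an added illustration, rounded to three decimals.

```python
from itertools import accumulate

# Class frequencies from the age table earlier in the slides (15-19 ... 30-34).
frequencies = [13, 15, 20, 10]

# Each cumulative frequency is the running total up to and including that class.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
relative_cumulative = [round(c / total, 3) for c in cumulative]

print(cumulative)           # [13, 28, 48, 58]
print(relative_cumulative)  # [0.224, 0.483, 0.828, 1.0]
```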
-
Ogive (a cumulative line graph) - Is best used when you want to display the total at any given time. The relative slopes from point to point will indicate greater or lesser increases.
-
Summation Notation
Summation is the operation of combining a sequence of numbers using addition. The numbers to be summed may be integers, rational numbers, real numbers, or complex numbers. Besides numbers, other types of values can be added as well: vectors, matrices, polynomials, and in general elements of any additive group.
EXAMPLES:
Σ_{i=1}^{4} (2 + i^2) = (2 + 1^2) + (2 + 2^2) + (2 + 3^2) + (2 + 4^2)
                      = 3 + 6 + 11 + 18
                      = 38
-
Σ_{i=1}^{6} 2 = 2 + 2 + 2 + 2 + 2 + 2
              = 12
Σ_{i=1}^{5} (5 - i) = (5 - 1) + (5 - 2) + (5 - 3) + (5 - 4) + (5 - 5)
                    = 4 + 3 + 2 + 1 + 0
                    = 10
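The three sums can be checked mechanically. In this sketch `summation` is a hypothetical helper, not a standard function; it adds term(i) for every integer i from the lower limit to the upper limit. The first sum evaluates to 3 + 6 + 11 + 18 = 38.

```python
def summation(term, start, stop):
    """Sum term(i) for i = start, ..., stop (both ends included)."""
    return sum(term(i) for i in range(start, stop + 1))

print(summation(lambda i: 2 + i ** 2, 1, 4))  # 38
print(summation(lambda i: 2, 1, 6))           # 12
print(summation(lambda i: 5 - i, 1, 5))       # 10
```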