stat 1 report f 202
Post on 07-Apr-2018
8/3/2019 Stat 1 Report F 202
TERM PAPER
On
Dispersion: A Statistical Tool
Department of Finance
University of Dhaka
Submitted To
Dr. M. Khairul Hossain
Professor
Department of Finance
University of Dhaka
Submitted By
GROUP-2
Apel Mahmood Rifat 15-007
Sumaiya Amena 15-051
Shakira Mahzabeen 15-085
Khairul Bashar 15-153
Date of Submission: July 24, 2010
Letter of Transmittal
July 24, 2010
Dr. M. Khairul Hossain
Professor
Department of Finance
University of Dhaka.
Subject: Submission of the report "Dispersion: A Statistical Tool"
Dear Sir,
We are pleased to submit the report that you assigned to us as a partial requirement for the course Business Statistics I (F-202). The report is prepared on Dispersion: A Statistical Tool.
We sincerely hope that you will enjoy going through this report, as we took great pleasure in preparing it. If any other information is required for further clarification, we will be pleased to provide it.
We thank you heartily. We tried our best to make this report as good as possible, and we think it can serve as a tool for solving business decision problems.
Finally, we would like to thank you for giving us the opportunity to work on such an interesting report; we enjoyed preparing it and learned a lot in the process.
Sincerely,
Apel Mahmood Rifat
On behalf of group 2
15th batch, Section-A
Department of Finance
University of Dhaka
Acknowledgement
We cannot claim all the credit for the completion of this study. Many people helped us by providing valuable information, advice and guidance so that the report could be completed on schedule.
A course report is an essential part of the BBA program, as one can gather practical knowledge within a short period by observing and doing the work of the chosen topic. In this regard, our report has been prepared on Dispersion.
First, we offer our thanks to almighty Allah for helping us to complete all the work.
We would like to express our gratitude to our supervising course teacher, Prof. Dr. M. Khairul Hossain, who instructed us in the right way and gave us proper guidelines for preparing this report.
Finally, we must mention the wonderful working environment and group commitment that enabled us to do and observe a great deal during our time.
Table of Contents
Serial No.  Subject
01.  Executive Summary
02.  Introduction
03.  Macroenvironmental Forces
04.  Pharmaceuticals Industry
05.  Square Pharmaceuticals Ltd.
06.  Impact of Macroenvironment in Square Pharma
07.  Beximco Pharmaceuticals Ltd.
08.  Macro environmental factors affecting Beximco Pharma to launch a new product
09.  Findings
10.  Conclusion
11.  Reference
12.  Bibliography
Introduction
Origin of the report
This report was prepared under the academic supervision of our course teacher, Prof. Dr. M. Khairul Hossain, Department of Finance, University of Dhaka, as a requirement of the Business Statistics course. The topic is Dispersion: A Statistical Tool.
Methodology
The methodology of the report is inductive. The report is based on secondary information.
Secondary Information: The secondary sources of data are reference books, websites, etc.
Key Parts of the report
The main aim of the report is to discuss dispersion as a statistical tool. Different measures of dispersion and their uses are discussed in this report.
Objectives of the report
Broad Objectives: The main objective of the study is to evaluate the impact of macro-environment forces on the decision to launch a new product in the pharmaceutical industry.
Specific Objectives:
To become acquainted with the pharmaceutical industry
To gain clear knowledge of macro-environment forces
To learn about the new product launching process of the pharmaceutical industry
To gain practical knowledge of marketing theory
Scope
In this report, we first cover the preliminary concept of dispersion. Then we go on to the classification of dispersion factors on launching a new product of a pharmaceuticals company.
Limitations
We faced certain limitations in preparing the report.
Unavoidable conditions:
Some unavoidable conditions also had a deterring effect on preparing the report.
Restrictions that we faced:
Lack of information, lack of technology, etc. were the main restrictions.
Absence of some information regarding data compilation:
While collecting data, we faced problems. Some information that was really essential was hard to collect.
Introduction
While measures of central tendency are used to estimate "normal" values of a dataset,
measures of dispersion are important for describing the spread of the data, or its variation
around a central value. Two distinct samples may have the same mean or median, but
completely different levels of variability, or vice versa. A proper description of a set of data
should include both of these characteristics. There are various methods that can be used to
measure the dispersion of a dataset, each with its own set of advantages and disadvantages.
In statistics, statistical dispersion (also called statistical variability or variation) is variability or spread in a variable or a probability distribution. Common examples of measures of statistical dispersion are the variance, standard deviation and interquartile range.
Measures of dispersion express quantitatively the degree of variation or dispersion of values in a population or in a sample. Along with measures of central tendency, measures of dispersion are widely used in practice as descriptive statistics. Some measures of dispersion are the standard deviation, the average deviation, the range and the interquartile range.
For example, the dispersion in the sample of 5 values (98, 99, 100, 101, 102) is smaller than the dispersion in the sample (80, 90, 100, 110, 120), although both samples have the same central location, 100, as measured by, say, the mean or the median. Most measures of dispersion would be 10 times greater for the second sample than for the first one (although the values themselves may be different for different measures of dispersion).
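The comparison of the two five-value samples can be checked with a minimal Python sketch. The function name `spread_summary` is our own label, and the population form of the standard deviation is an assumption made only for illustration.

```python
from statistics import pstdev

def spread_summary(values):
    """Return (range, population standard deviation) of a sample."""
    value_range = max(values) - min(values)
    return value_range, pstdev(values)

narrow = [98, 99, 100, 101, 102]
wide = [80, 90, 100, 110, 120]

for name, sample in (("narrow", narrow), ("wide", wide)):
    value_range, sd = spread_summary(sample)
    print(f"{name}: range = {value_range}, standard deviation = {sd:.3f}")
```

Both measures come out ten times larger for the second sample, even though the two samples share the same mean and median.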
Dispersion is contrasted with location or central tendency, and together they are the most
used properties of distributions.
A measure of statistical dispersion is a real number that is zero if all the data are identical,
and increases as the data becomes more diverse. It cannot be less than zero.
Most measures of dispersion have the same scale as the quantity being measured. In other
words, if the measurements have units, such as meters or seconds, the measure of dispersion
has the same units.
Importance
A study of dispersion enables us to get additional information about the composition of data. Confining ourselves to the mean will not provide this vital information.
Central tendency only gives information on the location of the data. Dispersion defines the spread of the data. In addition, shape should also be part of the defining criteria of data. So location, spread and shape together are the best measures to define data.
Two different sets of data can have different means but the same variability. On the other hand, two sets of data can have the same mean but different variability.
[Figure: Curves A and B have the same mean but different variability.]
[Figure: Curves A and B have different means but the same variability.]
Variability, or variation, is closely connected with human life, and its study is very important for mankind. The total area of the earth may not be very important to a research-minded person, but the area under different crops, the area covered by forests, and the area covered by residential and commercial buildings are figures of great importance because these figures keep changing from time to time and from place to place. A very large number of experts are engaged in the study of changing phenomena. Experts working in different countries of the world keep a watch on the forces that are responsible for bringing changes in the fields of human interest. Agricultural, industrial and mineral production and their transportation from one part of the world to another are matters of great interest to economists, statisticians and other experts. Changes in human population, in the standard of living, in the literacy rate and in prices attract experts to make detailed studies of them and then to relate these changes to human life.
The study of dispersion is very important in statistical work. It helps to:
Test the reliability of an average
Control the variability
Compare two or more sets of data with respect to their variability
Facilitate the use of other statistical techniques
If the wages of workers in a certain factory are consistent, the workers will be satisfied. But if some workers have high wages and some have low wages, there will be unrest among the low-paid workers, and they might go on strike and arrange demonstrations. If in a certain country some people are very poor and some are very rich, we say there is economic disparity. It means that dispersion is large.
The idea of dispersion is important in the study of wages of workers, prices of commodities, standard of living of different people, distribution of wealth, distribution of land among farmers and various other fields of life.
Measures of dispersion are known as averages of the second order because they indicate the average deviation of individual observations from the mean.
Measures of dispersion can be described in two forms:
1. Absolute form
2. Relative form
A graphical representation is given in the following:
[Figure: Measures of Dispersion. Absolute form: Range, Quartile Deviation, Mean Deviation, Standard Deviation. Relative form: Coefficient of Range, Coefficient of Quartile Deviation, Coefficient of Mean Deviation, Coefficient of Variation.]
Range: Among the several measures of dispersion, the range is the first measure of the absolute form. It is the difference between the largest and the smallest values in a data set and is known as the simplest measure of dispersion. However, the range only provides information about the maximum and minimum values and does not say anything about the values in between. In the form of an equation, after arranging the data in order, it is:
$$\text{Range} = \text{Largest value} - \text{Smallest value}$$
The range is widely used in statistical process control (SPC) applications because it is very easy to calculate and understand.
Quartile Deviation: The quartile deviation is half the difference between the upper and lower quartiles in a distribution. It is a measure of the spread through the middle half of a distribution. It can be useful because it is not influenced by extremely high or extremely low scores. Quartile deviation is an ordinal statistic and is most often used in conjunction with the median. The formula to calculate quartile deviation is:
$$QD = \frac{Q_3 - Q_1}{2}$$
Where, QD = Quartile Deviation
Q3 = Third Quartile
Q1 = First Quartile
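A minimal Python sketch of the range and the quartile deviation follows. The data are the six share prices used in the report's later coefficient-of-range example. The report does not say which quartile convention it uses, so the sketch assumes the default "exclusive" method of `statistics.quantiles`, which locates quartiles at positions based on n + 1; the function names are our own.

```python
from statistics import quantiles

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def quartile_deviation(values):
    """Half the distance between the third and first quartiles."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4)
    return (q3 - q1) / 2

prices = [200, 210, 208, 160, 220, 250]
print("Range:", value_range(prices))
print("Quartile deviation:", quartile_deviation(prices))
```

Under this convention Q1 = 190 and Q3 = 227.5, so the quartile deviation is 18.75, while the range is 90. A different quartile convention would give slightly different quartiles.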
Mean Deviation: A defect of the range is that it is based on only two values, the highest and the lowest. It does not take into consideration all of the values. The mean deviation does. It measures the mean amount by which the values in a population or sample vary from their mean. By definition, the mean deviation is the arithmetic mean of the absolute values of the deviations from the arithmetic mean. The formula is:
$$MD = \frac{\sum |X - \bar{X}|}{n}$$
Where, X is the value of each observation
$\bar{X}$ is the arithmetic mean of the values
n is the number of observations in the sample
| | indicates the absolute value
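The mean deviation formula can be sketched in a few lines of Python. The function name `mean_deviation` is our own, and the data are the second five-value sample from the Introduction.

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n

sample = [80, 90, 100, 110, 120]
print(mean_deviation(sample))
```

Here the deviations from the mean of 100 are 20, 10, 0, 10 and 20, so the mean deviation is 60 / 5 = 12.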
Standard Deviation: The variance and the standard deviation are also based on the deviations from the mean. However, instead of using the absolute value of the deviations, the variance and the standard deviation square the deviations.
Features of standard deviation are as follows:
The standard deviation is the square root of the sample variance.
It is defined so that it can be used to make inferences about the population variance.
It is calculated using the formula:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
The values computed in the squared term, $x_i - \bar{x}$, are anomalies, which are discussed in another section.
It is not restricted to large sample data sets, compared to the root mean square anomaly.
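Variance: The arithmetic mean of the squared deviations from the mean is known as the variance. The variance is nonnegative and is zero only if all the observations are the same. The formula is:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
Where, X is the value of each observation, $\mu$ is the arithmetic mean of the population and N is the number of observations in the population.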
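To make the difference between the two divisors concrete, the sketch below computes the variance both as the arithmetic mean of squared deviations (divisor N) and in the sample form (divisor n - 1), then takes square roots for the standard deviations. The function names and the data are illustrative assumptions, not taken from the report.

```python
from math import sqrt

def population_variance(values):
    """Arithmetic mean of the squared deviations from the mean (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def sample_variance(values):
    """Sum of squared deviations divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

data = [80, 90, 100, 110, 120]
print("Population variance:", population_variance(data))
print("Sample variance:", sample_variance(data))
print("Population SD:", round(sqrt(population_variance(data)), 3))
print("Sample SD:", round(sqrt(sample_variance(data)), 3))
```

The squared deviations sum to 1000, giving 200 with divisor 5 and 250 with divisor 4; the gap between the two forms shrinks as the number of observations grows.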
Measures of Relative Dispersion
A measure of relative variation is the ratio of a measure of absolute variation to an average. It is sometimes called a coefficient of variation because a coefficient is a pure number that is independent of the unit of measurement. It should be remembered that, while computing the relative variation, the average used as the base should be the same one from which the absolute variations were measured.
The relative measures, taken in turn, are the coefficient of range, the coefficient of quartile deviation, the coefficient of mean deviation and the coefficient of variation.
Coefficient of range
The relative measure corresponding to the range, called the coefficient of range, is obtained by applying the following formula:
$$\text{Coefficient of range} = \frac{L - S}{L + S}$$
where L is the largest value and S is the smallest value.
In a frequency distribution, the range is taken as the difference between the upper limit of the highest class and the lower limit of the lowest class.
Example:
The following are the prices of shares of a company from Monday to Saturday:
Day        Price    Day        Price
Monday     200      Thursday   160
Tuesday    210      Friday     220
Wednesday  208      Saturday   250
Solution:
Here, Largest value (L) = 250 and Smallest value (S) = 160
Range = L - S = 250 - 160 = 90
$$\text{Coefficient of range} = \frac{L - S}{L + S} = \frac{250 - 160}{250 + 160} = \frac{90}{410} \approx 0.2195$$
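The share price example can be reproduced with a short Python sketch. The function name `coefficient_of_range` is our own; the prices are those given in the table.

```python
def coefficient_of_range(values):
    """(largest - smallest) / (largest + smallest), a unit-free ratio."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

prices = {
    "Monday": 200, "Tuesday": 210, "Wednesday": 208,
    "Thursday": 160, "Friday": 220, "Saturday": 250,
}
print(round(coefficient_of_range(list(prices.values())), 4))
```

The largest price is 250 and the smallest is 160, so the ratio is 90 / 410, or about 0.2195.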
Coefficient of quartile deviation
The relative measure corresponding to the quartile deviation, called the coefficient of quartile deviation, is calculated as follows:
$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
The coefficient of quartile deviation can be used to compare the degree of variation in different distributions.
Coefficient of mean deviation
The relative measure corresponding to the mean deviation, called the coefficient of mean deviation, is calculated as follows:
$$\text{Coefficient of mean deviation} = \frac{MD}{\text{average from which the deviations were measured}}$$
If the mean has been used while calculating the mean deviation, the coefficient of mean deviation is obtained by dividing the mean deviation by the mean.
Coefficient of variation
The relative measure corresponding to the standard deviation is called the coefficient of variation. This measure, developed by Karl Pearson, is the most commonly used measure of relative variation. It is used in problems where we want to compare the variability of two or more series. The coefficient of variation, denoted by C.V., is obtained as follows:
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100$$
where $\sigma$ is the standard deviation and $\bar{X}$ is the arithmetic mean.
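Percentile: If the data are arranged in ascending order, the values that divide them into one hundred equal parts are called percentiles.
$$P_i = \text{value of the } \frac{i(n+1)}{100}\text{th observation}; \quad i = 1, 2, 3, \ldots, 99$$
If $\frac{i(n+1)}{100}$ is a fraction with whole part $k$ and fractional part $d$, then
$$P_i = x_k + d\,(x_{k+1} - x_k)$$
where $x_k$ is the $k$th value in the ordered data.
For a frequency distribution,
$$P_i = l + \frac{\frac{iN}{100} - F}{f} \times c$$
where $l$ is the lower limit of the percentile class, N is the total frequency, F is the cumulative frequency before that class, f is the frequency of that class and c is the class width.
Example: Find the 50th percentile of 2, 4, 6, 8, 10, 12, 14, 16, 18.
Solution: Here, n = 9
$$P_{50} = \frac{50(9+1)}{100}\text{th value} = 5\text{th value} = 10$$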
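The percentile example can be sketched in Python under two stated assumptions: that the position of the i-th percentile is i(n + 1)/100 in the ordered data, with linear interpolation when the position is fractional, and that the worked example asks for the 50th percentile (the answer of 10 is the 5th of the 9 ordered values). The function name `percentile` is our own.

```python
def percentile(values, i):
    """i-th percentile (1 <= i <= 99) at position i(n + 1)/100, interpolated."""
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / 100   # 1-based position in the ordered data
    k = int(position)              # whole part of the position
    d = position - k               # fractional part of the position
    if k < 1:                      # position falls before the first value
        return ordered[0]
    if k >= n:                     # position falls at or beyond the last value
        return ordered[-1]
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]
print(percentile(data, 50))
print(percentile(data, 25))
```

For the 50th percentile the position is 5, giving the value 10. For the 25th percentile the position is 2.5, halfway between 4 and 6, giving 5.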
Decile: In descriptive statistics, a decile is any of the nine values that divide the sorted data
into ten equal parts, so that each part represents 1/10 of the sample or population. Thus:
The 1st decile cuts off the lowest 10% of data, i.e., the 10th percentile.
The 5th decile cuts off the lowest 50% of data, i.e., the 50th percentile, 2nd quartile, or median.
The 9th decile cuts off the lowest 90% of data, i.e., the 90th percentile.
Empirical Rule:
The empirical rule provides significant information about the distribution of data around the mean when the distribution is approximately normal.
1. The mean ± one standard deviation contains approximately 68.26% of the measurements in the series.
2. The mean ± two standard deviations contains approximately 95.5% of the measurements in the series.
3. The mean ± three standard deviations contains approximately 99.7% of the measurements in the series.
Climatologists often use standard deviations to help classify abnormal climatic conditions. The chart below describes the abnormality of a data value by how many standard deviations it is located away from the mean. The probabilities in the third column assume the data are normally distributed.
Standard Deviations Away From Mean    Abnormality               Probability of Occurrence
beyond -3 sd                          extremely subnormal       0.15%
-3 to -2 sd                           greatly subnormal         2.35%
-2 to -1 sd                           subnormal                 13.5%
-1 to +1 sd                           normal                    68.0%
+1 to +2 sd                           above normal              13.5%
+2 to +3 sd                           greatly above normal      2.35%
beyond +3 sd                          extremely above normal    0.15%
Oliver, John E. Climatology: Selected Applications. p 45.
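The percentages quoted for the empirical rule can be checked against the normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `share_within` is our own.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} sd: {100 * share_within(k):.2f}%")
```

This gives roughly 68.27%, 95.45% and 99.73%, which agree with the rounded figures in the rule and in the chart (where each band between whole numbers of standard deviations is rounded separately).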
Chebyshev's Theorem: A large standard deviation reveals that the observations are widely scattered about the mean. The Russian mathematician P. L. Chebyshev (1821-1894) developed a theorem that allows us to determine the minimum proportion of the values that lie within a specified number of standard deviations of the mean. For example, according to Chebyshev's Theorem, at least three of four values, or 75 percent, must lie between the mean plus two standard deviations and the mean minus two standard deviations. This relationship applies regardless of the shape of the distribution. Further, at least eight of nine values, or 88.9 percent, will lie between plus three standard deviations and minus three standard deviations of the mean. At least 24 of 25 values, or 96 percent, will lie between plus and minus five standard deviations of the mean.
For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least $1 - 1/k^2$, where k is any constant greater than 1.
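The three figures quoted for Chebyshev's Theorem follow directly from the bound 1 - 1/k². A minimal Python sketch, with a function name of our own choosing:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 5):
    print(f"k = {k}: at least {100 * chebyshev_lower_bound(k):.1f}%")
```

For k = 2, 3 and 5 the bound is 3/4, 8/9 and 24/25 respectively, matching the 75, 88.9 and 96 percent stated above. Unlike the empirical rule, these are guaranteed minimums for any distribution, not approximations for a bell-shaped one.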
Which Measure of Variation to Use
The choice of a suitable measure of dispersion depends on the following factors:
1. The type of data available: If observations are few in number, avoid the standard deviation. If they are generally skewed, avoid the mean deviation as well. If they have gaps around the quartiles, the quartile deviation should be avoided. If there are open-end classes, the quartile measure of variation should be preferred.
2. The purpose of investigation: In an elementary treatment of statistical series in which a measure of variability is desired only for itself, any of the three measures, namely, range, mean deviation and quartile deviation, would be acceptable. Probably the mean deviation would be superior. In usual practice, the measure of variability is employed in further statistical analysis. For such a purpose, the standard deviation is by far the most popularly used. It is free from the defects from which the other measures suffer. It lends itself to the analysis of variability in terms of the normal curve of error. Practically all advanced statistical methods deal with variability and centre around the standard deviation. Hence, unless the circumstances warrant the use of some other measure, we should make use of the standard deviation for measuring variability.
A tabular comparison among the measures of dispersion is given in the following:
Characteristics                    Range          Quartile Deviation   Mean Deviation   Standard Deviation
Clear Definition                   Yes            Yes                  Yes              Yes
Easily Understandable              Yes            Yes                  No               Yes
Determination Procedures           Easy           Average              Average          Not that easy
For further Algebraic Process      Not Eligible   Not Eligible         Not Eligible     Eligible
Usage of all items in a data set   No             No                   Yes              Yes
Effect of extreme values           Yes            Not much             Yes              Not much
Effect of sample fluctuations      No             Not much             Not much         Not much
From the above discussion, it is seen that the standard deviation satisfies almost all the characteristics of an ideal measure of dispersion. Therefore, we can say that the standard deviation is the ideal measure of dispersion.
Some practical applications of Measures of Dispersion:
The following data show the lifetime of laptops of two different brands.
Life time (Years)    No. of laptops
                     Dell    HP
2-4                  20      15
4-6                  15      20
6-8                  25      20
8-10                 30      25
10-12                35      15
i. Find which of the brands shows a greater lifetime.
ii. Which of the brands would you prefer if the prices were the same? Why?
Solution:
The brand that has the greater mean has the greater lifetime. If the prices were the same, the brand with less variability is to be preferred. The brand with the smaller coefficient of variation has less variability.
At first let us take Dell,
Life time (Years)    Dell (f)    X     fX     fX²
2-4                  20          3     60     180
4-6                  15          5     75     375
6-8                  25          7     175    1225
8-10                 30          9     270    2430
10-12                35          11    385    4235
Total                125               965    8445
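For Dell, N = 125, $\sum fX = 965$ and $\sum fX^2 = 8445$, so
$$\bar{X} = \frac{\sum fX}{N} = \frac{965}{125} = 7.72$$
$$\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} = \sqrt{\frac{8445}{125} - \left(\frac{965}{125}\right)^2} \approx 2.82$$
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{2.82}{7.72} \times 100 \approx 36.53\%$$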
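Next, let us take HP,
Life time (Years)    HP (f)    X     fX     fX²
2-4                  15        3     45     135
4-6                  20        5     100    500
6-8                  20        7     140    980
8-10                 25        9     225    2025
10-12                15        11    165    1815
Total                95              675    5455
$$\bar{X} = \frac{\sum fX}{N} = \frac{675}{95} \approx 7.11$$
$$\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} = \sqrt{\frac{5455}{95} - \left(\frac{675}{95}\right)^2} \approx 2.63$$
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{2.63}{7.11} \times 100 \approx 36.99\%$$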
Dell has a mean of 7.72 and HP has a mean of 7.11. As Dell has the greater mean, Dell has the greater lifetime.
The coefficient of variation of Dell is 36.53% and that of HP is 36.99%. If the prices were the same, Dell would be preferable, as it has less variability, which indicates better quality and higher consistency.
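The laptop comparison can be recomputed from the frequency table with a short Python sketch. It assumes class midpoints of 3, 5, 7, 9 and 11 years and the population form of the standard deviation, which is what reproduces the report's figures; the function name `grouped_stats` is our own.

```python
from math import sqrt

def grouped_stats(midpoints, frequencies):
    """Return (mean, standard deviation, coefficient of variation in %) for grouped data."""
    n = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, frequencies))
    mean = sum_fx / n
    sd = sqrt(sum_fx2 / n - mean ** 2)   # population standard deviation
    cv = sd / mean * 100
    return mean, sd, cv

midpoints = [3, 5, 7, 9, 11]
dell = [20, 15, 25, 30, 35]
hp = [15, 20, 20, 25, 15]

for brand, freq in (("Dell", dell), ("HP", hp)):
    mean, sd, cv = grouped_stats(midpoints, freq)
    print(f"{brand}: mean = {mean:.2f}, sd = {sd:.2f}, C.V. = {cv:.2f}%")
```

Without intermediate rounding the coefficients of variation come out at about 36.55% for Dell and 37.07% for HP, slightly different from the 36.53% and 36.99% in the text, which appear to use the standard deviation and mean rounded to two decimals. The conclusion is unchanged: Dell has both the higher mean and the lower relative variability.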
Application of Empirical Rule:
A sample of the monthly rental rates at University Park Apartments approximates a symmetrical, bell-shaped distribution. The sample mean is 500 taka and the standard deviation is 20 taka. Using the empirical rule, we have to determine:
1. About 68 percent of the monthly rental rates are between what two amounts?
2. About 95 percent of the monthly rental rates are between what two amounts?
3. Almost all of the monthly rental rates are between what two amounts?
Solution:
1. About 68 percent of the monthly rental rates are between
$\bar{X} \pm 1s = 500 \pm 1(20)$
That is, 480 and 520 taka.
2. About 95 percent of the monthly rental rates are between
$\bar{X} \pm 2s = 500 \pm 2(20)$
That is, 460 and 540 taka.
3. Almost all (99.7 percent) are between
$\bar{X} \pm 3s = 500 \pm 3(20)$
That is, 440 and 560 taka.
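The three intervals in this solution can be generated with a small Python helper. The function name `empirical_interval` is our own; the mean of 500 taka and standard deviation of 20 taka are those given in the problem.

```python
def empirical_interval(mean, sd, k):
    """Interval covering mean ± k standard deviations."""
    return mean - k * sd, mean + k * sd

coverage = {1: "about 68%", 2: "about 95%", 3: "almost all (99.7%)"}
for k, label in coverage.items():
    low, high = empirical_interval(500, 20, k)
    print(f"{label}: {low} to {high} taka")
```

The coverage labels apply only because the distribution is described as symmetrical and bell-shaped; for other shapes only the weaker Chebyshev bound is guaranteed.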