Post on 20-Mar-2020


Chapter 1

What is Statistics?

Statistics is the science of the collection, organization, and interpretation of numerical data. It is the branch of mathematics that uses quantified representations and models to represent and analyse empirical, real-world data. Statistics is a developing discipline. It has been defined variously as the science of counting, the science of averages, and the science of estimates and probabilities.

Statistics has the following characteristics:

1] Aggregate of Facts: Single or isolated items cannot be termed statistics unless they are part of an aggregate of facts relating to a particular field of enquiry.

2] Affected by Multiplicity of Causes: Numerical figures should be affected by a multiplicity of factors.

3] Numerically Expressed: Only numerical data constitute statistics. Thus a statement like ‘the standard of living in Nashik has improved’ does not constitute statistics. Qualitative and descriptive characteristics do not constitute statistics unless they are expressed in numbers.

4] Enumerated According to a Reasonable Standard of Accuracy: The numerical data pertaining to any field of enquiry can be obtained by completely enumerating the underlying population, in which case the data will be exact and accurate. However, if complete enumeration of the population is not practical or feasible, the population quantities are estimated using the principles of statistics.

5] Collected in a Systematic Manner: The data must be collected in a systematic manner. We must define the characteristics under study and also the population. Trained investigators must conduct the enquiry.

6] Collected for a Pre-Determined Purpose: It is of utmost importance to define the objectives or purpose of the enquiry in clear and concrete terms, and the data should be collected keeping these objectives in view.

7] Comparable: Two or more datasets can be compared when they arise from comparable populations. They may be compared with respect to some unit, such as time, period or place.

The two main branches of statistics are :

Descriptive statistics – This branch involves the collection and presentation of data. The data can be presented in various forms such as graphs, charts and tables. It also involves calculating appropriate statistical averages that best represent the data. Descriptive statistics forms the basis for analysis and discussion in such diverse fields as securities trading, the social sciences, government, the health sciences, and professional sports. Examples include industrial and population statistics and the average age of people who vote.

Inferential statistics – This branch involves drawing conclusions from limited information collected on a sample basis and testing the reliability of the estimates. It deals with analysing data, making estimates and drawing conclusions. The analysis usually starts with a hypothesis, and the consistency of the data with that hypothesis is then checked.

What is Data

Data is information in numerical form generated by observation or experimentation.

There are two types of data:

1] Categorical Data

2] Measurement Data

Measurement data has two types:

1] Discrete Data

2] Continuous Data

1] Discrete Data: Discrete data / variables are numeric variables that have a countable number of values between any two values. A discrete variable is always numeric. For example, the number of customer complaints or the number of flaws or defects.

2] Continuous Data: Continuous data / variables are numeric variables that have an infinite number of values between any two values. For example, date and time.

Scale of Measurement

1] Nominal Scale

2] Ordinal Scale

3] Interval Scale

4] Ratio Scale

1] Nominal Scale: A nominal scale is used for labelling variables, without any quantitative value. The labels have no numerical significance. On a nominal scale the observations are classified into categories that cannot be measured or ordered.

2] Ordinal Scale: On an ordinal scale the order of the values is important, but the difference between each one is unknown.

For example, on a satisfaction scale it is difficult to say how far apart ‘Unhappy’ and ‘OK’ are; the difference cannot be quantified. Ordinal scales are typical measures of non-numeric concepts like satisfaction, discomfort and happiness.

3] Interval Scale: Interval scales are numeric scales in which we know both the order and the exact difference between values. An example of an interval scale is Celsius temperature, because the difference between each value is the same. That is, the difference between 60 degrees and 50 degrees is measurable: 10 degrees. Time is also an example of an interval scale.

The problem with interval scales is that they do not have a true zero: 0 degrees Celsius does not mean ‘no temperature’. Without a true zero it is not possible to calculate ratios. With interval data we can add and subtract, but we cannot multiply or divide.

4] Ratio Scale: When it comes to measurement, ratio scales are the most informative, as they tell us about the order and the exact difference between values, and they also have an absolute zero. That is, they have a clear definition of zero. Height and weight are examples of ratio scales. Ratio variables can be added, subtracted, multiplied and divided.
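The difference between interval and ratio scales can be illustrated with a small sketch. The temperatures and the helper name `celsius_to_kelvin` are illustrative assumptions and not part of the original notes; Kelvin is used here only because it is a temperature scale with a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (ratio scale with a true zero)."""
    return celsius + 273.15

a_c, b_c = 20.0, 10.0  # hypothetical temperatures in degrees Celsius

# Differences are meaningful on an interval scale.
diff_c = a_c - b_c

# The ratio of the Celsius numbers is 2.0, but 20 degrees is not "twice as hot"
# as 10 degrees, because 0 on the Celsius scale is an arbitrary point.
naive_ratio = a_c / b_c

# On a scale with a true zero the ratio is meaningful, and it is close to 1.
true_ratio = celsius_to_kelvin(a_c) / celsius_to_kelvin(b_c)

print(diff_c, naive_ratio, round(true_ratio, 3))
```

The same subtraction works on both scales, while the division only has a physical meaning after moving to the scale with an absolute zero.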


Importance of Statistics

1] Business: Statistics plays an important role in business. It helps a businessman to plan production according to the tastes and preferences of customers. It also helps to determine the quality of the product. Through statistics a businessman can make correct decisions regarding the location of the business, marketing of the products, finance, resources, etc.

2] Economics: Statistics plays an important role in economics, which depends largely upon it. In economics, statistical methods are used for collecting and analysing data. The relationship between supply and demand is also studied by statistical methods. Imports and exports, the inflation rate and per capita income are problems that require a good knowledge of statistics.

3] Mathematics: Statistics is a branch of applied mathematics. A large number of statistical methods, such as probability, averages, dispersion and estimation, are used in mathematics, and techniques of pure mathematics such as integration, differentiation and algebra are used in statistics. Thus statistics and mathematics are interrelated.

4] Banking: Statistics plays an important role in banking. Banks make use of statistics for a number of purposes. Bankers use statistical approaches to estimate the number of depositors and their claims for a certain day.

Limitations of Statistics

1] Statistics is not suited to the study of qualitative phenomena. Statistics is a science dealing with sets of numerical data, which are capable of quantitative measurement.

2] Statistics does not study individuals. It deals with an aggregate of objects and does not give any specific recognition to the individual items in a series.

3] Statistical laws are not exact; they are not like the laws of physics. Statistical laws are only approximations.

4] If sufficient care is not exercised in collecting, analysing and interpreting the data, statistical results might be misleading.

5] Only a person who has expert knowledge of statistics can handle statistical data efficiently.

Chapter 2

Representation of Data:

There are three ways of representing data in statistics:

1] Textual Representation

2] Tabular Representation

3] Diagrammatic Representation

Diagrammatic Representation

1] Simple Bar Diagram

2] Component Bar Diagram

3] Pie Diagram

Types of Tables

a) One-way Table

b) Two-way Table

c) Three-way Table

One-way Table: If only one characteristic is observed, the table representing such data is called One-way Table. Here the first column consists of values of the characteristic under consideration and the second column consists of the frequencies of these values.

Example:

Faculty                   Number of Students
Science                    8600
Arts                      11400
Total                     20000

State                     Number of Students
Maharashtra               16500
Other than Maharashtra     3500
Total                     20000

Sex                       Number of Students
Gents                     16000
Ladies                     4000
Total                     20000

Two-way Table: If the data on each unit are available on two characteristics, a two-way table is used. One of the characteristics is represented row-wise and the other column-wise. Each cell gives the frequency of the corresponding pair of values of the two characteristics in the data. Such tables also help us understand the relationship between the two characteristics under consideration.

Example:

Faculty (rows) by State (columns):

Faculty        Maharashtra    Other than Maharashtra    Total
Science               6600                      2000     8600
Arts                  9900                      1500    11400
Total                16500                      3500    20000

Faculty (rows) by Sex (columns):

Faculty        Gents    Ladies    Total
Science         7000      1600     8600
Arts            9000      2400    11400
Total          16000      4000    20000

State (rows) by Sex (columns):

State                     Gents    Ladies    Total
Maharashtra               13000      3500    16500
Other than Maharashtra     3000       500     3500
Total                     16000      4000    20000
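The row, column and grand totals of a two-way table can be checked with a short sketch. It uses the cell frequencies of the Faculty by State example; the variable names are illustrative assumptions.

```python
# Cell frequencies of the Faculty by State two-way table.
table = {
    "Science": {"Maharashtra": 6600, "Other than Maharashtra": 2000},
    "Arts": {"Maharashtra": 9900, "Other than Maharashtra": 1500},
}

# Row totals: one total per faculty.
row_totals = {faculty: sum(cells.values()) for faculty, cells in table.items()}

# Column totals: one total per state.
column_totals = {}
for cells in table.values():
    for state, count in cells.items():
        column_totals[state] = column_totals.get(state, 0) + count

# The grand total must be the same whether rows or columns are summed.
grand_total = sum(row_totals.values())

print(row_totals, column_totals, grand_total)
```

The row and column totals reproduce the one-way tables for Faculty and State, which is a simple consistency check on the two-way table.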

Three-way Table: When information is to be reported on three characteristics, a three-way table is used. It is more complicated than the other two types of table, but there are occasions where it is needed. In such tables each column representing one value of the second characteristic is divided into sub-columns, each representing one value of the third characteristic.

Example:

Faculty (rows) by State and Sex (columns):

                    Maharashtra              Other than Maharashtra
Faculty        Gents   Ladies    Total       Gents   Ladies    Total
Science         5200     1400     6600        1800      200     2000
Arts            7800     2100     9900        1200      300     1500
Total          13000     3500    16500        3000      500     3500
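A three-way table contains every two-way and one-way table as a summary. The sketch below collapses the three-way frequencies above by summing over the characteristics that are dropped. The function name `collapse` and the short label "Other" (for "Other than Maharashtra") are assumptions made for illustration.

```python
from collections import defaultdict

# (faculty, state, sex) -> number of students, taken from the three-way table.
counts = {
    ("Science", "Maharashtra", "Gents"): 5200,
    ("Science", "Maharashtra", "Ladies"): 1400,
    ("Science", "Other", "Gents"): 1800,
    ("Science", "Other", "Ladies"): 200,
    ("Arts", "Maharashtra", "Gents"): 7800,
    ("Arts", "Maharashtra", "Ladies"): 2100,
    ("Arts", "Other", "Gents"): 1200,
    ("Arts", "Other", "Ladies"): 300,
}

def collapse(counts, keep):
    """Sum the frequencies over every characteristic whose index is not in `keep`."""
    result = defaultdict(int)
    for key, frequency in counts.items():
        result[tuple(key[i] for i in keep)] += frequency
    return dict(result)

faculty_by_sex = collapse(counts, keep=(0, 2))  # two-way table
state_by_sex = collapse(counts, keep=(1, 2))    # two-way table
faculty_only = collapse(counts, keep=(0,))      # one-way table

print(faculty_by_sex)
print(state_by_sex)
print(faculty_only)
```

The collapsed results agree with the two-way and one-way examples given earlier, for instance 7000 gents in Science and 8600 Science students in total.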

Requirements of a Good Statistical Table

1) A table should have a self-explanatory title, which should cover the ‘when, what and where’ of the data.

2) Headings of Rows and Columns should be clearly stated.

3) Units should be mentioned whenever necessary.

4) Avoid short forms.

5) Classes and sub-classes should be clearly separated by lines of different colours or thickness.

6) The whole table should be visible at a glance.

7) A footnote must be included to explain any signs or abbreviations.

8) A source note must give a reference to the source of the data.

9) A table should have a table-number, for easy access.

Advantages of Tabular Presentation

1) It is a convenient and self-sufficient form of presenting statistical information.

2) It summarizes the information and displays its important features.

3) Unnecessary repetitions (that may appear in text) are avoided.

4) Comparisons between localities, age-groups, etc. can be made.

5) Errors and omissions in the information can be detected.

6) Reference to any detail of the data can be provided.

Frequency Distribution Types

1) Inclusive Method

2) Exclusive Method

Inclusive Method: In this method a value equal to the upper limit of a class-interval is kept in that same class-interval. The upper limit of a class is less by 1 than the lower limit of the next class-interval. In short, this method allows a class-interval to include both its lower and upper limits. For example, with the class-intervals 10 - 19, 20 - 29 and 30 - 39, a value of 19 is placed in the class 10 - 19.

Exclusive Method: In this method the upper limit of a class becomes the lower limit of the next class. It is called ‘exclusive’ because an item equal to the upper limit of a class is not put in that class; it is put in the next class, i.e. the upper limits of classes are excluded from them. For example, a person of age 20 years will not be included in the class-interval (10 - 20) but in the next class (20 - 30), since the class-interval (10 - 20) includes only ages from 10 up to, but not including, 20. An exclusive-type class-interval such as (10 - 20) can also be expressed as ‘10 and below 20’.
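The two methods differ only in how a value equal to an upper limit is treated. A minimal sketch, assuming illustrative class-intervals and hypothetical function names `classify_inclusive` and `classify_exclusive`:

```python
def classify_inclusive(value, classes):
    """Return the (lower, upper) class containing value; both limits belong to the class."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    return None  # value lies outside every class

def classify_exclusive(value, classes):
    """Return the (lower, upper) class containing value; the upper limit is excluded."""
    for lower, upper in classes:
        if lower <= value < upper:
            return (lower, upper)
    return None  # value lies outside every class

inclusive_classes = [(10, 19), (20, 29), (30, 39)]  # assumed example classes
exclusive_classes = [(10, 20), (20, 30), (30, 40)]  # assumed example classes

print(classify_inclusive(19, inclusive_classes))  # the upper limit stays in its class
print(classify_exclusive(20, exclusive_classes))  # age 20 goes to the next class
```

With the exclusive classes, an age of 20 is placed in (20 - 30) and not in (10 - 20), as described above.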


Meaning of the Terms used in classification

Class Limit: A class is formed within the two values. These values are known as the class-limits of that class. The lower value is called the lower class limit while the higher value is called the upper class limit of the class.

For example, the class-interval 1 - 7 has lower class limit LCL = 1 and upper class limit UCL = 7.

Class Boundaries: Suppose weights are recorded to the nearest kg. The class-interval 60 - 62 then includes all measurements from 59.5 kg to 62.5 kg, the variable being a continuous one. These exact numbers, 59.5 and 62.5, are called class boundaries or true class limits. The smaller number, 59.5, is the lower class boundary and the larger one, 62.5, is the upper class boundary.

In any problem, if the class-intervals are given in the inclusive form, they should first be converted into the exclusive form. For this we require a correction factor.

Correction factor = (lower limit of the next class - upper limit of a class) / 2, which is generally 0.5.

Subtract the correction factor from the lower limits and add it to the upper limits of the class-intervals given in the inclusive form. For example, the class-interval 60 - 62 becomes 59.5 - 62.5 after correction.
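The conversion from inclusive class limits to class boundaries can be sketched as follows. The sketch takes the correction factor as half the gap between the upper limit of one class and the lower limit of the next, which gives 0.5 when the gap is 1. Only the class 60 - 62 comes from the text; the other classes and the function name `to_class_boundaries` are assumptions.

```python
def to_class_boundaries(classes):
    """Convert inclusive (lower, upper) limits into class boundaries.

    Assumes the classes are in ascending order with the same gap between
    every pair of neighbouring classes.
    """
    gap = classes[1][0] - classes[0][1]  # lower limit of next class - upper limit
    correction = gap / 2                 # correction factor
    return [(lower - correction, upper + correction) for lower, upper in classes]

weight_classes = [(60, 62), (63, 65), (66, 68)]  # weights in kg, inclusive form
boundaries = to_class_boundaries(weight_classes)

print(boundaries)
```

After correction the upper boundary of each class coincides with the lower boundary of the next one, so the classes are of the exclusive form.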


Class Width: The difference between the upper and lower limits of a class is called the magnitude or length or width of a class and is denoted by ‘i’ or ‘c’.

Class-mark: The arithmetical average of the two class limits (i.e. the lower limit and the upper limit) is called the mid-value or the class mark of that class-interval. For example, the mid-value of the class-interval (0 - 10) is (0 + 10) / 2 = 5.
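The definitions of class width and class mark translate directly into code. A minimal sketch with hypothetical function names:

```python
def class_width(lower, upper):
    """Width (magnitude) of a class: upper limit minus lower limit."""
    return upper - lower

def class_mark(lower, upper):
    """Mid-value of a class: arithmetic average of its two limits."""
    return (lower + upper) / 2

print(class_width(0, 10), class_mark(0, 10))
```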

Relative Frequency: The relative frequency of a class is the frequency of the class divided by the total frequency. It can also be expressed as a percentage.
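Relative frequencies can be computed as in the sketch below. The class labels and frequencies are made-up values chosen only to illustrate the calculation; they are not data from these notes.

```python
# Hypothetical frequency distribution: class label -> frequency.
frequencies = {"0 - 10": 4, "10 - 20": 6, "20 - 30": 10}

total = sum(frequencies.values())

# Relative frequency = class frequency / total frequency.
relative = {label: f / total for label, f in frequencies.items()}

# The same quantities expressed as percentages.
percentage = {label: 100 * f / total for label, f in frequencies.items()}

print(relative)
print(percentage)
```

The relative frequencies of all classes always add up to 1, and the percentages to 100.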

Example: The weights of 100 persons were given as under:


Frequency Density: The number of units (frequency) in a class per unit length of the class is known as the frequency density. It is the ratio of the frequency of the class to the width of the class. Frequency density is useful when drawing a histogram for data grouped into classes of unequal width.
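Frequency density can be computed as in the sketch below. The classes of unequal width and their frequencies are hypothetical values used only for illustration.

```python
# Hypothetical classes of unequal width: ((lower, upper), frequency).
classes = [((0, 10), 20), ((10, 30), 30), ((30, 60), 15)]

# Frequency density = frequency of the class / width of the class.
density = {(lower, upper): f / (upper - lower) for (lower, upper), f in classes}

print(density)
```

In a histogram with unequal class widths, the height of each bar is the frequency density, so that the area of the bar is proportional to the class frequency.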
