introduction to statistic

Upload: lieynna4996

Post on 14-Apr-2018


  • 7/30/2019 Introduction to Statistic


    CHAPTER 1

    INTRODUCTION

    OBJECTIVES

    After completing this chapter, students should be able to:

    1. Describe the difference between descriptive and inferential statistics.

    2. Identify and interpret the relationships between sample and population, and statistic and parameter.

    3. Identify and describe the different types of variables.

    4. Identify and describe the different types of data.

    5. Differentiate and identify the techniques of data collection.

    6. Identify and interpret the measurement scales.


    What is Statistics?

    The word statistics derives from the classical Latin root status, which means state.

    Statistics has become the universal language of the sciences.

    As potential users of statistics, we need to master both the science and the art of using statistical methodology correctly.

    These methods include:

    Carefully defining the situation

    Gathering data

    Accurately summarizing the data

    Deriving and communicating meaningful conclusions

    Specific definition:

    Statistics is a collection of procedures and principles for gathering data and analyzing information to help people make decisions when faced with uncertainty.


    Nowadays statistics is used in almost all fields of human effort, such as:

    education

    health

    business

    agriculture, etc.

    Example applications of Statistics

    1. Sport => A statistician may keep records of the number of hits a baseball player gets in a season.

    2. Finance => A financial advisor uses statistical information to make reliable predictions about investments.

    3. Public Health => An administrator would be concerned with the number of residents who contract a new strain of flu virus during a certain year.

    4. Others

    =>


    Statistics has Two Aspects

    1) Theoretical / Mathematical Statistics

    2) Applied Statistics

    1) Theoretical / Mathematical Statistics

    => Deals with the development, derivation and proof of statistical theorems, formulas, rules and laws.

    2) Applied Statistics

    => Involves the application of those theorems, formulas, rules and laws to solve real-world problems.

    ** Applied Statistics can be divided into two main areas, depending on how data are used.

    (1) Descriptive statistics

    - What most people think of when they hear the word statistics.

    - Includes the collection, presentation, and description of sample data.

    - Using graphs, charts and tables to show data.

    (2) Inferential statistics

    - Refers to the technique of interpreting the values resulting from the descriptive techniques and making decisions and drawing conclusions about the population.


    Example 1

    Determine which of the following statements is descriptive in nature and which is inferential.

    a. Of all U.S. kindergarten teachers, 32% say that knowing the alphabet is an essential skill.

    Answer: inferential

    b. Of the 800 U.S. kindergarten teachers polled, 32% say that knowing the alphabet is an essential skill.

    Answer: descriptive

    ASPECTS OF STATISTICS

    1) Theoretical/Mathematical Statistics: deals with the development, derivation and proof of statistical theorems, formulas, rules and laws.

    2) Applied Statistics: involves the application of those theorems, formulas, rules and laws to solve real-world problems.

    - Descriptive Statistics: consists of methods for collecting, organizing, displaying and summarizing data.

    - Inferential Statistics: consists of methods that use results obtained from a sample to make decisions or draw conclusions about a population.
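
    The contrast between the two areas can be sketched in a few lines of Python. This is only an illustration: the individual responses are a hypothetical reconstruction of the poll in Example 1(b) (800 teachers, 32% agreeing), and the variable names are assumptions rather than part of the original slides.

```python
# Hypothetical reconstruction of the poll in Example 1(b):
# 800 teachers polled, 256 of whom (32%) say the alphabet is essential.
responses = [True] * 256 + [False] * 544

# Descriptive statistics: summarize the sample that was actually collected.
sample_size = len(responses)
sample_proportion = sum(responses) / sample_size
print(f"Descriptive: {sample_proportion:.0%} of the {sample_size} teachers polled agree.")

# Inferential statistics: carry the sample result over to the whole
# population of U.S. kindergarten teachers, which was never fully observed.
population_estimate = sample_proportion
print(f"Inferential: an estimated {population_estimate:.0%} of all teachers agree.")
```

    The arithmetic is the same in both cases; what changes is the group the statement is about (the 800 teachers polled versus all teachers).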


    Why do we have to study statistics?

    To read and understand various statistical studies in a related field.

    To communicate and explain the results of a study in a related field using our own words.

    To become better consumers and citizens.


    Basic Terms of Statistics

    1. Population versus Sample

    Population => a collection of all individuals about which information is desired.

    - Individuals are usually people but could also be schools, cities, pet dogs, agricultural fields, etc.

    => There are two kinds of population:

    i. Finite population: the membership of the population can be (or could be) physically listed, e.g. the books in a library.

    ii. Infinite population: the membership is unlimited, e.g. the population of all people who might use aspirin.

    Sample => a subset of the population.

    2. Parameter versus Statistic

    Parameter

    => a numerical value summarizing all the data of an entire population.

    - A Greek letter is often used to symbolize the name of a parameter.

    e.g. the average age at time of admission for all students who have ever attended our college.


    Statistic

    => a numerical value summarizing the sample data.

    - A letter of the English alphabet is used to symbolize the name of a statistic.

    e.g. the average height, found by using the set of 25 heights.

    3. Variable

    => a characteristic of interest about each individual element of a population or sample.

    e.g. a student's age at entrance into college, the color of a student's hair, etc.

    4. Data value

    => the value of the variable associated with one element of a population or sample. This value may be a number, a word, or a symbol.

    e.g. Farah entered college at age 23, her hair is brown, etc.

    5. Data

    => the set of values collected from the variable from each of the elements that belong to the sample.

    e.g. the set of 25 heights collected from 25 students.


    Example 2

    A statistics student is interested in finding out something about the average ringgit value of cars owned by the faculty members of our university. Each of the seven terms just described can be identified in this situation.

    i) population: the collection of all cars owned by all faculty members at our university.

    ii) sample: any subset of that population. For example, the cars owned by members of the statistics department.

    iii) variable: the ringgit value of each individual car.

    iv) data value: one data value is the ringgit value of a particular car. Ali's car, for example, is valued at RM 45,000.

    v) data: the set of values that correspond to the sample obtained (45,000; 55,000; 34,0000;).

    vi) parameter: the parameter about which we are seeking information is the average value of all cars in the population.

    vii) statistic: the statistic that will be found is the average value of the cars in the sample.

    Census: a survey that includes every element in the population.

    Sample survey: a survey that includes only the elements in the selected sample.
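
    The terms in Example 2 can be illustrated with a short Python sketch. All car values below are randomly generated and purely hypothetical, as are the population size of 200, the sample size of 25 and the variable names; only the roles (population, sample, parameter, statistic) come from the example.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ringgit values of all 200 faculty-owned cars.
population = [random.randrange(20_000, 120_000, 500) for _ in range(200)]

# Parameter: summarizes the entire population (what a census would give).
parameter_mu = mean(population)

# Sample: a subset of the population, here 25 cars chosen at random.
sample = random.sample(population, 25)

# Statistic: summarizes the sample only (what a sample survey would give).
statistic_xbar = mean(sample)

print(f"parameter (population mean): RM {parameter_mu:,.2f}")
print(f"statistic (sample mean):     RM {statistic_xbar:,.2f}")
```

    The two averages will generally differ, which is why a statistic is treated as an estimate of the parameter rather than as the parameter itself.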


    Types of Variables

    1. Quantitative (numerical) Variables

    A variable that quantifies an element of a population.

    - e.g. the total cost of textbooks purchased by each student for this semester's classes.

    Arithmetic operations such as addition and averaging are meaningful for data that result from a quantitative variable.

    Can be subdivided into two classifications: discrete variables and continuous variables.


    Discrete Variables

    A quantitative variable that can assume a countable

    number of values.

    Can assume any values corresponding to isolated

    points along a line interval. That is, there is a gap

    between any two values.

    Example 3

    Number of courses for which you are currently registered.

    Continuous Variables

    A quantitative variable that can assume an

    uncountable number of values.

    Can assume any value along a line interval, including

    every possible value between any two values.

    Example 4

    Weight of books and supplies you are carrying as you

    attend class today.


    2. Qualitative (attribute, categorical) variables

    A variable that describes or categorizes an element of

    a population.

    Example 5

    A sample of four hair-salon customers was surveyed for

    their hair color, hometown and level of

    satisfaction.
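
    A minimal Python sketch of how the variable types differ in practice is given below. The four customer records are invented for illustration; the `visits` (discrete) and `spend_rm` (continuous) fields are assumptions added to contrast with the three qualitative variables named in Example 5.

```python
from collections import Counter
from statistics import mean

# Four hypothetical hair-salon customers (all values invented).
customers = [
    {"hair_color": "black", "hometown": "Alor Setar", "satisfaction": "very satisfied", "visits": 3, "spend_rm": 45.50},
    {"hair_color": "brown", "hometown": "Ipoh", "satisfaction": "satisfied", "visits": 1, "spend_rm": 30.00},
    {"hair_color": "black", "hometown": "Ipoh", "satisfaction": "somewhat satisfied", "visits": 5, "spend_rm": 80.00},
    {"hair_color": "blonde", "hometown": "Kuantan", "satisfaction": "satisfied", "visits": 3, "spend_rm": 52.50},
]

# Qualitative variables: values are categories, so we count them.
hair_counts = Counter(c["hair_color"] for c in customers)
print("hair color counts:", dict(hair_counts))

# Quantitative, discrete: a count with gaps between possible values.
mean_visits = mean(c["visits"] for c in customers)

# Quantitative, continuous: a measurement that can take any value in a range.
mean_spend = mean(c["spend_rm"] for c in customers)

print("mean visits:", mean_visits)
print("mean spend (RM):", mean_spend)
```

    Averaging is meaningful for `visits` and `spend_rm`, but there is no meaningful "average hair color"; for qualitative variables the natural summary is a frequency count.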


    Exercise 1

    1. Of the adult U.S. population, 36% has an allergy. A

    sample of 1200 randomly selected adults resulted in

    33.2% reporting an allergy.

    a. Describe the population.

    b. What is the sample?

    c. Describe the variable.

    d. Identify the statistic and give its value.

    e. Identify the parameter and give its value.

    2. The faculty members at Universiti Utara Malaysia were surveyed on the question "How satisfied were you with this semester's schedule?" Their responses were to be categorized as "very satisfied", "somewhat satisfied", "neither satisfied nor dissatisfied", "somewhat dissatisfied", or "very dissatisfied".

    a. Name the variable of interest.

    b. Identify the type of variable.

    3. A study was conducted by Aventis Pharmaceuticals Inc.

    to measure the adverse side effects of Allegra, a drug

    used for treatment of seasonal allergies. A sample of 679

    allergy sufferers in the United States was given 60 mg of

    the drug twice a day. The patients were to report whether


    they experienced relief from their allergies as well as any

    adverse side effects (viral infection, nausea, drowsiness,

    etc.).

    a. What is the population being studied?

    b. What is the sample?

    c. What are the characteristics of interest about each element in the population?

    d. Are the data being collected qualitative or quantitative?

    4. Identify each of the following as an example of (1) attribute (qualitative) or (2) numerical (quantitative) variables.

    a. The breaking strength of a given type of string.

    b. The hair color of children auditioning for the musical Annie.

    c. The number of stop signs in towns of fewer than 500 people.

    d. Whether or not a faucet is defective.

    e. The number of questions answered correctly on a standardized test.

    f. The length of time required to answer a telephone call at a certain real estate office.


    DATA

    The set of values collected from the variable from each

    of the elements that belong to sample.

    e.g. the set of 25 heights collected from 25 students.

    From a survey or an experiment.

    Two types of data:

    Primary data: necessary data obtained through a survey conducted by the researcher.

    Secondary data: data obtained from published material by governmental, industrial or individual sources.


    1. PRIMARY DATA

    Primary Data Collection Techniques

    Data are collected by the researcher.

    Data are obtained from the respondent.

    (i) Face-to-face interview

    - Two-way communication.

    - Researcher(s) ask questions directly to respondent(s).

    Advantages:

    Precise answers.

    Appropriate for research that requires huge data collection.

    Increases the number of answered questions.

    Disadvantages:

    Expensive.

    The interviewer might influence respondents' responses.

    Respondents may refuse to answer sensitive or personal questions.


    (ii) Telephone interview

    Advantages:

    Quick.

    Less costly.

    Wider respondent coverage.

    Disadvantages:

    Information obtained might not represent the whole population.

    Limited interview duration.

    Not appropriate for long questions or questions that require contemplation.

    Demonstrations cannot be performed.

    The telephone may not be answered.

    (iii) Postal questionnaire

    - A set of questions to obtain information related to the study being conducted.

    - Questionnaires are posted to every respondent.


    Advantages:

    Wider respondent coverage.

    Respondents have enough time to answer questions.

    Interviewer influence can be avoided.

    Lower cost.

    Disadvantages:

    One-way interaction.

    Low response rate.

    Not suitable for numerous or hard questions.

    Time consuming.

    The questionnaire may be answered by an unqualified respondent.

    (iv) Observation

    Observing and measuring specific

    characteristics without attempting to modify the

    subjects being studied.

    Records human behaviors, objects and situations

    without contact with respondent.

    - not commonly used.

    - precise information.


    2. SECONDARY DATA

    - Published records from governmental, industrial or individual sources.

    - Historical data.

    - Various resources.

    - Experiment is not required.

    Advantages:

    Lower cost.

    Saves time and energy.

    Disadvantages:

    Obsolete information.

    Data accuracy is not confirmed.


    Data can also be classified by how they are categorized, counted or measured.

    This type of classification uses measurement scales, with 4 common types of scales: nominal, ordinal, interval and ratio.

    Nominal Level of Measurement

    A qualitative variable that characterizes (or describes/names) an element of a population.

    Arithmetic operations are not meaningful for the data.

    Order cannot be assigned to the categories.

    Example: - Survey responses: yes, no, undecided

    - Gender: male, female

    Ordinal Level of Measurement

    A qualitative variable that incorporates an ordered position, or ranking.

    Differences between data values either cannot be determined or are meaningless.

    Example: - Level of satisfaction: very satisfied, satisfied, somewhat satisfied, etc.

    - Course grades: A, B, C, D, or F


    Interval Level of Measurement

    Involves a quantitative variable.

    A scale where distances between data values are meaningful.

    Differences make sense, but ratios do not (e.g., 30° - 20° = 20° - 10°, but 20° is not twice as hot as 10°).

    No natural zero.

    Example:

    - Temperature scales are interval data: 25°C is warmer than 20°C, and a 5°C difference has some physical meaning. Note that 0°C is arbitrary, so it does not make sense to say that 20°C is twice as hot as 10°C.

    - The year 0 is arbitrary, and it is not sensible to say that the year 2000 is twice as old as the year 1000.

    Ratio Level of Measurement

    A scale in which both intervals between values and ratios

    of values are meaningful.

    A real zero point.

    Example:


    - Temperature measured in kelvins is on a ratio scale because it has a meaningful zero point (absolute zero).

    - Physical measurements of height, weight and length are typically ratio variables. It is meaningful to say that 10 m is twice as long as 5 m, because there is a natural zero.

    Levels of Measurement

    Nominal - categories only

    Ordinal - categories with some order

    Interval - differences but no natural starting point

    Ratio - differences and a natural starting point
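
    The interval versus ratio distinction can be checked numerically with the temperature example. The sketch below is illustrative only: the `LEVELS` table and the function name `celsius_to_kelvin` are assumptions introduced here, and the table simply restates the summary above.

```python
# Which comparisons are meaningful at each level of measurement
# (a restatement of the summary above, not an official taxonomy API).
LEVELS = {
    "nominal":  {"order": False, "differences": False, "ratios": False},
    "ordinal":  {"order": True,  "differences": False, "ratios": False},
    "interval": {"order": True,  "differences": True,  "ratios": False},
    "ratio":    {"order": True,  "differences": True,  "ratios": True},
}

def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvins."""
    return celsius + 273.15

c_low, c_high = 10.0, 20.0
k_low, k_high = celsius_to_kelvin(c_low), celsius_to_kelvin(c_high)

# Differences agree on both scales: a 10-degree gap is a 10-kelvin gap.
diff_c = c_high - c_low
diff_k = k_high - k_low

# Ratios do not agree: the Celsius ratio depends on an arbitrary zero.
ratio_c = c_high / c_low  # 2.0, but 20°C is not "twice as hot" as 10°C
ratio_k = k_high / k_low  # roughly 1.035 on the ratio (Kelvin) scale

print(f"difference: {diff_c:.2f} °C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

    Because the difference survives the change of scale while the ratio does not, Celsius temperature is treated as interval-level data and Kelvin temperature as ratio-level data.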


    Exercise 2

    1) Classify each as nominal-level, ordinal-level, interval-level or ratio-level.

    a. Ratings of newscasts in Malaysia (poor, fair, good, excellent).

    b. Temperature of automatic popcorn poppers.

    c. Marital status of respondents to a survey on savings accounts.

    d. Age of students enrolled in a martial arts course.

    e. Salaries of cashiers of C-Mart stores.

    2) Data obtained from a nominal scale

    a. must be alphabetic.

    b. can be either numeric or nonnumeric.

    c. must be numeric.

    d. must rank order the data.

    3) The set of measurements collected for a particular element is (are) called

    a. variables.

    b. observations.

    c. samples.

    d. none of the above answers is correct.


    4) The scale of measurement that is simply a label for the

    purpose of identifying the attribute of an element is the

    a. ratio scale.

    b. nominal scale.

    c. ordinal scale.

    d. interval scale.

    5) Some hotels ask their guests to rate the hotel's services as excellent, very good, good, and poor. This is an example of the

    a. ordinal scale.

    b. ratio scale.

    c. nominal scale.

    d. interval scale.

    6) The ratio scale of measurement has the properties of

    a. only the ordinal scale.

    b. only the nominal scale.

    c. the rank scale.

    d. the interval scale.

    7) Arithmetic operations are inappropriate for

    a. the ratio scale.

    b. the interval scale.

    c. both the ratio and interval scales.

    d. the nominal scale.

    8) A characteristic of interest for the elements is called a(n)

    a. sample.

    b. data set.

    c. variable.

    d. none of the above answers is correct.


    9) In a questionnaire, respondents are asked to mark their

    gender as male or female. Gender is an example of a

    a. qualitative variable.

    b. quantitative variable.

    c. qualitative or quantitative variable, depending on how the respondents answered the question.

    d. none of the above answers is correct.

    10) The summaries of data, which may be tabular, graphical, or numerical, are referred to as

    a. inferential statistics.

    b. descriptive statistics.

    c. statistical inference.

    d. report generation.

    11) Statistical inference

    a. refers to the process of drawing inferences about the

    sample based on the characteristics of the population.

    b. is the same as descriptive statistics.

    c. is the process of drawing inferences about the

    population based on the information taken from the

    sample.

    d. is the same as a census.


    Answer Exercise 1

    1) a. all adults in the U.S. population

    b. 1200 randomly selected adults

    c. allergy

    d. 33.2% of the sample reported an allergy

    e. 36% of the population has an allergy

    2) a. satisfaction

    b. ordinal

    3) a. all allergy sufferers in the U.S.

    b. 679 allergy sufferers in the U.S.

    c. whether relief was experienced and any adverse side effects of the drug

    d. qualitative

    4) a. quantitative

    b. qualitative

    c. quantitative

    d. qualitative

    e. quantitative

    f. quantitative


    Answer Exercise 2

    1) a. Ordinal b. Interval c. Nominal

    d. Ratio e. Ratio

    2) b 3) b

    4) b 5) a

    6) d 7) d

    8) c 9) a

    10) b 11) c