Lecture I: Overview and Terminology

MA 217 - Stephen Sawin

Fairfield University

August 8, 2017

Three Stages of the Course

• Descriptive Statistics - Summary and visualization of data

• Probability - Quantitative investigation of likelihood, uncertainty and randomness

• Inferential Statistics - Drawing probabilistic conclusions from uncertain or incomplete data

Ex: Despite recession, average Wall Street bonus leaps 25% (2009)

Ex: Increased airport security after 9/11 reduced airline travel by 6%. If that travel was replaced with car travel, the increased airport security probably caused about 400 more deaths per year!

Ex: There is no significant evidence that washing hands with antibacterial soap reduces illness compared with washing with ordinary soap.

1.2 Population and Variables

• Population - All the things (individuals) you want to consider.

• Variable - A question you can ask about each individual in the population. A property.

• Numerical/Quantitative - A variable whose values are numbers. Can be Discrete (separate values, such as 1, 2, 3, . . .) or Continuous (infinitely many values between each two values, real numbers).

• Categorical/Qualitative - A variable whose values are from a fixed list of values: multiple choice. Can be binary or multiple choice.

• All people - Height; GPA; # of siblings; Favorite color; Gender.

• All houses - Cost; # of bedrooms; Has porch?

• All colleges - Business school?; Average GPA.

• All die rolls - Value on the die.
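The distinctions above can be sketched in a few lines of Python. This is only a rough illustration, not part of the lecture: the function name `classify` and the sample values are made up, and the rule it applies is a crude heuristic. In particular, it looks only at the recorded values, so a continuous variable that happens to be recorded in whole numbers (a cost rounded to the dollar, say) would be labeled discrete.

```python
from numbers import Real


def classify(values):
    """Guess the type of a variable from a list of its recorded values.

    Hypothetical helper for illustration only. Returns a pair
    (kind, subtype) using the vocabulary of this section.
    """
    distinct = set(values)
    if not distinct:
        raise ValueError("need at least one value")

    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly:
        # a yes/no answer is categorical, not numerical.
        return isinstance(v, Real) and not isinstance(v, bool)

    if all(is_number(v) for v in distinct):
        # Whole-number values only: treat as discrete (separate values).
        if all(float(v).is_integer() for v in distinct):
            return ("numerical", "discrete")
        return ("numerical", "continuous")

    # Non-numeric values come from a fixed list of answers.
    if len(distinct) == 2:
        return ("categorical", "binary")
    return ("categorical", "multiple choice")


# Made-up values for a few of the variables listed above.
print(classify([2, 0, 3, 1]))               # of siblings
print(classify([170.2, 182.5, 165.0]))      # height in cm
print(classify([True, False, True]))        # has porch?
print(classify(["red", "blue", "green"]))   # favorite color
```

In practice the type of a variable is decided by what question is being asked, not by inspecting data, so a check like this is at most a sanity test.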

Summarizing and Sampling

• Parameter - A summary or description of the variable in the whole population. Express it as parameter/variable/population. Populations are too big: we can’t know the value of the parameter.

• Sample - A (small) set of individuals in the population which we gather data about. Hopefully representative.

• Statistic - A summary or description of the variable in the sample.

Ex: Population: all voters. Variable: vote for your candidate? Parameter: Proportion of all voters who will vote for candidate.

Ex: Population: all U.S. colleges. Variable: tuition. Parameter: Average tuition of all U.S. colleges.

Ex: Parameter: Prop. of voters who support candidate. Sample: phone survey of 2000 voters. Statistic: Prop. of those 2000 voters who support candidate.

Ex: Parameter: Avg. tuition of all U.S. colleges. Sample: The 30 colleges on Long Island. Statistic: Avg. tuition of those 30 colleges.
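The voter example can be imitated with a small simulation. This is a sketch under stated assumptions, not data from the lecture: the population of 1,000,000 voters and the 52% support level are invented, and the sample is drawn uniformly at random, which a real phone survey only approximates. The parameter is computable here only because the whole population is simulated; with real voters it would be unknown.

```python
import random

# Fixed seed so that rerunning the sketch gives the same numbers.
rng = random.Random(217)

# Hypothetical population: one entry per voter, True means the voter
# supports the candidate. The 0.52 support level is made up.
population = [rng.random() < 0.52 for _ in range(1_000_000)]

# Parameter: proportion of supporters in the WHOLE population.
parameter = sum(population) / len(population)

# Sample: 2000 voters chosen at random, without replacement.
sample = rng.sample(population, 2000)

# Statistic: proportion of supporters among the 2000 sampled voters.
statistic = sum(sample) / len(sample)

print(f"parameter (all voters):    {parameter:.4f}")
print(f"statistic (2000 sampled):  {statistic:.4f}")
```

The two numbers will typically be close but not equal. A sample that is not representative, such as the 30 Long Island colleges standing in for all U.S. colleges, gives no such assurance.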

Lecture 1 Key Points

After this lecture you should be able to

• Define population, parameter, statistic and sample.

• Identify the population and variables, and fully identify the parameter, from a description of the situation.

• Distinguish numerical and categorical variables, recognize when a numerical variable is continuous or discrete, and recognize when a categorical variable is binary or multiple choice.

• Identify the sample and statistic, and distinguish them from the population and parameter, from a description of the situation.