using excel

43
8/25/2015 1 Using Excel Program used to organize data Produce tables Perform calculations Make graphs Perform statistical tests Organizing data in tables Allows you to arrange data in a format that is best for analysis The following are the steps you would use:

Upload: others

Post on 01-Nov-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Using Excel

8/25/2015

1

Using Excel

• Program used to organize data• Produce tables• Perform calculations• Make graphs• Perform statistical tests

Organizing data in tables

• Allows you to arrange data in a format thatis best for analysis

• The following are the steps you would use:

Page 2: Using Excel

8/25/2015

2

Page 3: Using Excel

8/25/2015

3

Performing calculations

• Allows you to perform several calculations• Sum, Average, Variance, Standard

deviation• Basic subtraction, addition, multiplication• More complex formulas

Page 4: Using Excel

8/25/2015

4

Page 5: Using Excel

8/25/2015

5

Page 6: Using Excel

8/25/2015

6

Page 7: Using Excel

8/25/2015

7

Page 8: Using Excel

8/25/2015

8

Page 9: Using Excel

8/25/2015

9

Page 10: Using Excel

8/25/2015

10

Page 11: Using Excel

8/25/2015

11

• Bar Charts…….

• Scatter Plots………………….

0

0.10.2

0.3

0.40.5

0.6

0.70.8

0.9

1 2 3

-1

0

1

2

3

4

5

6

7

8

9

0 0.2 0.4 0.6 0.8 1 1.2

Making graphs

Page 12: Using Excel

8/25/2015

12

Page 13: Using Excel

8/25/2015

13

Page 14: Using Excel

8/25/2015

14

Page 15: Using Excel

8/25/2015

15

Page 16: Using Excel

8/25/2015

16

Page 17: Using Excel

8/25/2015

17

Page 18: Using Excel

8/25/2015

18

• Bar Charts…….

• Scatter Plots………………….

0

0.10.2

0.3

0.40.5

0.6

0.70.8

0.9

1 2 3

-1

0

1

2

3

4

5

6

7

8

9

0 0.2 0.4 0.6 0.8 1 1.2

Making graphs

Page 19: Using Excel

8/25/2015

19

Page 20: Using Excel

8/25/2015

20

Page 21: Using Excel

8/25/2015

21

Page 22: Using Excel

8/25/2015

22

Page 23: Using Excel

8/25/2015

23

Analyzing Data in Excel

Statistical tests can be done to determine:• Whether or not there is a significant difference

between two data sets (Student’s t-test)• Whether or not there is a significant difference

between more than two data sets (ANOVA)• Whether or not there is a significant relationship

between two variables (Regression analysis)

Analyzing Data in Excel

The following steps must be followed:1. Choose an appropriate statistical test2. State H0 and HA

3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

Page 24: Using Excel

8/25/2015

24

• Normally, you would have to calculate the criticalvalue and look up the P value on a table

• All tests done in Excel provide the P value for you• This P value is used to determine the significance of

statistical results• This P value must be compared to an α value• α value is usually 0.05 or less (e.g. 0.01)• Less than 5% chance that the null hypothesis is true• The lower the α value the more certain we about

rejecting the null Hypothesis• First thing you must do is select which statistical test

you want to perform• This is how it is done……..

Analyzing Data in Excel

Page 25: Using Excel

8/25/2015

25

Page 26: Using Excel

8/25/2015

26

t-Tests

• Used to compare the means of two populations and answer thequestion:Is there a significant difference between the two populations?

• Example: Is there a significant difference between the averageheight of pine trees from 2 sites in Everglades National Park?

• You cannot use this test to compare two different types of data(e.g. water depth data and soil depth data).

• It can only compare two sets of data based on the same datatype (e.g. water depth data from two different sites)

• The two data sets that are being compared must be presentedin the same units. (e.g. you can compare two sets of data ifboth are recorded in days. You cannot compare data recordedin units of days with data recorded in units of months)

Page 27: Using Excel

8/25/2015

27

• Your Null Hypothesis is always:

There is no significant difference between the twocompared populations (μ1= μ2)

• Your Alternative Hypothesis is always:

There is a difference between the two comparedpopulations (μ1 ≠ μ2)

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

Page 28: Using Excel

8/25/2015

28

Page 29: Using Excel

8/25/2015

29

Page 30: Using Excel

8/25/2015

30

t-Tests1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

• When you run the test, look for the p-value

• If p > 0.05 then fail to reject your Null Hypothesis and state that“there is no significant difference between the two comparedpopulations”

• If p < 0.05 then reject your Null Hypothesis and state that “thereis a significant difference between the two comparedpopulations”

Page 31: Using Excel

8/25/2015

31

t-Tests1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

• When you run the test, look for the p-value

• Our results show P = 0.09903

• Therefore P > 0.05 (This means that there is greater than a 5%chance that our null hypothesis is true)

• So we must fail to reject the Null Hypothesis and state that “thereis no significant difference between the two comparedpopulations”

ANOVA

• Used to compare the means of more than two populations andanswer the question:Is there a significant difference between the populations?

• Example: Is there a significant difference between the averageheight of pine trees from 4 sites in Everglades National Park?

• For comparing a particular feature of two or more populations,use a Single Factor ANOVA

• For comparing a particular feature of two or more populations,subdivided into two groups, use a Two Factor ANOVA

0102030405060708090

100

Number of Students

Num

ber o

f Dai

ly B

eers

MicroEcoBuisinessStatistics

Page 32: Using Excel

8/25/2015

32

• Your Null Hypothesis is always:

There is no significant difference between thecompared populations (μ1 = μ2 = μ3 = μ4 …..)

• Your Alternative Hypothesis is always:

There is a difference between the comparedpopulations (μ1 ≠ μ2 ≠ μ3 ≠ μ4 …..)

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

Page 33: Using Excel

8/25/2015

33

Page 34: Using Excel

8/25/2015

34

ANOVA1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

• When you run the test, look for the p-value

• If p > 0.05 then fail to reject your Null Hypothesis and state that“there is no significant difference between the comparedpopulations”

• If p < 0.05 then reject your Null Hypothesis and state that “thereis a significant difference between at least two of the comparedpopulations”

0102030405060708090

100

Number of Students

Num

ber o

f Dai

ly B

eers

MicroEcoBuisinessStatistics

Page 35: Using Excel

8/25/2015

35

ANOVA1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

• When you run the test, look for the p-value

• Our results show P = 0.002197

• Therefore P < 0.05 (This means that there is less than a 5%chance that our null hypothesis is true)

• So we must reject your Null Hypothesis and state that “there is asignificant difference between at least two of the comparedpopulations”

0102030405060708090

100

Number of Students

Num

ber o

f Dai

ly B

eers

MicroEcoBuisinessStatistics

ANOVA• Remember:The ANOVA result will only tell you thati) None of the data sets are significantly

different from each otherOR

ii) At least two of the data sets among the datasets being compared are significantlydifferent

• If there is a significant difference between atleast two data sets, it will not tell you whichtwo.

Page 36: Using Excel

8/25/2015

36

Regression analysis

• Used to determine whether or not there is a linear relationshipbetween two variables and answer the question:Is there a significant linear relationship between two variables?

• Example: Is there a significant relationship between theaverage height of pine trees and soil depth in EvergladesNational Park?

• It basically creates an equation (or line) that best predicts Yvalues based on X values.

• You cannot use this test to compare populations. It onlycompares variables.

• You are looking at two different variables (e.g. water depth (cm)and plant abundance (no. of individuals), so the data sets donot have to be presented in the same units

0.00

10.00

20.00

30.00

40.00

50.00

60.00

0 1 2 3 4 5 6

Price of Whiskey ($)

Mon

ey S

pent

by

TA ($

)

• Your Null Hypothesis is always:

There is no significant linear relationship between thetwo variables

• Your Alternative Hypothesis is always:

There is a significant linear relationship between thetwo variables

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

Page 37: Using Excel

8/25/2015

37

Example: R square value of 0.04• The regression line does not fit the

data well• Many of the points lie far from the

line, so there is not a defined linearrelationship between the twovariables

• “x” cannot be used to predict “y”

Example: R square value of 0.94• The regression line fits the data well• The points all lie fairly close to the

line, so there is a defined linearrelationship between the twovariables

• “x” can be used to predict “y”

0

0.2

0.4

0.6

0.8

1

1.2

0 0.2 0.4 0.6 0.8 1 1.2

Price of Whiskey ($)

Mon

ey S

pent

by

TA ($

)

0.00

10.00

20.00

30.00

40.00

50.00

60.00

0 1 2 3 4 5 6

Price of Whiskey ($)

Mon

ey S

pent

by

TA ($

)

• R squared: how well “y” can be predicted by “x”, i.e. howstrong the linear relationship is between the two variables.

• The closer R square is to 0, the less well it fits the data.• The closer R square is to 1, more it fits the data.

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

Page 38: Using Excel

8/25/2015

38

Page 39: Using Excel

8/25/2015

39

Page 40: Using Excel

8/25/2015

40

Page 41: Using Excel

8/25/2015

41

Regression analysis1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

• When you run the test, look for the Significance F or Samplep-value

• If p > 0.05 then fail to reject your Null Hypothesis and statethat “There is no significant linear relationship between thetwo variables”

• If p < 0.05 then reject your Null Hypothesis and state that“There is a significant linear relationship between the twovariables”

0.00

10.00

20.00

30.00

40.00

50.00

60.00

0 1 2 3 4 5 6

Price of Whiskey ($)

Mon

ey S

pent

by

TA ($

)

1. Choose an appropriate statistical test2. State H0 and HA3. Run test to produce Test Statistic4. Examine P-value5. Decide to accept or reject H0

• When you run the test, look for the p-value

• Our results show Significance F or Sample p-value = 1.65E08 =0.0000000165

• Therefore P < 0.05 (This means that there is less than a 5%chance that our null hypothesis is true)

• So we must reject your Null Hypothesis and state that “There isa significant linear relationship between the two variables”

• Next look at the R squared value

• Our results show R squared = 0.975

• Therefore the line fits the data well

• “x” can be used to predict “y”

Regression analysis

0.00

10.00

20.00

30.00

40.00

50.00

60.00

0 1 2 3 4 5 6

Price of Whiskey ($)

Mon

ey S

pent

by

TA ($

)

Page 42: Using Excel

8/25/2015

42

Ecological study

• What is the aim of the study?• What is the main question being asked?• What are your hypotheses?• Collect data• Summarize data in tables• Present data graphically• Statistically test your hypotheses• Analyze the statistical results• Present a conclusion to the proposed question

Aim: To determine whether or not there are changes in heights of Pine trees withdistance from the edge of a forest trail in Everglades National Park.Hypotheses:HO: There is no significant relationship between distance from the edge of the trail andPine tree heightHA: There is a significant relationship between distance from the edge of the trail andPine tree heightResults:

Discussion/Conclusion:The gap created by the trail may be adversely affecting Pine trees, such that they areshorter near the trail and become taller with distance from the trail.

Distance from trail (m) Plant heights (m)

0 2.1

5 2.7

10 2.9

15 3.1

20 3.4

25 3.7

30 3.8

35 4.5

40 4.6

45 4.8

50 5.6

SUM 41.2

AVERAGE 3.74

STANDARD DEVIATION 1.04

Change in tree height with distance from forest trail

01

23

456

0 10 20 30 40 50 60

Distance from trail

Tree

hei

ght (

m)

• P = 1.65E-08 Since P < 0.05, reject Ho• Therefore, there is a significant relationship

between distance from the edge of the trailand Pine tree height

• R Square = 0.97, so there is a strongpositive linear relationship betweendistance from the trail and plant height

Average tree height of pine trees along transectfrom forest trail to interior forest at ENP

Page 43: Using Excel

8/25/2015

43

Three questions:1. T-test2. Single factor ANOVA3. Regression analysis

Assignment – Worksheet 1