statistics through applications


Upload: branden-farmer

Post on 01-Jan-2016


Page 1: Statistics Through Applications

Statistics Through Applications

How Do We Get “Good” Data?


Good Data isn't Based on an Anecdote

Using anecdotal evidence means relying on an isolated example or experience to make a decision.

Good data should come from many varied examples and be impartial.

Anecdotes usually appeal to our emotions and can fool us into belief, while statistics are dry but much more reliable.


Good Data is Compared Fairly

Often a rate, expressed as a percent or fraction, is a more valid measure than a simple count of occurrences.

Two schools both had 1900 students pass TAKS. One school has 2000 students and the other has 2500. Did they perform equally well?
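
The two-school question can be checked by turning each count into a rate. This is a minimal sketch; the labels "School A" and "School B" are placeholders, not names from the slide.

```python
# Pass counts and enrollments from the two-school example.
# "School A" / "School B" are placeholder labels (assumed, not from the slide).
schools = {
    "School A": {"passed": 1900, "enrolled": 2000},
    "School B": {"passed": 1900, "enrolled": 2500},
}


def pass_rate(passed, enrolled):
    """Return the share of enrolled students who passed, as a fraction."""
    return passed / enrolled


rates = {name: pass_rate(**counts) for name, counts in schools.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of students passed")
```

The counts are identical, but the rates are 95% and 76%, so the smaller school performed better.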


Good Data needs to be Communicated and Read Carefully

An advertisement for a home security system says, “When you go on vacation, burglars go to work. According to FBI statistics, over 26% of home burglaries take place between Memorial Day and Labor Day. Beware - summertime is burglary time!”

“Only one in two cameras is actually in operation, but this could soon increase to as many as one in three.” (Watford Observer, 2 August 2002)

Continental Airlines once advertised that it had “decreased lost baggage by 100% in the past six months.”
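
A little arithmetic helps when reading the three claims above. The sketch below is illustrative only: the year 2009 and the baggage count of 250 are assumptions chosen for the example, not figures from the slide.

```python
from datetime import date, timedelta
from fractions import Fraction


def last_monday_of_may(year):
    """Memorial Day: the last Monday in May."""
    d = date(year, 5, 31)
    return d - timedelta(days=d.weekday())  # weekday(): Monday == 0


def first_monday_of_september(year):
    """Labor Day: the first Monday in September."""
    d = date(year, 9, 1)
    return d + timedelta(days=(7 - d.weekday()) % 7)


# Claim 1: what share of the year lies between the two holidays?
year = 2009  # assumed example year
start = last_monday_of_may(year)
end = first_monday_of_september(year)
summer_days = (end - start).days + 1  # count both holidays
days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
summer_share = summer_days / days_in_year
print(f"Memorial Day to Labor Day covers {summer_share:.1%} of {year}")

# Claim 2: is "one in three" an increase over "one in two"?
is_increase = Fraction(1, 3) > Fraction(1, 2)
print("One in three is more than one in two:", is_increase)

# Claim 3: a 100% decrease leaves nothing at all.
lost_before = 250  # hypothetical count, for illustration only
lost_after = lost_before * (1 - 1.00)
print("Lost bags after a 100% decrease:", lost_after)
```

If the summer stretch covers more than a quarter of the year, then 26% of burglaries falling in it is about what an even spread would produce, so the statistic does not support "summertime is burglary time."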


Results from a Gallup poll, taken May 29-31, 2009, with a 3% margin of error.

Can we conclude that more Americans have a favorable impression of Dick Cheney than Nancy Pelosi?

What can we conclude from these graphs?
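
For the margin-of-error question, one rough check is whether the two ranges of plausible values overlap. The poll percentages are not reproduced in this transcript, so the values below are hypothetical placeholders; only the 3-point margin comes from the slide. Overlapping ranges are a rule of thumb, not a formal significance test.

```python
MARGIN = 3.0  # percentage points, as stated for the poll


def interval(estimate, margin=MARGIN):
    """Return the (low, high) range of plausible values for one estimate."""
    return (estimate - margin, estimate + margin)


def intervals_overlap(a, b):
    """True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical favorable ratings, NOT the actual poll results.
person_a = 40.0
person_b = 36.0

range_a = interval(person_a)
range_b = interval(person_b)
print("Person A:", range_a)
print("Person B:", range_b)

if intervals_overlap(range_a, range_b):
    print("The ranges overlap, so the poll alone cannot rank the two.")
else:
    print("The ranges do not overlap, so the gap is larger than the margin.")
```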


Good Data is Valid, Unbiased & Reliable

Valid – relevant and appropriate

Unbiased – not consistently different from the truth in one direction

Reliable – as little variation as possible


Even Good Data Varies: How Long is a Minute?

How accurate are you and your classmates at knowing how long a minute is?

Get a partner and a stopwatch. You will take turns timing and guessing. Using the stopwatch, the timer tells the guesser when to start. When the guesser believes that a minute has passed, the guesser says “Stop.” At that point, the timer stops the stopwatch and records the time that passed to the nearest tenth of a second. Do not tell your partner how much time actually passed!

Reset the stopwatch and switch roles. Continue timing and measuring until each person has been timed three times.


Analyzing “How Long is a Minute?”

Was your data valid? Was either partner’s data biased? Which partner was more reliable? How about the class as a whole?

Add your data (3 from each of you) to the class list and graph.

Then compute the average of your 3 measurements and add it to the other graph.
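
One way to answer the bias and reliability questions is to compare each partner's average with 60 seconds and to look at how spread out the three guesses are. The times below are made up for illustration; substitute your own recorded values.

```python
from statistics import mean, stdev

TRUE_SECONDS = 60.0

# Hypothetical recorded times in seconds (three guesses per partner).
guesses = {
    "Partner 1": [52.3, 55.1, 53.8],
    "Partner 2": [58.0, 66.4, 61.2],
}


def summarize(times):
    """Return the average, the bias (average minus 60), and the spread."""
    average = mean(times)
    return {
        "average": average,
        "bias": average - TRUE_SECONDS,  # negative means guessing too early
        "spread": stdev(times),          # smaller means more reliable
    }


summary = {name: summarize(times) for name, times in guesses.items()}

for name, stats in summary.items():
    print(
        f"{name}: average {stats['average']:.1f} s, "
        f"bias {stats['bias']:+.1f} s, spread {stats['spread']:.1f} s"
    )
```

With these made-up numbers, Partner 1 is more reliable (small spread) but biased toward stopping early, while Partner 2 is closer to 60 seconds on average but less consistent.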


All data varies, but we can use Averages to Improve Reliability

No measuring process is perfectly reliable.

The average of several repeated measurements of the same individual is more reliable (and less variable) than a single measurement.
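
A short simulation can illustrate why averages vary less than single measurements. It assumes, purely for the sketch, that each measurement is the true value plus normally distributed noise; the true value of 60 and noise level of 5 are arbitrary choices.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 60.0  # assumed true quantity being measured
NOISE_SD = 5.0     # assumed size of the measurement noise
TRIALS = 5000


def measure():
    """One noisy measurement of the same individual."""
    return rng.gauss(TRUE_VALUE, NOISE_SD)


# Variation among single measurements.
singles = [measure() for _ in range(TRIALS)]

# Variation among averages of three repeated measurements.
averages_of_3 = [mean([measure() for _ in range(3)]) for _ in range(TRIALS)]

print(f"Spread of single measurements: {stdev(singles):.2f}")
print(f"Spread of averages of three:   {stdev(averages_of_3):.2f}")
```

Under these assumptions the averages cluster more tightly around the true value than the individual measurements do.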


Goal: The least amount of variability and bias possible!


How do we achieve our goal?

To reduce bias, use random sampling. A random sample should represent the population, and therefore give unbiased results.

To reduce variability, use a larger sample. Estimates from larger random samples vary less from sample to sample, so they will almost always land close to the truth.
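
Both ideas can be seen in a small simulation. The sketch below invents a population and a deliberately poor "convenience" method that only samples from the upper half of the values; both are assumptions made for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)  # fixed seed so the run is repeatable

# A made-up population of 10,000 values.
population = [rng.gauss(60.0, 8.0) for _ in range(10_000)]
true_mean = mean(population)

# A hypothetical biased method: only the larger half can be chosen.
top_half = sorted(population)[len(population) // 2:]


def random_sample_mean(n):
    """Average of a simple random sample of size n."""
    return mean(rng.sample(population, n))


def convenience_sample_mean(n):
    """Average of a sample drawn only from the larger half."""
    return mean(rng.sample(top_half, n))


def bias_and_spread(sampler, n, repeats=500):
    """Repeat the sampling and report (bias, variability) of the estimates."""
    estimates = [sampler(n) for _ in range(repeats)]
    return mean(estimates) - true_mean, stdev(estimates)


small = bias_and_spread(random_sample_mean, 10)
large = bias_and_spread(random_sample_mean, 100)
biased = bias_and_spread(convenience_sample_mean, 100)

print(f"Random, n=10:       bias {small[0]:+.2f}, spread {small[1]:.2f}")
print(f"Random, n=100:      bias {large[0]:+.2f}, spread {large[1]:.2f}")
print(f"Convenience, n=100: bias {biased[0]:+.2f}, spread {biased[1]:.2f}")
```

In this sketch the larger random sample shrinks the spread, while the convenience sample stays off target no matter how large it is. Sample size reduces variability, not bias.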


Good Data shouldn't be Confounded

Just because two variables have a relationship, that doesn’t mean one causes the other. There could be a confounding variable at play.

A confounding variable is an additional variable that affects the response but whose influence isn't separated out from that of the variable being studied.

Confounding variables are most often found in observational studies that compare a characteristic of two groups, or in poorly designed experiments.

Sometimes the media ignore confounding variables and misinterpret results from observational studies, reporting “proven” links when the statistics have only shown evidence of a relationship.
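
The simulation below is a sketch of how a confounding variable can create a gap between two groups even when group membership has no effect at all. Every number in it is invented: a hidden trait raises the outcome and also makes people more likely to choose the group.

```python
import random
from statistics import mean

rng = random.Random(2)  # fixed seed so the run is repeatable
N = 20_000


def group_gap(randomized):
    """Return mean(outcome in group) minus mean(outcome out of group)."""
    in_group, out_group = [], []
    for _ in range(N):
        hidden_trait = rng.random() < 0.5
        if randomized:
            # Experiment: a coin flip decides, ignoring the hidden trait.
            joins = rng.random() < 0.5
        else:
            # Observational study: people with the trait tend to join.
            joins = rng.random() < (0.8 if hidden_trait else 0.2)
        # The outcome depends only on the hidden trait, never on the group.
        outcome = rng.gauss(10.0 if hidden_trait else 0.0, 1.0)
        (in_group if joins else out_group).append(outcome)
    return mean(in_group) - mean(out_group)


observational_gap = group_gap(randomized=False)
experimental_gap = group_gap(randomized=True)

print(f"Gap when people choose their group: {observational_gap:.2f}")
print(f"Gap under random assignment:        {experimental_gap:.2f}")
```

When people choose their own group, the hidden trait shows up as a large gap. Random assignment spreads the trait evenly across both groups, and the gap shrinks toward zero.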


Consider this data – is there a relationship?

Year    Methodist Ministers in New England    Cuban Rum Imported to Boston (barrels)
1860     63                                    8,376
1865     48                                    6,406
1870     53                                    7,005
1875     64                                    8,486
1880     72                                    9,595
1885     80                                   10,643
1890     85                                   11,265
1895     76                                   10,071
1900     80                                   10,547
1905     83                                   11,008
1910    105                                   13,885
1915    140                                   18,559
1920    175                                   23,024
1925    183                                   24,185
1930    192                                   25,434
1935    221                                   29,238
1940    262                                   34,705
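
The strength of the relationship in this table can be measured with a correlation coefficient. The sketch below computes Pearson's r by hand from the values above. The choice of Pearson's r and the comparison against the year are additions for illustration, not part of the slide.

```python
from math import sqrt

years = list(range(1860, 1945, 5))  # 1860, 1865, ..., 1940
ministers = [63, 48, 53, 64, 72, 80, 85, 76, 80, 83,
             105, 140, 175, 183, 192, 221, 262]
rum_barrels = [8376, 6406, 7005, 8486, 9595, 10643, 11265, 10071, 10547,
               11008, 13885, 18559, 23024, 24185, 25434, 29238, 34705]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r_ministers_rum = pearson_r(ministers, rum_barrels)
r_year_ministers = pearson_r(years, ministers)
r_year_rum = pearson_r(years, rum_barrels)

print(f"Ministers vs. rum:  r = {r_ministers_rum:.3f}")
print(f"Year vs. ministers: r = {r_year_ministers:.3f}")
print(f"Year vs. rum:       r = {r_year_rum:.3f}")
```

Ministers and rum move together almost perfectly, but both also rise with the year. That makes the passage of time (for example, a growing population) a plausible confounding variable, rather than ministers causing rum imports or the reverse.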


Another confounding example

A study reports that a group of children who had certain vaccinations were more likely to develop autism than a group of children who did not receive those same vaccinations.

Does this mean that vaccinations cause autism?

No. Since the children were not randomly assigned to get vaccinations or not, there could be a confounding variable at play. Perhaps the parents who chose vaccinations also made some other choice that increased the risk of autism, or were genetically more likely to have autistic children.