MATH1041 study notes for UNSW

Upload: oliver

Post on 05-Nov-2015

Study notes for Statistics for Life and Social Sciences.

Stats Study Notes

Graphical and Numerical Summaries

Statistic - summary of data (which are measures of events)

Field of Statistics - collecting, analysing and understanding data measured with uncertainty

When choosing which graph:

Quantitative, 1 variable: Histogram (columns, no gaps); Box plot

Quantitative, 2 variables: Scatterplot

Categorical, 1 variable: Bar graph (with gaps)

Categorical, 2 variables: Clustered bar chart (two side-by-side charts in same scale); Jittered scatterplot

One of each, 2 variables: Comparative boxplot; Comparative histogram (side-by-side, same scale)

When looking at a graph observe:

Location - where most data is (similar to mode, also mean/median)

Spread - variability (width of bulky part)

Shape - symmetric, left-skewed, right-skewed (skewed = direction it is pulled from symmetry)

Unusual observations

When choosing which method of numerical summary:

One categorical variable - table of frequency/percentages

One quantitative variable

Location

Mean

Median

With the values sorted in increasing order as $x_1 \le x_2 \le \dots \le x_n$:

if n (number of values) is odd, $M = x_{(n+1)/2}$

if n is even, $M = \frac{x_{n/2} + x_{n/2+1}}{2}$

Spread

Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$

Interquartile Range - IQR = Q3 - Q1 (Q1 and Q3 are calculated as the medians of the bottom and top halves of the data)

Five number summary: (Min, Q1, M, Q3, Max). This is the data shown in a box plot, except that the tails (whiskers) of a box plot may exclude outliers. The whisker limits are found by going 1.5 × IQR below Q1 and 1.5 × IQR above Q3, then picking the furthest data points within this range. Outlier points beyond the whiskers are marked individually.
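The numerical summaries above can be sketched in Python as a check on hand calculations. This is a minimal sketch, not the course's official method: the function names and the sample data are made up for illustration, and it assumes the convention that the middle value is left out of both halves when n is odd (the notes do not say which convention to use, and software packages differ).

```python
import math

def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    xs = sorted(values)
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]
    return (xs[mid - 1] + xs[mid]) / 2

def sample_sd(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def five_number_summary(values):
    """(Min, Q1, M, Q3, Max), with Q1 and Q3 as medians of the bottom and top halves.

    Assumption: when n is odd, the middle value is excluded from both halves.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[n - half:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

def whiskers_and_outliers(values):
    """Whisker ends (furthest points within 1.5 x IQR of the quartiles) and the outliers."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = sorted(x for x in values if x < low_fence or x > high_fence)
    return min(inside), max(inside), outliers

# Hypothetical data set with one unusually large value.
data = [2, 4, 4, 5, 7, 9, 30]
print(five_number_summary(data))    # (2, 4, 5, 9, 30)
print(whiskers_and_outliers(data))  # (2, 9, [30])
print(sample_sd(data))
```

Here IQR = 9 - 4 = 5, so the whisker limits are 4 - 7.5 = -3.5 and 9 + 7.5 = 16.5. The value 30 falls outside them and would be drawn as an outlier, while the upper whisker stops at 9.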

Transformations

Linear transformations change the units of x to x_new, for example time (min → h), length (km → mi) and temperature (°C → °F), altering location and spread, but not shape. They are found by the equation:

$x_{new} = a + bx$

Measures of location follow this:

$\bar{x}_{new} = a + b\bar{x}$

$M_{new} = a + bM$

Measures of spread are only affected by b (through its absolute value, since spread cannot be negative):

$s_{new} = |b|s$

$IQR_{new} = |b| \cdot IQR$
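The rules for linear transformations can be checked numerically. The sketch below uses hypothetical temperature readings and the standard °C to °F conversion (a = 32, b = 1.8). The helper name `linear_transform` is made up for illustration, and the summary functions come from Python's standard `statistics` module.

```python
import statistics

def linear_transform(values, a, b):
    """Apply x_new = a + b * x to every value."""
    return [a + b * x for x in values]

# Hypothetical temperatures in degrees Celsius.
celsius = [12.0, 15.5, 18.0, 21.0, 30.5]
a, b = 32.0, 1.8  # Celsius to Fahrenheit
fahrenheit = linear_transform(celsius, a, b)

mean_c, mean_f = statistics.mean(celsius), statistics.mean(fahrenheit)
med_c, med_f = statistics.median(celsius), statistics.median(fahrenheit)
sd_c, sd_f = statistics.stdev(celsius), statistics.stdev(fahrenheit)

# Location follows a + b * (old value); spread is only scaled by |b|.
print(mean_f, a + b * mean_c)
print(med_f, a + b * med_c)
print(sd_f, abs(b) * sd_c)
```

Each printed pair should agree up to floating-point rounding: the shift a moves the mean and median but drops out of the standard deviation.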

Non-linear transformations change shape, and are good for correcting skewed data and working with outliers. To pull in the right tail (right-skewed data) use log(x) [preferred], $x^{1/4}$ or $x^{1/2}$ (from strongest to weakest). These are monotonically increasing (they keep everything in order). The base of the log only affects the scale, not the shape, so changing the base will not make the data any more symmetrical. Because log(ab) = log(a) + log(b), logs change multiplicative relationships to additive ones. To pull in the left tail (left-skewed data), treat it as -x then continue as for right-skewed data (e.g. log(-x), which requires -x > 0). If dealing with zeros in right-skewed data, use log(x+1). To stretch the proportions of data where 0