![Page 1: Computing in Archaeology Basic Statistics Week 8 (25/04/07) © Richard Haddlesey](https://reader037.vdocuments.us/reader037/viewer/2022102814/5514968f550346ea6e8b54ab/html5/thumbnails/1.jpg)
Computing in Archaeology
Basic Statistics
Week 8 (25/04/07)
© Richard Haddlesey www.medievalarchitecture.net
Aims
• To familiarise ourselves with KEY statistical terms and their meanings
• To understand the use of stats in archaeology
• To assign variables appropriate levels of measurement at the recording level
Key texts
Basic Stats
• Batch: post holes
• Variables: length, area, diameter
• Case: post hole ID
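One way to picture the three terms is as a small data structure. The sketch below is only an illustration: the class name `PostHole`, the units and every measurement value are made up for the example and do not come from any real excavation record.

```python
from dataclasses import dataclass


@dataclass
class PostHole:
    """One case: a single post hole, described by the recorded variables."""
    post_hole_id: str   # identifies the case
    length_m: float     # variable (metres are an assumed unit)
    area_m2: float      # variable
    diameter_m: float   # variable


# The batch is the whole collection of cases. All values are invented.
batch = [
    PostHole("PH001", 0.42, 0.14, 0.40),
    PostHole("PH002", 0.38, 0.11, 0.36),
]

# Pulling one variable out of the batch gives the values to summarise.
diameters = [case.diameter_m for case in batch]
print(diameters)
```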
Variables
Variables are measured according to one of FOUR levels
1. Nominal = arbitrary name
2. Ordinal = sequence with no distance
3. Interval = sequence with fixed distance
4. Ratio = sequence with a fixed datum
Vince NOIR
Nominal, Ordinal, Interval, Ratio
Nominal examples
• Condition
• Age
• Diameter
• Length
• Context
• Period
Ordinal examples
Condition
1. Excellent
2. Good
3. Fair
4. Poor
Here “2” lies between “1” and “3”, but the gaps are unlikely to be of equal distance
Interval examples
Period
1. Late Bronze (1200-650)
2. Early Iron (649-100)
3. Late Iron (100+)
Here, if we have 3 artefacts dated (a) 150BC, (b) 300BC and (c) 450BC, then although b is an equal distance from a and c, c is not three times as old as a.
This is because there is no fixed datum.
Ratio examples
Age instead of period
• 1000 ya is twice 500 ya
• 20kg is twice 10kg
Ratio is the highest level of measurement because it has a datum
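The difference between interval and ratio can be checked with the three dates from the interval example. The sketch below assumes a reference year of 2007 (the lecture year) for converting a BC date into an age; the function name `age_in_years` and the constant are illustrative choices, not part of the lecture.

```python
REFERENCE_YEAR_AD = 2007  # assumption: ages are counted back from the lecture year


def age_in_years(year_bc: int, reference_year_ad: int = REFERENCE_YEAR_AD) -> int:
    """Years elapsed between a BC date and the reference year (there is no year 0)."""
    return year_bc + reference_year_ad - 1


a, b, c = 150, 300, 450  # dates BC from the interval example

# Differences are meaningful on an interval scale: b is midway between a and c.
print(b - a, c - b)

# Ratios of the date labels are not meaningful, because "BC" has no true zero.
label_ratio = c / a
# Ages measured from a fixed datum can be compared as ratios.
age_ratio = age_in_years(c) / age_in_years(a)
print(label_ratio, round(age_ratio, 2))
```

The label ratio comes out as 3, while the age ratio is only a little above 1, which is why age (ratio) is preferred to a calendar date (interval) when ratios matter.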
Mortlake-style bowl
Fengate-style bowl
Grooved ware jar
Nominal, Ordinal and Interval
Note!
• Avoid using 0 or 1 to indicate such variables as yes or no, as we may need to know whether a value is “no” or “no data”
• Also, when using presence or absence, you may wish to add “missing” to avoid confusion
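A minimal sketch of the point above: a three-valued code keeps “absent” and “not recorded” apart, which a bare 0 or 1 cannot do. The names `Presence` and `record_presence` are hypothetical and chosen only for this example.

```python
from enum import Enum
from typing import Optional


class Presence(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    MISSING = "missing"  # nothing was recorded, which is not the same as "no"


def record_presence(observed: Optional[bool]) -> Presence:
    """Map an observation to a code; None means no observation was made."""
    if observed is None:
        return Presence.MISSING
    return Presence.PRESENT if observed else Presence.ABSENT


print(record_presence(True).value, record_presence(False).value, record_presence(None).value)
```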
Further distinction
Nominal and Ordinal
• = categorical
• = qualitative
Interval and Ratio
• = continuous
• = quantitative
Coding
Nominal and Ordinal often need coding, to minimise errors, via a keyword index
con = context
• str = stray find
• set = settlement
• bur = burial
Avoid 1, 2, 3, etc., as you will have to keep looking up their meanings, which is time consuming
Coding
NOTE!
EVERY DATA VALUE MUST HAVE A CODE AND ONLY ONE CODE!
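The rule can be enforced mechanically. The sketch below reuses the three context codes from the keyword index; the function name `encode` and the error messages are assumptions made for illustration, and it simply refuses any value that has no code or more than one.

```python
# Keyword index for the "context" variable, taken from the coding slide.
CONTEXT_CODES = {
    "str": "stray find",
    "set": "settlement",
    "bur": "burial",
}


def encode(value: str, codebook: dict = CONTEXT_CODES) -> str:
    """Return the single code for a data value, or raise if the rule is broken."""
    matches = [code for code, meaning in codebook.items() if meaning == value]
    if not matches:
        raise ValueError(f"no code defined for {value!r}")
    if len(matches) > 1:
        raise ValueError(f"{value!r} has more than one code: {matches}")
    return matches[0]


print(encode("burial"))
```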
Grouping
Good for periods, as in
• Late Bronze (1200-650)
• Early Iron (649-100)
• Late Iron (100+)
NOTE: it is better to record as a continuous variable (e.g. 780BC), then group as an output (e.g. Late Bronze)
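Recording the date and deriving the period afterwards might look like the sketch below. It assumes all dates are years BC, reads “100+” as “later than 100 BC”, and assigns the shared boundary year 100 to Early Iron; those readings and the function name `period_for` are assumptions, not statements from the lecture.

```python
def period_for(year_bc: int) -> str:
    """Group a recorded date (years BC) into one of the three periods."""
    if 650 <= year_bc <= 1200:
        return "Late Bronze"
    if 100 <= year_bc <= 649:
        return "Early Iron"
    if 0 < year_bc < 100:   # assumed meaning of "100+": later than 100 BC
        return "Late Iron"
    raise ValueError(f"{year_bc} BC falls outside the defined periods")


recorded_dates_bc = [780, 300, 50]   # the continuous variable is what gets stored
print([period_for(year) for year in recorded_dates_bc])
```

Because the original dates are kept, the period boundaries can be changed later without re-recording anything.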
Good Practice
Always keep a “CLEAN” version of the original data set
Exploring the data
Context FNO Taxon Bone z1 z2 z3 z4 z5 z6 F/U L/R art. sex NISP chop cut m1 m2 m3 m4
269 58 bs mn 0 0 0 0 0 0 - r - - 1 35.9 14.6
722 191 eq sc 1 1 1 1 1 1 f r 2 - 1 78.2 40.7 55.6
722 191 eq sc 1 1 1 1 1 1 f l 2 - 1 78.7 41.4 48.5
371 102 eq sc 1 1 1 1 1 1 f r - - 1 45.0 58.0 52.9
722 191 eq cal 1 1 1 1 1 0 f r 2 - 1 90.6 45.0
722 191 eq mp 1 1 1 0 0 0 f l 2 - 1 41 45.6 40.3 28.7
722 191 eq mp 1 1 1 0 0 0 f r 2 - 1 42 46.0 39.5 29.4
722 191 eq mp 1 1 1 0 0 0 f r 2 - 1 46.0 39.7 28.5
285 72 bs cal 1 1 1 1 1 0 f r - - 1 1 1 137.5 46.3
722 191 eq mp 1 1 1 0 0 0 f l 2 - 1 42 46.3 40.0 29.2
722 191 eq pp 1 1 1 0 0 0 f l 2 - 1 71 48.7 45.0 32.5
722 191 eq pp 1 1 1 0 0 0 f r 2 - 1 71 48.8 45.2 32.5
722 191 eq pp 1 1 1 0 0 0 f r 2 - 1 68 49.0 45.0 34.1
722 191 eq pel 1 1 1 1 1 1 f l 2 - 1 60.1 52.2
722 191 eq ast 1 1 1 1 0 0 - r 2 - 1 51 53 44.9
722 191 eq ast 1 1 1 1 0 0 - l 2 - 1 51 54 44.4 52.7
722 191 eq mciii 1 1 1 1 1 1 f r 2 - 1 187 179 43.7 28.6
722 191 eq mciii 1 1 1 1 1 1 f l 2 - 1 187 180 42.8
722 191 eq mtiii 1 1 1 1 1 1 f l 2 - 1 229 223 41.4 39.1
722 191 eq mtiii 1 1 1 1 1 1 f r 2 - 1 229 223 42.8 39.5
722 191 eq hum 1 1 1 1 1 1 f/f r 2 - 1 232 30.8
722 191 eq rad 1 1 1 1 1 1 f/f l 2 - 1 274 71.7 64.2
example data set
univariate frequency table

| species | frequency |
|---|---|
| cattle | 187 |
| sheep | 109 |
| pig | 78 |
| horse | 21 |
| Total | 395 |
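A univariate frequency table is simply a tally of one variable. As a rough sketch, the Taxon codes of the 22 rows in the example data set can be counted with Python's `collections.Counter`; the list below was transcribed by hand from that data set, and the codes are left uninterpreted.

```python
from collections import Counter

# Taxon column of the example data set, in row order (22 rows).
taxa = ["bs"] + ["eq"] * 7 + ["bs"] + ["eq"] * 13

frequency = Counter(taxa)

# Print the tally in the same layout as a frequency table.
for taxon, count in frequency.most_common():
    print(f"{taxon:<8}{count:>4}")
print(f"{'Total':<8}{sum(frequency.values()):>4}")
```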
bivariate frequency table

| species | pits | ditches | Total |
|---|---|---|---|
| cattle | 67 | 120 | 187 |
| sheep | 63 | 46 | 109 |
| pig | 41 | 37 | 78 |
| horse | 3 | 18 | 21 |
| Total | 174 | 221 | 395 |
bivariate frequency table

| species | pits | pits % | ditches | ditches % | Total |
|---|---|---|---|---|---|
| cattle | 67 | 39% | 120 | 54% | 187 |
| sheep | 63 | 36% | 46 | 21% | 109 |
| pig | 41 | 24% | 37 | 17% | 78 |
| horse | 3 | 2% | 18 | 8% | 21 |
| Total | 174 | 100% | 221 | 100% | 395 |
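The percentages in this table are column percentages: each count divided by the total for its feature type. The sketch below recomputes them from the counts; the dictionary layout and variable names are just one convenient arrangement, not a prescribed method.

```python
# Counts from the bivariate frequency table.
counts = {
    "cattle": {"pits": 67, "ditches": 120},
    "sheep": {"pits": 63, "ditches": 46},
    "pig": {"pits": 41, "ditches": 37},
    "horse": {"pits": 3, "ditches": 18},
}
features = ["pits", "ditches"]

# Marginal totals: one per column (feature) and one per row (species).
column_totals = {f: sum(row[f] for row in counts.values()) for f in features}
row_totals = {species: sum(row.values()) for species, row in counts.items()}

# Column percentages, rounded to whole numbers as on the slide.
percentages = {
    species: {f: round(100 * row[f] / column_totals[f]) for f in features}
    for species, row in counts.items()
}

for species, row in percentages.items():
    print(species, row, row_totals[species])
```

Because each percentage is rounded separately, the rounded pits column adds up to 101 rather than exactly 100.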
Multivariate
Multivariate techniques tend to operate on a table, or matrix, of items, each described in terms of a set of variables
Pictorial displays for categorical data
[Figure: bar chart of % by species (cattle, sheep, pig, horse); y-axis 0 to 50%]
[Figure: multiple bar chart of % by species (cattle, sheep, pig, horse) for pits and ditches; y-axis 0 to 60%]
[Figure: pie chart]
Pictorial displays for continuous data
[Figure: histograms of Bd (mm), 49.0 to 72.0, against Count (0 to 6), one panel each for Hunt's House and Monkton]
Basic descriptive statistics:
• mode
• median
• mean
• range
• variance
• standard deviation
pottery fragments (weights in grams): 2, 2, 3, 5, 8
Mode = 2
Mode
• Mode is the only way to measure average/typical in the Nominal class
• If there are two modes then the data are bimodal (1, 2, 3, 3, 6, 6, 7, 8, 9)
• Three = trimodal, etc.
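Both cases above can be checked with the standard library. This is a small sketch using `statistics.multimode` (available from Python 3.8), which returns every value tied for the highest count.

```python
from statistics import multimode

pottery_weights = [2, 2, 3, 5, 8]
bimodal_example = [1, 2, 3, 3, 6, 6, 7, 8, 9]

# One mode: the weight 2 occurs most often.
print(multimode(pottery_weights))

# Two modes: 3 and 6 each occur twice, so the data are bimodal.
print(multimode(bimodal_example))
```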
pottery fragments (weights in grams): 2, 2, 3, 5, 8
Median = 3
Median
• Best for ordinal and above
• If the number of values is even, take the number midway between the two middle values
(1, 2, 3, 4, 5, 6, 7, 8: (4+5)/2 = 4.5)
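The rule for odd and even counts can be written out in a few lines. The helper name `median_of` is made up for this sketch, it assumes a non-empty list of numbers, and the result is compared with the standard library's `statistics.median`.

```python
from statistics import median


def median_of(values):
    """Middle value of the sorted data; midpoint of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


pottery_weights = [2, 2, 3, 5, 8]          # odd count: the middle value
even_example = [1, 2, 3, 4, 5, 6, 7, 8]    # even count: (4 + 5) / 2

print(median_of(pottery_weights), median_of(even_example))
print(median(pottery_weights), median(even_example))
```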
pottery fragments (weights in grams): 2, 2, 3, 5, 8
Mode = 2
Median = 3
Mean = (2+2+3+5+8)/5 = 4
Mean
• The most commonly used average; it only works for interval and ratio
• It is the most important measure of position because a lot of further statistical analyses are based on it
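Taken together, the mode, median and mean slides imply a simple lookup from level of measurement to the averages that make sense for it. The sketch below encodes that reading; the names `Level` and `usable_averages` are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """The four levels of measurement, ordered from lowest to highest (NOIR)."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def usable_averages(level: Level) -> list:
    """Averages that are meaningful at a given level of measurement."""
    averages = ["mode"]                 # the only option for nominal data
    if level >= Level.ORDINAL:
        averages.append("median")       # needs an ordering
    if level >= Level.INTERVAL:
        averages.append("mean")         # needs fixed distances
    return averages


for level in Level:
    print(level.name, usable_averages(level))
```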
Conclusion
It is important to understand that the mode, median and mean are three quite different measures of position which can give three different values when applied to the same data-set

| | 2, 2, 3, 5, 8 | 2, 2, 3, 5, 6, 8 |
|---|---|---|
| Mode | 2 | 2 |
| Median | 3 | 4 |
| Mean | 4 | 4.333 |
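The comparison can be reproduced with Python's `statistics` module. This is only a quick check of the figures; the dictionary layout is an arbitrary choice for the example.

```python
from statistics import mean, median, mode

data_sets = {
    "five fragments": [2, 2, 3, 5, 8],
    "six fragments": [2, 2, 3, 5, 6, 8],
}

# Collect the three measures of position for each data set.
summary = {
    name: {
        "mode": mode(values),
        "median": median(values),
        "mean": round(mean(values), 3),
    }
    for name, values in data_sets.items()
}

for name, measures in summary.items():
    print(name, measures)
```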
The skew
[Figure: three distribution curves labelled symmetrical, positive skew and negative skew]
Measures of variability – the spread
pottery fragments (weights in grams): 2, 2, 3, 5, 8
Range = max – min = 8 – 2 = 6
• Very simple and of limited use
variance
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
key: $x_i$ = each value, $\bar{x}$ = the mean, $n$ = the number of values
pottery fragments (weights in grams): 2, 2, 3, 5, 8
(Mean = (2+2+3+5+8)/5 = 4)
variance ($s^2$)
$$s^2 = \frac{(2-4)^2 + (2-4)^2 + (3-4)^2 + (5-4)^2 + (8-4)^2}{5} = \frac{26}{5} = 5.2$$
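The same calculation, step by step, as a short sketch. The worked example divides by the number of values (5), which matches `statistics.pvariance`. Note that `statistics.variance` divides by n − 1 instead and so gives a different figure for the same data; which divisor is appropriate depends on whether the batch is treated as a whole population or as a sample, a choice the slide does not discuss.

```python
from statistics import pvariance, variance

weights = [2, 2, 3, 5, 8]  # pottery fragment weights in grams

mean_weight = sum(weights) / len(weights)                   # 4.0
squared_deviations = [(w - mean_weight) ** 2 for w in weights]
s2 = sum(squared_deviations) / len(weights)                 # divide by n, as on the slide

print(squared_deviations, s2)
print(pvariance(weights))   # divisor n: agrees with the slide
print(variance(weights))    # divisor n - 1: larger value
```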
variance
standard deviation
pottery fragments (weights in grams): 2, 2, 3, 5, 8
variance ($s^2$) = 5.2
standard deviation = $\sqrt{\text{variance}} = \sqrt{5.2} = 2.28$
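A brief sketch of the spread measures for the pottery weights, using the divide-by-n variance from the worked example. `statistics.pstdev` is the standard-library function that matches that convention.

```python
import math
from statistics import pstdev, pvariance

weights = [2, 2, 3, 5, 8]  # pottery fragment weights in grams

value_range = max(weights) - min(weights)        # max - min
s2 = pvariance(weights)                          # variance with divisor n
standard_deviation = math.sqrt(s2)               # square root of the variance

print(value_range, s2, round(standard_deviation, 2))
print(round(pstdev(weights), 2))                 # same result in one call
```

Unlike the variance, the standard deviation is in the same units as the data (grams here), which makes it easier to read against the mean.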
Summary
Variables are measured according to one of FOUR levels
1. Nominal = arbitrary name
2. Ordinal = sequence with no distance
3. Interval = sequence with fixed distance
4. Ratio = sequence with a fixed datum
Summary
Measures of position (average/typical)
• Mode
• Median
• Mean
Measures of variability (the spread)
• Range
• Variance
• Standard Deviation