Overview of some tests

Thomas INGICCO

J.L.T. Géricault, The Raft of the Medusa (Le Radeau de La Méduse)

Chi square test

Aim: Comparison of the observed counts Oij with the theoretical (expected) counts Eij. Are the rows and columns of a contingency table independent? Independence means that belonging to a class of the first variable has no influence on the class of the second variable.

Measured variable: A qualitative variable with k classes

Conditions of use:
The classes of the variables must be mutually exclusive.
Cochran's rule must be respected: Eij ≥ 5 in each class. Some classes may nevertheless have 1 ≤ Eij < 5, provided that at least 80% of the classes have Eij ≥ 5.
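Cochran's rule can be checked directly on the matrix of expected counts. The sketch below is only illustrative: `cochran_ok` is a hypothetical helper name and the table is made-up data, not the Ceramics data set.

```r
# Hypothetical helper (not from the slides): TRUE when no expected count
# is below 1 and at least 80% of the expected counts are >= 5.
cochran_ok <- function(E) {
  all(E >= 1) && mean(E >= 5) >= 0.8
}

# Made-up 2 x 3 table of observed counts.
obs <- matrix(c(12, 9, 7, 15, 6, 11), nrow = 2)

# Expected counts under independence: (row total x column total) / n.
E <- outer(rowSums(obs), colSums(obs)) / sum(obs)
E
cochran_ok(E)
```

If the rule fails, merging sparse classes or switching to the Fisher test (for a 2 x 2 table) are the usual alternatives.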

Test hypotheses:
H0: πi = Ptheo,i. The theoretical proportions Ptheo,i are the real proportions in the observed population.
H1 (two-sided): At least one of the theoretical proportions is not the real proportion in the observed population.

The statistic is:

$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

In R: sum((Oij - Eij)^2/Eij)
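As a minimal worked example of this formula, the sketch below uses a made-up 2 x 2 table (not the Ceramics data) and compares the hand computation with `chisq.test`. The continuity correction that `chisq.test` applies by default to 2 x 2 tables is switched off so that both results are comparable.

```r
# Made-up 2 x 2 table of observed counts O_ij.
Oij <- matrix(c(12, 5, 8, 15), nrow = 2,
              dimnames = list(F = c("F1", "F2"), G = c("G1", "G2")))

# Expected counts E_ij under independence.
Eij <- outer(rowSums(Oij), colSums(Oij)) / sum(Oij)

# Statistic, degrees of freedom and p-value.
chi2 <- sum((Oij - Eij)^2 / Eij)
nu <- (nrow(Oij) - 1) * (ncol(Oij) - 1)
p <- pchisq(chi2, nu, lower.tail = FALSE)
c(chi2 = chi2, df = nu, p.value = p)

# Cross-check with the built-in test (no continuity correction).
ref <- chisq.test(Oij, correct = FALSE)
ref
```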

In details:

Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)

# Contingency table of the two qualitative variables (columns 10 and 11)
obs1 <- data.frame(Ceram[, 10:11])
obs2 <- na.omit(obs1)
for (i in 1:length(obs2)) {obs2[, i] <- factor(obs2[, i])}
obs3 <- table(obs2)
addmargins(obs3)

# Mosaic plot of the observed counts
graphics.off()
par(cex.lab=1.5, xpd=NA, font=2)
mosaicplot(t(obs3), main=NULL, cex.axis=1.1)

# Expected counts under independence
obs3theo <- suppressWarnings(chisq.test(obs3)$expected)
addmargins(obs3theo)

# Chi square statistic, degrees of freedom and p-value
nij <- obs3
tij <- obs3theo
chi2.calc <- sum((nij-tij)^2/tij)
chi2.calc
k <- dim(obs3)[1]
c <- dim(obs3)[2]
nu <- (k-1)*(c-1)
nu
pchisq(chi2.calc, nu, lower.tail=FALSE)


# Test in R

chisq.test(obs3)

Fisher test

Aim: Comparison of the observed counts of G and F (independence of two qualitative variables), expressed as the observed proportions PG1/F1 and PG1/F2 (equality of the proportions)

Measured variable: Two qualitative variables F and G with 2 classes each

Conditions of use:
The classes of the variables must be mutually exclusive.
The qualitative variables are nominal.

Test hypotheses:
H0: πG1/F1 = πG1/F2. Proportions are identical in the target population.
H1 (two-sided): πG1/F1 ≠ πG1/F2. Proportions are different in the target population.
H1 (one-sided, right): πG1/F1 > πG1/F2. Proportion πG1/F1 is strictly greater than πG1/F2 in the target population.
H1 (one-sided, left): πG1/F1 < πG1/F2. Proportion πG1/F1 is strictly smaller than πG1/F2 in the target population.

The statistic is:

$$N_{FE} = n_{11}$$

where n11 is the observed count in the first row and first column of the 2 x 2 table.

In R: obs3[1, 1]


In details:

Ceram<-read.table("K:/Cours/Philippines/Statistics-210/Lecture-4/Ceramics.txt",header=TRUE)

# 2 x 2 contingency table of the two qualitative variables (columns 12 and 9)
obs1 <- Ceram[, c(12,9)]
obs2 <- na.omit(obs1)
for (i in 1:length(obs2)) {obs2[, i] <- factor(obs2[, i])}
obs3 <- table(obs2)
# Optional: transpose the table
# obs3<-t(obs3)
# Optional: swap the two columns
# obs4<-obs3 ; obs4[,1]<-obs3[,2] ; obs4[,2]<-obs3[,1] ; dimnames(obs4)[[2]][1] <- dimnames(obs3)[[2]][2] ; dimnames(obs4)[[2]][2] <- dimnames(obs3)[[2]][1] ; obs3<-obs4
addmargins(obs3)

# Bar plot of the proportion of G1 within F1 and within F2
graphics.off()
pG1.Fi <- obs3[, 1]/margin.table(obs3, 1)
par(mar=c(5.1, 5.1, 4.1, 2.1))
barplot(pG1.Fi, xlab = paste(labels(dimnames(obs3))[2], " / ", labels(dimnames(obs3))[1]), xaxt = "n", ylab="Proportion", ylim=range(0, max(pG1.Fi)+ 0.15), cex.lab=2, cex.axis=1.8)
position.labels <- barplot(pG1.Fi, plot = FALSE)
axis(side=1, at = position.labels, labels = c(paste(colnames(obs3)[1], " / ", rownames(obs3)[1]), paste(colnames(obs3)[1], " / ", rownames(obs3)[2])), cex.axis=1.8)

# Mosaic plot in a new graphics window (dev.new() works on every platform)
dev.new()
par(cex.lab=2, xpd=NA, font=2)
mosaicplot(t(obs3), main=NULL, cex.axis=1.5)

# Proportions of G1 within F1 and within F2
n11 <- obs3[1, 1]
n1. <- margin.table(obs3, 1)[1]
n21 <- obs3[2, 1]
n2. <- margin.table(obs3, 1)[2]
pG1.F1 <- n11/n1.
pG1.F2 <- n21/n2.
t(data.frame(pG1.F1, pG1.F2))

# Remaining conditional proportions
n12 <- obs3[1,2]
n22 <- obs3[2,2]
n.1 <- margin.table(obs3,2)[1]
n.2 <- margin.table(obs3,2)[2]
pG2.F1 <- n12/n1.
pG2.F2 <- n22/n2.
pF1.G1 <- n11/n.1
pF1.G2 <- n12/n.2
pF2.G1 <- n21/n.1
pF2.G2 <- n22/n.2
t(data.frame(pG2.F1, pG2.F2, pF1.G1, pF1.G2, pF2.G1, pF2.G2))

# Test statistic
n11 <- obs3[1,1]
NFE.calc <- n11
NFE.calc

# One-sided p-values from the hypergeometric distribution
n1. <- margin.table(obs3,1)[1]
n <- margin.table(obs3)
n.1 <- margin.table(obs3,2)[1]
p.right <- phyper(NFE.calc-1, n1., n-n1., n.1, lower.tail=FALSE)
p.left <- phyper(NFE.calc, n1., n-n1., n.1)
p.right
p.left

# Two-sided p-value: the smaller tail, plus the part of the opposite tail
# made of values no more probable than the observed one
if (p.right < p.left) {
  p.value1 <- p.right
  NFE.left <- NFE.calc
  d.NFE.calc <- round(dhyper(NFE.calc, n1., n-n1., n.1), 12)
  d.NFE.left <- Inf
  while (NFE.left >= 0 && d.NFE.left > d.NFE.calc) {
    NFE.left <- NFE.left - 1
    d.NFE.left <- round(dhyper(NFE.left, n1., n-n1., n.1), 12)
  }
  if (d.NFE.left > d.NFE.calc) {p.value2 <- 0} else {p.value2 <- phyper(NFE.left, n1., n-n1., n.1)}
} else {
  p.value1 <- p.left
  NFE.right <- NFE.calc
  d.NFE.calc <- round(dhyper(NFE.calc, n1., n-n1., n.1), 12)
  d.NFE.right <- Inf
  while (d.NFE.right > d.NFE.calc) {
    NFE.right <- NFE.right + 1
    d.NFE.right <- round(dhyper(NFE.right, n1., n-n1., n.1), 12)
  }
  p.value2 <- phyper(NFE.right-1, n1., n-n1., n.1, lower.tail=FALSE)
}

p.value <- p.value1 + p.value2
p.value
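The same two-sided p-value can be sketched without loops, by summing the hypergeometric probabilities of every possible value of n11 that is no more probable than the observed one. The table below is made-up data, and the small tolerance factor is an assumption added here to guard against floating-point ties.

```r
# Made-up 2 x 2 table (not the Ceramics data).
tab <- matrix(c(8, 2, 3, 9), nrow = 2)
n11 <- tab[1, 1]
n1. <- sum(tab[1, ])   # total of the first row
n.1 <- sum(tab[, 1])   # total of the first column
n   <- sum(tab)

# All values n11 can take when the margins are held fixed.
support <- max(0, n1. + n.1 - n):min(n1., n.1)
d <- dhyper(support, n1., n - n1., n.1)
d.obs <- dhyper(n11, n1., n - n1., n.1)

# Two-sided p-value: total probability of the values no more probable
# than the observed one.
p.two.sided <- sum(d[d <= d.obs * (1 + 1e-7)])

# One-sided (right) p-value: P(N >= n11).
p.right <- phyper(n11 - 1, n1., n - n1., n.1, lower.tail = FALSE)

c(two.sided = p.two.sided, right = p.right)
fisher.test(tab)$p.value
```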

# Probability of the observed table, by hand and with dhyper()
n11 <- obs3[1,1]
Pn11 <- choose(n1.,n11)*choose(n-n1.,n.1-n11)/choose(n,n.1)
Pn11
dhyper(n11,n1.,n-n1.,n.1)

# Test in R
fisher.test(obs3)

Student t test

Aim: Comparison of two observed means m1 and m2

Measured variable: A quantitative variable and a qualitative variable with two classes

Conditions of use:
The quantitative variable must follow a normal distribution.
The quantitative variable may be continuous or discrete.

Test hypotheses:
H0: μ1 = μ2. Means are identical in the target population.
H1 (two-sided): μ1 ≠ μ2. Means are different in the target population.
H1 (one-sided, right): μ1 > μ2. Mean μ1 is strictly greater than mean μ2 in the target population.
H1 (one-sided, left): μ1 < μ2. Mean μ1 is strictly smaller than mean μ2 in the target population.

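The statistic is:

$$t = \frac{m_1 - m_2}{\sqrt{\hat{s}^2 \times \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

with:

$$\hat{s}^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}$$

In R: (m1-m2)/(s2*(1/n1+1/n2))^0.5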
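A minimal worked example of the pooled-variance t statistic, on made-up vectors `x1` and `x2` (assumptions for illustration, not the Ceramics data), cross-checked against `t.test` with `var.equal=TRUE`:

```r
# Made-up measurements for the two classes.
x1 <- c(12.1, 13.4, 11.8, 14.2, 12.9, 13.7)
x2 <- c(10.9, 12.2, 11.5, 10.4, 11.1)

n1 <- length(x1); n2 <- length(x2)
m1 <- mean(x1);   m2 <- mean(x2)

# Pooled variance: weighted mean of the two sample variances.
s2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t statistic, degrees of freedom and two-sided p-value.
t.calc <- (m1 - m2) / (s2 * (1/n1 + 1/n2))^0.5
nu <- n1 + n2 - 2
p <- 2 * min(pt(t.calc, nu, lower.tail = FALSE), pt(t.calc, nu))
c(t = t.calc, df = nu, p.value = p)

# Cross-check with the built-in test.
ref <- t.test(x1, x2, var.equal = TRUE)
ref
```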

In details:

# Quantitative variable (column 2) by base type (column 13), round and flat bases only
obs1 <- data.frame(Ceram[which(Ceram$Base=="Round" | Ceram$Base=="Flat"), c(2,13)])
obs2 <- na.omit(obs1)
obs2[, 2] <- factor(obs2[, 2])
# One column per class, padded with NA to equal length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for (i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
obs3

# Size, mean and standard deviation of each class (NA padding removed)
n1 <- length(na.omit(obs3[, 1]))
n2 <- length(na.omit(obs3[, 2]))
m1 <- mean(na.omit(obs3[, 1]))
m2 <- mean(na.omit(obs3[, 2]))
s.1 <- sd(na.omit(obs3[, 1]))
s.2 <- sd(na.omit(obs3[, 2]))
param <- data.frame(c(n1, n2), c(m1, m2), c(s.1, s.2))
names(param) <- c("Effectives", "Mean", "Standard deviation")
row.names(param) <- levels(obs2[, 2])
param

# Pooled variance and t statistic
s2 <- ((n1-1)*s.1^2+(n2-1)*s.2^2)/(n1+n2-2)
t.calc <- (m1-m2)/(s2*(1/n1+1/n2))^0.5
t.calc

# Degrees of freedom and two-sided p-value
nu <- n1+n2-2
nu
min(pt(t.calc, nu, lower.tail=FALSE), pt(t.calc, nu))*2

# Test in R
t.test(obs3[, 1], obs3[, 2], var.equal=TRUE)


Analysis of variance (ANOVA)

Aim: Comparison of at least two observed means

Measured variable: A quantitative variable and a qualitative variable with k classes

Conditions of use:
The quantitative variable must follow a normal distribution.
The variances of the quantitative variable in each class of the qualitative variable must be equal.
-> If these conditions are not fulfilled, see the Kruskal-Wallis test.

Test hypotheses:
H0: μ1 = μ2 = ... = μk. Means are identical in the target population.
H1 (two-sided): At least one of the means is different in the target population.

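The statistic is:

$$F_{\nu_1,\nu_2} = \frac{\text{Interclass variance}}{\text{Intraclass variance}} = \frac{SCI/(k-1)}{SCE/(n-k)}$$

with ν1 = k - 1 and ν2 = n - k degrees of freedom, SCI the interclass (between-class) sum of squares and SCE the intraclass (within-class) sum of squares.

In R: (SCI/(k-1))/(SCE/(n-k))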
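The decomposition behind this ratio can be sketched on made-up data (the vectors `x` and `g` below are assumptions for illustration). The total sum of squares SCT splits into SCI + SCE, and the resulting F value can be compared with `anova(lm(...))`.

```r
# Made-up data: quantitative variable x, factor g with k = 3 classes.
x <- c(5.1, 4.8, 5.6, 5.0, 6.2, 6.8, 6.1, 6.5, 5.4, 5.9, 5.2)
g <- factor(rep(c("A", "B", "C"), times = c(4, 4, 3)))

n <- length(x)
k <- nlevels(g)
nc <- tapply(x, g, length)   # class sizes
mc <- tapply(x, g, mean)     # class means
m  <- mean(x)                # overall mean

SCI <- sum(nc * (mc - m)^2)              # interclass sum of squares
SCE <- sum((x - mc[as.integer(g)])^2)    # intraclass sum of squares
SCT <- sum((x - m)^2)                    # total sum of squares

Fcalc <- (SCI / (k - 1)) / (SCE / (n - k))
p <- pf(Fcalc, k - 1, n - k, lower.tail = FALSE)
c(SCI = SCI, SCE = SCE, SCT = SCT, F = Fcalc, p.value = p)

# Cross-check with the built-in analysis of variance.
ref <- anova(lm(x ~ g))
ref
```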


In details:

Ceram<-read.table("K:/Cours/Philippines/Statistics210/Data/Ceramics.txt",header=TRUE)

# Quantitative variable (column 7) and grouping factor (column 10)
obs1 <- Ceram[, c(7,10)]
obs1[,2] <- factor(obs1[,2])
obs2 <- na.omit(obs1)
# One column per class, padded with NA to equal length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for (i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
obs3

# Strip chart of the observations, with the mean of each class
graphics.off()
k <- nlevels(obs2[, 2])
stripchart(obs2[, 1]~obs2[, 2], method="jitter", jitter=0.1, vertical=FALSE, ylim=range(0.5, k+0.5), group.names=levels(obs2[, 2]), xlab= names(obs2)[1], ylab=names(obs2)[2], pch=16, cex=1.2)
mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
for (i in 1:k) {segments(mc[i], i-0.25, mc[i], i+0.25, lwd=3, col=gray(0.5))}

# Size, mean and standard deviation of each class
nc <- sapply(split(obs2[, 1], obs2[, 2]), length)
mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
sc <- sapply(split(obs2[, 1], obs2[, 2]), sd)
param <- data.frame(nc, mc, sc)
names(param) <- c("Observations", "Mean", "Standard.deviation")

# Sums of squares (total, interclass, intraclass) and F statistic
xi <- obs2[,1]
m <- mean(xi)
SCT <- sum((xi-m)^2)
SCT
SCI <- sum(nc*(mc-m)^2)
SCI
SCE <- 0
for (i in 1:k) {SCE <- SCE + sum((na.omit(obs3[[i]])-mc[i])^2)}
SCE
SCI+SCE
n <- length(obs2[,1])
Fcalc <- (SCI/(k-1))/(SCE/(n-k))
Fcalc

pf(Fcalc, (k-1), (n-k), lower.tail=FALSE)

# Test in R
oneway.test(obs1[,1] ~ obs1[,2], var.equal=TRUE)
anova(lm(obs1[,1] ~ obs1[,2]))
param

Mann-Whitney-Wilcoxon test

Aim: Comparison of two observed medians me1 and me2

Measured variable: A quantitative variable and a qualitative variable with 2 classes

Conditions of use:
The individuals constituting the sample must be randomly chosen one by one.
The classes of the variable must be mutually exclusive.

Test hypotheses:
H0: me1 = me2. Medians are identical in the population.
H1 (two-sided): me1 ≠ me2. Medians are different in the population.
H1 (upper): me1 > me2. Median me1 is strictly greater than me2 in the population.
H1 (lower): me1 < me2. Median me1 is strictly smaller than me2 in the population.

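The statistic is:

$$U_1 = W_1 - \frac{1}{2}\, n_1 (n_1 + 1)$$

with:

$$W_1 = \sum_{j=1}^{n_1} r_j$$

where r_j is the rank, in the pooled sample, of the j-th observation of the first class.

In R: W1calc-0.5*n1*(1+n1)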
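A minimal sketch of W1 and U1 on made-up samples without ties (`x1` and `x2` are assumptions for illustration). The value U1 is the statistic that `wilcox.test` reports as W.

```r
# Made-up samples for the two classes, without ties.
x1 <- c(3.1, 4.7, 2.8, 5.9, 4.0)
x2 <- c(6.3, 5.2, 7.1, 4.9, 6.8, 5.5)
n1 <- length(x1); n2 <- length(x2)

# Ranks in the pooled sample; the first n1 ranks belong to class 1.
r <- rank(c(x1, x2), ties.method = "average")
W1 <- sum(r[seq_len(n1)])
U1 <- W1 - 0.5 * n1 * (n1 + 1)

# Exact two-sided p-value from the distribution of U.
p <- 2 * min(pwilcox(U1 - 1, n1, n2, lower.tail = FALSE),
             pwilcox(U1, n1, n2))
c(W1 = W1, U1 = U1, p.value = p)

# Cross-check with the built-in test.
ref <- wilcox.test(x1, x2)
ref
```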

In details:

Ceramics<-read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt",header=TRUE)

# Quantitative variable (column 4) by presence of human remains (column 14)
obs1 <- data.frame(Ceramics[which(Ceramics$Human.remains=="Yes" | Ceramics$Human.remains=="No"), c(4,14)])
obs1[, 2] <- factor(obs1[, 2])
obs2 <- na.omit(obs1)

# One column per class, padded with NA to equal length
nc.max <- max(table(obs2[, 2]))
nb.na <- nc.max - table(obs2[, 2])
tempo <- split(obs2[, 1], obs2[, 2])
for (i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)

# Observations sorted in increasing order, with their rank and class
xr <- sort(obs2[, 1])
ci <- obs2[, 2]
cr <- ci[order(obs2[, 1])]
r <- rank(xr, ties.method="average")
obs4 <- data.frame(r, xr, cr)
names(obs4) <- c("rank", paste(names(obs2)[1], "by increasing order"), names(obs2)[2])
obs4

# Descriptive statistics of each class
n1 <- sapply(split(obs2[, 1], obs2[, 2]), length)[1]
n2 <- sapply(split(obs2[, 1], obs2[, 2]), length)[2]
m1 <- sapply(split(obs2[, 1], obs2[, 2]), mean)[1]
m2 <- sapply(split(obs2[, 1], obs2[, 2]), mean)[2]
s1 <- sapply(split(obs2[, 1], obs2[, 2]), sd)[1]
s2 <- sapply(split(obs2[, 1], obs2[, 2]), sd)[2]
me1 <- sapply(split(obs2[, 1], obs2[, 2]), median)[1]
me2 <- sapply(split(obs2[, 1], obs2[, 2]), median)[2]
min1 <- sapply(split(obs2[, 1], obs2[, 2]), min)[1]
min2 <- sapply(split(obs2[, 1], obs2[, 2]), min)[2]
max1 <- sapply(split(obs2[, 1], obs2[, 2]), max)[1]
max2 <- sapply(split(obs2[, 1], obs2[, 2]), max)[2]
param <- data.frame(c(n1, n2), c(m1, m2), c(s1, s2), c(me1, me2), c(min1, min2), c(max1, max2))
names(param) <- c("Effective", "Mean", "Standard.deviation", "Median", "Minimum", "Maximum")
row.names(param) <- levels(obs2[, 2])
param

# Rank sum of the first class, U statistic and two-sided p-value
W1calc <- sum(subset(obs4[, 1], obs4[, 3]== names(obs3)[1]))
U1calc <- W1calc-0.5*n1*(1+n1)
U1calc
min(pwilcox((U1calc-1), n1, n2, lower.tail=FALSE), pwilcox(U1calc, n1, n2))*2

## In R
wilcox.test(obs3[, 1], obs3[, 2])


Kruskal-Wallis test

Aim: Comparison of k observed medians

Test hypotheses:
H0: me1 = me2 = ... = mek. Medians are identical in the population.
H1 (two-sided): At least one of the medians is different in the population.

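The statistic is:

$$K_{corr} = \frac{K}{1 - \dfrac{\sum_{i=1}^{f} (t_i^3 - t_i)}{n^3 - n}}$$

with:

$$K = \frac{12}{n(n+1)} \sum_{c=1}^{k} \frac{R_c^2}{n_c} - 3(n+1)$$

where R_c is the rank sum and n_c the size of class c, f is the number of groups of tied values and t_i the number of tied observations in group i.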
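A minimal sketch of the statistic and its tie correction, on made-up data containing ties (`x` and `g` are assumptions for illustration), cross-checked against `kruskal.test`:

```r
# Made-up data with ties: quantitative variable x, factor g with 3 classes.
x <- c(2, 3, 3, 5, 4, 6, 6, 7, 1, 2, 4)
g <- factor(rep(c("A", "B", "C"), times = c(4, 4, 3)))
n <- length(x)

ri <- rank(x, ties.method = "average")
Rc <- tapply(ri, g, sum)      # rank sum of each class
nc <- tapply(ri, g, length)   # size of each class

# Uncorrected statistic.
K <- 12 * sum(Rc^2 / nc) / (n * (n + 1)) - 3 * (n + 1)

# Tie correction: ti counts how many observations share each value.
ti <- table(x)
Kcorr <- K / (1 - sum(ti^3 - ti) / (n^3 - n))

p <- pchisq(Kcorr, nlevels(g) - 1, lower.tail = FALSE)
c(K = K, Kcorr = Kcorr, p.value = p)

# Cross-check with the built-in test.
ref <- kruskal.test(x, g)
ref
```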


In details:

Ceram<-read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt",header=TRUE)

# Quantitative variable (column 7) and grouping factor (column 10)
obs1 <- Ceram[, c(7,10)]
obs1[,2] <- factor(obs1[,2])
obs2 <- na.omit(obs1)

# Ranks of the pooled observations
ri <- rank(obs2[,1], ties.method="average")
obs3 <- data.frame(obs2, ri)
names(obs3)[3] <- "Rank"
obs3
# One column per class, padded with NA to equal length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for (i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs4 <- data.frame(tempo)

# Descriptive statistics of each class
nc <- sapply(split(obs2[, 1], obs2[, 2]), length)
me.c <- sapply(split(obs2[,1], obs2[,2]), median)
min.c <- sapply(split(obs2[,1], obs2[,2]), min)
max.c <- sapply(split(obs2[,1], obs2[,2]), max)
mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
sc <- sapply(split(obs2[, 1], obs2[, 2]), sd)
param <- data.frame(nc, me.c, min.c, max.c, mc, sc)
names(param) <- c("Effective", "Median", "Minimum", "Maximum", "Mean", "Standard.deviation")
row.names(param) <- levels(obs2[, 2])
param

# Kruskal-Wallis statistic, tie correction and p-value
n <- length(obs3$Rank)
Rc <- tapply(obs3$Rank, obs3[, 2], "sum")
K.calc <- 12*sum(Rc^2/nc)/(n*(n+1))-3*(n+1)
K.calc
ti <- table(obs3[,1])
Kcorr <- K.calc/(1-sum(ti^3-ti)/(n^3-n))
Kcorr
k <- length(nc)
pchisq(Kcorr, k-1, lower.tail=FALSE)

# Vector of class labels (one letter per class) and ranks 1..n
k <- length(nc)
n <- sum(nc)
vecteur.ini <- NULL
for (i in 1:k) {vecteur.ini <- c(vecteur.ini, rep(LETTERS[i], nc[i]))}
rank <- 1:n

## Test in R
kruskal.test(obs1[,1], obs1[,2])

Fisher-Snedecor test

Aim: Comparison of two observed variances s1² and s2²

Test hypotheses:
H0: σ1² = σ2². Variances are identical in the population.
H1 (two-sided): σ1² ≠ σ2². Variances are different in the population.
H1 (upper): σ1² > σ2². Variance σ1² is strictly greater than σ2² in the population.
H1 (lower): σ1² < σ2². Variance σ1² is strictly smaller than σ2² in the population.

The statistic is:

$$F_{\nu_1,\nu_2} = \frac{\sigma_1^2}{\sigma_2^2}$$
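In practice the ratio is computed from the two sample variances, with ν1 = n1 - 1 and ν2 = n2 - 1 degrees of freedom. A minimal sketch on made-up vectors (`x1` and `x2` are assumptions for illustration), cross-checked against `var.test`:

```r
# Made-up measurements for the two classes.
x1 <- c(12.1, 13.4, 11.8, 14.2, 12.9, 13.7)
x2 <- c(10.9, 12.2, 11.5, 10.4, 11.1)

# Ratio of the sample variances and its degrees of freedom.
F.calc <- var(x1) / var(x2)
nu.1 <- length(x1) - 1
nu.2 <- length(x2) - 1

# Two-sided p-value.
p <- 2 * min(pf(F.calc, nu.1, nu.2, lower.tail = FALSE),
             pf(F.calc, nu.1, nu.2))

# 95% confidence interval of the variance ratio (lower, upper).
ci <- c(F.calc / qf(0.025, nu.1, nu.2, lower.tail = FALSE),
        F.calc / qf(0.025, nu.1, nu.2))
c(F = F.calc, p.value = p, lower = ci[1], upper = ci[2])

# Cross-check with the built-in test.
ref <- var.test(x1, x2)
ref
```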




In details:


# Quantitative variable (column 6) by base type (column 13), round and flat bases only
obs1 <- data.frame(Ceram[which(Ceram$Base=="Round" | Ceram$Base=="Flat"), c(6,13)])
obs1[,2] <- factor(obs1[,2])
obs2 <- na.omit(obs1)
# One column per class, padded with NA to equal length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for (i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
obs3

# Observations centred on the mean of their class
mc <- sapply(split(obs2[,1], obs2[,2]), mean)
obs4 <- obs2
obs4[which(obs2[, 2] == levels(obs2[, 2])[1]), 1] <- obs4[which(obs2[, 2] == levels(obs2[, 2])[1]), 1] - mc[1]
obs4[which(obs2[, 2] == levels(obs2[, 2])[2]), 1] <- obs4[which(obs2[, 2] == levels(obs2[, 2])[2]), 1] - mc[2]

# Size, variance, standard deviation and mean of each class
n1 <- sapply(split(obs2[, 1], obs2[, 2]), length)[1]
n2 <- sapply(split(obs2[, 1], obs2[, 2]), length)[2]
s2.1 <- sapply(split(obs2[, 1], obs2[, 2]), var)[1]
s2.2 <- sapply(split(obs2[, 1], obs2[, 2]), var)[2]
m1 <- sapply(split(obs2[, 1], obs2[, 2]), mean)[1]
m2 <- sapply(split(obs2[, 1], obs2[, 2]), mean)[2]
param <- data.frame(c(n1, n2), c(s2.1, s2.2), c(s2.1^0.5, s2.2^0.5), c(m1, m2))
names(param) <- c("Effective", "Variance", "Standard.deviation", "Mean")
row.names(param) <- levels(obs2[, 2])
param

# F statistic, degrees of freedom and two-sided p-value
F.calc <- s2.1/s2.2
F.calc
nu.1 <- n1-1
nu.2 <- n2-1
min(pf(F.calc, nu.1, nu.2, lower.tail=FALSE), pf(F.calc, nu.1, nu.2))*2
# 95% confidence interval of the variance ratio (lower, then upper bound)
F.calc/qf(0.025, nu.1, nu.2, lower.tail=FALSE)
F.calc/qf(0.025, nu.1, nu.2)

## In R
var.test(obs3[, 1], obs3[, 2])


Spearman correlation test

Aim: Comparison of a correlation coefficient with the theoretical value zero

Measured variable: Two quantitative variables x and y

Conditions of use:
The individuals constituting the sample must be randomly chosen one by one.
Each individual must have a value for each variable.

Test hypotheses:
H0: ρxy = 0. The correlation coefficient is zero in the population.
H1 (two-sided): ρxy ≠ 0. The correlation coefficient is not zero in the population.
H1 (upper): ρxy > 0. The correlation coefficient is positive in the population.
H1 (lower): ρxy < 0. The correlation coefficient is negative in the population.

The statistic is:

$$t_p = \frac{\rho_{xy}}{\sqrt{\dfrac{1 - \rho_{xy}^2}{n - 2}}}$$

In R (statistic S, computed from the same coefficient): n*(n^2-1)*(1-rho.xy)/6
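A minimal sketch linking the two quantities, on made-up paired data without ties (`x` and `y` are assumptions for illustration). Without ties, S equals the sum of the squared rank differences, and it is the statistic reported by `cor.test`. For small samples without ties `cor.test` may use an exact distribution, so its p-value need not match the t approximation computed here.

```r
# Made-up paired measurements without ties.
x <- c(2.1, 3.4, 1.8, 5.6, 4.3, 6.0, 3.9)
y <- c(1.5, 2.9, 2.2, 4.8, 5.1, 5.9, 3.0)
n <- length(x)

# Spearman coefficient: Pearson correlation of the ranks.
rxi <- rank(x)
ryi <- rank(y)
rho.xy <- cor(rxi, ryi)

# Statistic S and its equivalent form as a sum of squared rank differences.
S <- n * (n^2 - 1) * (1 - rho.xy) / 6
S.diff <- sum((rxi - ryi)^2)

# t approximation with n - 2 degrees of freedom (two-sided).
t.p <- rho.xy / sqrt((1 - rho.xy^2) / (n - 2))
p.t <- 2 * pt(abs(t.p), n - 2, lower.tail = FALSE)
c(rho = rho.xy, S = S, t = t.p, p.approx = p.t)

# Cross-check with the built-in test.
ref <- cor.test(x, y, method = "spearman")
ref
```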

In details:


# Two quantitative variables (columns 2 and 4), sites with human remains only
obs1 <- data.frame(Ceram[which(Ceram$Human.remains=="Yes"), c(2,4)])
obs2 <- na.omit(obs1)

# Ranks of each variable
rxi <- rank(obs2[,1], ties.method = "average")
ryi <- rank(obs2[,2], ties.method = "average")
obs3 <- data.frame(rxi, ryi)
names(obs3) <- c(paste("rank of", names(obs2)[1]), paste("rank of", names(obs2)[2]))
obs3

# Spearman coefficient (by hand, then with cor() on the ranks) and statistic S
n <- length(obs2[, 1])
n
rho.xy <- (sum(rxi*ryi) - n*((n+1)/2)^2) / ((sum(rxi^2) - n*((n+1)/2)^2)^0.5 * (sum(ryi^2) - n*((n+1)/2)^2)^0.5)
rho.xy
cor(rank(obs2[,1]), rank(obs2[,2]))
Scalc <- n*(n^2-1)*(1-rho.xy)/6
Scalc
sum((ryi-rxi)^2)

## In R
cor.test(obs1[,1], obs1[,2], method="spearman")

