Overview of some tests
Thomas INGICCO
[Figure: J.L.T. Géricault, The Raft of the Medusa (Le Radeau de La Méduse)]
Chi square test
Aim: Comparison of the observed counts Oij with the theoretical (expected) counts Eij. Are the rows and columns of a contingency table independent? Independence means that belonging to a class of the first variable has no influence on belonging to a class of the second variable.
Measured variable: A qualitative variable with k classes
Conditions of use:
The classes of the variables must be mutually exclusive.
Cochran's rule must be respected: in each class Eij ≥ 5. Some classes with 1 ≤ Eij < 5 are acceptable if at least 80% of all classes have Eij ≥ 5.
Test hypotheses:
H0: πi = Ptheo,i. The theoretical proportions Ptheo,i are the real proportions in the observed population.
H1 (two-sided): At least one of the theoretical proportions is not the real proportion in the observed population.
The statistic is:
$\chi^2 = \sum_{i,j} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
In R: sum((Oij - Eij)^2/Eij)
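As a quick check of the formula, the statistic can be computed by hand on a small table and compared with chisq.test(). This is only a sketch: the counts are made up for illustration (they are not the Ceramics data), and the names Oij, Eij, chi2 and nu are chosen here to mirror the notation of the slide.

```r
# Hypothetical 2 x 3 contingency table (made-up counts)
Oij <- matrix(c(20, 15, 30, 25, 10, 20), nrow = 2,
              dimnames = list(Group = c("A", "B"), Type = c("t1", "t2", "t3")))

# Expected counts under independence: row total * column total / grand total
Eij <- outer(rowSums(Oij), colSums(Oij)) / sum(Oij)

# Chi square statistic and its degrees of freedom
chi2 <- sum((Oij - Eij)^2 / Eij)
nu <- (nrow(Oij) - 1) * (ncol(Oij) - 1)

# Upper-tail p-value of the chi square distribution
p.value <- pchisq(chi2, nu, lower.tail = FALSE)

chi2
p.value
chisq.test(Oij)   # should report the same statistic and p-value
```

For a table larger than 2 x 2, chisq.test() applies no continuity correction, so the hand computation and the built-in test are expected to agree.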
Chi square test
In detail:
Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
# Cross the two qualitative variables (columns 10 and 11)
obs1 <- data.frame(Ceram[, 10:11])
obs2 <- na.omit(obs1)
for(i in 1:length(obs2)) {obs2[, i] <- factor(obs2[, i])}
obs3 <- table(obs2)
addmargins(obs3)
# Mosaic plot of the contingency table
graphics.off()
par(cex.lab=1.5, xpd=NA, font=2)
mosaicplot(t(obs3), main=NULL, cex.axis=1.1)
# Expected counts under independence
obs3theo <- suppressWarnings(chisq.test(obs3)$expected)
addmargins(obs3theo)
# Statistic, degrees of freedom and p-value computed by hand
nij <- obs3
tij <- obs3theo
chi2.calc <- sum((nij-tij)^2/tij)
chi2.calc
k <- dim(obs3)[1]
c <- dim(obs3)[2]
nu <- (k-1)*(c-1)
nu
pchisq(chi2.calc, nu, lower.tail=FALSE)
# Test in R
chisq.test(obs3)
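Before trusting the p-value, the condition on expected counts can be checked directly. A minimal sketch, reading Cochran's rule as "no expected count below 1 and at least 80% of the expected counts equal to 5 or more". The helper cochran.ok is hypothetical (it is not part of R) and both tables contain made-up counts.

```r
# Hypothetical helper (not a base R function): TRUE when Cochran's rule holds
cochran.ok <- function(tab) {
  # Expected counts under independence; the warning about small counts is silenced
  E <- suppressWarnings(chisq.test(tab)$expected)
  all(E >= 1) && mean(E >= 5) >= 0.8
}

# Made-up tables: one with comfortable counts, one with very small counts
tab.large <- matrix(c(20, 15, 30, 25, 10, 20), nrow = 2)
tab.small <- matrix(c(3, 1, 2, 4, 1, 2), nrow = 2)

cochran.ok(tab.large)
cochran.ok(tab.small)
```

When the rule fails, one possible fallback is chisq.test(tab, simulate.p.value = TRUE), or the Fisher test for a 2 x 2 table.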
Fisher test
Aim: Comparison of the observed counts of G and F (independence of two qualitative variables), expressed as the observed proportions PG1/F1 and PG1/F2 (equality of the proportions)
Measured variable: Two qualitative variables F and G, each with 2 classes
Conditions of use:
The classes of the variables must be mutually exclusive.
The qualitative variables are nominal.
Test hypotheses:
H0: πG1/F1 = πG1/F2. The proportions are identical in the target population.
H1 (two-sided): πG1/F1 ≠ πG1/F2. The proportions are different in the target population.
H1 (one-sided, right): πG1/F1 > πG1/F2. The proportion πG1/F1 is strictly greater than πG1/F2 in the target population.
H1 (one-sided, left): πG1/F1 < πG1/F2. The proportion πG1/F1 is strictly smaller than πG1/F2 in the target population.
The statistic is:
$N_{FE} = n_{11}$
that is, the count in the first cell of the 2 x 2 table, which follows a hypergeometric distribution under H0.
In R: obs3[1, 1]
Fisher test
In detail:
Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Lecture-4/Ceramics.txt", header=TRUE)
# Cross the two qualitative variables (columns 12 and 9)
obs1 <- Ceram[, c(12,9)]
obs2 <- na.omit(obs1)
for(i in 1:length(obs2)) {obs2[, i] <- factor(obs2[, i])}
obs3 <- table(obs2)
# Optional: transpose the table
# obs3 <- t(obs3)
# Optional: swap the two columns
# obs4 <- obs3 ; obs4[,1] <- obs3[,2] ; obs4[,2] <- obs3[,1] ; dimnames(obs4)[[2]][1] <- dimnames(obs3)[[2]][2] ; dimnames(obs4)[[2]][2] <- dimnames(obs3)[[2]][1] ; obs3 <- obs4
addmargins(obs3)
Fisher test
In detail:
graphics.off()
# Proportion of G1 within each class of F
pG1.Fi <- obs3[, 1] / margin.table(obs3, 1)
par(mar=c(5.1, 5.1, 4.1, 2.1))
barplot(pG1.Fi,
  xlab = paste(labels(dimnames(obs3))[2], " / ", labels(dimnames(obs3))[1]),
  xaxt = "n", ylab="Proportion", ylim=range(0, max(pG1.Fi)+0.15),
  cex.lab=2, cex.axis=1.8)
# Bar midpoints, used to place the axis labels
position.labels <- barplot(pG1.Fi, plot = FALSE)
axis(side=1, at = position.labels,
  labels = c(paste(colnames(obs3)[1], " / ", rownames(obs3)[1]),
             paste(colnames(obs3)[1], " / ", rownames(obs3)[2])),
  cex.axis=1.8)
# Mosaic plot in a second graphics window (windows() is only available on Windows)
windows()
par(cex.lab=2, xpd=NA, font=2)
mosaicplot(t(obs3), main=NULL, cex.axis=1.5)
Fisher test
In detail:
# Proportions of G1 within F1 and within F2
n11 <- obs3[1, 1]
n1. <- margin.table(obs3, 1)[1]
n21 <- obs3[2, 1]
n2. <- margin.table(obs3, 1)[2]
pG1.F1 <- n11/n1.
pG1.F2 <- n21/n2.
t(data.frame(pG1.F1, pG1.F2))
# The other conditional proportions
n12 <- obs3[1,2]
n22 <- obs3[2,2]
n.1 <- margin.table(obs3,2)[1]
n.2 <- margin.table(obs3,2)[2]
pG2.F1 <- n12/n1.
pG2.F2 <- n22/n2.
pF1.G1 <- n11/n.1
pF1.G2 <- n12/n.2
pF2.G1 <- n21/n.1
pF2.G2 <- n22/n.2
t(data.frame(pG2.F1, pG2.F2, pF1.G1, pF1.G2, pF2.G1, pF2.G2))
# Statistic of the test: the count in the first cell
n11 <- obs3[1,1]
NFE.calc <- n11
NFE.calc
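The conditional proportions computed cell by cell above can also be obtained in one call with prop.table(). The sketch below uses a made-up 2 x 2 table named obs (not the Ceramics data); the names pG.F and pF.G are chosen here for illustration.

```r
# Hypothetical 2 x 2 table: rows are the classes of F, columns the classes of G
obs <- matrix(c(8, 2,
                3, 7), nrow = 2, byrow = TRUE,
              dimnames = list(F = c("F1", "F2"), G = c("G1", "G2")))

# Proportions of G within each class of F (each row sums to 1)
pG.F <- prop.table(obs, margin = 1)

# Proportions of F within each class of G (each column sums to 1)
pF.G <- prop.table(obs, margin = 2)

pG.F
pF.G
```

Here pG.F["F1", "G1"] plays the role of pG1.F1 and pG.F["F2", "G1"] the role of pG1.F2.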
Fisher test
In detail:
n1. <- margin.table(obs3,1)[1]
n <- margin.table(obs3)
n.1 <- margin.table(obs3,2)[1]
# One-sided p-values from the hypergeometric distribution
p.right <- phyper(NFE.calc-1, n1., n-n1., n.1, lower.tail=FALSE)
p.left <- phyper(NFE.calc, n1., n-n1., n.1)
p.right
p.left
# Two-sided p-value: the smaller tail, plus the part of the opposite tail
# made of values that are no more probable than the observed one
d.NFE.calc <- round(dhyper(NFE.calc, n1., n-n1., n.1), 12)
if(p.right < p.left) {
  p.value1 <- p.right
  NFE.left <- NFE.calc
  d.NFE.left <- Inf
  while(NFE.left >= 0 && d.NFE.left > d.NFE.calc) {
    NFE.left <- NFE.left - 1
    d.NFE.left <- round(dhyper(NFE.left, n1., n-n1., n.1), 12)
  }
  if(d.NFE.left > d.NFE.calc) {p.value2 <- 0} else {p.value2 <- phyper(NFE.left, n1., n-n1., n.1)}
} else {
  p.value1 <- p.left
  NFE.right <- NFE.calc
  d.NFE.right <- Inf
  while(d.NFE.right > d.NFE.calc) {
    NFE.right <- NFE.right + 1
    d.NFE.right <- round(dhyper(NFE.right, n1., n-n1., n.1), 12)
  }
  p.value2 <- phyper(NFE.right-1, n1., n-n1., n.1, lower.tail=FALSE)
}
p.value <- p.value1 + p.value2
p.value
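Fisher test
In detail:
n11 <- obs3[1,1]
# Probability of the observed table, from the hypergeometric formula
Pn11 <- choose(n1., n11)*choose(n-n1., n.1-n11)/choose(n, n.1)
Pn11
# Same value with the built-in density
dhyper(n11, n1., n-n1., n.1)
# Test in R
fisher.test(obs3)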
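The hand computation can be condensed and checked against fisher.test() on a small example. This is a sketch under stated assumptions: the table obs is made up, and the two-sided p-value is taken as the sum of the probabilities of all tables that are no more probable than the observed one, with a small relative tolerance for rounding. The variable names follow the slides.

```r
# Hypothetical 2 x 2 table (made-up counts)
obs <- matrix(c(8, 2,
                3, 7), nrow = 2, byrow = TRUE)

n11 <- obs[1, 1]        # statistic of the test
n1. <- sum(obs[1, ])    # total of row 1
n.1 <- sum(obs[, 1])    # total of column 1
n   <- sum(obs)         # grand total

# All values n11 can take once the margins are fixed, with their probabilities
support <- max(0, n1. + n.1 - n):min(n1., n.1)
d <- dhyper(support, n1., n - n1., n.1)
d.obs <- dhyper(n11, n1., n - n1., n.1)

# One-sided (right) and two-sided p-values
p.right <- phyper(n11 - 1, n1., n - n1., n.1, lower.tail = FALSE)
p.two.sided <- sum(d[d <= d.obs * (1 + 1e-7)])

p.right
p.two.sided
fisher.test(obs)$p.value
```

Summing over the whole support avoids the two while loops of the step-by-step version, at the cost of evaluating dhyper() for every possible table.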
Student t test
Aim: Comparison of two observed means m1 and m2
Measured variable: A quantitative variable and a qualitative variable with two classes
Conditions of use:
The quantitative variable must follow a normal distribution.
The quantitative variable may be continuous or discrete.
Test hypotheses:
H0: μ1 = μ2. The means are identical in the target population.
H1 (two-sided): μ1 ≠ μ2. The means are different in the target population.
H1 (one-sided, right): μ1 > μ2. The mean μ1 is strictly greater than μ2 in the target population.
H1 (one-sided, left): μ1 < μ2. The mean μ1 is strictly smaller than μ2 in the target population.
The statistic is:
$t = \dfrac{m_1 - m_2}{\sqrt{\hat{s}^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
with:
$\hat{s}^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
In R: (m1-m2)/(s2*(1/n1+1/n2))^0.5
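Student t test
In detail:
Ceramics <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
# Keep the two base types and the two columns of interest
obs1 <- data.frame(Ceramics[which(Ceramics$Base=="Round" | Ceramics$Base=="Flat"), c(2,13)])
obs1[, 2] <- factor(obs1[, 2])
obs2 <- na.omit(obs1)
# One column per class, padded with NA so that both columns have the same length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
obs3
# Sample size, mean and standard deviation of each class
n1 <- length(na.omit(obs3[, 1]))
n2 <- length(na.omit(obs3[, 2]))
m1 <- mean(na.omit(obs3[, 1]))
m2 <- mean(na.omit(obs3[, 2]))
s.1 <- sd(na.omit(obs3[, 1]))
s.2 <- sd(na.omit(obs3[, 2]))
param <- data.frame(c(n1, n2), c(m1, m2), c(s.1, s.2))
names(param) <- c("Effectives", "Mean", "Standard deviation")
row.names(param) <- levels(obs2[, 2])
param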
Student t test
In detail:
# Pooled variance and t statistic
s2 <- ((n1-1)*s.1^2+(n2-1)*s.2^2)/(n1+n2-2)
t.calc <- (m1-m2)/(s2*(1/n1+1/n2))^0.5
t.calc
# Degrees of freedom and two-sided p-value
nu <- n1+n2-2
nu
min(pt(t.calc, nu, lower.tail=FALSE), pt(t.calc, nu))*2
# Test in R
t.test(obs3[, 1], obs3[, 2], var.equal=TRUE)
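The same steps can be tried on two short vectors without the Ceramics file. The measurements below are made up for illustration; the variable names follow the slides, and the comparison with t.test(..., var.equal = TRUE) is only a sanity check of the pooled-variance formula.

```r
# Hypothetical measurements for two classes (made-up values)
x1 <- c(12.1, 13.4, 11.8, 14.2, 12.9)
x2 <- c(10.5, 11.2, 12.0, 10.9, 11.6, 10.1)

n1 <- length(x1); n2 <- length(x2)
m1 <- mean(x1);   m2 <- mean(x2)

# Pooled variance: weighted mean of the two sample variances
s2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t statistic, degrees of freedom and two-sided p-value
t.calc <- (m1 - m2) / sqrt(s2 * (1 / n1 + 1 / n2))
nu <- n1 + n2 - 2
p.value <- 2 * pt(abs(t.calc), nu, lower.tail = FALSE)

t.calc
p.value
t.test(x1, x2, var.equal = TRUE)
```

Using abs(t.calc) with the upper tail gives the same two-sided p-value as taking twice the smaller of the two tails.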
Analysis of variance (ANOVA)
Aim: Comparison of at least two observed means
Measured variable: A quantitative variable and a qualitative variable with k classes
Conditions of use:
The quantitative variable must follow a normal distribution.
The variances of the quantitative variable in each class of the qualitative variable must be equal.
-> If the conditions are not fulfilled, see the Kruskal-Wallis test
Test hypotheses:
H0: μ1 = μ2 = ... = μk. The means are identical in the target population.
H1: At least one of the means is different in the target population.
The statistic is:
$F_{\nu_1, \nu_2} = \dfrac{\text{interclass variance}}{\text{intraclass variance}} = \dfrac{SCI / (k - 1)}{SCE / (n - k)}$
where SCI is the sum of squares between classes, SCE the sum of squares within classes, ν1 = k - 1 and ν2 = n - k.
In R: (SCI/(k-1))/(SCE/(n-k))
Analysis of variance (ANOVA)
In detail:
Ceram <- read.table("K:/Cours/Philippines/Statistics210/Data/Ceramics.txt", header=TRUE)
# Quantitative variable (column 7) and qualitative variable (column 10)
obs1 <- Ceram[, c(7,10)]
obs1[,2] <- factor(obs1[,2])
obs2 <- na.omit(obs1)
# One column per class, padded with NA so that all columns have the same length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
obs3
Analysis of variance (ANOVA)
In detail:
graphics.off()
k <- nlevels(obs2[, 2])
# One strip of jittered points per class
stripchart(obs2[, 1]~obs2[, 2], method="jitter", jitter=0.1, vertical=FALSE, ylim=range(0.5, k+0.5), group.names=levels(obs2[, 2]), xlab=names(obs2)[1], ylab=names(obs2)[2], pch=16, cex=1.2)
# Mark the mean of each class with a grey segment
mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
for(i in 1:k) {segments(mc[i], i-0.25, mc[i], i+0.25, lwd=3, col=gray(0.5))}
Analysis of variance (ANOVA)
In detail:
k <- nlevels(obs2[,2])
# Size, mean and standard deviation of each class
nc <- sapply(split(obs2[, 1], obs2[, 2]), length)
mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
sc <- sapply(split(obs2[, 1], obs2[, 2]), sd)
param <- data.frame(nc, mc, sc)
names(param) <- c("Observations", "Mean", "Standard.deviation")
# Total (SCT), between-class (SCI) and within-class (SCE) sums of squares
xi <- obs2[,1]
m <- mean(xi)
SCT <- sum((xi-m)^2)
SCT
SCI <- sum(nc*(mc-m)^2)
SCI
SCE <- 0
for(i in 1:k) {SCE <- SCE+sum((na.omit(obs3[[i]])-mc[i])^2)}
SCE
# Check: SCI + SCE equals SCT
SCI+SCE
# F statistic and p-value
n <- length(obs2[,1])
Fcalc <- (SCI/(k-1))/(SCE/(n-k))
Fcalc
pf(Fcalc, (k-1), (n-k), lower.tail=FALSE)
Analysis of variance (ANOVA)
In detail:
# Test in R
oneway.test(obs1[,1] ~ obs1[,2], var.equal=TRUE)
anova(lm(obs1[,1] ~ obs1[,2]))
param
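A compact, self-contained version of the decomposition may help to see that SCI + SCE = SCT and that the hand-computed F matches anova(lm()). The data frame d below is made up for illustration; SCI, SCE and SCT keep the names used in the slides (between-class, within-class and total sums of squares).

```r
# Hypothetical data: one quantitative variable x and a factor g with 3 classes
d <- data.frame(
  x = c(5.1, 4.8, 5.5, 6.2, 6.0, 6.6, 5.9, 7.1, 7.4, 6.8),
  g = factor(rep(c("A", "B", "C"), times = c(3, 4, 3)))
)

k <- nlevels(d$g)
n <- nrow(d)
m <- mean(d$x)                       # grand mean
nc <- tapply(d$x, d$g, length)       # class sizes
mc <- tapply(d$x, d$g, mean)         # class means

SCT <- sum((d$x - m)^2)              # total sum of squares
SCI <- sum(nc * (mc - m)^2)          # between-class sum of squares
SCE <- sum((d$x - ave(d$x, d$g))^2)  # within-class sum of squares

Fcalc <- (SCI / (k - 1)) / (SCE / (n - k))
p.value <- pf(Fcalc, k - 1, n - k, lower.tail = FALSE)

Fcalc
p.value
anova(lm(x ~ g, data = d))
```

ave(d$x, d$g) returns, for each observation, the mean of its own class, which removes the need for the NA-padded table and the loop.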
Mann-Whitney-Wilcoxon test
Aim: Comparison of two observed medians me1 and me2
Measured variable: A quantitative variable and a qualitative variable with 2 classes
Conditions of use:
The individuals constituting the sample must be randomly chosen one by one.
The classes of the variable must be mutually exclusive.
Test hypotheses:
H0: me1 = me2. The medians are identical in the population.
H1 (two-sided): me1 ≠ me2. The medians are different in the population.
H1 (one-sided, upper): me1 > me2. The median me1 is strictly greater than me2 in the population.
H1 (one-sided, lower): me1 < me2. The median me1 is strictly smaller than me2 in the population.
The statistic is:
$U_1 = W_1 - \dfrac{1}{2} n_1 (n_1 + 1)$
with:
$W_1 = \sum_{j=1}^{n_1} r_j$
where the r_j are the ranks, in the pooled sample, of the n1 observations of the first class.
In R: W1calc-0.5*n1*(1+n1)
Mann-Whitney-Wilcoxon test
In detail:
Ceramics <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
# Quantitative variable (column 4) and presence of human remains (column 14)
obs1 <- data.frame(Ceramics[which(Ceramics$Human.remains=="Yes" | Ceramics$Human.remains=="No"), c(4,14)])
obs1[, 2] <- factor(obs1[, 2])
obs2 <- na.omit(obs1)
# One column per class, padded with NA so that both columns have the same length
nc.max <- max(table(obs2[, 2]))
nb.na <- nc.max - table(obs2[, 2])
tempo <- split(obs2[, 1], obs2[, 2])
for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
# Table of the values sorted in increasing order, with their rank and class
xr <- sort(obs2[, 1])
ci <- obs2[, 2]
cr <- ci[order(obs2[, 1])]
r <- rank(xr, ties.method="average")
obs4 <- data.frame(r, xr, cr)
names(obs4) <- c("rank", paste(names(obs2)[1], "by increasing order"), names(obs2)[2])
obs4
Mann-Whitney-Wilcoxon test
In detail:
# Descriptive statistics of each class
groups <- split(obs2[, 1], obs2[, 2])
n1 <- sapply(groups, length)[1]
n2 <- sapply(groups, length)[2]
m1 <- sapply(groups, mean)[1]
m2 <- sapply(groups, mean)[2]
s1 <- sapply(groups, sd)[1]
s2 <- sapply(groups, sd)[2]
me1 <- sapply(groups, median)[1]
me2 <- sapply(groups, median)[2]
min1 <- sapply(groups, min)[1]
min2 <- sapply(groups, min)[2]
max1 <- sapply(groups, max)[1]
max2 <- sapply(groups, max)[2]
Mann-Whitney-Wilcoxon test
In detail:
param <- data.frame(c(n1, n2), c(m1, m2), c(s1, s2), c(me1, me2), c(min1, min2), c(max1, max2))
names(param) <- c("Effective", "Mean", "Standard.deviation", "Median", "Minimum", "Maximum")
row.names(param) <- levels(obs2[, 2])
param
# Sum of the ranks of the first class, then the statistic U1
W1calc <- sum(subset(obs4[, 1], obs4[, 3]==names(obs3)[1]))
U1calc <- W1calc-0.5*n1*(1+n1)
U1calc
# Two-sided p-value
min(pwilcox((U1calc-1), n1, n2, lower.tail=FALSE), pwilcox(U1calc, n1, n2))*2
## In R
wilcox.test(obs3[, 1], obs3[, 2])
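The rank-sum computation is easy to follow on a handful of values. The two vectors below are made up and contain no ties, so that wilcox.test() uses the exact distribution and the hand-computed p-value can be compared with it; W1, U1 and the sample sizes keep the names of the slides.

```r
# Hypothetical samples without ties (made-up values)
x <- c(1.1, 2.3, 3.8, 4.0, 5.6)
y <- c(2.9, 4.7, 6.1, 7.3, 8.2, 9.0)
n1 <- length(x)
n2 <- length(y)

# Ranks in the pooled sample; the first n1 ranks belong to x
r <- rank(c(x, y))
W1 <- sum(r[seq_len(n1)])          # rank sum of the first sample
U1 <- W1 - n1 * (n1 + 1) / 2       # Mann-Whitney statistic

# Exact two-sided p-value, capped at 1
p.value <- min(1, 2 * min(pwilcox(U1 - 1, n1, n2, lower.tail = FALSE),
                          pwilcox(U1, n1, n2)))

W1
U1
p.value
wilcox.test(x, y)
```

The statistic W reported by wilcox.test() is the U1 of the slides, not the raw rank sum W1. With ties or large samples wilcox.test() switches to a normal approximation, and the exact computation above would no longer match.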
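Kruskal-Wallis test
Aim: Comparison of k observed medians (me1, ..., mek)
Measured variable: A quantitative variable and a qualitative variable with k classes
Conditions of use:
The individuals constituting the sample must be randomly chosen one by one.
The classes of the variable must be mutually exclusive.
Test hypotheses:
H0: me1 = me2 = ... = mek. The medians are identical in the population.
H1: At least one of the medians is different in the population.
The statistic is:
$K_{corr} = \dfrac{K}{1 - \dfrac{\sum_{i=1}^{f} (t_i^3 - t_i)}{n^3 - n}}$
with:
$K = \dfrac{12}{n(n+1)} \sum_{c=1}^{k} \dfrac{R_c^2}{n_c} - 3(n+1)$
where R_c is the sum of the ranks in class c, n_c the size of class c, f the number of groups of tied values and t_i the number of observations in the i-th group of ties.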
Kruskal-Wallis test
In detail:
Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
# Quantitative variable (column 7) and qualitative variable (column 10)
obs1 <- Ceram[, c(7,10)]
obs1[,2] <- factor(obs1[,2])
obs2 <- na.omit(obs1)
# Rank of each observation in the pooled sample
ri <- rank(obs2[,1], ties.method="average")
obs3 <- data.frame(obs2, ri)
names(obs3)[3] <- "Rank"
obs3
# One column per class, padded with NA so that all columns have the same length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs4 <- data.frame(tempo)
Kruskal-Wallis test
In detail:
# Descriptive statistics of each class
nc <- sapply(split(obs2[, 1], obs2[, 2]), length)
me.c <- sapply(split(obs2[,1], obs2[,2]), median)
min.c <- sapply(split(obs2[,1], obs2[,2]), min)
max.c <- sapply(split(obs2[,1], obs2[,2]), max)
mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
sc <- sapply(split(obs2[, 1], obs2[, 2]), sd)
param <- data.frame(nc, me.c, min.c, max.c, mc, sc)
names(param) <- c("Effective", "Median", "Minimum", "Maximum", "Mean", "Standard.deviation")
row.names(param) <- levels(obs2[, 2])
param
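Kruskal-Wallis test
In detail:
# Statistic K from the rank sums of each class
n <- length(obs3$Rank)
Rc <- tapply(obs3$Rank, obs3[, 2], "sum")
K.calc <- 12*sum(Rc^2/nc)/(n*(n+1))-3*(n+1)
K.calc
# Correction for ties
ti <- table(obs3[,1])
Kcorr <- K.calc/(1-sum(ti^3-ti)/(n^3-n))
Kcorr
# p-value from the chi square distribution with k-1 degrees of freedom
k <- length(nc)
pchisq(Kcorr, k-1, lower.tail=FALSE)
# Class labels (A, B, ...) repeated nc times, and the ranks 1 to n
k <- length(nc)
n <- sum(nc)
vecteur.ini <- NULL
for(i in 1:k) {vecteur.ini <- c(vecteur.ini, rep(LETTERS[i], nc[i]))}
rank <- 1:n
## Test in R
kruskal.test(obs1[,1], obs1[,2])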
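To see the effect of the tie correction, the computation can be repeated on a few made-up values that contain ties. The vectors x and g below are hypothetical; K, Kcorr, Rc, nc and ti follow the notation of the slides, and the result is compared with kruskal.test().

```r
# Hypothetical data with ties (3.4, 5.0 and 6.1 each appear twice)
x <- c(2.1, 3.4, 3.4, 5.0, 4.2, 6.1, 5.0, 7.3, 8.0, 6.1)
g <- factor(rep(c("A", "B", "C"), times = c(3, 4, 3)))
n <- length(x)

# Average ranks, then rank sum and size of each class
r  <- rank(x, ties.method = "average")
Rc <- tapply(r, g, sum)
nc <- tapply(r, g, length)

# Uncorrected statistic
K <- 12 * sum(Rc^2 / nc) / (n * (n + 1)) - 3 * (n + 1)

# Tie correction: ti counts how many times each distinct value occurs
ti <- table(x)
Kcorr <- K / (1 - sum(ti^3 - ti) / (n^3 - n))

p.value <- pchisq(Kcorr, nlevels(g) - 1, lower.tail = FALSE)

K
Kcorr
p.value
kruskal.test(x, g)
```

Values that occur only once contribute 0 to the sum, so without ties the correction factor is 1 and Kcorr equals K.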
Fisher-Snedecor test
Aim: Comparison of two observed variances s1² and s2²
Measured variable: A quantitative variable and a qualitative variable with 2 classes
Conditions of use:
The individuals constituting the sample must be randomly chosen one by one.
The classes of the variable must be mutually exclusive.
Test hypotheses:
H0: σ1² = σ2². The variances are identical in the population.
H1 (two-sided): σ1² ≠ σ2². The variances are different in the population.
H1 (one-sided, upper): σ1² > σ2². The variance σ1² is strictly greater than σ2² in the population.
H1 (one-sided, lower): σ1² < σ2². The variance σ1² is strictly smaller than σ2² in the population.
The statistic is:
$F_{\nu_1, \nu_2} = \dfrac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2}$
where the two estimated variances are s2.1 and s2.2 in the R code, with ν1 = n1 - 1 and ν2 = n2 - 1.
Fisher-Snedecor test
In detail:
Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
# Quantitative variable (column 6) for the round and flat bases (column 13)
obs1 <- data.frame(Ceram[which(Ceram$Base=="Round" | Ceram$Base=="Flat"), c(6,13)])
obs1[,2] <- factor(obs1[,2])
obs2 <- na.omit(obs1)
# One column per class, padded with NA so that both columns have the same length
nc.max <- max(table(obs2[,2]))
nb.na <- nc.max - table(obs2[,2])
tempo <- split(obs2[,1], obs2[,2])
for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
obs3 <- data.frame(tempo)
obs3
# Centre each class on its own mean
mc <- sapply(split(obs2[,1], obs2[,2]), mean)
obs4 <- obs2
obs4[which(obs2[, 2] == levels(obs2[, 2])[1]), 1] <- obs4[which(obs2[, 2] == levels(obs2[, 2])[1]), 1] - mc[1]
obs4[which(obs2[, 2] == levels(obs2[, 2])[2]), 1] <- obs4[which(obs2[, 2] == levels(obs2[, 2])[2]), 1] - mc[2]
Fisher-Snedecor test
In detail:
# Size, variance and mean of each class
n1 <- sapply(split(obs2[, 1], obs2[, 2]), length)[1]
n2 <- sapply(split(obs2[, 1], obs2[, 2]), length)[2]
s2.1 <- sapply(split(obs2[, 1], obs2[, 2]), var)[1]
s2.2 <- sapply(split(obs2[, 1], obs2[, 2]), var)[2]
m1 <- sapply(split(obs2[, 1], obs2[, 2]), mean)[1]
m2 <- sapply(split(obs2[, 1], obs2[, 2]), mean)[2]
param <- data.frame(c(n1, n2), c(s2.1, s2.2), c(s2.1^0.5, s2.2^0.5), c(m1, m2))
names(param) <- c("Effective", "Variance", "Standard.deviation", "Mean")
row.names(param) <- levels(obs2[, 2])
param
# F statistic, degrees of freedom and two-sided p-value
F.calc <- s2.1/s2.2
F.calc
nu.1 <- n1-1
nu.2 <- n2-1
min(pf(F.calc, nu.1, nu.2, lower.tail=FALSE), pf(F.calc, nu.1, nu.2))*2
# Lower and upper bounds of the 95% confidence interval of the variance ratio
F.calc/qf(0.025, nu.1, nu.2, lower.tail=FALSE)
F.calc/qf(0.025, nu.1, nu.2)
## In R
var.test(obs3[, 1], obs3[, 2])
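The variance ratio, its p-value and its confidence interval can be reproduced on two short made-up samples and compared with var.test(). The vectors x1 and x2 are hypothetical; F.calc, nu.1 and nu.2 follow the names used above.

```r
# Hypothetical samples (made-up values)
x1 <- c(12.1, 13.4, 11.8, 14.2, 12.9, 15.0)
x2 <- c(10.5, 11.2, 12.0, 10.9, 11.6)

# Ratio of the two sample variances and its degrees of freedom
F.calc <- var(x1) / var(x2)
nu.1 <- length(x1) - 1
nu.2 <- length(x2) - 1

# Two-sided p-value: twice the smaller tail of the F distribution
p.value <- 2 * min(pf(F.calc, nu.1, nu.2),
                   pf(F.calc, nu.1, nu.2, lower.tail = FALSE))

# 95% confidence interval of the ratio of the population variances
ci <- c(F.calc / qf(0.975, nu.1, nu.2),
        F.calc / qf(0.025, nu.1, nu.2))

F.calc
p.value
ci
var.test(x1, x2)
```

H0 (equal variances) corresponds to a ratio of 1, so it is rejected at the 5% level exactly when 1 lies outside this interval.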
Spearman correlation test
Aim: Comparison of a correlation coefficient with the theoretical value zero
Measured variable: Two quantitative variables x and y
Conditions of use:
The individuals constituting the sample must be randomly chosen one by one.
Each individual must have a value for each variable.
Test hypotheses:
H0: ρ = 0. The correlation coefficient is zero in the population.
H1 (two-sided): ρ ≠ 0. The correlation coefficient is not zero in the population.
H1 (one-sided, upper): ρ > 0. The correlation coefficient is positive in the population.
H1 (one-sided, lower): ρ < 0. The correlation coefficient is negative in the population.
The statistic is:
$t = \dfrac{\rho_{xy}}{\sqrt{\dfrac{1 - \rho_{xy}^2}{n - 2}}}$
In R: rho.xy/((1-rho.xy^2)/(n-2))^0.5
The equivalent statistic S reported by cor.test (the sum of the squared rank differences when there are no ties) is, in R: n*(n^2-1)*(1-rho.xy)/6
Spearman correlation test
In detail:
Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
# Two quantitative variables (columns 2 and 4) for the ceramics found with human remains
obs1 <- data.frame(Ceram[which(Ceram$Human.remains=="Yes"), c(2,4)])
obs2 <- na.omit(obs1)
# Ranks of each variable
rxi <- rank(obs2[,1], ties.method = "average")
ryi <- rank(obs2[,2], ties.method = "average")
obs3 <- data.frame(rxi, ryi)
names(obs3) <- c(paste("rank of", names(obs2)[1]), paste("rank of", names(obs2)[2]))
obs3
n <- length(obs2[, 1])
n
# Spearman coefficient: Pearson correlation computed on the ranks
rho.xy <- (sum(rxi*ryi) - n*((n+1)/2)^2) / ((sum(rxi^2) - n*((n+1)/2)^2)^0.5 * (sum(ryi^2) - n*((n+1)/2)^2)^0.5)
rho.xy
cor(rank(obs2[,1]), rank(obs2[,2]))
# Statistic S, and the sum of squared rank differences for comparison
Scalc <- n*(n^2-1)*(1-rho.xy)/6
Scalc
sum((ryi-rxi)^2)
Spearman correlation test
In detail:
## In R
cor.test(obs1[,1], obs1[,2], method="spearman")
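The link between the coefficient, the statistic S and the t statistic can be checked on a few made-up pairs without ties. The vectors x and y are hypothetical and the names rho, S and t.calc are chosen here for illustration. The t-based p-value is only an approximation and is not expected to equal the exact p-value that cor.test() reports for such a small sample.

```r
# Hypothetical paired measurements without ties (made-up values)
x <- c(3.1, 4.5, 2.2, 6.8, 5.0, 7.7, 1.9)
y <- c(2.0, 3.9, 2.5, 6.0, 6.5, 7.1, 1.2)
n <- length(x)

# Ranks of each variable
rx <- rank(x)
ry <- rank(y)

# Spearman coefficient: Pearson correlation of the ranks
rho <- cor(rx, ry)

# Sum of squared rank differences (the S reported by cor.test when there are no ties)
S <- sum((rx - ry)^2)

# t statistic of the slide and its approximate two-sided p-value
t.calc <- rho / sqrt((1 - rho^2) / (n - 2))
p.approx <- 2 * pt(abs(t.calc), n - 2, lower.tail = FALSE)

rho
S
t.calc
p.approx
cor.test(x, y, method = "spearman")
```

Without ties, rho equals 1 - 6*S/(n*(n^2-1)), which is the relation behind the expression n*(n^2-1)*(1-rho.xy)/6 used for S.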