The permutation test for PCA proceeds as follows:
For each replicate:
1. Individually permute each column of the data matrix.
2. Run the PCA and record the proportion of variance explained by each of the components 1 to s.
Repeat steps 1 and 2 R times.
At the end of this we will have a matrix with R rows and s columns that contains the proportion of variance explained by each component for each replicate.
Finally, compare the observed values from the original data with the values from the permutations to obtain an approximate p-value for each component: the fraction of replicates in which the permuted proportion of variance exceeds the observed one.
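The reason for permuting each column separately can be seen on a toy example. The sketch below is only an illustration on simulated data (the variables `z`, `x` and `x.perm` are made up for this purpose): shuffling each column independently keeps the values of every variable but breaks the correlation between variables, which is the null situation the observed PCA is compared against.

```r
set.seed(1)
n <- 200
z <- rnorm(n)

# two simulated variables that share the common signal z
x <- cbind(a = z + rnorm(n, sd = 0.3),
           b = z + rnorm(n, sd = 0.3))

# permute each column independently, as in step 1
x.perm <- apply(x, 2, sample)

# correlation between the two columns before and after permutation
cor.orig <- cor(x)[1, 2]       # strongly positive by construction
cor.perm <- cor(x.perm)[1, 2]  # expected to be close to zero

# each column still holds the same values, only in a different order
same.values <- all(sort(x[, 1]) == sort(x.perm[, 1]))

print(c(original = cor.orig, permuted = cor.perm))
print(same.values)
```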
# Function to assess the significance of the principal components.
# Note: s must not exceed the number of columns of x.
sign.pc <- function(x, R = 1000, s = 10, cor = TRUE, ...) {
  # run PCA on the original data
  pc.out <- princomp(x, cor = cor, ...)
  # proportion of variance explained by each of the first s PCs
  pve <- (pc.out$sdev^2 / sum(pc.out$sdev^2))[1:s]
  # a matrix with R rows and s columns that contains
  # the proportion of variance explained by each PC
  # for each permutation replicate
  pve.perm <- matrix(NA, ncol = s, nrow = R)
  for (i in 1:R) {
    # permute each column independently
    x.perm <- apply(x, 2, sample)
    # run PCA on the permuted data
    pc.perm.out <- princomp(x.perm, cor = cor, ...)
    # proportion of variance explained by each PC of the permuted data
    pve.perm[i, ] <- (pc.perm.out$sdev^2 / sum(pc.perm.out$sdev^2))[1:s]
  }
  # calculate the p-values: fraction of replicates whose
  # proportion of variance exceeds the observed one
  pval <- apply(t(pve.perm) > pve, 1, sum) / R
  return(list(pve = pve, pval = pval))
}

# apply the function
library(RCurl)
data <- getURL("https://raw.githubusercontent.com/bioops/mis_scripts/master/statistics/data/pca.txt")
OCRdata <- read.table(text = data, header = TRUE, sep = "\t")
OCRdat <- OCRdata[, -1]  # leave out the location id column
sign.pc(OCRdat, cor = TRUE)
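For a quick check that does not depend on downloading the data file, the same idea can be tried on simulated data with a known structure. This is a minimal sketch, not the function above: it uses `prcomp` with `scale. = TRUE` (which corresponds to a correlation-based PCA) in place of `princomp`, and the helper `pve_of` and the simulated matrix are hypothetical names introduced only for this illustration.

```r
set.seed(42)
n <- 100
f <- rnorm(n)

# simulated data (assumption): the first three columns share a common
# factor f, the last three columns are pure noise
x <- cbind(sapply(1:3, function(j) f + rnorm(n, sd = 0.5)),
           matrix(rnorm(n * 3), n, 3))

# hypothetical helper: proportion of variance explained by each PC
pve_of <- function(m) {
  v <- prcomp(m, scale. = TRUE)$sdev^2
  v / sum(v)
}

obs <- pve_of(x)

# R permutation replicates, one row per replicate
R <- 200
perm <- t(replicate(R, pve_of(apply(x, 2, sample))))

# p-value per component: fraction of replicates exceeding the observed value
pval <- colMeans(sweep(perm, 2, obs, ">"))

print(round(obs, 3))
print(pval)
```

With one strong shared factor, the first component should come out with a very small p-value, while the later components should not be expected to stand out from the permuted data.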