Statistical analysis code for analysis of CASTxB6 F2 mouse cross

2. Network analysis in liver and adipose

Peter Langfelder

March 23, 2011

Contents

1 Setting up the R session and loading of data

2 Relationships among the physiological traits

3 Network construction and module identification
3.a Scale-free topology analysis
3.b Network construction and module identification
3.c Merging of closely-related modules
3.d Trimming of genes with low module membership
3.e Identification and removal of linkage-driven modules
3.f Gene clustering dendrograms and module colors

4 GO enrichment analysis
4.a Exporting lists of genes in each module
4.b GO enrichment analysis in WGCNA

5 Modules related to physiological traits
5.a Module-trait relationships for all modules that relate significantly to a trait
5.b Network plots of all module eigengenes and traits

6 Output of module membership, eigengenes, and eigengene correlations

7 Overlap of liver and adipose modules

8 Gene significance and module membership in HDL-related modules are correlated

9 Module significance in male data validates association found in female data

10 Cross-referencing with genes implicated in GWA studies

1 Setting up the R session and loading of data

In this document we detail our network analysis of the CASTxB6 cross. We use the WGCNA package [1] to construct the gene co-expression network, find modules, relate them to the clinical traits, study GO enrichment, and perform other tasks. We use the pre-processed data created in part 1.

# Set working directory. This step is necessary if your data is saved in a directory other than the current

# directory. Replace the path name below with the directory where the data is stored on your drive.

# setwd("Z:/home/plangfelder/Work/Mouse-ReciprocalCXB/CxBOnly");

# Load the WGCNA library

library(WGCNA)

# This setting is important, do not leave out

options(stringsAsFactors = FALSE);

options(width = 109)

set.seed(1); #needed for .Random.seed to be defined

We now set up a few basic variables and load the preprocessed expression data. Liver and adipose will be indexed 1 and 2, respectively. The files necessary for this step were generated in part 1 of the analysis.
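The analysis stores each tissue as one element of a list, with the samples-by-probes expression matrix in the component `data`. The following is a minimal sketch of that layout on simulated numbers; the names `toyExpr`, `toyNSamples` and `toyNGenes`, the sample identifiers and the dimensions are made up for illustration and are not part of the original analysis.

```r
# Toy stand-in for the multi-set format: one list element per tissue,
# each holding a samples-by-probes matrix in the component `data`.
set.seed(1)
nSetsToy = 2
toyExpr = vector(mode = "list", length = nSetsToy)
for (set in 1:nSetsToy)
{
  m = matrix(rnorm(5 * 3), nrow = 5, ncol = 3,
             dimnames = list(paste0("F2_", 1:5), paste0("probe", 1:3)))
  toyExpr[[set]] = list(data = m)
}
# Per-set dimensions, similar in spirit to the sizes reported by checkSets
toyNSamples = sapply(toyExpr, function(x) nrow(x$data))
toyNGenes = sapply(toyExpr, function(x) ncol(x$data))
```

Keeping every tissue in the same structure is what allows the loops over `set` in the rest of this document to treat liver and adipose identically.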

nSets = 2;

setLabels = c("Female Liver", "Female Adipose");

shortLabels = c("Liver", "Adipose");

shortshortLabels = c("L", "A");

# Load expression data

files = c("../CxBOnly-Liver-outliersRemoved-exprFemaOR-pValFemaOR.RData",

"../CxBOnly-Adipose-outliersRemoved-exprFemaOR-pValFemaOR.RData");

express = list();

for (set in 1:nSets)

{

x = load(file = files[set]);

express[[set]] = list(data = exprFemaOR);

}

expr = express;

rm(express);

collectGarbage();

exprSize = checkSets(expr);

nSamples = exprSize$nSamples;

collectGarbage()

We now load the trait data and isolate numeric traits measured at the time the animals were sacrificed.

rawTr = read.csv(file = bzfile("../../../Data-AllMouse/CXB_Clinical_traits.csv.bz2"))

numTraitInd = c(15:46, 48)

numTraits = vector(mode = "list", length = nSets);

# The following is relative to numTraitInd

selTraitInd = c(5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33)

selTraits = vector(mode = "list", length = nSets);

for (set in 1:nSets)

{

mice = rownames(expr[[set]]$data);

expr2tr = match(mice, rawTr$Mice_id);

temp = rawTr[expr2tr, numTraitInd]

rownames(temp) = rawTr$Mice_id[expr2tr];

numTraits[[set]] = list(data = as.matrix(temp));

selTraits[[set]] = list(data = as.matrix(temp[, selTraitInd]));

}

collectGarbage();

Some trait measurements appear to be incorrect or outliers. For example, one mouse (F2 391) has a recorded body length of nearly 30 cm (1 foot), which is clearly a measurement (or recording) error. Similarly, some fat measurements are unrealistically high. We set the unrealistic measurements (body length above 20 cm, efat above 20 g) to missing values.

for (set in 1:nSets)

{

suspicious = numTraits[[set]]$data[,"length_cm"] > 20;

numTraits[[set]]$data[suspicious,"length_cm"] = NA;

selTraits[[set]]$data[suspicious,"length_cm"] = NA;

}

# remove the efat outlier

for (set in 1:nSets)

{

suspicious = numTraits[[set]]$data[,"efat_g"] > 20;

numTraits[[set]]$data[suspicious,"efat_g"] = NA;

selTraits[[set]]$data[suspicious,"efat_g"] = NA;

}

nSelTraits = length(selTraitInd)

tcInd = match("e_tc_mgdl", colnames(selTraits[[1]]$data));

Next we modify trait names to make them more descriptive.

renameTable = matrix(c("e_bweight_g", "e_fat_g", "e_mus_g", "e_fluid_g", "e_fat_per", "e_tg_mgdl",

"e_tc_mgdl", "e_hdl_mgdl", "e_uc_mgdl", "e_ffa_mgdl", "e_glu_mgdl", "length_cm", "efat_g", "rfat_g",

"vfat_g", "sfat_g", "insulin_pgml", "leptin_ngml", "bmd_mgcm2",

"weight", "fat", "muscle", "fluid", "fat.frac", "trigly", "tot.chol", "HDL",

"unest.chol", "FFA", "glucose", "length", "efat", "rfat", "vfat", "sfat",

"insulin", "leptin", "BMD"), ncol = 2, nrow = nSelTraits);

ind = match(renameTable[, 1], colnames(selTraits[[1]]$data));

renameTable = renameTable[ind, ];

2 Relationships among the physiological traits

Here we plot a heatmap of correlations among the traits.

# Calculate the matrix and order it using a hierarchical clustering dendrogram

mat = bicor(selTraits[[1]]$data, use = 'p');

order = hclust(as.dist(1-mat), method = 'a')$order;

# Open a suitably sized graphics window

sizeGrWindow(9,7);

# Alternatively, plot into a file. Make sure the directory Plots exists or change the file name

# appropriately.

# pdf(file = "Plots/Liver-allTraitCorHeatmap.pdf", width = 9, height = 7);

par(mar = c(5, 6, 2, 1));

labeledHeatmap(mat[order, order],

xLabels = renameTable[order, 2],

yLabels = renameTable[order, 2],

colors = greenWhiteRed(50),

zlim = c(-max(abs(mat)), max(abs(mat))),

setStdMargins = FALSE, cex.lab = 1.2,

main = "Correlation heatmap of physiological traits",

textMat = round(mat[order, order], 2), cex.text = 0.7)

# If plotting into a file, close it

dev.off();

The result is shown in Figure 1. Many of the traits are strongly correlated.
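The ordering step used for the heatmap can be illustrated on a small simulated trait table. This sketch uses base R only and substitutes the Pearson correlation for `bicor`, which is an assumption made so that the example runs without WGCNA; the object names (`toyTraits`, `toyMat`, `toyOrder`) are hypothetical.

```r
# Order a correlation matrix by average-linkage clustering on 1 - correlation.
set.seed(2)
toyTraits = matrix(rnorm(40 * 4), nrow = 40, ncol = 4,
                   dimnames = list(NULL, c("a", "b", "c", "d")))
# Make traits a and b nearly identical so that they cluster together
toyTraits[, "b"] = toyTraits[, "a"] + 0.1 * rnorm(40)
# A missing value, handled by pairwise-complete observations
toyTraits[3, "c"] = NA
toyMat = cor(toyTraits, use = "pairwise.complete.obs")
toyOrder = hclust(as.dist(1 - toyMat), method = "average")$order
toyOrdered = toyMat[toyOrder, toyOrder]
```

Reordering rows and columns by the same permutation keeps the matrix symmetric and places strongly correlated traits next to each other, which is what makes blocks such as the adiposity traits visible in the heatmap.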

[Figure 1 image: "Correlation heatmap of physiological traits", color scale from −1 to 1. The correlations printed in the heatmap cells are reproduced below; rows and columns follow the same clustering order.]

                    1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
 1 fluid            1 -0.23 -0.17 -0.27 -0.19 -0.29 -0.42   0.1  0.01 -0.12  -0.3 -0.65 -0.67 -0.74 -0.45 -0.57 -0.55 -0.57  -0.6
 2 insulin      -0.23     1  0.07  0.05  0.06  0.18   0.2  0.06  0.07  0.05  0.27  0.32   0.3  0.29  0.25  0.31  0.26  0.31   0.3
 3 trigly       -0.17  0.07     1  0.76  0.51  0.38  0.31  0.09  0.09 -0.07   0.2   0.3  0.27  0.26  0.26  0.31  0.29  0.26  0.27
 4 FFA          -0.27  0.05  0.76     1  0.58  0.43  0.46     0 -0.02 -0.04  0.11  0.36  0.32  0.34  0.26  0.32  0.34  0.31  0.33
 5 unest.chol   -0.19  0.06  0.51  0.58     1  0.81  0.68     0 -0.05 -0.04  0.24  0.26  0.18  0.19  0.17  0.23  0.21  0.19  0.25
 6 tot.chol     -0.29  0.18  0.38  0.43  0.81     1  0.84  0.02  0.04  0.06  0.33  0.41  0.32  0.35   0.3  0.33  0.35   0.3  0.38
 7 HDL          -0.42   0.2  0.31  0.46  0.68  0.84     1  0.11   0.1  0.16  0.33  0.59  0.47  0.49  0.46  0.49  0.48  0.47  0.52
 8 BMD            0.1  0.06  0.09     0     0  0.02  0.11     1  0.53  0.36  0.07   0.2  0.17   0.1  0.37   0.2  0.26  0.19  0.19
 9 muscle        0.01  0.07  0.09 -0.02 -0.05  0.04   0.1  0.53     1  0.57  0.18  0.32  0.48  0.34  0.67  0.41  0.45  0.43  0.38
10 length       -0.12  0.05 -0.07 -0.04 -0.04  0.06  0.16  0.36  0.57     1  0.04  0.37   0.4  0.34  0.59  0.41  0.45  0.44   0.4
11 glucose       -0.3  0.27   0.2  0.11  0.24  0.33  0.33  0.07  0.18  0.04     1  0.33  0.34  0.32  0.41  0.38  0.42   0.3  0.35
12 leptin       -0.65  0.32   0.3  0.36  0.26  0.41  0.59   0.2  0.32  0.37  0.33     1  0.83  0.81  0.74  0.82  0.77  0.81   0.8
13 fat          -0.67   0.3  0.27  0.32  0.18  0.32  0.47  0.17  0.48   0.4  0.34  0.83     1  0.97  0.85  0.89  0.86   0.9  0.86
14 fat.frac     -0.74  0.29  0.26  0.34  0.19  0.35  0.49   0.1  0.34  0.34  0.32  0.81  0.97     1  0.78  0.85  0.82  0.85  0.85
15 weight       -0.45  0.25  0.26  0.26  0.17   0.3  0.46  0.37  0.67  0.59  0.41  0.74  0.85  0.78     1  0.85  0.88  0.89  0.85
16 sfat         -0.57  0.31  0.31  0.32  0.23  0.33  0.49   0.2  0.41  0.41  0.38  0.82  0.89  0.85  0.85     1  0.88  0.88  0.88
17 vfat         -0.55  0.26  0.29  0.34  0.21  0.35  0.48  0.26  0.45  0.45  0.42  0.77  0.86  0.82  0.88  0.88     1  0.89  0.89
18 efat         -0.57  0.31  0.26  0.31  0.19   0.3  0.47  0.19  0.43  0.44   0.3  0.81   0.9  0.85  0.89  0.88  0.89     1  0.91
19 rfat          -0.6   0.3  0.27  0.33  0.25  0.38  0.52  0.19  0.38   0.4  0.35   0.8  0.86  0.85  0.85  0.88  0.89  0.91     1

Figure 1: Heatmap of correlations among the physiological traits. Many traits are strongly correlated, particularly the adiposity traits.

3 Network construction and module identification

In this section we construct the co-expression network and identify co-expression modules. We construct a “signed hybrid” network in which the adjacency $a_{ij}$ of nodes $i$, $j$ with expression profiles $x_i$, $x_j$ is defined as

\begin{equation}
a_{ij} =
\begin{cases}
\mathrm{bicor}^{\beta}(x_i, x_j) & \text{for } \mathrm{bicor}(x_i, x_j) > 0, \\
0 & \text{otherwise,}
\end{cases}
\tag{1}
\end{equation}

where bicor is the biweight mid-correlation [3], a type of robust (that is, outlier-insensitive) correlation.
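Equation (1) can be written out in a few lines of base R. This is only an illustrative sketch: it uses the Pearson correlation as a stand-in for bicor (an assumption, so that the example does not depend on WGCNA), and the function name `signedHybridAdjacency` and the toy data are hypothetical.

```r
# Signed hybrid adjacency: positive correlations are raised to the power beta,
# zero and negative correlations are set to 0.
signedHybridAdjacency = function(datExpr, beta)
{
  r = cor(datExpr, use = "pairwise.complete.obs")  # Pearson stand-in for bicor
  a = r^beta
  a[r <= 0] = 0
  a
}

set.seed(3)
x1 = rnorm(30)
toyData = cbind(g1 = x1,
                g2 = x1 + rnorm(30, sd = 0.5),    # positively correlated with g1
                g3 = -x1 + rnorm(30, sd = 0.5))   # negatively correlated with g1
toyAdj = signedHybridAdjacency(toyData, beta = 4)
```

In this toy example the pair (g1, g2) receives a positive adjacency, while the negatively correlated pair (g1, g3) is treated as unconnected.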

3.a Scale-free topology analysis

One of the important network construction parameters is the soft-thresholding power β. We apply the approximate scale-free topology criterion to select an appropriate power in each tissue separately. Note of caution: this code takes some time (possibly several hours) to run. Please be patient, or, if you trust our results, skip this part.

powers = c(seq(1,10,by=1));

powerTables = vector(mode = "list", length = nSets);

for (set in 1:nSets)

powerTables[[set]] = list(data =

pickSoftThreshold(expr[[set]]$data, powerVector=powers,

networkType = "signed hybrid",

verbose = 2 )[[2]]);

save(powerTables, file = "CxBOnly-Female-powerTables.RData");

collectGarbage();

We plot the results of the scale-free topology analysis.

sizeGrWindow(12,9)

#pdf(file = "Plots/Female-AL-ScaleFreeTopology.pdf", width = 12, height = 9);

par(mfrow = c(2,2));

cex1 = 0.7;

for (set in 1:nSets)

{

plot(powerTables[[set]]$data[,1], -sign(powerTables[[set]]$data[,3])*powerTables[[set]]$data[,2],

xlab="Soft Threshold (power)",ylab="Scale Free Topology Model Fit,signed R^2",type="n",

main = paste("Scale independence in ", setLabels[set]));

addGrid();

text(powerTables[[set]]$data[,1], -sign(powerTables[[set]]$data[,3])*powerTables[[set]]$data[,2],

labels=powers,cex=cex1,col="red");

# this line corresponds to using an R^2 cut-off of h

abline(h=0.90,col="red")

plot(powerTables[[set]]$data[,1], powerTables[[set]]$data[,5],

xlab="Soft Threshold (power)",ylab="Mean Connectivity", type="n",

main = paste("Mean connectivity in", setLabels[set]))

addGrid();

text(powerTables[[set]]$data[,1], powerTables[[set]]$data[,5], labels=powers, cex=cex1,col="red")

}

# If plotting into a file, close it.

dev.off();

The resulting plot is shown in Figure 2. The networks become approximately scale-free when the soft-thresholding power reaches 3 to 4. We choose the power 4 for both the liver and adipose networks (but in general the powers could be different).
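The idea behind the scale-free fit index can be sketched as a regression of log10 p(k) on log10 k, where p(k) is the binned frequency of the connectivity k. The code below is a simplified illustration and not the implementation used by pickSoftThreshold; the function name `scaleFreeFit`, the equal-width binning and the simulated connectivities are assumptions made for the example.

```r
# Simplified scale-free fit: bin the connectivities, then regress
# log10(frequency) on log10(mean connectivity in the bin).
scaleFreeFit = function(k, nBreaks = 10)
{
  bins = cut(k, breaks = nBreaks)
  meanK = as.numeric(tapply(k, bins, mean))
  pK = as.numeric(table(bins)) / length(k)
  keep = pK > 0                          # drop empty bins before taking logs
  fit = lm(y ~ x, data = data.frame(x = log10(meanK[keep]), y = log10(pK[keep])))
  c(R2 = summary(fit)$r.squared, slope = unname(coef(fit)[2]))
}

set.seed(7)
# Simulated connectivities with a power-law-like tail between 1 and 100
toyK = 1 / runif(2000, min = 0.01, max = 1)
toyFit = scaleFreeFit(toyK)
# Signed fit index as plotted above: positive when the slope is negative
signedR2 = unname(-sign(toyFit["slope"]) * toyFit["R2"])
```

A network is considered approximately scale-free when this signed index is high, which is why the plots above mark the value 0.90 with a horizontal line.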

[Figure 2 image: four panels, "Scale independence in Female Liver", "Mean connectivity in Female Liver", "Scale independence in Female Adipose" and "Mean connectivity in Female Adipose". Horizontal axis: Soft Threshold (power), 1 to 10; vertical axes: Scale Free Topology Model Fit, signed R^2, and Mean Connectivity.]

Figure 2: Scale-free topology analysis. The left panels show the scale-free topology fit index R^2 as a function of the soft-thresholding power. The right panels show mean network connectivity. The networks become approximately scale-free when the soft-thresholding power reaches 3 to 4.

3.b Network construction and module identification

Here we use the function blockwiseModules to construct the networks and identify modules. The function has multiple arguments and options; here we leave most of them at their default values. We save the result of the calculation so it only needs to be executed once.

Note of caution: this code assumes that the computer it runs on has enough memory to handle the full data set. This usually means at least 16 GB, but preferably 32 GB. If your computer’s RAM is not large enough, the code will trigger an error. In that case please download the file Female-LA-mods.RData from our web site and continue the analysis below.

Second note of caution: if you do have a large enough computer and run this code, be prepared to wait several hours. The calculation can be sped up substantially by installing a fast BLAS library such as ATLAS BLAS or GotoBLAS and compiling R against it. If you do not know what “installing a library and compiling R against it” means, your best bet is to be patient and/or run this calculation overnight. Again, you may want to download the result Female-LA-mods.RData.

# Set up basic parameters

softPower = c(4,4);

minModSize = c(25, 30);

mergeCutHeight = 0.25;

cutHeight = 0.995;

collectGarbage()

# Call the module construction function for each tissue separately

mods = list();

for (set in 1:nSets)

{

mods[[set]] = blockwiseModules(expr[[set]]$data,

maxBlockSize = 30000,

networkType = "signed hybrid",

corType = "bicor", power = softPower[set],

TOMType = "signed",

TOMDenom = "mean",detectCutHeight = cutHeight,

minModuleSize = minModSize[set],

deepSplit = 2,

mergeCutHeight = mergeCutHeight, saveTOMs = TRUE,

saveTOMFileBase = spaste("CxBOnly-", shortLabels[set], "-consensusTOM"),

reassignThreshold = 1e-6,

minCoreKME = 0.5, minKMEtoStay = 0.3,

numericLabels = TRUE, verbose = 3);

collectGarbage();

}

# Save the results

save(mods, file = "Female-LA-mods.RData");

If the code above has already been run once, or if you downloaded the result Female-LA-mods.RData instead of executing the code, load it:

load(file = "Female-LA-mods.RData");

3.c Merging of closely-related modules

Here we take a look at the eigengene network of the unmerged modules and merge modules whose eigengenes are highly correlated. We choose the thresholds for merging to be correlation 0.80 in liver and 0.90 in adipose.
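A correlation threshold c corresponds to a cut height of 1 − c on a dendrogram built from the dissimilarity 1 − correlation. The sketch below shows this relationship on three simulated eigengenes using base R; it only illustrates the principle and is not the mergeCloseModules implementation. The names `toyMEs`, `toyTree`, `toyGroups` and `corThreshold` are hypothetical, and the Pearson correlation is assumed in place of a robust correlation.

```r
# Eigengenes whose correlation exceeds the threshold end up in the same group.
set.seed(4)
base1 = rnorm(50)
base2 = rnorm(50)
toyMEs = data.frame(ME1 = base1,
                    ME2 = base1 + rnorm(50, sd = 0.2),  # nearly the same as ME1
                    ME3 = base2)                        # unrelated to ME1 and ME2
toyDiss = 1 - cor(toyMEs)
toyTree = hclust(as.dist(toyDiss), method = "average")
corThreshold = 0.80                                     # liver threshold from the text
toyGroups = cutree(toyTree, h = 1 - corThreshold)
```

With the liver threshold of 0.80 the cut height is 0.20, and with the adipose threshold of 0.90 it is 0.10; these are the values used in the code that follows.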

# Set the cut heights (1-correlation)

mergeCut = c(0.20, 0.10)

merge = list();

# Call the module merge function on each tissue

for (set in 1:nSets)

{

merge[[set]] = mergeCloseModules(expr[[set]]$data, mods[[set]]$unmergedColors, cutHeight = mergeCut[set],

getNewUnassdME = TRUE, relabel = TRUE);

}

# Plot the eigengene dendrograms before and after merging

sizeGrWindow(12, 9);

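#pdf("Plots/Female-LA-mergingDendrograms-%02d.pdf", onefile = FALSE, width = 12, height = 10);

for (set in 1:nSets)

{

# Two stacked panels per tissue: before and after merging

par(mfrow = c(2,1));

par(mar = c(0.2, 4, 2.5, 0.2));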

plot(merge[[set]]$oldDendro, main = paste(setLabels[set], "modules before merging"),

sub = "", xlab = "", cex = 0.7);

abline(mergeCut[set], 0, col = "red");

plot(merge[[set]]$dendro, main = paste(setLabels[set], "modules after merging"),

sub = "", xlab = "", cex = 0.7);

abline(mergeCut[set], 0, col = "red");

}

# If plotting into a file, close it

dev.off();

# Put together variables for further use

labels = list();

colors = list();

MEs = list();

for (set in 1:nSets)

{

labels[[set]] = merge[[set]]$colors;

colors[[set]] = labels2colors(labels[[set]]);

MEs[[set]] = orderMEs(merge[[set]]$newMEs);

}

# Save the results so this code does not need re-running later.

save(merge, labels, colors, MEs, file = "Female-LA-merge-colors-labels-MEs.RData");

If the above code already ran once, the results can be loaded in one line of code:

load(file = "Female-LA-merge-colors-labels-MEs.RData");

The resulting module merging dendrograms are shown in Figures 3 and 4. Several modules have been merged in liver but none in adipose.

[Figure 3 image: two dendrogram panels, "Female Liver modules before merging" and "Female Liver modules after merging". Leaves are module eigengenes (ME1, ME2, ...); vertical axis: Height.]

Figure 3: Liver module eigengene dendrograms based on dissimilarity equal to 1 − bicor.

[Figure 4 image: two dendrogram panels, "Female Adipose modules before merging" and "Female Adipose modules after merging". Leaves are the module eigengenes ME1 to ME20; vertical axis: Height.]

Figure 4: Adipose module eigengene dendrograms based on dissimilarity equal to 1 − bicor.

3.d Trimming of genes with low module membership

The module identification method sometimes assigns genes to modules although the gene has very low module membership (defined as the correlation of the gene expression profile and the eigengene). Although such module assignment could be meaningful, we aim for tighter modules and hence we remove module genes whose module membership is below the threshold of 0.30. Since removing any gene from a module in principle changes its eigengene, we iterate this process until no genes are removed.
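The iteration described above can be sketched for a single simulated module. This is a hedged illustration in base R: the eigengene is taken as the first left singular vector of the standardized expression matrix with its sign aligned to the average expression, the Pearson correlation stands in for bicor, and all names (`firstPC`, `toyModule`, `inModule`) and the simulated data are hypothetical.

```r
# Eigengene of a module: first principal component of the scaled data,
# oriented to correlate positively with the average expression profile.
firstPC = function(x)
{
  xs = scale(x)
  e = svd(xs, nu = 1, nv = 0)$u[, 1]
  if (cor(e, rowMeans(xs)) < 0) e = -e
  e
}

set.seed(5)
core = rnorm(60)
# Eight probes following a common profile, two probes anti-correlated with it
toyModule = cbind(sapply(1:8, function(i) core + rnorm(60, sd = 0.5)),
                  -core + rnorm(60, sd = 0.5),
                  -core + rnorm(60, sd = 0.5))
colnames(toyModule) = paste0("probe", 1:10)

inModule = rep(TRUE, ncol(toyModule))
threshold = 0.30
repeat
{
  eig = firstPC(toyModule[, inModule, drop = FALSE])
  kME = as.vector(cor(toyModule[, inModule, drop = FALSE], eig))
  tooLow = kME < threshold
  if (!any(tooLow)) break              # stop once no probe is removed
  inModule[which(inModule)[tooLow]] = FALSE
}
```

In this toy module the two anti-correlated probes are dropped in the first pass and the second pass removes nothing, so the loop stops after recomputing the eigengene once.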

mes = list();

origMEs = list();

trimLabs = labels;

for (set in 1:nSets)

{

changed = TRUE

threshold = 0.30;

mes[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]]);

origMEs[[set]] = mes[[set]];

trimLabs[[set]] = labels[[set]]

while (changed)

{

changed = FALSE;

nMods = ncol(mes[[set]]$eigengenes)

modNames = substring(names(mes[[set]]$eigengenes), 3)

#pind= initProgInd();

for (mod in 1:nMods) if (modNames[mod]!='0')

{

modGeneInd = (as.character(trimLabs[[set]])==modNames[mod]);

nModGenes = sum(modGeneInd);

KME = bicor(expr[[set]]$data[, modGeneInd], mes[[set]]$eigengenes[, mod], use = 'p');

remove = KME < threshold;

if (sum(remove)>0) changed = TRUE;

printFlush("module", modNames[mod], ": removing", sum(remove), "of", length(remove), "genes.");

trimLabs[[set]][modGeneInd][remove] = 0;

#pind = updateProgInd(mod/nMods, pind);

}

#printFlush("");

# Redo module eigengene calculation

if (changed) mes[[set]] = moduleEigengenes(expr[[set]]$data, colors = trimLabs[[set]]);

}

}

Next we check how much the eigengenes have changed. We calculate the correlations between the “original” (i.e., before gene trimming) and new eigengenes.

#out of curiosity: correlations between original and trimmed module eigengenes:

signif(diag(cor(mes[[1]]$eigengenes, origMEs[[1]]$eigengenes)), 3);

signif(diag(cor(mes[[2]]$eigengenes, origMEs[[2]]$eigengenes)), 3);

signif(min(abs(diag(cor(mes[[1]]$eigengenes, origMEs[[1]]$eigengenes))[-1])), 3);

signif(min(abs(diag(cor(mes[[2]]$eigengenes, origMEs[[2]]$eigengenes))[-1])), 3);

Excluding the eigengene of the improper module 0 (which collects the unassigned genes), the minimum correlation of old and new eigengenes is 0.999, which indicates that although some outlying genes were removed, the eigengenes have practically not changed. We now re-form eigengenes, save the results of this part and replace the module labels with the trimmed labels.

MEs = list();

ordMEs = list();

for (set in 1:nSets)

{

MEs[[set]] = moduleEigengenes(expr[[set]]$data, trimLabs[[set]]);

ordMEs[[set]] = orderMEs(MEs[[set]], greyName = "ME0");

}

# Save the results so they can be loaded in future

save(trimLabs, MEs, ordMEs, file = "Female-LA-trimLabs.RData");

#load(file = "Female-LA-trimLabs.RData");

labels = trimLabs;

colors = lapply(labels, labels2colors)

3.e Identification and removal of linkage-driven modules

Some of the smaller modules appear to be linkage-driven in the sense that they group together genes located in a single chromosomal region and their eigengene is highly correlated with a genotype at that locus. Although the genes in such modules are co-expressed in this particular CASTxB6 cross, they would likely not be co-expressed in a random (diverse) population. Therefore we identify such modules and remove them from the analysis (by setting the module labels of the corresponding genes to 0). We start by loading and formatting the genotype data.

# (re) read the gene annotation table

file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

#file = "../Data-CXB/CXB_all_gene_annotation.csv"

annotX = read.csv(file = file);

# Read the SNP data and sort them

file = bzfile(description = "../../../Data-AllMouse/CXB_GENOTYPES_numeric.csv.bz2");

gtInfo = read.csv(file = file);

file = bzfile(description = "../../../Data-AllMouse/CXB_GENOTYPES_alpha.csv.bz2");

gtAlpha = read.csv(file = file);

# Correct the coding of numeric genotypes

num2alpha = match(gtInfo$marker_name, gtAlpha$marker_name);

gtAlphaN = gtAlpha[num2alpha, ]

all.equal(names(gtAlphaN), names(gtInfo))

gtInfo[gtAlphaN=='H'] = 1;

gtInfo[gtAlphaN=='B'] = 2;

collectGarbage()

# Sort the SNPs:

snpHasAnno = is.finite(gtInfo$chro_number) & is.finite(gtInfo$marker_pos_Bp);

gtInfoA = gtInfo[snpHasAnno, ];

SNPorder = order(gtInfoA$chro_number, gtInfoA$marker_pos_Bp);

gtInfoS = gtInfoA[SNPorder, ];

gtCols = substring(names(gtInfoS), 1, 3)=="F2_";

gtSamples = names(gtInfoS)[gtCols];

Next we identify and remove modules whose highest correlation with a SNP is above 0.5.
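The criterion can be illustrated on simulated data before applying it to the real genotypes. The sketch below uses base R with the Pearson correlation in place of bicorAndPvalue (an assumption made to keep the example self-contained); the genotype coding 0/1/2, the object names and the sample size are hypothetical.

```r
# Flag module eigengenes whose strongest absolute correlation with any SNP exceeds 0.5.
set.seed(6)
nMice = 80
toyGt = matrix(sample(0:2, nMice * 5, replace = TRUE), nrow = nMice,
               dimnames = list(NULL, paste0("snp", 1:5)))
toyEig = cbind(MEa = toyGt[, "snp2"] + rnorm(nMice, sd = 0.3),  # driven by snp2
               MEb = rnorm(nMice))                               # unrelated to genotype
snpMEcor = abs(cor(toyGt, toyEig))        # SNPs in rows, eigengenes in columns
maxCor = apply(snpMEcor, 2, max)
bestSNP = rownames(snpMEcor)[apply(snpMEcor, 2, which.max)]
linkageDriven = maxCor > 0.5
```

Here the eigengene MEa is flagged together with its best marker snp2, while MEb is left alone; in the real analysis the genes of a flagged module are relabeled as unassigned (label 0).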

# Identify modules whose correlation with the best SNP is above 0.5

cleanLabels = labels;

for (set in 1:nSets)

{

common = intersect(rownames(expr[[set]]$data), gtSamples);

expr2gt = match(rownames(expr[[set]]$data), gtSamples);

print(table(is.na(expr2gt)))

# all expression-measured samples have a genotype, good.

gt = t(gtInfoS[, gtCols][, expr2gt]);

gtAnnot = gtInfoS[, c(2,5,6)];

collectGarbage()

x = bicorAndPvalue(gt, MEs[[set]]$eigengenes);

bestP = apply(x$p, 2, min, na.rm = TRUE)

whichP = apply(x$p, 2, which.min)

maxCor = apply(abs(x$bicor), 2, max);

which = apply(abs(x$bicor), 2, which.max);

if (!isTRUE(all.equal(whichP, which))) stop("which and whichP do not agree.");

suspicious = maxCor > 0.5;

suspInfo = data.frame(module = substring(names(MEs[[set]]$eigengenes)[suspicious], 3),

gtAnnot[which[suspicious], ],

absCor.SNP.ME = maxCor[suspicious],

pValue.SNP.ME = bestP[suspicious]);

modules = as.numeric(substring(names(MEs[[set]]$eigengenes)[suspicious], 3))

printFlush(paste("Suspicious modules: ", paste(modules, collapse = ", ")));

cleanLabels[[set]] [is.finite(match(labels[[set]], modules))] = 0;

write.csv(suspInfo, file = spaste("CxBOnly-Female-", shortLabels[set], "-HighSNP-MEcorrelations.csv"),

quote = FALSE, row.names = FALSE);

}

We again replace the module labels by the cleaned labels and recalculate module eigengenes for further use.

# From here on only use cleaned labels:

labels = cleanLabels;

MEs = list();

ordMEs = list();

MEs0 = list(); # Leave grey eigengene out

for (set in 1:nSets)

{

MEs[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]]);

ordMEs[[set]] = orderMEs(MEs[[set]], greyName = "ME0");

MEs0[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]], excludeGrey = TRUE, grey = 0);

}

save(labels, MEs, MEs0, ordMEs, file = "Female-LA-labels-MEs-ordMEs-afterCleaning.RData");

#load(file = "Female-LA-labels-MEs-ordMEs-afterCleaning.RData");

colors = lapply(labels, labels2colors)

3.f Gene clustering dendrograms and module colors

Here we take a look at the gene clustering trees in both tissues. This allows us to visually verify that the module identification procedure led to modules that actually correspond to distinguishable branches of the gene clustering dendrogram. We also add color-coded indicators of gene significance for the individual traits. It is better to save the plots directly into a pdf (large file size, full resolution with zoom-in) or png (smaller file size but also lower resolution), but the plot can also be viewed on-screen.
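The code below sets correlations weaker than a baseline to zero before coloring them. The baseline is obtained by inverting the Fisher transformation: a one-sided p-value is converted to a normal quantile, divided by sqrt(n − 3), and mapped back to the correlation scale with tanh. The following worked example uses a hypothetical sample size of 100 (the actual sample sizes are taken from nSamples); the variable names are made up for illustration.

```r
# Smallest correlation that reaches a one-sided p-value of 0.01 for n = 100 samples
basePValToy = 0.01
nToy = 100                                   # hypothetical sample size
zToy = qnorm(1 - basePValToy) / sqrt(nToy - 3)
baseCorToy = tanh(zToy)                      # roughly 0.23

# Going the other way recovers the normal quantile from the correlation
zBack = atanh(baseCorToy) * sqrt(nToy - 3)
```

Because the cut-off depends on the sample size, the liver and adipose data can end up with slightly different baseline correlations for the same p-value.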

# Calculate gene significance for all traits

basePVal = 0.01;

traitGeneColors = list();

for (set in 1:nSets)

{

z = qnorm(1-basePVal)/sqrt(nSamples[set]-3);

baseCor = tanh(z);

cor = bicor(expr[[set]]$data, selTraits[[set]]$data, use = "p");

cor[abs(cor) < baseCor] = 0;

traitGeneColors[[set]] = numbers2colors(cor, signed = TRUE);

colnames(traitGeneColors[[set]]) = colnames(selTraits[[set]]$data);

}

# Plot the gene clustering trees and the gene significance

sizeGrWindow(12,9)

#pdf(file = "Plots/Female-LA-geneDendrograms-AllTraits-%02d.pdf", w = 30, h = 15, onefile = FALSE)

#png(file = "Plots/Female-LA-geneDendrograms-AllTraits-%02d.png", w = 1200, h = 600)

for (set in 1:nSets)

{

par(lheight=1.3);

plotDendroAndColors(mods[[set]]$dendrograms[[1]],

cbind(traitGeneColors[[set]], colors[[set]]),

c(renameTable[, 2], "modules"),

autoColorHeight = FALSE,

colorHeight = 0.6,

rowText = spaste(labels[[set]], ": ", colors[[set]]),

textPositions = nSelTraits + 1,

marAll = c(0, 8, 2, 3),

ylab = "", xlab = "", sub = "", dendroLabels = FALSE, hang = 0.03,

addGuide = TRUE, guideHang = 0.05, cex.rowText = 1.3, cex.colorLabels = 1.2,

rowWidths = c(rep(1, nSelTraits + 1), 15),

addTextGuide = TRUE,

main = spaste(shortLabels[set],

" gene dendrogram, association with traits and module colors"),

cex.main = 1.4);

}

dev.off()

We show the results (in the png version) in Figures 5 and 6. The dendrograms exhibit clear branches that are identified as modules. In the high-resolution pdf figures one can also see the smaller branches that correspond to smaller modules.

Figure 5: The upper panel shows the gene clustering tree (dendrogram) in liver. Each “leaf”, i.e., a short vertical line, corresponds to one gene (more precisely, a microarray probe). Branches of the dendrogram correspond to modules. Below the dendrogram, color rows annotated by clinical traits give the gene significance for (correlation with) the corresponding trait. Red color corresponds to positive gene significance (GS), and green color corresponds to negative GS. White color indicates no gene significance; color saturation corresponds to GS strength. The last color row indicates module assignment. Module colors are annotated below the module color row.

Figure 6: The upper panel shows the gene clustering tree (dendrogram) in adipose. Each “leaf”, i.e., a short vertical line, corresponds to one gene (more precisely, a microarray probe). Branches of the dendrogram correspond to modules. Below the dendrogram, color rows annotated by clinical traits give the gene significance for (correlation with) the corresponding trait. Red color corresponds to positive gene significance (GS), and green color corresponds to negative GS. White color indicates no gene significance; color saturation corresponds to GS strength. The last color row indicates module assignment. Module colors are annotated below the module color row.

4 GO enrichment analysis

Here we perform a functional enrichment analysis of the modules we found. There are two main methods one can use: either export lists of genes in each module and use external software, or use the function GOenrichmentAnalysis in WGCNA to calculate enrichment in GO terms. We first show how to export the gene lists for each module for use with external software, then perform the actual analysis using GOenrichmentAnalysis.

4.a Exporting lists of genes in each module

We (re-)load the gene annotation table and export the matching Locus Link IDs (also known as Entrez IDs). The output is a set of text files with names such as Liver-3.txt etc. in the subdirectory FEA of the current directory. If the directory FEA does not exist, please create it or modify the variable outFileBase below.

file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

annot = read.csv(file = file);

# Loop over tissues

for (set in 1:nSets)

{

# Base of the file names

outFileBase = spaste("FEA/", shortLabels[set], "-");

nMods = ncol(MEs[[set]]$eigengenes)

modNames = as.numeric(substring(names(MEs[[set]]$eigengenes), 3))

# loop over modules

for (mod in 1:nMods)

{

modGeneInd = (labels[[set]]==modNames[mod]);

modProbes = colnames(expr[[set]]$data)[modGeneInd];

annotInd = match(modProbes, annot$sequence);

annotInd = annotInd[!is.na(annotInd)];

modGeneIDs = annot$LocusLinkID[annotInd];

modGeneIDs = modGeneIDs[!is.na(modGeneIDs)];

write.table(data.frame(LLID = modGeneIDs),

file = paste(outFileBase, modNames[mod], '.txt', sep = ""), quote = F, row.names = F,

col.names = F);

}

allAnnotInd = match(colnames(expr[[set]]$data), annot$sequence);

allAnnotInd = allAnnotInd[!is.na(allAnnotInd)];

GeneIDs = annot$LocusLinkID[allAnnotInd];

GeneIDs = GeneIDs[!is.na(GeneIDs)];

# Also write out a file of all genes in the network, useful as a background list in the analysis.

write.table(data.frame(LLID = GeneIDs),

file = paste(outFileBase, 'all.txt', sep = ""), quote = F, row.names = F,

col.names = F);

}

4.b GO enrichment analysis in WGCNA

Here we perform the GO enrichment analysis directly in WGCNA. This is usually much more convenient than uploading each module separately to a separate application, but is restricted to GO. This calculation will take several minutes.
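To make the notion of an enrichment p-value concrete, the sketch below computes a one-sided hypergeometric (Fisher exact) p-value for the overlap between a module and a GO term, followed by a Bonferroni correction. It is an illustration under stated assumptions and does not restate the internals of GOenrichmentAnalysis; the function name `enrichmentP`, all counts and the number of tests are hypothetical.

```r
# Probability of seeing at least nOverlap term genes in a module of nModule genes
# drawn from a background of nBackground genes, of which nTerm carry the term.
enrichmentP = function(nBackground, nTerm, nModule, nOverlap)
{
  phyper(nOverlap - 1, nTerm, nBackground - nTerm, nModule, lower.tail = FALSE)
}

# Hypothetical counts: 10 overlapping genes expected by chance, 40 observed
pRaw = enrichmentP(nBackground = 10000, nTerm = 500, nModule = 200, nOverlap = 40)

# Bonferroni correction for a hypothetical number of (module, term) tests
nTests = 3000
pBonf = min(1, pRaw * nTests)
```

The background used in such a test matters: it should consist of the genes that entered the network analysis, which is why the export step above also writes a file with all genes.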

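# (re-)read gene annotation

file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

annot = read.csv(file = file);

# Calculate enrichment information

bt = list();

for (set in 1:nSets)

{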

expr2annot = match(colnames(expr[[set]]$data), annot$sequence);

LLID = annot$LocusLinkID[expr2annot];

table(is.na(LLID))

fin = !is.na(LLID);

finLLID = LLID[fin];

finLabels = labels[[set]][fin];

system.time ( {

bt[[set]] = GOenrichmentAnalysis(finLabels, finLLID, organism = "mouse",

nBest = 20, nBiggest = 0, includeOffspring = TRUE);

} );

}

# Save the results for future use

save(bt, file = "Female-LA-GOEnrichemnt-trimmedAndCleanedLabels.RData");

Next we re-format the full information into a more manageable form and print it. To make the printout readable, please make the R console at least 100 characters wide. The table is also saved into a tab-delimited text file that can be opened using MS Excel or OpenOffice Calc.

# If necessary, load the results

load(file = "Female-LA-GOEnrichemnt-trimmedAndCleanedLabels.RData");

# Loop over tissues

for (set in 1:nSets)

{

res = bt[[set]]$bestPTerms[[4]]$enrichment;

# Write an excel sheet containing the full information

write.table(res, file = spaste("CxBOnly-Female-", shortLabels[[set]], "-GOenrichment.txt"),

row.names = FALSE, sep = "\t", quote = FALSE);

# Print a "narrower" version

res2 = res[, c(1, 2, 4, 6, 8, 12, 13)];

res2[, c(4, 5)] = signif(apply(res2[, c(4,5)], 2, as.numeric), 2)

rownames(res2) = NULL

names(res2) = c("Mod", "Size", "Rnk", "p.Bonf", "fracModSz", "ont", "termName");

terms = res2$termName;

sterms = substring(terms, 1, 60);

res2$termName = sterms;

options(width = 100);

modules = sort(as.numeric(unique(res2$Mod)));

for (m in modules)

{

printFlush(spaste("=========== Module:", m, "; module size: ", sum(labels[[set]]==m)))

print(res2[res2$Mod==m, -c(1,2)]);

}

}

The result is a long printout of Bonferroni-corrected enrichment p-values. For example, for liver module 6 we get

=========== Module:6; module size: 666
    Rnk  p.Bonf fracModSz ont termName
121   1 2.3e-31     0.460  MF catalytic activity
122   2 6.7e-31     0.210  CC mitochondrion
123   3 1.1e-28     0.150  MF oxidoreductase activity
124   4 6.3e-26     0.140  BP oxidation reduction
125   5 1.6e-20     0.500  CC cytoplasm
126   6 9.4e-20     0.340  CC cytoplasmic part
127   7 2.3e-10     0.099  BP lipid metabolic process
128   8 1.3e-09     0.480  BP metabolic process
129   9 1.4e-09     0.077  BP organic acid metabolic process
130  10 1.4e-09     0.077  BP carboxylic acid metabolic process
131  11 1.8e-09     0.037  MF oxidoreductase activity, acting on CH-OH group of donors
132  12 2.4e-08     0.035  MF electron carrier activity
133  13 7.7e-08     0.580  CC intracellular part
134  14 2.0e-07     0.032  MF oxidoreductase activity, acting on the CH-OH group of donors
135  15 4.4e-07     0.590  CC intracellular
136  16 8.8e-07     0.033  MF tetrapyrrole binding
137  17 2.0e-06     0.032  MF heme binding
138  18 2.4e-06     0.038  BP steroid metabolic process
139  19 2.6e-06     0.450  CC intracellular membrane-bounded organelle
140  20 2.9e-06     0.450  CC membrane-bounded organelle

Of note is also the adipose module 6 (809 probes) that is extremely highly enriched in the term mitochondrion:

=========== Module:6; module size: 809
    Rnk   p.Bonf fracModSz ont termName
121   1 2.5e-234     0.500  CC mitochondrion
122   2 9.2e-142     0.600  CC cytoplasmic part
123   3 2.7e-109     0.210  CC mitochondrial part
124   4 3.1e-100     0.190  CC mitochondrial envelope
125   5  2.8e-98     0.190  CC mitochondrial membrane
126   6  9.8e-98     0.170  CC mitochondrial inner membrane
127   7  1.9e-92     0.680  CC cytoplasm
128   8  6.9e-71     0.140  BP generation of precursor metabolites and energy
129   9  5.4e-67     0.650  CC intracellular membrane-bounded organelle
130  10  7.5e-67     0.650  CC membrane-bounded organelle
131  11  3.3e-56     0.180  BP oxidation reduction
132  12  6.4e-55     0.067  CC respiratory chain
133  13  5.7e-50     0.080  BP electron transport chain
134  14  7.7e-49     0.720  CC intracellular part
135  15  1.9e-48     0.170  MF oxidoreductase activity
136  16  2.1e-46     0.490  MF catalytic activity
137  17  3.4e-46     0.730  CC intracellular
138  18  7.4e-28     0.043  BP cellular respiration
139  19  1.6e-27     0.550  BP metabolic process
140  20  3.8e-20     0.063  MF cofactor binding

We now create text labels for the modules that reflect the name of the term with the highest enrichment. We only create a GO label if the corresponding Bonferroni-corrected p-value is below 10^-4.

# Create GO labels for modules

goLabels = list();

goModules = list();

goPvalue = list();

for (set in 1:nSets)

{

goAnn = bt[[set]]$bestPTerms[[4]]$enrichment

nModules = length(unique(goAnn$module));

best = tapply(c(1:nrow(goAnn)), goAnn$module, min);

goModules[[set]] = goAnn$module[best][-1];

goLabels[[set]] = spaste(goModules[[set]], ": ", goAnn$termName[best][-1]);

goPvalue[[set]] = goAnn$BonferoniP[best][-1];

goLabels[[set]] [goPvalue[[set]] > 1e-4] = goModules[[set]] [goPvalue[[set]] > 1e-4];

}

collectGarbage();
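The labeling rule can be tried out on a toy table. The sketch below uses a small, entirely hypothetical enrichment table (the column names mirror those used in the code above, including the spelling BonferoniP) and assumes, as the code above does, that the rows of each module are sorted from the most to the least significant term.

```r
# Hypothetical enrichment table; rows within each module are sorted by p-value
goAnn = data.frame(
  module = c("0", "0", "1", "1", "2", "2"),
  termName = c("cytoplasm", "nucleus", "receptor activity", "membrane",
               "ribosome", "translation"),
  BonferoniP = c(0.5, 0.9, 1e-12, 1e-8, 3e-3, 0.2),
  stringsAsFactors = FALSE)

# Index of the first (best) row of each module
best = tapply(seq_len(nrow(goAnn)), goAnn$module, min)
# Drop the improper module 0
best = best[names(best) != "0"]

goModule = goAnn$module[best]
goLabel = paste0(goModule, ": ", goAnn$termName[best])
# Keep only the module number when the best term is not significant enough
weak = goAnn$BonferoniP[best] > 1e-4
goLabel[weak] = goModule[weak]
print(goLabel)
```

In this toy example module 1 receives a GO label while module 2, whose best term has a Bonferroni-corrected p-value of 0.003, keeps only its number.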


The labels are as follows:

> goLabels
[[1]]
 [1] "1: receptor activity"                      "10: proteasome complex"
 [3] "11"                                        "12: G-protein coupled receptor activity"
 [5] "13"                                        "14: intracellular part"
 [7] "15"                                        "16"
 [9] "17"                                        "18: nucleus"
[11] "19: extracellular matrix"                  "2: intracellular"
[13] "20: mitochondrion"                         "21"
[15] "22: ribosome"                              "23: nucleosome assembly"
[17] "24: cell adhesion"                         "25"
[19] "26: cellular amino acid metabolic process" "27: mitochondrial part"
[21] "29"                                        "3"
[23] "33: endoplasmic reticulum"                 "38"
[25] "4: G-protein coupled receptor activity"    "45"
[27] "48"                                        "5: intracellular"
[29] "50"                                        "58: cell cycle"
[31] "6: catalytic activity"                     "64"
[33] "65: serine-type peptidase activity"        "68"
[35] "7: immune response"                        "70"
[37] "71: MHC class I protein complex"           "73: nucleosome assembly"
[39] "76: hemoglobin complex"                    "8: receptor activity"
[41] "81"                                        "9"

[[2]]
 [1] "1: G-protein coupled receptor activity" "10: mitochondrion"
 [3] "11: extracellular matrix"               "12"
 [5] "13"                                     "14: ribosome"
 [7] "15: cell cycle"                         "16: membrane fraction"
 [9] "2: nucleus"                             "3: G-protein coupled receptor activity"
[11] "4: lymphocyte activation"               "5: multicellular organismal development"
[13] "6: mitochondrion"                       "7: membrane"
[15] "8"                                      "9"

5 Modules related to physiological traits

Here we identify modules related to physiological traits. We use the robust biweight midcorrelation to measure the association between each module eigengene and each trait. We consider the association significant if the absolute value of the correlation is above 0.35, corresponding to a p-value of roughly 10^-5. Taking into account the number of modules (42 in liver) and the number of traits (19), this translates roughly to a Bonferroni-corrected p-value threshold of 10^-2. The following rather long section of code generates a list containing information about the modules associated with each trait in each tissue.
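The arithmetic behind these thresholds can be checked in a few lines of base R. This is a sketch under stated assumptions: the sample size nSamplesAssumed is hypothetical (the actual number of samples is not stated in this section), corPvalue is a helper defined here for illustration, and the p-value uses the usual Student t approximation for a correlation coefficient.

```r
# Student t p-value (two-sided) for a correlation r observed in n samples
corPvalue = function(r, n) {
  tStat = r * sqrt((n - 2) / (1 - r^2))
  2 * pt(-abs(tStat), df = n - 2)
}

nSamplesAssumed = 140   # hypothetical sample size, for illustration only
pAtThreshold = corPvalue(0.35, nSamplesAssumed)

# Bonferroni correction over all module-trait pairs in liver
nTests = 42 * 19
pBonf = min(1, 1e-5 * nTests)

print(signif(pAtThreshold, 2))
print(nTests)
print(pBonf)
```

With 42 modules and 19 traits there are 798 tests, so a nominal p-value of 10^-5 corresponds to a Bonferroni-corrected p-value of about 0.008, that is, roughly 10^-2.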

# Set up lists to hold the information

bestModules = list();


exprSize = checkSets(expr)


# Correlation thresholds

thresholds = c(0.35, 0.35);

nSelTraits = checkSets(selTraits)$nGenes;

# Loop over tissues

for (set in 1:nSets)

{

bestModules[[set]] = list();

traitGeneColors[[set]] = list();

modSizes = table(labels[[set]])[match(substring(names(MEs[[set]]$eigengenes), 3),

names(table(labels[[set]])))]

modColors = rep("grey", length(modSizes))

modNumbers = as.numeric(substring(names(MEs[[set]]$eigengenes), 3))

modColors[modNumbers!=0] = standardColors()[modNumbers[modNumbers!=0]]

# Loop over traits

for (t in 1:nSelTraits)

{

bestModules[[set]][[t]] = list();

bestModules[[set]][[t]]$trait = colnames(selTraits[[set]]$data)[t];

x = bicorAndPvalue(MEs[[set]]$eigengenes, selTraits[[set]]$data[, t])

cors = x$bicor;

pvals = x$p;

# Put the p-values into a single data frame

significance = data.frame(modSizes, cors, pvals, as.numeric(MEs[[set]]$varExplained),

modColors);

names(significance) = c("nGenes", spaste("r.", shortLabels[set]),

spaste("p.", shortLabels[set]), spaste("PVE.", shortLabels[set]), "Color");

rownames(significance) = names(MEs[[set]]$eigengenes);

order = order(significance[, 3]);

significant = significance[, 3] < 0.001;

bestModules[[set]][[t]]$significance = significance[order, ];

printSignif = significance;

printSignif[, c(2:4)] = signif(significance[, c(2:4)], 3);

bestModules[[set]][[t]]$printSignif = printSignif[order, ];

bestModules[[set]][[t]]$bestSignif = printSignif[order, ][

(abs(printSignif[order, 2]) > thresholds[set]), ]

printFlush("\n==============================================================================\n");

printFlush("Significance for", bestModules[[set]][[t]]$trait, ":");

options(width = 100);

print(bestModules[[set]][[t]]$bestSignif);

moduleList1 = as.numeric(substring(rownames(bestModules[[set]][[t]]$bestSignif), 3))

bestModules[[set]][[t]]$bestModules = moduleList1;

bestModules[[set]][[t]]$nGenesInBestModules = sum(bestModules[[set]][[t]]$bestSignif$nGenes);

}

}

# Save the results for future use

save(bestModules, file = "Female-LA-bestModules.RData");

For use in subsequent analysis, we also form a separate list of modules associated with each trait.

keepModules = list();

nKeep = rep(0, nSets);



nGenes = exprSize$nGenes;

basePVal = 0.01;

bestLabels = list();

bestColors = list();

for (set in 1:nSets)

{

z = qnorm(1-basePVal)/sqrt(nSamples[set]-3);

baseCor = tanh(z);

cor = bicor(expr[[set]]$data, selTraits[[set]]$data, use = "p");

cor[abs(cor) < baseCor] = 0;

traitGeneColors[[set]] = numbers2colors(cor, signed = TRUE);

colnames(traitGeneColors[[set]]) = colnames(selTraits[[set]]$data);


keepModules[[set]] = vector();

bestLabels[[set]] = matrix(0, nGenes, nSelTraits);

for (t in 1:nSelTraits)

{

keepModules[[set]] = c(keepModules[[set]], bestModules[[set]][[t]]$bestModules);

keep = labels[[set]] %in% bestModules[[set]][[t]]$bestModules

bestLabels[[set]][keep, t] = labels[[set]][ keep ];

}

keepModules[[set]] = sort(unique(keepModules[[set]]));

nKeep[set] = length(keepModules[[set]]);

bestColors[[set]] = labels2colors(bestLabels[[set]]);

}

hdlInd = match("e_hdl_mgdl", colnames(selTraits[[1]]$data));
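The correlation cutoff baseCor in the code above is obtained from basePVal through the Fisher z-transformation: atanh(r) is approximately normal with standard deviation 1/sqrt(n - 3), so the correlation whose one-sided p-value equals basePVal is tanh(qnorm(1 - basePVal)/sqrt(n - 3)). The sketch below repeats that calculation with a hypothetical sample size (nSamplesAssumed is an assumption, not a value from this analysis) and verifies the round trip.

```r
basePVal = 0.01
nSamplesAssumed = 140   # hypothetical sample size, for illustration only

# Correlation threshold corresponding to a one-sided p-value of basePVal
z = qnorm(1 - basePVal) / sqrt(nSamplesAssumed - 3)
baseCor = tanh(z)

# Round trip: one-sided p-value of baseCor under the Fisher z approximation
pBack = pnorm(atanh(baseCor) * sqrt(nSamplesAssumed - 3), lower.tail = FALSE)

print(round(baseCor, 3))
print(pBack)
```

Because the cutoff is applied to the absolute value of the correlation, the corresponding two-sided p-value is 2*basePVal.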

Which modules are associated with HDL? In liver we find the following:

> bestModules[[1]][[hdlInd]]$bestSignif
     nGenes r.Liver  p.Liver PVE.Liver       Color
ME6     666   0.564 3.14e-13     0.368         red
ME64     40   0.470 4.02e-09     0.484    skyblue2
ME11    379  -0.455 1.41e-08     0.333 greenyellow
ME20    161   0.419 2.28e-07     0.378   royalblue
ME10    409  -0.419 2.32e-07     0.338      purple
ME16    231   0.398 1.04e-06     0.321   lightcyan
ME21    145  -0.396 1.14e-06     0.368     darkred
ME18    201  -0.385 2.39e-06     0.350  lightgreen

In adipose, we only find one module:

> bestModules[[2]][[hdlInd]]$bestSignif
    nGenes r.Adipose p.Adipose PVE.Adipose Color
ME7    791     0.463  4.44e-10       0.424 black

5.a Module-trait relationships for all modules that relate significantly to a trait

Here we produce color-coded tables of module significance (defined as the robust correlation of the module eigengene and the trait) between modules and traits. We restrict the modules to those that relate significantly to at least one trait. We first calculate matrices holding the module significances and the corresponding p-values. We use the robust biweight midcorrelation to quantify module significance.
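To illustrate why the biweight midcorrelation is called robust, the following base R function is a minimal sketch of its textbook definition: observations are centered on the median and down-weighted according to their distance from it in units of 9 times the median absolute deviation. The function bicorSketch is written here for illustration only; it assumes a non-zero median absolute deviation and omits the options of the WGCNA function bicor used in this document.

```r
bicorSketch = function(x, y) {
  # Median-centered, biweight-weighted and normalized version of a vector
  weighted = function(v) {
    med = median(v)
    madv = mad(v, constant = 1)          # raw median absolute deviation
    u = (v - med) / (9 * madv)
    w = ifelse(abs(u) < 1, (1 - u^2)^2, 0)
    a = (v - med) * w
    a / sqrt(sum(a^2))
  }
  sum(weighted(x) * weighted(y))
}

# Toy data: a perfect linear relationship spoiled by a single outlier
x = 1:20
yOut = x
yOut[20] = -200

print(cor(x, yOut))          # Pearson correlation is dominated by the outlier
print(bicorSketch(x, yOut))  # the outlier receives weight zero
```

In this toy example the single outlier turns the Pearson correlation negative, whereas the biweight midcorrelation remains high.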

nTraits = dim(selTraits[[1]]$data)[2];

ordTraits = consensusOrderMEs(selTraits, greyLast = FALSE);

TraitSignif = vector(mode="list", length = nSets);

TraitCor = vector(mode="list", length = nSets);

TraitLabels = colnames(ordTraits[[1]]$data);

newTraitLabels = renameTable[ match(TraitLabels, renameTable[, 1]), 2];

MELabels = list();

for (set in 1:nSets)

{

MELabels[[set]] = colnames(ordMEs[[set]]$eigengenes);

tmp = bicorAndPvalue(ordMEs[[set]]$eigengenes, ordTraits[[set]]$data)

TraitSignif[[set]] = tmp$p

TraitCor[[set]] = tmp$bicor

}

minp = 1; maxp = 0;

for (set in 1:nSets)

{

minp = min(minp, TraitSignif[[set]]);

maxp = max(maxp, TraitSignif[[set]]);

}

if (minp

[Figure 7 image: heatmap titled "Female Liver module-trait significance". Rows (module eigengenes): ME16; ME20: mitochondrion; ME64; ME6: catalytic activity; ME25; ME27: mitochondrial part; ME73: nucleosome assembly; ME19: extracellular matrix; ME7: immune response; ME58: cell cycle; ME70; ME21; ME26: cellular amino acid metabolic process; ME45; ME13; ME18: nucleus; ME10: proteasome complex; ME11; ME9; ME1: receptor activity; ME17. Columns: the 19 traits (axis labels: fluid, FFA, trigly, tot.chol, unest.chol, muscle, BMD, length, efat, glucose, insulin, HDL, leptin, fat, fat.frac, rfat, vfat, weight, sfat). Each cell shows the correlation and its p-value; the color scale runs from -0.5 to 0.5.]

Figure 7: Module significance of selected liver modules for traits measured for this cross. Numbers in the table indicate the robust correlations and the corresponding p-values. The table is colored by correlation, with red representing positive correlation and green negative correlation.

[Figure 8 image: heatmap titled "Female Adipose module-trait significance". Rows (module eigengenes): ME7: membrane; ME9; ME11: extracellular matrix; ME4: lymphocyte activation; ME8; ME13; ME16: membrane fraction. Columns: the 19 traits (axis labels: fluid, FFA, trigly, tot.chol, unest.chol, muscle, BMD, length, efat, glucose, insulin, HDL, leptin, fat, fat.frac, rfat, vfat, weight, sfat). Each cell shows the correlation and its p-value; the color scale runs from -0.5 to 0.5.]

Figure 8: Module significance of selected adipose modules for traits measured for this cross. Numbers in the table indicate the robust correlations and the corresponding p-values. The table is colored by correlation, with red representing positive correlation and green negative correlation.

5.b Network plots of all module eigengenes and traits

We now produce plots of networks composed of module eigengenes and traits in each tissue. The plots are too big to display comfortably on screen, but can be viewed using a PDF viewer, which will usually provide a zoom function. The eigengene network plot contains two panels: one with a dendrogram of eigengenes and traits, and one with the corresponding color-coded heatmap and correlation/p-value table.

widths = c(20, 13)

for (set in 1:nSets)

{

mets = list(a = list(data = cbind(MEs[[set]]$eigengenes, selTraits[[set]]$data)));

colnames(mets$a$data) = c(colnames(MEs[[set]]$eigengenes), renameTable[, 2]);

omets = consensusOrderMEs(mets);

pdf(file = spaste("Plots/Female-", shortLabels[set], "-ME-selTraitNetworkHeatmaps.pdf"),

width = widths[set], height = 2*widths[set]);

plotEigengeneNetworks(omets, shortLabels[set], marDendro = c(0,2,2,2), zlimPreservation = c(0,1),

marHeatmap = c(5,5,2,2), setMargins = TRUE,

plotAdjacency = FALSE,

printAdjacency = TRUE, cex.adjacency = 0.5)

dev.off();

}

6 Output of module membership, eigengenes, and eigengene correlations

In this section we output much of the network information into text and csv files that can be viewed in MS Excel, OpenOffice Calc and other similar spreadsheet software. We begin with lists of samples used in network analysis and “expressions” of module eigengenes.

# Samples that are used for network analysis

for (set in 1:nSets)

{

samples = rownames(expr[[set]]$data);

write.table(data.frame(samples), col.names = FALSE, row.names = FALSE, quote = FALSE,

file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-networkSamples.txt"));

}

# Module eigengenes

for (set in 1:nSets)

{

write.table(as.data.frame(cbind(Mice_id = rownames(expr[[set]]$data), MEs[[set]]$eigengenes)),

col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,

file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleEigengenes.csv"))

}

Next we output a table of module-trait associations.

# Module-trait relationships

moduleTraitRels = list();

for (set in 1:nSets)

{

moduleTraitRels[[set]] = bicorAndPvalue(MEs[[set]]$eigengenes, selTraits[[set]]$data)

outMat = rbind(moduleTraitRels[[set]]$bicor, moduleTraitRels[[set]]$p);

dim(outMat) = c(ncol(MEs[[set]]$eigengenes), 2*nSelTraits)

nameMat = matrix(cbind(spaste("bicor.", colnames(selTraits[[set]]$data)),

spaste("p.", colnames(selTraits[[set]]$data))),

2, nSelTraits, byrow = TRUE)

colnames(outMat) = as.vector(nameMat);


write.table(as.data.frame(cbind(Eigengene = colnames(MEs[[set]]$eigengenes), outMat)),


file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleTraitBicor.csv"))

}
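The code above interleaves the correlation and p-value columns by stacking the two matrices with rbind and then resetting the dimensions, and it builds the matching column names with a two-row matrix filled by row. The toy example below, with made-up numbers and trait names, shows the effect of this idiom.

```r
# Toy "correlation" and "p-value" matrices: 3 eigengenes x 2 traits
bicorMat = matrix(1:6, 3, 2,
                  dimnames = list(paste0("ME", 1:3), c("weight", "HDL")))
pMat = bicorMat / 10

# Stack the matrices, then reshape so that columns alternate: bicor, p, bicor, p
outMat = rbind(bicorMat, pMat)
dim(outMat) = c(nrow(bicorMat), 2 * ncol(bicorMat))

# Column names in the same alternating order
nameMat = matrix(cbind(paste0("bicor.", colnames(bicorMat)),
                       paste0("p.", colnames(bicorMat))),
                 2, ncol(bicorMat), byrow = TRUE)
colnames(outMat) = as.vector(nameMat)

print(outMat)
```

Each trait thus contributes a pair of adjacent columns, which keeps the correlation next to its p-value in the exported table.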

Lastly, we output a large table of fuzzy module membership and gene significance for all traits. The module membership MM of a gene (probe), also known as module eigengene-based connectivity kME, is given by the correlation of the gene expression profile and the module eigengene. Similarly, gene significance for a trait is given by the correlation of the gene expression profile with the numeric trait.





for (set in 1:nSets)

{

# Calculate module membership, a.k.a kME

KMEall = bicorAndPvalue(expr[[set]]$data, MEs[[set]]$eigengenes);

KMEmod = rep(NA, nGenes)

KMEmodP = rep(NA, nGenes)

modLevels = sort(unique(labels[[set]]));

nMods = length(modLevels);

for (mod in 1:nMods)

{

inMod = labels[[set]]==modLevels[mod];

# This assumes MEs[[set]]$eigengenes are sorted the same way as modLevels

KMEmod[inMod] = KMEall$bicor[inMod, mod];

KMEmodP[inMod] = KMEall$p[inMod, mod];

}

kmeMat = rbind(KMEall$bicor, KMEall$p);

dim(kmeMat) = c(nGenes, 2*nMods);

nameMat = matrix(cbind(spaste("k", colnames(MEs[[set]]$eigengenes)),

spaste("p.k", colnames(MEs[[set]]$eigengenes))),

2, nMods, byrow = TRUE)

colnames(kmeMat) = as.vector(nameMat);

# Connect probe names to gene names

genes = colnames(expr[[set]]$data)

expr2annot = match(genes, annot$sequence);

annotInfo = annot[expr2annot, c(4,5,6,7,8,9)];

# Calculate gene significance

GS = bicorAndPvalue(expr[[set]]$data, selTraits[[set]]$data);

GSmat = rbind(GS$bicor, GS$p);

dim(GSmat) = c(nGenes, 2*nSelTraits);

nameMat = matrix(cbind(spaste("GS.", colnames(selTraits[[set]]$data)),

spaste("pGS", colnames(selTraits[[set]]$data))),

2, nSelTraits, byrow = TRUE)

colnames(GSmat) = as.vector(nameMat);

# Put it all together

info = cbind(annotInfo,

moduleLabel = labels[[set]],

moduleColor = labels2colors(labels[[set]]),

KME.labelModule = KMEmod,

pKME.labelModule = KMEmodP,

GSmat,

kmeMat);

# Save the big table into a text csv file

write.table(info,


file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleMembership.csv"))


collectGarbage();

}
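The definition of module membership can be illustrated on synthetic data. The sketch below makes several simplifying assumptions: the four "genes" are artificial, the eigengene is taken as the first singular vector of the scaled expression matrix with its sign aligned to the average expression, and the ordinary Pearson correlation is used in place of the biweight midcorrelation used in this document.

```r
# Synthetic module: 30 samples, 4 genes sharing a common signal
tt = 1:30
signal = sin(2 * pi * tt / 30)
expr1 = cbind(g1 = signal,
              g2 = signal + 0.2 * cos(2 * pi * tt / 30),
              g3 = signal + 0.2 * sin(4 * pi * tt / 30),
              g4 = signal - 0.2 * cos(2 * pi * tt / 30))

# Eigengene: first left singular vector of the scaled data (samples in rows)
scaled = scale(expr1)
ME = svd(scaled)$u[, 1]
# Align the sign of the eigengene with the average expression
if (cor(ME, rowMeans(scaled)) < 0) ME = -ME

# Module membership kME: correlation of each gene with the eigengene
kME = cor(expr1, ME)[, 1]
print(round(kME, 2))
```

Genes with kME close to 1 are the intramodular hub genes referred to later in this document.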

7 Overlap of liver and adipose modules

We now produce a color-coded overlap table of liver and adipose modules.
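The idea of an overlap table can be sketched in base R before applying it to the real labels: for every pair of modules we count the shared genes and calculate a one-sided Fisher exact p-value. The label vectors below are made up, and the loop is an illustration of the principle rather than the implementation of overlapTable.

```r
# Two hypothetical module assignments of the same 12 genes
labelsA = c(1, 1, 1, 1, 2, 2, 2, 0, 0, 0, 0, 0)
labelsB = c(1, 1, 1, 2, 2, 2, 2, 0, 0, 0, 1, 0)

# Number of genes shared by each pair of modules
counts = table(labelsA, labelsB)

# One-sided Fisher exact test for each pair of modules
pTable = matrix(NA, nrow(counts), ncol(counts), dimnames = dimnames(counts))
for (i in rownames(counts)) for (j in colnames(counts)) {
  pTable[i, j] = fisher.test(labelsA == i, labelsB == j,
                             alternative = "greater")$p.value
}

print(counts)
print(signif(pTable, 2))
```

In this toy example modules 1 of the two assignments share 3 of their 4 genes, which gives a p-value of 33/495, or about 0.067.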

# Call the overlapTable function to calculate the overlaps

overlap = overlapTable(labels[[1]], labels[[2]]);

# Prepare axis labels for the table plot

modSizes = lapply(labels, table);

xLabels = spaste("A.", sort(unique(labels[[2]])), " (", modSizes[[2]], ")");

yLabels = spaste("L.", sort(unique(labels[[1]])), " (", modSizes[[1]], ")");

# Content of the table

textMat = spaste(overlap$countTable, "|", signif(overlap$pTable, 1));

mat = overlap$pTable;

mat[mat

[Figure 9 image: table plot titled "Overlap of adipose and liver modules"; the color scale (-log10 p) runs from 0 to 60. The cell contents, given as count|p-value, are reproduced below. Rows are liver modules and columns are adipose modules, with module sizes in parentheses.]

Columns: A.0 (6075) A.1 (5400) A.2 (3031) A.3 (1804) A.4 (1559) A.5 (1151) A.6 (809) A.7 (791) A.8 (442) A.9 (519) A.10 (490) A.11 (335) A.12 (358) A.13 (311) A.14 (246) A.15 (239) A.16 (63)

L.0 (9910): 3439|1e-157 1679|1 1187|1 613|1 588|1 651|1e-24 323|0.9 369|0.004 97|1 211|0.7 217|0.2 152|0.1 86|1 107|1 65|1 107|0.2 19|1
L.1 (2821): 287|1 2018|0 26|1 64|1 45|1 56|1 34|1 35|1 119|4e-18 7|1 8|1 7|1 65|4e-04 34|0.7 7|1 3|1 6|0.8
L.2 (1262): 156|1 47|1 729|0 22|1 41|1 28|1 31|1 20|1 1|1 94|7e-26 59|5e-09 16|0.7 1|1 2|1 7|1 8|0.9 0|1
L.3 (1283): 159|1 107|1 21|1 781|0 46|1 33|1 18|1 20|1 37|0.006 6|1 5|1 3|1 11|1 20|0.2 9|0.9 4|1 3|0.7
L.4 (909): 306|3e-08 328|2e-20 122|0.3 29|1 16|1 27|1 5|1 21|1 13|0.9 6|1 10|1 4|1 10|0.9 2|1 5|1 4|1 1|0.9
L.5 (777): 144|1 30|1 375|2e-133 18|1 42|0.9 22|1 35|0.06 21|0.9 2|1 30|0.002 33|9e-05 8|0.9 1|1 2|1 6|0.8 8|0.5 0|1
L.6 (666): 139|1 88|1 85|0.5 21|1 26|1 34|0.4 77|6e-21 68|3e-16 8|0.9 33|1e-05 49|2e-14 17|0.01 3|1 4|1 5|0.8 8|0.4 1|0.8
L.7 (637): 146|1 35|1 41|1 8|1 258|1e-138 27|0.8 10|1 43|1e-05 1|1 19|0.1 3|1 21|3e-04 2|1 1|1 6|0.7 15|0.002 1|0.8
L.8 (522): 136|0.4 242|6e-33 33|1 15|1 13|1 18|1 8|1 3|1 11|0.4 1|1 8|0.9 3|1 18|0.001 6|0.7 2|1 5|0.6 0|1
L.9 (435): 62|1 145|3e-07 9|1 26|0.9 14|1 11|1 6|1 10|0.9 33|1e-11 0|1 1|1 0|1 23|3e-07 89|4e-81 2|0.9 2|0.9 2|0.3
L.10 (409): 116|0.1 29|1 61|0.1 18|1 46|3e-04 32|0.006 39|1e-08 15|0.4 4|0.9 6|0.9 14|0.05 7|0.4 3|0.9 1|1 10|0.01 6|0.2 2|0.3
L.11 (379): 103|0.3 74|0.9 26|1 8|1 46|5e-05 20|0.4 28|1e-04 22|0.009 7|0.6 10|0.3 15|0.01 6|0.5 2|1 5|0.6 5|0.4 2|0.9 0|1
L.12 (343): 81|0.8 109|8e-05 51|0.1 25|0.6 18|0.9 13|0.9 4|1 6|1 9|0.2 5|0.9 6|0.7 5|0.5 3|0.9 3|0.8 1|1 2|0.9 2|0.2
L.13 (323): 99|0.03 29|1 7|1 43|3e-04 107|1e-46 14|0.7 2|1 2|1 8|0.3 2|1 0|1 1|1 3|0.9 3|0.8 1|1 2|0.8 0|1
L.14 (268): 66|0.7 16|1 74|7e-11 6|1 32|9e-04 9|0.9 11|0.3 8|0.7 0|1 18|3e-05 15|5e-04 5|0.3 1|1 2|0.9 4|0.3 1|0.9 0|1
L.15 (257): 22|1 41|1 2|1 7|1 6|1 7|1 1|1 2|1 48|9e-34 1|1 0|1 1|1 107|3e-129 8|0.02 2|0.8 0|1 2|0.1
L.16 (231): 59|0.5 19|1 19|1 3|1 35|4e-06 30|1e-06 12|0.1 21|4e-05 0|1 7|0.2 11|0.009 7|0.05 1|1 0|1 1|0.9 5|0.09 1|0.5
L.17 (190): 19|1 90|9e-14 4|1 33|7e-06 9|0.9 4|1 5|0.8 1|1 14|1e-05 0|1 0|1 1|0.9 4|0.3 4|0.2 2|0.6 0|1 0|1
L.18 (201): 74|3e-04 17|1 6|1 7|1 56|7e-21 10|0.5 6|0.7 4|0.9 5|0.3 2|0.9 1|1 2|0.8 2|0.8 1|0.9 5|0.06 1|0.9 2|0.1
L.19 (193): 32|1 9|1 15|1 1|1 24|0.002 24|2e-05 2|1 27|3e-10 0|1 17|1e-06 3|0.8 31|1e-23 0|1 0|1 2|0.6 6|0.01 0|1
L.20 (161): 40|0.6 24|1 18|0.8 0|1 10|0.6 7|0.7 22|3e-08 12|0.008 3|0.6 9|0.009 7|0.05 3|0.4 3|0.4 2|0.6 0|1 1|0.8 0|1
L.21 (145): 49|0.02 11|1 19|0.5 3|1 13|0.2 13|0.03 12|0.004 6|0.4 0|1 2|0.8 4|0.4 1|0.9 4|0.2 6|0.01 0|1 2|0.4 0|1
L.22 (135): 9|1 22|1 4|1 6|1 1|1 1|1 0|1 1|1 0|1 0|1 0|1 0|1 0|1 0|1 91|1e-153 0|1 0|1
L.23 (128): 29|0.8 9|1 34|2e-05 1|1 4|1 6|0.6 9|0.03 7|0.1 0|1 15|1e-07 6|0.05 3|0.3 0|1 0|1 0|1 5|0.01 0|1
L.24 (126): 26|0.9 3|1 4|1 5|1 9|0.5 21|7e-07 1|1 16|5e-06 1|0.9 2|0.8 1|0.9 13|3e-08 0|1 0|1 1|0.7 2|0.4 21|7e-33
L.25 (92): 33|0.02 17|0.9 8|0.9 2|1 2|1 4|0.7 10|0.001 7|0.03 3|0.2 2|0.6 2|0.6 0|1 0|1 1|0.7 0|1 1|0.6 0|1
L.26 (87): 24|0.4 22|0.3 6|1 4|0.9 8|0.2 3|0.8 10|8e-04 2|0.8 2|0.5 2|0.6 2|0.5 0|1 1|0.7 1|0.7 0|1 0|1 0|1
L.27 (80): 9|1 8|1 6|1 2|1 0|1 1|1 40|3e-37 5|0.1 2|0.4 2|0.5 4|0.08 0|1 0|1 0|1 1|0.6 0|1 0|1
L.29 (84): 36|5e-04 13|1 2|1 5|0.8 12|0.009 4|0.6 5|0.2 1|0.9 3|0.2 0|1 0|1 0|1 1|0.7 2|0.3 0|1 0|1 0|1
L.33 (63): 19|0.2 7|1 8|0.6 2|1 5|0.4 4|0.4 2|0.6 1|0.9 0|1 3|0.2 1|0.7 2|0.2 1|0.6 1|0.6 3|0.03 4|0.004 0|1
L.38 (58): 21|0.05 17|0.2 6|0.8 2|0.9 0|1 3|0.5 5|0.05 1|0.9 1|0.7 0|1 0|1 1|0.6 1|0.6 0|1 0|1 0|1 0|1
L.45 (50): 21|0.009 4|1 7|0.5 2|0.9 2|0.9 7|0.01 2|0.5 0|1 0|1 1|0.7 0|1 1|0.5 0|1 1|0.5 1|0.4 1|0.4 0|1
L.48 (45): 11|0.6 13|0.2 0|1 2|0.9 2|0.8 2|0.7 10|2e-06 0|1 3|0.05 0|1 1|0.6 0|1 0|1 1|0.4 0|1 0|1 0|1
L.50 (51): 17|0.1 4|1 4|0.9 4|0.6 1|1 2|0.7 2|0.5 2|0.5 0|1 2|0.3 1|0.7 11|1e-10 0|1 0|1 0|1 1|0.4 0|1
L.58 (42): 4|1 4|1 1|1 1|1 4|0.3 1|0.9 0|1 2|0.4 0|1 0|1 0|1 0|1 0|1 0|1 1|0.4 24|1e-37 0|1
L.64 (40): 8|0.8 24|5e-07 1|1 1|1 2|0.8 1|0.9 0|1 1|0.7 1|0.5 1|0.6 0|1 0|1 0|1 0|1 0|1 0|1 0|1
L.65 (38): 8|0.8 20|6e-05 2|1 0|1 3|0.5 0|1 0|1 2|0.4 2|0.2 1|0.6 0|1 0|1 0|1 0|1 0|1 0|1 0|1
L.68 (32): 8|0.6 14|0.007 1|1 0|1 3|0.4 0|1 0|1 1|0.7 2|0.1 0|1 1|0.5 0|1 1|0.4 1|0.3 0|1 0|1 0|1
L.70 (33): 9|0.5 2|1 1|1 0|1 1|0.9 0|1 18|2e-18 1|0.7 0|1 1|0.5 0|1 0|1 0|1 0|1 0|1 0|1 0|1
L.71 (33): 20|2e-05 0|1 3|0.8 0|1 6|0.02 0|1 1|0.7 2|0.3 0|1 0|1 0|1 1|0.4 0|1 0|1 0|1 0|1 0|1
L.73 (31): 5|0.9 3|1 3|0.8 0|1 0|1 0|1 3|0.09 2|0.3 0|1 1|0.5 2|0.1 1|0.4 0|1 1|0.3 1|0.3 9|2e-11 0|1
L.76 (28): 20|5e-07 4|0.9 0|1 0|1 2|0.6 0|1 0|1 1|0.6 0|1 0|1 0|1 1|0.3 0|1 0|1 0|1 0|1 0|1
L.81 (25): 4|0.9 3|0.9 0|1 14|4e-10 1|0.8 1|0.7 0|1 0|1 2|0.08 0|1 0|1 0|1 0|1 0|1 0|1 0|1 0|1

Figure 9: Overlap of liver (y-axis) and adipose (x-axis) modules. Each row corresponds to a liver module indicated on the left by name, color, and number of probes in the module. Conversely, each column corresponds to an adipose module indicated at the bottom. Numbers in the table indicate the number of probes in the overlap and the corresponding Fisher exact p-value. The table is colored according to -log10 p, with the color scale indicated on the right. The large modules 1–4, and “module” 0, overlap very strongly between the tissues. Some other, smaller modules also show strong overlaps, but HDL-related modules overlap more weakly with modules in the opposite tissue.

8 Gene significance and module membership in HDL-related modules are correlated

Here we show that in HDL-related modules, highly connected genes (referred to as intramodular hub genes) also tend to have high gene significance for HDL. We use the function verboseScatterplot to plot annotated scatterplots of gene significance (GS) vs. module membership (also known as eigengene-based connectivity kME).

hdlInd = match("e_hdl_mgdl", colnames(selTraits[[1]]$data));

sizeGrWindow(7, 8);

#pdf(file = "Plots/Female-LA-HubgeneSignifForHDL.pdf", width = 7, height = 8);


par(mar = c(3.5, 3.5, 4, 0.5));

par(mgp = c(1.8, 0.6, 0));

for (set in 1:nSets)

{

# Select only modules related to HDL

moduleList1 = bestModules[[set]][[hdlInd]]$bestModules;

for (mod in moduleList1) # For each module...

{

# Find the module in the eigengenes

modGeneInd = (labels[[set]] == mod);

meInd = match(paste("ME", mod, sep=""), names(MEs[[set]]$eigengenes));

# Calculate GS, KME, and module eigengenes significance (MES)

nModGenes = sum(modGeneInd);

KME = bicor(expr[[set]]$data[, modGeneInd], MEs[[set]]$eigengenes[, meInd], use = 'p');

GS = bicor(expr[[set]]$data[, modGeneInd], selTraits[[set]]$data[, hdlInd], use = 'p');

MS = bicor(MEs[[set]]$eigengenes[, meInd], selTraits[[set]]$data[, hdlInd], use = 'p');

# Plot GS vs. kME

verboseScatterplot(KME, GS,

main = paste(shortLabels[set], mod, standardColors()[mod], "\nMES =",

signif(MS, 2), "\n"),

xlab = paste("kME in", shortLabels[set]),

ylab = paste("GS.HDL in", shortLabels[set]), abline = TRUE, cex.lab = 1.2,

cex.main = 1.2, cex.axis = 1.2);

}

}

# If plotting into a file, close it.

dev.off();

The result is shown in Figure 10. We observe that GS.HDL and kME are strongly correlated, that is, hub genes in HDL-related modules also tend to be strongly related to HDL.

30

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●●

● ●●●

●

●

●

●

●

●

●

●

●

● ●●

●

●

●

●●

●

●

●

●

●

●●

●

●●

●

●

●●

●

●

●

●●

●

●

●●

● ●

●●

●

●

●●

●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

● ●

●●

●

●

●

●

●

●

●

●

● ●

●

●

●

●

●

●

●

● ●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●●

●

●

●● ●

●

●

●

●

●

●

●

●

●

●

●

●●

●

●

●●

●●

●

●

●

●

●

●●

●

●

●

●

● ●

●

●

●

●●

●

●

●

●

●

● ●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●● ●

●

●

●

●

●

●

●

●

●

●

● ●

●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●● ●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●● ●

●●

●

●

●

●●

●

●

●

●

●

●

●

●

●●

●

● ●

●

●

●●

●●

●●

●

●

●

●

●

●

●

●

● ●

●●

●

●●

●●

●

●

●

●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●●●

●

●

●●

●

●

●

●●

●

●

●

●

● ●

●

●●

● ●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●●

●

●

●●

●

●

●

●

●

●

●●

●

●

●●

●

●

●●

●

●●

●

●

●

●

●●

●

●

●

●

●

●

●

●

● ●

●

●

●

●

●

●

●

●

●●

● ●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●●

●

●●

●

●

●●

●

●

●

●

●

●

●

●

●

●●

●●

●

●

●

●

●●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●●

●●●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

● ●

●

●

●●

●

●●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

●

● ●

●

●

●

●

●

● ●

●

●●

●●

●●●●

●● ●

●

●

●

●

●

●

●●

●

●

0.4 0.6 0.8

0.0

0.2

0.4

0.6

Liver 6 red MES = 0.56

cor=0.62, p=5.6e−72

kME in Liver

GS

.HD

L in

Liv

er

●

●

●

●

●

●

●

●

●

●

●● ●

●

●

●

●

●

●●

●●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●

●

●

●

0.4 0.6 0.8

0.20

0.30

0.40

0.50

Liver 64 skyblue2 MES = 0.47

cor=0.6, p=4.3e−05

kME in Liver

GS

.HD

L in

Liv

er

[Figure panel: Liver 11 greenyellow, MES = -0.46. Scatterplot of GS.HDL in Liver (y-axis) versus kME in Liver (x-axis); cor = -0.53, p = 7.8e-29.]

[Figure panel: Liver 20 royalblue, MES = 0.42. Scatterplot of GS.HDL in Liver (y-axis) versus kME in Liver (x-axis); cor = 0.36, p = 2.7e-06.]

