
  • Statistical analysis code for analysis of CASTxB6 F2 mouse cross

    2. Network analysis in liver and adipose

    Peter Langfelder

    March 23, 2011

    Contents

    1 Setting up the R session and loading of data

    2 Relationships among the physiological traits

    3 Network construction and module identification
      3.a Scale-free topology analysis
      3.b Network construction and module identification
      3.c Merging of closely-related modules
      3.d Trimming of genes with low module membership
      3.e Identification and removal of linkage-driven modules
      3.f Gene clustering dendrograms and module colors

    4 GO enrichment analysis
      4.a Exporting lists of genes in each module
      4.b GO enrichment analysis in WGCNA

    5 Modules related to physiological traits
      5.a Module-trait relationships for all modules that relate significantly to a trait
      5.b Network plots of all module eigengenes and traits

    6 Output of module membership, eigengenes, and eigengene correlations

    7 Overlap of liver and adipose modules

    8 Gene significance and module membership in HDL-related modules are correlated

    9 Module significance in male data validates association found in female data

    10 Cross-referencing with genes implicated in GWA studies

    1 Setting up the R session and loading of data

    In this document we detail our network analysis of the CASTxB6 cross. We use the WGCNA package [1] to construct the gene co-expression network, find modules, relate them to the clinical traits, study GO enrichment, and other tasks. We use the pre-processed data created in part 1.


    # Set working directory. This step is necessary if your data is saved in a directory other than the current
    # directory. Replace the path name below with the directory where the data is stored on your drive.

    # setwd("Z:/home/plangfelder/Work/Mouse-ReciprocalCXB/CxBOnly");

    # Load the WGCNA library

    library(WGCNA)

    # This setting is important, do not leave out

    options(stringsAsFactors = FALSE);

    options(width = 109)

    set.seed(1); #needed for .Random.seed to be defined

    We now set up a few basic variables and load the preprocessed expression data. Liver and adipose will be indexed 1 and 2, respectively. The files necessary for this step have been generated in part 1 of the analysis.

    nSets = 2;

    setLabels = c("Female Liver", "Female Adipose");

    shortLabels = c("Liver", "Adipose");

    shortshortLabels = c("L", "A");

    # Load expression data

    files = c("../CxBOnly-Liver-outliersRemoved-exprFemaOR-pValFemaOR.RData",

    "../CxBOnly-Adipose-outliersRemoved-exprFemaOR-pValFemaOR.RData");

    express = list();

    for (set in 1:nSets)

    {

    x = load(file = files[set]);

    express[[set]] = list(data = exprFemaOR);

    }

    expr = express;

    rm(express);

    collectGarbage();

    exprSize = checkSets(expr);

    nSamples = exprSize$nSamples;

    collectGarbage()

    We now load the trait data and isolate numeric traits measured at the time the animals were sacrificed.

    rawTr = read.csv(file = bzfile("../../../Data-AllMouse/CXB_Clinical_traits.csv.bz2"))

    numTraitInd = c(15:46, 48)

    numTraits = vector(mode = "list", length = nSets);

    # The following is relative to numTraitInd

    selTraitInd = c(5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33)

    selTraits = vector(mode = "list", length = nSets);

    for (set in 1:nSets)

    {

    mice = rownames(expr[[set]]$data);

    expr2tr = match(mice, rawTr$Mice_id);

    temp = rawTr[expr2tr, numTraitInd]

    rownames(temp) = rawTr$Mice_id[expr2tr];

    numTraits[[set]] = list(data = as.matrix(temp));

    selTraits[[set]] = list(data = as.matrix(temp[, selTraitInd]));

    }

    collectGarbage();


  • Some trait measurements appear to be incorrect or outliers. For example, one mouse (F2 391) has a recorded body length of nearly 30 cm (1 foot), which is clearly a measurement (or record) error. Similarly, some fat measurements are unrealistically high. We remove the unrealistic measurements from the data.

    for (set in 1:nSets)

    {

    suspicious = numTraits[[set]]$data[,"length_cm"] > 20;

    numTraits[[set]]$data[suspicious,"length_cm"] = NA;

    selTraits[[set]]$data[suspicious,"length_cm"] = NA;

    }

    # Remove unrealistically high values of efat_g

    for (set in 1:nSets)

    {

    suspicious = numTraits[[set]]$data[,"efat_g"] > 20;

    numTraits[[set]]$data[suspicious,"efat_g"] = NA;

    selTraits[[set]]$data[suspicious,"efat_g"] = NA;

    }

    nSelTraits = length(selTraitInd)

    tcInd = match("e_tc_mgdl", colnames(selTraits[[1]]$data));

    Next we modify trait names to make them more descriptive.

    renameTable = matrix(c("e_bweight_g", "e_fat_g", "e_mus_g", "e_fluid_g", "e_fat_per", "e_tg_mgdl",

    "e_tc_mgdl", "e_hdl_mgdl", "e_uc_mgdl", "e_ffa_mgdl", "e_glu_mgdl", "length_cm", "efat_g", "rfat_g",

    "vfat_g", "sfat_g", "insulin_pgml", "leptin_ngml", "bmd_mgcm2",

    "weight", "fat", "muscle", "fluid", "fat.frac", "trigly", "tot.chol", "HDL",

    "unest.chol", "FFA", "glucose", "length", "efat", "rfat", "vfat", "sfat",

    "insulin", "leptin", "BMD"), ncol = 2, nrow = nSelTraits);

    ind = match(renameTable[, 1], colnames(selTraits[[1]]$data));

    renameTable = renameTable[ind, ];

    2 Relationships among the physiological traits

    Here we plot a heatmap of correlations among the traits.

    # Calculate the matrix and order it using a hierarchical clustering dendrogram

    mat = bicor(selTraits[[1]]$data, use = 'p');

    order = hclust(as.dist(1-mat), method = 'a')$order;

    # Open a suitably sized graphics window

    sizeGrWindow(9,7);

    # Alternatively, plot into a file. Make sure the directory Plots exists or change the file name

    # appropriately.

    # pdf(file = "Plots/Liver-allTraitCorHeatmap.pdf", width = 9, height = 7);

    par(mar = c(5, 6, 2, 1));

    labeledHeatmap(mat[order, order],

    xLabels = renameTable[order, 2],

    yLabels = renameTable[order, 2],

    colors = greenWhiteRed(50),

    zlim = c(-max(abs(mat)), max(abs(mat))),

    setStdMargins = FALSE, cex.lab = 1.2,

    main = "Correlation heatmap of physiological traits",

    textMat = round(mat[order, order], 2), cex.text = 0.7)

    # If plotting into a file, close it

    dev.off();

    The result is shown in Figure 1. Many of the traits are strongly correlated.
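The ordering step used for the heatmap can be tried on a small synthetic example. The sketch below uses base R only: `cor` stands in for WGCNA's `bicor`, and the toy matrix `traits` with its column names is made up for illustration, so it is not the data analyzed here.

```r
# Toy data (hypothetical): three correlated "adiposity-like" traits and two unrelated ones
set.seed(1)
n = 50
base = rnorm(n)
traits = cbind(fat = base + rnorm(n, sd = 0.3),
               length = rnorm(n),
               weight = base + rnorm(n, sd = 0.3),
               glucose = rnorm(n),
               leptin = base + rnorm(n, sd = 0.3))

# Pearson correlation is an assumption here; the analysis above uses bicor
mat = cor(traits, use = "pairwise.complete.obs")

# Average-linkage hierarchical clustering on the dissimilarity 1 - correlation
ord = hclust(as.dist(1 - mat), method = "average")$order

# Reordering rows and columns places strongly correlated traits next to each other
reordered = mat[ord, ord]
print(round(reordered, 2))
```

After reordering, the three correlated traits occupy adjacent rows and columns, which is what makes blocks of related traits visible in a heatmap.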

  • [Figure 1 image omitted. Panel title: "Correlation heatmap of physiological traits"; color scale from −1 to 1. Rows and columns follow the same clustering order. The correlation values printed in the cells were:]

                          1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
     1 fluid          1 −0.23 −0.17 −0.27 −0.19 −0.29 −0.42   0.1  0.01 −0.12  −0.3 −0.65 −0.67 −0.74 −0.45 −0.57 −0.55 −0.57  −0.6
     2 insulin    −0.23     1  0.07  0.05  0.06  0.18   0.2  0.06  0.07  0.05  0.27  0.32   0.3  0.29  0.25  0.31  0.26  0.31   0.3
     3 trigly     −0.17  0.07     1  0.76  0.51  0.38  0.31  0.09  0.09 −0.07   0.2   0.3  0.27  0.26  0.26  0.31  0.29  0.26  0.27
     4 FFA        −0.27  0.05  0.76     1  0.58  0.43  0.46     0 −0.02 −0.04  0.11  0.36  0.32  0.34  0.26  0.32  0.34  0.31  0.33
     5 unest.chol −0.19  0.06  0.51  0.58     1  0.81  0.68     0 −0.05 −0.04  0.24  0.26  0.18  0.19  0.17  0.23  0.21  0.19  0.25
     6 tot.chol   −0.29  0.18  0.38  0.43  0.81     1  0.84  0.02  0.04  0.06  0.33  0.41  0.32  0.35   0.3  0.33  0.35   0.3  0.38
     7 HDL        −0.42   0.2  0.31  0.46  0.68  0.84     1  0.11   0.1  0.16  0.33  0.59  0.47  0.49  0.46  0.49  0.48  0.47  0.52
     8 BMD          0.1  0.06  0.09     0     0  0.02  0.11     1  0.53  0.36  0.07   0.2  0.17   0.1  0.37   0.2  0.26  0.19  0.19
     9 muscle      0.01  0.07  0.09 −0.02 −0.05  0.04   0.1  0.53     1  0.57  0.18  0.32  0.48  0.34  0.67  0.41  0.45  0.43  0.38
    10 length     −0.12  0.05 −0.07 −0.04 −0.04  0.06  0.16  0.36  0.57     1  0.04  0.37   0.4  0.34  0.59  0.41  0.45  0.44   0.4
    11 glucose     −0.3  0.27   0.2  0.11  0.24  0.33  0.33  0.07  0.18  0.04     1  0.33  0.34  0.32  0.41  0.38  0.42   0.3  0.35
    12 leptin     −0.65  0.32   0.3  0.36  0.26  0.41  0.59   0.2  0.32  0.37  0.33     1  0.83  0.81  0.74  0.82  0.77  0.81   0.8
    13 fat        −0.67   0.3  0.27  0.32  0.18  0.32  0.47  0.17  0.48   0.4  0.34  0.83     1  0.97  0.85  0.89  0.86   0.9  0.86
    14 fat.frac   −0.74  0.29  0.26  0.34  0.19  0.35  0.49   0.1  0.34  0.34  0.32  0.81  0.97     1  0.78  0.85  0.82  0.85  0.85
    15 weight     −0.45  0.25  0.26  0.26  0.17   0.3  0.46  0.37  0.67  0.59  0.41  0.74  0.85  0.78     1  0.85  0.88  0.89  0.85
    16 sfat       −0.57  0.31  0.31  0.32  0.23  0.33  0.49   0.2  0.41  0.41  0.38  0.82  0.89  0.85  0.85     1  0.88  0.88  0.88
    17 vfat       −0.55  0.26  0.29  0.34  0.21  0.35  0.48  0.26  0.45  0.45  0.42  0.77  0.86  0.82  0.88  0.88     1  0.89  0.89
    18 efat       −0.57  0.31  0.26  0.31  0.19   0.3  0.47  0.19  0.43  0.44   0.3  0.81   0.9  0.85  0.89  0.88  0.89     1  0.91
    19 rfat        −0.6   0.3  0.27  0.33  0.25  0.38  0.52  0.19  0.38   0.4  0.35   0.8  0.86  0.85  0.85  0.88  0.89  0.91     1

    Figure 1: Heatmap of correlations among the physiological traits. Many traits are strongly correlated, particularly the adiposity traits.


  • 3 Network construction and module identification

    In this section we construct the co-expression network and identify co-expression modules. We construct a “signed hybrid” network in which the adjacency $a_{ij}$ of nodes $i$, $j$ with expression profiles $x_i$, $x_j$ is defined as

    \begin{equation}
    a_{ij} =
    \begin{cases}
    \mathrm{bicor}^{\beta}(x_i, x_j) & \text{for } \mathrm{bicor}(x_i, x_j) > 0 \\
    0 & \text{otherwise,}
    \end{cases}
    \tag{1}
    \end{equation}

    where bicor is the biweight mid-correlation [3], a type of robust (that is, outlier-insensitive) correlation.
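Equation (1) can be illustrated with a few lines of base R. This is only a sketch: the helper `signedHybridAdjacency` is a hypothetical name, Pearson correlation (`cor`) is used as a stand-in for bicor, and the three toy "genes" are synthetic.

```r
# Signed hybrid adjacency: positive correlations are raised to the power beta,
# zero and negative correlations are set to 0.
# Assumption: Pearson correlation replaces bicor in this illustration.
signedHybridAdjacency = function(datExpr, beta)
{
  r = cor(datExpr, use = "pairwise.complete.obs")
  a = r^beta
  a[r <= 0] = 0
  a
}

# Toy expression data (samples in rows, genes in columns)
set.seed(2)
x1 = rnorm(30)
datExpr = cbind(g1 = x1,
                g2 = x1 + rnorm(30, sd = 0.5),    # positively correlated with g1
                g3 = -x1 + rnorm(30, sd = 0.5))   # negatively correlated with g1

adj = signedHybridAdjacency(datExpr, beta = 4)
print(round(adj, 3))
```

In the toy output the pair g1, g3 has adjacency 0 even though the genes are strongly anti-correlated, which is the property that distinguishes a signed hybrid network from an unsigned one.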

    3.a Scale-free topology analysis

    One of the important network construction parameters is the soft-thresholding power β. We apply the approximate scale-free topology criterion to select an appropriate power in each tissue separately. Note of caution: this code takes some time (possibly several hours) to run. Please be patient, or, if you trust our results, this part can be skipped.

    powers = c(seq(1,10,by=1));

    powerTables = vector(mode = "list", length = nSets);

    for (set in 1:nSets)

    powerTables[[set]] = list(data =

    pickSoftThreshold(expr[[set]]$data, powerVector=powers,

    networkType = "signed hybrid",

    verbose = 2 )[[2]]);

    save(powerTables, file = "CxBOnly-Female-powerTables.RData");

    collectGarbage();

    We plot the results of the scale-free topology analysis.

    sizeGrWindow(12,9)

    #pdf(file = "Plots/Female-AL-ScaleFreeTopology.pdf", width = 12, height = 9);

    par(mfrow = c(2,2));

    cex1 = 0.7;

    for (set in 1:nSets)

    {

    plot(powerTables[[set]]$data[,1], -sign(powerTables[[set]]$data[,3])*powerTables[[set]]$data[,2],

    xlab="Soft Threshold (power)",ylab="Scale Free Topology Model Fit,signed R^2",type="n",

    main = paste("Scale independence in ", setLabels[set]));

    addGrid();

    text(powerTables[[set]]$data[,1], -sign(powerTables[[set]]$data[,3])*powerTables[[set]]$data[,2],

    labels=powers,cex=cex1,col="red");

    # this line corresponds to using an R^2 cut-off of 0.90

    abline(h=0.90,col="red")

    plot(powerTables[[set]]$data[,1], powerTables[[set]]$data[,5],

    xlab="Soft Threshold (power)",ylab="Mean Connectivity", type="n",

    main = paste("Mean connectivity in", setLabels[set]))

    addGrid();

    text(powerTables[[set]]$data[,1], powerTables[[set]]$data[,5], labels=powers, cex=cex1,col="red")

    }

    # If plotting into a file, close it.

    dev.off();

    The resulting plot is shown in Figure 2. The networks become approximately scale-free when the soft-thresholding power becomes 3 to 4. We choose the power 4 for both the liver and adipose networks (but in general the powers could be different).
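To make the fit index more concrete, the following base R sketch computes a signed R^2 for a vector of node connectivities by regressing the log frequency of binned connectivity on the log mean connectivity of each bin. The helper `scaleFreeFit`, the number of bins, and the synthetic connectivities are assumptions made for illustration; the WGCNA implementation behind `pickSoftThreshold` may differ in detail.

```r
# Approximate scale-free fit for a vector of node connectivities k.
# Hypothetical helper written for this illustration.
scaleFreeFit = function(k, nBreaks = 10)
{
  bins = cut(k, nBreaks)                      # equal-width connectivity bins
  meanK = as.vector(tapply(k, bins, mean))    # mean connectivity per bin (NA if the bin is empty)
  pK = as.vector(table(bins)) / length(k)     # fraction of nodes in each bin
  keep = is.finite(meanK) & pK > 0            # drop empty bins before taking logarithms
  x = log10(meanK[keep])
  y = log10(pK[keep])
  fit = lm(y ~ x)
  slope = unname(coef(fit)[2])
  Rsq = summary(fit)$r.squared
  # Same sign convention as in the plotting code: a negative slope gives a positive signed R^2
  c(slope = slope, Rsq = Rsq, signedRsq = -sign(slope) * Rsq)
}

# Synthetic connectivities with a heavy right tail (not taken from the data)
set.seed(1)
k = runif(2000)^(-1/2)
fitStats = scaleFreeFit(k)
print(round(fitStats, 3))
```

A network is considered approximately scale-free when the slope is negative and the signed R^2 is high, which is the pattern looked for when choosing the soft-thresholding power.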

  • [Figure 2 image omitted. Four panels: "Scale independence in Female Liver", "Mean connectivity in Female Liver", "Scale independence in Female Adipose", "Mean connectivity in Female Adipose". x-axis: Soft Threshold (power), 1 to 10; y-axes: Scale Free Topology Model Fit, signed R^2 (scale independence panels) and Mean Connectivity (mean connectivity panels).]

    Figure 2: Scale-free topology analysis. The left panels show the scale-free topology fit index R^2 as a function of the soft-thresholding power. The right panels show mean network connectivity. The networks become approximately scale-free when the soft-thresholding power becomes 3 to 4.


  • 3.b Network construction and module identification

    Here we use the function blockwiseModules to construct the networks and identify modules. The function has multiple arguments and options; here we leave most of them at their default values. We save the result of the calculation so it only needs to be executed once.

    Note of caution: this code assumes that the computer it runs on has enough memory to handle the full data set. This is usually at least 16 GB but preferably 32 GB. If your computer’s RAM is not large enough, the code will trigger an error. In that case please download the file Female-LA-mods.RData from our web site and continue the analysis below.

    Second note of caution: If you do have a large-enough computer and run this code, be prepared to wait several hours. The calculation can be sped up substantially by installing a fast BLAS library such as ATLAS BLAS or GotoBLAS and compiling R against it. If you do not know what “installing a library and compiling R against it” means, your best bet is to be patient and/or run this calculation overnight. Again, you may want to download the result Female-LA-mods.RData.

    # Set up basic parameters

    softPower = c(4,4);

    minModSize = c(25, 30);

    mergeCutHeight = 0.25;

    cutHeight = 0.995;

    collectGarbage()

    # Call the module construction function for each tissue separately

    mods = list();

    for (set in 1:nSets)

    {

    mods[[set]] = blockwiseModules(expr[[set]]$data,

    maxBlockSize = 30000,

    networkType = "signed hybrid",

    corType = "bicor", power = softPower[set],

    TOMType = "signed",

    TOMDenom = "mean",detectCutHeight = cutHeight,

    minModuleSize = minModSize[set],

    deepSplit = 2,

    mergeCutHeight = mergeCutHeight, saveTOMs = TRUE,

    saveTOMFileBase = spaste("CxBOnly-", shortLabels[set], "-consensusTOM"),

    reassignThreshold = 1e-6,

    minCoreKME = 0.5, minKMEtoStay = 0.3,

    numericLabels = TRUE, verbose = 3);

    collectGarbage();

    }

    # Save the results

    save(mods, file = "Female-LA-mods.RData");

    If the above code already ran once, or if instead of executing it you simply downloaded the result Female-LA-mods.RData, load it:

    load(file = "Female-LA-mods.RData");


  • 3.c Merging of closely-related modules

    Here we take a look at the eigengene network of the unmerged modules and merge modules whose eigengenes are highly correlated. We choose the thresholds for merging to be correlation 0.80 in liver and 0.90 in adipose.

    # Set the cut heights (1-correlation)

    mergeCut = c(0.20, 0.10)

    merge = list();

    # Call the module merge function on each tissue

    for (set in 1:nSets)

    {

    merge[[set]] = mergeCloseModules(expr[[set]]$data, mods[[set]]$unmergedColors, cutHeight = mergeCut[set],

    getNewUnassdME = TRUE, relabel = TRUE);

    }

    # Plot the eigengene dendrograms before and after merging

    sizeGrWindow(12, 9);

    #pdf("Plots/Female-LA-mergingDendrograms-%02d.pdf", onef = FALSE, width = 12, height = 10);

    for (set in 1:nSets)

    {

    par(mfrow = c(2,1));

    par(mar = c(0.2, 4, 2.5, 0.2));

    plot(merge[[set]]$oldDendro, main = paste(setLabels[set], "modules before merging"),

    sub = "", xlab = "", cex = 0.7);

    abline(mergeCut[set], 0, col = "red");

    plot(merge[[set]]$dendro, main = paste(setLabels[set], "modules after merging"),

    sub = "", xlab = "", cex = 0.7);

    abline(mergeCut[set], 0, col = "red");

    }

    # If plotting into a file, close it

    dev.off();

    # Put together variables for further use

    labels = list();

    colors = list();

    MEs = list();

    for (set in 1:nSets)

    {

    labels[[set]] = merge[[set]]$colors;

    colors[[set]] = labels2colors(labels[[set]]);

    MEs[[set]] = orderMEs(merge[[set]]$newMEs);

    }

    # Save the results so this code does not need re-running later.

    save(merge, labels, colors, MEs, file = "Female-LA-merge-colors-labels-MEs.RData");

    If the above code already ran once, the results can be loaded in one line of code:

    load(file = "Female-LA-merge-colors-labels-MEs.RData");

    The resulting module merging dendrograms are shown in Figures 3 and 4. Several modules have been merged in liver but none in adipose.
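The relationship between the correlation threshold and the cut height (cut height = 1 − correlation) can be sketched in base R. The example below clusters four synthetic eigengenes and groups those that fall below the cut height; Pearson correlation and the toy eigengene names are assumptions, and `mergeCloseModules` additionally recalculates eigengenes after merging and repeats the procedure, which this sketch does not do.

```r
# Toy eigengenes (hypothetical): ME1 and ME2 are nearly identical, ME3 and ME4 are unrelated
set.seed(3)
n = 40
s1 = rnorm(n)
ME = cbind(ME1 = s1,
           ME2 = s1 + rnorm(n, sd = 0.2),
           ME3 = rnorm(n),
           ME4 = rnorm(n))

# Dissimilarity of eigengenes and average-linkage clustering
diss = 1 - cor(ME)
tree = hclust(as.dist(diss), method = "average")

# A cut height of 0.20 corresponds to merging at eigengene correlation above 0.80
cutHeight = 0.20
groups = cutree(tree, h = cutHeight)
print(groups)
```

Eigengenes that receive the same group number would be merged into one module; here only ME1 and ME2 share a group.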

  • [Figure 3 image omitted. Two dendrogram panels: "Female Liver modules before merging" and "Female Liver modules after merging". Leaves are module eigengenes (labeled ME followed by the module number); y-axis: Height.]

    Figure 3: Liver module eigengene dendrograms based on dissimilarity equal to 1 − bicor.

  • [Figure 4 image omitted. Two dendrogram panels: "Female Adipose modules before merging" and "Female Adipose modules after merging". Leaves are module eigengenes (labeled ME followed by the module number); y-axis: Height.]

    Figure 4: Adipose module eigengene dendrograms based on dissimilarity equal to 1 − bicor.


  • 3.d Trimming of genes with low module membership

    The module identification method sometimes assigns genes into modules although the gene has very low module membership (defined as the correlation of the gene expression profile and the eigengene). Although such module assignment could be meaningful, we aim for tighter modules and hence we remove module genes whose module membership is below the threshold of 0.30. Since removing any gene from a module in principle changes its eigengene, we iterate this process until no genes are removed.

    mes = list();

    origMEs = list();

    trimLabs = labels;

    for (set in 1:nSets)

    {

    changed = TRUE

    threshold = 0.30;

    mes[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]]);

    origMEs[[set]] = mes[[set]];

    trimLabs[[set]] = labels[[set]]

    while (changed)

    {

    changed = FALSE;

    nMods = ncol(mes[[set]]$eigengenes)

    modNames = substring(names(mes[[set]]$eigengenes), 3)

    #pind= initProgInd();

    for (mod in 1:nMods) if (modNames[mod]!='0')

    {

    modGeneInd = (as.character(trimLabs[[set]])==modNames[mod]);

    nModGenes = sum(modGeneInd);

    KME = bicor(expr[[set]]$data[, modGeneInd], mes[[set]]$eigengenes[, mod], use = 'p');

    remove = KME < threshold;

    if (sum(remove)>0) changed = TRUE;

    printFlush("module", modNames[mod], ": removing", sum(remove), "of", length(remove), "genes.");

    trimLabs[[set]][modGeneInd][remove] = 0;

    #pind = updateProgInd(mod/nMods, pind);

    }

    #printFlush("");

    # Redo module eigengene calculation

    if (changed) mes[[set]] = moduleEigengenes(expr[[set]]$data, colors = trimLabs[[set]]);

    }

    }
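For readers who want to see the trimming logic without the surrounding bookkeeping, here is a minimal base R sketch for a single module. It assumes that the eigengene is the first singular vector of the standardized expression data, oriented to agree with average expression, and it uses Pearson correlation instead of bicor; the helper names `moduleEigengene` and `trimModule` and the toy data are hypothetical.

```r
# Eigengene of one module: first left singular vector of the scaled data
# (x has samples in rows and genes in columns)
moduleEigengene = function(x)
{
  xs = scale(x)
  eg = svd(xs, nu = 1, nv = 0)$u[, 1]
  # Orient the eigengene so that it correlates positively with mean expression
  if (cor(eg, rowMeans(xs)) < 0) eg = -eg
  eg
}

# Iteratively drop genes whose module membership (correlation with the
# eigengene) is below the threshold, recomputing the eigengene each time
trimModule = function(x, threshold = 0.30)
{
  keep = rep(TRUE, ncol(x))
  repeat {
    eg = moduleEigengene(x[, keep, drop = FALSE])
    kme = as.vector(cor(x[, keep, drop = FALSE], eg))
    low = kme < threshold
    if (!any(low)) break
    keep[which(keep)[low]] = FALSE
  }
  keep
}

# Toy module: eight genes following a common profile and one gene opposing it
set.seed(4)
n = 60
core = rnorm(n)
x = cbind(sapply(1:8, function(i) core + rnorm(n, sd = 0.5)),
          -core + rnorm(n, sd = 0.5))
colnames(x) = c(paste0("gene", 1:8), "anti")

keep = trimModule(x)
print(keep)
```

In this toy module the gene that runs against the common profile has negative module membership and is removed in the first pass, after which no further genes fall below the threshold and the loop stops.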

    Next we check how much the eigengenes have changed. We calculate the correlations between the “original” (i.e., before gene trimming) and new eigengenes.

    #out of curiosity: correlations between original and trimmed module eigengenes:

    signif(diag(cor(mes[[1]]$eigengenes, origMEs[[1]]$eigengenes)), 3);

    signif(diag(cor(mes[[2]]$eigengenes, origMEs[[2]]$eigengenes)), 3);

    signif(min(abs(diag(cor(mes[[1]]$eigengenes, origMEs[[1]]$eigengenes))[-1])), 3);

    signif(min(abs(diag(cor(mes[[2]]$eigengenes, origMEs[[2]]$eigengenes))[-1])), 3);

    Excluding the eigengene of the improper module 0 (that collects the unassigned genes), the minimum correlation of old and new eigengenes is 0.999, which indicates that although some outlying genes were removed, the eigengenes have practically not changed. We now re-form eigengenes, save the results of this part and replace the module labels with the trimmed labels.

    MEs = list();

    ordMEs = list();

    for (set in 1:nSets)

    {

    MEs[[set]] = moduleEigengenes(expr[[set]]$data, trimLabs[[set]]);

    ordMEs[[set]] = orderMEs(MEs[[set]], greyName = "ME0");

    }

    # Save the results so they can be loaded in future

    save(trimLabs, MEs, ordMEs, file = "Female-LA-trimLabs.RData");

    #load(file = "Female-LA-trimLabs.RData");

    labels = trimLabs;

    colors = lapply(labels, labels2colors)

    3.e Identification and removal of linkage-driven modules

    Some of the smaller modules appear to be linkage-driven in the sense that they group together genes located in a single chromosomal region and their eigengene is highly correlated with a genotype at that locus. Although the genes in such modules are co-expressed in this particular CASTxB6 cross, they would likely not be co-expressed in a random (diverse) population. Therefore we identify such modules and remove them from the analysis (by setting the module labels of the corresponding genes to 0). We start by loading and formatting the genotype data.

    # (re) read the gene annotation table

    file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

    #file = "../Data-CXB/CXB_all_gene_annotation.csv"

    annotX = read.csv(file = file);

    # Read the SNP data and sort them

    file = bzfile(description = "../../../Data-AllMouse/CXB_GENOTYPES_numeric.csv.bz2");

    gtInfo = read.csv(file = file);

    file = bzfile(description = "../../../Data-AllMouse/CXB_GENOTYPES_alpha.csv.bz2");

    gtAlpha = read.csv(file = file);

    # Correct the coding of numeric genotypes

    num2alpha = match(gtInfo$marker_name, gtAlpha$marker_name);

    gtAlphaN = gtAlpha[num2alpha, ]

    all.equal(names(gtAlphaN), names(gtInfo))

    gtInfo[gtAlphaN=='H'] = 1;

    gtInfo[gtAlphaN=='B'] = 2;

    collectGarbage()

    # Sort the SNPs:

    snpHasAnno = is.finite(gtInfo$chro_number) & is.finite(gtInfo$marker_pos_Bp);

    gtInfoA = gtInfo[snpHasAnno, ];

    SNPorder = order(gtInfoA$chro_number, gtInfoA$marker_pos_Bp);

    gtInfoS = gtInfoA[SNPorder, ];

    gtCols = substring(names(gtInfoS), 1, 3)=="F2_";

    gtSamples = names(gtInfoS)[gtCols];

    Next we identify and remove modules whose highest correlation with a SNP is above 0.5.

    # Identify modules whose correlation with the best SNP is above 0.5

    cleanLabels = labels;

    for (set in 1:nSets)

    {

    common = intersect(rownames(expr[[set]]$data), gtSamples);

    expr2gt = match(rownames(expr[[set]]$data), gtSamples);

    print(table(is.na(expr2gt)))

    # all expression-measured samples have a genotype, good.

    gt = t(gtInfoS[, gtCols][, expr2gt]);

    gtAnnot = gtInfoS[, c(2,5,6)];

    collectGarbage()

    x = bicorAndPvalue(gt, MEs[[set]]$eigengenes);

    bestP = apply(x$p, 2, min, na.rm = TRUE)

    whichP = apply(x$p, 2, which.min)

    maxCor = apply(abs(x$bicor), 2, max);

    which = apply(abs(x$bicor), 2, which.max);

    if (!isTRUE(all.equal(whichP, which))) stop("which and whichP do not agree.");

    suspicious = maxCor > 0.5;

    suspInfo = data.frame(module = substring(names(MEs[[set]]$eigengenes)[suspicious], 3),

    gtAnnot[which[suspicious], ],

    absCor.SNP.ME = maxCor[suspicious],

    pValue.SNP.ME = bestP[suspicious]);

    modules = as.numeric(substring(names(MEs[[set]]$eigengenes)[suspicious], 3))

    printFlush(paste("Suspicious modules: ", paste(modules, collapse = ", ")));

    cleanLabels[[set]] [is.finite(match(labels[[set]], modules))] = 0;

    write.csv(suspInfo, file = spaste("CxBOnly-Female-", shortLabels[set], "-HighSNP-MEcorrelations.csv"),

    quote = FALSE, row.names = FALSE);

    }
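The screening rule itself (flag a module when the strongest absolute correlation between its eigengene and any SNP exceeds 0.5) can be shown on synthetic data. The sketch below uses base R with Pearson correlation in place of `bicorAndPvalue`; the genotype matrix, SNP names and eigengenes are invented for illustration.

```r
# Synthetic genotypes coded 0/1/2 (samples in rows, SNPs in columns)
set.seed(5)
n = 100
gt = matrix(sample(0:2, n * 20, replace = TRUE), n, 20,
            dimnames = list(NULL, paste0("snp", 1:20)))

# Two toy eigengenes: ME1 is unrelated to genotype, ME2 is driven by snp7
ME = cbind(ME1 = rnorm(n),
           ME2 = gt[, "snp7"] + rnorm(n, sd = 0.5))

# SNP-by-module correlation matrix and the best SNP for each module
r = cor(gt, ME)
maxCor = apply(abs(r), 2, max)
bestSNP = rownames(r)[apply(abs(r), 2, which.max)]

# Modules whose best SNP correlation exceeds 0.5 are treated as linkage-driven
linkageDriven = maxCor > 0.5
print(data.frame(module = colnames(ME), bestSNP = bestSNP,
                 maxCor = round(maxCor, 2), linkageDriven = linkageDriven))
```

Genes in a flagged module would then have their labels set to 0, as is done with `cleanLabels` in the code above.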

    We again replace the module labels by the cleaned labels and recalculate module eigengenes for further use.

    # From here on only use cleaned labels:

    labels = cleanLabels;

    MEs = list();

    ordMEs = list();

    MEs0 = list(); # Leave grey eigengene out

    for (set in 1:nSets)

    {

    MEs[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]]);

    ordMEs[[set]] = orderMEs(MEs[[set]], greyName = "ME0");

    MEs0[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]], excludeGrey = TRUE, grey = 0);

    }

    save(labels, MEs, MEs0, ordMEs, file = "Female-LA-labels-MEs-ordMEs-afterCleaning.RData");

    #load(file = "Female-LA-labels-MEs-ordMEs-afterCleaning.RData");

    colors = lapply(labels, labels2colors)

    3.f Gene clustering dendrograms and module colors

    Here we take a look at the gene clustering trees in both tissues. This allows us to visually verify that the module identification procedure led to modules that actually correspond to distinguishable branches of the gene clustering dendrogram. We also add color-coded indicators of gene significance for the individual traits. It is better to save the plots directly into a pdf (large file size, full resolution with zoom-in) or png (smaller file size but also smaller resolution), but the plot can also be viewed on-screen.

    # Calculate gene significance for all traits

    basePVal = 0.01;

    traitGeneColors = list();

    for (set in 1:nSets)

    {

    z = qnorm(1-basePVal)/sqrt(nSamples[set]-3);

    baseCor = tanh(z);

    cor = bicor(expr[[set]]$data, selTraits[[set]]$data, use = "p");

    cor[abs(cor) < baseCor] = 0;

    traitGeneColors[[set]] = numbers2colors(cor, signed = TRUE);

    colnames(traitGeneColors[[set]]) = colnames(selTraits[[set]]$data);

    }

    # Plot the gene clustering trees and the gene significance

    sizeGrWindow(12,9)

    #pdf(file = "Plots/Female-LA-geneDendrograms-AllTraits-%02d.pdf", w = 30, h = 15, onefile = FALSE)

    #png(file = "Plots/Female-LA-geneDendrograms-AllTraits-%02d.png", w = 1200, h = 600)

    for (set in 1:nSets)

    {

    par(lheight=1.3);

    plotDendroAndColors(mods[[set]]$dendrograms[[1]],

    cbind(traitGeneColors[[set]], colors[[set]]),

    c(renameTable[, 2], "modules"),

    autoColorHeight = FALSE,

    colorHeight = 0.6,

    rowText = spaste(labels[[set]], ": ", colors[[set]]),

    textPositions = nSelTraits + 1,

    marAll = c(0, 8, 2, 3),

    ylab = "", xlab = "", sub = "", dendroLabels = FALSE, hang = 0.03,

    addGuide = TRUE, guideHang = 0.05, cex.rowText = 1.3, cex.colorLabels = 1.2,

    rowWidths = c(rep(1, nSelTraits + 1), 15),

    addTextGuide = TRUE,

    main = spaste(shortLabels[set],

    " gene dendrogram, association with traits and module colors"),

    cex.main = 1.4);

    }

    dev.off()
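    The cut-off baseCor used in the gene significance calculation above can be illustrated in isolation. The following is a minimal sketch in base R, assuming a hypothetical sample size n = 103 (the real values are stored in nSamples) and a few made-up correlations. It uses the Fisher z-transformation, which is strictly a result for the Pearson correlation; applying it to the biweight midcorrelation is an approximation.

```r
# Hypothetical sample size; the real values live in nSamples[set].
n <- 103
basePVal <- 0.01

# One-sided normal quantile for basePVal, scaled by the approximate
# standard error of Fisher's z, 1 / sqrt(n - 3).
z <- qnorm(1 - basePVal) / sqrt(n - 3)

# Back-transform from the z scale to the correlation scale.
baseCor <- tanh(z)

# Round trip: transforming baseCor back recovers the normal quantile.
zBack <- atanh(baseCor) * sqrt(n - 3)

# Made-up correlations: values below the threshold in absolute value are
# set to zero, as in cor[abs(cor) < baseCor] = 0 above.
toyCor <- c(-0.40, -0.10, 0.05, 0.30)
toyCor[abs(toyCor) < baseCor] <- 0

print(round(baseCor, 3))
print(toyCor)
```

    Correlations that are set to zero map to white in the color rows, so only genes whose correlation with a trait exceeds the cut-off are colored.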

    We show the results (in the png version) in Figures 5 and 6. The dendrograms exhibit clear branches that are identified as modules. In the high-resolution pdf figures one can also see the smaller branches that correspond to smaller modules.

    Figure 5: The upper panel shows the gene clustering tree (dendrogram) in liver. Each “leaf”, i.e., a short vertical line, corresponds to one gene (more precisely, a microarray probe). Branches of the dendrogram correspond to modules. Below the dendrogram, color rows annotated by clinical traits give the gene significance for (correlation with) the corresponding trait. Red color corresponds to positive gene significance (GS), and green color corresponds to negative GS. White color indicates no gene significance; color saturation corresponds to GS strength. The last color row indicates module assignment. Module colors are annotated below the module color row.

    Figure 6: The upper panel shows the gene clustering tree (dendrogram) in adipose. Each “leaf”, i.e., a short vertical line, corresponds to one gene (more precisely, a microarray probe). Branches of the dendrogram correspond to modules. Below the dendrogram, color rows annotated by clinical traits give the gene significance for (correlation with) the corresponding trait. Red color corresponds to positive gene significance (GS), and green color corresponds to negative GS. White color indicates no gene significance; color saturation corresponds to GS strength. The last color row indicates module assignment. Module colors are annotated below the module color row.

    4 GO enrichment analysis

    Here we perform a functional enrichment analysis of the modules we found. There are two main methods one can use: either export lists of genes in each module and use external software, or use the function GOenrichmentAnalysis in WGCNA to calculate enrichment in GO terms. We first show how to export the gene lists for each module for use with external software, then perform the actual analysis using GOenrichmentAnalysis.

    4.a Exporting lists of genes in each module

    We (re-)load the gene annotation table and export the matching Locus Link IDs (also known as Entrez IDs). The output is a set of text files with names such as Liver-3.txt etc. in the subdirectory FEA of the current directory. If the directory FEA does not exist, please create it or modify the variable outFileBase below.

    file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

    #file = "../Data-CXB/CXB_all_gene_annotation.csv"

    annot = read.csv(file = file);

    # Loop over tissues

    for (set in 1:nSets)

    {

    # Base of the file names

    outFileBase = spaste("FEA/", shortLabels[set], "-");

    nMods = ncol(MEs[[set]]$eigengenes)

    modNames = as.numeric(substring(names(MEs[[set]]$eigengenes), 3))

    # loop over modules

    for (mod in 1:nMods)

    {

    modGeneInd = (labels[[set]]==modNames[mod]);

    modProbes = colnames(expr[[set]]$data)[modGeneInd];

    annotInd = match(modProbes, annot$sequence);

    annotInd = annotInd[!is.na(annotInd)];

    modGeneIDs = annot$LocusLinkID[annotInd];

    modGeneIDs = modGeneIDs[!is.na(modGeneIDs)];

    write.table(data.frame(LLID = modGeneIDs),

    file = paste(outFileBase, modNames[mod], '.txt', sep = ""), quote = F, row.names = F,

    col.names = F);

    }

    allAnnotInd = match(colnames(expr[[set]]$data), annot$sequence);

    allAnnotInd = allAnnotInd[!is.na(allAnnotInd)];

    GeneIDs = annot$LocusLinkID[allAnnotInd];

    GeneIDs = GeneIDs[!is.na(GeneIDs)];

    # Also write out a file of all genes in the network, useful as a background list in the analysis.

    write.table(data.frame(LLID = GeneIDs),

    file = paste(outFileBase, 'all.txt', sep = ""), quote = F, row.names = F,

    col.names = F);

    }
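    The probe-to-gene mapping used in the export loop above relies on match() followed by two rounds of NA removal. The following is a minimal sketch on made-up data; the annotation table, probe names, and IDs are hypothetical, and only the column names sequence and LocusLinkID are taken from the code above.

```r
# Hypothetical annotation table; column names mirror those used above.
annot <- data.frame(sequence = c("p1", "p2", "p3", "p4"),
                    LocusLinkID = c(101, NA, 103, 104))

# Hypothetical probes in one module; "p9" has no row in the annotation.
modProbes <- c("p3", "p9", "p2", "p1")

annotInd <- match(modProbes, annot$sequence)   # 3 NA 2 1
annotInd <- annotInd[!is.na(annotInd)]         # drop probes without annotation
modGeneIDs <- annot$LocusLinkID[annotInd]      # 103 NA 101
modGeneIDs <- modGeneIDs[!is.na(modGeneIDs)]   # drop probes without a gene ID

print(modGeneIDs)
```

    The first filter removes probes that are absent from the annotation table, the second removes probes that are present but carry no Locus Link ID, so the exported list can be shorter than the module.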

    4.b GO enrichment analysis in WGCNA

    Here we perform the GO enrichment analysis directly in WGCNA. This is usually much more convenient than uploading each module separately to a separate application, but is restricted to GO. This calculation will take several minutes.

    # (re-)read gene annotation

    file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

    #file = "../Data-CXB/CXB_all_gene_annotation.csv"

    annot = read.csv(file = file);

    # Calculate enrichment information

    bt = list();

    for (set in 1:nSets)

    {

    expr2annot = match(colnames(expr[[set]]$data), annot$sequence);

    LLID = annot$LocusLinkID[expr2annot];

    table(is.na(LLID))

    fin = !is.na(LLID);

    finLLID = LLID[fin];

    finLabels = labels[[set]][fin];

    system.time ( {

    bt[[set]] = GOenrichmentAnalysis(finLabels, finLLID, organism = "mouse",

    nBest = 20, nBiggest = 0, includeOffspring = TRUE);

    } );

    }

    # Save the results for future use

    save(bt, file = "Female-LA-GOEnrichemnt-trimmedAndCleanedLabels.RData");

    Next we re-format the full information into a more manageable form and print it. To make the printout readable, please make the R console at least 100 characters wide. The table is also saved as a tab-delimited text file that can be opened using MS Excel or OpenOffice Calc.

    # If necessary, load the results

    load(file = "Female-LA-GOEnrichemnt-trimmedAndCleanedLabels.RData");

    # Loop over tissues

    for (set in 1:nSets)

    {

    res = bt[[set]]$bestPTerms[[4]]$enrichment;

    # Write an excel sheet containing the full information

    write.table(res, file = spaste("CxBOnly-Female-", shortLabels[[set]], "-GOenrichment.txt"),

    row.names = FALSE, sep = "\t", quote = FALSE);

    # Print a "narrower" version

    res2 = res[, c(1, 2, 4, 6, 8, 12, 13)];

    res2[, c(4, 5)] = signif(apply(res2[, c(4,5)], 2, as.numeric), 2)

    rownames(res2) = NULL

    names(res2) = c("Mod", "Size", "Rnk", "p.Bonf", "fracModSz", "ont", "termName");

    terms = res2$termName;

    sterms = substring(terms, 1, 60);

    res2$termName = sterms;

    options(width = 100);

    modules = sort(as.numeric(unique(res2$Mod)));

    for (m in modules)

    {

    printFlush(spaste("=========== Module:", m, "; module size: ", sum(labels[[set]]==m)))

    print(res2[res2$Mod==m, -c(1,2)]);

    }

    }

    The result is a long printout of Bonferroni-corrected enrichment p-values. For example, for liver module 6 we get

    =========== Module:6; module size: 666
        Rnk  p.Bonf fracModSz ont termName
    121   1 2.3e-31     0.460  MF catalytic activity
    122   2 6.7e-31     0.210  CC mitochondrion
    123   3 1.1e-28     0.150  MF oxidoreductase activity
    124   4 6.3e-26     0.140  BP oxidation reduction
    125   5 1.6e-20     0.500  CC cytoplasm
    126   6 9.4e-20     0.340  CC cytoplasmic part
    127   7 2.3e-10     0.099  BP lipid metabolic process
    128   8 1.3e-09     0.480  BP metabolic process
    129   9 1.4e-09     0.077  BP organic acid metabolic process
    130  10 1.4e-09     0.077  BP carboxylic acid metabolic process
    131  11 1.8e-09     0.037  MF oxidoreductase activity, acting on CH-OH group of donors
    132  12 2.4e-08     0.035  MF electron carrier activity
    133  13 7.7e-08     0.580  CC intracellular part
    134  14 2.0e-07     0.032  MF oxidoreductase activity, acting on the CH-OH group of donors
    135  15 4.4e-07     0.590  CC intracellular
    136  16 8.8e-07     0.033  MF tetrapyrrole binding
    137  17 2.0e-06     0.032  MF heme binding
    138  18 2.4e-06     0.038  BP steroid metabolic process
    139  19 2.6e-06     0.450  CC intracellular membrane-bounded organelle
    140  20 2.9e-06     0.450  CC membrane-bounded organelle

    Of note is also the adipose module 6 (809 probes) that is extremely highly enriched in the term mitochondrion:

    =========== Module:6; module size: 809
        Rnk   p.Bonf fracModSz ont termName
    121   1 2.5e-234     0.500  CC mitochondrion
    122   2 9.2e-142     0.600  CC cytoplasmic part
    123   3 2.7e-109     0.210  CC mitochondrial part
    124   4 3.1e-100     0.190  CC mitochondrial envelope
    125   5  2.8e-98     0.190  CC mitochondrial membrane
    126   6  9.8e-98     0.170  CC mitochondrial inner membrane
    127   7  1.9e-92     0.680  CC cytoplasm
    128   8  6.9e-71     0.140  BP generation of precursor metabolites and energy
    129   9  5.4e-67     0.650  CC intracellular membrane-bounded organelle
    130  10  7.5e-67     0.650  CC membrane-bounded organelle
    131  11  3.3e-56     0.180  BP oxidation reduction
    132  12  6.4e-55     0.067  CC respiratory chain
    133  13  5.7e-50     0.080  BP electron transport chain
    134  14  7.7e-49     0.720  CC intracellular part
    135  15  1.9e-48     0.170  MF oxidoreductase activity
    136  16  2.1e-46     0.490  MF catalytic activity
    137  17  3.4e-46     0.730  CC intracellular
    138  18  7.4e-28     0.043  BP cellular respiration
    139  19  1.6e-27     0.550  BP metabolic process
    140  20  3.8e-20     0.063  MF cofactor binding

    We now create text labels for the modules that reflect the name of the term with the highest enrichment. We only create a GO label if the corresponding Bonferroni-corrected p-value is below 10^-4; otherwise the module keeps its bare number as the label.

    # Create GO labels for modules

    goLabels = list();

    goModules = list();

    goPvalue = list();

    for (set in 1:nSets)

    {

    goAnn = bt[[set]]$bestPTerms[[4]]$enrichment

    nModules = length(unique(goAnn$module));

    best = tapply(c(1:nrow(goAnn)), goAnn$module, min);

    goModules[[set]] = goAnn$module[best][-1];

    goLabels[[set]] = spaste(goModules[[set]], ": ", goAnn$termName[best][-1]);

    goPvalue[[set]] = goAnn$BonferoniP[best][-1];

    goLabels[[set]] [goPvalue[[set]] > 1e-4] = goModules[[set]] [goPvalue[[set]] > 1e-4];

    }

    collectGarbage();

    The labels are as follows:

    > goLabels
    [[1]]
     [1] "1: receptor activity"                      "10: proteasome complex"
     [3] "11"                                        "12: G-protein coupled receptor activity"
     [5] "13"                                        "14: intracellular part"
     [7] "15"                                        "16"
     [9] "17"                                        "18: nucleus"
    [11] "19: extracellular matrix"                  "2: intracellular"
    [13] "20: mitochondrion"                         "21"
    [15] "22: ribosome"                              "23: nucleosome assembly"
    [17] "24: cell adhesion"                         "25"
    [19] "26: cellular amino acid metabolic process" "27: mitochondrial part"
    [21] "29"                                        "3"
    [23] "33: endoplasmic reticulum"                 "38"
    [25] "4: G-protein coupled receptor activity"    "45"
    [27] "48"                                        "5: intracellular"
    [29] "50"                                        "58: cell cycle"
    [31] "6: catalytic activity"                     "64"
    [33] "65: serine-type peptidase activity"        "68"
    [35] "7: immune response"                        "70"
    [37] "71: MHC class I protein complex"           "73: nucleosome assembly"
    [39] "76: hemoglobin complex"                    "8: receptor activity"
    [41] "81"                                        "9"

    [[2]]
     [1] "1: G-protein coupled receptor activity" "10: mitochondrion"
     [3] "11: extracellular matrix"               "12"
     [5] "13"                                     "14: ribosome"
     [7] "15: cell cycle"                         "16: membrane fraction"
     [9] "2: nucleus"                             "3: G-protein coupled receptor activity"
    [11] "4: lymphocyte activation"               "5: multicellular organismal development"
    [13] "6: mitochondrion"                       "7: membrane"
    [15] "8"                                      "9"
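    The labelling step relies on tapply() with min to pick, for each module, the first row of the enrichment table, which is the top-ranked term when rows are ordered by p-value within each module. The following is a minimal sketch on a made-up table; the terms and p-values are hypothetical, the column names follow the code above, and base R paste0 stands in for spaste.

```r
# Hypothetical enrichment table: rows are grouped by module and, within a
# module, ordered by increasing p-value (as in the printouts above).
goAnn <- data.frame(
  module     = c("0", "0", "1", "1", "10", "10", "2", "2"),
  termName   = c("a", "b", "receptor activity", "c",
                 "proteasome complex", "d", "intracellular", "e"),
  BonferoniP = c(0.5, 0.9, 1e-12, 1e-8, 1e-6, 1e-3, 0.02, 0.3),
  stringsAsFactors = FALSE)

# Smallest row index within each module = row of the top-ranked term.
best <- as.vector(tapply(seq_len(nrow(goAnn)), goAnn$module, min))

# Drop the first group (module 0, the unassigned genes).
mods  <- goAnn$module[best][-1]
pvals <- goAnn$BonferoniP[best][-1]
labs  <- paste0(mods, ": ", goAnn$termName[best][-1])

# Modules whose best term misses the 1e-4 cut-off keep the bare number.
labs[pvals > 1e-4] <- mods[pvals > 1e-4]

print(labs)
```

    Because tapply() orders the groups as factor levels of character labels, module "10" sorts before module "2", which is consistent with the ordering seen in the printout of goLabels.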

    5 Modules related to physiological traits

    Here we identify modules related to physiological traits. We use the robust biweight midcorrelation to measure the association between each module eigengene and each trait. We consider the association significant if the absolute correlation is above 0.35, corresponding to a p-value of roughly 10^-5. Taking into account the number of modules (42 in liver) and the number of traits (19), this translates roughly to a Bonferroni-corrected p-value threshold of 10^-2. The following rather long section of code generates a list containing information about modules associated with each trait in each tissue.

    # Set up lists to hold the information

    bestModules = list();

    traitGeneColors = list();

    exprSize = checkSets(expr)

    nSamples = exprSize$nSamples;

    # Correlation thresholds

    thresholds = c(0.35, 0.35);

    nSelTraits = checkSets(selTraits)$nGenes;

    # Loop over tissues

    for (set in 1:nSets)

    {

    bestModules[[set]] = list();

    traitGeneColors[[set]] = list();

    modSizes = table(labels[[set]])[match(substring(names(MEs[[set]]$eigengenes), 3),

    names(table(labels[[set]])))]

    modColors = rep("grey", length(modSizes))

    modNumbers = as.numeric(substring(names(MEs[[set]]$eigengenes), 3))

    modColors[modNumbers!=0] = standardColors()[modNumbers[modNumbers!=0]]

    # Loop over traits

    for (t in 1:nSelTraits)

    {

    bestModules[[set]][[t]] = list();

    bestModules[[set]][[t]]$trait = colnames(selTraits[[set]]$data)[t];

    x = bicorAndPvalue(MEs[[set]]$eigengenes, selTraits[[set]]$data[, t])

    cors = x$bicor;

    pvals = x$p;

    # Put the p-values into a single data frame

    significance = data.frame(modSizes, cors, pvals, as.numeric(MEs[[set]]$varExplained),

    modColors);

    names(significance) = c("nGenes", spaste("r.", shortLabels[set]),

    spaste("p.", shortLabels[set]), spaste("PVE.", shortLabels[set]), "Color");

    rownames(significance) = names(MEs[[set]]$eigengenes);

    order = order(significance[, 3]);

    significant = significance[, 3] < 0.001;

    bestModules[[set]][[t]]$significance = significance[order, ];

    printSignif = significance;

    printSignif[, c(2:4)] = signif(significance[, c(2:4)], 3);

    bestModules[[set]][[t]]$printSignif = printSignif[order, ];

    bestModules[[set]][[t]]$bestSignif = printSignif[order, ][

    (abs(printSignif[order, 2]) > thresholds[set]), ]

    printFlush("\n==============================================================================\n");

    printFlush("Significance for", bestModules[[set]][[t]]$trait, ":");

    options(width = 100);

    print(bestModules[[set]][[t]]$bestSignif);

    moduleList1 = as.numeric(substring(rownames(bestModules[[set]][[t]]$bestSignif), 3))

    bestModules[[set]][[t]]$bestModules = moduleList1;

    bestModules[[set]][[t]]$nGenesInBestModules = sum(bestModules[[set]][[t]]$bestSignif$nGenes);

    }

    }

    # Save the results for future use

    save(bestModules, file = "Female-LA-bestModules.RData");
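    The Bonferroni arithmetic quoted at the start of this section can be checked with a few lines of base R. This is only a sketch of the stated numbers (42 liver modules, 19 traits, a single-test p-value of about 10^-5); the variable names are introduced here for illustration.

```r
nModules <- 42     # liver modules, as stated in the text
nTraits  <- 19     # traits, as stated in the text
pSingle  <- 1e-5   # approximate single-test p-value at |r| = 0.35

# Number of module-trait pairs that are tested.
nTests <- nModules * nTraits

# Bonferroni correction: multiply by the number of tests, cap at 1.
pBonferroni <- min(1, pSingle * nTests)

print(nTests)
print(pBonferroni)
```

    With 798 tests the corrected threshold is about 0.008, in line with the rough figure of 10^-2 given above.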

    For use in subsequent analysis, we also form a separate list of modules associated with each trait.

    keepModules = list();

    nKeep = rep(0, nSets);

    traitGeneColors = list();

    nSamples = exprSize$nSamples;

    nGenes = exprSize$nGenes;

    basePVal = 0.01;

    bestLabels = list();

    bestColors = list();

    for (set in 1:nSets)

    {

    z = qnorm(1-basePVal)/sqrt(nSamples[set]-3);

    baseCor = tanh(z);

    cor = bicor(expr[[set]]$data, selTraits[[set]]$data, use = "p");

    cor[abs(cor) < baseCor] = 0;

    traitGeneColors[[set]] = numbers2colors(cor, signed = TRUE);

    colnames(traitGeneColors[[set]]) = colnames(selTraits[[set]]$data);

    keepModules[[set]] = vector();

    bestLabels[[set]] = matrix(0, nGenes, nSelTraits);

    for (t in 1:nSelTraits)

    {

    keepModules[[set]] = c(keepModules[[set]], bestModules[[set]][[t]]$bestModules);

    keep = labels[[set]] %in% bestModules[[set]][[t]]$bestModules

    bestLabels[[set]][keep, t] = labels[[set]][ keep ];

    }

    keepModules[[set]] = sort(unique(keepModules[[set]]));

    nKeep[set] = length(keepModules[[set]]);

    bestColors[[set]] = labels2colors(bestLabels[[set]]);

    }

    hdlInd = match("e_hdl_mgdl", colnames(selTraits[[1]]$data));

    Which modules are associated with HDL? In liver we find the following:

    > bestModules[[1]][[hdlInd]]$bestSignif
         nGenes r.Liver  p.Liver PVE.Liver       Color
    ME6     666   0.564 3.14e-13     0.368         red
    ME64     40   0.470 4.02e-09     0.484    skyblue2
    ME11    379  -0.455 1.41e-08     0.333 greenyellow
    ME20    161   0.419 2.28e-07     0.378   royalblue
    ME10    409  -0.419 2.32e-07     0.338      purple
    ME16    231   0.398 1.04e-06     0.321   lightcyan
    ME21    145  -0.396 1.14e-06     0.368     darkred
    ME18    201  -0.385 2.39e-06     0.350  lightgreen

    In adipose, we only find one module:

    > bestModules[[2]][[hdlInd]]$bestSignif
        nGenes r.Adipose p.Adipose PVE.Adipose Color
    ME7    791     0.463  4.44e-10       0.424 black

    5.a Module-trait relationships for all modules that relate significantly to a trait

    Here we produce color-coded tables of module significance (defined as the robust correlation of the module eigengene with the trait) for modules and traits. We restrict the modules to those that relate significantly to at least one trait. We first calculate matrices holding the module significances and the corresponding p-values. We use the robust biweight midcorrelation to quantify module significance.

    nTraits = dim(selTraits[[1]]$data)[2];

    ordTraits = consensusOrderMEs(selTraits, greyLast = FALSE);

    TraitSignif = vector(mode="list", length = nSets);

    TraitCor = vector(mode="list", length = nSets);

    TraitLabels = colnames(ordTraits[[1]]$data);

    newTraitLabels = renameTable[ match(TraitLabels, renameTable[, 1]), 2];

    MELabels = list();

    for (set in 1:nSets)

    {

    MELabels[[set]] = colnames(ordMEs[[set]]$eigengenes);

    tmp = bicorAndPvalue(ordMEs[[set]]$eigengenes, ordTraits[[set]]$data)

    TraitSignif[[set]] = tmp$p

    TraitCor[[set]] = tmp$bicor

    }

    minp = 1; maxp = 0;

    for (set in 1:nSets)

    {

    minp = min(minp, TraitSignif[[set]]);

    maxp = max(maxp, TraitSignif[[set]]);

    }

    if (minp

    [Figure 7 image: heat map titled "Female Liver module-trait significance". Rows are the module eigengenes ME16, ME20: mitochondrion, ME64, ME6: catalytic activity, ME25, ME27: mitochondrial part, ME73: nucleosome assembly, ME19: extracellular matrix, ME7: immune response, ME58: cell cycle, ME70, ME21, ME26: cellular amino acid metabolic process, ME45, ME13, ME18: nucleus, ME10: proteasome complex, ME11, ME9, ME1: receptor activity, ME17; columns are the traits; the color scale runs from -0.5 to 0.5.]

    Figure 7: Module significance of selected liver modules for traits measured for this cross. Numbers in the table indicate the robust correlations and the corresponding p-values. The table is colored by correlation, with red representing positive correlation and green negative correlation.

    [Figure 8 image: heat map titled "Female Adipose module-trait significance". Rows are the module eigengenes ME7: membrane, ME9, ME11: extracellular matrix, ME4: lymphocyte activation, ME8, ME13, ME16: membrane fraction; columns are the traits; the color scale runs from -0.5 to 0.5.]

    Figure 8: Module significance of selected adipose modules for traits measured for this cross. Numbers in the table indicate the robust correlations and the corresponding p-values. The table is colored by correlation, with red representing positive correlation and green negative correlation.

    5.b Network plots of all module eigengenes and traits

    We now produce plots of networks composed of module eigengenes and traits in each tissue. The plots are too big to display comfortably on screen, but can be viewed using a pdf viewer, which will usually provide a zoom function. The eigengene network plot contains two panels, one with a dendrogram of eigengenes and traits, and the corresponding color-coded heatmap and correlation/p-value table.

    widths = c(20, 13)

    for (set in 1:nSets)

    {

    mets = list(a = list(data = cbind(MEs[[set]]$eigengenes, selTraits[[set]]$data)));

    colnames(mets$a$data) = c(colnames(MEs[[set]]$eigengenes), renameTable[, 2]);

    omets = consensusOrderMEs(mets);

    pdf(file = spaste("Plots/Female-", shortLabels[set], "-ME-selTraitNetworkHeatmaps.pdf"),

    width = widths[set], height = 2*widths[set]);

    plotEigengeneNetworks(omets, shortLabels[set], marDendro = c(0,2,2,2), zlimPreservation = c(0,1),

    marHeatmap = c(5,5,2,2), setMargins = TRUE,

    plotAdjacency = FALSE,

    printAdjacency = TRUE, cex.adjacency = 0.5)

    dev.off();

    }

    6 Output of module membership, eigengenes, and eigengene correlations

    In this section we output much of the network information into text csv files that can be viewed in MS Excel, OpenOffice Calc and other similar spreadsheet software. We begin with lists of samples used in network analysis and “expressions” of module eigengenes.

    # Samples that are used for network analysis

    for (set in 1:nSets)

    {

    samples = rownames(expr[[set]]$data);

    write.table(data.frame(samples), col.names = FALSE, row.names = FALSE, quote = FALSE,

    file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-networkSamples.txt"));

    }

    # Module eigengenes

    for (set in 1:nSets)

    {

    write.table(as.data.frame(cbind(Mice_id = rownames(expr[[set]]$data), MEs[[set]]$eigengenes)),

    col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,

    file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleEigengenes.csv"))

    }

    Next we output a table of module-trait associations.

    # Module-trait relationships

    moduleTraitRels = list();

    for (set in 1:nSets)

    {

    moduleTraitRels[[set]] = bicorAndPvalue(MEs[[set]]$eigengenes, selTraits[[set]]$data)

    outMat = rbind(moduleTraitRels[[set]]$bicor, moduleTraitRels[[set]]$p);

    dim(outMat) = c(ncol(MEs[[set]]$eigengenes), 2*nSelTraits)

    nameMat = matrix(cbind(spaste("bicor.", colnames(selTraits[[set]]$data)),

    spaste("p.", colnames(selTraits[[set]]$data))),

    2, nSelTraits, byrow = TRUE)

    colnames(outMat) = as.vector(nameMat);

    write.table(as.data.frame(cbind(Eigengene = colnames(MEs[[set]]$eigengenes), outMat)),

    col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,

    file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleTraitBicor.csv"))

    }
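    The code above interleaves correlation and p-value columns by stacking the two matrices with rbind() and then overwriting dim. The following is a minimal sketch of that idiom on made-up 2 x 3 matrices; the values and names are hypothetical, and base R paste0 stands in for spaste.

```r
# Hypothetical 2 modules x 3 traits correlation and p-value matrices.
bicorMat <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), 2, 3,
                   dimnames = list(c("ME1", "ME2"), c("t1", "t2", "t3")))
pMat <- matrix(c(0.91, 0.92, 0.93, 0.94, 0.95, 0.96), 2, 3)

# Stack correlations on top of p-values: a 4 x 3 matrix.
outMat <- rbind(bicorMat, pMat)

# Matrices are stored column by column, so reshaping to 2 x 6 makes the
# columns alternate: bicor for trait 1, p for trait 1, bicor for trait 2, ...
dim(outMat) <- c(2, 6)

# Build the matching alternating column names.
traits <- colnames(bicorMat)
nameMat <- matrix(cbind(paste0("bicor.", traits), paste0("p.", traits)),
                  2, length(traits), byrow = TRUE)
colnames(outMat) <- as.vector(nameMat)

print(outMat)
```

    Assigning to dim discards the old dimnames, which is why the column names are rebuilt afterwards.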

    Lastly, we output a large table of fuzzy module membership and gene significance for all traits. The module membership MM of a gene (probe), also known as module eigengene-based connectivity kME, is given by the correlation of the gene expression profile and the module eigengene. Similarly, gene significance for a trait is given by the correlation of the gene expression profile with the numeric trait.

    file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");

    #file = "../Data-CXB/CXB_all_gene_annotation.csv"

    annot = read.csv(file = file);

    for (set in 1:nSets)

    {

    # Calculate module membership, a.k.a kME

    KMEall = bicorAndPvalue(expr[[set]]$data, MEs[[set]]$eigengenes);

    KMEmod = rep(NA, nGenes)

    KMEmodP = rep(NA, nGenes)

    modLevels = sort(unique(labels[[set]]));

    nMods = length(modLevels);

    for (mod in 1:nMods)

    {

    inMod = labels[[set]]==modLevels[mod];

    # This assumes MEs[[set]]$eigengenes are sorted the same way as modLevels

    KMEmod[inMod] = KMEall$bicor[inMod, mod];

    KMEmodP[inMod] = KMEall$p[inMod, mod];

    }

    kmeMat = rbind(KMEall$bicor, KMEall$p);

    dim(kmeMat) = c(nGenes, 2*nMods);

    nameMat = matrix(cbind(spaste("k", colnames(MEs[[set]]$eigengenes)),

    spaste("p.k", colnames(MEs[[set]]$eigengenes))),

    2, nMods, byrow = TRUE)

    colnames(kmeMat) = as.vector(nameMat);

    # Connect probe names to gene names

    genes = colnames(expr[[set]]$data)

    expr2annot = match(genes, annot$sequence);

    annotInfo = annot[expr2annot, c(4,5,6,7,8,9)];

    # Calculate gene significance

    GS = bicorAndPvalue(expr[[set]]$data, selTraits[[set]]$data);

    GSmat = rbind(GS$bicor, GS$p);

    dim(GSmat) = c(nGenes, 2*nSelTraits);

    nameMat = matrix(cbind(spaste("GS.", colnames(selTraits[[set]]$data)),

    spaste("pGS.", colnames(selTraits[[set]]$data))),

    2, nSelTraits, byrow = TRUE)

    colnames(GSmat) = as.vector(nameMat);

    # Put it all together

    info = cbind(annotInfo,

    moduleLabel = labels[[set]],

    moduleColor = labels2colors(labels[[set]]),

    KME.labelModule = KMEmod,

    pKME.labelModule = KMEmodP,

    GSmat,

    kmeMat);

    # Save the big table into a text csv file

    write.table(info,

    col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,

    file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleMembership.csv"))

    collectGarbage();

    }
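The column interleaving used above for kmeMat and GSmat (stack two matrices with rbind, then reset dim) relies on R storing matrices column by column. The following is a minimal, self-contained sketch of that step with made-up numbers. The toy matrices `stat` and `pval` and the names `modA` and `modB` are hypothetical stand-ins and are not objects from the analysis.

```r
# Toy stand-ins for KMEall$bicor and KMEall$p: 3 genes x 2 modules (hypothetical values)
nGenes = 3;
nMods = 2;
stat = matrix(c(0.9, 0.8, 0.7, 0.1, 0.2, 0.3), nGenes, nMods);
pval = matrix(c(1e-9, 1e-8, 1e-7, 0.5, 0.6, 0.7), nGenes, nMods);
colnames(stat) = c("modA", "modB");
colnames(pval) = c("modA", "modB");

# Stack the two matrices, then reinterpret the same data as nGenes x 2*nMods.
# Because R stores matrices column by column, the resulting columns are
# stat[, 1], pval[, 1], stat[, 2], pval[, 2].
both = rbind(stat, pval);
dim(both) = c(nGenes, 2*nMods);

# Build column names in the same interleaved order
nameMat = rbind(paste0("k", colnames(stat)), paste0("p.k", colnames(pval)));
colnames(both) = as.vector(nameMat);

print(both);
```

Here `rbind` on the two name vectors produces the same 2 x nMods name matrix that the main code builds with `matrix(cbind(...), 2, nMods, byrow = TRUE)`. Setting `dim` discards any existing dimnames, which is why the column names are assigned afterwards.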

    7 Overlap of liver and adipose modules

    We now produce a color-coded overlap table of liver and adipose modules.

    # Call the overlapTable function to calculate the overlaps

    overlap = overlapTable(labels[[1]], labels[[2]]);

    # Prepare axis labels for the table plot

    modSizes = lapply(labels, table);

    xLabels = spaste("A.", sort(unique(labels[[2]])), " (", modSizes[[2]], ")");

    yLabels = spaste("L.", sort(unique(labels[[1]])), " (", modSizes[[1]], ")");

    # Content of the table

    textMat = spaste(overlap$countTable, "|", signif(overlap$pTable, 1));

    mat = overlap$pTable;

    mat[mat

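    [Heatmap image: "Overlap of adipose and liver modules", with a color scale running from 0 to 60. The text shown in the heatmap cells is reproduced below as count|p-value, one liver module per row and one adipose module per column.]

    Columns, left to right (adipose module, number of probes): A.0 (6075), A.1 (5400), A.2 (3031), A.3 (1804), A.4 (1559), A.5 (1151), A.6 (809), A.7 (791), A.8 (442), A.9 (519), A.10 (490), A.11 (335), A.12 (358), A.13 (311), A.14 (246), A.15 (239), A.16 (63)

    Rows, top to bottom (liver module, number of probes): L.0 (9910), L.1 (2821), L.2 (1262), L.3 (1283), L.4 (909), L.5 (777), L.6 (666), L.7 (637), L.8 (522), L.9 (435), L.10 (409), L.11 (379), L.12 (343), L.13 (323), L.14 (268), L.15 (257), L.16 (231), L.17 (190), L.18 (201), L.19 (193), L.20 (161), L.21 (145), L.22 (135), L.23 (128), L.24 (126), L.25 (92), L.26 (87), L.27 (80), L.29 (84), L.33 (63), L.38 (58), L.45 (50), L.48 (45), L.50 (51), L.58 (42), L.64 (40), L.65 (38), L.68 (32), L.70 (33), L.71 (33), L.73 (31), L.76 (28), L.81 (25)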

    3439|1e−157 1679|1 1187|1 613|1 588|1 651|1e−24 323|0.9 369|0.004 97|1 211|0.7 217|0.2 152|0.1 86|1 107|1 65|1 107|0.2 19|1

    287|1 2018|0 26|1 64|1 45|1 56|1 34|1 35|1 119|4e−18 7|1 8|1 7|1 65|4e−04 34|0.7 7|1 3|1 6|0.8

    156|1 47|1 729|0 22|1 41|1 28|1 31|1 20|1 1|1 94|7e−26 59|5e−09 16|0.7 1|1 2|1 7|1 8|0.9 0|1

    159|1 107|1 21|1 781|0 46|1 33|1 18|1 20|1 37|0.006 6|1 5|1 3|1 11|1 20|0.2 9|0.9 4|1 3|0.7

    306|3e−08 328|2e−20 122|0.3 29|1 16|1 27|1 5|1 21|1 13|0.9 6|1 10|1 4|1 10|0.9 2|1 5|1 4|1 1|0.9

    144|1 30|1 375|2e−133 18|1 42|0.9 22|1 35|0.06 21|0.9 2|1 30|0.002 33|9e−05 8|0.9 1|1 2|1 6|0.8 8|0.5 0|1

    139|1 88|1 85|0.5 21|1 26|1 34|0.4 77|6e−21 68|3e−16 8|0.9 33|1e−05 49|2e−14 17|0.01 3|1 4|1 5|0.8 8|0.4 1|0.8

    146|1 35|1 41|1 8|1 258|1e−138 27|0.8 10|1 43|1e−05 1|1 19|0.1 3|1 21|3e−04 2|1 1|1 6|0.7 15|0.002 1|0.8

    136|0.4 242|6e−33 33|1 15|1 13|1 18|1 8|1 3|1 11|0.4 1|1 8|0.9 3|1 18|0.001 6|0.7 2|1 5|0.6 0|1

    62|1 145|3e−07 9|1 26|0.9 14|1 11|1 6|1 10|0.9 33|1e−11 0|1 1|1 0|1 23|3e−07 89|4e−81 2|0.9 2|0.9 2|0.3

    116|0.1 29|1 61|0.1 18|1 46|3e−04 32|0.006 39|1e−08 15|0.4 4|0.9 6|0.9 14|0.05 7|0.4 3|0.9 1|1 10|0.01 6|0.2 2|0.3

    103|0.3 74|0.9 26|1 8|1 46|5e−05 20|0.4 28|1e−04 22|0.009 7|0.6 10|0.3 15|0.01 6|0.5 2|1 5|0.6 5|0.4 2|0.9 0|1

    81|0.8 109|8e−05 51|0.1 25|0.6 18|0.9 13|0.9 4|1 6|1 9|0.2 5|0.9 6|0.7 5|0.5 3|0.9 3|0.8 1|1 2|0.9 2|0.2

    99|0.03 29|1 7|1 43|3e−04 107|1e−46 14|0.7 2|1 2|1 8|0.3 2|1 0|1 1|1 3|0.9 3|0.8 1|1 2|0.8 0|1

    66|0.7 16|1 74|7e−11 6|1 32|9e−04 9|0.9 11|0.3 8|0.7 0|1 18|3e−05 15|5e−04 5|0.3 1|1 2|0.9 4|0.3 1|0.9 0|1

    22|1 41|1 2|1 7|1 6|1 7|1 1|1 2|1 48|9e−34 1|1 0|1 1|1 107|3e−129 8|0.02 2|0.8 0|1 2|0.1

    59|0.5 19|1 19|1 3|1 35|4e−06 30|1e−06 12|0.1 21|4e−05 0|1 7|0.2 11|0.009 7|0.05 1|1 0|1 1|0.9 5|0.09 1|0.5

    19|1 90|9e−14 4|1 33|7e−06 9|0.9 4|1 5|0.8 1|1 14|1e−05 0|1 0|1 1|0.9 4|0.3 4|0.2 2|0.6 0|1 0|1

    74|3e−04 17|1 6|1 7|1 56|7e−21 10|0.5 6|0.7 4|0.9 5|0.3 2|0.9 1|1 2|0.8 2|0.8 1|0.9 5|0.06 1|0.9 2|0.1

    32|1 9|1 15|1 1|1 24|0.002 24|2e−05 2|1 27|3e−10 0|1 17|1e−06 3|0.8 31|1e−23 0|1 0|1 2|0.6 6|0.01 0|1

    40|0.6 24|1 18|0.8 0|1 10|0.6 7|0.7 22|3e−08 12|0.008 3|0.6 9|0.009 7|0.05 3|0.4 3|0.4 2|0.6 0|1 1|0.8 0|1

    49|0.02 11|1 19|0.5 3|1 13|0.2 13|0.03 12|0.004 6|0.4 0|1 2|0.8 4|0.4 1|0.9 4|0.2 6|0.01 0|1 2|0.4 0|1

    9|1 22|1 4|1 6|1 1|1 1|1 0|1 1|1 0|1 0|1 0|1 0|1 0|1 0|1 91|1e−153 0|1 0|1

    29|0.8 9|1 34|2e−05 1|1 4|1 6|0.6 9|0.03 7|0.1 0|1 15|1e−07 6|0.05 3|0.3 0|1 0|1 0|1 5|0.01 0|1

    26|0.9 3|1 4|1 5|1 9|0.5 21|7e−07 1|1 16|5e−06 1|0.9 2|0.8 1|0.9 13|3e−08 0|1 0|1 1|0.7 2|0.4 21|7e−33

    33|0.02 17|0.9 8|0.9 2|1 2|1 4|0.7 10|0.001 7|0.03 3|0.2 2|0.6 2|0.6 0|1 0|1 1|0.7 0|1 1|0.6 0|1

    24|0.4 22|0.3 6|1 4|0.9 8|0.2 3|0.8 10|8e−04 2|0.8 2|0.5 2|0.6 2|0.5 0|1 1|0.7 1|0.7 0|1 0|1 0|1

    9|1 8|1 6|1 2|1 0|1 1|1 40|3e−37 5|0.1 2|0.4 2|0.5 4|0.08 0|1 0|1 0|1 1|0.6 0|1 0|1

    36|5e−04 13|1 2|1 5|0.8 12|0.009 4|0.6 5|0.2 1|0.9 3|0.2 0|1 0|1 0|1 1|0.7 2|0.3 0|1 0|1 0|1

    19|0.2 7|1 8|0.6 2|1 5|0.4 4|0.4 2|0.6 1|0.9 0|1 3|0.2 1|0.7 2|0.2 1|0.6 1|0.6 3|0.03 4|0.004 0|1

    21|0.05 17|0.2 6|0.8 2|0.9 0|1 3|0.5 5|0.05 1|0.9 1|0.7 0|1 0|1 1|0.6 1|0.6 0|1 0|1 0|1 0|1

    21|0.009 4|1 7|0.5 2|0.9 2|0.9 7|0.01 2|0.5 0|1 0|1 1|0.7 0|1 1|0.5 0|1 1|0.5 1|0.4 1|0.4 0|1

    11|0.6 13|0.2 0|1 2|0.9 2|0.8 2|0.7 10|2e−06 0|1 3|0.05 0|1 1|0.6 0|1 0|1 1|0.4 0|1 0|1 0|1

    17|0.1 4|1 4|0.9 4|0.6 1|1 2|0.7 2|0.5 2|0.5 0|1 2|0.3 1|0.7 11|1e−10 0|1 0|1 0|1 1|0.4 0|1

    4|1 4|1 1|1 1|1 4|0.3 1|0.9 0|1 2|0.4 0|1 0|1 0|1 0|1 0|1 0|1 1|0.4 24|1e−37 0|1

    8|0.8 24|5e−07 1|1 1|1 2|0.8 1|0.9 0|1 1|0.7 1|0.5 1|0.6 0|1 0|1 0|1 0|1 0|1 0|1 0|1

    8|0.8 20|6e−05 2|1 0|1 3|0.5 0|1 0|1 2|0.4 2|0.2 1|0.6 0|1 0|1 0|1 0|1 0|1 0|1 0|1

    8|0.6 14|0.007 1|1 0|1 3|0.4 0|1 0|1 1|0.7 2|0.1 0|1 1|0.5 0|1 1|0.4 1|0.3 0|1 0|1 0|1

    9|0.5 2|1 1|1 0|1 1|0.9 0|1 18|2e−18 1|0.7 0|1 1|0.5 0|1 0|1 0|1 0|1 0|1 0|1 0|1

    20|2e−05 0|1 3|0.8 0|1 6|0.02 0|1 1|0.7 2|0.3 0|1 0|1 0|1 1|0.4 0|1 0|1 0|1 0|1 0|1

    5|0.9 3|1 3|0.8 0|1 0|1 0|1 3|0.09 2|0.3 0|1 1|0.5 2|0.1 1|0.4 0|1 1|0.3 1|0.3 9|2e−11 0|1

    20|5e−07 4|0.9 0|1 0|1 2|0.6 0|1 0|1 1|0.6 0|1 0|1 0|1 1|0.3 0|1 0|1 0|1 0|1 0|1

    4|0.9 3|0.9 0|1 14|4e−10 1|0.8 1|0.7 0|1 0|1 2|0.08 0|1 0|1 0|1 0|1 0|1 0|1 0|1 0|1

    Figure 9: Overlap of liver (y-axis) and adipose (x-axis) modules. Each row corresponds to a liver module indicated on the left by name, color, and number of probes in the module. Conversely, each column corresponds to an adipose module indicated at the bottom. Numbers in the table indicate the number of probes in the overlap and the corresponding Fisher exact p-value. The table is colored according to − log10 p, with the color scale indicated on the right. The large modules 1–4, and “module” 0, overlap very strongly between the tissues. Some other, smaller modules also show strong overlaps, but HDL-related modules overlap more weakly with modules in the opposite tissue.
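As a rough illustration of how a single count|p-value cell of such a table can be obtained, the sketch below counts the overlap of one module from each of two label vectors and tests it with a one-sided Fisher exact test from base R. The label vectors and the helper `overlapCell` are invented for this example. The choice of `fisher.test` with `alternative = "greater"` is an assumption, since the internals of overlapTable are not shown in this document.

```r
# Hypothetical module labels for the same 12 probes in two networks
labels1 = c(1, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0, 0);
labels2 = c(1, 1, 1, 2, 2, 2, 2, 1, 0, 0, 0, 0);

# Overlap of module m1 in network 1 with module m2 in network 2
# (overlapCell is a hypothetical helper, not a WGCNA function)
overlapCell = function(l1, l2, m1, m2)
{
  in1 = (l1 == m1);
  in2 = (l2 == m2);
  # 2x2 contingency table: membership in m1 vs. membership in m2
  tab = table(factor(in1, levels = c(TRUE, FALSE)),
              factor(in2, levels = c(TRUE, FALSE)));
  # One-sided test: is the overlap larger than expected by chance?
  p = fisher.test(tab, alternative = "greater")$p.value;
  list(count = sum(in1 & in2), p = p);
}

cell = overlapCell(labels1, labels2, 1, 1);
print(paste(cell$count, "|", signif(cell$p, 1)));

# A pair of modules that share no probes
emptyCell = overlapCell(labels1, labels2, 1, 0);
```

Fixing the factor levels keeps the table 2 x 2 even when one of the four combinations is empty, so the same helper works for every pair of modules.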


    8 Gene significance and module membership in HDL-related modules are correlated

    Here we show that in HDL-related modules, highly connected genes (referred to as intramodular hub genes) also tend to have high gene significance for HDL. We use the function verboseScatterplot to plot annotated scatterplots of gene significance (GS) vs. module membership (also known as eigengene-based connectivity kME).

    hdlInd = match("e_hdl_mgdl", colnames(selTraits[[1]]$data));

    sizeGrWindow(7, 8);

    #pdf(file = "Plots/Female-LA-HubgeneSignifForHDL.pdf", width = 7, height = 8);

    par(mfrow = c(3,3));

    par(mar = c(3.5, 3.5, 4, 0.5));

    par(mgp = c(1.8, 0.6, 0));

    for (set in 1:nSets)

    {

    # Select only modules related to HDL

    moduleList1 = bestModules[[set]][[hdlInd]]$bestModules;

    for (mod in moduleList1) # For each module...

    {

    # Find the module in the eigengenes

    modGeneInd = (labels[[set]] == mod);

    meInd = match(paste("ME", mod, sep=""), names(MEs[[set]]$eigengenes));

    # Calculate GS, KME, and module eigengenes significance (MES)

    nModGenes = sum(modGeneInd);

    KME = bicor(expr[[set]]$data[, modGeneInd], MEs[[set]]$eigengenes[, meInd], use = 'p');

    GS = bicor(expr[[set]]$data[, modGeneInd], selTraits[[set]]$data[, hdlInd], use = 'p');

    MS = bicor(MEs[[set]]$eigengenes[, meInd], selTraits[[set]]$data[, hdlInd], use = 'p');

    # Plot GS vs. kME

    verboseScatterplot(KME, GS,

    main = paste(shortLabels[set], mod, standardColors()[mod], "\nMES =",

    signif(MS, 2), "\n"),

    xlab = paste("kME in", shortLabels[set]),

    ylab = paste("GS.HDL in", shortLabels[set]), abline = TRUE, cex.lab = 1.2,

    cex.main = 1.2, cex.axis = 1.2);

    }

    }

    # If plotting into a file, close it.

    dev.off();

    The result is shown in Figure 10. We observe that GS.HDL and kME are strongly correlated, that is, hub genes in HDL-related modules also tend to be strongly related to HDL.
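The relationship between kME and GS can be reproduced on simulated data. The sketch below is only an illustration under stated assumptions: all data are simulated, the names `simExpr`, `strength`, and `hubCor` are invented for this example, and the Pearson correlation `cor` from base R is used in place of bicor so that the code runs without WGCNA.

```r
set.seed(1);
nSamples = 200;
nGenes = 50;

# Simulated module eigengene and a trait that is partly driven by it
ME = rnorm(nSamples);
trait = ME + rnorm(nSamples);

# Simulated genes: each follows the eigengene with a different strength
strength = seq(0.2, 2, length.out = nGenes);
simExpr = sapply(strength, function(a) a * ME + rnorm(nSamples));

# Module membership (kME) and gene significance (GS), Pearson version
kME = as.vector(cor(simExpr, ME));
GS = as.vector(cor(simExpr, trait));

# Genes with high kME (hub genes) also tend to have high GS
hubCor = cor(kME, GS);
print(signif(hubCor, 2));
```

In this simulation the trait depends on the genes only through the eigengene, so a gene's GS is roughly its kME scaled by the eigengene-trait correlation, which is what produces the positive trend.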

    [Figure 10 image: scatterplots of gene significance for HDL (y-axis, "GS.HDL in Liver") against module membership (x-axis, "kME in Liver"), one panel per HDL-related module. Panel annotations recovered from the image:]

    Liver 6 red, MES = 0.56; cor=0.62, p=5.6e−72

    Liver 64 skyblue2, MES = 0.47; cor=0.6, p=4.3e−05

    Liver 11 greenyellow, MES = −0.46; cor=−0.53, p=7.8e−29

    Liver 20 royalblue, MES = 0.42; cor=0.36, p=2.7e−06