functional networks inference from rule-based …ico2s.org/data/papers/lazzarini2016.pdffunctional...

36
Lazzarini et al. METHODOLOGY Functional networks inference from rule-based machine learning models Nicola Lazzarini 1 , Pawe l Widera 1 , Stuart Williamson 2 , Rakesh Heer 3 , Natalio Krasnogor 1 and Jaume Bacardit 1* Abstract Background: Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables, that often are different/complementary to the similarity-based methods. Results: We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes, that genes used together within a rule-based machine learning model to classify the samples, might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods, and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. Availability: The implementation of our network inference protocol is available at: http://ico2s.org/software/funel.html Keywords: machine learning; biological knowledge extraction; network inference; functional networks Background The inference of biological networks is a highly relevant and challenging task in systems biology and integra- tive bioinformatics. Biological networks are graphs in * Correspondence: [email protected] 1 Interdisciplinary Computing and Complex BioSystems (ICOS) research group, School of Computing Science, Newcastle University, Newcastle upon Tyne, UK Full list of author information is available at the end of the article which nodes represent genes or proteins, and a con- nection between them indicates some kind of biologi- cal relationship, e.g. regulatory or functional. The net- work inference is, in an essence, an attempt to reverse engineer the biological relationships from the high- throughput biological data [1]. Most biological network inference methods focus on the definition of gene regulatory networks, in which edges represent direct regulatory interactions between

Upload: others

Post on 01-Jun-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Lazzarini et al.

METHODOLOGY

Functional networks inference from rule-basedmachine learning modelsNicola Lazzarini1, Pawe l Widera1, Stuart Williamson2, Rakesh Heer3, Natalio Krasnogor1 and JaumeBacardit1*

Abstract

Background: Functional networks play an important role in the analysis of biological processes and systems.The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, thesimilarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumesa functional relationship between genes which are expressed at similar levels across different samples. Analternative to this paradigm is the inference of relationships from the structure of machine learning models.These models are able to capture complex relationships between variables, that often aredifferent/complementary to the similarity-based methods.

Results: We propose a protocol to infer functional networks from machine learning models, called FuNeL. Itassumes, that genes used together within a rule-based machine learning model to classify the samples, mightalso be functionally related at a biological level. The protocol is first tested on synthetic datasets and thenevaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from thereal-world data are compared against gene co-expression networks of equal size, generated with 3 differentmethods. The comparison is performed from two different points of view. We analyse the enriched biologicalterms in the set of network nodes and the relationships between known disease-associated genes in a context ofthe network topology. The comparison confirms both the biological relevance and the complementary characterof the knowledge captured by the FuNeL networks in relation to similarity-based methods, and demonstratesits potential to identify known disease associations as core elements of the network. Finally, using a prostatecancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant tothe disease and consistent with the specialised literature and with an independent dataset not used in theinference process.

Availability: The implementation of our network inference protocol is available at:http://ico2s.org/software/funel.html

Keywords: machine learning; biological knowledge extraction; network inference; functional networks

Background

The inference of biological networks is a highly relevantand challenging task in systems biology and integra-tive bioinformatics. Biological networks are graphs in

*Correspondence: [email protected]

1 Interdisciplinary Computing and Complex BioSystems (ICOS) research

group, School of Computing Science, Newcastle University, Newcastle upon

Tyne, UK

Full list of author information is available at the end of the article

which nodes represent genes or proteins, and a con-nection between them indicates some kind of biologi-cal relationship, e.g. regulatory or functional. The net-work inference is, in an essence, an attempt to reverseengineer the biological relationships from the high-throughput biological data [1].

Most biological network inference methods focus onthe definition of gene regulatory networks, in whichedges represent direct regulatory interactions between

Lazzarini et al. Page 2 of 16

NETWORK INFERENCE OUTPUTINPUT

X

Y

SamplesX Y

similarity (X,Y) > thresholdX

Y

X Y

X

Y

Classification rules-omics data

X Y Pheno.

Phenotype information

Cancer Normal

XY

YX

Machine learning model Knowledge extraction

Edge inferenceProfile similarity

If X > 0.23 and Y > 0.55: Cancer

Fig 1 Two approaches to functional network inference: one based on the expression profile similarity and the other based on theextraction of knowledge from machine learning models. The similarity-based methods construct a new network edge X ↔ Y , whenthe similarity between the expressions of genes X and Y across the samples is above a threshold. Methods based on machinelearning, first build a predictive model, in this example a rule-based model, using the samples phenotype information (class labels)and then construct a network edge X ↔ Y , when genes X and Y are used together within that model to classify the samples. Asthese two approaches lead to different functional networks, it is possible that they capture complementary knowledge.

genes [2–4]. Far less effort has been put into the de-sign of methods to build functional networks in whicha connection indicates a functional relationship, e.g.membership in the same pathway or protein complex.One of the typical uses of these networks is the iden-tification of functional modules (subset of genes withmultiple internal connections and a few connectionswith genes outside the module that describe, explainor predict a biological process or phenotype.).

One of the earliest (but still widely used) approach toinfer functional networks is the ”guilt-by-association”principle [5]. That is, if two genes show similar expres-sion profiles, it is assumed they are also functionallyrelated (via a direct or indirect interaction). Initially,this paradigm was applied to infer networks from tran-scriptomics data, and this is why in most of the liter-ature it is known as the co-expression network infer-ence principle. Nevertheless, it is abstract enough tobe applied to all kinds of biological data. It has beendemonstrated that co-expression networks are able toeffectively identify pathways and candidate biomark-ers [6] or reveal gene modules representing a biologicalprocess perturbed in a disease [7], just to name a fewexamples, and the similarity-based approach remainsthe dominant method of functional network inferencetoday, with many recent examples: [8–12].

A different approach that is recently gaining popu-larity, is the use of machine learning techniques toinfer biological networks. Due to the wide range of

knowledge representations used within machine learn-ing methods (e.g. classification rules, decision trees, ar-tificial neural networks, SVM kernels, etc.), they candiscover more complex and diverse relationships, andovercome the limitations of the similarity-based meth-ods. This is possible since within machine learningmodels the attributes are associated not because theyare similar (e.g. have similar expression profiles), butbecause together they detect strong patterns. In addi-tion, if learning is supervised, it can take advantage ofthe additional phenotype information (class labels ofthe samples, e.g. case and control) available with thedata. Therefore, by mining the complex machine learn-ing models, it should be possible to uncover new anddifferent (biological) knowledge, that is likely to escapethe traditional approaches. Figure 1 illustrates thesedifferences between the two approaches (similarity-based methods vs. knowledge extraction from the ma-chine learning models).

Alternative strategies exist to infer networks usingmachine learning. One approach is to train machinelearning models that directly predict network edges[13], but this process requires an experimentally ver-ified ”ground truth” of known interactions and suit-able controls. A different approach, which is the focusof this work, is to generate machine learning modelsfrom the biological data and then mine the structureof the models to infer networks. Several types of ma-chine learning have been successfully applied to this

Lazzarini et al. Page 3 of 16

task: unsupervised learning in the form of associationrules [14], supervised learning using regression (modeltrees [15]) or classification (random forest [16]).

The specific focus of this paper is the network infer-ence from rule-based machine learning models. Suchmodels have been successfully applied before to ex-tract knowledge from genetic data [17] and identifydisease risk factors in a bladder cancer study [18]. Themethods presented in these works share some pipelinecomponents with our current work, such as the per-mutation test and a 2-phase learning strategy. In ourprevious works we applied rule-based machine learningto transcriptomics [19, 20], proteomics [21], lipidomics[22] and protein structure data [23]. We formulateda paradigm called co-prediction (in opposition to theclassic co-expression) in which the prediction rules ofa classification algorithm, in our case BioHEL [24], areused to identify relationships between genes.

Co-prediction is based on the assumption that at-tributes (e.g. genes) within the same classificationrules, due to their co-operation in predicting the sam-ple class, have an increased likelihood of being func-tionally related to the biological process in question(Figure 2). Differently than co-expression, the co-prediction approach exploits the phenotype informa-tion of the data (class labels) to detect functional re-lations.

Fig 2 Co-prediction paradigm. Association between the genesis inferred from their co-occurrence in classification rules.

However, from a methodological perspective, manyquestions remained unanswered. Can the co-predictionapproach identify known genetic relationships? Howcan we quantify the biological significance of the co-prediction networks? What is the impact of data pre-processing on the generated networks? Is this method-ology able to capture knowledge that escapes other

methods? Are the discovered functional relationshipsmeaningful in the human disease context?

To address these questions, we propose in this arti-cle a new network inference protocol, called FuNeL(Functional Network Learning). FuNeL substantiallyextends our previous work [19] by incorporating: (1)statistical filtering of inferred functional relationshipsvia permutation tests, (2) a multi-stage network gener-ation to maximise the knowledge extraction, and (3) aconfigurable feature selection stage to control the sizeof the generated networks.

We first tested FuNeL’s ability to correctly iden-tify functional relationships using a set of syntheticdatasets. Then, we evaluated FuNeL on 8 real-worldtranscriptomics datasets related to different types ofcancer. For each dataset we tested 4 different configu-rations of the protocol and compared the inferred net-works to co-expression networks of equivalent size. Inorder to have an extensive evaluation of our approach,we employed 3 different methods to generate co-expression networks. We systematically looked at thedifferences between co-prediction and co-expressionnetworks from two points of view: (1) the enriched bi-ological terms and (2) the relationships between thegenes known to be associated with a particular type ofcancer. Finally, we used a prostate cancer dataset asa case study and performed a more detailed biologicalanalysis of the enriched terms and the disease relatedgenes. We looked at the largest hubs and the mostcentral nodes in the prostate cancer co-prediction net-works and studied their involvement in the disease. Wefound literature support for the association betweenthese topologically important genes and prostate can-cer, and we further confirmed it with an independenttranscriptomics dataset (not used as a source in the in-ference process). Overall, we found that the FuNeL in-ferred networks: (1) capture relevant biological knowl-edge that is complementary to the knowledge capturedby different co-expression networks, and (2) more ad-equately represent the relationships between genes as-sociated with the disease targeted by each dataset.

Materials and Methods

In this section we describe the proposed network in-ference protocol, the datasets from which we inferredthe networks and the experimental design we used toevaluate it.

The functional network inference protocol

The stages of the co-prediction inference protocol areillustrated in Figure 3. Two of these stages are optional

Lazzarini et al. Page 4 of 16

(1 and 4), they lead to a total of 4 different protocolconfigurations. If the first optional stage (feature se-lection) is performed, the original dataset is reducedto the most relevant attributes. In the second stage arule-based machine learning is used to infer a network.This network is statistically refined in Stage 3, in whicha permutation test is used to filter out non-significantnodes. The final stage, in which the network generationis repeated for the second time, is again optional. Acomplete time complexity analysis of the FuNeL pro-tocol is available in Section 2 of the SupplementaryMaterial.

Fig 3 Stages of the functional network inference protocol.

Feature selection (stage 1) When datasets contain alarge number of attributes, some might be irrelevantto the prediction target and discarding them helps theclassification algorithm to focus its learning effort onthe attributes that matters. Therefore, the feature se-lection is the first stage of the inference process. Topick the relevant attributes we used the support vec-tor machine recursive feature elimination (SVM-RFE)[25]. We opted for the SVM algorithm with a linear

kernel as our preliminary studies suggested that it caneliminate as much as 90% of the original dataset at-tributes, without losing much of the classification ac-curacy (see Section 1 in Supplementary Material).

Rule-based network inference (stage 2) To infer therule-based classification models we used BioHEL [24].It generates sets of classification rules using a geneticalgorithm and is able to work with large datasets. Dueto the stochastic nature of BioHEL’s learning process,each of its runs generates a different rule set. We lever-age this fact by creating a large number of alterna-tive hypotheses of functional relationships via multi-ple runs of the algorithm. For each dataset we runBioHEL 10 000 times and infer the network from theconsensus of all the generated rule sets. To do that, weuse all the pairs of attributes that appear together inthe same classification rule as the network edges (co-prediction paradigm). Then, we score each networknode (attribute) by counting how many times it hasbeen used in the rules (node score).

Permutation test (stage 3) Given a list of edges(attribute-attribute associations) extracted from therule sets, we try to filter out the non-significant nodes.To determine the node significance, we follow a statis-tical analysis procedure based on a permutation test,similar to the one described in [17]. We generate 100permutated datasets by randomly shuffling the classlabels. Next, we infer the co-prediction networks (asin Stage 2) from these permutated datasets. Then,for each node, we calculate a distribution of scoresacross the 100 networks generated from the permu-tated datasets. Using a one-tailed permutation test, weassign to each node a p-value, to estimate how likelyit is to draw its score from the calculated distribution.With this process we make sure that the nodes withhigh scores are really tied to the classes present in thedata, and that the network truly represents functionalrelationships. To decide if a node is statistically signif-icant we use a typical α = 0.05 threshold.

After preliminary experiments we realised, that usingsignificant nodes alone leads to small and dense net-works. To counter that, we relaxed the node pruning toalso keep all direct neighbours of the significant nodes.

Network construction (stage 4) There are two waysto interpret the result of the statistical test (option 2in Figure 3). The first approach is to use the significantnodes as a filter for the inferred relationships (edges)and remove all the edges between two non-significantnodes. The second approach is to use the permuta-tion test as a further feature selection and build a new

Lazzarini et al. Page 5 of 16

rule-based machine learning model using only the sig-nificant nodes. This second run of the learning algo-rithm is then focused only on the statistically impor-tant genes and creates the final network.

Protocol configurations As a result of two indepen-dent optional stages in the FuNeL protocol, there are4 different configurations that it can run with (see Ta-ble 1). We decided to test them all and infer four net-works from each dataset, one per configuration.

Table 1 Protocol configurations used in the experiments.

Config. Description

C1 reduced dataset + 1 stage of network generationC2 original dataset + 1 stage of network generationC3 reduced dataset + 2 stages of network generationC4 original dataset + 2 stages of network generation

Datasets

Synthetic datasets

To verify if FuNeL is able to correctly identify func-tional relationships we tested it on a set of syntheticdatasets. Although there are several generators thatmodel expression data with genetic relationships, suchas GNW used in several DREAM challenges [26], theygenerate unlabeled samples (without phenotype infor-mation, e.g. case vs. control) and the class labels arenecessary to perform the supervised learning at thecore of FuNeL.

For that reason, we decided to use GAMETES instancegenerator [27], designed to create genetic datasets withmulti-locus disease associations, where no fewer thann loci can predict a phenotype (disease status). GA-METES generates genotype data (rather than gene ex-pression data) based on models with specific geneticconstraints, e.g. different heritabilities or frequenciesof the SNPs.

To generate the synthetic datasets, we used a set of2-locus configurations similar to what was employedin a recent work of Li et al. [28] to evaluate permutedrandom forest networks of gene interactions. Specifi-cally, the genetic models varied in terms of heritability(0.001-0.4) and number of attributes (5-25), with fixedallele frequency of 0.2 and 2000 samples per dataset.For each configuration, we selected from 100 000 ran-dom models, two models with extreme value of the easeof detection metric (EDM) (the least and the most dif-ficult). Finally, for each selected model we generated50 datasets, obtaining 4000 datasets in total.

Real-world datasets

We used 8 publicly available human cancer microar-ray datasets (see Table 2). These datasets representa broad range of characteristics in terms of biologi-cal information (different types of cancers), number ofsamples (patients) and number of attributes (genes).For each dataset the attributes were defined by theprobes used in the microarray experiment. Generally,a gene can be represented by more than one probeand extra post-processing step is needed to merge theinformation and generate networks where nodes trulyrepresent genes. We used MADGene [29] to map theAffymetrix probe IDs into HUGO gene IDs, then forall probes mapped to the same gene, we merged theprobes and their connections. If a probe was unmappedit was removed from the network.

Table 2 Description of the source datasets used to infernetworks.

Name Attributes Samples Class labels

Dlbcl [30] 2647 77 Dlbcl; Follicular lymphomaCNS [31] 7129 60 Survivor; FailuresLeukemia [32] 7129 72 AML; ALLLung-Michigan [33] 7129 96 Tumor; NormalLung-Harvard [34] 12534 181 Mesothelioma; ADCAProstate [35] 12600 102 Tumor; NormalAML [36] 12625 54 Remission; RelapseColon-Breast [37] 22283 52 Colon cancer; Breast cancer

While in this instance we focused on transcriptomicsdatasets only, the FuNeL protocol is general and canbe applied to other types of biological data too (pro-teomics, lipidomics, etc.).

Co-expression networks

In this paper we are comparing our FuNeL networksagainst co-expression networks. The co-expressionparadigm identifies similarity of gene expression pat-tern under different experimental conditions. Co-expression edges are an abstraction of functional rela-tionships between genes and do not represent physicalbinding as in protein interaction or gene regulatorynetworks. Two genes are considered to be function-ally related (co-expressed), if their transcript levelsare similar across a set of samples.

In here we employed three well known methods to inferco-expression networks, each one uses a different met-ric to assess gene expressions similarity: Pearson cor-relation coefficient, ARACNE [2] and MIC [38]. In thefollowing subsections we briefly present those methods,for more details check the cited original papers.

Pearson correlation coefficient

Pearson’s correlation coefficient (PCC) is a well knownmeasure of linear dependence between two variables.

Lazzarini et al. Page 6 of 16

Applied to gene expression profiles, it measures thesimilarity in the direction of gene response across sam-ples. Its main disadvantages are the lack of distribu-tional robustness (it assumes data normality) and thesensitivity to outliers. We generated the PCC-basedco-expression networks using the SciPy Python library[39].

ARACNE: Algorithm for the Reconstruction of GeneRegulatory Networks

The ARACNE method [2] measures the dependencebetween two gene expression profiles using mutual in-formation. Mutual information I(X;Y ) estimates en-tropy to quantify the amount of information that Ycontains about X (measured in bits). In contrast tocorrelation, it is able to detect non-linear dependen-cies. ARACNE calculates I(X;Y ) for every pair ofgene expression profiles X and Y , and applies the dataprocessing inequality to remove the majority of indi-rect dependencies. For each triplet X, Y and Z theweakest link is removed, e.g. the edge between X andY is removed if I(X;Y ) ≤ min(I(X;Z), M(Z;Y )) −ε. The tolerance threshold ε is used to adjust forthe variance of the mutual information estimator. Togenerate the ARACNE based networks we used theminet R package [40] with the following parameters:mi.empirical estimator, equalwidth distance and ε = 0.

MIC: Maximal Information Coefficient

The MIC [38] is a recently proposed measure of thestrength of association between two variables, closelyrelated to mutual information. Instead of using a singlediscretisation strategy to bin the compared variables,it chooses individual bins for each variable, such thatvalue of mutual information I(X;Y ) is maximised.Compared to standard estimation of I(X;Y ) valueused in ARACNE, the optimised estimation providedby MIC is able to detect a wider range of non-linearassociations. To generate MIC based networks we usedthe minepy Python library [41] with the following pa-rameters: α = 0.6 and c = 15.

Inference of the co-expression networks counterparts

To fairly compare the co-prediction and co-expressionnetworks generated from the same data, we had tomake sure they match in size. To do that, for everyco-prediction network C with m edges and n nodes,we created two co-expression counterparts:

• SE(C): co-expression network with m edges

• SN(C): co-expression network with n nodes

PCC and MIC methods directly compute the pairwisesimilarity between the gene expressions. Given that,we generated SE(C) using m gene pairs with the high-est similarity coefficient. To build SN(C) we used asmany top gene pairs as needed, to reach at least nnodes (as we included all pairs tied on the similarityvalue, sometimes we end up with a few nodes more).

ARACNE uses a pruning procedure and generates aweighted network, not a list of pairwise similarities.When the resulting network was smaller than m edgesor n nodes, we increased the default tolerance thresh-old ε to obtain a large enough network. This was thecase for the CNS (ε = 0.002) and the Dlbcl datasets(ε = 0.043). Then we used the edge weights to selecttop gene pairs, as in the case of PCC and MIC meth-ods.

Several examples of inferred co-prediction networksand corresponding co-expression networks are visu-alised in Section 7 of the Supplementary Material andare accompanied, in there, by an initial analysis of se-lected topological properties in Section 3.

Enrichment analysis

To understand the biological information capturedby the generated networks we conducted an enrich-ment analysis. This is a statistical method of check-ing whether a set of genes have common character-istics. In our study, the set is defined by the nodesof the generated functional network and is analysedwith PANTHER [42]. Because many statistical testsare performed (one for each term) at the same time,PATHER uses Bonferroni correction for multiple test-ing with α = 0.05. We searched for two categories ofbiological knowledge: Gene Ontology (GO) terms andPANTHER pathways (176 primarily signalling path-ways). From the set of GO term, we selected only themanually curated annotations that were supported byexperimental evidence.

Disease association analysis

To evaluate the predictive power of the generated net-works, and to assess their relevance within a cancer-related context, we analysed the relationships be-tween known disease-associated genes. We used twosources for the disease associations: Malacards (ameta-database of human maladies consolidated from64 independent sources) [43] and the union of severalmanually curated databases (OMIM [44], Orphanet[45], Uniprot [46] and CTD [47]). We looked at twoproperties: (1) the proximity of the disease-associated

Lazzarini et al. Page 7 of 16

Table 3 FuNeL success rate in identification of disease-predicting SNPs. The datasets differed with respect to heritability, number ofSNPs and detection difficulty (L-EDM models were the hardest, H-EDM the easiest).

5 SNP 10 SNP 15 SNP 20 SNP 25 SNP

Her. L-EDM H-EDM L-EDM H-EDM L-EDM H-EDM L-EDM H-EDM L-EDM H-EDM

0.001 6 % 16 % 8 % 18 % 4 % 10 % 4 % 12 % 12 % 16 %0.005 8 % 82 % 0 % 86 % 6 % 80 % 2 % 82 % 8 % 72 %0.01 8 % 96 % 8 % 100 % 8 % 100 % 12 % 100 % 14 % 100 %0.05 14 % 100 % 60 % 100 % 42 % 100 % 34 % 100 % 34 % 100 %0.1 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %0.2 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %0.3 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %0.4 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %

genes within a network and (2) the number of tri-angles in a network, containing one or more disease-associated genes.

Higher proximity represents stronger functional rela-tionship between genes involved in the disease. Trian-gles represent groups of attributes used together acrossdifferent prediction rules, and therefore indicate strongmutual relationship between the genes (useful in thediscovery of potential new disease associations). Tri-angles are also the smallest non-trivial motifs that canbe found in a complex network and over-representedmotifs usually identify functional units of biologicalprocesses in cells [48].

The proximity of disease-associated genes was mea-sured using the average shortest path length (SPL).The proximity was defined as a ratio of two distances:average SPL between all pairs of the non-associatedgenes and average SPL between all pairs of disease-associated genes A:

1

n

n∑i=1

wiSPL(CCi \A)

SPL(A),where wi =

|CCi|∑nj=1 |CCj |

As the generated networks often were disconnected(had more than 1 connected component), we intro-duced a weight wi that represents the relative size of aconnected component CCi. Components with less than3 nodes or disease-associated genes were not used inthe calculation.

Results

The main results described in this section are basedon the analysis of 8 real-world datasets. The only ex-ception is the subsection below, which reports the testresults on synthetic datasets.

Identification of predefined relationships in syntheticdatasets

To verify how well FuNeL is able to identify func-tional relationships, we tested it first on synthetic

datasets generated using GAMETES. We used 80 dif-ferent model configurations that varied in heritability,number of SNPs and ease of detection, and tested thesuccess rate on 50 datasets per model. Given the smallnumber of attributes in the synthetic datasets, we usedonly the C2 protocol configuration in the tests (no fea-ture selection, single learning phase). The percentageof successfully identified relationships for each modelis reported in Table 3. We counted as success the pres-ence of an edge between the interacting pair of SNPsin the inferred network.

As expected, a higher success rate was obtained formodels where relationships were easy to detect (H-EDM). The performance increased with higher valuesof heritability and 100% success rate was obtained forheritability values above 0.05 regardless of model diffi-culty. The overall results are similar to those reportedin [28], or even slightly better, as FuNeL’s success ratewas unaffected by the increase in the number of SNPs.

Complementarity of the enriched terms

To test how unique are the biological terms (GO termsand pathways) over-represented in the inferred Fu-NeL networks, we measured an overlap between termsfound for each type of network. We defined the over-lap between terms enriched for networks inferred usingconfigurations Ca and Cb as:

O(Ca, Cb) =c

ua + ub + c

where c is the number of common terms, ua is thenumber of unique terms for Ca and ub is the numberof unique terms for Cb.

Table 4 summaries the pair-wise overlap between the4 different FuNeL configurations. For GO terms we re-ported the average overlap between the biological pro-cess, cellular component and molecular function cat-egories. Although configurations that operate on thesame dataset (C1/C3 and C2/C4) shared the most

Lazzarini et al. Page 8 of 16

terms/pathways, the overlap is quite far from 100%.The observed difference is a result of the second train-ing stage. Configurations used on different datasets(i.e. different set of attributes) resulted in networkssharing less than 40% GO terms and 20% pathways.

Table 4 Average overlap of enriched GO terms and pathwaysbetween different FuNeL configurations. The overlap wasaveraged across all 8 datasets.

Gene Ontology Pathways

C1 C2 C3 C4 C1 C2 C3 C4

C1 — 0.353 0.749 0.405 — 0.186 0.513 0.183C2 — 0.321 0.701 — 0.095 0.591C3 — 0.364 — 0.104C4 — —

Similarly, we analysed the term overlap between co-prediction and co-expression by comparing the Ci net-works with their co-expression counterparts SE (Ci)and SN (Ci) generated with different approaches (seeTable 5). We found the percentage of overlap to be sim-ilar across the different inference methods. The over-lap in enriched terms was never higher than 62% (stillleading to a difference around 40%) and was the largestfor configuration not using feature selection (C2 andC4). In general the percentages were lower for biolog-ical pathways with a minimum of only 10% of sharedterms. Low values of terms overlap indicate that theco-prediction and the co-expression approaches can beseen as complementary. Despite starting from the samedataset, they generate networks expressing differentbiological information.

Table 5 Average overlap of enriched GO terms and pathwaysbetween the co-prediction and co-expression networks. Eachco-expression network Ci was compared to the correspondingco-expression networks SE(Ci) and SN(Ci). The overlap wasaveraged across all 8 datasets.

Co-expression (SE) Co-expression (SN)

Method Cat. C1 C2 C3 C4 C1 C2 C3 C4

PCCGO 0.280 0.414 0.297 0.432 0.315 0.576 0.367 0.488

path. 0.223 0.260 0.258 0.190 0.264 0.400 0.175 0.287

ARACNEGO 0.348 0.621 0.272 0.565 0.333 0.612 0.277 0.535

path. 0.126 0.463 0.139 0.479 0.085 0.423 0.016 0.356

MICGO 0.316 0.513 0.283 0.487 0.300 0.614 0.289 0.527

path. 0.097 0.339 0.142 0.315 0.112 0.469 0.080 0.352

Quantifying the amount of captured biologicalknowledge

The amount of biological knowledge (number of en-riched terms) captured by a network is related to itssize (number of nodes). To fairly compare the networksof different sizes we used the normalised EnrichmentScore (ES):

ES =number of enriched terms

number of nodes

The score assesses if a network contains biologicallyrelated nodes. The higher it is, the larger is a biologicalsimilarity between the nodes of a network.

To have a global view of the performances of each in-ference method in term of ES, we performed a two-stepanalysis for each enrichment category. First, using theES, we ranked the networks generated by each methodin order to identify the best performing one. See Sec-tion 4 of the Supplementary Material for the completeanalysis.

Once we identified the best network for each method,we ranked them together by ES and calculated theiraverage rank across the datasets. The results of thisanalysis are reported in Table 6. MIC performed bestwhen ES was calculated using the GO terms (it wasranked first in each of those categories). When ESwas calculated using the biological pathways, C4 andARACNE SE(C1) shared the highest rank.

Table 6 Average ranks based on the Enrichment Score for thebest performing networks of each inference method. For eachcategory and for each method, we report the network used in theanalysis. The ranks (in brackets) were averaged across all 8datasets, and the highest ranks are shown with bold font. The lastrow reports the average ranks across all the biological categories.The following abbreviations were used for GO categories:biological process (BP), molecular function (MF) and cellularcomponent (CC).

Category FuNeL PCC ARACNE MIC

GO BP C4 (3) SE(C3) (1.5) SN(C3) (4) SN(C3) (1.5)GO MF C3 (3.5) SN(C3) (3.5) SN(C3) (2) SN(C3) (1)GO CC C3 (4) SN(C1) (3) SN(C3) (2) SN(C3) (1)Patwhays C4 (1.5) SN(C2) (3.5) SE(C1) (1.5) SE(C3) (3.5)

Average 3 2.88 2.38 1.75

Table 6 shows that the best performing networks foreach method were mostly C3 co-expression counter-parts, in particular SN(C3). This is consistent withthe result of the topological analysis in Section 3 ofthe Supplementary Material were these networks werefound to have the lowest number of nodes, and suggeststhat smallest networks tend to be more enriched. Thedifference in performance between the FuNeL configu-rations is mainly a result of the application of the sec-ond machine learning phase (the best networks wereC3 and C4).

In Supplementary Table 6 we reported the results ofa similar analysis where we compared the similarity-based inference methods against FuNeL (ranks in thererange from 1 to 12: 4 Ci + 4 SE (Ci) + 4 SN (Ci). Inthis pairwise analysis, FuNeL networks performed sim-ilarly to PCC and ARACNE. We did not observe anyconsistent winner across all the enrichment categories.MIC seems to have better results than FuNeL only forGO categories, as emerged from Table 6, while FuNeLnetworks tend to be more enriched for biological path-ways.

Lazzarini et al. Page 9 of 16

Evaluation of the networks in a disease context

To verify if the topology of the inferred networks isbiologically meaningful, we analysed how it definesthe relationships between genes that are known to beassociated with a disease targeted by each dataset.We expected the disease-associated genes to be moreclosely connected than other genes and to be presentin functional units, such as triangle motifs. We mea-sured the proximity of the disease-associated genes(i.e. how closely connected they are compared withnon-disease-associated genes) and counted the num-ber of triangular relationships present in each network(i.e. the percentage of triangles containing one, two orthree disease-associated genes). We repeated the two-step analysis as presented in Section Quantifying theamount of captured biological knowledge by using thegene-disease metrics for the ranking. The results arereported in Table 7. The detailed results for each in-ference method are available in Section 5 of the Sup-plementary Material.

Table 7 Average ranks based on the disease-associations forthe best performing networks of each inference method. Foreach category and for each method we report the network usedfor the analysis. The ranks (in brackets) were averaged across all8 datasets, and the highest ranks are shown with bold font. Thelast row reports the average ranks across all the categories. Thenumber of disease-associated genes participating in a triangle isdenoted as 1A, 2A and 3A.

Source Cat. FuNeL PCC ARACNE MIC

Curated

1A C2 (1) SN(C2) (4) SN(C3) (2.5) SN(C2) (2.5)2A C3 (1) SN(C3) (2) SE(C2) (3) SN(C2) (4)3A C1 (2) SN(C1) (3) SE(C4) (4) SE(C2) (1)Proximity C2 (1) SN(C3) (2.5) SE(C4) (2.5) SE(C2) (4)

Average 1.25 2.88 3 2.88

Malacards

1A C2 (1) SN(C2) (4) SN(C4) (3) SE(C4) (2)2A C2 (1.5) SN(C4) (4) SE(C4) (1.5) SN(C2) (3)3A C3 (2) SN(C4) (3) SE(C2) (4) SN(C2) (1)Proximity C2 (1) SE(C4) (4) SE(C4) (3) SE(C2) (2)

Average 1.78 3.75 2.88 2

The average ranks, for both sources of disease associa-tions, suggest that co-prediction outperforms the otherinference paradigms. The proximity of the disease-associated genes was in general higher in C2 network.Therefore, the co-prediction paradigm has identifiedthe core elements of the network more accurately. Thisresult highlights the benefits of including functional in-formation, whenever these are available, in the networkinference process (FuNeL is using the class labels as-signed to the samples of the dataset), in contrast tothe co-expression approach solely based on gene ex-pression similarity (unsupervised).

There is also a clear difference in the number ofdisease-associated genes participating in the triangles;co-prediction networks were ranked higher than the co-expression networks. The only category in which MIC

had a higher rank was 3A. However, considering thatthere were not many triangles with disease-associatedgenes, many ties affected the ranks in this category.Overall, these results demonstrate the higher predic-tive potential of the FuNeL networks in identifying newdisease associations.

Prostate cancer case study: enriched terms

To compare in detail the difference in biological knowl-edge captured by the co-prediction and co-expressionnetworks, we followed our global analysis with a casestudy focused on a dataset targeting a single disease— prostate cancer [35]. We were especially interestedin specific knowledge captured by one paradigm butnot the other.

In Figures 4 and 5 we compared the co-prediction andPCC co-expression networks inferred from the prostatecancer dataset. We focused on unique GO terms andpathways, enriched only in one type of networks. Forthe sake of readability we filtered out the generic GOterms (with depth < 9 in the GO hierarchical struc-ture). C2 was the network with the largest number ofunique terms, followed by C4 and SN(C2). We found16 GO terms and 21 pathways unique to co-predictionnetworks and only 3 GO terms and 4 pathways uniqueto co-expression networks. A similar disproportion infavour of the co-prediction networks was found in com-parison with MIC and ARACNE networks (see Sup-plementary Figures 2 and 3).

We found several of the unique GO terms enrichedin the co-prediction networks to be related to prostatecancer. The role of the Protein ubiquination in prostatecancer was recently analysed and showed an impact forits treatments [49]. ERK pathway is involved in themotility of prostate cancer cells [50]. Prostate cancercells seems to alter the nature of their calcium influxto promote growth and acquire apoptotic resistance[51]. Furthermore, the role of calcium homeostasis inthe majority of the cell-signaling pathways involvedin carcinogenesis has been well established, prostatecancer included [52].

A number of enriched pathways specific to co-predictionnetworks are also highly relevant to the prostate can-cer. Several studies demonstrated the involvement ofthe JAK/STAT pathway in the prostate cancer devel-opment [53, 54]. There is multiple evidence suggest-ing that one of the major aging-associated influenceson prostate carcinogenesis is oxidative stress and itscumulative impact on DNA damage [55, 56]. Finally,FAS (also called Apo1 or CD95) plays a central role inthe physiological regulation of programmed cell death

Lazzarini et al. Page 10 of 16

C1 C2 C3 C4

SE(C1)

SE(C2)

SE(C3)

SE(C4)

SN(C1)

SN(C2)

SN(C3)

SN(C4)

positive regulation of peptidyl­tyrosine phosphorylation (GO:0050731)

regulation of MAP kinase activity (GO:0043405)

positive regulation of cytosolic calcium ion concentration (GO:0007204)

negative regulation of protein phosphorylation (GO:0001933)

regulation of calcium ion transport (GO:0051924)

regulation of metal ion transport (GO:0010959)

positive regulation of T cell proliferation (GO:0042102)

positive regulation of ERK1 and ERK2 cascade (GO:0070374)

cellular calcium ion homeostasis (GO:0006874)

regulation of protein ubiquitination (GO:0031396)

positive regulation of neurogenesis (GO:0050769)

regulation of epithelial cell apoptotic process (GO:1904035)

regulation of focal adhesion assembly (GO:0051893)

regulation of epithelial cell migration (GO:0010632)

positive regulation of epithelial cell migration (GO:0010634)

positive regulation of neuron differentiation (GO:0045666)

regulation of neuron differentiation (GO:0045664)

calcium ion homeostasis (GO:0055074)

regulation of neuron projection development (GO:0010975)

GO biological process terms

Fig 4 Number of unique enriched GO terms (biological process) for each network configuration (generated from the prostatecancer dataset). On the x-axis we show the 12 investigated networks. On the y-axis we show the names of enriched terms unique toco-prediction or PCC co-expression networks. Red terms are associated with co-expression networks, blue with co-prediction. Emptycolumns indicate networks with no unique terms.

C1 C2 C3 C4

SE(C1)

SE(C2)

SE(C3)

SE(C4)

SN(C1)

SN(C2)

SN(C3)

SN(C4)

VEGF signaling pathway (P00056)

Insulin/IGF pathway­mitogen activated protein kinase kinase/MAP kinase cascade (P00032)

Axon guidance mediated by semaphorins (P00007)

Huntington disease (P00029)

Beta1 adrenergic receptor signaling pathway (P04377)

5HT4 type receptor mediated signaling pathway (P04376)

Dopamine receptor mediated signaling pathway (P05912)

5HT2 type receptor mediated signaling pathway (P04374)

5HT1 type receptor mediated signaling pathway (P04373)

Opioid proenkephalin pathway (P05915)

Opioid prodynorphin pathway (P05916)

Opioid proopiomelanocortin pathway (P05917)

Cortocotropin releasing factor receptor signaling pathway (P04380)

Angiotensin II­stimulated signaling through G proteins and beta­arrestin (P05911)

Insulin/IGF pathway­protein kinase B signaling cascade (P00033)

Beta3 adrenergic receptor signaling pathway (P04379)

Beta2 adrenergic receptor signaling pathway (P04378)

Oxidative stress response (P00046)

Oxytocin receptor mediated signaling pathway (P04391)

Metabotropic glutamate receptor group II pathway (P00040)

Muscarinic acetylcholine receptor 2 and 4 signaling pathway (P00043)

JAK/STAT signaling pathway (P00038)

Alpha adrenergic receptor signaling pathway (P00002)

Enkephalin release (P05913)

FAS signaling pathway (P00020)

Biological pathways

Fig 5 Number of unique enriched biological pathways for each network configuration (generated from the prostate cancer

dataset). On the x-axis we show the 12 investigated networks. On the y-axis we show the names of enriched pathways unique toco-prediction or PCC co-expression networks. Red terms are associated with co-expression networks, blue with co-prediction. Emptycolumns indicate networks with no unique pathways.

Lazzarini et al. Page 11 of 16

Fig 6 Overlap of enriched terms between the best performing networks in the disease-association analysis (curated databases).On the left the overlap of GO terms (including all 3 categories: BP, CC and MF), on the right the overlap of pathways.

and has been implicated in the pathogenesis of vari-ous malignancies and diseases of the immune systemincluding prostate cancer [57].

We also performed an additional analysis of the bi-ological terms related to the hubs (highly connectednodes) of the inferred networks. A node v was consid-ered to be a hub if its degree was at least one standarddeviation above the mean network degree. To comparethe networks, we used the 10 most frequent Gene On-tology terms (biological processes with at least depth10) shared among each network’s hubs. We found 16unique terms for co-prediction networks, 19 uniqueterms for PCC co-expression networks and 11 com-mon terms. These results further highlight that somebiological terms are exclusively associated either withco-prediction or co-expression networks. The completeanalysis (method by method) is available in the Sup-plementary Material (Supplementary Figures 4, 5 and6).

A further analysis of term overlap was conducted us-ing only the best performing networks in the curateddisease-association analysis (namely C2 for FuNeL,SN(C3) for PCC, SE(C4) for ARACNE and SE(C2)for MIC, see Section 5 of the Supplementary Mate-rial for details). In Figure 6 we show the overlap ofGO terms (including all three GO categories) andpathways across networks from different inference al-gorithms. In both categories FuNeL had much largernumber of unique terms than the co-expression meth-ods and it shared the largest number of terms withARACNE. In total 122 common GO terms were foundbetween all the methods, while there was only 1 com-

mon pathway. Figure 6 further highlights the comple-mentarity between the co-prediction and co-expressionapproaches in terms of captured biological knowledge.

Prostate cancer case study: disease associations

We searched the literature and the public cancerdatabases (not used in the inference process), to verifyif key nodes in the generated networks are associatedwith prostate cancer. As a measure of node importancewe used the node degree (number of connections) andthe betweenness centrality (number of shortest pathsbetween all pair of nodes pass through a given node).

Literature analysis We picked the top 3 most con-nected nodes (hubs) for each of the four co-predictionnetworks. The set contained six genes: GSTM2, NELL2,CFD, PTGDS, PAGE4 and LMO3. All the genes fromthis set, except LMO3, were also found to be the mostcentral nodes (with highest betweenness centrality).

Almost all these genes are related with prostate cancer:

• NELL2 contributes to alterations in epithelial-stromal homeostasis in benign prostatic hyperpla-sia and codes for a novel prostatic growth factor[58], and is also an indicator of expression changesin cancer samples [59],

• CFD (adipsin gene) is over expressed in PPperiprostatic adipose tissue of prostate cancer pa-tients [60],

Lazzarini et al. Page 12 of 16

• PTGDS (and other 2 genes) are expressed at con-sistently lower levels in clinical prostate cancer tis-sues and form a signature that predicts biochem-ical relapse [61],

• PAGE4 modulates androgen receptor signaling,promoting the progression to advanced lethalprostate cancer [62], and has a significantly lowerexpression level in patients with prostate recur-rent disease [63],

• LMO3 interacts with p53, a well known gene tu-mour suppressor in prostate cancer [64].

The only gene without literature support was GSTM2.It might represent a good target for further experimen-tal verification.

Validation on independent data To further validatethe biological significance of the inferred networks,we used an independent prostate cancer dataset [65]from the cBioPortal for Cancer Genomics [66]. Weanalysed the top 10 hubs (nodes with highest degree)and the top 10 central nodes (with highest between-ness centrality) in the co-prediction network that bet-ter performed in the gene-disease association analy-sis using the curated databases: C2 (see Supplemen-tary Table 8a). The genes with highest degree were:PTGDS, PAGE4, NELL2, GSTM2, PARM1, MAF,LMO3, COL4A6, RBP1 and ABL1. For the between-ness centrality, the set was almost identical, only RBP1was replaced by MYH11. On average the expressionin samples was altered in 31.8% cases for hubs andin 35.6% cases for central nodes. The most alteredgenes were found to be downregulated at the mRNAlevel: COL4A6 (65%), MYH11 (58%), PARM1 (53%)and GSTM2 (52%). In addition, genomic alterationsin several key genes have been found to be strongly co-occurent (e.g. PTGDS – GSTM2, PAGE4 – COL4A6,PAGE4 – RBP1, etc.).

When we repeated this analysis for the co-expressionnetworks that were best ranked in the gene-diseaseanalysis using the curated databases (SN(C3) forPCC, SE(C4) for ARACNE and SE(C2) for MIC),we found that on average the alteration level was con-sistently lower, at most half of the co-prediction keygenes. The percentages of alterations are representedas boxplots in Figure 7, while the average alterationsare reported in Table 8. As Figure 7 shows, our methodis able to identify many more genes with higher per-centage of alteration than other methods. Therefore,the topologically important nodes in the best co-prediction network represent genes more strongly re-lated to the prostate cancer, with over two times morefrequent genomic alterations.

Fig 7 Distribution of the percentage genomic alteration inthe samples of an independent dataset for top 10 hubs andcentral nodes. The topologically important genes wereselected from the best performing networks in thedisease-association analysis on curated datasets.

Table 8 Average percentage of genomic alteration for top hubsand central nodes in the independent dataset.

Genes FuNeL PCC ARACNE MIC

Hubs 31.8 % 14.2 % 12.3 % 15.2 %Central nodes 35.6 % 14.7 % 12.2 % 17.1 %

The detailed list of genomic alterations for top 10 hubsand top 10 central nodes for each analysed networkis shown in Section 6 of the Supplementary Material(Figures 7–14).

Discussion

We proposed FuNeL, a protocol to infer functionalnetworks based on the co-prediction paradigm wherethe structure of a rule-based machine learning model(in this paper the rules of a classification algorithmcalled BioHEL) is used to identify relationships be-tween genes. We tested FuNeL on synthetic datasetsand obtained a high success rate in identifying pair-wise relationships between attributes. Encouraged bythis result, we hypothesised that a rule-based machinelearning model, with its complex knowledge represen-tation, might be used to identify biologically mean-ingful relationships that escape the standard inferencemethods.

To test this hypothesis, we evaluated 4 different con-figurations of the inference protocol using 8 cancer-related transcriptomics datasets. We compared Fu-NeL with other 3 co-expression inference methodsby using networks of matching size generated fromthe same data. We looked at the differences, betweenco-prediction and co-expression, from three points of

Lazzarini et al. Page 13 of 16

view: basic topological properties, enriched biologi-cal terms and relationships between known disease-associated genes.

The comparison of networks topology (see Section 3 ofthe Supplementary Material) revealed the influence ofthe protocol options. Not surprisingly, both the featureselection and the second training phase reduced thesize of the networks, but at the same time, increasedthe clustering coefficient and the number of connec-tions. The clustering coefficient was found to be lowerin almost all the ARACNE networks, probably dueto the pruning procedure, it was also lower in manyMIC networks. Moreover, when feature selection wasapplied, the resulting networks had higher clusteringcoefficient than PCC co-expression networks with thesame number of edges. Interestingly, all co-expressionnetworks were less compact, with up to 3 times higherdiameter for PCC and ARACNE and up to 7 timeshigher for MIC.

The differences in networks topology translated to dif-ferences in contained biological information. The over-lap between enriched GO terms and pathways acrossprotocol configurations was generally low, indicatingthat different configurations infer networks that cap-ture different biological knowledge. The same termsoverlap between the co-prediction networks and theirequivalent co-expression counterparts was even lower,never exceeding 62%. We interpret that as evidence,that the biological knowledge captured by the twoparadigms is not completely redundant, but in a largepart complementary.

The most apparent differences between the networkswere observed during the analysis of the connectionsbetween genes known to be related to a specific dis-ease. The disease-associated genes were more closelyconnected (higher proximity) in the co-prediction net-works, which means that the disease-related nodes ofthe network were closer to its core. We also found thatthe number of functional units (triangle motifs), thatcan identify new gene-disease associations, was higherin the co-prediction networks. Therefore, we concludethat the co-prediction networks better capture the ab-stract concept of functional relationship.

The prostate cancer case study further confirmed thisconclusion. We found enriched GO terms and biologi-cal pathways, unique to the co-prediction networks, tobe reported in the literature as related to prostate can-cer. Furthermore, FuNeL generated networks enrichedwith knowledge totally missed by all the co-expressionnetworks when using the prostate cancer dataset. Wealso found that genes corresponding to the topologi-cally important nodes in the co-prediction networks:

(1) were altered in a high percentage of tumour sam-ples in an independent cancer transcriptomic study,and (2) were already associated with prostate can-cer according to the specialised literature. Therefore,the co-prediction networks not only capture biologi-cal knowledge complementary to the co-expression net-works, but also highlights better the important genesinvolved in the disease process.

The superior performance of FuNeL networks in identi-fying the disease-associated genes is likely a result of ef-fective use of the class labels of the samples, which thesimilarity-based methods ignore. Although it would betempting to attribute this performance difference en-tirely to the use of supervised learning in FuNeL, itwould be an overstatement, as the knowledge of ex-plicit links between genes and diseases is not availableto it in training. Our hypothesis is that this is rather aresult of differences in expression values of the disease-associated genes, which taken together are able to dis-criminate between sample phenotypes.

Given that our co-prediction networks were found tobe not only biologically meaningful, but also comple-mentary to similarity-based functional networks, webelieve that network inference based on machine learn-ing models deserves to be studied in more detail in thefuture. In here we only touched the subject of featureselection and network post-processing, and althoughwe now know they indeed influence the network topol-ogy and its biological interpretation, there are manystrategies to choose from in that respect.

At the same time, the machine learning step in theFuNeL protocol does not have to be limited to therule-based machine learning methods. We can imagineunsupervised methods, such as the Apriori algorithmfor association rule learning, or other supervised meth-ods, such as decision tree algorithms (e.g. C4.5 or ran-dom forest), replacing BioHEL in the FuNeL protocol.Some adjustment would be necessary to extract theknowledge from a different model representation, butthe rest of the protocol could remain unchanged. Forexample in the case of the decision trees, relationshipscould be inferred between attributes that share thesame path from the root to the leaves of a tree. Thispotential flexibility in the choice of a learning algo-rithm, together with the ability to apply the protocolto different types of data, becomes important in thecontext of results correctness. As has been discussedto a great length in [67], when methods or data usedin the network inference process are tightly controlled,some results will replicate more easily than others notbecause they are correct, but due to a replicable bias.Therefore a diversity in methods and data is a neces-sary condition to be able to converge on the scientifictruth.

Lazzarini et al. Page 14 of 16

Finally, in terms of testing new functional networks,there is a limit of how thorough and complete a man-ual literature analysis can be, which leads to a greatneed of synthetic or experimentally validated bench-marks, similar to those proposed for protein-proteininteraction networks or gene regulatory networks. Al-though we understand that this would be a difficultand challenging task, we see this as a necessary stepon the way to refining the functional inference meth-ods.

Conclusions

We presented FuNeL: a protocol for the inference offunctional networks from rule-based machine learningmodels. FuNeL is based on the co-prediction paradigm,which hypothesises that genes used together with arule-based machine learning model, are more likely tobe functionally related. We verified that FuNeL cor-rectly identifies relationships in synthetic datasets andwe thoroughly compared FuNeL to three co-expressioninference methods: PCC, ARACNE and MIC, on 8real-world datasets. We contrasted the different ap-proaches by looking at the inferred networks topol-ogy, enriched biological terms and the relationshipsbetween genes associated with cancer. We found thatFuNeL networks capture relevant biological knowledgethat is complementary to what is captured by theco-expression approaches, and demonstrated that Fu-NeL networks are better at identifying relationshipsbetween genes with known disease associations.

Availability of data and material

The datasets used for the analysis were collected from the following public

domain resources:

Dlbcl http://ico2s.org/datasets/microarray.html

CNS http://datam.i2r.a-star.edu.sg/datasets/krbd/NervousSystem/NervousSystem.html

Leukemia http://datam.i2r.a-star.edu.sg/datasets/krbd/Leukemia/ALLAML.html

Lung-Michigan http://datam.i2r.a-star.edu.sg/datasets/krbd/LungCancer/LungCancer-Michigan.html

Lung-Harvard http://datam.i2r.a-star.edu.sg/datasets/krbd/LungCancer/LungCancer-Harvard2.html

Prostate http://datam.i2r.a-star.edu.sg/datasets/krbd/ProstateCancer/ProstateCancer.html

AML http://www.biolab.si/supp/bi-cancer/projections/info/AMLGSE2191.html

Colon-Breast http://www.biolab.si/supp/bi-cancer/projections/info/BC_CCGSE3726_frozen.html

The FuNeL source code is publicly available:

Project name: FuNeL

Project home page: http://ico2s.org/software/funel.html

Operating system(s): GNU/Linux

Programming language: Python, R

License: GNU GPLv3

Author’s contributions

NL, PW, NK and JB designed the experiments. NL and PW performed the

experiments. NL, PW, NK and JB analysed the data. RH and SW

contributed to the biological validation of the results. NL, PW, NK and JB

wrote the paper. All authors read and approved the final manuscript.

Acknowledgements

We thank Joseph Mullen for the help provided in finding the gene-disease

associations. This work made use of the facilities of N8 HPC Centre of

Excellence, provided and funded by the N8 consortium and EPSRC

[EP/K000225/1]. The Centre is co-ordinated by the Universities of Leeds

and Manchester.

Funding

This work was supported by the Engineering and Physical Sciences Research

Council [EP/L001489/2, EP/J004111/2, EP/I031642/2, EP/N031962/1].

The funding agency was not involved with the design of the study, analysis

and interpretation of data or in the writing of the manuscript.

Author details

1 Interdisciplinary Computing and Complex BioSystems (ICOS) research

group, School of Computing Science, Newcastle University, Newcastle upon

Tyne, UK. 2 Clinical and Experimental Pharmacology Group, Cancer

Research UK Manchester Institute, University of Manchester, Manchester,

UK. 3 Northern Institute for Cancer Research, Medical School, Newcastle

University, Newcastle upon Tyne, UK.

References

1. Bansal, M., Belcastro, V., Ambesi-Impiombato, A., di Bernardo, D.:

How to infer gene networks from expression profiles. Molecular

Systems Biology 3(1) (2007). doi:10.1038/msb4100120

2. Margolin, A.A., Nemenman, I., Basso, K., Wiggins, C., Stolovitzky, G.,

Dalla Favera, R., Califano, A.: ARACNE: An algorithm for the

reconstruction of gene regulatory networks in a mammalian cellular

context. BMC Bioinformatics 7(Suppl 1), 1–15 (2006).

doi:10.1186/1471-2105-7-S1-S7

3. Barzel, B., Barabasi, A.-L.: Network link prediction by global silencing

of indirect correlations. Nature biotechnology 31(8), 720–725 (2013).

doi:10.1038/nbt.2601

4. Huynh-Thu, V.A., Irrthum, A., Wehenkel, L., Geurts, P.: Inferring

regulatory networks from expression data using tree-based methods.

PLoS ONE 5(9), 1–10 (2010). doi:10.1371/journal.pone.0012776

5. Childs, K.L., Davidson, R.M., Buell, C.R.: Gene Coexpression Network

Analysis as a Source of Functional Annotation for Rice Genes. PLoS

ONE 6(7), 22196 (2011). doi:10.1371/journal.pone.0022196

6. Presson, A.P., Sobel, E.M., Papp, J.C., Suarez, C.J., Whistler, T.,

Rajeevan, M.S., Vernon, S.D., Horvath, S.: Integrated weighted gene

co-expression network analysis with an application to chronic fatigue

syndrome. BMC Systems Biology 2(1), 1–12 (2008).

doi:10.1186/1752-0509-2-95

7. Ray, M., Jianhua, R., Weixiong, Z.: Variations in the transcriptome of

Alzheimer’s disease reveal molecular networks involved in

cardiovascular diseases. Genome Biology 9(10), 148 (2008).

doi:10.1186/gb-2008-9-10-r148

8. Ransbotyn, V., Yeger-Lotem, E., Basha, O., Acuna, T., Verduyn, C.,

Gordon, M., Chalifa-Caspi, V., Hannah, M.A., Barak, S.: A

combination of gene expression ranking and co-expression network

analysis increases discovery rate in large-scale mutant screens for novel

Arabidopsis thaliana abiotic stress genes. Plant Biotechnology Journal

13(4), 501–513 (2015). doi:10.1111/pbi.12274

9. Yang, Y., Han, L., Yuan, Y., Li, J., Hei, N., Liang, H.: Gene

co-expression network analysis reveals common system-level properties

of prognostic genes across cancer types. Nature Communications 5(2014). doi:10.1038/ncomms4231

10. Kommadath, A., Bao, H., Arantes, A.S., Plastow, G.S., Tuggle, C.K.,

Bearson, S.M.D., Luo Guan, L., Stothard, P.: Gene co-expression

network analysis identifies porcine genes associated with variation in

Salmonella shedding. BMC Genomics 15(1), 1–15 (2014).

doi:10.1186/1471-2164-15-452

11. Wei, S.-N., Zhao, W.-J., Zeng, X.-J., Kang, Y.-M., Du, J., Li, H.-H.:

Microarray and co-expression network analysis of genes associated with

acute doxorubicin cardiomyopathy in mice. Cardiovascular Toxicology

15(4), 377–393 (2015). doi:10.1007/s12012-014-9306-7

Lazzarini et al. Page 15 of 16

12. Silva, A.T., Ribone, P.A., Chan, R.L., Ligterink, W., Hilhorst, H.W.M.:

A predictive co-expression network identifies novel genes controlling

the seed-to-seedling phase transition in Arabidopsis thaliana. Plant

Physiology 170(4), 2218–2231 (2016). doi:10.1104/pp.15.01704

13. Mordelet, F., Vert, J.-P.: ProDiGe: Prioritization Of Disease Genes

with multitask machine learning from positive and unlabeled examples.

BMC Bioinformatics 12(1), 1–15 (2011).

doi:10.1186/1471-2105-12-389

14. Martınez-Ballesteros, M., Nepomuceno-Chamorro, I.A., Riquelme, J.C.:

Inferring gene-gene associations from Quantitative Association Rules.

In: ISDA, pp. 1241–1246 (2011). doi:10.1109/ISDA.2011.6121829

15. Nepomuceno-Chamorro, I.A., Aguilar-Ruiz, J.S., Riquelme, J.C.:

Inferring gene regression networks with model trees. BMC

bioinformatics 11(1), 517 (2010). doi:10.1186/1471-2105-11-517

16. Yoshida, M., Koike, A.: SNPInterForest: a new method for detecting

epistatic interactions. BMC bioinformatics 12(1), 469 (2011).

doi:10.1186/1471-2105-12-469

17. Urbanowicz, R.J., Granizo-Mackenzie, A., Moore, J.H.: An Analysis

Pipeline with Statistical and Visualization-Guided Knowledge

Discovery for Michigan-Style Learning Classifier Systems. IEEE Comp.

Int. Mag. 7(4), 35–45 (2012). doi:10.1109/MCI.2012.2215124

18. Urbanowicz, R.J., Andrew, A.S., Karagas, M.R., Moore, J.H.: Role of

genetic heterogeneity and epistasis in bladder cancer susceptibility and

outcome: a learning classifier system approach. Journal of the

American Medical Informatics Association : JAMIA 20(4), 603–612

(2013). doi:10.1136/amiajnl-2012-001574

19. Bassel, G.W., Glaab, E., Marquez, J., Holdsworth, M.J., Bacardit, J.:

Functional Network Construction in Arabidopsis Using Rule-Based

Machine Learning on Large-Scale Data Sets. The Plant Cell Online

23(9), 3101–3116 (2011). doi:10.1105/tpc.111.088153

20. Glaab, E., Bacardit, J., Garibaldi, J.M., Krasnogor, N.: Using

Rule-Based Machine Learning for Candidate Disease Gene

Prioritization and Sample Classification of Cancer Gene Expression

Data. PLoS ONE 7(7), 39932 (2012).

doi:10.1371/journal.pone.0039932

21. Swan, A.L., Hillier, K.L., Smith, J.R., Allaway, D., Liddell, S.,

Bacardit, J., Mobasheri, A.: Analysis of mass spectrometry data from

the secretome of an explant model of articular cartilage exposed to

pro-inflammatory and anti-inflammatory stimuli using machine

learning. BMC musculoskeletal disorders 14(1), 349 (2013).

doi:10.1186/1471-2474-14-349

22. Fainberg, H.P., Bodley, K., Bacardit, J., Li, D., Wessely, F., Mongan,

N.P., Symonds, M.E., Clarke, L., Mostyn, A.: Reduced Neonatal

Mortality in Meishan Piglets: A Role for Hepatic Fatty Acids? PLoS

ONE 7(11), 49101 (2012). doi:10.1371/journal.pone.0049101

23. Bacardit, J., Widera, P., Marquez-Chamorro, A., Divina, F.,

Aguilar-Ruiz, J.S., Krasnogor, N.: Contact map prediction using a

large-scale ensemble of rule sets and the fusion of multiple predicted

structural features. Bioinformatics 28(19), 2441–2448 (2012).

doi:10.1093/bioinformatics/bts472

24. Bacardit, J., Burke, E.K., Krasnogor, N.: Improving the scalability of

rule-based evolutionary learning. Memetic Computing 1(1), 55–67

(2009). doi:10.1007/s12293-008-0005-4

25. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for

cancer classification using Support Vector Machines. Machine Learning

46(1-3), 389–422 (2002). doi:10.1023/A:1012487302797

26. Schaffter, T., Marbach, D., Floreano, D.: GeneNetWeaver: in silico

benchmark generation and performance profiling of network inference

methods. Bioinformatics 27(16), 2263–2270 (2011).

doi:10.1093/bioinformatics/btr373

27. Urbanowicz, R.J., Kiralis, J., Sinnott-Armstrong, N.A., Heberling, T.,

Fisher, J.M., Moore, J.H.: GAMETES: a fast, direct algorithm for

generating pure, strict, epistatic models with random architectures.

BioData Mining 5(1), 1–14 (2012). doi:10.1186/1756-0381-5-16

28. Li, J., Malley, J.D., Andrew, A.S., Karagas, M.R., Moore, J.H.:

Detecting gene-gene interactions using a permutation-based random

forest method. BioData Mining 9(1), 1–17 (2016).

doi:10.1186/s13040-016-0093-5

29. Baron, D., Bihouee, A., Teusan, R., Dubois, E., Savagner, F.,

Steenman, M., Houlgatte, R., Ramstein, G.: MADGene: retrieval and

processing of gene identifier lists for the analysis of heterogeneous

microarray datasets. Bioinformatics 27(5), 725–726 (2011).

doi:10.1093/bioinformatics/btq710

30. Shipp, M.A., Ross, K.N., Tamayo, P., Weng, A.P., Kutok, J.L., Aguiar,

R.C., Gaasenbeek, M., Angelo, M., Reich, M., Pinkus, G.S., Ray, T.S.,

Koval, M.A., Last, K.W., Norton, A., Lister, T.A., Mesirov, J.,

Neuberg, D.S., Lander, E.S., Aster, J.C., Golub, T.R.: Diffuse large

B-cell lymphoma outcome prediction by gene-expression profiling and

supervised machine learning. Nature medicine 8(1), 68–74 (2002).

doi:10.1038/nm0102-68

31. Pomeroy, S.L., Tamayo, P., Gaasenbeek, M., Sturla, L.M., Angelo, M.,

McLaughlin, M.E., Kim, J.Y.H., Goumnerova, L.C., Black, P.M., Lau,

C., Allen, J.C., Zagzag, D., Olson, J.M., Curran, T., Wetmore, C.,

Biegel, J.A., Poggio, T., Mukherjee, S., Rifkin, R., Califano, A.,

Stolovitzky, G., Louis, D.N., Mesirov, J.P., Lander, E.S., Golub, T.R.:

Prediction of central nervous system embryonal tumour outcome based

on gene expression. Nature 415(6870), 436–442 (2002).

doi:10.1038/415436a

32. Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M.,

Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A.,

Bloomfield, C.D., Lander, E.S.: Molecular Classification of Cancer:

Class Discovery and Class Prediction by Gene Expression Monitoring.

Science 286(5439), 531–537 (1999). doi:10.1126/science.286.5439.531

33. Beer, D.G., Kardia, S.L., Huang, C.-C.C., Giordano, T.J., Levin, A.M.,

Misek, D.E., Lin, L., Chen, G., Gharib, T.G., Thomas, D.G., Lizyness,

M.L., Kuick, R., Hayasaka, S., Taylor, J.M., Iannettoni, M.D.,

Orringer, M.B., Hanash, S.: Gene-expression profiles predict survival of

patients with lung adenocarcinoma. Nature medicine 8(8), 816–824

(2002). doi:10.1038/nm733

34. Gordon, G.J., Jensen, R.V., Hsiao, L.-L., Gullans, S.R., Blumenstock,

J.E., Ramaswamy, S., Richards, W.G., Sugarbaker, D.J., Bueno, R.:

Translation of Microarray Data into Clinically Relevant Cancer

Diagnostic Tests Using Gene Expression Ratios in Lung Cancer and

Mesothelioma. Cancer Research 62(17), 4963–4967 (2002)

35. Singh, D., Febbo, P.G., Ross, K., Jackson, D.G., Manola, J., Ladd, C.,

Tamayo, P., Renshaw, A.A., D’Amico, A.V., Richie, J.P., Lander, E.S.,

Loda, M., Kantoff, P.W., Golub, T.R., Sellers, W.R.: Gene expression

correlates of clinical prostate cancer behavior. Cancer Cell 1(2),

203–209 (2002). doi:10.1016/S1535-6108(02)00030-2

36. Yagi, T., Morimoto, A., Eguchi, M., Hibi, S., Sako, M., Ishii, E.,

Mizutani, S., Imashuku, S., Ohki, M., Ichikawa, H.: Identification of a

gene expression signature associated with pediatric AML prognosis.

Blood 102(5), 1849–1856 (2003). doi:10.1182/blood-2003-02-0578

37. Chowdary, D., Lathrop, J., Skelton, J., Curtin, K., Briggs, T., Zhang,

Y., Yu, J., Wang, Y., Mazumder, A.: Prognostic gene expression

signatures can be measured in tissues collected in RNAlater

preservative. The Journal of Molecular Diagnostics 8(1), 31–39 (2006).

doi:10.2353/jmoldx.2006.050056

38. Reshef, D.N., Reshef, Y.A., Finucane, H.K., Grossman, S.R., McVean,

G., Turnbaugh, P.J., Lander, E.S., Mitzenmacher, M., Sabeti, P.C.:

Detecting novel associations in large data sets. Science 334(6062),

1518–1524 (2011). doi:10.1126/science.1205438

39. Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: Open source

scientific tools for Python (2001–). http://www.scipy.org/

40. Meyer, P.E., Lafitte, F., Bontempi, G.: minet: A R/Bioconductor

package for inferring large transcriptional networks using mutual

information. BMC Bioinformatics 9(1), 1–10 (2008).

doi:10.1186/1471-2105-9-461

41. Albanese, D., Filosi, M., Visintainer, R., Riccadonna, S., Jurman, G.,

Furlanello, C.: minerva and minepy: a C engine for the MINE suite and

its R, Python and MATLAB wrappers. Bioinformatics 29(3), 407–408

(2013). doi:10.1093/bioinformatics/bts707

42. Thomas, P.D., Campbell, M.J., Kejariwal, A., Mi, H., Karlak, B.,

Daverman, R., Diemer, K., Muruganujan, A., Narechania, A.:

PANTHER: A library of protein families and subfamilies indexed by

function. Genome Research 13(9), 2129–2141 (2003).

doi:10.1101/gr.772403

43. Rappaport, N., Nativ, N., Stelzer, G., Twik, M., Guan-Golan, Y.,

Iny Stein, T., Bahir, I., Belinky, F., Morrey, C.P., Safran, M., Lancet,

D.: MalaCards: an integrated compendium for diseases and their

annotation. Database 2013 (2013). doi:10.1093/database/bat018

Lazzarini et al. Page 16 of 16

44. Hamosh, A., Scott, A.F., Amberger, J.S., Bocchini, C.A., McKusick,

V.A.: Online Mendelian Inheritance in Man (OMIM), a knowledgebase

of human genes and genetic disorders. Nucleic Acids Research

33(suppl 1), 514–517 (2005). doi:10.1093/nar/gki033

45. INSERM: Orphanet: an online database of rare diseases and orphan

drugs (1997). http://www.orpha.net/

46. Magrane, M., Consortium, U.: UniProt Knowledgebase: a hub of

integrated protein data. Database 2011 (2011).

doi:10.1093/database/bar009

47. Davis, A.P., Grondin, C.J., Lennon-Hopkins, K., Saraceni-Richards, C.,

Sciaky, D., King, B.L., Wiegers, T.C., Mattingly, C.J.: The

Comparative Toxicogenomics Database’s 10th year anniversary: update

2015. Nucleic Acids Research 43(D1), 914–920 (2015).

doi:10.1093/nar/gku935

48. Tran, N.H., Choi, K.P., Zhang, L.: Counting motifs in the human

interactome. Nature communications 4 (2013).

doi:10.1038/ncomms3241

49. Chen, Z., Lu, W.: Roles of Ubiquitination and SUMOylation on

Prostate Cancer: Mechanisms and Clinical Implications. International

Journal of Molecular Sciences 16(3), 4560–4580 (2015).

doi:10.3390/ijms16034560

50. McCubrey, J.A., Steelman, L.S., Chappell, W.H., Abrams, S.L., Wong,

E.W.T., Chang, F., Lehmann, B., Terrian, D.M., Milella, M., Tafuri,

A., Stivala, F., Libra, M., Basecke, J., Evangelisti, C., Martelli, A.M.,

Franklin, R.A.: Roles of the Raf/MEK/ERK pathway in cell growth,

malignant transformation and drug resistance. Biochimica et

Biophysica Acta (BBA) - Molecular Cell Research 1773(8), 1263–1284

(2007). doi:10.1016/j.bbamcr.2006.10.001. Mitogen-Activated Protein

Kinases: New Insights on Regulation, Function and Role in Human

Disease

51. Monteith, G.R.: Prostate cancer cells alter the nature of their calcium

influx to promote growth and acquire apoptotic resistance. Cancer cell

26(1), 19–32 (2014). doi:10.1016/j.ccr.2014.06.015

52. Flourakis, M., Prevarskaya, N.: Insights into Ca2+ homeostasis of

advanced prostate cancer cells. Biochimica et Biophysica Acta (BBA)

- Molecular Cell Research 1793(6), 1105–1109 (2009).

doi:10.1016/j.bbamcr.2009.01.009. 10th European Symposium on

Calcium

53. Kwon, E.M., Holt, S.K., Fu, R., Kolb, S., Williams, G., Stanford, J.L.,

Ostrander, E.A.: Androgen metabolism and JAK/STAT pathway genes

and prostate cancer risk. Cancer Epidemiology 36(4), 347–353 (2012).

doi:10.1016/j.canep.2012.04.002

54. Barton, B.E., Karras, J.G., Murphy, T.F., Barton, A., Huang, H.F.-S.:

Signal transducer and activator of transcription 3 (STAT3) activation

in prostate cancer: Direct STAT3 inhibition induces apoptosis in

prostate cancer lines. Molecular Cancer Therapeutics 3(1), 11–20

(2004)

55. Minelli, A., Bellezza, I., Conte, C., Culig, Z.: Oxidative stress-related

aging: A role for prostate cancer? Biochimica et Biophysica Acta

(BBA) - Reviews on Cancer 1795(2), 83–91 (2009).

doi:10.1016/j.bbcan.2008.11.001

56. Khandrika, L., Kumar, B., Koul, S., Maroni, P., Koul, H.K.: Oxidative

stress in prostate cancer. Cancer Letters 282(2), 125–136 (2009).

doi:10.1016/j.canlet.2008.12.011

57. Drewa, T., Wolski, Z., Skok, Z., Czajkowski, R., Wisniewska, H.: The

FAS-related apoptosis signaling pathway in the prostate intraepithelial

neoplasia and cancer lesions. Acta Poloniae Pharmaceutica 63(4),

311–315 (2006)

58. DiLella, A.G., Toner, T.J., Austin, C.P., Connolly, B.M.: Identification

of Genes Differentially Expressed in Benign Prostatic Hyperplasia.

Journal of Histochemistry and Cytochemistry 49(5), 669–670 (2001).

doi:10.1177/002215540104900517

59. Luo, J., Dunn, T.A., Ewing, C.M., Walsh, P.C., Isaacs, W.B.:

Decreased gene expression of steroid 5 alpha-reductase 2 in human

prostate cancer: Implications for finasteride therapy of prostate

carcinoma. The Prostate 57(2), 134–139 (2003).

doi:10.1002/pros.10284

60. Ribeiro, R., Monteiro, C., Silvestre, R., Castela, A., Coutinho, H.,

Fraga, A., Prıncipe, P., Lobato, C., Costa, C., Cordeiro-da-Silva, A.,

Lopes, J.M., Lopes, C., Medeiros, R.: Human periprostatic white

adipose tissue is rich in stromal progenitor cells and a potential source

of prostate tumor stroma. Experimental Biology and Medicine

237(10), 1155–1162 (2012). doi:10.1258/ebm.2012.012131

61. Thompson, V.C., Day, T.K., Bianco-Miotto, T., Selth, L.A., Han, G.,

Thomas, M., Buchanan, G., Scher, H.I., Nelson, C.C., Greenberg,

N.M., Butler, L.M., Tilley, W.D.: A gene signature identified using a

mouse model of androgen receptor-dependent prostate cancer predicts

biochemical relapse in human disease. Int J Cancer 131(3), 662–672

(2012). doi:10.1002/ijc.26414

62. Sampson, N., Ruiz, C., Zenzmaier, C., Bubendorf, L., Berger, P.:

PAGE4 Positivity Is Associated with Attenuated AR Signaling and

Predicts Patient Survival in Hormone-Naive Prostate Cancer. The

American Journal of Pathology 181(4), 1443–1454 (2012).

doi:10.1016/j.ajpath.2012.06.040

63. Shiraishi, T., Terada, N., Zeng, Y., Suyama, T., Luo, J., Trock, B.,

Kulkarni, P., Getzenberg, R.: Cancer/Testis antigens as potential

predictors of biochemical recurrence of prostate cancer following

radical prostatectomy. Journal of Translational Medicine 9(1), 153

(2011). doi:10.1186/1479-5876-9-153

64. Larsen, S., Yokochi, T., Isogai, E., Nakamura, Y., Ozaki, T.,

Nakagawara, A.: LMO3 interacts with p53 and inhibits its

transcriptional activity. Biochemical and Biophysical Research

Communications 392(3), 252–257 (2010).

doi:10.1016/j.bbrc.2009.12.010

65. Taylor, B.S., Schultz, N., Hieronymus, H., Gopalan, A., Xiao, Y.,

Carver, B.S., Arora, V.K., Kaushik, P., Cerami, E., Reva, B., Antipin,

Y., Mitsiades, N., Landers, T., Dolgalev, I., Major, J.E., Wilson, M.,

Socci, N.D., Lash, A.E., Heguy, A., Eastham, J.A., Scher, H.I., Reuter,

V.E., Scardino, P.T., Sander, C., Sawyers, C.L., Gerald, W.L.:

Integrative Genomic Profiling of Human Prostate Cancer. Cancer Cell

18(1), 11–22 (2010). doi:10.1016/j.ccr.2010.05.026

66. Cerami, E., Gao, J., Dogrusoz, U., Gross, B.E., Sumer, S.O., Aksoy,

B.A., Jacobsen, A., Byrne, C.J., Heuer, M.L., Larsson, E., Antipin, Y.,

Reva, B., Goldberg, A.P., Sander, C., Schultz, N.: The cBio Cancer

Genomics Portal: An Open Platform for Exploring Multidimensional

Cancer Genomics Data. Cancer Discovery 2(5), 401–404 (2012).

doi:10.1158/2159-8290.CD-12-0095

67. Verleyen, W., Ballouz, S., Gillis, J.: Positive and negative forms of

replicability in gene network analysis. Bioinformatics 32(7), 1065–1073

(2016). doi:10.1093/bioinformatics/btv734

Supplementary MaterialFunctional networks inference from machine learning models

Nicola Lazzarini, Paweł Widera, Stuart Williamson, Rakesh Heer, Natalio Krasnogor and Jaume Bacardit

1 Classification accuracy under feature selection

To choose the default percentage of attributes retained in the feature selection procedure, we performed apreliminary analysis using all 8 transcriptomic datasets from the main article. We evaluated how the Bio-HEL classification accuracy changes with the number of selected features (using linear SVM-RFE). The ac-curacy was measured using a standard 10 cross-fold validation. The full experiment (not reported here) used100%, 90%, 80%, ..., 10% of the original dataset attributes.

We found that even when only 10% of the attributes are retained, the classification accuracy remains almostunchanged. Specifically, with 10% of the original attributes the accuracy increased for 2 datasets, slightlydecreased for 3 datasets and remained unaltered for the other datasets. The exact results are reported inSupplementary Table 1 below:

Supplementary Table 1: BioHEL classification accuracy for each dataset, in 10-fold cross-validation experiments onthe original and reduced set of attributes (before and after the feature selection). Linear SVM-RFE was used to selectbest 10% of the attributes.

dataset all attributes 10% attributesDlbcl 0.871 0.886CNS 0.473 0.451Leukemia 0.945 0.945Lung-Michigan 0.980 0.980Lung-Harvard 0.978 0.964Prostate 0.892 0.892AML 0.637 0.592Colon-Breast 0.903 0.940

Given that we were able to maintain good classification accuracy despite large reduction in number of usedattributes, we decided to use 10% attributes as a default setting for the FuNeL feature selection procedure.However, this FuNeL parameter is under the user control and the default setting can be changed.

2 Time complexity

The FuNeL protocol has four stages (see Figure 2 in the main article): (1) feature selection (optional), (2)rule-based network generation, (3) permutation test and (4) second rule-based network generation (optional).

The running time for the whole pipeline depends on the rule set generation time (execution time of BioHEL),as the optional feature selection stage can be seen as running in constant time. Two main factors that influencethe rule set generation time are: (1) the number of attributes and (2) the number of samples.

We performed an execution time analysis of BioHEL using the largest (in terms of number of attributes) Colon-Breast dataset (Chowdary et al., 2006). In the feature selection stage we retained: 20, 200, 2000, 10 000 and20 000 attributes. From each of these 5 datasets we generated 100 random subsets of 50, 40, 30, 20 and 10samples. Finally, we ran BioHEL 1000 times to obtain 1000 rule sets for each dataset.

Supplementary Figure 1 shows the running times averaged across 100 000 runs (1000 runs for each of the 100datasets).

1

0 5000 10000 15000 20000attributes

0.5

1.0

1.5

2.0

2.5

3.0

3.5

4.0

4.5

seco

nds

BioHel execution times

50 samples40 samples30 samples20 samples10 samples

Supplementary Figure 1: Average execution times of a single BioHEL run for a given number of samples andattributes.

The total execution time of FuNeL configurations C1 and C2 is calculated as:

T1 = (rule_sets× t(atts1, samples)) + (permutation_runs× t(atts1, samples)) (1)

where rule_sets is the number of inferred rule sets, permutation_runs is the number of randomised datasetsused in the permutation test and t(atts1, samples) represents execution time of a single BioHEL run, thatlinearly depends on the size of a dataset measured in number of attributes and samples.

Configurations C3 and C4 require an additional run of BioHEL (step 4), and their total execution time is:

T2 = T1 + (rule_sets× t(atts2, samples)) (2)

where atts2 is the number of attributes after the permutation test (atts1 ≤ atts2).

It is important to notice that each run of BioHEL is independent, thus the generation of the rule sets can betrivially parallelised without any extra overhead. Given n computational cores, the total execution times couldbe reduced to:

Treal1 = T1

nTreal2 = T2

n(3)

2

3 Comparison of networks topological properties

The network topology refers to the spatial arrangements of its elements. The analysis of topological propertiestells us how different nodes are connected to each other and how their communication paths look like. Thereare many aspects and characteristics that can be evaluated in a network. For simplicity we report just fourmetrics: number of nodes, number of edges, clustering coefficient and diameter.

The clustering coefficient is a measure of degree to which nodes in a network tend to cluster together. Itexpresses the likelihood that any two nodes with a common neighbour are themselves connected. The diameterindicates the maximum distance between two nodes in the network.

We compared the topology of the networks built with two different approaches: co-prediction and co-expression.For each generated network we calculated the topological properties described above. We compare the FuNeLnetworks with co-expression networks inferred with different methods. The Supplementary Tables 2 to 4 belowshow the results for the PCC, ARACNE and MIC networks.

Supplementary Table 2: Topological properties for co-prediction and PCC co-expression networks generated for all8 datasets.

Co-prediction Co-expression (SE) Co-expression (SN)Dataset Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)

Leukemia

Nodes 421 1480 294 988 683 873 843 941 422 1482 293 979Edges 1529 2294 2154 2646 1529 2294 2154 2646 680 7145 409 2870Clust.Coef. 0.712 0.155 0.589 0.33 0.333 0.348 0.354 0.341 0.323 0.388 0.303 0.344Diameter 5 6 4 6 24 18 19 22 16 18 9 20

LungH

Nodes 429 1419 382 1030 578 930 955 1214 432 1413 384 1027Edges 1068 2317 2398 3410 1068 2317 2398 3410 617 4302 476 2650Clust.Coef. 0.344 0.298 0.43 0.404 0.356 0.373 0.376 0.372 0.341 0.386 0.296 0.376Diameter 5 8 5 7 10 23 23 21 6 22 6 23

LungM

Nodes 91 919 48 247 76 280 59 119 90 915 50 248Edges 134 1858 78 410 134 1858 78 410 224 13574 64 1510Clust.Coef. 0.379 0.262 0.418 0.457 0.465 0.525 0.446 0.514 0.539 0.523 0.493 0.511Diameter 3 5 3 3 6 11 6 6 5 14 6 12

CNS

Nodes 501 4257 494 3538 945 2152 1616 2607 501 4261 488 3532Edges 4302 25069 12769 40840 4302 25069 12769 40840 1553 171052 1502 90395Clust.Coef. 0.743 0.255 0.521 0.302 0.354 0.389 0.367 0.400 0.346 0.427 0.35 0.421Diameter 4 7 4 6 21 15 23 13 12 13 14 12

Dlbcl

Nodes 201 1699 201 1617 207 1411 1238 1790 200 1699 200 1614Edges 848 10471 7351 33170 848 10471 7351 33170 832 24280 832 17865Clust.Coef. 0.872 0.574 0.642 0.453 0.508 0.438 0.411 0.51 0.501 0.504 0.501 0.481Diameter 3 5 3 5 2 16 17 14 2 14 2 15

GSE2191

Nodes 890 4802 846 3561 837 1848 1239 1750 897 4799 839 3553Edges 3290 13424 6469 12074 3290 13424 6469 12074 3711 90410 3292 47806Clust.Coef. 0.488 0.082 0.317 0.291 0.377 0.409 0.394 0.4 0.382 0.415 0.377 0.417Diameter 5 9 5 9 25 23 19 21 21 13 25 15

GS3726

Nodes 668 2077 524 1170 879 1300 992 1367 759 2300 1739 3440Edges 1761 3255 2051 3502 1761 3255 2051 3502 1471 9808 5479 26761Clust.Coef. 0.134 0.0077 0.307 0.109 0.226 0.23 0.223 0.233 0.213 0.287 0.254 0.346Diameter 8 10 7 8 20 26 20 25 26 15 19 15

Prostate

Nodes 938 4290 704 2277 356 543 322 448 920 4298 702 2287Edges 3796 10175 3090 6546 3796 10175 3090 6546 33250 914829 16934 24427Clust.Coef. 0.328 0.245 0.29 0.25 0.565 0.607 0.541 0.584 0.655 0.703 0.641 0.711Diameter 7 10 6 8 6 9 5 8 8 11 7 12

3

Supplementary Table 3: Topological properties for co-prediction and ARACNE co-expression networks generated forall 8 datasets.

Co-prediction Co-expression (SE) Co-expression (SN)Dataset Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)

Leukemia

Nodes 421 1480 294 988 1024 1426 1356 1577 422 1480 294 989Edges 1529 2294 2154 2646 1529 2294 2154 2646 512 2416 327 1479Clust.Coef. 0.712 0.155 0.589 0.330 0.002 0.002 0.002 0.002 0.000 0.002 0.000 0.002Diameter 5 6 4 6 17 19 22 19 9 17 11 19

LungH

Nodes 429 1419 382 1030 907 1614 1653 2066 429 1419 382 1030Edges 1068 2317 2398 3410 1068 2317 2398 3410 435 1924 375 1250Clust.Coef. 0.344 0.298 0.430 0.404 0.007 0.006 0.006 0.005 0.013 0.006 0.012 0.007Diameter 5 8 5 7 23 16 15 13 14 18 10 18

LungM

Nodes 91 919 48 247 143 1321 96 370 91 920 48 247Edges 134 1858 78 410 134 1858 78 410 72 1127 34 259Clust.Coef. 0.379 0.262 0.418 0.475 0.000 0.002 0.000 0.009 0.000 0.005 0.000 0.014Diameter 3 5 3 3 13 17 11 18 11 17 5 11

CNS

Nodes 501 4257 494 3538 2002 4509 3581 5342 502 4257 494 3538Edges 4302 25069 12769 40840 4302 25069 12769 41661 513 20409 505 12358Clust.Coef. 0.743 0.255 0.521 0.302 0.004 0.005 0.006 0.026 0.004 0.005 0.004 0.005Diameter 4 7 4 6 12 8 12 7 20 9 20 12

Dlbcl

Nodes 201 1699 201 1617 380 1452 1191 2236 201 1699 201 1617Edges 848 10471 7351 33170 848 10471 7351 33890 269 14149 269 12903Clust.Coef. 0.872 0.574 0.642 0.453 0.136 0.126 0.140 0.176 0.113 0.110 0.113 0.115Diameter 3 5 3 5 13 9 11 5 12 8 12 9

GSE2191

Nodes 890 4802 846 3561 2574 5226 3846 5027 890 4802 846 3561Edges 3290 13424 6469 12074 3290 13424 6469 12076 846 10671 794 5564Clust.Coef. 0.488 0.082 0.317 0.291 0.002 0.002 0.002 0.002 0.004 0.002 0.004 0.002Diameter 5 9 5 9 19 13 16 13 30 15 30 17

GS3726

Nodes 668 2077 524 1170 1362 2167 1546 2279 668 2166 524 1170Edges 1761 3255 2051 3502 1761 3255 2051 3502 787 3250 597 1455Clust.Coef. 0.134 0.077 0.307 0.109 0.024 0.053 0.029 0.050 0.014 0.053 0.016 0.021Diameter 8 10 7 8 20 20 19 18 15 20 13 19

Prostate

Nodes 938 4290 704 2277 2760 6805 2268 4575 939 4290 704 2277Edges 3796 10175 3090 6546 3796 10175 3090 6546 1300 6095 1017 3102Clust.Coef. 0.328 0.245 0.290 0.250 0.005 0.003 0.006 0.003 0.001 0.003 0.002 0.005Diameter 7 10 6 8 13 13 15 13 12 13 9 15

4

Supplementary Table 4: Topological properties for co-prediction and MIC co-expression networks generated for all 8datasets.

Co-prediction Co-expression (SE) Co-expression (SN)Dataset Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)

Leukemia

Nodes 421 1480 294 988 640 807 780 896 421 1480 294 989Edges 1529 2294 2154 2646 1529 2294 2155 2647 749 6173 432 3096Clust.Coef. 0.712 0.155 0.589 0.330 0.162 0.182 0.180 0.179 0.138 0.180 0.127 0.175Diameter 5 6 4 6 18 29 27 29 10 17 8 18

LungH

Nodes 429 1419 382 1030 384 685 703 944 429 1419 382 1030Edges 1068 2317 2398 3410 1068 2317 2399 3410 1264 5867 1045 3841Clust.Coef. 0.344 0.298 0.430 0.404 0.349 0.308 0.305 0.302 0.339 0.282 0.343 0.305Diameter 5 8 5 7 9 13 13 17 7 18 9 19

LungM

Nodes 91 919 48 247 118 626 79 219 91 919 48 247Edges 134 1858 78 410 134 1858 78 410 93 3109 38 484Clust.Coef. 0.379 0.262 0.418 0.475 0.212 0.272 0.213 0.306 0.208 0.235 0.153 0.302Diameter 3 5 3 3 8 18 7 8 6 14 3 7

CNS

Nodes 501 4257 494 3538 1424 3104 2357 3725 501 4257 495 3538Edges 4302 25069 12769 40840 4305 25131 12771 40850 704 62208 694 36027Clust.Coef. 0.743 0.255 0.521 0.302 0.124 0.154 0.144 0.159 0.089 0.162 0.091 0.161Diameter 4 7 4 6 17 11 12 10 11 10 11 10

Dlbcl

Nodes 201 1699 201 1617 475 1140 1047 1453 203 1699 203 1617Edges 848 10471 7351 33170 848 10471 7362 33172 196 74773 196 59307Clust.Coef. 0.872 0.574 0.642 0.453 0.111 0.240 0.219 0.319 0.082 0.381 0.082 0.366Diameter 3 5 3 5 21 13 16 15 11 11 11 11

GSE2191

Nodes 890 4802 846 3561 1700 4129 2617 3883 890 4803 846 3563Edges 3290 13424 6469 12074 3299 13433 6469 12207 1380 17797 1271 10540Clust.Coef. 0.488 0.082 0.317 0.291 0.109 0.095 0.098 0.098 0.120 0.095 0.118 0.099Diameter 5 9 5 9 22 15 18 16 19 15 21 15

GS3726

Nodes 668 2077 524 1170 1271 1921 1357 1996 672 2152 526 1172Edges 1761 3255 2051 3502 1890 3261 2056 3524 852 3921 538 1705Clust.Coef. 0.134 0.077 0.307 0.109 0.110 0.100 0.104 0.100 0.126 0.099 0.121 0.117Diameter 8 10 7 8 23 29 23 28 14 24 14 24

Prostate

Nodes 938 4290 704 2277 687 839 667 773 964 4290 712 2277Edges 3796 10175 3090 6546 3981 10186 3777 8257 15928 1763794 5254 308709Clust.Coef. 0.328 0.245 0.290 0.250 0.167 0.278 0.169 0.265 0.313 0.758 0.218 0.661Diameter 7 10 6 8 7 8 7 8 7 8 7 9

When analysing FuNeL networks we observed, as expected, that configurations having feature selection (C1 andC3) lead to networks with a smaller number of nodes than when the original set of attributes is used (C2 andC4). Furthermore, the second phase of machine learning modeling (C3 and C4) tends to reduce the number ofnodes as it uses a reduced set of attributes as input (only significant nodes and their neighbours from the firsttraining phase), while increasing both clustering coefficient and number of edges.

When comparing FuNeL and co-expression networks we notice that the ARACNE SE counterparts have ingeneral more nodes. The same patter can be found in SE(C2 and SE(C4) counterparts generated with PCCand MIC, while it’s not true for the SE-networks based on configurations that use feature selection (C1 andC2). Conversely, SN -networks differ according to the inference method used. In fact ARACNE generated SNcounterparts with less edges, while this is true only for SN(C1) and SN(C3) inferred with MIC and PCC.The clustering coefficient is constantly lower in ARACNE networks than in FuNeL, this is probably due to thepruning phase operated by the method. A similar trend can be noticed for MIC networks with some exceptions(e.g Prostate SN(C2) and SN(C4)). A more balanced situation occurs when FuNeL is contrasted with PCC,in fact networks generated with feature selection (C1 and C3) have a lower coefficient than their co-expressioncounterparts. Finally a clear pattern emerge when analysing the diameter of the networks. Co-predictionnetworks are always more compact than co-expression counterparts having up to 3 time lower diameter for MICand PCC and up to 7 time lower for ARACNE.

5

4 Enrichment Score analysis

In this section we report the network average rankings, based on the Enrichment Score, across the 8 datasetsfor each inferring method. The networks are ranked between 1 and N (where N = 4 for FuNeL and N = 8for PCC, ARACNE and MIC: 4 SE(Ci) + 4 SN(Ci)). We considered Gene Ontology terms (biological process(BP), molecular function (MF) and cellular component (CC)) and biological pathways. The last row of eachtable represents the average rank across different biological categories.

Supplementary Table 5: Average network ranks for each method across the 8 datasets (based on ES).

Cat. C1 C2 C3 C4

GO BP 3 4 2 1GO MF 4 2.5 1 2.5GO CC 2 4 1 3Pathways 4 2 3 1

Average 3.25 3.125 1.75 1.88

(a) FuNeL networks

PCC (SE) PCC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

GO BP 2 4 1 3 6 7 5 8GO MF 8 3.5 5 6 7 3.5 1 2GO CC 2 5 4 6 1 8 3 7Pathways 6 5 4 3 7 1 8 2Average 4.5 4.38 3.5 4.5 5.25 4.88 4.25 4.75

(b) PCC networks

ARACNE (SE) ARACNE (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

GO BP 4 7 5.5 8 2 5.5 1 3GO MF 4 8 4 7 2 6 1 4GO CC 3 7 5 8 2 6 1 4Pathways 1 4 2 6 7 5 8 3Average 3 6.5 4.13 7.25 3.25 5.63 2.75 3.5

(c) ARACNE networks

MIC (SE) MIC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

GO BP 2 6 4 5 3 8 1 7GO MF 1 6 4 7 3 8 2 5GO CC 3 5.5 4 5.5 2 8 1 7Pathways 2.5 4 1 5.5 8 5.5 7 2.5Average 2.13 5.38 3.25 5.75 4 7.38 2.75 5.38

(d) MIC networks

We also compared the generated networks against FuNeL. The networks are ranked from 1 to 12: 4 Ci + 4SE(Ci) + 4 SN (Ci). The ranks in Supplementary Table 6 are averaged across the 8 datasets, for each biologicalcategory and for each network. The row-wise rank is given in brackets and the highest ranks are shown withbold font. The following abbreviations were used for GO categories: biological process (BP), molecular function(MF) and cellular component (CC).

Supplementary Table 6: Average network ranks (co-prediction vs. co-expression).

Co-prediction Co-expression (SE) Co-expression (SN)Method Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)

PCC

GO BP 6.06 (6) 7.00 (7.5) 7.00 (7.5) 5.88 (3.5) 5.88 (3.5) 5.88 (3.5) 5.12 (1) 5.88 (3.5) 7.06 (9) 7.75 (12) 7.12 (10) 7.38 (11)GO MF 7.81 (11) 5.38 (2) 6.19 (5) 5.62 (3) 9.12 (12) 6.50 (7.5) 6.50 (7.5) 7.38 (10) 7.19 (9) 6.25 (6) 4.31 (1) 5.75 (4)GO CC 4.31 (5) 11.00 (12) 4.19 (4) 9.00 (10) 3.88 (2) 6.25 (6.5) 6.25 (6.5) 8.12 (8) 3.19 (1) 9.38 (11) 4.06 (3) 8.38 (9)Pathways 8.12 (10.5) 4.75 (2) 8.12 (10.5) 4.38 (1) 6.94 (8) 6.50 (6) 6.69 (7) 5.88 (5) 7.62 (9) 5.00 (3) 8.50 (12) 5.50 (4)

ARACNE

GO BP 6.69 (7) 6.25 (5.5) 6.94 (10) 4.75 (1) 6.25 (5.5) 8.12 (11) 6.88 (8.5) 9.25 (12) 5.19 (3) 6.88 (8.5) 5.06 (2) 5.75 (4)GO MF 7.44 (10) 6.50 (8) 6.19 (6.5) 5.69 (3) 6.00 (5) 8.75 (12) 5.62 (2) 8.62 (11) 5.81 (4) 7.00 (9) 4.19 (1) 6.19 (6.5)GO CC 4.31 (4) 10.75 (12) 3.44 (3) 8.25 (8) 5.50 (5) 9.38 (10) 6.75 (7) 10.12 (11) 2.44 (2) 8.88 (9) 2.31 (1) 5.88 (6)Pathways 7.88 (10.5) 5.38 (3) 7.88 (10.5) 5.25 (2) 4.88 (1) 6.25 (6) 5.50 (4) 7.00 (8) 7.56 (9) 6.50 (7) 8.38 (12) 5.56 (5)

MIC

GO BP 7.44 (8.5) 7.88 (11) 7.44 (8.5) 6.38 (5.5) 4.12 (2) 7.00 (7) 6.00 (4) 6.38 (5.5) 4.81 (3) 9.12 (12) 3.94 (1) 7.50 (10)GO MF 8.06 (10) 8.50 (12) 7.19 (8) 7.75 (9) 3.62 (1) 6.50 (5.5) 5.12 (3) 6.75 (7) 5.31 (4) 8.12 (11) 4.56 (2) 6.50 (5.5)GO CC 5.19 (6) 11.62 (12) 4.44 (4) 10.12 (10) 4.25 (3) 7.12 (7.5) 5.12 (5) 7.12 (7.5) 3.19 (2) 10.38 (11) 1.69 (1) 7.75 (9)Pathways 8.75 (12) 4.75 (1) 7.50 (10) 5.50 (3) 6.00 (5) 6.50 (6) 5.00 (2) 6.62 (7) 7.56 (11) 7.00 (9) 6.94 (8) 5.88 (4)

6

5 Disease association analysis

In this section we report the network average rankings across the 8 datasets for every inferring method basedon the gene-disease association properties: participation in triangular relationship and proximity. We used twosources for the disease associations: Malacards (Rappaport et al., 2013) (a meta-database of human maladiesconsolidated from 64 independent sources) and manually curated databases (OMIM (Hamosh et al., 2005),Orphanet (INSERM, 1997), Uniprot (Magrane and Consortium, 2011) and CTD (Davis et al., 2015)). Thenetworks are ranked between 1 and N (where N = 4 for FuNeL and N = 8 for PCC, ARACNE and MIC: 4SE(Ci) + 4 SN(Ci)). The number of disease-associated genes participating in a triangle is denoted as 1A, 2Aand 3A. The last row of each table represents the average rank across different metrics.

Supplementary Table 7: Average network ranks for each method across the 8 datasets (based on diseaseassociations from Malacards).

Cat. C1 C2 C3 C4

1A 3 1 4 22A 4 1 2 33A 3 3 1 3Proximity 3 1 4 2

Average 3.25 1.5 2.75 2.5

(a) FuNeL networks

PCC (SE) PCC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

1A 8 5 6 3 4 1 2 72A 7 4 5 3 8 2 6 13A 6.5 6.5 6.5 6.5 3 2 4 1Proximity 1.5 5 3 1.5 7 6 8 4Average 5.75 5.13 5.13 3.5 5.5 2.75 5 3.25

(b) PCC networks

ARACNE (SE) ARACNE (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

1A 4 8 7 5.5 2.5 2.5 5.5 12A 7 6 8 1 4 3 2 53A 5.5 1 5.5 2 5.5 5.5 5.5 5.5Proximity 7 3 5 1 6 2 8 4Average 5.88 4.5 6.38 2.38 4.5 3.25 5.25 3.88

(c) ARACNE networks

MIC (SE) MIC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

1A 6 3 7.5 1 5 2 7.5 42A 8 3 5 2 7 1 6 43A 7 2 8 6 5 1 3 4Proximity 6 1 2 5 7 4 8 3Average 6.75 2.25 5.63 3.5 6 2 6.13 3.75

(d) MIC networks

Supplementary Table 8: Average network ranks for each method across the 8 datasets (based on diseaseassociations from curated databases).

Cat. C1 C2 C3 C4

1A 3 1 4 22A 3 4 1 23A 1 2.5 2.5 4Proximity 4 1 3 2

Average 2.75 2.13 2.67 2.5

(a) FuNeL networks

PCC (SE) PCC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

1A 8 4 6.5 1.5 6.5 1.5 3 52A 2 7 5.5 3 4 5.5 1 83A 5 7 4 8 1 6 3 2Proximity 4 8 2.5 7 5 2.5 1 6Average 4.5 6.5 4.63 4.88 5.13 3.88 2 5.25

(b) PCC networks

ARACNE (SE) ARACNE (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

1A 3 5 8 6 2 4 1 72A 5 1 2 3 7.5 6 7.5 43A 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5Proximity 6 5 4 1 7.5 3 7.5 2Average 4.63 3.88 4.63 3.63 5.38 4.38 5.13 4.38

(c) ARACNE networks

MIC (SE) MIC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4

1A 6 2 4 3 8 1 7 52A 7 3 8 1.5 5.5 1.5 5.5 43A 4 1 8 2 6 5 7 3Proximity 5.5 1 3 4 7 2 8 5.5Average 5.63 1.75 5.75 2.63 6.63 2.38 6.88 4.38

(d) MIC networks

7

6 Case study: prostate cancer dataset

In this section we report the additional results from the analysis performed using the prostate dataset (Singhet al., 2002) as a case study. In particular we show: 1) the overlap of enriched terms between co-prediction andco-expression networks, 2) the overlap between GO terms associated to the hubs of the networks generated withdifferent methods and FuNeL and 3) the average percentages of alteration for key nodes of both co-predictionand co-expression networks in an independent dataset.

6.1 Overlap of the enriched terms

We performed an analysis on the enriched terms of each network to highlight the complementary nature of theco-prediction and the co-expression paradigm. We generated heatmaps showing the unique terms associatedonly to co-prediction or co-expression networks. The main manuscript includes the comparison between FuNeLand PCC networks, in here we report the analysis performed considering ARACNE (Supplementary Figure 2)and MIC (Supplementary Figure 3) networks. For the sake of readability we filtered out the generic GO terms(with depth < 9 in the GO hierarchical structure).

Supplementary Figure 2: Number of non-common enriched GO terms (biological process) for eachnetwork configuration (generated from the prostate cancer dataset). On the x-axis we show the 12 investigatednetworks. On the y-axis we show the names of enriched terms unique to co-prediction or ARACNE co-expressionnetworks. Red terms are associated with co-expression networks, blue with co-prediction. Empty columns indicatenetworks with no unique terms.

Supplementary Figure 3: Number of non-common enriched GO terms (biological process) for eachnetwork configuration (generated from the prostate cancer dataset). On the x-axis we show the 12 investigatednetworks. On the y-axis we show the names of enriched terms unique to co-prediction or MIC co-expression networks.Red terms are associated with co-expression networks, blue with co-prediction. Empty columns indicate networks withno unique terms.

8

When comparing ARACNE and FuNeL, we found 16 unique pathways for co-prediction networks and 8 forco-expression. In terms of unique GO terms, the overlap was more balanced, 7 for co-prediction networks and9 for co-expression networks. C2 and C4, generated without feature selection, had the largest number of uniquepathways, while SE(C2) had the highest number of terms for ARACNE. The comparison of FuNeL with MICgenerated many empty columns (Supplementary Figure 3) for the GO terms because several networks resultedhaving no unique enriched terms. All the 15 unique GO terms related to MIC were associated to SN(C2) (andwith SN(C4) in two cases), conversely FuNeL had more networks sharing the 12 unique terms. Finally, asnoticed for in the ARACNE comparison, FuNeL networks are more enriched in biological pathways: 16 against8 unique terms for MIC co-expression.

6.2 Overlap of hub-related terms

We also analysed the gene associated to the hubs of each network in order to compare the biological knowledgeassociated to them. A node v was considered to be a hub if its degree was at least one standard deviation abovethe mean network degree, that is if:

d(v) > µd + σd (4)

where d(v) is a degree of the node v, and µd and σd are the mean and standard deviation of a network nodedegree distribution.

To compare the networks, we used the 10 most frequent GO terms (biological processes) shared among eachnetwork’s hubs. Supplementary Figures 4 to 6 show the terms-overlap analysis between FuNeL networks andPCC, ARACNE and MIC respectively. To make this analysis more specific we have discarded the most generic/ most common terms (which could be be associated with many genes), we considered only the GO termssituated at level 10 of the GO hierarchy or lower.

Blue terms were found only in co-prediction networks, red terms were found only in co-expression networks,and green terms were found in both. In Supplementary Table 9 we summarise the number of unique andcommon terms shared between networks created with different approaches. This analysis further highlights thecomplementary nature of co-prediction and co-expression approach, the terms that are paradigm-specif alwaysoutnumber the common ones.

Supplementary Table 9: Unique and common terms from networks’ hubs

Terms PCC ARACNE MICCo-prediction 16 18 16Co-expression 19 20 19Common 11 9 11

9

C1 C2 C3 C4

SE(C1)

SE(C2)

SE(C3)

SE(C4)

SN(C1)

SN(C2)

SN(C3)

SN(C4)

positive regulation of ERK1 and ERK2 cascade (GO:0070374)

TRIF­dependent toll­like receptor signaling pathway (GO:0035666)

toll­like receptor TLR6:TLR2 signaling pathway (GO:0038124)

positive regulation of neuron projection development (GO:0010976)

epidermal growth factor receptor signaling pathway (GO:0007173)

toll­like receptor TLR1:TLR2 signaling pathway (GO:0038123)

positive regulation of cytosolic calcium ion concentration (GO:0007204)

toll­like receptor 2 signaling pathway (GO:0034134)

positive regulation of peptidyl­tyrosine phosphorylation (GO:0050731)

toll­like receptor 3 signaling pathway (GO:0034138)

activation of MAPKK activity (GO:0000186)

positive regulation of phosphatidylinositol 3­kinase signaling (GO:0014068)

positive regulation of MAP kinase activity (GO:0043406)

regulation of intracellular pH (GO:0051453)

positive regulation of calcineurin­NFAT signaling cascade (GO:0070886)

toll­like receptor 4 signaling pathway (GO:0034142)

MyD88­dependent toll­like receptor signaling pathway (GO:0002755)

toll­like receptor 5 signaling pathway (GO:0034146)

toll­like receptor signaling pathway (GO:0002224)

positive regulation of the force of heart contraction (GO:0098735)

calcium ion transport (GO:0006816)

regulation of Rho protein signal transduction (GO:0035023)

positive regulation of sodium ion export from cell (GO:1903278)

positive regulation of collateral sprouting (GO:0048672)

negative regulation of activated T cell proliferation (GO:0046007)

regulation of proteasomal ubiquitin­dependent protein catabolic process (GO:0032434)

spliceosomal tri­snRNP complex assembly (GO:0000244)

MyD88­independent toll­like receptor signaling pathway (GO:0002756)

positive regulation of cAMP biosynthetic process (GO:0030819)

activation of MAPK activity (GO:0000187)

stimulatory C­type lectin receptor signaling pathway (GO:0002223)

negative regulation of ERK1 and ERK2 cascade (GO:0070373)

transforming growth factor beta receptor complex assembly (GO:0007181)

activation of protein kinase A activity (GO:0034199)

negative regulation of neuron projection development (GO:0010977)

negative regulation of histone deacetylation (GO:0031064)

negative regulation of activation­induced cell death of T cells (GO:0070236)

activation of phospholipase C activity (GO:0007202)

regulation of skeletal muscle contraction by regulation of release of sequestered calcium ion (GO:0014809)

positive regulation of protein autophosphorylation (GO:0031954)

signal transduction involved in DNA damage checkpoint (GO:0072422)

regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion (GO:0010881)

positive regulation of peptidyl­serine phosphorylation (GO:0033138)

negative regulation of BMP signaling pathway (GO:0030514)

T cell receptor signaling pathway (GO:0050852)

positive regulation of glucocorticoid receptor signaling pathway (GO:2000324)

Supplementary Figure 4: Top 10 most frequent biological processes from Gene Ontology found in the network hubswhen comparing FuNeL and PCC co-expression networks.

10

C1 C2 C3 C4

SE(C1)

SE(C2)

SE(C3)

SE(C4)

SN(C1)

SN(C2)

SN(C3)

SN(C4)

stimulatory C­type lectin receptor signaling pathway (GO:0002223)

TRIF­dependent toll­like receptor signaling pathway (GO:0035666)

positive regulation of neuron projection development (GO:0010976)

epidermal growth factor receptor signaling pathway (GO:0007173)

toll­like receptor TLR1:TLR2 signaling pathway (GO:0038123)

positive regulation of cytosolic calcium ion concentration (GO:0007204)

toll­like receptor 2 signaling pathway (GO:0034134)

toll­like receptor 3 signaling pathway (GO:0034138)

activation of MAPKK activity (GO:0000186)

protein polyubiquitination (GO:0000209)

sodium ion export from cell (GO:0036376)

positive regulation of ubiquitin­protein ligase activity involved in regulation of mitotic cell cycle transition (GO:0051437)

DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest (GO:0006977)

toll­like receptor 4 signaling pathway (GO:0034142)

negative regulation of telomere single strand break repair (GO:1903824)

toll­like receptor 5 signaling pathway (GO:0034146)

toll­like receptor 9 signaling pathway (GO:0034162)

calcium ion transport (GO:0006816)

anaphase­promoting complex­dependent proteasomal ubiquitin­dependent protein catabolic process (GO:0031145)

activation of protein kinase activity (GO:0032147)

positive regulation of cAMP biosynthetic process (GO:0030819)

positive regulation of sodium ion export from cell (GO:1903278)

protein K11­linked ubiquitination (GO:0070979)

positive regulation of activin receptor signaling pathway (GO:0032927)

positive regulation of adenylate cyclase activity involved in G­protein coupled receptor signaling pathway (GO:0010579)

cellular iron ion homeostasis (GO:0006879)

MyD88­independent toll­like receptor signaling pathway (GO:0002756)

MyD88­dependent toll­like receptor signaling pathway (GO:0002755)

activation of MAPK activity (GO:0000187)

regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion (GO:0010881)

negative regulation of ERK1 and ERK2 cascade (GO:0070373)

transforming growth factor beta receptor complex assembly (GO:0007181)

activation of protein kinase A activity (GO:0034199)

negative regulation of neuron projection development (GO:0010977)

positive regulation of ERK1 and ERK2 cascade (GO:0070374)

positive regulation of peptidyl­tyrosine phosphorylation (GO:0050731)

negative regulation of histone deacetylation (GO:0031064)

negative regulation of activation­induced cell death of T cells (GO:0070236)

activation of phospholipase C activity (GO:0007202)

regulation of skeletal muscle contraction by regulation of release of sequestered calcium ion (GO:0014809)

positive regulation of protein autophosphorylation (GO:0031954)

signal transduction involved in DNA damage checkpoint (GO:0072422)

toll­like receptor TLR6:TLR2 signaling pathway (GO:0038124)

positive regulation of peptidyl­serine phosphorylation (GO:0033138)

negative regulation of BMP signaling pathway (GO:0030514)

T cell receptor signaling pathway (GO:0050852)

positive regulation of glucocorticoid receptor signaling pathway (GO:2000324)

Supplementary Figure 5: Top 10 most frequent biological processes from Gene Ontology found in the network hubswhen comparing FuNeL and ARACNE co-expression networks.

11

C1 C2 C3 C4

SE(C1)

SE(C2)

SE(C3)

SE(C4)

SN(C1)

SN(C2)

SN(C3)

SN(C4)

stimulatory C­type lectin receptor signaling pathway (GO:0002223)

positive regulation of ERK1 and ERK2 cascade (GO:0070374)

TRIF­dependent toll­like receptor signaling pathway (GO:0035666)

toll­like receptor TLR6:TLR2 signaling pathway (GO:0038124)

epidermal growth factor receptor signaling pathway (GO:0007173)

toll­like receptor TLR1:TLR2 signaling pathway (GO:0038123)

positive regulation of cytosolic calcium ion concentration (GO:0007204)

toll­like receptor 2 signaling pathway (GO:0034134)

toll­like receptor 3 signaling pathway (GO:0034138)

positive regulation of peptidyl­serine phosphorylation (GO:0033138)

activation of MAPKK activity (GO:0000186)

positive regulation of cysteine­type endopeptidase activity involved in apoptotic signaling pathway (GO:2001269)

positive regulation of dendrite extension (GO:1903861)

regulation of intracellular pH (GO:0051453)

positive regulation of adenylate cyclase activity involved in G­protein coupled receptor signaling pathway (GO:0010579)

positive regulation of protein ubiquitination (GO:0031398)

MyD88­dependent toll­like receptor signaling pathway (GO:0002755)

toll­like receptor 5 signaling pathway (GO:0034146)

toll­like receptor signaling pathway (GO:0002224)

toll­like receptor 9 signaling pathway (GO:0034162)

positive regulation of peptidyl­threonine phosphorylation (GO:0010800)

calcium ion transport (GO:0006816)

positive regulation of cAMP biosynthetic process (GO:0030819)

negative regulation of activated T cell proliferation (GO:0046007)

positive regulation of histone acetylation (GO:0035066)

toll­like receptor 4 signaling pathway (GO:0034142)

positive regulation of proteasomal ubiquitin­dependent protein catabolic process (GO:0032436)

MyD88­independent toll­like receptor signaling pathway (GO:0002756)

activation of JUN kinase activity (GO:0007257)

activation of MAPK activity (GO:0000187)

negative regulation of ERK1 and ERK2 cascade (GO:0070373)

transforming growth factor beta receptor complex assembly (GO:0007181)

activation of protein kinase A activity (GO:0034199)

positive regulation of neuron projection development (GO:0010976)

negative regulation of neuron projection development (GO:0010977)

positive regulation of peptidyl­tyrosine phosphorylation (GO:0050731)

negative regulation of histone deacetylation (GO:0031064)

negative regulation of activation­induced cell death of T cells (GO:0070236)

activation of phospholipase C activity (GO:0007202)

regulation of skeletal muscle contraction by regulation of release of sequestered calcium ion (GO:0014809)

positive regulation of protein autophosphorylation (GO:0031954)

signal transduction involved in DNA damage checkpoint (GO:0072422)

regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion (GO:0010881)

negative regulation of BMP signaling pathway (GO:0030514)

T cell receptor signaling pathway (GO:0050852)

positive regulation of glucocorticoid receptor signaling pathway (GO:2000324)

Supplementary Figure 6: Top 10 most frequent biological processes from Gene Ontology found in the network hubswhen comparing FuNeL and MIC co-expression networks.

12

6.3 Validation on independent dataset

In this section we report additional informations about the analysis performed using data from the independentprostate cancer study (Taylor et al., 2010) available in the cBioPortal for Cancer Genomics (Cerami et al., 2012).In particular we report the full list of alterations for the topologically important genes analysed in the mainarticle. The Supplementary Figure 7–14 show the percentage of altered tumour samples for top 10 hubs (nodeswith highest degree) and top 10 central nodes (with highest betweenness centrality) in the best performingnetworks according to the gene-disease association analysis (using the information from the curated databases).The selected networks are C2 for FuNeL, SN(C3) for PCC, SE(C4) for ARACNE and SE(C2) for MIC. Forall of them we report the alterations for both hubs and central nodes.

PTGDS 45%

PAGE4 28%

LMO3 11%

GSTM2 52%

NELL2 9%

COL4A6 65%

MAF 24%

ABL1 11%

RBP1 20%

PARM1 53%

Genetic Alteration Deep Deletion Missense Mutation mRNA Upregulation mRNA Downregulation

Supplementary Figure 7: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in C2 network are shown.

PTGDS 45%

PAGE4 28%

GSTM2 52%

LMO3 11%

MAF 24%

NELL2 9%

ABL1 11%

COL4A6 65%

PARM1 53%

MYH11 58%

Genetic Alteration Deep Deletion Missense Mutation mRNA Upregulation mRNA Downregulation

Supplementary Figure 8: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in C2 network are shown.

13

PCBP3 5%

HAUS5 19%

PSG1 16%

LGALS9 9%

KAT7 5%

POU4F1 1%

SLC9A1 44%

PAX8 4%

ODF1 20%

BCR 19%

Genetic Alteration Amplification Deep Deletion mRNA Upregulation mRNA Downregulation

Supplementary Figure 9: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in PCC SN(C3) network are shown.

SBNO2 22%

CADM4 25%

WBSCR22 39%

FOXE1 11%

KAT7 5%

POU4F1 1%

PSG1 16%

PCBP3 5%

DHRS1 4%

HAUS5 19%

Genetic Alteration Amplification mRNA Upregulation mRNA Downregulation

Supplementary Figure 10: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in PCC SN(C3) network are shown.

IL10RA 16%

THRA 33%

ALAS2 4%

SAMD14 1%

RHBDL1 8%

PSG1 16%

DDX11 14%

GNAS 13%

BTF3 13%

KIFC3 5%

Genetic Alteration Amplification mRNA Upregulation mRNA Downregulation

Supplementary Figure 11: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in ARACNE SE(C4) network are shown.

14

IL10RA 16%

THRA 33%

PSG1 16%

SAMD14 1%

BTF3 13%

GNAS 13%

ALAS2 4%

RHBDL1 8%

MYBPC3 4%

DDX11 14%

Genetic Alteration Amplification mRNA Upregulation mRNA Downregulation

Supplementary Figure 12: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in ARACNE SE(C4) network are shown.

WBSCR22 39%

ADAM15 12%

GP2 7%

COPS6 14%

RIMS2 16%

ATP1B2 18%

CEP170B 21%

POU4F1 1%

HAUS5 19%

PCBP3 5%

Genetic Alteration Amplification Deep Deletion mRNA Upregulation mRNA Downregulation

Supplementary Figure 13: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in MIC SE(C2) network are shown.

ADAM15 12%

WBSCR22 39%

COPS6 14%

CEP170B 21%

CPNE6 34%

GP2 7%

ATP1B2 18%

EFNA3 9%

UBE3B 13%

PAX8 4%

Genetic Alteration Amplification Deep Deletion mRNA Upregulation mRNA Downregulation

Supplementary Figure 14: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in MIC SE(C2) network are shown.

15

7 Visualisation of the co-prediction and co-expression networks

In this section we include the layouts of co-prediction and co-expression networks for three of the datasets usedin the main article: Prostate (used in the case study) and Lung-Michigan (small networks). Only a few examplesare shown, for which the differences in the topology of the networks generated with the two approaches is themost visible. The networks were visualised using the Organic Layout in Cytoscape (Shannon et al., 2003).

Prostate: FuNeL – C1 Prostate: FuNeL – C3

HIST1H2BM

CXorf40B

H1F0

NAB1

NOLC1KDELR1

NR2F6

GPNMB

GUSB IGL@

PHLDA2

EPS8

IL18

SPRY1 CDKN1C

RBPMS

HYAL2

NCF4

MAPKAPK5

NDUFB8ARHGAP22

NPYACSL1

PEBP1

TPT1

OCRL

H2AFY

TSPYL2

CR1

DGKZHLA-DRA

NAGLU

BRE

PRYDNAJB5

ETV5

MATN2

TNFSF10LEPROT

CPMIFI30

HLA-DOA

GPR125

SERPINE1

SOX9

BIK

SFRS2B

PARP1

BOLA2

IL6ST

IGH@

ATP5A1

TOM1L2

TSTA3

MAOB

HLA-DQB1

OLIG2

DIO2

TBXA2R

CRYAA

CUL7

ITGA3

EXTL2

ATP1B1

LY6D

CRABP2

PRKAR1B

ERCC1

SKIV2L2

FMO3

FAM107A

MDM4

LY96

TARP

NECAB3

MDN1

CTNNA1

RIF1

SLC19A1

HIST1H2AL

NCK1CSNK1D

DHX8

TRIP6SC65

ARPC5

RABGGTA

FOLH1PPP1R2

MID2

PPP5C

RRAS

RPS11ANXA6

SHMT2

HBA1

PLCL2

MMP3

MYO6

CTAGE5KIAA1279

SMARCA4KRT17

APOE

VEGFAPHF21A

ARL4DYWHAZRNPS1

BCKDHB

NELL2

SSBP2KIAA1109

ALDH2

TP63

MIPEP

SWAP70

VAT1

LSM12TSPAN1

DEFB1ZZEF1

PTPN13

CD53

RBBP4

SLC25A40

COL6A2INPP5DATP1B3

SPOCK1

CASP2

FECH PRKAR1A

PMP22

DSPPRIM2HMGB2

BLK

FKBP9SMARCE1

COL1A2

GPRC5A

CAPZB

HPCA

SRGAP3

SLC9A7 HBBELF3

MCL1

NEFM

SHROOM2

COL7A1

ANPEP

PARVB

SLC7A8MPRIPUCNAHCYL1

SNTB2GATA2

GNMT

VEZF1

INPP1

SPRR1A

LILRB4

LIMS1

NSFL1C

PSCAIRF3

TFPI ATP9AXBP1

TTBK2

MAP2

IFI44IGHD

TBC1D4PPBP

KRT13

OBSL1C6orf145

DBI

NRP1

ITGA6

CLSTN1HOMER2RUNX1T1

MT1B

NAMPT

RIN2 BECN1HLA-JPIK3R2ABI1ALDH4A1

EIF2S3KRAS

CREB3L1

HNRNPH1

S1PR1DOC2B

CDC2L2

NUCB1GORASP2

ADK

LOC100134331

CCND2MEIS2 BMI1

ZNF37AWSB2

RRM1

TRMT1

TMEM106CAMFR

TFF1

NBL1

TMSB4X

FMODCALM1

GALR3

RBP1GNB2L1P4HB

RCAN1

TMEM47

HLA-ASKIL

ME1HOXC6

C6orf54MAPKAPK3

LGALS1

CXCL12PAGE4

SNCGFEM1C

NDUFV2BCL2 CANXATP6V0A2RAP1GAP

FUBP3ARHGAP1

UGP2

ARG2

AOX1SGCD

KRT10 LDOC1

SMAD5CALD1

KRT15

RBBP6IQCK

SEPT10NKX2-5

VCAN

FBLN1

RAB27A

RGS10

HLA-DRB4

DNASE1L3

GRK6MDC1MAF PSG11

SERBP1 FN1

C7 PTGDS

ITGB3BP

TXNIP

PRKRA

IGF1 PTPRC

CDS1COL13A1

GSTP1GPD1L

ATP6V0D1BMP2

TIMM17A

PKD2

LYZ

UBE3BTCP1HLA-DPA1

ZNF133

ANXA4

KRT19CBR4

ERBB3 COX5A

PTH

THOP1

PWP2

CFDALDH9A1

LDLRMTHFD2

NCAPD3

TSPAN31

IL10RB

DFFA

GCLC

ARHGEF18

TARDBP

CDC42BPAEMP2

GPX1

TACSTD2

BGN

RHCE

PLXNB2

TMSB15A

ACYP1

CCL14

DOPEY2

FKBP4

MT1E

TRAF4

PTPRG

NXPH4

FAM114A1

ABP1

ATF5

NPHP4

LCP1TRO

DPTLDHA

CIT

STK17A

P2RX7

DUSP2

PPPDE2

MAP3K8

BRD3RNASE4

SCN2BUQCRQ

SMR3B

TBC1D2BALDH1B1

C1QBPDEAF1

GGA3

CASP10

POU2F3ATP6V1E1

RABGAP1BTG2

MT3

SPINK4

ULK1

BUD31CHKB

LTBP4

FBXL2SIVA1

AHNAK

NCAPD2PDGFRL

PYGB

EFEMP1TAPBPTK2 RAP1B

SLC6A15DLGAP2

DPYSFAM91A2

LRP10XRCC6

ILK

AGR2

NFKBIA

STC1

FES

COX7C

PPAP2B

SFRS16

SLC25A6

CFDP1MYL9

ROR2

MAPK6

CCND1POP5

FADD

ZP3KIAA0495

TLX1

ID1 LITAF

DUT RAB1A

ROCK1BTF3 KIAA0182

FOXO1

CSRP2

HNRNPDTOM1

KIAA1012

CUL5

ARHGAP5

YWHAQ

HADH

CYP3A5KRT5

SNRPE

SPON1

PFDN6NAP1L1RMND5A

UBE2B

STAT5B

ANXA3RALYL

VGLL4

CARD8

PDGFRA

SLC15A2

ALCAM

SEPT8

DHX15DPM1

PCBP2

IL1R1

USP9XRPS3

MRPS27

PTRF

DUSP1

FBXL7

DST

ACTG1

RYBP

GRAP2 DNMT1TSPAN7MYOF PAX5

SYPL1 SERPINF1LAMA2

NACA

GSTA4 AP1S2

EIF4EBP2DUSP5 CLCA1

BNIP3LUSP10

ACSM3

PCNA SAT1

LTF

ARFGAP3

DUSP3

GSTA2

FZD6

TBX6

PPIL2 ERCC2

MAP7

ATP2C1

KCNMA1

HSP90AA1RUNX2

INSR

NFATC4RNF41

LDHBMARK3

GRK5

GUCY2FNELL1

COL16A1POLR2DOVOL1

ITGA4

RPS16

RPL28

TRAK1COPB2

STOML2

GPD1

MGST3

MARCKSL1

BST2

PCGF1

SPINK1PLXDC1

KIAA0040

PRDM2

COPS2CPA3

CAPG

GPR39

USP13

TYMS

CALCOCO2

RPSA

HEPH

GSTM3

EIF2S1

ALDH3A2

DKFZP564O0823

IGFBP2

NCKAP1

BACH1

TM9SF2

PITX1

HDDC2

CCK

PALM

ALDOA

PRKDC

ABCA8

CELA3BRAB3GAP2

FXYD1POLD3ASCL1

TOB1

SERTAD2

EMP3 CEBPDASH2L

PIBF1GUCY1A3

RPS6KA5TNFRSF21

TBPL1 STACSEMA6C

TGFB3

ACAT2SMAD9

GNA15

TFIP11

LGALS2PLP2

ZNF410

PPM1H

PAMFADS1GPRASP1

TOR1ARAB11FIP3EPX WSB1

KDM5A

KCNB1

SIK1PNN

IQGAP2

ACTB

GSTM2

EPCAMCES1

INPP4A

HRSP12WIF1

BTG1

RPL10

WIPF1

RRAGA

SALL2RGS2

DHX29

NUP205

PGCP

ROCK2

N4BP2L2COL11A2

KANK1

VIM

ARFRP1NFIBIL11RA

EIF1B

SCGB1A1TFAP2A

CHM

ARL2BPCASKPCMT1

WNT5A

G3BP2

FAM65B

KCNAB1

ZNF268

HAUS3MEG3

MAN1A1

LRIG1

ATP6V1H

CCL25

SH3YL1

H1FX

FBP1

GTF2BZMPSTE24

FKBP2MPHOSPH9

PDXK

FKBP5

MANBA

MT1X

LYN

BMPR1A

BAG2

ACAA1

PTPN4

IPO5

HSPB11

BTN3A1

RPL8

LPCAT1TRBC1PDE3B

SET

PTPRCAP TOR1AIP1

SMPDL3A

MAL

G0S2

F13A1

GDF15

KCNS1

ASTN1

PDGFA

MAPK8SPOP

AUTS2

C5orf13

CSF2RB

MGAT5AHSA1

FAS

FUSIP1

ANKRD1HCFC1

EEF2UAP1PTTG1IP

XCL1

CAND1

RANGAP1

MMP7NFYB

SYNJ2HBEGFSETD4

SAP18

MYL1

GAPDH

MRC2

R3HCC1

LCK

HNRPDL

CKS1B

RPL19

BCAR3

COL4A3

GCHFRNME2 ABCC1

PGA3

BMP2KTRPV6

BBXPCLO PREP

SMURF1STX7

TAX1BP3

BCL6

PIM1

AZI1DENND3

ULK2

MAP4

ESD

OAT

C10orf116

R3HDM2

NFKB1

TGFBR3FAAH

PRSS23

PAH

RRAGB

LPGAT1

STMN2

ACTR2

STAT1

RPL23

SPTB

IRF1

DLG5

SLC35A1

HSPD1

RPL18A

NR3C2 LNPEP

CAV1

IARS2PLCH2

ATP5C1

MLH3

ANXA2P3ANKLE2

SMAD1

USP6

CDR1

CETN2

PPY

PIK3R3

CDC5L

HIST1H4C

CD38

TTRAP

CRABP1

RNF103SERPINB10

GCDH

BAT2D1

PDIA5

GMFB

VHL

UTP18ETHE1

SLC7A1

PCNX

CXADR

LTBP1

ATP6AP1

RAB11A

CCT5

AZGP1

SIM2

S100A4 DDHD2TAF2

LAMC1

KIAA0247

LPIN1

TNPO1

SPCS2HSD11B1

C19orf2MPPED2

SORL1ZHX2

GPM6B

FLAD1EHMT2

NTRK3GUCY2D

MT1H

FCN1

GOSR2

TMEM63A

ACO1PKIGTMED10

LGALS3BP

SAG

KLK2AMD1

NEFH

MAST4

CLEC3B

ATP6AP2RSU1PTMA

ACTG2XRCC5

PAK1

ADD3

IFT20

CLU

HLA-DPB1

IGFBP5

SPRY2

PEX3

TMEM87AMCM5PLAGL1

TPST1MAN2B2

KLHL21CBX5

RABEPK

CRELD1NBPF15 NR4A2

ID4

MT1GSATB1

NDUFS2TMX4

MTMR15RP11-345P4.4

RPL39L

ISCU

SH3GLB1

EEF1G

PTPRM

TUBB2A

C16orf45

RPLP0

LUM

NCBP2

TBL3

MTA1

FBXO9

RAC1

FADS2

PRKRIR

LMOD1

RPL9

LTB

MCM6CPE

PDLIM1

ZNF652

WNK1

CPD

DUSP6

KRT18

LMO3

TMEM123CYR61

KLK11CETN3HOXB13

BCAM

IFNG

COX7A1

PPT2CMAH

HSF2

SNX7

MGMTCTSE

CEACAM1

STAT3

ATP5F1TPBG

SORD

C4orf46

ALKBH1

ARID5B

CD81

RHOBTB1

YTHDC1

EPB41L3

CYP4F2

MTIF2

ZNF646

DNAJB6

NR4A3NDRG1

ALG13

HES2

CDT1RGS1

C6orf130

SMG5

CLDN3

AIM1

EFS

GALNT1

DHCR24

BRD2MAPK10

ABCC4

KIAA0907JAG2

IL1B

ALB

CRYM

PPP1R3C

SDC1HEBP2

MMETRA2AAQP3RFK

RND3

NAB2

ITGA1

SLC14A1PRKAR2B

SERPINB5

PPP2R1A

EEF1A2

DYNC1H1

CCNT2

GGCT CCNOCYBAPROSC

RTN4

ARHGEF1

RBPJ

GAS1

TPM1TSC22D3ZFP36L2

PRKCIMETTL7A

CADM1MSH6

WDR1

STAT6SH3BGRL

PKMYT1

EEF1A1SON

COL4A6

ANGPT1

F5

SFRS6

MYH11PFDN5

SFRP1

VAMP1CCR1

UBB

ATP1A1

NCR1

RAB11A

ACAA1

CCL14

INPP5D

DPM1

HEBP2

G0S2

RNASE4

PWP2 CPD

SPTB

BUD31

PALM

TMEM106C

NCF4

SFRS2B

ROR2

LNPEP

CYBA

FUSIP1

CDKN1CLITAF

RAB27A

TP63

NPY

GCLC R3HCC1

ATP5C1

TPBG

MAPK10

TFIP11

LSM12

PGCP

CANX

KRT17

STAT6

CCL25

INPP4A

NR2F6

RAB3GAP2

PCNA

TFPI

TOM1

LAMA2

EPB41L3

KDM5ARPSA

STK17A

PAMRTN4

P4HB

BTN3A1

NCBP2

SC65

SPRY1HNRNPH1

UCNTRA2A

RHCE

LYZ

ARFRP1

PPP2R1A

TOM1L2

RRAS

NR3C2

FUBP3

GUCY1A3

HIST1H2BMBMI1

HMGB2

CHM

SERBP1

ZNF268MGMT

ISCU

HOMER2FAM107A

LCP1

ATP6V1E1

TBC1D4

PRIM2

PPT2

KLHL21WNK1

SLC7A8

KDELR1SEPT8

HIST1H2AL

BRD3

KLK2 ETV5

PRKDC

CLU COL13A1

FKBP4

LMO3

ABCC4

SRGAP3NEFH

GGCT

SLC35A1

KRT18

ARL4D

IRF1

GPX1

ALBMME

GNB2L1

KCNMA1

ABP1

MT1E

FKBP9

STAT1

PKMYT1

GAPDH

SORD

DSP

SMG5

GALNT1

GUSB

IFI30

MID2

PARVB

EEF1A2PPM1H

PLAGL1

ATP6AP1

SOX9

CXADR

RPS6KA5

SATB1

ALCAM

HLA-J

TMSB15A

SNRPE

SGCD MT1H

IFI44

CPE

GDF15

HOXB13SPON1

SH3GLB1 ACO1

HNRPDL

PRYCIT

IQCK

XRCC6

BIK

BTG2

CLEC3B

TSPAN1

VEZF1 FOLH1 TPST1

AIM1

PPP5C

XBP1

SERTAD2

C6orf130

LTF

LY6D

TROSALL2

TARDBPPROSC

SPRR1A COPS2

AMD1

ACSL1

SHROOM2

ALKBH1

ITGA1TSPAN31

PPBP

PTPRGMIPEPCDS1

BTG1

HLA-DPA1TM9SF2 DEFB1

CASK

EFS

NDRG1

CDC2L2

CXCL12

IFNG

GSTA4

HES2

UQCRQ

GCHFR

CMAH

ACSM3

DUSP6

CHKB

MYH11TTBK2

HBB

NEFM

MAPKAPK3

PAGE4

ITGA4CLCA1

VCAN

ATP5F1

KIAA0040

SPINK4

PTGDS

SH3BGRLRND3

MARK3

MMP7

R3HDM2

DPT

KRT10

PRKAR1ASPOCK1 NCAPD3BCL2GSTP1

COL4A6

MT1G

NXPH4

TOR1AAP1S2

TMED10

PLXNB2

GPM6B

HCFC1

MDC1

NDUFV2

PCMT1TNFSF10

DLGAP2

LGALS3BP

GRK6CCNOPCBP2

RPLP0

SCN2B

CXorf40B

DST

NAB1

TMEM47MAP2

HDDC2TACSTD2

EEF1A1

CFDCAPZB

ZNF652

ATP9AC7 MCM5

CSF2RBPEX3

RSU1RUNX1T1

BMP2

FAAHPPY

YTHDC1C10orf116

CLSTN1

POP5ANGPT1

RHOBTB1

DHCR24

MMP3

IGL@

AZGP1

PLCH2

RAB11FIP3

CBR4

NTRK3

MTIF2

POLD3GPD1

PKD2

GATA2COL6A2ADD3

EXTL2

KCNS1

STMN2

THOP1ATP6V0D1

RP11-345P4.4

SLC25A40

AHNAK

ME1

NAMPT

NELL2PRKAR2BULK1

RRAGA

SFRS6TXNIP

MEIS2

CALCOCO2

RBBP4IL10RB

BLK

PIBF1UBB

GSTM3

TMX4

HLA-A

SDC1IPO5

GRK5

RNPS1

CCT5

UBE3B

MATN2

IGF1

CCND2

ASH2L

ASCL1

SWAP70

RABGAP1

MEG3RBP1

TRAK1CYR61

RIF1

LRP10

FAM114A1

DNASE1L3

LDOC1

DHX8

CTAGE5

COL11A2

SKIV2L2

CCR1

MRC2

PKIG

MT1B

ITGB3BP

ATP1B1S100A4

LY96

MRPS27

RAP1GAP

SON

ERBB3

RPS3

INPP1

PDGFRL

ACYP1

PLP2 BGN

HSP90AA1

SFRP1

RAB1A

PTH

MAN1A1

ID4

VGLL4

LGALS2

GPRC5ATLX1

S1PR1 CAPG

CDC5L

LOC100134331

PSCA

MGAT5

SCGB1A1

AQP3

NKX2-5

SPOPFBXL2

DBI

SNTB2ABCA8

TSTA3

OBSL1

HYAL2

BAG2BMP2K

MANBA

GALR3 RCAN1

ZP3FN1

WIF1

TRIP6

TMSB4X

NBL1

TBXA2R

DKFZP564O0823

CDC42BPASMAD5

FADD

HSD11B1

AOX1

EPS8

NAB2

CD38

SERPINE1

CPA3

CTSE

LYN

GUCY2FANKRD1 SNX7 TAF2

LEPROT

PHF21AMTA1 DUSP3

H1F0TGFBR3

CASP2

SORL1 ESD

TBX6

CUL5CBX5

LDHAPPIL2

GOSR2

ARHGAP1

CTNNA1 RFKIGFBP5 PIM1LTBP1

IFT20 ARHGAP5

CALD1TFAP2A

VAT1

OVOL1VIM

RPL23

SIVA1RPL39L

GPR39HBEGF

COL1A2FBXO9EHMT2 POLR2D

ZNF410

RANGAP1

PEBP1MCL1

RRAGBRMND5A

HOXC6

GPRASP1TK2

OCRL

TAX1BP3

AUTS2

ZNF133 SAGSH3YL1

PFDN6

COX7CAHCYL1

CLDN3

CARD8PAK1

RPL28SYPL1

LRIG1TPT1

ITGA6

KIAA1279

SSBP2DUSP1

LTBP4 ACTG1ADK

CRELD1 PFDN5

EPX

MAPK6

NFKBIAZFP36L2

FZD6

HRSP12

NBPF15 ANXA4

NDUFB8 USP10

SFRS16

DYNC1H1 HPCA

NFATC4SAP18

DNAJB5C1QBP

HADH

PAH

MPPED2

IRF3

MAP7

GPR125

G3BP2

IGH@

TRPV6

MPRIP

ALDH3A2

NUCB1

PITX1

HLA-DOA

MTMR15

APOECADM1

SMURF1

DIO2

ARID5B

LDHBKIAA0182

BST2

FKBP5

LPGAT1

N4BP2L2 POU2F3IGFBP2

COL4A3

DGKZFOXO1

RPL10

ALDH1B1

CCKTNPO1

MAP4

IGHD

AZI1

USP9X

HEPHTSPAN7DENND3

C6orf145

MAN2B2CDT1

DOC2BSYNJ2

LIMS1SLC14A1

WDR1

FAM65B

HLA-DRB4

KIAA1012PAX5

TGFB3

PTPRC

CR1

CETN3

YWHAZ

WSB1

ACTG2

SHMT2

HBA1

BCL6

PLXDC1

ROCK1

PDGFRA

TTRAP

IL6ST

NDUFS2XRCC5

TBC1D2B

LILRB4STAC

CYP3A5

SLC19A1

ROCK2

ALDH9A1

NCKAP1

H1FX

SMAD9MYO6

PTPRMIQGAP2

RIN2

ATP1B3

SPCS2

CCND1

SAT1

RPS11 LMOD1

PLCL2

VAMP1

PRKRA

LAMC1

ATP6V0A2

ATP1A1FBXL7

PSG11

KIAA0907

ALDH4A1RGS1

PTMA MAST4CRYAA

ACTBTMEM87A

TSC22D3FADS1

TIMM17A

CPM

CREB3L1

BBXC16orf45

BRE

VEGFAEMP3

TYMS

MSH6 NFIBDLG5

ALG13

INSR

LUM

PRKAR1BNSFL1C

DHX29 DEAF1MDN1FMO3

SIK1

NPHP4

GPD1LMYOFMAF

SLC9A7

FBLN1

TNFRSF21RBBP6

IARS2

STAT3CRYM

IL1B

PARP1

WIPF1

USP13

BACH1LPIN1

ARHGEF1

GSTM2PIK3R2

MDM4 SMR3BPHLDA2GNMT

ATP6AP2

SETD4

MTHFD2

CSRP2

ACAT2TAPBP

SKIL

PIK3R3

EIF1B

RUNX2GAS1

CD53

HLA-DRA

SERPINF1

RABEPK

TRAF4

CALM1

HLA-DPB1

PNNKRT19

C6orf54

OAT

KRT13

JAG2

CYP4F2

RPS16RABGGTA

DDHD2 KRT15

PRKCI

KLK11

CRABP2

PDLIM1

PPP1R3C

NFKB1

EIF2S1MT3SERPINB5ID1

DNMT1

PDXK

DUT

KANK1

TPM1EIF2S3KIAA1109

SEPT10COL7A1 CEACAM1ZHX2STC1

Prostate: ARACNE – SN(C1) Prostate: ARACNE – SE(C3)

ESR1

FLT3LG

NQO1

SEMA7A

GCKCOX10

GJA1MAP2K4

SCAPERPPP3CA PI4KBSSTR5

SLC9A1LRCH4

ACANADRBK1

ASPHD1COBRA1

EFNA3

ECE2

IVL

PAK2

CECR7 CRCPCCKAR

FETUB

CD33

COL19A1

RAD51L3

HAP1

OCEL1

DDX51

PRKCB

HIST1H3E

LOC26102

KYNU

RPS18

TMSB4Y

KRT6A

PLEKHM1

RPL11

TGFB3

MR1IFIT1

GUCA1A

PRPF6

DHRS1

PAIP2B

LOC644165

CCL19

SLC24A1

ASGR2

FAM153A

MAX

CNTN1

PRPF31IQSEC1

LOC100134331

SPEF1

RPL36A LOC100131311

RBCK1

DNAJC7

RPS24EEF1D FAU CD34ECDELAVL3CCL22FUSFAM38ANACA

HHLA1

RPL23A

RPL17

psiTPTE22

RPS8

ABO

MAP3K14

SLC6A11SEMA3F

UTP3

ZNF239

RPL9KLK3 ISLR IGSF9BKLK2MCL1

NSFL1CSAP18NFKB1AGPS DPT

AANATNEUROG3MAPKAPK2

EXOSC2 MFNGUNC5B

MPP2MAD1L1

LLGL2

AATKRAP1GAP

GRM4

ITGA10

TOP2A

IL17RA

RAD54L2

SLC7A4

SPN

PSG7PRPF19

ARID3AASMTL

GPR35

SORBS3

C2orf55

TBXA2R

IFRD2

UGT2B15

MYO1C

GRINA

SLC4A8

NTNG1MYBPC3

SLC25A11

PAX8

HAUS5

WBSCR22

DUS4L

ANKRD46

CCKBR

SLC8A1RABGGTA

LOC201229

MYST1PDGFRA

BZRPL1

PRF1

MLANAZNF264

LY9

ARPC4

CRTC1

IGHG1

HTR7

POU4F1

NOC2L

GPR143

TNIK

PTGER3

CHST2

LTK

ABL1GRB2

HMX1

RHO

TNKS

LUZP1 XDHMC2R

RAPGEF5

COL1A1CASP2

OBSL1KRT86

TCOF1

EPOR

TERF1

JAK3ZBED4

MLH1

EZH2

OPN1SW

TNP1 DOK2

DIDO1IFT140

SLA

LAIR1PTGS1

C9

FGF2BCAM NDST1

COL6A2

SLC1A7ITGAD

BFSP2 KRT35

TRPC4APCAD

TRAFD1

SMOX

POL3SCDX2

FSHB

NTSR2

RPH3A

EXD2

ELAVL2

B4GALT3

ARSE

RGS9

CHST3ATP6V1B1

CARD10

LYPLA2

CORT

IL18RAP

GAD2

EPAS1

TCF7L2

CD52PPP3CC

SLC6A9PLCD1

ADAM15PAXIP1

MAG

KLRC2

RBM8AABCB1

MDC1

LSM4

KRT1

RAGE

HLA-DOB HOXC11

METTL3

TYMS

KLF1

NFRKBCYP2A6

MYEF2CPNE6

IGFBP5

MAP2K5

AGFG2

UBL3

RPRD2

DDIT3

TCAP

ITGA2B

PCGF1

ADAMTSL2

ANKLE2TM4SF5

CRYBA4

SLC20A2

PDLIM7EDIL3

USP19POU2F2

ATP1B2

NRTNPAX3

C16orf7

LOC729400DSG1

MLX

RECQL5BMI1CUBN

TUBB2B

PRPS1L1INSL3

TFF3ARMC6

ARFIP2

TIMP3 PPP1R10

PNMTREG1B

TCF3SRM

FANCASDCCAG1

B4GALT5

PIM1

KCTD20

LEFTY1

P2RX6

DGCR9

DKK4PITPNM1

MTHFR

F7

RAB40BPARD3

CTSG

PQBP1PRH1

NCOR2ATP2B2

PLIN1

RRP12DEFA3 MEF2C

MAGEA5KCTD17

FRAT2

BTF3HIPK2

MSCANK1

AXIN1

SEPT6

CYP3A4

USP11UBXN1

TAF5

FCGR2A

TULP3

GAGE12J

NFYB MGP

SLC6A7

E2F1

CD40

GALK1SAR1A

IGH@

CHKA

FURIN

PMF1

CELP

SLC35A2

GPR19

TROAPVCX3A

VAMP5

CDK2

EFNA2RNF167

ABCC8

TNK1

PTPRD

ARTN

ITGAM

FLOT1

TPMT

GTSE1

AMELX

CHIT1

CNP

STK25

AP2A2

PDE4A

GCKR

KCNQ1

NFATC2IP

CCNF

ERBB2

ABCB6

OTUB1FSCN2

FZR1

INPP4A

ACY1

JAM3

PPM1D

USP6

GIGYF2

ST3GAL1

AGAP8 HTR2A

NLE1

RDH16

SKI

GNAT1

MGAT3

MFAP2

TET3

CNNM2

SMARCD3

PTH

WNT7AUBE2N

MAP3K11

ARHGAP26

SLC12A4

TRIM58

SVEP1

LGALS9

TBX5

HDGF

SLC30A3

WWOX

USP9X

IL8RB

LRP3

MATN3

SIT1

TRIM16

RAB3A

PDE4C

PPAP2C

C5orf30

PCDH9

PIGC

GP2

ORC1LTPPP

UBE3BLOC552889

EPR1

SULT4A1

SERPIND1

KIR2DL4

BCL3

RIMS2

AQP7PCBP3

PIGR

ELN

OGG1

BRCA1

TAF1

ANGEL1

KRT33A

SDC3 RUNX1

ZNF592

TPOATF6B

MTX1

SIX3

ZBTB7A

MEN1

WNT10B

FGD1

ANG

PTAFR

MMACHC

TRIM22GTF2H3KIF22

ASGR1

CYP27A1SNX27

SAMD14

KLF11

APOBEC3C

AKAP3

LOC145678

CEACAM3

ADCYAP1ETV3

ZKSCAN3

MVKLGR5

GAGE2E

DEPDC5COPS6

LY6G6CSPTLC2

SLC5A2

SLC4A3

PAMR1

TUB

NRG2

UPK2

WNT11

ZNF500

KCNJ4

FLT1

FDXR

SMARCD2

TOP3BERBB3

JAG1UBOX5

CCDC69

ZNF157

CDC2L5EIF2AK2

KRT85

POLG

PHKG2CFDP1

NCKIPSD

EVPL

SCAMP5

CAPNS1AQP5

DGKA

DDX11

RASSF7

CYP3A5

NF1

FKBP15

UTRN

ACR

ACCN1

GYPB

BCRHNRNPL

FOSL1

ARID4B

KLHL18

HCRTR1TMPRSS6

SLURP1

PXDN

CRXST14CEACAM4

TERF2ETS1

FAM5C

NUDT3

DOK1C19orf50

TREX1 MAPK8IP1

IRF4

NBEAL2

KRT2

CBARA1

GUCA2A

HTR1B

HRAS

GPA33

DNAH9

SMYD5

RYR3

GRIK5CDK5R1

PAX9

SLC16A2

FAM102A

TRA2A

RFC1

CRYBB3

HAS1

KIF1C

SLC1A6CORO1A

PGM3

CHD8

TBC1D22A

RHBDL1 PSMB4

CYHR1

IRF5

IL10RAC14orf1

IGF1R

GPR3

LILRA4

CAPN5

SCARB1

MFHAS1

SPP2LOC91316

CBLN1

PCTK3

SOX12

HADHAGNL1

MRPL28

RELA

MAGEA4ABCA1

C1orf144

TENC1

RPL3L

NRG1

HTR4

LRIG2

HOXD13

IAH1

CPSF1

RAD52

RASGRP3

MAGEA9

PTOV1

DHX30

MYCL1

LY6E

LIPE

PFDN1

IGF2

MEGF6

GDI1

SBNO2

TOMM34

MAGEA1

RAB8A

SUPT6H

ERF

UCHL1

P2RX7

FADS1

RBBP8

POU6F2

CCR4

STK17A

ST20

EML3

STAT2

TSSK2

SLC16A3

FOXC2

HPGD

ALAS2

SIM2ANXA9

POLR2E

CYP11A1

RAB14

PPP4R1

DCBLD2

IL1RL1

CRHR2

STMN1

TFAP2A

NECAP1

PCDH7

MMP25

IKZF1

GYPESLC17A3

CSNK1A1

SFPQ

CCHCR1

MVD

VENTX

TAP2

TNIP1

YPEL1

KIAA0323

TTYH2

SERPINA4

DNAJC4

ADCY2

STXBP1

TGOLN2

SMG6

PCDH1

ITPR2

LALBA

DCT

PML

TIMP2

SCNN1D

PBX1

REM1

SMOU2AF2

TBX19

SFTPD

KCNH2

NCDN

TRIM3

FOXE1

OR2B6

ZC3H7B

DTX4

EXOSC7

ZNF783

KCNF1

SLC6A2

HSF4

FAM189A2

CSNK2A1

MAP3K3

ATP10B

SLC25A17

SERBP1TP53I11

BAIAP3NFATC3

CD22

GPRIN2FOSL2 GNB3

ACOT11

PTGES

EPHB2PRPHMLLT4

SLMO1

STATHHRK

HTRA2PHF21A

MYST2TYRGSTA2

CEP135

NFKB2 KRTAP26-1RPS6KB2

XRCC2IDS MLL

FAM189BST6GALNAC4

THRATNFRSF25

SPC25

PRRC1

SLC38A10

AIF1

AFAP1

SPRR2CTHBS3KIAA0284

HBBP1ALPPL2

LSM7

ODF1

GPR161

LPAR2

SCYL3 H6PDSPINT3

CACNB4BAHD1

E2F4

SPG20

ALDOC

EIF5BIGHMBP2

RPS26

LBP

MFAP3

PGAP2

FLNA

CNN1

TUBB3

MYL9

HSP90B1

RPS27PTPN9

RPS20CACNB1

ERP29

RPL37ACAMK2BP4HB

RPL10

RPSA

RPS17

RPL3

RPL29

RPLP0

RPS3

RPS14

RPL18

RPS19

RPL15

RPS5

GNB2L1

RPL19

SLC25A6

RPL28

EEF1G

KIF20BATG2A NTRK3SLC6A12

ACADVL

NDUFV1

CCDC85B

CDC37PPP2R5D

GPX4

POM121L9PGDF5

ENTPD2PTPRO

DDN

FGF3

GABRG3

CLEC11A

GFPT2

NRP2

MPDU1

IHH

RPS3A

RPL31RPS23

ARL3 MYO9BTUBB2A RPL24 RPS15A RPS21

TUBB2CRPS6

SEC16A CROCC

RER1

MYH8

NUP214ARVCF

ERGIC3

EIF4G1

KDM1

TUBGCP4 GARNL4PXN

ZBED1CSF1

ZNF593

STIP1

KIAA0999

LOC652147

CCL16

T

FANCL

NHP2

ALOX5

GPR6PDPN

CAMSAP1GRN

PSG1

GGT5CALR

ARHGEF6AHSG

HSP90AB1

DNAL4

ATP2A3

SF1

GIPRDGCR6L

MUC1

TCF25

DDX17SLC2A1

TUBGCP2

BTRC

MSX1

XPO6

DGCR11

RAB11FIP5

AMMECR1

OPRL1NCSTN

SLC1A4

IKBKG

RALYL

TGFB1SIX6

NDST2

MARCO

RPA2

SGCAIL13

GRIP2

FUT7

SRRT

ATP13A3

NOMO1

ATF7

ACVR1B

PRB1

DAPK2

OAS2

NAAA

CADM4

GRIN2A

NCR2

LMTK2

SLC18A3

ZSCAN12WDR62

GTF2ISMPDL3B

GNAS

VEGFC

RFC5

SSH1

INA

GHITM

CA11

NCR3 ZNF169

MCM2

SOCS3

DTNA

ZNF274

CDK6

GJB3

RBM4

MYB

MGAT1GJB5TNPO1

DUSP7

TLX2

CCBP2

CCPG1

DHCR7

AGPAT1

PVT1SRPX

TRIM15

SLC9A3R2

METTL1

AURKA

FAM50A

RUNX2

RP1-21O18.1

ITM2B

POLD2

PFKFB2

HIRIP3SSX1

SLC6A8

DAXX

BAT3PSMD9

ADH1C PLEC1

CHMP7

RREB1

FOXN2

PHLDA2

KIFC3

WNT1

TK2

DUSP14

XPNPEP2

IFI35

GPT

REL

ANKRD12

HCG9UBE2H

ADAM20

TIMP2

EFNA5

PBX1

HTR1E

ZNF239

MED21

GREM1

SH2B1U2AF2

B4GALT4

KRASCTSO

SCGB1D2

SIGLEC7

GPR182NRL

CD33SERHL2

ZNF264

USP24

ALB

WNT7A

MAGEA1

CRCP SPTLC2

TRIP13

KRT37

MRPS12

MR1

SOD3

GTSE1

CARD10

PRPF31

KIF1C

FOSL1SMYD5

AKR1C4PDE1A

HEG1

STX5

SLC9A3R2SNAP25

AGPAT1

CCBP2

TRIM15TK2

C19orf21

NUDT1

NUPR1

THBS1

DOM3Z

UBE2H

PDPN

SOX1ATXN10

CDC37

LITAF YBX1

PAX6

PTPRO

PCYT1B

HLA-DOB

NEBSERPINC1

FAM50A

METTL1SLC28A1

CCPG1

WDR45 RDH5

DTNA

ALPPL2

ST6GALNAC4FPR3

GPR176

PRPH

CD4

HAS1

PIP4K2A

UBTF

FANCG

CCL16

DNAH9

RIBC2

TAX1BP1

PTPRU

DHRS1

GNS

PGRMC2

PAXIP1

ARPC4

HAUS5PAX8

KRT1

DOK2

WNT2B

MASP2

ADAM15

POM121L9P

EZH1

MAP1B

RASL10A

NEU3

SCAMP1TBC1D5

LIN37

INPP4A

CRX

C1QL1

OPRL1

TYMS

MVK

VAT1

GGT5

DCT

PIK3R1

MLH1

ODF1

HIPK2

TRIM58

KIFC3

PPAT

SOX2

PMS2CL

NFKB2

PTPRS

ZNF592

SLC9A7

CR1

GPR162MAP3K11

STK25B4GALNT1

AP2A2

CHIT1

ACAN

ZP2

UBE2N

FZR1

EXTL1IQSEC1

SLC35A2

IFT140

SORBS3

PHF21A

AQP7

SLC30A3

WASF2

ABCD1

FLJ40113

TRIM22

TYR

LPCAT4

KIR2DL4

KATNB1

SLC16A3

HHLA1

TNKS

KLF1

RNASEH2B

PDXKSIM2

LLGL2

SPC25

LRRC15

HBBP1

LGALS9

F7

ANGEL1

CENPA

GHRHR

UBOX5

SLC12A4

NLE1

AGAP8

RAPGEF5

DNAJA3

HTR3A

ALG3

PSMD9

CRHR1ABCB1

GBA

LGR5

MDC1

WNT10B

LOC100134331

UBE2D1

GHITM

RAB40B

SLC22A6

PLCD1

NR0B2

IL8

COX6A2

ARL4CPDCL

RQCD1ST5

C16orf7

CTSG

TEFACCN1

IGSF9B

N4BP2L1 ISLR

CSNK2A1CEP68

AFAP1

COL21A1

MAX

OMG

NTSR2

CDK5

KHK

MOCS1

CASP2

TMSB4Y

TCF3

ASIP

LOC100129500

SYNGR4

SEMA7A

SEZ6L

RHO

DIDO1

C2orf55

TGFB3

psiTPTE22

AATK

AANAT

TEX28

BCL3

RRP12 SHC2PRPF6

CELSR2

TKT

RAC2

CACNA1B

EIF4G2

HOXD10

CPNE1

CNP

CYC1

SIT1

EZH2

MPZL1

LUZP1

DHPS

EPB41ST8SIA5

RAGE

NEUROD2BYSL

PAIP2B

NSFL1C

PCDH9

LOC157627

ITIH4

MAG

IGF1RFAM189B

MYL2

LDB2

ADORA1

RP5-1000E10.4

CYP2B7P1

RPL13P5

PGAM2

CSNK1G2

DCHS1

XPO6

HSPBP1

SEMA3A

STAU1

HMOX2

SLC25A11

CACNA1CBRCA1 GPR12EFNA3

TCF25

SAP18NPAT

RBMX2

HSPA4

FSCN2

NOC2L

PAK2

PQBP1

PRRC1

PCBP3

FAM89B

DOPEY1

ATF6B

TBR1

EAPP

FKBP6

CCR3IL18RAPHDGF

TUBGCP2NQO1

TPO

APOBEC3C

GTF2H3

GUCA1A

MAZ

NTSR1

CTAG1A

BCL2L11

C21orf33

CST6

DDX51

ITGA3

GRIN2C

MCM2

RBMS1

TRBV5-4

KCNQ1

DYRK1BS100A5

IGF2

GDF5

FGD1

IDS

TMPRSS6

IHH

NPPB

ACVRL1

FAM5B

UNC5B

IPCEF1

KIAA1024AQP8

RECQL5PTGS1ZNF263

IL1B

FOXN1

HNRNPL

KCNJ4

MYEF2

CPSF4

MFHAS1LECT2

LHX2

IGHMBP2

PRKDC

NRXN1CSRP3

SLITRK3

CRTC1

CYP2A6

AGPS

S100A4WBSCR22

LEFTY2

FNDC3A

DBN1

IFNA4NFRKB

PPP3CC

PRKCBRBM8A

WASF3

ZBED4

NTNG1

RANBP9

FAIM2

FAM119BLPAR2

REL

ATP2B2PLIN1

RHCE

MXD1

ACOT11

TRAFD1

FSHB

P2RX6

PPP1R8PITPNM1

FOSL2

LEFTY1MYT1

PLAU

GM2A

CD22

KCNJ6

PSD

CRYBA4

CHKB

TIMP3PTGES

ATRN

TFF1

ERBB2

XPNPEP1

HOXD13

PHF16

CHRNE

RPL3L

MYCL1

CRMP1

PHLDB1

RASGRP3

RPL6

RPL34

RPL24

RPL31

RPL27

AIF1

DYRK2

RPL41

RPL7

PTPRJ

RPL37A

RPS29

PCSK7

RDH16ZNF500

RPS20

RPS18TSSK2

DNAJC4 RPL32

RPS8

RPS27AINHA

RPS3A

RPL23

NPM1

DPP6

IGFALS

ZBTB17

ANXA9

DOLKTROVE2 CCT8

SNAPC4

CYP17A1

RPL11

MVD

TAP2IL23A

CACNA1G

RPL21

RPL13ARPS13SLC16A6

RPS25

RPS23

RPS6KB1

TJP1

LAIR1

CHMP1A

NCOR2

ABCC1

TBCD

PKP4

CCR7

PRUNE

TNIK

LY9RAPGEF1LALBA

ARL3

SNORA21

REM1PML

RPS15 RPL9 KIF21BIL24 SPRR2C

RPS17RPL19 STMN1

SCNN1DRPS16

RPSA

RPS11

LMO1RPS14RPL4

GARTRIMS2

RPL38

RPL12 UQCRB

COX6C

DNASE1L1TOMM20

CCDC94

TFF3

MYH8

FAM65B

POR

SCAPER

EPAS1

MC5R

RGS4FCN3

SLC13A2

TCF7L2

AGT

AMBP

FEM1B

TECR

ARHGAP1

OCEL1

GIP

PI4KB

SLC8A1

GRP

RBM38

IL17RA

CACYBP

TET3

PDE4D

LHCGR

RABGAP1L

NCAN

NACAD

PRAMEF12

LOC728643

LSM4

PDGFRA

GAGE2E

TAF12

DGCR11

TOP3B

RAD52

UPK2

NCL

ITGA10

EPORZKSCAN3

MMACHC

C19orf26

MAPKAPK2

ASCC2

RAD54L

TGDS

AHSG

DGCR6L

ARHGEF17

TTLL4

SLC24A1

ATP6V1B1

ARSESMAD9

CCKBR

CSF2RB

BTRC

SFRS1

LOC399904

GPR18

PCDHA3

CPT1B

ACSM3

COL6A2

B4GALT5ARMC6

CCDC6

HTR1B

SFT2D2

ZBTB40

CDKL5

NEUROG3

CYP27B1KRT35

LOC145678

NMB

MICAL2

PRM1

NFKB1

KIAA0892

VCX3A

MORF4L2

KPNA6SLC25A17POL3S

RUNX1

DDX6

MRFAP1L1

AKAP3

GATA4

C16orf88

PDE4A

PIM1

CCDC86

RPRD2KIAA0226

EFNB1

AKAP4

REG1B

RAB31

SLC22A18AS

C10orf12

SYNPO

SLC16A5

FST

MRPL28

GNLY

ANXA5

HPCAL1

PIGC

UCP2

VLDLR

MAPK3

SOS1

FANCA

SV2B

ZNF282

RSC1A1

CD3E

ISG20L2

GARNL4

ST3GAL5

PARD3

TSHB

KEAP1

ITGAD

SDCCAG1

ANKLE2

HAP1GAD2

CD52

SGTA

KLK3

CAPN5

GRAP2

GPT C19orf50

NCR2

SEMA3F

PSMB8

YPEL1

KDM4A

DCBLD2

NIACR2SMO

CYP1A2SMOX

C19orf6

PTPRB

PDE3A

SMAP1

SDF2

RPS19

GAMT

RPL18NDST2

ZNF157

CCT3

LY6E

RFXANK

LMTK2XCL1MSX1

RAB11FIP5FBLSC5DL

FCHO1

SCP2

ARFIP2PLLP

HERPUD1

EEF2

IL1RAP MLLT4

PRKD2

CSTF3

NKX2-2

ATPAF2

RPL3

ITGB6

ENC1

BAK1ARAF

FAM155B

TMEM8BFRAT2

KLK2

ELOVL5

CEP135

RPLP1

VENTX EIF3B

LIAS

JTB

RBM42

PHOX2B

KLHL18

ABHD14ATWF2

ARVCF

PMF1

DDX3X RHEB FAUATP5J2

TSPAN1

SBNO2

TTBK2NGF

MTHFR

PSG1

HPR

C18orf1

DLX2

TAF6L

FMO5

STAC

HSPA5PNMTRPL28

FAM153ARPL10SMARCD2

CYP27A1

RPL18A

RPS5EEF1A2

GDI1

PTPN21PLIN3PPP1R2

ADAM3ACDKN2B

CTSA

PFDN5

FLOT1

FOXD1

MET

TFDP2

PAX3

TM4SF5

UBL3

HRASUSP13

VPS8

EXOSC7

LRIG2

LTA

SLC30A9

ANKRD46

HTR4

TRIM3 RPS12RPL5RPL14

SRM

PPM1D

EPR1

APCVPS11

ST3GAL1

CRHR2KCNQ3

P2RX7

MUC1ZNF79

THOC1PAMR1

C1orf77

KNG1

USP22

POU6F1

NOX1

SDC2

CBFA2T3

DPM2

GYG1

PCGF1

MAP3K3TRIM38

PIK3R2

KRT7

IKBKG

PFDN1

ASGR1

MYO9B

SF1AKT3

C11orf52

AGFG2

PPP1R10

POLR2C

CBARA1

RPL15

KYNU

RPL29

LPHN2

BCL2L1

ETV3

RPS27

PTK7

KCNF1

NACA

ATP5H

FGF2PET112L

CIAPIN1NTRK1

TBX5 C11orf80TAGLN3

SLC25A16

APLP1OPN1SW

REG1A

RAB3GAP2

AMMECR1

TPT1

RPS4X

RPS6

RPS3

RPL36ALRPL10A

RPLP0

MBL1P1

NKX2-1

MELK

LRRC48

GLP1R

CAPN3

ATRX

XAF1

NFATC3

RB1

SLC4A8

CPA1

TUBB2B

SLC6A6

GCKR

ADRBK1

IGHG1 PLA2G6

KRT12

KCNH1

LDLR

GZMM

RGS9

PRF1

DPT

BZRPL1

DKK4

COL6A1

DEFA3

STX11PTGER3

DDX11

LOC652147

CHRNB2

TPMT

PTH

ABTB2

KPNA5

PSMD1

MUC8

UGT2B15

PIK3IP1

NUCB1TRPC4AP

DLEC1SPP2

ZSCAN12

RPA2

MAGED2 IL13

ADH1C

GDI2VAMP7

HLA-F

SRRT

COLQSMARCC2

TRADD

PDCD10LOC728849BCAP29

ATP13A3TCP1

LSM7FUT7 TUBB2A

EIF4A2

ARF1 SLC25A6 SRP9

TMEM87A

SLC1A4

CFDP1

PSME1

PRRG1

RABGGTBCEL

SRD5A1 C3orf32

RPS24

RPL27A

ATP5G3 GLT25D2

ICA1

ASGR2

KRT85

RPL36AGNB2L1

NHP2

SLC25A3

SNRPB

CLIC1NDUFV2FKBP8

RER1

EEF1G

EIF2AK2

PADI2

SAR1A

C1orf21

RPS15ARPL37

GCOM1KIFC1NME2 NME1PDCD6 HINT1CCL22 ELAVL3 KIF17 ACADS UQCRQHNRNPA2B1SIPA1

RPL30

RPS7

EIF3E

PTPN9

RPL23A

RPL35

RPS21

RPL17

RPS10CACNB1

EIF2B5CAMK2B

DRG2SLC22A24 COL4A1CTNS C14orf79 COL4A2TGM3 TNXB

CDH1

SORD

PTPRF

KRT18

ATP6AP1

FAM120A

GRN

PVALB

CBX6

TRAP1

PSMB4

GTF2I

AR

EIF4G1

ECH1

FCGR2A

PRSS23

KISS1

MAP2K3

C9

PCDH1

PSMD2

COPA

TNK1

SLC6A2

DLGAP4

POLG

TRIM28

MRPL12

UROS

RALY

ITGA2B

UQCRC1

ACOT2

SIRPB1MTFR1

NCSTN

LCK

CD63 DDX5

GRIP2

SKIV2LNASP

ASNA1

FAM168B

VAMP5

HMG20B

RFC5

WNT11

SFTPD

USP9X

MTERFD2

SRPX

AK1

TTC15

INPP5D

CD72

DOK1

SCYL3

PPYR1

FIG4

NUDT3

OR2B6

LARGE

TLX2

JAG1

RNF167

IRF4

RBP3DDIT3

GPR6

SIGMAR1

C1orf216

SLMO1

FAM102A

SLC2A1LILRA4

LAS1L

KCNMA1

EDIL3AP3M2

USP19

ADAM20

MAPK8IP1

C1orf105

LYPD3

DDX17

IL8RB

FLJ10038

SMPDL3B

MAST2

PSTPIP1

TCOF1

SPEF1

MARK2

CCKAR

DEXI

ABL1

HRC

ANG

RANBP2

IFRD2FNTB

SPN

CGREF1

NECAB2

DDN

OR5I1

FKBP15

GTF3A

C5orf30

MLLT1

STX16

CSF1

PXN HTR6

EMD

FEZ1

CNNM2

CNOT4CYP2D6PFKM

OSGIN2

KRT86

STX1A

PIGA

HSF1KIF1A

TEAD3

UBL4A

ABCC8

PIK3CD

ITGAM

ARTN

B3GAT2

LOR

ARHGAP4

TRIM66

NRCAM

ACY1

E2F5

IL5RACLCN5

ATF7

CDC20

ST7ITM2B

SAPS1

ZP3

BMP1

PYGM

ORC1L

NTRK2

CADM3

ABCC3

ERLIN2

UPP1

ATP6V1D

GSTA2

ZNF593

NFIC

COX10

CYP3A5

EXOSC2

CCDC69

HMX1

DNAL4

RHBDL1

HCRTR1

TACC2

CDC42 PHF20

GUCA2A

GRIK5

SLC5A2

GGT2

RP1-21O18.1RAF1

EPHB2

FCGRT

PTH2R

CAPNS1

POU2F2

OBSL1

SMARCD3

EVPLDEPDC5

THBS3

CLINT1FANCL

FAM13A

NUP214

DUSP7

ACR

KRTAP5-9

UTRN

CREB5

DYNC1I1

ACVR1B

BAHD1

MCRS1

TXNRD2DGKG

SKAP1

EPS15

NCDN

CLCN6

IFNA6ZBTB25

PAK4

ABCA2

SULT4A1

ALAS2CELP SOCS3

UBE3B

CDK6E2F4

GPR161

ABCA1TPI1

SLC11A1

BPHL

HSP90AB1

C12orf47TGOLN2

GALK1

TAF15

GNL1

MMP19ADCY1

SSTR2

TTYH2

LOC201229

PIK3CB

GPR3

RGS12

MFAP2

CDC34

NFATC4H2AFX

PTK6

ARHGEF6

LIF

CNTN2

PEX6ARID3ASLC7A4

GPR35

IL7R

MYO1C

IGKC

UTF1

RELA

PTPRDSLC6A7

RFC1

CSPG4

CDC2L5

BCL11A

PDPK1

STATH

SEMA5A

IAH1

TNPO1

PRKCE

C9orf114

KLRC3

ZNF101

HTR2B

SLC7A2

FDXR

TREX1

EVX1

PXDN

GRINA

MAGEA4

PLXND1CIZ1

JAK3

PDHA2

CROCC

SPARC

COL1A2

ATG2A

MAN1A2

KDM1

MUC5ACKIF20B

NTRK3

PRSS16

CDR2L

TEKT2

ILVBL

STXBP1

STIM1

LGALS1

EFNA1HNRNPA3

CCNI

RBPMS

SFRS11

NUP210TTC3

ACTB

BCAT1

ZNF134

CEBPE

DKC1

RBM4

TUBGCP4

LOC552889

CAMSAP1

TP53TG5

KDM5B

GPR19

SHBG

XRCC2

CEACAM3

GP2

SKI

CACNB4

RAB8A

FAM5C

GH1

TROAP

THRA M6PR

PPAP2C ProSAPiP1

SLURP1

KRTAP26-1

CADM4

OAS2

YME1L1

NRG2

TBX19SLC15A2

IGFBP6

LOC100128031

VDAC2

sept-06

NONO

FAM127ADYRK1A

MARCO

MTUS1SFRS2

CRYAB

ZFP36L2

ZNF451

CTNNA1

SUPT6H

ZMYM3

SLC4A3

TGFB1

NOMO1

CYHR1HSPB3

NAALADL1

RALYL

GIPR

NECAP1

DHX9

DNAJA1

HMGXB3

PCDHGC3

UBE2S

HIST1H2AG

WDR62

RHOB

LTK

PLEKHG3

PFKFB2TIAM1

EIF3F

WFS1TGFBR2CSNK1A1

LPXN

H1FX

SFRS4

OASL

SPINT1CALR

TNFSF12

UBXN1

C12orf51

NCKIPSD

KIAA0430

SEC16A

RSAD2

NAAA

PLEC1

DCTD

SPG7

MAT1A

SLC5A5

PIK3R3

MFN1

MPV17

CD40

C9orf33

GHRH

LOC729400

SMTN

PPP1R12B

DFFB

MMP15

NBAS

IRGC

HIRIP3

DPF2

MT1A

MT1X

DCAF8

LSR

ARHGEF10

NDUFV1

DGCR2

HOXC4

NEU1

ACADVL

FGFR1

DYNC1LI2

PLXNA3

PPIH

SCAND2

HS3ST1

GNAS

LCT

MTG1

MGAT4C

EIF3CL

MXRA8

CRYBB3

CAP1

NFYB

MRC2TRAF3IP2

UQCRC2DHCR7SPTB

TADA3L

CLSTN3

RRAGA

EMR1KANK3

NID1

ACTN1

POU2F3

INPP1

TNFSF9

MOGS

GABARAPL2

NOLC1

FAM38A

HPS1

KHSRP

TNFAIP6

H2AFY

VASH1

GYPC

IRF2

C1orf63

SIX6

FOXK2

EIF2C2

DHX30

BAT3IRAK1

ZFR

HNRNPCDNMBP

ZNF143

TCEB1

PRCP

FBXO21

PTOV1

SERPINB7

UBE2C

LIMK2

EIF3G

TLE1

DAP

ILF3

MLX

CTNNBIP1

CYP2A13

SLC25A1

PAN2

CPSF1

SLC7A8

ZBTB16

UBE4A

SLC38A10

C8orf84

TFDP1

RAE1

SMR3B

RNF185

SERBP1

LSM14A

ACTR2PPP3R1

SLA

ADD2

MPZ

ALDH1B1

MGAT5

CAD

MAML3

FGRZER1

ZFY

CCRK

NRGN

MED24

KCNJ9

ERBB3

SAG

HSD17B8

ADCYAP1

MGAT3D4S234E

TRA2A

MLL

DBT

ABCF2

DPH2

CD79A

BRD3

ARFIP1

BFSP2

NDUFS8

DDX23

IKBKE

SKP1

RYR3

TGFBRAP1

STX2

MAGEA5DUS4L

NOL7LRP3

LRIT1

PRPS1L1KCTD20

DOHH

DPH1SPG20CLIP2MAGEA9

RPS26

EIF5B

ZMYND10

SERPIND1

F8

SVEP1

ZNF259C19orf29

MDM2

VSIG4

MEF2C

MPDU1

ZNF202

CD7

ABL2

PMS2L1

C19orf57

LOC283079

INSL3

SLC6A9

HOXC11

KRT16

PCNXL2 STBD1

ARL2

DRG1

CLDN5

HRK

PPP2R5D

CTBS

PDAP1

MGAT1

MFAP3

BNIP1

LYZL6

VPS41

HIPK3

CHP

PFKFB3

RNF113A

GPSM3

ZBTB24

MAPK9

ESR1

ATIC

SLC6A11

ABO

GYPB

GCN1L1

CHST3 NUAK1

ACP1

KLRC2B4GALT3

EMX1ARHGEF12

DGKD

KCND3

NCAM1

PLEKHB1

C19orf36

SNX27

KRT33A

PAPPA2

FLT1

CCT4

FZD9

PTEN

ITGB3BPHTR1D

HIST1H3E

FOXE1

MLANA

RNF5

CHKAODF2

XRCC5HTR2A

BBC3MT1B

SLC9A1

PPP3CA

ALOX5ERCC4

KCNA3

DNAJC7

PTPRR

PLEKHM2

ANK1

RBCK1

ZNF467

CYP2C18

HMGA2

PRKAR2ASLC29A2

COL19A1

CNTN6RABGGTA

PAX7C2orf24

CCR5

GCK DMP1

SSTR5

KCTD17

CHST1OLFM1

RXRB

SLC6A4SH3PXD2A

BZRAP1

SCN1B

PAFAH1B1

RBM4B

IGH@

MT3 ZC3H7B

DTX4PSMC5

POLR2J

NUP155

LRCH4

CECR7BMP8AASPHD1

NME6

MDH2DMBT1

CIITA

IMP4PRH1

IFNW1

BMI1

NEURL P2RY6

GATA2

CUBN

ARID4A

ZNF22

COBRA1

SLCO2A1

DOC2A

TNP1

TERF1

TBC1D9B

EDEM1

CRABP1

FURIN

ZKSCAN5

TUB

ABCC10

KLHL35

CHP2

KIAA1539

IL3

DGCR9

SRPK3

ARHGAP26

ERC1

XDH

ODZ4

SLC18A3

ENTPD2

DUSP14

CCDC85B

SLC17A7

RAMP3

GRB10

FGF3

KIAA0467MFSD10

GFPT2CXCL5

PIGR

VAMP4

FLRT2

BCAM

PRKACG

PHLDA2 EIF2B1

SH2B2

NCR3

MDK

WNT1

LOC91316

SLC16A2

PBX2

ILF2

S1PR4

ACOT8

ADCY2

KLHL25

TRIM33

KCNH2

ALPK3

PHF8

IFNGR2

CLEC11A

GABRG3

SCN8A

CYB5R1

HNF4G

MC1R

ZNF187

FSHR

SERPINA4TNPO2REG3A

PGM3

CALCOCO1

CREB3

PTMA

YWHAQ

STIP1

TCF15

FKBP2

SGCA

EP300

SLC1A6

USF2

NDUFS7

GRIN1

C6orf120EEF1D

BGLAP

BRD2

HLA-G

MIF

DDOST

RPL8

PHB2

SCAMP5

RASGRF1

TMBIM6

GOLGA2

SUMO1ACP2

ACAA1

HSPA8

PYCR1

P4HB

PSMA2

ATP6V0E1

RNASE4

SIAH2

ERP29

IGFBP2

PSMA3

HSP90B1

GTF2A2

PDIA3

MGST3

PARK7

VWA5A

TUSC2

ATF4

ZNF132

BAT1

ERGIC3

NME3POLD2

GJB5

OAZ1

FUS

JUNBNUP188

GPX4

TUBB2C

DAXX

TALDO1SOX3MMP23B

TENC1

TP53

MATR3

HIST1H2AH

ATP10B

ARFGEF2

PVRL1

ZBED1

ERCC1

IRF8

IFIT1

MPP2

KRT6A

SLC17A1

IVLSAPS2

TNFRSF10C

TNFRSF1B

MAGEC1

BCR

TCL1A

SPINT3

PSD4

TBC1D22A

MTX1

ERF

NRG1

GIGYF2

CPZ

TPPP

C20orf194

TLE3 IQCB1

GRAMD1B

IL1RL1

AMHR2

TTLL5

HSF4

HOXD9

ZNF337

ECE2

BBS9

GBX1

MSC

RPS6KA3

BAT2D1BMP10

TTC22

ATXN3

PPP1R16B

LDB3

CASP10

HYAL2

NR6A1

MC2R

CDH16

CCNF

BTF3

E2F1

TOP2A

C9orf125

MLXIP

MAP2K4DRD5

MYO16GRB2

SLC9A3

AURKA

UBE2M

CCR4

CXCR5LYPLA2

F2RL3

RGS10

EXD2

C8orf71

CARS

ALDOC

PTPRN

ARR3

TBX1WHAMMRBBP8

MRPL33

MYBPC3

LOC26102

CMA1

FLT3LG

ELAVL2

DNPEP

RPH3A

POU4F1

PPIF

TBXA2R

ZBTB7A

LSM12

ETFB

NAGPA

OSTF1

HCLS1

WWOX

CORT

CALB2

CD34ECD

USP6PHLDA1

GPR143

RAP1GAP

GRM4

DNAJC8

POU6F2

DNASE1

GNAT1

SLN

MEN1

SH3BP1

CYP11A1

C1orf68

CA5A

GOLGA1

POLA2

BRD8

HPGD

PLEKHM1

FGF9

FADS1

SSX1

LAD1

CNTN1

FMO4

TCF20MOBKL1B

PI3

COL1A1 CHST2

RAD51L3

VILL

CTSC

DDX19A

TRIM16

ATP6V0A2

SOX12ADAMTSL2

OPCML

RAB3GAP1

RMND5B

TAF1

ITPR2

JMJD6SCD

MYST2

BTBD2

NPM3

STARD5

GLE1ASMTL

XK PSG7

PRLRH2AFB1

NPAS2SDC3PTAFR

WDR18

DSC1

ZNF76

STK17A

MAPK12

MATN4

ST20

RTN2

TOMM34

BHMTAQP2

GJA1

ATP11A

PRPF19

SIX3 NFIX

LOC644165

MATN3

PAX9

AQP5

PCTK3

KIF22PRB1

CDK2

MFAP4

COPS6

EIF3KCDC25C

IFI35

MYB

H6PD

ARHGAP12RCC1

ARID4B

HTR7

HOXD4

HCG9

OGG1

UQCRFS1 NDRG2

PCDHGA12 EML3TCAP

SLC15A1 NF1CPNE6

DGKA

SETD2NRTN

DAGLA

GYPE

UCHL1

ARHGEF1

ZNF20

LBPCDKN2CAHDC1

OTUB1VAPB

SLC6A8

MYST1

RGS7

ITIH1

ZNF167

ALDH5A1

UCP3DAPK2

SFPQ

FLIILOC729678

PGAP2

LY6G6C

BAX

AMELX

PRKACAMFNG AVPR1B

ACHE

CASK

DKFZP564C196

PDE4C

RASSF9

KLF11

LPIN2LLGL1

RAB3ASTAT2

DMWDAXIN1

T

SLC1A7

IL10RA

ANKRD12HARS

KIAA0513

CCR1C3orf51

SAMD14

ELN CHMP2A

EWSR1

KLF12

TFAP2A

HMOX1

CD99

KIAA0586KIAA0284

C2CD2

TYMP

SEPT8

GPLD1

CHMP7

NFATC2IP

GPA33

ATP1B2

POLR2E

GRK5

TERF2

ETS1

HADHA

DSG1

EIF4BNADK

AMIGO2

CAMK2G

FOXN2

RAD54L2

TULP3

TMCC1

CADPS

MEGF6TNFRSF25

HTRA2

INA

TFCP2

PLEKHO2

COL16A1

ADRA1A

NDST1

MAEA

PALLD

HSPB1

TPM2

TAGLN

CSRP1

FHL1SVIL

MOG

CD1C

C2

CTRL

SLC17A3

MMP25

ZFP36

IKZF1

PPP4R1

GRIN2A

VEGFC

PHKG1

LOC100131311

WWTR1

ACTG2

MYLK

MAP2K2

IGFBP5

CDX2

MAP2K5GNB3

PDLIM7SLC20A2

INPP5B

KRT2FUBP3

PPM1E

FERMT2

FGF8

ATF3

CKB

MYH11

SYNM

ALDH1L1

PEA15

MYL9

CDC6

UTP3

MAP3K14

CCL19

AP3B2

BAIAP3

TP53I11

RND1

VPS72

PSMD14

ESPL1

SF3B2

NRP2

RGN

GPRIN2

GPR44

PCBP1

DPYSL3

SLC6A12

PVT1

PLRG1PARP2

ZMYND11

SNAPC2

FOSB

MAPKAPK3CCHCR1

SHC1

RASSF7

MCL1

GAGE12JPRMT2

RAB14PHKG2

PCDH7

CEACAM4

KIAA0323

NR4A1

H3F3B

C1orf144

SCARB1

FTH1ACTG1

CSNK2BVIM

TPM1

FMNL1

CSK

KIAA0999

ST14

PSMD11

CDC42EP4

PKM2

LCAT

TCF7

CLIP3

RPS6KB2

NBEAL2

NPPA

AHCYL1FAM189A2

PCBP2

ATP2A3

LDHA

COASY

FOXC2

LIPE

GLUD2

IRF5

TUBB3

ZFHX3

RBM5

ZCCHC24

CORO1A

SULT2B1

PTGIS

WDR1

ARHGAP6

ZNF783

PZP

CNN1

TRAK1MGP

FETUB

FLNA

JAM3

C14orf1

KIR3DL3

NCF4

TRD@

USP11

RUNX2

ARL4A

FBLN1

ATP5S

EFNA2

SSH1TNFSF8

CHD8

PROL1

LASS1

BAP1

TAOK3

MAD1L1

DDIT4

MYH2GNB1BCL2

ZNF169

TCTA

DUSP1

PLCG2

CA11

METTL3

ATP6V1A

SMG6

TTLL12

ABCB6

CBLN1

CDK5R1

SERPING1

UBC

SV2A

RREB1

CALCA

PUM2

APBB3

SEPT6

CYP3A4 TAF5

TNIP1

XPNPEP2

SNX13HSF2

ZNF274

FMOD

GJB3

16

Prostate: PCC – SN(C1) Prostate: PCC – SE(C3)

PITPNM1USP11PSG1

SAP18LBPTOMM34

DMWD MYO9BTUBB2BCAMSAP1PXN

TMPRSS6CEACAM3

IKBKG

LLGL2SLC2A1

SIM2STMN1

CD33KIAA0323

FAM189BH6PD

SLC6A2DGCR11

SLC24A1

AANAT

ACR

RUNX1

LOC91316

TUBGCP2SULT4A1 TCOF1

AGAP8 JAG1

RP1-21O18.1 EML3

POU4F1

BCRHLA-DOB

RHBDL1MFAP2

IGSF9BGPR161

CACNB1GRIK5NUDT3

MYBKIFC3CD22

NRP2

MLANA SULT2B1ATP6V1B1

PLIN1SCAMP5

ITM2BAHSG

LYPD3

DOLK

MR1

NTRK3MT4

GRIN2A

HNRNPL LRIG2

KRT33A

N4BP2L1SLC6A9

ARPC4

CCPG1

TTLL12

PRH1

ESPL1

SPN

VAT1

PIK3CB

USP9X

D4S234EANGEL1

ARHGEF12 NEUROD2

FGF7

CYP2A6MUC1C19orf26

HMGXB3

GRB2

HIST1H3ESLC4A3UTP3

MFSD10

SAR1A

WWOX

PDE4CMSC

CDK2ADCYAP1

LEFTY1

AGPAT1

TAZ

PIM1

PAFAH1B1FNTB

ESR1GOLGA1PLEKHG3

TRIM58M6PR

NTNG1

MAPK8IP1

IHHRUNX2

U2AF2

ACVRL1

RPS17

KLHL18

SIX6

RPS5

RPS19

RPL18

MAGEA9

EXOSC2STX16

MYO16

MATR3

CASP2

RAPGEF5

ST5

SEMG2

TUBB2C

TUBB3

TPM1

TUBB2A

SPINK2

SEMG1

AKR1B1

ACTG2 PIPMYH11

GPR35SRPX

RPL29

RPSA

LUZP1

CCDC69

SEMA3F

DSG1

MYEF2

RAB11FIP5CIAPIN1

EDIL3ADAM20

B4GALT3

ALPPL2

CCL16RPA2

PVT1

MYO1C

ARHGEF6

HTR4

CNNM2MRPS12

POLA2

SEMA7A

HSF4

DMBT1

LRRC23PTGS1SLC7A4

LOC100134331

UBE3B

SLC22A24

LY6G6CNCKIPSD

RIMS2

RAD51L3TERF2

IGHMBP2

DOK1

JAK3

EPOR

DNAH9

PRRC1IL8RB

CHST3

ACVR1B

TBX5WBSCR22

HPS1

KRTAP26-1

SLC30A3

NFATC3ITGA2B TGFB1HTRA2

GTF2H3

LY6E SIX3

CNP

SSTR5

GPA33SLC6A8

AP2A2

SMYD5CRCPRREB1

PI4KBSLC6A7

NUPR1

OPRL1

ZBTB7ASVEP1

DGKA

PIK3R1

TUB

ARID3A

LILRA4NQO1

RPL31RPS23

RPS15ARPS12

RPS21

RNASEH2B

AFAP1GP2INA

SLC9A3R2

CDK6RBBP8

ST20

CSK

PAX7

PPM1D

TBC1D22A

SCAMP1

HRK

RPS6KB2

SORBS3

HTR1BMDC1FSTGPR6BAHD1PIGR

ARSEKRT6A

GARNL4XPO6MYH2

HPR

ZNF167CBLN1LRCH4

CRYBA4

GAGE2E

VCX3A

SLC18A3

ANKRD12

POM121L9P

CNTN6

RPS16 GNB2L1 EEF1GRPL27A MCL1 LOC100131311RPS11RPL24

VASH1

SERPINA4THBS1

GYPBRAB8A

RABGGTATNIKTMSB4Y

TBXA2RFOSL1

ASGR1

POL3S

TNPO1

SLAATP2B2

TAF15

UTRN

ENTPD2

SFTPD

CNKSR1IFIT1

RANBP9 CHD8

SLC16A5MMACHC

PGM3

CDK5R1

MYCL1

FUT7

FAM189A2

AQP7

SPRR2C

TUBGCP4

REG1BLY9

LOC145678

RALYL MAGEA4ABCA1

CCNFGGT5

DDX51

AMIGO2C16orf7

ADAMTSL2

ATF6B

DAPK2

RELA

SLC17A3

ELAVL2

PCTK3

IDSSTAT2

ABOCEACAM4

GPR3

ALPK3PAX9

XRCC2CCKAR

OR2B6

STATH

DHRS1PSD4

THRAKIFC1

GRINASHBG

NRG2PCDH9CDR2L

KRT86GPR19HOXC11

IFT140 HPGDSMPDL3B

DTX4ARID4B

VENTXMC2R

CRABP1

CDH4

FAM153A

ABCC8STXBP1

EIF5B

SMARCD2

MTX1

PXDN

LOC552889

ORC1L

RTN2

MCM2

FGF3MGPEIF2AK2

UBE2S

ALDOC

SCARB1IGHG1 SKI

TCF7TRD@

CYP11A1

E2F1

YPEL1

CEP135DDX11

MAG

ZNF500

KRT2

MYBPC3

KCNH2MMP25

PAXIP1HTR7

IGF2T

PCDHGC3

CD34LAIR1

PRPF31

ALAS2ZNF157

ANKRD46

NTSR2

FZR1SPEF1

PHB

NDST1

PTPN9 KCNJ4JMJD6

AGFG2PLEKHM2

GUCA2ALIPE CCDC94

SLC1A7MEGF6

UBE2C

DCISOCS3CCDC86

FETUBPTPN21

AP3M2

PRAMEF12

PPP3CB

NFIX

IFI35BBC3

SERPIND1

ZMYND10

MMP19RB1

NRL

ARHGAP6 POLR2J

SAG

COL6A2

BTF3C9orf114

C20orf194

SNUPN

RBMX2KLHL25

KRT37

CCL22

ARHGEF17

CHKASNAPC4

HSF1

ARR3

GPR162

KATNB1CSNK1G2

BAP1

FGD1CYP27A1

RBP3

CEL

ELOVL5TEKT2COL6A1

CUBN RDH16

SOLH

SLC9A3

SV2A

SLC28A1

DYRK1B

MC1R

YY1

KRTAP5-9AMHR2

SPTLC2

TCAP

KPNA6

SLC6A12

ZBTB25

TPPP

XPNPEP2

C9

IQSEC3ATP11A

TNFSF12LSM7

PTGES

CROCC

PCDHA3ZFY

ARHGAP26

UTF1NOMO1

KRT35

KIR2DL4

SEMA5A

MATN3

HIRIP3

PCDH1GRN

PAMR1

RHO

UCHL1

MAX

SMARCD3

SLC35A2

CAPN5

DDNZNF239

GATA4

INPP4A

RAD54L2PRPF6 CD52

ARMC6KCNQ1

ARFIP2

TAP2DDIT3

JAM3

ZNF593

VEGFC

RBM38UBOX5

CA11MFHAS1

TCF7L2

ABCB1

SMO

HLFOPN1SWPHKG2

HSPA4

SLC25A17

TRIM3

GBX1

SERPINB7

RRP12 NCOR2CBARA1

STX11

PDE3A

C2orf55

CADM4

E2F4OTUB1

PFKFB2

KIF20B

PAX3

DPEP1

CD40MEF2CZKSCAN3

CARD10RPS26

UNC5BHAUS5POU2F2 CNTN1COPS6

EFNA3ADAM15GTSE1 CDX2

TFF3EPHB2

ADRBK1MYST2GNL1SPINT3

AATKABL1

HDGF

APOBEC3CGJB5

SEMA4DTBX19

FAM102A

LTKPPP3CCWNT10B

SLC25A11

FCGR2A

SPC25

ZBED1TRIM33

CHST1

BCL3

PTPRN

TPMT

NUP214

FOXE1

KCTD17GIPRNCDN

CLEC11ABMI1

NRTN

LSM4

RGS9

CPNE6SCYL3

MEN1

KIAA0999

C2orf24

FKBP15

PPP2R5D

CAPN3

CAMK2B

SLC29A2

BCAM

SBNO2SLC12A4

ACP1

ITPR2

PAIP2B

CYP3A5

WDR62MSX1

CSNK2A1

HHLA1

GPLD1

GJB3

IRF4

C1orf105

PRKCBCEBPE

GART

RAB3ANCSTN

RAB40B

HCG9

MAD1L1

RNF167

COX10

EPR1

IL10RA

GPR143

DUSP7

KIAA0284

NLE1

TNK1

GNAS

SCAPER

XRCC5

ABCB6

LAS1L

MARCO

POLR2E

PPP1R8

C14orf1GJA1 AURKA

NADK CHRNB2

IL18RAPADCY2

METTL3

SLC38A10

LOC644165

KDM1DGCR6L

P2RX6

HIPK2

SLC8A1NCR2

DTNA

ITGAM

ZC3H7B

WNT11

HCRTR1

PGAP2

NFATC2IP

TROAP

HMX1

P2RY6

SGCA

PTAFR ZNF264

IL17RA

NSFL1C

TERF1SLC16A3

SOX12

CPSF1

POLG PLEKHM1

CORO1A

FLT3LG

C1orf144

KLF1

ADH1C

PSTPIP1

PLCD1

FAM5C

ATG2A

UPK2

NOC2LCRTC1

FGF2

SEC16A

CRYBB3

GPT

CCL19

MGAT3

TREX1

PRB1

ETS1

BAIAP3 LOC652147

REM1IRF2

RBCK1

FLOT1KIF22

SMG6

UBXN1

PDLIM7

MVDGALK1GCKR

DEPDC5

MLH1PLEC1

RASGRP3 PTH

MAP3K3

IAH1

SAMD14PAK2

AXIN1TOP3B

MAP2K4INSL3

FSCN2

ACAN

WNT7A

KCNF1

UBE2HFAM50AGSTA2

CHMP7GDF5EVPL

MAP3K14

ANG MPDU1

ALOX5

LOC26102

PSMD14

IQSEC1

CRXC19orf50RPL3L

ERFTGFBRAP1

ITGA3SRM

IL3

GRM4

PRPS1L1

ELAVL3

GPRIN2

TGOLN2C19orf29

PIGC

GYPE

LPAR2

F7

RAP1GAP

ETV3

LOC729400PDAP1

PDGFRA

AIF1SSX1

SDC3

psiTPTE22

CCBP2

GIGYF2

GNB3

USP19

RAE1

ARVCF

CYP2A13

SFPQ

POU6F2DUS4L

HTR2A

TBX1

TRIM15

IVLPPP3CA

CTRB2

RGS10

LMTK2

CACNA1B

ABCC10EMR1

METTL1

BAX

EMID1

GRIN2C

AMMECR1

ACCN1RPL11SERBP1

CREB3

SPINT1

OGG1

FDXRZNF592PDPN

DGCR9

TNFRSF25

ENC1 ALDH4A1

PRPH

ELN

MLLT4

TULP3

RASSF7

PTGER3

BRD2PARD3

RECQL5

MPP2KYNU

IGH@

RPS18 RPS8

TACC2

HMGB3

TRA2ACAD

RAGEAQP5

KRT1

TET3

SLMO1EPAS1

MAPKAPK2

COBRA1ZBTB40

GUCA1ATADA3L

CDC2L5

MYST1

DCBLD2

TRAFD1HAP1

CCHCR1

DHCR7DEFA3 RPH3A

ST14 TNP1FLT1NFRKBFEZ1PCNXL2

ITIH4

ASPHD1MGAT1BZRPL1PIK3R3ARTN

PZP LGR5ZNF783CSF1MVK

MXRA8 ZSCAN12COL19A1

SLC5A2PPP1R10

NUP188

OCEL1ODF1

NFKB2SLC9A1

PCBP3LOC201229

LGALS9PAX8TYR

TLX2LYPLA2

RELIRF5HSP90AB1

SLC1A6XDH RBM8A

CHIT1

UGT2B15PSG7

ATP1B2

MTHFRDPT

PNMTPTPRD ITGAD

AGPS

ISLR

OAS2

MAGEA1

ERBB3

GADD45B

CECR7 TYMS SIT1

GRIP2SLN

GAD2

PLIN3

IL4

RAMP3

AFF2CSTF3

MAGEA5

KIAA1539

CCR7DNAJC9

ST6GALNAC4

RPRD2GLT25D2

LOC652147

FGF2

PLCD1 COL6A2

GPT

KLF1

RB1

CHMP7

UBE2H

PTH

MAP3K14

GDF5

IAH1PTAFR BAIAP3

TOP3B

FLOT1MVD

GSTA2

NOC2L

TREX1

SERPIND1

FSCN2

UPK2

PHKG2

ATG2A

GALK1

IL10RA SCAPER

GCKR

POLGKIF22 CRXANG

GPR143

ETS1

MGAT3

AXIN1

RASGRP3SBNO2MAP2K4

CCDC94

FGD1

AGFG2CRTC1

XPNPEP2

ARR3

ARHGAP6

CYP27A1

ACAN

SAMD14

UBXN1

CROCC

UBOX5

RNF167

ADH1C

CADM4

WNT11 LOC26102

HCG9

BTF3

NLE1

C1orf144

ARHGAP26

INSL3

REM1

ZMYND10

MMP19

WNT7A

CEACAM3

UGT2B15

ZNF167

U2AF2

MR1

SKI

TPMTMYST2

LTK

TAZ

H6PD

FOXE1

ANKRD46ORC1L

CDK2

ADCYAP1

SPN

SLC24A1

UBE3B

ACVR1B

AGAP8OCEL1

CAMSAP1

GPR161

LOC201229

LY6G6C

SCARB1

CD34

ARPC4RPS26

C2orf55CSF1

REL

MVK

SIM2

BMI1

JAG1

TUBGCP2

SPINT3

SLC12A4

PFKFB2

CYP3A5EML3

TMPRSS6

PNMT

KIFC3

ALDOC

FCGR2A

OR2B6

LLGL2

CPNE6RAP1GAP

LYPLA2

TLX2

TFF3

HTR2A

GIGYF2

MGAT1

COL19A1 RBM8A

POU2F2

MEF2C

CLEC11ASULT4A1

TGOLN2

ABL1

MYO9BSLC1A6

SEMG2

SEMG1

GRIP2

PIP

NTNG1

psiTPTE22 HHLA1

NTSR2SLC5A2

TOMM34

SLC6A9SLC18A3

TYR

TBX19 IKBKG

STXBP1

RGS9

SLC4A3

TROAP

USP19

XDH

FEZ1

GRB2

MYBTCOF1AATK

MSCEPHB2TRIM58

KCTD17MXRA8

CD40

RUNX2SDC3

STMN1

LSM4

MFAP2

PAIP2B

GTSE1

FDXR

FAM102A

E2F4

DPT

APOBEC3C

CDX2

KRT86ITGAD

WNT10B

OAS2

GRIK5

SLC6A2

HTR7

LBP

INA

ZNF500

RPL11

CNTN1

PCBP3

CARD10LGR5

ZSCAN12SEMA4DELAVL3

ZKSCAN3

NCDN

GIPR

PAX8

KCNJ4

FAM189B

HDGF

USP11

TUBB2B

SLC9A1

NFKB2CCPG1

MC2RPSG7

LGALS9

ATP1B2

WDR62

BCR

PSG1

HLA-DOB

CD22

GARNL4

ADAM15CRYBA4

GNL1

PPP3CCCYP11A1

ADRBK1

ODF1

GRM4

COPS6

ARID4BRHBDL1

PAXIP1

MYBPC3SPC25

CEP135

TET3

NUDT3

CD33

SLC2A1

POU4F1

GP2

FUT7

GPA33

AQP7

PRPH

GPR35

RBBP8

ALAS2

CDK6

SLC9A3R2

IDSB4GALT3

SPRR2CWBSCR22

LY9

DEFA3

SMYD5

KRTAP26-1

MAG

PAX9

DNAH9

YPEL1

RECQL5

AGPS

THRACOBRA1

PRRC1DDX11NCKIPSD

RAD51L3FZR1

SLC22A24

DOK1

CDK5R1

BCL3 MTHFRIGSF9B UNC5B

PITPNM1CACNB1

AQP5

SAP18

ZNF157

LOC100134331

PTPRD

RP1-21O18.1

NRG2

DHRS1

TERF2

PRPF31

SCYL3

ACR

CHST3

RUNX1

IL8RB

RAGEDGCR11

MSX1AANAT

HAUS5

EFNA3GJB5

RIMS2

ZBED1

SEMA7A

FOSL1

ACCN1 HTR4SLC30A3AP2A2

EPAS1EPOR

SLMO1

HTRA2 ITGA2B

ZNF592 SLC6A8IGHMBP2

MAPKAPK2

TRAFD1

Prostate: MIC – SN(C1) Prostate: MIC – SE(C3)

IRF2BP1

PTPRS

MYOZ3

SLC16A5RPL3L

ERGIC3

DUSP1

FAM38A GNB1

EIF3B

NAAA VSIG4TRIM28 GLT25D2

HOXD10

JUNB

C9orf114

NFATC3

ERBB2GFAP

PTPN21ITIH4

CPNE6

REEP2ABCC8RAB14NUDT3CRMP1

FAU ZNF76MYL2

GCOM1

TRIM66

CHST15

TULP3BCL2

ORC1LCACNA1BIFRD2

POL3S

NF1

CAPN3SMOX

RMND5B

UBE2NCCBP2

BCL2L1

LARGE

TAF1CBX3

CCR7

GPR12 ZNF593

KCNJ6

FOXE1

PBX1CARD10SFPQ

DTX4SCNN1D

KIF22ZNF169

MYO9BVCX3ASKI

C19orf29PGAP2

CA11

MSX1

KCTD20CD40CTSG

AKAP3

ATP11A

SERBP1AATK

IL8

NCSTNPDE4C

DNAJA3

IRF5

TRIM3

MR1 HTR7NFRKBHLA-DOB

OGG1

NTRK3

TUBB2B

ALOX5SIT1

PTH

GRM4WNT11

RUNX1

SERPIND1

KCNJ4

INA

EVPLCYP3A5

ELN

SPRR2C

AGFG2

CDR2LLUZP1

ELAVL3

TBX5

CECR7HPGD

PTPRNGRNARPC4AURKAHTR2AOSGIN2

DOK1SLC6A3USP9X

MLXARMC6AFAP1 RP1-21O18.1

VASH1

INPP4A

MAP3K11MYO16ZBTB25

MAPK3

KYNU

SLC18A3RPRD2

H6PDBZRPL1

CD3EPPYR1

ATP2A3

JAG1CSF1NFIX

PARP2

LYPLA2

ARFIP2IGFBP5

NRLLLGL2 GCK

SNUPN

NDST2

CHST3MLANA

KDM5B

MAX

MYOGFAM153A

IDS

EMID1

POLD2

CYP11A1AQP5 HMX1

PHKG2PMS2L1

SRMDEPDC5

ITM2BSTIP1

FNTB

IFIT1

SEC14L1

CD33 MCM2

ESR2CRXMYBPC2RAC2

ACCN1

DDN

PISDATF7

ST5

PTK7

HTRA2

JMJD6

SORBS3EPOR

CHD8CYP2A6

EFNA3

C10orf12NFKB1

OLFM1

CHKA

CNTN1

C19orf26

ADCYAP1

HPCAL1

SLC25A17

CLEC11A

KIR2DL4

GSTA2SLC1A7COL4A1COL4A2P4HBRPL18GAGE2ECAP2

HIST1H3E

POU6F1

GIGYF2

ADRBK1

GDI1

ZNF264

MT3

FANCL

SYNM

MYH11

MYL9

ACTG2

PTPN9ZP2 TAX1BP1 GZMMCSPG4 NADK

MTX1CNNM2CYP3A4

SPC25

MYBPC3LOC201229

COL1A1

NCR1

ATG4B PRPS1L1

CRCP

ALAS2

DPEP1

TFDP2

KIFC1

RREB1

KIAA0284

ST6GALNAC4

TUB

EIF2AK2

AVPR1B

CTNNBIP1

LOC145678

CLSTN3

UQCRC1

MYST1

CHRNB2

LOC91316

SIX3STATH

IRF2

THBS1PNMT

WNT7A

S1PR4EIF3G

BRD2

GARNL4

ARFGEF2

RASSF9

KCNH2

NRP2

GALK1

AMMECR1

CCL22

GTF2H3FNDC3A

GNAT1NFATC4

LIPETAF5

HIPK2 MT4

TNK1

CRTC1

CCDC86

TPPP

ZMYND10

GPR18

TWF2

CCNF

TIMP3ABCC3

TBXA2RARHGAP4

DCI

DNAJC9

TRPC4AP

LPCAT4LEFTY1GAMT

FLT1

CADM4UPK2

GRINA

MFAP2ELAVL2

SLC5A2

FURIN SMOANKRD12

FAM5C

DGKG

SEPT6TOP3B

KEAP1PPAP2C

PAX3ABL1

SLC6A8

ESR1RAB3A

C6orf120

ASGR1NCDNLRRC23

KCTD17DUSP7KIAA0323

COPS6KIFC3KLF1

MAGEA5

CGB

IL10RA

TMPRSS6 ECE2IAH1

SLC9A3R2

PCBP3KRTAP26-1

BRCA1CAMSAP1NFKB2OPRL1

MAGEA1

RELCAPNS1

NOMO1

CEACAM3

RAB8A

SAR1A

psiTPTE22TCF7L2

GTSE1

LILRA4EXOSC2

IL8RB

ENTPD2

ARSE

ATP6V1B1ZNF167CRHR1LRRC48

PPP1R16BDGKA

CBFA2T3

RIMS2

KLHL18

PROX1SLC17A3

PDGFRA

CCR4

ZBED4

C16orf7

ANKRD46KRT6A

USP6

ZNF783NTNG1

MYO1C

CDK2

GJB3

MLL

GYPB

TYMSHCG9

FZR1TPMTIGHMBP2

SIM2LGALS9

DPT

AMIGO2

RPS6KB2PTGES

TUBGCP2GP2

MGAT3

CDK6

CROCCSAP18 LOC652147

BTF3RHBDL1

C19orf50 SDC3

XDH

NEUROG3

FGD1OR2B6

DCBLD2

RELA

PSG1

KRT1E2F4

MVK

COL6A2 GPA33TRIM15 F2RL3BCAM

NCR3

CCDC69

METTL3PRB1CYP27A1

NRTN

DUSP14

HIRIP3ANK1

MFNG

SLC4A3

IGF1RABCB1 MMACHC

POM121L9P

GGT5RAD54L2ITIH3

SLC6A12BAIAP3 LYZL6BFSP2

RNASEH2B

TNFRSF25TLX2

FCGR2A

GCKRPFKFB2

HAUS5

IGF2

IQSEC1PSG7

NCKIPSDABCB6

MYB

ITGADMYST2

AGAP8WBSCR22ADAM15

SLC6A9

GPR35MRPL28

IGSF9B

ATG2ASULT4A1

ZSCAN12

POU4F1

C2orf55

MTHFR

SBNO2PPP3CC

ODF1CCHCR1RPA2HTR1B

RASGRP3PLEC1

EPHB2

POU6F2AHSG

CEACAM4

FLOT1

RABGGTASLC18A1

CAD

FEM1B

EDIL3LPAR2

MAEA

THBS3

FANCA

ALDOC SCAPER

B4GALT3MUC1

LOC26102ABOLY6G6C

POLR2J

LGR5ERBB3

KRT85

PRPH

RAB11FIP5PAIP2B

UGT2B15SLC12A4 EML3

TERTARHGAP6AXIN1SEC16A

CRYBB3PAX9TREX1

CD52

DNAJC4

DGCR11

C12orf47

TRA2A

MFAP3

CHMP7

INSL3

LRRC68PML

XPNPEP2MEGF6

C19orf21

POLG

SPN

HSF4

KHKSLC9A1

UNC5B

CYP2A13

GAD2

TNFSF12ADRA1A

C1orf144TICAM1

ITGAMRRP12

FOSL1MVD

IFI35

TMSB4Y

SLC28A1CSTF3

OPN1SW

KLHL25

EFNA2

COX6A2

SVEP1

SLCO2A1

CCDC85B

ARHGEF12

WHAMML2

ADAMTSL2

RBP3

NTRK2

SLC16A3

CDH16

DNAL4

KATNB1

REM1

C14orf1ITGA10

GRIN2C

LALBA

GRB2

SEMA5A

LDB3

STAU1

SAMD14

RBM8A

ITGA2B

LSM7

NCR2

NTSR2

ARTN

PTGS1

DHCR7

KIR3DL3

PAMR1

SLC16A2

ABCA1

REG1BGARS

PIK3R2

SLC8A1

TGFBR2

ABTB2

GTF3A

SLC6A7

MFHAS1 LOC728849

CACNB1ST3GAL1

IVLAANAT GPR19

GSTM5

GNAS

MAG

SLC22A18AS

POLA2F7

PARD3

TNPO1

COL6A1

BAT3

XRCC2

RASSF7TGOLN2

HCRTR1

ALDOB

TNKSCD72

CDH4MAPKAPK2EPAS1

GLUD2 KIAA0586RAD52

GHITMATF6B

TERF2

VEGFCPCGF1FSCN2

GPTHAP1

TFF3

RGS9

PQBP1YPEL1

SLMO1MGAT1

ATP1B2PRRC1

LSM4

ZNF592UBE3B

TGFBRAP1 RNF5TUBGCP4

RAP1GAP

LOC100134331

SLC30A3APOBEC3C

ZKSCAN3

SLC1A4

RBBP8

MYEF2PRPF31IKBKG

GPR161

MAD1L1STK17APAX8

TRAFD1

CDK5R1TYR

SLC20A2

ZBTB7AKRT35GJA1

TMCC1

CIAPIN1

GRIN2ALRCH4ZNF157LBP

LOC552889

CNKSR1

SPINT3BCL3

ACR GTF2F1TNFRSF10C

INHA

T

CCPG1MAP2K4

LAD1

SCARB1NRG2POU2F2

HDGFPVT1MLH1

TROAP

RAGEOCEL1

SEMA3AFAM89B

COL19A1

CD22

TAZBCRSTAT2

CRYBA4CCKAR

ACVR1BNOC2LIGH@

NAGPA

MATR3

HOXD4

SOCS3

FAM168B

CUBN

KAT5

LOC644165

TRD@

CEBPE

GHRHRHTR6

ZFYKDM1

TK2

PIGC

ARR3 RFC1

CCNA2

ARVCF

POLEDCT

PAK2

GUCA2A

CCL16

SLC9A3PLEKHM2

ZMYM3

TP53TG5

HOXC11IGHG1

KCND3

CBARA1

BTRC

HMGA2

TRIM58

PLIN1PCTK3

ERF

NLE1

ALPPL2

NDST1

UBL3KIAA1024

GNL1

EPR1

HSF1

ZNF500

MEN1

RB1TBX1

AIF1SLC22A24

FAM189A2

TERF1

NRXN1

LMTK2

SMPDL3B

RECQL5

TMBIM6PPP2R5D

ABCF2

MATN3

JAK3

DDX51

GPR3 PTGER3

PSMB8

CRHR2FUSHSP90AB1

PIK3R1

VAPBSETD2

SIX6MAPK8IP1

FOXN2PRKCB PYGME2F1ADAM20

NME6

FAM50A

OTUB1

RASSF1

KRT33A

RPL11 EWSR1

ASPHD1

IQSEC3

DDIT3PLEKHG3

KCNQ1CNTN6

GPRIN2

MRPS12

SMARCD3

ZC3H7B

ACAN

DTNA

ARHGAP26

TBC1D22A

DUS4LSLC6A2

PLCD1

PPP1R10

PTPRD

OAS2

CDC45L

LTK

IL1RL1

SLC15A2

STK25

UBOX5

SCAMP5

SAG

FLT3LGGUCA1A

PIGR

AQP7

TNIP1

HTR4CAPN5

USP19

CSNK1G2

TBX19 CORT

NSFL1CDOK2

VAMP5CEP135

KRTAP5-9

FEZ1 PDPNMELK

P2RX6

ADH1CUBXN1D4S234E

PSMB4RAB31

CREB3

PCDH9HNF1A

EXD2SMG6

SLC1A6

TGFB3TNP1

CDC2L5COX10

SLNIL18RAP

GPR143VENTX

REG1A

GNB3

PITPNM1

FLJ10038

CCL19

PTPRBCCDC94ODZ4

ITPR2

DEFA3MLLT4

AHDC1

ADCY2UBE2H

SLURP1STXBP1HHLA1

KRT37

PRPF6

MEF2C

GJB5

AGPSHLF

PAFAH1B1

CASP2

FGF2

AKT3

SERPINC1

AP2A2

THRA

USP11

IL13ARID4BFAM189B

TCOF1

POLR2E

KRT86

MAGEA9SERPINA4

PDE3A

AQP8

FAM102ASMYD5

HNRNPL

PRH1

SLC25A11

LRP3

GRIK5

MYT1

KRT2

ATP10B

U2AF2

IL4

TGFB1

CHIT1CNP

RUNX2

MC2RSCYL3

FUT7

RDH16

FKBP15

PTAFR

CORO1A

ZBED1

DAPK2SEMA4D

ETV3TM4SF5

TENC1

ATP2B2CHRNA3

KLF11

PAXIP1WWOXMPP2

NEUROD2

CD34

ATP6V0A2

RAD51L3

SERPINB7

DNAJC7CBLN1

GRIP2FCER2

DDX11MED24SV2A

GDF5RAB40BDHRS1MAP3K14

AGPAT1

WNT10BWDR62

DNAH9

RPH3ASLC2A1

UTP3SOX12

METTL1ETS1

FDXR

FGF3PHF21A GPR143

RBBP8

MLANA

MLL

TBXA2R

PHF21A

ARID4B

PAXIP1

KRT2

CASP2CPNE6

GTF2F1

GPRIN2

MYT1

STIP1TGFB1

NSFL1CAKT3 SLC15A2COL19A1TYMS

SCARB1

LLGL2

MGAT1

GNAT1SLC22A18AS

C2orf55

NLE1

TM4SF5

SDC3

ATP11A

MYST2

IFI35

TERF2

PPP2R5D

CBLN1

VAPBLEFTY1

KIAA0586

FEM1B

KRT86

ZBED1GTF2H3

NTNG1

SAMD14

TRA2A

NCKIPSD

UBXN1

CRHR2

ADRBK1

LPCAT4

SERBP1

HIPK2

PTAFRMYO1C MYBPC2

ANKRD12

GUCA2A

LTKMRPL28USP19

SMARCD3

CDC45L

PRPF31

NCR2

P2RX6CADM4

RBM8A

PAX9

ANK1DGKG

VENTXPITPNM1

KCTD17CCKAR

EWSR1

ARHGAP26

DTNAKRT1

ADCY2

MMACHC

MVD MEF2CARSE

CCHCR1

NFKB2KDM5B

SAGITGA2B

HOXC11

TRAFD1

SLC9A3

KIF22

CYP2A13GJB3 SEPT6

USP9X

ABCB6SERPIND1

VAMP5

USP11MAP2K4

MAP3K14

FNTBSAR1A

PTPRD

CRYBB3

SIT1

NOC2LGRIN2A

SERPINC1

CORO1ACYP27A1

UNC5B RHBDL1TCOF1

CHMP7CHST3ITGAMGARNL4

GNB3

MFAP2

CHRNA3CD52

UTP3POM121L9P

RAD51L3

RAB40BLBP

IGSF9B

ZFY

PAIP2B

LOC652147

HLA-DOB

RELA

T

TGFB3

CSNK1G2TREX1

MEGF6

DNAL4GRIN2CCRMP1SFPQBCL2L1

MLLT4

GSTM5

CHIT1

ZSCAN12

ZNF169

IL8

LAD1FKBP15

LRCH4

PDE3A

LYZL6

ARVCF SKI

C12orf47

UBOX5

GRIK5CA11HTR4 INADUSP7

HMGA2PAFAH1B1

SLC28A1XPNPEP2FLT3LG RDH16

TNIP1CCL16

HLFHPCAL1

UPK2

IGHG1

ZMYM3

MAGEA9

DCI

VCX3A

CD40

KCNJ4

AQP7

SLURP1 KLF1

SLC18A3KRTAP26-1

CHRNB2

PTPN21

MAGEA1

CSTF3 MTX1

SLC25A17

ARHGAP4

EFNA3

SLC25A11

PRB1TNFSF12

EIF3G

MFHAS1ARMC6

SPNSEMA4D

TOP3B

ACR

CNTN1C19orf50

KRT35

PGAP2

ADAM20

BTF3EVPLCOPS6

IL18RAP

DCBLD2TRPC4AP

PTGS1 GRINA

ELAVL2

MSX1 DNAH9

SMO

FAM38ATRIM28 ORC1LMAPK8IP1 PBX1LOC644165 HOXD4 KCNQ1 PLEKHG3RECQL5

GJA1

KHK

GAD2RREB1

ERF COL6A1

ABL1SRM

USP6

BAT3

CYP3A4 HIST1H3EEPAS1

RPL11

POLR2J

FLOT1

EML3

NME6

PAMR1

POLD2

HTRA2LOC91316

CYP2A6

ERBB3

BRCA1

TFDP2

RAGESLC22A24

FAM189A2

CHKA

LILRA4

HCRTR1

MUC1

TENC1

EDIL3

SORBS3

POU4F1

YPEL1

LY6G6C

CRCP

ZNF593

CIAPIN1

LGR5

IDS

KIAA0284

TGOLN2FANCA

CNNM2CLSTN3

CD22

LOC201229

FANCL

MYOG

SNUPN

ALAS2

AGPAT1

FCER2

JAK3

CORTMELK

HTR1B

DGCR11

RPS6KB2TCF7L2

MAPKAPK2

REL

PARP2

CCR4IRF2

KIFC1

MAEA

TMBIM6ST6GALNAC4

MLH1

ZBTB7A

RASGRP3

GHITM

CBFA2T3

MAGCDK2

HSP90AB1ST3GAL1 PRPS1L1SLC18A1

ENTPD2 MYO16TUB

IGH@

TERF1 AATK

TGFBRAP1

CAMSAP1CHD8

STAT2FAM153A

TGFBR2

ZBED4

KRT33A

ALPPL2

KRT6A

DHCR7

XRCC2

ACVR1B

F2RL3

INSL3

CEP135

LRRC68AXIN1

KDM1

HSF4

IGF2

CCL19

SLC30A3

FURINIQSEC1

GRIP2

PAK2

ARR3

CD33

PML

SEC16ANRTN

DCT

TBX19

IL1RL1

D4S234E

REEP2

ATP6V1B1CYP3A5

AURKA

ZMYND10

RP1-21O18.1

SLC5A2

C14orf1

TBC1D22A

PLCD1

ITPR2ADH1C

DHRS1

SCYL3

PCDH9

SLC16A5

POU6F2THBS1GPR12

REG1A

RNF5

REM1

TUBB2B

STAU1

TICAM1

CRTC1CSF1

NCSTN

NEUROG3ZNF157

IFIT1

SIX6

C1orf144

POLGITGAD

SMG6PMS2L1

ADRA1A

FOXE1

BFSP2 ABCC8

ARHGAP6

ATG2A

METTL3

MFNGMYEF2

PRPHPDPN

TPPP SULT4A1

JAG1

DPT

ZNF592CNP

GPR35

PRRC1ASGR1

NCR3

IGHMBP2MYO9B

EXOSC2MFAP3 NTRK3

IAH1SERPINB7

MYBPC3LYPLA2STK17A

OSGIN2

NFRKB

CEACAM3

KRT85

SLC20A2

WWOX

RAB14

RAB8A

GPR18

PVT1

RABGGTA

PSMB4AQP5

AGFG2BCL3

TPMTSTXBP1

MAGEA5

AMIGO2

KYNU

UBE3B

TROAP

DEPDC5NOMO1

TRIM15

TMCC1

IL8RB

ADCYAP1SLC9A3R2

FSCN2FUT7

LOC552889

UGT2B15

SLMO1

PCBP3

ARTNDNAJC4

ESR1

PTHPTGER3

CYP11A1

CEACAM4IRF5

CCBP2ALOX5DPEP1

B4GALT3SPC25

GP2

AGAP8

MVKRAB3A

CAPN5EPOR

ALDOCKIAA0323

EPHB2MTHFRBZRPL1

OPRL1

ATP2B2

SBNO2

LPAR2

HMX1

COL6A2

MGAT3SOX12

RGS9PPP3CC

TUBGCP2PHKG2

TBX5 SLC6A8

OR2B6

SLC6A2DNAJC7ESR2TFF3

ZNF783STK25H6PD

NCR1LOC145678

TULP3

EIF2AK2

IL10RA

GIGYF2

HCG9

KRTAP5-9

DOK2

REG1B

GFAP

AANAT

CECR7 NADK

CDK5R1

COX10

RASSF7

TMPRSS6

MT3TRIM58

FDXR

GCOM1

TWF2

SCAMP5

CARD10CNKSR1

NFKB1

MAPK3ABCA1 MRPS12LMTK2GNAS IL4

HNRNPL

IL13PDGFRASETD2

CLEC11A

HPGD

GRN

ADAM15

TUBGCP4

SLC4A3

WDR62

E2F1EMID1

DOK1

SIM2

CROCC

ATP1B2

HDGFLRRC23

SIX3

GGT5

PRKCBNEUROD2

PTPRNKIFC3

HAUS5

PTGES

WBSCR22

C19orf29

OCEL1LGALS9MAD1L1FZR1

AP2A2

VASH1METTL1SPINT3

IKBKG

WNT11DDX11 RPA2

SERPINA4

GCKR

CDK6

WNT10B

HAP1

SLC12A4SAP18

SLC6A9GNL1

ODF1

SLC1A4 ZC3H7B TYRACCN1

SLC2A1

GTSE1

SCAPER

GPA33

INPP4A

UBE2H

MYB

RAD54L2

LSM7

FCGR2A

NRG2

CAPNS1

THRA

ITIH3

SLC9A1

CCPG1

MCM2

FOSL1

XDH

ABCB1

BCR

PSG7CRYBA4

GPR161

CRX

CDC2L5

ELAVL3

psiTPTE22

CDR2LRAP1GAP

HTR2A

ELN

MC2RNTSR2

ECE2

FGF2

SEMA3A

TRIM3

TNFRSF25

PAX8

APOBEC3C

LOC100134331

LSM4

E2F4

TAZ

OGG1

RRP12PRH1

PFKFB2

ATF6B

HTR7RIMS2

ZKSCAN3

U2AF2CD34

NCDNFGD1

GRM4

MR1

AHDC1

TNP1TLX2

ARPC4

LUZP1

PTPN9

RPH3A

PSG1RUNX2

FAM189B

RUNX1SLC6A3

17

Lung-Michigan: FuNeL – C2 Lung-Michigan: FuNeL – C4

MYT1PDGFRL WNT5A

EPS8MLC1

FOXK2

IER2

RB1

PCNA

MRPL19

HAX1

SEPT8 COPB1

BGLAP

CYP2D6

F13A1

USP14

PTGER4

STIL

POLR2B

C2orf3

RRS1

DES

COX7B

SNRPEAQP4

MCFD2

PSMC1CBR1

PRIM1 ASNSCLPS

GPA33

TNCGUK1MUC2USP13

GIP

SUZ12ARVCF

CPT2 JUND

RCAN1

NDUFV2

FGF13CYP1B1

MT1P2

PKMYT1

MYB

MC1R CAMK2G

FRMPD4KLK1HTR1E

SNRPN

MEOX2

GTF2H4

FMO2

HBEGFME1

GPKOWIRS1

CDR2

CRABP2

PLP1

BTG2

HLCSPIK3CA

RAD21

GALNSGNL1

IL11RGN

MAN2C1FUT1

ANKRD36B

EBNA1BP2 HSPA6

ZC3H3PCK1

MAN1A1

S100A3

STK3AGTR2

RASA2

CLEC3B

ITGB1

COPS8

PPBPL2

PML

CTDSP2

PIGH

FOSB

SF1

SLC5A5 TSN

FCGBP

CA3

MEST

CYP4A11

CD300C

ALDOC

TAF13 PTMA

NEK4

CHRNE

KRT2

MTMR11

TAP2

DOC2A

MT1G

FDFT1

STARD3

ADRA2C

CYB561

FKBP1A

NR4A1

TNIP1

PITPNA

SFTPC

ZNF592

CTSE

TPM2

COL11A1

DPYSL3AFAP1

ATP1B2

EDEM1GRIA1

KIAA0226

FMO1

NELL2UBE2A

FBN1

RUNX1

LGALS1

COL6A2

RPL28

KIF5A

GAMT

NR4A3

FN1

TSPAN8

ZAP70

RPL19

BBC3

DCT

FASLG

RPL18

COX7A1

PDGFRBCOL19A1

CD99

RNASE4TPM1 TMEM151B

KNTC1

IGJ

NRGNCDC5LTIMP3

THRA

TSPAN7

TGFB3

PTN

FABP5L7

PPIC

HARS

VLDLR

FUT3

COL6A1

FOLR1

ACTA2

ADAM10

TM9SF1

CYP4B1

C4A

COL6A3

CTNNB1

IGHM

ZNF135

HLA-E

GJA5

CD52

ACP5

CES1

DNMT1

C7VCAN

CKS1B

C4BPA

ADCY7

RRM1

PMCH

BMS1

GNL2

LRRC16A

HK3EPHX1

MN1

TIAL1

MEIS3P1

CXCL12

LAMA4AGL

GRM1

LAMB1

RNASE1

TEC

KIAA0114

PAK2

LSSRAB2A

IL8RB

BCL2L2

C8A

JUNBLAMP2

HSPA13

GTF2H1

BET1

TFPI

EMP1ANXA3PPP6C

CHRFAM7A

ACTG2

GLRXCCL13

DNAJB1

GZMM

ELMO1

FGF2

PRKCB

PRDX6

CHAD

APOA1

GP9

PTK2B

FTH1

GUCY1A3

HSF2

GABPB1

SLC2A4

POLD2

DLGAP1

RPS27

IDS

OCM2

GDNF

CELA2B

CUGBP2

PWP2

HLA-G

FUT6

PLA2G16

FABP4

LRRC14DCHS1

H19RPL36AL

NTRK2

OXTR

CDX1LUM

EMP2

TNFAIP6

LTA4H

ETV5

ARHGAP5

PSMD2

TFRC

MCAM

ZNF250

CFPTALDO1

CKMHSPA4

S100A4

ANXA1

RPS21

CD226

GPD2

HOXA1

DDX1EFNB3MAOB

ATP6V1C1

HMGN3

CLEC2B

CHRND

ORC2L

KCNJ8

HSD11B2

SRPK3

RRM2MPO

RDX

PSEN2

KIAA0232

LMAN2

ARHGDIG

TJP2

C21orf2

HIST1H4CCALCRL

OMD

AGER

PDCD2

ARL4DKRT5

SEPT5

EMG1

TRH ALOX12

RUNX2DGCR5

SCRN1

IGFBP6

ALDH1A1PTPRMAKAP12

FOXF1

HBB

STOM

GATA6

MAPK3

ARL6IP1

THOC1

KIAA0513

ANGPT1

CHN1

CPB2HOXB7CD36

NMI

CSNK1G2

FPR1

ABCG1SERPINB2

GBE1

ALDH2

GRK5

OASL EZR

KCNJ15

KCNN3

PIK3R1

ADH1C

PLIN2

CAV2

KRT4

CAT

ADARB1

EMP3

LPL

PLA2G1B

FGR

CLDN5

GPM6A

CA2

CALD1

CFD

CNN1

SPARCL1

GPC3SFTPA1B

AGTR1

VWF

LTB

S100A9

SPOCK2

AQP1

FEZ1

TM9SF2

IL1R1

PCSK1

PPAT

SMARCA1NT5E

CD46

TNFRSF11B

CXorf40BPZP

GSS

IL4R

NUCB2

RPS25

CDC34

FMR1

CLOCKBMI1

PLAT

CAV1

SEMA3F

SSBP1

GBP1SSB

PASK

TFDP2

POU6F1

SEPP1

LITAF

UBC

SELP

SERPINI1

HIST1H4E

CCR6

CNTN1

CORO2A

MYH8LPIN1

FGFR1

MPZ

TRPC1

RUNX3

DSC3

TNNC1

ACHEARPC2

CYTH2PDHX

ITGAE

CYP2A6

ENPEP

COL4A3POU3F2TYRPPOX

MGAT3C18orf1GYPAGNA11

PECAM1

RAC1

STATH

PCGF2

CPM

FDPS

SLC39A9

SPRR2C

DNA2

VRK2

PTPN11

COL10A1ROR2 SQSTM1 CTSC

HLA-DPB1

LOX PTHLHHIST2H4A

FLI1

ZNF638

PDLIM1

SCTR

CHI3L1LYL1

SEC24CBAG1

ATP5D

SEMA3B

FGF7

TLN2

HIST1H3A

CLCN6

AP2S1

PLN

APPEFTUD2

LBR

HERC1

HTATSF1

NUMB

PTGR1HSPB3

ATP5J

UQCRC2

TIMM44

GCH1

RBMX

MPV17

KYNU

ELAVL4

RPP30

ABCB10

HSD17B10NUP214

TGFB1

MYOM1

SF3A1

KIF5B

SLC15A1

LOC96610

ANXA6

GALM

PCDHGA12FAM65B

AKAP10

ATF3 MEF2A

CTBP1ELK1

ADSS

ACSL1

RPS27A

MED21

PKN2

PSMA2

UPK1A

PNOC

ARHGAP22

ERCC2

GATA3RNASEL

AYP1p1

PTGISZFP36KRT31MAP3K5

MCHR1

FUT4 CCNHUBE2V2

IDUA ZCCHC11LRRC17

H1F0

C8B

PTPN9

PBX3

GSR

SEMA4DRXRG

E2F4

ATP2B3

ITGALF10

HPRT1

FBP1KTN1CACNA1DNME4

EZH2 CCL19

PTPRC

SLC9A6

TBX5ST3GAL1

OAT

PPP2R2B

CCL25

ST6GALNAC2PPP1CC

LEP

RPLP1

DDX21

RAB5B

WEE1

KPNB1

HAS1

TWF1

GNA15

KLKB1

ACVR1

BRCA1

ABAT

CEACAM6

HNRNPC

CASP7RHEB CTSDSPTBN1

SAT1ADRB2

PSMA6

TFPI2

TNNT2

CD1C

GLI3

FST

PRPH

EBP

DEFB1

CYP2J2

GALK1

CDS1

SLPI

HBA2

ALDH3B1

S100A8

NR5A1

EDN2

CPA3ABCA3

P2RY14

ETF1

RARRES2

ANXA8L2

FCGR3B

NPC1QB

TAGLNAPBB2

WARSPDXK

SERPINB6 CYBBGNG11

GAL3ST1

POSTNIL15

SIRPD

HMOX1

PGC

EIF2S1

ELN

HOXB5

ATP2B1TBX2CTSKIQGAP1

CNN3

PEBP1

FAM189A2NAP1L1

POLG

SPARC

EIF5A

AADAC

UGP2

CST6CCL15

TNXB

RBP4

MRC1

MYH11

OXA1L

XRCC6

COMT

MALL

ARHGEF6

SERPINA1

IGLL1

RDBP

NDUFB7

CD22IL10RASEPT6FRZB

PPP3CB

HCLS1

MMRN1

RASGRF1

LCP1

GFER

CR2

DLG1

UGT2B4

STXBP2ITK

SEPT7

GABRA3

CTBS

FGF4

CD79B

KRT7

ACOT2

EFNB1

CSTA

AOC3

G6PD

NKG7

NR2F6MYL9SRPX

CUL7

EDNRB

FHL1

PSMD7

FGF8

SPRR2E

GCNT2CRYGD

TUBGCP3

ITGA5

FGFBP1

FPR2

PTGDS

CD48

BTN3A2

PCOLCE

VCAM1

PEPD

MAP4K1NEUROG1IL13RA2

CLCN5INPP5D

PTPN7

ISCU

SLC11A1

MT1ECYP2B7P1

POU5F1 CDC25C

BLMH

PLEK

KRT85

TULP1

HAGH

TBXA2R

CD6

FCAR

ADAR

PBX1

CORO1A

CXCR4

BCL2

MAP1A

DBI

CYP2C9

KCTD2

FOXJ1ILVBLMB

MPST

MYH6

SLC4A2CFHR2

FES

SELLPRCP

IL16

NPPC

SLC39A7

IDH3BMAPRE2

PITX1

F3

TCF7L2SCP2

KCNJ5POLB

LOXL1TSFM

DLX2 GCNT1

CEBPE COASYBST1

TFCP2 ORC1LSTAT6

PTX3

FLAD1

FAS

PUM2

RPL3

CRY1 PSG7

EXT2

NEFL

TNFAIP2

TCP1

YAP1

PIK3C3

NCOR2

PEX19

UCHL1

MARS

KRT9

ATP13A3

BMPR1A

CPD

INPP5A

MUC8

COX17

CLIP1

ANPEP

HIST1H2AD

GAS1

TXNRD1

ITGA9

NR2C1

LMO4

SMARCD2

SF3B3

YWHAB

ACTB

RUNX1T1

RNF2

NCAM1

TRIB1

GEM

AESFAM110B

HSD17BP1MFSD10

THAP11C2

OAZ1

KIAA0020

GNPDA1

KIAA0040

EEF1A1

INHA

ITPR1

KIFC1

CKS2POLR2LGJA1CLTA

SLC2A3SLC25A3

CNTFRGTF2E2

RLF

FDXR

SIM2

MARCH6

PSMB8

MCM3

ACY1

CDKL5

SFRS2 CYB5A GTF2E1 UBE2D3 ABL2 UCP2 CDK10 TAF10 CLP1 PSIP1 DDX39 ZNF84DUSP5

APLP2IFNA5

NAMPT

HTT

TAF1A

NDUFA12

HRGDEK

BMP5

HYAL1

SDC4

ZFP161

IGF2R

CCR4

MSTP9

AP3M2NID2

PSME2

LPO

TFAM

ERBB3

FLOT2

SLC18A2

GSTO1

YY1

PDE4A

HSPA1A

RELA

BTN1A1

SNW1

RBBP6

MKRN3

KLF4

ADD1ETS2ITGB3BP

ETFARARS

FAM3C

ABI2

SGK1 TERF2GATA1

OR3A1BPGM

RB1CC1

MTHFD1

GPR21

PRKAR2B

SUMO1

ANXA2

USP9X

PDXDC1

NRF1

SOAT1

RRAGB

ZNF154

STS

RAD50

IRX4

ACO1

NCK1

RABEPK

KRT15S1PR1

ZNF24COL13A1

ALPLIFI35

CRHR2

CYP3A4

RBMS2

SP140WRN

LGALS7B

SMAD4

GRM4

NOLC1

CDC27

RNF113A EXTL2

GPM6A

GAS1

CYP2C9

ZFP36

HIST2H4A

FCGR3B

IL1R1

CA2

HSD11B2

IRX4

ANGPT1

IL15

S100A4

SLC9A6OR3A1

KRT5

SAT1

FEZ1

CA3

CCL15

RBBP6

ELAVL4

RELAH19

SPTBN1

CCR4

PLA2G16

CAV2

MMRN1

ITK

UBC

RASGRF1

MT1E

ISCU

EMP2

NRF1

USP9X

TAGLN

MPV17

GALM

SLC11A1

ETV5

EFNB3

IFI35

ITGB1

HBA2

CLOCK

FAM189A2

GPC3USP13

RDBP

ACTG2

LMO4

GATA6

ZNF135

TNFAIP6

SGK1

ANXA1MPO

SFTPC

LCP1

HCLS1

MEF2A

FLI1

CALCRL

EMP1

SERPINB6

PDLIM1

KRT4

FMO2

ARPC2

ARHGEF6MYL9

AGTR2 PLEK

TM9SF2

SPARCL1 HSPA4

TNNC1

MYH6

FHL1

CD52

DGCR5

IL16

NEFL

ADRA2C

RLF

MEOX2

FABP5L7

HBB

PIK3R1

COX7A1

SNW1

ADH1C

MCAM

TAP2

TXNRD1

AOC3

HIST1H3A

TAF10

ANPEP

C1QBGTF2H1

PIK3C3 C7

CXorf40B

POU3F2

AKAP12

TNXBELMO1

DCHS1

C18orf1

GDNF

CYP4B1

PPP3CB

SFTPA1B

ANXA3

GP9DBI

S100A9

PTN

TFRC

GJA1ACHE

CAV1SLC5A5

SRPK3

CXCL12RNASEL

ACSL1

BTN3A2

CHI3L1

CTBP1

RBMS2 SERPINA1 GSTO1CLIP1KCNJ8VWF

MYT1

BCL2 NUP214

KRT2

FOXF1HAS1

FUT3

CPB2 UPK1A

GRK5

RRM2

CLEC3BMRC1

KIAA0232HIST1H4C

LYL1LMAN2

RPL36ALHERC1

CD226NUCB2FN1

DOC2A

SEPP1

LTB

PTGER4CFD

LAMP2

EFTUD2

RXRGSCP2ITGAL

KRT31PIK3CA

GYPA

LAMA4

CPA3

IER2

RDX

MYOM1

ARHGAP5

CES1

PECAM1MCHR1

SRPX

RASA2

MYH8

ARHGDIG

INHA

ACOT2

ST6GALNAC2FGF13

MARCH6

ARL4DRAD21

THOC1

UBE2V2

GSS

CYP4A11

RPS27A

FLOT2

POU5F1

BMP5

CCNH

CUGBP2

CPT2

TPM1

HTT

TPM2

PRIM1

S1PR1

CD300C

TYRPPAT

AGER

FABP4

UCHL1

HAGH

GNL1

DDX1

ADARB1

COMT

TGFB1

NCK1

TRPC1

GTF2H4

TERF2

KTN1

COPB1

PBX1

EZH2

BCL2L2

ARVCF

TIAL1

DES

PTPN11

COL19A1

DNA2

AKAP10

PRCP

EMP3

NCAM1

FTH1

SNRPN

HOXA1

CRY1

INPP5D

NDUFB7

Lung-Michigan: ARACNE – SE(C2) Lung-Michigan: ARACNE – SN(C4)

COL5A2

COL1A1

BGN

ITGB5

COL5A1

INHBA MFAP2SPARC

THY1

VDR

PLAU

RARRES2

COL3A1

HTRA1

POSTNWAPAL NR4A2

IER2

CNN3

COL6A1NR4A1

FAP

THBS2

EGR2

HERC1

INPP5D

MAP4K1

MARCKS

COL4A2

ACTN1

VCAN

COL1A2

NCKAP1L

LIPA

BAP1

KIF2A

RELA

ITGAM

AIF1

ALOX5

IFI30

LAPTM5

PSMB9IL18PIAS1

AMPD3

PEA15

CYTH1

CASP1

DPTIQGAP1 GM2A

SELPLG

ARHGEF6

SLA

FCGR3BSF1

MLLT3

FAS

HLX

GPSM3

CCND2

DAB2SRGN

DOCK2CAMLG

MYD88

TPM4

CTNNA1CASP8

PTK2B

KIAA0430

MAP7

KRT18

SAC3D1

SSFA2

PSMB8

LMO2

SLC31A2PRCP

HLA-DRAHLA-DMB

GPNMB

CAPG

HLA-DPA1

CRY1

RSU1

HMGN4

NDUFB7

HMGB1

MGP

TGFBR2

DARC

KAL1AOC3

NR3C1PMP22

ITGA5FBN1

JUNB

PCOLCE

CCR7

ZFP36

LUM

ITGB7

DUSP1

ISCU

RHOH

COL6A3

CPZ

GDF10ARID5A

TIAM1

HSD3B2

ID2B

FBLN2

LCK

FMOD

LAMA2

ZFP36L2

CALD1

C1R

CYBA

IGSF2

EGR1 TAGLN

LOXL1

CDH11DYRK4

CD72CD2

TLR1

IL2RB

UBE2D2

TRD@DDR2

FN1

ATF3

DPYSL3

CORO1ARUNX3

IRF1

TRBV5-4

EML1

STAT4

DDX5

CD3D

LAMA4

CCL5

CD19

CTGF

EGR3

TPM2

LGALS1PDGFRA

MMP2

C1S

RCAN2

ACTA2

FEZ1 ICAM3

CD6

GNG10PDLIM7

HMHA1

CTSK

COL6A2

MSH3

PRMT2

ZNF266

TAF12

C12orf32

PHF3

GTF2A2 ATP6V0D1

GLG1

EIF4G2

APOC3

CREB5 RARA

RBM6

MYO7A

RAB28 RBBP6NR1H2

BNC1 HEXA

MORC3 STIP1 LZTR1

RCVRN

HTR2A

CRIP1

ZNF638E2F4

TULP1

IARS

ACP1

RCC1

GNA13TDG

UBE2V2

DHX9

HSPE1DDX1

MYL3

MORF4L2

AZIN1COASY

PRDM2SSRP1

AKT2

SKI

ATP6V1C1

CALB2

XRCC5

NMT1NAP1L4

TFCP2 TFAMKDM5A

DLD

IFNAR1

SILVRAB9A

TSG101

HLA-DPB1NCF2 LST1

ACP5

GRN

FGRAPOE

LILRB4

FCER1G

SPNCCR5

PTPN6

CD247

CYP3A7ITGAL

SLC1A3FCGR1A

PIK3IP1 CD52

LCP1

GNPDA1

IL7R

APOC1

MNDACD86

CSTA

SPI1

FGL2

IL10RA

HLA-DQA1

PTPRC

CYTIP

C1QB

LCP2ZAP70

HLA-DMA

ICAM1VAMP2

ANXA6

CD69 IRF2TRIM22

CD74

SYK

IL2

PRNP

GZMA

CCL4IL2RG

TNFAIP3

ENG

CCR2

PTPROHLA-E

ATP6V1F

ITGB2

ARHGDIB

PSAPBTKCD83

AOAH

SEC14L1CXCL12

PECAM1

VIM

SELP

ATP6V1B2

GYPC

RASSF2

PIK3R1

A2M

PAFAH1B2TGFBR3

ADRB2

STOM

SLC8A1

PPP2R5C

FRY

TGFBI

ATXN1

MYL9

FCN1

IGFBP4

LYZITGA4

MAP4

RPA2

ARHGAP19

PSMC3

BTG2

PDE2A

CRYAB

RNASE6

CSF1R UBC

LTC4S

SMARCA2

ALOX15RAP1A

TANK

EMP3

RGS19

HCLS1

STAT5A

LRRC32

SDCBPJUND

CYR61

HIC1

SERPING1

IL15RA

MOBKL1BRTN1

GRM8

PDE4B

CIRBPCFDFILIP1L

CX3CL1CXCR4

NR1H3

TCF4

PLEK

TIMP3

C3AR1

CD14

IL16

CD37

LSP1BCL2A1

TBXAS1

ZKSCAN1

NCF4

SIRPA

GNAI2

LRP1

IFNGR1N4BP2L1

ARHGAP25 RAC2

CD48

CORO1B

CD33

GLIPR1

CD53

CD59HBA2DPYSL2

CREB1

MAOB

ITGA8

ARSA

CAT

MYH10

VWF

ARPC2

FHL1C7

SCGB1A1SERPINF1PPAP2B

GPC3

SFRS7

HSD11B1

PROS1DPYD

NBL1

FMO3

SSR4

ANGPT1

PTGDS

GRK5CYP4B1

TRPC1

FMO2

XBP1

INPP5AADARB1

SPARCL1

MFAP4

COPB1

PTPN7

CHRM1CALCRLCDH5

HLA-C

PLOD2

MAP4K5SNAPC1

ATF6

LOC91316

CCL15 CPA3

LMOD1

MYH11

TPSAB1

COX7A1

CFH

SRPX

U2AF1

HOXA5P2RY14COL4A1

SYT5

PRG2

NTN3

HERPUD1

MYO1C

GABRB1

PSMB4FERMT2

ADH4

GTF3C2

CHRM4SNX19

FEZ2MYOM1

SCAMP1

MAP3K11

BRAF

AHRS100A10

SNTB2

ANXA2

CTSHCSNK1E

SLC35B1

BIRC2 PKNOX1MOGSP4HB

TIAL1

ZYX

NME1

COL16A1

INPPL1

GPS1

PPP2R1A

NONO

IL10RBSRPK1

TRIP10

SUMO1AFF2

GHRHR

CYP1A2

ARHGDIA

OPCML

PPY

BAK1NR5A1

KIAA0125

TNFRSF17

SLC25A16

RPL41

IGL@

IGHG1

CD38

MS4A1 IGHM

CD79B

CD27POU2AF1

CD79AGINS1

ZNF136XPCPUM2SEPP1

TGFB1 SP3IL11RAMIF

PURA

FLII

DPF1

RAD21TIMM17A

PRDX3

ATP2A2

FMR1

PSMD1

NOLC1

ATP5G3

SFRS2

HNRNPF

TWF1ARL3

PSMD14

MUT

CLNS1A

SLC25A11PSMD12

BCL9TLK2

C1orf61

DEFA4

POLR2G

FAM38A

PUM1

RYK

UNC119DNM1L

RBBP7

EIF3AESR1

UBE2A

CCT8

LMAN2

NUMB

UBXN4

PTPRF

MLANA

SART3

PCBP1

XPO1

SLC25A3

CDC42EP1

NKX2-1

PMS2L3

AKAP6

RHEB

MPO

STMN1

AADAC

LMNB1

KIF2CMCM2

MCM3 RPL8

PHYHIP

PGK1

CDC20

TYMS

NEK2

MCM6

DLGAP5

MCM4CDC2

HMGA1

CSTF1

KPNA2

CHERP

CDKN3

DTYMK

CKS2

VCL

ZNF133

EPRS

GALE

EIF4A3

TBL3

MELK TTK

TROAP

ATP5B

RGS3

GTF2E2

LBR

CKAP5

CTPS OCRL

PRKDCNUP88 SSB

SRP9

CLTC

RAB1A

PHKA1DDX10

MRPL3

TTC35

RPL6

RPS10

PSMA4

N4BP2L2

MCM7

PSMD7

ARL1

BRF1

PKN1

TMEM131

SMARCA1

MUM1MAN2C1

UBE2S

PALM

JUP

PMS2L5

HADHB

GSPT1

NAMPTPSMA5

SMC1A

RBM10

EIF2S2

YARS

EZH2RPP38

DAZAP2MRPS22

SFRS3

MAN2A2

NNT

CCT3 ATP5C1SNRPG

BYSL

MRPL12

KIF11

CKS1BSLBP

RFC4MKI67

H2AFX

GGH

MYBL2

PSEN2

NCAPH

MRPL19

PCNA

FLAD1

HNRNPA0

ILF3

COPS5

CYP51A1 SNRPB

CENPAPAICS

PRLR

PLK3

GIPR

CCNB1

FADD

ANXA5

CDC123EIF3B

HUWE1

PSMB1

CCT5

TCEB1

SNRPE

KLHDC3

CBX1

FH

CCT7YWHAQ

CENPE

AIMP2

PCSK7

PPP1CC

TARBP1

SFRS9

CLIP1

TM9SF2 THOC1

UBE2H

AGPS

RANBP2

COPS8PFN2

PWP1

GUCY1A3

CST3

KIF14

RPS21

LDHA

REEP5

TCP1

CYCS

EPHX2

CSNK1D

LMNB2

COX7B

ERCC5

PDHX

PIK3CA

GLUL

HPCABAT1

TUBB3

EMG1

GNL2USF2

H2AFZ

DGKG

STX1A

SON

KIAA0232

ACTG1

CCNA2

HNRNPA2B1

PCBP2

SNRPF

NARS

UNG

LOC150786

ATP6V0A1

SNRPB2

ATP1A3

MTHFD2

UBE2N

CTCF

PTPN11

PSMB7

YIF1AGOT2

NUP153

ALDH18A1

TRAP1LSM1

ELAVL1

TROVE2PPIF

EBNA1BP2

PCCB

CDC25C

TP73

C20orf24 GRSF1

GART

DYRK2

PSMA3

SRP19

CSNK2B

CCNF

CENPF FOXM1

KPNB1

DAP3

SPTBN1

CHAF1A

TK1

P2RY10

RRAS

DNASE1L1

IL8RBVARS

PSMB2HMMR

BRCA1

SRPR

ARCN1

GK

LOC96610MTIF2

FGFR2

KIAA0101TRIP13

SLC2A1

AGFG1PKM2

TOP2A

NDUFA4

BBS9MAPK1IP1L

CXorf40B

RBBP4

GTF2H1HCFC1

CHIC1

CCL17 CDK9TEF

EXOSC10

MAD1L1

ADRA2A

DRG2

NXF1

RAB21

SSTR3

MAP2K1

ADRB3

GPR162 TRIP12

DDX39

SSR1ZMYND11

HOXB13

HSD3B1

PSMD6ATXN10

PROL1 C1D

RCOR1 CLCN3

NEUROG1PHB2

THAP11

IGLL1

ETV2C4orf10

ROS1

SIGMAR1ATP5O

AANAT

GP2ACSL1

NPDC1HSD17B10

GIP

FLT3LG

RFC1

BNIP2

WTAP

RRP1B

CHRND

CAMK2G

ARHGEF12FRG1

RAB2APIN1

GLI3

MAPRE2

CNP

PPM1A

CALCOCO2

B4GALNT1POU2F2

BCL2L1

HBZ

PPP1CB

JARID2

DDX18PCDH1

PPP3CB

PIK3R2

SETMAR

EP300

APC

IMMT

KIAA0240

ATXN2PSMA1

MYO5BCUL4A

RBM3

DYRK1A

LPCAT3

H3F3A

SNRPA1 FCGR2B

SUMO3

TLN2

FNTA

NCOA4

WWOX

IFNA21

CETP

TKT

AVPR1B

ACTC1

PEX1

SMPD1

FUT6

NHLH1

TNNT3

L1CAM

ZNF592

DRD2

GPR3

SNCB

GTPBP1

LIMK2SMAD7

MAP1A

MYLPF

NUP188

MT1A

FUT3PAX8

LIG3MICB

GNB3

CAMPLOC100128792

C21orf2

GSTM5 EBI3

MSX1

OXT

KCNJ4HTR2C

MT1X

RABEPK

ZNF410

CYP4A11

DNASE2

ZNF142P2RX7

GCK

TNNI1

PRKACA

SAA1

MSNTIMP4

NKX2-5

PSMC5CDSN

PTPRN

DBT

ITPKB

ADAM15

ACAA1

PLOD1

DAP

EED

MAP3K4

CYP2C19

ADCYAP1

CD36

PLCD1

TUB

MYL5

ZWINTAS

DRD1

HLA-DOA

FES

IL3RA

KIAA0195

GYPBLOC201229

OR2H1

THRA

ZBTB17PSD

NTSR1 JAK3

HTT

MAPKAPK2

CKM

USP11

DRD4

CTRC

IL17A

PIGF

KPNA3CIR1

FIG4

HARS

SCN7ALPP

ZNF8PPT2

CDC2L5

LAMP1 ADAM9DNTTIP2

MTF1

CNTNAP1LOC728849

DHX8

RAD23A

SP2

SDF2EIF2S1

ATP5F1CSTF3

MTHFD1PAX6 CCL22

MYH6GPER

C4orf6KIR2DL4

RPS9RPL18

RPL32RPS19

RPA1 EEF1G

KLHDC10ERCC3

COX17

MYOD1

RPS5RPL28

GANAB

RPS28

RPL29 RPS15

PTS PDXK

PIGAPPARD

RPSAP12RPS8

MED1EWSR1

RPL13A

WIPF2TIMM8ANAP1L3KRT2 TBCEZNF91NUTF2 RAB4A ZMYND8ARAFGTF2F2ANKS1A GGCX HMGCL HNRNPH1 TAF7 TTYH2MTORMAP3K5JAK1FAM120ARAC1IFNA8ANXA7 HSP90AB1 IL13RA1SLC35A2CCNHHIST1H2AC RPL11 RPL3 ADAM8 HNRNPH2 BRPF1 KDM5D RPP30RPS4Y1NQO1 HIST1H1CCLP1RGNRORCBMP4 ERBB2 WNT2B CDK5R1 IDUA RASA1 EFEMP2 RPL7A EPHA2ZNF273MMP3RPL35 EXTL2 RBM42 COX6B1 PRMT5 GNAT2APEX1 UGP2 DYNLL1 BMS1GSS

IFI35

HK3

IRF9

CCL3

PSME1

LHFPL2

PLA2G7

PTAFR

CYBB

IRF8FLI1

EVI2B

HLA-DQB1

IL32FYB

TGFB3

IGHD

HLA-DOB

TRIM21

IFI44

IFI27

MX1

NR4A3 VGLL4

FOSB

AKAP13HLA-DRB1

C6orf142

OSTF1PSME2

UBTF

ITGAV

CXCL9

HCK

FOLR2

MRC1

FBP1

RHOG

ALOX5AP

MAPKAPK3

IFI6

CCL13

CXCL10GBP1

CXCL11

CENPB

KIFAP3CAP1

SYPL1

C2

RBM25

RPS6KA1HLA-F

TIMP2IFNGR2

MUC1

SLC28A1

CBFB AHNAK

ENPEP

PSMB10

BTN3A2

TMSB10

SLC10A3

TNFRSF1BMPP1

CCR1

C5AR1

LGALS9

ATOX1

RFTN1

BAT3

AGPAT1

ABL2

GAPDH

PARP1

COL18A1

EHMT2

SOD3

ENO1 TMED10

HAGH

AHDC1

NUP62

KRT8

DSC2

GPIETF1

TUBB

DMPK

ESR2

CDH12

SLC11A1

GHRH

LSR

DRD5WARS

SKP1

CFDP1

PIGHNFYC

TAP1

DDR1 MAPK14

RPL31

WIT1EVPL

NFX1

RPL27A

EIF3E

RPS15A

EIF3F

RPS18

RPS3A

RPL10A

RPS6

YWHAH SFTPA2BYWHABPABPC1GPD1LCDH1

ABCA3

E2F5

TCF3 GTF2E1

ITGB1 SFTPA1BSFTPB

NAAA

NFIC

GOLGA4

PAFAH1B1

RPA3

EPS15

LRRC16A

MTMR6

SEPT7

BCL2

GNAQ

DMD

MICA

TNXB

GNB1

CCNG2ADH1C

CD302

HRG

SNRK

ELF1

TUBG1

NPEPPS

TIE1SERINC3

TCEA2KIAA0141

ST14

RDBPWWP1

PSMA2

SHC3

TOP2B

SFRS5

LTBP2

CCT6A

MYL6B

LAMB2

TMED9

RPN2

MYL7IP6K1

SLC1A5

NFIL3

PSMB6

NUDT1

ARD1A

SF3B2PPT1

NFKB1RASA2TOP1 HSPA1L TCN2 FAM190B GABRA6LASS1 HSF1ACY1

TCF12

UQCRC1

FUT8

KCNJ9

GABRQ

RALY CTBS ZPBP SREBF1NCSTNMRPS31 MBTPS1 LARP4B ATXN3 COL14A1DHCR24 IL9 SKIV2L WFDC2 MALL BNIP1 RPS23

ELAVL4 KRT76

TALDO1PGD

BLOC1S1

AKR1C1

RB1

TNFSF8

IGFALS FHL3NCOA6

TM9SF4 AKR1C3SLC19A1

GALT

SNW1COX7C HRH1 ACPP TAF1A PPIG TAX1BP1 AK1 VASPIRX4 TNFRSF8 GRM1 HSPA9 HARS2 UBXN8 SUPT5H

C7

CD302ADH1C

MGP

LAMA2

BCL2

TGFBR2

NME1

DPYSL2MYH11

P4HB

AOC3

FEZ1

ITGA8

COX7A1RPN2

PTGDS

CPA3 MFAP4

LTBP2

A2M

VWF

SPARCL1

SEPP1

FHL1

ANGPT1CFD

TPSAB1

HLA-DQB1

HLA-DQA1HLA-DMA

HLA-DPA1

HLA-DPB1

HLA-DRB1

HLA-DMB

CORO1A

CCL5

GZMA

MAP4K1

CD48

RUNX3

CD2

CD3D

CYTIP

FGL2

LCP2

ACP5

NCKAP1L

FYB

IL2RG

FGR

CCND2

IL2RB

LCP1

HCLS1

CD74

RNASE6

CD14

PTPN6

ARHGAP25

CD37

TRBV5-4

TRD@

LCK

COL4A1

THY1

COL5A1VCAN

COL1A2

SPARC

COL1A1

COL4A2

COL6A3INHBA

THBS2

FAP

COL3A1

POSTN

CYR61CTGFPDGFRAPWP1UBE2NNOLC1

THOC1

RBBP4

TPM2COPS5

COASY

ACTA2

MYL9

HLA-F

PSMB9 EBNA1BP2TAGLN

MRC1

ALOX5AP

HCK

GYPC

CXCL12

THRA

LIPA

ITGB2

LST1

ITGAM

MNDA

ARHGEF6

JUNB

IER2

POU2AF1

LOC91316

CD86

SIRPA

CYBB

AIF1

SPI1

GM2A

HLA-C

IGL@

IGHG1 GBP1

CXCL10

STAT5A

DAB2

BTK

EMP3

FCGR3B

SRGN

CD52

APOC1

C1QB

FCGR1A

C3AR1

FCER1G

LAPTM5

GLIPR1

EIF2S2

SNRPF

MTHFD2

PLA2G7

CSF1R

PEA15

HLA-DRA

MPP1

APOE

GPNMB

DOCK2

TDG

PSMB8

TAP1

ITGAL

LSP1

CAPG

TCF4

CCR1

NCF4

VIM

IFI30

EVI2B

SLA

IL10RA

PLEK

ZAP70

PTPRC

CD53

HLX

RASSF2

ALOX5

NCF2

LMO2

SELPLG

RPS28 ACP1 GRSF1 KRT18 KRT8 LUM CDH11 SFTPA1BC1R COL6A1 COL6A2 PCNA SNRPB TCEB1 AZIN1 RPS9C1SNXF1PCDH1C4orf10JUNDBTG2CIRBP

CDC2

KIAA0101

H2AFZ

CKS1B

CCNB1

ADCYAP1MSX1

AVPR1B

MELK

DUSP1

ADRB3

ZFP36

TRIP13

CENPA

KIF14

KIF11

TYMS

MCM6

CDC20

TOP2A

TK1

AIMP2

CCT5

SFTPA2B RRAS SPTBN1 KDM5D RPS4Y1 SLC2A1 GAPDH RPL18

CD79AEGR1

L1CAMTEF

APOC3

NTN3 CXCL9

SMARCA2RPS5 HNRNPA2B1 HMGB1 PIK3R1 HTR2CNHLH1 ATP2A2MRPL3

CCL4 CCL3 SFTPB NAAA NBL1TIMP3

18

Lung-Michigan: PCC – SE(C2) Lung-Michigan: PCC – SN(C4)

ASS1

PI3

IL11

KRT6C

RASGRF1

GRIK3KRT5

ALPI

PRKG1SDS

TRD@

ZNF132

PRB2TAF1

TUB

NR5A1

GPR68MPZ

PLCD1CD2

TRBV5-4

IL8RB

NHLH1

EPHA1

CHRM1

LTKIL17A

SPRR1A

TNNI1

FUT1

MATK

RAB33A

MLH1

TNFRSF25

HTR2CTHBS3

NTN3

UROD

MAGEA11

ACVRL1

LEP

COL11A1

P2RY4

C4orf6

ASCL1 IFIT1 MX1CALCA PCSK1

LY9INPP5J

PKMYT1

KLK3

PDCD1

MAS1

IKBKE

LOC643332

CD79AGNB3

CCL22GP9

DRD4

FOXO4

EFNA3

FUT6

TBX2

SCNN1G

HNRNPLGNAO1

PAX6

MYBPC2

PRKCGPRRX2

CSN3

EVX1HAAO

SYN1

PLK1

KCNQ1MYO1F

CRYBB2KCNJ4

MAP3K14

ALX1

GSTM5PSD

C21orf2 KIAA0195GFAPGUCA2A

HTT

HLA-DOA

BCAM

CSN2

RBP3RHOD

FES

OXT

CG030

OMP

CCR9

OR2H1PSG7

IDUA

RARBSOX2

PRKCBKRT4

ATP12A

GSTA2

CCDC88A

GRIA1

FGF2

KCNA3

CACNA1A

GYPA

SERPINB2

CACNA1C

CRHR1

IFNA6

CTAG1B

SRP54

PSMA6

PCCA

CALB1

NNAT

CGA

CDH15

CHERPINPP5E TACR3

LOC729991

MYL4

MLL3

ZNF33B

AKAP1

ELN

PLXNB1

HSF4

PMS2L11

SIRPD

NTSR1

AQP2SLC17A4ETV2

CD53 SNAP25CNN1ACTG2CXCL5 CXCL3PTPRCMYL9 TAGLNTFF1 MUC5AC

STARD3

SPARC

APOB

SCGB2A2

AGTR1

TF

GRB7

ERBB2

INSL4

AGT

PLGLB2

TIAL1

S100A9

INHA

IFI35

ERVK6

CST2MAPK1UBE2L3HCK

METTL1

CDK4

AKR1C4

VIL1HPD

APOA1

DEFA5

NEFH

TSPAN31

WASL

SCG2CHGAMAGEA3MAGEA12KRT13NPY1RNFATC3ZNF185CCL15VWFCST4

GNGT1

ACVR2B

ID1

STAT5BITK CR2

COL1A2

COL1A1

COL3A1

CXCL10

CXCL11

GBP1

APOA2

AHSG

TSFM

TAC1

IGFBP1

STRA13

FABP1

RBP1

CGREF1

APOC4 CCT2

LST1LTF NRGN ITIH2 CRISP3 CD48 CD37F10

DCHS1

LOR

KRT76

HOXA9

GPLD1

PGK1

KEL

BMPR1B

CELA3B

MSX2

CYP11B1

CACNA2D1

MLN

FBXW4P1

AOC3

FHL1

COX7A1

CLEC3B

FABP4

FMO2

ADH1C

SPARCL1

GAS2L1

PRM2

MYH10

CPLX2

HLA-C

MT1L

PRRT1

SSTR1

AMELX

CLCNKB

FEZ1

HR44

MYL2

MATN1

NOS1

SSTR2

CTRCSSTR3

PLCB2

BMP4

NEUROD2

GRAP

HSD17BP1

CD3E

SFTPC

AGER

ANGPT1

A2M

FOXF1

CAV1

METTL1

CDK4

TSFMPI3

KRT5

RASGRF1

PRKG1GRIK3

TAGLNASCL1 CD53TSPAN31 CXCL5MYL9 CXCL3PTPRCCALCACOL1A1

COL3A1 KRT6C

COL1A2

SDS

SCG2CHGAMAGEA3MAGEA12KRT13NPY1RNFATC3ZNF185 ITIH2F10 LTF CRISP3SNAP25 NRGNACTG2 CNN1CXCL11 VWF CCL15CST2 CST4CXCL10

PDCD1

PRB2

RAB33A

TRBV5-4

ALPI

CD2

UROD

P2RY4

LEP

MAS1

MPZ ACVRL1

MAGEA11TRD@

IKBKE

GNB3

FOXO4ZNF132LY9TAF1

GPR68

NR5A1

COL11A1

CCR9

GUCA2A

HNRNPLHAAOCCL22

INPP5JGP9

OR2H1

KCNQ1

MATK

FEZ1

WASL

CG030

PLK1

FUT6

CD3E

EFNA3

MAP3K14

CRYBB2

MYO1F

KRT76

KLK3

TUBPSG7

OMPTNFRSF25

EVX1

MLH1DRD4IL8RB

GFAP

PAX6GSTM5

HTTRHOD

PSDGNAO1

CSN3

KCNJ4FESSCNN1G

SPRR1A

ETV2SYN1

PRKCG

PRRX2

BCAM

GYPA

HSD17BP1

SLC17A4

MYBPC2

MLN

PLXNB1

CHRM1

NHLH1

PLCD1

FUT1

BMP4

PGK1

LTK

HCKLST1 SRP54CD48 MAPK1UBE2L3PSMA6CD37

SCGB2A2

STARD3

IL11

ERBB2

AGTR1 ASS1

INSL4

TF

IL17A

GRB7 APOB

NEUROD2

SSTR1

MYL4

HOXA9

HSF4

CTRC

SIRPD

RBP3

CSN2

GRAP

PLCB2

OXT

SSTR2THBS3

HLA-DOA

C21orf2

HTR2C

KIAA0195ALX1

ITK

S100A9

ACVR2B

ID1

GNGT1

IFI35STAT5B

TIAL1

PRM2

CPLX2

HLA-C

MYH10

MT1L

FABP1

AHSGTAC1

CTAG1BCGREF1

APOA2

CALB1

NNAT

APOC4AKR1C4

PLGLB2

DEFA5

APOA1

IGFBP1

INHAHPD

ERVK6

NEFH

RBP1

VIL1 PCCAGAS2L1

FHL1

AOC3

COX7A1FMO2

SPARCL1

ADH1C

ANGPT1

FABP4

SFTPC

AGER

CLEC3B

A2M

AMELX

ZNF33B

BMPR1B

HR44

PRRT1SSTR3CELA3B

INPP5ECDH15

CACNA2D1

MATN1

MSX2

ELN

AKAP1

CLCNKB

MLL3

GRIA1

CRHR1RARB

SERPINB2

IDUA

KCNA3

CACNA1C

PRKCB

TACR3

PMS2L11

CYP11B1

CHERP

NTSR1AQP2LOC729991

GSTA2

ATP12A KRT4

FGF2

SOX2

Lung-Michigan: MIC – SE(C2) Lung-Michigan: MIC – SN(C4)

SEMA4D SYKMX1SFTPA1B IFIT1SFTPA2B IFI35PSMD4 EIF2AK2ISG15IFI44IFI44LILF2ARHGEF16 HLA-FZNF592

ITGAX

FGR

PARP1

SNRPE

SRPXDPYSL3

HIST1H2AC

PDGFRA

PMP22

NBL1

HIST1H2BE

MMP2

HIST1H1C

BOP1

KDM5D KPNB1

EIF2S1

PUF60

SUZ12

PSMA3RPS4Y1

PSMC6

XIST

CYC1

ACTA2

TAGLNACTG2

MYL9

FSTL1

C1R

C1S

CENPA

CDC20

TYMS

MCM7

CSE1L

CDKN3

H2AFZ

AQP1

MCM4

MCM2

NCAPHTTK

STMN1

CYCS

ATP5G3

CCNA2

TRIP13

HNRNPA2B1

PLK4C20orf24

EPR1

LOC96610

CDC2KPNA2

PCNA

MKI67

DTYMK

CCT2

UBE2S

HMGA1

SNRPB

LTBP2CDC6

CSNK2A1

CKS2

UBE2I

MCM6TK1

RFC4

MYH10

MYBL2

CENPF

UBE2C

UNG

EGR3

ZFP36

NR4A1

NR4A3FOSB

ATF3

PTGR1 GCLM

AKR1C1TSFM

PGD

G6PD

TALDO1

IER2

EGR1DUSP1

GPX2 JUNB

AKR1C3

NR4A2

EPHA1CRYBB1UBE2NRAB21EPRSRABIFRANBP1PTGES3IL6SLC2A3NONOUQCRB HSF4 DECR1 HBBRPL31RPS9 COL6A1RPS28 HBA2KRT5AANATYY1 CFTR RPL7A KRT6CRPL35TOMM20 RBL1CAPGPRKACG FILIP1L DPYSL2 RRASEDNRA PURAVIMTIMP3RPL3RPL10ACOL6A2

YIF1A

COASY

PTGDS

XPO1

CCT4

COPS5

RASSF2

MS4A2

PSMB9

GBP1

TAP1

PSMB8

CXCL9

BTN3A2

SPARCL1

CCNB1

ANGPT1

MTHFD2

TCEB1

A2M

TNXB FMO2

VWF

LOC150786

LRP1ITGA8CFD

CLEC3BPTNCCL15

UBE2V2

DARC

EIF2S2

SERPING1

PSMD14

CXCL12

GYPC

ANXA6

CD83

MRPL12

GPD2

CCT7 TUBG1

RARRES2

AIMP2

CCT5

PHB GAS1

ELN

GOT1

KIF14

KIF11

MELKKIAA0101

LMOD1

GAS6

FOXM1

PSMA5

C7

SEPP1XPC

FHL1

ADH1C

FMO3

MYH11

CPA3

CKS1B

NME1

MGP

LTC4S

CRYAB

COX7A1

EIF4A3

MFAP4

AOC3GAPDH

PAICS

H2AFX

SNRPF

TPSAB1LDHA

POLR2J2

CSNK1A1

EZH2

CTPS

PPIA

RARS

SHFM1

UBA7

TROAP CKAP5

PSMB5

RCAN2

CYP4B1 CD302

TGFBR2

PRMT5

NEK2 KIF2CTOP2A

ARHGEF6HLA-DMA

NCKAP1L

HLA-DQA1

BTK NCF1C

ARHGAP25

INPP5D

PTPRC

DOCK2

CCL19

CD48IFI30

FGL2

SIRPA

CD37

CCL4

NCF2

LTB

SELL

ALOX5AP

LIPA

GPR183

CD14

MAN2B1

ALOX5

CSF1R

LST1

HLA-DRAACP5

HLA-DMB

NCF4SLC31A2

FCGR1A

CCR1

ITGB2

CYBBAIF1

APOC1

PLEK

EVI2B

IL10RA

CYTIP

CD2

TRBV5-4

IL2RB

ICAM3

LCP2

FYB

PRKCB

CORO1A

CXCR4

LILRB4

PECAM1

LCP1

TRIM22

HCK

GPNMB

ITGAM

SPI1

RNASE6SELPLG

GLIPR1

ATP6V1B2

C3AR1

GM2A

HCLS1

MAP4K1

MNDA

SRGN

HLA-DQB1

MRC1

CD74

HLA-DPB1

HLA-DRB1

SPN

LMO2

HLA-DPA1ITGAL

CD53

CD33

HSP90AA1

C1QB

FCER1G

LSP1

LAPTM5

CD52

PTPN6

CACNG1

HPS1

HMOX2

MNT

GCM1

ADCYAP1

L1CAM

SSX5

ASNA1

CTCF

SLC9A1

CPA2

SUMO1LY9

EFNB1

MVK

IRF5

PIK3CA

EFNA3

P2RY4MAPK1CTTN

ACTC1

PIK3R2

MSX1

GH2

PTPN11

MAPKAPK2 C21orf2

CSN2

ALPILIMK2

PIGR

SRPR

MNAT1

KRT18

AQP7

ARCN1

KRT8

IL3RA

NXF1 GALE

PRKACA

NPDC1

ARL1

TDG

SLC33A1

ATP2A2

RENBP

PHB2 C4orf10

IGL@CD3E

HLA-DOB

AZU1 CCR7

LOC91316

TNFRSF17

PCDH1

CD79A

IGHG1COL16A1

IGKV1D-13

IGJRPS7

RABGGTB

KIAA0125

HLA-C

HERPUD1

SRP72

POU2AF1

MUM1

ATP13A3

HRC

IARS

IGHM

CD27

MS4A1

WWOX

CD6

TIAL1 USP14

NTN3

CD19

GNAT1

SPTBN1

SSBP1

CORO1B

LCK

CD3D

EMP3

APOE

IL2RG

MRPL3

CCL5

P2RX5

OXT

TAF12

TSG101

EXOSC10CXCL1

C4orf6

TRD@

TBX2

CLCN7

BCAM

PRB2GUCA2A

KLK3 NTSR1

CETP

MYO1F

TNNI1

NKX2-5

GP9

TH

ZNF132

LIFRCDH15

GNAO1

BMP4

HADHB

GZMA

PLA2G7

LGALS1

SHC3

KRT76

SPTB ATP4A

ETV2

AXIN1

BMPR1B

GSPT1

SYN1

PSD

IL8SIRPD

GNB3

HOXA9EVX1

MYL5

DRD4HTR1B

THBS2

CALD1

MFAP2

THY1

CDH11

LUM

HTRA1

LOXL1

KISS1

CTSKFRG1

HAAO

PDGFRBFBN1

MYL4

SLC17A4CUL1PTPN9

KRT86

COL4A1

COL4A2

VCAN

COL5A2

INHBA

COL5A1

COL11A1

REG1A

CHERPCYP17A1PAX6

TRIP12

TACR3 SSTR3

CACNA1B

GRIN2C

IFNA21

ACTR1A

TWIST1GPR3

UGT2B15

GPER

SCNN1GHTT

C1D

SSTR2

ABP1

CGBCYP4A11

AVPR1B

DRG2

FAP

COL6A3

COL1A1

COL1A2

COL3A1

POSTN

SPARC

PLAU

CAMK1

EBNA1BP2PSME2

JAK3

OR2H1 MT1L

SSTR1

EN2

GFAP

PMS2L11

CTRC

PPY

CHRM1

AMELX

MLANA

C16orf35

NEUROD2

CXCL10

HR44

ITGA5

CELA3B

MSX2

KIAA0195

MYBPC2GYPA

GRAP

KCNJ4RBP3

OMP

GNAT2

ATP1A3

FHL3

DNTTIP2

PTPRS

GRSF1

SCN1B

IKBKESNCB

PRKCG RHOD

PDCD1

SPRR1A

FEVAQP2

CLCNKB

LOC729991

MATK

FOXO4

PAX8

SAFB2

IL8RB

CLIP1

PRRX2

CTAG1B

TFDP1 TEF

HNRNPL

LTKADRB3

DRD2

TNNT3

APOC3

CCKAR

GSTM5 DEFA5INPP5J

RBBP6

PSG7

TUB

CHIC1

MLH1

FES

UROD

THRA

PTCRA

HTR2C

NHLH1SSTR4

PLCD1

MLN

HLA-DOASERPINF1

CYP11B1

CCR9

REG1P

CLCNKB

GYPA

HOXA9

CETP

MYBPC2

PAX6 GPER

OMP

LIPA

IFI30FGL2

MNDAAIF1

CD53PTPRCIL10RA

ALOX5MAN2B1

EMP3

HCK

ARHGAP25

ACP5LST1

LAPTM5

NCKAP1L

HLA-DRB1

CD74

HLA-DPA1

HLA-DMAHLA-DQB1

HLA-DMB

HLA-DPB1

HLA-DQA1HLA-DRA

CD37

LSP1

MYO1F

CSN2

HLA-DOA

PAX8

CDH15

SCN1B

SYN1FEV

SCNN1G

FES

PLCD1

PSD

AVPR1B

IGL@

RBBP6

CCR9

IKBKE

ZNF132

FHL3

SIRPD

CYP11B1

REG1P

TH

PDCD1

RHOD

OR2H1

MLH1

UROD LIMK2ITGAL

OXTNHLH1

TUB

CELA3B

LOC729991

KLK3

CHERP

AQP2

SPRR1A

DRD4

EVX1

CTRC

GBP1KRT76PSMB8

CXCL9PSMB9TRD@

PSME2CXCL10TAP1

POSTN

FAPSPARC

COL1A2

COL5A1

COL3A1

COL6A3

MFAP2

ANGPT1

CD302PLAU

VCAN

TNXB

CD2

TRBV5-4

MCM2 COL5A2 COL1A1 THY1

THBS2

UBE2C

TOP2A

TK1

KIF11

FOXM1

UBE2S

CDKN3

TTK

COASY

EBNA1BP2

TCEB1

TUBG1

ETV2C4orf10SFTPA2BSFTPA1BTPSAB1CPA3IER2EGR1 USP14ATP2A2

CD48

CORO1B

MGP

SPARCL1

LMOD1

ADH1C

FHL1

NBL1PMP22 COL4A1COL4A2

HLA-C

IGHM

CD79A

CD3E

IGJ

NTN3

POU2AF1

TNFRSF17

LOC91316

COL6A1 COL6A2 MX1

CENPF

MCM4

NCAPH

IFIT1

A2M

MYH11

TIAL1 FBN1IARS HTRA1

AZU1 EVI2BNCF4 C1QB

IFNA21

HNRNPL

HTTPRRX2

RBP3KCNJ4 NEUROD2 GP9PRKCG

FCGR1AARHGEF6LCP1

GPNMBPTPN6

LIFR GFAP SSTR4 HTR2C

SSTR3

TNNI1

ITGB2CD52CYBB

APOC1 GM2AFCER1G

ADRB3MLN

C21orf2FOXO4

CLIP1PPY

NTSR1MYL5 TWIST1 PMS2L11

AMELX

CHIC1LCP2

PLA2G7

PLEK

APOE

CORO1AC3AR1HCLS1

RNASE6

COX7A1 MYL9

GYPC LRP1

MFAP4

PHBGOT1LGALS1SIRPAH2AFXNME1BCAMC4orf6 HBBHBA2

PTGDS

DARC TAGLN

ACTA2RPS4Y1

XIST

CTPS

NR4A1

KDM5D

FOSB

ATF3

ZFP36

KPNA2

KIAA0101

CDC20

MELKKIF14

CCNB1

EIF4A3

EZH2

19

ReferencesCerami, E. et al (2012). The cBio Cancer Genomics Portal: An Open Platform for Exploring MultidimensionalCancer Genomics Data. Cancer Discovery, 2(5), 401–404.

Chowdary, D. et al (2006). Prognostic gene expression signatures can be measured in tissues collected inRNAlater preservative. The Journal of Molecular Diagnostics, 8(1), 31–39.

Davis, A.P. et al (2015). The Comparative Toxicogenomics Database’s 10th year anniversary: update 2015.Nucleic Acids Research, 43(D1), D914–D920.

Hamosh, A. et al (2005). Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes andgenetic disorders. Nucleic Acids Research, 33(suppl 1), D514–D517.

INSERM (1997). Orphanet: an online database of rare diseases and orphan drugs.

Magrane, M. et al (2011). UniProt Knowledgebase: a hub of integrated protein data. Database, 2011.

Rappaport, N. et al (2013). MalaCards: an integrated compendium for diseases and their annotation. Database,2013.

Shannon, P. et al (2003). Cytoscape: a software environment for integrated models of biomolecular interactionnetworks. Genome Research, 13(11), 2498–2504.

Singh, D. et al (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2),203–209.

Taylor, B.S. et al (2010). Integrative Genomic Profiling of Human Prostate Cancer. Cancer Cell, 18(1), 11–22.

20