Upload: anne-deslattes-mays

Post on 17-Jul-2015

FGFBP1 pathways control after induction of a conditional transgene in a mouse model: Information derived from mRNA expression pattern analysis

Anne Deslattes Mays, Elena Tassi, Anton Wellstein

Department of Oncology and Medicine, Lombardi Cancer Center, Washington DC 20057

Abstract

Fibroblast growth factors (FGFs) play a significant role in embryonic development, in the maintenance of tissue homeostasis in the adult, and in different diseases. FGF-binding proteins (FGF-BPs) are secreted proteins that chaperone FGFs stored in the extracellular matrix to their cognate receptors and can thus modulate FGF signaling. FGF-BP1 (BP1, a.k.a. HBp17) expression is required for embryonic survival, can modulate FGF-dependent vascular permeability in embryos, and is an angiogenic switch in human cancers. To determine the function of BP1 in vivo, we generated tetracycline-regulated conditional BP1 transgenic mice. BP1-expressing mice are viable, fertile, and phenotypically indistinguishable from their littermates. Five Affymetrix cDNA arrays were run on the kidneys of the FGF-BP1 transgenic mice: two arrays for animals on a doxycycline diet with the transgene switched off, one array after induction of the FGF-BP1 transgene for 24 hours, one after induction for 48 hours, and one after induction for 336 hours, representing chronic induction of the transgene. The results indicate that, when properly normalized, time series analysis of a large array can reveal signal transduction pathways. Pattern analysis allows a systems biology review of the data and supports the exploration and generation of testable hypotheses.

Figure 3 – Heatmap scaled by probe. After RMA normalization and selection of genes significantly over- or under-expressed relative to the average of the arrays with the FGFBP1 transgene off, the heatmap reveals mutually exclusive clusters. Each cluster contains genes that are off in one state and on in the next. Cluster A contains genes that are off when the FGFBP1 transgene is off and switched on when the transgene has been activated for 24 hours. Cluster B contains genes that are off at 24 hours but activated when the transgene has been on for 48 hours. Cluster C contains genes that are off at 48 hours but on when the transgene has been on for 336 hours, that is, chronically. Studying these genes in this order, and with this pattern, allows exploration of the signal transduction and activation pathway that responds to activation of the FGFBP1 transgene.
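
The scaling and clustering described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes log2 expression values with columns ordered as control average, 24 h, 48 h and 336 h, and it assumes that "switched on" means a rise of at least `min_log2_fc` between consecutive states. The function names and the threshold value are hypothetical.

```python
import numpy as np

def scale_by_probe(expr):
    """Z-score each row (probe) across arrays, as in a heatmap scaled by row."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    sd[sd == 0] = 1.0  # flat probes map to zero instead of dividing by zero
    return (expr - mean) / sd

def switch_on_cluster(log2_expr, min_log2_fc=1.0):
    """Label each probe by the first transition at which it switches on.

    Assumed column order: control average, 24 h, 48 h, 336 h.
    The threshold min_log2_fc is an illustrative assumption.
    """
    labels = []
    for off, h24, h48, h336 in np.asarray(log2_expr, dtype=float):
        if h24 - off >= min_log2_fc:
            labels.append("A")   # off in control, on at 24 h
        elif h48 - h24 >= min_log2_fc:
            labels.append("B")   # off at 24 h, on at 48 h
        elif h336 - h48 >= min_log2_fc:
            labels.append("C")   # off at 48 h, on at 336 h
        else:
            labels.append("-")   # no switch-on event detected
    return labels
```

Because each probe is assigned at its first qualifying transition, the resulting clusters are mutually exclusive by construction.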

Figure 5 – Gene Details. The genes found in the clusters of Figure 1 are detailed in tables A, B and C. The genes responding after activation of the FGFBP1 transgene for 24 hours include immunoglobulin kappa chain variable 21, 3-phosphoglycerate dehydrogenase, a zinc finger protein, neuronatin, and homeobox B8. Table B lists the genes activated after 48 hours of the FGFBP1 transgene being on; this set includes hemopexin and major urinary protein 3. Finally, after 336 hours, which truly represents chronic activation of the FGFBP1 transgene, we have one gene, Reg3b, associated with inflammatory response according to its Gene Ontology (GO) annotation.

Figure 2 – Distinct Expression Patterns When Filtering by Thresholds at Time Points. Creating a filter that captures the distinctive patterns expressed at each separate time point lets one understand the major message being communicated at that time point. The patterns of expression are distinctive. Panel A shows the expression patterns for genes above a threshold at 24 hours, Panel B those for genes above a threshold at 48 hours, and Panel C those for genes above a threshold at 336 hours, that is, at a chronic level of transgene expression.
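
A threshold filter of this kind might look like the sketch below. It assumes that the threshold is applied to the log2 fold change of a time point relative to the mean of the control arrays; the poster does not state whether the threshold was relative or absolute, so this choice, the function name and the default value are all illustrative assumptions.

```python
import numpy as np

def genes_above_threshold(gene_ids, log2_expr, control_cols, timepoint_col,
                          threshold=1.0):
    """Return the genes whose log2 fold change over control exceeds threshold.

    log2_expr    : genes x arrays matrix of log2 expression values
    control_cols : column indices of the transgene-off arrays
    timepoint_col: column index of the time point being filtered
    """
    log2_expr = np.asarray(log2_expr, dtype=float)
    baseline = log2_expr[:, control_cols].mean(axis=1)   # per-gene control mean
    log2_fc = log2_expr[:, timepoint_col] - baseline     # change at this time
    return [g for g, fc in zip(gene_ids, log2_fc) if fc > threshold]
```

Calling the function once per time point column yields one gene list per panel.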

Figure 4 – FGFBP1 pathways. Using Pathway Studio, we constructed the shortest path through the set of genes selected by a band-pass filter at each of the time points (24 hours, 48 hours and 336 hours). The diseases, cell processes, and functional classes shown are those that Pathway Studio added while constructing the shortest path connecting the genes in the set.
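
The idea of connecting selected genes by a shortest path can be illustrated with a breadth-first search over an unweighted interaction graph. This is only a generic sketch: Pathway Studio is a proprietary tool and its actual algorithm and knowledge base are not described here. The node names in the usage note are placeholders, not real interactions.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search for a shortest path in an unweighted graph.

    graph is an adjacency mapping {node: iterable of neighbouring nodes}.
    Returns the list of nodes from start to goal, or None if unreachable.
    """
    if start == goal:
        return [start]
    previous = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back through the predecessors to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None
```

With a placeholder graph in which `gene_a` and `gene_c` are linked only through `process_x` and `process_y`, the search returns those intermediate nodes, which mirrors how cell processes and functional classes appear in the diagram as connectors between the selected genes.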

Conclusions

A systems biology approach to analyzing large data sets, such as this study involving five full mouse cDNA arrays, allows the researcher to capture a snapshot of the unfolding remodeling events of an organism's response to change, stress or disease. Analyzing data in this form involves filtering the biological signal from the noise, and sorting the noise appropriately is essential to completing the biological story. Building on the existing knowledge base, we can complete the picture as long as the proper context of the collection, normalization and analysis is maintained. High-throughput technologies, such as microarrays and RNA sequencing enabled by next-generation sequencing, present the researcher with the challenge of extracting meaningful information from the measurements. Software tools and analysis techniques are not a substitute for understanding the biological context from which the data are collected. Engineering and digital signal processing have taught us how to reconstruct a signal from a continual stream of noisy analog data: an adequate sampling frequency and proper filtering are required to separate a meaningful signal from the noise. These same principles apply not only to communication theory but also to the study of large data sets such as those collected from high-throughput systems like an Affymetrix mouse cDNA array.

Figure 1 – Panels 0, A, B, and C illustrate ordering based upon the expression values of the control (FGFBP1 off), 24-hour expression (FGFBP1 on for 24 hours), 48-hour expression (FGFBP1 on for 48 hours), and 336-hour expression (FGFBP1 on for 336 hours). This inspection shows the relative changes of expression at each of these time points.
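
The ordering behind each panel can be sketched as a sort on one condition at a time. The snippet assumes a genes x conditions matrix and descending order; both the function name and the sort direction are assumptions made for illustration.

```python
import numpy as np

def order_by_condition(gene_ids, expr, column):
    """Return gene ids sorted by descending expression in one condition."""
    values = np.asarray(expr, dtype=float)[:, column]
    order = np.argsort(-values, kind="stable")  # stable keeps ties in input order
    return [gene_ids[i] for i in order]
```

Plotting every condition in the order obtained from one of them makes it easy to see which genes keep, gain or lose rank between time points.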

Figure 6 – Graphical Gaussian Model. Using the expression profiles, a quasi-Bayesian analysis constructs the partial correlation network among the top expressing genes. Note that C9 (complement component 9) could not be placed in the context of the data in the Pathway Studio diagram. Using the partial correlations, however, we can place it as strongly positively correlated with Serpina3k, Cyp3a11, Mug1, Tdo2, Mup3 and Hpx, weakly positively correlated with Hamp, and strongly negatively correlated with Tex10. Together, these correlations indicate the placement of C9 in the endothelial response.
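
In a graphical Gaussian model, the partial correlation between genes i and j is obtained from the inverse P of the correlation (or covariance) matrix as -P[i, j] / sqrt(P[i, i] * P[j, j]). The sketch below shows that computation. Because far fewer arrays than genes are available, the sample correlation matrix is not invertible on its own, so the sketch blends it with the identity matrix using a fixed weight. That fixed-weight shrinkage is an assumption standing in for the quasi-Bayesian analysis, whose details are not given here, and the function name and default weight are hypothetical.

```python
import numpy as np

def partial_correlations(data, shrinkage=0.2):
    """Partial-correlation matrix from a samples x genes data matrix.

    shrinkage in [0, 1] blends the sample correlation matrix with the
    identity so that the result stays invertible with few samples
    (an illustrative choice, not the poster's estimator).
    Genes with constant expression give undefined correlations and
    should be removed beforehand.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)            # genes x genes
    n_genes = corr.shape[0]
    shrunk = (1.0 - shrinkage) * corr + shrinkage * np.eye(n_genes)
    precision = np.linalg.inv(shrunk)                 # inverse correlation
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)                # standardize and flip sign
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

Edges of the network are then drawn between gene pairs whose partial correlation is large in magnitude, with the sign indicating a positive or negative association after accounting for all other genes.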