association of genetic variants with macronutrient intake ...the chechen and circassian populations...
TRANSCRIPT
Accepted Manuscript
Association of genetic variants with macronutrient intake inCircassian and Chechan populations in relation to diabetes
R. Dajani, S.J. Shbailat, A. Arafat, S. Sobar, H. Bitar, R. Tayyem,Z. Wei, H. Hakonarson
PII: S2214-5400(18)30030-6DOI: doi:10.1016/j.mgene.2018.03.003Reference: MGENE 415
To appear in: Meta Gene
Received date: 4 December 2017Revised date: 21 February 2018Accepted date: 4 March 2018
Please cite this article as: R. Dajani, S.J. Shbailat, A. Arafat, S. Sobar, H. Bitar, R.Tayyem, Z. Wei, H. Hakonarson , Association of genetic variants with macronutrientintake in Circassian and Chechan populations in relation to diabetes. The address forthe corresponding author was captured as affiliation for all authors. Please check ifappropriate. Mgene(2017), doi:10.1016/j.mgene.2018.03.003
This is a PDF file of an unedited manuscript that has been accepted for publication. Asa service to our customers we are providing this early version of the manuscript. Themanuscript will undergo copyediting, typesetting, and review of the resulting proof beforeit is published in its final form. Please note that during the production process errors maybe discovered which could affect the content, and all legal disclaimers that apply to thejournal pertain.
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
1
Title: Association of genetic variants with macronutrient intake in Circassian and
Chechan populations in relation to diabetes.
Authors:
Dajani Ra†
, Shbailat SJa, Arafat A
b, Sobar S
c, Bitar H
d, Tayyem R
e, Wei Z
f, Hakonarson
Hg,h
aDepartment of Biology and Biotechnology, The Hashemite University, Zarqa, Jordan
bNational Center for Diabetes, Endocrinology and Genetics, Amman, Jordan
cCell Therapy Center, University of Jordan, Amman, Jordan
dFaculty of Medicine, University of Jordan, Amman, Jordan
e Faculty of Agriculture, University of Jordan, Amman, Jordan
fNew Jersey Institute of Technology, New Jersey, USA
gCenter for Applied Genomics, The Children’s Hospital of Philadelphia, Philadelphia,
Pennsylvania 19104, USA.
hDepartment of Pediatrics, The Perelman School of Medicine, University of
Pennsylvania, Philadelphia, Pennsylvania, 19104, USA.
†Correspondence should be addressed to RD
Department of Biology and Biotechnology, The Hashemite University, Zarqa, Jordan
email: [email protected].
Telephone: 00962798859335
Running title:
Macronutrient intake in relation to diabetes in ethnic populations
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
2
Abstract
Background and Aims: Type two Diabetes (T2D) is a complex disease including
environmental and genetic factors. Macronutrient intake variation can be partly related
to genetics. We propose in this study to analyse GWAS datasets for the Circassian and
Chechan populations in order to determine associations between the GWAS data and
nutrition intake in relation to diabetes.
Methods and Results: We analyzed a total of 36 traits including the
macro/micronutrient intake in association with Genome wide association studies data
for 34 persons with diabetes Chechans and Circassians. Within the Circassian
population, there was a statistically significant association between carbohydrate and
calorie intake and T2D associated SNPs on Histone deacetylase 9 gene- and between
Vitamin B2 intake and SNPs on a second locus of potential interest on chromosome 11
near LOC101928989 and teneurin transmembrane protein 4. Caffeine intake was also
associated with significant SNPs unrelated to T2D. On the other hand, only calorie
intake correlated with the occurrence of significant SNPs in Chechans, none of them
were related to T2D.
Conclusion: we have identified a genetic association between intake of carbohydrate,
calorie, vitamin B2 and caffeine in the Circassian population and calorie intake in the
Chechan population. Three of these intake traits (carbohydrate, calorie and vitamin B2)
were correlated with T2D development in Circassians. The association between
macronutrient intake and diabetes development can shed light on causative variants for
the pathogenesis of T2D.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
3
Keywords
Circassian; Chechan; SNPs; Diabetes; ethnic
Introduction
Diabetes is among the most common non-communicable diseases globally. The
International Diabetes Federation estimates that there are currently 194 million people
aged 20 to 79 with diabetes worldwide and that this will increase to 333 million by 2025
(1). Diabetes is the fifth main cause of death in Jordan. According to a 2007 study by
the Heart and Capillary Disease Prevention directorate (HCDP) of the Ministry of
Health in Jordan, diabetes afflicts 16% of Jordanians over the age of 18. Another 23.8%
of Jordanians over 18 years old are on the brink of becoming persons with diabetes,
while the diabetes prevalence in Jordan is 30.5% among both children and adults (2).
Despite extensive research efforts, the genetic basis of T2D remains largely unknown
(3).
Over the past decade, Genome Wide Association Studies (GWAS) have been
used to link regions of the human genome with complex diseases (4, 5). Researchers
have linked haplotypes to type 2 diabetes, coronary heart disease and macular
degeneration (3, 4, 6). Although, it has been difficult to uncover the functional
significance of discovered SNPs, GWAS remain a way to discover genetic risk factors
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
4
for complex diseases. Most GWAS have been performed on European populations.
There is a need to conduct GWAS on different ethnic populations to discover novel risk
factors, especially if we are to apply these discoveries on other populations.
Previously, we performed GWAS on two ethnic populations of ancient descent,
the Circassians and the Chechans. Both the Circassians and the Chechans, are the
largest indigenous nationalities of North Caucasus (7, 8, 9). The Circassian and
Chechan populations are descendants of a single ancient origin with later divisions
along linguistic and geographic borders (8, 10). Previous analysis of classical genetic
markers such as blood groups and serum proteins have shown statistical significant
genetic diversity in the Caucasus (7, 11). The genetic diversity has been confirmed by
mitochondrial DNA and Y chromosome analysis (8, 11). Under military pressure
groups of Circassians and Chechans immigrated to Jordan 140 years ago. Circassians
and Chechans in Jordan are endogamous and have managed to keep their separate sense
of identity and ethnicity (12). The Circassian and Chechan communities represent
unique populations that are genetically distinct from the Arab population.
A recent study has shown that the prevalence of impaired fasting glycemia and
diabetes was 18.5% and 9.6%, respectively, for Circassians, and 14.6% and 10.1%,
respectively, for Chechans. In both ethnicities, the prevalence of impaired fasting
glycemia and diabetes was significantly higher in men, older age groups, married,
subjects of lower educational level, past smokers, and subjects with obesity. Moreover,
low high density lipoprotein (HDL) cholesterol was the most common abnormality
observed in the Circassian and Chechan populations in Jordan (13).
A following study, using the same Circassian and Chechan persons with
diabetes compared them to normal participants from both ethnicities. They found that
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
5
intake of all macronutrients and most of micronutrients (vitamin B12, folic acid, vitamin
C, iron, selenium, and zinc) did not differ between participants with normal blood
glucose and those with impaired fasting glucose or diabetes in both Circassian and
Chechan populations (13,14).
In view of high incidence of T2D, we performed GWAS to search for T2D
disease genes. Our GWAS studies demonstrated that multiple variants underlie T2D in
the Chechen and Circassian populations in Jordan, some of which are shared with other
ethnic groups. No single SNP met genome-wide significance criteria at 5x10-8
level, in
either populations (in publication). We propose in this study to further analyse the
GWAS datasets for the Circassian and Chechan populations in order to determine
associations between the GWAS data and nutrition intake of these populations in
relation to diabetes. Macronutrients intake and dietary behavior, which is related to
many factors such as culture, economics, social and psychologic and health beliefs,
varies among individuals. This variance can be partly related to genetic components.
The heritability is reported to range from 8% to 70% (1). The genetics behind
macronutrient intake can shed light on the biology of dietary behavior. There have been
GWAS that have identified chromosomal regions associates with macronutrient intake
(2-5).
The present study sheds more light on multiple genetic factors that underlie T2D in
association with macro/micronutrient intake in Chechen and Circassian populations in
Jordan. Some of these factors may be shared with other ethnic groups.
Materials and Method
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
6
Subjects: The sample consisted of 34 Circassians and 34 Chechans. Inclusion criteria
included: diagnosis with T2D and receiving medical treatment for it, pure ethnicity
(Circassian, Chechen) in the last 3 generations, living in Jordan, age >30 years old, age
of diagnosis >18 years old, and time since diagnosis and commencement of treatment
>6 months.
GWAS Data: GWAS data produced from two previous research projects funded by
King Hussein institute for biotechnology and cancer (KHCIBC) (Chechan population)
and the Higher council for science and technology (HCST) (Circassian population).
High-throughput, genome-wide SNP genotyping was conducted using the Infinium II
OMNI-Express BeadChip technology (Illumina), at the Center for Applied Genomics
(CAG) at the Children’s Hospital of Philadelphia (CHOP), USA, according to
manufacturer’s standard protocol. Both SNP genotype data and intensity data including
information of Log R Ratio (LRR) and B Allele Frequency (BAF) was extracted from
GenomeStudio project files.
Biochemical data: Biochemical data produced from two previous research projects
funded by KHCIBC (Chechan population) and HCST (Circassian-population) included
Lipid parameters (Total Cholesterol, HDL, low density lipoprotein (LDL) and
triacylglycerol (TG)), and glucose were analyzed for all samples using Enzymatic
assays. The following definitions and cutoff points were used in the study according to
the (ADA) guidelines ( 2014): Glycemic control was considered satisfactory (good) if
HbA1c levels were <7.0%, and unsatisfactory (poor) if HbA1c levels were ≥7.0%. An
elevated LDL level was defined as ≥100 mg/dL, a low HDL level was defined as <40
mg/dL in males and <50 mg/dL in females, an elevated cholesterol level was defined as
≥200 mg/dL, and an elevated triglycerides level was defined as ≥150 mg/dL. Patients
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
7
were considered dyslipidemic if they had at least one of the previously mentioned lipid
abnormalities.
Macro/micronutrient intake data: Macro/micronutrient data produced from two
previous research projects funded by KHCIBC (Chechan population) and HCST
(Circassian population). 24-hour recall was collected from participants to estimate mean
nutrient intake. The individual was asked to recall information on portion sizes, cooking
methods, recipe of dishes, condiments, and beverages for the past 24-hour. All details
about estimating portion sizes were explained by trained research assistants. The intakes
of total calories, macro- and micronutrients as well as other dietary substances were
calculated using ESHA Food Processor (Version 7.9, 1987–2002: Salem, OR). Foods
not listed in ESHA were entered item by item after getting its components and recipe
from the participant.
SNPs analysis: We also performed a manual search of the NCBI database for all genes
related to the list of the annotated nominally significant SNPs (P<0.05). We did not find
databased-evidence discussing the exact SNP. Therefore, we used a multi-step search
technique, which starts by identifying the SNP, determining its location and the related
gene. Then, we searched NIH database for studies that discussed the related gene and if
it is correlated with T2D. After that, we described the full name of the gene, its function
(if known) and the major disease associations based on the NIH database.
Statistical Analysis
All statistical tests for association were carried out using the software package plink
(http://pngu.mgh.harvard.edu/~purcell/plink/index.shtml). Principal components
analysis was carried out using Eigenstrat 3.0
(http://genepath.med.harvard.edu/~reich/Software.htm ).The single-marker analysis for
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
8
the genome-wide data was carried out for different traits using linear regression on
allele counts with the first 10 principle components as covariates. P values and effect
sizes (log odds ratios) with the corresponding 95% confidence intervals were calculated
for the association analysis.
A Manhattan plot is a scatter plot used to display a large number of data-points such as
genome-wide association studies (GWAS). In GWAS Manhattan plots, genomic
coordinates are displayed along the X-axis, with the negative logarithm of the
association P-value for each single nucleotide polymorphism (SNP) displayed on the Y-
axis. Each dot on the Manhattan plot signifies a SNP. Because the strongest
associations have the smallest P-values, their negative logarithms will be the greatest.
A Q–Q plot is a quantile-quantile probability plot where graphical method is used for
comparing two probability distributions by plotting their quantiles against each other.
First, the set of intervals for the quantiles is chosen. A point (x, y) on the plot
corresponds to one of the quantiles of the second distribution (y-coordinate) plotted
against the same quantile of the first distribution (x-coordinate). Thus the line is a
parametric curve with the parameter which is the number of the interval for the quantile.
If the two distributions being compared are similar, the points in the Q–Q plot will
approximately lie on the line y = x. In a case-control comparison of genotype markers,
if the cases are enriched for specific markers, they will deviate away from the control
data at a certain significance level. If there is no difference, then y = x.
For calculation of genomic inflation factor also known as lambda(λ) in genome-wide
association studies (GWAS), λ is defined as the median of the resulting chi-squared test
statistics divided by the expected median of the chi-squared distribution. The median of
a chi-squared distribution with one degree of freedom is 0.4549364. A λ value can be
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
9
calculated from z-scores, chi-square statistics, or p-values, depending on the output
from the association analysis. The algorithm we used is automated in the software
PLINK, which provides accurate measures on the population inflation. In our instance
GIF was 1.00, indicating there was no evidence of population stratification causing
inflation.
Quality control
After filtering out SNPs with minor allele frequency <0.05, missing rate > 0.05, Hardy-
Weinberg Equilibrium Test P value < 0.0001, we ended up with ~588K SNPs for
association analysis. The top hit SNP rs16871944 has reached genome wide
significance level (p=3.35E-08).
An association being detected means that its effect size should be large enough at the
given sample size. For example, as we show in our result section, we found the
estimated effect size for SNP rs16871944 is beta=1071, with 95% confidence interval of
[177, 1419]. Given that N=103, minor allele frequency = 0.27 (rs16871944), and
response CALs SD=500, we can obtain its power as 0.85 when beta=1071. We now
make a power vs effect size plot with effect size beta ranging from 177 to 1419. The
tool website for GPC used is R package “GeneticsDesign” (Figure 6).
Results
We analyzed a total of 36 traits including the macro/micronutrient intake and
biochemical data with valid values in association with the GWAS data for 34 persons
with diabetes Chechans and 34 persons with diabetes Circassians (Table 1 -list of
traits). For each trait that showed SNPs with P<0.001 we plotted a Manhattan plot and
a QQ plot as described in the methods section.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
10
Only one trait, Calorie intake (CALs) (genomic inflation factor 1.00), showed SNPs
with P<0.001 in the Chechan population (Figure 1.a -QQ-plot; Figure 1.b -Manhattan
plot). However, none of these SNPs has been reported in the literature to be related to
diabetes. We further analyzed these SNPs and found a number of them associated with
fatty liver disease. In Table 2 for each trait we show all the SNPS that were statistically
significant and their related genes and diseases.
Four traits showed potential significant signals in the Circassian population. The
first trait is carbohydrate intake (CARB) (genomic inflation factor 1) and the second
trait is calorie intake (CAL) (genomic inflation factor 1.06736) (Figure 2.a -QQ-plot
and b Manhattan plot, and Figure 3.a -QQ-plot and b Manhattan plot, respectively).
Both showed SNPs with P<0.001 on Histone deacetylase 9 (HDAC9) gene which is
reported to be relevant to diabetes. Other SNPs on other genomic regions that are
related to CAL but not to T2D were associated with fatty liver disease Table 2.
The third trait that showed SNPs that were statistically significant was Vitamin
B2 (genomic inflation factor 1.09848) (Figure 4.a -QQ-plot and b. Manhattan plot). The
SNPs were located on a second locus of potential interest on chromosome 11. This
locus is not on any gene (intergenic) and the nearby genes are LOC101928989 and
teneurin transmembrane protein 4 (TENM4). Other SNPs in other genomic regions that
are associated with vitamin B2 trait but not with T2D were correlated with the
occurrence of other diseases such as ovarian and cervical cancers Table 2.
The fourth trait that showed SNPs that were statistically significant was caffeine
intake. Although caffeine intake (CAFF) (genomic inflation factor 1.03214) (Figure 5.a
-QQ-plot and b. Manhattan plot) showed significant SNPs with P<0.001, none of them
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
11
were related to T2D. These SNPs were of significant interest, because they were
associated with Huntington disease and lateral temporal epilepsy Table 2.
Discussion
In the present study, we analyzed a total of 36 traits including macro/micronutrient
intake and biochemical data in Chechen and Circassian populations. Among these traits,
only CALs intake was associated with significant SNPs in Chechans, however, none of
these SNPs was related to T2D. On the other hand, CALs, CARB, Vitamin B2 and
CAFF intake were associated with significant SNPs in Circassians, and all these traits
except CAFF were related to T2D. In addition, our research revealed 7 SNPs that were
associated with macro/micronutrient intake but unrelated to diabetes, and found that
they were associated with other diseases.
We found that SNPs correlated with CALs and CARB intake in Circassians
were carried on HDAC9 gene. This gene is reported to be relevant to T2D. It is highly
expressed in insulin producing β-cells of pancreas (15, 16), and its product acts
epigenetically to prevent the differentiation of these cells during embryogenesis and
after birth (16). A previous study showed that one SNP carried on this gene was
significantly associated with high BMI, T2D and hyperlipidemia, and as a result it
significantly modified the susceptibility to coronary artery disease and increased the
severity to coronary atherosclerosis (17). In our study, SNPs identified on HDAC9 gene
in T2D patients might contribute to the occurrence/and or severity of the disease by
changing the expression of the gene, and consequently changing the epigenetic
regulation of β-cells differentiation during development.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
12
We also showed that Vitamin B2 intake was associated with SNPs located on an
intergenic region carried on chromosome 11. This region exists near LOC101928989
and TENM4 genes. The former is a hypothetical gene with still uncharacterized
function, while the later encodes for Teneurin-4, a member of type II transmembrane
glycoprotein family which includes Teneurin 1-4 (18). TENM4 is expressed in the
central nervous system and in the mesenchymal tissues including cartilage and muscles
(19). Despite its association with several diseases, no relevance was reported in
literature between TENM4 and T2D. Nevertheless, chromosome 11, where the
intergenic region is located, was shown to be related to T2D.
Chromosome 11 carries six genes that are implicated in insulin secretion [ATP
binding cassette subfamily C member 8 (ABCC8, also known as SUR1), potassium
voltage-gated channel subfamily J member 11 (KCNJ11, also known as Kir6.2),
uncoupling protein 2 (UCP2) and melatonin receptor 1B (MTNR1B)], signaling
[exostosin glycosyltransferase 2 (EXT2)] and resistance [uncoupling protein 3 (UCP3)]
(20). SNPs on genes related to secretion (21, 22, 23. 24) or signaling (25, 26) of insulin
were associated with T2D. Moreover, variants on UCP3 gene were correlated with
insulin resistance causing T2D (27). The ethnic background is also associated with
certain SNPs that might have a role in the occurrence of T2D (20). Some of them are
carried on chromosome 11. For example, certain SNPs on UCP2-UCP3 gene cluster
were correlated with T2D in Caucasian Americans (28). Other variants near Fanconi
anemia complementation group F (FANCF) gene were associated with young-onset of
T2D in Pima Indian Americans (29). Furthermore, variants in an intergenic region of
chromosome 11p12 might contribute to the occurrence of T2D in Finnish population
(30). Although chromosome 11 carries SNPs specific for certain ethnicity, it also carries
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
13
certain variants on genes such as potassium voltage-gated channel subfamily Q member
1 (KCNQ1) (31; 32), KCNJ11 (22; 33), ABCC8 (22), UCP2 (34; 35; 36) and ArfGAP
with RhoGAP domain, ankyrin repeat and PH domain 1 (ARAP1, also known as
CENTD2) (37; 33) that were reported to be related to T2D in wide range ethnic groups.
Taken together, the SNPs we identified on chromosome 11 might have a role in down-
regulating the insulin pathway or increasing the insulin resistance. Further studies are
required to uncover the underlying mechanisms of different polymorphic alleles and to
explore whether these SNPs are specific for Circassians or prevalent in other ethnicities.
We explored the significant variants that were associated with
macro/micronutrient intake but unrelated to diabetes, and found that they were
associated with other diseases. One SNP (rs2143571) was related to CALs intake. It was
carried on the sorting and assembly machinery component 50 homolog (SAMM50) gene
and associated with susceptibility to nonalcoholic fatty liver disease (NAFLD) (38).
Four SNPs were related to Vitamin B2 intake. Two of them were carried on DNA
methyltransferase 3 alpha (DNMT3A) (rs11887120) and deoxyuridine triphosphate
(DUT) (rs3784621) genes (39; 40). The former was correlated with the occurrence of
ovarian cancer (39), while the later was correlated with cervical cancer associated with
human papillomavirus persistence (40). The third SNP (rs1393350) was located on
tyrosinase (TYR) gene and used as a predictor of the eye color (41,42). The last SNP
(rs12904216) was carried on solute carrier family 12 member 1 (SLC12A1) gene and
was associated with developing thiazolidine-related edema in patients with T2D (43).
Although this SNP was unrelated to T2D in our study, the difference in ethnicity
between our study (Circassians) and the previous one (Taiwanese) may explain this
inconsistency. Finally, two SNPs were related to CAFF intake. One of them
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
14
(rs2276881) is carried on huntingtin (HTT) gene and associated with Huntington’s
disease (44), while the other (rs7099034) was located at glioma inactivated 1 (LGI1)
gene and might have a role in the occurrence of lateral temporal epilepsy (45).
Our study has its limitations since it is based on a select cohort of T2D individuals, and
as a result, they may not be generalizable. Moreover, the small size of the cohort may
have prevented us from detecting more modest effects. Although we observed
consistent patterns of association, many of these associations were of nominal
significance. Nevertheless, there have been GWAS studies that have been successful
with subject numbers that are comparable to ours (46, 47). Therefore, replication in an
independent cohort should be conducted to support or refute the obtained results.
However, obtaining a second cohort from these populations is extremely difficult as
these are the first ever samples collected. In addition, nutrients intake estimation was
based on 24-hour recall which depends on memory, and as a consequence recall bias is
expected.
In summary, our study showed new SNPs are associated with T2D, and that they play a
role in nutrient-specific choice and dietary preference. Although the associations
reported here are significant, the effects are small in magnitude. Given the complexity
of the investigated traits, it is likely that other as yet unidentified SNPs and copy
number variants, contribute to macro/micronutrient intake. Future studies should focus
on conducting the same experiments in non-European cohorts to determine whether or
not the identified SNPs are associated with macro/micronutrient intake in other
ethnicities.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
15
Funding:
This research was funded by the Hashemite University.
Conflict of interest: None
Author contribution: RD designed research project and wrote the manuscript; HH and
ZW contributed to the design of the research project and performed the data analysis
and contributed to the writing; SS, RT, SS, HB and AA contributed to data analysis and
writing of the manuscript
References
1. Wild S, Roglic G, Green A, Sicree R, King H (2004) Global prevalence of
diabetes: estimates for the year 2000 and projections for 2030. Diabetes care
27(5):1047-1053.
2. Ajlouni K, Khader YS, Batieha A, Ajlouni H, El-Khateeb M (2008) An increase
in prevalence of diabetes mellitus in Jordan over 10 years. J Diabetes
Complications 22(5):317-324.
3. Hirschhorn JN, Daly MJ (2005) Genome-wide association studies for common
diseases and complex traits. Nat Rev Genet 6(2):95-108.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
16
4. Barrett JC, Cardon LR (2006) Evaluating coverage of genome-wide association
studies. Nat Genet 38(6):659-662.
5. Pe'er I, de Bakker PI, Maller J, Yelensky R, Altshuler D, Daly MJ (2006)
Evaluating and improving power in whole-genome association studies using
fixed marker sets. Nat Genet 38(6):663-667.
6. Freimer NB, Sabatti C (2007) Human genetics: variants in common diseases.
Nature 445(7130):828-830.
7. Barbujani G, Nasidze IS, Whitehead GN (1994a) Genetic diversity in the
Caucasus. Hum Biol 66(4):639-668
8. Nasidze I, Risch GM, Robichaux M, Sherry ST, Batzer MA, Stoneking M
(2001) Alu insertion polymorphisms and the genetic structure of human
populations from the Caucasus. Eur J Hum Genet 9(4):267-272.
9. Bulayeva KB (2006) Overview of genetic-epidemiological studies in ethnically
and demographically diverse isolates of Dagestan, Northern Caucasus, Russia.
Croat Med J 47(4):641-648.
10. Nasidze I, Ling EY, Quinque D, Dupanloup I, Cordaux R, Rychkov S,
Naumova O, Zhukova O, Sarraf-Zadegan N, Naderi GA, Asgary S, Sardas S,
Farhud DD, Sarkisian T, Asadov C, Kerimov A, Stoneking M (2004)
Mitochondrial DNA and Y-chromosome variation in the caucasus. Ann Hum
Genet 68(Pt 3):205-221
11. Barbujani G, Whitehead GN, Bertorelle G, Nasidze IS (1994b) Testing
hypotheses on processes of genetic and linguistic change in the Caucasus. Hum
Biol 66(5):843-864
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
17
12. Kailani W (2002) Chechens in the Middle East: Between Original and Host
Cultures. Caspian Studies Program.
http://belfercenter.ksg.harvard.edu/publication/12785/chechens
13. Dajani R, Khader YS, Fatahallah R, El-Khateeb M, Shiyab AH, Hakooz N
(2012) Diabetes mellitus in genetically isolated populations in Jordan:
prevalence, awareness, glycemic control, and associated factors. J Diabetes
Complications 26(3):175-180.
14. Tayyem RF, Dajani R, Khader Y, Abu-Mweis S, Fatahallah R, and Bawadi H
(2014) Nutrients Intakes among Normal and Type 2 Diabetic Individuals from
Genetically Isolated Populations in Jordan. Ethn Dis 24(2):200-206.
15. Dorrell C1, Schug J, Lin CF, Canaday PS, Fox AJ, Smirnova O, Bonnah R,
Streeter PR, Stoeckert CJ Jr, Kaestner KH, Grompe M (2011) Transcriptomes of
the major human pancreatic cell types. Diabetologia 54(11):2832-2844.
16. Lenoir O, Flosseau K, Ma FX, Blondeau B, Mai A, Bassel-Duby R, Ravassard
P, Olson EN, Haumaitre C, Scharfmann R (2011) Specific control of pancreatic
endocrine β- and δ-cell mass by class IIa histone deacetylases HDAC4, HDAC5,
and HDAC9. Diabetes 60(11):2861-2871.
17. Wang XB, Han YD, Sabina S, Cui NH, Zhang S, Liu ZJ, Li C, Zheng F (2016)
HDAC9 Variant Rs2107595 Modifies Susceptibility to Coronary Artery Disease
and the Severity of Coronary Atherosclerosis in a Chinese Han Population.
PLoS One. doi: 10.1371/journal.pone.0160449.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
18
18. Kenzelmann-Broz D, Tucker RP, Leachman NT, Chiquet-Ehrismann R (2010)
The expression of teneurin-4 in the avian embryo: potential roles in patterning of
the limb and nervous system. Int J Dev Biol 54(10):1509-1516.
19. Suzuki N, Mizuniwa C, Ishii K, Nakagawa Y, Tsuji K, Muneta T, Sekiya I,
Akazawa C (2014a) Teneurin-4, a transmembrane protein, is a novel regulator
that suppresses chondrogenic differentiation. J Orthop Res 32(7):915-922.
20. Kaul N, Ali S (2016) Genes, Genetics, and Environment in Type 2 Diabetes:
Implication in Personalized Medicine. DNA Cell Biol 35(1):1-12.
21. Crispim D, Fagundes NJ, dos Santos KG, Rheinheimer J, Bouças AP, de Souza
BM, Macedo GS, Leiria LB, Gross JL, Canani LH (2010) Polymorphisms of the
UCP2 gene are associated with proliferative diabetic retinopathy in patients with
diabetes mellitus. Clin Endocrinol (Oxf) 72(5):612-619.
22. Qin LJ, Lv Y, Huang QY (2013) Meta-analysis of association of common
variants in the KCNJ11-ABCC8 region with type 2 diabetes. Genet Mol Res
12(3):2990-3002.
23. Haghverdizadeh P, Sadat Haerian M, Haghverdizadeh P, Sadat Haerian B
(2014) ABCC8 genetic variants and risk of diabetes mellitus. Gene 545(2):198-
204.
24. Ren J, Xiang AH, Trigo E, Takayanagi M, Beale E, Lawrence JM, Hartiala J,
Richey JM, Allayee H, Buchanan TA, Watanabe RM (2014) Genetic variation
in MTNR1B is associated with gestational diabetes mellitus and contributes only
to the absolute level of beta cell compensation in Mexican Americans.
Diabetologia 57(7):1391-1399.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
19
25. Rampersaud E, Damcott CM, Fu M, Shen H, McArdle P, Shi X, Shelton J, Yin
J, Chang YP, Ott SH, Zhang L, Zhao Y, Mitchell BD, O'Connell J, Shuldiner
AR (2007) Identification of novel candidate genes for type 2 diabetes from a
genome-wide association scan in the Old Order Amish: evidence for replication
from diabetes-related quantitative traits and from independent populations.
Diabetes 56(12):3053-3062
26. Liu L, Yang X, Wang H, Cui G, Xu Y, Wang P, Yuan G, Wang X, Ding H,
Wang DW (2013) Association between variants of EXT2 and type 2 diabetes: a
replication and meta-analysis. Hum Genet 132(2):139-145.
27. Salopuro T, Pulkkinen L, Lindström J, Kolehmainen M, Tolppanen AM,
Eriksson JG, Valle TT, Aunola S, Ilanne-Parikka P, Keinänen-Kiukaanniemi S,
Tuomilehto J, Laakso M, Uusitupa M (2009) Variation in the UCP2 and UCP3
genes associates with abdominal obesity and serum lipids: the Finnish Diabetes
Prevention Study. BMC Med Genet. doi: 10.1186/1471-2350-10-94.
28. Hsu YH, Niu T, Song Y, Tinker L, Kuller LH, Liu S (2008) Genetic variants in
the UCP2-UCP3 gene cluster and risk of diabetes in the Women's Health
Initiative Observational Study. Diabetes 57(4):1101-1107.
29. Hanson RL, Bogardus C, Duggan D, Kobes S, Knowlton M, Infante AM,
Marovich L, Benitez D, Baier LJ, Knowler WC (2007) A search for variants
associated with young-onset type 2 diabetes in American Indians in a 100K
genotyping array. Diabetes 56(12):3045-3052
30. Scott LJ, Mohlke KL, Bonnycastle LL et al (2007) A genome-wide association
study of type 2 diabetes in Finns detects multiple susceptibility variants. Science
316(5829):1341-1345.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
20
31. Unoki H, Takahashi A, Kawaguchi T et al (2008) SNPs in KCNQ1 are
associated with susceptibility to type 2 diabetes in East Asian and European
populations. Nat Genet 40(9):1098-1102.
32. Yasuda K, Miyake K, Horikawa Y et al (2008) Variants in KCNQ1 are
associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40(9):1092-
1097.
33. Qian Y, Dong M, Lu F, Li H, Jin G, Hu Z, Shen C, Shen H (2015) Joint effect of
CENTD2 and KCNQ1 polymorphisms on the risk of type 2 diabetes mellitus
among Chinese Han population. Mol Cell Endocrinol 407:46-51.
34. Krempler F, Esterbauer H, Weitgasser R, Ebenbichler C, Patsch JR, Miller K,
Xie M, Linnemayr V, Oberkofler H, Patsch W (2002) A functional
polymorphism in the promoter of UCP2 enhances obesity risk but reduces type 2
diabetes risk in obese middle-aged humans. Diabetes 51(11):3331-3335.
35. D'Adamo M1, Perego L, Cardellini M, Marini MA, Frontoni S, Andreozzi F,
Sciacqua A, Lauro D, Sbraccia P, Federici M, Paganelli M, Pontiroli AE, Lauro
R, Perticone F, Folli F, Sesti G (2004) The -866A/A genotype in the promoter of
the human uncoupling protein 2 gene is associated with insulin resistance and
increased risk of type 2 diabetes. Diabetes 53(7):1905-1910.
36. Sasahara M, Nishi M, Kawashima H, Ueda K, Sakagashira S, Furuta H,
Matsumoto E, Hanabusa T, Sasaki H, Nanjo K (2004) Uncoupling protein 2
promoter polymorphism -866G/A affects its expression in beta-cells and
modulates clinical profiles of Japanese type 2 diabetic patients. Diabetes
53(2):482-485
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
21
37. Nielsen T, Sparsø T, Grarup N, Jørgensen T, Pisinger C, Witte DR, Hansen T,
Pedersen O (2011) Type 2 diabetes risk allele near CENTD2 is associated with
decreased glucose-stimulated insulin release. Diabetologia 54(5):1052-1056.
38. Chen L, Lin Z, Jiang M, Lu L, Zhang H, Xin Y, Jiang X, Xuan S (2015) Genetic
Variants in the SAMM50 Gene Create Susceptibility to Nonalcoholic Fatty
Liver Disease in a Chinese Han Population. Hepat Mon. doi:
10.5812/hepatmon.31076.
39. Kelemen LE, Sellers TA, Schildkraut JM, Cunningham JM, Vierkant RA,
Pankratz VS, Fredericksen ZS, Gadre MK, Rider DN, Liebow M, Goode EL
(2008) Genetic variation in the one-carbon transfer pathway and ovarian cancer
risk. Cancer Res 68(7):2498-2506.
40. Wang SS, Gonzalez P, Yu K, Porras C, Li Q, Safaeian M, Rodriguez AC,
Sherman ME, Bratti C, Schiffman M, Wacholder S, Burk RD, Herrero R,
Chanock SJ, Hildesheim A (2010) Common genetic variants and risk for HPV
persistence and progression to cervical cancer. PLoS One. doi:
10.1371/journal.pone.0008667.
41. Kastelic V, Drobnic K (2012) A single-nucleotide polymorphism (SNP)
multiplex system: the association of five SNPs with human eye and hair color in
the Slovenian population and comparison using a Bayesian network and logistic
regression model. Croat Med J 53(5):401-408.
42. Kastelic V, Pośpiech E, Draus-Barini J, Branicki W, Drobnič K (2013)
Prediction of eye color in the Slovenian population using the IrisPlex SNPs.
Croat Med J 54(4):381-6.
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
22
43. Chang TJ, Liu PH, Liang YC, Chang YC, Jiang YD, Li HY, Lo MT, Chen HS,
Chuang LM (2011) Genetic predisposition and nongenetic risk factors of
thiazolidinedione-related edema in patients with type 2 diabetes. Pharmacogenet
Genomics 21(12):829-836.
44. Drouet V, Ruiz M, Zala D, Feyeux M, Auregan G, Cambon, Troquier L,
Carpentier J, Aubert S, Merienne N, Bourgois-Rocha F, Hassig R, Rey M,
Dufour N, Saudou F, Perrier AL, Hantraye P, Déglon N (2014) Allele-specific
silencing of mutant huntingtin in rodent brain and human stem cells. PLoS One.
doi: 10.1371/journal.pone.0099341
45. Fanciulli M, Santulli L, Errichiello L, Barozzi C, Tomasi L, Rigon L, Cubeddu
T, de Falco A, Rampazzo A, Michelucci R, Uzzau S, Striano S, de Falco FA,
Striano P, Nobile C (2012) LGI1 microdeletion in autosomal dominant lateral
temporal epilepsy. Neurology 78(17):1299-1303
46. Eun Pyo Hong and Ji Wan Park Sample Size and Statistical Power Calculation
in Genetic Association Studies Genomics Inf. 2012 Jun; 10(2): 117–122.
47. Marc E Rothenberg et al Common variants at 5q22 associate with pediatric
eosinophilic esophagitis Nature Genetics 42,289–291(2010).
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
23
Figure 1 a QQ-plot of Chechan for train CALs (calorie intake)
Figure 1 b Manhattan Plot of Chechan for trait CALs (calorie intake). SNPs are sorted
by chromosomal location along X axis against their –log10(P-value) shown on Y axis.
The horizontal line is for the suggested significance level p=10-5
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
24
Figure 2 a QQ plot of Circassians for trait CARB (Carbohydrate intake).
Figure 2 b Manhattan Plot of Circassian for trait CARB (Carbohydrate intake). SNPs
are sorted by chromosomal location along X axis against their –log10(P-value) shown
on Y axis. The horizontal line is for the suggested significance level p=10-5
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
25
Figure 3 a QQ plot for trait CALs (Calorie intake) in Circassian population
Figure 3 b Manhattan Plot of Circassian for trait CALs (Calorie intake). SNPs are sorted
by chromosomal location along X axis against their –log10(P-value) shown on Y axis.
The horizontal line is for the suggested significance level p=10-5
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
26
Figure 4 a QQ Plot of Circassian for trait Vitamin B2
Figure 4 b Manhattan Plot of Circassian for trait Vitamin B2. SNPs are sorted by
chromosomal location along X axis against their –log10(P-value) shown on Y axis. The
horizontal line is for the suggested significance level p=10-5
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
27
Figure 5 a QQ Plot of Circassian for trait CAFF
Figure 5 b Manhattan Plot of Circassian for trait CAFF. SNPs are sorted by
chromosomal location along X axis against their –log10(P-value) shown on Y axis. The
horizontal line is for the suggested significance level p=10-5
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
28
Figure 6 A power vs effect size plot with effect size beta ranging from 177 to 1419. The
tool website for GPC used is R package “GeneticsDesign”
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
29
Number Trait ABBREVIATION
1 Protein PROT
2 Carbohydrate CARB
3 Fiber FIBER
4 Lactose LACT
5 Fat FAT
6 Monounsaturated fatty acid MONO
7 Saturated fatty acid SAT
8 Polyunsaturated fatty acid POLY
9 Trans fatty acid TFA
10 Cholesterol CHOL
11 Vitamin A (international unit) AIU
12 Vitamin B1 B1
13 Vitamin B2 Vitamin B2
14 Vitamin B3 B3
15 Vitamin B12 B12
16 Vitamin B6 B6
17 Biotin BIOT
18 Vitamin C VITC
19 Vitamin D DIU
20 Vitamin E EIU
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
30
21 Folic acid FOLAC
22 Vitamin K VITK
23 Pantothenic acid PANTHOTHE
24 Calorie CALC
25 Iron IRON
26 Selenium SELNIUM
27 Caffeine CAFF
28 Omega-3 OMEGA3
29 Omega-6 OMEGA6
30 Zinc ZINC
31 Glucose GLUCOSE
32 Glycosylated hemoglobin HBAIC
33 Cholesterol CHOL
34 High-Density lipoprotein HDL
35 Low-Density lipoprotein LDL
36 Triglyceride TRIG
Table 1 Macro/micronutrient intake and biochemical traits for Chechan and Circassian
populations
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
31
SNP Gene Locus
ID
Chr
omo
som
e #
Biochemi
cal factor
Disease Functional
_class
P-value References
rs21
435
71
SAM
M50
25813 22 CALS Genetic
Variants
in the
SAMM5
0 Gene
Create
Suscepti
bility to
Nonalco
holic
Fatty
Liver
Disease
in a
Chinese
Han
Populati
on
Intron-
variant
0.0000
6646
( Chen et al.
2015)
rs11
887
DNM
T3A
1788 2 B2MG Ovarian
CA
intron-
variant
0.0000
02139
(Kelemen et al.
2008)
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
32
120
rs13
933
50
TYR 7299 11 B2MG Predictio
n of eye
color in
the
Slovenia
n
populatio
n using
the
IrisPlex
SNPs.
Intron-
variant
0.0000
4833
(
Kasteli
c and
Drobni
c 2012;
Kasteli
c et al.
2013)
rs37
846
21
DUT 1854 15 B2MG risk for
HPV
persisten
ce and
progressi
on to
cervical
cancer.
Intron-
variant
0.0000
3019
(Wang et al.
2010)
rs12
904
SLC1
2A1
10537
08066
15 B2MG Genetic
predispo
intron-
variant,upstr
0.0000
6383
(Chang et al.
2011)
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
33
Table 2 List of macro/micronutrient associated loci associated with other diseases from
both Circassian and Chechan populations
216 557 sition
and
nongenet
ic risk
factors
of
thiazolidi
nedione-
related
edema in
patients
with type
2
diabetes.
eam-
variant-2KB
rs22
768
81
HTT 3064 4 CAFF Huntingt
on’s
disease
reference,sy
nonymous-
codon
0.0000
1753
(Drouet et al.
2014)
rs70
990
34
LGI1 10537
84379
211
10 CAFF Lateral
temporal
epilepsy
intron-
variant
0.0000
001995
( Fanciulli et al.
2012)
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
34
Abbreviation list
CAG: Center for Applied Genomics
GWAS: Genome Wide Association Studies
HDL: high density lipoprotein
HSDP: the Heart and Capillary Disease Prevention directorate
KHCIBC: King Hussein Cancer Institute for Biotechnology and Cancer
LDL: low density lipoprotein
SNPs: Single Nucleotide Polymorphisms
T2D: Type two diabetes
ACCEPTED MANUSCRIPT
ACC
EPTE
D M
ANU
SCR
IPT
Dajani
35
Highlights
Title: Association of genetic variants with macro/micronutrient intake in Circassian and
Chechan populations in relation to diabetes”
Type two Diabetes (T2D) is a complex disease including environmental and
genetic factors. Food intake plays a role in the manifestation of T2D.
Macronutrient and micronutrient intake varies among individuals. This variance
can be partly related to genetic components.
This is the first study to study the association between GWAS and food intake in
diabetic patients in the Circassian and Chechan population in Jordan.
We were able to show that within the Circassian population, traits including
carbohydrate, calorie and vitamin B2 intake that were found to correlate with the
occurrence of T2D were also shown to be associated with single nucleotide
polymorphisms (SNPs) located on certain genomic regions.
The association between macronutrient intake and diabetes development can
shed light on causative variants for the pathogenesis of T2D.
ACCEPTED MANUSCRIPT