Language and Geography: Intertwining Elements in the Picture of Human Genetic Diversity

By Jillian Claire

Biol 303

November 4, 2011

Looking around any college campus in America today, one can immediately

see a picture of the diversity in not only human ethnicity and genetics, but the

diversity of language as well. This diversity has existed ever since humans first

evolved in Africa (Chiaroni et al. 2009), but diversity concentrated into an area as

small as a college campus is an extremely new phenomenon. For most of human

history, languages and haplogroups, or populations of people that share a genetic

marker due to common ancestry, existed only in very specific geographical areas

and spread through gradual migration (Chiaroni et al. 2009). This phenomenon of

evolution led to the formation of many different populations of humans in the

Americas, each one speaking a unique language. Flora Jay, Olivier Francois, and

Michael G.B. Blum studied the relationship of Native American population structure

and languages in the paper Predictions of Native American Population Structure

Using Linguistic Covariates in a Hidden Regression Framework, published in the

January 2011 volume of the journal PLoS ONE.

Studying the relationship between genetics and languages is not a new

concept, but no previous studies of Native American populations had shown any

significant relationship. So these authors approached the question in a new way.

Instead of using a tree-based test, which compares genetic distances to a language

tree, they used measures of linguistic distance derived from structural features of

the language (Jay et al. 2011). A language tree is a hierarchical classification of related

languages, but in the case of Native American languages a consensus on a language

tree among linguists has not been reached (Jay et al. 2011). The authors used three

levels of linguistic differentiation: the stock level of 8 groups, the group level of 14,

and the family level of 16 (Jay et al. 2011).

The aim of their study was to determine to what extent geographic and

linguistic origin can explain an individual’s membership in a genetic cluster and to

find out if languages contribute to a better prediction of cluster membership than

geography alone (Jay et al. 2011). From the start of the study it was evident that

geography alone provides a very good prediction of cluster membership. Figure 1

below shows a map of the Americas and the genetic clusters studied. The colored

areas show where the predicted membership coefficient is greater than 0.5, or

where a correct placement was not due to chance. It is clear that most of the

populations fall within the area of prediction (Jay et al. 2011).

Figure 1. (Jay et al. 2011)

How are these genetic clusters defined, and why does geography have so

much to do with genetic differentiation? Many studies have been conducted on this

topic. Andrea Manica, Franck Prugnolle, and Francois Balloux (2005) studied the

effects of geography and ethnicity on human genetic diversity. They state that

humans cluster into five or six broad ethnic groups that generally correspond to

continents. However, this is not a fine enough classification to predict genetic

diversity (Manica et al. 2005). In their study they found that genetic differentiation

is essentially dependent on geographic isolation, and geography is always a far

better predictor for the proportion of shared variants between two populations than

ethnicity. The proportion of shared variants is simply the number of alleles that are

shared between two populations divided by the number of loci typed (Manica et al.

2005). Most significantly, they discovered a correlation of 93% between genetic

diversity and distance from East Africa along landmasses. This strongly supports

the claim that geography can predict genetic makeup (Manica et al. 2005).
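
As an illustration of the proportion of shared variants described above, the following is a minimal Python sketch that reads the definition literally (shared alleles divided by loci typed). The function name, the dictionary layout, and the toy genotypes are assumptions made for this example and are not taken from Manica et al. (2005), whose exact computation may differ.

```python
def proportion_shared_variants(pop_a, pop_b):
    """Shared alleles between two populations divided by the number of loci typed.

    pop_a, pop_b: dict mapping a locus name to the set of alleles observed
    at that locus in the population (hypothetical layout for this sketch).
    """
    # Only loci typed in both populations can be compared.
    loci = set(pop_a) & set(pop_b)
    if not loci:
        raise ValueError("no loci typed in both populations")
    # Count alleles present in both populations, locus by locus.
    shared = sum(len(pop_a[locus] & pop_b[locus]) for locus in loci)
    return shared / len(loci)


# Toy data, invented for illustration only.
pop_a = {"locus1": {"A1", "A2", "A3"}, "locus2": {"B1"}}
pop_b = {"locus1": {"A2", "A3", "A4"}, "locus2": {"B2"}}
psv = proportion_shared_variants(pop_a, pop_b)
```

Under this literal reading, two populations that share many alleles per locus can score above 1, so the value is best compared between population pairs rather than read as a percentage.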

Another study of genetic diversity analyzing Y-chromosome diversity and

human expansion with relation to cultural evolution, conducted by Jacques Chiaroni,

Peter Underhill, and Luca Cavalli-Sforza (2009), corroborates the claim of Manica et

al. that there is a strong relationship between genetic diversity and distance from

East Africa. The observed decay of genetic diversity with distance from East Africa,

a consequence of the “Out of Africa” expansion, is hypothesized to

be the result of a serial founder effect (Chiaroni et al. 2009). Very interestingly, a

study by Quentin Atkinson found that phonemic diversity also experiences a linear

fall with distance from Africa (Atkinson 2011). A phoneme is the smallest segmental

sound used to form words. Atkinson’s study illustrated this decline in phonemic

diversity very clearly. Khoisan language families, which are found in East Africa,

contain about 100 phonemes, including clicks. However, Polynesian languages, the

furthest from Africa, contain only 13. As a comparison, English contains about 45

(Atkinson 2011).
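
The serial founder effect mentioned above can be sketched with a textbook population-genetics expectation: a population that passes through one generation of N diploid founders retains, on average, a fraction 1 - 1/(2N) of its heterozygosity. The short Python example below applies that loss repeatedly, once per founding event. The starting heterozygosity, founder size, and number of events are hypothetical values chosen for illustration, and this is not the analysis performed by Chiaroni et al. (2009) or Atkinson (2011).

```python
def serial_founder_heterozygosity(h0, founders, n_events):
    """Expected heterozygosity after each of n_events successive founding events.

    Each event is modeled as one generation at diploid size `founders`,
    which retains a fraction (1 - 1/(2 * founders)) of heterozygosity.
    """
    if founders <= 0:
        raise ValueError("founders must be positive")
    values = [h0]
    for _ in range(n_events):
        # Diversity lost at each step is never regained in this simple model.
        values.append(values[-1] * (1 - 1 / (2 * founders)))
    return values


# Hypothetical parameters: 10 founding events of 20 individuals each.
trajectory = serial_founder_heterozygosity(h0=0.8, founders=20, n_events=10)
```

The decline in this model is geometric, but when the loss per event is small it looks close to a straight line over a modest number of events, which is consistent with the roughly linear decay with distance described in the text.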

Below is a phylogenetic tree, which describes the relationships of human Y

chromosome haplogroups. As further evidence for the Out of Africa expansion,

the tree shows haplogroups A and B to be the oldest groups, and they are

confined to the African continent (Chiaroni et al. 2009).

Figure 2. (Chiaroni et al. 2009)

Chiaroni proposes that if human migrations were random, the geographic

distribution of people with a specific haplogroup would follow a normal distribution

around the point of origin of the mutation that defines the haplogroup, with various

irregularities due to geographic obstacles (Chiaroni et al. 2009). The following

figure is a series of maps, which show the distribution of haplogroups from Figure 2.

The concentration of color, which illustrates spatial distribution, shows that the

populations carrying each haplogroup did most likely migrate slowly and

homogeneously from their place of origin (Chiaroni et al. 2009).

Figure 3. (Chiaroni et al. 2009)

How does this expansion relate to cultural evolution, which includes

language? Chiaroni proposes that as humans migrated, culture developed and

became extremely localized, and eventually cultural evolution became so effective at

meeting biological needs that natural selection essentially ceased to have an effect.

Therefore, when humans migrated, their culture had to adapt to their new

environment, but the elimination of natural selection meant that a haplogroup

would not die out in any area (Chiaroni et al. 2009).

It is clear that geography has a very significant effect on human genetics, but

what about language? What are the results of Jay’s study? Jay et al. (2011) set out to

determine if geographic and linguistic variables improve the estimation of Native

American genetic cluster membership. They found that geography alone is a very

good predictor of genetic cluster membership, with a correlation coefficient of 0.81.

Based on the two studies just discussed, this result is easily validated. However,

when the linguistic variable was added the correlation improved to 0.98 for the

finest linguistic classification, the family level (Jay et al. 2011).
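
To make the comparison of the two correlation values concrete, here is a minimal Python sketch of how a correlation between estimated and predicted cluster membership coefficients could be computed. It assumes a Pearson correlation, which the text above does not specify, and the membership values are invented for illustration; they are not data from Jay et al. (2011).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the centered values.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical membership coefficients for five individuals in one cluster.
estimated = [0.90, 0.10, 0.75, 0.30, 0.55]   # from genetic data
predicted = [0.80, 0.20, 0.60, 0.40, 0.50]   # from geography (and language)
r = pearson_r(estimated, predicted)
```

A value of r closer to 1 means the predictions track the genetically estimated memberships more closely, which is the sense in which adding the linguistic variable improved the fit.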

Figure 4. (Jay et al. 2011)

Figure 4 shows the improvement in correlation when both geographical and

linguistic variables are considered. The correlation between language and

prediction of genetic cluster is very good, but it’s not perfect. Figure 5A below shows

a comparison of genetic cluster membership (estimation) and the prediction of

membership based on geography and geography plus language. 5B-D show the

breakdown of language classification used in the study.

Figure 5. (Jay et al. 2011)

The red boxes show where there is a significant difference between the

prediction based on geography and the prediction based on geography plus

language (Jay et al. 2011). However, the authors state that even the prediction with

language is not a perfect predictor. When historical expansion of populations

involved language replacement, as with the Tupi expansion, the lines between

genetics and language become blurred. Today the Tupi family contains

approximately 41 languages but many, many more populations of very small size

(Jay et al. 2011).

The study of Native American genetics was conducted using autosomal

microsatellite loci. The authors compiled their data set of DNA from 512 individuals of 28

different populations, obtained from the Human Genome Diversity Panel. The

individuals were genotyped at 678 microsatellite loci. The large sample size as well

as the large number of loci typed ensured reliable results (Jay et al. 2011). This was

determined by first using simulated data: with only 100 loci, and with both geographical

and linguistic variables considered, the method misclassified 0% of individuals, as

seen in Figure 6 below.

Figure 6. (Jay et al. 2011)

The results of Jay’s study do not only hold true for Native American

populations. Rather, the same conclusions can be found in populations all over the

world. One such study, Parallel Evolution of Genes and Languages in the Caucasus

Region, found the same results among populations in the Caucasus region between

the Black Sea and the Caspian Sea. While this area covers much less land than the

Americas, it experienced similar diversification because of the mountainous terrain

(Balanovsky et al. 2011). This study was conducted using Y-chromosomal variation,

rather than autosomal microsatellite loci. Y-chromosomal variation is ideal for

population and evolution studies because it passes directly from father to son

without recombination with the mother’s DNA.

Unlike the Native American study, Balanovsky et al. (2011) did use tree-based

tests of haplogroup frequency and linguistic variation. Just from a brief

examination of the trees in Figure 7 it is evident that genetic clusters and language

groups do mirror each other.

Figure 7. (Balanovsky et al. 2011)

This study found that the Caucasus region contains four major haplogroups,

separated by distinct boundaries as shown in Figure 8. These boundaries also

coincide with four major linguistic groups of the Caucasus (Balanovsky et al. 2011).

Figure 8. (Balanovsky et al. 2011)

This study of linguistics and genetics in the Caucasus region reaches the

same conclusion as the study of Native American languages, that linguistic diversity

is as important as geography in shaping genetic diversity. The authors believe that

the genetic structure of the Caucasus may have evolved in a parallel process with

the diversification of Caucasus languages (Balanovsky et al. 2011). The authors

postulate that language and geography are probably so closely linked because of the

mountainous nature of the Caucasus region, and language likely had a larger

influence on genetic drift because of marriage and individual migrations linking

similar populations (Balanovsky et al. 2011).

These two studies were conducted very differently and on completely

different populations, and yet they yielded the same results. Language, when

coupled with geography, is an extremely accurate predictor of genetic cluster

membership. This is because the genetic patterns in human populations reflect very

ancient demographic events (Jay et al. 2011). Both papers suggest that cultural

traits contribute to gene flow between groups, because individuals are more likely

to move between groups if they share aspects of culture, like language. Individuals

would also generally prefer to mate with someone of the same language group (Jay

et al. 2011; Balanovsky et al. 2011).

These are very conclusive findings, but is there a practical application? It is

well known that there is variation between populations with respect to

susceptibility for certain diseases (Manica et al. 2005). There has been much focus

on ethnic-specific drug processing, but these papers suggest that perhaps

geographic origin or linguistic group of the individual would be a better basis for

drug tailoring (Jay et al. 2011). The depth of relationship between genetics and

medicine is only just beginning to be understood, but these studies show that there

is a broader picture to be considered, one that could potentially lead to great strides

in personalized medicine.

Works Cited

Atkinson, Q. Phonemic diversity supports a serial founder effect model of language

expansion from Africa. Science. 332, 346-349 (2011).

Balanovsky, O., Dibirova, K., Dybo, A. et al. Parallel evolution of genes and languages

in the Caucasus region. Molecular Biology and Evolution. 28, 2905-2918

(2011).

Chiaroni, J., Underhill, P., and Cavalli-Sforza, L. Y chromosome diversity, human

expansion, drift, and cultural evolution. PNAS. 106, 20174-20179 (2009).

Jay, F., Francois, O., and Blum, M. Predictions of Native American population

structure using linguistic covariates in a hidden regression framework. PLoS

ONE. 6, 1-11 (2011).

Manica, A., Prugnolle, F., Balloux, F. Geography is a better determinant of human

genetic differentiation than ethnicity. Human Genetics. 118, 366-371 (2005).