Palaeolithic Caucasus samples reveal the most important component of West Eurasians


Preprint Paleolithic DNA from the Caucasus reveals core of West Eurasian ancestry, by Lazaridis et al. bioRxiv (2018).

Interesting excerpts:

We analyzed teeth from two individuals recovered from Dzudzuana Cave, Southern Caucasus, from an archaeological layer previously dated to ~27-24kya (…). Both individuals had mitochondrial DNA sequences (U6 and N) that are consistent with deriving from lineages that are rare in the Caucasus or Europe today. The two individuals were genetically similar to each other, consistent with belonging to the same population and we thus analyze them jointly.

(…) our results prove that the European affinity of Neolithic Anatolians does not necessarily reflect any admixture into the Near East from Europe, as an Anatolian Neolithic-like population already existed in parts of the Near East by ~26kya. Furthermore, Dzudzuana shares more alleles with Villabruna-cluster groups than with other ESHG (Extended Data Fig. 5b), suggesting that this European affinity was specifically related to the Villabruna cluster, and indicating that the Villabruna affinity of PGNE populations from Anatolia and the Levant is not the result of a migration into the Near East from Europe. Rather, ancestry deeply related to the Villabruna cluster was present not only in Gravettian and Magdalenian-era Europeans but also in the populations of the Caucasus, by ~26kya. Neolithic Anatolians, while forming a clade with Dzudzuana with respect to ESHG, share more alleles with all other PGNE (Extended Data Fig. 5d), suggesting that PGNE share at least partially common descent to the exclusion of the much older samples from Dzudzuana.
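Allele-sharing comparisons like the ones in this excerpt are usually expressed as f4-statistics. As a rough illustration only, here is a minimal sketch of the standard definition, f4(A, B; C, D) = mean over SNPs of (pA - pB)(pC - pD). The function name and the allele frequencies are made up for the example; this is not the authors' pipeline, which also estimates standard errors with a block jackknife.

```python
def f4(p_a, p_b, p_c, p_d):
    """Average of (pA - pB) * (pC - pD) over SNPs.

    Each argument is a list of allele frequencies (one value per SNP)
    for one population. A value near zero is what one expects if
    (A, B) and (C, D) form clades with respect to each other.
    """
    n = len(p_a)
    if n == 0 or not (len(p_b) == len(p_c) == len(p_d) == n):
        raise ValueError("all populations need the same, non-zero number of SNPs")
    return sum((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


# Toy frequencies for two SNPs (hypothetical, for illustration only).
toy_value = f4([0.1, 0.5], [0.3, 0.1], [0.2, 0.6], [0.4, 0.2])
```

A positive value here would mean that A shares more alleles with C (and B with D) than expected under the clade hypothesis; the sign convention depends on the order of the four populations.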

Ancient West Eurasian population structure. PCA of key ancient West Eurasians, including additional populations (shown with grey shells), in the space of outgroup f4-statistics (Methods).

Our co-modeling of Epipaleolithic Natufians and Ibero-Maurusians from Taforalt confirms that the Taforalt population was mixed, but instead of specifying gene flow from the ancestors of Natufians into the ancestors of Taforalt as originally reported, we infer gene flow in the reverse direction (into Natufians). The Neolithic population from Morocco, closely related to Taforalt, is also consistent with being descended from the source of this gene flow, and appears to have no admixture from the Levantine Neolithic (Supplementary Information section 3). If our model is correct, Epipaleolithic Natufians trace part of their ancestry to North Africa, consistent with morphological and archaeological studies that indicate a spread of morphological features and artifacts from North Africa into the Near East. Such a scenario would also explain the presence of Y-chromosome haplogroup E in the Natufians and Levantine farmers, a common link between the Levant and Africa.

(…) we cannot reject the hypothesis that Dzudzuana and the much later Neolithic Anatolians form a clade with respect to ESHG (P=0.286), consistent with the latter being a population largely descended from Dzudzuana-like pre-Neolithic populations whose geographical extent spanned both Anatolia and the Caucasus. Dzudzuana itself can be modeled as a 2-way mixture of Villabruna-related ancestry and a Basal Eurasian lineage.

In qpAdm modeling, a deeply divergent hunter-gatherer lineage that contributed in relatively unmixed form to the much later hunter-gatherers of the Villabruna cluster is specified as contributing to earlier hunter-gatherer groups (Gravettian Vestonice16: 35.7±11.3% and Magdalenian ElMiron: 60.6±11.3%) and to populations of the Caucasus (Dzudzuana: 72.5±3.7%, virtually identical to that inferred using ADMIXTUREGRAPH). In Europe, descendants of this lineage admixed with pre-existing hunter-gatherers related to Sunghir3 from Russia for the Gravettians and GoyetQ116-1 from Belgium for the Magdalenians, while in the Near East it did so with Basal Eurasians. Later Europeans prior to the arrival of agriculture were the product of re-settlement of this lineage after ~15kya in mainland Europe, while in eastern Europe they admixed with Siberian hunter-gatherers forming the WHG-ANE cline of ancestry [See PCA above]. In the Near East, the Dzudzuana-related population admixed with North African-related ancestry in the Levant and with Siberian hunter-gatherer and eastern non-African-related ancestry in Iran and the Caucasus. Thus, the highly differentiated populations at the dawn of the Neolithic were primarily descended from Villabruna Cluster and Dzudzuana-related ancestors, with varying degrees of additional input related to both North Africa and Ancient North/East Eurasia whose proximate sources may be clarified by future sampling of geographically and temporally intermediate populations.
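To give an intuition for what a two-way mixture proportion means, here is a deliberately simplified sketch. It is not qpAdm: qpAdm works on f4-statistics relative to a set of outgroups, whereas this toy version fits the target's allele frequencies directly as alpha * source1 + (1 - alpha) * source2 by least squares. All names and numbers below are invented for illustration; the 72.5% figure is only borrowed as the "true" value of the toy.

```python
def two_way_mixture(target, source1, source2):
    """Least-squares estimate of alpha in target ~ alpha*s1 + (1-alpha)*s2.

    Closed form: alpha = sum((t - s2) * (s1 - s2)) / sum((s1 - s2) ** 2).
    """
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source1, source2))
    den = sum((a - b) ** 2 for a, b in zip(source1, source2))
    if den == 0:
        raise ValueError("sources are identical; alpha is not identifiable")
    return num / den


# Hypothetical allele frequencies for two source populations.
src1 = [0.9, 0.2, 0.5]
src2 = [0.1, 0.6, 0.3]
# Build a noise-free target that is 72.5% source1 and 27.5% source2.
true_alpha = 0.725
tgt = [true_alpha * a + (1 - true_alpha) * b for a, b in zip(src1, src2)]
estimated_alpha = two_way_mixture(tgt, src1, src2)
```

With real data the target is noisy and drifted, which is why the published estimates come with standard errors (e.g. ±3.7%) rather than a single exact value.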

An admixture graph model of Paleolithic West Eurasians. An automatically generated admixture graph model fits populations (worst Z-score of the difference between estimated and fitted f-statistics is 2.7) or populations (also including South_Africa_HG, worst Z-score is 3.5). This is a simplified model assuming binary admixture events and is not a unique solution (Supplementary Information section 2). Sampled populations are shown with ovals and select labeled internal nodes with rectangles.

Interesting excerpts from the supplementary materials:

From our analysis of Supplementary Information section 3, we showed that these sources are indeed complex, and only one of these (WHG, represented by Villabruna) appears to be a contributor to all the remaining sources. This should not be understood as showing that hunter-gatherers from mainland Europe migrated to the rest of West Eurasia, but rather that the fairly homogeneous post-15kya population of mainland Europe labeled WHG appear to represent a deep strain of ancestry that seems to have contributed to West Eurasians from the Gravettian era down to the Neolithic period.

Villabruna is representative of the WHG group. We also include ElMiron, the best sample from the Magdalenian era as we noticed that within the WHG group there were individuals that could not be modeled as a simple clade with Villabruna but also had some ElMiron-related ancestry. Dzudzuana is representative of the Ice Age Caucasus population, differentiated from Villabruna by Basal Eurasian ancestry. AG3 represents ANE/Upper Paleolithic Siberian ancestry, sampled from the vicinity of Lake Baikal, while Russia_Baikal_EN is related to eastern Eurasians and represents a later layer of ancestry from the same region of Siberia as AG3. Finally, Mbuti are a deeply diverged African population that is used here to represent deep strains of ancestry (including Basal Eurasian) prior to the differentiation between West Eurasians and eastern non-Africans that are otherwise not accounted for by the remaining five sources. Collectively, we refer to this as ‘Basal’ or ‘Deep’ ancestry, which should be understood as referring potentially to both Basal Eurasian and African ancestry.

It has been suggested that there is an Anatolia Neolithic-related affinity in hunter-gatherers from the Iron Gates. Our analysis confirms this by showing that this population has Dzudzuana-related ancestry as do many hunter-gatherer populations from southeastern Europe, eastern Europe and Scandinavia. These populations cannot be modeled as a simple mixture of Villabruna and AG3 but require extra Dzudzuana-related ancestry even in the conservative estimates, with a positive admixture proportion inferred for several more in the speculative ones. Thus, the distinction between European hunter-gatherers and Near Eastern populations may have been gradual in pre-Neolithic times; samples from the Aegean (intermediate between those from the Balkans and Anatolia) may reveal how gradual the transition between Dzudzuana-like Neolithic Anatolians and mostly Villabruna-like hunter-gatherers was in southeastern Europe.

Modified image (cut, with important samples marked). Modeling present-day and ancient West-Eurasians. Mixture proportions computed with qpAdm (Supplementary Information section 4). The proportion of ‘Mbuti’ ancestry represents the total of ‘Deep’ ancestry from lineages that split prior to the split of Ust’Ishim, Tianyuan, and West Eurasians and can include both ‘Basal Eurasian’ and other (e.g., Sub-Saharan African) ancestry. (a) ‘Conservative’ estimates. Each population cannot be modeled with fewer admixture events than shown.

Villabruna: This type of ancestry differentiates between present-day Europeans and non-Europeans within West Eurasia, attaining a maximum of ~20% in the Baltic in accordance with previous observations and with the finding of a later persistence of significant hunter-gatherer ancestry in the region. Its proportion drops to ~0% throughout the Near East. Interestingly, a hint of such ancestry is also inferred in all North African populations west of Libya in the speculative proportions, consistent with an archaeogenetic inference of gene flow from Iberia to North Africa during the Late Neolithic.

ElMiron: This type of ancestry is absent in present-day West Eurasians. This may be because most of the Villabruna-related ancestry in Europeans traces to WHG populations that lacked it (since ElMiron-related ancestry is quite variable within European hunter-gatherers). However, ElMiron ancestry makes up only a minority component of all WHG populations sampled to date and WHG-related ancestry is a minority component of present-day Europeans. Thus, our failure to detect it in present-day people may simply be because there is too little of it to detect with our methods.

Dzudzuana: Our analysis identifies Dzudzuana-related ancestry as the most important component of West Eurasians and the one that is found across West Eurasian-North African populations at ~46-88% levels. Thus, Dzudzuana-related ancestry can be viewed as the common core of the ancestry of West Eurasian-North African populations. Its distribution reaches its minima in northern Europe and appears to be complementary to that of Villabruna, being most strongly represented in North Africa, the Near East (including the Caucasus) and Mediterranean Europe. Our results here are expected from those of Supplementary Information section 3 in which we modeled ancient Near Eastern/North African populations (the principal ancestors of present-day people from the same regions) as deriving much of their ancestry from a Dzudzuana-related source. Migrations from the Near East/Caucasus associated with the spread of the Neolithic, but also the formation of steppe population introduced most of the Dzudzuana-related ancestry present in Europe, although (as we have seen above) some such ancestry was already present in some pre-agricultural hunter-gatherers in Europe.

AG3: Ancestry related to the AG3 sample from Siberia has a northern distribution, being strongly represented in both central-northern Europe and the north Caucasus.

Russia_Baikal_EN: Ancestry related to hunter-gatherers from Lake Baikal in Siberia (postdating AG3) appears to have affected primarily northeastern European populations which have been previously identified as having East Eurasian ancestry; some such ancestry is also identified for a Turkish population from Balıkesir, likely reflecting the Central Asian ancestry of Turkic speakers which has been recently confirmed directly in an Ottoman sample from Anatolia.

So, here we have the explanation for the “bidirectional gene flow between populations ancestral to Southeastern Europeans of the early Holocene and Anatolians of the late glacial or a dispersal of Southeastern Europeans into the Near East” inferred from Anatolian hunter-gatherers.


Evolution of Steppe, Neolithic, and Siberian ancestry in Eurasia (ISBA 8, 19th Sep)


Some information is already available from ISBA 8 (see programme in PDF), thanks to the tweets from Alexander M. Kim.

Official abstracts are listed first (emphasis mine), then reports and images with link to Kim’s tweets.

Turkic and Hunnic expansions

Tracing the origin and expansion of the Turkic and Hunnic confederations, by Flegontov et al.

Turkic-speaking populations, now spread over a vast area in Asia, are highly heterogeneous genetically. The first confederation unequivocally attributed to them was established by the Göktürks in the 6th c. CE. Notwithstanding written resources from neighboring sedentary societies such as Chinese, Persian, Indian and Eastern Roman, earlier history of the Turkic speakers remains debatable, including their potential connections to the Xiongnu and Huns, which dominated the Eurasian steppe in the first half of the 1st millennium CE. To answer these questions, we co-analyzed newly generated human genome-wide data from Central Asia (the 1240K panel), spanning the period from ca. 3000 to 500 YBP, and the data published by de Barros Damgaard et al. (137 ancient human genomes from across the Eurasian steppes, Nature, 2018). Firstly, we generated a PCA projection to understand genetic affinities of ancient individuals with respect to present-day Tungusic, Mongolic, Turkic, Uralic, and Yeniseian-speaking groups. Secondly, we modeled hundreds of present-day and few ancient Turkic individuals using the qpAdm tool, testing various modern/ancient Siberian and ancient West Eurasian proxies for ancestry sources.

A majority of Turkic speakers in Central Asia, Siberia and further to the west share the same ancestry profile, being a mixture of Tungusic or Mongolic speakers and genetically West Eurasian populations of Central Asia in the early 1st millennium CE. The latter are themselves modelled as a mixture of Iron Age nomads (western Scythians or Sarmatians) and ancient Caucasians or Iranian farmers. For some Turkic groups in the Urals and the Altai regions and in the Volga basin, a different admixture model fits the data: the same West Eurasian source + Uralic- or Yeniseian-speaking Siberians. Thus, we have revealed an admixture cline between Scythians and the Iranian farmer genetic cluster, and two further clines connecting the former cline to distinct ancestry sources in Siberia. Interestingly, few Wusun-period individuals harbor substantial Uralic/Yeniseian-related Siberian ancestry, in contrast to preceding Scythians and later Turkic groups characterized by the Tungusic/Mongolic-related ancestry. It remains to be elucidated whether this genetic influx reflects contacts with the Xiongnu confederacy. We are currently assembling a collection of samples across the Eurasian steppe for a detailed genetic investigation of the Hunnic confederacies.

Three distinct East/West Eurasian clines across the continent with some interesting linguistic correlates, as earlier reported by Jeong et al. (2018). Alexander M. Kim.
Very important observation with implication of population turnover is that pre-Turkic Inner Eurasian populations’ Siberian ancestry appears predominantly “Uralic-Yeniseian” in contrast to later dominance of “Tungusic-Mongolic” sort (which does sporadically occur earlier). Alexander M. Kim

New interesting information on the gradual arrival of the “Uralic-Yeniseian” (Siberian) ancestry in eastern Europe with Iranian and Turkic-speaking peoples. We already knew that Siberian ancestry shows no original relationship with Uralic-speaking peoples, so to keep finding groups who expanded this ancestry westwards in North Eurasia should be no surprise for anyone at this point.

Central Asia and Indo-Iranian

The session The Genomic Formation of South and Central Asia, by David Reich, on the recent paper by Narasimhan et al. (2018).

One important upside of dense genomic sampling at single localities – greater visibility of outliers and better constraints on particular incoming ancestries’ arrival times. Gonur Tepe as a great case study of this. Alexander M. Kim
– Tale of three clines, with clear indication that “Indus Periphery” samples drawn from an already-cosmopolitan and heterogeneous world of variable ASI & Iranian ancestry. (I know how some people like to pore over these pictures – so note red dots = just dummy data for illustration.)
– Some more certainty about primary window of steppe ancestry injection into S. Asia: 2000-1500 BC
Alexander M. Kim

British Isles

Ancient DNA and the peopling of the British Isles – pattern and process of the Neolithic transition, by Brace et al.

Over recent years, DNA projects on ancient humans have flourished and large genomic-scale datasets have been generated from across the globe. Here, the focus will be on the British Isles and applying aDNA to address the relative roles of migration, admixture and acculturation, with a specific focus on the transition from a Mesolithic hunter-gatherer society to the Neolithic and farming. Neolithic cultures first appear in Britain ca. 6000 years ago (kBP), a millennium after they appear in adjacent areas of northwestern continental Europe. However, in Britain, at the margins of the expansion the pattern and process of the British Neolithic transition remains unclear. To examine this we present genome-wide data from British Mesolithic and Neolithic individuals spanning the Neolithic transition. These data indicate population continuity through the British Mesolithic but discontinuity after the Neolithic transition, c.6000 BP. These results provide overwhelming support for agriculture being introduced to Britain primarily by incoming continental farmers, with surprisingly little evidence for local admixture. We find genetic affinity between British and Iberian Neolithic populations indicating that British Neolithic people derived much of their ancestry from Anatolian farmers who originally followed the Mediterranean route of dispersal and likely entered Britain from northwestern mainland Europe.

Millennium of lag between farming establishment in NW mainland Europe & British Isles. Only 25 Mesolithic human finds from Britain. Alexander M. Kim.
– Evidently no resurgence of hunter-gatherer ancestry across Neolithic
– Argument for at least two geographically distinct entries of Neolithic farmers
Alexander M. Kim.

MN Atlantic / Megalithic cultures

Genomics of Middle Neolithic farmers at the fringe of Europe, by Sánchez Quinto et al.

Agriculture emerged in the Fertile Crescent around 11,000 years before present (BP) and then spread, reaching central Europe some 7,500 years ago (ya.) and eventually Scandinavia by 6,000 ya. Recent paleogenomic studies have shown that the spread of agriculture from the Fertile Crescent into Europe was due mainly to a demic process. Such event reshaped the genetic makeup of European populations since incoming farmers displaced and admixed with local hunter-gatherers. The Middle Neolithic period in Europe is characterized by such interaction, and this is a time where a resurgence of hunter-gatherer ancestry has been documented. While most research has been focused on the genetic origin and admixture dynamics with hunter-gatherers of farmers from Central Europe, the Iberian Peninsula, and Anatolia, data from farmers at the North-Western edges of Europe remains scarce. Here, we investigate genetic data from the Middle Neolithic from Ireland, Scotland, and Scandinavia and compare it to genomic data from hunter-gatherers, Early and Middle Neolithic farmers across Europe. We note affinities between the British Isles and Iberia, confirming previous reports. However, we add on to this subject by suggesting a regional origin for the Iberian farmers that putatively migrated to the British Isles. Moreover, we note some indications of particular interactions between Middle Neolithic Farmers of the British Isles and Scandinavia. Finally, our data together with that of previous publications allow us to achieve a better understanding of the interactions between farmers and hunter-gatherers at the northwestern fringe of Europe.

– Novel genomic data from 21 individuals from 6 sites.
– “Megalithic” individuals not systematically diff. from geographically proximate “non-megalithic” burials
– Mild evidence for over-representation of males in some British Isles megalithic tombs
– Megalithic tombs in W & N Neolithic Europe may have link to kindred structures
Alexander M. Kim

Central European Bronze Age

Ancient genomes from the Lech Valley, Bavaria, suggest socially stratified households in the European Bronze Age, by Mittnik et al.

Archaeogenetic research has so far focused on supra-regional and long-term genetic developments in Central Europe, especially during the third millennium BC. However, detailed high-resolution studies of population dynamics in a microregional context can provide valuable insights into the social structure of prehistoric societies and the modes of cultural transition.

Here, we present the genomic analysis of 102 individuals from the Lech valley in southern Bavaria, Germany, which offers ideal conditions for such a study. Several burial sites containing rich archaeological material were directly dated to the second half of the 3rd and first half of the 2nd millennium BCE and were associated with the Final Neolithic Bell Beaker Complex and the Early and Middle Bronze Age. Strontium isotope data show that the inhabitants followed a strictly patrilocal residential system. We demonstrate the impact of the population movement that originated in the Pontic-Caspian steppe in the 3rd millennium BCE and subsequent local developments. Utilising relatedness inference methods developed for low-coverage modern DNA we reconstruct farmstead related pedigrees and find a strong association between relatedness and grave goods suggesting that social status is passed down within families. The co-presence of biologically related and unrelated individuals in every farmstead implies a socially stratified complex household in the Central European Bronze Age.

Diminishing steppe ancestry and resurgent Neolithic ancestry over time. Alexander M. Kim

Notice how the arrival of Bell Beakers, obviously derived from Yamna settlers in Hungary, and thus clearly identified as expanding North-West Indo-Europeans all over Europe, marks a decrease in steppe ancestry compared to Corded Ware groups, in a site quite close to the most likely East BBC homeland. Copenhagen’s steppe ancestry = Indo-European going down the toilet, step by step…


Russian Far East populations

Gene geography of the Russian Far East populations – faces, genome-wide profiles, and Y-chromosomes, by Balanovsky et al.

Russian Far East is not only a remote area of Eurasia but also a link of the chain of Pacific coast regions, spanning from East Asia to Americas, and many prehistoric migrations are known along this chain. The Russian Far East is populated by numerous indigenous groups, speaking Tungusic, Turkic, Chukotko-Kamchatka, Eskimo-Aleut, and isolated languages. This linguistic and geographic variation opens question about the patterns of genetic variation in the region, which was significantly undersampled and received minor attention in the genetic literature to date. To fill in this gap we sampled Aleuts, Evenks, Evens, Itelmens, Kamchadals, Koryaks, Nanais, Negidals, Nivkhs, Orochi, Udegeis, Ulchi, and Yakuts. We also collected the demographic information of local populations, took physical anthropological photos, and measured the skin color. The photos resulted in the “synthetic portraits” of many studied groups, visualizing the main features of their faces.


Impressive North Eurasian biobank including 30,500 individual samples with broad consent, some genealogical info, phenotypic data. Alexander M. Kim

Finland AD 5th-18th c.

Sadly, no information will be shared on the session A 1400-year transect of ancient DNA reveals recent genetic changes in the Finnish population, by Salmela et al. We will have to stick to the abstract:

Objectives: Our objective was to use aDNA to study the population history of Finland. For this aim, we sampled and sequenced 35 individuals from ten archaeological sites across southern Finland, representing a time transect from 5th to 18th century.

Methods: Following genomic DNA extraction and preparation of indexed libraries, the samples were enriched for 1.2 million genome-wide SNPs using in-solution capture and sequenced on an Illumina HiSeq 4000 instrument. The sequence data were then compared to other ancient populations as well as modern Finns, their geographical neighbors and worldwide populations. Authenticity testing of the data as well as population history inference were based on standard computational methods for aDNA, such as principal component analysis and F statistics.

Results: Despite the relatively limited temporal depth of our sample set, we are able to see major genetic changes in the area, from the earliest sampled individuals – who closely resemble the present-day Saami population residing markedly further north – to the more recent ancient individuals who show increased affinity to the neighboring Circum-Baltic populations. Furthermore, the transition to the present-day population seems to involve yet another perturbation of the gene pool.

So, most likely then, in my opinion – although possibly Y-DNA will not be reported – Finns were in the Classical Antiquity period mostly R1a with secondary N1c in the Circum-Baltic region (similar to modern Estonians, as I wrote recently), while Saami were probably mostly a mix of R1a-Z282 and I1 in southern Finland. That’s what the first transition after the 5th c. probably reflects, the spread of Finns (with mainly N1c lineages) to the north, while the more recent transition shows probably the introduction of North Germanic ancestry (and thus also R1b-U106, R1a-Z284, and I1 lineages) in the west.

Dairying in ancient Mongolia

The History of Dairying in ancient Mongolia, by Wilkin et al.

The use of mass spectrometry based proteomics presents a novel method for investigating human dietary intake and subsistence strategies from archaeological materials. Studies of ancient proteins extracted from dental calculus, as well as other archaeological material, have robustly identified both animal and plant-based dietary components. Here we present a recent case study using shotgun proteomics to explore the range and diversity of dairying in the ancient eastern Eurasian steppe. Contemporary and prehistoric Mongolian populations are highly mobile and the ephemerality of temporarily occupied sites, combined with the severe wind deflation common across the steppes, means detecting evidence of subsistence can be challenging. To examine the time depth and geographic range of dairy use in Mongolia, proteins were extracted from ancient dental calculus from 32 individuals spanning burial sites across the country between the Neolithic and Mongol Empire. Our results provide direct evidence of early ruminant milk consumption across multiple time periods, as well as a dramatic increase in the consumption of horse milk in the late Bronze Age. These data provide evidence that dairy foods from multiple species were a key part of subsistence strategies in prehistoric Mongolia and add to our understanding of the importance of early pastoralism across the steppe.

The confirmation of the date 3000-2700 BC for dairying in the eastern steppe further supports what was already known thanks to archaeological remains, that the pastoralist subsistence economy was brought for the first time to the Altai region by expanding late Khvalynsk/Repin – Early Yamna pastoralists that gave rise to the Afanasevo culture.

Neolithic transition in Northeast Asia

Genomic insight into the Neolithic transition peopling of Northeast Asia, by C. Ning

East Asia, representing a large geographic region where around one fifth of the world's population lives, has been an interesting place for population genetic studies. In contrast to Western Eurasia, East Asia has so far received little attention, even though agriculture here evolved differently from elsewhere around the globe. To date, only very limited genomic studies from East Asia have been published, and the genetic history of East Asia is still largely unknown. In this study, we shotgun sequenced six hunter-gatherer individuals from the Houtaomuga site in Jilin, Northeast China, dated from 12000 to 2300 BP, and three farming individuals from the Banlashan site in Liaoning, Northeast China, dated around 5300 BP. We find a high level of genetic continuity within the Amur River Basin of northeast Asia as far back as 12000 BP, a region where populations speak Tungusic languages. We also find that, compared with Houtaomuga hunter-gatherers, the Neolithic farming population harbors a larger proportion of ancestry from Houtaomuga-related hunter-gatherers as well as genetic ancestry from central or perhaps southern China. Our finding further suggests that farming technology was probably introduced into Northeast Asia through demic diffusion.

A detail of the reported haplogroups of the Houtaomuga site:


Y-DNA in Northeast Asia shows thus haplogroup N1b1 ~5000 BC, probably representative of the Baikal region, with a change to C2b-448del lineages before the Xiongnu period, which were later expanded by Mongols.

A study of genetic diversity of three isolated populations in Xinjiang using Y-SNP


New open access paper (in Chinese) A study of genetic diversity of three isolated populations in Xinjiang using Y-SNP, by Liu et al. Acta Anthropologica Sinica (2018)


The Keriyan, Lopnur and Dolan peoples are isolated populations with sparse numbers living in the western border desert of our country. By sequencing and typing the complete Y-chromosome of 179 individuals in these three isolated populations, all mutations and SNPs in the Y-chromosome and their corresponding haplotypes were obtained. Types and frequencies of each haplotype were analyzed to investigate genetic diversity and genetic structure in the three isolated populations. The results showed that 12 haplogroups were detected in the Keriyan with high frequencies of the J2a1b1 (25.64%), R1a1a1b2a (20.51%), R2a (17.95%) and R1a1a1b2a2 (15.38%) groups. Sixteen haplogroups were noted in the Lopnur with the following frequencies: J2a1 (43.75%), J2a2 (14.06%), R2 (9.38%) and L1c (7.81%). Forty haplogroups were found in the Dolan, noting the following frequencies: R1b1a1a1 (9.21%), R1a1a1b2a1a (7.89%), R1a1a1b2a2b (6.58%) and C3c1 (6.58%). These data show that these three isolated populations have a closer genetic relationship with the Uygur, Mongolian and Sala peoples. In particular, there are no significant differences in haplotype and frequency between the three isolated populations and Uygur (f=0.833, p=0.367). In addition, the genetic haplotypes and frequencies in the three isolated populations showed marked Eurasian mixing illustrating typical characteristics of Central Asian populations.

Figure 1. The populations distribution map. Left: Uluru. Center: Dali Yabuyi. Right: Kaerqu.

My knowledge of written Chinese is almost zero, so here are some excerpts with the help of Google Translate:

The source of 179 blood samples used in the study is shown in Figure 1. The Keriyan blood samples were collected from Dali Yabuyi Township, Yutian County (39 samples). The blood samples of the Lopnur people were collected from Kaerqu Township, Yuli County (64 cases); the blood samples of the Dolan people were collected from the town of Uluru, Awati County (76).
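As a quick sanity check on these numbers, the three sample sizes add up to the 179 individuals mentioned in the abstract, and the reported percentages translate back into whole numbers of men. The sketch below uses only figures quoted above; the helper name is my own, and rounding to the nearest integer is an assumption about how the percentages were derived.

```python
# Sample sizes and the top reported haplogroup frequencies (%) from the abstract.
sample_sizes = {"Keriyan": 39, "Lopnur": 64, "Dolan": 76}
reported_pct = {
    "Keriyan": {"J2a1b1": 25.64, "R1a1a1b2a": 20.51, "R2a": 17.95, "R1a1a1b2a2": 15.38},
    "Lopnur": {"J2a1": 43.75, "J2a2": 14.06, "R2": 9.38, "L1c": 7.81},
    "Dolan": {"R1b1a1a1": 9.21, "R1a1a1b2a1a": 7.89, "R1a1a1b2a2b": 6.58, "C3c1": 6.58},
}


def implied_counts(pct_by_haplogroup, n):
    """Convert percentages back to individuals, assuming simple rounding."""
    return {hg: round(pct * n / 100) for hg, pct in pct_by_haplogroup.items()}


total_samples = sum(sample_sizes.values())
counts = {pop: implied_counts(reported_pct[pop], sample_sizes[pop]) for pop in sample_sizes}
```

For example, 25.64% of 39 Keriyan samples corresponds to 10 men with J2a1b1, which is a reminder of how small the absolute numbers behind each percentage are.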

Columns one and two are the Keriyan haplotypes and frequencies, respectively; the third and fourth columns are the Lopnur haplotypes and frequencies; the last four columns are the Dolan (Daolang) haplotypes and frequencies.

The composition and frequency of the Keriyan people’s haplogroup are closest to those of the Uighurs, and both Principal Component Analysis and Phylogenetic Tree Analysis show that their kinship is recent. We initially infer that the Keriyan are local desert indigenous people. They have a connection with the source of the Uighurs. Chen et al. [42] studied the patriarchal and maternal genetic analysis of the Keriyan people and found that they are not descendants of the Tibetan ethnic group in the West. The Keriyan people are a mixed group of Eastern and Western Europeans, which may originate from the local Vil group. Duan Ranhui [43] and other studies have shown that the nucleotide variability and average nucleotide differences in the Keriyan population are between the reported Eastern and Western populations. The phylogenetic tree also shows that the populations in Central Asia are between the continental lineage of the eastern population and the European lineage of the western population, and the genetic distance between the Keriyan and the Uighurs is the closest, indicating that they have a close relationship.


Regarding the origin of the Lopnur people, Purzhevski judged that it was a mixture of Mongolians and Aryans according to the physical characteristics of the Lopnur people. In 1934, the Sino-Swiss delegation discovered the famous burials of the ancient tombs in the Peacock River. After research, they were the indigenous people before the Loulan period; the researcher Yang Lan, a researcher at the Institute of Cultural Relics of the Chinese Academy of Social Sciences, said that the Lopnur people were descendants of the ancient “Landan survivors”. However, the Loulan people speaking an Indo-European language, and the Lopnur people speaking Uyghur languages contradict this; the historical materials of the Western Regions, “The Geography of the Western Regions” and “The Western Regions of the Ming Dynasty” record the Uighurs who lived in Cao Cao in the late 17th and early 18th centuries. Because of the occupation of the land by the Junggar nobles and their oppression, they fled. Some of them were forced to move to the Lop Nur area. There are many similar archaeological discoveries and historical records. We have no way to determine their accuracy, but they are at different times, and there is a great difference in what is heard in the same region. (…) The genetic characteristics of modern Lopnur people are the result of the long-term ethnic integration of Uyghurs, Mongols, and Europeans. This is also consistent with the similarity of the genetic structure of the Y chromosome of Lopnur in this study with the Uighurs and Mongolians. For example, the frequency of J haplogroup is as high as 59.37%, while J and its downstream sub-haplogroup are mainly distributed in western Europe, West Asia and Central Asia; the frequency of O, R haplogroup is close to that of Mongolians.

1) KA: Keriya, LB: Rob, DL: Daolang, HTW: Hetian Uygur, HTWZ: and Uygur, TLFW: Turpan Uighur, HZ: Hui, HSKZ: Kazakh, WZBKZ: Wuhuan Others, TJKZ: Tajik, KEKZZ: Kirgiz, TTEZ: Tatar, ELSZ: Russian, XBZ: Xibo, MGZ: Mongolian, SLZ: Salar, XJH: Xinjiang Han, GSH: Gansu Han, GDH: Guangdong Han, SCH: Sichuan Han. 2) Reference population data are taken from references 19-22. Once the population names have been given in this table, all abbreviations in the text refer to it. 3) Because the reference populations are typed to different sub-branch depths, the sub-branches under the same haplogroup are merged when the population haplogroup data are aggregated. For example, within haplogroup G some are resolved to the G1a and G2a levels, others are assigned to G1, G2, and G3, while others can only be assigned to G. Therefore, all of these subgroups are merged into a single group G.
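The merging step described in note 3 can be sketched as follows. This is an illustration only: it assumes ISOGG-style labels whose first letter names the top-level haplogroup, the function names are hypothetical, and the counts are invented rather than taken from the paper.

```python
from collections import Counter


def major_haplogroup(label):
    """Collapse a label such as 'G2a' or 'R1a1a1b2a' to its leading letter."""
    label = label.strip()
    if not label or not label[0].isalpha():
        raise ValueError(f"unexpected haplogroup label: {label!r}")
    return label[0].upper()


def merge_subbranches(counts):
    """Sum the counts of all sub-branches into their top-level haplogroup."""
    merged = Counter()
    for label, n in counts.items():
        merged[major_haplogroup(label)] += n
    return dict(merged)


# Hypothetical counts, for illustration only (not data from the paper).
example = {"G1a": 2, "G2a": 3, "G": 1, "J2a1": 4, "R1a1a1b2a": 5, "R2a": 2}
print(merge_subbranches(example))
```

Merging to the top level makes populations typed at different resolutions comparable, at the cost of hiding exactly the sub-branch distinctions (for example, between R1a and R1b) that matter most for the questions discussed below.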

According to Ming History·Western Biography, the Mongolians originated from the Mobei Plateau and later ruled Asia and Eastern Europe. Mongolia was established, and large areas of southern Xinjiang and Central Asia were included. Later, due to the Mongolian king’s struggle for power, it fell into a long-term conflict. People of the land fled to avoid the war, and the uninhabited plain of the lower reaches of the Yarkant River naturally became a good place to live. People from all over the world gathered together and called themselves “Dura” and changed to “Dang Lang”. The long-term local Uyghur exchanges that entered the southern Mongolian monks and “Dura” were gradually assimilated [44]. According to the report, locals wore Mongolian clothes, especially women who still maintained a Mongolian face [45]. In 1976, the robes and waistbands found in the ancient time of the Daolang people in Awati County were very similar to those of the ancients. Dalang Muqam is an important part of Daolang culture. It is also a part of the Uyghur Twelve Muqam, and it retains the ancient Western culture, but it also contains a larger Mongolian culture and relics. The above historical records show that the Daolang people should appear in the Chagatai Khanate and be formed by the integration of Mongolian and Uighur ethnic groups. Through our research, we also found that the paternal haplotype of the Daolang people is contained in both Uygur and Mongolian, and the main haplogroups are the same, whereas the frequencies are different (see Table 3). The principal component analysis and the NJ analysis are also the same. It is very close to the Uyghur and the Mongolian people, which establishes new evidence for the “mixed theory” in molecular genetics.

Genetic relationship between the three isolated populations: the Uygur and the Mongolian is the closest, and the main haplogroup can more intuitively compare the source composition of the genetic structure of each population. Haplogroups C, D, and O are mainly distributed in Asia as the East Asian characteristic haplogroup; haplogroups G, J, and R are mainly distributed in continental Europe, and the high frequency distribution is in Europe and Central Asia.

If the nomenclature follows a recent ISOGG standard, it appears that:

The presence of exclusively R1a-Z93 subclades and the lack of R1b-M269 samples is compatible with the expansion of R1a-Z93 into the area with Proto-Tocharians, at the turn of the 3rd-2nd millennium BC, as suggested by the Xiaohe samples, supposedly R1a(xZ93).

Now that it is obvious from ancient DNA (as it was clear from linguistics) that Pre-Tocharians separated earlier than other Late PIE peoples, with the expansion of late Khvalynsk/Repin into the Altai, at the end of the 4th millennium, these prevalent R1a (probably Z93) samples may be showing a replacement of Pre-Tocharian Y-DNA with the Andronovo expansion already by 2000 BC.

Lacking proper assessment of ancient DNA from Proto-Tocharians, this potential early Y-DNA replacement is still speculative*. However, if that is the case, I wonder what the Copenhagen group will say when supporting this, but rejecting at the same time the more obvious Y-DNA replacement in East Yamna / Poltavka in the mid-3rd millennium with incoming Corded Ware-related peoples. I guess the invention of an Indo-Tocharian group may be near…

*NOTE. The presence of R1b-M269 among Proto-Tocharians, as well as the presence of R1b-M269 among Tarim Basin peoples in modern and ancient times is not yet fully discarded. The prevalence of R1a-Z93 may also be the sign of a more recent replacement by Iranian peoples, before the Mongolian and Turkic expansions that probably brought R1b(xM269).

Also, the presence of R1b (xM269) samples in east Asia strengthens the hypothesis of a back-migration of R1b-P297 subclades, from Northern Europe to the east, into the Lake Baikal area, during the Early Mesolithic, as found in the Botai samples and later also in Turkic populations – which are the most likely source of these subclades (and probably also of Q1a2 and N1c) in the region.


South-East Asia samples include shared ancestry with Jōmon


New paper (behind paywall) The prehistoric peopling of Southeast Asia, by McColl et al. (Science 2018) 361(6397):88-92 from a recent bioRxiv preprint.

Interesting is this apparently newly reported information including a female sample from the Ikawazu Jōmon of Japan ca. 570 BC (emphasis mine):

The two oldest samples, Hòabìnhians from Pha Faen, Laos [La368; 7950 to 7795 calendar years before the present (cal B.P.)] and Gua Cha, Malaysia (Ma911; 4415 to 4160 cal B.P.), henceforth labeled “group 1,” cluster most closely with present-day Önge from the Andaman Islands and away from other East Asian and Southeast-Asian populations (Fig. 2), a pattern that differentiates them from all other ancient samples. We used ADMIXTURE (14) and fastNGSadmix (15) to model ancient genomes as mixtures of latent ancestry components (11). Group 1 individuals differ from the other Southeast Asian ancient samples in containing components shared with the supposed descendants of the Hòabìnhians: the Önge and the Jehai (Peninsular Malaysia), along with groups from India and Papua New Guinea.

We also find a distinctive relationship between the group 1 samples and the Ikawazu Jōmon of Japan (IK002). Outgroup f3 statistics (11, 16) show that group 1 shares the most genetic drift with all ancient mainland samples and Jōmon (fig. S12 and table S4). All other ancient genomes share more drift with present-day East Asian and Southeast Asian populations than with Jōmon (figs. S13 to S19 and tables S4 to S11). This is apparent in the fastNGSadmix analysis when assuming six ancestral components (K = 6) (fig. S11), where the Jōmon sample contains East Asian components and components found in group 1. To detect populations with genetic affinities to Jōmon, relative to present-day Japanese, we computed D statistics of the form D(Japanese, Jōmon; X, Mbuti), setting X to be different present-day and ancient Southeast Asian individuals (table S22). The strongest signal is seen when X=Ma911 and La368 (group 1 individuals), showing a marginally nonsignificant affinity to Jōmon (11). This signal is not observed with X = Papuans or Önge, suggesting that the Jōmon and Hòabìnhians may share group 1 ancestry (11).
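For readers unfamiliar with D statistics of the form D(P1, P2; P3, P4), here is a minimal sketch of the usual allele-frequency estimator. It is not the paper's pipeline: the function name and the toy frequencies are hypothetical, the sign convention (positive when P1 shares more alleles with P3) is an assumption that varies between tools, and the block-jackknife standard error needed to judge significance is left out.

```python
def d_statistic(p1, p2, p3, p4):
    """D(P1, P2; P3, P4) from per-SNP allele frequencies (equal-length lists).

    Numerator: sum of (p1 - p2) * (p3 - p4) over SNPs.
    Denominator: sum of (p1 + p2 - 2*p1*p2) * (p3 + p4 - 2*p3*p4),
    which scales the statistic to lie between -1 and 1.
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all populations need the same number of SNPs")
    num = 0.0
    den = 0.0
    for a, b, c, d in zip(p1, p2, p3, p4):
        num += (a - b) * (c - d)
        den += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den


# Toy frequencies, for illustration only: P1 matches P3 and P2 matches P4.
toy_p1 = [1.0, 0.0]
toy_p2 = [0.0, 1.0]
toy_p3 = [1.0, 0.0]
toy_p4 = [0.0, 1.0]
print(d_statistic(toy_p1, toy_p2, toy_p3, toy_p4))  # P1 groups with P3
print(d_statistic(toy_p2, toy_p1, toy_p3, toy_p4))  # sign flips when P1 and P2 swap
```

A value near zero means P3 is symmetrically related to P1 and P2. The "marginally nonsignificant" wording in the excerpt refers to how far the statistic lies from zero in standard-error units, which this sketch does not compute.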

Model for plausible migration routes into SEA. This schematic is based on ancestry patterns observed in the ancient genomes. Because we do not have ancient samples to accurately resolve how the ancestors of Jōmon and Japanese populations entered the Japanese archipelago, these migrations are represented by dashed arrows. A mainland component in Indonesia is depicted by the dashed red-green line. Gr, group; Kra, Kradai.

(…) Finally, the Jōmon individual is best-modeled as a mix between a population related to group 1/Önge and a population related to East Asians (Amis), whereas present-day Japanese can be modeled as a mixture of Jōmon and an additional East Asian component (Fig. 3 and fig. S29)

Interesting in relation to the oral communication of the SMBE O-03-OS02 Whole genome analysis of the Jomon remain reveals deep lineage of East Eurasian populations by Gakuhari et al.:

Post late-Paleolithic hunter-gatherers lived throughout the Japanese archipelago, Jomonese, are thought to be a key to understanding the peopling history in East Asia. Here, we report a whole genome sequence (x1.85) of 2,500-year old female excavated from the Ikawazu shell-mound, unearthed typical remains of Jomon culture. The whole genome data places the Jomon as a lineage basal to contemporary and ancient populations of the eastern part of Eurasian continent, and supports the closest relationship with the modern Hokkaido Ainu. The results of ADMIXTURE show the Jomon ancestry is prevalent in present-day Nivkh, Ulchi, and people in the main-island Japan. By including the Jomon genome into phylogenetic trees, ancient lineages of the Kusunda and the Sherpa/Tibetan, early splitting from the rest of East Asian populations, is emerged. Thus, the Jomon genome gives a new insight in East Asian expansion. The Ikawazu shell-mound site is located at 34°38′43″ north latitude and 137°8′52″ east longitude in the central main-island of the Japanese archipelago, corresponding to a warm and humid monsoon region, which has been thought to be almost impossible to maintain sufficient ancient DNA for genome analysis. Our achievement opens up new possibilities for such geographical regions.


Reconstruction of Y-DNA phylogeny helps also reconstruct Tibeto-Burman expansion


New paper (behind paywall) Reconstruction of Y-chromosome phylogeny reveals two neolithic expansions of Tibeto-Burman populations by Wang et al. Mol Genet Genomics (2018).

Interesting excerpts:

Archeological studies suggest that a subgroup of ancient populations of the Miaodigou culture (~ 6300–5500 BP) moved westward to the upper stream region of the Yellow River and created the Majiayao culture (~ 5400–4900 BP) (Liu et al. 2010), which was proposed to be the remains of direct ancestors of Tibeto-Burman populations (Sagart 2008). On the other hand, Han populations, the other major descendant group of the Yang-Shao culture (~ 7000–5500 BP), are composed of many other sub-lineages of Oα-F5 and extremely low frequencies of D-M174 (Additional files 1: Figure S1; Additional files 2: Table S1). Therefore, we propose that Oα-F5 may be one of the dominant paternal lineages in ancient populations of Yang-Shao culture and its successors.

In this study, we demonstrated that both sub-lineages of D-M174 and Oα-F5 are founding paternal lineages of modern Tibeto-Burman populations. The genetic patterns suggested that the ancestor group of modern Tibeto-Burman populations may be an admixture of two distinct ancient populations. One of them may be hunter–gatherer populations who survived on the plateau since the Paleolithic Age, represented by varied sub-lineages of D-M174. The other one was comprised of farmers who migrated from the middle Yellow River basin, represented by sub-lineages of Oα-F5. In general, the genetic evidence in this study supports the conclusion that the appearance of the ancestor group of Tibeto-Burman populations was triggered by the Neolithic expansion from the upper-middle Yellow River basin and admixture with local populations on the Tibetan Plateau (Su et al. 2000).

Simplified phylogenetic tree showing sample locations. The size of the circle for each sampling location corresponds to the number of samples

Two neolithic expansion origins of Tibeto‑Burman populations

We also observed significant differences in the paternal gene pool of different subgroups of Tibeto-Burman populations. Haplogroup D-M174 contributed ~ 54% in a sampling of 2354 Tibetan males throughout the Tibetan Plateau (Qi et al. 2013). Previous studies have also found high frequencies of D-M174 in other populations on the Tibetan Plateau (Shi et al. 2008), including Sherpa (Lu et al. 2016) and Qiang (Wang et al. 2014). In contrast, haplogroup D-M174 is rare or absent from Tibeto-Burman populations from Northeast India and Burma (Shi et al. 2008). In populations of the Ngwi-Burmese language subgroup, the average frequencies of haplogroup D-M174 are ~ 5% (Dong et al. 2004; Peng et al. 2014). Furthermore, we found that lineage Oα1c1b-CTS5308 is mainly found in Tibeto-Burman populations from the Tibetan Plateau. In contrast, lineage Oα1c1a-Z25929 was found in Tibeto-Burman populations from Northeast India, Burma, and the Yunnan and Hunan provinces of China (Additional files 1: Figure S1; Additional files 2: Table S1). In general, enrichment of lineage Oα1c1b-CTS5308 and high frequencies of D-M174 can be found in most Tibeto-Burman populations on the Tibetan Plateau and adjacent regions, whereas Tibeto-Burman populations from other regions tend to have lineage Oα1c1a-Z25929 and a little to no percentage of D-M174.

The inconsistent pattern we observed in the paternal gene pool of modern Tibeto-Burman populations suggested that there may be two distinct ancestor groups (Fig. 3). The proposed migration routes shown in Fig. 3 are somewhat different from those proposed by Su et al. (2000). According to our age estimation, most of the D1a2a-P47 samples belong to sub-lineage PH116, a young lineage that emerged ~ 2500 years ago (95% CI 1915–3188 years). On the other hand, continuous differentiation can be observed on a phylogenetic tree of lineages D1a1a1a1-PH4979 and D1a1a1a2-Z31591 since 6000 years ago. Therefore, we proposed that a group of ancient populations may have moved to the upper basin of the Yellow River and admixed intensively with local populations with high frequencies of haplogroup D-M174, including its sub-lineage D1a2a-P47 (Fig. 3). This ancestor group eventually gave birth to modern Tibeto-Burman populations on the Tibetan Plateau and adjacent regions. The other ancestor group moved toward the southwest and finally reached South East Asia (Burma and other locations) and the northeastern part of India (Fig. 3). This ancestor group may have had no or a minor admixture of D-M174 in their paternal gene pool.

Two proposed ancestor groups and migration routes for Tibeto-Burman populations

Long‑term admixture before expansion to a high‑altitude region

It is interesting to investigate the time gap between the appearance of Neolithic cultures in the northeastern part of the Tibetan Plateau and the final phase of human expansion across the Tibetan Plateau. The Majiayao culture (~ 5400–4900 BP) is the earliest Neolithic culture in the northeastern part of the Tibetan Plateau (Liu et al. 2010). However, previous archeological study has suggested that the final phase of diffusion into the high-altitude area of the Tibetan Plateau occurred at approximately 3.6 kya (Chen et al. 2015). Our genetic evidence in this study is consistent with this scenario based on archeological evidence. Based on Y-chromosome analysis in this study, many unique lineages of Tibeto-Burman populations emerged between 6000 years ago and 2500 years ago (Additional files 3: Table S2). The most recent common age of D1a2-PH116, a sub-lineage that spread throughout the Tibetan Plateau, is only 2500 years ago.

We propose that there may be two important factors for the observed age gap. First, living in a high-altitude environment may require some crucial physical characteristics that were lacking from Neolithic immigrants from the middle Yellow River Basin. Intense genetic admixture with local people who had survived on the Tibetan Plateau since the Paleolithic Age may have actually guaranteed the expansion of humans across the Tibetan Plateau. Therefore, a long period of admixture, lasting from 5.4 to 3.6 kya, may be necessary for the appearance of a population with beneficial genetic variants that was genetically adapted to the high-altitude environment. Second, technological innovations, such as the domestication of wheat and highland barley (Chen et al. 2015), establishment of yak pastoralism (Rhode et al. 2007), and introduction of other culture elements in the Bronze Age (Ma et al. 2016), are also important factors that facilitated permanent settlements with large population sizes in the high-altitude area of the Tibetan Plateau.


Mitogenomes from Thailand offer insights into maternal genetic history of mainland South-East Asia

Open access New insights from Thailand into the maternal genetic history of Mainland Southeast Asia, by Kutanan et al. Eur. J. Hum. Genet. (2018) 26:898–911

Abstract (emphasis mine):

Tai-Kadai (TK) is one of the major language families in Mainland Southeast Asia (MSEA), with a concentration in the area of Thailand and Laos. Our previous study of 1234 mtDNA genome sequences supported a demic diffusion scenario in the spread of TK languages from southern China to Laos as well as northern and northeastern Thailand. Here we add an additional 560 mtDNA genomes from 22 groups, with a focus on the TK-speaking central Thai people and the Sino-Tibetan speaking Karen. We find extensive diversity, including 62 haplogroups not reported previously from this region. Demic diffusion is still a preferable scenario for central Thais, emphasizing the expansion of TK people through MSEA, although there is also some support for gene flow between central Thai and native Austroasiatic speaking Mon and Khmer. We also tested competing models concerning the genetic relationships of groups from the major MSEA languages, and found support for an ancestral relationship of TK and Austronesian-speaking groups.

Map showing sample locations and haplogroup distributions. Blue stars indicate the 22 presently studied populations (Tai-Kadai, Austroasiatic, and Sino-Tibetan groups) while red and green circles represent Tai-Kadai and Austroasiatic populations from the previous study [7]. Population abbreviations are in Supplementary Table S1

Interesting excerpts:

Finally, we used simulations to test hypotheses concerning the genetic relationships of groups belonging to different language families. We found that Starosta’s model [11] provided the best fit to the mtDNA data; however, Sagart’s model [9, 10] was also highly supported. These two models both postulate a close linguistic affinity between TK and AN. Although genetic relatedness between TK and AN groups has been previously studied [7, 46, 47], to our knowledge this is the first study to use demographic simulations to select the best-fitting model. Our results support the genetic relatedness of TK and AN groups, which might reflect a postulated shared ancestry among the proto-Austronesian populations of coastal East Asia [48].

Specifically, the best-fitting model suggests that after separation of the prehistoric TK from AN stocks around 5–6 kya in Southeast China, the TK spread southward throughout MSEA around 1–2 kya by a demic diffusion process, accompanied by population growth but with at most minor admixture with the autochthonous AA groups. Meanwhile, the prehistorical AN ancestors entered Taiwan and dispersed southward throughout ISEA, with these two expansions later meeting in western ISEA. The lack of mtDNA haplogroups associated with the expansion out of Taiwan in our Thai/Lao samples has two possible explanations: either the Out of Taiwan expansion did not reach MSEA (at least, in the area of present-day Thailand and Laos); or, if the prehistoric AN migrated through this area, their mtDNA lineages do not survive in modern Thai/Lao populations. Ancient DNA studies in MSEA would further clarify this issue. Moreover, although mtDNA analyses are informative in elucidating genetic perspectives in geographically and linguistically related populations, they have an obvious limitation in that they only provide insights into the maternal history of populations. Future studies of Y chromosomal and genome-wide data will provide further insights into the genetic history of Thai/Lao populations and the role of factors such as post-marital residence patterns and migration in shaping the genetic structure of the region.

Starosta’s chapter referred to in the paper is Proto-East Asian and the origin and dispersal of the languages of East and Southeast Asia and the Pacific.


Demographic history and genetic adaptation in the Himalayan region

Open access Demographic history and genetic adaptation in the Himalayan region inferred from genome-wide SNP genotypes of 49 populations, by Arciero et al. Mol. Biol. Evol (2018), accepted manuscript (msy094).

Abstract (emphasis mine):

We genotyped 738 individuals belonging to 49 populations from Nepal, Bhutan, North India or Tibet at over 500,000 SNPs, and analysed the genotypes in the context of available worldwide population data in order to investigate the demographic history of the region and the genetic adaptations to the harsh environment. The Himalayan populations resembled other South and East Asians, but in addition displayed their own specific ancestral component and showed strong population structure and genetic drift. We also found evidence for multiple admixture events involving Himalayan populations and South/East Asians between 200 and 2,000 years ago. In comparisons with available ancient genomes, the Himalayans, like other East and South Asian populations, showed similar genetic affinity to Eurasian hunter-gatherers (a 24,000-year-old Upper Palaeolithic Siberian), and the related Bronze Age Yamnaya. The high-altitude Himalayan populations all shared a specific ancestral component, suggesting that genetic adaptation to life at high altitude originated only once in this region and subsequently spread. Combining four approaches to identifying specific positively-selected loci, we confirmed that the strongest signals of high-altitude adaptation were located near the Endothelial PAS domain-containing protein 1 (EPAS1) and Egl-9 Family Hypoxia Inducible Factor 1 (EGLN1) loci, and discovered eight additional robust signals of high-altitude adaptation, five of which have strong biological functional links to such adaptation. In conclusion, the demographic history of Himalayan populations is complex, with strong local differentiation, reflecting both genetic and cultural factors; these populations also display evidence of multiple genetic adaptations to high-altitude environments.

Population samples analysed in this study. A. Map of South and East Asia, highlighting the four regions examined, and the colour assigned to each. B. Samples from the Tibetan Plateau. C. Samples from Nepal. D. Samples from Bhutan and India. The circle areas are proportional to the sample sizes. The three letter population codes in B-D are defined in supplementary table S1.

Relevant excerpts:

Genetic affinity to ancestral populations

We explored the genetic affinity between the Himalayan populations and five ancient genomes using f3-outgroup statistics. Himalayans show greater affinity to Eurasian hunter-gatherers (MA-1, a 24,000-year-old Upper Palaeolithic Siberian), and the related Bronze Age Yamnaya, than to European farmers (5,500-4,800 years ago; Fig. 5A) or to European hunter-gatherers (La Braña, 7,000 years ago; Fig. 5B), like other South and East Asian populations. We further explored the affinity of Himalayan populations by comparing them with the 45,000-year-old Upper Palaeolithic hunter-gatherer (Ust’-Ishim) and each of MA-1, La Braña, or Yamnaya. Himalayan individuals cluster together with other East Asian populations and show equal distance from Ust’-Ishim and the other ancient genomes, probably because Ust’-Ishim belongs to a much earlier period of time (supplementary fig. S15). We also explored genetic affinity between modern Himalayan populations and five ancient Himalayans (3,150–1,250 years old) from Nepal. The ancient individuals cluster together with modern Himalayan populations in a worldwide PCA (supplementary fig. S16), and the f3-outgroup statistics show modern high-altitude populations have the closest affinity with these ancient Himalayans, suggesting that these ancient individuals could represent a proxy for the first populations residing in the region (supplementary fig. S17 and supplementary table S4). Finally, we explored the genetic affinity of Himalayan samples with the archaic genomes of Denisovans and Neanderthals (Skoglund and Jakobsson 2011), and found that they show a similar sharing pattern with Denisovans and Neanderthals to the other South and East Asian populations. Individuals belonging to four Nepalese, one Cambodian, and three Chinese populations show the highest Denisovan sharing (after populations from Australia and Papua New Guinea) but these values are not significantly greater than other South and East Asian populations (supplementary figs. S18 and S19).
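As a rough illustration of what an f3-outgroup statistic measures, the sketch below computes f3(Outgroup; A, B) as the average over SNPs of (o - a)(o - b), where o, a and b are allele frequencies. It is a simplification rather than the authors' method: the function name and the toy frequencies are hypothetical, and the finite-sample bias correction and standard errors used in practice are omitted.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Naive f3(Outgroup; A, B): mean of (o - a) * (o - b) over SNPs.

    Larger values mean A and B share more drift since their split
    from the outgroup, i.e. they are more closely related.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all populations need the same number of SNPs")
    n = len(outgroup)
    if n == 0:
        raise ValueError("no SNPs supplied")
    return sum((o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)) / n


# Toy frequencies, for illustration only.
toy_outgroup = [0.0, 0.0, 0.0, 0.0]
toy_modern = [1.0, 1.0, 0.0, 0.0]
toy_ancient_close = [1.0, 1.0, 0.0, 0.0]    # drifted together with toy_modern
toy_ancient_distant = [0.0, 0.0, 1.0, 1.0]  # drifted on a separate branch

print(outgroup_f3(toy_outgroup, toy_modern, toy_ancient_close))
print(outgroup_f3(toy_outgroup, toy_modern, toy_ancient_distant))
```

Ranking ancient genomes by this value for a fixed modern population is the kind of comparison behind statements such as "greater affinity to MA-1 than to European farmers" in the excerpt.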

Genetic structure of the Himalayan region populations from analyses using unlinked SNPs. A. PCA of the Himalayan and HGDP-CEPH populations. Each dot represents a sample, coded by region as indicated. The Himalayan region samples lie between the HGDP-CEPH East Asian and South Asian samples on the right-hand side of the plot. B. PCA of the Himalayan populations alone. Each dot represents a sample, coded by country or region as indicated. Most samples lie on an arc between Bhutanese and Nepalese samples; Toto (India) are seen as extreme outlier in the bottom left corner, while Dhimal (Nepal) and Bodo (India) also form outliers.

NOTE. The variance explained in the PCA graphics seems to be too high. This happened recently also with the Damgaard et al. (2018) papers (see here the comment by Iosif Lazaridis).

Similarities and differences between high-altitude Himalayan populations

The most striking example is provided by the Toto from North India, an isolated tribal group with the lowest genetic diversity of the Himalayan populations examined here, indicated by the smallest long-term Ne (supplementary fig. S5), and a reported census size of 321 in 1951 (Mitra 1951), although their numbers have subsequently increased. Despite this extreme substructure, shared common ancestry among the high-altitude populations (Fig. 2C and Fig. 3) can be detected, and the Nepalese in general are distinguished from the Bhutanese and Tibetans (Fig. 2C) and they also cluster separately (Fig. 3). In a worldwide context, they share an ancestral component with South Asians (supplementary fig. S2). On the other hand, the Tibetans do not show detectable population substructure, probably due to a much more recent split in comparison with the other populations (Fig. 2C and supplementary fig. S6). The genetic similarity between the high-altitude populations, including Tibetans, Sherpa and Bhutanese, is also supported by their clustering together on the phylogenetic tree, the PCA generated from the co-ancestry matrix generated by fineSTRUCTURE (supplementary fig. S10 and S11), the lack of statistical significance for most of the D-statistics tests (Yoruba, Han; high-altitude Himalayan 1, high-altitude Himalayan 2), and the absence of correlation between the increased genetic affinity to lowland East Asians and the spatial location of the Himalayan populations (supplementary figs. S12 and S13). Together, these results suggest the presence of a single ancestral population carrying advantageous variants for high-altitude adaptation that separated from lowland East Asians, and then spread and diverged into different populations across the Himalayan region. (…)

Recent admixture events

Genetic structure of the Himalayan region populations from analyses using unlinked SNPs. C. ADMIXTURE (K values of 2 to 6, as indicated) analysis of the Himalayan samples. Note that most increases in the value of K result in single population being distinguished. Population codes in C are defined in supplementary table S1.

Himalayan populations show signatures of recent admixture events, mainly with South and East Asian populations as well as within the Himalayan region itself. Newar and Lhasa show the oldest signature of admixture, dated to between 2,000 and 1,000 years ago. Majhi and Dhimal display signatures of admixture within the last 1,000 years. Chetri and Bodo show the most recent admixture events, between 500 and 200 years ago (Fig. 4, supplementary tables S3). The comparison between the genetic tree and the linguistic association of each Himalayan population highlights the agreement between genetic and linguistic sub-divisions, in particular in the Bhutanese and Tibetan populations. Nepalese populations show more variability, with genetic sub-clusters of populations belonging to different linguistic affiliations (Fig. 3B). Modern high-altitude Himalayans show genetic affinity with ancient genomes from the same region (supplementary fig. S17), providing additional support for the idea of an ancient high-altitude population that spread across the Himalayan region and subsequently diverged into several of the present-day populations. Furthermore, Himalayan populations show a similar pattern of allele sharing with Denisovans as other South-East Asian populations (supplementary fig. S18 and S19). Overall, geographical isolation, genetic drift, admixture with neighbouring populations and linguistic subdivision played important roles in shaping the genetic variability we see in the Himalayan region today.


Ancient genomes document multiple waves of migration in south-east Asian prehistory


Open access preprint at bioRxiv Ancient genomes document multiple waves of migration in Southeast Asian prehistory, by Lipson, Cheronet, Mallick, et al. (2018).

Abstract (emphasis mine):

Southeast Asia is home to rich human genetic and linguistic diversity, but the details of past population movements in the region are not well known. Here, we report genome-wide ancient DNA data from thirteen Southeast Asian individuals spanning from the Neolithic period through the Iron Age (4100-1700 years ago). Early agriculturalists from Man Bac in Vietnam possessed a mixture of East Asian (southern Chinese farmer) and deeply diverged eastern Eurasian (hunter-gatherer) ancestry characteristic of Austroasiatic speakers, with similar ancestry as far south as Indonesia providing evidence for an expansive initial spread of Austroasiatic languages. In a striking parallel with Europe, later sites from across the region show closer connections to present-day majority groups, reflecting a second major influx of migrants by the time of the Bronze Age.

Schematics of admixture graph results. (A) Wider phylogenetic context. (B) Details of the Austroasiatic clade. Branch lengths are not to scale, and the order of the two events on the Nicobarese lineage in (B) is not well determined (Supplementary Text).

Featured image, from the article: “Overview of samples. (A) Locations and dates of ancient individuals. Overlapping positions are shifted slightly for visibility. (B) PCA with East and Southeast Asians. We projected the ancient samples onto axes computed using the present-day populations (with the exception of Mlabri, who were projected instead due to their large population-specific drift). Present-day colors indicate language family affiliation: green, Austroasiatic; blue, Austronesian; orange, Hmong-Mien; black, Sino-Tibetan; magenta, Tai-Kadai.”

See also:

Genomics reveals four prehistoric migration waves into South-East Asia

Open access preprint at bioRxiv: Ancient Genomics Reveals Four Prehistoric Migration Waves into Southeast Asia, by McColl, Racimo, Vinner, et al. (2018).

Abstract (emphasis mine):

Two distinct population models have been put forward to explain present-day human diversity in Southeast Asia. The first model proposes long-term continuity (Regional Continuity model) while the other suggests two waves of dispersal (Two Layer model). Here, we use whole-genome capture in combination with shotgun sequencing to generate 25 ancient human genome sequences from mainland and island Southeast Asia, and directly test the two competing hypotheses. We find that early genomes from Hoabinhian hunter-gatherer contexts in Laos and Malaysia have genetic affinities with the Onge hunter-gatherers from the Andaman Islands, while Southeast Asian Neolithic farmers have a distinct East Asian genomic ancestry related to present-day Austroasiatic-speaking populations. We also identify two further migratory events, consistent with the expansion of speakers of Austronesian languages into Island Southeast Asia ca. 4 kya, and the expansion by East Asians into northern Vietnam ca. 2 kya. These findings support the Two Layer model for the early peopling of Southeast Asia and highlight the complexities of dispersal patterns from East Asia.

A model for plausible migration routes into Southeast Asia, based on the ancestry patterns observed in the ancient genomes.


Ancient Di-Qiang people show early links with Han Chinese


Bernard Sécher reports on a recent article, Ancient DNA reveals genetic connections between early Di-Qiang and Han Chinese, by Li et al., BMC Evolutionary Biology (2017).


Ancient Di-Qiang people once resided in the Ganqing region of China, adjacent to the Central Plain area from where Han Chinese originated. While gene flow between the Di-Qiang and Han Chinese has been proposed, there is no evidence to support this view. Here we analyzed the human remains from an early Di-Qiang site (Mogou site dated ~4000 years old) and compared them to other ancient DNA across China, including an early Han-related site (Hengbei site dated ~3000 years old) to establish the underlying genetic relationship between the Di-Qiang and ancestors of Han Chinese.

We found Mogou mtDNA haplogroups were highly diverse, comprising 14 haplogroups: A, B, C, D (D*, D4, D5), F, G, M7, M8, M10, M13, M25, N*, N9a, and Z. In contrast, Mogou males were all Y-DNA haplogroup O3a2/P201; specifically one male was further assigned to O3a2c1a/M117 using targeted unique regions on the non-recombining region of the Y-chromosome. We compared Mogou to 7 other ancient and 38 modern Chinese groups, in a total of 1793 individuals, and found that Mogou shared close genetic distances with Taojiazhai (a more recent Di-Qiang population), Hengbei, and Northern Han. We modeled their interactions using Approximate Bayesian Computation, and support was given to a potential admixture of ~13-18% between the Mogou and Northern Han around 3300–3800 years ago.

Mogou harbors the earliest genetically identifiable Di-Qiang, ancestral to the Taojiazhai, and up to ~33% paternal and ~70% of its maternal haplogroups could be found in present-day Northern Han Chinese.

MDS plot of genetic distance Fst between 3 ancient and 38 modern Chinese groups

These are interesting times for the investigation of potential migrations associated with the expansion of Sino-Tibetan and Altaic languages.