Yamnaya ancestry: mapping the Proto-Indo-European expansions

steppe-ancestry-expansion-europe

The latest papers from Ning et al. Cell (2019) and Anthony JIES (2019) have offered some interesting new data, supporting once more what could be inferred since 2015, and what was evident in population genomics since 2017: that Proto-Indo-Europeans expanded under R1b bottlenecks, and that the so-called “Steppe ancestry” referred to two different components, one – Yamnaya or Steppe_EMBA ancestry – expanding with Proto-Indo-Europeans, and the other one – Corded Ware or Steppe_MLBA ancestry – expanding with Uralic speakers.

The following maps are based on formal stats published in the papers and supplementary materials from 2015 until today, mainly on Wang et al. (2018 & 2019), Mathieson et al. (2018) and Olalde et al. (2018), and others like Lazaridis et al. (2016), Lazaridis et al. (2017), Mittnik et al. (2018), Lamnidis et al. (2018), Fernandes et al. (2018), Jeong et al. (2019), Olalde et al. (2019), etc.

NOTE. As in the Corded Ware ancestry maps, the selected reports in this case are centered on the prototypical Yamnaya ancestry vs. other simplified components, so everything else refers to simplistic ancestral components widespread across populations that do not necessarily share any recent connection, much less a language. In fact, most of the time they clearly didn’t. They can be interpreted as “EHG that is not part of the Yamnaya component”, or “CHG that is not part of the Yamnaya component”. They can’t be read as “expanding EHG people/language” or “expanding CHG people/language”, at least no more than maps of “Steppe ancestry” can be read as “expanding Steppe people/language”. Also, remember that I have left the default behaviour for color classification, so that the highest value (i.e. 1, or white colour) could mean anything from 10% to 100% depending on the specific ancestry and period; that’s what the legend is for… But, as Caesar wrote, men readily believe what they want to believe (fere libenter homines id quod volunt credunt).
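To make the point about the legend concrete, here is a minimal Python sketch of what a per-map colour stretch does. It assumes the default classification is a plain min-max stretch over each map's own values, which may not be exactly what the mapping software does. The function name and the sample proportions are hypothetical, made up for illustration and not taken from any of the maps.

```python
def stretch_to_unit(values):
    """Rescale proportions so that each map's own maximum becomes 1 (white).

    Assumption: a simple min-max stretch; the real default classification
    of the mapping software may differ.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat map carries no contrast; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical ancestry proportions for two different maps.
map_a = [0.00, 0.05, 0.10]  # the component peaks at 10%
map_b = [0.00, 0.50, 1.00]  # the component peaks at 100%

print(stretch_to_unit(map_a))
print(stretch_to_unit(map_b))
```

Both maps end up with the same brightest colour at their respective maxima, so only the legend tells you whether white stands for 10% or 100%.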

Sections:

  1. Neolithic or the formation of Early Indo-European
  2. Eneolithic or the expansion of Middle Proto-Indo-European
  3. Chalcolithic / Early Bronze Age or the expansion of Late Proto-Indo-European
  4. European Early Bronze Age and MLBA or the expansion of Late PIE dialects

1. Neolithic

Anthony (2019) agrees with the most likely explanation of the CHG component found in Yamnaya, as derived from steppe hunter-fishers close to the lower Volga basin. The ultimate origin of this specific CHG-like component that eventually formed part of the Pre-Yamnaya ancestry is not clear, though:

The hunter-fisher camps that first appeared on the lower Volga around 6200 BC could represent the migration northward of un-admixed CHG hunter-fishers from the steppe parts of the southeastern Caucasus, a speculation that awaits confirmation from aDNA.

neolithic-chg-ancestry
Natural neighbor interpolation of CHG ancestry among Neolithic populations. See full map.

The typical EHG component that eventually formed part of Pre-Yamnaya ancestry came from the Middle Volga Basin, most likely close to the Samara region, as shown by the sampled Samara hunter-gatherer (ca. 5600-5500 BC):

After 5000 BC domesticated animals appeared in these same sites in the lower Volga, and in new ones, and in grave sacrifices at Khvalynsk and Ekaterinovka. CHG genes and domesticated animals flowed north up the Volga, and EHG genes flowed south into the North Caucasus steppes, and the two components became admixed.

neolithic-ehg-ancestry
Natural neighbor interpolation of EHG ancestry among Neolithic populations. See full map.

To the west, in the Dnieper-Dniester area, WHG became the dominant ancestry after the Mesolithic, at the expense of EHG, revealing a likely mating network reaching to the north into the Baltic:

Like the Mesolithic and Neolithic populations here, the Eneolithic populations of Dnieper-Donets II type seem to have limited their mating network to the rich, strategic region they occupied, centered on the Rapids. The absence of CHG shows that they did not mate frequently if at all with the people of the Volga steppes (…)

neolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Neolithic populations. See full map.

North-West Anatolia Neolithic ancestry, typical of expanding Early European farmers, is found up to the Dniester border, as Anthony (2007) had predicted.

neolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Neolithic populations. See full map.

2. Eneolithic

From Anthony (2019):

After approximately 4500 BC the Khvalynsk archaeological culture united the lower and middle Volga archaeological sites into one variable archaeological culture that kept domesticated sheep, goats, and cattle (and possibly horses). In my estimation, Khvalynsk might represent the oldest phase of PIE.

(…) this middle Volga mating network extended down to the North Caucasian steppes, where at cemeteries such as Progress-2 and Vonyuchka, dated 4300 BC, the same Khvalynsk-type ancestry appeared, an admixture of CHG and EHG with no Anatolian Farmer ancestry, with steppe-derived Y-chromosome haplogroup R1b. These three individuals in the North Caucasus steppes had higher proportions of CHG, overlapping Yamnaya. Without any doubt, a CHG population that was not admixed with Anatolian Farmers mated with EHG populations in the Volga steppes and in the North Caucasus steppes before 4500 BC. We can refer to this admixture as pre-Yamnaya, because it makes the best currently known genetic ancestor for EHG/CHG R1b Yamnaya genomes.

From Wang et al (2019):

Three individuals from the sites of Progress 2 and Vonyuchka 1 in the North Caucasus piedmont steppe (‘Eneolithic steppe’), which harbour EHG and CHG related ancestry, are genetically very similar to Eneolithic individuals from Khvalynsk II and the Samara region. This extends the cline of dilution of EHG ancestry via CHG-related ancestry to sites immediately north of the Caucasus foothills

eneolithic-pre-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Eneolithic populations. See full map. This map corresponds roughly to the map of Khvalynsk-Novodanilovka expansion, and in particular to the expansion of horse-head pommel-scepters (read more about Khvalynsk, and specifically about horse symbolism).

NOTE. Unpublished samples from Ekaterinovka have been previously reported as within the R1b-L23 tree. Interestingly, although the Varna outlier is a female, the Balkan outlier from Smyadovo shows two positive SNP calls for hg. R1b-M269. However, its poor coverage makes its most conservative haplogroup prediction R-M343.

The formation of this Pre-Yamnaya ancestry sets this Volga-Caucasus Khvalynsk community apart from the rest of the EHG-like population of eastern Europe.

eneolithic-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Eneolithic populations. See full map.

Anthony (2019) seems to rely on ADMIXTURE graphics when he writes that the late Sredni Stog sample from Alexandria shows “80% Khvalynsk-type steppe ancestry (CHG&EHG)”. While this seems the most logical conclusion of what might have happened after the Suvorovo-Novodanilovka expansion through the North Pontic steppes (see my post on “Steppe ancestry” step by step), formal stats have not confirmed that.

In fact, analyses published in Wang et al. (2019) rejected that Corded Ware groups are derived from this Pre-Yamnaya ancestry, a reality that had already been hinted at in Narasimhan et al. (2018), when Steppe_EMBA showed a poor fit for expanding Srubna-Andronovo populations. Hence the need to consider the whole CHG component of the North Pontic area separately:

eneolithic-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Eneolithic populations. See full map. You can read more about population movements in the late Sredni Stog and closer to the Proto-Corded Ware period.

NOTE. Fits for WHG + CHG + EHG in Neolithic and Eneolithic populations are taken in part from Mathieson et al. (2018) supplementary materials (download Excel here). Unfortunately, while data on the Ukraine_Eneolithic outlier from Alexandria abounds, I don’t have specific data on the so-called ‘outlier’ from Dereivka compared to the other two analyzed together, so these maps of CHG and EHG expansion possibly show a more limited distribution to the west than the real one ca. 4000-3500 BC.

eneolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Eneolithic populations. See full map.

Anatolia Neolithic ancestry clearly spread to the east into the north Pontic area through a Middle Eneolithic mating network, most likely opened after the Khvalynsk expansion:

eneolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Eneolithic populations. See full map.
eneolithic-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Eneolithic populations. See full map.

Regarding Y-chromosome haplogroups, Anthony (2019) insists on the evident association of Khvalynsk, Yamnaya, and the spread of Pre-Yamnaya and Yamnaya ancestry with the expansion of elite R1b-L754 (and some I2a2) individuals:

eneolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Early Eneolithic in the Pontic-Caspian steppes. See full map, and see culture, ADMIXTURE, Y-DNA, and mtDNA maps of the Early Eneolithic and Late Eneolithic.

3. Early Bronze Age

Data from Wang et al. (2019) show that Corded Ware-derived populations do not have good fits for Eneolithic_Steppe-like ancestry, no matter the model. In other words: Corded Ware populations show not only a higher contribution of Anatolia Neolithic ancestry (ca. 20-30% compared to the ca. 2-10% of Yamnaya); they show a different EHG + CHG combination compared to the Pre-Yamnaya one.

eneolithic-steppe-best-fits
Supplementary Table 13. P values of rank=2 and admixture proportions in modelling Steppe ancestry populations as a three-way admixture of Eneolithic steppe, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Test, Eneolithic_steppe, Anatolian_Neolithic, WHG.
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Yamnaya Kalmykia and Afanasevo show the closest fits to the Eneolithic population of the North Caucasian steppes, thus rejecting sizeable contributions from Anatolia Neolithic and/or WHG, as shown by the SD values. Both therefore probably show a Pre-Yamnaya ancestry closest to the late Repin population.

wang-eneolithic-steppe-caucasus-yamnaya
Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional AF ancestry in Steppe groups and additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups. See tables above. Modified from Wang et al. (2019). Within a blue square, Yamnaya-related groups; within a cyan square, Corded Ware-related groups. Green background behind best p-values. In red circle, SD of AF/WHG ancestry contribution in Afanasevo and Yamnaya Kalmykia, with ranges that almost include 0%.

EBA maps include data from Wang et al. (2018) supplementary materials, specifically unpublished Yamnaya samples from Hungary that appeared in the analyses of the preprint, but which were taken out of the definitive paper. Their location among Yamnaya settlers from Hungary is speculative, although most uncovered kurgans in Hungary are concentrated in the Tisza-Danube interfluve.

eba-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Early Bronze Age populations. See full map. This map corresponds roughly with the known expansion of late Repin/Yamnaya settlers.

The Y-chromosome bottleneck of elite males from Proto-Indo-European clans under R1b-L754 and some I2a2 subclades, already visible in the Khvalynsk sampling, became even more noticeable in the subsequent expansion of late Repin/early Yamnaya elites under R1b-L23 and I2a-L699:

chalcolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Yamnaya expansion. See full map and maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Chalcolithic and Yamnaya Hungary.

Maps of CHG, EHG, Anatolia Neolithic, and probably WHG show the expansion of these components among Corded Ware-related groups in North Eurasia, apart from other cultures close to the Caucasus:

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you can read the post Corded Ware ancestry in North Eurasia and the Uralic expansion.

eba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Early Bronze Age populations. See full map.
eba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Early Bronze Age populations. See full map.
eba-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Early Bronze Age populations. See full map.
eba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Early Bronze Age populations. See full map.
eba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Early Bronze Age populations. See full map.

4. Middle to Late Bronze Age

The following maps show the most likely distribution of Yamnaya ancestry during the Bell Beaker-, Balkan-, and Sintashta-Potapovka-related expansions.

4.1. Bell Beakers

The amount of Yamnaya ancestry is probably overestimated among populations where Bell Beakers replaced Corded Ware. A map of Yamnaya ancestry among Bell Beakers gets trickier for the following reasons:

  • Expanding Repin peoples of Pre-Yamnaya ancestry must have had admixture through exogamy with late Sredni Stog/Proto-Corded Ware peoples during their expansion into the North Pontic area, and Sredni Stog in turn probably had some Pre-Yamnaya admixture, too (although they don’t appear in the simplistic formal stats above). This is supported by the increase of Anatolia farmer ancestry in more western Yamnaya samples.
  • Later, Yamnaya admixed through exogamy with Corded Ware-like populations in Central Europe during their expansion. Even samples from the Middle to Upper Danube and around the Lower Rhine will probably show increasing contributions of Steppe_MLBA, at the same time as they show an increasing proportion of EEF-related ancestry.
  • To complicate things further, the late Corded Ware Esperstedt family (from ca. 2500 BC or later) shows, in turn, what seems like a recent admixture with Yamnaya vanguard groups, with the sample of highest Yamnaya ancestry being the paternal uncle of other individuals (all of hg. R1a-M417). This suggests that there might have been many similar Central European mating networks from the mid-3rd millennium BC on, in which (mainly) Yamnaya-like R1b elites displaying a small proportion of CW-like ancestry admixed through exogamy with Corded Ware-like peoples who already had some Yamnaya ancestry.
mlba-yamnaya-ancestry
Natural neighbor interpolation of Yamnaya ancestry among Middle to Late Bronze Age populations (Esperstedt CWC site close to BK_DE, label is hidden by BK_DE_SAN). See full map. You can see how this map correlates with the map of Late Copper Age migrations and of the Yamnaya into Bell Beaker expansion.

NOTE. Terms like “exogamy”, “male-driven migration”, and “sex bias” are not only based on the Y-chromosome bottlenecks visible in the different cultural expansions since the Palaeolithic. Despite the scarce sampling available in 2017 for analysis of “Steppe ancestry”-related populations, it already appeared to show a male sex bias in Goldberg et al. (2017), and it has been confirmed for Neolithic and Copper Age population movements in Mathieson et al. (2018) – see Supplementary Table 5. The analysis of male-biased expansion of “Steppe ancestry” in CWC Esperstedt and Bell Beaker Germany is, for the reasons stated above, not very useful to distinguish their mutual influence, though.

Based on data from Olalde et al. (2019), Bell Beakers from Germany are the closest sampled ones to expanding East Bell Beakers, and those close to the Rhine – i.e. French, Dutch, and British Beakers in particular – show a clear excess of “Steppe ancestry” due to their exogamy with local Corded Ware groups:

Only one 2-way model fits the ancestry in Iberia_CA_Stp with P-value>0.05: Germany_Beaker + Iberia_CA. Finding a Bell Beaker-related group as a plausible source for the introduction of steppe ancestry into Iberia is consistent with the fact that some of the individuals in the Iberia_CA_Stp group were excavated in Bell Beaker associated contexts. Models with Iberia_CA and other Bell Beaker groups such as France_Beaker (P-value=7.31E-06), Netherlands_Beaker (P-value=1.03E-03) and England_Beaker (P-value=4.86E-02) failed, probably because they have slightly higher proportions of steppe ancestry than the true source population.
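The pass/fail logic in that excerpt is just a threshold on the reported P-values, which can be sketched in a few lines of Python. The three P-values are the ones quoted above. The excerpt does not give the P-value of the passing Germany_Beaker model, so it is left out rather than guessed. The function and variable names are my own and hypothetical.

```python
P_THRESHOLD = 0.05  # a model is kept only if its P-value exceeds this

# P-values quoted in the excerpt for Iberia_CA + <source> two-way models.
failed_candidates = {
    "France_Beaker": 7.31e-06,
    "Netherlands_Beaker": 1.03e-03,
    "England_Beaker": 4.86e-02,
}


def rejected_models(p_values, threshold=P_THRESHOLD):
    """Return the source names whose model P-value does not exceed the threshold."""
    return sorted(name for name, p in p_values.items() if p <= threshold)


print(rejected_models(failed_candidates))
```

Note how England_Beaker fails only narrowly, whereas the French and Dutch Beakers are rejected by orders of magnitude.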

olalde-iberia-chalcolithic

The exogamy with Corded Ware-like groups in the Lower Rhine Basin seems at this point undeniable, as is the origin of Bell Beakers around the Middle-Upper Danube Basin from Yamnaya Hungary.

To avoid this excess “Steppe ancestry” showing up in the maps, since Bell Beakers from Germany pack the most Yamnaya ancestry among East Bell Beakers outside Hungary (ca. 51.1% “Steppe ancestry”), I equated this maximum with BK_Scotland_Ach (which shows ca. 61.1% “Steppe ancestry”, highest among western Beakers), and applied a simple rule of three for “Steppe ancestry” in Dutch and British Beakers.
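As a rough sketch, this is how I read that rule of three in Python: the 61.1% of BK_Scotland_Ach is mapped onto the 51.1% of German Beakers, and every other Dutch or British value is scaled by the same ratio. The two constants come from the text. The function name and the 50.0% example input are hypothetical and do not correspond to any real sample.

```python
GERMANY_MAX = 51.1       # ca. % "Steppe ancestry" in Bell Beakers from Germany
SCOTLAND_ACH_MAX = 61.1  # ca. % "Steppe ancestry" in BK_Scotland_Ach


def rescale_steppe(raw_percent):
    """Rule of three: SCOTLAND_ACH_MAX is to GERMANY_MAX as raw_percent is to the result."""
    return raw_percent * GERMANY_MAX / SCOTLAND_ACH_MAX


# The western maximum lands on the German value by construction.
print(round(rescale_steppe(61.1), 1))

# Hypothetical Dutch or British Beaker group with 50.0% raw "Steppe ancestry".
print(round(rescale_steppe(50.0), 1))
```

The adjustment is purely proportional, so it lowers every value by the same factor (about 0.84) and does not change the ranking among Dutch and British groups.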

NOTE. Formal stats for “Steppe ancestry” in Bell Beaker groups are available in Olalde et al. (2018) supplementary materials (PDF). I didn’t apply this adjustment to Bk_FR groups because of the R1b Bell Beaker sample from the Champagne/Alsace region reported by Samantha Brunel that will pack more Yamnaya ancestry than any other sampled Beaker to date, hence probably driving the Yamnaya ancestry up in French samples.

The most likely outcome in the following years, when Yamnaya and Corded Ware ancestry are investigated separately, is that Yamnaya ancestry will be much lower the farther one moves from the Middle and Lower Danube region, similar to the case in Iberia, so the map above probably overestimates this component in most Beakers to the north of the Danube. Even the late Hungarian Beaker samples, who pack the highest Yamnaya ancestry (up to 75%) among Beakers, likely represent a back-migration of Moravian Beakers, and will probably show a contribution of Corded Ware ancestry due to the exogamy with local Moravian groups.

Despite this decreasing admixture as Bell Beakers spread westward, the explosive expansion of Yamnaya R1b male lineages (in the words of David Reich) and the radical replacement of local ones – whether derived from Corded Ware or Neolithic groups – show the true extent of the North-West Indo-European expansion in Europe:

chalcolithic-late-y-dna
Y-DNA haplogroups in West Eurasia during the Bell Beaker expansion. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Late Copper Age and of the Yamnaya-Bell Beaker transition.

4.2. Palaeo-Balkan

Data on Palaeo-Balkan movements are still scarce, although it is known that:

  1. Yamnaya ancestry appears among Mycenaeans, with the Yamnaya Bulgaria sample being its best current ancestral fit;
  2. the emergence of steppe ancestry and R1b-M269 in the eastern Mediterranean was associated with Ancient Greeks;
  3. Thracians, Albanians, and Armenians also show R1b-M269 subclades and “Steppe ancestry”.

4.3. Sintashta-Potapovka-Filatovka

Interestingly, Potapovka is the only Corded Ware-derived culture that shows good fits for Yamnaya ancestry, despite having replaced Poltavka in the region under the same Corded Ware-like (Abashevo) influence as Sintashta.

This proves that there was a period of admixture in the Pre-Proto-Indo-Iranian community between CWC-like Abashevo and Yamnaya-like Catacomb-Poltavka herders in the Sintashta-Potapovka-Filatovka community, probably more easily detectable in this group because of the specific temporal and geographic sampling available.

srubnaya-yamnaya-ehg-chg-ancestry
Supplementary Table 14. P values of rank=3 and admixture proportions in modelling Steppe ancestry populations as a four-way admixture of distal sources EHG, CHG, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Steppe cluster, EHG, CHG, WHG, Anatolian_Neolithic
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Srubnaya ancestry shows a best fit with non-Pre-Yamnaya ancestry, i.e. with different CHG + EHG components – possibly because the more western Potapovka (ancestral to Proto-Srubnaya Pokrovka) also showed good fits for it. Srubnaya shows poor fits for Pre-Yamnaya ancestry probably because Corded Ware-like (Abashevo) genetic influence increased during its formation.

On the other hand, more eastern Corded Ware-derived groups like Sintashta and its more direct offshoot Andronovo show poor fits with this model, too, but their fits are still better than those including Pre-Yamnaya ancestry.

mlba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Middle to Late Bronze Age populations. See full map.
mlba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Middle to Late Bronze Age populations. See full map.

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you should read the post Corded Ware ancestry in North Eurasia and the Uralic expansion instead.

The bottleneck of Proto-Indo-Iranians under R1a-Z93 was not yet complete by the time the Sintashta-Potapovka-Filatovka community expanded with the Srubna-Andronovo horizon:

early-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the European Early Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Bronze Age.

4.4. Afanasevo

At the end of the Afanasevo culture, at least three samples show hg. Q1b (ca. 2900-2500 BC), which seemed to point to a resurgence of local lineages, despite continuity of the prototypical Pre-Yamnaya ancestry. On the other hand, Anthony (2019) makes this cryptic statement:

Yamnaya men were almost exclusively R1b, and pre-Yamnaya Eneolithic Volga-Caspian-Caucasus steppe men were principally R1b, with a significant Q1a minority.

Since the only available samples from the Khvalynsk community are R1b (x3), Q1a (x1), and R1a (x1), it seems strange that Anthony would talk about a “significant minority”, unless Q1a (potentially Q1b in the newer nomenclature) pops up in some more individuals among the ca. 30 new ones to be published. Because he also mentions I2a2 as appearing in one elite burial, it seems Q1a (like R1a-M459) will not appear under elite kurgans, although it is still possible that hg. Q1a was involved in the expansion of Afanasevo to the east.

middle-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the Middle Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Middle Bronze Age and the Late Bronze Age.

Okunevo, which replaced Afanasevo in the Altai region, shows a majority of hg. Q1b, but also some R1b-M269 samples typical of Afanasevo, suggesting partial genetic continuity.

NOTE. Other sampled Siberian populations clearly show a variety of Q subclades that likely expanded during the Palaeolithic, such as Baikal EBA samples from Ust’Ida and Shamanka with a majority of Q1b, and hg. Q reported from Elunino, Sagsai, Khövsgöl, and also among peoples of the Srubna-Andronovo horizon (the Krasnoyarsk MLBA outlier), and in Karasuk.

From Damgaard et al. Science (2018):

(…) in contrast to the lack of identifiable admixture from Yamnaya and Afanasievo in the CentralSteppe_EMBA, there is an admixture signal of 10 to 20% Yamnaya and Afanasievo in the Okunevo_EMBA samples, consistent with evidence of western steppe influence. This signal is not seen on the X chromosome (qpAdm P value for admixture on X 0.33 compared to 0.02 for autosomes), suggesting a male-derived admixture, also consistent with the fact that 1 of 10 Okunevo_EMBA males carries a R1b1a2a2 Y chromosome related to those found in western pastoralists. In contrast, there is no evidence of western steppe admixture among the more eastern Baikal region Bronze Age (~2200 to 1800 BCE) samples.

This Yamnaya ancestry has also recently been found to be the best fit for the Iron Age population of Shirenzigou in Xinjiang – where Tocharian languages were attested centuries later – despite the haplogroup diversity acquired during their evolution, likely through an intermediate Chemurchek culture (see a recent discussion on the elusive Proto-Tocharians).

Haplogroup diversity seems to be common in Iron Age populations all over Eurasia, most likely due to the spread of different types of sociopolitical structures where alliances played a more relevant role in the expansion of peoples. A well-known example of this is the spread of Akozino warrior-traders in the whole Baltic region under a partial N1a-VL29-bottleneck associated with the emerging chiefdom-based systems under the influence of expanding steppe nomads.

early-iron-age-y-dna
Y-DNA haplogroups in West Eurasia during the Early Iron Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Iron Age and Late Iron Age.

Surprisingly, then, Proto-Tocharians from Shirenzigou pack up to 74% Yamnaya ancestry, in spite of the 2,000 years that separate them from the demise of the Afanasevo culture. They show more Yamnaya ancestry than any other population by that time, thus being a sort of Late PIE fossil not only in their archaic dialect, but also in their genetic profile:

shirenzigou-afanasievo-yamnaya-andronovo-srubna-ulchi-han

The recent intrusion of Corded Ware-like ancestry, as well as the variable admixture with Siberian and East Asian populations, both point to the known intense Old Iranian and Old/Middle Chinese contacts. The scarce Proto-Samoyedic and Proto-Turkic loans in Tocharian suggest a rather loose, probably more distant connection with East Uralic and Altaic peoples from the forest-steppe and steppe areas to the north (read more about external influences on Tocharian).

Interestingly, both R1b samples, MO12 and M15-2 – likely of Asian R1b-PH155 branch – show a best fit for Andronovo/Srubna + Hezhen/Ulchi ancestry, suggesting a likely connection with Iranians to the east of Xinjiang, who later expanded as the Wusun and Kangju. How they might have been related to Huns and Xiongnu individuals, who also show this haplogroup, is yet unknown, although Huns also show hg. R1a-Z93 (probably most R1a-Z2124) and Steppe_MLBA ancestry, earlier associated with expanding Iranian peoples of the Srubna-Andronovo horizon.

All in all, it seems that prehistoric movements explained through the lens of genetic research fit the linguistic reconstruction of Proto-Indo-European and Proto-Uralic perfectly well.


More Hungarian Conquerors of hg. N1c-Z1936, and the expansion of ‘Altaic-Uralic’ N1c

Open access Y-chromosomal connection between Hungarians and geographically distant populations of the Ural Mountain region and West Siberia, by Post et al. Scientific Reports (2019) 9:7786.

Hungarian Conquerors

More interesting than the study of modern populations of the paper is the following excerpt from the introduction, referring to a paper that is likely in preparation, Európai És Ázsiai Apai Genetikai Vonalak A Honfoglaló Magyar Törzsekben, by Fóthi, E., Fehér, T., Fóthi, Á. & Keyser, C., Avicenna Institute of Middle Eastern Studies (2019):

Certain chr-Y lineages from haplogroup (hg) N have been proposed to be associated with the spread of Uralic languages. So far, hg N3 has not been reported for Indo-European speaking populations in Central Europe, but it is present among Hungarians, although the proportion of hg N in the paternal gene pool of present-day Hungarians is only marginal (up to 4%) compared to other Uralic speaking populations. It has been shown earlier that one of the sub-clades of hg N – N3a4-Z1936 – could be a potential link between two Ugric speaking populations: the Hungarians and the Mansi. It is also notable that some ancient Hungarian samples from the 9th and 10th century Carpathian Basin belonged to this hg N sub-clade: Three Z1936 samples were found in the Upper-Tisza area (Karos II, Bodrogszerdahely/Streda nad Bodrogom) and two in the Middle-Tisza basin cemeteries (Nagykörű and Tiszakécske). The haplotype of the Nagykörű sample is identical with one contemporary Hungarian sample from Transylvania that tested positive for B545 marker downstream of N3a4-Z1936. Similar findings come from the maternal gene pool of historical Hungarians: the analyses of early medieval aDNA samples from Karos-Eperjesszög cemeteries revealed the presence of mtDNA hgs of East Asian provenance.

A commenter recently wrote that in a study by Fehér (probably this one) two Hungarian conquerors, from Ormenykut and Tuzser, will be of hg. N1c-2110. Assuming no other lineages will appear, this would leave the proportion of N1c-L392 vs. R1a-Z280/Z93 closer to the reported proportion of hg. N vs. R1a (5 vs. 2) among Sargat samples, and is thus compatible with a direct migration of Hungarians from around the Urals.

However, the sampling of Iron Age populations around the Urals is scarce, and we don’t know what other lineages these studied Magyars will have, but – based on the known variability of the published ones, and on the ca. 50-60 early Magyar males available to date in previous studies to obtain Y-chromosome haplogroups – I would say these reported N1c lineages are just a tiny proportion of what’s to come…

“Altaic-Uralic” N1c

altaic-uralic-n1c-haplogroup
Phylogenetic tree of hg N3a4. Phylogenetic tree of 33 high coverage Y-chromosomes from haplogroup N3a4 was reconstructed with BEAST v.1.7.5 software package.

Archaeogenetic studies based on mtDNA haplotypes have shown that ancient Hungarians were relatively close to contemporary Bashkirs who are a Turkic speaking population residing in the Volga-Ural region. Another study reported excessive identical-by-descent (IBD) genomic segments shared between the Ob-Ugric speaking Khantys and Bashkirs but a moderate IBD sharing between Turkic speaking Tatars and their neighbours including Bashkirs.

Phylogenetic tree of hg N3a4 has two main sub-clades defined by markers B535 and B539 that diverged around 4.9 kya (95% confidence interval [CI] = 3.7–6.3 kya). Inner sub-clades of N3a4-B539 (defined by markers B540 and B545) split 4.2 kya (95% CI = 3.0–5.6 kya). (…) The phylogenetic tree reveals that all five Hungarian samples belong to N3a4-B539 sub-clade that they share with Ob-Ugric speaking Khanty and Mansi, and Turkic speaking Bashkirs and Tatars from the Volga-Ural region. Hungarian and Bashkir chrY lineages belong to both sub-clades of N3a4-B539.

Modern distribution of the “Ugric N1c”

To test the presence and proportions of hg N3a4 lineages in a more comprehensive sample set and with a higher phylogenetic resolution level compared to earlier studies, we analysed the genotyping data of about 5000 Eurasian individuals, including West Siberian Mansi and Khanty who are linguistically closest to Hungarians

n3a4-n1c-z1936-ugric
Map of the entire hg N3a4.

There is a clear difference in geographic distribution patterns of these two hg N3a4 sub-clades. Hg N3a4-B535 (Fig. 3b) is common mostly among Finnic (Finns, Karelians, Vepsas, Estonians) and Saami speaking populations in North eastern Europe. The highest frequency is detected in Finns (~44%) but it also reaches up to 32% in Vepsas and around 20% in Karelians, Saamis and North Russians. The latter are known to have changed their language or to be an admixed population with reported similar genetic composition to their Finnic speaking neighbors. The frequency of N3a4-B535 rapidly decreases towards south to around 5% in Estonians, being almost absent in Latvians (1%) and not found among Lithuanians. Towards east its frequency is from 1–9% among Eastern European Russians and populations of the Volga-Ural region such as Komis, Mordvins and Chuvashes (…)

n3a4-n1c-z1936-finnic-samic
Map of N3a4 subclades defined by B535.

Hg N3a4-B539, on the other hand, is prevalent among Turkic speaking Bashkirs and also found in Tatars but is entirely missing from other populations of the Volga-Ural region such as Uralic speaking Udmurts, Maris, Komis and Mordvins, and in Northeast Europe, where instead N3a4-B535 lineages are frequent. Besides Bashkirs and Tatars in Volga-Ural region, N3a4-B539 is substantially represented in West Siberia among Ugric speaking Mansis and Khantys. Among Hungarians, however, N3a4-B539 has a subtle frequency of 1–4%.

n3a4-n1c-z1936-ugric-bashkir
Map of N3a4 subclades defined by B539, with a local snapshot showing the N3a4-B539 distribution among Hungarian speakers.

The battle to appropriate N1c-L392

So, basically, the team of Kristiina Tambets is arguing that N1c-VL29 expanded Finnic to the East Baltic (hence from a common Finno-Mordvinic dialect splitting ca. 600 BC on?) because, you know, apparently the agreed separation of known Uralic dialects from ca. 2000 BC, and their Bronze Age presence around the Baltic, is not valid when you follow haplogroups instead of languages or archaeology.

But now this other group of Tambets (co-author of this paper) considers that hg. N1c-Z1936 – which is probably behind the N1c-L392 samples from Lovozero Ware in the Kola Peninsula – represents either the True Uralic-speaking Palaeo-Arctic peoples, or else merely Ugric-speaking peoples which happened to expand to Fennoscandia but left no trace of their language…

To accept this identification you only have to NOT ask why:

  • N1c is first found in ancient cultures close to Lake Baikal.
  • N1c-L392 appears in ancient East Asian populations speaking completely different languages, with Altaic and Uralic being just some among many Palaeo-Siberian populations where the haplogroup will pop up.
  • Turkic populations like Bashkirs and Tatars (who expanded to the Volga through the southern Urals before the expansion of Hungarians) show a shared distribution of the B539 haplotype with Hungarians.
  • The phylogenetic tree and areas of N1c-L392 expansions don’t make any sense in light of the known linguistic and cultural expansions of Uralic-speaking peoples.

In fact, the Hungarian research group of Neparáczki – publishing the recent paper on Hungarian Conquerors – was apparently looking for a connection with Turkic peoples to support some traditional Turanian myths, and they found it in some scattered R1a-Z93 samples which supposedly connect Hungarian Conquerors to Huns (?), instead of looking for this closer link through N1c-Z1936 (especially haplotype B539)…

Also, is it me or are there two opposed trends with completely different interpretations among researchers publishing papers about hg. N1c: one systematically arguing for Altaic origins, and another for Uralic ones?

If somebody sees some complex reasoning behind the discussions of all these recent papers, beyond the simplest “let’s follow N for Uralic/Altaic”, feel free to comment below. Just so I can understand what I might be doing wrong in assessing Neolithic and Bronze Age migrations in linguistics and archaeology with help of ancient haplogroups coupled with ancestral components, but these researchers are doing right by playing with obsessive ideas born out of the 2000s coupled with phylogenetic trees and maps of modern haplogroup distributions…

This is probably going to be this blog’s most used image in 2019:

horse-meme-steppe-ancestry


N1c-L392 associated with expanding Turkic lineages in Siberia

haplogroup-n1c-tat

Second in popularity for the expansion of haplogroup N1a-L392 (ca. 4400 BC) is, apparently, the association with Turkic, and by extension with Micro-Altaic, after the Uralic link preferred in Europe; at least among certain eastern researchers.

New paper in a recently created journal, by the same main author of the group proposing that Scythians of hg. N1c were Turkic speakers: On the origins of the Sakhas’ paternal lineages: Reconciliation of population genetic / ancient DNA data, archaeological findings and historical narratives, by Tikhonov, Gurkan, Demirdov, and Beyoglu, Siberian Research (2019).

Interesting excerpts:

According to the views of a number of authoritative researchers, the Yakut ethnos was formed in the territory of Yakutia as a result of the mixing of people from the south and the autochthonous population [34].

These three major Sakha paternal lineages may have also arrived in Yakutia at different times and/ or from different places and/or with a difference in several generations instead, or perhaps Y-chromosomal STR mutations may have taken place in situ in Yakutia. Nevertheless, the immediate common ancestor(s) from the Asian Steppe of these three most prevalent Sakha Y-chromosomal STR haplotypes possibly lived during the prominence of the Turkic Khaganates, hence the near-perfect matches observed across a wide range of Eurasian geography, including as far as from Cyprus in the West to Liaoning, China in the East, then Middle Lena in the North and Afghanistan in the South (Table 3 and Figure 5). There may also be haplotypes closely-related to ‘the dominant Elley line’ among Karakalpaks, Uzbeks and Tajiks, however, limitations in the loci coverage for the available dataset (only eight Y-chromosomal STR loci) precludes further conclusions on this matter [25].

yakutia-haplogroup-n1c
17-loci median-joining network analysis of the original/dominant Elley, Unknown and Omogoy Y-chromosomal STR haplotypes with the YHRD matches from outside Yakutia populations.

According to the results presented here, very similar Y-STR haplotypes to that of the original Elley line were found in the west: Afghanistan and northern Cyprus, and in the east: Liaoning Province, China and Ulaanbaator, Northern Mongolia. In the case of the dominant Omogoy line, very closely matching haplotypes differing by a single mutational step were found in the city of Chifen of the Jirin Province, China. The widest range of similar haplotypes was found for the Yakut haplotype Unknown: In Mongolia, China and South Korea. For instance, haplotypes differing by a single step mutation were found in Northern Mongolia (Khalk, Darhad, Uryankhai populations), Ulaanbaator (Khalk) and in the province of Jirin, China (Han population).

n1c-uralic-altaic-siberia
14-loci median-joining network analysis for the original/dominant Elley (Ell), Unknown Clan (Vil), Omogoy (Omo), Eurasian (Eur) and Xiongnu (Xuo) Y-chromosomal STR haplotypes and that for a representative ancient DNA sample (Ch0 or DSQ04) from the Upper Xiajiadian Culture recovered from the Inner Mongolia Autonomous Region, China.

Notably, Tat-C-bearing Y-chromosomes were also observed in ancient DNA samples from the 2700-3000 years-old Upper Xiajiadian culture in Inner Mongolia, as well as those from the Serteya II site at the Upper Dvina region in Russia and the ‘Devichyi gory’ culture of long barrow burials at the Nevel’sky district of Pskovsky region in Russia. A 14-loci Y-chromosomal STR median-joining network of the most prevalent Sakha haplotypes and a Tat-C-bearing haplotype from one of the ancient DNA samples recovered from the Upper Xiajiadian culture in Inner Mongolia (DSQ04) revealed that the contemporary Sakha haplotype ‘Xuo’ (Table 2, Haplotype ID “Xuo”) classified as that of ‘the Xiongnu clan’ in our current study, was the closest to the ancient Xiongnu haplotype (Figure 6). TMRCA estimate for this 14-loci Y-chromosomal STR network was 4357 ± 1038 years or 2341 ± 1038 BCE, which correlated well with the Upper Xiajiadian culture that was dated to the Late Bronze Age (700-1000 BCE).
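The date arithmetic in this excerpt can be checked with a short sketch. It assumes that the TMRCA is counted back from a reference year of about 2016 (inferred from 4357 − 2341, not stated in the excerpt) and that the ± value is a symmetric error; the function names are mine, not the paper's.

```python
# Sketch: convert a TMRCA given in years before a reference year into BCE,
# and compare its error range with a culture's date range.
# ASSUMPTION: reference year 2016 CE, inferred from 4357 - 2341 = 2016.
# Plain subtraction is used, as in the excerpt (the missing year 0 is ignored).

REFERENCE_YEAR = 2016  # assumed, not stated in the source


def years_ago_to_bce(years_ago, reference_year=REFERENCE_YEAR):
    """Return the BCE year for an age in years before the reference year."""
    return years_ago - reference_year


def bce_range(years_ago, error, reference_year=REFERENCE_YEAR):
    """Return (oldest, youngest) BCE bounds for age +/- error."""
    centre = years_ago_to_bce(years_ago, reference_year)
    return centre + error, centre - error


def ranges_overlap(range_a, range_b):
    """True if two (oldest, youngest) BCE ranges share at least one year."""
    oldest_a, youngest_a = range_a
    oldest_b, youngest_b = range_b
    return youngest_a <= oldest_b and youngest_b <= oldest_a


tmrca = bce_range(4357, 1038)    # (3379, 1303) BCE
upper_xiajiadian = (1000, 700)   # BCE, as quoted in the excerpt
print(tmrca, ranges_overlap(tmrca, upper_xiajiadian))
```

Under these assumptions the central estimate reproduces the quoted 2341 BCE, while the ± range (3379–1303 BCE) stops short of the 1000–700 BCE window given for the culture, so how well the two "correlate" depends on how the error term is read.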

eurasian-n-subclades
Geographical location of ancient samples belonging to major clade N of the Y-chromosome.

NOTE. Also interesting from the paper seems to be the proportion of E1b1b among admixed Russian populations, in a proportion similar to R1a or I2a(xI2a1).

It is tempting to associate the prevalent presence of N1c-L392 in ancient Siberian populations with the expansion of Altaic, by simplistically linking the findings (in chronological order) near Lake Baikal (Damgaard et al. 2018), Upper Xiajiadian (Cui et al. 2013), among Khövsgöl (Jeong et al. 2018), in Huns (Damgaard et al. 2018), and in Mongolic-speaking Avars (Csáky et al. 2019).

However, its finding among Palaeo-Laplandic peoples in the Kola peninsula ca. 1500 BC (Lamnidis et al. 2018) and among Palaeo-Siberian populations near the Yana River (Sikora et al. 2018) ca. AD 1200 should be enough to accept the hypothesis of ancestral waves of expansion of the haplogroup over northern Eurasia, with acculturation and further expansions in the different regions since the Iron Age (see more on its potential expansion waves).

Also, a simple look at the TMRCA and modern distribution was enough to hypothesize long ago the lack of connection of N1c-L392 with Altaic or Uralic peoples. From Ilumäe et al. (2016):

Previous research has shown that Y chromosomes of the Turkic-speaking Yakuts (Sakha) belong overwhelmingly to hg N3 (formerly N1c1). We found that nearly all of the more than 150 genotyped Yakut N3 Y chromosomes belong to the N3a2-M2118 clade, just as in the Turkic-speaking Dolgans and the linguistically distant Tungusic-speaking Evenks and Evens living in Yakutia (Table S2). Hence, the N3a2 patrilineage is a prime example of a male population of broad central Siberian ancestry that is not intrinsic to any linguistically defined group of people. Moreover, the deepest branch of hg N3a2 is represented by a Lebanese and a Chinese sample. This finding agrees with the sequence data from Hallast et al., where one Turkish Y chromosome was also assigned to the same sub-clade. Interestingly, N3a2 was also found in one Bhutan individual who represents a separate sub-lineage in the clade. These findings show that although N3a2 reflects a recent strong founder effect primarily in central Siberia (Yakutia, Sakha), the sub-clade has a much wider distribution area with incidental occurrences in the Near East and South Asia.

haplogroup-n1a-M2118
Frequency-Distribution Maps of Individual Sub-clades of hg N3a2, by Ilumäe et al. (2016).

The most striking aspect of the phylogeography of hg N is the spread of the N3a3’6-CTS6967 lineages. Considering the three geographically most distant populations in our study—Chukchi, Buryats, and Lithuanians—it is remarkable to find that about half of the Y chromosome pool of each consists of hg N3 and that they share the same sub-clade N3a3’6. The fractionation of N3a3’6 into the four sub-clades that cover such an extraordinarily wide area occurred in the mid-Holocene, about 5.0 kya (95% CI = 4.4–5.7 kya). It is hard to pinpoint the precise region where the split of these lineages occurred. It could have happened somewhere in the middle of their geographic spread around the Urals or further east in West Siberia, where current regional diversity of hg N sub-lineages is the highest (Figure 1B). Yet, it is evident that the spread of the newly arisen sub-clades of N3a3’6 in opposing directions happened very quickly. Today, it unites the East Baltic, East Fennoscandia, Buryatia, Mongolia, and Chukotka-Kamchatka (Beringian) Eurasian regions, which are separated from each other by approximately 5,000–6,700 km by air. N3a3’6 has high frequencies in the patrilineal pools of populations belonging to the Altaic, Uralic, several Indo-European, and Chukotko-Kamchatkan language families. There is no generally agreed, time-resolved linguistic tree that unites these linguistic phyla. Yet, their split is almost certainly at least several millennia older than the rather recent expansion signal of the N3a3’6 sub-clade, suggesting that its spread had little to do with linguistic affinities of men carrying the N3a3’6 lineages.

haplogroup_n3a3
Frequency-Distribution Maps of Individual Subclade N3a3 / N1a1a1a1a1a-CTS2929/VL29.

It was thus clear long ago that N1c-L392 lineages must have expanded explosively in the 5th millennium BC through Northern Eurasia, probably from a region to the north of Lake Baikal, and that this expansion – and succeeding ones through Northern Eurasia – may not be associated with any known language group until well into the common era.


Scytho-Siberians of Aldy-Bel and Sagly, of haplogroup R1a-Z93, Q1b-L54, and N

iron-age-sakas-aldy-bel-scythians

Recently, a paper described Eastern Scythian groups as “Uralic-Altaic” just because of the appearance of haplogroup N in two Pazyryk samples.

This simplistic identification is contested by the varied haplogroups found in early Altaic groups, by the early link of Cimmerians with the expansion of hg. N and Q, by the link of N1c-L392 in north-eastern Europe with Palaeo-Laplandic, and now (paradoxically) by the clear link between early Mongolic expansion and N1c-L392 subclades.

A new paper (behind paywall) offers insight into the prevalent presence of R1a-Z93 among eastern Scytho-Siberian groups (most likely including Samoyedic speakers in the forest-steppes), and a new hint to the westward expansion of haplogroups Q and N (probably coupled with the so-called “Siberian ancestry”) from the east with different groups of Iron Age steppe nomads:

Genetic kinship and admixture in Iron Age Scytho-Siberians, by Mary et al. Human Genetics (2019).

Interesting excerpts (emphasis mine):

From an archeological and historical point of view, the term “Scythians” refers to Iron Age nomadic or seminomadic populations characterized by the presence of three types of artifacts in male burials: typical weapons, specific horse harnesses and items decorated in the so-called “Animal Style”. This complex of goods has been termed the “Scythian triad” and was considered to be characteristic of nomadic groups belonging to the “Scythian World” (Yablonsky 2001). This “Scythian World” includes both the Classic (or European) Scythians from the North Pontic region (7th–3rd century BC) and the Southern Siberian (or Asian) populations of the Scythian period (also called Scytho-Siberians). These include, among others, the Sakas from Kazakhstan, the Tagar population from the Minusinsk Basin (Republic of Khakassia), the Aldy-Bel population from Tuva (Russian Federation) and the Pazyryk and Sagly cultures from the Altai Mountains.

mtdna-scytho-siberians
Proportions of Scythian mtDNA haplogroups. Western (blue) and eastern (pink) Eurasian lineages are equally distributed in the Arzhan Scytho-Siberian sample. The U5a2a1 haplogroup shared between the two Scythian groups studied is in bold

In this work, we first aim to address the question of the familial and social organization of Scytho-Siberian groups by studying the genetic relationship of 29 individuals from the Aldy-Bel and Sagly cultures using autosomal STRs. (…) were obtained from 5 archeological sites located in the valley of the Eerbek river in Tuva Republic, Russia (Fig. 1). All the mounds of this archeological site were excavated but DNA samples were not collected from all of them. 14C dates mainly fall within the Hallstatt radiocarbon calibration plateau (ca. 800–400 cal BC) where the chronological resolution is poor. Only one date falls on an earlier segment of calibration curve: Le 9817–2650 ± 25 BP, i.e. 843–792 cal BC with a probability of 94.3% (using the OxCal v4.3.2 program). This sample (Bai-Dag 8, Kurgan 1, grave 10) is not from one of the graves studied but was used to date the kurgan as a whole.

Y-chromosome haplogroups were first assigned using the ISOGG 2018 nomenclature. In order to improve the precision of haplogroup definition, we also analyzed a set of Y-chromosome SNP (Supplementary Table 2). Nine samples belonged to the R1a-M513 haplogroup (defined by marker M513) and two of these nine samples were characterized as belonging to the R1a1a1b2-Z93 haplogroup or one of its subclades. Six samples belonged to the Q1b1a-L54 haplogroup and five of these six samples belonged to the Q1b1a3-L330 subclade. One sample belonged to the N-M231 haplogroup.

haplogroups-scythian-siberians

The distribution of these haplogroups in the population must be confronted with the prevalence of kinship among the samples. Although five individuals belonged to haplogroup Q1b1a3-L330, three of them (ARZ-T18, ARZ-T19 and ARZ-T20) were paternally related (Fig. 2). It must, therefore, be considered that haplogroup Q1b1a3-L330 is present in three independent instances (given that the remaining two instances exhibit no close familial relationship with other samples or one another). All five were buried on the Eki-Ottug 1 archaeological site (although in two different kurgans).

In the same way, although two groups, of two and three individuals, shared haplotypes belonging to the R1a-M513 haplogroup, these groups likely include a father/son pair (ARZ-T2 and ARZ-T12). Therefore, among nine R1a-M513 men, we found six independent haplotypes, one being present in two independent instances. All R1a-M513 haplotypes, however, including those attributed to the R1a1a1b2-Z93 subclade, only differed by one-step mutations, across 5 loci at most. All R1a-M513 individuals were buried on the same site, Eki-Ottug 2, in a single Kurgan.

y-haplogroups-r1a-n-q1b
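The kinship correction described in the two paragraphs above is simple bookkeeping, and a minimal sketch makes the numbers easy to verify. The function name and its parameters are hypothetical; only the counts come from the excerpt.

```python
# Sketch: count independent lineages (or distinct haplotypes) when some
# sampled men fall into groups that should be counted only once.

def count_independent(total_men, group_sizes):
    """Collapse each group to a single instance and return the new count.

    total_men   -- number of men carrying the haplogroup
    group_sizes -- sizes of the groups whose members count as one
    """
    if sum(group_sizes) > total_men:
        raise ValueError("groups contain more men than were sampled")
    return total_men - sum(size - 1 for size in group_sizes)


# Q1b1a3-L330: five men, three of them paternally related.
print(count_independent(5, [3]))     # 3 independent instances

# R1a-M513: nine men, with groups of two and three sharing a haplotype.
print(count_independent(9, [2, 3]))  # 6 distinct haplotypes
```

Both results match the figures given in the paper: three independent instances of Q1b1a3-L330 and six haplotypes among the nine R1a-M513 men.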

Haplogroup R1a-M173 was previously reported for 6 Scytho-Siberian individuals from the Tagar culture (Keyser et al. 2009) and one Altaian Scytho-Siberian from the Sebÿstei site (Ricaut et al. 2004a), whereas haplogroup R1a1a1b2-Z93 (or R1a1a1b-S224) was described for one Scythian from Samara (Mathieson et al. 2015) and two Scytho-Siberians from Berel and the Tuva Republic (Unterländer et al. 2017). On the contrary, North Pontic Scythians were found to belong to the R1b1a1a2 haplogroup (Krzewińska et al. 2018), showing a distinction between the two groups of Scythians. (…) The absence of R1b lineages in the Scytho-Siberian individuals tested so far and their presence in the North Pontic Scythians suggest that these 2 groups had a completely different paternal lineage makeup with nearly no gene flow from male carriers between them.

The seven other male individuals studied in this work were found to carry Eastern Eurasian Y haplogroups Q1b1a and one of its subclades (n = 6) and N (n = 1). Haplogroup Q1b1a-L54 was previously described in four males from the Bronze Age in the Altai Mountains (Hollard et al. 2014, 2018) and was clearly associated with Siberian populations (Regueiro et al. 2013).

The N-M231 haplogroup emerged from haplogroup K in Southern Asia around 21,000 years BCE, maybe in Southern China (Shi et al. 2013; Ilumäe et al. 2016). Previous studies attested to its presence in samples from Neolithic and Bronze Age in China (Li et al. 2011; Cui et al. 2013). Waves of northwestern expansion of this haplogroup are described as beginning during the Paleolithic period (Derenko et al. 2006; Shi et al. 2013) but traces of this expansion in archeological samples were reported only in two Scytho-Siberian males from the Altai (Pilipenko et al. 2015).

The sample of haplogroup N comes from the Aldy-Bel culture (ARZ-T15), from the Eerbek site, but has no radiocarbon date. All Q1b-L330 samples come from the Sagly culture, and three are paternally related. The other Q1b-L54 sample is from other tombs in one kurgan at Aldy Bel.

It seems that – exactly as expected – different waves of steppe nomads brought different lineages at a time (the Iron Age) when many regions incorporated different eastern lineages without necessarily changing language. Just like the expansion of N among Ugrians and Samoyeds, and N1c among Finno-Permic peoples, and like many other lineages expanding with federation-like groups in eastern, central, and western Europe.


R1a-Z280 and R1a-Z93 shared by ancient Finno-Ugric populations; N1c-Tat expanded with Micro-Altaic

Two important papers have appeared regarding the supposed link of Uralians with haplogroup N.

Avars of haplogroup N1c-Tat

Preprint Genetic insights into the social organisation of the Avar period elite in the 7th century AD Carpathian Basin, by Csáky et al. bioRxiv (2019).

Interesting excerpts (emphasis mine):

After 568 AD the Avars settled in the Carpathian Basin and founded the Avar Qaganate that was an important power in Central Europe until the 9th century. Part of the Avar society was probably of Asian origin, however the localisation of their homeland is hampered by the scarcity of historical and archaeological data.

Here, we study mitogenome and Y chromosomal STR variability of twenty-six individuals, a number of them representing a well-characterised elite group buried at the centre of the Carpathian Basin more than a century after the Avar conquest.

The Y-STR analyses of 17 males give evidence on a surprisingly homogeneous Y chromosomal composition. Y chromosomal STR profiles of 14 males could be assigned to haplogroup N-Tat (also N1a1-M46). N-Tat haplotype I was found in four males from Kunpeszér with identical alleles on at least nine loci. The full Y-STR haplotype I, reconstructed from AC17 with 17 detected STRs, is rare in our days. Only nine matches were found among haplotypes in YHRD database, such as samples from the Ural Region, Northern Europe (Estonia, Finland), and Western Alaska (Yupiks). We performed Median Joining (MJ) network analysis using N-Tat haplotypes with ten shared STR loci (Fig. 3, Table S9). All modern N-Tat samples included in the network had derived allele of L708 as well. Haplotype I (Cluster 1 in Fig. 3) is shared by eight populations on the MJ network among the 24 identical haplotypes. Cluster 1 represents the founding lineage, as it is described in Siberian populations, because this haplotype is shared by the most populations and it is more diverse than Cluster 2.

Nine males share N-Tat haplotype II (on a minimum of eight detected alleles), all of them buried in the Danube-Tisza Interfluve. We found 30 direct matches of this N-Tat haplotype II in the YHRD database, using the complete 17 STR Y-filer profile of AC1, AC12, AC14, AC15, AC19 samples. Most hits came from Mongolia (seven Buryats and one Khalkh) and from Russia (six Yakuts), but identical haplotypes also occur in China (five in Xinjiang and four in Inner Mongolia provinces). On the MJ network, this haplotype II is represented by Cluster 2 and is composed of 45 samples (including 32 Buryats) from six populations (Fig. 3).

y-str-haplogroup-n-mongolian-ugrians
Median Joining network of 162 N-Tat Y-STR haplotypes Allelic information of ten Y-STR loci were used for the network. Only those Avar samples were included, which had results for these ten Y-STR loci. The founder haplotype I (Cluster 1) is shared by eight populations including three Mongolian, three Székely, three northern Mansi, two southern Mansi, two Hungarian, eight Khanty, one Finn and two Avar (AC17, AC26) chromosomes. Haplotype II (Cluster 2) includes 45 haplotypes from six populations studied: 32 Buryats, two Mongolians, one Székely, one Uzbek, one Uzbek Madjar, two northern Mansi and six Avars (AC1, AC12, AC14, AC15, AC19 and KSZ 37). Haplotype III (indicated by a red arrow) is AC8. Information on the modern reference samples is seen in Table S9.
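As a quick consistency check, the per-population counts in this caption can be tallied against the cluster sizes quoted in the text (24 identical haplotypes for Cluster 1, 45 samples for Cluster 2). The dictionaries below only transcribe the caption; the helper name is mine.

```python
# Sketch: check that the per-population counts in the caption add up to the
# cluster sizes given in the text (24 for Cluster 1, 45 for Cluster 2).

cluster_1 = {
    "Mongolian": 3, "Székely": 3, "Northern Mansi": 3, "Southern Mansi": 2,
    "Hungarian": 2, "Khanty": 8, "Finn": 1, "Avar": 2,
}
cluster_2 = {
    "Buryat": 32, "Mongolian": 2, "Székely": 1, "Uzbek": 1,
    "Uzbek Madjar": 1, "Northern Mansi": 2, "Avar": 6,
}


def summarise(cluster):
    """Return (number of chromosomes, number of listed groups)."""
    return sum(cluster.values()), len(cluster)


print(summarise(cluster_1))  # (24, 8)
print(summarise(cluster_2))  # (45, 7)
```

The totals agree with the text. Cluster 2 lists seven groups where the paper speaks of six populations, presumably because the Avar samples are not counted among the reference populations, although that reading is my assumption.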

A third N-Tat lineage (type III) was represented only once in the Avar dataset (AC8), and has no direct modern parallels from the YHRD database. This haplotype on the MJ network (see red arrow in Fig. 3) seems to be a descendent from other haplotype cluster that is shared by three populations (two Buryat from Mongolia, three Khanty and one Northern Mansi samples). This haplotype cluster also differs one molecular step (locus DYS393) from haplotype II. We classified the Avar samples to downstream subgroup N-F4205 within the N-Tat haplogroup, based on the results of ours and Ilumäe et al. and constructed a second network (Fig. S4). The N-F4205 network results support the assumption that the N-Tat Avar samples belong to N-F4205 subgroup (see SI chapter 1d for more details).

Based on our calculation, the age of accumulated STR variance (TMRCA) within N-Tat lineage for all samples is 7.0 kya (95% CI: 4.9 – 9.2 kya), considering the core haplotype (Cluster 1) to be the founding lineage. Y haplogroup N-Tat was not detected by large scale Eurasian ancient DNA studies but it occurs in late Bronze Age Inner Mongolia and late medieval Yakuts, among them N-Tat has still the highest frequency.

Two males (AC4 and AC7) from the Transtisza group belong to two different haplotypes of Y-haplogroup Q1. Both Q1a-F1096 and Q1b-M346 haplotypes have neither direct nor one step neighbour matches in the worldwide YHRD database. A network of the Q1b-M346 haplotype shows that this male had a probable Altaian or South Siberian paternal genetic origin.

EDIT (5 APR 2019): The paper offers an interesting late sample before the arrival of Hungarian conquerors, although we don’t know which precise lineage the sample belongs to:

One sample in our dataset (HC9) comes from this population, and both his mtDNA (T1a1b) and Y chromosome (R1a) support Eastern European connections. (…) Furthermore, we excluded sample HC9 from population-genetic statistical analyses because it belongs to a later period (end of 7th – early 9th centuries)

Apparently, then, results are consistent with what was already known from studies of modern populations:

According to Ilumäe et al. study, the frequency peak of N-F4205 (N3a5-F4205) chromosomes is close to the Transbaikal region of Southern Siberia and Mongolia, and we conclude that most Avar N-Tat chromosomes probably originated from a common source population of people living in this area, completely in line with the results of Ilumäe et al.

haplogroup_n1
Geographic-Distribution Map of hg N3 from Ilumäe et al.

Finno-Ugrians share haplogroup R1a-Z280

Another paper, behind paywall, Genetic history of Bashkirian Mari and Southern Mansi ethnic groups in the Ural region, by Dudás et al. Molecular Genetics and Genomics (2019).

Interesting excerpts (emphasis mine):

Y‑chromosome diversity

The most frequent haplogroups of the Bashkirian Maris were N1b-P43 (42%), R1a-Z280 (16%), R1a-Z93 (16%), N1c-Tat (13%), and J2-M172 (7%). Furthermore, subgroup R1b-M343 accounted for 4% and I2a-P37 covered 2% of the lineages. None of the Mari N1c Y chromosomes belonged to the N1c subgroups investigated (L1034, VL29, Z1936).

In the case of the Southern Mansi males, the most frequent haplogroups were N1b-P43 (33%), N1c-L1034 (28%) and R1a-Z280 (19%). The frequencies of the remaining haplogroups were as follows: R1a-M458 (6%), I1-L22 (3%), I2a-P37 (3%), and R1b-P312 (3%). The haplotype and haplogroup diversities of the Bashkirian Mari group were 0.9929 and 0.7657, whereas these values for the Southern Mansi were 0.9984 and 0.7873, respectively. The results show that, in both populations, haplotypes are much more diverse than haplogroups.

bashkir-mari-southern-mansi
Haplogroup frequencies of the Bashkirian Mari and the Southern Mansi ethnic groups in Ural region
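The haplogroup diversity quoted for the Bashkirian Mari can be reproduced with the usual unbiased gene diversity estimator, h = n/(n−1) · (1 − Σ pᵢ²). This is only a sketch: I am assuming that this is the estimator the paper used, and the counts are back-calculated from the rounded percentages with n = 45. The 7 R1a-Z280, 7 R1a-Z93 and 6 N-Tat Mari samples are stated in the network captions further down; the remaining counts are inferred.

```python
# Sketch: unbiased gene diversity, h = n/(n-1) * (1 - sum(p_i^2)),
# applied to haplogroup counts.
# ASSUMPTION: counts are back-calculated from rounded percentages with n = 45.

def gene_diversity(counts):
    """Return the unbiased gene diversity for a list of category counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two samples")
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq_freq)


bashkirian_mari = {
    "N1b-P43": 19, "R1a-Z280": 7, "R1a-Z93": 7, "N1c-Tat": 6,
    "J2-M172": 3, "R1b-M343": 2, "I2a-P37": 1,
}
print(round(gene_diversity(list(bashkirian_mari.values())), 4))  # 0.7657
```

The result matches the reported 0.7657. The same function applied to haplotype counts gives values close to 1 whenever almost every man carries a unique haplotype, which is why haplotype diversity is so much higher than haplogroup diversity in both populations.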

Genetic structure

(…) the studied Bashkirian Mari and Southern Mansi population groups formed a compact cluster along with two Khanty, Northern Mansi, Mari, and Estonian populations based on close Fst-genetic distances (< 0.05), with nonsignificant p values (p > 0.05) except for the Estonian population. All of these populations belong to the Finno-Ugric language family. Interestingly, the other Mansi population studied by Pimenoff et al. (2008) (pop # 38) was located a great distance from the Southern Mansi group (0.268). In addition, the Bashkir population (pop # 6) did not show a close genetic affinity to the Bashkirian Mari group (0.194), even though it is the host population. However, the Russian population from the Eastern European region of Russia (pop # 49) showed a genetic distance of 0.055 with the Southern Mansi group. All Hungarian speaking populations (pops 13, 22, 23, 24, 50, and 51) showed close genetic affinities to each other and to the neighbouring populations, but not to the two studied populations.

y-dna-hungarians-ugric-mansi
Multidimensional scaling (MDS) plot constructed on Fst-genetic distances of Y haplogroup frequencies of 63 populations compared. The haplogroup frequency data used for population comparison together with references are seen in Online Resource 2 (ESM_2). Pairwise Fst-genetic distances and p values between 63 populations were calculated as shown in Online Resource 3 (ESM_3). Fig. 4: Multidimensional scaling (MDS) plot constructed on Rst-genetic distances of 10 STR-based Y haplotype frequencies of 21 populations compared. Image modified to include labels of modern populations.

Phylogenetic analysis

Median-joining networks were constructed for:

N-P43 (earlier N1b):

(…) TMRCA estimates for this haplogroup were made for all P43 samples (n = 157) 8.7 kya (95% CI 6.7–10.8 kya), for the N-P43 Asian.

N1c-Tat:

(…) 75% of Buryats belonged to Haplotype 2, indicating that the Buryats studied by us is a young and isolated population (Bíró et al. 2015). Bashkirian Mari samples derive from Haplotype 2 via Haplotype 3 (see dark purple circles on the top of Fig. 6a). Haplotype 3 contained six males (2 Buryat, 1 Northern Mansi, and 3 Khanty samples from Pimenoff et al. 2008). The biggest Bashkirian Mari haplotype node (3 Mari samples) was positioned three mutational steps away from Haplotype 1 and the remaining Mari samples can be derived from this haplotype. Southern Mansi haplotypes were scattered within the network except for two, which formed a smaller haplotype node with two Northern Mansi and two Khanty samples from Pimenoff et al. (2008).

n1c-n-tat-uralic-ugric
Median-Joining Networks (MJ) of 153 N-Tat (a) and 26 N-L1034 (b) haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. For N-Tat network, we used data from Southern Mansi (n = 11), Bashkirian Mari (n = 6) samples with Hungarian (n = 12), Hungarian speaking Székely (n = 6), Northern Mansi (n = 14), Mongolian (n = 16), Buryat (n = 44), Finnish (n = 13), Uzbek Madjar (n = 2), Uzbek (n = 3), Khanty (n = 4) populations studied earlier by us (Fehér et al. 2015; Bíró et al. 2015) and Khanty (n = 18) and Mansi (n = 4) studied by Pimenoff et al. (2008)

R1a-Z280 haplotypes, shared by Maris, Mansis, and Hungarians, hence ancient Finno-Ugrians:

The founder R1a-Z280 haplotype was shared by four samples from four populations (1 Bashkirian Mari; 1 Southern Mansi; 1 Hungarian speaking Székely; and 1 Hungarian), as presented in Fig. 7 (Haplotype 1). Haplotype 2 included five males (3 Bashkirian Mari and 2 Hungarian), as it can be seen in Fig. 7. Haplotype 4 included two shared haplotypes (1 Bashkirian Mari and one Hungarian speaking Csángó). The remaining two Bashkirian Mari haplotypes differ from the founder haplotype (Haplotype 1) by two mutational steps via Hungarian or Hungarian and Bashkirian Mari shared haplotypes. Beside Haplotype 1, the remaining Southern Mansi haplotypes were shared with Hungarians (Haplotype 5 or turquoise blue and red-coloured circles above Haplotype 7) or with Hungarians and Hungarian speaking Székely group (Haplotypes 3, 5, and 6). Haplotype 7 included ten Hungarian speakers (Hungarian, Székely, and Csángó). One Hungarian and one Uzbek Khwarezm shared haplotype can be found in Fig. 7 as well (red and white-coloured circle). All the other haplotypes were scattered in the network. The age of accumulated STR variation within R1a-Z280 lineage for 93 samples is estimated to be 9.4 kya (95% CI 6.5–12.4 kya) considering Haplotype 1 (Fig. 7) to be the founder.

r1a-z280-ugrians
Median-Joining Networks (MJ) of 93 R1a-Z280 haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. We used haplotype data from Bashkirian Mari (n = 7), Southern Mansi (n = 7), Hungarian (n = 52), Hungarian speaking Székely (n = 11), Hungarian speaking Csángó (n = 10), Uzbek Ferghana (n = 2), Uzbek Tashkent (n = 1), Uzbek Khwarezm (n = 1) and Northern Mansi (n = 2) populations
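The "mutational steps" separating haplotypes in a network like this one can be illustrated with a minimal sketch. It assumes a strict stepwise mutation model, where one step is the gain or loss of a single repeat at one STR locus. The function name, the choice of loci and the repeat counts are made up for illustration; they are not data from the study.

```python
def mutational_steps(hap_a, hap_b):
    """Count single-repeat steps between two STR haplotypes.

    Assumes a stepwise mutation model: each unit of difference in repeat
    count at a locus counts as one step. Both haplotypes must be typed
    at the same loci.
    """
    if hap_a.keys() != hap_b.keys():
        raise ValueError("haplotypes must share the same loci")
    return sum(abs(hap_a[locus] - hap_b[locus]) for locus in hap_a)


# Hypothetical repeat counts (illustrative only, not data from the paper).
founder = {"DYS19": 16, "DYS390": 25, "DYS391": 10, "DYS392": 11}
derived = {"DYS19": 16, "DYS390": 24, "DYS391": 11, "DYS392": 11}

print(mutational_steps(founder, derived))  # prints 2
```

Note that a median-joining network may also add inferred intermediate nodes, so the path drawn between two circles is not always the same as this direct pairwise count.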

R1a-Z93 as isolated lineages among Permic and Ugric populations:

Figure 8 depicts an MJ network of R1a-Z93* samples using 106 haplotypes from the 14 populations (Fig. 8). All of the Bashkirian Mari samples (7 haplotypes) formed a very isolated branch and differed from the one Hungarian haplotype (Fig. 8, see Haplotype 1) by seven mutational steps as well from two Uzbek Tashkent samples (see Haplotype 3). Another Hungarian sample shared two haplotypes of Uzbek Khwarezm samples in Haplotype 4. This haplotype can be derived from Haplotype 3 (Uzbek Tashkent). Haplotype 2 included one Hungarian and one Khakassian male. The remaining three Hungarian haplotypes are outliers in the network and are not shared by any sample. The other population samples included in the network either form independent clusters such as Altaians, Khakassians, Khanties, and Uzbek Madjars or were scattered in the network. The age of accumulated STR variation (TMRCA) within R1a-Z93* lineage for 106 samples is estimated as 11.6 kya (95% CI 9.3–14.0 kya) considering an Armenian haplotype (Fig. 8, “A”) to be the founder and the median haplotype.

r1a-z93-ugrians
Median-Joining Networks (MJ) of 106 R1a-Z93 haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. We used the next haplotype data: 7 Bashkirian Mari, 6 Khanty, 4 Uzbek Madjar, 5 Uzbek Ferghana, 9 Uzbek Tashkent, 7 Uzbek Khwarezm, 2 Mongolian, 2 Buryat, 6 Hungarian samples tested by us for this study or published earlier (Bíró et al. 2015) and populations (3 Armenian; 3 Afghan Tajik; 16 Altaian; 24 Khakassian; 12 Kyrgyz) from Underhill et al. (2015).
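For readers wondering how an "age of accumulated STR variation" relative to a founder haplotype is obtained, here is a rough sketch of one common approach: average the squared repeat differences from the founder over samples and loci, then divide by a per-locus mutation rate. The excerpt does not state which rate or exact estimator the authors used, so the function name, the rate of 6.9e-4 per locus per 25 years (the often-cited "evolutionary" rate of Zhivotovsky et al. 2004) and the toy haplotypes are all assumptions.

```python
def str_variation_age(haplotypes, founder, rate_per_generation, years_per_generation):
    """Estimate an age in years from squared repeat differences to a founder.

    For each locus, average the squared difference between every sampled
    haplotype and the founder; average those values over loci; divide by
    the mutation rate to get generations; convert generations to years.
    """
    n = len(haplotypes)
    per_locus = [
        sum((hap[locus] - founder[locus]) ** 2 for hap in haplotypes) / n
        for locus in founder
    ]
    mean_sq_diff = sum(per_locus) / len(per_locus)
    generations = mean_sq_diff / rate_per_generation
    return generations * years_per_generation


# Toy data (hypothetical, not from the paper): two loci, four samples.
founder = {"locus_A": 16, "locus_B": 25}
sample = [
    {"locus_A": 16, "locus_B": 25},
    {"locus_A": 16, "locus_B": 25},
    {"locus_A": 17, "locus_B": 25},
    {"locus_A": 16, "locus_B": 24},
]

# Assumed rate: 6.9e-4 per locus per 25-year generation.
age_years = str_variation_age(sample, founder, 6.9e-4, 25)
print(f"estimated age: {age_years:.0f} years")
```

Because the result scales inversely with the assumed rate, the choice between "evolutionary" and faster pedigree-based rates changes such age estimates severalfold, which is worth keeping in mind when reading the kya figures quoted above.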

Comments

The results of modern populations for N (especially N1c) subclades show really wide clusters and ancient TMRCA, consistent with their known ancient and wide distribution in northern and eastern Eurasian groups, and thus with infiltration of different lineages with eastern nomads (and northern Arctic populations) coupled with later bottlenecks, as well as acculturation of groups.

EDIT (2 APR): Interesting is the specific subclade to which ancient Mongolic-speaking Avars belong (information from Yfull): N1c-F4205 (TMRCA ca. 500 BC), a subclade of N1c-Y6058 (formed ca. 2800 BC, TMRCA ca. 2800 BC). This branch also gives the “European” branch N1c-CTS10760 (formed ca. 2800 BC, TMRCA ca. 2100 BC), and is a subclade of a branch of N1c-L392 (formed ca. 4400 BC, TMRCA ca. 2800 BC). A northern expansion of N1c-L392 is probably represented by its branch N1c-Z1936 (formed ca. 2800 BC, TMRCA ca. 2100 BC), the most likely candidate to appear in the Kola Peninsula in the Bronze Age as the Palaeo-Laplandic population (see here). Read more about potential routes of expansion of haplogroup N.

On the other hand, R1a-Z280 lineages form a tight cluster connecting Permic with Ugric groups, with R1a-Z93 showing early isolation (probably) between Cis-Urals and Trans-Urals regions. While both Corded Ware lineages in Finno-Ugrians are most likely related to the Abashevo expansion through Seima-Turbino and the Andronovo-like Horizon (and potentially later Eurasian expansions), a plausible hypothesis would be that Finno-Ugrians are related to an expansion of R1a-Z283 haplogroups (we already knew about the Finno-Permic connection), while the ancient connection between Permians and Hungarians with R1a-Z93 would correspond to this haplogroup’s potentially tighter link with an early Samoyedic split.

I don’t think that an explosive expansion of eastern Corded Ware groups of R1a-Z645 lineages will show a clear-cut division of haplogroups among Eastern Uralic groups, though, and culturally I doubt we will have such a clear image, either (similar to how the explosive expansion of Bell Beakers cannot be easily divided by regional/language group into R1b-L151 subclades before the known bottlenecks). Relevant in this regard are the known Z93 samples from the Árpád dynasty.

Nevertheless, this data may represent a slightly more recent wave of R1a-Z280 lineages linked to the expansion of Ugric into the Trans-Uralian region, after their split from Finno-Permic, still in close contact with Indo-Iranians in Poltavka and Sintashta-Potapovka, evident from the early and late Indo-Iranian borrowings, during a common period when Samoyedic had already separated.

Such a “Z283 over Z93” layer in the Trans-Urals (and Cis-Urals?) forest-steppes would be similar to the apparent replacement of Z284 by Z282 in the Eastern Baltic during the Bronze Age (possibly with the second or Estonian Battle Axe wave or, much more likely, during later population movements). Such an early R1a-Z93 split could potentially also be supported by the separation into bottlenecks under “Northern” (R1a-Z283) Finno-Ugric-speaking Abashevo-related groups and “Southern” (R1a-Z93) acculturated Indo-Iranian-speaking Abashevo migrants developing Sintashta-Potapovka and admixing with Poltavka R1b-Z2103 herders.

r1a-z282-z280-z2125-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the potential Finno-Ugric-associated distribution of Z282 (especially R1a-M558, a Z280 subclade), and the expansion of R1a-Z2123 subclades with Central Asian forest-steppe groups.

Conclusion

Let’s review some of the most common myths about Hungarians (and Finno-Ugrians in general) repeated ad nauseam, side by side with my assertions:

❌ N (especially N1c-Tat) in ancient and modern samples represent the True Uralic™ N1c peoples including Magyar tribes? Nope.

✅ Ancient N (especially N1c-Tat) lineages among Uralic populations expanded relatively recently, and differently in different regions (including eastern steppe nomads and northern arctic populations) not associated with a particular language or language group? Yep (read the series on Corded Ware = Uralic expansion).

❌ Modern Hungarian R1a-Z280 lineages represent the majority of the native population, poor Slavic ‘peasants’ from the Carpathian Basin, forcibly acculturated by a minority of bad bad Hungarian hordes? Nope.

✅ Modern Hungarian R1a-Z280 subclades represent Ugric lineages in common with ancient R1a-Z645 Finno-Ugric populations from north-eastern Europe and the Trans-Urals? Yep (see Avars and Ugrians).

❌ Modern Hungarian R1a-Z93 lineages represent acculturated Iranian/Turkic peoples from the steppes? Not likely.

✅ Modern Hungarian R1a-Z93 lineages represent a remnant of the expansion of Corded Ware to the east, potentially more clearly associated with Samoyedic? Much more likely.

finno-ugric-haplogroup-n
Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

Sooo, the theory of a “diluted” Y-DNA in Modern Hungarians from originally fully N-dominated conquerors subjugating native R1a-Z280 Slavs from the Carpathian Basin is not backed up by genetic studies? The ethnic Iranian-Turkic R1a-Z93 federation in the steppes that ended up speaking Magyar is not real?? Who would’ve thunk.

Another true story whose rejection in genetics could not be predicted, like, not at all.

Totally unexpected, too, the drift of “R1a=IE” fans with the newest genetic findings towards a Molgen-like “Yamna/R1b = Vasconic-Caucasian”, “N1c = Uralic-Altaic”, and “R1a = the origin of the white world in Mother Russia”. So much for the supposed interest in “Steppe ancestry” and fancy statistics.

The Tungusic Ulchi population probably linked to haplogroup C2b1a

ulchi-marital

New paper (behind paywall) Demographic and Genetic Portraits of the Ulchi Population, by Balanovska et al. Russian Journal of Genetics (2018) 54(10):1245–1253.

Interesting excerpts (emphasis mine):

Marital structure. The intensity of interethnic marriages puts the existence of the Ulchi population at risk. The colorful ethnic composition of the Ulchi settlements is reflected in the marriage structure [see featured image]. We found that the proportion of single-ethnic marriages of the Ulchi is on average 51%. The greatest number of such marriages takes place in the village of Bulava. Marriages of Ulchi with Russians are in second place. Marriages with indigenous peoples of the Far East, Nanais, Nivkhs, Evenks, and others, are in third place. Thus, almost half of the Ulchi marriages are with representatives of other nationalities. Such a significant level of interethnic mixing makes it possible to talk about intense processes of assimilation of this indigenous people and puts to the forefront the problem of loss of the unique gene pool of the Ulchi.

Haplogroup C (its branch M48) was genotyped for its five subbranches with markers M86, B470, F13686, B93, and the marker at position 16645386 (GRCh37), which was found by our team for the first time. Variant B93 is rare in the Ulchi, and 14 samples (that is, more than a quarter of the entire gene pool of the Ulchi, Fig. 2) belong to M86 and its subvariants. Therefore, we genotyped STR markers of C-M86 carriers for the Ulchi and neighboring Amur populations and analyzed the relationships of detected haplotypes on the phylogenetic network (Fig. 3, STR haplotypes are available from authors upon request).

(…) On the network, different clusters are associated with different populations: most Mongols belong to F13686, all Evenks of the Amur River region with this haplogroup form a subcluster within F13686, and part of Upper Nanais is the basis of cluster B470.

ulchi-y-chromosome
Frequencies of haplogroups of Y chromosome in the Ulchi population. The nomenclature of haplogroups is given according to [9]. Markers that are not in bold type were not typed, but are ancestral for these nodes.

An estimate of the age of the entire haplogroup C-F12355 obtained from the data of genome-wide sequencing of seven specimens is 2400 ± 500 years (O.P. Balanovsky, unpublished data). That is, the common ancestor of all the studied representatives of various peoples with this haplogroup lived not so long ago, the first millennium BC. The formation time of cluster F13686 is somewhat later: 1990 ± 600 years.

(…) obvious traces of the interaction of the gene pool of the Ulchi with neighboring and remote peoples of the Far East and Central Asia in the time range of the last one to three thousand years were revealed. This shows that the results of work [4] on the similarity of the gene pool of the ancient (age of 7500 years) Neolithic genomes of the Amur River region to the Ulchi probably indicate not the uniqueness of the Ulchi, but the fact that this ancient gene pool was preserved in a vast circle of populations of the Far East interwoven with gene flows both with each other and, to a lesser extent, with populations of Central Asia.

The expansion of C2b1a2a-M86 (among many basal C2-M217 samples) is thus possibly associated with the spread of Tungusic, which puts C2b1a at the root of the Micro-Altaic expansion, with a formation date ca. 12700 BC, TMRCA 12500 BC (and not only Mongolian). This shows that Micro-Altaic is connected with a local population which shows a clear continuity since at least 3500 BC. This, however, tells us little about the origin of the language.

See also the recent ISBA presentation on the Houtaomuga site, Neolithic transition in Northeast Asia; and also Bronze Age population dynamics and rise of dairy pastoralism in Mongolia, and Impact of colonization in north-eastern Siberia.

That leaves the ancestral N lineages found among Far East Asians as Palaeo-Siberian in origin, and their late expansions to the west not particularly linked with any of the known Palaeo-Siberian ethnolinguistic groups, let alone a supposed “Uralo-Altaic” language…

The Iron Age expansion of Southern Siberian groups and ancestry with Scythians

iron_age-sarmatians

Maternal genetic features of the Iron Age Tagar population from Southern Siberia (1st millennium BC), by Pilipenko et al. (2018).

Interesting excerpts (emphasis mine):

The positions of non-Tagar Iron Age groups in the MDS plot were correlated with their geographic position within the Eurasian steppe belt and with frequencies of Western and Eastern Eurasian mtDNA lineages in their gene pools. Series from chronological Tagar stages (similar to the overall Tagar series) were located within the genetic variability (in terms of mtDNA) of Scythian World nomadic groups (Figs 5 and 6; S4 and S6 Tables). Specifically, the Early Tagar series was more similar to western nomads (North Pontic Scythians), while the Middle Tagar was more similar to the Southern Siberian populations of the Scythian period. The Late Tagar group (Tes’ culture) belonging to the Early Xiongnu period had the “western-most” location on the MDS plot with the maximal genetic difference from Xiongnu and other eastern nomadic groups (but see Discussion concerning the low sample size for the Tes’ series).

In a comparison of our Tagar series with modern populations in Eurasia, we detected similarity between the Tagar group and some modern Turkic-speaking populations (with the exception of the Indo-Iranian Tajik population) (Fig 7; S2 Table). Among the modern Turkic-speaking groups, populations from the western part of the Eurasian steppe belt, such as Bashkirs from the Volga-Ural region and Siberian Tatars from the West Siberian forest-steppe zone, were more similar to the Tagar group than modern Turkic-speaking populations of the Altay-Sayan mountain system (including the Khakassians from the Minusinsk basin) (Fig 7).

tagar-archaeology
Location of Tagar archaeological sites from which samples for this study were obtained. Burial grounds: 1—Novaya Chernaya-1; 2—Podgornoe Ozero, Barsuchiha-1, Barsuchiha-6, Barsuchiha-7; 3—Perevozinskiy; 4—Ulug-Kyuzyur, Kichik-Kyuzyur, Sovetskaya Khakassiya; 5—Tepsey-3, Tepsey-8, Tepsey-9; 6—Dolgiy Kurgan. https://doi.org/10.1371/journal.pone.0204062.g001

Mitochondrial DNA diversity and genetic relationships of the Tagar population

Our results are not inconsistent with the assumption of a probable role of gene flow due to the migration from Western Eurasia to the Minusinsk basin in the Bronze Age in the formation of the genetic composition of the Tagar population. Particularly, we detected many mtDNA lineages/clusters with probable West Eurasian origin that were dominant in modern populations of different parts of Europe, Caucasus, and the Near East (such as K and HV6) in our Tagar series based on a phylogeographic analysis.

We detected relatively low genetic distances between our Tagar population and two Bronze Age populations from the Minusinsk basin—the Okunevo culture population (pre-Andronovo Bronze Age) and Andronovo culture population, followed by Afanasievo population from the Minusinsk Basin and Middle Bronze Age population from the Mongolian Altai Mountains (the region adjacent to the Minusinsk basin) (Figs 3 and 6; S3 and S5 Tables). Among West Eurasian part of our Tagar series we also observed haplogroups/sub-haplogroups and haplotypes shared with Early and Middle Bronze Age populations from Minusinsk Basin and western part of Eurasian steppe belt (Fig 4; S5 Table). Thus, our results suggested a potentially significant role of the genetic components, introduced by migrants from Western Eurasia during the Bronze Age, in the formation of the genetic composition of the Tagar population. It is necessary to note the relatively small size of available mtDNA samples from the Bronze Age populations of Minusinsk basin; accordingly, additional mtDNA data for these populations are required to further confirm our inference.

tagar-mtdna-tree
Phylogenetic tree of mtDNA lineages from the Tagar population. Color coding of the Tagar stages: orange—the Early Tagar stage; blue—the Middle Tagar Stage; green—the Late Tagar stage. Color of haplogroup labels: yellow—for Western Eurasian haplogroups; red—for Eastern Eurasian haplogroups. https://doi.org/10.1371/journal.pone.0204062.g002

Another substantial part of the mtDNA pool of the Tagar and other eastern populations of the Scythian World is typical of populations in Southern Siberia and adjacent regions of Central Asia (autochthonous Central Asian mtDNA clusters). Most of these components belong to the East Eurasian cluster of mtDNA haplogroups. Moreover, the role of each of these components in the formation of the genetic composition of subsequent (to the present) populations in South Siberia and Central Asia could be very different. In this regard, cluster C4a2a (and its subcluster C4a2a1), and haplogroup A8 are of particular interest.

Genetic features of successive Tagar groups

We compared successive Tagar groups (Early, Middle, and Late Tagar) with each other and with other Iron Age nomadic populations to evaluate changes in the mtDNA pool structure. Despite the genetic similarity between the Early and Middle Tagar series and Scythian World nomadic groups (Figs 5 and 6; S4 and S6 Tables), there were some peculiarities. For example, the Early Tagar series was more similar to North Pontic Classic Scythians, while the Middle Tagar samples were more similar to the Southern Siberian populations of the Scythian period (i.e., completely synchronous populations of regions neighboring the Minusinsk basin, such as the Pazyryk population from the Altay Mountains and Aldy-Bel population from Tuva).

We observed differences in the mtDNA pool structure between the Early and the Middle chronological stages of the Tagar culture population, as evidenced by the change in the ratio of Western to Eastern Eurasian mtDNA components. The contribution of Eastern Eurasian lineages increased from about one-third (34.8%) in the Early Tagar group to almost one-half (45.8%) in the Middle Tagar group.
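To get a feel for how much weight these percentages can bear, the following sketch converts them back to counts and attaches approximate 95% Wilson score intervals. The sample sizes (N = 46 Early, N = 24 Middle) are the ones quoted in the same excerpt; the counts are inferred by rounding, not reported in the text, and the function names are illustrative.

```python
import math


def implied_count(percent, n):
    """Recover the most likely integer count behind a rounded percentage."""
    return round(percent / 100 * n)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


early_n, middle_n = 46, 24                     # sample sizes from the excerpt
early_east = implied_count(34.8, early_n)      # inferred count, Early Tagar
middle_east = implied_count(45.8, middle_n)    # inferred count, Middle Tagar

early_ci = wilson_interval(early_east, early_n)
middle_ci = wilson_interval(middle_east, middle_n)

print(f"Early:  {early_east}/{early_n}, 95% CI {early_ci[0]:.2f}-{early_ci[1]:.2f}")
print(f"Middle: {middle_east}/{middle_n}, 95% CI {middle_ci[0]:.2f}-{middle_ci[1]:.2f}")
```

With samples this small the two intervals overlap considerably, so the shift in the East Eurasian share is suggestive rather than conclusive on its own.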

tagar-mtdna-fst
Results of multidimensional scaling based on matrix of Slatkin population differentiation (FST) according to frequencies of mtDNA haplogroup in Tagar populations and modern populations of Eurasia. Populations: Tagar (red pentagon) (this study); Mongolian-speaking populations: Khamnigans (Buryat Republic, Russia) [43]; Barghuts (Inner Mongolia, China) [44]; Buryats (Buryat Republic, Southern Siberia, Russia) [43]; Mongols (Mongolia) [45]. Turkic-speaking populations: Tuvinians (Tuva Republic, Russia) [43]; Tofalars (Irkutsk region, Russia) [46]; Altai-Kizhi (Altai Republic, Russia) [43, 47]; Telenghits (Altai Republic, Russia) [43, 47]; Tubalars (Altai Republic) [48]; Shors (Kemerovo region, Russia) [43, 47]; Khakassians (Khakassian Republic, Russia) [43, 46]; Altaian Kazakhs (Altai Republic) [49]; Kazakhs (Kazakhstan, Uzbekistan) [50, 51]; Kirghiz (Kyrgyzstan) [50, 51]; Uighurs (Kazakhstan and Xinjiang) [50, 52]; Siberian Tatars (Tyumen and Omsk regions, Russia) [53]; Tatars (Volga-Ural region, Russia) [54]; Bashkirs (Volga-Ural region, Russia) [55]; Uzbeks (Uzbekistan) [51, 56]; Turkmens (Turkmenistan) [51, 56]; Nogays [57]; Turks [58]; other populations: Evenks [43, 46]; Ulchi [59]; Koreans (South Korea) [43]; Han Chinese [60]; Zhuang (Guangxi, China) [61]; Tadjiks (Tadjikistan) [43, 51]; Iranians [60]; Russians [62]. https://doi.org/10.1371/journal.pone.0204062.g007

At the level of mtDNA haplogroups, we detected a decrease in the diversity of phylogenetic clusters during the transition from the Early Tagar to the Middle Tagar. This decline in diversity equally affected the West Eurasian and East Eurasian components of the Tagar mtDNA pool. It should be noted that this decrease can be partially explained by the smaller number of Middle Tagar than Early Tagar samples. Under a simple binomial approximation the mtDNA clusters, observed at frequencies of 6.3% and 11.7%, could be lost by chance in our Early (N = 46) and Middle (N = 24) Tagar samples, respectively. However, the simultaneous lack of several such clusters, with a total frequency in the gene pool of the Early group of 34.8%, is unlikely.
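The binomial argument in this excerpt is easy to reproduce. If a cluster has true frequency f, the chance that none of N sampled individuals carries it is (1 - f)^N. The sketch below inverts that formula to find the frequency at which a cluster would be missed with 5% probability; the 5% cut-off is an assumption on my part, chosen because it reproduces the quoted 6.3% and 11.7% figures, and the function names are illustrative.

```python
def prob_cluster_missed(freq, n):
    """Probability that a cluster at frequency `freq` is absent from n samples."""
    return (1.0 - freq) ** n


def detectable_frequency(n, alpha=0.05):
    """Frequency at which a cluster is missed with probability `alpha`.

    alpha=0.05 is an assumed threshold, not one stated in the excerpt.
    """
    return 1.0 - alpha ** (1.0 / n)


for label, n in (("Early", 46), ("Middle", 24)):
    print(label, n, round(100 * detectable_frequency(n), 1))
# prints:
# Early 46 6.3
# Middle 24 11.7

# Treating the missing clusters as one class with a joint frequency of 34.8%
# (a simplification), the chance of drawing none of them in 24 samples:
joint_miss = prob_cluster_missed(0.348, 24)
print(f"{joint_miss:.1e}")
```

The last value is far below any conventional threshold, which matches the authors' point that the joint absence of those clusters in the Middle Tagar sample is unlikely to be a sampling accident.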

The observed reduction in the genetic distance between the Middle Tagar population and other Scythian-like populations of Southern Siberia (Fig 5; S4 Table), in our opinion, is primarily associated with an increase in the role of East Eurasian mtDNA lineages in the gene pool (up to nearly half of the gene pool) and a substantial increase in the joint frequency of haplogroups C and D (from 8.7% in the Early Tagar series to 37.5% in the Middle Tagar series). These features are characteristic of many ancient and modern populations of Southern Siberia and adjacent regions of Central Asia, including the Pazyryk population of the Altai Mountains. We did not obtain strong evidence for an intensification of genetic contact between the population of the Minusinsk basin and the Altai Mountains in the Middle Tagar period compared with the Early Tagar period. Although, several archaeologists have found evidence for the intensification of contact at the level of material culture, namely, a cultural influence of the population of the Altai Mountains (represented by the Pazyryk population) on the population of the Minusinsk basin (the Saragash Tagar group) [6, 71, 72].

Another important issue is the change in the genetic structure of the Tagar population during the transition from the Middle (Saragash) to the Late (Tes’) stage. The Late Tagar stage refers to the Xiongnu period. Many archaeologists suggest that the formation of the Tes’ stage involved the direct cultural influence of the Xiongnu and/or related groups of nomads from more eastern regions of Central Asia [71, 73]. Some archaeologists have even suggested renaming the Tes’ stage in the Tes’ culture [71], emphasizing the role of new eastern cultural elements. If this influence also existed at the genetic level, then we would expect to observe new genetic elements in the Tes’ gene pool, particularly those of East Eurasian origin.

Siberian ancestry

Just a reminder of the recent session in ISBA 8 on expanding Scythians (and also Mongolians and Turks) spreading Siberian ancestry, usually (wrongly) identified as “Uralic-Yeniseian” based on modern populations (similar to how steppe ancestry is wrongly identified as “Indo-European”), see the following graphic including the Tagar population:

siberian-genetic-component-chronology
Very important observation with implication of population turnover is that pre-Turkic Inner Eurasian populations’ Siberian ancestry appears predominantly “Uralic-Yeniseian” in contrast to later dominance of “Tungusic-Mongolic” sort (which does sporadically occur earlier). Alexander M. Kim

And also the poster by Alexander M. Kim et al. Yeniseian hypotheses in light of genome-wide ancient DNA from historical Siberia:

The relevance of ancient DNA data to debates in historical linguistics is an emphatic strand in much recent work on the archaeogenetics of Eurasia, where the discussion has focused heavily on Indo-European (Haak et al. 2015; Narasimhan et al. 2018; de Barros Damgaard et al. 2018a,b). We present new genome-wide ancient DNA data from a historical Siberian individual in relation to Yeniseian, an isolated language “microfamily” (Vajda 2014) that nonetheless sits at the center of numerous controversial proposals in historical linguistics and cultural interaction. Yeniseian’s sole surviving representative is Ket, a critically endangered language fluently spoken by only a few dozen individuals near the Middle Yenisei River of Central Siberia.

In strong contrast to the present-day picture, river names and argued substrate influences and loanwords in languages outside the current range of Yeniseian, as well as direct records from the Russian colonial period, indicate that speakers of extinct Yeniseian languages had a formerly much broader presence in the taiga of Central Siberia as well as further south in the mountainous Altai-Sayan region – and perhaps even further afield in Inner Asia (Vajda 2010; Gorbachov 2017; Blažek 2016). The consilience of these proposals with genetic data is not straightforward (Flegontov et al. 2015, 2017) and faces a major obstacle in the lack of genetic information from verifiable speakers of Yeniseian languages other than the Kets, who have had complex ongoing interactions with speakers of non-Yeniseian languages such as the Samoyedic Selkups. We attempt to remedy this with new historical Siberian aDNA data, orienting our search for common denominators and systematic difference in a broader landscape of concordance, discordance, and uncertainty at the interface of diachronic linguistics and genetics.

Neolithic and Bronze Age Anatolia, Urals, Fennoscandia, Italy, and Hungary (ISBA 8, 20th Sep)

jena-isba8

I will post information on ISBA 8 sessions today as I see them on Twitter (see programme in PDF, and sessions from yesterday).

Official abstracts are listed first (emphasis mine), then reports and images and/or link to tweets. Here is the list for quick access:

Russian colonization in Yakutia

Exploring the genomic impact of colonization in north-eastern Siberia, by Seguin-Orlando et al.

Yakutia is the coldest region in the northern hemisphere, with winter record temperatures below minus 70°C. The ability of Yakut people to adapt both culturally and biologically to extremely cold temperatures has been key to their subsistence. They are believed to descend from an ancestral population, which left its original homeland in the Lake Baykal area following the Mongol expansion between the 13th and 15th centuries AD. They originally developed a semi-nomadic lifestyle, based on horse and cattle breeding, providing transportation, primary clothing material, meat, and milk. The early colonization by Russians in the first half of the 17th century AD, and their further expansion, have massively impacted indigenous populations. It led not only to massive epidemiological outbreaks, but also to an important dietary shift increasingly relying on carbohydrate-rich resources, and a profound lifestyle transition with the gradual conversion from Shamanism to Christianity and the establishment of new marriage customs. Leveraging an exceptional archaeological collection of more than a hundred of bodies excavated by MAFSO (Mission Archéologique Française en Sibérie Orientale) over the last 15 years and naturally kept frozen by the extreme cold temperatures of Yakutia, we have started to characterize the (epi)genome of indigenous individuals who lived from the 16th to the 20th century AD. Current data include the genome sequence of approximately 50 individuals that lived prior to and after Russian contact, at a coverage from 2 to 40 fold. Combined with data from archaeology and physical anthropology, as well as microbial DNA preserved in the specimens, our unique dataset is aimed at assessing the biological consequences of the social and biological changes undergone by the Yakut people following their neolithisation by Russian colons.

NOTE: For another interesting study on Yakutian tribes, see Relationships between clans and genetic kin explain cultural similarities over vast distances.

Ancient DNA from a Medieval trading centre in Northern Finland

Using ancient DNA to identify the ancestry of individuals from a Medieval trading centre in Northern Finland, by Simoes et al.

Analyzing genomic information from archaeological human remains has proved to be a powerful approach to understand human history. For the archaeological site of Ii Hamina, ancient DNA can be used to infer the ancestries of individuals buried there. Situated approximately 30 km from Oulu, in Northern Finland, Ii Hamina was an important trade place since Medieval times. The historical context indicates that the site could have been a melting pot for different cultures and people of diversified genetic backgrounds. Archaeological and osteological evidence from different individuals suggest a rich diversity. For example, stable isotope analyses indicate that freshwater and marine fish was the dominant protein source for this population. However, one individual proved to be an outlier, with a diet containing relatively more terrestrial meat or vegetables. The variety of artefacts that was found associated with several human remains also points to potential differences in religious beliefs or social status. In this study, we aimed to investigate if such variation could be attributed to different genetic ancestries. Ten of the individuals buried in Ii Hamina’s churchyard, dating to between the 15th and 17th century AD, were screened for presence of authentic ancient DNA. We retrieved genome-wide data for six of the individuals and performed downstream analysis. Data authenticity was confirmed by DNA damage patterns and low estimates of mitochondrial contamination. The relatively recent age of these human remains allows for a direct comparison to modern populations. A combination of population genetics methods was undertaken to characterize their genetic structure, and identify potential familial relationships. We found a high diversity of mitochondrial lineages at the site. In spite of the putatively distant origin of some of the artifacts, most individuals shared a higher affinity to the present-day Finnish or Late Settlement Finnish populations. Interestingly, different methods consistently suggested that the individual with outlier isotopic values had a different genetic origin, being more closely related to reindeer herding Saami. Here we show how data from different sources, such as stable isotopes, can be intersected with ancient DNA in order to get a more comprehensive understanding of the human past.

A closer look at the bottom left corner of the poster (the left columns are probably the new samples):

finland-medieval-admixture

Plant resources processed in HG pottery from the Upper Volga

Multiple criteria for the detection of plant resources processed in hunter-gatherer pottery vessels from the Upper Volga, Russia, by Bondetti et al.

In Northern Eurasia, the Neolithic is marked by the adoption of pottery by hunter-gatherer communities. The degree to which this is related to wider social and lifestyle changes is subject to ongoing debate and the focus of a new research programme. The use and function of early pottery by pre-agricultural societies during the 7th-5th millennia BC is of central interest to this debate. Organic residue analysis provides important information about pottery use. This approach relies on the identification and isotopic characteristics of lipid biomarkers, absorbed into the pores of the ceramic or charred deposits adhering to pottery vessel surfaces, using a combined methodology, namely GC-MS, GC-c-IRMS and EA-IRMS. However, while animal products (e.g., marine, freshwater, ruminant, porcine) have the benefit of being lipid-rich and well-characterised at the molecular and isotopic level, the identification of plant resources still suffers from a lack of specific criteria for identification. In hunter-gatherer contexts this problem is exacerbated by the wide range of wild, foraged plant resources that may have been potentially exploited. Here we evaluate approaches for the characterisation of terrestrial plant food in pottery through the study of pottery assemblages from Zamostje 2 and Sakhtysh 2a, two hunter-gatherer settlements located in the Upper Volga region of Russia.

GC-MS analysis of the lipids, extracted from the ceramics and charred residues by acidified methanol, suggests that pottery use was primarily oriented towards terrestrial and aquatic animal products. However, while many of the Early Neolithic vessels contain lipids distinctive of freshwater resources, triterpenoids are also present in high abundance, suggesting mixing with plant products. When considering the isotopic criteria, we suggest that plants were a major commodity processed in pottery at this time. This is supported by the microscopic identification of Viburnum (Viburnum opulus L.) berries in the charred deposits on several vessels from Zamostje.

The study of Upper Volga pottery demonstrated the importance of using a multidisciplinary approach to determine the presence of plant resources in vessels. Furthermore, this informs the selection of samples, often subject to freshwater reservoir effects, for 14C dating.

Studies on hunter-gatherer pottery – appearing in eastern Europe before Middle Eastern Neolithic pottery – may be important to understand the arrival of R1a-M17 lineages in the region before ca. 7000 BC. Or not: right now it is not very clear what happened with R1b-P297 and R1a-M17, or with WHG–EHG–ANE ancestry.

Bronze Age population dynamics and the rise of dairy pastoralism on the eastern Eurasian steppe

Bronze Age population dynamics and the rise of dairy pastoralism on the eastern Eurasian steppe, by Warinner et al.

Recent paleogenomic studies have shown that migrations of Western steppe herders (WSH), beginning in the Eneolithic (ca. 3300-2700 BCE), profoundly transformed the genes and cultures of Europe and Central Asia. Compared to Europe, the eastern extent of this WSH expansion is not well defined. Here we present genomic and proteomic data from 22 directly dated Bronze Age khirigsuur burials from Khövsgöl, Mongolia (ca. 1380-975 BCE). Only one individual showed evidence of WSH ancestry, despite the presence of WSH populations in the nearby Altai-Sayan region for more than a millennium. At the same time, LC-MS/MS analysis of dental calculus provides direct protein evidence of milk consumption from Western domesticated livestock in 7 of 9 individuals. Our results show that dairy pastoralism was adopted by Bronze Age Mongolians despite minimal genetic exchange with Western steppe herders.

Detail of the images:

mongol-bronze-age-pca

mongol-bronze-age-f4-ancestry