Yamnaya ancestry: mapping the Proto-Indo-European expansions

steppe-ancestry-expansion-europe

The latest papers from Ning et al. Cell (2019) and Anthony JIES (2019) have offered some interesting new data, supporting once more what could be inferred since 2015, and what was evident in population genomics since 2017: that Proto-Indo-Europeans expanded under R1b bottlenecks, and that the so-called “Steppe ancestry” referred to two different components, one – Yamnaya or Steppe_EMBA ancestry – expanding with Proto-Indo-Europeans, and the other one – Corded Ware or Steppe_MLBA ancestry – expanding with Uralic speakers.

The following maps are based on formal stats published in the papers and supplementary materials from 2015 until today, mainly on Wang et al. (2018 & 2019), Mathieson et al. (2018) and Olalde et al. (2018), and others like Lazaridis et al. (2016), Lazaridis et al. (2017), Mittnik et al. (2018), Lamnidis et al. (2018), Fernandes et al. (2018), Jeong et al. (2019), Olalde et al. (2019), etc.

NOTE. As in the Corded Ware ancestry maps, the selected reports in this case are centered on the prototypical Yamnaya ancestry vs. other simplified components, so everything else refers to simplistic ancestral components widespread across populations that do not necessarily share any recent connection, much less a language. In fact, most of the time they clearly didn’t. They can be interpreted as “EHG that is not part of the Yamnaya component”, or “CHG that is not part of the Yamnaya component”. They can’t be read as “expanding EHG people/language” or “expanding CHG people/language”, at least no more than maps of “Steppe ancestry” can be read as “expanding Steppe people/language”. Also, remember that I have left the default behaviour for color classification, so that the highest value (i.e. 1, or white colour) could mean anything from 10% to 100% depending on the specific ancestry and period; that’s what the legend is for… But, fere libenter homines id quod volunt credunt (“men willingly believe what they wish”).
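
The colour-scaling caveat in the note above can be illustrated with a minimal sketch. It assumes the default classification behaves like a per-map min-max stretch; that is an assumption, not a description of the mapping software actually used. The function name `stretch_to_unit` and the sample proportions are hypothetical.

```python
def stretch_to_unit(values):
    """Rescale values so the smallest maps to 0.0 and the largest to 1.0.

    Assumption: this mimics a default per-map colour stretch, where 1.0
    is drawn as the brightest colour whatever the real maximum is.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat map has no contrast to stretch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical ancestry proportions (fractions of 1) for two different maps.
low_range_map = [0.00, 0.05, 0.10]   # maximum is only 10%
high_range_map = [0.00, 0.50, 1.00]  # maximum is 100%

# After stretching, both maps use the full colour range,
# so the brightest colour means 10% in one and 100% in the other.
print(stretch_to_unit(low_range_map))
print(stretch_to_unit(high_range_map))
```

Under this assumption the two maps end up with identical colour ramps, which is why the legend has to be read before comparing maps.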

Sections:

  1. Neolithic or the formation of Early Indo-European
  2. Eneolithic or the expansion of Middle Proto-Indo-European
  3. Chalcolithic / Early Bronze Age or the expansion of Late Proto-Indo-European
  4. European Early Bronze Age and MLBA or the expansion of Late PIE dialects

1. Neolithic

Anthony (2019) agrees with the most likely explanation of the CHG component found in Yamnaya, as derived from steppe hunter-fishers close to the lower Volga basin. The ultimate origin of this specific CHG-like component that eventually formed part of the Pre-Yamnaya ancestry is not clear, though:

The hunter-fisher camps that first appeared on the lower Volga around 6200 BC could represent the migration northward of un-admixed CHG hunter-fishers from the steppe parts of the southeastern Caucasus, a speculation that awaits confirmation from aDNA.

neolithic-chg-ancestry
Natural neighbor interpolation of CHG ancestry among Neolithic populations. See full map.

The typical EHG component that eventually formed part of Pre-Yamnaya ancestry came from the Middle Volga Basin, most likely close to the Samara region, as shown by the sampled Samara hunter-gatherer (ca. 5600-5500 BC):

After 5000 BC domesticated animals appeared in these same sites in the lower Volga, and in new ones, and in grave sacrifices at Khvalynsk and Ekaterinovka. CHG genes and domesticated animals flowed north up the Volga, and EHG genes flowed south into the North Caucasus steppes, and the two components became admixed.

neolithic-ehg-ancestry
Natural neighbor interpolation of EHG ancestry among Neolithic populations. See full map.

To the west, in the Dnieper-Dniester area, WHG became the dominant ancestry after the Mesolithic, at the expense of EHG, revealing a likely mating network reaching to the north into the Baltic:

Like the Mesolithic and Neolithic populations here, the Eneolithic populations of Dnieper-Donets II type seem to have limited their mating network to the rich, strategic region they occupied, centered on the Rapids. The absence of CHG shows that they did not mate frequently if at all with the people of the Volga steppes (…)

neolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Neolithic populations. See full map.

North-West Anatolia Neolithic ancestry, proper of expanding Early European farmers, is found up to the border of the Dniester, as Anthony (2007) had predicted.

neolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Neolithic populations. See full map.

2. Eneolithic

From Anthony (2019):

After approximately 4500 BC the Khvalynsk archaeological culture united the lower and middle Volga archaeological sites into one variable archaeological culture that kept domesticated sheep, goats, and cattle (and possibly horses). In my estimation, Khvalynsk might represent the oldest phase of PIE.

(…) this middle Volga mating network extended down to the North Caucasian steppes, where at cemeteries such as Progress-2 and Vonyuchka, dated 4300 BC, the same Khvalynsk-type ancestry appeared, an admixture of CHG and EHG with no Anatolian Farmer ancestry, with steppe-derived Y-chromosome haplogroup R1b. These three individuals in the North Caucasus steppes had higher proportions of CHG, overlapping Yamnaya. Without any doubt, a CHG population that was not admixed with Anatolian Farmers mated with EHG populations in the Volga steppes and in the North Caucasus steppes before 4500 BC. We can refer to this admixture as pre-Yamnaya, because it makes the best currently known genetic ancestor for EHG/CHG R1b Yamnaya genomes.

From Wang et al (2019):

Three individuals from the sites of Progress 2 and Vonyuchka 1 in the North Caucasus piedmont steppe (‘Eneolithic steppe’), which harbour EHG and CHG related ancestry, are genetically very similar to Eneolithic individuals from Khvalynsk II and the Samara region. This extends the cline of dilution of EHG ancestry via CHG-related ancestry to sites immediately north of the Caucasus foothills

eneolithic-pre-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Eneolithic populations. See full map. This map corresponds roughly to the map of Khvalynsk-Novodanilovka expansion, and in particular to the expansion of horse-head pommel-scepters (read more about Khvalynsk, and specifically about horse symbolism).

NOTE. Unpublished samples from Ekaterinovka have been previously reported as within the R1b-L23 tree. Interestingly, although the Varna outlier is a female, the Balkan outlier from Smyadovo shows two positive SNP calls for hg. R1b-M269. However, its poor coverage makes its most conservative haplogroup prediction R-M343.

The formation of this Pre-Yamnaya ancestry sets this Volga-Caucasus Khvalynsk community apart from the rest of the EHG-like population of eastern Europe.

eneolithic-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Eneolithic populations. See full map.

Anthony (2019) seems to rely on ADMIXTURE graphics when he writes that the late Sredni Stog sample from Alexandria shows “80% Khvalynsk-type steppe ancestry (CHG&EHG)”. While this seems the most logical conclusion of what might have happened after the Suvorovo-Novodanilovka expansion through the North Pontic steppes (see my post on “Steppe ancestry” step by step), formal stats have not confirmed that.

In fact, analyses published in Wang et al. (2019) rejected that Corded Ware groups are derived from this Pre-Yamnaya ancestry, a reality that had been already hinted in Narasimhan et al. (2018), when Steppe_EMBA showed a poor fit for expanding Srubna-Andronovo populations. Hence the need to consider the whole CHG component of the North Pontic area separately:

eneolithic-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Eneolithic populations. See full map. You can read more about population movements in the late Sredni Stog and closer to the Proto-Corded Ware period.

NOTE. Fits for WHG + CHG + EHG in Neolithic and Eneolithic populations are taken in part from Mathieson et al. (2018) supplementary materials (download Excel here). Unfortunately, while data on the Ukraine_Eneolithic outlier from Alexandria abounds, I don’t have specific data on the so-called ‘outlier’ from Dereivka compared to the other two analyzed together, so these maps of CHG and EHG expansion possibly show a smaller distribution to the west than the real one ca. 4000-3500 BC.

eneolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Eneolithic populations. See full map.

Anatolia Neolithic ancestry clearly spread to the east into the north Pontic area through a Middle Eneolithic mating network, most likely opened after the Khvalynsk expansion:

eneolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Eneolithic populations. See full map.
eneolithic-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Eneolithic populations. See full map.

Regarding Y-chromosome haplogroups, Anthony (2019) insists on the evident association of Khvalynsk, Yamnaya, and the spread of Pre-Yamnaya and Yamnaya ancestry with the expansion of elite R1b-L754 (and some I2a2) individuals:

eneolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Early Eneolithic in the Pontic-Caspian steppes. See full map, and see culture, ADMIXTURE, Y-DNA, and mtDNA maps of the Early Eneolithic and Late Eneolithic.

3. Early Bronze Age

Data from Wang et al. (2019) show that Corded Ware-derived populations do not have good fits for Eneolithic_Steppe-like ancestry, no matter the model. In other words: Corded Ware populations show not only a higher contribution of Anatolia Neolithic ancestry (ca. 20-30% compared to the ca. 2-10% of Yamnaya); they show a different EHG + CHG combination compared to the Pre-Yamnaya one.

eneolithic-steppe-best-fits
Supplementary Table 13. P values of rank=2 and admixture proportions in modelling Steppe ancestry populations as a three-way admixture of Eneolithic steppe, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Test, Eneolithic_steppe, Anatolian_Neolithic, WHG.
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Yamnaya Kalmykia and Afanasevo show the closest fits to the Eneolithic population of the North Caucasian steppes, thus rejecting sizeable contributions from Anatolia Neolithic and/or WHG, as shown by the SD values. Both then probably show a Pre-Yamnaya ancestry closest to the late Repin population.

wang-eneolithic-steppe-caucasus-yamnaya
Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional AF ancestry in Steppe groups and additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups. See tables above. Modified from Wang et al. (2019). Within a blue square, Yamnaya-related groups; within a cyan square, Corded Ware-related groups. Green background behind best p-values. In red circle, SD of AF/WHG ancestry contribution in Afanasevo and Yamnaya Kalmykia, with ranges that almost include 0%.

EBA maps include data from Wang et al. (2018) supplementary materials, specifically unpublished Yamnaya samples from Hungary that appeared in the analyses of the preprint, but which were taken out of the definitive paper. Their location among Yamnaya settlers from Hungary is speculative, although most uncovered kurgans in Hungary are concentrated in the Tisza-Danube interfluve.

eba-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Early Bronze Age populations. See full map. This map corresponds roughly with the known expansion of late Repin/Yamnaya settlers.

The Y-chromosome bottleneck of elite males from Proto-Indo-European clans under R1b-L754 and some I2a2 subclades, already visible in the Khvalynsk sampling, became even more noticeable in the subsequent expansion of late Repin/early Yamnaya elites under R1b-L23 and I2a-L699:

chalcolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Yamnaya expansion. See full map and maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Chalcolithic and Yamnaya Hungary.

Maps of CHG, EHG, Anatolia Neolithic, and probably WHG show the expansion of these components among Corded Ware-related groups in North Eurasia, apart from other cultures close to the Caucasus:

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you can read the post Corded Ware ancestry in North Eurasia and the Uralic expansion.

eba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Early Bronze Age populations. See full map.
eba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Early Bronze Age populations. See full map.
eba-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Early Bronze Age populations. See full map.
eba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Early Bronze Age populations. See full map.
eba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Early Bronze Age populations. See full map.

4. Middle to Late Bronze Age

The following maps show the most likely distribution of Yamnaya ancestry during the Bell Beaker-, Balkan-, and Sintashta-Potapovka-related expansions.

4.1. Bell Beakers

The amount of Yamnaya ancestry is probably overestimated among populations where Bell Beakers replaced Corded Ware. A map of Yamnaya ancestry among Bell Beakers gets trickier for the following reasons:

  • Expanding Repin peoples of Pre-Yamnaya ancestry must have had admixture through exogamy with late Sredni Stog/Proto-Corded Ware peoples during their expansion into the North Pontic area, and Sredni Stog in turn had probably some Pre-Yamnaya admixture, too (although they don’t appear in the simplistic formal stats above). This is supported by the increase of Anatolia farmer ancestry in more western Yamnaya samples.
  • Later, Yamnaya admixed through exogamy with Corded Ware-like populations in Central Europe during their expansion. Even samples from the Middle to Upper Danube and around the Lower Rhine will probably show increasing contributions of Steppe_MLBA, at the same time as they show an increasing proportion of EEF-related ancestry.
  • To complicate things further, the late Corded Ware Esperstedt family (from ca. 2500 BC or later) shows, in turn, what seems like a recent admixture with Yamnaya vanguard groups, with the sample of highest Yamnaya ancestry being the paternal uncle of other individuals (all of hg. R1a-M417). This suggests that there might have been many similar Central European mating networks from the mid-3rd millennium BC on, in which (mainly) Yamnaya-like R1b elites displaying a small proportion of CW-like ancestry admixed through exogamy with Corded Ware-like peoples who already had some Yamnaya ancestry.
mlba-yamnaya-ancestry
Natural neighbor interpolation of Yamnaya ancestry among Middle to Late Bronze Age populations (Esperstedt CWC site close to BK_DE, label is hidden by BK_DE_SAN). See full map. You can see how this map correlates with the map of Late Copper Age migrations and Yamnaya into Bell Beaker expansion.

NOTE. Terms like “exogamy”, “male-driven migration”, and “sex bias” are not only based on the Y-chromosome bottlenecks visible in the different cultural expansions since the Palaeolithic. Despite the scarce sampling available in 2017 for analysis of “Steppe ancestry”-related populations, it appeared to show already a male sex bias in Goldberg et al. (2017), and it has been confirmed for Neolithic and Copper Age population movements in Mathieson et al. (2018) – see Supplementary Table 5. The analysis of male-biased expansion of “Steppe ancestry” in CWC Esperstedt and Bell Beaker Germany is, for the reasons stated above, not very useful to distinguish their mutual influence, though.

Based on data from Olalde et al. (2019), Bell Beakers from Germany are the closest sampled ones to expanding East Bell Beakers, and those close to the Rhine – i.e. French, Dutch, and British Beakers in particular – show a clear excess “Steppe ancestry” due to their exogamy with local Corded Ware groups:

Only one 2-way model fits the ancestry in Iberia_CA_Stp with P-value>0.05: Germany_Beaker + Iberia_CA. Finding a Bell Beaker-related group as a plausible source for the introduction of steppe ancestry into Iberia is consistent with the fact that some of the individuals in the Iberia_CA_Stp group were excavated in Bell Beaker associated contexts. Models with Iberia_CA and other Bell Beaker groups such as France_Beaker (P-value=7.31E-06), Netherlands_Beaker (P-value=1.03E-03) and England_Beaker (P-value=4.86E-02) failed, probably because they have slightly higher proportions of steppe ancestry than the true source population.
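
The pass/fail criterion in the excerpt above (P-value > 0.05) can be sketched as follows. The three P-values are the ones quoted; the exact P-value of the passing Germany_Beaker + Iberia_CA model is not given in the excerpt, so it is left out. The names `ALPHA`, `reported_p` and `passes` are hypothetical helpers, not part of qpAdm.

```python
# Significance threshold used in the quoted excerpt (P-value > 0.05 fits).
ALPHA = 0.05

# P-values quoted above for Iberia_CA + <Beaker group> two-way models.
reported_p = {
    "France_Beaker": 7.31e-06,
    "Netherlands_Beaker": 1.03e-03,
    "England_Beaker": 4.86e-02,
}


def passes(p_value, alpha=ALPHA):
    """A model is kept only if its P-value exceeds the threshold."""
    return p_value > alpha


rejected = [name for name, p in reported_p.items() if not passes(p)]
print(rejected)
```

All three quoted models fall below the threshold, with England_Beaker the closest to passing, which is in line with the excerpt's remark that these groups have only slightly higher proportions of steppe ancestry than the true source.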

olalde-iberia-chalcolithic

The exogamy with Corded Ware-like groups in the Lower Rhine Basin seems at this point undeniable, as is the origin of Bell Beakers around the Middle-Upper Danube Basin from Yamnaya Hungary.

To avoid this excess “Steppe ancestry” showing up in the maps, since Bell Beakers from Germany pack the most Yamnaya ancestry among East Bell Beakers outside Hungary (ca. 51.1% “Steppe ancestry”), I equated this maximum with BK_Scotland_Ach (which shows ca. 61.1% “Steppe ancestry”, highest among western Beakers), and applied a simple rule of three for “Steppe ancestry” in Dutch and British Beakers.
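
One possible reading of this rule of three, as a minimal sketch: the value of BK_Scotland_Ach (ca. 61.1%) is equated with the German maximum (ca. 51.1%), and every other Dutch or British value is scaled by the same ratio. The function name `adjust_steppe` and the 50.0% input are hypothetical; only the two reference percentages come from the text.

```python
# Reference values taken from the paragraph above (percent "Steppe ancestry").
GERMANY_BEAKER_MAX = 51.1    # highest among East Bell Beakers outside Hungary
SCOTLAND_ACH_MAX = 61.1      # BK_Scotland_Ach, highest among western Beakers


def adjust_steppe(raw_pct):
    """Rule of three: 61.1 corresponds to 51.1, so raw_pct corresponds to x."""
    return raw_pct * GERMANY_BEAKER_MAX / SCOTLAND_ACH_MAX


# The reference sample itself is mapped onto the German maximum.
print(round(adjust_steppe(SCOTLAND_ACH_MAX), 1))

# A hypothetical Dutch or British Beaker group with 50.0% "Steppe ancestry".
print(round(adjust_steppe(50.0), 1))
```

This is a uniform proportional rescaling, so it preserves the ranking among the adjusted groups while capping them at the German value.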

NOTE. Formal stats for “Steppe ancestry” in Bell Beaker groups are available in Olalde et al. (2018) supplementary materials (PDF). I didn’t apply this adjustment to Bk_FR groups because of the R1b Bell Beaker sample from the Champagne/Alsace region reported by Samantha Brunel that will pack more Yamnaya ancestry than any other sampled Beaker to date, hence probably driving the Yamnaya ancestry up in French samples.

The most likely outcome in the following years, when Yamnaya and Corded Ware ancestry are investigated separately, is that Yamnaya ancestry will be much lower the farther away from the Middle and Lower Danube region, similar to the case in Iberia, so the map above probably overestimates this component in most Beakers to the north of the Danube. Even the late Hungarian Beaker samples, who pack the highest Yamnaya ancestry (up to 75%) among Beakers, likely represent a back-migration of Moravian Beakers, and will probably show a contribution of Corded Ware ancestry due to the exogamy with local Moravian groups.

Despite this decreasing admixture as Bell Beakers spread westward, the explosive expansion of Yamnaya R1b male lineages (in the words of David Reich) and the radical replacement of local ones – whether derived from Corded Ware or Neolithic groups – shows the true extent of the North-West Indo-European expansion in Europe:

chalcolithic-late-y-dna
Y-DNA haplogroups in West Eurasia during the Bell Beaker expansion. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Late Copper Age and of the Yamnaya-Bell Beaker transition.

4.2. Palaeo-Balkan

There is scarce data on Palaeo-Balkan movements yet, although it is known that:

  1. Yamnaya ancestry appears among Mycenaeans, with the Yamnaya Bulgaria sample being its best current ancestral fit;
  2. the emergence of steppe ancestry and R1b-M269 in the eastern Mediterranean was associated with Ancient Greeks;
  3. Thracians, Albanians, and Armenians also show R1b-M269 subclades and “Steppe ancestry”.

4.3. Sintashta-Potapovka-Filatovka

Interestingly, Potapovka is the only Corded Ware-derived culture that shows good fits for Yamnaya ancestry, despite having replaced Poltavka in the region under the same Corded Ware-like (Abashevo) influence as Sintashta.

This proves that there was a period of admixture in the Pre-Proto-Indo-Iranian community between CWC-like Abashevo and Yamnaya-like Catacomb-Poltavka herders in the Sintashta-Potapovka-Filatovka community, probably more easily detectable in this group because of the specific temporal and geographic sampling available.

srubnaya-yamnaya-ehg-chg-ancestry
Supplementary Table 14. P values of rank=3 and admixture proportions in modelling Steppe ancestry populations as a four-way admixture of distal sources EHG, CHG, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Steppe cluster, EHG, CHG, WHG, Anatolian_Neolithic
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Srubnaya ancestry shows a best fit with non-Pre-Yamnaya ancestry, i.e. with different CHG + EHG components – possibly because the more western Potapovka (ancestral to Proto-Srubnaya Pokrovka) also showed good fits for it. Srubnaya shows poor fits for Pre-Yamnaya ancestry probably because Corded Ware-like (Abashevo) genetic influence increased during its formation.

On the other hand, more eastern Corded Ware-derived groups like Sintashta and its more direct offshoot Andronovo show poor fits with this model, too, but their fits are still better than those including Pre-Yamnaya ancestry.

mlba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Middle to Late Bronze Age populations. See full map.
mlba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Middle to Late Bronze Age populations. See full map.

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you should read the post Corded Ware ancestry in North Eurasia and the Uralic expansion instead.

The bottleneck of Proto-Indo-Iranians under R1a-Z93 was not yet complete by the time when the Sintashta-Potapovka-Filatovka community expanded with the Srubna-Andronovo horizon:

early-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the European Early Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Bronze Age.

4.4. Afanasevo

At the end of the Afanasevo culture, at least three samples show hg. Q1a2-M25 (ca. 2900-2500 BC), which seemed to point to a resurgence of local lineages, despite continuity of the prototypical Pre-Yamnaya ancestry. On the other hand, Anthony (2019) makes this cryptic statement:

Yamnaya men were almost exclusively R1b, and pre-Yamnaya Eneolithic Volga-Caspian-Caucasus steppe men were principally R1b, with a significant Q1a minority.

Since the only available samples from the Khvalynsk community are R1b (x3), Q1a (x1), and R1a (x1), it seems strange that Anthony would talk about a “significant minority”, unless Q1a pops up in some more individuals among the ca. 30 new ones to be published. Because he also mentions I2a2 as appearing in one elite burial, it seems Q1a (like R1a-M459) will not appear under elite kurgans, although it is still possible that hg. Q1a was involved in the expansion of Afanasevo to the east.

middle-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the Middle Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Middle Bronze Age and the Late Bronze Age.

Okunevo, which replaced Afanasevo in the Altai region, shows a majority of hg. Q1a2-M25, and at least one Q1a1-B284, but also some R1b-M269 samples proper of Afanasevo, suggesting partial genetic continuity.

NOTE. Other sampled Siberian populations clearly show a variety of Q subclades that likely expanded during the Palaeolithic, such as Baikal EBA samples from Ust’Ida and Shamanka with a majority of Q1a2-M25 (in particular Q1a2-L712), and hg. Q reported from Elunino, Sagsai, Khövsgöl, and also among peoples of the Srubna-Andronovo horizon (the Krasnoyarsk MLBA outlier), and in Karasuk. Q1a-M25 was earlier found in a Baltic hunter-gatherer, which supports a widespread distribution of Q1a2 and Q1a1 in North Eurasia during the Neolithic and Bronze Age.

From Damgaard et al. Science (2018):

(…) in contrast to the lack of identifiable admixture from Yamnaya and Afanasievo in the CentralSteppe_EMBA, there is an admixture signal of 10 to 20% Yamnaya and Afanasievo in the Okunevo_EMBA samples, consistent with evidence of western steppe influence. This signal is not seen on the X chromosome (qpAdm P value for admixture on X 0.33 compared to 0.02 for autosomes), suggesting a male-derived admixture, also consistent with the fact that 1 of 10 Okunevo_EMBA males carries a R1b1a2a2 Y chromosome related to those found in western pastoralists. In contrast, there is no evidence of western steppe admixture among the more eastern Baikal region Bronze Age (~2200 to 1800 BCE) samples.

This Yamnaya ancestry has been also recently found to be the best fit for the Iron Age population of Shirenzigou in Xinjiang – where Tocharian languages were attested centuries later – despite the haplogroup diversity acquired during their evolution, likely through an intermediate Chemurchek culture (see a recent discussion on the elusive Proto-Tocharians).

Haplogroup diversity seems to be common in Iron Age populations all over Eurasia, most likely due to the spread of different types of sociopolitical structures where alliances played a more relevant role in the expansion of peoples. A well-known example of this is the spread of Akozino warrior-traders in the whole Baltic region under a partial N1a-VL29-bottleneck associated with the emerging chiefdom-based systems under the influence of expanding steppe nomads.

early-iron-age-y-dna
Y-DNA haplogroups in West Eurasia during the Early Iron Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Iron Age and Late Iron Age.

Surprisingly, then, Proto-Tocharians from Shirenzigou pack up to 74% Yamnaya ancestry, in spite of the 2,000 years that separate them from the demise of the Afanasevo culture. They show more Yamnaya ancestry than any other population by that time, being thus a sort of Late PIE fossils not only in their archaic dialect, but also in their genetic profile:

shirenzigou-afanasievo-yamnaya-andronovo-srubna-ulchi-han

The recent intrusion of Corded Ware-like ancestry, as well as the variable admixture with Siberian and East Asian populations, both point to the known intense Old Iranian and Old/Middle Chinese contacts. The scarce Proto-Samoyedic and Proto-Turkic loans in Tocharian suggest a rather loose, probably more distant connection with East Uralic and Altaic peoples from the forest-steppe and steppe areas to the north (read more about external influences on Tocharian).

Interestingly, both R1b samples, MO12 and M15-2 – likely of Asian R1b-PH155 branch – show a best fit for Andronovo/Srubna + Hezhen/Ulchi ancestry, suggesting a likely connection with Iranians to the east of Xinjiang, who later expanded as the Wusun and Kangju. How they might have been related to Huns and Xiongnu individuals, who also show this haplogroup, is yet unknown, although Huns also show hg. R1a-Z93 (probably most R1a-Z2124) and Steppe_MLBA ancestry, earlier associated with expanding Iranian peoples of the Srubna-Andronovo horizon.

All in all, it seems that prehistoric movements explained through the lens of genetic research fit perfectly well the linguistic reconstruction of Proto-Indo-European and Proto-Uralic.

Related

Uralic speakers formed clines of Corded Ware ancestry with WHG:ANE populations

steppe-forest-tundra-biomes-uralic

The preprint by Jeong et al. (2018) has been published: The genetic history of admixture across inner Eurasia Nature Ecol. Evol. (2019).

Interesting excerpts, referring mainly to Uralic peoples (emphasis mine):

A model-based clustering analysis using ADMIXTURE shows a similar pattern (Fig. 2b and Supplementary Fig. 3). Overall, the proportions of ancestry components associated with Eastern or Western Eurasians are well correlated with longitude in inner Eurasians (Fig. 3). Notable outliers include known historical migrants such as Kalmyks, Nogais and Dungans. The Uralic- and Yeniseian-speaking populations, as well as Russians from multiple locations, derive most of their Eastern Eurasian ancestry from a component most enriched in Nganasans, while Turkic/Mongolic speakers have this component together with another component most enriched in populations from the Russian Far East, such as Ulchi and Nivkh (Supplementary Fig. 3). Turkic/Mongolic speakers comprising the bottom-most cline have a distinct Western Eurasian ancestry profile: they have a high proportion of a component most enriched in Mesolithic Caucasus hunter-gatherers and Neolithic Iranians and frequently harbour another component enriched in present-day South Asians (Supplementary Fig. 4). Based on the PCA and ADMIXTURE results, we heuristically assigned inner Eurasians to three clines: the ‘forest-tundra’ cline includes Russians and all Uralic and Yeniseian speakers; the ‘steppe-forest’ cline includes Turkic- and Mongolic-speaking populations from the Volga and Altai–Sayan regions and Southern Siberia; and the ‘southern steppe’ cline includes the rest of the populations.

eurasian-clines-uralic-altaic
The first two PCs summarizing the genetic structure within 2,077 Eurasian individuals. The two PCs generally mirror geography. PC1 separates western and eastern Eurasian populations, with many inner Eurasians in the middle. PC2 separates eastern Eurasians along the north–south cline and also separates Europeans from West Asians. Ancient individuals (color-filled shapes), including two Botai individuals, are projected onto PCs calculated from present-day individuals.

For the forest-tundra populations, the Nganasan + Srubnaya model is adequate only for the two Volga region populations, Udmurts and Besermyans (Fig. 5 and Supplementary Table 8).

For the other populations west of the Urals, six from the northeastern corner of Europe are modelled with additional Mesolithic Western European hunter-gatherer (WHG) contribution (8.2–11.4%; Supplementary Table 8), while the rest need both WHG and early Neolithic European farmers (LBK_EN; Supplementary Table 2). Nganasan-related ancestry substantially contributes to their gene pools and cannot be removed from the model without a significant decrease in the model fit (4.1–29.0% contribution; χ² P ≤ 1.68 × 10⁻⁵; Supplementary Table 8).

west-urals-finno-ugrians-qpadm
Supplementary Table 8. QpAdm-based admixture modeling of the forest-tundra cline populations. For the 13 populations west of the Urals, we present a four-way admixture model, Nganasan+Srubnaya+WHG+LBK_EN, or its minimal adequate subset. Modified from the article, to include colors for cultures, and underlined best models for Corded Ware ancestry among Uralians.

NOTE. It doesn’t seem like Hungarians can be easily modelled with Nganasan ancestry, though…

For the 4 populations east of the Urals (Enets, Selkups, Kets and Mansi), for which the above models are not adequate, Nganasan + Srubnaya + AG3 provides a good fit (χ² P ≥ 0.018; Fig. 5 and Supplementary Table 8). Using early Bronze Age populations from the Baikal Lake region (‘Baikal_EBA’; Supplementary Table 2) as a reference instead of Nganasan, the two-way model of Baikal_EBA + Srubnaya provides a reasonable fit (χ² P ≥ 0.016; Supplementary Table 8) and the three-way model of Baikal_EBA + Srubnaya + AG3 is adequate but with negative AG3 contribution for Enets and Mansi (χ² P ≥ 0.460; Supplementary Table 8).
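
The excerpt above notes that a model can have an acceptable P-value and still return a negative admixture coefficient. A minimal sketch of that sanity check follows; the coefficients are hypothetical placeholders (the real estimates are in Supplementary Table 8), and `is_feasible` is an illustrative helper, not a qpAdm function.

```python
def is_feasible(coefficients):
    """True only when every admixture coefficient lies within [0, 1]."""
    return all(0.0 <= c <= 1.0 for c in coefficients.values())


# Hypothetical three-way model: the proportions sum to 1,
# but the AG3 term is negative, so the model is not interpretable
# as a real mixture even if its P-value looks adequate.
three_way = {"Baikal_EBA": 0.62, "Srubnaya": 0.43, "AG3": -0.05}

# Hypothetical two-way model without the extra source.
two_way = {"Baikal_EBA": 0.60, "Srubnaya": 0.40}

print(is_feasible(three_way))  # False
print(is_feasible(two_way))    # True
```

A check like this is usually applied alongside the P-value, so that a statistically adequate fit with an impossible proportion is not read as support for the extra source.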

east-urals-ugric-samoyedic-qpadm
Supplementary Table 8. QpAdm-based admixture modeling of the forest-tundra cline populations. For the four populations east of the Urals, we present three admixture models: Baikal_EBA+Srubnaya, Baikal_EBA+Srubnaya+AG3 and Nganasan+Srubnaya+AG3. For each model, we present qpAdm p-value, admixture coefficient estimates and associated 5 cM jackknife standard errors (estimate ± SE). Modified from the article, to include colors for cultures, and underlined best models for Corded Ware ancestry among Uralians.

Bronze/Iron Age populations from Southern Siberia also show a similar ancestry composition with high ANE affinity (Supplementary Table 9). The additional ANE contribution beyond the Nganasan + Srubnaya model suggests a legacy from ANE-ancestry-rich clines before the Late Bronze Age.

bronze-age-iron-age-karasuk-mezhovska-tagar-qpadm
Supplementary Table 9. QpAdm-based admixture modeling of Bronze and Iron Age populations of southern Siberia. For ancient individuals associated with Karasuk and Tagar cultures, Nganasan+Srubnaya model is insufficient. For all five groups, adding AG3 as the third ancestry or substituting Nganasan with Baikal_EBA with higher ANE affinity provides an adequate model. For each model, we present qpAdm p-value, admixture coefficient estimates and associated 5 cM jackknife standard errors (estimate ± SE). Models with p-value ≥ 0.05 are highlighted in bold face. Modified from the article, to include colors for cultures, and underlined best models for Corded Ware ancestry among Uralians.
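The adequacy criteria used across these tables (a p-value above a chosen threshold, and no negative admixture coefficients such as the negative AG3 contribution mentioned above) can be spelled out in a few lines. What follows is only a minimal sketch: the `QpAdmModel` record, the helper names, and every number in `EXAMPLE_MODELS` are hypothetical placeholders for illustration, not values from Supplementary Tables 8 or 9. The threshold `alpha` is a parameter because the quoted text accepts fits below the 0.05 used for bold face in Table 9.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class QpAdmModel:
    """One row of a qpAdm-style results table (hypothetical layout)."""
    target: str
    sources: Tuple[str, ...]
    p_value: float
    coefficients: Tuple[float, ...]  # one admixture proportion per source


def is_feasible(model: QpAdmModel, alpha: float = 0.05) -> bool:
    """Adequate fit AND every admixture proportion inside [0, 1]."""
    within_bounds = all(0.0 <= c <= 1.0 for c in model.coefficients)
    return model.p_value >= alpha and within_bounds


def minimal_adequate(models: Sequence[QpAdmModel],
                     alpha: float = 0.05) -> Optional[QpAdmModel]:
    """Pick the feasible model with the fewest sources (ties: higher p-value)."""
    feasible = [m for m in models if is_feasible(m, alpha)]
    if not feasible:
        return None
    return min(feasible, key=lambda m: (len(m.sources), -m.p_value))


# Placeholder values only, invented to exercise the logic.
EXAMPLE_MODELS = [
    # Two-way model rejected by its p-value.
    QpAdmModel("Target_X", ("Source_A", "Source_B"), 0.001, (0.30, 0.70)),
    # Three-way model with a high p-value but a negative coefficient.
    QpAdmModel("Target_X", ("Source_A", "Source_B", "Source_C"), 0.46,
               (0.40, 0.70, -0.10)),
    # Three-way model that passes both checks.
    QpAdmModel("Target_X", ("Source_A", "Source_B", "Source_D"), 0.20,
               (0.20, 0.60, 0.20)),
]

best = minimal_adequate(EXAMPLE_MODELS)
print(best.sources if best else "no adequate model")
```

Preferring the model with the fewest sources mirrors the "minimal adequate subset" wording of the Supplementary Table 8 caption; a high p-value alone is not enough if one of the coefficients falls outside the 0–1 range.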

Lara M. Cassidy comments on the results of the study in A steppe in the right direction (you can read it here):

Even among the earliest available inner Eurasian genomes, east–west connectivity is evident. These, too, form a longitudinal cline, characterized by the easterly increase of a distinct ancestry, labelled Ancient North Eurasian (ANE), lowest in western European hunter-gatherers (WHG) and highest in Palaeolithic Siberians from the Baikal region. Flow-through from this ANE cline is seen in steppe populations until at least the Bronze Age, including the world’s earliest known horse herders — the Botai. However, this is eroded over time by migration from west and east, following agricultural adoption on the continental peripheries (Fig. 1b,c).

Strikingly, Jeong et al. model the modern upper steppe cline as a simple two-way mixture between western Late Bronze Age herders and Northeast Asians (Fig. 1c), with no detectable residue from the older ANE cline. They propose modern steppe peoples were established mainly through migrations post-dating the Bronze Age, a sequence for which has been recently outlined using ancient genomes. In contrast, they confirm a substantial ANE legacy in modern Siberians of the northernmost cline, a pattern mirrored in excesses of WHG ancestry west of the Urals (Fig. 1b). This marks the inhospitable biome as a reservoir for older lineages, an indication that longstanding barriers to latitudinal movement may indeed be at work, reducing the penetrance of gene flows further south along the steppe.

eurasian-clines-uralic-turkic-mongol-altaic
The genomic formation of inner Eurasians. b–d, Depiction of the three main clines of ancestry identified among Inner Eurasians. Sources of admixture for each cline are represented using proxy ancient populations, both sampled and hypothesised, based on the study’s modelling results. The major eastern and western ancestries used to model each cline are shown in bold; the peripheral admixtures that gave rise to these are also shown. Additional contributions to subsections of each cline are marked with dashed lines. b, The northernmost cline, illustrating the legacy of WHG and ANE-related populations. c,d, The upper (c) and lower (d) steppe clines are shown, both of which have substantial eastern contributions related to modern Tungusic speakers. The authors propose these populations are themselves the result of an admixture between groups related to the Nganasan, whose ancestors potentially occupied a wider range, and hunter-gatherers (HGs) from the Amur River Basin. While the upper steppe cline in c can be described as a mixture between this eastern ancestry and western steppe herders, the current model for the southern steppe cline as shown in d is not adequate and is likely confounded by interactions with diverse bordering ancestries. Credit: Ecoregions 2017, Resolve https://ecoregions2017.appspot.com/

Given the findings as reported in the paper, I think it should be much easier to describe different subclines in the “northernmost cline” than in the much more recent “Turkic/Mongolic cline”, which is nevertheless subdivided in this paper in two clines. As an example, there are at least two obvious clines with “Nganasan-related meta-populations” among Uralians, which converge in a common Steppe MLBA (i.e. Corded Ware) ancestry – one with Palaeo-Laplandic peoples, and another one with different Palaeo-Siberian populations:

siberian-clines-uralic-altaic
PCA of ancient and modern Eurasian samples. Ancient Palaeo-Laplandic, Palaeosiberian, and Altai clines drawn, with modern populations labelled. See a version with higher resolution.

The inclusion of certain Eurasian groups (or lack thereof) in the PCA doesn’t help to distinguish these subclines visually, and I guess the tiny “Nganasan-related” ancestral components found in some western populations (e.g. the famous ~5% among Estonians) probably don’t lend themselves easily to further subdivisions. Notice, nevertheless, the different components of the Eastern Eurasian source populations among Finno-Ugrians:

uralic-admixture-qpadm
Characterization of the Western and Eastern Eurasian source ancestries in inner Eurasian populations. [Modified from the paper, includes only Uralic populations]. a, Admixture f3 values are compared for different Eastern Eurasian (Mixe, Nganasan and Ulchi; green) and Western Eurasian references (Srubnaya and Chalcolithic Iranians (Iran_ChL); red). For each target group, darker shades mark more negative f3 values. b, Weights of donor populations in two sources characterizing the main admixture signal (date 1 and PC1) in the GLOBETROTTER analysis. We merged 167 donor populations into 12 groups (top right). Target populations were split into five groups (from top to bottom): Aleuts; the forest-tundra cline populations; the steppe-forest cline populations; the southern steppe cline populations; and ‘others’.
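For readers unfamiliar with the admixture f3 test in panel a: the statistic f3(Target; A, B) averages (t − a)(t − b) over SNPs, where t, a and b are allele frequencies in the target and the two references, and a clearly negative value signals that the target is intermediate between populations related to A and B. The snippet below is a naive sketch of that average only. The function name and the toy frequencies are made up for illustration, and the sample-size correction and block-jackknife standard errors used in published analyses are deliberately left out.

```python
from typing import Sequence


def naive_f3(target: Sequence[float],
             ref_a: Sequence[float],
             ref_b: Sequence[float]) -> float:
    """Mean of (t - a) * (t - b) across SNPs (no bias correction, no SE)."""
    if not (len(target) == len(ref_a) == len(ref_b)) or len(target) == 0:
        raise ValueError("need equal-length, non-empty frequency vectors")
    products = [(t - a) * (t - b) for t, a, b in zip(target, ref_a, ref_b)]
    return sum(products) / len(products)


# Toy allele frequencies at three SNPs (invented numbers).
freq_a = [0.1, 0.8, 0.3]
freq_b = [0.5, 0.2, 0.9]

# A target sitting halfway between A and B at every SNP: f3 is negative.
freq_mixed = [(a + b) / 2 for a, b in zip(freq_a, freq_b)]

# A target outside the A-B range at every SNP: f3 is positive.
freq_outside = [0.9, 0.95, 0.05]

print(naive_f3(freq_mixed, freq_a, freq_b) < 0)    # True
print(naive_f3(freq_outside, freq_a, freq_b) > 0)  # True
```

The darker shades in the figure correspond to more negative values of this kind of statistic, i.e. reference pairs that better bracket the target population.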

Also remarkable is the lack of comparison of Uralic populations with other neighbouring ones, since the described Uralic-like ancestry of Russians was already known, and is most likely due to the recent acculturation of Uralic-speaking peoples in the cradle of Russians, right before their eastward expansions.

west-eurasian-east-eurasian-ancestry
Supplementary Fig. 4. ADMIXTURE results qualitatively support PCA-based grouping of inner Eurasians into three clines. (A) Most southern steppe cline populations derive a higher proportion of their total Western Eurasian ancestry from a source related to Caucasus, Iran and South Asian populations. (B) Turkic- and Mongolic-speaking populations tend to derive their Eastern Eurasian ancestry more from the Devil’s Gate related one than from Nganasan-related one, while the opposite is true for Uralic- and Yeniseian-speakers. To estimate overall western Eurasian ancestry proportion, we sum up four components in our ADMIXTURE results (K=14), which are the dominant components in Neolithic Anatolians (“Anatolia_N”), Mesolithic western European hunter-gatherers (“WHG”), early Holocene Caucasus hunter-gatherers (“CHG”) and Mala from southern India, respectively. The “West / South Asian ancestry” is a fraction of it, calculated by summing up the last two components. To estimate overall Eastern Eurasian ancestry proportion, we sum up six components, most prevalent in Surui, Chipewyan, Itelmen, Nganasan, Atayal and early Neolithic Russian Far East individuals (“Devil’s Gate”).
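The arithmetic behind this caption is simple enough to write down. Below is a minimal sketch, assuming each individual is represented as a dictionary of ADMIXTURE component weights keyed by the labels used in the caption. The key spellings, the `summarize` helper, and the example weights are all hypothetical, and reading "a fraction of it" as a share of the Western Eurasian total is my interpretation of the caption.

```python
from typing import Dict, Optional

# Component labels taken from the caption; exact key spellings are assumed.
WESTERN = ("Anatolia_N", "WHG", "CHG", "Mala")
WEST_SOUTH_ASIAN = ("CHG", "Mala")  # the "last two" western components
EASTERN = ("Surui", "Chipewyan", "Itelmen", "Nganasan", "Atayal", "DevilsGate")


def _share(part: float, total: float) -> Optional[float]:
    """Return part/total, or None when the total is zero."""
    return part / total if total > 0 else None


def summarize(q: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Collapse K=14 component weights into the totals used in panels A and B."""
    west = sum(q.get(k, 0.0) for k in WESTERN)
    east = sum(q.get(k, 0.0) for k in EASTERN)
    west_south = sum(q.get(k, 0.0) for k in WEST_SOUTH_ASIAN)
    return {
        "western_total": west,
        "eastern_total": east,
        # Panel A: share of Western ancestry from the CHG + Mala components.
        "west_south_asian_share": _share(west_south, west),
        # Panel B: which eastern component dominates the Eastern total.
        "devils_gate_share": _share(q.get("DevilsGate", 0.0), east),
        "nganasan_share": _share(q.get("Nganasan", 0.0), east),
    }


# Invented weights for one hypothetical individual (not data from the paper).
example = {
    "Anatolia_N": 0.30, "WHG": 0.20, "CHG": 0.05, "Mala": 0.05,
    "Nganasan": 0.25, "DevilsGate": 0.05,
    "Other": 0.10,  # components outside both groupings
}
summary = summarize(example)
print(summary)
```

In this toy case the Nganasan-related component makes up most of the Eastern Eurasian total, which is the pattern the caption describes for Uralic- and Yeniseian-speakers as opposed to Turkic- and Mongolic-speaking populations.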

A comparison of Estonians and Finns with Balts, Scandinavians, and Eastern Europeans would have been more informative for the division of the different so-called “Nganasan-like meta-populations”, and to ascertain which one of these ancestral peoples along the ancient WHG:ANE cline could actually be connected (if at all) to the Cis-Urals.

Because, after all, based on linguistics and archaeology, geneticists are not supposed to be looking for populations from the North Asian Arctic region, for “Siberian ancestry”, or for haplogroup N1c (despite previous works by their peers), but for the Bronze Age Volga-Kama region…

Related

Aquitanians and Iberians of haplogroup R1b are exactly like Indo-Iranians and Balto-Slavs of haplogroup R1a

eba-indo-iranian-balto-slavs

The final paper on Indo-Iranian peoples, by Narasimhan and Patterson (see preprint), is soon to be published, according to the first author’s Twitter account.

One of the interesting details of the development of Bronze Age Iberian ethnolinguistic landscape was the making of Proto-Iberian and Proto-Basque communities, which we already knew were going to show R1b-P312 lineages, a haplogroup clearly associated during the Bell Beaker period with expanding North-West Indo-Europeans:

From the Bronze Age (~2200–900 BCE), we increase the available dataset from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period, albeit with less impact in the south. The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry. These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups. Y-chromosome turnover was even more pronounced, as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269.

iberia-admixture-y-dna
Proportion of ancestry derived from central European Beaker/Bronze Age populations in Iberians from the Middle Neolithic to the Iron Age (table S15). Colors indicate the Y-chromosome haplogroup for each male. Red lines represent period of admixture. Modified from Olalde et al. (2019).

The arrival of East Bell Beakers speaking Indo-European languages involved, nevertheless, the survival of the two non-IE communities isolated from each other – likely stemming from south-western France and south-eastern Iberia – thanks to a long-lasting process of migration and admixture. There are some common misconceptions about ancient languages in Iberia which may have caused some wrong interpretations of the data in the paper and elsewhere:

NOTE. A simple reading of Iberian prehistory would be enough to correct these. Two recent books on this subject are Villar’s Indoeuropeos, iberos, vascos y otros parientes and Vascos, celtas e indoeuropeos. Genes y lenguas.

Iberian languages were spoken at least in the Mediterranean and the south (ca. “1/3 of Iberia”) during the Bronze Age.

Nope, we only know the approximate location of Iberian culture and inscriptions from the Late Iron Age, and they occupy the south-eastern and eastern coastal areas, but before that it is unclear where they were spoken. In fact, it seems evident now that the arrival of Urnfield groups from the north marks the arrival of Celtic-speaking peoples, as we can infer from the increase in Central European admixture, while the expansion of anthropomorphic stelae from the north-west must have marked the expansion of Lusitanian.

Vasconic was spoken on both sides of the Pyrenees, as it was in the Middle Ages.

Wrong. This is one of the worst mistakes I am seeing in many comments since the paper was published, although admittedly the paper sidesteps this problem by talking about “Modern Basques”. Vasconic toponyms appear south of the Pyrenees only after the Roman conquests, and tribes of the south-western Pyrenees and Cantabrian regions were likely Celtic-speaking peoples. Aquitanians (north of the western Pyrenees) are the only known ancient Vasconic-speaking population in proto-historic times, ergo the arrival of Bell Beakers in Iberia was most likely accompanied by Indo-European languages which were later replaced by Celtic expanding from Central Europe, and Iberian expanding from south-east Iberia, and only later by Latin and Vasconic.

Ligurian is non-Indo-European, and Lusitanian is Celtic-like, so Iberia must have been mostly non-Indo-European-speaking.

The fragmentary material available on Ligurian is enough to show that phonetically it is a NWIE dialect of non-Celtic, non-Italic nature, much like Lusitanian; that is, unless you follow laryngeals up to Celtic or Italic, in which case you can argue anything about this or any other IE language, as people who reconstruct laryngeals for Baltic in the common era do.

EDIT (19 Mar 2019): It was not clear enough from this paragraph, because Ligurian-like languages in NE Iberia is just a hypothesis based on the archaeological connection of the whole southern France Bell Beaker region. My aim was to repeat the idea that Old European hydro-toponymy is older in NE Iberia (as almost anywhere in Iberia) than Iberian toponymy, so the initial hypothesis is that:

  1. a Palaeo-European language (as Villar puts it) expanded into most regions of Iberia in ancient times (he considered at some point the Mesolithic, but that is obviously wrong, as we know now); then
  2. Celts expanded at least to the Ebro River Basin; then
  3. Iberians expanded to the north and replaced these in NE Iberia; and only then
  4. after the Roman invasion, around the start of the Common Era, appear Vasconic toponyms south of the Pyrenees.

Lusitanian obviously does not qualify as Celtic, lacking the most essential traits that define Celticness… Unless you define “(Para-)Celtic” as Pre-Proto-Celtic-like, or anything of the sort to support some Atlantic continuity, in which case you can also argue that Pre-Italic or Pre-Germanic are Celtic, because you would be essentially describing North-West Indo-European.

If Basques have R1b, it’s because of a culture of “matrilocality” as opposed to the “patrilocality” of Indo-Europeans.

So wrong it hurts my eyes every time I read this. Not only does matrilocality in a regional group have few known effects in genetics, but there are many well-documented cases of population replacement (with either ancestry or Y-DNA haplogroups, or both) without language replacement, without a need to resort to “matrilineality” or “matrilocality” or any other cultural difference in any of these cases.

In fact, it seems quite likely now that isolated ancient peoples north of the Pyrenees will show a gradual replacement of surviving I2a lineages by neighbouring R1b, while early Iberian R1b-DF27 lineages are associated with Lusitanians, and later incoming R1b-DF27 lineages (apart from other haplogroups) are most likely associated with incoming Celts, which must have remained in north-central and central-east European groups.

NOTE. Notice how R1a is fully absent from all known early Indo-European peoples to date, whether Iberian IE, British IE, Italic, or Greek. The absence of R1a in Iberia after the arrival of Celts is even more telling of the origin of expanding Celts in Central Europe.

I haven’t had enough time to add Iberian samples to my spreadsheet, and hence neither to the ASoSaH texts nor maps/PCAs (and I don’t plan to, because it’s more efficient for me to add both, Asian and Iberian samples, at the same time), but luckily Maciamo has summed it up on Eupedia. Or, graphically depicted in the paper for the southeast:

iberia-haplogroups
Y chromosome haplogroup composition of individuals from southeast Iberia during the past 2000 years. The general Iberian Bronze and Iron Age population is included for comparison. Modified from Olalde et al. (2019).

Does this continued influx of Y-DNA haplogroups in Iberia with different cultures represent permanent changes in language? Are, therefore, modern Iberian languages derived from Lusitanian, Sorothaptic/Celtic, Greek, Phoenician, East or West Germanic, Hebrew, Berber, or Arabic languages? Obviously not. Same with Italy (see the recent preprint on modern Italians by Raveane et al. 2018), with France, with Germany, or with Greece.

If that happens in European regions with a known ancient history, why would the recent expansions and bottlenecks of R1b in modern Basques (or N1c around the Baltic, or R1a in Slavs) in the Middle Ages represent an ancestral language surviving into modern times?

Indo-Iranians

If something is clear from Narasimhan, Patterson, et al. (2018), it is that we finally know the timing of the introduction and expansion of R1a-Z645 lineages among Indo-Iranians.

We could already propose since 2015 that a slow admixture happened in the steppes, based on archaeological finds, due to settlement elites dominating over common peoples, coupled with the known Uralic linguistic traits of Indo-Iranian (and known Indo-Iranian influence on Finno-Ugric) – as I did in the first version of the Indo-European demic diffusion model.

The new huge sampling of Sintashta – combined with that of Catacomb, Poltavka, Potapovka, Andronovo, and Srubna – shows quite clearly how this long-term admixture process between Uralic peoples and Indo-Iranians happened between forest-steppe CWC (mainly Abashevo) and steppe groups. The situation is not different from that of Iberia ca. 2500-2000 BC; from Narasimhan, Patterson, et al. (2018):

We combined the newly reported data from Kamennyi Ambar 5 with previously reported data from the Sintashta 5 individuals (10). We observed a main cluster of Sintashta individuals that was similar to Srubnaya, Potapovka, and Andronovo in being well modeled as a mixture of Yamnaya-related and Anatolian Neolithic (European agriculturalist-related) ancestry.

Even with so few words devoted to one of the most important pieces of data in the paper about what happened in the steppes, Wang et al. (2018) help us understand what really happened with this simplistic concept of “steppe ancestry” regarding Yamna vs. Corded Ware differences:

anatolia-neolithic-steppe-eneolithic
Image modified from Wang et al. (2018). Marked are: in red, approximate limit of Anatolia_Neolithic ancestry found in Yamna populations; in blue, Corded Ware-related groups. “Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional Anatolian farmer-related ancestry in Steppe groups as well as additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups (see also Supplementary Tables 10, 14 and 20).”

As with Iberia (or any prehistoric region), the details of how exactly this language change happened are not evident, but we only need a plausible explanation coupled with archaeology and linguistics. Poltavka, Potapovka, and Sintashta samples – like the few available Iberian ones ca. 2500-2000 BC – offer a good picture of the cohabitation of R1b-L23 (mainly Z2103) and R1a-Z645 (mainly Z93+): a glimpse at the likely presence of R1a-Z93 within settlements – which must have evolved as the dominant elites – in a society where the majority of the population was initially formed by nomad herders (probably mostly R1b-Z2103), who were usually buried outside of the main settlements.

Will the upcoming Narasimhan, Patterson et al. (2019) deal with this problem of how R1a-M417 replaced R1b-M269, and how the so-called “Steppe_MLBA” (i.e. Corded Ware) ancestry admixed with “Steppe_EMBA” (i.e. Yamnaya) ancestry in the steppes, and which one of their languages survived in the region (that is, the same as the Reich Lab has done with Iberia)? Not likely. The ‘genetic wars’ in Iberia deal with haplogroup R1b-P312, and how it was neither ‘native’ nor associated with Basques and non-Indo-European peoples in general. The ‘genetic wars’ in South Asia are concerned with the steppe origin of R1a, to prove that it is not a ‘native’ haplogroup to India, and thus neither are Indo-Aryan languages. To each region a politically correct account of genetic finds, with enough care not to fully dismiss national myths, it seems.

NOTE. Funnily enough, these ‘genetic wars’ are the making of geneticists since the 1990s and 2000s, so we are still in the midst of mostly internal wars caused by what they write. Just as genetic papers of the 2020s will most likely be a reaction to what they are writing right now about “steppe ancestry” and R1a. You won’t find much change to the linguistic reconstruction in this whole period, except for the most multicolored glottochronological proposals…

The first author of the paper has engaged, as far as I could see on Twitter, in dialogue with Hindu nationalists who try to dismiss the arrival of steppe ancestry and R1a into South Asia as inconclusive (to support the potential origin of Sanskrit millennia ago in the Indus Valley Civilization). How can geneticists deal with the real problem here (the original ethnolinguistic group expanding with Corded Ware), when they have to fend off anti-steppists from Europe and Asia? How can they do it, when they themselves are part of the same societies that demand a politically correct presentation of data?

This is how the data on the most likely Indo-Iranian-speaking region should be presented in an ideal world, where – as in the Iberia paper – geneticists would look closely at the Volga-Ural region to discover what happened with Proto-Indo-Iranians from their earliest to their latest stage, instead of constantly looking for sites close to the Indus Valley to demonstrate who knows what about modern Indian culture:

indo-iranian-admixture-similar-iberians
Tentative map of the Late PIE and Indo-Iranian community in the Volga-Ural steppes since the Eneolithic. Proportion of ancestry derived from central European Corded Ware peoples. Colors indicate the Y-chromosome haplogroup for each male. Red lines represent period of admixture. Modified from Olalde et al. (2019).

Now try and tell Hindu nationalists that Sanskrit expanded from an Early Bronze Age steppe community of R1b-rich nomadic herders that spoke Pre-Indo-Iranian, which was dominated and eventually (genetically) mostly replaced by elite Uralic-speaking R1a peoples from the Russian forest, hence the known phonetic (and some morphological) traits that remained. Good luck with the Europhobic shitstorm ahead…

Balto-Slavic

Iberian cultures, already with a majority of R1b lineages, show a clear northward expansion over previously Urnfield-like groups of north-east Iberia and Mediterranean France (which we now know probably represent the migration of Celts from central Europe). Similarly, Eastern Balts, already with a majority of R1a lineages, likely expanded into the Baltic region at the same time as the outlier from Turlojiškė (ca. 1075 BC), which represents the first obvious contacts of central-east Europe with the Baltic.

Iberia shows a more recent influx of central and eastern Mediterranean peoples, one of which eventually succeeded in imposing their language in Western Europe: Romans were possibly associated mainly with R1b-U152, apart from many other lineages. Proto-Slavs probably expanded later than Celts, too, connected to the disintegration of the Lusatian culture, and they were at some point associated with R1a-M458 and R1a-Z280(xZ92) lineages, apart from others already found in Early Slavs.

pca-balto-slavs-tollense-valley
PCA of central-eastern European groups which may have formed the Balto-Slavic-speaking community derived from Bell Beaker, evident from the position ‘westwards’ of CWC in the PCA, and surrounding cultures. Left: Early Bronze Age. Right: Tollense Valley samples.

This parallel between Iberia and eastern Europe is no coincidence: as Europe entered the Bronze Age, chiefdom-based systems became common, and thus the connection of ancestry or haplogroups with ethnolinguistic groups became weaker.

What happened earlier (and who may represent the Pre-Balto-Slavic community) will be clearer when we have enough eastern European samples, but basically we will be able to depict this admixture of NWIE-speaking BBC-derived peoples with Uralic-speaking CWC-derived groups (since Uralic is known to have strongly influenced Balto-Slavic), similar to the admixture found in Indo-Iranians, more or less like this:

iberian-admixture-balto-slavic
Tentative map of the North-West Indo-European and Balto-Slavic community in central-eastern Europe since the East Bell Beaker expansion. Proportion of ancestry derived from Corded Ware peoples. Colors indicate the Y-chromosome haplogroup for each male. Red lines represent period of admixture. Modified from Olalde et al. (2019).

The Early Scythian period marked a still stronger chiefdom-based system which promoted the creation of alliances and federation-like groups, with an earlier representation of the system expanding from north-eastern Europe around the Baltic Sea, precisely during the spread of Akozino warrior-traders (in turn related to the Scythian influence in the forest-steppes), who are the most likely ancestors of most N1c-V29 lineages among modern Germanic, Balto-Slavic, and Volga-Finnic peoples.

Modern haplogroup+language = ancient ones?

It is not difficult to realize, then, that the complex modern genetic picture in Eastern Europe and around the Urals, and also in South Asia (like that of the Aegean or Anatolia) is similar to the Iron Age / medieval Iberian one, and that following modern R1a as an Indo-European marker just because some modern Indo-European-speaking groups showed it was always a flawed methodology; as flawed as following R1b for ancient Vasconic groups, or N1c for ancient Uralic groups.

Why people would argue that haplogroups mean continuity (e.g. R1b with Basques, N1c with Finns, R1a with Slavs, etc.) may be understood, if one still lives in the 2000s. Just like why one would argue that Corded Ware is Indo-European, because of Gimbutas’ huge influence since the 1960s with her myth of “Kurgan peoples”. Not many denied these haplogroup associations, because there was no reason to do it, and those who did usually aligned with a defense of descriptive archaeology.

However, it is a growing paradox that some people interested in genetics today would now, after the Iberian paper, need to:

  • accept that ancient Iberians and probably Aquitanians (each from different regions, and probably from different “Basque-Iberian dialects” in the Chalcolithic, if both were actually related) eventually show expansions with R1b-L23, the haplogroup most obviously associated with expanding Indo-Europeans;
  • acknowledge that modern Iberians have many different lineages derived from prehistoric or historic peoples (Celts, Phoenicians, Greeks, Romans, Jews, Goths, Berbers, Arabs), which have undergone different bottlenecks, the last ones during the Reconquista, but none of their languages have survived;
  • realize that a similar picture is to be found everywhere in central and western Europe since the first proto-historic records, with language replacement in spite of genetic continuity, such as the British Isles (and R1b-L21 continuity) after the arrival of Celts, Romans, Anglo-Saxons, Vikings, or Normans;
  • but, at the same time, continue blindly asserting that haplogroup R1a + “steppe ancestry” represent some kind of supernatural combination which must show continuity with their modern Indo-Iranian or Balto-Slavic language from time immemorial.

sintashta-y-dna
Replacement of R1b-L23 lineages during the Early Bronze Age in eastern Europe and in the Eurasian steppes: emergence of R1a in previous Yamnaya and Bell Beaker territories. Modified from EBA Y-DNA map.

Behave, pretty please

The ‘conservative’ message espoused by some geneticists and amateur genealogists here is basically as follows:

  • Let’s not rush to new theories that contradict the 2000s, lest some people get offended by granddaddy not being these pure whatever wherever as they believed, and let’s wait some 5, 10, or 20 years, as long as necessary – to see if some corner of the Yamna culture shows R1a, or some region in north-eastern Europe shows N1c, or some Atlantic Chalcolithic sample shows R1b – to challenge our preferred theories, if we actually need to challenge anything at all, because it hurts too much.
  • Just don’t let many of these genetic genealogists or academics of our time be unhappy, pretty please with sugar on top, and let them slowly adapt to reality with more and more pet theories to fit everything together (past theories + present data), so maybe when all of them are gone, within 50 or 70 years, society can smoothly begin to move on and propose something closer to reality, but always as politically correct as possible for the next generations.
  • For starters, let’s discuss now (yet again) that Bell Beakers may not have been Indo-European at all, despite showing (unlike Corded Ware) clearly Yamna male lineages and ancestry, because then Corded Ware and R1a could not have been Indo-European and that’s terrible, so maybe Bell Beakers are too brachycephalic to speak Indo-European or something, or they were stopped by the Fearsome Tisza River, or they are not pure Dutch Single Grave in The South hence not Indo-European, or whatever, and that’s why Iron Age Iberians or Etruscans show non-Indo-European languages. That’s not disrespectful to the history of certain peoples, of course not, but talking about the evident R1a-Uralic connection is, because this is The South, not The North, and respect works differently there.
  • Just don’t talk about how Slavs and Balts enter history more than 1,500 years later than Indo-European peoples in Western and Southern Europe, including Iberia, and assume a heroic continuity of Balts and Slavs as pure R1a ‘steppe-like’ peoples dominating over thousands of kms. in the Baltic, Fennoscandia, eastern Europe, and northern Asia for 5,000 years, with multiple Balto-Slavs-over-Balto-Slavs migrations, because these absolute units of Indo-European peoples were a trip and a half. They are the Asterix and Obelix of white Indo-European prehistory.
  • Perhaps in the meantime we can also invent some new glottochronological dialectal scheme that fits the expansion of Sredni Stog/Corded Ware with (Germano-?)Indo-Slavonic separated earlier than any other Late PIE dialect; and Finno-Volgaic later than any other Uralic dialect, in the Middle Ages, with N1c.

balto-slavic-pca
Genetic structure of the Balto-Slavic populations within a European context according to the three genetic systems, from Kushniarevich et al. (2015). Pure Balto-Slavs from…hmm…yeah this…ancient…region…or people…cluster…Whatever, very very steppe-like peoples, the True Indo-Europeans™, so close to Yamna…almost as close as Finno-Ugrians.

To sum up: Iberia, Italy, France, the British Isles, central Europe, the Balkans, the Aegean, or Anatolia, all these territories can have a complex history of periodic admixture and language replacement everywhere, but some peoples appearing later than all others in the historical record (viz. Basques or Slavs) apparently cannot, because that would be shameful for their national or ethnic myths, and these should be respected.

Ignorance of one’s own past as a blank canvas to be filled in with stupid ethnolinguistic continuity, turned into something valuable that should not be challenged. Ethnonationalist-like reasoning typical of the 19th century. How can our times be called ‘modern’ when this kind of magical thinking is still prevalent, even among supposedly well-educated people?

Eurasian steppe chariots and social complexity during the Bronze Age

ba-eurasia-abashevo-sintashta

New paper (behind paywall), Eurasian Steppe Chariots and Social Complexity During the Bronze Age, by Chechushkov and Epimakhov, Journal of World Prehistory (2018).

Interesting excerpts (emphasis mine):

Nowadays, archaeologists distinguish at least three Bronze Age pictorial traditions on the basis of style, and demonstrate some parallels in the material culture. The earliest is the Yamna–Afanasievo tradition, which is characterized by the symbolic depiction of sun-headed men and animals. Another tradition is a record of the Andronovo people (Kuzmina 1994; Novozhenov 2012), who depicted in it their everyday life and the importance of wheeled transport (Novozhenov 2014a, b). Although petroglyphs on open-air natural rock surfaces are obviously hard to date, the occurrence of similar carvings on stone grave stelae within some Andronovo culture cemeteries (such as the Tamgaly Cemetery and the Samara Cemetery in Sary Arka, Kazakhstan) provide a level of chronological control. Finally, the finds of petroglyphs depicting chariots in the burials of the Karasuk culture (c. 1400–800 BC) in southern Siberia and Kazakhstan allow us to distinguish the latest tradition (Novozhenov 2014b).

petroglyphs-chariot
“Depictions of a chariot on the petroglyphs, the Koksu River valley, Kazakhstan (redrawn after Novozhenov 2012, p. 45, with the author’s permission)”

The site of Sintashta in the steppe zone of the Southern Trans-Urals (the eastern side of the Ural Mountains) was excavated in the 1970s and yielded abundant Bronze Age material, including unparalleled evidence of six vehicles buried in graves, each with two spoked wheels accompanied by cheekpieces and sacrificial horses (Gening 1977; Gening et al. 1992). (…) Chariot remains from the Middle and Late Bronze Age in the southern Urals are quite abundant compared with early chariot remains from other parts of the world, and allow statistical analysis.

In contrast, only two wagons and one sledge were found in the Royal Cemetery of Ur (Woolley 1965), and only ten actual chariots and their parts are known from tombs of the New Kingdom of Egypt (1550–1069 BC) (Littauer and Crouwel 1985; James 1974; Herold 2006), with the rest of the information on the Near Eastern chariots coming in other forms. Two chariots and the wheels of a third were also found in the Lchashen Cemetery in Armenia (Yesayan 1960), dated to 1400–1300 BC (Pogrebova 2003, p. 397), and bronze models of chariots were found in the burial sites of neighboring Transcaucasia (Brileva 2012). Over one hundred chariots have been discovered in Shang period tombs in China, but none dates before 1200 BC (Wu 2013).

Sintashta–Petrovka chariots were functional and used for carrying passengers and, probably, for warfare. Otherwise, one would not expect to see consistency in the measurements and technological solutions (…)

(1) The technological solutions used to construct a wheel and its dimensions are derived from the measurements of the ‘wheel pits’. They allow such analysis because some had the actual imprints of felloes and spokes. (…) Due to the imprints of spokes and felloes left in the soil, it is clear that the Bronze Age people knew of and utilized the spoked wheel.

(2) Wheel track is the distance between the centerlines of two wheels on an axle. It can be estimated on the basis of the distance between the central axes of all known wheel pits, in addition to direct measurement of the eight known cases of wheel imprints. (…) the majority of findings with a mean wheel track of 136 ± 12 cm might represent either a single-driver chariot or a vehicle with two passengers who accessed the vehicle from the rear, since one extreme of this wheel-track provides enough space for a standing person, while another is suitable for a driver and passenger.

(3) The means of traction is the element that connects the vehicle to the yoke of the draft animals (Littauer et al. 2002, p. xvii). It is needed for a vehicle to be pulled by harnessed animals and is constructed as a central draft pole located between the animals, or shafts located on the external sides of the animals, called thills. (…) Using burial chamber size as a proxy, chariots had a maximum estimated length of 327 ± 20 cm, and a maximum estimated width of 205 ± 21 cm. These dimensions suggest a great similarity to six chariots of Tutankhamun that have maximum dimensions of 260 × 236 cm (Crouwel 2013).

bridle-chariot-horses
Elements of Bronze Age chariots. Image from Chechushkov (2007).

Associated individuals

(…) suggest that this person was a chief, and that the burial context illustrates his significance in the social life of the local community (Logvin and Shevnina 2008, p. 193). However, it also suggests the diverse role of the Sintashta–Petrovka elites, who were likely engaged in a number of different activities, such as warfare, craft production, food production, and a broad social life.

(…) while weapons are not universally present with chariots, they are present much more often than in non-chariot burials: more than 50% of the chariot burials are accompanied by weapons, with a clear predominance of projectile arms.

The creation, utilization, and maintenance of the chariots would have required a number of important skills, and some degree of standardization in manufacturing chariots might be related to a very small number of chariot makers. This means that the Sintashta–Petrovka craftsmen were ‘attached specialists’ and made their products following the orders and desires of those who were interested in the competitive use of chariots. Hence, the social group interested in producing and maintaining chariots sponsored all of those processes. While the nature of this social group is unclear, it is reasonable to hypothesize that it could be a group of military elites characterized by aggrandizing behavior. These people shared military identities and values, but also belonged to bigger collectives, presumably diverse kin groups. The competition between these collectives for resources, power, and prestige created the chariot complex.

Evolution

Analyzing horse-headed knobs, Kovalevskaya demonstrates the evolution of horse tack from a simple muzzle to a bridle with bits during the 5th and 4th millennia BC (Kovalevskaya 2014). Her analysis correlates well with a study of pathologies in horse teeth conducted by Brown and Anthony, who suggest the appearance of bits and horseback riding at Botai and Tersek (Anthony et al. 2006). Cheekpieces became the next necessary and logical step in the evolution of means of horse control. Their appearance together with the wheeled vehicles is not a coincidence, but the development of preceding tools. After the year 2000 BC, cheekpieces often occur together with sacrificed horses—13 out of 15 Sintashta burials with cheekpieces also contain horse bones (Epimakhov and Berseneva 2012)—showing evolution in the role of horses.

The whole paper offers an interesting summary of cultural and population events in the Pontic-Caspian steppes since the Early Yamna period. Also, horse-headed knobs!

NOTE. You can find similar information in other (free) papers by Chechushkov on his Academia.edu account.

Consequences of Damgaard et al. 2018 (II): The late Khvalynsk migration waves with R1b-L23 lineages

chalcolithic_early-asia

This post should probably read “Consequences of Narasimhan et al. (2018),” too, since there seems to be enough data and materials published by the Copenhagen group in Nature and Science to make a proper interpretation of the data that will appear in their corrected tables.

The finding of late Khvalynsk/early Yamna migrations, identified with early LPIE migrants almost exclusively of R1b-L23 subclades, is probably one of the most interesting results in the recent papers regarding the Indo-European question.

Although there are still few samples to derive fully-fledged theories, they begin to depict a clearer idea of waves that shaped the expansion of Late Proto-Indo-European migrants in Eurasia during the 4th millennium BC, i.e. well before the expansion of North-West Indo-European, Palaeo-Balkan, and Indo-Iranian languages.

Late Khvalynsk expansions and archaic Late PIE

Like Anatolian, Tocharian has been described as having a more archaic nature than the rest of Late PIE. However, Pre-Tocharian belongs to the Late PIE trunk, clearly distinguishable phonetically and morphologically from Anatolian.

It is especially remarkable that – even though it expanded into Asia – it has more in common with North-West Indo-European, hence its classification (together with NWIE) as part of a Northern group, unrelated to Graeco-Aryan.

The linguistic supplement by Kroonen et al. accepts that peoples from the Afanasevo culture (ca. 3000-2500 BC) are the most likely ancestors of Tocharians.

NOTE. For those equating the Tarim Mummies (of R1a-Z93 lineages) with Tocharians, you have this assertion from the linguistic supplement, which I support:

An intermediate stage has been sought in the oldest so-called Tarim Mummies, which date to ca. 1800 BCE (Mallory and Mair 2000; Wáng 1999). However, also the language(s) spoken by the people(s) who buried the Tarim Mummies remain unknown, and any connection between them and the Afanasievo culture on the one hand or the historical speakers of Tocharian on the other has yet to be demonstrated (cf. also Mallory 2015; Peyrot 2017).

New samples of late Khvalynsk origin

These are the recent samples that could, with more or less certainty, correspond to migration waves from late Khvalynsk (or early Yamna), from oldest to most recent:

  • The Namazga III samples from the Late Eneolithic period (in Turkmenistan), dated ca. 3360-3000 BC (one of haplogroup J), potentially showing the first wave of EHG-related steppe ancestry into South Asia. Not related to Indo-Iranian migrations.

NOTE. A proper evaluation with further samples from Narasimhan et al. (2018) is necessary, though, before we can assert a late Khvalynsk origin of this ancestry.

  • Afanasevo samples, dated ca. 3081-2450 BC, with all samples dated before ca. 2700 BC uniformly of R1b-Z2103 subclades, sharing a common genetic cluster with Yamna, showing together the most likely genomic picture of late Khvalynsk peoples.

NOTE 1. Anthony (2007) put this expansion from Repin ca. 3300-3000 BC, while his most recent review (2015) of his own work put its completion ca. 3000-2800. While the migration into Afanasevo may have lasted some time, the wave of migrants (based on the most recent radiocarbon dates) must be set at least before ca. 3100 BC from Khvalynsk.

NOTE 2. I proposed that we could find R1b-L51 in Afanasevo, presupposing the development of R1b-L51 and R1b-Z2103 lineages with separating clans, and thus with dialectal divisions. While finding this is still possible within Khvalynsk regions, it seems we will have a division of these lineages already ca. 4250-4000 BC, which would require a closer follow-up of the different inner late Khvalynsk groups and their samples. For the moment, we don’t have a clear connection through lineages between North-West Indo-European groups and Tocharian.

tocharian-early-copper-age
Early Copper Age migrations in Asia ca. 3300-2800, according to Anthony (2015).
  • Subsequent and similar migration waves are probably to be suggested from the new sample of Karagash, beyond the Urals (attributed to the Yamna culture, hence maintaining cultural contacts after the migration waves), of R1b-Z2103 subclade, ca. 3018-2887 BC, potentially connected then to the event that caused the expansion of Yamna migrants westward into the Carpathians at the same time. Not related to Indo-Iranian migrations.
  • The isolated Darra-e Kur sample, without cultural attribution, ca. 2655 BC, of R1b-L151 lineage. Not related to Indo-Iranian migrations.
  • The Hajji Firuz samples: I4243 dated ca. 2326 BC, female, with a clear inflow of steppe ancestry; and I2327 (probably to be dated to the late 3rd millennium BC or after that), of R1b-Z2103 lineage. Not related to Indo-Iranian migrations.

NOTE. A new radiocarbon dating of I2327 is expected, to correct the currently available date of 5900-5000 BC. Since it clusters nearer to Chalcolithic samples from the site than I4243 (from the same archaeological site), it is possible that both are part of similar groups receiving admixture around this period, or maybe I2327 is from a later period, coinciding with the Iron Age sample F38 from Iran (Broushaki et al. 2016), with which it closely clusters. Also, the finding of EHG-related ancestry in Maykop samples dated ca. 3700-3000 BC (maybe with R1b-L23 subclades) offers another potential source of migrants for this Iranian group.

NOTE. Samples from Narasimhan et al. (2018) still need to be published in corrected tables, which may change the actual subclades shown here.

These late Khvalynsk / early Yamna migration waves into Asia are quite early compared to the Indo-Iranian migrations, whose ancestors can only be first identified with Volga-Ural groups of Yamna/Poltavka (ca. 3000-2400 BC), with its fully formed language expanding only with MLBA waves ca. 2300-1200 BC, after mixing with incoming Abashevo migrants.

While the authors apparently forget to reference the previous linguistic theories whereby Tocharian is more archaic than the rest of Late PIE dialects, they refer to the ca. 1,000-year gap between Pre-Tocharian and Proto-Indo-Iranian migrations, and thus their obvious difference:

The fact that Tocharian is so different from the Indo-Iranian languages can only be explained by assuming an extensive period of linguistic separation.

Potential linguistic substrates in the Middle East

A few words about relevant substrate language proposals.

Euphratic language

What Gordon Whittaker proposes is a North-West Indo-European-related substratum in Sumerian language and texts ca. 3500 BC, which may explain some non-Sumerian, non-Semitic word forms. It is just one of many theories concerning this substratum.

eneolithic_steppe
Diachronic map of Eneolithic migrations ca. 4000-3100 BC

This is a summary of his findings from his latest writing on the subject (a chapter of a book on Indo-European phonetics, from the series Copenhagen Studies in Indo-European):

In Sumerian and Akkadian vocabulary, the cuneiform writing system, and the names of deities and places in Southern Mesopotamia a body of lexical material has been preserved that strongly suggests influence emanating from a superstrate of Indo-European origin. This Indo-European language, which has been given the name Euphratic, is, at present, attested only indirectly through the filters of Sumerian and Akkadian. The attestations consist of words and names recorded from the mid-4th millennium BC (Late Uruk period) onwards in texts and lexical lists. In addition, basic signs that originally had a recognizable pictorial structure in proto-cuneiform preserve (at least from the early 3rd millennium on) a number of phonetic values with no known motivation in Sumerian lexemes related semantically to the items depicted. This suggests that such values are relics from the original logographic values for the items depicted and, thus, that they were inherited from a language intimately associated with the development of writing in Mesopotamia. Since specialists working on proto-cuneiform, most notably Robert K. Englund of the Cuneiform Digital Library Initiative, see little or no evidence for the presence of Sumerian in the corpus of archaic tablets, the proposed Indo-European language provides a potential solution to this problem. It has been argued that this language, Euphratic, had a profound influence on Sumerian, not unlike that exerted by Sumerian and Akkadian on each other, and that the writing system was the primary vehicle of this influence. The phonological sketch drawn up here is an attempt to chart the salient characteristics of this influence, by comparing reconstructed Indo-European lexemes with similarly patterned ones in Sumerian (and, to a lesser extent, in Akkadian).

His original model, based on phonetic values in basic proto-cuneiform signs, is quite imaginative and a very interesting read, if you have the time. His Academia.edu account hosts most of his papers on the subject.

We could speculate about the potential expansion of this substrate language with the commercial contacts between Uruk and Maykop (as I did), now probably more strongly supported because of the EHG found in Maykop samples.

NOTE. We could also put it in relation with the Anatolian language of Mari, but this would require a new reassessment of its North-West Indo-European nature.

Nevertheless, this theory is far from being mainstream, anywhere. At least today.

NOTE. The proposal remains hypothetical, because of the flaws in the Indo-European parallels – similar to Koch’s proposal of Indo-European in Tartessian inscriptions. A comprehensive critical approach to the theory is found in Sylvie Vanséveren’s A “new” ancient Indo-European language? On assumed linguistic contacts between Sumerian and Indo-European “Euphratic”, in JIES (2008) 36:3&4.

Gutian language

References to Gutian are popping up related to the Hajji Firuz samples of the mid-3rd millennium.

The hypothesis was put forward by Henning (1978) in purely archaeological terms.

This is the relevant excerpt from the book:

(…) Comparativists have asserted that, in spite of its late appearance, Tokharian is a relatively archaic form of Indo-European. This claim implies that the speakers of this group separated from their Indo-European brethren at a comparatively early date. They should accordingly have set out on their migrations rather early, and should have appeared within the Babylonian sphere of influence also rather early. Earlier, at any rate, than the Indo-Iranians, who spoke a highly developed (therefore probably later) form of Indo-European. Moreover, as some of the Indo-Iranians after their division into Iranians and Indo-Aryans appeared in Mesopotamia about 1500 B.C., we should expect the Proto-Tokharians about 2000 B.C. or even earlier.

If, armed with these assumptions as our working hypothesis, we look through the pages of history, we find one nation – one nation only – that perfectly fulfills all three conditions, which, therefore, entitles us to recognize it as the “Proto-Tokharians”. Its name was Guti; the initial is also spelled with q (a voiceless back velar or pharyngeal), but the spelling with g is the original one. The closing -i is part of the name, for the Akkadian case-endings are added to it, nom. Gutium etc. Guti (or Gutium, as some scholars prefer) was valid for the nation, considered as an entity, but also for the territory it occupied.
(…).

The text goes on to follow the invasion of Babylonia by the Guti, and further eastward expansions supposedly connected with these, to form the attested Tocharians.

The referenced text by Thorkild Jacobsen offers some interesting linguistic data:

Among the Gutian rulers is one Elulumesh, whose name is evidently Akkadian Elulum slightly “Gutianized” by the Gutian case(?) ending -eš. This Gutian ruler Elulum is obviously the same man whom we find participating in the scramble for power after the death of Shar-kali-sharrii; his name appears there in Sumerian form without mimation as Elulu.

The Gutian dynasty, from ca. the 22nd c. BC, appears as follows:

gutian-rulers

I don’t think we could derive a potential relation to any specific Indo-European branch from this simple suffix repeated in Gutian rulers, though.

The hypothesis of the Tocharian-like nature of the Guti (apart from the obvious error of considering them as the ancestors of Tocharians) has not been tested in newer works since. It was cited e.g. by Gamkrelidze and Ivanov (1995) to advance their Armenian homeland, and by Mallory and Adams in their Encyclopedia (1997).

It lies therefore in the obscurity of undeveloped archaeological-linguistic hypotheses, and its connection with the attested R1b-Z2103 samples from Iran is not (yet) warranted.

Y-DNA haplogroup R1b-Z2103 in Proto-Indo-Iranians?

chalcolithic_early-asia

We already know that the Sintashta -> Andronovo migrants will probably be dominated by Y-DNA R1a-Z93 lineages. However, I doubt it will be the only Y-DNA haplogroup found.

I said in my predictions for this year that there might not be much new genetic data to ascertain how Pre-Indo-Iranian survived the invasion, gradual replacement and founder effects that happened in terms of male haplogroups after the arrival of late Corded Ware migrants, and that we would probably have to rely on anthropological explanations for language continuity despite genetic replacement, as in the Basque case.

Nevertheless, since we have very few samples, I think we could still see a clear genetic contribution from Yamna to Corded Ware immigrants in the North Caspian region (from Abashevo, in turn a mix of Fatyanovo/Balanovo and Catacomb/Poltavka cultures) in terms of:

  • Ancestral components and PCA in new Sintashta-Petrovka, Andronovo, and/or later samples – similar to the ‘steppe’ drift seen in Potapovka relative to Sintashta samples, both formed by incoming Corded Ware migrants – ; and
  • R1b-L23 subclades, either appearing scattered during the Sintashta melting pot (of Abashevo/R1a-Z645 and East Yamna-Poltavka/R1b-Z2103 peoples), or resurging after this period, as we have seen in Pre-Balto-Slavic territory.

This contribution could better explain the obvious language continuity in the region, beautifully complementing the complex anthropological model we have now of archaeological continuity of Sintashta and Potapovka with the previous Poltavka, seen in a similar material and symbolic culture that survived the arrival of newcomers.

A lot of people seem to be looking like crazy since O&M 2018 for some sort of connection between Corded Ware and Yamna migrants in Eastern and Central Europe (whether in SNP calls of samples published, or among almost forgotten academic papers), either to support the ideas of the 2015 papers – for those who relied on their conclusions and built (even if only mentally) far-fetched migration models around it – , or just because of some sort of absurd continuity theory involving modern R1a-Z645 subclades.

NOTE. The situation we have seen with the hundreds of samples from O&M 2018, and with the recent additional Eastern European samples, depicts an unexpectedly clear-cut distinction in Y-DNA haplogroups between Corded Ware and Yamna/Bell Beaker: I really can’t see how the situation could be more obvious for everyone, so I doubt any further samples will make certain people change their minds. Their hope is, I guess, that just one sample may give some more oxygen to infinite pet theories, as we are still surprisingly seeing even with reactionary R1b autochthonous continuists in Western Europe…

However, looking into the most likely future for the field, what we should be expecting right now is continuity of Yamna ancestry and lineages in early Proto-Indo-Iranian territory. Since we only have a few samples from Sintashta-Petrovka, Potapovka, and Andronovo, I think there might be a sizeable number of R1b-Z2103 subclades in the territory inhabited by those who – no doubt – spread the language into Central Asia.

Haplogroup_R1b_(Y-DNA)
Modern Y-DNA haplogroup R1b distribution, by Maulucioni at Wikipedia

While full population replacement by R1a-Z93 lineages in the North Caspian region ca. 2000 BC is not impossible, I don’t think it is very likely, since we already know that there are R1b-Z2103 lineages widely distributed in Indo-Iranian-speaking territory, and Z93 is now known to be an older subclade than YFull’s mean formation date suggested (due to the Ukraine_Eneolithic I6561 sample‘s SNP call), so what we can infer now that actually happened in Sintashta -> Andronovo is not exactly the spread of haplogroup Z93 during its formation, but rather a regional reduction in its variability coupled with the expansion of some of its subclades.

The main question, after the South Asia paper is finally published, will then be:

  1. Given that Yamna peoples were an elite group of patrilineally-related families mainly of R1b-L23 subclades:
  2. Accepting that PCA, ADMIXTURE, and other statistical methods are not relevant (alone) for ethnolinguistic identification: e.g. Yamna ‘outliers’ and East Bell Beaker migrants of R1b-L23 lineages without steppe ancestry; N1c1a1a-L392 lineages and Siberian ancestry unrelated to Uralic speakers; R1a-Z645 and steppe ancestry in North-East Europe related to Uralic-speaking cultures
  3. If we find now, as I expect, genetic continuity of east Yamna in Sintashta -> Andronovo (relative to other late Corded Ware peoples), probably including haplogroup R1b-Z2103 mixed with R1a-Z93 before its further reduction of subclades (e.g. to L657) and expansion during its subsequent spread southward…

bronze_age_early_Asia-andronovo
Diachronic map of migrations in Asia ca. 2250-1750 BC

Why exactly do we need Corded Ware to explain migrations of Late Indo-European speakers?

In other words: if we had had in 2015 the data we have today, would we have needed Corded Ware to explain Indo-European migrations from the steppe? Are some people so blinded by their will to (appear to) be right in their past interpretations that they can’t just let go?

NOTE. On a side note, wouldn’t it be nice for this paper to publish some other R1b-L23 (x2103) sample – maybe even R1b-L51 – in Yamna, Andronovo, or Afanasevo territory, to end both autochthonous continuity theories (of North-Eastern and Western Europe) at the same time?

I really hope someone in David Reich’s team understands this matter, or else they will still identify Corded Ware as the (now probably ‘a’ instead) vector of expansion of Indo-European languages, and some of us will still have fun for another 2 or 3 years with such conclusions, until someone in the lab realizes that ancestry ≠ population ≠ ethnic identification ≠ language.

NOTE. It seems rather dull to read how people are discussing in the Twitterverse conventional constructs like ‘human race‘ as found in Reich’s op-ed in The New York Times, as if such grandiose semantic discussions had any practical meaning, when basic anthropological questions actually relevant for Genomics, like the essential ancestral component ≠ people tenet seem not to be of interest for anyone in the field….

Since our Indo-European demic diffusion model (and its consequences for our reconstruction of North-West Indo-European) and this blog are becoming more and more popular each day – judging by the constant growth in visits in the past 6 months or so – , I guess the simplemindedness and predictability of certain geneticists is benefitting traditional anthropology directly, driving more and more amateur geneticists to look for sound academic models to answer the growing inconsistencies of genetic research.

NOTE. I am not saying the rejection of Corded Ware as spreading Indo-European is definitive. Maybe more samples within some years will depict a clear ancient expansion of Early or Middle Proto-Indo-Europeans from Khvalynsk to the forest-steppe and forest zone, and later with certain Corded Ware migrants into Central Europe, over whose territory a Late Indo-European dialect from Bell Beakers became the superstrate, as some have proposed in the past – e.g. to explain Krahe’s Old European hydronymy. I really doubt you could demonstrate such an old ethnolinguistic identification with a clear, unbroken archaeological trail, though, and we know now that this old hydronymy is probably of Late Indo-European nature (possibly even more recent).

What I am saying is: with the data we have now, it does not make any sense to keep the anthropological models invented by geneticists ex nihilo in 2015, and the hundred different alternative Late Indo-European migration models that are born with each new paper.

These Yamna -> Corded Ware migration models have not made any sense to me since early 2016, but now after O&M 2017, and especially O&M 2018, I don’t think any geneticist with a little knowledge in Linguistics or Archaeology (if they are decent about their quest for truth in describing ancient European migrations) would buy them, if not for some sort of created ‘tradition’. So let’s ditch Corded Ware as Late Indo-European-speaking, let’s accept that late Corded Ware migrants should most likely be identified as early Uralic speakers, and then future data will tell if we are – again – wrong.

Please, don’t let Genomics become another pseudoscience based solely on Bioinformatics like glottochronology: let anthropologists (preferably mainstream archaeologists, but also the true Indo-Europeanists, linguists) help you interpret your raw data. Don’t deceive yourselves thinking that you have read enough about the Indo-European question, or that you know enough Indo-Europeanists (say what?) to derive your own conclusions.

Use the South Asia paper to begin expressly retracting the Corded Ware mess.

Please pretty please with sugar on top?

For commenters: this post concerns an anthropological question, and deals with the expansion of Late Proto-Indo-European speakers from Yamna, and the controversy surrounding the role of Corded Ware migrants that a handful of academics propose spread from it, based on a renewed model of Gimbutas’ outdated Kurgan theory and on the so-called ‘Yamnaya’ ancestry.

It so happens that the discussion has turned lately mainly to ancient Y-DNA haplogroups, because they help confirm previous mainstream anthropological models of cultural diffusion and migration. It is obviously not reasonable to judge prehistoric ethnolinguistic migrations from ca. 5,000 years ago based on historical nation-states and ethnic or religious concepts invented since the Middle Ages, coupled with “your” people’s main modern (or your own) paternal lineage.

EDIT (27 MAR 2018): Minor corrections and post made shorter.

Admixture of Srubna and Huns in Hungarian conquerors

hungarian-conqueror-migrations

New preprint at BioRxiv, Mitogenomic data indicate admixture components of Asian Hun and Srubnaya origin in the Hungarian Conquerors, by Neparáczki et al. (2018).

Abstract (emphasis mine):

It has been widely accepted that the Finno-Ugric Hungarian language, originated from proto Uralic people, was brought into the Carpathian Basin by the Hungarian Conquerors. From the middle of the 19th century this view prevailed against the deep-rooted Hungarian Hun tradition, maintained in folk memory as well as in Hungarian and foreign written medieval sources, which claimed that Hungarians were kinsfolk of the Huns. In order to shed light on the genetic origin of the Conquerors we sequenced 102 mitogenomes from early Conqueror cemeteries and compared them to sequences of all available databases. We applied novel population genetic algorithms, named Shared Haplogroup Distance and MITOMIX, to reveal past admixture of maternal lineages. Phylogenetic and population genetic analysis indicated that more than one third of the Conqueror maternal lineages were derived from Central-Inner Asia and their most probable ultimate sources were the Asian Huns. The rest of the lineages most likely originated from the Bronze Age Potapovka-Poltavka-Srubnaya cultures of the Pontic-Caspian steppe, which area was part of the later European Hun empire. Our data give support to the Hungarian Hun tradition and provides indirect evidence for the genetic connection between Asian and European Huns. Available data imply that the Conquerors did not have a major contribution to the gene pool of the Carpathian Basin, raising doubts about the Conqueror origin of Hungarian language.

hungarian-conqueror-mtdna
“Comparison of major Hg distributions from modern and ancient populations. Asian main Hg-s are designated with brackets. Major Hg distribution of Conqueror samples from this study are very similar to that of other 91 Conquerors taken from previous studies [11,12]. Scythians and ancient Xiongnus show similar Hg composition to the bracketed Asian fraction of the Conqueror samples, but Hg B is present just in Xiongnus. Modern Hungarians have very small Asian components pointing at small contribution from the Conquerors. Of the 289 modern Hungarian mitogenomes 272 are published in [29]. Scythian Hg-s are from [48,49,55,59,71–74]. Xiongnu Hg-s are from [66–69].”

Just recently, another article contributed to a similar idea. I already talked about the Bronze Age R1a-Z93 sample with high steppe ancestry found in the Balkans, and its likely origin in an expansion of the Srubna or a related culture. No truce, therefore, for those looking for autochthonous continuity anywhere in Europe.

We are seeing how multiple migrations, often from the Pontic-Caspian steppe, shaped the history of the Carpathian Basin (and its complex genetic structure) and of Europe in general. That is clear from many different prehistoric and historical periods, such as the expansions of Suvorovo-Novodanilovka, Yamna, Srubna, Thraco-Cimmerians, Sarmatians, Scythians, Huns, …

As for the linguistic interpretations based on genetics contained in the paper (Hungarian language as a legacy of Huns), well, you know my stance regarding the Yamnaya ancestral concept (and the wrong linguistic interpretations derived from it, which many sadly keep to this day), and regarding the use of genetics in general to solve language questions.

This is yet another example of how (what some people would call) “scientific data” is useless without sound anthropological models.

Featured image, from the article: “Hypothetic origin and migration route of different components of the Hungarian Conquerors. Bluish line frames the Eurasian steppe zone, within which all presumptive ancestors of the Conquerors were found. Yellow area designates the Xiongnu Empire at its zenith from which area the East Eurasian lineages originated. Phylogeographical distribution of modern East Eurasian sequence matches (Fig. 1) well correspond to this territory, especially considering that Yakuts, Evenks and Evens lived more south in the past [108], and European Tatars also originated from this area. Regions where Asian and European Scythian remains were found are labeled green, pink is the presumptive range of the Srubnaya culture. Migrants of Xiongnu origin most likely incorporated descendants of these groups. The map was created using QGIS 2.18.4[109]”.

Article available under a CC-BY-NC-ND 4.0 International license.

Discovered via Razib Khan.

The concept of “Outlier” in Human Ancestry (II): Early Khvalynsk, Sredni Stog, West Yamna, Iron Age Bulgaria, Potapovka, Andronovo…

yamna-corded-ware-bell-beaker

I already wrote about the concept of outlier in Human Ancestry, so I am not going to repeat myself. This is just an update of “outliers” in recent studies, and their potential origins (here I will repeat some of the examples):

Early Khvalynsk: the three samples from the Samara region have quite different positions in PCA, from nearest to EHG (of Y-DNA haplogroup R1a) to nearest to ANE ancestry (of Y-DNA haplogroup Q). This could represent the initial consequences of the second wave of ANE ancestry (as found later in Yamna samples from a neighbouring region), possibly brought then by Eurasian migrants related to haplogroup Q.
With only 3 samples, this is obviously just a tentative explanation of the finds. The samples can only be reasonably said to show an unstable time for the region in terms of admixture (i.e. probably migration), judging by the data on PCA.

Ukraine Eneolithic samples offer a curious example of how the concept of outlier can change radically: from the third version (May 30th) of the preprint paper of Mathieson et al. (2017), when the Ukraine Eneolithic sample with steppe ancestry (and clustering with central European samples) was the ‘outlier’, to the fourth version (September 19th), when two samples with steppe ancestry clustering close to Corded Ware samples were now the ‘normal’ ones (i.e. those representing Ukraine Eneolithic population), and the outlier was the one clustering closely with Ukraine Mesolithic samples…

pca-admixture-yamna
PCA and Admixture for south-eastern Europe. Image modified from Mathieson et al. (2017) – Third revision (May 30th), used in the 2nd edition of the Indo-European demic diffusion model.

This is one of the funny consequences of the wrong interpretation of the ‘yamnaya component’, which made geneticists believe at first that, out of two samples (!), the ‘outlier’ was the one with ‘yamnaya’ ancestry, because this component would have been brought by an eastern immigrant from early Khvalynsk…

This example offers yet another reason why precise anthropological context is necessary to offer the right interpretation of results. Within the Indo-European demic diffusion model, based mainly on Archaeology and Linguistics, the sample with steppe ancestry was the most logical find in the region for a potential origin of the Corded Ware culture, and it was interpreted as such, well before the publication of the fourth version of Mathieson et al. (2017).

pca-south-east-europe
PCA of South-East European and other European samples. Image modified from Mathieson et al. (2017) – Fourth revision (September 19th), used in the 3rd edition of the Indo-European demic diffusion model.

West Yamna (to insist on the same question, the ‘yamnaya’ component): we have only four western Yamna samples, two of them showing Anatolian Neolithic ancestry (one of them, from Ukraine, with a strong ‘southern’ drift). On the other hand, Corded Ware migrants do not show this. So we could infer that their migrations were not coetaneous: whereas peoples of Corded Ware culture expanded ca. 3300 BC to the north, in the natural corridor to the Baltic that has been proposed for this culture in Archaeology for decades (and that is well represented by Ukraine Eneolithic samples), peoples of Yamna culture expanded to the west, replacing the Ukraine Eneolithic population (i.e. probably those of ‘Proto-Corded Ware culture’), and eventually mixing with Balkan populations of Anatolian Neolithic ancestry.

Potapovka, Andronovo, and Srubna: while Potapovka clusters closely to the steppe, and Andronovo (like Sintashta) clusters closely to Corded Ware (i.e. Ukraine Neolithic / Central-East European), both have certain ‘outliers’ in PCA: the former has one individual clustering closely to Corded Ware, and the latter to the steppe. Both ‘outliers’ fit well with the interpretation of the recent mixture of Corded Ware peoples with steppe populations, and they offer a different image for the evolution of populations of Potapovka and Sintashta-Petrovka, potentially influencing their language. The position of Srubna samples, nearer to Sintashta and Andronovo (but occupying the same territory as the previous Potapovka) offers the image of a late westward conquest from Corded Ware-related populations.

asia-early-bronze
Diachronic map of migrations ca. 2250-1750 BC

Iron Age Bulgaria: a sample of haplogroup R1a-Z93, with more ‘yamnaya’ ancestry than any other previous sample from the Balkans. For some, it might mean continuity from an older time. However, as with the Corded Ware outlier from Esperstedt before it, it is more likely a recent migrant from the steppe, i.e. from either the Srubna culture or a related group. Its relatively close clustering in PCA with certain recent Slavic populations can be interpreted in light of the multiple back and forth migrations in the region: of steppe populations to the west (Srubna, Cimmerians, Scythians, Sarmatians, …), and of Slavic-speaking populations:

middle-bronze-age-middle-east
Diachronic map of Bronze Age migrations ca. 1750-1250 BC.

Well-defined outliers are, therefore, essential to understand a recent history of admixture. On the other hand, the very concept of “outlier” can be a dangerous tool when the lack of enough samples makes their classification as such unjustified, leading to wrong interpretations.
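To put a number on the sample-size problem, here is a minimal sketch in Python. It is not taken from any of the papers discussed: the function names and the PC1 values are hypothetical, and a one-dimensional z-score stands in for whatever criterion a given study actually uses. It relies on a general statistical fact: in a sample of size n, no point can lie more than (n - 1) / sqrt(n) sample standard deviations from the mean, so with two or three samples nothing can ever pass a conventional "2 SD" threshold, however far apart the individuals plot.

```python
import math
import statistics


def max_abs_z(values):
    """Largest absolute z-score in a sample, using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n - 1 denominator
    return max(abs(v - mean) / sd for v in values)


def z_ceiling(n):
    """Upper bound on any absolute z-score in a sample of size n."""
    return (n - 1) / math.sqrt(n)


# Hypothetical PC1 coordinates (illustrative values, not real data):
# two individuals plotting together and one plotting far away.
three_samples = [0.0, 0.01, 5.0]
# Two individuals plotting very far apart.
two_samples = [0.0, 100.0]

print("n=3 observed:", max_abs_z(three_samples), "ceiling:", z_ceiling(3))
print("n=2 observed:", max_abs_z(two_samples), "ceiling:", z_ceiling(2))
```

Under these assumptions, the ceiling for three samples is 2 / sqrt(3), about 1.15, and for two samples about 0.71. The data alone therefore cannot say which individual is the ‘outlier’; that call has to come from the archaeological and anthropological context.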

Related: