Yamnaya replaced Europeans, but admixed heavily as they spread to Asia

Recent papers: The formation of human populations in South and Central Asia, by Narasimhan, Patterson et al., Science (2019), and An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers, by Shinde et al., Cell (2019).

NOTE. For direct access to Narasimhan, Patterson et al. (2019), visit this link courtesy of the first author and the Reich Lab.

I am no longer on holiday, and the information in the paper is huge, with many complex issues raised by the new samples and analyses rather than solved, so I will stick to the Indo-European question, especially to some details that have changed since the publication of the preprint. For a summary of its previous findings, see the book series A Song of Sheep and Horses, in particular the sections from A Clash of Chiefs where I discuss languages and regions related to Central and South Asia.

I have updated the maps of the Prehistory Atlas, and included the most recently reported mtDNA and Y-DNA subclades. I will try to update the Eurasian PCA and related graphics, too.

NOTE. Many subclades from this paper have been reported by Kolgeh (download), Pribislav and Principe at Anthrogenica on this thread. I have checked some of them for comparison, but wherever my results contradict their analyses, mine are the ones likely to be wrong. I will upload my spreadsheets and link to them from this page whenever I find the time.

Ancestry clines (1) before and (2) after the advent of farming. Colour modified from the original to emphasize the CHG cline: notice the apparent relevance of forest-steppe groups in the formation of this CHG mating network from which Pre-Yamnaya peoples emerged.


I think the Narasimhan, Patterson et al. (2019) paper is well-balanced, and unexpectedly centered – as it should be – on the spread of Yamnaya-related ancestry (now Western_Steppe_EMBA) as the marker of Proto-Indo-European migrations, which stretched ca. 3000 BC “from Hungary in the west to the Altai mountains in the east”, spreading later Indo-European dialects after admixing with local groups, from the Atlantic to South Asia.

I. Afanasievo

I.1. East or West PIE?

I expected Afanasievo to show (1) R1b-L23(xZ2103, xL51) and (2) R1b-L51 lineages, apart from (3) the known R1b-Z2103 ones, pointing thus to an ancestral PIE community before the typical Yamnaya bottlenecks, and with R1b-L51 supporting a connection with North-West Indo-European. The presence of some samples of hg. Q pointed in this direction, too.

However, Afanasievo samples show overwhelmingly R1b-Z2103 subclades (all except for those with low coverage), all apparently under R1b-Z2108 (formed ca. 3500 BC, TMRCA ca. 3500 BC), like most samples from East Yamnaya.

This necessarily shifts the split and spread of R1b-L23 lineages to Khvalynsk/early Repin-related expansions, in line with what TMRCA suggested, and what advances by Anthony (2019) and Khokhlov (2018) on future samples from the Reich Lab suggest.

Given the almost indistinguishable ancestry of Afanasievo and Early Yamnaya, population genomics offers as yet little support for the idea that Pre-Tocharians were more closely related to North-West Indo-Europeans than to Graeco-Aryans, as is proposed in linguistics based on the few traits shared between them, and the lack of innovations proper to the Graeco-Aryan community.

NOTE. A new issue of Wekʷos contains an abstract from a relevant paper by Blažek on vocabulary for ‘word’, including the common NWIE *wrdʰo-/wordʰo-, but also a new (for me, at least) Northern Indo-European one: *rēki-/*rēkoi̯-, shared by Slavic and Tocharian.

The fact that bottlenecks happened around the time of the late Repin expansion suggests that we might be able to see different clans based on the predominant lineages developing around the Don-Volga area in the 4th millennium BC. The finding of Pre-R1b-L51 in Lopatino (see below), and of a Catacomb sample of hg. R1b-Z2103(Z2105-) in the North Caucasus steppe near Novoaleksandrovskij, also supports a star-like phylogeny of R1b-L23 stemming from the Don-Volga area.

NOTE. Interestingly, a dismissal of a common trunk between Tocharian and North-West Indo-European would mean that shared similarities between such disparate groups could be traced back to a Common Late PIE trunk, and not to a shared (western) Repin community. For an example of such a ‘pure’ East-West dialectal division, see the diagram of Mallory & Adams (2006) at the end of the post. It would thus mean a fatal blow to Kortlandt’s Indo-Slavonic group among other hypothetical groupings (remade versions of the ancient Centum-Satem division), as well as to certain assumptions about laryngeal survival or tritectalism that usually accompany them. Still, I don’t think this is the case, so the question will remain a linguistic one, and maybe, with a sufficient number of samples, some similarities will be found that differentiate Northern Indo-Europeans from the East Yamna/Catacomb-Poltavka-Balkan_EBA group.

Y-chromosome haplogroups of Afanasievo samples and neighbouring groups. See full maps.

I.2. Expansion or resurgence of hg. Q1b?

Haplogroup Q1b-Y6802(xY6798) seems to be the main lineage that expanded with Afanasievo, or resurged in their territory. It’s difficult to tell, because the three available samples are family, and belong to a later period.

NOTE. I have finally put some order to the chaos of Q1a vs. Q1b subclades in my spreadsheet and in the maps. Because of the change in nomenclature from ISOGG 2016 to 2017, many samples reported as Q1 subclades in papers prepared during the 2017-2018 period, and which did not provide specific SNP calls, were impossible to define with certainty. By checking some of them I could determine the specific standard used.

In favour of the presence of this haplogroup in the Pre-Yamnaya community are:

  • The statement by Anthony (2019) that Q1a [hence maybe Q1b in the new ISOGG nomenclature] represented a significant minority among an R1b-rich community.
  • The sample found in a Sintashta WSHG outlier (see below), of hg. Q1b-Y6798, and the sample from Lola, of hg. Q1b-L717, are from other lineage(s) separated by thousands of years from the Afanasievo subclade, but might be related to the Khvalynsk expansion, like R1b-V1636 and R1b-M269 are.

These are the data that suggest multiple resurgence events in Afanasievo, rather than expanding Q1b lineages with late Repin:

  • Overwhelming presence of R1b in early Yamnaya and Afanasievo samples; one Q1(xQ1b) sample reported in Khvalynsk.
  • The three Q1b samples appear only later, although wide CI for radiocarbon dates, different sites, and indistinguishable ancestry may preclude a proper interpretation of the only available family.
    • Nevertheless, ancestry seems unimportant in the case of Afanasievo, since the same ancestry is found up to the Iron Age in a community of varied haplogroups.
  • Another sample of hg. Q1b-Y6802(xY6798) is found in Aigyrzhal_BA (ca. 2120 BC), with Central_Steppe_EMBA (WSHG-related) ancestry; however, this clade formed and expanded ca. 14000 BC.
  • The whole Altai – Baikal area seems to be a Q1b-L54 hotspot, although admittedly many subclades separated very early from each other, so they might be found throughout North Eurasia during the Neolithic.
  • One Afanasievo sample is reported as of hg. C in Shin (2017), and the same haplogroup is reported by Hollard (2014) for the only available sample of early Chemurchek to date, from Kulala ula, North Altai (ca. 2400 BC).
Y-chromosome haplogroups of late Afanasievo – early Chemurchek samples and neighbouring groups. See full maps.

I.3. Agricultural substrate

Evidence of continuous contacts of Central_Steppe_MLBA populations with BMAC from ca. 2100 BC on – visible in the appearance of Steppe ancestry among BMAC samples and BMAC ancestry among Steppe pastoralists – supports the close interaction between Indo-Iranian pastoralists and BMAC agriculturalists as the origin of the Asian agricultural substrate found in Proto-Indo-Iranian, hence likely related to the language of the Oxus Civilization.

Similar to the European agricultural substrate adopted by West Yamnaya settlers (both NWIE and Palaeo-Balkan speakers), Tocharian shows a few substrate terms in common with Indo-Iranian, which can be explained by contacts in different dialectal stages through phonetic reconstruction alone.

The recent Hermes et al. (2019) supports the early integration of pastoralism and millet cultivation in Central Asia (ca. 2700 BC or earlier), with the spread of agriculture to the north – through the Inner Asian Mountain Corridor – being thus unrelated to the Indo-Iranian expansions, which might support independent loans.

However, compared to the huge number of parallel shared loans between NWIE and Palaeo-Balkan languages in the European substratum, Indo-Iranians seem to have been the first borrowers of vocabulary from Asian agriculturalists, while Proto-Tocharian shows just one certain related word, with phonetic similarities that warrant an adoption from late Indo-Iranian dialects.

Y-chromosome haplogroups of Sintashta, Central Asia, and neighbouring groups in the Early Bronze Age. See full maps.

The finding of hg. (pre-)R1b-PH155 in a BMAC sample from Dzharkutan (to the west of Xinjiang), together with hg. R1b in a sample from Central Mongolia previously reported by Shin (2017), supports the widespread presence of this lineage to the east and west of Xinjiang, which means it might have become incorporated into Indo-Iranian migrants into the Xiaohe horizon, into the Afanasievo-Chemurchek-derived groups, or the latter from the former. In other words, the Island Biogeography Theory with its explanation of founder effects might be, after all, applicable to the whole Xinjiang area, not only during the Chemurchek – Tianshan-Beilu – Xiaohe interaction.

Of course, there is no need for too complicated models of haplogroup resurgence events in Central and South Asia, seeing how the number of hg. R1a-L657 samples (a lineage today prevalent among Indo-Aryan speakers from South Asia) among ancient Western/Central_Steppe_MLBA-related individuals is exactly 0, and that many different lineages survived in the region. Similar cases of haplogroup resurgence and Y-DNA bottleneck events are also found in the Central and Eastern Mediterranean, and in North-Eastern Europe. From the paper:

[It] could reflect stronger ecological or cultural barriers to the spread of people in South Asia than in Europe, allowing the previously established groups more time to adapt and mix with incoming groups. A second difference is the smaller proportion of Steppe pastoralist–related ancestry in South Asia compared with Europe, its later arrival by ~500 to 1000 years, and a lower (albeit still significant) male sex bias in the admixture (…).

Y-chromosome haplogroups of samples from the Srubna-Andronovo and Andronovo-related horizon, Xiaohe, late BMAC, and neighbouring groups. See full maps.

II. R1b-Beakers replaced R1a-CWC peoples

II.1. R1a-M417-rich Corded Ware

Newly reported Corded Ware samples from Radovesice show hg. R1a-M417, at least some of them xZ645, ‘archaic’ lineages shared with the early Bergrheinfeld sample (ca. 2650 BC) and with the coeval Esperstedt family, hence supporting that it eventually became the typical Western Corded Ware lineage(s), probably dominating over the so-called A-horizon and the Single Grave culture in particular. On the other hand, R1a-Z645 was typical of bottlenecks among expanding Eastern Corded Ware groups.

Interestingly, it is supported once again that known bottlenecks under hg. R1a-M417 happened during the Corded Ware expansion, evidenced also by the remarkably high variability of male lineages among early Corded Ware samples. Similarly, these Corded Ware samples from Bohemia form part of the typical ‘Central European’ cluster in the PCA, which excludes once again not only the ‘official’ Esperstedt outlier I1540, but also the known outlier with Yamnaya ancestry.

NOTE. The fact that Esperstedt is closely related geographically and in terms of ancestry to later Únětice samples further complicates the assumption that Únětice is a mixture of Bell Beakers and Corded Ware, being rather an admixture of incoming Bell Beakers with post-Yamnaya vanguard settlers who admixed with Corded Ware (see more on the expansion of Yamnaya ancestry). In other words, Únětice is rather an admixture of Yamnaya+EEF with Yamnaya+(CWC+EEF).

Y-chromosome haplogroups of samples from Catacomb, Poltavka, Balkan EBA, and Bell Beaker, as well as neighbouring groups. See full maps.

On Ukraine_Eneolithic I6561

If the bottlenecks are as straightforward as they appear, with a star-like phylogeny of R1a-M417 starting with the Pre-Corded Ware expansion, then what is happening with the Alexandria sample, so precisely radiocarbon dated to ca. 4045-3974 BC? The reported hg. R1a-M417 was fully compatible, while R1a-Z645 could be compatible with its date, but the few positive SNPs I got in my analysis point indeed to a potential subclade of R1a-Z94, and I trust more experienced hobbyists in this ‘art’ of ascertaining the SNPs of ancient samples, and they report hg. R1a-Z93 (Z95+, Y26+, Y2-).

Seeing how Y-DNA bottlenecks worked in Yamnaya-Afanasievo and in Corded Ware and related groups, and if this sample really is so deep within R1a-Z93 in a region that should be more strongly affected by the known Neolithic Y-chromosome bottlenecks and forest-steppe ecotone, someone from the lab responsible for this sample should check its date once again, before more people keep chasing their tails with an individual that (based on its derived SNPs’ TMRCA) might actually be dated to the Bronze Age, where it could make much more sense in terms of ancestry and position in the PCA.

EDIT (14 SEP 2019): … and with the fact that he is the first individual to show the genetic adaptation for lactase persistence (13910-T), which is only found later among Bell Beakers, and much later in Sintashta and related Steppe_MLBA peoples (see comments below).

This is also evidenced by the other Ukraine_Eneolithic (likely a late Yamnaya) sample of hg. R1b-Z2103 from Dereivka (ca. 2800 BC), who – despite being in a similar territory 1,000 years later – shows a wholly diluted Yamnaya ancestry under typically European HG ancestry, even more so than other late Sredni Stog samples from Dereivka of ca. 3600-3400 BC, suggesting a decrease in Steppe ancestry rather than an increase – which is supposedly what should be expected based on the ancestry from Alexandria…

Like the reported Chalcolithic individual of Hajji Firuz who showed an apparently incompatible subclade and Yamnaya ancestry at least some 1,000 years before it should, and turned out to be from the Iron Age (see below), this may be another case of wrong radiocarbon dating.

NOTE. It would be interesting, if this turns out to be another Hajji Firuz-like error, to check how well different ancestry models worked in whose hands exactly, and if anyone actually pointed out that this sample was derived, and not ancestral, to many different samples that were used in combination with it. It would also be a great control to check if those still supporting a Sredni Stog origin for PIE would shift their preference even more to the north or west, depending on where the first “true” R1a-M417 samples popped up. Such a finding now could be thus a great tool to discover whether haplogroup-based bias plays a role in ancestry magic as related to the Indo-European question, i.e. if it really is about “pure statistics”, or there is something else to it…

II.2. R1b-L51-rich Bell Beakers

The overwhelming majority of R1b-L51 lineages in Radovesice during the Bell Beaker period, just after the sampled Corded Ware individuals from the same site, further strengthens the hypothesis of an almost full replacement of R1a-M417 lineages from Central Europe up to southern Scandinavia after the arrival of Bell Beakers.

Yet another R1b-L151* sample has popped up in Central Europe, in the individual classified as Bilina_BA (ca. 2200-800 BC), which clusters with Bell Beakers from Bohemia, with the outlier from Turlojiškė, and with Early Slavs, suggesting once again that a group of central-east European Beakers represented the Pre-Proto-Balto-Slavic community before their spread and admixture events to the east.

The available ancient distribution of R1b-L51*, R1b-L52* or R1b-L151* is getting thus closer to the most likely origin of R1b-L51 in the expansion of East Bell Beakers, who trace their paternal ancestors to Yamnaya settlers from the Carpathian Basin:

NOTE. Some of these are from other sources, and some are samples I have checked in a hurry, so I may have missed some derived SNPs. If you send me a corrected SNP call to dismiss one of these, or more ‘archaic’ samples, I’ll correct the map accordingly. See also maps of modern distribution of R1b-M269 subclades.

Distribution of ‘archaic’ R1b-L51 subclades in ancient samples, overlaid on a map of Yamnaya and Bell Beaker migrations. In blue, Yamnaya Pre-L51 from Lopatino (not shown) and R1b-L52* from BBC Augsburg. In violet, R1b-L51 (xP312,xU106) from BBC Prague and Poland. In maroon, hg. R1b-L151* from BBC Hungary, BA Bohemia, and (not shown) a potential sample from BBC at Mondelange, which is certainly xU106, maybe xP312. Interestingly, the earliest sample of hg. R1b-U106 (a lineage more typical of northern Europe) has been found in a Bell Beaker from Radovesice (ca. 2350 BC), between two of these ‘archaic’ R1b-L51 samples; and a sample possibly of hg. R1b-ZZ11+ (ancestral to DF27 and U152) was found in a Bell Beaker from Quedlinburg, Germany (ca. 2290 BC), to the north-west of Bohemia. The oldest R1b-U152 are logically from Central Europe, too.

III. Proto-Indo-Iranian

Before the emergence of Proto-Indo-Iranian, it seems that Pre-Proto-Indo-Iranian-speaking Poltavka groups were subjected to pressure from Central_Steppe_EMBA-related peoples coming from the (south-?)east, such as those sampled from Mereke_BA. Their ‘kurgan’ culture was dated correctly to approximately the same date as Poltavka materials, but their ancestry and hg. N2(pre-N2a) – also found in a previous sample from Botai – point to their intrusive nature, and thus to difficulties in the Pre-Proto-Indo-Iranian community to keep control over the previous East Yamnaya territory in the Don-Volga-Ural steppes.

We know that the region does not show genetic continuity with a previous period (or was not under this ‘eastern’ pressure) because of an Eastern Yamnaya sample from the same site (ca. 3100 BC) showing typical Yamnaya ancestry. Before Yamnaya, it is likely that Pre-Yamnaya ancestry formed through admixture of EHG-like Khvalynsk with a North Caspian steppe population similar to the Steppe_Eneolithic samples from the North Caucasus Piedmont (see Anthony 2019), so we can also rule out some intermittent presence of a Botai/Kelteminar-like population in the region during the Khvalynsk period.

It is very likely, then, that this competition for the same territory – coupled with the known harsher climate of the late 3rd millennium BC – led Poltavka herders to their known joint venture with Abashevo chiefs in the formation of the Sintashta-Potapovka-Filatovka community of fortified settlements. Supporting these intense contacts of Poltavka herders with Central Asian populations, late ‘outliers’ from the Volga-Ural region show admixture with typical Central_Steppe_MLBA populations: one in Potapovka (ca. 2220 BC), of hg. R1b-Z2103; and four in the Sintashta_MLBA_o1 cluster (ca. 2050-1650 BC), with two samples of hg. R1b-L23 (one R1b-Z2109), one Q1b-L56(xL53), one Q1b-Y6798.

Outlier analysis reveals ancient contacts between sites. We plot the average of principal component 1 (x axis) and principal component 2 (y axis) for the West Eurasian and All Eurasian PCA plots (…). In the Middle to Late Bronze Age Steppe, we observe, in addition to the Western_Steppe_MLBA and Central_Steppe_MLBA clusters (indistinguishable in this projection), outliers admixed with other ancestries. The BMAC-related admixture in Kazakhstan documents northward gene flow onto the Steppe and confirms the Inner Asian Mountain Corridor as a conduit for movement of people.

Similar to how the Sintashta_MLBA_o2 cluster shows an admixture with central steppe populations and hg. R1a-Z645, the WSHG ancestry in those outliers from the o1 cluster of typically (or potentially) Yamnaya lineages shows that Poltavka-like herders survived well after centuries of Abashevo-Poltavka coexistence and admixture events, supporting the formation of a Proto-Indo-Iranian community from the local language as pronounced by the incomers, who dominated as elites over the fortified settlements.

The Proto-Indo-Iranian community likely formed thus in situ in the Don-Volga-Ural region, from the admixture of locals of Yamnaya ancestry with incomers of Corded Ware ancestry – represented by the ca. 67% Yamnaya-like ancestry and ca. 33% ancestry from the European cline. Their community formed thus ca. 1,000 years later than the expansion of Late PIE ca. 3500 BC, and expanded (some 500 years after that) a full-fledged Proto-Indo-Iranian language with the Srubna-Andronovo horizon, further admixing with ca. 9% of Central_Steppe_EMBA (WSHG-related) ancestry in their migration through Central Asia, as reported in the paper.
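To make the proportions above easier to follow, here is a minimal sketch of the arithmetic, assuming (and this is only an assumption for illustration, not the qpAdm modelling used in the paper) a simple two-step linear mixture: a Sintashta-like source of ca. 67% Yamnaya-like and ca. 33% European-cline ancestry that later receives ca. 9% Central_Steppe_EMBA (WSHG-related) ancestry. The function name `admix` and the dictionary labels are hypothetical.

```python
def admix(source, incoming_label, fraction):
    """Simple linear mixture (illustrative assumption, not qpAdm):
    existing components are scaled by (1 - fraction) and the
    incoming component contributes `fraction` of the result."""
    mixed = {label: share * (1.0 - fraction) for label, share in source.items()}
    mixed[incoming_label] = mixed.get(incoming_label, 0.0) + fraction
    return mixed


# Approximate proportions quoted in the text for the Sintashta-like source.
sintashta_like = {"Yamnaya-like": 0.67, "European cline": 0.33}

# Later admixture of ca. 9% WSHG-related ancestry through Central Asia.
steppe_mlba_like = admix(sintashta_like, "Central_Steppe_EMBA (WSHG-related)", 0.09)

for label, share in steppe_mlba_like.items():
    print(f"{label}: {share:.1%}")
```

Under these assumptions the resulting profile would be roughly 61% Yamnaya-like, 30% European cline and 9% WSHG-related, i.e. the Yamnaya-like share remains the largest component even after the Central Asian admixture.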

IV. Armenian

The sample from Hajji Firuz, of hg. R1b-Z2103 (xPF331), has been – as expected – re-dated to the Iron Age (ca. 1193-1019 BC), hence it may offer – together with the samples from the Levant and their Aegean-like ancestry rapidly diluted among local populations – yet another proof of how the Late Bronze Age upheaval in Europe was the cause of the Armenian migration to the Armenoid homeland, where they thrived under the strong influence from Hurro-Urartian.

Y-chromosome haplogroups of the Middle East and neighbouring groups during the Late Bronze Age / Iron Age. See full maps.

V. Indus Valley Civilization and Dravidian

A surprise came from the analysis reported by Shinde et al. (2019) of an Iran_N-related IVC ancestry which may have split earlier than 10000 BC from a source common to Iran hunter-gatherers of the Belt Cave.

For the controversial Elamo-Dravidian hypothesis of the Muscovite school, this difference in ancestry between both groups (IVC and Iran Neolithic) seems to be a death blow, if population genomics was even needed for that. Nevertheless, I guess that a full rejection of a recent connection will come down to more recent and subtle population movements in the area.

EDIT (12 SEP): Apparently, Iosif Lazaridis is not so sure about this deep splitting of ‘lineages’ as shown in the paper, so we may be talking about different contributions of AME+ANE/ENA, which means the Elamo-Dravidian game is afoot; at least in genomics.

I shared the idea that the Indus Valley Civilization was linked to the Proto-Dravidian community, so I’m inclined to support this statement by Narasimhan, Patterson, et al. (2019), even if based only on modern samples and a few ancient ones:

The strong correlation between ASI ancestry and present-day Dravidian languages suggests that the ASI, which we have shown formed as groups with ancestry typical of the Indus Periphery Cline moved south and east after the decline of the IVC to mix with groups with more AASI ancestry, most likely spoke an early Dravidian language.

Natural neighbour interpolation of qpAdm results – Maximum A Posteriori Estimate from the Hierarchical Model (estimates used in the Narasimhan, Patterson et al. 2019 figures) for Central_Steppe_MLBA-related (left), Indus_Periphery_West-related (center) and Andamanese_Hunter-Gatherer-related ancestry (right) among sampled modern Indian populations. In blue, peoples of IE language; in red, Dravidian; in pink, Tibeto-Burman; in black, unclassified. See full image.

I am wary of this sort of simplistic correlation with modern speakers, because we have seen what happened with the wrong assumptions about modern Balto-Slavic and Finno-Ugric speakers and their genetic profile (see e.g. here or here). In fact, I just can’t differentiate as well as those with deep knowledge of South Asian history the social stratification of the different tribal groups – with their endogamous rules under the varna and jati systems – in the ancestry maps of modern India. The pattern of ancestry and language distribution combined with the findings of ancient populations seems in principle straightforward, though.


The message to take home from Shinde et al. (2019) is that genomic data is fully at odds with the Anatolian homeland hypothesis – including the latest model by Heggarty (2014)* – whose relevance is still overvalued today, probably due in part to the shift of OIT proponents to more reasonable Out-of-Iran models, apparently more fashionable as a vector of Indo-Aryan languages than Eurasian steppe pastoralists?
*The authors listed this model erroneously as Heggarty (2019).

The paper seems to play with the occasional reference to Corded Ware as a vector of expansion of Indo-European languages, even after accepting the role of Yamnaya as the most evident population expanding Late PIE to western Europe – and the different ancestry that spread with Indo-Iranian to South Asia 1,000 years later. However, the most cringe-worthy aspect is the sole citation of the debunked, pseudoscientific glottochronological method used by Ringe, Warnow, and Taylor (2002) to support the so-called “steppe homeland”, a paper and dialectal scheme which keeps being referenced in papers of the Reich Lab, probably as a consequence of its use in Anthony (2007).

On the other hand, these are the equivalent simplistic comments in Narasimhan, Patterson et al. (2019):

The Steppe ancestry in South Asia has the same profile as that in Bronze Age Eastern Europe, tracking a movement of people that affected both regions and that likely spread the unique features shared between Indo-Iranian and Balto-Slavic languages. (…), which despite their vast geographic separation share the “satem” innovation and “ruki” sound laws.

Indo-European dialectal relationships, from Mallory and Adams (2006).

The only academic closely related to linguistics from the list of authors, as far as I know, is James P. Mallory, who has supported a North-West Indo-European dialect (including Balto-Slavic) for a long time – recently associating its expansion with Bell Beakers – opposed thus to a Graeco-Aryan group which shared certain innovations, “Satemization” not being one of them. Not that anyone needs to be a linguist to dismiss any similarities between Balto-Slavic and Indo-Iranian beyond this phonetic trend, mind you.

Even Anthony (2019) supports now R1b-rich Pre-Yamnaya and Yamnaya communities from the Don-Volga region expanding Middle and Late Proto-Indo-European dialects.

So how does the underlying Corded Ware ancestry of eastern Europe (where Pre-Balto-Slavs eventually spread to from Bell Beaker-derived groups) and of the highly admixed (“cosmopolitan”, according to the authors) Sintashta-Potapovka-Filatovka in the east relate to the similar-but-different phonetic trends of two unrelated IE dialects?

If only there was a language substrate that could (as Shinde et al. put it) “elegantly” explain this similar phonetic evolution, solving at the same time the question of the expansion of Uralic languages and their strong linguistic contacts with steppe peoples. Say, Eneolithic populations of mainly hunter-fisher-gatherers from the North Pontic forest-steppes with a stronger connection to metalworking


15 thoughts on “Yamnaya replaced Europeans, but admixed heavily as they spread to Asia”

  1. Heggarty must have felt the pressure of the paper by Shinde et al. (2019) and Narasimhan, Patterson, et al (2019), because he uploaded yesterday his answer to the previously published contradictory data from DNA: Indo-European and the Ancient DNA Revolution (2018).

    It’s funny how many academics and laymen whose theories get contradicted by genomic data (viz. Furholt recently) just snap and decide that population genomics is not that useful anymore, whenever they don’t like what it shows.

    Model reaction for people to support whatever they said, step by step:

    – There needs to be no absolute correlation between population genomics and [this aspect]…
    – because everything [in this aspect] is so much more complicated than what genetics can show…
    – that I might still be right in whatever I said…
    – so let me unnecessarily complicate the interpretation of genomic results in a way that makes them essentially irrelevant.

  2. Relevant for this post, new Master’s thesis (under embargo) about Proto-Indo-Iranian loanwords and their stages, by Palmér (2019) at Leiden:

    In this thesis, I study loanwords of unknown origin in Proto-Indo-Iranian and early Post-Proto-Indo-Iranian. According to the Central Asian Substrate Hypothesis, Indo-Iranian speakers migrated to Central Asia around 2000 BCE and came into contact with the agricultural BMAC civilization, which resulted in a body of loanwords into Proto-Indo-Iranian, borrowed from the language of the BMAC people. Following a methodology for identifying non-Indo-European vocabulary in Indo-European languages, I argue that 74 out of 103 previously suggested loanwords can plausibly be analyzed as loanwords (chapter 3). Only a handful of these may have been borrowed from known languages. After establishing the relative chronology of Proto-Indo-Iranian sound changes (chapter 2), I divide the 74 early Indo-Iranian loanwords into chronological layers based on when they were borrowed (chapter 3-4). I argue that 21 words were borrowed after the disintegration of Proto-Indo-Iranian. Moreover, I argue that many of the remaining 53 loanwords that are reconstructable to Proto-Indo-Iranian were borrowed towards the end of this stage. Finally, I integrate the chronological layers into my analysis of structural characteristics of early Indo-Iranian loanwords and describe two new phonological patterns of loanwords (chapter 5). The fact that many loanwords are shown to have been borrowed in late PII or Post-PII, i.e. after Indo-Iranian speakers migrated to Central Asia, is consistent with the timeline of the Central Asian Substrate Hypothesis. Second, the newly discovered phonological characteristics provide additional support for the Central Asian Substrate Hypothesis, since they increase the likelihood that most loanwords originate in the same language.

  3. AASI-rich populations were certainly present in the south of India by the date of the Rakhigarhi ancient sample (ca. 2500 BC). AASI ancestry increases between the IVC (ca. 20%) and modern times, suggesting that AASI had been moving north from South India. Given this, there isn’t evidence to support the IVC being Proto-Dravidian; more likely, part of the region was already Vedic, given the excavation of chariots, warrior culture, fire altars, etc.

  4. *rēki-/*rēkoi̯-, shared by Slavic and Tocharian.

    However on trying to connect using exotic metal silver Slavic/Tocharian look far apart no?

    1) Anatolian: *Hárǵis
    2) Tocharian: *ārkw(ä)i (“white”), Tocharian A: ārki
    3) Italic: *argus
    4) Indo-Aryan: *Hárȷ́unam
    5) Sanskrit: अर्जुन (árjuna)
    6) Ancient Greek: ἄργυρος (árguros)


    Also, with regard to the Yamnaya sample SVP58 / I0444 (Z2108-KMS75-Y20993) and metalworking:
    An Indo-Iranian Symbol of Power in the Earliest Steppe Kurgans

    Was the blunt mace imported or manufactured by Yamnaya? If it was manufactured, was a two-piece split mold technology used?
    “Oh, Indra, getting your support, let us take cudgels, like (…) vajra, and we will gain victory over all enemies” (RV)



    On the territory of Hindustan in the Ganges-Yamuna Doab are found copper hoards of the 2nd millennium BC, the post-Harappan period. They are connected with the Ochre Colored Pottery culture, which occupied a territory sometimes linked with that of the early Indo-Aryans. Falk noted that the copper hoards included so-called bar-celts, which by their shape and size absolutely coincide with the Kutuluk cudgel-scepter. “The vast majority measure about half a meter, and weigh about 1.5 kg (Hami, Bihar) or 2.2 kg (Gungeria). Therefore I propose to interpret these pieces of copper as clubs, used to kill an adversary either by hitting or by being thrown.” (Falk 1993: 200). Falk considers the bar-celts to be the material expression of the vajra, the divine weapon of Indra.

    Yamnaya in Russia: Kutuluk
    Kutuluk kurgan cemetery I, located 60 km east of the city of Samara, contained:
    SVP58 / I0444 (central grave 1, kurgan 4, 3335-2881 calBCE, AA12570)
    The remains are of a male aged 25-35 years (Fig. S3.2), estimated height 176 cm, with no
    obvious injury or disease, and buried with the largest metal object found in a Yamnaya grave
    anywhere. The object was a blunt mace 48 cm long, 767 g in weight, cast / annealed and
    made of pure copper, like most Yamnaya metal objects.

    If one wanted to summon thunder and lightning, a pure copper mace would work quite well compared to bones or a rock hammer.😀

  5. I added a tweet I wrote today to the post, after I read in some blog that the interpretation of the correlation of IVC-related ancestry with Dravidian must be correct, since some Middle Indo-Aryan dialects show a pattern compatible with their adoption by Dravidian speakers. So, basically:

    1) Retroflex consonants + ergativity loss suggest Middle Indo-Aryan dialects spoken by Dravidian speakers. Heavy Indo-Aryan borrowings in Dravidian suggest long-lasting, intense contacts. Population genomics agrees; the pattern of Steppe- vs. IVC-related ancestry distribution in India fits. Only nationalists or nativists would deny this. SURE.
    See map

    2) Simplification of nominal system and phonetic changes in NWIE and Palaeo-Balkan languages suggest NWIE spoken by EEF speakers (see e.g. Afroasiatic-like traits in Insular Celtic, changes from Mycenaean to Ancient Greek, etc.). Heavy IE borrowings in the few attested non-IE languages. Population genomics agrees; the pattern of farmer ancestry distribution in Europe fits. Only nationalists or nativists would deny this. FINE.
    See map

    3) “Satemization” trends (+ conservative nominal system in BSl) suggest Indo-Iranian & Balto-Slavic spoken by Uralic speakers; similar case in Germanic. Heavy borrowings of Finno-Ugric from Indo-Iranian and West Uralic from Germanic and Balto-Slavic/Early Baltic suggest long-lasting, intense contacts. Population genomics agrees; the pattern of Steppe ancestry distribution fits. Only nationalists or nativists would deny this. NO WAY!
    See map


  6. The Tarim mummies are an unsolvable case. They had the genetics of Indo-Iranians and the languages of Afanasievo. There must have been an unrecorded wave of Corded Ware immigration/expansion to the Tarim Basin prior to the Sintashta-Andronovo era that hasn’t been discovered yet. That’s my two cents.

  7. Regarding the I6561 sample from Alexandria, he is also the first individual to show the allele for lactase persistence (13910-T) according to Anthony (2019).

    This mutation is supposedly only found much later, among Bell Beakers, and still later in Sintashta and related Steppe_MLBA populations, which puts the origin rather in the steppe, or else in farmer-related peoples in contact with the North Pontic area…


    It shouldn’t be difficult for geneticists to run a test for different mutations in N/EMBA/MLBA samples, to evaluate the number of mutations that probably shouldn’t be there in a North Pontic steppe-forest sample ca. 4000 BC instead of (say) an MLBA sample…

    Doesn’t Srubna show a variety of R1a haplogroups (Z93 and Z280)? Such a society would be more compatible with the first (and only) attested ancient subclade in the branch leading to South Asian L657…

    EDIT: I tried to start a thread on Twitter but I messed up the name of the sample 🤷‍♂️; it’s I6561. Anyway, feel free to comment here or there if you think there is a way to automatically test samples and see if their mutations do not fit their radiocarbon date. This would not be the first time it happens: AFAIK there have been the cases of Hajji_Firuz_IA first reported as Chalcolithic, and the Czech_Early_Slav samples which were first reported as Corded Ware:


  8. What is the difference between EHG and WSHG, given that WSHG is a combination of 30% EHG / 50% ANE / 20% East Asian?

    In the map of the Harvard paper, Botai is shown with WSHG and East Asian:

    but in the Damgaard paper, Botai is shown with EHG and East Asian:

    If EHG and WSHG have a similar genetic admixture, we can think that Kumsay Q1a (3000 BC) also has a Yamna lineage, but with an admixture of WSHG/Farmer/CHG. I think the pie at 3000 BC below Botai in the map seems to be Kumsay Q1a. Considering its admixture, I think Iran farmers also migrated north, splitting to the west and east. We already have archaeological data showing that the southern Urals and the south-east Caspian Sea interacted from the Mesolithic to the Eneolithic.


    See also another admixture model. Actually, Steppe_MLBA_East = Steppe_MLBA_West + 9% WSHG; however, only EHG increased, without East Asian, in Steppe_MLBA_East:


  9. What is Yamna or Steppe admixture?
    It was EHG and CHG, and now it is EHG/CHG/Farmer (with or without a small amount of WHG). Is that it?

    In a brand new Scythian paper, page 5, there are two admixture models.
    The problem is that Yamna_Kalmykia and Samara have Altaian genes in the qpAdm model, but not in the CP/NNLS model.


    Do you know whether Yamna samples have Altaian admixture?

  10. As I said before, the IE relay migration has problems.

    I don’t think steppe people migrated that way.
    See the WSHG migration on foot, even without chariots, and the later Seima-Turbino migration.

    Second, no Andronovo artifacts were found near South Asia, but Okunevo petroglyphs were, according to the German scholar K. Zetmmar. Okunevo has a third-eye culture like modern Indians. Their sun mark under pottery appears on Mycenaean pottery and on modern Kalash girls’ faces. Moreover, Okunevo has a creator concept in the petroglyph of a sun-head with snakes, like Zeus, Indra and the Mesoamerican Votan.

    Instead, there is the Copper Hoard culture and the chariot over there:


    I think number 6 would be connected to Mycenaean, and 8 to the Celtic torc. Number 4 would be connected to one Mycenaean burial in Circle B, to Chinese script of the Shang/Zhou age (meaning Tian = sky) and to Altai petroglyphs.
    Why did those things happen? I think Seima-Turbino seems to hold a key. 10% of South Indians have M73, too.


  11. Just putting this super interesting and relevant info here. SPGT samples I12471 and I12149 (1000 – 800 BC) both belong to I2a2a1b1b-L699 (I think this has been independently verified).
    So this makes the list of all I2a2a1b1b samples found in a Yamnaya context very interesting:

    Yamnaya Ulan IV RISE552 – I2a2a1b1b2
    Bulgaria_EBA I2165 – I2a2a1b1b
    Yamnaya_Bulgaria_outlier Bul4 – I2a2a1b1b
    SPGT I12471 – I2a2a1b1b1
    SPGT I12149 – I2a2a1b1b1

    Another common marker of the Graeco-Aryan community pre Steppe_MLBA introgression?

    1. Based on the TMRCA of I2a-L699 vs. other I2a-L701 lineages, I don’t think it’s clear whether these subclades would have expanded with Steppe_EMBA or Steppe_MLBA ca. 3000-2000 BC.

      Anyway, are you sure you are not mixing ISOGG 2017 with ISOGG 2019? Because those you mention first are ISOGG 2017, while the latest SNP calls are usually reported as from ISOGG 2018 or 2019, which is completely different. They would be then I-S6635, possibly unrelated to recent Steppe movements.

      Do you have a link to the SNP calls?
