Something is very wrong with models based on the so-called ‘Yamnaya admixture’ – and archaeologists are catching up (II)

A new article by Leo S. Klejn tries to improve the Northern Mesolithic Proto-Indo-European homeland model of the Russian school of thought: The Steppe hypothesis of Indo-European origins remains to be proven, Acta Archaeologica, 88:1, 193–204.


Recent genetic studies have claimed to reveal a massive migration of the bearers of the Yamnaya culture (Pit-grave culture) to the Central and Northern Europe. This migration has supposedly lead to the formation of the Corded Ware cultures and thereby to the dispersal of Indo-European languages in Europe. The article is a summary presentation of available archaeological, linguistic, genetic and cultural data that demonstrates many discrepancies in the suggested scenario for the transformations caused by the Yamnaya “invasion” some 5000 years ago.


Both teams [Reich/Anthony, and Willerslev/Kristiansen] interpreted this resemblance in the same way: as evidence of mass migration of the Yamnaya culture from the steppes into the Central and Northern Europe, resulting in the formation of the Corded Ware cultures, and these are universally recognised as Indo-European. Since earlier in this part of Europe existed a different pool of genomes, geneticists presumed that the Yamnaya migration alone had brought the Indo-European languages into Europe. It is difficult to say to what extent the pre-convictions of the involved archaeologists influenced these conclusions, or whether the results of the genetic studies attracted archaeologists with such beliefs.

Mismatch of cultural manifestations

First, we might question the idea of the Yamnaya culture as a unity rather than a loose conglomerate of cultures. Merpert (1974) divided it into nine local groups but did not recognise them as separate cultures. However, in 1975 I suggested that Nerushay (Budzhak) monuments should be recognised as a distinct culture (Klejn 1975), although still as a part of the same broader steppe community.

This was accepted by other specialists (Ivanova 2012; 2013; 2014). Generally, in the western branch of this community, a mixture of the eastern rites of interment with local, Balkan ceramics can be observed. It should be noted that hitherto all genetic samples were taken from eastern material (in the vicinity of Samara in the Volga basin and Kalmykia), while the central thesis concerns the intrusion of the western branch of this community (Budzhak culture) into Europe.

The spread of cultural-historical communities of the Yamnaya culture and the location of the Budzhak culture. GAC – Globular Amphora culture; CWC – Corded Ware culture. After Ivanova 2013.

Simultaneity of cultures

The Yamnaya culture (Chernykh & Orlovskaya 2004a; Heyd 2011; Frȋnculeasa et al. 2015) appears not to be the predecessor of the Corded Ware cultures but is contemporary with them. The Corded Ware cultures appeared also around the turn between the fourth and third millennium BC (Stöckli 2001; Furholt 2003). Their derivation from the Yamnaya seems, therefore, to be less probable. This is evidenced by the fact that the corded beakers or amphorae found in the Budzhak culture are not the prototypes of the corded beakers or amphorae found in more northern territories, but seem instead to be an outcome of contemporaneous contacts (Ivanova 2014; Klejn 2017c).

Discrepancies across the haplogroups

Even more remarkable is the variation in the distribution of types of Y chromosome. In the Yamnaya population, R1b is not just a single occurrence (there are about seven known occurrences) while in the Corded Ware population a different clade of R1b is found and R1a is predominant (several instances). Thus the postulate of unbroken succession finds no support!

Distribution of artefacts and customs of the Yamnaya culture in the area of the Corded Ware cultures. After Bátora 2006.

Paradoxical gradient

In the tables presented in the article by Reichs’ team (Haak et al. 2015) the genetic pool connecting the Yamnaya culture with the Corded Ware people is shown to be more intense in Northern Europe (Norway and Sweden) and decreases gradually from the North to the South (Fig. 6). It is weakest around the Danube, in Hungary, i. e. areas neighbouring the western branch of the Yamnaya culture! This is the reverse image to what the proposed hypothesis by the geneticists would lead us to expect. It is true that this gradient is traced back from the contemporary materials, but it was already present during the Bronze Age (Klejn 2015a).

The author also uses questionable interpretations from selected articles to advance his (as of today) untenable positions regarding a Mesolithic origin of the reconstructible Proto-Indo-European language.

1. Glottochronology, for a PIE origin:

If based on the data of glottochronology (taking into account all disputes) the period of initial dispersal is to be dated to the 7th-5th millennium BC.

2. Doubts on the origin of R1b-L51 subclades expressed in Genetic differentiation between upland and lowland populations shapes the Y-chromosomal landscape of West Asia, by Balanovsky et al. (2017), Human Genetics 136, 4. 437-450:

The currently available dataset does not contradict the hypothesis that R-GG400 marks a link between the East European steppe dwellers and West Asians, though the route and even direction of this migration is disputable. It does, however, demonstrate that present-day West European R1b chromosomes do not originate from the Yamnaya populations analyzed in (Haak et al. 2015; Mathieson et al. 2015) and raises the question of their origin. A Bronze Age origin is more likely than a Neolithic one (Balaresque et al. 2010), but further ancient DNA studies may be necessary to identify this source.

Just yesterday I read the post The retraction paradox: Once you retract, you implicitly have to defend all the many things you haven’t yet retracted, by Andrew Gelman. While – in my opinion – the post does not live up to its title, it poses an interesting question, as to how ad logicam (fallacy fallacy) is often used today in research: One author proposes something that is later demonstrated to be wrong, so everything they wrote or write can be said ipso facto to be wrong…especially if they accept that it was wrong.

This is usual with amateur geneticists (those who don’t publish, and are therefore not subjected to criticism): if anyone is wrong (whether in Archaeology or Genetics), then they are wrong in everything else. It seems to me that Klejn’s theses against recent genetic results rest on the same assumption: The Yamna -> Corded Ware migration model is wrong, ergo the Yamna homeland model is wrong.

I guess this same fallacy is what a lot of angered geneticists (whether professionals or amateurs) are going to use to dismiss Klejn’s criticism, trying to focus on what he clearly does not grasp – about genomic data of Yamna peoples and their expansion – to disregard his doubts on genetic interpretations entirely.

I have warned many times about how simplistic interpretations of genetic data would cause a general mistrust in the field, and that archaeologists won’t take the discipline seriously, no matter how many articles get published in famous research tabloids like Nature or Science…

Those who dismiss this warning lightly seem to forget the fate of other recent “scientific breakthroughs” which were initially so promising that Humanities appeared to matter no more, like glottochronology for Linguistics and, to some extent, that of radiocarbon analysis for Archaeology.
EDIT: see here a recent example of discussion on discrepancies between archaeological and 14C-based chronologies, whereby ‘scientific data’ obviously needs archaeological context for a meaningful interpretation.

Featured image: The direction of the supposed migration of the bearers of the Yamnaya culture into the area of the Corded Ware cultures. After Haak et al. 2015.

NOTE: I obviously don’t agree with Klejn’s main model: he criticises the Proto-Indo-European steppe homeland, and more specifically the expansion of Yamna peoples with R1b-L23 subclades, which I support. But, probably because of his “pre-convictions” (as he puts it when describing proponents of the steppe hypotheses) about the Proto-Indo-European homeland in Northern Europe during the Mesolithic, he was one of the first renowned archaeologists to criticise the obvious inconsistencies in the genetic model of migrations based exclusively on the “Yamnaya ancestral component” concept, and to provoke the necessary reaction from (until then) overconfident geneticists, and he deserves credit for that.

In my opinion, the Russian school’s “Northern European Mesolithic” homeland model – as I have said before – could be based on the appearance of EHG ancestry, or maybe on the expansion of haplogroup R1b with post-Swiderian cultures, but the timeframe proposed is too early for any reconstructible parent proto-language, even for Indo-Uralic.


11 thoughts on “Something is very wrong with models based on the so-called ‘Yamnaya admixture’ – and archaeologists are catching up (II)”

  1. One fact that many geneticists seem to ignore is that Western Europe probably wasn’t IE speaking until the spread of the Italo-Celtic people. Italy and Iberia certainly weren’t, according to the historical record, and there’s some evidence for a Vasconic (Bell Beaker?) substratum in France, Britain and Ireland. The Italo-Celtic folk, of course, were the successors to the Urnfield culture, which was the successor to the Tumulus culture, the first Central European culture to have kurgans. I’d like to know what the main Tumulus culture Y haplogroups were. I’d guess R1b but with R1a predominating at the higher levels of that society. The original proto-IE folks may not have been R1b.

    1. I think any time before Pre-Roman (proto-historic) times in the regions you mention is Prehistory, and can therefore only be interpreted with Archaeology (and now also Genetics). Language can only be tentatively assigned to any group.

      I agree with you in that the Indo-European we know in Southern Europe (Italo-Celtic) was mainly brought at later times. However, before that time one can only guess (no matter what the Pre-Roman situation was).

      Judging by the haplogroups and steppe ancestry that arrived in the British Isles and Iberia in the Chalcolithic/Early Bronze Age (probably from East Bell Beakers expanding westward), North-West Indo-European dialects probably arrived there quite early. Each region is different, and for Iberia I would posit the arrival of North-West Indo-European dialects and then a resurgence of Basque/Iberian languages – and possibly haplogroups I or G2 later obscured by founder effects that left mostly R1b subclades. While these early NWIE dialects may have been completely lost, the nature of Lusitanian in west Iberia (possibly an Italo-Celtic language, but distinct from Celtic and Italic) could place it anywhere from the initial Chalcolithic expansion to the expansions during the Iron Age…

      As for the Pre-Celtic substrate of Irish, there is really no consensus there, and as far as I know there is not much “non-IE” substrate found. The arrival of haplogroup R1b-L21 and steppe ancestry probably heralded the arrival of NWIE dialects, that are probably lost, since Celtic must have arrived much later (unless you believe in the “Atlantic façade” model of expansion for Celtic). In France, we only have non-IE substrate in the Aquitanian region, and for proto-historical times – before and beyond that, we just don’t know (maybe the Belgian substrate offers another lost NWIE language substrate distinct from Germanic and Celtic).

      Regarding “original proto-IE folks”, I don’t know what you are referring to (Indo-Hittite? Late PIE? North-West IE?), I guess Late PIE, but in most cases R1b-L23 subclades (as far as we know now) expanded with Yamna first westward from the steppe, and then with East Bell Beaker peoples westward from the Carpathian Basin / upper Danube, accompanied by a “steppe ancestral component”. That much is now quite clear from all recent Genomic papers until now, and to deny that one should bring new data to the table.

      As for the haplogroup composition of the Tumulus culture and Hallstatt, judging by data on the Celtic expansion we know today (with France, Iberia, and the British Isles maintaining a similar distribution to the previous one), it does not seem to have been accompanied by a ‘massive migration’ (not with the current data, at least), so the expansion process may have been very different in nature to previous Neolithic and Chalcolithic ones, possibly through “chiefs” as you seem to put it. I don’t think there is data today to support its expansion with any haplogroup, R1a or R1b – but in the future we might see certain ancestral component and certain subclades show up in early Celtic settlements, and this could be anyone from central Europe, i.e. R1a, R1b, I2, or any other,…

      Anyway, you seem to be working with data on modern distribution of haplogroups, similar to the interpretations of the 2000s, with some kind of ideal Western Vasconic Europe almost unbroken from the Palaeolithic/Mesolithic to the Iron Age, and some kind of IE/Corded Ware/R1a -> Vasconic/Bell Beaker/R1b scheme, thus disregarding even the most obvious data from recent papers on the subject…

      1. I disagree with you on a number of points. We do know what the Romans recorded about linguistic groups, and the only IE speakers in Iberia when they arrived were the Celts. There were several other groups speaking non-IE languages, mostly Vasconic. And the only IE people the Romans found as they expanded into southern Italy were Greek incomers. And no, my saying that the Bell Beaker folk were not IE doesn’t mean I think they had been in Western Europe since the Stone Age. They seem to have moved into Western Europe from points east, but we need to let go of the fantasy that steppe ancestry automatically means IE. Read something about what the linguists have to say on how and when IE must have developed and look at the dates when Bell Beaker folk first became established in Western Europe. They were there a little too early to have begun their migrations as IE speakers, I think.

  2. IE languages seem to have been spread to the Indian subcontinent and to Eastern Europe by people whose Y haplotype was predominantly R1a. But we also have IE in central and Western Europe, and in Western Europe the most common Bronze Age Y haplotype seems to have been R1b. I don’t think that automatically means IE languages first entered western Europe with the first major group of R1b folk. It may mean that the original IE homeland contained both R1a and R1b folk, or it could mean that the people who brought IE to Western Europe adopted IE and a warrior culture from a former ruling elite that could have been R1a. But I don’t see any possibility of an exclusively R1b populated IE homeland, which some folk seem to be searching for. And there seems to be no evidence that BB folk spoke IE languages. Less than no evidence, considering the linguistic situation in Iberia when the Romans first wrote about it, unless you believe that any BB IE languages had somehow vanished from Iberia by then. I don’t imagine my ideas will be accepted by anyone who wants to conflate IE and R1b, although the article under discussion certainly didn’t seem to be doing that. But I’ve seen a lot of discussion about finding the original IE homeland that goes into an automatic R1b mode, and that’s really all I’m trying to address.

    1. There is no “evidence” on the language spoken by any of these prehistoric people, indeed. We just hypothesize a steppe origin for Late Proto-Indo-European, believe that a Yamna expansion fits archaeologically and temporally (using guesstimates), and try to go back speculatively from known ancient languages to the steppe.

      In my opinion, haplogroups only serve to add precision to admixture, and genetics to help archaeology, and anthropological interpretation to help linguistics. There is no magic R1b-IE, or R1b-Vasconic, or R1a-IE, or N1c-Uralic, or any other simplistic assumption which can make things simple for ‘autochthonous continuity’ proposals. The complexity of European history implies that no one can trace their admixture, haplogroup, or language to any (idealised) ancestral group.

      By mentioning that hypothetical Iberian situation, and assuming that Yamna had R1a subclades – I guess you imply a millennium-long community of equally mixed R1a-R1b haplogroups in the Volga-Ural region (where mostly R1b-M269 has been found, and the R1a subclades related to CWC were likely originally from Eneolithic Ukraine, i.e. North Pontic steppe, steppe zone and Forest Zone) –, and that it was a “ruling elite” of R1a who spoke Indo-European, you seem again to be working with modern distribution of haplogroups (i.e. theories of the 2000s), and not with ancient data described in recent articles. I don’t know how and why modern genomics could matter for the situation 5000-4000 years ago, when there are enough ancient samples to make sound hypotheses…

      Again, what is important here is not haplogroups, but the anthropological model of migration supported by genetics. And Heyd’s model of Yamna -> East Bell Beaker migration is now clearly supported by admixture AND the expansion of R1b-L23 subclades; the picture for Corded Ware (i.e. R1a + admixture) is not so clear: not for geneticists now (who after Mathieson et al. must correct the concept of the “Yamnaya ancestral component”), and certainly not for archaeologists, as the doubts of its main proponents – Kristiansen and Anthony – show.

      What happened after the arrival of R1b and steppe admixture in Iberia, without samples for each date after that and until proto-history, is even more speculative… But if you want to bet (against that “BB somehow vanished” – I don’t know what this means), then I will too, and yes, I do believe that East Bell Beakers arriving in Iberia just after their expansion (bringing mostly hg R1b-L51 subclades and steppe admixture, as confirmed by Olalde et al. and Martiniano et al., among others) probably brought NWIE dialects, although they were later obviously replaced in some parts by non-IE languages, possibly (but we have no evidence) by local groups which resurged or groups that migrated to these territories. When and how, and where, and for what languages exactly, who knows, that needs careful anthropological models (including archaeology and linguistics), and support by genomics, and even then ethnolinguistic identifications are a very difficult task for language isolates, without comparative data (as we have in Indo-European studies).

      Maybe more ancient data will show what happened in Iberia in the different periods, at least in terms of genetics…But the whole picture won’t in any case favour those searching for an ideal Palaeolithic or Mesolithic continuity model (whether their dream is for a Basque, Indo-European, Uralic, Indian, or any other continuity), that much is obvious with each new paper.

  3. You seem to have a positive genius for misinterpreting what people write. What I’m saying is that anyone who believes that IE languages were brought to Western Europe directly from the steppes is being silly. The real story is more complicated than that. And there are a lot of good reasons for thinking that BB folk weren’t speaking IE languages and were probably speaking Vasconic languages. My point about R1b is that it’s a mistake to assume BB folk must have been speaking IE languages because of their predominant y haplotype. The transition of Western Europe to IE languages happened because of the Celts and Latins/Romans, who evolved out of a complicated history in Central Europe. But I can see that I’m wasting my time here. Bye.

  4. Guys,
    Everyone has a narrative. Don’t fool yourselves. There is no such thing as oprah “your truth”.

    for eg, – Carlos in that, steppe, eastern Bell beakers, R1b, IE , genetics papers, archeology ,story…
    a. does not go very well with the fact: the 3 Bronze Age Portuguese, R1b1a2a1a2 from Martiniano had not steppe ancestry at all. Does it not?
    b. Or that all ancient historians, since 6 century BC, clearly separated Lusitanians from celts. Yet, Lusitanians spoke an IE language that is considered Pre Celt or Italic.
    c. Having Celts (Celtiberians) speaking the labiovelar *Kw and the Lusitanians further West in Portugal speaking an Indo-European with *p, when all other (celtic & Italic) already had drop it for a long time, does not help, to say that they both were Celtic, does it?
    d. Or that, as Z195 is almost as Old as “father” DF27 itself but actually there is “no” Z195 in Portugal at all these days. Actually M269 not DF27 is the highest in the Peninsula.
    So, on the narratives…. Did you notice that in the big paper about DF27 you posted about here (older post)… there is no TMRCA for Portuguese DF27s… or for that matter even Galicia DF27?
    Does it make sense to you?

    1. Yes, the picture is quite complicated, and there is no truth. We try to simplify it the best way we can, with our preconceptions, and with our limited data and biases. It is like that whether we are the most renowned archaeologists, linguists, or geneticists, or just amateurs in any of these fields.

      As Don (now gone apparently) says, “it’s a mistake to assume BB folk must have been speaking IE languages because of their predominant y haplotype”. I agree, so we are basically saying the same: it is a mistake to assume any language for any predominant y haplotype, and that’s what we all have been doing for a long time, until ancient samples arrived.

      A different case is the combination of compact archaeological culture + admixture + predominant haplogroup + anthropological model of sudden migration and expansion, such as East Bell Beaker from the upper Danube ca. 2500-2000 BC. That community may have had a common (but temporally limited) ethnolinguistic identification, i.e. it would represent a “cultural-historical community”, and would thus have conveyed a common (majority) language with them, at least initially. And the same can be said about Yamna migrants, ca. 3500-2500.

      To try to reconstruct a common Proto-Indo-European language (and its stages) without believing in such events – and the possibility of assessing the most likely ones associated with each stage – seems to me incompatible.

      About steppe ancestry in Iberian samples, I think (if we assume that R1b-L151 subclades expanded initially with Yamna migrants) that peoples would have admixed a number of times, from its original territory (whatever it was, possibly East Bell Beakers expanding in Central Europe), until later, so admixture would have changed a lot depending on the specific time, while paternal lineage with cultural continuity – and potentially language – not so much. Anyway, I find it very difficult today to reach any conclusion as to ethnolinguistic identification the farther we get from these clear migration periods (Yamna and East Bell Beaker), until proto-historic times, so the Lusitanian question – like the Iberian and Basque ones – seems still quite open.

  5. Carlos,
    The R1b-M343 (xP312xU106) Y-DNA tree by Sergey Malyshev has just been published.
    I don’t really know them, don’t know how credible these things are. However two things are remarkable.
    a. They, like Genetiker, put ATP3 as M269 and,
    b. most amazingly, add MC337A, Monte Canelas (Portugal), Late Neolithic/Chalcolithic also as R1b-M269.
    c. Atp3 in north Spain 3400-3100 bc and MC337 is 3200-2900bc in the most remote southwestern point of all Europe.

    Remember that Martiniano et al published several Portugal Late Neolithic/Chalcolithic individuals as I2a1b giving the impression they dominate this Period in Portugal and MC337A as nothing reported for him. Then, the following period, middle bronze age, as full of R1b1a2a1a2 (so, P312). Hence the arrival of Steppe (or Balkan east bell beakers) R1b timeline mantra. Though, these 3 bronze age P312 actually had no Steppe component to them. But it did not deter the narrative.

    If, I repeat If, this turns out to be correct then we have in Portugal, as far away as one gets from the steppe: 3200BC (M269) ——- son 1800BC (P312) —– grandson R1b-DF27 (without Z195). In exactly the same 100 miles area.

    Not really the story we have been told, is it?
