The new Scicomm’s warhorse is “CHG ancestry = PIE” and the Iranian homeland


Funny reports are popping up due to a recent article in New Scientist (behind paywall), World’s most-spoken languages may have arisen in ancient Iran, which echoes the controversial interpretations of Wang et al. (2018).

I have been waiting to read the printed edition, but that of May 26th doesn’t have the article in it, so it may be a web-only text.

Nevertheless, here are some excerpts about the PIE homeland from a news aggregator that caught my attention (emphasis mine):

The two proposed locations are divided by the Caucasus mountains, which are found between the Black and Caspian Seas. In today’s geography, the mountains cover parts of Russia, Georgia, Azerbaijan, and Armenia.

To find out whether the ancient language came from north or south of these mountains, a team from the Max Planck Institute for the Science of Human History looked at the bones of 45 ancient humans from the Caucasus region, and analyzed their DNA. These people lived in the area between 3,200 and 6,500 years ago.

Interestingly, from looking at their genes, the researchers determined that these ancient people seemed to be moving predominantly in one direction – they were heading north. This suggests that, contrary to what was previously believed, the first Indo-European language might actually have arisen south of the Caucasus mountains, only spreading to other parts of Europe and Asia as people migrated north from this region. The findings are currently available on BioRxiv.

We know that the Proto-Indo-European language appeared somewhere between 5,500 and 9,000 years ago, and the study suggests it only spread to Europe about 6,500 years ago. Therefore, this lost language could have originated south of the Caucasus.

What’s more, the ancient people analyzed had similar genetic signatures to prehistoric farmers who once lived in western Iran. Therefore, the ancient version of many of our languages may have first evolved in ancient Iran, before spreading with the people who first spoke it, and their ancestors, as they radiated north of the Caucasus mountains to the Eurasian steppe.

However, there are still many who favor the conflicting theory – that the Proto-Indo-European language arose in the Eurasian steppe. But this would only take the language back about 4,800 years – when people moved from the Eurasian steppe into Europe – and specialists think the language is significantly older. The idea that it first sprung up in Iran about 6,500 years ago follows this assumption.

It seems that – now that the Danish workgroup (responsible for the “steppe ancestry = Indo-European” and “Corded Ware expanded from Yamna” claims) is backing down, and both it and the Reich/Jena group are accepting that Yamna represents the expansion of Late Indo-European into Afanasevo, Bell Beaker, and Sintashta – anything before Yamna in the steppe is just another “conflicting theory” among equals…

So forget “steppe ancestry = PIE”, and welcome the newly fashionable “CHG ancestry = PIE”, and of course the Iranian homeland.

This is how I imagine genetic labs writing anthropological interpretations and conclusions of their papers, against every single reasonable restraint (and the well-established models of linguists and archaeologists) and then publicizing them:



The renewed ‘Kurgan model’ of Kristian Kristiansen and the Danish school: “The Indo-European Corded Ware Theory”


A popular science article on Indo-European migrations has appeared at Science News, entitled How Asian nomadic herders built new Bronze Age cultures, signed by Bruce Bower. While the article is well-balanced and introduces new readers to the current state of the controversy on Indo-European migrations (including the opposing theories led by Kristiansen/Anthony vs. Heyd), it echoes yet again the conclusions of the 2015 Nature articles on the subject, especially with its featured image.

I have argued many times why the recent ‘Yamnaya -> Corded Ware -> Bell Beaker’ migration model is wrong, mainly in my essay Indo-European demic diffusion model, but also in articles of this blog, most recently in the post Correlation does not mean causation: the damage of the ‘Yamnaya ancestral component’, and the ‘Future American’ hypothesis. It is known that Nature is a bit of a ‘tabloid’ in the publishing industry, and these 2015 articles offered simplistic conclusions based on a wrong assessment of archaeological and linguistic data, in search of groundbreaking results.

An excerpt from Bower’s article:

Corded Ware culture emerged as a hybrid way of life that included crop cultivation, breeding of farm animals and some hunting and gathering, Kristiansen argues. Communal living structures and group graves of earlier European farmers were replaced by smaller structures suitable for families and single graves covered by earthen mounds. Yamnaya families had lived out of their wagons even before trekking to Europe. A shared emphasis on family life and burying the dead individually indicates that members of the Yamnaya and Corded Ware cultures kept possessions among close relatives, in Kristiansen’s view.

“The Yamnaya and the Corded Ware culture were unified by a new idea of transmitting property between related individuals and families,” Kristiansen says.

Yamnaya migrants must have spoken a fledgling version of Indo-European languages that later spread across Europe and parts of Asia, Kristiansen’s group contends. Anthony, a longtime Kristiansen collaborator, agrees. Reconstructed vocabularies for people of the Corded Ware culture include words related to wagons, wheels and horse breeding that could have come only from the Yamnaya, Anthony says.

I have already talked about Kristiansen’s continuation of Gimbutas’ outdated ideas: we are seeing a renewed effort by some Scandinavian (mainly Danish) scholars to boost (and somehow capitalise on) the revitalised concept of the “Kurgan people”, although now the fundamental issue has been more clearly shifted to the language spoken by Corded Ware migrants.

As far as I can tell, this renewed interest began two years ago, with the simultaneous publication of genetic studies by Haak et al. (2015) and Allentoft et al. (2015), and the misuse of the cursed concept of ‘Yamnaya ancestry’ to derive far-fetched conclusions.

On the other hand, genetic research is not solely responsible for this: David Anthony – who was apparently consulted by Haak et al. (2015) for their paper, where he appears as co-author – has kept a low (or lower) profile, and only recently has he merely suggested potential links between Corded Ware and Bell Beaker cultures in Lesser Poland, that might explain what (some geneticists have told him) appeared as a potential Yamna -> Corded Ware -> Bell Beaker migration in the first ancient samples studied.

Anthony’s migration model remains otherwise strongly based on Archaeology, offering a careful interpretation of potential contacts and migrations in the Pontic-Caspian steppe, and only marginally offers some views on Linguistics (based on Ringe’s controversial ‘glottochronological model’ of 2006), to the extent that he is compelled to explain the potential adoption of Indo-European by Corded Ware culture (CWC) peoples as multiple cultural diffusion events, since no migration is observed from the steppe to CWC territories.

I think he is thus showing a great deal of restraint, not jumping on the bandwagon of this recent trend based on scarce genetic finds – and therefore also losing the opportunity to publish articles in journals of high impact factor…

This newly created Danish school, on the other hand, seems to be swimming with the tide. Kristiansen, known for his controversial ‘universal’ interpretations of European Prehistory (which are nevertheless more readable and interesting than most specialised literature on Archaeology, at least for us non-archaeologists), has apparently seized the opportunity to give a strong impulse to his theories.

Not that there is anything wrong with that, of course, but sometimes it might seem that a lot of papers (or even researchers) support something, when in fact there are only a few of them, working closely together.

I see therefore three main “branches” of this support (two of them, Genetics and Linguistics, only recently giving some limited air to this dying hypothesis), with a closely related group of people involved in this model. They are lending continuous support to each other by repeating the same theory, and the same misleading map images (like the one shown in the article), so that the circular reasoning they represent is concealed behind seemingly independent works.

The theory and its development

The main theory is officially rooted then in Kristiansen’s hypothesis, whose first article on the subject seems to be Prehistoric Migrations – the Case of the Single Grave and Corded Ware Cultures (1989), supporting the Kurgan model applied to the Corded Ware migrations. It was probably a kind of a breakthrough in Archaeology, bringing migration to mainstream Archaeology again (followed closely by Anthony), and he deserves merit for this.

After this proposal, there are mostly just his publications supporting this model. Nevertheless, Kristiansen’s model, I gather, did not involve the sudden Yamnaya -> Corded Ware migrations discussed in recent genetic articles, but long-lasting contacts between peoples and cultures from the North Pontic steppe, Trypillian, and Globular Amphora, that formed a new mixed one, the Corded Ware people and culture. Also, Gimbutas’ original model of migration (1963) describes waves of Kurgan migrants into Vučedol and Bell Beaker, which have apparently been forgotten in recent models*.
* The most recent model by Anthony describes such migrations into Early Bronze Age Balkan cultures, as do most archaeological publications today, but he is unable to recognize migration waves from Yamna into the Corded Ware culture, and because of that describes mere potential routes (or modes) of cultural diffusion including language change.

Proposal for the origin and spread of the Corded Ware/ Battle Axe cultural complex: 1) Distribution of CWC groups; 2) Yamna culture; 3) presumed area of origin; 4) presumed main directions of the primary distribution. Also numbered are other individual CW cultures. From Kristiansen (1989).

Then – skipping the years of simplistic phylogeography based on modern haplogroup distribution – we have to jump directly to Allentoft (of the Natural History Museum of Denmark) and colleagues and their article on population genomics of Bronze Age Eurasia (2015), with which Kristiansen collaborated, and which offers the first direct association of Corded Ware as the vector of expansion of Indo-European peoples and languages from Yamna. An interesting take on the Yamna -> Corded Ware -> Bell Beaker question is represented by their very ‘kurgan-like’ Corded Ware-centric map:

Detail of Fig. 1 from Allentoft et al. (2015): “Distribution of Early Bronze Age cultures Yamnaya, Corded Ware, and Afanasievo with arrows showing the Yamnaya expansions”.

And suddenly, we are now seeing more works that support the central thesis of the group – that Corded Ware must have brought Indo-European languages to Europe:

Recent publications by K-G Sjögren – from the same department as Kristiansen, at the University of Gothenburg – seem to imply that there was a direct connection Corded Ware -> Bell Beaker in central Europe.

Guus Kroonen’s recent hypothesis of a potential (Proto-Semitic-like) Germanic substrate (2012) has recently been added to the cause, supporting with Iversen (also from the University of Copenhagen) a link with the Battle Axe/Funnelbeaker culture interaction. However, in the archaeological-linguistic model it seems that Germanic must predominate over the rest of Indo-European languages in terms of age, representing the first wave of Indo-Europeanization in Europe (wat?!), whereas Balto-Slavic is much younger and unrelated…? But didn’t they share the same substrate (as did partially Greek) in Kroonen (2012)? I think Kroonen’s hypothesis might be better explained through an earlier contact in the North Pontic steppe.

Modified from Kristiansen et al. (2017). “Schematic representation of how different Indo-European branches have absorbed words (circles) from a lost Neolithic language or language group (dark fill) in the reconstructed European linguistic setting of the third millennium BC, possibly involving one or more hunter gatherer languages (light fill) (after Kroonen & Iversen 2017)”.


This recently created Danish pressure group is not something bad per se. I don’t agree with their hypothesis (or rather evolving hypotheses, since they change with new genetic results and linguistic proposals, as is shown in Kristiansen et al. 2017), but I understand that the group continues a recent tradition:

Publications are always great to advance in knowledge, and if they bring some deal of publicity, and more publications (with the always craved impact factor), and maybe more investment in the departments (with more local jobs and prestige)… why not?

However, this kind of workgroup-based research is reminiscent of the Anatolian homeland group loosely created around Renfrew; the Palaeolithic Continuity workgroup around Cavalli-Sforza; or (more recently) the Celtic from the West group around Cunliffe and Koch. The difference between Kristiansen’s workgroup and supporters of all those other models, in my opinion, is that (at least for the moment) their collaboration is not obvious to many.

Therefore, to be fair to any outsider, I think this group should clearly state their end model: I propose the general term “Indo-European Corded Ware Theory” (IECWT) workgroup, because ‘Danish’ is too narrow, and ‘Scandinavian’ too broad to represent the whole group. But any name will do.

My opinion on the IECWT

As you can see, no single strong proof exists in support of the IECWT:

  • Not for a solid model of PIE expansion from Corded Ware, not even within the IECWT group, where there is no support (to date) for a Balto-Slavic expansion associated with the Corded Ware culture… Or any other dialect, for that matter;
  • Not for a Corded Ware -> Bell Beaker connection – that is, before the publication of Allentoft et al. (2015) and articles reverberating their conclusions;
  • Not for a unified Pre-Germanic community before the Dagger Period, and still less linked with the expansion of the Corded Ware culture from the steppe – that connection is found only in Anthony (2007), where he links it with a cultural diffusion into Usatovo, which seems too late for a linguistic expansion with Corded Ware peoples, with the current genetic data.

The wrong interpretation of scarce initial ancient samples has been another feeble stone put over the ruins of Gimbutas’ theory. While her simple theory of Kurgan invaders was certainly a breakthrough in her time – when speaking about migrating Indo-European peoples was taboo – it has since been superseded by more detailed archaeological and linguistic accounts of what happened in east and central Europe during the Chalcolithic and Bronze Age.

However, a lot of people are willing to consume post-truth genetic-based citebait like crazy, in a time when Twitter, Facebook, blogs, etc. seem to shape the general knowledge, while dozens of new, carefully prepared papers on Archaeology and Linguistics related to Indo-European peoples get published weekly and don’t attract any attention, just because they do not support these simplistic claims, or precisely because they fully reject them.

An older connection of Germanic to Scandinavia – and thus an ancestral Indo-European cultural diffusion from north to south – seems to better fit the traditional idea of an autochthonous Germanic homeland in Scandinavia, instead of a bunch of southern Bell Beaker invaders bringing the language that could only later develop as a common Nordic language during the Bronze Age, in a genetically-diverse community…

One is left to wonder whether the support of Corded Ware + haplogroup R1a representing Pre-Germanic is also in line with the most natural human Kossinnian trends, whereby the older your paternal line and your ancestral language are connected to your historical territory, the better. The lack of researchers from Norway – where R1b subclades brought by Bell Beakers peak – in the workgroup is revealing.

Just as we are seeing strong popular pressure, e.g. from Hindu nationalists to support the Out of India Theory, or from some Slavic people to recreate a ‘Northern IE group’ with a Germano-Balto-Slavic Corded Ware culture (and a renewed interest in skin, hair and eye colour by amateur geneticists), it is only natural to expect similar autochthonous-first trends in certain regions of the Germanic-speaking community.

NOTE: I feel a bit like an anti-IECWT hooligan here, and once again fulfilling Godwin’s Law. Judging by previous reactions in this blog to criticism of the Out of India Theory, and to criticism of R1a as the vector of expansion of Indo-European languages, this post is likely to cause some people to feel bad.

It is not intended to be against these researchers individually, though. All of them have certainly contributed in great ways to their fields, indeed more than I have to any field: Kristiansen is well-known for his careful, global interpretations of European prehistory (and has been supporting his model for quite a long time). I do like Kroonen’s ideas of a Pre-Germanic substratum. And people involved in the group do so probably because they collaborate closely with each other, and because of the huge pressure to publish in journals of high impact factor, so to mix their disparate research within a common model seems only natural.

But their collaboration is boosting certain wrong ideas, and is giving way to certain misconceptions in Linguistics, and also, sadly, to renewed ethnocentric views of language in Northern Europe that will luckily be proven wrong, again. After all, publications (like ideas in general) are subject to criticism, as mine are. Researchers who publish know their work is subject to criticism, and not only before publication, but also – and probably more so – after it. That a paper can be incorrect, biased, or even completely absurd, does not mean the person who wrote it is a fool. That’s the difference between criticising ideas and insulting. If criticism offends you, you shouldn’t be publishing. Period.


Featured image: From Allentoft et al. (2015). See here for full caption.

Correlation does not mean causation: the damage of the ‘Yamnaya ancestral component’, and the ‘Future American’ hypothesis


Human ancestry can only help solve anthropological questions by using all anthropological disciplines involved. I have said that many times in this blog.

Correlation does not mean causation

Really, it does not.

You might think the tenet ‘correlation does not mean causation’ must be evident at this point in Statistics, and that it must also be so for all those using statistical methods in their research. But sadly it is not. A lot of researchers just look for correlation, and derive conclusions, without even a sound initial hypothesis to be tested… You can judge for yourself, e.g. by reading the many instances of this complaint about recent publications in Biomedical and Social Sciences on the interesting blog Statistical Modeling, Causal Inference, and Social Science.

In anthropological questions regarding Indo-European studies there is an added handicap: not taking correlation to mean causation does also mean – to avoid at least the most obvious confounders – taking into account the multiple linguistic and archaeological data that are available right now, to explain the expansion of Indo-European languages.
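To make the point about confounders concrete, here is a minimal sketch in Python. It is my own toy simulation, not taken from any of the papers discussed: the variable names and the noise levels are assumptions chosen purely for illustration. A hidden variable `z` (think of an unsampled common source) drives two observed variables `x` and `y`, which have no influence on each other at all.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=1):
    """Return (raw correlation of x and y, correlation once z is removed).

    Hypothetical setup: x and y each depend on the hidden confounder z
    plus independent noise; neither depends on the other.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]        # hidden confounder
    x = [zi + rng.gauss(0, 0.5) for zi in z]       # driven by z only
    y = [zi + rng.gauss(0, 0.5) for zi in z]       # driven by z only
    # Remove the known contribution of z from both variables.
    x_resid = [xi - zi for xi, zi in zip(x, z)]
    y_resid = [yi - zi for yi, zi in zip(y, z)]
    return pearson(x, y), pearson(x_resid, y_resid)


if __name__ == "__main__":
    raw, adjusted = simulate()
    print(f"correlation of x and y:            {raw:.2f}")
    print(f"correlation after accounting for z: {adjusted:.2f}")
```

The raw correlation comes out strong, and it all but vanishes once `z` is taken into account. Of course, in this toy case we know `z` because we generated it; in real research the confounder first has to be identified, which is exactly where the other disciplines come in.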

You might also believe that international researchers in Human Evolutionary Biology – after all, this is essentially a biomedical discipline – are acquainted with statistical methods and their problems when applied to their field. And that scientific journals – and especially those with the highest impact factors, like Nature, Science, or PNAS – have professional, careful reviewers who would never accept papers that equate correlation with causation, especially when Social Sciences are involved (because this alone might make errors grow exponentially…). Sadly, this is obviously not so, either.

The ‘Yamnaya component’ concept and its damage

From Allentoft et al. (2015), emphasis is mine:

Both studies [Haak et al. (2015) and this one] found a genetic affinity between samples from a central European culture known as Corded Ware, which existed from around 2500 BC, and samples from the earlier Yamnaya steppe culture. This similarity between distant populations is best explained by a substantial westward expansion of the Yamnaya or their close relatives into central Europe (Fig. 1b). Such an expansion is consistent with the steppe hypothesis, which argues that Corded Ware cultures were a conduit for the dispersal of Indo-European languages into Europe.

More interesting than these vague words – and the short, almost invisible suggestion that Yamna may not be exactly the population behind Corded Ware peoples – are the maps that illustrated in Nature their risky hypothesis: they called it “steppe hypothesis”, like that (in general terms), as if everyone defending a steppe origin for Proto-Indo-European would support such a model, when they actually referred to the specific hypothesis of one of their authors (Kristiansen), one of the few archaeologists who keep Gimbutas’ concept of the ‘Kurgan peoples’ alive, based on the Corded Ware culture:

Allentoft et al. (2015): “They conclude that the Corded Ware culture of central Europe had ancestry from the Yamnaya. Allentoft et al. also show that the Afanasievo culture to the east is related to the Yamnaya, and that the Sintashta and Andronovo cultures had ancestry from the Corded Ware. Arrows indicate migrations — those from the Corded Ware reflect the evidence that people of this archaeological culture (or their relatives) were responsible for the spreading of Indo-European languages. All coloured boundaries are approximate.”

In many publications that followed, the trend has been to reproduce this graphical model, by asserting (or implying) that Bell Beaker peoples were the result of subsequent Corded Ware migrations, and indeed that Corded Ware peoples migrated from the Yamna culture, and were thus the vector of expansion for Indo-European languages in Europe.

All of this is being proven wrong, as I predicted: see Mathieson et al. (2017) and Olalde et al. (2017) for recently studied samples with ‘steppe component’, older than (and unrelated to) the Yamna culture. However, no retraction (or correction, whatever) has been published to date about the concept of the ‘Yamnaya ancestry expansion’, and its consequences.

We shall see then just a rather surreptitious shift in terminology from ‘Yamnaya’ to ‘steppe’ component, to adapt to the new data – i.e. some damage control while the ship of ‘Yamnaya ancestry’ capsizes – but little else. “Earlier ‘Yamnaya ancestry’, you say? Just, you know, let’s call it ‘steppe ancestry’ and shift the expansion of Indo-European languages to one or two thousand years earlier, and done!”

The damage of this post-truth genetics is already done: we will see the unending distribution on the Internet in general, and on social networks in particular, of these grandiose conclusions, of far-fetched Indo-European migration models that include the Corded Ware culture, of simplistic maps with apparently harmless ‘arrows of migration’ (like the above) representing fictional population movements suggesting nonexistent dialectal branches.

You might be one of those sceptics wary of so many boring statistical rules: “But it’s safe reasoning: Yamnaya samples have an ‘ancestral component’ that is found elevated in Corded Ware samples, and less so in Bell Beaker samples, and PCA showed a similar result… so the migration model Yamnaya -> Corded Ware -> Bell Beaker is a priori correct, right?”
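A toy calculation shows why that reasoning is not safe. The sketch below is mine, and every number in it is invented for illustration (no real admixture estimates are used; populations `a`, `b`, `c` and the source `s` are hypothetical). It computes the share of an ‘ancestral component’ under two different histories: a chain `a -> b -> c`, and an earlier common source `s` contributing independently to each of the three.

```python
from fractions import Fraction

# Hypothetical local population carrying none of the component.
LOCAL = Fraction(0)


def mix(parents):
    """Component share of a population formed by admixture.

    `parents` is a list of (contribution, component_share) pairs whose
    contributions must add up to exactly 1.
    """
    assert sum(weight for weight, _ in parents) == 1
    return sum(weight * share for weight, share in parents)


def chain_model():
    """History 1: a -> b -> c, each step diluted by local admixture."""
    a = Fraction(1)
    b = mix([(Fraction(3, 4), a), (Fraction(1, 4), LOCAL)])
    c = mix([(Fraction(2, 3), b), (Fraction(1, 3), LOCAL)])
    return [a, b, c]


def common_source_model():
    """History 2: an earlier source s feeds a, b and c independently."""
    s = Fraction(1)
    a = mix([(Fraction(1), s)])
    b = mix([(Fraction(3, 4), s), (Fraction(1, 4), LOCAL)])
    c = mix([(Fraction(1, 2), s), (Fraction(1, 2), LOCAL)])
    return [a, b, c]


if __name__ == "__main__":
    print("chain a -> b -> c:", [str(x) for x in chain_model()])
    print("common source s:  ", [str(x) for x in common_source_model()])
```

Both histories yield the same decreasing gradient (1, 3/4, 1/2), so the gradient by itself cannot tell them apart. Something else has to decide between them: dates, archaeological context, haplogroups, linguistic models.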

The ‘Future American’ hypothesis

Let me illustrate this attractive “Correlation = Causation” argument, using it to solve the problem of Future American languages.

Suppose we live in a future post-apocalyptic world ca. 3500 AD, with no surviving historical records before 3000 AD. None. Just investigation of cultures and their relationship by Archaeology, proto-languages reconstructed and language families identified by Linguistics, etc.

We have thus Future Germanic and Future Romance as the only language families spoken in Future Western Europe and in the Future Americas, in a distribution similar to the present day*, and we have certain somehow related archaeologically-defined cultures on both sides of the Atlantic, like Briton, Iberian, Norman, or Lowlandic, although their distribution remains partly undefined in time and space.

* If you are really curious about this scenario, you can read about the potential evolution of a Future North-American language.

But what languages did the ancestors of Future Americans speak, and who spread them? That question remains far from being settled by our future researchers, in spite of the most solid linguistic and migration models (talking mainly about Briton and Iberian cultures): too many authorities out there questioning them, fighting to impose their own pet theories.

Suddenly, the newly developed field of Human Ancestry comes to save the day. So let’s say we have this map of ancient samples recovered (dated from, say, the 6th to the 18th century AD), and our study is centered on the newly described “Western European” component (a precise combination of, say, WHG+steppe), which peaks in early samples from the Low Lands – hence we call it, quite daringly, “Lowlandic component”.

Our group is keen to demonstrate that the ancient Lowlandic culture described in Archaeology (marked especially by the worldwide distribution of tulips among other traits) is the origin of Western European and American languages… Now, let’s reach conclusions about migrations in the Middle Ages!

‘Future American’ hypothesis. Migration routes in Western Europe and the Americas during the Middle Ages, based on the ‘Lowlandic component’.

PCA shows that South-West European samples cluster closely to some North-West European samples, and that some late South American samples available cluster at some distance from North American samples – nearer to a native component represented by two individuals with 0% Lowlandic ancestry and a different cluster in PCA. And some North-American samples cluster quite closely to North-West European samples.

Based on the decrease in ‘Lowlandic component’ in the different samples and on PCA, we conclude that Lowlandic peoples (“or their close relatives”) must have migrated at the same time to North America, South America (or potentially from North America to South America?) as well as western, central, and northern Europe. Both migration events must have happened roughly at the same time, in part because both distinct language families appear in a north-south distribution, and Proto-Lowlandic must be (according to Genetics) the ancestor of both Proto-Future-Germanic and Proto-Future-Romance.

That makes a lot of sense! A huge Lowlandic pressure for migration, you see. Push-pull mechanisms and stuff. A Lowlandic Empire probably (scattered remains are found everywhere)! And, judging by the presence of the ‘Lowlandic component’ in Future East Europe from the Elbe to the Vistula, maybe Lowlandic peoples spread Proto-Slavic, too! We can even date the common Lowlandic-Slavic proto-language this way! So many groundbreaking conclusions!

Future scholars supporting the Lowlandic homeland are on fire; they can’t get enough of publishing papers on the subject. “Two different Future American language families with cultural origins in Britain and Iberia, my ass! Because genetics.”

And don’t forget the future people of haplogroup R1b-U106 and high Lowlandic component: Wow, they are the heirs of those who expanded Future Germanic and Future Romance languages everywhere, aren’t they? How proud they must be. And who wouldn’t want to have these tall, blond, blue-eyed Lowlanders as their forefathers? Personalised genetic analysis is selling like crazy: “let’s know our Lowlandic percentage!”. Everyone is happy, colourful maps with lots of arrows and shit…

But – your future you might ask in awe, seeing that this doesn’t sound quite right, based on your basic archaeological and linguistic knowledge:

  • What about specific models of migration proposed to date? The most solid ones, not just any one that seems to fit?
  • What about the dialectal classification of languages? The mainstream ones, not those that are compatible with this interpretation?
  • What about archaeological cultures to which individual samples belonged?
  • What about the actual dates of each sample? And how this date relates to the state of the culture to which it belongs?
  • What about the haplogroups, and the actual subclade of each haplogroup?
  • What about the territories, cultures, and dates not sampled, could they change this interpretation in light of known archaeological models?
  • And what about the actual origin of that ancestral component they so frivolously named? Did it really appear ex nihilo in the Low Lands, and expand from there?

“Who cares! This new data is sooo coool… And it proves what we wanted, what a coincidence! And it’s numbers, mate! Numbers don’t lie.”

No, numbers don’t lie. But people do.

Correlation is fun, isn’t it?