Proto-Indo-European homeland south of the Caucasus?

Ancient DNA available from this time in Anatolia shows no evidence of steppe ancestry similar to that in the Yamnaya (although the evidence here is circumstantial as no ancient DNA from the Hittites themselves has yet been published). This suggests to me that the most likely location of the population that first spoke an Indo-European language was south of the Caucasus Mountains, perhaps in present-day Iran or Armenia, because ancient DNA from people who lived there matches what we would expect for a source population both for the Yamnaya and for ancient Anatolians. If this scenario is right the population sent one branch up into the steppe – mixing with steppe hunter-gatherers in a one-to-one ratio to become the Yamnaya as described earlier – and another to Anatolia to found the ancestors of people there who spoke languages such as Hittite.

Reich’s proposal based on ancestral components to explain the formation of a people and language is a continuation of their emphasis on ancestry to explain cultures and languages. It seems quite interesting to see this happen again, given their current trend to surreptitiously modify their previous ‘Yamnaya ancestry’ concept and Yamnaya millennia-long R1a-R1b community (that supposedly explains a Yamna -> Corded Ware -> Bell Beaker migration) to a more general ‘steppe people’ sharing a ‘steppe ancestry’ who spoke a ‘steppe language’.

This new idea based on ancestral components suffers thus from the same essential methodological problems, which equate it – yet again – to pure speculation:

  1. It is a conclusion based on the genomic analysis of few individuals from distant regions and different periods, and – maybe more disturbingly – on the lack of steppe ancestry in the few samples at hand.
  2. Wait, what? Steppe ancestry? So they are trying to derive potential genetic connections among specific prehistoric cultures with a poorly depicted genetic sketch, based on previous flawed concepts (instead of on anthropological disciplines), which seems a rather long stretch for any scientist, whether they are content with seeing themselves as barbaric scientific conquerors of academic disciplines or not. In other words, statistics is also science (in fact, the main one to assert anything in almost any scientific field), and you cannot overcome essential errors (design, sampling, hypothesis testing) merely by using a priori correct statistical methods. Results obtained this way constitute a statistical fallacy.

  3. Even if the sampling and hypothesis testing were fine, to derive anthropological models from genomic investigation is completely wrong. Ancestral component ≠ population.
  4. To include not only potential migrations, but also languages spoken by these potential migrants? It’s sad that we have a need to repeat it, but if ancestral component ≠ population, how could ancestral component = language?
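To put a rough number on the first point, here is a minimal sketch of how little the absence of a trait in a handful of samples constrains its real frequency. It uses the exact one-sided binomial bound for zero hits out of n, and it assumes (generously, and contrary to the objection above) that the samples are independent random draws from a single population. The function name and the sample sizes are hypothetical and chosen only for illustration; they are not taken from any published dataset.

```python
def upper_bound_zero_hits(n_samples: int, alpha: float = 0.05) -> float:
    """Exact one-sided upper confidence bound on the true frequency of a
    trait when 0 of n_samples carry it.

    Assumption (hypothetical, for illustration): samples are independent
    random draws from one population. Solves (1 - p)**n = alpha for p.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be a positive integer")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1.0 - alpha ** (1.0 / n_samples)


if __name__ == "__main__":
    # Illustrative sample sizes only; not the counts from any paper.
    for n in (3, 5, 10, 30):
        bound = upper_bound_zero_hits(n)
        print(f"0 of {n:>2} samples: true frequency could still be up to {bound:.0%}")
```

Under these idealised assumptions, three samples without steppe ancestry would still be compatible with a majority of the underlying population carrying it, and real sampling from distant regions and periods can only make the inference weaker.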

The Proto-Indo-European-speaking community

This is what we know about the formation of a Proto-Indo-European community (i.e. a community speaking a reconstructible Proto-Indo-European language) in the Pontic-Caspian steppe, which is based on linguistic reconstruction and guesstimates, tracing archaeological cultures backwards from cultures known to have spoken ancient (proto-)languages, and helping both disciplines with anthropological models (for which ancient genomics is only helping select certain details) of migration or – rarely – cultural diffusion:

NOTE. The following dates are obviously simplified. Read here a more detailed linguistic assessment based on phonology.

Most likely Pre-Proto-Anatolian migration with Suvorovo-Novodanilovka chiefs in the North Pontic steppe and the Balkans.
  • ca. 5000 BC. Early Proto-Indo-European (or Indo-Uralic) spoken probably during the formation and development of a loose Early Khvalynsk – Sredni Stog I cultural-historical community over the Pontic-Caspian steppe region, whose indigenous population probably had mainly Caucasus hunter-gatherer ancestry.
  • ca. 4500 BC. Khvalynsk probably speaking Middle Proto-Indo-European expands, most likely including Suvorovo-Novodanilovka chiefs into the North Pontic steppe, and probably expanding R1b-M269 lineages for the first time.
  • ca. 4000 BC. Separated communities develop, including North Pontic cultures probably gradually dominated by R1a-Z645 (potentially speaking Proto-Uralic); and Khvalynsk (and Repin) cultures probably dominated by R1b-L23 lineages, most likely developing a Late Proto-Indo-European already separated from Proto-Anatolian.
  • ca. 3500 BC. A Proto-Corded Ware population dominated by R1a-Z645 expands to the north, and slightly later an early Yamna community develops from Late Khvalynsk and Repin, expanding to the west of the Don River, and to the east into Afanasevo. This is most likely the period of reduction of variability and expansion of subclades of R1a-Z645 and R1b-L23 that we expect to see with more samples.
  • ca. 3000 BC. Expansion of Corded Ware migrants in northern Europe, and Yamna migrants along the Danube and into the Balkans, with further reduction and expansion of certain subclades.
  • ca. 2500 BC. Expansion of Bell Beaker migrants dominated by R1b-L51 subclades in Europe, and late Corded Ware migrants in east Yamna expanding R1a-Z93 subclades.

All these events are compatible with language reconstruction in mainstream European schools since at least the 1980s, are supported by traditional archaeological research of the past 20 years, and are being confirmed with genomics.

For those willingly lost in a myriad of new dreams boosted by the shallow comment contained in David Reich’s paragraph on CHG ancestry, even he does not doubt that the origin of Late Proto-Indo-European lies in Yamna, to the north of the Caucasus, based on Anthony’s (2007) account:

NOTE: By the way, David Anthony, one of the main sources of information for Reich’s group, never considered Corded Ware to have received Yamna migrants, and although he changed his model due to the conclusions of the 2015 papers, he has recently changed his model again to adapt it to the inconsistencies found in phylogeography.

CHG ancestry and PIE homeland south of the Caucasus

As for the potential origins of CHG ancestry in early Proto-Indo-European speakers, I already stated clearly my opinion quite recently. They may be attributed to:

Just to be clear, an expansion of Proto-Anatolian to the south, through the Caucasus, cannot be discarded today. It will remain a possibility until Maykop and more Balkan Chalcolithic and Anatolian-speaking samples are published.

However, an original Early Proto-Indo-European community south of the Caucasus seems to me highly unlikely, based on anthropological data, which should drive any conclusion. From what I could read, here are the rather simplistic arguments used:

  • Gimbutas and Maykop: Maykop was thought to be (in Gimbutas’ times) a rather late archaeological culture, directly connected to a Transcaucasian Copper Age culture ca. 2400-2300 BC. It has been demonstrated in recent years that this culture is substantially older, and even then language guesstimates for a Late PIE / Proto-Anatolian would not fit a migration to the north. While our ignorance may certainly be used to derive far-fetched conclusions about potential migrations from and to it, using Gimbutas (or any archaeological theory until the 1990s) today does not make any sense. Still less if we think that she favoured a steppe homeland.

NOTE. It seems that the Reich Lab may have already access to Maykop samples, so this suggested Proto-Indo-European – Maykop connection may have some real foundation. Regardless, we already know that intense contacts happened, so there will be no surprise (unless Y-DNA shows some sort of direct continuity from one to the other).

  • Gamkrelidze & Ivanov: they argued for an Armenian homeland (and are thus at the origin of yet another autochthonous continuity theory), but they did so to support their glottalic theory, i.e. merely to support what they saw as favouring their linguistic model (with Armenian being the most archaic dialect). The glottalic theory is supported today – as far as I know – mainly by Kortlandt, Jagodziński, or (Nostraticist) Bomhard, but even they most likely would not need to argue for an Armenian homeland. In fact, their support of a Graeco-Aryan group (also supported by Gamkrelidze & Ivanov) would be against this, at least in archaeological terms.
  • Colin Renfrew and the Anatolian homeland: This conceptual umbrella of language spreading with farming everywhere has changed so much and so many times in the past 20 years, with so many glottochronological and archaeological estimates circulating, that you can support anything by now using them. Mostly used today for abstract models of long-lasting language contacts, cultural diffusion, and constellation analogies. Anyway, he strives to keep up-to-date information to revise the model, that much is certain:
“A first line of evidence comes from linguistic analysis based on quantitative lexical data, which returned a tree compatible with the Anatolian hypothesis
  • Glottochronology, phylogenetic trees, Swadesh list analysis, statistical estimates, psychics, pyramid power, and healing crystals: no, please, no.

In principle, unlike many other recent autochthonous continuity theories, I doubt there can be much racial-based opposition anywhere in the world to an origin of Proto-Indo-European in the Middle East, where the oldest civilizations appeared – apart, obviously, from modern Northeast and Northwest Caucasian, Kartvelian, or Semitic speakers, who may in turn have to revisit their autochthonous continuity theories radically…

Nevertheless, it is obvious that prehistoric (and many historic) migrations are signalled by the reduction in variability and expansion of certain Y-DNA haplogroups, and not just by ancestral components. That is generally accepted, although the reasons for this almost universal phenomenon are not always clear.

In fact, Proto-Anatolian and Common Anatolian speakers need not share any ancestral component, PCA cluster, or any other statistical parameter related to steppe populations, not even the same Y-DNA haplogroups, given that approximately three thousand years might have passed between their split from an Indo-Hittite community and the first attested Anatolian-speaking communities… We must carefully follow their tracks from Anatolia ca. 1500 BC to the steppe ca. 4500 BC, otherwise we risk creating another mess like the Corded Ware one.

In my opinion, the substantial contribution of EHG ancestry and R1a-M417 lineages to the Pontic-Caspian steppe (probably ca. 6500 BC) from Central or East Eurasia is the most recent sizeable genomic event in the region, and thus the best candidate for the community that expanded a language ancestral to Proto-Indo-European – whether you call it Pre-Proto-Indo-European, Pre-Indo-Uralic, or Eurasiatic, depending on your preferences.

An early (and substantial) contribution of CHG ancestry in Khvalynsk relative to North Pontic cultures, if it is found with new samples, may actually be further proof of the Caucasian substrate of Proto-Indo-European proposed by Kortlandt (or Bomhard) as contributing to the differentiation of Middle PIE from Uralic. Genomics could thus help support, again, traditional disciplines in accepting or rejecting controversial academic theories.


In the case of an Early PIE (or Indo-Uralic) homeland, genomic data is scarce. But all traditional anthropological disciplines point to the Pontic-Caspian steppe, so we should stick to it, regardless of the informal suggestion written by a renowned geneticist in one paragraph of a book conceived as an introduction to the field.

It seems we are not learning much from the hundreds of peer-reviewed, statistically (superficially, at least) sound genetic papers whose anthropological conclusions have been proven wrong by now. A lot of people should be spending their time learning about the complex, endless methods at hand in this kind of research – not just bioinformatics – instead of fruitlessly speculating about wild unsubstantiated proposals.

As a final note, I would like to remind some in the discussion, who seem to dismiss the identification of CHG with Proto-Indo-European by supporting a “R1a-R1b” community for PIE, of their previous commitment to ancestral components in identifying peoples and languages, and thus their support for Reich’s (and his group’s) fundamental premises.

You cannot have it both ways. At least David Reich is being consistent.


First Iberian R1b-DF27 sample, probably from incoming East Bell Beakers


I had some more time to read the paper by Valdiosera et al. (2018) and its supplementary material.

One of the main issues since the publication of Olalde et al. (2018) (and its hundreds of Bell Beaker samples) was the lack of clear Y-DNA R1b-DF27 subclades among East Bell Beaker migrants, which left us wondering when the subclade entered the Iberian Peninsula, since it could have (theoretically) happened at any time from the Chalcolithic to the Iron Age.

My prediction was that this lineage found today widespread among the Iberian population crossed the Pyrenees quite early, during the Chalcolithic, with migrating East Bell Beakers expanding North-West Indo-European dialects, and that it spread slowly afterwards.

The first ancient sample clearly identified as of R1b-DF27 subclade is found in this paper, at the Late Bronze Age site Cueva de los Lagos. Although it is unidentified and has no radiocarbon date, the site as a whole is associated with the Cogotas culture and its Boquique ceramic decoration.

Y-DNA and mtDNA haplogroups, from the paper. Sequencing statistics and contamination rates for newly generated sequence data.

It was found in the northern part of the Cogotas culture territory (which lies mainly between Castile and Aragon, in North-Central Spain), shows evident steppe admixture, and it has become obvious with the latest papers (including this one) that R1b-M269 lineages intruded south of the Pyrenees associated with East Bell Beaker migrations.

The Proto-Cogotas culture is associated with a Bell Beaker substrate influenced by either El Argar or Atlantic Bronze, and the specific type of ceramics found at this Cogotas culture site is probably from the mid-2nd millennium, which is too early for the Celtic expansion.

Supervised ADMIXTURE results.

Nevertheless, due to the quite likely late date of the sample (in the centuries around 1500 BC), there is still a possibility that incoming R1b-DF27 lineages were not among the early R1b-M269 lineages found in the Iberian Chalcolithic, and were associated with later migrations from Central Europe, potentially linked to the expansion of the Urnfield culture, and thus nearer to an Italo-Celtic community.

Diachronic map of migrations in Europe ca. 1250-750 BC.

In any of these scenarios, a Pre-Celtic expansion of North-West Indo-European in Iberia (possibly associated with Lusitanian) is still the best explanation for the origin and expansion of (at least some) modern Iberian R1b-DF27 lineages, including those found among the Basque-speaking population.

This implies that the ‘indigenous’ Neolithic lineages of Iberia (like I2 and G2a2) were replaced with subsequent internal gene flows and founder effects, such as those that evidently happened (probably quite recently) among Basques, even though indigenous languages show an obvious continuity.

I would say this is the last nail in the coffin for autochthonous Y-DNA continuity theories for Spain and France (i.e. for the traditional Vasconic-Uralic hypothesis), but we know that data is never enough for any die-hard continuist… so let’s just say another nail in the coffin for endless autochthonous continuity theories.

EDIT (18 & 26 MAR 2018): Genetiker has published Y-SNP calls for both R1b samples, showing this one is R1b1a1a2a1a2a-BY15964 (see modern members of this subclade in ytree), and that the other one is R1b1a1a2a~L23.


Language continuity despite population replacement in Remote Oceania


New article (behind paywall) Language continuity despite population replacement in Remote Oceania, by Posth et al., Nat. Ecol. Evol. (2018).


Recent genomic analyses show that the earliest peoples reaching Remote Oceania—associated with Austronesian-speaking Lapita culture—were almost completely East Asian, without detectable Papuan ancestry. However, Papuan-related genetic ancestry is found across present-day Pacific populations, indicating that peoples from Near Oceania have played a significant, but largely unknown, ancestral role. Here, new genome-wide data from 19 ancient South Pacific individuals provide direct evidence of a so-far undescribed Papuan expansion into Remote Oceania starting ~2,500 yr BP, far earlier than previously estimated and supporting a model from historical linguistics. New genome-wide data from 27 contemporary ni-Vanuatu demonstrate a subsequent and almost complete replacement of Lapita-Austronesian by Near Oceanian ancestry. Despite this massive demographic change, incoming Papuan languages did not replace Austronesian languages. Population replacement with language continuity is extremely rare—if not unprecedented—in human history. Our analyses show that rather than one large-scale event, the process was incremental and complex, with repeated migrations and sex-biased admixture with peoples from the Bismarck Archipelago.

So, despite the population replacement in Oceania seen recently in Genomics, the people of present-day Vanuatu continue to speak languages descended from those spoken by the initial Austronesian inhabitants, rather than any Papuan language of the incoming migrants.

Professor Gray, Director of the Department of Linguistic and Cultural Evolution at the MPI-SHH, says:

Population replacement with language continuity is extremely rare – if not unprecedented – in human history. The linguist Bob Blust has long argued for a model in which a separate Papuan expansion reaches Vanuatu soon after initial Austronesian settlement, with the initial, and likely undifferentiated, Austronesian language surviving as a lingua franca for diverse Papuan migrant groups.

Dr. Adam Powell, senior author of the study and also of the MPI-SHH, continues,

The demographic history suggested by our ancient DNA analyses provides really strong support for this historical linguistic model, with the early arrival and complex, incremental process of genetic replacement by people from the Bismarck Archipelago. This provides a compelling explanation for the continuity of Austronesian languages despite the almost complete replacement of the initial genetic ancestry of Vanuatu.

Maps showing the migrations in the area, including, in the final map, the migrations revealed by the current study. Credit: Hans Sell, adapted from Skoglund et al. Genomic insights into the peopling of the Southwest Pacific. Nature (2016).

I think we can safely disagree now with their assertion. We are seeing more and more cases of language continuity in spite of population replacement quite clearly in Eurasian prehistory. At least:

All these cases can be explained with founder effects and gradual expansions after an initial arrival, maybe also initial close interaction between different ethnic groups, where one group (and its language) becomes the dominant one.

NOTE. Even if an alternative model is selected (say, that Corded Ware migrants spoke Indo-European languages), alternative language continuity events need to be proposed for some of these regions, so we are beyond their description as ‘rare language events’ already.

What is becoming clearer with ancient samples, therefore, is that there is little space for prehistoric cultural diffusion events (at least massive ones), which were quite popular explanations before the advent of genetic studies.


Consequences of O&M 2018 (I): The latest West Yamna “outlier”

This is the first of a series of posts analyzing the findings of the recent Nature papers Olalde et al. (2018) and Mathieson et al. (2018) (abbreviated O&M 2018).

As expected, the first Y-DNA haplogroup of a sample from the North Pontic region (apart from an indigenous European I2 subclade) during its domination by the Yamna culture is of haplogroup R1b-L23, and it is dated ca. 2890-2696 BC. More specifically, it is of Z2103 subclade, the main lineage found to date in Yamna samples. The site in question is Dereivka, “in the southern part of the middle Dnieper, at the boundary between the forest-steppe and the steppe zones”.

NOTE: A bit of history for those lost here, who appear to be many: the classical Yamna culture – from previous late Khvalynsk and (probably) Repin groups – spread west of the Don ca. 3300 BC, creating a cultural-historical community (and also an early offshoot into Asia), with mass migrations following some centuries later along the Danube to the Carpathian Basin, but also south into the Balkans, and north along the Prut. There is thus a very short time frame to find Yamna peoples shaping these massive migrations – the most likely speakers of Late Proto-Indo-European dialects – in Ukraine, compared to their most stable historical settlements east of the Don River.

There is no data on this individual in the supplementary material, since Eneolithic Dereivka samples come from stored dental remains, but the radiocarbon date (if correct) is unequivocal: the Yamna cultural-historical community dominated over that region at that precise time. Why would the authors name it just “Ukraine_Eneolithic”? They surely took the assessment of archaeologists, and there is no data on it, so I agree this is the safest name to use for a serious paper. This would not be the first sample apparently too early for a certain culture (e.g. Catacomb in this case) which ends up being nevertheless classified as such. And it is also not impossible that it represents another close Ukraine Eneolithic culture, since ancestral cultural groups did not have borders…

NOTE. Why, on the other hand, was the sample from Zvejnieki – classified as of Latvia_LN – assumed to correspond to “Corded Ware” (like the recent samples from Plinkaigalis242 or Gyvakarai1), when we don’t have data on their cultures either? No conspiracy here, just taking assessments from different archaeologists in charge of these samples: those attributed to “Corded Ware” have equally been judged solely by radiocarbon date, combined with the known archaeological signs of herding arriving in the region around this time and the old belief (similar to the “Iberia is the origin of Bell Beaker peoples” meme) that “only the Corded Ware culture signals the arrival of herding in the Baltic”. This assumption has been contested recently by Furholt, in an anthropological model that is now mainstream, upheld also by Anthony.

We already know that, out of three previous West Yamna samples, one shows Anatolian Neolithic ancestry, the so-called “Yamna outlier”. We also know that one sample from Yamna in Bulgaria also shows Anatolian Neolithic ancestry, with a distinct ‘southern’ drift, clustering closely to East Bell Beaker samples, as we can still see in Mathieson et al. (2018), see below. So, two “outliers” (relative to East Yamna samples) out of four samples… Now a new, fifth sample from Ukraine is another “outlier”, coinciding with (and possibly somewhat late to be a part of) the massive migration waves into Central Europe and the Balkans predicted long ago by academics and now confirmed with Genomics.

I think there are two good explanations right now for its ancestral components and position in PCA:

Modified image from Mathieson et al. (2018), including also approximate location of groups from Mittnik et al. (2018), and group (transparent shape outlined by dots) formed by new Bell Beaker samples from Olalde et al. (2018). “Principal components analysis of ancient individuals. Points for 486 ancient individuals are projected onto principal components defined by 777 present-day west Eurasian individuals (grey points). Present-day individuals are shown.”

a) The most obvious one, that the Dnieper-Dniester territory must have been a melting pot, as I suggested, a region which historically connected steppe, forest steppe, and forest zone with the Baltic, as we have seen with early Baltic Neolithic samples (showing likely earlier admixture in the opposite direction). The Yamna population, a rapidly expanding “elite group of patrilineally-related families” (words from the famous 2015 genetic papers, not mine), whose only common genetic trait is therefore Y-DNA haplogroup R1b-L23, must have necessarily acquired other ancestral components of Eneolithic Ukraine during the migrations and settlements west of the Don River.

How many generations are needed for ancestral components and PCA clusters to change to that extent, in regions where only some patrilocal chiefs but indigenous populations remain, and the population probably admixed due to exogamy, back-migrations, and “resurge” events? Not many, obviously, as we see from the differences among the many Bell Beaker samples of R1b-L23 subclades from Olalde et al. (2018).
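The "not many generations" point can be illustrated with simple arithmetic. The following is a minimal sketch under a deliberately idealised assumption that is mine, not taken from any of the papers: a patrilineal line in which every generation's father carries the migrant Y-DNA and every mother comes from a population with a fixed share of migrant autosomal ancestry (zero by default, i.e. fully local). The function and parameter names are hypothetical.

```python
def migrant_autosomal_fraction(generations: int, mother_fraction: float = 0.0) -> float:
    """Expected migrant autosomal ancestry along one paternal line.

    Idealised assumption (hypothetical): each child receives half of its
    autosomal ancestry from the father and half from a mother whose migrant
    ancestry share is `mother_fraction`. The Y chromosome is inherited
    unchanged from father to son, so the haplogroup does not dilute.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    if not 0.0 <= mother_fraction <= 1.0:
        raise ValueError("mother_fraction must lie between 0 and 1")
    fraction = 1.0  # the founding migrant male
    for _ in range(generations):
        fraction = 0.5 * fraction + 0.5 * mother_fraction
    return fraction


if __name__ == "__main__":
    # Fully local mothers in every generation: the fraction halves each time.
    for g in range(0, 7):
        share = migrant_autosomal_fraction(g)
        print(f"generation {g}: migrant autosomal share {share:.1%}, Y-DNA unchanged")
```

In this toy model the migrant autosomal share halves with each generation while the paternal lineage stays intact, which is why ancestral components and PCA position can shift within a few generations without any change in Y-DNA haplogroup. Real populations are of course messier, with partners who are themselves admixed, which is what the `mother_fraction` parameter is meant to hint at.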

b) That this sample shows the first genetic sign of the precise population that contributed to the formation of the Catacomb culture. Since it is a hotly debated topic where and how this culture actually formed to gradually replace the Yamna culture in the central region of the Pontic-Caspian steppe, this sample would be a good hint of how its population came to be.

For free articles on the Catacomb culture, see e.g. its entry in the Encyclopedia of Indo-European Culture, Catacomb culture wagons of the Eurasian steppes, or The Warfare of the Northern Pontic Steppe – Forest-Steppe Pastoral Societies: 2750 – 2000 B.C. There are also many freely available Russian and Ukrainian papers on anthropometry (a discipline I don’t especially like) which clearly show early radiocarbon dates for different remains.

This could then be not ‘just another West Yamna outlier’, but would actually show meaningful ‘resurge’ of Neolithic Ukraine ancestry in the Catacomb culture.

It could be meaningful to derive hypotheses, in the same way that the late Central European CWC sample from Esperstedt (of R1a-M417 subclade) shows recent exogamy directly from the (now more probably eastern part of the) steppe or steppe-forest, and thus implies great mobility among distant CWC groups. Although, given the BB samples with elevated steppe ancestry and close PCA cluster from Olalde et al. (2018), it could also just mean exogamy from a nearby region, around the Carpathian Basin where Yamna migrants settled…

If this were the case, it would then potentially mean a “continuity” break in the steppe, in the region that some looked for as a Balto-Slavic homeland, and which would have been only later replaced by Srubna peoples with steppe ancestry (and probably R1a-Z93 subclades). We would then be more obviously left with only two options: a hypothetical ‘Indo-Slavonic’ North Caspian group to the east (supported by Kortlandt), or a Central-East European homeland near Únětice, as one of the offshoots from the North-West Indo-European group (supported by mainstream Indo-Europeanists).

How to know which is the case? We have to wait for more samples in the region. For the moment, the date seems too early for the known radiocarbon dating of most archaeological remains of the Catacomb Culture.

Diachronic map of Late Copper Age migrations including steppe groups ca. 2600-2250 BC

An important consequence of the addition of these “Yamna outliers” for the future of research on Indo-European migrations is that, especially if confirmed as just another West Yamna sample (with more, similar samples), early Palaeo-Balkan peoples migrating south of the Danube and later through Anatolia may need to be judged not only in terms of ancestral components or PCA (as in the paper on Minoans and Mycenaeans), but also and more decisively using phylogeography, especially with the earliest samples potentially connected with such migrations.

NOTE. Regarding the controversy (that some R1b European autochthonous continuists want to create) over the origin of the R1b-L151 lineages, we cannot state its presence for sure in Yamna territory right now, but we already have R1b-M269 in the eastern Pontic-Caspian steppe during the Neolithic-Chalcolithic transition, then R1b-L23 and subclades (mainly R1b-Z2103, but also one xZ2103, xL51 which suggests its expansion) in the region before and during the Yamna expansion, and now we have L51 subclades with elevated steppe ancestry in early East Bell Beakers, which most likely descended from Yamna settlers in the Carpathian Basin (yet to be sampled).

Even without express confirmation of its presence in the steppe, the alternative model of a Balkan origin seems unlikely, given the almost certain continuity of expanding Yamna clans as East Bell Beaker ones, in this clearly massive and relatively quick expansion that did not leave much time for founder effects. But, of course, it is not impossible to think about a previously hidden R1b-L151 community in the Carpathian Basin yet to be discovered, adopting North-West Indo-European (by some sort of founder effect) brought there by Yamna peoples of exclusively R1b-Z2103 lineages. As it is not impossible to think about a hidden and ‘magically’ isolated community of haplogroup R1a-M417 in Yamna waiting to be discovered…Just not very likely, either option.

As to why this sample or the other Bell Beaker samples “solve” the question of R1a-Z645 subclades (typical of Corded Ware migrants) not expanding with Yamna, it’s very simple: it doesn’t. What should have settled that question – in previous papers, at least since 2015 – is the absence of this subclade in elite chiefs of clans expanded from Khvalynsk, Yamna, or their only known offshoots Afanasevo and Bell Beaker. Now we only have still more proof, and no single ‘outlier’ in that respect.

No haplogroup R1a among hundreds of samples from a regionally extensive sampling of the only cultures mainstream archaeologists had thoroughly described as potentially representing Indo-European-speakers should mean, for any reasonable person (i.e. without a personal or professional involvement in an alternative hypothesis), that Corded Ware migrants (as expected) did not stem from Yamna, and thus did not spread Late Indo-European dialects.

This haplogroup’s hegemonic presence in North-Eastern Europe – and the lack of N1c lineages until after the Bronze Age – coinciding with dates when Uralicists have guesstimated Uralic dialectal expansion across this wide region makes the question of the language spread with CWC still clearer. The only surprise would have been to find a hidden and isolated community of R1a-Z645 lineages clearly associated with the Yamna culture.

NOTE. A funny (however predictable) consequence for R1a autochthonous continuists of Northern or Eastern European ancestry: forum commentators are judging if this sample was of the Yamna culture or spoke Indo-European based on steppe component and PCA cluster of the few eastern Yamna samples which define now (you know, with the infallible ‘Yamnaya ancestral component’) the “steppe people” who spoke the “steppe language”™ – including, of course, North-Eastern European Late Neolithic

Not that radiocarbon dates or the actual origin of this sample cannot be wrong, mind you, it just strikes me how twisted such biased reasonings may be, depending on the specific sample at hand… Denial, anger, and bargaining, including shameless circular reasoning – we know the drill: we have seen it a hundred times already, with all kinds of supremacist autochthonous continuists who still today manage to place an outdated mythical symbolism on expanding Proto-Indo-Europeans, or on regional ethnolinguistic continuity…

More detailed posts on the new samples from O&M 2018 and their consequences for the Indo-European demic diffusion to come, indeed…

Reactionary views on new Yamna and Bell Beaker data, and the newest IECWT model

You might expect some rambling about bad journalism here, but I don’t have time to read so much garbage and analyze it all. We have seen already what they did with the “blackness” or “whiteness” of the Cheddar Man: no paper published, just some informal data, but too much sensationalism already.

Some people who supported far-fetched theories on Indo-European migrations or common European haplogroups are today sharing some weeping and gnashing of teeth around forums and blogs – although, to be fair, neither Olalde et al. (2018) nor Mathieson et al. (2018) actually gave any surprising new data that you couldn’t infer before… People are nevertheless in the middle of the five stages of grief (for whatever expectations they had for new samples), and acceptance will surely take some time.

They will be confronted with two options:

  1. Keep fighting for what they believed, however wrong it turns out to be – after all, we still see all kinds of autochthonous continuists out there, no matter how much data there is against their views. People want to be supporters of a West European origin of R1b-M269 linked to Vasconic languages, fans of R1b-M269 continuity in Central Europe, Uralic speakers who believe in hidden N1c communities in Mesolithic or Neolithic Eastern Europe, fans of the OIT and Indian origins of R1a-M417…
  2. Just accept what seems now clear, change their model, and go on.
Modified from Wiik for the current autochthonous continuity fans: Vasconic-Uralic distribution and Indo-European folk distribution

For me, the second option sounds quite simple, since whatever happens – markers of Indo-European migration being R1a or R1b, Corded Ware or Bell Beaker, or both – our group’s aim for the past 15 years or so has been to support a North-West Indo-European proto-language, so any of the most reasonable anthropological models is a priori compatible with that. My model of Indo-European demic diffusion fits best the most recent proto-language guesstimates, though.

However, I understand that if I had been buying or selling dreams – and I mean literally buying or selling fantasies of whiteness and Europeanness (hidden behind an idealized concept of “Indo-European”, and ancestral components disguised as populations), beginning with the ‘R1a-M417/CWC’ and ‘Yamnaya ancestry’ craze of the 2015 papers – , and I realized data didn’t support that money exchange, I would be frustrated, too.

There is a funny mental process going on here for some of these people, as far as I could read today. Let me review some history of the Indo-European question here before getting to the point:

  1. Firstly linguists reconstructed (and are still doing it) Proto-Indo-European and other ancestral Indo-European proto-languages.
  2. Then archaeologists tried to identify certain ancestral cultures with these actual communities with help from linguistic guesstimates and dialectal classifications,
  3. using anthropological models of migration or cultural diffusion.
  4. Then genetic data came to support one of these alternative anthropological models, if possible.

Now some (amateur) geneticists are apparently disregarding what “Indo-European” means, and why Yamna was considered the best candidate for the expansion of Late Indo-European languages, and question the very sciences of Linguistics and Archaeology as unreliable, instead of questioning their own false assumptions and wrong interpretations from genetic papers.

Really? Genomics (especially ancestral components) now defines what an Indo-European population, culture, and language is? If that is not a fallacy of circular reasoning, I wonder what is.

The modified IECWT model

The surprise today came from the quick reaction of one member of the IECWT workgroup, Guus Kroonen, in his draft Comments to Olalde et al. 2018, The Beaker phenomenon and the genomic transformation of northwest Europe, Nature.

The IECWT workgroup’s so-called “Steppe model” until today, as presented in Haak et al. (2015).

He and – I can only guess – the whole IECWT workgroup finally rejected their characteristic Corded Ware -> Bell Beaker migration model – which they defended as “The steppe model” of Indo-European migration in Haak et al. 2015. They now defend a proposal similar to Anthony (2007).

Fun fact: Anthony changed his mind recently to partially support what Heyd said in 2007. While I did not dislike Anthony’s new model, I consider it wrong.

The Danish group – unsurprisingly – sticks nevertheless to the hypothesis of some kind of autochthonous Germanic in Scandinavia being defined by Corded Ware migrants and haplogroup R1a, and being somehow special and older among Proto-Indo-European dialects because of its non-Indo-European substrate – although in fact Kroonen’s original linguistic paper didn’t imply so.

While this new change of the workgroup’s model brings it closer to Heyd (2007), and parallels in that sense the adaptation process of Anthony (but always one step behind), what they are proposing right now no longer seems to be a modified Kurgan model, as I described it: it is essentially The Kurgan model of Marija Gimbutas (1963), with Bell Beakers spreading a language ancestral to Italo-Celtic, and Corded Ware spreading some kind of mythical Germano-Balto-Slavic.

I find it odd that he would not cite Gimbutas, Heyd – as Anthony recently did – , or the most recent paper of Mallory on the language expanded with Bell Beakers, but just the workgroup’s papers and other old ones, to present this “new” theory.

However simple and (obviously) rapidly drafted it was, following the publications in Nature, it does not seem right: They were first, they were right, acknowledge them. Period.

It is interesting how the wrong interpretations of the ‘Yamnaya ancestral component’ (you know, that bulletproof “Yamna R1a-R1b community” and Yamna->Corded Ware migration that never happened) are affecting everyone involved in Indo-European studies.


Ancestral heterogeneity of ancient Eurasians

Iosif Lazaridis tweets about an interesting preprint at bioRxiv (eclipsed by today’s Nature papers), Ancestral heterogeneity of ancient Eurasians, by Daniel Shriner.


Supervised clustering or projection analysis is a staple technique in population genetic analysis. The utility of this technique depends critically on the reference panel. The most commonly used reference panel in the analysis of ancient DNA to date is based on the Human Origins array. We previously described a larger reference panel that captures more ancestries on the global level. Here, I reanalyzed DNA data from 279 ancient Eurasians using our reference panel, finding substantially more ancestral heterogeneity than has been reported. This reanalysis provides evidence against a resurgence of Western hunter-gatherer ancestry in the Middle to Late Neolithic and evidence for a common ancestor of farmers characterized by Western Asian ancestry, a transition of the spread of agriculture from demic to cultural diffusion, at least two migrations between the Pontic-Caspian steppes and Bronze Age Europe, and a sub-Saharan African component in Natufians that localizes to present-day southern Ethiopia.

Admixture bar plots showing projections of ancient Eurasians (Steppe peoples on the left, Bronze Age Europeans on the right) onto 21 ancestries. The proportions are the raw output from ADMIXTURE. The 21 ancestral components are Southern African (dark orchid), Central African (magenta), West-Central African (brown), Eastern African (orange), Omotic (yellow), Northern African (purple), South Indian (slate blue), Kalash (black), Japanese (red), Sino-Tibetan (green), Southeastern Asian (coral), Northern Asian (aquamarine), Amerindian (gray), Oceanian (salmon), Southern European (dark olive green), Northern European (blue), Western Asian (white), Arabian (light gray), Western African (tomato), Circumpolar (pink), and Southern Asian (dark goldenrod).

Excerpt (emphasis mine)

Early to Middle Bronze Age Steppe Peoples
Third, we considered the Eurasian steppe peoples (See figure). The Eneolithic Samara sample had 64.4% Northern European, 18.2% Southern Asian, 8.8% Circumpolar, 4.3% Amerindian, and 4.3% Southern European ancestries. The 27 Early to Middle Bronze Age steppe individuals (Yamnaya from Kalmykia, Yamnaya from Samara, Afanasievo, Poltavka, and Potapovka) averaged 54.7% Northern European, 27.8% Southern Asian, 7.9% Southern European, 4.7% Kalash, 4.2% Amerindian, and 0.8% Western Asian ancestries. We included the Potapovka sample here because the sum of absolute differences in ancestry was greater post-Potapovka rather than post-Poltavka. The increases in Southern Asian and Southern European ancestries do not fit with a European hunter-gatherer source and more broadly do not fit with any of the samples, suggesting an unknown source population. Currently, Southern Asian ancestry co-localizes with Y DNA haplogroup L and correlates with Indo-Iranian languages.

Although there are no L haplogroups in any of these Early to Middle Bronze Age steppe individuals, the correlation with Indo-Iranian languages strengthens the connection between Early to Middle Bronze Age steppe peoples and the introduction of Indo-European languages into Europe. In the Early to Middle Bronze Age steppe peoples, 83.3% of Y DNA haplogroups were R1b and 85.2% of mitochondrial haplogroups were H, J, T, or U. Thus, Northern European ancestry was primarily associated with R1b in these peoples, rather than with I2 as in the European hunter-gatherers, while the mitochondrial lineages were more diverse than in the European hunter-gatherers but less diverse than in the Early Neolithic peoples.
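To make the shift described in the excerpt easier to follow, here is a minimal Python sketch that takes the percentages quoted above and computes the change per component between the Eneolithic Samara sample and the Early to Middle Bronze Age steppe average. The function names (`ancestry_shift`, `sum_abs_diff`) are hypothetical helpers of mine, not anything from Shriner's preprint. Keep in mind that it compares a single sample with a rounded average of 27 individuals, so it is only an illustration of the arithmetic (including the "sum of absolute differences" measure mentioned in the text), not a reanalysis.

```python
# Ancestry proportions (percent) exactly as quoted in the excerpt above.
eneolithic_samara = {
    "Northern European": 64.4,
    "Southern Asian": 18.2,
    "Circumpolar": 8.8,
    "Amerindian": 4.3,
    "Southern European": 4.3,
}
emba_steppe = {  # average of the 27 Early to Middle Bronze Age steppe individuals
    "Northern European": 54.7,
    "Southern Asian": 27.8,
    "Southern European": 7.9,
    "Kalash": 4.7,
    "Amerindian": 4.2,
    "Western Asian": 0.8,
}


def ancestry_shift(before, after):
    """Per-component change (after - before) in percentage points.

    Hypothetical helper: components missing from one profile count as 0.
    """
    components = sorted(set(before) | set(after))
    return {c: round(after.get(c, 0.0) - before.get(c, 0.0), 1) for c in components}


def sum_abs_diff(before, after):
    """Sum of absolute per-component differences, in percentage points."""
    return round(sum(abs(d) for d in ancestry_shift(before, after).values()), 1)


shift = ancestry_shift(eneolithic_samara, emba_steppe)
# List components from the largest decrease to the largest increase.
for component, delta in sorted(shift.items(), key=lambda kv: kv[1]):
    print(f"{component:18s} {delta:+.1f}")
print("sum of absolute differences:", sum_abs_diff(eneolithic_samara, emba_steppe))
```

The two increases the excerpt singles out (Southern Asian and Southern European) come out as positive values here, while Northern European and Circumpolar drop.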

It is an interesting new approach, in that it takes into account more than just admixture components and PCA to assess ancestral populations.

As simplistic and wrong as some conclusions may seem from your point of view, you have to take into account what Iain Mathieson had to (sadly) expressly state recently:


Spatio-temporal deixis and cognitive models in early Indo-European


Interesting article, Spatio-temporal deixis and cognitive models in early Indo-European, by Annamaria Bartolotta, Cognitive Linguistics (2018); 29(1):1-44.

Abstract (emphasis mine):

This paper is a comparative study based on the linguistic evidence in Vedic Sanskrit and Homeric Greek, aimed at reconstructing the space-time cognitive models used in the Proto-Indo-European language in a diachronic perspective. While it has been widely recognized that ancient Indo-European languages construed earlier (and past) events as in front of later ones, as predicted in the Time-Reference-Point mapping, it is less clear how in the same languages the passage took place from this ‘archaic’ Time-RP model or non-deictic sequence, in which future events are behind or follow the past ones in a temporal sequence, to the more recent ‘post-archaic’ Ego-RP model that is found only from the classical period onwards, in which the future is located in front and the past in back of a deictic observer. Data from the Rigveda and the Homeric poems show that an Ego-RP mapping with an ego-perspective frame of reference (FoR) could not have existed yet at an early Indo-European stage. In particular, spatial terms of front and behind turn out to be used with reference not only to temporal events, but also to east and west respectively, thus presupposing the existence of an absolute field-based FoR which the temporal sequence is metaphorically related to. Specifically, sequence is relative position on a path appears to be motivated by what has been called day orientation frame, in which the different positions of the sun during the day motivate the mapping of front onto ‘earlier’ and behind onto ‘later’, without involving ego’s ‘now’. These findings suggest that early Indo-European still had not made use of spatio-temporal deixis based on the tense-related ego-perspective FoR found in modern languages.

Featured image, from the article: Helios rising from the sea (Blacas red-figured calyx-krater, fifth century B.C., British Museum). Related quote from the article:
“Interestingly, the archeological evidence supports that time could be spatialized along the lateral axis. In ancient Greek art the sun is represented as moving from right to left. Such orientation can be observed, for instance, on the Blacas red-figured calyx-krater of the fifth century B.C. (London, British Museum), where Helios is found at the extreme right of the scene and proceeds to the viewer’s left, following Eos, i.e., the dawn.”

“How Asian nomadic herders built new Bronze Age cultures”

I recently wrote about a good informal summary of genomic research in 2017 for geneticists.

I found a more professional review article, How Asian nomadic herders built new Bronze Age cultures, by Bruce Bower, which appeared in Science News (25th Nov. 2017).

NOTE: I know, I know, the Pontic-Caspian steppe is in East Europe, not Asia, but what can you do about people’s misconceptions regarding European geography? After all, the division is a conventional one, there are not many landmarks to divide Eurasia…

It refers to Kristiansen’s model, which we already know supports the expansion of IE languages with the Corded Ware culture, and a later Corded Ware -> Bell Beaker migration. This is followed by many geneticists today as “The steppe model”.

Corded Ware culture emerged as a hybrid way of life that included crop cultivation, breeding of farm animals and some hunting and gathering, Kristiansen argues. Communal living structures and group graves of earlier European farmers were replaced by smaller structures suitable for families and single graves covered by earthen mounds. Yamnaya families had lived out of their wagons even before trekking to Europe. A shared emphasis on family life and burying the dead individually indicates that members of the Yamnaya and Corded Ware cultures kept possessions among close relatives, in Kristiansen’s view.

“The Yamnaya and the Corded Ware culture were unified by a new idea of transmitting property between related individuals and families,” Kristiansen says.

Yamnaya migrants must have spoken a fledgling version of Indo-European languages that later spread across Europe and parts of Asia, Kristiansen’s group contends. Anthony, a longtime Kristiansen collaborator, agrees. Reconstructed vocabularies for people of the Corded Ware culture include words related to wagons, wheels and horse breeding that could have come only from the Yamnaya, Anthony says.

As Indo-European languages spread, the Yamnaya’s genetic impact in Europe remained substantial, even after the disappearance of Corded Ware culture around 4,400 years ago, Reich’s team reported online May 9. About 50 percent of the ancestry of individuals from a later Bronze Age culture, dubbed the Bell Beaker culture for its pottery vessels shaped like an inverted bell, derived from Yamnaya stock. Such pottery spread across much of Europe starting nearly 4,770 years ago and disappeared by 3,800 years ago. Migrations of either people or ideas may have accounted for that dispersal.

NOTE. Anthony, as we know, has already changed his mind with the most recent data.

The author juxtaposes other opinions, to somehow balance the article:

Like many of his colleagues, archaeologist Volker Heyd of the University of Bristol in England was jolted by the 2015 reports of a close genetic link between Asian herders and a Bronze Age culture considered native to Europe. But, Heyd says, the story of ancient Yamnaya migrations is more complex than the rapid-change scenario sketched out by Kristiansen and Anthony.

No evidence exists that Yamnaya people rapidly developed practices typical of the Corded Ware culture in one part of Europe, Heyd argues in the April Antiquity. Cultural shifts in Europe around 5,000 years ago must have emerged from an extended series of small-scale dealings with Yamnaya and other pastoralists, which was then capped off by a large influx of steppe wagon travelers, he says.

For instance, individual graves and other signs of contact with the Yamnaya people and even earlier Asian pastoralists appear in Europe 1,000 to 2,000 years before DNA-transforming migrations occurred. Consider that the Yamnaya account for 5 percent of the ancestry of Ötzi the Iceman, who lived in southeastern Europe roughly 300 years before the Yamnaya’s big move (SN: 5/27/17, p. 13). Little is known about those earlier encounters.

Efforts to decipher ties between Yamnaya and Corded Ware culture are complicated by the fact that DNA is available from just a few people from each group, says Heyd, who is currently excavating Yamnaya graves in Hungary. Ancient DNA samples analyzed in the 2015 papers come from only a handful of Yamnaya and Corded Ware culture sites in a few parts of Europe and Russia.

Heyd suspects that Yamnaya travelers had even earlier contacts, perhaps by 5,400 years ago, with central and eastern Europeans known for making globe-shaped pots with small handles. Individuals from that culture, excavated at two sites in Poland and Ukraine, possess no Yamnaya genes, a team affiliated with Reich’s lab reported online May 9. But Heyd thinks mating between members of that European culture and Yamnaya migrants may have occurred a bit farther east, where cross-cultural contacts probably occurred at the boundary of European forests and Asian grasslands.

Other genetic clues point to a long history of Asian pastoralists crossing into parts of Europe. Small amounts of DNA from steppe herders, possibly the Yamnaya, appeared in three hunter-gatherer skeletons from southeastern Europe dating to as early as around 6,500 years ago.

It is always interesting to see how reports gradually evolve, including more and more doubts about the ‘Yamnaya component’, and how it may be correctly interpreted. Slow but steady wins the race.

Check out the full article.

Featured image: from the article, based on the 2015 papers and Kristiansen’s model.

The myth of mixed language, the concepts of culture core and package, and the invention of ‘Steppe folk’


I recently read some papers which, albeit apparently unrelated, should be of interest for many today.

Mixed language

The myth of the mixed languages, by Kees Versteegh, in Advances in Maltese linguistics, ed. by Benjamin Saade and Mauro Tosco, 217-238. Berlin and New York: Mouton de Gruyter, 2017 [uncorrected proofs]

This paper focuses on the usefulness of the label ‘mixed languages’ as an analytical tool. Section 1 sketches the emergence of the biological paradigm in linguistics and its effect on the contemporary debate about mixed languages. Sections 2 and 3 discuss two processes that have been held responsible for the emergence of mixed languages, code switching and extreme borrowing. Section 4 compares these two mechanisms with the categories of change in Thomason & Kaufman (1988), while Section 5 offers some conclusions about the status of mixed languages as a special category.

Although the paper is a must read for language contact and language change (code-switching, borrowing, shifting), a good summary may save you some time if you are not interested in linguistics:

Speakers may either shift to a new language while retaining traces of their old language, or they may stick to their original language while borrowing from another language with which they come in touch (…)

[Bakker] distinguishes two types of communities in which mixing is found: isolated mixed marriage communities with (asymmetrical) bilingualism; and nomadic communities that shift to a dominant language, but retain a substantial part of their lexicon as a private or secret register, closely connected with the community’s identity. The history of these communities provides us with plausible scenarios to explain the idiosyncrasies of the speech pattern of the speakers belonging to them.

In what way, then, does it help to put both categories under one label of mixed languages? I believe Backus (2003: 263) is right when he suggests that the question of whether a certain set of features constitutes a mixed language is perhaps not a very interesting one. The question should not be whether given certain features a language may be categorized as mixed, but what the linguistic effects of different kinds of contact (trade, work, conquest, mixed marriages, colonization, marginalization, etc.) are, and to what extent these effects correlate with the type of contact. At no point is it necessary to posit a category of mixed languages. In fact, the myth of the mixed languages may have been perpetuated because of the relative weirdness of the initial cases, notably that of Michif, which represent phenomena so unique that it is understandable that some scholars came to believe that they could only be explained by special mechanisms. The position taken here is that if we focus on the speakers’ behavior, the phenomena in question become much more understandable. The crucial point is that languages do not mix, people do.

Culture core and package

The article Isolation-by-distance, homophily, and “core” vs. “package” cultural evolution models in Neolithic Europe, by Shennan, Crema, and Kerig (2015):

Recently there has been growing interest in characterising population structure in cultural data in the context of ongoing debates about the potential of cultural group selection as an evolutionary process. Here we use archaeological data for this purpose, which brings in a temporal as well as spatial dimension. We analyse two distinct material cultures (pottery and personal ornaments) from Neolithic Europe, in order to: a) determine whether archaeologically defined “cultures” exhibit marked discontinuities in space and time, supporting the existence of a population structure, or merely isolation-by-distance; and b) investigate the extent to which cultures can be conceived as structuring “cores” or as multiple and historically independent “packages”. Our results support the existence of a robust population structure comparable to previous studies on human culture, and show how the two material cultures exhibit profound differences in their spatial and temporal structuring, signalling different evolutionary trajectories.

Our results suggest distinct evolutionary histories in the spatial and temporal variation of personal ornament and pottery, with different rates of innovation, patterns of descent, and dynamics of diffusion. Ornament data do show statistically significant values of ΦST using pottery-defined population structures, but the magnitude is extremely small, and partial Mantel tests suggest that much of this pattern is explained by isolation by distance. These results are in line with a model of culture represented by independent “packages” of multiple coherent units rather than one characterised by a distinct and fairly isolated “core” surrounded by a “periphery” of elements prone to crosscultural transmission. The alternative hypothesis is that one element was part of the “core” tradition, whilst the other was peripheral. This scenario is however less likely given that both elements are generally regarded as expression of local lines of transmission and/or signalling.

The robust support for a population structure in the pottery data shows that some degree of homophily must have biased the transmission process, but this bias was confined within the single “package”, rather than affecting other aspects of the material culture. In other words similarity (or dissimilarity) of pottery style was not influencing the transmission process of personal ornaments and vice-versa. If this was the case, we should have observed a stronger agreement between the spatio-temporal distribution of the two datasets, a pattern we failed to observe. Personal ornaments are often seen as group-identity markers, but the fact that our study appears to indicate a stronger role for isolation by distance in accounting for variation in ornaments suggests that this assumption may not be valid, or alternatively that these groups cross-cut the archaeological cultures traditionally recognised. Thus, while our study has provided strong evidence of population structure affecting patterns of cultural interaction, in this case at least the distinct patterns observed point to a modular, ‘package’ model. It has also shown that we can identify population structuring from the evidence of the archaeological record without continuing to attempt the fruitless task of correlating its patterns with past ethnolinguistic units.
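For readers unfamiliar with the isolation-by-distance check mentioned in the excerpt, here is a rough Python sketch of a plain Mantel permutation test between two distance matrices. Note the assumptions: the paper reports partial Mantel tests, which additionally control for a third matrix, so this is a simpler relative of their method and not a reproduction of it; the site positions and style scores are invented toy values, and the function names are my own.

```python
import random
from itertools import combinations


def upper_triangle(matrix):
    """Flatten the above-diagonal entries of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Plain (not partial) Mantel test: correlation and one-sided p-value.

    Rows and columns of dist_b are permuted together, which keeps the
    matrix structure intact while breaking its link to dist_a.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data (hypothetical): six sites along a line, with one style score each.
positions = [0, 1, 2, 4, 7, 11]
style_scores = [0.0, 0.3, 0.5, 1.1, 1.6, 2.9]
geo = [[abs(a - b) for b in positions] for a in positions]
cultural = [[abs(a - b) for b in style_scores] for a in style_scores]

r, p = mantel(geo, cultural)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

A high correlation with a low p-value in such a test is what "explained by isolation by distance" means in practice: sites that are further apart are also more dissimilar, without any need to invoke bounded cultural groups.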

Location of material culture (ornament and pottery) data of 361 Neolithic sites in central

A situation more interesting than Neolithic Europe for those following this blog will probably be the one on West Europe during the Bell Beaker expansion.

Regarding Iberia, I have already talked about the possibilities of “resurgence” after the arrival of Bell Beaker migrants, in this blog and in the Indo-European demic diffusion model.

About the British Isles, you can read e.g. about the stepped and different cultural changes that happened after the arrival of East Bell Beakers in What was and what would never be: changing patterns of interaction and archaeological visibility across north-west Europe from 2,500 to 1,500 cal BC, by Wilkin, N., Vander Linden, M. 2015. In: Anderson-Whymark, H., Garrow, D., Sturt, F. (eds) Continental connections. Exploring cross-Channel relationships from the Mesolithic to the Iron Age. Oxford: Oxbow: 99-121.

The changing cultural geography of north-west Europe c. 3000-1500 BC. Solid line: architecture; dashed line: material culture; dash-dot line: funerary practices

This logical description of different cultural changes brings up a question obvious to many, which can be summed up as “does the arrival of (North-West Indo-European-speaking) East Bell Beakers mean the end of non-Indo-European cultures in Western and Northern Europe?” The answer is obviously – as in the rest of Europe – quite simply No. Many non-Indo-European groups must have survived the initial expansion of East Bell Beakers in many regions, as the pre-Roman situation (already quite simplified after the expansion of Celtic and Germanic) testifies.

“Resurgence” of local groups as seen in genomic data is the most direct connection with survival of previous non-IE cultures (and thus languages), but obviously not the only mechanism of language survival, since we can see founder effects – such as those seen in modern Basque speakers (mainly of R1b subclades), and in Ugro-Finnic speakers in north-east Europe (mainly of N subclades).

Steppe folk

I have recently been stumbling more and more often upon the concept of ‘Steppe people’ used by amateur geneticists, whether in Anthrogenica, in blogs and blog comments, or even in research papers, where people used to talk about ‘Yamnaya people’ or ‘Yamnaya folk’. As I said, I expected this since I questioned the concept of the ‘Yamnaya ancestral component’, so I feel vindicated by this change.

However, whereas most will be using these names simply following the redeeming term “steppe admixture”, and thus refer to a Neolithic steppe population that shared a common admixture component (probably ca. 5000 BC), some will obviously go further and identify this component with Proto-Indo-European (like that, in general terms), since it is the logical sequence for those who consider the term “Indo-Europeans” as an umbrella for a certain ethnic proportion and a link with modern populations in stupid autochthonous continuity theories, where prehistoric language and culture are irrelevant.

NOTE. Evidently, those who supported quite strongly the fact that R1a-M417 subclades associated with Corded Ware migrants (i.e. mainly R1a-Z645) stemmed directly from Yamna migrants are shifting terminology to “steppe people” in light of recent data, so that they can support an older steppe community in the likely case that no R1a is found in Yamna, while keeping open the possibility to revert to a more direct support of the Yamna -> Corded Ware model in case just one R1a sample is found… So no anthropological model at all here, just a personal desire to be fulfilled in any possible way.

Those thinking naively about an imaginary ‘Steppe folk’, living in loosely connected steppe cultures, speaking a mixed ‘Steppe language’, may well keep inventing potentially popular peoples and languages based on admixture, such as West Hunter-Gatherer folk (Vasconic?), East Hunter-Gatherer folk (Uralic? – or maybe today they can invent a Siberian folk bringing Finno-Ugric after 500 BC), Caucasus Hunter-Gatherer folk (??), etc. So welcome back to the 1930s!… Or was it the 2000s?

Image modified by me from Wiik’s original, for those of you nostalgic fans of autochthonous continuity theories: You can now again support a native Vasconic-Uralic-Indo-European “folk” distribution in Neolithic Europe.

Some people like to talk about how “Science” wins against Academia, especially when they try to defend this pseudo-subfield they are inventing and venerating on the go to characterize ancient populations, where new genomic methods are king, and the other fields involved are just noise they easily use or dismiss to support their own desires and preconceptions.

NOTE. I felt like many fans of Genomics are very well represented in this recent article at ArXiv: “23andMe confirms: I’m super white” — Analyzing Twitter Discourse On Genetic Testing

Too bad for them. The misuse of this new field might be popular today among certain amateur geneticists, but it will not stand the test of time, similar to how the initial hype around radiocarbon analysis (for Archaeology) or glottochronology (for Linguistics) eventually faded, and they became just another tool among traditional methods. In Science, time puts everything where it belongs.

Whether you like it or not, Indo-European (and Uralic) questions will be solved – as they have been for a long time now, there is nothing new under the sun – with Historical Linguistics first, then Prehistoric Archaeology, then Anthropology (models of migration, cultural diffusion, founder effects, etc.), and only then Genomics, which may (or may not) help solve certain controversial aspects, by supporting one or other anthropological model. Period.

Genetic prehistory of the Baltic Sea region and Y-DNA: Corded Ware and R1a-Z645, Bronze Age and N1c


Open Access The genetic prehistory of the Baltic Sea region, by Mittnik et al., Nature Communications 9: 442 (2018), based on preprint The Genetic History of Northern Europe, at bioRxiv.

As you can see, it follows my predictions in terms of haplogroups, and sadly the same trend to substitute ‘Yamna’ for ‘steppe’ while keeping linguistic interpretations unchanged…

Important excerpts for the Indo-European question (emphasis mine):

Mesolithic to Neolithic

In the archaeological understanding, the transition from Mesolithic to Neolithic in the Eastern Baltic region does not coincide with a large-scale population turnover and a stark shift in economy as seen in Central and Southern Europe. Rather, it is signified by a change in networks of contacts and the use of pottery, among other material, cultural and economic changes. Our results suggest continued admixture between groups in the south of the Eastern Baltic region, who are more closely related to WHG, and northern or eastern groups, more closely related to EHG. Neolithic social networks from the Eastern Baltic to the River Volga could also explain similarities of the hunter-gatherer pottery styles, although morphologically analogous ceramics might also have developed independently due to similar functionality. The genetic evidence for a change in networks and possibly even a large-scale population movement is most pronounced in the Middle Neolithic in individuals attributed to the CCC. The distribution of this culture overlaps in the north with the Narva culture and extends further north to Finland and Karelia. Its spread in the Eastern Baltic is linked with a significant change in imported raw materials, artefacts, and the appearance of village-like settlements.

Neolithic to Chalcolithic

We see a further population movement into the regions surrounding the Baltic Sea with the CWC in the Late Neolithic that was accompanied by the first evidence of extensive animal husbandry in the Eastern Baltic. The presence of ancestry from the Pontic-Caspian Steppe among Baltic CWC individuals without the genetic component from north-western Anatolian Neolithic farmers must be due to a direct migration of steppe pastoralists that did not pick up this ancestry in Central Europe. It suggests import of the new economy by an incoming steppe-like population independent of the agricultural societies that were already established to the south and west of the Baltic Sea. The presence of direct contacts to the steppe could lend support to a linguistic model that sees an early branching of Balto-Slavic from a Proto-Indo-European language, for which the west Eurasian steppe was proposed as a homeland. However, as farmer ancestry is found in later Eastern Baltic individuals, it is likely that considerable individual mobility and a network of contact throughout the range of the CWC facilitated its spread eastward, possibly through exogamous marriage practices. Conversely, the appearance of mitochondrial haplogroup U4 in the Central European Late Neolithic after millennia of absence could indicate female gene-flow from the Eastern Baltic, where this haplogroup was present at high frequency.

PCA and ADMIXTURE analysis reflecting Late Neolithic in Northern European prehistory. (a) Principal components analysis of 1012 present-day West Eurasians (grey points, modern Baltic populations in dark grey) with 294 projected published ancient and ancient North European samples introduced in this study (marked with a red outline). (b) Ancestral components in ancient individuals estimated by ADMIXTURE (k = 11).
Zoomed-in version of the European Late Neolithic PCA.

So, we see that no farmer ancestry is found in the Baltic (unlike in Western Yamna), and that in the PCA the Late Neolithic samples plot closer to Corded Ware samples from Europe (or to earlier samples from the region) than to Yamna, contrary to what the Zvejnieki individual suggested at first.

There obviously was exogamy, which may in fact explain the samples that plot close to Yamna in the PCA (like the Zvejnieki sample), although the researchers overlook that.

Also, as expected, there is no R1b-M269 in the Baltic during the Corded Ware period. Most samples are R1a, the majority showing subclade R1a-Z645 (the others have poor SNP coverage), which supports the reduction in haplogroup diversity to this very subclade during the expansion of Corded Ware peoples, as I predicted.

Bronze Age

Local foraging societies were, however, not completely replaced and contributed a substantial proportion to the ancestry of Eastern Baltic individuals of the latest LN and Bronze Age. This ‘resurgence’ of hunter-gatherer ancestry in the local population through admixture between foraging and farming groups recalls the same phenomenon observed in the European Middle Neolithic and is responsible for the unique genetic signature of modern-day Eastern Baltic populations.

We suggest that the Siberian and East Asian related ancestry in Estonia, and Y-haplogroup N in north-eastern Europe, where it is widespread today, arrived there after the Bronze Age, ca. 500 calBCE, as we detect neither in our Bronze Age samples from Lithuania and Latvia. As Uralic speaking populations of the Volga-Ural region show high frequencies of haplogroup N, a connection was proposed with the spread of Uralic language speakers from the east that contributed to the male gene pool of Eastern Baltic populations and left linguistic descendants in the Finno-Ugric languages Finnish and Estonian. A potential future direction of research is the identification of the proximate population that contributed to the arrival of this eastern ancestry into Northern Europe.

I predicted that haplogroup N probably arrived in the region west of the Urals with the Sejma-Turbino phenomenon, and that it expanded quite late, probably through founder effects. A late arrival in the region obviously leaves (save for these researchers and others working with old ideas) only the Corded Ware culture (represented by steppe admixture and mainly haplogroup R1a-Z645) as the vector of expansion of Uralic languages, which obviously show a dialectalization process and regional expansion much older than 500 BC…

It is funny to see how people keep trying to identify R1a with ‘Yamnaya’, now ‘steppe’, but always Indo-European (an ethnolinguistic term, mind you), supposedly because of the ‘Yamnaya’ (now ‘steppe’) admixture. Yet for the same researchers, in the same paper that uses this very concept, the only ‘mark’ of Uralic languages is, paradoxically, haplogroup N, an assumption explicitly based on its prevalence in modern populations.

This admixture vs. haplogroup question for language and culture identification in genetic papers is really getting messed up with new data, now in a contortionist-like way…

Images and text: Content of the paper is licensed under CC BY 4.0.

