Consequences of O&M 2018 (II): The unsolved nature of Suvorovo-Novodanilovka chiefs, and the route of Proto-Anatolian expansion


I already expressed my predictions for 2018. One of the most interesting questions among them is the identification of the early Anatolian offshoot, and this is – I believe – where Genomics has the most to say in Indo-European migrations.

Linguistics and Archaeology already had a mainstream account from Late PIE/Yamna onwards, and it has been proven right by genomic investigation. There is, however, no consensus on Indo-Hittite.


Apart from the Anatolian homeland hypothesis and its westward migration (as referenced e.g. by Lazaridis et al. 2017), the other possibility including the most likely steppe homeland is that Proto-Anatolian spread through the Balkans, and must have separated from Khvalynsk and travelled first westward through the North Pontic region, and then southward to Ezero.

EDIT (10 MAR 2018): The Anatolian westward route within the steppe homeland model refers to the possibility that Proto-Anatolian spread south through the Caucasus, and then westward through Anatolia, as suggested e.g. originally by Marija Gimbutas for Maykop, as a link in the Caucasus.

We all know that this Khvalynsk -> Novodanilovka-Suvorovo -> Cernavoda -> Ezero -> Troy migration model proposed by Anthony shows no conspicuous chain in Archaeology, but obvious contacts (including Genomics) are seen among some of these neighbouring cultures in different times.

We know that remains of Suvorovo-Novodanilovka culture of chiefs emerged around 4400-4200 BC among ordinary local Sredni Stog settlements:

  • the Novodanilovka rich burials in the steppes, near the Dnieper,
  • and the Suvorovo group in the Danube delta, roughly coinciding with the massive abandonment of old tell settlements in the area.

One of the strongest cultural connections between Khvalynsk and Suvorovo-Novodanilovka chiefs is the similar polished stone mace-heads shaped like horse heads found in both cultures, a typical steppe prestige object going back to the east Pontic-Caspian steppe beginning ca. 5000-4800 BC.

Its finding in the Danube valley may have signalled the expansion of horse riding, which is compatible with the finding of ancient domesticated horses in the region. Horses were not important in Old European cultures, and it seems that they weren’t in Sredni Stog or Kvitjana either.

Steppe and Danubian sites at the time of the Suvorovo-Novodanilovka intrusion, about 4200-3900 BC. David W. Anthony (2007).

NOTE. Telegin, Anthony’s main source of knowledge on Ukrainian prehistoric cultures, was eventually convinced that Suvorovo-Novodanilovka was a separate culture. However, for Anthony (using Telegin’s first impressions), it may have been a wealthy elite among Sredni Stog peoples. Anthony considers Sredni Stog to have been also influenced by Khvalynsk, and thus potentially related to the Suvorovo-Novodanilovka chiefs.

Nevertheless, he obviously cannot link North Pontic Eneolithic cultures to Khvalynsk nor to horse riding – whilst he clearly assumes horse riding for Novodanilovka-Suvorovo chiefs – , and he does not link North Pontic cultures to later expansions of Late Proto-Indo-Europeans from late Khvalynsk and Yamna, either.

The question here for Anthony (as with further Proto-Anatolian expansions described in his 2007 book), in my opinion, was to offer a plausible string of connections between Khvalynsk and Anatolia, and the simplest connection one can make among steppe cultures is a general, broad community between North Pontic and North Caspian cultures. That way, the knot tying Khvalynsk to the Danube seems stronger, whatever the origin of Suvorovo-Novodanilovka chiefs.

If, however, a direct genetic connection is made between Suvorovo-Novodanilovka chiefs and Khvalynsk – as in its association with R1b-M269 and R1b-L23 lineages – , there will be little need to include Sredni Stog or any other intermediate culture in the equation.

We have already seen a movement of steppe ancestry into mainland Greece, and I would not be surprised if a parallel movement could be seen from Ezero to Troy (or a neighbouring North-West Anatolian region), so that the final migration of Common Anatolian had in fact been triggered by the massive steppe migrations during the Chalcolithic.

NOTE. Whereas we are certain to find R1b-L23 subclades in the direct Balkan migrations from Yamna, the link of steppe->Anatolia migrations may be a little trickier: even if we find out that the Suvorovo-Novodanilovka expansion was associated with an expansion and reduction of haplogroup variability (to haplogroups R1b-M269 and R1b-L23), we don’t know yet whether the ca. 1,500 years that passed (and the cultural and population changes that occurred) between the Proto-Anatolian and Common Anatolian migrations altered the main haplogroup composition of both communities.

O&M 2018

A probably unsurprising – because of its previously known admixture and PCA – , but nevertheless disappointing finding came from the Y-SNP call of the haplogroup R1 found in Varna (R1b-V88, given first by Genetiker), leaving us with no new haplogroup data standing out for this period.

This sample’s lack of obvious genetic links with the steppe and early date didn’t deter me from believing it could show subclade M269, and thus a sign of incoming Suvorovo chiefs in the region. After all, R1b-P297 subclades seemed to have almost disappeared from the Balkans by that time, and we know that assessments based only on ancestral components and PCA clusters are not infallible – we are seeing that in many, many samples already.

1—39 — sceptre bearers of the type Giurgiuleşti and Suvorovo; 40—60 — Gumelniţa-Varna-Bolgrad-Aldeni cultural sphere; 61 — Fălciu; 62 — Cainari; 63 — Giurgiuleşti; 64 — Suvorovo; 65 — Casimcea; 66 — Kjulevča; 67 — Reka Devnja; 68 — Drama; 69 — Gonova Mogila; 70 — Reževo. Țerna S., Govedarica B. (2016)

NOTE. In fact, the first time I checked the supplementary tables of Mathieson et al. (2018), I thought that the ‘Ukraine_Eneolithic’ sample of R1b-L23 subclade was ‘it’: the first clear proof in ancient samples of incoming Suvorovo chiefs from Khvalynsk I was looking for… until I realized its date, and that it was more likely a Late Yamna (or Catacomb) sample.

Steppe ancestry is found in the Varna and Smyadovo outliers, though, and these samples cluster closely to Ukraine Eneolithic samples (which are among Khvalynsk, Ukraine Neolithic, and Anatolia Neolithic clusters), so some population movement must have happened around or before that time in the region, and it is obvious that it happened from east to west.

It remains to be seen, therefore:

a) If the incoming Suvorovo-Novodanilovka chiefs (most likely originally from Khvalynsk) dominating over North Pontic and Danube regions show – as I bet – R1b-M269, and possibly also early R1b-L23* subclades,

b) Or else they still show mixed lineages, reflecting an older admixed population of the Pontic-Caspian steppe – as the early Khvalynsk and Ukraine Eneolithic samples we have now.

NOTE. Even though my preferred model of migration is through the Balkans – due to the many east-west migrations seen from the steppe into Europe – , there is no general consensus here because of the lack of solid anthropological models, and there are cultural links found also between the steppe and Anatolia through the Caucasus, so the question remains open.


North Pontic steppe Eneolithic cultures, and an alternative Indo-Slavonic model


I am not a fan of continuity theories – that much should be clear for anyone reading this blog. However, most of such proposals’ supremacist (or rather fear-of-inferiority) overtones don’t mean they have to be wrong. It just means that most of them, most of the time, most likely are.

While reading Tommenable’s comments, I thought about a potential alternative model, where one could a priori accept an identification of North Pontic cultures as ‘Indo-Slavonic’, which seems to be the Eastern European R1a continuist trend right now.

NOTE. To accept this model, one should first (not a posteriori) accept an Indo-Slavonic linguistic group on theoretical grounds, of course, and take the steppe ancestral component (and not archaeological data) as the most meaningful aspect to consider for language expansion and exchange (which we know is not the most intelligent approach to cultural or language change).

Thinking about how Genomics could challenge what mainstream Linguistics and Archaeology accepts, the only situation I can think of (using simplistic phylogeography) regarding late Khvalynsk-Sredni Stog contacts (until ca. 3300 BC) is:

  1. That the community of R1b-L51 lineages was in fact an isolated group, and not a western one – i.e. to the east within the Volga-Ural groups, or maybe to the south within the North Caucasian groups.
  2. That the R1b-Z2103 community was a huge one dominating over much of the steppe, from the Dnieper area to the Volga-Ural region (where we know they were).
  3. That R1a-M417 subclades (and especially subclade R1a-Z645) with steppe ancestry, as found in Corded Ware migrants, were only found in the North Pontic area (i.e. in Sredni Stog) during the fourth millennium (until at least 3300 BC, when Yamna substitutes it), and did not form other communities in the forest-steppe or Forest Zone (from where Corded Ware eventually expanded), as it is quite likely.
  4. That both the R1b-Z2103 and R1a-Z645 communities shared obvious genetic connections (whatever they were) around the Dnieper, that could justify a common, shared language.
Diachronic map of Eneolithic migrations in eastern Europe ca. 4000-3100 BC

Only then, if a widespread Graeco-Aryan-speaking community happened to be spread from west to east in the Pontic-Caspian steppe, with close contacts with North Pontic cultures, and having an isolated Northern Late PIE community somewhere different than West Yamna, it could leave for me a reasonable doubt of a cultural connection (maybe “Indo-Slavonic” in nature) of the North Pontic steppe. But then we would probably be stuck – yet again – with some sort of cultural diffusion event, impossible to demonstrate.

Since it is known (in Linguistics, and also in Y-DNA lineages, due to the early expansion of Z2103 subclades) that Graeco-Aryan groups separated early, this model would not be impossible.

Also a priori in favour of that model would be the early expansion of a (Northern IE-speaking) Pre-Tocharian population to the east. On the other hand, from an archaeological point of view, the group reaching Afanasevo seems to have expanded from Repin, just like the community expanding Yamna to the west of the Dnieper.

I really doubt there can be any serious discussion though, apart from amateur geneticists with a personal interest in this, because:

  • Graeco-Aryan is a Late PIE dialect, and Late PIE guesstimates are more recent than that.
  • Dialectal separation within a Late Proto-Indo-European language must have happened late, gradually, and in close contact, allowing for common innovations to spread through dialectal groups.
  • It does not make sense in terms of prehistoric cultures, since there is no direct connection or migration among steppe cultures but for the Novodanilovka and the Yamna expansions.
  • Indo-Slavonic is only supported by a handful of linguists, and not in the way or timing described in this model.

NOTE. You can read Kortlandt’s works (also on his personal website) if you are really interested in knowing more about an Indo-Slavonic proposal, from an expert Balticist and Slavicist. However, if your intent is to demonstrate some ancient ethnic link of “your” people (whatever that means) to mythical Proto-Indo-Europeans, you would not need actual knowledge or sound theories to do that, so you can skip that part. Also, Kortlandt would probably support a later model of Indo-Slavonic expansion in the steppe, related to East Yamna, and later Sintashta, Srubna, etc.

Migration Yamna -> Corded Ware -> Bell Beaker as claimed by articles published in Nature (2015). From materials of the UAB.

If you think about it, if most modern Slavs were mainly of R1b-L23 lineages instead of R1a-Z645 (a replacement which, as is clear now, is the consequence of a simple resurgence of previous lineages in East-Central Europe, coupled with a later gradual replacement through founder effects, so no big migration history here), and Finnic speakers were mainly of R1a-Z645 lineages (whose replacement by N1c lineages seems also the consequence of quite late consecutive founder effects), I doubt we would be having this reticence to accept sound anthropological models.

So, we are speculating here for the sake of an unnecessary, naïve compromise… just hoping to find some common ground to move on, now that the picture is clearer for everyone.

NOTE. The change of narratives where certain languages must have accompanied R1a-Z645 and N1c lineages, but in alternative ways not previously described, is obviously unjustified if linguistic and archaeological data tell a different story. It is as unjustified as replacing Yamna with “Neolithic Steppe” as the homeland of Late Indo-European, to make it fit the steppe ancestry concept.

The arrival of haplogroup R1a-M417 in Eastern Europe, and the east-west diffusion of pottery through North Eurasia


Henny Piezonka recently uploaded an old chapter, Die frühe Keramik Eurasiens: Aktuelle Forschungsfragen und methodische Ansätze, in Multidisciplinary approach to archaeology: Recent achievements and prospects. Proceedings of the International Symposium “Multidisciplinary approach to archaeology: Recent achievements and prospects”, June 22-26, 2015, Novosibirsk, Eds. V. I. Molodin, S. Hansen.

Abstract (translated from the German):

The oldest pottery vessels known so far in the world were produced in south-east China by Late Glacial hunter-gatherers, probably as early as around 18,000 cal BC. In the following millennia the new technique spread among forager communities in the Russian Amur region, Japan, Korea and Transbaikalia, before reaching the Urals and eastern and northern Europe in the Early and Middle Holocene. Contrary to widespread scholarly opinion on the history of ceramics, which sees early pottery vessels as part of the “Neolithic package” of the early farming cultures, the Eurasian hunter-gatherer pottery tradition represents an innovation that apparently developed completely independently of other Neolithic cultural phenomena such as agriculture, animal husbandry and a sedentary way of life. This contribution traces the chronological sequence of the first appearance of clay vessels in north Eurasian hunter-gatherer communities, from the Pacific to the Baltic, on the basis of 14C dates. At the same time, promising methodological approaches are presented that currently play a role in research on this much-discussed topic.

Sites named in the text with earlier ceramic pottery in Eurasia up to the Urals.

If you have followed the updates to the Indo-European demic diffusion model, my proposal of a potential late arrival of haplogroup R1a-M417 during the Mesolithic was not changed by the potential earlier arrival of EHG ancestry and haplogroup R1a in the North Pontic steppe, after the findings in Mathieson et al. (2017).

That is so because of the anthropological models of migration – or, lacking them, archaeological models of cultural expansion – that we have to date.

If I had followed a simplistic autochthonous continuity view, I would have thought that R1a-M417 was autochthonous to Eastern Europe, because an older subclade is found in the North Pontic steppe during the Mesolithic, akin to how some people want to believe that R1b-M269 shows autochthonous continuity in or around Central Europe, because of the Villabruna sample and later R1b-L23 subclades found there.

However, it is difficult to assert today that the population movement involving a community of mostly haplogroup R1a-M417 happened from west to east:

  1. if you follow the work of Piezonka, who wrote her Ph.D. dissertation on the Eastern European Mesolithic (you can buy a more readable version), and has dedicated a great amount of time and effort to the research of cultural connections between Eastern Europe and Eurasia during the Mesolithic;
  2. taking into account the potential migration waves behind the increase in EHG ancestry in Eastern Europe in these periods, and this ancestral component’s speculative connection with ANE ancestry;
  3. and if we accept the TMRCA of R1a-M417 based on modern samples, dated ca. 6500 BC, and the appearance of the first samples at a similar time in Eastern Europe and in Baikalic cultures.

NOTE. More and more findings from Eastern Europe are showing that the sample of haplogroup N1c found in Eastern Europe and dated ca. 2500 BC is probably wrong, either in its haplogroup or in the radiocarbon date: after all, the lab has published just one study. The study of Baikalic samples, on the other hand, seems to have been corroborated by a more recent study.

Another interesting sample is that of Afontova Gora, whose community may have actually been mostly of haplogroup R1a (based on its position in PCA and relation to ANE ancestry), and thus the regional distribution of this haplogroup could have been quite large in North Eurasia during the Palaeolithic-Mesolithic transition, although this is highly speculative, like the connection WHG:ANE for EHG.

Early radiocarbon-dated complexes with pottery in different regions of North Eurasia

It is obvious that we cannot know what happened during these millennia without more samples, and indeed I don’t see anything a priori wrong with having an origin of R1a-M417 (and thus some sort of continuity) in Eastern Europe during the Mesolithic and Neolithic; just as I don’t see any problem with the continuity of other European haplogroups. Or with their discontinuity, mind you. That would not change the Proto-Indo-European homeland, or the complexity of language and ethnicity in Eastern Europe in the millennia following the expansion of Late Indo-European.

It just amazes me again and again how otherwise serious and capable people are often blinded by the desire to have their direct paternal line (some ancestors among an infinite number of them, probably representing for them genetically much less than other ancestral lines) stem from their own region and have the same ethnolinguistic affiliation since time immemorial, instead of betting on sounder migration models supported by anthropological data…


The myth of mixed language, the concepts of culture core and package, and the invention of ‘Steppe folk’


I recently read some papers which, albeit apparently unrelated, should be of interest for many today.

Mixed language

The myth of the mixed languages, by Kees Versteegh, in Advances in Maltese linguistics, ed. by Benjamin Saade and Mauro Tosco, 217-238. Berlin and New York: Mouton de Gruyter, 2017 [uncorrected proofs]

This paper focuses on the usefulness of the label ‘mixed languages’ as an analytical tool. Section 1 sketches the emergence of the biological paradigm in linguistics and its effect on the contemporary debate about mixed languages. Sections 2 and 3 discuss two processes that have been held responsible for the emergence of mixed languages, code switching and extreme borrowing. Section 4 compares these two mechanisms with the categories of change in Thomason & Kaufman (1988), while Section 5 offers some conclusions about the status of mixed languages as a special category.

Although the paper is a must read for language contact and language change (code-switching, borrowing, shifting), a good summary may save you some time if you are not interested in linguistics:

Speakers may either shift to a new language while retaining traces of their old language, or they may stick to their original language while borrowing from another language with which they come in touch (…)

[Bakker] distinguishes two types of communities in which mixing is found: isolated mixed marriage communities with (asymmetrical) bilingualism; and nomadic communities that shift to a dominant language, but retain a substantial part of their lexicon as a private or secret register, closely connected with the community’s identity. The history of these communities provides us with plausible scenarios to explain the idiosyncrasies of the speech pattern of the speakers belonging to them.

In what way, then, does it help to put both categories under one label of mixed languages? I believe Backus (2003: 263) is right when he suggests that the question of whether a certain set of features constitutes a mixed language is perhaps not a very interesting one. The question should not be whether given certain features a language may be categorized as mixed, but what the linguistic effects of different kinds of contact (trade, work, conquest, mixed marriages, colonization, marginalization, etc.) are, and to what extent these effects correlate with the type of contact. At no point is it necessary to posit a category of mixed languages. In fact, the myth of the mixed languages may have been perpetuated because of the relative weirdness of the initial cases, notably that of Michif, which represent phenomena so unique that it is understandable that some scholars came to believe that they could only be explained by special mechanisms. The position taken here is that if we focus on the speakers’ behavior, the phenomena in question become much more understandable. The crucial point is that languages do not mix, people do.

Culture core and package

The article Isolation-by-distance, homophily, and “core” vs. “package” cultural evolution models in Neolithic Europe, by Shennan, Crema, and Kerig (2015):

Recently there has been growing interest in characterising population structure in cultural data in the context of ongoing debates about the potential of cultural group selection as an evolutionary process. Here we use archaeological data for this purpose, which brings in a temporal as well as spatial dimension. We analyse two distinct material cultures (pottery and personal ornaments) from Neolithic Europe, in order to: a) determine whether archaeologically defined “cultures” exhibit marked discontinuities in space and time, supporting the existence of a population structure, or merely isolation-by-distance; and b) investigate the extent to which cultures can be conceived as structuring “cores” or as multiple and historically independent “packages”. Our results support the existence of a robust population structure comparable to previous studies on human culture, and show how the two material cultures exhibit profound differences in their spatial and temporal structuring, signalling different evolutionary trajectories.

Our results suggest distinct evolutionary histories in the spatial and temporal variation of personal ornament and pottery, with different rates of innovation, patterns of descent, and dynamics of diffusion. Ornament data do show statistically significant values of ΦST using pottery-defined population structures, but the magnitude is extremely small, and partial Mantel tests suggest that much of this pattern is explained by isolation by distance. These results are in line with a model of culture represented by independent “packages” of multiple coherent units rather than one characterised by a distinct and fairly isolated “core” surrounded by a “periphery” of elements prone to cross-cultural transmission. The alternative hypothesis is that one element was part of the “core” tradition, whilst the other was peripheral. This scenario is however less likely given that both elements are generally regarded as expression of local lines of transmission and/or signalling.

The robust support for a population structure in the pottery data shows that some degree of homophily must have biased the transmission process, but this bias was confined within the single “package”, rather than affecting other aspects of the material culture. In other words similarity (or dissimilarity) of pottery style was not influencing the transmission process of personal ornaments and vice-versa. If this was the case, we should have observed a stronger agreement between in the spatio-temporal distribution of the two datasets, a pattern we failed to observe. Personal ornaments are often seen as group-identity markers, but the fact that our study appears to indicate a stronger role for isolation by distance in accounting for variation in ornaments suggests that this assumption may not be valid, or alternatively that these groups cross-cut the archaeological cultures traditionally recognised. Thus, while our study has provided strong evidence of population structure affecting patterns of cultural interaction, in this case at least the distinct patterns observed point to a modular, ‘package’ model. It has also shown that we can identify population structuring from the evidence of the archaeological record without continuing to attempt the fruitless task of correlating its patterns with past ethnolinguistic units.

Location of material culture (ornament and pottery) data of 361 Neolithic sites in central

A situation more interesting than Neolithic Europe for those following this blog will probably be the one in Western Europe during the Bell Beaker expansion.

Regarding Iberia, I have already talked about the possibilities of “resurgence” after the arrival of Bell Beaker migrants, in this blog and in the Indo-European demic diffusion model.

About the British Isles, you can read e.g. about the stepped and different cultural changes that happened after the arrival of East Bell Beakers in What was and what would never be: changing patterns of interaction and archaeological visibility across north-west Europe from 2,500 to 1,500 cal BC, by Wilkin, N., Vander Linden, M. 2015. In: Anderson-Whymark, H., Garrow, D., Sturt, F. (eds) Continental connections. Exploring cross-Channel relationships from the Mesolithic to the Iron Age. Oxford: Oxbow: 99-121.

The changing cultural geography of north-west Europe c. 3000-1500 BC. Solid line: architecture; dashed line: material culture; dash-dot line: funerary practices

This logical description of different cultural changes brings up a question obvious to many, which can be summed up as “does the arrival of (North-West Indo-European-speaking) East Bell Beakers mean the end of non-Indo-European cultures in Western and Northern Europe?” The answer is obviously – as in the rest of Europe – quite simply No. Many non-Indo-European groups must have survived the initial expansion of East Bell Beakers in many regions, as the pre-Roman situation (already quite simplified after the expansion of Celtic and Germanic) testifies.

“Resurgence” of local groups as seen in genomic data is the most direct connection with survival of previous non-IE cultures (and thus languages), but obviously not the only mechanism of language survival, since we can see founder effects – such as those seen in modern Basque speakers (mainly of R1b subclades), and in Ugro-Finnic speakers in north-east Europe (mainly of N subclades).

Steppe folk

I have recently been stumbling more and more often upon the concept of ‘Steppe people’ used by amateur geneticists, whether in Anthrogenica, in blogs and blog comments, or even in research papers, where people used to talk about ‘Yamnaya people’ or ‘Yamnaya folk’. As I said, I expected this since I questioned the concept of the ‘Yamnaya ancestral component’, so I feel vindicated by this change.

However, whereas most will be using these names simply following the redeeming term “steppe admixture”, and thus refer to a Neolithic steppe population that shared a common admixture component (probably ca. 5000 BC), some will obviously go further and identify this component with Proto-Indo-European (like that, in general terms), since it is the logical sequence for those who consider the term “Indo-Europeans” as an umbrella for a certain ethnic proportion and a link with modern populations in stupid autochthonous continuity theories, where prehistoric language and culture are irrelevant.

NOTE. Evidently, those who supported quite strongly the idea that R1a-M417 subclades associated with Corded Ware migrants (i.e. mainly R1a-Z645) stemmed directly from Yamna migrants are shifting terminology to “steppe people” in light of recent data, so that they can support an older steppe community in the likely case that no R1a is found in Yamna, while keeping open the possibility to revert to a more direct support of the Yamna -> Corded Ware model in case just one R1a sample is found… So no anthropological model at all here, just a personal desire to be fulfilled in any possible way.

Those thinking naively about an imaginary ‘Steppe folk’, living in loosely connected steppe cultures, speaking a mixed ‘Steppe language’, may well keep inventing potentially popular peoples and languages based on admixture, such as West Hunter-Gatherer folk (Vasconic?), East Hunter-Gatherer folk (Uralic? – or maybe today they can invent a Siberian folk bringing Finno-Ugric after 500 BC), Caucasus Hunter-Gatherer folk (??), etc. So welcome back to the 1930s!… Or was it the 2000s?

Image modified by me from Wiik’s original, for those of you nostalgic fans of autochthonous continuity theories: You can now again support a native Vasconic-Uralic-Indo-European “folk” distribution in Neolithic Europe.

Some people like to talk about how “Science” wins against Academia, especially when they try to defend this pseudo-subfield they are inventing and venerating on the go to characterize ancient populations, where new genomic methods are king, and the other fields involved are just noise they easily use or dismiss to support their own desires and preconceptions.

NOTE. I feel that many fans of Genomics are very well represented in this recent article at arXiv: “23andMe confirms: I’m super white” — Analyzing Twitter Discourse On Genetic Testing

Too bad for them. The misuse of this new field might be popular today among certain amateur geneticists, but it will not stand the test of time, similar to how the initial hype around radiocarbon analysis (for Archaeology) or glottochronology (for Linguistics) eventually faded, and they became just another tool among traditional methods. In Science, time puts everything where it belongs.

Whether you like it or not, Indo-European (and Uralic) questions will be solved – as they have been for a long time now, there is nothing new under the sun – with Historical Linguistics first, then Prehistoric Archaeology, then Anthropology (models of migration, cultural diffusion, founder effects, etc.), and only then Genomics, which may (or may not) help solve certain controversial aspects, by supporting one or other anthropological model. Period.

The Indo-European demic diffusion model, and the "R1b – Indo-European" association


Beginning with the new year, I wanted to commit myself to some predictions, as I did last year, even though they constantly change with new data.

I recently read Proto-Indo-European homelands – ancient genetic clues at last?, by Edward Pegler, which is a good summary of the current state of the art in the Indo-European question for many geneticists – and thus a great example of how well Genetics can influence Indo-European studies, and how badly it can be used to interpret actual cultural events – although more time is necessary for some to realize it. Notice for example the distribution of ‘Yamnaya’ in 3000 BC, all the way to Latvia (based on the initial findings of Mathieson et al. 2017), and the map of 2000 BC with ‘Corded Ware’, both suggesting communities linked by admixture and unrelated to actual cultures.

Some people – especially those interested in keeping a simplistic picture of Europe, either divided into admixture groups or simplistic R1b-Vasconic / R1a-Indo-European / N1c-Uralic (or any combination thereof) – want (others) to believe that I am linking ‘Indo-Europeans’ with haplogroup R1b. That is simply not true. In fact, my model dismisses such simplistic identifications of the reconstructible proto-languages with any modern peoples, admixtures, or haplogroups.

Simplistic Vasconic/R1b-Uralic/N1c distribution, and intruding Indo-European/R1a, according to Wiik.

The beauty of the model lies, therefore, precisely in that if you take any modern group speaking Indo-European languages, none can trace back their combination of language, admixture, and/or haplogroup to a common Indo-European-speaking people. All our ancestral lines have no doubt changed language families (and indeed cultures), they have admixed, and our European regions’ paternal lines have changed, so that any dreams of ‘purity’ or linguistic/cultural/regional continuity become absurd.

That conclusion, which should be obvious to all, has been denied for a long time in blogs and forums alike, and is behind the effort of many of those involved in amateur genetics.

Main linguistic aim

The main consequence of the model, as the title of the paper suggests, is that reconstructible Indo-European proto-languages expanded with people, i.e. with actual communities, which is what we can assert with the help of Genomics. From a personal (or ethnic, or political) point of view genomics is useless, but from an anthropological (and thus linguistic) point of view, genomics can be a very useful tool to decide between alternative models of language diffusion, which has given lots of headaches to those of us involved in Indo-European studies.

The demic diffusion theory for the three main stages of the proto-language expansion was originally, therefore, a dismissal of impossible-to-prove cultural diffusion models for the proto-language – e.g. the adoption of Late Proto-Indo-European by Corded Ware groups due to a patron-client relationship (as proposed by Anthony), or a long-lasting connection between cultures (as proposed by Kristiansen, and favoured by “constellation analogy” proponents like Clackson, who denied the existence of common proto-languages). It also means the acceptance of the easiest anthropological model for language change: migration and – consequently – replacement.

By the time of the famous 2015 papers, I had been dealing for some time with the idea that the shared features between Indo-Iranian and Balto-Slavic may have been due to a common substrate, and must have therefore had some reflection in genomic finds. The data on these papers, and the addition of a weak connection between Pre-Germanic and Balto-Slavic communities, together with their clearest genetic link – R1a-M417 subclades (especially European Z283) – made it still easier to propose a Corded Ware substrate, partially common to the three.

Allentoft Corded Ware
Allentoft et al. “Arrows indicate migrations — those from the Corded Ware reflect the evidence that people of this archaeological culture (or their relatives) were responsible for the spreading of Indo-European languages. All coloured boundaries are approximate.”

Before the famous 2015 papers (and even after them, if we followed their interpretation), we were left to wonder why the languages of the supposed vector of Indo-European expansion, Corded Ware migrants (represented by R1a-Z645 subclades, and supposedly continued unchanged into modern populations in their ‘original’ ancestral territories), namely Balto-Slavic and Indo-Iranian, were precisely the (phonetically) most divergent Indo-European languages relative to the parent Late Indo-European proto-language.

My paper implied therefore the dismissal of an unlikely Indo-Slavonic group, as proposed by Kortlandt, and of a still less feasible Germano-Slavonic, or Germano-Indo-Slavonic (?) group, as loosely implied by some in the past, and maybe supported in certain archaeological models (viz. Kristiansen or partially Anthony), and presently by some geneticists since their simplistic 2015 papers on “massive migrations from the steppe”, and by amateur genetic fans with infinite pet theories, indeed.

A common Corded Ware substrate to Balto-Slavic and Indo-Iranian, and common also partially between Balto-Slavic and Germanic (as supported by Kortlandt, too, albeit with different linguistic connotations), would explain their common features. The Corded Ware culture (and Uralic, tentatively proposed by me as the group’s main language family) is a strong potential connection between them, further supported by phylogeography, too.

Other consequences

Interpretations in my paper help thus dismiss the simplistic Yamna -> Corded Ware -> Bell Beaker migration model implied with phylogeography in the 2000s, and revived again by geneticists and Kristiansen’s workgroup based on the famous 2015 papers, whereby – due to the “Yamnaya ancestral component” – the Yamna culture would have been composed of communities of R1a-M417 and R1b-M269 lineages which remained against all odds ‘related but separated’ for more than two thousand years, sharing a common unitary language (why? and how?), and which expanded from Yamna (mainly R1b-L23) into Corded Ware (mainly R1a-M417) and then into Bell Beaker (mainly R1b-L51), in imaginary migration waves whose traces Archaeology has not found, or Anthropology described, before.

While phylogeography (especially the distribution of ancient samples of certain R1b and R1a subclades) was the main genetic aspect I used in combination with Archaeology and Anthropology to challenge the reliability of the “Yamnaya ancestral component” in assessing migrations – and thus Kristiansen’s now-popular-again modified Kurgan model – , my main aim was to prove a recent expansion of Late Proto-Indo-European from the steppe, and a still more recent expansion of a common group of speakers of North-West Indo-European, the language ancestral to Italo-Celtic, Germanic, and probably Balto-Slavic (or ‘Temematic’, the NWIE substrate of Balto-Slavic, according to some linguists).

My arguments serve for this purpose, and modern distributions of haplogroups or admixture are fully irrelevant: I am ready to change my view at any time, regarding the role of any haplogroup, or ancestral component, archaeological data, or anthropological migration model, to the extent that it supports the soundest linguistic model.

Stages of Proto-Indo-European evolution. IU: Indo-Uralic; PU: Proto-Uralic; PAn: Pre-Anatolian; PToch: Pre-Tocharian; Fin-Ugr: Finno-Ugric. The period between Balkan IE and Proto-Greek could be divided in two periods: an older one, called Proto-Greek (close to the time when NWIE was spoken), probably including Macedonian, and spoken somewhere in the Balkans; and a more recent one, called Mello-Greek, coinciding with the classically reconstructed Proto-Greek, already spoken in the Greek peninsula (West 2007). Similarly, the period between Northern Indo-European and North-West Indo-European could be divided, after the split of Pre-Tocharian, into a North-West Indo-European proper, during the expansion of Yamna to the west, and an Old European period, coinciding with the formation and expansion of the East Bell Beaker group.

Gimbutas’ old theory of sudden and recent expansion served well to support a real community of Proto-Indo-European speakers, as did later the Yamna -> Corded Ware -> Bell Beaker theory that circulated in the 2000s based on modern phylogeography, and as did later partially Anthony’s updated steppe theory (2007). On the other hand, Kristiansen’s long-lasting connections among north-west Pontic steppe cultures and Globular Amphorae and Trypillian cultures, did not fit well with a close community expanding rapidly – although recent genetic data on Trypillia and Globular Amphorae might be compelling him to improve his migration theory.

So, if the data turn out not to be as I expect now, I will reflect that in future versions of the paper. I have no problem saying I am wrong. I have been wrong many times before, and one thing I am certain of is that I am wrong now in many details, and that I will be again in the future.

If, for example, R1b-L23(xZ2105) is demonstrated to come from Hungary and not the steppe (as supported by Balanovsky) or R1a-M417 samples are proved to have expanded with West Yamna settlers (as recently proposed by Anthony, see below the Balto-Slavic question), I would support the same model from a linguistic point of view, but modified to reflect these facts. Or if a direct migration link is found in Archaeology from Yamna to Corded Ware, and from Corded Ware to Bell Beaker (as proposed in the 2015 papers), I will revise that too (again, see the image below). Or, if – as Lazaridis et al. (2017) paper on Minoans and Mycenaeans suggested – the Anatolian hypothesis (that is, one of the multiple ones proposed) turns out to be somehow right, I will support it.

My map of Late Proto-Indo-European expansion (A Grammar of Modern Indo-European, 2006), following Gimbutas and Mallory.

Haplogroups are the least important aspect of the whole model; they are just another piece of data that has to be taken into account for a thorough explanation of migrations. They have become essential today because of the apparent lack of vision on the part of geneticists, who failed to use them to adjust their findings of admixture with findings of haplogroup expansions, favouring thus a marginal theory of long-lasting steppe expansion instead of the mainstream anthropological models.

Since many of these alternative scenarios seem less and less likely with each new paper, it is probably more efficient to talk about which developments are most likely to challenge my model.

Main points

My main predictions – based mostly on language guesstimates, archaeological cultures, and anthropological models of migration – have been proven right until now, even with the scarce genomic data we had, by new samples from Mathieson et al. (2017) and Olalde et al. (2017), among other papers of this past year. These were my original assumptions:

(1) A Middle Proto-Indo-European expansion defined by the appearance of steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1b-M269 and R1b-L23 lineages;

(2) A Late Proto-Indo-European expansion defined by steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1b-L23 subclades; and

(3) A North-West Indo-European expansion defined by steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1b-L51 subclades.

The expansion of Corded Ware peoples, associated with steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1a-Z645 subclades, represents thus a different migration, which is compatible with the different nature of the Corded Ware culture, unrelated to Yamna and without migration waves from one to the other (although there were certainly contacts in neighbouring regions).

As you can see, none of the 3+1 expansion models implies that no other haplogroup can be found in the cultures or regions involved (others have in fact been found, and still the models remain valid): these migrations imply a reduction of haplogroup diversity, and the expansion of certain subclades, as is common in population expansions throughout history. While we all accept this general idea, some people have difficulties accepting just those cases not compatible with their dreams of autochthonous continuity.
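To make "reduction in haplogroup diversity" concrete, here is a minimal sketch using Nei's unbiased gene diversity, a standard population-genetics measure. It is not taken from the paper, and the two sample lists are invented purely for illustration; they are not real ancient DNA counts.

```python
from collections import Counter


def haplogroup_diversity(haplogroups):
    """Nei's unbiased gene diversity: h = n/(n-1) * (1 - sum(p_i^2)).

    `haplogroups` is a list with one haplogroup label per sampled individual.
    Values near 1 mean many lineages at similar frequencies; values near 0
    mean that a single lineage dominates.
    """
    n = len(haplogroups)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplogroups)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical samples (assumption, for illustration only):
# a mixed population before an expansion, and a population after a
# founder effect in which one lineage has come to dominate.
before = ["R1b", "R1a", "I2", "J", "G2", "Q", "R1b", "I2"]
after = ["R1b"] * 7 + ["I2"]

print(round(haplogroup_diversity(before), 3))
print(round(haplogroup_diversity(after), 3))
```

Note that the second list still contains a different haplogroup: a drop in diversity does not require the other lineages to disappear, which is the point made above.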

Nevertheless, there are still voids in genetic investigation.

Controversial aspects

In my humble opinion, these are potential conflict periods and the most likely areas of change for the future of the theory:

1. When and how did R1b-M269 lineages become “chiefs” in the steppe?

Based on scarce data from Khvalynsk, it seems that during the Neolithic there were many haplogroups in the North Pontic and North Caspian steppes. A reduction to R1b-M269 subclades must have happened either just before or (as I support) during (the migrations that caused) the Suvorovo-Novodanilovka expansion among Sredni Stog, probably coinciding also with the expansion (or one of the expansions) of CHG ancestry (and thus the appearance of ‘Steppe component’ in the steppe). My theory was based initially on Anthony’s account and TMRCA of haplogroups of modern populations (both ca. 4200-4000 BC), but recent samples of the Balkans (R1b-M269 and steppe ancestry) seem to trace the population expansion some centuries back.

If my assessment is correct, then modern populations of haplogroup R1b-M269* and R1b-L23* in the Balkans probably reflect that ancient expansion, and samples related to Proto-Anatolian cultures in the Balkans will most likely be of R1b-M269 subclades and R1b-L23*. After admixture in the Balkans, later migrations of Anatolian languages into Anatolia might be associated with a different admixture component and different haplogroups; we don’t have enough data yet.

If the haplogroup reduction and expansion in Khvalynsk happened later than the Suvorovo-Novodanilovka expansion, then we might find the expansion of Pre- or Proto-Anatolian associated with many different haplogroups, such as R1b (xM269), R1a, I, J, or G2, and more or less associated with steppe ancestry in the Balkans.

Another reason for finding such variety of haplogroups in ancient samples from the Balkans would be that this Khvalynsk group of “chiefs” traversed – and mixed with – the Sredni Stog population. Nevertheless, if we suppose homogeneity in haplogroups in Khvalynsk during the expansion, a high proportion of different haplogroups explained by admixture with the local population of Sredni Stog would challenge the whole “chief domination” explanation by Anthony, and we would have to return to the “different culture” theory by Rassamakin and potentially an older migration from Khvalynsk. In any case, both researchers show clear links of the Suvorovo-Novodanilovka phenomenon to Khvalynsk, and a differentiation with the surrounding Sredni Stog culture.

A less likely model would support the identification of the whole Eneolithic Pontic-Caspian steppe as a loose Indo-Hittite-speaking community, which would be in my opinion too big a territory and too loose a cultural bond to justify such a long-lasting close linguistic connection. This will probably be the refuge of certain people looking desperately for R1a-IE connections. However, the nature of the western steppe will remain distinct from Late Proto-Indo-European, which must have developed in the Yamna culture, so autochthonous continuity is not on the table anymore, in any case…

Coexistence of the Varna-Gumelniţa culture and the Suvorovo phase of the sceptre-bearer communities. 1 — Fălciu; 2 — Fundeni-Lungoţi; 3 — Novoselskaja; 4 — Suvorovo; 5 — Casimcea; 6 — Kjulevča; 7 — Reka Devnja; 8 — Drama; 9 — Gonova mogila; 10 — Reževo; 11 — geographically separate Decea variant of the sceptre bearer group (after Govedarica, Manzura 2011: Abb. 5, adapted).

2. How did R1a-M417 (and especially R1a-Z645) haplogroups come to dominate over the Corded Ware cultures?

If I am right (again, based on TMRCA of modern populations), then it is precisely at the time of the potential expansion of Proto-Corded Ware from the Dnieper-Dniester forest, forest-steppe, and steppe regions, ca. 3300-3000 BC. Furholt’s recent radiocarbon analysis and suggestions of a Lesser Poland origin of the third or A-horizon, on which disparate archaeologists such as Anthony or Klejn rely now, seem to suggest also that Corded Ware was a cultural complex rather than a compact culture reflecting a migration of peoples – similar thus to the Bell Beaker complex.

This cultural complex interpretation of Corded Ware contrasts with the quite homogeneous late samples we have, suggesting clear migration waves in northern Europe, at least at some point in time, so Genomics will be a great tool to ascertain approximately when and from where Corded Ware peoples expanded. Right now, it seems that Eneolithic Ukraine populations are the closest to its origin, so the traditional interpretation of its regional origin by Kristiansen or Anthony remains valid.

3. How was Indo-Iranian adopted by Corded Ware invaders?

This is rather an anthropological question. We need reasonable models of founder effect/cultural diffusion necessary for that to happen – similar to the ones necessary to explain the arrival of N1c subclades into north-east Europe, or the arrival of R1b subclades in Basque/Iberian-speaking regions in south-west Europe. My description of potential events in the eastern steppe – based partially on Anthony – is merely a short sketch. Genomic data is unlikely to offer more than it does today (replacement of haplogroups, and gradually of some steppe component, by late Corded Ware groups in the steppe), but let’s see what new samples can contribute.

As for what some Indians – and other people willing to confront them – are looking for, regarding R1a-M417 and/or Indo-European origins in India, I don’t see the point: we already know a) that the origin of the expansion is in the steppe and b) that Hindu nationalist bigots will not accept results from research that oppose their views. I don’t expect huge surprises there, just more fruitless discussions (fomented by those who live from trolling or conspiracies)…

4. Yamna settlers from Hungary

Anthony’s new theory – and the nature of Balto-Slavic – hinges on the presence of R1a-M417 subclades (associated with later Corded Ware samples) in Yamna settlers of Hungary, potentially originally from the North Pontic area, where the oldest sample has been found.

My ‘modified’ version of Anthony’s new model (the only one I deem even remotely feasible) includes the expansion of a Proto-Corded Ware from Lesser Poland, but (given the overwhelming R1b found in East Bell Beaker), with R1a-M417 being associated with the region. How to explain this language change with objective data? Well, we have Bell Beaker expanding to these areas at a later time, so we would need to find R1b-L23 settlers in Lesser Poland, and then a resurgence of the R1a-M417 haplogroup. If not, resorting yet again to cultural diffusion from Yamna “patrons” to Corded Ware “clients” of Lesser Poland would bring us back to square one, now with the ‘steppe ancestry’ controversy included…

Since some Eastern Europeans are (for no obvious reason whatsoever) putting their hopes on that IE-R1a-CWC association, let’s hope some samples of R1a-M417 in Yamna or Hungary give them a break, so that they can begin accepting something closer to mainstream anthropological models. We could then work from there a Yamna-> Bell Beaker / North-West Indo-European association truce, and from there keep accepting that no single haplogroup from Yamna settlers is linked with modern languages, cultures or ethnic groups.

Localization of Central-European funerary monuments with elements of the Pit Grave culture (after Bátora 2006).

5. How and when was Balto-Slavic associated with haplogroup R1a?

If we accept the Southern or Graeco-Aryan nature of Balto-Slavic with influence from an absorbed North-West Indo-European dialect, “Temematic” (as Kortlandt does), then Indo-Slavonic adopted in the steppe from Potapovka by Sintashta and Poltavka populations divided ca. 2000 BC into Indo-Iranian (migrating to the east with Andronovo), and Balto-Slavic (migrating westward with the Srubna culture). History from there is not straightforward, and it should follow Srubna, Thraco-Cimmerian, or other late expansions from cultures of the steppe.

On the other hand, if it is a Northern dialect related closely to Germanic and Italo-Celtic (in a North-West Indo-European group), then its origin has to be found in the initial expansion of East Bell Beakers, and its development into either the Únětice culture (of Balkan and thus potentially “Southern IE” influence), or the Mierzanowice-Nitra culture (of Corded Ware and thus potentially Uralic influence), or maybe from both, given the intermediate substrate found in Germanic and Balto-Slavic.

It is my opinion that the association of Balto-Slavic with haplogroup R1a came quite early after the East Bell Beaker expansion, probably initially with the subclade typically associated with West Slavic, R1a-M458. I do not have much data to support this (apart from the most common linguistic model), just modern haplogroup distribution maps and common TMRCA, and highly hypothetical archaeological-anthropological models. Genetics will hopefully bring more data.

Let’s see also what information on ancient haplogroups we can obtain from the Tollense valley (already showing a close cluster with modern West Slavic populations) and steppe regions.

6. How did Germanic, Celtic, and Italic expand?

Germanic is probably the most interesting one. Following the expansion of R1b-L51 subclades (especially R1b-U106) and steppe ancestry (a confounding factor, with the previous expansion of R1a-Z284 subclades) in Scandinavia is going to be fascinating. Anthropological models already point to a linguistic and archaeological expansion of Pre-Germanic with Bell Beaker peoples.

The expansion of Celtic seems to be associated with chiefdoms, untraceable today in terms of haplogroups, and it seems thus different from previous expansions. New studies might tell how that happened, and whether it was actually in successive waves, as proposed, or maybe we don’t have enough data yet to reach conclusions.

Nor do we know how Italic expanded into the Italian Peninsula, or whether Latin expanded with peoples from Italy, if at all, or whether it was mostly a cultural diffusion event, as it seems.

Regarding Etruscan, while I think it is a controversy initiated based on fantastic accounts, and ignited with few finds of Middle Eastern ancestry (that seem logical from the point of view of regional contacts), it will also be important for Italian linguists and archaeologists to accept the most likely scenario.

As for Palaeo-Hispanic languages, while steppe ancestry is found quite reduced in R1b-L51 subclades (after so many different expansions and admixture events since the departure from the steppe), their distribution from the Chalcolithic onwards and the resurgence of native haplogroups may serve to ascertain which Pre-Roman tribes were associated with the oldest regions where these subclades dominated. For that aim, a closer look at the developments in Aquitania and other pre-Roman Vasconic- and Iberian-speaking regions may shed some light on how founder effects might develop to leave the native language intact (in a case similar to the adoption of Indo-Iranian by post-Corded Ware Sintashta and Potapovka in the eastern Pontic-Caspian steppe).

NOTE: Although mostly unrelated, linguistic questions may also be somehow altered with a change of migration models. For example, our current Corded Ware Substrate Hypothesis – strongly contested by Kortlandt and others – implies that Uralic was potentially the language spoken by Eneolithic Ukraine / Proto-Corded Ware peoples, therefore early Uralic languages were spoken by Corded Ware peoples, as a substrate for Germanic and Balto-Slavic, and Balto-Slavic and Indo-Iranian. If an Indo-Hittite branch different from Late PIE is accepted for Eneolithic Ukraine (thus suggesting a millennia-long cultural-historical community in the steppe), then the model still stands (e.g. Ger. and BSl. *-mos/-mus, as stated by Kortlandt, would correspond to the oldest morphological IE layer). As you can read in the different versions of our model, the different possibilities for the common substrate are stated, and the most likely one selected. But the most likely a priori option sometimes turns out to be wrong…

NOTE 2: You can comment whatever you want here, but I opened a specific thread in our forum if you want serious comments on the model to stick and be further discussed.

Featured images: from the book Interactions, changes and meanings. Essays in honour of Igor Manzura on the occasion of his 60th birthday. Țerna S., Govedarica B. (eds.). 2016. Kishinev: Stratum Plus.

Admixture of Srubna and Huns in Hungarian conquerors


New preprint at BioRxiv, Mitogenomic data indicate admixture components of Asian Hun and Srubnaya origin in the Hungarian Conquerors, by Neparáczki et al. (2018).

Abstract (emphasis mine):

It has been widely accepted that the Finno-Ugric Hungarian language, originated from proto Uralic people, was brought into the Carpathian Basin by the Hungarian Conquerors. From the middle of the 19th century this view prevailed against the deep-rooted Hungarian Hun tradition, maintained in folk memory as well as in Hungarian and foreign written medieval sources, which claimed that Hungarians were kinsfolk of the Huns. In order to shed light on the genetic origin of the Conquerors we sequenced 102 mitogenomes from early Conqueror cemeteries and compared them to sequences of all available databases. We applied novel population genetic algorithms, named Shared Haplogroup Distance and MITOMIX, to reveal past admixture of maternal lineages. Phylogenetic and population genetic analysis indicated that more than one third of the Conqueror maternal lineages were derived from Central-Inner Asia and their most probable ultimate sources were the Asian Huns. The rest of the lineages most likely originated from the Bronze Age Potapovka-Poltavka-Srubnaya cultures of the Pontic-Caspian steppe, which area was part of the later European Hun empire. Our data give support to the Hungarian Hun tradition and provides indirect evidence for the genetic connection between Asian and European Huns. Available data imply that the Conquerors did not have a major contribution to the gene pool of the Carpathian Basin, raising doubts about the Conqueror origin of Hungarian language.

“Comparison of major Hg distributions from modern and ancient populations. Asian main Hg-s are designated with brackets. Major Hg distribution of Conqueror samples from this study are very similar to that of other 91 Conquerors taken from previous studies [11,12]. Scythians and ancient Xiongnus show similar Hg composition to the bracketed Asian fraction of the Conqueror samples, but Hg B is present just in Xiongnus. Modern Hungarians have very small Asian components pointing at small contribution from the Conquerors. Of the 289 modern Hungarian mitogenomes 272 are published in [29]. Scythian Hg-s are from [48,49,55,59,71–74]. Xiongnu Hg-s are from [66–69].”

Just recently another article contributed to a similar idea. I already talked about the Bronze Age R1a-Z93 sample with high steppe ancestry found in the Balkans, and its likely origin in an expansion of the Srubna or a related culture. No truce, therefore, for those looking for autochthonous continuity anywhere in Europe.

We are seeing how multiple migrations, often from the Pontic-Caspian steppe, shaped the history of the Carpathian basin (and its complex genetic structure), and of Europe in general. That is clear from many different prehistoric and historical times, such as the expansions of Suvorovo-Novodanilovka, Yamna, Srubna, Thraco-Cimmerians, Sarmatians, Scythians, Huns, etc.

About the linguistic interpretations based on genetics contained in the paper (Hungarian language as a legacy of Huns), well, you know my stance regarding the Yamnaya ancestral concept (and the wrong linguistic interpretations derived from it, which many sadly keep to this day), and regarding the use of genetics in general to solve language questions.

This is yet another example of how (what some people would call) “scientific data” is useless without sound anthropological models.

Featured image, from the article: “Hypothetic origin and migration route of different components of the Hungarian Conquerors. Bluish line frames the Eurasian steppe zone, within which all presumptive ancestors of the Conquerors were found. Yellow area designates the Xiongnu Empire at its zenith from which area the East Eurasian lineages originated. Phylogeographical distribution of modern East Eurasian sequence matches (Fig. 1) well correspond to this territory, especially considering that Yakuts, Evenks and Evens lived more south in the past [108], and European Tatars also originated from this area. Regions where Asian and European Scythian remains were found are labeled green, pink is the presumptive range of the Srubnaya culture. Migrants of Xiongnu origin most likely incorporated descendants of these groups. The map was created using QGIS 2.18.4[109]”.

Article available under a CC-BY-NC-ND 4.0 International license.

Discovered via Razib Khan.

Something is very wrong with models based on the so-called 'Yamnaya admixture' – and archaeologists are catching up (II)

A new article by Leo S. Klejn tries to improve the Northern Mesolithic Proto-Indo-European homeland model of the Russian school of thought: The Steppe hypothesis of Indo-European origins remains to be proven, Acta Archaeologica, 88:1, 193–204.


Recent genetic studies have claimed to reveal a massive migration of the bearers of the Yamnaya culture (Pit-grave culture) to the Central and Northern Europe. This migration has supposedly lead to the formation of the Corded Ware cultures and thereby to the dispersal of Indo-European languages in Europe. The article is a summary presentation of available archaeological, linguistic, genetic and cultural data that demonstrates many discrepancies in the suggested scenario for the transformations caused by the Yamnaya “invasion” some 5000 years ago.


Both teams [Reich/Anthony, and Willerslev/Kristiansen] interpreted this resemblance in the same way: as evidence of mass migration of the Yamnaya culture from the steppes into the Central and Northern Europe, resulting in the formation of the Corded Ware cultures, and these are universally recognised as Indo-European. Since earlier in this part of Europe existed a different pool of genomes, geneticists presumed that the Yamnaya migration alone had brought the Indo-European languages into Europe. It is difficult to say to what extent the pre-convictions of the involved archaeologists influenced these conclusions, or whether the results of the genetic studies attracted archaeologists with such beliefs.

Mismatch of cultural manifestations

First, we might question the idea of the Yamnaya culture as a unity rather than a loose conglomerate of cultures. Merpert (1974) divided it into nine local groups but did not recognise them as separate cultures. However, in 1975 I suggested that Nerushay (Budzhak) monuments should be recognised as a distinct culture (Klejn 1975), although still as a part of the same broader steppe community.

This was accepted by other specialists (Ivanova 2012; 2013; 2014). Generally, in the western branch of this community, a mixture of the eastern rites of interment with local, Balkan ceramics can be observed. It should be noted that hitherto all genetic samples were taken from eastern material (in the vicinity of Samara in the Volga basin and Kalmykia), while the central thesis concerns the intrusion of the western branch of this community (Budzhak culture) into Europe.

The spread of cultural-historical communities of the Yamnaya culture and the location of the Budzhak culture. GAC – Globular Amphora culture; CWC – Corded Ware culture. After Ivanova 2013.

Simultaneity of cultures

The Yamnaya culture (Chernykh & Orlovskaya 2004a; Heyd 2011; Frȋnculeasa et al. 2015) appears not to be the predecessor of the Corded Ware cultures but is contemporary with them. The Corded Ware cultures appeared also around the turn between the fourth and third millennium BC (Stöckli 2001; Furholt 2003). Their derivation from the Yamnaya seems, therefore, to be less probable. This is evidenced by the fact that the corded beakers or amphorae found in the Budzhak culture are not the prototypes of the corded beakers or amphorae found in more northern territories, but seem instead to be an outcome of contemporaneous contacts (Ivanova 2014; Klejn 2017c).

Discrepancies across the haplogroups

Even more remarkable is the variation in the distribution of types of Y chromosome. In the Yamnaya population, R1b is not just a single occurrence (there are about seven known occurrences) while in the Corded Ware population a different clade of R1b is found and R1a is predominant (several instances). Thus the postulate of unbroken succession finds no support!

Distribution of artefacts and customs of the Yamnaya culture in the area of the Corded Ware cultures. After Bátora 2006.

Paradoxical gradient

In the tables presented in the article by Reich’s team (Haak et al. 2015) the genetic pool connecting the Yamnaya culture with the Corded Ware people is shown to be more intense in Northern Europe (Norway and Sweden) and decreases gradually from the North to the South (Fig. 6). It is weakest around the Danube, in Hungary, i.e. areas neighbouring the western branch of the Yamnaya culture! This is the reverse image to what the proposed hypothesis by the geneticists would lead us to expect. It is true that this gradient is traced back from the contemporary materials, but it was already present during the Bronze Age (Klejn 2015a).

The author also uses questionable interpretations from selected articles to advance his (as of today) untenable positions regarding a Mesolithic origin of the reconstructible Proto-Indo-European language.

1. Glottochronology, for a PIE origin:

If based on the data of glottochronology (taking into account all disputes) the period of initial dispersal is to be dated to the 7th-5th millennium BC.

2. Doubts on the origin of R1b-L51 subclades expressed in Genetic differentiation between upland and lowland populations shapes the Y-chromosomal landscape of West Asia, by Balanovsky et al. (2017), Human Genetics 136, 4. 437-450:

The currently available dataset does not contradict the hypothesis that R-GG400 marks a link between the East European steppe dwellers and West Asians, though the route and even direction of this migration is disputable. It does, however, demonstrate that present-day West European R1b chromosomes do not originate from the Yamnaya populations analyzed in (Haak et al. 2015; Mathieson et al. 2015) and raises the question of their origin. A Bronze Age origin is more likely than a Neolithic one (Balaresque et al. 2010), but further ancient DNA studies may be necessary to identify this source.

Just yesterday I read the post The retraction paradox: Once you retract, you implicitly have to defend all the many things you haven’t yet retracted, by Andrew Gelman. While – in my opinion – the post does not live up to its title, it poses an interesting question about how the argument ad logicam (the fallacy fallacy) is often used in research today: an author proposes something that is later shown to be wrong, so everything they wrote or write can be said ipso facto to be wrong… especially if they accept that it was wrong.

This is usual with amateur geneticists (those who don’t publish, and are therefore not subjected to criticism): if anyone is wrong (whether in Archaeology or Genetics), then they are wrong in everything else. It seems to me that Klejn’s theses against recent genetic results rest on the same assumption: The Yamna -> Corded Ware migration model is wrong, ergo the Yamna homeland model is wrong.

I guess this same fallacy is what a lot of angered geneticists (whether professionals or amateurs) are going to use to dismiss Klejn’s criticism, focusing on what he clearly does not grasp – the genomic data of Yamna peoples and their expansion – to disregard his doubts on genetic interpretations entirely.

I have warned many times about how simplistic interpretations of genetic data would cause a general mistrust in the field, and that archaeologists won’t take the discipline seriously, no matter how many articles get published in famous research tabloids like Nature or Science…

Those who dismiss this warning lightly seem to forget the fate of other recent “scientific breakthroughs” which were initially so promising that the Humanities appeared to matter no more, like glottochronology for Linguistics and, to some extent, radiocarbon analysis for Archaeology.
EDIT: see here a recent example of discussion on discrepancies between archaeological and 14C-based chronologies, whereby ‘scientific data’ obviously needs archaeological context for a meaningful interpretation.

Featured image: The direction of the supposed migration of the bearers of the Yamnaya culture into the area of the Corded Ware cultures. After Haak et al. 2015.

NOTE: I obviously don’t agree with Klejn’s main model: he criticises the Proto-Indo-European steppe homeland, and more specifically the expansion of Yamna peoples with R1b-L23 subclades, which I support. But, probably because of his “pre-convictions” (as he puts it when describing proponents of the steppe hypotheses) about the Proto-Indo-European homeland in Northern Europe during the Mesolithic, he was one of the first renowned archaeologists to criticise the obvious inconsistencies in the genetic model of migrations based exclusively on the “Yamnaya ancestral component” concept, and to provoke the necessary reaction from (until then) overconfident geneticists, and he deserves credit for that.

In my opinion, the Russian school’s “Northern European Mesolithic” homeland model – as I have said before – could be based on the appearance of EHG ancestry, or maybe on the expansion of haplogroup R1b with post-Swiderian cultures, but the timeframe proposed is too early for any reconstructible parent proto-language, even for Indo-Uralic.


Recent archaeological finds near Indo-European and Uralic homelands


The latest publication of Documenta Praehistorica, vol. 44 (2017) is a delight for anyone interested in Indo-European and Uralic studies, whether from a linguistic, archaeological, anthropological, or genetic point of view. Articles are freely downloadable from the website.

The following is a selection of the articles I find most interesting, although almost all of them are.

On the Corded Ware culture

Do 14C dates always turn into an absolute chronology? The case of the Middle Neolithic in western Lesser Poland, by Marek Nowak:

In the late 5th, 4th, and early 3rd millennia BC, different archaeological units are visible in western Lesser Poland. According to traditional views, local branches of the late Lengyel-Polgár complex, the Funnel Beaker culture, and the Baden phenomena overlap chronologically in great measure. The results of investigations done with new radiocarbon dating show that in some cases a discrete mode and linearity of cultural transformation is recommended. The study demonstrates that extreme approaches in which we either approve only those dates which fit with our concepts or accept with no reservation all dates as such are incorrect.

Territory of western Lesser Poland and the main archaeological units in the late 5th, 4th and early 3rd millennia BC: 1 borders of the area discussed in the paper; 2 sites of the Lublin-Volhynian culture; 3 the Wyciąże-Złotniki group; 4 the Funnel Beaker culture (a dense settlement typical of ‘loess’ upland; b more dispersed settlement typical of foothills, alluvial plains/basins and ‘jurassic’ zones; c highly dispersed settlement typical mainly of mountainous zone); 5 sites with the Wyciąże/Niedźwiedź materials; 6 the Baden culture; 7 the Beaker/Baden assemblages; 8 Corded Ware culture (a relatively dense settlement typical mainly of ‘loess’ upland; b highly dispersed settlement typical of other ecological zones).

This article brings new data against David Anthony’s new IECWT model, suggesting later dates for the Corded Ware Culture group of Lesser Poland, and thus an earlier origin of their nomadic herders in the steppe, forest-steppe or forest zone to the east and south-east.

On the Pontic-Caspian steppe and forest-steppe

First isotope analysis and new radiocarbon dating of Trypillia (Tripolye) farmers from Verteba Cave, Bilche Zolote, Ukraine, by Lillie et al.:

This paper presents an analysis of human and animal remains from Verteba cave, near Bilche Zolote, western Ukraine. This study was prompted by a paucity of direct dates on this material and the need to contextualise these remains in relation both to the transition from hunting and gathering to farming in Ukraine, and their specific place within the Cucuteni-Trypillia culture sequence. The new absolute dating places the remains studied here in Trypillia stages BII/CI at c. 3900–3500 cal BC, with one individual now redated to the Early Scythian period. As such, these finds are even more exceptional than previously assumed, being some of the earliest discovered for this culture. The isotope analyses indicate that these individuals are local to the region, with the dietary stable isotopes indicating a C3 terrestrial diet for the Trypillia-period humans analysed. The Scythian period individual has δ13C ratios indicative of either c. 50% marine, or alternatively C4 plant inputs into the diet, despite δ18O and 87Sr/86Sr ratios that are comparable to the other individuals studied.

Map showing the extent of the Trypillia culture of Ukraine and neighbouring countries, key sites and the location of Verteba Cave. © WAERC, University of Hull.

New data on one of the cultures that was very likely a close neighbour of Corded Ware peoples.

Chronology of Neolithic sites in the forest-steppe area of the Don River, by Smolyaninov, Skorobogatov, and Surkov:

The first ceramic complexes appeared in the forest-steppe and forest zones of Eastern Europe at the end of the 7th–5th millennium BC. They existed until the first half of the 5th millennium BC in the Don River basin. All these first ceramic traditions had common features and also local particularities. Regional cultures, distinguished nowadays on the basis of these local particularities, include the Karamyshevskaya and Middle Don cultures, as well as pottery of a new type found at sites on the Middle Don River (Cherkasskaya 3 and Cherkasskaya 5 sites).

Radiocarbon chronology of Neolithic in the Lower Don and North-eastern Azov Sea, by Tsybryi et al.:

So far, four different cultural-chronological groups of sites have been identified in the North-eastern Azov Sea and Lower Don River areas, including sites of the Rakushechny Yar culture, Matveev Kurgan culture, Donets culture, and sites of the Caspian-Ciscaucasian region. An analysis of all known dates, as well as the contexts and stratigraphies of the sites, allowed us to form a new perspective of the chronology of southern Russia, to revise the chronology of this region, and change the concept of unreliability of dates for this area.

On the Forest Zone

The past in the past in the mortuary practice of hunter-gatherers: an example from a settlement and cemetery site in northern Latvia, by Lars Olof Georg Larsson:

During excavations of burials at Zvejnieki in northern Latvia, it transpired that the grave fill included occupation material brought to the grave. It contained tools of a type that could not be contemporaneous with the grave. This is confirmed by the dating of bone tools and other bone finds in the fill. The fill was taken from an older settlement site a short distance away. The fill also included skeletal parts of humans whose graves had been destroyed with the digging of the grave for a double burial. This provides an interesting view of the mortuary practice of hunter-gatherers and an insight into the use of the past in the past.

The Zvejnieki site with the location of the burial ground, the settlements, the farmhouse on the site and the gravel pit.

I keep hoping for more information on the important sample labelled “Late Neolithic/Corded Ware Culture” from Zvejnieki, ca. 2880 BC. It seems too early for the Corded Ware culture in the region, it clusters too close to steppe samples, and the information on it in genetic papers is so scarce… My ad hoc explanation of these data – as a product of recent exogamy from Eastern Yamna – while possibly enough to explain one sample, is not satisfying without further data, so we need more samples from the region to get a clearer picture of what happened there and when. Another possibility is a new classification of the sample, compatible with later migration events (a later date for the sample would explain a lot). In any case, this article does not reveal anything about this matter, but it is interesting for other, earlier samples from the cemetery.

Other articles on the Forest Zone include:

Other articles include studies on Neolithic sites in regions potentially relevant for Indo-European migrations, such as Anatolia, Greece, and southern or south-eastern Europe. Check them out!


Differences in ADMIXTURE between Khvalynsk/Yamna and Sredni Stog/Corded Ware


Looking for differences among steppe cultures in Genomics is like looking for a needle in a haystack.

It means, after all, looking for differences among closely related cultures, such as between South-Western and North-Western Anatolian Neolithic cultures, or among Old European cultures (such as Vinča or Cucuteni–Trypillia), or between Iberian cultures after the arrival of steppe-related populations.

These differences between closely related regions, in all these cases and especially among steppe cultures, even when they are supported by Archaeology and anthropological models of migration (and compatible with linguistic models), are expected to be minimal.

Fortunately, we have phylogeography, which helps us point in the right direction when assessing potential migrations using genomic data.

User Tomenable recently pointed out a curious finding on Anthrogenica, from data available in Mathieson et al. (2017): in ADMIXTURE results with K=12, a different ancestral component (in light green in the paper, see below) is traceable from the North Caspian steppe since the Neolithic. This is also partially distinguishable at K=10 and K=11, although it does not differentiate so clearly among later cultures.

NOTE: Read more on the controversy regarding the ideal number of ancestral populations, the absurd use of ADMIXTURE to solve language questions, and the meaning of cross-validation (CV) values
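For readers unfamiliar with CV values: when ADMIXTURE is run with its cross-validation option, it reports one CV error per K, and the K with the lowest error is commonly taken as the best-supported number of ancestral populations (which says nothing, of course, about whether a component seen at another K is meaningful). The following is only a minimal sketch of that comparison, assuming log lines in the `CV error (K=…): …` format; the numbers and the helper names `parse_cv_errors` and `best_k` are made up for illustration and are not taken from Mathieson et al. (2017).

```python
import re

# Hypothetical log lines in the format ADMIXTURE prints when run with --cv.
# The values are invented for illustration; they are NOT from any paper.
EXAMPLE_LOG = """\
CV error (K=10): 0.41230
CV error (K=11): 0.41195
CV error (K=12): 0.41210
"""

# Matches e.g. "CV error (K=12): 0.41210" and captures K and the error.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text):
    """Return {K: cv_error} for every CV line found in the log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


cv = parse_cv_errors(EXAMPLE_LOG)
print(best_k(cv))
```

With these invented values the lowest error falls at K=11, while K=10 and K=12 are almost indistinguishable from it, which is the kind of situation where reading much into a single K is risky.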

Unsupervised ADMIXTURE plot from k=10 to 12, on a dataset consisting of 1099 present-day individuals and 476 ancient individuals. We show newly reported ancient individuals and some previously published individuals for comparison.

Explanations for this finding might include, as the user points out, a greater contribution of CHG ancestry in the eastern steppe cultures (Khvalynsk/Yamna) compared to the North Pontic steppe (Sredni Stog/Corded Ware). This is probably one of the main genomic differences between the two groups, as I pointed out in the Indo-European demic diffusion model (see accounts on the origins of Khvalynsk and Sredni Stog populations and on contacts between Yamna and the Caucasus, and see below also my sketch of Eurasian genomic history).

Also interesting is the appearance of similar ancestral components later in Vučedol – which probably received admixture from Yamna settlers (see admixture components in West Yamna samples and in the Yamna settler from Bulgaria) – and later still in the Balkans.

On the other hand, previous ancestral components in outliers from the Balkans seem to be more similar to Sredni Stog samples, giving still more strength to the hypothesis that this common (“steppe”) component expanded westward within the Pontic-Caspian steppe with the spread of Suvorovo-Novodanilovka chiefs.

Problems with this interpretation include:

1) The scarce samples available, the different cultures included, and the CV values of the K populations selected in ADMIXTURE.

2) The lack of data for comparison with Bell Beaker peoples (from Olalde et al. 2017).

3) The sample classified as Latvia_LN/CWC has this component. As I have said before, given the differences with all other Corded Ware samples, this quite early sample might be an outlier, with a Khvalynsk/Yamna population connected directly to the ancestors of this individual, possibly through exogamy (as shown in my sketch below). Whether or not this is an outlier among CWC populations in the Baltic, only future samples can tell.

4) Three later individuals from Corded Ware in Germany have the component, in a minimal amount. I would bet – judging by their position in the graphic – that this might be explained through the Esperstedt family. These individuals might have in turn got the contribution directly from the oldest member, who shows what seems (in PCA) like a recent admixture from contemporary steppe cultures (such as the Catacomb culture).

NOTE: See my graphics with interesting members of the Esperstedt family marked: ADMIXTURE and PCA (outlier).

Tentative sketch modelling the genetic history of Europe and West Eurasia from ancient populations up to the Neolithic, according to results in recent genetic papers and archaeological models of known migrations.

Again, needle in a haystack… And confirmation bias by me, indeed.

But interesting nonetheless.

EDIT (4 JAN 2017): A reader points out that the interpretation of unsupervised ADMIXTURE should work backwards (i.e. from different contributions into different modern populations), and not be based solely on ancestral populations, which is probably right. So again, confirmation bias (and potentially a wrong direction fallacy) on my part…


Massive Migrations? The Impact of Recent aDNA Studies on our View of Third Millennium Europe


Thanks to Joshua Jonathan, I have discovered the paper Massive Migrations? The Impact of Recent aDNA Studies on our View of Third Millennium Europe, by Martin Furholt, European Journal of Archaeology (28 SEP 2017).


New human aDNA studies have once again brought to the forefront the role of mobility and migration in shaping social phenomena in European prehistory, processes that recent theoretical frameworks in archaeology have downplayed as an outdated explanatory notion linked to traditional culture history. While these new genetic data have provided new insights into the population history of prehistoric Europe, they are frequently interpreted and presented in a manner that recalls aspects of traditional culture-historical archaeology that were rightly criticized through the 1970s to the 1990s. They include the idea that shared material culture indicates shared participation in the same social group, or culture, and that these cultures constitute one-dimensional, homogeneous, and clearly bounded social entities. Since the new aDNA data are used to create vivid narratives describing ‘massive migrations’, the so-called cultural groups are once again likened to human populations and in turn revitalized as external drivers for socio-cultural change. Here, I argue for a more nuanced consideration of molecular data that more explicitly incorporates anthropologically informed mobility and migration models.

I was copying and pasting whole excerpts to post them here, but I think it is best to read the full paper.

From the paper: “Simplified map showing the extent of the most important archaeological units of classification in the third millennium cal BC in Europe discussed in this text.”

It is a great summary of potential flaws of the current reasoning in genetic papers.

It should be a must-read for any serious geneticist involved in discussions on migrations, especially regarding archaeology in Indo-European studies.

As for the responses to the paper, Haak’s is, unsurprisingly, quite disappointing: it neither addresses the main flaw of their proposed “Yamna -> Corded Ware migration” model, nor takes the opportunity to evaluate other potential models fitting their findings of steppe ancestry in Corded Ware peoples, not even those directly suggested to them (like the expansion of Suvorovo-Novodanilovka chiefs).

NOTE: A funny thing about the paper is that, although published at the end of September, it does not take into account certain recent developments supporting Furholt’s doubts, such as the Esperstedt family, the new sample of Sredni Stog (and consequently the change in interpretations of outliers in Eneolithic Ukraine populations), or even the elevated steppe ancestry found in East Bell Beaker peoples. I guess Haak’s answer to all that would still be the same thorough argument: “meh, massive Indo-European migration Yamnaya -> Corded Ware is right”…

EDIT (30 DEC 2017): Check out the interesting article by Bruce G. Trigger, referenced by John Hawks, about the question of descriptive vs. theoretical archaeologist vs. ethnologist/anthropologist from the 1950s to the 1980s. It is interesting to see how today the new playboys in Academia, geneticists, are playing the archaeologist playing the ethnologist playing the linguist in Indo-European questions, and how we are living through a historic debate on essential questions for the future of all these disciplines.