Consequences of Damgaard et al. 2018 (I): EHG ancestry in Maykop samples, and the potential Anatolian expansion routes


This is part I of two posts on the most recent data concerning the earliest known Indo-European migrations.

Anatolian in Armi

I have been reading in forums about “Kroonen’s proposal” of Anatolian in the 3rd millennium. That is false. The Copenhagen group (in particular the authors of the linguistic supplement, Kroonen, Barjamovic, and Peyrot) are merely referencing Archi (2011. “In Search of Armi”. Journal of Cuneiform Studies 63: 5–34), in turn using transcriptions from Bonechi (1990. “Aleppo in età arcaica; a proposito di un’opera recente”. Studi Epigrafici e Linguistici sul Vicino Oriente Antico 7: 15–37), who asserted the potential Anatolian origin of the terms. This is what Archi had to say about this:

Most of these personal names belong to a name-giving tradition different from that of Ebla; Arra-ti/tulu(m) is attested also at Dulu, a neighbouring city-state (Bonechi 1990b: 22–25). We must, therefore, deduce that Armi belonged to a marginal, partially Semitized linguistic area different from the ethno-linguistic region dominated by Ebla. Typical are masculine personal names ending in -a-du: A-la/li-wa-du/da, A-li/lu-wa-du, Ba-mi-a-du, La-wadu, Mi-mi-a-du, Mu-lu-wa-du. This reminds one of the suffix -(a)nda, -(a)ndu, very productive in the Anatolian branch of Indo-European (Laroche 1966: 329). Elements such as ali-, alali-, lawadu-, memi-, mula/i- are attested in Anatolian personal names of the Old Assyrian period (Laroche 1966: 26–27, 106, 118, 120).

Ebla’s first kingdom at its height c. 2340 BC. Hypothetical location of Armi depicted. The first Eblaite kingdom extended from Urshu in the north to the Damascus area in the south, and from Phoenicia and the coastal mountains in the west to Tuttul and Haddu in the east. The eastern kingdom of Nagar controlled most of the Khabur basin from the river junction with the Euphrates to the northwestern part at Nabada. From Wikipedia.

This was used by Archi to speculatively locate the state of Armi, in or near Ebla territory, which could correspond with the region of modern north-western Syria:

The onomastic tradition of Armi, so different from that of Ebla and her allies (§ 5), obliges us to locate this city on the edges of the Semitized area and, thus, necessarily north of the line running through Hassuwan – Ursaum – Irritum – Harran. If Armi were to be found at Banat-Bazi, it would have represented an anomaly within an otherwise homogenous linguistic scenario.

Taken as a whole, the available information suggests that Armi was a regional state, which enjoyed a privileged relationship with Ebla: the exchange of goods between the two cities was comparable only to that between Ebla and Mari. No other state sent so many people to Ebla, especially merchants, lú-kar. It is only a hypothesis that Armi was the go-between for Ebla and for the areas where silver and copper were extracted.

This proposal is similar to the one used to support Indo-Aryan terminology in Mittanni (ca. 16th-14th c. BC), so the scarce material should not pose a problem to those previously arguing about the ‘oldest’ nature of Indo-Aryan.

NOTE. On the other hand, the theory connecting ‘mariannu’, a term dated to 1761 BC (referenced also in the linguistic supplement), and put in relation with PIIr. *arya, seems too hypothetical for the moment, although there is a clear expansion of Aryan-related terms in the Middle East that could support one or more relevant eastern migration waves of Indo-Aryans from Asia.

Potential routes of Anatolian migration

Once we have accepted that Anatolian is not Late PIE (and that only needed a study of Anatolian archaisms, not the terminology from Armi), we can move on to explore the potential routes of expansion.

On the Balkan route

A current sketch of the dots connecting Khvalynsk with Anatolia is as follows.

1—39 — sceptre bearers of the type Giurgiuleşti and Suvorovo; 40—60 — Gumelniţa-Varna-Bolgrad-Aldeni cultural sphere; 61 — Fălciu; 62 — Cainari; 63 — Giurgiuleşti; 64 — Suvorovo; 65 — Casimcea; 66 — Kjulevča; 67 — Reka Devnja; 68 — Drama; 69 — Gonova Mogila; 70 — Reževo.

First, we have the early expansion of Suvorovo chieftains spreading ca. 4400-4000 BC in the lower Danube region, related to Novodanilovka chiefs of the North Pontic region, and both in turn related to Khvalynsk horse riders (read a recent detailed post on this question).

Then we have Cernavoda I (ca. 3850-3550 BC), a culture potentially derived from the earlier expansion of Suvorovo chiefs, as shown in cultural similarities with preceding cultures and Yamna, and also in the contacts with the North Pontic steppe cultures (read a recent detailed post on this question).

We also have evidence of genetic inflow from the steppe into populations of cultures near those suggested to be heirs of those dominated by Suvorovo chiefs, from the 5th millennium BC (in Varna I ca. 4630 BC, and Smyadovo ca. 4500 BC, see image below).

If these neighbouring Balkan peoples of ca. 4500 BC are taken as proxies for Proto-Anatolians, then it becomes quite clear why Old Hittite samples dated 3,000 years after this migration event of elite chiefs could show no or almost no ancestry from Europe (for this question, read my revision of Lazaridis’ preprint).

NOTE. A full account of the crisis in the lower Danube, as well as the Suvorovo-Novodanilovka intrusion, is available in Anthony (2007).

Modified image, including PCA and supervised ADMIXTURE data from Mathieson et al. (2018). Blue arrow represents incoming ancestry from Suvorovo chiefs, red line represents distance from the majority of the neighbouring Balkan population in this period studied to date. Northwestern-Anatolian Neolithic (grey), Yamnaya from Samara (yellow), EHG (pink) and WHG (green).

The southern Balkans and Anatolia

The later connection of Cernavoda II-III and related cultures (and potentially Ezero) with Troy, on the other hand, is still blurry. But, even if a massive migration of Common Anatolian speakers is found to have happened from the Balkans into Anatolia in the late 4th / beginning of the 3rd millennium, the people responsible for this expansion could show a minimal trace of European ancestry.

A new paper has appeared recently (in Russian), Dubene and Troy: Gold and Prosperity in the Third Millennium Cal. BCE in Eurasia. Stratum Plus, 2 (2018), by L. Nikolova, showing commercial contacts between Troy and cultures from Bulgaria:

Earlier third millennium cal BCE is the period of development of interconnected Early Bronze Age societies in Eurasia, which economic and social structures expressed variants of pre-state political structures, named in the specialized literature tribes and chiefdoms. In this work new arguments will be added to the chiefdom model of third millennium cal BC societies of Yunatsite culture in the Central Balkans from the perspectives of the interrelations between Dubene (south central Bulgaria) and Troy (northwest Turkey) wealth expression.

Possible explanations of the similarity in the wealth expression between Troy and Yunatsite chiefdoms is the direct interaction between the political elite. However, the golden and silver objects in the third millennium cal BCE in the Eastern Mediterranean are most of all an expression of economic wealth. This is the biggest difference between the early state and chiefdoms in the third millennium cal BCE in Eurasia and Africa. The literacy and the wealth expression in the early states was politically centralized, while the absence of literacy and wider distribution of the wealth expression in the chiefdoms of the eastern Mediterranean are indicators, that wider distribution of wealth and the existed stable subsistence layers prevented the formation of states and the need to regulate the political systems through literacy.

The only way to link Common Anatolians to their Proto-Anatolian (linguistic) ancestors would therefore be to study preceding cultures and their expansions, until a proper connecting route is found, as I said recently.

These late commercial contacts in the south-eastern Balkans (Nikolova also offers a simplified presentation of data, in English) are yet another proof of how Common Anatolian languages may have further expanded into Anatolia.

NOTE. One should also take into account the distribution of modern R1b-M269* and L23* subclades (i.e. those not belonging to the most common subclades expanding with Yamna), which seem to peak around the Balkans. While those may just belong to founder effects of populations preceding Suvorovo or related to Yamna migrants, the Balkans is a region known to have retained Y-DNA haplogroup diversity, in contrast with other European regions.

From a purely linguistic point of view, there are strong Hattic and Hurrian influences on Anatolian languages, representing a unique layer that clearly differentiates them from LPIE languages, pointing also to different substrates behind each attested Common Anatolian branch or individual language:

  • Phonetic changes, like the appearance of /f/ and /v/.
  • Split ergativity: Hurrian is ergative, Hattic probably too.
  • Increasing use of enclitic pronoun and particle chains after first stressed word: in Hattic after verb, in Hurrian after nominal forms.
  • Almost obligatory use of clause initial and enclitic connectors: e.g. semantic and syntactic identity of Hattic pala/bala and Hittite nu.

NOTE. For a superficial discussion of this, see e.g. An Indo-European Linguistic Area and its Characteristics: Ancient Anatolia. Areal Diffusion as a Challenge to the Comparative Method?, by Calvert Watkins. You can also search for any of the mentioned shared isoglosses between Middle Eastern languages and Anatolian if you want more details.

On the Caucasus route

It seems that the Danish group is now taking a stance in favour of a Maykop route (from the linguistic supplement):

The period of Proto-Anatolian linguistic unity can now be placed in the 4th millennium BCE and may have been contemporaneous with e.g. the Maykop culture (3700–3000 BCE), which influenced the formation and apparent westward migration of the Yamnaya and maintained commercial and cultural contact with the Anatolian highlands (Kristiansen et al. 2018).

In fact, they have data to support this:

The EHG ancestry detected in individuals associated with both Yamnaya (3000–2400 BCE) and the Maykop culture (3700–3000 BCE) (in prep.) is absent from our Anatolian specimens, suggesting that neither archaeological horizon constitutes a suitable candidate for a “homeland” or “stepping stone” for the origin or spread of Anatolian Indo-European speakers to Anatolia. However, with the archaeological and genetic data presented here, we cannot reject a continuous small-scale influx of mixed groups from the direction of the Caucasus during the Chalcolithic period of the 4th millennium BCE.

While it is difficult to speak about the consequences of this find without having access to this paper in preparation or its samples, we already knew that Maykop had obvious cultural contacts with the steppe.

It will not be surprising to find not only EHG, but also R1b-L23 subclades there. In my opinion, though, the most likely source of EHG ancestry in Maykop (given the different culture shown in other steppe groups) is exogamy.

The question will still remain: was this a Proto-Anatolian-speaking group?

Diachronic map of Eneolithic migrations ca. 4000-3100 BC

My opinion in this regard – again, without access to the study – is that you would still need to propose:

  • A break-up of Anatolian ca. 4500 BC represented by some early group migrating into the Northern Caucasus area.
  • For this group – who were closely related linguistically and culturally to early Khvalynsk – to remain isolated in or around the Northern Caucasus, i.e. somehow ‘hidden’ from the evolving LPIE speakers in late Khvalynsk/early Yamna peoples.
  • Then, they would need to have migrated from Maykop to Anatolian territory only after ca. 3700 BC – while having close commercial contacts with Khvalynsk and the North Pontic cultures in the period 3700-3000 BC – in some migration wave that has not shown up in the archaeological record to date.
  • Then appear as Old Hittites without showing EHG ancestry (even though they show it in the period 3700-3000 BC), near the region of the Armi state, where Anatolian was supposedly spoken already in the mid-3rd millennium.

Not a very convincing picture, right now, but indeed possible.

Also, we have R1b-Z2103 lineages and clear steppe ancestry in the region probably ca. 2500 BC with Hajji Firuz, which is most likely the product of the late Khvalynsk migration waves that we are seeing in the recent papers.

These migrations are then related to early LPIE-speaking migrants spreading after ca. 3300 BC (a spread that also caused the formation of early Yamna and the expansion of Tocharian-related migrants), which leaves almost no space for an Anatolian expansion, unless one supports that the former drove the latter.

NOTE. In any case, if the Caucasus route turned out to be the actual Anatolian route, I guess this would be a way as good as any other to finally kill their Indo-European – Corded Ware theory, for obvious reasons.

On the North Iranian homeland

A few thoughts for those equating CHG ancestry in IE speakers (and especially now in Old Hittites) with an origin in North Iran, due to a recent comment by David Reich:

In the paper it is clearly stated that there is no Neolithic Iranian ancestry in the Old Hittite samples.

Ancestry is not people, and it is certainly not language. The addition of CHG ancestry to the Eneolithic steppe need not mean a population or linguistic replacement, although it could have meant one. But this has to be demonstrated with solid anthropological models.

NOTE. On the other hand, if you find people who considered (at least until de Barros Damgaard et al. 2018) steppe (ancestry/PCA) = Indo-European, then you should probably confront them about why CHG in Hittites and the arrival of CHG in steppe groups is now not to be considered the same, i.e. why CHG / Iran_N ≠ PIE.

Since there has been no serious North Iranian homeland proposal made for a while, it is difficult to delineate a modern sketch, and I won’t spend time on that unless there is some real anthropological model and genetic proof of it. I guess the Armenian homeland hypothesis proposed by Gamkrelidze and Ivanov (1995) would do, but since it relies on outdated data (some of which appears also in Gimbutas’ writings), it would need a full revision.

NOTE. Their theory of glottalic consonants (or ejectives) relied on the ‘archaism’ of Hittite, Germanic, and Armenian. As you can see (unless you live in the mid-20th century) this is not very reasonable, since Hittite is attested quite late and after heavy admixture with Middle Eastern peoples, and Germanic and Armenian are some of the latest attested (and more admixed, phonetically changed) languages.

This would be a proper answer, indeed, for those who would accept this homeland due to the reconstruction of ‘ejectives’ for these languages. Evidently, there is no need to posit a homeland near Armenia to propose a glottalic theory. Kortlandt is a proponent of a late and small expansion of Late PIE from the steppe, and still proposes a reconstruction of ejectives for PIE. But this was Gamkrelidze and Ivanov’s main reason for proposing that homeland, and in that sense it is obviously flawed.

Those claiming a relationship of the North Iranian homeland with such EHG ancestry in Maykop, or with the hypothetical Proto-Euphratic or Gutian, are obviously not understanding the implications of finding steppe ancestry coupled with (likely) early Late PIE migrants in the region in the mid-4th millennium.

Related:

Consequences of O&M 2018 (III): The Balto-Slavic conundrum in Linguistics, Archaeology, and Genetics

This is part of a series of posts analyzing the findings of the recent Nature papers Olalde et al. (2018) and Mathieson et al. (2018) (abbreviated O&M 2018).

The recent publication of Narasimhan et al. (2018) has made the draft of this post a bit outdated, and at the same time even more interesting.

While we wait for the publication of the dataset (and the actual Y-DNA haplogroups and precise subclades with the revision of the paper), and as we watch the wrath of Hindu nationalists vented against the West (as if the steppe was in Western Europe) and science itself, we have already seen confirmation from the Reich Lab of their new approach to Late Proto-Indo-European migrations.

Yamna/Steppe EMBA, previously identified as the direct source of “steppe” ancestry (AKA “Yamnaya” ancestry) and Late Indo-European migrations in Asia – through Corded Ware, it is to be understood – has been officially changed. In the case of Indo-Iranian migrations it is the “Steppe MLBA cloud”, after a direct contribution to it of Yamna/Steppe EMBA, which expanded Indo-Iranian, as I predicted ancient DNA could support.

On Twitter, the main author gave the following response when asked about this change regarding the origin of steppe ancestry in Asian migrants (emphasis mine):

Our reasons are:

  1. The Turan samples show no elevated steppe ancestry till 2000BC.
  2. MLBA is R1a
  3. Indus periphery doesn’t have steppe ancestry but Swat does, and EMBA doesn’t work both in terms of time or genetic ancestry to explain the difference.

Image modified from Narasimhan et al. (2018), including the most likely proto-language identification of different groups. Original description “Modeling results including Admixture events, with clines or 2-way mixtures shown in rectangles, and clouds or 3-way mixtures shown in ellipses”. Yes, this map is the latest official view on migrations from the Reich Lab now. See the original full image here.

I am glad to see it finally recognized that Y-DNA haplogroups and time have to be taken into account, and happy also to see an end to the by now obsolete ‘ADMIXTURE/PCA-only relevance’ in Human Ancestry. The timing of archaeological migrations, the cultural attribution of each sample, and the role of Y-DNA variability reduction and expansion have finally been recognized as equally important to assess potential migrations, as I requested.

This change was already in the making some months ago, when David Anthony – who has worked with the group for this paper and others before it – changed his official view on Corded Ware, abandoning his previous support of the 2015 model. His latest theory, which linked Yamna settlements in Hungary with a potential mixed society of migrants (of R1b-L23 and R1a-Z645 lineages) from West Yamna, is most likely wrong, too, but it was clearly a brave step forward in the right direction.

The only reasonable model now is that Yamna expanded Late Proto-Indo-European languages with steppe ancestry + R1b-L23 subclades.

You can either accept this change, or you can deny it and wait until one sample of R1a-Z645 appears in West Yamna or central Europe, or one sample of R1b-L23 appears in Corded Ware (as obviously could happen), to keep spreading the wrong ideas for some more years, while the rest of the world goes on: Mallory, Anthony, and other archaeologists co-authoring the latest paper (probably part of the stronger partnership with academics that we were going to see), who had formally put forward complex, detailed theories, investing their time and name in them, have rejected their previous migration models to develop new ones based on the most recent findings. If they can do that, I am sure any amateur geneticist out there can, too.

Modified image, from Narasimhan et al. (2018). Anthony’s new model of a Yamna Hungary -> Corded Ware (Małopolska) migration arrow in red. Notice also how they keep the arrow from West Yamna to the north (in black), due probably to the Baltic Late Neolithic samples (see below).

The Balto-Slavic dialect and its homeland

An interesting question in Linguistics and Archaeology, now that Corded Ware cannot be identified as “Indo-Slavonic” or any other imaginary ancient group (like Indo-Slavo-Germanic), remains thus mostly unchanged since before the famous 2015 genetic papers:

  • Was Balto-Slavic a dialect of the expanding North-West Indo-European language, a Northern LPIE dialect, as we support, based on morphological and lexical isoglosses?
  • Or was it part of an Indo-Slavonic group in East Yamna, i.e. a Graeco-Aryan dialect, based mainly on the traditional Satem-Centum phonological division?

I am a strong supporter of Balto-Slavic being a member of a North-West Indo-European group. That’s probably because I educated myself first with the main Spanish books* on Proto-Indo-European reconstruction, and its authors kept repeating this consistent idea, but I have found no relevant data to reject it in the past 15 years.

* Today two of the three volumes are available in English, although they are from the early 1990s, hence a bit outdated. They also maintain certain peculiarities from Adrados’ own personal theories, such as multiple (coloured) laryngeals, 5 cases – with a common ancestral oblique case – for Middle PIE, etc. But they contain lots of detailed discussions on the different aspects of the reconstruction. They are not an easy introductory manual to the field, though; for that there are already many famous short handbooks out there, like those of Fortson (N.American), Beekes (Leiden), or Meier-Brügger (Germany).

Fernando and I have always maintained that North-West Indo-European must have formed a very recent community, probably connected well into the early 2nd millennium BC for certain recent isoglosses to spread among its early dialects, based on our guesstimates*, and on our belief that it formed at some point not just a dialect continuum, but probably a common language, so we estimated that the expansion was associated with the pan-European influence of Únětice and close early Bronze Age European contacts.

NOTE. I know, you must be thinking “linguistic guesstimates? Bollocks, that’s not Science”. Right? Wrong. When you learn a dozen languages from different branches, half a dozen ancient ones, and then still study some reconstructed proto-languages from them, you begin to make your own assumptions about how the language changes you perceive could have developed according to your mental time frames. If you just learned a second language and some Latin in school, and try to make assumptions as to how language changes, or you believe you can judge it with this limited background, you evidently have the wrong idea of what a guesstimate is. I accept criticism of this concept from a scientist used only to statistical methods, since it comes from pure ignorance of what it means. And I accept alternative guesstimates from linguists whose language backgrounds may differ (and thus their perception of language change). However, I would not accept a glottochronological or otherwise (supposedly) statistical model instead (or a religious model, for that matter), so we have no alternatives to guesstimates for the moment.

In fact, guesstimates and dialectalization have paved the way to the steppe hypothesis, first with the kurgan hypothesis by Marija Gimbutas, then complemented further in the past 60 years by linguists and archaeologists into a detailed Khvalynsk -> Yamna -> Afanasevo/Bell Beaker/Sintashta-Andronovo expansion model, now confirmed with genomics. So either you trust us (or any other polyglot who deals with Indo-European matters, like Adrados, Lehmann, Beekes, Kloekhorst, Kortlandt, etc.), or you begin learning ancient languages and obtaining your own guesstimates, whichever way you prefer. The easy way of numbers + computer science does not exist yet, and is quite far from happening (at least until we can understand how our brains summarize and select important details involved in obtaining estimates), no matter what you might have been reading recently (even in Nature or Science).

Proto-Indo-European dialectal expansion according to Adrados (1998).

Data from the 2015 papers changed my understanding of the original NWIE-speaking community, and I have since shifted my preferred anthropological model (from a Northern dialect in Yamna spreading into a loose NWIE-speaking Corded Ware -> Únětice) to a quite close group formed by late Yamna settlers in the Carpathian Basin, expanded as East Bell Beakers, and later continuing with close contacts through Central European EBA.

NOTE. As you can read, we initially rejected Gimbutas’ and Anthony’s (2007) notion of a Late PIE splitting suddenly into all known dialects (viz. Italo-Celtic with Vučedol/Bell Beaker), and looked thus for a common NWIE spread with Corded Ware migrants, with help from inferences of modern haplogroup distribution (as was common in the early 2000s). Language reconstruction was the foundation of that model, and it was right in its own way. It probably gave the wrong idea to geneticists and archaeologists, who quite easily accepted some results from the 2015 papers as supporting this model. But it also helped us develop a new model and predict what would happen in future papers, as demonstrated in O&M 2018. Any alternative linguistic and archaeological model could explain what is seen today in genomics, but our model of North-West Indo-European reconstruction is obviously at present the best fit for it.

Map of Chalcolithic migrations (A Grammar of Modern Indo-European, 2nd ed. 2008): Corded Ware as the vector of Indo-European languages.

Nevertheless, one of the most important Balticists and Slavicists alive, Frederik Kortlandt, posits that there was in fact an Indo-Slavonic group, so one has to take that possibility into account. Not that his ideas are flawless, of course: he defends the glottalic theory (which is still held today by just a handful of researchers), and I strongly oppose his description of Balto-Slavic and Germanic oblique cases in *-m- (against other LPIE *-bh-) as an ancestral remnant related to Anatolian (an ending which few scholars would agree corresponds to what he claims), since that would probably represent an older split than warranted in our model. I believe genetics is proving that the dialectalization of Late PIE happened as Fernando López-Menchero and I described.

NOTE. The idea with these examples of how he has been wrong in LPIE and MPIE reconstruction is not to observe the common ad hominem arguments used by amateur geneticists to dismiss academic proposals (“he said that and was wrong, ergo he is wrong now”). It is to draw attention to the fact that the argument from authority is important for the academic community insofar as it creates a common ground, i.e. especially when there are many relevant scholars agreeing on the same subject. But, indeed, any model can and should be challenged, and all authorities are capable of being wrong, and in fact they often are.

The most common explanation today for the dialectal development *-m- is that it is an innovation (not an archaism), whether morphological (viz. Ita. and Gk. them. pl *-i) or phonological (as I defend); and the most commonly repeated model for the satemization trend (even for those supporting a three-dorsal theory for PIE) is areal contact, whether driven by a previous (most likely Uralic) substratum, or not. Hence, if Kortlandt’s main divergent phonological and morphological assessments of the parent language are flawed, and they are the basis for his dialectal scheme, then that scheme should be revised.

The ‘atomic bomb’ that Indo-Slavonic proponents launched, in my opinion, was Holzer’s Temematic (born roughly at the same time as the renewed Old European concept in Oettinger’s North-West Indo-European model) – and indeed Kortlandt’s acceptance of it. It seems to me like the linguistic equivalent of the archaeological “patron-client relationship” proposed by Anthony for a cultural diffusion of Late PIE into different Corded Ware regions: almost impossible to reject fully, if the Indo-Slavonic superstrate is proposed for a relatively early time.

In my opinion, the shared morphological layer with North-West Indo-European is obviously older than Iranian influence on Slavic, and I think this is communis opinio today. But how could we disentangle the dialectalization of Balto-Slavic, if there is (as it seems) an ancestral substrate layer (most likely Uralic) common to both Balto-Slavic and Indo-Iranian? It seems a very difficult task.

Diachronic map of migrations in Europe ca. 2250-1750 BC

The expansion of Balto-Slavic

In any case, there are two, and only two, mainstream choices right now.

NOTE. Mainstream, as in representing trends current today among Indo-Europeanists, so that many programs around the world would explain these alternative models to their students, or they would easily appear in most handbooks. Not like the word “mainstream” you read in any comment out there by anyone who has never been interested in Indo-European studies, and uses any text from any author, written who knows how long ago, merely to justify their ethnic preconceptions coupled with certain genomic finds.

You can agree with:

A) The Spanish and German schools of thought, together with many American and British scholars, as well as archaeologists like Heyd, Mallory, or Prescott, and now Anthony, too: the language ancestral to Balto-Slavic, Germanic, and Italo-Celtic accompanied expanding West Yamna/East Bell Beakers into Europe, and then their speakers – like the rest of peoples everywhere in Europe – admixed later in the different regions.

B) Frederik Kortlandt and other Indo-Slavicists. The ‘original’ Balto-Slavic would have spread with Srubna (and likely Potapovka before it), as a product of the admixture of East Yamna’s Indo-Slavonic with incoming Corded Ware migrants (this would correspond to my description of Indo-Iranian). ‘True’ Balto-Slavic speakers would have then absorbed the Temematic-speaking migrants (equivalent to early Balto-Slavic migrants as described in the demic diffusion model) spreading from the west, most likely in the steppe. Later developments from the steppe would have then brought Baltic to the north, and Slavic to the west.

Therefore, in both cases the language spoken by early R1a-Z645 lineages in Únětice or Mierzanowice/Nitra EBA cultures would have been an eastern North-West Indo-European dialect associated with expanding Bell Beakers, and closely related to Germanic and Italo-Celtic. In the second case, the ancient samples we see genetically closer to modern West Slavs could thus be identified with those speaking the Temematic substrate absorbed later by Balto-Slavic, or maybe by Balts migrating northward, and Slavs spreading west- and southward.

NOTE. In any case, we know that R1a-Z645 subclades resurged in Central-East Europe after the expansion of Bell Beakers, potentially showing an ancient link with the prevalent R1a subclades in the region today. We know that some ancient Central European populations cluster near modern West Slavs, but in other interesting regions (like the British Isles, Central Europe, Scandinavia, or Iberia) we also see close clusters, and nevertheless observe historically documented radical ethnolinguistic changes, as well as many different subsequent genetic inflows and founder effects, that have significantly altered the anthropological picture in these regions, so it could very well be that the lineages we find in ancient samples do not correspond to modern West Slavic lineages, or even similar ancient and modern lineages could show a radical cultural discontinuity (as is likely the case in this to-and-from-the-steppe migration scheme).

bronze-age-tollense-battle
Diachronic map of migrations in Europe ca. 1250-750 BC.

Since we are going to see signs of both – west and east admixture – in early Slavic communities near the steppe, and the distribution from South, West, and East Slavs will include a wide “cloud” connecting Central, East, and South-East Europe, as it is evident already from early Germanic samples, it may be interesting to shift our attention to the Tollense valley and Lusatian samples, and their predominant Y-DNA haplogroups. Once again, tracking male-driven migrations from Central Europe to the Baltic region and the steppe, and back again to much of Central and South Europe, will determine which groups expanded this eastern NWIE dialect initially and in later times.

Since Baltic and Slavic languages are attested quite late, genetics is likely to help us select among the different available models for Balto-Slavic, although (it is worth repeating it) these lineages may not be the same that later expanded each dialect.

NOTE. Bronze and Iron Age samples might begin to depict the true Balto-Slavic migration map. Apart from the strong differences in the satemization processes seen among Baltic, Slavic, and Indo-Iranian, from an archaeological point of view the geographic location of the earliest attested Baltic languages and the prehistoric developments of the region seem to me almost incompatible with a homeland in the steppe. Anyway, in the worst-case scenario – for those of us who work with Balto-Slavic to reconstruct North-West Indo-European – there is consensus that there must have been an eastern North-West Indo-European language (which some would call Temematic), whose common traits with Germanic and Italo-Celtic we use to reconstruct their parent language. The question remains thus mostly theoretical, of limited pragmatic use for the reconstruction.

The third way: Baltic Late Neolithic

I have referred to Kristiansen and his group’s position regarding Corded Ware as Indo-European as flawed before. While their latest interpretation (and language identification) was wrong, Kristiansen’s original idea of long-lasting contacts in the Dnieper-Dniester region with the area occupied by late Trypillia developing a Proto-Corded Ware culture was probably right, as we are seeing now.

New data in Mittnik et al. 2018 show some interesting early Late Neolithic samples from the Baltic region – Zvejnieki, Gyvakarai1 (R1a-Z645) and Plinkaigalis242 – , proving what I predicted: that elevated steppe ancestry and R1a-Z645 subclades would be found in the Dnieper-Dniester region unrelated to the Yamna expansion, and, it seems, to migrants of the Corded Ware A-horizon.

Funnily enough, this shows that there were probably ancient interactions in the region, as originally asserted by Kristiansen, and probably following some of Victor Klochko’s proposed exchange paths, but earlier than predicted by him.

Nevertheless, linguist Guus Kroonen (from Kristiansen’s workgroup) issued a quick response to O&M 2018 in yet another twist of his agricultural substrate theory, changing Corded Ware from the vector to a vector of expansion of Late Proto-Indo-European languages (thus following again strictly Gimbutas’ outdated model), which thus fails to tackle the main inconsistencies of their previous models, as shown now with the latest paper on South Asian migrations. As I said, they were always one step behind Anthony, and they still are.

Funny also how Anthony, too – like Kristiansen – , may have been right all along since 2007, in proposing that Corded Ware (the nuclear Corded Ware migrants) stemmed from the Dnieper-Dniester region roughly at the same time as Yamna migrants expanded west, and that they did not have any direct genetic connection (in terms of migrations) with each other.

neolithic_steppe-anatolian-migrations
Most likely Pre-Proto-Anatolian migration with Suvorovo-Novodanilovka chiefs in the North Pontic steppe and the Balkans.

Both researchers, who have collaborated with the latest genomic research, remade their models to fit it, and now have to revise their most recent proposals in light of the new data. Each new paper is thus influenced by their pressure to have been right in their previous models, while new genomic data compel them to change their theories under the pressure not to be too wrong again, in a strange vicious circle. Had they remained silent and committed to their archaeological theories, they could have been right all along, each one in their own way.

NOTE. BTW, in case you see ad hominem here too, I feel compelled to say that only thanks to their commitment to disentangle the truth about ancient migrations, and their readiness to collaborate with genetic research – unlike many others in their field – we know today what we know. If they have been wrong many times, it is because they have tried to connect the genetic dots as they were told. Only because of their readiness to explore their science further they should be praised by all. But, again, that does not mean that they cannot be wrong in their models…

Thanks to Anthony’s latest change of mind, we don’t have to hear the “cultural diffusion” argument anymore, and I consider this a great advance for the field.

NOTE. Not that there could not be prehistoric cultural diffusion events of language (i.e. not accompanied by genetic admixture), of course, but such theories, almost impossible to disprove, probably need much more than a simple “patron-client relationship” proposal and anthropometry to justify them, in a time when we will be able to see almost every meaningful personal exchange in Genomics…

Today – since the finding of Ukraine_Eneolithic sample I6561, of haplogroup R1a-Z93, dated ca. 4200 BC, and likely from the Sredni Stog culture – it seems more likely than ever that the expansion of R1a-Z645 subclades was in fact associated with the spread of steppe admixture probably near the North Pontic forest-steppe region, most likely from the Dnieper-Dniester or Upper Dniester region.

The appearance of a ‘late’ Z93 subclade already at such an early date, with steppe admixture, makes it still more likely that the Proto-Corded Ware culture, from where Corded Ware migrants of R1a-Z645 lineages later spread, was probably associated with this wide region.

In a parallel but unrelated migration, as it is now clear, steppe admixture also expanded with Yamna settlers of R1b-L23 lineages into the North Pontic steppe – from the North Caspian steppe, where it had developed previously as the Khvalynsk and (likely) Repin cultures –, roughly at the same time as Proto-Corded Ware expanded to the north, ca. 3300-3000 BC, and then expanded to the west into the Balkans (contributing to the formation of Balkan EBA cultures, and to the East Bell Beaker group).

NOTE. A migration of Yamna settlers northward along the Prut dated ca. 3000 BC or later could have justified the appearance of steppe admixture in the Dnieper-Dniester region, as I proposed for the Zvejnieki sample, although dates from Baltic samples are likely too early for that. For this to be corroborated, migrants should be accompanied up to a certain region by R1b-L23 lineages, and this could mean in turn a revival of Anthony’s original model of cultural diffusion of 2007. The most likely scenario, however, as predicted by Heyd, given the early appearance of steppe admixture and R1a-Z93 subclades in the forest-steppe during the 5th millennium, is that the admixture happened much earlier than that, fully unrelated to Late PIE migrations.

indo-european-yamna-corded-ware
Diachronic map of Copper Age migrations in Europe ca. 3100-2600 BC

The modern Baltic and Slavic conundrum

As for some people of Northern European ancestry previously supporting a bulletproof Yamna (R1a/R1b) -> Corded Ware migration that was obviously wrong; now supporting different Sredni Stog -> Corded Ware groups representing Indo-Slavonic (and Germanic??) in a model that is clearly wrong: how are these attempts different from Western Europeans supporting the autochthonous continuity of R1b-P312 lineages against all recent data, from Indians supporting the autochthonous continuity of R1a-M417 lineages no matter what, and from the more recent trend of autochthonous continuity theories for N1c lineages and Uralic in Eastern Europe?

Modern Germanic-speaking peoples can trace their common language to Nordic Iron Age Proto-Germanic, Celts to La Tène’s expansion of Proto-Celtic, and Romance speakers to the Roman expansion (and to an earlier Proto-Italic), all three dating approximately to the Iron Age. Proto-Slavic is dated much later than that, and probably Proto-Baltic too (or maybe earlier depending on the dialectal proposal), with Balto-Slavic being possibly coeval with Pre-Proto-Germanic and Italo-Celtic, but probably slightly later than that. Also, the language ancestral to Slavic may be (like a theoretical Proto-Romance language) impossible to reconstruct with precision, due to multiple substrate (or superstrate?) influences on the wide territory where Proto-Slavic formed and expanded from, in close alliance with steppe communities of different ethnolinguistic backgrounds.

We know that proto-historic Germanic, Celtic, and Italic peoples spread from relatively small regions, and had almost nothing to do with historic groups speaking their daughter languages, let alone modern speakers. Baltic and Slavic are not different.

NOTE. We have read that Weltzin samples clustered closely to Central Europeans (especially Austrians), and at a certain distance from modern Poles. That’s the conclusion of Sell’s PhD thesis, and it may be right, if you take only modern samples for comparison. However, if you have read or thought that they represented some kind of “ancestral Germanic vs. Slavic” battle, please imagine Trump’s voice for my opinion: Wrroonng, wrroonng, wrroonng. They cluster closely with Bell Beaker migrants, Poland BA, and Únětice (in this order), which we now know thanks to the data from O&M 2018 and Mittnik et al. 2018. And we also know who they don’t cluster close to: Corded Ware and Trzciniec samples. Therefore, people from the region near the most likely homelands of Pre-Proto-Germanic and Proto-Balto-Slavic are – as expected – likely descendants of Bell Beaker migrants in Central Europe. The genetic relationship of those ancient samples to modern inhabitants of Central-East Europe? Not obvious – at all.

tollense-welzin
PCA of samples from Tollense Valley battlefield and some ancient and modern samples.

We also know (and have known for a long time, well before these recent papers) that the oldest attested Indo-European languages – Mycenaean, early Anatolian languages, and Indo-Aryan (through certain words in Mitanni inscriptions) – do not show continuity from the places where they were first attested to the Late and Middle Proto-Indo-European (steppe) homeland either. There should be no problem then in accepting that there is no linguistic, archaeological, or common sense reason to support that Balto-Slavic is older or shows more regional continuity than other IE languages from Europe.

NOTE. Oh yes, Balts saying “Baltic is the most similar language to PIE” I hear you thinking? Uh-huh, sure. And according to some Greeks (supported e.g. by the conclusions from Lazaridis et al. 2017) Mycenaeans were ‘autochthonous’, and Proto-Greek the most similar to PIE. For many Hindus, Vedic Sanskrit is in fact PIE, and the latest paper by Narasimhan et al. (2018) only reinforces this idea (don’t ask me why). Also, Caucasian scholar Gamkrelidze (with Ivanov) supported the origin of the language precisely in the Caucasus, with Armenian being thus the purest language. For Italian fans of Virgil and the Roman Empire, Latin (like Aeneas) comes from Anatolian linguistically and genetically, hence it must be the ‘oldest’ IE dialect alive… No, wait, Danish scholars Kroonen and Iversen quite recently asserted that Germanic is the oldest to branch off, then it should thus be nearest to PIE! I think you can see a pattern here… And don’t forget about the new Vasconic-Uralic hypotheses going on now, with Vasconic fans of R1b changing from Palaeolithic to Mesolithic, and now to European Neolithic and whatnot, or Uralic fans of N1c changing now from Mesolithic EHG to Siberia (for ancestry) or Central Asia (for N1c subclades), or whatever is necessary to believe in ‘continuity’ of their people following the newest genetic papers… Just pick whatever theory you want, call it “mainstream”, and that’s it.

So, if there is no reliable archaeological model connecting Bronze or Iron Age cultures to Eastern European cultures which are supposed to represent the Proto-Slavic and Proto-Baltic homelands…why on earth would any reasonable amateur (not to speak about scholars) dare propose any sort of genetic or linguistic continuity for thousands of years from PIE to early Slavs, a people whose first blurry appearance in historical records happened during the Middle Ages in rather turbulent and genetically admixed regions? It does not make any sense, and it had all odds against it. Blond hair, blue eyes, lactase persistence? Sure, and ABO group, brachycephaly, anthropometry… All very scientifish.

antiquity_classical_Europe_przeworsk
Diachronic map of migrations during Classical Antiquity in Europe 250 BC – 250 AD.
Where’s Proto-Slavic Wally?

Wrap-up

Human ancestry can only help refine solid academic theories, it cannot create one. Every new pet theory used to satisfy modern cultural pre- and misconceptions has failed, and it will fail again, and again, and again…

To have one’s own anthropological model of prehistoric migration requires time and study. It is not enough to play with software and to misuse traditional academic disciplines just to ‘prove’ some completely irrelevant, meaningless, and false continuity.


Proto-Indo-European homeland south of the Caucasus?

User Camulogène Rix at Anthrogenica posted an interesting excerpt of Reich’s new book in a thread on ancient DNA studies in the news (emphasis mine):

Ancient DNA available from this time in Anatolia shows no evidence of steppe ancestry similar to that in the Yamnaya (although the evidence here is circumstantial as no ancient DNA from the Hittites themselves has yet been published). This suggests to me that the most likely location of the population that first spoke an Indo-European language was south of the Caucasus Mountains, perhaps in present-day Iran or Armenia, because ancient DNA from people who lived there matches what we would expect for a source population both for the Yamnaya and for ancient Anatolians. If this scenario is right the population sent one branch up into the steppe-mixing with steppe hunter-gatherers in a one-to-one ratio to become the Yamnaya as described earlier- and another to Anatolia to found the ancestors of people there who spoke languages such as Hittite.

The thread has since logically become a trolling hell, and it seems not to be working right for hours now.

Reich’s proposal based on ancestral components to explain the formation of a people and language is a continuation of their emphasis on ancestry to explain cultures and languages. It seems quite interesting to see this happen again, given their current trend to surreptitiously modify their previous ‘Yamnaya ancestry’ concept and Yamnaya millennia-long R1a-R1b community (that supposedly explains a Yamna -> Corded Ware -> Bell Beaker migration) to a more general ‘steppe people’ sharing a ‘steppe ancestry’ who spoke a ‘steppe language’.

steppe-ancestry
Interesting arrows of dispersal of steppe ancestry, from Yamna -> Corded Ware -> Bell Beaker, from David Reich’s new book (yes, from 2018, number one bestseller in Amazon.com).

This new idea based on ancestral components suffers thus from the same essential methodological problems, which equate it – yet again – to pure speculation:

  1. It is a conclusion based on the genomic analysis of few individuals from distant regions and different periods, and – maybe more disturbingly – on the lack of steppe ancestry in the few samples at hand.
  2. Wait, what? Steppe ancestry? So they are trying to derive potential genetic connections among specific prehistoric cultures with a poorly depicted genetic sketch, based on previous flawed concepts (instead of on anthropological disciplines), which seems a rather long stretch for any scientist, whether they are content with seeing themselves as barbaric scientific conquerors of academic disciplines or not. In other words, statistics is also science (in fact, the main one to assert anything in almost any scientific field), and you cannot overcome essential errors (design, sampling, hypothesis testing) merely by using a priori correct statistical methods. Results obtained this way constitute a statistical fallacy.

  3. Even if the sampling and hypothesis testing were fine, to derive anthropological models from genomic investigation is completely wrong. Ancestral component ≠ population.
  4. To include not only potential migrations, but also languages spoken by these potential migrants? It’s sad that we have a need to repeat it, but if ancestral component ≠ population, how could ancestral component = language?

The Proto-Indo-European-speaking community

This is what we know about the formation of a Proto-Indo-European community (i.e. a community speaking a reconstructible Proto-Indo-European language) in the Pontic-Caspian steppe, which is based on linguistic reconstruction and guesstimates, tracing archaeological cultures backwards from cultures known to have spoken ancient (proto-)languages, and helping both disciplines with anthropological models (for which ancient genomics is only helping select certain details) of migration or – rarely – cultural diffusion:

NOTE. The following dates are obviously simplified. Read here a more detailed linguistic assessment based on phonology.

neolithic_steppe-anatolian-migrations
Most likely Pre-Proto-Anatolian migration with Suvorovo-Novodanilovka chiefs in the North Pontic steppe and the Balkans.
  • ca. 5000 BC. Early Proto-Indo-European (or Indo-Uralic) spoken probably during the formation and development of a loose Early Khvalynsk – Sredni Stog I cultural-historical community over the Pontic-Caspian steppe region, whose indigenous population probably had mainly Caucasus hunter-gatherer ancestry.
  • ca. 4500 BC. Khvalynsk probably speaking Middle Proto-Indo-European expands, most likely including Suvorovo-Novodanilovka chiefs into the North Pontic steppe, and probably expanding R1b-M269 lineages for the first time.
  • ca. 4000 BC. Separated communities develop, including North Pontic cultures probably gradually dominated by R1a-Z645 (potentially speaking Proto-Uralic); and Khvalynsk (and Repin) cultures probably dominated by R1b-L23 lineages, most likely developing a Late Proto-Indo-European already separated from Proto-Anatolian.
  • ca. 3500 BC. A Proto-Corded Ware population dominated by R1a-Z645 expands to the north, and slightly later an early Yamna community develops from Late Khvalynsk and Repin, expanding to the west of the Don River, and to the east into Afanasevo. This is most likely the period of reduction of variability and expansion of subclades of R1a-Z645 and R1b-L23 that we expect to see with more samples.
  • ca. 3000 BC. Expansion of Corded Ware migrants in northern Europe, and Yamna migrants along the Danube and into the Balkans, with further reduction and expansion of certain subclades.
  • ca. 2500 BC. Expansion of Bell Beaker migrants dominated by R1b-L51 subclades in Europe, and late Corded Ware migrants in east Yamna expanding R1a-Z93 subclades.

All these events are compatible with language reconstruction in mainstream European schools since at least the 1980s, are supported by traditional archaeological research of the past 20 years, and are being confirmed with Genomics.

For those willingly lost in a myriad of new dreams boosted by the shallow comment contained in David Reich’s paragraph on CHG ancestry, even he does not doubt that the origin of Late Proto-Indo-European lies in Yamna, to the north of the Caucasus, based on Anthony’s (2007) account:

yamnaya-migrations-reich
Both images from the book, posted by Twitter user Jasper at https://twitter.com/jaspergregory.

NOTE: By the way, David Anthony, one of the main sources of information for Reich’s group, never considered Corded Ware to have received Yamna migrants, and although he changed his model due to the conclusions of the 2015 papers, he has recently changed his model again to adapt it to the inconsistencies found in phylogeography.

CHG ancestry and PIE homeland south of the Caucasus

As for the potential origins of CHG ancestry in early Proto-Indo-European speakers, I already stated clearly my opinion quite recently. They may be attributed to:

Just to be clear, an expansion of Proto-Anatolian to the south, through the Caucasus, cannot be discarded today. It will remain a possibility until Maykop and more Balkan Chalcolithic and Anatolian-speaking samples are published.

However, an original Early Proto-Indo-European community south of the Caucasus seems to me highly unlikely, based on anthropological data, which should drive any conclusion. From what I could read, here are the rather simplistic arguments used:

  • Gimbutas and Maykop: Maykop was thought to be (in Gimbutas’ times) a rather late archaeological culture, directly connected to a Transcaucasian Copper Age culture ca. 2400-2300 BC. It has been demonstrated in recent years that this culture is substantially older, and even then language guesstimates for a Late PIE / Proto-Anatolian would not fit a migration to the north. While our ignorance may certainly be used to derive far-fetched conclusions about potential migrations from and to it, using Gimbutas (or any archaeological theory until the 1990s) today does not make any sense. Still less if we think that she favoured a steppe homeland.

NOTE. It seems that the Reich Lab may have already access to Maykop samples, so this suggested Proto-Indo-European – Maykop connection may have some real foundation. Regardless, we already know that intense contacts happened, so there will be no surprise (unless Y-DNA shows some sort of direct continuity from one to the other).

  • Gamkrelidze & Ivanov: they argued for an Armenian homeland (and are thus at the origin of yet another autochthonous continuity theory), but they did so to support their glottalic theory, i.e. merely to support what they saw as favouring their linguistic model (with Armenian being the most archaic dialect). The glottalic theory is supported today – as far as I know – mainly by Kortlandt, Jagodziński, or (Nostraticist) Bomhard, but even they most likely would not need to argue for an Armenian homeland. In fact, their support of a Graeco-Aryan group (also supported by Gamkrelidze & Ivanov) would be against this, at least in archaeological terms.
  • Colin Renfrew and the Anatolian homeland: This conceptual umbrella of language spreading with farming everywhere has changed so much and so many times in the past 20 years, with so many glottochronological and archaeological estimates circulating, that you can support anything by now using them. Mostly used today for abstract models of long-lasting language contacts, cultural diffusion, and constellation analogies. Anyway, he strives to keep up-to-date information to revise the model, that much is certain:
Science Magazine
“A first line of evidence comes from linguistic analysis based on quantitative lexical data, which returned a tree compatible with the Anatolian hypothesis
  • Glottochronology, phylogenetic trees, Swadesh list analysis, statistical estimates, psychics, pyramid power, and healing crystals: no, please, no.

In principle, unlike many other recent autochthonous continuity theories, I doubt there can be much racial-based opposition anywhere in the world to an origin of Proto-Indo-European in the Middle East, where the oldest civilizations appeared – apart, obviously, from modern Northeast and Northwest Caucasian, Kartvelian, or Semitic speakers, who may in turn have to revisit their autochthonous continuity theories radically…

Nevertheless, it is obvious that prehistoric (and many historic) migrations are signalled by the reduction in variability and expansion of certain Y-DNA haplogroups, and not just by ancestral components. That is generally accepted, although the reasons for this almost universal phenomenon are not always clear.

In fact, Proto-Anatolian and Common Anatolian speakers need not share any ancestral component, PCA cluster, or any other statistical parameter related to steppe populations, not even the same Y-DNA haplogroups, given that approximately three thousand years might have passed between their split from an Indo-Hittite community and the first attested Anatolian-speaking communities…We must carefully follow their tracks from Anatolia ca. 1500 BC to the steppe ca. 4500 BC, otherwise we risk creating another mess like the Corded Ware one.

In my opinion, the substantial contribution of EHG ancestry and R1a-M417 lineages to the Pontic-Caspian steppe (probably ca. 6500 BC) from Central or East Eurasia is the most recent sizeable genomic event in the region, and thus the best candidate for the community that expanded a language ancestral to Proto-Indo-European – whether you call it Pre-Proto-Indo-European, Pre-Indo-Uralic, or Eurasiatic, depending on your preferences.

An early (and substantial) contribution of CHG ancestry in Khvalynsk relative to North Pontic cultures, if it is found with new samples, may actually be a further proof of the Caucasian substrate of Proto-Indo-European proposed by Kortlandt (or Bomhard) as contributing to the differentiation of Middle PIE from Uralic. Genomics could thus help support, again, traditional disciplines in accepting or rejecting academic controversial theories.

Conclusion

In the case of an Early PIE (or Indo-Uralic) homeland, genomic data is scarce. But all traditional anthropological disciplines point to the Pontic-Caspian steppe, so we should stick to it, regardless of the informal suggestion written by a renowned geneticist in one paragraph of a book conceived as an introduction to the field.

It seems we are not learning much from the hundreds of peer-reviewed, statistically (superficially, at least) sound genetic papers whose anthropological conclusions have been proven wrong by now. A lot of people should be spending their time learning about the complex, endless methods at hand in this kind of research – not just bioinformatics – , instead of fruitlessly speculating about wild unsubstantiated proposals.

As a final note, I would like to remind some in the discussion, who seem to dismiss the identification of CHG with Proto-Indo-European by supporting a “R1a-R1b” community for PIE, of their previous commitment to ancestral components in identifying peoples and languages, and thus of their support for Reich’s (and his group’s) fundamental premises.

You cannot have it both ways. At least David Reich is being consistent.


Stone Age plague accompanying migrants from the steppe, probably Yamna, Balkan EBA, and Bell Beaker, not Corded Ware

copper-age-late-bell-beaker

In the latest revisions of the Indo-European demic diffusion model, using the results from the article Early Divergent Strains of Yersinia pestis in Eurasia 5,000 Years Ago, by Rasmussen et al., Cell (2015), I stated (more or less indirectly) that the high east-west mobility of the Corded Ware migrants across related cultures might have been responsible for the spread of this disease, which seems to have been originally expanded from Central Eurasia.

New results appeared recently in the article The Stone Age Plague and Its Persistence in Eurasia, by Valtueña et al., Current Biology (2017), which may contradict that interpretation.

copper-age-early_yamna-corded-ware
Early Yamna and Corded Ware communities and their migrations ca. 3000 BC onwards.

Abstract:

Yersinia pestis, the etiologic agent of plague, is a bacterium associated with wild rodents and their fleas. Historically it was responsible for three pandemics: the Plague of Justinian in the 6th century AD, which persisted until the 8th century [ 1 ]; the renowned Black Death of the 14th century [ 2, 3 ], with recurrent outbreaks until the 18th century [ 4 ]; and the most recent 19th century pandemic, in which Y. pestis spread worldwide [ 5 ] and became endemic in several regions [ 6 ]. The discovery of molecular signatures of Y. pestis in prehistoric Eurasian individuals and two genomes from Southern Siberia suggest that Y. pestis caused some form of disease in humans prior to the first historically documented pandemic [ 7 ]. Here, we present six new European Y. pestis genomes spanning the Late Neolithic to the Bronze Age (LNBA; 4,800 to 3,700 calibrated years before present). This time period is characterized by major transformative cultural and social changes that led to cross-European networks of contact and exchange [ 8, 9 ]. We show that all known LNBA strains form a single putatively extinct clade in the Y. pestis phylogeny. Interpreting our data within the context of recent ancient human genomic evidence that suggests an increase in human mobility during the LNBA, we propose a possible scenario for the early spread of Y. pestis: the pathogen may have entered Europe from Central Eurasia following an expansion of people from the steppe, persisted within Europe until the mid-Bronze Age, and moved back toward Central Eurasia in parallel with human populations.

plague_phylogeny_eurasia
Maximum-Likelihood Tree and Percent Coverage Plot of Virulence Factors of Yersinia pestis. (A) Maximum-likelihood tree of all Yersinia pestis genomes, including 1,265 SNP positions with complete deletion. Nodes with support ≥95% are marked with an asterisk. The colors represent different branches in the Y. pestis phylogeny: branch 0 (black), branch 1 (red), branch 2 (green), branch 3 (blue), branch 4 (orange), and LNBA Y. pestis branch (purple). Y. pseudotuberculosis-specific SNPs were excluded from the tree for clarity of representation. In the light-colored boxes, discussed losses and gains of genomic regions and genes are indicated.

It seems that, notwithstanding the simplistic (white) arrows of steppe ancestry expansion shown in their map (see below), the actual expansion of Yersinia pestis might have in fact accompanied Yamna migrants from the Pontic-Caspian steppe into Early Bronze Age cultures from the Balkans, including Bell Beaker migrants, as the phylogenetic analysis and dates suggest – and as the potential arrows of the plague expansion in the map (in green) show.

Late Corded Ware migrants would have only later expanded the disease to eastern Europe, as shown in the second map, most likely because of their close contact with Bell Beaker migrants (but remaining culturally distinct from them), and indeed because of the mobility across related Corded Ware cultures up to the Urals.

The cultural-historical community in the Late Neolithic between steppe peoples that would evolve into Uralic-speaking Sredni Stog/Corded Ware migrants in the western steppe, and Late Indo-European-speaking Yamna/SE EBA/Bell Beaker migrants originally from the eastern steppe, would allow for the spread of the disease first among steppe groups, and then from both distinct late groups into their respective expanded regions.

The phylogenetic tree of Y. pestis available right now (see above), however, seems to suggest a stronger initial link to Yamna migrants, i.e. an origin in the North Caspian steppe, and an expansion with Yamna into the north Pontic area, into the Caucasus, and with the Afanasevo culture, spreading later with Balkan EBA cultures and the expansion of Bell Beaker peoples.

Rather than the warring nature, close ties, and mobility of Corded Ware peoples (reasons I used to justify the rapid spread of the disease among CWC groups), I guess the factors that helped expand the disease were the higher population density of SE Europe compared to the regions north of the loess belt, and the greater admixture of Yamna migrants with native SE European populations.

plague-expansion-europe
Map of Proposed Yersinia pestis Circulation throughout Eurasia (A) Entrance of Y. pestis into Europe from Central Eurasia with the expansion of Yamnaya pastoralists around 4,800 years ago. (B) Circulation of Y. pestis to Southern Siberia from Europe. Only complete genomes are shown.

Nevertheless, lacking more data, it is unclear if the disease expanded with both steppe groups.

The Aryan migration debate, the Out of India models, and the modern “indigenous Indo-Aryan” sectarianism

indus-valley-early-harappan

The Proto-Indo-European Urheimat

Not long ago, the Proto-Indo-European language Urheimat problem used to be cyclic in nature: linguistic and archaeological publications appeared supporting a Copper Age migration from the steppe proposed by Marija Gimbutas, or a Neolithic expansion from Anatolia (or Armenia) proposed by Colin Renfrew, and back again.

I have always supported the simpler, more recent Chalcolithic migration of Late Indo-Europeans from the Pontic-Caspian steppe over an older Neolithic expansion from Anatolia with agriculture. The latter model implied a complex cultural diffusion over a greater span of time than is warranted by linguistic guesstimates, understood as the general grasp that anyone can have on how much a language changes in time, comparing the different stages of different Indo-European languages. Whether they like to talk about it or not, or whether they would describe them as such (or else as terminus ante or post quem), most known linguists and archaeologists involved in Indo-European studies have published at some point their own guesstimates.

To have an idea about how guesstimates work, you only have to learn some Indo-European languages from different branches, the ancient languages from which they are derived, how they have evolved from them through time, and their proto-languages, to see how unlikely it is that the differences from Late Indo-European to Proto-Greek, Proto-Indo-Iranian, Proto-Celtic, or Proto-Italic need a leap of ca. 3000 years almost without change, as required by the Anatolian hypothesis. Some react strongly against guesstimates, arguing that you cannot compare historic or proto-historic changes to prehistoric ones, in order to support a different rate of linguistic change from Proto-Indo-European to the proto-languages. I find this to be a sound criticism, but it is often used to justify a worse, ad hoc estimate that supports another theory.

Glottochronology – in case you are looking for mathematics or statistics to solve the problem – is as useless today as it always was. Not everything – in fact few things in anthropology – can be solved with algorithms and statistics. I do love algorithms and statistics, because their results – if based on sound assumptions – are hard to contest, but not a single good one has been proposed for comparative grammar, as far as I know.

Algorithms solve everything

Steppe hypothesis

The steppe hypothesis was always the simpler connection with modern Indo-European languages, from a linguistic and archaeological point of view, and archaeogenetics (since the advent of haplogroup investigation, and the finding of modern R1a distribution) did also support it. However, it implied a conquest by warring patrilocal peoples, who replaced the ‘original’ Neolithic European and Asian population and languages, and invasions have not been a fashionable anthropological subject for a long time.

One of the consequences of the genocidal racism and xenophobia seen during World War II was the strong reaction to its ideological foundations, and there was a common will to put an end to Kossinna’s trend of historic ethnolinguistic identification of modern peoples. Linguistics and archaeology did then search for more complex models of human relations and exchange, mostly to avoid what appeared as simplistic concepts of migration or invasion. Marija Gimbutas’ simplistic kurganist, male-driven invasion of territories inhabited by matrilocal Old Europeans, albeit reasonable, did not fit well with these post-war times. One could accept historic and proto-historic atrocities and genocide by any people against others, and even tribal conflicts between prehistoric hunter-gatherers that ended in the destruction of one of them, but a violent, massive spread of ‘Aryans’ was considered a dangerous idea to be avoided.

Thanks to the effort of David Anthony (among others) in supporting migration models in Archaeology, the steppe model did have a strong revival even before archaeogenetics began to be a thing in anthropological research.

Anatolian hypothesis

The Anatolian hypothesis, on the other hand, seemed like a fine, long evolution of a language accompanying the peaceful spread of a technological innovation, farming and cattle herding. Originally believed to be mostly a cultural diffusion (now it has been demonstrated to be a mixed diffusion event, with strong demic diffusion in its early phase), it was thus in line with a more politically correct view of prehistoric events.

This cultural diffusion gave in turn way to more peaceful and innovative solutions to language spread, like waves of expansion, or a constellation of languages influencing each other for long periods, so that even the potential reconstruction of a single Proto-Indo-European language or people was doubted. Prehistoric friendly neighbours would have adopted farming and exchanged goods and languages for thousands of years, and only with proto-historic events did people have ethnolinguistic identification that caused conflicts…

While Mathieson et al. (2017) have recently expressed some doubts about the steppe hypothesis regarding Proto-Anatolian, it is likely that the lack of enough ancient DNA from the Balkans and Anatolia is the key factor here.

An interesting linguistic proposal, the glottalic theory, while sound in its assumptions and results – much less likely in my opinion than the more common two-dorsal theory, and this much more likely than the prevalent three-dorsal one – gave some theoretical support to the Anatolian (or Armenian) hypothesis, since some proponents felt that a glottalic Proto-Indo-European should have an origin near to the Armenian homeland – because glottalic Proto-Armenian would have retained a phonetic state nearer to the “original” Proto-Indo-European.

That simplistic regional continuity explanation is akin to the trend of Basque researchers to discover links of Proto-Basque with the Pyrenees in Mesolithic and Palaeolithic times, when there is no data to warrant such identifications – and it seems in fact that Proto-Basque, Proto-Iberian, and Palaeo-Sardinian might have accompanied the expansion of farming in the Neolithic. Most of the remaining proponents of the glottalic theory today (like Frederik Kortlandt and Allan Bomhard) would probably accept a steppe migration unrelated to an Armenian or Anatolian origin.

Marginal proposals

There were indeed other marginal proposals, with people supporting origins of Proto-Indo-European in both ends of the current distribution of Indo-European languages, from the “Indo-” in Out of India theories, to the “-European” in Eurocentric proposals. Most Eurocentric proposals – based on certain archaeological cultures and their evolution in- and outside Europe – have been dismissed with archaeological and genetic research, and the remaining ones usually favour the more fashionable peaceful spread of languages.

Palaeolithic Continuity Theory

A small group in support of the more recent Palaeolithic Continuity Theory remains. It seems to me as deeply flawed from a linguistic point of view (with a much larger time span needed than for a Neolithic expansion), but their arguments are led by research on genetics and archaeology, and not much is left for European romanticism, so it has always appeared to me as a professionally acceptable – although futile – attempt by eccentric researchers to disentangle prehistoric events.

Similar to what happens with proponents of the Anatolian hypothesis, new linguistic, archaeological, and genetic research is used to remake PCT models – instead of just dismissing them – so it is likely that we will have many different proposals of stepped population movements that will make both models eventually converge with the steppe migration theory, to the point where only the steppe migration theory remains, with some added details on its most ancient origin. I guess sometimes it is difficult to let (part of) your life’s research just go away without fighting for some recognition… You desperately look for a pat on the back from some colleagues, even out of pity, who will tell you ‘it seems you might have been right in some details, after all!’…

Out of India

The Out of India theory is the name given to a group of (mostly) independent models that usually propose a Proto-Indo-European homeland based on or around India. Contrary to the PCT, an Out of India theory set during the Mesolithic or Neolithic would be feasible from a linguistic point of view: you could somehow connect some archaeological migrations to support an east-to-west (and northward) spread of Early Proto-Indo-European-speaking R1a lineages, and genetically it had support in some papers on the modern distribution of R1a subclades, for example in Underhill et al. (2014). Underhill himself has since questioned his conclusion in view of recent papers publishing ancient DNA analysis.

Out of India theories, overall, could thus be as strong (or as weak) as the theories concerning an Anatolian origin, in their potential for explanation of the ancient origin of the Proto-Indo-European language spoken in the steppe during the Neolithic and Chalcolithic. However feasible they might a priori be, I have yet to encounter a decent modern paper with that kind of proposal, based on recent genetic papers. Most modern articles are just Indian nationalist crap, and the only decent papers on this matter are becoming quite old for this relatively young field of Indo-European studies. Maybe that’s because I don’t have enough time to look for the hidden good anthropological papers among so much dirt. After all, it is not a very likely theory, and one has a limited amount of time.

In recent papers, if you get rid of simplistic reactionary and revisionist views, conservative Indo-Aryan Hindu nationalist or religious bigotry, fantastic connections with the Indus Valley civilization, and simplistic identifications of Proto-Indo-European as ‘nearer’ to Vedic Sanskrit – with absurdly old and odd references to Schleicher’s reconstruction and dialectal Indo-Slavonic or Satem references -, you are left at best with some basic criticisms of Eurocentrism and the known shortcomings of anthropological disciplines in investigating Proto-Indo-European Urheimat, but no data to support any connection with India whatsoever.

If there is a reason for a generalised inferiority complex in India, I would find it in the shameless publication and popularity of such worthless research papers, a trend that is also seen in scientific fields, with Indian researchers having an increasingly tough time passing editorial and peer reviews, and thus resorting to national journals. In the case of Indo-European studies, instead of trying to fit data with what we know, the only aim in Indian research seems to be to connect the Indus Valley with Proto-Indo-European, and Proto-Indo-European with a “pure” (i.e. Vedic) Indo-Aryan, to support a mythological Indo-Aryan Hinduist India. And that is mostly what you will find in any Out of India article today, whether based on linguistic, archaeological, or – what is prevalent today – genetic investigation.

This has been The Out of India Controversy Week: it began last week with the publication of a quite decent article in The Hindu by Tony Joseph summing up the current situation of anthropological research. It was followed by reactions in conservative Indian news, and this in turn was contested by Davidski and Razib Khan. The original article by Tony Joseph has been echoed by Victor Mair in Language Log, and I agree with his description of Joseph’s article as “informed, sensitive, balanced, and nuanced. This is responsible science journalism”, even if I disagree with some of his statements (in a different way than Mr. Mair). However, this propaganda disguised as scientific criticism is what you get from Indian nationalists.

EDIT (25/6/2017): Razib Khan has published a thorough post on Indian evolutionary genetics as follow-up to this week’s controversy. I think there is too much effort being invested during these controversies precisely by the people who need not explain themselves. Anyway, good summaries of anthropological matters are always welcome.

EDIT (29/6/2017): Other posts on the subject, from Brown Pundits: On the “Aryan” debate – the linguistics POV; Razib Khan’s Indian genetics, part n of many; and Aryan Migration and its Discontents.

Interestingly, any time new research comes to shake certain Indian nationalist foundations, a stronger backfire effect happens, and more criticism is directed at the shortcomings of such anthropological research. Because, indeed, if the anthropological theory is flawed, mythical Indo-Aryans spread from the Indus Valley, right…? One can only expect this kind of controversy to escalate in conservative Indian blogs and fora alike, and then de-escalate until the next paper is published. A dialectic cycle whose only evident result is the increased opposition that conservative Indian researchers – or researchers that depend on funding by such groups – will face in publishing anything related to a potential Aryan invasion, and the addition of a stronger bias in Indian research.

Western European history

It might well be because I am western European, and western Europeans tend to accept multiple invasions from the East quite well. After all, they have happened so many times in proto-historical and historical times that they are part of our ethnolinguistic nation-building lore. French people trace their history to the expansion of Celts, Romans, and Franks; Spaniards and Portuguese trace it to the spread of Celts, Ibero-Basques, Romans, and Visigoths; Italians to the expansion of Etruscans, Celts and Italics, Romans, Ostrogoths and Langobards; the English to the expansion of Celts, Angles and Saxons, Vikings, and Normans…

It often seems to me that western Europeans will romanticise their origins no matter what appears in historic and genetic investigation: if Neanderthals are unrelated to Europeans, they are ‘cavemen’; if they intermixed with our ancestors, then they suddenly become quite human in their behaviour, and it is great to have more Neanderthal admixture. If Indo-European-speaking R1a lineages invaded central Europe from the east, and transferred their languages, great, because “we” are heirs of original western European hunter-gatherers of Palaeolithic R1b lineages; if R1b lineages represent an invasion of eastern peoples speaking Late Indo-European, great too, because it means that our paternal forefathers were the ‘original’ Indo-European speakers…

This reaction (“our history is great no matter what”) seems to be a good one for research, since it allows for any change in our romantic views of the past. This, however, does not seem to be the case for some nations, and this inability to change their views is likely related to the inferiority complex that some nations have developed, in turn probably caused by western European colonialism, so one is left to wonder how responsible we are for modern chauvinist trends.

The sad future

Seeing how so many people of eastern European ancestry are convinced of an origin of R1a-M417 in Indo-European migrations from Yamna – when there is (yet?) not a single piece of evidence for it – may be just as troubling as the Indian case, or maybe more, since it affects an important part of Europe. I cannot believe that even today only western Europeans are capable of romanticising their own past no matter what, while the rest of the world lives in a quest to appropriate whatever they view as some great ancient culture, people, or language for their own ancestors.

I have already received complaints and have seen people (of Y-DNA haplogroup R1a) complain online that their forefathers cannot have been Uralic speakers, and some Uralic speakers (of haplogroup N) that original Uralic speakers cannot have been of R1a lineages. Firstly, if I were eastern European – be it Germanic, Balto-Slavic, or Uralic speaker, or a speaker of Indo-Aryan languages, of R1a or N lineage, whatever my country of origin, I like to think I would prefer to know where my forefathers actually came from, and what languages they did in fact speak thousands of years ago, even if that disrupts everything I or my fellow countrymen (wrongly) assumed for a long time. Secondly, we – as western Europeans speaking Romance or Germanic languages – have the right to know exactly how our peoples and languages really came to be, even if that means disrupting others’ dreams. Our paternal ancestors probably changed languages 3 or 4 times during their multiple migrations from the east, and were not peaceful hunter-gatherers living since the Palaeolithic in the same region we do now, as traditionally held; if we can get over this, eastern Europeans and Indians can get over it, too.

I think everyone deserves to know the truth, and they will eventually like it and fantasise about it. But many individuals want to disrupt any possible change to keep their current ethnic and nationalist agendas untouched, and that can affect us all. Nationalistic and romantic trends are understandable: Romans needed Virgil at the peak of their conquests to tell them that they had a glorious past in Troy, connecting them to the immortal Greek epics. The most important lesson one can learn from that example is that Italian researchers are still (2000 years later!) influenced by that myth, and they keep trying to look for Anatolian remains in Latin studies, and in the archaeology and evolutionary genetics of Italy. I guess you could therefore say these mythification trends are naturally human… but wasting so much time on quests for mythological identities seems absurd, and can only damage research.

It is sad to think about future generations of Indians looking for any sign to support an autochthonous Indo-Aryan homeland, while the rest of the world keeps moving in the right direction…

(Note: featured image is licensed CC-by-sa 4.0 from Avantiputra7 at Wikipedia)