Yamnaya replaced Europeans, but admixed heavily as they spread to Asia


Recent papers: The formation of human populations in South and Central Asia, by Narasimhan, Patterson et al., Science (2019), and An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers, by Shinde et al., Cell (2019).

NOTE. For direct access to Narasimhan, Patterson et al. (2019), visit this link courtesy of the first author and the Reich Lab.

I am no longer on holidays, and the amount of information in the paper is huge, with many complex issues raised rather than solved by the new samples and analyses, so I will stick to the Indo-European question, especially to some details that have changed since the publication of the preprint. For a summary of its previous findings, see the book series A Song of Sheep and Horses, in particular the sections from A Clash of Chiefs where I discuss languages and regions related to Central and South Asia.

I have updated the maps of the Prehistory Atlas, and included the most recently reported mtDNA and Y-DNA subclades. I will try to update the Eurasian PCA and related graphics, too.

NOTE. Many subclades from this paper have been reported by Kolgeh (download), Pribislav and Principe at Anthrogenica on this thread. I have checked some out for comparison, but wherever my results contradict their analyses, mine are the ones likely to be wrong. I will upload my spreadsheets and link to them from this page whenever I find the time.

Ancestry clines (1) before and (2) after the advent of farming. Colour modified from the original to emphasize the CHG cline: notice the apparent relevance of forest-steppe groups in the formation of this CHG mating network from which Pre-Yamnaya peoples emerged.

Indo-Europeans

I think the Narasimhan, Patterson et al. (2019) paper is well-balanced, and unexpectedly centered – as it should be – on the spread of Yamnaya-related ancestry (now Western_Steppe_EMBA) as the marker of Proto-Indo-European migrations, which stretched ca. 3000 BC “from Hungary in the west to the Altai mountains in the east”, spreading later Indo-European dialects after admixing with local groups, from the Atlantic to South Asia.

I. Afanasievo

I.1. East or West PIE?

I expected Afanasievo to show (1) R1b-L23(xZ2103, xL51) and (2) R1b-L51 lineages, apart from (3) the known R1b-Z2103 ones, pointing thus to an ancestral PIE community before the typical Yamnaya bottlenecks, and with R1b-L51 supporting a connection with North-West Indo-European. The presence of some samples of hg. Q pointed in this direction, too.

However, Afanasievo samples show overwhelmingly R1b-Z2103 subclades (all except for those with low coverage), all apparently under R1b-Z2108 (formed ca. 3500 BC, TMRCA ca. 3500 BC), like most samples from East Yamnaya.

This necessarily shifts the split and spread of R1b-L23 lineages to Khvalynsk/early Repin-related expansions, in line with what TMRCA suggested, and what advances by Anthony (2019) and Khokhlov (2018) on future samples from the Reich Lab suggest.

Given the almost indistinguishable ancestry between Afanasievo and Early Yamnaya, there seems to be as yet little information in population genomics to support that Pre-Tocharians were more closely related to North-West Indo-Europeans than to Graeco-Aryans, as is proposed in linguistics based on the few shared traits between them, and the lack of innovations proper to the Graeco-Aryan community.

NOTE. A new issue of Wekʷos contains an abstract from a relevant paper by Blažek on vocabulary for ‘word’, including the common NWIE *wrdʰo-/wordʰo-, but also a new (for me, at least) Northern Indo-European one: *rēki-/*rēkoi̯-, shared by Slavic and Tocharian.

The fact that bottlenecks happened around the time of the late Repin expansion suggests that we might be able to see different clans based on the predominant lineages developing around the Don-Volga area in the 4th millennium BC. The finding of Pre-R1b-L51 in Lopatino (see below), and of a Catacomb sample of hg. R1b-Z2103(Z2105-) in the North Caucasus steppe near Novoaleksandrovskij also support a star-like phylogeny of R1b-L23 stemming from the Don-Volga area.

NOTE. Interestingly, a dismissal of a common trunk between Tocharian and North-West Indo-European would mean that shared similarities between such disparate groups could be traced back to a Common Late PIE trunk, and not to a shared (western) Repin community. For an example of such a ‘pure’ East-West dialectal division, see the diagram of Mallory & Adams (2006) at the end of the post. It would thus mean a fatal blow to Kortlandt’s Indo-Slavonic group among other hypothetical groupings (remade versions of the ancient Centum-Satem division), as well as to certain assumptions about laryngeal survival or tritectalism that usually accompany them. Still, I don’t think this is the case, so the question will remain a linguistic one, and maybe some similarities will be found, with a large enough number of samples, that differentiate Northern Indo-Europeans from the East Yamna/Catacomb-Poltavka-Balkan_EBA group.

Y-chromosome haplogroups of Afanasievo samples and neighbouring groups. See full maps.

I.2. Expansion or resurgence of hg. Q1b?

Haplogroup Q1b-Y6802(xY6798) seems to be the main lineage that expanded with Afanasievo, or resurged in their territory. It’s difficult to tell, because the three available samples are family, and belong to a later period.

NOTE. I have finally put some order to the chaos of Q1a vs. Q1b subclades in my spreadsheet and in the maps. The change from ISOGG 2016 to 2017 meant that many samples reported under Q1 subclades in papers prepared during the 2017-2018 period, and which did not provide specific SNP calls, were impossible to define with certainty. By checking some of them I could determine the specific standard used.

In favour of the presence of this haplogroup in the Pre-Yamnaya community are:

  • The statement by Anthony (2019) that Q1a [hence maybe Q1b in the new ISOGG nomenclature] represented a significant minority among an R1b-rich community.
  • The sample found in a Sintashta WSHG outlier (see below), of hg. Q1b-Y6798, and the sample from Lola, of hg. Q1b-L717, are thus from other lineage(s) separated thousands of years from the Afanasievo subclade, but might be related to the Khvalynsk expansion, like R1b-V1636 and R1b-M269 are.

These are the data that suggest multiple resurgence events in Afanasievo, rather than expanding Q1b lineages with late Repin:

  • Overwhelming presence of R1b in early Yamnaya and Afanasievo samples; one Q1(xQ1b) sample reported in Khvalynsk.
  • The three Q1b samples appear only later, although wide CI for radiocarbon dates, different sites, and indistinguishable ancestry may preclude a proper interpretation of the only available family.
    • Nevertheless, ancestry seems unimportant in the case of Afanasievo, since the same ancestry is found up to the Iron Age in a community of varied haplogroups.
  • Another sample of hg. Q1b-Y6802(xY6798) is found in Aigyrzhal_BA (ca. 2120 BC), with Central_Steppe_EMBA (WSHG-related) ancestry; however, this clade formed and expanded ca. 14000 BC.
  • The whole Altai – Baikal area seems to be a Q1b-L54 hotspot, although admittedly many subclades separated very early from each other, so they might be found throughout North Eurasia during the Neolithic.
  • One Afanasievo sample is reported as hg. C in Shin (2017), and the same haplogroup is reported by Hollard (2014) for the only available sample of early Chemurchek to date, from Kulala ula, North Altai (ca. 2400 BC).

Y-chromosome haplogroups of late Afanasievo – early Chemurchek samples and neighbouring groups. See full maps.

I.3. Agricultural substrate

Evidence of continuous contacts of Central_Steppe_MLBA populations with BMAC from ca. 2100 BC on – visible in the appearance of Steppe ancestry among BMAC samples and BMAC ancestry among Steppe pastoralists – supports the close interaction between Indo-Iranian pastoralists and BMAC agriculturalists as the origin of the Asian agricultural substrate found in Proto-Indo-Iranian, hence likely related to the language of the Oxus Civilization.

Similar to the European agricultural substrate adopted by West Yamnaya settlers (both NWIE and Palaeo-Balkan speakers), Tocharian shows a few substrate terms in common with Indo-Iranian, which can be explained by contacts in different dialectal stages through phonetic reconstruction alone.

The recent Hermes et al. (2019) supports the early integration of pastoralism and millet cultivation in Central Asia (ca. 2700 BC or earlier), with the spread of agriculture to the north – through the Inner Asian Mountain Corridor – being thus unrelated to the Indo-Iranian expansions, which might support independent loans.

However, compared to the huge number of parallel shared loans between NWIE and Palaeo-Balkan languages in the European substratum, Indo-Iranians seem to have been the first borrowers of vocabulary from Asian agriculturalists, while Proto-Tocharian shows just one certain related word, with phonetic similarities that warrant an adoption from late Indo-Iranian dialects.

Y-chromosome haplogroups of Sintashta, Central Asia, and neighbouring groups in the Early Bronze Age. See full maps.

The finding of hg. (pre-)R1b-PH155 in a BMAC sample from Dzharkutan (to the west of Xinjiang), together with hg. R1b in a sample from Central Mongolia previously reported by Shin (2017), supports the widespread presence of this lineage to the east and west of Xinjiang, which means it might have become incorporated into Indo-Iranian migrants into the Xiaohe horizon, into the Afanasievo-Chemurchek-derived groups, or into the latter from the former. In other words, the Island Biogeography Theory with its explanation of founder effects might be, after all, applicable to the whole Xinjiang area, not only during the Chemurchek – Tianshan-Beilu – Xiaohe interaction.

Of course, there is no need for overly complicated models of haplogroup resurgence events in Central and South Asia, seeing how the number of hg. R1a-L657 samples (a lineage prevalent today among Indo-Aryan speakers from South Asia) among ancient Western/Central_Steppe_MLBA-related samples is exactly 0, and that many different lineages survived in the region. Similar cases of haplogroup resurgence and Y-DNA bottleneck events are also found in the Central and Eastern Mediterranean, and in North-Eastern Europe. From the paper:

[It] could reflect stronger ecological or cultural barriers to the spread of people in South Asia than in Europe, allowing the previously established groups more time to adapt and mix with incoming groups. A second difference is the smaller proportion of Steppe pastoralist–related ancestry in South Asia compared with Europe, its later arrival by ~500 to 1000 years, and a lower (albeit still significant) male sex bias in the admixture (…).

Y-chromosome haplogroups of samples from the Srubna-Andronovo and Andronovo-related horizon, Xiaohe, late BMAC, and neighbouring groups. See full maps.

II. R1b-Beakers replaced R1a-CWC peoples

II.1. R1a-M417-rich Corded Ware

Newly reported Corded Ware samples from Radovesice show hg. R1a-M417, at least some of them xZ645, ‘archaic’ lineages shared with the early Bergrheinfeld sample (ca. 2650 BC) and with the coeval Esperstedt family, hence supporting that it eventually became the typical Western Corded Ware lineage(s), probably dominating over the so-called A-horizon and the Single Grave culture in particular. On the other hand, R1a-Z645 was typical of bottlenecks among expanding Eastern Corded Ware groups.

Interestingly, it is supported once again that known bottlenecks under hg. R1a-M417 happened during the Corded Ware expansion, evidenced also by the remarkably high variability of male lineages among early Corded Ware samples. Similarly, these Corded Ware samples from Bohemia form part of the typical ‘Central European’ cluster in the PCA, which excludes once again not only the ‘official’ Esperstedt outlier I1540, but also the known outlier with Yamnaya ancestry.

NOTE. The fact that Esperstedt is closely related geographically and in terms of ancestry to later Únětice samples further complicates the assumption that Únětice is a mixture of Bell Beakers and Corded Ware, being rather an admixture of incoming Bell Beakers with post-Yamnaya vanguard settlers who admixed with Corded Ware (see more on the expansion of Yamnaya ancestry). In other words, Únětice is rather an admixture of Yamnaya+EEF with Yamnaya+(CWC+EEF).

Y-chromosome haplogroups of samples from Catacomb, Poltavka, Balkan EBA, and Bell Beaker, as well as neighbouring groups. See full maps.

On Ukraine_Eneolithic I6561

If the bottlenecks are as straightforward as they appear, with a star-like phylogeny of R1a-M417 starting with the Pre-Corded Ware expansion, then what is happening with the Alexandria sample, so precisely radiocarbon dated to ca. 4045-3974 BC? The reported hg. R1a-M417 was fully compatible, while R1a-Z645 could be compatible with its date, but the few positive SNPs I got in my analysis point indeed to a potential subclade of R1a-Z94, and I trust more experienced hobbyists in this ‘art’ of ascertaining the SNPs of ancient samples, and they report hg. R1a-Z93 (Z95+, Y26+, Y2-).

Seeing how Y-DNA bottlenecks worked in Yamnaya-Afanasievo and in Corded Ware and related groups, and if this sample really is so deep within R1a-Z93 in a region that should be more strongly affected by the known Neolithic Y-chromosome bottlenecks and forest-steppe ecotone, someone from the lab responsible for this sample should check its date once again, before more people keep chasing their tails with an individual that (based on its derived SNPs’ TMRCA) might actually be dated to the Bronze Age, where it could make much more sense in terms of ancestry and position in the PCA.

EDIT (14 SEP 2019): … and with the fact that he is the first individual to show the genetic adaptation for lactase persistence (13910-T), which is only found later among Bell Beakers, and much later in Sintashta and related Steppe_MLBA peoples (see comments below).

This is also evidenced by the other Ukraine_Eneolithic (likely a late Yamnaya) sample of hg. R1b-Z2103 from Dereivka (ca. 2800 BC), who – despite being in a similar territory 1,000 years later – shows a wholly diluted Yamnaya ancestry under typically European HG ancestry, even more so than other late Sredni Stog samples from Dereivka of ca. 3600-3400 BC, suggesting a decrease in Steppe ancestry rather than an increase – which is supposedly what should be expected based on the ancestry from Alexandria…

Like the reported Chalcolithic individual of Hajji Firuz who showed an apparently incompatible subclade and Yamnaya ancestry at least some 1,000 years before it should, and turned out to be from the Iron Age (see below), this may be another case of wrong radiocarbon dating.

NOTE. It would be interesting, if this turns out to be another Hajji Firuz-like error, to check how well different ancestry models worked in whose hands exactly, and if anyone actually pointed out that this sample was derived, and not ancestral, to many different samples that were used in combination with it. It would also be a great control to check if those still supporting a Sredni Stog origin for PIE would shift their preference even more to the north or west, depending on where the first “true” R1a-M417 samples popped up. Such a finding now could be thus a great tool to discover whether haplogroup-based bias plays a role in ancestry magic as related to the Indo-European question, i.e. if it really is about “pure statistics”, or there is something else to it…

II.2. R1b-L51-rich Bell Beakers

The overwhelming majority of R1b-L51 lineages in Radovesice during the Bell Beaker period, just after the sampled Corded Ware individuals from the same site, further strengthens the hypothesis of an almost full replacement of R1a-M417 lineages from Central Europe up to southern Scandinavia after the arrival of Bell Beakers.

Yet another R1b-L151* sample has popped up in Central Europe, in the individual classified as Bilina_BA (ca. 2200-800 BC), which clusters with Bell Beakers from Bohemia, with the outlier from Turlojiškė, and with Early Slavs, suggesting once again that a group of central-east European Beakers represented the Pre-Proto-Balto-Slavic community before their spread and admixture events to the east.

The available ancient distribution of R1b-L51*, R1b-L52* or R1b-L151* is getting thus closer to the most likely origin of R1b-L51 in the expansion of East Bell Beakers, who trace their paternal ancestors to Yamnaya settlers from the Carpathian Basin:

NOTE. Some of these are from other sources, and some are samples I have checked in a hurry, so I may have missed some derived SNPs. If you send me a corrected SNP call to dismiss one of these, or more ‘archaic’ samples, I’ll correct the map accordingly. See also maps of modern distribution of R1b-M269 subclades.

Distribution of ‘archaic’ R1b-L51 subclades in ancient samples, overlaid on a map of Yamnaya and Bell Beaker migrations. In blue, Yamnaya Pre-L51 from Lopatino (not shown) and R1b-L52* from BBC Augsburg. In violet, R1b-L51 (xP312,xU106) from BBC Prague and Poland. In maroon, hg. R1b-L151* from BBC Hungary, BA Bohemia, and (not shown) a potential sample from BBC at Mondelange, which is certainly xU106, maybe xP312. Interestingly, the earliest sample of hg. R1b-U106 (a lineage more typical of northern Europe) has been found in a Bell Beaker from Radovesice (ca. 2350 BC), between two of these ‘archaic’ R1b-L51 samples; and a sample possibly of hg. R1b-ZZ11+ (ancestral to DF27 and U152) was found in a Bell Beaker from Quedlinburg, Germany (ca. 2290 BC), to the north-west of Bohemia. The oldest R1b-U152 are logically from Central Europe, too.

III. Proto-Indo-Iranian

Before the emergence of Proto-Indo-Iranian, it seems that Pre-Proto-Indo-Iranian-speaking Poltavka groups were subjected to pressure from Central_Steppe_EMBA-related peoples coming from the (south-?)east, such as those sampled from Mereke_BA. Their ‘kurgan’ culture was dated correctly to approximately the same date as Poltavka materials, but their ancestry and hg. N2(pre-N2a) – also found in a previous sample from Botai – point to their intrusive nature, and thus to difficulties in the Pre-Proto-Indo-Iranian community to keep control over the previous East Yamnaya territory in the Don-Volga-Ural steppes.

We know that the region does not show genetic continuity with a previous period (or was not under this ‘eastern’ pressure) because of an Eastern Yamnaya sample from the same site (ca. 3100 BC) showing typical Yamnaya ancestry. Before Yamnaya, it is likely that Pre-Yamnaya ancestry formed through admixture of EHG-like Khvalynsk with a North Caspian steppe population similar to the Steppe_Eneolithic samples from the North Caucasus Piedmont (see Anthony 2019), so we can also rule out some intermittent presence of a Botai/Kelteminar-like population in the region during the Khvalynsk period.

It is very likely, then, that this competition for the same territory – coupled with the known harsher climate of the late 3rd millennium BC – led Poltavka herders to their known joint venture with Abashevo chiefs in the formation of the Sintashta-Potapovka-Filatovka community of fortified settlements. Supporting these intense contacts of Poltavka herders with Central Asian populations, late ‘outliers’ from the Volga-Ural region show admixture with typical Central_Steppe_MLBA populations: one in Potapovka (ca. 2220 BC), of hg. R1b-Z2103; and four in the Sintashta_MLBA_o1 cluster (ca. 2050-1650 BC), with two samples of hg. R1b-L23 (one R1b-Z2109), one Q1b-L56(xL53), one Q1b-Y6798.

Outlier analysis reveals ancient contacts between sites. We plot the average of principal component 1 (x axis) and principal component 2 (y axis) for the West Eurasian and All Eurasian PCA plots (…). In the Middle to Late Bronze Age Steppe, we observe, in addition to the Western_Steppe_MLBA and Central_Steppe_MLBA clusters (indistinguishable in this projection), outliers admixed with other ancestries. The BMAC-related admixture in Kazakhstan documents northward gene flow onto the Steppe and confirms the Inner Asian Mountain Corridor as a conduit for movement of people.

Similar to how the Sintashta_MLBA_o2 cluster shows an admixture with central steppe populations and hg. R1a-Z645, the WSHG ancestry in those outliers from the o1 cluster of typically (or potentially) Yamnaya lineages shows that Poltavka-like herders survived well after centuries of Abashevo-Poltavka coexistence and admixture events, supporting the formation of a Proto-Indo-Iranian community from the local language as pronounced by the incomers, who dominated as elites over the fortified settlements.

The Proto-Indo-Iranian community likely formed thus in situ in the Don-Volga-Ural region, from the admixture of locals of Yamnaya ancestry with incomers of Corded Ware ancestry – represented by the ca. 67% Yamnaya-like ancestry and ca. 33% ancestry from the European cline. Their community formed thus ca. 1,000 years later than the expansion of Late PIE ca. 3500 BC, and expanded (some 500 years after that) a full-fledged Proto-Indo-Iranian language with the Srubna-Andronovo horizon, further admixing with ca. 9% of Central_Steppe_EMBA (WSHG-related) ancestry in their migration through Central Asia, as reported in the paper.
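NOTE. As a back-of-the-envelope check, the proportions above can be combined with simple arithmetic. This is only a minimal sketch, assuming (my simplification, not the qpAdm procedure of the paper) that the ca. 9% Central_Steppe_EMBA ancestry dilutes the earlier 67/33 mixture proportionally; the function and variable names are hypothetical.

```python
def dilute(base, new_label, new_share):
    """Scale an existing ancestry mixture down to make room for a new component."""
    if not 0.0 <= new_share <= 1.0:
        raise ValueError("new_share must be between 0 and 1")
    # Every previous component keeps its relative weight within the remainder.
    scaled = {label: share * (1.0 - new_share) for label, share in base.items()}
    scaled[new_label] = new_share
    return scaled


# Approximate proportions quoted in the text (illustrative labels).
before_expansion = {"Yamnaya-like": 0.67, "European cline": 0.33}
after_expansion = dilute(before_expansion, "Central_Steppe_EMBA", 0.09)

for label, share in after_expansion.items():
    print(f"{label}: {share:.1%}")
```

Under that assumption the result is roughly 61% Yamnaya-like, 30% European cline and 9% Central_Steppe_EMBA ancestry.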

IV. Armenian

The sample from Hajji Firuz, of hg. R1b-Z2103 (xPF331), has been – as expected – re-dated to the Iron Age (ca. 1193-1019 BC), hence it may offer – together with the samples from the Levant and their Aegean-like ancestry rapidly diluted among local populations – yet another proof of how the Late Bronze Age upheaval in Europe was the cause of the Armenian migration to the Armenoid homeland, where they thrived under the strong influence from Hurro-Urartian.

Y-chromosome haplogroups of the Middle East and neighbouring groups during the Late Bronze Age / Iron Age. See full maps.

Indus Valley Civilization and Dravidian

A surprise came from the analysis reported by Shinde et al. (2019) of an Iran_N-related IVC ancestry which may have split earlier than 10000 BC from a source common to Iran hunter-gatherers of the Belt Cave.

For the controversial Elamo-Dravidian hypothesis of the Muscovite school, this difference in ancestry between both groups (IVC and Iran Neolithic) seems to be a death blow, if population genomics was even needed for that. Nevertheless, I guess that a full rejection of a recent connection will come down to more recent and subtle population movements in the area.

EDIT (12 SEP): Apparently, Iosif Lazaridis is not so sure about this deep splitting of ‘lineages’ as shown in the paper, so we may be talking about different contributions of AME+ANE/ENA, which means the Elamo-Dravidian game is afoot; at least in genomics.

I shared the idea that the Indus Valley Civilization was linked to the Proto-Dravidian community, so I’m inclined to support this statement by Narasimhan, Patterson, et al. (2019), even if based only on modern samples and a few ancient ones:

The strong correlation between ASI ancestry and present-day Dravidian languages suggests that the ASI, which we have shown formed as groups with ancestry typical of the Indus Periphery Cline moved south and east after the decline of the IVC to mix with groups with more AASI ancestry, most likely spoke an early Dravidian language.

Natural neighbour interpolation of qpAdm results – Maximum A Posteriori Estimate from the Hierarchical Model (estimates used in the Narasimhan, Patterson et al. 2019 figures) for Central_Steppe_MLBA-related (left), Indus_Periphery_West-related (center) and Andamanese_Hunter-Gatherer-related ancestry (right) among sampled modern Indian populations. In blue, peoples of IE language; in red, Dravidian; in pink, Tibeto-Burman; in black, unclassified. See full image.

I am wary of this sort of simplistic correlation with modern speakers, because we have seen what happened with the wrong assumptions about modern Balto-Slavic and Finno-Ugric speakers and their genetic profile (see e.g. here or here). In fact, I just can’t differentiate as well as those with deep knowledge in South Asian history the social stratification of the different tribal groups – with their endogamous rules under the varna and jati systems – in the ancestry maps of modern India. The pattern of ancestry and language distribution combined with the findings of ancient populations seem in principle straightforward, though.

Conclusion

The message to take home from Shinde et al. (2019) is that genomic data is fully at odds with the Anatolian homeland hypothesis – including the latest model by Heggarty (2014)* – whose relevance is still overvalued today, probably due in part to the shift of OIT proponents to more reasonable Out-of-Iran models, apparently more fashionable as a vector of Indo-Aryan languages than Eurasian steppe pastoralists?
*The authors listed this model erroneously as Heggarty (2019).

The paper seems to play with the occasional reference to Corded Ware as a vector of expansion of Indo-European languages, even after accepting the role of Yamnaya as the most evident population expanding Late PIE to western Europe – and the different ancestry that spread with Indo-Iranian to South Asia 1,000 years later. However, the most cringe-worthy aspect is the sole citation of the debunked, pseudoscientific glottochronological method used by Ringe, Warnow, and Taylor (2002) to support the so-called “steppe homeland”, a paper and dialectal scheme which keeps being referenced in papers of the Reich Lab, probably as a consequence of its use in Anthony (2007).

On the other hand, these are the equivalent simplistic comments in Narasimhan, Patterson et al. (2019):

The Steppe ancestry in South Asia has the same profile as that in Bronze Age Eastern Europe, tracking a movement of people that affected both regions and that likely spread the unique features shared between Indo-Iranian and Balto-Slavic languages. (…), which despite their vast geographic separation share the “satem” innovation and “ruki” sound laws.

Indo-European dialectal relationships, from Mallory and Adams (2006).

The only academic closely related to linguistics from the list of authors, as far as I know, is James P. Mallory, who has supported a North-West Indo-European dialect (including Balto-Slavic) for a long time – recently associating its expansion with Bell Beakers – opposed thus to a Graeco-Aryan group which shared certain innovations, “Satemization” not being one of them. Not that anyone needs to be a linguist to dismiss any similarities between Balto-Slavic and Indo-Iranian beyond this phonetic trend, mind you.

Even Anthony (2019) supports now R1b-rich Pre-Yamnaya and Yamnaya communities from the Don-Volga region expanding Middle and Late Proto-Indo-European dialects.

So how does the underlying Corded Ware ancestry of eastern Europe (where Pre-Balto-Slavs eventually spread to from Bell Beaker-derived groups) and of the highly admixed (“cosmopolitan”, according to the authors) Sintashta-Potapovka-Filatovka in the east relate to the similar-but-different phonetic trends of two unrelated IE dialects?

If only there were a language substrate that could (as Shinde et al. put it) “elegantly” explain this similar phonetic evolution, solving at the same time the question of the expansion of Uralic languages and their strong linguistic contacts with steppe peoples. Say, Eneolithic populations of mainly hunter-fisher-gatherers from the North Pontic forest-steppes with a stronger connection to metalworking…

Related

Yamnaya ancestry: mapping the Proto-Indo-European expansions


The latest papers from Ning et al. Cell (2019) and Anthony JIES (2019) have offered some interesting new data, supporting once more what could be inferred since 2015, and what was evident in population genomics since 2017: that Proto-Indo-Europeans expanded under R1b bottlenecks, and that the so-called “Steppe ancestry” referred to two different components, one – Yamnaya or Steppe_EMBA ancestry – expanding with Proto-Indo-Europeans, and the other one – Corded Ware or Steppe_MLBA ancestry – expanding with Uralic speakers.

The following maps are based on formal stats published in the papers and supplementary materials from 2015 until today, mainly on Wang et al. (2018 & 2019), Mathieson et al. (2018) and Olalde et al. (2018), and others like Lazaridis et al. (2016), Lazaridis et al. (2017), Mittnik et al. (2018), Lamnidis et al. (2018), Fernandes et al. (2018), Jeong et al. (2019), Olalde et al. (2019), etc.

NOTE. As in the Corded Ware ancestry maps, the selected reports in this case are centered on the prototypical Yamnaya ancestry vs. other simplified components, so everything else refers to simplistic ancestral components widespread across populations that do not necessarily share any recent connection, much less a language. In fact, most of the time they clearly didn’t. They can be interpreted as “EHG that is not part of the Yamnaya component”, or “CHG that is not part of the Yamnaya component”. They can’t be read as “expanding EHG people/language” or “expanding CHG people/language”, at least no more than maps of “Steppe ancestry” can be read as “expanding Steppe people/language”. Also, remember that I have left the default behaviour for color classification, so that the highest value (i.e. 1, or white colour) could mean anything from 10% to 100% depending on the specific ancestry and period; that’s what the legend is for… But, fere libenter homines id quod volunt credunt.
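NOTE. To illustrate the last point about colour classification, here is a minimal sketch of what scaling each map to its own maximum implies. It assumes that the default behaviour simply divides every value by the highest one on the map (an assumption on my part, not a documented setting of the mapping software), and the sample values are invented for the example.

```python
def scale_to_map_maximum(values):
    """Rescale ancestry proportions so the highest value on a map becomes 1.0."""
    highest = max(values)
    if highest <= 0:
        return [0.0 for _ in values]
    return [value / highest for value in values]


# Hypothetical ancestry proportions for two maps (not taken from any paper).
low_map = [0.02, 0.05, 0.10]   # component peaks at 10%
high_map = [0.20, 0.50, 1.00]  # component peaks at 100%

print(scale_to_map_maximum(low_map))
print(scale_to_map_maximum(high_map))
```

Both maps end up with essentially the same colour scale, with their peak drawn as 1 (white), even though the underlying peak is 10% in one and 100% in the other; hence the need to read the legend.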

Sections:

  1. Neolithic or the formation of Early Indo-European
  2. Eneolithic or the expansion of Middle Proto-Indo-European
  3. Chalcolithic / Early Bronze Age or the expansion of Late Proto-Indo-European
  4. European Early Bronze Age and MLBA or the expansion of Late PIE dialects

1. Neolithic

Anthony (2019) agrees with the most likely explanation of the CHG component found in Yamnaya, as derived from steppe hunter-fishers close to the lower Volga basin. The ultimate origin of this specific CHG-like component that eventually formed part of the Pre-Yamnaya ancestry is not clear, though:

The hunter-fisher camps that first appeared on the lower Volga around 6200 BC could represent the migration northward of un-admixed CHG hunter-fishers from the steppe parts of the southeastern Caucasus, a speculation that awaits confirmation from aDNA.

Natural neighbor interpolation of CHG ancestry among Neolithic populations. See full map.

The typical EHG component that eventually formed part of Pre-Yamnaya ancestry came from the Middle Volga Basin, most likely close to the Samara region, as shown by the sampled Samara hunter-gatherer (ca. 5600-5500 BC):

After 5000 BC domesticated animals appeared in these same sites in the lower Volga, and in new ones, and in grave sacrifices at Khvalynsk and Ekaterinovka. CHG genes and domesticated animals flowed north up the Volga, and EHG genes flowed south into the North Caucasus steppes, and the two components became admixed.

Natural neighbor interpolation of EHG ancestry among Neolithic populations. See full map.

To the west, in the Dnieper-Dniester area, WHG became the dominant ancestry after the Mesolithic, at the expense of EHG, revealing a likely mating network reaching to the north into the Baltic:

Like the Mesolithic and Neolithic populations here, the Eneolithic populations of Dnieper-Donets II type seem to have limited their mating network to the rich, strategic region they occupied, centered on the Rapids. The absence of CHG shows that they did not mate frequently if at all with the people of the Volga steppes (…)

neolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Neolithic populations. See full map.

North-West Anatolia Neolithic ancestry, typical of expanding Early European farmers, is found up to the border of the Dniester, as Anthony (2007) had predicted.

neolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Neolithic populations. See full map.

2. Eneolithic

From Anthony (2019):

After approximately 4500 BC the Khvalynsk archaeological culture united the lower and middle Volga archaeological sites into one variable archaeological culture that kept domesticated sheep, goats, and cattle (and possibly horses). In my estimation, Khvalynsk might represent the oldest phase of PIE.

(…) this middle Volga mating network extended down to the North Caucasian steppes, where at cemeteries such as Progress-2 and Vonyuchka, dated 4300 BC, the same Khvalynsk-type ancestry appeared, an admixture of CHG and EHG with no Anatolian Farmer ancestry, with steppe-derived Y-chromosome haplogroup R1b. These three individuals in the North Caucasus steppes had higher proportions of CHG, overlapping Yamnaya. Without any doubt, a CHG population that was not admixed with Anatolian Farmers mated with EHG populations in the Volga steppes and in the North Caucasus steppes before 4500 BC. We can refer to this admixture as pre-Yamnaya, because it makes the best currently known genetic ancestor for EHG/CHG R1b Yamnaya genomes.

From Wang et al (2019):

Three individuals from the sites of Progress 2 and Vonyuchka 1 in the North Caucasus piedmont steppe (‘Eneolithic steppe’), which harbour EHG and CHG related ancestry, are genetically very similar to Eneolithic individuals from Khvalynsk II and the Samara region. This extends the cline of dilution of EHG ancestry via CHG-related ancestry to sites immediately north of the Caucasus foothills

eneolithic-pre-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Eneolithic populations. See full map. This map corresponds roughly to the map of Khvalynsk-Novodanilovka expansion, and in particular to the expansion of horse-head pommel-scepters (read more about Khvalynsk, and specifically about horse symbolism).

NOTE. Unpublished samples from Ekaterinovka have been previously reported as within the R1b-L23 tree. Interestingly, although the Varna outlier is a female, the Balkan outlier from Smyadovo shows two positive SNP calls for hg. R1b-M269. However, its poor coverage makes its most conservative haplogroup prediction R-M343.

The formation of this Pre-Yamnaya ancestry sets this Volga-Caucasus Khvalynsk community apart from the rest of the EHG-like population of eastern Europe.

eneolithic-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Eneolithic populations. See full map.

Anthony (2019) seems to rely on ADMIXTURE graphics when he writes that the late Sredni Stog sample from Alexandria shows “80% Khvalynsk-type steppe ancestry (CHG&EHG)”. While this seems the most logical conclusion of what might have happened after the Suvorovo-Novodanilovka expansion through the North Pontic steppes (see my post on “Steppe ancestry” step by step), formal stats have not confirmed that.

In fact, analyses published in Wang et al. (2019) rejected that Corded Ware groups are derived from this Pre-Yamnaya ancestry, a reality that had already been hinted at in Narasimhan et al. (2018), when Steppe_EMBA showed a poor fit for expanding Srubna-Andronovo populations. Hence the need to consider the whole CHG component of the North Pontic area separately:

eneolithic-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Eneolithic populations. See full map. You can read more about population movements in the late Sredni Stog and closer to the Proto-Corded Ware period.

NOTE. Fits for WHG + CHG + EHG in Neolithic and Eneolithic populations are taken in part from Mathieson et al. (2019) supplementary materials (download Excel here). Unfortunately, while data on the Ukraine_Eneolithic outlier from Alexandria abounds, I don’t have specific data on the so-called ‘outlier’ from Dereivka compared to the other two analyzed together, so these maps of CHG and EHG expansion are possibly showing a lesser distribution to the west than the real one ca. 4000-3500 BC.

eneolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Eneolithic populations. See full map.

Anatolia Neolithic ancestry clearly spread to the east into the north Pontic area through a Middle Eneolithic mating network, most likely opened after the Khvalynsk expansion:

eneolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Eneolithic populations. See full map.
eneolithic-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Eneolithic populations. See full map.

Regarding Y-chromosome haplogroups, Anthony (2019) insists on the evident association of Khvalynsk, Yamnaya, and the spread of Pre-Yamnaya and Yamnaya ancestry with the expansion of elite R1b-L754 (and some I2a2) individuals:

eneolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Early Eneolithic in the Pontic-Caspian steppes. See full map, and see culture, ADMIXTURE, Y-DNA, and mtDNA maps of the Early Eneolithic and Late Eneolithic.

3. Early Bronze Age

Data from Wang et al. (2019) show that Corded Ware-derived populations do not have good fits for Eneolithic_Steppe-like ancestry, no matter the model. In other words: Corded Ware populations show not only a higher contribution of Anatolia Neolithic ancestry (ca. 20-30% compared to the ca. 2-10% of Yamnaya); they show a different EHG + CHG combination compared to the Pre-Yamnaya one.

eneolithic-steppe-best-fits
Supplementary Table 13. P values of rank=2 and admixture proportions in modelling Steppe ancestry populations as a three-way admixture of Eneolithic steppe, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Test, Eneolithic_steppe, Anatolian_Neolithic, WHG.
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Yamnaya Kalmykia and Afanasievo show the closest fits to the Eneolithic population of the North Caucasian steppes, rejecting thus sizeable contributions from Anatolia Neolithic and/or WHG, as shown by the SD values. Both probably show then a Pre-Yamnaya ancestry closest to the late Repin population.

wang-eneolithic-steppe-caucasus-yamnaya
Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional AF ancestry in Steppe groups and additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups. See tables above. Modified from Wang et al. (2019). Within a blue square, Yamnaya-related groups; within a cyan square, Corded Ware-related groups. Green background behind best p-values. In red circle, SD of AF/WHG ancestry contribution in Afanasevo and Yamnaya Kalmykia, with ranges that almost include 0%.

EBA maps include data from Wang et al. (2018) supplementary materials, specifically unpublished Yamnaya samples from Hungary that appeared in analysis of the preprint, but which were taken out of the definitive paper. Their location among Yamnaya settlers from Hungary is speculative, although most uncovered kurgans in Hungary are concentrated in the Tisza-Danube interfluve.

eba-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Early Bronze Age populations. See full map. This map corresponds roughly with the known expansion of late Repin/Yamnaya settlers.

The Y-chromosome bottleneck of elite males from Proto-Indo-European clans under R1b-L754 and some I2a2 subclades, already visible in the Khvalynsk sampling, became even more noticeable in the subsequent expansion of late Repin/early Yamnaya elites under R1b-L23 and I2a-L699:

chalcolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Yamnaya expansion. See full map and maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Chalcolithic and Yamnaya Hungary.

Maps of CHG, EHG, Anatolia Neolithic, and probably WHG show the expansion of these components among Corded Ware-related groups in North Eurasia, apart from other cultures close to the Caucasus:

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you can read the post Corded Ware ancestry in North Eurasia and the Uralic expansion.

eba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Early Bronze Age populations. See full map.
eba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Early Bronze Age populations. See full map.
eba-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Early Bronze Age populations. See full map.
eba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Early Bronze Age populations. See full map.
eba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Early Bronze Age populations. See full map.

4. Middle to Late Bronze Age

The following maps show the most likely distribution of Yamnaya ancestry during the Bell Beaker-, Balkan-, and Sintashta-Potapovka-related expansions.

4.1. Bell Beakers

The amount of Yamnaya ancestry is probably overestimated among populations where Bell Beakers replaced Corded Ware. A map of Yamnaya ancestry among Bell Beakers gets trickier for the following reasons:

  • Expanding Repin peoples of Pre-Yamnaya ancestry must have had admixture through exogamy with late Sredni Stog/Proto-Corded Ware peoples during their expansion into the North Pontic area, and Sredni Stog in turn probably had some Pre-Yamnaya admixture, too (although it doesn’t appear in the simplistic formal stats above). This is supported by the increase of Anatolia farmer ancestry in more western Yamna samples.
  • Later, Yamnaya admixed through exogamy with Corded Ware-like populations in Central Europe during their expansion. Even samples from the Middle to Upper Danube and around the Lower Rhine will probably show increasing contributions of Steppe_MLBA, at the same time as they show an increasing proportion of EEF-related ancestry.
  • To complicate things further, the late Corded Ware Esperstedt family (from ca. 2500 BC or later) shows, in turn, what seems like a recent admixture with Yamnaya vanguard groups, with the sample of highest Yamnaya ancestry being the paternal uncle of other individuals (all of hg. R1a-M417). This suggests that there might have been many similar Central European mating networks from the mid-3rd millennium BC on, in which (mainly) Yamnaya-like R1b elites displaying a small proportion of CW-like ancestry admixed through exogamy with Corded Ware-like peoples who already had some Yamnaya ancestry.
mlba-yamnaya-ancestry
Natural neighbor interpolation of Yamnaya ancestry among Middle to Late Bronze Age populations (Esperstedt CWC site close to BK_DE, label is hidden by BK_DE_SAN). See full map. You can see how this map correlates with the map of Late Copper Age migrations and Yamnaya into Bell Beaker expansion.

NOTE. Terms like “exogamy”, “male-driven migration”, and “sex bias”, are not only based on the Y-chromosome bottlenecks visible in the different cultural expansions since the Palaeolithic. Despite the scarce sampling available in 2017 for analysis of “Steppe ancestry”-related populations, it appeared to show already a male sex bias in Goldberg et al. (2017), and it has been confirmed for Neolithic and Copper Age population movements in Mathieson et al. (2018) – see Supplementary Table 5. The analysis of male-biased expansion of “Steppe ancestry” in CWC Esperstedt and Bell Beaker Germany is, for the reasons stated above, not very useful to distinguish their mutual influence, though.

Based on data from Olalde et al. (2019), Bell Beakers from Germany are the closest sampled ones to expanding East Bell Beakers, and those close to the Rhine – i.e. French, Dutch, and British Beakers in particular – show a clear excess “Steppe ancestry” due to their exogamy with local Corded Ware groups:

Only one 2-way model fits the ancestry in Iberia_CA_Stp with P-value>0.05: Germany_Beaker + Iberia_CA. Finding a Bell Beaker-related group as a plausible source for the introduction of steppe ancestry into Iberia is consistent with the fact that some of the individuals in the Iberia_CA_Stp group were excavated in Bell Beaker associated contexts. Models with Iberia_CA and other Bell Beaker groups such as France_Beaker (P-value=7.31E-06), Netherlands_Beaker (P-value=1.03E-03) and England_Beaker (P-value=4.86E-02) failed, probably because they have slightly higher proportions of steppe ancestry than the true source population.
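The pass/fail reading of these P-values can be made explicit with a small sketch. It uses only the three P-values quoted above; the 0.05 cut-off is the one stated in the quote, while the function and variable names are my own, hypothetical ones. The P-value of the passing Germany_Beaker model is not given in the excerpt, so it is left out rather than guessed.

```python
# P-values quoted above from Olalde et al. (2019) for Iberia_CA + <source> models.
# The passing Germany_Beaker model is omitted: its P-value is not in the excerpt.
P_VALUES = {
    "France_Beaker": 7.31e-06,
    "Netherlands_Beaker": 1.03e-03,
    "England_Beaker": 4.86e-02,
}

THRESHOLD = 0.05  # models with P-value > 0.05 are treated as fitting


def rejected_models(p_values, threshold=THRESHOLD):
    """Return the sources whose model fails the threshold, worst fit first."""
    failed = [name for name, p in p_values.items() if p <= threshold]
    return sorted(failed, key=lambda name: p_values[name])


if __name__ == "__main__":
    for name in rejected_models(P_VALUES):
        print(f"{name}: P={P_VALUES[name]:.2e} (rejected)")
```

Note that England_Beaker fails only narrowly, which is consistent with the quoted explanation that these groups carry slightly more steppe ancestry than the true source.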

olalde-iberia-chalcolithic

The exogamy with Corded Ware-like groups in the Lower Rhine Basin seems at this point undeniable, as is the origin of Bell Beakers around the Middle-Upper Danube Basin from Yamnaya Hungary.

To avoid this excess “Steppe ancestry” showing up in the maps, since Bell Beakers from Germany pack the most Yamnaya ancestry among East Bell Beakers outside Hungary (ca. 51.1% “Steppe ancestry”), I equated this maximum with BK_Scotland_Ach (which shows ca. 61.1% “Steppe ancestry”, highest among western Beakers), and applied a simple rule of three for “Steppe ancestry” in Dutch and British Beakers.
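The rule of three is plain proportional rescaling, and can be sketched as follows. The two reference values (51.1% and 61.1%) come from the text; the function name and the 55.0% input are hypothetical and only serve to illustrate the arithmetic.

```python
# Reference values quoted in the text (percent "Steppe ancestry").
GERMAN_BEAKER_MAX = 51.1  # Bell Beakers from Germany
SCOTLAND_ACH_MAX = 61.1   # BK_Scotland_Ach, highest among western Beakers


def rescale_steppe(value, observed_max=SCOTLAND_ACH_MAX, target_max=GERMAN_BEAKER_MAX):
    """Rule of three: observed_max maps to target_max, other values scale proportionally."""
    return value * target_max / observed_max


if __name__ == "__main__":
    # 55.0 is a made-up input for illustration, not a published estimate.
    for raw in (SCOTLAND_ACH_MAX, 55.0):
        print(f"{raw:.1f}% -> {rescale_steppe(raw):.1f}%")
```

By construction the highest western value is mapped onto the German maximum, and every other Dutch or British value is reduced by the same factor (51.1/61.1).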

NOTE. Formal stats for “Steppe ancestry” in Bell Beaker groups are available in Olalde et al. (2018) supplementary materials (PDF). I didn’t apply this adjustment to Bk_FR groups because of the R1b Bell Beaker sample from the Champagne/Alsace region reported by Samantha Brunel that will pack more Yamnaya ancestry than any other sampled Beaker to date, hence probably driving the Yamnaya ancestry up in French samples.

The most likely outcome in the following years, when Yamnaya and Corded Ware ancestry are investigated separately, is that Yamnaya ancestry will be much lower the farther away from the Middle and Lower Danube region, similar to the case in Iberia, so the map above probably overestimates this component in most Beakers to the north of the Danube. Even the late Hungarian Beaker samples, who pack the highest Yamnaya ancestry (up to 75%) among Beakers, likely represent a back-migration of Moravian Beakers, and will probably show a contribution of Corded Ware ancestry due to the exogamy with local Moravian groups.

Despite this decreasing admixture as Bell Beakers spread westward, the explosive expansion of Yamnaya R1b male lineages (in the words of David Reich) and the radical replacement of local ones – whether derived from Corded Ware or Neolithic groups – show the true extent of the North-West Indo-European expansion in Europe:

chalcolithic-late-y-dna
Y-DNA haplogroups in West Eurasia during the Bell Beaker expansion. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Late Copper Age and of the Yamnaya-Bell Beaker transition.

4.2. Palaeo-Balkan

There is scarce data on Palaeo-Balkan movements yet, although it is known that:

  1. Yamnaya ancestry appears among Mycenaeans, with the Yamnaya Bulgaria sample being its best current ancestral fit;
  2. the emergence of steppe ancestry and R1b-M269 in the eastern Mediterranean was associated with Ancient Greeks;
  3. Thracians, Albanians, and Armenians also show R1b-M269 subclades and “Steppe ancestry”.

4.3. Sintashta-Potapovka-Filatovka

Interestingly, Potapovka is the only Corded Ware derived culture that shows good fits for Yamnaya ancestry, despite having replaced Poltavka in the region under the same Corded Ware-like (Abashevo) influence as Sintashta.

This proves that there was a period of admixture in the Pre-Proto-Indo-Iranian community between CWC-like Abashevo and Yamnaya-like Catacomb-Poltavka herders in the Sintashta-Potapovka-Filatovka community, probably more easily detectable in this group because of the specific temporal and geographic sampling available.

srubnaya-yamnaya-ehg-chg-ancestry
Supplementary Table 14. P values of rank=3 and admixture proportions in modelling Steppe ancestry populations as a four-way admixture of distal sources EHG, CHG, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Steppe cluster, EHG, CHG, WHG, Anatolian_Neolithic
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Srubnaya ancestry shows a best fit with non-Pre-Yamnaya ancestry, i.e. with different CHG + EHG components – possibly because the more western Potapovka (ancestral to Proto-Srubnaya Pokrovka) also showed good fits for it. Srubnaya shows poor fits for Pre-Yamnaya ancestry probably because Corded Ware-like (Abashevo) genetic influence increased during its formation.

On the other hand, more eastern Corded Ware-derived groups like Sintashta and its more direct offshoot Andronovo show poor fits with this model, too, but their fits are still better than those including Pre-Yamnaya ancestry.

mlba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Middle to Late Bronze Age populations. See full map.
mlba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Middle to Late Bronze Age populations. See full map.

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you should read the post Corded Ware ancestry in North Eurasia and the Uralic expansion instead.

The bottleneck of Proto-Indo-Iranians under R1a-Z93 was not yet complete by the time when the Sintashta-Potapovka-Filatovka community expanded with the Srubna-Andronovo horizon:

early-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the European Early Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Bronze Age.

4.4. Afanasevo

At the end of the Afanasevo culture, at least three samples show hg. Q1b (ca. 2900-2500 BC), which seemed to point to a resurgence of local lineages, despite continuity of the prototypical Pre-Yamnaya ancestry. On the other hand, Anthony (2019) makes this cryptic statement:

Yamnaya men were almost exclusively R1b, and pre-Yamnaya Eneolithic Volga-Caspian-Caucasus steppe men were principally R1b, with a significant Q1a minority.

Since the only available samples from the Khvalynsk community are R1b (x3), Q1a (x1), and R1a (x1), it seems strange that Anthony would talk about a “significant minority”, unless Q1a (potentially Q1b in the newer nomenclature) will pop up in some more individuals among the ca. 30 new samples to be published. Because he also mentions I2a2 as appearing in one elite burial, it seems Q1a (like R1a-M459) will not appear under elite kurgans, although it is still possible that hg. Q1a was involved in the expansion of Afanasevo to the east.

middle-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the Middle Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Middle Bronze Age and the Late Bronze Age.

Okunevo, which replaced Afanasevo in the Altai region, shows a majority of hg. Q1b, but also some R1b-M269 samples typical of Afanasevo, suggesting partial genetic continuity.

NOTE. Other sampled Siberian populations clearly show a variety of Q subclades that likely expanded during the Palaeolithic, such as Baikal EBA samples from Ust’Ida and Shamanka with a majority of Q1b, and hg. Q reported from Elunino, Sagsai, Khövsgöl, and also among peoples of the Srubna-Andronovo horizon (the Krasnoyarsk MLBA outlier), and in Karasuk.

From Damgaard et al. Science (2018):

(…) in contrast to the lack of identifiable admixture from Yamnaya and Afanasievo in the CentralSteppe_EMBA, there is an admixture signal of 10 to 20% Yamnaya and Afanasievo in the Okunevo_EMBA samples, consistent with evidence of western steppe influence. This signal is not seen on the X chromosome (qpAdm P value for admixture on X 0.33 compared to 0.02 for autosomes), suggesting a male-derived admixture, also consistent with the fact that 1 of 10 Okunevo_EMBA males carries a R1b1a2a2 Y chromosome related to those found in western pastoralists. In contrast, there is no evidence of western steppe admixture among the more eastern Baikal region Bronze Age (~2200 to 1800 BCE) samples.

This Yamnaya ancestry has been also recently found to be the best fit for the Iron Age population of Shirenzigou in Xinjiang – where Tocharian languages were attested centuries later – despite the haplogroup diversity acquired during their evolution, likely through an intermediate Chemurchek culture (see a recent discussion on the elusive Proto-Tocharians).

Haplogroup diversity seems to be common in Iron Age populations all over Eurasia, most likely due to the spread of different types of sociopolitical structures where alliances played a more relevant role in the expansion of peoples. A well-known example of this is the spread of Akozino warrior-traders in the whole Baltic region under a partial N1a-VL29-bottleneck associated with the emerging chiefdom-based systems under the influence of expanding steppe nomads.

early-iron-age-y-dna
Y-DNA haplogroups in West Eurasia during the Early Iron Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Iron Age and Late Iron Age.

Surprisingly, then, Proto-Tocharians from Shirenzigou pack up to 74% Yamnaya ancestry, in spite of the 2,000 years that separate them from the demise of the Afanasevo culture. They show more Yamnaya ancestry than any other population by that time, being thus a sort of Late PIE fossil not only in their archaic dialect, but also in their genetic profile:

shirenzigou-afanasievo-yamnaya-andronovo-srubna-ulchi-han

The recent intrusion of Corded Ware-like ancestry, as well as the variable admixture with Siberian and East Asian populations, both point to the known intense Old Iranian and Old/Middle Chinese contacts. The scarce Proto-Samoyedic and Proto-Turkic loans in Tocharian suggest a rather loose, probably more distant connection with East Uralic and Altaic peoples from the forest-steppe and steppe areas to the north (read more about external influences on Tocharian).

Interestingly, both R1b samples, MO12 and M15-2 – likely of Asian R1b-PH155 branch – show a best fit for Andronovo/Srubna + Hezhen/Ulchi ancestry, suggesting a likely connection with Iranians to the east of Xinjiang, who later expanded as the Wusun and Kangju. How they might have been related to Huns and Xiongnu individuals, who also show this haplogroup, is yet unknown, although Huns also show hg. R1a-Z93 (probably most R1a-Z2124) and Steppe_MLBA ancestry, earlier associated with expanding Iranian peoples of the Srubna-Andronovo horizon.

All in all, it seems that prehistoric movements explained through the lens of genetic research fit perfectly well the linguistic reconstruction of Proto-Indo-European and Proto-Uralic.

Related

Corded Ware ancestry in North Eurasia and the Uralic expansion

uralic-clines-nganasan

Now that it has become evident that Late Repin (i.e. Yamnaya/Afanasevo) ancestry was associated with the migration of R1b-L23-rich Late Proto-Indo-Europeans from the steppe in the second half of the 4th millennium BC, there’s still the question of how R1a-rich Uralic speakers of Corded Ware ancestry expanded, and how they spread their languages throughout North Eurasia.

Modern North Eurasians

I have been collecting information from the supplementary data of the latest papers on modern and ancient North Eurasian peoples, including Jeong et al. (2019), Saag et al. (2019), Sikora et al. (2018), or Flegontov et al. (2019), and I have tried to add up their information on ancestral components and their modern and historical distributions.

Fortunately, the current obsession with simplifying ancestry components into three or four general, atemporal groups, and the common use of the same ones across labs, make it very simple to merge data and map them.

Corded Ware ancestry

There is no doubt about the prevalent ancestry among Uralic-speaking peoples. A map isn’t needed to realize that, because ancient and modern data – like those recently summarized in Jeong et al. (2019) – prove it. But maps sure help visualize their intricate relationship better:

natural-modern-srubnaya-ancestry
Natural neighbor interpolation of Srubnaya ancestry among modern populations. See full map.
kriging-modern-srubnaya-ancestry
Kriging interpolation of Srubnaya ancestry among modern populations. See full map.

Interestingly, the regions with higher Corded Ware-related ancestry are in great part coincident with (pre)historical Finno-Ugric-speaking territories:

uralic-languages-modern
Modern distribution of Uralic languages, with ancient territory (in the Common Era) labelled and delimited by a red line. For more information on the ancient territory see here.

Edit (29/7/2019): Here is the full Steppe_MLBA ancestry map, including Steppe_MLBA (vs. Indus Periphery vs. Onge) in modern South Asian populations from Narasimhan et al. (2018), apart from the ‘Srubnaya component’ in North Eurasian populations. ‘Dummy’ variables (with 0% ancestry) have been included to the south and east of the map to avoid weird interpolations of Steppe_MLBA into Africa and East Asia.
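The effect of such ‘dummy’ 0% points can be illustrated with a toy example. Note the assumptions: this sketch uses a simple inverse-distance-weighted (IDW) estimate as a stand-in, not natural neighbor interpolation or Kriging, and all coordinates and percentages are invented placeholders rather than values from any paper.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) points."""
    num = den = 0.0
    for px, py, value in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            return value  # exactly on a data point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical sample locations (lon, lat, % ancestry); not real data.
samples = [(50.0, 55.0, 40.0), (75.0, 30.0, 20.0)]
# Dummy 0% anchors placed to the south and east of the sampled area.
dummies = [(20.0, 0.0, 0.0), (120.0, 35.0, 0.0)]

target = (25.0, 5.0)  # a location far outside the sampled area
without = idw(samples, *target)
with_dummies = idw(samples + dummies, *target)

if __name__ == "__main__":
    print(f"without dummies: {without:.1f}%  with dummies: {with_dummies:.1f}%")
```

Without the anchors, the estimate far from any sample simply averages the sampled values; with them, it is pulled toward 0% in regions where no ancestry is expected. The same logic applies whatever interpolation method draws the final map.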

modern-steppe-mlba-ancestry2
Natural neighbor interpolation of Steppe MLBA-like ancestry among modern populations. See full map.

Anatolia Neolithic ancestry

Also interesting are the patterns of non-CWC-related ancestry, in particular the apparent wedge created by expanding East Slavs, which seems to reflect the intrusion of central(-eastern) European ancestry into Finno-Permic territory.

NOTE. Read more on Balto-Slavic hydrotoponymy, on the cradle of Russians as a Finno-Permic hotspot, and about Pre-Slavic languages in North-West Russia.

natural-modern-lbk-en-ancestry
Natural neighbor interpolation of LBK EN ancestry among modern populations. See full map.
kriging-modern-lbk-en-ancestry
Kriging interpolation of LBK EN ancestry among modern populations. See full map.

WHG ancestry

The cline(s) between WHG, EHG, ANE, Nganasan, and Baikal HG are also simplified when some of them are excluded, in this case EHG, represented thus in part by WHG, and in part by more eastern ancestries (see below).

modern-whg-ancestry
Natural neighbor interpolation of WHG ancestry among modern populations. See full map.
kriging-modern-whg-ancestry
Kriging interpolation of WHG ancestry among modern populations. See full map.

Arctic, Tundra or Forest-steppe?

Data on Nganasan-related vs. ANE vs. Baikal HG/Ulchi-related ancestry is difficult to map properly, because these ancestry components are usually reported as mutually exclusive, when they are in fact clearly related in an ancestral cline formed by different ancient North Eurasian populations from Siberia.

When it comes to ascertaining the origin of the multiple CWC-related clines among Uralic-speaking peoples, the question is thus how to properly distinguish the proportions of WHG-, EHG-, Nganasan-, ANE-, or Baikal HG-related ancestral components in North Eurasia, i.e. how each dialectal group admixed with the regional groups which formed part of these clines east and west of the Urals.

The truth is, one ought to test specific ancient samples for each “Siberian” ancestry found in the different Uralic dialectal groups, but the simplistic “Siberian” label somehow gets a pass in many papers (see a recent example).

Below are qpAdm results with best fits for Ulchi ancestry, Afontova Gora 3 ancestry, and Nganasan ancestry, but some populations show good fits for more than one of them and with similar proportions, so selecting one necessarily simplifies the distribution of the others.
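The simplification involved in picking a single source per population can be sketched like this. Everything here is an assumption for illustration: the population names, P-values and proportions are invented placeholders, the 0.05 cut-off is a conventional choice, and the helper `pick_source` is hypothetical rather than part of any published pipeline.

```python
# Hypothetical qpAdm-style results: population -> {source: (p_value, proportion)}.
# All numbers are placeholders for illustration, not published values.
RESULTS = {
    "PopA": {"Ulchi": (0.40, 0.12), "Nganasan": (0.35, 0.11), "AG3": (0.001, 0.05)},
    "PopB": {"Ulchi": (0.02, 0.30), "Nganasan": (0.60, 0.28), "AG3": (0.01, 0.10)},
}


def pick_source(fits, threshold=0.05):
    """Return (best_source, ambiguous); ambiguous means more than one source passes."""
    passing = {source: p for source, (p, _) in fits.items() if p > threshold}
    if not passing:
        return None, False
    best = max(passing, key=passing.get)
    return best, len(passing) > 1


if __name__ == "__main__":
    for pop, fits in RESULTS.items():
        source, ambiguous = pick_source(fits)
        flag = " (another source also fits)" if ambiguous else ""
        print(f"{pop}: {source}{flag}")
```

A population like the made-up "PopA" would be mapped under one label even though a second source fits almost equally well, which is exactly the caveat to keep in mind when reading the maps that follow.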

Ulchi ancestry

modern-ulchi-ancestry
Natural neighbor interpolation of Ulchi ancestry among modern populations. See full map.
kriging-modern-ulchi-ancestry
Kriging interpolation of Ulchi ancestry among modern populations. See full map.

ANE ancestry

natural-modern-ane-ancestry
Natural neighbor interpolation of ANE ancestry among modern populations. See full map.
kriging-modern-ane-ancestry
Kriging interpolation of ANE ancestry among modern populations. See full map.

Nganasan ancestry

modern-nganasan-ancestry
Natural neighbor interpolation of Nganasan ancestry among modern populations. See full map.
kriging-modern-nganasan-ancestry
Kriging interpolation of Nganasan ancestry among modern populations. See full map.

Iran Chalcolithic

A simplistic Iran Chalcolithic-related ancestry is also seen in the Altaic cline(s) which (like Corded Ware ancestry) expanded from Central Asia into Europe – apart from its historical distribution south of the Caucasus:

modern-iran-chal-ancestry
Natural neighbor interpolation of Iran Neolithic ancestry among modern populations. See full map.
kriging-modern-iran-neolithic-ancestry
Kriging interpolation of Iran Chalcolithic ancestry among modern populations. See full map.

Other models

The first question I imagine some would like to ask is: what about other models? Do they show the same results? Here is the simplistic combination of ancestry components published in Damgaard et al. (2018) for the same or similar populations:

NOTE. As you can see, their selection of EHG vs. WHG vs. Nganasan vs. Natufian vs. Clovis is of little use, but it corroborates the results from other papers, and shows some interesting patterns in combination with those above.

EHG

damgaard-modern-ehg-ancestry
Natural neighbor interpolation of EHG ancestry among modern populations, data from Damgaard et al. (2018). See full map.
damgaard-kriging-ehg-ancestry
Kriging interpolation of EHG ancestry among modern populations. See full map.

Natufian ancestry

damgaard-modern-natufian-ancestry
Natural neighbor interpolation of Natufian ancestry among modern populations, data from Damgaard et al. (2018). See full map.
damgaard-kriging-natufian-ancestry
Kriging interpolation of Natufian ancestry among modern populations. See full map.

WHG ancestry

damgaard-modern-whg-ancestry
Natural neighbor interpolation of WHG ancestry among modern populations, data from Damgaard et al. (2018). See full map.
damgaard-kriging-whg-ancestry
Kriging interpolation of WHG ancestry among modern populations. See full map.

Baikal HG ancestry

damgaard-modern-baikalhg-ancestry
Natural neighbor interpolation of Baikal hunter-gatherer ancestry among modern populations, data from Damgaard et al. (2018). See full map.
damgaard-kriging-baikal-hg-ancestry
Kriging interpolation of Baikal HG ancestry among modern populations. See full map.

Ancient North Eurasians

Once the modern situation is clear, relevant questions are, for example, whether EHG-, WHG-, ANE-, Nganasan-, and/or Baikal HG-related meta-populations expanded or became integrated into Uralic-speaking territories.

When did these admixture/migration events happen?

How did the ancient distribution or expansion of Palaeo-Arctic, Baikalic, and/or Altaic peoples affect the current distribution of the so-called “Siberian” ancestry, and of hg. N1a, in each specific population?

NOTE. A little excursus is necessary, because the calculated repetition of a hypothetical opposition “N1a vs. R1a” doesn’t make this dichotomy real:

  1. There was not a single ethnolinguistic community represented by hg. R1a after the initial expansion of Eastern Corded Ware groups, or by hg. N1a-L392 after its initial expansion in Siberia.
  2. Different subclades became incorporated in different ways into Bronze Age and Iron Age communities, most of them without an ethnolinguistic change. For example, N1a subclades became incorporated into North Eurasian populations of different languages, reaching Uralic- and Indo-European-speaking territories of north-eastern Europe during the late Iron Age, at a time when their ancestral origin or language in Siberia was impossible to ascertain. Just like the mix found among Proto-Germanic peoples (R1b, R1a, and I1)* or among Slavic peoples (I2a, E1b, R1a)*, the mix of many Uralic groups showing specific percentages of R1a, N1a, or Q subclades* reflects more or less recent admixture or acculturation events with little impact on their languages.

*other typically northern and eastern European haplogroups are also represented in early Germanic (N1a, I2, E1b, J, G2), Slavic (I1, G2, J) and Finno-Permic (I1, R1b, J) peoples.

ananino-culture-new
Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

The problem with mapping the ancestry of the available sampling of ancient populations is that we lack proper temporal and regional transects. The maps that follow include cultures roughly divided into either “Bronze Age” or “Iron Age” groups, although the difference between samples may span up to 2,000 years.

NOTE. Rough estimates for more external groups (viz. Sweden Battle Axe/Gotland_A for the NW, Srubna from the North Pontic area for the SW, Arctic/Nganasan for the NE, and Baikal EBA/“Ulchi-like” for the SE) have been included to offer a wider interpolated area using data already known.
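To illustrate why adding external anchor groups widens the interpolated area, here is a minimal sketch of spatial interpolation of an ancestry fraction. It uses inverse distance weighting (IDW), a simpler technique swapped in for illustration; it is not the natural neighbor or kriging method used for the maps. The function names, the coordinates, and the fractions are all hypothetical placeholders, not values from any paper.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def idw(samples, lat, lon, power=2.0):
    """Inverse-distance-weighted estimate at (lat, lon).

    samples: list of (lat, lon, ancestry_fraction) tuples.
    NOTE: IDW is an illustrative stand-in, not the method behind the maps.
    """
    num = den = 0.0
    for s_lat, s_lon, value in samples:
        d = haversine_km(lat, lon, s_lat, s_lon)
        if d < 1e-9:
            return value  # query point coincides with a sampled site
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# HYPOTHETICAL anchors (lat, lon, fraction): placeholders only, not real estimates.
ANCHORS_HYPOTHETICAL = [
    (58.0, 16.0, 0.70),   # "NW" anchor
    (47.0, 35.0, 0.65),   # "SW" anchor
    (72.0, 95.0, 0.05),   # "NE" anchor
    (53.0, 107.0, 0.02),  # "SE" anchor
]

if __name__ == "__main__":
    print(round(idw(ANCHORS_HYPOTHETICAL, 60.0, 50.0), 3))
```

Any interpolated value stays between the lowest and highest sampled values, so a map of this kind can only show gradients between the groups that were actually included, which is why the choice of external anchors matters.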

Bronze Age

Similar to modern populations, the selection of best fit “Siberian” ancestry between Baikal HG vs. Nganasan, both potentially ± ANE (AG3), is an oversimplification that needs to be addressed in future papers.

Corded Ware ancestry

bronze-age-corded-ware-ancestry
Natural neighbor interpolation of Srubnaya ancestry among Bronze Age populations. See full map.

Nganasan-like ancestry

bronze-age-nganasan-like-ancestry
Natural neighbor interpolation of Nganasan-like ancestry among Bronze Age populations. See full map.

Baikal HG ancestry

bronze-age-baikal-hg-ancestry
Natural neighbor interpolation of Baikal Hunter-Gatherer ancestry among Bronze Age populations. See full map.

Afontova Gora 3 ancestry

bronze-age-afontova-gora-ancestry
Natural neighbor interpolation of Afontova Gora 3 ancestry among Bronze Age populations. See full map.

Iron Age

Corded Ware ancestry

Interestingly, the moderate expansion of Corded Ware-related ancestry from the south during the Iron Age may be related to the expansion of hg. N1a-VL29 into the chiefdom-based system of north-eastern Europe, including Ananyino/Akozino and later expanding Akozino warrior-traders around the Baltic Sea.

NOTE. The samples from Levänluhta are centuries older than those from Estonia (and Ingria), and those from Chalmny Varre are modern ones, so this region has to be read as a south-west to north-east distribution from the Iron Age to modern times.

iron-age-corded-ware-ancestry
Natural neighbor interpolation of Srubnaya ancestry among Iron Age populations. See full map.

Baikal HG-like ancestry

The fact that this Baltic N1a-VL29 branch belongs in a group together with typically Avar N1a-B197 supports the Altaic origin of the parent group, which is possibly related to the expansion of Baikalic ancestry and Iron Age nomads:

iron-age-baikal-ancestry
Natural neighbor interpolation of Baikal HG ancestry among Iron Age populations. See full map.

Nganasan-like ancestry

The dilution of Nganasan-like ancestry in an Arctic region featuring “Siberian” ancestry and hg. N1a-L392 at least since the Bronze Age supports the integration of hg. N1a-Z1934, sister clade of Ugric N1a-Z1936, into populations west and east of the Urals with the expansion of Uralic languages to the north into the Tundra region (see here).

The integration of N1a-Z1934 lineages into Finnic-speaking peoples after their migration to the north and east, and the displacement or acculturation of Saami from their ancestral homeland, coinciding with known genetic bottlenecks among Finns, is yet another proof of this evolution:

iron-age-nganasan-ancestry
Natural neighbor interpolation of Nganasan ancestry among Iron Age populations. See full map.

WHG ancestry

Similarly, WHG ancestry doesn’t seem to be related to important population movements throughout the Bronze Age. This excludes the multiple North Eurasian populations that will be found along the clines formed by WHG, EHG, ANE, Nganasan, and Baikal HG ancestry from having formed part of the Uralic ethnogenesis, although those populations may be relevant for following later regional movements of specific groups.

iron-age-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Iron Age populations. See full map.

Conclusion

It seems natural that people used to look at maps of haplogroup distribution from the 2000s, coupled with modern language distributions, and tried to interpret them in a certain way, thus reaching wrong conclusions whose consequences are especially visible today, when ancient DNA keeps contradicting them.

In hindsight, though, assuming that Balto-Slavs expanded with Corded Ware and hg. R1a, or that Uralians expanded with “Siberian” ancestry and hg. N1a, was as absurd as looking at maps of ancestry and haplogroup distribution of ancient and modern Native Americans, trying to divide them into “Germanic” or “Iberian”…

The evolution of each specific region and cultural group of North Eurasia is far from being clear. However, the general trend speaks clearly in favour of an ancient, Bronze Age distribution of North Eurasian ancestry and haplogroups that have decreased, diluted, or become incorporated into expanding Uralians of Corded Ware ancestry, occasionally spreading with inter-regional expansions of local groups.

Given the relatively recent push of Altaic and Indo-European languages into ancestral Uralic-speaking territories, only the ancient Corded Ware expansion remains compatible with the spread of Uralic languages into their historical distribution.

Sea Peoples behind Philistines were Aegeans, including R1b-M269 lineages

New open access paper Ancient DNA sheds light on the genetic origins of early Iron Age Philistines, by Feldman et al. Science Advances (2019) 5(7):eaax0061.

Interesting excerpts (modified for clarity, emphasis mine):

Here, we report genome-wide data from human remains excavated at the ancient seaport of Ashkelon, forming a genetic time series encompassing the Bronze to Iron Age transition. We find that all three Ashkelon populations derive most of their ancestry from the local Levantine gene pool. The early Iron Age population was distinct in its high genetic affinity to European-derived populations and in the high variation of that affinity, suggesting that a gene flow from a European-related gene pool entered Ashkelon either at the end of the Bronze Age or at the beginning of the Iron Age. Of the available contemporaneous populations, we model the southern European gene pool as the best proxy for this incoming gene flow. Last, we observe that the excess European affinity of the early Iron Age individuals does not persist in the later Iron Age population, suggesting that it had a limited genetic impact on the long-term population structure of the people in Ashkelon.

philistines-pca
Ancient genomes (marked with color-filled symbols) projected onto the principal components inferred from present-day west Eurasians (gray circles). The newly reported Ashkelon populations are annotated in the upper corner.

Genetic discontinuity between the Bronze Age and the early Iron Age people of Ashkelon

In comparison to ASH_LBA, the four ASH_IA1 individuals from the following Iron Age I period are, on average, shifted along PC1 toward the European cline and are more spread out along PC1, overlapping with ASH_LBA on one extreme and with the Greek Late Bronze Age “S_Greece_LBA” on the other. Similarly, genetic clustering assigns ASH_IA1 with an average of 14% contribution from a cluster maximized in the Mesolithic European hunter-gatherers labeled “WHG” (shown in blue in Fig. 2B) (15, 22, 26). This component is inferred only in small proportions in earlier Bronze Age Levantine populations (2 to 9%).

In agreement with the PCA and ADMIXTURE results, only European hunter-gatherers (including WHG) and populations sharing a history of genetic admixture with European hunter-gatherers (e.g., as European Neolithic and post-Neolithic populations) produced significantly positive f4-statistics (Z ≥ 3), suggesting that, compared to ASH_LBA, ASH_IA1 has additional European-related ancestry.

We find that the PC1 coordinates positively correlate with the proportion of WHG ancestry modeled in the Ashkelon individuals, suggesting that WHG reasonably tag a European-related ancestral component within the ASH_IA1 individuals.

philistines-admixture
We plot the ancestral proportions of the Ashkelon individuals inferred by qpAdm using Iran_ChL, Levant_ChL, and WHG as sources ±1 SEs. P values are annotated under each model. In cases when the three-way model failed (χ² P < 0.05), we plot the fitting two-way model. The WHG ancestry is necessary only in ASH_IA1.

The best supported one (χ² P = 0.675) infers that ASH_IA1 derives around 43% of ancestry from the Greek Bronze Age “Crete_Odigitria_BA” (43.1 ± 19.2%) and the rest from the ASH_LBA population.

(…) only the models including “Sardinian,” “Crete_Odigitria_BA,” or “Iberia_BA” as the candidate population provided a good fit (χ² P = 0.715, 49.3 ± 8.5%; χ² P = 0.972, 38.0 ± 22.0%; and χ² P = 0.964, 25.8 ± 9.3%, respectively). We note that, because of geographical and temporal sampling gaps, populations that potentially contributed the “European-related” admixture in ASH_IA1 could be missing from the dataset.
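To get a feeling for how wide these estimates are, the quoted point estimates and standard errors can be turned into rough intervals. This is a minimal sketch, assuming a normal approximation (estimate ± 1.96 × SE) and clamping to the 0–100% range; both choices are my assumptions for illustration, not something the paper reports, and the function name is hypothetical.

```python
# qpAdm point estimates and standard errors (in %), as quoted in the excerpts above.
MODELS = {
    "Crete_Odigitria_BA (best supported model)": (43.1, 19.2),
    "Sardinian": (49.3, 8.5),
    "Crete_Odigitria_BA": (38.0, 22.0),
    "Iberia_BA": (25.8, 9.3),
}

def approx_interval(estimate, se, z=1.96):
    """Rough interval under a normal approximation (an assumption),
    clamped to the valid 0-100% range for an ancestry proportion."""
    low = max(0.0, estimate - z * se)
    high = min(100.0, estimate + z * se)
    return low, high

if __name__ == "__main__":
    for source, (estimate, se) in MODELS.items():
        low, high = approx_interval(estimate, se)
        print(f"{source}: {estimate:.1f}% (approx. {low:.1f}% to {high:.1f}%)")
```

Under these assumptions the intervals overlap broadly, and the one for the 38.0 ± 22.0% model reaches 0%, which fits the authors' own caution about the limited resolution of the models.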

The transient impact of the “European-related” gene flow on the Ashkelon gene pool

The ASH_IA2 individuals are intermediate along PC1 between the ASH_LBA ones and the earlier Bronze Age Levantines (Jordan_EBA/Lebanon_MBA) in the west Eurasian PCA (Fig. 2A). Notably, despite being chronologically closer to ASH_IA1, the ASH_IA2 individuals position closer, on average, to the earlier Bronze Age individuals.

philistines-y-dna
See more information on Y-DNA SNP calls, including ASH067 as R1b-M269 (xL151).

The transient excess of European-related genetic affinity in ASH_IA1 can be explained by two scenarios. The early Iron Age European-related genetic component could have been diluted by either the local Ashkelon population to the undetectable level at the time of the later Iron Age individuals or by a gene flow from a population outside of Ashkelon introduced during the final stages of the early Iron Age or the beginning of the later Iron Age.

By modeling ASH_IA2 as a mixture of ASH_IA1 and earlier Bronze Age Levantines/Late Period Egyptian, we infer a range of 7 to 38% of contribution from ASH_IA1, although no contribution cannot be rejected because of the limited resolution to differentiate between Bronze Age and early Iron Age ancestries in this model.

Hg. R1b-M269 and the Aegean

I already predicted this relationship of Philistines and Aegeans (Greeks in particular) months ago, based on linguistics, archaeology, and phylogeography, although it was (and still is) unclear whether these paternal lineages might have come from other nearby populations descended from Common Anatolians instead, given the known intense contacts between Helladic and West Anatolian groups.

luwian-civilization-sea-peoples
The alternative view: The Sea Peoples can be traced back to the Aegean, so they could also have consisted of Luwian petty kingdoms, who had formed an alliance and attacked Hatti from the south.

The deduction process for the Greek connection was quite simple:

Palaeo-Balkan populations

We know that R1b-Z2103 expanded with Yamna, including West Yamna settlers: they appear in Vučedol, which means they formed part of the earliest expansion waves of Yamna settlers into the Carpathian Basin, and they also appear scattered among Bell Beakers (apart from dominating East Yamna and Afanasevo), which suggests that they were possibly one of the most successful lineages during the late Repin/early Yamna expansion.

The “Steppe ancestry” associated with I2a-L699 samples among Balkan BA peoples may have also been associated with recent Bronze Age expansions, and this haplogroup’s presence among modern Balkan peoples may also suggest that it expanded with Palaeo-Balkan languages. Nevertheless, we don’t know which specific lineages and “Steppe ancestry” they represent, sadly.

These samples may well be related to remnants of previous Balkan populations like Cernavodă or Ezero, because there has been no peer-reviewed attempt at distinguishing Khvalynsk-/Novodanilovka- from Sredni Stog- from Yamnaya-related populations (see here), and some groups that are associated with this ancestry, like Corded Ware, are known to be culturally distinct from Yamna.

In any case, Proto-Greeks from the southern Balkans (say, Sitagroi IV and related groups) are probably going to show, based on Palaeo-Balkan substrate and Pre-Greek substrate and on the available Mycenaean samples, a process of decreasing proportion of R1b-Z2103 lineages relative to local ones, and a relatively similar cline of Yamna:EEF ancestry from northern to southern areas, at least in the periods closest to the Yamna expansion.

NOTE. The finding of “archaic” R1b-L389 (R1b-V1636) and R1a-M198 subclades among modern Greeks and the likely Neolithic origin of these paternal lineages around the Caucasus suggest that their presence in Greece may be from any of the more recent migrations that have happened between Anatolia and the Balkans, especially during the Common Era, rather than Indo-Anatolian migrations; probably very recently.

-chalcolithic-late-balkans
Bronze Age cultures in the Balkans and the Aegean. See full map including ancient samples with Y-DNA, mtDNA, and ADMIXTURE.

Minoans and haplogroup J

In the Aegean, it is already evident that the population changed language partly through cultural diffusion, probably through elite domination of Proto-Greek speakers. Whether that happened before the invasion into the Greek Peninsula or after it is unclear, as we discussed recently, because we only have one reported Y-chromosome haplogroup among Mycenaeans, and it is J (probably continuing earlier lineages).

Now we have more samples from the so-called Emporion 2 cluster in Olalde et al. (2019), which shows Mycenaean-like eastern Mediterranean ancestry and 3 (out of 3) samples of haplogroup J, which – given the origin of the colony in Phocaea – may be interpreted as the prevalence of West Anatolian-like ancestry and lineages in the eastern part of the Aegean (and possibly thus south Peloponnese), in line with the modern situation.

NOTE. It does not seem likely that those R or R1b-L23 samples from the Emporion 1 cluster are R1b-Z2103, based on their West European-like ancestry, although they still may be, because – as we know – ancestry (unlike haplogroup) changes too easily to interpret it as an ancestral ethnolinguistic marker.

anatolia-greek-aegean
PCA of ancient samples related to the Aegean, with Minoans, Mycenaeans (including the Emporion 2 cluster in the background), Anatolia N-Ch.-BA and Levantine BA-LBA populations, including Tel Shadud samples. See more PCAs of ancient Eurasian populations.

Greeks and haplogroup R1b-M269

Therefore, while the presence of R1b-Z2103 among ancient Balkan peoples connected to the Yamna expansion is clear, one might ask if R1b-Z2103 really spread up to the Peloponnese by the time of the Mycenaean Civilization. That has only one indirect answer, and it’s most likely yes.

We already had some R1b-Z2103 among Thracians and around the Armenoid homeland, which offers another clue to the migration of these lineages from the Balkans. The distribution of different “archaic” R1b-Z2103 subclades among modern Balkan populations and around the Aegean offered more support to this conclusion.

But now we have two interesting ancient populations that bear witness to the likely intrusion of R1b-M269 with Proto-Greeks:

An Ancient Greek of hg. R1b

A single ancient sample supports the increase in R1b-Z2103 among Greeks during the “Dorian” invasions that triggered the Dark Ages and the phenomenon of the Aegean Sea Peoples. It comes from a Greek lab study, showing R1b1b (i.e. R1b-P297 in the old nomenclature) as the only Y-chromosome haplogroup obtained from the sampling of the Gulf of Amvrakia ca. 470-30 BC, i.e. before the Roman foundation of Nikopolis, hence from people likely from Anaktorion in Ancient Acarnania, of Corinthian origin.

ancient-greeks-y-dna-mtdna

Even with the few data available – and with the caution necessary for this kind of study from non-established labs, which may be subject to many different kinds of errors – one could argue that the western Greek areas, which received different waves of migrants from the north and show a higher frequency of R1b-Z2103 in modern times, were probably more heavily admixed with R1b-Z2103 than southern and eastern areas, which were always dominated by Greek-speaking populations more heavily admixed with locals.

The Dorian invasion and the Greek Dark Ages may thus account for a renewed influx of R1b-Z2103 lineages accompanying the dialects that would eventually help form the Hellenic Koiné. In a sense, it is only natural that demographically stronger populations around the Bronze Age Aegean would suffer a limited (male) population replacement with the succeeding invasions, starting with a higher genetic impact in the north-west and diminishing as they progressed to the south and the east, coupled with stepped admixture events with local populations.

This would therefore be the late equivalent of what happened at the end of the 3rd millennium BC, with Mycenaeans and their genetic continuity with Minoans.

pre-greek-ssos
Distribution of Pre-Greek place-names ending in -ssos/-ssa or -sos/-sa. See original images and more on the south/east cline distribution of Pre-Greek place-names here.

Sea peoples of hg. R1b-M269

Thanks to Wang et al. (2018) supplementary materials we knew that one of the two Levantine LBA II samples from Tel Shadud (final 13th–early 11th c. BC) published in van den Brink (2017) was of hg. R1b-M269 – in fact, the one interpreted as a Canaanite official residing at this site and emulating selected funerary aspects of Egyptian mortuary culture.

Both analyzed samples, this elite individual and a commoner of hg. J buried nearby, were genetically similar and indistinguishable from local populations, though:

Principal Components Analysis of L112 and L126 was carried out within the framework described in Lazaridis et al. (2016). This analysis showed that the two individuals cluster genetically, with similar estimated proportions of ancestry from diverse West Eurasian ancestral sources. These results are consistent with the hypothesis that they derive from the same population, or alternatively that they derive from two quite closely related populations.

We know that ancestry changes easily within a few generations, so there was not much information to go on, except for the fact that – being R1b-M269 – this individual could trace his paternal ancestor at some point to Proto-Indo-Europeans.

One might think that, because many haplogroups in this spreadsheet were wrong, this is also wrong; nevertheless, many haplogroups are correctly identified by Yleaf, and finding R1b-M269 in the Levant after the expansion of Sea Peoples could not be that surprising, because they were most likely related to populations of the Aegean Sea. Any other related hg. R1b (R1b-M73, R1b-V88, even R1b-V1636) wouldn’t fit as well as R1b-M269.

sea-peoples-egypt-rameses-iii

However, the early expansion of Proto-Indo-Aryans into the Middle East, as well as the later expansion of Armenians from the Balkans through Anatolia and of West Iranians from the east may have all potentially been related to this sample. But still, the previous linguistic and archaeological theories concerning the Philistines and the expansion of Sea Peoples in the Levant made this sample a likely (originally) Greek “Dorian” lineage, rather than the other (increasingly speculative) alternatives.

In any case, it was obvious to anyone – that is, to anyone with a minimum knowledge of how population genomics works – that just the two samples from van den Brink (2017) couldn’t be used to get to any conclusions about the ancestral origin of these individuals (or their differences) beyond Levantine peoples, because their ancestry was essentially (i.e. statistically) the same as the other few available ancient samples from nearby regions and similar periods.

If anything, the PCA suggested an origin of the R1b sample closer to Aegean populations relative to the J individual (see PCA above), and this should have been supported also by amateur models, without any possible confirmation (as with the ASH_IA2 cluster in this paper). However, if you have followed online discussions of the Tel Shadud R1b-M269 sample since it was first mentioned on Eupedia months ago – including another wave of misguided speculation based on the ancestry of both individuals triggered by a discussion on this blog –, you have yet more proof of how misleading ancestry analyses can be in the wrong hands.

NOTE. This is the Nth proof (and that only in 2019) of how it’s best to just avoid amateur analyses and interpretations altogether, as I did in the recent publication of the books. All those who didn’t take into account whatever was commented about the ancestry of these samples haven’t lost a single bit of relevant information on Levantine peoples, and have had more time for useful reads, compared to those dedicated to endless void speculation, once again gone awfully wrong, as does everything related to cocky ancient DNA crackpottery 😉

bronze-age-late-aegean
Late Bronze Age population movements in the Eastern Mediterranean and the Middle East. See full map including ancient DNA samples with Y-DNA, mtDNA, and ADMIXTURE.

Admittedly, though, even accepting the evident Mediterranean origin of this lineage, one could have argued that this sample may have been of R1b-L151 subclade, if one were inclined to support the theory that Italic peoples were behind Sea Peoples expanding east – and consequently that the ancestors of Etruscans had migrated eastward into the Aegean (e.g. into Lemnos), so that it could be asserted that Tyrsenian might have been a remnant language of an ancient population of northern Italy.

Philistines

Fortunately, some of the samples recovered in Feldman et al. (2019) that could be analyzed (those of the cluster ASH_IA1) offer a very specific time frame where European ancestry appeared (ca. 1250 BC) before it subsequently became fully diluted (as seen in cluster ASH_IA2) among the prevalent Levantine ancestry of the area.

Also fortunately, this precise cluster shows another R1b-M269 sample, likely R1b-Z2103 (because it is probably xL151), and this sample together with others from the same cluster prove that the ancestry related to the original southern European incomers was:

  1. Recent, related thus to LBA population movements, as expected; and
  2. More closely related to coeval Aegeans, including Mycenaeans with Steppe-related ancestry.

NOTE. I say “fortunately” because, as you can imagine if you have dealt with amateurish discussions long enough, without this cluster with evident Aegean ancestry and the R1b-M269 (Z2103) sample precisely associated to it, some would enter again in endless comment loops created by ancestry magicians, showing how Aegean peoples were not behind Sea Peoples, or not behind Philistines, or not behind the R1b-M269 among Philistines, depending on their specific agendas.

aegean-sea-peoples
Map of the Sea People invasions in the Aegean Sea and Eastern Mediterranean at the end of the Late Bronze Age (blue arrows). Some of the major cities impacted by the raids are denoted with historical dates. Inland invasions are represented by purple arrows. From Kaniewski et al. (2011).

The results of the paper don’t solve the question of the exact origin of all Sea Peoples (not even that of Philistines), but it is quite clear that most of those forming this seafaring confederation must have come from sites around the Aegean Sea. This supports thus the traditional origin attributed to them, including a hint at the likely expansion of Eastern Mediterranean ancestry and lineages into the Italian Peninsula precisely from the Aegean, as some oral communications have already disclosed.

As an indirect conclusion from the findings in this paper, then, we can now more confidently support that Tyrsenian speakers most likely expanded into the Apennines and the Alps originally from a Tyrsenian-speaking LBA population from Lemnos, due to the social unrest in the whole Aegean region, and might have become heavily admixed with local Italic peoples quite quickly, as happened with Philistines, resulting in yet another case of language expansion through (the simplistically called) elite domination.

Conclusion

Even more interesting than these specific findings, this paper confirms yet another hypothesis based on phylogeography, and proves once again two important starting points for ancient DNA interpretation that I have discussed extensively in this blog:

  • The rare R1b-M269 Y-chromosome lineage of Tel Shadud offered ipso facto the most relevant clue about the ancestral geographical origin of this Canaanite elite male’s paternal family, most likely from the north-west based on ancient phylogeography, which indirectly – in combination with linguistics and archaeology – supported the ancestral ethnolinguistic identification of Philistines with the Aegean and thus with (a population closest to) Ancient Greeks.
  • Ancestry analyses are often fully unreliable when assessing population movements, especially when few samples from incomplete temporal-geographical transects are assessed in isolation, because – unlike paternal (and maternal) haplogroups – ancestry might change fully within a few generations, depending on the particular anthropological setting. Their investigation is thus bound by many limitations – of design, statistical, and anthropological (i.e. archaeological and linguistic) – which are quite often not taken into account.
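The contrast between haplogroups and ancestry in the two points above can be put into numbers with a toy calculation. This is a minimal sketch under strong simplifying assumptions of my own: a single incoming male, each generation's mother fully local, and no further gene flow from the source population. Under those assumptions the expected incoming autosomal fraction halves every generation, while the Y-chromosome haplogroup is passed unchanged from father to son. The function names are hypothetical.

```python
def expected_autosomal_fraction(generations, initial=1.0):
    """Expected incoming autosomal ancestry along a paternal line after
    `generations` of marrying fully local partners (toy model, see assumptions)."""
    return initial * 0.5 ** generations

def generations_until_below(threshold, initial=1.0):
    """Number of generations until the expected fraction drops below `threshold`."""
    generations = 0
    fraction = initial
    while fraction >= threshold:
        fraction *= 0.5
        generations += 1
    return generations

if __name__ == "__main__":
    for g in range(8):
        # The Y-chromosome haplogroup stays the same in every row of this table.
        print(g, f"{expected_autosomal_fraction(g):.4f}")
    print("generations until below 5%:", generations_until_below(0.05))
```

In this toy setting the expected incoming ancestry falls to about 3% after five generations, so a male lineage can keep its haplogroup long after its original ancestry has become hard to detect. Real populations are of course far messier than this model.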

These cornerstones of ancient DNA interpretation have already been demonstrated to be valid not only for Levantine populations, as in this case, but also for Balkan peoples, for Bell Beakers, for steppe populations (like Khvalynsk, Sredni Stog, Yamna, Corded Ware), for Basques, for Balto-Slavs, for Ugrians and Samoyeds, and for many other prehistoric peoples.

I rest my case.

Bronze Age cultures in the Tarim Basin and the elusive Proto-Tocharians

andronovo-xiaohe-horizon

Master’s thesis Shifting Memories: Burial Practices and Cultural Interaction in Bronze Age China: A study of the Xiaohe-Gumugou cemeteries in the Tarim Basin, by Yunyun Yang, Uppsala University, Department of Archaeology and Ancient History (2019).

Summary excerpts, mainly from the conclusions (emphasis mine):

Both the Xiaohe and the Gumugou groups are suggested as possibly originating from southern Siberia or Central Asia and being related to Afanasievo and Andronovo people (Han 1986, 1994; Li et al. 2010, 2015). But the latest research suggests that the Xiaohe males are genetically distinct from the Afanasievo males, considering the paternal lineages (Hollard et al. 2018). From genetic evidence, it is suggested that southern Siberia and Central Asia were dominated by Europeans during the Bronze Age. Southern Siberia was predominantly populated by Europeans since the Bronze Age as a result of the eastward migration of Kurgan people (Keyser et al. 2009). Central Asia started to have an eastern Eurasian maternal lineage that coexisted with the previous western maternal lineage from around 700 BCE (Lalueza-Fox et al. 2004). Based on the research mentioned above, we can conclude that the Xiaohe and the Gumugou people possibly came from southern Siberia or Central Asia.

Origin of the Xiaohe horizon

There are two hypotheses about the origins of the Xiaohe horizon. The “steppe hypothesis” assumes that the early settlers (Gumugou people) of the Tarim Basin came from the Afanasievo culture in the Minusinsk Basin-Altai Mountains regions (Kuz’mina et al. 2008; Mallory et al. 2008). The “oasis hypothesis” argues that the early settlers were related to the spreading of the oasis-based agricultural groups from the Bactria and Margiana parts of the southern Central Asia area (Chen et al. 1995). Both hypotheses mainly relied on the use of some materials, such as animals (cattle, sheep/goats), camel hair, and plants (wheat), whose origins were bound to western traditions. But this evidence cannot provide enough support to claim that the Xiaohe horizon cultures derived from the Afanasievo or BMAC cultures; it only indicates possible cultural connections or interactions among them. What’s more, there were no horses or pottery in the Xiaohe horizon.

It is worth noting that Ephedra plant is commonly thought as a strong candidate of the Soma or Haoma sacred drink for the ancient Indians or Iranians. Soma is the name recorded in the Vedic Brahmanism religious literature Rigveda, Haoma in the Zoroastrianism Avesta, and indicates as a ritual drink from plant juice. The reason to address Ephedra plant to Soma-Haoma drink is mainly because of its ephedrine, which works on muscle strength, low blood pressure, (and asthma) to make people get rid of tiredness (Houben 2013). Furthermore, it is thought that Ephedra with anti-fatigue function gives gods or the dead immortality, longevity, and resurrection (Mahdihassan 1987). From a mobile consideration of Vedic Aryans perspective, it is thought Vedic Aryans made use of Ephedra, cannabis and poppy to produce Soma drink in Margiana, only Ephedra in Bactria and in Indian mountains area, but other substitutes in Indian plains (Shah 2014). From the Ephedra perspective, it is agreeable that the Xiaohe-Gumugou people were related to the Indo-Aryan peoples (Mallory et al. 1997; Wang 2017).

gumugou-xiaohe
The distribution map of the sites in the Xiaohe cultural horizon.

Burial customs

Both the Xiaohe and the Gumugou groups maintained similar burial customs, but we can distinguish a developing process from the slight diverse ways of the Gumugou cemetery to the highly consistent and advanced technology in making coffins of the Xiaohe cemetery. In terms of the dressing, the dead wore a felt cap, a pair of leather boots, a bracelet twined on the right wrist, and was wrapped in a big felt mantle. The dead in the Xiaohe cemetery also wore a loin-cloth. Commonly, both cemeteries contained burials goods of Ephedra twigs, grains of wheat and millet, grass-made baskets, animal ears (such as calf ears), and livestock. Wooden coffins in the two cemeteries were constructed in a similar way, by assembling two side-planks, two end-boards, a lid consisting of a few short straight boards, and covered with livestock hide (mainly cattle hide in the Xiaohe cemetery and sheep/goats hide in the Gumugou cemetery).

Considering the similar and continuous burial behaviours in the two cemeteries, it can be assumed that both the Xiaohe and the Gumugou societies were stable and consistent. The Xiaohe cemetery had both the special clay-lid wooden coffins and the normal coffins in its early phase (burial layers 4th-5th), then became stable and consistent with the normal coffins (burial layers 1st-3rd), and developed better construction of the boat-shaped coffins. The Gumugou cemetery contained two main burial patterns, type I (the sun-radiating-spokes burials) and type II (the normal burials), which coexisted at the same time. Burials of type II were similar but not limited to strict rules. Burials in both the Xiaohe and the Gumugou cemetery were fairly heterogeneous, and the clay-lid wooden coffins in the Xiaohe cemetery and the sun-radiating-spokes burials in the Gumugou cemetery only took up a small percentage of each cemetery. These special burial types could indicate special roles of the dead in their respective societies: either the dead had high social positions, or they possibly had a different ancestral origin. It is argued here that the latter is quite possible, considering the mixed populations in the two cemeteries.

The sun-radiating-spokes burials share some features with a similar type of grave, constructed of circular stone kerbs of the stone-pit graves. The sun-radiating-spokes burials might represent an adaptation to the local desert environment, which had better access to wood rather than stones. Circular stone kerbs with a stone-pit in the centre were widely seen in Bronze Age Afanasievo and Andronovo burials, and also in the late Bronze Age and early Iron Age burials along the Tian Shan. The present study suggests a high possibility that the six males buried in the sun-radiating-spokes graves came from the contemporary parallel Andronovo horizon, and kept some of their own ancestral memories in an adapted way.

Assumed spreading/expansion routes of stone burial constructions.

Societies

Although the Xiaohe and Gumugou societies were stable and consistent, this does not mean that they were isolated, and we can see strong indications of them being open to the outside. With time, the Xiaohe population acquired even more diverse origins, as newcomers kept joining the group from outside. However, the burial behaviours in the Xiaohe cemetery did not change as a consequence of these additions. This suggests that the newcomers inherited the local burial customs, and strongly indicates that they became part of the community and adopted the new social identity, possibly through marriage. As a result, the diverse populations can well explain the coexistence of different cultural elements in the burials, e.g. cattle, sheep/goats, camel hair (from Central Asia), grains of wheat (from the west) and millet (from the east), etc.

The Xiaohe and the Gumugou societies were similar, but the Xiaohe society developed to a more advanced level both in economy and in social structure. First, the oasis-based economic systems of the Xiaohe and the Gumugou had similar husbandry, but later this developed to different extents. Both societies mainly relied on livestock, and while the Xiaohe people favoured cattle, the Gumugou people favoured sheep/goats. The two societies also developed agriculture, which can be seen from the grains of wheat and millet. It has been shown that the grains of wheat are bread wheat. The Xiaohe people also cooked porridge with millet and milk, and had dairy products.

From this evidence, we can assume that the Xiaohe people developed a stronger economy. Secondly, the Xiaohe society had more distinct gender roles, resulting in different social roles for men and women in terms of work and religion. The female and male dead were buried in distinct ways, with loin-cloths and wooden monuments. Sexual identity on a social level refers to how people consider and expect different genders to act and behave under the social and cultural framework. In the Xiaohe society, men carried out hunting tasks (creatures like vultures, badgers, lizards, snakes); women were associated with the rebirth of lives. To synthesize, a possible relation between the Xiaohe and the Gumugou societies is that they represent two parallel groups who shared similar economic systems because of the similar environment, or that there is a chronological difference where the Gumugou people may have existed earlier. The absolute dating information from the two cemeteries is insufficient to rule out the second situation.

The area division of the Tarim Basin and its surroundings (The division is made based on the mountain ranges including Altai Mountains, Tian Shan, and Kunlun Mountains, and also the distribution of ancient cemeteries in the whole Xinjiang generally.)

Surroundings

To place the Xiaohe horizon in the larger context of the Bronze Age burials in its surroundings, the hypothesis presented in this study is that the Xiaohe-Gumugou people might possibly represent a parallel to the Andronovo groups, with an eastward migration, that developed their own societies and ethnicities in the Tarim Basin with some ancestral memories still preserved. Considering the location and the geographical features of Xinjiang, the Altai Mountains and the Tian Shan left open access from the Eurasian Steppe to the Dzungarian Basin. The Hami Basin-the Balikun Grassland was the first intersection area to combine the possible western and eastern cultural influences. To pass by the Turpan Basin and enter into the Tarim Basin, there were two possible routes, one northern route along the southern edge of Tian Shan, and one southern route along the northern edge of Kunlun Mountains.

In the early Bronze Age, the burials in Xinjiang had some clear typical geographic features that distinguish them from their surroundings. But from the late Bronze Age to the early Iron Age, the tradition with circular kerbs of stones with stone-pits burials expanded along the southern edge of the Tian Shan, which was a major shift of burial practice that possibly could be linked to the expansion of the Andronovo horizon or a general nomadic expansion.

Although there were no horses or wagons found in the Xiaohe burials, the wooden horse-hoof objects were an indication of horses, which did not exist in their daily lives anymore, but were possibly related to some settlers’ ancestral memories of their nomadic origins. However, it was more important for them to assimilate to the common social identities of their new group. After people died, burial in the communal cemetery was preferred. Even if the dead bodies were lost, wooden substitutes would be used in graves to represent the dead, since they believed in an afterlife and thought that the end of death is rebirth.

Comments

While the results of Li et al. (2010, 2015) of Xiaohe mummies regarding Y-chromosome haplogroups – showing mostly R1a(xZ93) – and radiocarbon dates of the samples are yet to be confirmed, Proto-Tocharians are known to have had contacts with Samoyeds, early Indo-Iranians (in turn in contact with the BMAC language), then into Common Tocharian with ancient Iranians, and then Indo-Aryan and Iranian languages again (for more on this, see Gerd Carling’s publications).

The connection of the Tocharian branch with Afanasevo is essentially indisputable today, like that of Late Proto-Indo-European with late Repin/early Yamna, even more so than it was just 10 years ago, thanks to the most recent genetic investigation. The common genetic stock of Yamna and Afanasevo – as well as that of East Bell Beakers and Palaeo-Balkan peoples – fits perfectly earlier predictions based on the linguistic estimates of the separation and evolution of the diverse language communities, and the tentative attribution to Eurasian steppe-related cultures.

Tentative identification of language groups among Early Bronze Age cultures. Pre-/Proto-Tocharian is traditionally associated with Chemurchek.

The trail leading from Afanasevo to Common Tocharians, on the other hand, seems to be more tricky, not unlike many other Indo-European-speaking groups from Europe and Asia, whose precise evolution until their historical attestation is often unclear. Nevertheless, the eventual presence of diverse haplogroups among historical Tocharians – whether they coincide with ancient DNA recovered from BMAC, South India, Andronovo, or Bronze Age Tian Shan populations – will only be relevant to understand the genetic evolution of the speakers of Tocharian during its different stages.

If the genetic trail backwards from known Tocharians to (earlier) unknown Common Tocharians, and forwards from known Pre-Tocharians to (later) unknown Proto-Tocharians leads unequivocally to these populations from the Xiaohe cultural horizon, this paper shows one of the mechanisms through which peoples of the Andronovo cultural horizon (or, more precisely, male lines derived from it) may have become integrated into a Tocharian-speaking population, not dissimilar to what happened in the steppes between Uralic-speaking Abashevo and Pre-Proto-Indo-Iranian-speaking Catacomb-Poltavka to form the Proto-Indo-Iranian-speaking Sintashta-Potapovka-Filatovka culture.

As we have discussed in this blog many times over, to solve this ethnolinguistic identification of prehistoric cultures one needs to investigate ancient DNA in combination with linguistic guesstimates and the Indo-European homeland problem from a wide anthropological perspective. People not understanding this simple concept are bound to end up in some comical Tocharo-Indo-Iranian grouping related to Corded Ware ancestry from Andronovo, similar to the Celto-Ibero-Basques of elevated CEU BA ancestry and hg. R1b-P312 to the south of the Pyrenees during the Iron Age from Olalde et al. (2019), and to the Balto-Finno-Slavs of hg. R1a-Z283 and elevated “Steppe ancestry” in the BA-IA East Baltic from Saag et al. (2019).

Related

Balto-Slavic accentual mobility: an innovation in contact with Balto-Finnic


Some very specific prosodic innovations affected the Balto-Slavic linguistic community, probably at a time when it already showed internal dialectal differences. Whether those innovations were related to archaic remnants stemming from the parent Proto-Indo-European language, and whether that disintegrating community included different dialects, remains an object of active debate.

“Archaic” Balto-Slavic?

The main question about Balto-Slavic is whether this concept represents a single community, or it was rather a continuum formed by two (Baltic and Slavic) or possibly three (East Baltic, West Baltic, Slavic) neighbouring communities, speaking closely related Northern European dialects, which just happened to evolve very close to each other, i.e. in cultures that were closer to each other than they were to Germanic or Balto-Finnic.

In my opinion, their similarities warrant the reconstruction of a single original central-east European community since the dissolution of Bell Beakers, speaking a North-West Indo-European dialect, and most internal differences between Baltic and Slavic may be explained as innovations. The precise identification of a Proto-Balto-Slavic community remains elusive, although the Unetice-Iwno-Mierzanowice triangle remains the best bet, with Trzciniec showing what seems like an Early Slavic-like population reaching up to the East Baltic.

Bell Beaker expansion in eastern Europe and around the Baltic.

The reconstruction of a common Balto-Slavic proto-language is known to range from difficult to impossible, depending on who you ask, not least because of the differences that are discussed in this post, which have been a battlefield of their own making for Balticists and Slavicists for decades. The old tenet that Balto-Slavic had inherited some traits directly from PIE is – in contrast with e.g. the Italo-Celtic concept – surprisingly alive still today.

Take, for example, these internal differences and supposedly archaic traits:

  • The ruKi rule, where Baltic shows mostly *is, *us, and Slavic shows *, *; or the different output of Satemization in Baltic compared to Slavic (and both compared to Indo-Iranian). Nevertheless, the Satemization trends in Balto-Slavic and Indo-Iranian are usually explained together and taken as a sign of a traditional three-velar system for PIE.
    • If you consider Satemization as a late trend in Balto-Slavic, affecting each dialect in a different way, and thus Balto-Slavic phonetic evolution clearly distinct from the Indo-Iranian trend, rejecting tritectalism, this problem is solved. This would also solve the impossible Indo-Slavonic problem, and the paradox of Balto-Slavic sharing a genetic phylum with Germanic and Italo-Celtic.
    • If you, however, conflate these differences and North-West Indo-European features with an ad hoc explanation of a hypothetical Centum dialect called Temematic, which intends to solve their (in Holzer’s words) unlösbaren inconsistencies, you essentially add a whole new inconsistency without solving their previous ones. For a full rebuttal of Holzer’s Temematic etymologies, see Matasović (2014).
  • Kortlandt’s reconstruction of a PIE 3rd singular *-e (Baltic from *-et, Slavic from *-eti) and 3rd plural *-o, which would have been replaced independently in other Indo-European dialects (by *-eti, *-onti), is reminiscent of his own reconstruction of laryngeals almost up to the attestation of all Indo-European dialects, including Baltic. If you consider these traits an innovation, this artificially created problem is immediately solved.
  • Genitive plural Pre-Baltic *-ōm vs. Pre-Slavic *-ŏm is another commonly cited example. However, I would place this difference among other similar differences found within other related IE dialects, hence a common phonetic innovation (see e.g. below for the classicist view of unstable obliques).
  • Kortlandt’s reconstruction of oblique cases in *-m-, shared with Germanic, as stemming from a common Middle PIE *-mus (based essentially on Old Lithuanian *-mus and on a non-existent equivalent Anatolian formation), hence different from those in *-bʰ-. While you can argue for any number of more reasonable alternatives, the most often cited one takes the ins.-dat. pl. *-bʰ- as a common NWIE innovation based on ins. sg. *bʰi-, and forms in *-m- (including ins. sg.) as a Northern European phonetic innovation. The simplest, most elegant explanation I’ve read to date (I think by Rémy Viredaz) is the similar bilabial change of Giacobo/Giacomo in Italian…

As you can see, some Balto-Slavicists could have written whole books about how their object of study holds the key to solve problems on common Proto-Indo-European paradigms, some of which wouldn’t need solving if they hadn’t been started by Balto-Slavicists themselves…

While all of these “archaic” traits are easily dismissed without further ado (except for some understandable damaged pride among academics), there is one especially pervasive idea among those willing to find the white whale of laryngeal remnants in Indo-European languages (see here for other examples of dubious laryngeal remains).

The prophecy before the battle, Józef Ryszkiewicz, 1890. Or, how to conjure laryngeal remnants in Balto-Slavic.

Accentual development in contact

Whichever position one prefers, the general argument is that the Balto-Slavic accentual system is non-trivial for the classification of both dialects into a common branch. However, that would only be completely true if it were a common innovation, but not so much if it were a natural laryngeal evolution.

In fact, the broken tone preserving a PIE laryngeal, as proposed by Kortlandt – continuing Meillet’s idea of synchronous PIE-PBS developments – was always very difficult to accept. Even the rising pronunciation is not original, and represents a shift of the accent on the initial syllable in Latvian…

In my opinion, the derivation of a modern phenomenon from a PIE laryngeal must always raise a red flag (see below on archaisms vs. innovations in IE languages). As you can see from my take of the fable in Balto-Slavic, which uses Kortlandt’s reconstruction, I preferred not to take into account the reconstructed accents. The fable remains thus a model of what could have been a common Proto-Balto-Slavic, unlike other reconstructions, which are much less tentative.

NOTE. You could argue that accents may be reconstructed in spite of the wrong theory behind them, but this is not true; at least not of all reconstructed accents, some of which require further assumptions. Think about it this way: I wouldn’t take into account a reconstruction of Germanic accent which used Danish glottalized tone for a hypothetical Proto-Germanic laryngeal, even if most accents seemed correct at first sight. The truth is, I didn’t want to dedicate time to go through each reconstructed word and its explanation, so it was easier to delete them all, even though that’s not an actual solution, either. You will find the same doubts in the description of Balto-Slavic evolution in my old Modern Indo-European grammar. The introduction to IE dialects was partially copied from Wikipedia (which, in the case of Balto-Slavic, essentially summarized data from Kortlandt), but in the grammar I just tried to keep the basics, and not very successfully, because you need a comprehensive and coherent description of a language’s evolution. That’s how messed up the question was, and how it still is, even though 15 years of research have passed…

Despite the idea of an “archaic Balto-Slavic”, especially prevalent among older researchers, the current trend is to consider Balto-Slavic prosodic changes as a natural innovation, even among those who would artificially reconstruct laryngeal remnants up to late Balto-Slavic stages.

NOTE. You can read more about the Proto-Indo-European laryngeal loss and vocalism. While the presence of certain laryngeals up to Late PIE is certain, the loss in many environments is also generally agreed upon. This is especially true of a hypothetical Indo-Slavonic branch, like that supported by Kortlandt: even those supporting multiple laryngeal loss events must admit that Indo-Iranian showed no laryngeals before its disintegration, whether they put this loss as an internal Proto-Indo-Iranian evolution, or they place it earlier. Tocharian attests to an evolution similar to the rest of Late PIE dialects (hence to a quite early laryngeal loss trend), and Balkan dialects (supposedly splitting before Indo-Slavonic) also lost laryngeals in a similar way, except for initial ones, which show vocalic output instead of full loss.

So, where does a laryngeal loss fit in this “Indo-Slavonic” scheme, exactly? Before the Tocharian split? Before the Balkan split? After the Balkan split but before the full loss in Indo-Iranian? And where exactly does this group belong regarding Corded Ware, and where does Germanic? No idea (but you can read Kortlandt try fitting his model with Gimbutas’ “Kurgan peoples”). Because one thing is to reconstruct Proto-Greek, or Proto-Celtic, or Proto-Italic forms without laryngeals and to put them in relation with a purely theoretical three-laryngeal PIE, and a different one is to reconstruct laryngeals (including in environments which were already lost in Tocharian) up to Proto-Baltic and Proto-Slavic, which seems more than just a bit of a stretch…

Indo-European dialectal relationships, from Mallory and Adams (2006).

Thomas Olander offered a summary of the current positions regarding the Balto-Slavic accentual system recently in Indo-European heritage in the Balto-Slavic accentuation system (2013), which also contains a summary of his Mobility Law, to explain this phenomenon as a common Pre-Baltic and Pre-Slavic innovation.

Andersen, an advocate of different Baltic and Slavic dialects developing in contact with Satem dialects, suggested in The Satem Languages of the Indo-European Northwest. First Contacts? (2009), partially based on Olander’s initial proposal, that Baltic and Slavic accentual mobility arose as a result of contact with languages with fixed word-initial ictus: the accent was lost in the word-final mora in pre-Proto-Baltic and, independently, in pre-Proto-Slavic. Hence, the central innovation, the accent loss

technically is not a shared Slavic and Baltic innovation. On the contrary. It shows that the speakers of the Pre-Slavic and Pre-Baltic dialects formed bilingual communities with speakers of contact dialects that were of the same prosodic type, viz. had fixed initial ictus but no free accent.

In the meantime, Olander (2019) has found out about more real-world examples of this same phenomenon:

Prosodic features are known to be susceptible to contact influence (Salmons 1992:1 and passim). While it does not directly influence the evaluation of the Mobility Law as a non-trivial innovation, it is interesting that most of the alleged parallels are indeed considered to be contact-induced changes due to influence from languages with an ictus on the word-initial syllable (Andersen 2009: 11-14; Rinkevičius 2013): Balto-Fennic in the case of the Karelian and (perhaps through Latvian as an intermediary) Žemaitian dialects, and Hungarian in the case of the Slavonian dialects (for Karelian see Jakobson 1938/2002: 239; Veenker 1967: 74; Thomason & Kaufman 1988: 122, 241; Salmons 1992: 41- 42; for Žemaitian see Zinkevičius 1966: 45- 46; for Slavonian see Ivić 1958: 287).

I am not aware of any hypotheses on a contact-induced origin for Greek prosodic innovations, but it is at least worth noting that there is agreement on significant substrate influence on Greek. While we may speculate that these substrate language(s) had word-initial ictus like Balto-Fennic and Hungarian, we do not have any actual information about the prosodic system(s) (thus even Beekes 2014: 9, who in other respects provides a fairly detailed picture of the substrate).

The parallels from other speech varieties show that an accent loss of the type suggested for a pre-stage of Baltic and Slavic is a type of prosodic change that has occurred several times in different various systems. In the context of the present paper this means that the sound law itself cannot be classified as a non-trivial innovation; it may have taken place in already differentiated dialects or languages. Also, the parallels suggest that a loss of the accent may be the result of influence from languages with fixed word-initial ictus.

In this time when even linguists agree that substrate/contact languages have to be related to specific ethnolinguistic groups (see here for Germanic), the fact that Olander stops short of naming this substrate behind Pre-Baltic and Pre-Slavic as being Late Uralic in general, or Balto-Finnic in particular, is surprising.

NOTE. Not least because Olander is part of the Homeland Timeline map project of the Copenhagen group (their website is not working right now), and they placed Volosovo as Uralians expanding with Netted Ware in contact with the Baltic during the Bronze Age… So what’s to doubt about Balto-Slavic – Balto-Finnic contacts, exactly? Maybe if Balto-Finnic was the substrate language behind Balto-Slavic (as it was in Germanic), it would mean that Uralic languages were previously spoken in territories that later became Germanic- and Balto-Slavic-speaking?

Still image from the Copenhagen Timeline Map (accessed one year ago), showing in green Volosovo hunter-gatherers who, according to the map, later expand to the north-east with Netted Ware…

Archaism vs. Innovation

If we tried to describe these trends of explaining peculiar traits in recent Indo-European dialects as archaism vs. innovation from a purely theoretical point of view, we could roughly distinguish two different positions (with infinite variants, of course) among academics – just like we could find people more inclined to leftist or rightist trends when speaking about economy. When it comes to linguistics, which is the least messed-up field where one can describe Indo-European and Indo-Europeans, I think we can find two alternative basic tenets:

  • One idea would hold that the oldest attested dialects – and those with an older guesstimated proto-language – are the gold standard as to what the original situation may have been, and about what could be described as an archaism. For example, Ancient Greek and Mycenaean or Vedic Sanskrit for old dialects; Tocharian, or Italic dialects for those with quite old guesstimates, each for different reasons; and Anatolian for both, old dialect and attested early.
  • NOTE. Nevertheless, the phonology of Anatolian inscriptions is often difficult to ascertain, and its ancient dialectal nature stemming from a Middle PIE stage may still be disputed by some. The archaic nature of Tocharian seems to be maybe less generally accepted than that of Anatolian, but I would say there is general consensus on the matter today.

  • The other general idea would support that the most isolated dialects are those which may hold the key to the oldest Indo-European traits, somehow hidden from external influences and areal contacts, and thus from generalized innovative trends that have affected the best known ancient dialects. In that sense, languages like Slavic, Baltic, Albanian, or Armenian – as well as some Balkan fragmentary dialects – are quite common aims of study to reveal exceptional PIE traits.

I think the education system in Southern Europe and South Asia is that of formal classicists. In eastern Europe, I’d reckon the education system – especially in regions that were never connected to the Graeco-Roman tradition – favours linguistics as a study of the own and related proto-languages. For northern Europe, I would say it’s 50/50, especially in Scandinavia, depending on whether classicists or linguists dominate over the departments of Indo-European. For example, while Germany or Austria would maybe lean more toward the classics, Copenhagen’s obsession with Germanic as the most archaic IE branch is well known…

A 17th-century birch bark manuscript of Pāṇini’s grammar treatise from Kashmir. Image from Wikipedia.

Both positions, when blindly accepted, are bound to fail at some point or another:

  • If you take Classical Sanskrit, Classical Greek, or Classical Latin as an example of Proto-Indo-European, you are bound to make radical mistakes when reconstructing the parent language, more so if you disregard the oldest attested layers of the languages. An interesting view of the so-called Adradists at the Complutense University of Madrid – apart from their famous 9-laryngeal reconstruction – is that Middle PIE had only 5 cases, with a general (unstable) oblique one in Late PIE that later evolved into the attested 5 to 8 cases in the different dialects. That is, in my opinion, a fairly typical classicist error, which would be easily addressed by taking into account the oldest stages, like those attested in Mycenaean and in Old Latin, instead of focusing on classical grammar. The 8-case system is, in fact, one of the few true Balto-Slavic archaisms, supported by external comparanda.
  • On the other hand, if you take Albanian, Armenian, Baltic or Slavic, or even phonetically dubious data like those from some Anatolian inscriptions, you can eventually argue for anything. And I really mean anything; you are leaving the logic door wide open for any crazy-ass opinion about Proto-Indo-European based on traits found in modern languages: From how many velars evolved (if at all, because you may find all of them in Luwian, or still living in Albanian or in Armenian…) and their nature as ejective consonants in Late PIE (based on Armenian or Germanic); to how many laryngeals and when these laryngeals disappeared (if they actually did disappear, because some may even find them in Modern Lithuanian, in Armenian, or in Danish…); etc. Once you believe your own romantic view of some modern language(s) retaining traits from five thousand years ago, there is no stopping that; not for you, but not for anyone else, either.

NOTE. Some of the funniest consequences of this type of ‘worldview’, where one assumes that – one’s own interpretations of – modern dialects are as reliable as (or even more so than) ancient ones, and that Indo-European dialects somehow split at the same time from the parent language (so there was one common “full laryngeal” language, and then all attested dialects evolved from it), are the theories that you can easily find posted on Facebook’s group on Proto-Indo-European. Let’s just say, for the sake of simplicity, that you can compare English ‘sunrise’ with Spanish ‘sonrisa’ “smile” all you want, and assert that both reveal a common origin in PIE *sup- hence from the Sun and the smile going “up” or something, but no explanation of how you reached that conclusion makes up for the fact that this comparison shouldn’t have even been started at all. Now replace English and Spanish with Armenian, Slavic, and/or Albanian, invent some new IE sound law, throw one or two laryngeals in the mix, and somehow this might get a pass among certain linguists…

The Celebration of Svetovid on Rügen, Alphonse Mucha, The Slav Epic. Image from Wikipedia. Were Early Slavs among a select few romantic peoples to keep the “true” Indo-European language and traditions? Of course not.

While no one can deny the value of different Indo-European branches for the reconstruction of the parent language, no matter how recently they were attested, the only reasonable solution whenever a difficult case arises is to trust ancient dialects more than recent ones. Using data from fringe theories based on recent dialects to build a Proto-Indo-European paradigm, especially when there is contradictory data from ancient IE dialects, is flawed for two reasons:

  1. Languages attested later – especially after periods of population movements and contacts – would show, in general, a greater degree of change. Preferring Old Slavic or Classical Armenian to reconstruct Indo-European over ancient dialects like Ancient Greek, Vedic Sanskrit, or ancient Italic dialects is, in a way, like taking Byzantine Greek, Pali, or Old French as models, respectively.
  2. Classical languages are indeed modified due to the action of grammarians, but once standardized these “languages behind a state” (or religion) are less prone to change, due to the transmission of oral (and written) literature, education, commerce, etc. Languages left to unorganized tribes are less constrained in their evolution, and their internal (substrate) and external (contact) influences are greater and (what’s worse) unknown.

Baltic and Slavic, like Albanian or Armenian, are dialects attested very recently, which may have undergone complex internal and external influences we may never fully understand. Confronted with controversial or inexplicable traits compared to ancient branches like Greek, Indo-Iranian, or Italo-Celtic (especially if they fit with other Indo-European dialects), the conservative solution that will be right most of the time (and I mean 99.9999% of cases) is to assume they represent an innovation over Late PIE.

The fact that some researchers still use these recent dialects as a blank canvas instead, in order to propose unending new ideas about how to reconstruct IE proto-languages, or even older common PIE stages, is shocking. Not “R1a/Steppe” vs. “N1c/Siberian” haplogroup+ancestry bullshit-level shocking, but still unacceptable in a serious academic environment.

The only reason why Balto-Slavicists have failed so many times at the seemingly “unsolvable” question of Proto-Balto-Slavic reconstruction, apart from the known differences between Baltic and Slavic, is precisely the fixation of many on their object of study as a model for other IE languages (and thus for PIE), instead of taking the rest as a model for the reconstruction of Balto-Slavic (or of Proto-Baltic and Proto-Slavic).

Repeating ad nauseam the popular concept of Balto-Slavic (or Baltic and Slavic) being among the most archaic IE dialects, or the slowest evolving IE dialects, and cheap nationalist slogans of the sort, does not help this aim, and just reading or hearing that should make anyone cringe instantly. No less than reading or hearing about Sanskrit being essentially equal to PIE, or spoken in the Indus Valley 10,000 years ago. Because we are not living in the 19th century, mind you.

A Song of Sheep and Horses, revised edition, now available as printed books

cover-song-sheep-and-horses

As I said 6 months ago, 2019 is a tough year to write a blog, because this was going to be a complex regional election year and therefore a time of political promises, hence tenure offers too. Now the preliminary offers have been made, elections have passed, but the timing has slightly shifted toward 2020. So I may have the time, but not really any benefit in dedicating too much effort to the blog, and a lot of potential benefit in dedicating any available time to evaluable scientific work.

On the other hand, I saw some potential benefit in publishing texts with ISBNs, hence the updates to the text and the preparation of these printed copies of the books, just in case. While Spain’s accreditation agency has some hard rules for becoming a tenured professor, especially for medical associates (whose years of professional experience are almost worthless compared to published peer-reviewed papers), it is quite flexible in assessing one’s merits.

However, regional and/or autonomous entities are not, and need an official identifier and preferably printed versions to evaluate publications, such as an ISBN for books. I thus took some time about a month ago to update the texts and supplementary materials, and to publish a printed copy of the books with Amazon. The first copies have arrived, and they look good.

series-song-sheep-horses-cover

Corrections and Additions

Titles
I have changed the names and order of the books, as I intended for the first publication – as some of you may have noticed when the linguistic book was referred to as the third volume in some parts. In the first concept I just wanted to emphasize that the linguistic work had priority over the rest. Now the whole series and the linguistic volume don’t share the same name, and I hope this added clarity is for the better, despite the linguistic volume being the third one.

Uralic dialects
I have changed the nomenclature for Uralic dialects, as I said recently. I haven’t really modified anything deeper than that, because – unlike adding new information from population genomics – this would require me to research thoroughly the most recent publications on Uralic comparative grammar, and I just can’t begin with that right now.

Anyway, the use of terms like Finno-Ugric or Finno-Samic is as correct now for the reconstructed forms as it was before the change in nomenclature.

west-east-uralic-schema

Mediterranean
The most interesting recent genetic data has come from Iberia and the Mediterranean. Despite the lack of direct data from the Italian Peninsula (and thus from the emergence of the Etruscan and Rhaetian ethnolinguistic community), it is becoming clearer how some quite early waves of Indo-Europeans and non-Indo-Europeans expanded and shrank – at least in West Iberia, the West Mediterranean, and France.

Finno-Ugric
Some of the main updates to the text have been made to the sections on Finno-Ugric populations, because some interesting new genetic data (especially Y-DNA) have been published in the past months. This is especially true for Baltic Finns and for Ugric populations.

ananino-culture-new

Balto-Slavic
Consequently, and somehow unsurprisingly, the Balto-Slavic section has been affected by this; e.g. by the identification of Early Slavs likely with central-eastern populations dominated by (at least some subclades of) hg. I2a-L621 and E1b-V13.

Maps
I have updated some cultural borders in the prehistoric maps, and the maps with Y-DNA and mtDNA. I have also added one new version of the Early Bronze Age map, to better reflect the most likely location of Indo-European languages in the Early European Bronze Age.

As those in software programming will understand, major changes in the files that are used for maps and graphics come with an increasing risk of additional errors, so I would not be surprised if some major ones were found (I already spotted three of them). Feel free to report these errors in any way you see fit.

bronze-age-early-indo-european
European Early Bronze Age: tentative language map based on linguistics, archaeology, and genetics.

SNPs
I have selected more conservative SNPs in certain controversial cases.

I have also deleted most SNP-related footnotes and replaced them with the marking of each individual tentative SNP, leaving only those footnotes that give important specific information, because:

  • My way of referencing tentative SNP authors did not make it clear which samples were tentative, if there were more than one.
  • It was probably not necessary to see four names repeated 100 times over.
  • Often I don’t really know if the person I have listed as author of the SNP call is the true author – unless I saw the full SNP data posted directly – or just someone who reposted the results.
  • Sometimes there is more than one author of SNP calls for a certain sample, but I might have listed just one for all.
ancient-dna-all
More than 6000 ancient DNA samples compiled to date.

For a centralized file to host the names of those responsible for the unofficial/tentative SNPs used in the text – and to correct them if necessary –, readers will eventually be able to use Phylogeographer’s tool for ancient Y-DNA, for which they use (partly) the same data I compiled, adding Y-Full’s nomenclature and references. You can see another map tool in ArcGIS.

NOTE. As I say in the text, if the final working map tool does not deliver the names, I will publish another supplementary table to the text, listing all tentative SNPs with their respective author(s).

If you are interested in ancient Y-DNA and you want to help develop comprehensive and precise maps of ancient Y-DNA and mtDNA haplogroups, you can contact Hunter Provyn at Phylogeographer.com. You can also find more about phylogeography projects at Iain McDonald’s website.

Graphics
I have also added more samples to both the “Asian” and the “European” PCAs, and to the ADMIXTURE analyses, too.

I previously used certain samples prepared by amateurs from BAM files (like Botai, Okunevo, or Hittites), and the results were obviously less than satisfactory – hence my criticism of the lack of publication of prepared files by the most famous labs, especially the Copenhagen group.

Fortunately for all of us, most published datasets are free, so we don’t have to reinvent the wheel. I criticized genetic labs for not releasing all data, so now it is time for praise, at least for one of them: thank you to all responsible at the Reich Lab for this great merged dataset, which includes samples from other labs.

NOTE. I would like to make my tiny contribution here, for beginners interested in working with these files, so I will update – whenever I have time – the “How To” sections of this blog for PCAs, PCA3d, and ADMIXTURE.

-iron-age-europe-romans
Detail of the PCA of European Iron Age populations. See full versions.

ADMIXTURE
For unsupervised ADMIXTURE in the maps, a K=5 is selected based on the CV, giving a kind of visual WHG : NWAN : CHG/IN : EHG : ENA, but with Steppe ancestry “in between”. Higher K gave worse CV, which I guess depends on the many ancient and modern samples selected (and on the fact that many samples are repeated from different sources in my files, because I did not have time to filter them all individually).
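For beginners wondering what “selected based on the CV” means in practice: ADMIXTURE run with the `--cv` flag prints one `CV error (K=…)` line per run, and the K with the lowest value is usually preferred. The following is only a minimal sketch of that selection step in Python; the function name `best_k` and the error values in `example_log` are hypothetical placeholders for illustration, not the values obtained for the maps.

```python
import re

# Matches the cross-validation line that ADMIXTURE prints when run with --cv,
# e.g. "CV error (K=5): 0.4012"
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text):
    """Return (K, cv_error) for the run with the lowest CV error."""
    errors = {int(k): float(cv) for k, cv in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no 'CV error' lines found in the log text")
    k = min(errors, key=errors.get)
    return k, errors[k]


# Hypothetical CV errors, made up for illustration only
# (concatenated output of several runs, one per K).
example_log = """\
CV error (K=3): 0.4120
CV error (K=4): 0.4051
CV error (K=5): 0.4012
CV error (K=6): 0.4033
CV error (K=7): 0.4079
"""

print(best_k(example_log))
```

With real data one would concatenate the log files of all runs (one per K) and pass the resulting text to the function; the lowest CV error is a rule of thumb, not a guarantee that the chosen K is the most meaningful one.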

I found some interesting component shared by Central European populations in K=7 to K=9 (from CEU Bell Beakers to Denmark LN to Hungarian EBA to Iberia BA, in a sort of “CEU BBC ancestry” potentially related to North-West Indo-Europeans), but still, I prefer to go for a theoretically more correct visualization instead of cherry-picking the ‘best-looking’ results.

Since I made fun of the search for “Siberian ancestry” in coloured components in Tambets et al. 2018, I have to be consistent and prefer to avoid doing the same here…

qpAdm
In the first publication (in January) and subsequent minor revisions until March, I trusted analyses and ancestry estimates reported by amateurs in 2018, which I used for the text adding my own interpretations. Most of them have been refuted in papers from 2019, as you probably know if you have followed this blog (see very recent examples here, here, or here), compelling me to delete or change them again, and again, and again. I don’t have experience from previous years, although the current pattern must evidently have been repeated many times over, or else we would still be talking about such previous analyses as being confirmed today…

I wanted to be one step ahead of peer-reviewed publications in the books, but I prefer now to go for something safe in the book series, rather than having one potentially interesting prediction – which may or may not be right – and ten huge mistakes that I would have helped to endlessly redistribute among my readers (online and now in print) based on some cherry-picked pairwise comparisons. This is especially true when predictions of “Steppe”- and/or “Siberian”-related ancestry have been published, which, for some reason, seem to go horribly wrong most of the time.

I am sure whole books can be written about why and how this happened (and how this is going to keep happening), based on psychology and sociology, but the reasons are irrelevant, and that would be a futile effort; like writing books about glottochronology and its intermittent popularity due to misunderstood scientific trends. The most efficient way to deal with this problem is to avoid such information altogether, because – as you can see in the current revised text – it wouldn’t really add anything essential to the content of these books, anyway.

Official site of the book series:
A Song of Sheep and Horses: eurafrasia nostratica, eurasia indouralica

Złota a GAC-CWC transitional group…but not the origin of Corded Ware peoples

koszyce-gac-zlota-cwc

Open access Unraveling ancestry, kinship, and violence in a Late Neolithic mass grave, by Schroeder et al. PNAS (2019).

Interesting excerpts of the paper and supplementary materials, about the Złota group variant of Globular Amphora (emphasis mine):

A special case is the so-called Złota group, which emerged around 2,900 BCE in the northern part of the Małopolska Upland and existed until 2,600-2,500 BCE. Originally defined as a separate archaeological “culture” (15), this group is mainly defined by the rather local introduction of a distinct form of burial in the area mentioned. Distinct Złota settlements have not yet been identified. Nonetheless, because of the character of its burial practices and material culture, which both retain many elements of the GAC and yet point forward to the Corded Ware tradition, and because of its geographical location, the Złota group has attracted significant archaeological attention (15, 16).

The Złota group buried their dead in a new, distinct type of funerary structure; so-called niche graves (also called catacomb graves). These structures featured an entrance shaft or pit and, below that, a more or less extensive niche, sometimes connected to the entrance area by a narrow corridor. Local limestone was used to seal off the entrance shaft and to pave the floor of the niche, on which the dead were usually placed along with grave goods. This specific and relatively sophisticated form of burial probably reflects contacts between the northern Małopolska Upland and the steppe and forest-steppe communities further to the east, who also buried their dead in a form of catacomb graves. Individual cases of the use of ochre and of deformation of skulls in Złota burials provide further indications of such a connection (15). At the same time, the Złota niche grave practice also retains central elements of the GAC funerary tradition, such as the frequent practice of multiple burials in one grave, often entailing redeposition and violation of the anatomical order of corpses, and thus differs from the catacomb grave customs found on the steppes which are strongly dominated by single graves. Nonetheless, at Złota group cemeteries single burial graves appear, and even in multiple burial graves the identity of each individual is increasingly emphasized, e.g. by careful deposition of the body and through the personal nature of grave goods (16).

globular-amphorae-corded-ware-zlota-amphorae
Correspondence analysis of amphorae from the Złota-graveyards reveals that there is no typological break between Globular Amphorae and Corded Ware Amphorae, including ‘Strichbündelamphorae’ (after Furholt 2008)

Just like its burial practices, the material culture and grave goods of the Złota group combine elements of the GAC, such as amber ornaments and central parts of the ceramic inventory, with elements also found in the Corded Ware tradition, such as copper ornaments, stone shaft-hole axes, bone and shell ornaments, and other stylistic features of the ceramic inventory. In particular, Złota group ceramic styles have been seen as a clear transitional phenomenon between classical GAC styles and the subsequent Corded Ware ceramics, probably playing a key role in the development of the typical cord decoration patterns that came to define the latter (17).

As briefly summarized above, the Złota group displays a distinct funerary tradition and combination of material culture traits, which give the clear impression of a cultural “transitional situation”. While the group also appears to have had long-distance contacts directed elsewhere (e.g. to Baden communities to the south), it is the combination of Globular Amphora traits, on the one hand, and traits found among late Yamnaya or Catacomb Grave groups to the east as well as the closely related Corded Ware groups that emerged around 2,800 BCE, on the other hand, that is such a striking feature of the Złota group and which makes it interesting when attempting to understand cultural and demographic dynamics in Central and Eastern Europe during the early 3rd millennium BCE.

catacomb-grave-ksiaznice
Catacomb grave no. 2a/06 from Książnice, Złota culture (acc. to Wilk 2013). Image from Włodarczak (2017)

Książnice (site 2, grave 3ZC), Świętokrzyskie province. This burial, a so-called niche grave of the Złota type (with a vertical entrance shaft and perpendicularly situated niche), was excavated in 2006 and contained the remains of 8 individuals, osteologically identified as three adult females and five children, positioned on limestone pavement in the niche part of the grave. Radiocarbon dating of the human remains indicates that the grave dates to 2900-2630 BCE, 95.4% probability (Dataset S1). The grave had an oval entrance shaft with a diameter of 60 cm and depth of 130 cm; the depth of the niche reached to 170 cm (both measured from the modern surface), and it also contained a few animal bones, a few flint artefacts and four ceramic vessels typical of the Złota group. Książnice is located in the western part of the Małopolska Upland, which only has a few Złota group sites but a stronger presence of other, contemporary groups (including variants of the Baden culture).

Wilczyce (site 90, grave 10), Świętokrzyskie province. A rescue excavation in 2001 uncovered a niche grave of the Złota type, which had a round entrance shaft measuring 90 cm in diameter. The grave was some 60-65 cm deep below the modern surface and the bottom of the niche was paved with thin limestone plates, on which remains of three individuals had been placed; two adults, one female and one male, and one child. Four ceramic vessels of Złota group type were deposited in the niche along with the bodies. Wilczyce is located in the Sandomierz Upland, an area with substantial presence of both the Globular Amphora culture and Złota group, as well as the Corded Ware culture from 2800 BCE.

zlota-gac-cwc
Genetic affinities of the Koszyce individuals and other GAC groups (here including Złota) analyzed in this study. (A) Principal component analysis of previously published and newly sequenced ancient individuals. Ancient genomes were projected onto modern reference populations, shown in gray. (B) Ancestry proportions based on supervised ADMIXTURE analysis (K = 3), specifying Western hunter-gatherers, Anatolian Neolithic farmers, and early Bronze Age steppe populations as ancestral source populations. LP, Late Paleolithic; M, Mesolithic; EN, Early Neolithic; MN, Middle Neolithic; LN, Late Neolithic; EBA, Early Bronze Age; PWC, Pitted Ware culture; TRB, Trichterbecherkultur/Funnelbeaker culture; LBK, Linearbandkeramik/Linear Pottery culture; GAC, Globular Amphora culture; Złota, Złota culture. Image modified to outline in red GAC and Złota groups.

To further investigate the ancestry of the Globular Amphora individuals, we performed a supervised ADMIXTURE (6) analysis, specifying typical western European hunter-gatherers (Loschbour), early Neolithic Anatolian farmers (Barcın), and early Bronze Age steppe populations (Yamnaya) as ancestral source populations (Fig. 2B). The results indicate that the Globular Amphora/Złota group individuals harbor ca. 30% western hunter-gatherer and 70% Neolithic farmer ancestry, but lack steppe ancestry. To formally test different admixture models and estimate mixture proportions, we then used qpAdm (7) and find that the Polish Globular Amphora/Złota group individuals can be modeled as a mix of western European hunter-gatherer (17%) and Anatolian Neolithic farmer (83%) ancestry (SI Appendix, Table S2), mirroring the results of previous studies.

zlota-steppe-ancestry-cwc
Table S2. qpAdm results. The ancestry of most Globular Amphora/Złota group individuals can be modelled as a two-way mixture of Mesolithic western hunter-gatherers (WHG), and early Anatolian Neolithic farmers (Barcın). The five individuals from Książnice (Złota group) show evidence for additional gene flow, most likely from an eastern source.

The lack of a direct genetic connection of Corded Ware peoples with the Złota group despite their common “steppe-like traits” – shared with Yamna – reveals, once more, how the few “Yamna-like” traits of Corded Ware do not support a direct connection with Indo-Europeans, and are the result of the expansion of the so-called steppe package all over Europe, and particularly among cultures closely related to the Khvalynsk expansion, and later under the influence of expanding Yamna peoples.

The results from Książnice may support that early Corded Ware peoples were in close contact with GAC peoples in Lesser Poland during the complex period of GAC-Trypillia-CWC interactions, and especially close to the Złota group at the beginning of the 3rd millennium BC. Nevertheless, patrilineal clans of Złota apparently correspond to Globular Amphorae populations, with the only male sample available so far being within haplogroup I2a-L801, prevalent in GAC.

NOTE. The ADMIXTURE of Złota samples in common with GAC samples (and in contrast with the shared Sredni Stog – Corded Ware “steppe ancestry”) makes the possibility of R1a-M417 popping up in the Złota group from now on highly unlikely. If it happened, that would complicate further the available picture of unusually diverse patrilineal clans found among Uralic speakers expanding with early Corded Ware groups, in contrast with the strict patrilineal and patrilocal culture of Indo-Europeans as found in Repin, Yamna and Bell Beakers.

Once again the traditional links between groups hypothesized by archaeologists – like Gimbutas and Kristiansen in this case – are wrong, as is the still fashionable trend in descriptive archaeology, of supporting 1) wide cultural relationships in spite of clear-cut inter-cultural differences (and intra-cultural uniformity kept over long distances by genetically-related groups), 2) peaceful interactions among groups based on few common traits, and 3) regional population continuities despite cultural change. These generalized ideas made some propose a steppe language shared between Pontic-Caspian groups, most of which have been proven to be radically different in culture and genetics.

gimbutas-kurgan-indo-european
The background shading indicates the three migratory waves proposed by Marija Gimbutas, and personally checked by her in 1995. Image from Tassi et al. (2017).

Furthermore, paternal lines show once again marked bottlenecks in expanding Neolithic cultures, supporting their relevance to follow the ethnolinguistic identity of different cultural groups. The steppe- or EHG-related ancestry (if it is in fact from early Corded Ware peoples) in Książnice was thus probably, as in the case of Trypillia, in the form of exogamy with females of neighbouring groups:

The presence of unrelated females and related males in the grave is interesting because it suggests that the community at Koszyce was organized along patrilineal lines of descent, adding to the mounting evidence that this was the dominant form of social organization among Late Neolithic communities in Central Europe. Usually, patrilineal forms of social organization go hand in hand with female exogamy (i.e., the practice of women marrying outside their social group). Indeed, several studies (11, 12) have shown that patrilocal residence patterns and female exogamy prevailed in several parts of Central Europe during the Late Neolithic. (…) the high diversity of mtDNA lineages, combined with the presence of only a single Y chromosome lineage, is certainly consistent with a patrilocal residence system.
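The contrast described in the excerpt (many mtDNA lineages, a single Y-chromosome lineage) can be put into numbers with a simple diversity statistic. The sketch below uses Nei's haplotype diversity, H = n/(n−1) · (1 − Σ pᵢ²); this is my own illustration and not the method used in the paper, and the haplogroup labels are hypothetical placeholders, not the actual Koszyce calls.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are needed")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical labels for illustration only (not the real haplogroup calls):
# five males sharing one Y-DNA lineage, five individuals with four mtDNA lineages.
y_lineages = ["Y1", "Y1", "Y1", "Y1", "Y1"]
mt_lineages = ["mtA", "mtB", "mtC", "mtD", "mtD"]

print(haplotype_diversity(y_lineages))   # a single lineage gives zero diversity
print(haplotype_diversity(mt_lineages))  # several lineages give a value close to 1
```

A value of zero for the paternal lines together with a value close to one for the maternal lines is the kind of pattern that is consistent with patrilocality and female exogamy, although small samples like these can only be suggestive.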

funnelbeaker-trypillia-corded-ware
Map of territorial ranges of Funnel Beaker Culture (and its settlement concentrations in Lesser Poland), local Tripolyan groups and Corded Ware Culture settlements (■) at the turn of the 4th/3rd millennia BC.

Since ancient and modern Uralians show predominantly Corded Ware ancestry, and Proto-Uralic must have been in close contact with Proto-Indo-European for a very long time – given the different layers of influence that can be distinguished between them –, it follows as a logical consequence that the North Pontic forest-steppe (immediately to the west of the PIE homeland in the Don-Volga-Ural steppes) is the most likely candidate for the homeland from which Proto-Uralic expanded, accompanying the spread of Sredni Stog ancestry and a bottleneck under R1a-M417 lineages.

The early TMRCAs in the 4th millennium BC for R1a-M417 and R1a-Z645 support this interpretation, like the R1a-M417 sample found in Sredni Stog. On the other hand, the resurgence of typical GAC-like ancestry in late Corded Ware groups, with GAC lineages showing late TMRCAs in the 3rd millennium BC, proves the disintegration of Corded Ware all over Europe (except in Textile Ceramics- and Abashevo-related groups) as the culture lost its cohesion and different local patrilineal clans used the opportunity to seize power – similar to how eventually I2a-L621 infiltrated eastern (Finno-Ugrian) groups.
