Arrival of steppe ancestry with R1b-P312 in the Mediterranean: Balearic Islands, Sicily, and Iron Age Sardinia

New preprint: The Arrival of Steppe and Iranian Related Ancestry in the Islands of the Western Mediterranean, by Fernandes, Mittnik, Olalde et al., bioRxiv (2019).

Interesting excerpts (emphasis in bold; modified for clarity):

Balearic Islands: The expansion of Iberian speakers

Mallorca_EBA dates to the earliest period of permanent occupation of the islands at around 2400 BCE. We parsimoniously modeled Mallorca_EBA as deriving 36.9 ± 4.2% of her ancestry from a source related to Yamnaya_Samara; (…). We next used qpAdm to identify “proximal” sources for Mallorca_EBA’s ancestry that are more closely related to this individual in space and time, and found that she can be modeled as a clade with the (small) subset of Iberian Bell Beaker culture associated individuals who carried Steppe-derived ancestry (p=0.442).

Suppl. Materials: The model used was with Bell_Beaker_Iberia_highsteppe, a group of outliers from Iberia buried in a Bell Beaker mortuary context who unlike most individuals from this context in that region had high proportions of Steppe ancestry (p=0.442).

Our estimates of Steppe ancestry in the two later Balearic Islands individuals are lower than the earlier one: 26.3 ± 5.1% for Formentera_MBA and 23.1 ± 3.6% for Menorca_LBA, but the Middle to Late Bronze Age Balearic individuals are not a clade relative to non-Balearic groups. Specifically, we find that f4(Mbuti.DG, X; Formentera_MBA, Menorca_LBA) is positive when X=Iberia_Chalcolithic (Z=2.6) or X=Sardinia_Nuragic_BA (Z=2.7). While it is tempting to interpret the latter statistic as suggesting a genetic link between peoples of the Talaiotic culture of the Balearic islands and the Nuragic culture of Sardinia, the attraction to Iberia_Chalcolithic is just as strong, and the mitochondrial haplogroup U5b1+16189+@16192 in Menorca_LBA is not observed in Sardinia_Nuragic_BA but is observed in multiple Iberia_Chalcolithic individuals. A possible explanation is that both the ancestors of Nuragic Sardinians and the ancestors of Talaiotic people from the Balearic Islands received gene flow from an unsampled Iberian Chalcolithic-related group (perhaps a mainland group affiliated to both) that did not contribute to Formentera_MBA.
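For readers unfamiliar with the f4 notation used in the excerpt, here is a minimal sketch of how such a statistic and its Z-score can be computed from allele frequencies. It is not the authors' pipeline: the function names and toy frequencies are hypothetical, and the equal-sized, unweighted block jackknife is a simplifying assumption (real analyses use weighted blocks defined by genomic position).

```python
from math import sqrt


def f4_terms(freq_a, freq_b, freq_c, freq_d):
    """Per-SNP products (a - b) * (c - d) from four allele-frequency lists."""
    return [(a - b) * (c - d)
            for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]


def f4(freq_a, freq_b, freq_c, freq_d):
    """f4(A, B; C, D): average of the per-SNP products."""
    terms = f4_terms(freq_a, freq_b, freq_c, freq_d)
    return sum(terms) / len(terms)


def f4_z(freq_a, freq_b, freq_c, freq_d, n_blocks=5):
    """Return (estimate, Z) using a delete-one-block jackknife.

    Assumption: equal-sized contiguous blocks of SNPs; trailing SNPs that
    do not fill a whole block are dropped.
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    terms = f4_terms(freq_a, freq_b, freq_c, freq_d)
    size = len(terms) // n_blocks
    if size == 0:
        raise ValueError("need at least one SNP per block")
    used = terms[:size * n_blocks]
    total = sum(used)
    estimate = total / len(used)
    # Recompute the statistic with each block left out in turn.
    left_out = [
        (total - sum(used[i * size:(i + 1) * size])) / (len(used) - size)
        for i in range(n_blocks)
    ]
    mean_left_out = sum(left_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_left_out) ** 2 for x in left_out)
    if variance == 0:
        raise ValueError("no variation between blocks; Z is undefined")
    return estimate, estimate / sqrt(variance)


# Toy frequencies (hypothetical): population X and population D carry the
# same excess alleles relative to the outgroup and to C, so f4 is positive.
outgroup = [0.0] * 10
x_pop = [0.1, 0.2, 0.3, 0.4, 0.5, 0.1, 0.2, 0.3, 0.4, 0.5]
c_pop = [0.0] * 10
d_pop = list(x_pop)
estimate, z = f4_z(outgroup, x_pop, c_pop, d_pop)
```

Read this way, a positive f4(Mbuti.DG, X; Formentera_MBA, Menorca_LBA) means that X shares more alleles with Menorca_LBA than with Formentera_MBA, which is the "attraction" the authors describe.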

This sample, like another one in El Argar, is of hg. R1b-P312. So there you are, the data that connects the Proto-Iberian expansion (replacing IE-speaking Bell Beakers) to the Iberian Chalcolithic population, signaled by the increase in Iberian Chalcolithic ancestry after the arrival of Bell Beakers, most likely connected originally to the Argaric and post-Argaric expansions during the MBA.

PCA with previously published ancient individuals (non-filled symbols), projected onto variation from present-day populations (gray squares).

Steppe in Sardinia IA: Phocaeans from Italy?

Most Sardinians buried in a Nuragic Bronze Age context possessed uniparental haplogroups found in European hunter-gatherers and early farmers, including Y-haplogroup R1b1a[xR1b1a1a] which is different from the characteristic R1b1a1a2a1a2 spread in association with the Bell Beaker complex. An exception is individual I10553 (1226-1056 calBCE) who carried Y-haplogroup J2b2a, previously observed in a Croatian Middle Bronze Age individual bearing Steppe ancestry, suggesting the possibility of genetic input from groups that arrived from the east after the spread of first farmers. This is consistent with the evidence of material culture exchange between Sardinians and mainland Mediterranean groups, although genome-wide analyses find no significant evidence of Steppe ancestry so the quantitative demographic impact was minimal.

Another interesting piece of data: these (Mesolithic) remnant R1b-V88 lineages are closely related to the Italian Peninsula, the most likely region of expansion of these lineages into Africa, in turn possibly connected to the expansion of Proto-Afroasiatic.

We detect definitive evidence of Iranian-related ancestry in an Iron Age Sardinian I10366 (391-209 calBCE) with an estimate of 11.9 ± 3.7% Iran_Ganj_Dareh_Neolithic related ancestry, while rejecting the model with only Anatolian_Neolithic and WHG at p=0.0066 (Supplementary Table 9). The only model that we can fit for this individual using a pair of populations that are closer in time is as a mixture of Iberia_Chalcolithic (11.9 ± 3.2%) and Mycenaean (88.1 ± 3.2%) (p=0.067). This model fits even when including Nuragic Sardinians in the outgroups of the qpAdm analysis, which is consistent with the hypothesis that this individual had little if any ancestry from earlier Sardinians.

Proportions of ancestry using a distal qpAdm framework on an individual basis (a), and based on qpWave clusters

Sicily EBA: The Lusitanian/Ligurian connection?

(…) While a previously reported Bell Beaker culture-associated individual from Sicily had no evidence of Steppe ancestry, (…) we find evidence of Steppe ancestry in the Early Bronze Age by ~2200 BCE. In distal qpAdm, the outlier Sicily_EBA11443 is parsimoniously modeled as harboring 40.2 ± 3.5% Steppe ancestry, and the outlier Sicily_EBA8561 is parsimoniously modeled as harboring 23.3 ± 3.5% Steppe ancestry. (…) The presence of Steppe ancestry in Early Bronze Age Sicily is also evident in Y chromosome analysis, which reveals that 4 of the 5 Early Bronze Age males had Steppe-associated Y-haplogroup R1b1a1a2a1a2. (Online Table 1). Two of these were Y-haplogroup R1b1a1a2a1a2a1 (Z195) which today is largely restricted to Iberia and has been hypothesized to have originated there 2500-2000 BCE. This evidence of west-to-east gene flow from Iberia is also suggested by qpAdm modeling where the only parsimonious proximate source for the Steppe ancestry we found in the main Sicily_EBA cluster is Iberians.

What’s this? An ancestral connection between Sicel and Galaico-Lusitanian or Ligurian (based on an origin in NE Iberia)? Impossible to say, especially if the languages of these early settlers were replaced later by non-Indo-European speakers from the eastern Mediterranean, and by Indo-European speakers from the mainland closely related to Proto-Italic during the LBA, but see below.

Regarding the comment on R1b-Z195, it is associated with modern Iberians, as DF27 in general, due to founder effects beyond the Pyrenees. It is a very old subclade, split directly from DF27 roughly at the same time as it split from the parent P312, i.e. it can be found anywhere in Europe, and it almost certainly accompanied the expansion of Celts from Central Europe under the subclade R1b-M167/SRY2627.

The connection is thus strong only because of the qpAdm modeling, since R1b-DF27 and subclade R1b-Z195 are certainly lineages expanded quite early, most likely with Yamna settlers in Hungary and East Bell Beakers.

In this case, if stemming from Iberia, it is most likely of subclade R1b-Z220 – or another Z195 (xM167) lineage – originally associated with the Old European substrate found in topo-hydronymy in Iberia, whose most likely remnants attested during the Iron Age were Lusitanians.

Left: Modern distribution of R1b-Z195 (YFull estimate 2700 BC); Right: Modern distribution of DF27. Both include later founder effects within Iberia, so the increase in the Basque country and the Crown of Aragon and the decrease in Portugal can safely be ignored. Contour maps of the derived allele frequencies of the SNPs analyzed in Solé-Morata et al. (2017).

We detect Iranian-related ancestry in Sicily by the Middle Bronze Age 1800-1500 BCE, consistent with the directional shift of these individuals toward Mycenaeans in PCA. Specifically, two of the Middle Bronze Age individuals can only be fit with models that in addition to Anatolia_Neolithic and WHG, include Iran_Ganj_Dareh_Neolithic. The most parsimonious model for Sicily_MBA3125 has 18.0 ± 3.6% Iranian-related ancestry (p=0.032 for rejecting the alternative model of Steppe rather than Iranian-related ancestry), and the most parsimonious model for Sicily_MBA has 14.9 ± 3.9% Iranian-related ancestry (p=0.037 for rejecting the alternative model).

The modern southern Italian Caucasus-related signal identified in Raveane et al. (2018) is plausibly related to the same Iranian-related spread of ancestry into Sicily that we observe in the Middle Bronze Age (and possibly the Early Bronze Age).

The non-Indo-European Sicanians and Elymians were possibly then connected to eastern Mediterranean groups before the expansion of the Sea Peoples.

For the Late Bronze Age group of individuals, qpAdm documented Steppe-related ancestry, modeling this group as 80.2 ± 1.8% Anatolia_Neolithic, 5.3 ± 1.6% WHG, and 14.5 ± 2.2% Yamnaya_Samara. Our modeling using sources more closely related in space and time also supports Sicily_LBA having Minoan-related ancestry or being derived from local preceding populations or individuals with ancestries similar to those of Sicily_EBA3123 (p=0.527), Sicily_MBA3124 (p=0.352), and Sicily_MBA3125 (p=0.095).
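As a quick sanity check on figures like these, the short sketch below treats each "±" value as a standard error and applies a normal approximation. Both are assumptions on my part, and the helper names are hypothetical; it is only meant to show how far the Steppe component sits from zero, not to reproduce qpAdm.

```python
def z_from_zero(estimate, std_err):
    """How many standard errors the estimate lies from zero."""
    return estimate / std_err


def normal_interval(estimate, std_err, width=1.96):
    """Approximate 95% interval, assuming a normal sampling distribution."""
    return (estimate - width * std_err, estimate + width * std_err)


# Percentages and standard errors quoted above for the Sicily LBA group.
sicily_lba = {
    "Anatolia_Neolithic": (80.2, 1.8),
    "WHG": (5.3, 1.6),
    "Yamnaya_Samara": (14.5, 2.2),
}

# The three components of a mixture model should add up to 100%.
total = sum(est for est, _ in sicily_lba.values())

steppe_z = z_from_zero(*sicily_lba["Yamnaya_Samara"])
steppe_low, steppe_high = normal_interval(*sicily_lba["Yamnaya_Samara"])
```

Under those assumptions the Steppe estimate is more than six standard errors from zero, with an approximate interval of about 10% to 19%.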

This increase in Steppe-related ancestry in a western site during the LBA most likely represents either an expansion from the Aegean or – maybe more likely, given the archaeological finds – a regional population similar to Sicily EBA re-emerging or rather being displaced from the eastern part of the island because of a westward movement from nearby Calabria.

NOTE. Whether this population sampled spoke Indo-European or not at this time is questionable, since the Iron Age accounts show non-IE Elymians in this region.

EDIT (21 MAR): Interesting about a proposed incoming Minoan-like ancestry is the potential origin of the Iran Neolithic-related ancestry that is going to appear in Central Italy during the LBA. This could then be potentially associated with Tyrsenians passing through the area, although the traditional description may be more compatible with an arrival of Sea Peoples from the Adriatic.

Sad to read this:

This manuscript is dedicated to the memory of Sebastiano Tusa of the Soprintendenza del Mare in Palermo, who would have been an author of this study had he not tragically died in the crash of Ethiopia Airlines flight 302 on March 10.

Related

Aquitanians and Iberians of haplogroup R1b are exactly like Indo-Iranians and Balto-Slavs of haplogroup R1a

The final paper on Indo-Iranian peoples, by Narasimhan and Patterson (see preprint), is soon to be published, according to the first author’s Twitter account.

One of the interesting details of the development of the Bronze Age Iberian ethnolinguistic landscape was the making of Proto-Iberian and Proto-Basque communities, which we already knew were going to show R1b-P312 lineages, a haplogroup clearly associated during the Bell Beaker period with expanding North-West Indo-Europeans:

From the Bronze Age (~2200–900 BCE), we increase the available dataset from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period, albeit with less impact in the south. The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry. These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups. Y-chromosome turnover was even more pronounced, as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269.

Proportion of ancestry derived from central European Beaker/Bronze Age populations in Iberians from the Middle Neolithic to the Iron Age (table S15). Colors indicate the Y-chromosome haplogroup for each male. Red lines represent period of admixture. Modified from Olalde et al. (2019).

The arrival of East Bell Beakers speaking Indo-European languages involved, nevertheless, the survival of the two non-IE communities isolated from each other – likely stemming from south-western France and south-eastern Iberia – thanks to a long-lasting process of migration and admixture. There are some common misconceptions about ancient languages in Iberia which may have caused some wrong interpretations of the data in the paper and elsewhere:

NOTE. A simple reading of Iberian prehistory would be enough to correct these. Two recent books on this subject are Villar’s Indoeuropeos, iberos, vascos y otros parientes and Vascos, celtas e indoeuropeos. Genes y lenguas.

Iberian languages were spoken at least in the Mediterranean and the south (ca. “1/3 of Iberia“) during the Bronze Age.

Nope, we only know the approximate location of Iberian culture and inscriptions from the Late Iron Age, and they occupy the south-eastern and eastern coastal areas, but before that it is unclear where they were spoken. In fact, it seems evident now that the arrival of Urnfield groups from the north marks the arrival of Celtic-speaking peoples, as we can infer from the increase in Central European admixture, while the expansion of anthropomorphic stelae from the north-west must have marked the expansion of Lusitanian.

Vasconic was spoken on both sides of the Pyrenees, as it was in the Middle Ages.

Wrong. One of the worst mistakes I am seeing in many comments since the paper was published, although admittedly the paper goes around this problem talking about “Modern Basques”. Vasconic toponyms appear south of the Pyrenees only after the Roman conquests, and tribes of the south-western Pyrenees and Cantabrian regions were likely Celtic-speaking peoples. Aquitanians (north of the western Pyrenees) are the only known ancient Vasconic-speaking population in proto-historic times, ergo the arrival of Bell Beakers in Iberia was most likely accompanied by Indo-European languages which were later replaced by Celtic expanding from Central Europe, and Iberian expanding from south-east Iberia, and only later with Latin and Vasconic.

Ligurian is non-Indo-European, and Lusitanian is Celtic-like, so Iberia must have been mostly non-Indo-European-speaking.

The fragmentary material available on Ligurian is enough to show that phonetically it is a NWIE dialect of non-Celtic, non-Italic nature, much like Lusitanian; that is, unless you follow laryngeals up to Celtic or Italic, in which case you can argue anything about this or any other IE language, as people who reconstruct laryngeals for Baltic in the common era do.

EDIT (19 Mar 2019): It was not clear enough from this paragraph, because Ligurian-like languages in NE Iberia is just a hypothesis based on the archaeological connection of the whole southern France Bell Beaker region. My aim was to repeat the idea that Old European topo-hydronymy is older in NE Iberia (as almost anywhere in Iberia) than Iberian toponymy, so the initial hypothesis is that:

  1. a Palaeo-European language (as Villar puts it) expanded into most regions of Iberia in ancient times (he considered at some point the Mesolithic, but that is obviously wrong, as we know now); then
  2. Celts expanded at least to the Ebro River Basin; then
  3. Iberians expanded to the north and replaced these in NE Iberia; and only then
  4. after the Roman invasion, around the start of the Common Era, appear Vasconic toponyms south of the Pyrenees.

Lusitanian obviously does not qualify as Celtic, lacking the most essential traits that define Celticness… Unless you define “(Para-)Celtic” as Pre-Proto-Celtic-like, or anything of the sort to support some Atlantic continuity, in which case you can also argue that Pre-Italic or Pre-Germanic are Celtic, because you would be essentially describing North-West Indo-European.

If Basques have R1b, it’s because of a culture of “matrilocality” as opposed to the “patrilocality” of Indo-Europeans.

So wrong it hurts my eyes every time I read this. Not only does matrilocality in a regional group have few known effects in genetics, but there are many well-documented cases of population replacement (with either ancestry or Y-DNA haplogroups, or both) without language replacement, without a need to resort to “matrilineality” or “matrilocality” or any other cultural difference in any of these cases.

In fact, it seems quite likely now that isolated ancient peoples north of the Pyrenees will show a gradual replacement of surviving I2a lineages by neighbouring R1b, while early Iberian R1b-DF27 lineages are associated with Lusitanians, and later incoming R1b-DF27 lineages (apart from other haplogroups) are most likely associated with incoming Celts, which must have remained in north-central and central-east European groups.

NOTE. Notice how R1a is fully absent from all known early Indo-European peoples to date, whether Iberian IE, British IE, Italic, or Greek. The absence of R1a in Iberia after the arrival of Celts is even more telling of the origin of expanding Celts in Central Europe.

I haven’t had enough time to add Iberian samples to my spreadsheet, and hence neither to the ASoSaH texts nor maps/PCAs (and I don’t plan to, because it’s more efficient for me to add both, Asian and Iberian samples, at the same time), but luckily Maciamo has summed it up on Eupedia. Or, graphically depicted in the paper for the southeast:

Y chromosome haplogroup composition of individuals from southeast Iberia during the past 2000 years. The general Iberian Bronze and Iron Age population is included for comparison. Modified from Olalde et al. (2019).

Does this continued influx of Y-DNA haplogroups in Iberia with different cultures represent permanent changes in language? Are, therefore, modern Iberian languages derived from Lusitanian, Sorothaptic/Celtic, Greek, Phoenician, East or West Germanic, Hebrew, Berber, or Arabic languages? Obviously not. Same with Italy (see the recent preprint on modern Italians by Raveane et al. 2018), with France, with Germany, or with Greece.

If that happens in European regions with a known ancient history, why would the recent expansions and bottlenecks of R1b in modern Basques (or N1c around the Baltic, or R1a in Slavs) in the Middle Ages represent an ancestral language surviving into modern times?

Indo-Iranians

If something is clear from Narasimhan, Patterson, et al. (2018), it is that we finally know the timing of the introduction and expansion of R1a-Z645 lineages among Indo-Iranians.

We could already propose since 2015 that a slow admixture happened in the steppes, based on archaeological finds, due to settlement elites dominating over common peoples, coupled with the known Uralic linguistic traits of Indo-Iranian (and known Indo-Iranian influence on Finno-Ugric) – as I did in the first version of the Indo-European demic diffusion model.

The new huge sampling of Sintashta – combined with that of Catacomb, Poltavka, Potapovka, Andronovo, and Srubna – shows quite clearly how this long-term admixture process between Uralic peoples and Indo-Iranians happened between forest-steppe CWC (mainly Abashevo) and steppe groups. The situation is not different from that of Iberia ca. 2500-2000 BC; from Narasimhan, Patterson, et al. (2018):

We combined the newly reported data from Kamennyi Ambar 5 with previously reported data from the Sintashta 5 individuals (10). We observed a main cluster of Sintashta individuals that was similar to Srubnaya, Potapovka, and Andronovo in being well modeled as a mixture of Yamnaya-related and Anatolian Neolithic (European agriculturalist-related) ancestry.

Even with so few words devoted to one of the most important pieces of data in the paper about what happened in the steppes, Wang et al. (2018) help us understand what really happened with this simplistic concept of “steppe ancestry” regarding Yamna vs. Corded Ware differences:

Image modified from Wang et al. (2018). Marked are: in red, approximate limit of Anatolia_Neolithic ancestry found in Yamna populations; in blue, Corded Ware-related groups. “Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional Anatolian farmer-related ancestry in Steppe groups as well as additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups (see also Supplementary Tables 10, 14 and 20).”

As with Iberia (or any prehistoric region), the details of how exactly this language change happened are not evident, but we only need a plausible explanation coupled with archaeology and linguistics. Poltavka, Potapovka, and Sintashta samples – like the few available Iberian ones ca. 2500-2000 BC – offer a good picture of the cohabitation of R1b-L23 (mainly Z2103) and R1a-Z645 (mainly Z93+): a glimpse at the likely presence of R1a-Z93 within settlements – which must have evolved as the dominant elites – in a society where the majority of the population was initially formed by nomad herders (probably mostly R1b-Z2103), who were usually buried outside of the main settlements.

Will the upcoming Narasimhan, Patterson et al. (2019) deal with this problem of how R1a-M417 replaced R1b-M269, and how the so-called “Steppe_MLBA” (i.e. Corded Ware) ancestry admixed with “Steppe_EMBA” (i.e. Yamnaya) ancestry in the steppes, and which one of their languages survived in the region (that is, the same thing the Reich Lab has done with Iberia)? Not likely. The ‘genetic wars’ in Iberia deal with haplogroup R1b-P312, and how it was neither ‘native’ nor associated with Basques and non-Indo-European peoples in general. The ‘genetic wars’ in South Asia are concerned with the steppe origin of R1a, to prove that it is not a ‘native’ haplogroup to India, and thus neither are Indo-Aryan languages. To each region a politically correct account of genetic finds, with enough care not to fully dismiss national myths, it seems.

NOTE. Funnily enough, these ‘genetic wars’ are the making of geneticists since the 1990s and 2000s, so we are still in the midst of mostly internal wars caused by what they write. Just as genetic papers of the 2020s will most likely be a reaction to what they are writing right now about “steppe ancestry” and R1a. You won’t find much change to the linguistic reconstruction in this whole period, except for the most multicolored glottochronological proposals…

The first author of the paper has engaged, as far as I could see on Twitter, in dialogue with Hindu nationalists who try to dismiss the arrival of steppe ancestry and R1a into South Asia as inconclusive (to support the potential origin of Sanskrit millennia ago in the Indus Valley Civilization). How can geneticists deal with the real problem here (the original ethnolinguistic group expanding with Corded Ware), when they have to fend off anti-steppists from Europe and Asia? How can they do it, when they themselves are part of the same societies that demand a politically correct presentation of data?

This is how the data on the most likely Indo-Iranian-speaking region should be presented in an ideal world, where – as in the Iberia paper – geneticists would look closely at the Volga-Ural region to discover what happened with Proto-Indo-Iranians from their earliest to their latest stage, instead of constantly looking for sites close to the Indus Valley to demonstrate who knows what about modern Indian culture:

Tentative map of the Late PIE and Indo-Iranian community in the Volga-Ural steppes since the Eneolithic. Proportion of ancestry derived from central European Corded Ware peoples. Colors indicate the Y-chromosome haplogroup for each male. Red lines represent period of admixture. Modified from Olalde et al. (2019).

Now try and tell Hindu nationalists that Sanskrit expanded from an Early Bronze Age steppe community of R1b-rich nomadic herders that spoke Pre-Indo-Iranian, which was dominated and eventually (genetically) mostly replaced by elite Uralic-speaking R1a peoples from the Russian forest, hence the known phonetic (and some morphological) traits that remained. Good luck with the Europhobic shitstorm ahead.

Balto-Slavic

Iberian cultures, already with a majority of R1b lineages, show a clear northward expansion over previously Urnfield-like groups of north-east Iberia and Mediterranean France (which we now know probably represent the migration of Celts from central Europe). Similarly, Eastern Balts already under a majority of R1a lineages likely expanded into the Baltic region at the same time as the outlier from Turlojiškė (ca. 1075 BC), which represents the first obvious contacts of central-east Europe with the Baltic.

Iberia shows a more recent influx of central and eastern Mediterranean peoples, one of which eventually succeeded in imposing their language in Western Europe: Romans were possibly associated mainly with R1b-U152, apart from many other lineages. Proto-Slavs probably expanded later than Celts, too, connected to the disintegration of the Lusatian culture, and they were at some point associated with R1a-M458 and R1a-Z280(xZ92) lineages, apart from others already found in Early Slavs.

PCA of central-eastern European groups which may have formed the Balto-Slavic-speaking community derived from Bell Beaker, evident from the position ‘westwards’ of CWC in the PCA, and surrounding cultures. Left: Early Bronze Age. Right: Tollense Valley samples.

This parallel between Iberia and eastern Europe is no coincidence: as Europe entered the Bronze Age, chiefdom-based systems became common, and thus the connection of ancestry or haplogroups with ethnolinguistic groups became weaker.

What happened earlier (and who may represent the Pre-Balto-Slavic community) will be clearer when we have enough eastern European samples, but basically we will be able to depict this admixture of NWIE-speaking BBC-derived peoples with Uralic-speaking CWC-derived groups (since Uralic is known to have strongly influenced Balto-Slavic), similar to the admixture found in Indo-Iranians, more or less like this:

Tentative map of the North-West Indo-European and Balto-Slavic community in central-eastern Europe since the East Bell Beaker expansion. Proportion of ancestry derived from Corded Ware peoples. Colors indicate the Y-chromosome haplogroup for each male. Red lines represent period of admixture. Modified from Olalde et al. (2019).

The Early Scythian period marked a still stronger chiefdom-based system which promoted the creation of alliances and federation-like groups, with an earlier representation of the system expanding from north-eastern Europe around the Baltic Sea, precisely during the spread of Akozino warrior-traders (in turn related to the Scythian influence in the forest-steppes), who are the most likely ancestors of most N1c-V29 lineages among modern Germanic, Balto-Slavic, and Volga-Finnic peoples.

Modern haplogroup+language = ancient ones?

It is not difficult to realize, then, that the complex modern genetic picture in Eastern Europe and around the Urals, and also in South Asia (like that of the Aegean or Anatolia) is similar to the Iron Age / medieval Iberian one, and that following modern R1a as an Indo-European marker just because some modern Indo-European-speaking groups showed it was always a flawed methodology; as flawed as following R1b for ancient Vasconic groups, or N1c for ancient Uralic groups.

Why people would argue that haplogroups mean continuity (e.g. R1b with Basques, N1c with Finns, R1a with Slavs, etc.) may be understood, if one still lives in the 2000s. Just like why one would argue that Corded Ware is Indo-European, because of Gimbutas’ huge influence since the 1960s with her myth of “Kurgan peoples”. Not many denied these haplogroup associations, because there was no reason to do so, and those who did usually aligned with a defense of descriptive archaeology.

However, it is a growing paradox that some people interested in genetics today would now, after the Iberian paper, need to:

  • accept that ancient Iberians and probably Aquitanians (each from different regions, and probably from different “Basque-Iberian dialects” in the Chalcolithic, if both were actually related) eventually show expansions with R1b-L23, the haplogroup most obviously associated with expanding Indo-Europeans;
  • acknowledge that modern Iberians have many different lineages derived from prehistoric or historic peoples (Celts, Phoenicians, Greeks, Romans, Jews, Goths, Berbers, Arabs), which have undergone different bottlenecks, the last ones during the Reconquista, but none of their languages have survived;
  • realize that a similar picture is to be found everywhere in central and western Europe since the first proto-historic records, with language replacement in spite of genetic continuity, such as the British Isles (and R1b-L21 continuity) after the arrival of Celts, Romans, Anglo-Saxons, Vikings, or Normans;
  • but, at the same time, continue blindly asserting that haplogroup R1a + “steppe ancestry” represent some kind of supernatural combination which must show continuity with their modern Indo-Iranian or Balto-Slavic language from time immemorial.
Replacement of R1b-L23 lineages during the Early Bronze Age in eastern Europe and in the Eurasian steppes: emergence of R1a in previous Yamnaya and Bell Beaker territories. Modified from EBA Y-DNA map.

Behave, pretty please

The ‘conservative’ message espoused by some geneticists and amateur genealogists here is basically as follows:

  • Let’s not rush to new theories that contradict the 2000s, lest some people get offended by granddaddy not being these pure whatever wherever as they believed, and let’s wait some 5, 10, or 20 years, as long as necessary – to see if some corner of the Yamna culture shows R1a, or some region in north-eastern Europe shows N1c, or some Atlantic Chalcolithic sample shows R1b – to challenge our preferred theories, if we actually need to challenge anything at all, because it hurts too much.
  • Just don’t let many of these genetic genealogists or academics of our time be unhappy, pretty please with sugar on top, and let them slowly adapt to reality with more and more pet theories to fit everything together (past theories + present data), so maybe when all of them are gone, within 50 or 70 years, society can smoothly begin to move on and propose something closer to reality, but always as politically correct as possible for the next generations.
  • For starters, let’s discuss now (yet again) that Bell Beakers may not have been Indo-European at all, despite showing (unlike Corded Ware) clearly Yamna male lineages and ancestry, because then Corded Ware and R1a could not have been Indo-European and that’s terrible, so maybe Bell Beakers are too brachycephalic to speak Indo-European or something, or they were stopped by the Fearsome Tisza River, or they are not pure Dutch Single Grave in The South hence not Indo-European, or whatever, and that’s why Iron Age Iberians or Etruscans show non-Indo-European languages. That’s not disrespectful to the history of certain peoples, of course not, but talking about the evident R1a-Uralic connection is, because this is The South, not The North, and respect works differently there.
  • Just don’t talk about how Slavs and Balts enter history more than 1,500 years later than Indo-European peoples in Western and Southern Europe, including Iberia, and assume a heroic continuity of Balts and Slavs as pure R1a ‘steppe-like’ peoples dominating over thousands of kms. in the Baltic, Fennoscandia, eastern Europe, and northern Asia for 5,000 years, with multiple Balto-Slavs-over-Balto-Slavs migrations, because these absolute units of Indo-European peoples were a trip and a half. They are the Asterix and Obelix of white Indo-European prehistory.
  • Perhaps in the meantime we can also invent some new glottochronological dialectal scheme that fits the expansion of Sredni Stog/Corded Ware with (Germano-?)Indo-Slavonic separated earlier than any other Late PIE dialect; and Finno-Volgaic later than any other Uralic dialect, in the Middle Ages, with N1c.
Genetic structure of the Balto-Slavic populations within a European context according to the three genetic systems, from Kushniarevich et al. (2015). Pure Balto-Slavs from…hmm…yeah this…ancient…region…or people…cluster…Whatever, very very steppe-like peoples, the True Indo-Europeans™, so close to Yamna…almost as close as Finno-Ugrians.

To sum up: Iberia, Italy, France, the British Isles, central Europe, the Balkans, the Aegean, or Anatolia, all these territories can have a complex history of periodic admixture and language replacement everywhere, but some peoples appearing later than all others in the historical record (viz. Basques or Slavs) apparently cannot, because that would be shameful for their national or ethnic myths, and these should be respected.

Ignorance of one’s own past as a blank canvas to be filled in with stupid ethnolinguistic continuity, turned into something valuable that should not be challenged. Ethnonationalist-like reasoning typical of the 19th century. How can our times be called ‘modern’ when this kind of magical thinking is still prevalent, even among supposedly well-educated people?


Iberia: East Bell Beakers spread Indo-European languages; Celts expanded later

iberia-migrations-celts

New paper (behind paywall), The genomic history of the Iberian Peninsula over the past 8000 years, by Olalde et al. Science (2019).

NOTE. Access to article from Reich Lab: main paper and supplementary materials.

Abstract:

We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean.

Interesting excerpts:

From the Bronze Age (~2200–900 BCE), we increase the available dataset (6, 7, 17) from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period (Fig. 1, C and D), albeit with less impact in the south (table S13). The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry (Fig. 2B). These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups (Fig. 2B and fig. S6).

Y-chromosome turnover was even more pronounced (Fig. 2B), as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269. These patterns point to a higher contribution of incoming males than females, also supported by a lower proportion of nonlocal ancestry on the X-chromosome (table S14 and fig. S7), a paradigm that can be exemplified by a Bronze Age tomb from Castillejo del Bonete containing a male with Steppe ancestry and a female with ancestry similar to Copper Age Iberians.

iberian-adna

For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age (Figs. 1, C and D, and 2B). The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken (fig. S6 and tables S11 and S12).

This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition (18). Unlike in Central or Northern Europe, where Steppe ancestry likely marked the introduction of Indo-European languages (12), our results indicate that, in Iberia, increases in Steppe ancestry were not always accompanied by switches to Indo-European languages.

I think it is obvious that they are extrapolating the traditional (and not that well-known) linguistic picture of Iron Age Iberia, assuming continuity of that picture (especially of its non-Indo-European languages) back into the Urnfield period and earlier.

What this data shows is, as expected, the arrival of Celtic languages in Iberia after Bell Beakers and, by extension, in the rest of western Europe. Somewhat surprisingly, this may have happened during the Urnfield period, and not during the La Tène period.

Also important are the precise subclades:

We thus detect three Bronze Age males who belonged to DF27 (154, 155), confirming its presence in Bronze Age Iberia. The other Iberian Bronze Age males could belong to DF27 as well, but the extremely low recovery rate of this SNP in our dataset prevented us to study its true distribution. All the Iberian Bronze Age males with overlapping sequences at R1b-L21 were negative for this mutation. Therefore, we can rule out Britain as a plausible proximate origin since contemporaneous British males are derived for the L21 subtype.
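The exclusion argument in that excerpt can be written down as a toy check. This is only a sketch under loud assumptions: the tree is a heavily simplified fragment (intermediate nodes are omitted on purpose), and the sample records as well as the names `PARENT`, `lineage`, and `compatible_with` are hypothetical, made up for illustration, not data or code from the paper.

```python
# Simplified fragment of the R1b tree (assumption: intermediate nodes omitted).
PARENT = {
    "R1b-M269": None,
    "R1b-P312": "R1b-M269",
    "R1b-DF27": "R1b-P312",
    "R1b-L21": "R1b-P312",
}


def lineage(haplogroup):
    """Return the haplogroup and all its ancestors, from tip to root."""
    path = []
    while haplogroup is not None:
        path.append(haplogroup)
        haplogroup = PARENT[haplogroup]
    return path


def compatible_with(calls, candidate):
    """A sample can belong to `candidate` only if no SNP on the candidate's
    lineage was called ancestral (False). SNPs without coverage are
    uninformative and exclude nothing."""
    return all(calls.get(snp) is not False for snp in lineage(candidate))


# Hypothetical calls: True = derived, False = ancestral, absent = no coverage.
sample_a = {"R1b-M269": True, "R1b-DF27": True, "R1b-L21": False}
sample_b = {"R1b-M269": True, "R1b-L21": False}  # DF27 not recovered

for name, calls in [("sample_a", sample_a), ("sample_b", sample_b)]:
    print(
        name,
        "| DF27 possible:", compatible_with(calls, "R1b-DF27"),
        "| L21 possible:", compatible_with(calls, "R1b-L21"),
    )
```

The second hypothetical sample mirrors the situation described for most Bronze Age males: DF27 cannot be confirmed for lack of coverage, but a negative call at L21 is enough to exclude that branch.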


New open access paper Survival of Late Pleistocene Hunter-Gatherer Ancestry in the Iberian Peninsula, by Villalba-Mouco et al. Cell (2019):

BAL0051 could be assigned to haplogroup I1, while BAL003 carries the C1a1a haplogroup. To the limits of our typing resolution, EN/MN individuals CHA001, CHA003, ELT002 and ELT006 share haplogroup I2a1b, which was also reported for Loschbour [73] and Motala HG [13], and other LN and Chalcolithic individuals from Iberia [7, 9], as well as Neolithic Scotland, France, England [9], and Lithuania [14]. Both C1 and I1/ I2 are considered typical European HG lineages prior to the arrival of farming. Interestingly, CHA002 was assigned to haplogroup R1b-M343, which together with an EN individual from Cova de Els Trocs (R1b1a) confirms the presence of R1b in Western Europe prior to the expansion of steppe pastoralists that established a related male lineage in Bronze Age Europe [3, 6, 9, 13, 19]. The geographical vicinity and contemporaneity of these two sites led us to run genomic kinship analysis in order to rule out any first or second degree of relatedness. Early Neolithic individual FUC003 carries the Y haplogroup G2a2a1, commonly found in other EN males from Neolithic Anatolia [13], Starçevo, LBK Hungary [18], Impressa from Croatia and Serbia Neolithic [19] and Czech Neolithic [9], but also in MN Croatia [19] and Chalcolithic Iberia [9].


Ahead of the (Indo-European – Uralic) game: in theory and in numbers

yamnaya-expansion-bell-beaker

There is a good reason for hope, for those who look for a happy ending to the revolution of population genomics that is quickly turning into an involution led by beliefs and personal interests. This blog is apparently one of the most read sites on Indo-European peoples, if not the most read one, and now on Uralic peoples, too.

I’ve been checking the analytics of our sites, and judging by the numbers of the English blog, Indo-European.eu (without the other languages) is quickly turning into the most visited of Academia Prisca’s sites on Indo-European languages, beyond Indo-European.info (and its parent sites in other languages), which host many popular files for download.

If we take into account file downloads (like images or PDFs), and not only what Google Analytics can record, Indo-European.eu does not yet have more users than all the other websites of Academia Prisca combined, but at this pace it will soon reach half the total visits, possibly before the end of 2019.

Overall, we have evolved from some 10,000 users/year in 2006 to ~300,000 active users/year and >1,000,000 page+file views/year in 2018 (impossible to say exactly without spending too much time on this task). Nothing out of the ordinary, I guess, and obviously numbers are not a quality index, but rather a hint at increasing popularity of the subject and of our work.
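For those who like to see the arithmetic, the figures above translate into a growth of roughly a third per year. The sketch below only uses the approximate numbers quoted in the text (not exact analytics data), and the function name `compound_annual_growth_rate` is just an illustrative choice.

```python
# Rough compound annual growth rate (CAGR) from the figures quoted in the post.
# Assumption: the inputs are the approximate values given in the text.

def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and number of years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


users_2006 = 10_000    # "some 10,000 users/year in 2006"
users_2018 = 300_000   # "~300,000 active users/year ... in 2018"
years_elapsed = 2018 - 2006

cagr = compound_annual_growth_rate(users_2006, users_2018, years_elapsed)
print(f"Approximate yearly growth in users: {cagr:.1%}")
```

A constant rate is of course a simplification: as the chart shows, the actual growth came in peaks linked to specific posts.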

NOTE. The mean reading time is ~2:40 min, which I guess fits the length of most posts, and most visitors read a mean of ~2+ pages before leaving, with increasing reader loyalty over time.

Number of active users of indo-european.eu, according to Google Analytics since before the start of the new blog. Notice the peaks corresponding to the posts below (except the last one, corresponding to the publication of A Song of Sheep and Horses).

The most read posts of 2018, now that we can compare those from the last quarter, are as follows:

  1. The series on the Corded Ware-Uralic theory, with a marked increase in readers, especially with the last three posts:
    1. Finno-Permic and the expansion of N-L392/Siberian ancestry,
    2. “Siberian ancestry” and Ugric-Samoyedic expansions, and
    3. Haplogroups R1a and N in Finno-Ugric and Samoyedic
  2. Haplogroup is not language, but R1b-L23 expansion was associated with Proto-Indo-Europeans
  3. The history of the simplistic ‘haplogroup R1a — Indo-European’ association
  4. On the origin of haplogroup R1b-L51 in late Repin / early Yamna settlers
  5. On the origin and spread of haplogroup R1a-Z645 from eastern Europe
  6. The Caucasus a genetic and cultural barrier; Yamna dominated by R1b-M269; Yamna settlers in Hungary cluster with Yamna
  7. Something is very wrong with models based on the so-called ‘Yamnaya admixture’ – and archaeologists are catching up (II)
  8. Olalde et al. and Mathieson et al. (Nature 2018): R1b-L23 dominates Bell Beaker and Yamna, R1a-M417 resurges in East-Central Europe during the Bronze Age
  9. Early Indo-Iranian formed mainly by R1b-Z2103 and R1a-Z93, Corded Ware out of Late PIE-speaking migrations
  10. “Steppe ancestry” step by step: Khvalynsk, Sredni Stog, Repin, Yamna, Corded Ware

NOTE. Of course, the most recent posts are the most visited ones right now, but that’s because of the constant increase in the number of visitors.

I think it is obvious what the greatest interest of readers has been in the past two years. You can see the pattern by looking at the most popular posts of 2017, when the blog took off again:

  1. Germanic–Balto-Slavic and Satem (‘Indo-Slavonic’) dialect revisionism by amateur geneticists, or why R1a lineages *must* have spoken Proto-Indo-European
  2. The renewed ‘Kurgan model’ of Kristian Kristiansen and the Danish school: “The Indo-European Corded Ware Theory”
  3. The new “Indo-European Corded Ware Theory” of David Anthony
  4. Correlation does not mean causation: the damage of the ‘Yamnaya ancestral component’, and the ‘Future American’ hypothesis
  5. The Aryan migration debate, the Out of India models, and the modern “indigenous Indo-Aryan” sectarianism

The most likely reason for the radical increase in this blog’s readership is very simple, then: people want to know what is really happening with the research on ancestral Indo-Europeans and Uralians, and other blogs and forums are not keeping up with that demand, being content with repeating the same ideas again and again (R1a-CWC-IE, R1b-BBC-Vasconic, and N-Comb Ware-Uralic), despite the growing contradictions. As you can imagine, once you have seen the Yamna -> Bell Beaker migration model of North-West Indo-European, with Corded Ware obviously representing Uralic, you can’t unsee it.

The online bullying, personal attacks, and similar childish attempts to silence those who want to talk about this theory elsewhere (while fringe theories like R1a/CHG-OIT, R1b-Vasconic, or the Anatolian/Armenian-CHG hypotheses, to name just a few, are openly discussed) have had, as could be expected, the opposite effect to what was intended. I guess you can say this blog and our projects have profited from the first relevant Streisand effect of population genomics, big time.

If this trend continues this year (and other bloggers’ or forum users’ faith in miracles is not likely to change), I suppose that after the Yamna Hungary samples are published (with the expected results) this blog is going to be the most read in 2020 by a great margin… I can only infer that this tension is also helping raise the interest in (and politicization of) the question, hence probably the overall number of active users and their participation in other blogs and forums is going to increase everywhere in 2019, too, as this debate becomes more and more heated.

So, what I infer from the most popular posts and the numbers is that people want criticism and controversy, and if you want blood you’ve got it. Here it is, my latest addition to the successful series criticizing the “Corded Ware/R1a–Indo-European” pet theories, a post I wrote two or three months ago, slightly updated with the newest comedy, and a sure success for 2019 (already added to the static pages of the menu):

The “Indo-European Corded Ware theory” doesn’t hold water

This is how I feel when I see spikes in visits with more and more returning users linked to my controversial posts 😉

Are you not entertained?! Are you not entertained?! Is this not why you are here?!

The genetic and cultural barrier of the Pontic-Caspian steppe – forest-steppe ecotone

steppe-forest-steppe-biomes

We know that the Caucasus Mountains formed a persistent prehistoric barrier to cultural and population movements. Nevertheless, an even more persistent frontier to population movements in Europe, especially since the Neolithic, is the Pontic-Caspian steppe – forest-steppe ecotone.

Like the Caucasus, this barrier could certainly be crossed, and peoples and cultures could permeate in both directions, but there have been no massive migrations through it. The main connection between both regions (steppe vs. forest-steppe/forest zone) was probably through its eastern part, through the Samara region in the Middle Volga.

The chances of population expansions crossing this natural barrier anywhere else seem quite limited, with a much less porous crossing region in the west, through the Dnieper-Dniester corridor.

A persistent ecological and cultural frontier

It is very difficult to think of any culture that transgressed this persistent ecological and cultural frontier: many prehistoric and historical steppe pastoralists did appear eventually in the neighbouring forest-steppe areas during their expansions (e.g. Yamna, Scythians, or Turks), as did forest groups who permeated to the south (e.g. Comb Ware, GAC, or Abashevo), but their respective hold in foreign biomes was mostly temporary, because their cultures had to adapt to the new ecological environment. Most if not all groups originally from a different ecological niche eventually disappeared, subjected to renewed demographic pressure from neighbouring steppe or forest populations…

The Samara region in the Middle Volga may be pointed out as the true prehistoric link between forests and steppes (see David Anthony’s remarks), something reflected in its nature as a prehistoric sink in genetics. This strong forest – forest-steppe – steppe connection was seen in the Eurasian technocomplex, during the expansion of hunter-gatherer pottery, in the expansion of Abashevo peoples to the steppes (in one of the most striking cases of population admixture in the area), with Scythians (visible in the intense contacts with Ananyino), and with Turks (Volga Turks).

Simplified map of the distribution of steppes and forest-steppes (Pontic and Pannonian) and xeric grasslands in Eastern Central Europe (with adjoining East European ranges) with their regionalisation as used in the review (Northern—Pannonic—Pontic). Modified from Kajtoch et al. (2016).

Before the emergence of pastoralism, the cultural contacts of the Pontic region (i.e. forest-steppes) with the Baltic were intense. In fact, the connection of the north Pontic area with the Baltic through the Dnieper-Dniester corridor and the Podolian-Volhynian region is essential to understand the spread of peoples of post-Maglemosian and post-Swiderian cultures (to the south), hunter-gatherer pottery (to the north), TRB (to the south), Late Trypillian groups (north), GAC (south), or Comb Ware (south) (see here for Eneolithic movements), and finally steppe ancestry and R1a-Z645 with Corded Ware (north). After the complex interaction of TRB, Trypillia, GAC, and CWC during the expansion of late Repin, this traditional long-range connection is lost and only emerges sporadically, such as with the expansion of East Germanic tribes.

A barrier to steppe migrations into northern Europe

One may think that this barrier was more permeable, then, in the past. However, the frontier is between steppe and forest-steppe ecological niches, and this barrier evolved during prehistory due to climate changes. The problem is, before the drought that began ca. 4000 BC and increased until the Yamna expansion, the steppe territory in the north Pontic region was much smaller, merely a strip of coastal land, compared to its greater size ca. 3300 BC and later.

This – apart from the cultural and technological changes associated with nomadic pastoralism – justifies the traditional connection of the north Pontic forest-steppes to the north, broken precisely after the expansion of Khvalynsk, as the north Pontic area became gradually a steppe region. The strips of north Pontic and Azov steppes and Crimea seem to have had stronger connections to the Northern Caucasus and Northern Caspian steppes than with the neighbouring forest-steppe areas during the Upper Palaeolithic, Mesolithic, and Neolithic.

NOTE. We still don’t know the genetic nature of Mikhailovka or Ezero, steppe-related groups possibly derived from Novodanilovka and Suvorovo close to the Black Sea (which possibly include groups from the Pannonian plains), and how they compare to neighbouring typically forest-steppe cultures of the so-called late Sredni Stog groups, like Dereivka or partly Kvityana.

Typical migration routes through European steppes and forest-steppes. Red line represents the persistent cultural and genetic barrier, with the latest evolution in steppe region represented by the shift from dashed line to the north. Arrows show the most common population movements. Modified from Kajtoch et al. (2016).

Despite the Pontic-Caspian steppes and forest-steppes neighbouring each other for ca. 2,000 km, peoples from forested and steppe areas had an obvious advantage in their own regions, most likely due to the specialization of their subsistence economy. While this is visible already in Palaeolithic and Mesolithic hunter-gatherers, the arrival of the Neolithic package in the Pontic-Caspian region increased the difference between groups, by spreading specialized animal domestication. The appearance of nomadic pastoralism adapted to the steppe, eventually including the use of horses and carts, made the cultural barrier based on the economic know-how even stronger.

Even though groups could still adapt and permeate a different territory (from steppe to forest-steppe/forest and vice-versa), this required an important cultural change, to the extent that it eventually becomes difficult to distinguish these groups from neighbouring ones (like north-west Pontic Mesolithic or Neolithic groups and their interaction with the steppes, Trypillia-Usatovo, Scythians-Thracians, etc.). In fact, this steppe – forest-steppe barrier is also seen to the east of the Urals, with the distinct expansion of Andronovo and Seima-Turbino/Andronovo-like horizons, which seem to represent completely different ethnolinguistic groups.

As a result of this cultural and genetic barrier, like that formed by the Northern Caucasus:

1) No steppe pastoralist culture (which after the emergence of Khvalynsk means almost invariably horse-riding, chariot-using nomadic herders who could easily pasture their cows in the huge grasslands without direct access to water) has ever been successful in spreading to the north or north-west into northern Europe, until the Mongols. No forest culture has ever been successful in expanding to the steppes, either (except for the infiltration of Abashevo into Sintashta-Potapovka).

2) Corded Ware was not an exception: like hunter-gatherer pottery before it (and like previous population movements of TRB, late Trypillia, GAC, Comb Ware or Lublin-Volhynia settlers), its movements between the north Pontic area and central Europe happened through forest-steppe ecological niches, to which its people were adapted. There is no reason to support a direct connection of CWC with true steppe cultures.

3) The so-called “Steppe ancestry” permeated the steppe – forest-steppe ecotone for hundreds of years during the 5th and early 4th millennium BC, due to the complex interaction of different groups, and probably to the aridization trend that expanded steppe (and probably forest-steppe) to the north. Language, culture, and paternal lineages did not cross that frontier, though.

EDIT (4 FEB 2019): Wang et al. is out in Nature Communications. They deleted the Yamna Hungary samples and related analyses, but it’s interesting to see where exactly they think the trajectory of admixture of Yamna with European MN cultures fits best. This path could also be inferred long ago from the steppe connections shown by the Yamna Hungary -> Bell Beaker evolution and by early Balkan samples:

Prehistoric individuals projected onto a PCA of 84 modern-day West Eurasian populations (open symbols). Dashed arrows indicate trajectories of admixture: EHG—CHG (petrol), Yamnaya—Central European MN (pink), Steppe—Caucasus (green), and Iran Neolithic—Anatolian Neolithic (brown). Modified from the original, a red circle has been added to the Yamna-Central European MN admixture.


ASoSaH Reread (II): Y-DNA haplogroups among Uralians (apart from R1a-M417)

corded-ware-yamna-ancestry

This is mainly a reread of Book Two: A Game of Clans of the series A Song of Sheep and Horses: chapters iii.5. Early Indo-Europeans and Uralians, iv.3. Early Uralians, v.6. Late Uralians and vi.3. Disintegrating Uralians.

“Sredni Stog”

While the true source of R1a-M417 – the main haplogroup eventually associated with Corded Ware, and thus Uralic speakers – is still not known with precision, due to the lack of R1a-M198 in ancient samples, we already know that the Pontic-Caspian steppes were probably not it.

We have many samples from the north Pontic area since the Mesolithic compared to the Volga-Ural territory, and there is a clear prevalence of I2a-M223 lineages in the forest-steppe area, mixed with R1b-V88 (possibly a back-migration from south-eastern Europe).

R1a-M459 (xR1a-M198) lineages appear from the Mesolithic to the Chalcolithic scattered from the Baltic to the Caucasus, from the Dniester to Samara, in a situation similar to haplogroups Q1a-M25 and R1b-L754, which supports the idea that R1a, Q1a, and R1b expanded with ANE ancestry, possibly in different waves since the Epipalaeolithic, and formed the known ANE:EHG:WHG cline.

Y-DNA samples from Khvalynsk and neighbouring cultures. See full version.

The first confirmed R1a-M417 sample comes from Alexandria, roughly coinciding with the so-called steppe hiatus. Its emergence in the area of the previous “early Sredni Stog” groups (see the mess of the traditional interpretation of the north Pontic groups as “Sredni Stog”) and its later expansion with Corded Ware supports Kristiansen’s interpretation that Corded Ware emerged from the Dnieper-Dniester corridor, although samples from the area up to ca. 4000 BC, including the few Middle Eneolithic samples available, show continuity of hg. I2a-M223 and typical Ukraine Neolithic ancestry.

NOTE. The further subclade R1a-Z93 (Y26) reported for the sample from Alexandria seems too early, given the confidence interval for its formation (ca. 3500-2500 BC); even R1a-Z645 could be too early. Like the attribution of the R1b-L754 from Khvalynsk to R1b-V1636 (after being previously classified as belonging to the pre-V88 and M73 subclades), it seems reasonable to take these SNP calls with a pinch of salt, especially because Yleaf (designed to look for the furthest subclade possible) does not confirm for them any subclade beyond R1a-M417 and R1b-L754, respectively.

The sudden appearance of “steppe ancestry” in the region, with the high variability shown by Ukraine_Eneolithic samples, suggests that this is due to recent admixture of incoming foreign peoples (of Ukraine Neolithic / Comb Ware ancestry) with Novodanilovka settlers.

The most likely origin of this population, taking into account the most common population movements in the area since the Neolithic, is the infiltration of (mainly) hunter-gatherers from the forest areas. That would confirm the traditional interpretation of the origin of Uralic speakers in the forest zone, although the nature of Pontic-Caspian settlers as hunter-gatherers rather than herders makes this identification today fully unnecessary (see here).

EDIT (3 FEB 2019): As for the most common guesstimates for Proto-Uralic, roughly coinciding with the expansion of this late Sredni Stog community (ca. 4000 BC), you can read the recent post by J. Pystynen in Freelance Reconstruction, Probing the roots of Samoyedic.

Late Sredni Stog admixture shows variability typical of recent admixture of forest-steppe peoples with a steppe-like population. See full version here.

NOTE. Although my initial simplistic interpretation (of early 2017) of Comb Ware peoples – traditionally identified as Uralic speakers – potentially showing steppe ancestry was probably wrong, it seems that peoples from the forest zone – related to Comb Ware or neighbouring groups like Lublin-Volhynia – reached forest-steppe areas to the south and eventually expanded steppe ancestry into east-central Europe through the Volhynian Upland to the Polish Upland, during the late Trypillian disintegration (see a full account of the complex interactions of the Final Eneolithic).

The most interesting aspect of ascertaining the origin of R1a-M417, given its prevalence among Uralic speakers, is to precisely locate the origin of contacts between Late Proto-Indo-European and Proto-Uralic. These contacts were traditionally considered the consequence of interactions between the Middle and Upper Volga regions, but the most recent archaeological research and data from ancient DNA samples have made it clear that Corded Ware is the most likely vector of expansion of Uralic languages, hence these contacts of Indo-Europeans of the Volga-Ural region with Uralians have to be looked for in neighbours of the north Pontic area.

Sredni Stog – Repin contacts representing Uralic – Late Indo-European contacts were probably concentrated around the Don River.

My bet – rather obvious today – is that the Don River area is the source of the earliest borrowings of Late Uralic from Late Indo-European (i.e. post-Indo-Anatolian). The borrowing of the Late PIE word for ‘horse’ is particularly interesting in this regard. Later contacts (after the loss of the initial laryngeal) may be attributed to the traditionally depicted Corded Ware – Yamna contact zone in the Dnieper-Dniester area.

NOTE. While the finding of R1a-M417 populations neighbouring R1b-L23 in the Don-Volga interfluve would be great to confirm these contacts, I don’t know if the current pace of more and more published samples will continue. The information we have right now, in my opinion, suffices to support close contacts of neighbouring Indo-Europeans and Uralians in the Pontic-Caspian area during the Late Eneolithic.

Classical Corded Ware

After some complex movements of TRB, late Trypillia and GAC peoples, Corded Ware apparently emerged in central-east Europe, under the influence of different cultures and from a population that probably (at least partially) stemmed from the north Pontic forest-steppe area.

Single Grave and central Corded Ware groups – showing some of the earliest available dates (emerging likely ca. 3000/2900 BC) – are as varied in their haplogroups as is expected from a sink (which does not in the least resemble the Volga-Ural population):

Interesting is the presence of R1b-L754 in Obłaczkowo, potentially of R1b-V88 subclade, as previously found in two Central European individuals from Blätterhöhle MN (ca. 3650 and 3200 BC), and in the Iron Gates and north Pontic areas.

Haplogroups I2a and G have also been reported in early samples, all potentially related to the supposed Corded Ware central-east European homeland, likely in southern Poland, a region naturally connected to the north Pontic forest-steppe area and to the expansion of Neolithic groups.

Y-DNA samples from early Corded Ware groups and neighbouring cultures. See full version.

The true bottlenecks under haplogroup R1a-Z645 seem to have happened only during the migration of Corded Ware to the east: to the north into the Battle Axe culture, mainly under R1a-Z282, and to the south into Middle Dnieper – Fatyanovo-Balanovo – Abashevo, probably eventually under R1a-Z93.

This separation is in line with their reported TMRCA, and supports the split of Finno-Permic from an eastern Uralic group (Ugric and Samoyedic), although still in contact through the Russian forest zone to allow for the spread of Indo-Iranian loans.

This bottleneck also supports in archaeology the expansion of a sort of unifying “Corded Ware A-horizon” spreading with people (disputed by Furholt), the disintegrating Uralians, and thus a source of further loanwords shared by all surviving Uralic languages.

Confirming this ‘concentrated’ Uralic expansion to the east is the presence of R1a-M417 (xR1a-Z645) lineages among early and late Single Grave groups in the west – which essentially disappeared after the Bell Beaker expansion –, as well as the presence of these subclades in modern Central and Western Europeans. Central European groups thus became integrated in post-Bell Beaker European EBA cultures, and their Uralic dialect likely disappeared without a trace.

NOTE. The fate of R1b-L51 lineages – linked to North-West Indo-Europeans undergoing a bottleneck in the Yamna Hungary -> Bell Beaker migration to the west – is thus similar to that of haplogroup R1a-Z645 – linked to the expansion of Late Uralians to the east –, hence proving the traditional interpretation of the language expansions as male-driven migrations. These are two of the most interesting pieces of genetic data we have to date to confirm previous language expansions and dialectal classifications.

It will be also interesting to see if known GAC and Corded Ware I2a-Y6098 subclades formed eventually part of the ancient Uralic groups in the east, apart from lineages which will no doubt appear among asbestos ware groups and probably hunter-gatherers from north-eastern Europe (see the recent study by Tambets et al. 2018).

Corded Ware ancestry marked the expansion of Uralians

Sadly, some brilliant minds decided in 2015 that the so-called “Yamnaya ancestry” (now more appropriately called “steppe ancestry”) should be associated with ‘Indo-Europeans’. This is causing the development of various new pet theories on the go, as more and more data contradict this interpretation.

There is a clear long-lasting cultural, populational, and natural barrier between Yamna and Corded Ware: they are derived from different ancestral populations, which show clearly different ancestry and ancestry evolution (although they did converge to some extent), as well as different Y-DNA bottlenecks; they show different cultures, including those of preceding and succeeding groups, and evolved in different ecological niches. The only true steppe pastoralists who managed to dominate over grasslands extending from the Upper Danube to the Altai were Yamna peoples and their cultural successors.

corded-ware-yamna-pca
Corded Ware admixture typical of expanding late Sredni Stog-like populations from the forest-steppe. See full version here.

NOTE. You can also read two recent posts by FrankN in the blog aDNA era, with detailed information on the Pontic-Caspian cultures and the formation of “steppe ancestry” during the Palaeolithic, Mesolithic and Neolithic: How did CHG get into Steppe_EMBA? Part 1: LGM to Early Holocene and How did CHG get into Steppe_EMBA? Part 2: The Pottery Neolithic. Unlike your typical amateur blogger on genetics using few statistical comparisons coupled with ‘archaeolinguoracial mumbo jumbo’ to reach unscientific conclusions, these are obviously carefully written texts which deserve to be read.

I will not enter into the discussion of “steppe ancestry” and the mythical “Siberian ancestry” for this post, though. I will just repost the opinion of Volker Heyd – an archaeologist specialized in Yamna Hungary and Bell Beakers who is working with actual geneticists – on the early conclusions based on “steppe ancestry”:

[A]rchaeologist Volker Heyd at the University of Bristol, UK, disagreed, not with the conclusion that people moved west from the steppe, but with how their genetic signatures were conflated with complex cultural expressions. Corded Ware and Yamnaya burials are more different than they are similar, and there is evidence of cultural exchange, at least, between the Russian steppe and regions west that predate Yamnaya culture, he says. None of these facts negates the conclusions of the genetics papers, but they underscore the insufficiency of the articles in addressing the questions that archaeologists are interested in, he argued. “While I have no doubt they are basically right, it is the complexity of the past that is not reflected,” Heyd wrote, before issuing a call to arms. “Instead of letting geneticists determine the agenda and set the message, we should teach them about complexity in past human actions.”

ASoSaH Reread (I): Y-DNA haplogroups among Indo-Europeans (apart from R1b-L23)

eneolithic-early-admixture-steppe-ancestry

Given my reduced free time in these months, I have decided to keep updating the text on Indo-European and Uralic migrations and/or this blog, simultaneously or alternatively, to make the most out of the time I can dedicate to this. I will add the different ‘A Song of Sheep and Horses (ASoSaH) reread’ posts to the original post announcing the books. I would be especially interested in comments and corrections to the book chapters rather than the posts, but any comments are welcome (including in the forum, where comments are more likely to stick).

This is mainly a reread of iv.2. Indo-Anatolians and vi.1. Disintegrating Indo-Europeans.

Indo-Anatolians and Late Indo-Europeans

I have often written about R1b-L23 as the majority haplogroup among Late Proto-Indo-Europeans (see my predictions for 2018 and my summary of 2018), but always expected other haplogroups to pop up somewhere along the way, in Khvalynsk, in Repin, in Yamna, and in Bell Beakers (see e.g. the post on common fallacies of R1a/IE-fans).

Luckily enough – for those of us who want precise answers about the seemingly endless earlier models of Indo-European language expansions (viz. GAC-associated expansion, IE-speaking Old Europe, Anatolian homeland, Iran homeland, Maykop as Proto-Anatolian, Palaeolithic Continuity Theory, Celtic in the Atlantic façade, etc.) – the situation has been more clear-cut than expected: it turns out that, especially during population expansions, acute Y-chromosome bottlenecks were very common in the past, at least until the Iron Age.

Khvalynsk and Repin-Yamna expansions were no different, and that seems quite natural in hindsight, given the strong familial ties and aversion to foreigners typical of Late Proto-Indo-European society and culture – probably not really that different from other contemporary societies, like the neighbouring Late Proto-Uralic or Trypillian ones.

y-dna-khvalynsk
Y-DNA samples from Khvalynsk and neighbouring cultures. See full version here.

Y-DNA haplogroups

During the expansion of early Khvalynsk, the most likely Indo-Anatolian culture, the society of the Don-Volga area was probably made up of different lineages including R1b-V1636, R1b-M269, R1a-YP1272, Q1a-M25, and I2a-L699 (and possibly some R1b-V88?), a variability possibly greater than that of the contemporary north Pontic area, probably a sign of this region being a sink of different east and west migrations from steppe and forest areas.

During its expansion, the Khvalynsk society saw its haplogroup variability reduced, as evidenced by the succeeding expansive Repin culture:

Afanasevo, representing Pre-Tocharian (the earliest Late PIE dialect to branch off), expanded with R1b-L23 – especially R1b-Z2103 – lineages, while early Yamna expanded with R1b-L23 and I2a-L699 lineages, which suggests that these are the main haplogroups that survived the Y-DNA bottleneck undergone during the Khvalynsk expansion, and especially later during the late Repin expansion. Nevertheless, other old haplogroups might still pop up during the Repin and early Yamna period, such as the R1b-V1636 sample from Yamna in the Northern Caucasus.
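One simple way to put a number on such a loss of haplogroup variability is Nei's unbiased gene diversity, computed over the haplogroup labels of the sampled males in each period. The following is only a minimal sketch: the function name is hypothetical, and the two lists are invented toy labels chosen to mimic the qualitative pattern described above, not real sample counts from any published dataset.

```python
from collections import Counter


def haplogroup_diversity(labels):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum(p_i ** 2)).

    `labels` is a list with one haplogroup name per sampled male.
    The result is 0 when everyone carries the same lineage and
    1 when every sample carries a different one.
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(labels)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Invented toy labels (ASSUMPTION, for illustration only):
# a varied pre-expansion group versus a post-bottleneck group.
before = ["R1b-V1636", "R1b-M269", "R1a-YP1272", "Q1a-M25", "I2a-L699"]
after = ["R1b-L23", "R1b-L23", "R1b-L23", "R1b-L23", "I2a-L699"]

print(round(haplogroup_diversity(before), 3))
print(round(haplogroup_diversity(after), 3))
```

With real data the sample sizes per culture are small, so any such figure would need confidence intervals before being read as evidence of a bottleneck.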

It is still unclear if the R1b-L23 sister clade R1b-PF7562 (formed ca. 4400 BC, TMRCA ca. 3400 BC), prevalent among modern Albanians, expanded with Yamna migrants, or if it was part of an earlier expansion of R1b-M269 into the Balkans, and thus represents Indo-Anatolian speakers who later hitchhiked the expansion of the Late PIE language from the north or west Pontic area. The early TMRCA seems to suggest an association with Repin (and therefore Yamna), rather than later movements in the Balkans.

chalcolithic-early-y-dna
Y-DNA samples from Yamnaya and neighbouring cultures. See full version here.

‘Yamnaya’ or ‘steppe’ ancestry?

After the early years when population genetics relied mainly on modern Y-DNA haplogroups, geneticists and amateurs have recently been playing around with testing “ancestry percentages”, based on newly developed free statistical tools, which obviously offer just one among many types of data to achieve a proper interpretation of the past.

Today we have quite a lot of Y-DNA haplogroups reported for ancient samples of more recent prehistoric periods, and they seem to offer (at least since the 2015 papers, but more evidently since the 2018 papers on Bell Beakers and Europeans, Corded Ware, or Fennoscandia among others) the most straightforward interpretation of all results published in population genomics research.

NOTE. The finding of a specific type of ancestry in one isolated 40,000-year-old sample from Tianyuan can offer very interesting information on potential population movements to the region. However, the identification of ethnolinguistic communities and their migrations among neighbouring groups in Neolithic or Bronze Age groups is evidently not that simple.

PCA-caucasus-steppe-all
Yamnaya (Indo-European peoples) and their evolution in the steppes, together with North Pontic (eventually Uralic) peoples. Notice how little Indo-European ancestry changes from Khvalynsk (Indo-Anatolian) to Yamna Hungary (North-West Indo-Europeans). Image modified from Wang et al. (2018). See more on the evolution of “steppe ancestry”.

It is becoming more and more clear with each paper that the true “Yamnaya ancestry” – not the originally described one – was in fact associated with Indo-Europeans (see more on the very Yamnaya-like Yamna Hungary and early East Bell Beaker R1b samples, all of quite similar ancestry and PCA cluster before their further admixture with EEF- and CWC-like groups).

The so-called “steppe ancestry”, on the other hand, reflects the contribution of a Northern Caucasus-related ancestry to expanding Khvalynsk settlers, who spread through the steppes more than a thousand years before the expansion of Late Proto-Indo-Europeans with late Repin, and can thus be found among different groups related to the Pontic-Caspian steppes (see more on the emergence and evolution of “steppe ancestry”).

In fact, after the Yamna/Indo-European and Corded Ware/Uralic expansions, it is more likely to find “steppe ancestry” to the north and east in territories traditionally associated with Uralic languages, whereas to the south and west – i.e. in territories traditionally associated with Indo-European languages – it is more likely to find “EEF ancestry” with diminished “steppe ancestry”, among peoples patrilineally descended from Yamna settlers.

Y-DNA haplogroups, the only uniparental markers (see exceptions in mtDNA inheritance) – unlike ancestry percentages based on the comparison of a few samples and flawed study designs – do not admix, do not change, and therefore they do not lend themselves to infinite pet theories (see e.g. what David Reich has to say about R1b-P312 in Iberia directly derived from Yamna migrants in spite of their predominant EEF ancestry): their cultural continuity can only be challenged with carefully threaded linguistic, archaeological, and genetic data.

Happy new year 2019…and enjoy our new books!

song-sheep-horses-header

Sorry for the last weeks of silence, I have been rather busy lately. I have more projects going on, and (because of that) I also wanted to finish a project I have been working on for many months already.

I have therefore decided to publish a provisional version of the text, in the hope that it will be useful in the following months, when I won’t be able to update it as often as I would like to:

EDIT (20 JAN 2019): For those of you who are more comfortable reading in your native language, I have placed some links to automatic translations by Google Translate. They might work especially well for the texts of A Game of Clans & A Clash of Chiefs.

Don’t forget to check out the maps included in the supplementary materials: I have added Y-DNA, mtDNA, and ADMIXTURE data using GIS software. The PCA graphics are also important to follow the main text.

NOTE. The files are hosted on my server; I have also uploaded them to Academia.edu and ResearchGate, in case the websites are too slow.

I would have preferred to wait for a thorough revision of the section on archaeology and the linguistic sections on Uralic, but I doubt I will have time when the reviews come, so it was either now or maybe next December…

I say so in the introduction, but it is evident that certain aspects of the book are tentative to say the least: the farther back we go from Late Proto-Indo-European, the less clear many aspects become. Also, linguistically I am not convinced about Eurasiatic or Nostratic, although they do have a certain interest when we try to offer a comprehensive view of the past, including ethnolinguistic identities.

I cannot be an expert in everything, and these books cover a lot. I am bound to publish many corrections as new information appears and more reviews are sent. For example, just days ago (before SNP calls of Wang et al. 2018 were published) some paragraphs implied that AME might have expanded Nostratic from the Middle East. Now it does not seem so, and I changed them just before uploading the text. That’s how tentative certain routes are, and how much all of this may change. And that only if we accept a Nostratic phylum…

NOTE. Since the first book I wrote was the linguistic one, and I have spent the last months updating the archaeology + genetics part, now many of you will probably understand 1) why I am so convinced about certain language relationships and 2) how I used many posts to clarify certain ideas and receive comments. Many posts probably offer a good timeline of what I worked with, and when.

A Song of Sheep and Horses (ASoSaH) reread

Edit (23 Jan 2019):

To be able to revise and update the text properly, I decided to start a series of posts on different aspects.

This is an updated list of the posts:

Acknowledgements

I did not add this section to the books, because they are still not ready for print, but I think this is due somewhere now. It is impossible to reference all who have directly or indirectly contributed to this, so this is a list of those I feel have played an important role.

I am indebted to the following people (which does not mean that they share my views, obviously):

First and foremost, to Fernando López-Menchero, for having the patience to review in detail many parts on Indo-European linguistics, knowing that I won’t accept many of his comments anyway. The additional information he offers is invaluable, but I didn’t want to turn this into a huge linguistic encyclopaedia with unending discussions of tiny details of each reconstructed word. I think it is already too big as it is.

I would not have thought about doing this if it were not for the interest of Wekwos (Xavier Delamarre) in publishing a full book about the Indo-European demic diffusion model (in the second half of 2017, I think). It was they who suggested that I extend the content, when all I had done until then was write an essay and draw some maps in my free time between depositing the PhD thesis and defending it.

Sadly, as much as I would like to publish a book with a professional publisher, I don’t think ancient DNA lends itself to the traditional format, so my requests (mainly to have free licenses and to be able to revise the text at will, as new genetic papers are published) were logically not acceptable. Also, the main aim of all volumes, especially the linguistic one, is the teaching of essentials of Late Proto-Indo-European and related languages, and this objective would be thwarted by selling each volume for $50-70 and only in printed format. I prefer a wider distribution.

At first I didn’t think much of this proposal, because I do not benefit from this kind of publication in my scientific field, but with time my interest in writing a whole, comprehensive book on the subject grew to the point where it was already an ongoing project, probably by the start of 2018.

I would not have been in contact with Wekwos if it were not for user Camulogène Rix at Anthrogenica, so thanks for that and for the interest in this work.

I would not have thought of writing this either if not for the spontaneous support (with an unexpected phone call!) of a professor of the Complutense University of Madrid, Ángel Gómez Moreno, who is interested in this subject – as is his wife, a professor of Classics more closely associated with Indo-European studies, who helped me with a search for Indo-Europeanists.

EDIT (1 JAN 2019): I remembered that Karin Bojs sent me her book after reading the demic diffusion model. I may have also thought about writing a whole book back then, but mid-2017 is probably too early for the project.

Professor Kortlandt is still to review the text, but he contributed to both previous essays in some very interesting ways, so I hope he can help me improve the parts on Uralic, and maybe alternative accounts of expansion for Balto-Slavic, depending on the time depth that he would consider warranted according to the Temematic hypothesis.

The maps are evidently (for those who are interested in genetics) in part the result of the effort of the late Jean Manco: As you can see from the maps including Y-DNA and mtDNA samples, I have benefitted from her way of organising data and publishing it. Similarly, the work of Iain McDonald in assessing the potential migration routes of R1b and R1a in Europe with the help of detailed maps was behind my idea for the first maps, and consequently behind these, too.

I should thank all the people responsible for the release of free datasets to work with, including the Reich and Jena labs, the Veeramah Lab, and also researchers from the Max Planck Institute or the Mainz Palaeogenetics group, who didn’t mind sharing datasets with me.

Readers of this blog with interesting comments have also been essential for the improvement of the texts. You can probably see some of your many contributions there. I may not answer many comments, because I am always busy (and sometimes I just don’t have anything interesting to say), but I try to read all of them.

EDIT (1 JAN 2019) I think I should mention at least Chetan, Egg, or Robert George; but then I would leave out old europe, Sgr Ganesh, or Tileman Ehlen; and if I include them I would leave out others…

Users of other sites, like Anthrogenica, whose particular points of view and deep knowledge of some very specific aspects are sometimes very useful. In particular, user Anglesqueville helped me to fix some issues with the merging of datasets to obtain the PCAs and ADMIXTURE, and prepared some individual samples to merge them.

Even without posting anything, Google Analytics keeps sending me messages about increasing user fidelity (returning users), and stats haven’t really changed (which probably means more people are reading old posts), so thank you for that.

I hope you enjoy the books.

Happy new year!