Arrival of steppe ancestry with R1b-P312 in the Mediterranean: Balearic Islands, Sicily, and Iron Age Sardinia

New preprint: The Arrival of Steppe and Iranian Related Ancestry in the Islands of the Western Mediterranean, by Fernandes, Mittnik, Olalde et al., bioRxiv (2019).

Interesting excerpts (emphasis in bold; modified for clarity):

Balearic Islands: The expansion of Iberian speakers

Mallorca_EBA dates to the earliest period of permanent occupation of the islands at around 2400 BCE. We parsimoniously modeled Mallorca_EBA as deriving 36.9 ± 4.2% of her ancestry from a source related to Yamnaya_Samara; (…). We next used qpAdm to identify “proximal” sources for Mallorca_EBA’s ancestry that are more closely related to this individual in space and time, and found that she can be modeled as a clade with the (small) subset of Iberian Bell Beaker culture associated individuals who carried Steppe-derived ancestry (p=0.442).

Suppl. Materials: The model used was with Bell_Beaker_Iberia_highsteppe, a group of outliers from Iberia buried in a Bell Beaker mortuary context who unlike most individuals from this context in that region had high proportions of Steppe ancestry (p=0.442).

Our estimates of Steppe ancestry in the two later Balearic Islands individuals are lower than the earlier one: 26.3 ± 5.1% for Formentera_MBA and 23.1 ± 3.6% for Menorca_LBA, but the Middle to Late Bronze Age Balearic individuals are not a clade relative to non-Balearic groups. Specifically, we find that f4(Mbuti.DG, X; Formentera_MBA, Menorca_LBA) is positive when X=Iberia_Chalcolithic (Z=2.6) or X=Sardinia_Nuragic_BA (Z=2.7). While it is tempting to interpret the latter statistic as suggesting a genetic link between peoples of the Talaiotic culture of the Balearic islands and the Nuragic culture of Sardinia, the attraction to Iberia_Chalcolithic is just as strong, and the mitochondrial haplogroup U5b1+16189+@16192 in Menorca_LBA is not observed in Sardinia_Nuragic_BA but is observed in multiple Iberia_Chalcolithic individuals. A possible explanation is that both the ancestors of Nuragic Sardinians and the ancestors of Talaiotic people from the Balearic Islands received gene flow from an unsampled Iberian Chalcolithic-related group (perhaps a mainland group affiliated to both) that did not contribute to Formentera_MBA.
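To make the f4 statistic quoted above concrete, here is a minimal Python sketch, under simplifying assumptions, of how f4(A, B; C, D) and its Z-score can be computed from per-SNP allele frequencies. The function names, the toy frequencies, and the equal-sized contiguous blocks are illustrative assumptions. Real analyses typically rely on dedicated software such as ADMIXTOOLS, with a weighted block jackknife over genomic blocks, and this is not the authors' pipeline.

```python
from math import sqrt


def f4_products(a, b, c, d):
    """Per-SNP terms (a - b) * (c - d) from allele frequencies of four populations."""
    return [(pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)]


def f4(a, b, c, d):
    """f4(A, B; C, D): the average of the per-SNP terms."""
    terms = f4_products(a, b, c, d)
    return sum(terms) / len(terms)


def f4_zscore(a, b, c, d, n_blocks):
    """Z-score from a delete-one-block jackknife (assumption: equal-sized contiguous blocks)."""
    terms = f4_products(a, b, c, d)
    size = len(terms) // n_blocks
    if n_blocks < 2 or size == 0:
        raise ValueError("need at least two non-empty blocks")
    terms = terms[: size * n_blocks]  # drop leftover SNPs so all blocks are equal
    total = sum(terms)
    estimate = total / len(terms)
    # Recompute the estimate with each block left out in turn.
    leave_one_out = []
    for j in range(n_blocks):
        block_sum = sum(terms[j * size:(j + 1) * size])
        leave_one_out.append((total - block_sum) / (len(terms) - size))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in leave_one_out)
    if variance == 0:
        raise ValueError("zero jackknife variance; Z-score undefined")
    return estimate / sqrt(variance)


# Toy allele frequencies (made up for illustration, not data from the preprint).
outgroup = [0.0, 0.0, 0.0, 0.0]
pop_x = [0.5, 0.4, 0.6, 0.5]
pop_c = [0.5, 0.5, 0.5, 0.5]
pop_d = [0.7, 0.6, 0.8, 0.7]

stat = f4(outgroup, pop_x, pop_c, pop_d)
z = f4_zscore(outgroup, pop_x, pop_c, pop_d, n_blocks=4)
print(f"f4 = {stat:.3f}, Z = {z:.2f}")
```

With this sign convention, a positive f4(Outgroup, X; C, D) means that X shares more alleles with D than with C. That is how the positive values for X=Iberia_Chalcolithic and X=Sardinia_Nuragic_BA point to Menorca_LBA rather than Formentera_MBA.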

This sample, like another one in El Argar, belongs to haplogroup R1b-P312. So there you have it: data connecting the Proto-Iberian expansion (which replaced IE-speaking Bell Beakers) to the Iberian Chalcolithic population. The connection is signaled by the increase in Iberian Chalcolithic ancestry after the arrival of Bell Beakers, and it was most likely linked originally to the Argaric and post-Argaric expansions during the MBA.

PCA with previously published ancient individuals (non-filled symbols), projected onto variation from present-day populations (gray squares).

Steppe in Sardinia IA: Phocaeans from Italy?

Most Sardinians buried in a Nuragic Bronze Age context possessed uniparental haplogroups found in European hunter-gatherers and early farmers, including Y-haplogroup R1b1a[xR1b1a1a] which is different from the characteristic R1b1a1a2a1a2 spread in association with the Bell Beaker complex. An exception is individual I10553 (1226-1056 calBCE) who carried Y-haplogroup J2b2a, previously observed in a Croatian Middle Bronze Age individual bearing Steppe ancestry, suggesting the possibility of genetic input from groups that arrived from the east after the spread of first farmers. This is consistent with the evidence of material culture exchange between Sardinians and mainland Mediterranean groups, although genome-wide analyses find no significant evidence of Steppe ancestry so the quantitative demographic impact was minimal.

Another interesting piece of data: these remnant (Mesolithic) R1b-V88 lineages are closely related to the Italian Peninsula, the most likely region from which these lineages expanded into Africa, an expansion in turn possibly connected to that of Proto-Afroasiatic.

We detect definitive evidence of Iranian-related ancestry in an Iron Age Sardinian I10366 (391-209 calBCE) with an estimate of 11.9 ± 3.7% Iran_Ganj_Dareh_Neolithic related ancestry, while rejecting the model with only Anatolian_Neolithic and WHG at p=0.0066 (Supplementary Table 9). The only model that we can fit for this individual using a pair of populations that are closer in time is as a mixture of Iberia_Chalcolithic (11.9 ± 3.2%) and Mycenaean (88.1 ± 3.2%) (p=0.067). This model fits even when including Nuragic Sardinians in the outgroups of the qpAdm analysis, which is consistent with the hypothesis that this individual had little if any ancestry from earlier Sardinians.

Proportions of ancestry using a distal qpAdm framework on an individual basis (a), and based on qpWave clusters

Sicily EBA: The Lusitanian/Ligurian connection?

(…) While a previously reported Bell Beaker culture-associated individual from Sicily had no evidence of Steppe ancestry, (…) we find evidence of Steppe ancestry in the Early Bronze Age by ~2200 BCE. In distal qpAdm, the outlier Sicily_EBA11443 is parsimoniously modeled as harboring 40.2 ± 3.5% Steppe ancestry, and the outlier Sicily_EBA8561 is parsimoniously modeled as harboring 23.3 ± 3.5% Steppe ancestry. (…) The presence of Steppe ancestry in Early Bronze Age Sicily is also evident in Y chromosome analysis, which reveals that 4 of the 5 Early Bronze Age males had Steppe-associated Y-haplogroup R1b1a1a2a1a2. (Online Table 1). Two of these were Y-haplogroup R1b1a1a2a1a2a1 (Z195) which today is largely restricted to Iberia and has been hypothesized to have originated there 2500-2000 BCE. This evidence of west-to-east gene flow from Iberia is also suggested by qpAdm modeling where the only parsimonious proximate source for the Steppe ancestry we found in the main Sicily_EBA cluster is Iberians.
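As a rough aid for reading the "estimate ± standard error" figures in excerpts like the one above, the following Python sketch turns them into approximate 95% intervals and compares two estimates. It assumes a roughly normal sampling distribution and, for the comparison, independent errors. Both are simplifications, since qpAdm standard errors come from a jackknife and individuals modeled against the same references are not strictly independent. The function names are hypothetical.

```python
from math import sqrt


def approx_interval(estimate, se, z=1.96):
    """Approximate 95% interval, assuming a roughly normal sampling distribution."""
    return estimate - z * se, estimate + z * se


def difference_z(est1, se1, est2, se2):
    """Z-score for the difference of two estimates, assuming independent errors."""
    return (est1 - est2) / sqrt(se1 ** 2 + se2 ** 2)


# Steppe ancestry estimates (in percent) quoted in the excerpt above.
low, high = approx_interval(40.2, 3.5)        # Sicily_EBA11443
z_diff = difference_z(40.2, 3.5, 23.3, 3.5)   # versus Sicily_EBA8561

print(f"Sicily_EBA11443: {low:.1f}% to {high:.1f}%")
print(f"Difference between the two outliers: Z = {z_diff:.2f}")
```

The same helpers can be applied to any other pair of estimates quoted in these excerpts, with the same caveats.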

What’s this? An ancestral connection between Sicel or Elymian and Galaico-Lusitanian or Ligurian (based on an origin in NE Iberia)? It is impossible to say, especially if the languages of these early settlers were later replaced by those of non-Indo-European speakers from the eastern Mediterranean, and by those of Indo-European speakers from the mainland closely related to Proto-Italic during the LBA, but see below.

Regarding the comment on R1b-Z195: like DF27 in general, it is associated with modern Iberians because of founder effects beyond the Pyrenees. Z195 is a very old subclade that split from DF27 at roughly the same time as DF27 split from its parent P312, so it can be found anywhere in Europe, and it almost certainly accompanied the expansion of Celts from Central Europe under the subclade R1b-M167/SRY2627.

The connection is thus strong only because of the qpAdm modeling, since R1b-DF27 and its subclade R1b-Z195 are certainly lineages that expanded quite early, most likely with Yamna settlers in Hungary and East Bell Beakers.

In this case, if stemming from Iberia, it is most likely of subclade R1b-Z220 – or another Z195 (xM167) lineage – originally associated with the Old European substrate found in topo-hydronymy in Iberia, whose most likely remnants attested during the Iron Age were Lusitanians.

Left: Modern distribution of R1b-Z195 (YFull estimate 2700 BC); Right: Modern distribution of DF27. Both include later founder effects within Iberia, so the increase in the Basque country and the Crown of Aragon and the decrease in Portugal can safely be ignored. Contour maps of the derived allele frequencies of the SNPs analyzed in Solé-Morata et al. (2017).

We detect Iranian-related ancestry in Sicily by the Middle Bronze Age 1800-1500 BCE, consistent with the directional shift of these individuals toward Mycenaeans in PCA. Specifically, two of the Middle Bronze Age individuals can only be fit with models that in addition to Anatolia_Neolithic and WHG, include Iran_Ganj_Dareh_Neolithic. The most parsimonious model for Sicily_MBA3125 has 18.0 ± 3.6% Iranian-related ancestry (p=0.032 for rejecting the alternative model of Steppe rather than Iranian-related ancestry), and the most parsimonious model for Sicily_MBA has 14.9 ± 3.9% Iranian-related ancestry (p=0.037 for rejecting the alternative model).

The modern southern Italian Caucasus-related signal identified in Raveane et al. (2018) is plausibly related to the same Iranian-related spread of ancestry into Sicily that we observe in the Middle Bronze Age (and possibly the Early Bronze Age).

The non-Indo-European Sicanians and Elymians were possibly then connected to eastern Mediterranean groups before the expansion of the Sea Peoples.

For the Late Bronze Age group of individuals, qpAdm documented Steppe-related ancestry, modeling this group as 80.2 ± 1.8% Anatolia_Neolithic, 5.3 ± 1.6% WHG, and 14.5 ± 2.2% Yamnaya_Samara. Our modeling using sources more closely related in space and time also supports Sicily_LBA having Minoan-related ancestry or being derived from local preceding populations or individuals with ancestries similar to those of Sicily_EBA3123 (p=0.527), Sicily_MBA3124 (p=0.352), and Sicily_MBA3125 (p=0.095).
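When reading the p-values in excerpts like the one above, it helps to remember that a qpAdm p-value tests whether a model is consistent with the data: a low value rejects the model, while a high value only means the model is not rejected. The Python sketch below sorts the quoted proximal sources for Sicily_LBA accordingly. The 0.05 threshold and the function name are assumptions made for illustration, not something stated in the excerpt.

```python
def classify_models(p_values, threshold=0.05):
    """Split models into those not rejected and those rejected at the threshold."""
    fitting = {name: p for name, p in p_values.items() if p >= threshold}
    rejected = {name: p for name, p in p_values.items() if p < threshold}
    return fitting, rejected


# p-values quoted in the excerpt above for proximal sources of Sicily_LBA.
sicily_lba_sources = {
    "Sicily_EBA3123": 0.527,
    "Sicily_MBA3124": 0.352,
    "Sicily_MBA3125": 0.095,
}

fitting, rejected = classify_models(sicily_lba_sources)
print("Not rejected:", ", ".join(fitting))
print("Rejected:", ", ".join(rejected) or "none")
```

Under that assumed threshold none of the three sources is rejected, although Sicily_MBA3125 fits less comfortably than the other two.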

This increase in Steppe-related ancestry in a western site during the LBA most likely represents either an expansion from the Aegean or – maybe more likely, given the archaeological finds – a regional population similar to Sicily EBA re-emerging or rather being displaced from the eastern part of the island because of a westward movement from nearby Calabria.

At first sight, whether the sampled population spoke Indo-European at this time looks questionable, since the Iron Age accounts have been taken to show non-IE Elymians in this region. However, Elymians seem to have spoken Indo-European, which fits well with the increase in steppe ancestry.

EDIT (21 MAR): What is interesting about a proposed incoming Minoan-like ancestry is that it is the potential origin of the Iran Neolithic-related ancestry that is going to appear in Central Italy during the LBA. This could then potentially be associated with Tyrsenians passing through the area, although the traditional description may be more compatible with an arrival of Sea Peoples from the Adriatic.

Sad to read this:

This manuscript is dedicated to the memory of Sebastiano Tusa of the Soprintendenza del Mare in Palermo, who would have been an author of this study had he not tragically died in the crash of Ethiopia Airlines flight 302 on March 10.

Y-chromosome mixture in the modern Corsican population shows different migration layers

Open access: Prehistoric migrations through the Mediterranean basin shaped Corsican Y-chromosome diversity, by Di Cristofaro et al., PLOS One (2018).

Interesting excerpts:

This study included 321 samples from men throughout Corsica; samples from Provence and Tuscany were added to the cohort. All samples were typed for 92 Y-SNPs, and Y-STRs were also analyzed.

Haplogroup R represented approximately half of the lineages in both Corsican and Tuscan samples (respectively 51.8% and 45.3%) whereas it reached 90% in Provence. Sub-clade R1b1a1a2a1a2b-U152 predominated in North Corsica whereas R1b1a1a2a1a1-U106 was present in South Corsica. Both SNPs display clinal distributions of frequency variation in Europe, the U152 branch being most frequent in Switzerland, Italy, France and Western Poland. Calibrated branch lengths from whole Y chromosome sequencing [44,45] and ancient DNA studies [46] both indicated that R1a and R1b diversification began relatively recently, about 5 Kya, consistent with Bronze Age and Copper Age demographic expansion. TMRCA estimations are concordant with such expansion in Corsica.

Spatial frequency maps for haplogroups with frequencies above 3%, their Y-STR based phylogenetic networks in Corsican populations (Blue: North, Green: West, Orange: South, Black: Center and Purple: East) and their TMRCA (in years, +/- SE).

Haplogroup G reached 21.7% in Corsica and 13.3% in Tuscany. Sub-clade G2a2a1a2-L91 accounted for 11.3% of all haplogroups in Corsica yet was not present in Provence or in Tuscany. Thirty-four out of the 37 G2a2a1a2-L91 displayed a unique Y-STR profile, illustrated by the star-like profile of STR networks (Fig 1). G2a2a1a2-L91 and G2a2a-PF3147(xL91xM286) show their highest frequency in present day Sardinia and southern Corsica compared to low levels from Caucasus to Southern Europe, encompassing the Near and Middle East [21,47–50]. Ancient DNA results from Early and Middle Neolithic samples reported the presence of haplogroup G2a-P15 [51–53], consistent with gene flow from the Mediterranean region during the Neolithic transition. Td expansion time estimated by STR for P15-affiliated chromosomes was estimated to be 15,082+/-2217 years ago [49]. Ötzi, the 5,300-year-old Alpine mummy, was derived for the L91 SNP [21]. A genetic relationship between G haplogroups from Corsica and Sardinia is further supported by DYS19 duplication, reported in North Sardinia [14], and observed in the southern part of the Corsica in 9 out of 37 G2a2a1a2-L91 chromosomes and in 4 out of 5 G2a2a-PF3147(xL91xM286) chromosomes, 3 of which displayed an identical STR profile (S4 Table).

This lineage has a reported coalescent age estimated by whole sequencing in Sardinian samples of about 9,000 years ago. This could reflect common ancestors coming from the Caucasus and moving westward during the Neolithic period [48], whereas their continental counterparts would have been replaced by rapidly expanding populations associated with the Bronze Age [46,54,55]. Estimated TMRCA for L91 lineage in Corsica is 4529 +/- 853 years. G-L497 showed high frequencies in Corsica compared to Provence and Tuscany, and this haplogroup was common in Europe, but rare in Greece, Anatolia and the Middle East. Fifteen out of the 17 Corsican G2a2b2a1a1b-L497 displayed a unique Y-STR profile (S4 Table) with an estimated TMRCA of 6867 +/- 1294 years. Haplogroup G2a2b1-M406, associated with Impressed Ware Neolithic markers, along with J2a1-DYS445 = 6 and J2a1b1-M92 [22,49], had very low levels in Corsica. Conversely, G2a2b2a-P303 was highly represented and seemed to be independent of the G2a2b1-M406 marker. The 7 G2a2b2a-P303(xL497xM527) Corsican chromosomes displayed a unique Y-STR profile (S4 Table).

First and second axes of the PCA based on 12 Y-chromosome haplogroup frequencies in 83 west Mediterranean populations.

Haplogroup J, mainly represented by J2a1b-M67(xM92), displayed intermediate frequencies in Corsica compared to Tuscany and Provence. J2a1b-M67(xM92) derived STR network analysis displayed a quite homogeneous profile across the island with an estimated TMRCA of 2381 +/- 449 years (Fig 1) and individuals displaying M67 were peripheral compared to Northwestern Italians (S2 Fig). The haplogroup J2a1-Page55(xM67xM530), characteristic of non-Greek Anatolia [22], was found in the north-west of Corsica. Haplogroup J2a1-DYS445 = 6 was found in the north-west with DYS391 = 10 repeats, and in the far south with DYS391 = 9 repeats, the former was associated with Anatolian Greek samples, whereas the second was found in central Anatolia [22]. The 7 J2b2a-M241 displayed a unique Y-STR profile (S4 Table), they were only detected in the Cap Corse region, this sub-haplogroup shows frequency peaks in both the southern Balkans and northern-central Italy [56] and is associated with expansion from the Near East to the Balkans during Neolithic period [57].

Haplogroup E, mainly represented by E1b1b1a1b1a-V13, displayed intermediate frequencies in Corsica compared to Tuscany and Provence. E1b1b1a1b1a-V13 was thought to have initiated a pan-Mediterranean expansion 7,000 years ago starting from the Balkans [52] and its dispersal to the northern shore of the Mediterranean basin is consistent with the Greek Anatolian expansion to the western Mediterranean [22], characteristic of the region surrounding Alaria, and consistent with the TMRCA estimated in Corsica for this haplogroup. A few E1b1a-V38 chromosomes are also observed in the same regions as V13.
