Y-chromosome mixture in the modern Corsican population shows different migration layers


Open access Prehistoric migrations through the Mediterranean basin shaped Corsican Y-chromosome diversity, by Di Cristofaro et al. PLOS One (2018).

Interesting excerpts:

This study included 321 samples from men throughout Corsica; samples from Provence and Tuscany were added to the cohort. All samples were typed for 92 Y-SNPs, and Y-STRs were also analyzed.

Haplogroup R represented approximately half of the lineages in both Corsican and Tuscan samples (respectively 51.8% and 45.3%) whereas it reached 90% in Provence. Sub-clade R1b1a1a2a1a2b-U152 predominated in North Corsica whereas R1b1a1a2a1a1-U106 was present in South Corsica. Both SNPs display clinal distributions of frequency variation in Europe, the U152 branch being most frequent in Switzerland, Italy, France and Western Poland. Calibrated branch lengths from whole Y chromosome sequencing [44,45] and ancient DNA studies [46] both indicated that R1a and R1b diversification began relatively recently, about 5 Kya, consistent with Bronze Age and Copper Age demographic expansion. TMRCA estimations are concordant with such expansion in Corsica.

Spatial frequency maps for haplogroups with frequencies above 3%, their Y-STR based phylogenetic networks in Corsican populations (Blue: North, Green: West, Orange: South, Black: Center and Purple: East) and their TMRCA (in years, +/- SE).

Haplogroup G reached 21.7% in Corsica and 13.3% in Tuscany. Sub-clade G2a2a1a2-L91 accounted for 11.3% of all haplogroups in Corsica yet was not present in Provence or in Tuscany. Thirty-four out of the 37 G2a2a1a2-L91 displayed a unique Y-STR profile, illustrated by the star-like profile of STR networks (Fig 1). G2a2a1a2-L91 and G2a2a-PF3147(xL91xM286) show their highest frequency in present day Sardinia and southern Corsica compared to low levels from Caucasus to Southern Europe, encompassing the Near and Middle East [21,47–50]. Ancient DNA results from Early and Middle Neolithic samples reported the presence of haplogroup G2a-P15 [51–53], consistent with gene flow from the Mediterranean region during the Neolithic transition. Td expansion time estimated by STR for P15-affiliated chromosomes was estimated to be 15,082+/-2217 years ago [49]. Ötzi, the 5,300-year-old Alpine mummy, was derived for the L91 SNP [21]. A genetic relationship between G haplogroups from Corsica and Sardinia is further supported by DYS19 duplication, reported in North Sardinia [14], and observed in the southern part of the Corsica in 9 out of 37 G2a2a1a2-L91 chromosomes and in 4 out of 5 G2a2a-PF3147(xL91xM286) chromosomes, 3 of which displayed an identical STR profile (S4 Table).

This lineage has a reported coalescent age estimated by whole sequencing in Sardinian samples of about 9,000 years ago. This could reflect common ancestors coming from the Caucasus and moving westward during the Neolithic period [48], whereas their continental counterparts would have been replaced by rapidly expanding populations associated with the Bronze Age [46,54,55]. Estimated TMRCA for L91 lineage in Corsica is 4529 +/- 853 years. G-L497 showed high frequencies in Corsica compared to Provence and Tuscany, and this haplogroup was common in Europe, but rare in Greece, Anatolia and the Middle East. Fifteen out of the 17 Corsican G2a2b2a1a1b-L497 displayed a unique Y-STR profile (S4 Table) with an estimated TMRCA of 6867 +/- 1294 years. Haplogroup G2a2b1-M406, associated with Impressed Ware Neolithic markers, along with J2a1-DYS445 = 6 and J2a1b1-M92 [22,49], had very low levels in Corsica. Conversely, G2a2b2a-P303 was highly represented and seemed to be independent of the G2a2b1-M406 marker. The 7 G2a2b2a-P303(xL497xM527) Corsican chromosomes displayed a unique Y-STR profile (S4 Table).

First and second axes of the PCA based on 12 Y-chromosome haplogroup frequencies in 83 west Mediterranean populations.

Haplogroup J, mainly represented by J2a1b-M67(xM92), displayed intermediate frequencies in Corsica compared to Tuscany and Provence. J2a1b-M67(xM92) derived STR network analysis displayed a quite homogeneous profile across the island with an estimated TMRCA of 2381 +/- 449 years (Fig 1) and individuals displaying M67 were peripheral compared to Northwestern Italians (S2 Fig). The haplogroup J2a1-Page55(xM67xM530), characteristic of non-Greek Anatolia [22], was found in the north-west of Corsica. Haplogroup J2a1-DYS445 = 6 was found in the north-west with DYS391 = 10 repeats, and in the far south with DYS391 = 9 repeats, the former was associated with Anatolian Greek samples, whereas the second was found in central Anatolia [22]. The 7 J2b2a-M241 displayed a unique Y-STR profile (S4 Table), they were only detected in the Cap Corse region, this sub-haplogroup shows frequency peaks in both the southern Balkans and northern-central Italy [56] and is associated with expansion from the Near East to the Balkans during Neolithic period [57].

Haplogroup E, mainly represented by E1b1b1a1b1a-V13, displayed intermediate frequencies in Corsica compared to Tuscany and Provence. E1b1b1a1b1a-V13 was thought to have initiated a pan-Mediterranean expansion 7,000 years ago starting from the Balkans [52] and its dispersal to the northern shore of the Mediterranean basin is consistent with the Greek Anatolian expansion to the western Mediterranean [22], characteristic of the region surrounding Alaria, and consistent with the TMRCA estimated in Corsica for this haplogroup. A few E1b1a-V38 chromosomes are also observed in the same regions as V13.
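The TMRCA values quoted in the excerpts above are point estimates with standard errors, so it helps to see how wide the implied ranges are. The following is a minimal sketch, assuming (my assumption, not the paper's) that the error is roughly normal, so that an approximate 95% interval is the estimate +/- 1.96 x SE. The function names are hypothetical; the numbers are the three estimates quoted above.

```python
# Approximate 95% intervals for the Corsican TMRCA estimates quoted above.
# Assumption (not from the paper): errors are roughly normal, so the
# interval is estimate +/- 1.96 * SE.

TMRCA_YEARS = {
    "G2a2a1a2-L91": (4529, 853),
    "G2a2b2a1a1b-L497": (6867, 1294),
    "J2a1b-M67(xM92)": (2381, 449),
}

Z_95 = 1.96


def interval(mean, se, z=Z_95):
    """Return (low, high) bounds of mean +/- z * se, rounded to whole years."""
    return round(mean - z * se), round(mean + z * se)


def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


INTERVALS = {name: interval(mean, se) for name, (mean, se) in TMRCA_YEARS.items()}

if __name__ == "__main__":
    for name, (low, high) in INTERVALS.items():
        print(f"{name}: {low} to {high} years")
```

Under this assumption the intervals are wide: the L91 and L497 ranges overlap each other, while the M67(xM92) range is clearly younger than L497. This is only a reading aid for the quoted figures, not a reanalysis of the Y-STR data.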


The Proto-Indo-European – Euskarian hypothesis


Another short communication by Juliette Blevins has just been posted, A single sibilant in Proto-Basque: *s, *Rs, *sT and the phonetic basis of the sibilant split:

Blevins (to appear) presents a new reconstruction of Proto-Basque, the mother of Basque and Aquitanian, based on standard methods in historical linguistics: the comparative method and the method of internal reconstruction. Where all previous reconstructions of Proto-Basque assume a contrast between two sibilants, *s, a voiceless apical sibilant, and *z a voiceless laminal sibilant (Martinet 1955; Michelena 1977; Lakarra 1995; Trask 1997), this proposal is unique in positing only a single sibilant *s. Under this account, all instances of Common Basque /z/ are derived from *s. More specifically, in syllable coda, *Rs > *Rz (R a sonorant) while in the syllable onset, *sT > *zT (T an oral stop). The true split of *s into /s/ vs. /z/ occurs when clusters like *Rz or *zT are further simplified to /z/.

In this talk, internal evidence for a single sibilant, *s, in Proto-Basque is presented, and sound changes underlying the sibilant split are examined within the context of Evolutionary Phonology (Blevins 2004, 2006, 2015, 2017). Similar sound changes are identified in other languages with similar cluster types (e.g. Kümmel 2007:232), and the phonetic basis of the sibilant split is informed by recent studies of sibilant retraction (e.g. Stevens and Harrington 2016; Stuart-Smith et al. 2018).
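The two-stage split described in the abstract (context-dependent *s > *z, followed by cluster simplification) can be illustrated with a toy rule cascade. This is only a sketch under loud assumptions: the segment classes, the regular-expression treatment of syllable position, and the input strings are all invented for illustration and are not Proto-Basque reconstructions or anything taken from Blevins.

```python
import re

# Illustrative segment classes (assumptions, not Blevins' inventory).
SONORANTS = "rln"   # R
STOPS = "ptkbdg"    # T
VOWELS = "aeiou"


def stage1_context_shift(word):
    """*Rs > *Rz in coda (s not followed by a vowel); *sT > *zT before a stop."""
    word = re.sub(rf"([{SONORANTS}])s(?![{VOWELS}])", r"\1z", word)
    word = re.sub(rf"s(?=[{STOPS}])", "z", word)
    return word


def stage2_simplify(word):
    """Simplify the clusters *Rz and *zT to plain z."""
    word = re.sub(rf"[{SONORANTS}]z", "z", word)
    word = re.sub(rf"z[{STOPS}]", "z", word)
    return word


def derive(word):
    """Apply both stages in order."""
    return stage2_simplify(stage1_context_shift(word))


if __name__ == "__main__":
    # Invented strings, chosen only to exercise each rule.
    for form in ["tals", "mispa", "sola", "tarsa"]:
        print(form, "->", stage1_context_shift(form), "->", derive(form))
```

In this toy, "tals" and "mispa" end up with z ("taz", "miza"), while "sola" and "tarsa" keep s. After stage 1 the z is still predictable from its cluster; once stage 2 removes the conditioning consonant, s and z contrast in the same environments, which is the sense in which the simplification produces the "true split".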

Blevins, already known for her previous work on the Basque language, has become known internationally for her recent controversial proposal of a genetic relationship between Proto-Indo-European and Basque. Apparently, a book with her full model, Advances in Proto-Basque Reconstruction with evidence for the Proto-Indo-European-Euskarian Hypothesis, will be published by Routledge soon.

I was never convinced, not just about a genetic connection, but about the very possibility of discovering it if there was one, mainly because such a link would be quite old, and Basque is known to have been greatly influenced by surrounding IE prestige languages for millennia until it was first attested in the 16th century. Internal reconstruction can only yield a rough reconstruction of a few aspects up to a certain point in time, probably not very far beyond the Pre-Roman period, and that only thanks to the available Aquitanian inscriptions.

There are indeed certain known migrations that could be linked with a pan-European population movement, the most likely one for this hypothesis being the Villabruna cluster (the Villabruna sample itself being of haplogroup R1b pre-P297), and especially the expansion of R1b-V88 lineages, found widespread in Europe from west to east, from Mesolithic Iberia to Khvalynsk.

This haplogroup is also found in Sardinia, which may be connected to the expansion of V88 subclades (which I have speculatively proposed could be linked to Afro-Asiatic) into Africa through Italy and the Green Sahara; although it could also be linked to a speculative Vasconic-Iberian – Palaeo-Sardo group.

Without knowing the exact Pre-Proto-Indo-European stage at which Blevins would place the Basque separation, it is difficult to know how it could fit within any macro-language proposal – and thus potential ancestral population expansion.

If you are interested in this hypothesis, I suggest Koch’s controversial 2013 paper Is Basque an Indo-European Language? (PDF), which appeared in JIES 41 (1 & 2), and of course the many papers rejecting it in the same volume. You also have Forni’s writings supporting this association.

Seeing how many Basque nationalists (obviously obsessed with racial purity) are still rooting for an autochthonous Palaeolithic origin of R1b lineages (especially P312) linked with the Basque language and dat huge Vasconic Western Europe; and now, after Olalde & Mathieson 2018, how some are also suggesting a link of R1b with the Neolithic expansion and Sardinians, for lack of further modern genetic differences with other Western Europeans… I wonder how a lot of people inclined to believe this nonsense today, and mentally linking Vasconic with haplogroup R1b, will paradoxically end up tied to precisely this kind of macro-family proposal in the future.


Germanic tribes during the Barbarian migrations show mainly R1b, also I lineages


New preprint at bioRxiv, Understanding 6th-Century Barbarian Social Organization and Migration through Paleogenomics, by Amorim, Vai, Posth, et al. (2018).

Abstract (emphasis mine):

Despite centuries of research, much about the barbarian migrations that took place between the fourth and sixth centuries in Europe remains hotly debated. To better understand this key era that marks the dawn of modern European societies, we obtained ancient genomic DNA from 63 samples from two cemeteries (from Hungary and Northern Italy) that have been previously associated with the Longobards, a barbarian people that ruled large parts of Italy for over 200 years after invading from Pannonia in 568 CE. Our dense cemetery-based sampling revealed that each cemetery was primarily organized around one large pedigree, suggesting that biological relationships played an important role in these early Medieval societies. Moreover, we identified genetic structure in each cemetery involving at least two groups with different ancestry that were very distinct in terms of their funerary customs. Finally, our data was consistent with the proposed long-distance migration from Pannonia to Northern Italy.

Interesting excerpts:

Since the adults were almost all non-local, it is tempting to suggest that we may be observing the historically described fara during migration. Regardless, this group appears to be a unit organized around one high-status, kin-based group of predominantly males, but also incorporating other males that may have some common central/northern European descent. The relative lack of adult female representatives from Kindred SZ1, the diverse genetic and isotope signatures of the sampled women around the males and their rich graves goods suggests that they may have been acquired and incorporated into the unit during the process of migration (perhaps hinting at a patrilocal societal structure that has been shown to be prominent in Europe during earlier periods).

The remaining part of this community for which we have genomic data (N=7) is composed of individuals of mainly southern European genetic ancestry that are conspicuously lacking grave goods and occupy the southeastern part of the cemetery, with randomly oriented graves with straight walls. While the lack of grave goods does not necessarily imply that these individuals were of lower status, it does point to them belonging to a different social group. Interestingly, the strontium isotope data suggest that they may have migrated together with the warrior-based group from outside Szólád, but barriers to gene flow were largely maintained.

Genetic structure of Szólád and Collegno. (A) Procrustes Principal Component Analysis of modern and ancient European population (faded small dots are individuals, larger circle is median of individuals) along with samples from Szólád (filled circles), Collegno (filled stars), Bronze Age SZ1 (filled grey circle), second period CL36 (grey star), two Avar-period samples from Szólád (yellow circles), Anglo-Saxon period UK (orange circles) and 6th Century Bavaria (green circles). Szólád and Collegno samples are filled with colors based on estimated ancestry from ADMIXTURE. Blue circles with thick black edge = Kindred_SZ1, blue stars with thick black edge = Kindred_CL1, stars with thick green edge = Kindred_CL2. NWE = northwest Europe, NE = modern north Europe, NEE = modern northeast Europe, CE = central Europe, EE = eastern Europe, WE = western Europe, SE = southern Europe, SEE = southeast Europe, HUN = modern Hungarian, HBr = Hungarian Bronze Age, Br = central, northern and eastern Europe Bronze age.

Evidence for Migrating Barbarians and “Longobards”

Our two cemeteries overlap chronologically with the historically documented migration of Longobards from Pannonia to Italy at the end of the 6th century. It is thus intriguing that we observe that central/northern European ancestry is dominant not only in Szólád, but also in Collegno. Based on modern genetic data we would not expect to see a preponderance of such ancestry in either Hungary or especially Northern Italy. While we do not yet know the general genomic background of Europe in these geographic regions just before the establishment of Szólád and Collegno, other Migration Period genomes from the UK and Germany show a fairly strong correlation with modern geography (while also possessing a similar central/northern European ancestry component to that found in Szólád and Collegno). Going further back in time, Late Bronze Age Hungarians show almost no resemblance to populations from modern central/northern Europe, especially compared to Bronze Age Germans and in particular Scandinavians, who, in contrast, show considerable overlap with our Szólád and Collegno central/northern ancestry samples. Coupled with the strontium isotope data, our paleogenomic analysis suggests that the earliest individuals of central/northern ancestry in Collegno were probably migrants while those with southern ancestry were local residents. Our results are thus consistent with an origin of barbarian groups such as the Longobards somewhere in Northern and Central Europe east of the Rhine and north of the Danube. Thus our results cannot reject the migration, its route, and settlement of “the Longobards” described in historical texts.

We note however that whether these people identified as “Longobard” or any other particular barbarian people is impossible to assess. Modern European genetic variation is generally highly structured by geography [22,32], even at the level of individual villages [33]. It is, therefore, surprising to find significant diversity, even amongst individuals with central/northern ancestry, within small, individual Langobard cemeteries. Even amongst the two family groups of primarily central/northern ancestry, who may have formed the heart of such migration, there is clear evidence of admixture with individuals with more southern ancestry. If we are seeing evidence of movements of barbarians, there is no evidence that these were genetically homogenous groups of people.

Model-based ancestry estimates from ADMIXTURE for Szólád (B) and Collegno (C) using 1000 Genomes Project Eurasian and YRI populations to supervise analysis. Note that high contamination was identified in CL31 and is shown with a triangle in (A) and overlaid with a pink hue in (C).

From the supplementary material:

The haplogroups detected in the samples show a prevalence of R1b (55.3%), which is the most common sub-haplogroup in western Europe, with a peak in the Iberian Peninsula and in the British islands and a west-east gradient in central Europe. A consistent percentage of haplotypes belongs to the I haplogroup (26.4%), both in the I1a and, more abundantly, in I2a2 sub-haplogroups. They are particularly frequent in the northern Balkans with a westward gradient in central and western Europe, with some lineages belonging to I2a2a1b particularly common in the Germanic region.

Relative and absolute haplogroup frequencies: COL = Collegno; SZO = Szólád; CEU = Central European from Utah; FIN = Finnish; GBR = Britons; IBS = Iberians; SAR = Sardinians; TSI = Tuscans