Ancient Sardinia hints at Mesolithic spread of R1b-V88, and Western EEF-related expansion of Vasconic


New preprint Population history from the Neolithic to present on the Mediterranean island of Sardinia: An ancient DNA perspective, by Marcus et al. bioRxiv (2019)

Interesting excerpts (emphasis mine, edited for clarity):

On the high frequency of R1b-V88

Our genome-wide data allowed us to assign Y haplogroups for 25 ancient Sardinian individuals. More than half of them consist of R1b-V88 (n=10) or I2-M223 (n=7).

Francalacci et al. (2013) identified three major Sardinia-specific founder clades based on present-day variation within the haplogroups I2-M26, G2-L91 and R1b-V88, and here we found each of those broader haplogroups in at least one ancient Sardinian individual. Two major present-day Sardinian haplogroups, R1b-M269 and E-M215, are absent.

Compared to other Neolithic and present-day European populations, the number of identified R1b-V88 carriers is relatively high.

(…)ancient Sardinian mtDNA haplotypes belong almost exclusively to macro-haplogroups HV (n = 16), JT (n = 17) and U (n = 9), a composition broadly similar to other European Neolithic populations.

Geographic and temporal distribution of R1b-V88 Y-haplotypes in ancient European samples. We plot the geographic position of all ancient samples inferred to carry R1b-V88 equivalent markers. Dates are given as years BCE (means of calibrated 2σ radiocarbon dates). Multiple V88 individuals with similar geographic positions are vertically stacked. We additionally color-code the status of the R1b-V88 subclade R1b-V2197, which is found in most present-day African R1b-V88 carriers.

On the origin of a Vasconic-like Paleosardo with the Western EEF

(…) the Neolithic (and also later) ancient Sardinian individuals sit between early Neolithic Iberian and later Copper Age Iberian populations, roughly on an axis that differentiates WHG and EEF populations and embedded in a cluster that additionally includes Neolithic British individuals. This result is also evident in terms of absolute genetic differentiation, with low pairwise FST ~ 0.005 ± 0.002 between Neolithic Sardinian individuals and Neolithic western mainland European populations. Pairwise outgroup-f3 analysis shows a very similar pattern, with the highest values of f3 (i.e. most shared drift) being with Neolithic and Copper Age Iberia, gradually dropping off for temporally and geographically distant populations.
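To make the "shared drift" reading of outgroup-f3 concrete, here is a minimal sketch in Python, assuming the plain definition f3(O; A, B) = mean over SNPs of (a - o)(b - o) on allele frequencies. The function name and all frequencies are hypothetical toy values, not data from the preprint, and the sketch leaves out the sample-size bias correction and block-jackknife standard errors that real tools apply.

```python
def outgroup_f3(p_out, p_a, p_b):
    """Naive outgroup-f3: mean over SNPs of (a - o) * (b - o).

    Higher values mean populations A and B share more drift
    after their split from the outgroup O.
    """
    if not p_out or not (len(p_out) == len(p_a) == len(p_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    total = sum((a - o) * (b - o) for o, a, b in zip(p_out, p_a, p_b))
    return total / len(p_out)


# Hypothetical allele frequencies at four SNPs (illustration only).
outgroup = [0.5, 0.5, 0.5, 0.5]
sardinia_n = [0.9, 0.1, 0.7, 0.3]
iberia_n = [0.8, 0.2, 0.7, 0.4]   # drifts in the same direction as sardinia_n
distant = [0.5, 0.6, 0.4, 0.5]    # shares little drift with sardinia_n

f3_close = outgroup_f3(outgroup, sardinia_n, iberia_n)
f3_far = outgroup_f3(outgroup, sardinia_n, distant)
print(f3_close, f3_far)
```

In this toy setup the pair that drifted together gets the larger value, which is the pattern the excerpt describes for Neolithic Sardinia and Neolithic or Copper Age Iberia.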

In explicit admixture models (using qpAdm, see Methods) the southern French Neolithic individuals (France-N) are the most consistent with being a single source for Neolithic Sardinia (p ~ 0.074 to reject the model of one population being the direct source of the other); followed by other populations associated with the western Mediterranean Neolithic Cardial Ware expansion.

Principal Components Analysis based on the Human Origins dataset. A: Projection of ancient individuals’ genotypes onto principal component axes defined by modern Western Eurasians (gray labels).

Pervasive Western Hunter-Gatherer ancestry in Iberian/French/Sardinian population

Similar to western European Neolithic and central European Late Neolithic populations, ancient Sardinian individuals are shifted towards WHG individuals in the top two PCs relative to early Neolithic Anatolians. Admixture analysis using qpAdm infers that ancient Sardinian individuals harbour HG ancestry (~ 17%) that is higher than early Neolithic mainland populations (including Iberia, ~ 8%), but lower than Copper Age Iberians (~ 25%) and about the same as Southern French Middle-Neolithic individuals (~ 21%).

Principal Components Analysis based on the Human Origins dataset. B: Zoom into the region most relevant for Sardinian individuals.

Continuity from Sardinia Neolithic through the Nuragic

We found several lines of evidence supporting genetic continuity from the Sardinian Neolithic into the Bronze Age and Nuragic times. Importantly, we observed low genetic differentiation between ancient Sardinian individuals from various time periods.

A qpAdm analysis, which is based on simultaneously testing f-statistics with a number of outgroups and adjusts for correlations, cannot reject a model of Neolithic Sardinian individuals being a direct predecessor of Nuragic Sardinian individuals (…) Our qpAdm analysis further shows that the WHG ancestry proportion, in a model of admixture with Neolithic Anatolia, remains stable at ~17% throughout three ancient time-periods.
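As a rough illustration of what a two-source ancestry proportion means, the sketch below fits a single mixing fraction by least squares on allele frequencies. This is explicitly not qpAdm, which works on f4-statistics against a set of outgroups; the function name and the toy frequencies are assumptions made only for this example.

```python
def admixture_fraction(target, source_a, source_b):
    """Least-squares alpha with target ~ alpha*source_a + (1 - alpha)*source_b."""
    if not target or not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("the two sources are identical at every SNP")
    return num / den


# Hypothetical allele frequencies (illustration only).
whg = [0.9, 0.1, 0.6, 0.2]
anatolia_n = [0.2, 0.7, 0.3, 0.8]

# Build a toy target that is exactly 17% WHG and 83% Anatolia_N.
toy_sardinia = [0.17 * w + 0.83 * a for w, a in zip(whg, anatolia_n)]

alpha = admixture_fraction(toy_sardinia, whg, anatolia_n)
print(round(alpha, 3))
```

With noise-free toy data the fit recovers the fraction used to build the target; with real genotypes, drift and sampling error are the reason methods like qpAdm rely on outgroups and report standard errors.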

Present-day genetic structure in Sardinia reanalyzed with aDNA. A: Scatter plot of the first two principal components trained on 1577 present-day individuals with grand-parental ancestry from Sardinia. Each individual is labeled with a location if at least 3 of the 4 grandparents were born in the same geographical location (“small” three-letter abbreviations); otherwise with “x” or, if grand-parental ancestry is missing, with “?”. We calculated median PC values for each Sardinian province (large abbreviations). We also projected each ancient Sardinian individual on to the top two PCs (gray points). B/C: We plot f-statistics that test for admixture of modern Sardinian individuals (grouped into provinces) when using Nuragic Sardinian individuals as one source population. Uncertainty ranges depict one standard error (calculated from block bootstrap). Karitiana are used in the f-statistic calculation as a proxy for ANE/Steppe ancestry (Patterson et al., 2012).

Steppe influx in Modern Sardinians

While contemporary Sardinian individuals show the highest affinity towards EEF-associated populations among all of the modern populations, they also display membership with other clusters (Fig. 5). In contrast to ancient Sardinian individuals, present-day Sardinian individuals carry a modest “Steppe-like” ancestry component (but generally less than continental present-day European populations), and an appreciable broadly “eastern Mediterranean” ancestry component (also inferred at a high fraction in other present-day Mediterranean populations, such as Sicily and Greece).


Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula

Open access preprint (which I announced already) at bioRxiv Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula, by Bycroft et al. (2018).

Abstract (emphasis mine):

Genetic differences within or between human populations (population structure) has been studied using a variety of approaches over many years. Recently there has been an increasing focus on studying genetic differentiation at fine geographic scales, such as within countries. Identifying such structure allows the study of recent population history, and identifies the potential for confounding in association studies, particularly when testing rare, often recently arisen variants. The Iberian Peninsula is linguistically diverse, has a complex demographic history, and is unique among European regions in having a centuries-long period of Muslim rule. Previous genetic studies of Spain have examined either a small fraction of the genome or only a few Spanish regions. Thus, the overall pattern of fine-scale population structure within Spain remains uncharacterised. Here we analyse genome-wide genotyping array data for 1,413 Spanish individuals sampled from all regions of Spain. We identify extensive fine-scale structure, down to unprecedented scales, smaller than 10 Km in some places. We observe a major axis of genetic differentiation that runs from east to west of the peninsula. In contrast, we observe remarkable genetic similarity in the north-south direction, and evidence of historical north-south population movement. Finally, without making particular prior assumptions about source populations, we show that modern Spanish people have regionally varying fractions of ancestry from a group most similar to modern north Moroccans. The north African ancestry results from an admixture event, which we date to 860 – 1120 CE, corresponding to the early half of Muslim rule. Our results indicate that it is possible to discern clear genetic impacts of the Muslim conquest and population movements associated with the subsequent Reconquista.
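Admixture dating methods estimate time in generations, so a calendar range such as 860 – 1120 CE depends on an assumed generation length and on when the sampled individuals were born. The following sketch only shows that arithmetic; the 28-year generation time and the birth year of 1940 are assumptions chosen for illustration, not values taken from the abstract.

```python
def generations_before(sample_birth_year, event_year_ce, years_per_generation=28.0):
    """Number of generations between an event and the birth of the samples."""
    if years_per_generation <= 0:
        raise ValueError("years_per_generation must be positive")
    return (sample_birth_year - event_year_ce) / years_per_generation


# Hypothetical: samples born around 1940, 28 years per generation.
oldest = generations_before(1940, 860)
youngest = generations_before(1940, 1120)
print(f"{youngest:.1f} to {oldest:.1f} generations ago")
```

Under these assumptions the quoted interval corresponds to roughly 29 to 39 generations before the sampled individuals; a different generation length shifts the calendar dates accordingly.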

“(a) Binary tree showing the inferred hierarchical relationships between clusters. The colours and points correspond to each cluster as shown on the map, and the length of the coloured rectangles is proportional to the number of individuals assigned to that cluster. We combined some small clusters (Methods) and the thick black branches indicate the clades of the tree that we visualise in the map. We have labeled clusters according to the approximate location of most of their members, but geographic data was not used in the inference. (b) Each individual is represented by a point placed at (or close to) the centroid of their grandparents’ birthplaces. On this map we only show the individuals for whom all four grandparents were born within 80km of their average birthplace, although the data for all individuals were used in the fineSTRUCTURE inference. The background is coloured according to the spatial densities of each cluster at the level of the tree where there are 14 clusters (see Methods). The colour and symbol of each point corresponds to the cluster the individual was assigned to at a lower level of the tree, as shown in (a). The labels and boundaries of Spain’s Autonomous Communities are also shown.”

Some interesting excerpts:

Our results further imply that north west African-like DNA predominated in the migration. Moreover, admixture mainly, and perhaps almost exclusively, occurred within the earlier half of the period of Muslim rule. Within Spain, north African ancestry occurs in all groups, although levels are low in the Basque region and in a region corresponding closely to the 14th-century ‘Crown of Aragon’. Therefore, although genetically distinct this implies that the Basques have not been completely isolated from the rest of Spain over the past 1300 years.

NOTE. I must add here that the Expulsion of the Moriscos is known to have been quite successful in the old Crown of Aragon (deeply affecting its economy), in contrast with the territories of the Crown of Castile, where they either formed less sizeable communities, or were dispersed and eventually Christianized and integrated into local communities. For example, thousands of Moriscos from Granada were dispersed following the War of the Alpujarras (1567–1571) into different regions of the Crown of Castile, and many could not later be expelled because the locals resisted the expulsion edict.

Perhaps surprisingly, north African ancestry does not reflect proximity to north Africa, or even regions under more extended Muslim control. The highest amounts of north African ancestry found within Iberia are in the west (11%) including in Galicia, despite the fact that the region of Galicia as it is defined today (north of the Miño river), was never under Muslim rule and Berber settlements north of the Douro river were abandoned by. This observation is consistent with previous work using Y-chromosome data. We speculate that the pattern we see is driven by later internal migratory flows, such as between Portugal and Galicia, and this would also explain why Galicia and Portugal show indistinguishable ancestry sharing with non-Spanish groups more generally. Alternatively, it might be that these patterns reflect regional differences in patterns of settlement and integration with local peoples of north African immigrants themselves, or varying extents of the large-scale expulsion of Muslim people, which occurred post-Reconquista and especially in towns and cities.

We estimated ancestry profiles for each point on a fine spatial grid across Spain (Methods). Gray crosses show the locations of sampled individuals used in the estimation. Map shows the fraction contributed from the donor group ‘NorthMorocco’.

Overall, the pattern of genetic differentiation we observe in Spain reflects the linguistic and geopolitical boundaries present around the end of the time of Muslim rule in Spain, suggesting this period has had a significant and long-term impact on the genetic structure observed in modern Spain, over 500 years later. In the case of the UK, similar geopolitical correspondence was seen, but to a different period in the past (around 600 CE). Noticeably, in these two cases, country-specific historical events rather than geographic barriers seem to drive overall patterns of population structure. The observation that fine-scale structure evolves at different rates in different places could be explained if observed patterns tend to reflect those at the ends of periods of significant past upheaval, such as the end of Muslim rule in Spain, and the end of the Anglo-Saxon and Danish Viking invasions in the UK.

Certain people still want to believe (well into the 21st century) in ideal ancestral populations and ancient ethnolinguistic identifications linked to their own – or their own country’s dominant – ancestral components and Y-DNA haplogroup.

We are nevertheless seeing how it is mainly the most recent relevant geopolitical events and late internal migratory flows that have shaped the genetic structure (including Y-DNA haplogroup composition) of modern regions and countries, regardless of their populations’ actual language or ethnic identification, whether (pre)historical or modern.

Another surprise for many, I guess.


WordPress Translation Plugin – now using Google Translation from and into Swedish, Finnish, Danish, Norwegian, Polish, Czech, Romanian, Bulgarian, Hindi, Arabic, Japanese, Chinese, etc.

The latest improvements added to the Indoeuropean Translator Widget have been included in the simpler WordPress Translation Plugin available in this personal blog.

It now includes links to automatic translations from and into all language pairs offered by the Google Translation Engine, as well as other language pairs (from individual languages, like English or Spanish) offered by other online machine translators, viz. Tranexp or Translendium.

Available language pairs now include English, Arabic, Bulgarian, Catalan*, Czech, Chinese (traditional/simplified), Welsh*, Danish, German, Greek, Spanish, Persian*, French, Hindi, Croatian, Icelandic*, Italian, Hebrew*, Latin*, Korean, Hungarian*, Dutch, Japanese, Norwegian (Bokmål), Polish, Portuguese (Brazilian Portuguese*), Romanian, Russian, Slovenian*, Serbian*, Swedish, Finnish, Tagalog*, Turkish* and Ukrainian*.

How ‘difficult’ (using Esperantist terms) is an inflected language like Proto-Indo-European for Europeans?

For native speakers of most modern Romance languages (apart from some vestiges of the neuter), Nordic (Germanic) languages, English, Dutch, or Bulgarian, it is usually considered “difficult” to learn an inflected language like Latin, German or Russian: cases are a priori felt to be too strange, too “archaic”, too ‘foreign’ to one’s own system of expressing ideas. However, for an ordinary German, Baltic, Slavic or Greek speaker, or for non-IE speakers of Basque or Uralic languages (Finnish, Hungarian, Estonian), cases are the only way to express common concepts and ideas, and they were also the common way of expression for speakers of the older stages of those very same uninflected languages, like Old English, Old Norse or Classical Latin; and their speakers didn’t consider their languages “difficult” …

Therefore, using different cases is the normal way to express concepts that non-inflected languages express in other ways – i.e. not “more easily”, but “differently”. That’s the point Esperantism has missed in its struggle to convince the world of its “easiness”. In fact, the idea that cases are difficult is so ingrained in Esperantism that some even created “an old version” [probably deemed “more difficult”] of Esperanto called Arcaicam Esperantom, as a fiction of evolution from an older language…

Thus, among the European population (more than 700 million inhabitants), just around 200 million speak non-inflected languages, while the rest use at least 4 cases to express every possible concept. Within the current EU, roughly half of the population speaks an inflected language – like German, Polish, Czech, Greek, Lithuanian, Slovenian, or non-IE Hungarian, Finnish, etc. – as their mother tongue.

For example, the literal sentence “I go to-the-house” [not exactly the common expression “I go home”, which is expressed differently in each language] would be said in Spanish “voy a-la-casa”, in French “je vais à-la-maison”, in Italian “vado a-lla-casa”, etc. Therefore, in an “easy conlang” for Western European speakers, say in something called Esperanto, a sentence like “io vo a-lo-haus” is apparently “easy”, because the syntactical structure is similar to that of those non-inflected languages.

NOTE: In fact, there are other interesting concepts behind the obligatory subject before the verb in languages like English or Esperanto, which usually appears in languages that have reduced their verbal system; the subject is necessary only in those languages whose verbal inflection has become too simple to express an idea that must still be expressed in some way – more or less like different combinations of prepositions and articles are often needed to substitute for the lost nominal inflection, as we discuss here. In those ‘less innovative’ languages that retain a rich verbal system, the subject appears only when there is a reason for it, as e.g. in Spanish “yo voy a la casa”, which must be expressed differently in innovative languages, using different linguistic resources, like e.g. Eng. “I myself go to the house” (or maybe “it’s me who…“), or French “moi, je vais à la maison”. Is that obligatory subject and ‘simplified’ verbal system of Esperanto “easier”, and therefore “better”…? I guess not. It’s just an imitation of French or English that Mr. Zamenhof deemed “better” for his creation to succeed, given the relevance of those languages (and their speakers’ acceptance) back in 1900…

On the other hand, in German it would be “Ich gehe nach-Haus-e”; in Latin, “vado ad-domu-m”; in Polish, “idę do-dom-u”, etc. Compared to uninflected languages, the use of declensions usually amounts to a simple change of “preposition+article” -> “declension” – or, in the ‘worst’ case (as shown here), “preposition+article” -> “preposition+declension”.

To sum up, can some languages be considered “more difficult” than others? Yes, indeed. Seen from a European point of view, some linguistic features are not easy to learn: the Arabic writing system, the seemingly endless Chinese characters, Sino-Tibetan or Vietnamese tones, etc. can cause headaches for [adult] speakers willing to learn them… Also, from an English, French or Spanish point of view, learning a language like Esperanto might seem “better” because of its apparent and misleading “easiness”… But between (a) all Indo-European speakers learning a non-inflected language like English [or ‘easy’ Esperanto] and (b) all Indo-European speakers learning an inflected one like Proto-Indo-European, I guess neither language is “easier” than the other, and therefore the “better” option should come from other rational considerations, not just faith in the absurd ramblings of an illuminated Polish ophthalmologist.

Therefore, the question still remains the same: why on earth should any European willing to speak a common language select an invented one (from the thousand “super easy” ones available) rather than a natural one, like the ancestor of most of their mother tongues, Proto-Indo-European?