Haplogroup R1b-M167/SRY2627 linked to Celts expanding with the Urnfield culture


As you can see from my interest in the recently published Olalde et al. (2019) Iberia paper, once you accept that East Bell Beakers expanded North-West Indo-European, the most important question becomes how its known dialects spread to their historically attested areas.

We already had a good idea about the expansion of Celts, based on proto-historical accounts, fragmentary languages, and linguistic guesstimates, but the connection of Celtic with either Urnfield or slightly later Hallstatt/La Tène was always blurred, due to the lack of precise data on population movements.

The latest paper on Iberia is interesting for many details, such as:

  • The express dismissal of the newest pet theory based on the simplistic “steppe ancestry = IE”: the obsessive comparisons of Dutch Bell Beakers as the origin of basically anything that moves in Europe.
  • A discrete influx of North African ancestry in certain samples before the Moorish invasion (which was probably mediated by peoples of North African rather than Levantine admixture).
  • The finding of very Mycenaean-like Greek colonies of the 5th century (interestingly, under R1b lineages).
Modified from section of PCA of ancient samples by Olalde et al. (2019). “IE Iberia” refers to Pre-Celtic Indo-European languages of Iberia, such as Galaico-Lusitanian in the west (see more on Lusitanian), and a potentially Ligurian-related language in the North-East and southern France.

The paper is, however, of particular importance from the perspective of historical linguistics. It confirms that:

  • Celtic-speaking peoples expanded in Iberia likely during the Late Bronze Age – Early Iron Age (probably with the Urnfield culture, before 1000 BC) with North/Central European ancestry.

NOTE. The paper uses the boundaries of non-Indo-European languages as they are believed to have stood in the later Iron Age, and extrapolates that situation to the past. Mediterranean sites with Iberian traits (ca. 6th century on) were probably inhabited by non-Indo-European-speaking tribes, but it is unclear what happened in the centuries before their sampling, and there are no clear boundaries. The arrival of Celts from central Europe with the Urnfield culture makes it very likely that the Iberian expansion to the north happened later, thus incorporating this central European ancestry in the process. The southern (orientalizing, Tartessian) site of La Angorrilla shows incineration and influence from Phoenician settlers, and the actual language spoken there is also far from clear. The other investigated samples, with higher central European contribution, are from Celtiberian sites.

  • The slightly later arrival of (Phoenician, Greek and) Latin-speaking peoples into Iberia is marked by Central/Eastern Mediterranean and North African ancestry.
Expansion of different ancestry components in Iberia during Prehistory. Modified from Olalde et al. (2019) to include labels with populations expanding with each component.

While both confirm what was more or less already known about the oldest attested NWIE dialects, and further support the role of East Bell Beakers in expanding North-West Indo-European, the first part is interesting for two main reasons:

  1. Koch’s Celtic from the West hypothesis, which made a recent comeback with a renewed model based on “steppe ancestry”, is once again rejected in population genomics, as expected. At this point I doubt this will mean anything to the supporters of the theory (because you can propose as many “Celtic-over-Celtic” layers as you want), but if you are not obsessed with autochthonous continuity of Celtic languages in the Atlantic area, we can begin to judge which of the dialectal splits (and thus classifications) proposed to date is the most correct, based on ancestry and haplogroup expansions.
  2. We believed in the 2000s that the expansion of haplogroup R1b-M167 (TMRCA ca. 1100 BC for YTree or 1700 BC for YFull) was coupled with the expansion of Iberians from the Pyrenees, in turn (thus) closely related to Basques. This non-IE presence has been contested with toponymic data in linguistics, and with the testing of many modern samples and the subsequent discovery of the widespread distribution of the subclade in western and northern Europe. Now it has become even more likely (lacking confirmation with aDNA) that this haplogroup expanded with Celts.

NOTE. Regarding R1b SNPs, YTree has more samples (and thus more SNPs) to work with for its estimates, due to its connection with FTDNA groups, so it is in principle more reliable (although its estimates were calculated in 2017). Nevertheless, YTree and YFull use different methods to estimate the age of the MRCA.

YTree estimations of TMRCA for R1b-Z262 (left) and R1b-M167 (right).

Why this is important has to do with the realization that Celts must have expanded explosively in all directions during the estimated range for Common Celtic (ca. 1500-1000 BC), and as such R1b-M167 is probably going to be one of the clear Y-DNA markers of the Celtic expansion, when it appears in the ancient DNA record, maybe in new SNP calls from samples of the Olalde et al. (2019) paper, or in future Urnfield/Hallstatt/La Tène papers.

Sister clades derived from R1b-Z262 (TMRCA ca. 1650 BC for YTree, or 2700 BC for YFull), although sharing a quite old origin, may have taken part in the same communities that expanded R1b-M167, likely from some point in central Europe, possibly as remnants of a previous (Tumulus culture?) central European expansion, as the sample SZ5 from Szólád (R1b-CTS1595) and the distribution of modern samples suggest.

Left: Modern distribution of upstream clade L176.2 (YFull R1b-CTS4188); Right: Modern distribution of M167. Both include later expansions within Iberia (probably with the Crown of Aragon during the Reconquista). Contour maps of the derived allele frequencies of the SNPs analyzed in Solé-Morata et al. (2017).

EDIT (23 APRIL): In Hernández et al. (2018), the TMRCA of R1b-M167 is reported as 3372-3718 ybp:

The youngest sub-branch, R1b-M167, dates to approximately 3.5 kya (95% CI= 2.5-5.3 kya), i.e. even after the Bronze Age.
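Since the YTree and YFull estimates above are given in years BC while Hernández et al. (2018) report years before present, it helps to put both on one scale. The following is only a minimal sketch: the function name `ybp_to_bc` is hypothetical, and the reference year 1950 (the radiocarbon convention) is an assumption, since the paper may count from a different "present".

```python
# Reference year for "before present". 1950 is the radiocarbon convention and
# is an assumption here, not something stated in the quoted paper.
PRESENT_YEAR = 1950


def ybp_to_bc(ybp: int, present_year: int = PRESENT_YEAR) -> int:
    """Convert years before present to a BC year (there is no year zero)."""
    astronomical_year = present_year - ybp  # 0 or negative means BC
    if astronomical_year > 0:
        raise ValueError("date falls in the AD era")
    return 1 - astronomical_year


# TMRCA range for R1b-M167 as reported in the text above.
oldest_ybp, youngest_ybp = 3718, 3372
print(f"ca. {ybp_to_bc(oldest_ybp)}-{ybp_to_bc(youngest_ybp)} BC")
```

Under that assumption the reported range corresponds to roughly 1770-1420 BC, which can then be set beside the ca. 1100 BC (YTree) and ca. 1700 BC (YFull) figures mentioned earlier.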

Contour (surface) maps displaying the frequencies of Y-chromosome haplogroup and its sub-lineages across Europe and the Mediterranean basin. Modified from Hernández et al. (2018).

NOTE. Admittedly, the maps are mainly based on Iberian samples and certain limited sampling elsewhere, so most of the frequencies displayed in other territories are extrapolated. Since the percentage of R1b-M167 in France is estimated to be ca. 3%, and in Bavaria ca. 5%, the actual frequency in Central Europe is probably much higher, and around the Mediterranean much lower, than the maps represent.

The Celtic expansion might not have been a mass migration of peoples replacing all male lines of their controlled territories (as was common in the Neolithic and Chalcolithic), because of the Bronze Age dominant chiefdom-based system that relied on alliances, but it is becoming clear that Early Celts are also going to show the expansion of certain successful male lineages.

Oh, and you can say goodbye to the autochthonous “Vasconic = R1b-DF27” (latest heir of the “Vasconic = R1b-P312”) theory, too, if – for some strange reason – you hadn’t already.

EDIT (16 MAR) Just in case the wording is not clear: the fact that this haplogroup most likely expanded with Celts does not mean that its lineages were not eventually incorporated into Iberian cultures, adopting non-IE languages: some of them probably were at some point, in some regions of northern Iberia, and most were certainly later incorporated into the Roman civilization and spoke Latin, then into the medieval kingdoms with their languages, and so on until the present day… Only those eventually associated with Iron Age Aquitanians may have retained their non-IE language, unless those lineages today associated with Basques were incorporated later into the Basque-speaking regions by expanding medieval kingdoms. A complex picture repeated everywhere in Europe: no haplogroup+language continuity in sight, anywhere.

NOTE. This is currently the most likely interpretation of the data, based on mutation-based age estimates; it is not confirmed with ancient samples.

Related

Iberia: East Bell Beakers spread Indo-European languages; Celts expanded later


New paper (behind paywall), The genomic history of the Iberian Peninsula over the past 8000 years, by Olalde et al. Science (2019).

NOTE. Access to article from Reich Lab: main paper and supplementary materials.

Abstract:

We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean.

Interesting excerpts:

From the Bronze Age (~2200–900 BCE), we increase the available dataset (6, 7, 17) from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period (Fig. 1, C and D), albeit with less impact in the south (table S13). The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry (Fig. 2B). These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups (Fig. 2B and fig. S6).

Y-chromosome turnover was even more pronounced (Fig. 2B), as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269. These patterns point to a higher contribution of incoming males than females, also supported by a lower proportion of nonlocal ancestry on the X-chromosome (table S14 and fig. S7), a paradigm that can be exemplified by a Bronze Age tomb from Castillejo del Bonete containing a male with Steppe ancestry and a female with ancestry similar to Copper Age Iberians.


For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age (Figs. 1, C and D, and 2B). The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken (fig. S6 and tables S11 and S12).

This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition (18). Unlike in Central or Northern Europe, where Steppe ancestry likely marked the introduction of Indo-European languages (12), our results indicate that, in Iberia, increases in Steppe ancestry were not always accompanied by switches to Indo-European languages.

I think it is obvious they are extrapolating the traditional (not that well-known) linguistic picture of Iberia during the Iron Age, believing in continuity of that picture (especially non-Indo-European languages) during the Urnfield period and earlier.

What this data shows is, as expected, the arrival of Celtic languages in Iberia after Bell Beakers and, by extension, in the rest of western Europe. Somewhat surprisingly, this may have happened during the Urnfield period, and not during the La Tène period.

Also important are the precise subclades:

We thus detect three Bronze Age males who belonged to DF27 (154, 155), confirming its presence in Bronze Age Iberia. The other Iberian Bronze Age males could belong to DF27 as well, but the extremely low recovery rate of this SNP in our dataset prevented us to study its true distribution. All the Iberian Bronze Age males with overlapping sequences at R1b-L21 were negative for this mutation. Therefore, we can rule out Britain as a plausible proximate origin since contemporaneous British males are derived for the L21 subtype.


New open access paper Survival of Late Pleistocene Hunter-Gatherer Ancestry in the Iberian Peninsula, by Villalba-Mouco et al. Cell (2019):

BAL0051 could be assigned to haplogroup I1, while BAL003 carries the C1a1a haplogroup. To the limits of our typing resolution, EN/MN individuals CHA001, CHA003, ELT002 and ELT006 share haplogroup I2a1b, which was also reported for Loschbour [73] and Motala HG [13], and other LN and Chalcolithic individuals from Iberia [7, 9], as well as Neolithic Scotland, France, England [9], and Lithuania [14]. Both C1 and I1/ I2 are considered typical European HG lineages prior to the arrival of farming. Interestingly, CHA002 was assigned to haplogroup R1b-M343, which together with an EN individual from Cova de Els Trocs (R1b1a) confirms the presence of R1b in Western Europe prior to the expansion of steppe pastoralists that established a related male lineage in Bronze Age Europe [3, 6, 9, 13, 19]. The geographical vicinity and contemporaneity of these two sites led us to run genomic kinship analysis in order to rule out any first or second degree of relatedness. Early Neolithic individual FUC003 carries the Y haplogroup G2a2a1, commonly found in other EN males from Neolithic Anatolia [13], Starçevo, LBK Hungary [18], Impressa from Croatia and Serbia Neolithic [19] and Czech Neolithic [9], but also in MN Croatia [19] and Chalcolithic Iberia [9].

See also

Population structure in Argentina shows most European sources of South European origin


Open access Population structure in Argentina, by Muzzio et al., PLOS One (2018).

Abstract (emphasis mine):

We analyzed 391 samples from 12 Argentinian populations from the Center-West, East and North-West regions with the Illumina Human Exome Beadchip v1.0 (HumanExome-12v1-A). We did Principal Components analysis to infer patterns of populational divergence and migrations. We identified proportions and patterns of European, African and Native American ancestry and found a correlation between distance to Buenos Aires and proportion of Native American ancestry, where the highest proportion corresponds to the Northernmost populations, which is also the furthest from the Argentinian capital. Most of the European sources are from a South European origin, matching historical records, and we see two different Native American components, one that spreads all over Argentina and another specifically Andean. The highest percentages of African ancestry were in the Center West of Argentina, where the old trade routes took the slaves from Buenos Aires to Chile and Peru. Subcontinentaly, sources of this African component are represented by both West Africa and groups influenced by the Bantu expansion, the second slightly higher than the first, unlike North America and the Caribbean, where the main source is West Africa. This is reasonable, considering that a large proportion of the ships arriving at the Southern Hemisphere came from Mozambique, Loango and Angola.

Principal component analysis.
On the x axis is PC 1 while PC2 is the y axis. Plus symbols represent Argentinian samples and circles are for reference panels. Fig 2a (left) Argentinians with YRI and LWK for African references (“African”), IBS and TSI for European references (“European”) and the PEL, MXL, PUR and CLM as a Latin American references. Fig 2b (right) samples from Argentina with IBS, MXL, CLM and PEL.

Related:

Iberian prehistoric migrations in Genomics from Neolithic, Chalcolithic, and Bronze Age


New open access paper Four millennia of Iberian biomolecular prehistory illustrate the impact of prehistoric migrations at the far end of Eurasia, by Valdiosera, Günther, Vera-Rodríguez, et al. PNAS (2018) published ahead of print.

Abstract (emphasis mine)

Population genomic studies of ancient human remains have shown how modern-day European population structure has been shaped by a number of prehistoric migrations. The Neolithization of Europe has been associated with large-scale migrations from Anatolia, which was followed by migrations of herders from the Pontic steppe at the onset of the Bronze Age. Southwestern Europe was one of the last parts of the continent reached by these migrations, and modern-day populations from this region show intriguing similarities to the initial Neolithic migrants. Partly due to climatic conditions that are unfavorable for DNA preservation, regional studies on the Mediterranean remain challenging. Here, we present genome-wide sequence data from 13 individuals combined with stable isotope analysis from the north and south of Iberia covering a four-millennial temporal transect (7,500–3,500 BP). Early Iberian farmers and Early Central European farmers exhibit significant genetic differences, suggesting two independent fronts of the Neolithic expansion. The first Neolithic migrants that arrived in Iberia had low levels of genetic diversity, potentially reflecting a small number of individuals; this diversity gradually increased over time from mixing with local hunter-gatherers and potential population expansion. The impact of post-Neolithic migrations on Iberia was much smaller than for the rest of the continent, showing little external influence from the Neolithic to the Bronze Age. Paleodietary reconstruction shows that these populations have a remarkable degree of dietary homogeneity across space and time, suggesting a strong reliance on terrestrial food resources despite changing culture and genetic make-up.

(A) f4 statistics testing affinities of prehistoric European farmers to either early Neolithic Iberians or central Europeans, restricting these reference populations to SNP-captured individuals to avoid technical artifacts driving the affinities. The boxplots in A show the distributions of all individual f4 statistics belonging to the respective groups. The signal is not sensitive to the choice of reference populations and is not driven by hunter-gatherer–related admixture (Datasets S4 and S5). (B) Estimates of ancestry proportions in different prehistoric Europeans as well as modern southwestern Europeans. Individuals from regions of Iberia were grouped together for the analysis in A and B to increase sample sizes per group and reduce noise

Conclusion:

We present a comprehensive biomolecular dataset spanning four millennia of prehistory across the whole Iberian Peninsula. Our results highlight the power of archaeogenomic studies focusing on specific regions and covering a temporal transect. The 4,000 y of prehistory in Iberia were shaped by major chronological changes but with little geographic substructure within the Peninsula. The subtle but clear genetic differences between early Neolithic Iberian farmers and early Neolithic central European farmers point toward two independent migrations, potentially originating from two slightly different source populations. These populations followed different routes, one along the Mediterranean coast, giving rise to early Neolithic Iberian farmers, and one via mainland Europe forming early Neolithic central European farmers. This directly links all Neolithic Iberians with the first migrants that arrived with the initial Mediterranean Neolithic wave of expansion. These Iberians mixed with local hunter-gatherers (but maintained farming/pastoral subsistence strategies, i.e., diet), leading to a recovery from the loss of genetic diversity emerging from the initial migration founder bottleneck. Only after the spread of Bell Beaker pottery did steppe-related ancestry arrive in Iberia, where it had smaller contributions to the population compared with the impact that it had in central Europe. This implies that the two prehistoric migrations causing major population turnovers in central Europe had differential effects at the southwestern edge of their distribution: The Neolithic migrations caused substantial changes in the Iberian gene pool (the introduction of agriculture by farmers) (6, 9, 11, 13, 24), whereas the impact of Bronze Age migrations (Yamnaya) was significantly smaller in Iberia than in north-central Europe (24). 
The post-Neolithic prehistory of Iberia is generally characterized by interactions between residents rather than by migrations from other parts of Europe, resulting in relative genetic continuity, while most other regions were subject to major genetic turnovers after the Neolithic (4, 6, 7, 9, 25, 48). Although Iberian populations represent the furthest wave of Neolithic expansion in the westernmost Mediterranean, the subsequent populations maintain a surprisingly high genetic legacy of the original pioneer farming migrants from the east compared with their central European counterparts. This counterintuitive result emphasizes the importance of in-depth diachronic studies in all parts of the continent.

Related:

Population substructure in Iberia, highest in the north-west territory (to appear in Nature)

A manuscript co-authored by Angel Carracedo, from the University of Santiago de Compostela, and (always according to him) pre-accepted in Nature, will offer more insight into the population substructure of Spain, based on autosomal DNA.

Carracedo’s lecture about DNA (in Galician), including his summary of the paper (from December 2017):

Some of the points made in the video:

  • The study shows a situation parallelling – as expected – the expansion of Spanish Medieval kingdoms during the Reconquista (and subsequent repopulation).
  • In it, the biggest surprise seems to be the greater substructure found in Galicia, the north-western Spanish territory – greater even than expected by the authors.
  • As a side note, Galicia shows a great influence from “Moorish” ancestral components, mainly due to the influx from Portugal, which shows even more.

It is difficult to judge only from the image and his words, but one could say that there are:

  • Certain quite old ancestral Galician groups;
    • then two – also quite old – ancestral Basque groups;
      • then more recent Galician groups;
        • and then a common, central Spanish group – including
          • a wider Asturian-Catalan group, with a western Asturian-Leonese, and an eastern Catalan subgroup;
          • and a central Castilian-Aragonese group, also with a western Castilian, and an eastern Aragonese subgroup.
Spain’s population substructure, from the video.

We thought that certain parts of the British Isles could show ancestral components related to the old population, although this has not proven exactly right, due to more recent population expansions.

However, this paper might shed light on the controversy surrounding Lusitanian (possibly Gallaico-Lusitanian) as a Pre-Celtic Indo-European group of Iberia, either slightly older as an Italo-Celtic dialect, or potentially from the Bell Beaker expansion, whose genetic imprint might have survived the Roman conquest, which apparently didn’t replace its ancestral population.

Given the presence of a central Spanish group opposed to the other minor groups, and knowing that (at least part of) the Medieval kingdoms should be related to the Occitan region (due to the Celtic expansion, and also potentially later during the Visigothic Kingdom and the Carolingian Empire), we can only guess that the other (north-western and Basque) groups are potentially quite old, and reflect prehistoric population structures.

Just speculating here, of course. Another interesting genetic paper to await…

Seen first in the Facebook group Iberia ADN.

Related:

Iberian Peninsula: Discontinuity in mtDNA between hunter-gatherers and farmers, not so much during the Chalcolithic and EBA


A new preprint at bioRxiv, The maternal genetic make-up of the Iberian Peninsula between the Neolithic and the Early Bronze Age, by Szécsényi-Nagy et al. (2017).

Abstract:

Agriculture first reached the Iberian Peninsula around 5700 BCE. However, little is known about the genetic structure and changes of prehistoric populations in different geographic areas of Iberia. In our study, we focused on the maternal genetic makeup of the Neolithic (~ 5500-3000 BCE), Chalcolithic (~ 3000-2200 BCE) and Early Bronze Age (~ 2200-1500 BCE). We report ancient mitochondrial DNA results of 213 individuals (151 HVS-I sequences) from the northeast, central, southeast and southwest regions and thus on the largest archaeogenetic dataset from the Peninsula to date. Similar to other parts of Europe, we observe a discontinuity between hunter-gatherers and the first farmers of the Neolithic. During the subsequent periods, we detect regional continuity of Early Neolithic lineages across Iberia, however the genetic contribution of hunter-gatherers is generally higher than in other parts of Europe and varies regionally. In contrast to ancient DNA findings from Central Europe, we do not observe a major turnover in the mtDNA record of the Iberian Late Chalcolithic and Early Bronze Age, suggesting that the population history of the Iberian Peninsula is distinct in character.

Iberian mtDNA samples

Detailed conclusions of their work:

The present study, based on 213 new and 125 published mtDNA data of prehistoric Iberian individuals suggests a more complex mode of interaction between local hunter-gatherers and incoming early farmers during the Early and Middle Neolithic of the Iberian Peninsula, as compared to Central Europe. A characteristic of Iberian population dynamics is the proportion of autochthonous hunter-gatherer haplogroups, which increased in relation to the distance to the Mediterranean coast. In contrast, the early farmers in Central Europe showed comparatively little admixture of contemporaneous hunter-gatherer groups. Already during the first centuries of Neolithic transition in Iberia, we observe a mix of female DNA lineages of different origins. Earlier hunter-gatherer haplogroups were found together with a variety of new lineages, which ultimately derive from Near Eastern farming groups. On the other hand, some early Neolithic sites in northeast Iberia, especially the early group from the cave site of Els Trocs in the central Pyrenees, seem to exhibit affinities to Central European LBK communities. The diversity of female lineages in the Iberian communities continued even during the Chalcolithic, when populations became more homogeneous, indicating higher mobility and admixture across different geographic regions. Even though the sample size available for Early Bronze Age populations is still limited, especially with regards to El Argar groups, we observe no significant changes to the mitochondrial DNA pool until the end of our time transect (1500 BCE). The expansion of groups from the eastern steppe, which profoundly impacted Late Neolithic and EBA groups of Central and North Europe, cannot (yet) be seen in the contemporaneous population substrate of the Iberian Peninsula at the present level of genetic resolution. 
This highlights the distinct character of the Neolithic transition both in the Iberian Peninsula and elsewhere and emphasizes the need for further in depth archaeogenetic studies for reconstructing the close reciprocal relationship of genetic and cultural processes on the population level.

So it seems more and more likely that the North-West Indo-European invasion during the Copper Age (signaled by changes in Y-DNA lineages) was not, as in central Europe, accompanied by much mtDNA turnover. What that means – either a male-dominated invasion, or a longer internal evolution of invasive Y-DNA subclades – remains to be seen, but I am still more inclined to see the former as the most likely interpretation, in spite of admixture results.

Related:

Featured images: from the article, licensed BY-NC-ND.

Analysis of R1b-DF27 haplogroups in modern populations adds new information that contrasts with ‘steppe admixture’ results


New open access article published in Scientific Reports, Analysis of the R1b-DF27 haplogroup shows that a large fraction of Iberian Y-chromosome lineages originated recently in situ, by Solé-Morata et al. (2017).

Abstract

Haplogroup R1b-M269 comprises most Western European Y chromosomes; of its main branches, R1b-DF27 is by far the least known, and it appears to be highly prevalent only in Iberia. We have genotyped 1072 R1b-DF27 chromosomes for six additional SNPs and 17 Y-STRs in population samples from Spain, Portugal and France in order to further characterize this lineage and, in particular, to ascertain the time and place where it originated, as well as its subsequent dynamics. We found that R1b-DF27 is present in frequencies ~40% in Iberian populations and up to 70% in Basques, but it drops quickly to 6–20% in France. Overall, the age of R1b-DF27 is estimated at ~4,200 years ago, at the transition between the Neolithic and the Bronze Age, when the Y chromosome landscape of W Europe was thoroughly remodeled. In spite of its high frequency in Basques, Y-STR internal diversity of R1b-DF27 is lower there, and results in more recent age estimates; NE Iberia is the most likely place of origin of DF27. Subhaplogroup frequencies within R1b-DF27 are geographically structured, and show domains that are reminiscent of the pre-Roman Celtic/Iberian division, or of the medieval Christian kingdoms.

Some people like to say that Y-DNA haplogroup analysis, or phylogeography in general, is of no use anymore (especially modern phylogeography), and they are content to see how ‘steppe admixture’ was (or even is) distributed in Europe to draw conclusions about ancient languages and their expansion. With each new paper, we are seeing the advantages of analysing ancient and modern haplogroups in ascertaining population movements.

Quite recently there was a suggestion based on steppe admixture that Basque-speaking Iberians resisted the invasion from the steppe. Observing the results of this article (dates of expansion and demographic data) we see a clear expansion of Y-DNA haplogroups precisely by the time of Bell Beaker expansion from the east. Y-DNA haplogroups of ancient samples from Portugal point exactly to the same conclusion.

The situation of R1b-DF27 in Basques, as I have pointed out elsewhere, is then probably similar to the genetic drift of Finns, mainly of N1c lineages, who today speak a Uralic language that expanded with Corded Ware and R1a subclades.

The recent article on Mycenaean and Minoan genetics also showed that, when it comes to Europe, most of the demographic patterns we see in admixture are reminiscent of the previous situation; only rarely can we see a clear change in admixture (which would mean an important, sudden replacement of the previous population).

Equating the so-called steppe admixture with Indo-European languages is wrong. Period.

The following are excerpts from the article (emphasis is mine):

Dates and expansions

The average STR variance of DF27 and each subhaplogroup is presented in Suppl. Table 2. As expected, internal diversity was higher in the deeper, older branches of the phylogeny. If the same diversity was divided by population, the most salient finding is that native Basques (Table 2) have a lower diversity than other populations, which contrasts with the fact that DF27 is notably more frequent in Basques than elsewhere in Iberia (Suppl. Table 1). Diversity can also be measured as pairwise differences distributions (Fig. 5). The distribution of mean pairwise differences within Z195 sits practically on top of that of DF27; L176.2 and Z220 have similar distributions, as M167 and Z278 have as well; finally, M153 shows the lowest pairwise distribution values. This pattern is likely to reflect the respective ages of the haplogroups, which we have estimated by a modified, weighted version of the ρ statistic (see Methods).
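
To make the two diversity measures mentioned above more concrete, here is a minimal sketch in Python. It computes the mean pairwise difference between STR haplotypes and the plain (unweighted) ρ statistic, which is not the modified, weighted version the authors describe in their Methods. The haplotypes, the per-locus mutation rate and the 30-year generation time are invented for illustration, and taking the modal haplotype as the root is an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations

def step_distance(h1, h2):
    """Number of single-repeat steps separating two STR haplotypes."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def mean_pairwise_difference(haplotypes):
    """Average step distance over all unordered pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    return sum(step_distance(a, b) for a, b in pairs) / len(pairs)

def modal_haplotype(haplotypes):
    """Most common repeat count at each locus (used here as the assumed root)."""
    n_loci = len(haplotypes[0])
    return tuple(
        Counter(h[i] for h in haplotypes).most_common(1)[0][0]
        for i in range(n_loci)
    )

def rho(haplotypes, root):
    """Plain rho: mean number of mutational steps from the root haplotype."""
    return sum(step_distance(h, root) for h in haplotypes) / len(haplotypes)

def rho_age_years(haplotypes, rates_per_locus, years_per_generation):
    """Age estimate: rho divided by the summed mutation rate, scaled to years."""
    root = modal_haplotype(haplotypes)
    generations = rho(haplotypes, root) / sum(rates_per_locus)
    return generations * years_per_generation

# Toy data: five chromosomes typed at three STR loci (illustrative values only).
toy_haplotypes = [
    (14, 12, 24),
    (14, 12, 24),
    (15, 12, 24),
    (14, 13, 24),
    (14, 12, 23),
]
toy_rates = [0.002, 0.002, 0.002]  # hypothetical mutations per locus per generation

toy_root = modal_haplotype(toy_haplotypes)
toy_mpd = mean_pairwise_difference(toy_haplotypes)
toy_rho = rho(toy_haplotypes, toy_root)
toy_age = rho_age_years(toy_haplotypes, toy_rates, years_per_generation=30)
```

A lower value of either measure within a population, as reported for Basques, yields a younger age estimate, regardless of how frequent the haplogroup is there.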

Z195 seems to have appeared almost simultaneously within DF27, since its estimated age is actually older (4570 ± 140 ya). Of the two branches stemming from Z195, L176.2 seems to be slightly younger than Z220 (2960 ± 230 ya vs. 3320 ± 200 ya), although the confidence intervals slightly overlap. M167 is clearly younger, at 2600 ± 250 ya, a similar age to that of Z278 (2740 ± 270 ya). Finally, M153 is estimated to have appeared just 1930 ± 470 ya.
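
The remark about overlapping intervals can be checked with simple arithmetic. The sketch below treats each "±" value as the half-width of an interval around the point estimate, which is an assumption, since the excerpt does not say how the intervals were built. The function names are my own.

```python
def interval(center, half_width):
    """Return (low, high) bounds, treating the +/- value as a half-width."""
    return (center - half_width, center + half_width)

def overlap(a, b):
    """Length in years of the overlap between two intervals (0 if disjoint)."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0, high - low)

# Age estimates in years ago (ya), as quoted in the excerpt above.
ages = {
    "Z195": interval(4570, 140),
    "Z220": interval(3320, 200),
    "L176.2": interval(2960, 230),
    "Z278": interval(2740, 270),
    "M167": interval(2600, 250),
    "M153": interval(1930, 470),
}

l176_vs_z220 = overlap(ages["L176.2"], ages["Z220"])
m167_vs_z278 = overlap(ages["M167"], ages["Z278"])
z195_vs_z220 = overlap(ages["Z195"], ages["Z220"])
```

Under that reading, L176.2 (2730 to 3190 ya) and Z220 (3120 to 3520 ya) share only a narrow 70-year window, whereas M167 and Z278 overlap by 380 years, which fits the authors' description of them as similar in age.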

Haplogroup ages can also be estimated within each population, although they should be interpreted with caution (see Discussion). For the whole of DF27, (Table 3), the highest estimate was in Aragon (4530 ± 700 ya), and the lowest in France (3430 ± 520 ya); it was 3930 ± 310 ya in Basques. Z195 was apparently oldest in Catalonia (4580 ± 240 ya), and with France (3450 ± 269 ya) and the Basques (3260 ± 198 ya) having lower estimates. On the contrary, in the Z220 branch, the oldest estimates appear in North-Central Spain (3720 ± 313 ya for Z220, 3420 ± 349 ya for Z278). The Basques always produce lower estimates, even for M153, which is almost absent elsewhere.

Simplified phylogenetic tree of the R1b-M269 haplogroup. SNPs in italics were not analyzed in this manuscript.

Demography

The median value for Tstart has been estimated at 103 generations (Table 4), with a 95% highest probability density (HPD) range of 50–287 generations; effective population size increased from 131 (95% HPD: 100–370) to 72,811 (95% HPD: 52,522–95,334). Considering patrilineal generation times of 30–35 years, our results indicate that R1b-DF27 started its expansion ~3,000–3,500 ya, shortly after its TMRCA.
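
The conversion from generations to years in this paragraph is plain multiplication. Here is a short sketch using the 30–35 year patrilineal generation times quoted above; the function name is my own.

```python
def generations_to_years(generations, years_per_generation=(30, 35)):
    """Convert a count of generations into a (low, high) range of years."""
    low, high = years_per_generation
    return (generations * low, generations * high)

# Median start of the expansion (Tstart) and its 95% HPD bounds, in generations.
median_years = generations_to_years(103)
hpd_lower_years = generations_to_years(50)
hpd_upper_years = generations_to_years(287)

# Widest span implied by the HPD bounds across both generation times.
hpd_span_years = (hpd_lower_years[0], hpd_upper_years[1])
```

The median gives 3,090 to 3,605 years, roughly in line with the rounded ~3,000–3,500 ya quoted above, while the HPD bounds span about 1,500 to 10,045 years, which shows how wide the uncertainty around the start of the expansion is.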

As a reference, we applied the same analysis to the whole of R1b-S116, as well as to other common haplogroups such as G2a, I2, and J2a. Interestingly, all four haplogroups showed clear evidence of an expansion (p > 0.99 in all cases), all of them starting at the same time, ~50 generations ago (Table 4), and with similar estimated initial and final populations. Thus, these four haplogroups point to a common population expansion, even though I2 (TMRCA, weighted ρ, 7,800 ya) and J2a (TMRCA, 5,500 ya) are older than R1b-DF27. It is worth noting that the expansion of these haplogroups happened after the TMRCA of R1b-DF27.

Principal component analysis of STR haplotypes. (a) Colored by subhaplogroup, (b) colored by population. Larger squares represent subhaplogroup or population centroids.

Sum up and discussion

We have characterized the geographical distribution and phylogenetic structure of haplogroup R1b-DF27 in W. Europe, particularly in Iberia, where it reaches its highest frequencies (40–70%). The age of this haplogroup appears clear: with independent samples (our samples vs. the 1000 genome project dataset) and independent methods (variation in 15 STRs vs. whole Y-chromosome sequences), the age of R1b-DF27 is firmly grounded around 4000–4500 ya, which coincides with the population upheaval in W. Europe at the transition between the Neolithic and the Bronze Age. Before this period, R1b-M269 was rare in the ancient DNA record, and during it the current frequencies were rapidly reached. It is also one of the haplogroups (along with its daughter clades, R1b-U106 and R1b-S116) with a sequence structure that shows signs of a population explosion or burst. STR diversity in our dataset is much more compatible with population growth than with stationarity, as shown by the ABC results, but, contrary to other haplogroups such as the whole of R1b-S116, G2a, I2 or J2a, the start of this growth is closer to the TMRCA of the haplogroup. Although the median time for the start of the expansion is older in R1b-DF27 than in other haplogroups, and could suggest the action of a different demographic process, all HPD intervals broadly overlap, and thus, a common demographic history may have affected the whole of the Y chromosome diversity in Iberia. The HPD intervals encompass a broad timeframe, and could reflect the post-Neolithic population expansions from the Bronze Age to the Roman Empire.

While when R1b-DF27 appeared seems clear, where it originated may be more difficult to pinpoint. If we extrapolated directly from haplogroup frequencies, then R1b-DF27 would have originated in the Basque Country; however, for R1b-DF27 and most of its subhaplogroups, internal diversity measures and age estimates are lower in Basques than in any other population. Then, the high frequencies of R1b-DF27 among Basques could be better explained by drift rather than by a local origin (except for the case of M153; see below), which could also have decreased the internal diversity of R1b-DF27 among Basques. An origin of R1b-DF27 outside the Iberian Peninsula could also be contemplated, and could mirror the external origin of R1b-M269, even if it reaches there its highest frequencies. However, the search for an external origin would be limited to France and Great Britain; R1b-DF27 seems to be rare or absent elsewhere: Y-STR data are available only for France, and point to a lower diversity and more recent ages than in Iberia (Table 3). Unlike in Basques, drift in a traditionally closed population seems an unlikely explanation for this pattern, and therefore, it does not seem probable that R1b-DF27 originated in France. Then, a local origin in Iberia seems the most plausible hypothesis. Within Iberia, Aragon shows the highest diversity and age estimates for R1b-DF27, Z195, and the L176.2 branch, although, given the small sample size, any conclusion should be taken cautiously. On the contrary, Z220 and Z278 are estimated to be older in North Central Spain (N Castile, Cantabria and Asturias). Finally, M153 is almost restricted to the Basque Country: it is rarely present at frequencies >1% elsewhere in Spain (although see the cases of Alacant, Andalusia and Madrid, Suppl. Table 1), and it was found at higher frequencies (10–17%) in several Basque regions; a local origin seems plausible, but, given the scarcity of M153 chromosomes outside of the Basque Country, the diversity and age values cannot be compared.

Within its range, R1b-DF27 shows some geographical differentiation: Western Iberia (particularly, Asturias and Portugal), with low frequencies of R1b-Z195 derived chromosomes and relatively high values of R1b-DF27* (xZ195); North Central Spain is characterized by relatively high frequencies of the Z220 branch compared to the L176.2 branch; the latter is more abundant in Eastern Iberia. Taken together, these observations seem to match the East-West patterning that has occurred at least twice in the history of Iberia: i) in pre-Roman times, with Celtic-speaking peoples occupying the center and west of the Iberian Peninsula, while the non-Indoeuropean eponymous Iberians settled the Mediterranean coast and hinterland; and ii) in the Middle Ages, when Christian kingdoms in the North expanded gradually southwards and occupied territories held by Muslim fiefs.

Contour maps of the derived allele frequencies of the SNPs analyzed in this manuscript. Population abbreviations as in Table 1. Maps were drawn with SURFER v. 12 (Golden Software, Golden CO, USA).

I wouldn’t trust the rarity or absence of R1b-DF27 beyond Iberia, France and Great Britain as proof that its origin must be in Western Europe, especially since we now have ancient DNA and that assertion might prove quite wrong. Aside from that, the article seems solid in its analysis of modern populations.

Text and figures from the article, licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.