We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean.
From the Bronze Age (~2200–900 BCE), we increase the available dataset (6, 7, 17) from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period (Fig. 1, C and D), albeit with less impact in the south (table S13). The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry (Fig. 2B). These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups (Fig. 2B and fig. S6).
Y-chromosome turnover was even more pronounced (Fig. 2B), as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269. These patterns point to a higher contribution of incoming males than females, also supported by a lower proportion of nonlocal ancestry on the X-chromosome (table S14 and fig. S7), a paradigm that can be exemplified by a Bronze Age tomb from Castillejo del Bonete containing a male with Steppe ancestry and a female with ancestry similar to Copper Age Iberians.
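The comparison between autosomal and X-chromosome ancestry can be turned into a rough estimate of male versus female contributions. The following is a minimal sketch, assuming a simple single-pulse admixture model in which autosomes come equally from both sexes while two thirds of X-chromosomes are carried by females. The function name and the X-chromosome value are hypothetical illustrations and are not taken from table S14.

```python
def sex_specific_contributions(autosomal, x_chrom):
    """Solve A = (m + f) / 2 and X = (m + 2f) / 3 for m and f.

    autosomal: incoming ancestry proportion on the autosomes (A)
    x_chrom:   incoming ancestry proportion on the X chromosome (X)
    Returns (m, f): incoming proportion among male and female ancestors.
    """
    female = 3 * x_chrom - 2 * autosomal
    male = 4 * autosomal - 3 * x_chrom
    return male, female


# ~40% autosomal incoming ancestry is the figure quoted in the text;
# the X-chromosome value below is a made-up number for illustration only.
autosomal_incoming = 0.40
x_incoming_hypothetical = 0.30

male, female = sex_specific_contributions(autosomal_incoming, x_incoming_hypothetical)
print(f"male contribution ~ {male:.2f}, female contribution ~ {female:.2f}")
```

Under this toy model, any X-chromosome proportion below the autosomal one implies a male-biased contribution, which is the direction of the effect described in the excerpt.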
For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age (Figs. 1, C and D, and 2B). The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken (fig. S6 and tables S11 and S12).
This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition (18). Unlike in Central or Northern Europe, where Steppe ancestry likely marked the introduction of Indo-European languages (12), our results indicate that, in Iberia, increases in Steppe ancestry were not always accompanied by switches to Indo-European languages.
I think it is obvious that they are extrapolating the traditional (and not that well-known) linguistic picture of Iron Age Iberia backward, assuming continuity of that picture (especially of non-Indo-European languages) through the Urnfield period and earlier.
What these data show is, as expected, the arrival of Celtic languages in Iberia after Bell Beakers and, by extension, in the rest of western Europe. Somewhat surprisingly, this may have happened during the Urnfield period, and not during the La Tène period.
Also important are the precise subclades:
We thus detect three Bronze Age males who belonged to DF27 (154, 155), confirming its presence in Bronze Age Iberia. The other Iberian Bronze Age males could belong to DF27 as well, but the extremely low recovery rate of this SNP in our dataset prevented us from studying its true distribution. All the Iberian Bronze Age males with overlapping sequences at R1b-L21 were negative for this mutation. Therefore, we can rule out Britain as a plausible proximate origin since contemporaneous British males are derived for the L21 subtype.
BAL0051 could be assigned to haplogroup I1, while BAL003 carries the C1a1a haplogroup. To the limits of our typing resolution, EN/MN individuals CHA001, CHA003, ELT002 and ELT006 share haplogroup I2a1b, which was also reported for Loschbour and Motala HG, and other LN and Chalcolithic individuals from Iberia [7, 9], as well as Neolithic Scotland, France, England, and Lithuania. Both C1 and I1/I2 are considered typical European HG lineages prior to the arrival of farming. Interestingly, CHA002 was assigned to haplogroup R1b-M343, which together with an EN individual from Cova de Els Trocs (R1b1a) confirms the presence of R1b in Western Europe prior to the expansion of steppe pastoralists that established a related male lineage in Bronze Age Europe [3, 6, 9, 13, 19]. The geographical vicinity and contemporaneity of these two sites led us to run genomic kinship analysis in order to rule out any first or second degree of relatedness. Early Neolithic individual FUC003 carries the Y haplogroup G2a2a1, commonly found in other EN males from Neolithic Anatolia, Starčevo, LBK Hungary, Impressa from Croatia and Serbia Neolithic and Czech Neolithic, but also in MN Croatia and Chalcolithic Iberia.
Opportunities to directly study the founding of a human population and its subsequent evolutionary history are rare. Using genome sequence data from 27 ancient Icelanders, we demonstrate that they are a combination of Norse, Gaelic, and admixed individuals. We further show that these ancient Icelanders are markedly more similar to their source populations in Scandinavia and the British-Irish Isles than to contemporary Icelanders, who have been shaped by 1100 years of extensive genetic drift. Finally, we report evidence of unequal contributions from the ancient founders to the contemporary Icelandic gene pool. These results provide detailed insights into the making of a human population that has proven extraordinarily useful for the discovery of genotype-phenotype associations.
We estimated the mean Norse ancestry of the settlement population (24 pre-Christians and one early Christian) as 0.566 [95% confidence interval (CI) 0.431–0.702], with a nonsignificant difference between males (0.579) and females (0.521). Applying the same ADMIXTURE analysis to each of the 916 contemporary Icelanders, we obtained a mean Norse ancestry of 0.704 (95% CI 0.699–0.709). Although not statistically significant (t test p = 0.058), this difference is suggestive. A similar difference of Norse ancestry was observed with a frequency-based weighted least-squares admixture estimator (16), 0.625 [Mean squared error (MSE) = 0.083] versus 0.74 (MSE = 0.0037). Finally, the D-statistic test D(YRI, X; Gaelic, Norse) also revealed a greater affinity between Norse and contemporary Icelanders (0.0004, 95% CI 0.00008–0.00072) than between Norse and ancient Icelanders (−0.0002, 95% CI −0.00056–0.00015). This observation raises the possibility that reproductive success among the earliest Icelanders was stratified by ancestry, as genetic drift alone is unlikely to systematically alter ancestry at thousands of independent loci (fig. S10). We note that many settlers of Gaelic ancestry came to Iceland as slaves, whose survival and freedom to reproduce is likely to have been constrained (17). Some shift in ancestry must also be due to later immigration from Denmark, which maintained colonial control over Iceland from 1380 to 1944 (for example, in 1930 there were 745 Danes out of a total population of 108,629 in Iceland) (18).
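For readers unfamiliar with the D-statistic quoted above, here is a minimal sketch of the frequency-based form D(W, X; Y, Z), assuming per-SNP allele frequencies are already available for the four populations. The function name and the toy frequencies are hypothetical; the confidence intervals reported in the paper would require a resampling step (typically a block jackknife) that is omitted here.

```python
def d_statistic(w, x, y, z):
    """Frequency-based D(W, X; Y, Z) over a list of SNPs.

    Each argument is a sequence of allele frequencies (same allele, same SNP
    order). With this sign convention, a positive value indicates excess
    allele sharing between X and Z (or W and Y), a negative value between
    X and Y (or W and Z).
    """
    numerator = 0.0
    denominator = 0.0
    for pw, px, py, pz in zip(w, x, y, z):
        numerator += (pw - px) * (py - pz)
        denominator += (pw + px - 2 * pw * px) * (py + pz - 2 * py * pz)
    if denominator == 0:
        raise ValueError("no informative SNPs")
    return numerator / denominator


# Toy, made-up frequencies for two SNPs, laid out as D(YRI, X; Gaelic, Norse).
yri = [0.1, 0.5]
test_pop = [0.2, 0.4]
gaelic = [0.3, 0.6]
norse = [0.4, 0.5]

print(d_statistic(yri, test_pop, gaelic, norse))
```

In the layout used in the excerpt, a more positive value for contemporary Icelanders than for ancient Icelanders is what is read as greater affinity to the Norse reference.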
Five pre-Christian Icelanders (VDP-A5, DAV-A9, NNM-A1, SVK-A1 and TGS-A1) fall just outside the space occupied by contemporary Norse in Fig. 3A. That these individuals show a stronger signal of drift shared with contemporary Icelanders is also apparent in the results of ADMIXTURE, run in supervised mode with three contemporary reference populations (Norse, Gaelic, and Icelandic) (Fig. 3B). The correlation between the proportion of Icelandic ancestry from this analysis and PC1 in Fig. 2A is |r| = 0.913. (…)
(…) as the five ancient Icelanders fall well within the cluster of contemporary Scandinavians (Fig. 3C), we conclude that they, or close relatives, likely contributed more to the contemporary Icelandic gene pool than the other pre-Christians. We note that this observation is consistent with the inference that settlers of Norse ancestry had greater reproductive success than those of Gaelic ancestry.
Among R1a, the picture is uniformly of R1a-Z284 (at least five of the seven reported).
There are six samples of I1, with great variation in subclades.
Among R1b-L51 subclades (ten samples), there are U106 (at least one sample), L21 (three samples), and another P312 subclade (L238). See above for the relationship with those clustering closely with Gaelic samples (highlighted), which is compatible with Gaelic settlers (predominantly of R1b-L21 lineages) coming to Iceland as slaves.
Probably not much of a surprise, coming from Norse speakers, but they are another relevant reference for comparison with samples of East Germanic tribes, when they appear.
Also, the first reported Klinefelter (XXY) in ancient DNA (sample ID is YGS-B2).
Next generation sequencing (NGS) technologies offer immense possibilities given the large genomic data they simultaneously deliver. The human Y chromosome serves as a good example of how NGS benefits various applications in evolution, anthropology, genealogy and forensics. Prior to NGS, the Y-chromosome phylogenetic tree consisted of a few hundred branches; based on NGS data, it now contains many thousands. The complexity of both the Y tree and NGS data provides challenges for haplogroup assignment. For effective analysis and interpretation of Y-chromosome NGS data, we present Yleaf, a publicly available, automated, user-friendly software for high-resolution Y-chromosome haplogroup inference independently of library and sequencing methods.
In the time of NGS (or massively parallel sequencing, MPS), the amount of genomic data produced and made publicly available is rapidly expanding, providing valuable resources for many areas of research and applications. Due to its haploid nature and male-specific inheritance, the non-recombining part of the human Y-chromosome (NRY) is highly suitable for phylogenetic studies and for addressing questions in evolution, anthropology, population history, genealogy and forensics (Jobling & Tyler-Smith, 2017). Over recent years, NGS data allowed the phylogenetic NRY tree to dramatically increase in size and complexity (Hallast et al. 2014; Poznik et al. 2016). The two most comprehensive tree versions, ISOGG (http://www.isogg.org/tree) and YFull (https://www.yfull.com/tree), currently contain thousands of branches. However, the complexity of both the Y tree and NGS data provides immense challenges for NRY haplogroup assignment, which reflects a key element in many NRY applications. Here we introduce Yleaf, a Python-based, easy-to-use, publicly available software tool for effective NRY single nucleotide polymorphism (SNP) calling and subsequent NRY haplogroup inference from NGS data. By comparative whole genome data analysis, we demonstrate high concordance of Yleaf in NRY-SNP calling compared to well-established tools such as SAMtools/BCFtools (Li et al. 2009), and GATK (McKenna et al. 2010) as well as improved performance of Yleaf in NRY haplogroup assignment relative to previously developed tools such as clean_tree (Ralf et al. 2015), AMY-tree (Van Geystelen et al. 2015), and yHaplo (Poznik, 2016).
Yleaf allows analyzing NRY sequence data from many types of NGS libraries i.e., whole genomes, whole exomes, large genomic regions, and large numbers of targeted amplicons. Several modifications relative to our previously developed clean_tree tool (Ralf et al. 2015) were implemented to optimize the performance especially relevant for extremely large NGS datasets such as whole genomes. For instance, Yleaf extracts the Y-chromosomal reads prior to further processing and uses multi-threading; a batch option is included too. Importantly, Yleaf provides drastically increased haplogroup resolution i.e., from 530 positions defining 432 NRY haplogroups with clean_tree (Ralf et al. 2015) to over 41,000 positions defining 5353 haplogroups with Yleaf. For a detailed method description see the supplementary material.
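To illustrate the general idea behind tree-based haplogroup inference from SNP calls, here is a minimal sketch. It is not Yleaf's actual algorithm or interface: the tree is a hypothetical, heavily simplified slice of the R1b phylogeny (intermediate branches omitted), and the function names and the "D"/"A" call encoding are assumptions made for this example.

```python
# Hypothetical, simplified slice of the R1b tree: haplogroup -> parent.
PARENT = {
    "R1b-M343": None,
    "R1b-M269": "R1b-M343",
    "R1b-L51": "R1b-M269",
    "R1b-P312": "R1b-L51",
    "R1b-U106": "R1b-L51",
    "R1b-DF27": "R1b-P312",
    "R1b-L21": "R1b-P312",
}


def path_to_root(haplogroup):
    """Return the haplogroup followed by all of its ancestors up to the root."""
    path = []
    while haplogroup is not None:
        path.append(haplogroup)
        haplogroup = PARENT[haplogroup]
    return path


def assign_haplogroup(calls):
    """Pick the deepest derived haplogroup not contradicted on its path.

    calls maps haplogroup -> "D" (derived) or "A" (ancestral);
    haplogroups without coverage are simply absent from the dict.
    """
    best, best_depth = None, 0
    for haplogroup, state in calls.items():
        if state != "D":
            continue
        path = path_to_root(haplogroup)
        # Skip candidates with an ancestral call anywhere above them.
        if any(calls.get(node) == "A" for node in path):
            continue
        if len(path) > best_depth:
            best, best_depth = haplogroup, len(path)
    return best


# Made-up low-coverage sample: P312 derived, L21 ancestral, DF27 not covered.
sample_calls = {"R1b-M269": "D", "R1b-P312": "D", "R1b-L21": "A"}
print(assign_haplogroup(sample_calls))
```

The made-up sample mirrors the situation described in the Iberian excerpt: a male can be called negative for L21 while his DF27 status stays unknown because the defining SNP was not recovered, so the assignment stops at the parent branch.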
The extent of population structure within Ireland is largely unknown, as is the impact of historical migrations. Here we illustrate fine-scale genetic structure across Ireland that follows geographic boundaries and present evidence of admixture events into Ireland. Utilising the ‘Irish DNA Atlas’, a cohort (n = 194) of Irish individuals with four generations of ancestry linked to specific regions in Ireland, in combination with 2,039 individuals from the Peoples of the British Isles dataset, we show that the Irish population can be divided in 10 distinct geographically stratified genetic clusters; seven of ‘Gaelic’ Irish ancestry, and three of shared Irish-British ancestry. In addition we observe a major genetic barrier to the north of Ireland in Ulster. Using a reference of 6,760 European individuals and two ancient Irish genomes, we demonstrate high levels of North-West French-like and West Norwegian-like ancestry within Ireland. We show that our ‘Gaelic’ Irish clusters present homogenous levels of ancient Irish ancestries. We additionally detect admixture events that provide evidence of Norse-Viking gene flow into Ireland, and reflect the Ulster Plantations. Our work informs both on Irish history, as well as the study of Mendelian and complex disease genetics involving populations of Irish ancestry.
Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.
Here are some interesting excerpts (emphasis mine):
Population structure in Ireland
The geographical distribution of this deep subdivision of Leinster resembles pre-Norman territorial boundaries which divided Ireland into fifths (cúige), with north Leinster a kingdom of its own known as Meath (Mide). However interpreted, the firm implication of the observed clustering is that despite its previously reported homogeneity, the modern Irish population exhibits genetic structure that is subtly but detectably affected by ancestral population structure conferred by geographical distance and, possibly, ancestral social structure.
ChromoPainter PC1 demonstrated high diversity amongst clusters from the west coast, which may be attributed to longstanding residual ancient (possibly Celtic) structure in regions largely unaffected by historical migration. Alternatively, genetic clusters may also have diverged as a consequence of differential influence from outside populations. This diversity between western genetic clusters cannot be explained in terms of geographic distance alone.
In contrast to the west of Ireland, eastern individuals exhibited relative homogeneity; (…) The overall pattern of western diversity and eastern homogeneity in Ireland may be explained by increased gene flow and migration into and across the east coast of Ireland from geographically proximal regions, the closest of which is the neighbouring island of Britain.
Analysis of variance of the British admixture component in cluster groups showed a significant difference (p < 2×10⁻¹⁶), indicating a role for British Anglo-Saxon admixture in distinguishing clusters, and ChromoPainter PC2 was correlated with the British component (p < 2×10⁻¹⁶), explaining approximately 43% of the variance. PC2 therefore captures an east to west Anglo-Celtic cline in Irish ancestry. This may explain the relative eastern homogeneity observed in Ireland, which could be a result of the greater English influence in Leinster and the Pale during the period of British rule in Ireland following the Norman invasion, or simply geographic proximity of the Irish east coast to Britain. Notably, the Ulster cluster group harboured an exceptionally large proportion of the British component (Fig 1D and 1E), undoubtedly reflecting the strong influence of the Ulster Plantations in the 17th century and its residual effect on the ethnically British population that has remained.
On the genetic structure of the British Isles
The genetic substructure observed in Ireland is consistent with long term geographic diversification of Celtic populations and the continuity shown between modern and Early Bronze Age Irish people
Clusters representing Celtic populations harbouring less Anglo-Saxon influence separate out above and below SEE on PC4. Notably, northern Irish clusters (NLU), Scottish (NISC, SSC and NSC), Cumbria (CUM) and North Wales (NWA) all separate out at a mutually similar level, representing northern Celtic populations. The southern Celtic populations Cornwall (COR), south Wales (SWA) and south Munster (SMN) also separate out on similar levels, indicating some shared haplotypic variation between geographically proximate Celtic populations across both Islands. It is notable that after the split of the ancestrally divergent Orkney, successive ChromoPainter PCs describe diversity in British populations where “Anglo-saxonization” was repelled. PC3 is dominated by Welsh variation, while PC4 in turn splits North and South Wales significantly, placing south Wales adjacent to Cornwall and north Wales at the other extreme with Cumbria, all enclaves where Brittonic languages persisted.
In an interesting symmetry, many Northern Irish samples clustered strongly with southern Scottish and northern English samples, defining the Northern Irish/Cumbrian/Scottish (NICS) cluster group. More generally, by modelling Irish genomes as a linear mixture of haplotypes from British clusters, we found that Scottish and northern English samples donated more haplotypes to clusters in the north of Ireland than to the south, reflecting an overall correlation between Scottish/north English contribution and ChromoPainter PC1 position in Fig 1 (Linear regression: p < 2×10⁻¹⁶, r² = 0.24).
North to south variation in Ireland and Britain are therefore not independent, reflecting major gene flow between the north of Ireland and Scotland (Fig 5) which resonates with three layers of historical contacts. First, the presence of individuals with strong Irish affinity among the third generation PoBI Scottish sample can be plausibly attributed to major economic migration from Ireland in the 19th and 20th centuries. Second, the large proportion of Northern Irish who retain genomes indistinguishable from those sampled in Scotland accords with the major settlements (including the Ulster Plantation) of mainly Scottish farmers following the 16th Century Elizabethan conquest of Ireland which led to these forming the majority of the Ulster population. Third, the suspected Irish colonisation of Scotland through the Dál Riata maritime kingdom, which expanded across Ulster and the west coast of Scotland in the 6th and 7th centuries, linked to the introduction and spread of Gaelic languages. Such a migratory event could work to homogenise older layers of Scottish population structure, in a similar manner as noted on the east coasts of Britain and Ireland. Earlier communications and movements across the Irish Sea are also likely, which at its narrowest point separates Ireland from Scotland by approximately 20 km.
Genomic footprints of migration into Ireland
Quite interestingly, it is haplogroups, and not admixture, that define the oldest migration layers into Ireland. Without evidence from paternal Y-DNA lineages we would probably not be able to ascertain the oldest migrations and the languages brought by migrants, including Celtic languages:
Of all the European populations considered, ancestral influence in Irish genomes was best represented by modern Scandinavians and northern Europeans, with a significant single-date one-source admixture event overlapping the historical period of the Norse-Viking settlements in Ireland (p < 0.01; fit quality FQB > 0.985; Fig 6). (…) This suggests a contribution of historical Viking settlement to the contemporary Irish genome and contrasts with previous estimates of Viking ancestry in Ireland based on Y chromosome haplotypes, which have been very low. The modern-day paucity of Norse-Viking Y chromosome haplotypes may be a consequence of drift with the small patrilineal effective population size, or could have social origins with Norse males having less influence after their military defeat and demise as an identifiable community in the 11th century, with persistence of the autosomal signal through recombination.
European admixture date estimates in northwest Ulster did not overlap the Viking age but did include the Norman period and the Plantations
The genetic legacies of the populations of Ireland and Britain are therefore extensively intertwined and, unlike admixture from northern Europe, too complex to model with GLOBETROTTER.
Featured image, from the article in Scientific Reports: The clustering of individuals with Irish and British ancestry based solely on genetics. Shown are 30 clusters identified by fineStructure from 2,103 Irish and British individuals. The dendrogram (left) shows the tree of clusters inferred by fineStructure and the map (right) shows the geographic origin of 192 Atlas Irish individuals and 1,611 British individuals from the Peoples of the British Isles (PoBI) cohort, labelled according to fineStructure cluster membership. Individuals are placed at the average latitude and longitude of either their great-grandparental (Atlas) or grandparental (PoBI) birthplaces. Great Britain is separated into England, Scotland, and Wales. The island of Ireland is split into the four Provinces: Ulster, Connacht, Leinster, and Munster. The outline of Britain was sourced from Global Administrative Areas (2012). GADM database of Global Administrative Areas, version 2.0. www.gadm.org. The outline of Ireland was sourced from Open Street Map Ireland, Copyright OpenStreetMap Contributors, (https://www.openstreetmap.ie/) – data available under the Open Database Licence. The figure was plotted in the statistical software language R, version 3.4.1, with various packages.