The time and place of European admixture in Ashkenazi Jewish history

Open access The time and place of European admixture in Ashkenazi Jewish history, by Xue, Lencz, Darvasi, Pe’er, & Carmi, PLOS Genetics (2018).

Abstract (emphasis mine):

The Ashkenazi Jewish (AJ) population is important in genetics due to its high rate of Mendelian disorders. AJ appeared in Europe in the 10th century, and their ancestry is thought to comprise European (EU) and Middle-Eastern (ME) components. However, both the time and place of admixture are subject to debate. Here, we attempt to characterize the AJ admixture history using a careful application of new and existing methods on a large AJ sample. Our main approach was based on local ancestry inference, in which we first classified each AJ genomic segment as EU or ME, and then compared allele frequencies along the EU segments to those of different EU populations. The contribution of each EU source was also estimated using GLOBETROTTER and haplotype sharing. The time of admixture was inferred based on multiple statistics, including ME segment lengths, the total EU ancestry per chromosome, and the correlation of ancestries along the chromosome. The major source of EU ancestry in AJ was found to be Southern Europe (≈60–80% of EU ancestry), with the rest being likely Eastern European. The inferred admixture time was ≈30 generations ago, but multiple lines of evidence suggest that it represents an average over two or more events, pre- and post-dating the founder event experienced by AJ in late medieval times. The time of the pre-bottleneck admixture event, which was likely Southern European, was estimated to ≈25–50 generations ago.
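To get a feel for how segment lengths can carry dating information, here is a minimal sketch of the textbook single-pulse approximation. It is not the authors' actual inference procedure. It assumes that, after one admixture pulse t generations ago, tracts of one ancestry end at a rate of roughly t times the fraction of the other ancestry per Morgan, so their mean length is about 1 / (t × other fraction). The function names and the example numbers are hypothetical, chosen only for illustration.

```python
def expected_mean_tract_morgans(generations, other_ancestry_fraction):
    """Mean tract length (in Morgans) under a simple single-pulse model.

    Assumption: tracts end at rate generations * other_ancestry_fraction
    per Morgan, so lengths are roughly exponential with that rate.
    """
    if generations <= 0 or not 0 < other_ancestry_fraction <= 1:
        raise ValueError("need generations > 0 and 0 < fraction <= 1")
    return 1.0 / (generations * other_ancestry_fraction)


def pulse_time_from_mean_tract(mean_tract_morgans, other_ancestry_fraction):
    """Invert the relation above to get an admixture time in generations."""
    if mean_tract_morgans <= 0 or not 0 < other_ancestry_fraction <= 1:
        raise ValueError("need mean length > 0 and 0 < fraction <= 1")
    return 1.0 / (mean_tract_morgans * other_ancestry_fraction)


# Illustrative numbers only (not taken from the paper):
# a pulse 30 generations ago with a 50% contribution from the other source.
mean_len = expected_mean_tract_morgans(30, 0.5)
recovered = pulse_time_from_mean_tract(mean_len, 0.5)
print(f"mean tract ~ {mean_len * 100:.2f} cM, recovered time ~ {recovered:.0f} generations")
```

If admixture really happened in two or more events, as the abstract suggests, a single-pulse estimate like this one would land somewhere between the true dates.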

ashkenazi-pca
Principal Component Analysis (PCA) of the European and Middle-Eastern samples used as reference panels in our study. The analysis was performed using SmartPCA [25] with default parameters (except no outlier removal). The populations included within each region are listed in Table 1 of the main text. The PCA plot supports the partitioning of the European and Middle-Eastern populations into the broad regional groups used in the analysis. https://doi.org/10.1371/journal.pgen.1006644.s001

Interesting excerpts:

(…) AJ genetics defies simple demographic theories. Hypotheses such as a wholly Khazar, Turkish, or Middle-Eastern origin have been disqualified [4–7, 17, 55], but even a model of a single Middle-Eastern and European admixture event cannot account for all of our observations. The actual admixture history might have been highly complex, including multiple geographic sources and admixture events. Moreover, due to the genetic similarity and complex history of the European populations involved (particularly in Southern Europe [51]), the multiple paths of AJ migration across Europe [10], and the strong genetic drift experienced by AJ in the late Middle Ages [9, 16], there seems to be a limit on the resolution to which the AJ admixture history can be reconstructed.

ashkenazi
A proposed model for the recent AJ history. The proposed intervals for the dates and admixture proportions are based on multiple methods, as described in the main text. https://doi.org/10.1371/journal.pgen.1006644.g007

Historical model and interpretation

Under our model, admixture in Europe first happened in Southern Europe, and was followed by a founder event and a minor admixture event (likely) in Eastern Europe. Admixture in Southern Europe possibly occurred in Italy, given the continued presence of Jews there and the proposed Italian source of the early Rhineland Ashkenazi communities [3]. What is perhaps surprising is the timing of the Southern European admixture to ≈24–49 generations ago, since Jews are known to have resided in Italy already since antiquity. This result would imply no gene flow between Jews and local Italian populations almost until the turn of the millennium, either due to endogamy, or because the group that eventually gave rise to contemporary Ashkenazi Jews did not reside in Southern Europe until that time. More detailed and/or alternative interpretations are left for future studies.

Recent admixture in Northern Europe (Western or Eastern) is consistent with the presence of Ashkenazi Jews in the Rhineland since the 10th century and in Poland since the 13th century. Evidence from the IBD analysis suggests that Eastern European admixture is more likely; however, the results are not decisive. An open question in AJ history is the source of migration to Poland in late Medieval times; various speculations have been proposed, including Western and Central Europe [2, 10]. The uncertainty on whether gene flow from Western Europeans did or did not occur leaves this question open.
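To see why ≈24–49 generations points to a date "almost until the turn of the millennium", the generation counts can be converted into approximate calendar years. This is a rough sketch: the generation interval of 25–30 years and the reference year 2000 are my assumptions, not values taken from the paper, and the function name is hypothetical.

```python
def admixture_year_range(gen_min, gen_max,
                         years_per_gen_min=25, years_per_gen_max=30,
                         reference_year=2000):
    """Return (earliest, latest) calendar years CE for an admixture event.

    gen_min/gen_max: bounds on the admixture time in generations ago.
    years_per_gen_*: assumed bounds on the generation interval.
    reference_year: assumed year from which generations are counted back.
    """
    if gen_min > gen_max or years_per_gen_min > years_per_gen_max:
        raise ValueError("lower bounds must not exceed upper bounds")
    latest = reference_year - gen_min * years_per_gen_min
    earliest = reference_year - gen_max * years_per_gen_max
    return earliest, latest


earliest, latest = admixture_year_range(24, 49)
print(f"Southern European admixture: roughly {earliest}-{latest} CE")
```

Under these assumptions the interval is wide, which is one more reason to read the proposed dates as approximate.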

ashkenazi-f4-statistics
The effect of gene flow from the Middle-East into Southern EU on f4 statistics. Panels (A) and (B) demonstrate f4(West-EU,YRI;AJ,ME) and f4(South-EU,YRI;AJ,ME), respectively (cf S4A Fig). Paths from the Middle-East into AJ are indicated with red arrows; paths from YRI to Western or Southern Europe with green arrows. The f4 statistic is proportional to the total overlap between these paths (black bars). Whereas panel (B) (f4(South-EU,YRI;AJ,ME)) has more overlapping branches than in (A), migration from the Middle-East into Southern EU introduces a branch where the arrows run in opposite directions (patterned bar). Hence, the observed f4 statistic in (B) may be lower (depending on branch lengths) than in (A), even if Southern EU is the true source of gene flow into AJ. https://doi.org/10.1371/journal.pgen.1006644.s005
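For readers unfamiliar with f4 statistics: as usually defined, f4(A,B;C,D) is the average over SNPs of the product of allele frequency differences, (a − b)(c − d). The sketch below computes that plain average from toy allele frequencies that I made up. It is only an illustration of the definition, not the pipeline used in the paper, and it omits the standard errors that real analyses report.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Average over SNPs of (a - b) * (c - d), given allele frequencies in [0, 1]."""
    n = len(freq_a)
    if n == 0 or not (n == len(freq_b) == len(freq_c) == len(freq_d)):
        raise ValueError("need four non-empty frequency lists of equal length")
    total = sum((a - b) * (c - d)
                for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d))
    return total / n


# Toy frequencies at two SNPs (hypothetical values, for illustration only).
pop_a = [0.5, 0.2]
pop_b = [0.1, 0.2]
pop_c = [0.6, 0.3]
pop_d = [0.2, 0.9]

value = f4(pop_a, pop_b, pop_c, pop_d)
swapped = f4(pop_b, pop_a, pop_c, pop_d)  # swapping A and B flips the sign
print(value, swapped)
```

The sign flip when two populations are swapped corresponds to the caption's point about arrows running in opposite directions: such a branch subtracts from the statistic instead of adding to it.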

Featured image: Expulsions of Jews, from Wikipedia.

A history of male migration in and out of the Green Sahara

Open access research highlight A history of male migration in and out of the Green Sahara, by Yali Xue, Genome Biology (2018) 19:30, on the recent paper by D’Atanasio et al.

Insights from the Green Saharan Y-chromosomal findings (emphasis mine):

It is widely accepted that sub-Saharan Y chromosomes are dominated by E-M2 lineages carried by Bantu-speaking farmers as they expanded from West Africa starting < 5 kya, reaching South Africa within recent centuries [4]. The E-M2-Bantu lineages lie phylogenetically within the E-M2-Green Sahara lineage and show at least three explosive lineage expansions beginning 4.9–5.3 kya [5] (Fig. 1a). These events of E-M2-Bantu expansion are slightly later than the R-V88 expansion, and highlight the range of male demographic changes in the mid-Holocene. North of the Sahara, in addition to the four trans-Saharan haplogroups, haplogroup E-M81 (which diverged from E-M78 ~ 13 kya) became very common in present-day populations as a result of another massive expansion ~ 2 kya [6] (Fig. 1a).

african-sahara-y-dna
Simplified Y-chromosomal phylogeny and inferred past or observed present-day distribution of relevant Y-chromosomal lineages. a Calibrated phylogenetic tree of Y-chromosomal lineages discussed in the text. Green shading represents the period when the present-day Sahara Desert was green and fertile. Lineages represented by filled pentagons have undergone very rapid expansions. b [featured image] The Green Sahara period 5–12 kya. Green shading indicates that the present-day Sahara Desert was green and fertile. The colors within the large oval represent the four Y-chromosomal haplogroups deduced to be present in the region at this time; specific locations are not implied. The arrows indicate the inferred origins of these haplogroups to the north or south, but specific origins and routes are not implied. c The present-day distributions of the four Green Saharan Y-chromosomal haplogroups. Yellow shading indicates the Sahara Desert. Each circle represents a sampled population, with the presence or absence of the four Green Saharan haplogroups shown by the colored sectors; other haplogroups may also be present in these populations, but are not shown. The small arrows indicate the inferred northwards and southwards movements of these haplogroups when the Sahara became uninhabitable.

Although Y chromosomes exist within populations and so share and reflect the general history of those populations, they can sometimes show some departures from other parts of the genome that result from differences in male and female behaviors. D’Atanasio et al. [1] highlight one such contrast in their study. Present-day North African populations show substantial sub-Saharan autosomal and mtDNA genetic components ascribed to the Roman and Arab slave trades 1–2 kya [7], but carry few sub-Saharan Y lineages from this source, probably reflecting the smaller numbers of male slaves and their reduced reproductive opportunities when compared to those of female slaves. The sub-Saharan Y chromosomes in these North African populations thus originate predominantly from the earlier Green Sahara period.

In this part of Africa, the indigenous languages that are spoken belong to three of the four African linguistic families (Afro-Asiatic, Nilo-Saharan and Niger-Congo). Interestingly, these languages show non-random associations with Y lineages. For example, Chadic languages within the Afro-Asiatic family are associated with haplogroup R-V88, whereas Nilo-Saharan languages are associated with specific sublineages within A3-M13 and E-M78, further illustrating the complex human history of the region.

The main question after D’Atanasio et al. (2018) is thus:

(…) what are the reasons for the very rapid R-V88 expansion 5–6 kya [1] and E-M81 expansion ~ 2 kya [6], and how do these expansions fit within general worldwide patterns of male-specific expansions, which in other cases have been linked to cultural and technological changes [5]?

I think that the only known haplogroup expansion that might fit the spread and dialectalization of Afroasiatic, a proto-language probably contemporaneous with or slightly older than Middle Proto-Indo-European, is that of R1b-V88 lineages. However, without ancient DNA samples to corroborate this, we cannot be sure.

See also:

Pleistocene North African genomes link Near Eastern and sub-Saharan African human populations

Pleistocene North African genomes link Near Eastern and sub-Saharan African human populations, by van de Loosdrecht et al. Science (2018).

Abstract

North Africa is a key region for understanding human history, but the genetic history of its people is largely unknown. We present genomic data from seven 15,000-year-old modern humans from Morocco, attributed to the Iberomaurusian culture. We find a genetic affinity with early Holocene Near Easterners, best represented by Levantine Natufians, suggesting a pre-agricultural connection between Africa and the Near East. We do not find evidence for gene flow from Paleolithic Europeans into Late Pleistocene North Africans. The Taforalt individuals derive one third of their ancestry from sub-Saharan Africans, best approximated by a mixture of genetic components preserved in present-day West and East Africans. Thus, we provide direct evidence for genetic interactions between modern humans across Africa and Eurasia in the Pleistocene.

Excerpts:

We analyzed the genetic affinities of the Taforalt individuals by performing principal component analysis (PCA) and model-based clustering of worldwide data (Fig. 2). When projected onto the top PCs of African and West Eurasian populations, the Taforalt individuals form a distinct cluster in an intermediate position between present-day North Africans (e.g., Amazighes (Berbers), Mozabite and Saharawi) and East Africans (e.g., Afar, Oromo and Somali) (Fig. 2A). Consistently, we find that all males with sufficient nuclear DNA preservation carry Y haplogroup E1b1b1a1 (M-78; table S16). This haplogroup occurs most frequently in present-day North and East African populations (18). The closely related E1b1b1b (M-123) haplogroup has been reported for Epipaleolithic Natufians and Pre-Pottery Neolithic Levantines (“Levant_N”) (16). Unsupervised genetic clustering also suggests a connection of Taforalt to the Near East. The three major components that comprise the Taforalt genomes are maximized in early Holocene Levantines, East African hunter-gatherer Hadza from north-central Tanzania, and West Africans (K = 10; Fig. 2B). In contrast, present-day North Africans have smaller sub-Saharan African components with minimal Hadza-related contribution (Fig. 2B).

(…) Taforalt harboring an ancestry that contains additional affinity with South, East and Central African outgroups. None of the present-day or ancient Holocene African groups serve as a good proxy for this unknown ancestry, because adding them as the third source is still insufficient to match the model to the Taforalt gene pool.

Mitochondrial consensus sequences of the Taforalt individuals belong to the U6a (n = 6) and M1b (n = 1) haplogroups (15), which are mostly confined to present-day populations in North and East Africa (7). U6 and M1 have been proposed as markers for autochthonous Maghreb ancestry, which might have been originally introduced into this region by a back-to-Africa migration from West Asia (6, 7). The occurrence of both haplogroups in the Taforalt individuals proves their pre-Holocene presence in the Maghreb.
(…) the diversification of haplogroup U6a and M1 found for Taforalt is dated to ~24,000 yBP (fig. S23), which is close in time to the earliest known appearance of the Iberomaurusian in Northwest Africa (25,845-25,270 cal. yBP at Tamar Hat (26)).

taforalt-admixture
A summary of the genetic profile of the Taforalt individuals. (A) The top two PCs calculated from present-day African, Near Eastern and South European individuals from 72 populations. The Taforalt individuals are projected thereon (red-colored circles). Selected present-day populations are marked by colored symbols. Labels for other populations (marked by small grey circles) are provided in fig. S8. (B) ADMIXTURE results of chosen African and Middle Eastern populations (K = 10). Ancient individuals are labeled in red color. Major ancestry components in Taforalt are maximized in early Holocene Levantines (green), West Africans (purple) and East African Hadza (brown). The ancestry component prevalent in pre-Neolithic Europeans (beige) is absent in Taforalt.

The relationships of the Iberomaurusian culture with the preceding MSA, including the local backed bladelet technologies in Northeast Africa, and the Epigravettian in southern Europe have been questioned (13). The genetic profile of Taforalt suggests substantial Natufian-related and sub-Saharan African-related ancestries (63.5% and 36.5%, respectively), but not additional ancestry from Epigravettian or other Upper Paleolithic European populations. Therefore, we provide genomic evidence for a Late Pleistocene connection between North Africa and the Near East, predating the Neolithic transition by at least four millennia, while rejecting a potential Epigravettian gene flow from southern Europe into northern Africa within the resolution of our data.
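As a way of picturing what a two-source estimate such as 63.5% / 36.5% means, one can model the target's allele frequencies as a weighted average of two source populations and solve for the weight by least squares. The sketch below does that on invented toy frequencies. It is a deliberately simplified stand-in, not the method used by the authors, and the function name and data are hypothetical.

```python
def estimate_mixture_fraction(target, source1, source2):
    """Least-squares weight w so that target ~ w * source1 + (1 - w) * source2."""
    n = len(target)
    if n == 0 or not (n == len(source1) == len(source2)):
        raise ValueError("need three non-empty frequency lists of equal length")
    numerator = sum((t - s2) * (s1 - s2)
                    for t, s1, s2 in zip(target, source1, source2))
    denominator = sum((s1 - s2) ** 2 for s1, s2 in zip(source1, source2))
    if denominator == 0:
        raise ValueError("sources are identical; the weight is not identifiable")
    return numerator / denominator


# Invented allele frequencies at four SNPs for the two source populations.
source_one = [0.10, 0.80, 0.50, 0.30]
source_two = [0.60, 0.20, 0.90, 0.40]

# Build a target that is an exact 63.5% / 36.5% mixture, then recover the weight.
true_weight = 0.635
toy_target = [true_weight * a + (1 - true_weight) * b
              for a, b in zip(source_one, source_two)]

weight = estimate_mixture_fraction(toy_target, source_one, source_two)
print(f"estimated contribution of source one: {weight:.3f}")
```

With real data the fit is never exact, and a poor fit is itself informative: the excerpt above notes that adding present-day or ancient African groups as a third source still fails to match the Taforalt gene pool.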

It seems that the Taforalt gene pool (ca. 13000-12000 BC) cannot be explained by a connection with Upper Palaeolithic Europeans, but rather by a more archaic admixture, so the authors cannot demonstrate a migration through the Strait of Gibraltar or Sicily.

Nevertheless, these results apparently suggest:

  • That there was no contact through the Strait of Gibraltar before ca. 12000 BC; therefore, the Sicilian route I support for the migration of R1b-V88 lineages is still the most likely one.
  • That the North African connection with Natufians is quite old (as studies of modern Y-DNA had already indicated), and is therefore unlikely to be related to the Afroasiatic expansion.

I am glad I had some more time this week to read at least some interesting parts of the published papers, because the information to process is becoming insanely huge…

Related:

Genetic ancestry of Hadza and Sandawe peoples reveals ancient population structure in Africa

Open access paper Genetic Ancestry of Hadza and Sandawe Peoples Reveals Ancient Population Structure in Africa, by Shriner, Tekola-Ayele, Adeyemo, & Rotimi, GBE (2018).

Abstract (emphasis mine):

The Hadza and Sandawe populations in present-day Tanzania speak languages containing click sounds and therefore thought to be distantly related to southern African Khoisan languages. We analyzed genome-wide genotype data for individuals sampled from the Hadza and Sandawe populations in the context of a global data set of 3,528 individuals from 163 ethno-linguistic groups. We found that Hadza and Sandawe individuals share ancestry distinct from and most closely related to Omotic ancestry; share Khoisan ancestry with populations such as ≠Khomani, Karretjie, and Ju/’hoansi in southern Africa; share Niger-Congo ancestry with populations such as Yoruba from Nigeria and Luhya from Kenya, consistent with migration associated with the Bantu Expansion; and share Cushitic ancestry with Somali, multiple Ethiopian populations, the Maasai population in Kenya, and the Nama population in Namibia. We detected evidence for low levels of Arabian, Nilo-Saharan, and Pygmy ancestries in a minority of individuals. Our results indicate that west Eurasian ancestry in eastern Africa is more precisely the Arabian parent of Cushitic ancestry. Relative to the Out-of-Africa migrations, Hadza ancestry emerged early whereas Sandawe ancestry emerged late.

Excerpts:

Introduction
In the Hadza population, the distribution of Y chromosomes includes mostly B2 haplogroups, with a smaller number of E1b1a haplogroups, which are common in Niger-Congo-speaking populations, and E1b1b haplogroups, which are common in Cushitic populations (Tishkoff, et al. 2007). In the Sandawe population, E1b1a and E1b1b haplogroups are more common, with lower frequencies of B2 and A3b2 haplogroups (Tishkoff, et al. 2007).

Conclusion
We found that Hadza ancestry diverged early, rather than late. We found evidence for contributions of Cushitic and Niger-Congo ancestries in Tanzania, consistent with the movements of herding and cultivating Cushitic speakers ~4,000 years ago and agricultural Niger-Congo speakers ~2,500 years ago (Newman 1995). However, we did not find evidence of a substantial contribution of Nilo-Saharan ancestry that might have resulted from movement of pastoralist Nilo-Saharan speakers (Newman 1995). We also identified west Eurasian ancestry in eastern and southern African populations more precisely as the Arabian parent of Cushitic ancestry. Finally, our ancestry analyses support the hypothesis that Omotic, Hadza, and Sandawe languages group together, rather than Omotic languages belonging to the Afroasiatic family and Hadza and Sandawe languages belonging to the Khoisan family.

I don’t like linguistic assumptions drawn from admixture analysis, especially from scarce modern samples, as in this case.

Nevertheless, these papers may help clarify the different nature of Omotic and Cushitic among Afroasiatic languages, and thus leave the origin of Afroasiatic either:

a) To the east, with the traditionalist Afroasiatic – Semitic/Hamitic homeland association.

afroasiatic-homeland
Expansion of Afroasiatic

b) To the west, near modern Chadic languages (associated with the expansion of R1b-V88 subclades through a Green Sahara), as I suggested.

Related:

The demographic history and mutational load of African hunter-gatherers and farmers

Interesting new article (behind paywall), The demographic history and mutational load of African hunter-gatherers and farmers, Nat Ecol Evol (2018)

Abstract (emphasis mine):

Understanding how deleterious genetic variation is distributed across human populations is of key importance in evolutionary biology and medical genetics. However, the impact of population size changes and gene flow on the corresponding mutational load remains a controversial topic. Here, we report high-coverage exomes from 300 rainforest hunter-gatherers and farmers of central Africa, whose distinct subsistence strategies are expected to have impacted their demographic pasts. Detailed demographic inference indicates that hunter-gatherers and farmers recently experienced population collapses and expansions, respectively, accompanied by increased gene flow. We show that the distribution of deleterious alleles across these populations is compatible with a similar efficacy of selection to remove deleterious variants with additive effects, and predict with simulations that their present-day additive mutation load is almost identical. For recessive mutations, although an increased load is predicted for hunter-gatherers, this increase has probably been partially counteracted by strong gene flow from expanding farmers. Collectively, our predicted and empirical observations suggest that the impact of the recent population decline of African hunter-gatherers on their mutation load has been modest and more restrained than would be expected under a fully recessive model of dominance.

african-bantu-hunter-gatherer-demographic
“Inferred demographic models of the studied populations. a, EUR-first branching model, in which ancestors of EUR (aEUR) diverged from African populations before the divergence of the ancestors of RHG (aRHG) and AGR (aAGR). b, RHG-first branching model, in which aRHG were the first to diverge from the other groups. c, AGR-first branching model, in which aAGR were the first to diverge from the other groups. We assumed an ancient change in the size of the ancestral population of all humans (ANC). We assumed that each subsequent divergence of populations was followed by an instantaneous change in the effective population size (Ne). We also assumed that there were two epochs of migration between the following population pairs: wAGR/aAGR and wRHG/aRHG, eAGR/aAGR and eRHG/aRHG, and EUR and eAGR/aAGR. The figure labels correspond to the parameters of the model estimated by maximum likelihood and the 95% confidence intervals assessed by bootstrapping by site 100 times (Supplementary Table 4). Vertical arrow corresponds to the direction of time, from past to present, with divergence times given on the left and expressed in thousand years ago (ka). Effective population sizes (N) are given within the diagram and expressed in thousands of individuals. Bold horizontal arrows indicate an estimated parameter for the effective strength of migration 2Nm > 1, while thin horizontal arrows indicate 2Nm ≤ 1.”

See also:

Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula

Open access preprint (which I announced already) at bioRxiv Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula, by Bycroft et al. (2018).

Abstract (emphasis mine):

Genetic differences within or between human populations (population structure) has been studied using a variety of approaches over many years. Recently there has been an increasing focus on studying genetic differentiation at fine geographic scales, such as within countries. Identifying such structure allows the study of recent population history, and identifies the potential for confounding in association studies, particularly when testing rare, often recently arisen variants. The Iberian Peninsula is linguistically diverse, has a complex demographic history, and is unique among European regions in having a centuries-long period of Muslim rule. Previous genetic studies of Spain have examined either a small fraction of the genome or only a few Spanish regions. Thus, the overall pattern of fine-scale population structure within Spain remains uncharacterised. Here we analyse genome-wide genotyping array data for 1,413 Spanish individuals sampled from all regions of Spain. We identify extensive fine-scale structure, down to unprecedented scales, smaller than 10 Km in some places. We observe a major axis of genetic differentiation that runs from east to west of the peninsula. In contrast, we observe remarkable genetic similarity in the north-south direction, and evidence of historical north-south population movement. Finally, without making particular prior assumptions about source populations, we show that modern Spanish people have regionally varying fractions of ancestry from a group most similar to modern north Moroccans. The north African ancestry results from an admixture event, which we date to 860 – 1120 CE, corresponding to the early half of Muslim rule. Our results indicate that it is possible to discern clear genetic impacts of the Muslim conquest and population movements associated with the subsequent Reconquista.

iberia-reconquista-dna
“(a) Binary tree showing the inferred hierarchical relationships between clusters. The colours and points correspond to each cluster as shown on the map, and the length of the coloured rectangles is proportional to the number of individuals assigned to that cluster. We combined some small clusters (Methods) and the thick black branches indicate the clades of the tree that we visualise in the map. We have labeled clusters according to the approximate location of most of their members, but geographic data was not used in the inference. (b) Each individual is represented by a point placed at (or close to) the centroid of their grandparents’ birthplaces. On this map we only show the individuals for whom all four grandparents were born within 80km of their average birthplace, although the data for all individuals were used in the fineSTRUCTURE inference. The background is coloured according to the spatial densities of each cluster at the level of the tree where there are 14 clusters (see Methods). The colour and symbol of each point corresponds to the cluster the individual was assigned to at a lower level of the tree, as shown in (a). The labels and boundaries of Spain’s Autonomous Communities are also shown.”

Some interesting excerpts:

Our results further imply that north west African-like DNA predominated in the migration. Moreover, admixture mainly, and perhaps almost exclusively, occurred within the earlier half of the period of Muslim rule. Within Spain, north African ancestry occurs in all groups, although levels are low in the Basque region and in a region corresponding closely to the 14th-century ‘Crown of Aragon’. Therefore, although genetically distinct this implies that the Basques have not been completely isolated from the rest of Spain over the past 1300 years.

NOTE. I must add here that the Expulsion of the Moriscos is known to have been quite successful in the old Crown of Aragon, deeply affecting its economy, in contrast with the territories of the Crown of Castile, where they either formed less sizeable communities, or were dispersed and eventually Christianized and integrated into local communities. For example, thousands of Moriscos from Granada were dispersed into different regions of the Crown of Castile following the War of the Alpujarras (1568–1571), and many could not later be expelled because locals resisted enforcing the expulsion edict.

Perhaps surprisingly, north African ancestry does not reflect proximity to north Africa, or even regions under more extended Muslim control. The highest amounts of north African ancestry found within Iberia are in the west (11%) including in Galicia, despite the fact that the region of Galicia as it is defined today (north of the Miño river), was never under Muslim rule and Berber settlements north of the Douro river were abandoned by. This observation is consistent with previous work using Y-chromosome data. We speculate that the pattern we see is driven by later internal migratory flows, such as between Portugal and Galicia, and this would also explain why Galicia and Portugal show indistinguishable ancestry sharing with non-Spanish groups more generally. Alternatively, it might be that these patterns reflect regional differences in patterns of settlement and integration with local peoples of north African immigrants themselves, or varying extents of the large-scale expulsion of Muslim people, which occurred post-Reconquista and especially in towns and cities.

iberia-north-african-morocco
We estimated ancestry profiles for each point on a fine spatial grid across Spain (Methods). Gray crosses show the locations of sampled individuals used in the estimation. Map shows the fraction contributed from the donor group ‘NorthMorocco’.

Overall, the pattern of genetic differentiation we observe in Spain reflects the linguistic and geopolitical boundaries present around the end of the time of Muslim rule in Spain, suggesting this period has had a significant and long-term impact on the genetic structure observed in modern Spain, over 500 years later. In the case of the UK, similar geopolitical correspondence was seen, but to a different period in the past (around 600 CE). Noticeably, in these two cases, country-specific historical events rather than geographic barriers seem to drive overall patterns of population structure. The observation that fine-scale structure evolves at different rates in different places could be explained if observed patterns tend to reflect those at the ends of periods of significant past upheaval, such as the end of Muslim rule in Spain, and the end of the Anglo-Saxon and Danish Viking invasions in the UK.

Certain people want to believe (well into the 21st century) in ideal ancestral populations and ancient ethnolinguistic identifications linked to their own, or their own country’s dominant, ancestral components and Y-DNA haplogroup.

We are nevertheless seeing how it is mainly the most recent relevant geopolitical events and late internal migratory flows that have shaped the genetic structure (including Y-DNA haplogroup composition) of modern regions and countries, regardless of their populations’ actual language or ethnic identification, whether (pre)historical or modern.

Another surprise for many, I guess.

Related:

Population replacement in Early Neolithic Britain, and new Bell Beaker SNPs

New (copyrighted) preprint at bioRxiv, Population Replacement in Early Neolithic Britain, by Brace et al. (2018).

Abstract (emphasis mine):

The roles of migration, admixture and acculturation in the European transition to farming have been debated for over 100 years. Genome-wide ancient DNA studies indicate predominantly Anatolian ancestry for continental Neolithic farmers, but also variable admixture with local Mesolithic hunter-gatherers. Neolithic cultures first appear in Britain c. 6000 years ago (kBP), a millennium after they appear in adjacent areas of northwestern continental Europe. However, the pattern and process of the British Neolithic transition remains unclear. We assembled genome-wide data from six Mesolithic and 67 Neolithic individuals found in Britain, dating from 10.5-4.5 kBP, a dataset that includes 22 newly reported individuals and the first genomic data from British Mesolithic hunter-gatherers. Our analyses reveals persistent genetic affinities between Mesolithic British and Western European hunter-gatherers over a period spanning Britain’s separation from continental Europe. We find overwhelming support for agriculture being introduced by incoming continental farmers, with small and geographically structured levels of additional hunter-gatherer introgression. We find genetic affinity between British and Iberian Neolithic populations indicating that British Neolithic people derived much of their ancestry from Anatolian farmers who originally followed the Mediterranean route of dispersal and likely entered Britain from northwestern mainland Europe.

Also, Genetiker has updated Y-SNP calls with new data published by the Harvard group.

The R1b lineages that expanded from (Yamna->) East Bell Beakers -> Western Europe are more and more clearly of R1b-L151 subclades, as expected.

Quite interesting are the early samples from Poland, of R1b1a1a2a2-Z2103 and R1b1a1a2a1a-L151 lineages, which may point (unlike the more homogeneous L151 distribution in Western Europe) to a mix of both original (east-west) Yamna groups. This could tentatively be used to explain the Graeco-Aryan influence that some linguists see in Balto-Slavic (or its superstrate).

That link would then be quite early, to account for an influence during the Yamna settlements in Hungary, before its expansion as East Bell Beakers, but we haven’t seen a clearly differentiated subgroup (yet) in Archaeology, Anthropology, or Genomics within the Hungarian Yamna/East Bell Beaker community, so I am not convinced. It could be just that different scattered subclades mixed with the general L151 population pop up (following old Yamna lineages, or having being added along the way), as expected in an expansion over such a great territory – as if some scattered samples of R1a, I1, I2, J, etc. were found.

We need more early samples from south-eastern Europe and the steppe during the Chalcolithic to ascertain the composition and migration paths of the different Yamna settlers.

Other interesting findings are the early (Proto-)Bell Beaker samples of haplogroup R1b with no steppe ancestry from Spain (which some autochthonous continuists wanted to believe was a proof of some kind), which are actually R1b-V88, a haplogroup known to have expanded throughout Europe quite early. In fact, this subclade has been recently shown to have most likely expanded through the Green Sahara region, and is potentially linked to the expansion of Afro-Asiatic.

See also:

R1b-V88 migration through Southern Italy into Green Sahara corridor, and the Afroasiatic connection

Open access article The peopling of the last Green Sahara revealed by high-coverage resequencing of trans-Saharan patrilineages, by D’Atanasio, Trombetta, Bonito, et al., Genome Biology (2018) 19:20.

Abstract:

Background
Little is known about the peopling of the Sahara during the Holocene climatic optimum, when the desert was replaced by a fertile environment.

Results
In order to investigate the role of the last Green Sahara in the peopling of Africa, we deep-sequence the whole non-repetitive portion of the Y chromosome in 104 males selected as representative of haplogroups which are currently found to the north and to the south of the Sahara. We identify 5,966 mutations, from which we extract 142 informative markers then genotyped in about 8,000 subjects from 145 African, Eurasian and African American populations. We find that the coalescence age of the trans-Saharan haplogroups dates back to the last Green Sahara, while most northern African or sub-Saharan clades expanded locally in the subsequent arid phase.

Conclusions
Our findings suggest that the Green Sahara promoted human movements and demographic expansions, possibly linked to the adoption of pastoralism. Comparing our results with previously reported genome-wide data, we also find evidence for a sex-biased sub-Saharan contribution to northern Africans, suggesting that historical events such as the trans-Saharan slave trade mainly contributed to the mtDNA and autosomal gene pool, whereas the northern African paternal gene pool was mainly shaped by more ancient events.

y-dna-r1b-v88-e-m78
Maximum parsimony Y chromosome tree and dating of the four trans-Saharan haplogroups. a Phylogenetic relations among the 150 samples analysed here. Each haplogroup is labelled in a different colour. The four Y sequences from ancient samples are marked by the dagger symbol. b Phylogenetic tree of the four trans-Saharan haplogroups, aligned to the timeline (at the bottom). At the tip of each lineage, the ethno-geographic affiliation of the corresponding sample is represented by a circle, coloured according to the legend (bottom left). The last Green Sahara period is highlighted by a green belt in the background

Also, interesting excerpts:

The fertile environment established in the Green Sahara probably promoted demographic expansions and rapid dispersals of the human groups, as suggested by the great homogeneity in the material culture of the early Holocene Saharan populations [62]. Our data for all the four trans-Saharan haplogroups are consistent with this scenario, since we found several multifurcated topologies, which can be considered as phylogenetic footprints of demographic expansions. The multifurcated structure of the E-M2 is suggestive of a first demographic expansion, which occurred about 10.5 kya, at the beginning of the last Green Sahara (Fig. 2; Additional file 2: Figure S4). After this initial expansion, we found that most of the trans-Saharan lineages within A3-M13, E-M2 and R-V88 radiated in a narrow time interval at 8–7 kya, suggestive of population expansions that may have occurred in the same time (Fig. 2; Additional file 2: Figures S3, S4 and S6). Interestingly, during roughly the same period, the Saharan populations adopted pastoralism, probably as an adaptive strategy against a short arid period [1, 62, 63]. So, the exploitation of pastoralism resources and the reestablishment of wetter conditions could have triggered the simultaneous population expansions observed here. R-V88 also shows signals of a further and more recent (~ 5.5 kya) Saharan demographic expansion which involved the R-V1589 internal clade. We observed similar demographic patterns in all the other haplogroups in about the same period and in different geographic areas (A3-M13/V3, E-M2/V3862 and E-M78/V32 in the Horn of Africa, E-M2/M191 in the central Sahel/central Africa), in line with the hypothesis that the start of the desertification may have caused massive economic, demographic and social changes [1].

Finally, the onset of the arid conditions at the end of the last African humid period was more abrupt in the eastern Sahara compared to the central Sahara, where an extensive hydrogeological network buffered the climatic changes, which were not complete before ~ 4 kya [6, 62, 64]. Consistent with these local climatic differences, we observed slight differences among the four trans-Saharan haplogroups. Indeed, we found that the contact between northern and sub-Saharan Africa went on until ~ 4.5 kya in the central Sahara, where we mainly found the internal lineages of E-M2 and R-V88 (Additional file 2: Figures S4 and S6). In the eastern Sahara, we found a sharper and more ancient (> 5 kya) differentiation between the people from northern Africa (and, more generally, from the Mediterranean area) and the groups from the eastern sub-Saharan regions (mainly from the Horn of Africa), as testified by the distribution and the coalescence ages of the A3-M13 and E-M78 lineages (Additional file 2: Figures S3 and S5).

green-sahara-r1b-v88-em-78
Time estimates and frequency maps of the four trans-Saharan haplogroups and major sub-clades. a Time estimates of the four trans-Saharan clades and their main internal lineages. To the left of the timeline, the time windows of the main climatic/historical African events are reported in different colours (legend in the upper left). b Frequency maps of the main trans-Saharan clades and sub-clades. For each map, the relative frequencies (percentages) are reported to the right

R-V88 has been observed at high frequencies in the central Sahel (northern Cameroon, northern Nigeria, Chad and Niger) and it has also been reported at low frequencies in northwestern Africa [37]. Outside the African continent, two rare R-V88 sub-lineages (R-M18 and R-V35) have been observed in Near East and southern Europe (particularly in Sardinia)[30, 37, 38, 39]. Because of its ethno-geographic distribution in the central Sahel, R-V88 has been linked to the spread of the Chadic branch of the Afroasiatic linguistic family [37, 40].

(…) the R-V88 lineages date back to 7.85 kya and its main internal branch (branch 233) forms a “star-like” topology (“Star-like” index = 0.55), suggestive of a demographic expansion. More specifically, 18 out of the 21 sequenced chromosomes belong to branch 233, which includes eight sister clades, five of which are represented by a single subject. The coalescence age of this sub-branch dates back to 5.73 kya, during the last Green Sahara period. Interestingly, the subjects included in the “star-like” structure come from northern Africa or central Sahel, tracing a trans-Saharan axis. It is worth noting that even the three lineages outside the main multifurcation (branches 230, 231 and 232) are sister lineages without any nested sub-structure. The peculiar topology of the R-V88 sequenced samples suggests that the diffusion of this haplogroup was quite rapid and possibly triggered by the Saharan favourable climate (Fig. 2b).

One of the theories I proposed in the Indo-European demic diffusion model since the first edition – based mainly on phylogeography – is that R1b-V88 lineages had probably crossed the Mediterranean through southern Italy into a Green Sahara region, and distributed from there through important green corridors, humid areas between megalakes. Even though this new study, like the rest of them, is based solely on modern samples, and as such is quite prone to error in assessing ancient distributions (as we have seen in Europe), it seems that a southern Italian route (probably through Sicily) for R1b-V88 and a late expansion through Green Sahara is more and more likely.

If we accept that the migration of R1b-V88 lineages is the last great expansion through a Green Sahara, then this expansion is a potential candidate for the initial Afroasiatic expansion – whereas older haplogroup expansions would represent languages different than Afroasiatic, and more recent haplogroup expansions would represent subsequent expansions of Afroasiatic dialects, like Semitic, Hamitic, Cushitic, or Chadic – as I explained in an older post.

In absolutely shameless speculative terms, then – as is today common in Genetic studies, by the way, so let’s all have some fun here – instead of some sort of R1b/Eurasiatic continuity in Europe, as some autochthonous continuists would like, this could mean that there would be an old Afroasiatic – R1b connection. That would imply:

NOTE. Regarding the contribution of CHG ancestry in the Pontic-Caspian steppe cultures, it is usually explained as caused by exogamy, or by absorption of a previous population (as in the Indo-Iranian case), although a contribution of communities of mainly J subclades to the formation of Neolithic steppe cultures cannot be ruled out. As for some autochthonous continuists’ belief in some sort of mythical mixed steppe people with mixed haplogroups and mixed language, well…

nostratic-tree
Simple Nostratic tree by Bomhard (2008)

The Pre-Indo-European linguistic situation, before the formation of Neolithic steppe cultures, seems like pure speculation, because a) language macro-families (with the exception of Afroasiatic) are highly speculative, b) sound anthropological models are lacking for them, and c) migrations inferred from haplogroup distributions of modern populations are often incorrect:

  • Haplogroup R could then be argued to be the source of Nostratic, and earlier subclades the source of Starostin’s Borean, given the distribution of its subclades in Asia and the timing of their migrations.
  • But of course one could also argue that, given the comparatively late population expansions that Genomics is showing, supporting Western European linguistic schools (whereas Russian Nostraticists tend to date languages further back in time), the R1b (and not R) expansion could be the marker of Nostratic languages, due to its most likely southern path (and their old subclades found in Iran and the Caucasus), which would be more in line with the wet dreams of Europeans proposing R1b autochthonous continuity theories. I like this option far less because of that, but it cannot be ruled out.

If you have read this blog before, you know I profoundly dislike lexicostatistical and glottochronological methods, and I don’t like mass comparisons either. Whereas these methods pretend to apply mathematics to big (raw) data where there is almost no knowledge of what one is doing, comparative grammar applies complex reasoning where there is a lot of partially processed data.

But, it is always fun to ask “what if they were right?” and follow from there…

See also:

Prehistoric loan relations: Foreign elements in the Proto-Indo-European vocabulary

ancient-indo-european-world-fantasy

An interesting ongoing web project, Prehistoric loan relations, on potential loanwords in Proto-Indo-European attributed to Uralic-Yukaghir, Caucasian, and Middle Eastern influence.

Based on a master’s thesis by Bjørn (2017) Foreign elements in the Proto-Indo-European vocabulary (PDF).

From the website (emphasis mine):

This page allows historical linguists to compare and scrutinize proposed prehistoric lexical borrowings from the perspective of Proto-Indo-European. The first entries are all (135 in total) extracted from my master’s thesis “Foreign elements in the Proto-Indo-European vocabulary” (Bjørn 2017). Comments are encouraged at the bottom of each entry. New entries will be added, also on request.

Take this not as the conclusion, but an invitation to join the conversation.

So, we welcome the invitation, and hope that this new project thrives.

Also, I loved his fantasy-like map of the central Eurasian region (featured image on this post).

Related:

Expansion of peoples associated with spread of haplogroups: Mongols and C3*-F3918, Arabs and E-M183 (M81)

iron-age-migrations

The expansion of peoples is known to be associated with the spread of a certain admixture component, together with the expansion and reduction in variability of a haplogroup. In other words, a few male lineages are usually more successful during the expansion.

Known examples include:

Two recent interesting papers add prehistoric cases of potential expansion of cultures associated with haplogroups:

1. Whole Y-chromosome sequences reveal an extremely recent origin of the most common North African paternal lineage E-M183 (M81), by Solé-Morata et al., Scientific Reports (2017).

Abstract:

E-M183 (E-M81) is the most frequent paternal lineage in North Africa and thus it must be considered to explore past historical and demographical processes. Here, by using whole Y chromosome sequences from 32 North African individuals, we have identified five new branches within E-M183. The validation of these variants in more than 200 North African samples, from which we also have information of 13 Y-STRs, has revealed a strong resemblance among E-M183 Y-STR haplotypes that pointed to a rapid expansion of this haplogroup. Moreover, for the first time, by using both SNP and STR data, we have provided updated estimates of the times-to-the-most-recent-common-ancestor (TMRCA) for E-M183, which evidenced an extremely recent origin of this haplogroup (2,000–3,000 ya). Our results also showed a lack of population structure within the E-M183 branch, which could be explained by the recent and rapid expansion of this haplogroup. In spite of a reduction in STR heterozygosity towards the West, which would point to an origin in the Near East, ancient DNA evidence together with our TMRCA estimates point to a local origin of E-M183 in NW Africa.

haplogroup-E-M183-subclade-distribution
Distribution of E-M183 subclades among North Africa, the Near East and the Iberian Peninsula. Pie chart sectors areas are proportional to haplogroup frequency and are coloured according to haplogroup in the schematic tree to the right. n: sample size. Map was generated using R software.

An interesting excerpt, from the discussion:

Regarding the geographical origin of E-M183, a previous study suggested that an expansion from the Near East could explain the observed east-west cline of genetic variation that extends into the Near East. Indeed, our results also showed a reduction in STR heterozygosity towards the West, which may be taken to support the hypothesis of an expansion from the Near East. In addition, previous studies based on genome-wide SNPs reported that a North African autochthonous component increase towards the West whereas the Near Eastern decreases towards the same direction, which again support an expansion from the Near East. However, our correlations should be taken carefully because our analysis includes only six locations on the longitudinal axis, none from the Near East. As a result, we do not have sufficient statistical power to confirm a Near Eastern origin. In addition, rather than showing a west-to-east cline of genetic diversity, the overall picture shown by this correlation analysis evidences just low genetic diversity in Western Sahara, which indeed could be also caused by the small sample size (n = 26) in this region. Alternatively, given the high frequency of E-M183 in the Maghreb, a local origin of E-M183 in NW Africa could be envisaged, which would fit the clear pattern of longitudinal isolation by distance reported in genome-wide studies. Moreover, the presence of autochthonous North African E-M81 lineages in the indigenous population of the Canary Islands, strongly points to North Africa as the most probable origin of the Guanche ancestors. This, together with the fact that the oldest indigenous individuals have been dated 2210 ± 60 ya, supports a local origin of E-M183 in NW Africa. Within this scenario, it is also worth to mention that the paternal lineage of an early Neolithic Moroccan individual appeared to be distantly related to the typically North African E-M81 haplogroup [30], suggesting again a NW African origin of E-M183.
A local origin of E-M183 in NW Africa > 2200 ya is supported by our TMRCA estimates, which can be taken as 2,000–3,000, depending on the data, methods, and mutation rates used.

The TMRCA estimates of a certain haplogroup and its subbranches provide some constraints on the times of their origin and spread. Although our time estimates for E-M78 are slightly different depending on the mutation rate used, their confidence intervals overlap and the dates obtained are in agreement with those obtained by Trombetta et al. Regarding E-M183, as mentioned above, we cannot discard an expansion from the Near East and, if so, according to our time estimates, it could have been brought by the Islamic expansion on the 7th century, but definitely not with the Neolithic expansion, which appeared in NW Africa ~7400 BP and may have featured a strong Epipaleolithic persistence. Moreover, such a recent appearance of E-M183 in NW Africa would fit with the patterns observed in the rest of the genome, where an extensive, male-biased Near Eastern admixture event is registered ~1300 ya, coincidental with the Arab expansion. An alternative hypothesis would involve that E-M183 was originated somewhere in Northwest Africa and then spread through all the region. Our time estimates for the origin of this haplogroup overlap with the end of the third Punic War (146 BCE), when Carthage (in current Tunisia) was defeated and destroyed, which marked the beginning of Roman hegemony of the Mediterranean Sea. About 2,000 ya North Africa was one of the wealthiest Roman provinces and E-M183 may have experienced the resulting population growth.
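
To make the “reduction in STR heterozygosity towards the West” from the excerpt more concrete, here is a minimal sketch of how per-locus diversity can be computed from Y-STR haplotypes. It assumes Nei’s unbiased gene diversity as the statistic (the excerpt does not say which estimator the authors used), and the function names and toy haplotypes are hypothetical, not taken from the paper.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's unbiased gene diversity for one locus: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def mean_str_diversity(haplotypes):
    """Average gene diversity across loci.

    Each haplotype is a tuple of repeat counts, one entry per Y-STR locus.
    """
    loci = list(zip(*haplotypes))  # regroup repeat counts by locus
    return sum(gene_diversity(locus) for locus in loci) / len(loci)


# Hypothetical toy data (NOT from the paper): 3-locus Y-STR haplotypes
# sampled at an eastern and a western location.
east = [(13, 24, 10), (14, 23, 10), (13, 25, 11), (15, 24, 10)]
west = [(13, 24, 10), (13, 24, 10), (13, 24, 10), (13, 25, 10)]

print(f"east: {mean_str_diversity(east):.3f}")
print(f"west: {mean_str_diversity(west):.3f}")
```

With these made-up values the western sample comes out much less diverse than the eastern one. With only a handful of chromosomes per location such a contrast can easily arise by chance, which is the caveat the authors themselves raise about their small Western Sahara sample.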

2. The Y-chromosome haplogroup C3*-F3918, likely attributed to the Mongol Empire, can be traced to a 2500-year-old nomadic group, by Zhang et al., Journal of Human Genetics (2017).

Abstract:

The Mongol Empire had a significant role in shaping the landscape of modern populations. Many populations living in Eurasia may have been the product of population mixture between ancient Mongolians and natives following the expansion of Mongol Empire. Geneticists have found that most of these populations carried the Y-haplogroup C3* (C-M217). To trace the history of haplogroup (Hg) C3* and to further understand the origin and development of Mongolians, ancient human remains from the Jinggouzi, Chenwugou and Gangga archaeological sites, which belonged to the Donghu, Xianbei and Shiwei, respectively, were analysed. Our results show that nine of the eleven males of the Gangga site, two of the eight males of Chengwugou site and all of the twelve males of Jinggouzi site were found to have mutations at M130 (Hg C), M217 (Hg C3), L1373 (C2b, ISOGG2015), with the absence of mutations at M93 (Hg C3a), P39 (Hg C3b), M48 (Hg C3c), M407 (Hg C3d) and P62 (Hg C3f). These samples were attributed to the Y-chromosome Hg C3* (Hg C2b, ISOGG2015), and most of them were further typed as Hg C2b1a based on the mutation at F3918. Finally, we inferred that the Y-chromosome Hg C3*-F3918 can trace its origins to the Donghu ancient nomadic group.
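
The typing logic described in the abstract can be sketched as a simple decision rule. The marker-to-haplogroup pairs are the ones listed above; the function name, the return labels and the input format (a set of markers found in the derived state) are my own assumptions for illustration, and a real pipeline would also have to handle missing calls in degraded ancient DNA.

```python
# Markers that must be derived for Hg C3* (C2b, ISOGG2015), per the abstract.
REQUIRED = ("M130", "M217", "L1373")

# Markers defining the downstream branches that must be absent for C3*.
EXCLUDED = {
    "M93": "C3a",
    "P39": "C3b",
    "M48": "C3c",
    "M407": "C3d",
    "P62": "C3f",
}


def classify_c3(derived_markers):
    """Assign a sample to a C3 sub-branch from its derived Y-SNP markers.

    `derived_markers` is an iterable of marker names called as derived
    (mutated) in the sample. Hypothetical helper, for illustration only.
    """
    derived = set(derived_markers)
    if not all(marker in derived for marker in REQUIRED):
        return "not C3* by these markers"
    for marker, haplogroup in EXCLUDED.items():
        if marker in derived:
            return haplogroup
    # No downstream C3a-C3f mutation: the sample is C3*; F3918 refines it.
    if "F3918" in derived:
        return "C3*-F3918 (C2b1a)"
    return "C3* (C2b)"


print(classify_c3({"M130", "M217", "L1373", "F3918"}))
```

The order of the checks mirrors the abstract: first the mutations that must be present, then the ones that must be absent, and only then the F3918 refinement.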

mongol-expansion-y-dna-haplogroup
The development of Mongolia and the frequencies of haplogroup C3* in modern Eurasians. a The development of Mongolia. b The frequencies of haplogroup C3 in modern Eurasians. The dotted line represents the approximate boundary between the Xiongnu and the Donghu. The black and grey arrows denote the migration of the Donghu and Mongolians, respectively

Featured image: Diachronic map of Iron Age migrations ca. 750-250 BC.

Related: