Fulani from Cameroon show ancestry similar to Afroasiatic speakers from East Africa


Open access African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations, by Fan et al. Genome Biology (2019) 20:82.

Interesting excerpts (emphasis mine):


To extend our knowledge of patterns of genomic diversity in Africa, we generated high coverage (> 30×) genome sequencing data from 43 geographically diverse Africans originating from 22 ethnic groups, representing a broad array of ethnic, linguistic, cultural, and geographic diversity (Additional file 1: Table S1). These include a number of populations of anthropological interest that have never previously been characterized for high-coverage genome sequence diversity such as Afroasiatic-speaking El Molo fishermen and Nilo-Saharan-speaking Ogiek hunter-gatherers (Kenya); Afroasiatic-speaking Aari, Agaw, and Amhara agro-pastoralists (Ethiopia); Niger-Congo-speaking Fulani pastoralists (Cameroon); Nilo-Saharan-speaking Kaba (Central African Republic, CAR); and Laka and Bulala (Chad) among others. We integrated this data with 49 whole genome sequences generated as part of the Simons Genome Diversity Project (SGDP) [14] (…)

Locations of samples included in this study. Each dot is an individual and the color indicates the language classification

Results and discussion

We found that the CRHG populations from central Africa, including the Mbuti from the Democratic Republic of Congo (DRC), Biaka from the CAR, and Baka, Bakola, and Bedzan from Cameroon, also form a basal lineage in the phylogeny. The other two hunter-gatherer populations, Hadza and Sandawe, living in Tanzania, group with populations from eastern Africa (Fig. 2). The two Nilo-Saharan-speaking populations, the Mursi from southern Ethiopia and the Dinka from southern Sudan, group into a single cluster, which is consistent with archeological data indicating that the migration of Nilo-Saharan populations to eastern Africa originated from a source population in southern Sudan in the last 3000 years [4, 23, 24, 25].

Phylogenetic relationship of 44 African and 32 west Eurasian populations determined by a neighbor joining analysis assuming no admixture. Here, the dots of each node represent bootstrap values and the color of each branch indicates language usage of each population. Human_AA human ancestral alleles

The Fulani people are traditionally nomadic pastoralists living across a broad geographic range spanning Sudan, the Sahel, Central, and Western Africa. The Fulani in our study, sampled from Cameroon, clustered with the Afroasiatic-speaking populations in East Africa in the phylogenetic analysis, indicating a potential language replacement from Afroasiatic to Niger-Congo in this population (Fig. 2). Prior studies suggest a complex history of the Fulani; analyses of Y chromosome variation suggest a shared ancestry with Nilo-Saharan and Afroasiatic populations [24], whereas mtDNA indicates a West African origin [26]. An analysis based on autosomal markers found traces of West Eurasian-related ancestry in this population [4], which suggests a North African or East African origin (as North and East Africans also have such ancestry likely related to expansions of farmers and herders from the Near East) and is consistent with the presence at moderate frequency of the −13,910T variant associated with lactose tolerance in European populations [15, 16].

Phylogenetic reconstruction of the relationship of African individuals under a model allowing for migration using TREEMIX [27] largely recapitulates the NJ phylogeny with the exception of the Fulani who cluster near neighboring Niger-Congo-speaking populations with whom they have admixed (Additional file 2: Figure S1). Interestingly, TREEMIX analysis indicates evidence for gene flow between the Hadza and the ancestors of the Ju|‘hoan and Khomani San, supporting genetic, linguistic, and archeological evidence that Khoesan-speaking populations may have originated in Eastern Africa [28, 29, 30].

ADMIXTURE analysis of 92 African and 62 West Eurasian individuals. Each bar is an individual and colors represent the proportion of inferred ancestry from K ancestral populations. The bottom bar shows the language classification of each individual. With the increasing of K, the populations are largely grouped by their current language usage

About the Fulani, this is what the referenced study of Y-chromosome variation among 15 Sudanese populations, by Hassan et al. (2008), had to say:

  • Haplogroups A-M13 and B-M60 are present at high frequencies in Nilo-Saharan groups except Nubians, with low frequencies in Afro-Asiatic groups although notable frequencies of B-M60 were found in Hausa (15.6%) and Copts (15.2%).
  • Haplogroup E (four different haplotypes) accounts for the majority (34.4%) of the chromosomes and is widespread in the Sudan. E-M78 represents 74.5% of haplogroup E, the highest frequencies observed in Masalit and Fur populations. E-M33 (5.2%) is largely confined to Fulani and Hausa, whereas E-M2 is restricted to Hausa. E-M215 was found to occur more in Nilo-Saharan rather than Afro-Asiatic speaking groups.
  • In contrast, haplogroups F-M89, I-M170, J-12f2, and J-M172 were found to be more frequent in the Afro-Asiatic speaking groups. J-12f2 and J-M172 represent 94% and 6%, respectively, of haplogroup J with high frequencies among Nubians, Copts, and Arabs.
  • Haplogroup K-M9 is restricted to Hausa and Gaalien with low frequencies and is absent in Nilo-Saharan and Niger-Congo.
  • Haplogroup R-M173 appears to be the most frequent haplogroup in Fulani, and haplogroup R-P25 has the highest frequency in Hausa and Copts and is present at lower frequencies in north, east, and western Sudan.
  • Haplogroups A-M51, A-M23, D-M174, H-M52, L-M11, O-M175, and P-M74 were completely absent from the populations analyzed.
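
The percentages in this list use two different denominators: some are shares of all sampled chromosomes, others are shares within one haplogroup. A minimal Python sketch of the conversion, using only the two figures quoted above for haplogroup E and E-M78 (the function name is illustrative, not from the study):

```python
def share_of_total(parent_share: float, share_within_parent: float) -> float:
    """Convert a within-haplogroup share into a share of all chromosomes."""
    return parent_share * share_within_parent

# Figures quoted from Hassan et al. (2008):
# haplogroup E = 34.4% of all chromosomes, E-M78 = 74.5% of haplogroup E.
e_total = 0.344
e_m78_within_e = 0.745

e_m78_total = share_of_total(e_total, e_m78_within_e)
print(f"E-M78 as a share of all chromosomes: {e_m78_total:.1%}")  # 25.6%
```

So E-M78 alone would account for roughly a quarter of all the Sudanese Y chromosomes sampled.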
Image modified from “Fulfulde Language Family Report” Author: Annette Harrison; Cartographer: Irene Tucker; SIL International 2003.

This is what David Reich will talk about in the seminar Insights into language expansions from ancient DNA:

In this talk, I will describe how the new science of genome-wide ancient DNA can provide insights into past spreads of language and culture. I will discuss five examples: (1) the spread of Indo-European languages to Europe and South Asia in association with Steppe pastoralist ancestry, (2) the spread of Austronesian languages to the open Pacific islands in association with Taiwanese aboriginal-associated ancestry, (3) the spread of Austroasiatic languages through southeast Asia in association with the characteristic ancestry type that is also represented in western Indonesia suggesting that these languages were once widespread there, (4) the spread of Afroasiatic languages through East Africa as part of the Pastoral Neolithic farming expansion, and (5) the spread of Na-Dene languages in North America in association with Proto-Paleoeskimo ancestry. I will highlight the ways that ancient DNA can meaningfully contribute to our understanding of language expansions, increasing the plausibility of some scenarios while decreasing the plausibility of others, while emphasizing that with genetic data by itself we can never definitively determine what languages ancient people spoke.

EDIT (3 MAY 2019): Apparently, there was not much to take from the talk.

Pastoralist Neolithic in Africa, through a pale-green Sahelo-Sudanian steppe corridor. See full map.

This seminar (and maybe some new paper on the Neolithic expansion in Africa) could shed light on population movements that may be related to the spread of Afroasiatic dialects. Until now, it seems that Bantu peoples have been more interesting for linguistics and archaeology, and South and East Africans for anthropology.

Archaeology in Africa appears to be in its infancy, as is population genomics. From the latest publication by Carina Schlebusch, Population migration and adaptation during the African Holocene: A genetic perspective, a chapter from Modern Human Origins and Dispersal (2019):

The process behind the introduction and development of farming in Africa is still unclear. It is not known how many independent invention events there were in the continent and to which extent the various first instances of farming in northern Africa are linked. Based on the archeological record, it was proposed that at least three regions in Africa may have developed agriculture independently: the Sahara/Sahel (around 7 ka), the Ethiopian highlands (7-4 ka), and western Africa (5-3 ka). In addition to these developments, the Nile River Valley is thought to have adopted agriculture (around 7.2 ka), from the Neolithic Revolution in the Middle East (Chapter 12 – Jobling et al. 2014; Chapter 35, 37 – Mitchell and Lane 2013). From these diverse centers of origin, farmers or farming practices spread to the rest of Africa, with domesticated animals reaching the southern tip of Africa ~2 ka and crop farming ~1.8 ka (Mitchell 2002; Huffman 2007).

Schematic representation of possible migration routes related to the expansion of herders and crop farmers during Holocene times. Arrow colors indicate source populations: Brown-Eurasian, Green-western African, Blue-eastern African.

As with the flawed 1990s-2000s haplogroup histories in Europe, built on the modern distribution of R1b, R1a, N, or I2, it is possible that neither of the haplogroups most often linked to the Afroasiatic expansion, E and J, was responsible for its early spread within Africa, despite their widespread distribution in certain modern Afroasiatic-speaking areas. The fact that such assessments rely on implausible glottochronological dates of up to 20,000 years for the parent language, combined with assumed regional language continuities despite archaeological changes, makes them even more suspect.

Similar to the case with Indo-Europeans and the “steppe ancestry” concept of the 2010s, it may be that the often-looked-for West Eurasian ancestry among Africans is the effect of recent migrations, unrelated to the Afroasiatic expansion. The results of this paper could be offering another sign of how this ancestry may have expanded only quite recently westwards from East Africa through the Sahel, after the Semitic expansion to the south:

1. From approximately 1000 BC, accompanying Nilo-Saharan peoples.

2. From approximately AD 1500, with the different population movements related to the nomadic Fulani:

Image from Sahel in West African History – Oxford Research Encyclopedia of African History.
  • Arguably, since the Fulani caste system wasn’t as elaborate in northern Nigeria, eastern Niger, and Cameroon, these specific groups would be a good example of admixture with eastern populations, given the (proportionally) huge number of slaves they dealt with.
  • Similarly, it could be argued that the caste-based social stratification in most other territories (including Sudan) would have helped them keep a genetic make-up similar to that of their region of origin in terms of ancient lineages, hence similar to Chadic populations from west to east.

Reich’s assertion that the language expansion is associated with the spread of the Pastoral Neolithic is still too vague, but – based on previous publications of ancient DNA in Africa and the Levant – I don’t have high hopes for a revolutionary paper in the near future. Without many samples and proper temporal transects, we are stuck with speculation based on modern distributions and scarce historical data.

A distribution map of Fula people. Dark green: a major ethnic group; Medium: significant; Light: minor. Modified from image by Sarah Welch at Wikipedia.

About the potential genetic make-up of Cameroon before the arrival of the Neolithic, from the recent SAA 84th Annual Meeting (Abstracts in PDF):

Lipson, Mark (Harvard Medical School), Mary Prendergast (Harvard University), Isabelle Ribot (Université de Montréal), Carles Lalueza-Fox (Institute of Evolutionary Biology CSIC-UPF) and David Reich (Harvard Medical School)

[253] Ancient Human DNA from Shum Laka (Cameroon) in the Context of African Population History

We generated genome-wide DNA data from four people buried at the site of Shum Laka in Cameroon between 8000–3000 years ago. One individual carried the deeply divergent Y chromosome haplogroup A00 found at low frequencies among some present-day Niger-Congo speakers, but the genome-wide ancestry profiles for all four individuals are very different from the majority of West Africans today and instead are more similar to West-Central African hunter-gatherers. Thus, despite the geographic proximity of Shum Laka to the hypothesized birthplace of Bantu languages and the temporal range of our samples bookending the initial Bantu expansion, these individuals are not representative of a Bantu source population. We present a phylogenetic model including Shum Laka that features three major radiations within Africa: one phase early in the history of modern humans, one close to the time of the migration giving rise to non-Africans, and one in the past several thousand years. Present-day West Africans and some East Africans, in addition to Central and Southern African hunter-gatherers, retain ancestry from the first phase, which is therefore still represented throughout the majority of human diversity in Africa today.


Migrations in the Levant region during the Chalcolithic, also marked by distinct Y-DNA


Open access Ancient DNA from Chalcolithic Israel reveals the role of population mixture in cultural transformation, by Harney et al. Nature Communications (2018).

Interesting excerpts (emphasis mine, reference numbers deleted for clarity):


The material culture of the Late Chalcolithic period in the southern Levant contrasts qualitatively with that of earlier and later periods in the same region. The Late Chalcolithic in the Levant is characterized by increases in the density of settlements, introduction of sanctuaries, utilization of ossuaries in secondary burials, and expansion of public ritual practices as well as an efflorescence of symbolic motifs sculpted and painted on artifacts made of pottery, basalt, copper, and ivory. The period’s impressive metal artifacts, which reflect the first known use of the “lost wax” technique for casting of copper, attest to the extraordinary technical skill of the people of this period.

The distinctive cultural characteristics of the Late Chalcolithic period in the Levant (often related to the Ghassulian culture, although this term is not in practice applied to the Galilee region where this study is based) have few stylistic links to the earlier or later material cultures of the region, which has led to extensive debate about the origins of the people who made this material culture. One hypothesis is that the Chalcolithic culture in the region was spread in part by immigrants from the north (i.e., northern Mesopotamia), based on similarities in artistic designs. Others have suggested that the local populations of the Levant were entirely responsible for developing this culture, and that any similarities to material cultures to the north are due to borrowing of ideas and not to movements of people.

Previous genome-wide ancient DNA studies from the Near East have revealed that at the time when agriculture developed, populations from Anatolia, Iran, and the Levant were approximately as genetically differentiated from each other as present-day Europeans and East Asians are today. By the Bronze Age, however, expansion of different Near Eastern agriculturalist populations (Anatolian, Iranian, and Levantine) in all directions and admixture with each other substantially homogenized populations across the region, thereby contributing to the relatively low genetic differentiation that prevails today. A previous study showed that the Levant Bronze Age population from the site of ‘Ain Ghazal, Jordan (2490–2300 BCE) could be fit statistically as a mixture of around 56% ancestry from a group related to Levantine Pre-Pottery Neolithic agriculturalists (represented by ancient DNA from Motza, Israel and ‘Ain Ghazal, Jordan; 8300–6700 BCE) and 44% related to populations of the Iranian Chalcolithic (Seh Gabi, Iran; 4680–3662 calBCE). Another study suggested that the Canaanite Levant Bronze Age population from the site of Sidon, Lebanon (~1700 BCE) could be modeled as a mixture of the same two groups albeit in different proportions (48% Levant Neolithic-related and 52% Iran Chalcolithic-related). However, the Neolithic and Bronze Age sites analyzed so far in the Levant are separated in time by more than three thousand years, making the study of samples that fill in this gap, such as those from Peqi’in, of critical importance.

This procedure produced genome-wide data from 22 ancient individuals from Peqi’in Cave (4500–3900 calBCE) (…)


We find that the individuals buried in Peqi’in Cave represent a relatively genetically homogenous population. This homogeneity is evident not only in the genome-wide analyses but also in the fact that most of the male individuals (nine out of ten) belong to the Y-chromosome haplogroup T, a lineage thought to have diversified in the Near East. This finding contrasts with both earlier (Neolithic and Epipaleolithic) Levantine populations, which were dominated by haplogroup E, and later Bronze Age individuals, all of whom belonged to haplogroup J.

Detailed sample background data for each of the 22 samples from which we successfully obtained ancient DNA. Additionally, background information for all samples from Peqi’in that were screened is included in Supplementary Data 1. *Indicates that Y-chromosome haplogroup call should be interpreted with caution, due to low coverage data.

Our finding that the Levant_ChL population can be well-modeled as a three-way admixture between Levant_N (57%), Anatolia_N (26%), and Iran_ChL (17%), while the Levant_BA_South can be modeled as a mixture of Levant_N (58%) and Iran_ChL (42%), but has little if any additional Anatolia_N-related ancestry, can only be explained by multiple episodes of population movement. The presence of Iran_ChL-related ancestry in both populations – but not in the earlier Levant_N – suggests a history of spread into the Levant of peoples related to Iranian agriculturalists, which must have occurred at least by the time of the Chalcolithic. The Anatolian_N component present in the Levant_ChL but not in the Levant_BA_South sample suggests that there was also a separate spread of Anatolian-related people into the region. The Levant_BA_South population may thus represent a remnant of a population that formed after an initial spread of Iran_ChL-related ancestry into the Levant that was not affected by the spread of an Anatolia_N-related population, or perhaps a reintroduction of a population without Anatolia_N-related ancestry to the region. We additionally find that the Levant_ChL population does not serve as a likely source of the Levantine-related ancestry in present-day East African populations.
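
The proportions in this paragraph can be read as weights in a linear mixture of source allele frequencies. The following Python sketch only illustrates that idea and is not the authors' method: the allele frequencies are invented, and the helper names `mix` and `two_way_alpha` are hypothetical.

```python
def mix(weights, sources):
    """Expected allele frequencies of a population formed as a weighted mix of sources."""
    assert abs(sum(weights) - 1.0) < 1e-9, "admixture proportions must sum to 1"
    n_sites = len(sources[0])
    return [sum(w * src[i] for w, src in zip(weights, sources)) for i in range(n_sites)]

def two_way_alpha(target, a, b):
    """Least-squares share of source `a` when target ~ alpha*a + (1 - alpha)*b."""
    num = sum((t - y) * (x - y) for t, x, y in zip(target, a, b))
    den = sum((x - y) ** 2 for x, y in zip(a, b))
    return num / den

# Hypothetical allele frequencies at five sites (illustrative only, not real data).
levant_n = [0.10, 0.80, 0.55, 0.30, 0.95]
anatolia_n = [0.40, 0.60, 0.20, 0.70, 0.85]
iran_chl = [0.90, 0.20, 0.75, 0.10, 0.50]

# Proportions quoted in the excerpt above.
levant_chl = mix([0.57, 0.26, 0.17], [levant_n, anatolia_n, iran_chl])
levant_ba_south = mix([0.58, 0.42], [levant_n, iran_chl])

print([round(f, 3) for f in levant_chl])
# Recover the Levant_N share of the simulated two-way mixture.
print(round(two_way_alpha(levant_ba_south, levant_n, iran_chl), 2))
```

Because the toy target is an exact mixture, the estimator returns the weight that was put in. With real data, noise and shared drift between sources make the problem much harder, which is why dedicated statistical tools are used for it.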

These genetic results have striking correlates to material culture changes in the archaeological record. The archaeological finds at Peqi’in Cave share distinctive characteristics with other Chalcolithic sites, both to the north and south, including secondary burial in ossuaries with iconographic and geometric designs. It has been suggested that some Late Chalcolithic burial customs, artifacts and motifs may have had their origin in earlier Neolithic traditions in Anatolia and northern Mesopotamia. Some of the artistic expressions have been related to finds and ideas and to later religious concepts such as the gods Inanna and Dumuzi from these more northern regions. The knowledge and resources required to produce metallurgical artifacts in the Levant have also been hypothesized to come from the north.

Our finding of genetic discontinuity between the Chalcolithic and Early Bronze Age periods also resonates with aspects of the archeological record marked by dramatic changes in settlement patterns, large-scale abandonment of sites, many fewer items with symbolic meaning, and shifts in burial practices, including the disappearance of secondary burial in ossuaries. This supports the view that profound cultural upheaval, leading to the extinction of populations, was associated with the collapse of the Chalcolithic culture in this region.

Genetic structure of analyzed individuals. a Principal component analysis of 984 present-day West Eurasians (shown in gray) with 306 ancient samples projected onto the first two principal component axes and labeled by culture. b ADMIXTURE analysis of 984 present-day West Eurasians and 306 ancient samples with K = 11 ancestral components. Only ancient samples are shown


I think the most interesting aspect of this paper is – as usual – the expansion of peoples associated with a single Y-DNA haplogroup. Given that the expansion of Semitic languages in the Middle East – like that of Anatolian languages from the north – must have happened after ca. 3100 BC, coinciding with the collapse of the Uruk period, these Chalcolithic northern Levantine peoples are probably not related to the later Semitic expansion in the region. This is further supported by their lack of relationship with later Levantine migrations into Africa. The replacement of haplogroup E before the arrival of haplogroup J suggests still more clearly that Natufians and their main haplogroup were not related to the Afroasiatic expansions.

Distribution of Semitic languages. From Wikipedia.

On the other hand, while their ancestry points to neighbouring regional origins, their haplogroup T1a1a (probably T1a1a1b2) may be closely related to that of other Semitic peoples to the south, as found in east Africa and Arabia. This may be due either to a northward migration of these Chalcolithic Levantine peoples from southern regions in the 5th millennium BC, or to a later migration of Semitic peoples from the Levant to the south, coupled with the expansion of this haplogroup but associated with a distinct population. As we know, ancestry can change within a few generations of intense admixture, whereas Y-DNA haplogroups are rarely mixed in prehistoric population expansions.

Without more data from ancient DNA, it is difficult to say. Haplogroup T1a1 is found in Morocco (ca. 3780-3650 calBC), which could point to a recent expansion of a Berbero-Semitic branch; but it is also found in a Balkan Neolithic sample from ca. 5800-5400 calBCE, which could suggest an Anatolian origin of the specific subclades encountered here. In any case, a potential origin of Proto-Semitic anywhere in this wide Near Eastern region ca. 4500-3500 BC cannot be ruled out, given that their ancestors probably came from Africa.

Distribution of haplogroup T of Y-chromosome. From Wikipedia.

It is also interesting that we have yet to find a single prehistoric population expansion not associated with a reduction in variability and the expansion of Y-DNA haplogroups. It seems that the supposedly mixed Yamna community remains the only (hypothetical) example in history where expanding patrilineal clans would not share a Y-DNA haplogroup…


Y-chromosome mixture in the modern Corsican population shows different migration layers


Open access Prehistoric migrations through the Mediterranean basin shaped Corsican Y-chromosome diversity, by Di Cristofaro et al. PLOS One (2018).

Interesting excerpts:

This study included 321 samples from men throughout Corsica; samples from Provence and Tuscany were added to the cohort. All samples were typed for 92 Y-SNPs, and Y-STRs were also analyzed.

Haplogroup R represented approximately half of the lineages in both Corsican and Tuscan samples (respectively 51.8% and 45.3%) whereas it reached 90% in Provence. Sub-clade R1b1a1a2a1a2b-U152 predominated in North Corsica whereas R1b1a1a2a1a1-U106 was present in South Corsica. Both SNPs display clinal distributions of frequency variation in Europe, the U152 branch being most frequent in Switzerland, Italy, France and Western Poland. Calibrated branch lengths from whole Y chromosome sequencing [44,45] and ancient DNA studies [46] both indicated that R1a and R1b diversification began relatively recently, about 5 Kya, consistent with Bronze Age and Copper Age demographic expansion. TMRCA estimations are concordant with such expansion in Corsica.

Spatial frequency maps for haplogroups with frequencies above 3%, their Y-STR based phylogenetic networks in Corsican populations (Blue: North, Green: West, Orange: South, Black: Center and Purple: East) and their TMRCA (in years, +/- SE).

Haplogroup G reached 21.7% in Corsica and 13.3% in Tuscany. Sub-clade G2a2a1a2-L91 accounted for 11.3% of all haplogroups in Corsica yet was not present in Provence or in Tuscany. Thirty-four out of the 37 G2a2a1a2-L91 displayed a unique Y-STR profile, illustrated by the star-like profile of STR networks (Fig 1). G2a2a1a2-L91 and G2a2a-PF3147(xL91xM286) show their highest frequency in present day Sardinia and southern Corsica compared to low levels from Caucasus to Southern Europe, encompassing the Near and Middle East [21,47–50]. Ancient DNA results from Early and Middle Neolithic samples reported the presence of haplogroup G2a-P15 [51–53], consistent with gene flow from the Mediterranean region during the Neolithic transition. Td expansion time estimated by STR for P15-affiliated chromosomes was estimated to be 15,082+/-2217 years ago [49]. Ötzi, the 5,300-year-old Alpine mummy, was derived for the L91 SNP [21]. A genetic relationship between G haplogroups from Corsica and Sardinia is further supported by DYS19 duplication, reported in North Sardinia [14], and observed in the southern part of the Corsica in 9 out of 37 G2a2a1a2-L91 chromosomes and in 4 out of 5 G2a2a-PF3147(xL91xM286) chromosomes, 3 of which displayed an identical STR profile (S4 Table).

This lineage has a reported coalescent age estimated by whole sequencing in Sardinian samples of about 9,000 years ago. This could reflect common ancestors coming from the Caucasus and moving westward during the Neolithic period [48], whereas their continental counterparts would have been replaced by rapidly expanding populations associated with the Bronze Age [46,54,55]. Estimated TMRCA for L91 lineage in Corsica is 4529 +/- 853 years. G-L497 showed high frequencies in Corsica compared to Provence and Tuscany, and this haplogroup was common in Europe, but rare in Greece, Anatolia and the Middle East. Fifteen out of the 17 Corsican G2a2b2a1a1b-L497 displayed a unique Y-STR profile (S4 Table) with an estimated TMRCA of 6867 +/- 1294 years. Haplogroup G2a2b1-M406, associated with Impressed Ware Neolithic markers, along with J2a1-DYS445 = 6 and J2a1b1-M92 [22,49], had very low levels in Corsica. Conversely, G2a2b2a-P303 was highly represented and seemed to be independent of the G2a2b1-M406 marker. The 7 G2a2b2a-P303(xL497xM527) Corsican chromosomes displayed a unique Y-STR profile (S4 Table).
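
The excerpt does not say which estimator produced these Y-STR based TMRCA values. One common approach is the average squared distance (ASD) to a founder haplotype, sketched below in Python as an illustration only. The haplotypes, the mutation rate, and the generation time are all placeholder assumptions, not values from the paper.

```python
def tmrca_years(haplotypes, founder, mutation_rate, generation_years):
    """Estimate TMRCA from the average squared distance (ASD) to a founder haplotype.

    Under a stepwise mutation model, the expected squared difference in repeat
    count per locus grows by `mutation_rate` each generation, so
    generations ~= ASD / mutation_rate.
    """
    n_loci = len(founder)
    total = sum((h[i] - founder[i]) ** 2 for h in haplotypes for i in range(n_loci))
    asd = total / (len(haplotypes) * n_loci)
    return asd / mutation_rate * generation_years

# Hypothetical repeat counts at four Y-STR loci (not real data).
founder = [14, 12, 23, 10]
sample = [
    [14, 12, 23, 10],
    [15, 12, 23, 10],
    [14, 12, 22, 10],
    [14, 13, 23, 11],
    [14, 12, 23, 10],
]

# Assumed parameters: 0.002 mutations per locus per generation, 30 years per generation.
estimate = tmrca_years(sample, founder, mutation_rate=0.002, generation_years=30)
print(round(estimate))
```

With this toy sample the ASD is 0.2, which gives about 100 generations, or roughly 3,000 years. The result scales directly with the assumed mutation rate, which is one reason STR-based dates can differ so widely between studies.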

First and second axes of the PCA based on 12 Y-chromosome haplogroup frequencies in 83 west Mediterranean populations.

Haplogroup J, mainly represented by J2a1b-M67(xM92), displayed intermediate frequencies in Corsica compared to Tuscany and Provence. J2a1b-M67(xM92) derived STR network analysis displayed a quite homogeneous profile across the island with an estimated TMRCA of 2381 +/- 449 years (Fig 1) and individuals displaying M67 were peripheral compared to Northwestern Italians (S2 Fig). The haplogroup J2a1-Page55(xM67xM530), characteristic of non-Greek Anatolia [22], was found in the north-west of Corsica. Haplogroup J2a1-DYS445 = 6 was found in the north-west with DYS391 = 10 repeats, and in the far south with DYS391 = 9 repeats, the former was associated with Anatolian Greek samples, whereas the second was found in central Anatolia [22]. The 7 J2b2a-M241 displayed a unique Y-STR profile (S4 Table), they were only detected in the Cap Corse region, this sub-haplogroup shows frequency peaks in both the southern Balkans and northern-central Italy [56] and is associated with expansion from the Near East to the Balkans during Neolithic period [57].

Haplogroup E, mainly represented by E1b1b1a1b1a-V13, displayed intermediate frequencies in Corsica compared to Tuscany and Provence. E1b1b1a1b1a-V13 was thought to have initiated a pan-Mediterranean expansion 7,000 years ago starting from the Balkans [52] and its dispersal to the northern shore of the Mediterranean basin is consistent with the Greek Anatolian expansion to the western Mediterranean [22], characteristic of the region surrounding Aleria, and consistent with the TMRCA estimated in Corsica for this haplogroup. A few E1b1a-V38 chromosomes are also observed in the same regions as V13.


Recent African origin with hybridization, and back to Africa 70,000 years ago


Open access Carriers of mitochondrial DNA macrohaplogroup L3 basal lineages migrated back to Africa from Asia around 70,000 years ago, by Cabrera et al. BMC Evol Biol (2018) 18(98).

Abstract (emphasis mine):


The main unequivocal conclusion after three decades of phylogeographic mtDNA studies is the African origin of all extant modern humans. In addition, a southern coastal route has been argued for to explain the Eurasian colonization of these African pioneers. Based on the age of macrohaplogroup L3, from which all maternal Eurasian and the majority of African lineages originated, the out-of-Africa event has been dated around 60-70 kya. On the opposite side, we have proposed a northern route through Central Asia across the Levant for that expansion and, consistent with the fossil record, we have dated it around 125 kya. To help bridge differences between the molecular and fossil record ages, in this article we assess the possibility that mtDNA macrohaplogroup L3 matured in Eurasia and returned to Africa as basal L3 lineages around 70 kya.


The coalescence ages of all Eurasian (M, N) and African (L3) lineages, both around 71 kya, are not significantly different. The oldest M and N Eurasian clades are found in southeastern Asia instead of near Africa, as expected by the southern route hypothesis. The split of the Y-chromosome composite DE haplogroup is very similar to the age of mtDNA L3. A Eurasian origin and back migration to Africa has been proposed for the African Y-chromosome haplogroup E. Inside Africa, frequency distributions of maternal L3 and paternal E lineages are positively correlated. This correlation is not fully explained by geographic or ethnic affinities. This correlation rather seems to be the result of a joint and global replacement of the old autochthonous male and female African lineages by the new Eurasian incomers.


These results are congruent with a model proposing an out-of-Africa migration into Asia, following a northern route, of early anatomically modern humans carrying pre-L3 mtDNA lineages around 125 kya, subsequent diversification of pre-L3 into the basal lineages of L3, a return to Africa of Eurasian fully modern humans around 70 kya carrying the basal L3 lineages and the subsequent diversification of Eurasian-remaining L3 lineages into the M and N lineages in the outside-of-Africa context, and a second Eurasian global expansion by 60 kya, most probably, out of southeast Asia. Climatic conditions and the presence of Neanderthals and other hominins might have played significant roles in these human movements. Moreover, recent studies based on ancient DNA and whole-genome sequencing are also compatible with this hypothesis.
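
The abstract's claim that L3 and E frequency distributions are "positively correlated" can be pictured with a plain Pearson correlation across populations. This is a Python sketch with invented frequencies, not the authors' data or necessarily their exact test:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-population frequencies (illustrative only, not real data).
l3_freq = [0.10, 0.25, 0.40, 0.55, 0.30]  # mtDNA L3
e_freq = [0.20, 0.35, 0.60, 0.80, 0.45]   # Y-chromosome E

r = pearson(l3_freq, e_freq)
print(f"r = {r:.3f}")
```

A high r by itself says nothing about cause: neighbouring or related populations tend to resemble each other in both markers, which is why the abstract adds that the correlation is not fully explained by geographic or ethnic affinities.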


You can also read the recent, interesting open access review How did Homo sapiens evolve?, by Julia Galway-Witham and Chris Stringer, Science (2018) 360(6395):1296-1298.


Bantu distinguished from Khoe by uniparental markers, not genome-wide autosomal admixture


The role of matrilineality in shaping patterns of Y chromosome and mtDNA sequence variation in southwestern Angola, by Oliveira et al. bioRxiv (2018).

Interesting excerpts (emphasis mine):

The origins of NRY diversity in SW Angola

In accordance with our previous mtDNA study [9], the present NRY analysis reveals a major division between the Kx’a-speaking !Xun and the Bantu-speaking groups, whose paternal genetic ancestry does not display any old remnant lineages, or a clear link to pre-Bantu eastern African migrants introducing Khoe-Kwadi languages and pastoralism into southern Africa (cf. [15]). This is especially evident in the distribution of the eastern African subhaplogroup E1b1b1b2b [29], which reaches the highest frequency in the !Xun (25%) and not in the formerly Kwadi-speaking Kwepe (7%). This observation, together with recent genome-wide estimates of 9-22% of eastern African ancestry in other Kx’a and Tuu-speaking groups [35], suggests that eastern African admixture was not restricted to present-day Khoe-Kwadi speakers. Alternatively, it is likely that the dispersal of pastoralism and Khoe-Kwadi languages involved a series of punctuated contacts that led to a wide variety of cultural, genetic and linguistic outcomes, including possible shifts to Khoe-Kwadi by originally Bantu-speaking peoples [36].

Although traces of an ancestral pre-Bantu population may yet be found in autosomal genome-wide studies, the extant variation in both uniparental markers strongly supports a scenario in which all groups of the Angolan Namib share most of their genetic ancestry with other Bantu groups but became increasingly differentiated within the highly stratified social context of SW African pastoral societies [11].

Y chromosome phylogeny, haplogroup distribution and map of the sampling locations. The phylogenetic tree was reconstructed in BEAST based on 2,379 SNPs and is in accordance with the known Y chromosome topology. Main haplogroup clades and their labels are shown with different colors. Age estimates are reported in italics near each node, with the TMRCA of main haplogroups shown with their corresponding color. A map of the sampling locations, re-used with permission from Oliveira et al. (2018) [9], is shown on the bottom left, and the haplogroup distribution per population is shown on the bottom right, with color-coding corresponding to the phylogenetic tree.

The influence of socio-cultural behaviors on the diversity of NRY and mtDNA

A comparison of the NRY variation with previous mtDNA results for the same groups [9] identifies three main sex-specific patterns. First, gene flow from the Bantu into the !Xun is much higher for male than for female lineages (31% NRY vs. 3% mtDNA), similar to the reported male-biased patterns of gene flow from Bantu to Khoisan-speaking groups [33], and from non-Pygmies to Pygmies in Central Africa [37]. A comparable trend, involving exclusive introgression of NRY eastern African lineages into the !Xun (25%) was also found. (…)

Secondly, the levels of intrapopulation diversity in the Bantu-speaking peoples from the Namib were found to be consistently higher for mtDNA than for the NRY, reflecting the marked association between the Bantu expansion and the relatively young NRY E1b1a1a1 haplogroup, which has no parallel in mtDNA [25, 39]. (…)

In the context of the Bantu expansions, these patterns have been mostly interpreted as the result of polygyny and/or higher levels of assimilation of females from resident forager communities [38, 40]. However, most groups from the Angolan Namib are only mildly polygynous [11] and ethnographic data suggest that the actual rates of polygyny in many populations may be insufficient to significantly reduce Nem [2, 41]. In addition, the finding of a large Nef/Nem ratio in the Himba (Fig. S5), who have almost no Khoisan-related mtDNA lineages [9], indicates that female biased introgression cannot fully explain the observed patterns.

An alternative explanation may be sought in the prevailing matrilineal descent rules, which might have created a sex-specific structuring effect, similar to that proposed for patrilineal groups from Central Asia (…)

Bayesian skyline plots (BSP) of effective population size change through time, based on mtDNA (red) and the NRY (black). Thick lines show the mean estimates and dashed lines show the 95% HPD intervals. The vertical line highlights the 2 ky before present mark. Effective sizes are plotted on a log scale. Generation times of 25 and 31 years were assumed for mtDNA and the NRY, respectively [32].
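NOTE. To make the generation times in that caption concrete, here is a minimal sketch (mine, not the authors' code) that converts between calendar years and generations using the 25-year (mtDNA) and 31-year (NRY) values quoted above. The function and dictionary names are hypothetical, chosen only for illustration.

```python
# Generation times quoted in the figure caption (years per generation).
GENERATION_YEARS = {"mtDNA": 25, "NRY": 31}


def years_to_generations(years, marker):
    """Convert a time span in years to generations for the given marker."""
    return years / GENERATION_YEARS[marker]


def generations_to_years(generations, marker):
    """Convert a number of generations back to years for the given marker."""
    return generations * GENERATION_YEARS[marker]


# The 2 ky before present mark highlighted in the plots.
mark_years = 2000
for marker in ("mtDNA", "NRY"):
    gens = years_to_generations(mark_years, marker)
    print(f"{marker}: {mark_years} years is about {gens:.1f} generations")
```

The same date therefore corresponds to fewer paternal than maternal generations, which is worth keeping in mind when comparing the two curves.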

The third important sex-specific pattern observed in this study is the much lower amount of between-group differentiation for NRY than for mtDNA among Bantu-speaking populations (4.4% NRY vs. 20.2% mtDNA), in spite of the patrilocal residence patterns of all ethnic groups (Table S5). This difference can hardly be explained by unequal levels of introgression of “Khoisan” mtDNA lineages into the Bantu, since the percentage of mtDNA variation remains high (18.8%) when the Kuvale, who have high frequencies of “Khoisan”-related mtDNA, are excluded from the comparisons. It therefore seems more plausible that differentiation is higher in the mtDNA simply because there is more ancestral mtDNA than NRY variation that can be sorted among different populations (see [45]). Moreover, due to the matriclanic organization of all Bantu-speaking communities, factors enhancing inter-group differentiation, like kin-structured migration and kin-structured founder effects [46], would have been restricted to mtDNA. Finally, it is also likely that the discrepancy between among-group divergence of mtDNA and NRY might have been influenced by higher migration rates in males than females. In fact, although all Bantu-speaking populations have patrilocal residence patterns, the observance of endogamy rules severely constrains the between-group mobility of females. In this context, the children from extramarital unions involving members from different populations tend to be raised in the mother’s group, effectively increasing male versus female migration rates. Moreover, it is likely that, in the highly hierarchized setting of the Namib, most intergroup extramarital unions would involve men from dominant groups and women from peripatetic communities. This hypothesis is indirectly supported by the finding that in NRY-based clusters (but not in mtDNA) pastoralist populations are grouped together with peripatetic communities that share their cultural traits (Figs. S6 and 3b), suggesting that migration of NRY lineages follows a path that is similar to horizontally transmitted cultural features.


Shared ancestry of ancient Eurasian hepatitis B virus diversity linked to Bronze Age steppe


Ancient hepatitis B viruses from the Bronze Age to the Medieval period, by Mühlemann et al., Nature (2018) 557:418–423.

NOTE. You can read the PDF at Dalia Pokutta’s Academia.edu account.

Abstract (emphasis):

Hepatitis B virus (HBV) is a major cause of human hepatitis. There is considerable uncertainty about the timescale of its evolution and its association with humans. Here we present 12 full or partial ancient HBV genomes that are between approximately 0.8 and 4.5 thousand years old. The ancient sequences group either within or in a sister relationship with extant human or other ape HBV clades. Generally, the genome properties follow those of modern HBV. The root of the HBV tree is projected to between 8.6 and 20.9 thousand years ago, and we estimate a substitution rate of 8.04 × 10⁻⁶–1.51 × 10⁻⁵ nucleotide substitutions per site per year. In several cases, the geographical locations of the ancient genotypes do not match present-day distributions. Genotypes that today are typical of Africa and Asia, and a subgenotype from India, are shown to have an early Eurasian presence. The geographical and temporal patterns that we observe in ancient and modern HBV genotypes are compatible with well-documented human migrations during the Bronze and Iron Ages [1, 2]. We provide evidence for the creation of HBV genotype A via recombination, and for a long-term association of modern HBV genotypes with humans, including the discovery of a human genotype that is now extinct. These data expose a complexity of HBV evolution that is not evident when considering modern sequences alone.
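NOTE. To get a feel for what that substitution rate means, here is a back-of-the-envelope sketch (mine, not from the paper) that multiplies the two quoted bounds by a time span. The genome length of about 3,200 sites is my own assumption for illustration, and the simple product ignores repeated changes at the same site, so it should be read only as a rough order of magnitude.

```python
# Bounds of the substitution rate quoted in the abstract
# (nucleotide substitutions per site per year).
RATE_LOW = 8.04e-6
RATE_HIGH = 1.51e-5

# ASSUMPTION (not from the abstract): approximate HBV genome length in sites.
GENOME_SITES = 3200


def expected_substitutions(rate_per_site_per_year, years, sites=1):
    """Naive expectation: rate x time x number of sites (no multiple-hit correction)."""
    return rate_per_site_per_year * years * sites


# Age of the oldest samples mentioned in the abstract (about 4.5 thousand years).
sample_age_years = 4500

low = expected_substitutions(RATE_LOW, sample_age_years, GENOME_SITES)
high = expected_substitutions(RATE_HIGH, sample_age_years, GENOME_SITES)
print(f"Expected substitutions along one lineage: {low:.0f} to {high:.0f}")
```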

Geographical distribution of analysed samples and modern genotypes. a (featured image), Distribution of modern human HBV genotypes. Genotypes relevant to this Letter are shown in colour. Coloured shapes indicate the locations of the HBV-positive samples included for further analysis. b (above this text), Locations of analysed Bronze Age samples are shown as circles and Iron Age and later samples are shown as triangles. Coloured markers indicate HBV-positive samples. Ancient genotype A samples are found in regions in which genotype D predominates today, and HBV-DA27 is of subgenotype D5 which today is found almost exclusively in India.

Interesting excerpts:

We find genotype A in south-western Russia by 4.3 ka (in samples RISE386 and RISE387) in individuals belonging to the Sintashta culture, and in a Hungarian sample (DA195) from the Scythian culture. The western Scythians are related to the Bronze Age cultures of western steppe populations [2] and their shared ancestry suggests that the modern genotype A may descend from this ancient Eurasian diversity and not, as previously hypothesized, from African ancestors [29, 30]. This is also consistent with the phylogeny (Fig. 2), as well as the fact that the three oldest ancient genotype A sequences (HBV-DA195, HBV-RISE386 and HBV-RISE387) lack the six-nucleotide insertion found in the youngest (HBV-DA119) and in all modern genotype A sequences. The ancestors of subgenotypes A1 and A3 could have been carried into Africa subsequently, via migration from western Eurasia [31].

The ancient HBV genotype D sequences were all found in Central Asia. HBV-DA27, found in Kazakhstan and dated to 1.6 ka, falls basal to the modern subgenotype D5 sequences that today are found in the Paharia tribe from eastern India [32]. DA27 and the Paharia people in India are linked by their East Asian ancestry [2, 33].

Dated maximum clade credibility tree of HBV. A log-normal relaxed clock and coalescent exponential population prior were used. Grey horizontal bars indicate the 95% HPD interval of the age of the node. Larger numbers on the nodes indicate the median age and 95% HPD interval of the age (in parentheses) under a strict clock and Bayesian skyline tree prior. Clades of genotypes C (except clade C4), E, F, G and H are collapsed and shown as dots. The figure includes a possible tenth genotype, J, based on a single human isolate. Taxon names for ancient samples indicate era (BA, Bronze Age; IA, Iron Age or later), sample name, sample age in years, ISO 3166 three-letter abbreviation of country of sequence origin, and region of sequence origin. Taxon names for modern samples indicate human genotype or subgenotype or host species if non-human, GenBank accession number, sample age in years, ISO 3166 three-letter abbreviation of country of sequence origin, and region of sequence origin.

(…) Despite the age of the samples and the imperfect diagnostic test, our dataset contained a high proportion of HBV-positive individuals. The actual ancient prevalence during the Bronze Age and thereafter might have been higher, reaching or exceeding the prevalence typically found in contemporary indigenous populations [5]. This clearly establishes the potential of HBV as a powerful proxy tool for research into human spread and interactions. The data from ancient genomes reveal aspects of complexity in HBV evolution that are not apparent when only modern sequences are considered. They show the existence of ancient HBV genotypes in locations incongruent with their present-day distribution, contradicting previously suggested geographical or temporal origins of genotypes or sub-genotypes; evidence for the creation of genotype A via recombination and the emergence of the genotype outside Africa; at least one now-extinct human genotype; ancient genotype-level localized diversity; and demonstrate that the viral substitution rate obtained from modern heterochronously sampled sequences is probably misleading. Together, these findings suggest that the difficulty in formulating a coherent theory for the origin and spread of HBV may be due to genetic evidence of an earlier evolutionary scenario being overwritten by relatively recent alterations, as has previously been suggested in the context of recombination [24].


Ancient genomes from North Africa evidence Neolithic migrations to the Maghreb

The bioRxiv preprint has now been published (behind paywall): Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe, by Fregel et al., PNAS (2018).

NOTE. I think one of the important changes in this version compared to the preprint is the addition of the recent Iberomaurusian samples.

Abstract (emphasis mine):

The extent to which prehistoric migrations of farmers influenced the genetic pool of western North Africans remains unclear. Archaeological evidence suggests that the Neolithization process may have happened through the adoption of innovations by local Epipaleolithic communities or by demic diffusion from the Eastern Mediterranean shores or Iberia. Here, we present an analysis of individuals’ genome sequences from Early and Late Neolithic sites in Morocco and from Early Neolithic individuals from southern Iberia. We show that Early Neolithic Moroccans (∼5,000 BCE) are similar to Later Stone Age individuals from the same region and possess an endemic element retained in present-day Maghrebi populations, confirming a long-term genetic continuity in the region. This scenario is consistent with Early Neolithic traditions in North Africa deriving from Epipaleolithic communities that adopted certain agricultural techniques from neighboring populations. Among Eurasian ancient populations, Early Neolithic Moroccans are distantly related to Levantine Natufian hunter-gatherers (∼9,000 BCE) and Pre-Pottery Neolithic farmers (∼6,500 BCE). Late Neolithic (∼3,000 BCE) Moroccans, in contrast, share an Iberian component, supporting theories of trans-Gibraltar gene flow and indicating that Neolithization of North Africa involved both the movement of ideas and people. Lastly, the southern Iberian Early Neolithic samples share the same genetic composition as the Cardial Mediterranean Neolithic culture that reached Iberia ∼5,500 BCE. The cultural and genetic similarities between Iberian and North African Neolithic traditions further reinforce the model of an Iberian migration into the Maghreb.

Ancestry inference in ancient samples from North Africa and the Iberian Peninsula. PCA analysis using the Human Origins panel (European, Middle Eastern, and North African populations) and LASER projection of aDNA samples.

Relevant excerpts:

FST and outgroup-f3 distances indicate a high similarity between IAM and Taforalt. As observed for IAM, most Taforalt sample ancestry derives from Epipaleolithic populations from the Levant. However, van de Loosdrecht et al. (17) also reported that one third of Taforalt ancestry was of sub-Saharan African origin. To confirm whether IAM individuals show a sub-Saharan African component, we calculated f4(chimpanzee, African population; Natufian, IAM) in such a way that a positive result for f4 would indicate that IAM is composed both of Levantine and African ancestries. Consistent with the results observed for Taforalt, f4 values are significantly positive for West African populations, with the highest value observed for Gambian and Mandenka (Fig. 3 and SI Appendix, Supplementary Note 10). Together, these results indicate the presence of the same ancestral components in ∼15,000-y old and ∼7,000-y-old populations from Morocco, strongly suggesting a temporal continuity between Later Stone Age and Early Neolithic populations in the Maghreb. However, it is important to take into account that the number of ancient genomes available for comparison is still low and future sampling can provide further refinement in the evolutionary history of North Africa.
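NOTE. For readers unfamiliar with the f4 test mentioned in that paragraph, here is a minimal sketch (mine, not the authors' pipeline) of the basic quantity: the average over SNPs of the product of two allele-frequency differences. The allele frequencies below are invented toy values, and the function name is hypothetical. Real analyses also estimate a standard error (typically with a block jackknife) before calling a value significantly positive, which this sketch does not do.

```python
from statistics import mean


def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d) on allele frequencies."""
    if not (len(p_a) == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("all populations need the same number of SNPs")
    return mean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))


# TOY DATA (invented for illustration only): derived-allele frequencies at 4 SNPs.
chimpanzee = [0.0, 0.0, 0.0, 0.0]   # outgroup, assumed to carry the ancestral allele
african = [0.8, 0.6, 0.1, 0.0]      # a hypothetical West African population
natufian = [0.1, 0.0, 0.7, 0.5]
iam = [0.4, 0.3, 0.6, 0.5]          # shifted towards the African frequencies

value = f4(chimpanzee, african, natufian, iam)
print("f4(chimpanzee, African; Natufian, IAM) =", value)
```

With these toy numbers the statistic comes out positive because IAM is shifted towards the African frequencies relative to Natufian, which is the pattern the excerpt reads as evidence of an additional African component. Swapping the last two populations flips the sign.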

Genetic analyses have revealed that the population history of modern North Africans is quite complex (11). Based on our aDNA analysis, we identify an Early Neolithic Moroccan component that is (i) restricted to North Africa in present-day populations (11); (ii) the sole ancestry in IAM samples; and (iii) similar to the one observed in Later Stone Age samples from Morocco (17). We conclude that this component, distantly related to that of Epipaleolithic communities from the Levant, represents the autochthonous Maghrebi ancestry associated with Berber populations. Our data suggests that human populations were isolated in the Maghreb since Upper Paleolithic times. Our hypothesis is in agreement with archaeological research pointing to the first stage of the Neolithic expansion in Morocco as the result of a local population that adopted some technological innovations, such as pottery production or farming, from neighboring areas.

By 3,000 BCE, a continuity in the Neolithic spread brought Mediterranean-like ancestry to the Maghreb, most likely from Iberia. Other archaeological remains, such as African elephant ivory and ostrich eggs found in Iberian sites, confirm the existence of contacts and exchange networks through both sides of the Gibraltar strait at this time. Our analyses strongly support that at least some of the European ancestry observed today in North Africa is related to prehistoric migrations, and local Berber populations were already admixed with Europeans before the Roman conquest. Furthermore, additional European/ Iberian ancestry could have reached the Maghreb after KEB people; this scenario is supported by the presence of Iberian-like Bell-Beaker pottery in more recent stratigraphic layers of IAM and KEB caves. Future paleogenomic efforts in North Africa will further disentangle the complex history of migrations that forged the ancestry of the admixed populations we observe today.

Ancestry inference in ancient samples from North Africa and the Iberian Peninsula. (B) ADMIXTURE analysis using the Human Origins dataset (European, Middle Eastern, and North African populations) for modern and ancient samples (K = 8). (D) Detail of ADMIXTURE analysis using the Human Origins dataset (European, Middle Eastern, North African, and sub-Saharan African populations) for modern and ancient samples, including Taforalt.

Also, from the main author’s Twitter account:

I just realized that the paragraph with information on data availability is missing! Sequence data in the European Nucleotide Archive (PRJEB22699). Consensus mtDNA sequences are available at the National Center of Biotechnology Information (Accession Numbers MF991431-MF991448).

I find it hard to believe that this genetic continuity from Upper Palaeolithic to Late Neolithic could be representative of an autochthonous development of Afroasiatic. An important population movement – likely more than one – must be found in ancient DNA influencing North-Central and North-East Africa, probably during the time of the Green Sahara corridor.


Tales of Human Migration, Admixture, and Selection in Africa


Comprehensive review (behind paywall) Tales of Human Migration, Admixture, and Selection in Africa, by Carina M. Schlebusch & Mattias Jakobsson, Annual Review of Genomics and Human Genetics (2018), Vol. 19.

Abstract (emphasis mine):

In the last three decades, genetic studies have played an increasingly important role in exploring human history. They have helped to conclusively establish that anatomically modern humans first appeared in Africa roughly 250,000–350,000 years before present and subsequently migrated to other parts of the world. The history of humans in Africa is complex and includes demographic events that influenced patterns of genetic variation across the continent. Through genetic studies, it has become evident that deep African population history is captured by relationships among African hunter–gatherers, as the world’s deepest population divergences occur among these groups, and that the deepest population divergence dates to 300,000 years before present. However, the spread of pastoralism and agriculture in the last few thousand years has shaped the geographic distribution of present-day Africans and their genetic diversity. With today’s sequencing technologies, we can obtain full genome sequences from diverse sets of extant and prehistoric Africans. The coming years will contribute exciting new insights toward deciphering human evolutionary history in Africa.

Regarding potential Afroasiatic origins and expansions:

It is currently believed that farming practices in northeastern and eastern Africa developed independently in the Sahara/Sahel (around 7,000 BP) and the Ethiopian highlands (7,000–4,000 BP), while farming in the Nile River Valley developed as a consequence of the Neolithic Revolution in the Middle East (84). Northeastern and eastern African farmers today speak languages from the Afro-Asiatic and Nilo-Saharan linguistic groups, which is also reflected in their genetic affinities (Figure 3, K=6). In the northern parts of East Africa (South Sudan, Somalia, and Ethiopia), Nilo-Saharan and Afro-Asiatic speakers with farming lifeways have completely replaced hunter–gatherers. It is still largely unclear how farming and herding practices influenced the northeastern African prefarming population structure and whether the spread of farming is better explained by demic or cultural diffusion in this part of the world. Genetic studies of contemporary populations and aDNA have started to provide some insights into population continuity and incoming gene flow in this region of Africa.

Demographic model of African history and estimated divergences. (a) Population split times, hierarchy, and population sizes (summarized from 123). Horizontal width represents population size; horizontal colored lines represent migrations, with down-pointing triangles indicating admixture into another group. (b) Population structure analysis at 5 assumed ancestries (K=5) for 93 African and 6 non-African populations. Non-Africans (brown), East Africans (blue), West Africans (green), central African hunter–gatherers (light blue), and Khoe-San (red) populations are sorted according to their broad historical distributions.

For example, studies have shown that a back-migration from Eurasia into Africa affected most of northeastern and eastern Africa (36, 46, 53, 89, 132) (Figure 1b). A genetic baseline of eastern African ancestral genetic variation unaffected by recent Eurasian admixture and farming migrations within the last 4,500 years has been suggested in the form of the genome sequence of a 4,500-year-old individual from Mota, Ethiopia (36). Based on comparisons with the ancient Mota genome, we know that certain populations from northeastern Africa show deep continuity in their local area with very limited gene flow resulting from recent population movements. For example, the Nilotic herder populations from South Sudan (e.g., Dinka, Nuer, and Shilluk) appear to have remained relatively isolated over time and received little to no gene flow from Eurasians, West African Bantu-speaking farmers, and other surrounding groups (53) (Figures 2 and 3). By contrast, the Nubian and Arab populations to their north show gene flow with Eurasians, which has been connected to the Arab expansion (53). The Nubian, Arab, and Beja populations of northeastern Africa roughly display equal admixture fractions from a local northeastern African gene pool (similar to the Nilotic component) and an incoming Eurasian migrant component (53) (Figure 3). The Eurasian component has been linked to the Middle East and the Arab migration, but only the Arab groups shifted to the Semitic languages; the Nubians and Beja groups kept their original languages. The Eurasian gene flow appears to have spread from north to south along the Nile and Blue Nile in a succession of admixture events (53).

Skoglund and Mathieson’s preprint has also been published in the same volume, without meaningful changes.