Tales of Human Migration, Admixture, and Selection in Africa


Comprehensive review (behind paywall), Tales of Human Migration, Admixture, and Selection in Africa, by Carina M. Schlebusch & Mattias Jakobsson, Annual Review of Genomics and Human Genetics (2018), Vol. 19.

Abstract (emphasis mine):

In the last three decades, genetic studies have played an increasingly important role in exploring human history. They have helped to conclusively establish that anatomically modern humans first appeared in Africa roughly 250,000–350,000 years before present and subsequently migrated to other parts of the world. The history of humans in Africa is complex and includes demographic events that influenced patterns of genetic variation across the continent. Through genetic studies, it has become evident that deep African population history is captured by relationships among African hunter–gatherers, as the world’s deepest population divergences occur among these groups, and that the deepest population divergence dates to 300,000 years before present. However, the spread of pastoralism and agriculture in the last few thousand years has shaped the geographic distribution of present-day Africans and their genetic diversity. With today’s sequencing technologies, we can obtain full genome sequences from diverse sets of extant and prehistoric Africans. The coming years will contribute exciting new insights toward deciphering human evolutionary history in Africa.

Regarding potential Afroasiatic origins and expansions:

It is currently believed that farming practices in northeastern and eastern Africa developed independently in the Sahara/Sahel (around 7,000 BP) and the Ethiopian highlands (7,000–4,000 BP), while farming in the Nile River Valley developed as a consequence of the Neolithic Revolution in the Middle East (84). Northeastern and eastern African farmers today speak languages from the Afro-Asiatic and Nilo-Saharan linguistic groups, which is also reflected in their genetic affinities (Figure 3, K=6). In the northern parts of East Africa (South Sudan, Somalia, and Ethiopia), Nilo-Saharan and Afro-Asiatic speakers with farming lifeways have completely replaced hunter–gatherers. It is still largely unclear how farming and herding practices influenced the northeastern African prefarming population structure and whether the spread of farming is better explained by demic or cultural diffusion in this part of the world. Genetic studies of contemporary populations and aDNA have started to provide some insights into population continuity and incoming gene flow in this region of Africa.

Demographic model of African history and estimated divergences. (a) Population split times, hierarchy, and population sizes (summarized from 123). Horizontal width represents population size; horizontal colored lines represent migrations, with down-pointing triangles indicating admixture into another group. (b) Population structure analysis at 5 assumed ancestries (K=5) for 93 African and 6 non-African populations. Non-Africans (brown), East Africans (blue), West Africans (green), central African hunter–gatherers (light blue), and Khoe-San (red) populations are sorted according to their broad historical distributions.

For example, studies have shown that a back-migration from Eurasia into Africa affected most of northeastern and eastern Africa (36, 46, 53, 89, 132) (Figure 1b). A genetic baseline of eastern African ancestral genetic variation unaffected by recent Eurasian admixture and farming migrations within the last 4,500 years has been suggested in the form of the genome sequence of a 4,500-year-old individual from Mota, Ethiopia (36). Based on comparisons with the ancient Mota genome, we know that certain populations from northeastern Africa show deep continuity in their local area with very limited gene flow resulting from recent population movements. For example, the Nilotic herder populations from South Sudan (e.g., Dinka, Nuer, and Shilluk) appear to have remained relatively isolated over time and received little to no gene flow from Eurasians, West African Bantu-speaking farmers, and other surrounding groups (53) (Figures 2 and 3). By contrast, the Nubian and Arab populations to their north show gene flow with Eurasians, which has been connected to the Arab expansion (53). The Nubian, Arab, and Beja populations of northeastern Africa roughly display equal admixture fractions from a local northeastern African gene pool (similar to the Nilotic component) and an incoming Eurasian migrant component (53) (Figure 3). The Eurasian component has been linked to the Middle East and the Arab migration, but only the Arab groups shifted to the Semitic languages; the Nubians and Beja groups kept their original languages. The Eurasian gene flow appears to have spread from north to south along the Nile and Blue Nile in a succession of admixture events (53).

Skoglund and Mathieson’s preprint has also been published in the same volume, without meaningful changes.


The origin and expansion of Pama–Nyungan languages across Australia

Yet another questionable paper by Nature, The origin and expansion of Pama–Nyungan languages across Australia, by Bouckaert, Bowern & Atkinson, Nat Ecol Evol (2018).


It remains a mystery how Pama–Nyungan, the world’s largest hunter-gatherer language family, came to dominate the Australian continent. Some argue that social or technological advantages allowed rapid language replacement from the Gulf Plains region during the mid-Holocene. Others have proposed expansions from refugia linked to climatic changes after the last ice age or, more controversially, during the initial colonization of Australia. Here, we combine basic vocabulary data from 306 Pama–Nyungan languages with Bayesian phylogeographic methods to explicitly model the expansion of the family across Australia and test between these origin scenarios. We find strong and robust support for a Pama–Nyungan origin in the Gulf Plains region during the mid-Holocene, implying rapid replacement of non-Pama–Nyungan languages. Concomitant changes in the archaeological record, together with a lack of strong genetic evidence for Holocene population expansion, suggests that Pama–Nyungan languages were carried as part of an expanding package of cultural innovations that probably facilitated the absorption and assimilation of existing hunter-gatherer groups.

“Diversification of the Pama–Nyungan language family. Maximum clade credibility tree showing the inferred timing and emergence of the major branches and their subsequent diversification.”

Even with my absolute lack of knowledge of Australian languages, I am not convinced. Not at all.

I have already expressed more than once my opinion on Glottochronology – and the improved method of this paper seems like the final twist of the screw for its strongest proponents.

Interestingly, this paper shares the same journal, author, and (mostly) method with the famous Language-tree divergence times support the Anatolian theory of Indo-European origin (2003).

And we have also seen how most suggested prehistorical cultural diffusion events were actually migrations, so it seems rather odd to dare publish this right now.

At a time when groundbreaking genomic papers are being published on South-East Asian migrations, with more probably to come on the region (including Australia), this paper seems to me quite unnecessary.

It will especially not help Nature make people forget its latest fiasco on Indo-European migrations.

Population turnover in remote Oceania shortly after initial settlement


Interesting preprint at BioRxiv by the team of the Reich lab, Population Turnover in Remote Oceania Shortly After Initial Settlement, by Mark Lipson, Pontus Skoglund, Matthew Spriggs, et al. (2018).

Abstract (emphasis mine):

Ancient DNA analysis of three individuals dated to ~3000 years before present (BP) from Vanuatu and one ~2600 BP individual from Tonga has revealed that the first inhabitants of Remote Oceania (“First Remote Oceanians”) were almost entirely of East Asian ancestry, and thus their ancestors passed New Guinea, the Bismarck Archipelago, and the Solomon Islands with minimal admixture with the Papuan groups they encountered. However, all present-day populations in Near and Remote Oceania harbor 25-100% Papuan ancestry, implying that there must have been at least one later stream of migration eastward from Near Oceania. We generated genome-wide data for 14 ancient individuals from Efate and Epi Islands in Vanuatu ranging from 3,000-150 BP, along with 185 present-day Vanuatu individuals from 18 islands. We show that people of almost entirely Papuan ancestry had arrived in Vanuatu by 2400 BP, an event that coincided with the end of the Lapita cultural period, changes in skeletal morphology, and the cessation of long-distance trade between Near and Remote Oceania. First Remote Oceanian ancestry subsequently increased via admixture but remains at 10-20% in most islands. Through a fine-grained comparison of ancestry profiles in Vanuatu and Polynesia with diverse groups in Near Oceania, we find that Papuan ancestry in Vanuatu is consistent with deriving from the Bismarck Archipelago instead of the geographically closer Solomon Islands. Papuan ancestry in Polynesia also shows connections to the ancestry profiles present in the Bismarck Archipelago but is more similar to Tolai from New Britain and Tutuba from Vanuatu than to the ancient Vanuatu individuals and the great majority of present-day Vanuatu populations. This suggests a third eastward stream of migration from Near to Remote Oceania bringing a different type of Papuan ancestry.

Admixture graph model with inferred parameters, alternative visualization. Branch lengths are given in units of f2 genetic drift distance times 1000, and admixture proportions are indicated along corresponding dotted lines. Red, Solomon Islands majority source; blue, Bismarck Archipelago majority source; purple, New Guinea-related source; green, First Remote Oceanian; brown, mixed ancestry. The order of admixture events specified is arbitrary.

Indo-European and Central Asian admixture in Indian population, dependent on ethnolinguistic and geodemographic divisions


Preprint paper at BioRxiv, Dissecting Population Substructure in India via Correlation Optimization of Genetics and Geodemographics, by Bose et al. (2017), a mixed group from Purdue University and IBM TJ Watson Research Center. A rather simple paper, which is nevertheless interesting in its approach to the known multiple Indian demographic divisions, and in its short reported methods and results.


India represents an intricate tapestry of population substructure shaped by geography, language, culture and social stratification operating in concert. To date, no study has attempted to model and evaluate how these evolutionary forces have interacted to shape the patterns of genetic diversity within India. Geography has been shown to closely correlate with genetic structure in other parts of the world. However, the strict endogamy imposed by the Indian caste system, and the large number of spoken languages add further levels of complexity. We merged all publicly available data from the Indian subcontinent into a data set of 835 individuals across 48,373 SNPs from 84 well-defined groups. Bringing together geography, sociolinguistics and genetics, we developed COGG (Correlation Optimization of Genetics and Geodemographics) in order to build a model that optimally explains the observed population genetic sub-structure. We find that shared language rather than geography or social structure has been the most powerful force in creating paths of gene flow within India. Further investigating the origins of Indian substructure, we create population genetic networks across Eurasia. We observe two major corridors towards mainland India; one through the Northwestern and another through the Northeastern frontier with the Uygur population acting as a bridge across the two routes. Importantly, network, ADMIXTURE analysis and f3 statistics support a far northern path connecting Europe to Siberia and gene flow from Siberia and Mongolia towards Central Asia and India.

Among the most interesting results (emphasis mine):

Our meta-analysis of the ADMIXTURE output shows that the IE and DR populations across castes shared very high ancestry, indicating the autochthonous origin of the caste system in India (Figure 2). f3 statistics show that most of the castes and tribes in India are admixed, with contributions from other castes and/or tribes, across languages affiliations (Supplementary Table 4 and Supplementary Note). The geographically isolated Tibeto-Burman tribes and the Dravidian speaking tribes appear to be the most isolated in India. Linear Discriminant Analysis on the normalized data set clearly supports genetic stratification by castes and languages in the Indian sub-continent.


Our meta-analysis of the ADMIXTURE plot in Figure 4A quantifies the ADMIXTURE results (darker colors indicate higher pairwise shared ancestry). Indian populations show a greater proportion of shared ancestry with the so-called Indian Northwestern Frontier populations, namely the tribal populations spanning Afghanistan and Pakistan. Central Asian populations share higher degrees of ancestry with IE and DR Forward castes. Uygurs share high degrees of ancestry with Indian populations.


f3 statistics (all negative Z-scores are shown) indicate Chinese and Siberian ancestry contributing to the Tibeto-Burman tribal speakers. On the other hand, the Mongols and the Europeans have contributed significant amounts of ancestry to the Indo-European and Tibeto-Burman forward castes. F3 statistics also show that the Central Asians are an admixed population with signs of admixture from Caucasus and other parts of Europe.
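For readers unfamiliar with why only negative Z-scores matter here, the following is a minimal sketch of the standard f3(C; A, B) admixture test, not the paper's actual pipeline. It assumes per-SNP allele frequencies are already available as plain lists, leaves out the finite-sample bias correction that real tools apply, and uses made-up toy frequencies; the names f3, f3_zscore, src_a, src_b and mix are all hypothetical.

```python
from math import sqrt

def f3(c, a, b):
    """Naive f3(C; A, B): mean over SNPs of (c - a) * (c - b)."""
    return sum((ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)) / len(c)

def f3_zscore(c, a, b, n_blocks=5):
    """Z-score of f3 from a delete-one-block jackknife over contiguous SNP blocks."""
    n = len(c)
    size = n // n_blocks
    estimates = []
    for k in range(n_blocks):
        lo, hi = k * size, (k + 1) * size
        # Recompute f3 with the k-th block of SNPs left out.
        keep = [i for i in range(n) if not (lo <= i < hi)]
        estimates.append(f3([c[i] for i in keep],
                            [a[i] for i in keep],
                            [b[i] for i in keep]))
    mean = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    return f3(c, a, b) / sqrt(var)

# Toy allele frequencies (invented for illustration): two source
# populations and a target built as an exact 50/50 mixture of them.
src_a = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.1, 0.6, 0.4, 0.9]
src_b = [0.8, 0.2, 0.7, 0.1, 0.9, 0.2, 0.6, 0.1, 0.9, 0.3]
mix = [(x + y) / 2 for x, y in zip(src_a, src_b)]

print("f3(mix; a, b) =", f3(mix, src_a, src_b))
print("Z =", f3_zscore(mix, src_a, src_b))
```

When the target sits between the two sources at most SNPs, the product (c - a)(c - b) is negative on average, so a significantly negative f3 is taken as evidence that C is admixed between groups related to A and B. A positive value, by contrast, does not rule admixture out.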

Among the results for proportions of shared ancestry between Indians and Eurasians (FIG. 4), there is an obvious influence of European admixture (Caucasus, and Southern, Central, and Northern EU), potentially from the Yamna-Corded Ware expansion, in IE_ForwardCaste, which is lessened in IE_BackwardCaste and also in IE_Tribal, while DR_ForwardCaste shows again more admixture than IE_Tribal, but diminishing with lower castes and quite low in DR_Tribal.

Ancestry from Central Asia is strong with a similar pattern, which hints at the influence of Sintashta, Andronovo, and BMAC in the expansion of the Steppe component, even more than a later Turkic component.

On the other hand, the influence from Turkey is difficult to assess, given the complex genetic history of Anatolia, but the map contained in Fig. 6 doesn’t feel right, not only from a genetic viewpoint, but also from linguistic and archaeological points of view. This is the typical map created with admixture analyses that goes wrong because it does not take anthropological theories into account.

Quite interesting, then, is the influence of admixture in these different ethnolinguistic groups, Indo-European and Dravidian, which points to an initially greater expansion of Indo-European speakers and a later resurgence of Dravidian languages.

Featured image contains simplified origin and data of samples studied, from the article.


About the European Union’s arcane language: the EU does seem difficult for people to understand

Mark Mardell asks in his post Learn EU-speak:

Does the EU shroud itself in obscure language on purpose or does any work of detail produce its own arcane language? Of course it is not just the lingo: the EU does seem difficult for people to understand. What’s at the heart of the problem?

His answer on the radio (like the comments that can be read on his blog) will probably look for complex reasoning on the nature of the European Union as an elitist institution, distant from real people, on the “obscure language” (intentionally?) used by MEPs, on the need for that language to be obscured by legal terms, etc.

All that is great. You can talk a lot about the possible reasons why people would find those Europarliament discussions where everyone speaks his own national language too boring; possible reasons why important media (like the BBC) would never show debates on important issues, unless the MEP uses their national language; possible reasons why that doesn’t happen with national parliaments where everyone speaks a common language…

But the most probable answer is so obvious it doesn’t really make sense to ask. The interesting question is: do people actually want to pay the price for having a common Europe?

Five lines of ancient script on a shard of pottery could be the longest proto-Canaanite text ever found, archaeologists say

According to the BBC News article ‘Oldest Hebrew script’ is found:

The shard was found by a teenage volunteer during a dig about 20km (12 miles) south-west of Jerusalem. Experts at Hebrew University said dating showed it was written 3,000 years ago – about 1,000 years earlier than the Dead Sea Scrolls. Other scientists cautioned that further study was needed to understand it.

Preliminary investigations since the shard was found in July have deciphered some words, including judge, slave and king. The characters are written in Proto-Canaanite, a precursor of the Hebrew alphabet.

I found it interesting because of the implications that these findings might have on classifications of dead languages into more natural or artificial regarding the knowledge we have of them, especially about proto-languages like Proto-Canaanite (or Europe’s Indo-European), which can easily move from category 9 (‘hypothetical language’) to category 8 or even 7 (‘dead language’).

As we have said before, this implies that, despite the efforts of some conlangers to make their newly created conlangs (look) the same as proto-languages like PIE – in the sense of ‘artificiality’, they obviously aren’t.