The importance of fine-scale studies for integrating palaeogenomics and archaeology


Short review (behind paywall) The importance of fine-scale studies for integrating paleogenomics and archaeology, by Krishna R. Veeramah, Current Opinion in Genetics & Development (2018) 53:83-89.

Abstract (emphasis mine):

There has been an undercurrent of intellectual tension between geneticists studying human population history and archaeologists for almost 40 years. The rapid development of paleogenomics, with geneticists working on the very material discovered by archaeologists, appears to have recently heightened this tension. The relationship between these two fields thus far has largely been of a multidisciplinary nature, with archaeologists providing the raw materials for sequencing, as well as a scaffold of hypotheses based on interpretation of archaeological cultures from which the geneticists can ground their inferences from the genomic data. Much of this work has taken place in the context of western Eurasia, which is acting as testing ground for the interaction between the disciplines. Perhaps the major finding has not been any particular historical episode, but rather the apparent pervasiveness of migration events, some apparently of substantial scale, over the past ∼5000 years, challenging the prevailing view of archaeology that largely dismissed migration as a driving force of cultural change in the 1960s. However, while the genetic evidence for ‘migration’ is generally statistically sound, the description of these events as structured behaviours is lacking, which, coupled with often over simplistic archaeological definitions, prevents the use of this information by archaeologists for studying the social processes they are interested in. In order to integrate paleogenomics and archaeology in a truly interdisciplinary manner, it will be necessary to focus less on grand narratives over space and time, and instead integrate genomic data with other form of archaeological information at the level of individual communities to understand the internal social dynamics, which can then be connected amongst communities to model migration at a regional level. 
A smattering of recent studies have begun to follow this approach, resulting in inferences that are not only helping ask questions that are currently relevant to archaeologists, but also potentially opening up new avenues of research.

Interesting excerpts (emphasis mine, reference numbers removed for clarity):

There are two major, somewhat intertwined, problems that currently exist.

First, archaeologists are not critiquing whether the migrations identified by paleogenomics using sophisticated population genetic machinery are actually occurring. Instead, the technical criticism arrives in terms of how these migrations are being ascribed to specific cultures. In many paleogenomic papers, there is a tendency (and often an analytical and technical need) to associate samples with particular archaeological cultures, for which all samples are then treated as possessing some kind homogenous and pervasive social identity that is bound in space and time. The major critiques of this thus far have been directed to those studies examining Corded-Ware and Bell-Beaker-related individuals and their potential relationship to the Yamnaya [Vander Linden (2016), Heyd (2017), Furholt (2017)], but are applicable to many other ‘migration’ scenarios described in the recent literature. This is compounded by the use of sometimes small numbers of samples to represent certain cultures from a particular geographic area as representatives of the entire culture at a supra-regional level. Yet often these archaeological cultures such as Corded-Ware and Bell-Beaker themselves show considerable variability in space and time, and even within cemeteries, which is not factored into the genetic analysis.

From a population geneticists point of view, this kind of simplification is somewhat understandable and will often likely have very little impact on the final analysis, given that the primary goal is usually to use ancient samples to better understand modern genetic variation. Though there may be a specific historical interest in some of these past events, I would argue that the aim for most population geneticists at a higher level is to try and fit modern patterns of genetic variation using the simplest models possible that take into account past demographic events (for example fitting f-statistics using the ADMIXTUREGRAPH approach), as this is how we are trained. Although sharing an archaeological culture may not mean that a set of individuals are part of the same homogeneous social group in reality, this approach may be a good enough heuristic to find broad genetic connections compared to another group represented by a different culture, which can then ultimately help understand and model modern human population structure. However, for an archaeologists interested in the ancient individuals themselves and their social identity, this lumping is unsatisfactory, where sophisticated narratives of the individual migrants and their ancient communities are the intended goal.

From the paper. Barplot showing cumulative number of ancient Eurasian genomes published on a yearly basis up to 8th July 2018. Includes samples undergoing both whole genome shotgun and SNP capture sequencing.

The second related problem is that ‘migration’ in the sense used currently in the paleogenomics literature lacks sufficient detail to be of much use for an archaeologists attempting to disentangle the complex social dynamics within and between communities. To truly understand the role of migration as a social process and its contribution towards cultural changes, it is necessary to describe it as a structured behaviour, rather than treating it as an explanatory ‘black box’. Are the migrations occurring as a result of short range waves-of-advance movements, or as long-distance movements via leapfrogging models or stream migrations along established routes dependent on key kinship networks. Are there return migrants, and are some subset of individuals more predisposed to migration driving the signals? Although such models were implemented in past studies (even with classical markers [1]) and are part of the population genetics literature, they are lacking in the current paleogenomics literature when discussing migration. The finding that there is an increase of 12.3% of ancestry type X in population A compared to the preceding population B that is suggestive of a migration, is not particularly useful for examining these kind of models. It is also unclear to what degree standard population genetic parameters estimated from genomic data such as effective population size, Ne, and gene flow are relevant to models studied in archaeology, given they reflect (somewhat undefined) long-term population sizes and average rates of movements over time, rather than reflecting any kind of reality of census size and mobility in the ancient communities the archaeologists are actually attempting to study.

The text goes on to talk about ways of studying fine-grained social dynamics of local cultures, such as:

define levels of genetic relatedness, but also in terms of material culture, age, sex, stress and activity indicators, stable isotopes for diet reconstruction (nitrogen, d13C and d15N, carbon, 13C/12C) and strontium and oxygen isotopes for mobility (87Sr/86Sr, d18O). Where possible, sites should be examined over multiple generations. In addition it will be incredibly useful to characterize the impact of disease in these communities, which is also proving to be a highly fruitful realm for paleogenomics.

I would say that the main problem is not the obvious limitations of palaeogenomics in terms of identifying prehistoric ethnolinguistic communities and their evolution, which is why it is just another tool to complement archaeology and linguistics. The main problem is the narrow understanding that some people have of the inherent limitations of palaeogenomics (especially when it suits them) when publicizing simplistic conclusions based on these tools and their results. And I am not referring only to amateurs.


Contrastive principal component analysis (cPCA) to explore patterns specific to a dataset

Interesting open access paper Exploring patterns enriched in a dataset with contrastive principal component analysis, by Abid, Zhang, Bagaria & Zou, Nature Communications (2018) 9:2134.

Abstract (emphasis mine):

Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used.

Schematic Overview of cPCA. To perform cPCA, compute the covariance matrices C_X, C_Y of the target and background datasets. The singular vectors of the weighted difference of the covariance matrices, C_X − α · C_Y, are the directions returned by cPCA. As shown in the scatter plot on the right, PCA (on the target data) identifies the direction that has the highest variance in the target data, while cPCA identifies the direction that has a higher variance in the target data as compared to the background data. Projecting the target data onto the latter direction gives patterns unique to the target data and often reveals structure that is missed by PCA. Specifically, in this example, reducing the dimensionality of the target data by cPCA would reveal two distinct clusters.

The Mexican example caught my attention:

Relationship between ancestral groups in Mexico

In previous examples, we have seen that cPCA allows the user to discover subclasses within a target dataset that are not labeled a priori. However, even when subclasses are known ahead of time, dimensionality reduction can be a useful way to visualize the relationship within groups. For example, PCA is often used to visualize the relationship between ethnic populations based on genetic variants, because projecting the genetic variants onto two dimensions often produces maps that offer striking visualizations of geographic and historic trends [26, 27]. But again, PCA is limited to identifying the most dominant structure; when this represents universal or uninteresting variation, cPCA can be more effective at visualizing trends.

The dataset that we use for this example consists of single nucleotide polymorphisms (SNPs) from the genomes of individuals from five states in Mexico, collected in a previous study [28]. Mexican ancestry is challenging to analyze using PCA since the PCs usually do not reflect geographic origin within Mexico; instead, they reflect the proportion of European/Native American heritage of each Mexican individual, which dominates and obscures differences due to geographic origin within Mexico (see Fig. 4a). To overcome this problem, population geneticists manually prune SNPs, removing those known to derive from European ancestry, before applying PCA. However, this procedure is of limited applicability since it requires knowing the origin of the SNPs and that the source of background variation to be very different from the variation of interest, which are often not the case.

Relationship between Mexican ancestry groups. a PCA applied to genetic data from individuals from 5 Mexican states does not reveal any visually discernible patterns in the embedded data. b cPCA applied to the same dataset reveals patterns in the data: individuals from the same state are clustered closer together in the cPCA embedding. c Furthermore, the distribution of the points reveals relationships between the groups that matches the geographic location of the different states: for example, individuals from geographically adjacent states are adjacent in the embedding. c Adapted from a map of Mexico that is originally the work of User:Allstrak at Wikipedia, published under a CC-BY-SA license, sourced from

As an alternative, we use cPCA with a background dataset that consists of individuals from Mexico and from Europe. This background is dominated by Native American/European variation, allowing us to isolate the intra-Mexican variation in the target dataset. The results of applying cPCA are shown in Fig. 4b. We find that individuals from the same state in Mexico are embedded closer together. Furthermore, the two groups that are the most divergent are the Sonorans and the Mayans from Yucatan, which are also the most geographically distant within Mexico, while Mexicans from the other three states are close to each other, both geographically as well as in the embedding captured by cPCA (see Fig. 4c). See also Supplementary Fig. 6 for more details.

So, by using a background dataset, cPCA uncovers patterns in a single target dataset that standard dimensionality reduction techniques miss. This may be useful for some prehistoric populations, too…

They have released a Python implementation of cPCA on GitHub, including Python notebooks and datasets.
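Independently of the authors' released implementation, the core step described in the schematic (eigen-decomposition of C_X − α · C_Y) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `cpca_directions` and `cpca_transform`, the toy data, and the choice of α = 2 are all hypothetical, and directions are ranked here by largest eigenvalue, which is my reading of the caption rather than a detail confirmed above.

```python
import numpy as np

def cpca_directions(target, background, alpha=1.0, n_components=2):
    """Top contrastive directions (as columns) of `target` relative to `background`."""
    # Center each dataset on its own mean before computing covariances.
    X = target - target.mean(axis=0)
    Y = background - background.mean(axis=0)
    C_X = X.T @ X / (X.shape[0] - 1)
    C_Y = Y.T @ Y / (Y.shape[0] - 1)
    # The weighted difference is symmetric, so eigh applies.
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(C_X - alpha * C_Y)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]

def cpca_transform(target, background, alpha=1.0, n_components=2):
    """Project the centered target data onto the contrastive directions."""
    V = cpca_directions(target, background, alpha, n_components)
    return (target - target.mean(axis=0)) @ V

# Toy data (hypothetical): axis 0 carries large variation shared by both
# datasets; axis 1 carries a two-cluster signal present only in the target.
rng = np.random.default_rng(0)
n = 500
scales = np.array([5.0, 0.5, 0.5])
background = rng.normal(size=(n, 3)) * scales
labels = rng.integers(0, 2, size=n)
target = rng.normal(size=(n, 3)) * scales
target[:, 1] += np.where(labels == 1, 2.0, -2.0)

# alpha = 0 reduces to ordinary PCA on the target data.
pca_dir = cpca_directions(target, background, alpha=0.0, n_components=1)[:, 0]
cpca_dir = cpca_directions(target, background, alpha=2.0, n_components=1)[:, 0]
embedding = cpca_transform(target, background, alpha=2.0, n_components=2)
```

In this toy setup the PCA direction follows the shared high-variance axis, while the contrastive direction follows the axis carrying the target-only cluster signal, which is the same logic as using a Mexican plus European background to suppress the European/Native American axis. In practice α has to be explored over a range of values rather than fixed in advance.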


Tales of Human Migration, Admixture, and Selection in Africa


Comprehensive review (behind paywall) Tales of Human Migration, Admixture, and Selection in Africa, by Carina M. Schlebusch & Mattias Jakobsson, Annual Review of Genomics and Human Genetics (2018), Vol. 19.

Abstract (emphasis mine):

In the last three decades, genetic studies have played an increasingly important role in exploring human history. They have helped to conclusively establish that anatomically modern humans first appeared in Africa roughly 250,000–350,000 years before present and subsequently migrated to other parts of the world. The history of humans in Africa is complex and includes demographic events that influenced patterns of genetic variation across the continent. Through genetic studies, it has become evident that deep African population history is captured by relationships among African hunter–gatherers, as the world’s deepest population divergences occur among these groups, and that the deepest population divergence dates to 300,000 years before present. However, the spread of pastoralism and agriculture in the last few thousand years has shaped the geographic distribution of present-day Africans and their genetic diversity. With today’s sequencing technologies, we can obtain full genome sequences from diverse sets of extant and prehistoric Africans. The coming years will contribute exciting new insights toward deciphering human evolutionary history in Africa.

Regarding potential Afroasiatic origins and expansions:

It is currently believed that farming practices in northeastern and eastern Africa developed independently in the Sahara/Sahel (around 7,000 BP) and the Ethiopian highlands (7,000–4,000 BP), while farming in the Nile River Valley developed as a consequence of the Neolithic Revolution in the Middle East (84). Northeastern and eastern African farmers today speak languages from the Afro-Asiatic and Nilo-Saharan linguistic groups, which is also reflected in their genetic affinities (Figure 3, K=6). In the northern parts of East Africa (South Sudan, Somalia, and Ethiopia), Nilo-Saharan and Afro-Asiatic speakers with farming lifeways have completely replaced hunter–gatherers. It is still largely unclear how farming and herding practices influenced the northeastern African prefarming population structure and whether the spread of farming is better explained by demic or cultural diffusion in this part of the world. Genetic studies of contemporary populations and aDNA have started to provide some insights into population continuity and incoming gene flow in this region of Africa.

Demographic model of African history and estimated divergences. (a) Population split times, hierarchy, and population sizes (summarized from 123). Horizontal width represents population size; horizontal colored lines represent migrations, with down-pointing triangles indicating admixture into another group. (b) Population structure analysis at 5 assumed ancestries (K=5) for 93 African and 6 non-African populations. Non-Africans (brown), East Africans (blue), West Africans (green), central African hunter–gatherers (light blue), and Khoe-San (red) populations are sorted according to their broad historical distributions.

For example, studies have shown that a back-migration from Eurasia into Africa affected most of northeastern and eastern Africa (36, 46, 53, 89, 132) (Figure 1b). A genetic baseline of eastern African ancestral genetic variation unaffected by recent Eurasian admixture and farming migrations within the last 4,500 years has been suggested in the form of the genome sequence of a 4,500-year-old individual from Mota, Ethiopia (36). Based on comparisons with the ancient Mota genome, we know that certain populations from northeastern Africa show deep continuity in their local area with very limited gene flow resulting from recent population movements. For example, the Nilotic herder populations from South Sudan (e.g., Dinka, Nuer, and Shilluk) appear to have remained relatively isolated over time and received little to no gene flow from Eurasians, West African Bantu-speaking farmers, and other surrounding groups (53) (Figures 2 and 3). By contrast, the Nubian and Arab populations to their north show gene flow with Eurasians, which has been connected to the Arab expansion (53). The Nubian, Arab, and Beja populations of northeastern Africa roughly display equal admixture fractions from a local northeastern African gene pool (similar to the Nilotic component) and an incoming Eurasian migrant component (53) (Figure 3). The Eurasian component has been linked to the Middle East and the Arab migration, but only the Arab groups shifted to the Semitic languages; the Nubians and Beja groups kept their original languages. The Eurasian gene flow appears to have spread from north to south along the Nile and Blue Nile in a succession of admixture events (53).

Skoglund and Mathieson’s preprint has also been published in the same volume, without meaningful changes.


Genetic structure, divergence and admixture of Han Chinese, Japanese and Korean populations


Open access Genetic structure, divergence and admixture of Han Chinese, Japanese and Korean populations, by Wang, Lu, Chung, and Xu, Hereditas (2018) 155:19.

Abstract (emphasis mine):

Han Chinese, Japanese and Korean, the three major ethnic groups of East Asia, share many similarities in appearance, language and culture etc., but their genetic relationships, divergence times and subsequent genetic exchanges have not been well studied.

We conducted a genome-wide study and evaluated the population structure of 182 Han Chinese, 90 Japanese and 100 Korean individuals, together with the data of 630 individuals representing 8 populations worldwide. Our analyses revealed that Han Chinese, Japanese and Korean populations have distinct genetic makeup and can be well distinguished based on either the genome wide data or a panel of ancestry informative markers (AIMs). Their genetic structure corresponds well to their geographical distributions, indicating geographical isolation played a critical role in driving population differentiation in East Asia. The most recent common ancestor of the three populations was dated back to 3000 ~ 3600 years ago. Our analyses also revealed substantial admixture within the three populations which occurred subsequent to initial splits, and distinct gene introgression from surrounding populations, of which northern ancestral component is dominant.

These estimations and findings facilitate to understanding population history and mechanism of human genetic diversity in East Asia, and have implications for both evolutionary and medical studies.

Population level phylogenetic Tree and Principal component analysis (PCA). (A) The maximum likelihood tree was constructed based on pair-wise FST matrix. And the marked number are bootstrap value; (B) The top two PCs of individuals representing six East Asian populations, mapped to their corresponding geographic locations (generated by R 2.15.2 and Microsoft Excel 2010)

Interesting excerpts:

It is obvious that the genetic difference among the three East Asian groups initially resulted from population divergence due to pre-historical or historical migrations. Subsequently, different geographical locations where the three populations are located, mainland of China, Korean Peninsular and Japanese archipelago, respectively, apparently facilitated population differentiation due to physical isolation and independent genetic drift. Our estimations of population divergence time among the three groups, 1.2~ 3.6 KYA, are largely consistent with known history of the three populations and those related. However, considering that recent admixture could have reduced genetic difference between populations, it is likely the divergence time was underestimated.

We detected substantial gene flow among the three populations and also from the surrounding populations. For example, based on our analysis with the F3 test, Korean received gene flow from Han Chinese and Japanese, and gene flow also happened between Han Chinese and Japanese (Additional file 12: Table S3). These gene flows are expected to have reduced the genetic differentiation between the three ethnic groups. On the other hand, we also detected considerable gene flow from surrounding populations to the three populations studied. For instance, an ancestral population represented by Ryukyuan have contributed greater to Japanese than to Han Chinese, while southern ethnic group like Dai have contributed more to continent populations than to island and peninsula populations. Contrary to the gene flow among the three populations, these gene flows from surrounding populations are expected to have increased genetic difference among the three populations if they occurred independently and from different source populations. According to our results, the major source of gene flow to the three ethnic groups were substantially different, for example, the major source of gene flow to Han Chinese was from southern ethnic groups, the major source of gene flow to Japanese was from southern islands, and the major source of gene flow to Korean were from both mainland and islands. Therefore, those gene flows might have significantly contributed to further genetic differentiation of the three populations.

The three populations have similar but not identical demographical history; they all experience a strong population expansion in the last 20,000 years. However, according to different geographic distribution, their effective population size and population expansion are different.
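For readers unfamiliar with the F3 test mentioned in the excerpt above: the statistic f3(C; A, B) is the average over SNPs of (c − a)(c − b), where a, b and c are allele frequencies in two candidate sources and the target. A clearly negative value is taken as evidence that C is admixed between populations related to A and B. The sketch below is a naive illustration only: the function name `f3_naive` and the toy frequencies are hypothetical, and it omits the finite-sample bias correction and block-jackknife standard errors that published analyses rely on. The excerpt does not say which estimator or software the authors used.

```python
import numpy as np

def f3_naive(target_freq, source1_freq, source2_freq):
    """Naive f3(target; source1, source2) from per-SNP allele frequencies.

    No finite-sample bias correction and no standard error are computed.
    """
    c = np.asarray(target_freq, dtype=float)
    a = np.asarray(source1_freq, dtype=float)
    b = np.asarray(source2_freq, dtype=float)
    return float(np.mean((c - a) * (c - b)))

# Toy allele frequencies at three SNPs (hypothetical values).
source_a = np.array([0.1, 0.8, 0.3])
source_b = np.array([0.9, 0.2, 0.7])

# A target built as a 50/50 mixture of the two sources gives a negative f3.
mixed_target = 0.5 * source_a + 0.5 * source_b
f3_mixed = f3_naive(mixed_target, source_a, source_b)

# A target that is not intermediate between the sources gives a positive f3 here.
other_target = np.array([0.1, 0.1, 0.1])
f3_other = f3_naive(other_target, source_a, source_b)
```

For a perfect mixture with proportion w, (c − a)(c − b) equals −w(1 − w)(a − b)² at every SNP, which is why the statistic turns negative. Genetic drift in the target after admixture pushes it back toward positive values, so a non-negative f3 does not rule out admixture.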

Although based on modern populations, the study is interesting in light of the potential implications for a Macro-Altaic proposal.


Distribution of Southern Iberian haplogroup H indicates exchanges in the western Mediterranean

Recent open access paper The distribution of mitochondrial DNA haplogroup H in southern Iberia indicates ancient human genetic exchanges along the western edge of the Mediterranean, by Hernández, Dugoujon, Novelletto, Rodríguez, Cuesta and Calderón, BMC Genetics (2017).

Abstract (emphasis mine):

The structure of haplogroup H reveals significant differences between the western and eastern edges of the Mediterranean, as well as between the northern and southern regions. Human populations along the westernmost Mediterranean coasts, which were settled by individuals from two continents separated by a relatively narrow body of water, show the highest frequencies of mitochondrial haplogroup H. These characteristics permit the analysis of ancient migrations between both shores, which may have occurred via primitive sea crafts and early seafaring. We collected a sample of 750 autochthonous people from the southern Iberian Peninsula (Andalusians from Huelva and Granada provinces). We performed a high-resolution analysis of haplogroup H by control region sequencing and coding SNP screening of the 337 individuals harboring this maternal marker. Our results were compared with those of a wide panel of populations, including individuals from Iberia, the Maghreb, and other regions around the Mediterranean, collected from the literature.

Both Andalusian subpopulations showed a typical western European profile for the internal composition of clade H, but eastern Andalusians from Granada also revealed interesting traces from the eastern Mediterranean. The basal nodes of the most frequent H sub-haplogroups, H1 and H3, harbored many individuals of Iberian and Maghrebian origins. Derived haplotypes were found in both regions; haplotypes were shared far more frequently between Andalusia and Morocco than between Andalusia and the rest of the Maghreb. These and previous results indicate intense, ancient and sustained contact among populations on both sides of the Mediterranean.

Our genetic data on mtDNA diversity, combined with corresponding archaeological similarities, provide support for arguments favoring prehistoric bonds with a genetic legacy traceable in extant populations. Furthermore, the results presented here indicate that the Strait of Gibraltar and the adjacent Alboran Sea, which have often been assumed to be an insurmountable geographic barrier in prehistory, served as a frequently traveled route between continents.

a, b, c. Interpolated frequency surfaces of clade H and its main sub-clades (H1 and H3). Frequencies (%) are showed in a colour scale. See information about the populations used in Additional files 4 and 5. Map templates were taken from Natural Earth free map repository (

I usually find mtDNA data, especially studies like this one based on modern populations, very difficult to interpret for anthropological purposes. It is well-known that there are important differences in the pattern of Y-DNA and mtDNA expansion and distribution.

A paragraph in this respect caught my attention:

The patterns of variation in the Y-chromosome between western and eastern Andalusians, based on 416 males, have also been investigated for a set of Y-Short Tandem Repeats (Y-STRs) and Y-SNPs ([53, 54, 55]; Calderón et al., unpublished data) in combination to mtDNA analyses ([18, 19] and present study). In general, for both uniparental markers, Andalusians exhibit a typical western European genetic background, with peak frequencies of mtDNA Hg H and Y-chromosome Hg R1b1b2-M269 (45% and 60%, respectively). Interestingly, our results have further revealed that the influence of African female input is far more significant when compared to male influence in contemporary Andalusians. The lack of correspondence between the maternal and paternal genetic profiles of human populations reflects intrinsic differences in migratory behavior related to sex-biased processes and admixture, as well as differences in male and female effective population sizes related to the variance in reproductive success affected, for example, by polygyny [56, 57].

I think that the greater reduction in patrilineal lineages compared to maternal lineages that we usually see during and after prehistoric or historic migrations has more to do with the renowned Uí Néill family case and with war-related casualties (since combatants were usually men) than with other more popular explanations, such as enslavement of women or polygyny.

The most successful paternal lines (anywhere in the world) were probably those who remained in power for a long time (be it a patriarchal society based on families, clans, or more complex organizational units), who were richer and thus more capable of having healthy offspring, who in turn were able to survive longer and have more children who inherited power, etc.

In the case of recent migrations or population movements that disrupt the previously established organization, after a certain number of generations, successful patrilocal families (usually from incoming lineages) might slowly dominate over a whole region, with poorer families (usually of ‘indigenous’ lineages) suffering greater mortality (especially perinatal and child mortality), without any obvious (pre)historic event associated with these gradual changes.

This gradual replacement of paternal lineages is compatible with the adoption of the native language by newcomers. If the number of migrants is greater than the native population, and especially if their technology is more advanced, then a more radical change including ethnolinguistic identification is more likely.

I don’t deny the (pre)historic existence of radical replacement of male populations with continuity of female lineages due to massacres of men, female slavery, or polygyny, but they are probably not the main explanation for most regional differences seen in paternal lineages, and should thus be used with caution.

Gradual replacement and founder effects are also the most logical explanation for why autochthonous continuity myths (that the modern regional prevalence of a few successful lineages tended to create in the 2000s) haven’t been corroborated by ancient DNA; e.g. R1b-DF27 in Basques, N1c-M178 in Finnic populations, R1a-Z283 in Slavs, etc. There is nothing different in those areas from other recent founder effects and internal migratory flows seen everywhere in Europe in the past millennia.

Paper discovered via a link by Alberto Gonzalez on the Facebook group Iberia ADN.


Genetic ancestry of Hadza and Sandawe peoples reveals ancient population structure in Africa

Open access paper Genetic Ancestry of Hadza and Sandawe Peoples Reveals Ancient Population Structure in Africa, by Shriner, Tekola-Ayele, Adeyemo, & Rotimi, GBE (2018).

Abstract (emphasis mine):

The Hadza and Sandawe populations in present-day Tanzania speak languages containing click sounds and therefore thought to be distantly related to southern African Khoisan languages. We analyzed genome-wide genotype data for individuals sampled from the Hadza and Sandawe populations in the context of a global data set of 3,528 individuals from 163 ethno-linguistic groups. We found that Hadza and Sandawe individuals share ancestry distinct from and most closely related to Omotic ancestry; share Khoisan ancestry with populations such as ≠Khomani, Karretjie, and Ju/’hoansi in southern Africa; share Niger-Congo ancestry with populations such as Yoruba from Nigeria and Luhya from Kenya, consistent with migration associated with the Bantu Expansion; and share Cushitic ancestry with Somali, multiple Ethiopian populations, the Maasai population in Kenya, and the Nama population in Namibia. We detected evidence for low levels of Arabian, Nilo-Saharan, and Pygmy ancestries in a minority of individuals. Our results indicate that west Eurasian ancestry in eastern Africa is more precisely the Arabian parent of Cushitic ancestry. Relative to the Out-of-Africa migrations, Hadza ancestry emerged early whereas Sandawe ancestry emerged late.


In the Hadza population, the distribution of Y chromosomes includes mostly B2 haplogroups, with a smaller number of E1b1a haplogroups, which are common in Niger-Congo-speaking populations, and E1b1b haplogroups, which are common in Cushitic populations (Tishkoff, et al. 2007). In the Sandawe population, E1b1a and E1b1b haplogroups are more common, with lower frequencies of B2 and A3b2 haplogroups (Tishkoff, et al. 2007).

We found that Hadza ancestry diverged early, rather than late. We found evidence for contributions of Cushitic and Niger-Congo ancestries in Tanzania, consistent with the movements of herding and cultivating Cushitic speakers ~4,000 years ago and agricultural Niger-Congo speakers ~2,500 years ago (Newman 1995). However, we did not find evidence of a substantial contribution of Nilo-Saharan ancestry that might have resulted from movement of pastoralist Nilo-Saharan speakers (Newman 1995). We also identified west Eurasian ancestry in eastern and southern African populations more precisely as the Arabian parent of Cushitic ancestry. Finally, our ancestry analyses support the hypothesis that Omotic, Hadza, and Sandawe languages group together, rather than Omotic languages belonging to the Afroasiatic family and Hadza and Sandawe languages belonging to the Khoisan family.

I don’t like linguistic assumptions from admixture analysis, especially from scarce modern samples, as in this case.

Nevertheless, these papers may help clarify the different nature of Omotic and Cushitic among Afroasiatic languages, and thus leave the origin of Afroasiatic either:

a) To the east, with the traditionalist Afroasiatic – Semitic/Hamitic homeland association.

Expansion of Afroasiatic

b) To the west, near modern Chadic languages (associated with the expansion of R1b-V88 subclades through a Green Sahara), as I suggested.