Mixed haplogroups R1a, R1b, I, in collective burials of early Medieval Bavarians


New paper (behind paywall) Family graves? The genetics of collective burials in early medieval southern Germany on trial, by Rott, Päffgen, Haas-Gebhard, Peters, & Harbeck, J Arch Sci (2018) 92: 103–115.


Simultaneous collective burials appear quite regularly in early medieval linear cemeteries. Despite their relatively regular occurrence, they are seen as extraordinary as the interred individuals’ right to be buried in a single grave was ignored for certain reasons. Here, we present a study examining the possible familial relationship of early medieval individuals buried in this way by using aDNA analysis of mitochondrial HVR-I, Y-STRs, and autosomal miniSTRs. We can show that biological relatedness may have been an additional reason for breaking the usual burial custom besides a common cause of death, such as the Plague, which is a precondition for a simultaneous burial. Finally, with our sample set, we also see that signs of interaction between individuals such as holding hands which are often interpreted by archeologists as signs of biological or social relatedness, do not always reflect true genetic kin relationships.

Most of the burials studied are from the mid-6th and early 7th century, and all are from collective burials:

Of the simultaneous burials nine graves are proven or potential (due to contemporaneity) Plague burials (Feldman et al., 2016; Harbeck et al., 2013) and one grave is attributed to interpersonal violence against the background of the early medieval feud system (Schneider, 2008). The remaining simultaneous and the two successive burials did not reveal hints on their individuals’ cause of death.

The distribution of lineages includes R1b, R1a, and I (one family each) in Altenerding-Klettham, and T, R1b, and R1a (two families) in Aschheim-Bajuwarenring.

Map of Upper Bavaria showing the location of the sites investigated. Both Aschheim and Altenerding are located north-east of the Bavarian capital Munich (black star). The two sites are approximately 20 km apart from each other. The map is based on maps taken from here and here (Wikimedia Commons).

There were, for example:

A father and son R1a in a “warrior grave”:

Showing traces of perimortal sharp traumata (AE 888), both men seem to have died in succession of a physical conflict (Sage, 1984). It must remain open, whether this conflict was executed as a blood vengeance in connection with the medieval feud system (Schneider, 2008; Steuer, 2008) or any other kind of interpersonal violence. Attacks and interpersonal violence are also often believed to be a precondition for individuals being buried together.

It has been assumed that burials of several men with weaponry, so-called “warrior graves”, are burials which reflect the early medieval feud system (Schneider, 2008; Steuer, 2008) in the very sophisticated but implausible assumption, that women and children might have been spared in those conflicts. While feuds were actually struggles between familiae, friends and servants of a particular family could be also involved, which would explain the deposition of nonrelated individuals in such burials.

Two children, half-siblings, one of haplogroup R1b, in a shared coffin.

A non-genetic family of an elderly man of haplogroup I and a child being protected:

The early medieval concept of familia not only comprised the (biological) nuclear family and individuals certainly entered a family clan by marriage. This leaves room for any possible social (i.e. non-genetic) relation that may have allowed these two individuals to be buried in a common grave.

It is tempting for me to hail the mixed genetic pool among late Germanic tribes found in recent genetic studies, as I have done for Proto-Balto-Slavic territory and Iberia.

It is indeed possible that the mostly R1b-L11 and I1 subclades seen in late medieval West Germanic-speaking populations (and in modern West Germanic speakers) are in fact the result of later internal migratory flows and founder effects.

However, Bavarians – like the recently studied Lombards (with a predominance of R1b and I lineages), and especially Goths (apparently showing ‘eastern’ ancestry) – occupied territories of mixed ‘Barbarian’ populations after the invasion of the Huns and their allies, and settled near Slavs and Avars.

EDIT (18 MAR 2018). We should add here for this southern Germanic territory the Merovingian burials (ca. 7th c.) from Ergolding, with 3 samples of haplogroup R1b, and 2 samples of G2a, published by Vanek, Saskova, & Koch (2009).

Earlier, expanding Proto-Germanic tribes may not show the variable admixture and haplogroups we are seeing right now, though.


A history of male migration in and out of the Green Sahara

Open access research highlight A history of male migration in and out of the Green Sahara, by Yali Xue, Genome Biology (2018) 19:30, on the recent paper by D’Atanasio et al.

Insights from the Green Saharan Y-chromosomal findings (emphasis mine):

It is widely accepted that sub-Saharan Y chromosomes are dominated by E-M2 lineages carried by Bantu-speaking farmers as they expanded from West Africa starting < 5 kya, reaching South Africa within recent centuries [4]. The E-M2-Bantu lineages lie phylogenetically within the E-M2-Green Sahara lineage and show at least three explosive lineage expansions beginning 4.9–5.3 kya [5] (Fig. 1a). These events of E-M2-Bantu expansion are slightly later than the R-V88 expansion, and highlight the range of male demographic changes in the mid-Holocene. North of the Sahara, in addition to the four trans-Saharan haplogroups, haplogroup E-M81 (which diverged from E-M78 ~ 13 kya) became very common in present-day populations as a result of another massive expansion ~ 2 kya [6] (Fig. 1a).

Simplified Y-chromosomal phylogeny and inferred past or observed present-day distribution of relevant Y-chromosomal lineages. a Calibrated phylogenetic tree of Y-chromosomal lineages discussed in the text. Green shading represents the period when the present-day Sahara Desert was green and fertile. Lineages represented by filled pentagons have undergone very rapid expansions. b [featured image] The Green Sahara period 5–12 kya. Green shading indicates that the present-day Sahara Desert was green and fertile. The colors within the large oval represent the four Y-chromosomal haplogroups deduced to be present in the region at this time; specific locations are not implied. The arrows indicate the inferred origins of these haplogroups to the north or south, but specific origins and routes are not implied. c The present-day distributions of the four Green Saharan Y-chromosomal haplogroups. Yellow shading indicates the Sahara Desert. Each circle represents a sampled population, with the presence or absence of the four Green Saharan haplogroups shown by the colored sectors; other haplogroups may also be present in these populations, but are not shown. The small arrows indicate the inferred northwards and southwards movements of these haplogroups when the Sahara became uninhabitable.

Although Y chromosomes exist within populations and so share and reflect the general history of those populations, they can sometimes show some departures from other parts of the genome that result from differences in male and female behaviors. D’Atanasio et al. [1] highlight one such contrast in their study. Present-day North African populations show substantial sub-Saharan autosomal and mtDNA genetic components ascribed to the Roman and Arab slave trades 1–2 kya [7], but carry few sub-Saharan Y lineages from this source, probably reflecting the smaller numbers of male slaves and their reduced reproductive opportunities when compared to those of female slaves. The sub-Saharan Y chromosomes in these North African populations thus originate predominantly from the earlier Green Sahara period.

In this part of Africa, the indigenous languages that are spoken belong to three of the four African linguistic families (Afro-Asiatic, Nilo-Saharan and Niger-Congo). Interestingly, these languages show non-random associations with Y lineages. For example, Chadic languages within the Afro-Asiatic family are associated with haplogroup R-V88, whereas Nilo-Saharan languages are associated with specific sublineages within A3-M13 and E-M78, further illustrating the complex human history of the region.

The main question after D’Atanasio et al. (2018) is thus:

(…) what are the reasons for the very rapid R-V88 expansion 5–6 kya [1] and E-M81 expansion ~ 2 kya [6], and how do these expansions fit within general worldwide patterns of male-specific expansions, which in other cases have been linked to cultural and technological changes [5]?

I think that the only haplogroup expansion known today that might fit the spread and dialectalization of Afroasiatic, a proto-language probably contemporaneous with or slightly older than Middle Proto-Indo-European, is that of R1b-V88 lineages. However, without ancient DNA samples to corroborate this, we cannot be sure.

David Reich on the influence of ancient DNA on Archaeology and Linguistics

An interesting interview has appeared on The Atlantic, Ancient DNA Is Rewriting Human (and Neanderthal) History, on the occasion of the publication of David Reich’s book Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past.

Some interesting excerpts (I have emphasized some of Reich’s words):

On the efficiency of the Reich Lab

Zhang: How much does it cost to process an ancient DNA sample right now?

Reich: In our hands, a successful sample costs less than $200. That’s only two or three times more than processing them on a present-day person. And maybe about one-third to one half of the samples we screen are successful at this point.

This is probably the most controversial assessment for the Twitterverse, since it puts the Reich Lab at the top of the publishing chain, but I don’t find this fact controversial at all.

Anyone interested in doing genetic studies has free datasets, papers, and bioinformatic tools at hand – thanks to his lab, mostly – to develop new methods and publish papers. Such secondary works probably won’t be published in journals with the highest impact factor, but what can you do, welcome to the scientific world…

Also, by the looks of it, every single researcher involved in recovering an archaeological sample is included as co-author of the papers, so there is a clear benefit for ‘local’ researchers collaborating with the Lab. Therefore, these researchers and their institutions are responsible for whatever unfair situation might be created by their exchange.

On Archaeology’s reaction to Kossinna and Nazi ideas:

Zhang: You actually had German collaborators drop out of a study because of these exact concerns, right? One of them wrote, “We must(!) avoid … being compared with the so-called ‘siedlungsarchäologie Method’ from Gustaf Kossinna!”

Reich: Yeah, that’s right. I think one of the things the ancient DNA is showing is actually the Corded Ware culture does correspond coherently to a group of people. I think that was a very sensitive issue to some of our coauthors, and one of the coauthors resigned because he felt we were returning to that idea of migration in archaeology that pots are the same as people. There have been a fair number of other coauthors from different parts of continental Europe who shared this anxiety.

We responded to this by adding a lot of content to our papers to discuss these issues and contextualize them. Our results are actually almost diametrically opposite from what Kossinna thought because these Corded Ware people come from the East, a place that Kossinna would have despised as a source for them. But nevertheless it is true that there’s big population movements, and so I think what the DNA is doing is it’s forcing the hand of this discussion in archaeology, showing that in fact, major movements of people do occur. They are sometimes sharp and dramatic, and they involve large-scale population replacements over a relatively short period of time. We now can see that for the first time.

What the genetics is finding is often outside the range of what the archaeologists are discussing these days.

This is mostly true: Genomics offers a whole new dimension to assess exchanges among groups, and thus helps select anthropological models of cultural diffusion. It offers another way of interpreting prehistoric cultural evolution and change, including the investigation of potential languages of these cultures, ways of change and replacement, etc.

Also, he acknowledges that a lot of content is added to the papers in search of context, and thus to avoid simplistic assumptions and conclusions, so this is a reasonable way to look at the (often erroneous) cultural and linguistic context which accompanies most genetic papers, and even the new methods being developed to assess samples.

On the other hand, the fact that many in Archaeology didn’t want to discuss migrations does not mean that it was not discussed at all, as he seems to suggest.

On how Genomics fits with traditional disciplines

Zhang: I think at one point in your book you actually describe ancient DNA researchers as the “barbarians” at the gates of the study of history.

Reich: Yeah.

Zhang: Does it feel that way? Have you gotten into arguments with archaeologists over your findings?

Reich: I think archaeologists and linguists find it frustrating that we’re not trained in the language of archaeology and all these sensitivities like about Kossinna. Yet we have this really powerful tool which is this way of looking at things nobody has been able to look at before.

The point I was trying to make there was that even if we’re not always able to articulate the context of our findings very well, this is very new information, and a serious scholar really needs to take this on board. It’s dangerous. Barbarians may not talk in an educated and learned way but they have access to weapons and ways of looking at things that other people haven’t looked to. And time and again we’ve learned in the past that ignoring barbarians is a dangerous thing to do.

I think this is also mostly true: many academics find it frustrating to read these papers, most of which lack a minimal understanding of the topics being discussed.

For example, you can’t pretend to derive meaningful conclusions about Proto-Indo-Europeans knowing nothing about their language and the potential cultures associated with them (and why they were associated with them in the first place)…

I also agree with him that the study of ancient DNA is a very powerful tool. Everyone involved in Anthropology and Archaeology should be trained these days in Genomics – or, at least, they should have the opportunity to do so.

On the dangers of Genomics

Reich: (…) I know there are extremists who are interested in genealogy and genetics. But I think those are very marginal people, and there’s, of course, a concern they may impinge on the mainstream.

But if you actually take any serious look at this data, it just confounds every stereotype. It’s revealing that the differences among populations we see today are actually only a few thousand years old at most and that everybody is mixed. I think that if you pay any attention to this world, and have any degree of seriousness, then you can’t come out feeling affirmed in the racist view of the world. You have to be more open to immigration. You have to be more open to the mixing of different peoples. That’s your own history.

I guess David Reich does not frequent forums on human genetics linked to ethnolinguistic identification, or he would not think of ‘extremists’ as marginal people. Or else we have a different view of what defines an ‘extremist’…


I did not have the best of opinions about David Reich – or any other geneticist involved in publishing anthropological theories, for that matter. I have always had great respect for their scientific work, though.

If anything, this article shows that he knows his own (and his fellow geneticists’) limitations, and the dangers and limitations of Genomics as a whole, so I have more respect for him – and anyone involved with his Lab’s work – after reading this piece.

I would sum up his interview with his humbling sentence:

We should think we really don’t know what we’re talking about.

NOTE. Also on the occasion of the publication of his book, Nature has published the piece Sex, power and ancient DNA – Turi King hails David Reich’s thrilling account of mapping humans through time and place.

After buying Lalueza-Fox’s recent book ‘La forja genètica d’Europa’, I don’t really feel like buying another book on Genomics and migrations from a geneticist. If you have read Reich’s book, please share your impressions.

Pleistocene North African genomes link Near Eastern and sub-Saharan African human populations


Pleistocene North African genomes link Near Eastern and sub-Saharan African human populations, by van de Loosdrecht et al. Science (2018).


North Africa is a key region for understanding human history, but the genetic history of its people is largely unknown. We present genomic data from seven 15,000-year-old modern humans from Morocco, attributed to the Iberomaurusian culture. We find a genetic affinity with early Holocene Near Easterners, best represented by Levantine Natufians, suggesting a pre-agricultural connection between Africa and the Near East. We do not find evidence for gene flow from Paleolithic Europeans into Late Pleistocene North Africans. The Taforalt individuals derive one third of their ancestry from sub-Saharan Africans, best approximated by a mixture of genetic components preserved in present-day West and East Africans. Thus, we provide direct evidence for genetic interactions between modern humans across Africa and Eurasia in the Pleistocene.


We analyzed the genetic affinities of the Taforalt individuals by performing principal component analysis (PCA) and model-based clustering of worldwide data (Fig. 2). When projected onto the top PCs of African and West Eurasian populations, the Taforalt individuals form a distinct cluster in an intermediate position between present-day North Africans (e.g., Amazighes (Berbers), Mozabite and Saharawi) and East Africans (e.g., Afar, Oromo and Somali) (Fig. 2A). Consistently, we find that all males with sufficient nuclear DNA preservation carry Y haplogroup E1b1b1a1 (M-78; table S16). This haplogroup occurs most frequently in present-day North and East African populations (18). The closely related E1b1b1b (M-123) haplogroup has been reported for Epipaleolithic Natufians and Pre-Pottery Neolithic Levantines (“Levant_N”) (16). Unsupervised genetic clustering also suggests a connection of Taforalt to the Near East. The three major components that comprise the Taforalt genomes are maximized in early Holocene Levantines, East African hunter-gatherer Hadza from north-central Tanzania, and West Africans (K = 10; Fig. 2B). In contrast, present-day North Africans have smaller sub-Saharan African components with minimal Hadza-related contribution (Fig. 2B).

(…) Taforalt harboring an ancestry that contains additional affinity with South, East and Central African outgroups. None of the present-day or ancient Holocene African groups serve as a good proxy for this unknown ancestry, because adding them as the third source is still insufficient to match the model to the Taforalt gene pool.

Mitochondrial consensus sequences of the Taforalt individuals belong to the U6a (n = 6) and M1b (n = 1) haplogroups (15), which are mostly confined to present-day populations in North and East Africa (7). U6 and M1 have been proposed as markers for autochthonous Maghreb ancestry, which might have been originally introduced into this region by a back-to-Africa migration from West Asia (6, 7). The occurrence of both haplogroups in the Taforalt individuals proves their pre-Holocene presence in the Maghreb.
(…) the diversification of haplogroup U6a and M1 found for Taforalt is dated to ~24,000 yBP (fig. S23), which is close in time to the earliest known appearance of the Iberomaurusian in Northwest Africa (25,845-25,270 cal. yBP at Tamar Hat (26)).

A summary of the genetic profile of the Taforalt individuals. (A) The top two PCs calculated from present-day African, Near Eastern and South European individuals from 72 populations. The Taforalt individuals are projected thereon (red-colored circles). Selected present-day populations are marked by colored symbols. Labels for other populations (marked by small grey circles) are provided in fig. S8. (B) ADMIXTURE results of chosen African and Middle Eastern populations (K = 10). Ancient individuals are labeled in red color. Major ancestry components in Taforalt are maximized in early Holocene Levantines (green), West Africans (purple) and East African Hadza (brown). The ancestry component prevalent in pre-Neolithic Europeans (beige) is absent in Taforalt.

The relationships of the Iberomaurusian culture with the preceding MSA, including the local backed bladelet technologies in Northeast Africa, and the Epigravettian in southern Europe have been questioned (13). The genetic profile of Taforalt suggests substantial Natufian-related and sub-Saharan African-related ancestries (63.5% and 36.5%, respectively), but not additional ancestry from Epigravettian or other Upper Paleolithic European populations. Therefore, we provide genomic evidence for a Late Pleistocene connection between North Africa and the Near East, predating the Neolithic transition by at least four millennia, while rejecting a potential Epigravettian gene flow from southern Europe into northern Africa within the resolution of our data.

It seems that the Taforalt gene pool (ca. 13000-12000 BC) cannot be explained by a connection with Upper Palaeolithic Europeans, but rather by a more archaic admixture, so the authors cannot prove a migration through the Strait of Gibraltar or Sicily.

Nevertheless, these results apparently suggest:

  • That there is no contact before ca. 12000 BC through the Strait of Gibraltar; therefore the Sicilian route I support for the migration of R1b-V88 lineages is still the most likely one.
  • That the North African connection with Natufians is quite old (for which we already had modern Y-DNA investigation), and therefore unlikely to be related to the Afroasiatic expansion.

I am glad I had some more time this week to read at least some interesting parts of the published papers, because the information to process is becoming insanely huge…


Model for the spread of Transeurasian (Macro-Altaic) communities with farming


Austronesian influence and Transeurasian ancestry in Japanese: A case of farming/language dispersal, by Martine Robbeets, Max Planck Institute for the Science of Human History.


In this paper, I propose a hypothesis reconciling Austronesian influence and Transeurasian ancestry in the Japanese language, explaining the spread of the Japanic languages through farming dispersal. To this end, I identify the original speech community of the Transeurasian language family as the Neolithic Xinglongwa culture situated in the West Liao River Basin in the sixth millennium bc. I argue that the separation of the Japanic branch from the other Transeurasian languages and its spread to the Japanese Islands can be understood as occurring in connection with the dispersal of millet agriculture and its subsequent integration with rice agriculture. I further suggest that a prehistorical layer of borrowings related to rice agriculture entered Japanic from a sister language of proto-Austronesian, at a time when both language families were still situated in the Shandong-Liaodong interaction sphere.

Classification of the Transeurasian languages according to Robbeets (forthcoming)

Another interesting anthropological model to validate with future genomic analyses, although I was never convinced about a grouping (let alone reconstructible proto-language) beyond Micro-Altaic languages.

NOTE. The Max Planck Institute may be a great source of scientific advancement, but in Linguistics you can see from the projects Indo-European languages originate in Anatolia (2012) and A massive migration from the steppe brought Indo-European languages to Europe (2015) (the last one referring to the Corded Ware culture, associated with the study by Haak et al. 2015) that they have not got it quite right with Proto-Indo-European… I like the traditional approach of this paper, though, including a thorough assessment of archaeological and linguistic details.

Featured images: Left. The eastward spread of millet agriculture in association with ancestral speech communities. Right: The spread of agriculture and language to Japan.

Two sources of archaic Denisovan ancestry in East Asia, one possibly after the isolation of Native Americans


Open access paper Analysis of Human Sequence Data Reveals Two Pulses of Archaic Denisovan Admixture, by Sharon L. Browning, Brian L. Browning, Zhou, Tucci, & Akey, Cell (2018).


Anatomically modern humans interbred with Neanderthals and with a related archaic population known as Denisovans. Genomes of several Neanderthals and one Denisovan have been sequenced, and these reference genomes have been used to detect introgressed genetic material in present-day human genomes. Segments of introgression also can be detected without use of reference genomes, and doing so can be advantageous for finding introgressed segments that are less closely related to the sequenced archaic genomes. We apply a new reference-free method for detecting archaic introgression to 5,639 whole-genome sequences from Eurasia and Oceania. We find Denisovan ancestry in populations from East and South Asia and Papuans. Denisovan ancestry comprises two components with differing similarity to the sequenced Altai Denisovan individual. This indicates that at least two distinct instances of Denisovan admixture into modern humans occurred, involving Denisovan populations that had different levels of relatedness to the sequenced Altai Denisovan.

Mean detected archaic sequence per individual (Mb)

The discussion on the potential implication of the paper:

Featured image, from the article: Contour Density Plots of Match Proportion of Introgressed Segments to the Altai Neanderthal and Altai Denisovan Genomes.


Uralic as a Corded Ware substrate of Indo-Iranian, and loanwords in Finno-Ugric

Asko Parpola has recently published a new paper, Finnish vatsa ~ Sanskrit vatsá and the formation of Indo-Iranian and Uralic languages.


Finnish vatsa ‘stomach’ < PFU *vaćća < Proto-Indo-Aryan *vatsá- ‘calf’ < PIE *vet-(e)s-ó- ‘yearling’ contrasts with Finnish vasa- ‘calf’ < Proto-Iranian *vasa- ‘calf’. Indo-Aryan -ts- versus Iranian -s- reflects the divergent development of PIE *-tst- in the Iranian branch (> *-st-, with Greek and Balto-Slavic) and in the Indo-Aryan branch (> *-tt-, probably due to Uralic substratum). The split of Indo-Iranian can be traced in the archaeological record to the differentiation of the Yamnaya culture in the North Pontic and Volga steppes respectively during the third millennium BCE, due to the use of separate sources of metal: the Iranian branch was dependent on the North Caucasus, while the Indo-Aryan branch was oriented towards the Urals. It is argued that the Abashevo culture of the Mid-Volga-Kama-Belaya basins and the Sejma-Turbino trade network (2200–1900 BCE) were bilingual in Proto-Indo-Aryan and PFU, and introduced the PFU as the basis of West Uralic (Volga-Finnic) into the Netted Ware Culture of the Upper Volga-Oka (1900–200 BCE).

He updates thus his quite recent model from On the emergence, contacts and dispersal of Proto-Indo-European, Proto-Uralic and Proto-Aryan in an archaeological perspective (2017).

In it he supported a North-West Indo-European expansion with Corded Ware, and a Neolithic Proto-Uralic community in East Europe (associated with the Comb Ware culture), as I did before the famous 2015 papers.

In fact, he supports that the satemization trend of Proto-Indo-Iranian is due to a Proto-Finno-Ugric substratum in its population in the Volga-Ural region, similar to the model I propose (with the Corded Ware substratum hypothesis).

NOTE. While for Parpola the ‘satemizing’ substratum of Balto-Slavic (a NWIE dialect) may not come exactly from the same Finno-Ugric population as for Indo-Iranian, but from a different Uralic dialect (as I explain in my hypothesis), for the few extant supporters of an Indo-Slavonic group there should not be any problem identifying the same ancient substrate as for the Proto-Indo-Iranian population…

Now that North-West Indo-European is clearly associated with the Yamna -> Bell Beaker expansion, I understand that his previous model is obsolete and needs a revision.

I find it especially difficult to understand (in light of his previous theory) why he compares Indo-Aryan *vatsa- and Iranian *vasa- to assert that the former is the origin of the loanword in Finno-Ugric, when the Proto-Indo-Iranian form is essentially the same as the Indo-Aryan one, with respect to the *w- evolution into *v- in both PII and late FU dialects…

NOTE: I wrote to him yesterday asking about this issue; I will post his answer here.

Potential spread of Finnic. “Distribution of the Netted Ware according to Carpelan (2002: 198). A: Emergence of the Netted Ware on the Upper Volga c. 1900 calBC. B: Spread of Netted Ware by c. 1800 calBC. C: Early Iron Age spread of Netted Ware. (After Carpelan 2002: 198 > Parpola 2012a: 151.)”

His effort to link the actual expansion of Finno-Ugric to Corded Ware territory, linking it also partially to population movements from the Seima-Turbino phenomenon – probably associated with the initial expansion of N1c lineages – is another good example of convergence of the different anthropological theories thanks to recent Genomic studies.


Genetic ancestry of Hadza and Sandawe peoples reveals ancient population structure in Africa

Open access paper Genetic Ancestry of Hadza and Sandawe Peoples Reveals Ancient Population Structure in Africa, by Shriner, Tekola-Ayele, Adeyemo, & Rotimi, GBE (2018).

Abstract (emphasis mine):

The Hadza and Sandawe populations in present-day Tanzania speak languages containing click sounds and therefore thought to be distantly related to southern African Khoisan languages. We analyzed genome-wide genotype data for individuals sampled from the Hadza and Sandawe populations in the context of a global data set of 3,528 individuals from 163 ethno-linguistic groups. We found that Hadza and Sandawe individuals share ancestry distinct from and most closely related to Omotic ancestry; share Khoisan ancestry with populations such as ≠Khomani, Karretjie, and Ju/’hoansi in southern Africa; share Niger-Congo ancestry with populations such as Yoruba from Nigeria and Luhya from Kenya, consistent with migration associated with the Bantu Expansion; and share Cushitic ancestry with Somali, multiple Ethiopian populations, the Maasai population in Kenya, and the Nama population in Namibia. We detected evidence for low levels of Arabian, Nilo-Saharan, and Pygmy ancestries in a minority of individuals. Our results indicate that west Eurasian ancestry in eastern Africa is more precisely the Arabian parent of Cushitic ancestry. Relative to the Out-of-Africa migrations, Hadza ancestry emerged early whereas Sandawe ancestry emerged late.


In the Hadza population, the distribution of Y chromosomes includes mostly B2 haplogroups, with a smaller number of E1b1a haplogroups, which are common in Niger-Congo-speaking populations, and E1b1b haplogroups, which are common in Cushitic populations (Tishkoff, et al. 2007). In the Sandawe population, E1b1a and E1b1b haplogroups are more common, with lower frequencies of B2 and A3b2 haplogroups (Tishkoff, et al. 2007).

We found that Hadza ancestry diverged early, rather than late. We found evidence for contributions of Cushitic and Niger-Congo ancestries in Tanzania, consistent with the movements of herding and cultivating Cushitic speakers ~4,000 years ago and agricultural Niger-Congo speakers ~2,500 years ago (Newman 1995). However, we did not find evidence of a substantial contribution of Nilo-Saharan ancestry that might have resulted from movement of pastoralist Nilo-Saharan speakers (Newman 1995). We also identified west Eurasian ancestry in eastern and southern African populations more precisely as the Arabian parent of Cushitic ancestry. Finally, our ancestry analyses support the hypothesis that Omotic, Hadza, and Sandawe languages group together, rather than Omotic languages belonging to the Afroasiatic family and Hadza and Sandawe languages belonging to the Khoisan family.

I don’t like linguistic assumptions drawn from admixture analysis, especially from scarce modern samples, as in this case.

Nevertheless, these papers may help clarify the different nature of Omotic and Cushitic among Afroasiatic languages, and thus place the origin of Afroasiatic either:

a) To the east, with the traditionalist Afroasiatic – Semitic/Hamitic homeland association.

Expansion of Afroasiatic

b) To the west, near modern Chadic languages (associated with the expansion of R1b-V88 subclades through a Green Sahara), as I suggested.


First Iberian R1b-DF27 sample, probably from incoming East Bell Beakers


I had some more time to read the paper by Valdiosera et al. (2018) and its supplementary material.

One of the main issues since the publication of Olalde et al. (2018) (and its hundreds of Bell Beaker samples) was the lack of clear Y-DNA R1b-DF27 subclades among East Bell Beaker migrants, which left us wondering when the subclade entered the Iberian Peninsula, since it could (theoretically) have happened at any time from the Chalcolithic to the Iron Age.

My prediction was that this lineage, widespread today among the Iberian population, crossed the Pyrenees quite early, during the Chalcolithic, with migrating East Bell Beakers expanding North-West Indo-European dialects, and that it spread slowly afterwards.

The first ancient sample clearly identified as belonging to an R1b-DF27 subclade is found in this paper, at the Late Bronze Age site Cueva de los Lagos. Although the sample itself is unidentified and has no radiocarbon date, the site as a whole is associated with the Cogotas culture and its Boquique ceramic decoration.

Y-DNA and mtDNA haplogroups, from the paper. Sequencing statistics and contamination rates for newly generated sequence data.

It was found in the northern part of the Cogotas culture territory (which lies mainly between Castile and Aragon, in North-Central Spain) and shows evident steppe admixture. It has become obvious with the latest papers (including this one) that R1b-M269 lineages intruded south of the Pyrenees associated with East Bell Beaker migrations.

The Proto-Cogotas culture is associated with a Bell Beaker substrate influenced by either El Argar or Atlantic Bronze, and the specific type of ceramics found at this Cogotas culture site is probably from the mid-2nd millennium, which is too early for the Celtic expansion.

Supervised ADMIXTURE results.

Nevertheless, due to the quite likely late date of the sample (in the centuries around 1500 BC), there is still a possibility that incoming R1b-DF27 lineages were not among the early R1b-M269 lineages found in the Iberian Chalcolithic, and were associated with later migrations from Central Europe, potentially linked to the expansion of the Urnfield culture, and thus nearer to an Italo-Celtic community.

Diachronic map of migrations in Europe ca. 1250-750 BC.

In any of these scenarios, a Pre-Celtic expansion of North-West Indo-European in Iberia (possibly associated with Lusitanian) is still the best explanation for the origin and expansion of (at least some) modern Iberian R1b-DF27 lineages, including those found among the Basque-speaking population.

This implies that the ‘indigenous’ Neolithic lineages of Iberia (like I2 and G2a2) were replaced with subsequent internal gene flows and founder effects, such as those that evidently happened (probably quite recently) among Basques, even though indigenous languages show an obvious continuity.

I would say this is the last nail in the coffin for autochthonous Y-DNA continuity theories for Spain and France (i.e. for the traditional Vasconic-Uralic hypothesis), but we know that data is never enough for any die-hard continuist, so let’s just say it is another nail in the coffin for endless autochthonous continuity theories.

EDIT (18/3/2018): Genetiker has published Y-SNP calls for both R1b samples, showing this one is R1b1a1a2a1a2a-BY15964 (see modern members of this subclade in ytree), and that the other one is R1b1a1a2a~L23.


Iberian prehistoric migrations in Genomics from Neolithic, Chalcolithic, and Bronze Age


New open access paper Four millennia of Iberian biomolecular prehistory illustrate the impact of prehistoric migrations at the far end of Eurasia, by Valdiosera, Günther, Vera-Rodríguez, et al., PNAS (2018), published ahead of print.

Abstract (emphasis mine):

Population genomic studies of ancient human remains have shown how modern-day European population structure has been shaped by a number of prehistoric migrations. The Neolithization of Europe has been associated with large-scale migrations from Anatolia, which was followed by migrations of herders from the Pontic steppe at the onset of the Bronze Age. Southwestern Europe was one of the last parts of the continent reached by these migrations, and modern-day populations from this region show intriguing similarities to the initial Neolithic migrants. Partly due to climatic conditions that are unfavorable for DNA preservation, regional studies on the Mediterranean remain challenging. Here, we present genome-wide sequence data from 13 individuals combined with stable isotope analysis from the north and south of Iberia covering a four-millennial temporal transect (7,500–3,500 BP). Early Iberian farmers and Early Central European farmers exhibit significant genetic differences, suggesting two independent fronts of the Neolithic expansion. The first Neolithic migrants that arrived in Iberia had low levels of genetic diversity, potentially reflecting a small number of individuals; this diversity gradually increased over time from mixing with local hunter-gatherers and potential population expansion. The impact of post-Neolithic migrations on Iberia was much smaller than for the rest of the continent, showing little external influence from the Neolithic to the Bronze Age. Paleodietary reconstruction shows that these populations have a remarkable degree of dietary homogeneity across space and time, suggesting a strong reliance on terrestrial food resources despite changing culture and genetic make-up.

(A) f4 statistics testing affinities of prehistoric European farmers to either early Neolithic Iberians or central Europeans, restricting these reference populations to SNP-captured individuals to avoid technical artifacts driving the affinities. The boxplots in A show the distributions of all individual f4 statistics belonging to the respective groups. The signal is not sensitive to the choice of reference populations and is not driven by hunter-gatherer–related admixture (Datasets S4 and S5). (B) Estimates of ancestry proportions in different prehistoric Europeans as well as modern southwestern Europeans. Individuals from regions of Iberia were grouped together for the analysis in A and B to increase sample sizes per group and reduce noise
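For readers unfamiliar with the f4 statistics mentioned in the caption, here is a minimal sketch of the standard definition: the average over SNPs of the product of two allele-frequency differences, f4(A, B; C, D) = mean[(a − b)(c − d)]. The population ordering, the function name and the allele frequencies below are hypothetical toy values chosen for illustration; they are not the paper's data or its exact test configuration, and real analyses also estimate standard errors (typically with a block jackknife), which this sketch omits.

```python
def f4(p_a, p_b, p_c, p_d):
    """Return f4(A, B; C, D): the mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies (one value per SNP)
    for one population; all four lists must cover the same SNPs.
    """
    if not (len(p_a) == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("all populations need the same number of SNPs")
    if len(p_a) == 0:
        raise ValueError("at least one SNP is required")
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)


# Hypothetical toy allele frequencies at four SNPs (NOT real data).
outgroup = [0.10, 0.50, 0.90, 0.30]
iberia_en = [0.20, 0.40, 0.70, 0.60]    # stand-in for early Neolithic Iberians
central_en = [0.30, 0.60, 0.80, 0.20]   # stand-in for early Neolithic central Europeans
test_pop = [0.22, 0.42, 0.71, 0.58]     # made to resemble iberia_en

# With the ordering f4(outgroup, test; iberia_en, central_en), a negative
# value means the test population shares more drift with iberia_en,
# a positive value means it shares more with central_en.
value = f4(outgroup, test_pop, iberia_en, central_en)
print("closer to Iberia EN" if value < 0 else "closer to central EN")
```

Swapping the last two populations only flips the sign, so the direction of the affinity depends on the ordering convention, which should always be stated alongside the result.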


We present a comprehensive biomolecular dataset spanning four millennia of prehistory across the whole Iberian Peninsula. Our results highlight the power of archaeogenomic studies focusing on specific regions and covering a temporal transect. The 4,000 y of prehistory in Iberia were shaped by major chronological changes but with little geographic substructure within the Peninsula. The subtle but clear genetic differences between early Neolithic Iberian farmers and early Neolithic central European farmers point toward two independent migrations, potentially originating from two slightly different source populations. These populations followed different routes, one along the Mediterranean coast, giving rise to early Neolithic Iberian farmers, and one via mainland Europe forming early Neolithic central European farmers. This directly links all Neolithic Iberians with the first migrants that arrived with the initial Mediterranean Neolithic wave of expansion. These Iberians mixed with local hunter-gatherers (but maintained farming/pastoral subsistence strategies, i.e., diet), leading to a recovery from the loss of genetic diversity emerging from the initial migration founder bottleneck. Only after the spread of Bell Beaker pottery did steppe-related ancestry arrive in Iberia, where it had smaller contributions to the population compared with the impact that it had in central Europe. This implies that the two prehistoric migrations causing major population turnovers in central Europe had differential effects at the southwestern edge of their distribution: The Neolithic migrations caused substantial changes in the Iberian gene pool (the introduction of agriculture by farmers) (6, 9, 11, 13, 24), whereas the impact of Bronze Age migrations (Yamnaya) was significantly smaller in Iberia than in north-central Europe (24). 
The post-Neolithic prehistory of Iberia is generally characterized by interactions between residents rather than by migrations from other parts of Europe, resulting in relative genetic continuity, while most other regions were subject to major genetic turnovers after the Neolithic (4, 6, 7, 9, 25, 48). Although Iberian populations represent the furthest wave of Neolithic expansion in the westernmost Mediterranean, the subsequent populations maintain a surprisingly high genetic legacy of the original pioneer farming migrants from the east compared with their central European counterparts. This counterintuitive result emphasizes the importance of in-depth diachronic studies in all parts of the continent.