The future of the Reich Lab’s studies and interpretations of Late Indo-European migrations


Short report on advances in Genomics, and on the Reich Lab:

Some interesting details:

  • The Lab is impressive. I would never dream of having something like this at our university. I am really jealous of that working environment.
  • They are currently working on population transformations in Italy; I hope we can have at last Italic and Etruscan samples.
  • It is always worth repeating that we are all the product of multiple admixture events, many of them quite recent; and I liked the Star Wars simile.
  • Also, some names hinting at potential new samples?? Zajo-I, Chanchan, Gurulde?, Володарка (Ukraine – medieval?), Autodrom, Облевка, Кресты, Кудуксай (Ural region, palaeo-metal?), Золкут, etc.
Ancient DNA sample bag?

On the downside, they keep repeating the same “steppe ancestry” meme (in the featured image above, or the one below). I know this is a news report (i.e. science communication), not the Reich Lab itself, but these maps didn’t appear out of the blue.

Steppe ancestry distribution in Europe, according to PBS.

Interesting for future interpretations is the whiteboard behind David Reich’s back (apparently they like to keep relevant information on whiteboards…):

Whiteboard behind David Reich’s back (at his office?).

It seems that while the Copenhagen group will still be bound (see here) by the Gimbutas/Kristiansen starting point, the Reich Lab will remain bound by Anthony’s selection of Ringe’s (2002) glottochronological model, and they will try to make genomic data fit in with it.

In fact, the whiteboard doesn’t even include Ringe’s link of Germanic with Italo-Celtic, which could maybe hint at Anthony’s recent change of heart? (i.e. Yamna Hungary -> Corded Ware). That would mean still less Linguistics (if glottochronology can be called that), and more Archaeology…

Image from Anthony & Ringe (2015). “The Proto-Indo-European homeland, with migrations outward at about 4200 BCE (1), 3300 BCE (2), and 3000 BCE (3a and 3b). A tree diagram (inset) shows the pre-Germanic split as unresolved. Modified from Anthony (2013).”

I don’t know why university labs need to do this: to select the linguistic model preferred by a single archaeologist, who happens to be the lead archaeologist of the group, and then try to make genetic data agree again and again with that model. I guess it is a strategic question, and has to do with guaranteeing continued contacts with archaeological sites, and access to samples from them?

I understand none of them will try to learn ancient languages, too much work probably. But wouldn’t it have been more scientifish, at least, to start from, say, three or four reasonable potential linguistic models (that is, from Indo-Europeanists), and from there discuss the best potential fits for the current genomic data in each paper?

This is, for example, what the Heyd (archaeologist) + German/Spanish Indo-Europeanist schools would look like:

Yamnaya expansion coupled with Meid’s (1975) description of three stages of Proto-Indo-European development (as interpreted by Adrados 1998) and depiction of Heyd’s proposal of Yamna expansion.

Wouldn’t you say it could have fitted the statistical and Y-DNA data seamlessly, in contrast to Gimbutas/Trager (i.e. Kristiansen today), or to Anthony/Ringe?

NOTE. I would say the mainstream German school follows Meid’s (1975) three-stage theory coupled with Dunkel’s (e.g. 1997) nomenclature. The Spanish school follows Adrados, who has repeated ad nauseam that he was the first to mention the three-stage theory in conferences and papers previous to and coincident with Meid’s proposal (see his latest JIES article, a paper available in Scribd). In any case, Spanish and German scholars have been working hand in hand in accepting and developing a general linguistic model similar to the one above.

Archaeological theories like those of Heyd or Mallory for Yamna and Bell Beaker (in contrast to Kristiansen or Anthony), and Prescott and Walderhaug for Bell Beaker and Germanic (contrasting with Kristiansen and Iversen) are compatible with this German/Spanish model.

The French school is non-existent on the homeland matter, Italian scholars seem to be behind even in the description of Anatolian as archaic (probably related to the general wish to have Latin as derived from Vergil’s Troy), Russian scholars are still working with Nostratic and Mesolithic expansions, and Leiden, as the leading IE publisher worldwide today, is full of very different ‘divos’, each with his own pet theory (some obviously agreeing with the German/Spanish model; and especially interesting is that some of them are strong supporters of an Indo-Uralic proto-language).

The English-speaking world, on the other hand, has seen the most varied models being either proposed or translated into its language, with the most popular ones being those publicized by archaeologists (Winfred P. Lehmann being one of the noteworthy exceptions), which may explain why for some people (archaeologists or geneticists) linguistics seems more like a game. It is to be assumed that these same people haven’t taken a look at the dozens of genetic papers published to date (and the hundreds of archaeological papers using a bit of linguistics to support their models), and at how wrong they have all been in their interpretations, or else they would realize that genomics does (sadly) not really look like a serious discipline at all right now to most linguists, or to many archaeologists…

Thus, instead of comparing the main theories on Proto-Indo-European (i.e. linguistics->archaeology->genetics), which would have offered the most stable framework to assess potential prehistoric ethnolinguistic identifications, they keep using a single, simplistic language tree liked by an archaeologist, and trying to fit genetic data to it, while also adapting archaeology to genetics, i.e. genetics->archaeology->linguistics; which, as you can imagine, is not going to convince any linguist.

Especially disappointing is that the world’s leading genetic lab still relies on a marginal proposal based on glottochronology, the homeopathy of linguistics… At least in that regard everyone should know better by now.

Also, they keep interacting with the wrong audience: instead of trying to engage linguists in the real homeland and dialectal quest, to keep Genomics a serious discipline among academics, they tend to discuss with politically- or racially-motivated people, which is probably also in line with strategic decisions.

In the example below, we see the main author of their recent paper on Indo-Iranian migrations seeking interaction once again, this time through “news” promoted by Hindu nationalist bigots. Even if that makes them look more neutral in the eyes of those who may allow access to Indian samples, the end result is that we see in genomics a fictitious revival of the “AIT vs. OIT debate”, dead long ago in linguistics and archaeology (anywhere but in India).

Pretty disappointing to see these trends; so much effort and time invested in futile discussions and infinitely reworked doomed glottochronological or 19th-century models, when it is the fine-scale population structure of expanding Yamna peoples that we should be discussing now, and thus Late PIE dialectalisation with offshoots Afanasevo, East Bell Beaker, Balkan Bronze Age, and Sintashta/Potapovka; as well as Corded Ware evolution in Uralic-speaking territory.

EDIT (7 JUN 2018): Some parts of the text have been corrected or slightly modified.


The uneasy relationship between Archaeology and Ancient Genomics


News feature Divided by DNA: The uneasy relationship between archaeology and ancient genomics, Two fields in the midst of a technological revolution are struggling to reconcile their views of the past, by Ewen Callaway, Nature (2018) 555:573-576.

Interesting excerpts (emphasis mine):

In duelling 2015 Nature papers [6,7] the teams arrived at broadly similar conclusions: an influx of herders from the grassland steppes of present-day Russia and Ukraine – linked to Yamnaya cultural artefacts and practices such as pit burial mounds – had replaced much of the gene pool of central and Western Europe around 4,500–5,000 years ago. This was coincident with the disappearance of Neolithic pottery, burial styles and other cultural expressions and the emergence of Corded Ware cultural artefacts, which are distributed throughout northern and central Europe. “These results were a shock to the archaeological community,” Kristiansen says.


Still, not everyone was satisfied. In an essay [8] titled ‘Kossinna’s Smile’, archaeologist Volker Heyd at the University of Bristol, UK, disagreed, not with the conclusion that people moved west from the steppe, but with how their genetic signatures were conflated with complex cultural expressions. Corded Ware and Yamnaya burials are more different than they are similar, and there is evidence of cultural exchange, at least, between the Russian steppe and regions west that predate Yamnaya culture, he says. None of these facts negates the conclusions of the genetics papers, but they underscore the insufficiency of the articles in addressing the questions that archaeologists are interested in, he argued. “While I have no doubt they are basically right, it is the complexity of the past that is not reflected,” Heyd wrote, before issuing a call to arms. “Instead of letting geneticists determine the agenda and set the message, we should teach them about complexity in past human actions.”

Many archaeologists are also trying to understand and engage with the inconvenient findings from genetics. (…)
[Carlin:] “I would characterize a lot of these papers as ‘map and describe’. They’re looking at the movement of genetic signatures, but in terms of how or why that’s happening, those things aren’t being explored,” says Carlin, who is no longer disturbed by the disconnect. “I am increasingly reconciling myself to the view that archaeology and ancient DNA are telling different stories.” The changes in cultural and social practices that he studies might coincide with the population shifts that Reich and his team are uncovering, but they don’t necessarily have to. And such biological insights will never fully explain the human experiences captured in the archaeological record.

Reich agrees that his field is in a “map-making phase”, and that genetics is only sketching out the rough contours of the past. Sweeping conclusions, such as those put forth in the 2015 steppe migration papers, will give way to regionally focused studies with more subtlety.

This is already starting to happen. Although the Bell Beaker study found a profound shift in the genetic make-up of Britain, it rejected the notion that the cultural phenomenon was associated with a single population. In Iberia, individuals buried with Bell Beaker goods were closely related to earlier local populations and shared little ancestry with Beaker-associated individuals from northern Europe (who were related to steppe groups such as the Yamnaya). The pots did the moving, not the people.

This final paragraph apparently sums up a view that Reich has of this field, since he repeats it:

Reich concedes that his field hasn’t always handled the past with the nuance or accuracy that archaeologists and historians would like. But he hopes they will eventually be swayed by the insights his field can bring. “We’re barbarians coming late to the study of the human past,” Reich says. “But it’s dangerous to ignore barbarians.”

I would say that the true barbarians had neither the habit nor the opportunity of learning from the higher civilizations they attacked or invaded. Geneticists, on the other hand, only have to do what they expect archaeologists to do: study.

EDIT (30 MAR 2018): An interesting new editorial in Nature, On the use and abuse of ancient DNA.


Something is very wrong with models based on the so-called ‘Yamnaya admixture’ – and archaeologists are catching up (II)

A new article by Leo S. Klejn tries to improve the Northern Mesolithic Proto-Indo-European homeland model of the Russian school of thought: The Steppe hypothesis of Indo-European origins remains to be proven, Acta Archaeologica, 88:1, 193–204.


Recent genetic studies have claimed to reveal a massive migration of the bearers of the Yamnaya culture (Pit-grave culture) to the Central and Northern Europe. This migration has supposedly lead to the formation of the Corded Ware cultures and thereby to the dispersal of Indo-European languages in Europe. The article is a summary presentation of available archaeological, linguistic, genetic and cultural data that demonstrates many discrepancies in the suggested scenario for the transformations caused by the Yamnaya “invasion” some 5000 years ago.


Both teams [Reich/Anthony, and Willerslev/Kristiansen] interpreted this resemblance in the same way: as evidence of mass migration of the Yamnaya culture from the steppes into the Central and Northern Europe, resulting in the formation of the Corded Ware cultures, and these are universally recognised as Indo-European. Since earlier in this part of Europe existed a different pool of genomes, geneticists presumed that the Yamnaya migration alone had brought the Indo-European languages into Europe. It is difficult to say to what extent the pre-convictions of the involved archaeologists influenced these conclusions, or whether the results of the genetic studies attracted archaeologists with such beliefs.

Mismatch of cultural manifestations

First, we might question the idea of the Yamnaya culture as a unity rather than a loose conglomerate of cultures. Merpert (1974) divided it into nine local groups but did not recognise them as separate cultures. However, in 1975 I suggested that Nerushay (Budzhak) monuments should be recognised as a distinct culture (Klejn 1975), although still as a part of the same broader steppe community.

This was accepted by other specialists (Ivanova 2012; 2013; 2014). Generally, in the western branch of this community, a mixture of the eastern rites of interment with local, Balkan ceramics can be observed. It should be noted that hitherto all genetic samples were taken from eastern material (in the vicinity of Samara in the Volga basin and Kalmykia), while the central thesis concerns the intrusion of the western branch of this community (Budzhak culture) into Europe.

The spread of cultural-historical communities of the Yamnaya culture and the location of the Budzhak culture. GAC – Globular Amphora culture; CWC – Corded Ware culture. After Ivanova 2013.

Simultaneity of cultures

The Yamnaya culture (Chernykh & Orlovskaya 2004a; Heyd 2011; Frȋnculeasa et al. 2015) appears not to be the predecessor of the Corded Ware cultures but is contemporary with them. The Corded Ware cultures appeared also around the turn between the fourth and third millennium BC (Stöckli 2001; Furholt 2003). Their derivation from the Yamnaya seems, therefore, to be less probable. This is evidenced by the fact that the corded beakers or amphorae found in the Budzhak culture are not the prototypes of the corded beakers or amphorae found in more northern territories, but seem instead to be an outcome of contemporaneous contacts (Ivanova 2014; Klejn 2017c).

Discrepancies across the haplogroups

Even more remarkable is the variation in the distribution of types of Y chromosome. In the Yamnaya population, R1b is not just a single occurrence (there are about seven known occurrences) while in the Corded Ware population a different clade of R1b is found and R1a is predominant (several instances). Thus the postulate of unbroken succession finds no support!

Distribution of artefacts and customs of the Yamnaya culture in the area of the Corded Ware cultures. After Bátora 2006.

Paradoxical gradient

In the tables presented in the article by Reichs’ team (Haak et al. 2015) the genetic pool connecting the Yamnaya culture with the Corded Ware people is shown to be more intense in Northern Europe (Norway and Sweden) and decreases gradually from the North to the South (Fig. 6). It is weakest around the Danube, in Hungary, i. e. areas neighbouring the western branch of the Yamnaya culture! This is the reverse image to what the proposed hypothesis by the geneticists would lead us to expect. It is true that this gradient is traced back from the contemporary materials, but it was already present during the Bronze Age (Klejn 2015a).

The author also uses questionable interpretations from selected articles to advance his (as of today) untenable positions regarding a Mesolithic origin of the reconstructible Proto-Indo-European language.

1. Glottochronology, for a PIE origin:

If based on the data of glottochronology (taking into account all disputes) the period of initial dispersal is to be dated to the 7th-5th millennium BC.

2. Doubts on the origin of R1b-L51 subclades expressed in Genetic differentiation between upland and lowland populations shapes the Y-chromosomal landscape of West Asia, by Balanovsky et al. (2017), Human Genetics 136, 4. 437-450:

The currently available dataset does not contradict the hypothesis that R-GG400 marks a link between the East European steppe dwellers and West Asians, though the route and even direction of this migration is disputable. It does, however, demonstrate that present-day West European R1b chromosomes do not originate from the Yamnaya populations analyzed in (Haak et al. 2015; Mathieson et al. 2015) and raises the question of their origin. A Bronze Age origin is more likely than a Neolithic one (Balaresque et al. 2010), but further ancient DNA studies may be necessary to identify this source.

Just yesterday I read the post The retraction paradox: Once you retract, you implicitly have to defend all the many things you haven’t yet retracted, by Andrew Gelman. While – in my opinion – the post does not live up to its title, it poses an interesting question, as to how ad logicam (fallacy fallacy) is often used today in research: One author proposes something that is later demonstrated to be wrong, so everything they wrote or write can be said ipso facto to be wrong…especially if they accept that it was wrong.

This is usual with amateur geneticists (those who don’t publish, and are therefore not subjected to criticism): if anyone is wrong (whether in Archaeology or Genetics), then they are wrong in everything else. It seems to me that Klejn’s theses against recent genetic results rest on the same assumption: The Yamna -> Corded Ware migration model is wrong, ergo the Yamna homeland model is wrong.

I guess this same fallacy is what a lot of angered geneticists (whether professionals or amateurs) are going to use to dismiss Klejn’s criticism, trying to focus on what he clearly does not grasp – about genomic data of Yamna peoples and their expansion – to disregard his doubts on genetic interpretations entirely.

I have warned many times about how simplistic interpretations of genetic data would cause a general mistrust in the field, and that archaeologists won’t take the discipline seriously, no matter how many articles get published in famous research tabloids like Nature or Science…

Those who dismiss this warning lightly seem to forget the fate of other recent “scientific breakthroughs” which were initially so promising that Humanities appeared to matter no more, like glottochronology for Linguistics and, to some extent, radiocarbon analysis for Archaeology.
EDIT: see here a recent example of discussion on discrepancies between archaeological and 14C-based chronologies, whereby ‘scientific data’ obviously needs archaeological context for a meaningful interpretation.

Featured image: The direction of the supposed migration of the bearers of the Yamnaya culture into the area of the Corded Ware cultures. After Haak et al. 2015.

NOTE: I obviously don’t agree with Klejn’s main model: he criticises the Proto-Indo-European steppe homeland, and more specifically the expansion of Yamna peoples with R1b-L23 subclades, which I support. But, probably because of his “pre-convictions” (as he puts it when describing proponents of the steppe hypotheses) about the Proto-Indo-European homeland in Northern Europe during the Mesolithic, he was one of the first renowned archaeologists to criticise the obvious inconsistencies in the genetic model of migrations based exclusively on the “Yamnaya ancestral component” concept, and to provoke the necessary reaction from (until then) overconfident geneticists, and he deserves credit for that.

In my opinion, the Russian school’s “Northern European Mesolithic” homeland model – as I have said before – could be based on the appearance of EHG ancestry, or maybe on the expansion of haplogroup R1b with post-Swiderian cultures, but the timeframe proposed is too early for any reconstructible parent proto-language, even for Indo-Uralic.


Modern Hungarian mtDNA more similar to ancient Europeans than to Hungarian conquerors


New preprint at BioRxiv, MITOMIX, an Algorithm to Reconstruct Population Admixture Histories Indicates Ancient European Ancestry of Modern Hungarians, by Maroti et al. (2018).

The estimated age distribution of the shared mt Hgs between Hungarians (Hun), the best hypothetical admix (mixFreq) and the populations contributing to this admix: Belgian/Dutch (BeN), Danish (Dan), Basque (Bsq), Croatian/Serbian (CrS), Baltic Late Bronze Age culture (BalBA), Bell Beaker culture (BellB), Slovakian (Slo). The numbers in parentheses indicate the contributions to the best hypothetical admix.

Abstract (emphasis mine)

By making use of the increasing number of available mitogenomes we propose a novel population genetic distance metric, named Shared Haplogroup Distance (SHD). Unlike FST, SHD is a true mathematical distance that complies with all metric axioms, which enables our new algorithm (MITOMIX) to detect population-level admixture based on SHD minimum optimization. In order to demonstrate the effectiveness of our methodology we analyzed the relation of 62 modern and 25 ancient Eurasian human populations, and compared our results with the most widely used FST calculation. We also sequenced and performed an in-depth analysis of 272 modern Hungarian mtDNA genomes to shed light on the genetic composition of modern Hungarians. MITOMIX analysis showed that in general admixture occurred between neighboring populations, but in some cases it also indicated admixture with migrating populations. SHD and MITOMIX analysis comply with known genetic data and shows that in case of closely related and/or admixing populations, SHD gives more realistic results and provides better resolution than FST. Our results suggest that the majority of modern Hungarian maternal lineages have Late Neolith/Bronze Age European origins (partially shared also with modern Danish, Belgian/Dutch and Basque populations), and a smaller fraction originates from surrounding (Serbian, Croatian, Slovakian, Romanian) populations. However only a minor genetic contribution (<3%) was identified from the IXth Hungarian Conquerors whom are deemed to have brought Hungarians to the Carpathian Basin. Our analysis shows that SHD and MITOMIX can augment previous methods by providing novel insights into past population processes.
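The abstract’s claim that SHD “complies with all metric axioms” can be made concrete with a small check. The following is only a minimal sketch: it does not implement SHD or MITOMIX (the abstract does not give their definitions), and the function name `is_metric` and both toy matrices are hypothetical, made up purely for illustration. It simply tests whether a given square matrix of pairwise distances satisfies the four standard axioms.

```python
from itertools import permutations


def is_metric(d, tol=1e-12):
    """Return True if the square matrix `d` satisfies the four metric axioms.

    `d[i][j]` is the distance between items i and j. This is a generic
    check, not an implementation of SHD (an assumption-free helper only).
    """
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:                # d(x, x) = 0
            return False
        for j in range(n):
            if i == j:
                continue
            if d[i][j] <= tol:                # distinct items: d(x, y) > 0
                return False
            if abs(d[i][j] - d[j][i]) > tol:  # symmetry: d(x, y) = d(y, x)
                return False
    # Triangle inequality: d(x, z) <= d(x, y) + d(y, z) for all triples.
    for i, j, k in permutations(range(n), 3):
        if d[i][k] > d[i][j] + d[j][k] + tol:
            return False
    return True


# Hypothetical toy matrices (not real population data).
TOY_VALID = [[0, 1, 2],
             [1, 0, 1],
             [2, 1, 0]]
TOY_BROKEN = [[0, 1, 3],   # 3 > 1 + 1 breaks the triangle inequality
              [1, 0, 1],
              [3, 1, 0]]

if __name__ == "__main__":
    print(is_metric(TOY_VALID))
    print(is_metric(TOY_BROKEN))
```

A measure that fails the triangle inequality can place two populations “far apart” even though both are close to a third, which is the kind of inconsistency a minimum-distance optimisation would presumably want to avoid.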

Unrooted hierarchic cluster of modern and archaic populations based on the SHD matrix.

It is interesting to keep receiving data as to how language does not correlate well with Genomics, whether admixture or haplogroups, even though it is already known to happen in regions such as Anatolia, the Baltic, South-Eastern or Northern Europe.

Thorough anthropological models of migration or cultural diffusion are necessary for a proper interpretation of genetic data. There is no shortcut to that.

Co-occurrence of Hungarian Bronze Age mt Hgs. Distribution of mt Hgs found in Hungarian Bronze Age archaic samples in the analyzed populations. The fixation dates are based on Behar et al [6].

Images made available under a CC-BY-NC-ND 4.0 International license.

Science and Archaeology (Humanities): collaboration or confrontation?


Another discussion on the role of Science for Archaeology, in The Two Cultures and a World Apart: Archaeology and Science at a New Crossroads, by Tim Flohr Sørensen, Norwegian Archaeological Review, vol. 50, 2 (2017):

Within the past decade or so, archaeology has increasingly utilised and contributed to major advances in scientific methods when exploring the past. This progress is frequently celebrated as a quantum leap in the possibilities for understanding the archaeological record, opening up hitherto inaccessible dimensions of the past. This article represents a critique of the current consumption of science in archaeology, arguing that the discipline’s grounding in the humanities is at stake, and that the notion of ‘interdisciplinarity’ is becoming distorted with the increasing fetishisation of ‘data’, ‘facts’ and quantitative methods. It is argued that if archaeology is to break free of its self-induced inferiority to and dependence on science, it must revitalise its methodology for asking questions pertinent to the humanities.

Several commentators took part in the discussion. Sørensen’s answer to them is in Archaeological Paradigms: Pendulum or Wrecking Ball?. Excerpts:

Thus, I argue that what we are witnessing with ‘the third science revolution’ (Kristiansen 2014) is precisely the proliferation of an already very authoritative science ideal in archaeology. And I worry that this dominance will limit research possibilities and potentials rather than encouraging plurality and radical experimentation with different forms of knowing.
I do believe in the coexistence of disparate academic principles and that collaboration is very often necessary, but I am also of the conviction that some degree of epistemological friction keeps both fields of research progressing. Nurturing distinctions, in other words, is no less useful than aiming for assimilation. What I am arguing for is thus a more respectful friction than the one characterising the processual/post-processual collisions, hoping for an academic environment where differences between research ideals are humbly accepted and cultivated precisely for their disparate strengths.
So, what I am arguing for is a more kaleidoscopic academic landscape, where different positions do not always have to assume a defensive or compromising stance, especially in confrontation with paradigms that are prospering politically. This also implies that science is not simply in the service of archaeology, as Lidén argues, but that we need to consider how archaeology may benefit science more generally by continuing to debate epistemological grounds, methodology and our modes of inquiry. And so, my fellow archaeologists: ask not what science can do for us, but what we can do for science.
In my original article, I addressed the widespread tendency in archaeology to disseminate research findings with sometimes too much conviction, where ambiguous results (and limited statistical data) are adopted with little concern for the inherent uncertainties. It is precisely this valorisation and authority of scientific observations that I claim to lead to an implicit devaluation of studies based in the humanities. The problem is – as stated numerous times in my original article – not science, but the consumption of scientific observations in archaeology, where the subtleties and not least ambiguities of scientific results are filtered out, leaving space almost exclusively for scientifically ‘proven’ facts and unequivocal results. This mode of consumption stands in direct contrast to the epistemological observation in the sciences, dictating that ‘“proof” and “certainty” are actually in short supply in the world of science’ (Freudenburg et al. 2008, p. 5). Hence, the risk is that archaeology somewhat uncritically adopts scientific observations that are in fact ‘empirically underdetermined – based largely on evidence that is in the category of the “maybe,” being inherently ambiguous rather than being absolutely clear-cut’ (Freudenburg et al. 2008, p. 6).

As I said recently on the article Massive Migrations…, by Martin Furholt, we are living through a historic debate on essential questions for the future of all these disciplines.

And, as always, there is no shortcut to reading the texts. Unlike in Science, you cannot write a table with a summary of findings…

Discovered (again) via a comment on this blog by Joshua Jonathan.

Featured image from Allentoft et al. “They conclude that the Corded Ware culture of central Europe had ancestry from the Yamnaya. Allentoft et al. also show that the Afanasievo culture to the east is related to the Yamnaya, and that the Sintashta and Andronovo cultures had ancestry from the Corded Ware. Arrows indicate migrations — those from the Corded Ware reflect the evidence that people of this archaeological culture (or their relatives) were responsible for the spreading of Indo-European languages. All coloured boundaries are approximate.”


We are all special, which also means that none of us is


Adam Rutherford writes You’re Descended from Royalty and So Is Everybody Else – Anybody you can name from ancient history is in your family tree, which I discovered via John Hawks’ new post The surprising connectedness of human genealogies over centuries.


One way to think of it is to accept that everyone of European descent should have billions of ancestors at a time in the 10th century, but there weren’t billions of people around then, so try to cram them into the number of people that actually were. The math that falls out of that apparent impasse is that all of the billions of lines of ancestry have coalesced into not just a small number of people, but effectively literally everyone who was alive at that time. So, by inference, if Charlemagne was alive in the ninth century, which we know he was, and he left descendants who are alive today, which we also know is true, then he is the ancestor of everyone of European descent alive in Europe today.
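The arithmetic behind “billions of ancestors” is easy to reproduce. The sketch below is only illustrative and rests on a stated assumption: a fixed generation length of 30 years (the quote gives no figure). The names `pedigree_slots` and `generations_back` are made up for this example. It counts the slots in a family tree, not distinct people, which is exactly the point of the excerpt: the slots soon outnumber anyone who could have been alive.

```python
# Assumption: one generation every 30 years (not a figure from the article).
YEARS_PER_GENERATION = 30


def generations_back(years, years_per_generation=YEARS_PER_GENERATION):
    """Whole generations that fit into `years`."""
    return years // years_per_generation


def pedigree_slots(generations):
    """Ancestor slots in the family tree at that depth: two parents,
    four grandparents, and so on, i.e. 2 ** generations."""
    return 2 ** generations


if __name__ == "__main__":
    for years in (300, 600, 1000, 1200):
        g = generations_back(years)
        print(f"{years} years ago: {g} generations, "
              f"{pedigree_slots(g):,} slots in the family tree")
```

Under that assumption, 1,000 years correspond to 33 generations and more than eight billion slots, so the same individuals must fill many slots at once.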

Since most of this blog’s posts support academic disciplines looking for answers to the Indo-European question, and constantly give reasons against modern genetic (and phylogenetic) identification, I think it is worth at least a quick read for anyone interested in the field.

I recently referred to the interesting series of posts by Graham Coop on this matter.

Featured image: Europe around 800 – the map is public domain, from the Historical Atlas (New York, 1911)


Genetic vs. genealogical ancestors and actual geographical constraints


Interesting post from Graham Coop, Where did your genetic ancestors come from?

An excerpt:

A thousand years back I’m descended from nearly everyone everywhere in Europe. I’m related to these individuals via millions of lines of descent back through my vast family tree. Yet the majority of the lines back through my pedigree trace to people living in the UK and Western Europe. Many lines trace back to more distant locations, but these are relatively few in number compared to those tracing back to closer to home. Ancestors along each of these lines are (roughly) equally likely to contribute to my genome. Therefore, most of my roughly 2600 genetic ancestors from 1000 years ago, who contributed the majority of my genome to me, will be random people living in the UK and western Europe at that time (who happened to leave descendants).

Looking back a few thousand years more, I’m a descendant of nearly everyone who ever lived almost everywhere in the world (at least those who left descendants, and many did). Yet most of the just over ~6000 individuals from that time who contributed the majority of my genome to me will mostly be found all over Western Eurasia. There’s nothing much special about these individuals who happen to be my genetic ancestors a few thousand years back. They’re likely not royalty. My genetic ancestors are just a random subset of all of my genealogical ancestors, they just happen to be my genetic ancestors due to the vagaries of meiosis and recombination.

As always, a humbling example, e.g. for those looking at haplogroups in the distant past to make modern ethnolinguistic identifications.

Genetics in combination with genealogy poses a question akin to the Ship of Theseus paradox.
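The gap between genealogical and genetic ancestors in the excerpt can be illustrated with a rough back-of-the-envelope sketch. The assumptions are mine, not stated in the excerpt: 22 autosome pairs, roughly 33 crossovers per meiosis, a Poisson approximation for how many genome blocks each ancestor contributes, and no pedigree collapse (so 2^k counts pedigree slots rather than distinct people). The function names are hypothetical.

```python
import math

AUTOSOMES = 22            # human autosome pairs
CROSSOVERS_PER_GEN = 33   # ASSUMPTION: approximate crossovers per meiosis


def genealogical_ancestors(k: int) -> int:
    """Pedigree slots k generations back (ignores pedigree collapse)."""
    return 2 ** k


def expected_blocks(k: int) -> int:
    """Approximate number of autosomal blocks your genome is split into
    when traced back k generations (both parental sides)."""
    return 2 * (AUTOSOMES + CROSSOVERS_PER_GEN * (k - 1))


def expected_genetic_ancestors(k: int) -> float:
    """Expected number of pedigree slots contributing at least one block,
    ASSUMING blocks per ancestor are roughly Poisson distributed."""
    slots = genealogical_ancestors(k)
    mean_blocks = expected_blocks(k) / slots
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x
    return slots * -math.expm1(-mean_blocks)


if __name__ == "__main__":
    for k in (1, 5, 10, 20, 40):
        print(k, genealogical_ancestors(k), round(expected_genetic_ancestors(k)))
```

Assuming about 25 years per generation (again my assumption), 1000 years is roughly 40 generations; the sketch then gives on the order of 2600 genetic ancestors out of about a trillion pedigree slots, in line with the figure quoted in the excerpt.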

Featured image (from the article): Simulation of how much of your autosomal genome is present in each genealogical ancestor as we go back up the generations. Image explained in detail in the article How many genetic ancestors do I have?


Review article about Ancient Genomics, by Pontus Skoglund and Iain Mathieson


A preprint article by two of the most prolific researchers in Human Ancestry is out, and they request feedback: Ancient genomics: a new view into human prehistory and evolution, by Skoglund and Mathieson (2017). Right now, it is downloadable on Dropbox.


The first decade of ancient genomics has revolutionized the study of human prehistory and evolution. We review new insights based on ancient genomic data, including greatly increased resolution of the timing and structure of the out-of-Africa event, the diversification of present-day non-African populations, and the earliest expansions of those populations into Eurasia and America. Prehistoric genomes now document patterns of population continuity and change on every inhabited continent–in particular the effect of agricultural expansions in Africa, Europe and Oceania–and record a history of natural selection that shapes present-day phenotypic diversity. Despite these advances, much remains unknown, in particular about the genomic histories of Asia–the most populous continent, and Africa–the continent that contains the most genetic diversity. Ancient genomes from these and other regions, integrated with a growing understanding of the genomic basis of human phenotypic diversity, will be in focus during the next decade of research in the field.

The paper is highly recommended as an introduction for anyone interested in the field of Human Ancestry in general.

However, its short summary of steppe ancestry expansion (where the Corded Ware culture predominates) is still reminiscent of the infamous “Yamnaya -> Corded Ware -> Bell Beaker” model set forth by the 2015 Nature articles on the subject, and Kristiansen’s Indo-European Corded Ware theory.

Here is an excerpt (emphasis mine):

The next substantial change is closely related to ancestry that by around 5000 BP extended over a region of more than 2000 miles of the Eurasian steppe, including in individuals associated with the Yamnaya Cultural Complex in far-eastern Europe (1; 38) and with the Afanasievo culture in the central Asian Altai mountains (1). This “steppe” ancestry is itself a mixture between ancestry that is related to Mesolithic hunter-gatherers of eastern Europe and ancestry that is related to both present-day populations (38) and Mesolithic hunter-gatherers (46) from the Caucasus mountains, and also to the populations of Neolithic (11), and Copper Age (56) Iran. Steppe ancestry appeared in southeastern Europe by 6000 BP (72), northeastern Europe around 5000 BP (47) and central Europe at the time of the Corded Ware Complex around 4600 BP (1; 38). These dates are reasonably tight constraints, because in each case there is no evidence of steppe ancestry in individuals immediately preceding these dates (47; 72). Gene flow on the steppe was extensive and bidirectional, as shown by the eastward flow of Anatolian Neolithic ancestry– reaching well into central Eurasia by the time of the Andronovo culture ~3500 BP (1)–and the westward flow of East Asian ancestry–found in individuals associated with the Iron Age Scythian culture close to the Black Sea ~2500 BP (143).

Copper and Bronze Age population movements (14; 78 Martiniano, 2017 #8761; 85; 112), as well as later movements in the Iron Age and Historical period (70; 119) further distributed steppe ancestry around Europe. Present-day western European populations can be modeled as mixtures of these three ancestry components (Mesolithic hunter-gatherer, Anatolian Neolithic and Steppe) (38; 57). In eastern Europe, further shifts in ancestry are the result of additional or distinct gene flow from Anatolia throughout the Neolithic and Bronze Age in the Aegean (42; 51; 55; 72; 87), and gene flow from Siberian-related populations in Finland and the Baltic region (38). East-west gene flow also brought new ancestry–related to populations from Copper Age Iran–to the Levant during the Copper and Bronze ages (39; 56).

The geographic structure of these population transformations gave rise to population structure of present-day Europe. For example Anatolian Neolithic ancestry is highest in southern European populations like Sardinians, and lowest in northern European populations (38). Steppe ancestry is at high frequency in north-central Europeans and low in the south. Isolation-by-distance may have contributed to these patterns to some extent, but the contribution must have been small. In much of Europe, extreme population discontinuity was the norm.

Featured image: from the article, “Major Holocene population movements and expansions that have been demonstrated using ancient DNA.”


New preprint papers on Finland’s population history and disease, skin pigmentation in Africa, and genetic variation in Thailand hunter-gatherers


New and interesting research on bioRxiv these days:

Haplotype sharing provides insights into fine-scale population history and disease in Finland, by Martín et al. (2017):

Finland provides unique opportunities to investigate population and medical genomics because of its adoption of unified national electronic health records, detailed historical and birth records, and serial population bottlenecks. We assemble a comprehensive view of recent population history (≤100 generations), the timespan during which most rare disease-causing alleles arose, by comparing pairwise haplotype sharing from 43,254 Finns to geographically and linguistically adjacent countries with different population histories, including 16,060 Swedes, Estonians, Russians, and Hungarians. We find much more extensive sharing in Finns, with at least one ≥ 5 cM tract on average between pairs of unrelated individuals. By coupling haplotype sharing with fine-scale birth records from over 25,000 individuals, we find that while haplotype sharing broadly decays with geographical distance, there are pockets of excess haplotype sharing; individuals from northeast Finland share several-fold more of their genome in identity-by-descent (IBD) segments than individuals from southwest regions containing the major cities of Helsinki and Turku. We estimate recent effective population size changes over time across regions of Finland and find significant differences between the Early and Late Settlement Regions as expected; however, our results indicate more continuous gene flow than previously indicated as Finns migrated towards the northernmost Lapland region. Lastly, we show that haplotype sharing is locally enriched among pairs of individuals sharing rare alleles by an order of magnitude, especially among pairs sharing rare disease causing variants. Our work provides a general framework for using haplotype sharing to reconstruct an integrative view of recent population history and gain insight into the evolutionary origins of rare variants contributing to disease.

Migration rates and haplotype sharing within Finland and between neighboring countries. A) Map of regional Finnish, Swedish, and Estonian birthplaces. Purple triangle indicates St. Petersburg, Russia. Hungary not shown. Finnish, Swedish, and Estonian region labels are shown in Table S3. B) Principal components analysis (PCA) of unrelated individuals, colored by birth region as shown in A) if available or country otherwise. C-D) Migration rates inferred with EEMS. Values and colors indicate inferred rates, for example with +1 (shades of blue) indicating an order of magnitude more migration at a given point on average, and shades of orange indicating migration barriers. C) Migration rates among municipalities in Finland. D) Migration rates within and between Finland, Sweden, Estonia, and St. Petersburg, Russia. Available under a CC-BY 4.0 International license.

Helpful for understanding this paper is the wider body of research published by the Institute for Molecular Medicine Finland (FIMM): their website contains detailed research on Finland’s recent genetic history.

NOTE: The featured image of this article contains three figures from the FIMM (License CC-BY 4.0). Left: Position of the points represents the locations of 1042 Finnish individuals. By clustering the individuals into two groups based on genome data we see a split between eastern (blue) and western (red) parts. Individuals who show considerable relatedness to both groups have been colored with cyan. Both parents of each individual were born close to each other and based on the parents’ birth years we can infer that we are looking at the genetic structure present in Finland before the 1950s. Center: An estimated borderline of the Treaty of Nöteborg on top of the map from the left. The border line is drawn between Jääski (61.04 N, 28.92 E) and Pyhäjoki (64.46 N, 24.26 E). Right: The settlement border divides Finland into the early settlement region (to west and south of the border) and the late settlement region (to east and north of the border) (Jutikkala 1933, p. 91). We see that Southern Savo (in south-eastern part of the early settlement) is among the only parts of the early settlement region that is dominated by the eastern genetic group. Information from Matti Pirinen and Sini Kerminen, 24.5.2017.

An Unexpectedly Complex Architecture for Skin Pigmentation in Africans, by Martin et al. (2017):

Fewer than 15 genes have been directly associated with skin pigmentation variation in humans, leading to its characterization as a relatively simple trait. However, by assembling a global survey of quantitative skin pigmentation phenotypes, we demonstrate that pigmentation is more complex than previously assumed with genetic architecture varying by latitude. We investigate polygenicity in the Khoe and the San, populations indigenous to southern Africa, who have considerably lighter skin than equatorial Africans. We demonstrate that skin pigmentation is highly heritable, but that known pigmentation loci explain only a small fraction of the variance. Rather, baseline skin pigmentation is a complex, polygenic trait in the KhoeSan. Despite this, we identify canonical and non-canonical skin pigmentation loci, including near SLC24A5, TYRP1, SMARCA2/VLDLR, and SNX13 using a genome-wide association approach complemented by targeted resequencing. By considering diverse, under-studied African populations, we show how the architecture of skin pigmentation can vary across humans subject to different local evolutionary pressures.

Contrasting maternal and paternal genetic variation of hunter-gatherer groups in Thailand, by Kutanan et al. (2017):

The Maniq and Mlabri are the only recorded nomadic hunter-gatherer groups in Thailand. Here, we sequenced complete mitochondrial (mt) DNA genomes and ~2.364 Mbp of non-recombining Y chromosome (NRY) to learn more about the origins of these two enigmatic populations. Both groups exhibited low genetic diversity compared to other Thai populations, and contrasting patterns of mtDNA and NRY diversity: there was greater mtDNA diversity in the Maniq than in the Mlabri, while the converse was true for the NRY. We found basal uniparental lineages in the Maniq, namely mtDNA haplogroups M21a, R21 and M17a, and NRY haplogroup K. Overall, the Maniq are genetically similar to other negrito groups in Southeast Asia. By contrast, the Mlabri haplogroups (B5a1b1 for mtDNA and O1b1a1a1b and O1b1a1a1b1a1 for the NRY) are common lineages in Southeast Asian non-negrito groups, and overall the Mlabri are genetically similar to their linguistic relatives (Htin and Khmu) and other groups from northeastern Thailand. In agreement with previous studies of the Mlabri, our results indicate that the Mlabri do not directly descend from the indigenous negritos. Instead, they likely have a recent origin (within the past 1,000 years) by an extreme founder event (involving just one maternal and two paternal lineages) from an agricultural group, most likely the Htin or a closely-related group.


Indo-European demic diffusion model, 3rd edition


I have just uploaded the working draft of the third version of the Indo-European demic diffusion model. Unlike the previous two versions, which were published as essays (fully developed papers), this new version adds more information on human admixture, and probably needs important corrections before a definitive edition can be published.

The third version is available right now on ResearchGate and I will post the PDF at Academia Prisca, as soon as possible:

Map overlaid by PCA including Yamna, Corded Ware, Bell Beaker, and other samples

Feel free to comment on the paper here, or (preferably) in our forum.

A working version (needing some corrections) divided by sections, illustrated with up-to-date, high resolution maps, can be found (as always) at the official collaborative Wiki website