I have tried running supervised ADMIXTURE models by selecting distant populations based on PCAs and qpAdm results. The closest approximations to what the software should offer appear with a small number of components, between K=5 and K=7, whether supervised or unsupervised; adding more ancestral populations gives increasingly odd results the more distant in time the populations are from the selected samples.
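For readers who want to try something similar, here is a minimal sketch of how such runs could be prepared. It only builds the command lines and the lines of a `.pop` file; it does not run anything. The flags (`--cv`, `--supervised`, `-j`) and the `.pop` convention (one line per individual in `.fam` order, a label for references, `-` for individuals to be estimated) are as I recall them from the ADMIXTURE manual, so check them against your installed version. The file name, sample IDs and helper names are hypothetical.

```python
def pop_file_lines(fam_ids, reference_labels):
    """Return .pop lines in .fam order: a label for references, "-" otherwise."""
    return [reference_labels.get(sample_id, "-") for sample_id in fam_ids]


def admixture_command(bed_path, k, supervised=False, cv_folds=5, threads=4):
    """Build (but do not run) one ADMIXTURE command line as an argument list."""
    cmd = ["admixture", f"--cv={cv_folds}", f"-j{threads}"]
    if supervised:
        # Supervised mode reads the .pop file that sits next to the .bed file.
        cmd.append("--supervised")
    return cmd + [bed_path, str(k)]


# Hypothetical sample IDs and file name, for illustration only.
fam_ids = ["ref_anatolia_1", "ref_chg_1", "test_1", "test_2"]
references = {"ref_anatolia_1": "Anatolia_N", "ref_chg_1": "CHG"}

pop_lines = pop_file_lines(fam_ids, references)

# In a supervised run, K is fixed by the number of distinct reference labels.
k_supervised = len(set(references.values()))
supervised_cmd = admixture_command("dataset.bed", k_supervised, supervised=True)

# Unsupervised runs over the range of K discussed above.
unsupervised_cmds = [admixture_command("dataset.bed", k) for k in range(5, 8)]

print("\n".join(pop_lines))
print(" ".join(supervised_cmd))
for cmd in unsupervised_cmds:
    print(" ".join(cmd))
```

Comparing the cross-validation errors reported for each K is one common way to judge which small K behaves best, although it is no substitute for checking the outputs against what is known about the samples.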
Labels for ancestral components follow those commonly used in the literature, although supervised ADMIXTURE using the corresponding available samples (viz. Anatolia Neolithic for AHG, Iran Hotu and/or CHG for IHG, AG2, AG3 and Mal’ta for ANE, etc.) offers slightly different, less smooth outputs for some periods, especially among more recent populations.
Outputs depend on many different factors, and these files are intended as an overview of the evolution of these simplistic components. The number of available samples per period, the potential ancestry changes within each conventionally selected period, or whether or not each available sample is representative of the territory it was recovered from, among many other factors, influence the outputs and the maps.
NOTE. In summary, ADMIXTURE results like these below might be used to develop new ideas, to be then formally tested; they cannot be used to support anything. Don’t be like the Copenhagen group, randomly selecting “Steppe ancestry” with K=4, identifying this component as “Indo-Europeans”, and correlating its evolution with changes in vegetation composition in yet another obvious correlation = causation argument among many confounding factors left unaccounted for…
Static ADMIXTURE + culture maps
Colours correspond to the components as labelled in the video and in the files below.
The following maps offer natural neighbour interpolations of ancestral components in ancient DNA samples grouped by periods (conventionally selected following the same pattern as in the Prehistory Atlas).
Extrapolation (inferred ancestry beyond the frame created by available samples per map) is obtained by adding distant external locations (such as Greenland, Arctic, Alaska…) with a value of 0.
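To show what these zero-valued external locations do, here is a toy sketch. Note that it does not use natural neighbour interpolation: I have swapped in plain inverse distance weighting, which is much simpler to write down, and it measures distance directly in degrees of longitude and latitude. All coordinates and values are made up; only the idea of padding the data with distant zeros is taken from the maps.

```python
import math


def idw(lon, lat, samples, power=2.0):
    """Inverse-distance-weighted estimate at (lon, lat).

    `samples` is a list of (lon, lat, value) tuples. Distances are plain
    Euclidean distances in degrees, which is a deliberate simplification.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lon, s_lat, value in samples:
        distance = math.hypot(lon - s_lon, lat - s_lat)
        if distance == 0.0:
            return value  # exactly on a sample: return its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical proportions of one component at made-up sample locations.
samples = [(30.0, 50.0, 0.60), (35.0, 48.0, 0.50), (40.0, 52.0, 0.55)]

# Distant external locations added with a value of 0.
anchors = [(-40.0, 72.0, 0.0), (-150.0, 64.0, 0.0), (100.0, 80.0, 0.0)]

far_point = (-10.0, 65.0)  # a location well outside the sampled area
without_anchors = idw(*far_point, samples)
with_anchors = idw(*far_point, samples + anchors)

print(f"without anchors: {without_anchors:.3f}")
print(f"with anchors:    {with_anchors:.3f}")
```

Without the anchors, any point outside the sampled area simply inherits a weighted average of the nearest samples; with them, the estimate is pulled towards 0 the further it lies from the actual data.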
Videos offer a dynamic timeline.
This ancestry peaks among Baikal HG, Ust’Belaya, Nganasans, or Ulchi, hence the different labels used.
Iran HG ancestry
ADMIXTURE maps by period
Early Bronze Age
Middle Bronze Age
Late Bronze Age
Early Iron Age
Late Iron Age
These are the samples used for interpolations in each period (except for modern populations, which are those included in the Reich Lab curated dataset):
The Eurasian steppes reach from the Ukraine in Europe to Mongolia and China. Over the past 5000 years, these flat grasslands were thought to be the route for the ebb and flow of migrant humans, their horses, and their languages. de Barros Damgaard et al. probed whole-genome sequences from the remains of 74 individuals found across this region. Although there is evidence for migration into Europe from the steppes, the details of human movements are complex and involve independent acquisitions of horse cultures. Furthermore, it appears that the Indo-European Hittite language derived from Anatolia, not the steppes. The steppe people seem not to have penetrated South Asia. Genetic evidence indicates an independent history involving western Eurasian admixture into ancient South Asian peoples.
According to the commonly accepted “steppe hypothesis,” the initial spread of Indo-European (IE) languages into both Europe and Asia took place with migrations of Early Bronze Age Yamnaya pastoralists from the Pontic-Caspian steppe. This is believed to have been enabled by horse domestication, which revolutionized transport and warfare. Although in Europe there is much support for the steppe hypothesis, the impact of Early Bronze Age Western steppe pastoralists in Asia, including Anatolia and South Asia, remains less well understood, with limited archaeological evidence for their presence. Furthermore, the earliest secure evidence of horse husbandry comes from the Botai culture of Central Asia, whereas direct evidence for Yamnaya equestrianism remains elusive.
We investigated the genetic impact of Early Bronze Age migrations into Asia and interpret our findings in relation to the steppe hypothesis and early spread of IE languages. We generated whole-genome shotgun sequence data (~1 to 25 X average coverage) for 74 ancient individuals from Inner Asia and Anatolia, as well as 41 high-coverage present-day genomes from 17 Central Asian ethnicities.
We show that the population at Botai associated with the earliest evidence for horse husbandry derived from an ancient hunter-gatherer ancestry previously seen in the Upper Paleolithic Mal’ta (MA1) and was deeply diverged from the Western steppe pastoralists. They form part of a previously undescribed west-to-east cline of Holocene prehistoric steppe genetic ancestry in which Botai, Central Asians, and Baikal groups can be modeled with different amounts of Eastern hunter-gatherer (EHG) and Ancient East Asian genetic ancestry represented by Baikal_EN.
In Anatolia, Bronze Age samples, including from Hittite speaking settlements associated with the first written evidence of IE languages, show genetic continuity with preceding Anatolian Copper Age (CA) samples and have substantial Caucasian hunter-gatherer (CHG)–related ancestry but no evidence of direct steppe admixture.
In South Asia, we identified at least two distinct waves of admixture from the west, the first occurring from a source related to the Copper Age Namazga farming culture from the southern edge of the steppe, who exhibit both the Iranian and the EHG components found in many contemporary Pakistani and Indian groups from across the subcontinent. The second came from Late Bronze Age steppe sources, with a genetic impact that is more localized in the north and west.
Our findings reveal that the early spread of Yamnaya Bronze Age pastoralists had limited genetic impact in Anatolia as well as Central and South Asia. As such, the Asian story of Early Bronze Age expansions differs from that of Europe. Intriguingly, we find that direct descendants of Upper Paleolithic hunter-gatherers of Central Asia, now extinct as a separate lineage, survived well into the Bronze Age. These groups likely engaged in early horse domestication as a prey-route transition from hunting to herding, as otherwise seen for reindeer. Our findings further suggest that West Eurasian ancestry entered South Asia before and after, rather than during, the initial expansion of western steppe pastoralists, with the later event consistent with a Late Bronze Age entry of IE languages into South Asia. Finally, the lack of steppe ancestry in samples from Anatolia indicates that the spread of the earliest branch of IE languages into that region was not associated with a major population migration from the steppe.
I think the wording of the abstract is weird, but consistent with their samples and results, so it is probably just clickbait / citebait for Indian journalists and social networks, or maybe a new attempt to ‘show respect for the sensibilities of Indians’ in relation to the artificially magnified “AIT vs. OIT” controversy, which is only present in India.
There has been an undercurrent of intellectual tension between geneticists studying human population history and archaeologists for almost 40 years. The rapid development of paleogenomics, with geneticists working on the very material discovered by archaeologists, appears to have recently heightened this tension. The relationship between these two fields thus far has largely been of a multidisciplinary nature, with archaeologists providing the raw materials for sequencing, as well as a scaffold of hypotheses based on interpretation of archaeological cultures from which the geneticists can ground their inferences from the genomic data. Much of this work has taken place in the context of western Eurasia, which is acting as testing ground for the interaction between the disciplines. Perhaps the major finding has not been any particular historical episode, but rather the apparent pervasiveness of migration events, some apparently of substantial scale, over the past ∼5000 years, challenging the prevailing view of archaeology that largely dismissed migration as a driving force of cultural change in the 1960s. However, while the genetic evidence for ‘migration’ is generally statistically sound, the description of these events as structured behaviours is lacking, which, coupled with often over simplistic archaeological definitions, prevents the use of this information by archaeologists for studying the social processes they are interested in. In order to integrate paleogenomics and archaeology in a truly interdisciplinary manner, it will be necessary to focus less on grand narratives over space and time, and instead integrate genomic data with other form of archaeological information at the level of individual communities to understand the internal social dynamics, which can then be connected amongst communities to model migration at a regional level. 
A smattering of recent studies have begun to follow this approach, resulting in inferences that are not only helping ask questions that are currently relevant to archaeologists, but also potentially opening up new avenues of research.
Interesting excerpts (emphasis mine, reference numbers removed for clarity):
There are two major, somewhat intertwined, problems that currently exist.
First, archaeologists are not critiquing whether the migrations identified by paleogenomics using sophisticated population genetic machinery are actually occurring. Instead, the technical criticism arrives in terms of how these migrations are being ascribed to specific cultures. In many paleogenomic papers, there is a tendency (and often an analytical and technical need) to associate samples with particular archaeological cultures, for which all samples are then treated as possessing some kind homogenous and pervasive social identity that is bound in space and time. The major critiques of this thus far have been directed to those studies examining Corded-Ware and Bell-Beaker-related individuals and their potential relationship to the Yamnaya [Vander Linden (2016), Heyd (2017), Furholt (2017)], but are applicable to many other ‘migration’ scenarios described in the recent literature. This is compounded by the use of sometimes small numbers of samples to represent certain cultures from a particular geographic area as representatives of the entire culture at a supra-regional level. Yet often these archaeological cultures such as Corded-Ware and Bell-Beaker themselves show considerable variability in space and time, and even within cemeteries, which is not factored into the genetic analysis.
From a population geneticists point of view, this kind of simplification is somewhat understandable and will often likely have very little impact on the final analysis, given that the primary goal is usually to use ancient samples to better understand modern genetic variation. Though there may be a specific historical interest in some of these past events, I would argue that the aim for most population geneticists at a higher level is to try and fit modern patterns of genetic variation using the simplest models possible that take into account past demographic events (for example fitting f-statistics using the ADMIXTUREGRAPH approach), as this is how we are trained. Although sharing an archaeological culture may not mean that a set of individuals are part of the same homogeneous social group in reality, this approach may be a good enough heuristic to find broad genetic connections compared to another group represented by a different culture, which can then ultimately help understand and model modern human population structure. However, for an archaeologists interested in the ancient individuals themselves and their social identity, this lumping is unsatisfactory, where sophisticated narratives of the individual migrants and their ancient communities are the intended goal.
The second related problem is that ‘migration’ in the sense used currently in the paleogenomics literature lacks sufficient detail to be of much use for an archaeologists attempting to disentangle the complex social dynamics within and between communities. To truly understand the role of migration as a social process and its contribution towards cultural changes, it is necessary to describe it as a structured behaviour, rather than treating it as an explanatory ‘black box’. Are the migrations occurring as a result of short range waves-of-advance movements, or as long-distance movements via leapfrogging models or stream migrations along established routes dependent on key kinship networks. Are there return migrants, and are some subset of individuals more predisposed to migration driving the signals? Although such models were implemented in past studies (even with classical markers ) and are part of the population genetics literature, they are lacking in the current paleogenomics literature when discussing migration. The finding that there is an increase of 12.3% of ancestry type X in population A compared to the preceding population B that is suggestive of a migration, is not particularly useful for examining these kind of models. It is also unclear to what degree standard population genetic parameters estimated from genomic data such as effective population size, Ne, and gene flow are relevant to models studied in archaeology, given they reflect (somewhat undefined) long-term population sizes and average rates of movements over time, rather than reflecting any kind of reality of census size and mobility in the ancient communities the archaeologists are actually attempting to study.
The text goes on to talk about ways of studying fine-grained social dynamics of local cultures, such as:
define levels of genetic relatedness, but also in terms of material culture, age, sex, stress and activity indicators, stable isotopes for diet reconstruction (nitrogen, δ13C and δ15N, carbon, 13C/12C) and strontium and oxygen isotopes for mobility (87Sr/86Sr, δ18O). Where possible, sites should be examined over multiple generations. In addition it will be incredibly useful to characterize the impact of disease in these communities, which is also proving to be a highly fruitful realm for paleogenomics.
I would say that the main problem is not the obvious limitations of palaeogenomics in terms of identifying prehistoric ethnolinguistic communities and their evolution, which is why it is just another tool to complement archaeology and linguistics. The main problem is the narrow understanding that some people have of these inherent limitations (especially when it suits them) when publicizing simplistic conclusions based on these tools and their results. And I am not referring only to amateurs.
Recent genetic studies have claimed to reveal a massive migration of the bearers of the Yamnaya culture (Pit-grave culture) to the Central and Northern Europe. This migration has supposedly lead to the formation of the Corded Ware cultures and thereby to the dispersal of Indo-European languages in Europe. The article is a summary presentation of available archaeological, linguistic, genetic and cultural data that demonstrates many discrepancies in the suggested scenario for the transformations caused by the Yamnaya “invasion” some 5000 years ago.
Both teams [Reich/Anthony, and Willerslev/Kristiansen] interpreted this resemblance in the same way: as evidence of mass migration of the Yamnaya culture from the steppes into the Central and Northern Europe, resulting in the formation of the Corded Ware cultures, and these are universally recognised as Indo-European. Since earlier in this part of Europe existed a different pool of genomes, geneticists presumed that the Yamnaya migration alone had brought the Indo-European languages into Europe. It is difficult to say to what extent the pre-convictions of the involved archaeologists influenced these conclusions, or whether the results of the genetic studies attracted archaeologists with such beliefs.
Mismatch of cultural manifestations
First, we might question the idea of the Yamnaya culture as a unity rather than a loose conglomerate of cultures. Merpert (1974) divided it into nine local groups but did not recognise them as separate cultures. However, in 1975 I suggested that Nerushay (Budzhak) monuments should be recognised as a distinct culture (Klejn 1975), although still as a part of the same broader steppe community.
This was accepted by other specialists (Ivanova 2012; 2013; 2014). Generally, in the western branch of this community, a mixture of the eastern rites of interment with local, Balkan ceramics can be observed. It should be noted that hitherto all genetic samples were taken from eastern material (in the vicinity of Samara in the Volga basin and Kalmykia), while the central thesis concerns the intrusion of the western branch of this community (Budzhak culture) into Europe.
Simultaneity of cultures
The Yamnaya culture (Chernykh & Orlovskaya 2004a; Heyd 2011; Frȋnculeasa et al. 2015) appears not to be the predecessor of the Corded Ware cultures but is contemporary with them. The Corded Ware cultures appeared also around the turn between the fourth and third millennium BC (Stöckli 2001; Furholt 2003). Their derivation from the Yamnaya seems, therefore, to be less probable. This is evidenced by the fact that the corded beakers or amphorae found in the Budzhak culture are not the prototypes of the corded beakers or amphorae found in more northern territories, but seem instead to be an outcome of contemporaneous contacts (Ivanova 2014; Klejn 2017c).
Discrepancies across the haplogroups
Even more remarkable is the variation in the distribution of types of Y chromosome. In the Yamnaya population, R1b is not just a single occurrence (there are about seven known occurrences) while in the Corded Ware population a different clade of R1b is found and R1a is predominant (several instances). Thus the postulate of unbroken succession finds no support!
In the tables presented in the article by Reichs’ team (Haak et al. 2015) the genetic pool connecting the Yamnaya culture with the Corded Ware people is shown to be more intense in Northern Europe (Norway and Sweden) and decreases gradually from the North to the South (Fig. 6). It is weakest around the Danube, in Hungary, i. e. areas neighbouring the western branch of the Yamnaya culture! This is the reverse image to what the proposed hypothesis by the geneticists would lead us to expect. It is true that this gradient is traced back from the contemporary materials, but it was already present during the Bronze Age (Klejn 2015a).
The author also uses questionable interpretations from selected articles to advance his (as of today) untenable positions regarding a Mesolithic origin of the reconstructible Proto-Indo-European language.
1. Glottochronology, for a PIE origin:
If based on the data of glottochronology (taking into account all disputes) the period of initial dispersal is to be dated to the 7th-5th millennium BC.
The currently available dataset does not contradict the hypothesis that R-GG400 marks a link between the East European steppe dwellers and West Asians, though the route and even direction of this migration is disputable. It does, however, demonstrate that present-day West European R1b chromosomes do not originate from the Yamnaya populations analyzed in (Haak et al. 2015; Mathieson et al. 2015) and raises the question of their origin. A Bronze Age origin is more likely than a Neolithic one (Balaresque et al. 2010), but further ancient DNA studies may be necessary to identify this source.
This is usual with amateur geneticists (those who don’t publish, and are therefore not subjected to criticism): if someone is wrong on one point (whether in Archaeology or Genetics), then they must be wrong in everything else. It seems to me that Klejn’s theses against recent genetic results rest on the same assumption: The Yamna -> Corded Ware migration model is wrong, ergo the Yamna homeland model is wrong.
I guess this same fallacy is what a lot of angered geneticists (whether professionals or amateurs) are going to use to dismiss Klejn’s criticism, trying to focus on what he clearly does not grasp – about genomic data of Yamna peoples and their expansion – to disregard his doubts on genetic interpretations entirely.
I have warned many times about how simplistic interpretations of genetic data would cause a general mistrust in the field, and that archaeologists won’t take the discipline seriously, no matter how many articles get published in famous research tabloids like Nature or Science…
Those who dismiss this warning lightly seem to forget the fate of other recent “scientific breakthroughs” which were initially so promising that Humanities appeared to matter no more, like glottochronology for Linguistics and, to some extent, that of radiocarbon analysis for Archaeology. EDIT: see here a recent example of discussion on discrepancies between archaeological and 14C-based chronologies, whereby ‘scientific data’ obviously needs archaeological context for a meaningful interpretation.
Featured image: The direction of the supposed migration of the bearers of the Yamnaya culture into the area of the Corded Ware cultures. After Haak et al. 2015.
NOTE: I obviously don’t agree with Klejn’s main model: he criticises the Proto-Indo-European steppe homeland, and more specifically the expansion of Yamna peoples with R1b-L23 subclades, which I support. But, probably because of his “pre-convictions” (as he puts it when describing proponents of the steppe hypotheses) about the Proto-Indo-European homeland in Northern Europe during the Mesolithic, he was one of the first renowned archaeologists to criticise the obvious inconsistencies in the genetic model of migrations based exclusively on the “Yamnaya ancestral component” concept, and to provoke the necessary reaction from (until then) overconfident geneticists, and he deserves credit for that.
Human ancestry can only help solve anthropological questions by using all anthropological disciplines involved. I have said that many times in this blog.
Correlation does not mean causation
Really, it does not.
You might think the tenet ‘correlation does not mean causation’ must be evident at this point in Statistics, and it must also be for all those using statistical methods in their research. But it is sadly not so. A lot of researchers just look for correlation, and derive conclusions – without even an initial sound hypothesis to be tested… You can judge for yourself, e.g. by reading the many instances of this complaint in recent publications of Biomedical and Social Sciences, on the interesting blog Statistical Modeling, Causal Inference, and Social Science.
In anthropological questions regarding Indo-European studies there is an added handicap: not taking correlation to mean causation also means taking into account the many linguistic and archaeological data available right now to explain the expansion of Indo-European languages, if only to avoid the most obvious confounders.
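If the idea of a confounder sounds abstract, a toy simulation may help. In this sketch (entirely made-up numbers, no real data) two variables `x` and `y` have no influence on each other at all; both simply depend on a third variable, `confounder`. They still correlate clearly, and the correlation vanishes once the confounder is regressed out of both. The helper names are mine, and the linear adjustment is just the simplest possible choice.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [y - mean_y - slope * (z - mean_z) for y, z in zip(ys, zs)]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# The confounder drives both x and y; x and y never influence each other.
confounder = [rng.gauss(0, 1) for _ in range(n)]
x = [z + rng.gauss(0, 1) for z in confounder]
y = [z + rng.gauss(0, 1) for z in confounder]

raw = pearson(x, y)
adjusted = pearson(residuals(x, confounder), residuals(y, confounder))

print(f"raw correlation:      {raw:.3f}")
print(f"after adjusting for z: {adjusted:.3f}")
```

The catch in real research is that you can only adjust for the confounders you have thought of and measured, which is exactly why the linguistic and archaeological data cannot be left out.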
You might also believe that international researchers in Human Evolutionary Biology – after all, this is essentially a biomedical discipline – are acquainted with statistical methods and their problems when applied to their field. And that scientific journals – and especially those with the highest impact factors, like Nature, Science, or PNAS – have professional, careful reviewers who would never accept papers that equate correlation with causation, especially when Social Sciences are involved (because this alone might make errors grow exponentially…). Sadly, this is obviously not so, either.
Both studies [Haak et al. (2015) and this one] found a genetic affinity between samples from a central European culture known as Corded Ware, which existed from around 2500 bc, and samples from the earlier Yamnaya steppe culture. This similarity between distant populations is best explained by a substantial westward expansion of the Yamnaya or their close relatives into central Europe (Fig. 1b). Such an expansion is consistent with the steppe hypothesis, which argues that Corded Ware cultures were a conduit for the dispersal of Indo-European languages into Europe.
More interesting than these vague words – and the short, almost invisible suggestion that Yamna may not be exactly the population behind Corded Ware peoples – are the maps that illustrated their risky hypothesis in Nature. They called it the “steppe hypothesis”, just like that, in general terms, as if everyone defending a steppe origin for Proto-Indo-European would support such a model, when they actually referred to the specific hypothesis of one of their authors (Kristiansen), one of the few archaeologists who keep Gimbutas’ concept of the ‘Kurgan peoples’ alive, based on the Corded Ware culture:
In many publications that followed, the trend has been to reproduce this graphical model, by asserting (or implying) that Bell Beaker peoples were the result of subsequent Corded Ware migrations, and indeed that Corded Ware peoples migrated from the Yamna culture, and were thus the vector of expansion for Indo-European languages in Europe.
We shall see then just a rather surreptitious shift in terminology from ‘Yamnaya’ to ‘steppe’ component, to adapt to the new data – i.e. some damage control while the ship of ‘Yamnaya ancestry’ capsizes – but little else. “Earlier ‘Yamnaya ancestry’, you say? Just, you know, let’s call it ‘steppe ancestry’ and shift the expansion of Indo-European languages to one or two thousand years earlier, and done!”
The damage of this post-truth genetics is already done: we will see the unending distribution on the Internet in general, and on social networks in particular, of these grandiose conclusions, of far-fetched Indo-European migration models that include the Corded Ware culture, of simplistic maps with apparently harmless ‘arrows of migration’ (like the above) representing fictional population movements suggesting nonexistent dialectal branches.
You might be one of those sceptics wary of so many boring statistical rules: “But it’s safe reasoning: Yamnaya samples have an ‘ancestral component’ that is found elevated in Corded Ware samples, and less so in Bell Beaker samples, and PCA showed a similar result…so the migration model Yamnaya -> Corded Ware -> Bell Beaker is a priori correct, right?”
The ‘Future American’ hypothesis
Let me illustrate this attractive “Correlation = Causation” argument, using it to solve the problem of Future American languages.
Suppose we live in a future post-apocalyptic world ca. 3500 AD, with no surviving historical records before 3000 AD. None. Just investigation of cultures and their relationship by Archaeology, proto-languages reconstructed and language families identified by Linguistics, etc.
We have thus Future Germanic and Future Romance as the only language families spoken in Future Western Europe and in the Future Americas, in a distribution similar to the present day*, and we have certain somehow related archaeologically-defined cultures on both sides of the Atlantic, like Briton, Iberian, Norman, or Lowlandish, although their distribution remains partly undefined in time and space.
* If you are really curious about this scenario, you can read about the potential evolution of a Future North-American language.
But what languages did the ancestors of Future Americans speak, and who spread them? That question remains far from being settled by our future researchers, in spite of the solidest linguistic and migration models (talking mainly about Briton and Iberian cultures): too many authorities out there questioning them, fighting to impose their own pet theories.
Suddenly, the newly developed field of Human Ancestry comes to save the day. So let’s say we have this map of ancient samples recovered (dated from, say, the 6th to the 18th century AD), and our study is centered on the newly described “Western European” component (a precise combination of, say, WHG+steppe), which peaks in early samples from the Low Lands – hence we call it, quite daringly, “Lowlandic component”.
Our group is keen to demonstrate that the ancient Lowlandic culture described in Archaeology (marked especially by the worldwide distribution of tulips among other traits) is the origin of Western European and American languages… Now, let’s reach conclusions about migrations in the Middle Ages!
PCA shows that South-West European samples cluster closely to some North-West European samples, and that some late South American samples available cluster at some distance from North American samples – nearer to a native component represented by two individuals with 0% Lowlandic ancestry and a different cluster in PCA. And some North-American samples cluster quite closely to North-West European samples.
Based on the decrease in ‘Lowlandic component’ in the different samples and on PCA, we conclude that Lowlandic peoples (“or their close relatives”) must have migrated at the same time to North America, South America (or potentially from North America to South America?) as well as western, central, and northern Europe. Both migration events must have happened roughly at the same time, in part because both distinct language families appear in a north-south distribution, and Proto-Lowlandic must be (according to Genetics) the ancestor of both Proto-Future-Germanic and Proto-Future-Romance.
That makes a lot of sense! A huge Lowlandic pressure for migration, you see. Push-pull mechanisms and stuff. A Lowlandic Empire probably (scattered remains are found everywhere)! And, judging by the presence of the ‘Lowlandic component’ in Future East Europe from the Elbe to the Vistula, maybe Lowlandic peoples spread Proto-Slavic, too! We can even date the common Lowlandic-Slavic proto-language this way! So many groundbreaking conclusions!
Future scholars supporting the Lowlandic homeland are on fire; they can’t get enough of publishing papers on the subject. “Two different Future American language families with cultural origins in Britain and Iberia, my ass! Because genetics.”
And don’t forget the future people of haplogroup R1b-U106 and high Lowlandic component: Wow, they are the heirs of those who expanded Future Germanic and Future Romance languages everywhere, aren’t they? How proud they must be. And who wouldn’t want to have these tall, blond, blue-eyed Lowlanders as their forefathers? Personalised genetic analysis is selling like crazy: “let’s know our Lowlandic percentage!”. Everyone is happy, colourful maps with lots of arrows and shit…
But – your future you might ask in awe, seeing that this doesn’t sound quite right, based on your basic archaeological and linguistic knowledge:
What about specific models of migration proposed to date? The solidest ones, not just any one that seems to fit?
What about the dialectal classification of languages? The mainstream ones, not those that are compatible with this interpretation?
What about archaeological cultures to which individual samples belonged?
What about the actual dates of each sample? And how this date relates to the state of the culture to which it belongs?
What about the haplogroups, and the actual subclade of each haplogroup?
What about the territories, cultures, and dates not sampled, could they change this interpretation in light of known archaeological models?
And what about the actual origin of that ancestral component they so frivolously named? Did it really appear ex nihilo in the Low Lands, and expand from there?
“Who cares! This new data is sooo coool… And it proves what we wanted, what a coincidence! And it’s numbers, mate! Numbers don’t lie.”
I have just uploaded the working draft of the third version of the Indo-European demic diffusion model. Unlike the previous two versions, which were published as essays (fully developed papers), this new version adds more information on human admixture, and probably needs important corrections before a definitive edition can be published.
The third version is available right now on ResearchGate and Academia.edu. I will post the PDF at Academia Prisca as soon as possible.
Feel free to comment on the paper here, or (preferably) in our forum.
A working version (needing some corrections) divided by sections, illustrated with up-to-date, high resolution maps, can be found (as always) at the official collaborative Wiki website indo-european.info.
Palaeogenomic data have illuminated several important periods of human past with surprising implications for our understanding of human evolution. One of the major changes in human prehistory was Neolithisation, the introduction of the farming lifestyle to human societies. Farming originated in the Fertile Crescent approximately 10,000 years BC and in Europe it was associated with a major population turnover. Ancient DNA from Anatolia, the presumed source area of the demic spread to Europe, and the Balkans, one of the first known contact zones between local hunter-gatherers and incoming farmers, was obtained from roughly contemporaneous human remains dated to ∼6th millennium BC. This new unprecedented dataset comprised of 86 full mitogenomes, five whole genomes (7.1–3.7x coverage) and 20 high coverage (7.6–93.8x) genomic samples. The Aegean Neolithic population, relatively homogeneous on both sides of the Aegean Sea, was positively proven to be a core zone for demic spread of farmers to Europe. The farmers were shown to migrate through the central Balkans and while the local sedentary hunter-gatherers of Vlasac in the Danube Gorges seemed to be isolated from the farmers coming from the south, the individuals of the Aegean origin infiltrated the nearby hunter-gatherer community of Lepenski Vir. The intensity of infiltration increased over time and even though there was an impact of the Danubian hunter-gatherers on genetic variation of Neolithic central Europe, the Aegean ancestry dominated during the introduction of farming to the continent.
Taking only admixture analyses using Yamna samples:
This increased genetic affinity of Neolithic farmers to Danubians was observed for Neolithic Hungarians, LBK from central Europe and LBK Stuttgart sample. Some post-Neolithic samples also proved to share more drift with Danubians, again samples from Hungary (Bronze Age and Copper Age samples) and also Yamnaya and samples with elevated Yamnaya ancestry (Early Bronze Age samples from Únětice, Bell Beaker samples, Late Neolithic Karlsdorf sample and Corded Ware samples).
The results of our ADMIXTURE analysis for the dataset including also Yamnaya samples are shown in Figure S1c. The cross-validation error was the lowest for K=2. Supervised and unsupervised analyses for K=3 are again highly concordant. Early Neolithic farmers again demonstrate almost no evidence of hunter-gatherer admixture, while it is observable in the Middle Neolithic farmers. However, much of the Late Neolithic hunter-gatherer ancestry from the previous analysis is replaced by Yamnaya ancestry. These results are consistent with the results of Haak et al. who demonstrated a resurgence of hunter-gatherer ancestry followed by the establishment of Eastern hunter-gatherer ancestry.
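The model-selection step mentioned in the quoted passage (picking the K with the lowest cross-validation error) can be sketched in a few lines of Python. This is only an illustration: it assumes log lines of the form `CV error (K=2): 0.5120`, as printed by ADMIXTURE when run with `--cv`; the function names are hypothetical, and the error values are invented, not taken from the paper.

```python
import re

# Assumed line format of ADMIXTURE's --cv output, e.g. "CV error (K=2): 0.5120".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text):
    """Return {K: cv_error} for every CV line found in the log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    if not cv_errors:
        raise ValueError("no CV error lines found")
    return min(cv_errors, key=cv_errors.get)


# Invented values, for illustration only (not the paper's results).
example_log = """\
CV error (K=2): 0.5120
CV error (K=3): 0.5190
CV error (K=4): 0.5275
"""

errors = parse_cv_errors(example_log)
print(best_k(errors))
```

A lowest CV error only says which K fits the genotype data best under the model; it says nothing about whether the resulting components correspond to real ancestral populations.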
Again, admixture results show that something in the simplistic Yamna -> Corded Ware model is off. It is still interesting to review admixture results of European Mesolithic and Late Neolithic genomic data in relation to the so-called steppe or yamna ancestry or component (most likely an eastern steppe / forest zone ancestry probably also present in the earlier Corded Ware horizons) and its interpretation…