Yersinia pestis, the etiologic agent of plague, is a bacterium associated with wild rodents and their fleas. Historically it was responsible for three pandemics: the Plague of Justinian in the 6th century AD, which persisted until the 8th century [ 1 ]; the renowned Black Death of the 14th century [ 2, 3 ], with recurrent outbreaks until the 18th century [ 4 ]; and the most recent 19th century pandemic, in which Y. pestis spread worldwide [ 5 ] and became endemic in several regions [ 6 ]. The discovery of molecular signatures of Y. pestis in prehistoric Eurasian individuals and two genomes from Southern Siberia suggest that Y. pestis caused some form of disease in humans prior to the first historically documented pandemic [ 7 ]. Here, we present six new European Y. pestis genomes spanning the Late Neolithic to the Bronze Age (LNBA; 4,800 to 3,700 calibrated years before present). This time period is characterized by major transformative cultural and social changes that led to cross-European networks of contact and exchange [ 8, 9 ]. We show that all known LNBA strains form a single putatively extinct clade in the Y. pestis phylogeny. Interpreting our data within the context of recent ancient human genomic evidence that suggests an increase in human mobility during the LNBA, we propose a possible scenario for the early spread of Y. pestis: the pathogen may have entered Europe from Central Eurasia following an expansion of people from the steppe, persisted within Europe until the mid-Bronze Age, and moved back toward Central Eurasia in parallel with human populations.
It seems that, notwithstanding the simplistic (white) arrows of steppe ancestry expansion shown in their map (see below), the actual expansion of Yersinia pestis might have in fact accompanied Yamna migrants from the Pontic-Caspian steppe into Early Bronze Age cultures from the Balkans, including Bell Beaker migrants, as the phylogenetic analysis and dates suggest – and as the potential arrows of the plague expansion in the map (in green) show.
Instead of the warring nature, close ties, and mobility of Corded Ware peoples (the reasons I previously used to justify the rapid spread of the disease among CWC groups), I guess the factors that helped expand the disease were rather the higher population density of SE Europe compared to the regions north of the loess belt, and the greater admixture of Yamna migrants with native SE European populations.
Nevertheless, lacking more data, it is unclear if the disease expanded with both steppe groups.
The extent of population structure within Ireland is largely unknown, as is the impact of historical migrations. Here we illustrate fine-scale genetic structure across Ireland that follows geographic boundaries and present evidence of admixture events into Ireland. Utilising the ‘Irish DNA Atlas’, a cohort (n = 194) of Irish individuals with four generations of ancestry linked to specific regions in Ireland, in combination with 2,039 individuals from the Peoples of the British Isles dataset, we show that the Irish population can be divided in 10 distinct geographically stratified genetic clusters; seven of ‘Gaelic’ Irish ancestry, and three of shared Irish-British ancestry. In addition we observe a major genetic barrier to the north of Ireland in Ulster. Using a reference of 6,760 European individuals and two ancient Irish genomes, we demonstrate high levels of North-West French-like and West Norwegian-like ancestry within Ireland. We show that our ‘Gaelic’ Irish clusters present homogenous levels of ancient Irish ancestries. We additionally detect admixture events that provide evidence of Norse-Viking gene flow into Ireland, and reflect the Ulster Plantations. Our work informs both on Irish history, as well as the study of Mendelian and complex disease genetics involving populations of Irish ancestry.
Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.
Here are some interesting excerpts (emphasis mine):
Population structure in Ireland
The geographical distribution of this deep subdivision of Leinster resembles pre-Norman territorial boundaries which divided Ireland into fifths (cúige), with north Leinster a kingdom of its own known as Meath (Mide). However interpreted, the firm implication of the observed clustering is that despite its previously reported homogeneity, the modern Irish population exhibits genetic structure that is subtly but detectably affected by ancestral population structure conferred by geographical distance and, possibly, ancestral social structure.
ChromoPainter PC1 demonstrated high diversity amongst clusters from the west coast, which may be attributed to longstanding residual ancient (possibly Celtic) structure in regions largely unaffected by historical migration. Alternatively, genetic clusters may also have diverged as a consequence of differential influence from outside populations. This diversity between western genetic clusters cannot be explained in terms of geographic distance alone.
In contrast to the west of Ireland, eastern individuals exhibited relative homogeneity; (…) The overall pattern of western diversity and eastern homogeneity in Ireland may be explained by increased gene flow and migration into and across the east coast of Ireland from geographically proximal regions, the closest of which is the neighbouring island of Britain.
Analysis of variance of the British admixture component in cluster groups showed a significant difference (p < 2×10⁻¹⁶), indicating a role for British Anglo-Saxon admixture in distinguishing clusters, and ChromoPainter PC2 was correlated with the British component (p < 2×10⁻¹⁶), explaining approximately 43% of the variance. PC2 therefore captures an east to west Anglo-Celtic cline in Irish ancestry. This may explain the relative eastern homogeneity observed in Ireland, which could be a result of the greater English influence in Leinster and the Pale during the period of British rule in Ireland following the Norman invasion, or simply geographic proximity of the Irish east coast to Britain. Notably, the Ulster cluster group harboured an exceptionally large proportion of the British component (Fig 1D and 1E), undoubtedly reflecting the strong influence of the Ulster Plantations in the 17th century and its residual effect on the ethnically British population that has remained.
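To make the statistics in that excerpt a bit less abstract, here is a minimal sketch of a one-way analysis of variance in plain Python. The group names and admixture proportions are made up for illustration (they are not values from the paper), and the function name `one_way_anova` is my own. It returns the F statistic and eta squared, i.e. the share of the total variance that lies between groups, which is comparable in spirit to the "variance explained" figure quoted for PC2.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F statistic, eta squared) for a dict of group name -> list of values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    group_means = {name: mean(values) for name, values in groups.items()}

    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(
        len(values) * (group_means[name] - grand_mean) ** 2
        for name, values in groups.items()
    )
    # Within-group sum of squares: scatter of individuals around their own group mean.
    ss_within = sum(
        (v - group_means[name]) ** 2
        for name, values in groups.items()
        for v in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared

# Hypothetical 'British component' proportions per cluster group (illustrative only).
example_groups = {
    "Ulster": [0.40, 0.45, 0.50],
    "Leinster": [0.30, 0.32, 0.34],
    "Connacht": [0.15, 0.20, 0.25],
}
f_stat, eta_squared = one_way_anova(example_groups)
print(f"F = {f_stat:.2f}, eta^2 = {eta_squared:.2f}")
```

A large F only says that the group means differ more than the scatter within groups would predict; it says nothing about why they differ, which is where the historical interpretation comes in.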
On the genetic structure of the British Isles
The genetic substructure observed in Ireland is consistent with long term geographic diversification of Celtic populations and the continuity shown between modern and Early Bronze Age Irish people
Clusters representing Celtic populations harbouring less Anglo-Saxon influence separate out above and below SEE on PC4. Notably, northern Irish clusters (NLU), Scottish (NISC, SSC and NSC), Cumbria (CUM) and North Wales (NWA) all separate out at a mutually similar level, representing northern Celtic populations. The southern Celtic populations Cornwall (COR), south Wales (SWA) and south Munster (SMN) also separate out on similar levels, indicating some shared haplotypic variation between geographically proximate Celtic populations across both Islands. It is notable that after the split of the ancestrally divergent Orkney, successive ChromoPainter PCs describe diversity in British populations where “Anglo-saxonization” was repelled. PC3 is dominated by Welsh variation, while PC4 in turn splits North and South Wales significantly, placing south Wales adjacent to Cornwall and north Wales at the other extreme with Cumbria, all enclaves where Brittonic languages persisted.
In an interesting symmetry, many Northern Irish samples clustered strongly with southern Scottish and northern English samples, defining the Northern Irish/Cumbrian/Scottish (NICS) cluster group. More generally, by modelling Irish genomes as a linear mixture of haplotypes from British clusters, we found that Scottish and northern English samples donated more haplotypes to clusters in the north of Ireland than to the south, reflecting an overall correlation between Scottish/north English contribution and ChromoPainter PC1 position in Fig 1 (Linear regression: p < 2×10⁻¹⁶, r² = 0.24).
North to south variation in Ireland and Britain are therefore not independent, reflecting major gene flow between the north of Ireland and Scotland (Fig 5) which resonates with three layers of historical contacts. First, the presence of individuals with strong Irish affinity among the third generation PoBI Scottish sample can be plausibly attributed to major economic migration from Ireland in the 19th and 20th centuries. Second, the large proportion of Northern Irish who retain genomes indistinguishable from those sampled in Scotland accords with the major settlements (including the Ulster Plantation) of mainly Scottish farmers following the 16th Century Elizabethan conquest of Ireland which led to these forming the majority of the Ulster population. Third, the suspected Irish colonisation of Scotland through the Dál Riata maritime kingdom, which expanded across Ulster and the west coast of Scotland in the 6th and 7th centuries, linked to the introduction and spread of Gaelic languages. Such a migratory event could work to homogenise older layers of Scottish population structure, in a similar manner as noted on the east coasts of Britain and Ireland. Earlier communications and movements across the Irish Sea are also likely, which at its narrowest point separates Ireland from Scotland by approximately 20 km.
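The linear regression reported in the excerpt above combines a very small p-value with a modest r² of 0.24, and it is worth keeping in mind what each number measures. As a minimal sketch (plain Python, with toy data and a function name, `linear_fit`, of my own choosing), this is how the slope, intercept and r² of a simple least-squares fit are obtained:

```python
from statistics import mean

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # r2 is the share of the variance in ys that the fitted line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2

# Toy data lying exactly on y = 1 + 2x, so the line explains all of the variance.
slope, intercept, r2 = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept, r2)
```

With thousands of individuals, even a relationship that accounts for only about a quarter of the variance can reach an extreme p-value: the p-value speaks to whether a trend exists at all, while r² speaks to how much of the variation it describes.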
Genomic footprints of migration into Ireland
Quite interesting is that it is haplogroups, and not admixture, that define the oldest migration layers into Ireland. Without evidence of paternal Y-DNA lineages we would probably not be able to ascertain the oldest migrations and the languages brought by migrants, including Celtic languages:
Of all the European populations considered, ancestral influence in Irish genomes was best represented by modern Scandinavians and northern Europeans, with a significant single-date one-source admixture event overlapping the historical period of the Norse-Viking settlements in Ireland (p < 0.01; fit quality FQB > 0.985; Fig 6). (…) This suggests a contribution of historical Viking settlement to the contemporary Irish genome and contrasts with previous estimates of Viking ancestry in Ireland based on Y chromosome haplotypes, which have been very low. The modern-day paucity of Norse-Viking Y chromosome haplotypes may be a consequence of drift with the small patrilineal effective population size, or could have social origins with Norse males having less influence after their military defeat and demise as an identifiable community in the 11th century, with persistence of the autosomal signal through recombination.
European admixture date estimates in northwest Ulster did not overlap the Viking age but did include the Norman period and the Plantations
The genetic legacies of the populations of Ireland and Britain are therefore extensively intertwined and, unlike admixture from northern Europe, too complex to model with GLOBETROTTER.
Featured image, from the article in Scientific Reports: The clustering of individuals with Irish and British ancestry based solely on genetics. Shown are 30 clusters identified by fineStructure from 2,103 Irish and British individuals. The dendrogram (left) shows the tree of clusters inferred by fineStructure and the map (right) shows the geographic origin of 192 Atlas Irish individuals and 1,611 British individuals from the Peoples of the British Isles (PoBI) cohort, labelled according to fineStructure cluster membership. Individuals are placed at the average latitude and longitude of either their great-grandparental (Atlas) or grandparental (PoBI) birthplaces. Great Britain is separated into England, Scotland, and Wales. The island of Ireland is split into the four Provinces; Ulster, Connacht, Leinster, and Munster. The outline of Britain was sourced from Global Administrative Areas (2012). GADM database of Global Administrative Areas, version 2.0. www.gadm.org. The outline of Ireland was sourced from Open Street Map Ireland, Copyright OpenStreetMap Contributors, (https://www.openstreetmap.ie/) – data available under the Open Database Licence. The figure was plotted in the statistical software language R, version 3.4.1, with various packages.
It is unclear whether Indo-European languages in Europe spread from the Pontic steppes in the late Neolithic, or from Anatolia in the Early Neolithic. Under the former hypothesis, people of the Globular Amphorae culture (GAC) would be descended from Eastern ancestors, likely representing the Yamnaya culture. However, nuclear (six individuals typed for 597 573 SNPs) and mitochondrial (11 complete sequences) DNA from the GAC appear closer to those of earlier Neolithic groups than to the DNA of all other populations related to the Pontic steppe migration. Explicit comparisons of alternative demographic models via approximate Bayesian computation confirmed this pattern. These results are not in contrast to Late Neolithic gene flow from the Pontic steppes into Central Europe. However, they add nuance to this model, showing that the eastern affinities of the GAC in the archaeological record reflect cultural influences from other groups from the East, rather than the movement of people.
Excerpt, from the discussion:
In its classical formulation, the Kurgan hypothesis, i.e. a late Neolithic spread of proto-Indo-European languages from the Pontic steppes, regards the GAC people as largely descended from Late Neolithic ancestors from the East, most likely representing the Yamna culture; these populations then continued their Westward movement, giving rise to the later Corded Ware and Bell Beaker cultures. Gimbutas suggested that the spread of Indo-European languages involved conflict, with eastern populations spreading their languages and customs to previously established European groups, which implies some degree of demographic change in the areas affected by the process. The genomic variation observed in GAC individuals from Kierzkowo, Poland, does not seem to agree with this view. Indeed, at the nuclear level, the GAC people show minor genetic affinities with the other populations related with the Kurgan Hypothesis, including the Yamna. On the contrary, they are similar to Early-Middle Neolithic populations, even geographically distant ones, from Iberia or Sweden. As already found for other Late Neolithic populations, in the GAC people’s genome there is a component related to those of much earlier hunting-gathering communities, probably a sign of admixture with them. At the nuclear level, there is a recognizable genealogical continuity from Yamna to Corded Ware. However, the view that the GAC people represented an intermediate phase in this large-scale migration finds no support in bi-dimensional representations of genome diversity (PCA and MDS), ADMIXTURE graphs, or in the set of estimated f3-statistics.
Together with Globular Amphora culture samples from Mathieson et al. (2017), this suggests that Kristiansen’s Indo-European Corded Ware Theory is wrong, even in its latest revised models of 2017.
On the other hand, the article’s genetic finds have some interesting connections in terms of mtDNA phylogeography, but without a proper archaeological model it is difficult to explain them.
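For readers unfamiliar with the f3-statistics mentioned in the excerpt above, here is a minimal sketch of the idea in plain Python. It computes the simple, uncorrected form f3(target; A, B), the average over SNPs of (t − a)(t − b), from allele frequencies. The frequencies are invented for illustration, the function name `f3` is my own, and I am not claiming this is the exact estimator used in the paper: real software additionally corrects for finite sample size and estimates standard errors.

```python
def f3(target, source_a, source_b):
    """Uncorrected f3(target; A, B): mean over SNPs of (t - a) * (t - b).

    Each argument is a list of allele frequencies, one value per SNP.
    A clearly negative value suggests the target is a mixture of
    populations related to A and B.
    """
    terms = [(t - a) * (t - b) for t, a, b in zip(target, source_a, source_b)]
    return sum(terms) / len(terms)

# Hypothetical allele frequencies at three SNPs (illustrative only).
pop_a = [0.1, 0.5, 0.9]
pop_b = [0.7, 0.2, 0.4]
# A toy 'admixed' population drawing half of its ancestry from each source.
mixed = [0.5 * (a + b) for a, b in zip(pop_a, pop_b)]

print(f3(mixed, pop_a, pop_b))  # negative: the target sits between A and B
print(f3(pop_a, mixed, pop_b))  # positive: A is not intermediate between the others
```

The sign is the informative part: a population can only produce a negative value if its frequencies tend to fall between those of the two reference populations, which is what admixture does.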
Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.
There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.
This study kept confirming the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, though, which offers strong reasons to reject the Indo-European from the west hypothesis.
Here are first the PCA of samples included in this paper, and then the PCA of ancient Eurasians (Mathieson et al. 2017) and modern populations (Lazaridis et al. 2014) for comparison of similar clusters:
Human ancestry can only help solve anthropological questions by using all anthropological disciplines involved. I have said that many times in this blog.
Correlation does not mean causation
Really, it does not.
You might think the tenet ‘correlation does not mean causation’ must be evident by now in Statistics, and therefore to all those using statistical methods in their research. But it is sadly not so. A lot of researchers just look for correlation, and derive conclusions – without even an initial sound hypothesis to be tested… You can judge for yourself, e.g. reading the many instances of this complaint about recent publications in Biomedical and Social Sciences, on the interesting blog Statistical Modeling, Causal Inference, and Social Science.
In anthropological questions regarding Indo-European studies there is an added handicap: not taking correlation to mean causation does also mean – to avoid at least the most obvious confounders – taking into account the multiple linguistic and archaeological data that are available right now, to explain the expansion of Indo-European languages.
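A toy simulation may help show why confounders matter so much here. Assume, purely hypothetically, a hidden common source z that contributes to a measured 'component' in two groups x and y, with no contribution at all from x to y or from y to x. The variable names, the noise levels and the helper functions below are my own choices for illustration, not anything taken from the papers discussed.

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, zs):
    """Remove from ys the part that is linearly predictable from zs."""
    z_bar, y_bar = mean(zs), mean(ys)
    szz = sum((z - z_bar) ** 2 for z in zs)
    slope = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys)) / szz
    return [y - y_bar - slope * (z - z_bar) for z, y in zip(zs, ys)]

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]      # hidden common cause
x = [zi + rng.gauss(0, 0.5) for zi in z]     # depends on z only
y = [zi + rng.gauss(0, 0.5) for zi in z]     # depends on z only, never on x

raw_r = pearson_r(x, y)
# Partial correlation: correlate what is left of x and y once z is accounted for.
partial_r = pearson_r(residuals(x, z), residuals(y, z))

print(f"correlation of x and y:           {raw_r:.2f}")
print(f"same, after controlling for z:    {partial_r:.2f}")
```

The raw correlation between x and y comes out strong even though neither influences the other, and it all but vanishes once the common source is taken into account. In our field, the role of z is played by whatever the linguistic and archaeological data tell us about shared origins, which is why they cannot be left out of the model.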
You might also believe that international researchers in Human Evolutionary Biology – after all, this is essentially a biomedical discipline – are acquainted with statistical methods and their problems when applied to their field. And that scientific journals – and especially those with the highest impact factors, like Nature, Science, or PNAS – have professional, careful reviewers who would never accept papers that equate correlation with causation, especially when Social Sciences are involved (because this alone might make errors grow exponentially…). Sadly, this is obviously not so, either.
Both studies [Haak et al. (2015) and this one] found a genetic affinity between samples from a central European culture known as Corded Ware, which existed from around 2500 bc, and samples from the earlier Yamnaya steppe culture. This similarity between distant populations is best explained by a substantial westward expansion of the Yamnaya or their close relatives into central Europe (Fig. 1b). Such an expansion is consistent with the steppe hypothesis, which argues that Corded Ware cultures were a conduit for the dispersal of Indo-European languages into Europe.
More interesting than these vague words – and the short, almost invisible suggestion that Yamna may not be exactly the population behind Corded Ware peoples – are the maps that illustrated their risky hypothesis in Nature: they called it “steppe hypothesis”, like that (in general terms), as if everyone defending a steppe origin for Proto-Indo-European would support such a model, when they actually referred to the specific hypothesis of one of their authors (Kristiansen), one of the few archaeologists who keep Gimbutas’ concept of the ‘Kurgan peoples’ alive, based on the Corded Ware culture:
In many publications that followed, the trend has been to reproduce this graphical model, by asserting (or implying) that Bell Beaker peoples were the result of subsequent Corded Ware migrations, and indeed that Corded Ware peoples migrated from the Yamna culture, and were thus the vector of expansion for Indo-European languages in Europe.
We shall see then just a rather surreptitious shift in terminology from ‘Yamnaya’ to ‘steppe’ component, to adapt to the new data – i.e. some damage control while the ship of ‘Yamnaya ancestry’ capsizes – but little else. “Earlier ‘Yamnaya ancestry’, you say? Just, you know, let’s call it ‘steppe ancestry’ and shift the expansion of Indo-European languages to one or two thousand years earlier, and done!”
The damage of this post-truth genetics is already done: we will see the unending distribution on the Internet in general, and on social networks in particular, of these grandiose conclusions, of far-fetched Indo-European migration models that include the Corded Ware culture, of simplistic maps with apparently harmless ‘arrows of migration’ (like the above) representing fictional population movements suggesting nonexistent dialectal branches.
You might be one of those sceptics wary of so many boring statistical rules: “But it’s safe reasoning: Yamnaya samples have an ‘ancestral component’ that is found elevated in Corded Ware samples, and less so in Bell Beaker samples, and PCA showed a similar result…so the migration model Yamnaya -> Corded Ware -> Bell Beaker is a priori correct, right?”
The ‘Future American’ hypothesis
Let me illustrate this attractive “Correlation = Causation” argument, using it to solve the problem of Future American languages.
Suppose we live in a future post-apocalyptic world ca. 3500 AD, with no surviving historical records before 3000 AD. None. Just investigation of cultures and their relationship by Archaeology, proto-languages reconstructed and language families identified by Linguistics, etc.
We have thus Future Germanic and Future Romance as the only language families spoken in Future Western Europe and in the Future Americas, in a distribution similar to the present day*, and we have certain somehow related archaeologically-defined cultures on both sides of the Atlantic, like Briton, Iberian, Norman, or Lowlandish, although their distribution remains partly undefined in time and space.
* If you are really curious about this scenario, you can read about the potential evolution of a Future North-American language.
But what languages did the ancestors of Future Americans speak, and who spread them? That question remains far from being settled by our future researchers, in spite of the solidest linguistic and migration models (talking mainly about Briton and Iberian cultures): too many authorities out there questioning them, fighting to impose their own pet theories.
Suddenly, the newly developed field of Human Ancestry comes to save the day. So let’s say we have this map of ancient samples recovered (dated from, say, the 6th to the 18th century AD), and our study is centered on the newly described “Western European” component (a precise combination of, say, WHG+steppe), which peaks in early samples from the Low Lands – hence we call it, quite daringly, “Lowlandic component”.
Our group is keen to demonstrate that the ancient Lowlandic culture described in Archaeology (marked especially by the worldwide distribution of tulips among other traits) is the origin of Western European and American languages… Now, let’s reach conclusions about migrations in the Middle Ages!
PCA shows that South-West European samples cluster closely to some North-West European samples, and that some late South American samples available cluster at some distance from North American samples – nearer to a native component represented by two individuals with 0% Lowlandic ancestry and a different cluster in PCA. And some North-American samples cluster quite closely to North-West European samples.
Based on the decrease in ‘Lowlandic component’ in the different samples and on PCA, we conclude that Lowlandic peoples (“or their close relatives”) must have migrated at the same time to North America, South America (or potentially from North America to South America?) as well as western, central, and northern Europe. Both migration events must have happened roughly at the same time, in part because both distinct language families appear in a north-south distribution, and Proto-Lowlandic must be (according to Genetics) the ancestor of both, Proto-Future-Germanic and Proto-Future-Romance.
That makes a lot of sense! A huge Lowlandic pressure for migration, you see. Push-pull mechanisms and stuff. A Lowlandic Empire probably (scattered remains are found everywhere)! And, judging by the presence of the ‘Lowlandic component’ in Future East Europe from the Elbe to the Vistula, maybe Lowlandic peoples spread Proto-Slavic, too! We can even date the common Lowlandic-Slavic proto-language this way! So many groundbreaking conclusions!
Future scholars supporting the Lowlandic homeland are on fire; they can’t get enough of publishing papers on the subject. “Two different Future American language families with cultural origins in Britain and Iberia, my ass! Because genetics.”
And don’t forget the future people of haplogroup R1b-U106 and high Lowlandic component: Wow, they are the heirs of those who expanded Future Germanic and Future Romance languages everywhere, aren’t they? How proud they must be. And who wouldn’t want to have these tall, blond, blue-eyed Lowlanders as their forefathers? Personalised genetic analysis is selling like crazy: “let’s know our Lowlandic percentage!”. Everyone is happy, colourful maps with lots of arrows and shit…
But – your future you might ask in awe, seeing that this doesn’t sound quite right, based on your basic archaeological and linguistic knowledge:
What about specific models of migration proposed to date? The solidest ones, not just anyone that seems to fit?
What about the dialectal classification of languages? The mainstream ones, not those that are compatible with this interpretation?
What about archaeological cultures to which individual samples belonged?
What about the actual dates of each sample? And how this date relates to the state of the culture to which it belongs?
What about the haplogroups, and the actual subclade of each haplogroup?
What about the territories, cultures, and dates not sampled, could they change this interpretation in light of known archaeological models?
And what about the actual origin of that ancestral component they so frivolously named? Did it really appear ex nihilo in the Low Lands, and expand from there?
“Who cares! This new data is sooo coool… And it proves what we wanted, what a coincidence! And it’s numbers, mate! Numbers don’t lie.”
After my first version, findings in Olalde et al. (2017) and Mathieson et al. (2017) supported some of my predictions. Now, after my third, their new data supports yet another prediction, because the model rests on solid linguistic and archaeological foundations. Here is an excerpt from the Indo-European demic diffusion model, 3rd ed. (pp. 55-56):
At the end of the Trypillian culture, herding/hunting trends intensified, and the agricultural system collapsed, with people moving to the steppe zone, as confirmed by the presence of numerous graves to the south (Rassamakin 1999). At the same time, the Trypillian world absorbed a foreign tradition related to materials of settlement sites of the Dnieper steppes – such as the late Sredni Stog culture –, like cord impressions and burial rites similar to the later Corded Ware culture, marking also the transformation of decors and changes in their interpretation (Palaguta 2007).
The similarity in burial rituals between Yamna and Corded Ware made Gimbutas define a common “Kurgan people”, whose relationship has also been long supported by Kristiansen (Kristiansen 1989; Kristiansen et al. 2017). An equivalence of both burial rites has been, however, rejected (Häusler 1963, 1978, 1983), and it is generally agreed that the Yamna culture did not expand to the north of the Tisza River.
The importance of horse exploitation in Deriivka, in the forest-steppe zone of the north Pontic region along the Dnieper region, during the Middle Eneolithic period (probably ca. 3700-3530 BC), suggests that horses played a significant role in the life of this Sredni Stog community (Anthony and Brown 2003). In its late period (ca. 4000-3500 BC), this culture had adopted corded ware pottery, and stone battle-axes.
However, this [sic] western steppe peoples were mainly hunters (Rassamakin 1999), and the ‘herding skill’ essential for wild horse domestication seems absent (Kuzmina 2003). All this has been confirmed with zooarchaeological evidence and new molecular and stable isotope results, suggesting an absence of horse domestication in territories of the late Sredni Stog culture in the north Pontic steppe (Mileto et al. 2017), before the advent of migrants from the Indo-European-speaking Repin culture.
The new sample described in Mathieson et al. (2017), dated ca. 4200 BC (but within a wide range, 5000-3500 BC) is from a site classified as of late Sredni Stog (although potentially from Post-Mariupol / Kvitjana), a culture of hunters who probably did not breed domesticated horses (even after the period of conquest and dominance of Suvorovo-Novodanilovka chiefs, from Indo-Hittite-speaking early Khvalynsk, who had domesticated horses), and – more importantly – is of R1a-M417 lineage, shows high so-called “Yamna component” in ADMIXTURE, and clusters among Corded Ware samples in PCA approximately a thousand years before this culture’s expansion. Information from the supplementary material:
An Eneolithic cemetery of the Sredny Stog II culture was excavated by D. Telegin in 1955-1957 near the village of Alexandria, Kupyansk district, Kharkov region on the left bank of the river Oskol. A total of 33 individuals were recovered. Based on craniometric analysis (I.Potekhina 1999) it was suggested that the Eneolithic inhabitants of Alexandria were not homogeneous and resulted from admixture of local Neolithic hunter-gatherers and early farmers, possibly Trypillian groups. We report genetic data from one individual: I6561
Another individual from Eneolithic Ukraine (of R1b1 xM269 lineage) clusters quite closely with Neolithic samples from the Baltic, which points to the strong connection between both – southern and northern – regions of east-central Europe before the period of great Chalcolithic expansions, and the potential origin of the spread of R1b (xM269) lineages with the Corded Ware culture.
It will be fun to see the mess that certain researchers have made (and will still make in the near future) of their findings coupled with the concept of “Yamna component”, when trying to describe the “proxy ancestral populations” of European Copper Age and Bronze Age cultures… Difficult times ahead for many, after the collapse of the simplistic Yamna -> Corded Ware -> Bell Beaker genetic model laid out since Haak et al. (2015) and Allentoft et al. (2015).
[EDIT 27 September 2017] Not directly related, but here is today’s interesting discussion on Twitter surrounding the ancestral populations of the “Yamnaya component”, for illustration of the discussions to come when this ancestry is divided into different, more precise, older (Neolithic) steppe components, and these in turn shown to contribute to different European and Asian Chalcolithic and Bronze Age cultures:
Rough attempt to understand genetic history of Europe and W. Eurasia. It's tough to get your head around. Comments welcome. pic.twitter.com/lutXFKmluk
Given the variance found in the three samples from Eneolithic Ukraine (comparable to the variance found in east Bell Beaker samples), we may now be getting closer to the precise territory and culture where the Corded Ware culture might have formed, which cannot be much further from the Dnieper-Dniester region before the Yamna expansion to the west ca. 3300 BC, judging from the elevated steppe component.
It seems, because of the proximity of both cultures and the similar dates of their migrations, that the westward expansion of the Yamna culture may indeed have provided an important push (among some strong ‘pull’ forces) for the peoples behind the Corded Ware expansion.
So Genetics reinforces the most solid models of Archaeology and Linguistics? Professional academics being mostly right in their careful research, and amateur geneticists playing with software being wrong? Who would have thought… More and more papers thus help shut up naysayers who state (again and again) that new algorithms are here to revolutionise these academic fields.
The expansion of peoples is known to be associated with the spread of a certain admixture component + the expansion and reduction in variability of a haplogroup (i.e. few male lineages are usually more successful during the expansion): Neolithic farmers from the Middle East expanding with haplogroup G2a; Natufian component (Levant hunter-gatherers or later, Neolithic farmers) and haplogroup E southward into Africa; CHG component expansion with haplogroup J; WHG expansion into east Europe with haplogroup R1b; etc.
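The “reduction in variability of a haplogroup” mentioned above can be put in numbers. What follows is only a minimal sketch, assuming that Nei's unbiased gene diversity is an acceptable measure of that variability; the function name and the two sample lists are hypothetical, invented for illustration, and do not come from any of the papers discussed here.

```python
from collections import Counter


def haplogroup_diversity(haplogroups):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum(p_i ** 2)).

    `haplogroups` is a list with one haplogroup label per sampled male.
    Values near 1 mean many lineages at similar frequencies; values near 0
    mean that one lineage dominates.
    """
    n = len(haplogroups)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplogroups)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical labels, NOT real ancient DNA results.
before_expansion = ["A", "B", "C", "D", "A", "B", "C", "D"]
after_expansion = ["A"] * 7 + ["B"]

print(round(haplogroup_diversity(before_expansion), 2))
print(round(haplogroup_diversity(after_expansion), 2))
```

With these invented lists the diversity drops from roughly 0.86 to 0.25, which is the kind of signal expected when a few male lineages become dominant during an expansion. With the small sample sizes usual in ancient DNA, such figures are rough at best.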
There were (at least) two main expansion processes involving Proto-Indo-European: one causing the branching off of the language ancestral to Anatolian, and another during the spread of Late Indo-European dialects. Based on this, and on known archaeological models, I have predicted since the first version of the demic diffusion model:
Based on haplogroups found until then in Yamna (R1b-M269), Corded Ware (R1a-M417, especially Z645), and Bell Beaker (R1b-L151):
that mainly R1b-L23 (especially L51) lineages and more steppe admixture would be found in east Bell Beaker – confirmed some two months after my publication by Olalde et al. (2017);
and that mainly R1a-M417 (especially Z645) subclades will be found in Corded Ware samples.
Based on the finding of “Yamna component” in the Corded Ware culture: that this admixture must have come from somewhere else. I pointed to eastern Europe, including the forest and forest-steppe zone, especially in the natural continuum of the Dniester-Dnieper region. Especially after Mathieson et al. (2017), in my second and third versions of the model, I have more specifically suggested a southern origin in the region, nearer to where the CHG ancestry must have come from (the Caucasus and cultures formed in contact with it), according to mainstream archaeological data, i.e. cultures of the North Pontic steppe / forest-steppe. But of course, until more samples are available, more CHG ancestry in other cultures of the Forest Zone cannot be ruled out.
For the vast majority of academics, more samples (regionally proportioned) are needed only from early Corded Ware, as we have from Bell Beaker: if they are (as expected) mostly R1a-M417, then everything is clear, and it will finally mean the end for the tiring, now almost ‘traditional’ association R1a – Proto-Indo-European. Some more samples from the potential homeland of the third Corded Ware horizon, most likely Ukraine (Podolia and Volynia regions), nearer to the time of the Corded Ware expansion, would also be great, to locate the actual ancestral population of Corded Ware migrants – recognisable by the main presence of haplogroup R1a-Z645 (formed ca. 3500 BC), and elevated “Yamna component” before the arrival of the Yamna culture…
If, however, early Corded Ware samples of R1b-L23 subclades are found in certain quantity, especially old samples from east-central Europe (excluding Yamna migrants along the Prut), the tricky question of Late Indo-European cultural diffusion will remain: Did Corded Ware peoples adopt a Late Indo-European language from clans of R1b-L23 lineages? That is what Kristiansen and Anthony have been betting on, a cultural diffusion, caused by:
A long-lasting contact, according to Kristiansen (1989,…,2017). He argues that Corded Ware adopted the language – but obviously not the same culture – from the east, but that it is a genetic and cultural mix from Globular Amphora, Trypillia, and steppe cultures. This has been Kristiansen’s model for almost 30 years, and it follows Marija Gimbutas’ outdated theory of the “Kurgan people”.
A rapid change according to Anthony (2007). He associates the adoption of Pre-Germanic with the domination of Yamna chiefs over Usatovo people, and the adoption of Balto-Slavic by the people from (Corded Ware) Middle Dnieper group because of the technical superiority of neighbouring Yamna herders.
Linguistics, with the growing support of a North-West Indo-European group, points clearly to a European expansion of a community speaking the ancestral language of Italo-Celtic, Germanic, and probably Balto-Slavic. Archaeology, too, showed migration from Yamna only to south-eastern Europe (correcting Gimbutas’ Kurgan model) and later with east Bell Beaker mainly into central, western, and northern Europe.
Even Kristiansen admits that only after the arrival of Bell Beaker in Scandinavia was a linguistic community (i.e. Germanic) formed – although he places the center of gravity in Úněticean influence, and (yet again) a cultural diffusion event into the Danish Dagger period.
Because of more and more data contrasting with old theories, some have elected to develop weak, indemonstrable links, to keep supporting e.g. Gimbutas’ concept of “Kurgan people” in Archaeology, and a sudden, early expansion of all PIE dialects at once in Linguistics. It seems that, after so much fuss about the (misleading) ‘Yamna component’ concept – and so many far-fetched assumptions by amateur geneticists – the Corded Ware connection will once again hinge on weak, indemonstrable cultural diffusion theories, be it ‘Kurgan peoples’ (including now, of course, Eneolithic cultures of Ukraine) or any culture from eastern Europe that will reveal some samples close to Corded Ware migrants, in terms of PCA, ADMIXTURE, or haplogroup.
So once we find mainly R1a-Z645 in more Corded Ware samples (and this haplogroup and more “Yamna component” in non-Yamna cultures of Eneolithic Ukraine, and potentially Poland or Belarus) we all may finally expect a peaceful acceptance of reality, at least in Genetics? Nope. No siree. Nein. Not then, not ever.
Why? Because some people want their paternal lineage to have lived in their historical region, and spoken their historical language, since time immemorial. It won’t matter if Archaeology, Linguistics, Genetics, etc. don’t support their claims: if they need to use some aspects of admixture, or haplogroups (or a combination of them) from carefully selected samples instead of looking at the whole picture; if they have to support that Indo-Europeans came from a culture different than Yamna, in- or outside of the steppe or forest-steppe, be it the Balkans, Anatolia, Armenia, or the Moon; if their proto-language should then come directly from Indo-Hittite, or from a Germano-Slavonic, or Indo-Slavonic, or Indo-Germanic group, or whatever invented dialectal branch necessary to fit their model, or if they have to support the ‘constellation analogy’ of Clackson, or thousands of years of development for each branch; etc. They will support whatever is necessary.
And this adaptation, obviously, has no end. It’s stupid, I know. But that’s how we are, how we think. We have seen that these sad trends continue no matter what, for decades, and not only regarding Indo-European. Some common examples include:
Indo-Aryan-speaking Indians defending an autochthonous origin of R1a and Indo-European; as well as the ‘opposite’ autochthonous continuity theory of Dravidian-speaking Indians (based on ASI ancestry, haplogroup R2, mtDNA haplogroup M, or whatever is at hand).
Western Europeans defending an autochthonous origin of the R1b haplogroup, with a Palaeolithic or Mesolithic origin, including the language, viz. the recent Indo-European from the Atlantic façade theories (in the Celtic from the West series, by Koch and Cunliffe); the now fading Palaeolithic Continuity Theory; and many other forgotten Eurocentric proposals; as well as the more recent informal hints of a central European/Balkan homeland based on the Villabruna cluster and south-eastern Mesolithic finds, which is at risk of being related to a Balkan origin of Proto-Indo-European…
There is also the ‘opposite’ theory of the autochthonous origin of the Basques, including Proto-Iberians and potentially other peoples like Paleo-Sardinians, based on the previously popular Vasconic-Uralic hypothesis (and an ancient Europe divided into R1b and N1c1 haplogroups), which is still widely believed in certain regions.
Nordic speakers supporting the autochthonous nature of Germanic and haplogroup I1 to Scandinavia.
Armenian speakers delighted to see a proposal of Indo-European homeland in the Armenian highlands, be it supported by glottalic consonants, CHG ancestry, R1b (xM269) or J lineages…
Greek speakers now willing to support continuity of haplogroup J as a ‘native’ Greek lineage, of people speaking Proto-Greek (and in earlier times PIE), because of two Minoan samples and one Mycenaean sample found in Lazaridis et al. (2017).
Even Turks linking Yamna with the expansion of Turkic languages. That one is fun to read, almost like a parody for the rest – substituting “Turkic” for “Indo-European”.
For years, a lot of people – me included (at least since 2005) – believed, because of modern maps of R1a distribution, that R1a and Corded Ware were the vector of Indo-European languages. For those of us who don’t have any personal or national tie with this haplogroup, this notion has been easy to change with new data. For others, it obviously isn’t, and it won’t be.
For all these people, a sample, result, or conclusion from any paper, just dubiously in favour, means everything, but a thousand against mean nothing, or can be reinterpreted to support their fantasies.
The Kossinnian “autochthonous continuity” crap permeates this relatively new subfield of Human Evolutionary Genetics, as it permeated Indo-European studies (first Linguistics, then Archaeology) in its infancy. It seems to be a generalised human trend, no doubt related to some absurd inferiority complex, mixed with historical romanticism, a certain degree of chauvinism, and (falling into the eternal Godwin’s Law of our field) some outdated, childish notion of ‘supremacy’ linked with the expansion of one’s own language and people.
Such simplistic and popular models are also lucrative, judging by the boom in demand for DNA analysis, which companies embellish with modern fortune tellers (or fortune tellers themselves sell for a price), promising to ascertain your ‘ancestry proportions’ using automated algorithms, so that you don’t have to get lost in complex genetic data and prehistoric accounts, which can’t help you define your “ethnicity”…
Some just don’t want to realize that the spread of prehistoric languages (like Late Indo-European dialects) was a complex, non-uniform, stepped process, devoid of modern romantic concepts, which in genetic terms necessarily included later founder effects and cultural diffusions, so that no one can trace their haplogroup, lineage, family, region, or country to any single culture, language, or ethnic group. The same, by the way, can be said of peoples and countries in historic times.
As I said before, we should expect supporters of the Kurgan model (and thus the expansion of R1a-Z645 with Yamna) to wait for just one sample of R1a-M417 in Yamna and/or Bell Beaker (which will eventually be found), and just one sample of R1b-M269 in Corded Ware (which will also eventually be found), to blow the horn of victory in this naïve competition against time, general knowledge, and (essentially) themselves.
A sad consequence of how we are is that, because of the obvious influence of these stupid modern ethnolinguistic agendas, because we are not all rowing in the same direction, genetic results and conclusions are still perceived as far-fetched and labile, and thus most archaeologists and linguists prefer not to include genetic results in their investigation. And those who dare to do so are badly counselled by those who go with the tide, so that their papers become almost instantly outdated.