Corded Ware and Bell Beaker related groups defined by patrilocality and female exogamy


Two interesting new papers concerning Corded Ware and Bell Beaker peoples appeared last week, supporting yet again what has been well known since 2015 about West Uralic and North-West Indo-European speakers and their expansion.

Below are relevant excerpts (emphasis mine) and comments.

NOTE. I will add analyses of ancestry, renewed Y-DNA maps, etc. if and when I find the time.

I. Corded Ware and Battle Axe cultures

Open access The genomic ancestry of the Scandinavian Battle Axe Culture people and their relation to the broader Corded Ware horizon, by Malmström, Günther, et al. Philos. Trans. R. Soc. (2019).

I.1. Origins of Corded Ware peoples

The discovery of the Alexandria outlier represented a clear support for a long-lasting genomic difference between the two distinct cultural groups, Yamnaya and Corded Ware, already visible in an opposition Khvalynsk vs. late Sredni Stog ca. 4000 BC, i.e. well before the formation of both Late Eneolithic/Early Bronze Age groups.

However, the realization that it may not have been an Eneolithic individual, but rather a (Middle?) Bronze Age one, suggests that Sredni Stog was possibly not directly related to Corded Ware, and a potential direct connection with Yamnaya might have to be reevaluated, e.g. through the Carpathian Basin, as Anthony (2017) proposed.

Principal component analysis of modern Europeans (grey) and projected ancient Europeans.

This new paper shows two early Corded Ware individuals from Obłaczkowo, Poland (ca. 2900-2600 BC) – hence close to the supposed original Proto-Corded Ware community – with an apparently (almost) full “Steppe-like” ancestry, clustering (almost) with Yamnaya individuals:

Similar to the BAC individuals, the newly sequenced individuals from the present-day Karlova in Estonia and Obłaczkowo in Poland appear to have strong genetic affinities to other individuals from BAC and CWC contexts across the Baltic Sea region. Some individuals from CWC contexts, including the two from Obłaczkowo, cluster closely with the potential source population of steppe-related ancestry, the Yamnaya herders. Notably, these individuals appear to be those with the earliest radiocarbon dates among all genetically investigated individuals from CWC contexts. Overall, for CWC-associated individuals, there is a clear trend of decreasing affinity to Yamnaya herders with time.

NOTE. Interestingly, this sample is almost certainly attributed to the skeleton E8-A, which had been supposedly already investigated by the Copenhagen group as the RISE1 sample:

We note that RISE1 is also described as the individual from Obłaczkowo feature E8-A. However, their genetic results differ from ours. They present this individual as a molecularly determined male that belongs to Y-chromosomal haplogroup (hg) R1b and to mtDNA hg K1b1a1 while our results show this individual to be female, carrying a mtDNA hg U3a’c profile

Since the typical Steppe_MLBA ancestry of Corded Ware groups does not show good fits for (Pre-)Yamnaya-derived ancestry, it is almost certain that these individuals will show no (or almost no) direct Yamnaya-related contribution, but rather a contribution of East European sub-Neolithic groups, more or less close to the steppe-forest region.

NOTE. They might show contributions from Pre-Yamnaya-influenced Sredni Stog, but if they show a contribution of Yamnaya, then they are probably outliers, related to Yamnaya vanguard groups (see image below). For such a contribution to be demonstrable, both sources, Yamnaya and Corded Ware, would have to be clearly distinguishable from each other and their relative contributions quantifiable in formal stats, something difficult (if not impossible) to ascertain today.

Their position in the published PCA – a plot apparently affected by projection bias – suggests a cluster in common with early Baltic samples, which are known to show contributions from East European sub-Neolithic populations (see qpAdm values for Baltic CWC samples).

NOTE. Results for previous samples labelled as Poland CWC are unreliable due to their low coverage.

The most interesting aspect about the ancestry shown by these early samples is their further support for an origin of the culture different than Sredni Stog, and for a rejection of the Alexandria outlier as ancestral to them, hence for a Volhynian-Podolian homeland of Proto-Corded Ware peoples, with an ancestry probably more closely related to the late Maykop Steppe- and Trypillian/GAC groups admixed with sub-Neolithic populations of the Eastern European Late Eneolithic.

NOTE. That is, unless there is a reason for the apparent increase in so-called “Steppe-ancestry” during the northward and westward migration of CWC peoples that represents another thing entirely…

Trypillian routes of influence and Yamnaya culture influences in Central and Central-East Europe during the Late Eneolithic / Early Bronze Age. Images by Klochko (2009).

I.2. CWC expansion under R1a bottlenecks

The two males in our dataset (ber1 and poz81) belonged to Y-chromosome R1a haplogroups, as do the majority of males (16/24) from the previously published CWC contexts, while a smaller fraction belonged to R1b [3/24] or I2a [3/24] lineages. The R1a haplogroup has not been found among Neolithic farmer populations nor in hunter–gatherer groups in central and western Europe, but it has been reported from eastern European hunter–gatherers and Eneolithic groups. Individuals from the Pontic–Caspian steppe, associated with the Yamnaya Culture, carry mostly R1b and not R1a haplotypes.

Sample poz81 is of basal hg. R1a-CTS4385*, an R1a-M417 subclade, supporting once again that most Corded Ware individuals from western and central European groups expanded under R1a-M417 (xZ645) lineages. The Battle Axe sample from Bergsgraven (ca. 2620-2470 BC) shows a basal hg. R1a-Y2395*, an R1a-Z283 subclade leading to the typically Fennoscandian R1a-Z284.

Both findings further support that typical lineages of West CWC groups, including R1a-M417 (xZ645) subclades, were fully replaced by incoming East Bell Beakers, and that the limited expansion of R1a-Z284 and I1 (the latter found in one newly reported Late Neolithic sample from Sweden) was the outcome of later regional bottlenecks within Scandinavia, after the creation of a maritime dominion by the Bell Beaker elites during the Dagger Period.

I.3. CWC and lactase persistence

(…) one of these individuals (kar1) carried at least one allele (-13910 C->T) associated with lactose tolerance, while the other two individuals (ber1 and poz81) carried at least one ancestral variant each, consistent with previous observations of low levels of lactose tolerance variants in the Neolithic and a slight increase among individuals from CWC contexts.

The fact that two early CWC individuals carry ancestral variants could be said to support the improbability of the individual from Alexandria representing a community ancestral to the Corded Ware community. On the other hand, the late CWC individual from Estonia carries one allele, but it still seems that only Bell Beakers and Steppe-related groups show the necessary two alleles during the Early Bronze Age, which is in line with a late Repin/early Yamnaya-related origin of the successful selection of the trait, consistent with the expansion of their specialized semi-nomadic cattle-breeding economy through the steppe biome during the Late Eneolithic.

Maps part of the public data used for the post by Iain Mathieson on Lactase Persistence. “By 2500 BP, the allele is present over a band stretching from Ireland to Central Asia at around 50 degrees latitude. This probably reflects the spread of Steppe ancestry populations in which the allele originated. However, the allele is still rare (say less than 1% frequency) over this entire range. It does not become common anywhere until some time in the past 2500 years – when it reaches its present-day high frequency in Britain and Central Europe”.

I.4. West Uralic spread from the East

The BAC groups fit as a sister group to the CWC-associated group from Estonia but not as a sister group to the CWC groups from Poland or Lithuania (|Z| > 3), indicating some differences in ancestry between these CWC groups and BAC. Supervised admixture modelling suggests that BAC may be the CWC-related group with the lowest YAM-related ancestry and with more ancestry from European Neolithic groups.

While the results of the paper are compatible with a migration from either the Eastern or the Western Baltic into Scandinavia, phylogeography and archaeology support that Battle Axe peoples emerged as a Baltic Corded Ware group close to the Vistula that expanded first to the north-east, and then to the west from Finland, continuing largely unscathed throughout the Bronze Age, mostly in eastern Fennoscandia, with the development of Balto-Finnic- and Samic-speaking communities.

Correlation between f4(Chimp, LBK, YAM, X), where X is a CWC or BAC individual, and the date (BCE) of each individual. This statistic measures shared drift between CWC and Linear Pottery Culture (LBK) as opposed to YAM and should increase with the higher proportion of Neolithic farmer ancestry in CWC and BAC.
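To make the statistic in this caption concrete, here is a minimal sketch of how an f4 value of the form f4(A, B; C, D) is commonly computed from per-SNP allele frequencies, as the mean of (a - b) * (c - d). The frequencies below are toy values invented purely for illustration, not data from the paper, and real analyses also estimate standard errors (typically with a block jackknife), which this sketch omits.

```python
import numpy as np

def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    p_a, p_b, p_c, p_d = (np.asarray(p, dtype=float) for p in (p_a, p_b, p_c, p_d))
    return float(np.mean((p_a - p_b) * (p_c - p_d)))

# Toy allele frequencies at four SNPs (hypothetical values, for illustration only).
chimp = np.array([0.0, 0.0, 1.0, 0.0])  # outgroup
lbk = np.array([1.0, 0.0, 1.0, 1.0])    # stand-in for Neolithic farmers
yam = np.array([0.0, 1.0, 1.0, 0.0])    # stand-in for Yamnaya

# Two extreme test individuals: one identical to LBK, one identical to YAM.
x_farmer_like = lbk.copy()
x_steppe_like = yam.copy()

f4_farmer_like = f4(chimp, lbk, yam, x_farmer_like)
f4_steppe_like = f4(chimp, lbk, yam, x_steppe_like)

print(f4_farmer_like, f4_steppe_like)
```

In this toy setup the farmer-like individual yields the larger value, matching the caption's statement that the statistic should increase with a higher proportion of Neolithic farmer ancestry.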

Radiocarbon dating showed that the three individuals from the Öllsjö megalithic tomb derived from later burials, where oll007 (2860–2500 cal BCE) overlaps with the time interval of the BAC, and oll009 and oll010 (1930–1650 cal BCE) fall within the Scandinavian Late Neolithic and Early Bronze Age

For more on how the Pitted Ware culture may have influenced Uralic-speaking Battle Axe peoples earlier than Indo-European-speaking Bell Beakers in Scandinavia, read about Early Bronze Age Scandinavia and about the emergence of the Pre-Proto-Germanic community.

II. Bell Beakers through the Bronze Age

New paper (behind paywall) Kinship-based social inequality in Bronze Age Europe, by Mittnik et al. Science (2019).

II.1. Yamnaya vanguard settlers

In my last post, I showed how the ancestry of Corded Ware from Esperstedt is consistent with influence by incoming Yamnaya vanguard settlers or early Bell Beakers, stemming ultimately from the Carpathian Basin, something that could be inferred from the position of the Esperstedt outlier in the PCA, and by the knowledge of Yamnaya archaeological influences up to Saxony-Anhalt.

Yamnaya settlers are strongly suspected to have migrated in small so-called vanguard groups to the west and north of the Carpathians in the first half of the 3rd millennium BC, well before the eventual adoption of the Proto-Beaker package and their expansion ca. 2500 BC as East Bell Beakers.

Tauber Valley infiltration

As I mentioned in the books, one of the known – among the many more unknown – sites displaying Yamnaya-related traits and suggesting the expansion of Yamnaya settlers into Central Europe is Lauda-Königshofen, in the Tauber Valley.

From Diet and Mobility in the Corded Ware of Central Europe, by Sjögren, Price, & Kristiansen PLoS One (2017):

A series of CW cemeteries have been excavated in the Tauber valley. There are three large cemeteries known and some 30 smaller sites. The larger ones are Tauberbischofsheim-Dittingheim with 62 individuals, Tauberbischofsheim-Impfingen with 40 individuals, and Lauda-Königshofen with 91 individuals. The cemeteries are dispersed rather regularly along the Tauber valley, on both sides of the river, suggesting a quite densely settled landscape.

The Lauda-Königshofen graves consisted mostly of single inhumations in contracted position, usually oriented E-W or NE-SW. A total of 91 individuals were buried in 69 graves. At least 9 double graves and three graves with 3–4 individuals were present. In contrast to the common CW pattern, sexes were not distinguished by body position, only by grave goods. This trait is common in the Tauber valley and suggests a local burial tradition in this area. Stone axes were restricted to males, pottery to females, while other artifacts were common to both sexes. About a third of the graves were surrounded by ring ditches, suggesting palisade enclosures and possibly over-plowed barrows.

In particular, Frînculeasa, Preda, & Heyd (2015) used Lauda-Königshofen as representative of the mobility of horse-riding Yamnaya nomadic herders migrating into southern Germany, referring to the findings in Trautmann (2012) about the nomadic herders from the Tauber Valley, and their already known differences with other Corded Ware groups.

The likely influence of Yamnaya in the region has been reported at least since the 2000s, repeatedly mentioned by Jozef Bátora (2002, 2003, 2006), who compiled Yamnaya influences in a map that has been copied ever since, with little improvement over time. Heyd believes that there are potentially many Yamnaya remains along the Middle and Lower Danube and tributaries not yet found, though.

NOTE. Looking for this specific site, I realized that Bátora (and possibly many after him who, like me, copied his map) located Lauda-Königshofen in a more south-western position within Baden-Württemberg than its actual location. I have now corrected it in the maps of Chalcolithic migrations.

Yamnaya influences in Central Europe suggestive of vanguard settlements, contemporary with Corded Ware groups. See full map.

Althäuser Hockergrab…Bell Beakers

Unfortunately, though, it is very difficult to attribute the reported R1b-L51 sample from the Tauber valley to a population preceding the arrival of East Bell Beakers in the region, so there is no uncontroversial smoking gun of Yamnaya vanguard settlers – yet. Reasons to doubt a Pre-Beaker origin are as follows:

1. This family of the Tauber valley shows a late radiocarbon date (ca. 2500 BC), i.e. from a time where East Bell Beakers are known to have been already expanding in all directions from the Middle and Upper Danube and its tributaries.

Crouched burial from Althausen (Althäuser Hockergrab), dated ca. 2500 BC.

2. Archaeological information is scarce. Remains of these four individuals were discovered in 1939 and officially reported together with other findings in 1950, without any meaningful data that could distinguish between Bell Beakers and Corded Ware individuals.

This site is located in the Tauber valley, ca. 100 km to the northwest of the Lech valley. The site was discovered during the construction of a sports field in 1939 and was subsequently excavated by G. Müller and O. Paret. Four individuals in crouched position were found in the burial pit of a flat grave. The burial did not contain any grave goods, but due to the type of grave and positioning of the bodies (with heads pointing towards southwest) the site was attributed to the Corded Ware complex.

The classification of this burial as of CWC and not BBC seems to have been based entirely on the numerous CWC findings in the Tauber valley, rather than on its particular burial orientation following a regional custom (foreign to the described standard of both cultures), and on its grave type that was also found among Bell Beaker groups. Like many human remains recovered in dubious circumstances in the 20th century, these samples should have probably been labelled (at least in the genetic paper) more properly as Tauber_LN or Tauber_EBA.

Changes in ancestry over time. (A) Median ages of individuals plotted against z scores of f4 (Mbuti, Test; Yamnaya_Samara, Anatolia_Neolithic) show increase of Anatolian farmer-related ancestry (indicated by more positive z-scores) and decrease of variation in ancestry over time. Grey shading indicates significant z scores, red line shows linear correlation (r = -0.35971; P = 0.003) and dotted lines the 95% confidence interval. (B) Ancestry proportions on autosomes calculated with qpAdm. (C) Sex-bias z scores between autosomes and X chromosomes show significant male bias for steppe-related ancestry in the Tauber samples. Image modified from the paper: Surrounded with a blue circle in (A) are females with more Steppe-related ancestry, and in (C) surrounded by squares are the distinct sex biases found in the earliest BBC from the Tauber valley vs. later groups from the Lech valley.

3. In terms of ancestry, there seem to be no gross differences between the Lech Valley BBC individuals and previously reported South German Beakers, originally Yamnaya-like settlers admixing through exogamy with locals, including Corded Ware peoples, as the sex bias of the Lech Valley Beakers proves (see PCA plot below). In other words, northern and eastern Beakers admixed with regional (Epi-)Corded Ware females during their respective expansions, similar to how southern and western Beakers admixed with regional EEF-related females.

The two available Tauber Valley samples (“Tauber_CWC”) show the same pattern: a quite recent Steppe-related male bias and Anatolia_Neolithic-related female bias. Nevertheless, the male sample clusters ‘to the south’ in the PCA relative to all sampled Corded Ware individuals (see PCA plot below), and shows less Yamnaya-like ancestry than what is reported (or can be inferred) for Yamnaya from Hungary or early Bell Beakers of elevated Steppe-related ancestry.

The ancestry and position of the Althäuser male in the PCA are thus fully compatible with recently incoming East Bell Beakers admixing with local peoples (including Corded Ware) through exogamy, but not so much with a sample that would be expected from Yamnaya vanguard + Corded Ware-related ancestry (more like the Esperstedt outlier or the early France Beaker). Compared to the more ‘northern’ (fully Corded Ware-like) position of his female counterpart, there is little to support that both are part of the same native Tauber valley community after generations of ancestry levelling…

Table S9. Three-way qpAdm admixture model for European MN/Chalcolithic group+Yamnaya_Samara. P-values greater than 0.05 (model is not rejected) marked in green.

4. The haplogroup inference is also unrevealing: whereas the paper reports that it is R1b-P310* (xU106, xP312), there is no data to support a xP312 call, so it may well be even within the P312 branch, like most sampled Bell Beaker males. Similarly, the paper also reports that HUGO_180Sk1 (ca. 2340 BC) shows a positive SNP for the U106 trunk, which would make it the earliest known U106 sample and originally from Central Europe, but there is no clear support for this SNP call, either. At least not in their downloadable BAM files, as far as I can tell. Even if both were true, they would merely confirm the path of expansion of Yamnaya / East Bell Beakers through the Danube, already visible in confirmed genomic data:

Distribution of ‘archaic’ R1b-L51 subclades in ancient samples, overlaid over a map of Yamnaya and Bell Beaker migrations. In blue, Yamnaya Pre-L51 from Lopatino (not shown) and R1b-L52* from BBC Augsburg. In violet, R1b-L51 (xP312,xU106) from BBC Prague and Poland. In maroon, hg. R1b-L151* from BBC Hungary, BA Bohemia, and (not shown) a potential sample from the Tauber Valley and one from BBC at Mondelange, which is certainly xU106, maybe xP312. Interestingly, the earliest sample of hg. R1b-U106 (a lineage more proper of northern Europe) has been found in a Bell Beaker from Radovesice (ca. 2350 BC), between two of these ‘archaic’ R1b-L51 samples; and a sample possibly of hg. R1b-ZZ11+ (ancestral to DF27 and U152) was found in a Bell Beaker from Quedlinburg, Germany (ca. 2290 BC), to the north-west of Bohemia. The oldest R1b-U152 are logically from Central Europe, too.

II.2. Proto-Celts and the Tumulus culture

The most interesting data from Mittnik et al. (2019) – overshadowed by the (at first sight) striking “CWC” label of the Althäuser male – is the finding that the most likely (Pre-)Proto-Celtic community of Southern Germany shows, as expected, major genetic continuity over time with Yamnaya/East Bell Beaker-derived patrilineal families, which suggests an almost full replacement of other Y-chromosome haplogroups in Southern German Bronze Age communities, too.

Sampled families form part of an evolving Bell Beaker-derived European BA cluster in common with other Indo-European-speaking cultures from Western, Southern, and Northern Europe, also including early Balto-Slavs, clearly distinct from the Corded Ware-related clusters surviving in the Eastern Baltic and the forest zone.

This Central European Bronze Age continuity is particularly visible in many generations of different patrilocal families practising female exogamy, showing patrilineal inheritance mainly under R1b-P312 (mostly U152+) lineages typical of Central European bottlenecks, all of them apparently following a similar sociopolitical system spanning roughly a thousand years, from the arrival of East Bell Beakers in the region (ca. 2500 BC) until – at least – the end of the Middle Bronze Age (ca. 1300 BC):

Here, we show a different kind of social inequality in prehistory, i.e., complex households that consisted of i) a higher-status core family, passing on wealth and status to descendants, ii) unrelated, wealthy and high-status non-local women and iii) local, low-status individuals. Based on comparisons of grave goods, several of the high-status non-local females could have come from areas inhabited by the Unetice culture, i.e., from a distance of at least 350 km. As the EBA evidence from most of Southern Germany is very similar to the Lech valley, we suggest that social structures comparable to our microregion existed in a much broader area. The EBA households in the Lech valley, however, seem similar to the later historically known oikos, the household sphere of classic Greece, as well as the Roman familia, both comprising the kin-related family and their slaves.

Genetic structure of Late Neolithic and Bronze Age individuals from southern Germany. (A) Ancient individuals (covered at 20,000 or more SNPs) projected onto principal components defined by 1129 present day west Eurasians (shown in fig. S6); individuals in this study shown with outlines corresponding to their 87Sr/86Sr isotope value (black: consistent with local values, orange: uncertain/intermediate, red: inconsistent with local values). Selected published ancient European individuals are shown without outlines. Image modified from the paper. Surrounded by triangles in cyan, Corded Ware-like females; with a blue triangle, Yamnaya/Early BBC-like sample from the Tauber valley.

NOTE. For those unfamiliar with the usual clusters formed by the different populations in the PCA, you can check similar graphics: PCA with Bell Beaker communities, PCA with Yamnaya settlers from the Carpathians, a similar one from Wang et al. (2019) showing the Yamnaya-Hungary cline, or the chronological PCAs prepared by me for the books.

The gradual increase in local EEF-like ancestry among South Germany EBA and MBA communities over the previous BBC period offers a reasonable explanation as to how Italic and Celtic communities remained in loose contact (enough to share certain innovations) despite their physical separation by the Alps during the Early Bronze Age, and probably why sampled Bell Beakers from France were found to be the closest source of Celts arriving in Iberia during the Urnfield period.

Furthermore, continued contacts with Únětice-related peoples through exogamy also show how Celtic-speaking communities closer to the Danube might have influenced (and might have been influenced by) Germanic-speaking communities of the Nordic Late Neolithic and Bronze Age, helping explain their potentially long-lasting linguistic exchange.

Like other previous Neolithic or Chalcolithic groups that Yamnaya and Bell Beakers encountered in Europe, ancestry related to the Corded Ware culture became part of Bell Beaker groups during their expansion and later during the ancestry levelling in the European Early Bronze Age, which helps us distinguish the evolution of Indo-European-speaking communities in Europe, and suggests likely contacts between different cultural groups separated by hundreds of km. from each other.

All in all, there is nothing to support that (epi-)Corded Ware groups might have survived in any way in Central or Western Europe: whether through their culture, their Y-chromosome haplogroups, or their ancestry, they followed the fate of other rapidly expanding groups before them, viz. Funnelbeaker, Baden, or Globular Amphorae cultural groups. This is very much unlike the West Uralic-speaking territory in the Eastern Baltic and the Russian forests, where Corded Ware-related cultures thrived during the Bronze Age.

f4-statistics showing differences in ancestry in populations grouped by period. An increase in affinity to ancestry related to Anatolia Neolithic over time. Males and females grouped together shown as upward and downward pointing triangles, respectively.


It was about time that geneticists caught up with the relevance of Y-DNA bottlenecks when assessing migrations and cultural developments.

From Malmström et al. (2019):

The paternal lineages found in the BAC/CWC individuals remain enigmatic. The majority of individuals from CWC contexts that have been genetically investigated this far for the Y-chromosome belong to Y-haplogroup R1a, while the majority of sequenced individuals of the presumed source population of Yamnaya steppe herders belong to R1b. R1a has been found in Mesolithic and Neolithic Ukraine. This opens the possibility that the Yamnaya and CWC complexes may have been structured in terms of paternal lineages—possibly due to patrilineal inheritance systems in the societies — and that genetic studies have not yet targeted the direct sources of the expansions into central and northern Europe.

From Gibbons (2019), a commentary to Mittnik et al. (2019):

Some of the early farmers studied were part of the Neolithic Bell Beaker culture, named for the shape of their pots. Later generations of Bronze Age men who retained Bell Beaker DNA were high-ranking, buried with bronze and copper daggers, axes, and chisels. Those men carried a Y chromosome variant that is still common today in Europe. In contrast, low-ranking men without grave goods had different Y chromosomes, showing a different ancestry on their fathers’ side, and suggesting that men with Bell Beaker ancestry were richer and had more sons, whose genes persist to the present.

There was no sign of these women’s daughters in the burials, suggesting they, too, were sent away for marriage, in a pattern that persisted for 700 years. The only local women were girls from high-status families who died before ages 15 to 17, and poor, unrelated women without grave goods, probably servants, Mittnik says. Strontium levels from three men, in contrast, showed that although they had left the valley as teens, they returned as adults.

Also, from Scientific American:

(…) it has long been assumed that prior to the Athenian and Roman empires,—which arose nearly 2,500 and more than 2,000 years ago, respectively—human social structure was relatively straightforward: you had those who were in power and those who were not. A study published Thursday in Science suggests it was not that simple. As far back as 4,000 years ago, at the beginning of the Bronze Age and long before Julius Caesar presided over the Forum, human families of varying status levels had quite intimate relationships. Elites lived together with those of lower social classes and women who migrated in from outside communities. It appears early human societies operated in a complex, class-based system that propagated through generations.

It seems wrong (to me, at least) that the author and – as he believes – archaeologists and historians had “assumed” a different social system for the European Bronze Age, which would mean they hadn’t read about how Indo-European societies were structured. For example, long ago Benveniste (1969) already drew a coherent picture of these prehistoric peoples based on their reconstructed language alone: regarding their patrilocal and patrilineal family system; regarding their customs of female exogamy and marriage system; and regarding the status of foreigners and slaves as movable property in their society.

A long-lasting and pervasive social system of Bronze Age elites under Yamnaya lineages strikingly similar to this Southern German region can be easily assumed for the British Isles and Iberia, and it is likely to be also found in the Low Countries, Northern Germany, Denmark, Italy, France, Bohemia and Moravia, etc., but also (with some nuances) in Southern Scandinavia and Central-East Europe during the Bronze Age.

Therefore, only the modern genetic pool of some border North-West Indo-European-speaking communities of Europe needs further information to describe a precise chain of events before their eventual expansion in more recent times:

  1. the relative geographical isolation causing the visible regional founder effects in Scandinavia, typical of the maritime dominion of the Nordic Late Neolithic (related thus to the Island Biogeography Theory); and
  2. the situation of the (Pre-)Proto-Balto-Slavic community close to the Western Baltic which, I imagine, will be shown to be related to a resurgence of local lineages, possibly due to a shift of power structures similar to the case described for Babia Góra.

NOTE. Rumour has it that R1b-L23 lineages have already been found among Mycenaeans, while they haven’t been found among sampled early West European Corded Ware groups, so the westward expansion of Indo-European-speaking Yamnaya-derived peoples mainly with R1b-L23 lineages through the Danube Basin merely lacks official confirmation.


Bell Beakers and Mycenaeans from Yamnaya; Corded Ware from the forest steppe


I have recently written about the spread of Pre-Yamnaya or Yamnaya ancestry and Corded Ware-related ancestry throughout Eurasia, using exclusively analyses published by professional geneticists, and filling in the gaps and contradictory data with the most reasonable interpretations. I did so consciously, to avoid any suspicion that I was interspersing my own data or cherry picking results.

Now I’m finished recapitulating the known public data, and the only way forward is the assessment of these populations using the available datasets and free tools.

Understanding the complexities of qpAdm is fairly difficult without a proper genetic and statistical background, which I won’t pretend to have, so its tweaking to get strictly correct results would require an unending game of trial and error. I have sadly little time for this, even taking my tendency to procrastination into account… so I have used a simple model akin to those published before – in particular, the outgroup selection by Ning, Wang et al. (2019), who seem to be part of the only group interested in distinguishing Yamnaya-related from Corded Ware-related ancestry, probably the most relevant question discussed today in population genomics regarding the Proto-Indo-European and Proto-Uralic homelands.

Supplementary Table 13. P values of rank=2 and admixture proportions in modelling Steppe ancestry populations as a three-way admixture of Eneolithic steppe, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Test, Eneolithic_steppe, Anatolian_Neolithic, WHG.
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

I have used for all analyses below a merged dataset including the curated one of the Reich Lab, the latest on Central and South Asia by Narasimhan, Patterson et al. (2019), on Iberia by Olalde et al. (2019), and on the East Baltic by Saag et al. (2019), as well as datasets including samples from Wang et al. (2019) and Lamnidis et al. (2018). I used (and intend to use) the same merged dataset in all cases, despite its huge size, to avoid adding one more uncontrolled variable to the analyses, so that all results obtained can be compared.

I try to prepare in advance a bunch of relevant files with left pops and right pops for each model:

  1. It seems a priori more reasonable to use geographically and chronologically closer proxy populations (say, Trypillia or GAC for Steppe-related peoples) than hypothetical combinations of ancestral ones (viz. Anatolian farmer, WHG, and EHG).
  2. This also means using subgroups closer to the most likely source population, such as (Don-Volga interfluve) Yamnaya_Kalmykia rather than (Middle Volga) Yamnaya_Samara for the western expansion of late Repin/early Yamnaya, or the early Germany_Corded_Ware.SG or Czech_Corded Ware for the group closest to the Proto-Corded Ware population (see below), likely neighbouring the Upper Vistula region.
  3. I usually test two source populations for different targets, which seems like a much more efficient way of using computer resources, whenever I know what I want to test, since I need my PC back for its normal use; whenever I don’t know exactly what to test, I use three-way admixture models and look for subsets to try and improve the results.
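The file preparation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the script I actually used: the function name, the file naming scheme, and the output folder are hypothetical, and the population labels must match whatever labels your own merged dataset uses. The outgroup list is the one quoted from Supplementary Table 13 above. The sketch writes one "left" file (target first, then sources) and one "right" file (outgroups) per combination of sources, one population per line.

```python
from itertools import combinations
from pathlib import Path

# Outgroups ("right" populations), as quoted from Supplementary Table 13 above.
RIGHT_POPS = [
    "Mbuti.DG", "Ust_Ishim.DG", "Kostenki14", "MA1", "Han.DG", "Papuan.DG",
    "Onge.DG", "Villabruna", "Vestonice16", "ElMiron", "Ethiopia_4500BP.SG",
    "Karitiana.DG", "Natufian", "Iran_Ganj_Dareh_Neolithic",
]


def write_models(target, sources, n_sources, out_dir):
    """Write one left/right file pair per combination of n_sources sources.

    File names are a hypothetical convention: <target>__<src1>__<src2>.left/.right
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for combo in combinations(sources, n_sources):
        stem = "__".join((target,) + combo)
        left = out_dir / f"{stem}.left"
        right = out_dir / f"{stem}.right"
        # The target goes first in the left list, followed by the sources.
        left.write_text("\n".join((target,) + combo) + "\n")
        # The same outgroups are reused for every model so results stay comparable.
        right.write_text("\n".join(RIGHT_POPS) + "\n")
        written.append((left, right))
    return written
```

For instance, calling write_models("Germany_Beaker", ["Yamnaya_Kalmykia", "Hungary_Baden", "Germany_Corded_Ware.SG"], 2, "models") would produce three two-way models for the same target; passing 3 instead of 2 gives the single three-way model.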

I have probably left out some more complex models by individualizing the most relevant groups, but for the time being this will have to do. Also, no other formal stats have been used in any case, which is an evident shortcoming, ruling out an interpretation drawn directly and only from the results below.

Full qpAdm results for each batch of samples are presented in a Google Spreadsheet, with each tab (bottom of the page) showing a different combination of sources, usually in order of formally ‘best’ (first to the left) to ‘worst’ (last to the right) fits, although the order is difficult to select in highly heterogeneous target groups, as will be readily visible.

Disintegration, migration, and imports of the Azov–Black Sea region. First migration event (solid arrows): Gordineşti–Maikop expansion (groups: I – Bursuchensk; II – Zhyvotylivka; III – Vovchans’k; IV – Crimean; V – Lower Don; VI – pre-Kuban). Second migration event (hollow arrows): Repin expansion. After Rassamakin (1999), Demchenko (2016).

Corded Ware origins

The latest publications on the Yampil barrow complex have not much improved our understanding of the complexity of Corded Ware origins from an archaeological point of view, involving multiple cultural (hence likely population) influences. This bit is from Ivanova et al., Baltic-Pontic Studies (2015) 20:1, and most hypotheses of the paper remain unanswered (except maybe for the relevance of the Złota group):

In the light of the above outline therefore one should argue that the ‘architecture of barrows’ associated in the ‘Yampil landscape’ of the Middle Dniester Area with the Eneolithic (specifically, mainly with the TC), precedes the development of a similar phenomenon that can be observed from 2900/2800 BC in the Upper Dniester Area and drainage basin of the Upper Vistula, associated with the CWC [Goslar et al. 2015; Włodarczak 2006; 2007; 2008; Jarosz, Włodarczak 2007]. The most consuming research question therefore is whether ritual customs making use of Eneolithic (Tripolye) ‘barrow architecture’ could have penetrated northwards along the Dniester route, where GAC communities functioned. One could also ask what role the rituals played among the autochthons [Kośko 2000; Włodarczak 2008; 2014: 335; Ivanova, Toshchev 2015b].

This issue has already been discussed with a resulting tentative systemic taxonomy in the studies of Włodarczak, arguing for the Złota culture (ZC) in the Vistula region as an illustration of one of the (Małopolska) reception centres of civilization inspirations from the oldest Pontic ‘barrow culture’ circle associated with the Eneolithic and Early Bronze Age [Włodarczak 2008]. Notably, it is in the ZC that one can notice a set of cultural traits (catacomb grave construction, burial details, forms and decoration of vessels) analogous to those shared by the north-western Black Sea Coast groups of the forest-steppe Eneolithic (chiefly Zhyvotilovka-Volchansk) and the Late Tripolye circle (chiefly Usatovo-Gordinești-Horodiștea-Kasperovtsy).

Globular Amphorae culture “exodus” to the Danube Delta: a – Globular Amphorae culture; b – GAC (1), Gorodsk (2), Vykhvatintsy (3) and Usatovo (4) groups of Trypillia culture; c – Coţofeni culture; d – northern border of the late phase of Baden culture; red arrows – direction of Globular Amphora culture expansion; blue arrow – direction of “reflux” of Globular Amphora culture (apud Włodarczak, 2008, with changes).

Taking into account that I6561 might be wrongly dated, we cannot include the Corded Ware-like sample of the end-5th millennium BC in the analysis of Corded Ware origins. That uncertainty in the chronology of the appearance of “Steppe ancestry” in Proto-Corded Ware peoples complicates the selection of any potential source population from the CHG cline.

Nevertheless, the lack of hg. R1a-M417 and sizeable Pre-Yamnaya-related ancestry in the sampled Pontic forest-steppe Eneolithic populations (represented exclusively by two samples from Dereivka ca. 3600-3400 BC) would leave open the interesting possibility that a similar ancestry got to the forest-steppe region between modern Poland and Ukraine during the known complex population movements of the Late Eneolithic.

It is known that Corded Ware-derived groups and Steppe Maykop show bad fits for Pre-Yamnaya/Yamnaya ancestry, and also that Steppe Maykop is a potential source of “Steppe-related ancestry” within the Eneolithic CHG mating network of the Pontic-Caspian steppes and forest-steppes. Testing Corded Ware for recent Trypillia and Maykop influences, typical of Late Trypillia and Late Maykop groups in the North Pontic area (such as Zhyvotylivka–Vovchans’k and Gordineşti), side by side with potential Pre-Yamnaya and Yamnaya sources thus makes sense:

Now, the main obvious difference between Khvalynsk-Yamnaya and Corded Ware is the long-lasting, pervasive Y-chromosome bottlenecks under R1b lineages in the former, compared to the haplogroup variability and late bottleneck under R1a-M417 in the latter, which speaks in favour – on top of everything else – of a different community of sub-Neolithic hunter-gatherers including hg. R1a-M417 hijacking the expansion of Steppe_Maykop-related ancestry around the Volhynian-Podolian Upland.

Akin to how Yamnaya patrilineal descendants hijacked regional EEF (±CWC) ancestry components mainly through exogamy, dragging them into the different expanding Bell Beaker groups (see below), but kept their Indo-European languages, these hunter-gatherers that admixed with peoples of “Steppe ancestry” were the most likely vector of expansion of Uralic languages in Eastern Europe.

PCA of ancient Eurasian samples. Marked likely Proto-Corded Ware samples and potential origin of its PCA cluster based on qpAdm results. See full PCA and more related files.

Baltic Corded Ware

One of the most interesting aspects of the results above is the surprising heterogeneity of the different regional groups, which is also reflected in the Y-DNA variability of early Corded Ware samples.

Seeing how Baltic CWC groups, especially the early Latvia_LN sample, show particularly bad fits with the models above, it seems necessary to test how this population might have come to be. My first impression in 2017 was that they could represent early Corded Ware groups admixed with Yamnaya settlers through their interactions along the Dnieper-Dniester corridor.

However, I recently predicted that the most likely admixture leading to their ancestry and PCA cluster would involve a Corded Ware-like group and a group related to sub-Neolithic cultures of eastern Europe, whose best proxy to date are EHG-like Khvalynsk samples (i.e. excluding the outlier with Pre-Yamnaya ancestry, I0434):

Detail of the PCA of the Corded Ware expansion. See full PCA and more related files.

Late Corded Ware + Yamnaya vanguard

Also relevant are the mixtures of Corded Ware from Esperstedt, and particularly those of the sample I0104, which, as I have repeated many times in this blog, I suspected to be influenced by vanguard Yamnaya settlers:

The infeasible models of CWC + Yamnaya_Kalmykia ± Hungary_Baden (see below for Bell Beakers) and the potential cluster formed with other samples from the Baltic suggest that it could represent a more complex set of mixtures with sub-Neolithic populations. On the other hand, its location in Germany, late date (ca. 2500 BC or later), and position in the PCA, together with the good fits obtained for Germany_Beaker as a source, suggest that the increase in Steppe-related ancestry + EEF makes it impossible for the model (as I set it) to directly include Yamnaya_Kalmykia, despite this excess Steppe-related ancestry actually coming from Yamnaya vanguard groups.

I think it is very likely that the future publication of EEF-admixed Yamnaya_Hungary samples (or maybe even Yamnaya vanguard samples) will improve the fits of this model.

These results confirm at least the need to distrust the common interpretation of mixtures including late Corded Ware samples from Esperstedt (giving rise to the “up to 75% Yamnaya ancestry of CWC” in the 2015 papers) as representative of the Corded Ware culture as a whole, and to always keep in mind that an admixture of European BA groups including Corded Ware Esperstedt as a source also includes East BBC-like ancestry, unless proven otherwise.

Yamnaya vanguard groups in Corded Ware territory before the expansion of Bell Beakers (ca. 2500 BC). See full map.

Bell Beaker expansion

A hotly (re)debated topic in the past 6 months or so, and for all the wrong reasons, is the origin of the Bell Beaker folk. Archaeology, linguistics, and different Y-chromosome bottlenecks clearly indicate that Bell Beakers were at the origin of the North-West Indo-European expansion in Europe, while the survival of Corded Ware-related groups in north-eastern Europe is clearly related to the expansion of Uralic languages.

NOTE. For the interesting case of Proto-Indo-Iranians expanding with Corded Ware-like ancestry, see more on the formation of Sintashta-Potapovka-Filatovka from East Uralic-speaking Abashevo and Pre-Proto-Indo-Iranian-speaking Poltavka herders. See also more on R1a in Indo-Iranians and on the social complexity of Sintashta.

Nevertheless, every single discarded theory out there seems to keep coming back to life from time to time, and a new wave of interest in “Bell Beaker from the Single Grave culture” somehow got revived in the process, too, because this obsession – unlike the “Bell Beakers from Iberia Chalcolithic” – is apparently acceptable in certain circles, for some reason.

We know that Iberian Beakers, British Beakers, or Sicilian EBA – representing the most likely closest source population of speakers of Proto-Galaico-Lusitanian, Pre-Celtic Indo-European, and Proto-Elymian, respectively – have already been successfully tested for a direct origin among Western European Beakers in Olalde et al. (2018), Olalde et al. (2019), and Fernandes et al. (2019).

This success in ascertaining a closer Beaker source is probably due to the physical isolation of the specific groups (related to Germany_Beaker, Netherlands_Beaker, and NE_Mediterranean_Beaker samples, respectively) after their migration into regions dominated by peoples without Steppe-related ancestry. Furthermore, Celtic-speaking populations expanding with Urnfield south of the Pyrenees also show a good fit with a source close to France_Beaker.

So I decided to test sampled Bell Beaker populations, to see if this could shed light on the most likely source population of individual Beaker groups and the direction of migration within Central Europe, i.e. roughly eastwards or westwards. As was to be expected for closely related populations (see the relevant discussion here), an attempt to offer a simplistic analysis of direction based on formal stats does not make any sense, because most of the alternative hypotheses cannot be rejected:

Not only because of the similar values obtained, but because it is absurd to take p-values as a measure of anything, especially when most of these conflicting groups with slightly ‘better’ or ‘worse’ p-values represent multiple different mixtures of the type (Yamnaya + EEF) + (Corded Ware + EEF ± Yamnaya), impossible to distinguish without selecting proper, direct ancestral populations…

A further example of how explosive the Bell Beaker expansion was into different territories, and of their extensive local admixture, is shown by the unsuccessful attempt by Olalde et al. (2018) to obtain an origin of the EEF source for all Beaker groups (excluding Iberian Beakers):

Investigating the genetic makeup of Beaker-complex-associated individuals. Testing different populations as a source for the Neolithic ancestry component in Beaker-complex-associated individuals. The table shows P values (* indicates values > 0.05) for the fit of the model: ‘Steppe_EBA + Neolithic/Copper Age’ source population.
Map of attested Yamnaya pit-grave burials in the Hungarian plains; superimposed in shades of blue are common areas covered by floods before the extensive controls imposed in the 19th century; in orange, cumulative thickness of sand, unfavourable loamy sand layer. Marked are settlements/findings of Boleráz (ca. 3500 BC on), Baden (until ca. 2800 BC), Kostolac (precise dates unknown), and Yamna kurgans (from ca. 3100/3000 BC on).

Now, there is a simpler way to understand what kind of Steppe-related ancestry is typical of Bell Beakers. I tested two simple models for some Beaker groups: Yamnaya + Hungary Baden vs. Corded Ware + GAC Poland. After all, the Bell Beaker folk should prefer a source more closely related to either Yamnaya Hungary or Central European Corded Ware:

Interestingly, models including Yamnaya + Baden show good fits for the most important groups related to North-West Indo-Europeans, including Bell Beakers from Germany, the Netherlands, Italy, and Poland, representing the most likely closest source populations of speakers of Pre-Proto-Celtic, Pre-Proto-Germanic, Proto-Italo-Venetic, and Pre-Proto-Balto-Slavic, respectively.

The admixed Yamnaya samples from Hungary that will hopefully be published soon by the Jena Lab will most likely further improve these fits, especially in combination with intermediate Chalcolithic populations of the Middle and Upper Danube and its tributaries, to a point where there will be an absolute chronological and geographical genomic trail from the fully Yamnaya-like Yamnaya settlers from Hungary to all North-West Indo-European-speaking groups of the Early Bronze Age.

The only difference between groups will be the gradual admixture events of their source Beaker group with local populations on their expansion paths, including peoples of mainly EEF, CWC+EEF, or CWC+EEF+Yamnaya related ancestry. There is ample evidence beyond ancestry models to support this, in particular continued Y-DNA bottlenecks under typical Yamnaya paternal lineages, mainly represented by R1b-L51 subclades.

Distribution of the Bell Beaker East Group, with its regional provinces, as of c. 2400 cal BC (after Heyd et al. 2004, modified). See full maps.

European Early Bronze Age

European EBA groups that might show conflicting results due to multiple admixture events with Corded Ware-related populations are the Únětice culture and the Nordic Late Neolithic.

The results for Únětice groups seem to be in line with what is expected of a Central European EBA population derived from Bell Beakers admixed with surrounding populations of East Bell Beaker and/or late (Epi-)Corded Ware descent.

Potential models of mixture for Nordic Late Neolithic samples – despite the bad fits due to the lack of direct ancestral CWC and BBC groups from Denmark – seem to be impossible to justify as derived exclusively from Single Grave or (even less) from Battle Axe peoples, supporting immigration waves of Bell Beakers from the south and further admixture events with local groups through maritime domination.

PCA of ancient European samples. Marked are Bronze Age clusters. See full PCAs.

Balkans Bronze Age

The potential origin of the typical Corded Ware Steppe-related ancestry in the social upheaval and population movements of the Dnieper-Dniester forest-steppe corridor during the 4th millennium BC raises the question: how much do Balkan Bronze Age groups owe their ancestry to a population different than the spread of Pre-Yamnaya-like Suvorovo-Novodanilovka chieftains? Furthermore, which Bronze Age groups seem to be more likely derived exclusively from Pre-Yamnaya groups, and which are more likely to be derived from a mixture of Yamnaya and Pre-Yamnaya? Do the formal stats obtained correspond to the expected results for each group?

Since the expansion of hg. I2a-L699 (TMRCA ca. 5500 BC) need not be associated with Yamnaya, some of these values – together with the assessment of each individual archaeological culture – may question their origin in a Yamnaya-related expansion rather than in a Khvalynsk-related one.

NOTE. These are the last ones I was able to test yesterday, and I have not thought these models through, so feel free to propose other source and target groups. In particular, complex movements through the North Pontic area during the Late Eneolithic would suggest that there might have been different Steppe-ancestry-related vs. EEF-related interactions in the north-west and west Pontic area before and during the expansion of Yamnaya.


One of the key Indo-European populations that should be derived from Yamnaya to confirm the Steppe hypothesis, together with North-West Indo-Europeans, are Proto-Greeks, who will in turn improve our understanding of the preceding Palaeo-Balkan community. Unfortunately, we only have Mycenaean samples from the Aegean, with slight contributions of Steppe-related ancestry.

Still, analyses with potential source populations for this Steppe ancestry show that the Yamnaya outlier from Bulgaria is a good fit:

The comparison of all results makes quite evident the reason for the good fits from (Srubnaya-related) Bulgaria_MLBA I2163 or of Sintashta_MLBA relative to the only a priori reasonable Yamnaya and Catacomb sources: it is not about some hypothetical shared ancestor in Graeco-Aryan-speaking East Yamnaya– or even Catacomb-Poltavka-related groups, because all available Yamnaya-related peoples are almost indistinguishable from each other (at least with the sampling available today). These results reflect a sizeable contribution of similar EEF-related populations from around the Carpathians in both Steppe-related groups: Corded Ware and Yamnaya settlers from the Balkans.

Cultural groups in and around the Balkans during the Early Bronze Age. See full maps.

qpAdm magic

In hobby ancestry magic, as in magic in general, it is not about getting dubious results out of thin air: misdirection is the key. A magician needs to draw the audience’s attention to ‘remarkable’ ancestry percentages coupled with ‘great’ (?) p-values that purportedly “prove” what the audience expects to see, distracting everyone from the truly interesting aspects, like statistical design, the data used (and its shortcomings), other opposing models, a comparison of values, a proper interpretation… you name it.

I reckon – based on the examples above – that the following problems lie at the core of bad uses of qpAdm:

  1. In the formal aspect, the poor understanding of what p-values and other formal stats obtained actually mean, and – more importantly – what they don’t mean. The simplistic trend to accept results of a few analyses at face value is necessarily wrong, in so far as there is often no proper reasoning of what is being assessed and how, and there is never a previous opinion about what could be expected if the alternative hypotheses were true.
  2. In the interpretation aspect, the poor judgement of accompanying any results with simplistic, superficial, irrelevant, and often plainly wrong archaeological or linguistic data selected a posteriori; the inclusion of some racial or sociopolitical overtones in the mixture to set a propitious mood in the target audience; and a sort of ritualistic theatrics with the main theme of ‘winning’, that is best completed with ad hominems.

If you get rid of all this, the most reasonable interpretation of the output of a model proposed and tested should be similar to Nick Patterson’s words in his explanation of qpWave and qpAdm use:

Here we see that, at least in this analysis there are reasonable models with CordedWareNeolithic is a mix of either WHG or LBKNeolithic and YamnayaEBA. (…) The point of this note is not to give a serious phylogenetic analysis but the results here certainly support a major Steppe contribution to the Corded Ware population, which is entirely concordant with the archaeology [?].

Very far, as you can see, from the childish “Eureka! I proved the source!”-kind of thinking common among hobbyists.

The Mycenaean case is an illustrative example: if the Yamnaya outlier from Bulgaria were not available, and if one were not careful when designing and assessing those mixture models, the interpretation would range from erroneous (viz. a Graeco-Aryan substrate, as I initially thought) to impossible (say, inventing migration waves of Sintashta or Srubnaya peoples into Crete). The models presented above show that a contribution of Yamnaya to Mycenaeans couldn’t be rejected, and this alone should have been enough to accept Yamnaya as the most likely source population of “Steppe ancestry” in Proto-Greeks, pending intermediate samples from the Balkans. In other words, one could actually find that ‘the best’ p-value for source populations of Mycenaeans comes from a combination of modern Poles + Turks, despite the impracticality of such a model…

I haven’t been able to reproduce results which supposedly showed that Corded Ware is more likely to be derived from (Pre-)Yamnaya than from any other source population, or that Corded Ware is better suited as the ancestral population of Bell Beakers. The analyses above show values in line with what has been published in recent scientific papers, and what should be expected based on linguistics and archaeology. So I’ll go out on a limb here and say that it’s only through a careful selection of outgroups and samples tested, and of as few compared models as possible, that you could eventually get this kind of result and interpretation, if at all.

Whether that kind of special care for outgroups and samples is about (a) an acceptable fine-tuning of the analyses, (b) a simplistic selection dragged from the first papers published and applied indiscriminately to all models, or (c) cherry picking analyses until results fit the expected outcome, is a question that will become mostly irrelevant when future publications continue to support an origin of the expansion of ancient Indo-European languages in Khvalynsk- and Yamnaya-related migrations.

Feel free to suggest (reasonable) modifications to correct some of these models in the comments. Also, be sure to check out other values such as proportions, SD or SNPs of the different results that I might not have taken into account when assessing ‘good’ or ‘bad’ fits.
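For anyone checking the spreadsheet, here is a minimal sketch in Python of the kind of cautious reading argued for in this section. It is not part of qpAdm or of any published pipeline; the class, the function, and the labels are my own hypothetical names. The only thresholds it uses are the conventional p > 0.05 cut-off (the one marked with * in the Olalde et al. table quoted above) and the requirement that admixture proportions fall between 0 and 1. Note that it can only ever say that a model is "not rejected", never that a source has been proven.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    """One row of results (hypothetical container, not a qpAdm structure)."""
    p_value: float
    proportions: list  # one admixture proportion per source


def assess(result, alpha=0.05):
    """Return a cautious label for a single model."""
    # Proportions outside [0, 1] make the model infeasible whatever the p-value.
    if any(w < 0.0 or w > 1.0 for w in result.proportions):
        return "infeasible"
    # A low p-value means the model is rejected with the outgroups used.
    if result.p_value < alpha:
        return "rejected"
    # A high p-value is absence of rejection, not proof of the source.
    return "not rejected"
```

Two competing models that both come back as "not rejected" cannot be ranked by comparing their p-values; that is exactly the situation of the closely related Beaker groups discussed above.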


Yamnaya ancestry: mapping the Proto-Indo-European expansions


The latest papers from Ning et al. Cell (2019) and Anthony JIES (2019) have offered some interesting new data, supporting once more what could be inferred since 2015, and what was evident in population genomics since 2017: that Proto-Indo-Europeans expanded under R1b bottlenecks, and that the so-called “Steppe ancestry” referred to two different components, one – Yamnaya or Steppe_EMBA ancestry – expanding with Proto-Indo-Europeans, and the other one – Corded Ware or Steppe_MLBA ancestry – expanding with Uralic speakers.

The following maps are based on formal stats published in the papers and supplementary materials from 2015 until today, mainly on Wang et al. (2018 & 2019), Mathieson et al. (2018) and Olalde et al. (2018), and others like Lazaridis et al. (2016), Lazaridis et al. (2017), Mittnik et al. (2018), Lamnidis et al. (2018), Fernandes et al. (2018), Jeong et al. (2019), Olalde et al. (2019), etc.

NOTE. As in the Corded Ware ancestry maps, the selected reports in this case are centered on the prototypical Yamnaya ancestry vs. other simplified components, so everything else refers to simplistic ancestral components widespread across populations that do not necessarily share any recent connection, much less a language. In fact, most of the time they clearly didn’t. They can be interpreted as “EHG that is not part of the Yamnaya component”, or “CHG that is not part of the Yamnaya component”. They can’t be read as “expanding EHG people/language” or “expanding CHG people/language”, at least no more than maps of “Steppe ancestry” can be read as “expanding Steppe people/language”. Also, remember that I have left the default behaviour for color classification, so that the highest value (i.e. 1, or white colour) could mean anything from 10% to 100% depending on the specific ancestry and period; that’s what the legend is for… But, fere libenter homines id quod volunt credunt.
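To make the point about the colour scale concrete, here is a small Python sketch. It assumes that the default classification is a simple per-map min-max stretch, which is my reading of the behaviour rather than something documented here, and the function name and the input values are hypothetical and chosen only for illustration.

```python
def scale_per_map(values):
    """Min-max scale one map's ancestry fractions to the 0-1 colour range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat map has no contrast to stretch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative values only: one map peaking at 10%, another peaking at 100%.
low_peak_map = scale_per_map([0.0, 0.05, 0.10])
high_peak_map = scale_per_map([0.0, 0.5, 1.0])
```

Both maps end up spanning the full colour range, so the brightest colour on one map stands for 10% and on the other for 100%; hence the need to read each legend.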


  1. Neolithic or the formation of Early Indo-European
  2. Eneolithic or the expansion of Middle Proto-Indo-European
  3. Chalcolithic / Early Bronze Age or the expansion of Late Proto-Indo-European
  4. European Early Bronze Age and MLBA or the expansion of Late PIE dialects

1. Neolithic

Anthony (2019) agrees with the most likely explanation of the CHG component found in Yamnaya, as derived from steppe hunter-fishers close to the lower Volga basin. The ultimate origin of this specific CHG-like component that eventually formed part of the Pre-Yamnaya ancestry is not clear, though:

The hunter-fisher camps that first appeared on the lower Volga around 6200 BC could represent the migration northward of un-admixed CHG hunter-fishers from the steppe parts of the southeastern Caucasus, a speculation that awaits confirmation from aDNA.

Natural neighbor interpolation of CHG ancestry among Neolithic populations. See full map.

The typical EHG component that eventually formed part of Pre-Yamnaya ancestry came from the Middle Volga Basin, most likely close to the Samara region, as shown by the sampled Samara hunter-gatherer (ca. 5600-5500 BC):

After 5000 BC domesticated animals appeared in these same sites in the lower Volga, and in new ones, and in grave sacrifices at Khvalynsk and Ekaterinovka. CHG genes and domesticated animals flowed north up the Volga, and EHG genes flowed south into the North Caucasus steppes, and the two components became admixed.

Natural neighbor interpolation of EHG ancestry among Neolithic populations. See full map.

To the west, in the Dnieper-Dniester area, WHG became the dominant ancestry after the Mesolithic, at the expense of EHG, revealing a likely mating network reaching to the north into the Baltic:

Like the Mesolithic and Neolithic populations here, the Eneolithic populations of Dnieper-Donets II type seem to have limited their mating network to the rich, strategic region they occupied, centered on the Rapids. The absence of CHG shows that they did not mate frequently if at all with the people of the Volga steppes (…)

Natural neighbor interpolation of WHG ancestry among Neolithic populations. See full map.

North-West Anatolia Neolithic ancestry, typical of expanding Early European farmers, is found up to the border of the Dniester, as Anthony (2007) had predicted.

Natural neighbor interpolation of Anatolia Neolithic ancestry among Neolithic populations. See full map.

2. Eneolithic

From Anthony (2019):

After approximately 4500 BC the Khvalynsk archaeological culture united the lower and middle Volga archaeological sites into one variable archaeological culture that kept domesticated sheep, goats, and cattle (and possibly horses). In my estimation, Khvalynsk might represent the oldest phase of PIE.

(…) this middle Volga mating network extended down to the North Caucasian steppes, where at cemeteries such as Progress-2 and Vonyuchka, dated 4300 BC, the same Khvalynsk-type ancestry appeared, an admixture of CHG and EHG with no Anatolian Farmer ancestry, with steppe-derived Y-chromosome haplogroup R1b. These three individuals in the North Caucasus steppes had higher proportions of CHG, overlapping Yamnaya. Without any doubt, a CHG population that was not admixed with Anatolian Farmers mated with EHG populations in the Volga steppes and in the North Caucasus steppes before 4500 BC. We can refer to this admixture as pre-Yamnaya, because it makes the best currently known genetic ancestor for EHG/CHG R1b Yamnaya genomes.

From Wang et al. (2019):

Three individuals from the sites of Progress 2 and Vonyuchka 1 in the North Caucasus piedmont steppe (‘Eneolithic steppe’), which harbour EHG and CHG related ancestry, are genetically very similar to Eneolithic individuals from Khvalynsk II and the Samara region. This extends the cline of dilution of EHG ancestry via CHG-related ancestry to sites immediately north of the Caucasus foothills

Natural neighbor interpolation of Pre-Yamnaya ancestry among Neolithic populations. See full map. This map corresponds roughly to the map of Khvalynsk-Novodanilovka expansion, and in particular to the expansion of horse-head pommel-scepters (read more about Khvalynsk, and specifically about horse symbolism)

NOTE. Unpublished samples from Ekaterinovka have been previously reported as within the R1b-L23 tree. Interestingly, although the Varna outlier is a female, the Balkan outlier from Smyadovo shows two positive SNP calls for hg. R1b-M269. However, its poor coverage makes its most conservative haplogroup prediction R-M343.

The formation of this Pre-Yamnaya ancestry sets this Volga-Caucasus Khvalynsk community apart from the rest of the EHG-like population of eastern Europe.

Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Eneolithic populations. See full map.

Anthony (2019) seems to rely on ADMIXTURE graphics when he writes that the late Sredni Stog sample from Alexandria shows “80% Khvalynsk-type steppe ancestry (CHG&EHG)”. While this seems the most logical conclusion of what might have happened after the Suvorovo-Novodanilovka expansion through the North Pontic steppes (see my post on “Steppe ancestry” step by step), formal stats have not confirmed that.

In fact, analyses published in Wang et al. (2019) rejected that Corded Ware groups are derived from this Pre-Yamnaya ancestry, a reality that had been already hinted in Narasimhan et al. (2018), when Steppe_EMBA showed a poor fit for expanding Srubna-Andronovo populations. Hence the need to consider the whole CHG component of the North Pontic area separately:

Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Eneolithic populations. See full map. You can read more about population movements in the late Sredni Stog and closer to the Proto-Corded Ware period.

NOTE. Fits for WHG + CHG + EHG in Neolithic and Eneolithic populations are taken in part from Mathieson et al. (2019) supplementary materials (download Excel here). Unfortunately, while data on the Ukraine_Eneolithic outlier from Alexandria abounds, I don’t have specific data on the so-called ‘outlier’ from Dereivka compared to the other two analyzed together, so these maps of CHG and EHG expansion are possibly showing a lesser distribution to the west than the real one ca. 4000-3500 BC.

Natural neighbor interpolation of WHG ancestry among Eneolithic populations. See full map.

Anatolia Neolithic ancestry clearly spread to the east into the north Pontic area through a Middle Eneolithic mating network, most likely opened after the Khvalynsk expansion:

Natural neighbor interpolation of Anatolia Neolithic ancestry among Eneolithic populations. See full map.
Natural neighbor interpolation of Iran Chl. ancestry among Eneolithic populations. See full map.

Regarding Y-chromosome haplogroups, Anthony (2019) insists on the evident association of Khvalynsk, Yamnaya, and the spread of Pre-Yamnaya and Yamnaya ancestry with the expansion of elite R1b-L754 (and some I2a2) individuals:

Y-DNA haplogroups in West Eurasia during the Early Eneolithic in the Pontic-Caspian steppes. See full map, and see culture, ADMIXTURE, Y-DNA, and mtDNA maps of the Early Eneolithic and Late Eneolithic.

3. Early Bronze Age

Data from Wang et al. (2019) show that Corded Ware-derived populations do not have good fits for Eneolithic_Steppe-like ancestry, no matter the model. In other words: Corded Ware populations show not only a higher contribution of Anatolia Neolithic ancestry (ca. 20-30% compared to the ca. 2-10% of Yamnaya); they show a different EHG + CHG combination compared to the Pre-Yamnaya one.

Supplementary Table 13. P values of rank=2 and admixture proportions in modelling Steppe ancestry populations as a three-way admixture of Eneolithic_steppe, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Test, Eneolithic_steppe, Anatolian_Neolithic, WHG.
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Yamnaya Kalmykia and Afanasievo show the closest fits to the Eneolithic population of the North Caucasian steppes, rejecting thus sizeable contributions from Anatolia Neolithic and/or WHG, as shown by the SD values. Both probably show then a Pre-Yamnaya ancestry closest to the late Repin population.

Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional AF ancestry in Steppe groups and additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups. See tables above. Modified from Wang et al. (2019). Within a blue square, Yamnaya-related groups; within a cyan square, Corded Ware-related groups. Green background behind best p-values. In red circle, SD of AF/WHG ancestry contribution in Afanasevo and Yamnaya Kalmykia, with ranges that almost include 0%.

EBA maps include data from Wang et al. (2018) supplementary materials, specifically unpublished Yamnaya samples from Hungary that appeared in analysis of the preprint, but which were taken out of the definitive paper. Their location among Yamnaya settlers from Hungary is speculative, although most uncovered kurgans in Hungary are concentrated in the Tisza-Danube interfluve.

Natural neighbor interpolation of Pre-Yamnaya ancestry among Early Bronze Age populations. See full map. This map corresponds roughly with the known expansion of late Repin/Yamnaya settlers.

The Y-chromosome bottleneck of elite males from Proto-Indo-European clans under R1b-L754 and some I2a2 subclades, already visible in the Khvalynsk sampling, became even more noticeable in the subsequent expansion of late Repin/early Yamnaya elites under R1b-L23 and I2a-L699:

Y-DNA haplogroups in West Eurasia during the Yamnaya expansion. See full map and maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Chalcolithic and Yamnaya Hungary.

Maps of CHG, EHG, Anatolia Neolithic, and probably WHG show the expansion of these components among Corded Ware-related groups in North Eurasia, apart from other cultures close to the Caucasus:

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you can read the post Corded Ware ancestry in North Eurasia and the Uralic expansion.

Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Early Bronze Age populations. See full map.
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Early Bronze Age populations. See full map.
Natural neighbor interpolation of WHG ancestry among Early Bronze Age populations. See full map.
Natural neighbor interpolation of Anatolia Neolithic ancestry among Early Bronze Age populations. See full map.
Natural neighbor interpolation of Iran Chl. ancestry among Early Bronze Age populations. See full map.

4. Middle to Late Bronze Age

The following maps show the most likely distribution of Yamnaya ancestry during the Bell Beaker-, Balkan-, and Sintashta-Potapovka-related expansions.

4.1. Bell Beakers

The amount of Yamnaya ancestry is probably overestimated among populations where Bell Beakers replaced Corded Ware. A map of Yamnaya ancestry among Bell Beakers gets trickier for the following reasons:

  • Expanding Repin peoples of Pre-Yamnaya ancestry must have had admixture through exogamy with late Sredni Stog/Proto-Corded Ware peoples during their expansion into the North Pontic area, and Sredni Stog in turn had probably some Pre-Yamnaya admixture, too (although they don’t appear in the simplistic formal stats above). This is supported by the increase of Anatolia farmer ancestry in more western Yamna samples.
  • Later, Yamnaya admixed through exogamy with Corded Ware-like populations in Central Europe during their expansion. Even samples from the Middle to Upper Danube and around the Lower Rhine will probably show increasing contributions of Steppe_MLBA, at the same time as they show an increasing proportion of EEF-related ancestry.
  • To complicate things further, the late Corded Ware Esperstedt family (from ca. 2500 BC or later) shows, in turn, what seems like a recent admixture with Yamnaya vanguard groups, with the sample of highest Yamnaya ancestry being the paternal uncle of other individuals (all of hg. R1a-M417), suggesting that there might have been many similar Central European mating networks from the mid-3rd millennium BC on, of (mainly) Yamnaya-like R1b elites displaying a small proportion of CW-like ancestry admixing through exogamy with Corded Ware-like peoples who already had some Yamnaya ancestry.

Natural neighbor interpolation of Yamnaya ancestry among Middle to Late Bronze Age populations (Esperstedt CWC site close to BK_DE, label is hidden by BK_DE_SAN). See full map. You can see how this map correlates with the map of Late Copper Age migrations and Yamnaya into Bell Beaker expansion.

NOTE. Terms like “exogamy”, “male-driven migration”, and “sex bias”, are not only based on the Y-chromosome bottlenecks visible in the different cultural expansions since the Palaeolithic. Despite the scarce sampling available in 2017 for analysis of “Steppe ancestry”-related populations, it appeared to show already a male sex bias in Goldberg et al. (2017), and it has been confirmed for Neolithic and Copper Age population movements in Mathieson et al. (2018) – see Supplementary Table 5. The analysis of male-biased expansion of “Steppe ancestry” in CWC Esperstedt and Bell Beaker Germany is, for the reasons stated above, not very useful to distinguish their mutual influence, though.

Based on data from Olalde et al. (2019), Bell Beakers from Germany are the closest sampled ones to expanding East Bell Beakers, and those close to the Rhine – i.e. French, Dutch, and British Beakers in particular – show a clear excess “Steppe ancestry” due to their exogamy with local Corded Ware groups:

Only one 2-way model fits the ancestry in Iberia_CA_Stp with P-value>0.05: Germany_Beaker + Iberia_CA. Finding a Bell Beaker-related group as a plausible source for the introduction of steppe ancestry into Iberia is consistent with the fact that some of the individuals in the Iberia_CA_Stp group were excavated in Bell Beaker associated contexts. Models with Iberia_CA and other Bell Beaker groups such as France_Beaker (P-value=7.31E-06), Netherlands_Beaker (P-value=1.03E-03) and England_Beaker (P-value=4.86E-02) failed, probably because they have slightly higher proportions of steppe ancestry than the true source population.
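To make the rejection criterion in the quoted passage concrete, here is a minimal Python sketch. It uses only the three P-values quoted above; the P-value of the accepted Germany_Beaker + Iberia_CA model is not given in the excerpt, so it is left out. The names `ALPHA`, `reported_p_values`, and `is_rejected` are my own labels for illustration, not part of qpAdm or of the paper.

```python
# Threshold stated in the quote: a model "fits" only with P-value > 0.05.
ALPHA = 0.05

# P-values quoted from Olalde et al. for Iberia_CA + <source> models.
reported_p_values = {
    "France_Beaker": 7.31e-06,
    "Netherlands_Beaker": 1.03e-03,
    "England_Beaker": 4.86e-02,
}


def is_rejected(p_value, alpha=ALPHA):
    """Return True when the model does not reach P-value > alpha."""
    return p_value <= alpha


# Collect the source populations whose two-way model fails the threshold.
rejected = [name for name, p in reported_p_values.items() if is_rejected(p)]

for name in rejected:
    print(f"Iberia_CA + {name}: rejected (P = {reported_p_values[name]:.2e})")
```

All three alternative sources fall below the threshold, although England_Beaker is rejected only narrowly (0.0486 against a 0.05 cut-off).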


The exogamy with Corded Ware-like groups in the Lower Rhine Basin seems at this point undeniable, as is the origin of Bell Beakers around the Middle-Upper Danube Basin from Yamnaya Hungary.

To avoid this excess “Steppe ancestry” showing up in the maps, since Bell Beakers from Germany pack the most Yamnaya ancestry among East Bell Beakers outside Hungary (ca. 51.1% “Steppe ancestry”), I equated this maximum with BK_Scotland_Ach (which shows ca. 61.1% “Steppe ancestry”, highest among western Beakers), and applied a simple rule of three for “Steppe ancestry” in Dutch and British Beakers.
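The rule of three described above can be written out as a short Python sketch. The two constants come from the text; the function name and the 55.0% input are hypothetical, used only to illustrate the scaling, and do not correspond to any specific sampled group.

```python
# Maximum "Steppe ancestry" among East Bell Beakers outside Hungary (Germany), in %.
GERMANY_BEAKER_MAX = 51.1
# Highest "Steppe ancestry" among western Beakers (BK_Scotland_Ach), in %.
SCOTLAND_ACH_MAX = 61.1


def rescale_steppe_ancestry(raw_percent):
    """Rule of three: scale values so that 61.1% is mapped onto 51.1%."""
    return raw_percent * GERMANY_BEAKER_MAX / SCOTLAND_ACH_MAX


# BK_Scotland_Ach itself is mapped onto the German maximum.
print(round(rescale_steppe_ancestry(61.1), 1))
# Hypothetical Dutch or British Beaker group with 55.0% raw "Steppe ancestry".
print(round(rescale_steppe_ancestry(55.0), 1))
```

The adjustment is purely proportional, so it preserves the ranking of Dutch and British groups while capping them at the German value; it is a mapping convenience, not a formal estimate of Yamnaya ancestry.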

NOTE. Formal stats for “Steppe ancestry” in Bell Beaker groups are available in Olalde et al. (2018) supplementary materials (PDF). I didn’t apply this adjustment to Bk_FR groups because of the R1b Bell Beaker sample from the Champagne/Alsace region reported by Samantha Brunel that will pack more Yamnaya ancestry than any other sampled Beaker to date, hence probably driving the Yamnaya ancestry up in French samples.

The most likely outcome in the following years, when Yamnaya and Corded Ware ancestry are investigated separately, is that Yamnaya ancestry will be much lower the farther away from the Middle and Lower Danube region, similar to the case in Iberia, so the map above probably overestimates this component in most Beakers to the north of the Danube. Even the late Hungarian Beaker samples, who pack the highest Yamnaya ancestry (up to 75%) among Beakers, represent likely a back-migration of Moravian Beakers, and will probably show a contribution of Corded Ware ancestry due to the exogamy with local Moravian groups.

Despite this decreasing admixture as Bell Beakers spread westward, the explosive expansion of Yamnaya R1b male lineages (in the words of David Reich) and the radical replacement of local ones – whether derived from Corded Ware or Neolithic groups – shows the true extent of the North-West Indo-European expansion in Europe:

Y-DNA haplogroups in West Eurasia during the Bell Beaker expansion. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Late Copper Age and of the Yamnaya-Bell Beaker transition.

4.2. Palaeo-Balkan

Data on Palaeo-Balkan movements are still scarce, although it is known that:

  1. Yamnaya ancestry appears among Mycenaeans, with the Yamnaya Bulgaria sample being its best current ancestral fit;
  2. the emergence of steppe ancestry and R1b-M269 in the eastern Mediterranean was associated with Ancient Greeks;
  3. Thracians, Albanians, and Armenians also show R1b-M269 subclades and “Steppe ancestry”.

4.3. Sintashta-Potapovka-Filatovka

Interestingly, Potapovka is the only Corded Ware-derived culture that shows good fits for Yamnaya ancestry, despite having replaced Poltavka in the region under the same Corded Ware-like (Abashevo) influence as Sintashta.

This proves that there was a period of admixture in the Pre-Proto-Indo-Iranian community between CWC-like Abashevo and Yamnaya-like Catacomb-Poltavka herders in the Sintashta-Potapovka-Filatovka community, probably more easily detectable in this group because of the specific temporal and geographic sampling available.

Supplementary Table 14. P values of rank=3 and admixture proportions in modelling Steppe ancestry populations as a four-way admixture of distal sources EHG, CHG, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Steppe cluster, EHG, CHG, WHG, Anatolian_Neolithic
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Srubnaya ancestry shows a best fit with non-Pre-Yamnaya ancestry, i.e. with different CHG + EHG components – possibly because the more western Potapovka (ancestral to Proto-Srubnaya Pokrovka) also showed good fits for it. Srubnaya shows poor fits for Pre-Yamnaya ancestry probably because Corded Ware-like (Abashevo) genetic influence increased during its formation.

On the other hand, more eastern Corded Ware-derived groups like Sintashta and its more direct offshoot Andronovo show poor fits with this model, too, but their fits are still better than those including Pre-Yamnaya ancestry.

Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Middle to Late Bronze Age populations. See full map.
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Middle to Late Bronze Age populations. See full map.
Natural neighbor interpolation of Anatolia Neolithic ancestry among Middle to Late Bronze Age populations. See full map.
Natural neighbor interpolation of Iran Chl. ancestry among Middle to Late Bronze Age populations. See full map.

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you should read the post Corded Ware ancestry in North Eurasia and the Uralic expansion instead.

The bottleneck of Proto-Indo-Iranians under R1a-Z93 was not yet complete by the time when the Sintashta-Potapovka-Filatovka community expanded with the Srubna-Andronovo horizon:

Y-DNA haplogroups in West Eurasia during the European Early Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Bronze Age.

4.4. Afanasevo

At the end of the Afanasevo culture, at least three samples show hg. Q1b (ca. 2900-2500 BC), which seemed to point to a resurgence of local lineages, despite continuity of the prototypical Pre-Yamnaya ancestry. On the other hand, Anthony (2019) makes this cryptic statement:

Yamnaya men were almost exclusively R1b, and pre-Yamnaya Eneolithic Volga-Caspian-Caucasus steppe men were principally R1b, with a significant Q1a minority.

Since the only available samples from the Khvalynsk community are R1b (x3), Q1a (x1), and R1a (x1), it seems strange that Anthony would talk about a “significant minority”, unless Q1a (potentially Q1b in the newer nomenclature) pops up in more individuals among the ca. 30 new ones yet to be published. Because he also mentions I2a2 as appearing in one elite burial, it seems Q1a (like R1a-M459) will not appear under elite kurgans, although it is still possible that hg. Q1a was involved in the expansion of Afanasevo to the east.
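To see how little five samples can say about the size of a minority, one can put a rough confidence interval around the 1-in-5 Q1a count. The sketch below uses the Wilson score interval, which is my own choice for illustration and not a method used by Anthony or in the cited papers; it also assumes the five samples behave like independent draws, which ancient burials need not satisfy.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 for roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Published Khvalynsk Y-DNA counts mentioned in the text: R1b x3, Q1a x1, R1a x1.
q1a_count, total = 1, 5
low, high = wilson_interval(q1a_count, total)
print(f"Q1a share: {q1a_count / total:.0%}, interval: {low:.1%} to {high:.1%}")
```

The interval is very wide, so the published sample alone can neither support nor rule out a sizeable Q1a minority; only the unpublished individuals can settle that.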

Y-DNA haplogroups in West Eurasia during the Middle Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Middle Bronze Age and the Late Bronze Age.

Okunevo, which replaced Afanasevo in the Altai region, shows a majority of hg. Q1b, but also some R1b-M269 samples typical of Afanasevo, suggesting partial genetic continuity.

NOTE. Other sampled Siberian populations clearly show a variety of Q subclades that likely expanded during the Palaeolithic, such as Baikal EBA samples from Ust’Ida and Shamanka with a majority of Q1b, and hg. Q reported from Elunino, Sagsai, Khövsgöl, and also among peoples of the Srubna-Andronovo horizon (the Krasnoyarsk MLBA outlier), and in Karasuk.

From Damgaard et al. Science (2018):

(…) in contrast to the lack of identifiable admixture from Yamnaya and Afanasievo in the CentralSteppe_EMBA, there is an admixture signal of 10 to 20% Yamnaya and Afanasievo in the Okunevo_EMBA samples, consistent with evidence of western steppe influence. This signal is not seen on the X chromosome (qpAdm P value for admixture on X 0.33 compared to 0.02 for autosomes), suggesting a male-derived admixture, also consistent with the fact that 1 of 10 Okunevo_EMBA males carries a R1b1a2a2 Y chromosome related to those found in western pastoralists. In contrast, there is no evidence of western steppe admixture among the more eastern Baikal region Bronze Age (~2200 to 1800 BCE) samples.

This Yamnaya ancestry has been also recently found to be the best fit for the Iron Age population of Shirenzigou in Xinjiang – where Tocharian languages were attested centuries later – despite the haplogroup diversity acquired during their evolution, likely through an intermediate Chemurchek culture (see a recent discussion on the elusive Proto-Tocharians).

Haplogroup diversity seems to be common in Iron Age populations all over Eurasia, most likely due to the spread of different types of sociopolitical structures where alliances played a more relevant role in the expansion of peoples. A well-known example of this is the spread of Akozino warrior-traders in the whole Baltic region under a partial N1a-VL29-bottleneck associated with the emerging chiefdom-based systems under the influence of expanding steppe nomads.

Y-DNA haplogroups in West Eurasia during the Early Iron Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Iron Age and Late Iron Age.

Surprisingly, then, Proto-Tocharians from Shirenzigou pack up to 74% Yamnaya ancestry, in spite of the 2,000 years that separate them from the demise of the Afanasevo culture. They show more Yamnaya ancestry than any other population by that time, being thus a sort of Late PIE fossils not only in their archaic dialect, but also in their genetic profile:


The recent intrusion of Corded Ware-like ancestry, as well as the variable admixture with Siberian and East Asian populations, both point to the known intense Old Iranian and Old/Middle Chinese contacts. The scarce Proto-Samoyedic and Proto-Turkic loans in Tocharian suggest a rather loose, probably more distant connection with East Uralic and Altaic peoples from the forest-steppe and steppe areas to the north (read more about external influences on Tocharian).

Interestingly, both R1b samples, MO12 and M15-2 – likely of Asian R1b-PH155 branch – show a best fit for Andronovo/Srubna + Hezhen/Ulchi ancestry, suggesting a likely connection with Iranians to the east of Xinjiang, who later expanded as the Wusun and Kangju. How they might have been related to Huns and Xiongnu individuals, who also show this haplogroup, is yet unknown, although Huns also show hg. R1a-Z93 (probably most R1a-Z2124) and Steppe_MLBA ancestry, earlier associated with expanding Iranian peoples of the Srubna-Andronovo horizon.

All in all, it seems that prehistoric movements explained through the lens of genetic research fit perfectly well the linguistic reconstruction of Proto-Indo-European and Proto-Uralic.


How the genocidal Yamnaya men loved to switch cultures


After some really interesting fantasy full of arrows, it seems Kristiansen & friends are coming back to their most original idea from 2015, now in New Scientist’s recent clickbait Story of most murderous people of all time revealed in ancient DNA (2019):

Teams led by David Reich at Harvard Medical School and Eske Willerslev at the University of Copenhagen in Denmark announced, independently, that occupants of Corded Ware graves in Germany could trace about three-quarters of their genetic ancestry to the Yamnaya. It seemed that Corded Ware people weren’t simply copying the Yamnaya; to a large degree they actually were Yamnayan in origin.

If you think you have seen that movie, it’s because you have. They are at it again, Corded Ware from Yamna, and more “steppe ancestry” = “more Indo-European”. It seems we haven’t learnt anything about “Steppe ancestry” since 2015. But there’s more:

Genocidal peoples who “switch cultures”

Burial practices shifted dramatically, a warrior class appeared, and there seems to have been a sharp upsurge in lethal violence. “I’ve become increasingly convinced there must have been a kind of genocide,” says Kristian Kristiansen at the University of Gothenburg, Sweden.

The collaboration revealed that the origin and initial spread of Bell Beaker culture had little to do – at least genetically – with the expansion of the Yamnaya or Corded Ware people into central Europe. “It started in (…)” It is in that region that the earliest Bell Beaker objects – including arrowheads, copper daggers and distinctive Bell-shaped pots – have been found, in archaeological sites carbon-dated to 4700 years ago. Then, Bell Beaker culture began to spread east, although the people more or less stayed put. By about 4600 years ago, it reached the most westerly Corded Ware people around where the Netherlands now lies. For reasons still unclear, the Corded Ware people fully embraced it. “They simply take on part of the Bell Beaker package and become Beaker people,” says Kristiansen.

The fact that the genetic analysis showed the Britons then all-but disappeared within a couple of generations might be significant. It suggests the capacity for violence that emerged when the Yamnaya lived on the Eurasia steppe remained even as these people moved into Europe, switched identity from Yamnaya to Corded Ware, and then switched again from Corded Ware to Bell Beaker.

Notice what Kristiansen did there? Yamnaya men “switched identities” into Corded Ware, then “switched identities” into Bell Beakers… So, the most aggressive peoples who have ever existed, exterminating all other Europeans, were actually not so violent when embracing wholly different cultures whose main connection is that they built kurgans (yes, Gimbutas lives on).

NOTE. By the way, just so we are clear, only Indo-Europeans are “genocidal”. Not like Neolithic farmers, or Palaeolithic or Mesolithic populations, or more recent Bronze Age or Iron Age peoples, who also replaced Y-DNA from many regions…


In fact, there is much stronger evidence that these Yamnaya Beakers were ruthless. By about 4500 years ago, they had pushed westwards into the Iberian Peninsula, where the Bell Beaker culture originated a few centuries earlier. Within a few generations, about 40 per cent of the DNA of people in the region could be traced back to the incoming Yamnaya Beakers, according to research by a large team including Reich that was published this month. More strikingly, the ancient DNA analysis reveals that essentially all the men have Y chromosomes characteristic of the Yamnaya, suggesting only Yamnaya men had children.

“The collision of these two populations was not a friendly one, not an equal one, but one where the males from outside were displacing local males and did so almost completely,” Reich told New Scientist Live in September. This supports Kristiansen’s view of the Yamnaya and their descendants as an almost unimaginably violent people. Indeed, he is about to publish a paper in which he argues that they were responsible for the genocide of Neolithic Europe’s men. “It’s the only way to explain that no male Neolithic lines survived,” he says.

So these unimaginably violent Yamnaya men had children exclusively with their Y chromosomes…but not Dutch Single Grave peoples. These great great steppe-like northerners switched culture, cephalic index…and Y-chromosome from R1a (and others) to R1b-L151 to expand Italo-Celtic From The West™.

It’s hilarious how (exactly like their latest funny episode of PIE from south of the Caucasus) this new visionary idea copied by Copenhagen from amateur friends (or was it the other way around?) had been already rejected before this article came out, in Olalde et al. (2019), and that “Corded Ware=Indo-European” fans have become a parody of themselves.

What’s not to love about 2019 with all this back-and-forth hopping between old and new pet theories?

NOTE. I would complain (again) that the obsessive idea of the Danes is that Denmark CWC is (surprise!) the Pre-Germanic community, so it has nothing to do with “steppe ancestry = Indo-European” (or even with “Corded Ware = Indo-European”, for that matter), but then again you have Koch still arguing for Celtic from the West, Kortlandt still arguing for Balto-Slavic from the east, and – no doubt worst of all – “R1a=IE / R1b=Vasconic / N1c=Uralic” ethnonationalists arguing for whatever is necessary right now, in spite of genetic research.

So prepare for the next episode in the nativist and haplogroup fetishist comedy, now with western and eastern Europeans hand in hand: Samara -> Khvalynsk -> Yamnaya -> Bell Beaker spoke Vasconic-Tyrsenian, because R1b. Wait for it…

Vanguard Yamnaya groups

On a serious note, interesting comment by Heyd in the article:

A striking example of this distinction is a discovery made near the town of Valencina de la Concepción in southern Spain. Archaeologists working there found a Yamnaya-like kurgan, below which was the body of a man buried with a dagger and Yamnaya-like sandals, and decorated with red pigment just as Yamnaya dead were. But the burial is 4875 years old and genetic information suggests Yamnaya-related people didn’t reach that far west until perhaps 4500 years ago. “Genetically, I’m pretty sure this burial has nothing to do with the Yamnaya or the Corded Ware,” says Heyd. “But culturally – identity-wise – there is an aspect that can be clearly linked with them.” It would appear that the ideology, lifestyle and death rituals of the Yamnaya could sometimes run far ahead of the migrants.

NOTE. I have been trying to find which kurgan this is, reviewing this text on the archaeological site, but didn’t find anything beyond occasional ochre and votive sandals, which are usual. Does any reader know which one it is?

Yamna expansion and succeeding East Bell Beaker expansion, without color on Bell Beaker territories. Notice vanguard Yamna groups in blue where East Bell Beakers later emerge. See original image with Bell Beaker territories.

Notice how, if you add all those vanguard Yamna findings of Central and Western Europe, including this one from southern Spain, you begin to get a good idea of the territories occupied by East Bell Beakers expanding later. More or less like vanguard Abashevo and Sintashta finds in the Zeravshan valley heralded the steppe-related Srubna-Andronovo expansions in Turan…

It doesn’t seem like Proto-Beaker and Yamna just “crossed paths” at some precise time around the Lower Danube, and Yamna men “switched cultures”. It seems that many Yamna vanguard groups, probably still in long-distance contact with Yamna settlers from the Carpathian Basin, were already settled in different European regions in the first half of the 3rd millennium BC, before the explosive expansion of East Bell Beakers ca. 2500 BC. As Heyd says, there are potentially many Yamna settlements along the Middle and Lower Danube and tributaries not yet found, connecting the Carpathian Basin to Western and Northern Europe.

These vanguard groups would have more easily transformed their weakened eastern Yamna connections with the fashionable Proto-Beaker package expanding from the west (and surrounding all of these loosely connected settlements), just like the Yamna materials from Seville probably represent a close cultural contact of Chalcolithic Iberia with a Yamna settlement (the closest known site with Yamna traits is near Alsace, where high Yamna ancestry is probably going to be found in a Bell Beaker R1b-L151 individual).

This does not mean that there wasn’t a secondary full-scale migration from the Carpathian Basin and nearby settlements, just like Corded Ware shows a secondary (A-horizon?) migration to the east with R1a-Z645. It just means that there was a complex picture of contacts between Yamna and European Chalcolithic groups before the expansion of Bell Beakers. Doesn’t seem genocidal enough for a popular movie, tho.


A very “Yamnaya-like” East Bell Beaker from France, probably R1b-L151


Interesting report by Bernard Sécher on Anthrogenica, about the Ph.D. thesis of Samantha Brunel from Institut Jacques Monod, Paris, Paléogénomique des dynamiques des populations humaines sur le territoire Français entre 7000 et 2000 (2018).

NOTE. You can visit Bernard Sécher’s blog on genetic genealogy.

A summary from user Jool, who was there, translated into English by Sécher (slight changes to translation, and emphasis mine):

They have a good hundred samples from the North, Alsace and the Mediterranean coast, from the Mesolithic to the Iron Age.

There is no major surprise compared to the rest of Europe. On the PCA plot, the Mesolithic are with the WHG, the early Neolithics with the first farmers close to the Anatolians. Then there is a small resurgence of hunter-gatherers that moves the Middle Neolithics a little closer to the WHGs.

From the Bronze Age, they have 5 samples with autosomal DNA, all in Bell Beaker archaeological context, which are very spread out on the PCA: one sample plots very high, close to the Yamnaya, a little above the Corded Ware; two samples fall right within the Central European Bell Beakers; one is fairly low, just above the Neolithic cluster; and the last one falls fully within that cluster. The most salient point was that the Y chromosomes of their 12 Bronze Age samples (all Bell Beakers) are all R1b, whereas there was no R1b in the Neolithic samples.

Finally, they have Iron Age samples that cluster on the PCA plot close to the Bronze Age samples. They could not determine if there is continuity with the Bronze Age, or a partial replacement by a genetically close population.

Image modified from Wang et al. (2018). Samples projected in PCA of 84 modern-day West Eurasian populations (open symbols). Previously known clusters have been marked and referenced. Marked and labelled are interesting samples. In red, likely position of late Yamna Hungary / early East Bell Beakers. EHG and Caucasus ‘clouds’ have been drawn, leaving Pontic-Caspian steppe and derived groups between them. See the original file here. To understand the drawn potential Caucasus Mesolithic cluster, see above the PCA from Lazaridis et al. (2018).

The sample with likely high “steppe ancestry”, clustering closely to Yamna (more than Corded Ware samples) is then probably an early East Bell Beaker individual, probably from Alsace, or maybe close to the Rhine Delta in the north, rather than from the south, since we already have samples from southern France from Olalde et al. (2018) with high Neolithic ancestry, and samples from the Rhine with elevated steppe ancestry, but not that much.

This specific sample, if confirmed as one of those reported as R1b (then likely R1b-L151), as it seems from the wording of the summary, is key because it would finally link Yamna to East Bell Beaker through Yamna Hungary, all of them very “Yamnaya-like”, and therefore R1b-L151 (hence also R1b-L51) directly to the steppe, and not only to the Carpathian Basin (that is, until we have samples from late Repin or West Yamna…)

NOTE. The only alternative explanation for such elevated steppe ancestry would be an admixture between a ‘less Yamnaya-like’ East Bell Beaker + a Central European Corded Ware sample like the Esperstedt outlier + drift, but I don’t think that alternative is the best explanation of its position in the PCA closer to Yamna in any of the infinite parallel universes, so… Also, the sample from Esperstedt is clearly a late outlier likely influenced by Yamna vanguard settlers from Hungary, not the other way round…

Unexpectedly, then, fully Yamnaya-like individuals are found not only in Yamna Hungary ca. 3000-2500 BC, but also among expanding East Bell Beakers later than 2500 BC. This leaves us with unexplained, not-at-all-Yamnaya-like early Corded Ware samples from ca. 2900 BC on. An explanation based on admixture with locals seems unlikely, seeing how Corded Ware peoples continue a north Pontic cluster, being thus different from Yamna and their ancestors since the Neolithic; and how they remained that way for a long time, up to Sintashta, Srubna, Andronovo, and even later samples… A different, non-Indo-European community it is, then.

Image modified from Olalde et al. (2018). PCA of 999 Eurasian individuals. Marked is the Esperstedt outlier with the approximate position of Yamna Hungary, probably the source of its admixture. Different Bell Beaker clines have been drawn, to represent approximate source of expansions from Central European sources into the different regions. In red, likely zone of Yamna Hungary and reported early East Bell Beaker individual from France.

Let’s wait and see the Ph.D. thesis, when it’s published, and keep observing in the meantime the absurd reactions of denial, anger, bargaining, and depression (stages of grief) among BBC/R1b=Vasconic and CWC/R1a=Indo-European fans, as if they had lost something (?). Maybe one of these reactions is actually the key to changing reality and going back to the 2000s, who knows…

Featured image: initial expansion of the East Bell Beaker Group, by Volker Heyd (2013).


Common pitfalls in human genomics and bioinformatics: ADMIXTURE, PCA, and the ‘Yamnaya’ ancestral component


Good timing for the publication of two interesting papers that a lot of people should read very carefully:


Open access A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots, by Daniel J. Lawson, Lucy van Dorp & Daniel Falush, Nature Communications (2018).

Interesting excerpts (emphasis mine):

Experienced researchers, particularly those interested in population structure and historical inference, typically present STRUCTURE results alongside other methods that make different modelling assumptions. These include TreeMix, ADMIXTUREGRAPH, fineSTRUCTURE, GLOBETROTTER, f3 and D statistics, amongst many others. These models can be used both to probe whether assumptions of the model are likely to hold and to validate specific features of the results. Each also comes with its own pitfalls and difficulties of interpretation. It is not obvious that any single approach represents a direct replacement as a data summary tool. Here we build more directly on the results of STRUCTURE/ADMIXTURE by developing a new approach, badMIXTURE, to examine which features of the data are poorly fit by the model. Rather than intending to replace more specific or sophisticated analyses, we hope to encourage their use by making the limitations of the initial analysis clearer.

The default interpretation protocol

Most researchers are cautious but literal in their interpretation of STRUCTURE and ADMIXTURE results, as caricatured in Fig. 1, as it is difficult to interpret the results at all without making several of these assumptions. Here we use simulated and real data to illustrate how following this protocol can lead to inference of false histories, and how badMIXTURE can be used to examine model fit and avoid common pitfalls.

A protocol for interpreting admixture estimates, based on the assumption that the model underlying the inference is correct. If these assumptions are not validated, there is substantial danger of over-interpretation. The “Core protocol” describes the assumptions that are made by the admixture model itself (Protocol 1, 3, 4), and inference for estimating K (Protocol 2). The “Algorithm input” protocol describes choices that can further bias results, while the “Interpretation” protocol describes assumptions that can be made in interpreting the output that are not directly supported by model inference


STRUCTURE and ADMIXTURE are popular because they give the user a broad-brush view of variation in genetic data, while allowing the possibility of zooming down on details about specific individuals or labelled groups. Unfortunately it is rarely the case that sampled data follows a simple history comprising a differentiation phase followed by a mixture phase, as assumed in an ADMIXTURE model and highlighted by case study 1. Naïve inferences based on this model (the Protocol of Fig. 1) can be misleading if sampling strategy or the inferred value of the number of populations K is inappropriate, or if recent bottlenecks or unobserved ancient structure appear in the data. It is therefore useful when interpreting the results obtained from real data to think of STRUCTURE and ADMIXTURE as algorithms that parsimoniously explain variation between individuals rather than as parametric models of divergence and admixture.

For example, if admixture events or genetic drift affect all members of the sample equally, then there is no variation between individuals for the model to explain. Non-African humans have a few percent Neanderthal ancestry, but this is invisible to STRUCTURE or ADMIXTURE since it does not result in differences in ancestry profiles between individuals. The same reasoning helps to explain why for most data sets—even in species such as humans where mixing is commonplace—each of the K populations is inferred by STRUCTURE/ADMIXTURE to have non-admixed representatives in the sample. If every individual in a group is in fact admixed, then (with some exceptions) the model simply shifts the allele frequencies of the inferred ancestral population to reflect the fraction of admixture that is shared by all individuals.

Several methods have been developed to estimate K, but for real data, the assumption that there is a true value is always incorrect; the question rather being whether the model is a good enough approximation to be practically useful. First, there may be close relatives in the sample which violates model assumptions. Second, there might be “isolation by distance”, meaning that there are no discrete populations at all. Third, population structure may be hierarchical, with subtle subdivisions nested within diverged groups. This kind of structure can be hard for the algorithms to detect and can lead to underestimation of K. Fourth, population structure may be fluid between historical epochs, with multiple events and structures leaving signals in the data. Many users examine the results of multiple K simultaneously but this makes interpretation more complex, especially because it makes it easier for users to find support for preconceptions about the data somewhere in the results.

In practice, the best that can be expected is that the algorithms choose the smallest number of ancestral populations that can explain the most salient variation in the data. Unless the demographic history of the sample is particularly simple, the value of K inferred according to any statistically sensible criterion is likely to be smaller than the number of distinct drift events that have practically impacted the sample. The algorithm uses variation in admixture proportions between individuals to approximately mimic the effect of more than K distinct drift events without estimating ancestral populations corresponding to each one. In other words, an admixture model is almost always “wrong” (Assumption 2 of the Core protocol, Fig. 1) and should not be interpreted without examining whether this lack of fit matters for a given question.

Three scenarios that give indistinguishable ADMIXTURE results. a Simplified schematic of each simulation scenario. b Inferred ADMIXTURE plots at K= 11. c CHROMOPAINTER inferred painting palettes.

Because STRUCTURE/ADMIXTURE accounts for the most salient variation, results are greatly affected by sample size in common with other methods. Specifically, groups that contain fewer samples or have undergone little population-specific drift of their own are likely to be fit as mixes of multiple drifted groups, rather than assigned to their own ancestral population. Indeed, if an ancient sample is put into a data set of modern individuals, the ancient sample is typically represented as an admixture of the modern populations (e.g., ref. 28,29), which can happen even if the individual sample is older than the split date of the modern populations and thus cannot be admixed.

This paper was already available as a preprint in bioRxiv (first published in 2016) and it is incredible that it needed to wait all this time to be published. I found it weird how reviewers focused on the “tone” of the paper. I think it is great to see files from the peer review process published, but we need to know who these reviewers were, to understand their whiny remarks… A lot of geneticists out there need to develop a thick skin, or else we are going to see more and more delays based on a perceived incorrect tone towards the field, which seems a rather subjective reason to force researchers to correct a paper.

PCA of SNP data

Open access Effective principal components analysis of SNP data, by Gauch, Qian, Piepho, Zhou, & Chen, bioRxiv (2018).

Interesting excerpts:

A potential hindrance to our advice to upgrade from PCA graphs to PCA biplots is that the SNPs are often so numerous that they would obscure the Items if both were graphed together. One way to reduce clutter, which is used in several figures in this article, is to present a biplot in two side-by-side panels, one for Items and one for SNPs. Another stratagem is to focus on a manageable subset of SNPs of particular interest and show only them in a biplot in order to avoid obscuring the Items. A later section on causal exploration by current methods mentions several procedures for identifying particularly relevant SNPs.

One of several data transformations is ordinarily applied to SNP data prior to PCA computations, such as centering by SNPs. These transformations make a huge difference in the appearance of PCA graphs or biplots. A SNPs-by-Items data matrix constitutes a two-way factorial design, so analysis of variance (ANOVA) recognizes three sources of variation: SNP main effects, Item main effects, and SNP-by-Item (S×I) interaction effects. Double-Centered PCA (DC-PCA) removes both main effects in order to focus on the remaining S×I interaction effects. The resulting PCs are called interaction principal components (IPCs), and are denoted by IPC1, IPC2, and so on. By way of preview, a later section on PCA variants argues that DC-PCA is best for SNP data. Surprisingly, our literature survey did not encounter even a single analysis identified as DC-PCA.
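To make the double-centering concrete, here is a minimal sketch in Python with NumPy. It is my own illustration, not the authors' code: the function name `dc_pca`, the toy matrix, and the choice of scaling both sets of scores by the singular values (one of several biplot conventions) are assumptions made for the example.

```python
import numpy as np

def dc_pca(X):
    """Double-centered PCA of a SNPs-by-Items matrix (rows = SNPs, columns = Items)."""
    X = np.asarray(X, dtype=float)
    grand_mean = X.mean()
    snp_means = X.mean(axis=1, keepdims=True)   # SNP main effects (row means)
    item_means = X.mean(axis=0, keepdims=True)  # Item main effects (column means)
    # Subtract both main effects; what is left is the SNP-by-Item interaction.
    R = X - snp_means - item_means + grand_mean
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    snp_scores = U * s       # SNP coordinates on IPC1, IPC2, ...
    item_scores = Vt.T * s   # Item coordinates on IPC1, IPC2, ...
    explained = s ** 2 / np.sum(s ** 2)  # share of interaction variation per IPC
    return R, snp_scores, item_scores, explained

# Hypothetical toy data: 5 SNPs (rows) scored 0/1 in 4 Items (columns).
X = [[0, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 0, 0, 1],
     [0, 1, 0, 0]]

R, snp_scores, item_scores, explained = dc_pca(X)
print(np.round(explained, 3))
```

Note that the percentages in `explained` refer only to the interaction variation left after both main effects are removed, which is one reason why percentages from different PCA variants are not directly comparable.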

The axes in PCA graphs or biplots are often scaled to obtain a convenient shape, but actually the axes should have the same scale for many reasons emphasized recently by Malik and Piepho [3]. However, our literature survey found a correct ratio of 1 in only 10% of the articles, a slightly faulty ratio of the larger scale over the shorter scale within 1.1 in 12%, and a substantially faulty ratio above 2 in 16% with the worst cases being ratios of 31 and 44. Especially when the scale along one PCA axis is stretched by a factor of 2 or more relative to the other axis, the relationships among various points or clusters of points are distorted and easily misinterpreted. Also, 7% of the articles failed to show the scale on one or both PCA axes, which leaves readers with an impressionistic graph that cannot be reproduced without effort. The contemporary literature on PCA of SNP data mostly violates the prohibition against stretching axes.
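The stretching ratio described above is easy to check for any published figure. The following is a small sketch under my own assumptions: the function names and the example numbers are hypothetical, and "scale" is taken to mean data units per unit of plotted length on each axis.

```python
def stretch_ratio(x_range, y_range, width, height):
    """Ratio of the larger axis scale over the smaller one (1.0 means no stretching)."""
    x_scale = x_range / width    # data units per unit of plotted length, horizontal axis
    y_scale = y_range / height   # data units per unit of plotted length, vertical axis
    return max(x_scale, y_scale) / min(x_scale, y_scale)

def equal_scale_height(x_range, y_range, width):
    """Plot height that gives both axes the same scale for a given plot width."""
    return width * y_range / x_range

# Hypothetical figure: PC1 spans 0.10 units, PC2 spans 0.04, drawn in a 6 x 6 square.
ratio = stretch_ratio(0.10, 0.04, 6.0, 6.0)
fixed_height = equal_scale_height(0.10, 0.04, 6.0)
print(round(ratio, 2), round(fixed_height, 2))
```

In this made-up case the square layout stretches the second axis by a factor of 2.5, and the same plot drawn 6 wide by 2.4 high would have a ratio of 1. In matplotlib, calling `ax.set_aspect("equal")` on the axes enforces the same thing.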

DC-PCA biplot for oat data. The gradient in the CA-arranged matrix in Fig 13 is shown here for both lines and SNPs by the color scheme red, pink, black, light green, dark green.

The percentage of variation captured by each PC is often included in the axis labels of PCA graphs or biplots. In general this information is worth including, but there are two qualifications. First, these percentages need to be interpreted relative to the size of the data matrix because large datasets can capture a small percentage and yet still be effective. For example, for a large dataset with over 107,000 SNPs for over 6,000 persons, the first two components capture only 0.3693% and 0.117% of the variation, and yet the PCA graph shows clear structure (Fig 1A in [4]). Contrariwise, a PCA graph could capture a large percentage of the total variation, even 50% or more, but that would not guarantee that it will show evident structure in the data. Second, the interpretation of these percentages depends on exactly how the PCA analysis was conducted, as explained in a later section on PCA variants. Readers cannot meaningfully interpret the percentages of variation captured by PCA axes when authors fail to communicate which variant of PCA was used.


Five simple recommendations for effective PCA analysis of SNP data emerge from this investigation.

  1. Use the SNP coding 1 for the rare or minor allele and 0 for the common or major allele.
  2. Use DC-PCA; for any other PCA variant, examine its augmented ANOVA table.
  3. Report which SNP coding and PCA variant were selected, as required by contemporary standards in science for transparency and reproducibility, so that readers can interpret PCA results properly and reproduce PCA analyses reliably.
  4. Produce PCA biplots of both Items and SNPs, rather than merely PCA graphs of only Items, in order to display the joint structure of Items and SNPs and thereby to facilitate causal explanations. Be aware of the arch distortion when interpreting PCA graphs or biplots.
  5. Produce PCA biplots and graphs that have the same scale on every axis.
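As an illustration of the first recommendation, this is a minimal sketch of recoding a 0/1 matrix so that 1 always marks the minor allele. The function name and the toy rows are hypothetical, and it assumes one allele per Item and SNP (not diploid 0/1/2 genotype counts); SNPs at exactly 50% are left unchanged.

```python
def code_minor_as_one(snp_rows):
    """Recode each SNP (row) so that 1 is the rarer allele among the Items."""
    recoded = []
    for row in snp_rows:
        freq_of_one = sum(row) / len(row)
        if freq_of_one > 0.5:
            # 1 currently marks the common allele: flip the coding for this SNP.
            recoded.append([1 - allele for allele in row])
        else:
            recoded.append(list(row))
    return recoded

# Hypothetical toy data: 3 SNPs scored in 4 Items.
raw = [[1, 1, 1, 0],
       [0, 1, 0, 0],
       [1, 0, 1, 0]]
recoded = code_minor_as_one(raw)
print(recoded)
```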

I read the referenced paper Biplots: Do Not Stretch Them!, by Malik and Piepho (2018), and even though it is not directly applicable to the most commonly available PCA graphs out there, it is a good reminder of the distorting effects of stretching. See, for example, the quite recent Krause-Kyora et al. (2018), where you can see Corded Ware and BBC samples from Central Europe clustering with samples from Yamna:

NOTE. This is related to a vertical distortion (i.e. horizontal stretching), but possibly also to the addition of some distant outlier sample/s.

Principal Component Analysis (PCA) of the human Karsdorf and Sorsum samples together with previously published ancient populations projected on 27 modern day West Eurasian populations (not shown) based on a set of 1.23 million SNPs (Mathieson et al., 2015).

The so-called ‘Yamnaya’ ancestry

Every time I read papers like these, I remember commenters who kept swearing that genetics was the ultimate science that would solve anthropological problems, where unscientific archaeology and linguistics could not. Well, it seems that, like radiocarbon analysis, these promising developing methods still need a lot of refinement to achieve something meaningful, and that they mean nothing without traditional linguistics and archaeology… But we already knew that.

Also, if this is happening in most peer-reviewed publications, made by professional geneticists, in journals of high impact factor, you can only wonder how many more errors and misinterpretations can be found in the obscure market of so many amateur geneticists out there. Because amateur geneticist is a commonly used misnomer for people who are not geneticists (since they don’t have the most basic education in genetics), and some of them are not even ‘amateurs’ (because they are selling the outputs of bioinformatic tools)… It’s like calling healers ‘amateur doctors’.

NOTE. While everyone involved in population genetics is interested in knowing the truth, and we all have our confirmation (and other kinds of) biases, for those who get paid to tell people what they want to hear, and who have sold lots of wrong interpretations already, the incentives of ‘being right’ – and thus getting involved in crooked and paranoid behaviour regarding different interpretations – are as strong as the money they can win or lose by promoting themselves and selling more ‘product’.

As a reminder of how badly these wrong interpretations of genetic results – and the influence of the so-called ‘amateurs’ – can reflect on research groups, here is yet another turn of the screw by the Copenhagen group, in the oral presentations at Languages and migrations in pre-historic Europe (7-12 Aug 2018), organized by the University of Copenhagen. The common theme seems to be that Bell Beaker and thus R1b-L23 subclades now do represent a direct expansion from Yamna, as opposed to being derived from Corded Ware migrants, as they supported before.

NOTE. Yes, the “Yamna → Corded Ware → Únětice / Bell Beaker” migration model is still commonplace in the Copenhagen workgroup. Yes, in 2018. Guus Kroonen had already admitted they were wrong, and it was already changed in the graphic representation accompanying a recent interview with Willerslev. However, since there is still no official retraction by anyone, it seems that each member has to reject the previous model in their own way, and at their own pace. I don’t think we can expect anyone at this point to accept responsibility for their wrong statements.

So their lead archaeologist, Kristian Kristiansen, in The Indo-Europeanization of Europé (sic):

Kristiansen’s (2018) map of Indo-European migrations

I love the newly invented arrows of migration from Yamna to the north to distinguish among dialects attributed by them to CWC groups, and the intensive use of materials from Heyd’s publications in the presentation, which means they understand he was right – except for the fact that they are used to support a completely different theory, radically opposed to the one defended in Heyd’s model.

Now added to the Copenhagen’s unending proposals of language expansions, some pearls from the oral presentation:

  • Corded Ware north of the Carpathians of R1a lineages developed Germanic;
  • R1b borugh [?] Italo-Celtic;
  • the increase in steppe ancestry in north European Bell Beakers means that they “were a continuation of the Yamnaya/Corded Ware expansion”;
  • “Corded Ware groups [] stopped their expansion and took over the Bell Beaker package before migrating to England” [yep, it literally says that];
  • Italo-Celtic expanded to the UK and Iberia with Bell Beakers [I guess that included Lusitanian in Iberia, but not Messapian in Italy; or the opposite; or nothing like that, who knows];
  • 2nd millennium BC Bronze Age Atlantic trade systems expanded Proto-Celtic [yep, trade systems expanded the language];
  • 1st millennium BC expanded Gaulish with La Tène, including a “Gaulish version of Celtic to Ireland/UK” [hmmm, dat British Gaulish indeed].

You know, because, why the hell not? A logical, stable, consequential, no-nonsense approach to Indo-European migrations, as always.

Also, compare still more invented arrows of migrations, from Mikkel Nørtoft’s Introducing the Homeland Timeline Map, going against Kristiansen’s multiple arrows, and even against their own recent fantasy map series in showing Bell Beakers stem from Yamna instead of CWC (or not, you never truly know what arrows actually mean):

Nørtoft’s (2018) maps of Indo-European migrations.

I really, really loved that perennial arrow of migration from Volosovo, ca. 4000-800 BC (3000+ years, no less!), representing Uralic?, like that, without specifics – which is like saying, “somebody from the eastern forest zone, somehow, at some time, expanded something that was not Indo-European to Finland, and we couldn’t care less, except for the fact that they were certainly not R1a“.

This and Kristiansen’s arrows are the most comical invented migration routes of 2018; and that is saying something, given the dozens of similar maps that people publish in forums and blogs each week.

NOTE. You can read a more reasonable account of how haplogroup R1b-L51 and R1a-Z645 subclades expanded, and which dialects most likely expanded with them.

We don’t know where these scholars of the Danish workgroup stand at this moment, or if they ever had (or intended to have) a common position – beyond their persistent ideas of Yamnaya™ ancestral component = Indo-European and R1a must be Indo-European – because each new publication changes some essential aspects without expressly stating so, and thus makes everything messier still.

It’s hard to accept that this is a series of presentations made by professional linguists, archaeologists, and geneticists, as stated by the official website, and still harder to imagine that they collaborate within the same professional workgroup, which includes experienced geneticists and academics.

I propose the following video to close future presentations introducing innovative ideas like those above, to help the audience find the appropriate mood:


East Bell Beakers, an in situ admixture of Yamna settlers and GAC-like groups in Hungary


I wanted to repeat what I said last week in two different posts (see on the new Caucasus and Yamna Hungary samples, and on local groups in contact with Yamna settlers).

We already knew that expanding East Bell Beakers had received influence from a population similar to the available Globular Amphorae culture samples.

  1. Without Yamna settlers, but with Yamna Ukraine and East Bell Beaker samples, including an admixed Yamna Bulgaria sample (from Olalde & Mathieson 2017, and then with their Nature 2018 papers), the most likely interpretation was that Yamna settlers had received GAC ancestry probably during their migration through the Balkans, before turning into East Bell Beakers. However, some comments still supported that Corded Ware migrants were the ones behind the formation of East Bell Beakers. I couldn’t understand it.
  2. Now we have (with Wang et al. 2018) Yamna settlers (identical to other Yamna groups and Afanasevo migrants) and GAC-like peoples coexisting with them in Hungary, with a Late Chalcolithic Yamna sample from Hungary showing a greater contribution from a GAC/Iberian_N-like source. However, I still read discussions on Yamna settlers receiving GAC admixture from Corded Ware in Eastern Europe, from GAC in the Dnieper-Dniester area, in Budzhak/Usatovo, etc. I can’t understand this, either.
  3. I will post here the data we have, with the simplest maps and images showing the simplest possible model. No more long paragraphs.

    NOTE. All this data does not mean that this model is certain, especially because we don’t have direct access to the samples. But it is the simplest and most likely one. Sometimes 2+2=4. Even if it turns out later to be false.

    EDIT (30 MAY 2018): In fact, as I commented in the first post about these samples, there is a Yamna LCA/EBA sample probably from Late Yamna (in the North Pontic steppe, west of the Catacomb culture), which shows GAC-like contribution. However, this admixture is smaller than that of the Hungary LCA/EBA1 sample, and both Yamna groups (Hungary and steppe) were probably already more sedentary, which also supports different contributions from nearby local GAC-like groups to each region, rather than maintained long-range internal genetic contributions from a single source near the steppe…

    Yamna migrants ca. 3300-2600. Most likely site of admixture with GAC circled in red.
    Yamna – Bell Beaker migration according to Heyd (2007, 2012). Most likely site of admixture with GAC is marked by the evolution of Blue to Red color.
    PCA results. Samples from Yamna Hungary are surrounded by red circles, GAC-like Hungarian groups surrounded by light brown (see below for ADMIXTURE data). Notice the most likely Yamna Hungary sample with GAC admixture clustering closely to CWC Esperstedt outlier, and thus to some East Bell Beaker samples. (d) shows these projected onto a PCA of 84 modern-day West Eurasian populations (open symbols).
    Modified image, with red rectangles surrounding (unreleased) Hungarian samples from Yamna and GAC-like groups. (c) ADMIXTURE results of relevant prehistoric individuals mentioned in the text (filled symbols)
    Modified image, with red rectangles surrounding (unreleased) Yamna samples. Notice greater GAC contribution to late Yamna Hungary sample. Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional Anatolian farmer-related ancestry in Steppe groups as well as additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups.
    Modified table from Wang et al. (2018) Supplementary materials (in bold, Yamna and related samples; in red, newly reported samples). Notice greater GAC contribution to late Yamna Hungary sample. “Supplementary Table 18. P values of rank=1 and admixture coefficients of modelling the Steppe ancestry populations as a two-way admixture of the Eneolithic_steppe and Globular_Amphora using 14 outgroups. Left populations: Steppe cluster, Eneolithic_steppe, Globular Amphora Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.”

    The CWC outlier from Esperstedt

    I already said that my initial interpretation of the Esperstedt outlier, dated ca. 2430 BC, as due to a late contribution directly from the steppe (i.e. from long-range contacts between late Corded Ware groups from Europe and late groups from the steppe) was probably wrong, seeing how (in Olalde et al. 2017) early East Bell Beaker samples from Hungary and Central Europe clustered closely to this individual.

    Now we see that fully ‘Yamnaya-like’ Yamna settlers lived in Hungary probably for two or three centuries ca. 2900-2600 BC, and the absorption of known (or unknown) Yamna vanguard groups found up to Saxony-Anhalt before 2600 BC would be enough to justify the genomic findings of this individual.

    An outlier it is, then. But probably from admixture with nearby Yamna-like people.

    Image modified by me, from Olalde et al. (2017). PCA of 999 Eurasian individuals. Marked is the Esperstedt outlier.


Immigration and transhumance in the Early Bronze Age Carpathian Basin

Interesting excerpts about local Hungarian groups that had close contacts with Yamna settlers in the Carpathian Basin, from the paper Immigration and transhumance in the Early Bronze Age Carpathian Basin: the occupants of a kurgan, by Gerling, Bánffy, Dani, Köhler, Kulcsár, Pike, Szeverényi & Heyd, Antiquity (2012) 86(334):1097-1111.

The most interesting of the local people is the occupant of grave 12, which is the earliest grave in the kurgan and the main statistical range of its radiocarbon date clearly predates the arrival of the western Yamnaya groups c. 3000 BC. This is also confirmed by the burial rite, which is not typical for the Yamnaya (Dani 2011: 29–33; Heyd in press), although some heterogeneity may apply in Yamnaya communities too. The migrant group, graves nos. 4, 7, 9 and 11, all occupy late stratigraphic positions in the mound, and have radiocarbon dates in the second quarter of the third millennium BC. It is also noteworthy that they are all adult or mature men. The contextual data, their physical distribution over the space of the whole kurgan, and the variety of burial practices, indicate several generations of burials. The cultural attributes of this group are summarised in Figure 5. Overall, their closest match lies in the Livezile group from the eastern and southern Apuseni Mountains, which is also the likely place of origin of the buried persons.

Cultural geography of the Carpathian Basin in the first half of the third millennium BC (in black: archaeological cultures and groups dating roughly to the first quarter; in red: those dating to the second quarter). Indicated also are regions and sites mentioned in the text.

The key question is, what cultural process could be responsible for attracting these men from their homeland to the Great Hungarian Plain, over several generations? Their sex and age uniformity indicate they are a social sub-set within a larger group, implying that only a portion of their society was on the move. Exogamy can probably be excluded, since one would expect more women than men to move in prehistoric times; not to mention the distance of more than 200km between the places of potential origin and burial.

One hypothesis would see these men involved in the exchange of goods, with long-term relations between the mountain and steppe communities. Normally living in, or next to, the Apuseni, these men would journey for weeks into the plain, returning to the same places and people over many decades. Ethnographic examples of such travels to exchange objects and ideas, and perhaps people, are numerous (e.g. Helms 1988). However, the child’s (grave 7a) local isotopic signature would remain unexplained, and one has to wonder for how many generations an exchange continues for four men to die near the Őrhalom.

A second hypothesis is essentially an economic model of transhumance, with livestock passing the winter and spring in the milder regions of the Great Hungarian Plain, and returning to higher pastures in the warmer months (Arnold & Greenfield 2006). Such systems can endure for centuries, provided the social relations underpinning them are stable. This has the advantage of accounting for relatively long periods of time spent away from home, as herdsmen guarded their animals, and perhaps some women and their children came too, which would account for the child’s presence, and the pottery relations of the Livezile group. Furthermore, regular visits to a region would increase the likelihood of Livezile transhumant herders becoming integrated locally. The second quarter of the third millennium BC was a period when Yamnaya ideology, and thus its internal coherence, might have already diminished. This would likely have resulted in a weakened grip by Yamnaya people on pastures and territory, consequently allowing Livezile herders, and potentially others, to step in and take over locally, perhaps first on a seasonal basis and then permanently.

On West Yamna settlers in Hungary

Modified table from Wang et al. (2018) Supplementary materials (in bold, Yamna and related samples; in red, newly reported samples). “Supplementary Table 18. P values of rank=1 and admixture coefficients of modelling the Steppe ancestry populations as a two-way admixture of the Eneolithic_steppe and Globular_Amphora using 14 outgroups. Left populations: Steppe cluster, Eneolithic_steppe, Globular Amphora Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.”

By disclosing very interesting information on (yet unpublished) Yamna samples from Hungary, the latest preprint from the Reich Lab has rendered irrelevant – in a rather surprising turn of events – (what I expected would be) future discussions on West Yamna settlers potentially sharing a similar ancestry with Baltic Late Neolithic / Corded Ware settlers (see here for more details).

Interesting excerpts regarding the tight cluster formed by all Yamna samples:

Individuals from the North Caucasian steppe associated with the Yamnaya cultural formation (5300-4400 BP, 3300-2400 calBCE) appear genetically almost identical to previously reported Yamnaya individuals from Kalmykia [20] immediately to the north, the middle Volga region [19, 27], Ukraine and Hungary, and to other Bronze Age individuals from the Eurasian steppes who share the characteristic ‘steppe ancestry’ profile as a mixture of EHG and CHG/Iranian ancestry [23, 28]. These individuals form a tight cluster in PCA space (Figure 2) and can be shown formally to be a mixture by significantly negative admixture f3-statistics of the form f3(EHG, CHG; target) (Supplementary Fig. 3).
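For readers unfamiliar with the statistic, here is a minimal sketch of why a negative f3(EHG, CHG; target) points to admixture. It is a toy illustration under my own assumptions: the allele frequencies are invented, the function name `f3` is mine, and this naive version omits the finite-sample correction and the block-jackknife standard errors that real analyses use to decide whether a value is significantly negative.

```python
def f3(target, source_a, source_b):
    """Naive f3(source_a, source_b; target): mean of (t - a) * (t - b) over SNPs."""
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("all populations need frequencies for the same SNPs")
    products = [(t - a) * (t - b) for t, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)

# Invented allele frequencies at four SNPs for two differentiated source populations.
ehg_like = [0.1, 0.9, 0.2, 0.8]
chg_like = [0.9, 0.1, 0.8, 0.2]

# A target that is a 50/50 mixture sits between the sources at every SNP...
mixed = [(a + b) / 2 for a, b in zip(ehg_like, chg_like)]
# ...while an unadmixed target that simply drifted away from one source does not.
drifted = [0.05, 0.95, 0.1, 0.9]

f3_mixed = f3(mixed, ehg_like, chg_like)
f3_drifted = f3(drifted, ehg_like, chg_like)
print(f3_mixed < 0, f3_drifted > 0)
```

The product (t - a)(t - b) can only be negative when the target frequency lies between the two sources, which is what mixture produces and drift alone does not.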

Using qpAdm with Globular Amphora as a proximate surrogate population (assuming that a related group was the source of the Anatolian farmer-related ancestry), we estimated the contribution of Anatolian farmer-related ancestry into Yamnaya and other steppe groups. We find that Yamnaya individuals from the Volga region (Yamnaya Samara) have 13.2±2.7% and Yamnaya individuals in Hungary 17.1±4.1% Anatolian farmer-related ancestry (Fig. 4; Supplementary Table 18) – statistically indistinguishable proportions.
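A quick back-of-the-envelope check of the "statistically indistinguishable" remark, as a sketch only. It assumes that the ± values are standard errors and that the two estimates are independent, which is a simplification, since both come from models sharing the same sources and outgroups; the function name is hypothetical.

```python
import math

def z_difference(est_a, se_a, est_b, se_b):
    """Z-score for the difference of two estimates, assuming independent errors."""
    return (est_b - est_a) / math.sqrt(se_a ** 2 + se_b ** 2)

# Anatolian farmer-related ancestry (%) as quoted: Yamnaya Samara vs. Yamnaya Hungary.
z = z_difference(13.2, 2.7, 17.1, 4.1)
print(round(z, 2))  # 0.79
```

A difference of 3.9 percentage points against a combined standard error of about 4.9 gives a Z-score of roughly 0.8, well below the usual thresholds of 2 or 3.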

Yamna – Bell Beaker migration according to Heyd (2007, 2012)

Before this paper, we had the most solid anthropological models backed by Y-DNA against conflicting data from certain statistical tools applied to a few samples (which some used to contradict what was mainstream in Academia).

NOTE. I have discussed this extensively in this blog, and more than once. See for example my posts on R1a speaking IE (July 2017), on the Eneolithic Ukraine sample (September 2017), or on the “Yamnaya ancestral component” (November 2017).

Today, we have everything – including statistical tools – showing a genetically homogeneous, Late PIE-speaking late Khvalynsk/Yamna community expanding into its known branches, confirming what was described using traditional anthropological disciplines:

  • Late Khvalynsk expanding into Afanasevo ca. 3300-3000 BC with an archaic Late PIE dialect, which was attested much later as Tocharian;
  • East Yamna/Poltavka admixing with Uralic-speaking Abashevo migrants probably ca. 2600-2100 BC to form Proto-Indo-Iranian-speaking Sintashta-Petrovka and Potapovka;
  • and now also Yamna settlers: those in Hungary admixing (probably ca. 2800-2500 BC) with the local population to form North-West Indo-European-speaking East Bell Beakers; those from the Balkans forming other IE-speaking Balkan cultures, including the peoples that admixed in Greece, as seen in Mycenaeans.

If Volker Heyd is right with this and other papers – and he has been right until now in his predictions regarding Yamna, Bell Beaker, and Corded Ware cultures – the change in ancestry will probably begin to be noticed in Yamna samples from Hungary and the Lower Danube during the second quarter of the 3rd millennium, a period defined by the addition of a more fashionable western Proto-Bell Beaker package to the fading traditional Yamna cultural package.

EDIT (19 MAY 2018): I corrected some sentences and added interesting information.