“Steppe ancestry” step by step: Khvalynsk, Sredni Stog, Repin, Yamna, Corded Ware


Wang et al. (2018) is obviously a game changer in many aspects. I have already written about the upcoming Yamna Hungary samples, about the new Steppe_Eneolithic and Caucasus Eneolithic keystones, and about the upcoming Greece Neolithic samples with steppe ancestry.

An interesting aspect of the paper, hidden among so many relevant details, is a clearer picture of how the so-called Yamnaya or steppe ancestry evolved from Samara hunter-gatherers to Yamna nomadic pastoralists, and how this ancestry appeared among Proto-Corded Ware populations.

Image modified from Wang et al. (2018). Marked in orange: equivalent Steppe_Maykop ADMIXTURE; in red: approximate limit of Anatolia_Neolithic ancestry found in Yamna populations; in blue: Corded Ware-related groups. “Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional Anatolian farmer-related ancestry in Steppe groups as well as additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups.”

Please note: arrows of “ancestry movement” in the following PCAs do not necessarily represent physical population movements, or even ethnolinguistic change. To avoid misinterpretations, I have depicted arrows with Y-DNA haplogroup migrations to represent the most likely true ethnolinguistic movements. Admixture graphics shown are from Wang et al. (2018), and also (the K12) from Mathieson et al. (2018).

1. Samara to Early Khvalynsk

The so-called steppe ancestry was born during the Khvalynsk expansion through the steppes, probably through exogamy of expanding elite clans (eventually all R1b-M269 lineages) originally of Samara_HG ancestry. The nearest group to the ANE-like ghost population with which Samara hunter-gatherers admixed is represented by the Steppe_Eneolithic / Steppe_Maykop cluster (from the Northern Caucasus Piedmont).

Steppe_Eneolithic samples, of R1b1 lineages, are probably expanded Khvalynsk peoples, thus showing a proximate ancestry of an Early Eneolithic ghost population of the Northern Caucasus. Steppe_Maykop samples represent a later replacement of this Steppe_Eneolithic population – and/or a similar population with further contribution of ANE-like ancestry – in the area some 1,000 years later.


This is what Steppe_Maykop looks like, different from Steppe_Eneolithic:


NOTE. This admixture shows how different Steppe_Maykop is from Steppe_Eneolithic, but in the different supervised ADMIXTURE graphics below Maykop_Eneolithic is roughly equivalent to Eneolithic_Steppe (see orange arrow in ADMIXTURE graphic above). This is useful for a simplified analysis, but actual differences between Khvalynsk, Sredni Stog, Afanasevo, Yamna and Corded Ware are probably underestimated in the analyses below, and will become clearer in the future when more ancestral hunter-gatherer populations are added to the analysis.

2. Early Khvalynsk expansion

We have direct data of Khvalynsk-Novodanilovka-like populations thanks to Khvalynsk and Steppe_Eneolithic samples (although I’ve used the latter above to represent the ghost Caucasus population with which Samara_HG admixed).

We also have indirect data. First, there is the PCA with outliers:


Second, we have data from north Pontic Ukraine_Eneolithic samples (see next section).

Third, there is the continuity of late Repin / Afanasevo with Steppe_Eneolithic (see below).

3. Proto-Corded Ware expansion

It is unclear if R1a-M459 subclades were continuously in the steppe and resurged after the Khvalynsk expansion, or (the most likely option) they came from the forested region of the Upper Dnieper area, possibly from previous expansions there with hunter-gatherer pottery.

Supporting the latter is the millennia-long continuity of R1b-V88 and I2a2 subclades in the north Pontic Mesolithic, Neolithic, and Early Eneolithic Sredni Stog culture, until ca. 4500 BC (and even later, during the second half).

Only at the end of the Early Eneolithic with the disappearance of Novodanilovka (and beginning of the steppe ‘hiatus’ of Rassamakin) is R1a to be found in Ukraine again (after disappearing from the record some 2,000 years earlier), related to complex population movements in the north Pontic area.

NOTE. In the PCA, a tentative position of Novodanilovka closer to Anatolia_Neolithic / Dzudzuana ancestry is selected, based on the apparent cline formed by Ukraine_Eneolithic samples, and on the position and ancestry of Sredni Stog, Yamna, and Corded Ware later. A good alternative would be to place Novodanilovka still closer to the Balkan outliers (i.e. Suvorovo), and a source closer to EHG as the ancestry driven by the migration of R1a-M417.


The first sample with steppe ancestry appears only after 4250 BC in the forest-steppe, centuries after the samples with steppe ancestry from the Northern Caucasus and the Balkans, which points to exogamy of expanding R1a-M417 lineages with the remnants of the Novodanilovka population.


4. Repin / Early Yamna expansion

We don’t have direct data on early Repin settlers. But we do have a very close representative: Afanasevo, a population we know comes directly from the Repin/late Khvalynsk expansion ca. 3500/3300 BC (just before the emergence of Early Yamna), and which shows fully Steppe_Eneolithic-like ancestry.


Compared to this eastern Repin expansion that produced Afanasevo, the late Repin expansion to the west ca. 3300 BC that gave rise to the Yamna culture was one of colonization, evidenced by the admixture with north Pontic (Sredni Stog-like) populations, no doubt through exogamy:


This admixture is also found (in lesser proportion) in east Yamna groups, which supports the high mobility and exogamy practices among western and eastern Yamna clans, not only with locals:


5. Corded Ware

Corded Ware represents a quite homogeneous expansion of a late Sredni Stog population, compatible with the traditional location of Proto-Corded Ware peoples in the steppe-forest/forest zone of the Dnieper-Dniester region.


We don’t have a comparison with Ukraine_Eneolithic or Corded Ware samples in Wang et al. (2018), but we do have proximate sources for Abashevo, when compared to the Poltavka population (with which it admixed in the Volga-Ural steppes): Sintashta, Potapovka, Srubna (with further Abashevo contribution), and Andronovo:


The two CWC outliers from the Baltic show what I thought was an admixture with Yamna. However, given the previous mixture of Eneolithic_Steppe in north Pontic steppe-forest populations, this elevated “steppe ancestry” found in Baltic_LN (similar to west Yamna) seems rather an admixture of Baltic sub-Neolithic peoples with a north Pontic Eneolithic_Steppe-like population. Late Repin settlers also admixed with a similar population during their colonization of the north Pontic area, hence the Baltic_LN – west Yamna similarities.

NOTE. A direct admixture with west Yamna populations through exogamy by the ancestors of this Baltic population cannot be ruled out yet (without direct access to more samples), though, because of the contacts of Corded Ware with west Yamna settlers in the forest-steppe regions.


A similar case is found in the Yamna outlier from Mednikarovo south of the Danube. It would be absurd to think that Yamna from the Balkans comes from Corded Ware (or vice versa), just because the former is closer in the PCA to the latter than other Yamna samples. The same error is also found e.g. in the Corded Ware → Bell Beaker theory, because of their proximity in the PCA and their shared “steppe ancestry”. All those theories have already been proven wrong.

NOTE. A similar fallacy is found in potential Sintashta → Mycenaean connections, where we should distinguish statistically that result from an East/West Yamna + Balkans_BA admixture. In fact, genetic links of Mycenaeans with west Yamna settlers prove this (there are some related analyses on Anthrogenica, but the site is down at this moment). To try to relate these two populations (separated more than 1,000 years before Sintashta) is like comparing ancient populations to modern ones, without the intermediate samples to trace the real anthropological trail of what is found… Pure numbers and wishful thinking.


Yamna and Corded Ware show a similar “steppe ancestry” due to convergence. I have said so many times (see e.g. here). This was clear long ago, just by looking at the Y-chromosome bottlenecks that differentiate them – and Tomenable noticed this difference in ADMIXTURE from the supplementary materials in Mathieson et al. (2017), well before Wang et al. (2018).

This different stock stems from (1) completely different ancestral populations + (2) different, long-lasting Y-chromosome bottlenecks. Their similarities come from the two neighbouring cultures admixing with similar populations.
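The convergence argument can be illustrated with a toy calculation. The sketch below assumes, as a simplification, that an admixed population plots roughly at the weighted average of its sources in PCA space. All coordinates, proportions, and population names (`source_A`, `source_B`, `shared_C`) are hypothetical values chosen for illustration; none of them come from Wang et al. (2018) or any other dataset.

```python
import numpy as np

# Hypothetical 2-D "PCA" coordinates for three source populations.
# These numbers are invented for illustration only.
SOURCES = {
    "source_A": np.array([0.0, 0.0]),  # ancestral stock of population X
    "source_B": np.array([1.0, 0.0]),  # unrelated ancestral stock of population Y
    "shared_C": np.array([0.5, 1.0]),  # population both X and Y admixed with
}


def admix(proportions):
    """Return the expected position of an admixed population as the
    weighted average of its sources (a simplifying assumption)."""
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("admixture proportions must sum to 1")
    return sum(w * SOURCES[name] for name, w in proportions.items())


# Two populations with different origins, each taking 60% from the shared source.
pop_x = admix({"source_A": 0.4, "shared_C": 0.6})
pop_y = admix({"source_B": 0.4, "shared_C": 0.6})

dist_sources = np.linalg.norm(SOURCES["source_A"] - SOURCES["source_B"])
dist_admixed = np.linalg.norm(pop_x - pop_y)

print(f"distance between original stocks:    {dist_sources:.2f}")
print(f"distance between admixed populations: {dist_admixed:.2f}")
```

In this toy setup the two admixed populations end up much closer to each other than their original stocks were, although neither descends from the other. Proximity in a PCA plot is therefore compatible with convergence, and it takes other evidence (such as Y-chromosome bottlenecks) to tell the two scenarios apart.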

If all this does not mean anything, and each lab was going to support some pre-selected archaeological theories from the 1960s or the 1980s, coupled with outdated linguistic models no matter what (Anthony’s model + Ringe’s glottochronological tree of the early 2000s in the Reich Lab; and worse, Kristiansen’s CWC-IE + Germano-Slavonic models of the 1940s in the Copenhagen group), I have to repeat my question again:

What’s (so much published) ancient DNA useful for, exactly?


Yamna/Afanasevo elite males dominated by R1b-L23, Okunevo brings ancient Siberian/Asian population


Open access paper New genetic evidence of affinities and discontinuities between bronze age Siberian populations, by Hollard et al., Am J Phys Anthropol. (2018) 00:1–11.

NOTE. This seems to be a peer-reviewed paper based on a more precise re-examination of the samples from Hollard’s PhD thesis, Peuplement du sud de la Sibérie et de l’Altaï à l’âge du Bronze : apport de la paléogénétique (2014).

Interesting excerpts:

Afanasevo and Yamna

The Afanasievo culture is the earliest known archaeological culture of southern Siberia, occupying the Minusinsk-Altai region during the Eneolithic era 3600/3300 BC to 2500 BC (Svyatko et al., 2009; Vadetskaya et al., 2014). Archeological data showed that the Afanasievo culture had strong affinities with the Yamnaya and pre-Yamnaya Eneolithic cultures in the West (Grushin et al., 2009). This suggests a Yamnaya migration into western Altai and into Afanasievo. Note that, in most current publications, “the Yamnaya culture” combines the so-called “classical Yamnaya culture” of the Early Bronze Age and archeological sites of the preceding Repin culture in the middle reaches of the Don and Volga rivers. In the present article we conventionally use the term Yamnaya in the same sense, in which case the beginning of the “Yamnaya culture” can be dated after the middle of the 4th millennium BC, when the Afanasievo culture appeared in the Altai.

Because of numerous traits attributed to early Indo-Europeans and cultural relations with Kurgan steppe cultures, members of the Afanasievo culture are believed to have been Indo-European speakers (Mallory and Mair, 2000). In a recent whole-genome sequencing study, Allentoft et al. (2015) concluded that Eastern Yamnaya individuals and Afanasievo individuals were genetically indistinguishable. Moreover, this study and one published concurrently by Haak et al. (2015) analyzed 11 Eastern Yamnaya males and showed that all of them belonged to the R1b1a1a (formerly R1b1a) (…)

Early Chalcolithic migrations ca. 3300-2600 BC.

Published works indicate that R1b was a predominant haplogroup from the late Neolithic to the early Bronze Age, notably in the Bell Beaker and Yamnaya cultures (Allentoft et al., 2015; Haak et al., 2015; Lee et al., 2012; Mathieson et al., 2015). Nearly 100% of the Afanasievo men we typed belonged to the R1b1a1a subhaplogroup and, for at least three of them, more precisely to the L23 (xM412) subclade. (…)

(…) our results therefore support the hypothesis of a genetic link between Afanasievo and Yamnaya. This also suggests that R1b was indeed dominant in the early Bronze Age Siberian steppe, at least in individuals that were buried in kurgans (possibly an elite part of the population). The geographical and temporal distribution of subhaplogroup R1b1a1a supports the hypothesis of population expansion from West to East in the Eurasian steppe during this period. It should however be noted that the Yamnaya burials from which the samples for DNA analysis were obtained (Allentoft et al., 2015; Haak et al., 2015; Mathieson et al., 2015) were dated within the limits of the Afanasievo period. Ancestors of both East Yamnaya and Afanasievo populations must therefore be sought in the context of earlier Eneolithic cultures in Eastern Europe. Sufficient Y-chromosomal data from such Eneolithic populations is, unfortunately, not yet available.

Mitochondrial- (A) and Y- (B) haplogroup distribution in studied populations

Okunevo and paternal lineage shift in South Siberia

Results obtained in the current study, from more than a dozen Okunevo individuals belonging to the earliest stage of Okunevo culture, that is the Uibat period (2500–2200 BC) (Lazaretov, 1997), suggest a discontinuity in the genetic pool between Afanasievo and Okunevo cultures. Although Y-chromosomal data obtained for bearers of the Okunevo culture showed that one individual carried haplogroup R1b, most Okunevo Y-haplogroups are representative of an Asian component represented by paternal lineages Q and NO1.

(…) Okunevo carrier of Y-haplogroup Q1b1a-L54, which also supports this hypothesis (L54 being a marker of the lineage from which M3, the main Amerindian lineage, arose). Okunevo people could therefore be a remnant paleo-Siberian population with possible Afanasievo input, as suggested by the presence of the R1b1a1a2a subhaplogroup in one individual.

Late Chalcolithic migrations ca. 2600-2250 BC.

Replacement of Asian Indo-European elite lineages by R1a

Published genetic data from the late Bronze Age Andronovo culture from the Minusinsk Basin (Keyser et al., 2009), the Sintashta culture from Russia (Allentoft et al., 2015) and the Srubnaya culture from the region of Samara (Mathieson et al., 2015), show that males did not belong to Y-haplogroup R1b but mostly to R1a clades: there appears to have been a change in the dominant Y-chromosomal haplogroup between the early and the late Bronze Age in these regions. Moreover, as described in Allentoft et al. (2015), the Andronovo and Sintashta peoples were closely related to each other but clearly distinct from both Yamnaya and Afanasievo. Although these results do not imply that Y-haplogroup R1b was entirely absent in these later populations, they could correspond to a replacement of the elite between these two main periods and therefore a difference in the haplogroups of the men that were preferentially buried.

Early Bronze Age migrations ca. 2250-1750 BC.

Afanasevo and the Tarim Basin

The discovery, in the Tarim Basin, of well-preserved mummies from the Bronze Age allows for the construction of two hypotheses regarding the peopling of the Xinjiang province at this period. The “steppe hypothesis” argues for a link with nomadic steppe herders (Hemphill and Mallory, 2004), possibly represented in this case by Afanasievo populations and their descendants (Mallory and Mair, 2000). However, newly published cultural data from the burial grounds of Gumugou (Wang, 2014) and Xiaohe (Xinjiang, 2003, 2007) shows material culture and burial rites incompatible with the Afanasievo culture. The earliest 14C date for Tarim Basin burials would place them at the turn of the 2nd millennium BC (Wang, 2013), 500 years after the Afanasievo period.

Instead, early Gumugou and Xiaohe burial grounds were contemporary with the start of the Andronovo period. Likewise, the Bronze Age population of the Xinjiang at Gumugou/Qäwrighul is not phenotypically closest to Afanasievo but to the Andronovo (Fedorovo) group of northeastern Kazakhstan and western Altai (Kozintsev, 2009). Our investigations demonstrate that Y-chromosomal lineage composition is also compatible with the notion that the ancient Tarim population was genetically distinct from the Afanasievo population. The only Y-haplogroup found by Li et al. (2010) in the Bronze Age Tarim Basin population was Y-haplogroup R1a, which suggests a proximity of this population with Andronovo groups rather than Afanasievo groups.

I don’t think these finds are much of a surprise based on what we already know, or need much explanation…

I would add that, once again, we have more evidence that the movement of Okunevo and related ancient Siberian migrants from Central or North Asia cannot explain the presence of Uralic languages, which were already spread over North-East Europe and Scandinavia during the Bronze Age.

It is also interesting to see more peer-reviewed papers take up the idea that Late Indo-European speakers were clearly linked to the expansion of patrilineally related elite males marked by haplogroup R1b-L23, most likely since the Eneolithic Khvalynsk/Repin cultures.