Corded Ware ancestry in North Eurasia and the Uralic expansion


Now that it has become evident that Late Repin (i.e. Yamnaya/Afanasevo) ancestry was associated with the migration of R1b-L23-rich Late Proto-Indo-Europeans from the steppe in the second half of the 4th millennium BC, there’s still the question of how R1a-rich Uralic speakers of Corded Ware ancestry expanded, and how they spread their languages throughout North Eurasia.

Modern North Eurasians

I have been collecting information from the supplementary data of the latest papers on modern and ancient North Eurasian peoples, including Jeong et al. (2019), Saag et al. (2019), Sikora et al. (2018), and Flegontov et al. (2019), and I have tried to combine their information on ancestral components and their modern and historical distributions.

Fortunately, the current obsession with simplifying ancestry components into three or four general, atemporal groups, and the common use of the same ones across labs, make it very simple to merge data and map them.
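As a rough illustration of why shared component labels make merging simple, here is a minimal sketch in Python. The label map, population names, and proportions are all hypothetical placeholders, not values taken from any of the papers; the only point is that each paper's source label is mapped to one common component before the tables are combined.

```python
# Hypothetical label map: every name and number below is illustrative only.
COMMON_LABELS = {
    "Srubnaya": "Steppe_MLBA",
    "Steppe_MLBA": "Steppe_MLBA",
    "LBK_EN": "Anatolia_Neolithic",
    "Anatolia_N": "Anatolia_Neolithic",
    "WHG": "WHG",
}


def harmonize(records):
    """Merge per-paper ancestry rows into one table keyed by population.

    `records` is a list of (population, source_label, proportion) tuples.
    Labels missing from COMMON_LABELS are kept as-is, so nothing is
    silently dropped.
    """
    merged = {}
    for population, label, proportion in records:
        common = COMMON_LABELS.get(label, label)
        row = merged.setdefault(population, {})
        row[common] = row.get(common, 0.0) + proportion
    return merged


# Two made-up populations reported with different label sets.
records = [
    ("PopA", "Srubnaya", 0.50), ("PopA", "LBK_EN", 0.30), ("PopA", "WHG", 0.20),
    ("PopB", "Steppe_MLBA", 0.40), ("PopB", "Anatolia_N", 0.60),
]
table = harmonize(records)
```

After harmonizing, both populations share the same column names and can be fed to a single mapping step.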

Corded Ware ancestry

There is no doubt about the prevalent ancestry among Uralic-speaking peoples. A map isn’t needed to realize that, because ancient and modern data – like those recently summarized in Jeong et al. (2019) – prove it. But maps sure help visualize their intricate relationship better:

Natural neighbor interpolation of Srubnaya ancestry among modern populations. See full map.
Kriging interpolation of Srubnaya ancestry among modern populations. See full map.

Interestingly, the regions with higher Corded Ware-related ancestry are in great part coincident with (pre)historical Finno-Ugric-speaking territories:

Modern distribution of Uralic languages, with ancient territory (in the Common Era) labelled and delimited by a red line. For more information on the ancient territory see here.

Edit (29/7/2019): Here is the full Steppe_MLBA ancestry map, including Steppe_MLBA (vs. Indus Periphery vs. Onge) in modern South Asian populations from Narasimhan et al. (2018), apart from the ‘Srubnaya component’ in North Eurasian populations. ‘Dummy’ variables (with 0% ancestry) have been included to the south and east of the map to avoid weird interpolations of Steppe_MLBA into Africa and East Asia.

Natural neighbor interpolation of Steppe MLBA-like ancestry among modern populations. See full map.
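The effect of the ‘dummy’ points can be sketched with a much simpler method than the one used for the maps. The example below swaps in plain inverse-distance weighting (not natural neighbor or kriging), and all coordinates and proportions are hypothetical; it only shows how 0% anchors pull the estimate down in unsampled regions.

```python
import math


def idw(points, lon, lat, power=2.0):
    """Inverse-distance-weighted estimate at (lon, lat).

    `points` is a list of (lon, lat, value). Plain Euclidean distance in
    degrees is used, which is crude but enough to show the effect.
    """
    num = den = 0.0
    for p_lon, p_lat, value in points:
        d = math.hypot(lon - p_lon, lat - p_lat)
        if d == 0.0:
            return value  # query falls exactly on a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical sampled populations: (lon, lat, ancestry proportion).
samples = [(30.0, 60.0, 0.45), (50.0, 55.0, 0.40), (75.0, 30.0, 0.25)]
# Hypothetical dummy anchors with 0% ancestry south and east of the samples.
dummies = [(20.0, 0.0, 0.0), (40.0, 0.0, 0.0), (120.0, 30.0, 0.0)]

target = (30.0, 5.0)  # a location far south of every real sample
without_dummies = idw(samples, *target)
with_dummies = idw(samples + dummies, *target)
```

Without the anchors, the far-away location inherits an average of the real samples; with them, the estimate drops close to zero, which is the behaviour the dummy variables are meant to enforce.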

Anatolia Neolithic ancestry

Also interesting are the patterns of non-CWC-related ancestry, in particular the apparent wedge created by expanding East Slavs, which seems to reflect the intrusion of central(-eastern) European ancestry into Finno-Permic territory.

NOTE. Read more on Balto-Slavic hydrotoponymy, on the cradle of Russians as a Finno-Permic hotspot, and about Pre-Slavic languages in North-West Russia.

Natural neighbor interpolation of LBK EN ancestry among modern populations. See full map.
Kriging interpolation of LBK EN ancestry among modern populations. See full map.

WHG ancestry

The cline(s) between WHG, EHG, ANE, Nganasan, and Baikal HG are also simplified when some of them are excluded; in this case EHG is left out, and is thus represented in part by WHG and in part by more eastern ancestries (see below).

Natural neighbor interpolation of WHG ancestry among modern populations. See full map.
Kriging interpolation of WHG ancestry among modern populations. See full map.

Arctic, Tundra or Forest-steppe?

Data on Nganasan-related vs. ANE vs. Baikal HG/Ulchi-related ancestry is difficult to map properly, because these ancestry components are usually reported as mutually exclusive, when they are in fact clearly related in an ancestral cline formed by different ancient North Eurasian populations from Siberia.

When it comes to ascertaining the origin of the multiple CWC-related clines among Uralic-speaking peoples, the question is thus how to properly distinguish the proportions of WHG-, EHG-, Nganasan-, ANE-, or Baikal HG-related ancestral components in North Eurasia, i.e. how each dialectal group admixed with the regional groups which formed part of these clines east and west of the Urals.

The truth is, one ought to test specific ancient samples for each “Siberian” ancestry found in the different Uralic dialectal groups, but the simplistic “Siberian” label somehow gets a pass in many papers (see a recent example).

Below are qpAdm results with best fits for Ulchi ancestry, Afontova Gora 3 ancestry, and Nganasan ancestry. Some populations show good fits for more than one of them, with similar proportions, so selecting one necessarily simplifies the distribution of the others.
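To make the simplification explicit, here is a minimal sketch of a best-fit selection in Python. The populations, p-values, and proportions are hypothetical, and choosing the model with the highest p-value is only an assumed criterion for illustration, not necessarily the one used for the maps; the `tie_margin` parameter is likewise invented here.

```python
# Hypothetical qpAdm-style summaries:
# (population, eastern source, model p-value, proportion of that source).
fits = [
    ("PopA", "Ulchi", 0.41, 0.12),
    ("PopA", "Nganasan", 0.38, 0.13),
    ("PopB", "AG3", 0.22, 0.08),
    ("PopB", "Nganasan", 0.01, 0.10),
]


def best_fits(fits, tie_margin=0.05):
    """Pick the highest p-value model per population and flag near-ties."""
    by_pop = {}
    for pop, source, p, prop in fits:
        by_pop.setdefault(pop, []).append((p, source, prop))
    result = {}
    for pop, models in by_pop.items():
        models.sort(reverse=True)  # highest p-value first
        top_p, top_source, top_prop = models[0]
        # Competing sources whose p-value is within the margin of the winner.
        rivals = [s for p, s, _ in models[1:] if top_p - p <= tie_margin]
        result[pop] = {
            "source": top_source,
            "proportion": top_prop,
            "near_ties": rivals,
        }
    return result


chosen = best_fits(fits)
```

In this made-up case, PopA would be mapped as Ulchi-related even though the Nganasan model fits almost as well with a similar proportion, which is exactly the information a single-component map loses.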

Ulchi ancestry

Natural neighbor interpolation of Ulchi ancestry among modern populations. See full map.
Kriging interpolation of Ulchi ancestry among modern populations. See full map.

ANE ancestry

Natural neighbor interpolation of ANE ancestry among modern populations. See full map.
Kriging interpolation of ANE ancestry among modern populations. See full map.

Nganasan ancestry

Natural neighbor interpolation of Nganasan ancestry among modern populations. See full map.
Kriging interpolation of Nganasan ancestry among modern populations. See full map.

Iran Chalcolithic

A simplistic Iran Chalcolithic-related ancestry is also seen in the Altaic cline(s) which (like Corded Ware ancestry) expanded from Central Asia into Europe – apart from its historical distribution south of the Caucasus:

Natural neighbor interpolation of Iran Chalcolithic ancestry among modern populations. See full map.
Kriging interpolation of Iran Chalcolithic ancestry among modern populations. See full map.

Other models

The first thing I imagine some would like to know is: what about other models? Do they show the same results? Here is the simplistic combination of ancestry components published in Damgaard et al. (2018) for the same or similar populations:

NOTE. As you can see, their selection of EHG vs. WHG vs. Nganasan vs. Natufian vs. Clovis is of little use, but it corroborates the results from other papers, and shows some interesting patterns in combination with those above.


EHG ancestry

Natural neighbor interpolation of EHG ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of EHG ancestry among modern populations. See full map.

Natufian ancestry

Natural neighbor interpolation of Natufian ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of Natufian ancestry among modern populations. See full map.

WHG ancestry

Natural neighbor interpolation of WHG ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of WHG ancestry among modern populations. See full map.

Baikal HG ancestry

Natural neighbor interpolation of Baikal hunter-gatherer ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of Baikal HG ancestry among modern populations. See full map.

Ancient North Eurasians

Once the modern situation is clear, relevant questions are, for example, whether EHG-, WHG-, ANE-, Nganasan-, and/or Baikal HG-related meta-populations expanded or became integrated into Uralic-speaking territories.

When did these admixture/migration events happen?

How did the ancient distribution or expansion of Palaeo-Arctic, Baikalic, and/or Altaic peoples affect the current distribution of the so-called “Siberian” ancestry, and of hg. N1a, in each specific population?

NOTE. A little excursus is necessary, because the calculated repetition of a hypothetical opposition “N1a vs. R1a” doesn’t make this dichotomy real:

  1. There was not a single ethnolinguistic community represented by hg. R1a after the initial expansion of Eastern Corded Ware groups, or by hg. N1a-L392 after its initial expansion in Siberia.
  2. Different subclades became incorporated in different ways into Bronze Age and Iron Age communities, most of them without an ethnolinguistic change. For example, N1a subclades became incorporated into North Eurasian populations of different languages, reaching Uralic- and Indo-European-speaking territories of north-eastern Europe during the late Iron Age, at a time when their ancestral origin or language in Siberia was impossible to ascertain. Just like the mix found among Proto-Germanic peoples (R1b, R1a, and I1)* or among Slavic peoples (I2a, E1b, R1a)*, the mix of many Uralic groups showing specific percentages of R1a, N1a, or Q subclades* reflects more or less recent admixture or acculturation events with little impact on their languages.

*other typically northern and eastern European haplogroups are also represented in early Germanic (N1a, I2, E1b, J, G2), Slavic (I1, G2, J) and Finno-Permic (I1, R1b, J) peoples.

Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

The problem with mapping the ancestry of the available sampling of ancient populations is that we lack proper temporal and regional transects. The maps that follow include cultures roughly divided into either “Bronze Age” or “Iron Age” groups, although the difference between samples may span up to 2,000 years.

NOTE. Rough estimates for more external groups (viz. Sweden Battle Axe/Gotland_A for the NW, Srubna from the North Pontic area for the SW, Arctic/Nganasan for the NE, and Baikal EBA/“Ulchi-like” for the SE) have been included to offer a wider interpolated area using data already known.

Bronze Age

Similar to modern populations, the selection of best fit “Siberian” ancestry between Baikal HG vs. Nganasan, both potentially ± ANE (AG3), is an oversimplification that needs to be addressed in future papers.

Corded Ware ancestry

Natural neighbor interpolation of Srubnaya ancestry among Bronze Age populations. See full map.

Nganasan-like ancestry

Natural neighbor interpolation of Nganasan-like ancestry among Bronze Age populations. See full map.

Baikal HG ancestry

Natural neighbor interpolation of Baikal Hunter-Gatherer ancestry among Bronze Age populations. See full map.

Afontova Gora 3 ancestry

Natural neighbor interpolation of Afontova Gora 3 ancestry among Bronze Age populations. See full map.

Iron Age

Corded Ware ancestry

Interestingly, the moderate expansion of Corded Ware-related ancestry from the south during the Iron Age may be related to the expansion of hg. N1a-VL29 into the chiefdom-based system of north-eastern Europe, including Ananino/Akozino and later expanding Akozino warrior-traders around the Baltic Sea.

NOTE. The samples from Levänluhta are centuries older than those from Estonia (and Ingria), and those from Chalmny Varre are modern ones, so this region has to be read as a south-west to north-east distribution from the Iron Age to modern times.

Natural neighbor interpolation of Srubnaya ancestry among Iron Age populations. See full map.

Baikal HG-like ancestry

The fact that this Baltic N1a-VL29 branch belongs in a group together with typically Avar N1a-B197 supports the Altaic origin of the parent group, which is possibly related to the expansion of Baikalic ancestry and Iron Age nomads:

Natural neighbor interpolation of Baikal HG ancestry among Iron Age populations. See full map.

Nganasan-like ancestry

The dilution of Nganasan-like ancestry in an Arctic region featuring “Siberian” ancestry and hg. N1a-L392 at least since the Bronze Age supports the integration of hg. N1a-Z1934, sister clade of Ugric N1a-Z1936, into populations west and east of the Urals with the expansion of Uralic languages to the north into the Tundra region (see here).

The integration of N1a-Z1934 lineages into Finnic-speaking peoples after their migration to the north and east, and the displacement or acculturation of Saami from their ancestral homeland, coinciding with known genetic bottlenecks among Finns, is yet another proof of this evolution:

Natural neighbor interpolation of Nganasan ancestry among Iron Age populations. See full map.

WHG ancestry

Similarly, WHG ancestry doesn’t seem to be related to important population movements throughout the Bronze Age. This excludes the multiple North Eurasian populations that will be found along the clines formed by WHG, EHG, ANE, Nganasan, and Baikal HG ancestry from forming part of the Uralic ethnogenesis, although they may be relevant for following later regional movements of specific populations.

Natural neighbor interpolation of WHG ancestry among Iron Age populations. See full map.


It seems natural that people used to look at maps of haplogroup distribution from the 2000s, coupled with modern language distributions, and tried to interpret them in a certain way, thus reaching the wrong conclusions, whose consequences are especially visible today, when ancient DNA keeps contradicting them.

In hindsight, though, assuming that Balto-Slavs expanded with Corded Ware and hg. R1a, or that Uralians expanded with “Siberian” ancestry and hg. N1a, was as absurd as looking at maps of ancestry and haplogroup distribution of ancient and modern Native Americans, trying to divide them into “Germanic” or “Iberian”…

The evolution of each specific region and cultural group of North Eurasia is far from being clear. However, the general trend speaks clearly in favour of an ancient, Bronze Age distribution of North Eurasian ancestry and haplogroups that have decreased, diluted, or become incorporated into expanding Uralians of Corded Ware ancestry, occasionally spreading with inter-regional expansions of local groups.

Given the relatively recent push of Altaic and Indo-European languages into ancestral Uralic-speaking territories, only the ancient Corded Ware expansion remains compatible with the spread of Uralic languages into their historical distribution.


Resurgence of local populations in the final Corded Ware culture period from Poland


Open access A genomic Neolithic time transect of hunter-farmer admixture in central Poland, by Fernandes et al. Scientific Reports (2018).

Interesting excerpts (emphasis mine, stylistic changes):

Most mtDNA lineages found are characteristic of the early Neolithic farmers in south-eastern and central Europe of the Starčevo-Kőrös-Criş and LBK cultures. Haplogroups N1a, T2, J, K, and V, which are found in the Neolithic BKG, TRB, GAC and Early Bronze Age samples, are part of the mitochondrial ‘Neolithic package’ (which also includes haplogroups HV, V, and W) that was introduced to Europe with farmers migrating from Anatolia at the onset of the Neolithic [17, 31].

A noteworthy proportion of Mesolithic haplogroup U5 is also found among the individuals of the current study. The proportion of haplogroup U5 already present in the earliest of the analysed Neolithic groups from the examined area differs from the expected pattern of diversity of mtDNA lineages based on a previous archaeological view and on the aDNA findings from the neighbouring regions which were settled by post-Linear farmers similar to BKG at that time. A large proportion of Mesolithic haplogroups in late-Danubian farmers in Kuyavia was also shown in previous studies concerning BKG samples based on mtDNA only, although these frequencies were derived on the basis of very small sample sizes.


A significant genetic influence of HG populations persisted in this region at least until the Eneolithic/Early Bronze Age period, when steppe migrants arrived to central Europe. The presence of two outliers from the middle and late phases of the BKG in Kuyavia associated with typical Neolithic burial contexts provides evidence that hunter-farmer contacts were not restricted to the final period of this culture and were marked by various episodes of interaction between two societies with distinct cultural and subsistence differences.

The identification of both mitochondrial and Y-chromosome haplogroup lineages of Mesolithic provenance (U5 and I, respectively) in the BKG support the theory that both male and female hunter-gatherers became part of these Neolithic agricultural societies, as has been reported for similar cases from the Carpathian Basin, and the Balkans. The identification of an individual with WHG affinity, dated to ca. 4300 BCE, in a Middle Neolithic context within a BKG settlement, provides direct evidence for the regional existence of HG enclaves that persisted and coexisted at least for over 1000 years, from the arrival of the LBK farmers ca. 5400 BCE until ca. 4300 BCE, in proximity with Neolithic settlements, but without admixing with their inhabitants.

Principal component analysis with modern populations greyed out on the background (top), ADMIXTURE results with K = 10 with samples from this study amplified (bottom).

The analysis of two Late Neolithic cultures, the GAC and CWC, shows that steppe ancestry was present only among the CWC individuals analysed, and that the single GAC individual had more WHG ancestry than previous local Neolithic individuals. (…) The CWC’s affinity to WHG, however, contrasts with results from published CWC individuals that identified steppe ancestry related to Yamnaya as the major contributor to the CWC genomes, while here we report also substantial contributions from WHG that could relate to the late persistence of pockets of WHG populations, as supported by the admixture results of N42 and the finding of the 4300-year-old N22 HG individual. These results agree with archaeological theories that suggest that the CWC interaction with incoming steppe cultures was complex and that it varied by region.

Some comments

About the analyzed CWC samples, it is remarkable that, even though they are somehow related to each other, they do not form a tight cluster. Also remarkable are their Y-DNA (I2a) and this:

When compared to previously published CWC data, our CWC group (not individuals) is genetically significantly closer to WHG than to steppe individuals (Z = −4.898), a result which is in contrast with those for CWC from Germany (Z = 2.336), Estonia (Z = 0.555), and Latvia (Z = 1.553).
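The Z-scores in this excerpt can be read with a short sketch like the one below. The values are the ones quoted above; the |Z| ≥ 3 cut-off is an assumed, commonly used convention that the excerpt does not state, and reading positive values as "towards steppe" is inferred from the wording for the Polish group rather than given explicitly.

```python
# Z-scores quoted in the excerpt. Negative values lean towards WHG; reading
# positive values as leaning towards steppe is an inference from the wording.
z_scores = {
    "Poland (this study)": -4.898,
    "Germany": 2.336,
    "Estonia": 0.555,
    "Latvia": 1.553,
}

THRESHOLD = 3.0  # assumed conventional cut-off, not stated in the excerpt


def classify(z, threshold=THRESHOLD):
    """Label a Z-score by direction, or as non-significant below the cut-off."""
    if z <= -threshold:
        return "significantly closer to WHG"
    if z >= threshold:
        return "significantly closer to steppe"
    return "no significant difference"


labels = {group: classify(z) for group, z in z_scores.items()}
```

Under that assumed threshold, only the Polish CWC group stands out, while the German, Estonian, and Latvian groups do not reach significance in either direction.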

Ancestry proportions based on qpAdm. Visual representation of the main results presented in Supplementary Table S5. Populations from this study marked with an asterisk. Values and populations in brackets show the nested model results marked in green in Supplementary Table S5.

Włodarczak (2017) talks about the CWC period in Poland after ca. 2600 BC as a time of emergence of an allochthonous population, marked by the rare graves of this area, showing infiltrations initially mainly from Lesser Poland, and later (after 2500 BC) from the western Baltic zone.

Since forest sub-Neolithic populations would have probably given more EHG to the typical CWC population, these samples support the resurgence of ‘local’ pockets of GAC- or TRB-like groups with more WHG (and also Levant_Neolithic) ancestry.

The known presence of I2a2a1b lineages in GAC groups in Poland also supports this interpretation, and the survival of such pockets of pre-steppe-like populations is also seen with the same or similar lineages appearing in comparable ‘resurgence’ events in Central Europe, e.g. in samples from the Únětice and Tumulus cultures.

About the Bronze Age sample, we have at last official confirmation of haplogroup R1a1a (sadly no subclade*) at the very beginning of the Trzciniec period, in a region between western (Iwno) and eastern (Strzyżów) groups related to Mierzanowice, which has to be put in relation with the samples from the final Trzciniec period in the Baltic published in Mittnik et al. (2018).

EDIT (8 OCT 2018): More specific subclades have been published, including an R1a-Z280 lineage for the Bronze Age sample (see spreadsheet).

This confirms the early resurgence of R1a-Z645 (probably R1a-Z282) lineages at the core of the developing East European Bronze Age, a province of the European Bronze Age that emerged from evolving Bell Beaker groups in Poland.

Arrival of Bell Beakers in Poland after ca. 2400 BC, and their origin in other BBC centres (Czebreszuk and Szmyt 2011).

I don’t have any hope that the Balto-Slavic evolution through BBC Poland → Mierzanowice/Iwno → Trzciniec → Lusatian cultures is going to be confirmed any time soon, at least not until we have a complete trail of samples to follow all the way to historic Slavs of the Prague culture. However, I do think that the current data on central-east Europe – and the recent data we are receiving from north-east Europe and the Iranian steppes, at odds with the Indo-Slavonic alternative – supports this model.

I guess that, in the end, similar to how the Yamna vs. Corded Ware question is being solved, the real route of expansion of Proto-Balto-Slavic (supposedly spoken ca. 1500-1000 BC) is probably going to be decided by the expansion of either R1a-M458 (from the west) or R1a-Z280 lineages (from the east), because the limited precision of the genetic data and analyses available today is going to show ‘modern Slavic’-like populations from the whole eastern half of Europe for the past 4,000 years…