ASoSaH Reread (II): Y-DNA haplogroups among Uralians (apart from R1a-M417)


This is mainly a reread of Book Two: A Game of Clans, from the series A Song of Sheep and Horses: chapters iii.5. Early Indo-Europeans and Uralians, iv.3. Early Uralians, v.6. Late Uralians and vi.3. Disintegrating Uralians.

“Sredni Stog”

While the true source of R1a-M417 – the main haplogroup eventually associated with Corded Ware, and thus Uralic speakers – is still not known with precision, due to the lack of R1a-M198 in ancient samples, we already know that the Pontic-Caspian steppes were probably not it.

We have many samples from the north Pontic area since the Mesolithic compared to the Volga-Ural territory, and there is a clear prevalence of I2a-M223 lineages in the forest-steppe area, mixed with R1b-V88 (possibly a back-migration from south-eastern Europe).

R1a-M459 (xR1a-M198) lineages appear from the Mesolithic to the Chalcolithic scattered from the Baltic to the Caucasus, from the Dniester to Samara, in a situation similar to haplogroups Q1a-M25 and R1b-L754, which supports the idea that R1a, Q1a, and R1b expanded with ANE ancestry, possibly in different waves since the Epipalaeolithic, and formed the known ANE:EHG:WHG cline.

Y-DNA samples from Khvalynsk and neighbouring cultures. See full version.

The first confirmed R1a-M417 sample comes from Alexandria, roughly coinciding with the so-called steppe hiatus. Its emergence in the area of the previous “early Sredni Stog” groups (see the mess of the traditional interpretation of the north Pontic groups as “Sredni Stog”) and its later expansion with Corded Ware supports Kristiansen’s interpretation that Corded Ware emerged from the Dnieper-Dniester corridor, although samples from the area up to ca. 4000 BC, including the few Middle Eneolithic samples available, show continuity of hg. I2a-M223 and typical Ukraine Neolithic ancestry.

NOTE. The further subclade R1a-Z93 (Y26) reported for the sample from Alexandria seems too early, given the confidence interval for its formation (ca. 3500-2500 BC); even R1a-Z645 could be too early. As with the attribution of the R1b-L754 sample from Khvalynsk to R1b-V1636 (after it was previously classified as belonging to the Pre-V88 and M73 subclades), it seems reasonable to take these SNP calls with a pinch of salt, especially because Yleaf (designed to look for the furthest subclade possible) does not confirm any subclade beyond R1a-M417 and R1b-L754, respectively.

The sudden appearance of “steppe ancestry” in the region, with the high variability shown by Ukraine_Eneolithic samples, suggests that this is due to recent admixture of incoming foreign peoples (of Ukraine Neolithic / Comb Ware ancestry) with Novodanilovka settlers.

The most likely origin of this population, taking into account the most common population movements in the area since the Neolithic, is the infiltration of (mainly) hunter-gatherers from the forest areas. That would confirm the traditional interpretation of the origin of Uralic speakers in the forest zone, although the nature of Pontic-Caspian settlers as hunter-gatherers rather than herders makes this identification today fully unnecessary (see here).

EDIT (3 FEB 2019): As for the most common guesstimates for Proto-Uralic, roughly coinciding with the expansion of this late Sredni Stog community (ca. 4000 BC), you can read the recent post by J. Pystynen in Freelance Reconstruction, Probing the roots of Samoyedic.

Late Sredni Stog admixture shows variability proper of recent admixture of forest-steppe peoples with steppe-like population. See full version here.

NOTE. Although my initial simplistic interpretation (of early 2017) of Comb Ware peoples – traditionally identified as Uralic speakers – potentially showing steppe ancestry was probably wrong, it seems that peoples from the forest zone – related to Comb Ware or neighbouring groups like Lublyn-Volhynia – reached forest-steppe areas to the south and eventually expanded steppe ancestry into east-central Europe through the Volhynian Upland to the Polish Upland, during the late Trypillian disintegration (see a full account of the complex interactions of the Final Eneolithic).

The most interesting aspect of ascertaining the origin of R1a-M417, given its prevalence among Uralic speakers, is to precisely locate the origin of contacts between Late Proto-Indo-European and Proto-Uralic. These were traditionally considered the consequence of contacts between the Middle and Upper Volga regions, but the most recent archaeological research and data from ancient DNA samples have made it clear that Corded Ware is the most likely vector of expansion of Uralic languages; hence these contacts of Indo-Europeans of the Volga-Ural region with Uralians have to be looked for among neighbours of the north Pontic area.

Sredni Stog – Repin contacts representing Uralic – Late Indo-European contacts were probably concentrated around the Don River.

My bet – rather obvious today – is that the Don River area is the source of the earliest borrowings of Late Uralic from Late Indo-European (i.e. post-Indo-Anatolian). The borrowing of the Late PIE word for ‘horse’ is particularly interesting in this regard. Later contacts (after the loss of the initial laryngeal) may be attributed to the traditionally depicted Corded Ware – Yamna contact zone in the Dnieper-Dniester area.

NOTE. While the finding of R1a-M417 populations neighbouring R1b-L23 in the Don-Volga interfluve would be great to confirm these contacts, I don’t know if the current pace of more and more published samples will continue. The information we have right now, in my opinion, suffices to support close contacts of neighbouring Indo-Europeans and Uralians in the Pontic-Caspian area during the Late Eneolithic.

Classical Corded Ware

After some complex movements of TRB, late Trypillia and GAC peoples, Corded Ware apparently emerged in central-east Europe, under the influence of different cultures and from a population that probably (at least partially) stemmed from the north Pontic forest-steppe area.

Single Grave and central Corded Ware groups – showing some of the earliest available dates (emerging likely ca. 3000/2900 BC) – are as varied in their haplogroups as is expected from a sink (which does not in the least resemble the Volga-Ural population):

Interesting is the presence of R1b-L754 in Obłaczkowo, potentially of R1b-V88 subclade, as previously found in two Central European individuals from Blätterhöhle MN (ca. 3650 and 3200 BC), and in the Iron Gates and north Pontic areas.

Haplogroups I2a and G have also been reported in early samples, all potentially related to the supposed Corded Ware central-east European homeland, likely in southern Poland, a region naturally connected to the north Pontic forest-steppe area and to the expansion of Neolithic groups.

Y-DNA samples from early Corded Ware groups and neighbouring cultures. See full version.

The true bottlenecks under haplogroup R1a-Z645 seem to have happened only during the migration of Corded Ware to the east: to the north into the Battle Axe culture, mainly under R1a-Z282, and to the south into Middle Dnieper – Fatyanovo-Balanovo – Abashevo, probably eventually under R1a-Z93.

This separation is in line with their reported TMRCA, and supports the split of Finno-Permic from an eastern Uralic group (Ugric and Samoyedic), although still in contact through the Russian forest zone to allow for the spread of Indo-Iranian loans.

This bottleneck also supports in archaeology the expansion of a sort of unifying “Corded Ware A-horizon” spreading with people (disputed by Furholt), the disintegrating Uralians, and thus a source of further loanwords shared by all surviving Uralic languages.

Confirming this ‘concentrated’ Uralic expansion to the east is the presence of R1a-M417 (xR1a-Z645) lineages among early and late Single Grave groups in the west (which essentially disappeared after the Bell Beaker expansion), as well as the presence of these subclades in modern Central and Western Europeans. Central European groups thus became integrated in post-Bell Beaker European EBA cultures, and their Uralic dialect likely disappeared without a trace.

NOTE. The fate of R1b-L51 lineages – linked to North-West Indo-Europeans undergoing a bottleneck in the Yamna Hungary -> Bell Beaker migration to the west – is thus similar to that of haplogroup R1a-Z645 (linked to the expansion of Late Uralians to the east), hence proving the traditional interpretation of the language expansions as male-driven migrations. These are two of the most interesting genetic data we have to date to confirm previous language expansions and dialectal classifications.

It will be also interesting to see if known GAC and Corded Ware I2a-Y6098 subclades formed eventually part of the ancient Uralic groups in the east, apart from lineages which will no doubt appear among asbestos ware groups and probably hunter-gatherers from north-eastern Europe (see the recent study by Tambets et al. 2018).

Corded Ware ancestry marked the expansion of Uralians

Sadly, some brilliant minds decided in 2015 that the so-called “Yamnaya ancestry” (now more appropriately called “steppe ancestry”) should be associated with ‘Indo-Europeans’. This is causing the development of various new pet theories on the go, as more and more data contradicts this interpretation.

There is a clear long-lasting cultural, populational, and natural barrier between Yamna and Corded Ware: they are derived from different ancestral populations, which show clearly different ancestry and ancestry evolution (although they did converge to some extent), as well as different Y-DNA bottlenecks; they show different cultures, including those of preceding and succeeding groups, and evolved in different ecological niches. The only true steppe pastoralists who managed to dominate over grasslands extending from the Upper Danube to the Altai were Yamna peoples and their cultural successors.

Corded Ware admixture proper of expanding late Sredni Stog-like populations from the forest-steppe. See full version here.

NOTE. You can also read two recent posts by FrankN in the blog aDNA era, with detailed information on the Pontic-Caspian cultures and the formation of “steppe ancestry” during the Palaeolithic, Mesolithic and Neolithic: How did CHG get into Steppe_EMBA? Part 1: LGM to Early Holocene and How did CHG get into Steppe_EMBA? Part 2: The Pottery Neolithic. Unlike your typical amateur blogger on genetics using few statistical comparisons coupled with ‘archaeolinguoracial mumbo jumbo’ to reach unscientific conclusions, these are obviously carefully written texts which deserve to be read.

I will not enter into the discussion of “steppe ancestry” and the mythical “Siberian ancestry” for this post, though. I will just repost the opinion of Volker Heyd – an archaeologist specialized in Yamna Hungary and Bell Beakers who is working with actual geneticists – on the early conclusions based on “steppe ancestry”:

[A]rchaeologist Volker Heyd at the University of Bristol, UK, disagreed, not with the conclusion that people moved west from the steppe, but with how their genetic signatures were conflated with complex cultural expressions. Corded Ware and Yamnaya burials are more different than they are similar, and there is evidence of cultural exchange, at least, between the Russian steppe and regions west that predate Yamnaya culture, he says. None of these facts negates the conclusions of the genetics papers, but they underscore the insufficiency of the articles in addressing the questions that archaeologists are interested in, he argued. “While I have no doubt they are basically right, it is the complexity of the past that is not reflected,” Heyd wrote, before issuing a call to arms. “Instead of letting geneticists determine the agenda and set the message, we should teach them about complexity in past human actions.”


ASoSaH Reread (I): Y-DNA haplogroups among Indo-Europeans (apart from R1b-L23)


Given my reduced free time in these months, I have decided to keep updating the text on Indo-European and Uralic migrations and/or this blog, simultaneously or alternatively, to make the most out of the time I can dedicate to this. I will add the different ‘A Song of Sheep and Horses (ASoSaH) reread’ posts to the original post announcing the books. I would be especially interested in comments and corrections to the book chapters rather than the posts, but any comments are welcome (including in the forum, where comments are more likely to stick).

This is mainly a reread of iv.2. Indo-Anatolians and vi.1. Disintegrating Indo-Europeans.

Indo-Anatolians and Late Indo-Europeans

I have often written about R1b-L23 as the majority haplogroup among Late Proto-Indo-Europeans (see my predictions for 2018 and my summary of 2018), but always expected other haplogroups to pop up somewhere along the way, in Khvalynsk, in Repin, in Yamna, and in Bell Beakers (see e.g. the post on common fallacies of R1a/IE-fans).

Luckily enough – for those of us who want precise answers to our previous infinite models of Indo-European language expansions (viz. GAC-associated expansion, IE-speaking Old Europe, Anatolian homeland, Iran homeland, Maykop as Proto-Anatolian, Palaeolithic Continuity Theory, Celtic in the Atlantic façade, etc.) – the situation has been more clear-cut than expected: it turns out that, especially during population expansions, acute Y-chromosome bottlenecks were very common in the past, at least until the Iron Age.

Khvalynsk and Repin-Yamna expansions were no different, and that seems quite natural in hindsight, given the strong familial ties and aversion to foreigners proper of the Late Proto-Indo-European society and culture – probably not really that different from other contemporary societies, like the neighbouring Late Proto-Uralic or Trypillian ones.

Y-DNA samples from Khvalynsk and neighbouring cultures. See full version here.

Y-DNA haplogroups

During the expansion of early Khvalynsk, the most likely Indo-Anatolian culture, the society of the Don-Volga area was probably made up of different lineages including R1b-V1636, R1b-M269, R1a-YP1272, Q1a-M25, and I2a-L699 (and possibly some R1b-V88?), a variability possibly greater than that of the contemporary north Pontic area, probably a sign of this region being a sink of different east and west migrations from steppe and forest areas.

During its expansion, the Khvalynsk society saw its haplogroup variability reduced, as evidenced by the succeeding expansive Repin culture:

Afanasevo, representing Pre-Tocharian (the earliest Late PIE dialect to branch off), expanded with R1b-L23 – especially R1b-Z2103 – lineages, while early Yamna expanded with R1b-L23 and I2a-L699 lineages, which suggests that these are the main haplogroups that survived the Y-DNA bottleneck undergone during the Khvalynsk expansion, and especially later during the late Repin expansion. Nevertheless, other old haplogroups might still pop up during the Repin and early Yamna period, such as the R1b-V1636 sample from Yamna in the Northern Caucasus.

It is still unclear if R1b-L23 sister clade R1b-PF7562 (formed ca. 4400 BC, TMRCA ca. 3400 BC), prevalent among modern Albanians, expanded with Yamna migrants, or if it was part of an earlier expansion of R1b-M269 into the Balkans, and thus represents Indo-Anatolian speakers who later hitchhiked the expansion of the Late PIE language from the north or west Pontic area. The early TMRCA seems to suggest an association with Repin (and therefore Yamna), rather than later movements in the Balkans.

Y-DNA samples from Yamnaya and neighbouring cultures. See full version here.

‘Yamnaya’ or ‘steppe’ ancestry?

After the early years when population genetics relied mainly on modern Y-DNA haplogroups, geneticists and amateurs have been recently playing around with testing “ancestry percentages”, based on newly developed free statistical tools, which offer obviously just one among many types of data to achieve a proper interpretation of the past.

Today we have quite a lot of Y-DNA haplogroups reported for ancient samples of more recent prehistoric periods, and they seem to offer (at least since the 2015 papers, but more evidently since the 2018 papers on Bell Beakers and Europeans, Corded Ware, or Fennoscandia among others) the most straightforward interpretation of all results published in population genomics research.

NOTE. The finding of a specific type of ancestry in one isolated 40,000-year-old sample from Tianyuan can offer very interesting information on potential population movements to the region. However, the identification of ethnolinguistic communities and their migrations among neighbouring groups in Neolithic or Bronze Age groups is evidently not that simple.

Yamnaya (Indo-European peoples) and their evolution in the steppes, together with North Pontic (eventually Uralic) peoples. Notice how little Indo-European ancestry changes from Khvalynsk (Indo-Anatolian) to Yamna Hungary (North-West Indo-Europeans). Image modified from Wang et al. (2018). See more on the evolution of “steppe ancestry”.

It is becoming more and more clear with each paper that the true “Yamnaya ancestry” – not the originally described one – was in fact associated with Indo-Europeans (see more on the very Yamnaya-like Yamna Hungary and early East Bell Beaker R1b samples, all of quite similar ancestry and PCA cluster before their further admixture with EEF- and CWC-like groups).

The so-called “steppe ancestry”, on the other hand, reflects the contribution of a Northern Caucasus-related ancestry to expanding Khvalynsk settlers, who spread through the steppes more than a thousand years before the expansion of Late Proto-Indo-Europeans with late Repin, and can thus be found among different groups related to the Pontic-Caspian steppes (see more on the emergence and evolution of “steppe ancestry”).

In fact, after the Yamna/Indo-European and Corded Ware/Uralic expansions, it is more likely to find “steppe ancestry” to the north and east in territories traditionally associated with Uralic languages, whereas to the south and west – i.e. in territories traditionally associated with Indo-European languages – it is more likely to find “EEF ancestry” with diminished “steppe ancestry”, among peoples patrilineally descended from Yamna settlers.

Y-DNA haplogroups, the only uniparental markers (see exceptions in mtDNA inheritance) – unlike ancestry percentages based on the comparison of a few samples and flawed study designs – do not admix, do not change, and therefore they do not lend themselves to infinite pet theories (see e.g. what David Reich has to say about R1b-P312 in Iberia directly derived from Yamna migrants in spite of their predominant EEF ancestry): their cultural continuity can only be challenged with carefully threaded linguistic, archaeological, and genetic data.


Modern Sardinians show elevated Neolithic farmer ancestry shared with Basques


New paper (behind paywall), Genomic history of the Sardinian population, by Chiang et al. Nature Genetics (2018), previously published as a preprint at bioRxiv (2016).

EDIT (18 Sep 2018): Link to read paper for free shared by the main author.

Interesting excerpts (emphasis mine):

Our analysis of divergence times suggests the population lineage ancestral to modern-day Sardinia was effectively isolated from the mainland European populations ~140–250 generations ago, corresponding to ~4,300–7,000 years ago assuming a generation time of 30 years and a mutation rate of 1.25 × 10⁻⁸ per basepair per generation. (…) in terms of relative values, the divergence time between Northern and Southern Europeans is much more recent than either is to Sardinia, signaling the relative isolation of Sardinia from mainland Europe.

We documented fine-scale variation in the ancient population ancestry proportions across the island. The most remote and interior areas of Sardinia—the Gennargentu massif covering the central and eastern regions, including the present-day province of Ogliastra— are thought to have been the least exposed to contact with outside populations. We found that pre-Neolithic hunter-gatherer and Neolithic farmer ancestries are enriched in this region of isolation. Under the premise that Ogliastra has been more buffered from recent immigration to the island, one interpretation of the result is that the early populations of Sardinia were an admixture of the two ancestries, rather than the pre-Neolithic ancestry arriving via later migrations from the mainland. Such admixture could have occurred principally on the island or on the mainland before the hypothesized Neolithic era influx to the island. Under the alternative premise that Ogliastra is simply a highly isolated region that has differentiated within Sardinia due to genetic drift, the result would be interpreted as genetic drift leading to a structured pattern of pre-Neolithic ancestry across the island, in an overall background of high Neolithic ancestry.

PCA results of merged Sardinian whole-genome sequences and the HGDP Sardinians. See below for a map of the corresponding regions.

We found Sardinians show a signal of shared ancestry with the Basque in terms of the outgroup f3 shared-drift statistics. This is consistent with long-held arguments of a connection between the two populations, including claims of Basque-like, non-Indo-European words among Sardinian placenames. More recently, the Basque have been shown to be enriched for Neolithic farmer ancestry and Indo-European languages have been associated with steppe population expansions in the post-Neolithic Bronze Age. These results support a model in which Sardinians and the Basque may both retain a legacy of pre-Indo-European Neolithic ancestry. To be cautious, while it seems unlikely, we cannot exclude that the genetic similarity between the Basque and Sardinians is due to an unsampled pre-Neolithic population that has affinities with the Neolithic representatives analyzed here.

Left: Geographical map of Sardinia. The provincial boundaries are given as black lines. The provinces are abbreviated as Cag (Cagliari), Cmp (Campidano), Car (Carbonia), Ori (Oristano), Sas (Sassari), Olb (Olbia-tempio), Nuo (Nuoro), and Ogl (Ogliastra). For sampled villages within Ogliastra, the names and abbreviations are indicated in the colored boxes. The color corresponds to the color used in the PCA plot (Fig. 2a). The Gennargentu region referred to in the main text is the mountainous area shown in brown that is centered in western Ogliastra and southeastern Nuoro.
Right: Density of Nuraghi in Sardinia, from Wikipedia.

While we can confirm that Sardinians principally have Neolithic ancestry on the autosomes, the high frequency of two Y-chromosome haplogroups (I2a1a1 at ~39% and R1b1a2 at ~18%) that are not typically affiliated with Neolithic ancestry is one challenge to this model. Whether these haplogroups rose in frequency due to extensive genetic drift and/or reflect sex-biased demographic processes has been an open question. Our analysis of X chromosome versus autosome diversity suggests a smaller effective size for males, which can arise due to multiple processes, including polygyny, patrilineal inheritance rules, or transmission of reproductive success. We also find that the genetic ancestry enriched in Sardinia is more prevalent on the X chromosome than the autosome, suggesting that male lineages may more rapidly trace back to the mainland. Considering that the R1b1a2 haplogroup may be associated with post-Neolithic steppe ancestry expansions in Europe, and the recent timeframe when the R1b1a2 lineages expanded in Sardinia, the patterns raise the possibility of recent male-biased steppe ancestry migration to Sardinia, as has been reported among mainland Europeans at large (though see Lazaridis and Reich and Goldberg et al.). Such a recent influx is difficult to square with the overall divergence of Sardinian populations observed here.
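The X-versus-autosome argument in the excerpt above can be illustrated with the textbook effective-size formulas for an idealised population with unequal numbers of breeding males and females. This is only a minimal sketch of the general principle, not the method used by Chiang et al.; the function name and the example numbers are hypothetical.

```python
def x_to_autosome_ne_ratio(n_males, n_females):
    """Ratio of X-chromosome to autosomal effective size.

    Uses the standard idealised-population formulas (an assumption here,
    not taken from the paper):
        Ne_autosome = 4 * Nm * Nf / (Nm + Nf)
        Ne_X        = 9 * Nm * Nf / (4 * Nm + 2 * Nf)
    """
    ne_auto = 4 * n_males * n_females / (n_males + n_females)
    ne_x = 9 * n_males * n_females / (4 * n_males + 2 * n_females)
    return ne_x / ne_auto


# Equal numbers of breeding males and females give the baseline ratio of 3/4.
print(x_to_autosome_ne_ratio(1000, 1000))

# Hypothetical case: half as many breeding males as females.
# The ratio rises above 3/4, i.e. relatively more diversity on the X.
print(x_to_autosome_ne_ratio(500, 1000))
```

Under these assumptions, an X-to-autosome diversity ratio above 3/4 is what a smaller male effective size would produce, whatever the social process (polygyny, patrilineal inheritance, etc.) behind it.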

Mixture proportions of the three-component ancestries among Sardinian populations. Using a method first presented in Haak et al. (Nature 522, 207–211, 2015), we computed unbiased estimates of mixture proportions without a parameterized model of relationships between the test populations and the outgroup populations based on f4 statistics. The three-component ancestries were represented by early Neolithic individuals from the LBK culture (LBK_EN), pre-Neolithic hunter-gatherers (Loschbour), and Bronze Age steppe pastoralists (Yamnaya). See Supplementary Table 5 for standard error estimates computed using a block jackknife.

Once again, haplogroup R1b1a2 (M269), and only R1b1a2, related to male-biased, steppe-related Indo-European migrations…just sayin’.

Interestingly, haplogroup I2a1a1 is actually found among northern Iberians during the Neolithic and Chalcolithic, and is therefore associated with Neolithic ancestry in Iberia, too, and consequently – unless there is a big surprise hidden somewhere – with the ancestry found today among Basques.

NOTE. In fact, the increase in Neolithic ancestry found in south-west Ireland with expanding Bell Beakers (likely Proto-Beakers), coupled with the finding of I2a subclades in Megalithic cultures of western Europe, would support this replacement after the Cardial and Epi-Cardial expansions, which were initially associated with G2a lineages.

I am not convinced about a survival of Palaeo-Sardo after the Bell Beaker expansion, though, since there is no clear-cut cultural divide (and posterior continuity) of pre-Beaker archaeological cultures after the arrival of Bell Beakers in the island that could be identified with the survival of Neolithic languages.

We may have to wait for ancient DNA to show a potential expansion of Neolithic ancestry from the west, maybe associated with the emergence of the Nuragic civilization (potentially linked with contemporaneous Megalithic cultures in Corsica and in the Balearic Islands, and thus with an Iberian rather than a Basque stock), although this is quite speculative at this moment in linguistic, archaeological, and genetic terms.

Nevertheless, it seems that the association of a Basque-Iberian language with the Neolithic expansion from Anatolia (see Villar’s latest book on the subject) is somehow strengthened by this paper. However, it is unclear when, how, and where expanding G2a subclades were replaced by native I2 lineages.


Viking Age town shows higher genetic diversity than Neolithic and Bronze Age


Open access Genomic and Strontium Isotope Variation Reveal Immigration Patterns in a Viking Age Town, by Krzewińska et al., Current Biology (2018).

Interesting excerpts (emphasis mine, some references deleted for clarity):

The town of Sigtuna in eastern central Sweden was one of the pioneer urban hubs in the vast and complex communicative network of the Viking world. The town that is thought to have been royally founded was planned and organized as a formal administrative center and was an important focal point for the establishment of Christianity [19]. The material culture in Sigtuna indicates that the town had intense international contacts and hosted several cemeteries with a Christian character. Some of them may have been used by kin-based groups or by people sharing the same sociocultural background. In order to explore the character and magnitude of mobility and migration in a late Viking Age town, we generated and analyzed genomic (n = 23) and strontium isotope (n = 31) data from individuals excavated in Sigtuna.


The mitochondrial genomes were sequenced at 1.5× to 367× coverage. Most of the individuals were assigned to haplogroups commonly found in current-day Europeans, such as H, J, and U [14, 26, 27]. All of these haplotypes are present in Scandinavia today.

The Y chromosome haplogroups were assigned in seven males. The Y haplogroups include I1a, I2a, N1a, G2a, and R1b. Two identified lineages (I2a and N1a) have not been found in modern-day Sweden or Norway [28, 29]. Haplogroups I and N are associated with eastern and central Europe, as well as Finno-Ugric groups [30]. Interestingly, I2a was previously identified in a middle Neolithic Swedish hunter-gatherer dating to ca. 3,000 years BCE [31].

In Sigtuna, the genetic diversity in the late Viking Age was greater than the genetic diversity in late Neolithic and Bronze Age cultures (Unetice and Yamnaya as examples) and modern East Asians; it was on par with Roman soldiers in England but lower than in modern-day European groups (GBR and FIN; Figure 2B). Within the town, the group excavated at church 1 has somewhat greater diversity than that at cemetery 1. Interestingly, the diversity at church 1 is nearly as high as that observed in Roman soldiers in England, which is remarkable, since the latter was considered to be an exceptionally heterogeneous group in contemporary Europe [39].

A PCA plot visualising all 23 individuals from Sigtuna used in ancient DNA analyses (m – males, f – females).

Different sex-related mobility patterns for Sigtuna inhabitants have been suggested based on material culture, especially ceramics. Building on design and clay analyses, some female potters in Sigtuna are thought to have grown up in Novgorod in Rus’ [40]. Moreover, historical sources mention female mobility in connection to marriage, especially among the elite from Rus’ and West Slavonic regions [41, 42]. Male mobility is also known from historical sources, often in connection to clergymen moving to the town [43].

Interestingly, we found a number of individuals from Sigtuna to be genetically similar to the modern-day human variation of eastern Europeans, and most harbor close genetic affinities to Lithuanians (Figure 2A). The strontium isotope ratios in 28 adult individuals with assigned biological sex and strontium values obtained from teeth (23 M1 and five M2) show that 70% of the females and 44% of the males from Sigtuna were non-locals (STAR Methods). The difference in migrant ratios between females and male mobility patterns was not statistically significant (Fisher’s exact test, p = 0.254 for 28 individuals and p = 0.376 for 16 individuals). Hence, no evidence of a sex-specific mobility pattern was found.

(…) As these social groups are not mirrored by our genetic or strontium data, this suggests that the inclusion in them was not based on kinship. Therefore, it appears as if socio-cultural factors, not biological bonds, governed where people were interred (i.e., the choice of cemetery).
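The Fisher’s exact test quoted above for the 28 individuals can be reproduced with a short sketch. The excerpt gives only percentages, so the counts used here (7 of 10 females and 8 of 18 males non-local) are an assumption, reconstructed because they are consistent with 70%, 44% and a total of 28. The function name is hypothetical, and only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of seeing k in the top-left cell
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    probs = [prob(k) for k in range(lowest, highest + 1)]
    # Sum every table that is at least as unlikely as the observed one
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


# Assumed counts (not stated in the excerpt):
# females: 7 non-local, 3 local; males: 8 non-local, 10 local
p_value = fisher_exact_two_sided(7, 3, 8, 10)
print(round(p_value, 3))
```

With these assumed counts the result rounds to the p = 0.254 reported in the excerpt, which shows how a 70% versus 44% difference can fail to reach significance in a sample this small.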

Average pairwise genetic diversity measured in complete Sigtuna, St. Gertrud (church 1) and cemetery 1 (the Nunnan block) compared to both ancient and modern populations ranked by time period (Yamnaya, Unetice, and GBR-Roman, Roman Age individuals from Great Britain; GBR-AS, Anglo-Saxon individuals from Great Britain; GBR-IA, Iron Age individuals from Great Britain; JPT-Modern, present-day Japanese from Tokyo; FIN-Modern, present-day Finnish; GBR-Modern, present-day British; GIH-Modern, present-day Gujarati Indian from Houston, Texas). Error bars show ±2 SEs.
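As a rough illustration of what an “average pairwise genetic diversity” statistic measures, the sketch below computes the mean mismatch rate over all pairs of individuals. It assumes one allele call per site per individual, with `None` marking missing data. The paper’s actual pipeline is not described in the excerpt, so the function names and the toy data are hypothetical.

```python
from itertools import combinations


def pairwise_mismatch(calls_a, calls_b):
    """Fraction of jointly covered sites at which two individuals differ."""
    shared = [(x, y) for x, y in zip(calls_a, calls_b)
              if x is not None and y is not None]
    if not shared:
        return None  # no overlapping sites, the pair is uninformative
    return sum(x != y for x, y in shared) / len(shared)


def average_pairwise_diversity(samples):
    """Mean mismatch rate over all informative pairs of individuals."""
    rates = []
    for calls_a, calls_b in combinations(samples, 2):
        rate = pairwise_mismatch(calls_a, calls_b)
        if rate is not None:
            rates.append(rate)
    return sum(rates) / len(rates)


# Toy data: three individuals, four sites, one missing call
toy_samples = [
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 0, None],
]
diversity = average_pairwise_diversity(toy_samples)
print(diversity)
```

A group drawn from several distinct source populations yields more mismatching pairs than a tightly related one, which is the sense in which the Sigtuna burials are “more diverse” than the Yamnaya or Unetice samples.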

Interesting from this paper is the higher genetic (especially Y-DNA) diversity found in more recent periods (see e.g. here) compared to Neolithic and Bronze Age cultures, which is probably the reason behind some obviously wrong interpretations, e.g. regarding links between Yamna and Corded Ware populations.

The sample 84001, a “first-generation short-distance migrant” of haplogroup N1c-L392 (N1a in the new nomenclature) brings yet more proof of how:

  • Admixture changes completely within a certain number of generations. In this case, the N1c-L392 sample clusters within the genetic variation of modern Norwegians, near to the Skane Iron Age sample, and not with its eastern origin (likely many generations before).
  • This haplogroup appeared quite late in Fennoscandia but still managed to integrate and expand into different ethnolinguistic groups; in this case, this individual was probably a Viking of Nordic language, given its genetic admixture and its non-local (but neighbouring Scandinavian) strontium values.


On the origin of haplogroup R1b-L51 in late Repin / early Yamna settlers


A recent comment on the hypothetical Central European origin of PIE helped me remember that, when news appeared that R1b-L51 had been found in Khvalynsk ca. 4250-4000 BC, I began to think about alternative scenarios for the expansion of this haplogroup, with one of them including Central Europe.

Because if YFull’s (and Iain McDonald’s) estimate of the split of R1b-L23 into L51 and Z2103 (ca. 4100 BC, TMRCA ca. 3700 BC) was wrong by as much as the R1a-Z645 estimates proved to be, and both subclades were older than expected, then maybe R1b-L51 was not part of the Yamna expansion, but rather part of an earlier expansion with Suvorovo-Novodanilovka into central Europe.

That is, R1b-L51 and R1b-Z2103 would have expanded with Khvalynsk-Novodanilovka migrants, and they would have either disappeared among local populations, or settled and expanded with successful lineages in certain regions. I think this may give rise to two potential models.

A hidden group in the European east-central steppes?

Here is what Heyd (2011), for example, has to say about the effect of the Khvalynsk-Novodanilovka expansion in the 4th millennium BC, with the first Kurgan wave that shattered the social, economic, and cultural foundations of south-eastern Europe (before the expansion of west Yamna migrants in the region):

Proto-Anatolian migrations with Khvalynsk-Novodanilovka expansion, including ADMIXTURE data from Wang et al. (2018).

As the Boleraz and Baden tumuli cases in Serbia and Hungary demonstrate, there are earlier, 4th millennium cal. B.C. round tumuli in the Carpathian basin. There are also earlier north-Pontic steppe populations who infiltrated similar environments west of the Black Sea prior to the rise of the Yamnaya culture. This situation can be traced back to the 2nd half of the 5th millennium cal. B.C. to a group of distinct burials, zoomorphic maceheads, long flint blades, triangular flint points, etc., summarized under the term Suvorovo-Novodanilovka (Govedarica 2004; Rassamakin 2004; Anthony 2007; Heyd forthcoming 2011). They also erected round personalized tumuli, though smaller in size and height, above inhumations of single individuals. Suvorovo and Casimcea are the key examples in the lower Danube region of Romania. In northeast Bulgaria, the primary grave of Polska Kosovo (ochre-stained supine extended body position: information communicated by S. Alexandrov) can also be seen as such, as should the Targovishte-“Gonova mogila” primary grave 1 in the Thracian plain with a burial arranged in a supine position with flexed legs, southeast-northwest orientated, and strewed with ochre (Kanchev 1991, p. 56-57; Ivanova Gaydarska 2007). In addition to the many copper and shell beads, the 17.4 cm long obsidian blade is exceptional, which links this grave to the Csongrád-“Kettoshalom” grave in the south Hungarian plain (Ecsedy 1979). It also yielded an obsidian blade (13.2 cm long) and copper, shell and limestone beads.

The Southeast European distribution of graves of the Suvorovo-Novodanilovka group and such unequipped ones mentioned in the text which can be attributed by burial custom and stratigraphic position in the barrow, plus zoomorphic and abstract animal head sceptres as well as specific maceheads with knobs as from Decea Maresului (mid-5th millennium until around 4000 BC). Heyd (2016).

However, no traces of a tumulus have been recorded above the Kettoshalom tomb. Conventionally, it is dated to the Bodrogkeresztur-period in east Hungary, shortly after 4000 cal. B.C., which would correspond very well with the suggested Cernavodă I (or its less known cultural equivalent in the Thracian plain) attribution for the “Gonova mogila” grave, a cultural background to which the Csongrád grave should have also belonged. Bodrogkeresztur and Cernavodă I periods are not the only examples of 4th millennium cal. B.C. tumuli and burials displaying this steppe connection. Indeed we can find this early steppe impact throughout the 4th millennium cal. B.C. These include adscriptions to the Horodiștea II (Corlateni-Dealul Stadole, grave I: Burtanescu 1998, p. 37; Holbocai, grave 34: Comșa 1998, p. 16); to Gordinești-Cernavodă II (Liești-Movila Arbănașu, grave 22: Brudiu 2000); to Gorodsk-Usatovo (Corlăteni Dealul Cetăţii, grave I: Comșa 1998, p. 17-18, in Romania; Durankulak, grave 982: Vajsov 2002, in Bulgaria); and to Cernavodă III (Golyama Detelina, tum. 4: Leshtakov, Borisov 1995), and early (end of 4th millennium cal. B.C.) Ezero in Ovchartsi, primary grave (Kalchev 1994, p. 134-138) and Golyama Detelina, tum. 2 (Kanchev 1991) in Bulgaria. Also the Boleráz and Baden tumuli of Banjevac-Tolisavac and Mokrin in the south Carpathian basin account for this, since one should perhaps take into account primary grave 12 of the Sárrétudvari-Orhalom tumulus in the Hungarian Alfold: a left-sided crouched juvenile (15-17 y) individual in an oval, NW-SE orientated grave pit 14C dated to 3350-3100 cal. B.C. at 2 sigma (Dani, Nepper 2006). Neither the burial custom (no ochre strewing or depositing a lump of ochre has been recorded), nor date account for its ascription to the Yamnaya!

All of these tumuli and burials demonstrate, though, that there is already a constant but perhaps low-level 4th millennium cal. B.C. steppe interaction, linking the regions of the north of the Black Sea with those of the west, and reaching deep into the Carpathian basin. This has to be acknowledged, even if these populations remain small, bounded to their steppe habitat with an economy adapted to this special environment, and are not always visible in the record. Indirect hints may help in seeing them, such as the frequent occurrence of horse bones, regarded as deriving from domesticated horses, in Hungarian Baden settlements (Bokonyi 1978; Benecke 1998), and in those of the south German Cham Culture (Matuschik 1999, p. 80-82) and the east German Bernburg Culture (Becker 1999; Benecke 1999). These occur, however, always in low numbers, perhaps not enough to maintain and regenerate a herd. Does this point us towards otherwise archaeologically hidden horsebreeders in the Carpathian basin, before the Yamnaya? In any case, I hope to make one case clear: these are by no means Yamnaya burials in the strict definition! Attribution to the Yamnaya in its strict definition applies.

Distribution of Pit-Grave burials west of the Black Sea likely dating to the 2nd half of the 4th millennium BC (triangles: side-crouched burials; filled circles: supine extended burials; open circles: suspected). In Alin Frînculeasa, Bianca Preda, Volker Heyd, Pit-Graves, Yamnaya and Kurgans along the Lower Danube.

Also, about the expansion of Yamna settlers along the steppes:

However, it should have been made clear by the distribution map of the Western Yamnaya that they were confining themselves solely to their own, well-known, steppe habitat and therefore not occupying, or pushing away and expelling, the locally settled farming societies. Also, living solely in the steppes requires another lifestyle, and quite different economic and social bases, most likely very different to the established farming societies. Although surely regarded as incoming strangers, they may therefore not have been seen as direct competitors. This argument can be further enforced when remembering that the lowlands and the steppes in the southeast of Europe had already been populated throughout the 4th millennium cal. B.C., as demonstrated above, by societies with a similar north-Pontic steppe origin and tradition, albeit in lower numbers. It is only for these groups that the Yamnaya may have become a threat, but their common origin and perhaps a similar economic/ social background with comparable lifestyles would surely have assisted to allow rapid assimilation. More important, though, is that farming societies in this region may therefore have been accustomed to dealing and interacting with different people and ethnic strangers for a long time. (…)

When assessing farming and steppe societies’ interaction from a general point of view, attitudes can diverge in three main directions:

  1. the violent one; with raids, fights, struggles, warfare, suppression and finally the superiority and exploitation of the one over the other;
  2. the peaceful one; with a continuous exchange of gifts, goods, work, information and genes in a balanced reciprocal system, leading eventually to the merging of the two societies and creation of a new identity;
  3. the neutral one; with the two societies ignoring each other for a long time.

What we see from trying to understand the record of the Yamnaya, based on their tumuli and burials, and the local and neighbouring contemporary societies, based on their settlements, hoards, and graves, is likely a mixture of all three scenarios, with the balance perhaps more towards exchange in a highly dynamic system with alterations over time. However, violence and raids cannot be ruled out; they would be difficult to see in the archaeological record; or only indirectly, such as the building of hill forts, particularly the defence-like chain of Vucedol hillforts along the south shore of the Danube on the Serbian/Croatian border zone (Tasic 1995a), and the retreat of people into them (Falkenstein 1998, p. 261-262), with other interpretations also possible. And finally, we are dealing here with very different local and neighbouring societies, as well as with more distant contemporary ones, looking, in reality, rather like a chequer board of societies and archaeological cultures (see Parzinger 1993 for the overview). These display different regional backgrounds and traditions leading to different social and settlement organizations, different economic bases and material cultures in the wide areas between Prut and Maritza rivers, and Black Sea and Tisza river. They surely found their individual way of responding to the incoming and settling Yamnaya people.

Yamnaya tumuli signalling the expansion of West Yamna from ca. 3100 BC (especially after ca. 2950 BC). Heyd (2011).

The best data we have about this potential non-Yamna origin of R1b-L51 – and thus in favour of its admixture in the Carpathian basin – lies in:

  1. The majority of R1b-Z2103 subclades found to date among Yamna samples.
  2. The presence of R1b-Z2103 in the Catacomb culture – in the Northern Caucasus and in Ukraine.
  3. The limited presence of (ancient and modern) R1b-L51 in eastern Europe and India, whose isolated finds are commonly (and simplistically) attributed to ‘late migrations’.
  4. The presence of R1b-L51 (xZ2103) in cultures related to the ‘Yamna package’, but supposedly not to Yamna settlers. For example, I7043, of haplogroup R1b-L151(xU106,xP312), ca. 2500-2200 BC from Szigetszentmiklós-Üdülősor, probably from the Bell Beaker (Csepel group), but maybe from the early Nagyrév culture.
  5. The expansion of its subclades apparently only from a single region, around the Carpathian basin, in contrast to R1b-Z2103.
  6. The already ‘diluted’ steppe admixture found in the earliest samples with respect to Yamna, which points to its appearance only after Yamna settlers had admixed with the local population.
  7. Ukrainian archaeologists (in contrast to their Russian colleagues) point to the relevance of North Pontic cultures like Kvitjana and Lower Mikhailovka in the development of Early Yamna in the west, and some eastern European researchers also believe in this similarity.
  8. If R1b-Z2103 and R1b-L51 had expanded with Suvorovo-Novodanilovka migrants to the west, and had admixed later as Hungary_LCA-LBA-like peoples with Yamna migrants during the long-term contacts with other ‘kurganized cultures’ ca. 2900-2500 BC in the Great Hungarian Plains, it could explain some peculiar linguistic traits of North-West Indo-European, and also why R1b-Z2103 appears in cultures associated with this earlier ‘steppe influence’ (i.e. not directly related to Yamna) such as Vučedol (with a R1b-Z2103 sample, see below). That could also explain the presence of R1b-L151(xP312, xU106) in similar Balkan cultures, possibly not directly related to Yamna.
Image modified from Wang et al. (2018). PCA of ancient and modern samples. Red circle in dashed line around Varna, Greece Neolithic, and (approximate position of) Smyadovo outliers, part of Khvalynsk-Novodanilovka settlers.

A hidden group among north or west Pontic Eneolithic steppe cultures?

The expansion of Khvalynsk as Novodanilovka into the North Pontic area happened through the south across the steppe, near the coast, with the forest-steppe region working as a clear natural border for this culture of likely horse-riding chieftains, whose economy was probably based on some rudimentary form of mobile pastoralism.

Although archaeologists are divided as to the origin of each individual Middle Eneolithic group near the Black Sea after the end of the Khvalynsk-Novodanilovka period, it seems more or less clear that steppe cultures like Cernavodă, Lower Mikhailovka, or Kvitjana are closer (or “more archaic”) in their steppe features, which connects them to Volga–Ural and Northern Caucasus cultures, like Northern Caucasus, Repin or Khvalynsk.

On the other hand, forest-steppe cultures like Dereivka (including Alexandria) show innovative traits and contacts with para- or sub-Neolithic cultures to the north, like Comb-Pit Ware groups, apart from corded decoration influenced by Trypillian groups to the west, especially in their later (‘Proto-Corded Ware‘) stage after ca. 3500 BC.

If Ukrainian researchers like Rassamakin are right, Early Yamna expanded not only from Repin settlers, but also from local steppe cultures adopting Repin traits to develop an Early Yamna culture, similar to how eastern (Volga–Ural) groups seem to have synchronously adopted Early Yamna without a massive influx of Repin settlers.

Furthermore, local traits develop in southern groups, like anthropomorphic stelae (shared with Kemi-Oba, direct heir of Lower Mikhailovka), and rich burials featuring wagons. These traits are seen in west Yamna settlers.

Modified from Rassamakin (1999), adding red color to Repin expansion. The system of the latest Eneolithic Pontic cultures and the sites of the Zhivotilovo-Volchanskoe type: 1) Volchanskoe; 2) Zhivotilovka; 3) Vishnevatoe; 4) Koisug.

Problems of this model include:

  1. In the North Pontic area – in contrast to the Volga–Ural region – there was a clear “colonization” wave of Repin settlers, also supported by Ukrainian researchers, based on the number of new settlements and burials, and on the progressive retreat of Dereivka, Kvitjana, as well as (more recent) Maykop- and Trypillia-related groups from the North Pontic area ca. 3350/3300 BC. It seems unlikely that these expansionist, semi-nomadic, cattle-breeding, patrilineally-related steppe clans that were driving all native populations out of their territories suddenly decided, at some point during their spread into the North Pontic area ca. 3300-3100 BC, to join forces with some foreign male lineages from the area, and then continue their expansion to the west…
  2. Similar to the fate of R1b-P297 subclades in the Baltic after the expansion of Corded Ware migrants, previous haplogroups of the North Pontic region – such as R1a, R1b-V88, and I2 subclades – basically disappeared from the ancient DNA record after the expansion of Khvalynsk-Novodanilovka, and then after the expansion of Yamna, as is clear from Yamna, Afanasevo, and Bell Beaker samples obtained to date. This, in combination with what we know about Y-chromosome bottlenecks in post-Neolithic expansions, leaves little room to think that a big enough territorial group with a majority of “native” haplogroups could survive later expansions (be it R1b-L51 or R1a-Z645).
  3. Supporting an expansion of the same male (and partly female) population, the Yamna admixture from east to west is quite homogeneous, with the only difference found in (non-significant) EEF-like proportion which becomes elevated in distant areas [apart from significant ‘southern’ contribution to certain outlier samples]. Based on the also homogeneous Y-DNA picture, the heterogeneity must come, in general, from the female exogamy practiced by expanding groups.
  4. There is a short period, spanning some centuries (approximately 3300-2700 BC), in which the North Pontic area – especially the forest-steppe territories to the west of the Dnieper, i.e. the Upper Dniester, Boh, and Prut-Siret areas – is a chaos of incoming and emigrating, expanding and shrinking groups of different cultures, such as late Trypillian groups, Maykop-related traits, TRB, GAC, (Proto-)Corded Ware, and Early Yamna settlements. No natural geographic frontier can be delimited between these groups, which probably interacted in different ways. Nevertheless, based on their cultural traits, admixture, and especially on their Y-DNA, it seems that they never incorporated foreign male lineages, beyond those they probably had during their initial expansion trends.
  5. The further expansionist waves of Early Yamna seen ca. 3100 BC, from the Danube Delta to the west, give an overall image of continuously expanding patrilineal clans of R1b-M269 subclades since the Khvalynsk-Novodanilovka migration, in different periodic steps, mostly from eastern Pontic-Caspian nuclei, usually overriding all encountered cultures and (especially male) populations, rather than showing long-term collaboration and interaction. Such interaction is seen only in exceptional cases, e.g. the long-term admixture between Abashevo and Poltavka, as seen in Proto-Indo-Iranian peoples and their language.
Image modified from Wang et al. (2018). PCA of ancient and modern samples. Arrows depicting Khvalynsk -> Yamna drift (blue), and hypothetic approximate Ukraine Eneolithic -> Yamna drift accompanying R1b-L51 (red).


We are living right now through an exemplary ego-, (ethno-)nationalism-, and/or supremacy-deflating moment for some individuals of eastern and northern European descent who believed that R1a or ‘steppe ancestry proportions’ meant something special. The same can be said about those who had internalized some social or ethnolinguistic meaning for the origin of R1b in western Europe, N1c in north-eastern Europe, as well as Greeks, Iranians, Armenians, or Mediterranean peoples in general of ‘Near Eastern’ ancestry or haplogroups, or peoples of Near Eastern origin and/or language.

These people had linked their haplogroups or ancestry with some fantasy continuity of ‘their’ ancestral populations to ‘their’ territories or languages (or both), and all are being proven wrong.

Apart from teaching such people a lesson about what simplistic views are useful for – whether they are based on ABO or RH group, white skin, blond hair, blue eyes, lactase persistence, or on one’s own ancestry or Y-DNA haplogroup – it teaches the rest of us what can happen in the near future among western Europeans. Because, until recently, most western Europeans were comfortably settled thinking that our ancestors were some remnant population from an older, Palaeolithic or Mesolithic population, who acquired Indo-European languages by way of cultural diffusion in different periods, including only minor migrations.

Judging by what we can see now among some individuals of Northern and Eastern European descent, the only thing that can worsen the air of superiority among western Europeans is when they realize (within a few years, when all these stupid battles to control the narrative fade) that not only are they the cultural ‘heirs’ of the Graeco-Roman tradition that began with the Roman Empire, but that most of them are the direct patrilineal descendants of Khvalynsk, Yamna, Bell Beaker, and European Bronze Age peoples, and thus direct descendants of Middle PIE, Late PIE, and NWIE speakers.

Steppe-related migrations ca. 3100-2600 BC with tentative linguistic identification.

The finding of R1b-L51 and R1b-Z2103 among expanding Suvorovo-Novodanilovka chieftains, with pockets of R1b-L51 remaining in steppe-like societies of the Balkans and the Carpathian Basin, would have beautifully complemented what we know about the East Yamna admixture with R1a-Z93 subclades (Uralic speakers) ca. 2600-2100 BC to form Proto-Indo-Iranian, and about the regional admixtures seen in the Balkans, e.g. in Proto-Greeks, with the prevalent J subclades of the region.

It would have meant an end to any modern culture or nation identifying themselves with the ‘true’ Late PIE and Yamna heirs, because these would be exclusively associated with the expansion of R1b-Z2103 subclades with late Repin, and later as the full-fledged Late PIE with Yamna settlers to south-east and central Europe, and to the southern Urals. The language would then obviously have undergone different changes in all these territories through long-lasting admixture with other populations. In that sense, it would have put an end to ideas of supremacy in western Europe before they even began.

The most likely future

However limited the evidence, it seems that R1b-L51 did expand with Yamna, based on the estimates for the haplogroups involved, and on marginal hints at the variability of L23 subclades within Yamna and neighbouring populations. If R1b-L51 expanded with West Repin / Early Yamna settlers, these are the likely reasons why it has not yet been found among Yamna samples:

Simplified map of Repin expansions from ca. 3500/3400 BC.
  • The subclade division of Yamna settlers need not be 50:50 for L51:Z2103, either in time or in space. I think this is the simplistic view underlying many thoughts on this matter. Many different expanding patrilineal clans of L23 subclades may have been more or less successful in different areas, and non-Z2103 may have been in the minority, or more isolated relative to Z2103-clans among expanding peoples on the steppe, especially in the east. In fact, we usually talk in terms of “Z2103 vs. L51” as if
    1. these two were the only L23 subclades; and
    2. both had split and succeeded (expanding) synchronously;

    that is, as if there had not been multiple subclades of both haplogroups, and as if there had not been different expansion waves for hundreds of years stemming from different evolving nuclei, involving each time only limited (successful) clans. Many different subclades of haplogroups L23 (xZ2103, xL51), Z2103, and L51 must have been unsuccessful during the ca. 1,500 years of late Khvalynsk and late Repin-Early Yamna expansions in which they must have participated (approximately 60-75 generations, based on a mean generation length of 20-25 years).

  • If we want to imagine a pocket of ‘hidden’ L51 for some region of the North Pontic or Carpathian region, the same can be imagined – and much more likely – for any unsampled territory of expanding late Repin/Early Yamna settlers from the Lower Don – Lower Volga region (probably already a mixed society of L51 and Z2103 subclades since their beginning, as the early Repin culture, ca. 3800 BC), with L51 clans being probably successful to the west.
  • The Repin culture expanded only in small, mobile settlements from the Lower Don – Lower Volga to the north, east, and south, starting ca. 3500/3400 BC, in the waves that eventually gave a rather early distant offshoot in the Altai region, i.e. Afanasevo. Starting ca. 3300 BC in the archaeological record, the majority of R1b-Z2103 subclades found to date in Afanasevo also supports either
    • a mixed Repin society, with Z2103-clans predominating among eastern settlers; or
    • a Repin society marked by haplogroup L51, and thus a cultural diffusion of late Repin/Early Yamna traits among neighbouring (Khvalynsk, Samara, etc.) groups of essentially the same (early Khvalynsk-Novodanilovka) genetic stock in the Volga–Ural region.

    Both options could justify a majority of Z2103 in the Lower Volga–Ural region, with the latter being supported by the scattered archaeological remains of late Repin in the region before the synchronous emergence of Early Yamna findings in the whole Pontic-Caspian steppe.

  • Most Z2103 from Yamna samples to date are from around 3100 BC (on average) onward, and from the right bank of the Lower Don to the east, particularly from the Lower Volga–Ural area (especially the Samara region), which – based on the center of expansion of late Repin settlers – may be depicting an artificially high Z2103-distribution of the whole Yamna community.
Repin expansion into the Volga–Ural region from ca. 3500/3400 BC. Map made by me based on maps and data from Morgunova (2014, 2016). Lopatino is marked with number 64.
  • Yamna sample I0443, R1b-L23 (Y410+, L51-), ca. 3300-2700 BCE from Lopatino II, points to an intermediate subclade between L23 and L51, near one of the supposed late Repin sites (based on kurgan burials with late Repin cultural traits) in the Samara region.
  • Other Balkan cultures potentially unrelated to the Yamna expansion also show Z2103 (and not only L51) subclades, like I3499 (ca. 2884-2666 calBC), of the Vučedol culture, from Beli Manastir-Popova zemlja, which points to the infiltration of Yamna peoples in other cultures. In any case, the appearance of R1b-L23 subclades in the region happens only after the Yamna expansion ca. 3100 BC, probably through intrusions into different neighbouring regions, if these Balkan cultures are not directly derived from Yamna settlements (which is probably the case of the Csepel Bell Beaker or early Nagyrév sample, see above).
  • The diversity of haplogroups found in or around the Carpathian Basin in Late Chalcolithic / Early Bronze Age samples, including L151(xP312, xU106), P312, U106, Z2103, makes it the most likely sink of Yamna settlers, who spread thus with expanding family clans of different R1b-L23 subclades.
  • Even though some Yamna vanguard groups are known to have expanded up to Saxony-Anhalt before ca. 2700 BC, haplogroup Z2103 seems to be restricted to more eastern regions, which suggests that R1b-L51 was already successful among expanding West Yamna clans in Hungary, which gave rise only later to expanding East Bell Beakers (overwhelmingly of L151 subclades). The source of R1b-L51 and L151 expansion over Z2103 must lie therefore in the West Yamna period, and not in the Bell Beaker expansion.
Yamna migrants ca. 3300-2600 BC. Most likely site of admixture with GAC circled in red.
  • The R1b-Z2103 found in Poltavka, Catacomb, and to the south points to a late migration displacing the western R1b-L51, only after the late Repin expansion. This is also seen in the steppe ancestry and R1b-Z2103 south of the Caucasus, in Hajji Firuz, which points to this route as a potential source of the supposed “Earliest Proto-Indo-Iranian” (the mariannu term) of the Near East. A similar replacement event happened some centuries later with expanding R1a-Z93 subclades from the east wiping out haplogroup R1b-Z2103 from the Pontic-Caspian steppe.
  • Many ancient samples from Khvalynsk, Northern Caucasus, Yamna, or later ones are reported simply as R1b-M269 or L23, without a clear subclade, so the simplistic ‘Yamna–Z2103’ picture is not real: if one takes into account that Z2103 might have been successful quite early in the eastern region, it is more likely to obtain a successful Y-SNP call of a Z2103 subclade in the Volga–Ural region than a xZ2103 one.
  • There are some modern samples of R1b-L51 in eastern Europe and Asia, whose common simplistic attribution to “late expansions” is usually not substantiated; and also ancient R1b-L51 samples might be confirmed soon for Asia.
  • ‘Western’ features described by archaeologists for West Yamna settlers, associated with Kemi Oba and southern Yamna groups in the North Pontic area – like rich burials with anthropomorphic stelae and wagons – are actually absent in burials from settlers beyond Bulgaria, which does not support their affiliation with these local steppe groups of the Black Sea. Also, a mix with local traditions is seen across all Early Yamna groups of the Pontic-Caspian steppe, and still genetics and common cultural traits point to their homogenization under the same patrilineal clans expanding continuously for centuries. The maintenance of local traditions (as evidenced by East Bell Beakers in Iberia related to Iberian Proto-Beakers) is often not a useful argument in genetics, especially when the female population is not replaced.
Yamna settlers in the Great Pannonian Plain, showing only kurgans of Hungary ca. 2950-2500 BC. Yamna Hungary was one of the biggest West Yamna provinces. From Horváth et al. (2013).


This is what we know, using linguistics, archaeology, and genetics:

  • Middle Proto-Indo-European expanded with Khvalynsk-Novodanilovka after ca. 4800 BC, with the first Suvorovo settlements dated ca. 4600 BC.
  • Archaic Late Proto-Indo-European expanded with late Repin (or Volga–Ural settlers related to Khvalynsk, influenced by the Repin expansion) into Afanasevo ca. 3500/3400 BC.
  • Late Proto-Indo-European expanded with Early Yamna settlers to the west into central Europe and the Balkans ca. 3100 BC; and also to the east (as Pre-Proto-Indo-Iranian) into the southern Urals ca. 2600 BC.
  • North-West Indo-European expanded with Yamna Hungary -> East Bell Beakers, from ca. 2500 BC.
  • Proto-Indo-Iranian expanded with Sintashta, Potapovka, and later Andronovo and Srubna from ca. 2100 BC.

It seems that the subclades from Khvalynsk ca. 4250-4000 BC were wrongly reported – like those of Narasimhan et al. (2018). However, even if they are real and YFull estimates have to be revised, and even if the split had happened before the expansion of Suvorovo-Novodanilovka, the most likely origin of R1b-L51 among Bell Beakers will still be the expansion of late Repin / Early Yamna settlers, and that is what ancient DNA samples will most likely show, whatever the social or political consequences.

The only relevance of the finding of R1b-L51 in one place or another – especially if it is found to be a remnant of a Middle PIE expansion coupled with centuries of admixture and interaction in the Carpathian Basin – is the potential influence of an archaic PIE (or non-IE) layer on the development of North-West Indo-European in Yamna Hungary -> East Bell Beaker. That is, more or less like the Uralic influence related to the appearance of R1a-Z93 among Proto-Indo-Iranians, of R1a-Z284 among Pre-Germanic peoples, and of R1a-Z282 among Balto-Slavic peoples.

I think there is little that ancient DNA samples from West Yamna could add to what we know in general terms of archaeology or linguistics at this point regarding Late PIE migrations, beyond many interesting details. I am sure that those who have not attributed some random 6,000-year-old paternal ancestor any magical (ethnic or nationalist) meaning are just having fun, enjoying more and more the precise data we have now on European prehistoric populations.

As for those who believe in magical consequences of genetic studies, I don’t think there is anything for them in this quest beyond the artificially created grand-daddy issues. And, funnily enough, those who played (and play) the ‘neutrality’ card to feel superior in front of others – the “I only care about the truth”-type of lie, while secretly longing for grandpa’s ethnolinguistic continuity – are suffering the hardest fall.


Before steppe ancestry: Europe’s genetic diversity shaped mainly by local processes, with varied sources and proportions of hunter-gatherer ancestry


The definitive publication of a BioRxiv preprint article, in Nature: Parallel palaeogenomic transects reveal complex genetic history of early European farmers, by Lipson et al. (2017).

The dataset with all new samples is available at the Reich Lab’s website. You can try my drafts on how to do your own PCA and ADMIXTURE analysis with some of their new datasets.


Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.

There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.

This study further confirms the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, which offers strong reasons to reject the “Indo-European from the west” hypothesis.

Below are, first, the PCA of the samples included in this paper, and then the PCAs of ancient Eurasians (Mathieson et al. 2017) and modern populations (Lazaridis et al. 2014) for comparison of similar clusters:

First two principal components from the PCA. We computed the principal components (PCs) for a set of 782 present-day western Eurasian individuals genotyped on the Affymetrix Human Origins array (background grey points) and then projected ancient individuals onto these axes. A close-up omitting the present-day Bedouin population is shown. From Lipson et al. (2017).
PCA of South-East European and other European samples from Mathieson et al. (2017)
Ancient and modern samples on Lazaridis et al. (2014)
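
The approach described in the first caption (compute the PCs on present-day individuals only, then project ancient individuals onto those axes) can be illustrated with a minimal sketch. This is not the pipeline used in the paper: the genotype matrices here are random synthetic numbers, the function names `fit_pca` and `project` are made up for the example, and missing genotypes (a major issue with real ancient DNA) are ignored entirely.

```python
import numpy as np

def fit_pca(reference, n_components=2):
    """Compute PCA axes from the reference (present-day) genotypes only."""
    mean = reference.mean(axis=0)
    centered = reference - mean
    # SVD of the centered matrix: the rows of vt are the principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(samples, mean, axes):
    """Project samples onto axes that were fitted on the reference set."""
    return (samples - mean) @ axes.T

# Synthetic genotype matrices (individuals x SNPs, coded 0/1/2).
# These are random placeholders, NOT real data.
rng = np.random.default_rng(0)
modern = rng.integers(0, 3, size=(50, 200)).astype(float)
ancient = rng.integers(0, 3, size=(5, 200)).astype(float)

mean, axes = fit_pca(modern)
modern_pcs = project(modern, mean, axes)    # background points
ancient_pcs = project(ancient, mean, axes)  # projected, not used to fit
```

The point of the design is that the ancient samples never influence the axes: they are only placed in a space defined by modern variation, which is why their positions should be read relative to the modern clusters rather than as an independent analysis.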


Ancient DNA samples from Mesolithic Scandinavia show east-west genetic gradient


New pre-print article at BioRxiv, Genomics of Mesolithic Scandinavia reveal colonization routes and high-latitude adaptation, by Günther et al. (2017), from Uppsala University (group led by Mattias Jakobsson).

Abstract (emphasis mine):

Scandinavia was one of the last geographic areas in Europe to become habitable for humans after the last glaciation. However, the origin(s) of the first colonizers and their migration routes remain unclear. We sequenced the genomes, up to 57x coverage, of seven hunter-gatherers excavated across Scandinavia and dated to 9,500-6,000 years before present. Surprisingly, among the Scandinavian Mesolithic individuals, the genetic data display an east-west genetic gradient that opposes the pattern seen in other parts of Mesolithic Europe. This result suggests that Scandinavia was initially colonized following two different routes: one from the south, the other from the northeast. The latter followed the ice-free Norwegian north Atlantic coast, along which novel and advanced pressure-blade stone-tool techniques may have spread. These two groups met and mixed in Scandinavia, creating a genetically diverse population, which shows patterns of genetic adaptation to high latitude environments. These adaptations include high frequencies of low pigmentation variants and a gene-region associated with physical performance, which shows strong continuity into modern-day northern Europeans. Finally, we were able to compute a 3D facial reconstruction of a Mesolithic woman from her high-coverage genome, giving a glimpse into an individual’s physical appearance in the Mesolithic.

Of particular interest is the genetic similarity found with Baltic hunter-gatherers from Zvejnieki:

To investigate the postglacial colonization of Scandinavia, we explored four hypothetical migration routes (primarily based on natural geography) linked to WHGs and EHGs, respectively (Supplementary Information 11); a) a migration of WHGs from the south, b) a migration of EHGs from the east across the Baltic Sea, c) a migration of EHGs from the east and along the north-Atlantic coast, d) a migration of EHGs from the east and south of the Baltic Sea, and combinations of these four migration routes.
The SHGs from northern and western Scandinavia show a distinct and significantly stronger affinity to the EHGs compared to the central and eastern SHGs (Fig. 1). Conversely, the SHGs from eastern and central Scandinavia were genetically more similar to WHGs compared to the northern and western SHGs (Fig. 1). Using a model-based approach (15, 16), the EHG genetic component of northern and western SHGs was estimated to 55% on average (43-67%) and significantly different (Wilcoxon test, p=0.014) from the average 35% (22-44%) in eastern and south-central SHGs. This average is similar to eastern Baltic hunter-gatherers from Latvia (28) (average 33%, Fig. 1A, Supplementary Information 6). These patterns of genetic affinity within SHGs are in direct contrast to the expectation based on geographic proximity with EHGs and WHGs and do not correlate with age of the sample.
Combining these isotopic results with the patterns of genetic variation, we suggest an initial colonization from the south, likely by WHGs. A second migration of people who were related to the EHGs – that brought the new pressure blade technique to Scandinavia and that utilized the rich Atlantic coastal marine resources – entered from the northeast moving southwards along the ice-free Atlantic coast where they encountered WHG groups. The admixture between the two colonizing groups created the observed pattern of a substantial EHG component in the northern and the western SHGs, contrary to the higher levels of WHG genetic component in eastern and central SHGs (Fig. 1, Supplementary Information 11).

The same article reports Y-DNA for three samples, all of haplogroup I2 (one more specifically I2a1b). As for mtDNA, there are four samples of U5a1 (two of them U5a1d), two of U4a1, and one of U4a2.

Featured image: potential migration routes, taken from the supplementary material.