ASoSaH Reread (I): Y-DNA haplogroups among Indo-Europeans (apart from R1b-L23)

Given my reduced free time in these months, I have decided to keep updating the text on Indo-European and Uralic migrations and/or this blog, simultaneously or alternatively, to make the most out of the time I can dedicate to this. I will add the different ‘A Song of Sheep and Horses (ASoSaH) reread’ posts to the original post announcing the books. I would be especially interested in comments and corrections to the book chapters rather than the posts, but any comments are welcome (including in the forum, where comments are more likely to stick).

This is mainly a reread of iv.2. Indo-Anatolians and vi.1. Disintegrating Indo-Europeans.

Indo-Anatolians and Late Indo-Europeans

I have often written about R1b-L23 as the majority haplogroup among Late Proto-Indo-Europeans (see my predictions for 2018 and my summary of 2018), but always expected other haplogroups to pop up somewhere along the way, in Khvalynsk, in Repin, in Yamna, and in Bell Beakers (see e.g. the post on common fallacies of R1a/IE-fans).

Luckily enough – for those of us who want precise answers to our previous infinite models of Indo-European language expansions (viz. GAC-associated expansion, IE-speaking Old Europe, Anatolian homeland, Iran homeland, Maykop as Proto-Anatolian, Palaeolithic Continuity Theory, Celtic in the Atlantic façade, etc.) – the situation has been more clear-cut than expected: it turns out that, especially during population expansions, acute Y-chromosome bottlenecks were very common in the past, at least until the Iron Age.

Khvalynsk and Repin-Yamna expansions were no different, and that seems quite natural in hindsight, given the strong familial ties and aversion to foreigners typical of Late Proto-Indo-European society and culture – probably not really that different from other contemporary societies, like the neighbouring Late Proto-Uralic or Trypillian ones.

Y-DNA samples from Khvalynsk and neighbouring cultures. See full version here.

Y-DNA haplogroups

During the expansion of early Khvalynsk, the most likely Indo-Anatolian culture, the society of the Don-Volga area was probably made up of different lineages including R1b-V1636, R1b-M269, R1a-YP1272, Q1a-M25, and I2a-L699 (and possibly some R1b-V88?), a variability possibly greater than that of the contemporary north Pontic area, probably a sign of this region being a sink of different east and west migrations from steppe and forest areas.

During its expansion, the Khvalynsk society saw its haplogroup variability reduced, as evidenced by the succeeding expansive Repin culture:

Afanasevo, representing Pre-Tocharian (the earliest Late PIE dialect to branch off), expanded with R1b-L23 – especially R1b-Z2103 – lineages, while early Yamna expanded with R1b-L23 and I2a-L699 lineages, which suggests that these are the main haplogroups that survived the Y-DNA bottleneck undergone during the Khvalynsk expansion, and especially later during the late Repin expansion. Nevertheless, other old haplogroups might still pop up during the Repin and early Yamna period, such as the R1b-V1636 sample from Yamna in the Northern Caucasus.

It is still unclear whether the R1b-L23 sister clade R1b-PF7562 (formed ca. 4400 BC, TMRCA ca. 3400 BC), prevalent among modern Albanians, expanded with Yamna migrants, or whether it was part of an earlier expansion of R1b-M269 into the Balkans, and thus represents Indo-Anatolian speakers who later hitchhiked on the expansion of the Late PIE language from the north or west Pontic area. The early TMRCA seems to suggest an association with Repin (and therefore Yamna), rather than later movements in the Balkans.

Y-DNA samples from Yamnaya and neighbouring cultures. See full version here.

‘Yamnaya’ or ‘steppe’ ancestry?

After the early years when population genetics relied mainly on modern Y-DNA haplogroups, geneticists and amateurs have recently been playing around with testing “ancestry percentages”, based on newly developed free statistical tools, which obviously offer just one among the many types of data needed to achieve a proper interpretation of the past.

Today we have quite a lot of Y-DNA haplogroups reported for ancient samples of more recent prehistoric periods, and they seem to offer (at least since the 2015 papers, but more evidently since the 2018 papers on Bell Beakers and Europeans, Corded Ware, or Fennoscandia among others) the most straightforward interpretation of all results published in population genomics research.

NOTE. The finding of a specific type of ancestry in one isolated 40,000-year-old sample from Tianyuan can offer very interesting information on potential population movements to the region. However, identifying ethnolinguistic communities and their migrations among neighbouring Neolithic or Bronze Age groups is evidently not that simple.

Yamnaya (Indo-European peoples) and their evolution in the steppes, together with North Pontic (eventually Uralic) peoples. Notice how little Indo-European ancestry changes from Khvalynsk (Indo-Anatolian) to Yamna Hungary (North-West Indo-Europeans). Image modified from Wang et al. (2018). See more on the evolution of “steppe ancestry”.

It is becoming more and more clear with each paper that the true “Yamnaya ancestry” – not the originally described one – was in fact associated with Indo-Europeans (see more on the very Yamnaya-like Yamna Hungary and early East Bell Beaker R1b samples, all of quite similar ancestry and PCA cluster before their further admixture with EEF- and CWC-like groups).

The so-called “steppe ancestry”, on the other hand, reflects the contribution of a Northern Caucasus-related ancestry to expanding Khvalynsk settlers, who spread through the steppes more than a thousand years before the expansion of Late Proto-Indo-Europeans with late Repin, and can thus be found among different groups related to the Pontic-Caspian steppes (see more on the emergence and evolution of “steppe ancestry”).

In fact, after the Yamna/Indo-European and Corded Ware/Uralic expansions, it is more likely to find “steppe ancestry” to the north and east in territories traditionally associated with Uralic languages, whereas to the south and west – i.e. in territories traditionally associated with Indo-European languages – it is more likely to find “EEF ancestry” with diminished “steppe ancestry”, among peoples patrilineally descended from Yamna settlers.

Y-DNA haplogroups, the only uniparental markers (see exceptions in mtDNA inheritance) – unlike ancestry percentages based on the comparison of a few samples and flawed study designs – do not admix, do not change, and therefore they do not lend themselves to infinite pet theories (see e.g. what David Reich has to say about R1b-P312 in Iberia directly derived from Yamna migrants in spite of their predominant EEF ancestry): their cultural continuity can only be challenged with carefully threaded linguistic, archaeological, and genetic data.


16 thoughts on “ASoSaH Reread (I): Y-DNA haplogroups among Indo-Europeans (apart from R1b-L23)”

    1. I don’t know if Koch has ever been interested in a Satem group, or if this is just another expendable detail of the theory to justify their “Celtic of the Atlantic façade” hypothesis.

      I think this presentation shows the same attitude we are seeing everywhere: academics trying to use ancestry components to support their previous ideas. It’s what I have been saying about Kristiansen as the lead archaeologist of the Copenhagen group (and some linguists who have joined them), but repeated everywhere:

      The obvious reasoning goes: if Indo-European-Corded Ware supporters do it… why not us?

      I haven’t seen any presentation from the Palaeolithic Continuity Theory group, but I guess they will eventually come up with something similar, arguing that continuity of hunter-gatherer ancestry in certain regions proves what they supported: e.g. that Neolithic ancestry admixed with hunter-gatherer ancestry in the Mediterranean, hence showing continuity in Iberia, France, and the British Isles; and then Chalcolithic ancestry continuing in Iberian, Irish and Atlantic Bell Beakers… hence Celtic was always spoken in Iberia since the Palaeolithic.

      And the problem is, such reactionary proposals require zero effort, but other academics are supposed to answer them or take them into account…

      This is a recent link shared by Razib Khan on another twist of the OIT, with many many details to answer…if anyone had time for that:

      I can’t wait until Spanish archaeologists catch up with ancestry percentages and release their theories of how Iberian Bell Beakers traveled east and turned into East Bell Beakers, expanding with EEF ancestry from Lisbon to Hungary…

      All these are obvious reactionary views, but disguised as new by including population genomics (especially ancestry percentages) supposedly supporting their claims.

      Population genomics not as the solution to problems, but just as another weapon to add to linguistics and archaeology, to continue the infinite pet theories that we had before. The nightmare of any scientist.

  1. 18-20% of PF7562 among Albanians isn’t correct, the actual number is closer to 5%. 18-20% is total R1b among Albanians, of which the majority (>2/3) belongs to a fairly young subclade, Z2103>BY611>Z2705.

  2. I have just read your book, A Song of Sheep and Horses, a truly fascinating work. You do not lump language families together, instead going for smaller groupings, which is refreshing. I must ask, what does “e” stand for in your sheep and horse reconstructions? Is it a mid vowel, or is it just used for tradition’s sake?
    If it is the former, I recommend you read Pulleyblank’s thesis on a vertical vowel system of [ə] and [a] for PIE.

    Maybe when Early PIE moved westwards into the Pontic Steppe, the vowel system was transformed (Caucasian influence?) from a large one to a vertical system. [u] and [i] might have diphthongized to [əu] and [əi]. If PIE was transformed under Caucasian influence, I cannot help but think that the Indo-Uralic vowel system was more like that of Uralic.

    If Yukaghir is included, then Uralic seems to have simplified things in so far as there is solely a height distinction in the second syllable. In many places where Uralic has CVCi and CVCa, Yukaghir has CVCø. My theory is that Indo-Uralic had a height and back distinction in the second syllable, with Uralic merging [i] with [y], [a] with [o].

    1. Thank you, the idea was to offer a concise summary of some Eurasian languages and their contacts.

      “e” stands for a mid vowel, as traditionally reconstructed; I haven’t had much interest in questioning the vocalic nature and their evolution in the different PIE stages, though. By Pulleyblank’s thesis you mean ?

      Yes, I assume the direction was east to west for Indo-Uralic at some point during the Late Mesolithic-Neolithic. The Caucasian influence supported by Kortlandt and Bomhard makes sense, given the archaeological and genetic data. And yes, in principle Uralic would be less affected, because its speakers must have roamed in the forest-steppe and forested areas without much contact with Caucasian languages.

      The problem with Yukaghir is the late date of attestation. In fact, it is usually interpreted as belonging to a Uralo-Yukaghir group, which makes the Proto-Yukaghir reconstruction biased to some extent. Morphologically it is difficult to assert anything about the relationship with either Uralic or Indo-European. I think late contacts with Uralic (and with Altaic languages) might have confused things to some extent, especially in phonology.

      Anyway, as you can see I didn’t dedicate enough time to sum (and tidy) things up and offer some tentative phonological sketch of Indo-Uralic; the vocalic system was left fully unexplained. I hope I have enough time to revise it properly.
