Another “Pre-Yamnaya” sample from the Northern Caucasus?

I have updated the Ancient DNA Dataset, including a lot of new information from – among other sources – the latest version of Reich Lab curated Dataset, now renamed Allen Ancient DNA Resource (AADR). This includes new columns:

  1. Object-ID: I am now using whenever possible the Master-ID; Version-ID for a quick identification of the ‘best’ sample to include in SmartPCA or ADMIXTURE runs; and Index as a key with a unique reference number for each sample. That should make for stable enough references for any external tool to use the data.
  2. mtDNA: Added mtDNA coverage; and FTDNA-mt-Haplotree to order samples, even though YFull seems to have more splits now. Since the FTDNA R&D team and phylogeneticist will probably finish analyzing all ancient Y-DNA soon, it is to be expected that they will update the mtDNA Haplotree next.
  3. Y-DNA: I have added in the NRY column the number of valid markers from YLeaf as a standardized proxy for “Y-DNA coverage”, and updated FTDNA-Y-Haplotree to order samples according to their branch position in the Y-DNA Haplotree.
  4. Quality: new columns SNPs hits on Autosome, Autosomal coverage, Damage rate, and color-coded Assessment.
  5. General: new Method-Date and updated Kinship-Notes, Date (with precise confidence intervals), and Label.
  6. ADMIXTURE values: supervised run with K=7, roughly imitating the most – for lack of a better word – attractive results from unsupervised runs I got about a year ago. I have also updated the ArcGIS Online web app accordingly.
ArcGIS Online web app with selected ADMIXTURE values for Middle and Late Eneolithic samples. Color shows the majority ancestry component of the sample, while increasing transparency shows decreasing proportion of that component.
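The color and transparency rule described in the caption can be sketched in a few lines. This is only an illustration of the idea, not the configuration actually used in the ArcGIS Online web app: the component labels, the palette, and the `min_alpha` floor are all hypothetical.

```python
from typing import Dict, Tuple


def majority_component(proportions: Dict[str, float]) -> Tuple[str, float]:
    """Return the largest ancestry component and its share of the total."""
    total = sum(proportions.values())
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    # On ties, max() keeps the first component encountered.
    name, value = max(proportions.items(), key=lambda kv: kv[1])
    return name, value / total


def marker_style(
    proportions: Dict[str, float],
    palette: Dict[str, str],
    min_alpha: float = 0.2,  # hypothetical floor so weak majorities stay visible
) -> Dict[str, object]:
    """Color by majority component; opacity grows with that component's share."""
    name, share = majority_component(proportions)
    alpha = min_alpha + (1.0 - min_alpha) * share
    return {"component": name, "color": palette[name], "alpha": round(alpha, 3)}


# Hypothetical labels and values, for illustration only.
palette = {"Steppe": "#d62728", "Farmer": "#2ca02c", "CHG": "#1f77b4"}
sample = {"Steppe": 0.5, "Farmer": 0.3, "CHG": 0.2}
style = marker_style(sample, palette)
```

A sample with a lower majority share gets a lower `alpha`, so it is drawn more transparently, which is the behavior the caption describes.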

NOTE. I have only added the file with randomized latitude/longitude values, because the exact locations cause an overlap that makes visualization essentially useless. Also, the graphic selected is the transparency-based one, because there is no option for pie charts or histograms. I am sure there are better populations to represent ancient SE Asia than Papuan, and possibly a more accurate visualization with a higher K, but I don’t want to dedicate more time to this than necessary. After all, the concept of adding ADMIXTURE is for users to get a quick, simple idea of ancestry evolution together with mtDNA and Y-DNA, and propose hypotheses to be tested with more robust methods.
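Randomizing coordinates so that samples from the same site do not overlap could look something like the sketch below. It is an assumption about how such jitter might be produced, not the procedure actually used for the published file; the function name, the `max_offset_deg` value, and the example coordinates are hypothetical.

```python
import random
from typing import List, Optional, Tuple


def jitter_coordinates(
    points: List[Tuple[float, float]],
    max_offset_deg: float = 0.05,  # hypothetical maximum shift, in degrees
    seed: Optional[int] = None,
) -> List[Tuple[float, float]]:
    """Shift each (lat, lon) pair by a small uniform random offset."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    jittered = []
    for lat, lon in points:
        new_lat = lat + rng.uniform(-max_offset_deg, max_offset_deg)
        new_lon = lon + rng.uniform(-max_offset_deg, max_offset_deg)
        # Keep latitude in range and wrap longitude into [-180, 180).
        new_lat = max(-90.0, min(90.0, new_lat))
        new_lon = (new_lon + 180.0) % 360.0 - 180.0
        jittered.append((new_lat, new_lon))
    return jittered


# Two samples recorded at the same (made-up) location no longer overlap.
same_site = [(45.0, 44.0), (45.0, 44.0)]
spread = jitter_coordinates(same_site, max_offset_deg=0.05, seed=42)
```

Using a seed keeps the randomized file stable between updates, so a given sample does not jump around the map each time the data is regenerated.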

Please, take a closer look at the new changes, especially in what affects the cultural groups, Y-DNA and mtDNA haplogroups you know of. After so many corrections, deletions and additions using massive copy & paste and replacement formulas, I am sure there will be weird errors that were not there before, mostly due to the different (and sometimes conflicting) naming conventions, so please let me know if you see any.

Map of ancient samples with some of the tools by

For those of you who would like to check ancient DNA samples by Y-chromosome or mtDNA SNPs (including derived ones in their trees), by date, by culture, etc. or by a combination of any of the features published, including statistics, a great tool is published by Jari Kinnunen as part of the Haplotree project.

Also, Rob Spencer will soon try to update his well-known SNP Tracker using reliable inferences from high-quality ancient DNA, which will most likely increase the accuracy of all haplogroups found in ancient samples.

NOTE. A quick reminder that the data I collect is free to use, and projects reusing, modifying, or republishing it have no affiliation with me or my website.

One more “Pre-Yamnaya” sample

One interesting change to notice from the dataset is that SA6010, the Northern Caucasus Steppe sample of hg. R1b-pre-V1636, previously classified as Yamnaya based on its “Yamnaya-like orientation” and radiocarbon date (ca. 2884-2679 calBCE), has now been re-dated (and relabeled) on the basis of the direct date of “relative SA6001”, a female interred close by and dated ca. 3520-3371 calBCE.

Sharakhalsun 6. General plan of mound 2 and radiocarbon dates of the burials. Modified from Reinhold et al. (2017). Top right: PCA with highlighted samples from Sharakhalsun 6, with colors corresponding to the graves of kurgan 2, modified from Wang et al. (2019) Supp. Materials.

A word of caution regarding the new classification: (1) no express family ties are mentioned under SA6010 or SA6001; (2) they weirdly selected a label “Russia_Caucasus_Eneolithic” (like samples from Progress and Vonyuchka) instead of “Russia_Steppe_Maikop”, used for the other LN Sharakhalsun-6 Eneolithic samples; and (3) there is a marked difference in ancestry between both ‘relatives’, and no coinciding mtDNA. On the other hand, its outlier ancestry in common with AY2001 (“Russia_Steppe_Maikop_o2”) and close to that of the grave neighbor SA6013 (“Russia_Steppe_Maikop_o”), as well as its Y-DNA haplogroup, suggest that the new classification is appropriate.

Not only is this relevant as yet another example of the (lack of) reliability of radiocarbon dates (like the hypothetical “Eneolithic” R1a-M417 sample from Alexandria, or the “early” Steppe-related populations from Switzerland) and cultural inferences based on simplistic assessments (like the Althäuser “Corded Ware”, or the West European pre-migration “Bell Beakers”), but it essentially eliminates one of the very few outliers in both Y-DNA and ancestry from the Late PIE-speaking Yamnaya-Afanasievo equation, which renders the inferred Late Repin ethnogenesis even more homogeneous.

PCA of samples published in Wang et al. bioRxiv (2018), modified to include clusters and labels relevant to this post. Also included is a small red dot representing attested or likely R1b-pre-V1636 samples. See full PCA from Wang et al. Nature (2019), and the supplementary figure with additional PCA analyses including Eurasian and Native American populations, to get a better perspective of the genetic distance among Steppe Maykop and Maykop samples relative to Yamnaya-Afanasievo.

Another interesting sample is the coeval low coverage Maykop I1720 from Baksanyonok (ca. 3700-3000 BC), of hg. P1 (xQ; xR1a; xR1b-PH155; xR1b-M269, xR1b-PF6292, xR2; xT), which I guess could be an R1b sample – then probably R1b-pre-V1636. It shows a completely different ancestry profile, in common with Maykop samples, which could in turn be the closest available link in haplogroup and ancestry to the late Kura-Araxes individual I1635, from the Kalavan EBA group.

Indeed, the presence of SA6010 and I1720 among non-Indo-European-speaking Maykop-Novosvobodnaya groups gives a whole new perspective to the late, Kura-Araxes-related sample from the Southern Caucasus, supporting yet again (together with the presence of different Steppe-rich clusters among Maykop groups) that Kura-Araxes does not represent the spread of Indo-European speakers in Anatolia, but rather the intrusion of this haplogroup and ancestry among non-Indo-European-speaking communities, as evidenced first and foremost by palaeolinguistics.

What is more, the upcoming paper on shared IBD between Yamnaya and Corded Ware shows that Corded Ware has a particularly close relationship with a Yamnaya sample from the Caucasus, which suggests it is the already published RK1007, from Rasshevatskiy, with an ancestry closer to SA6010 (and to the later Catacomb SA6003). All this supports the long-lasting effects of the assumed opening of the “Steppe mating network” during the Khvalynsk-Novodanilovka expansion, as described by Anthony (2019).

Opening of the Steppe mating network during the Eneolithic, with the Khvalynsk-Novodanilovka expansion and the formation and spread of Pre-Yamnaya ancestry. Within red circles, sites with individuals of Pre-Yamnaya ancestry (ca. 5000-4000 BC). In pink circles, Middle to Late Eneolithic (ca. 4000-3100 BC) sites of Pre-Yamnaya ancestry. Solid lines indicate a majority of Steppe ancestry in one or more individuals; dashed lines show a limited contribution of Steppe ancestry in one or more individuals. In white circles, unknown genetic profile within the EHG-CHG continuum. In red dots (•), R1b-pre-V1636 samples from the Early Eneolithic; in pink dots (•), R1b-pre-V1636 samples from the Middle Eneolithic and later. Image modified from Anthony (2019).

Eneolithic evolution of R1b-V1636

These developments further complicate my first impression of the “Yamnaya-related” R1b-V1636 of Gjerrild5 of the SGC/LN transition period, which was that Bell Beakers were their likely direct vector of expansion. This new information has to be interpreted together with the now even higher homogeneity of the R1b-L23-based Y-chromosome bottleneck found in Indo-Tocharian-speaking Repin (based on Core Indo-European-speaking Yamnaya and Pre-Tocharian-speaking Afanasievo), also reflected in the well-researched early Bell Beakers before their increasing admixture with local groups.

It seems perfectly possible then, if the correction is legitimate, that the haplogroup found among late Danish Single Gravers – possibly including the oldest individual sampled, Gjerrild8 – was directly related to a dubious Srubnaya downstream V1636 sample from the Northern Caspian steppes, hence all from a Corded Ware-related expansion of a full-fledged V1636. They would then represent diverging lineages that formed part of stray “Steppe-rich” Middle Eneolithic north Pontic steppe and forest-steppe communities before the Pre-Corded Ware community formed under an R1a-rich population, hijacking their typical archaic PIE ancestry and incorporating to some extent this haplogroup.

NOTE. Its modern distribution (see FTDNA and YFull), with an old TMRCA ca. 4600 BC, seems compatible with any option, but succeeding formation and TMRCA estimates suggest that they spread successfully only recently with Steppe-related populations. For more on this discussion of BBC vs. CWC origins of the SGC-related one and the full-fledged V1636, see The Last of the Single Gravers.

Models of expansion of hg. R1b-(pre-)V1636 subclades during the Late Eneolithic/EBA: (A) non-Indo-European, post-Khvalynsk groups from the North Pontic forest-steppes (Late Trypillia-related) and the Caucasus (Maykop/Kura-Araxes-related); (B) Indo-European Yamnaya-related migrations.

I guess that, if someone is able to get good coverage DNA from early Single Grave groups to the south of Denmark, this question could be settled, perhaps proving that the earliest Corded Ware spearhead that started migrating at the turn of the 4th to 3rd millennium BC was formed not only by R1a-M198* and R1a-M417* (as found in Central European CWC and likely surviving in later groups from Central Asia), but also by R1b-(pre-?)V1636 lineages that spread first with Khvalynsk and full Steppe-like autosomal ancestry.

Overall, the finding of some R1b-pre-V1636-rich and Steppe-rich communities in the north Pontic region eventually integrated into the Pre-Corded Ware population could seamlessly explain their “Steppe ancestry” close to the Yamnaya-Afanasievo-related one. On the other hand, this finding (like that of a hypothetical R1b-(pre-)M269 with full Steppe ancestry) would deal a significant blow to the idea of a shared Indo-Uralic community, since all Archaic PIE traits shared with Proto-Uralic could be alternatively explained as a heavy Indo-Anatolian substrate during the formation of the language.

The other potential explanations of the Corded Ware “Steppe ancestry” – (1) stemming from different ancient north Pontic Steppe-like hunter-gatherer communities vs. (2) recent female exogamy with the Yamnaya – seem much more compatible with a proposal of a shared (1) Steppe-like or (2) EHG-like Indo-Uralic-speaking society.

Re-labelled Khvalynsk and Yamnaya-related “post-Khvalynsk” samples of hg. R1b-(pre-)V1636.

Strict Indo-Tocharian kinship rules

Based on the recent talk by David Reich, one could believe that Corded Ware was a Yamnaya offshoot in terms of “close cousins”. This relationship would, however, contradict what we know in terms of both roughly coeval groups displaying different culture, ancestry, and Y-chromosome bottlenecks, including from sneak peeks into what will be published soon.


Upcoming online conference (free registration) Power, Gender and Mobility – Features of Indo-European Society, by the University of Copenhagen, March 26–27, 2021. Here are some relevant abstracts (emphasis mine):

Proto-Indo-European kinship according to ancient DNA from steppe cemeteries: preliminary results, by David Anthony and Dorcas Brown.

While kinship and biological descent are not the same thing, they are related. New data from ancient DNA on family relationships within Eneolithic and Yamnaya cemeteries, arguably linked to archaic PIE and late PIE, suggests that family relationships within and between cemeteries changed significantly between the Eneolithic and Yamnaya periods. Closely related males were buried together in Eneolithic cemeteries, but not in Yamnaya cemeteries. Females were unrelated to males within 3 degrees in both contexts, eliminating cousin marriage as a possibility and suggesting a virilocal system with required female exogamy. Genetic diversity in maternal descent was high across both periods, but genetic diversity in paternal descent collapsed in Yamnaya males, producing a surprisingly homogeneous set of Yamnaya men who nevertheless rarely were related to each other within 1st, 2nd or 3rd degrees, but instead shared a small group of male ancestors 4-7 generations before.

The generalized pattern among the many Yamnaya samples – as opposed to the custom in Khvalynsk – suggests, once again, that the Y-chromosome bottleneck under R1b-L23 subclades was no accident or independent local phenomenon, belonging instead to an underlying, long-lasting strict set of marriage rules deeply embedded in their common culture, and thus traceable to their ancestral origin. Both sources of information, archaeological-genomic and linguistic, have to be taken into account when evaluating the evolution of Indo-Anatolian into Indo-Tocharian.

NOTE. For a summary of what is known to date in linguistics, archaeology, and genomics, check my recent post on Proto-Indo-European kinship system and patrilineality.

A model of the Tsonga system. (H is necessary to assume closure of the system). Modified from Kuper (1982).

It is also important to consider the later evolution of marriage rules, with the best known example represented by the spread of Indo-European languages from SE Europe to the west with Bell Beakers. Innovations in kinship practices – including an increased role for the maternal uncle – seem to be associated with the shift to a more sedentary lifestyle of originally mobile pastoralist groups. This has been evidenced in Olalde et al. (2018), Mittnik et al. (2019), Sjögren et al. (2020), and the findings of the last two will probably be repeated at the conference.

In-laws and Outlaws in Indo-European Society, by Birgit Anette Olsen

A determining factor in the spread and establishment of Indo-European language and culture throughout Europe and parts of western Asia was the deliberate formation of alliances between Indo-European speaking immigrants and indigenous families. At least part of these alliances were founded on bonds of marriage where a foreign wife would typically be included in an Indo-European speaking enlarged family. The advantages of such a system include military support, guest friendship and the possibility of fosterage. Far from being negligible, the male members of the brides’ families would become an important element of the socio-economic network of an Indo-European patriclan.

This situation is mirrored in the vocabulary, where special attention will be paid to the terminology of in-laws on the wife’s side, allies and subordinates, and the marked contrast between insiders and outsiders.

In this communication, I expect a review of her recent chapter Aspects of family structure among the Indo-Europeans, In: Tracing the Indo-Europeans (2019). These were the basic conclusions:

  • Knowledge of kinship terminology in Indo-European proper is scanty since a number of important terms differ between Anatolian and Core Indo-European.
  • the typical Core Indo-European family was patrilineal and patrilocal, consisting of the master of the house, his wife, sons, unmarried daughters, daughters-in-law and grandchildren.
  • a particular bond between young boys and their maternal uncles is revealed by the derivational relationship between e.g. Latin avus ‘grandfather’ (paternal and maternal) and avunculus ‘son of the maternal grandfather’. This bond often included fosterage.
    Grandfather-grandson and uncle-nephew relationship according to Olsen (forthcoming). Image from Olsen (2019).
  • men’s relations to their in-laws were typically expressed by derivatives of roots meaning ‘bind’, stressing the importance of alliances by marriage.
  • marriages were exogamous, often implying long-distance travelling as suggested by a new etymological interpretation of the word for ‘husband’s brother’s wife’ as ‘traveller’.
  • as far as we can see, the terminology of kinship and close social relations is almost entirely based on inherited material rather than substrata and loanwords. This suggests that while the Indo-Europeans learned about agriculture from the early farmers, they imposed the main features of their social structure on the indigenous populations.

Also relevant for the description of the Proto-Indo-European society are probably the communications presented by Benedicte Nielsen Whitehead, on The (Lack of a) Terminology of Slavery in Proto-Indo-European; by Ulla Remmer, on How (not) to name a Woman in Indo-European; and by Michael Weiss, with a title as homage to Peter Ramus that suggests Benveniste’s writings on this subject will be criticized for their lack of a proper systematization. A general anthropological-linguistic correlation will be dealt with by Georges-Jean Pinault.

NOTE. Registration is free and the conference is online. I doubt I’d have the necessary time or patience to attend so many sessions in two days, but if any of you register and find something interesting in any of the communications, you can write it here in the comments.

