A very “Yamnaya-like” East Bell Beaker from France, probably R1b-L151

bell-beaker-expansion

Interesting report by Bernard Sécher on Anthrogenica, about the Ph.D. thesis of Samantha Brunel from Institut Jacques Monod, Paris, Paléogénomique des dynamiques des populations humaines sur le territoire Français entre 7000 et 2000 (2018).

NOTE. You can visit Bernard Sécher’s blog on genetic genealogy.

A summary from user Jool, who was there, translated into English by Sécher (slight changes to translation, and emphasis mine):

They have a good hundred samples from the North, Alsace and the Mediterranean coast, from the Mesolithic to the Iron Age.

There is no major surprise compared to the rest of Europe. On the PCA plot, the Mesolithic are with the WHG, the early Neolithics with the first farmers close to the Anatolians. Then there is a small resurgence of hunter-gatherers that moves the Middle Neolithics a little closer to the WHGs.

From the Bronze Age, they have 5 samples with autosomal DNA, all in Bell Beaker archaeological context, which are widely spread on the PCA: one sample very high, close to the Yamnaya, a little above the Corded Ware; two samples right among the Central European Bell Beakers; one fairly low, just above the Neolithic cluster; and a last one fully inside that cluster. The most salient point was that the Y chromosomes of their 12 Bronze Age samples (all Bell Beakers) are all R1b, whereas there was no R1b in the Neolithic samples.

Finally, they have Iron Age samples that cluster on the PCA plot close to the Bronze Age samples. They could not determine if there is continuity with the Bronze Age, or a partial replacement by a genetically close population.

PCA-caucasus-yamna
Image modified from Wang et al. (2018). Samples projected in PCA of 84 modern-day West Eurasian populations (open symbols). Previously known clusters have been marked and referenced. Interesting samples are marked and labelled. In red, likely position of late Yamna Hungary / early East Bell Beakers. An EHG ‘cloud’ and a Caucasus ‘cloud’ have been drawn, leaving Pontic-Caspian steppe and derived groups between them. See the original file here. To understand the drawn potential Caucasus Mesolithic cluster, see above the PCA from Lazaridis et al. (2018).

The sample with likely high “steppe ancestry”, clustering closer to Yamna than Corded Ware samples do, is then probably an early East Bell Beaker individual. It probably comes from Alsace, or maybe from close to the Rhine Delta in the north, rather than from the south: we already have samples from southern France from Olalde et al. (2018) with high Neolithic ancestry, and samples from the Rhine with elevated steppe ancestry, but not that much.

This specific sample, if confirmed as one of those reported as R1b (then likely R1b-L151), as it seems from the wording of the summary, is key because it would finally link Yamna to East Bell Beaker through Yamna Hungary, all of them very “Yamnaya-like”, and therefore R1b-L151 (hence also R1b-L51) directly to the steppe, and not only to the Carpathian Basin (that is, until we have samples from late Repin or West Yamna…)

NOTE. The only alternative explanation for such elevated steppe ancestry would be an admixture between a ‘less Yamnaya-like’ East Bell Beaker + a Central European Corded Ware sample like the Esperstedt outlier + drift, but I don’t think that alternative is the best explanation of its position in the PCA closer to Yamna in any of the infinite parallel universes, so… Also, the sample from Esperstedt is clearly a late outlier likely influenced by Yamna vanguard settlers from Hungary, not the other way round…

Unexpectedly, then, fully Yamnaya-like individuals are found not only in Yamna Hungary ca. 3000-2500 BC, but also among expanding East Bell Beakers later than 2500 BC. This leaves us with unexplained, not-at-all-Yamnaya-like early Corded Ware samples from ca. 2900 BC on. An explanation based on admixture with locals seems unlikely, seeing how Corded Ware peoples continue a north Pontic cluster, being thus different from Yamna and their ancestors since the Neolithic; and how they remained that way for a long time, up to Sintashta, Srubna, Andronovo, and even later samples… A different, non-Indo-European community it is, then.

olalde_pca2
Image modified from Olalde et al. (2018). PCA of 999 Eurasian individuals. Marked is the Esperstedt outlier with the approximate position of Yamna Hungary, probably the source of its admixture. Different Bell Beaker clines have been drawn, to represent the approximate source of expansions from Central European sources into the different regions. In red, likely zone of Yamna Hungary and of the reported early East Bell Beaker individual from France.

Let’s wait and see the Ph.D. thesis, when it’s published, and keep observing in the meantime the absurd reactions of denial, anger, bargaining, and depression (stages of grief) among BBC/R1b=Vasconic and CWC/R1a=Indo-European fans, as if they had lost something (?). Maybe one of these reactions is actually the key to changing reality and going back to the 2000s, who knows…

Featured image: initial expansion of the East Bell Beaker Group, by Volker Heyd (2013).

Related

Common pitfalls in human genomics and bioinformatics: ADMIXTURE, PCA, and the ‘Yamnaya’ ancestral component

invasion-from-the-steppe-yamnaya

Good timing for the publication of two interesting papers, that a lot of people should read very carefully:

ADMIXTURE

Open access A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots, by Daniel J. Lawson, Lucy van Dorp & Daniel Falush, Nature Communications (2018).

Interesting excerpts (emphasis mine):

Experienced researchers, particularly those interested in population structure and historical inference, typically present STRUCTURE results alongside other methods that make different modelling assumptions. These include TreeMix, ADMIXTUREGRAPH, fineSTRUCTURE, GLOBETROTTER, f3 and D statistics, amongst many others. These models can be used both to probe whether assumptions of the model are likely to hold and to validate specific features of the results. Each also comes with its own pitfalls and difficulties of interpretation. It is not obvious that any single approach represents a direct replacement as a data summary tool. Here we build more directly on the results of STRUCTURE/ADMIXTURE by developing a new approach, badMIXTURE, to examine which features of the data are poorly fit by the model. Rather than intending to replace more specific or sophisticated analyses, we hope to encourage their use by making the limitations of the initial analysis clearer.

The default interpretation protocol

Most researchers are cautious but literal in their interpretation of STRUCTURE and ADMIXTURE results, as caricatured in Fig. 1, as it is difficult to interpret the results at all without making several of these assumptions. Here we use simulated and real data to illustrate how following this protocol can lead to inference of false histories, and how badMIXTURE can be used to examine model fit and avoid common pitfalls.

admixture-protocol
A protocol for interpreting admixture estimates, based on the assumption that the model underlying the inference is correct. If these assumptions are not validated, there is substantial danger of over-interpretation. The “Core protocol” describes the assumptions that are made by the admixture model itself (Protocol 1, 3, 4), and inference for estimating K (Protocol 2). The “Algorithm input” protocol describes choices that can further bias results, while the “Interpretation” protocol describes assumptions that can be made in interpreting the output that are not directly supported by model inference

Discussion

STRUCTURE and ADMIXTURE are popular because they give the user a broad-brush view of variation in genetic data, while allowing the possibility of zooming down on details about specific individuals or labelled groups. Unfortunately it is rarely the case that sampled data follows a simple history comprising a differentiation phase followed by a mixture phase, as assumed in an ADMIXTURE model and highlighted by case study 1. Naïve inferences based on this model (the Protocol of Fig. 1) can be misleading if sampling strategy or the inferred value of the number of populations K is inappropriate, or if recent bottlenecks or unobserved ancient structure appear in the data. It is therefore useful when interpreting the results obtained from real data to think of STRUCTURE and ADMIXTURE as algorithms that parsimoniously explain variation between individuals rather than as parametric models of divergence and admixture.

For example, if admixture events or genetic drift affect all members of the sample equally, then there is no variation between individuals for the model to explain. Non-African humans have a few percent Neanderthal ancestry, but this is invisible to STRUCTURE or ADMIXTURE since it does not result in differences in ancestry profiles between individuals. The same reasoning helps to explain why for most data sets—even in species such as humans where mixing is commonplace—each of the K populations is inferred by STRUCTURE/ADMIXTURE to have non-admixed representatives in the sample. If every individual in a group is in fact admixed, then (with some exceptions) the model simply shifts the allele frequencies of the inferred ancestral population to reflect the fraction of admixture that is shared by all individuals.

Several methods have been developed to estimate K, but for real data, the assumption that there is a true value is always incorrect; the question rather being whether the model is a good enough approximation to be practically useful. First, there may be close relatives in the sample which violates model assumptions. Second, there might be “isolation by distance”, meaning that there are no discrete populations at all. Third, population structure may be hierarchical, with subtle subdivisions nested within diverged groups. This kind of structure can be hard for the algorithms to detect and can lead to underestimation of K. Fourth, population structure may be fluid between historical epochs, with multiple events and structures leaving signals in the data. Many users examine the results of multiple K simultaneously but this makes interpretation more complex, especially because it makes it easier for users to find support for preconceptions about the data somewhere in the results.

In practice, the best that can be expected is that the algorithms choose the smallest number of ancestral populations that can explain the most salient variation in the data. Unless the demographic history of the sample is particularly simple, the value of K inferred according to any statistically sensible criterion is likely to be smaller than the number of distinct drift events that have practically impacted the sample. The algorithm uses variation in admixture proportions between individuals to approximately mimic the effect of more than K distinct drift events without estimating ancestral populations corresponding to each one. In other words, an admixture model is almost always “wrong” (Assumption 2 of the Core protocol, Fig. 1) and should not be interpreted without examining whether this lack of fit matters for a given question.

admixture-pitfalls
Three scenarios that give indistinguishable ADMIXTURE results. (a) Simplified schematic of each simulation scenario. (b) Inferred ADMIXTURE plots at K = 11. (c) CHROMOPAINTER inferred painting palettes.

Because STRUCTURE/ADMIXTURE accounts for the most salient variation, results are greatly affected by sample size in common with other methods. Specifically, groups that contain fewer samples or have undergone little population-specific drift of their own are likely to be fit as mixes of multiple drifted groups, rather than assigned to their own ancestral population. Indeed, if an ancient sample is put into a data set of modern individuals, the ancient sample is typically represented as an admixture of the modern populations (e.g., ref. 28,29), which can happen even if the individual sample is older than the split date of the modern populations and thus cannot be admixed.
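NOTE. The point made in the excerpt above, that admixture shared equally by all individuals is invisible to the model, can be illustrated with some simple arithmetic. This is only a minimal sketch, not taken from the paper: the allele frequencies, the 20% share, and the function names are all made-up assumptions, and no actual STRUCTURE/ADMIXTURE inference is run. It only checks whether a single ancestral population with shifted allele frequencies predicts the same expected frequencies as a two-source description.

```python
# Hypothetical allele frequencies at five SNPs for two source populations.
freq_a = [0.10, 0.40, 0.75, 0.20, 0.90]
freq_b = [0.60, 0.10, 0.25, 0.80, 0.50]


def expected_freqs(share_b, p_a, p_b):
    """Expected allele frequencies for an individual with `share_b` ancestry from B."""
    return [(1 - share_b) * a + share_b * b for a, b in zip(p_a, p_b)]


def fits_single_population(shares_b, p_a, p_b, tol=1e-9):
    """True if one 'shifted' ancestral population reproduces every individual.

    The shifted population simply absorbs the average share of B ancestry
    into its own allele frequencies, so nobody needs to be called admixed.
    """
    mean_share = sum(shares_b) / len(shares_b)
    shifted = expected_freqs(mean_share, p_a, p_b)
    return all(
        abs(x - y) <= tol
        for share in shares_b
        for x, y in zip(expected_freqs(share, p_a, p_b), shifted)
    )


# Every individual carries the same 20% share from B: no variation to explain.
uniform_group = [0.20, 0.20, 0.20, 0.20]
# Same average share, but it varies between individuals.
variable_group = [0.00, 0.10, 0.30, 0.40]

print(fits_single_population(uniform_group, freq_a, freq_b))
print(fits_single_population(variable_group, freq_a, freq_b))
```

In the uniform group, the two descriptions are indistinguishable, so a parsimonious algorithm has no reason to infer any admixture; only when the share varies between individuals is there something for the model to pick up.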

This paper was already available as a preprint in bioRxiv (first published in 2016) and it is incredible that it needed to wait all this time to be published. I found it weird how reviewers focused on the “tone” of the paper. I think it is great to see files from the peer review process published, but we need to know who these reviewers were, to understand their whiny remarks… A lot of geneticists out there need to develop a thick skin, or else we are going to see more and more delays based on a perceived incorrect tone towards the field, which seems a rather subjective reason to force researchers to correct a paper.

PCA of SNP data

Open access Effective principal components analysis of SNP data, by Gauch, Qian, Piepho, Zhou, & Chen, bioRxiv (2018).

Interesting excerpts:

A potential hindrance to our advice to upgrade from PCA graphs to PCA biplots is that the SNPs are often so numerous that they would obscure the Items if both were graphed together. One way to reduce clutter, which is used in several figures in this article, is to present a biplot in two side-by-side panels, one for Items and one for SNPs. Another stratagem is to focus on a manageable subset of SNPs of particular interest and show only them in a biplot in order to avoid obscuring the Items. A later section on causal exploration by current methods mentions several procedures for identifying particularly relevant SNPs.

One of several data transformations is ordinarily applied to SNP data prior to PCA computations, such as centering by SNPs. These transformations make a huge difference in the appearance of PCA graphs or biplots. A SNPs-by-Items data matrix constitutes a two-way factorial design, so analysis of variance (ANOVA) recognizes three sources of variation: SNP main effects, Item main effects, and SNP-by-Item (S×I) interaction effects. Double-Centered PCA (DC-PCA) removes both main effects in order to focus on the remaining S×I interaction effects. The resulting PCs are called interaction principal components (IPCs), and are denoted by IPC1, IPC2, and so on. By way of preview, a later section on PCA variants argues that DC-PCA is best for SNP data. Surprisingly, our literature survey did not encounter even a single analysis identified as DC-PCA.

The axes in PCA graphs or biplots are often scaled to obtain a convenient shape, but actually the axes should have the same scale for many reasons emphasized recently by Malik and Piepho [3]. However, our literature survey found a correct ratio of 1 in only 10% of the articles, a slightly faulty ratio of the larger scale over the shorter scale within 1.1 in 12%, and a substantially faulty ratio above 2 in 16% with the worst cases being ratios of 31 and 44. Especially when the scale along one PCA axis is stretched by a factor of 2 or more relative to the other axis, the relationships among various points or clusters of points are distorted and easily misinterpreted. Also, 7% of the articles failed to show the scale on one or both PCA axes, which leaves readers with an impressionistic graph that cannot be reproduced without effort. The contemporary literature on PCA of SNP data mostly violates the prohibition against stretching axes.

pca-how-to
DC-PCA biplot for oat data. The gradient in the CA-arranged matrix in Fig 13 is shown here for both lines and SNPs by the color scheme red, pink, black, light green, dark green.

The percentage of variation captured by each PC is often included in the axis labels of PCA graphs or biplots. In general this information is worth including, but there are two qualifications. First, these percentages need to be interpreted relative to the size of the data matrix because large datasets can capture a small percentage and yet still be effective. For example, for a large dataset with over 107,000 SNPs for over 6,000 persons, the first two components capture only 0.3693% and 0.117% of the variation, and yet the PCA graph shows clear structure (Fig 1A in [4]). Contrariwise, a PCA graph could capture a large percentage of the total variation, even 50% or more, but that would not guarantee that it will show evident structure in the data. Second, the interpretation of these percentages depends on exactly how the PCA analysis was conducted, as explained in a later section on PCA variants. Readers cannot meaningfully interpret the percentages of variation captured by PCA axes when authors fail to communicate which variant of PCA was used.

Conclusion

Five simple recommendations for effective PCA analysis of SNP data emerge from this investigation.

  1. Use the SNP coding 1 for the rare or minor allele and 0 for the common or major allele.
  2. Use DC-PCA; for any other PCA variant, examine its augmented ANOVA table.
  3. Report which SNP coding and PCA variant were selected, as required by contemporary standards in science for transparency and reproducibility, so that readers can interpret PCA results properly and reproduce PCA analyses reliably.
  4. Produce PCA biplots of both Items and SNPs, rather than merely PCA graphs of only Items, in order to display the joint structure of Items and SNPs and thereby to facilitate causal explanations. Be aware of the arch distortion when interpreting PCA graphs or biplots.
  5. Produce PCA biplots and graphs that have the same scale on every axis.
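NOTE. Recommendations 1 and 2 above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' code: the 4 SNPs × 5 Items matrix is invented, it uses a simple 0/1 coding, and all function names are hypothetical. It recodes each SNP so that the minor allele is 1, double-centers the matrix, and splits the variation into the three ANOVA sources mentioned in the excerpt (SNP main effects, Item main effects, S×I interaction). The PCA step itself (a singular value decomposition of the double-centered matrix) is not shown.

```python
def code_minor_allele_as_one(matrix):
    """Recode each SNP (row) of a 0/1 matrix so that the rarer allele is 1."""
    recoded = []
    for row in matrix:
        if sum(row) * 2 > len(row):  # 1 is currently the common allele
            row = [1 - value for value in row]
        recoded.append(list(row))
    return recoded


def row_and_column_means(matrix):
    """Return SNP (row) means, Item (column) means and the grand mean."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return row_means, col_means, grand_mean


def double_center(matrix):
    """Remove both main effects, leaving only the S×I interaction."""
    row_means, col_means, grand = row_and_column_means(matrix)
    return [
        [value - row_means[i] - col_means[j] + grand for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def anova_sums_of_squares(matrix):
    """Split variation around the grand mean into SNP, Item and S×I parts."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means, col_means, grand = row_and_column_means(matrix)
    ss_snp = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_item = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_inter = sum(v ** 2 for row in double_center(matrix) for v in row)
    return ss_snp, ss_item, ss_inter


# Invented toy data: rows are SNPs, columns are Items (individuals or lines).
snps_by_items = [
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0],
]

coded = code_minor_allele_as_one(snps_by_items)
centered = double_center(coded)
ss_snp, ss_item, ss_inter = anova_sums_of_squares(coded)

_, _, grand_mean = row_and_column_means(coded)
ss_total = sum((v - grand_mean) ** 2 for row in coded for v in row)

for label, ss in (("SNP", ss_snp), ("Item", ss_item), ("S×I", ss_inter)):
    print(f"{label}: {100 * ss / ss_total:.1f}% of the variation")
```

The same matrix gives a different total amount of variation to each PCA variant (SNP-centered PCA keeps Item main effects plus interaction, DC-PCA keeps only interaction), which is why the percentages on PCA axes cannot be compared without knowing which variant was used.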

I read the referenced paper Biplots: Do Not Stretch Them!, by Malik and Piepho (2018), and even though it is not directly applicable to the most commonly available PCA graphs out there, it is a good reminder of the distorting effects of stretching. A quite recent example is Krause-Kyora et al. (2018), where you can see Corded Ware and BBC samples from Central Europe clustering with samples from Yamna:

NOTE. This is related to a vertical distortion (i.e. horizontal stretching), but possibly also to the addition of some distant outlier sample/s.

pca-cwc-yamna-bbc
Principal Component Analysis (PCA) of the human Karsdorf and Sorsum samples together with previously published ancient populations projected on 27 modern day West Eurasian populations (not shown) based on a set of 1.23 million SNPs (Mathieson et al., 2015). https://doi.org/10.7554/eLife.36666.006
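NOTE. To check how stretched a published PCA graph is, one can compare the scale of both axes, i.e. how many PC units correspond to one unit of length on the page. The following is a minimal sketch with hypothetical numbers and function names (the axis limits and box size below do not come from any of the cited figures); it computes the ratio of the larger scale over the smaller one, which should be 1 for an undistorted graph.

```python
def axis_scale_ratio(x_limits, y_limits, width, height):
    """Ratio of the larger axis scale over the smaller one (1.0 = not stretched).

    `x_limits` and `y_limits` are (min, max) pairs in PC units; `width` and
    `height` are the size of the plot box in any common unit (cm, pixels...).
    """
    x_span = x_limits[1] - x_limits[0]
    y_span = y_limits[1] - y_limits[0]
    if min(x_span, y_span, width, height) <= 0:
        raise ValueError("spans and plot-box sizes must be positive")
    x_scale = x_span / width    # PC units per unit of length, horizontally
    y_scale = y_span / height   # PC units per unit of length, vertically
    return max(x_scale, y_scale) / min(x_scale, y_scale)


def height_for_equal_scale(x_limits, y_limits, width):
    """Plot-box height that gives both axes the same scale for a fixed width."""
    x_span = x_limits[1] - x_limits[0]
    y_span = y_limits[1] - y_limits[0]
    if min(x_span, y_span, width) <= 0:
        raise ValueError("spans and width must be positive")
    return width * y_span / x_span


# Hypothetical graph: PC1 spans twice the range of PC2, drawn in a square box.
pc1_limits = (-0.10, 0.10)
pc2_limits = (-0.05, 0.05)

ratio = axis_scale_ratio(pc1_limits, pc2_limits, width=10.0, height=10.0)
fixed_height = height_for_equal_scale(pc1_limits, pc2_limits, width=10.0)

print(f"stretch ratio in a 10 x 10 box: {ratio:.2f}")
print(f"height needed for equal scales: {fixed_height:.2f}")
```

A square box is a common default in plotting tools, which is probably how many of these stretched graphs come about; fixing the box shape from the data ranges (or forcing an equal aspect ratio in the plotting library) avoids the problem.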

The so-called ‘Yamnaya’ ancestry

Every time I read papers like these, I remember commenters who kept swearing that genetics was the ultimate science that would solve anthropological problems, where unscientific archaeology and linguistics could not. Well, it seems that, like radiocarbon analysis, these promising developing methods still need a lot of refinement to achieve something meaningful, and that they mean nothing without traditional linguistics and archaeology… But we already knew that.

Also, if this is happening in most peer-reviewed publications, made by professional geneticists, in journals of high impact factor, you can only wonder how many more errors and misinterpretations can be found in the obscure market of so many amateur geneticists out there. Because amateur geneticist is a commonly used misnomer for people who are not geneticists (since they don’t have the most basic education in genetics), and some of them are not even ‘amateurs’ (because they are selling the outputs of bioinformatic tools)… It’s like calling healers ‘amateur doctors’.

NOTE. While everyone involved in population genetics is interested in knowing the truth, and we all have our confirmation (and other kinds of) biases, for those who get paid to tell people what they want to hear, and who have sold lots of wrong interpretations already, the incentives of ‘being right’ – and thus getting involved in crooked and paranoid behaviour regarding different interpretations – are as strong as the money they can win or lose by promoting themselves and selling more ‘product’.

As a reminder of how badly these wrong interpretations of genetic results – and the influence of the so-called ‘amateurs’ – can reflect on research groups, here is yet another turn of the screw by the Copenhagen group, in the oral presentations at Languages and migrations in pre-historic Europe (7-12 Aug 2018), organized by the University of Copenhagen. The common theme seems to be that Bell Beaker, and thus R1b-L23 subclades, do now represent a direct expansion from Yamna, as opposed to being derived from Corded Ware migrants, as they supported before.

NOTE. Yes, the “Yamna → Corded Ware → Únětice / Bell Beaker” migration model is still commonplace in the Copenhagen workgroup. Yes, in 2018. Guus Kroonen had already admitted they were wrong, and it was already changed in the graphic representation accompanying a recent interview to Willerslev. However, since there is still no official retraction by anyone, it seems that each member has to reject the previous model in their own way, and at their own pace. I don’t think we can expect anyone at this point to accept responsibility for their wrong statements.

So their lead archaeologist, Kristian Kristiansen, in The Indo-Europeanization of Europé (sic):

kristiansen-migrations
Kristiansen’s (2018) map of Indo-European migrations

I love the newly invented arrows of migration from Yamna to the north to distinguish among dialects attributed by them to CWC groups, and the intensive use of materials from Heyd’s publications in the presentation, which means they understand he was right – except for the fact that they are used to support a completely different theory, radically opposed to the one defended in Heyd’s model.

Now added to the Copenhagen group’s unending proposals of language expansions, some pearls from the oral presentation:

  • Corded Ware north of the Carpathians of R1a lineages developed Germanic;
  • R1b borugh [?] Italo-Celtic;
  • the increase in steppe ancestry in north European Bell Beakers means that they “were a continuation of the Yamnaya/Corded Ware expansion”;
  • “Corded Ware groups […] stopped their expansion and took over the Bell Beaker package before migrating to England” [yep, it literally says that];
  • Italo-Celtic expanded to the UK and Iberia with Bell Beakers [I guess that included Lusitanian in Iberia, but not Messapian in Italy; or the opposite; or nothing like that, who knows];
  • 2nd millennium BC Bronze Age Atlantic trade systems expanded Proto-Celtic [yep, trade systems expanded the language];
  • 1st millennium BC expanded Gaulish with La Tène, including a “Gaulish version of Celtic to Ireland/UK” [hmmm, dat British Gaulish indeed].

You know, because, why the hell not? A logical, stable, consequential, no-nonsense approach to Indo-European migrations, as always.

Also, compare still more invented arrows of migrations, from Mikkel Nørtoft’s Introducing the Homeland Timeline Map, going against Kristiansen’s multiple arrows, and even against their own recent fantasy map series, in showing that Bell Beakers stem from Yamna instead of CWC (or not, you never truly know what arrows actually mean):

corded-ware-migrations
Nørtoft’s (2018) maps of Indo-European migrations.

I really, really loved that perennial arrow of migration from Volosovo, ca. 4000-800 BC (3000+ years, no less!), representing Uralic?, like that, without specifics – which is like saying, “somebody from the eastern forest zone, somehow, at some time, expanded something that was not Indo-European to Finland, and we couldn’t care less, except for the fact that they were certainly not R1a”.

This and Kristiansen’s arrows are the most comical invented migration routes of 2018; and that is saying something, given the dozens of similar maps that people publish in forums and blogs each week.

NOTE. You can read a more reasonable account of how haplogroup R1b-L51 and R1a-Z645 subclades expanded, and which dialects most likely expanded with them.

We don’t know where these scholars of the Danish workgroup stand at this moment, or if they ever had (or intended to have) a common position – beyond their persistent ideas of Yamnaya™ ancestral component = Indo-European and R1a must be Indo-European –, because each new publication changes some essential aspects without expressly stating so, and thus makes everything still messier.

It’s hard to accept that this is a series of presentations made by professional linguists, archaeologists, and geneticists, as stated by the official website, and still harder to imagine that they collaborate within the same professional workgroup, which includes experienced geneticists and academics.

I propose the following video to close future presentations introducing innovative ideas like those above, to help the audience find the appropriate mood:

Related

East Bell Beakers, an in situ admixture of Yamna settlers and GAC-like groups in Hungary

indo-european-yamnaya-corded-ware

I wanted to repeat what I said last week in two different posts (see on the new Caucasus and Yamna Hungary samples, and on local groups in contact with Yamna settlers).

We already knew that expanding East Bell Beakers had received influence from a population similar to the available Globular Amphorae culture samples.

  1. Without Yamna settlers, but with Yamna Ukraine and East Bell Beaker samples, including an admixed Yamna Bulgaria sample (from Olalde & Mathieson 2017, and then with their Nature 2018 papers), the most likely interpretation was that Yamna settlers had received GAC ancestry probably during their migration through the Balkans, before turning into East Bell Beakers. However, some comments still supported the idea that Corded Ware migrants were the ones behind the formation of East Bell Beakers. I couldn’t understand it.
  2. Now we have (with Wang et al. 2018) Yamna settlers (identical to other Yamna groups and Afanasevo migrants) and GAC-like peoples coexisting with them in Hungary, with a Late Chalcolithic Yamna sample from Hungary showing a greater contribution from GAC. However, I still read discussions on Yamna settlers receiving GAC admixture from Corded Ware in Eastern Europe, from GAC in the Dnieper-Dniester area, in Budzhak/Usatovo, etc. I can’t understand this, either.
  3. I will post here the data we have, with the simplest maps and images showing the simplest possible model. No more long paragraphs.

    NOTE. All this data does not mean that this model is certain, especially because we don’t have direct access to the samples. But it is the simplest and most likely one. Sometimes 2+2=4. Even if it turns out later to be false.

    EDIT (30 MAY 2018): In fact, as I commented in the first post about these samples, there is a Yamna LCA/EBA sample probably from Late Yamna (in the North Pontic steppe, west of the Catacomb culture), which shows GAC-like contribution. However, this admixture is smaller than that of the Hungary LCA/EBA1 sample, and both Yamna groups (Hungary and steppe) were probably already more sedentary, which also supports different contributions from nearby local GAC-like groups to each region, rather than maintained long-range internal genetic contributions from a single source near the steppe…

    indo-european-uralic-migrations-yamna-gac
    Yamna migrants ca. 3300-2600 BC. Most likely site of admixture with GAC circled in red.
    yamna_bell_beaker
    Yamna – Bell Beaker migration according to Heyd (2007, 2012). Most likely site of admixture with GAC is marked by the evolution of Blue to Red color.
    PCA-yamna-hungary
    PCA results. Samples from Yamna Hungary are surrounded by red circles, GAC-like Hungarian groups by light brown ones (see below for ADMIXTURE data). Notice the most likely Yamna Hungary sample with GAC admixture clustering closely to the CWC Esperstedt outlier, and thus to some East Bell Beaker samples. (d) shows these projected onto a PCA of 84 modern-day West Eurasian populations (open symbols).
    gac-like-hungary-yamnaya
    Modified image, with red rectangles surrounding (unreleased) Hungarian samples from Yamna and GAC-like groups. (c) ADMIXTURE results of relevant prehistoric individuals mentioned in the text (filled symbols)
    yamnaya-hungary-lca-eba
    Modified image, with red rectangles surrounding (unreleased) Yamna samples. Notice the greater GAC contribution to the late Yamna Hungary sample. Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional Anatolian farmer-related ancestry in Steppe groups as well as additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups.
    yamnaya-hungary-globular-amphora
    Modified table from Wang et al. (2018) Supplementary materials (in bold, Yamna and related samples; in red, newly reported samples). Notice greater GAC contribution to late Yamna Hungary sample. “Supplementary Table 18. P values of rank=1 and admixture coefficients of modelling the Steppe ancestry populations as a two-way admixture of the Eneolithic_steppe and Globular_Amphora using 14 outgroups. Left populations: Steppe cluster, Eneolithic_steppe, Globular Amphora Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.”

    The CWC outlier from Esperstedt

    I already said that my initial interpretation of the Esperstedt outlier, dated ca. 2430 BC, as due to a late contribution directly from the steppe (i.e. from long-range contacts between late Corded Ware groups from Europe and late groups from the steppe) was probably wrong, seeing how (in Olalde et al. 2017) early East Bell Beaker samples from Hungary and Central Europe clustered closely to this individual.

    Now we see that fully ‘Yamnaya-like’ Yamna settlers lived in Hungary probably for two or three centuries ca. 2900-2600 BC, and the absorption of known (or unknown) Yamna vanguard groups found up to Saxony-Anhalt before 2600 BC would be enough to justify the genomic findings of this individual.

    An outlier it is, then. But probably from admixture with nearby Yamna-like people.

    olalde_pca
    Image modified by me, from Olalde et al. (2017). PCA of 999 Eurasian individuals. Marked is the Esperstedt outlier.

    Related:

Immigration and transhumance in the Early Bronze Age Carpathian Basin

Interesting excerpts about local Hungarian groups that had close contacts with Yamna settlers in the Carpathian Basin, from the paper Immigration and transhumance in the Early Bronze Age Carpathian Basin: the occupants of a kurgan, by Gerling, Bánffy, Dani, Köhler, Kulcsár, Pike, Szeverényi & Heyd, Antiquity (2012) 86(334):1097-1111.

The most interesting of the local people is the occupant of grave 12, which is the earliest grave in the kurgan and the main statistical range of its radiocarbon date clearly predates the arrival of the western Yamnaya groups c. 3000 BC. This is also confirmed by the burial rite, which is not typical for the Yamnaya (Dani 2011: 29–33; Heyd in press), although some heterogeneity may apply in Yamnaya communities too. The migrant group, graves nos. 4, 7, 9 and 11, all occupy late stratigraphic positions in the mound, and have radiocarbon dates in the second quarter of the third millennium BC. It is also noteworthy that they are all adult or mature men. The contextual data, their physical distribution over the space of the whole kurgan, and the variety of burial practices, indicate several generations of burials. The cultural attributes of this group are summarised in Figure 5. Overall, their closest match lies in the Livezile group from the eastern and southern Apuseni Mountains, which is also the likely place of origin of the buried persons.

yamna-settlements-hungary
Cultural geography of the Carpathian Basin in the first half of the third millennium BC (in black: archaeological cultures and groups dating roughly to the first quarter; in red: those dating to the second quarter). Indicated also are regions and sites mentioned in the text.

The key question is, what cultural process could be responsible for attracting these men from their homeland to the Great Hungarian Plain, over several generations? Their sex and age uniformity indicate they are a social sub-set within a larger group, implying that only a portion of their society was on the move. Exogamy can probably be excluded, since one would expect more women than men to move in prehistoric times; not to mention the distance of more than 200km between the places of potential origin and burial.

One hypothesis would see these men involved in the exchange of goods, with long-term relations between the mountain and steppe communities. Normally living in, or next to, the Apuseni, these men would journey for weeks into the plain, returning to the same places and people over many decades. Ethnographic examples of such travels to exchange objects and ideas, and perhaps people, are numerous (e.g. Helms 1988). However, the child’s (grave 7a) local isotopic signature would remain unexplained, and one has to wonder for how many generations an exchange continues for four men to die near the Őrhalom.

A second hypothesis is essentially an economic model of transhumance, with livestock passing the winter and spring in the milder regions of the Great Hungarian Plain, and returning to higher pastures in the warmer months (Arnold & Greenfield 2006). Such systems can endure for centuries, provided the social relations underpinning them are stable. This has the advantage of accounting for relatively long periods of time spent away from home, as herdsmen guarded their animals, and perhaps some women and their children came too, which would account for the child’s presence, and the pottery relations of the Livezile group. Furthermore, regular visits to a region would increase the likelihood of Livezile transhumant herders becoming integrated locally. The second quarter of the third millennium BC was a period when Yamnaya ideology, and thus its internal coherence, might have already diminished. This would likely have resulted in a weakened grip by Yamnaya people on pastures and territory, consequently allowing Livezile herders, and potentially others, to step in and take over locally, perhaps first on a seasonal basis and then permanently.

On West Yamna settlers in Hungary

yamnaya-hungary-globular-amphora
Modified table from Wang et al. (2018) Supplementary materials (in bold, Yamna and related samples; in red, newly reported samples). “Supplementary Table 18. P values of rank=1 and admixture coefficients of modelling the Steppe ancestry populations as a two-way admixture of the Eneolithic_steppe and Globular_Amphora using 14 outgroups. Left populations: Steppe cluster, Eneolithic_steppe, Globular Amphora Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.”

By disclosing very interesting information on (yet unpublished) Yamna samples from Hungary, the latest preprint from the Reich Lab has rendered irrelevant – in a rather surprising turn of events – (what I expected would be) future discussions on West Yamna settlers potentially sharing a similar ancestry with Baltic Late Neolithic / Corded Ware settlers (see here for more details).

Interesting excerpts regarding the tight cluster formed by all Yamna samples:

Individuals from the North Caucasian steppe associated with the Yamnaya cultural formation (5300-4400 BP, 3300-2400 calBCE) appear genetically almost identical to previously reported Yamnaya individuals from Kalmykia [20] immediately to the north, the middle Volga region [19, 27], Ukraine and Hungary, and to other Bronze Age individuals from the Eurasian steppes who share the characteristic ‘steppe ancestry’ profile as a mixture of EHG and CHG/Iranian ancestry [23, 28]. These individuals form a tight cluster in PCA space (Figure 2) and can be shown formally to be a mixture by significantly negative admixture f3-statistics of the form f3(EHG, CHG; target) (Supplementary Fig. 3).

Using qpAdm with Globular Amphora as a proximate surrogate population (assuming that a related group was the source of the Anatolian farmer-related ancestry), we estimated the contribution of Anatolian farmer-related ancestry into Yamnaya and other steppe groups. We find that Yamnaya individuals from the Volga region (Yamnaya Samara) have 13.2±2.7% and Yamnaya individuals in Hungary 17.1±4.1% Anatolian farmer-related ancestry (Fig. 4; Supplementary Table 18) – statistically indistinguishable proportions.
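
As a rough check of what “statistically indistinguishable” means for these two figures, here is a minimal sketch of my own (not the method used in the preprint). It assumes that the ± values are one standard error each and that the two estimates are independent; neither assumption is stated in the excerpt, and the function name is just illustrative.

```python
import math

def compare_estimates(est_a, se_a, est_b, se_b):
    """Return (z, two-sided p) for the difference between two estimates.

    Assumption: se_a and se_b are one standard error each, and the two
    estimates are independent and roughly normally distributed.
    """
    diff = est_b - est_a
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)  # standard error of the difference
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a normal model
    return z, p

# Anatolian farmer-related ancestry (%), as quoted from Wang et al. (2018):
# Yamnaya Samara 13.2 ± 2.7, Yamnaya Hungary 17.1 ± 4.1
z, p = compare_estimates(13.2, 2.7, 17.1, 4.1)
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under those assumptions the z-score stays below 1, so the 3.9-point gap between the Samara and Hungary estimates is well within the noise of the two measurements.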

yamna_bell_beaker
Yamna – Bell Beaker migration according to Heyd (2007, 2012)

Before this paper, we had the most solid anthropological models backed by Y-DNA against conflicting data from certain statistical tools applied to a few samples (which some used to contradict what was mainstream in Academia).

NOTE. I have discussed this extensively in this blog, and more than once. See for example my posts on R1a speaking IE (July 2017), on the Eneolithic Ukraine sample (September 2017), or on the “Yamnaya ancestral component” (November 2017).

Today, we have everything – including statistical tools – showing a genetically homogeneous, Late PIE-speaking late Khvalynsk/Yamna community expanding into its known branches, confirming what was described using traditional anthropological disciplines:

  • Late Khvalynsk expanding into Afanasevo ca. 3300-3000 BC with an archaic Late PIE dialect, which was attested much later as Tocharian;
  • East Yamna/Poltavka admixing with Uralic-speaking Abashevo migrants probably ca. 2600-2100 BC to form Proto-Indo-Iranian-speaking Sintashta-Petrovka and Potapovka;
  • and now also Yamna settlers: those in Hungary admixing (probably ca. 2800-2500 BC) with the local population to form North-West Indo-European-speaking East Bell Beakers; those from the Balkans forming other IE-speaking Balkan cultures, including the peoples that admixed in Greece, as seen in Mycenaeans.

If Volker Heyd is right with this and other papers – and he has been right until now in his predictions regarding Yamna, Bell Beaker, and Corded Ware cultures – the change in ancestry will probably begin to be noticed in Yamna samples from Hungary and the Lower Danube during the second quarter of the 3rd millennium, a period defined by the addition of a more fashionable western Proto-Bell Beaker package to the fading traditional Yamna cultural package.

EDIT (19 MAY 2018): I corrected some sentences and added interesting information.


Brexit forces relocation of one of today’s main Yamna research projects to Finland

yamnaya_distribution

Archaeologist Volker Heyd is bringing his ERC Advanced Grant to Helsinki, as the University of Helsinki has proudly reported.

Some interesting excerpts (emphasis mine):

With his research group, Heyd wants to map out how the Yamnaya culture, also known as the Pit Grave culture, migrated from the Eurasian steppes to prehistoric south-eastern Europe approximately 3,000 years BCE. Most of the burial mounds typical of the Yamnaya culture have already been destroyed, but new techniques enable their identification and study.

The project is using multidisciplinary methods to solve the mystery. Archaeologists are collaborating with scholars of biological and environmental sciences, using the methods of funerary archaeology, landscape archaeology and remote sensing that are at the group’s disposal. From the field of biological sciences, the group is making use of genetics/DNA analysis, biological anthropology and biogeochemistry. As for environmental sciences, their contribution is in the form of palaeoclimatology, which studies climate before modern meteorological observations, and soil formation processes.

The project, coordinated by the discipline of archaeology at the University of Helsinki, will also welcome researchers from Mainz, London, Bristol and Budapest, in addition to which the group will collaborate with Czech, Slovak and Polish colleagues. Field studies and sample collection for the project will be conducted in Romania, Bulgaria, Hungary and Serbia.

In Helsinki, Volker Heyd’s main collaborator is Professor Heikki Seppä from the Department of Geosciences and Geography on the Kumpula Campus, while the team will also be hiring three postdoctoral researchers.

yamna-bell-beaker
Yamna – East Bell Beaker migration 3000-2300 BC, after Heyd (2007, 2012)

Yamnaya from the east changed Europe forever

The researchers wish to understand how the Yamnaya migrated to Europe and how the arrival of a new culture changed an entire continent.

How many people actually arrived? Taking the scale of the changes, some estimates range in the millions, but according to Volker Heyd, the number of people representing the Yamnaya culture in southeast Europe was around several ten thousands. It is indeed remarkable how such a relatively small group of people has had such a significant and far-reaching impact on Europe.

The Yamnaya also brought with them new cultural and social norms that have had far-reaching consequences. For instance, patriarchy and monogamy seems to be part of the Yamnaya legacy. Another established theory speculates that marriages made women migrate and travel even across great distances.

In accordance with primogeniture, the first-born son of the family inherited his parents’ possessions, while the younger siblings had to make their own way through other means. Among other things, this practice guaranteed ample human resources for the legions of the Roman Empire, which enabled its establishment and expansion, and later filled the ranks of medieval monasteries across Europe.

Another interesting question is what made representatives of the Yamnaya culture migrate from the eastern European steppes to the west. Heyd believes that the underlying reason may have been climate change. The Yamnaya were almost exclusively dependent on animal husbandry. As the climate changed – when rainfalls decreased in the east – they may have been forced to migrate west to secure the welfare of their cattle.

North-East Europe and Corded Ware

Heyd has already been here as a visiting professor in the Helsinki University Humanities programme since the beginning of the year, working on another project. Together with Postdoctoral Researcher Kerkko Nordqvist, he is investigating the prehistoric settlement of north-eastern Europe 3,000 – 6,000 years ago with research methods similar to the new Yamnaya project. One of their central research questions is what made people migrate to this region, and which innovations they brought with them. In this case also, the reasons behind the migration may be related to changes in the environment and climate.

This is probably bad news for research in the UK (I say probably because I guess many Brexiteers will be happy to have fewer foreign researchers in their country), but it is great news to see both researchers, Heyd and Nordqvist (whose Ph.D. thesis includes research on the Corded Ware culture that I have recently mentioned), able to collaborate in assessing Indo-European and Uralic migrations.

Heyd’s website at the University of Bristol states that he is currently working on:

  1. ‘The Milking Revolution in Temperate Neolithic Europe (NeoMilk)’. Funded by an ERC Advanced Grant, European Union, to R. Evershed. See, for further information: www.neomilk-erc.eu
  2. ‘The Yamnaya Impact’: Archaeology and scientific research of/into the Yamnaya populations of Southeastern Europe and their impact on contemporary local and neighboring 3rd millennium BC societies as well as their role in the emergence of the Corded Ware and Bell Beaker complexes in Europe.
  3. ‘The Prehistoric Peopling of Northeastern Europe’: Inter-/crossdisciplinary studies on the archaeology, anthropology, linguistics, and bio- and environmental sciences of early Uralic speakers and their first horizon of interactions with Indo-European speakers. This wider project is in cooperation with colleagues from Helsinki and Turku Universities in Finland, as well as from Russia, Estonia and Poland.
  4. ‘Czech Republic’: I am closely cooperating with the Institute of Archaeology, Czech Academy of Sciences, in Prague for two research projects funded by the Czech Grant Agency in which we measure various isotopes from human remains in Bristol to understand past mobility and diet. The Humboldt-Kolleg conference ‘Reinecke’s Heritage’ (with P. Pavúk, M. Ernée and J. Peska) held in June 2017 at Chateau Křtiny/Moravia is also part of this cooperation. See, for further information: http://ukar.ff.cuni.cz/reinecke.
yamna-late-proto-indo-european
Image modified from Narasimhan et al. (2018), including the most likely proto-language identification of different groups. Original description “Modeling results including Admixture events, with clines or 2-way mixtures shown in rectangles, and clouds or 3-way mixtures shown in ellipses”. See the original full image here.

On the genetic aspect, we have gross Yamna migrations today as clearly depicted as they will ever be: late Khvalynsk/Yamna expanded Late Proto-Indo-European languages, and Bell Beakers brought North-West Indo-European to almost all of Europe, as predicted in Harrison and Heyd (2007). Full stop.

There is still fine-grained population structure, though, as Lazaridis puts it, to be detected in migratory movements contemporary or subsequent to the Yamna settlements in South-East Europe and the East Bell Beaker expansion.

As more samples are made available, we will probably still lack a comprehensive description of local archaeological cultural exchanges – one that fits the potential dialectal developments and expansions – to be coupled with the small-scale migratory movements seen in genetics.

This work from the University of Helsinki will hopefully provide the necessary detailed anthropological foundations to be used with future genetic studies to obtain a more precise picture of the formation and expansion of North-West Indo-Europeans.


Population replacement in Early Neolithic Britain, and new Bell Beaker SNPs

copper-age-late-bell-beaker

New (copyrighted) preprint at bioRxiv, Population Replacement in Early Neolithic Britain, by Brace et al. (2018).

Abstract (emphasis mine):

The roles of migration, admixture and acculturation in the European transition to farming have been debated for over 100 years. Genome-wide ancient DNA studies indicate predominantly Anatolian ancestry for continental Neolithic farmers, but also variable admixture with local Mesolithic hunter-gatherers. Neolithic cultures first appear in Britain c. 6000 years ago (kBP), a millennium after they appear in adjacent areas of northwestern continental Europe. However, the pattern and process of the British Neolithic transition remains unclear. We assembled genome-wide data from six Mesolithic and 67 Neolithic individuals found in Britain, dating from 10.5-4.5 kBP, a dataset that includes 22 newly reported individuals and the first genomic data from British Mesolithic hunter-gatherers. Our analyses reveals persistent genetic affinities between Mesolithic British and Western European hunter-gatherers over a period spanning Britain’s separation from continental Europe. We find overwhelming support for agriculture being introduced by incoming continental farmers, with small and geographically structured levels of additional hunter-gatherer introgression. We find genetic affinity between British and Iberian Neolithic populations indicating that British Neolithic people derived much of their ancestry from Anatolian farmers who originally followed the Mediterranean route of dispersal and likely entered Britain from northwestern mainland Europe.

Also, Genetiker has updated Y-SNP calls from new data published by the Harvard group.

The R1b lineages that expanded from (Yamna->) East Bell Beakers -> Western Europe are more and more clearly of R1b-L151 subclades, as expected.

Quite interesting are the early samples from Poland, of R1b1a1a2a2-Z2103 and R1b1a1a2a1a-L151 lineages, which may point (unlike the more homogeneous L151 distribution in Western Europe) to a mix in both original (east-west) Yamna groups. This could tentatively be used to explain the Graeco-Aryan influence that some linguists see in Balto-Slavic (or its superstrate).

That link would then be quite early, to account for an influence during the Yamna settlements in Hungary, before its expansion as East Bell Beakers, but we haven’t seen a clearly differentiated subgroup (yet) in Archaeology, Anthropology, or Genomics within the Hungarian Yamna/East Bell Beaker community, so I am not convinced. It could just be that different scattered subclades mixed with the general L151 population pop up (following old Yamna lineages, or having been added along the way), as expected in an expansion over such a great territory – as would be the case if some scattered samples of R1a, I1, I2, J, etc. were found.

We need more early samples from south-eastern Europe and the steppe during the Chalcolithic to ascertain the composition and migration paths of the different Yamna settlers.

Other interesting findings are the early (Proto-)Bell Beaker samples of haplogroup R1b with no steppe ancestry from Spain, which some autochthonous continuists wanted to believe was proof of some kind: they are actually R1b-V88, a haplogroup known to have expanded throughout Europe quite early. In fact, this subclade has been recently shown to have most likely expanded through the Green Sahara region, and is potentially linked to the expansion of Afro-Asiatic.


About the European Union’s arcane language: the EU does seem difficult for people to understand

Mark Mardell asks in his post Learn EU-speak:

Does the EU shroud itself in obscure language on purpose or does any work of detail produce its own arcane language? Of course it is not just the lingo: the EU does seem difficult for people to understand. What’s at the heart of the problem?

His answer on the radio (like the comments that can be read on his blog) will probably look for complex reasoning on the nature of the European Union as an elitist institution, distant from real people, on the “obscure language” (intentionally?) used by MEPs, on the need for that language to be obscured by legal terms, etc.

All that is great. You can talk a lot about the possible reasons why people would find those Europarliament discussions, where everyone speaks his own national language, too boring; possible reasons why important media (like the BBC) would never show debates on important issues, unless the MEP uses their national language; possible reasons why that doesn’t happen with national parliaments where everyone speaks a common language…

But the most probable answer is so obvious it doesn’t really make sense to ask. The interesting question is: do people actually want to pay the price for having a common Europe?