The earliest archaeological evidence of horse milking, harnessing, and corralling is found in the ∼5,500-year-old Botai culture of Central Asian steppes (Gaunitz et al., 2018, Outram et al., 2009; see Kosintsev and Kuznetsov, 2013 for discussion). Botai-like horses are, however, not the direct ancestors of modern domesticates but of Przewalski’s horses (Gaunitz et al., 2018). The genetic origin of modern domesticates thus remains contentious, with suggested candidates in the Pontic-Caspian steppes (Anthony, 2007), Anatolia (Arbuckle, 2012, Benecke, 2006), and Iberia (Uerpmann, 1990, Warmuth et al., 2011). Irrespective of the origins of domestication, the horse genome is known to have been reshaped significantly within the last ∼2,300 years (Librado et al., 2017, Wallner et al., 2017, Wutke et al., 2018). However, when and in which context(s) such changes occurred remains largely unknown.
To clarify the origins of domestic horses and reveal their subsequent transformation by past equestrian civilizations, we generated DNA data from 278 equine subfossils with ages mostly spanning the last six millennia (n = 265, 95%) (Figures 1A and 1B; Table S1; STAR Methods). Endogenous DNA content was compatible with economical sequencing of 87 new horse genomes to an average depth-of-coverage of 1.0- to 9.3-fold (median = 3.3-fold; Table S2). This more than doubles the number of ancient horse genomes hitherto characterized. With a total of 129 ancient genomes, 30 modern genomes, and new genome-scale data from 132 ancient individuals (0.01- to 0.9-fold, median = 0.08-fold), our dataset represents the largest genome-scale time series published for a non-human organism (Tables S2, S3, and S4; STAR Methods).
Discovering Two Divergent and Extinct Lineages of Horses
Domestic and Przewalski’s horses are the only two extant horse lineages (Der Sarkissian et al., 2015). Another lineage was genetically identified from three bones dated to ∼43,000–5,000 years ago (Librado et al., 2015, Schubert et al., 2014a). It showed morphological affinities to an extinct horse species described as Equus lenensis (Boeskorov et al., 2018). We now find that this extinct lineage also extended to Southern Siberia, following the principal component analysis (PCA), phylogenetic, and f3-outgroup clustering of an ∼24,000-year-old specimen from the Tuva Republic within this group (Figures 3, 5A and S7A). This new specimen (MerzlyYar_Rus45_23789) carries an extremely divergent mtDNA only found in the New Siberian Islands some ∼33,200 years ago (Orlando et al., 2013) (Figure 6A; STAR Methods) and absent from the three bones previously sequenced. This suggests that a divergent ghost lineage of horses contributed to the genetic ancestry of MerzlyYar_Rus45_23789. However, both the timing and location of the genetic contact between E. lenensis and this ghost lineage remain unknown.
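For readers unfamiliar with the f3-outgroup clustering mentioned in this excerpt, here is a minimal sketch of the statistic on toy data. The function name and the allele frequencies are hypothetical, and the finite-sample bias correction used by real tools is left out; this is not the pipeline used in the paper.

```python
def outgroup_f3(p_out, p_a, p_b):
    """Simplified outgroup f3: mean of (p_out - p_a) * (p_out - p_b) over sites.

    Inputs are per-site allele frequencies (floats in [0, 1]) for the
    outgroup and two test populations. Larger values mean A and B share
    more genetic drift since their split from the outgroup.
    NOTE: no finite-sample correction is applied (assumption, for brevity).
    """
    if not p_out or not (len(p_out) == len(p_a) == len(p_b)):
        raise ValueError("need three non-empty frequency lists of equal length")
    total = sum((o - a) * (o - b) for o, a, b in zip(p_out, p_a, p_b))
    return total / len(p_out)


# Toy data (hypothetical): the outgroup carries the reference allele everywhere.
outgroup = [0.0, 0.0, 0.0, 0.0]
pop_a = [1.0, 1.0, 0.0, 0.0]
pop_b = [1.0, 1.0, 0.0, 0.0]  # shares its derived alleles with pop_a
pop_c = [0.0, 0.0, 1.0, 1.0]  # derived alleles at different sites

shared_ab = outgroup_f3(outgroup, pop_a, pop_b)
shared_ac = outgroup_f3(outgroup, pop_a, pop_c)
```

In this toy case pop_a clusters with pop_b (higher shared drift) rather than with pop_c, which is the kind of signal used to place a specimen within a group.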
Modeling Demography and Admixture of Extinct and Extant Horse Lineages
Phylogenetic reconstructions without gene flow indicated that IBE differentiated prior to the divergence between DOM2 and Przewalski’s horses (Figure 3; STAR Methods). However, allowing for one migration edge in TreeMix suggested closer affinities with one single Hungarian DOM2 specimen from the 3rd mill. BCE (Dunaujvaros_Duk2_4077), with extensive genetic contribution (38.6%) from the branch ancestral to all horses (Figure S7B). This, and the extremely divergent IBE Y chromosome (Figure 6B), suggest that a divergent but yet unidentified ghost population could have contributed to the IBE genetic makeup.
Rejecting Iberian Contribution to Modern Domesticates
The genome sequences of four ∼4,800- to 3,900-year-old IBE specimens characterized here allowed us to clarify ongoing debates about the possible contribution of Iberia to horse domestication (Benecke, 2006, Uerpmann, 1990, Warmuth et al., 2011). Calculating the so-called fG ratio (Martin et al., 2015) provided a minimal boundary for the IBE contribution to DOM2 members (Cahill et al., 2013) (Figure 7A). The maximum of such estimate was found in the Hungarian Dunaujvaros_Duk2_4077 specimen (∼11.7%–12.2%), consistent with its TreeMix clustering with IBE when allowing for one migration edge (Figure S7B). This specimen was previously suggested to share ancestry with a yet-unidentified population (Gaunitz et al., 2018). Calculation of f4-statistics indicates that this population is not related to E. lenensis but to IBE (Figure 7B; STAR Methods). Therefore, IBE or horses closely related to IBE, contributed ancestry to animals found at an Early Bronze Age trade center in Hungary from the late 3rd mill. BCE. This could indicate that there was long-distance exchange of horses during the Bell Beaker phenomenon (Olalde et al., 2018). The fG minimal boundary for the IBE contribution into an Iron Age Spanish horse (ElsVilars_UE4618_2672) was still important (~9.6%–10.1%), suggesting that an IBE genetic influence persisted in Iberia until at least the 7th century BCE in a domestic context. However, fG estimates were more limited for almost all ancient and modern horses investigated (median = ~4.9%–5.4%; Figure 7A).
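The f4 test mentioned above (is the unidentified population closer to E. lenensis or to IBE?) can be illustrated with a toy sketch. The function name, the population roles, and the frequencies below are assumptions for illustration only; the actual configuration is described in the paper's STAR Methods, and real analyses add a block jackknife to obtain Z-scores.

```python
def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean of (p_a - p_b) * (p_c - p_d) over sites.

    Positive values indicate excess allele sharing between A and C
    (or B and D); negative values indicate excess sharing between
    B and C (or A and D). A value near zero is compatible with
    (A, B) and (C, D) forming separate clades.
    """
    n = len(p_a)
    if n == 0 or not (n == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("need four non-empty frequency lists of equal length")
    return sum((a - b) * (c - d)
               for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


# Toy frequencies (hypothetical roles, not data from the paper).
outgroup = [0.0, 0.0, 0.0, 0.0]
test_horse = [1.0, 0.0, 1.0, 0.0]
candidate_1 = [1.0, 0.0, 1.0, 0.0]  # shares derived alleles with test_horse
candidate_2 = [0.0, 0.0, 0.0, 1.0]  # does not

value = f4(outgroup, test_horse, candidate_1, candidate_2)
```

Here the negative value reflects test_horse (B) sharing alleles with candidate_1 (C) rather than with candidate_2 (D).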
Iron Age horses
Y chromosome nucleotide diversity (π) decreased steadily in both continents during the last ∼2,000 years but dropped to present-day levels only after 850–1,350 CE (Figures 2B and S2E; STAR Methods). This is consistent with the dominance of an ∼1,000- to 700-year-old oriental haplogroup in most modern studs (Felkel et al., 2018, Wallner et al., 2017). Our data also indicate that the growing influence of specific stallion lines post-Renaissance (Wallner et al., 2017) was responsible for as much as a 3.8- to 10.0-fold drop in Y chromosome diversity.
We then calculated Y chromosome π estimates within past cultures represented by a minimum of three males to clarify the historical contexts that most impacted Y chromosome diversity. This confirmed the temporal trajectory observed above as Byzantine horses (287–861 CE) and horses from the Great Mongolian Empire (1,206–1,368 CE) showed limited yet larger-than-modern diversity. Bronze Age Deer Stone horses from Mongolia, medieval Aukštaičiai horses from Lithuania (C9th–C10th [ninth through the tenth centuries of the Common Era]), and Iron Age Pazyryk Scythian horses showed similar diversity levels (0.000256–0.000267) (Figure 2A). However, diversity was larger in La Tène, Roman, and Gallo-Roman horses, where Y-to-autosomal π ratios were close to 0.25. This contrasts to modern horses, where marked selection of specific patrilines drives Y-to-autosomal π ratios substantially below 0.25 (0.0193–0.0396) (Figure 2A). The close-to-0.25 Y-to-autosomal π ratios found in La Tène, Roman, and Gallo-Roman horses suggest breeding strategies involving an even reproductive success among stallions or equally biased reproductive success in both sexes (Wilson Sayres et al., 2014).
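To see where the 0.25 benchmark comes from, here is a minimal sketch under textbook assumptions that are mine, not the paper's: a neutral, constant-size population, random variation in offspring number, and equal mutation rates on the Y chromosome and the autosomes. With Nm breeding males and Nf breeding females, the autosomal effective size is 4·Nm·Nf/(Nm + Nf), carried on twice that many chromosome copies, while the Y is carried on Nm copies, so the expected ratio is (Nm + Nf)/(8·Nf). The function names are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Pi: average number of pairwise differences per site.

    `sequences` is a list of equal-length aligned strings.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(1 for x, y in zip(s1, s2) if x != y) for s1, s2 in pairs
    )
    return differences / (len(pairs) * length)


def expected_y_to_autosome_ratio(n_males, n_females):
    """Expected pi_Y / pi_autosomes under the simplifying assumptions above."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("breeding population sizes must be positive")
    return (n_males + n_females) / (8.0 * n_females)


even_sexes = expected_y_to_autosome_ratio(50, 50)   # equal numbers of sires and dams
few_sires = expected_y_to_autosome_ratio(10, 90)    # strongly female-biased breeding
toy_pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

Under this simple model the ratio cannot fall below 0.125 however few stallions breed, so the modern values quoted above (0.0193–0.0396) need something beyond a skewed sex ratio, which fits the paper's reading of marked selection of specific patrilines.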
Lineage is used in this paper, as in many others in genetics, as defined by a specific ancestry. I keep that nomenclature below. It should not be confused with the “lineages” or “lines” referring to Y-chromosome (or mtDNA) haplogroups.
Supporting the “archaic” nature of the Hungarian BBC horses expanding from the Pontic-Caspian steppes are:
Among Y-chromosome lines, the common group formed by Botai-Borly4 (closely related to DOM2), Scythian horses from Aldy Bel (Arzhani), Iron Age horses from Estonia (Ridala), horses from the Xiongnu culture (Uushgiin Uvur), and Roman horses from Autricum (Chartres).
Among mtDNA lines, the common group formed by Botai samples, LebyazhinkaIV NB35, and different Eurasian domesticates, including many ancient Western European ones, which reveals a likely expansion of certain subclades east and west with the Repin culture.
(…) DOM2 contributed 22% to the ancestor of Przewalski’s horses ca. 9.47 kya, suggesting the Holocene optimum, rather than the Eneolithic Botai culture (∼5.5 kya), as a period of population contact. This pre-Botai introgression could explain the Y chromosome topology, where Botai horses were reported to carry two different segregating haplogroups: one occupied a basal position in the phylogeny while the other was closely related to DOM2. Multiple admixture pulses, however, are known to have occurred along the divergence of DOM2 and the Botai-Borly4 lineage, including 2.3% post-Borly4 contribution to DOM2, and a more recent 6.8% DOM2 introgression into Przewalski’s horses (Gaunitz et al., 2018). Model C2 parameters accommodate all these as a single admixture pulse, likely averaging the contributions of all these multiple events.
The paper cannot offer a detailed picture of ancient horse domestication, but it is yet another step in showing how Repin/Yamna is the most likely source of expansion of horse domesticates in Eurasia. Even more interestingly, Yamna settlers in Hungary probably expanded an ancient lineage of that horse at the same time as they spread with the Classical Bell Beaker culture. Remarkable parallels are thus found between:
The expansion of an ancient line of horse domesticates related to Yamna Hungary/East Bell Beakers seems to be confirmed by the pre-Iberian sample from Vilars I, Els Vilars4618 2672 (ca. 700-550 BC), likely of Iberian Beaker descent, showing a lineage older than the Indo-Iranian ones, which later replaced most European lines.
NOTE. For known contacts between Yamna and Proto-Beakers just before the expansion of East Bell Beakers, see a recent post on Vanguard Yamna groups.
The findings of the paper confirm the expansion of the horse firstly (and mainly) through the steppe biome, mimicking the expansion of Proto-Indo-Europeans first, and then replaced gradually (or not so gradually) by lines brought to Europe during westward expansions of Bronze Age, Iron Age, and later specialized horse-riding steppe cultures. The expansion also correlates well with the known spread of animal traction and pastoralism before 2000 BC:
Yamna expansion to the west “with horses and wagons”, with a more homogeneous ancestry in modern Europeans due to later migrations from the east (and north):
DR: inference is that two major migrations: farmers from Anatolia, followed by steppe pastoralists. Who are they? They took horses and wagons and spread. See rapid 90% pop turn over in Britain. Similar timing in Iberia, but a bit less turnover, and more period of overlap
Interesting excerpts, referring mainly to Uralic peoples (emphasis mine):
A model-based clustering analysis using ADMIXTURE shows a similar pattern (Fig. 2b and Supplementary Fig. 3). Overall, the proportions of ancestry components associated with Eastern or Western Eurasians are well correlated with longitude in inner Eurasians (Fig. 3). Notable outliers include known historical migrants such as Kalmyks, Nogais and Dungans. The Uralic- and Yeniseian-speaking populations, as well as Russians from multiple locations, derive most of their Eastern Eurasian ancestry from a component most enriched in Nganasans, while Turkic/Mongolic speakers have this component together with another component most enriched in populations from the Russian Far East, such as Ulchi and Nivkh (Supplementary Fig. 3). Turkic/Mongolic speakers comprising the bottom-most cline have a distinct Western Eurasian ancestry profile: they have a high proportion of a component most enriched in Mesolithic Caucasus hunter-gatherers and Neolithic Iranians and frequently harbour another component enriched in present-day South Asians (Supplementary Fig. 4). Based on the PCA and ADMIXTURE results, we heuristically assigned inner Eurasians to three clines: the ‘forest-tundra’ cline includes Russians and all Uralic and Yeniseian speakers; the ‘steppe-forest’ cline includes Turkic- and Mongolic-speaking populations from the Volga and Altai–Sayan regions and Southern Siberia; and the ‘southern steppe’ cline includes the rest of the populations.
For the forest-tundra populations, the Nganasan + Srubnaya model is adequate only for the two Volga region populations, Udmurts and Besermyans (Fig. 5 and Supplementary Table 8).
For the other populations west of the Urals, six from the northeastern corner of Europe are modelled with additional Mesolithic Western European hunter-gatherer (WHG) contribution (8.2–11.4%; Supplementary Table 8), while the rest need both WHG and early Neolithic European farmers (LBK_EN; Supplementary Table 2). Nganasan-related ancestry substantially contributes to their gene pools and cannot be removed from the model without a significant decrease in the model fit (4.1–29.0% contribution; χ2 P ≤ 1.68 × 10−5; Supplementary Table 8).
NOTE. It doesn’t seem like Hungarians can be easily modelled with Nganasan ancestry, though…
For the 4 populations east of the Urals (Enets, Selkups, Kets and Mansi), for which the above models are not adequate, Nganasan + Srubnaya + AG3 provides a good fit (χ2 P ≥ 0.018; Fig. 5 and Supplementary Table 8). Using early Bronze Age populations from the Baikal Lake region (‘Baikal_EBA’; Supplementary Table 2) as a reference instead of Nganasan, the two-way model of Baikal_EBA + Srubnaya provides a reasonable fit (χ2 P ≥ 0.016; Supplementary Table 8) and the three-way model of Baikal_EBA + Srubnaya + AG3 is adequate but with negative AG3 contribution for Enets and Mansi (χ2 P ≥ 0.460; Supplementary Table 8).
Bronze/Iron Age populations from Southern Siberia also show a similar ancestry composition with high ANE affinity (Supplementary Table 9). The additional ANE contribution beyond the Nganasan + Srubnaya model suggests a legacy from ANE-ancestry-rich clines before the Late Bronze Age.
Even among the earliest available inner Eurasian genomes, east–west connectivity is evident. These, too, form a longitudinal cline, characterized by the easterly increase of a distinct ancestry, labelled Ancient North Eurasian (ANE), lowest in western European hunter-gatherers (WHG) and highest in Palaeolithic Siberians from the Baikal region. Flow-through from this ANE cline is seen in steppe populations until at least the Bronze Age, including the world’s earliest known horse herders — the Botai. However, this is eroded over time by migration from west and east, following agricultural adoption on the continental peripheries (Fig. 1b,c).
Strikingly, Jeong et al. model the modern upper steppe cline as a simple two-way mixture between western Late Bronze Age herders and Northeast Asians (Fig. 1c), with no detectable residue from the older ANE cline. They propose modern steppe peoples were established mainly through migrations post-dating the Bronze Age, a sequence for which has been recently outlined using ancient genomes. In contrast, they confirm a substantial ANE legacy in modern Siberians of the northernmost cline, a pattern mirrored in excesses of WHG ancestry west of the Urals (Fig. 1b). This marks the inhospitable biome as a reservoir for older lineages, an indication that longstanding barriers to latitudinal movement may indeed be at work, reducing the penetrance of gene flows further south along the steppe.
Given the findings as reported in the paper, I think it should be much easier to describe different subclines in the “northernmost cline” than in the much more recent “Turkic/Mongolic cline”, which is nevertheless subdivided in this paper in two clines. As an example, there are at least two obvious clines with “Nganasan-related meta-populations” among Uralians, which converge in a common Steppe MLBA (i.e. Corded Ware) ancestry – one with Palaeo-Laplandic peoples, and another one with different Palaeo-Siberian populations:
The inclusion of certain Eurasian groups (or lack thereof) in the PCA doesn’t help to distinguish these subclines visually, and I guess the tiny “Nganasan-related” ancestral components found in some western populations (e.g. the famous ~5% among Estonians) probably don’t lend themselves easily to further subdivisions. Notice, nevertheless, the different components of the Eastern Eurasian source populations among Finno-Ugrians:
Also remarkable is the lack of comparison of Uralic populations with other neighbouring ones, since the described Uralic-like ancestry of Russians was already known, and is most likely due to the recent acculturation of Uralic-speaking peoples in the cradle of Russians, right before their eastward expansions.
A comparison of Estonians and Finns with Balts, Scandinavians, and Eastern Europeans would have been more informative for the division of the different so-called “Nganasan-like meta-populations”, and to ascertain which one of these ancestral peoples along the ancient WHG:ANE cline could actually be connected (if at all) to the Cis-Urals.
Because, after all, based on linguistics and archaeology, geneticists are not supposed to be looking for populations from the North Asian Arctic region, for “Siberian ancestry”, or for haplogroup N1c – despite previous works by their peers – , but for the Bronze Age Volga-Kama region…
It has been known for a long time that the Caucasus must have hosted many (at least partially) isolated populations, probably helped by geographical boundaries, setting it apart from open Eurasian areas.
David Reich writes in his book the following about India:
The genetic data told a clear story. Around a third of Indian groups experienced population bottlenecks as strong or stronger than the ones that occurred among Finns or Ashkenazi Jews. We later confirmed this finding in an even larger dataset that we collected working with Thangaraj: genetic data from more than 250 jati groups spread throughout India (…)
Rather than an invention of colonialism as Dirks suggested, long-term endogamy as embodied in India today in the institution of caste has been overwhelmingly important for millennia. (…)
The Han Chinese are truly a large population. They have been mixing freely for thousands of years. In contrast, there are few if any Indian groups that are demographically very large, and the degree of genetic differentiation among Indian jati groups living side by side in the same village is typically two to three times higher than the genetic differentiation between northern and southern Europeans. The truth is that India is composed of a large number of small populations.
There is little doubt now, based on findings spanning thousands of years, that the Mesolithic and Neolithic Caucasus hosted various very small populations, even if the ancestral components may be reduced to the few known to date (such as ANE, EHG, AME*, ENA, CHG, and other “deep” ancestral components).
NOTE. I will call the ancestral component of Dzudzuana/Anatolian hunter-gatherers Ancient Middle Easterner (AME), to give a clear idea of its likely extension during the Late Upper Palaeolithic, and to avoid using the more simplistic Dzudzuana, unless it is useful to mention these specific local samples.
Genetic labs have a strong fixation with ancestry. I guess the use of complex statistical methods gives professionals and laymen alike the feeling of dealing with “Science”, as opposed to academic fields where you have to interpret data. I think language reveals a lot about the way people think, and the fact that ancestral components are called ‘lineages’ – while not wrong per se – is a clear symptom of the lack of interest in the true lineages: Y-DNA haplogroups.
It has become quite clear that male-biased migrations are often the ones which can be confidently followed for actual population movements and ethnolinguistic identification, at least until the Iron Age. The frequently used Palaeolithic clusters offer a clear example of why ancestry does not represent what some people believe: They merely give a basic idea of sizeable population replacements by distant peoples.
Both concepts are important: sizeable and distant peoples. For example, during the Upper Palaeolithic in Europe there was a sizeable population replacement of the Aurignacian Goyet cluster by the Gravettian Vestonice cluster (probably from populations of far eastern Russia) coupled with the arrival of haplogroup I, although during the thousands of years that this material culture lasted, the previously expanded C1a2 lineages did not disappear, and there were probably different resurgence and admixture events.
Haplogroup I certainly expanded with the Gravettian culture to Iberia, where the Goyet ancestry did not change much, probably because of male-driven migrations, to the extent that during the Magdalenian expansions haplogroup I expanded with an ancestry closer to Goyet, in what is called a ‘resurge’ of the Goyet cluster – even though there is a clear replacement of male lines.
The Villabruna (WHG) cluster is another good example. It probably spread with haplogroup R1b-L754, which – based on the extra ‘East Asian’ affinity of some samples and on modern samples from the Middle East – came probably from the east through a southern route, and not too long before the expansion of WHG likely from around the Black Sea, although this is still unclear. The finding of haplogroup I in samples of mostly WHG ancestry could confuse people that do not care about timing, sub-structured populations, and gene flow.
NOTE. If you don’t understand why ‘clusters’ that span thousands of years don’t really matter for the many Palaeolithic population expansions that certainly happened among hunter-gatherers in Europe, just take a look at what happened with Bell Beakers expanding from Yamna into western Europe within 500 years.
If we don’t tread carefully when talking about population migrations, these terms are bound to confuse people. Just as the fixation on “steppe ancestry” – which marks the arrival in Chalcolithic Europe of peoples from the Pontic-Caspian region – has confused a lot of researchers to this day.
When I began to write about the Indo-European demic diffusion model, my concern was to find a single spot where a North-West Indo-European proto-language could have expanded from ca. 2000 BC (our most common guesstimate). Based on the 2015 papers, and in spite of their conclusions, I thought it had become clear that Corded Ware was not it, and it was rather Bell Beakers. I assumed that Uralic was spoken to the north (as was the traditional belief), and thus Corded Ware expanded from the forest zone, hence steppe ancestry would also be found there with other R1a lineages.
With the publication of Mathieson et al. (2017) and Olalde et al. (2017), I changed my mind, seeing how “steppe ancestry” did in fact appear quite late, hence it was likely to be the result of very specific population movements, probably directly from the Caucasus. Later, Mathieson published in a revision the sample from Alexandria of hg R1a-M417 (probably R1a-Z645, possibly Z93+), which further supported the idea that the migration of Corded Ware peoples started near the North Pontic forest-steppe (as I included in the next revision).
The question remains the same I repeated recently, though: where do the extra Caucasus components (i.e. beyond EHG) of Eneolithic Ukraine/Corded Ware and Khvalynsk/Yamna come from?
Considering 2-way mixtures, we can model Karelia_HG as deriving 34 ± 2.8% of its ancestry from a Villabruna-related source, with the remainder mainly from ANE represented by the AfontovaGora3 (AG3) sample from Lake Baikal ~17kya.
AG3 was likely of haplogroup Q1a (as reported by YFull, see Genetiker), and probably the ANE ancestry found in Eastern Europe accompanied a Palaeolithic migration of Q1a2-M25 (formed ca. 22600 BC, TMRCA ca. 14300 BC).
Combined with what we know about the Eneolithic Steppe and Caucasus populations, it is likely that ANE ancestry remained the most important component of some of the small ghost populations of the Caucasus until their emergence with the Lola culture.
The first sample we have now attributed to the EHG cluster is Sidelkino, from the Samara region (ca. 9300 BC), mtDNA U5a2. In Damgaard et al. (Science 2018), Yamnaya could be modelled as deriving 54% of its ancestry from a CHG population related to Kotias Klde and the remainder (>46%) from an ANE population related to Sidelkino, with the following split events:
A split event, where the CHG component of Yamnaya splits from KK1. The model inferred this time at 27 kya (though we note the larger models in Sections S2.12.4 and S2.12.5 inferred a more recent split time).
A split event, where the ANE component of Yamnaya splits from Sidelkino. This was inferred at about 11 kya.
A split event, where the ANE component of Yamnaya splits from Botai. We inferred this to occur 17 kya. Note that this is above the Sidelkino split time, so our model infers Yamnaya to be more closely related to the EHG Sidelkino, as expected.
An ancestral split event between the CHG and ANE ancestral populations. This was inferred to occur around 40 kya.
Other samples classified as of the EHG cluster:
Popovo2 (ca. 6250 BC) of hg J1, mtDNA U4d – Po2 and Po4 from the same site (ca. 6550 BC) show continuity of mtDNA.
Karelia_HG, from Juzhnii Oleni Ostrov (ca. 6300 BC): I0211/UzOO40 (ca. 6300 BC) of hg J1(xJ1a), mtDNA U4a; and I0061/UzOO74 of hg R1a1(xR1a1a), mtDNA C1
UzOO77 and UzOO76 from Juzhnii Oleni Ostrov (ca. 5250 BC) of mtDNA R1b.
Samara_HG from Lebyazhinka (ca. 5600 BC) of hg R1b1a, mtDNA U5a1d.
About the enigmatic Anatolia_Neolithic-related ancestry found in Pontic-Caspian steppe samples, this is what Wang et al. (2018) had to say:
We focused on model of mixture of proximal sources such as CHG and Anatolian Chalcolithic for all six groups of the Caucasus cluster (Eneolithic Caucasus, Maykop and Late Maykop, Maykop-Novosvobodnaya, Kura-Araxes, and Dolmen LBA), with admixture proportions on a genetic cline of 40-72% Anatolian Chalcolithic related and 28-60% CHG related (Supplementary Table 7). When we explored Romania_EN and Greece_Neolithic individuals as alternative southeast European sources (30-46% and 36-49%), the CHG proportions increased to 54-70% and 51-64%, respectively. We hypothesize that alternative models, replacing the Anatolian Chalcolithic individual with yet unsampled populations from eastern Anatolia, South Caucasus or northern Mesopotamia, would probably also provide a fit to the data from some of the tested Caucasus groups.
The first appearance of ‘Near Eastern farmer related ancestry’ in the steppe zone is evident in Steppe Maykop outliers. However, PCA results also suggest that Yamnaya and later groups of the West Eurasian steppe carry some farmer related ancestry as they are slightly shifted towards ‘European Neolithic groups’ in PC2 (Fig. 2D) compared to Eneolithic steppe. This is not the case for the preceding Eneolithic steppe individuals. The tilting cline is also confirmed by admixture f3-statistics, which provide statistically negative values for AG3 as one source and any Anatolian Neolithic related group as a second source
Detailed exploration via D-statistics in the form of D(EHG, steppe group; X, Mbuti) and D(Samara_Eneolithic, steppe group; X, Mbuti) show significantly negative D values for most of the steppe groups when X is a member of the Caucasus cluster or one of the Levant/Anatolia farmer-related groups (Supplementary Figs. 5 and 6). In addition, we used f- and D-statistics to explore the shared ancestry with Anatolian Neolithic as well as the reciprocal relationship between Anatolian- and Iranian farmer-related ancestry for all groups of our two main clusters and relevant adjacent regions (Supplementary Fig. 4). Here, we observe an increase in farmer-related ancestry (both Anatolian and Iranian) in our Steppe cluster, ranging from Eneolithic steppe to later groups. In Middle/Late Bronze Age groups especially to the north and east we observe a further increase of Anatolian farmer related ancestry consistent with previous studies of the Poltavka, Andronovo, Srubnaya and Sintashta groups and reflecting a different process not especially related to events in the Caucasus.
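As a reminder of what a test of the form D(EHG, steppe group; X, Mbuti) measures, here is a minimal frequency-based sketch of the D-statistic on toy data. The function name and the frequencies are hypothetical, and no significance test is included (published analyses normally rely on a block jackknife to obtain Z-scores).

```python
def d_statistic(p_w, p_x, p_y, p_z):
    """D(W, X; Y, Z) from per-site allele frequencies.

    Numerator:   sum of (w - x) * (y - z)
    Denominator: sum of (w + x - 2wx) * (y + z - 2yz)
    With W = EHG, X = steppe group, Y = test population, Z = outgroup,
    a negative D means the steppe group shares more alleles with the
    test population than EHG does.
    """
    n = len(p_w)
    if n == 0 or not (n == len(p_x) == len(p_y) == len(p_z)):
        raise ValueError("need four non-empty frequency lists of equal length")
    numerator = 0.0
    denominator = 0.0
    for w, x, y, z in zip(p_w, p_x, p_y, p_z):
        numerator += (w - x) * (y - z)
        denominator += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    if denominator == 0:
        raise ValueError("no informative sites")
    return numerator / denominator


# Toy frequencies (hypothetical, not taken from Wang et al.).
ehg_like = [0.0, 0.0, 0.0, 0.0]
steppe_like = [1.0, 0.0, 1.0, 0.0]
test_pop = [1.0, 0.0, 1.0, 1.0]
outgroup = [0.0, 0.0, 0.0, 0.0]

d_value = d_statistic(ehg_like, steppe_like, test_pop, outgroup)
```

The extreme value in this toy case only reflects the fixed differences chosen for the example; real D values are far closer to zero and are judged by their Z-scores.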
(…) Surprisingly, we found that a minimum of four streams of ancestry is needed to explain all eleven steppe ancestry groups tested, including previously published ones (Fig. 2; Supplementary Table 12). Importantly, our results show a subtle contribution of both Anatolian farmer-related ancestry and WHG-related ancestry (Fig.4; Supplementary Tables 13 and 14), which was likely contributed through Middle and Late Neolithic farming groups from adjacent regions in the West. The discovery of a quite old AME ancestry has rendered this probably unnecessary, because this admixture from an Anatolian-like ghost population could be driven even by small populations from the Caucasus.
While it is not yet fully clear, the increased Anatolian_Neolithic-like ancestry in Ukraine_Eneolithic samples (see below) makes it unlikely that all such ancestry in Corded Ware groups comes from a GAC-related contribution. It is likely that at least part of it represents contributions from populations of the Caucasus, based on the mostly westward population movements in the steppe from ca. 4600 BC on, including the Suvorovo-Novodanilovka expansion, and especially the Kuban-Maykop expansion during the final Eneolithic into the North Pontic area.
NOTE. Since CHG-like groups from the Caucasus may have combinations of AME and ANE ancestry similar to Yamna (which may thus appear as ‘steppe ancestry’ in the North Pontic area), it is impossible to interpret with precision the following ADMIXTURE graphic:
The East Asian contribution to WHG samples (like Loschbour or La Braña), as specified in Fu et al. (2016), does not seem to be related to Baikal_EN, and appears possibly (in the ADMIXTURE analysis) integrated into the Villabruna component. I guess this implies that the shared alleles with East Asians are quite early, and potentially due to the expansion of R1b-L754 from the East.
It would be interesting to know the specific material culture Sidelkino belonged to – i.e. if it was related to the expansion of the North-Eastern Technocomplex – , and its Y-DNA. The Post-Swiderian expansion into eastern Europe, probably associated with the expansion of R1b-P297 lineages (including R1b-M73, found later in Botai and in Baltic HG) is supposed to have begun during the 11th millennium BC, but migrations to the Urals and beyond are probably concentrated in the 9th millennium, so this sample is possibly slightly early for R1b.
NOTE. User Rozenfeld at Anthrogenica posted this, which I think is interesting (in case anyone wants to try a Y-SNP call):
there is something strange with Sidelkino EHG: first, its archaeological context is not described in the supplementary. Second, its sex is not listed in the supplementary tables. Third, after looking for info about this sample, I found that: “Сиделькино-3. Для снятия вопроса о половой принадлежности индивида была проведена генетическая экспертиза, выявившая принадлежность останков мужчине.”(translation: Sidelkino-3. To resolve the question about sex of the remains, the genetic analysis was conducted, which showed that remains belonged to male), source: http://static.iea.ras.ru/books/7487_Traditsii.pdf
So either they haven’t mentioned his Y-DNA in the paper for some reason, or there are more than one Sidelkino sample and the male one has not yet been published. The coverage of the Sidelkino sample from the paper is 2.9, more than enough to tell Y-DNA haplogroup.
My speculative guess right now about specific population movements in far eastern Europe, based on the few data we have:
The expansion of the North-Eastern Technocomplex first around the 9th millennium BC, most likely expanded R1b-P297 ca. 11300 BC, judging by its TMRCA, with both R1b-M73 (TMRCA 5300 BC) and R1b-M269 (TMRCA 4400 BC) info (with extra El Mirón ancestry) back, and thus Eurasiatic.
The expansion of haplogroup J1 to the north may have happened before or after the R1b-P297 expansion. Judging by the increase in AG3-related ancestry near Karelia compared to Baltic_HG, it is possible that it expanded just after R1b-P297 (hence possibly J1-Y6304? TMRCA 9700 BC). Its long-lasting presence in the Caucasus is supported by the Satsurblia (ca. 11300 BC) and the Dolmen BA (ca. 1300 BC) samples.
The expansion of R1a-M17 ca. 6600 BC is still likely to have happened from the east, based on the R1a-M17 samples found in Baikalic cultures slightly later (ca. 5300 BC). The presence of elevated Baikal_EN ancestry in Karelia HG and in Samara HG, and the finding of R1a-M417 samples in the Forest Zone after the Mesolithic, suggest a connection with the expansion of Hunter-Gatherer pottery, from the Elshanka culture in the Samara region northward into the Forest Zone and westward into the North Pontic area.
The expansion of R1b-M73 ca. 5300 BC is likely to be associated with the emergence of a group east of the Urals (related to the later Botai culture, and potentially Pre-Yukaghir). Its presence in a Narva sample from Donkalnis (ca. 5200 BC) suggests either an early split and spread of both R1b-P297 lineages (M73 and M269) through Eastern Europe, or maybe a back-migration with hunter-gatherer pottery.
R1b-M269 spread successfully ca. 4400 BC (and R1b-L23 ca. 4100 BC, both based on TMRCA), and this successful expansion is probably to be associated with the Khvalynsk-Novodanilovka expansion. We already know that Samara_HG ca. 5600 BC was R1b1a, so it is likely that R1b-M269 appeared (or ‘resurged’) in the Volga-Ural region shortly after the expansion of R1a-M17, whose expansion through the region may be inferred from the additional AG3 and Baikal_EN ancestry. What is interesting in Samara_HG, compared to the previous Sidelkino sample, is the introduction of more El Mirón-related ancestry, typical of WHG populations (and thus characteristic of Baltic groups).
NOTE. The TMRCA dates are obviously gross approximations, because a) the actual rate of mutation is unknown and b) TMRCA estimates are based on the convergence of lineages that survived. The potential finding of R1a-Z645 (possibly Z93+) in Ukraine Eneolithic (ca. 4000 BC), and the potential finding of R1b-L23 in Khvalynsk ca. 4250 BC complicates things further, in terms of dates and origins of any subclade.
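To make point a) concrete, here is a minimal sketch of how strongly a SNP-counting TMRCA estimate depends on the assumed mutation rate. The formula is the generic one (mean number of SNPs accumulated per surviving lineage, divided by rate times callable sequence length); the function name and all numeric inputs are hypothetical placeholders, not values taken from any of the estimates quoted above.

```python
def tmrca_years(mean_snps, rate_per_bp_per_year, callable_bp):
    """Years back to the most recent common ancestor, given the mean number
    of SNPs accumulated per surviving lineage since the split."""
    return mean_snps / (rate_per_bp_per_year * callable_bp)


# Hypothetical inputs, chosen only for illustration (not from any dataset).
MEAN_SNPS = 50.0      # assumed average SNP count per lineage
CALLABLE_BP = 8.0e6   # assumed callable length of the Y chromosome, in bp

# The same SNP count yields noticeably different ages under different rates.
for rate in (7.0e-10, 8.0e-10, 9.0e-10):
    years = tmrca_years(MEAN_SNPS, rate, CALLABLE_BP)
    print(f"rate={rate:.1e}/bp/yr -> TMRCA ~ {years:,.0f} years before present")
```

Because the estimate is inversely proportional to the rate, a modest change in the assumed rate moves the resulting date by many centuries. Point b) is not captured here at all: lineages that went extinct simply never enter the SNP count.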
The question thus remains as it was long ago: did R1b-M269 lineages expand (‘return’) from the east, near the Urals, or directly from the north? Were they already near Samara at the same time as the expansion of hunter-gatherer pottery, and were not much affected by it? Or did they ‘resurge’ from populations admixed with Caucasus-related ancestry after the expansion of R1a-M17 with this pottery (since there are different stepped expansions from the Samara region)? We could even ask, did R1a-M17 really expand from the east, i.e. are the dates on Baikalic subclades from Moussa et al. (2016) reliable? Or did R1a-M17 expand from some pockets in the Pontic-Caspian steppe, taking over the expansion of HG pottery at some point?
The most interesting aspect from the new paper (regarding Indo-Uralic migrations) is that Ancestral Middle Easterner ancestry will probably be a better proxy for the Anatolia_Neolithic component found in Ukraine Mesolithic to Eneolithic, and possibly also for some of the “more CHG-like” component found among Pontic-Caspian steppe populations, all likely derived from different admixture events with groups from the Caucasus.
NOTE. Even the supposed gene flow of Neolithic Iranian ancestry into the Caucasus can be put into question, since that means possibly a Dzudzuana-like population with greater “deep ancestry” proportion than the one found in CHG, which may still be found within the Caucasus.
If it was not clear already that following ‘steppe ancestry’ wherever it appears is a rather lame way of following Indo-European migrations, every single sample from the Caucasus and their admixture with Pontic-Caspian steppe populations will probably show that “steppe ancestry” is in fact formed by a variety of steppe-related ancestral components, impossible to follow coherently with a single population. Exactly what is happening already with the Siberian ancestry.
If the paper on the Dzudzuana samples has shown something, it is that the expansion of an ANE-like population shook the entire Caucasus area up to the Zagros Mountains, creating the ANE – AME cline formed by CHG and Iran_N, with further contributions of “deep ancestries” (probably from the south) complicating the picture further.
If this happens with few known samples, and we know of an ANE-like ghost population in the Caucasus (appearing later in the Lola culture), we can already guess that the often repeated “CHG component” found in Ukraine_Eneolithic and Khvalynsk will not be the same (except the part mediated by the Novodanilovka expansion).
This ANE-like expansion happened probably in the Late Upper Palaeolithic, and reached Northern Europe probably after the expansion of the Villabruna cluster (ca. 12000 BC), judging by the advance of AG3-like and ENA-like ancestry in later WHG samples.
The population movements during the Mesolithic and Early Neolithic in the North Pontic area are quite complicated: the extra AME ancestry is probably connected to the admixture with populations from the Caucasus, while the close similarity of Ukraine populations with Scandinavian ones (with an increase in Villabruna ancestry from Mesolithic to Neolithic samples) probably reveals population movements related to the expansion of Maglemose-related groups.
These Maglemose-related groups were probably migrants from the north-west, originally from the Northern European Plains, who occupied the previous Swiderian territory, and then expanded into the North Pontic area. The overwhelming presence of I2a (likely all I2a2a1b1b) lineages in Ukraine Neolithic supports this migration.
The likely picture of Mesolithic-Neolithic migrations in the North Pontic area right now is then:
Expansion of R1a-M459 from the east ca. 12000 BC – probably coupled with AG3 and also some Baikal_EN ancestry. First sample is I1819 from Vasilievka (ca. 8700 BC), another is from Dereivka ca. 6900 BC.
Expansion of R1b-V88 from the Balkans in the west ca. 9700 BC, based on its TMRCA and also the Balkan hunter-gatherer population overwhelmingly of this haplogroup from the 10th millennium until the Neolithic. First sample is I1734 from Vasilievka (ca. 7252 BC), which suggests that it replaced the male population there, based on their similar EHG-like admixture (and lack of sizeable WHG increase), and shared mtDNA U5b2, U5a2.
Expansion of I2a-Y5606 probably ca. 6800 BC based on its TMRCA with Janislawice culture. Supporting this is the increase in WHG contribution to Neolithic samples, including the spread of U4 subclades compared to the previous period.
Expansion of R1a-M17 starting probably ca. 6600 BC in the east (see above).
NOTE. The first sample of haplogroup I appears in the Mesolithic: I1763 (ca. 8100 BC) of haplogroup I2a1, probably related to an older Upper Palaeolithic expansion.
It is becoming more and more clear with each new paper that – unless the number of very ancient samples increases – the use of Y-chromosome haplogroups remains one of the most important tools for academics; this is especially so in the steppes, in light of the diversity found in populations from the Caucasus. A clear example comes from the Yamna – Corded Ware similarities:
The presence of haplogroups Q and R1a-M459 (xM17) in Khvalynsk along with a R1b1a sample, which some interpreted as being akin to modern ‘mixed’ populations in the past, is likely to point instead to a period of Khvalynsk-Novodanilovka expansion with R1b-M269, where different small populations from the steppe were being integrated into the common Khvalynsk stock, but where differences are seen in material culture surrounding their burials, as supported by the finding of R1b1 in the Kuban area already in the first half of the 5th millennium. The case would be similar to the early ‘mixed’ Icelandic population.
Only after the emergence of the Samara culture (in the second half of the 6th millennium BC), with a sample of haplogroup R1b1a, starts then the obvious connection with Early Proto-Indo-Europeans; and only after the appearance of late Sredni Stog and haplogroup R1a-M417 (ca. 4000 BC) is its connection with Uralic also clear. In previous population movements, I think more haplogroups were involved in migrations of small groups, and only some communities among them were eventually successful, expanding to be dominant, creating ever growing cultures during their expansions.
Indeed, if you think in terms of Uralic and Indo-European just as converging languages, and forget their potential genetic connection, then the genetic + linguistic picture becomes simplified, and the upper frontier of the 6th millennium BC with a division North Pontic (Mariupol) vs. Volga-Ural (Samara) is enough. However, tracing their movements backwards – with cultural expansions from west to east (with the expansion of farming), and earlier east to west (with hunter-gatherer pottery), and still earlier west to east (with the north-eastern technocomplex), offers an interesting way to prove their potential connection to macrofamilies, at least in terms of population movements.
I am quite convinced right now that it would be possible to connect the expansion of R1b-L754 subclades with a speculative Nostratic (given the R1b-V88 connection with Afroasiatic, and the obvious connection of R1b-P297 with Eurasiatic). Paradoxically, the connection of an Indo-Uralic community in the steppes (after the separation of Yukaghir) with any lineage expansion (R1a-M17, R1b-M269, or even Q, I or J1) seems somehow blurrier than one year ago, possibly just because there are too many open possibilities.
David Reich says about the admixture with Neanderthals, which he helped discover:
At the conclusion of the Neanderthal genome project, I am still amazed by the surprises we encountered. Having found the first evidence of interbreeding between Neanderthals and modern humans, I continue to have nightmares that the finding is some kind of mistake. But the data are sternly consistent: the evidence for Neanderthal interbreeding turns out to be everywhere. As we continue to do genetic work, we keep encountering more and more patterns that reflect the extraordinary impact this interbreeding has had on the genomes of people living today.
I think this is a shared feeling among many of us who have made proposals about anything, to fear that we have made a gross, evident mistake, and constantly look for flaws. However, it seems to me that geneticists are more preoccupied with being wrong in their developed statistical methods, in the theoretical models they are creating, and not so much about errors in the true ancient ethnolinguistic picture human population genetics is (at least in theory) concerned about. Their publications are, after all, constantly associating genetic finds with cultures and (whenever possible) languages, so this aspect of their research should not be taken lightly.
Seeing how David Anthony or Razib Khan (among many others) have changed their previously preferred migration models as new data was published, and they continue to be respected in their own fields, I guess we can be confident that professionals with integrity are going to accept whatever new picture appears. While I don’t think that genetic finds can change what we can reconstruct with comparative grammar, I am also ready to revise guesstimates and routes of expansion of certain dialects if R1a-Z645 is shown to have accompanied Late Proto-Indo-Europeans during their expansion with Yamna, and later integrated somehow with Corded Ware.
However, taking into account the obsession of some with an ancestral, uninterrupted R1a—Indo-European association, and the lack of actual political repercussion of Neanderthal admixture, I think the most common nightmare that all genetic researchers should be worried about is to keep inflating this “Yamnaya ancestry”-based hornet’s nest, which has been constantly stirred up for the past two years, by rejecting it – or, rather, specifying it into its true complex nature.
This succession of corrections and redefinitions, coupled with the distinct Y-DNA bottleneck of each steppe population, will eventually lead to a completely different ethnolinguistic picture of the Pontic-Caspian region during the Eneolithic, which is likely to eventually piss off not only reasonable academics stubbornly attached to the CWC-IE idea, but also a part of those interested in daydreaming about their patrilineal ancestors.
Sometimes it’s better to just rip off the band-aid once and for all…
The indigenous populations of inner Eurasia, a huge geographic region covering the central Eurasian steppe and the northern Eurasian taiga and tundra, harbor tremendous diversity in their genes, cultures and languages. In this study, we report novel genome-wide data for 763 individuals from Armenia, Georgia, Kazakhstan, Moldova, Mongolia, Russia, Tajikistan, Ukraine, and Uzbekistan. We furthermore report genome-wide data of two Eneolithic individuals (~5,400 years before present) associated with the Botai culture in northern Kazakhstan. We find that inner Eurasian populations are structured into three distinct admixture clines stretching between various western and eastern Eurasian ancestries. This genetic separation is well mirrored by geography. The ancient Botai genomes suggest yet another layer of admixture in inner Eurasia that involves Mesolithic hunter-gatherers in Europe, the Upper Paleolithic southern Siberians and East Asians. Admixture modeling of ancient and modern populations suggests an overwriting of this ancient structure in the Altai-Sayan region by migrations of western steppe herders, but partial retaining of this ancient North Eurasian-related cline further to the North. Finally, the genetic structure of Caucasus populations highlights a role of the Caucasus Mountains as a barrier to gene flow and suggests a post-Neolithic gene flow into North Caucasus populations from the steppe.
On North Eurasians
In a PCA of Eurasian individuals, we find that PC1 separates eastern and western Eurasian populations, PC2 splits eastern Eurasians along a north-south cline, and PC3 captures variation in western Eurasians with Caucasus and northeastern European populations at opposite ends (Figure 2A and Figures S1-S2). Inner Eurasians are scattered across PC1 in between, largely reflecting their geographic locations. Strikingly, inner Eurasian populations seem to be structured into three distinct west-east genetic clines running between different western and eastern Eurasian groups, instead of being evenly spaced in PC space. Individuals from northern Eurasia, speaking Uralic or Yeniseian languages, form a cline connecting northeast Europeans and the Uralic (Samoyedic) speaking Nganasans from northern Siberia (“forest-tundra” cline). Individuals from the Eurasian steppe, mostly speaking Turkic and Mongolic languages, are scattered along two clines below the forest-tundra cline. Both clines run into Turkic- and Mongolic-speaking populations in southern Siberia and Mongolia, and further into Tungusic-speaking populations in Manchuria and the Russian Far East in the East; however, they diverge in the west, one heading to the Caucasus and the other heading to populations of the Volga-Ural area (the “southern steppe” and “steppe-forest” clines, respectively; Figure 2 and Figure S2).
(…) The forest-tundra cline populations derive most of their eastern Eurasian ancestry from a component most enriched in Nganasans, while those on the steppe-forest and southern steppe clines have this component together with another component most enriched in populations from the Russian Far East, such as Ulchi and Nivkh. The southern steppe cline groups are distinct from the others in their western Eurasian ancestry profile, in the sense that they have a high proportion of a component most enriched in Mesolithic Caucasus hunter-gatherers (“CHG”) and Neolithic Iranians (“Iran_N”) and frequently harbor another component enriched in South Asians (Figure S4).
For the forest-tundra cline populations, for which currently no relevant Holocene ancient genomes are available, we took a more generalized approach of using proxies for contemporary Europeans: WHG, WSH (represented by “Yamnaya_Samara”), and early Neolithic European farmers (EEF; represented by “LBK_EN”; Table S2). Adding Nganasans as the fourth reference, we find that most Uralic-speaking populations in Europe (i.e. west of the Urals) and Russians are well modeled by this four-way admixture model (χ² p ≥ 0.05 for all but three groups; Figure 5 and Table S8). Nganasan-related ancestry substantially contributes to their gene pools and cannot be removed from the model without a significant decrease in model fit (4.7% to 29.1% contribution; χ² p ≤ 1.12×10⁻⁸; Table S8). The ratio of contributions from three European references varies from group to group, probably reflecting genetic exchange with neighboring non-Uralic groups. For example, Saami from northern Fennoscandia contain a higher WHG and lower WSH contribution (16.1% and 41.3%, respectively) than Udmurts or Besermyans from the Volga river region do (4.9-6.6% and 50.7-53.2%, respectively), while the three groups have similar amounts of Nganasan-related ancestry (25.5-29.1%).
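For readers unfamiliar with what a “four-way admixture model” means in practice, the idea can be sketched as follows: a target population is modeled as a weighted mix of four reference populations, with weights that sum to one. The sketch below is a deliberately simplified stand-in, not the method used in the paper (the authors' tool works on f4-statistics, as qpAdm does, not on raw allele frequencies). It fits the weights by least squares on synthetic allele frequencies; every name, number, and the data itself are hypothetical, and non-negativity of the weights is not enforced.

```python
import random


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_mixture(sources, target):
    """Least-squares mixture weights constrained to sum to 1.

    sources: list of k allele-frequency lists (one per reference population)
    target:  allele-frequency list of the population being modeled
    The sum-to-one constraint is imposed with a Lagrange multiplier.
    """
    k = len(sources)
    system = [
        [2.0 * sum(a * b for a, b in zip(sources[i], sources[j])) for j in range(k)]
        + [1.0]
        for i in range(k)
    ]
    system.append([1.0] * k + [0.0])
    rhs = [2.0 * sum(a * t for a, t in zip(sources[i], target)) for i in range(k)]
    rhs.append(1.0)
    return solve_linear(system, rhs)[:k]


# Synthetic demo: four hypothetical references and made-up true weights.
random.seed(0)
NAMES = ["WHG", "WSH", "EEF", "Nganasan"]
TRUE_WEIGHTS = [0.15, 0.40, 0.20, 0.25]  # illustrative only, not from Table S8
sources = [[random.uniform(0.05, 0.95) for _ in range(200)] for _ in NAMES]
target = [sum(w * s[i] for w, s in zip(TRUE_WEIGHTS, sources)) for i in range(200)]

estimated = fit_mixture(sources, target)
for name, weight in zip(NAMES, estimated):
    print(f"{name}: {weight:.3f}")
```

With noiseless synthetic data the fit simply recovers the weights that were put in. Real data add sampling noise, drift since admixture, and imperfect proxies, which is why the paper reports goodness-of-fit p-values and tests whether a reference (here Nganasans) can be dropped from the model.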
The Caucasus Mountains form a barrier to gene flow
By applying EEMS to the Caucasus region, we identify a strong barrier to gene flow separating North and South Caucasus populations. This genetic barrier coincides with the Greater Caucasus mountain ridge even to small scale: a weaker barrier in the middle, overlapping with Ossetia, matches well with the region where the ridge also becomes narrow. We also observe weak barriers running in the north-south direction that separate northeastern populations from northwestern ones. Together with PCA, EEMS results suggest that the Caucasus Mountains have posed a strong barrier to human migration.
On the Botai individuals
The Y-chromosome of the male Botai individual (TU45) belongs to the haplogroup R1b (Table S6). However, it falls into neither a predominant European branch R1b-L51 nor into a R1b-GG400 branch found in Yamnaya individuals. Thus, phylogenetically this Botai individual should belong to the R1b-M73 branch which is frequent in the Eurasian steppe (Figure S9). This branch was also found in Mesolithic samples from Latvia as well as in numerous modern southern Siberian and Central Asian groups.
The Botai genomes provide a critical snapshot of the genetic profile of pre-Bronze Age steppe populations. Our admixture modeling positions Botai primarily on an ancient genetic cline of the pre-Neolithic western Eurasian hunter-gatherers: stretching from the post-Ice Age western European hunter-gatherers (e.g. WHG) to EHG in Karelia and Samara to the Upper Paleolithic southern Siberians (e.g. AG3). Botai’s position on this cline, between EHG and AG3, fits well with their geographic location and suggests that ANE-related ancestry in the East did have a lingering genetic impact on Holocene Siberian and Central Asian populations at least till the time of Botai.
The most recent clear connection with the Botai ancestry can be found in the Middle Bronze Age Okunevo individuals (Figure S6C). In contrast, additional EHG-related ancestry is required to explain the forest-tundra populations to the east of the Urals (Figure 5 and Table S8). Their multi-way mixture model may in fact portray a prehistoric two-way mixture of a WSH population and a hypothetical eastern Eurasian one that has an ANE-related contribution higher than that in Nganasans. Botai and Okunevo individuals prove the existence of such ANE ancestry-rich populations. Pre-Bronze Age genomes from Siberia will be critical for testing this hypothesis.
So, to sum up:
Northern Eurasia forms a Uralic – Yeniseian cline from east to west, with contribution from Steppe, WHG, and Siberian ancestry. Siberian ancestry is represented by Palaeo-Siberian Nganasans, who adopted Samoyedic quite late. It was already known that the different waves of Siberian ancestry are too late and do not represent the spread of Uralic languages, so that leaves us with Steppe and WHG.
The Botai sample (ca. 3632-3100 BC) represents thus the furthest east that R1b-P297 subclades had expanded (we did know that, and that they didn’t have close genetic links with Khvalynsk, so the haplogroup spread there probably much earlier). It expanded R1b-M269’s sister clade R1b-M73 (also found in the Baltic region), and the Botai are on the ‘eastern’ end of an ancient genetic cline stretching from WHG to EHG to Afontova Gora.
EDIT (23 MAY 2018) Both samples share mtDNA, and the male one shares Y-DNA, with those reported in Damgaard et al. (Nature 2018); although dates are slightly different (3371-3354 calBC for BOT 14), it is within the range given for this one; for the female, the dates are similar (3521-3377 calBC for BOT2016, 3517-3367 cal. BCE for this one). The lack of data on their origin may point to the fact that we only have different bone samples from the same two Botai individuals. So probably still 50% R1b-M73 (with the other 50% being N2* from BOT15)…
It seems therefore not only that R1b-M269 is bound to split from the parent haplogroup in or around the steppe or forest-steppe: the Mesolithic spread of haplogroup R1b in North Eurasia is wider and its relevance thus greater than previously thought.
Featured image, from the supplementary materials: Frequency distribution map of the Y-chromosomal haplogroup R1b-P343(xM269) identified in the Eneolithic Botai individual. All modern Eurasian samples with this haplogroup tested to date for the downstream markers fall into R1b-M73 branch, suggesting Botai sample be one of its earliest representatives.
Article of general knowledge in Der Spiegel, Invasion from the Steppe, with comments from Willerslev and Kristiansen, appeared roughly at the same time as the Damgaard et al. Nature (2018) and Science (2018) papers were published.
Particularly striking is the genetic signature from the steppe on the Y chromosome. From this the researchers conclude that the majority of migrants were males. Kristian Kristiansen, chief archaeologist in the Willerslev team, also has an idea of how this could be explained: “Maybe it’s a rite of initiation, as it was spread among the steppe peoples,” he says.
The younger sons of the Yamnaya herders, who were excluded from the succession, had to seek their fortune on their own. As part of a solemn ritual, they threw themselves to wolves’ skins and then swarmed in warlike gangs to buy their own herds by cattle-stealing.
An ally that they seem to have brought from their homeland may also have contributed to the genetic success of the steppe people: Yersinia pestis, the plague bacterium. Its genes were found by researchers from the Max Planck Institute in Jena – and apparently it emerged exactly at the same time as the Yamnaya thrust began.
About the Hittites
(…) And yet now, where Asia and Europe meet geographically, there is no trace of the Yamnaya genes. The wander-loving people from the Pontic-Caspian steppe apparently found neither the way across the Balkans nor through the Caucasus mountains.
Now the researchers are puzzled: How can it be that a language goes on a walk, without the accompanying speakers coming along? Is it possible that the Indo-European seeped into Anatolia, much like the English language spread today without the need for Englishmen?
Archaeologist Kristiansen does not believe it. The researchers would find it hard to reconsider their theories, he says: “Especially the first chapter of the story has to be rewritten.”
He suspects that there was a predecessor of the Yamnaya culture, in which a kind of Proto-Proto-Indo-European was spoken. And he also has a suspicion, where this people could have drifted around: The Caucasus, says Kristiansen, was their homeland. But that remains unproven: “There’s another hole left,” he admits.
About the Botai
The study of [the Botai] genome revealed that it was genetically radically different from the members of the Yamnaya culture. The Botai, it seems, consistently avoided any contact with their neighbors – even though they must have crossed the territory of the Botai on their migratory waves.
Willerslev assumes that the art of keeping horses from the Yamnaya steppe nomads was adopted from these peoples, and then they developed it further. At some point, the Botai could then have itself become doomed by its groundbreaking innovation: While the descendants of the Yamnaya spread over half of Eurasia, the Botai disappeared without leaving a trace.
Even more interesting than the few words that set the Copenhagen group’s views for future papers (such as the expected Maykop samples with EHG ancestry) is the artistic sketch of the Indo-European migrations, probably advised by the group.
A simple map does not mean that all members of the Danish workgroup have changed their view completely, but I would say it is a great improvement over the previous “arrows of migration” (see here), and it is especially important that they show a more realistic picture of ancient migrations to general readers.
NOTE. Especially absurd is the identification of the ‘Celtic’ expansion with the first Bell Beakers in the British Isles (that idea is held by few, such as Koch and Cunliffe in their “Celtic from the West” series). Also inexact, but not so worrying, are the identification of ‘Germanic’ in Germany/Únětice, or the spread of ‘Baltic’ and ‘Slavic’ directly to East Europe (i.e. I guess Mierzanowice/Nitra -> Trzciniec), which is probably driven by the need to assert a close connection with early Iranians and thus with their satemization trends.
Their results, as well as those of the competition labs at Harvard University and Jena’s Max Planck Institute for the History of Humanity, leave no doubt: Yes, the legendary herdsmen in the Pontic-Caspian steppe really existed. They belonged to the so-called Yamnaya culture, and they spread, as linguists had predicted, in massive migrations towards Central Europe and India – a later triumph for linguists.
The project has been an extremely enriching and exciting process. We were able to direct many very different academic fields towards a single coherent approach. By asking the right questions, and keeping limitations of the data in mind, contextualizing, nuancing, and keeping dialogues open between scholars of radically different backgrounds and approaches, we have carved out a path for a new field of research. We have already seen too many papers come out in which models produced by geneticists working on their own have been accepted without vital input from other fields, and, at the other extreme, seen archaeologists opposing new studies built on archaeogenetic data, due to a lack of transparency between the fields.
Data on ancient DNA is astonishing for its ability to provide a fine-grained image of early human mobility, but it does stand on the shoulders of decades of work by scholars in other fields, from the time of excavation of human skeletons to interpreting the cultural, linguistic origins of the samples. This is how cold statistics are turned into history.
Analysis of a sacrificed and interred domestic donkey from an Early Bronze Age (EB) IIIB (c. 2800–2600 BCE) domestic residential neighborhood at Tell eṣ-Ṣâfi/Gath, Israel, indicate the presence of bit wear on the Lower Premolar 2 (LPM2). This is the earliest evidence for the use of a bit among early domestic equids, and in particular donkeys, in the Near East. The mesial enamel surfaces on both the right and left LPM2 of the particular donkey in question are slightly worn in a fashion that suggests that a dental bit (metal, bone, wood, etc.) was used to control the animal. Given the secure chronological context of the burial (beneath the floor of an EB IIIB house), it is suggested that this animal provides the earliest evidence for the use of a bit on an early domestic equid from the Near East.
In contrast to what is known about the use of donkeys for transportation, relatively little is known about their use for riding during this early period. Riding is possible, but fast riding is difficult without some kind of bridle with reins to grasp. Thus, the development of the bit becomes an essential part of the mechanism to control and ride an equid, whether horse, donkey or otherwise [38–41]. While some have tried to argue based on cave art for the presence of bridles (including cheek straps and potentially bits) on equids as far back as the Upper Palaeolithic [42, 43], this perspective has not been accepted [44, 45]. Instead, the weight of the evidence for bridles points toward the Eneolithic and Bronze Age of Kazakhstan and Russia, c. 3500 BCE for horses, not donkeys [38, 40, 46–50]. But horses are not the earliest domestic equids to appear in the Near East. This role is reserved for the ass/donkey [20, 32, 51].
The earliest unambiguous evidence for bridles and bits in equids in the Near East appears only in the Middle Bronze Age [52, 62, 63], and horses become common only in cuneiform texts and the archaeological record after the turn of the second millennium BC. For example, at the Middle Bronze Age site of Tel Haror, a metal bit was found associated with a donkey burial.
Beginning in the Middle Bronze Age, there is a variety of sources that demonstrate that asses were being ridden. In fact, they seem to be the preferred animal ridden for elites in the Early and Middle Bronze Age of Mesopotamia. The earliest clear association of asses being ridden by elites comes from the Old Babylonian period (MBA, 18th century BCE; the Kings of Mari, Syria). Similarly, by the beginning of the Middle Kingdom of Egypt, various texts and iconographic images (e.g. the stela of Serabit el-Khadem) from Egypt and petroglyphs from southern Sinai unambiguously depict and/or describe elites riding asses [5, 65, 66]. The later biblical narrative depicts donkeys carrying the biblical Patriarchs (Abraham), various leaders (such as Saul before he became king), prophets, and judges of Israel [16, 67, 68].
Horses became the standard royal riding animal during the Late Bronze and Iron Ages as they became more prevalent. In later periods, donkeys became associated with humility and the lower classes, and leaders emanating from it (e.g. Jesus).
These finds suggest that bit use on donkeys was already present in the early to mid-3rd millennium BCE, long before the appearance of horses in the ancient Near East. Thus, the appearance of bit use in donkeys in the ancient Near East is not connected to appearance of the horse, contrary to previous suggestions (as already noted by ). As such, the impact of the domestic donkey on the cultures of this region and the evolution of early complex societies cannot be underestimated. As with plant and animal domestication, the use of donkeys created a surplus of human labor that allowed for the easy transport of people and goods across the entire Near East. These changes continue to permeate the economic, social, and political aspects of even modern life in many third world countries [3, 8, 9, 93, 94].
So, the first case of equid riding in the Near East, near two of the cradles of civilization (Sumeria and Egypt), is a donkey from the early third millennium BC. Not much in favour of horse domestication (and still less for horse riding) expanding from North Iran or the Southern Caucasus to the north.
NOTE. The recent papers of the Copenhagen group made yet another controversial interpretation of genomic findings (see here): they support multiple simultaneous origins for horse-riding technique, in Khvalynsk and Botai, based on the lack of genetic connection between both human populations, with which I can’t agree. Based on the similar time of appearance and the geographic proximity, I think the most likely explanation is expansion of the technique from one to the other, probably – as supported by Anthony’s investigation – from Khvalynsk to neighbouring cultures.
Our findings fit well with current insights from the historical linguistics of this region (Supplementary Information section 2). The steppes were probably largely Iranian-speaking in the first and second millennia BC. This is supported by the split of the Indo-Iranian linguistic branch into Iranian and Indian [33], the distribution of the Iranian languages, and the preservation of Old Iranian loanwords in Tocharian [34]. The wide distribution of the Turkic languages from Northwest China, Mongolia and Siberia in the east to Turkey and Bulgaria in the west implies large-scale migrations out of the homeland in Mongolia since about 2,000 years ago [35]. The diversification within the Turkic languages suggests that several waves of migration occurred [36] and, on the basis of the effect of local languages, gradual assimilation to local populations had previously been assumed [37]. The East Asian migration starting with the Xiongnu accords well with the hypothesis that early Turkic was the major language of Xiongnu groups [38]. Further migrations of East Asians westwards find a good linguistic correlate in the influence of Mongolian on Turkic and Iranian in the last millennium [39]. As such, the genomic history of the Eurasian steppes is the story of a gradual transition from Bronze Age pastoralists of West Eurasian ancestry towards mounted warriors of increased East Asian ancestry, a process that continued well into historical times.
This paper will need a careful reading (ideally alongside Narasimhan et al. (2018), once their tables are corrected) to assess the actual ‘Iranian’ nature of the peoples studied. Their wide and long-term dominion over the steppe could also potentially explain some early samples from Hajji Firuz with steppe ancestry.
For the moment, at first sight, it seems that, in terms of Y-DNA lineages:
R1a-Z93 (especially Z2124 subclades) dominates the steppes in the studied periods.
R1b-P312 is found in Hallstatt ca. 810 BC, which is compatible with its role in the Celtic expansion.
R1b-U106 is found in a West Germanic chieftain in Poprad (Slovakia) ca. 400 AD, during the Migration Period, hence supporting once again the expansion of Germanic tribes especially with R1b-U106 lineages.
A sample of haplogroup R1a-Z282 (Z92) dated ca. 1300 AD in the Golden Horde is probably not quite revealing, not even for the East Slavic expansion.
Also, interestingly, some R1b(xM269) lineages seem to be associated with Turkic expansions from the eastern steppe dated around 500 AD, which probably points to a wide Eurasian distribution of early R1b subclades in the Mesolithic.
NOTE. I have referenced not just the reported subclades from the paper, but also (and mainly) further Y-SNP calls studied by Open Genomes. See the spreadsheet here.
Interesting also to read in the supplementary materials the following, by Michaël Peyrot (emphasis mine):
1. Early Indo-Europeans on the steppe: Tocharians and Indo-Iranians
The Indo-European language family is spread over Eurasia and comprises such branches and languages as Greek, Latin, Germanic, Celtic, Sanskrit etc. The branches relevant for the Eurasian steppe are Indo-Aryan (= Indian) and Iranian, which together form the Indo-Iranian branch, and the extinct Tocharian branch. All Indo-European languages derive from a postulated protolanguage termed Proto-Indo-European. This language must have been spoken ca 4500–3500 BCE in the steppe of Eastern Europe [21]. The Tocharian languages were spoken in the Tarim Basin in present-day Northwest China, as shown by manuscripts from ca 500–1000 CE. The Indo-Aryan branch consists of Sanskrit and several languages of the Indian subcontinent, including Hindi. The Iranian branch is spread today from Kurdish in the west, through, among others, Persian and Pashto, to minority languages in western China, but was in the 2nd and 1st millennia BCE widespread also on the Eurasian steppe. Since despite their location Tocharian and Indo-Iranian show no closer relationship within Indo-European, the early Tocharians may have moved east before the Indo-Iranians. They are probably to be identified with the Afanasievo Culture of South Siberia (ca 2900–2500 BCE) and have possibly entered the Tarim Basin ca 2000 BCE [103].
The Indo-Iranian branch is an extension of the Indo-European Yamnaya Culture (ca 3000–2400 BCE) towards the east. The rise of the Indo-Iranian language, of which no direct records exist, must be connected with the Abashevo / Sintashta Culture (ca 2100–1800 BCE) in the southern Urals and the subsequent rise and spread of Andronovo-related Culture (1700–1500 BCE). The most important linguistic evidence of the Indo-Iranian phase is formed by borrowings into Finno-Ugric languages [104–106]. Kuz’mina (2001) identifies the Finno-Ugrians with the Andronoid cultures in the pre-taiga zone east of the Urals [107]. Since some of the oldest words borrowed into Finno-Ugric are only found in Indo-Aryan, Indo-Aryan and Iranian apparently had already begun to diverge by the time of these contacts, and when both groups moved east, the Iranians followed the Indo-Aryans [108]. Being pushed by the expanding Iranians, the Indo-Aryans then moved south, one group surfacing in equestrian terminology of the Anatolian Mitanni kingdom, and the main group entering the Indian subcontinent from the northwest.
2. Andronovo Culture: Early Steppe Iranian
Initially, the Andronovo Culture may have encompassed speakers of Iranian as well as Indo-Aryan, but its large expansion over the Eurasian steppe is most probably to be interpreted as the spread of Iranians. Unfortunately, there is no direct linguistic evidence to prove to what extent the steppe was indeed Iranian speaking in the 2nd millennium BCE. An important piece of indirect evidence is formed by an archaic stratum of Iranian loanwords in Tocharian [34, 109]. Since Tocharian was spoken beyond the eastern end of the steppe, this suggests that speakers of Iranian spread at least that far. In the west of the Tarim Basin the Iranian languages Khotanese and Tumshuqese were spoken. However, the Tocharian B word etswe ‘mule’, borrowed from Iranian *atswa- ‘horse’, cannot derive from these languages, since Khotanese has aśśa- ‘horse’ with śś instead of tsw. The archaic Iranian stratum in Tocharian is therefore rather to be connected with the presence of Andronovo people to the north and possibly to the east of the Tarim Basin from the middle of the 2nd millennium BCE onwards [110].
Since Kristiansen and Allentoft sign the paper (and Peyrot is a colleague of Kroonen), it seems that they needed to expressly respond to the growing criticism about their recent Indo-European – Corded Ware Theory. That’s nice.
IECWT-proponents are apparently not prepared to let it go quietly, and instead of challenging the traditional Neolithic Uralic homeland in Eastern Europe with a recent paper on the subject, they selected an older one which partially fit, from Kuz’mina (2001), now shifting the Uralic homeland to the east of the Urals (when Kuz’mina asserts it was south of the Urals).
Different authors comment later in this same paper about East Uralic languages spreading quite late, so even their text is not consistent among collaborating authors.
Also interesting is the need to resort to the questionable argument of early Indo-Aryan loans – which may have evidently been Indo-Iranian instead, since there is no way to prove a difference between both stages in early Uralic borrowings from ca. 4,500-3,500 years ago…
NOTE. I don’t mind repeating it again: Uralic is one possibility (the most likely one) for the substrate language that Corded Ware migrants spread, but it could have been e.g. another Middle PIE dialect, similar to Proto-Anatolian (after the expansion of Suvorovo-Novodanilovka chiefs). I expressly stated this in the Corded Ware substrate hypothesis, since the first edition. What was clear since 2015, and should be clear to anyone now, is that Corded Ware did not spread Late PIE languages to Europe, and that some east CWC groups only spread languages to Asia after admixing with East Yamna. If they did not spread Uralic, then it was a language or group of languages phonetically similar, which has not survived to this day.
At least we won’t have the Yamna -> Corded Ware -> BBC nonsense anymore, and they expressly stated that LPIE is to be associated with Yamna, and in particular the “Indo-Iranian branch is an extension of the Indo-European Yamnaya Culture (ca 3000–2400 BCE) to the East” (which will evidently show an East Yamna / Poltavka society of R1b-L23 subclades), so that earlier Eneolithic cultures have to be excluded, and Balto-Slavic identification with East Europe is also out of the way.
The Eneolithic Botai culture of the Central Asian steppes provides the earliest archaeological evidence for horse husbandry, ~5,500 ya, but the exact nature of early horse domestication remains controversial. We generated 42 ancient horse genomes, including 20 from Botai. Compared to 46 published ancient and modern horse genomes, our data indicate that Przewalski’s horses are the feral descendants of horses herded at Botai and not truly wild horses. All domestic horses dated from ~4,000 ya to present only show ~2.7% of Botai-related ancestry. This indicates that a massive genomic turnover underpins the expansion of the horse stock that gave rise to modern domesticates, which coincides with large-scale human population expansions during the Early Bronze Age.
That none of the domesticates sampled in the last ~4,000 years descend from the horses first herded at Botai entails another major implication. It suggests that during the 3rd Mill BCE at the latest, another unrelated group of horses became the source of all domestic populations that expanded thereafter. This is compatible with two scenarios. First, Botai-type horses experienced massive introgression capture (22) from a population of wild horses until the Botai ancestry was almost completely replaced. Alternatively, horses were successfully domesticated in a second domestication center and incorporated minute amounts of Botai ancestry during their expansion. We cannot identify the locus of this hypothetic center due to a temporal gap in our dataset throughout the 3rd Mill BCE. However, that the DOM2 earliest member was excavated in Hungary adds Eastern Europe to other candidates already suggested, including the Pontic-Caspian steppe (2), Eastern Anatolia (23), Iberia (24), Western Iran and the Levant (25). Notwithstanding the process underlying the genomic turnover observed, the clustering of ~4,023-3,574 year-old specimens from Russia, Romania and Georgia within DOM2 suggests that this clade already expanded throughout the steppes and Europe at the transition between the 3rd and 2nd Mill BCE, in line with the demographic expansion at ~4,500 ya recovered in mitochondrial Bayesian Skylines (fig. S14).
This study shows that the horses exploited by the Botai people later became the feral PH. Early domestication most likely followed the ‘prey pathway’ whereby a hunting relationship was intensified until reaching concern for future progeny through husbandry, exploitation of milk and harnessing (7). Other horses, however, were the main source of domestic stock over the last ~4,000 years or more. Ancient human genomics (26) has revealed considerable human migrations ~5,000 ya involving “Yamnaya” culture pastoralists of the Pontic-Caspian steppe. This expansion might be associated with the genomic turnover identified in horses, especially if Botai horses were better suited to localized pastoral activity than to long distance travel and warfare. Future work must focus on identifying the main source of the domestic horse stock and investigating how the multiple human cultures managed the available genetic variation to forge the many horse types known in history.