Yamna the likely source of modern horse domesticates; the closest lineage, from East Bell Beakers

Open access Tracking Five Millennia of Horse Management with Extensive Ancient Genome Time Series, by Fages et al. Cell (2019).

Interesting excerpts (emphasis mine):

The earliest archaeological evidence of horse milking, harnessing, and corralling is found in the ∼5,500-year-old Botai culture of Central Asian steppes (Gaunitz et al., 2018, Outram et al., 2009; see Kosintsev and Kuznetsov, 2013 for discussion). Botai-like horses are, however, not the direct ancestors of modern domesticates but of Przewalski’s horses (Gaunitz et al., 2018). The genetic origin of modern domesticates thus remains contentious, with suggested candidates in the Pontic-Caspian steppes (Anthony, 2007), Anatolia (Arbuckle, 2012, Benecke, 2006), and Iberia (Uerpmann, 1990, Warmuth et al., 2011). Irrespective of the origins of domestication, the horse genome is known to have been reshaped significantly within the last ∼2,300 years (Librado et al., 2017, Wallner et al., 2017, Wutke et al., 2018). However, when and in which context(s) such changes occurred remains largely unknown.

To clarify the origins of domestic horses and reveal their subsequent transformation by past equestrian civilizations, we generated DNA data from 278 equine subfossils with ages mostly spanning the last six millennia (n = 265, 95%) (Figures 1A and 1B; Table S1; STAR Methods). Endogenous DNA content was compatible with economical sequencing of 87 new horse genomes to an average depth-of-coverage of 1.0- to 9.3-fold (median = 3.3-fold; Table S2). This more than doubles the number of ancient horse genomes hitherto characterized. With a total of 129 ancient genomes, 30 modern genomes, and new genome-scale data from 132 ancient individuals (0.01- to 0.9-fold, median = 0.08-fold), our dataset represents the largest genome-scale time series published for a non-human organism (Tables S2, S3, and S4; STAR Methods).

Genetic Affinities.
(A) Principal Component Analysis (PCA) of 159 ancient and modern horse genomes showing at least 1-fold average depth-of-coverage. The overall genetic structure is shown for the first three principal components, which summarize 11.6%, 10.4% and 8.2% of the total genetic variation, respectively. The two specimens MerzlyYar_Rus45_23789 and Dunaujvaros_Duk2_4077 discussed in the main text are highlighted. See also Figure S7 and Table S5 for further information.
(B) Visualization of the genetic affinities among individuals, as revealed by the struct-f4 algorithm and 878,475 f4 permutations. The f4 calculation was conditioned on nucleotide transversions present in all groups, with samples grouped as in TreeMix analyses (Figure 3). In contrast to PCA, f4 permutations measure genetic drift along internal branches. They are thus more likely to reveal ancient population substructure.
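
For readers unfamiliar with the f4 statistic mentioned in the caption, here is a minimal Python sketch. It assumes per-site allele frequencies are already available for four groups and simply averages (pA - pB) * (pC - pD) over sites. The function names and the toy frequencies are hypothetical, and this is not the struct-f4 implementation used in the paper. As a side note, picking 4 out of 53 groups in 3 distinct pairings gives exactly 878,475 configurations; the figure of 53 groups is an inference from that arithmetic, not something stated in the excerpt.

```python
from math import comb


def f4(p_a, p_b, p_c, p_d):
    """Mean over sites of (pA - pB) * (pC - pD), from allele frequencies."""
    if not (len(p_a) == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("all groups need the same number of sites")
    if len(p_a) == 0:
        raise ValueError("at least one site is required")
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)


def count_f4_configurations(n_groups):
    """Number of group quartets times the 3 distinct pairings per quartet."""
    return 3 * comb(n_groups, 4)


# Toy allele frequencies at four sites (hypothetical values).
A = [0.0, 1.0, 0.5, 0.0]
B = [0.0, 1.0, 0.5, 1.0]
C = [1.0, 0.0, 0.5, 0.0]
D = [1.0, 0.0, 0.5, 1.0]

print(f4(A, B, C, D))                # drift shared along the A,C | B,D split
print(count_f4_configurations(53))   # assumed number of groups
```

A value close to zero is what a tree with no shared drift between the two pairs would produce, which is why the statistic is useful for detecting substructure along internal branches.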

Discovering Two Divergent and Extinct Lineages of Horses

Domestic and Przewalski’s horses are the only two extant horse lineages (Der Sarkissian et al., 2015). Another lineage was genetically identified from three bones dated to ∼43,000–5,000 years ago (Librado et al., 2015, Schubert et al., 2014a). It showed morphological affinities to an extinct horse species described as Equus lenensis (Boeskorov et al., 2018). We now find that this extinct lineage also extended to Southern Siberia, following the principal component analysis (PCA), phylogenetic, and f3-outgroup clustering of an ∼24,000-year-old specimen from the Tuva Republic within this group (Figures 3, 5A and S7A). This new specimen (MerzlyYar_Rus45_23789) carries an extremely divergent mtDNA only found in the New Siberian Islands some ∼33,200 years ago (Orlando et al., 2013) (Figure 6A; STAR Methods) and absent from the three bones previously sequenced. This suggests that a divergent ghost lineage of horses contributed to the genetic ancestry of MerzlyYar_Rus45_23789. However, both the timing and location of the genetic contact between E. lenensis and this ghost lineage remain unknown.

Population modeling of the demographic changes and admixture events in extant and extinct horse lineages. The two models presented show best fitting to the observed multi-dimensional SFS in momi2. The width of each branch scales with effective size variation, while colored dashed lines indicate admixture proportions and their directionality. The robustness of each model was inferred from 100 bootstrap pseudo-replicates. Time is shown in a linear scale up to 120,000 years ago and in a logarithmic scale above.

Modeling Demography and Admixture of Extinct and Extant Horse Lineages

Phylogenetic reconstructions without gene flow indicated that IBE differentiated prior to the divergence between DOM2 and Przewalski’s horses (Figure 3; STAR Methods). However, allowing for one migration edge in TreeMix suggested closer affinities with one single Hungarian DOM2 specimen from the 3rd mill. BCE (Dunaujvaros_Duk2_4077), with extensive genetic contribution (38.6%) from the branch ancestral to all horses (Figure S7B). This, and the extremely divergent IBE Y chromosome (Figure 6B), suggest that a divergent but yet unidentified ghost population could have contributed to the IBE genetic makeup.

Rejecting Iberian Contribution to Modern Domesticates

The genome sequences of four ∼4,800- to 3,900-year-old IBE specimens characterized here allowed us to clarify ongoing debates about the possible contribution of Iberia to horse domestication (Benecke, 2006, Uerpmann, 1990, Warmuth et al., 2011). Calculating the so-called fG ratio (Martin et al., 2015) provided a minimal boundary for the IBE contribution to DOM2 members (Cahill et al., 2013) (Figure 7A). The maximum of such estimate was found in the Hungarian Dunaujvaros_Duk2_4077 specimen (∼11.7%–12.2%), consistent with its TreeMix clustering with IBE when allowing for one migration edge (Figure S7B). This specimen was previously suggested to share ancestry with a yet-unidentified population (Gaunitz et al., 2018). Calculation of f4-statistics indicates that this population is not related to E. lenensis but to IBE (Figure 7B; STAR Methods). Therefore, IBE or horses closely related to IBE, contributed ancestry to animals found at an Early Bronze Age trade center in Hungary from the late 3rd mill. BCE. This could indicate that there was long-distance exchange of horses during the Bell Beaker phenomenon (Olalde et al., 2018). The fG minimal boundary for the IBE contribution into an Iron Age Spanish horse (ElsVilars_UE4618_2672) was still important (~9.6%–10.1%), suggesting that an IBE genetic influence persisted in Iberia until at least the 7th century BCE in a domestic context. However, fG estimates were more limited for almost all ancient and modern horses investigated (median = ~4.9%–5.4%; Figure 7A).
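
The fG ratio cited above (Martin et al., 2015) compares the ABBA-BABA excess observed in a test population with the excess expected under complete introgression, estimated by splitting the donor population into two subsets. The sketch below is only an illustration of that idea from derived-allele frequencies. The function names, the toy frequencies, and the mapping of roles (P2 as the DOM2 horse being tested, P3 as IBE) are assumptions, not the pipeline used in the paper.

```python
def abba_baba_sum(p1, p2, p3, p_out):
    """Sum over sites of the ABBA minus BABA weights (derived-allele frequencies)."""
    total = 0.0
    for a, b, c, o in zip(p1, p2, p3, p_out):
        abba = (1 - a) * b * c * (1 - o)
        baba = a * (1 - b) * c * (1 - o)
        total += abba - baba
    return total


def f_g(p1, p2, p3, p3a, p3b, p_out):
    """fG = S(P1, P2, P3, O) / S(P1, P3a, P3b, O)."""
    numerator = abba_baba_sum(p1, p2, p3, p_out)
    denominator = abba_baba_sum(p1, p3a, p3b, p_out)
    if denominator == 0:
        raise ValueError("donor subsets carry no informative sites")
    return numerator / denominator


# Toy data (hypothetical): the donor is fixed for the derived allele,
# the test population carries it at 10%, P1 and the outgroup lack it.
p1 = [0.0, 0.0, 0.0, 0.0]
p2 = [0.1, 0.1, 0.1, 0.1]
p3 = p3a = p3b = [1.0, 1.0, 1.0, 1.0]
outgroup = [0.0, 0.0, 0.0, 0.0]

print(f_g(p1, p2, p3, p3a, p3b, outgroup))
```

In this toy setup the estimate recovers the 10% donor allele frequency in the test population, which is the intuition behind reading fG as a contribution proportion.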

TreeMix Phylogenetic Relationships. The tree topology was inferred using a total of ∼16.8 million transversion sites and disregarding migration. The name of each sample provides the archaeological site as a prefix, and the age of the specimen as a suffix (years ago). Name suffixes (E) and (A) denote European and Asian ancient horses, respectively. See Table S5 for dataset information. Image modified to include the likely ancestor of domesticates in a red circle, represented by Yamna, the most likely direct ancestor of the Dunaujvaros specimen.

Iron Age horses

Y chromosome nucleotide diversity (π) decreased steadily in both continents during the last ∼2,000 years but dropped to present-day levels only after 850–1,350 CE (Figures 2B and S2E; STAR Methods). This is consistent with the dominance of an ∼1,000- to 700-year-old oriental haplogroup in most modern studs (Felkel et al., 2018, Wallner et al., 2017). Our data also indicate that the growing influence of specific stallion lines post-Renaissance (Wallner et al., 2017) was responsible for as much as a 3.8- to 10.0-fold drop in Y chromosome diversity.

We then calculated Y chromosome π estimates within past cultures represented by a minimum of three males to clarify the historical contexts that most impacted Y chromosome diversity. This confirmed the temporal trajectory observed above as Byzantine horses (287–861 CE) and horses from the Great Mongolian Empire (1,206–1,368 CE) showed limited yet larger-than-modern diversity. Bronze Age Deer Stone horses from Mongolia, medieval Aukštaičiai horses from Lithuania (C9th–C10th [ninth through the tenth centuries of the Common Era]), and Iron Age Pazyryk Scythian horses showed similar diversity levels (0.000256–0.000267) (Figure 2A). However, diversity was larger in La Tène, Roman, and Gallo-Roman horses, where Y-to-autosomal π ratios were close to 0.25. This contrasts to modern horses, where marked selection of specific patrilines drives Y-to-autosomal π ratios substantially below 0.25 (0.0193–0.0396) (Figure 2A). The close-to-0.25 Y-to-autosomal π ratios found in La Tène, Roman, and Gallo-Roman horses suggest breeding strategies involving an even reproductive success among stallions or equally biased reproductive success in both sexes (Wilson Sayres et al., 2014).
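
The 0.25 benchmark for the Y-to-autosomal π ratio follows from the standard neutral expectation for effective population sizes. A minimal Python sketch, assuming equal mutation rates on the Y chromosome and the autosomes and no selection (the function name and the breeding numbers are hypothetical):

```python
def expected_y_to_autosome_ratio(n_males, n_females):
    """Neutral expectation of Y-to-autosomal diversity for given breeding numbers.

    Autosomal Ne = 4 * Nm * Nf / (Nm + Nf); Y-chromosome Ne = Nm / 2.
    The ratio simplifies to (Nm + Nf) / (8 * Nf).
    """
    if n_males <= 0 or n_females <= 0:
        raise ValueError("breeding numbers must be positive")
    ne_autosomal = 4 * n_males * n_females / (n_males + n_females)
    ne_y = n_males / 2
    return ne_y / ne_autosomal


print(expected_y_to_autosome_ratio(50, 50))   # even breeding sex ratio
print(expected_y_to_autosome_ratio(1, 100))   # a single stallion, many mares
```

In this simple model the ratio cannot fall below 0.125 however few stallions breed, so the modern values of 0.0193–0.0396 quoted above lie outside what a skewed sex ratio alone would produce, in line with the excerpt's point about selection of specific patrilines.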

Lineage is used in this paper, as in many others in genetics, as defined by a specific ancestry. I keep that nomenclature below. It should not be confused with the “lineages” or “lines” referring to Y-chromosome (or mtDNA) haplogroups.

Supporting the “archaic” nature of the Hungarian BBC horses expanding from the Pontic-Caspian steppes are:

  • Among Y-chromosome lines, the common group formed by Botai-Borly4 (closely related to DOM2), Scythian horses from Aldy Bel (Arzhani), Iron Age horses from Estonia (Ridala), horses from the Xiongnu culture (Uushgiin Uvur), and Roman horses from Autricum (Chartres).
  • Among mtDNA lines, the common group formed by Botai samples, LebyazhinkaIV NB35, and different Eurasian domesticates, including many ancient Western European ones, which reveals a likely expansion of certain subclades east and west with the Repin culture.
  • (…) DOM2 contributed 22% to the ancestor of Przewalski’s horses ca. 9.47 kya, suggesting the Holocene optimum, rather than the Eneolithic Botai culture (∼5.5 kya), as a period of population contact. This pre-Botai introgression could explain the Y chromosome topology, where Botai horses were reported to carry two different segregating haplogroups: one occupied a basal position in the phylogeny while the other was closely related to DOM2. Multiple admixture pulses, however, are known to have occurred along the divergence of DOM2 and the Botai-Borly4 lineage, including 2.3% post-Borly4 contribution to DOM2, and a more recent 6.8% DOM2 introgression into Przewalski’s horses (Gaunitz et al., 2018). Model C2 parameters accommodate all these as a single admixture pulse, likely averaging the contributions of all these multiple events.

    Tip labels are respectively composed of individual sample names, their reference number as well as their age (years ago, from 2017). Red, orange, light green, green, dark green and blue refer to modern horses, ancient DOM2, Botai horses, Borly4 horses, Przewalski’s horses and E. lenensis, respectively. Black refers to wild horses not yet identified to belong to any particular cluster in absence of sufficient genome-scale data. Clades composed of only Przewalski’s horses or ancient DOM2 horses were collapsed to increase readability.

    (A) Best maximum likelihood tree retracing the phylogenetic relationships between 270 mitochondrial genomes.

    (B) Best Y chromosome maximum likelihood tree (GTRGAMMA substitution model) excluding outgroup. Node supports are indicated as fractions of 100 bootstrap pseudoreplicates. Bootstrap supports inferior to 90% are not shown. The root was placed on the tree midpoint. See also Table S5 for dataset information.

    Image modified from the paper, including a red square in archaic groups that contain the Hungarian sample, and a red circle around the most likely common ancestral stallion and mare from the Pontic-Caspian steppes.

    The paper cannot offer a detailed picture of ancient horse domestication, but it is yet another step in showing how Repin/Yamna is the most likely source of expansion of horse domesticates in Eurasia. Even more interestingly, Yamna settlers in Hungary probably expanded an ancient lineage of that horse at the same time as they spread with the Classical Bell Beaker culture. Remarkable parallels are thus found between:

    The expansion of an ancient line of horse domesticates related to Yamna Hungary/East Bell Beakers seems to be confirmed by the pre-Iberian sample from Vilars I, Els Vilars4618 2672 (ca. 700-550 BC), likely of Iberian Beaker descent, showing a lineage older than the Indo-Iranian ones, which later replaced most European lines.

    NOTE. For known contacts between Yamna and Proto-Beakers just before the expansion of East Bell Beakers, see a recent post on Vanguard Yamna groups.

    The findings of the paper confirm that the horse expanded first (and mainly) through the steppe biome, mirroring the expansion of Proto-Indo-Europeans, and that these early lines were then replaced gradually (or not so gradually) by lines brought to Europe during westward expansions of Bronze Age, Iron Age, and later specialized horse-riding steppe cultures. The expansion also correlates well with the known spread of animal traction and pastoralism before 2000 BC:

    Top image: Map with evidence of animal traction before ca. 2000 BC. Bottom image: frequency of finds of evidence for animal traction (orange), cylinder seals (purple) and potter’s wheels (green) in the 4th and 3rd millennium BC (query from the Digital Atlas of Innovations). The data point to an early peak in the expansion of this innovation at the turn of the 4th–3rd millennium BC, while direct evidence supports a radical increase from around the mid-3rd millennium BC until the early 2nd millennium, coinciding with the expansion of East Bell Beakers and related European Early Bronze Age cultures. Data and image modified from Klimscha (2017).

    EDIT (3 MAY 2019): A recent reminder of these parallel developments by David Reich in Insights into language expansions from ancient DNA:

    • Yamna expansion to the west “with horses and wagons”, with a more homogeneous ancestry in modern Europeans due to later migrations from the east (and north):
    • “Descendants” of Yamna (once the culture was already “dead”), expanding to the east mainly with Corded Ware ancestry:

    Another recent open access paper on horse domestication is The horse Y chromosome as an informative marker for tracing sire lines, by Felkel et al. Scientific Reports (2019).


Mitogenomes suggest rapid expansion of domesticated horse before 3500 BC

Open access Origin and spread of Thoroughbred racehorses inferred from complete mitochondrial genome sequences: Phylogenomic and Bayesian coalescent perspectives, by Yoon et al. PLOS One (2018).

Abstract (emphasis mine)

The Thoroughbred horse breed was developed primarily for racing, and has a significant contribution to the qualitative improvement of many other horse breeds. Despite the importance of Thoroughbred racehorses in historical, cultural, and economical viewpoints, there was no temporal and spatial dynamics of them using the mitogenome sequences. To explore this topic, the complete mitochondrial genome sequences of 14 Thoroughbreds and two Przewalski’s horses were determined. These sequences were analyzed together along with 151 previously published horse mitochondrial genomes from a range of breeds across the globe using a Bayesian coalescent approach as well as Bayesian inference and maximum likelihood methods. The racing horses were revealed to have multiple maternal origins and to be closely related to horses from one Asian, two Middle Eastern, and five European breeds. Thoroughbred horse breed was not directly related to the Przewalski’s horse which has been regarded as the closest taxon to the all domestic horses and the only true wild horse species left in the world. Our phylogenomic analyses also supported that there was no apparent correlation between geographic origin or breed and the evolution of global horses. The most recent common ancestor of the Thoroughbreds lived approximately 8,100–111,500 years ago, which was significantly younger than the most recent common ancestor of modern horses (0.7286 My). Bayesian skyline plot revealed that the population expansion of modern horses, including Thoroughbreds, occurred approximately 5,500–11,000 years ago, which coincide with the start of domestication. This is the first phylogenomic study on the Thoroughbred racehorse in association with its spatio-temporal dynamics. 
The database and genetic history information of Thoroughbred mitogenomes obtained from the present study provide useful information for future horse improvement projects, as well as for the study of horse genomics, conservation, and in association with its geographical distribution.

Bayesian skyline plot (BSP) based on mitochondrial genome sequences from 167 modern horses.
The dark line in the BSP represents the estimated effective population size through time. The green area represents the 95% highest posterior density confidence intervals for this estimate.

Interesting excerpts:

We carried out a Bayesian coalescent approach using extended mitochondrial genome sequences from 167 horses in order to further assess the timescale of horse domestication. Here, we first calculated the time of the most recent common ancestor of Thoroughbred horses. Our analysis revealed the age of the most recent common ancestor of the racing horse to be around 8,100–111,500 years old. This estimate is much younger than that of the most recent common ancestor of the global horses, which has been estimated at 0.7286 Mys old.

Bayesian maximum clade credibility phylogenomic tree on the ground of the mitochondrial genome sequences of 167 modern horses.
The data set (16,432 base pairs) was also analyzed phylogenetically using Bayesian inference (BI) and maximum likelihood (ML) methods which showed the same topologies. 95% Highest Posterior Density of node heights are shown by blue bars. Groups are marked by a “G”. Numbers at the nodes represent (left to right): posterior probabilities (≥0.80) for the BI tree and bootstrap values (≥70%) for the ML tree. The racing horses were revealed to have multiple maternal origins and to be closely related to horses from one Asian, two Middle Eastern, and five European breeds. Results of phylogenomic analyses also uncovered no apparent association between geographic origin or breed and heterogeneity of global horses. The most recent common ancestor of the Thoroughbreds lived approximately 8,100–111,500 years ago, which was significantly younger than the most recent common ancestor of modern horses (0.7286 My).

On the domestication time of modern horses, there have been several publications derived from both archaeological [49–51] and molecular [11–12, 23, 48] evidences. D’Andrade [49] reported that the origin of domestic horses was around 4,000 years ago. Ludwig et al. [50] stated the domestication time to be about 5,000 years ago, while Anthony [51] noted that horse rearing by humans may have occurred approximately 6,000 years ago. Subsequently, on the basis of mitochondrial genome sequences, Lippold et al. [11] and Achilli et al. [12] postulated domestication time to be about 6,000–8,000 and 6,000–7,000 years ago, respectively. Warmuth [48] dated domestication time to 5,500 years ago based on autosomal genotype data, while Orlando et al. [23] claimed that Przewalski’s and domestic horse populations diverged 38,000–72,000 years ago based on analysis of genome sequences. In contrast to the previous hypothesized date of horse domestication, the results of our Bayesian skyline plot (BSP) analysis depict a rapid expansion of the horse population approximately 5,500–11,000 years ago, which coincides with the start of domestication.

It seems that we will not have an update on horse aDNA from the ISBA 8, so we will have to make do with this for the moment.


Origin of horse domestication likely on the North Caspian steppes

Open access Late Quaternary horses in Eurasia in the face of climate and vegetation change, by Leonardi et al. Science Advances (2018) 4(7):eaar5589.

Interesting excerpts (emphasis mine):

Here, we compiled an extensive continental-scale database, consisting of 3070 radiocarbon dates associated to horse paleontological and archeological finds across the whole of Eurasia, that has been analyzed in association with coarse-scale paleoclimatic reconstructions. We further collected the number of identified specimens (NISP) frequency data for horses versus other ungulates in 1120 archeological layers in Europe (…) This massive amount of data allowed us to track, with unprecedented details, how the geographic distribution of the species changed through time

Geographic range through time

For most analyses, the data have been divided into climatic periods: pre-LGM (older than 27 ka B.P.), LGM (27 to 18 ka B.P.), Late Glacial (18 to 11.7 ka B.P.), Preboreal (11.7 to 10.6 ka B.P.), Boreal (10.6 to 9.1 ka B.P.), Early Atlantic (9.1 to 7.5 ka B.P.), Late Atlantic (7.5 to 5.5 ka B.P.), and Recent (younger than 5.5 ka B.P.) (Fig. 1, A and B). The spatial and temporal distribution of horse remains compiled in our database reveals a strong imbalance in Eurasia (Fig. 1, A and B).
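
The period boundaries listed above can be turned into a small lookup, sketched here in Python. The function name is hypothetical, and the handling of dates that fall exactly on a boundary (assigned to the older period) is an assumption, since the excerpt does not specify it.

```python
# (label, younger bound, older bound) in ka B.P., taken from the excerpt.
PERIODS = [
    ("Recent", 0.0, 5.5),
    ("Late Atlantic", 5.5, 7.5),
    ("Early Atlantic", 7.5, 9.1),
    ("Boreal", 9.1, 10.6),
    ("Preboreal", 10.6, 11.7),
    ("Late Glacial", 11.7, 18.0),
    ("LGM", 18.0, 27.0),
]


def climatic_period(age_ka_bp):
    """Return the climatic period for an age given in thousands of years B.P."""
    if age_ka_bp < 0:
        raise ValueError("age must be non-negative")
    for label, younger, older in PERIODS:
        # Assumption: a date exactly on a boundary goes to the older period.
        if younger <= age_ka_bp < older:
            return label
    return "pre-LGM"


for age in (3.0, 5.5, 8.0, 24.0, 43.0):
    print(age, climatic_period(age))
```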

We found a common trend in both regions for a high number of occurrences at the end of the Pleistocene (with a decrease during the LGM, only visible in Europe), followed by a drastic reduction in the Early and Middle Holocene, and a relative increase toward more recent times. These included both the Early Atlantic in Europe, which started ~9.1 ka B.P., and the time range after 5.5 ka B.P. for Asia. The horse fossil record appears ubiquitous throughout Europe in the Late Pleistocene, while in the Early and Middle Holocene the finds are concentrated in central-western Europe and Iberia. From 7.5 ka B.P., the number of finds increases markedly, and the geographical distribution extends toward the east and southeast.

Horse occurrences through time. (A) Horse occurrences through time. Histograms showing the number of horse observations in Europe (left panel) and Asia (right panel) for each time bin (top) and for climatic period (bottom). Only time bins with more than 10 observations (black horizontal line) have been considered for the SDM analyses. From 22 ka B.P. backward (gray vertical line), time bins cover 2 ka following the available paleoclimatic reconstructions. The central map shows the boundaries considered while defining European and Asian regions, with the black line representing the Urals. The zoomed area shows the geographical resolution of the climatic reconstructions, with each pixel representing a grid cell. (B) Geographic distribution of horse occurrences. Maps showing horse occurrences for each climatic period in Europe (left) and Asia (right).

Different Asian and European niches

This analysis revealed that, in both continents, horses occupied only a portion of the climatic space available. The range covered by random locations shows that the paleoecological conditions present in Europe were only a subset of those found in Asia. However, European horses occupied a much wider climatic space than in Asia, with only limited overlap between the two ranges.

Horses conquered temperate environments from a European source

There is no evidence of climatic barriers between those two populations through time because the forecasts from Europe and Asia always overlap in central Eurasia, except 5 ka B.P. (figs. S3 and S4). An alternative explanation is the role of the Urals as a potential constraint for the dispersal of horses between Europe and north central Asia.

Climatic suitability. (A) Cumulative climatic suitability for the past 44 ka based on simulation on the European (left), Eurasian (middle), and Asian (right) data sets. To correct for sampling bias in the Eurasian data set, for each time slice, all estimates and projections for Eurasia are performed considering 100 random resampling of European occurrences in the same number as Asian occurrences. The darker the colors, the more stable the climatic suitability for horses (climatic niche = p-Hor) through time. (B) Projection of climatic suitability across Eurasia in different climatic periods based on occurrences in Europe (left), Eurasia (middle), and Asia (right). Because of the scarcity of data available for Asia, no models for the Holocene have been possible for both Asia and Eurasia, with the exception of 5 and 3 ka B.P. (both included in the “Recent” period).

Climatic and habitat association patterns for horses in Europe support increasing habitat fragmentation

The decrease of horse remains in Europe is not characterized by a geographic reduction in the overall extent of the area occupied by the species but in a drop of frequencies in a geographic extent that does not vary much between the Late Glacial and the Early Atlantic (Figs. 1B and 4B). This pattern is more likely to result from habitat fragmentation than from a geographic shift in the climatic range suitable for the species, as observed for many animals during the LGM (23).

In the whole period ranging from the Preboreal (11.7 to 10.6 ka B.P.) to the Late Atlantic (7.5 to 5.5 ka B.P.), the total amount of land space most and likely suitable to horses is wider than in the Late Glacial, and only between 8 to 7 ka ago the European range appears patchy and fragmented (Fig. 4C). When comparing each of four successive time bins during the Holocene (8, 7, 6, and 5 ka B.P., respectively) (Fig. 4E), the difference in successive p-Hor values in Europe shows that the suitability for the species in Iberia, northeastern France, Italy, the Balkans, and eastern Europe steadily increased, while in Central Europe strong differences can be observed between neighboring regions.

Analyses of the European data set and biome frequency. (A) Distribution through time of the frequency of horse remains in Europe calculated as NISP of horses versus other ungulates. (B) Density of horse remains through time in Europe, calculated as NISP of horses versus other ungulates. The numbers at the bottom of each bar represent the number of observations falling in each class, from 0 to >5%. (C) Climatic suitability for horses in Europe between 10 and 3 ka B.P. (D) Climatic suitability per time period. Percentage of land cells in Europe with a value of suitability for horses (p-Hor) > 0.5 and p-Hor > 0.8. (E) Holocene climatic amelioration. Difference in p-Hor in Europe comparing five successive time bins during the Holocene: 9, 8, 7, 6, and 5 ka B.P. Each map shows the difference in the more recent distribution compared to the previous one. (F) Environmental reconstructions in the macro area surrounding horse finds in Europe (left) and Asia (right) per climatic period. The lighter the color, the less forested is the region. The numbers at the bottom of the bars show the number of occurrences in closed environments over all the observations. The dotted line represents a frequency of 0.5.
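
Two of the quantities in this caption are simple proportions, sketched below in Python. The sketch assumes that the NISP frequency is the share of horse specimens among all ungulate specimens in a layer (the caption's "versus" could also be read as a plain ratio), and that suitability is summarized as the share of land cells above a p-Hor threshold. The function names and the sample values are hypothetical.

```python
def horse_nisp_frequency(horse_nisp, other_ungulate_nisp):
    """Share of horse specimens among all identified ungulate specimens."""
    total = horse_nisp + other_ungulate_nisp
    if total == 0:
        raise ValueError("layer contains no identified ungulate specimens")
    return horse_nisp / total


def fraction_suitable(p_hor_values, threshold):
    """Share of land cells whose suitability (p-Hor) exceeds the threshold."""
    if not p_hor_values:
        raise ValueError("at least one land cell is required")
    above = sum(1 for p in p_hor_values if p > threshold)
    return above / len(p_hor_values)


# Hypothetical layer: 5 horse specimens and 95 from other ungulates.
print(horse_nisp_frequency(5, 95))

# Hypothetical suitability values for four land cells.
cells = [0.9, 0.6, 0.4, 0.2]
print(fraction_suitable(cells, 0.5))
print(fraction_suitable(cells, 0.8))
```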

Taken at face value, this pattern would suggest that horses were not restricted to open environments but could equally well inhabit closed, forested environments, as previously suggested (18). However, as others recently emphasized (19), the faunal associations in Holocene sites from Europe suggest a different pattern. The PCAs based on faunal assemblages (figs. S1 and S2) separate on the second principal component sites characterized by ungulates associated to forested areas (red deer, wild boar, and roe deer) and all other animals, associated to semi-open and open environments, including horses for most records.

Together, the contrast between the reconstructed microscale and macroscale vegetable coverage in Europe, the increase of horses in mainly forested macroregions, and the spatial pattern of extinction suggest that, from the beginning of the Holocene, the suitable environment became more and more patchy, with open areas increasingly fragmented by forests, where wild populations of horses could have survived in isolation until one or several waves of arrivals of domestic horses, leading to either local admixture or a full replacement of the preexisting local populations.


Our data show that, up to 5.5 ka ago, horse finds do not show association with species characteristic of forested areas such as wild boar and roe deer. We infer that the open and semi-open habitats occupied by horses on a narrow geographic scale appear less and less frequent at a macroenvironmental scale, supporting the possibility of increasing fragmentation of open habitats. This event is also likely to have led to an intensification of genetic isolation for the remaining horse populations, a pattern that still needs to be tested on genomic data.

The suitability of both Iberia and eastern Europe appears constant throughout the entire post-LGM period, in line with these regions being hotspots of genetic diversity and, possibly, the refugia sources for the recolonization of the continent (11). While the Pontic-Caspian region appears not suitable for European horses around the time when horses were first domesticated some 5.5 ka ago (6), part of this region appears suitable for the Asian horses (with the Caspian Sea as the westernmost boundary). This may suggest that horse domestication started from a population background related to an Asian ancestry and that the further spread of the domesticated horses in Europe involved either adaptation to novel niches (possibly through selective breeding) or the application of domestication techniques to local horse populations pre-adapted to these environmental conditions. Testing this scenario will require mapping the genetic structure of the Eurasian horse population within the fifth to third millennium BCE.

Some remarks

Cultural-anthropological research and archaeological remains (see here), genetics (see here and here), and now also thorough palaeoclimatic and archaeological models point to the North Caspian region, settled by the Khvalynsk culture, as the most likely earliest origin of horse domestication. The paper also supports the favorable conditions of western Europe up to Iberia for the introduction of a horse-riding culture.

I intended to write a post about the myth of Corded Ware horse riders, but for the moment I haven’t found the time. Not that Corded Ware pastoralists didn’t have horses, or could not ride them: they were a highly mobile culture of pastoralists stemming from eastern Poland / western Ukraine, so they must have known horses, like many other European cultures of the late 4th / early 3rd millennium influenced by expanding Yamna settlers. But the horse just cannot be said to have formed an essential part of their culture, as it did for Khvalynsk-Novodanilovka, and especially Yamna and later East Bell Beaker, Sintashta, etc.

A mere look at these maps suffices to assess the limited role of the horse in north-eastern Europe, the only region where groups of late Corded Ware-derived cultures survived the expansion of Yamna, and especially East Bell Beakers after ca. 2500 BC, which transformed Western, Northern, and Central Europe, and even East Europe reaching the modern Baltic countries, Belarus, and Romania. Even Trzciniec was born out of the influence from expanding Bell Beakers into earlier Corded Ware territory, although the later (Iron Age) relevance of this culture was probably quite limited.

As you can imagine, without horses and horse symbolism, horse riding, carts, and intensive cattle-breeding (associated with Yamna and the broad, east-central European grasslands typical of steppe regions), there can be no Proto-Indo-European, whose reconstructed vocabulary is particularly rich in horse-related words, and whose reconstructed culture, society, and religion cannot be understood without the domesticated horse. In forest regions to the north-east and eastern Europe, there was apparently little space for horses, but plenty of room for other ungulates and thus hunting, and indeed Uralic languages

In the upcoming months we will see R1a-fans associating Proto-Indo-Europeans more and more with wool, and sheep, and corded ware, and forest regions, until the proposed homeland shifts to the Baltic and Finland, instead of dat boring horse-riding people of the steppes…No wait, it’s already happening.

NOTE. Also open access is the recent Horse Y chromosome assembly displays unique evolutionary features and putative stallion fertility genes, by Janečka et al. Nature Communications (2018).


Decline of genetic diversity in ancient domestic stallions in Europe

Open access research article Decline of genetic diversity in ancient domestic stallions in Europe, by Wutke et al., Science Advances (2018), 4(4):eaap9691.

Abstract (emphasis mine):

Present-day domestic horses are immensely diverse in their maternally inherited mitochondrial DNA, yet they show very little variation on their paternally inherited Y chromosome. Although it has recently been shown that Y chromosomal diversity in domestic horses was higher at least until the Iron Age, when and why this diversity disappeared remain controversial questions. We genotyped 16 recently discovered Y chromosomal single-nucleotide polymorphisms in 96 ancient Eurasian stallions spanning the early domestication stages (Copper and Bronze Age) to the Middle Ages. Using this Y chromosomal time series, which covers nearly the entire history of horse domestication, we reveal how Y chromosomal diversity changed over time. Our results also show that the lack of multiple stallion lineages in the extant domestic population is caused by neither a founder effect nor random demographic effects but instead is the result of artificial selection—initially during the Iron Age by nomadic people from the Eurasian steppes and later during the Roman period. Moreover, the modern domestic haplotype probably derived from another, already advantageous, haplotype, most likely after the beginning of the domestication. In line with recent findings indicating that the Przewalski and domestic horse lineages remained connected by gene flow after they diverged about 45,000 years ago, we present evidence for Y chromosomal introgression of Przewalski horses into the gene pool of European domestic horses at least until medieval times.

The frequencies of Y chromosome haplotypes started to change during the Late Bronze Age (1600–900 BCE).
Inferred temporal trajectories of haplotype frequencies. Each haplotype is displayed by a different color. The shaded area represents the 95% highest-density region. The trajectories were constructed taking the median values across frequencies from the simulations of the Bayesian posterior sample. The small chart represents the stacked frequencies; the amplitude of each colored area is proportional to the median haplotype frequencies (normalized) at a given time. The x and y axes of the small chart match those in the large one. Ka, thousands of years.

Interesting excerpts:

The first record of the modern domestic Y chromosome haplotype stems from two Bronze Age samples of similar age. Notably, both samples were found in two distantly located regions: present-day Slovakia (2000–1600 BCE, dated by archaeological context) and western Siberia (14C-dated: 1609–1436 cal. BCE). Although a very recent study proposes an oriental origin of this haplotype (14), we cannot determine the geographical origin of Y-HT-1 with certainty, because this haplotype has not been found thus far in predomestic or wild stallions. There are two possible scenarios: (i) Y-HT-1 emerged within the domestic population by mutation and (ii) Y-HT-1 was already present in wild horses and entered the domestic population either at the beginning of domestication (but initially restricted to Asian horses) or later by introgression (from wild Y-HT-1 carrying studs during the Iron Age). Crosses between domestic animals and their wild counterparts have been observed in several domestic species (15–18); thus, the simplest explanation would be that we missed Y-HT-1 in older samples because of limited geographical sampling. However, the estimated haplotype age is contemporary (Fig. 4) with the assumed starting point of horse domestication ~4000–3500 BCE (19), rendering it likely that Y-HT-1 originated within the domestic horse gene pool. Still, we cannot rule out definitively that it appeared before domestication.

Independent of its geographical origin, Y-HT-1 progressively replaced all other haplotypes—except for one additional lineage that is restricted to Yakutian horses (11). Considering our data, this trend in paternal diversity toward dominance of the modern lineage appears to start in the Bronze Age and becomes even more pronounced during the Iron Age. The Bronze Age was a time of large-scale human migrations across Eurasia (20–22), movements that were undoubtedly facilitated by the spread of horses as a means of transport and warfare. At that time, the western Eurasian steppes were inhabited by highly mobile cultures that largely relied on horses (20, 21, 23, 24). The genetic admixture of northern and central European humans with Caucasians/eastern Europeans did correlate with the spread of the Yamnaya culture from the Pontic-Caspian steppe (25), an area that has repeatedly been suggested as the center of horse domestication (19, 26, 27). Given the importance of domestic horses, it appears that deliberate selection/rejection of certain stallions by these people might have contributed to the loss of paternal diversity. The spread of humans out of this region might also have resulted in the spread of Y-HT-1 from Asia to Europe. This scenario also agrees with recent findings that the low male diversity of extant horses is not caused by recruiting only a limited number of stallions during early domestication (13).
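NOTE. To make the idea of a "loss of paternal diversity" more concrete, here is a minimal sketch of how haplotype diversity could be compared across time bins, using Nei's standard unbiased estimator. The sample counts and the labels "other-A" and "other-B" are hypothetical placeholders chosen for illustration; they are not the paper's data, and the paper itself relies on Bayesian simulations of frequency trajectories rather than on this simple summary statistic.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n / (n - 1) * (1 - sum(p_i ** 2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical time bins, oldest first (illustrative counts, NOT from the paper).
bins = {
    "Bronze Age": ["Y-HT-1"] * 2 + ["Y-HT-2"] * 3 + ["other-A"] * 3 + ["other-B"] * 2,
    "Iron Age": ["Y-HT-1"] * 6 + ["other-A"] * 3 + ["other-B"] * 1,
    "Middle Ages": ["Y-HT-1"] * 10,
}

for period, samples in bins.items():
    print(f"{period}: h = {haplotype_diversity(samples):.3f}")
```

With these made-up counts the statistic falls from roughly 0.82 to 0.60 and then to 0 once a single haplotype is fixed, which is the kind of trend the excerpt describes for Y-HT-1.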

Decline of paternal diversity began in Asia.
Maps displaying age, locality, and haplotype (different colors) of each successfully genotyped sample.

The presence of the Y chromosome haplotype carried by present-day Przewalski horses (Y-HT-2) in early domestic stallions and a European wild horse (Pie05; table S2) could be the result of introgression of Przewalski stallions. Although the original distribution of the Przewalski horse is unknown, it was probably much larger than that of the relict population in Mongolia that produced modern Przewalski horses and might even have extended into Central Europe. However, it is also possible that either Przewalski horses were among the initially domesticated horses or that Y-HT-2 occurred both in Przewalski horses and in those wild horses that are the ancestors of domestic horses, based on autosomal DNA data (30). Regardless of how Y-HT-2 entered the domestic gene pool, it was eventually lost, as were all haplotypes except Y-HT-1. In our sample set, Y-HT-2 was undetectable as early as the third time bin. However, it is possible that Y-HT-2 may have been present during this time period, but with a frequency below 0.11 (with 95% probability). The inferred time trajectories for Y-HT-2 frequencies suggest that it could nevertheless have persisted at very low frequencies until the Middle Ages (Fig. 3). On the basis of these simulations, this finding could be interpreted as a relic of this haplotype’s formerly higher frequency in the domestic horse gene pool. It is also possible that the presence of this haplotype could be the result of mating a wild stallion with a domestic mare, a frequently reported breeding practice when wild horses were still widely distributed. However, a significant contribution of the Przewalski horse to the gene pool of modern domestic horses has been almost ruled out by recent genomic studies (13, 31, 32).
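NOTE. The statement that an undetected haplotype may still have been present "with a frequency below 0.11 (with 95% probability)" can be illustrated with a simple binomial argument: if a haplotype at frequency p is missed in all of n independent samples, that happens with probability (1 - p)^n, so the largest p still compatible with zero detections at the chosen level solves (1 - p)^n = 0.05. This is only a sketch under assumptions of my own. The function name and the sample sizes are hypothetical, and the paper derives its bound from Bayesian simulations, not from this formula.

```python
def max_undetected_frequency(n_samples, credibility=0.95):
    """Largest frequency p for which missing the haplotype in all
    n_samples draws still has probability >= 1 - credibility,
    i.e. the solution of (1 - p) ** n_samples = 1 - credibility."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    alpha = 1 - credibility
    return 1 - alpha ** (1 / n_samples)


# Hypothetical sample sizes for one time bin (not taken from the paper).
for n in (10, 26, 50):
    print(f"n = {n}: frequency below {max_undetected_frequency(n):.3f}")
```

Under this simplified model, a bin of about 26 genotyped stallions with no Y-HT-2 carriers would give an upper bound close to 0.11; smaller bins give looser bounds, larger ones tighter bounds.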

Stallion lineages through time.
Temporal haplotype network of the four detected Y chromosome haplotypes. Age of the samples indicated by multiple layers separated by color; vertical lines connecting the haplotypes of consecutive layers/ages represent which haplotype was transferred into a later/younger period. Numbers constitute the respective number of individuals showing this particular haplotype for that period. Prz, Przewalski; Dom, domestic.


Ancient DNA upends the horse family tree

New paper, behind paywall, Ancient genomes revisit the ancestry of domestic and Przewalski’s horses, by Gaunitz et al., Science (2018).

Abstract:

The Eneolithic Botai culture of the Central Asian steppes provides the earliest archaeological evidence for horse husbandry, ~5,500 ya, but the exact nature of early horse domestication remains controversial. We generated 42 ancient horse genomes, including 20 from Botai. Compared to 46 published ancient and modern horse genomes, our data indicate that Przewalski’s horses are the feral descendants of horses herded at Botai and not truly wild horses. All domestic horses dated from ~4,000 ya to present only show ~2.7% of Botai-related ancestry. This indicates that a massive genomic turnover underpins the expansion of the horse stock that gave rise to modern domesticates, which coincides with large-scale human population expansions during the Early Bronze Age.

You can read more about it in the article Ancient DNA upends the horse family tree.

Excerpts, from the article (emphasis mine):

That none of the domesticates sampled in the last ~4,000 years descend from the horses first herded at Botai entails another major implication. It suggests that during the 3rd Mill BCE at the latest, another unrelated group of horses became the source of all domestic populations that expanded thereafter. This is compatible with two scenarios. First, Botai-type horses experienced massive introgression capture (22) from a population of wild horses until the Botai ancestry was almost completely replaced. Alternatively, horses were successfully domesticated in a second domestication center and incorporated minute amounts of Botai ancestry during their expansion. We cannot identify the locus of this hypothetic center due to a temporal gap in our dataset throughout the 3rd Mill BCE. However, that the DOM2 earliest member was excavated in Hungary adds Eastern Europe to other candidates already suggested, including the Pontic-Caspian steppe (2), Eastern Anatolia (23), Iberia (24), Western Iran and the Levant (25). Notwithstanding the process underlying the genomic turnover observed, the clustering of ~4,023-3,574 year-old specimens from Russia, Romania and Georgia within DOM2 suggests that this clade already expanded throughout the steppes and Europe at the transition between the 3rd and 2nd Mill BCE, in line with the demographic expansion at ~4,500 ya recovered in mitochondrial Bayesian Skylines (fig. S14).

Admixture graphs. (A to F) The six scenarios tested. Panel (A) received decisive Bayes Factor support, as indicated below each corresponding alternative scenario tested. Domestic-Ancient and Domestic-A/B refer to three phylogenetic clusters identified within DOM2 (excluding Duk2): ancient individuals; modern Mongolian, Yakutian (including Tumeski_CGG101397) and Jeju horses, and; all remaining modern breeds. (G) Posterior distributions of admixture proportions along p1 and p2 branches.

This study shows that the horses exploited by the Botai people later became the feral PH. Early domestication most likely followed the ‘prey pathway’ whereby a hunting relationship was intensified until reaching concern for future progeny through husbandry, exploitation of milk and harnessing (7). Other horses, however, were the main source of domestic stock over the last ~4,000 years or more. Ancient human genomics (26) has revealed considerable human migrations ~5,000 ya involving “Yamnaya” culture pastoralists of the Pontic-Caspian steppe. This expansion might be associated with the genomic turnover identified in horses, especially if Botai horses were better suited to localized pastoral activity than to long distance travel and warfare. Future work must focus on identifying the main source of the domestic horse stock and investigating how the multiple human cultures managed the available genetic variation to forge the many horse types known in history.

We are seeing that Bell Beakers were obviously horse riders, and that their horses must have derived from those of Yamna riders, so it is quite possible that their ancestral early Khvalynsk culture was the origin of domesticated horses, as proposed by David W. Anthony, although for the moment we only know “that [horse] domestication could have been a process with many phases, experiments, failures, and successes”.

EDIT (23 FEB 2018): My interpretation errors removed, thanks to the comments.