Domesticated horse population structure, selection, and mtDNA geographic patterns


Open access Detecting the Population Structure and Scanning for Signatures of Selection in Horses (Equus caballus) From Whole-Genome Sequencing Data, by Zhang et al., Evolutionary Bioinformatics (2018) 14:1–9.

Abstract (emphasis mine):

Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kB, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. 
The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
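The abstract reports 1052 significant 10 kb windows that correspond to 933 distinct core regions, which implies that neighbouring windows were collapsed into single regions. A minimal sketch of that kind of merging, assuming that windows which touch or overlap on the same chromosome belong to one region; the actual rule used by Zhang et al. is not given in the abstract, and `merge_windows` and the toy coordinates are hypothetical:

```python
from typing import List, Tuple

Window = Tuple[str, int]        # (chromosome, window start in bp)
Region = Tuple[str, int, int]   # (chromosome, region start, region end)


def merge_windows(windows: List[Window], size: int = 10_000) -> List[Region]:
    """Collapse significant windows that touch or overlap into core regions."""
    regions: List[Region] = []
    for chrom, start in sorted(windows):
        end = start + size
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # This window touches the previous region: extend it.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # Different chromosome or a gap: start a new region.
            regions.append((chrom, start, end))
    return regions


# Hypothetical significant windows (not taken from the paper).
hits = [("chr1", 0), ("chr1", 10_000), ("chr1", 50_000), ("chr2", 10_000)]
core_regions = merge_windows(hits)
print(len(hits), "windows ->", len(core_regions), "core regions")
```

With these toy inputs, the two adjacent windows on chr1 become one region, so four windows yield three core regions.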

Bayesian clustering output for 5 K values from K = 2 to K = 8 in 45 domestic horses. Each individual is represented by a vertical line, which is partitioned into colored segments that represent the proportion of the inferred K clusters.

Interesting excerpts:

Admixture proportions were assessed without user-defined population information to infer the presence of distinct populations among the samples (Figure 2). At K = 3 or K = 4, Franches-Montagnes and Arabian forms one unique cluster; at K = 5, Jeju pony forms one unique cluster. For other breeds, comparatively strong population structure exists among breeds, and they can be assigned to 2 (or 3) alternate clusters from K = 3 to K = 5 including group A (Duelmener, Fjord, Icelandic, Kazakh, Lichuan, and Mongolian) and group B (Hanoverian, Morgan, Quarter, Sorraia, and Standardbred). For group A, geographically this was unexpected, where Nordic breeds (Norwegian Fjord, Icelandic, and Duelmener) clustered with Asian breeds including the Mongolian. Previous results of mitochondrial DNA have revealed links between the Mongolian horse and breeds in Iceland, Scandinavia, Central Europe, and the British Isles. The Mongol horses are believed to have been originally imported from Russia subsequently became the basis for the Norwegian Fjord horse [31]. At K = 6, Sorraia forms one unique cluster. The Sorraia horse has no long history as a domestic breed but is considered to be of a nearly ancestral type in the southern part of the Iberian Peninsula [32]. However, our result did not support Sorraia as an independent ancestral type based on result from K = 2 to K = 5, and the unique cluster in K = 6 may be explained by the small population size and recently inbreeding programs. Genetic admixture of Morgan reveals that these breeds are currently or traditionally continually crossed with other breeds from K = 2 to K = 8. The Morgan horse has been a largely closed breed for 200 years or more but there has been some unreported crossbreeding in recent times [33].

Principal component analysis results of all 48 horses. The x-axis denotes the value of PC1, whereas the y-axis denotes the value of PC2. Each dot in the figure represents one individual.

Bayesian clustering and PCA demonstrated the relationships among the horse breeds with weak geographic patterns. The tight grouping within most native breeds and looser grouping of individuals in admixed breeds have been reported previously in modern horses using data from a 54K SNP chip [33,34]. Cluster analysis reveals that Arabian or Franches-Montagnes forms one unique cluster with relatively low K value, which is consistent with former study using 50K SNP chip [33,34]. Interestingly, Standardbred forms a unique cluster with relatively high K value in this study, different from previous study [33]. To date, no footprints are available to describe how the earliest domestic horses spread into China in ancient times. Our study found that Kazakh and Lichuan were assigned to the same lineage as other native Asian breeds, in agreement with previous studies on the origin of Chinese domestic horses [4,5,35,36]. The strong genetic relationship between Asian native breeds and European native breeds have made it more difficult to understand the population history of the horse across Eurasia. Low levels of population differentiation observed between breeds might be explained by historical admixture. Unlike the domestic pig in China [8], we suggest that in China, Northern/Southern distinct groups could not be used to genetically distinct native Chinese horse breeds. We consider that during domestication process of horse, gene flow continued among Chinese-domesticated horses.
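For readers unfamiliar with how a PCA plot like the one above separates individuals, here is a minimal sketch in plain Python. It assumes genotypes coded as 0/1/2 copies of an allele and finds the first principal component by power iteration on the individual-by-individual Gram matrix. The genotypes are toy values, the function names are hypothetical, and this is not the pipeline used by Zhang et al.

```python
from typing import List

Matrix = List[List[float]]


def center_columns(rows: Matrix) -> Matrix:
    """Subtract each SNP's mean genotype so every column averages to zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_pc(rows: Matrix, iterations: int = 200) -> List[float]:
    """Direction of PC1 in individual space, via power iteration."""
    x = center_columns(rows)
    n = len(x)
    # Gram matrix: similarity between every pair of individuals.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(val * val for val in w) ** 0.5
        v = [val / norm for val in w]
    return v


# Toy genotypes (rows = horses, columns = SNPs); two made-up groups.
genotypes = [
    [0, 0, 2, 2],  # group A
    [0, 1, 2, 2],  # group A
    [2, 2, 0, 0],  # group B
    [2, 2, 0, 1],  # group B
]
pc1 = first_pc(genotypes)
print([round(val, 3) for val in pc1])
```

Individuals from the same toy group end up with the same sign on PC1, which is the "tight grouping" effect the excerpt describes; real data would of course use millions of SNPs and an optimized library.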


Open access Some maternal lineages of domestic horses may have origins in East Asia revealed with further evidence of mitochondrial genomes and HVR-1 sequences, by Ma et al., PeerJ (2018).

Abstract:

Objectives
There are large populations of indigenous horse (Equus caballus) in China and some other parts of East Asia. However, their matrilineal genetic diversity and origin remained poorly understood. Using a combination of mitochondrial DNA (mtDNA) and hypervariable region (HVR-1) sequences, we aim to investigate the origin of matrilineal inheritance in these domestic horses.

Methods
To investigate patterns of matrilineal inheritance in domestic horses, we conducted a phylogenetic study using 31 de novo mtDNA genomes together with 317 others from the GenBank. In terms of the updated phylogeny, a total of 5,180 horse mitochondrial HVR-1 sequences were analyzed.

Results
Eighteen haplogroups (Aw-Rw) were uncovered from the analysis of the whole mitochondrial genomes. Most of which have a divergence time before the earliest domestication of wild horses (about 5,800 years ago) and during the Upper Paleolithic (35–10 KYA). The distribution of some haplogroups shows geographic patterns. The Lw haplogroup contained a significantly higher proportion of European horses than the horses from other regions, while haplogroups Jw, Rw, and some maternal lineages of Cw, have a higher frequency in the horses from East Asia. The 5,180 sequences of horse mitochondrial HVR-1 form nine major haplogroups (A-I). We revealed a corresponding relationship between the haplotypes of HVR-1 and those of whole mitochondrial DNA sequences. The data of the HVR-1 sequences also suggests that Jw, Rw, and some haplotypes of Cw may have originated in East Asia while Lw probably formed in Europe.

Conclusions
Our study supports the hypothesis of the multiple origins of the maternal lineage of domestic horses and some maternal lineages of domestic horses may have originated from East Asia.

Median joining network constructed based on the 247-bp HVR-1 sequences. Circles are proportional to the number of horses represented and a scale indicator (for node sizes) was provided. The length of lines represents the number of variants that separate nodes (some manual adjustment was made for visually good). In the circles, the colors of solid pie slices indicate studied horse populations: Orange, European horses; Blue, horses of West Asia; Light Green, horses from East Asia; Grey, ancient horses; Purple, Przewalskii horses.

Geographic distributions of horse mtDNA haplogroups

The analysis of geographic distribution of the mitochondrial genome haplogroups showed that horse populations in Europe or East Asia included all haplogroups defined from the mtDNA genome sequences. The lineage Fw comprised entirely of Przewalskii horses. The two haplogroups Iw and Lw displayed frequency peaks in Europe (14.08% and 37.32%, respectively) and a decline to the east (9.33% and 8.00% in the West Asia, and 6.45% and 12.90% in East Asia, respectively), especially for Lw, which contained the largest number of European horses (Table 2). However, an opposite distribution pattern was observed for haplogroups Aw, Hw, Jw, and Rw, which were harbored by more horses from East Asia than those from other regions. The proportions of horses from East Asia for the four haplogroups were 38%, 88%, 62%, and 54%, respectively.
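Note that the excerpt mixes two different percentages: the frequency of a haplogroup within a region (e.g. Lw at 37.32% in Europe) and the share of a haplogroup's horses that come from a region (e.g. 88% of Hw from East Asia). A minimal sketch of the difference follows. The counts are hypothetical: they were chosen only so that the within-region frequencies of Lw match the percentages quoted above, they are not taken from Table 2, and the resulting regional share is purely illustrative.

```python
from typing import Dict

Counts = Dict[str, Dict[str, int]]  # haplogroup -> region -> number of horses

# Hypothetical counts (NOT from the paper's Table 2).
counts: Counts = {
    "Lw": {"Europe": 53, "West Asia": 6, "East Asia": 12},
    "other": {"Europe": 89, "West Asia": 69, "East Asia": 81},
}


def frequency_within_region(data: Counts, haplogroup: str, region: str) -> float:
    """Fraction of the horses sampled in `region` that carry `haplogroup`."""
    region_total = sum(by_region[region] for by_region in data.values())
    return data[haplogroup][region] / region_total


def regional_share_of_haplogroup(data: Counts, haplogroup: str, region: str) -> float:
    """Fraction of the horses carrying `haplogroup` that were sampled in `region`."""
    haplogroup_total = sum(data[haplogroup].values())
    return data[haplogroup][region] / haplogroup_total


lw_in_europe = frequency_within_region(counts, "Lw", "Europe")
europe_share_of_lw = regional_share_of_haplogroup(counts, "Lw", "Europe")
print(f"Lw frequency within Europe: {lw_in_europe:.2%}")
print(f"Share of Lw horses from Europe: {europe_share_of_lw:.2%}")
```

The second measure depends heavily on how many horses were sampled per region, so the two should not be compared directly.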

Schematic phylogeny of mtDNAs genome from modern horses. This tree includes 348 sequences and was rooted at a donkey (E. asinus) mitochondrial genome (not displayed). The topology was inferred by a BEAST approach, whereas a time divergence scale (based on rate substitutions) is shown on the bottom (age estimates were indicated with thousand years (KY)). The percentages on each branch represent Bayesian posterior credibility and the alphabets on the right represent the names of haplogroups. Additional details concerning ages were given in Tables S3 and S6.


Domestication spread probably via the North Pontic steppe to Khvalynsk… but not horse riding

Interesting paper Excavation at the Razdolnoe site on the Kalmius river in 2010, by N. Kotova, D. Anthony, D. Brown, S. Degermendzhy, P. Crabtree, In: Archaeology and Palaeoecology of the Ukrainian Steppe / IA NAS of Ukraine, Kyiv 2017.

Probably nothing new to those who have read Anthony (2007), but this new publication of his research on the North Pontic region seems to contradict recent papers which cast doubt on the presence of early forms of domestication in the North Pontic steppe, and would thus also reject the arrival of domestication in Khvalynsk by a southern route.

Interesting excerpts discussing recent research and results of this one (emphasis mine):

A brief comment about the fauna is required. A separate international archaeological project studied sites dated to the mid-6th millennium BC in the Severskiy Donets basin (Starobelsk I, Novoselovka III) northeast of Razdolnoe, and found that they had hunting and gathering economies that made use of Unio shellfish, fish, and turtles, like the Neolithic occupation at Razdolnoe. But the Donets sites had no domesticated animal species. The author argued that the cultures of the Donets and lower Don basins in the 6th millennium BC probably had no domesticated animals, and that the domesticated sheep-goat bones identified at Semenovka, west of Razdolnoe, and dated to 5500 calBC, probably were mis-identified and actually came from wild saiga antelope (Motuzaite-Matuzeviciute 2012: 14). This suggestion was made on the basis of a single bone identified as sheep-goat at Semenovka by O.P. Zhuravlev (not N.S. Kotova as Motuzaite-Matuzeviciute wrote) and sent out for radiocarbon dating, that was re-examined by Cambridge University archaeozoologists.

Regardless of which identification is correct, a single bone is insufficient to cast doubt on sheep-goat bones identified at Sredni Stog 1, Sobachki, and other Neolithic sites in the Dnieper valley. Nevertheless, yet another international collaboration that studied the economy of Dereivka in the Dnieper valley argued that the economy of Eneolithic Dereivka site, which they dated to about 3500 calBC (ignoring 10 radiocarbon dates between 4200–3700 calBC), was still at an «initial phase of animal domestication» and that the Dereivka occupants of 3500 calBC were still largely dependent on hunting and fishing (Mileto et al. 2017: 67–68).

The dated Bos calf in the lower occupation level at Razdolnoe shows that domesticated animals were present in the Kalmius river valley in the Azov steppes in 5500 calBC, at a time when the cultures of the Donets valley were still hunters and gatherers just 200 km to the northeast of Razdolnoe. Sheep-goat and Bos bones were found in all Neolithic and Eneolithic levels at Razdolnoe. Because it was a small excavation, this evidence should not be over-interpreted. We cannot say how important domesticated animals were in the daily diet. But domesticated sheep-goat and cows had reached the Azov steppes by 5500 calBC. The appearance of cattle and sheep-goat as sacrificial animals in graves of the Khvalynsk Culture on the Volga by the early 5th millennium BC probably was a continuation of the spread of animal herding eastward from the Azov steppes.

Most likely route of expansion of horse domestication and horse riding (including Suvorovo-Novodanilovka chiefs) from Khvalynsk into the North Pontic steppe and the Balkans.

Re-reading the papers on this subject, in which researchers seem to be fighting each other over radical interpretations of a few animal bones, I would suggest that the key concept to emphasize is probably not the ‘presence’ vs. ‘absence’ of domestication in North Pontic steppe cultures in absolute terms.

Since there were clearly domesticated animals to the east and west of North Pontic cultures in the Neolithic, finding domesticated animals there is more than likely. What is of great interest is the relative degree to which forest-steppe economies relied on domesticated animals, compared to their use of available natural resources.

After all, many researchers currently agree that the North Pontic steppe and forest-steppe peoples formed communities of mainly hunter-fishers and gatherers, and findings of this paper do not seem to contradict this.

NOTE. In fact, a more recent paper that I referenced, probably written at the same time as this one, argues along these general lines in detail: Mileto et al. (2018), by one of the authors they discuss.

Also, as the paper states,

we want to emphasize that even a small excavation in the steppe zone, where only scanty number of the Neolithic and Eneolithic sites have been known yet, is very important and always gives very interesting materials.

Hence, by confirming Anthony’s account of early domestication spreading eastwards during the Neolithic expansion, and with no horse remains in any of the periods investigated (including Sredni Stog I-III), the paper also supports his hypothesis of horse riding emerging in Khvalynsk and expanding westward.

The Razdolnoe site lies near modern-day Donetsk, and its latest investigated layer (ca. 4300-4150 BC) thus represents the eastern variant of Sredni Stog III, and consequently the variant in closer contact with the expanding early Khvalynsk culture.

Given the absence of horse remains in all layers, these results would also suggest that Novodanilovka and Suvorovo horse-riding chiefs (emerging ca. 4400-4200 BC to the west of this region) were indeed unrelated to the surrounding Sredni Stog population, and most likely migrants from the horse-riding Khvalynsk culture.

Featured image: Expansion of domestication in the Pontic-Caspian steppe, according to Anthony (2007).


Decline of genetic diversity in ancient domestic stallions in Europe

Open access research article Decline of genetic diversity in ancient domestic stallions in Europe, by Wutke et al., Science Advances (2018), 4(4):eaap9691.

Abstract (emphasis mine):

Present-day domestic horses are immensely diverse in their maternally inherited mitochondrial DNA, yet they show very little variation on their paternally inherited Y chromosome. Although it has recently been shown that Y chromosomal diversity in domestic horses was higher at least until the Iron Age, when and why this diversity disappeared remain controversial questions. We genotyped 16 recently discovered Y chromosomal single-nucleotide polymorphisms in 96 ancient Eurasian stallions spanning the early domestication stages (Copper and Bronze Age) to the Middle Ages. Using this Y chromosomal time series, which covers nearly the entire history of horse domestication, we reveal how Y chromosomal diversity changed over time. Our results also show that the lack of multiple stallion lineages in the extant domestic population is caused by neither a founder effect nor random demographic effects but instead is the result of artificial selection—initially during the Iron Age by nomadic people from the Eurasian steppes and later during the Roman period. Moreover, the modern domestic haplotype probably derived from another, already advantageous, haplotype, most likely after the beginning of the domestication. In line with recent findings indicating that the Przewalski and domestic horse lineages remained connected by gene flow after they diverged about 45,000 years ago, we present evidence for Y chromosomal introgression of Przewalski horses into the gene pool of European domestic horses at least until medieval times.

The frequencies of Y chromosome haplotypes started to change during the Late Bronze Age (1600–900 BCE).
Inferred temporal trajectories of haplotype frequencies. Each haplotype is displayed by a different color. The shaded area represents the 95% highest-density region. The trajectories were constructed taking the median values across frequencies from the simulations of the Bayesian posterior sample. The small chart represents the stacked frequencies; the amplitude of each colored area is proportional to the median haplotype frequencies (normalized) at a given time. The x and y axes of the small chart match those in the large one. Ka, thousands of years.

Interesting excerpts:

The first record of the modern domestic Y chromosome haplotype stems from two Bronze Age samples of similar age. Notably, both samples were found in two distantly located regions: present-day Slovakia (2000–1600 BCE, dated by archaeological context) and western Siberia (14C-dated: 1609–1436 cal. BCE). Although a very recent study proposes an oriental origin of this haplotype (14), we cannot determine the geographical origin of Y-HT-1 with certainty, because this haplotype has not been found thus far in predomestic or wild stallions. There are two possible scenarios: (i) Y-HT-1 emerged within the domestic population by mutation and (ii) Y-HT-1 was already present in wild horses and entered the domestic population either at the beginning of domestication (but initially restricted to Asian horses) or later by introgression (from wild Y-HT-1 carrying studs during the Iron Age). Crosses between domestic animals and their wild counterparts have been observed in several domestic species (15–18); thus, the simplest explanation would be that we missed Y-HT-1 in older samples because of limited geographical sampling. However, the estimated haplotype age is contemporary (Fig. 4) with the assumed starting point of horse domestication ~4000–3500 BCE (19), rendering it likely that Y-HT-1 originated within the domestic horse gene pool. Still, we cannot rule out definitively that it appeared before domestication.

Independent of its geographical origin, Y-HT-1 progressively replaced all other haplotypes—except for one additional lineage that is restricted to Yakutian horses (11). Considering our data, this trend in paternal diversity toward dominance of the modern lineage appears to start in the Bronze Age and becomes even more pronounced during the Iron Age. The Bronze Age was a time of large-scale human migrations across Eurasia (20–22), movements that were undoubtedly facilitated by the spread of horses as a means of transport and warfare. At that time, the western Eurasian steppes were inhabited by highly mobile cultures that largely relied on horses (20, 21, 23, 24). The genetic admixture of northern and central European humans with Caucasians/eastern Europeans did correlate with the spread of the Yamnaya culture from the Pontic-Caspian steppe (25), an area that has repeatedly been suggested as the center of horse domestication (19, 26, 27). Given the importance of domestic horses, it appears that deliberate selection/rejection of certain stallions by these people might have contributed to the loss of paternal diversity. The spread of humans out of this region might also have resulted in the spread of Y-HT-1 from Asia to Europe. This scenario also agrees with recent findings that the low male diversity of extant horses is not caused by recruiting only a limited number of stallions during early domestication (13).
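To make "loss of paternal diversity" concrete, one common summary statistic is Nei's haplotype diversity, H = n/(n-1) * (1 - Σp²), where p is the frequency of each haplotype among n sampled stallions. A minimal sketch follows. The counts are invented for illustration and are not the paper's data, the haplotype labels beyond Y-HT-1 and Y-HT-2 are placeholders, and the paper itself relies on Bayesian frequency trajectories rather than this statistic.

```python
from typing import Dict


def haplotype_diversity(counts: Dict[str, int]) -> float:
    """Nei's unbiased haplotype diversity for one sample of stallions."""
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two sampled stallions")
    sum_squared_freqs = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_squared_freqs)


# Hypothetical haplotype counts for two time bins (NOT the paper's data).
early_bin = {"Y-HT-1": 2, "Y-HT-2": 3, "Y-HT-3": 3, "Y-HT-4": 2}
late_bin = {"Y-HT-1": 9, "Y-HT-3": 1}

early_h = haplotype_diversity(early_bin)
late_h = haplotype_diversity(late_bin)
print(f"early bin H = {early_h:.3f}, late bin H = {late_h:.3f}")
```

A value near 1 means almost every stallion carries a different lineage; a value near 0 means one lineage dominates, as Y-HT-1 does in present-day horses.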

Decline of paternal diversity began in Asia.
Maps displaying age, locality, and haplotype (different colors) of each successfully genotyped sample.

The presence of the Y chromosome haplotype carried by present-day Przewalski horses (Y-HT-2) in early domestic stallions and a European wild horse (Pie05; table S2) could be the result of introgression of Przewalski stallions. Although the original distribution of the Przewalski horse is unknown, it was probably much larger than that of the relict population in Mongolia that produced modern Przewalski horses and might even have extended into Central Europe. However, it is also possible that either Przewalski horses were among the initially domesticated horses or that Y-HT-2 occurred both in Przewalski horses and in those wild horses that are the ancestors of domestic horses, based on autosomal DNA data (30). Regardless of how Y-HT-2 entered the domestic gene pool, it was eventually lost, as were all haplotypes except Y-HT-1. In our sample set, Y-HT-2 was undetectable as early as the third time bin. However, it is possible that Y-HT-2 may have been present during this time period, but with a frequency below 0.11 (with 95% probability). The inferred time trajectories for Y-HT-2 frequencies suggest that it could nevertheless have persisted at very low frequencies until the Middle Ages (Fig. 3). On the basis of these simulations, this finding could be interpreted as a relic of this haplotype’s formerly higher frequency in the domestic horse gene pool. It is also possible that the presence of this haplotype could be the result of mating a wild stallion with a domestic mare, a frequently reported breeding practice when wild horses were still widely distributed. However, a significant contribution of the Przewalski horse to the gene pool of modern domestic horses has been almost ruled out by recent genomic studies (13, 31, 32).
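The statement that Y-HT-2 may have been present "with a frequency below 0.11 (with 95% probability)" despite not being detected can be illustrated with a simple binomial argument: if a haplotype has frequency f, the chance of missing it in n independent samples is (1 - f)^n. This is only a back-of-the-envelope sketch; the paper derives its bound from Bayesian simulations, the number of stallions in the third time bin is not given in the excerpt, and the function names are hypothetical.

```python
import math


def max_undetected_frequency(n_samples: int, confidence: float = 0.95) -> float:
    """Largest frequency f for which missing the haplotype in all
    n_samples still has probability (1 - confidence)."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_samples)


def samples_needed(frequency: float, confidence: float = 0.95) -> int:
    """Smallest n for which a haplotype at `frequency` is seen at least
    once with probability >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


bound_for_26 = max_undetected_frequency(26)
n_for_011 = samples_needed(0.11)
print(f"zero hits in 26 samples -> f < {bound_for_26:.3f} at 95%")
print(f"samples needed to exclude f >= 0.11 at 95%: {n_for_011}")
```

Under this simplified model, a bound of about 0.11 corresponds to a couple of dozen genotyped stallions with no Y-HT-2 carriers among them.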

Stallion lineages through time.
Temporal haplotype network of the four detected Y chromosome haplotypes. Age of the samples indicated by multiple layers separated by color; vertical lines connecting the haplotypes of consecutive layers/ages represent which haplotype was transferred into a later/younger period. Numbers constitute the respective number of individuals showing this particular haplotype for that period. Prz, Przewalski; Dom, domestic.


Ancient DNA upends the horse family tree

New paper, behind paywall, Ancient genomes revisit the ancestry of domestic and Przewalski’s horses, by Gaunitz et al., Science (2018)

Abstract:

The Eneolithic Botai culture of the Central Asian steppes provides the earliest archaeological evidence for horse husbandry, ~5,500 ya, but the exact nature of early horse domestication remains controversial. We generated 42 ancient horse genomes, including 20 from Botai. Compared to 46 published ancient and modern horse genomes, our data indicate that Przewalski’s horses are the feral descendants of horses herded at Botai and not truly wild horses. All domestic horses dated from ~4,000 ya to present only show ~2.7% of Botai-related ancestry. This indicates that a massive genomic turnover underpins the expansion of the horse stock that gave rise to modern domesticates, which coincides with large-scale human population expansions during the Early Bronze Age.

You can read more about it in the article Ancient DNA upends the horse family tree.

Excerpts, from the article (emphasis mine):

That none of the domesticates sampled in the last ~4,000 years descend from the horses first herded at Botai entails another major implication. It suggests that during the 3rd Mill BCE at the latest, another unrelated group of horses became the source of all domestic populations that expanded thereafter. This is compatible with two scenarios. First, Botai-type horses experienced massive introgression capture (22) from a population of wild horses until the Botai ancestry was almost completely replaced. Alternatively, horses were successfully domesticated in a second domestication center and incorporated minute amounts of Botai ancestry during their expansion. We cannot identify the locus of this hypothetic center due to a temporal gap in our dataset throughout the 3rd Mill BCE. However, that the DOM2 earliest member was excavated in Hungary adds Eastern Europe to other candidates already suggested, including the Pontic-Caspian steppe (2), Eastern Anatolia (23), Iberia (24), Western Iran and the Levant (25). Notwithstanding the process underlying the genomic turnover observed, the clustering of ~4,023-3,574 year-old specimens from Russia, Romania and Georgia within DOM2 suggests that this clade already expanded throughout the steppes and Europe at the transition between the 3rd and 2nd Mill BCE, in line with the demographic expansion at ~4,500 ya recovered in mitochondrial Bayesian Skylines (fig. S14).

Admixture graphs. (A to F) The six scenarios tested. Panel (A) received decisive Bayes Factor support, as indicated below each corresponding alternative scenario tested. Domestic-Ancient and Domestic-A/B refer to three phylogenetic clusters identified within DOM2 (excluding Duk2): ancient individuals; modern Mongolian, Yakutian (including Tumeski_CGG101397) and Jeju horses, and; all remaining modern breeds. (G) Posterior distributions of admixture proportions along p1 and p2 branches.

This study shows that the horses exploited by the Botai people later became the feral PH. Early domestication most likely followed the ‘prey pathway’ whereby a hunting relationship was intensified until reaching concern for future progeny through husbandry, exploitation of milk and harnessing (7). Other horses, however, were the main source of domestic stock over the last ~4,000 years or more. Ancient human genomics (26) has revealed considerable human migrations ~5,000 ya involving “Yamnaya” culture pastoralists of the Pontic-Caspian steppe. This expansion might be associated with the genomic turnover identified in horses, especially if Botai horses were best suited to localized pastoral activity than to long distance travel and warfare. Future work must focus on identifying the main source of the domestic horse stock and investigating how the multiple human cultures managed the available genetic variation to forge the many horse types known in history.

We are seeing that Bell Beakers were obviously horse riders, and that their horses must have derived from Yamna riders, so it is quite possible that their ancestral early Khvalynsk culture was the origin of domesticated horses, as proposed by David W. Anthony, although for the moment we only know “that [horse] domestication could have been a process with many phases, experiments, failures, and successes”.

EDIT (23 FEB 2018): My interpretation errors removed, thanks to the comments.
