Iron Age Tocharians of Yamnaya ancestry from Afanasevo show hg. R1b-M269 and Q1a1

New open access Ancient Genomes Reveal Yamnaya-Related Ancestry and a Potential Source of Indo-European Speakers in Iron Age Tianshan, by Ning et al. Current Biology (2019).

Interesting excerpts (emphasis mine, changes for clarity):

Here, we report the first genome-wide data of 10 ancient individuals from northeastern Xinjiang. They are dated to around 2,200 years ago and were found at the Iron Age Shirenzigou site. We find them to be already genetically admixed between Eastern and Western Eurasians. We also find that the majority of the East Eurasian ancestry in the Shirenzigou individuals is related to northeastern Asian populations, while the West Eurasian ancestry is best presented by ∼20% to 80% Yamnaya-like ancestry. Our data thus suggest a Western Eurasian steppe origin for at least part of the ancient Xinjiang population. Our findings furthermore support a Yamnaya-related origin for the now extinct Tocharian languages in the Tarim Basin, in southern Xinjiang.

Haplogroups

The dominant mtDNA lineages of the Shirenzigou people are commonly found in modern and ancient West Eurasian populations, such as U4, U5, and H, while they also have East Eurasian-specific haplogroups A, D4, and G3, preliminarily documenting admixed ancestry from eastern and western Eurasia.

The admixture profile is also shown on the paternal Y chromosome side that 4 out of 6 males in Shirenzigou (Figure S2) belong to the West Eurasian-specific haplogroup R1b (n = 2) and East Eurasian-specific haplogroup Q1a (n = 2), the former is predominant in ancient Yamnaya and nearly 100% in Afanasievo, different from the Middle and Late Bronze Age Steppe groups (Steppe_MLBA) such as Andronovo, [Potapovka], Srubnaya, and Sintashta whose Y chromosomal haplogroup is mainly R1a.

tocharians-y-dna-mtdna

Autosomal

We first carried out principal component analysis (PCA) to assess the genetic affinities of the ancient individuals qualitatively by projecting them onto present-day Eurasian variation (Figure 2). We observed a distinct separation between East and West Eurasians. Our ancient Shirenzigou samples and present-day populations from Central Asia and northwestern China form a genetic cline from East to West in the first PC. The distribution of Shirenzigou samples on the cline is relatively scattered with two major clusters, one being closer to modern-day Uygurs and Kazakhs and the other being closer to recently published ancient Saka and Huns from the Tianshan in Kazakhstan (…).

We applied a formal admixture test using f3 statistics in the form of f3 (Shirenzigou; X, Y) where X and Y are worldwide populations that might be the genetic sources for the Shirenzigou individuals. We observed the most significant signals of admixture in the Shirenzigou samples when using Yamnaya_Samara or Srubnaya as the West Eurasian source and some Northern Asians or Koreans as the East Eurasian source (Table S1). We also plotted the outgroup f3 statistics in the form of f3 (Mbuti; X, Anatolia_Neolithic) and f3 (Mbuti; X, Kostenki14) to visualize the allele sharing between population X and Anatolian farmers. As shown in Figure S3, the Steppe_MLBA populations including Srubnaya, Andronovo, and Sintashta were shifted toward farming populations compared with Yamnaya groups and the Shirenzigou samples. This observation is consistent with ADMIXTURE analysis that Steppe_MLBA populations have an Anatolian and European farmer-related component that Yamnaya groups and the Shirenzigou individuals do not seem to have. The analysis consistently suggested Yamnaya-related Steppe populations were the better source in modeling the West Eurasian ancestry in Shirenzigou.
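For readers unfamiliar with the statistic, here is a minimal sketch of what f3 (Target; X, Y) measures, assuming plain population allele frequencies are available. The function name and the toy frequencies are hypothetical, and the real analysis (e.g. in ADMIXTOOLS) also applies a finite-sample bias correction and a block jackknife for significance, both omitted here.

```python
def f3(target, source_x, source_y):
    """Naive f3(target; X, Y): mean over SNPs of (c - x) * (c - y).

    A clearly negative value suggests that the target is admixed between
    populations related to X and Y. No bias correction is applied.
    """
    if not (len(target) == len(source_x) == len(source_y)):
        raise ValueError("all frequency lists must have the same length")
    products = [(c - x) * (c - y)
                for c, x, y in zip(target, source_x, source_y)]
    return sum(products) / len(products)


# Hypothetical allele frequencies at four SNPs (not real data).
west = [0.9, 0.1, 0.8, 0.2]
east = [0.1, 0.9, 0.3, 0.7]
# A toy population that is an even mixture of the two sources.
mixed = [0.5 * w + 0.5 * e for w, e in zip(west, east)]

print(f3(mixed, west, east))  # negative: consistent with admixture
print(f3(west, mixed, east))  # positive: no admixture signal for 'west'
```

The outgroup form f3 (Mbuti; X, Y) uses the same product with the outgroup in the first position; there, larger values mean more drift shared between X and Y.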

PCA and ADMIXTURE for Shirenzigou Samples. Modified from the original to include in black squares samples related to Yamnaya.

Genetic Composition of Iron Age Shirenzigou Individuals

We continued to use qpAdm to estimate the admixture proportions in the Shirenzigou samples by using different pairs of source populations, such as Yamnaya_Samara, Afanasievo, Srubnaya, Andronovo, BMAC culture (Bustan_BA and Sappali_Tepe_BA) and Tianshan_Hun as the West Eurasian source and Han, Ulchi, Hezhen, Shamanka_EN as the East Eurasian source. In all cases, Yamnaya, Afanasievo, or Tianshan_Hun always provide the best model fit for the Shirenzigou individuals, while Srubnaya, Andronovo, Bustan_BA and Sappali_Tepe_BA only work in some cases. The Yamnaya_Samara or Afanasievo-related ancestry ranges from ∼20% to 80% in different Shirenzigou individuals, consistent with the scattered distribution on the East-West cline in the PCA
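To give an intuition for what an admixture proportion means, here is a deliberately simplified sketch. It is not qpAdm, which fits f4 statistics against a set of outgroups; it is an ordinary least-squares fit of a two-way mixture on allele frequencies. All names and numbers are hypothetical.

```python
def mixture_fraction(target, source_a, source_b):
    """Least-squares estimate of alpha in: target = alpha*A + (1 - alpha)*B.

    This is a simplified stand-in for illustration only, not the qpAdm method.
    """
    numerator = sum((t - b) * (a - b)
                    for t, a, b in zip(target, source_a, source_b))
    denominator = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if denominator == 0:
        raise ValueError("the two sources are identical at every SNP")
    return numerator / denominator


# Hypothetical allele frequencies (not real data).
west = [0.9, 0.1, 0.8, 0.2]
east = [0.1, 0.9, 0.3, 0.7]
# Toy target built as 30% 'west' and 70% 'east'.
target = [0.3 * w + 0.7 * e for w, e in zip(west, east)]

print(round(mixture_fraction(target, west, east), 3))
```

With real data the estimate would be noisy and could fall outside 0 to 1, which is one reason formal tools report standard errors and model-fit p-values.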

ancestry-tocharians

(…) we then modeled Shirenzigou as a three-way admixture of Yamnaya_Samara, Ulchi (or Hezhen) and Han to infer the source from the East Eurasia side that contributed to Shirenzigou. We found the Ulchi or Hezhen and Han-related ancestry had a complicated and uneven distribution in the Shirenzigou samples. Most Shirenzigou individuals derived the majority of their East Eurasian ancestry from Ulchi or Hezhen-related populations, while the following two individuals, M820 and M15-2, have more Han-related than Ulchi/Hezhen-related ancestry.

One important question remains, though: how and when did these Proto-Tocharian speakers migrate from the Afanasevo culture in the Altai into the Tarim Basin? The traditional answer, now more likely than ever, is through the Chemurchek culture. See e.g. A re-analysis of the Qiemu’erqieke (Shamirshak) cemeteries, Xinjiang, China, by Jia and Betts JIES (2010) 38(4).

Also, given the apparent lack of Corded Ware ancestry (that is, of the extra farmer ancestry that characterizes it), if the results were already suspicious before, how likely are the published R1a(xZ93) results and/or radiocarbon dates of the Xiaohe mummies from Li et al. (2010, 2015) now? After all, at such a late date one would have expected generalized admixture with neighbouring Srubna/Andronovo-like populations.


More Hungarian Conquerors of hg. N1c-Z1936, and the expansion of ‘Altaic-Uralic’ N1c

Open access Y-chromosomal connection between Hungarians and geographically distant populations of the Ural Mountain region and West Siberia, by Post et al. Scientific Reports (2019) 9:7786.

Hungarian Conquerors

More interesting than the paper’s study of modern populations is the following excerpt from the introduction, which refers to a paper that is likely in preparation, Európai És Ázsiai Apai Genetikai Vonalak A Honfoglaló Magyar Törzsekben (“European and Asian paternal genetic lineages in the conquering Hungarian tribes”), by Fóthi, E., Fehér, T., Fóthi, Á. & Keyser, C., Avicenna Institute of Middle Eastern Studies (2019):

Certain chr-Y lineages from haplogroup (hg) N have been proposed to be associated with the spread of Uralic languages. So far, hg N3 has not been reported for Indo-European speaking populations in Central Europe, but it is present among Hungarians, although the proportion of hg N in the paternal gene pool of present-day Hungarians is only marginal (up to 4%) compared to other Uralic speaking populations. It has been shown earlier that one of the sub-clades of hg N – N3a4-Z1936 – could be a potential link between two Ugric speaking populations: the Hungarians and the Mansi. It is also notable that some ancient Hungarian samples from the 9th and 10th century Carpathian Basin belonged to this hg N sub-clade: Three Z1936 samples were found in the Upper-Tisza area (Karos II, Bodrogszerdahely/Streda nad Bodrogom) and two in the Middle-Tisza basin cemeteries (Nagykörű and Tiszakécske). The haplotype of the Nagykörű sample is identical with one contemporary Hungarian sample from Transylvania that tested positive for B545 marker downstream of N3a4-Z1936. Similar findings come from the maternal gene pool of historical Hungarians: the analyses of early medieval aDNA samples from Karos-Eperjesszög cemeteries revealed the presence of mtDNA hgs of East Asian provenance.

A commenter recently wrote that in a study by Fehér (probably this one) two Hungarian conquerors, from Ormenykut and Tuzser, will be of hg. N1c-2110. Assuming no other lineages will appear, this would leave the proportion of N1c-L392 vs. R1a-Z280/Z93 closer to the reported proportion of hg. N vs. R1a (5 vs. 2) among Sargat samples, and is thus compatible with a direct migration of Hungarians from around the Urals.

However, the sampling of Iron Age populations around the Urals is scarce, and we don’t know what other lineages these Magyars will show. Still, based on the known variability of the published ones, and on the ca. 50-60 early Magyar males from previous studies from which Y-chromosome haplogroups could be obtained, I would say these reported N1c lineages are just a tiny proportion of what’s to come…

“Altaic-Uralic” N1c

Phylogenetic tree of hg N3a4. Phylogenetic tree of 33 high coverage Y-chromosomes from haplogroup N3a4 was reconstructed with BEAST v.1.7.5 software package.

Archaeogenetic studies based on mtDNA haplotypes have shown that ancient Hungarians were relatively close to contemporary Bashkirs who are a Turkic speaking population residing in the Volga-Ural region. Another study reported excessive identical-by-descent (IBD) genomic segments shared between the Ob-Ugric speaking Khantys and Bashkirs but a moderate IBD sharing between Turkic speaking Tatars and their neighbours including Bashkirs.

Phylogenetic tree of hg N3a4 has two main sub-clades defined by markers B535 and B539 that diverged around 4.9 kya (95% confidence interval [CI] = 3.7–6.3 kya). Inner sub-clades of N3a4-B539 (defined by markers B540 and B545) split 4.2 kya (95% CI = 3.0–5.6 kya). (…) The phylogenetic tree reveals that all five Hungarian samples belong to N3a4-B539 sub-clade that they share with Ob-Ugric speaking Khanty and Mansi, and Turkic speaking Bashkirs and Tatars from the Volga-Ural region. Hungarian and Bashkir chrY lineages belong to both sub-clades of N3a4-B539.

Modern distribution of the “Ugric N1c”

To test the presence and proportions of hg N3a4 lineages in a more comprehensive sample set and with a higher phylogenetic resolution level compared to earlier studies, we analysed the genotyping data of about 5000 Eurasian individuals, including West Siberian Mansi and Khanty who are linguistically closest to Hungarians

Map of the entire hg N3a4.

There is a clear difference in geographic distribution patterns of these two hg N3a4 sub-clades. Hg N3a4-B535 (Fig. 3b) is common mostly among Finnic (Finns, Karelians, Vepsas, Estonians) and Saami speaking populations in North eastern Europe. The highest frequency is detected in Finns (~44%) but it also reaches up to 32% in Vepsas and around 20% in Karelians, Saamis and North Russians. The latter are known to have changed their language or to be an admixed population with reported similar genetic composition to their Finnic speaking neighbors. The frequency of N3a4-B535 rapidly decreases towards south to around 5% in Estonians, being almost absent in Latvians (1%) and not found among Lithuanians. Towards east its frequency is from 1–9% among Eastern European Russians and populations of the Volga-Ural region such as Komis, Mordvins and Chuvashes (…)

Map of N3a4 subclades defined by B535.

Hg N3a4-B539, on the other hand, is prevalent among Turkic speaking Bashkirs and also found in Tatars but is entirely missing from other populations of the Volga-Ural region such as Uralic speaking Udmurts, Maris, Komis and Mordvins, and in Northeast Europe, where instead N3a4-B535 lineages are frequent. Besides Bashkirs and Tatars in Volga-Ural region, N3a4-B539 is substantially represented in West Siberia among Ugric speaking Mansis and Khantys. Among Hungarians, however, N3a4-B539 has a subtle frequency of 1–4%.

Map of N3a4 subclades defined by B539, with a local snapshot showing the N3a4-B539 distribution among Hungarian speakers.

The battle to appropriate N1c-L392

So, basically, the team of Kristiina Tambets is arguing that N1c-VL29 expanded Finnic to the East Baltic (hence from a common Finno-Mordvinic dialect splitting ca. 600 BC on?) because, you know, apparently the agreed separation of known Uralic dialects from ca. 2000 BC, and their Bronze Age presence around the Baltic, is not valid when you follow haplogroups instead of languages or archaeology.

But now this other group of Tambets (co-author of this paper) considers that hg. N1c-Z1936 – which is probably behind the N1c-L392 samples from Lovozero Ware in the Kola Peninsula – represents either the True Uralic-speaking Palaeo-Arctic peoples, or else merely Ugric-speaking peoples who happened to expand to Fennoscandia but left no trace of their language…

To accept this identification you only have to NOT ask why:

  • N1c is first found in ancient cultures close to Lake Baikal.
  • N1c-L392 appears in ancient East Asian populations speaking completely different languages, with Altaic and Uralic being just some among many Palaeo-Siberian populations where the haplogroup will pop up.
  • Turkic populations like Bashkirs and Tatars (who expanded to the Volga through the southern Urals before the expansion of Hungarians) show a shared distribution of the B539 haplotype with Hungarians.
  • The phylogenetic tree and areas of N1c-L392 expansions don’t make any sense in light of the known linguistic and cultural expansions of Uralic-speaking peoples.

In fact, the Hungarian research group of Neparáczki – publishing the recent paper on Hungarian Conquerors – was apparently looking for a connection with Turkic peoples to support some traditional Turanian myths, and they found it in some scattered R1a-Z93 samples which supposedly connect Hungarian Conquerors to Huns (?), instead of looking for this closer link through N1c-Z1936 (especially haplotype B539)…

Also, is it me or are there two opposed trends with completely different interpretations among researchers publishing papers about hg. N1c: one systematically arguing for Altaic origins, and another for Uralic ones?

If somebody sees some complex reasoning behind the discussions in all these recent papers, beyond the simplest “let’s follow N for Uralic/Altaic”, feel free to comment below. That way I can understand what I might be doing wrong in assessing Neolithic and Bronze Age migrations in linguistics and archaeology with the help of ancient haplogroups coupled with ancestral components, and what these researchers are doing right by playing with obsessive ideas born out of the 2000s coupled with phylogenetic trees and maps of modern haplogroup distributions…

This is probably going to be this blog’s most used image in 2019:

horse-meme-steppe-ancestry


Mongolian tribes cluster with East Asians, closely related to the Japanese

mongolian-sampling

New paper (behind paywall) Whole-genome sequencing of 175 Mongolians uncovers population-specific genetic architecture and gene flow throughout North and East Asia, by Bai et al., Nature Genetics (2018).

Interesting excerpts (emphasis mine):

Genome sequencing, variant calling, and construction of the Mongolian reference panel. We collected peripheral blood with informed consent from 175 Mongolian individuals representing six distinct tribes/regions in northern China and Mongolia, including the Abaga, Khalkha, Oirat, Buryat, Sonid, and Horchin tribes.

Population genetic structure. a, PCA of Mongolian individuals and 1000G samples. Mongolians fill a large, less characterized gap between Admixed/Native Americans and other East Asians in the 1000G project. b, PCA of Mongolians and East Asians of 1000G. The abbreviations of EAS populations were used from reference 11.

The fixation index (FST) was used to estimate pairwise genetic differentiation among our Mongolian samples and 26 modern human populations selected from 1000G (…) the Mongolian tribes cluster with East Asian groups. The Mongolian populations show the smallest differentiation from the CHB, and FST values increase relative to the magnitude of geographical separation. The Buryat are the most differentiated tribe compared with other East Asians (1.82–2.97%), while the Horchin are the least (0.25–1.35%). All tribes are closer to the Japanese (JPT) than the CHS with the exception of the Horchin. Among the tribes, the Abaga, Khalkha, Oirat, and Sonid show the least differentiation from one another (FST < 0.15%)
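As a rough illustration of what a pairwise FST expresses, here is a minimal sketch based on expected heterozygosities, FST = (H_T - H_S) / H_T, summed over SNPs. This is only one of several estimators and not necessarily the one used in the paper; the function name and the toy frequencies are hypothetical.

```python
def fst_from_frequencies(pop_a, pop_b):
    """Pairwise FST as sum(H_T - H_S) / sum(H_T) over biallelic SNPs.

    pop_a and pop_b are lists of allele frequencies for the same SNPs.
    No correction for sample size is applied.
    """
    if len(pop_a) != len(pop_b):
        raise ValueError("frequency lists must have the same length")
    between = 0.0  # accumulated H_T - H_S
    total = 0.0    # accumulated H_T
    for p_a, p_b in zip(pop_a, pop_b):
        p_mean = (p_a + p_b) / 2
        h_total = 2 * p_mean * (1 - p_mean)
        h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
        between += h_total - h_within
        total += h_total
    if total == 0:
        return 0.0  # no variation at any SNP
    return between / total


# Hypothetical frequencies (not real data).
print(fst_from_frequencies([0.6, 0.3], [0.4, 0.5]))  # small differentiation
print(fst_from_frequencies([1.0, 0.0], [0.0, 1.0]))  # fixed differences
```

The paper reports the values as percentages, so a result of 0.0025 here would correspond to 0.25% in the excerpt.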

A PCA places the Mongolians in close genetic proximity to a group of North Asian Siberians, including Altaians, Tuvinians, Evenki, and Yakut, indicating that the Mongolian whole-genome variation panel could be a better proxy for these groups than any populations currently in the 1000G panel

The most common Y-chromosome haplogroups are from the C3 sublineage (41.67%), including C3c (29.17%) and C3b (12.50%), followed by haplogroup O (23.61%), and haplogroup N (18.06%) (…) While haplogroups C and O are primarily restricted to Asia, haplogroup N is present at high frequency in Finns (60.5%), at low frequency in non-Mongolian East Asians (< 1%), and virtually absent throughout the remainder of European and African samples in 1000G

Comparison with Finns

Distribution of D-values from D-test under the model of [EAS, Mongolians, X, chimpanzee], where X represents the test population and chimpanzee serves as an outgroup. The positive D-value (Z > 3) indicates that the test population (X) is closer to Mongolians than to EAS. The whiskers correspond to range, and the dots to individual data points, box limits are the upper and lower quartiles. The n in each boxplot is 30. All abbreviations of populations in the figure were used from reference 11.

Of the populations included in our study, Mongolians share the second-highest level of IBD with the Finnish people (FIN), behind only Northern Han Chinese (CHB). While Mongolians share more IBD with Europeans (EUR) as a whole compared with other non-EAS people (Fig. 4b), removal of Finns from the Europeans drops the level of sharing to as low as that with South Asians (SAS) or Admixed American (AMR).

There is considerable geographic separation between modern-day Mongolians and Europe. The positive D-statistic that reveal gene flow between Mongolians and Europeans (Fig. 4c), and the high degree of IBD sharing with Finnish people in particular suggest that complex admixture may have occurred throughout northeastern Europe and Siberia. To see whether Mongolians represent the ethnic group in East Asia with the highest level of gene flow with Finnish people, we calculated a D-statistic for each set of populations [Mongolians, X, FIN, Yoruba (YRI)], where X represents a population from Siberia or Northern Canada. Most of the populations reveal an imbalance in allele frequencies that suggests gene flow with Finns (D >0, Z >3), but the greatest imbalance is observed between Siberians/Northern Canadians and Finnish, rather than between Mongolians and Finns. This pattern indicates that northern Asian populations interacted across large geographic ranges.
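The D-statistic used here can be sketched from allele frequencies as follows, assuming the common ABBA-BABA formulation for a quartet [P1, P2, P3, Outgroup]. Sign conventions differ between tools and papers, so the sign below follows this sketch's own convention only. Names and numbers are hypothetical, and the Z-scores quoted in the excerpt come from a block jackknife that is not shown.

```python
def d_statistic(p1, p2, p3, p4):
    """Frequency-based D for the quartet [P1, P2, P3, P4 (outgroup)].

    Convention used in this sketch: D > 0 means P1 shares more derived
    alleles with P3 than P2 does; D < 0 means the opposite.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in zip(p1, p2, p3, p4):
        numerator += (a - b) * (c - d)
        denominator += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    if denominator == 0:
        raise ValueError("no informative SNPs")
    return numerator / denominator


# Hypothetical derived-allele frequencies (not real data).
pop1 = [0.8, 0.7]
pop2 = [0.2, 0.3]
pop3 = [0.9, 0.6]
outgroup = [0.0, 0.0]

print(d_statistic(pop1, pop2, pop3, outgroup))  # positive under this convention
print(d_statistic(pop2, pop1, pop3, outgroup))  # same magnitude, opposite sign
```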

6 migration events, from the supplementary materials.

I guess the 1000G panel does not include northern Eurasian groups; otherwise the IBD map and values would be lighting up with Palaeo-Siberian peoples.


The Iron Age expansion of Southern Siberian groups and ancestry with Scythians

iron_age-sarmatians

Maternal genetic features of the Iron Age Tagar population from Southern Siberia (1st millennium BC), by Pilipenko et al. (2018).

Interesting excerpts (emphasis mine):

The positions of non-Tagar Iron Age groups in the MDS plot were correlated with their geographic position within the Eurasian steppe belt and with frequencies of Western and Eastern Eurasian mtDNA lineages in their gene pools. Series from chronological Tagar stages (similar to the overall Tagar series) were located within the genetic variability (in terms of mtDNA) of Scythian World nomadic groups (Figs 5 and 6; S4 and S6 Tables). Specifically, the Early Tagar series was more similar to western nomads (North Pontic Scythians), while the Middle Tagar was more similar to the Southern Siberian populations of the Scythian period. The Late Tagar group (Tes’ culture) belonging to the Early Xiongnu period had the “western-most” location on the MDS plot with the maximal genetic difference from Xiongnu and other eastern nomadic groups (but see Discussion concerning the low sample size for the Tes’ series).

In a comparison of our Tagar series with modern populations in Eurasia, we detected similarity between the Tagar group and some modern Turkic-speaking populations (with the exception of the Indo-Iranian Tajik population) (Fig 7; S2 Table). Among the modern Turkic-speaking groups, populations from the western part of the Eurasian steppe belt, such as Bashkirs from the Volga-Ural region and Siberian Tatars from the West Siberian forest-steppe zone, were more similar to the Tagar group than modern Turkic-speaking populations of the Altay-Sayan mountain system (including the Khakassians from the Minusinsk basin) (Fig 7).

Location of Tagar archaeological sites from which samples for this study were obtained. Burial grounds: 1—Novaya Chernaya-1; 2—Podgornoe Ozero, Barsuchiha-1, Barsuchiha-6, Barsuchiha-7; 3—Perevozinskiy; 4—Ulug-Kyuzyur, Kichik-Kyuzyur, Sovetskaya Khakassiya; 5—Tepsey-3, Tepsey-8, Tepsey-9; 6—Dolgiy Kurgan. https://doi.org/10.1371/journal.pone.0204062.g001

Mitochondrial DNA diversity and genetic relationships of the Tagar population

Our results are not inconsistent with the assumption of a probable role of gene flow due to the migration from Western Eurasia to the Minusinsk basin in the Bronze Age in the formation of the genetic composition of the Tagar population. Particularly, we detected many mtDNA lineages/clusters with probable West Eurasian origin that were dominant in modern populations of different parts of Europe, Caucasus, and the Near East (such as K and HV6) in our Tagar series based on a phylogeographic analysis.

We detected relatively low genetic distances between our Tagar population and two Bronze Age populations from the Minusinsk basin—the Okunevo culture population (pre-Andronovo Bronze Age) and Andronovo culture population, followed by Afanasievo population from the Minusinsk Basin and Middle Bronze Age population from the Mongolian Altai Mountains (the region adjacent to the Minusinsk basin) (Figs 3 and 6; S3 and S5 Tables). Among West Eurasian part of our Tagar series we also observed haplogroups/sub-haplogroups and haplotypes shared with Early and Middle Bronze Age populations from Minusinsk Basin and western part of Eurasian steppe belt (Fig 4; S5 Table). Thus, our results suggested a potentially significant role of the genetic components, introduced by migrants from Western Eurasia during the Bronze Age, in the formation of the genetic composition of the Tagar population. It is necessary to note the relatively small size of available mtDNA samples from the Bronze Age populations of Minusinsk basin; accordingly, additional mtDNA data for these populations are required to further confirm our inference.

Phylogenetic tree of mtDNA lineages from the Tagar population. Color coding of the Tagar stages: orange—the Early Tagar stage; blue—the Middle Tagar Stage; green—the Late Tagar stage. Color of haplogroup labels: yellow—for Western Eurasian haplogroups; red—for Eastern Eurasian haplogroups. https://doi.org/10.1371/journal.pone.0204062.g002

Another substantial part of the mtDNA pool of the Tagar and other eastern populations of the Scythian World is typical of populations in Southern Siberia and adjacent regions of Central Asia (autochthonous Central Asian mtDNA clusters). Most of these components belong to the East Eurasian cluster of mtDNA haplogroups. Moreover, the role of each of these components in the formation of the genetic composition of subsequent (to the present) populations in South Siberia and Central Asia could be very different. In this regard, cluster C4a2a (and its subcluster C4a2a1), and haplogroup A8 are of particular interest.

Genetic features of successive Tagar groups

We compared successive Tagar groups (Early, Middle, and Late Tagar) with each other and with other Iron Age nomadic populations to evaluate changes in the mtDNA pool structure. Despite the genetic similarity between the Early and Middle Tagar series and Scythian World nomadic groups (Figs 5 and 6; S4 and S6 Tables), there were some peculiarities. For example, the Early Tagar series was more similar to North Pontic Classic Scythians, while the Middle Tagar samples were more similar to the Southern Siberian populations of the Scythian period (i.e., completely synchronous populations of regions neighboring the Minusinsk basin, such as the Pazyryk population from the Altay Mountains and Aldy-Bel population from Tuva).

We observed differences in the mtDNA pool structure between the Early and the Middle chronological stages of the Tagar culture population, as evidenced by the change in the ratio of Western to Eastern Eurasian mtDNA components. The contribution of Eastern Eurasian lineages increased from about one-third (34.8%) in the Early Tagar group to almost one-half (45.8%) in the Middle Tagar group.

Results of multidimensional scaling based on matrix of Slatkin population differentiation (FST) according to frequencies of mtDNA haplogroup in Tagar populations and modern populations of Eurasia. Populations: Tagar (red pentagon) (this study); Mongolian-speaking populations: Khamnigans (Buryat Republic, Russia) [43]; Barghuts (Inner Mongolia, China) [44]; Buryats (Buryat Republic, Southern Siberia, Russia) [43]; Mongols (Mongolia) [45]. Turkic-speaking populations: Tuvinians (Tuva Republic, Russia) [43]; Tofalars (Irkutsk region, Russia) [46]; Altai-Kizhi (Altai Republic, Russia) [43, 47]; Telenghits (Altai Republic, Russia) [43, 47]; Tubalars (Altai Republic) [48]; Shors (Kemerovo region, Russia) [43, 47]; Khakassians (Khakassian Republic, Russia) [43, 46]; Altaian Kazakhs (Altai Republic) [49]; Kazakhs (Kazakhstan, Uzbekistan) [50, 51]; Kirghiz (Kyrgyzstan) [50, 51]; Uighurs (Kazakhstan and Xinjiang) [50, 52]; Siberian Tatars (Tyumen and Omsk regions, Russia) [53]; Tatars (Volga-Ural region, Russia) [54]; Bashkirs (Volga-Ural region, Russia) [55]; Uzbeks (Uzbekistan) [51, 56]; Turkmens (Turkmenistan) [51, 56]; Nogays [57]; Turks [58]; other populations: Evenks [43, 46]; Ulchi [59]; Koreans (South Korea) [43]; Han Chinese [60]; Zhuang (Guangxi, China) [61]; Tadjiks (Tadjikistan) [43, 51]; Iranians [60]; Russians [62]. https://doi.org/10.1371/journal.pone.0204062.g007

At the level of mtDNA haplogroups, we detected a decrease in the diversity of phylogenetic clusters during the transition from the Early Tagar to the Middle Tagar. This decline in diversity equally affected the West Eurasian and East Eurasian components of the Tagar mtDNA pool. It should be noted that this decrease can be partially explained by the smaller number of Middle Tagar than Early Tagar samples. Under a simple binomial approximation the mtDNA clusters, observed at frequencies of 6.3% and 11.7%, could be lost by chance in our Early (N = 46) and Middle (N = 24) Tagar samples, respectively. However, the simultaneous lack of several such clusters, with a total frequency in the gene pool of the Early group of 34.8%, is unlikely.
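The binomial argument can be checked with a couple of lines. The sketch below assumes that “lost by chance” means a 5% probability of drawing zero carriers of a cluster in a sample of size N; the excerpt does not state that cutoff, but it reproduces the quoted 6.3% and 11.7%. Function names are hypothetical.

```python
def prob_cluster_missed(frequency, sample_size):
    """Probability that a cluster at the given population frequency
    is absent from a random sample of the given size: (1 - f) ** N."""
    return (1.0 - frequency) ** sample_size


def max_frequency_missed(sample_size, alpha=0.05):
    """Highest frequency that can still go unsampled with probability
    alpha (assumed cutoff): solves (1 - f) ** N = alpha for f."""
    return 1.0 - alpha ** (1.0 / sample_size)


print(round(prob_cluster_missed(0.063, 46), 3))  # close to 0.05
print(round(prob_cluster_missed(0.117, 24), 3))  # close to 0.05
print(round(max_frequency_missed(46), 3))        # close to 0.063
print(round(max_frequency_missed(24), 3))        # close to 0.117
```

Under the same assumption, any single cluster at a frequency above roughly 6% in the Early group should have turned up among 46 samples about 95% of the time, which is why the joint absence of clusters summing to 34.8% is hard to attribute to sampling alone.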

The observed reduction in the genetic distance between the Middle Tagar population and other Scythian-like populations of Southern Siberia (Fig 5; S4 Table), in our opinion, is primarily associated with an increase in the role of East Eurasian mtDNA lineages in the gene pool (up to nearly half of the gene pool) and a substantial increase in the joint frequency of haplogroups C and D (from 8.7% in the Early Tagar series to 37.5% in the Middle Tagar series). These features are characteristic of many ancient and modern populations of Southern Siberia and adjacent regions of Central Asia, including the Pazyryk population of the Altai Mountains. We did not obtain strong evidence for an intensification of genetic contact between the population of the Minusinsk basin and the Altai Mountains in the Middle Tagar period compared with the Early Tagar period. Although, several archaeologists have found evidence for the intensification of contact at the level of material culture, namely, a cultural influence of the population of the Altai Mountains (represented by the Pazyryk population) on the population of the Minusinsk basin (the Saragash Tagar group) [6, 71, 72].

Another important issue is the change in the genetic structure of the Tagar population during the transition from the Middle (Saragash) to the Late (Tes’) stage. The Late Tagar stage refers to the Xiongnu period. Many archaeologists suggest that the formation of the Tes’ stage involved the direct cultural influence of the Xiongnu and/or related groups of nomads from more eastern regions of Central Asia [71, 73]. Some archaeologists have even suggested renaming the Tes’ stage in the Tes’ culture [71], emphasizing the role of new eastern cultural elements. If this influence also existed at the genetic level, then we would expect to observe new genetic elements in the Tes’ gene pool, particularly those of East Eurasian origin.

Siberian ancestry

Just a reminder of the recent session at ISBA 8 on expanding Scythians (and also Mongolians and Turks) spreading Siberian ancestry, which is usually (wrongly) identified as “Uralic-Yeniseian” based on modern populations (similar to how steppe ancestry is wrongly identified as “Indo-European”). See the following graphic, which includes the Tagar population:

Very important observation with implication of population turnover is that pre-Turkic Inner Eurasian populations’ Siberian ancestry appears predominantly “Uralic-Yeniseian” in contrast to later dominance of “Tungusic-Mongolic” sort (which does sporadically occur earlier). Alexander M. Kim

And also the poster by Alexander M. Kim et al. Yeniseian hypotheses in light of genome-wide ancient DNA from historical Siberia:

The relevance of ancient DNA data to debates in historical linguistics is an emphatic strand in much recent work on the archaeogenetics of Eurasia, where the discussion has focused heavily on Indo-European (Haak et al. 2015; Narasimhan et al. 2018; de Barros Damgaard et al. 2018a,b). We present new genome-wide ancient DNA data from a historical Siberian individual in relation to Yeniseian, an isolated language “microfamily” (Vajda 2014) that nonetheless sits at the center of numerous controversial proposals in historical linguistics and cultural interaction. Yeniseian’s sole surviving representative is Ket, a critically endangered language fluently spoken by only a few dozen individuals near the Middle Yenisei River of Central Siberia.

In strong contrast to the present-day picture, river names and argued substrate influences and loanwords in languages outside the current range of Yeniseian, as well as direct records from the Russian colonial period, indicate that speakers of extinct Yeniseian languages had a formerly much broader presence in the taiga of Central Siberia as well as further south in the mountainous Altai-Sayan region – and perhaps even further afield in Inner Asia (Vajda 2010; Gorbachov 2017; Blažek 2016). The consilience of these proposals with genetic data is not straightforward (Flegontov et al. 2015, 2017) and faces a major obstacle in the lack of genetic information from verifiable speakers of Yeniseian languages other than the Kets, who have had complex ongoing interactions with speakers of non-Yeniseian languages such as the Samoyedic Selkups. We attempt to remedy this with new historical Siberian aDNA data, orienting our search for common denominators and systematic difference in a broader landscape of concordance, discordance, and uncertainty at the interface of diachronic linguistics and genetics.

Related

Mitogenomes from Avar nomadic elite show Inner Asian origin

ring-pommel-swords

Inner Asian maternal genetic origin of the Avar period nomadic elite in the 7th century AD Carpathian Basin, by Csáky et al. bioRxiv (2018).

Abstract (emphasis mine):

After 568 AD the nomadic Avars settled in the Carpathian Basin and founded their empire, which was an important force in Central Europe until the beginning of the 9th century AD. The Avar elite was probably of Inner Asian origin; its identification with the Rourans (who ruled the region of today’s Mongolia and North China in the 4th-6th centuries AD) is widely accepted in the historical research.

Here, we study the whole mitochondrial genomes of twenty-three 7th century and two 8th century AD individuals from a well-characterised Avar elite group of burials excavated in Hungary. Most of them were buried with high value prestige artefacts and their skulls showed Mongoloid morphological traits.

The majority (64%) of the studied samples’ mitochondrial DNA variability belongs to Asian haplogroups (C, D, F, M, R, Y and Z). This Avar elite group shows affinities to several ancient and modern Inner Asian populations.

The genetic results verify the historical thesis on the Inner Asian origin of the Avar elite, as not only a military retinue consisting of armed men, but an endogamous group of families migrated. This correlates well with records on historical nomadic societies where maternal lineages were as important as paternal descent.

mds-ancient-avar-elite
MDS with 23 ancient populations. The Multidimensional Scaling plot is based on linearised Slatkin FST values that were calculated based on whole mitochondrial sequences (stress value is 0.1581). The MDS plot shows the connection of the Avars (AVAR) to the Central-Asian populations of the Late Iron Age (C-ASIA_LIAge) and Medieval period (C-ASIA_Medieval) along coordinate 1 and coordinate 2, which is caused by non-significant genetic distances between these populations. The European ancient populations are situated on the left part of the plot, where the Iberian (IB_EBRAge), Central-European (C-EU_BRAge) and British (BRIT_BRAge) populations from Early Bronze Age and Bronze Age are clustered along coordinate 2, while the Neolithic populations from Germany (GER_Neo), Hungary (HUN_Neo), Near-East (TUR_ _Neo) and Baltic region (BALT_Neo) are located on the skirt of the plot along coordinate 1. The linearised Slatkin FST values, abbreviations and references are presented in Table S4.
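The two quantities named in the caption can be illustrated with a minimal sketch. It assumes that "linearised Slatkin FST" means the usual transformation FST / (1 - FST), and that the reported stress value is Kruskal's stress-1; the paper excerpt does not say which stress formula was used. The population labels, FST values and coordinates below are invented for illustration only.

```python
import math


def linearised_fst(fst):
    """Slatkin's linearisation of a pairwise FST value: FST / (1 - FST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("FST must lie in [0, 1)")
    return fst / (1.0 - fst)


def stress1(dissimilarities, coords):
    """Kruskal stress-1 of a low-dimensional configuration.

    dissimilarities: square matrix of target distances (here linearised FST).
    coords: one coordinate tuple per population, in the same order.
    """
    numerator = 0.0
    denominator = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            numerator += (d - dissimilarities[i][j]) ** 2
            denominator += d ** 2
    return math.sqrt(numerator / denominator)


# Hypothetical pairwise FST values for three made-up populations.
fst = [
    [0.00, 0.02, 0.10],
    [0.02, 0.00, 0.08],
    [0.10, 0.08, 0.00],
]
dissim = [[linearised_fst(v) for v in row] for row in fst]

# A hypothetical one-dimensional layout (second coordinate fixed at zero).
coords = [(0.0, 0.0), (0.02, 0.0), (0.11, 0.0)]
print(round(stress1(dissim, coords), 4))
```

A stress close to zero means the plotted distances reproduce the genetic distances well; larger values mean the two-dimensional plot distorts them more.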

Interesting excerpts:

The mitochondrial genome sequences can be assigned to a wide range of the Eurasian haplogroups with dominance of the Asian lineages, which represent 64% of the variability: four samples belong to Asian macrohaplogroup C (two C4a1a4, one C4a1a4a and one C4b6); five samples to macrohaplogroup D (one by one D4i2, D4j, D4j12, D4j5a, D5b1), and three individuals to F (two F1b1b and one F1b1f). Each haplogroup M7c1b2b, R2, Y1a1 and Z1a1 is represented by one individual. One further haplogroup, M7 (probably M7c1b2b), was detected (sample AC20); however, the poor quality of its sequence data (2.19x average coverage) did not allow further analysis of this sample.

European lineages (occurring mainly among females) are represented by the following haplogroups: H (one H5a2 and one H8a1), one J1b1a1, three T1a (two T1a1 and one T1a1b), one U5a1 and one U5b1b (Table S1).

We detected two identical F1b1f haplotypes (AC11 female and AC12 male) and two identical C4a1a4 haplotypes (AC13 and AC15 males) from the same cemetery of Kunszállás; these matches indicate the maternal kinship of these individuals. There is no chronological difference between the female and the male from Grave 30 and 32 (AC11 and AC12), but the two males buried in Grave 28 and 52 (AC13 and AC15) are not contemporaries; they lived at least 2-3 generations apart.

ward-clustering-ancient-populations
Ward type clustering of 44 ancient populations. The Ward type clustering shows separation of Asian and European populations. The Avar elite group (AVAR) is situated on an Asian branch and clustered together with Central Asian populations from Late Iron Age (C-ASIA_LIAge) and Medieval period (C-ASIA_Medieval), furthermore with Xiongnu period population from Mongolia (MON_Xiongnu) and Scythians from the Altai region (E-EU_IAge_Scyth). P values are given in percent as red numbers on the dendrogram, where red rectangles indicate clusters with significant p values. The abbreviations and references are presented in Table S2.

The Avar period elite shows the lowest and non-significant genetic distances to ancient Central Asian populations dated to the Late Iron Age (Hunnic) and to the Medieval period, which is displayed on the ancient MDS plot (Fig. 4); these connections are also reflected on the haplogroup based Ward-type clustering tree (Fig. 3). Building of these large Central Asian sample pools is enabled by the small number of samples per cultural/ethnic group. Further mitogenomic data from Inner Asia are needed to specify the ancient genetic connections; however, genomic analyses are also set back by the state of archaeological research, i.e. the lack of human remains from the 4th-5th century Mongolia, which would be a particularly important region in the study of the Avar elite’s origin.

The investigated elite group from the Avar period elite also shows low genetic distances and phylogenetic connections to several Central and Inner Asian modern populations. Our results indicate that the source population of the elite group of the Avar Qaganate might have existed in Inner Asia (region of today’s Mongolia and North China) and the studied stratum of the Avars moved from there westwards towards Europe. Further genetic connections of the Avars to modern populations living to East and North of Inner Asia (Yakuts, Buryats, Tungus) probably indicate common source populations.

mds-eurasian-avar-elite-group
MDS with the 44 modern populations and the Avar elite group. The Multidimensional Scaling plot is displayed based on linearised Slatkin FST values calculated based on whole mitochondrial sequences (stress value is 0.0677). The MDS plot shows differentiation of European, Near-Eastern, Central- and East-Asian populations along coordinates 1 and 2. The Avar elite (AVAR) is located on the Asian part of plot and clustered with Uyghurs from Northwest-China (NW-CHIN_UYG) and Han Chinese (CHIN), as well as with Burusho and Hazara populations from the Central-Asian Highland (Pakistan). The linearised Slatkin FST values, abbreviations and references are presented in Table S5.

Sadly, no Y-DNA is available from this paper, although haplogroups Q, C2, or R1b(xM269) are probably to be expected, given the reported mtDNA. The current distribution of Y-DNA haplogroups in the Carpathian Basin points to a replacement of the male population in subsequent migrations.

Hungarians and Corded Ware

Ancient Hungarians are important to understand the evolution, not only of Ugric, but also of Finno-Ugric peoples and their origin, since they show a genetic picture before more recent population expansions, genetic drift, and bottlenecks in eastern Europe.

By now it is evident that the migration of Magyar clans from their homeland in the Cis-Urals region (from the 4th century AD on) happened after the first waves of late and gradual expansion of N1c subclades among Finno-Ugric peoples, but before the bottlenecks seen in modern populations of eastern Europe.

In Ob-Ugric peoples, from the scarce data found in Pimenoff et al. (2018), we can see how Siberian N subclades expanded further after the separation of Magyars, evidenced by the inverted proportion of haplogroups R1a and N in modern Khantys and Mansis compared to Hungarians, and the diversity of N subclades compared to modern Fennic peoples.

Similarly to Hungarians, the situation of modern Estonians (where R1a and N subclades show approximately the same proportion, ca. 33%) is probably closer to Fennic peoples in Antiquity, not having undergone the latest strong founder effect evident in modern Finns after their expansion to the north.

middle-age-hungarian
Hungarian expansion from the 4th to the 10th century AD.

Modern Hungary

This is data from recent papers, summed up in Wikipedia:

  • In Semino et al. (2001) they found among 45 Palóc from Budapest and northern Hungary: 60% R1a, 13% R1b, 11% I, 9% E, 2% G, 2% J2.
  • In Csányi et al. (2008), among 100 Hungarian men, 90 of them from the Great Hungarian Plain: 30% R1a, 15% R1b, 13% I2a1, 13% J2, 9% E1b1b1a, 8% I1, 3% G2, 3% J1, 3% I*, 1% E*, 1% F*, 1% K*. Among 97 Székelys, in Romania: 20% R1b, 19% R1a, 17% I1, 11% J2, 10% J1, 8% E1b1b1a, 5% I2a1, 5% G2, 3% P*, 1% E*, 1% N.
  • In Pamjav et al. (2011), among 230 samples expected to include 6-8% Gypsy peoples: 26% R1a, 20% I2a, 19% R1b, 7% I, 6% J2, 5% H, 5% G2a, 5% E1b1b1a1, 3% J1, <1% N, <1% R2.
  • In Pamjav et al. (2017), from the Bodrogköz population: R1a-M458 (20.4%), I2a1-P37 (19%), R1b-M343 (15%), R1a-Z280 (14.3%), E1b-M78 (10.2%), and N1c-Tat (6.2%).

NOTE. The N1c-Tat found in Bodrogköz belongs to the N1c-VL29 subgroup, more frequent among Balto-Slavic peoples, which may suggest (yet again) an initial stage of the expansion of N subclades among Finno-Ugric peoples by the time of the Hungarian migration.

This is the data from the FTDNA group on Hungary (copied from a Wikipedia summary of 2017 data):

  • 26.1% R1a (15% Z280, 6.5% M458, 0.9% Z93=>S23201, 3.7% unknown)
  • 19.2% R1b (6% L11-P312/U106, 5.3% P312, 4.2% L23/Z2103, 3.7% U106)
  • 16.9% I2 (15.2% CTS10228, 1.4% M223, 0.5% L38)
  • 8.3% I1
  • 8.1% J2 (5.3% M410, 2.8% M102)
  • 6.9% E1b1b1 (6% V13, 0.3% V22, 0.3% M123, 0.3% M81)
  • 6.9% G2a
  • 3.2% N (1.4% Z9136, 0.5% M2019/VL67, 0.5% Y7310, 0.9% Z16981)- note: only unrelated males are sampled
  • 2.3% Q (1.2% YP789, 0.9% M346, 0.2% M242)
  • 0.9% T
  • 0.5% J1
  • 0.2% L
  • 0.2% C
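As a quick consistency check on the list above, the following sketch compares each haplogroup total with the sum of its listed subclades. The figures are copied from the list; the function name and the 0.05-point tolerance are my own choices for illustration. It flags I2 and N, whose subclades add up to 17.1% and 3.3% rather than the stated 16.9% and 3.2%, presumably because of rounding in the source summary.

```python
# Haplogroup totals and subclade percentages, copied from the FTDNA list above.
ftdna = {
    "R1a": (26.1, [15.0, 6.5, 0.9, 3.7]),
    "R1b": (19.2, [6.0, 5.3, 4.2, 3.7]),
    "I2": (16.9, [15.2, 1.4, 0.5]),
    "J2": (8.1, [5.3, 2.8]),
    "E1b1b1": (6.9, [6.0, 0.3, 0.3, 0.3]),
    "N": (3.2, [1.4, 0.5, 0.5, 0.9]),
    "Q": (2.3, [1.2, 0.9, 0.2]),
}


def subclade_mismatches(table, tolerance=0.05):
    """Return {haplogroup: subclade sum minus stated total} where they disagree."""
    result = {}
    for haplogroup, (total, parts) in table.items():
        difference = sum(parts) - total
        if abs(difference) > tolerance:
            result[haplogroup] = round(difference, 1)
    return result


print(subclade_mismatches(ftdna))
```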

R1a-Z280 stands out in FTDNA (which we have to assume has no geographic preference among modern Hungarians), while R1a-M458 is prevalent in the north, which probably points to its relationship with (at least West) Slavic populations.

Ancient Hungarians

We already knew that Hungarians show similarities with Srubna and Hunnic peoples, and this paper shows a good reason for the similarities with the Huns.

Also, recent population movements in the region (before the Avars) probably increased the proportion of R1b-L23 and I1 subclades (related to Roman and Germanic peoples) as well as possibly R1a-Z283 (mainly M458, related to the expansion of Slavs). From Understanding 6th-century barbarian social organization and migration through paleogenomics, by Amorim et al. (2018):

szolad-collegno
Y-chromosome haplogroup attribution for 37 medieval and 1 Bronze age individuals.

NOTE. The sample SZ15, of haplogroup R1a1a1b1a3a (S200), belongs to the Germanic branch Z284, which has a completely different history with its integration into the Nordic Bronze Age community.

Interesting is the Szólád Bronze Age sample of R1a1a1b2a2a (Z2123) subclade (ca. 2100-1700 BC), which is possibly the same haplogroup found in King Béla III [Z93+ (80.6%), Z2123+ (10.8%)]*. Nevertheless, Z2123 refers to an upper clade, found also in East Andronovo sites in Narasimhan et al. (2018), as well as in the modern population of the Tarim Basin.

NOTE. For more on the analysis of probability of the actual subclade, see here.

Bronze Age R1a-Z93 samples of central-east Europe – like the Balkans BA sample (ca. 1750-1625 BC) from Merichleri, of R1a1a1b2 subclade – correspond most likely to the expansion of Iranian-speaking peoples in the early 2nd millennium BC, probably to the westward expansion of the Srubna culture.

The specific subclade of King Béla III, on the other hand, probably corresponds to the more recent expansion of Magyar tribes settled in the region during the 9th century AD, so the specific subclade must have separated from those found in central-east Europe and in Andronovo during the Corded Ware expansion.

r1a-z282-z93-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the potential Finno-Ugric-associated distribution of Z282 (including M558, a Z280 subclade) according to ancient maps; the northern Eurasian finds of Z2125 (upper clade of Z2123); and the potential of M458 subclades representing a west-east expansion of Balto-Slavic as a western outgroup of an original Finno-Ugric population, equivalent to Z284 in Scandinavia.

The study by Csányi et al. (2008), where the Tat C allele was found in 2 of 4 ancient samples, showed thus a potential 50:50 relationship of N1c in ancient Magyars, which is striking given the modern 1-3% a mere 1,000 years later, without any relevant population movement in between. This result remains to be reproduced with the current technology.
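To see how little a 2-of-4 result constrains the true frequency, here is a minimal sketch of a Wilson score interval for a proportion. This is my own illustration, not a calculation from Csányi et al. (2008); the function name and the 95% level (z = 1.96) are assumptions. For 2 carriers among 4 samples the interval runs from roughly 15% to 85%, so the small ancient sample is compatible with a wide range of N1c frequencies.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Tat C allele in 2 of 4 ancient samples, as reported in the text above.
low, high = wilson_interval(2, 4)
print(f"{low:.2f} to {high:.2f}")
```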

In fact, recent studies of ancient Magyars, from the 10th to the 12th century, have not shown any N1c sample, and have confirmed instead the ancient presence of R1a (two other samples, interred near Béla III), R1b (four samples), I2a (two samples), J1, and E1b, a mixed genetic picture which is more in line with what is expected.

So the question that I recently posed about east Corded Ware groups remains open: were Proto-Ugric peoples mainly of R1a-Z282 or R1a-Z93 subclades? Without ancient DNA from Middle Dnieper, Fatyanovo, Abashevo, and the succeeding cultures (like Netted Ware) in north-eastern Europe, it is difficult to say.

It is very likely that they are going to show mainly a mixture of both R1a-Z282 and R1a-Z93 lineages, with later populations showing a higher proportion of R1a-Z280 subclades. Whether this mixture happened already during the Corded Ware period, or is the result of later developments, is still unknown. What is certain is that Hungarian N1a1a1a-L708 subclades belong to more recent additions of Siberian haplogroups to the Ugric stock, probably during the Iron Age, just centuries before the Magyar expansion.

Related

Y-DNA haplogroups of Tuvinian tribes show little effect of the Mongol expansion

uralic-turkic

Open access Estimating the impact of the Mongol expansion upon the gene pool of Tuvans, by Balanovskaya et al., Vavilov Journal of Genetics and Breeding (2018), 22(5):611-619.

Abstract (emphasis mine):

With a view to tracing the Mongol expansion in the Tuvinian gene pool we studied the two largest Tuvinian clans – those in which, according to data of humanities, one could expect the highest Central Asian ancestry, connected with the Mongol expansion. Thus, the results for the Central Asian ancestry component in these two clans may be used as an upper limit of the Mongol influence upon the Tuvinian gene pool as a whole. According to the data of 59 Y-chromosomal SNP markers, the haplogroup spectra in these Tuvinian tribal groups (Mongush, N = 64, and Oorzhak, N = 27) were similar. On average, two-thirds of their gene pools (63 %) are composed of North Eurasian haplogroups (N*, N1a2, N3a, Q) connected with autochthonous populations of the modern area of Tuvans. The Central Asian haplogroups (C2, O2) composed less than a fifth (17 %) of the gene pools of the clans studied. The opposite ratio was revealed in Mongols: there were 10 % North Eurasian haplogroups and 75 % Central Asian haplogroups in their gene pool. All the results derived – “genetic portraits”, the matrix of genetic distances, the dendrogram and the multidimensional scaling plot, which mirror the genetic connections between Tuvinian clans and populations of South Siberia and East Asia – demonstrated the prominent similarity of the Tuvinian gene pools with populations from Khakassia and Altai. It could therefore be assumed that the Tuvinian clans Mongush and Oorzhak originated from autochthonous people (supposedly, from the local Samoyed and Ket substrata). The minor component of Central Asian haplogroups in the gene pool of these clans allows us to suppose that the Mongol expansion did not have a significant influence upon the Tuvinian gene pool as a whole.

tuvan-clans-y-dna

Interesting excerpts:

Haplogroup C2 peaks in Central Asia (Wells et al., 2001; Zerial et al., 2003), though its variants are abundant in other peoples of Siberia and Far East. For instance, in one of Buryat clans, namely Ekhirids, hg C2 frequency is 88 % (Y-base); in Kazakhs from different regions of Kazakhstan, total occurrence of hg C2 variants averages between 17 and 81 % (Abilev et al., 2012; Zhabagin et al., 2013, 2014, 2017), in populations of the Amur River (such as Nanais, Negidals, Nivkhs, Ulchs) – between 40 and 65 %, in Evenks – up to 68 % (Y-base), in Kyrgyz people of Pamir-Alay – up to 22 %, correspondingly; of all Turkic peoples of Altai, relatively high hg C2 frequency (16 %) is detected only in Telengits (Balanovskaya et al., 2014; Balaganskaya et al., 2011a, 2016). In Tuvinian clans under the study, hg C2 frequency is rather low – 19 % in Mongush and 11 % in Oorzhak, while in Mongols it makes up almost two thirds of the entire gene pool and comprises different genetic lines (subhaplogroups).

tuvinian-y-chromosome
Y-chromosomal haplogroup spectra in gene pools of Tuvinian Oorzhak and Mongush clans and of the neighboring populations of South Siberia and Central Asia.

Haplogroup N is abundant all over North Eurasia from Scandinavia to Far East (Rootsi et al., 2007). The study on whole Y-chromosome sequencing conducted with participation of our group (Ilumäe et al., 2016) subdivided this haplogroup into several branches with their regional distribution. In gene pools of the Tuvans involved, hg N was represented by two sub-clades, namely N1a2 and N3a.

Sub-clade N1a2 peaks in populations of West Siberia (in Nganasans, frequency is 92 %) and South Siberia (in Khakas 34 %, in Tofalars 25 %) (Y-base). In Tuvans, N1a2 occurrence is nearly 16 % in Mongush and 15 % in Oorzhak clans, respectively, while in Mongols, the frequency is three times less (5 %). Hg N1a2 is supposed to display the impact of the Samoyedic component to the gene pool of Tuvinian clans (Kharkov et al., 2013).

Sub-clade N3a is major in the Oorzhak clan comprising almost half of the gene pool (45 %); it is represented by two sub-clades, namely N3a* and N3a5. The same sub-branches are specific to the Mongush clan as well, though with lower frequencies: N3a* – 9 % and N3a5 – 14 % (see Table). In Khori-Buryats from the Transbaikal region, a high frequency is observed – 82 % (Kharkov et al., 2014), while in Mongols, N3a5 occurs rather rarely (6 %). Hg N3a* was detected in populations of South Siberia only, and was widely spread in Khakas-Sagays and Shors (up to 40 %) (Ilumäe et al., 2016) (Y-base).

samoyedic
Map of distribution of Samoyedic languages (red) in the XVII century (approximate; hatching) and in the end of XX century (continuous background). Modified from Wikipedia, with the Tuva region labelled.

Within the pan-Eurasian haplogroup R1a1a, two large genetic lines (sub-haplogroups) are identified: “European” (marker M458) and “Asian” (marker Z93) the latter almost never occurring in Europe (Balanovsky, 2015) but abundant in South Siberia and northern Hindustan. In the Altai-Sayan region, high frequencies of the “Asian” branch are spread in many peoples – Shors, Tubalars, Altai-Kizhi people, Telengits, Sagays, Kyzyl Khakas, Koibals, Teleuts (Y-base) (Kharkov et al., 2009). Hg R1a1a comprises perceptible parts of gene pools of Tuvinian clans (19 % in Mongush, and 15 % in Oorzhak), though its occurrence in Mongols is much lower (6 %). Those results also count in favor of the hypothesis of autochtonous component dominance even in the gene pools of clans potentially most influenced by Mongolian ancestry. If we add R1a1a variants to the “North Eurasian” haplogroups, the “not-Central Asian” component will compose average four fifth of the entire gene pools for Tuvinian clans (in Mongush 77 %, and in Oorzhak 81 %), being only 16 % in Mongols. Such data are definitely contrary to the hypothesis of a crucial influence of the Mongol expansion upon the development of Tuvinian gene pool.
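The "four fifths" figure can be checked with a small sketch that pools the two clans by sample size. The percentages (77 % and 81 %) and sample sizes (N = 64 and N = 27) are taken from the excerpts above; the helper name is hypothetical. Weighting by sample size gives about 78 %, against a simple mean of 79 %, so both readings land close to four fifths.

```python
def pooled_percentage(groups):
    """Sample-size-weighted mean of percentages.

    groups: list of (sample_size, percentage) pairs.
    """
    total = sum(n for n, _ in groups)
    if total == 0:
        raise ValueError("no samples")
    return sum(n * pct for n, pct in groups) / total


# "Not-Central Asian" component: Mongush (N = 64, 77 %), Oorzhak (N = 27, 81 %).
clans = [(64, 77.0), (27, 81.0)]
pooled = pooled_percentage(clans)
simple_mean = sum(pct for _, pct in clans) / len(clans)
print(round(pooled, 1), round(simple_mean, 1))
```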

I found interesting the high proportion of R1a-Z93 subclades among Sagays in Khakassia, which stem from a local Samoyed substratum, as described by the paper…

Featured Image: Map of Uralic and Altaic languages, from Wikipedia.

Related

Close inbreeding and low genetic diversity in Inner Asian human populations despite geographical exogamy

turko-mongol-indo-iranian

Open access Close inbreeding and low genetic diversity in Inner Asian human populations despite geographical exogamy, by Marchi et al. Scientific Reports (2018) 8:9397.

Abstract (emphasis mine):

When closely related individuals mate, they produce inbred offspring, which often have lower fitness than outbred ones. Geographical exogamy, by favouring matings between distant individuals, is thought to be an inbreeding avoidance mechanism; however, no data has clearly tested this prediction. Here, we took advantage of the diversity of matrimonial systems in humans to explore the impact of geographical exogamy on genetic diversity and inbreeding. We collected ethno-demographic data for 1,344 individuals in 16 populations from two Inner Asian cultural groups with contrasting dispersal behaviours (Turko-Mongols and Indo-Iranians) and genotyped genome-wide single nucleotide polymorphisms in 503 individuals. We estimated the population exogamy rate and confirmed the expected dispersal differences: Turko-Mongols are geographically more exogamous than Indo-Iranians. Unexpectedly, across populations, exogamy patterns correlated neither with the proportion of inbred individuals nor with their genetic diversity. Even more surprisingly, among Turko-Mongols, descendants from exogamous couples were significantly more inbred than descendants from endogamous couples, except for large distances (>40 km). Overall, 37% of the descendants from exogamous couples were closely inbred. This suggests that in Inner Asia, geographical exogamy is neither efficient in increasing genetic diversity nor in avoiding inbreeding, which might be due to kinship endogamy despite the occurrence of dispersal.

Interesting excerpts:

Two cultural groups, whose matrimonial systems are reported to differ, coexist in Inner Asia: Turko-Mongols are described as mainly exogamous while Indo-Iranians are thought to be mainly endogamous [45]. However, it is not always clear if exogamy refers to clan (ethnic) or village (geographical) exogamy. Here, we used a dataset of 16 populations representing 11 different ethnic groups from both cultural groups and we quantified geographical exogamy rates and distances in each population. Using an empirical threshold of 4 km, we confirmed that matrimonial behaviours differ as described in the literature, even though we found some exceptions: three Turko-Mongol populations (out of 12) have less than 50% exogamy, whereas one Indo-Iranian population (out of four) has more than 50% exogamy. (…)

geographic-distance-turko-mongols-indo-iranian
Geographical distances between the birth places of couples in Turko-Mongols and Indo-Iranians. The geographical distances are plotted in log scale (km). Their densities are represented by population (dashed lines) or for the Indo-Iranian and Turko-Mongol groups (solid lines). We represented the average distances within couples per population using a Kernel’s density estimate implemented in R with a smoothing bandwidth of 0.2. See Supplementary Table 1B for population codes.
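The kind of density curve described in the caption can be sketched as follows. This is only an illustration: it assumes a Gaussian kernel and base-10 logarithms, neither of which the caption specifies, and the distances are invented. The bandwidth of 0.2 is the one quoted in the caption and is treated here as the kernel's standard deviation.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels of the given sd."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Invented birthplace distances between spouses, in km.
distances_km = [0.5, 2.0, 3.5, 8.0, 15.0, 40.0, 120.0]
log_distances = [math.log10(d) for d in distances_km]

density = gaussian_kde(log_distances, bandwidth=0.2)
for km in (1, 4, 40, 400):
    print(km, round(density(math.log10(km)), 3))
```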

An additional important result of our study is that geographical distances are not negatively correlated with inbreeding, as could have been expected under an isolation-by-distance model [65]. Interestingly, a recent study based on a large genealogical dataset, collected across Western Europe and North America, and including birth places information, similarly found an absence of correlation between relatedness and the distance between couples, for the cohorts born before 1850 [66]. Our analyses within present-day Turko-Mongols reveal more specifically that the structure of the relationship between geographical distance and mating choice inbreeding is not linear, but rather tends to be bell-shaped, and thus cannot be correctly assessed with a single correlation test. Indeed, descendants from parents born 4 to 40 km apart are more inbred than descendants from endogamous couples (≤4 km) or from long-range exogamous ones (>40 km). As a consequence, close inbreeding exists despite geographical exogamy, and about a third of descendants from exogamous couples are inbred.
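The distance classes used in this passage can be written down as a short sketch. The 4 km and 40 km cut-offs come from the excerpt; the function names, class labels and example distances are hypothetical and are not taken from the paper's own code.

```python
def distance_class(km, near=4.0, far=40.0):
    """Classify a couple by the distance between the spouses' birth places."""
    if km < 0:
        raise ValueError("distance cannot be negative")
    if km <= near:
        return "endogamous"
    if km <= far:
        return "short-range exogamous"
    return "long-range exogamous"


def exogamy_rate(distances_km, threshold=4.0):
    """Share of couples whose birth places lie more than `threshold` km apart."""
    if not distances_km:
        raise ValueError("no couples given")
    return sum(d > threshold for d in distances_km) / len(distances_km)


# Invented distances for four couples of one hypothetical population.
couples = [1.0, 5.0, 50.0, 3.0]
print([distance_class(d) for d in couples])
print(exogamy_rate(couples))
```

In the paper's terms, a population is mainly exogamous when this rate exceeds 50%, while the bell-shaped pattern concerns the middle class (more than 4 km and up to 40 km).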

These results, in addition to those obtained by Kaplanis et al. (2018) [66], highlight the importance of using geographic distances rather than exogamy rates to characterize the impact of exogamy on inbreeding, as already described when studying patrilocality [67]. Indeed, when we compare mating choice inbreeding patterns for descendants from exogamous and endogamous couples defined for thresholds of 4, 10, 20 and 30 km, we find no significant differences (for number and total length of class C-ROHs and F-Median coefficient: MWU test p-values > 0.1). We only detect significantly lower values in descendants from exogamous couples for larger distances above 40 and 50 km (p-values < 0.03).

genetic-diversity-turko-mongol-indo-iranian
Genetic diversity (A) and inbreeding patterns (B,C) within populations. Grey lines in (B) represent inbreeding values corresponding to second-cousins and first-cousins. The grey line in (C) represents the homozygosity population baseline expected under panmixia. The number of samples per population is indicated between parentheses. See Supplementary Table 1B for population codes.
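For reference, the textbook inbreeding coefficients behind labels such as "first-cousins" and "second-cousins" can be computed with a minimal sketch. It assumes the grey lines refer to the expected coefficient F of a child of full cousins with otherwise unrelated, non-inbred ancestors; the exact values drawn in the paper's figure are not given in the caption.

```python
from fractions import Fraction


def offspring_inbreeding(cousin_degree):
    """Expected F for a child of full n-th cousins (non-inbred ancestors).

    Two shared ancestors each contribute (1/2) ** (2n + 3),
    which gives 2 ** -(2n + 2) in total.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return Fraction(1, 2 ** (2 * cousin_degree + 2))


for degree in (1, 2):
    f = offspring_inbreeding(degree)
    print(degree, f, float(f))
```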

Our results also challenge the intuition that exogamy necessarily increases the genetic diversity within a population and therefore reduces drift inbreeding. Indeed, we found that Turko-Mongol populations have a lower genetic diversity (as measured by the mean haplotypic heterozygosity) and more intermediate ROHs associated with drift inbreeding than those of Indo-Iranians despite higher exogamous rates. (…)

Overall, this research sheds light on mating choice preferences: we showed that two thirds of partners that have not dispersed did mate with unrelated individuals, and that drift and mating choice inbreeding is variable, even among close-by populations. We also provide new insights into the relationship between dispersal and inbreeding in humans, based on genetic data, and demonstrate that geographical exogamy is not necessarily negatively associated with mating choice inbreeding, but rather can have a more complex non-linear relationship. Contrary to the common situation in many animals, this finding suggests that Inner Asian human populations who practise exogamy at small geographical scales might be focused on alliance strategies that result in kinship endogamy. (…)

Related:

Eurasian steppe dominated by Iranian peoples, Indo-Iranian expanded from East Yamna

yamna-indo-iranian-expansion

The expected study of Eurasian samples is out (behind paywall): 137 ancient human genomes from across the Eurasian steppes, by de Barros Damgaard et al. Nature (2018).

Discussion (emphasis mine):

Our findings fit well with current insights from the historical linguistics of this region (Supplementary Information section 2). The steppes were probably largely Iranian-speaking in the first and second millennia bc. This is supported by the split of the Indo-Iranian linguistic branch into Iranian and Indian [33], the distribution of the Iranian languages, and the preservation of Old Iranian loanwords in Tocharian [34]. The wide distribution of the Turkic languages from Northwest China, Mongolia and Siberia in the east to Turkey and Bulgaria in the west implies large-scale migrations out of the homeland in Mongolia since about 2,000 years ago [35]. The diversification within the Turkic languages suggests that several waves of migration occurred [36] and, on the basis of the effect of local languages, gradual assimilation to local populations had previously been assumed [37]. The East Asian migration starting with the Xiongnu accords well with the hypothesis that early Turkic was the major language of Xiongnu groups [38]. Further migrations of East Asians westwards find a good linguistic correlate in the influence of Mongolian on Turkic and Iranian in the last millennium [39]. As such, the genomic history of the Eurasian steppes is the story of a gradual transition from Bronze Age pastoralists of West Eurasian ancestry towards mounted warriors of increased East Asian ancestry, a process that continued well into historical times.

This paper will need a careful reading (better in combination with Narasimhan et al. (2018), once their tables are corrected) to assess the actual ‘Iranian’ nature of the peoples studied. Their wide and long-term dominion over the steppe could also potentially explain some early samples from Hajji Firuz with steppe ancestry.

eurasian-steppe-samples
Principal component analyses. The principal components 1 and 2 were plotted for the ancient data analysed with the present-day data (no projection bias) using 502 individuals at 242,406 autosomal SNP positions. Dimension 1 explains 3% of the variance and represents a gradient stretching from Europe to East Asia. Dimension 2 explains 0.6% of the variance, and is a gradient mainly represented by ancient DNA starting from a ‘basal-rich’ cluster of Natufian hunter-gatherers and ending with EHGs. BA, Bronze Age; EMBA, Early-to-Middle Bronze Age; SHG, Scandinavian hunter-gatherers.

For the moment, at first sight, it seems that, in terms of Y-DNA lineages:

  • R1a-Z93 (especially Z2124 subclades) dominates the steppes in the studied periods.
  • R1b-P312 is found in Hallstatt ca. 810 BC, which is compatible with its role in the Celtic expansion.
  • R1b-U106 is found in a West Germanic chieftain in Poprad (Slovakia) ca. 400 AD, during the Migration Period, hence supporting once again the expansion of Germanic tribes especially with R1b-U106 lineages.
  • A new sample of N1c-L392 (L1025) lineage dated ca. 400 AD, now from Lithuania, points again to a quite late expansion of this lineage to the region, believed to have hosted Uralic speakers for more than 2,000 years before this.
  • A sample of haplogroup R1a-Z282 (Z92) dated ca. 1300 AD in the Golden Horde is probably not quite revealing, not even for the East Slavic expansion.
  • Also, interestingly, some R1b(xM269) lineages seem to be associated with Turkic expansions from the eastern steppe dated around 500 AD, which probably points to a wide Eurasian distribution of early R1b subclades in the Mesolithic.

NOTE. I have referenced not just the reported subclades from the paper, but also (and mainly) further Y-SNP calls studied by Open Genomes. See the spreadsheet here.

Interesting also to read in the supplementary materials the following, by Michaël Peyrot (emphasis mine):

1. Early Indo-Europeans on the steppe: Tocharians and Indo-Iranians

The Indo-European language family is spread over Eurasia and comprises such branches and languages as Greek, Latin, Germanic, Celtic, Sanskrit etc. The branches relevant for the Eurasian steppe are Indo-Aryan (= Indian) and Iranian, which together form the Indo-Iranian branch, and the extinct Tocharian branch. All Indo-European languages derive from a postulated protolanguage termed Proto-Indo-European. This language must have been spoken ca 4500–3500 BCE in the steppe of Eastern Europe [21]. The Tocharian languages were spoken in the Tarim Basin in present-day Northwest China, as shown by manuscripts from ca 500–1000 CE. The Indo-Aryan branch consists of Sanskrit and several languages of the Indian subcontinent, including Hindi. The Iranian branch is spread today from Kurdish in the west, through a.o. Persian and Pashto, to minority languages in western China, but was in the 2nd and 1st millennia BCE widespread also on the Eurasian steppe. Since despite their location Tocharian and Indo-Iranian show no closer relationship within Indo-European, the early Tocharians may have moved east before the Indo-Iranians. They are probably to be identified with the Afanasievo Culture of South Siberia (ca 2900 – 2500 BCE) and have possibly entered the Tarim Basin ca 2000 BCE [103].

The Indo-Iranian branch is an extension of the Indo-European Yamnaya Culture (ca 3000–2400 BCE) towards the east. The rise of the Indo-Iranian language, of which no direct records exist, must be connected with the Abashevo / Sintashta Culture (ca 2100 – 1800 BCE) in the southern Urals and the subsequent rise and spread of Andronovo-related Culture (1700 – 1500 BCE). The most important linguistic evidence of the Indo-Iranian phase is formed by borrowings into Finno-Ugric languages [104–106]. Kuz’mina (2001) identifies the Finno-Ugrians with the Andronoid cultures in the pre-taiga zone east of the Urals [107]. Since some of the oldest words borrowed into Finno-Ugric are only found in Indo-Aryan, Indo-Aryan and Iranian apparently had already begun to diverge by the time of these contacts, and when both groups moved east, the Iranians followed the Indo-Aryans [108]. Being pushed by the expanding Iranians, the Indo-Aryans then moved south, one group surfacing in equestrian terminology of the Anatolian Mitanni kingdom, and the main group entering the Indian subcontinent from the northwest.

Summary map. Depictions of the five main migratory events associated with the genomic history of the steppe pastoralists from 3000 bc to the present. a, Depiction of Early Bronze Age migrations related to the expansion of Yamnaya and Afanasievo culture. b, Depiction of Late Bronze Age migrations related to the Sintashta and Andronovo horizons. c, Depiction of Iron Age migrations and sources of admixture. d, Depiction of Hun-period migrations and sources of admixture. e, Depiction of Medieval migrations across the steppes.

2. Andronovo Culture: Early Steppe Iranian

Initially, the Andronovo Culture may have encompassed speakers of Iranian as well as Indo-Aryan, but its large expansion over the Eurasian steppe is most probably to be interpreted as the spread of Iranians. Unfortunately, there is no direct linguistic evidence to prove to what extent the steppe was indeed Iranian speaking in the 2nd millennium BCE. An important piece of indirect evidence is formed by an archaic stratum of Iranian loanwords in Tocharian [34, 109]. Since Tocharian was spoken beyond the eastern end of the steppe, this suggests that speakers of Iranian spread at least that far. In the west of the Tarim Basin the Iranian languages Khotanese and Tumshuqese were spoken. However, the Tocharian B word etswe ‘mule’, borrowed from Iranian *atswa- ‘horse’, cannot derive from these languages, since Khotanese has aśśa- ‘horse’ with śś instead of tsw. The archaic Iranian stratum in Tocharian is therefore rather to be connected with the presence of Andronovo people to the north and possibly to the east of the Tarim Basin from the middle of the 2nd millennium BCE onwards [110].

Since Kristiansen and Allentoft are among the paper’s authors (and Peyrot is a colleague of Kroonen), it seems that they felt the need to respond expressly to the growing criticism of their recent Indo-European – Corded Ware Theory (IECWT). That’s nice.

They are obviously trying to reject the Corded Ware – Uralic links that are on the rise lately among Uralicists, now that Comb Ware is not a suitable candidate for the expansion of the language family.

IECWT proponents are apparently not prepared to let it go quietly. Instead of challenging the traditional Neolithic Uralic homeland in Eastern Europe with a recent paper on the subject, they selected an older one which partially fits, Kuz’mina (2001), and now shift the Uralic homeland to the east of the Urals (whereas Kuz’mina places it south of the Urals).

Other authors comment later in this same paper that East Uralic languages spread quite late, so the text is not even consistent among its collaborating authors.

Also interesting is the need to resort to the questionable argument of early Indo-Aryan loans, which may well have been Indo-Iranian instead, since there is no way to prove a difference between the two stages in early Uralic borrowings from ca. 4,500-3,500 years ago…

EDIT (10/5/2018) The linguistic supplement of the Science paper deals with different Proto-Indo-Iranian stages in Uralic loans, so on the linguistic side at least this influence is clear to all involved.

A rejection of such proposals of a late, eastern homeland can be found in many recent writings of Finnic scholars; see e.g. my references to Parpola (2017), Kallio (2017), or Nordqvist (2018).

NOTE. I don’t mind repeating it: Uralic is one possibility (the most likely one) for the substrate language that Corded Ware migrants spread, but it could have been e.g. another Middle PIE dialect, similar to Proto-Anatolian (after the expansion of Suvorovo-Novodanilovka chiefs). I have expressly stated this in the Corded Ware substrate hypothesis since the first edition. What has been clear since 2015, and should be clear to anyone now, is that Corded Ware did not spread Late PIE languages to Europe, and that some east CWC groups only spread languages to Asia after admixing with East Yamna. If they did not spread Uralic, then they spread a phonetically similar language or group of languages, which has not survived to this day.

Their description of Yamna migrations is already outdated after Olalde et al. & Mathieson et al. (2018), and Narasimhan et al. (2018), so they will need to update their model (yet again) for future papers. As I said before, Anthony seems to be one step behind the current genetic data, and the IECWT seems to be one step behind Anthony in their interpretations.

At least we won’t have the Yamna -> Corded Ware -> BBC nonsense anymore. They expressly state that LPIE is to be associated with Yamna, and in particular that the “Indo-Iranian branch is an extension of the Indo-European Yamnaya Culture (ca 3000–2400 BCE) towards the east” (which will evidently show an East Yamna / Poltavka society of R1b-L23 subclades), so that earlier Eneolithic cultures have to be excluded, and the identification of Balto-Slavic with East Europe is also out of the way.

Related: