Iron Age nomads of West Siberia of hg. Q1b, R1a, and basal N1a-L1026

Open access: Ancient genomic time transect from the Central Asian Steppe unravels the history of the Scythians, by Gnecchi-Ruscone et al., Sci Adv. (2021) 7 (13): eabe4414.

Interesting excerpts (emphasis mine):

From an archaeological perspective, the earliest IA burials associated with nomad-warrior cultures were identified in the eastern fringes of the Kazakh Steppe, in Tuva and the Altai region (ninth century BCE).

Following this early evidence, the Tasmola culture in central and north Kazakhstan is among the earliest major IA nomad warrior cultures emerging (eighth to sixth century BCE).

These earlier groups were followed by the iconic Saka cultures located in southeastern Kazakhstan and the Tian Shan mountains (sixth to second century BCE), the Pazyryk culture centered in the Altai mountains (fifth to first century BCE-CE), and the Sarmatians that first appeared in the southern Ural region (sixth to second century BCE) and later are found westward as far as the northern Caucasus and eastern Europe (fourth BCE to fourth CE).

The nomad groups also influenced their sedentary neighbors, such as the ones associated with the Sargat cultural horizon (fifth to first century BCE) located in the northern forest-steppe zone between the Tobol and Irtysh rivers.

The IA transition in the Kazakh Steppe

In contrast to the highly homogeneous steppe_MLBA cluster found across the Kazakh Steppe until the end of the second millennium BCE, the IA individuals are scattered across the PC space, most notably along PC1 and PC3. Their spread along these PCs suggests a varying degree of extra eastern Eurasian affinity compared to the MLBA population and extra affinity to southern populations ultimately related to the Neolithic Iranians and the Mesolithic Caucasus hunter-gatherers (from here on referred to as Iranian-related ancestry), respectively. Despite the high genetic variability, it is possible to appreciate homogeneous clusters of ancient individuals belonging to the same archaeological culture and/or geographic area:

pca-iron-age-nomads-bronze-age — PC1 versus PC3 (outer plot) and PC1 versus PC2 (inner plot in the bottom right box) including all the IA, new and previously published individuals (filled symbols), relevant published temporally preceding groups (empty symbols), and present-day Kazakh individuals (small black points). The gray labels in this and the following panel indicate broad geographical groupings of the modern individuals used to calculate PCA that in the plots are shown as small gray points. The ancient samples are distributed in (A) to (C) sliced in three different time intervals as reported in the top right corner.

Following a chronological order, most of the individuals from the sites associated with the Early IA Tasmola culture (“Tasmola_650BCE”) and the published “Saka_Kazakhstan_600BCE” of central-north Kazakhstan cluster together in the middle of the PCA plot and show a uniform pattern of genetic components in ADMIXTURE analyses. The two previously published individuals from the Aldy Bel site in Tuva (Aldy_Bel_700BCE) also fall within this genetic cloud.

pca-iron-age-nomads-early-iron-age — PC1 versus PC3 (outer plot) and PC1 versus PC2 (inner plot in the bottom right box) including all the IA, new and previously published individuals (filled symbols), relevant published temporally preceding groups (empty symbols), and present-day Kazakh individuals (small black points). The gray labels in this and the following panel indicate broad geographical groupings of the modern individuals used to calculate PCA that in the plots are shown as small gray points. The ancient samples are distributed in (A) to (C) sliced in three different time intervals as reported in the top right corner.

This genetic profile persists in the later Middle and Late IA, shown by most individuals from the Pazyryk site of Berel (“Pazyryk_Berel_50BCE”). This IA cluster is distinct from the previous steppe_MLBA groups inhabiting the same regions, most notably because of its substantial shift toward eastern Eurasians along PC1. In addition, we find outliers showing an even stronger shift to eastern Eurasians than the main cluster: two outliers from Pazyryk Berel time (“Pazyryk_Berel_50BCE_o”), three outliers from the Tasmola site of Birlik (“Tasmola_Birlik_640BCE”), and three of four individuals from the Korgantas phase of central-north Kazakhstan. One female individual from Birlik (BIR013.A0101) with an eastern Eurasian genetic profile was unearthed with grave goods (a bronze mirror) that presented typical Eastern Steppe features.

pca-iron-age-nomads-late-iron-age — PC1 versus PC3 (outer plot) and PC1 versus PC2 (inner plot in the bottom right box) including all the IA, new and previously published individuals (filled symbols), relevant published temporally preceding groups (empty symbols), and present-day Kazakh individuals (small black points). The gray labels in this and the following panel indicate broad geographical groupings of the modern individuals used to calculate PCA that in the plots are shown as small gray points. The ancient samples are distributed in (A) to (C) sliced in three different time intervals as reported in the top right corner.

Admixture modeling of IA steppe populations

Genetic ancestry modeling of the IA groups performed with qpWave and qpAdm confirmed that the steppe_MLBA groups adequately approximate the western Eurasian ancestry source in IA Scythians while the preceding steppe_EBA (e.g., Yamnaya and Afanasievo) do not.

As an eastern Eurasian proxy, we chose LBA herders from Khovsgol in northern Mongolia based on their geographic and temporal proximity. Other eastern proxies fail the model because of a lack or an excess of affinity toward the Ancient North Eurasian (ANE) lineage. However, this two-way admixture model of Khovsgol + steppe_MLBA does not fully explain the genetic compositions of the Scythian gene pools.

We find that the missing piece matches well with a small contribution from a source related to ancient populations living in the southern regions of the Caucasus/Iran or Turan. The proportions of this ancestry increase through time and space: a negligible amount in the most northeastern Aldy_Bel_700BCE group, ~6% in the early Tasmola_650BCE, ~12% in Pazyryk_Berel_50BCE, ~10% in Sargat_300BCE, ~13% in Saka_TianShan_600BCE, and ~20% in Saka_TianShan_400BCE (Fig. 3A), in line with f4-statistics.

Sarmatians also require 15 to 20% Iranian ancestry while carrying substantially less Khovsgol and more steppe_MLBA-related ancestry than the eastern Scythian groups.

admixture-iron-age-scythian-saka-sarmatian — **Bar plots showing the ancestry proportions and SEs obtained from qpWave/qpAdm modelings.** (A) Fitting models for the main IA groups using LBA sources, the major genetic shift with the “new” East Asian influx (DevilsCave_N-like) observed in the Middle IA outliers and Korgantas. (B) Fitting models for the post-IA groups using IA groups as sources. A transparency factor is added to the models presenting poor fits (P < 0.05; only Konyr_Tobe_300CE). On the top is shown the color legend for the sources tested.

Dating ancient admixture

Admixture dating with the DATES program reveals an early formation of the main Scythian gene pools during 1000 to 1500 BCE. DATES is designed to model only the two-way admixture, so to account for the estimated three-way models obtained with qpWave and qpAdm, we independently tested the three pairwise comparisons (steppe_MLBA, BMAC, and Khovsgol). DATES was successful in fitting exponential decays for the two western + eastern Eurasian pairs, steppe_MLBA + Khovsgol, and BMAC + Khovsgol, while failing in the western + western Eurasian pair (steppe_MLBA + BMAC). For each target, steppe_MLBA + Khovsgol and BMAC + Khovsgol yielded nearly identical admixture date estimates. We believe that our estimates mostly reflect an average date between the genetically distinguishable eastern (Khovsgol) and western (steppe_MLBA + BMAC) ancestries, weighted by the relative contribution from the two western sources, rather than reflecting a true simultaneous three-way admixture.

It is noteworthy that DATES found increasingly younger admixture dates in the Tian Shan Saka groups as the BMAC-related ancestry increases: from Saka_TianShan_600BCE to the Saka_TianShan_400BCE and especially in the later Alai_Nura_300CE as well as for Pazyryk_Berel_50BCE and Sargat_300BCE with respect to the date of Tasmola_650BCE (~1100 to 900 BCE with respect to ~1300 to 1400 BCE). A small-scale gene flow from a BMAC-related source continued over IA may explain both the increase in the BMAC-related ancestry proportion and increasingly younger admixture dates.
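As a rough illustration of the reasoning behind these dates, the following minimal Python sketch fits an exponential decay to ancestry covariance as a function of genetic distance, then converts the resulting number of generations into a calendar year. It is not the DATES program: the log-linear fit, the function names, the synthetic decay curve of 25 generations, and the 29-year generation time are all assumptions made for illustration only.

```python
import math


def fit_decay_rate(distances_morgans, covariances):
    """Least-squares fit of ln(cov) = ln(A) - t * d.

    Returns t, the decay rate, read here as generations since admixture
    (simplified assumption: a single pulse and noise-free data).
    """
    xs = list(distances_morgans)
    ys = [math.log(c) for c in covariances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def admixture_year(sample_year, generations, years_per_generation=29.0):
    """Calendar year of admixture; negative years are BCE.

    The 29-year generation time is an assumed value, not taken from the text.
    """
    return sample_year - generations * years_per_generation


# Synthetic decay curve (hypothetical values, not data from the paper).
true_generations = 25.0
distances = [0.005 * k for k in range(1, 41)]  # 0.5 to 20 cM, in Morgans
covariances = [0.02 * math.exp(-true_generations * d) for d in distances]

generations = fit_decay_rate(distances, covariances)
year = admixture_year(-650, generations)  # a sample dated to 650 BCE
print(f"{generations:.1f} generations -> {abs(round(year))} BCE")
```

Under these assumed inputs, a sample dated to 650 BCE with a decay of about 25 generations would place the admixture in the 14th century BCE. A later sample with a similar or shorter decay yields a younger date, which is the pattern described for the Tian Shan Saka groups.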

Principal component analysis. The top left shows the PCA (PC1 vs PC2) of present-day Eurasian populations, onto which the 111 new ancient individuals are projected. Individuals are grouped by site and colored according to cultural affiliation (top right legends). Labels in red in the PCA space mark outlier individuals removed from their respective cultural groups (see Extended Data Table 1). The bottom legends list the present-day Eurasian populations used to calculate the PCA, colored by language family. The new data from 96 ethnic Kazakh individuals (KZN, Kazakh_new) are colored in dark blue.

Scythian Y-SNP calls

This is a list of relevant Y-SNP calls for some selected samples from the Sargat Horizon, from older to younger layers according to mean date (mtDNA hg. appears first):

Shadrinsk_400BCE (5th-3rd c. BCE):

SHD001: H1. Q-M242>MEH2>M346>L53>YP4010>YP3966>YP3955>YP3953(xYP4024>YP4055>BY45596, xYP4024>YP4055>YP3952).
SHD002: H101. (Q-M242>MEH2>)M346 (M346 level: FT32889 1xC->A; very low coverage).

Vorobievo_350BCE (5th-2nd c. BCE):

VOR003: N-M231>L735 (low coverage). Probably false positive at TAT>P83>Y23750 level.

Shmakovo_350BCE (4th-3rd c. BCE):

SMV001: U4b1b1. N-M231>L735>F1206>L729>TAT>F1419>L839>L708>M2005>CTS9239>L1026>CTS6967>CTS3103(*? xB479>B511; xY6058>B197; xY6058>CTS10760>CTS2929, xY6058>CTS10760>PH1266; xZ1936).
SMV002: T2d1b2. R-M207>M173>M420>M459>M198>M417>PF6162>Z93>Z94>Z2124>Z2122>Y57>FGC4547>Y52(*?xBY32065; xFGC4582>YP5644; FT416857>YP1269>BY27616, FT16857>YP1269>BY30803; xY2631). Reported as G-M201 in the paper.

MtBitiya_300BCE (3rd-2nd c. BCE):

BIY001: H7e. N-M231(>L735-?>)F1206(>L729)>TAT? (xF4309; xB211; xCTS2929).
BIY002: R1b1. Q-M242>MEH2>M346>L53>YP4010>YP3966>YP3955>YP3953*(xYP4024).
BIY003: G2a1. N-M2005>(CTS9239>L1026)>CTS6967>CTS3103(xB479>B511, xY6058>B197>B202, xP89, xCTS2929, xZ1936>CTS1223, xZ1936>BY190090). If it is L1026 or Z1936, it would be a basal lineage.
BIY005: D4j. N-M231>L735>F1206>L729>TAT>F1419>L839>L708>M2005>CTS9239(*?/L1026*; xCTS6967). Derived at CTS9239 level (CTS10761+ 1x T->A), ancestral at CTS6967 level (L392- 1x A->C). Dubious due to low coverage.
BIY006: D4j. CT(xN).
BIY007: U4b1b1. N-M231>L735>F1206>L729>TAT>F1419>L839>L708>M2005>CTS9239>L1026>CTS6967>CTS3103>Z1936(*? xBY190090; xCTS1223>CTS9925, xCTS1223>Y13851>Y13852>Y13850). Reported as G-M201 in the paper.
BIY008: U4d2. N-M231>L735>F1206>L729>TAT>F1419>L839>L708>M2005>CTS9239>L1026>CTS6967>CTS3103(xB479>B511, xY6058>B197>B202, xY6058>B197>P89, xY6058>CTS10760; xZ1936>BY190090; xZ1936>CTS1223>CTS9925, xZ1936>CTS1223>Y13852>Y13850). Likely to be of a basal haplogroup among any of the no calls.
BIY009: N1a1a1a1a. N-M231>L735>F1206>L729>TAT>F1419>L839>L708>M2005>CTS9239>L1026>CTS6967>CTS3103>Z1936* (xBY190090, xCTS1223).
BIY010: H2a1. CT>F (xG, xJ, xN, xP-M45). Possibly Q-M242>MEH2 (Z36014 1xG->A), as reported in the paper.
BIY011: U4a1. R(-M207)>M173>(M343>)L754>(L389>P297>)M269>(L23>)L51?. One read for L51 (L51 1xG->A).
BIY012: U5a1a2a. N-M231(xB482>FT324; xL735). It could be thus of hg. N-M231*, N-M231>B482* or within the N-M231>B482>MF52704 branch.

Kokonovka_200BCE (4th-3rd c. BCE):

KOK001: N-M231>L735>F1206>L729>(TAT>F1419>L839>L708>M2005>CTS9239>L1026>)CTS6967(xB479>B516). Low coverage. KOK001 outlier. Classified as Q2a in the paper.
KOK002: N-M231>(L735>F1206>L729>)TAT (xB211; xB202; xP89).

Bogdanovka_150BCE (2nd-1st c. BCE):

BGD004: H5b. N-M231>L735>(F1206>L729>)TAT-??>(F1419>)L839(xL708>B211, xL708>M2005>CTS9239>L1026). Dubious call. Reported as NO in the paper.
BGD002: K2a5. R-M207>M173>M420>M459>M198>M417>PF6162(xZ283>Z282>PF6155, xZ283>Z282>Z280; xZ93>Z94>Z2124>Z2122>Y57>FGC4547>Y52, xZ93>Z94>Z2124>Z2122>Y57>YP645, etc.). No call at Z93, Z94, Z2124, Z2122 levels.
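The calls above follow a compact notation: a path of derived SNPs separated by ">", optionally followed by a parenthesised list of excluded branches, each prefixed by "x". The sketch below, assuming that reading of the notation, splits a simple call into its derived path and its excluded branches. The function name parse_call is hypothetical, and the sketch deliberately ignores the uncertainty markers (question marks, inferred levels in parentheses) used in some entries.

```python
import re


def parse_call(call):
    """Split 'A>B>C(xD>E, xF)' into (derived path, excluded branches).

    Assumes one optional trailing parenthesised group and no nested
    parentheses; entries with inferred levels such as '(>L729)' are not
    handled by this simplified sketch.
    """
    match = re.fullmatch(r"\s*([^()]+?)\s*(?:\((.*)\))?\s*\.?\s*", call)
    if match is None:
        raise ValueError(f"unsupported call format: {call!r}")
    path_text, excluded_text = match.groups()
    # A trailing '*' marks a basal call; it is stripped from the SNP name here.
    derived = [snp.strip().rstrip("*") for snp in path_text.split(">")]
    excluded = []
    if excluded_text:
        for item in re.split(r"[;,]", excluded_text):
            item = item.strip()
            if item.startswith("x"):
                excluded.append(item[1:].split(">"))
    return derived, excluded


# The SHD001 call, copied from the list above.
shd001 = (
    "Q-M242>MEH2>M346>L53>YP4010>YP3966>YP3955>YP3953"
    "(xYP4024>YP4055>BY45596, xYP4024>YP4055>YP3952)"
)
derived, excluded = parse_call(shd001)
print("terminal derived SNP:", derived[-1])
print("excluded branches:", [">".join(branch) for branch in excluded])
```

Separating the two parts makes it easier to see why a sample is described as basal: the terminal derived SNP is known, while every listed downstream branch has been excluded.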

scythian-iron-age-sakas-ancient-dna — Location of sampled groups. Symbols of Sargat samples modified in accordance with the symbols used for the PCA. Also added is the approximate location of the medieval Uyelgi site.

Sargat horizon

The following are excerpts from Hanks (2003):

Following a temporal framework put forth by Koryakova and Daire (1997), Figure 4.1 shows that the Early Iron Age period for the Trans-Ural region can be broadly divided into three chronological divisions: the Pre-Sargat (8th-6th c. BC), Gorokhovo-Sargat (5th-3rd c. BC), and the Sargat/Late-Sargat (2nd c. BC/ 3rd c. AD).

pre-sargat-gorokhovo-itkul-phase — Schematic detailing the spatial-temporal developments within the Trans-Ural region regarding conventional ‘archaeological culture’ patterns. Image from Hanks (2003).

The Pre-Sargat stage reflects the significant interaction between the Itkul metallurgical populations, located in the eastern area of the Ural Mountains, and the nomadic Saka and later Sauro-Sarmatian populations of the southern steppe that ranged north and south seasonally and penetrated the Trans-Ural area.

This interaction is in turn thought to have directly influenced the development of the indigenous forest and forest-steppe cultures in the western Trans-Ural that are known as the Iset group; comprised of the Nosilovo, Vorobievo and Zelenomys cultures. From this significant sphere of contact, two main cultural lines are believed to have developed and are named the Gorokhovo and Baitovo. According to Koryakova and Daire (1997, 166), the Baitovo cultural tradition may be connected to the earlier Suzgun Barkhatovo pattern associated with the Late Bronze Age period.

In the eastern Trans-Ural, the Pre-Sargat stage also reflects an important period of contact with the nomadic populations of the south. In this case, there is a proposed direct interaction and stimulation of the Suzgun cultural group and its subsequent development in connection with the Post-Irmen type. From these socio-cultural dynamics the earliest development, or genesis, of the Sargat culture is inferred and the start of its expansion is taken to be visible in the archaeological record.

The Gorokhovo-Sargat stage reflects continued interaction between the Trans-Ural forest-steppe and metallurgical cultures with the nomadic Saka and Sauro-Sarmatian tribes of the southern Kazakhstan steppe region. During this period the Gorokhovo culture absorbed the other local Iset groups (i.e. Baitovo, Nosilovo, etc.) and became a more significant factor in the socio-political development of the Trans-Ural region. As such, the Gorokhovo culture has traditionally been characterised as a semi-nomadic chiefdom level society that reached its zenith during the 4th-3rd centuries BC (Koryakova 1988, Koryakova and Daire 1997, 167). It is also at this time that the hypothesised westward expansion of the Sargat cultural groups occurred, whereby the gradual absorption of the Gorokhovo and other cultural groups took place by the end of the 3rd century BC. It is believed that within this time frame the Sargat cultural pattern intensified and thus extended from the Ural Mountains east to the Baraba plain in the West Siberian plateau. (…)

In the Sargat/Late-Sargat phase, according to most regional scholars, a general stabilisation period for the Trans-Ural region can be inferred. The widespread Sargat culture became a vital component in the ever-increasing activity of long distance trade and exchange networks associated with the larger Eurasian geographical sphere. It is at this time that artefacts reflecting Roman, Chinese, Hunnic and Central Asian origins or influences appeared in several Sargat settlement and mortuary contexts in the east and west of the Trans-Ural region. Some scholars have emphasised the important role of the Sargat phase in the early development of a long distance trade route between eastern and western Eurasia, one that perhaps may have provided an early foundation for the later Silk Road trade route (Koryakova 1988; 1998a, 215; Matveeva 1993b; 2000, 76). According to Koryakova and Daire, concerning imported objects in the Sargat region, “about 25% derive from the south, about 15% come from the eastern (Hunnic) world and about 10% from the west” (1997, 171). The increasing importance of the Sargat groups within the greater regional dynamics signalled a stronger orientation towards the southern Central Asian area and the nomadic groups that occupied this region. The material record, reflected in the settlements and mortuary sites of the Sargat culture, yields a strong connection in both artefacts as well as particular patterns of mortuary practices. Some of these patterns will be discussed and illustrated in more detail in the following sections of this chapter (settlement sites) and also Chapter Seven, which investigates the mortuary materials in more detail.

sargat-ananyino-sarmatian-steppe-cultural-interaction — Map showing general interaction sphere between steppe nomadic (1, 2), forest-steppe(3) and Ural Mountain metallurgical (4) population groups. Dotted arrows reflect general direction of nomadic transmigrations and double headed arrows reflect general movement of metallurgical materials (after Hanks forthcoming-b). Image from Hanks (2003).

Comments

The findings of Gnecchi-Ruscone et al. (2021) seem to be a priori in line with the traditional description of the Sargat culture as a mixture of Iranian-speaking Scythians with Ugric-speaking locals. Most Sargat-related elite samples seem as intrusive as all other sampled Iron Age nomads in terms of ancestry, mostly shared with Tasmola/Pazyryk samples, and forming a cluster intermediate between recent Iranian- and Turkic-related clines in the PCA. The distribution of haplogroups among ancient and modern populations also suggests a strong association with Iron Age and medieval nomads from the east.

The earliest sampled Sargat (Gorokhovo-related?) Shadrinsk group shows hg. Q1b-YP4010, which has been reported among Cis-Baikal Neolithic populations, and the same subclade is also found in the Mt. Bitiya group, with no distinctive ancestry pattern for all four Q subclades attested. The specific cross-site Q1b-YP3953 is also reported in a later Alanic sample from North Ossetia, and is found today mainly in the Caucasus (Chechens), which suggests that they all stem from westward nomadic expansions with an ultimate origin close to Lake Baikal, but probably more directly connected to Iranian-speaking (Scytho-Siberian?) groups.

Previously published (STR-based) Sargat samples of hg. R1a are reported to the north in the Siberian forest-steppes, but they also appear in two different groups from the Gorokhovo-Sargat phase in this sampling. The deeper R1a-Y52(*?) subclade in particular has been previously reported for 2 Sarmatians, 1 Xiongnu, 1 Late Iron Age (OutTurk) steppe nomad, and 1 Turkic individual, which suggests – again – a strong association of the Sargat population with eastern nomads.

The R1b-M269 sample from the same site is intriguing, and it might point to an ancestral connection with the already known Mezhovskaya sample of hg. M269, but – based on its ancestry, phase and location – it seems more likely that it belonged to the same ancestral eastern populations as the other Q1b, R1a, N2, & N1a samples, i.e. close to the Altai, and thus ultimately related to Afanasievo, like the medieval Old Uyghurs sampled from Mongolia. This is yet another potential indirect proof of the survival of this haplogroup close to the Altai.

There is a prevalence of apparently basal N1a-L1026-related lineages, such as N-CTS9239*, N-CTS3103*, and N-Z1936*, in core Sargat areas to the east. Based on the location and ancestry of the ancient N-L1026 samples known to date, this suggests an ultimate common origin in South Siberia, with a potential direct origin among Pazyryk-related groups (where hg. N has been reported based on Y-STRs). Many of these subclades probably disappeared with the westward expansion of the Sargat horizon under increasing Y-DNA bottlenecks of elite patrilines, based on the almost exclusive N-Z1936>Y13850 subclades attested in Uyelgi and later among Turks, Early Magyars, and Ugrians. This contrasts with the diversity found in the east, like the (surprisingly homogeneous) N-B197-rich Avar elites, of a lineage later found in Mongols; or the N-M2019 samples found among Magyars, Mongols, and likely prevalent in ancient Yukaghirs and among modern Yakuts and Evenks.

Interestingly, these basal N lineages are found together with a possible N2-B482 (MF52704?) (YFull N-Y6503), which has also been reported previously for Mongolia LN (Fofonovo) and Cis-Baikal EBA, and for different (forest-)steppe samples with an evident ultimate eastern origin, including a Munkh-Khairkhan individual, a likely Pre-Scythian West Siberia LBA from Afontova Gora (ca. 920 BC), a Cimmerian of the Mezőcsát Culture in Hungary (ca. 900 BC), and a Wusun individual.

The strong bottlenecks under N1a-CTS2929/VL29 (and probably other N-L708) lineages that appeared slightly earlier in North-Eastern Europe are therefore likely to have stemmed from a quite recent, closely related intrusion of (Pre-)Scythian nomads into the Cis-Urals and the Volga River basin, whose haplogroup(s) probably became incorporated into the Ananyino culture, as could be predicted long ago (read more about the Meshchera). Such a discontinuous Iranic influence would justify the variable Old Iranian, “Pre-Alanic” (i.e. dialectal Sakan) and Alanic loanwords found among Uralic dialects around the Urals and reaching up to Proto-Finnic (read more on the Proto-Uralic homeland).

Principal component analysis. The top left shows the PCA (PC1 vs PC3) of present-day Eurasian populations, onto which the 111 new ancient individuals are projected. Individuals are grouped by site and colored according to cultural affiliation (top right legends). Labels in red in the PCA space mark outlier individuals removed from their respective cultural groups (see Extended Data Table 1). The bottom legends list the present-day Eurasian populations used to calculate the PCA, colored by language family. The new data from 96 ethnic Kazakh individuals (KZN, Kazakh_new) are colored in dark blue.

In fact, the referred admixture with “locals” among Sargat samples seems to be represented solely by the eastern outlier KOK001.A0101, who belongs to a late period (Gorokhovo-Sargat phase) and shows the same N-CTS6967 haplogroup as other, earlier samples of “non-local” ancestry from the same phase. This suggests a (limited?) practice of female exogamy, also seen still in progress 1,500 years later among the Uyelgi to the west. Its nature as a late outlier with a qualitatively different northern shift thus calls into question the origin of the extra Northeast Asian admixture found in other Sargat samples, which can only be properly assessed by comparing its ancestry with ancient populations.

These findings support what the medieval samples from Uyelgi already suggested autosomally, and what the recent Baikal sampling supported in terms of haplogroups, against previous expectations: the expansion of subclades from the N-L1026 trunk (like N-Z1936 or N-VL29) represented not some deeply split and widespread Bronze Age Circum-Urals N1a-rich hunter-gatherer population rich in “Siberian” ancestry, but the recent Iron Age intrusion of populations displaying an ancestry typical of eastern (forest-)steppe nomads. These lineages thus formed part of complex, millennia-long processes of expansions, acculturation events, and Y-DNA bottlenecks and founder effects still active around the Urals and in North-Eastern Europe during the Early Middle Ages. In that sense, it is not unlike the expansion of EEF-related ancestry and haplogroups such as G2, E1b-V13, I2, J2, etc. among different European Bronze Age, Iron Age, and medieval groups.

Furthermore, (1) the early isolation of Samoyedic from Ugric, (2) the western location of the Ugric Sprachbund around the Urals, and (3) the likely western location of Proto-Hungarian isolated from Khanty together render Parpola’s proposal that the westward expansion of the Sargat culture represented Proto-Hungarian speakers very unlikely. On the other hand, the ancestry found in Sargat is compatible with the known intense Iranian influence found in all Ugric dialects, and the eastern origin of most patrilines points to the Proto-Turkic interactions with Proto-Samoyedic, Sakan, and “Proto-Ob-Ugric”, which might be ultimately related to Janhunen’s proposal of a “Pre-Proto-Oghuric” split first attested slightly later in some Hunnic ethnonyms and anthroponyms.