Interesting report by Bernard Sécher on Anthrogenica, about the Ph.D. thesis of Samantha Brunel from Institut Jacques Monod, Paris, Paléogénomique des dynamiques des populations humaines sur le territoire Français entre 7000 et 2000 (2018).
A summary from user Jool, who was there, translated into English by Sécher (slight changes to translation, and emphasis mine):
They have a good hundred samples from the North, Alsace and the Mediterranean coast, from the Mesolithic to the Iron Age.
There is no major surprise compared to the rest of Europe. On the PCA plot, the Mesolithic are with the WHG, the early Neolithics with the first farmers close to the Anatolians. Then there is a small resurgence of hunter-gatherers that moves the Middle Neolithics a little closer to the WHGs.
From the Bronze Age, they have 5 samples with autosomal DNA, all in Bell Beaker archaeological context, which are widely spread on the PCA. One sample is very high, close to the Yamnaya, a little above the Corded Ware; two samples fall right among the Central European Bell Beakers; one is fairly low, just above the Neolithic cluster; and the last falls fully within that cluster. The most salient point was that the Y chromosomes of their 12 Bronze Age samples (all Bell Beakers) are all R1b, whereas there was no R1b in the Neolithic samples.
Finally, they have Iron Age samples that cluster on the PCA plot close to the Bronze Age samples. They could not determine whether there is continuity with the Bronze Age or a partial replacement by a genetically close population.
The sample with likely high “steppe ancestry”, clustering closer to Yamna than the Corded Ware samples do, is then probably an early East Bell Beaker individual. It probably comes from Alsace, or maybe from close to the Rhine Delta in the north, rather than from the south: we already have samples from southern France from Olalde et al. (2018) with high Neolithic ancestry, and samples from the Rhine with elevated steppe ancestry, but not that much.
This specific sample, if confirmed as one of those reported as R1b (then likely R1b-L151), as it seems from the wording of the summary, is key because it would finally link Yamna to East Bell Beaker through Yamna Hungary, all of them very “Yamnaya-like”, and therefore R1b-L151 (hence also R1b-L51) directly to the steppe, and not only to the Carpathian Basin (that is, until we have samples from late Repin or West Yamna…)
NOTE. The only alternative explanation for such elevated steppe ancestry would be an admixture between a ‘less Yamnaya-like’ East Bell Beaker + a Central European Corded Ware sample like the Esperstedt outlier + drift, but I don’t think that alternative is the best explanation of its position in the PCA closer to Yamna in any of the infinite parallel universes, so… Also, the sample from Esperstedt is clearly a late outlier likely influenced by Yamna vanguard settlers from Hungary, not the other way round…
Unexpectedly, then, fully Yamnaya-like individuals are found not only in Yamna Hungary ca. 3000-2500 BC, but also among expanding East Bell Beakers later than 2500 BC. This leaves us with unexplained, not-at-all-Yamnaya-like early Corded Ware samples from ca. 2900 BC on. An explanation based on admixture with locals seems unlikely, seeing how Corded Ware peoples continue a north Pontic cluster, being thus different from Yamna and their ancestors since the Neolithic; and how they remained that way for a long time, up to Sintashta, Srubna, Andronovo, and even later samples… A different, non-Indo-European community it is, then.
Let’s wait and see the Ph.D. thesis, when it’s published, and keep observing in the meantime the absurd reactions of denial, anger, bargaining, and depression (stages of grief) among BBC/R1b=Vasconic and CWC/R1a=Indo-European fans, as if they had lost something (?). Maybe one of these reactions is actually the key to changing reality and going back to the 2000s, who knows…
Featured image: initial expansion of the East Bell Beaker Group, by Volker Heyd (2013).
Overall, 96 samples ranging from Slovenian littoral to Lower Styria were genotyped for 713,599 markers using the OmniExpress 24-V1 BeadChips (Figure 1), genetic data were obtained from Esko et al. (2013). After removing related individuals, 92 samples were left. The Slovenian dataset has been subsequently merged with the Human Origin dataset (Lazaridis et al., 2016) for a total of 2163 individuals.
First, Y chromosome genetic diversity was assessed. A total of 52 Y chromosomes were analyzed for 195 SNPs. The majority of individuals (25, 48.1%) belong to the haplogroup R1a1a1a (R-M417) while the second major haplogroup is represented by R1b (R-M343) including 15 individuals (28.8%). Twelve samples are assigned to haplogroup I (I-M170): five and two samples belong to haplogroup I2a (I-L460) and I1 (I-M253), respectively, while the remaining five samples did not have enough information to be further assigned.
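The tallies quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the counts are the ones given in the quoted passage, while the dictionary name and its labels are illustrative choices, not anything taken from the paper.

```python
# Y-chromosome haplogroup counts as quoted above (52 males in total).
# The dictionary and its labels are illustrative, not taken from the paper.
haplogroup_counts = {
    "R1a1a1a (R-M417)": 25,
    "R1b (R-M343)": 15,
    "I2a (I-L460)": 5,
    "I1 (I-M253)": 2,
    "I (unresolved)": 5,
}

total = sum(haplogroup_counts.values())


def share(count, total):
    """Return a count as a percentage of the total, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: share(n, total) for name, n in haplogroup_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {haplogroup_counts[name]} / {total} = {pct}%")
```

The counts add up to the 52 chromosomes mentioned, and the two percentages given in the text (48.1% and 28.8%) follow directly from them.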
Considering the unbalanced sample size of the Slovenian population compared to the other populations included in the dataset, a subset of 20 Slovenian individuals randomly sampled was used.
All Slovenian samples group together with Hungarians, Czechs, and some Croatians (“Central-Eastern European” cluster) as also suggested by the PCA. All Basque individuals with few French and Spanish cluster together (“Basque” cluster) while a “Northern-European” cluster is made of the majority of French, English, Icelanders, Norwegians, and Orcadians. Five populations contributed to the “Eastern-European” cluster including Belarusians, Estonians, Lithuanians, Mordovians, and Russians. Western and South Europe is split into two clusters: the first (“Western European” cluster) includes all Spanish individuals, few French, and some Italians (North Italy) while the second (“Southern-European” cluster) groups Sicilians, Greeks, some Croatians, Romanians, and some Italians (North Italy).
Admixture Pattern and Migration
All Slovenian individuals share a common pattern of genetic ancestry, as revealed by ADMIXTURE analysis. The three major ancestry components are the North East and North West European ones (light blue and dark blue, respectively, Figure 3), followed by a South European one (dark green, Figure 3). Contributions from the Sardinians and Basque are present in negligible amounts. The admixture pattern of Slovenians mimics the one suggested by the neighboring Eastern European populations, but it is different from the pattern suggested by North Italian populations even though they are geographically close.
Using ALDER, the most significant admixture event was obtained with Russians and Sardinians as source populations and it happened 135 ± 9.31 generations ago (Z-score = 11.54). (…) When tested for multiple admixture events (MALDER), we obtained evidence for one admixture event 165.391 ± 17.1918 generations ago corresponding to ∼2620 BCE (CI: 3101–2139) considering a generation time of 28 years (Figure 4), with Kalmyk and Sardinians as sources.
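The conversion from generations to a calendar date in the quoted passage is simple arithmetic, and can be sketched as follows. The generation time of 28 years is the one stated in the quote. The reference year (2011) and the reading of the interval as plus or minus one standard error are assumptions made here because they reproduce the quoted figures; the passage itself does not state either. The function name is hypothetical, and the absence of a year zero is ignored.

```python
def generations_to_bce(generations, std_err, years_per_generation=28,
                       reference_year=2011):
    """Convert an admixture date in generations before present to BCE years.

    reference_year is an assumption (the quoted passage does not state it).
    Returns (point_estimate, older_bound, younger_bound) as BCE years,
    with the bounds placed at plus/minus one standard error.
    """
    years_ago = generations * years_per_generation
    spread = std_err * years_per_generation
    point = years_ago - reference_year
    older = years_ago + spread - reference_year
    younger = years_ago - spread - reference_year
    return round(point), round(older), round(younger)


# MALDER estimate quoted above: 165.391 +/- 17.1918 generations.
malder_date = generations_to_bce(165.391, 17.1918)
print(malder_date)
```

Under these assumptions the result is (2620, 3101, 2139), which matches the ∼2620 BCE (3101–2139) given in the quote. A different reference year or generation time shifts all three values accordingly.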
We then modeled the Slovenian population as target of admixture of ancient individuals from Haak et al. (2015) while computing the f3(Ancient 1, Ancient 2, Slovenian) statistic. The most significant signal was obtained with Yamnaya and HungaryGamba_EN (Z-score = -10.66), followed by MA1 with LBK_EN (Z-score = -9.7) and Yamnaya with Stuttgart (Z-score = -8.6) used as possible source populations (Supplementary Figure 5).
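For readers unfamiliar with the statistic, the idea behind an admixture f3 can be illustrated with a naive sketch. It computes the mean over SNPs of (c - a)(c - b), where c is the allele frequency in the target and a, b those in the two candidate sources; a clearly negative value suggests the target is a mixture of populations related to both. This is not the paper's pipeline: real analyses (e.g. with ADMIXTOOLS) add a finite-sample correction and a block jackknife to obtain Z-scores. The function name, the argument order (target first) and the toy frequencies are all made up for illustration.

```python
def f3(target, source_a, source_b):
    """Naive f3(target; A, B): mean over SNPs of (c - a) * (c - b).

    Each argument is a list of allele frequencies at the same SNPs.
    No finite-sample bias correction or standard error is computed here.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must have the same length")
    products = [(c - a) * (c - b)
                for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


# Toy allele frequencies (invented for illustration only).
source_a = [0.1, 0.9, 0.2]
source_b = [0.9, 0.1, 0.8]
# A target that is an even mixture of the two sources.
mixed = [(a + b) / 2 for a, b in zip(source_a, source_b)]

value = f3(mixed, source_a, source_b)
print(value)  # negative, as expected for an admixed target
```

In the toy case the target sits exactly between the two sources at every SNP, so each product is negative and so is the mean. With real data, drift in the target since admixture pushes the value upward, which is why only significantly negative results are read as evidence of admixture.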
We found a significant signal of admixture by using both pairs as ancient sources. Specifically, for the pair Yamnaya and Hungary_EN the admixture event is dated at 134.38 ± 23.69 generations ago (Z-score = 5.26, p-value of 1.5e-07) while for Yamnaya and LBK_EN at 153.65 ± 22.19 generations ago (Z-score = 6.92, p-value 4.4e-12). Outgroup f3 with Yamnaya put Slovenian population close to Hungarians, Czechs, and English, indicating a similar shared drift between these populations with the Steppe populations (Supplementary Figure 6).
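The relation between the Z-scores and p-values quoted above can be sketched with the standard normal distribution. The assumption here, not confirmed by the quoted passage, is that the p-values are two-sided tail probabilities of a standard normal; the function name is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a Z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


# Z-scores quoted above for the two pairs of ancient sources.
for z in (5.26, 6.92):
    print(f"Z = {z}: p ~ {two_sided_p(z):.1e}")
```

Under that assumption the two Z-scores give p-values on the order of 1e-07 and 1e-12 respectively, close to the figures in the quote; small differences would be expected from rounding of the reported Z-scores.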
Not that any of this would come as a surprise, but:
PCA keeps supporting the common cluster of certain West, South, and East Slavs in a “Central-Eastern European” cluster, distinct from the “North-Eastern European” cluster formed by modern Finno-Ugrians, as well as ancient Finno-Ugrians of north-eastern Europe who were only recently Slavicized.
Admixture supports the same ancient ‘western’ (a core West+South+East Slavic) cluster, and the admixture event with Yamna + Hungary_EN is logically a proxy for Yamna Hungary being at the core of ancestral Central-East population movements related to Bell Beakers in the mid- to late 3rd millennium.
I don’t know where exactly this impulse for the theory of Russia being the cradle of Slavs comes from today (although there are some obvious political trends to revive 19th c. ideas), but it was always clear for everyone, including Russians, that East Slavs had migrated to the east and north and assimilated indigenous Finno-Ugrians, apart from Turkic-, Iranian-, and Caucasian-speaking peoples to the east. Genetics is only confirming what was clear from other disciplines long ago.
To understand the population history and context of dairy pastoralism in the eastern Eurasian steppe, we applied genomic and proteomic analyses to individuals buried in Late Bronze Age (LBA) burial mounds associated with the Deer Stone-Khirigsuur Complex (DSKC) in northern Mongolia. To date, DSKC sites contain the clearest and most direct evidence for animal pastoralism in the Eastern steppe before ca. 1200 BCE.
Most LBA Khövsgöls are projected on top of modern Tuvinians or Altaians, who reside in neighboring regions. In comparison with other ancient individuals, they are also close to but slightly displaced from temporally earlier Neolithic and Early Bronze Age (EBA) populations from the Shamanka II cemetery (Shamanka_EN and Shamanka_EBA, respectively) from the Lake Baikal region. However, when Native Americans are added to PC calculation, we observe that LBA Khövsgöls are displaced from modern neighbors toward Native Americans along PC2, occupying a space not overlapping with any contemporary population. Such an upward shift on PC2 is also observed in the ancient Baikal populations from the Neolithic to EBA and in the Bronze Age individuals from the Altai associated with Okunevo and Karasuk cultures.
(…) two individuals fall on the PC space markedly separated from the others: ARS017 is placed close to ancient and modern northeast Asians, such as early Neolithic individuals from the Devil’s Gate archaeological site (22) and present-day Nivhs from the Russian far east, while ARS026 falls midway between the main cluster and western Eurasians.
Upper Paleolithic Siberians from nearby Afontova Gora and Mal’ta archaeological sites (AG3 and MA-1, respectively) (25, 26) have the highest extra affinity with the main cluster compared with other groups, including the eastern outlier ARS017, the early Neolithic Shamanka_EN, and present-day Nganasans and Tuvinians (Z > 6.7 SE for AG3). Main cluster Khövsgöl individuals mostly belong to Siberian mitochondrial (A, B, C, D, and G) and Y (all Q1a but one N1c1a) haplogroups.
Previous studies show a close genetic relationship between WSH populations and ANE ancestry, as Yamnaya and Afanasievo are modeled as a roughly equal mixture of early Holocene Iranian/Caucasus ancestry (IRC) and Mesolithic Eastern European hunter-gatherers, the latter of which derive a large fraction of their ancestry from ANE. It is therefore important to pinpoint the source of ANE-related ancestry in the Khövsgöl gene pool: that is, whether it derives from a pre-Bronze Age ANE population (such as the one represented by AG3) or from a Bronze Age WSH population that has both ANE and IRC ancestry.
The amount of WSH contribution remains small (e.g., 6.4 ± 1.0% from Sintashta). Assuming that the early Neolithic populations of the Khövsgöl region resembled those of the nearby Baikal region, we conclude that the Khövsgöl main cluster obtained ∼11% of their ancestry from an ANE source during the Neolithic period and a much smaller contribution of WSH ancestry (4–7%) beginning in the early Bronze Age.
Apparently, then, the first individual with substantial WSH ancestry in the Khövsgöl population (ARS026, of haplogroup R1a-Z2123), directly dated to 1130–900 BC, is consistent with the first appearance of admixed forest-steppe-related populations like Karasuk (ca. 1200-800 BC) in the Altai. Interestingly, haplogroup N1a1a-M178 pops up (with mtDNA U5a2d1) among the earlier Khövsgöl samples.
I will repeat what I wrote recently here: Samoyedic arrived in the Altai with Karasuk and hg R1a-Z645 + Steppe_MLBA-like ancestry, admixed with Altai populations, clustering thus within an Ancient Altai cline. Only later did N1a1a subclades infiltrate Samoyedic (and Ugric) populations, bringing them closer to their modern Palaeo-Siberian cline. The shared mtDNA may support an ancestral EHG-“Siberian” cline, or else a more recent Afanasievo-related origin.
Also interesting are the Q1a2 subclades and ANE ancestry making their appearance everywhere among ancestral Eurasian peoples, as Chetan recently pointed out.
Let me begin this final post on the Corded Ware—Uralic connection with an assertion that should be obvious to everyone involved in ethnolinguistic identification of prehistoric populations but, for one reason or another, is usually forgotten. In the words of David Reich, in Who We Are and How We Got Here (2018):
Human history is full of dead ends, and we should not expect the people who lived in any one place in the past to be the direct ancestors of those who live there today.
Another recurrent argument – apart from “Siberian ancestry” – for the location of the Uralic homeland is “haplogroup N”. This is as serious as saying “haplogroup R1” to refer to Indo-European migrations, but let’s explore this possibility anyway:
We now have a better idea of what many ancient migrations (previously hypothesized to be associated with westward Uralic migrations) look like in genetic terms. From Damgaard et al. (Science 2018):
These serial changes in the Baikal populations are reflected in Y-chromosome lineages (Fig. 5A; figs. S24 to S27, and tables S13 and S14). MA1 carries the R haplogroup, whereas the majority of Baikal_EN males belong to N lineages, which were widely distributed across Northern Eurasia (29), and the Baikal_LNBA males all carry Q haplogroups, as do most of the Okunevo_EMBA as well as some present-day Central Asians and Siberians.
The only N1c1 sample comes from Ust’Ida Late Neolithic, 180km to the north of Lake Baikal, which – together with the Bronze Age sample from the Kola peninsula, and the medieval sample from Ust’Ida – gives a good idea of the overall expansion of N subclades and Siberian ancestry among the Circum-Arctic peoples of Eurasia, speakers of Palaeo-Siberian languages.
What we should expect from Uralic peoples expanding with haplogroup N – seeing how Yamna expands with R1b-L23, and Corded Ware expands with R1a-Z645 – is to find a common subclade spreading with Uralic populations. Let’s see if it works like that for any N-X subclade, in data from Ilumäe et al. (2016):
Within the Eurasian circum-Arctic spread zone, N3 and N2a reveal a well-structured spread pattern where individual sub-clades show very different distributions:
N1a1-M46 (or N-TAT), formed ca. 13900 BC, TMRCA 9800 BC
N1a1a2-B187, formed ca. 9800 BC, TMRCA 1050 AD:
The sub-clade N3b-B187 is specific to southern Siberia and Mongolia, whereas N3a-L708 is spread widely in other regions of northern Eurasia.
N1a1a1a-L708, formed ca. 6800 BC, TMRCA 5400 BC.
N1a1a1a2-B211/Y9022, formed ca. 5400 BC, TMRCA 1900 BC:
The deepest clade within N3a is N3a1-B211, mostly present in the Volga-Uralic region and western Siberian Khanty and Mansi populations.
N1a1a1a1a-L392/L1026, formed ca. 4400 BC, TMRCA 2800 BC:
The neighbor clade, N3a3’6-CTS6967, spreads from eastern Siberia to the eastern part of Fennoscandia and the Baltic States
N1a1a1a1a1a-CTS2929/VL29, formed ca. 2100 BC, TMRCA 1600 BC:
In Europe, the clade N3a3-VL29 encompasses over a third of the present-day male Estonians, Latvians, and Lithuanians but is also present among Saami, Karelians, and Finns (Table S2 and Figure 3). Among the Slavic-speaking Belarusians, Ukrainians, and Russians, about three-fourths of their hg N3 Y chromosomes belong to hg N3a3.
In the post on Finno-Permic expansions, I depicted what seems to me the most likely way of infiltration of N1c-L392 lineages with Akozino warrior-traders into the western Finno-Ugric populations, with an origin around the Barents sea.
This includes the potential spread of (a minority of) N1c-B211 subclades due to contacts with Ananino on both sides of the Urals, through a northern route of forest and forest-steppe regions (equivalent to the distribution of Cherkaskul compared to Andronovo), given the spread of certain subclades in Ugric populations.
NOTE. An alternative possibility is the association of certain B211 subclades with a southern route of expansion with Pre-Scythian and Scythian populations, under whose influence the Ananino culture emerged – which would imply a very quick infiltration of certain groups of haplogroup N everywhere among Finno-Ugrics on both sides of the Urals – and also the expansion of some subclades with Turkic-speaking peoples, who apparently expanded with alliances of different peoples. Both (Scythian and Turkic) populations expanded from East Asia, where haplogroup N (including N1c) was present since the Neolithic. I find this a worse model of expansion for upper clades, but – given the YFull estimates and the presence of this haplogroup among Turkic peoples – it is a possibility for many subclades.
N1a1a1a1a2-Z1936, formed ca. 2800 BC, TMRCA 2400 BC:
The only notable exception from the pattern are Russians from northern regions of European Russia, where, in turn, about two-thirds of the hg N3 Y chromosomes belong to the hg N3a4-Z1936 – the second west Eurasian clade. Thus, according to the frequency distribution of this clade, these Northern Russians fit better among other non-Slavic populations from northeastern Europe. N3a4 tends to increase in frequency toward the northeastern European regions but is also somewhat unexpectedly a dominant hg N3 lineage among most Turkic-speaking Volga Tatars and South-Ural Bashkirs.
N1a1a1a1a4-M2019 (previously N3a2), formed ca. 4400 BC, TMRCA 1700 BC:
Sub-hg N3a2-M2118 is one of the two main bifurcating branches in the nested cladistic structure of N3a2’6-M2110. It is predominantly found in populations inhabiting present-day Yakutia (Republic of Sakha) in central Siberia and at lower frequencies in the Khanty and Mansi populations, which exhibit a distinct Y-STR pattern (Table S7) potentially intrinsic to an additional clade inside the sub-hg N3a2
The second widespread sub-clade of hg N is N2a. (…):
N1a2b-P43 (B523/FGC10846/Y3184), formed ca. 6800 BC, TMRCA ca. 2700 BC:
The absolute majority of N2a individuals belong to the second sub-clade, N2a1-B523, which diversified about 4.7 kya (95% CI = 4.0–5.5 kya). Its distribution covers the western and southern parts of Siberia, the Taimyr Peninsula, and the Volga-Uralic region with frequencies ranging from 10% to 30% and does not extend to eastern Siberia (…)
The “European” branch suggested earlier from Y-STR patterns turned out to consist of two clades
N1a2b2a-Y3185/FGC10847, formed ca. 2200 BC, TMRCA 800 BC:
N2a1-L1419, spread mainly in the northern part of that region.
N1a2b2b1-B528/Y24382, formed ca. 900 BC, TMRCA ca. 900 BC:
N2a1-B528, spread in the southern Volga-Uralic region.
We also have a good idea of the distribution of haplogroup R1a-Z645 in ancient samples. Its subclades were associated with the Corded Ware expansion, and some of them fit quite well the early expansion of Finno-Permic, Ugric, and Samoyedic peoples to the east.
This is what the modern distribution of R1a among Uralians looks like, from the latest report in Tambets et al. (2018):
Among Fennic populations, Estonians and Karelians (ca. 1.1 million) have not suffered the severe bottleneck of Finns (ca. 6-7 million), and thus show a greater proportion of R1a-Z280 than N1c subclades, which points to the original situation of Fennic peoples before their expansion. Trusting Finnish Y-DNA to draw conclusions about Uralic populations is as useful as relying on Basque Y-DNA for the language spread by R1b-P312…
Among Volga-Finnic populations, Mordovians (the closest to the original Uralic cluster, see above) show R1a as their most frequent lineage (27%).
Hungarians (ca. 13-15 million) represent the majority of Ugric (and Finno-Ugric) peoples. They are mainly R1a-Z280, also R1a-Z2123, have little N1c, and lack Siberian ancestry, and thus represent the most likely original situation of Ugric peoples in the 4th century AD (read more on Avars and Hungarians).
Among Samoyedic peoples, the Selkup – the southernmost ones and the latest to expand, that is, those not heavily admixed with Siberian populations – also have a majority of R1a-Z2123 lineages (see also here for the original Samoyedic haplogroups to the south).
To understand the relevance of Hungarians for Ugric peoples, as well as Estonians, Karelians, and Mordovians (and northern Russians, Finno-Ugric peoples recently Russified) for Finno-Permic peoples, as opposed to the Circum-Arctic and East Siberian populations, one has to put demographics in perspective. Even a modern map can show the relevance of certain territories in the past:
Summary of ancestry + haplogroups
Fennic and Samic populations seem to be clearly influenced by Palaeo-Laplandic peoples, whereas Volga-Finnic and especially Permic populations may have received gene flow from both, but essentially Palaeo-Siberian influence from the north and east.
The fact that modern Mansis and Khantys offer the highest variation in N1a subclades, and some of the highest “Siberian ancestry” among non-Nganasans, should have raised a red flag long ago. The fact that Hungarians – supposedly stemming from a source population similar to Mansis – do not offer the same amount of N subclades or Siberian ancestry (not even close), and offer instead more R1a, in common with Estonians (among Finno-Samic peoples) and Mordvins (among Volga-Finnic peoples) should have raised a still bigger red flag. The fact that Nganasans – the model for Siberian ancestry – show completely different N1a2b-P43 lineages should have been a huge genetic red line (on top of the anthropological one) to regard them as the Uralian-type population.
It is not hard to model the stepped arrival, infiltration, and/or resurgence of N subclades and “Siberian ancestries”, as well as their gradual expansion in certain regions, associated with certain migrations first – such as the expansions to the Circum-Arctic region, and later the Scythian- and Turkic-related movements – as well as limited regional developments, like the known bottleneck in Finns, or the clear late expansion of Ugric and Samoyedic languages to the north among nomadic Palaeo-Siberians due to traditions of exogamy and multilingualism. This fits quite well with the different arrival of N (N1c and xN1c) lineages to the different Uralic-speaking groups, and with the stepped appearance of “Siberian ancestry” in the different regions.
It is evident that a lot of people were too attached to the idea of Palaeolithic R1b lineages ‘native’ to western Europe speaking Basque languages; of R1a lineages speaking Indo-European and spreading with Yamna; and N lineages ‘native’ to north-eastern Europe and speaking Uralic, and this is causing widespread weeping and gnashing of teeth (instead of the joy of discovering where one’s true patrilineal ancestors come from, and what language they spoke in each given period, which is the supposed objective of genetic genealogy…)
As far as I know – and there might be many other similar pet theories out there – there have been proposals of “modern Balto-Slavic-like” populations (in an obvious circular reasoning based on modern populations) in some Scythian clusters of the Iron Age.
NOTE. I will not enter into “Balto-Slavic-like R1a” of the Late Bronze Age or earlier because no one can seriously believe at this point of development of Population Genetics that autosomal similarity predating the appearance of Slavs by 1,500+ years equates to their (ethnolinguistic) ancestral population, without a clear intermediate cultural and genetic trail – something we lack today in the Slavic case even for the late Roman period…
We also know of R1a-Z280 lineages in Srubna, probably expanding to the west. With that in mind, and knowing that Palaeo-Germanic was in close contact with Finno-Samic while both were already separated but still in contact, and that Palaeo-Germanic was also in contact with and closely related to a ‘Temematic’ distinct from Balto-Slavic (and also that early Proto-Baltic and Proto-Slavic from the Roman Iron Age and later were in contact with western Uralic), this would be the linguistic map of the Iron Age if R1a is considered to have expanded Indo-European from some kind of “patron-client” relationship with west Yamna:
My problem with this proposal is that it is obviously beholden to the notion of the uninterrupted cultural, historic and ethnic continuity in certain territories. This bias is common in historiography (von Falkenhausen 1993), but it extends even more easily into the lesser known prehistory of any territory, and now more than ever some people feel the need to corrupt (pre)history based on their own haplogroups (or the majority haplogroups of their modern countries). However, more than on philosophical grounds, my rejection is based on facts: this picture is not what the combination of linguistic, archaeological, and genetic data shows. Period.
Nevertheless, if Yamna + Corded Ware represented the “big and early expansion” of Germanic and Italo-Celtic peoples befitting the dreams of the Nazi Lebensraum and Fascist spazio vitale proposals; Uralians were Siberian hunter-gatherers that controlled the whole of eastern and northern Russia, and miraculously managed to push (ethnolinguistically) Neolithic agropastoralists to the west during and after the Iron Age, with gradual (and often minimal) genetic impact; and Balto-Slavic peoples were represented by horse riders from Pokrovka/Srubna, hiding then somewhere around the forest-steppe until after the Scythian expansion, and then spreading their language (without much genetic impact) during the early Middle Ages… so be it.
Even though proposals of an Eastern Uralic (or Ugro-Samoyedic) group are in the minority – and those who support it tend to search for an origin of Uralic in Central Asia – there is nothing wrong in supporting this from the point of view of a western homeland, because the eastward migration of both Proto-Ugric and Pre-Samoyedic peoples may have been coupled with each other at an early stage. It’s like Indo-Slavonic: it just doesn’t fit the linguistic data as well as the alternative, i.e. the expansion of Samoyedic first, different from a Finno-Ugric trunk. But, in case you are wondering about this possibility, here is Häkkinen’s (2012) phonological argument:
The case of Samoyedic is quite similar to that of Hungarian, although the earliest Palaeo-Siberian contact languages have been lost. There were contacts at least with Tocharian (Kallio 2004), Yukaghir (Rédei 1999) and Turkic (Janhunen 1998). Samoyedic also:
a) has moved far from the related languages and has been exposed to strong foreign influence
b) shares a small number of common words with other branches (from Sammallahti 1988: only 123 ‘Uralic’ words, versus 390 ‘Uralic’ + ‘Finno-Ugric’ words found in other branches than Samoyedic = 31.5%)
c) derives phonologically from the East Uralic dialect.
The phonological level is taxonomically more reliable, since it lacks the distortion caused by invisible convergence and false divergence at the lexical level. Thus we can conclude that the traditional taxonomic model, according to which Samoyedic was the first branch to split off from the Proto-Uralic unity, is just as incorrect as the view that Hungarian was the first branch to split off.
Late Uralic can be traced back to metallurgical cultures thanks to terms like PU *wäśka ‘copper/bronze’ (borrowed from Proto-Samoyedic *wesä into Tocharian); PU *äsa and *olna/*olni, ‘lead’ or ‘tin’, found in *äsa-wäśka ‘tin-bronze’; and e.g. *weŋći ‘knife’, borrowed into Indo-Iranian (through the stage of vocalization of nasals), appearing later as Proto-Indo-Aryan *wāćī ‘knife, awl, axe’.
It is known that the southern regions of the Abashevo culture developed into the Proto-Indo-Iranian-speaking Sintashta-Petrovka and Pokrovka (Early Srubna). To the north, however, Abashevo kept its Uralic nature, with continuous contacts allowing for the spread of lexicon – mainly into Finno-Ugric – and phonetic influence – mainly Uralisms into Proto-Indo-Iranian phonology (read more here).
The northern part of Abashevo (just like the south) was mainly a metallurgical society, with Abashevo metal prospectors found also side by side with Sintashta pioneers in the Zeravshan Valley, near BMAC, in search of metal ores. About the Seima-Turbino phenomenon, from Parpola (2013):
From the Urals to the east, the chain of cultures associated with this network consisted principally of the following: the Abashevo culture (extending from the Upper Don to the Mid- and South Trans-Urals, including the important cemeteries of Sejma and Turbino), the Sintashta culture (in the southeast Urals), the Petrovka culture (in the Tobol-Ishim steppe), the Taskovo-Loginovo cultures (on the Mid- and Lower Tobol and the Mid-Irtysh), the Samus’ culture (on the Upper Ob, with the important cemetery of Rostovka), the Krotovo culture (from the forest steppe of the Mid-Irtysh to the Baraba steppe on the Upper Ob, with the important cemetery of Sopka 2), the Elunino culture (on the Upper Ob just west of the Altai mountains) and the Okunevo culture (on the Mid-Yenissei, in the Minusinsk plain, Khakassia and northern Tuva). The Okunevo culture belongs wholly to the Early Bronze Age (c. 2250–1900 BCE), but most of the other cultures apparently to its latter part, being currently dated to the pre-Andronovo horizon of c. 2100–1800 BCE (cf. Parzinger 2006: 244–312 and 336; Koryakova & Epimakhov 2007: 104–105).
The majority of the Sejma-Turbino objects are of the better quality tin-bronze, and while tin is absent in the Urals, the Altai and Sayan mountains are an important source of both copper and tin. Tin is also available in southern Central Asia. Chernykh & Kuz’minykh have accordingly suggested an eastern origin for the Sejma-Turbino network, backing this hypothesis also by the depiction on the Sejma-Turbino knives of mountain sheep and horses characteristic of that area. However, Christian Carpelan has emphasized that the local Afanas’evo and Okunevo metallurgy of the Sayan-Altai area was initially rather primitive, and could not possibly have achieved the advanced and difficult technology of casting socketed spearheads as one piece around a blank. Carpelan points out that the first spearheads of this type appear in the Middle Bronze Age Caucasia c. 2000 BCE, diffusing early on to the Mid-Volga-Kama-southern Urals area, where “it was the experienced Abashevo craftsmen who were able to take up the new techniques and develop and distribute new types of spearheads” (Carpelan & Parpola 2001: 106, cf. 99–106, 110). The animal argument is countered by reference to a dagger from Sejma on the Oka river depicting an elk’s head, with earlier north European prototypes (Carpelan & Parpola 2001: 106–109). Also the metal analysis speaks for the Abashevo origin of the Sejma-Turbino network. Out of 353 artefacts analyzed, 47% were of tin-bronze, 36% of arsenical bronze, and 8.5% of pure copper. Both the arsenical bronze and pure copper are very clearly associated with the Abashevo metallurgy.
The Abashevo metal production was based on the Volga-Kama-Belaya area sandstone ores of pure copper and on the more easterly Urals deposits of arsenical copper (Figure 9). The Abashevo people, expanding from the Don and Mid-Volga to the Urals, first reached the westerly sandstone deposits of pure copper in the Volga and Kama basins, and started developing their metallurgy in this area, before moving on to the eastern side of the Urals to produce harder weapons and tools of arsenical copper. Eventually they moved even further south, to the area richest in copper in the whole Urals region, founding there the very strong and innovative Sintashta culture.
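The percentages in the metal analysis quoted above can be turned into rough artefact counts. The following is only a back-of-the-envelope sketch in Python. The three shares and the total of 353 come from the quoted text, while the variable names and the rounding are my own assumptions. It also shows that the three listed categories do not add up to 100%, so part of the analysed series is not itemised in the excerpt.

```python
# Shares quoted in the text for the 353 analysed Sejma-Turbino artefacts.
TOTAL_ARTEFACTS = 353
shares = {
    "tin-bronze": 47.0,
    "arsenical bronze": 36.0,
    "pure copper": 8.5,
}

# Approximate artefact counts implied by each percentage (rounded; an assumption,
# since the excerpt gives percentages only).
counts = {alloy: round(TOTAL_ARTEFACTS * pct / 100) for alloy, pct in shares.items()}

# Share that the text associates with Abashevo metallurgy:
# arsenical bronze plus pure copper.
abashevo_share = shares["arsenical bronze"] + shares["pure copper"]

# Share not covered by the three categories listed in the excerpt.
unlisted_share = 100.0 - sum(shares.values())

for alloy, n in counts.items():
    print(f"{alloy}: ~{n} artefacts")
print(f"Abashevo-associated alloys: {abashevo_share}%")
print(f"Not itemised in the excerpt: {unlisted_share}%")
```

So the two alloys tied to Abashevo metallurgy make up a share (44.5%) comparable to that of tin-bronze, which is the point of the argument.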
Regarding the most likely expansion of Eastern Uralic peoples:
Nataliya L’vovna Chlenova (1929–2009; cf. Korenyako & Kuz’minykh 2011) published in 1981 a detailed study of the Cherkaskul’ pottery. In her carefully prepared maps of 1981 and 1984 (Figure 10), she plotted Cherkaskul’ monuments not only in Bashkiria and the Trans-Urals, but also in thick concentrations on the Upper Irtysh, Upper Ob and Upper Yenissei, close to the Altai and Sayan mountains, precisely where the best experts suppose the homeland of Proto-Samoyed to be.
The Cherkaskul’ culture was transformed into the genetically related Mezhovka culture (c. 1500–1000 BCE), which occupied approximately the same area from the Mid-Kama and Belaya rivers to the Tobol river in western Siberia (cf. Parzinger 2006: 444–448; Koryakova & Epimakhov 2007: 170–175). The Mezhovka culture was in close contact with the neighbouring and probably Proto-Iranian speaking Alekseevka alias Sargary culture (c. 1500–900 BCE) of northern Kazakhstan (Figure 4 no. 8) that had a Fëdorovo and Cherkaskul’ substratum and a roller pottery superstratum (cf. Parzinger 2006: 443–448; Koryakova & Epimakhov 2007: 161–170). Both the Cherkaskul’ and the Mezhovka cultures are thought to have been Proto-Ugric linguistically, on the basis of the agreement of their area with that of Mansi and Khanty speakers, who moreover in their Fëdorovo-like ornamentation have preserved evidence of continuity in material culture (cf. Chlenova 1984; Koryakova & Epimakhov 2007: 159, 175).
The Mezhovka culture was succeeded by the genetically related Gamayun culture (c. 1000–700 BCE) (cf. Parzinger 2006: 446; 542–545).
From the Gamayun culture descend Trans-Urals cultures in close contact with Finno-Permic populations of the Cis-Ural region:
[Proto-Mansi] Itkul’ culture (c. 700–200 BCE) distributed along the eastern slope of the Ural Mountains (cf. Parzinger 2006: 552–556). Known from its walled forts, it constituted the principal Trans-Uralian centre of metallurgy in the Iron Age, and was in contact with both the Anan’ino and Akhmylovo cultures (the metallurgical centres of the Mid-Volga and Kama-Belaya region) and the neighbouring Gorokhovo culture.
[Proto-Hungarian] via the Vorob’evo Group (c. 700–550 BCE) (cf. Parzinger 2006: 546–549), to the Gorokhovo culture (c. 550–400 BCE) of the Trans-Uralian forest steppe (cf. Parzinger 2006: 549–552). For various reasons the local Gorokhovo people started mobile pastoral herding and became part of the multicomponent pastoralist Sargat culture (c. 500 BCE to 300 CE), which in a broader sense comprised all cultural groups between the Tobol and Irtysh rivers, succeeding here the Sargary culture. The Sargat intercommunity was dominated by steppe nomads belonging to the Iranian-speaking Saka confederation, who in the summer migrated northwards to the forest steppe.
[Proto-Khanty] Late Bronze Age and Early Iron Age cultures related to the Gamayunskoe and Itkul’ cultures that extended up to the Ob: the Nosilovo, Baitovo, Late Irmen’, and Krasnoozero cultures (c. 900–500 BCE). Some were in contact with the Akhmylovo on the Mid-Volga.
Parpola (2012) connects the expansion of Samoyedic with the Cherkaskul’ variant of Andronovo. As we know, Andronovo was genetically diverse, which speaks in favour of different groups developing similar material cultures in Central Asia.
Juha Janhunen, author of the etymological dictionary of the Samoyed languages (1977), places the homeland of Proto-Samoyedic in the Minusinsk basin on the Upper Yenissei (cf. Janhunen 2009: 72). Mainly on the basis of Bulghar Turkic loanwords, Janhunen (2007: 224; 2009: 63) dates Proto-Samoyedic to the last centuries BCE. Janhunen thinks that the language of the Tagar culture (c. 800–100 BCE) ought to have been Proto-Samoyedic (cf. Janhunen 1983: 117–118; 2009: 72; Parzinger 2001: 80 and 2006: 619–631 dates the Tagar culture c. 1000–200 BCE; Svyatko et al. 2009: 256, based on human bone samples, c. 900 BCE to 50 CE). The Tagar culture largely continues the traditions of the Karasuk culture (c. 1400–900 BCE), (…)
The use of a map of “Siberian ancestry” peaking in the arctic to show a supposedly late Uralic population movement (starting in the Iron Age!) seems to be the latest trend in population genomics:
I guess that would make this map of Neolithic farmer ancestry represent an expansion of Indo-European from the south, because Anatolia, Greece, Italy, southern France, and Iberia – where this ancestry peaks in modern populations – are among the oldest territories where Indo-European languages were recorded:
Probably not the right interpretation of this kind of simplistic data about modern populations, though…
Overall, and specifically at lower values of K, the genetic makeup of Uralic speakers resembles that of their geographic neighbours. The Saami and (a subset of) the Mansi serve as exceptions to that pattern being more similar to geographically more distant populations (Fig. 3a, Additional file 3: S3). However, starting from K = 9, ADMIXTURE identifies a genetic component (k9, magenta in Fig. 3a, Additional file 3: S3), which is predominantly, although not exclusively, found in Uralic speakers. This component is also well visible on K = 10, which has the best cross-validation index among all tests (Additional file 3: S3B). The spatial distribution of this component (Fig. 3b) shows a frequency peak among Ob-Ugric and Samoyed speakers as well as among neighbouring Kets (Fig. 3a). The proportion of k9 decreases rapidly from West Siberia towards east, south and west, constituting on average 40% of the genetic ancestry of FU speakers in Volga-Ural region (VUR) and 20% in their Turkic-speaking neighbours (Bashkirs, Tatars, Chuvashes; Fig. 3a).
However, this ‘something’ that some people occasionally find in some Uralic populations is also common to other modern and ancient groups, and not so common in some other Uralic peoples. Simply put:
I already said this in the recent publication of Siberian samples, where a renamed and radiocarbon dated Finnish_IA clearly shows that Late Iron Age Saami (ca. 400 AD) had little “Siberian ancestry”, if any at all, representing the most likely Fennic (and Samic) ancestral components before their expansion into central and northern Finland, where they admixed with circum-polar peoples of asbestos ware cultures.
I will say that again and again, any time they report the so-called “Siberian ancestry” in Uralic samples, no matter how it is defined each time: it does not seem to be that special something people are looking for, but rather (at least in a great part) a quite old ancestral component forming an evident cline with EHG, whose best proximate source are Baikal_EN (and/or Devil’s Gate) at this moment, and thus also East European hunter-gatherers for Western Uralic peoples:
So either Samara_HG, Karelia_HG, and many other groups from eastern Europe all spoke Uralic according to this ADMIXTURE graphic (and the formation of steppe ancestry in the Volga-Ural region brought the Proto-Indo-European language to the steppes through the CHG/ANE expansion), or a great part of this “Siberian ancestry” found in modern Uralic-speaking populations is not what some people would like to think it is…
Clines in a PCA can be read as traces of the expansions of ancient populations. Most recently, Flegontov et al. (2018) have attempted to do this with Asian populations:
For some Turkic groups in the Urals and the Altai regions and in the Volga basin, a different admixture model fits the data: the same West Eurasian source + Uralic- or Yeniseian-speaking Siberians. Thus, we have revealed an admixture cline between Scythians and the Iranian farmer genetic cluster, and two further clines connecting the former cline to distinct ancestry sources in Siberia. Interestingly, few Wusun-period individuals harbor substantial Uralic/Yeniseian-related Siberian ancestry, in contrast to preceding Scythians and later Turkic groups characterized by the Tungusic/Mongolic-related ancestry. It remains to be elucidated whether this genetic influx reflects contacts with the Xiongnu confederacy. We are currently assembling a collection of samples across the Eurasian steppe for a detailed genetic investigation of the Hunnic confederacies.
There are potential errors with this approach:
The main one is practical – does a modern cline represent an ancestral language? The answer is: sometimes. It depends on the anthropological context that we have, and especially on the precision of the PCA:
The ‘Europe’, ‘Middle East’, etc. clines of the above PCA do not represent one language, but many. For starters, the PCA includes too many (and modern) populations, so its precision is useless for ethnolinguistic groups. Which is the right level? Again, it depends.
The other error is one of detail of the clines drawn (which, in turn, depends on the precision of the PCA). For example, we can draw two parallel lines (or even one line, as in Flegontov et al. above) in one PCA graphic, but we still don’t have the direction of expansion. How do we know if this supposed “Uralic-speaking cline” goes from one region to the other? For that level of detail, we should examine closely modern Uralic-speaking peoples and Circum-Arctic populations:
The real ancient Uralic cluster (drawn above in blue) is thus probably from a North-East European source (probably formed by Battle Axe / Fatyanovo-Balanovo / Abashevo) to the east into Siberian populations, and to the north into Laplandic populations (see below also on Mezhovska ancestry for the drawn ‘European cline’, which some may a priori wrongly assume to be quite late).
The fact that the three formed clines point to an admixture of CWC-related populations from North-Eastern Europe, and that variation is greater at the Palaeo-Laplandic and Palaeo-Siberian extremities compared to the CWC-related one, also supports this as the correct interpretation.
However, judging by the two main clines formed, one could be alternatively inclined to interpret that Palaeo-Laplandic and Palaeo-Siberian populations formed a huge ancestral “Uralic” ghost cluster in Siberia (spanning from the Palaeo-Laplandic to the Palaeo-Siberian one), and from there expanded Finno-Samic on one hand, and “Volga-Ugro-Samoyed” on the other. That poses different problems: an obvious linguistic and archaeological one – which I assume a lot of people do not really care about – , and a not-so-obvious genetic one (see below for ancient samples and for the expansion of haplogroup N).
Unlike this PCA with ancient samples, where Bell Beaker clines could be a rough approximation to the real sources for each population, and where a cluster spanning all three depicted Early Bronze Age clusters could give a rough proximate source of European Bell Beakers in Hungary (and where one can even distinguish the Y-DNA bottlenecks in the L23 trunk created by each cline), the PCA of modern Uralic populations is probably not suitable for a good estimate of the ancient situation, which may be found shifted up or down from the drawn “Uralic” cluster along East European groups.
After all, we already know that the Siberian cline shows probably as much an ancient admixture event – from the original Uralic expansion to the east with Corded Ware ancestry – as another more recent one – a westward migration of Siberian ancestry (or even more than one). While we know with more or less exactitude what happened with the Palaeo-Laplandic admixture by expanding Proto-Finno-Samic populations (see here), the Proto-Ugric and Pre-Samoyedic populations formed probably more than one cline during the different ancient migrations through central Asia.
Apparently, the Corded Ware expansion to the east was not marked by a huge change in ancestry. While the final version of Narasimhan et al. (2018) may show a little more detail about other forest-steppe Seima-Turbino/Andronovo-related migrations (and thus also Eastern Uralic peoples), we have already had enough information for quite some time to get a good idea.
Mezhovska’s position is similar to the later Pre-Scythian and Scythian populations. There are some interesting details: apart from haplogroup R1a-Z280 (CTS1211+), there is one R1b-M269 (PF6494+), probably Z2103, and an outlier (out of three) in a similar position to the recently described central/southern Scythian clusters.
NOTE. The finding of R1b-M269 in the forest-steppe is probably either 1) from an Afanasevo-Okunevo origin, or 2) from an admixture with neighbouring Andronovo-related populations, such as Sargary. A third, maybe less likely option is that this haplogroup admixed with Abashevo directly (as it happened in Sintashta, Potapovka, or Pokrovka) and formed part of early Uralic migrations. In any case, since Mezhovska is a Bronze Age society from the Urals region, its association with R1b-Z2103 – like the association of R1b-Z2103 in Scythian clusters – cannot be attributed to “Thracian peoples”, a link which is (as I already said) too simplistic.
The drawn “European cline” of Hungarians (see above), leading from ‘west-like’ Mansi to Hungarian populations – and hosting also Finnic and Estonian samples – , cannot therefore be attributed simply to late “Slavic/Balkan-like” admixture.
Karasuk – located further to the east – is basically also Corded Ware peoples showing clearly a recent admixture with local ANE / Baikal_EN-like populations. In terms of haplogroups it shows haplogroup Q, R1a-Z2124, and R1a-Z2123, later found among early Hungarians, and present also in ancient Samoyedic populations now acculturated.
The most interesting aspect of both Mezhovska and Karasuk is that they seem to diverge from a point close to Ukraine_Eneolithic, which is the supposed ancestral source of Corded Ware peoples (read more about the formation of “steppe ancestry”). This means that Eastern Uralians derive from a source closer to Middle Dnieper/Abashevo populations, rather than Battle Axe (shifted to Latvian Neolithic), which is more likely the source prevalent in Finno-Permic peoples.
Their initial admixture with (Palaeo-)Siberian populations is thus seen already starting by this time in Mezhovska and especially in Karasuk, but this process (compared to modern populations) is incomplete:
We know now that Samic peoples expanded during the Late Iron Age into Palaeo-Laplandic populations, admixing with them and creating this modern cline. Finns expanded later to the north (in one of their known genetic bottlenecks), admixing with (and displacing) the Saami in Finland, especially replacing their male lines.
So how did Ugric and Samoyedic peoples admix with Palaeo-Siberian populations further, to obtain their modern cline? The answer is, logically, with East Asian migrations related to forest-steppe populations of Central Asia after the Mezhovska and Karasuk periods, i.e. during the Iron Age and later. Other groups from the forest-steppe in Central Asia show similar East Asian (“Siberian”) admixture. We know this from Narasimhan et al. (2018):
(…) we observe samples from multiple sites dated to 1700-1500 BCE (Maitan, Kairan, Oy_Dzhaylau and Zevakinsikiy) that derive up to ~25% of their ancestry from a source related to present-day East Asians and the remainder from Steppe_MLBA. A similar ancestry profile became widespread in the region by the Late Bronze Age, as documented by our time transect from Zevakinsikiy and samples from many sites dating to 1500-1000 BCE, and was ubiquitous by the Scytho-Sarmatian period in the Iron Age.
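To make the idea of an ancestry fraction such as the “~25%” above more concrete, here is a minimal Python sketch of a two-way mixture estimate. It assumes a simple linear model in which the target’s allele frequencies are a weighted average of two source populations, and it fits the weight by least squares. This is not the method used in the paper (qpAdm works on f4 statistics with outgroups). The allele frequencies are synthetic, and all names (`estimate_mixture_fraction`, `p_east_asian`, `p_steppe`) are hypothetical.

```python
import random


def estimate_mixture_fraction(p_target, p_source_a, p_source_b):
    """Least-squares alpha for p_target ~ alpha * p_a + (1 - alpha) * p_b."""
    numerator = 0.0
    denominator = 0.0
    for t, a, b in zip(p_target, p_source_a, p_source_b):
        numerator += (t - b) * (a - b)
        denominator += (a - b) ** 2
    if denominator == 0.0:
        raise ValueError("sources are identical; the fraction is not identifiable")
    return numerator / denominator


# Synthetic allele frequencies (an assumption for illustration, not real data).
random.seed(1)
N_SNPS = 5000
p_east_asian = [random.random() for _ in range(N_SNPS)]
p_steppe = [random.random() for _ in range(N_SNPS)]

# Build a target with 25% of source A, plus a little sampling noise,
# clipped so that frequencies stay within [0, 1].
TRUE_ALPHA = 0.25
p_target = [
    min(1.0, max(0.0, TRUE_ALPHA * a + (1 - TRUE_ALPHA) * b + random.gauss(0, 0.02)))
    for a, b in zip(p_east_asian, p_steppe)
]

alpha_hat = estimate_mixture_fraction(p_target, p_east_asian, p_steppe)
print(f"estimated fraction from source A: {alpha_hat:.3f}")
```

With real data the sources are themselves related and drifted, which is why outgroup-based methods are preferred. The sketch only shows what the reported percentage stands for.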
Flegontov: Present day Turkic speakers fall into two clusters of admixture patterns (Uralic/Yeniseian and Tungusic/Mongolic) based on genomic data with ancient Turks belonging almost exclusively to the first cluster. #ISBA8
The Ugric-speaking Sargat culture in Western Siberia shows the expected mixture of haplogroups (ca. 500 BC – 500 AD), with 5 samples of hg N and 2 of hg R1a1, in Pilipenko et al. (2017). Although radiocarbon dates and subclades are lacking, N lineages probably spread late, because of the late and gradual admixture of Siberian cultures into the Sargat melting pot.
The observed reduction in the genetic distance between the Middle Tagar population and other Scythian like populations of Southern Siberia (Fig 5; S4 Table), in our opinion, is primarily associated with an increase in the role of East Eurasian mtDNA lineages in the gene pool (up to nearly half of the gene pool) and a substantial increase in the joint frequency of haplogroups C and D (from 8.7% in the Early Tagar series to 37.5% in the Middle Tagar series). These features are characteristic of many ancient and modern populations of Southern Siberia and adjacent regions of Central Asia, including the Pazyryk population of the Altai Mountains.
Before the Iron Age, the Karasuk and Mezhovska populations were probably already somehow ‘to the north’ within the ancient Steppe-Altai cline (see image below) created by expanding Seima-Turbino- and Andronovo-related populations. During the Iron Age, further Siberian contributions with Iranian expansions must have placed Uralians of the Central Asian forest-steppe areas much closer to today’s Palaeo-Siberian cline.
However, the modern genetic picture was probably fully developed only in historic times, when Samoyedic and Ugric languages expanded to the north, only in part admixing further with Palaeo-Siberian-speaking nomads from the Circum-Arctic region (see here for a recent history of Samoyedic Enets), which justifies their more recent radical ‘northern shift’.
This late acquisition of the language by Palaeo-Siberian nomads (without much population replacement) also justifies the wide PCA clusters of very small Siberian populations. See for example in the PCA from Tambets et al. (2018):
For their relationship with modern Mansi, we have information on Hungarian conqueror populations from Neparáczki et al. (2018):
Moreover, Y, B and N1a1a1a1a Hg-s have not been detected in Finno-Ugric populations [80–84], implying that the east Eurasian component of the Conquerors and Finno-Ugric people are probably not directly related. The same inference can be drawn from phylogenetic data, as only two Mansi samples appeared in our phylogenetic trees on the side branches (S1 Fig, Networks; 1, 4) suggesting that ancestors of the Mansis separated from Asian ancestors of the Conquerors a long time ago. This inference is also supported by genomic Admixture analysis of Siberian and Northeastern European populations, which revealed that Mansis received their eastern Siberian genetic component approximately 5–7 thousand years ago from ancestors of modern Even and Evenki people. Most likely the same explanation applies to the Y-chromosome N-Tat marker which originated from China [86,87] and its subclades are now widespread between various language groups of North Asia and Eastern Europe.
The genetic picture of Hungarians (their formed cline with Mansi and their haplogroups) may be quite useful for the true admixture found originally in Mansi peoples at the beginning of the Iron Age. By now it is clear even from modern populations that Steppe_MLBA ancestry accompanied the Uralic expansion to the east (roughly approximated in the graphic with Afanasievo_EBA + Bichon_LP EasternHG_M):
An interesting aspect of the paper, hidden among so many relevant details, is a clearer picture of how the so-called Yamnaya or steppe ancestry evolved from Samara hunter-gatherers to Yamna nomadic pastoralists, and how this ancestry appeared among Proto-Corded Ware populations.
Please note: arrows of “ancestry movement” in the following PCAs do not necessarily represent physical population movements, or even ethnolinguistic change. To avoid misinterpretations, I have depicted arrows with Y-DNA haplogroup migrations to represent the most likely true ethnolinguistic movements. Admixture graphics shown are from Wang et al. (2018), and also (the K12) from Mathieson et al. (2018).
1. Samara to Early Khvalynsk
The so-called steppe ancestry was born during the Khvalynsk expansion through the steppes, probably through exogamy of expanding elite clans (eventually all R1b-M269 lineages) originally of Samara_HG ancestry. The nearest group to the ANE-like ghost population with which Samara hunter-gatherers admixed is represented by the Steppe_Eneolithic / Steppe_Maykop cluster (from the Northern Caucasus Piedmont).
Steppe_Eneolithic samples, of R1b1 lineages, are probably expanded Khvalynsk peoples, showing thus a proximate ancestry of an Early Eneolithic ghost population of the Northern Caucasus. Steppe_Maykop samples represent a later replacement of this Steppe_Eneolithic population – and/or a similar population with further contribution of ANE-like ancestry – in the area some 1,000 years later.
This is what Steppe_Maykop looks like, different from Steppe_Eneolithic:
NOTE. This admixture shows how different Steppe_Maykop is from Steppe_Eneolithic, but in the different supervised ADMIXTURE graphics below Maykop_Eneolithic is roughly equivalent to Eneolithic_Steppe (see orange arrow in ADMIXTURE graphic above). This is useful for a simplified analysis, but actual differences between Khvalynsk, Sredni Stog, Afanasevo, Yamna and Corded Ware are probably underestimated in the analyses below, and will become clearer in the future when more ancestral hunter-gatherer populations are added to the analysis.
2. Early Khvalynsk expansion
We have direct data of Khvalynsk-Novodanilovka-like populations thanks to Khvalynsk and Steppe_Eneolithic samples (although I’ve used the latter above to represent the ghost Caucasus population with which Samara_HG admixed).
We also have indirect data. First, there is the PCA with outliers:
Second, we have data from north Pontic Ukraine_Eneolithic samples (see next section).
Third, there is the continuity of late Repin / Afanasevo with Steppe_Eneolithic (see below).
3. Proto-Corded Ware expansion
It is unclear if R1a-M459 subclades were continuously in the steppe and resurged after the Khvalynsk expansion, or (the most likely option) they came from the forested region of the Upper Dnieper area, possibly from previous expansions there with hunter-gatherer pottery.
Supporting the latter is the millennia-long continuity of R1b-V88 and I2a2 subclades in the north Pontic Mesolithic, Neolithic, and Early Eneolithic Sredni Stog culture, until ca. 4500 BC (and even later, during the second half).
Only at the end of the Early Eneolithic with the disappearance of Novodanilovka (and beginning of the steppe ‘hiatus’ of Rassamakin) is R1a to be found in Ukraine again (after disappearing from the record some 2,000 years earlier), related to complex population movements in the north Pontic area.
NOTE. In the PCA, a tentative position of Novodanilovka closer to Anatolia_Neolithic / Dzudzuana ancestry is selected, based on the apparent cline formed by Ukraine_Eneolithic samples, and on the position and ancestry of Sredni Stog, Yamna, and Corded Ware later. A good alternative would be to place Novodanilovka still closer to the Balkan outliers (i.e. Suvorovo), and a source closer to EHG as the ancestry driven by the migration of R1a-M417.
The first sample with steppe ancestry appears only after 4250 BC in the forest-steppe, centuries after the samples with steppe ancestry from the Northern Caucasus and the Balkans, which points to exogamy of expanding R1a-M417 lineages with the remnants of the Novodanilovka population.
4. Repin / Early Yamna expansion
We don’t have direct data on early Repin settlers. But we do have a very close representative: Afanasevo, a population we know comes directly from the Repin/late Khvalynsk expansion ca. 3500/3300 BC (just before the emergence of Early Yamna), and which shows fully Steppe_Eneolithic-like ancestry.
Compared to this eastern Repin expansion that gave Afanasevo, the late Repin expansion to the west ca. 3300 BC that gave rise to the Yamna culture was one of colonization, evidenced by the admixture with north Pontic (Sredni Stog-like) populations, no doubt through exogamy:
This admixture is also found (in lesser proportion) in east Yamna groups, which supports the high mobility and exogamy practices among western and eastern Yamna clans, not only with locals:
We don’t have a comparison with Ukraine_Eneolithic or Corded Ware samples in Wang et al. (2018), but we do have proximate sources for Abashevo, when compared to the Poltavka population (with which it admixed in the Volga-Ural steppes): Sintashta, Potapovka, Srubna (with further Abashevo contribution), and Andronovo:
The two CWC outliers from the Baltic show what I thought was an admixture with Yamna. However, given the previous mixture of Eneolithic_Steppe in north Pontic steppe-forest populations, this elevated “steppe ancestry” found in Baltic_LN (similar to west Yamna) seems rather an admixture of Baltic sub-Neolithic peoples with a north Pontic Eneolithic_Steppe-like population. Late Repin settlers also admixed with a similar population during its colonization of the north Pontic area, hence the Baltic_LN – west Yamna similarities.
NOTE. A direct admixture with west Yamna populations through exogamy by the ancestors of this Baltic population cannot be ruled out yet (without direct access to more samples), though, because of the contacts of Corded Ware with west Yamna settlers in the forest-steppe regions.
A similar case is found in the Yamna outlier from Mednikarovo south of the Danube. It would be absurd to think that Yamna from the Balkans comes from Corded Ware (or vice versa), just because the former is closer in the PCA to the latter than other Yamna samples. The same error is also found e.g. in the Corded Ware → Bell Beaker theory, because of their proximity in the PCA and their shared “steppe ancestry”. All those theories have already been proven wrong.
NOTE. A similar fallacy is found in potential Sintashta→Mycenaean connections, where we should distinguish statistically that result from an East/West Yamna + Balkans_BA admixture. In fact, genetic links of Mycenaeans with west Yamna settlers prove this (there are some related analyses in Anthrogenica, but the site is down at this moment). To try to relate these two populations (separated more than 1,000 years before Sintashta) is like comparing ancient populations to modern ones, without the intermediate samples to trace the real anthropological trail of what is found…Pure numbers and wishful thinking.
Interesting excerpts (emphasis mine; most internal references removed):
The earliest, most secure archaeological evidence of human occupation of the region comes from the artefact-rich, high-latitude (~70° N) Yana RHS site dated to ~31.6 kya (…)
The Yana RHS human remains represent the earliest direct evidence of human presence in northeastern Siberia, a population we refer to as “Ancient North Siberians” (ANS). Both Yana RHS individuals were unrelated males, and belong to mitochondrial haplogroup U, predominant among ancient West Eurasian hunter-gatherers, and to Y chromosome haplogroup P1, ancestral to haplogroups Q and R, which are widespread among present-day Eurasians and Native Americans.
Symmetry tests using f4 statistics reject tree-like clade relationships with both Early West Eurasians (EWE; Sunghir) and Early East Asians (EEA; Tianyuan); however, Yana is genetically closer to EWE, despite its geographic location in northeastern Siberia.
Using admixture graphs (qpGraph) and outgroup-based estimation of mixture proportions (qpAdm), we find that Yana can be modelled as EWE with ~25% contribution from EEA.
Among all ancient individuals, Yana shares the most genetic drift with Mal’ta, and f4 statistics show that Mal’ta shares more alleles with Yana than with EWE (e.g. f4(Mbuti,Mal’ta;Sunghir,Yana) = 0.0019, Z = 3.99). Mal’ta and Yana also exhibit a similar pattern of genetic affinities to both EWE and EEA, consistent with previous studies. The ANE lineage can thus be considered a descendant of the ANS lineage, demonstrating that by 31.6 kya early representatives of this lineage were widespread across northern Eurasia, including far northeastern Siberia.
(…) the 9.8 kya Kolyma1 individual, representing a group we term “Ancient Paleosiberians” (AP). Our results indicate that AP are derived from a first major genetic shift observed in the region. Principal component analysis (PCA), outgroup f3-statistics and mtDNA and Y chromosome haplogroups (G1b and Q1a1a, respectively) demonstrate a close affinity between AP and present-day Koryaks, Itelmen and Chukchis, as well as with Native Americans.
For both AP and Native Americans, ANS ancestry appears more closely related to Mal’ta than Yana, therefore rejecting a direct contribution of Yana to later AP or Native American groups.
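For readers unfamiliar with statistics such as f4(Mbuti,Mal’ta;Sunghir,Yana) in the excerpts above, the following Python sketch shows how a statistic of that form and its Z-score can be computed. It assumes the usual definition, f4(A,B;C,D) as the average over SNPs of (a − b)(c − d), and uses a plain delete-one-block jackknife with equal-sized blocks for the standard error (software such as ADMIXTOOLS uses a weighted version over genomic blocks). All data are simulated, and the population labels and function names are hypothetical.

```python
import math
import random


def f4(pa, pb, pc, pd):
    """f4(A,B;C,D): mean over SNPs of (a - b) * (c - d)."""
    return sum((a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)) / len(pa)


def f4_block_jackknife(pa, pb, pc, pd, n_blocks=20):
    """Return (f4 estimate, Z-score) using a delete-one-block jackknife."""
    block = len(pa) // n_blocks
    n_used = block * n_blocks  # drop trailing SNPs so all blocks are equal-sized
    products = [(pa[i] - pb[i]) * (pc[i] - pd[i]) for i in range(n_used)]
    total = sum(products)
    estimate = total / n_used
    # Estimate recomputed with each block left out in turn.
    leave_one_out = []
    for k in range(n_blocks):
        block_sum = sum(products[k * block:(k + 1) * block])
        leave_one_out.append((total - block_sum) / (n_used - block))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return estimate, estimate / math.sqrt(variance)


def clip(x):
    return min(1.0, max(0.0, x))


# Simulated frequencies: B and D share a drift term that A and C lack.
random.seed(7)
outgroup, pop_b, pop_c, pop_d = [], [], [], []
for _ in range(4000):
    p0 = random.uniform(0.1, 0.9)      # ancestral frequency
    shared = random.gauss(0.0, 0.05)   # drift shared by B and D only
    outgroup.append(clip(p0 + random.gauss(0.0, 0.05)))
    pop_c.append(clip(p0 + random.gauss(0.0, 0.05)))
    pop_b.append(clip(p0 + shared + random.gauss(0.0, 0.05)))
    pop_d.append(clip(p0 + shared + random.gauss(0.0, 0.05)))

f4_value, z_score = f4_block_jackknife(outgroup, pop_b, pop_c, pop_d)
print(f"f4 = {f4_value:.4f}, Z = {z_score:.1f}")
```

A significantly positive value in this arrangement means that B shares more alleles with D than with C, which is how the quoted result (Mal’ta closer to Yana than to Sunghir) should be read. A |Z| above about 3 is the conventional threshold for significance.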
Lake Baikal Neolithic – Bronze Age
(…) the newly reported genomes from Ust’Belaya and recently published neighbouring Neolithic and Bronze Age sites show a succession of three distinct genetic ancestries over a ~6 ky time span. The earliest individuals show predominantly East Asian ancestry, closely related to the ancient individuals from DGC. In the early Bronze Age (BA), we observe a resurgence of AP ancestry (up to ~50% ancestry fraction), as well as influence of West Eurasian Steppe ANE ancestry represented by the early BA individuals from Afanasievo in the Altai region (~10%). This is consistent with previous reports of gene flow from an unknown ANE-related source into Lake Baikal hunter-gatherers.
Our results suggest a southward expansion of AP as a possible source, which is also consistent with the replacement of Y chromosome lineages observed at Lake Baikal, from predominantly haplogroup N in the Neolithic to haplogroup Q in the BA. Finally, the most recent individual from Ust’Belaya, dated to ~600 years ago, falls along the Neosiberian cline, similar to the ~760 year-old ‘Young Yana’ individual from northeastern Siberia, demonstrating the widespread distribution of Neosiberian ancestry in the most recent epoch.
At the western edge of northern Eurasia, genetic and strontium isotope data from ancient individuals at the Levänluhta site documents the presence of Saami ancestry in Southern Finland in the Late Holocene 1.5 kya. This ancestry component is currently limited to the northern fringes of the region, mirroring the pattern observed for AP ancestry in northeastern Siberia. However, while the ancient Saami individuals harbour East Asian ancestry, we find that this is better modelled by DGC rather than AP, suggesting that AP influence was likely restricted to the eastern side of the Urals. Comparison of ancient Finns and Saami with their present-day counterparts reveals additional gene flow over the past 1.6 kya, with evidence for West Eurasian admixture into modern Saami. The ancient Finn from Levänluhta shows lower Siberian ancestry than modern Finns.
EDIT (27 OCT 2018): By comparing the three, I see these are samples published already (at least two) in Lamnidis et al. (2018), but here with added (1) specific radiocarbon dates, (2) comparison with Neosiberian populations and (3) strontium isotope analyses.
Finnish_IA (ca. 350 AD) is probably a Saami-speaking individual, just like the Saami_IA with newly reported radiocarbon dates from Levänluhta ca. 400-600 AD (since Fennic peoples were then likely around the Gulf of Finland).
The conflicting strontium isotope data on marine dietary resources on certain samples from the supplementary material hint at possible external origin of the diet of some of the previously reported (and possibly one newly reported) Saami Iron Age individuals, from some 25-30 km. to the northwest through the river up to hundreds of km. to the southwest of Levänluhta (i.e. the whole coast of the Bothnian Sea). It is unclear why they would prefer an origin of the dietary source in southern Baltic regions instead of some km. to the west, though, unless that’s what they want to propose based on the sample’s admixture…
The coast of the Bothnian Sea (=the northern part of the Baltic Sea, between Sweden and Finland) lay only 25-30 km to the northwest, and accessible to the Iron Age people of the Levänluhta region via the Kyrönjoki river. (…) For individual JA2065/DA236, the low 87Sr/86Sr value (0.71078) would imply an exceptionally heavy reliance on Baltic Sea resources. The δ13C and δ15N values of the individual are near comparable (especially considering within-Baltic latitudinal gradients in δ13C; Torniainen et al. 2017) to the δ13C and δ15N values of a Middle Neolithic population on the Baltic island of Gotland (Eriksson, 2004) interpreted to have subsisted primarily on seals.
These new data on the samples give us some more information than what we already had, because the early date of Finnish_IA implies that there was little East Asian admixture (if any at all) in west Finland during the Roman Iron Age, which pushes still farther forward in time the expected appearance of Siberian ancestry among Saamic (first) and Fennic populations (later). It is unclear whether this East Asian ancestry found in Finnish_IA is actually related to DGC, or it is rather related to the ENA-like ancestry found already in Baltic hunter-gatherers (i.e. in some EHG samples from Karelia), for which Baikal_EN is a good proxy in Lazaridis et al. (2018).
The paper thus finds increased (probably the actual) Siberian ancestry in modern Finns compared to this Iron Age Saami individual. Coupled with the later Saami Iron Age samples, dated one to three centuries later and showing the start of Siberian ancestry influx, we can begin to establish when the expansion of Siberian ancestry happened in central Finland, and thus quite likely when the Saami began to expand to the north and east and admix with Palaeo-Laplandic peoples.
One sample of haplogroup N1a1a1a1a4a1-M1982, Yana_MED, is found in the Arctic region (north-eastern Yakutia) ca. 1100 AD. Since it is derived from N1a1a1a1a-L392, it might be a surprise for some to find it in a clearly non-Uralic speaking environment at the same time other subclades of this haplogroup were admixing in the west with well-established Finno-Saamic, Volga-Finnic, Ugric, and Samoyedic populations…
On the growing doubts that these data – contradicting the CWC=IE theory – are creating among geneticists (from the supplementary materials):
The Proto-Saami language evolved in southern Finland and Karelia in the Early Iron Age, an area now host to Finnish and the closely related Karelian, but with Saami toponyms showing that the latter two languages are intrusive here (Saarikivi 2004). Saami-speaking populations are thought to have retreated to Lapland during the Middle Iron Age (300–800 AD), where it diverged into the modern Saami dialects. Genetically, the northward retreat of the Saami language correlates with the documented decrease of Saami ancestry in Southern Finland between the Iron Age and the modern period (cf. Lamnidis et al. 2018).
On the way to Lapland, the Saami replaced at least two linguistically obscure groups. This can be inferred from 1) an influx of non-Uralic loanwords into Proto-Saami in the Finnish Lakeland area, and 2) an influx of non-Uralic, non-Germanic words into Saami dialects in Lapland (Aikio 2012). Both of these borrowing events imply contact with non-Saami-speaking groups, e.g. non-Uralic-speaking hunter-gatherers that may have left a genetic and linguistic footprint on modern Saami populations.
The linguistic prehistory of Finland thus does not allow for a straightforward interpretation of the genetic data. The detection of East Asian ancestry in the genetically Saami individual is indicative of a population movement from the east (cf. Lamnidis et al. 2018, Rootsi et al. 2007), one that given the affinities with the ~7.6 ky old individuals from the Devil’s Gate Cave may have been a western extension of the Neosiberian turnover. However, it remains unclear whether this gene flow should be associated with the arrival of Uralic speakers, thus providing further support for a Uralic homeland in Eastern Eurasia, or with an earlier immigration of pre-Uralic, so-called “Paleo-Lakelandic” groups.
I think the genetic interpretation is already straightforward, though. We had a sneak peek at how this late admixture with non-Uralians (mainly Palaeo-Lakelandic and Palaeo-Laplandic peoples from Lovozero and related asbestos ware cultures) is going to unfold among expanding Saami-speaking populations thanks to Lamnidis et al. (2018):
Also, still no trace of R1a in far East Asia (reported as M17 ca. 5300 BC near Lake Baikal by Moussa et al. 2016), so I still have doubts about my previous assessment that R1a split into M17 (and thus also M417) in Siberia, with those expanding hunter-gatherer pottery.
It has been known for a long time that the Caucasus must have hosted many (at least partially) isolated populations, probably helped by geographical boundaries, setting it apart from open Eurasian areas.
David Reich writes in his book the following about India:
The genetic data told a clear story. Around a third of Indian groups experienced population bottlenecks as strong or stronger than the ones that occurred among Finns or Ashkenazi Jews. We later confirmed this finding in an even larger dataset that we collected working with Thangaraj: genetic data from more than 250 jati groups spread throughout India (…)
Rather than an invention of colonialism as Dirks suggested, long-term endogamy as embodied in India today in the institution of caste has been overwhelmingly important for millennia. (…)
The Han Chinese are truly a large population. They have been mixing freely for thousands of years. In contrast, there are few if any Indian groups that are demographically very large, and the degree of genetic differentiation among Indian jati groups living side by side in the same village is typically two to three times higher than the genetic differentiation between northern and southern Europeans. The truth is that India is composed of a large number of small populations.
There is little doubt now, based on findings spanning thousands of years, that the Mesolithic and Neolithic Caucasus hosted various very small populations, even if the ancestral components may be reduced to the few known to date (such as ANE, EHG, AME*, ENA, CHG, and other “deep” ancestral components).
NOTE. I will call the ancestral component of Dzudzuana/Anatolian hunter-gatherers Ancient Middle Easterner (AME), to give a clear idea of its likely extension during the Late Upper Palaeolithic, and to avoid using the more simplistic Dzudzuana, unless it is useful to mention these specific local samples.
Genetic labs have a strong fixation with ancestry. I guess the use of complex statistical methods gives professionals and laymen alike the feeling of dealing with “Science”, as opposed to academic fields where you have to interpret data. I think language reveals a lot about the way people think, and the fact that ancestral components are called ‘lineages’ – while not wrong per se – is a clear symptom of the lack of interest in the true lineages: Y-DNA haplogroups.
It has become quite clear that male-biased migrations are often the ones which can be confidently followed for actual population movements and ethnolinguistic identification, at least until the Iron Age. The frequently used Palaeolithic clusters offer a clear example of why ancestry does not represent what some people believe: They merely give a basic idea of sizeable population replacements by distant peoples.
Both concepts are important: sizeable and distant peoples. For example, during the Upper Palaeolithic in Europe there was a sizeable population replacement of the Aurignacian Goyet cluster by the Gravettian Vestonice cluster (probably from populations of far eastern Russia) coupled with the arrival of haplogroup I, although during the thousands of years that this material culture lasted, the previously expanded C1a2 lineages did not disappear, and there were probably different resurgence and admixture events.
Haplogroup I certainly expanded with the Gravettian culture to Iberia, where the Goyet ancestry did not change much (probably because of male-driven migrations), to the extent that during the Magdalenian expansions haplogroup I expanded with an ancestry closer to Goyet, in what is called a ‘resurgence’ of the Goyet cluster, even though there is a clear replacement of male lines.
The Villabruna (WHG) cluster is another good example. It probably spread with haplogroup R1b-L754, which – based on the extra ‘East Asian’ affinity of some samples and on modern samples from the Middle East – came probably from the east through a southern route, and not too long before the expansion of WHG likely from around the Black Sea, although this is still unclear. The finding of haplogroup I in samples of mostly WHG ancestry could confuse people that do not care about timing, sub-structured populations, and gene flow.
NOTE. If you don’t understand why ‘clusters’ that span thousands of years don’t really matter for the many Palaeolithic population expansions that certainly happened among hunter-gatherers in Europe, just take a look at what happened with Bell Beakers expanding from Yamna into western Europe within 500 years.
If we don’t tread carefully when talking about population migrations, these terms are bound to confuse people. Just as the fixation on “steppe ancestry” – which marks the arrival in Chalcolithic Europe of peoples from the Pontic-Caspian region – has confused a lot of researchers to this day.
When I began to write about the Indo-European demic diffusion model, my concern was to find a single spot where a North-West Indo-European proto-language could have expanded from ca. 2000 BC (our most common guesstimate). Based on the 2015 papers, and in spite of their conclusions, I thought it had become clear that Corded Ware was not it, and it was rather Bell Beakers. I assumed that Uralic was spoken to the north (as was the traditional belief), and thus Corded Ware expanded from the forest zone, hence steppe ancestry would also be found there with other R1a lineages.
With the publication of Mathieson et al. (2017) and Olalde et al. (2017), I changed my mind, seeing how “steppe ancestry” did in fact appear quite late, hence it was likely to be the result of very specific population movements, probably directly from the Caucasus. Later, Mathieson published in a revision the sample from Alexandria of hg R1a-M417 (probably R1a-Z645, possibly Z93+), which further supported the idea that the migration of Corded Ware peoples started near the North Pontic forest-steppe (as I included in the next revision).
The question remains the same I repeated recently, though: where do the extra Caucasus components (i.e. beyond EHG) of Eneolithic Ukraine/Corded Ware and Khvalynsk/Yamna come from?
Considering 2-way mixtures, we can model Karelia_HG as deriving 34 ± 2.8% of its ancestry from a Villabruna-related source, with the remainder mainly from ANE represented by the AfontovaGora3 (AG3) sample from Lake Baikal ~17kya.
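As a small worked example of what a 2-way model implies: since the two shares must add up to 100%, the "remainder" and its standard error follow directly from the reported Villabruna-related figure. The sketch below is only arithmetic on the quoted numbers; the function name is my own and hypothetical, not anything from the paper.

```python
def two_way_remainder(share_pct, se_pct):
    """In a 2-way mixture the shares sum to 100%, so the second source
    gets the complement, with the same standard error as the first."""
    if not 0 <= share_pct <= 100:
        raise ValueError("share must be a percentage between 0 and 100")
    return 100.0 - share_pct, se_pct


# Figures quoted above for Karelia_HG: 34 ± 2.8% Villabruna-related.
ane_share, ane_se = two_way_remainder(34.0, 2.8)
print(f"ANE-related (AG3-like) share: {ane_share:.1f} ± {ane_se:.1f}%")
```

So Karelia_HG comes out as roughly two thirds ANE-related in this model, which is why AG3 matters so much for the discussion that follows.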
AG3 was likely of haplogroup Q1a (as reported by YFull, see Genetiker), and probably the ANE ancestry found in Eastern Europe accompanied a Palaeolithic migration of Q1a2-M25 (formed ca. 22600 BC, TMRCA ca. 14300 BC).
Combined with what we know about the Eneolithic Steppe and Caucasus populations, it is likely that ANE ancestry remained the most important component of some of the small ghost populations of the Caucasus until their emergence with the Lola culture.
The first sample we have now attributed to the EHG cluster is Sidelkino, from the Samara region (ca. 9300 BC), mtDNA U5a2. In Damgaard et al. (Science 2018), Yamnaya could be modelled as a CHG population related to Kotias Klde (54%) and the remainder from an ANE population related to Sidelkino (>46%), with the following split events:
A split event, where the CHG component of Yamnaya splits from KK1. The model inferred this time at 27 kya (though we note the larger models in Sections S2.12.4 and S2.12.5 inferred a more recent split time).
A split event, where the ANE component of Yamnaya splits from Sidelkino. This was inferred at about 11 kya.
A split event, where the ANE component of Yamnaya splits from Botai. We inferred this to occur 17 kya. Note that this is above the Sidelkino split time, so our model infers Yamnaya to be more closely related to the EHG Sidelkino, as expected.
An ancestral split event between the CHG and ANE ancestral populations. This was inferred to occur around 40 kya.
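The logic of "a more recent split means a closer relative" in the list above can be sketched as follows. The split times are the ones quoted; the dictionary keys and function names are labels I made up for this illustration, and this is of course not how the model itself was fitted.

```python
# Split times in thousands of years ago (kya), as quoted above.
# Keys are hypothetical labels chosen for this sketch.
SPLITS_KYA = {
    "Yamnaya CHG component vs KK1": 27,
    "Yamnaya ANE component vs Sidelkino": 11,
    "Yamnaya ANE component vs Botai": 17,
    "CHG vs ANE ancestral populations": 40,
}


def closer_relative(splits, a, b):
    """Return whichever of the two splits is more recent (smaller kya)."""
    return a if splits[a] < splits[b] else b


def recent_first(splits):
    """Split events ordered from most recent to oldest."""
    return sorted(splits, key=splits.get)


closest = closer_relative(
    SPLITS_KYA,
    "Yamnaya ANE component vs Sidelkino",
    "Yamnaya ANE component vs Botai",
)
print(closest)
for name in recent_first(SPLITS_KYA):
    print(f"{SPLITS_KYA[name]:>3} kya  {name}")
```

The Sidelkino split (11 kya) is more recent than the Botai one (17 kya), which is the whole basis for saying that Yamnaya's ANE-related part is closer to EHG than to Botai.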
Other samples classified as of the EHG cluster:
Popovo2 (ca. 6250 BC) of hg J1, mtDNA U4d – Po2 and Po4 from the same site (ca. 6550 BC) show continuity of mtDNA.
Karelia_HG, from Juzhnii Oleni Ostrov (ca. 6300 BC): I0211/UzOO40 of hg J1(xJ1a), mtDNA U4a; and I0061/UzOO74 of hg R1a1(xR1a1a), mtDNA C1.
UzOO77 and UzOO76 from Juzhnii Oleni Ostrov (ca. 5250 BC) of mtDNA R1b.
Samara_HG from Lebyanzhinka (ca. 5600 BC) of hg R1b1a, mtDNA U5a1d.
About the enigmatic Anatolia_Neolithic-related ancestry found in Pontic-Caspian steppe samples, this is what Wang et al. (2018) had to say:
We focused on model of mixture of proximal sources such as CHG and Anatolian Chalcolithic for all six groups of the Caucasus cluster (Eneolithic Caucasus, Maykop and Late Maykop, Maykop-Novosvobodnaya, Kura-Araxes, and Dolmen LBA), with admixture proportions on a genetic cline of 40-72% Anatolian Chalcolithic related and 28-60% CHG related (Supplementary Table 7). When we explored Romania_EN and Greece_Neolithic individuals as alternative southeast European sources (30-46% and 36-49%), the CHG proportions increased to 54-70% and 51-64%, respectively. We hypothesize that alternative models, replacing the Anatolian Chalcolithic individual with yet unsampled populations from eastern Anatolia, South Caucasus or northern Mesopotamia, would probably also provide a fit to the data from some of the tested Caucasus groups.
The first appearance of ‘Near Eastern farmer related ancestry’ in the steppe zone is evident in Steppe Maykop outliers. However, PCA results also suggest that Yamnaya and later groups of the West Eurasian steppe carry some farmer related ancestry as they are slightly shifted towards ‘European Neolithic groups’ in PC2 (Fig. 2D) compared to Eneolithic steppe. This is not the case for the preceding Eneolithic steppe individuals. The tilting cline is also confirmed by admixture f3-statistics, which provide statistically negative values for AG3 as one source and any Anatolian Neolithic related group as a second source.
Detailed exploration via D-statistics in the form of D(EHG, steppe group; X, Mbuti) and D(Samara_Eneolithic, steppe group; X, Mbuti) show significantly negative D values for most of the steppe groups when X is a member of the Caucasus cluster or one of the Levant/Anatolia farmer-related groups (Supplementary Figs. 5 and 6). In addition, we used f- and D-statistics to explore the shared ancestry with Anatolian Neolithic as well as the reciprocal relationship between Anatolian- and Iranian farmer-related ancestry for all groups of our two main clusters and relevant adjacent regions (Supplementary Fig. 4). Here, we observe an increase in farmer-related ancestry (both Anatolian and Iranian) in our Steppe cluster, ranging from Eneolithic steppe to later groups. In Middle/Late Bronze Age groups especially to the north and east we observe a further increase of Anatolian farmer related ancestry consistent with previous studies of the Poltavka, Andronovo, Srubnaya and Sintashta groups and reflecting a different process not especially related to events in the Caucasus.
(…) Surprisingly, we found that a minimum of four streams of ancestry is needed to explain all eleven steppe ancestry groups tested, including previously published ones (Fig. 2; Supplementary Table 12). Importantly, our results show a subtle contribution of both Anatolian farmer-related ancestry and WHG-related ancestry (Fig.4; Supplementary Tables 13 and 14), which was likely contributed through Middle and Late Neolithic farming groups from adjacent regions in the West.
The discovery of a quite old AME ancestry has probably rendered this explanation unnecessary, because this admixture from an Anatolian-like ghost population could have been driven even by small populations from the Caucasus.
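For readers unfamiliar with the admixture f3- and D-statistics mentioned in the quoted passage, here is a minimal sketch of what they measure, using the standard allele-frequency definitions. Everything in it is an assumption for illustration: the four frequency vectors are invented toy values (not data from any paper), the function names are mine, and I leave out the finite-sample corrections and block-jackknife standard errors that real analyses rely on.

```python
def f3(target, source_a, source_b):
    """f3(target; A, B): mean over SNPs of (c - a) * (c - b).
    A clearly negative value suggests the target is a mix of
    populations related to A and B. No sample-size correction here."""
    terms = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(terms) / len(terms)


def d_stat(p1, p2, p3, p4):
    """D(P1, P2; P3, P4) from allele frequencies.
    Negative when P2 shares more alleles with P3 than P1 does."""
    rows = list(zip(p1, p2, p3, p4))
    num = sum((w - x) * (y - z) for w, x, y, z in rows)
    den = sum((w + x - 2 * w * x) * (y + z - 2 * y * z) for w, x, y, z in rows)
    return num / den


# Toy frequencies at four SNPs (hypothetical, for illustration only).
freq_a = [0.1, 0.9, 0.2, 0.8]    # stands in for an EHG-like source
freq_b = [0.9, 0.1, 0.8, 0.2]    # stands in for a farmer-like source
freq_mix = [0.5 * a + 0.5 * b for a, b in zip(freq_a, freq_b)]  # 50/50 mix
freq_out = [0.0, 0.0, 0.0, 0.0]  # stands in for an outgroup

print("f3(mix; A, B)     =", round(f3(freq_mix, freq_a, freq_b), 3))
print("D(A, mix; B, out) =", round(d_stat(freq_a, freq_mix, freq_b, freq_out), 3))
```

Both values come out negative for the constructed mixture, which is the same pattern the paper reports for the steppe groups in statistics of the form D(EHG, steppe group; X, Mbuti). Note that a negative value only says that some related ancestry is shared; it does not tell us by which route it arrived, which is precisely the problem discussed here.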
While it is not yet fully clear, the increased Anatolian_Neolithic-like ancestry in Ukraine_Eneolithic samples (see below) makes it unlikely that all such ancestry in Corded Ware groups comes from a GAC-related contribution. It is likely that at least part of it represents contributions from populations of the Caucasus, based on the mostly westward population movements in the steppe from ca. 4600 BC on, including the Suvorovo-Novodanilovka expansion, and especially the Kuban-Maykop expansion during the final Eneolithic into the North Pontic area.
NOTE. Since CHG-like groups from the Caucasus may have combinations of AME and ANE ancestry similar to Yamna (which may thus appear as ‘steppe ancestry’ in the North Pontic area), it is impossible to interpret with precision the following ADMIXTURE graphic:
The East Asian contribution to WHG samples (like Loschbour or La Braña), as specified in Fu et al. (2016), does not seem to be related to Baikal_EN, and appears possibly (in the ADMIXTURE analysis) integrated into the Villabruna component. I guess this implies that the shared alleles with East Asians are quite early, and potentially due to the expansion of R1b-L754 from the East.
It would be interesting to know the specific material culture Sidelkino belonged to (i.e. whether it was related to the expansion of the North-Eastern Technocomplex), and its Y-DNA. The Post-Swiderian expansion into eastern Europe, probably associated with the expansion of R1b-P297 lineages (including R1b-M73, found later in Botai and in Baltic HG) is supposed to have begun during the 11th millennium BC, but migrations to the Urals and beyond are probably concentrated in the 9th millennium, so this sample is possibly slightly early for R1b.
NOTE. User Rozenfeld at Anthrogenica posted this, which I think is interesting (in case anyone wants to try a Y-SNP call):
there is something strange with Sidelkino EHG: first, its archaeological context is not described in the supplementary. Second, its sex is not listed in the supplementary tables. Third, after looking for info about this sample, I found that: “Сиделькино-3. Для снятия вопроса о половой принадлежности индивида была проведена генетическая экспертиза, выявившая принадлежность останков мужчине.”(translation: Sidelkino-3. To resolve the question about sex of the remains, the genetic analysis was conducted, which showed that remains belonged to male), source: http://static.iea.ras.ru/books/7487_Traditsii.pdf
So either they haven’t mentioned his Y-DNA in the paper for some reason, or there are more than one Sidelkino sample and the male one has not yet been published. The coverage of the Sidelkino sample from the paper is 2.9, more than enough to tell Y-DNA haplogroup.
My speculative guess right now about specific population movements in far eastern Europe, based on the few data we have:
The expansion of the North-Eastern Technocomplex first around the 9th millennium BC, most likely expanded R1b-P297 ca. 11300 BC, judging by its TMRCA, with both R1b-M73 (TMRCA ca. 5300 BC) and R1b-M269 (TMRCA ca. 4400 BC) info (with extra El Mirón ancestry) back, and thus Eurasiatic.
The expansion of haplogroup J1 to the north may have happened before or after the R1b-P297 expansion. Judging by the increase in AG3-related ancestry near Karelia compared to Baltic_HG, it is possible that it expanded just after R1b-P297 (hence possibly J1-Y6304? TMRCA 9700 BC). Its long-lasting presence in the Caucasus is supported by the Satsurblia (ca. 11300 BC) and the Dolmen BA (ca. 1300 BC) samples.
The expansion of R1a-M17 ca. 6600 BC is still likely to have happened from the east, based on the R1a-M17 samples found in Baikalic cultures slightly later (ca. 5300 BC). The presence of elevated Baikal_EN ancestry in Karelia HG and in Samara HG, and the finding of R1a-M417 samples in the Forest Zone after the Mesolithic suggest a connection with the expansion of Hunter-Gatherer pottery, from the Elshanka culture in the Samara region northward into the Forest Zone and westward into the North Pontic area.
The expansion of R1b-M73 ca. 5300 BC is likely to be associated with the emergence of a group east of the Urals (related to the later Botai culture, and potentially Pre-Yukaghir). Its presence in a Narva sample from Donkalnis (ca. 5200 BC) suggests either an early split and spread of both R1b-P297 lineages (M73 and M269) through Eastern Europe, or maybe a back-migration with hunter-gatherer pottery.
R1b-M269 spread successfully ca. 4400 BC (and R1b-L23 ca. 4100 BC, both based on TMRCA), and this successful expansion is probably to be associated with the Khvalynsk-Novodanilovka expansion. We already know that Samara_HG ca. 5600 BC was R1b1a, so it is likely that R1b-M269 appeared (or ‘resurged’) in the Volga-Ural region shortly after the expansion of R1a-M17, whose expansion through the region may be inferred from the additional AG3 and Baikal_EN ancestry. What is interesting in Samara_HG compared to the previous Sidelkino sample is the introduction of more El Mirón-related ancestry, typical of WHG populations (and thus characteristic of Baltic groups).
NOTE. The TMRCA dates are obviously gross approximations, because a) the actual rate of mutation is unknown and b) TMRCA estimates are based on the convergence of lineages that survived. The potential finding of R1a-Z645 (possibly Z93+) in Ukraine Eneolithic (ca. 4000 BC), and the potential finding of R1b-L23 in Khvalynsk ca. 4250 BC complicates things further, in terms of dates and origins of any subclade.
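To keep the proposed sequence straight, the approximate dates given in the list above can be put in order with a few lines of code. This is just bookkeeping on the figures already mentioned (all of them rough, as the note says); the names `EVENTS_BC`, `chronological` and `gap_years` are hypothetical helpers of mine.

```python
# Approximate dates in years BC, taken from the list above.
# TMRCA-based, so gross approximations; J1-Y6304 is only a tentative guess.
EVENTS_BC = {
    "R1b-P297": 11300,
    "J1-Y6304": 9700,
    "R1a-M17": 6600,
    "R1b-M73": 5300,
    "R1b-M269": 4400,
    "R1b-L23": 4100,
}


def chronological(events):
    """Oldest first: a larger BC year means an earlier date."""
    return sorted(events, key=events.get, reverse=True)


def gap_years(events, earlier, later):
    """Years elapsed between two events (positive if 'earlier' is older)."""
    return events[earlier] - events[later]


for name in chronological(EVENTS_BC):
    print(f"ca. {EVENTS_BC[name]:>5} BC  {name}")
print("R1a-M17 to R1b-M269 gap:", gap_years(EVENTS_BC, "R1a-M17", "R1b-M269"), "years")
```

Even with generous error margins, the roughly two millennia separating the proposed R1a-M17 and R1b-M269 expansions leave room for several distinct population movements through the Samara region.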
The question thus remains as it was long ago: did R1b-M269 lineages expand (‘return’) from the east, near the Urals, or directly from the north? Were they already near Samara at the same time as the expansion of hunter-gatherer pottery, and were not much affected by it? Or did they ‘resurge’ from populations admixed with Caucasus-related ancestry after the expansion of R1a-M17 with this pottery (since there are different stepped expansions from the Samara region)? We could even ask, did R1a-M17 really expand from the east, i.e. are the dates on Baikalic subclades from Moussa et al. (2016) reliable? Or did R1a-M17 expand from some pockets in the Pontic-Caspian steppe, taking over the expansion of HG pottery at some point?
The most interesting aspect from the new paper (regarding Indo-Uralic migrations) is that Ancestral Middle Easterner ancestry will probably be a better proxy for the Anatolia_Neolithic component found in Ukraine Mesolithic to Eneolithic, and possibly also for some of the “more CHG-like” component found among Pontic-Caspian steppe populations, all likely derived from different admixture events with groups from the Caucasus.
NOTE. Even the supposed gene flow of Neolithic Iranian ancestry into the Caucasus can be put into question, since that means possibly a Dzudzuana-like population with greater “deep ancestry” proportion than the one found in CHG, which may still be found within the Caucasus.
If it was not clear already that following ‘steppe ancestry’ wherever it appears is a rather lame way of following Indo-European migrations, every single sample from the Caucasus and their admixture with Pontic-Caspian steppe populations will probably show that “steppe ancestry” is in fact formed by a variety of steppe-related ancestral components, impossible to follow coherently with a single population. Exactly what is happening already with the Siberian ancestry.
If the paper on the Dzudzuana samples has shown anything, it is that the expansion of an ANE-like population shook the entire Caucasus area up to the Zagros Mountains, creating the ANE-AME cline represented by CHG and Iran_N, with further contributions of “deep ancestries” (probably from the south) complicating the picture further.
If this happens with few known samples, and we know of an ANE-like ghost population in the Caucasus (appearing later in the Lola culture), we can already guess that the often repeated “CHG component” found in Ukraine_Eneolithic and Khvalynsk will not be the same (except the part mediated by the Novodanilovka expansion).
This ANE-like expansion happened probably in the Late Upper Palaeolithic, and reached Northern Europe probably after the expansion of the Villabruna cluster (ca. 12000 BC), judging by the advance of AG3-like and ENA-like ancestry in later WHG samples.
The population movements during the Mesolithic and Early Neolithic in the North Pontic area are quite complicated: the extra AME ancestry is probably connected to the admixture with populations from the Caucasus, while the close similarity of Ukraine populations with Scandinavian ones (with an increase in Villabruna ancestry from Mesolithic to Neolithic samples) probably reveals population movements related to the expansion of Maglemose-related groups.
These Maglemose-related groups were probably migrants from the north-west, originally from the Northern European Plains, who occupied the previous Swiderian territory, and then expanded into the North Pontic area. The overwhelming presence of I2a (likely all I2a2a1b1b) lineages in Ukraine Neolithic supports this migration.
The likely picture of Mesolithic-Neolithic migrations in the North Pontic area right now is then:
Expansion of R1a-M459 from the east ca. 12000 BC – probably coupled with AG3 and also some Baikal_EN ancestry. First sample is I1819 from Vasilievka (ca. 8700 BC), another is from Dereivka ca. 6900 BC.
Expansion of R1b-V88 from the Balkans in the west ca. 9700 BC, based on its TMRCA and also the Balkan hunter-gatherer population overwhelmingly of this haplogroup from the 10th millennium until the Neolithic. First sample is I1734 from Vasilievka (ca. 7252 BC), which suggests that it replaced the male population there, based on their similar EHG-like admixture (and lack of sizeable WHG increase), and shared mtDNA U5b2, U5a2.
Expansion of I2a-Y5606 with the Janislawice culture, probably ca. 6800 BC based on its TMRCA. Supporting this is the increase in WHG contribution to Neolithic samples, including the spread of U4 subclades compared to the previous period.
Expansion of R1a-M17 starting probably ca. 6600 BC in the east (see above).
NOTE. The first sample of haplogroup I appears in the Mesolithic: I1763 (ca. 8100 BC) of haplogroup I2a1, probably related to an older Upper Palaeolithic expansion.
It is becoming more and more clear with each new paper that – unless the number of very ancient samples increases – the use of Y-chromosome haplogroups remains one of the most important tools for academics; this is especially so in the steppes, in light of the diversity found in populations from the Caucasus. A clear example comes from the Yamna – Corded Ware similarities:
The presence of haplogroups Q and R1a-M459 (xM17) in Khvalynsk along with a R1b1a sample, which some interpreted as being akin to modern ‘mixed’ populations in the past, is likely to point instead to a period of Khvalynsk-Novodanilovka expansion with R1b-M269, where different small populations from the steppe were being integrated into the common Khvalynsk stock, but where differences are seen in material culture surrounding their burials, as supported by the finding of R1b1 in the Kuban area already in the first half of the 5th millennium. The case would be similar to the early ‘mixed’ Icelandic population.
Only after the emergence of the Samara culture (in the second half of the 6th millennium BC), with a sample of haplogroup R1b1a, does the obvious connection with Early Proto-Indo-Europeans begin; and only after the appearance of late Sredni Stog and haplogroup R1a-M417 (ca. 4000 BC) is its connection with Uralic also clear. In previous population movements, I think more haplogroups were involved in migrations of small groups, and only some communities among them were eventually successful, expanding to be dominant, creating ever growing cultures during their expansions.
Indeed, if you think in terms of Uralic and Indo-European just as converging languages, and forget their potential genetic connection, then the genetic + linguistic picture becomes simplified, and the upper frontier of the 6th millennium BC with a division North Pontic (Mariupol) vs. Volga-Ural (Samara) is enough. However, tracing their movements backwards – with cultural expansions from west to east (with the expansion of farming), and earlier east to west (with hunter-gatherer pottery), and still earlier west to east (with the north-eastern technocomplex), offers an interesting way to prove their potential connection to macrofamilies, at least in terms of population movements.
I am quite convinced right now that it would be possible to connect the expansion of R1b-L754 subclades with a speculative Nostratic (given the R1b-V88 connection with Afroasiatic, and the obvious connection of R1b-P297 with Eurasiatic). Paradoxically, the connection of an Indo-Uralic community in the steppes (after the separation of Yukaghir) with any lineage expansion (R1a-M17, R1b-M269, or even Q, I or J1) seems somehow blurrier than one year ago, possibly just because there are too many open possibilities.
David Reich says about the admixture with Neanderthals, which he helped discover:
At the conclusion of the Neanderthal genome project, I am still amazed by the surprises we encountered. Having found the first evidence of interbreeding between Neanderthals and modern humans, I continue to have nightmares that the finding is some kind of mistake. But the data are sternly consistent: the evidence for Neanderthal interbreeding turns out to be everywhere. As we continue to do genetic work, we keep encountering more and more patterns that reflect the extraordinary impact this interbreeding has had on the genomes of people living today.
I think this is a shared feeling among many of us who have made proposals about anything, to fear that we have made a gross, evident mistake, and constantly look for flaws. However, it seems to me that geneticists are more preoccupied with being wrong in their developed statistical methods, in the theoretical models they are creating, and not so much about errors in the true ancient ethnolinguistic picture human population genetics is (at least in theory) concerned about. Their publications are, after all, constantly associating genetic finds with cultures and (whenever possible) languages, so this aspect of their research should not be taken lightly.
Seeing how David Anthony or Razib Khan (among many others) have changed their previously preferred migration models as new data was published, and they continue to be respected in their own fields, I guess we can be confident that professionals with integrity are going to accept whatever new picture appears. While I don’t think that genetic finds can change what we can reconstruct with comparative grammar, I am also ready to revise guesstimates and routes of expansion of certain dialects if R1a-Z645 is shown to have accompanied Late Proto-Indo-Europeans during their expansion with Yamna, and later integrated somehow with Corded Ware.
However, taking into account the obsession of some with an ancestral, uninterrupted R1a—Indo-European association, and the lack of actual political repercussion of Neanderthal admixture, I think the most common nightmare that all genetic researchers should be worried about is to keep inflating this “Yamnaya ancestry”-based hornet’s nest, which has been constantly stirred up for the past two years, by rejecting it – or, rather, specifying it into its true complex nature.
This succession of corrections and redefinitions, coupled with the distinct Y-DNA bottleneck of each steppe population, will eventually lead to a completely different ethnolinguistic picture of the Pontic-Caspian region during the Eneolithic, which is likely to eventually piss off not only reasonable academics stubbornly attached to the CWC-IE idea, but also a part of those interested in daydreaming about their patrilineal ancestors.
Sometimes it’s better to just rip off the band-aid once and for all…