Interesting excerpts (emphasis mine, changes for clarity):
Here, we report the first genome-wide data of 10 ancient individuals from northeastern Xinjiang. They are dated to around 2,200 years ago and were found at the Iron Age Shirenzigou site. We find them to be already genetically admixed between Eastern and Western Eurasians. We also find that the majority of the East Eurasian ancestry in the Shirenzigou individuals is related to northeastern Asian populations, while the West Eurasian ancestry is best presented by ∼20% to 80% Yamnaya-like ancestry. Our data thus suggest a Western Eurasian steppe origin for at least part of the ancient Xinjiang population. Our findings furthermore support a Yamnaya-related origin for the now extinct Tocharian languages in the Tarim Basin, in southern Xinjiang.
The dominant mtDNA lineages of the Shirenzigou people are commonly found in modern and ancient West Eurasian populations, such as U4, U5, and H, while they also have East Eurasian-specific haplogroups A, D4, and G3, preliminarily documenting admixed ancestry from eastern and western Eurasia.
The admixture profile is also shown on the paternal Y chromosome side that 4 out of 6 males in Shirenzigou (Figure S2) belong to the West Eurasian-specific haplogroup R1b (n = 2) and East Eurasian-specific haplogroup Q1a (n = 2), the former is predominant in ancient Yamnaya and nearly 100% in Afanasievo, different from the Middle and Late Bronze Age Steppe groups (Steppe_MLBA) such as Andronovo, [Potapovka], Srubnaya, and Sintashta whose Y chromosomal haplogroup is mainly R1a.
We first carried out principal component analysis (PCA) to assess the genetic affinities of the ancient individuals qualitatively by projecting them onto present-day Eurasian variation (Figure 2). We observed a distinct separation between East and West Eurasians. Our ancient Shirenzigou samples and present-day populations from Central Asia and northwestern China form a genetic cline from East to West in the first PC.The distribution of Shirenzigou samples on the cline is relatively scattered with two major clusters, one being closer to modern-day Uygurs and Kazakhs and the other being closer to recently published ancient Saka and Huns from the Tianshan in Kazakhstan (…).
We applied a formal admixture test using f3 statistics in the form of f3 (Shirenzigou; X, Y) where X and Y are worldwide populations that might be the genetic sources for the Shirenzigou individuals. We observed the most significant signals of admixture in the Shirenzigou samples when using Yamnaya_Samara or Srubnaya as the West Eurasian source and some Northern Asians or Koreans as the East Eurasian source (Table S1). We also plotted the outgroup f3 statistics in the form of f3 (Mbuti; X, Anatolia_Neolithic) and f3 (Mbuti; X, Kostenki14) to visualize the allele sharing between population X and Anatolian farmers. As shown in Figure S3, the Steppe_MLBA populations including Srubnaya, Andronovo, and Sintashta were shifted toward farming populations compared with Yamnaya groups and the Shirenzigou samples. This observation is consistent with ADMIXTURE analysis that Steppe_MLBA populations have an Anatolian and European farmer-related component that Yamnaya groups and the Shirenzigou individuals do not seem to have. The analysis consistently suggested Yamnaya-related Steppe populations were the better source in modeling the West Eurasian ancestry in Shirenzigou.
Genetic Composition of Iron Age Shirenzigou Individuals
We continued to use qpAdm to estimate the admixture proportions in the Shirenzigou samples by using different pairs of source populations, such as Yamnaya_Samara, Afanasievo, Srubnaya, Andronovo, BMAC culture (Bustan_BA and Sappali_Tepe_BA) and Tianshan_Hun as the West Eurasian source and Han, Ulchi, Hezhen, Shamanka_EN as the East Eurasian source. In all cases, Yamnaya, Afanasievo, or Tianshan_Hun always provide the best model fit for the Shirenzigou individuals, while Srubnaya, Andronovo, Bustan_BA and Sappali_Tepe_BA only work in some cases. The Yamnaya_Samara or Afanasievo-related ancestry ranges from ∼20% to 80% in different Shirenzigou individuals, consistent with the scattered distribution on the East-West cline in the PCA
(…) we then modeled Shirenzigou as a three-way admixture of Yamnaya_Samara, Ulchi (or Hezhen) and Han to infer the source from the East Eurasia side that contributed to Shirenzigou. We found the Ulchi or Hezhen and Han-related ancestry had a complicated and unevenly distribution in the Shirenzigou samples. The most Shirenzigou individuals derived the majority of their East Eurasian ancestry from Ulchi or Hezhen-related populations, while the following two individuals M820 and M15-2 have more Han related than Ulchi/Hezhen-related ancestry.
Also, given the apparent lack of (extra farmer ancestry that characterizes) Corded Ware ancestry, if the results were already suspicious before, how likely are now the published R1a(xZ93) and/or radiocarbon dates of the Xiaohe mummies from Li et al. (2010, 2015)? Because, after all, one should have expected in such a late date a generalized admixture with neighbouring Srubna/Andronovo-like populations.
Interesting excerpts (emphasis mine, modified for clarity):
To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago.
Maps illustrating the following texts have been made based on data from this and other papers:
Maps showing ancestry include only data from this preprint (which also includes some samples from Sigtuna).
Maps showing haplogroups of ancient DNA samples based on their age include data from all published papers, but with slightly modified locations to avoid overcrowding (randomized distance approx. ± 0.1 long. and lat.).
We find that the transition from the BA to the IA is accompanied by a reduction in Neolithic farmer ancestry, with a corresponding increase in both Steppe-like ancestry and hunter-gatherer ancestry. While most groups show a slight recovery of farmer ancestry during the VA, there is considerable variation in ancestry across Scandinavia. In particular, we observe a wide range of ancestry compositions among individuals from Sweden, with some groups in southern Sweden showing some of the highest farmer ancestry proportions (40% or more in individuals from Malmö, Kärda or Öland).
Ancestry proportions in Norway and Denmark on the other hand appear more uniform. Finally we detect an influx of low levels of “eastern” ancestry starting in the early VA, mostly constrained among groups from eastern and central Sweden as well as some Norwegian groups. Testing of putative source groups for this “eastern” ancestry revealed differing patterns among the Viking Age target groups, with contributions of either East Asian- or Caucasus-related ancestry.
Overall, our findings suggest that the genetic makeup of VA Scandinavia derives from mixtures of three earlier sources: Mesolithic hunter-gatherers, Neolithic farmers, and Bronze Age pastoralists. Intriguingly, our results also indicate ongoing gene flow from the south and east into Iron Age Scandinavia. Thus, these observations are consistent with archaeological claims of wide-ranging demographic turmoil in the aftermath of the Roman Empire with consequences for the Scandinavian populations during the late Iron Age.
Genetic structure within Viking-Age Scandinavia
We find that VA Scandinavians on average cluster into three groups according to their geographic origin, shifted towards their respective present-day counterparts in Denmark, Sweden and Norway. Closer inspection of the distributions for the different groups reveals additional complexity in their genetic structure.
We find that the ‘Norwegian’ cluster includes Norwegian IA individuals, who are distinct from both Swedish and Danish IA individuals which cluster together with the majority of central and eastern Swedish VA individuals. Many individuals from southwestern Sweden (e.g. Skara) cluster with Danish present-day individuals from the eastern islands (Funen, Zealand), skewing towards the ‘Swedish’ cluster with respect to early and more western Danish VA individuals (Jutland).
Some individuals have strong affinity with Eastern Europeans, particularly those from the island of Gotland in eastern Sweden. The latter likely reflects individuals with Baltic ancestry, as clustering with Baltic BA individuals is evident in the IBS-UMAP analysis and through f4-statistics.
Genetic clustering using IBS-UMAP suggested genetic affinities of some Viking Age individuals with Bronze Age individuals from the Baltic. To further test these, we quantified excess allele sharing of Viking Age individuals with Baltic BA compared to early Viking Age individuals from Salme using f4 statistics. We find that many individuals from the island of Gotland share a significant excess of alleles with Baltic BA, consistent with other evidence of this site being a trading post with contacts across the Baltic Sea.
The earliest N1a-VL29 sample available comes from Iron Age Gotland (VK579) ca. AD 200-400 (see Iron Age Y-DNA maps), which also proves its presence in the western Baltic before the Viking expansion. The distribution of N1a-VL29 and R1a-Z280 (compared to R1a in general) among Vikings also supports a likely expansion of both lineages in succeeding waves from the east with Akozino warrior-traders, at the same time as they expanded into the Gulf of Finland.
Vikings in Estonia
(…) only one Viking raiding or diplomatic expedition has left direct archaeological traces, at Salme in Estonia, where 41 Swedish Vikings who died violently were buried in two boats accompanied by high-status weaponry. Importantly, the Salme boat-burial predates the first textually documented raid (in Lindisfarne in 793) by nearly half a century. Comparing the genomes of 34 individuals from the Salme burial using kinship analyses, we find that these elite warriors included four brothers buried side by side and a 3rd degree relative of one of the four brothers. In addition, members of the Salme group had very similar ancestry profiles, in comparison to the profiles of other Viking burials. This suggests that this raid was conducted by genetically homogeneous people of high status, including close kin. Isotope analyses indicate that the crew descended from the Mälaren area in Eastern Sweden thus confirming that the Baltic-Mid-Swedish interaction took place early in the VA.
N1a-VL29 lineages spread again later eastwards with Varangians, from Sweden into north-eastern Europe, most likely including the ancestors of the Rurikid dynasty. Unsurprisingly, the arrival of Vikings with Swedish ancestry into the East Baltic and their dispersal through the forest zone didn’t cause a language shift of Balto-Finnic, Mordvinic, or East Slavic speakers to Old Norse, either…
NOTE. For N1a-Y4339 – N1a-L550 subclade of Swedish origin – as main haplogroup of modern descendants of Rurikid princes, see Volkov & Seslavin (2019) – full text in comments below. Data from ancient samples show varied paternal lineages even among early rulers traditionally linked to Rurik’s line, which explains some of the discrepancies found among modern descendants:
A sample from Chernihiv (VK542) potentially belonging to Gleb Svyatoslavich, the 11th century prince of Tmutarakan/Novgorod, belongs to hg. I2a-Y3120 (a subclade of early Slavic I2a-CTS10228) and has 71% “Modern Polish” ancestry (see below).
Izyaslav Ingvarevych, the 13th century prince of Dorogobuzh, Principality of Volhynia/Galicia, is probably behind a sample from Lutsk (VK541), and belongs to hg. R1a-L1029 (a subclade of R1a-M458), showing ca. 95% of “Modern Polish” ancestry.
Firstly, modern Finnish individuals are not like ancient Finnish individuals, modern individuals have ancestry of a population not in the reference; most likely Steppe/Russian ancestry, as Chinese are in the reference and do not share this direction. Ancient Swedes and Norwegians are more extreme than modern individuals in PC2 and 4. Ancient UK individuals were more extreme than Modern UK individuals in PC3 and 4. Ancient Danish individuals look rather similar to modern individuals from all over Scandinavia. By using a supervised ancient panel, we have removed recent drift from the signal, which would have affected modern Scandinavians and Finnish populations especially. This is in general a desirable feature but it is important to check that it has not affected inference.
The story for Modern-vs-ancient Finnish ancestry is consistent, with ancient Finns looking much less extreme than the moderns. Conversely, ancient Norwegians look like less-drifted modern Norwegians; the Danish admixture seen through the use of ancient DNA is hard to detect because of the extreme drift within Norway that has occurred since the admixture event. PC4 vs PC5 is the most important plot for the ancient DNA story: Sweden and the UK (along with Poland, Italy and to an extent also Norway) are visibly extremes of a distribution the same “genes-mirror-geography” that was seen in the Ancient-palette analysis. PC1 vs PC2 tells the same story – and stronger, since this is a high variance-explained PC – for the UK, Poland and Italy.
Evidence for Pictish Genomes
The four ancient genomes of Orkney individuals with little Scandinavian ancestry may be the first ones of Pictish people published to date. Yet a similar (>80% “UK ancestry) individual was found in Ireland (VK545) and five in Scandinavia, implying that Pictish populations were integrated into Scandinavian culture by the Viking Age.
Our interpretation for the Orkney samples can be summarised as follows. Firstly, they represent “native British” ancestry, rather than an unusual type of Scandinavian ancestry. Secondly, that this “British” ancestry was found in Britain before the Anglo-Saxon migrations. Finally, that in Orkney, these individuals would have descended from Pictish populations.
(…) ‘UK’ represents a group from which modern British and Irish people all receive an ancestry component. This information together implies that within the sampling frame of our data, they are proxying the ‘Briton’ component in UK ancestry; that is, a pre-Roman genetic component present across the UK. Given they were found in Orkney, this makes it very likely that they were descended from a Pictish population.
Modern genetic variation within the UK sees variation between ‘native Briton’ populations Wales, Scotland, Cornwall and Ireland as large compared to that within the more ‘Anglo-Saxon’ English. This is despite subsequent gene flow into those populations from English-like populations. We have not attempted to disentangle modern genetic drift from historically distinct populations. Roman-era period people in England, Wales, Ireland and Scotland may not have been genetically close to these Orkney individuals, but our results show that they have a shared genetic component as they represent the same direction of variation.
As in the case of mitochondrial DNA, the overall distribution profile of the Y chromosomal haplogroups in the Viking Age samples was similar to that of the modern North European populations. The most frequently encountered male lineages were the haplogroups I1, R1b and R1a.
Haplogroup I (I1, I2)
The distribution of I1 in southern Scandinavia, including a sample from Sealand (VK532) ca. AD 100 (see Iron Age Y-DNA maps) proves that it had become integrated into the West Germanic population already before their expansions, something that we already suspected thanks to the sampling of Germanic tribes.
Haplogroup R1b (M269, U106, P312)
Especially interesting is the finding of R1b-L151 widely distributed in the historical Nordic Bronze Age region, which is in line with the estimated TMRCA for R1b-P312 subclades found in Scandinavia, despite the known bottleneck among Germanic peoples under U106. Particularly telling in this regard is the finding of rare haplogroups R1b-DF19, R1b-L238, or R1b-S1194. All of that points to the impact of Bell Beaker-derived peoples during the Dagger period, when Pre-Proto-Germanic expanded into Scandinavia.
Also interesting is the finding of hg. R1b-P297 in Troms, Norway (VK531) ca. 2400 BC. R1b-P297 subclades might have expanded to the north through Finland with post-Swiderian Mesolithic groups (read more about Scandinavian hunter-gatherers), and the ancestry of this sample points to that origin.
However, it is also known that ancestry might change within a few generations of admixture, and that the transformation brought about by Bell Beakers with the Dagger Period probably reached Troms, so this could also be a R1b-M269 subclade. In fact, the few available data from this sample show that it comes from the natural harbour Skarsvågen at the NW end of the island Senja, and that its archaeologist thought it was from the Viking period or slightly earlier, based on the grave form. From Prescott (2017):
In 1995, Prescott and Walderhaug tentatively argued that a dramatic transformation took place in Norway around the Late Neolithic (2350 BCE), and that the swift nature of this transition was tied to the initial Indo-Europeanization of southern and coastal Norway, at least to Trøndelag and perhaps as far north as Troms. (…)
The Bell Beaker/early Late Neolithic, however, represents a source and beginning of these institution and practices, exhibits continuity to the following metal age periods and integrated most of Northern Europe’s Nordic region into a set of interaction fields. This happened around 2400 BCE, at the MNB to LN transition.
NOTE. This particular sample is not included in the maps of Viking haplogroups.
Among the ancient samples, two individuals were derived haplogroups were identified as E1b1b1-M35.1, which are frequently encountered in modern southern Europe, Middle East and North Africa. Interestingly, the individuals carrying these haplogroups had much less Scandinavian ancestry compared to the most samples inferred from haplotype based analysis. A similar pattern was also observed for less frequent haplogroups in our ancient dataset, such as G (n=3), J (n=3) and T (n=2), indicating a possible non-Scandinavian male genetic component in the Viking Age Northern Europe. Interestingly, individuals carrying these haplogroups were from the later Viking Age (10th century and younger), which might indicate some male gene influx into the Viking population during the Viking period.
As the paper says, the small sample size of rare haplogroups cannot distinguish if these differences are statistically relevant. Nevertheless, both E1b samples have substantial Modern Polish-like ancestry: one sample from Gotland (VK474), of hg. E1b-L791, has ca. 99% “Polish” ancestry, while the other one from Denmark (VK362), of hg. E1b-V13, has ca. 35% “Polish”, ca. 35% “Italian”, as well as some “Danish” (14%) and minor “British” and “Finnish” ancestry.
Given the E1b-V13 samples of likely Central-East European origin among Lombards, Visigoths, and especially among Early Slavs, and the distribution of “Polish” ancestry among Viking samples, VK362 is probably a close description of the typical ancestry of early Slavs. The peak of Modern Polish-like ancestry around the Upper Pripyat during the (late) Viking Age suggests that Poles (like East Slavs) have probably mixed since the 10th century with more eastern peoples close to north-eastern Europeans, derived from ancient Finno-Ugrians:
Similarly, the finding of R1a-M458 among Vikings in Funen, Denmark (VK139), in Lutsk, Poland (VK541), and in Kurevanikha, Russia (VK160), apart from the early Slav from Usedom, may attest to the origin of the spread of this haplogroup in the western Baltic after the Bell Beaker expansion, once integrated in both Germanic and Balto-Slavic populations, as well as intermediate Bronze Age peoples that were eventually absorbed by their expansions. This contradicts, again, my simplistic initial assessment of R1a-M458 expansion as linked exclusively (or even mainly) to Balto-Slavs.
As I said 6 months ago, 2019 is a tough year to write a blog, because this was going to be a complex regional election year and therefore a time of political promises, hence tenure offers too. Now the preliminary offers have been made, elections have passed, but the timing has slightly shifted toward 2020. So I may have the time, but not really any benefit of dedicating too much effort to the blog, and a lot of potential benefit of dedicating any time to evaluable scientific work.
On the other hand, I saw some potential benefit for publishing texts with ISBNs, hence the updates to the text and the preparation of these printed copies of the books, just in case. While Spain’s accreditation agency has some hard rules for becoming a tenured professor, especially for medical associates (whose years of professional experience are almost worthless compared to published peer-reviewed papers), it is quite flexible in assessing one’s merits.
However, regional and/or autonomous entities are not, and need an official identifier and preferably printed versions to evaluate publications, such as an ISBN for books. I took thus some time about a month ago to update the texts and supplementary materials, to publish a printed copy of the books with Amazon. The first copies have arrived, and they look good.
Corrections and Additions
I have changed the names and order of the books, as I intended for the first publication – as some of you may have noticed when the linguistic book was referred to as the third volume in some parts. In the first concept I just wanted to emphasize that the linguistic work had priority over the rest. Now the whole series and the linguistic volume don’t share the same name, and I hope this added clarity is for the better, despite the linguistic volume being the third one.
I have changed the nomenclature for Uralic dialects, as I said recently. I haven’t really modified anything deeper than that, because – unlike adding new information from population genomics – this would require for me to do a thorough research of the most recent publications of Uralic comparative grammar, and I just can’t begin with that right now.
Anyway, the use of terms like Finno-Ugric or Finno-Samic is as correct now for the reconstructed forms as it was before the change in nomenclature.
The most interesting recent genetic data has come from Iberia and the Mediterranean. Lacking direct data from the Italian Peninsula (and thus from the emergence of the Etruscan and Rhaetian ethnolinguistic community), it is becoming clearer how some quite early waves of Indo-Europeans and non-Indo-Europeans expanded and shrank – at least in West Iberia, West Mediterranean, and France.
Some of the main updates to the text have been made to the sections on Finno-Ugric populations, because some interesting new genetic data (especially Y-DNA) have been published in the past months. This is especially true for Baltic Finns and for Ugric populations.
Consequently, and somehow unsurprisingly, the Balto-Slavic section has been affected by this; e.g. by the identification of Early Slavs likely with central-eastern populations dominated by (at least some subclades of) hg. I2a-L621 and E1b-V13.
I have updated some cultural borders in the prehistoric maps, and the maps with Y-DNA and mtDNA. I have also added one new version of the Early Bronze age map, to better reflect the most likely location of Indo-European languages in the Early European Bronze Age.
As those in software programming will understand, major changes in the files that are used for maps and graphics come with an increasing risk of additional errors, so I would not be surprised if some major ones would be found (I already spotted three of them). Feel free to communicate these errors in any way you see fit.
I have selected more conservative SNPs in certain controversial cases.
I have also deleted most SNP-related footnotes and replaced them with the marking of each individual tentative SNP, leaving only those footnotes that give important specific information, because:
My way of referencing tentative SNP authors did not make it clear which samples were tentative, if there were more than one.
It was probably not necessary to see four names repeated 100 times over.
Often I don’t really know if the person I have listed as author of the SNP call is the true author – unless I saw the full SNP data posted directly – or just someone who reposted the results.
Sometimes there are more than one author of SNPs for a certain sample, but I might have added just one for all.
For a centralized file to host the names of those responsible for the unofficial/tentative SNPs used in the text – and to correct them if necessary -, readers will be eventually able to use Phylogeographer‘s tool for ancient Y-DNA, for which they use (partly) the same data I compiled, adding Y-Full‘s nomenclature and references. You can see another map tool in ArcGIS.
NOTE. As I say in the text, if the final working map tool does not deliver the names, I will publish another supplementary table to the text, listing all tentative SNPs with their respective author(s).
If you are interested in ancient Y-DNA and you want to help develop comprehensive and precise maps of ancient Y-DNA and mtDNA haplogroups, you can contact Hunter Provyn at Phylogeographer.com. You can also find more about phylogeography projects at Iain McDonald’s website.
I previously used certain samples prepared by amateurs from BAM files (like Botai, Okunevo, or Hittites), and the results were obviously less than satisfactory – hence my criticism of the lack of publication of prepared files by the most famous labs, especially the Copenhagen group.
Fortunately for all of us, most published datasets are free, so we don’t have to reinvent the wheel. I criticized genetic labs for not releasing all data, so now it is time for praise, at least for one of them: thank you to all responsible at the Reich Lab for this great merged dataset, which includes samples from other labs.
NOTE. I would like to make my tiny contribution here, for beginners interested in working with these files, so I will update – whenever I have time – the “How To” sections of this blog for PCAs, PCA3d, and ADMIXTURE.
For unsupervised ADMIXTURE in the maps, a K=5 is selected based on the CV, giving a kind of visual WHG : NWAN : CHG/IN : EHG : ENA, but with Steppe ancestry “in between”. Higher K gave worse CV, which I guess depends on the many ancient and modern samples selected (and on the fact that many samples are repeated from different sources in my files, because I did not have time to filter them all individually).
I found some interesting component shared by Central European populations in K=7 to K=9 (from CEU Bell Beakers to Denmark LN to Hungarian EBA to Iberia BA, in a sort of “CEU BBC ancestry” potentially related to North-West Indo-Europeans), but still, I prefer to go for a theoretically more correct visualization instead of cherry-picking the ‘best-looking’ results.
Since I made fun of the search for “Siberian ancestry” in coloured components in Tambets et al. 2018, I have to be consistent and preferred to avoid doing the same here…
In the first publication (in January) and subsequent minor revisions until March, I trusted analyses and ancestry estimates reported by amateurs in 2018, which I used for the text adding my own interpretations. Most of them have been refuted in papers from 2019, as you probably know if you have followed this blog (see very recent examples here, here, or here), compelling me to delete or change them again, and again, and again. I don’t have experience from previous years, although the current pattern must have been evidently repeated many times over, or else we would be still talking about such previous analyses as being confirmed today…
I wanted to be one step ahead of peer-reviewed publications in the books, but I prefer now to go for something safe in the book series, rather than having one potentially interesting prediction – which may or may not be right – and ten huge mistakes that I would have helped to endlessly redistribute among my readers (online and now in print) based on some cherry-picked pairwise comparisons. This is especially true when predictions of “Steppe“- and/or “Siberian“-related ancestry have been published, which, for some reason, seem to go horribly wrong most of the time.
I am sure whole books can be written about why and how this happened (and how this is going to keep happening), based on psychology and sociology, but the reasons are irrelevant, and that would be a futile effort; like writing books about glottochronology and its intermittent popularity due to misunderstood scientist trends. The most efficient way to deal with this problem is to avoid such information altogether, because – as you can see in the current revised text – they wouldn’t really add anything essential to the content of these books, anyway.
Second in popularity for the expansion of haplogroup N1a-L392 (ca. 4400 BC) is, apparently, the association with Turkic, and by extension with Micro-Altaic, after the Uralic link preferred in Europe; at least among certain eastern researchers.
According to the views of a number of authoritative researchers, the Yakut ethnos was formed in the territory of Yakutia as a result of the mixing of people from the south and the autochthonous population .
These three major Sakha paternal lineages may have also arrived in Yakutia at different times and/ or from different places and/or with a difference in several generations instead, or perhaps Y-chromosomal STR mutations may have taken place in situ in Yakutia. Nevertheless, the immediate common ancestor(s) from the Asian Steppe of these three most prevalent Sakha Y-chromosomal STR haplotypes possibly lived during the prominence of the Turkic Khaganates, hence the near-perfect matches observed across a wide range of Eurasian geography, including as far as from Cyprus in the West to Liaoning, China in the East, then Middle Lena in the North and Afghanistan in the South (Table 3 and Figure 5). There may also be haplotypes closely-related to ‘the dominant Elley line’ among Karakalpaks, Uzbeks and Tajiks, however, limitations in the loci coverage for the available dataset (only eight Y-chromosomal STR loci) precludes further conclusions on this matter .
According to the results presented here, very similar Y-STR haplotypes to that of the original Elley line were found in the west: Afghanistan and northern Cyprus, and in the east: Liaoning Province, China and Ulaanbaator, Northern Mongolia. In the case of the dominant Omogoy line, very closely matching haplotypes differing by a single mutational step were found in the city of Chifen of the Jirin Province, China. The widest range of similar haplotypes was found for the Yakut haplotype Unknown: In Mongolia, China and South Korea. For instance, haplotypes differing by a single step mutation were found in Northern Mongolia (Khalk, Darhad, Uryankhai populations), Ulaanbaator (Khalk) and in the province of Jirin, China (Han population).
Notably, Tat-C-bearing Y-chromosomes were also observed in ancient DNA samples from the 2700-3000 years-old Upper Xiajiadian culture in Inner Mongolia, as well as those from the Serteya II site at the Upper Dvina region in Russia and the ‘Devichyi gory’ culture of long barrow burials at the Nevel’sky district of Pskovsky region in Russia. A 14-loci Y-chromosomal STR median-joining network of the most prevalent Sakha haplotypes and a Tat-C-bearing haplotype from one of the ancient DNA samples recovered from the Upper Xiajiadian culture in Inner Mongolia (DSQ04) revealed that the contemporary Sakha haplotype ‘Xuo’ (Table 2, Haplotype ID “Xuo”) classified as that of ‘the Xiongnu clan’ in our current study, was the closest to the ancient Xiongnu haplotype (Figure 6). TMRCA estimate for this 14-loci Y-chromosomal STR network was 4357 ± 1038 years or 2341 ± 1038 BCE, which correlated well with the Upper Xiajiadian culture that was dated to the Late Bronze Age (700-1000 BCE).
Also, a simple look at the TMRCA and modern distribution was enough to hypothesize long ago the lack of connection of N1c-L392 with Altaic or Uralic peoples. From Ilumäe et al. (2016):
Previous research has shown that Y chromosomes of the Turkic-speaking Yakuts (Sakha) belong overwhelmingly to hg N3 (formerly N1c1). We found that nearly all of the more than 150 genotyped Yakut N3 Y chromosomes belong to the N3a2-M2118 clade, just as in the Turkic-speaking Dolgans and the linguistically distant Tungusic-speaking Evenks and Evens living in Yakutia (Table S2). Hence, the N3a2 patrilineage is a prime example of a male population of broad central Siberian ancestry that is not intrinsic to any linguistically defined group of people. Moreover, the deepest branch of hg N3a2 is represented by a Lebanese and a Chinese sample. This finding agrees with the sequence data from Hallast et al., where one Turkish Y chromosome was also assigned to the same sub-clade. Interestingly, N3a2 was also found in one Bhutan individual who represents a separate sub-lineage in the clade. These findings show that although N3a2 reflects a recent strong founder effect primarily in central Siberia (Yakutia, Sakha), the sub-clade has a much wider distribution area with incidental occurrences in the Near East and South Asia.
The most striking aspect of the phylogeography of hg N is the spread of the N3a3’6-CTS6967 lineages. Considering the three geographically most distant populations in our study—Chukchi, Buryats, and Lithuanians—it is remarkable to find that about half of the Y chromosome pool of each consists of hg N3 and that they share the same sub-clade N3a3’6. The fractionation of N3a3’6 into the four sub-clades that cover such an extraordinarily wide area occurred in the mid-Holocene, about 5.0 kya (95% CI = 4.4–5.7 kya). It is hard to pinpoint the precise region where the split of these lineages occurred. It could have happened somewhere in the middle of their geographic spread around the Urals or further east in West Siberia, where current regional diversity of hg N sub-lineages is the highest (Figure 1B). Yet, it is evident that the spread of the newly arisen sub-clades of N3a3’6 in opposing directions happened very quickly. Today, it unites the East Baltic, East Fennoscandia, Buryatia, Mongolia, and Chukotka-Kamchatka (Beringian) Eurasian regions, which are separated from each other by approximately 5,000–6,700 km by air. N3a3’6 has high frequencies in the patrilineal pools of populations belonging to the Altaic, Uralic, several Indo-European, and Chukotko-Kamchatkan language families. There is no generally agreed, time-resolved linguistic tree that unites these linguistic phyla. Yet, their split is almost certainly at least several millennia older than the rather recent expansion signal of the N3a3’6 sub-clade, suggesting that its spread had little to do with linguistic affinities of men carrying the N3a3’6 lineages.
It was thus clear long ago that N1c-L392 lineages must have expanded explosively in the 5th millennium through Northern Eurasia, probably from a region to the north of Lake Baikal, and that this expansion – and succeeding ones through Northern Eurasia – may not be associated to any known language group until well into the common era.
Teams led by David Reich at Harvard Medical School and Eske Willerslev at the University of Copenhagen in Denmark announced, independently, that occupants of Corded Ware graves in Germany could trace about three-quarters of their genetic ancestry to the Yamnaya. It seemed that Corded Ware people weren’t simply copying the Yamnaya; to a large degree they actually were Yamnayan in origin.
Burial practices shifted dramatically, a warrior class appeared, and there seems to have been a sharp upsurge in lethal violence. “I’ve become increasingly convinced there must have been a kind of genocide,” says Kristian Kristiansen at the University of Gothenburg, Sweden.
The collaboration revealed that the origin and initial spread of Bell Beaker culture had little to do – at least genetically – with the expansion of the Yamnaya or Corded Ware people into central Europe. “It started in It is in that region that the earliest Bell Beaker objects – including arrowheads, copper daggers and distinctive Bell-shaped pots – have been found, in archaeological sites carbon-dated to 4700 years ago. Then, Bell Beaker culture began to spread east, although the people more or less stayed put. By about 4600 years ago, it reached the most westerly Corded Ware people around where the Netherlands now lies. For reasons still unclear, the Corded Ware people fully embraced it. “They simply take on part of the Bell Beaker package and become Beaker people,” says Kristiansen.
The fact that the genetic analysis showed the Britons then all-but disappeared within a couple of generations might be significant. It suggests the capacity for violence that emerged when the Yamnaya lived on the Eurasia steppe remained even as these people moved into Europe, switched identity from Yamnaya to Corded Ware, and then switched again from Corded Ware to Bell Beaker.
Notice what Kristiansen did there? Yamnaya men “switched identities” into Corded Ware, then “switched identities” into Bell Beakers…So, the most aggresive peoples who have ever existed, exterminating all other Europeans, were actually not so violent when embracingwholly different cultures whose main connection is that they built kurgans (yes, Gimbutas lives on).
NOTE. By the way, just so we are clear, only Indo-Europeans are “genocidal”. Not like Neolithic farmers, or Palaeolithic or Mesolithic populations, or more recent Bronze Age or Iron Age peoples, who also replaced Y-DNA from many regions…
In fact, there is much stronger evidence that these Yamnaya Beakers were ruthless. By about 4500 years ago, they had pushed westwards into the Iberian Peninsula, where the Bell Beaker culture originated a few centuries earlier. Within a few generations, about 40 per cent of the DNA of people in the region could be traced back to the incoming Yamnaya Beakers, according to research by a large team including Reich that was published this month. More strikingly, the ancient DNA analysis reveals that essentially all the men have Y chromosomes characteristic of the Yamnaya, suggesting only Yamnaya men had children.
“The collision of these two populations was not a friendly one, not an equal one, but one where the males from outside were displacing local males and did so almost completely,” Reich told New Scientist Live in September. This supports Kristiansen’s view of the Yamnaya and their descendants as an almost unimaginably violent people. Indeed, he is about to publish a paper in which he argues that they were responsible for the genocide of Neolithic Europe’s men. “It’s the only way to explain that no male Neolithic lines survived,” he says.
So these unimaginably violent Yamnaya men had children exclusively with their Y chromosomes…but not Dutch Single Grave peoples. These great great steppe-like northerners switched culture, cephalic index…and Y-chromosome from R1a (and others) to R1b-L151 to expand Italo-Celtic From The West™.
What’s not to love about 2019 with all this back-and-forth hopping between old and new pet theories?
NOTE. I would complain (again) that the obsessive idea of the Danes is that Denmark CWC is (surprise!) the Pre-Germanic community, so it has nothing to do with “steppe ancestry = Indo-European” (or even with “Corded Ware = Indo-European”, for that matter), but then again you have Koch still arguing for Celtic from the West, Kortlandt still arguing for Balto-Slavic from the east, and – no doubt worst of all – “R1a=IE / R1b=Vasconic / N1c=Uralic” ethnonationalists arguing for whatever is necessary right now, in spite of genetic research.
On a serious note, interesting comment by Heyd in the article:
A striking example of this distinction is a discovery made near the town of Valencina de la Concepción in southern Spain. Archaeologists working there found a Yamnaya-like kurgan, below which was the body of a man buried with a dagger and Yamnaya-like sandals, and decorated with red pigment just as Yamnaya dead were. But the burial is 4875 years old and genetic information suggests Yamnaya-related people didn’t reach that far west until perhaps 4500 years ago. “Genetically, I’m pretty sure this burial has nothing to do with the Yamnaya or the Corded Ware,” says Heyd. “But culturally – identity-wise – there is an aspect that can be clearly linked with them.” It would appear that the ideology, lifestyle and death rituals of the Yamnaya could sometimes run far ahead of the migrants.
NOTE. I have been trying to find which kurgan is this, reviewing this text on the archaeological site, but didn’t find anything beyond occasional ochre and votive sandals, which are usual. Does some reader know which one is it?
Notice how, if you add all those vanguard Yamna findings of Central and Western Europe, including this one from southern Spain, you begin to get a good idea of the territories occupied by East Bell Beakers expanding later. More or less like vanguard Abashevo and Sintashta finds in the Zeravshan valley heralded the steppe-related Srubna-Andronovo expansions in Turan…
It doesn’t seem like Proto-Beaker and Yamna just “crossed paths” at some precise time around the Lower Danube, and Yamna men “switched cultures”. It seems that many Yamna vanguard groups, probably still in long-distance contact with Yamna settlers from the Carpathian Basin, were already settled in different European regions in the first half of the 3rd millennium BC, before the explosive expansion of East Bell Beakers ca. 2500 BC. As Heyd says, there are potentially many Yamna settlements along the Middle and Lower Danube and tributaries not yet found, connecting the Carpathian Basin to Western and Northern Europe.
These vanguard groups would have more easily transformed their weakened eastern Yamna connections with the fashionable Proto-Beaker package expanding from the west (and surrounding all of these loosely connected settlements), just like the Yamna materials from Seville probably represent a close cultural contact of Chalcolithic Iberia with a Yamna settlement (the closest known site with Yamna traits is near Alsace, where high Yamna ancestry is probably going to be found in a Bell Beaker R1b-L151 individual).
This does not mean that there wasn’t a secondary full-scale migration from the Carpathian Basin and nearby settlements, just like Corded Ware shows a secondary (A-horizon?) migration to the east with R1a-Z645. It just means that there was a complex picture of contacts between Yamna and European Chalcolithic groups before the expansion of Bell Beakers. Doesn’t seem genocidal enough for a popular movie, tho.
I basically added information from the latest papers published, which (luckily enough for me) haven’t been too many, and I have added images to illustrate certain sections.
I have updated the PCAs by including North Caucasus samples from Wang et al. (2018), whose position I could only infer for older versions from previously published PCA graphs.
I have also added to the supplementary materials the “Tip of the Iceberg” R1b tree by Mike Walsh from the FTDNA R1b group, with permission, because some relevant genetic sections are centered on the evolution of R1b lineages, and the reader can get easily lost with so many subclades.
I have also updated maps, including some of the Y-DNA ones, and managed to finish two new maps I was working on, and I added them to the supplementary materials and to the menu above:
It is tentative because there hasn’t been any professional study or amateur attempt to date to differentiate both “steppe ancestries” in Yamna, and especially in Bell Beakers. So much for the call of professional geneticists since 2018 (see here and here) and archaeologists since 2017 (see e.g. here and here) to distinguish fine-scale population structure to be able to follow neighbouring populations which expanded with different archaeological (and thus ethnolinguistic) groups.
I think both maps are especially important today, given the current Nordicist reactionary trends arguing (yet again) for an origin of Indo-Europeans in The North™, now based on the Fearsome Tisza River hypothesis, on cephalic index values, and a few pairwise comparisons – i.e. an absolutely no-nonsense approach to the Indo-European question (LOL). At least I get to relax and sit this year out just observing how other people bury themselves and their beloved “steppe ancestry=IE” under so many new pet theories…
NOTE. Not that there is anything wrong with a northern origin of North-West Indo-European from a linguistic point of view, as I commented recently – after all, a Corded Ware origin would roughly fit the linguistic guesstimates, unlike the proposed ancestral origins in Anatolia or India. The problem is that, like many other fringe theories, it is today just based on tradition, or (even worse) ethnic, political, or personal desires, and it doesn’t make sense when all findings from disciplines involved in the Indo-European and Uralic questions are combined.
Within 20 or 30 years, when genetic genealogists (or amateur geneticists, or however you want to call them) ask why we had the opportunity since 2015 to sample as many Hungarian Yamnaya individuals as possible and we didn’t, when it is clear that the number of unscathed kurgans is diminishing every year (from an estimated 4,000 in the 20th century, of the original tens of thousands, to less than 1,500 today) the answer will not be “because this or that archaeologist or linguist was a dilettante or a charlatan‘, as they usually describe academics they dislike.
It will be precisely because the very same genetic genealogists – supposedly interested today in the origin of R1b-L151 and/or genetic marker associated with North-West Indo-Europeans – are obsessed with finding them anywhere else but for Hungary, and prefer to use their money and time to play with a few statistical tools within a biased framework of flawed assumptions and study designs, obtaining absurd results and accepting far-fetched interpretations of them, to be told exactly what they want to hear: be it the Franco-Cantabrian homeland, the Dutch or Moravian Beaker from CWC homeland, the Maykop homeland, or the Moon homeland.
Poetic justice this heritage destruction, whose indirect causes will remain written in Internet archives for everyone to see, as a good lesson for future generations.
Given my reduced free time in these months, I have decided to keep updating the text on Indo-European and Uralic migrations and/or this blog, simultaneously or alternatively, to make the most out of the time I can dedicate to this. I will add the different ‘A Song of Sheep and Horses (ASoSaH) reread’ posts to the original post announcing the books. I would be especially interested in comments and corrections to the book chapters rather than the posts, but any comments are welcome (including in the forum, where comments are more likely to stick).
Luckily enough – for those of us who want precise answers to our previous infinite models of Indo-European language expansions (viz. GAC-associated expansion, IE-speaking Old Europe, Anatolian homeland, Iran homeland, Maykop as Proto-Anatolian, Palaeolithic Continuity Theory, Celtic in the Atlantic façade, etc.) – the situation has been more clear-cut than expected: it turns out that, especially during population expansions, acute Y-chromosome bottlenecks were very common in the past, at least until the Iron Age.
Khvalynsk and Repin-Yamna expansions were no different, and that seems quite natural in hindsight, given the strong familial ties and aversion to foreigners proper of the Late Proto-Indo-European society and culture – probably not really that different from other contemporary societies, like the neighbouring Late Proto-Uralic or Trypillian ones.
During the expansion of early Khvalynsk, the most likely Indo-Anatolian culture, the society of the Don-Volga area was probably made up of different lineages including R1b-V1636, R1b-M269, R1a-YP1272, Q1a-M25, and I2a-L699 (and possibly some R1b-V88?), a variability possibly greater than that of the contemporary north Pontic area, probably a sign of this region being a sink of different east and west migrations from steppe and forest areas.
During its expansion, the Khvalynsk society saw its haplogroup variability reduced, as evidenced by the succeeding expansive Repin culture:
Afanasevo, representing Pre-Tocharian (the earliest Late PIE dialect to branch off), expanded with R1b-L23 – especially R1b-Z2103 – lineages, while early Yamna expanded with R1b-L23 and I2a-L699 lineages, which suggests that these are the main haplogroups that survived the Y-DNA bottleneck undergone during the Khvalynsk expansion, and especially later during the late Repin expansion. Nevertheless, other old haplogroups might still pop up during the Repin and early Yamna period, such as the R1b-V1636 sample from Yamna in the Northern Caucasus.
It is still unclear if R1b-L23 sister clade R1b-PF7562 (formed ca. 4400 BC, TMRCA ca. 3400 BC), prevalent among modern Albanians, expanded with Yamna migrants, or if it was part of an earlier expansion of R1b-M269 into the Balkans, and represent thus Indo-Anatolian speakers who later hitchhiked the expansion of the Late PIE language from the north or west Pontic area. The early TMRCA seems to suggest an association with Repin (and therefore Yamna), rather than later movements in the Balkans.
‘Yamnaya’ or ‘steppe’ ancestry?
After the early years when population genetics relied mainly on modern Y-DNA haplogroups, geneticists and amateurs have been recently playing around with testing “ancestry percentages”, based on newly developed free statistical tools, which offer obviously just one among many types of data to achieve a proper interpretation of the past.
Today we have quite a lot Y-DNA haplogroups reported for ancient samples of more recent prehistoric periods, and they seem to offer (at least since the 2015 papers, but more evidently since the 2018 papers on Bell Beakers and Europeans, Corded Ware, or Fennoscandia among others) the most straightforward interpretation of all results published in population genomics research.
NOTE. The finding of a specific type of ancestry in one isolated 40,000-year-old sample from Tianyuan can offer very interesting information on potential population movements to the region. However, the identification of ethnolinguistic communities and their migrations among neighbouring groups in Neolithic or Bronze Age groups is evidently not that simple.
It is becoming more and more clear with each paper that the true “Yamnaya ancestry” – not the originally described one – was in fact associated with Indo-Europeans (see more on the very Yamnaya-like Yamna Hungary and early East Bell Beaker R1b samples, all of quite similar ancestry and PCA cluster before their further admixture with EEF- and CWC-like groups).
The so-called “steppe ancestry”, on the other hand, reflects the contribution of a Northern Caucasus-related ancestry to expanding Khvalynsk settlers, who spread through the steppes more than a thousand years before the expansion of Late Proto-Indo-Europeans with late Repin, and can thus be found among different groups related to the Pontic-Caspian steppes (see more on the emergence and evolution of “steppe ancestry”).
In fact, after the Yamna/Indo-European and Corded Ware/Uralic expansions, it is more likely to find “steppe ancestry” to the north and east in territories traditionally associated with Uralic languages, whereas to the south and west – i.e. in territories traditionally associated with Indo-European languages – it is more likely to find “EEF ancestry” with diminished “steppe ancestry”, among peoples patrilineally descended from Yamna settlers.
Y-DNA haplogroups, the only uniparental markers (see exceptions in mtDNA inheritance) – unlike ancestry percentages based on the comparison of a few samples and flawed study designs – do not admix, do not change, and therefore they do not lend themselves to infinite pet theories (see e.g. what David Reich has to say about R1b-P312 in Iberia directly derived from Yamna migrants in spite of their predominant EEF ancestry): their cultural continuity can only be challenged with carefully threaded linguistic, archaeological, and genetic data.
Genome sequencing, variant calling, and construction of the Mongolian reference panel. We collected peripheral blood with informed consent from 175 Mongolian individuals representing six distinct tribes/regions in northern China and Mongolia, including the Abaga, Khalkha, Oirat, Buryat, Sonid, and Horchin tribes.
The fixation index (FST) was used to estimate pairwise genetic differentiation among our Mongolian samples and 26 modern human populations selected from 1000G (…) the Mongolian tribes cluster with East Asian groups. The Mongolian populations show the smallest differentiation from the CHB, and FST values increase relative to the magnitude of geographical separation. The Buryat are the most differentiated tribe compared with other East Asians (1.82–2.97%), while the Horchin are the least (0.25–1.35%). All tribes are closer to the Japanese (JPT) than the CHS with the exception of the Horchin. Among the tribes, the Abaga, Khalkha, Oirat, and Sonid show the least differentiation from one another (FST < 0.15%)
A PCA places the Mongolians in close genetic proximity to a group of North Asian Siberians, including Altaians, Tuvinians, Evenki, and Yakut, indicating that the Mongolian whole-genome variation panel could be a better proxy for these groups than any populations currently in the 1000G panel
The most common Y-chromosome haplogroups are from the C3 sublineage (41.67%), including C3c (29.17%) and C3b (12.50%), followed by haplogroup O (23.61%), and haplogroup N (18.06%) (…) While haplogroups C and O are primarily restricted to Asia, haplogroup N is present at high frequency in Finns (60.5%), at low frequency in non-Mongolian East Asians (< 1%), and virtually absent throughout the remainder of European and African samples in 1000G
Comparison with Finns
Of the populations included in our study, Mongolians share the second-highest level of IBD with the Finnish people (FIN), behind only Northern Han Chinese (CHB). While Mongolians share more IBD with Europeans (EUR) as a whole compared with other non-EAS people (Fig. 4b), removal of Finns from the Europeans drops the level of sharing to as low as that with South Asians (SAS) or Admixed American (AMR).
There is considerable geographic separation between modern-day Mongolians and Europe. The positive D-statistic that reveal gene flow between Mongolians and Europeans (Fig. 4c), and the high degree of IBD sharing with Finnish people in particular suggest that complex admixture may have occurred throughout northeastern Europe and Siberia. To see whether Mongolians represent the ethnic group in East Asia with the highest level of gene flow with Finnish people, we calculated a D-statistic for each set of populations [Mongolians, X, FIN, Yoruba (YRI)], where X represents a population from Siberia or Northern Canada. Most of the populations reveal an imbalance in allele frequencies that suggests gene flow with Finns (D >0, Z >3), but the greatest imbalance is observed between Siberians/Northern Canadians and Finnish, rather than between Mongolians and Finns. This pattern indicates that northern Asian populations interacted across large geographic ranges.
I guess the 1000G does not have northern Eurasian groups, because the IBD map and values would be lightening up with Palaeo-Siberian peoples…