NOTE. The video is best viewed in HD 1080p (1920×1080) with a display that allows for this or greater video quality, and a screen big enough to see haplogroup symbols, i.e. tablet or greater. The YouTube link is here. The Facebook link is here.
Based on the results of the past 5 years or so, which have been confirming this combined picture every single time, I doubt there will be much need to change it in any radical way, as only minor details remain to be clarified.
I wanted to publish a GIS tool of my own for everyone to have an updated reference of all data I use for my books.
The most complex GIS tools consume too many resources when used online in a client-server model, so I have to keep that to myself, but there are some ways to publish low quality outputs.
The files below include the possibility to zoom some levels to be able to see more samples, and also to check each one for more information on their ID, attributed culture and label, archaeological site, source paper, subclade (and people responsible for SNP inferences if any), etc.
Some usage notes:
Files are large (ca. 20 MB), so they still take some time to load.
For the meaning of symbols and colors (for Y-DNA haplogroups), if there is any doubt, check the video above.
Pop-ups with sample information work on desktop browsers by clicking on a symbol, but apparently not on smartphones and other touch-based OS. I have changed the settings to show pop-ups on hover, so that they now work (to some extent) on touch-based OS.
The search tool can look for specific samples according to their official ID, and works by highlighting the symbol of the selected individual (turning it into a bright blue dot), and leading the layer view to the location, but it seems to work best only with some browser and OS settings – in other browsers, you need to zoom out to see where the dot is located. The specific sample with its information could paradoxically disappear in search mode, so you might need to reload and look again for the same site that was highlighted.
Latitude and longitude values have been randomly modified to avoid samples overcrowding specific sites, so they are not the original ones.
NOTE. Because there are too many samples at the starting view, depending on the file you should zoom some levels to start seeing symbols.
I have tried running supervised ADMIXTURE models by selecting distant populations based on PCA and qpAdm results, but this seems to work fine only for a small K, and the results improve easily when the same data are run unsupervised.
Adding distant populations seems to improve or mess up the results in unpredictable ways, too, so at this point I doubt ADMIXTURE (or anything other than qpAdm) is actually useful to obtain anything precise in terms of ancestry evolution, although it can give a good overall idea of rough ancestry changes, if K is kept small enough.
Anyway, I will keep trying to find a simple way to show the actual evolution and expansion of “Steppe ancestry”. Since every single run for thousands of samples takes days, I don’t really know if and when I will find something interesting to show…
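The supervised and unsupervised runs discussed above can be prepared in batch for a small range of K. The following is only a minimal sketch: the file name `samples.bed` and the helper `admixture_commands` are hypothetical, and the flags (`--supervised`, `--cv`, `-jN`) are given as they are commonly documented for the ADMIXTURE command line, so they should be checked against the installed version. The commands are only built here, not executed.

```python
def admixture_commands(bed_file, k_values, supervised=False, cross_validate=False, threads=4):
    """Build one ADMIXTURE command line (as a list of arguments) per value of K.

    Assumption: in supervised mode ADMIXTURE expects a .pop file next to the
    input file, listing the reference population of each individual.
    """
    commands = []
    for k in k_values:
        cmd = ["admixture", f"-j{threads}"]
        if cross_validate:
            cmd.append("--cv")          # report cross-validation error for this K
        if supervised:
            cmd.append("--supervised")  # use the reference labels from the .pop file
        cmd += [bed_file, str(k)]
        commands.append(cmd)
    return commands


# Hypothetical input file; K kept small, as suggested in the text.
unsupervised_runs = admixture_commands("samples.bed", range(2, 5), cross_validate=True)
supervised_runs = admixture_commands("samples.bed", [3], supervised=True)

for cmd in unsupervised_runs + supervised_runs:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` once the input files are in place. Given that a single run for thousands of samples takes days, building the commands first makes it easier to review the chosen K values before committing to them.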
The recent data on ancient DNA from Iberia published by Olalde et al. (2019) was interesting for many different reasons, but I still have the impression that the authors – and consequently many readers – focused on not-so-relevant information about more recent population movements, or even highlighted the least interesting details related to historical events.
This post is thus a summary of its findings with the help of natural neighbour interpolation maps of the reported Germany_Beaker and France_Beaker ancestry for individual samples. Even though maps are not necessary, visualizing geographically the available data facilitates a direct comprehension of the most relevant information. What I considered key points of the paper are highlighted in bold, and enumerated.
NOTE. To get “more natural” maps, extrapolation for the whole Iberian Peninsula is obtained by interpolation through the use of external data from the British Isles, Central Europe, and Africa. This is obviously not ideal, but – lacking data from the corners of the Iberian Peninsula – this method gives a homogeneous look to all maps. Only data in direct line between labelled samples in each map is truly interpolated for the Iberian Peninsula, while the rest would work e.g. for a wider (and more simplistic) map of European Bronze Age ancestry components.
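The general idea of turning per-sample ancestry values into a continuous surface can be illustrated with a much simpler method than the one used for the maps. The sketch below uses inverse distance weighting (IDW), not natural neighbour interpolation, and treats longitude and latitude as planar coordinates. The sample coordinates and ancestry proportions are invented for illustration and do not come from Olalde et al. (2019).

```python
import math


def idw_interpolate(points, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    points: list of (lon, lat, value) tuples
    target: (lon, lat) tuple
    Distances are planar distances in degrees, a simplification.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lon, lat, value in points:
        dist = math.hypot(lon - target[0], lat - target[1])
        if dist == 0.0:
            return value  # exactly on a sample: return its value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical samples: (longitude, latitude, ancestry proportion)
hypothetical_samples = [
    (-8.4, 40.2, 0.45),
    (-2.7, 42.8, 0.30),
    (-1.1, 37.6, 0.15),
]

estimate = idw_interpolate(hypothetical_samples, (-4.0, 40.0))
print(round(estimate, 3))
```

As with the maps described in the note, values far from any sample are driven entirely by the nearest available data, so only the area between samples should be read as truly interpolated.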
The Proto-Beaker package may or may not have expanded into Central Europe with typical Iberia_Chalcolithic ancestry. A priori, it seems rather a cultural diffusion of traits stemming from west Iberia ca. 2800 BC.
The situation during the Chalcolithic is only relevant for the Indo-European question insofar as it shows a homogeneous Iberia_Chalcolithic-like ancestry with typical Y-chromosome (and mtDNA) haplogroups of the Iberian Neolithic dominating over the whole Peninsula until about 2500 BC. This might represent an original Basque-Iberian community.
(1) East Bell Beakers brought hg. R1b-L23 and Yamnaya ancestry to Iberia, ergo the Bell Beaker phenomenon was not a (mere) local development in Iberia, but involved the expansion of peoples tracing their ancestry to the Yamnaya culture who eventually replaced a great part of the local population.
(2) Classical Bell Beakers have their closest source population in Germany Beakers, and they reject an origin close to Rhine Beakers (i.e. Beakers from the British Isles, the Netherlands, or northern France), ergo the Single Grave culture was not the origin of the Bell Beaker culture, either (see here).
Early Bronze Age
Interestingly, the European Early Bronze Age in Iberia is still a period of adjustments before reaching the final equilibrium. Unlike the situation in the British Isles, where Bell Beakers brought about a swift population replacement, Iberia shows – like the Nordic Late Neolithic period – centuries of genomic balancing between Indo-European- and non-Indo-European-speaking peoples, as could be suggested by hydrotoponymic research alone.
This balancing is seen in terms of Germany_Beaker vs. Iberia_Chalcolithic ancestry, but also in terms of Y-chromosome haplogroups, with the most interesting late developments happening in southern Iberia, around the territory where El Argar eventually emerged in radical opposition to the Bell Beaker culture.
We obtained lower proportions of ancestry related to Germany_Beaker on the X-chromosome than on the autosomes (Table S14), although the Z-score for the differences between the estimates is 2.64, likely due to the large standard error associated to the mixture proportions in the X-chromosome.
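The role of the standard error in such a comparison can be shown with a small worked example. This sketch assumes the two estimates are independent, so that the Z-score of their difference is the difference divided by the square root of the summed squared standard errors. The numbers are hypothetical and are not the values from Table S14.

```python
import math


def z_score_difference(est_a, se_a, est_b, se_b):
    """Z-score for the difference between two independent estimates."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical mixture proportions (estimate, standard error)
autosomes = (0.40, 0.02)
x_chromosome = (0.20, 0.07)        # few SNPs, hence a large standard error
x_chromosome_tight = (0.20, 0.02)  # same estimate with a smaller standard error

z_large_se = z_score_difference(*autosomes, *x_chromosome)
z_small_se = z_score_difference(*autosomes, *x_chromosome_tight)

print(f"Z with large X-chromosome SE: {z_large_se:.2f}")
print(f"Z with small X-chromosome SE: {z_small_se:.2f}")
```

The same difference between estimates yields a much lower Z-score when one of the standard errors is large, which is the situation the excerpt describes for the X-chromosome.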
Regarding the PCA, Iberia Bronze Age samples occupy an intermediate cluster between Iberia Chalcolithic and Bell Beakers of steppe ancestry, with Yamnaya-rich samples from the north (Asturias, Burgos) representing the likely source Old European population whose languages survived well into the Roman Iron Age:
Middle Bronze Age
During the Middle Bronze Age, the equilibrium reached earlier is reversed, with a (likely non-Indo-European-speaking) Argaric sphere of influence expanding to the west and north featuring Iberia Chalcolithic and a lesser amount of Germany_Beaker ancestry, present now in the whole Peninsula, although in varying degrees.
All Iberian groups were probably already under a bottleneck of R1b-DF27 lineages, although it is likely that specific subclades differed among regions:
Late Bronze Age
The Late Bronze Age represents the arrival of the Urnfield culture, which probably expanded with Celtic-speaking peoples. A Late Bronze Age transect before their genetic impact still shows a prevalent Germany_Beaker-like Steppe ancestry, probably peaking in north/west Iberia:
(5) Galaico-Lusitanians were descendants of Iberian Beakers of Germany_Beaker ancestry and hg. R1b-M269. Autosomal data of samples I7688 and I7687, of the Final Bronze (end of the reported 1200-700 BC period for the samples), from Gruta do Medronhal (Arrifana, Coimbra, Portugal) confirms this.
In the 1940s, human bones, metallic artifacts (n=37) and non-human bones were discovered in the natural cave of Medronhal (Arrifana, Coimbra). All these findings are currently housed in the Department of Life Sciences of the University of Coimbra and are analyzed by a multidisciplinary team. The artifacts suggest a date at the beginning of the 1st millennium BC, which is confirmed by radiocarbon date of a human fibula: 890–780 cal BCE (2650±40 BP, Beta–223996). This natural cave has several rooms and corridors with two entrances. No information is available about the context of the human remains. Nowadays these remains are housed mixed and correspond to a minimum number of 11 individuals, 5 adults and 6 non-adults.
NOTE. To understand how the region around Coimbra was (Proto-)Lusitanian – and not just Old European in general – until the expansion of the Turduli Oppidani, see any recent paper on Bronze Age expansion of warrior stelae, hydrotoponymy, anthroponymy, or theonymy (see e.g. about Spear-vocabulary).
In a complex period of multiple population movements and language replacements, the temporal transect in Olalde et al. (2019) offers nevertheless relevant clues for the Pre-Roman Iron Age:
(6) The expansion of Celtic languages was associated with the spread of France_Beaker-like ancestry, most likely already with the LBA Urnfield culture, since a Tartessian and a Pre-Iberian sample (both dated ca. 700-500 BC) already show this admixture, in regions which some centuries earlier did not show it. Similarly, a BA sample from Álava ca. 910–840 BC doesn’t show it, and later Celtiberian samples from the same area (ca. 4th c. BC and later) show it, depicting likely north-east to west/south-west routes of expansion of Celts.
(7) The distribution of Germany_Beaker ancestry peaked, by the Iron Age, among Old Europeans from west Iberia, including Galaico-Lusitanians and probably also Astures and Cantabri, in line with what was expected before genetic research:
A probably more precise picture of the Final Bronze – Early Iron Age transition is obtained by including the Final Bronze sample I2469 from El Sotillo, Álava (ca. 910-875 BC) as a Celtic ancestry buffer to the west, and the sample I3315 from Menorca (ca. 904-861 BC), lacking more recent ones from intermediate regions:
In terms of Y-DNA and mtDNA haplogroups, the situation is difficult to evaluate without more samples and more reported subclades:
In the PCA, Proto-Lusitanian samples occupy an intermediate cluster between Iberian Bronze Age and Bronze Age North (see above), including the Final Bronze sample from Álava, while Celtic-speaking peoples (including Pre-Iberians and Iberians of Celtic descent from north-east Iberia) show a similar position – albeit evidently unrelated – due to their more recent admixture between Iberian Bronze Age and Urnfield/Hallstatt from Central Europe:
(8) Iberian-speaking peoples in north-east Iberia represent a recent expansion of the language from the south, possibly accompanied by an increase in Iberia_Chalcolithic/Germany_Beaker admixture from east/south-east Iberia.
(9) Modern Basques represent recent isolation and Y-DNA bottlenecks after the Roman Iron Age population movements, probably from Aquitanians migrating south of the Pyrenees, admixing with local peoples, and later becoming isolated during the Early Middle Ages and thereafter:
[Modern Basques] overlap genetically with Iron Age populations showing substantial levels of Steppe ancestry.
Assuming that France_Beaker ancestry is associated with the Urnfield culture (spreading with Celtic-speaking peoples), Vasconic speakers were possibly represented by some population – most likely from France – whose ancestry is close to Rhine Beakers (see here).
Alternatively, a Vasconic language could have survived in some France/Iberia_Chalcolithic-like population that got isolated north of the Pyrenees close to the Atlantic Façade during the Bronze Age, and who later admixed with Celtic-speaking peoples south of the Pyrenees, such as the Vascones, to the point where their true ancestry got diluted.
In any case, the clear Celtic Steppe-like admixture of modern Basques supports for the time being their recent arrival in Aquitaine before the proto-historical period, which is in line with hydrotoponymic research.
The most interesting aspects to discuss after the publication of Olalde et al. (2019) would have been thus the nature of controversial Palaeohispanic peoples for which there is not much linguistic data, such as:
the Astures and the Cantabri, usually considered Pre-Celtic Indo-European (see here);
the Vaccaei, usually considered Celtic;
the Vettones, traditionally viewed as sharing the same language as Lusitanians due to their apparent shared hydrotoponymic, anthroponymic, and/or theonymic layers, but today mostly viewed as having undergone Celticization and helped the westward expansion of Celtic languages (and archaeologically clearly divided from Old European hostile neighbours to the west by their characteristic verracos);
the Pellendones or the Carpetani, who were once considered Pre-Celtic Indo-Europeans, too;
the nature of Tartessian as Indo-European, or maybe even as “Celtic”, as defended by Koch;
or the potential remote connection of Basque and Iberian languages in a common trunk featuring Iberian/France_Chalcolithic ancestry (also including Palaeo-Sardo).
Despite these interesting questions still open for discussion, the paper remarked something already known for a long time: that modern Basques had steppe ancestry and Y-DNA typical of the Yamnaya 5,000 years ago, and that Bell Beakers had brought this steppe ancestry and R1b-P312 lineages to Iberia. This common Basque-centric interpretation of Iberian prehistory is the consequence of a 19th-century tradition of obsessively imagining Vasconic-speaking peoples in their medieval territories extrapolated to Cro-Magnons and Atapuerca (no, really), inhabiting undisturbed for millennia a large territory encompassing the whole of Iberia and France, “reduced” or “broken” only with the arrival of Celts just before the Roman conquests. A recursive idea of “linguistic autochthony” and “genetic purity” of the peoples of Iberia that has never had any scientific basis.
Similarly, this paper offered the Nth proof already in population genomics that traditional nativist claims for the origin of the Bell Beaker folk in Western Europe were wrong, both southern (nativist Iberian origin) and northern European (nativist Lower Rhine origin). Both options could be easily rejected with phylogeography since 2015; they were then rejected in Olalde et al. and Mathieson et al. (2017), then again with the update of many samples in Olalde et al. (2018) and Mathieson et al. (2018), and they have most clearly been rejected recently with data from Wang et al. (2018) and its Yamnaya Hungary samples. Findings from Olalde et al. (2019) are just another nail in coffins that should have been well buried by now.
Even David Anthony didn’t have any doubt in his latest model (2017) about the Carpathian Basin origin of North-West Indo-Europeans (see here), and his latest update to the Proto-Indo-European homeland question (2019) shows that he is convinced now about R1b bottlenecks and proper Pre-Yamnaya ancestry stemming from a time well before the Bell Beaker expansion. This won’t be the last setback to supporters of zombie theories: like the hypotheses of an Anatolian, Armenian, or OIT origin of the PIE homeland, other mythical ideas are so entrenched in nationalist and/or nativist tradition that many supporters will no doubt prefer them to die hard, under the most numerous and shameful rejections of endlessly remade reactionary models.
Interesting excerpts (emphasis mine, modified for clarity):
To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago.
Maps illustrating the following texts have been made based on data from this and other papers:
Maps showing ancestry include only data from this preprint (which also includes some samples from Sigtuna).
Maps showing haplogroups of ancient DNA samples based on their age include data from all published papers, but with slightly modified locations to avoid overcrowding (randomized distance approx. ± 0.1 long. and lat.).
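The randomization of sample locations mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a uniform offset of at most 0.1 degrees is applied independently to latitude and longitude, and the function name, sample IDs, and coordinates are hypothetical. It is not necessarily how the published maps were produced.

```python
import random


def jitter_coordinates(lat, lon, max_offset=0.1, rng=None):
    """Shift a coordinate pair by a uniform random offset of up to max_offset degrees."""
    rng = rng or random.Random()
    new_lat = lat + rng.uniform(-max_offset, max_offset)
    new_lon = lon + rng.uniform(-max_offset, max_offset)
    # Keep the result within valid ranges: clamp latitude, wrap longitude.
    new_lat = max(-90.0, min(90.0, new_lat))
    new_lon = ((new_lon + 180.0) % 360.0) - 180.0
    return new_lat, new_lon


# Two hypothetical samples from the same site, which would otherwise overlap.
rng = random.Random(42)  # fixed seed so that the map can be regenerated identically
samples = [("sample_A", 58.2, 22.2), ("sample_B", 58.2, 22.2)]
jittered = [(sid, *jitter_coordinates(lat, lon, rng=rng)) for sid, lat, lon in samples]

for sid, lat, lon in jittered:
    print(sid, round(lat, 4), round(lon, 4))
```

Using a seeded generator keeps the displaced positions stable between map versions, so that a given symbol does not move every time the data are updated.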
We find that the transition from the BA to the IA is accompanied by a reduction in Neolithic farmer ancestry, with a corresponding increase in both Steppe-like ancestry and hunter-gatherer ancestry. While most groups show a slight recovery of farmer ancestry during the VA, there is considerable variation in ancestry across Scandinavia. In particular, we observe a wide range of ancestry compositions among individuals from Sweden, with some groups in southern Sweden showing some of the highest farmer ancestry proportions (40% or more in individuals from Malmö, Kärda or Öland).
Ancestry proportions in Norway and Denmark on the other hand appear more uniform. Finally we detect an influx of low levels of “eastern” ancestry starting in the early VA, mostly constrained among groups from eastern and central Sweden as well as some Norwegian groups. Testing of putative source groups for this “eastern” ancestry revealed differing patterns among the Viking Age target groups, with contributions of either East Asian- or Caucasus-related ancestry.
Overall, our findings suggest that the genetic makeup of VA Scandinavia derives from mixtures of three earlier sources: Mesolithic hunter-gatherers, Neolithic farmers, and Bronze Age pastoralists. Intriguingly, our results also indicate ongoing gene flow from the south and east into Iron Age Scandinavia. Thus, these observations are consistent with archaeological claims of wide-ranging demographic turmoil in the aftermath of the Roman Empire with consequences for the Scandinavian populations during the late Iron Age.
Genetic structure within Viking-Age Scandinavia
We find that VA Scandinavians on average cluster into three groups according to their geographic origin, shifted towards their respective present-day counterparts in Denmark, Sweden and Norway. Closer inspection of the distributions for the different groups reveals additional complexity in their genetic structure.
We find that the ‘Norwegian’ cluster includes Norwegian IA individuals, who are distinct from both Swedish and Danish IA individuals which cluster together with the majority of central and eastern Swedish VA individuals. Many individuals from southwestern Sweden (e.g. Skara) cluster with Danish present-day individuals from the eastern islands (Funen, Zealand), skewing towards the ‘Swedish’ cluster with respect to early and more western Danish VA individuals (Jutland).
Some individuals have strong affinity with Eastern Europeans, particularly those from the island of Gotland in eastern Sweden. The latter likely reflects individuals with Baltic ancestry, as clustering with Baltic BA individuals is evident in the IBS-UMAP analysis and through f4-statistics.
Genetic clustering using IBS-UMAP suggested genetic affinities of some Viking Age individuals with Bronze Age individuals from the Baltic. To further test these, we quantified excess allele sharing of Viking Age individuals with Baltic BA compared to early Viking Age individuals from Salme using f4 statistics. We find that many individuals from the island of Gotland share a significant excess of alleles with Baltic BA, consistent with other evidence of this site being a trading post with contacts across the Baltic Sea.
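For readers unfamiliar with the statistic, an f4 value is the average over SNPs of the product of two allele frequency differences. The sketch below computes it from plain lists of allele frequencies; the population labels and frequencies are invented, the exact configuration used in the preprint is not reproduced here, and in practice significance is assessed with a block jackknife, which is omitted.

```python
def f4(freqs_a, freqs_b, freqs_c, freqs_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    products = [
        (a - b) * (c - d)
        for a, b, c, d in zip(freqs_a, freqs_b, freqs_c, freqs_d)
    ]
    return sum(products) / len(products)


# Hypothetical allele frequencies at four SNPs
outgroup = [0.1, 0.5, 0.9, 0.3]
baltic_ba = [0.8, 0.2, 0.4, 0.7]
salme = [0.3, 0.5, 0.8, 0.4]
test_individual = [0.7, 0.3, 0.5, 0.6]

# In this arrangement a positive value indicates that the test individual
# shares more alleles with the Baltic BA group than the Salme group does.
value = f4(outgroup, baltic_ba, salme, test_individual)
print(round(value, 4))
```

Swapping the first two populations flips the sign of the statistic, which is why the order of populations has to be stated whenever an f4 value is reported.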
The earliest N1a-VL29 sample available comes from Iron Age Gotland (VK579) ca. AD 200-400 (see Iron Age Y-DNA maps), which also proves its presence in the western Baltic before the Viking expansion. The distribution of N1a-VL29 and R1a-Z280 (compared to R1a in general) among Vikings also supports a likely expansion of both lineages in succeeding waves from the east with Akozino warrior-traders, at the same time as they expanded into the Gulf of Finland.
Vikings in Estonia
(…) only one Viking raiding or diplomatic expedition has left direct archaeological traces, at Salme in Estonia, where 41 Swedish Vikings who died violently were buried in two boats accompanied by high-status weaponry. Importantly, the Salme boat-burial predates the first textually documented raid (in Lindisfarne in 793) by nearly half a century. Comparing the genomes of 34 individuals from the Salme burial using kinship analyses, we find that these elite warriors included four brothers buried side by side and a 3rd degree relative of one of the four brothers. In addition, members of the Salme group had very similar ancestry profiles, in comparison to the profiles of other Viking burials. This suggests that this raid was conducted by genetically homogeneous people of high status, including close kin. Isotope analyses indicate that the crew descended from the Mälaren area in Eastern Sweden thus confirming that the Baltic-Mid-Swedish interaction took place early in the VA.
N1a-VL29 lineages spread again later eastwards with Varangians, from Sweden into north-eastern Europe, most likely including the ancestors of the Rurikid dynasty. Unsurprisingly, the arrival of Vikings with Swedish ancestry into the East Baltic and their dispersal through the forest zone didn’t cause a language shift of Balto-Finnic, Mordvinic, or East Slavic speakers to Old Norse, either…
NOTE. For N1a-Y4339 – N1a-L550 subclade of Swedish origin – as main haplogroup of modern descendants of Rurikid princes, see Volkov & Seslavin (2019) – full text in comments below. Data from ancient samples show varied paternal lineages even among early rulers traditionally linked to Rurik’s line, which explains some of the discrepancies found among modern descendants:
A sample from Chernihiv (VK542) potentially belonging to Gleb Svyatoslavich, the 11th century prince of Tmutarakan/Novgorod, belongs to hg. I2a-Y3120 (a subclade of early Slavic I2a-CTS10228) and has 71% “Modern Polish” ancestry (see below).
Izyaslav Ingvarevych, the 13th century prince of Dorogobuzh, Principality of Volhynia/Galicia, is probably behind a sample from Lutsk (VK541), and belongs to hg. R1a-L1029 (a subclade of R1a-M458), showing ca. 95% of “Modern Polish” ancestry.
Firstly, modern Finnish individuals are not like ancient Finnish individuals, modern individuals have ancestry of a population not in the reference; most likely Steppe/Russian ancestry, as Chinese are in the reference and do not share this direction. Ancient Swedes and Norwegians are more extreme than modern individuals in PC2 and 4. Ancient UK individuals were more extreme than Modern UK individuals in PC3 and 4. Ancient Danish individuals look rather similar to modern individuals from all over Scandinavia. By using a supervised ancient panel, we have removed recent drift from the signal, which would have affected modern Scandinavians and Finnish populations especially. This is in general a desirable feature but it is important to check that it has not affected inference.
The story for Modern-vs-ancient Finnish ancestry is consistent, with ancient Finns looking much less extreme than the moderns. Conversely, ancient Norwegians look like less-drifted modern Norwegians; the Danish admixture seen through the use of ancient DNA is hard to detect because of the extreme drift within Norway that has occurred since the admixture event. PC4 vs PC5 is the most important plot for the ancient DNA story: Sweden and the UK (along with Poland, Italy and to an extent also Norway) are visibly extremes of a distribution, the same “genes-mirror-geography” pattern that was seen in the Ancient-palette analysis. PC1 vs PC2 tells the same story – and stronger, since this is a high variance-explained PC – for the UK, Poland and Italy.
Evidence for Pictish Genomes
The four ancient genomes of Orkney individuals with little Scandinavian ancestry may be the first ones of Pictish people published to date. Yet a similar (>80% “UK” ancestry) individual was found in Ireland (VK545) and five in Scandinavia, implying that Pictish populations were integrated into Scandinavian culture by the Viking Age.
Our interpretation for the Orkney samples can be summarised as follows. Firstly, they represent “native British” ancestry, rather than an unusual type of Scandinavian ancestry. Secondly, that this “British” ancestry was found in Britain before the Anglo-Saxon migrations. Finally, that in Orkney, these individuals would have descended from Pictish populations.
(…) ‘UK’ represents a group from which modern British and Irish people all receive an ancestry component. This information together implies that within the sampling frame of our data, they are proxying the ‘Briton’ component in UK ancestry; that is, a pre-Roman genetic component present across the UK. Given they were found in Orkney, this makes it very likely that they were descended from a Pictish population.
Modern genetic variation within the UK sees variation between ‘native Briton’ populations Wales, Scotland, Cornwall and Ireland as large compared to that within the more ‘Anglo-Saxon’ English. This is despite subsequent gene flow into those populations from English-like populations. We have not attempted to disentangle modern genetic drift from historically distinct populations. Roman-era period people in England, Wales, Ireland and Scotland may not have been genetically close to these Orkney individuals, but our results show that they have a shared genetic component as they represent the same direction of variation.
As in the case of mitochondrial DNA, the overall distribution profile of the Y chromosomal haplogroups in the Viking Age samples was similar to that of the modern North European populations. The most frequently encountered male lineages were the haplogroups I1, R1b and R1a.
Haplogroup I (I1, I2)
The distribution of I1 in southern Scandinavia, including a sample from Zealand (VK532) ca. AD 100 (see Iron Age Y-DNA maps), proves that it had become integrated into the West Germanic population already before their expansions, something that we already suspected thanks to the sampling of Germanic tribes.
Haplogroup R1b (M269, U106, P312)
Especially interesting is the finding of R1b-L151 widely distributed in the historical Nordic Bronze Age region, which is in line with the estimated TMRCA for R1b-P312 subclades found in Scandinavia, despite the known bottleneck among Germanic peoples under U106. Particularly telling in this regard is the finding of rare haplogroups R1b-DF19, R1b-L238, or R1b-S1194. All of that points to the impact of Bell Beaker-derived peoples during the Dagger period, when Pre-Proto-Germanic expanded into Scandinavia.
Also interesting is the finding of hg. R1b-P297 in Troms, Norway (VK531) ca. 2400 BC. R1b-P297 subclades might have expanded to the north through Finland with post-Swiderian Mesolithic groups (read more about Scandinavian hunter-gatherers), and the ancestry of this sample points to that origin.
However, it is also known that ancestry might change within a few generations of admixture, and that the transformation brought about by Bell Beakers with the Dagger Period probably reached Troms, so this could also be an R1b-M269 subclade. In fact, the few available data from this sample show that it comes from the natural harbour Skarsvågen at the NW end of the island Senja, and that its archaeologist thought it was from the Viking period or slightly earlier, based on the grave form. From Prescott (2017):
In 1995, Prescott and Walderhaug tentatively argued that a dramatic transformation took place in Norway around the Late Neolithic (2350 BCE), and that the swift nature of this transition was tied to the initial Indo-Europeanization of southern and coastal Norway, at least to Trøndelag and perhaps as far north as Troms. (…)
The Bell Beaker/early Late Neolithic, however, represents a source and beginning of these institutions and practices, exhibits continuity to the following metal age periods and integrated most of Northern Europe’s Nordic region into a set of interaction fields. This happened around 2400 BCE, at the MNB to LN transition.
NOTE. This particular sample is not included in the maps of Viking haplogroups.
Among the ancient samples, the haplogroups of two individuals were identified as derived E1b1b1-M35.1 lineages, which are frequently encountered in modern southern Europe, the Middle East and North Africa. Interestingly, the individuals carrying these haplogroups had much less Scandinavian ancestry compared to most samples, as inferred from haplotype-based analysis. A similar pattern was also observed for less frequent haplogroups in our ancient dataset, such as G (n=3), J (n=3) and T (n=2), indicating a possible non-Scandinavian male genetic component in the Viking Age Northern Europe. Interestingly, individuals carrying these haplogroups were from the later Viking Age (10th century and younger), which might indicate some male gene influx into the Viking population during the Viking period.
As the paper says, the small sample size of rare haplogroups makes it impossible to tell whether these differences are statistically significant. Nevertheless, both E1b samples have substantial Modern Polish-like ancestry: one sample from Gotland (VK474), of hg. E1b-L791, has ca. 99% “Polish” ancestry, while the other one from Denmark (VK362), of hg. E1b-V13, has ca. 35% “Polish”, ca. 35% “Italian”, as well as some “Danish” (14%) and minor “British” and “Finnish” ancestry.
Given the E1b-V13 samples of likely Central-East European origin among Lombards, Visigoths, and especially among Early Slavs, and the distribution of “Polish” ancestry among Viking samples, VK362 is probably a close description of the typical ancestry of early Slavs. The peak of Modern Polish-like ancestry around the Upper Pripyat during the (late) Viking Age suggests that Poles (like East Slavs) have probably mixed since the 10th century with more eastern peoples close to north-eastern Europeans, derived from ancient Finno-Ugrians:
Similarly, the finding of R1a-M458 among Vikings in Funen, Denmark (VK139), in Lutsk, Ukraine (VK541), and in Kurevanikha, Russia (VK160), apart from the early Slav from Usedom, may attest to the origin of the spread of this haplogroup in the western Baltic after the Bell Beaker expansion, once integrated in both Germanic and Balto-Slavic populations, as well as intermediate Bronze Age peoples that were eventually absorbed by their expansions. This contradicts, again, my simplistic initial assessment of R1a-M458 expansion as linked exclusively (or even mainly) to Balto-Slavs.
As I said 6 months ago, 2019 is a tough year to write a blog, because this was going to be a complex regional election year and therefore a time of political promises, hence tenure offers too. Now the preliminary offers have been made, elections have passed, but the timing has slightly shifted toward 2020. So I may have the time, but not really any benefit of dedicating too much effort to the blog, and a lot of potential benefit of dedicating any time to evaluable scientific work.
On the other hand, I saw some potential benefit for publishing texts with ISBNs, hence the updates to the text and the preparation of these printed copies of the books, just in case. While Spain’s accreditation agency has some hard rules for becoming a tenured professor, especially for medical associates (whose years of professional experience are almost worthless compared to published peer-reviewed papers), it is quite flexible in assessing one’s merits.
However, regional and/or autonomous entities are not, and need an official identifier and preferably printed versions to evaluate publications, such as an ISBN for books. I thus took some time about a month ago to update the texts and supplementary materials, and to publish a printed copy of the books with Amazon. The first copies have arrived, and they look good.
Corrections and Additions
I have changed the names and order of the books, as I intended for the first publication – as some of you may have noticed when the linguistic book was referred to as the third volume in some parts. In the first concept I just wanted to emphasize that the linguistic work had priority over the rest. Now the whole series and the linguistic volume don’t share the same name, and I hope this added clarity is for the better, despite the linguistic volume being the third one.
I have changed the nomenclature for Uralic dialects, as I said recently. I haven’t really modified anything deeper than that, because – unlike adding new information from population genomics – this would require me to do thorough research on the most recent publications of Uralic comparative grammar, and I just can’t begin with that right now.
Anyway, the use of terms like Finno-Ugric or Finno-Samic is as correct now for the reconstructed forms as it was before the change in nomenclature.
The most interesting recent genetic data has come from Iberia and the Mediterranean. Lacking direct data from the Italian Peninsula (and thus from the emergence of the Etruscan and Rhaetian ethnolinguistic community), it is becoming clearer how some quite early waves of Indo-Europeans and non-Indo-Europeans expanded and shrank – at least in West Iberia, West Mediterranean, and France.
Some of the main updates to the text have been made to the sections on Finno-Ugric populations, because some interesting new genetic data (especially Y-DNA) have been published in the past months. This is especially true for Baltic Finns and for Ugric populations.
Consequently, and somewhat unsurprisingly, the Balto-Slavic section has been affected by this; e.g. by the identification of Early Slavs likely with central-eastern populations dominated by (at least some subclades of) hg. I2a-L621 and E1b-V13.
I have updated some cultural borders in the prehistoric maps, and the maps with Y-DNA and mtDNA. I have also added one new version of the Early Bronze Age map, to better reflect the most likely location of Indo-European languages in the Early European Bronze Age.
As those in software programming will understand, major changes in the files that are used for maps and graphics come with an increasing risk of additional errors, so I would not be surprised if some major ones were found (I already spotted three of them). Feel free to communicate these errors in any way you see fit.
I have selected more conservative SNPs in certain controversial cases.
I have also deleted most SNP-related footnotes and replaced them with the marking of each individual tentative SNP, leaving only those footnotes that give important specific information, because:
My way of referencing tentative SNP authors did not make it clear which samples were tentative, if there were more than one.
It was probably not necessary to see four names repeated 100 times over.
Often I don’t really know if the person I have listed as author of the SNP call is the true author – unless I saw the full SNP data posted directly – or just someone who reposted the results.
Sometimes there is more than one author of SNPs for a certain sample, but I might have added just one for all.
For a centralized file to host the names of those responsible for the unofficial/tentative SNPs used in the text – and to correct them if necessary – readers will eventually be able to use Phylogeographer’s tool for ancient Y-DNA, for which they use (partly) the same data I compiled, adding Y-Full’s nomenclature and references. You can see another map tool in ArcGIS.
NOTE. As I say in the text, if the final working map tool does not deliver the names, I will publish another supplementary table to the text, listing all tentative SNPs with their respective author(s).
If you are interested in ancient Y-DNA and you want to help develop comprehensive and precise maps of ancient Y-DNA and mtDNA haplogroups, you can contact Hunter Provyn at Phylogeographer.com. You can also find more about phylogeography projects at Iain McDonald’s website.
I previously used certain samples prepared by amateurs from BAM files (like Botai, Okunevo, or Hittites), and the results were obviously less than satisfactory – hence my criticism of the lack of publication of prepared files by the most famous labs, especially the Copenhagen group.
Fortunately for all of us, most published datasets are free, so we don’t have to reinvent the wheel. I criticized genetic labs for not releasing all data, so now it is time for praise, at least for one of them: thank you to all responsible at the Reich Lab for this great merged dataset, which includes samples from other labs.
NOTE. I would like to make my tiny contribution here, for beginners interested in working with these files, so I will update – whenever I have time – the “How To” sections of this blog for PCAs, PCA3d, and ADMIXTURE.
For unsupervised ADMIXTURE in the maps, a K=5 is selected based on the CV, giving a kind of visual WHG : NWAN : CHG/IN : EHG : ENA, but with Steppe ancestry “in between”. Higher K gave worse CV, which I guess depends on the many ancient and modern samples selected (and on the fact that many samples are repeated from different sources in my files, because I did not have time to filter them all individually).
I found some interesting component shared by Central European populations in K=7 to K=9 (from CEU Bell Beakers to Denmark LN to Hungarian EBA to Iberia BA, in a sort of “CEU BBC ancestry” potentially related to North-West Indo-Europeans), but still, I prefer to go for a theoretically more correct visualization instead of cherry-picking the ‘best-looking’ results.
Since I made fun of the search for “Siberian ancestry” in coloured components in Tambets et al. 2018, I have to be consistent and prefer to avoid doing the same here…
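NOTE. For beginners, the CV-based choice of K described above can be sketched in a few lines of Python. This is only a minimal illustration, not the script used for the maps: it assumes ADMIXTURE was run with the --cv flag for each K, so that each log contains a line of the form "CV error (K=5): 0.41230". The function names and the error values in the example log are invented for illustration.

```python
import re

# With --cv, ADMIXTURE prints one line per run, e.g. "CV error (K=5): 0.41230".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text):
    """Return {K: cv_error} for every CV line found in the log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


# Hypothetical values, invented for illustration (not taken from any real run).
example_log = """
CV error (K=3): 0.41950
CV error (K=4): 0.41510
CV error (K=5): 0.41230
CV error (K=6): 0.41380
CV error (K=7): 0.41620
"""

if __name__ == "__main__":
    errors = parse_cv_errors(example_log)
    for k in sorted(errors):
        print(k, errors[k])
    print("Lowest CV error at K =", best_k(errors))
```

In practice the log text would come from concatenating the output files of the different runs. A lower CV error only says which K predicts held-out genotypes best for that particular sample set, so it says nothing about whether the resulting components are historically meaningful.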
In the first publication (in January) and subsequent minor revisions until March, I trusted analyses and ancestry estimates reported by amateurs in 2018, which I used for the text adding my own interpretations. Most of them have been refuted in papers from 2019, as you probably know if you have followed this blog (see very recent examples here, here, or here), compelling me to delete or change them again, and again, and again. I don’t have experience from previous years, although the current pattern must have been evidently repeated many times over, or else we would still be talking about such previous analyses as being confirmed today…
I wanted to be one step ahead of peer-reviewed publications in the books, but I now prefer to go for something safe in the book series, rather than having one potentially interesting prediction – which may or may not be right – and ten huge mistakes that I would have helped to endlessly redistribute among my readers (online and now in print) based on some cherry-picked pairwise comparisons. This is especially true when predictions of “Steppe”- and/or “Siberian”-related ancestry have been published, which, for some reason, seem to go horribly wrong most of the time.
I am sure whole books can be written about why and how this happened (and how this is going to keep happening), based on psychology and sociology, but the reasons are irrelevant, and that would be a futile effort; like writing books about glottochronology and its intermittent popularity due to misunderstood scientific trends. The most efficient way to deal with this problem is to avoid such information altogether, because – as you can see in the current revised text – it wouldn’t really add anything essential to the content of these books, anyway.
Interesting excerpts (emphasis in bold; modified for clarity):
Balearic Islands: The expansion of Iberian speakers
Mallorca_EBA dates to the earliest period of permanent occupation of the islands at around 2400 BCE. We parsimoniously modeled Mallorca_EBA as deriving 36.9 ± 4.2% of her ancestry from a source related to Yamnaya_Samara; (…). We next used qpAdm to identify “proximal” sources for Mallorca_EBA’s ancestry that are more closely related to this individual in space and time, and found that she can be modeled as a clade with the (small) subset of Iberian Bell Beaker culture associated individuals who carried Steppe-derived ancestry (p=0.442).
Suppl. Materials: The model used was with Bell_Beaker_Iberia_highsteppe, a group of outliers from Iberia buried in a Bell Beaker mortuary context who unlike most individuals from this context in that region had high proportions of Steppe ancestry (p=0.442).
Our estimates of Steppe ancestry in the two later Balearic Islands individuals are lower than the earlier one: 26.3 ± 5.1% for Formentera_MBA and 23.1 ± 3.6% for Menorca_LBA, but the Middle to Late Bronze Age Balearic individuals are not a clade relative to non-Balearic groups. Specifically, we find that f4(Mbuti.DG, X; Formentera_MBA, Menorca_LBA) is positive when X=Iberia_Chalcolithic (Z=2.6) or X=Sardinia_Nuragic_BA (Z=2.7). While it is tempting to interpret the latter statistic as suggesting a genetic link between peoples of the Talaiotic culture of the Balearic islands and the Nuragic culture of Sardinia, the attraction to Iberia_Chalcolithic is just as strong, and the mitochondrial haplogroup U5b1+16189+@16192 in Menorca_LBA is not observed in Sardinia_Nuragic_BA but is observed in multiple Iberia_Chalcolithic individuals. A possible explanation is that both the ancestors of Nuragic Sardinians and the ancestors of Talaiotic people from the Balearic Islands received gene flow from an unsampled Iberian Chalcolithic-related group (perhaps a mainland group affiliated to both) that did not contribute to Formentera_MBA.
This sample, like another one in El Argar, is of hg. R1b-P312. So there you are, the data that connects the Proto-Iberian expansion (replacing IE-speaking Bell Beakers) to the Iberian Chalcolithic population, signaled by the increase in Iberian Chalcolithic ancestry after the arrival of Bell Beakers, most likely connected originally to the Argaric and post-Argaric expansions during the MBA.
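NOTE. To put the numbers in the excerpts above in perspective, here is a minimal Python sketch that (1) converts the reported f4 Z-scores into two-sided p-values and (2) compares two qpAdm Steppe estimates using their standard errors. It assumes a normal approximation and treats the two estimates as independent, which is a simplification since they share reference populations. The function names are my own and not part of any published tool.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a Z-score under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


def diff_z(est1, se1, est2, se2):
    """Z-score of the difference between two estimates.

    Assumption: the two estimates are independent, so their variances add.
    """
    return (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)


if __name__ == "__main__":
    # f4(Mbuti.DG, X; Formentera_MBA, Menorca_LBA), Z-scores as reported.
    for label, z in [("Iberia_Chalcolithic", 2.6), ("Sardinia_Nuragic_BA", 2.7)]:
        print(label, "Z =", z, "p =", round(two_sided_p(z), 4))

    # Steppe ancestry in percent: Mallorca_EBA vs. the two later individuals.
    print("EBA vs Formentera_MBA:", round(diff_z(36.9, 4.2, 26.3, 5.1), 2))
    print("EBA vs Menorca_LBA:", round(diff_z(36.9, 4.2, 23.1, 3.6), 2))
```

Both f4 statistics fall below the |Z| > 3 threshold often used as a rule of thumb in this kind of analysis, and they are of similar strength, which is the point made in the excerpt when it says that the attraction to Iberia_Chalcolithic is just as strong as the one to Sardinia_Nuragic_BA.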
Steppe in Sardinia IA: Phocaeans from Italy?
Most Sardinians buried in a Nuragic Bronze Age context possessed uniparental haplogroups found in European hunter-gatherers and early farmers, including Y-haplogroup R1b1a[xR1b1a1a] which is different from the characteristic R1b1a1a2a1a2 spread in association with the Bell Beaker complex. An exception is individual I10553 (1226-1056 calBCE) who carried Y-haplogroup J2b2a, previously observed in a Croatian Middle Bronze Age individual bearing Steppe ancestry, suggesting the possibility of genetic input from groups that arrived from the east after the spread of first farmers. This is consistent with the evidence of material culture exchange between Sardinians and mainland Mediterranean groups, although genome-wide analyses find no significant evidence of Steppe ancestry so the quantitative demographic impact was minimal.
Another interesting piece of data: these (Mesolithic) remnant R1b-V88 lineages are closely related to the Italian Peninsula, the most likely region of expansion of these lineages into Africa, in turn possibly connected to the expansion of Proto-Afroasiatic.
We detect definitive evidence of Iranian-related ancestry in an Iron Age Sardinian I10366 (391-209 calBCE) with an estimate of 11.9 ± 3.7% Iran_Ganj_Dareh_Neolithic related ancestry, while rejecting the model with only Anatolian_Neolithic and WHG at p=0.0066 (Supplementary Table 9). The only model that we can fit for this individual using a pair of populations that are closer in time is as a mixture of Iberia_Chalcolithic (11.9 ± 3.2%) and Mycenaean (88.1 ± 3.2%) (p=0.067). This model fits even when including Nuragic Sardinians in the outgroups of the qpAdm analysis, which is consistent with the hypothesis that this individual had little if any ancestry from earlier Sardinians.
Sicily EBA: The Lusitanian/Ligurian connection?
(…) While a previously reported Bell Beaker culture-associated individual from Sicily had no evidence of Steppe ancestry, (…) we find evidence of Steppe ancestry in the Early Bronze Age by ~2200 BCE. In distal qpAdm, the outlier Sicily_EBA11443 is parsimoniously modeled as harboring 40.2 ± 3.5% Steppe ancestry, and the outlier Sicily_EBA8561 is parsimoniously modeled as harboring 23.3 ± 3.5% Steppe ancestry. (…) The presence of Steppe ancestry in Early Bronze Age Sicily is also evident in Y chromosome analysis, which reveals that 4 of the 5 Early Bronze Age males had Steppe-associated Y-haplogroup R1b1a1a2a1a2. (Online Table 1). Two of these were Y-haplogroup R1b1a1a2a1a2a1 (Z195) which today is largely restricted to Iberia and has been hypothesized to have originated there 2500-2000 BCE. This evidence of west-to-east gene flow from Iberia is also suggested by qpAdm modeling where the only parsimonious proximate source for the Steppe ancestry we found in the main Sicily_EBA cluster is Iberians.
What’s this? An ancestral connection between Sicel/Elymian and Galaico-Lusitanian or Ligurian (based on an origin in NE Iberia)? Impossible to say, especially if the languages of these early settlers were replaced later by non-Indo-European speakers from the eastern Mediterranean, and by Indo-European speakers from the mainland closely related to Proto-Italic during the LBA, but see below.
Regarding the comment on R1b-Z195, it is associated with modern Iberians, as DF27 in general, due to founder effects beyond the Pyrenees. It is a very old subclade, split directly from DF27 roughly at the same time as it split from the parent P312, i.e. it can be found anywhere in Europe, and it almost certainly accompanied the expansion of Celts from Central Europe under the subclade R1b-M167/SRY2627.
The connection is thus strong only because of the qpAdm modeling, since R1b-DF27 and subclade R1b-Z195 are certainly lineages expanded quite early, most likely with Yamna settlers in Hungary and East Bell Beakers.
We detect Iranian-related ancestry in Sicily by the Middle Bronze Age 1800-1500 BCE, consistent with the directional shift of these individuals toward Mycenaeans in PCA. Specifically, two of the Middle Bronze Age individuals can only be fit with models that in addition to Anatolia_Neolithic and WHG, include Iran_Ganj_Dareh_Neolithic. The most parsimonious model for Sicily_MBA3125 has 18.0 ± 3.6% Iranian-related ancestry (p=0.032 for rejecting the alternative model of Steppe rather than Iranian-related ancestry), and the most parsimonious model for Sicily_MBA has 14.9 ± 3.9% Iranian-related ancestry (p=0.037 for rejecting the alternative model).
The modern southern Italian Caucasus-related signal identified in Raveane et al. (2018) is plausibly related to the same Iranian-related spread of ancestry into Sicily that we observe in the Middle Bronze Age (and possibly the Early Bronze Age).
The non-Indo-European Sicanians and Elymians were possibly then connected to eastern Mediterranean groups before the expansion of the Sea Peoples.
For the Late Bronze Age group of individuals, qpAdm documented Steppe-related ancestry, modeling this group as 80.2 ± 1.8% Anatolia_Neolithic, 5.3 ± 1.6% WHG, and 14.5 ± 2.2% Yamnaya_Samara. Our modeling using sources more closely related in space and time also supports Sicily_LBA having Minoan-related ancestry or being derived from local preceding populations or individuals with ancestries similar to those of Sicily_EBA3123 (p=0.527), Sicily_MBA3124 (p=0.352), and Sicily_MBA3125 (p=0.095).
This increase in Steppe-related ancestry in a western site during the LBA most likely represents either an expansion from the Aegean or – maybe more likely, given the archaeological finds – a regional population similar to Sicily EBA re-emerging or rather being displaced from the eastern part of the island because of a westward movement from nearby Calabria.
Whether this population sampled spoke Indo-European or not at this time is questionable, since the Iron Age accounts show non-IE Elymians in this region.
Actually, Elymians seem to have spoken Indo-European, which fits well with the increase in steppe ancestry.
EDIT (21 MAR): Interesting about a proposed incoming Minoan-like ancestry is the potential origin of the Iran Neolithic-related ancestry that is going to appear in Central Italy during the LBA. This could then be potentially associated with Tyrsenians passing through the area, although the traditional description may be more compatible with an arrival of Sea Peoples from the Adriatic.
Sad to read this:
This manuscript is dedicated to the memory of Sebastiano Tusa of the Soprintendenza del Mare in Palermo, who would have been an author of this study had he not tragically died in the crash of Ethiopia Airlines flight 302 on March 10.
Given my reduced free time in these months, I have decided to keep updating the text on Indo-European and Uralic migrations and/or this blog, simultaneously or alternatively, to make the most out of the time I can dedicate to this. I will add the different ‘A Song of Sheep and Horses (ASoSaH) reread’ posts to the original post announcing the books. I would be especially interested in comments and corrections to the book chapters rather than the posts, but any comments are welcome (including in the forum, where comments are more likely to stick).
Luckily enough – for those of us who want precise answers to our previous infinite models of Indo-European language expansions (viz. GAC-associated expansion, IE-speaking Old Europe, Anatolian homeland, Iran homeland, Maykop as Proto-Anatolian, Palaeolithic Continuity Theory, Celtic in the Atlantic façade, etc.) – the situation has been more clear-cut than expected: it turns out that, especially during population expansions, acute Y-chromosome bottlenecks were very common in the past, at least until the Iron Age.
Khvalynsk and Repin-Yamna expansions were no different, and that seems quite natural in hindsight, given the strong familial ties and aversion to foreigners proper of the Late Proto-Indo-European society and culture – probably not really that different from other contemporary societies, like the neighbouring Late Proto-Uralic or Trypillian ones.
During the expansion of early Khvalynsk, the most likely Indo-Anatolian culture, the society of the Don-Volga area was probably made up of different lineages including R1b-V1636, R1b-M269, R1a-YP1272, Q1a-M25, and I2a-L699 (and possibly some R1b-V88?), a variability possibly greater than that of the contemporary north Pontic area, probably a sign of this region being a sink of different east and west migrations from steppe and forest areas.
During its expansion, the Khvalynsk society saw its haplogroup variability reduced, as evidenced by the succeeding expansive Repin culture:
Afanasevo, representing Pre-Tocharian (the earliest Late PIE dialect to branch off), expanded with R1b-L23 – especially R1b-Z2103 – lineages, while early Yamna expanded with R1b-L23 and I2a-L699 lineages, which suggests that these are the main haplogroups that survived the Y-DNA bottleneck undergone during the Khvalynsk expansion, and especially later during the late Repin expansion. Nevertheless, other old haplogroups might still pop up during the Repin and early Yamna period, such as the R1b-V1636 sample from Yamna in the Northern Caucasus.
It is still unclear if R1b-L23 sister clade R1b-PF7562 (formed ca. 4400 BC, TMRCA ca. 3400 BC), prevalent among modern Albanians, expanded with Yamna migrants, or if it was part of an earlier expansion of R1b-M269 into the Balkans, and represent thus Indo-Anatolian speakers who later hitchhiked the expansion of the Late PIE language from the north or west Pontic area. The early TMRCA seems to suggest an association with Repin (and therefore Yamna), rather than later movements in the Balkans.
‘Yamnaya’ or ‘steppe’ ancestry?
After the early years when population genetics relied mainly on modern Y-DNA haplogroups, geneticists and amateurs have recently been playing around with testing “ancestry percentages”, based on newly developed free statistical tools, which obviously offer just one among many types of data to achieve a proper interpretation of the past.
Today we have quite a lot of Y-DNA haplogroups reported for ancient samples of more recent prehistoric periods, and they seem to offer (at least since the 2015 papers, but more evidently since the 2018 papers on Bell Beakers and Europeans, Corded Ware, or Fennoscandia among others) the most straightforward interpretation of all results published in population genomics research.
NOTE. The finding of a specific type of ancestry in one isolated 40,000-year-old sample from Tianyuan can offer very interesting information on potential population movements to the region. However, the identification of ethnolinguistic communities and their migrations among neighbouring groups in Neolithic or Bronze Age groups is evidently not that simple.
It is becoming more and more clear with each paper that the true “Yamnaya ancestry” – not the originally described one – was in fact associated with Indo-Europeans (see more on the very Yamnaya-like Yamna Hungary and early East Bell Beaker R1b samples, all of quite similar ancestry and PCA cluster before their further admixture with EEF- and CWC-like groups).
The so-called “steppe ancestry”, on the other hand, reflects the contribution of a Northern Caucasus-related ancestry to expanding Khvalynsk settlers, who spread through the steppes more than a thousand years before the expansion of Late Proto-Indo-Europeans with late Repin, and can thus be found among different groups related to the Pontic-Caspian steppes (see more on the emergence and evolution of “steppe ancestry”).
In fact, after the Yamna/Indo-European and Corded Ware/Uralic expansions, it is more likely to find “steppe ancestry” to the north and east in territories traditionally associated with Uralic languages, whereas to the south and west – i.e. in territories traditionally associated with Indo-European languages – it is more likely to find “EEF ancestry” with diminished “steppe ancestry”, among peoples patrilineally descended from Yamna settlers.
Y-DNA haplogroups, the only uniparental markers (see exceptions in mtDNA inheritance) – unlike ancestry percentages based on the comparison of a few samples and flawed study designs – do not admix, do not change, and therefore they do not lend themselves to infinite pet theories (see e.g. what David Reich has to say about R1b-P312 in Iberia directly derived from Yamna migrants in spite of their predominant EEF ancestry): their cultural continuity can only be challenged with carefully threaded linguistic, archaeological, and genetic data.
Although there has been considerable debate about whether paternal mitochondrial DNA (mtDNA) transmission may coexist with maternal transmission of mtDNA, it is generally believed that mitochondria and mtDNA are exclusively maternally inherited in humans. Here, we identified three unrelated multigeneration families with a high level of mtDNA heteroplasmy (ranging from 24 to 76%) in a total of 17 individuals. Heteroplasmy of mtDNA was independently examined by high-depth whole mtDNA sequencing analysis in our research laboratory and in two Clinical Laboratory Improvement Amendments and College of American Pathologists-accredited laboratories using multiple approaches. A comprehensive exploration of mtDNA segregation in these families shows biparental mtDNA transmission with an autosomal dominantlike inheritance mode. Our results suggest that, although the central dogma of maternal inheritance of mtDNA remains valid, there are some exceptional cases where paternal mtDNA could be passed to the offspring. Elucidating the molecular mechanism for this unusual mode of inheritance will provide new insights into how mtDNA is passed on from parent to offspring and may even lead to the development of new avenues for the therapeutic treatment for pathogenic mtDNA transmission.
Compared with Family A, a strikingly similar mtDNA transmission pattern was demonstrated in Families B and C. Taking Family B for illustration, II-3 having 29 heteroplasmic and seven homoplasmic variants should have inherited mtDNA from both his father (I-1, haplogroup of K1b2a) and his mother (I-10, haplogroup of H), who were supposed to possess 34 and nine homoplasmic variants, respectively. II-3 further transmitted his mtDNA that he inherited from I-1 to his son (III-2), who also inherited all of his mother’s mtDNA (II-30, carrying 34 variants and a haplogroup of T2a1a). However, III-2’s sister (III-1) and half-brother (III-5) only inherited the maternal mtDNA. Fresh blood sampling and repeated mtDNA sequencing in a second independent laboratory were also performed to rule out the possibility of sample mix-up for III-2 (III-2, column F-G and H-I). Additionally, these samples were further verified using Pacific Bio single molecular sequencing (see Materials and Methods) and by restriction fragment length polymorphism (RFLP) analysis of Family A, and these results were fully consistent with the previous sequencing.
A Resurgence of the Paternal Transmission Hypothesis
The results presented in this paper make a robust case for paternal transmission of mtDNA. Here, we report biparental mtDNA inheritance (either directly or indirectly) in 17 members in three multigeneration families. Thirteen of these individuals were identified directly by sequencing of the mitochondrial genome, whereas four could be inferred based on preexisting maternal heteroplasmy caused by biparental inheritance in the previous generation.
To further confirm these remarkable results and to exclude the possibility of sample mix-up and/or contamination, the whole mtDNA sequencing procedure was repeated independently in at least two different laboratories by different laboratory technicians with newly obtained blood samples. All results were reproducible, indicating no artifacts or contamination exist. More importantly, the multiple mtDNA variants that were paternally transmitted differ in both number and position among each of these three families as well as the related haplogroup (R0a1 in Family A, K1b2a in Family B, and K2b1a1a in Family C, respectively), providing two distinct forms of evidence supporting transmission of the paternal mtDNA.
Therefore, we have unequivocally demonstrated the existence of biparental mtDNA inheritance as evidenced by the high number and level of mtDNA heteroplasmy in these three unrelated multigeneration families. Most interestingly, the mixed haplogroups in these samples are very reminiscent of the mixed haplogroups found in the 20 studies that were dismissed by Bandelt et al. as due to contamination or sample mix-up. One is forced to wonder how many other instances of individuals with biparental mtDNA inheritance have been dismissed as technical errors, and whether Schwartz and Vissing’s original discovery has really been given the proper follow-up that it deserves. We suspect that these results will initiate a broader reassessment of the topic.
We propose that the paternal mtDNA transmission in these families should be accompanied by segregation of a mutation in one nuclear gene involved in paternal mitochondrial elimination and that there is a high probability that the gene in question operates through one of the pathways identified above.
If I have to be honest, I was stuck with the paternal transmission hypothesis which we were taught in class long ago. I didn’t know it was controversial or dismissed, I just thought it was really exceptional, and I never thought about learning more on the subject.
This paper proves it may be more complicated than that, especially for population genomics purposes, because biparental mtDNA transmission with autosomal dominant-like inheritance puts a serious barrier to a general, simplistic interpretation of mtDNA.
I don’t think it is a blow to all interpretations based on mtDNA, though, because the traditional interpretation should often work statistically. However, one has to be always very careful when saying “if it’s mtDNA from region X, it’s about female exogamy”, especially when samples are from neighbouring regions and similar periods.
The term “uniparental marker” for mtDNA is obviously misleading and shouldn’t be used, and many research papers and interpretations taking mtDNA as strictly uniparental should be taken with a pinch of salt.
Interesting excerpts (emphasis mine; most internal references removed):
The earliest, most secure archaeological evidence of human occupation of the region comes from the artefact-rich, high-latitude (~70° N) Yana RHS site dated to ~31.6 kya (…)
The Yana RHS human remains represent the earliest direct evidence of human presence in northeastern Siberia, a population we refer to as “Ancient North Siberians” (ANS). Both Yana RHS individuals were unrelated males, and belong to mitochondrial haplogroup U, predominant among ancient West Eurasian hunter-gatherers, and to Y chromosome haplogroup P1, ancestral to haplogroups Q and R, which are widespread among present-day Eurasians and Native Americans.
Symmetry tests using f4 statistics reject tree-like clade relationships with both Early West Eurasians (EWE; Sunghir) and Early East Asians (EEA; Tianyuan); however, Yana is genetically closer to EWE, despite its geographic location in northeastern Siberia
Using admixture graphs (qpGraph) and outgroup-based estimation of mixture proportions (qpAdm), we find that Yana can be modelled as EWE with ~25% contribution from EEA
Among all ancient individuals, Yana shares the most genetic drift with Mal’ta, and f4 statistics show that Mal’ta shares more alleles with Yana than with EWE (e.g. f4(Mbuti,Mal’ta;Sunghir,Yana) = 0.0019, Z = 3.99). Mal’ta and Yana also exhibit a similar pattern of genetic affinities to both EWE and EEA, consistent with previous studies. The ANE lineage can thus be considered a descendant of the ANS lineage, demonstrating that by 31.6 kya early representatives of this lineage were widespread across northern Eurasia, including far northeastern Siberia.
(…) the 9.8 kya Kolyma1 individual, representing a group we term “Ancient Paleosiberians” (AP). Our results indicate that AP are derived from a first major genetic shift observed in the region. Principal component analysis (PCA), outgroup f3-statistics and mtDNA and Y chromosome haplogroups (G1b and Q1a1a, respectively) demonstrate a close affinity between AP and present-day Koryaks, Itelmen and Chukchis, as well as with Native Americans.
For both AP and Native Americans, ANS ancestry appears more closely related to Mal’ta than Yana, therefore rejecting a direct contribution of Yana to later AP or Native American groups.
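NOTE. For readers unfamiliar with the f4 statistics quoted in these excerpts, the following Python sketch shows what is being computed: f4(A,B;C,D) is the average over SNPs of (a − b)(c − d), where a, b, c, d are the allele frequencies in each population, and the Z-score is the estimate divided by a block-jackknife standard error. This is a simplified illustration with equal-sized blocks and invented toy frequencies. Published analyses use dedicated software with a weighted jackknife over genomic blocks, so the function names and data below are assumptions for illustration only.

```python
import math


def f4(a, b, c, d):
    """Mean over SNPs of (a - b) * (c - d), given per-SNP allele frequencies."""
    terms = [(pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)]
    return sum(terms) / len(terms)


def drop_block(freqs, lo, hi):
    """Return the frequency list without the SNPs in positions lo..hi-1."""
    return freqs[:lo] + freqs[hi:]


def f4_block_jackknife(a, b, c, d, n_blocks):
    """Return (estimate, standard error, Z) from a delete-one-block jackknife.

    Simplification: blocks are contiguous and equal-sized; trailing SNPs
    that do not fill a whole block are ignored.
    """
    size = len(a) // n_blocks
    n = size * n_blocks
    a, b, c, d = a[:n], b[:n], c[:n], d[:n]
    estimate = f4(a, b, c, d)

    leave_one_out = []
    for i in range(n_blocks):
        lo, hi = i * size, (i + 1) * size
        leave_one_out.append(
            f4(drop_block(a, lo, hi), drop_block(b, lo, hi),
               drop_block(c, lo, hi), drop_block(d, lo, hi))
        )

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (t - mean_loo) ** 2 for t in leave_one_out
    )
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("inf")
    return estimate, se, z


# Toy allele frequencies for four populations at four SNPs (invented values).
toy_a = [0.1, 0.5, 0.2, 0.4]
toy_b = [0.3, 0.1, 0.3, 0.2]
toy_c = [0.2, 0.6, 0.1, 0.5]
toy_d = [0.4, 0.2, 0.3, 0.1]

if __name__ == "__main__":
    print(f4_block_jackknife(toy_a, toy_b, toy_c, toy_d, n_blocks=2))
```

Read the other way round, a reported statistic and its Z-score give the standard error: for f4(Mbuti,Mal’ta;Sunghir,Yana) = 0.0019 with Z = 3.99, the implied standard error is 0.0019 / 3.99, i.e. about 0.00048.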
Lake Baikal Neolithic – Bronze Age
(…) the newly reported genomes from Ust’Belaya and recently published neighbouring Neolithic and Bronze Age sites show a succession of three distinct genetic ancestries over a ~6 ky time span. The earliest individuals show predominantly East Asian ancestry, closely related to the ancient individuals from DGC. In the early Bronze Age (BA), we observe a resurgence of AP ancestry (up to ~50% ancestry fraction), as well as influence of West Eurasian Steppe ANE ancestry represented by the early BA individuals from Afanasievo in the Altai region (~10%). This is consistent with previous reports of gene flow from an unknown ANE-related source into Lake Baikal hunter-gatherers.
Our results suggest a southward expansion of AP as a possible source, which is also consistent with the replacement of Y chromosome lineages observed at Lake Baikal, from predominantly haplogroup N in the Neolithic to haplogroup Q in the BA. Finally, the most recent individual from Ust’Belaya, dated to ~600 years ago, falls along the Neosiberian cline, similar to the ~760 year-old ‘Young Yana’ individual from northeastern Siberia, demonstrating the widespread distribution of Neosiberian ancestry in the most recent epoch.
At the western edge of northern Eurasia, genetic and strontium isotope data from ancient individuals at the Levänluhta site documents the presence of Saami ancestry in Southern Finland in the Late Holocene 1.5 kya. This ancestry component is currently limited to the northern fringes of the region, mirroring the pattern observed for AP ancestry in northeastern Siberia. However, while the ancient Saami individuals harbour East Asian ancestry, we find that this is better modelled by DGC rather than AP, suggesting that AP influence was likely restricted to the eastern side of the Urals. Comparison of ancient Finns and Saami with their present-day counterparts reveals additional gene flow over the past 1.6 kya, with evidence for West Eurasian admixture into modern Saami. The ancient Finn from Levänluhta shows lower Siberian ancestry than modern Finns.
EDIT (27 OCT 2018): Comparing the three, I see that these samples (at least two of them) were already published in Lamnidis et al. (2018), but they come here with added (1) specific radiocarbon dates, (2) a comparison with Neosiberian populations, and (3) strontium isotope analyses.
Finnish_IA (ca. 350 AD) is probably a Saami-speaking individual, just like the Saami_IA with newly reported radiocarbon dates from Levänluhta ca. 400-600 AD (since Fennic peoples were then likely around the Gulf of Finland).
The conflicting strontium isotope data on marine dietary resources for certain samples in the supplementary material hint at a possible external origin of the diet of some of the previously reported (and possibly one newly reported) Saami Iron Age individuals, anywhere from some 25-30 km to the northwest along the river up to hundreds of km to the southwest of Levänluhta (i.e. the whole coast of the Bothnian Sea). It is unclear why they would prefer a dietary source in southern Baltic regions over one some km to the west, though, unless that is what they want to propose based on the sample’s admixture…
The coast of the Bothnian Sea (=the northern part of the Baltic Sea, between Sweden and Finland) lay only 25-30 km to the northwest, and accessible to the Iron Age people of the Levänluhta region via the Kyrönjoki river. (…) For individual JA2065/DA236, the low 87Sr/86Sr value (0.71078) would imply an exceptionally heavy reliance on Baltic Sea resources. The δ13C and δ15N values of the individual are near comparable (especially considering within-Baltic latitudinal gradients in δ13C; Torniainen et al. 2017) to the δ13C and δ15N values of a Middle Neolithic population on the Baltic island of Gotland (Eriksson, 2004) interpreted to have subsisted primarily on seals.
These new data on the samples give us some more information than we already had, because the early date of Finnish_IA implies that there was little East Asian admixture (if any at all) in west Finland during the Roman Iron Age, which pushes still later in time the expected appearance of Siberian ancestry among Saamic (first) and Fennic populations (later). It is unclear whether the East Asian ancestry found in Finnish_IA is actually related to DGC, or whether it is rather related to the ENA-like ancestry already found in Baltic hunter-gatherers (i.e. in some EHG samples from Karelia), for which Baikal_EN is a good proxy in Lazaridis et al. (2018).
The paper thus finds increased (probably the actual) Siberian ancestry in modern Finns compared to this Iron Age Saami individual. Coupled with the later Saami Iron Age samples, from one to three centuries later, which show the start of the Siberian ancestry influx, we can begin to establish when the expansion of Siberian ancestry happened in central Finland, and thus quite likely when the Saami began to expand to the north and east and admix with Palaeo-Laplandic peoples.
One sample of haplogroup N1a1a1a1a4a1-M1982, Yana_MED, is found in the Arctic region (north-eastern Yakutia) ca. 1100 AD. Since it is derived from N1a1a1a1a-L392, it might be a surprise for some to find it in a clearly non-Uralic speaking environment at the same time other subclades of this haplogroup were admixing in the west with well-established Finno-Saamic, Volga-Finnic, Ugric, and Samoyedic populations…
On the growing doubts that these data – contradicting the CWC=IE theory – are creating among geneticists (from the supplementary materials):
The Proto-Saami language evolved in southern Finland and Karelia in the Early Iron Age, an area now host to Finnish and the closely related Karelian, but with Saami toponyms showing that the latter two languages are intrusive here (Saarikivi 2004). Saami-speaking populations are thought to have retreated to Lapland during the Middle Iron Age (300–800 AD), where it diverged into the modern Saami dialects. Genetically, the northward retreat of the Saami language correlates with the documented decrease of Saami ancestry in Southern Finland between the Iron Age and the modern period (cf. Lamnidis et al. 2018).
On the way to Lapland, the Saami replaced at least two linguistically obscure groups. This can be inferred from 1) an influx of non-Uralic loanwords into Proto-Saami in the Finnish Lakeland area, and 2) an influx of non-Uralic, non-Germanic words into Saami dialects in Lapland (Aikio 2012). Both of these borrowing events imply contact with non-Saami-speaking groups, e.g. non-Uralic-speaking hunter-gatherers that may have left a genetic and linguistic footprint on modern Saami populations.
The linguistic prehistory of Finland thus does not allow for a straightforward interpretation of the genetic data. The detection of East Asian ancestry in the genetically Saami individual is indicative of a population movement from the east (cf. Lamnidis et al. 2018, Rootsi et al. 2007), one that given the affinities with the ~7.6 ky old individuals from the Devil’s Gate Cave may have been a western extension of the Neosiberian turnover. However, it remains unclear whether this gene flow should be associated with the arrival of Uralic speakers, thus providing further support for a Uralic homeland in Eastern Eurasia, or with an earlier immigration of pre-Uralic, so-called “Paleo-Lakelandic” groups.
I think the genetic interpretation is already straightforward, though. We had a sneak peek at how this late admixture with non-Uralians (mainly Palaeo-Lakelandic and Palaeo-Laplandic peoples from Lovozero and related asbestos ware cultures) is going to unfold among expanding Saami-speaking populations thanks to Lamnidis et al. (2018):
Also, there is still no trace of R1a in far East Asia (it was reported as M17 ca. 5300 BC near Lake Baikal by Moussa et al. 2016), so I still have doubts about my previous assessment that R1a split into M17 (and thus also M417) in Siberia, among those who expanded hunter-gatherer pottery.