Oldest N1c1a1a-L392 samples and Siberian ancestry in Bronze Age Fennoscandia

Open access preprint at bioRxiv, Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe, by Lamnidis et al. (2018).

Abstract (emphasis mine):

European history has been shaped by migrations of people, and their subsequent admixture. Recently, evidence from ancient DNA has brought new insights into migration events that could be linked to the advent of agriculture, and possibly to the spread of Indo-European languages. However, little is known so far about the ancient population history of north-eastern Europe, in particular about populations speaking Uralic languages, such as Finns and Saami. Here we analyse ancient genomic data from 11 individuals from Finland and Northwest Russia. We show that the specific genetic makeup of northern Europe traces back to migrations from Siberia that began at least 3,500 years ago. This ancestry was subsequently admixed into many modern populations in the region, in particular populations speaking Uralic languages today. In addition, we show that ancestors of modern Saami inhabited a larger territory during the Iron Age than today, which adds to historical and linguistic evidence for the population history of Finland.

Interesting excerpts (edited):

While the Siberian genetic component described here was previously described in modern-day populations from the region, we gain further insights into its temporal depth. Our data suggest that this fourth genetic component found in modern-day north-eastern Europeans arrived in the area around 4,000 years ago at the latest, as illustrated by ALDER dating using the ancient genome-wide data from Bolshoy Oleni Ostrov. The upper bound for the introduction of this component is harder to estimate. The component is absent in the Karelian hunter-gatherers (EHG) dated to 8,300-7,200 yBP as well as Mesolithic and Neolithic populations from the Baltics from 8,300 yBP and 7,100-5,000 yBP respectively. While this suggests an upper bound of 5,000 yBP for the arrival of Siberian ancestry, we cannot exclude the possibility of its presence even earlier, yet restricted to more northern regions, as suggested by its absence in populations in the Baltic during the Bronze Age. Our study also presents the earliest occurrence of the Y-chromosomal haplogroup N1c in Fennoscandia. N1c is common among modern Uralic speakers, and has also been detected in Hungarian individuals dating to the 10th century, yet it is absent in all published Mesolithic genomes from Karelia and the Baltics.

The large Siberian component in the Bolshoy individuals from the Kola Peninsula provides the earliest direct genetic evidence for an eastern migration into this region. Such contact is well documented in archaeology, with the introduction of asbestos-mixed Lovozero ceramics during the second millenium BC, and the spread of even-based arrowheads in Lapland from 1,900 BCE. Additionally, the nearest counterparts of Vardøy ceramics, appearing in the area around 1,600-1,300 BCE, can be found on the Taymyr peninsula, much further to the east. Finally, the Imiyakhtakhskaya culture from Yakutia spread to the Kola Peninsula during the same period. Contacts between Siberia and Europe are also recognised in linguistics. The fact that the Siberian genetic component is consistently shared among Uralic-speaking populations, with the exceptions of Hungarians and the non-Uralic speaking Russians, would make it tempting to equate this component with the spread of Uralic languages in the area. However, such a model may be overly simplistic. First, the presence of the Siberian component on the Kola Peninsula at ca. 4000 yBP predates most linguistic estimates of the spread of Uralic languages to the area. Second, as shown in our analyses, the admixture patterns found in historic and modern Uralic speakers are complex and in fact inconsistent with a single admixture event. Therefore, even if the Siberian genetic component partly spread alongside Uralic languages, it likely presented only an addition to populations carrying this component from earlier.

Plot of ADMIXTURE (K=3) results containing West Eurasian populations and the Nganasan. Ancient individuals from this study are represented by thicker bars.

The novel genome-wide data here presented from ancient individuals from Finland opens new insights into Finnish population history. Two of the three higher coverage individuals and all six low coverage individuals from Levänluhta showed low genetic affinity to modern-day Finnish speakers of the area. Instead, an increased affinity was observed to modern-day Saami speakers, now mostly residing in the north of the Scandinavian Peninsula. These results suggest that the geographic range of the Saami extended further south in the past, and hints at a genetic shift at least in the western Finnish region during the Iron Age. The findings are in concordance with the noted linguistic shift from Saami languages to early Finnish. Further ancient DNA from Finland is needed to conclude to what extent these signals of migration and admixture are representative of Finland as a whole.

PCA plot of 113 Modern Eurasian populations, with individuals from this study projected on the principal components. Uralic speakers are highlighted in light purple.

The two samples of haplogroup N1c1a1a-L392/L1026, dated ca. 1500 BC, come from the site Bolshoy Oleniy Ostrov, in the Kola Peninsula.

Bolshoy Oleniy Ostrov (Great Reindeer Island), situated in the Kola Bay of the Barents Sea and separated from the mainland by Yekaterininsky Island and two straits, harbors the ancient cemetery of an unknown Early Metal Age culture. The preservation of artifacts made from bone and antler, wooden structures, as well as human remains is remarkable for the location and age this site represents. Altogether 19 skeletons of adults and children have been recognized from both single and collective burials of the site, together with more than 250 artifacts. (…) Apart from these excavations, approximately 25 burials were revealed in 1934 during the construction of fortifications. (…) Radiocarbon dates are provided by Moiseyev and Khartanovich in their 2012 study, placing the site in middle to the late 2nd millennium BC (…)

After seeing how Late Indo-European languages spread with Yamna and (mainly) R1b-L23 lineages, we are now obtaining proof of how Siberian ancestry – likely accompanying N1c-L392 lineages – was probably related to an early archaeological Siberian influence in the easternmost region of North-East Europe, probably also seen in linguistics.

NOTE. Whereas I proposed – based mainly on common guesstimates – that R1a-M417 and EHG ancestry might have signaled the arrival of an early Yukaghir substratum to NE Europe, later acquired by Uralic spreading over this territory, while N1c1a1a lineages with the Seima-Turbino phenomenon might have given Uralic its later Altaic traits, it is indeed possible – and more likely with the findings in this paper – that N1c1a1a lineages may have in fact spread Yukaghir languages, especially if (like the Leiden school) one supports an Indo-Uralic community.

The linguistic effect of this migration may depend on one’s preferred model for Proto-Uralic and its strata, and especially on one’s position in the Proto-Uralic vs. Proto-Uralo-Yukaghir controversy. Although I really didn’t have a strong opinion on this matter, it is clear from my texts that (unlike Kortlandt) I didn’t consider Yukaghir to share a common ancestor with Uralic languages. What genomics is showing right now seems to me directly translatable to a linguistic model, and we should therefore reject an original Proto-Uralo-Yukaghir community.

Also, it seems that the Finnish population peak which expanded today’s prevalent N1c-L392 lineages – after the Iron Age bottleneck which likely reduced its haplogroup diversity – may have been associated with the event that displaced the Saami population from Finland after ca. 1000 AD.

I think it is becoming still clearer where Uralic languages came from.


Uralic as a Corded Ware substrate of Indo-Iranian, and loanwords in Finno-Ugric


Asko Parpola has recently published a new paper, Finnish vatsa ~ Sanskrit vatsá and the formation of Indo-Iranian and Uralic languages.


Finnish vatsa ‘stomach’ < PFU *vaćća < Proto-Indo-Aryan *vatsá- ‘calf’ < PIE *vet-(e)s-ó- ‘yearling’ contrasts with Finnish vasa- ‘calf’ < Proto-Iranian *vasa- ‘calf’. Indo-Aryan -ts- versus Iranian -s- reflects the divergent development of PIE *-tst- in the Iranian branch (> *-st-, with Greek and Balto-Slavic) and in the Indo-Aryan branch (> *-tt-, probably due to Uralic substratum). The split of Indo-Iranian can be traced in the archaeological record to the differentiation of the Yamnaya culture in the North Pontic and Volga steppes respectively during the third millennium BCE, due to the use of separate sources of metal: the Iranian branch was dependent on the North Caucasus, while the Indo-Aryan branch was oriented towards the Urals. It is argued that the Abashevo culture of the Mid-Volga-Kama-Belaya basins and the Sejma-Turbino trade network (2200–1900 BCE) were bilingual in Proto-Indo-Aryan and PFU, and introduced the PFU as the basis of West Uralic (Volga-Finnic) into the Netted Ware Culture of the Upper Volga-Oka (1900–200 BCE).

He updates thus his quite recent model from On the emergence, contacts and dispersal of Proto-Indo-European, Proto-Uralic and Proto-Aryan in an archaeological perspective (2017).

In it he supported a North-West Indo-European expansion with Corded Ware, and a Neolithic Proto-Uralic community in East Europe (associated with the Comb Ware culture), as I did before the famous 2015 papers.

In fact, he supports that the satemization trend of Proto-Indo-Iranian is due to a Proto-Finno-Ugric substratum in its population in the Volga-Ural region, similar to the model I propose (with the Corded Ware substratum hypothesis).

NOTE. While for Parpola the ‘satemizing’ substratum of Balto-Slavic (a NWIE dialect) may not come exactly from the same Finno-Ugric population as for Indo-Iranian, but from a different Uralic dialect (as I explain in my hypothesis), for the few extant supporters of an Indo-Slavonic group there should not be any problem identifying the same ancient substrate as for the Proto-Indo-Iranian population…

Now that North-West Indo-European is clearly associated with the Yamna -> Bell Beaker expansion, I understand that his previous model is obsolete and needs a revision.

I find it especially difficult to understand (in light of his previous theory) why he compares Indo-Aryan *vatsa- and Iranian *vasa- to assert that the former is the origin of the loanword in Finno-Ugric, when the Proto-Indo-Iranian form is essentially the same as the Indo-Aryan one, with respect to the *w- evolution into *v- in both PII and late FU dialects…

NOTE: I wrote to him yesterday asking about this issue; I will post his answer here.

EDIT (20 MAR 2018): In summary, his answer regarding the choice of Indo-Aryan *vatsa- vs. Iranian *vasa- (instead of just PII *watsa-/vatsa-) is based on archaeology (and likely guesstimates): he understands the split into Iranian and Indo-Aryan to have happened early within the Yamna culture, so that the cultural admixture of Abashevo must have happened after the separation.

Potential spread of Finnic. “Distribution of the Netted Ware according to Carpelan (2002: 198). A: Emergence of the Netted Ware on the Upper Volga c. 1900 calBC. B: Spread of Netted Ware by c. 1800 calBC. C: Early Iron Age spread of Netted Ware. (After Carpelan 2002: 198 > Parpola 2012a: 151.)”

His effort to link the actual expansion of Finno-Ugric to Corded Ware territory, linking it also partially to population movements from the Seima-Turbino phenomenon – probably associated with the initial expansion of N1c lineages – is another good example of convergence of the different anthropological theories thanks to recent Genomic studies.


More evidence on the recent arrival of haplogroup N and gradual replacement of R1a lineages in North-Eastern Europe


A new article (in Russian), Kinship Analysis of Human Remains from the Sargat Mounds, Baraba forest-steppe, Western Siberia, by Pilipenko et al. Археология, этнография и антропология Евразии Том 45 № 4 2017, downloadable at ResearchGate.


We present the results of a paleogenetic analysis of nine individuals from two Early Iron Age mounds in the Baraba forest-steppe, associated with the Sargat culture (five from Pogorelka-2 mound 8, and four from Vengerovo-6 mound 1). Four systems of genetic markers were analyzed: mitochondrial DNA, the polymorphic part of the amelogenin gene, autosomal STR-loci, and those of the Y-chromosome. Complete or partial data, obtained for eight of the nine individuals, were subjected to kinship analysis. No direct relatives of the “parent-child” type were detected. However, the data indicate close paternal and maternal kinship among certain individuals. This was evidently one of the reasons why certain individuals were buried under a single mound. Paternal kinship appears to have been of greater importance. The diversity of mtDNA and Y-chromosome lineages among individuals from one and the same mound suggests that kinship was not the only motive behind burying the deceased people jointly. The presence of very similar, though not identical, variants of the Y chromosome in different burial grounds may indicate the existence of groups such as clans, consisting of paternally related males. Our conclusions need further confirmation and detailed elaboration. Keywords: Paleogenetics, ancient DNA, kinship analysis, mitochondrial DNA, uniparental genetic markers, STR-loci, Y-chromosome, Baraba forest-steppe, Sargat culture, Early Iron Age.

From the older study of the same region (Baraba, numbered 4) “Location of ancient human groups with a high frequency of mtDNA haplogroups U5, U4 and U2e lineages. The area of Northern Eurasian anthropological formation is marked by yellow region on the map (References: 1. Bramanti et al., 2009; 2. Malmstrom et al., 2009; 3. Krause et al., 2010; 4. this study)”

Chronological time scale of Bronze Age Cultures from the Baraba region

This is the same team that brought an ancient mtDNA study of different cultures within the Baraba steppe-forest region (from the Open Access book Population Dynamics in Prehistory and Early History).

The Baraba steppe-forest is a region between the Ob and Irtysh rivers (about 800 km from west to east), stretching over 200 km from the taiga zone in the north to the steppes in the south.

The new study brings a more recent picture of the region, from the Iron Age Sargat culture, ca. 500 BC – 500 AD, with five samples of haplogroup N and two samples of haplogroup R1a.

R1a lineages in the region probably derive from the previous expansion of Andronovo and related cultures, which had absorbed North Caspian steppe populations and their Late Indo-European culture.

N subclades prevalent in certain modern Eurasian populations are probably derived from the expansion of the Seima-Turbino phenomenon.

While samples are scarce, Y-DNA data keeps showing the same picture I have spoken about more than once:

N subclades (potentially originally speaking Proto-Yukaghir languages) gradually replacing haplogroup R1a (originally probably speaking Uralic languages), probably through successive founder effects (such as the bottlenecks found in Finland), which left their Uralic culture and ethnolinguistic identification intact.

Therefore, late Corded Ware groups of North-Eastern Europe (in the Forest Zone and the Baltic), mainly of R1a-Z645 subclades, probably never adopted Late Indo-European languages.