The cradle of Russians, an obvious Finno-Volgaic genetic hotspot

pskov-novgorod-russia

First look of an accepted manuscript (behind paywall), Genome-wide sequence analyses of ethnic populations across Russia, by Zhernakova et al. Genomics (2019).

Interesting excerpts:

There remain ongoing discussions about the origins of the ethnic Russian population. The ancestors of ethnic Russians were among the Slavic tribes that separated from the early Indo-European Group, which included ancestors of modern Slavic, Germanic and Baltic speakers, who appeared in the northeastern part of Europe ca. 1,500 years ago. Slavs were found in the central part of Eastern Europe, where they came in direct contact with (and likely assimilation of) the populations speaking Uralic (Volga-Finnish and Baltic- Finnish), and also Baltic languages [11–13]. In the following centuries, Slavs interacted with the Iranian-Persian, Turkic and Scandinavian peoples, all of which in succession may have contributed to the current pattern of genome diversity across the different parts of Russia. At the end of the Middle Ages and in the early modern period, there occurred a division of the East Slavic unity into Russians, Ukrainians and Belarusians. It was the Russians who drove the colonization movement to the East, although other Slavic, Turkic and Finnish peoples took part in this movement, as the eastward migrations brought them to the Ural Mountains and further into Siberia, the Far East, and Alaska. During that interval, the Russians encountered the Finns, Ugrians, and Samoyeds speakers in the Urals, but also the Turkic, Mongolian and Tungus speakers of Siberia. Finally, in the great expanse between the Altai Mountains on the border with Mongolia, and the Bering Strait, they encountered paleo-Asiatic groups that may be genetically closest to the ancestors of the Native Americans. Today’s complex patchwork of human diversity in Russia has continued to be augmented by modern migrations from the Caucasus, and from Central Asia, as modern economic migrations take shape.

pskov-novgorod-pca-eurasia-yakut
Sample relatedness based on genotype data. Eurasia: Principal Component plot of 574 modern Russian genomes. Colors reflect geographical regions of collection; shapes reflect the sample source. Red circles show the location of Genome Russia samples.

In the current study, we annotated whole genome sequences of individuals currently living on the territory of Russia and identifying themselves as ethnic Russian or as members of a named ethnic minority (Fig. 1). We analyzed genetic variation in three modern populations of Russia (ethnic Russians from Pskov and Novgorod regions and ethnic Yakut from the Sakha Republic), and compared them to the recently released genome sequences collected from 52 indigenous Russian populations. The incidence of function-altering mutations was explored by identifying known variants and novel variants and their allele frequencies relative to variation in adjacent European, East Asian and South Asian populations. Genomic variation was further used to estimate genetic distance and relationships, historic gene flow and barriers to gene flow, the extent of population admixture, historic population contractions, and linkage disequilibrium patterns. Lastly, we present demographic models estimating historic founder events within Russia, and a preliminary HapMap of ethnic Russians from the European part of Russia and Yakuts from eastern Siberia.

pskov-novgorod-pca-finno-permic
Sample relatedness based on genotype data. Western Russia and neighboring countries: Principal Component plot of 574 modern Russian genomes. Colors reflect geographical regions of collection; shapes reflect the sample source. Red circles show the location of Genome Russia samples.

The collection of identified SNPs was used to inspect quantitative distinctions among 264 individuals from across Eurasia (Fig. 1) using Principal Component Analysis (PCA) (Fig. 2). The first and the second eigenvectors of the PCA plot are associated with longitude and latitude, respectively, of the sample locations and accurately separate Eurasian populations according to geographic origin. East European samples cluster near Pskov and Novgorod samples, which fall between northern Russians, Finno-Ugric peoples (Karelian, Finns, Veps etc.), and other Northeastern European peoples (Swedes, Central Russians, Estonian, Latvians, Lithuanians, and Ukrainians) (Fig. 2b). Yakut individuals map into the Siberian sample cluster as expected (Fig. 2a). To obtain an extended view of population relationships, we performed a maximum likelihood-based estimation of ancestry and population structure using ADMIXTURE [46](Fig. 2c). The Novgorod and Pskov populations show similar profiles with their Northeastern European ancestors while the Yakut ethnic group showed mixed ancestry similar to the Buryat and Mongolian groups.

pskov-novgorod-yakut-admixture
Population structure across samples in 178 populations from five major geographic regions (k=5). Samples are pooled across three different studies that covered the territory of Russian Federation (Mallick et al. 2016 [36], Pagani et al. 2016 [37], this study). The optimal k-value was selected by value of cross validation error. Russian samples from all studies (highlighted in bold dark blue) show a slight gradient from Eastern European (Ukrainian, Belorussian, Polish) to North European (Estonian Karelian, Finnish) structures, reflecting population history of northward expansion. Yakut samples from different studies (highlighted in bold red) also show a slight gradient from Mongolian to Siberian people (Evens), as expected from their original admixture and northward expansions. The samples originated from this study are highlighted, and plotted in separated boxes below.

Possible admixture sources of the Genome Russia populations were addressed more formally by calculating F3 statistics, which is an allele frequency-based measure, allowing to test if a target population can be modeled as a mixture of two source populations [48]. Results showed that Yakut individuals are best modeled as an admixture of Evens or Evenks with various European populations (Supplemental Table S4). Pskov and Novgorod showed admixture of European with Siberian or Finno-Ugric populations, with Lithuanian and Latvian populations being the dominant European sources for Pskov samples.

direction-expansion-russians
The heatmaps of gene flow barriers show for each point at the geographical map the interpolated differences in allele frequencies (AF) between the estimated AF at the point with AFs in the vicinity of this point. The direction of the maximal difference in allele frequencies is coded by colors and arrows.

So, Russians expanding in the Middle Ages as acculturaded Finno-Volgaic peoples.

Or maybe the true Germano-Slavonic™-speaking area was in north-eastern Europe, until the recent arrival of Finno-Permians with the totally believable Nganasan-Saami horde, whereas Yamna -> Bell Beaker represented Vasconic-Caucasian expanding all over Europe in the Bronze Age. Because steppe ancestry in Fennoscandia and Modern Basques in Iberia.

A really hard choice between equally plausible models.

Related

Magyar tribes brought R1a-Z645, I2a-L621, and N1a-L392(xB197) lineages to the Carpathian Basin

hungarian-conquerors-turks

The Nightmare Week of “N1c=Uralic” proponents continues, now with preprint Y-chromosome haplogroups from Hun, Avar and conquering Hungarian period nomadic people of the Carpathian Basin, by Neparaczki et al. bioRxiv (2019).

Abstract:

Hun, Avar and conquering Hungarian nomadic groups arrived into the Carpathian Basin from the Eurasian Steppes and significantly influenced its political and ethnical landscape. In order to shed light on the genetic affinity of above groups we have determined Y chromosomal haplogroups and autosomal loci, from 49 individuals, supposed to represent military leaders. Haplogroups from the Hun-age are consistent with Xiongnu ancestry of European Huns. Most of the Avar-age individuals carry east Eurasian Y haplogroups typical for modern north-eastern Siberian and Buryat populations and their autosomal loci indicate mostly unmixed Asian characteristics. In contrast the conquering Hungarians seem to be a recently assembled population incorporating pure European, Asian and admixed components. Their heterogeneous paternal and maternal lineages indicate similar phylogeographic origin of males and females, derived from Central-Inner Asian and European Pontic Steppe sources. Composition of conquering Hungarian paternal lineages is very similar to that of Baskhirs, supporting historical sources that report identity of the two groups.

Interesting excerpts (emphasis mine):

All N-Hg-s identified in the Avars and Conquerors belonged to N1a1a-M178. We have tested 7 subclades of M178; N1a1a2-B187, N1a1a1a2-B211, N1a1a1a1a3-B197, N1a1a1a1a4-M2118, N1a1a1a1a1a-VL29, N1a1a1a1a2-Z1936 and the N1a1a1a1a2a1c1-L1034 subbranch of Z1936. The European subclades VL29 and Z1936 could be excluded in most cases, while the rest of the subclades are prevalent in Siberia 23 from where this Hg dispersed in a counter-clockwise migratory route to Europe (…). All the 5 other Avar samples belonged to N1a1a1a1a3-B197, which is most prevalent in Chukchi, Buryats, Eskimos, Koryaks and appears among Tuvans and Mongols with lower frequency.

haplogroup-n-pca
First two components of PCA from Hg N1a subbranch distribution in 51 populations including Avars and Conquerors. Colors indicate geographic regions. Three letter codes are given in Supplementary Table S5.

By contrast two Conquerors belonged to N1a1a1a1a4-M2118, the Y lineage of nearly all Yakut males, being also frequent in Evenks, Evens and occurring with lower frequency among Khantys, Mansis and Kazakhs.

Three Conqueror samples belonged to Hg N1a1a1a1a2-Z1936 , the Finno-Permic N1a branch, being most frequent among northeastern European Saami, Finns, Karelians, as well as Komis, Volga Tatars and Bashkirs of the Volga-Ural region.Nevertheless this Hg is also present with lower frequency among Karanogays, Siberian Nenets, Khantys, Mansis, Dolgans, Nganasans, and Siberian Tatars.

The west Eurasian R1a1a1b1a2b-CTS1211 subclade of R1a is most frequent in Eastern Europe especially among Slavic people. This Hg was detected just in the Conqueror group (K2/18, K2/41 and K1/10). Though CTS1211 was not covered in K2/36 but it may also belong to this sub-branch of Z283.

Hg I2a1a2b-L621 was present in 5 Conqueror samples, and a 6th sample form Magyarhomorog (MH/9) most likely also belongs here, as MH/9 is a likely kin of MH/16 (see below). This Hg of European origin is most prominent in the Balkans and Eastern Europe, especially among Slavic speaking groups. It might have been a major lineage of the Cucuteni-Trypillian culture and it was present in the Baden culture of the Chalcolithic Carpathian Basin.

hungarian-conquerors-y-dna
Image modified from the paper, with drawn red square around lineages of likely Ugric origin, and squares around R1a-Z93, R1a-Z283, N1a-Z1936, and N1a-M2004 samples. Y-Hg-s determined from 46 males grouped according to sample age, cemetery and Hg. Hg designations are given according to ISOGG Tree 2019. Grey shading designate distinguished individuals with rich grave goods, color shadings denote geographic origin of Hg-s according to Fig. 1. For samples K3/1 and K3/3 the innermost Hg defining marker U106* was not covered, but had been determined previously.

We identified potential relatives within Conqueror cemeteries but not between them. The uniform paternal lineages of the small Karos3 (19 graves) and Magyarhomorog (17 graves) cemeteries approve patrilinear organization of these communities. The identical I2a1a2b Hg-s of Magyarhomorog individuals appears to be frequent among high-ranking Conquerors, as the most distinguished graves in the Karos2 and 3 cemeteries also belong to this lineage. The Karos2 and Karos3 leaders were brothers with identical mitogenomes 11 and Y-chromosomal STR profiles (Fóthi unpublished). The Sárrétudvari commoner cemetery seems distinct from the others, containing other sorts of European Hg-s. Available Y-chromosomal and mtDNA data from this cemetery suggest that common people of the 10th century rather represented resident population than newcomers. The great diversity of Y Hg-s, mtDNA Hg-s, phenotypes and predicted biogeographic classifications of the Conquerors indicate that they were relatively recently associated from very diverse populations.

Surprising about the Hungarian conquerors – although in line with the historical accounts – is the varied patrilineal origin of clans, including Q1a, G2a2b, I1, E1b1b, R1b, J1, or J2 – some of which (depending on specific lineages) may have appeared earlier in the Carpathian Basin or south-eastern Europe.

However, out of the 27 conqueror elite samples, 17 are of haplogroups most likely related to Ugric populations beyond the Urals: R1a-Z645, I2-L621, and two specific N1a-L392 lineages (see below). In fact, there are three high-ranking conqueror elites of hg. I2-L621 (one of them termed a “leader”, brother to an unpublished leader of Karos3, and all of them possibly family), one of hg. R1a-Z280, one of hg. R1a-Z93 (which should be added to the Árpáds), and one of hg. N1a-Z1936, which gives a good idea of the ruling class among the elite Ugric settlers.

NOTE. The Q1a sample is also likely to be found in the mixed population of the West Siberian forest-steppes, since it was found in Mesolithic-Neolithic samples from eastern Europe to Lake Baikal, and in Bronze Age Siberian groups, although admittedly it may have formed part of an Avar Transtisza group, or even earlier Hunnic or Scythian groups along the steppes. Without precise subclades it’s impossible to know.

arrival-of-hungarians-arpad
The seven chieftains of the Hungarians, detail of Arrival of the Hungarians, from Árpád Feszty’s and his assistants’ vast (1800 m2) cyclorama, painted to celebrate the 1000th anniversary of the Magyar conquest of Hungary, now displayed at the Ópusztaszer National Heritage Park in Hungary. Image from Wikipedia.

I2a-L621

I2a-L621 (xS17250) or I2a1b2 in the old nomenclature, is found in 6 early conquerors (including one leader), on a par with R1a and N samples. This haplogroup is found widely distributed in ancient samples, due to its early split (formed ca. 9200 BC, TMRCA ca. 4500 BC) and expansion, probably with Neolithic populations. I can’t seem to find samples of this early haplogroup from the Carpathian Basin, as mentioned in the text, although it wouldn’t be strange, because it appears also in Neolithic Iberia, and in modern populations from western Europe.

Nevertheless, I2a-L621 samples seem to be concentrated mainly in Mesolithic-Neolithic cultures of Fennoscandia, and appeared also in Sikora et al. (2017) in a sample of the High Middle Ages from Sunghir (ca. AD 1100-1200), probably from the Vladimir-Suzdalian Rus’, in a region where clearly tribes of Volga Finns were being assimilated at the time. The reported SNP call by Genetiker is A16681 (see Yfull), deep within I2a-CTS10228. It is possibly also behind a modern Saami from Chalmny Varre (ca. AD 1800) of hg. I2a in Lamnidis et al. (2018).

Lacking precise subclades from Hungarian conquerors this is pure speculation, but modern samples may also point to I2a-CTS10228 (formed ca. 3100 BC, TMRCA ca. 1800 BC) as a Finno-Ugric lineage in common with R1a, which must have expanded to the Urals and beyond with eastern Corded Ware groups or (more likely) succeeding cultures. This is in line with the association of certain I2a lineages with modern Uralic peoples or populations from their historical regions in eastern Europe, and linked thus to the most likely homeland of Uralians in the eastern European forests:

uralic-groups-haplogroup-r1a
Additional file 6: Table S5. Y chromosome haplogroup frequencies in Eurasia. Modified by me: in bold haplogroup N1c and R1a from Uralic-speaking populations, with those in red showing where R1a is the major haplogroup. Observe that all Uralic subgroups – Finno-Permic, Ugric, and Samoyedic – have some populations with a majority of R1a, and also of I lineages. Data from Tambets et al. (2018).

R1a-Z645

Regarding the important question of the ethnic makeup of Ugric populations stemming from the Urals, the most interesting (and expected) data is the presence of R1a-Z645 lineages among high-ranking conquerors, in particular four R1a-Z280 subclades proper of Finno-Ugrians.

This proves that, in line with the old split and expansion of R1a-CTS1211 (formed ca. 2600 BC, TMRCA ca. 2400 BC), and its finding in Bronze Age Fennoscandian samples, only some late R1a-Z280 (xZ92) lineages (see Z280 on YFull) may show a clear identification with early acculturated Uralic speakers, with the main early acculturated Balto-Slavic R1a haplogroup remaining R1a-M458.

I recently hypothesized this late connection of Slavs with very specific R1a-Z280 (xZ92) lineages based on analyses of modern populations (like Slovenians), because the connection of ancient Finno-Ugrians with modern Z92 samples was already evident:

(…) subclades of hg. R1a1a1b1a2-Z280 (xR1a1a1b1a2a-Z92) seem to have also been involved in early Slavic expansions, like R1a1a1b1a2b3a-CTS3402 (formed ca. 2200 BC, TMRCA ca. 2200 BC), found among modern West, South, and East Slavic populations and in Fennoscandia, prevalent e.g. among modern Slovenians which points to a northern origin of its expansion (Maisano Delser et al. 2018).

This finding also supports the expected shared R1a-Z280 lineages among ancient Finno-Ugric populations, as predicted from the study of modern Permic and Ugric peoples in Dudás et al. (2019).

r1a-z282-z280-z2125-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the distribution of R1a-Z280 (xZ92), i.e. R1a-M558, compared to the ancient Finno-Ugric distribution.

Furthermore, while we don’t have precise R1a-Z93 lineages to compare with the new Hunnic sample reported, we already know that some archaic R1a-Z2124 subclades stem from the forest-steppe areas of the Cis- and Trans-Urals, and the two newly reported R1a-Z93 Hungarian conqueror elites, like those of the Árpád dynasty, probably belong to them.

There is an obvious lack of continuity in specific paternal lineages among the Hunnic, the Avar, and the Conqueror periods, which makes any simplistic identification of all R1a-Z93 lineages as stemming from Avars, Huns, or the Iron Age Pontic-Caspian steppes clearly flawed. Comparing R1a-Z93 in Hungarian Conquerors with Huns is like comparing them with samples of the Srubna or earlier periods… Similarly, comparing the Hunnic R1b-U106 or the early Avar I1 to later Hungarian samples is not warranted without precise subclades, because they most likely correspond to different Germanic populations: Goths among Huns, then Longobards, then likely peoples descended from Franks and Irish Monks (the latter with R1b-P312).

N1a-L392

Second behind R1a subclades are, as expected, N1a-L392 (N1c in the old nomenclature).

Avars are dominated by a specific N1a-L392 subclade, N1a-B197, as we recently discovered in Csáky et al. (2019).

Hungarian conquerors show three N1a-Z1936 subclades, which is known to stem from the northern Ural region, including the Arctic (likely Palaeo-Laplandic peoples) and cross-stamped cultures of the northern Eurasian forests.

haplogroup_n3a4
Frequency-Distribution Maps of Individual Subclade N3a4 / N1a1a1a1a2-Z1936, probably with the Samic (first) and Fennic (later) expansions into Paleo-Lakelandic and Palaeo-Laplandic territories.

On the other hand, the two N1a-M2118 lineages are more clearly associated with Palaeo-Siberian populations east of the Urals, but became incorporated into the Ugric stock in the Trans-Urals region probably in the same way as N1a-Z1936, by infiltration from (and acculturation of) hunter-gatherers of forest and taiga cultures.

NOTE. You can read more about the infiltration of N1a lineages in the recent post Corded Ware—Uralic (IV): Hg R1a and N in Finno-Ugric and Samoyedic expansions, and in the specific sections for each Uralic group in A Clash of Chiefs.

haplogroup-n1a-M2118
Frequency-Distribution Maps of Individual Sub-clades of hg N3a2, by Ilumäe et al. (2016).

Conclusion

The picture offered by the paper on Hungarian Conquerors, while in line with historical accounts of multi-ethnic tribes incorporating regional lineages, shows nevertheless patrilineal clans clearly associated with Uralic peoples, in a distribution which could have been easily inferred from ancient Trans-Uralian forest-steppe cultures and modern samples (even regarding I2a-L621).

In spite of this, there is a great deal of discussion in the paper about specific N1a subclades in Hungarian conquerors, while the presence of R1a-Z280 (among early Magyar elites!) is interpreted, as always, as recently acculturated Slavs. This is sadly coupled with the simplistic identification of I2a-L621 as of local origin around the Carpathians.

The introduction of the paper to the history of Hungarians is also weird, for example giving credibility to the mythic accounts of the Árpád dynasty’s origin in Attila, which is in line, I guess, with what the authors intended to support all along, i.e. the association of Magyars with Turks from the Eurasian steppes, which they are apparently willing to achieve by relating them to haplogroup R1a-Z93

The conclusion is thus written to appease modern nation-building myths more than anything else, like many other papers before it:

It is generally accepted that the Hungarian language was brought to the Carpathian Basin by the Conquerors. Uralic speaking populations are characterized by a high frequency of Y-Hg N, which have often been interpreted as a genetic signal of shared ancestry. Indeed, recently a distinct shared ancestry component of likely Siberian origin was identified at the genomic level in these populations, modern Hungarians being a puzzling exception36. The Conqueror elite had a significant proportion of N Hgs, 7% of them carrying N1a1a1a1a4-M2118 and 10% N1a1a1a1a2-Z1936, both of which are present in Ugric speaking Khantys and Mansis. At the same time none of the examined Conquerors belonged to the L1034 subclade of Z1936, while all of the Khanty Z1936 lineages reported in 37 proved to be L1034 which has not been tested in the 23 study. Population genetic data rather position the Conqueror elite among Turkic groups, Bashkirs and Volga Tatars, in agreement with contemporary historical accounts which denominated the Conquerors as “Turks”. This does not exclude the possibility that the Hungarian language could also have been present in the obviously very heterogeneous, probably multiethnic Conqueror tribal alliance.

So, back to square one, and new circular reasoning: If ancient populations from north-eastern Europe believed to represent ancient Finno-Ugrians are of R1a-Z645 lineages, it’s because they were not Finno-Ugric speakers. If ancient and modern populations known to be of Finno-Ugric language show clear connections with R1a-Z645, it’s because they are “multi-ethnic”.

The only stable basis for discussion in genetic papers, apparently, is the own making of geneticists, with their traditional 2000s “R1a=Indo-European” and “N1c=Uralic”, coupled with national beliefs. It does not matter how many predictions based on that have been proven wrong, or how many predictions based on the Corded Ware = Uralic expansion have been proven right.

Related

R1a-Z280 and R1a-Z93 shared by ancient Finno-Ugric populations; N1c-Tat expanded with Micro-Altaic

Two important papers have appeared regarding the supposed link of Uralians with haplogroup N.

Avars of haplogroup N1c-Tat

Preprint Genetic insights into the social organisation of the Avar period elite in the 7th century AD Carpathian Basin, by Csáky et al. bioRxiv (2019).

Interesting excerpts (emphasis mine):

After 568 AD the Avars settled in the Carpathian Basin and founded the Avar Qaganate that was an important power in Central Europe until the 9th century. Part of the Avar society was probably of Asian origin, however the localisation of their homeland is hampered by the scarcity of historical and archaeological data.

Here, we study mitogenome and Y chromosomal STR variability of twenty-six individuals, a number of them representing a well-characterised elite group buried at the centre of the Carpathian Basin more than a century after the Avar conquest.

The Y-STR analyses of 17 males give evidence on a surprisingly homogeneous Y chromosomal composition. Y chromosomal STR profiles of 14 males could be assigned to haplogroup N-Tat (also N1a1-M46). N-Tat haplotype I was found in four males from Kunpeszér with identical alleles on at least nine loci. The full Y-STR haplotype I, reconstructed from AC17 with 17 detected STRs, is rare in our days. Only nine matches were found among haplotypes in YHRD database, such as samples from the Ural Region, Northern Europe (Estonia, Finland), and Western Alaska (Yupiks). We performed Median Joining (MJ) network analysis using N-Tat haplotypes with ten shared STR loci (Fig. 3, Table S9). All modern N-Tat samples included in the network had derived allele of L708 as well. Haplotype I (Cluster 1 in Fig. 3) is shared by eight populations on the MJ network among the 24 identical haplotypes. Cluster 1 represents the founding lineage, as it is described in Siberian populations, because this haplotype is shared by the most populations and it is more diverse than Cluster 2.

Nine males share N-Tat haplotype II (on a minimum of eight detected alleles), all of them buried in the Danube-Tisza Interfluve. We found 30 direct matches of this N-Tat haplotype II in the YHRD database, using the complete 17 STR Y-filer profile of AC1, AC12, AC14, AC15, AC19 samples. Most hits came from Mongolia (seven Buryats and one Khalkh) and from Russia (six Yakuts), but identical haplotypes also occur in China (five in Xinjiang and four in Inner Mongolia provinces). On the MJ network, this haplotype II is represented by Cluster 2 and is composed of 45 samples (including 32 Buryats) from six populations (Fig. 3).

y-str-haplogroup-n-mongolian-ugrians
Median Joining network of 162 N-Tat Y-STR haplotypes Allelic information of ten Y-STR loci were used for the network. Only those Avar samples were included, which had results for these ten Y-STR loci. The founder haplotype I (Cluster 1) is shared by eight populations including three Mongolian, three Székely, three northern Mansi, two southern Mansi, two Hungarian, eight Khanty, one Finn and two Avar (AC17, AC26) chromosomes. Haplotype II (Cluster 2) includes 45 haplotypes from six populations studied: 32 Buryats, two Mongolians, one Székely, one Uzbek, one Uzbek Madjar, two northern Mansi and six Avars (AC1, AC12, AC14, AC15, AC19 and KSZ 37). Haplotype III (indicated by a red arrow) is AC8. Information on the modern reference samples is seen in Table S9.

A third N-Tat lineage (type III) was represented only once in the Avar dataset (AC8), and has no direct modern parallels from the YHRD database. This haplotype on the MJ network (see red arrow in Fig. 3) seems to be a descendent from other haplotype cluster that is shared by three populations (two Buryat from Mongolia, three Khanty and one Northern Mansi samples). This haplotype cluster also differs one molecular step (locus DYS393) from haplotype II. We classified the Avar samples to downstream subgroup N-F4205 within the N-Tat haplogroup, based on the results of ours and Ilumäe et al.18 and constructed a second network (Fig. S4). The N-F4205 network results support the assumption that the N-Tat Avar samples belong to N-F4205 subgroup (see SI chapter 1d for more details).

Based on our calculation, the age of accumulated STR variance (TMRCA) within N-Tat lineage for all samples is 7.0 kya (95% CI: 4.9 – 9.2 kya), considering the core haplotype (Cluster 1) to be the founding lineage. Y haplogroup N-Tat was not detected by large scale Eurasian ancient DNA studies but it occurs in late Bronze Age Inner Mongolia and late medieval Yakuts, among them N-Tat has still the highest frequency.

Two males (AC4 and AC7) from the Transtisza group belong to two different haplotypes of Y-haplogroup Q1. Both Q1a-F1096 and Q1b-M346 haplotypes have neither direct nor one step neighbour matches in the worldwide YHRD database. A network of the Q1b-M346 haplotype shows that this male had a probable Altaian or South Siberian paternal genetic origin.

EDIT (5 APR 2019): The paper offers an interesting late sample before the arrival of Hungarian conquerors, although we don’t know which precise lineage the sample belongs to:

One sample in our dataset (HC9) comes from this population, and both his mtDNA (T1a1b) and Y chromosome (R1a) support Eastern European connections. (…) Furthermore, we excluded sample HC9 from population-genetic statistical analyses because it belongs to a later period (end of 7th – early 9th centuries)

Apparently, then, results are consistent with what was already known from studies of modern populations:

According to Ilumäe et al. study, the frequency peak of N-F4205 (N3a5-F4205) chromosomes is close to the Transbaikal region of Southern Siberia and Mongolia, and we conclude that most Avar N-Tat chromosomes probably originated from a common source population of people living in this area, completely in line with the results of Ilumäe et al.

haplogroup_n1
Geographic-Distribution Map of hg N3 from Ilumäe et al.

Finno-Ugrians share haplogroup R1a-Z280

Another paper, behind paywall, Genetic history of Bashkirian Mari and Southern Mansi ethnic groups in the Ural region, by Dudás et al. Molecular Genetics and Genomics (2019).

Interesting excerpts (emphasis mine):

Y‑chromosome diversity

The most frequent haplogroups of the Bashkirian Maris were N1b-P43 (42%), R1a-Z280 (16%), R1a-Z93 (16%), N1c-Tat (13%), and J2-M172 (7%). Furthermore, subgroup R1b-M343 accounted for 4% and I2a-P37 covered 2% of the lineages. None of the Mari N1c Y chromosomes belonged to the N1c subgroups investigated (L1034, VL29, Z1936).

In the case of the Southern Mansi males, the most frequent haplogroups were N1b-P43 (33%), N1c-L1034 (28%) and R1a-Z280 (19%). The frequencies of the remaining haplogroups were as follows: R1a-M458 (6%), I1-L22 (3%), I2a-P37 (3%), and R1b-P312 (3%). The haplotype and haplogroup diversities of the Bashkirian Mari group were 0.9929 and 0.7657, whereas these values for the Southern Mansi were 0.9984 and 0.7873, respectively. The results show that, in both populations, haplotypes are much more diverse than haplogroups.

bashkir-mari-southern-mansi
Haplogroup frequencies of the Bashkirian Mari and the Southern Mansi ethnic groups in Ural region

Genetic structure

(..) the studied Bashkirian Mari and Southern Mansi population groups formed a compact cluster along with two Khanty, Northern Mansi, Mari, and Estonian populations based on close Fst-genetic distances (< 0.05), with nonsignificant p values (p > 0.05) except for the Estonian population. All of these populations belong to the Finno-Ugric language family. Interestingly, the other Mansi population studied by Pimenoff et al. (2008) (pop # 38) was located a great distance from the Southern Mansi group (0.268). In addition, the Bashkir population (pop # 6) did not show a close genetic affinity to the Bashkirian Mari group (0.194), even though it is the host population. However, the Russian population from the Eastern European region of Russia (pop # 49) showed a genetic distance of 0.055 with the Southern Mansi group. All Hungarian speaking populations (pops 13, 22, 23, 24, 50, and 51) showed close genetic affinities to each other and to the neighbouring populations, but not to the two studied populations.

y-dna-hungarians-ugric-mansi
Multidimensional scaling (MDS) plot constructed on Fstgenetic distances of Y haplogroup frequencies of 63 populations compared. The haplogroup frequency data used for population comparison together with references are seen in Online Resource 2 (ESM_2). Pairwise Fst-genetic distances and p values between 63 populations were calculated as shown in Online Resource 3 (ESM_3) Fig. 4 Multidimensional scaling (MDS) plot constructed on Rstgenetic distances of 10 STR-based Y haplotype frequencies of 21 populations compared. Image modified to include labels of modern populations.

Phylogenetic analysis

Median-joining networks were constructed for:

N-P43 (earlier N1b):

(…) TMRCA estimates for this haplogroup were made for all P43 samples (n = 157) 8.7 kya (95% CI 6.7–10.8 kya), for the N-P43 Asian.

N1c-Tat:

(…) 75% of Buryats belonged to Haplotype 2, indicating that the Buryats studied by us is a young and isolated population (Bíró et al. 2015). Bashkirian Mari samples derive from Haplotype 2 via Haplotype 3 (see dark purple circles on the top of Fig. 6a). Haplotype 3 contained six males (2 Buryat, 1 Northern Mansi, and 3 Khanty samples from Pimenoff et al. 2008). The biggest Bashkirian Mari haplotype node (3 Mari samples) was positioned three mutational steps away from Haplotype 1 and the remaining Mari samples can be derived from this haplotype. Southern Mansi haplotypes were scattered within the network except for two, which formed a smaller haplotype node with two Northern Mansi and two Khanty samples from Pimenoff et al. (2008).

n1c-n-tat-uralic-ugric
Median-Joining Networks (MJ) of 153 N-Tat (a) and 26 N-L1034 (b) haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. For N-Tat network, we used data from Southern Mansi (n = 11), Bashkirian Mari (n = 6) samples with Hungarian (n = 12), Hungarian speaking Székely (n = 6), Northern Mansi (n = 14), Mongolian (n = 16), Buryat (n = 44), Finnish (n = 13), Uzbek Madjar (n = 2), Uzbek (n = 3), Khanty (n = 4) populations studied earlier by us (Fehér et al. 2015; Bíró et al. 2015) and Khanty (n = 18) and Mansi (n = 4) studied by Pimenoff et al. (2008)

R1a-Z280 haplotypes, shared by Maris, Mansis, and Hungarians, hence ancient Finno-Ugrians:

The founder R1a-Z280 haplotype was shared by four samples from four populations (1 Bashkirian Mari; 1 Southern Mansi; 1 Hungarian speaking Székely; and 1 Hungarian), as presented in Fig. 7 (Haplotype 1). Haplotype 2 included five males (3 Bashkirian Mari and 2 Hungarian), as it can be seen in Fig. 7. Haplotype 4 included two shared haplotypes (1 Bashkirian Mari and one Hungarian speaking Csángó). The remaining two Bashkirian Mari haplotypes differ from the founder haplotype (Haplotype 1) by two mutational steps via Hungarian or Hungarian and Bashkirian Mari shared haplotypes. Beside Haplotype 1, the remaining Southern Mansi haplotypes were shared with Hungarians (Haplotype 5 or turquoise blue and red-coloured circles above Haplotype 7) or with Hungarians and Hungarian speaking Székely group (Haplotypes 3, 5, and 6). Haplotype 7 included ten Hungarian speakers (Hungarian, Székely, and Csángó). One Hungarian and one Uzbek Khwarezm shared haplotype can be found in Fig. 7 as well (red and white-coloured circle). All the other haplotypes were scattered in the network. The age of accumulated STR variation within R1a-Z280 lineage for 93 samples is estimated to be 9.4 kya (95% CI 6.5–12.4 kya) considering Haplotype 1 (Fig. 7) to be the founder.

r1a-z280-ugrians
Median-Joining Networks (MJ) of 93 R1a-Z280 haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. We used haplotype data from Bashkirian Mari (n = 7), Southern Mansi (n = 7), Hungarian (n = 52), Hungarian speaking Székely (n = 11), Hungarian speaking Csángó (n = 10), Uzbek Ferghana (n = 2), Uzbek Tashkent (n = 1), Uzbek Khwarezm (n = 1) and Northern Mansi (n = 2) populations

R1a-Z93 as isolated lineages among Permic and Ugric populations:

Figure 8 depicts an MJ network of R1a-Z93* samples using 106 haplotypes from the 14 populations (Fig. 8). All of the Bashkirian Mari samples (7 haplotypes) formed a very isolated branch and differed from the one Hungarian haplotype (Fig. 8, see Haplotype 1) by seven mutational steps as well from two Uzbek Tashkent samples (see Haplotype 3). Another Hungarian sample shared two haplotypes of Uzbek Khwarezm samples in Haplotype 4. This haplotype can be derived from Haplotype 3 (Uzbek Tashkent). Haplotype 2 included one Hungarian and one Khakassian male. The remaining three Hungarian haplotypes are outliers in the network and are not shared by any sample. The other population samples included in the network either form independent clusters such as Altaians, Khakassians, Khanties, and Uzbek Madjars or were scattered in the network. The age of accumulated STR variation (TMRCA) within R1a-Z93* lineage for 106 samples is estimated as 11.6 kya (95% CI 9.3–14.0 kya) considering an Armenian haplotype (Fig. 8, “A”) to be the founder and the median haplotype.

r1a-z93-ugrians
Median-Joining Networks (MJ) of 106 R1a-Z93 haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. We used the next haplotype data: 7 Bashkirian Mari, 6 Khanty, 4 Uzbek Madjar, 5 Uzbek Ferghana, 9 Uzbek Tashkent, 7 Uzbek Khwarezm, 2 Mongolian, 2 Buryat, 6 Hungarian samples tested by us for this study or published earlier (Bíró et al. 2015) and populations (3 Armenian; 3 Afghan Tajik;
16 Altaian; 24 Khakassian; 12 Kyrgyz) from Underhill et al. (2015)

Comments

The results of modern populations for N (especially N1c) subclades show really wide clusters and ancient TMRCA, consistent with their known ancient and wide distribution in northern and eastern Eurasian groups, and thus with infiltration of different lineages with eastern nomads (and northern Arctic populations) coupled with later bottlenecks, as well as acculturation of groups.

EDIT (2 APR): Interesting is the specific subclade to which ancient Mongolic-speaking Avars belong (information from Yfull) N1c-F4205 (TMRCA ca. 500 BC), subclade of N1c-Y6058 (formed ca. 2800 BC, TMRCA ca. 2800 BC). This branch also gives the “European” branch N1c-CTS10760 (formed ca. 2800 BC, TMRCA ca. 2100 BC), and is subclade of a branch of N1c-L392 (formed ca. 4400 BC, TMRCA ca. 2800 BC). A northern expansion of N1c-L392 is probably represented by its branch N1c-Z1936 (formed ca. 2800, TMRCA ca. 2100 BC), the most likely candidate to appear in the Kola Peninsula in the Bronze Age as the Palaeo-Laplandic population (see here). Read more about potential routes of expansion of haplogroup N.

On the other hand, R1a-Z280 lineages form a tight cluster connecting Permic with Ugric groups, with R1a-Z93 showing early isolation (probably) between Cis-Urals and Trans-Urals regions. While both Corded Ware lineages in Finno-Ugrians are most likely related to the Abashevo expansion through Seima-Turbino and the Andronovo-like Horizon (and potentially later Eurasian expansions), a plausible hypothesis would be that Finno-Ugrians are related to an expansion of R1a-Z283 haplogroups (we already knew about the Finno-Permic connection), while the ancient connection between Permians and Hungarians with R1a-Z93 would correspond to this haplogroup’s potentially tighter link with an early Samoyedic split.

I don’t think that an explosive expansion of eastern Corded Ware groups of R1a-Z645 lineages will show a clear-cut division of haplogroups among Eastern Uralic groups, though, and culturally I doubt we will have such a clear image, either (similar to how the explosive expansion of Bell Beakers cannot be easily divided by regional/language group into R1b-L151 subclades before the known bottlenecks). Relevant in this regard are the known Z93 samples from the Árpád dynasty.

Nevertheless, this data may represent a slightly more recent wave of R1a-Z280 lineages linked to the expansion of Ugric into the Trans-Uralian region, after their split from Finno-Permic, still in close contact with Indo-Iranians in Poltavka and Sintashta-Potapovka, evident from the early and late Indo-Iranian borrowings, during a common period when Samoyedic had already separated.

Such a “Z283 over Z93” layer in the Trans-Urals (and Cis-Urals?) forest-steppes would be similar to the apparent replacement of Z284 by Z282 in the Eastern Baltic during the Bronze Age (possibly with the second or Estonian Battle Axe wave or, much more likely during later population movements). Such an early R1a-Z93 split could potentially be supported also by the separation into bottlenecks under “Northern” (R1a-Z283) Finno-Ugric-speaking Abashevo-related groups and “Southern” (R1a-Z93) acculturated Indo-Iranian-speaking Abashevo migrants developing Sintashta-Potapovka admixing with Poltavka R1b-Z2103 herders.

r1a-z282-z280-z2125-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups.. Notice the potential Finno-Ugric-associated distribution of Z282 (especially R1a-M558, a Z280 subclade), the expansion of R1a-Z2123 subclades with Central Asian forest-steppe groups.

Conclusion

Let’s review some of the most common myths about Hungarians (and Finno-Ugrians in general) repeated ad nauseam, side by side with my assertions:

❌ N (especially N1c-Tat) in ancient and modern samples represent the True Uralic™ N1c peoples including Magyar tribes? Nope.

✅ Ancient N (especially N1c-Tat) lineages among Uralic populations expanded relatively recently, and differently in different regions (including eastern steppe nomads and northern arctic populations) not associated with a particular language or language group? Yep (read the series on Corded Ware = Uralic expansion).

❌ Modern Hungarian R1a-Z280 lineages represent the majority of the native population, poor Slavic ‘peasants’ from the Carpathian Basin, forcibly acculturated by a minority of bad bad Hungarian hordes? Nope.

✅ Modern Hungarian R1a-Z280 subclades represent Ugric lineages in common with ancient R1a-Z645 Finno-Ugric populations from north-eastern Europe and the Trans-Urals? Yep (see Avars and Ugrians).

❌ Modern Hungarian R1a-Z93 lineages represent acculturated Iranian/Turkic peoples from the steppes? Not likely.

✅ Modern Hungarian R1a-Z93 lineages represent a remnant of the expansion of Corded Ware to the east, potentially more clearly associated with Samoyedic? Much more likely.

finno-ugric-haplogroup-n
Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

Sooo, the theory of a “diluted” Y-DNA in Modern Hungarians from originally fully N-dominated conquerors subjugating native R1a-Z280 Slavs from the Carpathian Basin is not backed up by genetic studies? The ethnic Iranian-Turkic R1a-Z93 federation in the steppes that ended up speaking Magyar is not real?? Who would’ve thunk.

Another true story whose rejection in genetics could not be predicted, like, not at all.

Totally unexpected, too, the drift of “R1a=IE” fans with the newest genetic findings towards a Molgen-like “Yamna/R1b = Vasconic-Caucasian”, “N1c = Uralic-Altaic”, and “R1a = the origin of the white world in Mother Russia”. So much for the supposed interest in “Steppe ancestry” and fancy statistics.

Related

A Game of Thrones in Indo-European: proto-languages in Westeros and Essos, and population genomics

I think proto-languages can be applied to basically any appropriate prehistoric setting, and especially to science fiction and fantasy settings. I often viewed the lack of interest for them as based on the idea that they are not fantastic enough, that they would render a fantastic world too realistic to allow for an adequate immersion of the reader (or viewer) into a new world.

With time, I have become more and more convinced that most authors don’t use proto-languages (or tweaked versions of them) simply because they can’t, and resort to the easier way: inventing some rules and words based on some basic ideas and sounds they feel would fit a certain culture or people, to get going. After all, world-building is about a good enough, not too detailed description, and books are about characters and settings, not worlds.

After the end of the 7th season of the Game of Thrones TV series, of which I have become a great fan, I had some season finale grief to deal with, so I thought about applying what we knew about Proto-Indo-Europeans to the fantasy world. Since all book translations deal with English names as if they were translations of the Common Tongue (e.g. Spanish “Invernalia” or “Poniente” for “Winterfel” or “Westeros”), the idea of a translation into Proto-Indo-European seemed quite interesting.

NOTE. I understand that, for some, the idea that “the original language is the best” would make them reject this. However, just take into account the millions who enjoy the books and the TV series only in their native language, and know nothing about the ‘original’ version…

Here are the text and images:

A Dance with Old Tongues

As you can see, the idea of the Common Tongue being Late Proto-Indo-European brings about a whole new (infinite) world of dialectal evolution, language contacts, and population expansions which must be established for the whole setting to work. This is what the text I began to write was about: to use languages (and related populations) of ca. 6000-1500 BC, and to avoid anachronisms and impossible language relationships.

As an added advantage, fans of role-playing games could expand their world with the use of the language correspondences and the maps. This way, instead of “Northern English” being spoken in the North, and “Spanish English” being spoken in Dorne, according to some selections that have been naturally criticized, you have ancient languages that fit with the ancient setting, and which were actually related to each other.

8-westeros-essos-languages-equivalence
Equivalence of languages of the known world with coeval proto-languages. Solid red lines divide Graeco-Aryan from Northern Indo-European dialects (Tocharian is separated from North-West Indo-European by a dotted red line). See all maps.

I also began drawing a fantasy map, my first one – even though I have been member of Cartographer’s Guild for years – , which eventually helped me with my updates of maps of prehistoric migrations, and even with the use of arrows and colors for scientific publications. I drew details mainly to illustrate the text, not to offer a comprehensive translated world. Most of the work was done in the Summer of 2017, with some map changes done in 2018 with help of the maps and works of fans.

NOTE. I have reviewed it during some long travels lately, and included names of “bloodlines” (i.e. haplogroups), which I find more interesting today for people to understand bottlenecks during prehistoric migrations; I have also added a map using pie charts. If this doesn’t fit well with the whole picture, it’s because it’s a recent addition. The rest is more or less the same as one-two years ago.

I don’t have time now to correct much of what I wrote. I have forgotten most of the relevant details from the books, especially A World of Ice and Fire which I think helped me a lot with this, and I am sure that after writing A Song of Sheep and Horses (now you know the why of the book names) I would deal with some language identification and cognates differently.

I decided to publish it to liven up our Facebook page of Modern Indo-European now that the 8th season is near, so that people can participate and try to translate (translatable) names and expressions into Proto-Indo-European, to see how it would work out. You can also request access our Modern Indo-European and Proto-Indo-European groups; both are administered mainly by Fernando.

If you think this whole idea is crazy, or a huge loss of time, I agree; this is how you lose your time when you like fantasy, comic books, etc. But I am a great fan of fantasy and fiction, and I had a lot of free time back then, so I couldn’t help it…

On the other hand, if you feel that mixing fantasy (or SF) with the Proto-Indo-European question (especially population genomics) is a bad idea, I may have agreed with that two years ago, and maybe this is the reason why I hesitated to publish it then.

Hoewever, today we can read a whole new (2018 and 2019) bunch of “steppe ancestry=Indo-European” fantasies: invisible Nganasan reindeer hordes, a Fearsome Tisza River where Yamna settlers mysteriously disappear, shapeshifting Dutch CWC peoples who change haplogroups, languages dependent on cephalic types, or Yamna/Bell Beaker expanding Vasconic…So what’s the matter with some more fantasy?

Happy new year 2019…and enjoy our new books!

song-sheep-horses-header

Sorry for the last weeks of silence, I have been rather busy lately. I am having more projects going on, and (because of that) I also wanted to finish a project I have been working on for many months already.

I have therefore decided to publish a provisional version of the text, in the hope that it will be useful in the following months, when I won’t be able to update it as often as I would like to:

EDIT (20 JAN 2019): For those of you who are more comfortable reading in your native language, I have placed some links to automatic translations by Google Translate. They might work especially well for the texts of A Game of Clans & A Clash of Chiefs.

Don’t forget to check out the maps included in the supplementary materials: I have added Y-DNA, mtDNA, and ADMIXTURE data using GIS software. The PCA graphics are also important to follow the main text.

NOTE. Right now the files are only in my server. I will try to upload them to Academia.edu and Research Gate when I have time, I have uploaded them to Academia.edu and ResearchGate, in case the websites are too slow.

I would have preferred to wait for a thorough revision of the section on archaeology and the linguistic sections on Uralic, but I doubt I will have time when the reviews come, so it was either now or maybe next December…

I say so in the introduction, but it is evident that certain aspects of the book are tentative to say the least: the farther back we go from Late Proto-Indo-European, the less clear are many aspects. Also, linguistically I am not convinced about Eurasiatic or Nostratic, although they do have a certain interest when we try to offer a comprehensive view of the past, including ethnolinguistic identities.

I cannot be an expert in everything, and these books cover a lot. I am bound to publish many corrections as new information appears and more reviews are sent. For example, just days ago (before SNP calls of Wang et al. 2018 were published) some paragraphs implied that AME might have expanded Nostratic from the Middle East. Now it does not seem so, and I changed them just before uploading the text. That’s how tentative certain routes are, and how much all of this may change. And that only if we accept a Nostratic phylum…

NOTE. Since the first book I wrote was the linguistic one, and I have spent the last months updating the archaeology + genetics part, now many of you will probably understand 1) why I am so convinced about certain language relationships and 2) how I used many posts to clarify certain ideas and receive comments. Many posts offer probably a good timeline of what I worked with, and when.

Acknowledgements

I did not add this section to the books, because they are still not ready for print, but I think this is due somewhere now. It is impossible to reference all who have directly or indirectly contributed to this, so this is a list of those I feel have played an important role.

I am indebted to the following people (which does not mean that they share my views, obviously):

First and foremost, to Fernando López-Menchero, for having the patience to review with detail many parts on Indo-European linguistics, knowing that I won’t accept many of his comments anyway. The additional information he offers is invaluable, but I didn’t want to turn this into a huge linguistic encyclopaedia with unending discussions of tiny details of each reconstructed word. I think it is already too big as it is.

I would not have thought about doing this if it were not for the interest of Wekwos (Xavier Delamarre) in publishing a full book about the Indo-European demic diffusion model (in the second half of 2017, I think). It was them who suggested that I extended the content, when all I had done until then was write an essay and draw some maps in my free time between depositing the PhD thesis and defending it.

Sadly, as much as I would like to publish a book with a professional publisher, I don’t think ancient DNA lends itself for the traditional format, so my requests (mainly to have free licenses and being able to review the text at will, as new genetic papers are published) were logically not acceptable. Also, the main aim of all volumes, especially the linguistic one, is the teaching of essentials of Late Proto-Indo-European and related languages, and this objective would be thwarted by selling each volume for $50-70 and only in printed format. I prefer a wider distribution.

At first I didn’t think much of this proposal, because I do not benefit from this kind of publications in my scientific field, but with time my interest in writing a whole, comprehensive book on the subject grew to the point where it was already an ongoing project, probably by the start of 2018.

I would not have been in contact with Wekwos if it were not for user Camulogène Rix at Anthrogenica, so thanks for that and for the interest in this work.

I would not have thought of writing this either if not for the spontaneous support (with an unexpected phone call!) of a professor of the Complutense University of Madrid, Ángel Gómez Moreno, who is interested in this subject – as is his wife, a professor of Classics more closely associated to Indo-European studies, and who helped me with a search for Indo-Europeanists.

EDIT (1 JAN 2019): I remembered that Karin Bojs sent me her book after reading the demic diffusion model. I may have also thought about writing a whole book back then, but mid-2017 is probably too early for the project.

Professor Kortlandt is still to review the text, but he contributed to both previous essays in some very interesting ways, so I hope he can help me improve the parts on Uralic, and maybe alternative accounts of expansion for Balto-Slavic, depending on the time depth that he would consider warranted according to the Temematic hypothesis.

The maps are evidently (for those who are interested in genetics) in part the result of the effort of the late Jean Manco: As you can see from the maps including Y-DNA and mtDNA samples, I have benefitted from her way of organising data and publishing it. Similarly, the work of Iain McDonald in assessing the potential migration routes of R1b and R1a in Europe with the help of detailed maps was behind my idea for the first maps, and consequently behind these, too.

I should thank all people responsible for the release of free datasets to work with, including the Reich and Jena labs, the Veeramah Lab, and also researchers from the Max Planck Institute or the Mainz Palaeogenetics group, who didn’t mind to share with me datasets to work with.

Readers of this blog with interesting comments have also been essential for the improvement of the texts. You can probably see some of your many contributions there. I may not answer many comments, because I am always busy (and sometimes I just don’t have anything interesting to say), but I try to read all of them.

EDIT (1 JAN 2019) I think I should mention at least Chetan, Egg, or Robert George; but then I would leave out old europe, Sgr Ganesh, or Tileman Ehlen; and if I include them I would leave out others…

Users of other sites, like Anthrogenica, whose particular points of view and deep knowledge of some very specific aspects are sometimes very useful. In particular, user Anglesqueville helped me to fix some issues with the merging of datasets to obtain the PCAs and ADMIXTURE, and prepared some individual samples to merge them.

Even without posting anything, Google Analytics keeps sending me messages about increasing user fidelity (returning users), and stats haven’t really changed (which probably means more people are reading old posts), so thank you for that.

I hope you enjoy the books.

Happy new year!

Corded Ware—Uralic (IV): Hg R1a and N in Finno-Ugric and Samoyedic expansions

haplogroup-uralians

This is the fourth of four posts on the Corded Ware—Uralic identification:

Let me begin this final post on the Corded Ware—Uralic connection with an assertion that should be obvious to everyone involved in ethnolinguistic identification of prehistoric populations but, for one reason or another, is usually forgotten. In the words of David Reich, in Who We Are and How We Got Here (2018):

Human history is full of dead ends, and we should not expect the people who lived in any one place in the past to be the direct ancestors of those who live there today.

Haplogroup N

Another recurrent argument – apart from “Siberian ancestry” – for the location of the Uralic homeland is “haplogroup N”. This is as serious as saying “haplogroup R1” to refer to Indo-European migrations, but let’s explore this possibility anyway:

Ancient haplogroups

We have now a better idea of how many ancient migrations (previously hypothesized to be associated with westward Uralic migrations) look like in genetic terms. From Damgaard et al. (Science 2018):

These serial changes in the Baikal populations are reflected in Y-chromosome lineages (Fig. SA; figs. S24 to S27, and tables S13 and SI4). MAI carries the R haplogroup, whereas the majority of Baikal_EN males belong to N lineages, which were widely distributed across Northern Eurasia (29), and the Baikal_LNBA males all carry Q haplogroups, as do most of the Okunevo_EMBA as well as some present-day Central Asians and Siberians.

The only N1c1 sample comes from Ust’Ida Late Neolithic, 180km to the north of Lake Baikal, which – together with the Bronze Age sample from the Kola peninsula, and the medieval sample from Ust’Ida – gives a good idea of the overall expansion of N subclades and Siberian ancestry among the Circum-Arctic peoples of Eurasia, speakers of Palaeo-Siberian languages.

eurasian-n-subclades
Geographical location of ancient samples belonging to major clade N of the Y-chromosome.

Modern haplogroups

What we should expect from Uralic peoples expanding with haplogroup N – seeing how Yamna expands with R1b-L23, and Corded Ware expands with R1a-Z645 – is to find a common subclade spreading with Uralic populations. Let’s see if it works like that for any N-X subclade, in data from Ilumäe et al. (2016):

haplogroup_n1
Geographic-Distribution Map of hg N3 / N1c / N1a.

Within the Eurasian circum-Arctic spread zone, N3 and N2a reveal a well-structured spread pattern where individual sub-clades show very different distributions:

N1a1-M46 (or N-TAT), formed ca. 13900 BC, TMRCA 9800 BC

   N1a1a2-B187, formed ca. 9800 BC, TMRCA 1050 AD:

The sub-clade N3b-B187 is specific to southern Siberia and Mongolia, whereas N3a-L708 is spread widely in other regions of northern Eurasia.

     N1a1a1a-L708, formed ca. 6800 BC, TMRCA 5400 BC.

       N1a1a1a2-B211/Y9022, formed ca. 5400 BC, TMRCA 1900 BC:

The deepest clade within N3a is N3a1-B211, mostly present in the Volga-Uralic region and western Siberian Khanty and Mansi populations.

         N1a1a1a1a-L392/L1026), formed ca. 4400 BC, TMRCA 2800 BC:

The neighbor clade, N3a3’6-CTS6967, spreads from eastern Siberia to the eastern part of Fennoscandia and the Baltic States

haplogroup_n3a3
Frequency-Distribution Maps of Individual Subclade N3a3 / N1a1a1a1a1a-CTS2929/VL29, probably initially with Akozino warrior-traders.

           N1a1a1a1a1a-CTS2929/VL29, formed ca. 2100 BC, TMRCA 1600 BC:

In Europe, the clade N3a3-VL29 encompasses over a third of the present-day male Estonians, Latvians, and Lithuanians but is also present among Saami, Karelians, and Finns (Table S2 and Figure 3). Among the Slavic-speaking Belarusians, Ukrainians, and Russians, about three-fourths of their hg N3 Y chromosomes belong to hg N3a3.

In the post on Finno-Permic expansions, I depicted what seems to me the most likely way of infiltration of N1c-L392 lineages with Akozino warrior-traders into the western Finno-Ugric populations, with an origin around the Barents sea.

This includes the potential spread of (a minority of) N1c-B211 subclades due to contacts with Anonino on both sides of the Urals, through a northern route of forest and forest-steppe regions (equivalent to the distribution of Cherkaskul compared to Andronovo), given the spread of certain subclades in Ugric populations.

NOTE. An alternative possibility is the association of certain B211 subclades with a southern route of expansion with Pre-Scythian and Scythian populations, under whose influence the Ananino culture emerged -which would imply a very quick infiltration of certain groups of haplogroup N everywhere among Finno-Ugrics on both sides of the Urals – , and also the expansion of some subclades with Turkic-speaking peoples, who apparently expanded with alliances of different peoples. Both (Scythian and Turkic) populations expanded from East Asia, where haplogroup N (including N1c) was present since the Neolithic. I find this a worse model of expansion for upper clades, but – given the YFull estimates and the presence of this haplogroup among Turkic peoples – it is a possibility for many subclades.

           N1a1a1a1a2-Z1936, formed ca. 2800 BC, TMRCA 2400 BC:

The only notable exception from the pattern are Russians from northern regions of European Russia, where, in turn, about two-thirds of the hg N3 Y chromosomes belong to the hg N3a4-Z1936—the second west Eurasian clade. Thus, according to the frequency distribution of this clade, these Northern Russians fit better among other non-Slavic populations from northeastern Europe. N3a4 tends to increase in frequency toward the northeastern European regions but is also somewhat unexpectedly a dominant hg N3 lineage among most Turcic-speaking Volga Tatars and South-Ural Bashkirs.

haplogroup_n3a4
Frequency-Distribution Maps of Individual Subclade N3a4 / N1a1a1a1a2-Z1936, probably with the Samic (first) and Fennic (later) expansions into Paleo-Lakelandic and Palaeo-Laplandic territories.

The expansion of N1a-Z1936 in Fennoscandia is most likely associated with the expansion of Saami into asbestos ware-related territory (like the Lovozero culture) during the Late Iron Age – and mixture with its population – , and with the later Fennic expansion to the east and north, replacing their language, as well as with Arctic and forest populations assimilated during Permic, Ugric, and Samoyedic expansions to the north.

           N1a1a1a1a4-M2019 (previously N3a2), formed ca. 4400 BC, TMRCA 1700 BC:

Sub-hg N3a2-M2118 is one of the two main bifurcating branches in the nested cladistic structure of N3a2’6-M2110. It is predominantly found in populations inhabiting present-day Yakutia (Republic of Sakha) in central Siberia and at lower frequencies in the Khanty and Mansi populations, which exhibit a distinct Y-STR pattern (Table S7) potentially intrinsic to an additional clade inside the sub-hg N3a2

The second widespread sub-clade of hg N is N2a. (…):

   N1a2b-P43 (B523/FGC10846/Y3184), formed ca. 6800 BC, TMRCA ca. 2700 BC:

The absolute majority of N2a individuals belong to the second sub-clade, N2a1-B523, which diversified about 4.7 kya (95% CI = 4.0–5.5 kya). Its distribution covers the western and southern parts of Siberia, the Taimyr Peninsula, and the Volga-Uralic region with frequencies ranging from from 10% to 30% and does not extend to eastern Siberia (…)

haplogroup_n2
Geographic-Distribution Map of hg N2a1 / N1a2b-P43

The “European” branch suggested earlier from Y-STR patterns turned out to consist of two clades

     N1a2b2a-Y3185/FGC10847, formed ca. 2200 BC, TMRCA 800 BC:

N2a1-L1419, spread mainly in the northern part of that region.

     N1a2b2b1-B528/Y24382, formed ca. 900 BC, TMRCA ca. 900 BC:

N2a1-B528, spread in the southern Volga-Uralic region.

Haplogroup R1a

We also have a good idea of the distribution of haplogroup R1a-Z645 in ancient samples. Its subclades were associated with the Corded Ware expansion, and some of them fit quite well the early expansion of Finno-Permic, Ugric, and Samoyedic peoples to the east.

r1a-z282-z280-z2125-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups.. Notice the potential Finno-Ugric-associated distribution of Z282 (especially R1a-M558, a Z280 subclade), the expansion of R1a-Z2123 subclades with Central Asian forest-steppe groups.

This is how the modern distribution of R1a among Uralians looks like, from the latest report in Tambets et al. (2018):

  • Among Fennic populations, Estonians and Karelians (ca. 1.1 million) have not suffered the greatest bottleneck of Finns (ca. 6-7 million), and show thus a greater proportion of R1a-Z280 than N1c subclades, which points to the original situation of Fennic peoples before their expansion. To trust Finnish Y-DNA to derive conclusions about the Uralic populations is as useful as relying on the Basque Y-DNA for the language spread by R1b-P312
  • Among Volga-Finnic populations, Mordovians (the closest to the original Uralic cluster, see above) show a majority of R1a lineages (27%).
  • Hungarians (ca. 13-15 million) represent the majority of Ugric (and Finno-Ugric) peoples. They are mainly R1a-Z280, also R1a-Z2123, have little N1c, and lack Siberian ancestry, and represent thus the most likely original situation of Ugric peoples in 4th century AD (read more on Avars and Hungarians).
  • Among Samoyedic peoples, the Selkup, the southernmost ones and latest to expand – that is, those not heavily admixed with Siberian populations – , also have a majority of R1a-Z2123 lineages (see also here for the original Samoyedic haplogroups to the south).

To understand the relevance of Hungarians for Ugric peoples, as well as Estonians, Karelians, and Mordovians (and northern Russians, Finno-Ugric peoples recently Russified) for Finno-Permic peoples, as opposed to the Circum-Arctic and East Siberian populations, one has to put demographics in perspective. Even a modern map can show the relevance of certain territories in the past:

population-density
Population density (people per km2) map of the world in 1994. From Wikipedia.

Summary of ancestry + haplogroups

Fennic and Samic populations seem to be clearly influenced by Palaeo-Laplandic peoples, whereas Volga-Finnic and especially Permic populations may have received gene flow from both, but essentially Palaeo-Siberian influence from the north and east.

The fact that modern Mansis and Khantys offer the highest variation in N1a subclades, and some of the highest “Siberian ancestry” among non-Nganasans, should have raised a red flag long ago. The fact that Hungarians – supposedly stemming from a source population similar to Mansis – do not offer the same amount of N subclades or Siberian ancestry (not even close), and offer instead more R1a, in common with Estonians (among Finno-Samic peoples) and Mordvins (among Volga-Finnic peoples) should have raised a still bigger red flag. The fact that Nganasans – the model for Siberian ancestry – show completely different N1a2b-P43 lineages should have been a huge genetic red line (on top of the anthropological one) to regard them as the Uralian-type population.

We know now that ethnolinguistic groups have usually expanded with massive (usually male-biased) migrations, and that neighbouring locals often ‘resurge’ later without changing the language. That is seen in Europe after the spread of Bell Beakers, with the increase of previous ancestry and lineages in Scandinavia during the formation of the Nordic ethnolinguistic community; in Central-West Europe, with the resurgence of Neolithic ancestry (and lineages) during the Bronze Age over steppe ancestry; and in Central-East Europe (with Unetice or East European Bronze Age groups like Mierzanowice, Trzciniec, or Lusatian) showing an increase in steppe ancestry (and resurge of R1a subclades); none of them represented a radical ethnolinguistic change.

finno-ugric-haplogroup-n
Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

It is not hard to model the stepped arrival, infiltration, and/or resurge of N subclades and “Siberian ancestries”, as well as their gradual expansion in certain regions, associated with certain migrations first – such as the expansions to the Circum-Arctic region, and later the Scythian- and Turkic-related movements – , as well as limited regional developments, like the known bottleneck in Finns, or the clear late expansion of Ugric and Samoyedic languages to the north among nomadic Palaeo-Siberians due to traditions of exogamy and multilingualism. This fits quite well with the different arrival of N (N1c and xN1c) lineages to the different Uralic-speaking groups, and to the stepped appearance of “Siberian ancestry” in the different regions.

The aternative

It is evident that a lot of people were too attached to the idea of Palaeolithic R1b lineages ‘native’ to western Europe speaking Basque languages; of R1a lineages speaking Indo-European and spreading with Yamna; and N lineages ‘native’ to north-eastern Europe and speaking Uralic, and this is causing widespread weeping and gnashing of teeth (instead of the joy of discovering where one’s true patrilineal ancestors come from, and what language they spoke in each given period, which is the supposed objective of genetic genealogy…)

Since an Indo-Germanic branch (as revived now by some in the Copenhaguen group to fit Kristiansen’s theory of the 1980s with recent genetic data) does not make any sense in linguistics, the finding of R1a in Yamna would not have led where some think it would have, because North-West Indo-European would still be the main Late PIE branch in Europe. Don’t take my word for it; take James P. Mallory’s (2013).

mallory-adams-tree
The levels of Indo-European reconstruction, from Mallory & Adams (2006).

If an (unlikely) Indo-Slavonic group were posited, though, such a group would still be bound (with Indo-Iranian) to the steppes with East Yamna/Poltavka (admixing with Abashevo migrants, but retaining its language), developing Sintashta/Potapovka → Srubna/Andronovo, and R1a lineages would have equally undergone the known bottlenecks of the steppes where they replaced R1b-Z2103 – which this eastern group shares with Balkan languages, a haplogroup that links therefore together the Graeco-Aryan group.

As far as I know – and there might be many other similar pet theories out there – there have been proposals of “modern Balto-Slavic-like” populations (in an obvious circular reasoning based on modern populations) in some Scythian clusters of the Iron Age.

NOTE. I will not enter into “Balto-Slavic-like R1a” of the Late Bronze Age or earlier because no one can seriously believe at this point of development of Population Genetics that autosomal similarity predating 1,500+ years the appearance of Slavs equates to their (ethnolinguistic) ancestral population, without a clear intermediate cultural and genetic trail – something we lack today in the Slavic case even for the late Roman period…

finno-saamic-palaeo-germanic-substratum
The Finnic and Saamic separation looks shallower than it actually is. Invisible convergence can be ‘triangulated’ with the help of Germanic layers of mutual loanwords (Häkkinen 2012).

We also know of R1a-Z280 lineages in Srubna, probably expanding to the west. With that in mind, and knowing that Palaeo-Germanic was in close contact with Finno-Samic while both were already separated but still in contact, and that Palaeo-Germanic was also in contact and closely related to a ‘Temematic’ distinct from Balto-Slavic (and also that early Proto-Baltic and Proto-Slavic from the Roman Iron Age and later were in contact with western Uralic) this will be the linguistic map of the Iron Age if R1a is considered to expand Indo-European from some kind of “patron-client” relationship with west Yamna:

palaeo-germanic-italo-celtic
Eastern European language map during the Late Bronze Age / Iron Age, if R1a spread Indo-European languages and Eastern Yamna spoke Indo-Slavonic. Palaeo-Germanic (i.e. Pre- to Proto-Germanic) needs to be in contact with both the Samic Lovozero population and the Fennic west Circum-Arctic one. Italic and Celtic in contact with Pre-Germanic. Germanic in contact with Temematic. Balto-Slavic in contact with Iranian, and near Fennic to allow for later loanwords. For Germanic and Temematic, see Kortlandt (2018).

You might think I have some personal or political reason against this kind of proposals. I haven’t. We have been proposing Indo-European to be the language of the European Union for more than 10 years, so to support R1b-Italo-Celtic in the whole Western Europe, R1a-Germanic in Central and Eastern Europe, and R1a-Indo-Slavonic in the steppes (as the Danish group seems to be doing) has nothing inherently bad (or good) for me. If anything, it gives more reason to support the revival of North-West Indo-European in Europe.

My problem with this proposal is that it is obviously beholden to the notion of the uninterrupted cultural, historic and ethnic continuity in certain territories. This bias is common in historiography (von Falkenhausen 1993), but it extends even more easily into the lesser known prehistory of any territory, and now more than ever some people feel the need to corrupt (pre)history based on their own haplogroups (or the majority haplogroups of their modern countries). However, more than on philosophical grounds, my rejection is based on facts: this picture is not what the combination of linguistic, archaeological, and genetic data shows. Period.

Nevertheless, if Yamna + Corded Ware represented the “big and early expansion” of Germanic and Italo-Celtic peoples proper of the dream Nazi’s Lebensraum and Fascist’s spazio vitale proposals; Uralians were Siberian hunter-gatherers that controlled the whole eastern and northern Russia, and miraculously managed to push (ethnolinguistically) Neolithic agropastoralists to the west during and after the Iron Age, with gradual (and often minimal) genetic impact; and Balto-Slavic peoples were represented by horse riders from Pokrovka/Srubna, hiding then somewhere around the forest-steppe until after the Scythian expansion, and then spreading their language (without much genetic impact) during the early Middle Ages…so be it.

Related

Corded Ware—Uralic (III): “Siberian ancestry” and Ugric-Samoyedic expansions

siberian-ancestry-tambets

This is the third of four posts on the Corded Ware—Uralic identification. See

An Eastern Uralic group?

Even though proposals of an Eastern Uralic (or Ugro-Samoyedic) group are in the minority – and those who support it tend to search for an origin of Uralic in Central Asia – , there is nothing wrong in supporting this from the point of view of a western homeland, because the eastward migration of both Proto-Ugric and Pre-Samoyedic peoples may have been coupled with each other at an early stage. It’s like Indo-Slavonic: it just doesn’t fit the linguistic data as well as the alternative, i.e. the expansion of Samoyedic first, different from a Finno-Ugric trunk. But, in case you are wondering about this possibility, here is Häkkinen’s (2012) phonological argument:

ugro-samoyedic-uralic

The case of Samoyedic is quite similar to that of Hungarian, although the earliest Palaeo-Siberian contact languages have been lost. There were contacts at least with Tocharian (Kallio 2004), Yukaghir (Rédei 1999) and Turkic (Janhunen 1998). Samoyedic also:

a) has moved far from the related languages and has been exposed to strong foreign influence

b) shares a small number of common words with other branches (from Sammallahti 1988: only 123 ‘Uralic’ words, versus 390 ‘Uralic’ + ‘Finno-Ugric’ words found in other branches than Samoyedic = 31,5 %)

c) derives phonologically from the East Uralic dialect.

The phonological level is taxonomically more reliable, since it lacks the distortion caused by invisible convergence and false divergence at the lexical level. Thus we can conclude that the traditional taxonomic model, according to which Samoyedic was the first branch to split off from the Proto-Uralic unity, is just as incorrect as the view that Hungarian was the first branch to split off.

Seima-Turbino

Late Uralic can be traced back to metallurgical cultures thanks to terms like PU *wäśka ‘copper/bronze’ (borrowed from Proto-Samoyedic *wesä into Tocharian); PU *äsa and *olna/*olni, ‘lead’ or ‘tin’, found in *äsa-wäśka ‘tin-bronze’; and e.g. *weŋći ‘knife’, borrowed into Indo-Iranian (through the stage of vocalization of nasals), appearing later as Proto-Indo-Aryan *wāćī ‘knife, awl, axe’.

It is known that the southern regions of the Abashevo culture developed Proto-Indo-Iranian-speaking Sintashta-Petrovka and Pokrovka (Early Srubna). To the north, however, Abashevo kept its Uralic nature, with continuous contacts allowing for the spread of lexicon – mainly into Finno-Ugric – , and phonetic influence – mainly Uralisms into Proto-Indo-Iranian phonology (read more here).

The northern part of Abashevo (just like the south) was mainly a metallurgical society, with Abashevo metal prospectors found also side by side with Sintashta pioneers in the Zeravshan Valley, near BMAC, in search of metal ores. About the Seima-Turbino phenomenon, from Parpola (2013):

From the Urals to the east, the chain of cultures associated with this network consisted principally of the following: the Abashevo culture (extending from the Upper Don to the Mid- and South Trans-Urals, including the important cemeteries of Sejma and Turbino), the Sintashta culture (in the southeast Urals), the Petrovka culture (in the Tobol-Ishim steppe), the Taskovo-Loginovo cultures (on the Mid- and Lower Tobol and the Mid-Irtysh), the Samus’ culture (on the Upper Ob, with the important cemetery of Rostovka), the Krotovo culture (from the forest steppe of the Mid-Irtysh to the Baraba steppe on the Upper Ob, with the important cemetery of Sopka 2), the Elunino culture (on the Upper Ob just west of the Altai mountains) and the Okunevo culture (on the Mid-Yenissei, in the Minusinsk plain, Khakassia and northern Tuva). The Okunevo culture belongs wholly to the Early Bronze Age (c. 2250–1900 BCE), but most of the other cultures apparently to its latter part, being currently dated to the pre-Andronovo horizon of c. 2100–1800 BCE (cf. Parzinger 2006: 244–312 and 336; Koryakova & Epimakhov 2007: 104–105).

post-eneolithic-steppe-asia
Schematic map of the Middle Bronze Age cultures (steppe and foreststeppe
zone)

The majority of the Sejma-Turbino objects are of the better quality tin-bronze, and while tin is absent in the Urals, the Altai and Sayan mountains are an important source of both copper and tin. Tin is also available in southern Central Asia. Chernykh & Kuz’minykh have accordingly suggested an eastern origin for the Sejma-Turbino network, backing this hypothesis also by the depiction on the Sejma-Turbino knives of mountain sheep and horses characteristic of that area. However, Christian Carpelan has emphasized that the local Afanas’evo and Okunevo metallurgy of the Sayan-Altai area was initially rather primitive, and could not possibly have achieved the advanced and difficult technology of casting socketed spearheads as one piece around a blank. Carpelan points out that the first spearheads of this type appear in the Middle Bronze Age Caucasia c. 2000 BCE, diffusing early on to the Mid-Volga-Kama-southern Urals area, where “it was the experienced Abashevo craftsmen who were able to take up the new techniques and develop and distribute new types of spearheads” (Carpelan & Parpola 2001: 106, cf. 99–106, 110). The animal argument is countered by reference to a dagger from Sejma on the Oka river depicting an elk’s head, with earlier north European prototypes (Carpelan & Parpola 2001: 106–109). Also the metal analysis speaks for the Abashevo origin of the Sejma-Turbino network. Out of 353 artefacts analyzed, 47% were of tin-bronze, 36% of arsenical bronze, and 8.5% of pure copper. Both the arsenical bronze and pure copper are very clearly associated with the Abashevo metallurgy.

seima-turbino-phenomenon-parpola
Find spots of artefacts distributed by the Sejma-Turbino intercultural trader network, and the areas of the most important participating cultures: Abashevo, Sintashta, Petrovka. Based on Chernykh 2007: 77.

The Abashevo metal production was based on the Volga-Kama-Belaya area sandstone ores of pure copper and on the more easterly Urals deposits of arsenical copper (Figure 9). The Abashevo people, expanding from the Don and Mid-Volga to the Urals, first reached the westerly sandstone deposits of pure copper in the Volga and Kama basins, and started developing their metallurgy in this area, before moving on to the eastern side of the Urals to produce harder weapons and tools of arsenical copper. Eventually they moved even further south, to the area richest in copper in the whole Urals region, founding there the very strong and innovative Sintashta culture.

Regarding the most likely expansion of Eastern Uralic peoples:

Nataliya L’vovna Chlenova (1929–2009; cf. Korenyako & Ku’zminykh 2011) published in 1981 a detailed study of the Cherkaskul’ pottery. In her carefully prepared maps of 1981 and 1984 (Figure 10), she plotted Cherkaskul’ monuments not only in Bashkiria and the Trans-Urals, but also in thick concentrations on the Upper Irtysh, Upper Ob and Upper Yenissei, close to the Altai and Sayan mountains, precisely where the best experts suppose the homeland of Proto-Samoyed to be.

cherkaskul-andronovo
Distribution of Srubnaya (Timber Grave, early and late), Andronovo (Alakul’ and Fëdorovo variants) and Cherkaskul’ monuments. After Parpola 1994: 146, fig. 8.15, based on the work of N. L. Chlenova (1984: map facing page 100).

Ugric

The Cherkaskul’ culture was transformed into the genetically related Mezhovka culture (c. 1500–1000 BCE), which occupied approximately the same area from the Mid-Kama and Belaya rivers to the Tobol river in western Siberia (cf. Parzinger 2006: 444–448; Koryakova & Epimakhov 2007: 170–175). The Mezhovka culture was in close contact with the neighbouring and probably Proto-Iranian speaking Alekseevka alias Sargary culture (c. 1500–900 BCE) of northern Kazakhstan (Figure 4 no. 8) that had a Fëdorovo and Cherkaskul’ substratum and a roller pottery superstratum (cf. Parzinger 2006: 443–448; Koryakova & Epimakhov 2007: 161–170). Both the Cherkaskul’ and the Mezhovka cultures are thought to have been Proto-Ugric linguistically, on the basis of the agreement of their area with that of Mansi and Khanty speakers, who moreover in their Fëdorovo-like ornamentation have preserved evidence of continuity in material culture (cf. Chlenova 1984; Koryakova & Epimakhov 2007: 159, 175).

mezhovska-sargary-irmen
Cultures of the Final Bronze Age of the Urals and western Siberia (steppe
and forest-steppe zone).

The Mezhovka culture was succeeded by the genetically related Gamayun culture (c. 1000–700 BCE) (cf. Parzinger 2006: 446; 542–545).

From the Gamayun culture descend Trans-Urals cultures in close contact with Finno-Permic populations of the Cis-Ural region:

  • [Proto-Mansi] Itkul’ culture (c. 700–200 BCE) distributed along the eastern slope of the Ural Mountains (cf. Parzinger 2006: 552–556). Known from its walled forts, it constituted the principal Trans-Uralian centre of metallurgy in the Iron Age, and was in contact with both the Anan’ino and Akhmylovo cultures (the metallurgical centres of the Mid-Volga and Kama-Belaya region) and the neighbouring Gorokhovo culture.
    • [Proto-Hungarian] via the Vorob’evo Group (c. 700–550 BCE) (cf. Parzinger 2006: 546–549), to the Gorokhovo culture (c. 550–400 BCE) of the Trans-Uralian forest steppe (cf. Parzinger 2006: 549–552). For various reasons the local Gorokhovo people started mobile pastoral herding and became part of the multicomponent pastoralist Sargat culture (c. 500 BCE to 300 CE), which in a broader sense comprized all cultural groups between the Tobol and Irtysh rivers, succeeding here the Sargary culture. The Sargat intercommunity was dominated by steppe nomads belonging to the Iranian-speaking Saka confederation, who in the summer migrated northwards to the forest steppe
  • [Proto-Khanty] Late Bronze Age and Early Iron Age cultures related to the Gamayunskoe and Itkul’ cultures that extended up to the Ob: the Nosilovo, Baitovo, Late Irmen’, and Krasnoozero cultures (c. 900–500 BCE). Some were in contact with the Akhmylovo on the Mid-Volga.
sargat-gorokhovo-bolscherechye
Cultural groups of the Iron Age in the forest-steppe zone of western
Siberia. (

Samoyedic

Parpola (2012) connects the expansion of Samoyedic with the Cherkaskul variant of Andronovo. As we know, Andronovo was genetically diverse, which speaks in favour of different groups developing similar material cultures in Central Asia.

Juha Janhunen, author of the etymological dictionary of the Samoyed languages (1977), places the homeland of Proto-Samoyedic in the Minusinsk basin on the Upper Yenissei (cf. Janhunen 2009: 72). Mainly on the basis of Bulghar Turkic loanwords, Janhunen (2007: 224; 2009: 63) dates Proto-Samoyedic to the last centuries BCE. Janhunen thinks that the language of the Tagar culture (c. 800–100 BCE) ought to have been Proto-Samoyedic (cf. Janhunen 1983: 117– 118; 2009: 72; Parzinger 2001: 80 and 2006: 619–631 dates the Tagar culture c. 1000–200 BCE; Svyatko et al. 2009: 256, based on human bone samples, c. 900 BCE to 50 CE). The Tagar culture largely continues the traditions of the Karasuk culture (c. 1400–900 BCE), (…)

chicha-irmen-tagar-baraba-forest-siberian
Map showing the location of Chicha-1.

For the most recent expansions of Samoyedic languages to the north, into Palaeo-Siberian populations, read more about the traditional multilingualism of Siberian populations.

Genetics

Siberian ancestry

The use of a map of “Siberian ancestry” peaking in the arctic to show a supposedly late Uralic population movement (starting in the Iron Age!) seems to be the latest trend in population genomics:

siberian-ancestry-map
Frequency map of the so-called ‘Siberian’ component. From Tambets et al. (2018) (see below for ADMIXTURE in specific populations).

I guess that would make this map of Neolithic farmer ancestry represent an expansion of Indo-European from the south, because Anatolia, Greece, Italy, southern France, and Iberia – where this ancestry peaks in modern populations – are among the oldest territories where Indo-European languages were recorded:

reich-farmer-ancestry
Modern genome-wide data shows that the primary gradient of farmer ancestry in Europe does not flow southeast-to-northwest but instead in an almost perpendicular direction, a result of a major migration of pastoralists from the east that displaced much of the ancestry of the first farmers.

Probably not the right interpretation of this kind of simplistic data about modern populations, though…

The most striking thing about the “Siberian ancestry” white whale is that nobody really knows what it is; just like we did not know what “Yamnaya ancestry” was, until the most recent data is making the picture clearer. Its nature is changing with each new paper, and it can be summed up by “some ancestry we want to find that is common to Uralic-speaking peoples, and should not be CWC-related”. Tambets et al. (2018) explain quite well how they “found it”:

Overall, and specifically at lower values of K, the genetic makeup of Uralic speakers resembles that of their geographic neighbours. The Saami and (a subset of) the Mansi serve as exceptions to that pattern being more similar to geographically more distant populations (Fig. 3a, Additional file 3: S3). However, starting from K = 9, ADMIXTURE identifies a genetic component (k9, magenta in Fig. 3a, Additional file 3: S3), which is predominantly, although not exclusively, found in Uralic speakers. This component is also well visible on K = 10, which has the best cross-validation index among all tests (Additional file 3: S3B). The spatial distribution of this component (Fig. 3b) shows a frequency peak among Ob-Ugric and Samoyed speakers as well as among neighbouring Kets (Fig. 3a). The proportion of k9 decreases rapidly from West Siberia towards east, south and west, constituting on average 40% of the genetic ancestry of FU speakers in Volga-Ural region (VUR) and 20% in their Turkic-speaking neighbours (Bashkirs, Tatars, Chuvashes; Fig. 3a).

siberian-ancestry-modern
Population structure of Uralic-speaking populations inferred from ADMIXTURE analysis on autosomal SNPs in Eurasian context. Individual ancestry estimates for populations of interest for selected number of assumed ancestral populations (K3, K6, K9, K11). Ancestry components discussed in a main text (k2, k3, k5, k6, k9, k11) are indicated and have the same colours throughout. The names of the Uralic-speaking populations are indicated with blue (Finno-Ugric) or orange (Samoyedic). Image from Tambets et al. (2018).

However, this ‘something’ that some people occasionally find in some Uralic populations is also common to other modern and ancient groups, and not so common in some other Uralic peoples. Simply put:

siberian-ancestry-modern-populations
Image modified from Lamnidis et al. (2018). Red line representing maximum “Siberian admixture” in Eastern European hunter-gatherers. In blue, Uralic-speaking groups. “Plot of ADMIXTURE (K=3) results containing West Eurasian populations and the Nganasan. Ancient individuals from this study are represented by thicker bars.”

I already said this in the recent publication of Siberian samples, where a renamed and radiocarbon dated Finnish_IA clearly shows that Late Iron Age Saami (ca. 400 AD) had little “Siberian ancestry”, if any at all, representing the most likely Fennic (and Samic) ancestral components before their expansion into central and northern Finland, where they admixed with circum-polar peoples of asbestos ware cultures.

I will say that again and again, any time they report the so-called “Siberian ancestry” in Uralic samples, no matter how it is defined each time: it does not seem to be that special something people are looking for, but rather (at least in a great part) a quite old ancestral component forming an evident cline with EHG, whose best proximate source are Baikal_EN (and/or Devil’s Gate) at this moment, and thus also East European hunter-gatherers for Western Uralic peoples:

dzudzuana-baikal-en-admixture
Image modified from Lazaridis et al. (2018). In red: samples with Baikal_EN ancestry in speculative estimates. In pink: samples with Baikal_EN ancestry in conservative estimates (probably marking a recent arrival of Baikal_En ancestry, see here). Modeling present-day and ancient West-Eurasians. Mixture proportions computed with qpAdm (Supplementary Information section 4). The proportion of ‘Mbuti’ ancestry represents the total of ‘Deep’ ancestry from lineages that split prior to the split of Ust’Ishim, Tianyuan, and West Eurasians and can include both ‘Basal Eurasian’ and other (e.g., Sub-Saharan African) ancestry. (Left) ‘Conservative’ estimates. Each population 367 cannot be modeled with fewer admixture events than shown. (Right) ‘Speculative’ estimates. The highest number of sources (≤5) with admixture estimates within [0,1] are shown for each population. Some of the admixture proportions are not significantly different from 0 (Supplementary Information section 4).

So either Samara_HG, Karelia_HG, and many other groups from eastern Europe all spoke Uralic according to this ADMIXTURE graphic (and the formation of steppe ancestry in the Volga-Ural region brought the Proto-Indo-European language to the steppes through the CHG/ANE expansion), or a great part of this “Siberian ancestry” found in modern Uralic-speaking populations is not what some people would like to think it is…

Modern populations

PCA clines can be looked for to represent expansions of ancient populations. Most recently, Flegontov et al. (2018) are attempting to do this with Asian populations:

For some Turkic groups in the Urals and the Altai regions and in the Volga basin, a different admixture model fits the data: the same West Eurasian source + Uralic- or Yeniseian-speaking Siberians. Thus, we have revealed an admixture cline between Scythians and the Iranian farmer genetic cluster, and two further clines connecting the former cline to distinct ancestry sources in Siberia. Interestingly, few Wusun-period individuals harbor substantial Uralic/Yeniseian-related Siberian ancestry, in contrast to preceding Scythians and later Turkic groups characterized by the Tungusic/Mongolic-related ancestry. It remains to be elucidated whether this genetic influx reflects contacts with the Xiongnu confederacy. We are currently assembling a collection of samples across the Eurasian steppe for a detailed genetic investigation of the Hunnic confederacies.

jeong-population-clines
Three distinct East/West Eurasian clines across the continent with some interesting linguistic correlates, as earlier reported by Jeong et al. (2018). Alexander M. Kim.

There are potential errors with this approach:

The main one is practical – does a modern cline represent an ancestral language? The answer is: sometimes. It depends on the anthropological context that we have, and especially on the precision of the PCA:

clines-himalayan
Genetic structure of the Himalayan region populations from analyses using unlinked SNPs. (A) PCA of the Himalayan and HGDP-CEPH populations. Each dot represents a sample, coded by region as indicated. The Himalayan region samples lie between the HGDP-CEPH East Asian and South Asian samples on the right-hand side of the plot. From Arciero et al. (2018).

The ‘Europe’, ‘Middle East’, etc. clines of the above PCA do not represent one language, but many. For starters, the PCA includes too many (and modern) populations, its precision is useless for ethnolinguistic groups. Which is the right level? Again, it depends.

The other error is one of detail of the clines drawn (which, in turn, depends on the precision of the PCA). For example, we can draw two paralell lines (or even one line, as in Flegontov et al. above) in one PCA graphic, but we still don’t have the direction of expansion. How do we know if this supposed “Uralic-speaking cline” goes from one region to the other? For that level of detail, we should examine closely modern Uralic-speaking peoples and Circum-Arctic populations:

uralic-cline
Modified from Tambets et al. (2018). Principal component analysis (PCA) and genetic distances of Uralic-speaking populations. a PCA (PC1 vs PC2) of the Uralic-speaking populations

The real ancient Uralic cluster (drawn above in blue) is thus probably from a North-East European source (probably formed by Battle Axe / Fatyanovo-Balanovo / Abashevo) to the east into Siberian populations, and to the north into Laplandic populations (see below also on Mezhovska ancestry for the drawn ‘European cline’, which some may a priori wrongly assume to be quite late).

The fact that the three formed clines point to an admixture of CWC-related populations from North-Eastern Europe, and that variation is greater at the Palaeo-Laplandic and Palaeo-Siberian extremities compared to the CWC-related one, also supports this as the correct interpretation.

However, judging by the two main clines formed, one could be alternatively inclined to interpret that Palaeo-Laplandic and Palaeo-Siberian populations formed a huge ancestral “Uralic” ghost cluster in Siberia (spanning from the Palaeo-Laplandic to the Palaeo-Siberian one), and from there expanded Finno-Samic on one hand, and “Volga-Ugro-Samoyed” on the other. That poses different problems: an obvious linguistic and archaeological one – which I assume a lot of people do not really care about – , and a not-so-obvious genetic one (see below for ancient samples and for the expansion of haplogroup N).

To understand the simplest solution better, one can just have a look at the PCA from Bell Beaker samples in Olalde et al. (2018), which (as Reich has already explained many times) expanded directly from Yamna R1b-L23 lineages:

olalde_pca_clines
Image modified from Olalde et al. (2018). PCA of 999 Eurasian individuals. Marked is the Espersted Outlier with the approximate position of Yamna Hungary, probably the source of its admixture. Different Bell Beaker clines have been drawn, to represent approximate source of expansions from Central European sources into the different regions.

Unlike this PCA with ancient samples, where Bell Beaker clines could be a rough approximation to the real sources for each population, and where a cluster spanning all three depicted Early Bronze Age clusters could give a rough proximate source of European Bell Beakers in Hungary (and where one can even distinguish the Y-DNA bottlenecks in the L23 trunk created by each cline) the PCA of modern Uralic populations is probably not suitable for a good estimate of the ancient situation, which may be found shifted up or down of the drawn “Uralic” cluster along East European groups.

After all, we already know that the Siberian cline shows probably as much an ancient admixture event – from the original Uralic expansion to the east with Corded Ware ancestry – as another more recent one – a westward migration of Siberian ancestry (or even more than one). While we know with more or less exactitude what happened with the Palaeo-Laplandic admixture by expanding Proto-Finno-Samic populations (see here), the Proto-Ugric and Pre-Samoyedic populations formed probably more than one cline during the different ancient migrations through central Asia.

Ancient populations

Apparently, the Corded Ware expansion to the east was not marked by a huge change in ancestry. While the final version of Narasimhan et al. (2018) may show a little more detail about other forest-steppe Seima-Turbino/Andronovo-related migrations (and thus also Eastern Uralic peoples), we have already had enough information for quite some time to get a good idea.

mezhovska-pca
Principal component analysis. PCA of ancient individuals (according colours see legend) projected on modern West Eurasians (grey). Iron Age Scythians are shown in black; CHG, Caucasus hunter-gatherer; LNBA, late Neolithic/Bronze Age; MN, middle Neolithic; EHG, eastern European huntergatherer; LBK_EN, early Neolithic Linearbandkeramik; HG, hunter-gatherer; EBA, early Bronze Age; IA, Iron Age; LBA, late Bronze Age; WHG, western hunter-gatherer.dataset (grey). Iron Age Scythians are shown in black; CHG, Caucasus hunter-gatherer; LNBA, late Neolithic/Bronze Age; MN, middle Neolithic; EHG, eastern European hunter-gatherer; LBK_EN, early Neolithic Linearbandkeramik; HG, hunter-gatherer; EBA, early Bronze Age; IA, Iron Age; LBA, late Bronze Age; WHG, western hunter-gatherer.

Mezhovska‘s position is similar to the later Pre-Scythian and Scythian populations. There are some interesting details: apart from haplogroup R1a-Z280 (CTS1211+), there is one R1b-M269 (PF6494+), probably Z2103, and an outlier (out of three) in a similar position to the recently described central/southern Scythian clusters.

NOTE. The finding of R1b-M269 in the forest-steppe is probably either 1) from an Afanasevo-Okunevo origin, or 2) from an admixture with neighbouring Andronovo-related populations, such as Sargary. A third, maybe less likely option is that this haplogroup admixed with Abashevo directly (as it happened in Sintashta, Potapovka, or Pokrovka) and formed part of early Uralic migrations. In any case, since Mezhovska is a Bronze Age society from the Urals region, its association with R1b-Z2103 – like the association of R1b-Z2103 in Scythian clusters – cannot be attributed to “Thracian peoples”, a link which is (as I already said) too simplistic.

The drawn “European cline” of Hungarians (see above), leading from ‘west-like’ Mansi to Hungarian populations – and hosting also Finnic and Estonian samples – , cannot therefore be attributed simply to late “Slavic/Balkan-like” admixture.

Karasuk – located further to the east – is basically also Corded Ware peoples showing clearly a recent admixture with local ANE / Baikal_EN-like populations. In terms of haplogroups it shows haplogroup Q, R1a-Z2124, and R1a-Z2123, later found among early Hungarians, and present also in ancient Samoyedic populations now acculturated.

The most interesting aspect of both Mezhovska and Karasuk is that they seem to diverge from a point close to Ukraine_Eneolithic, which is the supposed ancestral source of Corded Ware peoples (read more about the formation of “steppe ancestry”). This means that Eastern Uralians derive from a source closer to Middle Dnieper/Abashevo populations, rather than Battle Axe (shifted to Latvian Neolithic), which is more likely the source prevalent in Finno-Permic peoples.

Their initial admixture with (Palaeo-)Siberian populations is thus seen already starting by this time in Mezhovska and especially in Karasuk, but this process (compared to modern populations) is incomplete:

f4-test-karasuk-mezhovska
Visualization of f-statistics results. f4(Test, LBK; Han, Mbuti) values are plotted on x axis and f4(Test, LBK; EHG, Mbuti) values on y axis, positive deviations from zero show deviations from a clade between Test and LBK. A red dashed line is drawn between Yamnaya from Samara and Ami. Iron Age populations that can be modelled as mixtures of Yamnaya and East Eurasians (like the Ami) are arrayed around this line and appear to be distinct from the main North/South European cline (blue) on the left of the x axis.
karasuk-mezhovska-admixture
ADMIXTURE results for ancient populations. Red arrows point to the Iron Age Scythian individuals studied. LBK_EN: Early Neolithic Linearbandkeramik; EHG: Eastern European hunter-gatherer; Motala_HG: hunter-gatherer from Motala (Sweden); WHG: western hunter-gatherer; CHG: Caucasus hunter-gatherer; IA: Iron Age; EBA: Early Bronze Age; LBA: Late Bronze Age.

We know now that Samic peoples expanded during the Late Iron Age into Palaeo-Laplandic populations, admixing with them and creating this modern cline. Finns expanded later to the north (in one of their known genetic bottlenecks), admixing with (and displacing) the Saami in Finland, especially replacing their male lines.

So how did Ugric and Samoyedic peoples admix with Palaeo-Siberian populations further, to obtain their modern cline? The answer is, logically, with East Asian migrations related to forest-steppe populations of Central Asia after the Mezhovska and Karasuk periods, i.e. during the Iron Age and later. Other groups from the forest-steppe in Central Asia show similar East Asian (“Siberian”) admixture. We know this from Narasimhan et al. (2018):

(…) we observe samples from multiple sites dated to 1700-1500 BCE (Maitan, Kairan, Oy_Dzhaylau and Zevakinsikiy) that derive up to ~25% of their ancestry from a source related to present-day East Asians and the remainder from Steppe_MLBA. A similar ancestry profile became widespread in the region by the Late Bronze Age, as documented by our time transect from Zevakinsikiy and samples from many sites dating to 1500-1000 BCE, and was ubiquitous by the Scytho-Sarmatian period in the Iron Age.

We already have some information about these later migrations:

siberian-genetic-component-chronology
Very important observation with implication of population turnover is that pre-Turkic Inner Eurasian populations’ Siberian ancestry appears predominantly “Uralic-Yeniseian” in contrast to later dominance of “Tungusic-Mongolic” sort (which does sporadically occur earlier). Alexander M. Kim

The Ugric-speaking Sargat culture in Western Siberia shows the expected mixture of haplogroups (ca. 500 BC – 500 AD), with 5 samples of hg N and 2 of hg R1a1, in Pilipenko et al. (2017). Although radiocarbon dates and subclades are lacking, N lineages probably spread late, because of the late and gradual admixture of Siberian cultures into the Sargat melting pot.

The Samoyedic-speaking Tagar culture also shows signs of a genetic turnover in Pilipenko et al. (2018):

The observed reduction in the genetic distance between the Middle Tagar population and other Scythian like populations of Southern Siberia(Fig 5; S4 Table), in our opinion, is primarily associated with an increase in the role of East Eurasian mtDNA lineages in the gene pool (up to nearly half of the gene pool) and a substantial increase in the joint frequency of haplogroups C and D (from 8.7% in the Early Tagar series to 37.5% in the Middle Tagar series). These features are characteristic of many ancient and modern populations of Southern Siberia and adjacent regions of Central Asia, including the Pazyryk population of the Altai Mountains.

Before the Iron Age, the Karasuk and Mezhovska population were probably already somehow ‘to the north’ within the ancient Steppe-Altai cline (see image below9 created by expanding Seima-Turbino- and Andronovo-related populations. During the Iron Age, further Siberian contributions with Iranian expansions must have placed Uralians of the Central Asian forest-steppe areas much closer to today’s Palaeo-Siberian cline.

However, the modern genetic picture was probably fully developed only in historic times, when Samoyedic and Ugric languages expanded to the north, only in part admixing further with Palaeo-Siberian-speaking nomads from the Circum-Arctic region (see here for a recent history of Samoyedic Enets), which justifies their more recent radical ‘northern shift’.

east-uralic-clines
Modified image from Jeong et al. (2018), supplementary materials. The first two PCs summarizing the genetic structure within 2,077 Eurasian individuals. The two PCs generally mirror geography. PC1 separates western and eastern Eurasian populations, with many inner Eurasians in the middle. PC2 separates eastern Eurasians along the north-south cline and also separates Europeans from West Asians. Ancient individuals (color-filled shapes), including two Botai individuals, are projected onto PCs calculated from present-day individuals.

This late acquisition of the language by Palaeo-Siberian nomads (without much population replacement) also justifies the wide PCA clusters of very small Siberian populations. See for example in the PCA from Tambets et al. (2018):

uralic-ugric-samoyedic-modern-clines
Approximate Ugric and Samoyedic clines (exluding apparent outliers). Modified from Tambets et al. (2018). Principal component analysis (PCA) and genetic distances of Uralic-speaking populations. a PCA (PC1 vs PC2) of the Uralic-speaking populations

For their relationship with modern Mansi, we have information on Hungarian conqueror populations from Neparáczki et al. (2018):

Moreover, Y, B and N1a1a1a1a Hg-s have not been detected in Finno-Ugric populations [80–84], implying that the east Eurasian component of the Conquerors and Finno-Ugric people are probably not directly related. The same inference can be drawn from phylogenetic data, as only two Mansi samples appeared in our phylogenetic trees on the side branches (S1 Fig, Networks; 1, 4) suggesting that ancestors of the Mansis separated from Asian ancestors of the Conquerors a long time ago. This inference is also supported by genomic Admixture analysis of Siberian and Northeastern European populations [85], which revealed that Mansis received their eastern Siberian genetic component approximately 5–7 thousand years ago from ancestors of modern Even and Evenki people. Most likely the same explanation applies to the Y-chromosome N-Tat marker which originated from China [86,87] and its subclades are now widespread between various language groups of North Asia and Eastern Europe [88].

The genetic picture of Hungarians (their formed cline with Mansi and their haplogroups) may be quite useful for the true admixture found originally in Mansi peoples at the beginning of the Iron Age. By now it is clear even from modern populations that Steppe_MLBA ancestry accompanied the Uralic expansion to the east (roughly approximated in the graphic with Afanasievo_EBA + Bichon_LP EasternHG_M):

siberian-population-expansions
Admixture modelling using qpAdm. Maps showing locations and ancestry proportions of ancient (left) and modern (right) groups. From Sikora et al. (2018).

Continue reading the final post of the series: Corded Ware—Uralic (IV): Haplogroups R1a and N in Finno-Ugric and Samoyedic.

Related

  • The traditional multilingualism of Siberian populations
  • Iron Age bottleneck of the Proto-Fennic population in Estonia
  • Y-DNA haplogroups of Tuvinian tribes show little effect of the Mongol expansion
  • Corded Ware—Uralic (I): Differences and similarities with Yamna
  • Haplogroup R1a and CWC ancestry predominate in Fennic, Ugric, and Samoyedic groups
  • The Iron Age expansion of Southern Siberian groups and ancestry with Scythians
  • Evolution of Steppe, Neolithic, and Siberian ancestry in Eurasia (ISBA 8, 19th Sep)
  • Mitogenomes from Avar nomadic elite show Inner Asian origin
  • On the origin and spread of haplogroup R1a-Z645 from eastern Europe
  • Oldest N1c1a1a-L392 samples and Siberian ancestry in Bronze Age Fennoscandia
  • Consequences of Damgaard et al. 2018 (III): Proto-Finno-Ugric & Proto-Indo-Iranian in the North Caspian region
  • The concept of “Outlier” in Human Ancestry (III): Late Neolithic samples from the Baltic region and origins of the Corded Ware culture
  • Genetic prehistory of the Baltic Sea region and Y-DNA: Corded Ware and R1a-Z645, Bronze Age and N1c
  • More evidence on the recent arrival of haplogroup N and gradual replacement of R1a lineages in North-Eastern Europe
  • Another hint at the role of Corded Ware peoples in spreading Uralic languages into north-eastern Europe, found in mtDNA analysis of the Finnish population
  • New Ukraine Eneolithic sample from late Sredni Stog, near homeland of the Corded Ware culture
  • Waves of Palaeolithic ANE ancestry driven by P subclades; new CWC-like Finnish Iron Age

    New preprint The population history of northeastern Siberia since the Pleistocene, by Sikora et al. bioRxiv (2018).

    Interesting excerpts (emphasis mine; most internal references removed):

    ANE ancestry

    The earliest, most secure archaeological evidence of human occupation of the region comes from the artefact-rich, high-latitude (~70° N) Yana RHS site dated to ~31.6 kya (…)

    The Yana RHS human remains represent the earliest direct evidence of human presence in northeastern Siberia, a population we refer to as “Ancient North Siberians” (ANS). Both Yana RHS individuals were unrelated males, and belong to mitochondrial haplogroup U, predominant among ancient West Eurasian hunter-gatherers, and to Y chromosome haplogroup P1, ancestral to haplogroups Q and R, which are widespread among present-day Eurasians and Native Americans.

    Symmetry tests using f4 statistics reject tree-like clade relationships with both Early West Eurasians (EWE; Sunghir) and Early East Asians (EEA; Tianyuan); however, Yana is genetically closer to EWE, despite its geographic location in northeastern Siberia

    Using admixture graphs (qpGraph) and outgroup-based estimation of mixture proportions (qpAdm), we find that Yana can be modelled as EWE with ~25% contribution from EEA

    Among all ancient individuals, Yana shares the most genetic drift with Mal’ta, and f4 statistics show that Mal’ta shares more alleles with Yana than with EWE (e.g. f4(Mbuti,Mal’ta;Sunghir,Yana) = 0.0019, Z = 3.99). Mal’ta and Yana also exhibit a similar pattern of genetic affinities to both EWE and EEA, consistent with previous studies.The ANE lineage can thus be considered a descendant of the ANS lineage, demonstrating that by 31.6 kya early representatives of this lineage were widespread across northern Eurasia, including far northeastern Siberia.

    siberian-samples-haplogroup

    Ancient Palaeosiberian

    (…) the 9.8 kya Kolyma1 individual, representing a group we term “Ancient Paleosiberians” (AP). Our results indicate that AP are derived from a first major genetic shift observed in the region. Principal component analysis (PCA), outgroup f3-statistics and mtDNA and Y chromosome haplogroups (G1b and Q1a1a, respectively) demonstrate a close affinity between AP and present-day Koryaks, Itelmen and Chukchis, as well as with Native Americans.

    For both AP and Native Americans, ANS ancestry appears more closely related to Mal’ta than Yana, therefore rejecting a direct contribution of Yana to later AP or Native American groups.

    Lake Baikal Neolithic – Bronze Age

    (…) the newly reported genomes from Ust’Belaya and recently published neighbouring Neolithic and Bronze Age sites show a succession of three distinct genetic ancestries over a ~6 ky time span. The earliest individuals show predominantly East Asian ancestry, closely related to the ancient individuals from DGC. In the early Bronze Age (BA), we observe a resurgence of AP ancestry (up to ~50% ancestry fraction), as well as influence of West Eurasian Steppe ANE ancestry represented by the early BA individuals from Afanasievo in the Altai region (~10%) This is consistent with previous reports of gene flow from an unknown ANE-related source into Lake Baikal hunter-gatherers.

    Our results suggest a southward expansion of AP as a possible source, which is also consistent with the replacement of Y chromosome lineages observed at Lake Baikal, from predominantly haplogroup N in the Neolithic to haplogroup Q in the BA. Finally, the most recent individual from Ust’Belaya, dated to ~600 years ago, falls along the Neosiberian cline, similar to the ~760 year-old ‘Young Yana’ individual from northeastern Siberia, demonstrating the widespread distribution of Neosiberian ancestry in the most recent epoch.

    finnish_ia_palaeosiberian
    Genetic structure of ancient northeast Siberians. PCA of ancient individuals projected onto a set of modern Eurasian and American individuals. Abbreviations in group labels: UP – Upper Palaeolithic; LP – Late Palaeolithic; M – Mesolithic; EN – Early Neolithic; MN – Middle Neolithic; LN – Late Neolithic; EBA – Early Bronze Age; LBA – Late Bronze Age; IA – Iron Age; PE – Paleoeskimo; MED – Medieval

    Finland Saami

    At the western edge of northern Eurasia, genetic and strontium isotope data from ancient individuals at the Levänluhta site documents the presence of Saami ancestry in Southern Finland in the Late Holocene 1.5 kya. This ancestry component is currently limited to the northern fringes of the region, mirroring the pattern observed for AP ancestry in northeastern Siberia. However, while the ancient Saami individuals harbour East Asian ancestry, we find that this is better modelled by DGC rather than AP, suggesting that AP influence was likely restricted to the eastern side of the Urals. Comparison of ancient Finns and Saami with their present-day counterparts reveals additional gene flow over the past 1.6 kya, with evidence for West Eurasian admixture into modern Saami. The ancient Finn from Levänluhta shows lower Siberian ancestry than modern Finns .

    EDIT (27 OCT 2018): By comparing the three, I see these are samples published already (at least two) in Lamnidis et al. (2018), but here with added (1) specific radiocarbon dates, (2) comparison with Neosiberian populations and (3) strontium isotope analyses.

    Finnish_IA (ca. 350 AD) is probably a Saami-speaking individual, just like the Saami_IA with newly reported radiocarbon dates from Levänluhta ca. 400-600 AD (since Fennic peoples were then likely around the Gulf of Finland).

    The conflicting strontium isotope data on marine dietary resources on certain samples from the supplementary material hint at possible external origin of the diet of some of the previously reported (and possibly one newly reported) Saami Iron Age individuals, from some 25-30 km. to the northwest through the river up to hundreds of km. to the southwest of Levänluhta (i.e. the whole coast of the Bothnian Sea). It is unclear why they would prefer an origin of the dietary source in southern Baltic regions instead of some km. to the west, though, unless that’s what they want to propose based on the sample’s admixture…

    The coast of the Bothnian Sea (=the northern part of the Baltic Sea, between Sweden and Finland) lay only 25-30 km to the northwest, and accessible to the Iron Age people of the Levänluhta region via the Kyrönjoki river. (…) For individual JA2065/DA236, the low 87Sr/86Sr value (0.71078) would imply an exceptionally heavy reliance on Baltic Sea resources. The δ13C and δ15N values of the individual are near comparable (especially considering within-Baltic latitudinal gradients in δ13C; Torniainen et al. 2017) to the δ13C and δ15N values of a Middle Neolithic population on the Baltic island of Gotland (Eriksson, 2004) interpreted to have subsisted primarily on seals.

    These new data on the samples give us some more information than what we already had, because the early date of Finnish_IA implies that there was few East Asian admixture (if any at all) in west Finland during the Roman Iron Age, which pushes still farther forward in time the expected appearance of Siberian ancestry among Saamic (first) and Fennic populations (later). It is unclear whether this East Asian ancestry found in Finnish_IA is actually related to DGC, or it is rather related to the ENA-like ancestry found already in Baltic hunter-gatherers (i.e. in some EHG samples from Karelia), for which Baikal_EN is a good proxy in Lazaridis et al. (2018).

    Since Bronze Age and Iron Age samples from Estonia show more Baltic_HG drift compared to Corded Ware samples, it is likely that this supposedly DGC-related ancestry (here considered part of the ‘Siberian ancestry’) is actually an EHG-related ENA component of north-east European hunter-gatherers, with whom Finno-Saamic peoples admixed during the expansion of the Corded Ware culture into Finland.

    The paper finds thus increased (probably the actual) Siberian ancestry in modern Finns compared to this Iron Age Saami individual. Coupled with the later Saami Iron Age samples, from between one to three centuries later – showing the start of Siberian ancestry influx – , we can begin to establish when the expansion of Siberian ancestry happened in central Finland, and thus quite likely when the Saami began to expand to the north and east and admix with Palaeo-Laplandic peoples.

    siberian-population-expansions
    Admixture modelling using qpAdm. Maps showing locations and ancestry proportions of ancient (left) and modern (right) groups.

    One sample of haplogroup N1a1a1a1a4a1-M1982, Yana_MED, is found in the Arctic region (north-eastern Yakutia) ca. 1100 AD. Since it is derived from N1a1a1a1a-L392, it might be a surprise for some to find it in a clearly non-Uralic speaking environment at the same time other subclades of this haplogroup were admixing in the west with well-established Finno-Saamic, Volga-Finnic, Ugric, and Samoyedic populations…

    On the growing doubts that these data – contradicting the CWC=IE theory – are creating among geneticists (from the supplementary materials):

    NOTE. This paper comes from the Copenhagen group, also signed by Kristiansen, one of today’s strongest supporters of this connection

    The Proto-Saami language evolved in southern Finland and Karelia in the Early Iron Age, an area now host to Finnish and the closely related Karelian, but with Saami toponyms showing that the latter two languages are intrusive here (Saarikivi 2004). Saami-speaking populations are thought to have retreated to Lapland during the Middle Iron Age (300–800 AD), where it diverged into the modern Saami dialects. Genetically, the northward retreat of the Saami language correlates with the documented decrease of Saami ancestry in Southern Finland between the Iron Age and the modern period (cf. Lamnidis et al. 2018).

    On the way to Lapland, the Saami replaced at least two linguistically obscure groups. This can be inferred from 1) an influx of non-Uralic loanwords into Proto-Saami in the Finnish Lakeland area, and 2) an influx of non-Uralic, non-Germanic words into Saami dialects in Lapland (Aikio 2012). Both of these borrowing events imply contact with non-Saami-speaking groups, e.g. non-Uralic-speaking hunter-gatherers that may have left a genetic and linguistic footprint on modern Saami populations.

    The linguistic prehistory of Finland thus does not allow for a straightforward interpretation of the genetic data. The detection of East Asian ancestry in the genetically Saami individual is indicative of a population movement from the east (cf. Lamnidis et al. 2018, Rootsi et al. 2007), one that given the affinities with the ~7.6 ky old individuals from the Devil’s Gate Cave may have been a western extension of the Neosiberian turnover. However, it remains unclear whether this gene flow should be associated with the arrival of Uralic speakers, thus providing further support for a Uralic homeland in Eastern Eurasia, or with an earlier immigration of pre-Uralic, so-called “Paleo-Lakelandic” groups.

    I think the genetic interpretation is already straightforward, though. We had a sneak peek at how this late admixture with non-Uralians (mainly Palaeo-Lakelandic and Palaeo-Laplandic peoples from Lovozero and related asbestos ware cultures) is going to unfold among expanding Saami-speaking populations thanks to Lamnidis et al. (2018):

    saamic-lovozero-pca
    PCA plot of 113 Modern Eurasian populations, with individuals from this study projected on the principal components. Uralic speakers are highlighted in light purple. Image modified from Lamnidis et al. (2018)

    Also, still no trace of R1a in far East Asia (reported as M17 ca. 5300 BC near Lake Baikal by Moussa et al. 2016), so I still have doubts about my previous assessment that R1a split into M17 (and thus also M417) in Siberia, with those expanding hunter-gatherer pottery.

    Related