Corded Ware—Uralic (IV): Hg R1a and N in Finno-Ugric and Samoyedic expansions


This is the fourth of four posts on the Corded Ware—Uralic identification.

Let me begin this final post on the Corded Ware—Uralic connection with an assertion that should be obvious to everyone involved in ethnolinguistic identification of prehistoric populations but, for one reason or another, is usually forgotten. In the words of David Reich, in Who We Are and How We Got Here (2018):

Human history is full of dead ends, and we should not expect the people who lived in any one place in the past to be the direct ancestors of those who live there today.

Haplogroup N

Another recurrent argument – apart from “Siberian ancestry” – for the location of the Uralic homeland is “haplogroup N”. This is as serious as saying “haplogroup R1” to refer to Indo-European migrations, but let’s explore this possibility anyway:

Ancient haplogroups

We now have a better idea of what many ancient migrations (previously hypothesized to be associated with westward Uralic migrations) look like in genetic terms. From Damgaard et al. (Science 2018):

These serial changes in the Baikal populations are reflected in Y-chromosome lineages (Fig. 5A; figs. S24 to S27, and tables S13 and S14). MA1 carries the R haplogroup, whereas the majority of Baikal_EN males belong to N lineages, which were widely distributed across Northern Eurasia (29), and the Baikal_LNBA males all carry Q haplogroups, as do most of the Okunevo_EMBA as well as some present-day Central Asians and Siberians.

The only N1c1 sample comes from Ust’Ida Late Neolithic, 180 km to the north of Lake Baikal, which – together with the Bronze Age sample from the Kola peninsula, and the medieval sample from Ust’Ida – gives a good idea of the overall expansion of N subclades and Siberian ancestry among the Circum-Arctic peoples of Eurasia, speakers of Palaeo-Siberian languages.

Geographical location of ancient samples belonging to major clade N of the Y-chromosome.

Modern haplogroups

What we should expect from Uralic peoples expanding with haplogroup N – seeing how Yamna expands with R1b-L23, and Corded Ware expands with R1a-Z645 – is to find a common subclade spreading with Uralic populations. Let’s see if it works like that for any N-X subclade, in data from Ilumäe et al. (2016):

Geographic-Distribution Map of hg N3 / N1c / N1a.

Within the Eurasian circum-Arctic spread zone, N3 and N2a reveal a well-structured spread pattern where individual sub-clades show very different distributions:

N1a1-M46 (or N-TAT), formed ca. 13900 BC, TMRCA 9800 BC

   N1a1a2-B187, formed ca. 9800 BC, TMRCA 1050 AD:

The sub-clade N3b-B187 is specific to southern Siberia and Mongolia, whereas N3a-L708 is spread widely in other regions of northern Eurasia.

     N1a1a1a-L708, formed ca. 6800 BC, TMRCA 5400 BC.

       N1a1a1a2-B211/Y9022, formed ca. 5400 BC, TMRCA 1900 BC:

The deepest clade within N3a is N3a1-B211, mostly present in the Volga-Uralic region and western Siberian Khanty and Mansi populations.

         N1a1a1a1a-L392/L1026, formed ca. 4400 BC, TMRCA 2800 BC:

The neighbor clade, N3a3’6-CTS6967, spreads from eastern Siberia to the eastern part of Fennoscandia and the Baltic States

Frequency-Distribution Maps of Individual Subclade N3a3 / N1a1a1a1a1a-CTS2929/VL29, probably initially with Akozino warrior-traders.

           N1a1a1a1a1a-CTS2929/VL29, formed ca. 2100 BC, TMRCA 1600 BC:

In Europe, the clade N3a3-VL29 encompasses over a third of the present-day male Estonians, Latvians, and Lithuanians but is also present among Saami, Karelians, and Finns (Table S2 and Figure 3). Among the Slavic-speaking Belarusians, Ukrainians, and Russians, about three-fourths of their hg N3 Y chromosomes belong to hg N3a3.

In the post on Finno-Permic expansions, I depicted what seems to me the most likely way of infiltration of N1c-L392 lineages with Akozino warrior-traders into the western Finno-Ugric populations, with an origin around the Barents sea.

This includes the potential spread of (a minority of) N1c-B211 subclades due to contacts with Ananino on both sides of the Urals, through a northern route of forest and forest-steppe regions (equivalent to the distribution of Cherkaskul compared to Andronovo), given the spread of certain subclades in Ugric populations.

NOTE. An alternative possibility is the association of certain B211 subclades with a southern route of expansion with Pre-Scythian and Scythian populations, under whose influence the Ananino culture emerged – which would imply a very quick infiltration of certain groups of haplogroup N everywhere among Finno-Ugrics on both sides of the Urals – , and also the expansion of some subclades with Turkic-speaking peoples, who apparently expanded with alliances of different peoples. Both (Scythian and Turkic) populations expanded from East Asia, where haplogroup N (including N1c) had been present since the Neolithic. I find this a worse model of expansion for upper clades, but – given the YFull estimates and the presence of this haplogroup among Turkic peoples – it is a possibility for many subclades.

           N1a1a1a1a2-Z1936, formed ca. 2800 BC, TMRCA 2400 BC:

The only notable exception from the pattern are Russians from northern regions of European Russia, where, in turn, about two-thirds of the hg N3 Y chromosomes belong to the hg N3a4-Z1936—the second west Eurasian clade. Thus, according to the frequency distribution of this clade, these Northern Russians fit better among other non-Slavic populations from northeastern Europe. N3a4 tends to increase in frequency toward the northeastern European regions but is also somewhat unexpectedly a dominant hg N3 lineage among most Turkic-speaking Volga Tatars and South-Ural Bashkirs.

Frequency-Distribution Maps of Individual Subclade N3a4 / N1a1a1a1a2-Z1936, probably with the Samic (first) and Fennic (later) expansions into Paleo-Lakelandic and Palaeo-Laplandic territories.

The expansion of N1a-Z1936 in Fennoscandia is most likely associated with the expansion of Saami into asbestos ware-related territory (like the Lovozero culture) during the Late Iron Age – and mixture with its population – , and with the later Fennic expansion to the east and north, replacing their language.

           N1a1a1a1a4-M2019 (previously N3a2), formed ca. 4400 BC, TMRCA 1700 BC:

Sub-hg N3a2-M2118 is one of the two main bifurcating branches in the nested cladistic structure of N3a2’6-M2110. It is predominantly found in populations inhabiting present-day Yakutia (Republic of Sakha) in central Siberia and at lower frequencies in the Khanty and Mansi populations, which exhibit a distinct Y-STR pattern (Table S7) potentially intrinsic to an additional clade inside the sub-hg N3a2

The second widespread sub-clade of hg N is N2a. (…):

   N1a2b-P43 (B523/FGC10846/Y3184), formed ca. 6800 BC, TMRCA ca. 2700 BC:

The absolute majority of N2a individuals belong to the second sub-clade, N2a1-B523, which diversified about 4.7 kya (95% CI = 4.0–5.5 kya). Its distribution covers the western and southern parts of Siberia, the Taimyr Peninsula, and the Volga-Uralic region with frequencies ranging from 10% to 30% and does not extend to eastern Siberia (…)

Geographic-Distribution Map of hg N2a1 / N1a2b-P43

The “European” branch suggested earlier from Y-STR patterns turned out to consist of two clades

     N1a2b2a-Y3185/FGC10847, formed ca. 2200 BC, TMRCA 800 BC:

N2a1-L1419, spread mainly in the northern part of that region.

     N1a2b2b1-B528/Y24382, formed ca. 900 BC, TMRCA ca. 900 BC:

N2a1-B528, spread in the southern Volga-Uralic region.
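
One way to read the formation and TMRCA dates listed above is to compute the gap between them for each subclade: a long gap suggests a lineage that survived for millennia as a thin line before diversifying, whereas a short gap suggests a quick expansion. The following Python sketch does this using only the dates quoted in this post. The dictionary layout, the function names, and the reference year of 2000 AD used to convert "kya" values are assumptions made for illustration; they are not part of Ilumäe et al. (2016) or of the YFull estimates.

```python
# Dates as quoted in the subclade list above; negative = BC, positive = AD.
# Each entry maps a subclade label to (formed, TMRCA). All values are "ca." dates.
SUBCLADES = {
    "N1a1-M46": (-13900, -9800),
    "N1a1a2-B187": (-9800, 1050),
    "N1a1a1a-L708": (-6800, -5400),
    "N1a1a1a2-B211/Y9022": (-5400, -1900),
    "N1a1a1a1a-L392/L1026": (-4400, -2800),
    "N1a1a1a1a1a-CTS2929/VL29": (-2100, -1600),
    "N1a1a1a1a2-Z1936": (-2800, -2400),
    "N1a1a1a1a4-M2019": (-4400, -1700),
    "N1a2b-P43": (-6800, -2700),
    "N1a2b2a-Y3185/FGC10847": (-2200, -800),
    "N1a2b2b1-B528/Y24382": (-900, -900),
}


def lag_years(formed: int, tmrca: int) -> int:
    """Years between formation and TMRCA.

    The missing year zero is ignored, since all dates are approximate.
    """
    return tmrca - formed


def kya_to_year(kya: float, reference_year: int = 2000) -> int:
    """Convert 'thousand years ago' to a calendar year.

    reference_year is an assumption; published estimates rarely state it.
    """
    return round(reference_year - kya * 1000)


def label(year: int) -> str:
    """Format a signed year as 'N BC' or 'N AD'."""
    return f"{-year} BC" if year < 0 else f"{year} AD"


def ranked_by_lag(subclades: dict) -> list:
    """Subclade labels sorted from the longest to the shortest lag."""
    return sorted(subclades, key=lambda name: lag_years(*subclades[name]), reverse=True)


if __name__ == "__main__":
    for name in ranked_by_lag(SUBCLADES):
        formed, tmrca = SUBCLADES[name]
        print(f"{name}: formed {label(formed)}, TMRCA {label(tmrca)}, "
              f"lag {lag_years(formed, tmrca)} years")
    # The 4.7 kya diversification estimate quoted for N2a1-B523:
    print(label(kya_to_year(4.7)))
```

With a reference year of 2000 AD, the 4.7 kya estimate quoted for N2a1-B523 corresponds to ca. 2700 BC, in line with the TMRCA listed for N1a2b-P43.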

Haplogroup R1a

We also have a good idea of the distribution of haplogroup R1a-Z645 in ancient samples. Its subclades were associated with the Corded Ware expansion, and some of them fit quite well the early expansion of Finno-Permic, Ugric, and Samoyedic peoples to the east.

Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the potential Finno-Ugric-associated distribution of Z282 (especially R1a-M558, a Z280 subclade), and the expansion of R1a-Z2123 subclades with Central Asian forest-steppe groups.

This is what the modern distribution of R1a among Uralians looks like, from the latest report in Tambets et al. (2018):

  • Among Fennic populations, Estonians and Karelians (ca. 1.1 million) have not suffered the severe bottleneck seen in Finns (ca. 6-7 million), and thus show a greater proportion of R1a-Z280 than N1c subclades, which points to the original situation of Fennic peoples before their expansion. Trusting Finnish Y-DNA to draw conclusions about Uralic populations is as useful as relying on Basque Y-DNA for the language spread by R1b-P312.
  • Among Volga-Finnic populations, Mordovians (the closest to the original Uralic cluster, see above) show a majority of R1a lineages (27%).
  • Hungarians (ca. 13-15 million) represent the majority of Ugric (and Finno-Ugric) peoples. They are mainly R1a-Z280, also R1a-Z2123, have little N1c, and lack Siberian ancestry, and thus represent the most likely original situation of Ugric peoples in the 4th century AD (read more on Avars and Hungarians).
  • Among Samoyedic peoples, the Selkup, the southernmost ones and latest to expand – that is, those not heavily admixed with Siberian populations – , also have a majority of R1a-Z2123 lineages (see also here for the original Samoyedic haplogroups to the south).

To understand the relevance of Hungarians for Ugric peoples, as well as Estonians, Karelians, and Mordovians (and northern Russians, Finno-Ugric peoples recently Russified) for Finno-Permic peoples, as opposed to the Circum-Arctic and East Siberian populations, one has to put demographics in perspective. Even a modern map can show the relevance of certain territories in the past:

Population density (people per km²) map of the world in 1994. From Wikipedia.

Summary of ancestry + haplogroups

Fennic and Samic populations seem to be clearly influenced by Palaeo-Laplandic peoples, whereas Volga-Finnic and especially Permic populations may have received gene flow from both, but essentially Palaeo-Siberian influence from the north and east.

The fact that modern Mansis and Khantys offer the highest variation in N1a subclades, and some of the highest “Siberian ancestry” among non-Nganasans, should have raised a red flag long ago. The fact that Hungarians – supposedly stemming from a source population similar to Mansis – do not offer the same amount of N subclades or Siberian ancestry (not even close), and offer instead more R1a, in common with Estonians (among Finno-Samic peoples) and Mordvins (among Volga-Finnic peoples) should have raised a still bigger red flag. The fact that Nganasans – the model for Siberian ancestry – show completely different N1a2b-P43 lineages should have been a huge genetic red line (on top of the anthropological one) to regard them as the Uralian-type population.

We now know that ethnolinguistic groups have usually expanded with massive (usually male-biased) migrations, and that neighbouring locals often ‘resurge’ later without changing the language. That is seen in Europe after the spread of Bell Beakers, with the increase of previous ancestry and lineages in Scandinavia during the formation of the Nordic ethnolinguistic community; in Central-West Europe, with the resurgence of Neolithic ancestry (and lineages) during the Bronze Age over steppe ancestry; and in Central-East Europe (with Unetice or East European Bronze Age groups like Mierzanowice, Trzciniec, or Lusatian) showing an increase in steppe ancestry (and a resurgence of R1a subclades); none of them represented a radical ethnolinguistic change.

Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

It is not hard to model the stepped arrival, infiltration, and/or resurgence of N subclades and “Siberian ancestries”, as well as their gradual expansion in certain regions, associated with certain migrations first – such as the expansions to the Circum-Arctic region, and later the Scythian- and Turkic-related movements – , as well as limited regional developments, like the known bottleneck in Finns, or the clear late expansion of Ugric and Samoyedic languages to the north among nomadic Palaeo-Siberians due to traditions of exogamy and multilingualism. This fits quite well with the different arrival of N (N1c and xN1c) lineages to the different Uralic-speaking groups, and with the stepped appearance of “Siberian ancestry” in the different regions.

The alternative

It is evident that a lot of people were too attached to the idea of Palaeolithic R1b lineages ‘native’ to western Europe speaking Basque languages; of R1a lineages speaking Indo-European and spreading with Yamna; and N lineages ‘native’ to north-eastern Europe and speaking Uralic, and this is causing widespread weeping and gnashing of teeth (instead of the joy of discovering where one’s true patrilineal ancestors come from, and what language they spoke in each given period, which is the supposed objective of genetic genealogy…)

Since an Indo-Germanic branch (as revived now by some in the Copenhagen group to fit Kristiansen’s theory of the 1980s with recent genetic data) does not make any sense in linguistics, the finding of R1a in Yamna would not have led where some think it would have, because North-West Indo-European would still be the main Late PIE branch in Europe. Don’t take my word for it; take James P. Mallory’s (2013).

The levels of Indo-European reconstruction, from Mallory & Adams (2006).

If an (unlikely) Indo-Slavonic group were posited, though, such a group would still be bound (with Indo-Iranian) to the steppes with East Yamna/Poltavka (admixing with Abashevo migrants, but retaining its language), developing Sintashta/Potapovka → Srubna/Andronovo, and R1a lineages would have equally undergone the known bottlenecks of the steppes where they replaced R1b-Z2103 – which this eastern group shares with Balkan languages, a haplogroup that links therefore together the Graeco-Aryan group.

As far as I know – and there might be many other similar pet theories out there – there have been proposals of “modern Balto-Slavic-like” populations (in an obvious circular reasoning based on modern populations) in some Scythian clusters of the Iron Age.

NOTE. I will not enter into “Balto-Slavic-like R1a” of the Late Bronze Age or earlier because no one can seriously believe at this point of development of Population Genetics that autosomal similarity predating 1,500+ years the appearance of Slavs equates to their (ethnolinguistic) ancestral population, without a clear intermediate cultural and genetic trail – something we lack today in the Slavic case even for the late Roman period…

The Finnic and Saamic separation looks shallower than it actually is. Invisible convergence can be ‘triangulated’ with the help of Germanic layers of mutual loanwords (Häkkinen 2012).

We also know of R1a-Z280 lineages in Srubna, probably expanding to the west. With that in mind, and knowing that Palaeo-Germanic was in close contact with Finno-Samic while both were already separated but still in contact, and that Palaeo-Germanic was also in contact with, and closely related to, a ‘Temematic’ distinct from Balto-Slavic (and also that early Proto-Baltic and Proto-Slavic from the Roman Iron Age and later were in contact with western Uralic), this would be the linguistic map of the Iron Age if R1a is considered to have expanded Indo-European through some kind of “patron-client” relationship with west Yamna:

Eastern European language map during the Late Bronze Age / Iron Age, if R1a spread Indo-European languages and Eastern Yamna spoke Indo-Slavonic. Palaeo-Germanic (i.e. Pre- to Proto-Germanic) needs to be in contact with both the Samic Lovozero population and the Fennic west Circum-Arctic one. Italic and Celtic in contact with Pre-Germanic. Germanic in contact with Temematic. Balto-Slavic in contact with Iranian, and near Fennic to allow for later loanwords. For Germanic and Temematic, see Kortlandt (2018).

You might think I have some personal or political reason against this kind of proposal. I don’t. We have been proposing Indo-European to be the language of the European Union for more than 10 years, so to support R1b-Italo-Celtic in the whole Western Europe, R1a-Germanic in Central and Eastern Europe, and R1a-Indo-Slavonic in the steppes (as the Danish group seems to be doing) has nothing inherently bad (or good) for me. If anything, it gives more reason to support the revival of North-West Indo-European in Europe.

My problem with this proposal is that it is obviously beholden to the notion of the uninterrupted cultural, historic and ethnic continuity in certain territories. This bias is common in historiography (von Falkenhausen 1993), but it extends even more easily into the lesser known prehistory of any territory, and now more than ever some people feel the need to corrupt (pre)history based on their own haplogroups (or the majority haplogroups of their modern countries). However, more than on philosophical grounds, my rejection is based on facts: this picture is not what the combination of linguistic, archaeological, and genetic data shows. Period.

Nevertheless, if Yamna + Corded Ware represented the “big and early expansion” of Germanic and Italo-Celtic peoples worthy of the Nazis’ Lebensraum and the Fascists’ spazio vitale dreams; Uralians were Siberian hunter-gatherers that controlled the whole of eastern and northern Russia, and miraculously managed to push (ethnolinguistically) Neolithic agropastoralists to the west during and after the Iron Age, with gradual (and often minimal) genetic impact; and Balto-Slavic peoples were represented by horse riders from Pokrovka/Srubna, hiding then somewhere around the forest-steppe until after the Scythian expansion, and then spreading their language (without much genetic impact) during the early Middle Ages… so be it.


Corded Ware—Uralic (III): “Siberian ancestry” and Ugric-Samoyedic expansions


This is the third of four posts on the Corded Ware—Uralic identification.

An Eastern Uralic group?

Even though proposals of an Eastern Uralic (or Ugro-Samoyedic) group are in the minority – and those who support it tend to search for an origin of Uralic in Central Asia – , there is nothing wrong in supporting this from the point of view of a western homeland, because the eastward migration of both Proto-Ugric and Pre-Samoyedic peoples may have been coupled with each other at an early stage. It’s like Indo-Slavonic: it just doesn’t fit the linguistic data as well as the alternative, i.e. the expansion of Samoyedic first, different from a Finno-Ugric trunk. But, in case you are wondering about this possibility, here is Häkkinen’s (2012) phonological argument:


The case of Samoyedic is quite similar to that of Hungarian, although the earliest Palaeo-Siberian contact languages have been lost. There were contacts at least with Tocharian (Kallio 2004), Yukaghir (Rédei 1999) and Turkic (Janhunen 1998). Samoyedic also:

a) has moved far from the related languages and has been exposed to strong foreign influence

b) shares a small number of common words with other branches (from Sammallahti 1988: only 123 ‘Uralic’ words, versus 390 ‘Uralic’ + ‘Finno-Ugric’ words found in other branches than Samoyedic = 31,5 %)

c) derives phonologically from the East Uralic dialect.

The phonological level is taxonomically more reliable, since it lacks the distortion caused by invisible convergence and false divergence at the lexical level. Thus we can conclude that the traditional taxonomic model, according to which Samoyedic was the first branch to split off from the Proto-Uralic unity, is just as incorrect as the view that Hungarian was the first branch to split off.
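
The percentage in point b) can be checked directly. This is a minimal Python sketch that uses only the two counts quoted from Sammallahti (1988); the variable names are illustrative and not taken from either source.

```python
# Counts as quoted in point b) above (after Sammallahti 1988).
uralic_words = 123                   # 'Uralic' words
uralic_plus_finno_ugric_words = 390  # 'Uralic' + 'Finno-Ugric' words in other branches

# Share of the lexicon of the other branches that is also attested as 'Uralic'.
share = uralic_words / uralic_plus_finno_ugric_words
print(f"{share:.1%}")
```

The result rounds to the 31.5% given in the quotation.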


Late Uralic can be traced back to metallurgical cultures thanks to terms like PU *wäśka ‘copper/bronze’ (borrowed from Proto-Samoyedic *wesä into Tocharian); PU *äsa and *olna/*olni, ‘lead’ or ‘tin’, found in *äsa-wäśka ‘tin-bronze’; and e.g. *weŋći ‘knife’, borrowed into Indo-Iranian (through the stage of vocalization of nasals), appearing later as Proto-Indo-Aryan *wāćī ‘knife, awl, axe’.

It is known that the southern regions of the Abashevo culture developed Proto-Indo-Iranian-speaking Sintashta-Petrovka and Pokrovka (Early Srubna). To the north, however, Abashevo kept its Uralic nature, with continuous contacts allowing for the spread of lexicon – mainly into Finno-Ugric – , and phonetic influence – mainly Uralisms into Proto-Indo-Iranian phonology (read more here).

The northern part of Abashevo (just like the south) was mainly a metallurgical society, with Abashevo metal prospectors found also side by side with Sintashta pioneers in the Zeravshan Valley, near BMAC, in search of metal ores. About the Seima-Turbino phenomenon, from Parpola (2013):

From the Urals to the east, the chain of cultures associated with this network consisted principally of the following: the Abashevo culture (extending from the Upper Don to the Mid- and South Trans-Urals, including the important cemeteries of Sejma and Turbino), the Sintashta culture (in the southeast Urals), the Petrovka culture (in the Tobol-Ishim steppe), the Taskovo-Loginovo cultures (on the Mid- and Lower Tobol and the Mid-Irtysh), the Samus’ culture (on the Upper Ob, with the important cemetery of Rostovka), the Krotovo culture (from the forest steppe of the Mid-Irtysh to the Baraba steppe on the Upper Ob, with the important cemetery of Sopka 2), the Elunino culture (on the Upper Ob just west of the Altai mountains) and the Okunevo culture (on the Mid-Yenissei, in the Minusinsk plain, Khakassia and northern Tuva). The Okunevo culture belongs wholly to the Early Bronze Age (c. 2250–1900 BCE), but most of the other cultures apparently to its latter part, being currently dated to the pre-Andronovo horizon of c. 2100–1800 BCE (cf. Parzinger 2006: 244–312 and 336; Koryakova & Epimakhov 2007: 104–105).

Schematic map of the Middle Bronze Age cultures (steppe and forest-steppe

The majority of the Sejma-Turbino objects are of the better quality tin-bronze, and while tin is absent in the Urals, the Altai and Sayan mountains are an important source of both copper and tin. Tin is also available in southern Central Asia. Chernykh & Kuz’minykh have accordingly suggested an eastern origin for the Sejma-Turbino network, backing this hypothesis also by the depiction on the Sejma-Turbino knives of mountain sheep and horses characteristic of that area. However, Christian Carpelan has emphasized that the local Afanas’evo and Okunevo metallurgy of the Sayan-Altai area was initially rather primitive, and could not possibly have achieved the advanced and difficult technology of casting socketed spearheads as one piece around a blank. Carpelan points out that the first spearheads of this type appear in the Middle Bronze Age Caucasia c. 2000 BCE, diffusing early on to the Mid-Volga-Kama-southern Urals area, where “it was the experienced Abashevo craftsmen who were able to take up the new techniques and develop and distribute new types of spearheads” (Carpelan & Parpola 2001: 106, cf. 99–106, 110). The animal argument is countered by reference to a dagger from Sejma on the Oka river depicting an elk’s head, with earlier north European prototypes (Carpelan & Parpola 2001: 106–109). Also the metal analysis speaks for the Abashevo origin of the Sejma-Turbino network. Out of 353 artefacts analyzed, 47% were of tin-bronze, 36% of arsenical bronze, and 8.5% of pure copper. Both the arsenical bronze and pure copper are very clearly associated with the Abashevo metallurgy.

Find spots of artefacts distributed by the Sejma-Turbino intercultural trader network, and the areas of the most important participating cultures: Abashevo, Sintashta, Petrovka. Based on Chernykh 2007: 77.

The Abashevo metal production was based on the Volga-Kama-Belaya area sandstone ores of pure copper and on the more easterly Urals deposits of arsenical copper (Figure 9). The Abashevo people, expanding from the Don and Mid-Volga to the Urals, first reached the westerly sandstone deposits of pure copper in the Volga and Kama basins, and started developing their metallurgy in this area, before moving on to the eastern side of the Urals to produce harder weapons and tools of arsenical copper. Eventually they moved even further south, to the area richest in copper in the whole Urals region, founding there the very strong and innovative Sintashta culture.
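
The proportions from the metal analysis quoted above can be tallied to see how much of the assemblage falls on the Abashevo side. This is a minimal Python sketch using only the figures in the passage. The approximate artefact counts are back-calculated from the rounded percentages and are therefore assumptions, and the passage does not say what the remaining share consists of.

```python
# Figures as quoted from Parpola (2013): 353 artefacts analyzed.
TOTAL_ARTEFACTS = 353
shares = {
    "tin-bronze": 47.0,
    "arsenical bronze": 36.0,
    "pure copper": 8.5,
}

# Alloys the passage associates with Abashevo metallurgy.
abashevo_share = shares["arsenical bronze"] + shares["pure copper"]

# Share of artefacts not covered by the three quoted categories.
unspecified_share = 100.0 - sum(shares.values())

# Approximate counts, reconstructed from rounded percentages (assumption).
approx_counts = {
    metal: round(TOTAL_ARTEFACTS * pct / 100) for metal, pct in shares.items()
}

if __name__ == "__main__":
    print(f"Abashevo-associated share: {abashevo_share}%")
    print(f"Unspecified share: {unspecified_share}%")
    for metal, count in approx_counts.items():
        print(f"{metal}: about {count} artefacts")
```

Arsenical bronze and pure copper together account for 44.5% of the analyzed artefacts, close to the 47% of tin-bronze.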

Regarding the most likely expansion of Eastern Uralic peoples:

Nataliya L’vovna Chlenova (1929–2009; cf. Korenyako & Kuz’minykh 2011) published in 1981 a detailed study of the Cherkaskul’ pottery. In her carefully prepared maps of 1981 and 1984 (Figure 10), she plotted Cherkaskul’ monuments not only in Bashkiria and the Trans-Urals, but also in thick concentrations on the Upper Irtysh, Upper Ob and Upper Yenissei, close to the Altai and Sayan mountains, precisely where the best experts suppose the homeland of Proto-Samoyed to be.

Distribution of Srubnaya (Timber Grave, early and late), Andronovo (Alakul’ and Fëdorovo variants) and Cherkaskul’ monuments. After Parpola 1994: 146, fig. 8.15, based on the work of N. L. Chlenova (1984: map facing page 100).


The Cherkaskul’ culture was transformed into the genetically related Mezhovka culture (c. 1500–1000 BCE), which occupied approximately the same area from the Mid-Kama and Belaya rivers to the Tobol river in western Siberia (cf. Parzinger 2006: 444–448; Koryakova & Epimakhov 2007: 170–175). The Mezhovka culture was in close contact with the neighbouring and probably Proto-Iranian speaking Alekseevka alias Sargary culture (c. 1500–900 BCE) of northern Kazakhstan (Figure 4 no. 8) that had a Fëdorovo and Cherkaskul’ substratum and a roller pottery superstratum (cf. Parzinger 2006: 443–448; Koryakova & Epimakhov 2007: 161–170). Both the Cherkaskul’ and the Mezhovka cultures are thought to have been Proto-Ugric linguistically, on the basis of the agreement of their area with that of Mansi and Khanty speakers, who moreover in their Fëdorovo-like ornamentation have preserved evidence of continuity in material culture (cf. Chlenova 1984; Koryakova & Epimakhov 2007: 159, 175).

Cultures of the Final Bronze Age of the Urals and western Siberia (steppe and forest-steppe zone).

The Mezhovka culture was succeeded by the genetically related Gamayun culture (c. 1000–700 BCE) (cf. Parzinger 2006: 446; 542–545).

From the Gamayun culture descend Trans-Urals cultures in close contact with Finno-Permic populations of the Cis-Ural region:

  • [Proto-Mansi] Itkul’ culture (c. 700–200 BCE) distributed along the eastern slope of the Ural Mountains (cf. Parzinger 2006: 552–556). Known from its walled forts, it constituted the principal Trans-Uralian centre of metallurgy in the Iron Age, and was in contact with both the Anan’ino and Akhmylovo cultures (the metallurgical centres of the Mid-Volga and Kama-Belaya region) and the neighbouring Gorokhovo culture.
    • [Proto-Hungarian] via the Vorob’evo Group (c. 700–550 BCE) (cf. Parzinger 2006: 546–549), to the Gorokhovo culture (c. 550–400 BCE) of the Trans-Uralian forest steppe (cf. Parzinger 2006: 549–552). For various reasons the local Gorokhovo people started mobile pastoral herding and became part of the multicomponent pastoralist Sargat culture (c. 500 BCE to 300 CE), which in a broader sense comprised all cultural groups between the Tobol and Irtysh rivers, succeeding here the Sargary culture. The Sargat intercommunity was dominated by steppe nomads belonging to the Iranian-speaking Saka confederation, who in the summer migrated northwards to the forest steppe.
  • [Proto-Khanty] Late Bronze Age and Early Iron Age cultures related to the Gamayunskoe and Itkul’ cultures that extended up to the Ob: the Nosilovo, Baitovo, Late Irmen’, and Krasnoozero cultures (c. 900–500 BCE). Some were in contact with the Akhmylovo on the Mid-Volga.

Cultural groups of the Iron Age in the forest-steppe zone of western Siberia.


Parpola (2012) connects the expansion of Samoyedic with the Cherkaskul variant of Andronovo. As we know, Andronovo was genetically diverse, which speaks in favour of different groups developing similar material cultures in Central Asia.

Juha Janhunen, author of the etymological dictionary of the Samoyed languages (1977), places the homeland of Proto-Samoyedic in the Minusinsk basin on the Upper Yenissei (cf. Janhunen 2009: 72). Mainly on the basis of Bulghar Turkic loanwords, Janhunen (2007: 224; 2009: 63) dates Proto-Samoyedic to the last centuries BCE. Janhunen thinks that the language of the Tagar culture (c. 800–100 BCE) ought to have been Proto-Samoyedic (cf. Janhunen 1983: 117–118; 2009: 72; Parzinger 2001: 80 and 2006: 619–631 dates the Tagar culture c. 1000–200 BCE; Svyatko et al. 2009: 256, based on human bone samples, c. 900 BCE to 50 CE). The Tagar culture largely continues the traditions of the Karasuk culture (c. 1400–900 BCE), (…)

Map showing the location of Chicha-1.

For the most recent expansions of Samoyedic languages to the north, into Palaeo-Siberian populations, read more about the traditional multilingualism of Siberian populations.


Siberian ancestry

The use of a map of “Siberian ancestry” peaking in the arctic to show a supposedly late Uralic population movement (starting in the Iron Age!) seems to be the latest trend in population genomics:

Frequency map of the so-called ‘Siberian’ component. From Tambets et al. (2018) (see below for ADMIXTURE in specific populations).

I guess that would make this map of Neolithic farmer ancestry represent an expansion of Indo-European from the south, because Anatolia, Greece, Italy, southern France, and Iberia – where this ancestry peaks in modern populations – are among the oldest territories where Indo-European languages were recorded:

Modern genome-wide data shows that the primary gradient of farmer ancestry in Europe does not flow southeast-to-northwest but instead in an almost perpendicular direction, a result of a major migration of pastoralists from the east that displaced much of the ancestry of the first farmers.

Probably not the right interpretation of this kind of simplistic data about modern populations, though…

The most striking thing about the “Siberian ancestry” white whale is that nobody really knows what it is, just as we did not know what “Yamnaya ancestry” was until the most recent data began to make the picture clearer. Its nature is changing with each new paper, and it can be summed up by “some ancestry we want to find that is common to Uralic-speaking peoples, and should not be CWC-related”. Tambets et al. (2018) explain quite well how they “found it”:

Overall, and specifically at lower values of K, the genetic makeup of Uralic speakers resembles that of their geographic neighbours. The Saami and (a subset of) the Mansi serve as exceptions to that pattern being more similar to geographically more distant populations (Fig. 3a, Additional file 3: S3). However, starting from K = 9, ADMIXTURE identifies a genetic component (k9, magenta in Fig. 3a, Additional file 3: S3), which is predominantly, although not exclusively, found in Uralic speakers. This component is also well visible on K = 10, which has the best cross-validation index among all tests (Additional file 3: S3B). The spatial distribution of this component (Fig. 3b) shows a frequency peak among Ob-Ugric and Samoyed speakers as well as among neighbouring Kets (Fig. 3a). The proportion of k9 decreases rapidly from West Siberia towards east, south and west, constituting on average 40% of the genetic ancestry of FU speakers in Volga-Ural region (VUR) and 20% in their Turkic-speaking neighbours (Bashkirs, Tatars, Chuvashes; Fig. 3a).

Population structure of Uralic-speaking populations inferred from ADMIXTURE analysis on autosomal SNPs in Eurasian context. Individual ancestry estimates for populations of interest for selected number of assumed ancestral populations (K3, K6, K9, K11). Ancestry components discussed in a main text (k2, k3, k5, k6, k9, k11) are indicated and have the same colours throughout. The names of the Uralic-speaking populations are indicated with blue (Finno-Ugric) or orange (Samoyedic). Image from Tambets et al. (2018).

However, this ‘something’ that some people occasionally find in some Uralic populations is also common to other modern and ancient groups, and not so common in some other Uralic peoples. Simply put:

Image modified from Lamnidis et al. (2018). Red line representing maximum “Siberian admixture” in Eastern European hunter-gatherers. In blue, Uralic-speaking groups. “Plot of ADMIXTURE (K=3) results containing West Eurasian populations and the Nganasan. Ancient individuals from this study are represented by thicker bars.”

I already said this when discussing the recent publication of Siberian samples, where a renamed and radiocarbon-dated Finnish_IA clearly shows that Late Iron Age Saami (ca. 400 AD) had little “Siberian ancestry”, if any at all, representing the most likely Fennic (and Samic) ancestral components before their expansion into central and northern Finland, where they admixed with circum-polar peoples of asbestos ware cultures.

I will say that again and again, any time they report the so-called “Siberian ancestry” in Uralic samples, no matter how it is defined each time: it does not seem to be that special something people are looking for, but rather (at least in great part) a quite old ancestral component forming an evident cline with EHG, whose best proximate source is Baikal_EN (and/or Devil’s Gate) at the moment, and thus also East European hunter-gatherers for Western Uralic peoples:

Image modified from Lazaridis et al. (2018). In red: samples with Baikal_EN ancestry in speculative estimates. In pink: samples with Baikal_EN ancestry in conservative estimates (probably marking a recent arrival of Baikal_EN ancestry, see here). Modeling present-day and ancient West-Eurasians. Mixture proportions computed with qpAdm (Supplementary Information section 4). The proportion of ‘Mbuti’ ancestry represents the total of ‘Deep’ ancestry from lineages that split prior to the split of Ust’Ishim, Tianyuan, and West Eurasians and can include both ‘Basal Eurasian’ and other (e.g., Sub-Saharan African) ancestry. (Left) ‘Conservative’ estimates. Each population cannot be modeled with fewer admixture events than shown. (Right) ‘Speculative’ estimates. The highest number of sources (≤5) with admixture estimates within [0,1] are shown for each population. Some of the admixture proportions are not significantly different from 0 (Supplementary Information section 4).

So either Samara_HG, Karelia_HG, and many other groups from eastern Europe all spoke Uralic according to this ADMIXTURE graphic (and the formation of steppe ancestry in the Volga-Ural region brought the Proto-Indo-European language to the steppes through the CHG/ANE expansion), or a great part of this “Siberian ancestry” found in modern Uralic-speaking populations is not what some people would like to think it is…

Modern populations

PCA clines can be looked for to represent expansions of ancient populations. Most recently, Flegontov et al. (2018) are attempting to do this with Asian populations:

For some Turkic groups in the Urals and the Altai regions and in the Volga basin, a different admixture model fits the data: the same West Eurasian source + Uralic- or Yeniseian-speaking Siberians. Thus, we have revealed an admixture cline between Scythians and the Iranian farmer genetic cluster, and two further clines connecting the former cline to distinct ancestry sources in Siberia. Interestingly, few Wusun-period individuals harbor substantial Uralic/Yeniseian-related Siberian ancestry, in contrast to preceding Scythians and later Turkic groups characterized by the Tungusic/Mongolic-related ancestry. It remains to be elucidated whether this genetic influx reflects contacts with the Xiongnu confederacy. We are currently assembling a collection of samples across the Eurasian steppe for a detailed genetic investigation of the Hunnic confederacies.

Three distinct East/West Eurasian clines across the continent with some interesting linguistic correlates, as earlier reported by Jeong et al. (2018). Alexander M. Kim.

There are potential errors with this approach:

The main one is practical – does a modern cline represent an ancestral language? The answer is: sometimes. It depends on the anthropological context that we have, and especially on the precision of the PCA:

Genetic structure of the Himalayan region populations from analyses using unlinked SNPs. (A) PCA of the Himalayan and HGDP-CEPH populations. Each dot represents a sample, coded by region as indicated. The Himalayan region samples lie between the HGDP-CEPH East Asian and South Asian samples on the right-hand side of the plot. From Arciero et al. (2018).

The ‘Europe’, ‘Middle East’, etc. clines of the above PCA do not represent one language, but many. For starters, the PCA includes too many (and modern) populations, its precision is useless for ethnolinguistic groups. Which is the right level? Again, it depends.

The other error is one of detail of the clines drawn (which, in turn, depends on the precision of the PCA). For example, we can draw two parallel lines (or even one line, as in Flegontov et al. above) in one PCA graphic, but we still don’t have the direction of expansion. How do we know if this supposed “Uralic-speaking cline” goes from one region to the other? For that level of detail, we should examine closely modern Uralic-speaking peoples and Circum-Arctic populations:

Modified from Tambets et al. (2018). Principal component analysis (PCA) and genetic distances of Uralic-speaking populations. a PCA (PC1 vs PC2) of the Uralic-speaking populations

The real ancient Uralic cluster (drawn above in blue) is thus probably from a North-East European source (probably formed by Battle Axe / Fatyanovo-Balanovo / Abashevo) to the east into Siberian populations, and to the north into Laplandic populations (see below also on Mezhovska ancestry for the drawn ‘European cline’, which some may a priori wrongly assume to be quite late).

The fact that the three formed clines point to an admixture of CWC-related populations from North-Eastern Europe, and that variation is greater at the Palaeo-Laplandic and Palaeo-Siberian extremities compared to the CWC-related one, also supports this as the correct interpretation.

However, judging by the two main clines formed, one could be alternatively inclined to interpret that Palaeo-Laplandic and Palaeo-Siberian populations formed a huge ancestral “Uralic” ghost cluster in Siberia (spanning from the Palaeo-Laplandic to the Palaeo-Siberian one), and from there expanded Finno-Samic on one hand, and “Volga-Ugro-Samoyed” on the other. That poses different problems: an obvious linguistic and archaeological one – which I assume a lot of people do not really care about –, and a not-so-obvious genetic one (see below for ancient samples and for the expansion of haplogroup N).

To understand the simplest solution better, one can just have a look at the PCA from Bell Beaker samples in Olalde et al. (2018), which (as Reich has already explained many times) expanded directly from Yamna R1b-L23 lineages:

Image modified from Olalde et al. (2018). PCA of 999 Eurasian individuals. Marked is the Esperstedt Outlier with the approximate position of Yamna Hungary, probably the source of its admixture. Different Bell Beaker clines have been drawn, to represent approximate source of expansions from Central European sources into the different regions.

Unlike this PCA with ancient samples, where Bell Beaker clines could be a rough approximation to the real sources for each population, and where a cluster spanning all three depicted Early Bronze Age clusters could give a rough proximate source of European Bell Beakers in Hungary (and where one can even distinguish the Y-DNA bottlenecks in the L23 trunk created by each cline), the PCA of modern Uralic populations is probably not suitable for a good estimate of the ancient situation, which may be found shifted up or down from the drawn “Uralic” cluster along East European groups.

After all, we already know that the Siberian cline shows probably as much an ancient admixture event – from the original Uralic expansion to the east with Corded Ware ancestry – as another more recent one – a westward migration of Siberian ancestry (or even more than one). While we know with more or less exactitude what happened with the Palaeo-Laplandic admixture by expanding Proto-Finno-Samic populations (see here), the Proto-Ugric and Pre-Samoyedic populations formed probably more than one cline during the different ancient migrations through central Asia.

Ancient populations

Apparently, the Corded Ware expansion to the east was not marked by a huge change in ancestry. While the final version of Narasimhan et al. (2018) may show a little more detail about other forest-steppe Seima-Turbino/Andronovo-related migrations (and thus also Eastern Uralic peoples), we have already had enough information for quite some time to get a good idea.

Principal component analysis. PCA of ancient individuals (according colours see legend) projected on modern West Eurasians (grey). Iron Age Scythians are shown in black; CHG, Caucasus hunter-gatherer; LNBA, late Neolithic/Bronze Age; MN, middle Neolithic; EHG, eastern European hunter-gatherer; LBK_EN, early Neolithic Linearbandkeramik; HG, hunter-gatherer; EBA, early Bronze Age; IA, Iron Age; LBA, late Bronze Age; WHG, western hunter-gatherer.

Mezhovska’s position is similar to the later Pre-Scythian and Scythian populations. There are some interesting details: apart from haplogroup R1a-Z280 (CTS1211+), there is one R1b-M269 (PF6494+), probably Z2103, and an outlier (out of three) in a similar position to the recently described central/southern Scythian clusters.

NOTE. The finding of R1b-M269 in the forest-steppe is probably either 1) from an Afanasevo-Okunevo origin, or 2) from an admixture with neighbouring Andronovo-related populations, such as Sargary. A third, maybe less likely option is that this haplogroup admixed with Abashevo directly (as it happened in Sintashta, Potapovka, or Pokrovka) and formed part of early Uralic migrations. In any case, since Mezhovska is a Bronze Age society from the Urals region, its association with R1b-Z2103 – like the association of R1b-Z2103 in Scythian clusters – cannot be attributed to “Thracian peoples”, a link which is (as I already said) too simplistic.

The drawn “European cline” of Hungarians (see above), leading from ‘west-like’ Mansi to Hungarian populations – and hosting also Finnic and Estonian samples –, cannot therefore be attributed simply to late “Slavic/Balkan-like” admixture.

Karasuk – located further to the east – is basically also Corded Ware peoples showing clearly a recent admixture with local ANE / Baikal_EN-like populations. In terms of haplogroups it shows haplogroup Q, R1a-Z2124, and R1a-Z2123, later found among early Hungarians, and present also in ancient Samoyedic populations now acculturated.

The most interesting aspect of both Mezhovska and Karasuk is that they seem to diverge from a point close to Ukraine_Eneolithic, which is the supposed ancestral source of Corded Ware peoples (read more about the formation of “steppe ancestry”). This means that Eastern Uralians derive from a source closer to Middle Dnieper/Abashevo populations, rather than Battle Axe (shifted to Latvian Neolithic), which is more likely the source prevalent in Finno-Permic peoples.

Their initial admixture with (Palaeo-)Siberian populations is thus seen already starting by this time in Mezhovska and especially in Karasuk, but this process (compared to modern populations) is incomplete:

Visualization of f-statistics results. f4(Test, LBK; Han, Mbuti) values are plotted on x axis and f4(Test, LBK; EHG, Mbuti) values on y axis, positive deviations from zero show deviations from a clade between Test and LBK. A red dashed line is drawn between Yamnaya from Samara and Ami. Iron Age populations that can be modelled as mixtures of Yamnaya and East Eurasians (like the Ami) are arrayed around this line and appear to be distinct from the main North/South European cline (blue) on the left of the x axis.
ADMIXTURE results for ancient populations. Red arrows point to the Iron Age Scythian individuals studied. LBK_EN: Early Neolithic Linearbandkeramik; EHG: Eastern European hunter-gatherer; Motala_HG: hunter-gatherer from Motala (Sweden); WHG: western hunter-gatherer; CHG: Caucasus hunter-gatherer; IA: Iron Age; EBA: Early Bronze Age; LBA: Late Bronze Age.
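
The f4 statistics plotted in the first figure above can be illustrated with a toy calculation. The sketch below is a deliberate simplification: it estimates f4(A, B; C, D) as the mean over SNPs of (a − b)(c − d) from per-population allele frequencies. The frequencies are invented, the population labels are only borrowed from the caption, the `mix` helper is hypothetical, and the block-jackknife standard errors reported by real analyses are omitted.

```python
from statistics import mean


def f4(freq_a, freq_b, freq_c, freq_d):
    """Estimate f4(A, B; C, D) as the mean over SNPs of (a - b) * (c - d)."""
    return mean(
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    )


def mix(freq_x, freq_y, share_y):
    """Allele frequencies of a population with share_y ancestry from Y, the rest from X."""
    return [(1 - share_y) * x + share_y * y for x, y in zip(freq_x, freq_y)]


# Invented allele frequencies at four SNPs (hypothetical, for illustration only).
lbk = [0.1, 0.5, 0.9, 0.3]
han = [0.8, 0.2, 0.4, 0.6]
mbuti = [0.5, 0.5, 0.5, 0.5]

# A Test population identical to LBK forms a clade with it: the statistic is zero.
print(f4(lbk, lbk, han, mbuti))

# Adding Han-related ancestry to Test pushes f4(Test, LBK; Han, Mbuti) above zero,
# and the value grows linearly with the admixture share.
for share in (0.1, 0.25, 0.5):
    test = mix(lbk, han, share)
    print(share, f4(test, lbk, han, mbuti))
```

Because the statistic is linear in the admixture share, populations that are mixtures of the same two sources fall along a straight line when two such statistics are plotted against each other, which is the logic behind the dashed Yamnaya-to-Ami line in the figure.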

We know now that Samic peoples expanded during the Late Iron Age into Palaeo-Laplandic populations, admixing with them and creating this modern cline. Finns expanded later to the north (in one of their known genetic bottlenecks), admixing with (and displacing) the Saami in Finland, especially replacing their male lines.

So how did Ugric and Samoyedic peoples admix with Palaeo-Siberian populations further, to obtain their modern cline? The answer is, logically, with East Asian migrations related to forest-steppe populations of Central Asia after the Mezhovska and Karasuk periods, i.e. during the Iron Age and later. Other groups from the forest-steppe in Central Asia show similar East Asian (“Siberian”) admixture. We know this from Narasimhan et al. (2018):

(…) we observe samples from multiple sites dated to 1700-1500 BCE (Maitan, Kairan, Oy_Dzhaylau and Zevakinsikiy) that derive up to ~25% of their ancestry from a source related to present-day East Asians and the remainder from Steppe_MLBA. A similar ancestry profile became widespread in the region by the Late Bronze Age, as documented by our time transect from Zevakinsikiy and samples from many sites dating to 1500-1000 BCE, and was ubiquitous by the Scytho-Sarmatian period in the Iron Age.

We already have some information about these later migrations:

Very important observation with implication of population turnover is that pre-Turkic Inner Eurasian populations’ Siberian ancestry appears predominantly “Uralic-Yeniseian” in contrast to later dominance of “Tungusic-Mongolic” sort (which does sporadically occur earlier). Alexander M. Kim

The Ugric-speaking Sargat culture in Western Siberia shows the expected mixture of haplogroups (ca. 500 BC – 500 AD), with 5 samples of hg N and 2 of hg R1a1, in Pilipenko et al. (2017). Although radiocarbon dates and subclades are lacking, N lineages probably spread late, because of the late and gradual admixture of Siberian cultures into the Sargat melting pot.
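
With only seven typed males (5 of hg N, 2 of hg R1a1), any frequency estimate for Sargat is necessarily loose. As a rough illustration, the sketch below computes a 95% Wilson score interval for the share of hg N. It assumes the seven males are an independent random sample of the population, which is an assumption of this example and not something the source data guarantee.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts reported in the text for Sargat: 5 hg N and 2 hg R1a1 (7 males in total).
low, high = wilson_interval(5, 7)
print(f"hg N: 5/7 = {5 / 7:.0%}, 95% interval roughly {low:.0%} to {high:.0%}")
```

The interval is wide, so the sample is compatible with hg N being anything from a minority to an overwhelming majority of male lineages; the chronological argument therefore has to rest on the archaeological context rather than on these counts alone.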

The Samoyedic-speaking Tagar culture also shows signs of a genetic turnover in Pilipenko et al. (2018):

The observed reduction in the genetic distance between the Middle Tagar population and other Scythian like populations of Southern Siberia(Fig 5; S4 Table), in our opinion, is primarily associated with an increase in the role of East Eurasian mtDNA lineages in the gene pool (up to nearly half of the gene pool) and a substantial increase in the joint frequency of haplogroups C and D (from 8.7% in the Early Tagar series to 37.5% in the Middle Tagar series). These features are characteristic of many ancient and modern populations of Southern Siberia and adjacent regions of Central Asia, including the Pazyryk population of the Altai Mountains.

Before the Iron Age, the Karasuk and Mezhovska populations were probably already somehow ‘to the north’ within the ancient Steppe-Altai cline (see image below) created by expanding Seima-Turbino- and Andronovo-related populations. During the Iron Age, further Siberian contributions with Iranian expansions must have placed Uralians of the Central Asian forest-steppe areas much closer to today’s Palaeo-Siberian cline.

However, the modern genetic picture was probably fully developed only in historic times, when Samoyedic and Ugric languages expanded to the north, only in part admixing further with Palaeo-Siberian-speaking nomads from the Circum-Arctic region (see here for a recent history of Samoyedic Enets), which justifies their more recent radical ‘northern shift’.

Modified image from Jeong et al. (2018), supplementary materials. The first two PCs summarizing the genetic structure within 2,077 Eurasian individuals. The two PCs generally mirror geography. PC1 separates western and eastern Eurasian populations, with many inner Eurasians in the middle. PC2 separates eastern Eurasians along the north-south cline and also separates Europeans from West Asians. Ancient individuals (color-filled shapes), including two Botai individuals, are projected onto PCs calculated from present-day individuals.
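
The caption above mentions ancient individuals being projected onto PCs calculated from present-day individuals. A minimal sketch of that idea follows, under toy assumptions: the genotypes are invented, only the first principal axis is computed (by power iteration), and all function names are hypothetical. Real pipelines also normalise genotypes and handle missing data, which this sketch does not attempt.

```python
import math


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def centre(rows, means):
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_axis(centred, iterations=100):
    """Leading principal axis of centred rows, found by power iteration."""
    n_snps = len(centred[0])
    axis = [1.0] + [0.0] * (n_snps - 1)  # arbitrary starting direction
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, axis)) for row in centred]
        # Multiply by X^T X without building the covariance matrix.
        axis = [sum(s * row[j] for s, row in zip(scores, centred)) for j in range(n_snps)]
        norm = math.sqrt(sum(w * w for w in axis))
        axis = [w / norm for w in axis]
    return axis


def project(rows, means, axis):
    """Centre rows with the MODERN means and score them on the given axis."""
    return [sum(x * w for x, w in zip(row, axis)) for row in centre(rows, means)]


# Invented genotypes (0/1/2 allele counts at four SNPs); purely hypothetical.
modern_west = [[2, 2, 0, 0], [2, 1, 0, 0], [1, 2, 0, 1]]
modern_east = [[0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 1, 2]]
ancient = [[1, 1, 1, 1]]  # projected only; never used to define the axis

modern = modern_west + modern_east
means = column_means(modern)
pc1 = first_axis(centre(modern, means))

west_scores = project(modern_west, means, pc1)
east_scores = project(modern_east, means, pc1)
ancient_score = project(ancient, means, pc1)[0]
print(west_scores, east_scores, ancient_score)
```

Since the ancient sample is only projected, it has no influence on the axis itself: where it lands depends entirely on which modern populations were used to compute the PCs, which is why the choice of reference populations matters so much for the precision of any cline drawn on such a plot.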

This late acquisition of the language by Palaeo-Siberian nomads (without much population replacement) also justifies the wide PCA clusters of very small Siberian populations. See for example in the PCA from Tambets et al. (2018):

Approximate Ugric and Samoyedic clines (excluding apparent outliers). Modified from Tambets et al. (2018). Principal component analysis (PCA) and genetic distances of Uralic-speaking populations. a PCA (PC1 vs PC2) of the Uralic-speaking populations

For their relationship with modern Mansi, we have information on Hungarian conqueror populations from Neparáczki et al. (2018):

Moreover, Y, B and N1a1a1a1a Hg-s have not been detected in Finno-Ugric populations [80–84], implying that the east Eurasian component of the Conquerors and Finno-Ugric people are probably not directly related. The same inference can be drawn from phylogenetic data, as only two Mansi samples appeared in our phylogenetic trees on the side branches (S1 Fig, Networks; 1, 4) suggesting that ancestors of the Mansis separated from Asian ancestors of the Conquerors a long time ago. This inference is also supported by genomic Admixture analysis of Siberian and Northeastern European populations [85], which revealed that Mansis received their eastern Siberian genetic component approximately 5–7 thousand years ago from ancestors of modern Even and Evenki people. Most likely the same explanation applies to the Y-chromosome N-Tat marker which originated from China [86,87] and its subclades are now widespread between various language groups of North Asia and Eastern Europe [88].

The genetic picture of Hungarians (the cline they form with Mansi, and their haplogroups) may be quite useful for inferring the true admixture originally found in Mansi peoples at the beginning of the Iron Age. By now it is clear even from modern populations that Steppe_MLBA ancestry accompanied the Uralic expansion to the east (roughly approximated in the graphic with Afanasievo_EBA + Bichon_LP EasternHG_M):

Admixture modelling using qpAdm. Maps showing locations and ancestry proportions of ancient (left) and modern (right) groups. From Sikora et al. (2018).

Continue reading the final post of the series: Corded Ware—Uralic (IV): Haplogroups R1a and N in Finno-Ugric and Samoyedic.


  • Early Iranian steppe nomadic pastoralists also show Y-DNA bottlenecks and R1b-L23

    New paper (behind paywall) Ancient genomes suggest the eastern Pontic-Caspian steppe as the source of western Iron Age nomads, by Krzewińska et al. Science Advances (2018) 4(10):eaat4457.

    Interesting excerpts (emphasis mine, some links to images and tables deleted for clarity):

    Late Bronze Age (LBA) Srubnaya-Alakulskaya individuals carried mtDNA haplogroups associated with Europeans or West Eurasians (17) including H, J1, K1, T2, U2, U4, and U5 (table S3). In contrast, the Iron Age nomads (Cimmerians, Scythians, and Sarmatians) additionally carried mtDNA haplogroups associated with Central Asia and the Far East (A, C, D, and M). The absence of East Asian mitochondrial lineages in the more eastern and older Srubnaya-Alakulskaya population suggests that the appearance of East Asian haplogroups in the steppe populations might be associated with the Iron Age nomads, starting with the Cimmerians.


    #UPDATE (5 OCT 2018): Some Y-SNP calls have been published in a Molgen thread, with:

    • Srubna samples have possibly two R1a-Z280, three R1a-Z93.
    • Cimmerians may not have R1b: cim357 is reported as R1a.
    • Some Scythians have low coverage to the point where it is difficult to assign even a reliable haplogroup (they report hg I2 for scy301, or E for scy197, probably based on some shared SNPs?), but those which can be reliably assigned seem R1b-Z2103 [hence probably the use of question marks and asterisks in the table, and the assumption of the paper that all Scythians are R1b-L23]:
      • The most recent subclade is found in scy305: R1b-Z2103>Z2106 (Z2106+, Y12538/Z8131+)
      • scy304: R1b-Z2103 (M12149/Y4371/Z8128+).
      • scy009: R1b-P312>U152>L2 (P312+, U152?, L2+)?
    • Sarmatians are apparently all R1a-Z93 (including tem002 and tem003).
    • You can read here the Excel file with results (some probably as speculative as the paper’s own).

    About the PCA

    1. Srubnaya-Alakulskaya individuals exhibited genetic affinity to northern and northeastern present-day Europeans, and these results were also consistent with outgroup f3 statistics.
    2. The Cimmerian individuals, representing the time period of transition from Bronze to Iron Age, were not homogeneous regarding their genetic similarities to present-day populations according to the PCA. F3 statistics confirmed the heterogeneity of these individuals in comparison with present-day populations.
    3. The Scythians reported in this study, from the core Scythian territory in the North Pontic steppe, showed high intragroup diversity. In the PCA, they are positioned as four visually distinct groups compared to the gradient of present-day populations:
      1. A group of three individuals (scy009, scy010, and scy303) showed genetic affinity to north European populations (…).
      2. A group of four individuals (scy192, scy197, scy300, and scy305) showed genetic similarities to southern European populations (…).
      3. A group of three individuals (scy006, scy011, and scy193) located between the genetic variation of Mordovians and populations of the North Caucasus (…). In addition, one Srubnaya-Alakulskaya individual (kzb004), the most recent Cimmerian (cim357), and all Sarmatians fell within this cluster. In contrast to the Scythians, and despite being from opposite ends of the Pontic-Caspian steppe, the five Sarmatians grouped close together in this cluster.
      4. A group of three Scythians (scy301, scy304, and scy311) formed a discrete group between the SC and SE and had genetic affinities to present-day Bulgarian, Greek, Croatian, and Turkish populations (…).
      5. Finally, one individual from a Scythian cultural context (scy332) is positioned outside of the modern West Eurasian genetic variation (Fig. 1C) but shared genetic drift with East Asian populations.
    Radiocarbon ages and geographical locations of the ancient samples used in this study. Figure panels presented (Left) Bar plot visualizing approximate timeline of presented and previously published individuals. (Right) Principal component analysis (PCA) plot visualizing 35 Bronze Age and Iron Age individuals presented in this study and in published ancient individuals (table S5) in relation to modern reference panel from the Human Origins data set (41).


    The presence of an SA component (as well as finding of metals imported from Tien Shan Mountains in Muradym 8) could therefore reflect a connection to the complex networks of the nomadic transmigration patterns characteristic of seasonal steppe population movements. These movements, although dictated by the needs of the nomads and their animals, shaped the economic and social networks linking the outskirts of the steppe and facilitated the flow of goods between settled, semi-nomadic, and nomadic peoples. In contrast, all Cimmerians carried the Siberian genetic component. Both the PCA and f4 statistics supported their closer affinities to the Bronze Age western Siberian populations (including Karasuk) than to Srubnaya. It is noteworthy that the oldest of the Cimmerians studied here (cim357) carried almost equal proportions of Asian and West Eurasian components, resembling the Pazyryks, Aldy-Bel, and Iron Age individuals from Russia and Kazakhstan (12). The second oldest Cimmerian (cim358) was also the only one with both uniparental markers pointing toward East Asia. The Q1* Y chromosome sublineage of Q-M242 is widespread among Asians and Native Americans and is thought to have originated in the Altai Mountains (24)


    In contrast to the eastern steppe Scythians (Pazyryks and Aldy-Bel) that were closely related to Yamnaya, the western North Pontic Scythians were instead more closely related to individuals from Afanasievo and Andronovo groups. Some of the Scythians of the western Pontic-Caspian steppe lacked the SA and the East Eurasian components altogether and instead were more similar to a Montenegro Iron Age individual (3), possibly indicating assimilation of the earlier local groups by the Scythians.

    Toward the end of the Scythian period (fourth century CE), a possible direct influx from the southern Ural steppe zone took place, as indicated by scy332. However, it is possible that this individual might have originated in a different nomadic group despite being found in a Scythian cultural context.

    Genetic diversity and ancestral components of Srubnaya-Alakulskaya population (here called “Srubnaya”): (Left) Mean f3 statistics for Srubnaya and other Bronze Age populations. Srubnaya group was color-coded the same as with PCA. (Right) Pairwise mismatch estimates for Bronze Age populations.


    I am surprised to find this new R1b-L23-based bottleneck in Eastern Iranian expansions so late, but admittedly – based on data from later times in the Pontic-Caspian steppe near the Caucasus – it was always a possibility. The fact that pockets of R1b-L23 lineages remained somehow ‘hidden’ in early Indo-Iranian communities has been clear since Narasimhan et al. (2018), as I predicted could happen, and is compatible with the limited archaeological data on Sintashta-Potapovka populations outside fortified settlements. I already said then that Corded Ware was out of Indo-European migrations; this further supports it.

    Even with all these data coming just from a north-west Pontic steppe region (west of the Dnieper), these ‘Cimmerians’ – or rather the ‘Proto-Scythian’ nomadic cultures appearing before ca. 800 BC in the Pontic-Caspian steppes – are shown to be probably formed by diverse peoples from Central Asia who brought about the first waves of Siberian ancestry (and Asian lineages) seen in the western steppes. You can read about a Cimmerian-related culture, Ananino, key for the evolution of Finno-Permic peoples.

    Also interesting about the Y-DNA bottleneck seen here is the rejection of the supposed continuous western expansions of R1a-Z645 subclades with steppe tribes since the Bronze Age, and thus a clearer link of the Hungarian Árpád dynasty (of R1a-Z2123 lineage) to either the early Srubna-related expansions or – much more likely – to the actual expansions of Hungarian tribes near the Urals in historic times.

    NOTE. I will add the information of this paper to the upcoming post on Ugric and Samoyedic expansions, and the late introduction of Siberian ancestry to these peoples.

    A few interesting lessons to be learned:

    • Remember the fantasy story about that supposed steppe nomadic pastoralist society sharing different Y-DNA lineages? You know, that Yamna culture expanding with R1b from Khvalynsk-Repin into the whole Pontic-Caspian steppes and beyond, developing R1b-dominated Afanasevo, Bell Beaker, and Poltavka, but suddenly appearing (in the middle of those expansions through the steppes) as a different culture, Corded Ware, to the north (in the east-central European forest zone) and dominated by R1a? Well, it hasn’t happened with any other steppe migration, so…maybe Proto-Indo-Europeans were that kind of especially friendly language-teaching neighbours?
    • Remember that ‘pure-R1a’ Indo-Slavonic society emerged from Sintashta ca. 2100 BC? (or even Graeco-Aryan??) Hmmmm… Another good fantasy story that didn’t happen; just like a central-east European Bronze Age Balto-Slavic R1a continuity didn’t happen, either. So, given that cultures from around Estonia are those showing the closest thing to R1a continuity in Europe until the Iron Age, I assume we have to get ready for the Gulf of Finland Balto-Slavic soon.
    • Remember that ‘pure-R1a’ expansion of Indo-Europeans based on the Tarim Basin samples? This paper means ipso facto an end to the Tarim Basin – Tocharian artificial controversy. The Pre-Tocharian expansion is represented by Afanasevo, and whether or not (Andronovo-related) groups of R1a-Z645 lineages replaced part or eventually all of its population before, during, or after the Tocharian expansion into the Tarim Basin, this does not change the origin of the language split and expansion from Yamna to Central Asia; just like this paper does not change the fact that these steppe groups were Proto-Iranian (Srubna) and Eastern Iranian (Scythian) speakers, regardless of their dominant haplogroup.
    • And, best of all, remember the Copenhagen group’s recent R1a-based “Indo-Germanic” dialect revival vs. the R1b-Tocharo-Italo-Celtic? Yep, they made that proposal, in 2018, based on the obvious Yamna—R1b-L23 association, and the desire to support Kristiansen’s model of Corded Ware – Indo-European expansion. Pepperidge Farm remembers. This new data on Early Iranians means another big NO to that imaginary R1a-based PIE society. But good try to go back to Gimbutas’ times, though.
    Olander’s (2018) tree of Indo-European languages. Presented at Languages and migrations in pre-historic Europe (7-12 Aug 2018)

    Do you smell that fresher air? It’s the Central and East European post-Communist populist and ethnonationalist bullshit (viz. pure blond R1a-based Pan-Nordicism / pro-Russian Pan-Slavism / Pan-Eurasianism, as well as Pan-Turanism and similar crap from the 19th century) going down the toilet with each new paper.

    #EDIT (5 OCT 2018): It seems I was too quick to rant about the consequences of the paper without taking into account the complexity of the data presented. Not the first time this impulsivity happens, I guess it depends on my mood and on the time I have to write a post on the specific work day…

    While the data on Srubna, Cimmerians, and Sarmatians shows clearer Y-DNA bottlenecks (of R1a-Z645 subclades) with the new data, the Scythian samples remain controversial, because of the many doubts about the haplogroups (although the most certain cases are R1b-Z2103), their actual date, and cultural attribution. However, I doubt they belong to other peoples, given the expansionist trends of steppe nomads before, during, and after Scythians (as shown in statistical analyses), so most likely they are Scythian or ‘Para-Scythian’ nomadic groups that probably came from the east, whether or not they incorporated Balkan populations. This is further supported by the remaining R1b-P312 and R1b-Z2103 populations in and around the modern Eurasian steppe region.

    Early Iron Age cultures of the Carpathian basin ca. 7-6th century BC, including steppe groups Basarabi and Scythians. Ďurkovič et al. (2018).

    You can find an interesting and detailed take on the published data (in Russian) at Vol-Vlad’s LiveJournal (you can read an automatic translation from Google). I think that post is perhaps too detailed in debunking all the information associated with the supposed Scythians, to the point where just a single sample seems to be an actual Scythian (?!), but it is nevertheless interesting to read about the potential pitfalls of the study.


    Munda admixture happened probably during the ANI-ASI mixture


    Preprint The genetic legacy of continental scale admixture in Indian Austroasiatic speakers, by Tätte et al. bioRxiv (2018).

    Interesting excerpts:

    Studies analysing mtDNA and Y chromosome markers have revealed a sex-specific admixture pattern of admixture of Southeast and South Asian ancestry components for Munda speakers. While close to 100% of mtDNA lineages present in Mundas match those in other Indian populations, around 65% of their paternal genetic heritage is more closely related to Southeast Asian than South Asian variation. Such a contrasting distribution of maternal and paternal lineages among the Munda speakers is a classic example of ‘father tongue hypothesis’. However, the temporality of this expansion is contentious. Based on Y-STR data the coalescent time of Indian O2a-M95 haplogroup was estimated to be >10 KYA. Recently, the reconstructed phylogeny of 8.8 Mb region of Y chromosome data showed that Indian O2a-M95 lineages coalesce within a clade nested within East/Southeast Asian within the last ~5-7 KYA. This date estimate sets the upper boundary for the main episode of gene flow of Y chromosomes from Southeast Asia to India.

    Supplementary Figure S4. First two components of principal component analysis (PCA). Individuals and population medians (circles) are marked with abbreviations from population names. Different colours represent populations from different geographic areas and/or linguistic groups as shown on the legend on the right. For the full names of populations see Supplementary Table S1. PCA was performed using software EIGENSOFT 6.1.42 on the whole filtered dataset (1072 individuals), previously LD pruned as described in the title of Supplementary Figure S1. The first two principal components describe 5.13% and 2.57% of total variance.

    Admixture proportions suggest a novel scenario

    Regardless of which West Asian population we used, we found that Munda speakers can be described on average as a mixture of ~19% Southeast Asian, 15% West Asian and 66% Onge (South Asian) components. Alternatively, the West and South Asian components of Munda could be modelled using a single South Asian population (Paniya), accounting on average to 77% of the Munda genome. When rescaling the West and South Asian (Onge) components to 1 to explore the Munda genetic composition prior to the introduction of the Southeast Asian component, we note that the West Asian component is lower (~19%) in Munda compared to Paniya (27%) (Supplementary Table S4: *Average_Lao=0). Consistently with qpGraph analyses in Narasimhan et al. (2018), this may point to an initial admixture of a Southeast Asian substrate with a South Asian substrate free of any West Asian component, followed by the encounter of the resulting admixed population with a Paniya-like population. Such a scenario would imply an inverse relationship between the Southeast and West Asian relative proportions in Munda or, in other words, the increase of Southeast Asian component should cause a greater reduction of the West Asian compared to the reduction in the South Asian component in Munda.
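
    The rescaling step described in the excerpt is simple arithmetic and can be checked directly. This is a minimal sketch that uses only the average proportions quoted above; the helper name `rescale_without` is hypothetical and does not come from the preprint.

```python
def rescale_without(components, dropped):
    """Drop one ancestry component and rescale the remaining ones to sum to 1."""
    kept = {name: share for name, share in components.items() if name != dropped}
    total = sum(kept.values())
    return {name: share / total for name, share in kept.items()}


# Average Munda proportions as quoted in the excerpt (Tätte et al. 2018).
munda = {"Southeast Asian": 0.19, "West Asian": 0.15, "Onge": 0.66}

# Composition prior to the introduction of the Southeast Asian component.
rescaled = rescale_without(munda, "Southeast Asian")
for name, share in rescaled.items():
    print(f"{name}: {share:.1%}")
```

    With these inputs the West Asian share becomes 0.15 / 0.81, which rounds to the ~19% given in the excerpt and sits below the 27% reported for Paniya.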

    The distribution of genetic components (K=13) based on the global ADMIXTURE analysis (Supplementary Figure S1, S2, S3) for a subset of populations on a map of South and Southeast Asia. The circular legend in the bottom left corner shows the ancestral components corresponding to the colours on pie charts. The sector sizes correspond to population median.

    Dating the admixture event

    In this study, we have replicated a result previously reported in Chaubey et al. (2011)7 that the Mundas lack one ancestral component (k2) that is characteristic to Indian Indo-European and Dravidian speaking populations. If this component came to India through one of the Indo-Aryan migrations then it would be fair to presume that the Munda admixture happened before this component reached India or at least before it spread all over the country. However, the admixture time computed here, falls in the exact same timeframe as the ANI-ASI mixture has been estimated to have happened in India through which the k2 component probably spread. Therefore, we propose that if the Munda admixture happened at the same time, it is possible for it to have happened in the eastern part of the country, east of Bangladesh, and later when populations from East Asia moved to the area, the Mundas migrated towards central India. Such a scenario, which may be further clarified by ancient DNA analyses, seems to be further supported by the fact that Mundas harbor a smaller fraction of West Asian ancestry compared to contemporary Paniya (Supplementary Table S4) and cannot therefore be seen as a simple admixture product of Southern Indian populations with incoming Southeast Asian ancestries.

    Image from Damgaard et al. (2018). A summary of the four qpAdm models fitted for South Asian populations. For each modern South Asian population, we fit different models with qpAdm to explain their ancestry composition using ancient groups and present the first model that we could not reject in the following priority order: 1. Namazga_CA + Onge, 2. Namazga_CA + Onge + Late Bronze Age Steppe, 3. Namazga_CA + Onge + Xiongnu_IA (East Asian proxy), and 4. Turkmenistan_IA + Xiongnu_IA. Xiongnu_IA were used here to represent East Asian ancestry. We observe that while South Asian Dravidian speakers can be modeled as a mixture of Onge and Namazga_CA, an additional source related to Late Bronze Age steppe groups is required for IE speakers. In Tibeto-Burman and Austro-Asiatic speakers, an East Asian rather than a Steppe_MLBA source is required.

    Linguistics and genome-wide data

    (…) by and large, the linguistic classification justifies itself but Kharia and Juang do not fit in this simplification perfectly.

    Once again, with the current level of detail in genetic studies, there is often no clear dialectal division possible for certain groups without fine-scale population studies, and the help from linguistics and archaeology.

    Featured image from open access paper by Chaubey et al. (2011).


    Mitogenomes from Avar nomadic elite show Inner Asian origin


    Inner Asian maternal genetic origin of the Avar period nomadic elite in the 7th century AD Carpathian Basin, by Csáky et al. bioRxiv (2018).

    Abstract (emphasis mine):

    After 568 AD the nomadic Avars settled in the Carpathian Basin and founded their empire, which was an important force in Central Europe until the beginning of the 9th century AD. The Avar elite was probably of Inner Asian origin; its identification with the Rourans (who ruled the region of today’s Mongolia and North China in the 4th-6th centuries AD) is widely accepted in the historical research.

    Here, we study the whole mitochondrial genomes of twenty-three 7th century and two 8th century AD individuals from a well-characterised Avar elite group of burials excavated in Hungary. Most of them were buried with high value prestige artefacts and their skulls showed Mongoloid morphological traits.

    The majority (64%) of the studied samples’ mitochondrial DNA variability belongs to Asian haplogroups (C, D, F, M, R, Y and Z). This Avar elite group shows affinities to several ancient and modern Inner Asian populations.

    The genetic results verify the historical thesis on the Inner Asian origin of the Avar elite, as not only a military retinue consisting of armed men, but an endogamous group of families migrated. This correlates well with records on historical nomadic societies where maternal lineages were as important as paternal descent.

    MDS with 23 ancient populations. The Multidimensional Scaling plot is based on linearised Slatkin FST values that were calculated based on whole mitochondrial sequences (stress value is 0.1581). The MDS plot shows the connection of the Avars (AVAR) to the Central-Asian populations of the Late Iron Age (C-ASIA_LIAge) and Medieval period (C-ASIA_Medieval) along coordinate 1 and coordinate 2, which is caused by non-significant genetic distances between these populations. The European ancient populations are situated on the left part of the plot, where the Iberian (IB_EBRAge), Central-European (C-EU_BRAge) and British (BRIT_BRAge) populations from Early Bronze Age and Bronze Age are clustered along coordinate 2, while the Neolithic populations from Germany (GER_Neo), Hungary (HUN_Neo), Near-East (TUR_Neo) and Baltic region (BALT_Neo) are located on the skirt of the plot along coordinate 1. The linearised Slatkin FST values, abbreviations and references are presented in Table S4.

    Interesting excerpts:

    The mitochondrial genome sequences can be assigned to a wide range of the Eurasian haplogroups with dominance of the Asian lineages, which represent 64% of the variability: four samples belong to Asian macrohaplogroup C (two C4a1a4, one C4a1a4a and one C4b6); five samples to macrohaplogroup D (one by one D4i2, D4j, D4j12, D4j5a, D5b1), and three individuals to F (two F1b1b and one F1b1f). Each haplogroup M7c1b2b, R2, Y1a1 and Z1a1 is represented by one individual. One further haplogroup, M7 (probably M7c1b2b), was detected (sample AC20); however, the poor quality of its sequence data (2.19x average coverage) did not allow further analysis of this sample.

    European lineages (occurring mainly among females) are represented by the following haplogroups: H (one H5a2 and one H8a1), one J1b1a1, three T1a (two T1a1 and one T1a1b), one U5a1 and one U5b1b (Table S1).

    We detected two identical F1b1f haplotypes (AC11 female and AC12 male) and two identical C4a1a4 haplotypes (AC13 and AC15 males) from the same cemetery of Kunszállás; these matches indicate the maternal kinship of these individuals. There is no chronological difference between the female and the male from Grave 30 and 32 (AC11 and AC12), but the two males buried in Grave 28 and 52 (AC13 and AC15) are not contemporaries; they lived at least 2-3 generations apart.

    Ward type clustering of 44 ancient populations. The Ward type clustering shows separation of Asian and European populations. The Avar elite group (AVAR) is situated on an Asian branch and clustered together with Central Asian populations from Late Iron Age (C-ASIA_LIAge) and Medieval period (C-ASIA_Medieval), furthermore with Xiongnu period population from Mongolia (MON_Xiongnu) and Scythians from the Altai region (E-EU_IAge_Scyth). P values are given in percent as red numbers on the dendrogram, where red rectangles indicate clusters with significant p values. The abbreviations and references are presented in Table S2.

    The Avar period elite shows the lowest and non-significant genetic distances to ancient Central Asian populations dated to the Late Iron Age (Hunnic) and to the Medieval period, which is displayed on the ancient MDS plot (Fig. 4); these connections are also reflected on the haplogroup based Ward-type clustering tree (Fig. 3). Building of these large Central Asian sample pools is enabled by the small number of samples per cultural/ethnic group. Further mitogenomic data from Inner Asia are needed to specify the ancient genetic connections; however, genomic analyses are also set back by the state of archaeological research, i.e. the lack of human remains from the 4th-5th century Mongolia, which would be a particularly important region in the study of the Avar elite’s origin.

    The investigated elite group from the Avar period elite also shows low genetic distances and phylogenetic connections to several Central and Inner Asian modern populations. Our results indicate that the source population of the elite group of the Avar Qaganate might have existed in Inner Asia (region of today’s Mongolia and North China) and the studied stratum of the Avars moved from there westwards towards Europe. Further genetic connections of the Avars to modern populations living to East and North of Inner Asia (Yakuts, Buryats, Tungus) probably indicate common source populations.

    MDS with the 44 modern populations and the Avar elite group. The Multidimensional Scaling plot is displayed based on linearised Slatkin FST values calculated based on whole mitochondrial sequences (stress value is 0.0677). The MDS plot shows differentiation of European, Near-Eastern, Central- and East-Asian populations along coordinates 1 and 2. The Avar elite (AVAR) is located on the Asian part of plot and clustered with Uyghurs from Northwest-China (NW-CHIN_UYG) and Han Chinese (CHIN), as well as with Burusho and Hazara populations from the Central-Asian Highland (Pakistan). The linearised Slatkin FST values, abbreviations and references are presented in Table S5.

    Sadly, no Y-DNA is available from this paper, although haplogroups Q, C2, or R1b (xM269) are probably to be expected, given the reported mtDNA. A replacement of the male population with subsequent migrations is obvious from the current distribution of Y-DNA haplogroups in the Carpathian Basin.

    Hungarians and Corded Ware

    Ancient Hungarians are important to understand the evolution, not only of Ugric, but also of Finno-Ugric peoples and their origin, since they show a genetic picture before more recent population expansions, genetic drift, and bottlenecks in eastern Europe.

    By now it is evident that the migration of Magyar clans from their homeland in the Cis-Urals region (from the 4th century AD on) happened after the first waves of late and gradual expansion of N1c subclades among Finno-Ugric peoples, but before the bottlenecks seen in modern populations of eastern Europe.

    In Ob-Ugric peoples, from the scarce data found in Pimenoff et al. (2018), we can see how Siberian N subclades expanded further after the separation of Magyars, evidenced by the inverted proportion of haplogroups R1a and N in modern Khantys and Mansis compared to Hungarians, and the diversity of N subclades compared to modern Fennic peoples.

    Similarly to Hungarians, the situation of modern Estonians (where R1a and N subclades show approximately the same proportion, ca. 33%) is probably closer to Fennic peoples in Antiquity, not having undergone the latest strong founder effect evident in modern Finns after their expansion to the north.

    Hungarian expansion from the 4th to the 10th century AD.

    Modern Hungary

    This is data from recent papers, summed up in Wikipedia:

    • In Semino et al. (2001) they found among 45 Palóc from Budapest and northern Hungary: 60% R1a, 13% R1b, 11% I, 9% E, 2% G, 2% J2.
    • In Csányi et al. (2008), among 100 Hungarian men, 90 of whom from the Great Hungarian Plain: 30% R1a, 15% R1b, 13% I2a1, 13% J2, 9% E1b1b1a, 8% I1, 3% G2, 3% J1, 3% I*, 1% E*, 1% F*, 1% K*. Among 97 Székelys, in Romania: 20% R1b, 19% R1a, 17% I1, 11% J2, 10% J1, 8% E1b1b1a, 5% I2a1, 5% G2, 3% P*, 1% E*, 1% N.
    • In Pamjav et al. (2011), among 230 samples expected to include 6-8% Gypsy peoples: 26% R1a, 20% I2a, 19% R1b, 7% I, 6% J2, 5% H, 5% G2a, 5% E1b1b1a1, 3% J1, <1% N, <1% R2.
    • In Pamjav et al. (2017), from the Bodrogköz population: R1a-M458 (20.4%), I2a1-P37 (19%), R1b-M343 (15%), R1a-Z280 (14.3%), E1b-M78 (10.2%), and N1c-Tat (6.2%).

    NOTE. The N1c-Tat found in Bodrogköz belongs to the N1c-VL29 subgroup, more frequent among Balto-Slavic peoples, which may suggest (yet again) an initial stage of the expansion of N subclades among Finno-Ugric peoples by the time of the Hungarian migration.

    This is the data from FTDNA group on Hungary (copied from a Wikipedia summary of 2017 data):

    • 26.1% R1a (15% Z280, 6.5% M458, 0.9% Z93=>S23201, 3.7% unknown)
    • 19.2% R1b (6% L11-P312/U106, 5.3% P312, 4.2% L23/Z2103, 3.7% U106)
    • 16.9% I2 (15.2% CTS10228, 1.4% M223, 0.5% L38)
    • 8.3% I1
    • 8.1% J2 (5.3% M410, 2.8% M102)
    • 6.9% E1b1b1 (6% V13, 0.3% V22, 0.3% M123, 0.3% M81)
    • 6.9% G2a
    • 3.2% N (1.4% Z9136, 0.5% M2019/VL67, 0.5% Y7310, 0.9% Z16981)- note: only unrelated males are sampled
    • 2.3% Q (1.2% YP789, 0.9% M346, 0.2% M242)
    • 0.9% T
    • 0.5% J1
    • 0.2% L
    • 0.2% C

    R1a-Z280 stands out in the FTDNA data (which we have to assume has no geographic preference among modern Hungarians), while R1a-M458 is prevalent in the north, which probably points to its relationship with (at least West) Slavic populations.

    Ancient Hungarians

    We already knew that Hungarians show similarities with Srubna and Hunnic peoples, and this paper shows a good reason for the similarities with the Huns.

    Also, recent population movements in the region (before the Avars) probably increased the proportion of R1b-L23 and I1 subclades (related to Roman and Germanic peoples) as well as possibly R1a-Z283 (mainly M458, related to the expansion of Slavs). From Understanding 6th-century barbarian social organization and migration through paleogenomics, by Amorim et al. (2018):

    Y-chromosome haplogroup attribution for 37 medieval and 1 Bronze age individuals.

    NOTE. The sample SZ15, of haplogroup R1a1a1b1a3a (S200), belongs to the Germanic branch Z284, which has a completely different history with its integration into the Nordic Bronze Age community.

    Interesting is the Szólád Bronze Age sample of R1a1a1b2a2a (Z2123) subclade (ca. 2100-1700 BC), which is possibly the same haplogroup found in King Béla III [Z93+ (80.6%), Z2123+ (10.8%)]*. Nevertheless, Z2123 refers to an upper clade, found also in East Andronovo sites in Narasimhan et al. (2018), as well as in the modern population of the Tarim Basin.

    NOTE. For more on the analysis of probability of the actual subclade, see here.

    Bronze Age R1a-Z93 samples of central-east Europe – like the Balkans BA sample (ca. 1750-1625 BC) from Merichleri, of R1a1a1b2 subclade – correspond most likely to the expansion of Iranian-speaking peoples in the early 2nd millennium BC, probably to the westward expansion of the Srubna culture.

    The specific subclade of King Béla III, on the other hand, probably corresponds to the more recent expansion of Magyar tribes settled in the region during the 9th century AD, so the specific subclade must have separated from those found in central-east Europe and in Andronovo during the Corded Ware expansion.

    Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the potential Finno-Ugric-associated distribution of Z282 (including M558, a Z280 subclade) according to ancient maps; the northern Eurasian finds of Z2125 (upper clade of Z2123); and the potential of M458 subclades representing a west-east expansion of Balto-Slavic as a western outgroup of an original Fenno-Ugric population, equivalent to Z284 in Scandinavia.

    The study by Csányi et al. (2008), where the Tat C allele was found in 2 of 4 ancient samples, showed thus a potential 50:50 relationship of N1c in ancient Magyars, which is striking given the modern 1-3% a mere 1,000 years later, without any relevant population movement in between. This result remains to be reproduced with the current technology.

    In fact, recent studies of ancient Magyars, from the 10th to the 12th century, have not shown any N1c sample, and have confirmed instead the ancient presence of R1a (two other samples, interred near Béla III), R1b (four samples), I2a (two samples), J1, and E1b, a mixed genetic picture which is more in line with what is expected.

    So the question that I recently posed about east Corded Ware groups remains open: were Proto-Ugric peoples mainly of R1a-Z282 or R1a-Z93 subclades? Without ancient DNA from Middle Dnieper, Fatyanovo, Abashevo, and the succeeding cultures (like Netted Ware) in north-eastern Europe, it is difficult to say.

    It is very likely that they are going to show mainly a mixture of both R1a-Z282 and R1a-Z93 lineages, with later populations showing a higher proportion of R1a-Z280 subclades. Whether this mixture happened already during the Corded Ware period, or is the result of later developments, is still unknown. What is certain is that Hungarian N1a1a1a-L708 subclades belong to more recent additions of Siberian haplogroups to the Ugric stock, probably during the Iron Age, just centuries before the Magyar expansion.


    Modelling of prehistoric dispersal of rice varieties in India points to a north-western origin


    New paper (behind paywall), A tale of two rice varieties: Modelling the prehistoric dispersals of japonica and proto-indica rices, by Silva et al., The Holocene (2018).

    Interesting excerpts (emphasis mine):


    Our empirical evidence comes from the Rice Archaeological Database (RAD). The first version of this database was used for a synthesis of rice dispersal by Fuller et al. (2010), a slightly expanded dataset (version 1.1) was used to model the dispersal of rice, land area under wet rice cultivation and associated methane emissions from 5000–1000 BP (Fuller et al., 2011). The present dataset (version 2) was used in a previous analysis of the origins of rice domestication (Silva et al., 2015). The database records sites and chronological phases within sites where rice has been reported, including whether rice was identified from plant macroremains, phytoliths or impressions in ceramics. Ages are recorded as the start and end date of each phase, and a median age of the phase is then used for analysis. Dating is based on radiocarbon evidence (…)

    Modelling framework

    Our approach expands on previous efforts to model the geographical origins, and subsequent spread, of japonica rice (Silva et al., 2015). The methodology is based on the explicit modelling of dispersal hypotheses using the Fast Marching algorithm, which computes the cost-distance of an expanding front at each point of a discrete lattice or raster from the source(s) of diffusion (Sethian, 1996; Silva and Steele, 2012, 2014). Sites in the RAD database are then queried for their cost-distance, the distance from the source(s) of dispersal along the cost-surface that represents the hypothesis being modelled (see Connolly and Lake, 2006; Douglas, 1994; Silva et al., 2015; Silva and Steele, 2014 for more on this approach) and, together with the site’s dating, used for regression analysis. (…)
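
    The two steps in this excerpt (a cost-distance from the source to every cell, then a regression of site dates on that distance) can be illustrated on a toy grid. This is only a sketch under stated assumptions: it swaps the Fast Marching algorithm for plain Dijkstra on a 4-neighbour lattice, which is a cruder approximation of an expanding front, and the cost raster, site names, positions, and ages are all invented for illustration and are not taken from the RAD database or from Silva et al.

```python
import heapq


def cost_distance(cost, sources):
    """Accumulated travel cost from the nearest source to every grid cell.

    Dijkstra on a 4-neighbour lattice, used here as a simple stand-in for
    Fast Marching. Each step costs the mean of the two cells' cost values.
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + 0.5 * (cost[r][c] + cost[nr][nc])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist


def fit_line(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical cost surface: column 2 is an expensive "mountain" barrier.
toy_cost = [[1.0] * 5 for _ in range(4)]
for row in toy_cost:
    row[2] = 4.0

toy_dist = cost_distance(toy_cost, sources=[(0, 0)])

# Invented sites: (grid position, median age in years BP). Not real data.
toy_sites = {
    "site_a": ((0, 1), 4800),
    "site_b": ((2, 1), 4500),
    "site_c": ((1, 3), 3900),
    "site_d": ((3, 4), 3500),
}
distances = [toy_dist[r][c] for (r, c), _ in toy_sites.values()]
ages = [age for _, age in toy_sites.values()]
slope, intercept = fit_line(distances, ages)
print(f"years per unit of cost-distance: {slope:.1f}")
```

    Competing dispersal hypotheses would then correspond to different cost surfaces or source cells, each yielding its own regression to compare against the dated sites.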

    Predicted arrival times of the non-shattering rice variety (japonica or the hybrid indica) across southern Asia based on best-fitting model H2. Included are also sites with known presence of non-shattering spikelet bases (see text).

    Model and results

    The ‘Inner Asia Mountain Corridor’ hypothesis (H2) therefore predicts japonica rice to arrive first in northwest India via a route that starts in the Yellow river valley, travels west via the well-known Hexi corridor, then just south of the Inner Asian Mountains and thence to India.

    The results also show that the addition of the Inner Asia Mountain Corridor significantly improves the model’s fit to the data, particularly model H2 where rice is introduced to the Indian subcontinent exclusively via a trade route that circumvents the Tibetan plateau. This agrees with independent archaeological evidence that sees millets spread westwards along this corridor perhaps as early as 3000 BC (e.g. Boivin et al., 2012; Kohler-Schneider and Canepelle, 2009; Rassamakin, 1999) and certainly by 2500–2000 BC (Frachetti et al., 2010; Spengler 2015; Stevens et al., 2016), that is, in the same time frame as that predicted for rice in model H2. The arrival of western livestock (sheep, cattle) into central China, 2500–2000 BC (Fuller et al., 2011; Yuan and Campbell, 2009), and wheat, ca. 2000 BC (Betts et al., 2014; Flad et al., 2010; Stevens et al., 2016; Zhao, 2015), add evidence for the role of the Inner Asia Mountain Corridor for domesticated species dispersal in this period.


    Through a combination of explicit spatial modelling and simulation, we have demonstrated the high likelihood that dispersal of rice via traders in Central Asia introduced japonica rice into South Asia. Only slightly less likely is a combination of introduction via two routes including a Central Asia to Pakistan/northwestern India route as well as introduction to northeastern India directly from China/Myanmar. However, there is a very low probability that current archaeological evidence for rice fits with a single introduction of japonica into India via the northeast. We have also simulated the minimum amount of archaeobotanical sampling from the Neolithic (to Bronze Age) period in the regions of northeastern India and Myanmar that will be necessary to strengthen support for the combined introduction (model H3) or a single Central Asian introduction (model H2).


    “Steppe people seem not to have penetrated South Asia”


    Open access structured abstract for The first horse herders and the impact of early Bronze Age steppe expansions into Asia from Damgaard et al. Science (2018) 360(6396):eaar7711.

    Abstract (emphasis mine):

    The Eurasian steppes reach from the Ukraine in Europe to Mongolia and China. Over the past 5000 years, these flat grasslands were thought to be the route for the ebb and flow of migrant humans, their horses, and their languages. de Barros Damgaard et al. probed whole-genome sequences from the remains of 74 individuals found across this region. Although there is evidence for migration into Europe from the steppes, the details of human movements are complex and involve independent acquisitions of horse cultures. Furthermore, it appears that the Indo-European Hittite language derived from Anatolia, not the steppes. The steppe people seem not to have penetrated South Asia. Genetic evidence indicates an independent history involving western Eurasian admixture into ancient South Asian peoples.

    According to the commonly accepted “steppe hypothesis,” the initial spread of Indo-European (IE) languages into both Europe and Asia took place with migrations of Early Bronze Age Yamnaya pastoralists from the Pontic-Caspian steppe. This is believed to have been enabled by horse domestication, which revolutionized transport and warfare. Although in Europe there is much support for the steppe hypothesis, the impact of Early Bronze Age Western steppe pastoralists in Asia, including Anatolia and South Asia, remains less well understood, with limited archaeological evidence for their presence. Furthermore, the earliest secure evidence of horse husbandry comes from the Botai culture of Central Asia, whereas direct evidence for Yamnaya equestrianism remains elusive.

    We investigated the genetic impact of Early Bronze Age migrations into Asia and interpret our findings in relation to the steppe hypothesis and early spread of IE languages. We generated whole-genome shotgun sequence data (~1 to 25 X average coverage) for 74 ancient individuals from Inner Asia and Anatolia, as well as 41 high-coverage present-day genomes from 17 Central Asian ethnicities.

    Model-based admixture proportions for selected ancient and present-day individuals, assuming K = 6, shown with their corresponding geographical locations. Ancient groups are represented by larger admixture plots, with those sequenced in the present work surrounded by black borders and others used for providing context with blue borders. Present-day South Asian groups are represented by smaller admixture plots with dark red borders.

    We show that the population at Botai associated with the earliest evidence for horse husbandry derived from an ancient hunter-gatherer ancestry previously seen in the Upper Paleolithic Mal’ta (MA1) and was deeply diverged from the Western steppe pastoralists. They form part of a previously undescribed west-to-east cline of Holocene prehistoric steppe genetic ancestry in which Botai, Central Asians, and Baikal groups can be modeled with different amounts of Eastern hunter-gatherer (EHG) and Ancient East Asian genetic ancestry represented by Baikal_EN.

    In Anatolia, Bronze Age samples, including from Hittite speaking settlements associated with the first written evidence of IE languages, show genetic continuity with preceding Anatolian Copper Age (CA) samples and have substantial Caucasian hunter-gatherer (CHG)–related ancestry but no evidence of direct steppe admixture.

    In South Asia, we identified at least two distinct waves of admixture from the west, the first occurring from a source related to the Copper Age Namazga farming culture from the southern edge of the steppe, who exhibit both the Iranian and the EHG components found in many contemporary Pakistani and Indian groups from across the subcontinent. The second came from Late Bronze Age steppe sources, with a genetic impact that is more localized in the north and west.

    Our findings reveal that the early spread of Yamnaya Bronze Age pastoralists had limited genetic impact in Anatolia as well as Central and South Asia. As such, the Asian story of Early Bronze Age expansions differs from that of Europe. Intriguingly, we find that direct descendants of Upper Paleolithic hunter-gatherers of Central Asia, now extinct as a separate lineage, survived well into the Bronze Age. These groups likely engaged in early horse domestication as a prey-route transition from hunting to herding, as otherwise seen for reindeer. Our findings further suggest that West Eurasian ancestry entered South Asia before and after, rather than during, the initial expansion of western steppe pastoralists, with the later event consistent with a Late Bronze Age entry of IE languages into South Asia. Finally, the lack of steppe ancestry in samples from Anatolia indicates that the spread of the earliest branch of IE languages into that region was not associated with a major population migration from the steppe.

    I think the wording of the abstract is weird, but consistent with their samples and results, so it is probably just clickbait / citebait for Indian journalists and social networks, or maybe a new attempt to ‘show respect for the sensibilities of Indians’ related to the artificially magnified “AIT vs. OIT” controversy, which is only present in India.

    However, anything is possible, since it is brought to you by the same Danish group who proposed the Yamnaya ancestral component™, the CHG = Indo-European (and simultaneously EHG in Maykop = Anatolian??), and now also the CWC/R1a = Indo-European & Volosovo = Uralic.

    Narasimhan reacted to this on Twitter but has since deleted the Tweet; it basically questioned the statement that steppe people did not penetrate South Asia.


    Expansion of domesticated goat echoes expansion of early farmers


    New paper (behind paywall) Ancient goat genomes reveal mosaic domestication in the Fertile Crescent, by Daly et al. Science (2018) 361(6397):85-88.

    Interesting excerpts (emphasis mine):

    Thus, our data favor a process of Near Eastern animal domestication that is dispersed in space and time, rather than radiating from a central core (3, 11). This resonates with archaeozoological evidence for disparate early management strategies from early Anatolian, Iranian, and Levantine Neolithic sites (12, 13). Interestingly, our finding of divergent goat genomes within the Neolithic echoes genetic investigation of early farmers. Northwestern Anatolian and Iranian human Neolithic genomes are also divergent (14–16), which suggests the sharing of techniques rather than large-scale migrations of populations across Southwest Asia in the period of early domestication. Several crop plants also show evidence of parallel domestication processes in the region (17).

    PCA affinity (Fig. 2), supported by qpGraph and outgroup f3 analyses, suggests that modern European goats derive from a source close to the western Neolithic; Far Eastern goats derive from early eastern Neolithic domesticates; and African goats have a contribution from the Levant, but in this case with considerable admixture from the other sources (figs. S11, S16, and S17 and tables S26 and 27). The latter may be in part a result of admixture that is discernible in the same analyses extended to ancient genomes within the Fertile Crescent after the Neolithic (figs. S18 and S19 and tables S20, S27, and S31) when the spread of metallurgy and other developments likely resulted in an expansion of inter-regional trade networks and livestock movement.

    Maximum likelihood phylogeny and geographical distributions of ancient mtDNA haplogroups. (A) A phylogeny placing ancient whole mtDNA sequences in the context of known haplogroups. Symbols denoting individuals are colored by clade membership; shape indicates archaeological period (see key). Unlabeled nodes are modern bezoar and outgroup sequence (Nubian ibex) added for reference. We define haplogroup T as the sister branch to the West Caucasian tur (9). (B and C) Geographical distributions of haplogroups show early highly structured diversity in the Neolithic period (B) followed by collapse of structure in succeeding periods (C). We delineate the tiled maps at 7250 to 6950 BP, a period bracketing both our earliest Chalcolithic sequence (24, Mianroud) and latest Neolithic (6, Aşağı Pınar). Numbered archaeological sites also include Direkli Cave (8), Abu Ghosh (9), ‘Ain Ghazal (10), and Hovk-1 Cave (11) (table S1) (9).

    Our results imply a domestication process carried out by humans in dispersed, divergent, but communicating communities across the Fertile Crescent who selected animals in early millennia, including for pigmentation, the most visible of domestic traits.