Intense but irregular NWIE and Indo-Iranian contacts show Uralic disintegrated in the West


Open access PhD thesis Indo-Iranian borrowings in Uralic: Critical overview of sound substitutions and distribution criterion, by Sampsa Holopainen, University of Helsinki (2019), under the supervision of Forsberg, Saarikivi, and Kallio.

Interesting excerpts (emphasis mine):

The gap between Russian and Western scholarship

Many scholars in the Soviet Union and later the Russian Federation also have researched this topic over the last five decades. Notably the eminent Eugene Helimski dealt with this topic in several articles: his 1992 article (republished in Helimski 2000) on the emergence of Uralic consonantal stems used Indo-Iranian and other Indo-European loans as key evidence, and it was one of the first serious attempts to stratify the loanwords, paying attention to the non-initial syllables as well. Helimski (1997b) discusses Indo-Iranian loanwords more generally, but it is especially notable for the introduction of the “Andronovo Aryan” idea: Helimski argues that some loanwords in Ob-Ugric and Permic are derived from an unattested, third branch of Indo-Iranian. Helimski’s idea has been supported by at least Mikhail Zhivlov in a 2013 article, but otherwise it has not received wide acceptance. Helimski was also known for his criticism (see especially Helimski 2001) of Jorma Koivulehto’s etymological work: although the main targets of Helimski’s criticism were Koivulehto’s writings on Proto-Indo-European and Germanic borrowings (which fitted poorly with Helimski’s ideas of the Nostratic roots of Proto-Uralic and his other theories on Uralic linguistic prehistory), also some of his Indo-Iranian ideas received unnecessarily sharp criticism in Helimski (2001).

Vladimir Napol’skikh is another important Russian scholar who has written on several occasions about Indo-Iranian–Uralic contacts. His 2014 article is notable for its criticism of Helimski’s Andronovo Aryan theory and his arguments in favour of Indo-Aryan loanwords. Napol’skikh also considered some of the traditional Indo-Iranian loanwords to be borrowings from Tocharian (see below) in some of his earlier works, an idea which has been criticized by Kallio (2004) and Widmer (2002) and which Napol’skikh himself has since dropped in later publications (2010, 2014), where many of these alleged Tocharian loans are again considered Indo-Iranian.

One of the main characteristics of Russian research is that the earliest Indo-European loanwords are usually considered to represent an inheritance from the Nostratic proto-language (Helimski [2001]; Kassian, Zhivlov & Starostin [2015]), an idea which is not widely accepted by scholars of Uralic in the West. Although this often does not concern the Indo-Iranian loanwords at all, or it concerns only a part of them, the works of Jorma Koivulehto, who dealt with both earlier Indo-European and Indo-Iranian loans, receive so much criticism from the Russian scholars that his important ideas are often totally rejected or left unmentioned in Russian research.

This kind of rejection of central etymological research literature can be considered one of the most pressing problems in Uralic loanword studies, and it leaves a regrettable gap between Russian and Western European scholars in this perspective.


Semantics

Among the Indo-Iranian loanwords in Uralic, one can easily mention examples that follow the classification of semantic change as described above. For widening or generalization, vasara ‘hammer’ is a good example: the Indo-Iranian original denotes ‘the weapon of the god Indra’ in Indic and ‘the weapon of the god Mithra’ in Avestan, whereas Finnish ‘hammer’ (and the Mordvin meaning ‘axe’) are more general meanings of tools. Fi huhta is a good example of narrowing: Iranian *tsuxta- means simply ‘burned’, whereas in Finnic huhta means specifically ‘a burned patch used in slash-and-burn agriculture’. Metonymy has taken place in Mordvin, where čuvto denotes simply ‘tree’; this probably developed through the meaning ‘wood burned for agriculture’. Khanty (South) wǟrəs denotes ‘horse’s mane’, but its Iranian original probably had a more general meaning of hair (cf. Avestan varəsa- ‘hair of human and animal, mostly hair of the head’).

An interesting example of degeneration is the etymology of Finnic orja ‘slave’, probably borrowed from the Indo-Iranian ethnonym *(H)ārya- ‘Aryan’ (for the original semantics of this word, see the entry *orja in Chapter 2). A similar development is seen in English slave which is etymologically connected to the ethnonym Slav.

Distribution as a criterion in the dating of loanwords

(…) some of the Indo-Iranian loans seem to have a wide distribution, but upon a closer look it becomes clear that they include phonological irregularities, which can only be explained by assuming that they are parallel loans. The ability to recognize parallel borrowings is extremely important in Uralic loanword studies, and it has been developed with success in the research of Germanic and Baltic loanwords (see Junttila 2015).

Interestingly, K. Häkkinen (1983: 207) argues that although words disappear from languages, the most basic words often remain stable and are maintained for longer periods. Although this is probably true, here the notion of “basicness” is something that is open to different interpretations. Many central concepts in culture and livelihoods are often described with prestige words that are borrowed, and these central words can be very easily replaced. In determining the age of the loanwords one has to always keep in mind that a reflex of a very early cultural borrowing from Indo-Iranian to Proto-Uralic/Proto-West Uralic etc. can easily have been lost in some daughter language, if a later prestige loan for the same concept has been borrowed from some later contact language (such as from some form of Germanic or Baltic into Finnic or from some Turkic language into Udmurt, Mari or Mordvin).

In Uralic linguistics the common loanword layers shared by some intermediary proto-language have often been seen as giving support to the reconstruction of these stages, but K. Häkkinen (100–108) considers this problematic. It should also be noted that the distribution of Indo-Iranian loanwords very rarely matches the assumed taxonomic divisions: there are some loanwords confined to the Finno-Permic, Finno-Volgaic or Ugric languages, but very few loanwords that would be Finno-Permic, Finno-Volgaic or Ugric in the way that the word is found in all the languages that belong to the branch.

Consonants

Laryngeals

There are only very few possible examples of a consonantal substitution of the word-initial laryngeal. It seems probable that the word-initial laryngeal, if it was retained, was not substituted in any way in Uralic. *karšV (> Fi karhu), an uncertain etymology, is the only possible example.

(…) Even if *k was a result of laryngeal hardening, the development would probably be earlier than Proto-Indo-Iranian, meaning that by the time the word was borrowed, the Indo-Iranian word simply had the stop *k that was regularly substituted by Uralic *k.

Evidence for Andronovo Aryan and Indo-Aryan loanwords?

None of the loanwords have to be considered as Andronovo Aryan or Proto-Indo-Aryan based on the criteria that were presented in the Introduction. The Uralic palatal affricate *ć or sibilant *ś can in all cases be explained from Proto-Indo-Iranian *ć, and there is no need to assume that it should reflect Andronovo Aryan *ć or PIA *ś. In the etymological material of this study, no further positive evidence was found for the distinction of PU *ś and *ć as substitutions of the Proto-Indo-Iranian affricates. This means that at least in word-initial position there probably was no difference between *ć and *ś, and even though we do not know what this sound was phonetically, it is safe to assume that Uralic words showing *ś reflect a sound substitution of Indo-Iranian *ć and *Ʒ́.

Regarding the distribution of the etymologies within Indo-Iranian, all the loanwords which cannot be from Iranian because of the lack of attested Iranian cognates have a more or less secure Proto-Indo-Iranian etymology, and nothing prevents us from assuming that these words reflect Proto-Indo-Iranian borrowings. It is also possible that some words with solid Proto-Indo-Iranian etymologies were present in Iranian but were lost before the first Old Iranian texts were composed.


List of Indo-European and Indo-Iranian Etymologies

Pre-Indo-Iranian

*ertä ‘side’, *kekrä ‘wheel’, *kečrä ‘spindle’, *mekši ‘bee’, (*meti ‘honey’), *ońća ‘part’, (*orpa ‘orphan’), *peijas ‘feast’, *pejmä ‘milk’, Pre-P *pertä ‘wing’, *repä ‘fox’, *rećmä ‘rope’, *sejti ‘bridge’

Proto-Indo-Iranian

*aćtara ‘whip’, *anti/onta, *ora ‘awl’, *orja ‘slave; south’, (*orpa ‘orphan’), *pośi ‘penis’, *śaŋka ‘handle’, Pre-Md *śaγa ‘goat’, *śarwi ‘horn’, *śaδa- ‘to rain’, *śara- ‘shit’, *śi̮ta ‘hundred’, Pre-P *śVta ‘hundred’, *śasra ‘thousand’, *śišta ‘wax’, *śoma- ‘sad’, *waćara ‘hammer’, *woraći ‘boar’

Ambiguous early loans (can be either from PII or PI)

*ajša ‘shaft’, *asVra ‘lord’, *iha ‘yearning, passion’, *ihta ‘lust’, *jama ‘twin’, *jawi/jowa (> Mo juv) ‘awn’, *jawi (> PS *jäə̑) ‘flour’, *ji̮ni ‘way, path’, *juma ‘god’, *kana- ‘to dig’, *kara- ‘to dig’, *kata- ‘to graze’, *kertä- ‘to bind’, *ki̮ntaw ‘tree stump’, *kürtńV ‘iron’, PKh *kǟrtV ‘iron’, *kärtä ‘iron’, *martas ‘dead’, *ńātV- ‘to help’, *pakas ‘god’, *para ‘good’, Kh pĕnt ‘way’, PMs *pē̮ńtV ‘brother-in-law’, *pora ‘old’, *poči- ‘to boil’, Pre-P *porta ‘vessel’, *puntaksi ‘bottom’, Pre-Ma *pänti- ‘to bind’, PMa *pärća ‘ear of corn’, *pätäri- ‘to flee’, *saγi- ‘to get, obtain’, *sampas ‘pillar’, *saŋka ‘old’, *sara ‘lake’, *sasara ‘sister’, *säptä ‘seven’, *tajwas ‘sky’, *takra ‘piece of flesh’, *tarna ‘grass’, *tojwV ‘wish’, *toraksi ‘through’, *tora- ‘to fight’, *täjV ‘milk’, *täjinV ‘cow’, *täši, *uška ‘bull’, *wakša- (> PS *wåtå-) ‘to grow’, *wajna- ‘to see’, *wojna- ‘to see’, *wiša ‘venom’, *wi̮rna ‘wool’, *wärkä ‘kidney’, PS *wǝ̑rkǝ̑ ‘wolf’, *wirtV- ‘to hold, raise’, *äŋkärä ‘coal’

List of uncertain Indo-Iranian etymologies

PFi *aiwa (← Germanic ?), Ma *arša ‘mane’, PMs *ǟrV ‘fire’, *aštira ‘barren earth’, POUg *ćakV ‘hammer’, *ćara- ‘brown; ? to dawn’, *ćero ‘hill-top’, *ćerti ‘group’, *itä- ‘to appear’, Pre-Fi *karšV ‘bear’, PMs *kīrV ‘iron’, *kota ‘chum’, Pre-Sa *kupa ‘pit’, PFi *kärsä ‘snout’, *maksa- ‘to pay’, PFi *mana-, PUg ? *mańći, Ma marij ‘Mari; man; husband’, *mē̮ja ‘wedding’, *mykkä ‘dumb’, PP *oč ‘corn’, *orpV ‘relative’, PFi *paksu ‘thick’, *peji- ‘to milk’, *pi̮ŋka ‘psychedelic mushroom’, POUg *porV ‘phratry’, Pre-Sa *poti ‘against’, Pre-Fi *šatas ‘germ’, *sentü- ‘to be born’, *šerä- ‘to wake up’, Ms šVšwǝŋ ‘hare’, PUg *śeŋkV ‘nail’, Pre-Sa *soma/sami ‘some’, PP *sur ‘beer’, PFi *süte- ‘to hit’ (< ? *sewči-), Hu szekér ‘wagon’, Kh ʌīkər ‘Narte’, PUg *taja- ‘secret’, Pre-Fi *terni ‘young’, *terwV ‘healthy’, ? *towkV ‘spring’, PWU *utarV ‘udder’ (← Germanic ?; Mari *waδar ← II), *waŋka ‘hook’, Mo E v́eŕges, M vərǵas ‘wolf’

Etymologies that were probably borrowed from another Indo-European source (PIE, PBSl, Germanic, Baltic)

*aisa ‘shaft’ ← Balto-Slavic, PFi *aiwa (← Germanic ?), *apV ‘help’ ← Germanic, *jewä ‘grain’ ← Balto-Slavic, Ma karaš etc. ‘honeycomb’ ← Baltic, (*meti ‘honey’ ← ? PIE,) Fi *ojas ‘shaft’ ← Slavic, *ola ← Baltic, *oŋki ← Germanic, *porćas ← Balto-Slavic, Pre-Sa *porta ‘vessel’ ← Germanic, *salV ‘salt’ (cannot be reconstructed for PU, various later parallel loans), *śi̮lkaw ← Balto-Slavic, *sammu- ← Germanic, *śuka ← Balto-Slavic, Mari *šŭžar ← Baltic/Balto-Slavic or Slavic, *tejniš ‘pregnant animal’ ← Baltic/Balto-Slavic, PWU *utarV ‘udder’ (? ← Germanic)

Early loans into differentiated branches

Proto-West Uralic

Only in Finnic:

*aćnas ‘voracious’, *iha ‘wish’, *ihta ‘lust’, PFi *isV ‘appetite’, *martas ‘dead’, *očra ‘barley’, *peijas ‘feast’, *pejmä ‘milk’, *pe̮rna ‘spleen’, *sampas ‘pillar’, *sooja ‘shelter’, *tajwas ‘sky’, *takra ‘piece of flesh’, *terwV ‘healthy’, *tojwV ‘wish’

All of these words, with the exception of *sooja ‘shelter’, were clearly borrowed into Early Proto-Finnic (Pre-Finnic) at the latest. Formally most of the loans could be from PII or PI.

Only in Saami:

*kata- ‘to graze’, *kertä- ‘to bind’, *pora ‘old’, *wojna- ‘to see’

All of the loans were acquired before the Saami vowel changes. Formally all could be either from Proto-Indo-Iranian or Proto-Iranian.

Only in Finnic and Saami:

*asma ‘voracious’, *jama ‘twin’, *kekrä ‘wheel’, *mača ‘insect’

Of these, *mača is from Proto-Iranian and *jama is ambiguous. As the -sm- in *asma does not point to Proto-Indo-Iranian *ć, this is probably an Iranian loan too. It is possible that these words were borrowed into Proto-West Uralic, as there is no general support for a Finno-Saamic proto-language today. As the cognates within Finnic and Saami are regular, there is no need to assume parallel borrowings. *kekrä has to be from Proto-Indo-Iranian.

NOTE. Based on the discussion of stages of borrowing from Indo-Iranian, and of the distribution of *kekrä among Uralic dialects in particular, Holopainen probably means Pre-Indo-Iranian for this example.

Only in Mordvin and/or Finnic and/or Saami (can point to a borrowing into Proto-West Uralic):

*ji̮ni ‘way’, *kečrä ‘spindle’, *rećmä ‘rope’, *śaŋka, *waćara ‘hammer’, *warsa ‘foal’, *wasa ‘calf’, *woraći ‘pig’

Based on phonological criteria, these loans do not form a chronologically coherent layer, but probably their modern distribution is accidental (their original distribution can have been wider). *kečrä ‘spindle’ and *rećmä ‘rope’ are from Pre-II, *śaŋka, *waćara and *woraći from PII, *warsa and *wasa from later Iranian (Alanic). *ji̮ni is ambiguous. Also the loans confined to Finnic and Saami mentioned above probably were borrowed into Proto-West Uralic, as it is a more convincing taxonomic entity than Proto-Finno-Saamic.

Proto-Mari-Permic

Only in Mordvin, Finnic and/or Saami and Mari

*juma ‘god’

This loan can be either from PII or PI. As it is obvious that these four branches do not form any taxonomical entity (Salminen 2002; J. Häkkinen 2009), it is only logical that there are no other loanwords with a “Finno-Volgaic” distribution.

Only in Mari:

*kVrtnV ‘metal’ (← PII, PI or later), Pre-Ma *pänti- ‘to bind’, PMa *pärća ‘ear of corn’, *si̮rńa ‘gold’ (← Old Iranian)

Only very few early Indo-Iranian loans can be found in Mari and in no other Uralic language. It is unclear what the reason for this is. It is, of course, possible that some uncertain loanwords like marij ‘man; Mari’ turn out to be correct after all, but even that does not make the number of loans in Mari very high. The situation has to be explained either with loss of vocabulary and replacement by later loans (from Turkic, and also perhaps from Permic) or with Mari’s location on the periphery at the time of the later contacts with the Iranian languages. Agyagási (2019: 254–258) argues that the current area where Mari is spoken was formed only relatively late, after the Mongol invasion in the High Middle Ages. If this is indeed correct, and Mari was spoken in more northern areas before that, it can be assumed that Pre-Mari had only sporadic contacts with the Iranian languages after it split off from Proto-Uralic.

Only in Permic (early loans; for later loans confined to Permic)

*a(č)wa ‘stallion’, PP *ju ‘awn’, *kertä ‘house’, *kärtä ‘metal’, *kada- ~ *gada- ‘to steal’, *karka ‘chicken’, *parśa ~ *barśa ‘mane’, *parta ‘knife’, *pertä ‘wing’, *poči- ‘to boil’, *porta ‘vessel’, *dura ‘long’, *domV ‘to tame’, PP *śumi̮s ‘band’, PP *šud ‘luck’, *uška ‘bull’, *wi̮rna ‘wool’, *wirä ‘man, husband’, *äŋkärä ‘coal’

The number of loanwords in Permic is relatively high, and many of these can be considered to be Iranian loanwords. Technically many loans are ambiguous, but some of the words were borrowed late due to historical reasons (‘iron’), and some were borrowed into a Pre-Permic which already had a phonological system that was different from Proto-Uralic (*šud- has d which cannot reflect PU *δ).

It is probable that the Permic languages were in continuous contact with the Indo-Iranian languages from the time they split from Proto-Uralic until the early mediaeval era.

Proto-Ugro-Samoyedic

Only in Khanty and Mansi (regular cases):

POUg *ēräɣ ‘song’, POUg *eträ ‘clear sky’, POUg *mɔ̈ŋki ‘forest-spirit’, *ńātV- ‘to help’, *päčäɣ ‘reindeer’

The number of these etymologies is so low that it is very difficult to determine whether these words were borrowed into Proto-Ob-Ugric or some earlier proto-language, such as Proto-Ugric.

Only in Khanty and/or Mansi and/or Hungarian (regular cases):

*säptä ‘seven’ (Khanty + Hungarian regular), *sara ‘lake’

There are so few convincing loanwords with a “Ugric” distribution that they provide very little evidence. Either of these loans could be from Proto-Indo-Iranian or Proto-Iranian, if we assume that *s > *h was a common Iranian sound change. Both loans were acquired

Only in Samoyed:

*jäwi (> PS *jäə̑), PS *pulə̑ ~ *pi̮lə̑ ‘bridge’, *täjki ‘spear’, PS *wǝ̑rkə̑ ‘wolf’, Pre-S *täši (> PS *tät), *wakša- (> PS *wåtå) ‘to grow’

Of these, only *wåtå- has to be a very early loan because of *s > *t. *jäwi (> PS *jäə̑) and PS *wə̑rkə̑ were possibly acquired before the Proto-Samoyed vowel developments, making them probably early loanwords too. Formally all of them could be either from PII or PI. *pulə̑ ~ *pi̮lə̑ could have been borrowed into Proto-Samoyed (with Iranian *u corresponding to Samoyed *u), and because of the *l the word is probably from a relatively late, Middle Iranian language.

The following loanwords have a distribution with a cognate in both Samoyed and some other branch:

*śaδa- ‘to rain’, *tora- ‘to fight’ (also *itä-, which is more uncertain, belongs here)

Pan-Uralic loans

The following loanwords have a distribution with regular cognates with at least one Ugric branch and some other branch, which points to early borrowing. Although formally *kana- and *kara- are ambiguous, they are probably from Proto-Indo-Iranian because of their distribution. The rest of the loans are from Pre-II or PII.

*kana- ‘to dig’, *kara- ‘to dig’, *meti ‘honey’, *mekši ‘bee’, *orpV ‘orphan’, *ora ‘awl’, *peji- ‘to milk’, *pätäri- ‘to flee’, *śara- ‘shit’, *śoma- ‘sad’

The following loanwords are found in at least two non-adjacent branches of Uralic (the ones listed in the above categories are not counted). As there are no widely accepted criteria for a word to be considered “Uralic”, all of these could be considered loanwords into Proto-Uralic, in this case probably from Proto-Indo-Iranian or Pre-Indo-Iranian.

*ajša ‘shaft’, *anti/onta ‘grass’, *ertä ‘side’, *ki̮ntaw ‘tree stump’, *mertä ‘human’, *orja ‘slave’, *para ‘good’, *počaw ‘reindeer’, *puntaksi ‘bottom’, *saγi- ‘to get, obtain’, *repä ‘fox’, *si̮ŋka ‘old’, *sasara ‘sister’, *sejti ‘bridge’, *śišta ‘wax’, *tarna ‘grass’, *toraksi ‘through’, *wiša ‘venom’
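
The distribution criteria used in these lists can be written down as a small decision rule. What follows is only a rough sketch, not Holopainen's procedure: the west-to-east ordering of branches, the definition of "non-adjacent" as more than one position apart in that ordering, and the function name `classify_distribution` are all my own assumptions for illustration, and the inputs are placeholder sets, not real cognate data.

```python
# Assumed west-to-east ordering of Uralic branches (illustrative only,
# not taken from the thesis).
BRANCHES = ["Saami", "Finnic", "Mordvin", "Mari", "Permic",
            "Hungarian", "Mansi", "Khanty", "Samoyed"]
UGRIC = {"Hungarian", "Mansi", "Khanty"}


def classify_distribution(branches):
    """Return a coarse distribution label for the branches with regular cognates."""
    found = set(branches)
    if not found or not found <= set(BRANCHES):
        raise ValueError("expected a non-empty set of known branch names")
    if len(found) == 1:
        return "single branch"
    if found <= UGRIC:
        return "Ugric only"
    if found & UGRIC:
        # At least one Ugric branch plus some other branch.
        return "Ugric + other branch"
    # Hypothetical adjacency test: positions more than one step apart.
    positions = [BRANCHES.index(b) for b in found]
    if max(positions) - min(positions) > 1:
        return "two non-adjacent branches"
    return "adjacent branches only"


# Placeholder inputs, not attested distributions of any particular word.
print(classify_distribution({"Finnic", "Khanty"}))
print(classify_distribution({"Saami", "Permic"}))
```

A rule like this obviously ignores the phonological side (regular vs. irregular correspondences, parallel loans), which is where most of the actual work in the thesis lies.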


Discussion about the distribution and its impact on Uralic taxonomy

(…) there are Proto-Iranian loanwords which were borrowed simultaneously into several early branches of Uralic, making it likely that Uralic had split into several branches by the time of these contacts.

Also the fact that many of the Proto-Indo-Iranian loanwords either show a restricted distribution (such as West Uralic *waćara, *woraći) or irregular correspondences (*asVra, *śasra, *śi̮ta) can point to the conclusion that Proto-Uralic was fragmenting by the time when contacts with Proto-Indo-Iranian took place.

The earlier, Pre-Indo-Iranian loanwords usually show a wider distribution and regular sound correspondences. Although the number of these earliest loans is quite small, based on their distribution and regular correspondences it can be assumed that the Pre-Indo-Iranian stage (after RUKI, *l > *r and the merger of velars and labiovelars but before the merger of non-high vowels) was concurrent with Proto-Uralic, with the changes leading to Proto-Indo-Iranian happening after the dispersal of Proto-Uralic.

The distribution of loanwords reinforces the old idea that Samoyed is a lexical outlier, as only a few convincing Indo-Iranian etymologies for Proto-Uralic words (*śaδa- ‘to rain’, *tora- ‘to fight’) have a convincing reflex in Samoyed. However, the fact that such etymologies exist means rather that the situation is due to lexical loss in Samoyed, and that the earliest contact occurred before Samoyed split off from Proto-Uralic.

There are very few loanwords that have a Ugric distribution (being found in at least one Ob-Ugric branch and Hungarian), and likewise rather few in Ob-Ugric. The few loans that have a distribution confined to Ugric were borrowed before the change *s > *θ took place. This means that the Ugric distribution does not mean much from the point of view of chronology or taxonomy, as the words were borrowed into a language that was still identical to Proto-Uralic. Even some loans borrowed into Khanty and Mansi have to be so early.

Impacts on dating and the location of the contact zones

Because of the very limited number of convincing etymologies found only in Finnic or Saami, it is probable that there were not (extensive) contacts with Pre-Finnic or Pre-Saami after the split of Proto-West Uralic.

The great number of loanwords of varying ages in Permic inevitably points to the conclusion that the pre-form of the Permic branch had been constantly spoken in an area that was adjacent to the Iranian languages. The different layers of loanwords in Permic clearly point to chronological differences in the donor languages, but it also seems that Permic was in contact with various forms of Iranian and not with different diachronic stages of the same language.

In general, the words that have been borrowed are typical cultural words, and the contacts between Indo-Iranian and Uralic seem to have been a typical contact situation in which a culturally less-advanced language group borrows various cultural terms from a more “advanced” group. The words in various loanword layers related to horse and cattle breeding show obvious cultural influence in the field of domesticated animals, and the borrowing of some names of grains points to agricultural influence from the Indo-Iranians on the speakers of Uralic.

Needless to say, many of the borrowings I listed in A Song of Sheep and Horses suffer from the same ailment attributed to Indo-Europeanists in general:

With slight exaggeration one can agree with the remark by Koivulehto (1999a: 209–210) that the Indo-Europeanists often use outdated sources or are simply uninterested in the topic. The problem is further complicated by the various and often obsolete views expressed in even relatively modern Uralicist works, such as those of Rédei (1986c; 1988) or Katz (2003); (…) Mallory & Adams (2006) adequately refer to the importance of the early loanwords, but they use mostly Rédei’s outdated reconstructions and stratigraphy in support of their theories.

I need to review all related texts with this thesis and the works recently published by Kümmel, as well as the recent book of the Leiden school on Indo-Uralic.

Also, does anyone know the (traditional?) reason for the resistance to the Indo-Uralic concept among Uralicists? Maybe it’s a reaction against the Nostraticist and Siberian views of Uralic espoused by the Soviets?


Yamnaya replaced Europeans, but admixed heavily as they spread to Asia


Recent papers The formation of human populations in South and Central Asia, by Narasimhan, Patterson et al. Science (2019) and An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers, by Shinde et al. Cell (2019).

NOTE. For direct access to Narasimhan, Patterson et al. (2019), visit this link courtesy of the first author and the Reich Lab.

I am currently not on holidays anymore, and the information in the paper is huge, with many complex issues raised by the new samples and analyses rather than solved, so I will stick to the Indo-European question, especially to some details that have changed since the publication of the preprint. For a summary of its previous findings, see the book series A Song of Sheep and Horses, in particular the sections from A Clash of Chiefs where I discuss languages and regions related to Central and South Asia.

I have updated the maps of the Prehistory Atlas, and included the most recently reported mtDNA and Y-DNA subclades. I will try to update the Eurasian PCA and related graphics, too.

NOTE. Many subclades from this paper have been reported by Kolgeh (download), Pribislav and Principe at Anthrogenica on this thread. I have checked some out for comparison, but if my results contradicted their analyses, mine would be the wrong ones. I will upload my spreadsheets and link to them from this page whenever I find the time.

Ancestry clines (1) before and (2) after the advent of farming. Colour modified from the original to emphasize the CHG cline: notice the apparent relevance of forest-steppe groups in the formation of this CHG mating network from which Pre-Yamnaya peoples emerged.

Indo-Europeans

I think the Narasimhan, Patterson et al. (2019) paper is well-balanced, and unexpectedly centered – as it should be – on the spread of Yamnaya-related ancestry (now Western_Steppe_EMBA) as the marker of Proto-Indo-European migrations, which stretched ca. 3000 BC “from Hungary in the west to the Altai mountains in the east”, spreading later Indo-European dialects after admixing with local groups, from the Atlantic to South Asia.

I. Afanasievo

I.1. East or West PIE?

I expected Afanasievo to show (1) R1b-L23(xZ2103, xL51) and (2) R1b-L51 lineages, apart from (3) the known R1b-Z2103 ones, pointing thus to an ancestral PIE community before the typical Yamnaya bottlenecks, and with R1b-L51 supporting a connection with North-West Indo-European. The presence of some samples of hg. Q pointed in this direction, too.

However, Afanasievo samples show overwhelmingly R1b-Z2103 subclades (all except for those with low coverage), all apparently under R1b-Z2108 (formed ca. 3500 BC, TMRCA ca. 3500 BC), like most samples from East Yamnaya.

This necessarily shifts the split and spread of R1b-L23 lineages to Khvalynsk/early Repin-related expansions, in line with what TMRCA suggested, and what advances by Anthony (2019) and Khokhlov (2018) on future samples from the Reich Lab suggest.

Given the almost indistinguishable ancestry between Afanasievo and Early Yamnaya, there seems to be as yet little information in population genomics to support the idea that Pre-Tocharians were more closely related to North-West Indo-Europeans than to Graeco-Aryans, as is proposed in linguistics based on the few shared traits between them, and the lack of innovations proper to the Graeco-Aryan community.

NOTE. A new issue of Wekʷos contains an abstract from a relevant paper by Blažek on vocabulary for ‘word’, including the common NWIE *wrdʰo-/wordʰo-, but also a new (for me, at least) Northern Indo-European one: *rēki-/*rēkoi̯-, shared by Slavic and Tocharian.

The fact that bottlenecks happened around the time of the late Repin expansion suggests that we might be able to see different clans based on the predominant lineages developing around the Don-Volga area in the 4th millennium BC. The finding of Pre-R1b-L51 in Lopatino (see below), and of a Catacomb sample of hg. R1b-Z2103(Z2105-) in the North Caucasus steppe near Novoaleksandrovskij also support a star-like phylogeny of R1b-L23 stemming from the Don-Volga area.

NOTE. Interestingly, a dismissal of a common trunk between Tocharian and North-West Indo-European would mean that shared similarities between such disparate groups could be traced back to a Common Late PIE trunk, and not to a shared (western) Repin community. For an example of such a ‘pure’ East-West dialectal division, see the diagram of Adams & Mallory (2007) at the end of the post. It would thus mean a fatal blow to Kortlandt’s Indo-Slavonic group among other hypothetical groupings (remade versions of the ancient Centum-Satem division), as well as to certain assumptions about laryngeal survival or tritectalism that usually accompany them. Still, I don’t think this is the case, so the question will remain a linguistic one, and maybe, with a sufficient number of samples, some similarities will be found that differentiate Northern Indo-Europeans from the East Yamna/Catacomb-Poltavka-Balkan_EBA group.

Y-chromosome haplogroups of Afanasievo samples and neighbouring groups. See full maps.

I.2. Expansion or resurgence of hg. Q1b?

Haplogroup Q1b-Y6802(xY6798) seems to be the main lineage that expanded with Afanasievo, or resurged in their territory. It’s difficult to tell, because the three available samples are family, and belong to a later period.

NOTE. I have finally put some order to the chaos of Q1a vs. Q1b subclades in my spreadsheet and in the maps. The change from ISOGG 2016 to 2017 meant that many samples reported as belonging to Q1 subclades in papers prepared during the 2017-2018 period, and which did not provide specific SNP calls, were impossible to define with certainty. By checking some of them I could determine the specific standard used.

In favour of the presence of this haplogroup in the Pre-Yamnaya community are:

  • The statement by Anthony (2019) that Q1a [hence maybe Q1b in the new ISOGG nomenclature] represented a significant minority among an R1b-rich community.
  • The sample found in a Sintashta WSHG outlier (see below), of hg. Q1b-Y6798, and the sample from Lola, of hg. Q1b-L717, are thus from other lineage(s) separated thousands of years from the Afanasievo subclade, but might be related to the Khvalynsk expansion, like R1b-V1636 and R1b-M269 are.

These are the data that suggest multiple resurgence events in Afanasievo, rather than expanding Q1b lineages with late Repin:

  • Overwhelming presence of R1b in early Yamnaya and Afanasievo samples; one Q1(xQ1b) sample reported in Khvalynsk.
  • The three Q1b samples appear only later, although wide CI for radiocarbon dates, different sites, and indistinguishable ancestry may preclude a proper interpretation of the only available family.
    • Nevertheless, ancestry seems unimportant in the case of Afanasievo, since the same ancestry is found up to the Iron Age in a community of varied haplogroups.
  • Another sample of hg. Q1b-Y6802(xY6798) is found in Aigyrzhal_BA (ca. 2120 BC), with Central_Steppe_EMBA (WSHG-related) ancestry; however, this clade formed and expanded ca. 14000 BC.
  • The whole Altai – Baikal area seems to be a Q1b-L54 hotspot, although admittedly many subclades separated very early from each other, so they might be found throughout North Eurasia during the Neolithic.
  • One Afanasievo sample is reported as of hg. C in Shin (2017), and the same haplogroup is reported by Hollard (2014) for the only available sample of early Chemurchek to date, from Kulala ula, North Altai (ca. 2400 BC).
Y-chromosome haplogroups of late Afanasievo – early Chemurchek samples and neighbouring groups. See full maps.

I.3. Agricultural substrate

Evidence of continuous contacts of Central_Steppe_MLBA populations with BMAC from ca. 2100 BC on – visible in the appearance of Steppe ancestry among BMAC samples and BMAC ancestry among Steppe pastoralists – supports the close interaction between Indo-Iranian pastoralists and BMAC agriculturalists as the origin of the Asian agricultural substrate found in Proto-Indo-Iranian, hence likely related to the language of the Oxus Civilization.

Similar to the European agricultural substrate adopted by West Yamnaya settlers (both NWIE and Palaeo-Balkan speakers), Tocharian shows a few substrate terms in common with Indo-Iranian, which can be explained by contacts in different dialectal stages through phonetic reconstruction alone.

The recent Hermes et al. (2019) supports the early integration of pastoralism and millet cultivation in Central Asia (ca. 2700 BC or earlier), with the spread of agriculture to the north – through the Inner Asian Mountain Corridor – being thus unrelated to the Indo-Iranian expansions, which might support independent loans.

However, compared to the huge number of parallel shared loans between NWIE and Palaeo-Balkan languages in the European substratum, Indo-Iranians seem to have been the first borrowers of vocabulary from Asian agriculturalists, while Proto-Tocharian shows just one certain related word, with phonetic similarities that warrant an adoption from late Indo-Iranian dialects.

Y-chromosome haplogroups of Sintashta, Central Asia, and neighbouring groups in the Early Bronze Age. See full maps.

The finding of hg. (pre-)R1b-PH155 in a BMAC sample from Dzharkutan (to the west of Xinjiang), together with hg. R1b in a sample from Central Mongolia previously reported by Shin (2017), supports the widespread presence of this lineage to the east and west of Xinjiang, which means it might have become incorporated into the Indo-Iranian migrants entering the Xiaohe horizon, into the Afanasievo-Chemurchek-derived groups, or into the latter from the former. In other words, the Island Biogeography Theory with its explanation of founder effects might be, after all, applicable to the whole Xinjiang area, not only during the Chemurchek – Tianshan-Beilu – Xiaohe interaction.

Of course, there is no need for too complicated models of haplogroup resurgence events in Central and South Asia, seeing how the total amount of hg. R1a-L657 (today prevalent among Indo-Aryan speakers from South Asia) among ancient Western/Central_Steppe_MLBA-related samples amounts to a total of 0, and that many different lineages survived in the region. Similar cases of haplogroup resurgence and Y-DNA bottleneck events are also found in the Central and Eastern Mediterranean, and in North-Eastern Europe. From the paper:

[It] could reflect stronger ecological or cultural barriers to the spread of people in South Asia than in Europe, allowing the previously established groups more time to adapt and mix with incoming groups. A second difference is the smaller proportion of Steppe pastoralist-related ancestry in South Asia compared with Europe, its later arrival by ~500 to 1000 years, and a lower (albeit still significant) male sex bias in the admixture (…).

Y-chromosome haplogroups of samples from the Srubna-Andronovo and Andronovo-related horizon, Xiaohe, late BMAC, and neighbouring groups. See full maps.

II. R1b-Beakers replaced R1a-CWC peoples

II.1. R1a-M417-rich Corded Ware

Newly reported Corded Ware samples from Radovesice show hg. R1a-M417, at least some of them xZ645, ‘archaic’ lineages shared with the early Bergrheinfeld sample (ca. 2650 BC) and with the coeval Esperstedt family, hence supporting that it eventually became the typical Western Corded Ware lineage(s), probably dominating over the so-called A-horizon and the Single Grave culture in particular. On the other hand, R1a-Z645 was typical of bottlenecks among expanding Eastern Corded Ware groups.

Interestingly, it is supported once again that known bottlenecks under hg. R1a-M417 happened during the Corded Ware expansion, evidenced also by the remarkably high variability of male lineages among early Corded Ware samples. Similarly, these Corded Ware samples from Bohemia form part of the typical ‘Central European’ cluster in the PCA, which excludes once again not only the ‘official’ Esperstedt outlier I1540, but also the known outlier with Yamnaya ancestry.

NOTE. The fact that Esperstedt is closely related geographically and in terms of ancestry to later Únětice samples further complicates the assumption that Únětice is a mixture of Bell Beakers and Corded Ware, being rather an admixture of incoming Bell Beakers with post-Yamnaya vanguard settlers who admixed with Corded Ware (see more on the expansion of Yamnaya ancestry). In other words, Únětice is rather an admixture of Yamnaya+EEF with Yamnaya+(CWC+EEF).

Y-chromosome haplogroups of samples from Catacomb, Poltavka, Balkan EBA, and Bell Beaker, as well as neighbouring groups. See full maps.

On Ukraine_Eneolithic I6561

If the bottlenecks are as straightforward as they appear, with a star-like phylogeny of R1a-M417 starting with the Pre-Corded Ware expansion, then what is happening with the Alexandria sample, so precisely radiocarbon dated to ca. 4045-3974 BC? The reported hg. R1a-M417 was fully compatible, while R1a-Z645 could be compatible with its date, but the few positive SNPs I got in my analysis point indeed to a potential subclade of R1a-Z94, and I trust more experienced hobbyists in this ‘art’ of ascertaining the SNPs of ancient samples, and they report hg. R1a-Z93 (Z95+, Y26+, Y2-).

Seeing how Y-DNA bottlenecks worked in Yamnaya-Afanasievo and in Corded Ware and related groups, and if this sample really is so deep within R1a-Z93 in a region that should be more strongly affected by the known Neolithic Y-chromosome bottlenecks and forest-steppe ecotone, someone from the lab responsible for this sample should check its date once again, before more people keep chasing their tails with an individual that (based on its derived SNPs’ TMRCA) might actually be dated to the Bronze Age, where it could make much more sense in terms of ancestry and position in the PCA.

EDIT (14 SEP 2019): … and with the fact that he is the first individual to show the genetic adaptation for lactase persistence (13910-T), which is only found later among Bell Beakers, and much later in Sintashta and related Steppe_MLBA peoples (see comments below).

This is also evidenced by the other Ukraine_Eneolithic (likely a late Yamnaya) sample of hg. R1b-Z2103 from Dereivka (ca. 2800 BC) and who – despite being in a similar territory 1,000 years later – shows a wholly diluted Yamnaya ancestry under typically European HG ancestry, even more so than other late Sredni Stog samples from Dereivka of ca. 3600-3400 BC, suggesting a decrease in Steppe ancestry rather than an increase – which is supposedly what should be expected based on the ancestry from Alexandria…

Like the reported Chalcolithic individual of Hajji Firuz who showed an apparently incompatible subclade and Yamnaya ancestry at least some 1,000 years before it should, and turned out to be from the Iron Age (see below), this may be another case of wrong radiocarbon dating.

NOTE. It would be interesting, if this turns out to be another Hajji Firuz-like error, to check how well different ancestry models worked in whose hands exactly, and if anyone actually pointed out that this sample was derived, and not ancestral, to many different samples that were used in combination with it. It would also be a great control to check if those still supporting a Sredni Stog origin for PIE would shift their preference even more to the north or west, depending on where the first “true” R1a-M417 samples popped up. Such a finding now could be thus a great tool to discover whether haplogroup-based bias plays a role in ancestry magic as related to the Indo-European question, i.e. if it really is about “pure statistics”, or there is something else to it…

II.2. R1b-L51-rich Bell Beakers

The overwhelming majority of R1b-L51 lineages in Radovesice during the Bell Beaker period, just after the sampled Corded Ware individuals from the same site, further strengthens the hypothesis of an almost full replacement of R1a-M417 lineages from Central Europe up to southern Scandinavia after the arrival of Bell Beakers.

Yet another R1b-L151* sample has popped up in Central Europe, in the individual classified as Bilina_BA (ca. 2200-800 BC), which clusters with Bell Beakers from Bohemia, with the outlier from Turlojiškė, and with Early Slavs, suggesting once again that a group of central-east European Beakers represented the Pre-Proto-Balto-Slavic community before their spread and admixture events to the east.

The available ancient distribution of R1b-L51*, R1b-L52* or R1b-L151* is getting thus closer to the most likely origin of R1b-L51 in the expansion of East Bell Beakers, who trace their paternal ancestors to Yamnaya settlers from the Carpathian Basin:

NOTE. Some of these are from other sources, and some are samples I have checked in a hurry, so I may have missed some derived SNPs. If you send me a corrected SNP call to dismiss one of these, or more ‘archaic’ samples, I’ll correct the map accordingly. See also maps of modern distribution of R1b-M269 subclades.

Distribution of ‘archaic’ R1b-L51 subclades in ancient samples, overlaid over a map of Yamnaya and Bell Beaker migrations. In blue, Yamnaya Pre-L51 from Lopatino (not shown) and R1b-L52* from BBC Augsburg. In violet, R1b-L51 (xP312,xU106) from BBC Prague and Poland. In maroon, hg. R1b-L151* from BBC Hungary, BA Bohemia, and (not shown) a potential sample from BBC at Mondelange, which is certainly xU106, maybe xP312. Interestingly, the earliest sample of hg. R1b-U106 (a lineage more proper of northern Europe) has been found in a Bell Beaker from Radovesice (ca. 2350 BC), between two of these ‘archaic’ R1b-L51 samples; and a sample possibly of hg. R1b-ZZ11+ (ancestral to DF27 and U152) was found in a Bell Beaker from Quedlinburg, Germany (ca. 2290 BC), to the north-west of Bohemia. The oldest R1b-U152 are logically from Central Europe, too.

III. Proto-Indo-Iranian

Before the emergence of Proto-Indo-Iranian, it seems that Pre-Proto-Indo-Iranian-speaking Poltavka groups were subjected to pressure from Central_Steppe_EMBA-related peoples coming from the (south-?)east, such as those found sampled from Mereke_BA. Their ‘kurgan’ culture was dated correctly to approximately the same date as Poltavka materials, but their ancestry and hg. N2(pre-N2a) – also found in a previous sample from Botai – point to their intrusive nature, and thus to difficulties in the Pre-Proto-Indo-Iranian community to keep control over the previous East Yamnaya territory in the Don-Volga-Ural steppes.

We know that the region does not show genetic continuity with a previous period (or was not under this ‘eastern’ pressure) because of an Eastern Yamnaya sample from the same site (ca. 3100 BC) showing typical Yamnaya ancestry. Before Yamnaya, it is likely that Pre-Yamnaya ancestry formed through admixture of EHG-like Khvalynsk with a North Caspian steppe population similar to the Steppe_Eneolithic samples from the North Caucasus Piedmont (see Anthony 2019), so we can also rule out some intermittent presence of a Botai/Kelteminar-like population in the region during the Khvalynsk period.

It is very likely, then, that this competition for the same territory – coupled with the known harsher climate of the late 3rd millennium BC – led Poltavka herders to their known joint venture with Abashevo chiefs in the formation of the Sintashta-Potapovka-Filatovka community of fortified settlements. Supporting these intense contacts of Poltavka herders with Central Asian populations, late ‘outliers’ from the Volga-Ural region show admixture with typical Central_Steppe_MLBA populations: one in Potapovka (ca. 2220 BC), of hg. R1b-Z2103; and four in the Sintashta_MLBA_o1 cluster (ca. 2050-1650 BC), with two samples of hg. R1b-L23 (one R1b-Z2109), one Q1b-L56(xL53), one Q1b-Y6798.

Outlier analysis reveals ancient contacts between sites. We plot the average of principal component 1 (x axis) and principal component 2 (y axis) for the West Eurasian and All Eurasian PCA plots (…). In the Middle to Late Bronze Age Steppe, we observe, in addition to the Western_Steppe_MLBA and Central_Steppe_MLBA clusters (indistinguishable in this projection), outliers admixed with other ancestries. The BMAC-related admixture in Kazakhstan documents northward gene flow onto the Steppe and confirms the Inner Asian Mountain Corridor as a conduit for movement of people.

Similar to how the Sintashta_MLBA_o2 cluster shows an admixture with central steppe populations and hg. R1a-Z645, the WSHG ancestry in those outliers from the o1 cluster of typically (or potentially) Yamnaya lineages shows that Poltavka-like herders survived well after centuries of Abashevo-Poltavka coexistence and admixture events, supporting the formation of a Proto-Indo-Iranian community from the local language as pronounced by the incomers, who dominated as elites over the fortified settlements.

The Proto-Indo-Iranian community likely formed thus in situ in the Don-Volga-Ural region, from the admixture of locals of Yamnaya ancestry with incomers of Corded Ware ancestry – represented by the ca. 67% Yamnaya-like ancestry and ca. 33% ancestry from the European cline. Their community formed thus ca. 1,000 years later than the expansion of Late PIE ca. 3500 BC, and expanded (some 500 years after that) a full-fledged Proto-Indo-Iranian language with the Srubna-Andronovo horizon, further admixing with ca. 9% of Central_Steppe_EMBA (WSHG-related) ancestry in their migration through Central Asia, as reported in the paper.
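The proportions and dates in the paragraph above can be checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes that the later ca. 9% Central_Steppe_EMBA contribution dilutes the two earlier components proportionally (a simple linear mixing model, not a qpAdm result), and the helper name `dilute` and the dictionary labels are hypothetical, chosen for illustration.

```python
def dilute(components, newcomer, fraction):
    """Mix `fraction` of a new source into an existing ancestry profile.

    Assumption: existing components shrink proportionally (linear mixing).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    mixed = {name: share * (1.0 - fraction) for name, share in components.items()}
    mixed[newcomer] = mixed.get(newcomer, 0.0) + fraction
    return mixed


# Approximate proportions quoted in the text for the forming community.
sintashta_like = {"Yamnaya-like": 0.67, "European cline": 0.33}

# Further admixture of ca. 9% WSHG-related ancestry during the expansion.
andronovo_like = dilute(sintashta_like, "Central_Steppe_EMBA", 0.09)

# Rough timeline implied by the text (negative years = BC).
late_pie_expansion = -3500
community_formation = late_pie_expansion + 1000  # ca. 1,000 years later
horizon_expansion = community_formation + 500    # some 500 years after that

for name, share in andronovo_like.items():
    print(f"{name}: {share:.1%}")
print(f"Community formation ca. {-community_formation} BC")
print(f"Srubna-Andronovo expansion ca. {-horizon_expansion} BC")
```

Under these assumptions the expanding population would carry roughly 61% Yamnaya-like, 30% European-cline and 9% Central_Steppe_EMBA ancestry, with the community forming ca. 2500 BC and expanding ca. 2000 BC. These are round figures derived from the approximate values above, not independent estimates.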

IV. Armenian

The sample from Hajji Firuz, of hg. R1b-Z2103 (xPF331), has been – as expected – re-dated to the Iron Age (ca. 1193-1019 BC), hence it may offer – together with the samples from the Levant and their Aegean-like ancestry rapidly diluted among local populations – yet another proof of how the Late Bronze Age upheaval in Europe was the cause of the Armenian migration to the Armenoid homeland, where they thrived under the strong influence from Hurro-Urartian.

Y-chromosome haplogroups of the Middle East and neighbouring groups during the Late Bronze Age / Iron Age. See full maps.

V. Indus Valley Civilization and Dravidian

A surprise came from the analysis reported by Shinde et al. (2019) of an Iran_N-related IVC ancestry which may have split earlier than 10000 BC from a source common to Iran hunter-gatherers of the Belt Cave.

For the controversial Elamo-Dravidian hypothesis of the Muscovite school, this difference in ancestry between both groups (IVC and Iran Neolithic) seems to be a death blow, if population genomics was even needed for that. Nevertheless, I guess that a full rejection of a recent connection will come down to more recent and subtle population movements in the area.

EDIT (12 SEP): Apparently, Iosif Lazaridis is not so sure about this deep splitting of ‘lineages’ as shown in the paper, so we may be talking about different contributions of AME+ANE/ENA, which means the Elamo-Dravidian game is afoot, at least in genomics.

I shared the idea that the Indus Valley Civilization was linked to the Proto-Dravidian community, so I’m inclined to support this statement by Narasimhan, Patterson, et al. (2019), even if based only on modern samples and a few ancient ones:

The strong correlation between ASI ancestry and present-day Dravidian languages suggests that the ASI, which we have shown formed as groups with ancestry typical of the Indus Periphery Cline moved south and east after the decline of the IVC to mix with groups with more AASI ancestry, most likely spoke an early Dravidian language.

Natural neighbour interpolation of qpAdm results – Maximum A Posteriori Estimate from the Hierarchical Model (estimates used in the Narasimhan, Patterson et al. 2019 figures) for Central_Steppe_MLBA-related (left), Indus_Periphery_West-related (center) and Andamanese_Hunter-Gatherer-related ancestry (right) among sampled modern Indian populations. In blue, peoples of IE language; in red, Dravidian; in pink, Tibeto-Burman; in black, unclassified. See full image.

I am wary of this sort of simplistic correlation with modern speakers, because we have seen what happened with the wrong assumptions about modern Balto-Slavic and Finno-Ugric speakers and their genetic profile (see e.g. here or here). In fact, I cannot distinguish the social stratification of the different tribal groups – with their endogamous rules under the varna and jati systems – in the ancestry maps of modern India as well as those with deep knowledge of South Asian history can. The pattern of ancestry and language distribution, combined with the findings of ancient populations, seems in principle straightforward, though.

Conclusion

The message to take home from Shinde et al. (2019) is that genomic data is fully at odds with the Anatolian homeland hypothesis – including the latest model by Heggarty (2014)* – whose relevance is still overvalued today, probably due in part to the shift of OIT proponents to more reasonable Out-of-Iran models, apparently more fashionable as a vector of Indo-Aryan languages than Eurasian steppe pastoralists?
*The authors listed this model erroneously as Heggarty (2019).

The paper seems to play with the occasional reference to Corded Ware as a vector of expansion of Indo-European languages, even after accepting the role of Yamnaya as the most evident population expanding Late PIE to western Europe – and the different ancestry that spread with Indo-Iranian to South Asia 1,000 years later. However, the most cringe-worthy aspect is the sole citation of the debunked, pseudoscientific glottochronological method used by Ringe, Warnow, and Taylor (2002) to support the so-called “steppe homeland”, a paper and dialectal scheme which keeps being referenced in papers of the Reich Lab, probably as a consequence of its use in Anthony (2007).

On the other hand, these are the equivalent simplistic comments in Narasimhan, Patterson et al. (2019):

The Steppe ancestry in South Asia has the same profile as that in Bronze Age Eastern Europe, tracking a movement of people that affected both regions and that likely spread the unique features shared between Indo-Iranian and Balto-Slavic languages. (…), which despite their vast geographic separation share the “satem” innovation and “ruki” sound laws.

Indo-European dialectal relationships, from Mallory and Adams (2006).

The only academic closely related to linguistics from the list of authors, as far as I know, is James P. Mallory, who has supported a North-West Indo-European dialect (including Balto-Slavic) for a long time – recently associating its expansion with Bell Beakers – opposed thus to a Graeco-Aryan group which shared certain innovations, “Satemization” not being one of them. Not that anyone needs to be a linguist to dismiss any similarities between Balto-Slavic and Indo-Iranian beyond this phonetic trend, mind you.

Even Anthony (2019) supports now R1b-rich Pre-Yamnaya and Yamnaya communities from the Don-Volga region expanding Middle and Late Proto-Indo-European dialects.

So how does the underlying Corded Ware ancestry of eastern Europe (where Pre-Balto-Slavs eventually spread to from Bell Beaker-derived groups) and of the highly admixed (“cosmopolitan”, according to the authors) Sintashta-Potapovka-Filatovka in the east relate to the similar-but-different phonetic trends of two unrelated IE dialects?

If only there was a language substrate that could (as Shinde et al. put it) “elegantly” explain this similar phonetic evolution, solving at the same time the question of the expansion of Uralic languages and their strong linguistic contacts with steppe peoples. Say, Eneolithic populations of mainly hunter-fisher-gatherers from the North Pontic forest-steppes with a stronger connection to metalworking…

Proto-Tocharians: From Afanasievo to the Tarim Basin through the Tian Shan


A reader commented recently that there is little information about Indo-Europeans from Central and East Asia in this blog. Regardless of the scarce archaeological data compared to European prehistory, I think it is premature to write anything detailed about population movements of Indo-Iranians in Asia, especially now that we are awaiting the updates of Narasimhan et al. (2018).

Furthermore, there was little hope that Tocharians would be different than neighbouring Andronovo-like populations (see a recent post on my predicted varied admixture of Common Tocharians), so the history of both unrelated Late PIE languages would have had to be explained by the admixture of Afanasievo-related groups with peoples of Andronovo descent and their acculturation.

However, data reported recently by Ning, Wang et al. Current Biology (2019) confirmed that peoples of mainly Afanasievo ancestry – as opposed to those of Corded Ware-related ancestry expanding with the Srubna-Andronovo horizon – spread the Tocharian branch of Proto-Indo-European from the Altai into the Tian Shan area, surviving essentially unadmixed into the Early Iron Age.

This genetic continuity of Tocharians will no doubt help us disentangle a great part of the ethnolinguistic history of speakers of the Tocharian branch of Proto-Indo-European, from Pre-Proto-Tocharians of Afanasievo to Common Tocharians of the Late Bronze Age/Iron Age eastern Tian Shan.

NOTE. Tocharian’s isolation from the rest of Late PIE dialects and its early and intense language contacts have always been the key to support an early migration and physical separation of the group, hence the traditional association with Afanasievo, a late Repin/early Yamna offshoot. Even with the current incomplete archaeological and genetic picture, there is no other option left for the expansion of Tocharian.

It is not possible to use the currently available ancestry data to map the evolution of Afanasievo ancestry, lacking a proper geographical and temporal transect of Central and East Asian groups. In spite of this, Ning, Wang, et al. (2019) is a huge leap forward, discarding some archaeological models, and leaving only a few potential routes by which Tocharians may have spread southward from the Altai.

NOTE. I have updated the maps of prehistoric cultures accordingly, with colours – as always – reflecting the language/ancestry evolution of the different groups, even though the archaeological data of some groups of Xinjiang remains scarce, so their ethnolinguistic attribution – and the colours picked for them – remain tentative.

A rough timeline of related archaeological sites from North Eurasia. Image modified from Yang (2019).

Tocharians

The recent book Ancient China and its Eurasian Neighbors. Artifacts, Identity and Death in the Frontier, 3000–700 BCE, by Linduff, Sun, Cao, and Liu, Cambridge University Press (2017) offers an interesting summary of the introduction of metalworking into western China.

Here are some relevant excerpts (emphasis mine):

Although [the Xinjiang] route is not uniformly agreed upon (Shelach-Lavi 2009: 134–46), this western transmission has been thought to have passed through eastern Kazakhstan, especially as it is manifest in Semireiche, with Yamnaya, Afanasievo (copper) and Andronovo (tin bronze) peoples (Mei 2000: Fig. 3). From Xinjiang this knowledge has been thought to have traveled through the Gansu Corridor via the Qijia peoples (Bagley 1999) and then into territories controlled by dynastic China. The dating of this process is still a problem, as the sites and their contents in Xinjiang are consistently later than those in Gansu, suggesting that the point of contact was in Gansu and that the knowledge then spread from there westward.

1. Eneolithic Altai

Afanasievo expansion ca. 3300-2600 BC. See full culture and ancient DNA maps.

The Afanasievo sites, as they are identified in Mongolia, for instance, make up an Eneolithic culture analogous to that of southern Siberia (3100/2500–2000 BCE) in the Upper Yenissei Valley that is characterized by copper tools and an economy reliant on horse, sheep and cattle breeding as well as hunting. (…) The Afanasievo is best known through study of its burials, which typically include groups of round barrows (kurgans), each up to 12 m in diameter with a stone kerb and covering a central pit grave containing multiple inhumations. In their Siberian context, burial pottery types and styles have suggested contacts with the slightly earlier Kelteminar culture of the Aral and Caspian Sea area.

The Afanasievo culture monuments, located in the northern Altai and in the Minusinsk Basin (the western Sayan), have been seen as analogous evidence for cross-Eurasian exchange. These complexes contain small collections of metal, and many of the items are made of brass, although golden, silver and iron ornaments were also identified. A mere one-fourth of these objects are tools and ornaments, while the rest consist of unshaped remains and semi-manufactured objects. Its metallurgical tradition has recently been dated by Chernykh to as early as 3100 to 2700 BCE (1992), making it more compatible chronologically with the early brass-using sites in Shaanxi mentioned above. Kovalev and Erdenebaatar have excavated barrows in Bayan-Ulgii, Mongolia, that have been carbon-dated to the first half of the third millennium BCE and associated by ceramic types and styles and burial patterns with the Afanasievo (Kovalev and Erdenebaatar 2009: 357–58). These mounded kurgans were covered with stone and housed rectangular, wooden-faced tombs that included Afanasievo-type bronze awls, plates and small “leaf-shaped” knife blades (Kovalev and Erdenebaatar 2009: Figs. 6 and 7).

They also excavated sites belonging to the more recently identified Chemurchek archaeological culture, located in the foothills of the Mongolian Altai (Kovalev 2014, 2015) (Fig. 2.6). These sites are carbon-dated to the same period as the Afanasievo burials or to c. 3100/2500–1800 BCE (six barrows in Khovd aimag and four in Bayan-Ulgo aimag). In the rectangular stone kerbed Chemurchek slab burials (Ulaaanhus sum, Bayan-ul’gi aimag and so forth), bronze items included awls; and at Khovd aimag, Bulgan sum, in addition to stone sculptures, three lead and one bronze ring were excavated (Kovalev and Erdenebaatar 2009: Figs. 2 and 3; Fig. 2.6). Although we will not know if they were produced locally until much further investigation is undertaken, these discoveries do document knowledge of various uses and types of metal objects in western and south central Mongolia. The types of metal items thus far recovered are simple tools (awls) and rings (ornamental?) not unlike those associated with Andronovo archaeological cultures as well.

This is a complex circumstance where archaeological evidence is not complete, but raises very important questions about transmission of metallurgical knowledge to and from areas in present-day China. In the 1970s some Afanasievo mounds were excavated in Central Mongolia by a Soviet–Mongolian expedition led by V. V. Volkov and E. A. Novgorodova (Novgorodova 1989: 81–85). Unfortunately, these mounds did not yield metal objects, only ceramics, but they show that the Afanasievo culture with the Eneolithic metallurgical tradition of manufacturing pure copper items had already moved east at least as far as central Mongolia. In 2004, Kovalev and Erdenebaatar investigated a large Afanasievo mound, Kulala ula, in the extreme northwest of Mongolia, near the Russian border (Kovalev and Erdenebaatar 2009). There they found a copper knife and awl (Fig. 2.5). There are five C14 dates on wood, coal and human bones from this mound, which belong to the period 2890–2570 BCE. This shows that the Afanasievo culture were carriers of technology and produced artifacts in the first half of the third millennium BCE and that they also moved south along the foothills of the Mongolian Altai. Afanasievo culture in Altai and the Minusinsk basin is dated by C14 to 3600–2500 BCE (Svyatko et al. 2009; Polyakov 2010). In the north of Xinjiang in the Altai district, several typical egg-shaped vessels and two censers of Afanasievo types were found. Some of these have been obtained from the stone boxes (chambers of megalithic graves of the Chemurchek culture) (Kovalev 2011). Thus, the Afanasievo tradition of pure copper metallurgy must have spread to the northern foothills of the Tienshan Mountains no later than the mid-third millennium BCE. The links with Afanasievo and local cultures adjacent to and south of the mountains into present-day China can now be assumed.

Afanasievo – Chemurchek evolution ca. 2600-2200 BC. See full culture and ancient DNA maps.

2. Bronze Age Altai

Kovalev and Erdenebaatar (2014a) and later Tishkin, Grushin, Kovalev and Munkhbayar (2015) in Western Mongolia conducted large-scale excavations of megalithic barrows of the Chemurchek culture (dated about 2600–1800 BCE). This peculiar culture appeared in Dzungaria and the Mongolian Altai in the second quarter of the third millennium BCE and for some time existed together with the late Afanasievo culture, as evidenced by the findings of Afanasievo ceramics in Chemurchek graves, in the stone boxes. Unfortunately, in China we do not yet know of any metal object related, without doubt, to the Chemurchek culture. Kovalev, Erdenebaatar, Tishkin and Grushin found several leaden ear rings and one ring of tin bronze in three excavated Chemurchek stone boxes (Kovalev and Erdenebaatar 2014a; Tishkin et al. 2015). Such lead rings are typical for Elunino culture, which occupied the entire West Altai after 2400–2300 BCE (Tishkin et al. 2015). This culture had developed a tradition of bronze metallurgy with various dopants, primarily tin. Thus, the tradition of bronze metallurgy as early as this time could have penetrated the Mongolian Altai far to the south. In addition, in the Hadat ovoo Chemurchek stone box, Kovalev and Erdenebaatar discovered stone vessels refurbished with the help of copper “patches,” indicating the presence there of metallurgical production (Fig. 2.7) (Kovalev and Erdenebaatar 2014a).

In one of the secondary Chemurchek graves unearthed by Kovalev and Erdenebaatar in Bayan-Ulgi (2400–2220 BCE), a bronze awl was found (Kovalev and Erdenebaatar 2009). Kovalev and Erdenebaatar also discovered a new culture in the territory of Mongolia (Map 2.3), one that begins immediately after Chemurchek – Munkh-Khairkhan culture (Kovalev and Erdenebaatar 2009, 2014b). To date, about 17 mounds of this culture have been excavated in Khovd, Zavkhan, Khovsgol, Bulgan aimag of Mongolia. This culture dates from about 1800 to 1500 BCE, that is, contemporary with the Andronovo culture. Therefore, the Andronovo culture does not extend far into the territory of Mongolia. Three knives without dedicated handles or stems and five awls have been found in the Munkh-Khairkhan culture mounds (Fig. 2.8). All these products are made of tin bronze. (…) Additionally, eight Late Bronze Age burials (c. 1400–1100 BCE) were unearthed in the Bulgan sum of Khovd aimag and belong to another previously unknown culture called Baitag. And in the Gobi Altai, a new group of “Tevsh” sites dating to the Late Bronze Age were defined in Bayankhongor and South Gobi aimags (Miyamoto and Obata 2016: 42–50). From these Tevsh and Baitag sites, we see the expansion of burial goods to include beads of semiprecious stones (carnelian), bronze beads, buttons and rings and even the famous elaborate golden hair ornaments (Tevsh uul; Bogd sum; Uverkhanagia aimag) from the Baitag barrows (Kovalev and Erdenebaatar 2009: Fig. 5; Miyamoto and Obata 2016).

2.1. Chemurchek

About the Chemurchek culture, from A re-analysis of the Qiemu’erqieke (Shamirshak) cemeteries, Xinjiang, China, by Jia and Betts, JIES (2010) 38(4):

The major characteristics of Qiemu’erqieke Phase I include:

  1. Burials with two orientations of approximately 20° or 345°.
  2. Rectangular enclosures built using large stone slabs. The size of the enclosure varies from a maximum of 28 x 30 m.* to a minimum of 10.5 x 4.4 m. (Figure 8, Table 2).
     *The stone enclosure located near Hayinar is the largest one at approximately 30 x 40 m. based on pacing of the site during a visit by the authors in 2008.
  3. Almost life-sized anthropomorphic stone stelae erected along one side of the stone enclosures (Lin Yun 2008).
  4. Single enclosures tend to contain one or more than one burial, all or some with stone cist coffins.
  5. The cist coffin is usually constructed using five large stone slabs, four for the sides and one on top, leaving bare earth at the base (Zhang Yuzhong 2007). Sometimes the insides of the slabs have simple painted designs (Zhang Yuzhong 2005).
  6. Primary and secondary burials occur in the same grave.
  7. Some decapitated bodies (up to 20) may be associated with the main burial in one cist.
  8. Bodies are commonly placed on the back or side with the legs drawn up.
  9. Grave goods include stone and bronze arrowheads, handmade gray or brown round-bottomed ovoid jars, and small numbers of flat-bottomed jars (Fig. 7).
  10. Clay lamps appear to occur together with round-bottomed jars.
  11. Complex incised decoration on ceramics is common but some vessels are undecorated.
  12. The stone vessels are distinctive for the high quality of manufacture.
  13. Stone moulds indicate relatively sophisticated metallurgical expertise.
  14. Artefacts made from pure copper occur.
  15. Sheep knucklebones (astragali) imply a tradition (as in historical and modern times) of keeping knucklebones for ritual or other purposes. They also indicate the herding of domestic sheep as part of the subsistence economy.
Chemurchek culture ca. 2200-1750 BC. See full culture and ancient DNA maps.

Chemurchek dating

Available evidence suggests that the date range for Qiemu’erqieke Phase I should fall from the later third into the early second millennium BC. There are several reasons to suggest that the time span is around the early second millennium BC. Lin Yun (2008) (…) maintains that the bronze artefacts found in Phase I show a greater sophistication in the level of copper alloy technology than that of the pure copper artefacts common to the Afanasievo tradition. On this basis it might be suggested that the Afanasievo could be considered to be Chalcolithic with a time span across much of the third millennium BC (Gorsdorf et al. 2004: 86, Fig. 1). Qiemu’erqieke Phase I, however, should more properly be considered as Bronze Age.

Lin Yun also used the bronze arrowhead from burial M17 to narrow down the date of Qiemu’erqieke Phase I. Two arrowheads were found in this burial, one of them leaf shaped with a single barb on the back (Fig. 7:4). A similar arrowhead, together with its casting mould, has been found at the Huoshaogou site of Siba tradition (Li Shuicheng 2005, Sun Shuyun and Han Rufen 1997), in Gansu province, northwest China, dated around 2000-1800 BC (Li Shuicheng and Shui Tao 2000). This supports a date in the early second millennium BC for the Qiemu’erqieke arrowhead. The painted, round-bottomed jar from the Tianshanbeilu cemetery (Jia Weiming, Betts and Wu Xinhua 2008: Fig. 7, bottom left) has been considered as a hybrid between the Upper Yellow River Bronze Age cultures of Siba in northwest China and the steppe tradition of Qiemu’erqieke in west Siberia (Li Shuicheng 1999). If this assumption is correct, the date of Tianshanbeilu, around 2000 BC, can be used as a reference for Qiemu’erqieke Phase I (Jia Weiming, Betts and Wu Xinhua 2008, Lin Yun 2008, Li Shuicheng 1999). Stone arrowheads found in Qiemu’erqieke Phase I also imply that the date is likely to fall within the earlier part of the Bronze Age as no such stone arrowheads have yet been found elsewhere in sites of the Bronze Age in Xinjiang dated after the beginning of the second millennium BC.*
*For example Chawuhu and Xiaohe cemeteries (Xinjiang Institute of Archaeology 1999, 2003).

Pottery of Afanasevo and East European traits from the Chemurchek complex. Image modified from Kovalev (2017).

(…) Pottery “oil burners” (goblet-like ceramic vessels, possibly lamps) have been found in three traditions: Afanasievo (Gryaznov and Krizhevskaya 1986:21), Okunevo and Qiemu’erqieke. It is believed that this oil-burner found in Siberia and the Altai is a heritage from the Yamnaya and Catacomb cultures (Sulimirski 1970: 225, 425; Shishlina 2008:46) in the Caspian steppe further to the west, but does not seem to exist in known Andronovo cultures. The oil-burner tends to disappear after around 2300 BC during the mid-Okunevo period. It is, however, possible that the tradition continues longer in the Qiemu’erqieke sites.

The construction of the stone enclosures also reveals a close connection between Qiemu’erqieke Phase I and the mid and late Okunevo tradition (Sokolova 2007). Slab built stone enclosures emerged in both the Okunevo and Afanasievo traditions (Gryaznov and Krizhevskaya 1986:15-23, Kovalev 2008, Sokolova 2007, Anthony 2007:310, Koryakova and Epimakhov 2007). In the early Afanasievo the enclosure is circular with no cist coffin (Anthony 2007:310, Gryaznov and Krizhevskaya 1986:20), but in the early stage of the Okunevo square stone enclosures with a single cist burial are dominant. Square or rectangular stone enclosures are a marked feature of Qiemu’erqieke Phase I, suggesting temporal relationships between Qiemu’erqieke Phase I and the Okunevo. In Okunevo chronological group II, possibly with influence from the Afanasievo, circular stone enclosures appeared in combination with rectangular enclosures within individual cemeteries, referred to by Sokolova (2007: table 2) as hybrid examples. By Okunevo chronological group III, rectangular stone slab enclosures with multi-burials emerged again. This is the dominant form in Qiemu’erqieke Phase I. Okunevo burial traditions changed again to single cist burials in the late stage around chronological group V (Sokolova 2007). A specific mortuary rite of decapitated burials exists in both the Qiemu’erqieke and Okunevo traditions (Sokolova 2007, Chen Kwang-tzuu and Hiebert 1995), as does the occasional occurrence of painted designs on the interior of the slabs forming the cists (e.g., Khavrin 1997: 70, fig. 4; 77: tab. IV.5). Based on these comparisons, the date of Qiemu’erqieke Phase I may well parallel that of the Okunevo from at least chronological group II around 2400 BC (Gorsdorf et al. 2004: fig. 1).

Khuh Udzuuriin I-1 elite barrow (ca. 2470-2190 BC). Image modified from Kovalev (2014).

In addition to the pottery making tradition, the anthropomorphic stone stelae may also have earlier antecedents. In the Okunevo assemblage there are anthropomorphic stelae that are longer, thinner and more abstract than those of Qiemu’erqieke. There is no indication of such stelae in the Afanasievo tradition (Gryaznov and Krizhevskaya 1986:15-23). However, further to the west, anthropomorphic stone stelae are associated with the Kemi-Oba and Yamnaya cultures around the third millennium BC (Telegin and Mallory 1994; Figure 13). Some major characteristics of these stelae such as the icons on the front face of the stelae (Telegin and Mallory 1994:8-9) also appear on stelae found in Qiemu’erqieke Phase I. Recalling the oil burners that may have been inherited from the Yamnaya culture and which are found in the Afanasievo, Okunevo and Qiemu’erqieke Phase I, it might be possible to speculate that Qiemu’erqieke Phase I has its origins even earlier than the first half of the third millennium BC. This idea has also been suggested by Kovalev (1999).

Despite the affinities with the Okunevo cultural tradition, Qiemu’erqieke Phase I appears to be a discrete regional variant. The ceramic assemblage shows traits unique to this cluster of sites, while the anthropomorphic stelae are also distinctive markers of this tradition.

Khuh Udzuur anthropomorphic stone stela, oriented toward the south – south-east. Image modified from Kovalev (2014).

3. Bronze Age Xinjiang

I recently reported on this blog the description of the Xiaohe and Gumugou cemeteries from the interesting Master’s thesis Shifting Memories: Burial Practices and Cultural Interaction in Bronze Age China: A study of the Xiaohe-Gumugou cemeteries in the Tarim Basin, by Yunyun Yang, Uppsala University, Department of Archaeology and Ancient History (2019).

It also offered a full summary of findings from prehistoric sites of Xinjiang related to the arrival of a cultural package from the Altai region, ultimately connected to Afanasievo. Relevant excerpts include the following (emphasis mine):

In Bronze Age Xinjiang, burials were diverse but also show some common features between different geographic sections. The main three mountains, including Kunlun Mountains, Tian Shan (mountains) and Altai Mountains, enclose the Tarim Basin, and the Dzungaria Basin, but leave the eastern part of the Tarim Basin and the south-eastern part of the Dzungaria Basin open (with easy access to the surroundings). The Hami Basin is located at the transitional area, connecting the two basins. Burials are mainly spread along the edge of the mountain ranges.

Assumed spreading/expansion routes of stone burial constructions.

3.1. The Lop Nur region

In the Lop Nur region, the Xiaohe cemetery (2000-1450 BCE) and the Gumugou cemetery (1900-1800 BCE) shared many common features, as did the Keliyahe northern cemetery:

  • Cemeteries were located in sandy areas;
  • Rectangular/boat-shaped wooden coffins with monuments of wooden planks or poles;
  • Coffins had no bottoms;
  • The dead were placed lying straight on the back;
  • The dead were commonly buried in single graves.

The Gumugou cemetery contained six special burials with a sun-radiating-spokes pattern in addition to the normal burials, which were similar to the wooden coffin graves of the Xiaohe cemetery.

NOTE. For more on Xiaohe and Gumugou, see the recent post on Proto-Tocharians. See other papers on the Andronovo horizon for other Early to Middle Bronze Age cultural groups less clearly associated with the Xiaohe horizon, like Hazandu, Xintala, or the Chust culture.

From Shuicheng (2006):

An assemblage of early bronzes had been recovered from northwestern Xinjiang and the periphery of Dzungaria 准噶尔 Basin. It comprises a variety of utilitarian tools and weapons, and a small number of apparels. These artifacts bear the stamps of Andronovo Culture in form, artifact type and decorative pattern. The metallographic analysis on selected artifacts indicates that they comprise mainly of tin-bronzes that contain 2–10% of tin. Moreover, the chemical compositions of these artifacts are similar to that of the Andronovo Culture. Latter date (first half of the 1st millennium BC) artifacts of the assemblage include a small number of arsenic bronzes. In all, during the period between the mid-2nd and mid-1st millennium BC, copper and bronze artifacts coexisted in this region, albeit tin-bronze comprised the majority. The composition of alloy did not show significant change over time. Some colleagues pointed out that the Nulasai 奴拉赛 site at Nileke 尼勒克 County in the Yili 伊犁 River basin of Xinjiang was the pioneer in the use of “sulphuric ore–ice copper–copper” technology. It is also the only early smelting site in Euro-Asia that arsenic ore was added to deliberately produce an alloy.

Prehistoric cultures of Xinjiang during the Middle Bronze Age. See full culture and ancient DNA maps.

3.2. The Hami Basin-the Balikun Grassland

From Yang (2019):

The Hami Basin-the Balikun Grassland area is located at the eastern part of Tian Shan. The area is divided in a northern basin and a southern basin by the east-west stretch of the Tian Shan. In the Hami Basin-the Balikun Grassland area, the main type of burials were earth-pit graves in the early Bronze Age, and burials of stone-pit with barrows became more common in the late Bronze Age. The Hami-Tianshan-Beilu cemetery is a representative of the earth-pit graves. The features of the Hami-Tianshan-Beilu cemetery (2000-1500 BCE) here were:

  • Rectangular earth pit graves;
  • The dead were often in a hocker position lying on one side;
  • Commonly a single dead in one grave.
The Balikun grassland today (source).

The Hami-Wubu cemetery (earlier than 1000 BCE) and the Yanbulake cemetery (1200-600 BCE) are representatives of another common type of earth-pit grave. Common features here were:

  • Rectangular earth pits, with two storeys and/or roofed with wooden boards;
  • The dead were placed in a hocker position lying on one side;
  • Mostly a single dead in one grave.

Later there appeared more stone-pit graves in this area, and the features can be summarized as:

  • Round burial mounds, commonly constructed by stones or a mix of stones and earth;
  • Burial mounds with a sunken top or a normal (dome) top;
  • The diameter of the burial mounds varied between 3 and 25.4 m (but not necessarily limited in this scope);
  • Circular or rectangular stone kerbs;
  • Rectangular stone pits, constructed by earth, or stones, or a mix of earth and stones;
  • Rectangular stone pits contained wooden coffins (represented by the Yiwu Baiqi’er cemetery).
Some representatives of stone burials in the Hami Basin – the Balikun Grassland in the Iron Age (Adapted from: Xinjiang 2011, 29-41). Image modified from Yang (2019).

In the Hami Basin, the Bronze Age cemeteries show common burial features like earth pits and hocker position of the dead. The similarity of pottery styles in the Hami-Tianshan-Beilu cemetery to those in the Machang and Siba cultures (Xinjiang 2011: 17) suggests possible cultural influence or migration of people from the Hexi Corridor in the east.

In the Balikun Grassland, burials in an earlier time contained mostly earth-pit graves but also a small number of stone-pit graves. The pebbles were imbedded in the floors and the walls of the graves in a rectangular shape, e.g. the Balikun-Nanwan cemetery (1600-1000 BCE). In a later time, there appeared huge burial mounds with a sunken top, and with the diameters of the burial mounds varying from 3 to 25.4 m, e.g. the Balikun-Dongheigou cemetery and the Balikun-Heigouliang cemetery. The Yiwu-Bai’erqi and the Yiwu-Kuola cemeteries contained either round stone burial mounds or circular stone kerbs on the ground surface. Considering the three burial elements including burial mounds, stone pits and circular kerbs, the later period cemeteries in the Balikun Grassland were actually similar to cemeteries from the southern edge of the Altai Mountain area.

From Shuicheng (2006):

The Nanwan 南湾 cemetery site at Kuisu 奎苏 Town, Balikun 巴里坤 (1600–1100 BC) also yielded an assemblage of early bronzes. The style of its early phase artifacts is similar to that of the burials distributed in the North Tianshan Route. Some sorts of cultural connection should have existed between the two.

The dates of Yanbulake 焉不拉克 Culture (1300–700 BC) are comparatively late. Its metallurgy was a continuation of the western China tradition. Artifact types include a variety of utilitarian tools, weapons and apparels.

Prehistoric cultures of Xinjiang during the Late Bronze Age. See full culture and ancient DNA maps.

3.3. The Turpan Basin-the middle part of Tian Shan

From Yang (2019):

Turpan Basin is located at the western part of the Hami Basin, and lies at the southern edge of the eastern Tian Shan. In the Turpan Basin-the middle part of Tian Shan area, the main representative of the Bronze Age cemeteries is the Yanghai Nr.1 cemetery. The features here were:

  • Elliptic earth pit graves, commonly covered by round logs on the top;
  • Some graves contained burial beds made of round logs or reeds;
  • The dead were mainly placed lying straight on the back;
  • Mostly a single dead in one grave.

In the Iron Age, the stone burials became dominant, but the stone burials varied in different regions of the Turpan Basin-the middle part of Tian Shan area. Graves containing burial mounds, stone pit, and circular stone kerbs are represented by the Shanshan-Ertanggou cemetery, the Tuokexun-Alagou cemetery, the Urumqi-Chaiwobu cemetery and the Urumqi-Yizihu-Sayi cemetery, etc. The stone funeral construction features here are similar to those of contemporary cemeteries in the Hami Basin-the Balikun Grassland area.

3.4. The southern edge of the western and middle part of Tian Shan

In the southern edge of the western and middle part of Tian Shan area, the main representatives of the late Bronze Age cemeteries are the Hejing-Chawuhu Nr.4 cemetery (around 1000-500 BCE), the Hejing-Xiaoshankou cemetery, the Baicheng-cemetery, etc. The main burial features of the late Bronze Age and the early Iron Age cemeteries (see Fig.12) here were:

  • Burial mounds, constructed by stones or a mix of stones and earth;
  • Irregular circular or rectangular stone kerbs;
  • Stone pit graves in a bell-shape or a rectangular shape;
  • Stone pit graves constructed by imbedding pebbles or stone slabs in walls and floors;
  • The dead were often placed lying on their back with bent legs;
  • The dead were commonly reburied a second time with multiple burials.

From the late Bronze Age to the early Iron Age in this area, the burial traditions tended to become more varied. In the stone burials with stone kerbs, there is a mixture of stone pit and earth pit graves. The burial features of the Iron Age cemeteries in this section were similar to those of contemporary cemeteries both in the Hami Basin-the Balikun Grassland area and in the Turpan Basin-the middle part of Tian Shan area.

From Shuicheng (2006):

The Chawuhu 察吾呼 Culture (1100–500 BC) distributes on the foothills between the middle section of the Tianshan Mountain Ranges and Tarim River. Its bronze assemblage comprises a variety of weapons, utilitarian tools and small apparels. They show no apparent temporal change in form and type through the four cultural phases. In addition, bronzes bearing the Chawuhu characteristics were found in Hejing 和静, Baicheng 拜城 and Luntai 轮台 (Bügür). Yet, sites distributed along the Tarim River, such as Heshuo 和硕, Kuqa 库车 and Aksu 阿克苏, yielded remains of a bronze culture different from that of Chawuhu. Bronzes recovered include double-eared socketed axe, arrowheads, awls, knives, needles and bracelets. Their absolute dates have been estimated to be earlier than that of Chawuhu.

Prehistoric cultures of Xinjiang during the Early Iron Age. See full culture and ancient DNA maps.

3.5. The Pamir Plateau

From Yang (2019):

A typical Bronze Age cemetery from the Pamir Plateau area is the Tashenku’ergan-Xiabandi cemetery (around 1000-500 BCE). The burial features here were:

  • Mainly inhumations, but also a few cremations;
  • Burial mounds, constructed of stones;
  • Irregular circular or rectangular stone kerbs;
  • Mostly a single dead in one grave;
  • The dead were placed in a hocker position lying on one side.

The adoption of burial customs from the east supports the migration of Afanasievo-related peoples from the Tian Shan up to the Pamir Plateau, strongly influencing the findings of the Xiabandi cemetery, which has been dated from an early Bronze Age phase (ca. 1500-1300 BC) to a late date up to ca. 600 BC.

While it is today unclear how far the Afanasievo admixture reached into western Xinjiang, it seems that the Pamir Plateau remained culturally connected to neighbouring Andronovo-related cultures in pottery and metallurgical innovations, hence their language probably belonged – during most of the Bronze and Iron Ages – to the Indo-Iranian branch, even though specific dialects might have changed with each new attested group.

In particular, it is possible that the early Andronovo groups related to the Xiaohe Horizon spoke Indo-Aryan or West Iranian dialects, while Saka-related groups replaced them – or an intermediate Tocharian-speaking group – with East Iranian dialects. A close interaction with West Iranian would justify the known ancient borrowings of Tocharian, although they could also be explained by contacts with Chust-related groups farther west. For more on this, see Gerd Carling’s work on the different layers of Iranian loans.

Xinjiang BA/IA Summary

From Yang (2019):

In the early Bronze Age, there are distinct regional differences in the burial customs in and surrounding the Tarim Basin. At the southern edge of the Altai Mountains area, the burial customs included stone burial mounds, stone pit graves, circular or rectangular stone kerbs and stone human sculptures; the dead were placed lying straight on the back. In the Hami Basin-the Balikun Grassland area, the burial customs included earth pit graves; the dead were placed in a hocker position lying on one side. In the Turpan Basin-the middle part of Tian Shan area, the burial customs included earth pit graves; the dead were placed lying straight on the back. In the Lop Nur region, the burial customs included wooden coffins buried in sand; the dead were placed lying straight on the back.

But from the late Bronze Age to the early Iron Age, there was a common shift in burial customs from earth pit graves to stone burials in the Hami Basin-the Balikun Grassland area and in the Turpan Basin-the middle part of Tian Shan area. The main features of the stone burials include stone burial mounds, circular or rectangular stone kerbs, and the stone pit graves in the cemeteries. Similar stone burial customs commonly appeared at the southern edge of the western and middle part of Tian Shan area and the Pamir Plateau area in the Iron Age. The burial features in most areas are a mixture of both the earth pit graves and stone pit graves, especially in the Hami Basin-the Balikun Grassland area and the Turpan Basin-the middle part of Tian Shan area.

From Shuicheng (2006):

Historians of metallurgy conducted metallographic analyses on a sample of 234 metal specimens recovered from 16 localities in eastern Xinjiang. They concluded that the metallurgic industry in eastern Xinjiang could be roughly partitioned into three developmental phases. The early phase is represented by the burials distributed in the North Tianshan Route. The majority of the metal assemblage was tin-bronzes; however, copper and arsenic-bronzes maintained considerable proportions. The middle phase is represented by the burials at Yanbulake. During this phase, tin-bronze still maintained the majority; the proportion of arsenic-bronze increased, and some of them were high arsenic-bronzes. The late phase is represented by the burials at Heigouliang 黑沟梁. The composition of lead increased in the bronze alloy at the expense of arsenic. In addition, this phase witnessed the appearance of high tin-bronze that contained up to 16% tin and the appearance of brass, that is, an alloy of copper and zinc. The bronze alloy consistently contained significant amount of impurities regardless of temporal difference. Casting and forging technologies coexisted throughout the three phases. The early bronzes (2000–500 BC) of eastern Xinjiang, in general, contained arsenic; the composition of arsenic was usually under 8%, but a few artifacts contained more than 20% arsenic. In all, arsenic had long been used in the alloy-forming of the early bronzes in eastern Xinjiang. Consequently, arsenic-bronzes were widely found in the prehistoric archaeology of the region. The artifact types, chemical compositions and manufacture techniques of the bronze assemblage of the burials of the North Tianshan Route are similar to those of Siba Culture, indicating that eastern Xinjiang had played a significant role in the East-West interactions.

Prehistoric cultures of Xinjiang during the Late Iron Age. See full culture and ancient DNA maps.

Tocharians in population genomics

Prehistoric population movements between the Altai and the Tian Shan are difficult to pinpoint, not least because of the division of these territories among three different countries and their archaeological teams, only recently (more) open to international scholarship.

The available schematic archaeological picture, where migrations could only be roughly inferred, has been recently updated to a great extent by Ning, Wang et al. (2019), whose genetic analysis of the samples is as thorough as anyone could have asked for, with a level of detail which matches the complex genetic picture of the region by the Iron Age.

As a summary, here is what they described about the samples from Shirenzigou (ca. 400-200 BC), corresponding to the Iron Age populations of the Hami Basin-the Balikun Grassland area, and closely related to the preceding Yanbulake Culture:

As shown in Figure S3, the Steppe_MLBA populations including Srubnaya, Andronovo, and Sintashta were shifted toward farming populations compared with Yamnaya groups and the Shirenzigou samples. This observation is consistent with ADMIXTURE analysis that Steppe_MLBA populations have an Anatolian and European farmer-related component that Yamnaya groups and the Shirenzigou individuals do not seem to have. The analysis consistently suggested Yamnaya-related Steppe populations were the better source in modeling the West Eurasian ancestry in Shirenzigou.

Biplot of f3-outgroup tests illustrating the Kostenki14 and Anatolia_N like ancestries in Shirenzigou individuals. Most Shirenzigou individuals were on a cline with Yamnaya and European hunter-gatherer groups, lacking the European farmer ancestry as compared to the Steppe_MLBA populations such as Andronovo, Srubnaya and Sintashta [S1-S5]. Horizontal and vertical bars represent ± 3 standard errors, corresponding to form of outgroup f3 tests on the x axis and y axis respectively.
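
For readers unfamiliar with the statistic behind this biplot, here is a minimal Python sketch of an outgroup f3 computed from per-SNP allele frequencies. It is not the pipeline used by Ning, Wang et al. (2019): the function name `outgroup_f3` and the toy frequencies are made up for illustration, and real analyses work on genome-wide genotype data with block-jackknife standard errors and a finite-sample correction, all of which are omitted here.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Average of (o - a) * (o - b) over SNPs.

    Each argument is a list of allele frequencies (0..1), one per SNP,
    in the same SNP order. Hypothetical helper, for illustration only.
    """
    if not outgroup or not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("need three non-empty lists of equal length")
    total = sum((o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b))
    return total / len(outgroup)


# Toy allele frequencies (invented numbers, not real data).
outgroup = [0.1, 0.9, 0.5, 0.2]
pop_a = [0.6, 0.4, 0.5, 0.7]
pop_b = [0.6, 0.4, 0.5, 0.6]   # drifted together with pop_a
pop_c = [0.1, 0.8, 0.5, 0.3]   # stayed close to the outgroup

f3_ab = outgroup_f3(outgroup, pop_a, pop_b)
f3_ac = outgroup_f3(outgroup, pop_a, pop_c)
print(f3_ab > f3_ac)  # pop_a shares more drift with pop_b than with pop_c
```

A higher value means the two test populations share more genetic drift since their split from the outgroup, which is what each axis of such a biplot measures against a different reference.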

We continued to use qpAdm to estimate the admixture proportions in the Shirenzigou samples by using different pairs of source populations, such as Yamnaya_Samara, Afanasievo, Srubnaya, Andronovo, BMAC culture (Bustan_BA and Sappali_Tepe_BA) and Tianshan_Hun as the West Eurasian source and Han, Ulchi, Hezhen, Shamanka_EN as the East Eurasian source. In all cases, Yamnaya, Afanasievo, or Tianshan_Hun always provide the best model fit for the Shirenzigou individuals, while Srubnaya, Andronovo, Bustan_BA and Sappali_Tepe_BA only work in some cases.
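
The idea of modelling a target as a mixture of two sources can be illustrated with a toy least-squares estimate on allele frequencies. This is a deliberately simplified stand-in and not qpAdm itself, which fits f4 statistics against a set of outgroups and reports a p-value for the model fit; the function name `two_way_proportion` and all numbers below are hypothetical.

```python
def two_way_proportion(target, source_a, source_b):
    """Least-squares share of source_a in target, assuming
    target ~ alpha * source_a + (1 - alpha) * source_b at every SNP.

    Inputs are lists of allele frequencies in the same SNP order.
    Hypothetical helper, for illustration only.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must have equal length")
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("sources are identical; proportion is undefined")
    # Clamp to the valid range of a mixture proportion.
    return min(1.0, max(0.0, num / den))


# Invented frequencies: a "West Eurasian" and an "East Eurasian" source.
west = [0.2, 0.8, 0.5, 0.1]
east = [0.6, 0.1, 0.9, 0.7]
# Build a target that is 70% west and 30% east by construction.
mixed = [0.7 * w + 0.3 * e for w, e in zip(west, east)]

alpha = two_way_proportion(mixed, west, east)
print(round(alpha, 3))
```

With real data the fit is never exact, which is why methods like qpAdm also test whether a given pair of sources is an adequate model at all, as in the comparison of source pairs above.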

Table S2. P values in modelling a two-way (P=rank 1) admixture in Shirenzigou samples using each of the four populations (Bustan_BA, Sappali_Tepe_BA, Andronovo.SG, Srubnaya) together with Han Chinese as two sources [S6], Related to Figure 2. We used the following set of outgroup populations: Dinka, Ust_Ishim, Kostenki14, Onge, Papuan, Australian, Iran_N, EHG, LBK_EN.

In the PCA, ADMIXTURE, outgroup f3 statistics [see Figure S4], as well as f4 statistics (Table S3), we observed the Shirenzigou individuals were closer to the present day Tungusic and Mongolic-speaking populations in northern Asia than to the populations in central and southern China, suggesting the northern populations might contribute more to the Shirenzigou individuals. Based on this, we then modeled Shirenzigou as a three-way admixture of Yamnaya_Samara, Ulchi (or Hezhen) and Han to infer the source from the East Eurasia side that contributed to Shirenzigou. We found the Ulchi or Hezhen and Han-related ancestry had a complicated and uneven distribution in the Shirenzigou samples. Most Shirenzigou individuals derived the majority of their East Eurasian ancestry from Ulchi or Hezhen-related populations, while the following two individuals M820 and M15-2 have more Han related than Ulchi/Hezhen-related ancestry.

It is unclear whether the Chemurchek population will show a sizeable local contribution from neighbouring groups. The fact that Okunevo shows 20% Yamnaya-related ancestry strongly supports the nature of neighbouring stone-grave-building peoples of the Altai and the northern Tian Shan as mostly Afanasievo-like, and the apparent lack of contributions of Srubna/Andronovo-like ancestry in the early Hami-Balikun stone burial builders also speaks for radical population replacement events reaching the areas south of Tian Shan, at least initially.

While ancestry cannot settle linguistic questions, it seems that nomads of the Gansu and Qinghai grasslands retained an ancestry close to Andronovo, whereas nomads of the Hami Basin-Balikun grasslands and related populations of Xinjiang remained closely related to Afanasievo. This doesn’t preclude that the ancestors of the Yuezhi became acculturated under the influence of peoples from eastern Xinjiang, but all data combined suggest an isolation of both populations – relative to other groups and to each other – and it is therefore more likely that they spoke Indo-Iranian-related languages rather than a language of the Tocharian branch.

Haplogroups

In an interesting twist of events, despite the initially reported hg. R1b and Q, Tocharians from Shirenzigou actually show a haplogroup diversity comparable to that attested in other late Iron Age populations: a similar diversity is seen, for example, among Germanic, Baltic, and Balto-Finnic peoples of the Baltic region; among East Germanic or Scythians of the north Pontic region; or among Mediterranean peoples sampled to date. Iron Age peoples show thus a complex sociopolitical setting that overcame the previous patrilineal homogeneity of Bronze Age expansions.

PCA and ADMIXTURE for Shirenzigou Samples. Modified from the paper to include samples related to Yamnaya in black squares, labels of modern populations, and a dotted line with the cline formed by Shirenzigou, from (Yamnaya-like) Afanasievo to Central and East Asian-like populations. In red circles, samples with best fit for Andronovo-like ancestry. In green circles, samples with Han-related admixture.

M15-2 (with Han-related ancestry) is of the rare haplogroup Q1a-M120, while the samples with highest Steppe_MLBA-related ancestry are of hg. R1b-PH155, which points to their recent origin among Yuezhi, or to Hun-related populations showing an admixture related to the proto-historic nomads of the Gansu and Qinghai grasslands.

The expansion of Chemurchek-related peoples was probably associated more with hg. Q1a (dubious if it’s a Pre-ISOGG 2017 nomenclature, hence possibly Q1b), a haplogroup that might be found in Khvalynsk as a “significant minority” according to Anthony (2019), and it might also be attested in sampled individuals from Afanasievo in its late phase. This might be, therefore, a case similar to the early expansion of Indo-Europeans with R1b-V1636 lineages through the Volga – North Caucasus region, and of the later expansion with I2a-L699 lineages into the Balkans.

Haplogroup Q1a2-M25 is found in individual X3, whose Steppe ancestry is likely a combination of Afanasievo plus Andronovo-like ancestry heavily admixed with Hezhen/Ulchi-like populations, in line with the expected recent contacts with the neighbouring Xiongnu, Yuezhi, and other population movements affecting eastern Xinjiang.

Sample M4, which packs the most Afanasievo-like ancestry, is of hg. R1a-Z645, which – like sample M8R1 of hg. O – is most likely related to haplogroup resurgence events of local populations, which left the predominant Afanasievo-like admixture brought by builders of stone burials essentially intact, evidenced by the almost 100% of R1a found in the Xiaohe cemetery – and in most of the early Andronovo horizon – and among expanding Kangju and Wusun, as well as by the prevalence of hg. O among sampled East Asian populations.

A question that will only be answered with more samples is how and when the prevalent R1b-L23 and Q1b lineages among Afanasievo-related peoples began to be replaced to reach the high variability seen in Shirenzigou. Given the pastoralist nature of peoples around Tian Shan, the succeeding expansions of Proto-Tocharians, and the late isolation of different Common Tocharian groups, it is more than likely that this variability represents a late and local phenomenon within Xinjiang itself.

tocharians-antiquity
Peoples of Xinjiang during Antiquity. See full culture and ancient DNA maps.

Conclusion

Tocharians are one of the main pillars that confirm the Late Proto-Indo-European homeland of the R1b-rich populations of the Don-Volga region. There is already:

Just like the East Bell Beaker expansion from Yamnaya Hungary has confirmed that Corded Ware peoples did not partake in spreading Indo-European languages (spreading Uralic languages instead), data on the expansion of Tocharian speakers from Afanasievo to the Tian Shan was always there; population genomics is merely helping to connect the dots.

In summary, genetic research is supporting the expected linguistic expansions of the Neolithic and Bronze Age step by step, slowly but surely.

Related

Yamnaya ancestry: mapping the Proto-Indo-European expansions

steppe-ancestry-expansion-europe

The latest papers from Ning et al. Cell (2019) and Anthony JIES (2019) have offered some interesting new data, supporting once more what could be inferred since 2015, and what was evident in population genomics since 2017: that Proto-Indo-Europeans expanded under R1b bottlenecks, and that the so-called “Steppe ancestry” referred to two different components, one – Yamnaya or Steppe_EMBA ancestry – expanding with Proto-Indo-Europeans, and the other one – Corded Ware or Steppe_MLBA ancestry – expanding with Uralic speakers.

The following maps are based on formal stats published in the papers and supplementary materials from 2015 until today, mainly on Wang et al. (2018 & 2019), Mathieson et al. (2018) and Olalde et al. (2018), and others like Lazaridis et al. (2016), Lazaridis et al. (2017), Mittnik et al. (2018), Lamnidis et al. (2018), Fernandes et al. (2018), Jeong et al. (2019), Olalde et al. (2019), etc.

NOTE. As in the Corded Ware ancestry maps, the selected reports in this case are centered on the prototypical Yamnaya ancestry vs. other simplified components, so everything else refers to simplistic ancestral components widespread across populations that do not necessarily share any recent connection, much less a language. In fact, most of the time they clearly didn’t. They can be interpreted as “EHG that is not part of the Yamnaya component”, or “CHG that is not part of the Yamnaya component”. They can’t be read as “expanding EHG people/language” or “expanding CHG people/language”, at least no more than maps of “Steppe ancestry” can be read as “expanding Steppe people/language”. Also, remember that I have left the default behaviour for color classification, so that the highest value (i.e. 1, or white colour) could mean anything from 10% to 100% depending on the specific ancestry and period; that’s what the legend is for… But, fere libenter homines id quod volunt credunt (“men readily believe what they want to believe”).

Sections:

  1. Neolithic or the formation of Early Indo-European
  2. Eneolithic or the expansion of Middle Proto-Indo-European
  3. Chalcolithic / Early Bronze Age or the expansion of Late Proto-Indo-European
  4. European Early Bronze Age and MLBA or the expansion of Late PIE dialects

1. Neolithic

Anthony (2019) agrees with the most likely explanation of the CHG component found in Yamnaya, as derived from steppe hunter-fishers close to the lower Volga basin. The ultimate origin of this specific CHG-like component that eventually formed part of the Pre-Yamnaya ancestry is not clear, though:

The hunter-fisher camps that first appeared on the lower Volga around 6200 BC could represent the migration northward of un-admixed CHG hunter-fishers from the steppe parts of the southeastern Caucasus, a speculation that awaits confirmation from aDNA.

neolithic-chg-ancestry
Natural neighbor interpolation of CHG ancestry among Neolithic populations. See full map.

The typical EHG component that eventually formed part of Pre-Yamnaya ancestry came from the Middle Volga Basin, most likely close to the Samara region, as shown by the sampled Samara hunter-gatherer (ca. 5600-5500 BC):

After 5000 BC domesticated animals appeared in these same sites in the lower Volga, and in new ones, and in grave sacrifices at Khvalynsk and Ekaterinovka. CHG genes and domesticated animals flowed north up the Volga, and EHG genes flowed south into the North Caucasus steppes, and the two components became admixed.

neolithic-ehg-ancestry
Natural neighbor interpolation of EHG ancestry among Neolithic populations. See full map.

To the west, in the Dnieper-Dniester area, WHG became the dominant ancestry after the Mesolithic, at the expense of EHG, revealing a likely mating network reaching to the north into the Baltic:

Like the Mesolithic and Neolithic populations here, the Eneolithic populations of Dnieper-Donets II type seem to have limited their mating network to the rich, strategic region they occupied, centered on the Rapids. The absence of CHG shows that they did not mate frequently if at all with the people of the Volga steppes (…)

neolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Neolithic populations. See full map.

North-West Anatolia Neolithic ancestry, typical of expanding Early European farmers, is found up to the border of the Dniester, as Anthony (2007) had predicted.

neolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Neolithic populations. See full map.

2. Eneolithic

From Anthony (2019):

After approximately 4500 BC the Khvalynsk archaeological culture united the lower and middle Volga archaeological sites into one variable archaeological culture that kept domesticated sheep, goats, and cattle (and possibly horses). In my estimation, Khvalynsk might represent the oldest phase of PIE.

(…) this middle Volga mating network extended down to the North Caucasian steppes, where at cemeteries such as Progress-2 and Vonyuchka, dated 4300 BC, the same Khvalynsk-type ancestry appeared, an admixture of CHG and EHG with no Anatolian Farmer ancestry, with steppe-derived Y-chromosome haplogroup R1b. These three individuals in the North Caucasus steppes had higher proportions of CHG, overlapping Yamnaya. Without any doubt, a CHG population that was not admixed with Anatolian Farmers mated with EHG populations in the Volga steppes and in the North Caucasus steppes before 4500 BC. We can refer to this admixture as pre-Yamnaya, because it makes the best currently known genetic ancestor for EHG/CHG R1b Yamnaya genomes.

From Wang et al (2019):

Three individuals from the sites of Progress 2 and Vonyuchka 1 in the North Caucasus piedmont steppe (‘Eneolithic steppe’), which harbour EHG and CHG related ancestry, are genetically very similar to Eneolithic individuals from Khvalynsk II and the Samara region. This extends the cline of dilution of EHG ancestry via CHG-related ancestry to sites immediately north of the Caucasus foothills

eneolithic-pre-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Neolithic populations. See full map. This map corresponds roughly to the map of Khvalynsk-Novodanilovka expansion, and in particular to the expansion of horse-head pommel-scepters (read more about Khvalynsk, and specifically about horse symbolism)

NOTE. Unpublished samples from Ekaterinovka have been previously reported as within the R1b-L23 tree. Interestingly, although the Varna outlier is a female, the Balkan outlier from Smyadovo shows two positive SNP calls for hg. R1b-M269. However, its poor coverage makes its most conservative haplogroup prediction R-M343.

The formation of this Pre-Yamnaya ancestry sets this Volga-Caucasus Khvalynsk community apart from the rest of the EHG-like population of eastern Europe.

eneolithic-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Eneolithic populations. See full map.

Anthony (2019) seems to rely on ADMIXTURE graphics when he writes that the late Sredni Stog sample from Alexandria shows “80% Khvalynsk-type steppe ancestry (CHG&EHG)”. While this seems the most logical conclusion of what might have happened after the Suvorovo-Novodanilovka expansion through the North Pontic steppes (see my post on “Steppe ancestry” step by step), formal stats have not confirmed that.

In fact, analyses published in Wang et al. (2019) rejected that Corded Ware groups are derived from this Pre-Yamnaya ancestry, a reality that had already been hinted at in Narasimhan et al. (2018), when Steppe_EMBA showed a poor fit for expanding Srubna-Andronovo populations. Hence the need to consider the whole CHG component of the North Pontic area separately:

eneolithic-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Eneolithic populations. See full map. You can read more about population movements in the late Sredni Stog and closer to the Proto-Corded Ware period.

NOTE. Fits for WHG + CHG + EHG in Neolithic and Eneolithic populations are taken in part from Mathieson et al. (2019) supplementary materials (download Excel here). Unfortunately, while data on the Ukraine_Eneolithic outlier from Alexandria abounds, I don’t have specific data on the so-called ‘outlier’ from Dereivka compared to the other two analyzed together, so these maps of CHG and EHG expansion are possibly showing a lesser distribution to the west than the real one ca. 4000-3500 BC.

eneolithic-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Eneolithic populations. See full map.

Anatolia Neolithic ancestry clearly spread to the east into the north Pontic area through a Middle Eneolithic mating network, most likely opened after the Khvalynsk expansion:

eneolithic-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Eneolithic populations. See full map.
eneolithic-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Eneolithic populations. See full map.

Regarding Y-chromosome haplogroups, Anthony (2019) insists on the evident association of Khvalynsk, Yamnaya, and the spread of Pre-Yamnaya and Yamnaya ancestry with the expansion of elite R1b-L754 (and some I2a2) individuals:

eneolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Early Eneolithic in the Pontic-Caspian steppes. See full map, and see culture, ADMIXTURE, Y-DNA, and mtDNA maps of the Early Eneolithic and Late Eneolithic.

3. Early Bronze Age

Data from Wang et al. (2019) show that Corded Ware-derived populations do not have good fits for Eneolithic_Steppe-like ancestry, no matter the model. In other words: Corded Ware populations show not only a higher contribution of Anatolia Neolithic ancestry (ca. 20-30% compared to the ca. 2-10% of Yamnaya); they show a different EHG + CHG combination compared to the Pre-Yamnaya one.

eneolithic-steppe-best-fits
Supplementary Table 13. P values of rank=2 and admixture proportions in modelling Steppe ancestry populations as a three-way admixture of Eneolithic steppe, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Test, Eneolithic_steppe, Anatolian_Neolithic, WHG.
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Yamnaya Kalmykia and Afanasievo show the closest fits to the Eneolithic population of the North Caucasian steppes, rejecting thus sizeable contributions from Anatolia Neolithic and/or WHG, as shown by the SD values. Both probably show then a Pre-Yamnaya ancestry closest to the late Repin population.

wang-eneolithic-steppe-caucasus-yamnaya
Modelling results for the Steppe and Caucasus cluster. Admixture proportions based on (temporally and geographically) distal and proximal models, showing additional AF ancestry in Steppe groups and additional gene flow from the south in some of the Steppe groups as well as the Caucasus groups. See tables above. Modified from Wang et al. (2019). Within a blue square, Yamnaya-related groups; within a cyan square, Corded Ware-related groups. Green background behind best p-values. In red circle, SD of AF/WHG ancestry contribution in Afanasevo and Yamnaya Kalmykia, with ranges that almost include 0%.

EBA maps include data from Wang et al. (2018) supplementary materials, specifically unpublished Yamnaya samples from Hungary that appeared in analysis of the preprint, but which were taken out of the definitive paper. Their location among Yamnaya settlers from Hungary is speculative, although most uncovered kurgans in Hungary are concentrated in the Tisza-Danube interfluve.

eba-yamnaya-ancestry
Natural neighbor interpolation of Pre-Yamnaya ancestry among Early Bronze Age populations. See full map. This map corresponds roughly with the known expansion of late Repin/Yamnaya settlers.

The Y-chromosome bottleneck of elite males from Proto-Indo-European clans under R1b-L754 and some I2a2 subclades, already visible in the Khvalynsk sampling, became even more noticeable in the subsequent expansion of late Repin/early Yamnaya elites under R1b-L23 and I2a-L699:

chalcolithic-early-y-dna
Y-DNA haplogroups in West Eurasia during the Yamnaya expansion. See full map and maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Chalcolithic and Yamnaya Hungary.

Maps of CHG, EHG, Anatolia Neolithic, and probably WHG show the expansion of these components among Corded Ware-related groups in North Eurasia, apart from other cultures close to the Caucasus:

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you can read the post Corded Ware ancestry in North Eurasia and the Uralic expansion.

eba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Early Bronze Age populations. See full map.
eba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Early Bronze Age populations. See full map.
eba-whg-ancestry
Natural neighbor interpolation of WHG ancestry among Early Bronze Age populations. See full map.
eba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Early Bronze Age populations. See full map.
eba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Early Bronze Age populations. See full map.

4. Middle to Late Bronze Age

The following maps show the most likely distribution of Yamnaya ancestry during the Bell Beaker-, Balkan-, and Sintashta-Potapovka-related expansions.

4.1. Bell Beakers

The amount of Yamnaya ancestry is probably overestimated among populations where Bell Beakers replaced Corded Ware. A map of Yamnaya ancestry among Bell Beakers gets trickier for the following reasons:

  • Expanding Repin peoples of Pre-Yamnaya ancestry must have had admixture through exogamy with late Sredni Stog/Proto-Corded Ware peoples during their expansion into the North Pontic area, and Sredni Stog in turn had probably some Pre-Yamnaya admixture, too (although they don’t appear in the simplistic formal stats above). This is supported by the increase of Anatolia farmer ancestry in more western Yamna samples.
  • Later, Yamnaya admixed through exogamy with Corded Ware-like populations in Central Europe during their expansion. Even samples from the Middle to Upper Danube and around the Lower Rhine will probably show increasing contributions of Steppe_MLBA, at the same time as they show an increasing proportion of EEF-related ancestry.
  • To complicate things further, the late Corded Ware Esperstedt family (from ca. 2500 BC or later) shows, in turn, what seems like a recent admixture with Yamnaya vanguard groups, with the sample of highest Yamnaya ancestry being the paternal uncle of other individuals (all of hg. R1a-M417), suggesting that there might have been many similar Central European mating networks from the mid-3rd millennium BC on, of (mainly) Yamnaya-like R1b elites displaying a small proportion of CW-like ancestry admixing through exogamy with Corded Ware-like peoples who already had some Yamnaya ancestry.
mlba-yamnaya-ancestry
Natural neighbor interpolation of Yamnaya ancestry among Middle to Late Bronze Age populations (Esperstedt CWC site close to BK_DE, label is hidden by BK_DE_SAN). See full map. You can see how this map correlates with the map of Late Copper Age migrations and the Yamnaya into Bell Beaker expansion.

NOTE. Terms like “exogamy”, “male-driven migration”, and “sex bias”, are not only based on the Y-chromosome bottlenecks visible in the different cultural expansions since the Palaeolithic. Despite the scarce sampling available in 2017 for analysis of “Steppe ancestry”-related populations, it appeared to show already a male sex bias in Goldberg et al. (2017), and it has been confirmed for Neolithic and Copper Age population movements in Mathieson et al. (2018) – see Supplementary Table 5. The analysis of male-biased expansion of “Steppe ancestry” in CWC Esperstedt and Bell Beaker Germany is, for the reasons stated above, not very useful to distinguish their mutual influence, though.

Based on data from Olalde et al. (2019), Bell Beakers from Germany are the closest sampled ones to expanding East Bell Beakers, and those close to the Rhine – i.e. French, Dutch, and British Beakers in particular – show a clear excess “Steppe ancestry” due to their exogamy with local Corded Ware groups:

Only one 2-way model fits the ancestry in Iberia_CA_Stp with P-value>0.05: Germany_Beaker + Iberia_CA. Finding a Bell Beaker-related group as a plausible source for the introduction of steppe ancestry into Iberia is consistent with the fact that some of the individuals in the Iberia_CA_Stp group were excavated in Bell Beaker associated contexts. Models with Iberia_CA and other Bell Beaker groups such as France_Beaker (P-value=7.31E-06), Netherlands_Beaker (P-value=1.03E-03) and England_Beaker (P-value=4.86E-02) failed, probably because they have slightly higher proportions of steppe ancestry than the true source population.

olalde-iberia-chalcolithic

The exogamy with Corded Ware-like groups in the Lower Rhine Basin seems at this point undeniable, as is the origin of Bell Beakers around the Middle-Upper Danube Basin from Yamnaya Hungary.

To avoid this excess “Steppe ancestry” showing up in the maps, since Bell Beakers from Germany pack the most Yamnaya ancestry among East Bell Beakers outside Hungary (ca. 51.1% “Steppe ancestry”), I equated this maximum with BK_Scotland_Ach (which shows ca. 61.1% “Steppe ancestry”, highest among western Beakers), and applied a simple rule of three for “Steppe ancestry” in Dutch and British Beakers.
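The adjustment just described can be sketched in a few lines. This is only my reading of the rule of three: BK_Scotland_Ach (61.1%) is mapped onto the German Beaker maximum (51.1%), and every other Dutch or British value is scaled by the same ratio. The function name and the example input are hypothetical, chosen for illustration; only the two reference percentages come from the text above.

```python
# Reference values quoted in the text (percent "Steppe ancestry").
GERMANY_BEAKER_MAX = 51.1  # Bell Beakers from Germany
SCOTLAND_ACH = 61.1        # BK_Scotland_Ach, highest among western Beakers


def rescale_steppe(raw_percent: float) -> float:
    """Rule of three: 61.1 is to 51.1 as raw_percent is to the result.

    Hypothetical helper, not taken from any published pipeline.
    """
    return raw_percent * GERMANY_BEAKER_MAX / SCOTLAND_ACH


if __name__ == "__main__":
    # The second value is made up for illustration; it is not a published estimate.
    for label, raw in [("BK_Scotland_Ach", 61.1), ("hypothetical_Beaker_group", 55.0)]:
        print(f"{label}: {raw:.1f}% -> {rescale_steppe(raw):.1f}%")
```

By construction the Scottish maximum lands exactly on 51.1%, and every other group keeps its position relative to that maximum.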

NOTE. Formal stats for “Steppe ancestry” in Bell Beaker groups are available in Olalde et al. (2018) supplementary materials (PDF). I didn’t apply this adjustment to Bk_FR groups because of the R1b Bell Beaker sample from the Champagne/Alsace region reported by Samantha Brunel that will pack more Yamnaya ancestry than any other sampled Beaker to date, hence probably driving the Yamnaya ancestry up in French samples.

The most likely outcome in the following years, when Yamnaya and Corded Ware ancestry are investigated separately, is that Yamnaya ancestry will be much lower the farther away from the Middle and Lower Danube region, similar to the case in Iberia, so the map above probably overestimates this component in most Beakers to the north of the Danube. Even the late Hungarian Beaker samples, who pack the highest Yamnaya ancestry (up to 75%) among Beakers, represent likely a back-migration of Moravian Beakers, and will probably show a contribution of Corded Ware ancestry due to the exogamy with local Moravian groups.

Despite this decreasing admixture as Bell Beakers spread westward, the explosive expansion of Yamnaya R1b male lineages (in the words of David Reich) and the radical replacement of local ones – whether derived from Corded Ware or Neolithic groups – shows the true extent of the North-West Indo-European expansion in Europe:

chalcolithic-late-y-dna
Y-DNA haplogroups in West Eurasia during the Bell Beaker expansion. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Late Copper Age and of the Yamnaya-Bell Beaker transition.

4.2. Palaeo-Balkan

There is scarce data on Palaeo-Balkan movements yet, although it is known that:

  1. Yamnaya ancestry appears among Mycenaeans, with the Yamnaya Bulgaria sample being its best current ancestral fit;
  2. the emergence of steppe ancestry and R1b-M269 in the eastern Mediterranean was associated with Ancient Greeks;
  3. Thracians, Albanians, and Armenians also show R1b-M269 subclades and “Steppe ancestry”.

4.3. Sintashta-Potapovka-Filatovka

Interestingly, Potapovka is the only Corded Ware derived culture that shows good fits for Yamnaya ancestry, despite having replaced Poltavka in the region under the same Corded Ware-like (Abashevo) influence as Sintashta.

This proves that there was a period of admixture in the Pre-Proto-Indo-Iranian community between CWC-like Abashevo and Yamnaya-like Catacomb-Poltavka herders in the Sintashta-Potapovka-Filatovka community, probably more easily detectable in this group because of the specific temporal and geographic sampling available.

srubnaya-yamnaya-ehg-chg-ancestry
Supplementary Table 14. P values of rank=3 and admixture proportions in modelling Steppe ancestry populations as a four-way admixture of distal sources EHG, CHG, Anatolian_Neolithic and WHG using 14 outgroups.
Left populations: Steppe cluster, EHG, CHG, WHG, Anatolian_Neolithic
Right populations: Mbuti.DG, Ust_Ishim.DG, Kostenki14, MA1, Han.DG, Papuan.DG, Onge.DG, Villabruna, Vestonice16, ElMiron, Ethiopia_4500BP.SG, Karitiana.DG, Natufian, Iran_Ganj_Dareh_Neolithic.

Srubnaya ancestry shows a best fit with non-Pre-Yamnaya ancestry, i.e. with different CHG + EHG components – possibly because the more western Potapovka (ancestral to Proto-Srubnaya Pokrovka) also showed good fits for it. Srubnaya shows poor fits for Pre-Yamnaya ancestry probably because Corded Ware-like (Abashevo) genetic influence increased during its formation.

On the other hand, more eastern Corded Ware-derived groups like Sintashta and its more direct offshoot Andronovo show poor fits with this model, too, but their fits are still better than those including Pre-Yamnaya ancestry.

mlba-ehg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya EHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-chg-ancestry
Natural neighbor interpolation of non-Pre-Yamnaya CHG ancestry among Middle to Late Bronze Age populations. See full map.
mlba-anatolia-farmer-ancestry
Natural neighbor interpolation of Anatolia Neolithic ancestry among Middle to Late Bronze Age populations. See full map.
mlba-iran-chl-ancestry
Natural neighbor interpolation of Iran Chl. ancestry among Middle to Late Bronze Age populations. See full map.

NOTE. For maps with actual formal stats of Corded Ware ancestry from the Early Bronze Age to the modern times, you should read the post Corded Ware ancestry in North Eurasia and the Uralic expansion instead.

The bottleneck of Proto-Indo-Iranians under R1a-Z93 was not yet complete by the time when the Sintashta-Potapovka-Filatovka community expanded with the Srubna-Andronovo horizon:

early-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the European Early Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Bronze Age.

4.4. Afanasevo

At the end of the Afanasevo culture, at least three samples show hg. Q1b (ca. 2900-2500 BC), which seemed to point to a resurgence of local lineages, despite continuity of the prototypical Pre-Yamnaya ancestry. On the other hand, Anthony (2019) makes this cryptic statement:

Yamnaya men were almost exclusively R1b, and pre-Yamnaya Eneolithic Volga-Caspian-Caucasus steppe men were principally R1b, with a significant Q1a minority.

Since the only available samples from the Khvalynsk community are R1b (x3), Q1a (x1), and R1a (x1), it seems strange that Anthony would talk about a “significant minority”, unless Q1a (potentially Q1b in the newer nomenclature) pops up in some more individuals among the ca. 30 new samples to be published. Because he also mentions I2a2 as appearing in one elite burial, it seems Q1a (like R1a-M459) will not appear under elite kurgans, although it is still possible that hg. Q1a was involved in the expansion of Afanasevo to the east.

middle-bronze-age-y-dna
Y-DNA haplogroups in West Eurasia during the Middle Bronze Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Middle Bronze Age and the Late Bronze Age.

Okunevo, which replaced Afanasevo in the Altai region, shows a majority of hg. Q1b, but also some R1b-M269 samples typical of Afanasevo, suggesting partial genetic continuity.

NOTE. Other sampled Siberian populations clearly show a variety of Q subclades that likely expanded during the Palaeolithic, such as Baikal EBA samples from Ust’Ida and Shamanka with a majority of Q1b, and hg. Q reported from Elunino, Sagsai, Khövsgöl, and also among peoples of the Srubna-Andronovo horizon (the Krasnoyarsk MLBA outlier), and in Karasuk.

From Damgaard et al. Science (2018):

(…) in contrast to the lack of identifiable admixture from Yamnaya and Afanasievo in the CentralSteppe_EMBA, there is an admixture signal of 10 to 20% Yamnaya and Afanasievo in the Okunevo_EMBA samples, consistent with evidence of western steppe influence. This signal is not seen on the X chromosome (qpAdm P value for admixture on X 0.33 compared to 0.02 for autosomes), suggesting a male-derived admixture, also consistent with the fact that 1 of 10 Okunevo_EMBA males carries a R1b1a2a2 Y chromosome related to those found in western pastoralists. In contrast, there is no evidence of western steppe admixture among the more eastern Baikal region Bronze Age (~2200 to 1800 BCE) samples.

This Yamnaya ancestry has been also recently found to be the best fit for the Iron Age population of Shirenzigou in Xinjiang – where Tocharian languages were attested centuries later – despite the haplogroup diversity acquired during their evolution, likely through an intermediate Chemurchek culture (see a recent discussion on the elusive Proto-Tocharians).

Haplogroup diversity seems to be common in Iron Age populations all over Eurasia, most likely due to the spread of different types of sociopolitical structures where alliances played a more relevant role in the expansion of peoples. A well-known example of this is the spread of Akozino warrior-traders in the whole Baltic region under a partial N1a-VL29-bottleneck associated with the emerging chiefdom-based systems under the influence of expanding steppe nomads.

early-iron-age-y-dna
Y-DNA haplogroups in West Eurasia during the Early Iron Age. See full map and see maps of cultures, ADMIXTURE, Y-DNA, and mtDNA of the Early Iron Age and Late Iron Age.

Surprisingly, then, Proto-Tocharians from Shirenzigou pack up to 74% Yamnaya ancestry, in spite of the 2,000 years that separate them from the demise of the Afanasevo culture. They show more Yamnaya ancestry than any other population by that time, being thus a sort of Late PIE fossils not only in their archaic dialect, but also in their genetic profile:

shirenzigou-afanasievo-yamnaya-andronovo-srubna-ulchi-han

The recent intrusion of Corded Ware-like ancestry, as well as the variable admixture with Siberian and East Asian populations, both point to the known intense Old Iranian and Old/Middle Chinese contacts. The scarce Proto-Samoyedic and Proto-Turkic loans in Tocharian suggest a rather loose, probably more distant connection with East Uralic and Altaic peoples from the forest-steppe and steppe areas to the north (read more about external influences on Tocharian).

Interestingly, both R1b samples, MO12 and M15-2 – likely of Asian R1b-PH155 branch – show a best fit for Andronovo/Srubna + Hezhen/Ulchi ancestry, suggesting a likely connection with Iranians to the east of Xinjiang, who later expanded as the Wusun and Kangju. How they might have been related to Huns and Xiongnu individuals, who also show this haplogroup, is yet unknown, although Huns also show hg. R1a-Z93 (probably most R1a-Z2124) and Steppe_MLBA ancestry, earlier associated with expanding Iranian peoples of the Srubna-Andronovo horizon.

All in all, it seems that prehistoric movements explained through the lens of genetic research fit perfectly well the linguistic reconstruction of Proto-Indo-European and Proto-Uralic.

Related

Iron Age Tocharians of Yamnaya ancestry from Afanasevo show hg. R1b-M269 and Q1a1

New open access Ancient Genomes Reveal Yamnaya-Related Ancestry and a Potential Source of Indo-European Speakers in Iron Age Tianshan, by Ning et al. Current Biology (2019).

Interesting excerpts (emphasis mine, changes for clarity):

Here, we report the first genome-wide data of 10 ancient individuals from northeastern Xinjiang. They are dated to around 2,200 years ago and were found at the Iron Age Shirenzigou site. We find them to be already genetically admixed between Eastern and Western Eurasians. We also find that the majority of the East Eurasian ancestry in the Shirenzigou individuals is related to northeastern Asian populations, while the West Eurasian ancestry is best presented by ∼20% to 80% Yamnaya-like ancestry. Our data thus suggest a Western Eurasian steppe origin for at least part of the ancient Xinjiang population. Our findings furthermore support a Yamnaya-related origin for the now extinct Tocharian languages in the Tarim Basin, in southern Xinjiang.

Haplogroups

The dominant mtDNA lineages of the Shirenzigou people are commonly found in modern and ancient West Eurasian populations, such as U4, U5, and H, while they also have East Eurasian-specific haplogroups A, D4, and G3, preliminarily documenting admixed ancestry from eastern and western Eurasia.

The admixture profile is also shown on the paternal Y chromosome side that 4 out of 6 males in Shirenzigou (Figure S2) belong to the West Eurasian-specific haplogroup R1b (n = 2) and East Eurasian-specific haplogroup Q1a (n = 2), the former is predominant in ancient Yamnaya and nearly 100% in Afanasievo, different from the Middle and Late Bronze Age Steppe groups (Steppe_MLBA) such as Andronovo, [Potapovka], Srubnaya, and Sintashta whose Y chromosomal haplogroup is mainly R1a.

tocharians-y-dna-mtdna

Autosomal

We first carried out principal component analysis (PCA) to assess the genetic affinities of the ancient individuals qualitatively by projecting them onto present-day Eurasian variation (Figure 2). We observed a distinct separation between East and West Eurasians. Our ancient Shirenzigou samples and present-day populations from Central Asia and northwestern China form a genetic cline from East to West in the first PC. The distribution of Shirenzigou samples on the cline is relatively scattered with two major clusters, one being closer to modern-day Uygurs and Kazakhs and the other being closer to recently published ancient Saka and Huns from the Tianshan in Kazakhstan (…).

We applied a formal admixture test using f3 statistics in the form of f3 (Shirenzigou; X, Y) where X and Y are worldwide populations that might be the genetic sources for the Shirenzigou individuals. We observed the most significant signals of admixture in the Shirenzigou samples when using Yamnaya_Samara or Srubnaya as the West Eurasian source and some Northern Asians or Koreans as the East Eurasian source (Table S1). We also plotted the outgroup f3 statistics in the form of f3 (Mbuti; X, Anatolia_Neolithic) and f3 (Mbuti; X, Kostenki14) to visualize the allele sharing between population X and Anatolian farmers. As shown in Figure S3, the Steppe_MLBA populations including Srubnaya, Andronovo, and Sintashta were shifted toward farming populations compared with Yamnaya groups and the Shirenzigou samples. This observation is consistent with ADMIXTURE analysis that Steppe_MLBA populations have an Anatolian and European farmer-related component that Yamnaya groups and the Shirenzigou individuals do not seem to have. The analysis consistently suggested Yamnaya-related Steppe populations were the better source in modeling the West Eurasian ancestry in Shirenzigou.
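For readers unfamiliar with the statistic, the admixture test described above can be illustrated with a minimal Python sketch. It computes a naive f3(C; X, Y) as the mean of (c - x)(c - y) over SNPs, and it deliberately omits the finite-sample bias correction and the block-jackknife standard errors that real analyses rely on. The function name `f3_naive` and all allele frequencies are made up for illustration; none of them come from the paper.

```python
def f3_naive(target, source_x, source_y):
    """Naive f3(target; X, Y): mean over SNPs of (c - x) * (c - y).

    Inputs are per-SNP allele frequencies (same SNP order in all three).
    No bias correction or standard error is computed here.
    """
    if not (len(target) == len(source_x) == len(source_y)):
        raise ValueError("all populations need the same number of SNPs")
    if not target:
        raise ValueError("at least one SNP is required")
    products = [(c - x) * (c - y)
                for c, x, y in zip(target, source_x, source_y)]
    return sum(products) / len(products)


# Hypothetical allele frequencies at four SNPs (invented numbers).
west = [0.9, 0.8, 0.1, 0.2]
east = [0.1, 0.2, 0.9, 0.8]
# A population built as an even mixture of the two sources.
mixed = [0.5 * w + 0.5 * e for w, e in zip(west, east)]

f3_mixed = f3_naive(mixed, west, east)    # negative: target lies between sources
f3_unmixed = f3_naive(west, mixed, east)  # positive: 'west' is not a mixture
print(f3_mixed, f3_unmixed)
```

A clearly negative value for the target is what is read as a signal of admixture between the two sources; a positive value is simply uninformative.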

PCA and ADMIXTURE for Shirenzigou Samples. Modified from the original to include in black squares samples related to Yamnaya.

Genetic Composition of Iron Age Shirenzigou Individuals

We continued to use qpAdm to estimate the admixture proportions in the Shirenzigou samples by using different pairs of source populations, such as Yamnaya_Samara, Afanasievo, Srubnaya, Andronovo, BMAC culture (Bustan_BA and Sappali_Tepe_BA) and Tianshan_Hun as the West Eurasian source and Han, Ulchi, Hezhen, Shamanka_EN as the East Eurasian source. In all cases, Yamnaya, Afanasievo, or Tianshan_Hun always provide the best model fit for the Shirenzigou individuals, while Srubnaya, Andronovo, Bustan_BA and Sappali_Tepe_BA only work in some cases. The Yamnaya_Samara or Afanasievo-related ancestry ranges from ∼20% to 80% in different Shirenzigou individuals, consistent with the scattered distribution on the East-West cline in the PCA.
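qpAdm itself works on f4 statistics relative to a set of outgroups, which is beyond a short example. As a much simpler stand-in (an assumption of this sketch, not the method of the paper), the idea of a two-source admixture proportion can be shown by fitting target = α·West + (1 − α)·East by least squares on allele frequencies. The function name `mixture_fraction` and all frequencies below are hypothetical.

```python
def mixture_fraction(target, source_a, source_b):
    """Least-squares alpha for target ~ alpha * A + (1 - alpha) * B.

    This is NOT qpAdm; it is a toy estimate on raw allele frequencies.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("all populations need the same number of SNPs")
    # Rearranged model: (t - b) = alpha * (a - b), solved in closed form.
    numerator = sum((t - b) * (a - b)
                    for t, a, b in zip(target, source_a, source_b))
    denominator = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if denominator == 0:
        raise ValueError("the two sources are identical at every SNP")
    return numerator / denominator


# Hypothetical allele frequencies at four SNPs (invented numbers).
west = [0.9, 0.8, 0.1, 0.2]
east = [0.1, 0.2, 0.9, 0.8]

# Two made-up individuals at the extremes of the range quoted in the text.
low = [0.2 * w + 0.8 * e for w, e in zip(west, east)]
high = [0.8 * w + 0.2 * e for w, e in zip(west, east)]

alpha_low = mixture_fraction(low, west, east)
alpha_high = mixture_fraction(high, west, east)
print(round(alpha_low, 3), round(alpha_high, 3))
```

With noisy real data the estimate can fall outside the 0 to 1 range, which is one of the reasons formal tools also report model fit rather than a bare proportion.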


(…) we then modeled Shirenzigou as a three-way admixture of Yamnaya_Samara, Ulchi (or Hezhen) and Han to infer the source from the East Eurasia side that contributed to Shirenzigou. We found the Ulchi or Hezhen and Han-related ancestry had a complicated and uneven distribution in the Shirenzigou samples. Most Shirenzigou individuals derived the majority of their East Eurasian ancestry from Ulchi or Hezhen-related populations, while two individuals, M820 and M15-2, have more Han-related than Ulchi/Hezhen-related ancestry.

One important question remains, though: how and when did these Proto-Tocharian speakers migrate from the Afanasevo culture in the Altai into the Tarim Basin? The traditional answer, now more likely than ever, is through the Chemurchek culture. See e.g. A re-analysis of the Qiemu’erqieke (Shamirshak) cemeteries, Xinjiang, China, by Jia and Betts JIES (2010) 38(4).

Also, given the apparent lack of Corded Ware ancestry (that is, of the extra farmer ancestry that characterizes it), and given that the results already looked suspicious, how likely are the published R1a(xZ93) haplogroups and/or radiocarbon dates of the Xiaohe mummies from Li et al. (2010, 2015)? After all, at such a late date one would have expected generalized admixture with neighbouring Srubna/Andronovo-like populations.


Bronze Age cultures in the Tarim Basin and the elusive Proto-Tocharians


Master’s thesis Shifting Memories: Burial Practices and Cultural Interaction in Bronze Age China: A study of the Xiaohe-Gumugou cemeteries in the Tarim Basin, by Yunyun Yang, Uppsala University, Department of Archaeology and Ancient History (2019).

Summary excerpts, mainly from the conclusions (emphasis mine):

Both the Xiaohe and the Gumugou groups are suggested as possibly originating from southern Siberia or Central Asia and being related to Afanasievo and Andronovo people (Han 1986, 1994; Li et al. 2010, 2015). But recent research suggests that the Xiaohe males are genetically distinct from the Afanasievo males, considering the paternal lineages (Hollard et al. 2018). From genetic evidence, it is suggested that southern Siberia and Central Asia were dominated by Europeans during the Bronze Age. Southern Siberia was predominantly populated by Europeans since the Bronze Age as a result of eastward migration of Kurgan people (Keyser et al. 2009). Central Asia started to have an eastern Eurasian maternal lineage that coexisted with the previous western maternal lineage from around 700 BCE (Lalueza-Fox et al. 2004). Based on the research mentioned above, we can conclude that the Xiaohe and the Gumugou people possibly came from southern Siberia or Central Asia.

Origin of the Xiaohe horizon

There are two hypotheses about the origins of the Xiaohe horizon. The “steppe hypothesis” assumes that the early settlers (Gumugou people) of the Tarim Basin came from the Afanasievo culture in the Minusinsk Basin-Altai Mountains regions (Kuz’mina et al. 2008; Mallory et al. 2008). The “oasis hypothesis” argues that the early settlers were related to the spreading of the oasis-based agricultural groups from the Bactria and Margiana parts of the southern Central Asia area (Chen et al. 1995). Both hypotheses mainly relied on the use of some materials such as animal cattle, sheep/goats, camel hair, and plant wheat, whose origins were bound to western traditions. But these proofs cannot provide enough support to claim that the Xiaohe horizon cultures were from Afanasievo or BMAC cultures, except for telling there were possible cultural connections or interactions among them. What’s more, there were no horses or potteries in the Xiaohe horizon.

It is worth noting that Ephedra plant is commonly thought as a strong candidate of the Soma or Haoma sacred drink for the ancient Indians or Iranians. Soma is the name recorded in the Vedic Brahmanism religious literature Rigveda, Haoma in the Zoroastrianism Avesta, and indicates as a ritual drink from plant juice. The reason to address Ephedra plant to Soma-Haoma drink is mainly because of its ephedrine, which works on muscle strength, low blood pressure, (and asthma) to make people get rid of tiredness (Houben 2013). Furthermore, it is thought that Ephedra with anti-fatigue function gives gods or the dead immortality, longevity, and resurrection (Mahdihassan 1987). From a mobile consideration of Vedic Aryans perspective, it is thought Vedic Aryans made use of Ephedra, cannabis and poppy to produce Soma drink in Margiana, only Ephedra in Bactria and in Indian mountains area, but other substitutes in Indian plains (Shah 2014). From the Ephedra perspective, it is agreeable that the Xiaohe-Gumugou people were related to the Indo-Aryan peoples (Mallory et al. 1997; Wang 2017).

The distribution map of the sites in the Xiaohe cultural horizon.

Burial customs

Both the Xiaohe and the Gumugou groups maintained similar burial customs, but we can distinguish a developing process from the slight diverse ways of the Gumugou cemetery to the highly consistent and advanced technology in making coffins of the Xiaohe cemetery. In terms of the dressing, the dead wore a felt cap, a pair of leather boots, a bracelet twined on the right wrist, and was wrapped in a big felt mantle. The dead in the Xiaohe cemetery also wore a loin-cloth. Commonly, both cemeteries contained burials goods of Ephedra twigs, grains of wheat and millet, grass-made baskets, animal ears (such as calf ears), and livestock. Wooden coffins in the two cemeteries were constructed in a similar way, by assembling two side-planks, two end-boards, a lid consisting of a few short straight boards, and covered with livestock hide (mainly cattle hide in the Xiaohe cemetery and sheep/goats hide in the Gumugou cemetery).

Considering the similar and continuous burial behaviours in the two cemeteries, it can be assumed that both the Xiaohe and the Gumugou societies were stable and consistent. The Xiaohe cemetery had both the special clay-lid wooden coffins and the normal coffins in its early phase (burial layers 4th-5th), then turned to be stable and consistent with the normal coffins (burial layers 1st-3rd), and have developed better construction of the boat-shape coffins. The Gumugou cemetery contained two main burial patterns, type I; the sun-radiating-spokes burials and type II; the normal burials, which coexisted during the same time. Burials of type II were similar but not limited to strict rules. Burials in both the Xiaohe and the Gumugou cemetery were fairly heterogeneous, and the clay-lid wooden coffins in the Xiaohe cemetery and the sun-radiating-spokes burials in the Gumugou cemetery only took up in a small percentage of each cemetery. These special burial types could indicate special roles of the dead in their related societies. Either the dead had high social positions or possibly they actually had a different ancestry origin. It is argued here that the latter is something that is quite possible, considering the mixed populations in the two cemeteries.

The sun-radiating-spokes burials share some features with a similar type of grave, constructed of circular stone kerbs of the stone-pit graves. The sun-radiating-spokes burials might represent an adaption to the local desert environment, which had better access to wood rather than stones. Circular stone kerbs with stone-pit in centre were widely seen in Bronze Age Afanasievo and Andronovo burials, and also in the late Bronze Age and early Iron Age burials along the Tian Shan. The present study suggests a high possibility that the six males buried in the sun-radiating-spokes graves came from the contemporary parallel Andronovo horizon, and kept some of their own ancestry memories in an adapted way.

An assumption of the spreading/expansion routes stone burial construct.

Societies

Although the Xiaohe and Gumugou societies were stable and consistent, it does not mean that the societies were isolated, and we can see strong indications of them being open to the outside. With time, the Xiaohe population were getting even more diverse origins, as newcomers kept joining the group from outside. However, the burial behaviours in the Xiaohe cemetery did not change as a consequence of these additions. This suggests that the newcomers inherited the local burial customs, and strongly indicates that they became part of the community and adopted the new social identity, possibly through marriage. As a result, the diverse populations can well explain the coexistence of different cultural elements in the burials, e.g. cattle, sheep/goats, camel hair (from Central Asia), grains of wheat (from the west) and millet (from the east), etc.

The Xiaohe and the Gumugou societies were similar, but the Xiaohe society developed to a more advanced level both in economy and in social structure. First, the oasis-based economic system of the Xiaohe and the Gumugou had similar husbandry, but later this was developed to different extent. Both societies mainly relied on livestock, and while the Xiaohe people favoured cattle, the Gumugou people favoured sheep/goats. The two societies also developed agriculture, which can be seen from the grains of wheat and millet. It has been shown that grains of wheat are bread wheat. The Xiaohe people also cooked porridge with millet and milk, and had dairy products.

From these evidences, we can assume that the Xiaohe people have developed a stronger economic level. Secondly, the Xiaohe society had more distinguished gender roles, resulting in different social roles for men and women in terms of work and religions. The female and male dead were buried in a distinguished way with loin-cloths and wooden monuments. Sexual identity on a social level refers to how people consider and expect different genders to act and behave under the social and cultural framework. In the Xiaohe society, men carried out hunting tasks (creatures like vultures, badgers, lizards, snakes); women were associated to the rebirth of lives. To synthesize, a possible relation between the Xiaohe and the Gumugou societies is that they represent two parallel groups who shared similar economic systems because of the similar environment, or that there is a chronological difference where the Gumugou people may have existed earlier. The absolute dating information from the two cemeteries is insufficient to rule out the second situation.

The area division of the Tarim Basin and its surroundings (The division is made based on the mountain ranges including Altai Mountains, Tian Shan, and Kunlun Mountains, and also the distribution of ancient cemeteries in the whole Xinjiang generally.)

Surroundings

To place the Xiaohe horizon in the larger context of the Bronze Age burials in its surroundings, the hypothesis presented in this study is that the Xiaohe-Gumugou people might possibly represent a parallel to the Andronovo groups, with an eastward migration, that developed their own societies and ethnicities in the Tarim Basin with some ancestral memories still preserved. Considering the location and the geographical features of Xinjiang, the Altai Mountains and the Tian Shan left open access from the Eurasian Steppe to the Dzungarian Basin. The Hami Basin-the Balikun Grassland was the first intersection area to combine the possible western and eastern cultural influences. To pass by the Turpan Basin and enter into the Tarim Basin, there were two possible routes, one northern route along the southern edge of Tian Shan, and one southern route along the northern edge of Kunlun Mountains.

In the early Bronze Age, the burials in Xinjiang had some clear typical geographic features that distinguish them from their surroundings. But from the late Bronze Age to the early Iron Age, the tradition with circular kerbs of stones with stone-pits burials expanded along the southern edge of the Tian Shan, which was a major shift of burial practice that possibly could be linked to the expansion of the Andronovo horizon or a general nomadic expansion.

Although there were no horses or wagons found in the Xiaohe burials, the wooden horse-hoof objects were an indication of horses, which did not exist in their daily lives anymore, but possibly were related to some settlers’ ancestral memories of their nomadic origins. However, it was more important for them to assimilate to the common social identities of their new group. After people died, it was preferred to be buried in the communal cemetery. Even if the dead bodies were lost, wooden substitutes will be used in graves to represent the dead, since they believed in afterlife and thought that the end of the death is rebirth.

Comments

While the results of Li et al. (2010, 2015) of Xiaohe mummies regarding Y-chromosome haplogroups – showing mostly R1a(xZ93) – and radiocarbon dates of the samples are yet to be confirmed, Proto-Tocharians are known to have had contacts with Samoyeds, early Indo-Iranians (in turn in contact with the BMAC language), then into Common Tocharian with ancient Iranians, and then Indo-Aryan and Iranian languages again (for more on this, see Gerd Carling’s publications).

The connection of the Tocharian branch with Afanasevo is essentially indisputable today, like that of Late Proto-Indo-European with late Repin/early Yamna, even more so than it was just 10 years ago, thanks to the most recent genetic investigation. The common genetic stock of Yamna and Afanasevo – as well as that of East Bell Beakers and Palaeo-Balkan peoples – fits perfectly earlier predictions based on the linguistic estimates of the separation and evolution of the diverse language communities, and the tentative attribution to Eurasian steppe-related cultures.

Tentative identification of language groups among Early Bronze Age cultures. Pre-/Proto-Tocharian is traditionally associated with Chemurchek.

The trail leading from Afanasevo to Common Tocharians, on the other hand, seems to be more tricky, not unlike many other Indo-European-speaking groups from Europe and Asia, whose precise evolution until their historical attestation is often unclear. Nevertheless, the eventual presence of diverse haplogroups among historical Tocharians – whether they coincide with ancient DNA recovered from BMAC, South India, Andronovo, or Bronze Age Tian Shan populations – will only be relevant to understand the genetic evolution of the speakers of Tocharian during its different stages.

If the genetic trail backwards from known Tocharians to (earlier) unknown Common Tocharians, and forwards from known Pre-Tocharians to (later) unknown Proto-Tocharians leads unequivocally to these populations from the Xiaohe cultural horizon, this paper shows one of the mechanisms through which peoples of the Andronovo cultural horizon (or, more precisely, male lines derived from it) may have become integrated into a Tocharian-speaking population, not dissimilar to what happened in the steppes between Uralic-speaking Abashevo and Pre-Proto-Indo-Iranian-speaking Catacomb-Poltavka to form the Proto-Indo-Iranian-speaking Sintashta-Potapovka-Filatovka culture.

As we have discussed in this blog many times over, to solve this ethnolinguistic identification of prehistoric cultures one needs to investigate ancient DNA in combination with linguistic guesstimates and the Indo-European homeland problem from a wide anthropological perspective. People not understanding this simple concept are bound to end up in some comical Tocharo-Indo-Iranian grouping related to Corded Ware ancestry from Andronovo, similar to the Celto-Ibero-Basques of elevated CEU BA ancestry and hg. R1b-P312 to the south of the Pyrenees during the Iron Age from Olalde et al. (2019), and to the Balto-Finno-Slavs of hg. R1a-Z283 and elevated “Steppe ancestry” in the BA-IA East Baltic from Saag et al. (2019).


Scytho-Siberians of Aldy-Bel and Sagly, of haplogroup R1a-Z93, Q1b-L54, and N


Recently, a paper described Eastern Scythian groups as “Uralic-Altaic” just because of the appearance of haplogroup N in two Pazyryk samples.

This simplistic identification is contested by the varied haplogroups found in early Altaic groups, by the early link of Cimmerians with the expansion of hg. N and Q, by the link of N1c-L392 in north-eastern Europe with Palaeo-Laplandic, and now (paradoxically) by the clear link between early Mongolic expansion and N1c-L392 subclades.

A new paper (behind paywall) offers insight into the prevalent presence of R1a-Z93 among eastern Scytho-Siberian groups (most likely including Samoyedic speakers in the forest-steppes), and a new hint to the westward expansion of haplogroups Q and N (probably coupled with the so-called “Siberian ancestry”) from the east with different groups of Iron Age steppe nomads:

Genetic kinship and admixture in Iron Age Scytho-Siberians, by Mary et al. Human Genetics (2019).

Interesting excerpts (emphasis mine):

From an archeological and historical point of view, the term “Scythians” refers to Iron Age nomadic or seminomadic populations characterized by the presence of three types of artifacts in male burials: typical weapons, specific horse harnesses and items decorated in the so-called “Animal Style”. This complex of goods has been termed the “Scythian triad” and was considered to be characteristic of nomadic groups belonging to the “Scythian World” (Yablonsky 2001). This “Scythian World” includes both the Classic (or European) Scythians from the North Pontic region (7th–3rd century BC) and the Southern Siberian (or Asian) populations of the Scythian period (also called Scytho-Siberians). These include, among others, the Sakas from Kazakhstan, the Tagar population from the Minusinsk Basin (Republic of Khakassia), the Aldy-Bel population from Tuva (Russian Federation) and the Pazyryk and Sagly cultures from the Altai Mountains.

Proportions of Scythian mtDNA haplogroups. Western (blue) and eastern (pink) Eurasian lineages are equally distributed in the Arzhan Scytho-Siberian sample. The U5a2a1 haplogroup shared between the two Scythian groups studied is in bold

In this work, we first aim to address the question of the familial and social organization of Scytho-Siberian groups by studying the genetic relationship of 29 individuals from the Aldy-Bel and Sagly cultures using autosomal STRs. (…) were obtained from 5 archeological sites located in the valley of the Eerbek river in Tuva Republic, Russia (Fig. 1). All the mounds of this archeological site were excavated but DNA samples were not collected from all of them. 14C dates mainly fall within the Hallstatt radiocarbon calibration plateau (ca. 800–400 cal BC) where the chronological resolution is poor. Only one date falls on an earlier segment of calibration curve: Le-9817: 2650 ± 25 BP, i.e. 843–792 cal BC with a probability of 94.3% (using the OxCal v4.3.2 program). This sample (Bai-Dag 8, Kurgan 1, grave 10) is not from one of the graves studied but was used to date the kurgan as a whole.

Y-chromosome haplogroups were first assigned using the ISOGG 2018 nomenclature. In order to improve the precision of haplogroup definition, we also analyzed a set of Y-chromosome SNP (Supplementary Table 2). Nine samples belonged to the R1a-M513 haplogroup (defined by marker M513) and two of these nine samples were characterized as belonging to the R1a1a1b2-Z93 haplogroup or one of its subclades. Six samples belonged to the Q1b1a-L54 haplogroup and five of these six samples belonged to the Q1b1a3-L330 subclade. One sample belonged to the N-M231 haplogroup.


The distribution of these haplogroups in the population must be confronted with the prevalence of kinship among the samples. Although five individuals belonged to haplogroup Q1b1a3-L330, three of them (ARZ-T18, ARZ-T19 and ARZ-T20) were paternally related (Fig. 2). It must, therefore, be considered that haplogroup Q1b1a3-L330 is present in three independent instances (given that the remaining two instances exhibit no close familial relationship with other samples or one another). All five were buried on the Eki-Ottug 1 archaeological site (although in two different kurgans).
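The arithmetic behind "three independent instances" can be made explicit with a short Python sketch: each group of paternally related carriers is collapsed into a single instance before counting. The three related sample names are taken from the excerpt; the two labels for the unrelated carriers are placeholders, since the excerpt does not name them, and the function name `independent_instances` is likewise hypothetical.

```python
def independent_instances(carriers, kin_groups):
    """Count carriers of a haplogroup, counting each kin group only once.

    carriers   -- iterable of sample IDs carrying the haplogroup
    kin_groups -- iterable of disjoint groups of paternally related IDs
    """
    remaining = set(carriers)
    count = 0
    for group in kin_groups:
        members = remaining & set(group)
        if members:
            count += 1            # the whole related group is one instance
            remaining -= members
    # Everyone left has no close paternal relative among the carriers.
    return count + len(remaining)


# Five Q1b1a3-L330 carriers; the last two IDs are placeholders.
l330_carriers = ["ARZ-T18", "ARZ-T19", "ARZ-T20",
                 "unrelated-1", "unrelated-2"]
paternal_kin = [["ARZ-T18", "ARZ-T19", "ARZ-T20"]]

n_independent = independent_instances(l330_carriers, paternal_kin)
print(n_independent)  # 3
```

The same correction matters whenever haplogroup frequencies from a single cemetery are compared with population-level frequencies, because a kurgan with one extended family can otherwise inflate a lineage.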

In the same way, although two groups, of two and three individuals, shared haplotypes belonging to the R1a-M513 haplogroup, these groups likely include a father/son pair (ARZ-T2 and ARZ-T12). Therefore, among nine R1a-M513 men, we found six independent haplotypes, one being present in two independent instances. All R1a-M513 haplotypes, however, including those attributed to the R1a1a1b2-Z93 subclade, only differed by one-step mutations, across 5 loci at most. All R1a-M513 individuals were buried on the same site, Eki-Ottug 2, in a single Kurgan.


Haplogroup R1a-M173 was previously reported for 6 Scytho-Siberian individuals from the Tagar culture (Keyser et al. 2009) and one Altaian Scytho-Siberian from the Sebÿstei site (Ricaut et al. 2004a), whereas haplogroup R1a1a1b2-Z93 (or R1a1a1b-S224) was described for one Scythian from Samara (Mathieson et al. 2015) and two Scytho-Siberians from Berel and the Tuva Republic (Unterländer et al. 2017). On the contrary, North Pontic Scythians were found to belong to the R1b1a1a2 haplogroup (Krzewińska et al. 2018), showing a distinction between the two groups of Scythians. (…) The absence of R1b lineages in the Scytho-Siberian individuals tested so far and their presence in the North Pontic Scythians suggest that these 2 groups had a completely different paternal lineage makeup with nearly no gene flow from male carriers between them.

The seven other male individuals studied in this work were found to carry Eastern Eurasian Y haplogroups Q1b1a and one of its subclades (n = 6) and N (n = 1). Haplogroup Q1b1a-L54 was previously described in four males from the Bronze Age in the Altai Mountains (Hollard et al. 2014, 2018) and was clearly associated with Siberian populations (Regueiro et al. 2013).

The N-M231 haplogroup emerged from haplogroup K in Southern Asia around 21,000 years BCE, maybe in Southern China (Shi et al. 2013; Ilumäe et al. 2016). Previous studies attested to its presence in samples from Neolithic and Bronze Age in China (Li et al. 2011; Cui et al. 2013). Waves of northwestern expansion of this haplogroup are described as beginning during the Paleolithic period (Derenko et al. 2006; Shi et al. 2013) but traces of this expansion in archeological samples were reported only in two Scytho-Siberian males from the Altai (Pilipenko et al. 2015).

The sample of haplogroup N comes from the Aldy-Bel culture (ARZ-T15), from the Eerbek site, but has no radiocarbon date. All Q1b-L330 samples come from the Sagly culture, and three are paternally related. The other Q1b-L54 sample is from other tombs in one kurgan at Aldy Bel.

It seems that – exactly as expected – different waves of steppe nomads brought different lineages at a time (the Iron Age) when many regions incorporated different eastern lineages without necessarily changing language, just like the expansion of N among Ugrians and Samoyeds, and N1c among Finno-Permic peoples, and like many other lineages expanding with federation-like groups in eastern, central, and western Europe.


Corded Ware—Uralic (IV): Hg R1a and N in Finno-Ugric and Samoyedic expansions


This is the fourth of four posts on the Corded Ware—Uralic identification:

Let me begin this final post on the Corded Ware—Uralic connection with an assertion that should be obvious to everyone involved in ethnolinguistic identification of prehistoric populations but, for one reason or another, is usually forgotten. In the words of David Reich, in Who We Are and How We Got Here (2018):

Human history is full of dead ends, and we should not expect the people who lived in any one place in the past to be the direct ancestors of those who live there today.

Haplogroup N

Another recurrent argument – apart from “Siberian ancestry” – for the location of the Uralic homeland is “haplogroup N”. This is as serious as saying “haplogroup R1” to refer to Indo-European migrations, but let’s explore this possibility anyway:

Ancient haplogroups

We now have a better idea of what many ancient migrations (previously hypothesized to be associated with westward Uralic migrations) look like in genetic terms. From Damgaard et al. (Science 2018):

These serial changes in the Baikal populations are reflected in Y-chromosome lineages (Fig. 5A; figs. S24 to S27, and tables S13 and S14). MA1 carries the R haplogroup, whereas the majority of Baikal_EN males belong to N lineages, which were widely distributed across Northern Eurasia (29), and the Baikal_LNBA males all carry Q haplogroups, as do most of the Okunevo_EMBA as well as some present-day Central Asians and Siberians.

The only N1c1 sample comes from Ust’Ida Late Neolithic, 180km to the north of Lake Baikal, which – together with the Bronze Age sample from the Kola peninsula, and the medieval sample from Ust’Ida – gives a good idea of the overall expansion of N subclades and Siberian ancestry among the Circum-Arctic peoples of Eurasia, speakers of Palaeo-Siberian languages.

Geographical location of ancient samples belonging to major clade N of the Y-chromosome.

Modern haplogroups

What we should expect from Uralic peoples expanding with haplogroup N – seeing how Yamna expands with R1b-L23, and Corded Ware expands with R1a-Z645 – is to find a common subclade spreading with Uralic populations. Let’s see if it works like that for any N-X subclade, in data from Ilumäe et al. (2016):

Geographic-Distribution Map of hg N3 / N1c / N1a.

Within the Eurasian circum-Arctic spread zone, N3 and N2a reveal a well-structured spread pattern where individual sub-clades show very different distributions:

N1a1-M46 (or N-TAT), formed ca. 13900 BC, TMRCA 9800 BC

   N1a1a2-B187, formed ca. 9800 BC, TMRCA 1050 AD:

The sub-clade N3b-B187 is specific to southern Siberia and Mongolia, whereas N3a-L708 is spread widely in other regions of northern Eurasia.

     N1a1a1a-L708, formed ca. 6800 BC, TMRCA 5400 BC.

       N1a1a1a2-B211/Y9022, formed ca. 5400 BC, TMRCA 1900 BC:

The deepest clade within N3a is N3a1-B211, mostly present in the Volga-Uralic region and western Siberian Khanty and Mansi populations.

         N1a1a1a1a-L392/L1026, formed ca. 4400 BC, TMRCA 2800 BC:

The neighbor clade, N3a3’6-CTS6967, spreads from eastern Siberia to the eastern part of Fennoscandia and the Baltic States

Frequency-Distribution Maps of Individual Subclade N3a3 / N1a1a1a1a1a-CTS2929/VL29, probably initially with Akozino warrior-traders.

           N1a1a1a1a1a-CTS2929/VL29, formed ca. 2100 BC, TMRCA 1600 BC:

In Europe, the clade N3a3-VL29 encompasses over a third of the present-day male Estonians, Latvians, and Lithuanians but is also present among Saami, Karelians, and Finns (Table S2 and Figure 3). Among the Slavic-speaking Belarusians, Ukrainians, and Russians, about three-fourths of their hg N3 Y chromosomes belong to hg N3a3.

In the post on Finno-Permic expansions, I depicted what seems to me the most likely way of infiltration of N1c-L392 lineages with Akozino warrior-traders into the western Finno-Ugric populations, with an origin around the Barents sea.

This includes the potential spread of (a minority of) N1c-B211 subclades due to contacts with Ananino on both sides of the Urals, through a northern route of forest and forest-steppe regions (equivalent to the distribution of Cherkaskul compared to Andronovo), given the spread of certain subclades in Ugric populations.

NOTE. An alternative possibility is the association of certain B211 subclades with a southern route of expansion with Pre-Scythian and Scythian populations, under whose influence the Ananino culture emerged (which would imply a very quick infiltration of certain groups of haplogroup N everywhere among Finno-Ugrics on both sides of the Urals), and also the expansion of some subclades with Turkic-speaking peoples, who apparently expanded with alliances of different peoples. Both (Scythian and Turkic) populations expanded from East Asia, where haplogroup N (including N1c) was present since the Neolithic. I find this a worse model of expansion for upper clades, but – given the YFull estimates and the presence of this haplogroup among Turkic peoples – it is a possibility for many subclades.

           N1a1a1a1a2-Z1936, formed ca. 2800 BC, TMRCA 2400 BC:

The only notable exception from the pattern are Russians from northern regions of European Russia, where, in turn, about two-thirds of the hg N3 Y chromosomes belong to the hg N3a4-Z1936—the second west Eurasian clade. Thus, according to the frequency distribution of this clade, these Northern Russians fit better among other non-Slavic populations from northeastern Europe. N3a4 tends to increase in frequency toward the northeastern European regions but is also somewhat unexpectedly a dominant hg N3 lineage among most Turkic-speaking Volga Tatars and South-Ural Bashkirs.

Frequency-Distribution Maps of Individual Subclade N3a4 / N1a1a1a1a2-Z1936, probably with the Samic (first) and Fennic (later) expansions into Paleo-Lakelandic and Palaeo-Laplandic territories.

The expansion of N1a-Z1936 in Fennoscandia is most likely associated with the expansion of Saami into asbestos ware-related territory (like the Lovozero culture) during the Late Iron Age (and mixture with its population), and with the later Fennic expansion to the east and north, replacing their language, as well as with Arctic and forest populations assimilated during Permic, Ugric, and Samoyedic expansions to the north.

           N1a1a1a1a4-M2019 (previously N3a2), formed ca. 4400 BC, TMRCA 1700 BC:

Sub-hg N3a2-M2118 is one of the two main bifurcating branches in the nested cladistic structure of N3a2’6-M2110. It is predominantly found in populations inhabiting present-day Yakutia (Republic of Sakha) in central Siberia and at lower frequencies in the Khanty and Mansi populations, which exhibit a distinct Y-STR pattern (Table S7) potentially intrinsic to an additional clade inside the sub-hg N3a2.

The second widespread sub-clade of hg N is N2a. (…):

   N1a2b-P43 (B523/FGC10846/Y3184), formed ca. 6800 BC, TMRCA ca. 2700 BC:

The absolute majority of N2a individuals belong to the second sub-clade, N2a1-B523, which diversified about 4.7 kya (95% CI = 4.0–5.5 kya). Its distribution covers the western and southern parts of Siberia, the Taimyr Peninsula, and the Volga-Uralic region with frequencies ranging from 10% to 30% and does not extend to eastern Siberia (…)
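
NOTE. The quoted paper gives ages in kya (thousand years ago), while the YFull-based estimates in the headings are given as calendar years BC. A minimal sketch of the conversion follows, assuming a "present" of 2000 AD (the reference year is my assumption here, and the function name is just illustrative) and ignoring the absence of a year zero, which is negligible at this resolution:

```python
def kya_to_bc(kya, reference_year=2000):
    """Convert an age in 'thousand years ago' to an approximate year BC.

    reference_year is an assumption about what counts as 'the present'.
    The missing year zero is ignored: the estimates are far coarser
    than a single year. A negative result would mean a date AD.
    """
    return round(kya * 1000 - reference_year)


# Point estimate and 95% CI quoted for N2a1-B523 (4.7 kya, 4.0-5.5 kya)
point = kya_to_bc(4.7)
younger, older = kya_to_bc(4.0), kya_to_bc(5.5)
print(f"ca. {point} BC (95% CI {older}-{younger} BC)")
```

Under these assumptions, the 4.7 kya point estimate corresponds to ca. 2700 BC, which is in line with the TMRCA given in the heading above, while the confidence interval spans roughly 3500–2000 BC.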

haplogroup_n2
Geographic-Distribution Map of hg N2a1 / N1a2b-P43

The “European” branch suggested earlier from Y-STR patterns turned out to consist of two clades:

     N1a2b2a-Y3185/FGC10847, formed ca. 2200 BC, TMRCA 800 BC:

N2a1-L1419, spread mainly in the northern part of that region.

     N1a2b2b1-B528/Y24382, formed ca. 900 BC, TMRCA ca. 900 BC:

N2a1-B528, spread in the southern Volga-Uralic region.

Haplogroup R1a

We also have a good idea of the distribution of haplogroup R1a-Z645 in ancient samples. Its subclades were associated with the Corded Ware expansion, and some of them fit quite well the early expansion of Finno-Permic, Ugric, and Samoyedic peoples to the east.

r1a-z282-z280-z2125-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the potential Finno-Ugric-associated distribution of Z282 (especially R1a-M558, a Z280 subclade), and the expansion of R1a-Z2123 subclades with Central Asian forest-steppe groups.

This is what the modern distribution of R1a among Uralians looks like, from the latest report in Tambets et al. (2018):

  • Among Fennic populations, Estonians and Karelians (ca. 1.1 million) have not undergone the severe bottleneck of Finns (ca. 6-7 million), and thus show a greater proportion of R1a-Z280 than N1c subclades, which points to the original situation of Fennic peoples before their expansion. Trusting Finnish Y-DNA to derive conclusions about the Uralic populations is as useful as relying on Basque Y-DNA for the language spread by R1b-P312.
  • Among Volga-Finnic populations, Mordovians (the closest to the original Uralic cluster, see above) show a plurality of R1a lineages (27%).
  • Hungarians (ca. 13-15 million) represent the majority of Ugric (and Finno-Ugric) peoples. They are mainly R1a-Z280, also R1a-Z2123, have little N1c, and lack Siberian ancestry, and thus represent the most likely original situation of Ugric peoples in the 4th century AD (read more on Avars and Hungarians).
  • Among Samoyedic peoples, the Selkup, the southernmost ones and the latest to expand – that is, those not heavily admixed with Siberian populations – also have a majority of R1a-Z2123 lineages (see also here for the original Samoyedic haplogroups to the south).

To understand the relevance of Hungarians for Ugric peoples, as well as Estonians, Karelians, and Mordovians (and northern Russians, Finno-Ugric peoples recently Russified) for Finno-Permic peoples, as opposed to the Circum-Arctic and East Siberian populations, one has to put demographics in perspective. Even a modern map can show the relevance of certain territories in the past:

population-density
Population density (people per km²) map of the world in 1994. From Wikipedia.

Summary of ancestry + haplogroups

Fennic and Samic populations seem to be clearly influenced by Palaeo-Laplandic peoples, whereas Volga-Finnic and especially Permic populations may have received gene flow from both, but essentially Palaeo-Siberian influence from the north and east.

The fact that modern Mansis and Khantys offer the highest variation in N1a subclades, and some of the highest “Siberian ancestry” among non-Nganasans, should have raised a red flag long ago. The fact that Hungarians – supposedly stemming from a source population similar to Mansis – do not offer the same amount of N subclades or Siberian ancestry (not even close), and offer instead more R1a, in common with Estonians (among Finno-Samic peoples) and Mordvins (among Volga-Finnic peoples) should have raised a still bigger red flag. The fact that Nganasans – the model for Siberian ancestry – show completely different N1a2b-P43 lineages should have been a huge genetic red line (on top of the anthropological one) to regard them as the Uralian-type population.

We know now that ethnolinguistic groups have usually expanded with massive (usually male-biased) migrations, and that neighbouring locals often ‘resurge’ later without changing the language. That is seen in Europe after the spread of Bell Beakers, with the increase of previous ancestry and lineages in Scandinavia during the formation of the Nordic ethnolinguistic community; in Central-West Europe, with the resurgence of Neolithic ancestry (and lineages) during the Bronze Age over steppe ancestry; and in Central-East Europe (with Unetice or East European Bronze Age groups like Mierzanowice, Trzciniec, or Lusatian) showing an increase in steppe ancestry (and a resurgence of R1a subclades); none of them represented a radical ethnolinguistic change.

finno-ugric-haplogroup-n
Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

It is not hard to model the stepped arrival, infiltration, and/or resurgence of N subclades and “Siberian ancestries”, as well as their gradual expansion in certain regions, associated with certain migrations first – such as the expansions to the Circum-Arctic region, and later the Scythian- and Turkic-related movements – as well as limited regional developments, like the known bottleneck in Finns, or the clear late expansion of Ugric and Samoyedic languages to the north among nomadic Palaeo-Siberians due to traditions of exogamy and multilingualism. This fits quite well with the different arrival of N (N1c and xN1c) lineages to the different Uralic-speaking groups, and with the stepped appearance of “Siberian ancestry” in the different regions.

The alternative

It is evident that a lot of people were too attached to the idea of Palaeolithic R1b lineages ‘native’ to western Europe speaking Basque languages; of R1a lineages speaking Indo-European and spreading with Yamna; and N lineages ‘native’ to north-eastern Europe and speaking Uralic, and this is causing widespread weeping and gnashing of teeth (instead of the joy of discovering where one’s true patrilineal ancestors come from, and what language they spoke in each given period, which is the supposed objective of genetic genealogy…)

Since an Indo-Germanic branch (as revived now by some in the Copenhagen group to fit Kristiansen’s theory of the 1980s with recent genetic data) does not make any sense in linguistics, the finding of R1a in Yamna would not have led where some think it would have, because North-West Indo-European would still be the main Late PIE branch in Europe. Don’t take my word for it; take James P. Mallory’s (2013).

mallory-adams-tree
The levels of Indo-European reconstruction, from Mallory & Adams (2006).

If an (unlikely) Indo-Slavonic group were posited, though, such a group would still be bound (with Indo-Iranian) to the steppes with East Yamna/Poltavka (admixing with Abashevo migrants, but retaining its language), developing Sintashta/Potapovka → Srubna/Andronovo, and R1a lineages would have equally undergone the known bottlenecks of the steppes where they replaced R1b-Z2103 – which this eastern group shares with Balkan languages, a haplogroup that links therefore together the Graeco-Aryan group.

As far as I know – and there might be many other similar pet theories out there – there have been proposals of “modern Balto-Slavic-like” populations (in an obvious circular reasoning based on modern populations) in some Scythian clusters of the Iron Age.

NOTE. I will not enter into “Balto-Slavic-like R1a” of the Late Bronze Age or earlier because no one can seriously believe at this point of development of Population Genetics that autosomal similarity predating the appearance of Slavs by 1,500+ years equates to their (ethnolinguistic) ancestral population, without a clear intermediate cultural and genetic trail – something we lack today in the Slavic case even for the late Roman period…

finno-saamic-palaeo-germanic-substratum
The Finnic and Saamic separation looks shallower than it actually is. Invisible convergence can be ‘triangulated’ with the help of Germanic layers of mutual loanwords (Häkkinen 2012).

We also know of R1a-Z280 lineages in Srubna, probably expanding to the west. With that in mind, and knowing that Palaeo-Germanic was in close contact with Finno-Samic while both were already separated but still in contact, and that Palaeo-Germanic was also in contact with and closely related to a ‘Temematic’ distinct from Balto-Slavic (and also that early Proto-Baltic and Proto-Slavic from the Roman Iron Age and later were in contact with western Uralic), this would be the linguistic map of the Iron Age if R1a is considered to have expanded Indo-European from some kind of “patron-client” relationship with west Yamna:

palaeo-germanic-italo-celtic
Eastern European language map during the Late Bronze Age / Iron Age, if R1a spread Indo-European languages and Eastern Yamna spoke Indo-Slavonic. Palaeo-Germanic (i.e. Pre- to Proto-Germanic) needs to be in contact with both the Samic Lovozero population and the Fennic west Circum-Arctic one. Italic and Celtic in contact with Pre-Germanic. Germanic in contact with Temematic. Balto-Slavic in contact with Iranian, and near Fennic to allow for later loanwords. For Germanic and Temematic, see Kortlandt (2018).

You might think I have some personal or political reason against this kind of proposal. I haven’t. We have been proposing Indo-European to be the language of the European Union for more than 10 years, so supporting R1b-Italo-Celtic in the whole of Western Europe, R1a-Germanic in Central and Eastern Europe, and R1a-Indo-Slavonic in the steppes (as the Danish group seems to be doing) has nothing inherently bad (or good) about it for me. If anything, it gives more reason to support the revival of North-West Indo-European in Europe.

My problem with this proposal is that it is obviously beholden to the notion of the uninterrupted cultural, historic and ethnic continuity in certain territories. This bias is common in historiography (von Falkenhausen 1993), but it extends even more easily into the lesser known prehistory of any territory, and now more than ever some people feel the need to corrupt (pre)history based on their own haplogroups (or the majority haplogroups of their modern countries). However, more than on philosophical grounds, my rejection is based on facts: this picture is not what the combination of linguistic, archaeological, and genetic data shows. Period.

Nevertheless, if Yamna + Corded Ware represented the “big and early expansion” of Germanic and Italo-Celtic peoples typical of the Nazis’ dream Lebensraum and the Fascists’ spazio vitale proposals; Uralians were Siberian hunter-gatherers that controlled the whole of eastern and northern Russia, and miraculously managed to push (ethnolinguistically) Neolithic agropastoralists to the west during and after the Iron Age, with gradual (and often minimal) genetic impact; and Balto-Slavic peoples were represented by horse riders from Pokrovka/Srubna, hiding then somewhere around the forest-steppe until after the Scythian expansion, and then spreading their language (without much genetic impact) during the early Middle Ages… so be it.
