While the true source of R1a-M417 – the main haplogroup eventually associated with Corded Ware, and thus Uralic speakers – is still not known with precision, due to the lack of R1a-M198 in ancient samples, we already know that the Pontic-Caspian steppes were probably not it.
R1a-M459 (xR1a-M198) lineages appear from the Mesolithic to the Chalcolithic scattered from the Baltic to the Caucasus, from the Dniester to Samara, in a situation similar to haplogroups Q1a-M25 and R1b-L754, which supports the idea that R1a, Q1a, and R1b expanded with ANE ancestry, possibly in different waves since the Epipalaeolithic, and formed the known ANE:EHG:WHG cline.
The first confirmed R1a-M417 sample comes from Alexandria, roughly coinciding with the so-called steppe hiatus. Its emergence in the area of the previous “early Sredni Stog” groups (see the mess of the traditional interpretation of the north Pontic groups as “Sredni Stog”) and its later expansion with Corded Ware supports Kristiansen’s interpretation that Corded Ware emerged from the Dnieper-Dniester corridor, although samples from the area up to ca. 4000 BC, including the few Middle Eneolithic samples available, show continuity of hg. I2a-M223 and typical Ukraine Neolithic ancestry.
NOTE. The further subclade R1a-Z93 (Y26) reported for the sample from Alexandria seems too early, given the confidence interval for its formation (ca. 3500-2500 BC); even R1a-Z645 could be too early. Like the attribution of the R1b-L754 from Khvalynsk to R1b-V1636 (after being previously classified as of Pre-V88 and M73 subclades), it seems reasonable to take these SNP calls with a pinch of salt, especially because Yleaf (designed to look for the furthest subclade possible) does not confirm for them any subclade beyond R1a-M417 and R1b-L754, respectively.
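The reasoning in this note is a plain date comparison, which can be sketched in a few lines of Python. This is only an illustration: the function name is made up, the formation interval is the one quoted above for R1a-Z93, and the sample date passed in the usage line is a hypothetical placeholder, not a published radiocarbon date.

```python
from typing import Tuple


def predates_formation(sample_bc: int, formation_bc: Tuple[int, int]) -> bool:
    """Return True if a sample is older than the oldest bound of the
    estimated formation interval of its reported subclade.

    All dates are years BC, so a larger number means an older date.
    formation_bc is given as (oldest_bound, youngest_bound).
    """
    oldest_bound, _youngest_bound = formation_bc
    return sample_bc > oldest_bound


# Formation interval quoted in the note for R1a-Z93 (ca. 3500-2500 BC).
Z93_FORMATION_BC = (3500, 2500)

# Hypothetical sample date (assumption, for illustration only).
suspicious = predates_formation(4000, Z93_FORMATION_BC)
```

A call flagged this way does not prove the SNP call wrong, since formation estimates are themselves based on modern descendants; it only marks the call as one to double-check.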
The sudden appearance of “steppe ancestry” in the region, with the high variability shown by Ukraine_Eneolithic samples, suggests that this is due to recent admixture of incoming foreign peoples (of Ukraine Neolithic / Comb Ware ancestry) with Novodanilovka settlers.
The most likely origin of this population, taking into account the most common population movements in the area since the Neolithic, is the infiltration of (mainly) hunter-gatherers from the forest areas. That would confirm the traditional interpretation of the origin of Uralic speakers in the forest zone, although the nature of Pontic-Caspian settlers as hunter-gatherers rather than herders makes this identification today fully unnecessary (see here).
EDIT (3 FEB 2019): As for the most common guesstimates for Proto-Uralic, roughly coinciding with the expansion of this late Sredni Stog community (ca. 4000 BC), you can read the recent post by J. Pystynen in Freelance Reconstruction, Probing the roots of Samoyedic.
NOTE. Although my initial simplistic interpretation (of early 2017) of Comb Ware peoples – traditionally identified as Uralic speakers – potentially showing steppe ancestry was probably wrong, it seems that peoples from the forest zone – related to Comb Ware or neighbouring groups like Lublyn-Volhynia – reached forest-steppe areas to the south and eventually expanded steppe ancestry into east-central Europe through the Volhynian Upland to the Polish Upland, during the late Trypillian disintegration (see a full account of the complex interactions of the Final Eneolithic).
The most interesting aspect of ascertaining the origin of R1a-M417, given its prevalence among Uralic speakers, is to precisely locate the origin of contacts between Late Proto-Indo-European and Proto-Uralic. Traditionally considered the consequence of contacts between Middle and Upper Volga regions, the most recent archaeological research and data from ancient DNA samples have made it clear that Corded Ware is the most likely vector of expansion of Uralic languages, hence these contacts of Indo-Europeans of the Volga-Ural region with Uralians have to be looked for in neighbours of the north Pontic area.
My bet – rather obvious today – is that the Don River area is the source of the earliest borrowings of Late Uralic from Late Indo-European (i.e. post-Indo-Anatolian). The borrowing of the Late PIE word for ‘horse’ is particularly interesting in this regard. Later contacts (after the loss of the initial laryngeal) may be attributed to the traditionally depicted Corded Ware – Yamna contact zone in the Dnieper-Dniester area.
NOTE. While the finding of R1a-M417 populations neighbouring R1b-L23 in the Don-Volga interfluve would be great to confirm these contacts, I don’t know if the current pace of more and more published samples will continue. The information we have right now, in my opinion, suffices to support close contacts of neighbouring Indo-Europeans and Uralians in the Pontic-Caspian area during the Late Eneolithic.
Single Grave and central Corded Ware groups – showing some of the earliest available dates (emerging likely ca. 3000/2900 BC) – are as varied in their haplogroups as is expected of a sink (which does not in the least resemble the Volga-Ural population):
Interesting is the presence of R1b-L754 in Obłaczkowo, potentially of R1b-V88 subclade, as previously found in two Central European individuals from Blätterhöhle MN (ca. 3650 and 3200 BC), and in the Iron Gates and north Pontic areas.
Haplogroups I2a and G have also been reported in early samples, all potentially related to the supposed Corded Ware central-east European homeland, likely in southern Poland, a region naturally connected to the north Pontic forest-steppe area and to the expansion of Neolithic groups.
The true bottlenecks under haplogroup R1a-Z645 seem to have happened only during the migration of Corded Ware to the east: to the north into the Battle Axe culture, mainly under R1a-Z282, and to the south into Middle Dnieper – Fatyanovo-Balanovo – Abashevo, probably eventually under R1a-Z93.
In archaeology, this bottleneck also supports the expansion of a sort of unifying “Corded Ware A-horizon” spreading with people (disputed by Furholt), the disintegrating Uralians, and thus a source of further loanwords shared by all surviving Uralic languages.
Confirming this ‘concentrated’ Uralic expansion to the east is the presence of R1a-M417 (xR1a-Z645) lineages among early and late Single Grave groups in the west – which essentially disappeared after the Bell Beaker expansion – , as well as the presence of these subclades in modern Central and Western Europeans. Central European groups became thus integrated in post-Bell Beaker European EBA cultures, and their Uralic dialect likely disappeared without a trace.
NOTE. The fate of R1b-L51 lineages – linked to North-West Indo-Europeans undergoing a bottleneck in the Yamna Hungary -> Bell Beaker migration to the west – is thus similar to haplogroup R1a-Z645 – linked to the expansion of Late Uralians to the east – , hence proving the traditional interpretation of the language expansions as male-driven migrations. These are two of the most interesting genetic data we have to date to confirm previous language expansions and dialectal classifications.
It will be also interesting to see if known GAC and Corded Ware I2a-Y6098 subclades formed eventually part of the ancient Uralic groups in the east, apart from lineages which will no doubt appear among asbestos ware groups and probably hunter-gatherers from north-eastern Europe (see the recent study by Tambets et al. 2018).
Corded Ware ancestry marked the expansion of Uralians
Sadly, some brilliant minds decided in 2015 that the so-called “Yamnaya ancestry” (now more appropriately called “steppe ancestry”) should be associated with ‘Indo-Europeans’. This is causing the development of various new pet theories on the go, as more and more data contradict this interpretation.
There is a clear long-lasting cultural, populational, and natural barrier between Yamna and Corded Ware: they are derived from different ancestral populations, which show clearly different ancestry and ancestry evolution (although they did converge to some extent), as well as different Y-DNA bottlenecks; they show different cultures, including those of preceding and succeeding groups, and evolved in different ecological niches. The only true steppe pastoralists who managed to dominate over grasslands extending from the Upper Danube to the Altai were Yamna peoples and their cultural successors.
[A]rchaeologist Volker Heyd at the University of Bristol, UK, disagreed, not with the conclusion that people moved west from the steppe, but with how their genetic signatures were conflated with complex cultural expressions. Corded Ware and Yamnaya burials are more different than they are similar, and there is evidence of cultural exchange, at least, between the Russian steppe and regions west that predate Yamnaya culture, he says. None of these facts negates the conclusions of the genetics papers, but they underscore the insufficiency of the articles in addressing the questions that archaeologists are interested in, he argued. “While I have no doubt they are basically right, it is the complexity of the past that is not reflected,” Heyd wrote, before issuing a call to arms. “Instead of letting geneticists determine the agenda and set the message, we should teach them about complexity in past human actions.”
When considering the way the Indo-Europeans took to the west, it is important to realize that mountains, forests and marshlands were prohibitive impediments. Moreover, people need fresh water, all the more so when traveling with horses. The natural way from the Russian steppe to the west is therefore along the northern bank of the river Danube. This leads to the hypothesis that the western Indo-Europeans represent successive waves of migration along the Danube and its tributaries. The Celts evidently followed the Danube all the way to southern Germany. The ancestors of the Italic tribes, including the Veneti, may have followed the river Sava towards northern Italy. The ancestors of Germanic speakers apparently moved into Moravia and Bohemia and followed the Elbe into Saxony. A part of the Veneti may have followed them into Moravia and moved along the Oder through the Moravian Gate into Silesia. The hypothetical speakers of Temematic probably moved through Slovakia along the river Orava into western Galicia. The ancestors of speakers of Balkan languages crossed the lower Danube and moved to the south. This scenario is in agreement with the generally accepted view of the earliest relations between these branches of Indo-European.
The western Indo-European vocabulary in Baltic and Slavic is the result of an Indo-European substratum which contained an older non-Indo-European layer and was part of the Corded Ware horizon. The numbers show that a considerable part of the vocabulary was borrowed after the split between Baltic and Slavic, which came about when their speakers moved westwards north and south of the Pripet marshes. These events are older than the westward movement of the Slavs which brought them into contact with Temematic speakers. One may conjecture that the Venedi occupied the Oder basin and then expanded eastwards over the larger part of present-day Poland before the western Balts came down the river Niemen and moved onwards to the lower Vistula. We may then identify the Venedic expansion with the spread of the Corded Ware horizon and the westward migration of the Balts and the Slavs with their integration into the larger cultural complex. The theory that the Venedi separated from the Veneti in the upper Sava region and moved through Moravia and Silesia to the Baltic Sea explains the “im Namenmaterial auffällige Übereinstimmung zwischen dem Baltikum und den Gebieten um den Nordteil der Adria” (Udolph 1981: 61). The Balts probably moved in two stages because the differences between West and East Baltic are considerable.
Instead of reinterpreting his views in light of the recent genetic finds, Kortlandt tries to mix in this paper his own old theories (see his paper Baltic, Slavic, Germanic) with the recent interpretations of genetic papers, using also dubious secondary sources – e.g. Iversen and Kroonen (2017) or Klejn (2017) [see here, and here] – which, in my opinion, creates a potentially dangerous circular reasoning.
For example, even though he criticizes the general stance of recent genetic papers with regard to Proto-Indo-European dialectalization and expansion as too early, and he supports the Danube expansion route, he nevertheless follows their interpretations in accepting that Corded Ware was Indo-European (following the newest model proposed by Anthony):
The [Yamnaya] penetrated central and northern Europe from the lower Danube through the Carpathian basin, not from the east. The Carpathian basin was evidently the cradle of the Corded Ware cultures, where the descendants of the Yamnaya mixed with the local early farmers before proceeding to the north. The development has a clear parallel in the Middle Ages, when the Hungarians mixed with the local Slavic populations in the same territory (cf. Kushniarevich & al. 2015).
He still follows his good old Indo-Slavonic group in the east, but at the same time maintains Kallio’s view that there were no early Uralic loanwords in Balto-Slavic, and also Kallio’s (and the general) view that there were close contacts with PIE and Pre-Proto-Indo-Iranian…
NOTE. The latest paper on Eurasian migrations by Damgaard et al. (Nature 2018), which shows mainly Proto-Iranians dominating over East Europe after the Early Bronze Age, has left still less room for a Proto-Balto-Slavic group emerging from the east.
Also, he asserts the following, which is a rather weird interpretation of events:
It appears that the Corded Ware horizon spread to southern Scandinavia (cf. Iversen & Kroonen 2017) but not to the Baltic region during the Neolithic.
“However, we also find indications of genetic impact from exogenous populations during the Neolithic, most likely from northern Eurasia and the Pontic Steppe. These influences are distinct from the Anatolian-farmer-related gene flow found in Central Europe during this period.”
It follows that the Indo-Europeans did not reach the Baltic region before the Late Neolithic. The influx of non-local people from northern Eurasia may be identified with the expansion of the Finno-Ugrians, who came into contact with the Indo-Europeans as a result of the eastward expansion of the latter in the fourth millennium. This was long before the split between Balto-Slavic and Indo-Iranian.
In the Late Neolithic there was “a further population movement into the regions surrounding the Baltic Sea” that was “accompanied by the first evidence of extensive animal husbandry in the Eastern Baltic”, which “suggests import of the new economy by an incoming steppe-like population independent of the agricultural societies that were already established to the south and west of the Baltic Sea.” (Mittnik & al. 2018). These may have been the ancestors of Balto-Slavic speakers. At a later stage, the Corded Ware horizon spread eastward, giving rise to farming ancestry in Eastern Baltic individuals and to a female gene-flow from the Eastern Baltic into Central Europe (ibidem).
He is a strong Indo-Uralic supporter, and supports a parallel Indo-European – Uralic development in Eastern Europe, and (as you can read) he misunderstands the description of population movements in the Baltic region, and thus misplaces Finno-Ugric speakers as Eurasian migrants arriving in the Baltic from the east during the Late Neolithic, before the Corded Ware expansion, which is not what the cited papers implied.
NOTE. Such an identification of westward Neolithic migrations with Uralic speakers is furthermore to be rejected following the most recent paper on Fennoscandian samples.
He had previously asserted that the substrate common to Germanic and Balto-Slavic is Indo-European with non-Indo-European substrate influence, so I guess that Corded Ware influencing as a substrate both Germanic and Balto-Slavic is the best way he could put everything together, if one assumes the widespread interpretations of genetic papers:
Thus, I think that the western Indo-European vocabulary in Baltic and Slavic is the result of an Indo-European substratum which contained an older non-Indo-European layer and was part of the Corded Ware horizon. The numbers show that a considerable part of the vocabulary was borrowed after the split between Baltic and Slavic, (…)
NOTE. It is very likely that this paper was sent in late 2017. That’s the main problem with traditional publications including the most recent genetic investigation: by the time something gets eventually published, the text is already outdated.
I obviously share his opinion on precedence of disciplines in Indo-European studies:
The methodological point to be emphasized here is that the linguistic evidence takes precedence over archaeological and genetic data, which give no information about the languages spoken and can only support the linguistic evidence. The relative chronology of developments must be established on the basis of the comparative method and internal reconstruction. The location of a reconstructed language can only be established on the basis of lexical and onomastic material. On the other hand, archaeological or genetic data may supply the corresponding absolute chronology. It is therefore incorrect to attribute cultural influences in southern Scandinavia and the Baltic region in the third millennium to Germanic or Baltic speakers because these languages did not yet exist. While the Italo-Celtic branch may have separated from its Indo-European neighbors in the first half of the third millennium, Proto-Balto-Slavic and Proto-Indo-Iranian can be dated to the second millennium and Proto-Germanic to the end of the first millennium BC (cf. Kortlandt 2010: 173f., 197f., 249f.). The Indo-Europeans who moved to southern Scandinavia as part of the Corded Ware horizon were not the ancestors of Germanic speakers, who lived farther to the south, but belonged to an unknown branch that was eventually replaced by Germanic.
I hope we can see more and more anthropological papers like this, using traditional linguistics coupled with archaeology and the most recent genetic investigations.
(…) a northern connection is suggested by contacts between the Indo-Iranian and the Finno-Ugric languages. Speakers of the Finno-Ugric family, whose antecedent is commonly sought in the vicinity of the Ural Mountains, followed an east-to-west trajectory through the forest zone north and directly adjacent to the steppes, producing languages across to the Baltic Sea. In the languages that split off along this trajectory, loanwords from various stages in the development of the Indo-Iranian languages can be distinguished: 1) Pre-Proto-Indo-Iranian (Proto-Finno-Ugric *kekrä (cycle), *kesträ (spindle), and *-teksä (ten) are borrowed from early preforms of Sanskrit cakrá- (wheel, cycle), cattra- (spindle), and daśa- (10); Koivulehto 2001), 2) Proto-Indo-Iranian (Proto-Finno-Ugric *śata (one hundred) is borrowed from a form close to Sanskrit śatám (one hundred)), 3) Pre-Proto-Indo-Aryan (Proto-Finno-Ugric *ora (awl), *reśmä (rope), and *ant- (young grass) are borrowed from preforms of Sanskrit ā́rā- (awl), raśmí- (rein), and ándhas- (grass); Koivulehto 2001: 250; Lubotsky 2001: 308), and 4) loanwords from later stages of Iranian (Koivulehto 2001; Korenchy 1972). The period of prehistoric language contact with Finno-Ugric thus covers the entire evolution of Pre-Proto-Indo-Iranian into Proto-Indo-Iranian, as well as the dissolution of the latter into Proto-Indo-Aryan and Proto-Iranian. As such, it situates the prehistoric location of the Indo-Iranian branch around the southern Urals (Kuz’mina 2001).
NOTE. While I agree with the evident ancestral nature of the *kekrä borrowing, I will repeat it here again: I don’t believe that the distinction of late Proto-Indo-Iranian from ‘Pre-Proto-Indo-Aryan’ loans is warranted; not for words reconstructed from recent Finno-Ugric languages.
In this period of a Pre-Proto-Indo-Iranian community, which is to be associated with East Yamna/Poltavka, ca. 3000-2400 BC – as accepted in the supplement from de Barros Damgaard et al. (Nature 2018) – , both Poltavka and Abashevo/Balanovo herders were expanding ca. 2800-2600 BC to the east (and Abashevo already admixing into Poltavka territory), near the southern Urals.
There is no other, clearer, later connection between Finno-Ugric and Proto-Indo-Iranian speakers. Even the arrival of the Seima-Turbino phenomenon (after ca. 2000 BC), if it brought migrants to North-East Europe, would not fit the linguistic, archaeological, or genetic data. It is by now quite clear that Seima-Turbino does not fit with incoming N1c1 lineages and/or Siberian ancestry, either, for those looking for these as potential signs of incoming Uralic speakers.
While the Copenhagen group did not have access to data from Sintashta ca. 2100 BC onwards – now available in Narasimhan et al. (2018) – when submitting the papers, we already know that there was a clear long period of slow progressive admixture in the North Caspian region. It can be seen in the genetic contribution of Yamna to incoming Abashevo groups, and in the R1b-L23 samples still appearing in Sintashta until ca. 1800 BC (as I predicted could happen).
Since the first sample signalling incoming Abashevo migrants is found in the Poltavka outlier dated ca. 2700 BC (of R1a-Z93 lineage), this represents a rather unique, several centuries long process of admixture in the North Caspian region, different from the massive Afanasevo or Bell Beaker migrations in Asia and Europe, whereby a great part of the native male population was suddenly replaced.
This offers further support for language continuity despite genetic replacement in the development of East Yamna/Poltavka (part of the Steppe EMBA cline, formed by Yamna and Afanasevo) mixing with Abashevo migrants (probably identical to Corded Ware samples) to form Potapovka, Sintashta, and later Srubna, and Andronovo communities (all forming, with Corded Ware groups, a wide Eurasian Steppe MLBA cloud). See the available data from Narasimhan et al. (2018).
The continuous interactions and migrations left thus eventually two communities in the southern Urals genetically similar, but ethnolinguistically diverse:
To the north, Abashevo-Balanovo – but potentially also Fatyanovo, and related North-East European late Corded Ware groups – borrowed necessary words from Indo-Iranian neighbours, while maintaining their Finno-Ugric language and culture.
To the south, immigrants (or their descendants) of Abashevo origin expanding among Pre-Proto-Indo-Iranian-speaking North Caspian communities assimilated the surrounding culture and language, giving it their own accent (i.e. ‘satemizing’ it) and turning it into Proto-Indo-Iranian (see e.g. Parpola’s account).
Anthropologically, this ‘long-term founder effect’ that appears as genetic replacement is probably explained by the faster life history in MLBA North Caspian populations, likely due to a combination of changing environmental and social circumstances.
I am happy to see that people are resorting now to dialectal classifications and Y-DNA to explain the findings in Old Hittites, Tocharians (and related migrations), and Indo-Iranians. It is especially interesting to see precisely this Danish group downplay the relevance of ancestry and favor complex anthropological models when assessing migrations and ethnolinguistic identification.
So let’s talk about the growing elephant in the room.
It seems we all accept now Tocharian’s more archaic Late PIE nature, which is supported by waves of late Khvalynsk migrants starting probably ca. 3300 BC, as seen in different samples to the east in Central Asia, and to the south in Iran. Almost all of them share R1b-L23 lineages.
NOTE. Whereas their early LPIE dialects have not survived to historic times, the rather speculative hypotheses of Euphratic and Gutian languages may be of interest.
We also know of the coetaneous migrants that settled to the west of the Don River (in the territory of the previous late Sredni Stog culture), to form the western South-Bug / Lower Don groups, which, together with the Volga-Ural / North Caucasian groups formed the early Yamna culture, that dominated from ca. 3300 BC over the Pontic-Caspian steppe.
It is only logical that the other attested languages belonging to the common Late PIE trunk must come from these groups, which must have stuck together for quite some time – after the recently proven late Khvalynsk migrations – , to allow for the spread of isoglosses (not found in Tocharian) among them.
This is agreed, even by the Copenhagen group, who expressly state that Yamna is to be identified with the rest of Late PIE languages after the Tocharian-related migrations.
The period of an early Yamna community constrained to the Pontic-Caspian steppe (ca. 3300-3000 BC) is followed by renewed waves of Late Proto-Indo-European migrations, during which areal contacts and innovations (even between unrelated LPIE branches) can still be reconstructed.
These later migrations can be precisely described as follows (after the latest studies):
Yamna migrants, of mixed R1b-L51 and R1b-Z2103 lineages, settle ca. 3000-2600 BC along the lower Danube, in the Balkans and the Carpathian basin, giving rise later to groups of:
In the Pontic-Caspian steppe, early Yamna groups evolve into (from west to east) Late Yamna, Catacomb, and Poltavka groups, ca. 2800-2300 BC, all still dominated by R1b-L23 lineages (see discussion on the Catacomb sample), with:
Expanding early Proto-Iranian and Proto-Indo-Aryan groups in Srubna (to the west) and Andronovo (to the east), during the first half of the 2nd millennium BC, dominate over the Bronze Age steppe and Central Asia with expanding R1a-Z93 lineages.
1.A) For Germanic, we already have proof that an appropriate, unitary Scandinavian society, ripe for the development of a common Pre-Germanic language (that expanded much later, during the Iron Age, as Proto-Germanic) could have developed only after the arrival of Bell Beakers (see Prescott 2017). The association of proto-historic Germanic tribes mainly with the expansion of R1b-U106 lineages bears witness to that.
NOTE. Even without taking into account the likely L51 samples from Khvalynsk, it is by now quite clear that R1b-L51 lineages were already admixed in Yamna settlers from the Carpathian Basin, and any subclade of U106, L21, DF27, or U152 can thus be found everywhere in Europe associated with any of those North-West Indo-European migrations. What we are seeing later, as in the East Bell Beaker migrants arriving in the British Isles (L21), Iberia (DF27), or the Netherlands/Scandinavia (U106), is the further reduction in variability coupled with the expansion of a few successful families (and their lineages), as we know usually happens during migrations.
NOTE. The few ancestral traits common to Germanic and Balto-Slavic are today considered a common substrate language to both, and not due to close contacts (and still less a common branch, as was proposed in the 1st half of the 20th c.). You can read e.g. Kortlandt’s Baltic, Slavic, Germanic (2017), or our Corded Ware substrate hypothesis (2017). In both theories, the referenced substrate is likely a non-Indo-European language, and in both cases it is related to the Corded Ware culture, which represents their most common immediate ancestral population before the spread of Bell Beakers.
2) The late Corded Ware groups of Finland and Estonia, as well as Fatyanovo and Abashevo (and succeeding groups of Eastern Europe) may now be more clearly associated with Proto-Finno-Ugric dialects, and thus probably Corded Ware groups in general with Uralic languages, whose western branches have not survived to this day, with their culture and language being replaced quite early by expanding Bell Beakers.
NOTE. While the demise of Central and Central-East European CWC groups is evident, continuous contacts among Battle Axe culture groups in Scandinavia and the Gulf of Finland through the Baltic Sea – and the strong Bronze Age Palaeo-Germanic influence on Finnic languages (stronger than earlier Indo-Iranian borrowings) – may point to the continuity of Proto-Finnic in Northern Scandinavia, which may force a reinterpretation of the prehistoric location of Proto-Finnic-speaking groups.
Here is my translation of the reported summary (emphasis mine):
Khokhlov, A.A. Preliminary results of anthropological and genetic studies of materials of the Volga-Ural region of the Neolithic-Early Bronze Age by an international group of scientists.
In his report, A. A. Khokhlov introduced the scientific circle to the still unpublished data of the new Eneolithic burial ground Yekaterinovskiy Cape, which combines both the Mariupol and Khvalynsk features, and is dated to the fourth quarter of the V millennium BC. All samples analyzed had a Uraloid anthropological type, the chromosome of all samples belonged to haplogroup R1b1a2 (R-P312/S116), and to haplogroup R1b1a1a2a1a1c2b2b1a2. mtDNA to haplogroups U2, U4, U5. In the Khvalynsk burial grounds (first half of the IV millennium BC), the anthropological material differs in a greater variety. In addition to the Uraloid substratum, European wide-faced and southern European variants are recorded. To the samples are added haplogroup R1a1, O1a1, I2a2 to mtDNA T2a1b, H2a1.
So, first of all:
This is a reported summary of an oral communication, and it was written in a forum by a user. Unlike many out there, though, this one uses his real name, apparently attended the conference, and is himself a Russian of self-reported haplogroup R1a1a, so he probably has no interest in reporting this if it’s not true. Errors contained may have been made by him, and may not have been found in the original communication, since he says he wrote it by hand.
Something is obviously off with the haplogroup nomenclature. There has recently been mixing of standards, with some papers reporting R1b1a2-M269 (which is supposed to be now ISOGG V88), and most using R1b1a1a2-M269. What I had never seen is both standards used at the same time, as in this report, so I guess it’s another error of transcription.
It is doubtful that we would be talking about that recent referenced subclade of U106, but it can’t be a surprise to finally find L51 subclades alongside Z2103 in Proto-Indo-European territory. Also, the summary must obviously refer to Q1a1, not O1a1, and probably to the first half of the V (and not IV) millennium BC.
NOTE. Since Khokhlov, like Anthony, is an anthropologist, and this is an archaeological conference, we could suppose – if the report is truthful to what he said or what could be read in the summary – that this is the best he can do to report genetic material that was not assessed by him, but by a specialized lab, because it is not his field. I think the relevant data is nevertheless useful until we have the official publication.
From this report of archaeological works, we know there were 60 Early Eneolithic burials excavated in 2013, dating to the period between S’yezzhe and Khvalynsk. 15 more burials were excavated in 2017, and there are to date already around 93 reported burials, with ongoing excavations.
Assuming that what the report conveys is more or less correct in the basics, let’s derive some simple conclusions from the data:
The presence of some samples uniformly of R1b-L23 subclades that early will mean an end to the question of when this haplogroup dominated over the Khvalynsk population, and probably also when it appeared (rather early during this culture’s formation), since it would mean R1b-L23 subclades were widespread already by the end of the 5th millennium.
I can only guess that CHG ancestry will be found in these samples, based indirectly on what is reported in anthropological terms, and what appears later in Yamna and Afanasevo samples. This will contradict some recent comments suggesting an admixture driven by males from the south, and especially a Maykop -> Khvalynsk migration as a source of this component, placing the admixture at earlier times, and/or driven by exogamy. Therefore we can reject the formation of Middle PIE outside of Khvalynsk, and also the expansion of Proto-Anatolian from Maykop (unless Maykop itself is proposed as a steppe offshoot).
Khvalynsk was probably dominated by R1b-L23 subclades already ca. 4250-4000 BC, which – combined with earlier, more diverse Eneolithic samples from the region (dated ca. 5000-4500 BC) – would support an expansion of these subclades just before this time, in the mid-5th millennium BC, as I proposed based on ancient samples and TMRCAs of modern haplogroups. It is now more likely then that I was right in linking the expansion of R1b-M269 and early R1b-L23 lineages as chiefs with the spread of horse riding from early Khvalynsk, and thus associated also with the split and migration of the Proto-Anatolian community, probably with expanding Suvorovo-Novodanilovka chiefs.
NOTE. While the presence of R1b-P312 and R1b-U106 subclades that early does not seem likely based on their estimated formation dates (in turn based on modern descendants), this is not the first time that such estimations have been proven wrong with ancient samples (viz. the “late” Z93 subclade from Eneolithic Ukraine sample I6561). Also, we already have one sample labelled U106 supposedly expanding with Indo-Iranians, and a sample of an early L51 subclade in Central Asia potentially linked to Afanasevo migrants in the infamous tables of Narasimhan et al. (2018), which help support its early presence in the North Caspian area. Some of these younger subclades seem (based on TMRCAs and forming dates of modern haplogroups) more like a wrong ‘excessive-subclade-reporting fest’, probably due to the use of a certain software for inferences of Y-SNP calls from scarce material, but who knows.
EDIT (2 MAY 2018): A commenter in the forum cast doubts on the actual dates of the site, citing the reservoir effect in Khvalynsk which may show earlier radiocarbon dates than the actual ones. Since this is an international team well versed in archaeological remains of this region, and there have been already many samples and remains assessed before and after these dates, it is not very likely that they did not take such problems of radiocarbon dating into account when reporting the findings…
The publication of this and more data in a book is supposedly due for the summer, so let’s wait for the officially reported haplogroups, and for the corrected tables in Narasimhan et al. (2018), to draw the necessary detailed conclusions.
This post was emailed to subscribers of this blog on the 1st of May immediately after publication, with our Newsletter.
EDIT (May 2017) The answer I received from the group to my questions regarding these samples can be read here.
The recent publication of Narasimhan et al. (2018) has outdated the draft of this post a bit, and it has made it at the same time still more interesting.
While we wait for the publication of the dataset (and the actual Y-DNA haplogroups and precise subclades with the revision of the paper), and as we watch the wrath of Hindu nationalists vented against the West (as if the steppe was in Western Europe) and science itself, we have already seen confirmation from the Reich Lab of their new approach to Late Proto-Indo-European migrations.
Yamna/Steppe EMBA, previously identified as the direct source of “steppe” ancestry (AKA ‘Yamnaya‘ ancestry) and Late Indo-European migrations in Asia – through Corded Ware, it is to be understood – has been officially changed. In the case of Indo-Iranian migrations it is the “Steppe MLBA cloud”, after a direct contribution to it of Yamna/Steppe EMBA, which expanded Indo-Iranian, as I predicted ancient DNA could support.
On Twitter, the main author gave the following response when asked about this change regarding the origin of steppe ancestry in Asian migrants (emphasis mine):
Our reasons are:
The Turan samples show no elevated steppe ancestry till 2000BC.
MLBA is R1a
Indus periphery doesn’t have steppe ancestry but Swat does, and EMBA doesn’t work both in terms of time or genetic ancestry to explain the difference.
I am glad to see it finally recognized that Y-DNA haplogroups and time have to be taken into account, and happy also to see an end to the by now obsolete ‘ADMIXTURE/PCA-only relevance’ in Human Ancestry. The timing of archaeological migrations, the cultural attribution of each sample, and the role of Y-DNA variability reduction and expansion have been finally recognized as equally important to assess potential migrations, as I requested.
This change was already in the making some months ago, when David Anthony – who has worked with the group for this paper and others before it – changed his official view on Corded Ware, abandoning his previous support of the 2015 model. His latest theory, which linked Yamna settlements in Hungary with a potential mixed society of migrants (of R1b-L23 and R1a-Z645 lineages) from West Yamna, is most likely wrong, too, but it was clearly a brave step forward in the right direction.
The only reasonable model now is that Yamna expanded Late Proto-Indo-European languages with steppe ancestry + R1b-L23 subclades.
You can either accept this change, or you can deny it and wait until one sample of R1a-Z645 appears in West Yamna or central Europe, or one sample of R1b-L23 appears in Corded Ware (as it is obvious it could happen), to keep spreading the wrong ideas still some more years, while the rest of the world goes on: Mallory, Anthony, and other archaeologists co-authoring the latest paper (probably part of the stronger partnership with academics that we were going to see), who had formally put forward complex, detailed theories, investing their time and name in them, have rejected their previous migration models to develop new ones based on the most recent findings. If they can do that, I am sure any amateur geneticist out there can, too.
The Balto-Slavic dialect and its homeland
An interesting question in Linguistics and Archaeology, now that Corded Ware cannot be identified as “Indo-Slavonic” or any other imaginary ancient group (like Indo-Slavo-Germanic), remains thus mostly unchanged since before the famous 2015 genetic papers:
Was Balto-Slavic a dialect of the expanding North-West Indo-European language, a Northern LPIE dialect, as we support, based on morphological and lexical isoglosses?
Or was it part of an Indo-Slavonic group in East Yamna, i.e. a Graeco-Aryan dialect, based mainly on the traditional Satem-Centum phonological division?
I am a strong supporter of Balto-Slavic being a member of a North-West Indo-European group. That’s probably because I educated myself first with the main Spanish books* on Proto-Indo-European reconstruction, and its authors kept repeating this consistent idea, but I have found no relevant data to reject it in the past 15 years.
* Today two of the three volumes are available in English, although they are from the early 1990s, hence a bit outdated. They also maintain certain peculiarities from Adrados’ own personal theories, such as multiple (coloured) laryngeals, 5 cases – with a common ancestral oblique case – for Middle PIE, etc. But it has lots of detailed discussions on the different aspects of the reconstruction. It is not an easy introductory manual to the field, though; for that you have already many famous short handbooks out there, like those of Fortson (N.American), Beekes (Leiden), or Meier-Brügger (Germany).
Fernando and I have always maintained that North-West Indo-European must have formed a very recent community, probably connected well into the early 2nd millennium BC for certain recent isoglosses to spread among its early dialects, based on our guesstimates*, and on our belief that it formed at some point not just a dialect continuum, but probably a common language, so we estimated that the expansion was associated with the pan-European influence of Únětice and close early Bronze Age European contacts.
NOTE. I know, you must be thinking “linguistic guesstimates? Bollocks, that’s not Science”. Right? Wrong. When you learn a dozen languages from different branches, half a dozen ancient ones, and then still study some reconstructed proto-languages from them, you begin to make your own assumptions about how the language changes you perceive could have developed according to your mental time frames. If you just learned a second language and some Latin in school, and try to make assumptions as to how language changes, or you believe you can judge it with this limited background, you have evidently the wrong idea of what a guesstimate is. I accept criticism to this concept from a scientist used only to statistical methods, since it comes from pure ignorance of what it means. And I accept alternative guesstimates from linguists whose language backgrounds may differ (and thus their perception of language change). However, I would not accept a glottochronological or otherwise (supposedly) statistical model instead (or a religious model, for that matter), so we have no alternatives to guesstimates for the moment.
In fact, guesstimates and dialectalization have paved the way to the steppe hypothesis, first with the kurgan hypothesis by Marija Gimbutas, then complemented further in the past 60 years by linguists and archaeologists into a detailed Khvalynsk -> Yamna -> Afanasevo/Bell Beaker/Sintashta-Andronovo expansion model, now confirmed with genomics. So either you trust us (or any other polyglot who deals with Indo-European matters, like Adrados, Lehmann, Beekes, Kloekhorst, Kortlandt, etc.), or you begin learning ancient languages and obtaining your own guesstimates, whichever way you prefer. The easy way of numbers + computer science does not exist yet, and is quite far from happening – until we can understand how our brains summarize and select important details involved in obtaining estimates – , no matter what you might be reading (even in Nature or Science) recently…
Data from the 2015 papers changed my understanding of the original NWIE-speaking community, and I have since shifted my preferred anthropological model (from a Northern dialect in Yamna spreading into a loose NWIE-speaking Corded Ware -> Únětice) to a quite close group formed by late Yamna settlers in the Carpathian Basin, expanded as East Bell Beakers, and later continuing with close contacts through Central European EBA.
NOTE. As you can read, we initially rejected Gimbutas’ and Anthony’s (2007) notion of a Late PIE splitting suddenly into all known dialects (viz. Italo-Celtic with Vučedol/Bell Beaker), and looked thus for a common NWIE spread with Corded Ware migrants, with help from inferences of modern haplogroup distribution (as was common in the early 2000s). Language reconstruction was the foundation of that model, and it was right in its own way. It probably gave the wrong idea to geneticists and archaeologists, who quite easily accepted some results from the 2015 papers as supporting this model. But it also helped us develop a new model and predict what would happen in future papers, as demonstrated in O&M 2018. Any alternative linguistic and archaeological model could explain what is seen today in genomics, but our model of North-West Indo-European reconstruction is obviously at present the best fit for it.
Nevertheless, one of the most important Balticists and Slavicists alive, Frederik Kortlandt, posits that there was in fact an Indo-Slavonic group, so one has to take that possibility into account. Not that his ideas are flawless, of course: he defends the glottalic theory – which is still held today by just a handful of researchers – , and I strongly oppose his description of Balto-Slavic and Germanic oblique cases in *-m- (against other LPIE *-bh-) as an ancestral remnant related to Anatolian (an ending which few scholars would agree corresponds to what he claims), since that would probably represent an older split than warranted in our model. I believe genetics is proving that the dialectalization of Late PIE happened as Fernando López-Menchero and I described.
NOTE. The idea with these examples of how he has been wrong in LPIE and MPIE reconstruction is not to observe the common ad hominem arguments used by amateur geneticists to dismiss academic proposals (“he said that and was wrong, ergo he is wrong now”). It is to bring to attention that the argument from authority is important for the academic community insofar as it creates a common ground, i.e. especially when there are many relevant scholars agreeing on the same subject. But, indeed, any model can and should be challenged, and all authorities are capable of being wrong, and in fact they often are.
The most common explanation today for the dialectal development *-m- is an innovation (not an archaism), whether morphological (viz. Ita. and Gk. them. pl *-i) or phonological (as I defend); and the most commonly repeated model for the satemization trend (even for those supporting a three-dorsal theory for PIE) is areal contact, whether driven by a previous (most likely Uralic) substratum, or not. Hence, if Kortlandt’s main different phonological and morphological assessments of the parent language are flawed, and they are the basis for his dialectal scheme, it should be revised.
The ‘atomic bomb’ that Indo-Slavonic proponents launched, in my opinion, was Holzer’s Temematic (born roughly at the same time as the renewed Old European concept in the North-West Indo-European model of Oettinger) – and indeed Kortlandt’s acceptance of it. It seems to me like the linguistic equivalent of the archaeological “patron-client relationship” proposed by Anthony for a cultural diffusion of Late PIE into different Corded Ware regions: almost impossible to be fully rejected, if the Indo-Slavonic superstrate is proposed for a relatively early time.
In my opinion, the shared morphological layer with North-West Indo-European is obviously older than Iranian influence on Slavic, and I think this is communis opinio today. But how could we disentangle the dialectalization of Balto-Slavic, if there is (as it seems) an ancestral substrate layer (most likely Uralic) common to both Balto-Slavic and Indo-Iranian? It seems a very difficult task.
The expansion of Balto-Slavic
In any case, there are two, and only two mainstream choices right now.
NOTE. Mainstream, as in representing trends current today among Indo-Europeanists, so that many programs around the world would explain these alternative models to their students, or they would easily appear in most handbooks. Not like the word “mainstream” you read in any comment out there by anyone who has never been interested in Indo-European studies, and uses any text from any author, written who knows how long ago, merely to justify their ethnic preconceptions coupled with certain genomic finds.
You can agree with:
A) The Spanish and German schools of thought, together with many American and British scholars, as well as archaeologists like Heyd, Mallory, or Prescott, and now Anthony, too: the language ancestral to Balto-Slavic, Germanic, and Italo-Celtic accompanied expanding West Yamna/East Bell Beakers into Europe, and then their speakers – like the rest of peoples everywhere in Europe – admixed later in the different regions.
B) Frederik Kortlandt and other Indo-Slavicists. The ‘original’ Balto-Slavic would have spread with Srubna (and likely Potapovka before it), as a product of the admixture of East Yamna’s Indo-Slavonic with incoming Corded Ware migrants (this would correspond to my description of Indo-Iranian). ‘True’ Balto-Slavic speakers would have then absorbed the Temematic-speaking migrants (equivalent to early Balto-Slavic migrants as described in the demic diffusion model) spreading from the west, most likely in the steppe. Later developments from the steppe would have then brought Baltic to the north, and Slavic to the west.
Therefore, in both cases the language spoken by early R1a-Z645 lineages in Únětice or Mierzanowice/Nitra EBA cultures would have been an eastern North-West Indo-European dialect associated with expanding Bell Beakers, and closely related to Germanic and Italo-Celtic. In the second case, the ancient samples we see genetically closer to modern West Slavs could thus be identified with those speaking the Temematic substrate absorbed later by Balto-Slavic, or maybe by Balts migrating northward, and Slavs spreading west- and southward.
NOTE. In any case, we know that R1a-Z645 subclades resurged in Central-East Europe after the expansion of Bell Beakers, potentially showing an ancient link with the prevalent R1a subclades in the region today. We know that some ancient Central European populations cluster near modern West Slavs, but in other interesting regions (like the British Isles, Central Europe, Scandinavia, or Iberia) we also see close clusters, and nevertheless observe historically documented radical ethnolinguistic changes, as well as many different subsequent genetic inflows and founder effects, that have significantly altered the anthropological picture in these regions, so it could very well be that the lineages we find in ancient samples do not correspond to modern West Slavic lineages, or even similar ancient and modern lineages could show a radical cultural discontinuity (as is likely the case in this to-and-from-the-steppe migration scheme).
Since we are going to see signs of both – west and east admixture – in early Slavic communities near the steppe, and the distribution from South, West, and East Slavs will include a wide “cloud” connecting Central, East, and South-East Europe, as it is evident already from early Germanic samples, it may be interesting to shift our attention to the Tollense valley and Lusatian samples, and their predominant Y-DNA haplogroups. Once again, tracking male-driven migrations from Central Europe to the Baltic region and the steppe, and back again to much of Central and South Europe, will determine which groups expanded this eastern NWIE dialect initially and in later times.
Since Baltic and Slavic languages are attested quite late, genetics is likely to help us select among the different available models for Balto-Slavic, although (it is worth repeating it) these lineages may not be the same that later expanded each dialect.
NOTE. Bronze and Iron Age samples might begin to depict the true Balto-Slavic migration map. Apart from the strong differences in the satemization processes seen among Baltic, Slavic, and Indo-Iranian, from an archaeological point of view the geographic location of the earliest attested Baltic languages and the prehistoric developments of the region seem to me almost incompatible with a homeland in the steppe. Anyway, in the worst-case scenario – for those of us who work with Balto-Slavic to reconstruct North-West Indo-European – there is consensus that there must be an eastern North-West Indo-European language (which some would call Temematic), whose common traits with Germanic and Italo-Celtic we use to reconstruct their parent language. The question remains thus mostly theoretical, of limited pragmatic use for the reconstruction.
The third way: Baltic Late Neolithic
I have referred to Kristiansen and his group‘s position regarding Corded Ware as Indo-European as flawed before. While their latest interpretation (and language identification) was wrong, Kristiansen’s original idea of long-lasting contacts in the Dnieper-Dniester region with the area occupied by late Trypillia developing a Proto-Corded Ware culture was probably right, as we are seeing now.
New data in Mittnik et al. 2018 show some interesting early Late Neolithic samples from the Baltic region – Zvejnieki, Gyvakarai1 (R1a-Z645) and Plinkaigalis242 – , proving what I predicted: that elevated steppe ancestry and R1a-Z645 subclades would be found in the Dnieper-Dniester region unrelated to the Yamna expansion, and, it seems, to migrants of the Corded Ware A-horizon.
Funnily enough, this shows that there were probably ancient interactions in the region, as originally asserted by Kristiansen, and probably following some of Victor Klochko‘s proposed exchange paths, but earlier than predicted by him.
Funny also how Anthony, too – like Kristiansen – , may have been right all along since 2007, in proposing that Corded Ware (the nuclear Corded Ware migrants) stemmed from the Dnieper-Dniester region roughly at the same time as Yamna migrants expanded west, and that they did not have any direct genetic connection (in terms of migrations) with each other.
Both researchers, who collaborated with the latest genomic research, remade their models, and have to revise now their most recent proposals with the new data, influencing each new paper published with their pressure to be right in their previous models, and with new genomic data compelling them to change their theories under the pressure not to be too wrong again, in this strange vicious circle. Had they remained silent and committed to their archaeological theories, they could have been right all along, each one in their own way.
NOTE. BTW, in case you see ad hominem here too, I feel compelled to say that only thanks to their commitment to disentangle the truth about ancient migrations, and their readiness to collaborate with genetic research – unlike many others in their field – we know today what we know. If they have been wrong many times, it is because they have tried to connect the genetic dots as they were told. Only because of their readiness to explore their science further they should be praised by all. But, again, that does not mean that they cannot be wrong in their models…
Thanks to Anthony’s latest change of mind, we don’t have to hear the “cultural diffusion” argument anymore, and I consider this a great advance for the field.
NOTE. Not that there could not be prehistoric cultural diffusion events of language (i.e. not accompanied by genetic admixture), of course, but such theories, almost impossible to disprove, probably need much more than a simple “patron-client relationship” proposal and anthropometry to justify them, in a time when we will be able to see almost every meaningful personal exchange in Genomics…
Today – since the finding of Ukraine_Eneolithic sample I6561, of haplogroup R1a-Z93, dated ca. 4200 BC, and likely from the Sredni Stog culture – it seems more likely than ever that the expansion of R1a-Z645 subclades was in fact associated with the spread of steppe admixture probably near the North Pontic forest-steppe region, most likely from the Dnieper-Dniester or Upper Dniester region.
The appearance of a ‘late’ Z93 subclade already at such an early date, with steppe admixture, makes it still more likely that the Proto-Corded Ware culture, from where Corded Ware migrants of R1a-Z645 lineages later spread, was probably associated with this wide region.
NOTE. A migration of Yamna settlers northward along the Prut dated ca. 3000 BC or later could have justified the appearance of steppe admixture in the Dnieper-Dniester region, as I proposed for the Zvejnieki sample, although dates from Baltic samples are likely too early for that. For this to be corroborated, migrants should be accompanied up to a certain region by R1b-L23 lineages, and this could mean in turn a revival of Anthony’s original model of cultural diffusion of 2007. The most likely scenario, however, as predicted by Heyd, given the early appearance of steppe admixture and R1a-Z93 subclades in the forest-steppe during the 5th millennium, is that the admixture happened much earlier than that, fully unrelated to Late PIE migrations.
The modern Baltic and Slavic conundrum
As for some people of Northern European ancestry previously supporting a bulletproof Yamna (R1a/R1b) -> Corded Ware migration that was obviously wrong; now supporting different Sredni Stog -> Corded Ware groups representing Indo-Slavonic (and Germanic??) in a model that is clearly wrong: how are these attempts different from Western Europeans supporting the autochthonous continuity of R1b-P312 lineages against all recent data, from Indians supporting the autochthonous continuity of R1a-M417 lineages no matter what, and from the more recent trend of autochthonous continuity theories for N1c lineages and Uralic in Eastern Europe?
Modern Germanic-speaking peoples can trace their common language to Nordic Iron Age Proto-Germanic, Celts to La Tène’s expansion of Proto-Celtic, and Romance speakers to the Roman expansion (and to an earlier Proto-Italic), all three dating approximately to the Iron Age. Proto-Slavic is dated much later than that, and probably Proto-Baltic too (or maybe earlier depending on the dialectal proposal), with Balto-Slavic being possibly coeval with Pre-Proto-Germanic and Italo-Celtic, but probably slightly later than that. Also, the language ancestral to Slavic may be (like a theoretical Proto-Romance language) impossible to reconstruct with precision, due to multiple substrate (or superstrate?) influences on the wide territory where Proto-Slavic formed and expanded from, in close alliance with steppe communities of different ethnolinguistic backgrounds.
We know that proto-historic Germanic, Celtic, and Italic peoples spread from relatively small regions, and had almost nothing to do with historic groups speaking their daughter languages, let alone modern speakers. Baltic and Slavic are not different.
NOTE. We have read that Weltzin samples clustered closely to Central Europeans (especially Austrians), and at a certain distance from modern Poles. That’s the conclusion of Sell’s PhD thesis, and it may be right, if you take only modern samples for comparison. However, if you have read or thought that they represented some kind of “ancestral Germanic vs. Slavic” battle, please imagine Trump’s voice for my opinion: Wrroonng, wrroonng, wrroonng. They cluster closely with Bell Beaker migrants, Poland BA, and Únětice (in this order), which we now know thanks to the data from O&M 2018 and Mittnik et al. 2018. And we also know who they don’t cluster close to: Corded Ware and Trzciniec samples. Therefore, people from the region near the most likely homelands of Pre-Proto-Germanic and Proto-Balto-Slavic are – as expected – likely descendants from Bell Beaker migrants in Central Europe. The genetic relationship of those ancient samples to modern inhabitants of Central-East Europe? Not obvious – at all.
We also know (and have known for a long time, well before these recent papers) that the oldest attested Indo-European languages – Mycenaean, early Anatolian languages, and Indo-Aryan (through certain words in Mitanni inscriptions) – do not show continuity from the places where they were first attested to the Late and Middle Proto-Indo-European (steppe) homeland either. There should be no problem then in accepting that there is no linguistic, archaeological, or common sense reason to support that Balto-Slavic is older or shows more regional continuity than other IE languages from Europe.
NOTE. Oh yes, Balts saying “Baltic is the most similar language to PIE” I hear you thinking? Uh-huh, sure. And according to some Greeks (supported e.g. by the conclusions from Lazaridis et al. 2017) Mycenaeans were ‘autochthonous’, and Proto-Greek the most similar to PIE. For many Hindus, Vedic Sanskrit is in fact PIE, and the latest paper by Narasimhan et al. (2018) only reinforces this idea (don’t ask me why). Also, Caucasian scholar Gamkrelidze (with Ivanov) supported the origin of the language precisely in the Caucasus, with Armenian being thus the purest language. For Italian fans of Virgil and the Roman Empire, Latin (like Aeneas) comes from Anatolian linguistically and genetically, hence it must be the ‘oldest’ IE dialect alive… No, wait, Danish scholars Kroonen and Iversen quite recently asserted that Germanic is the oldest to branch off, then it should thus be nearest to PIE! I think you can see a pattern here… And don’t forget about the new Vasconic-Uralic hypotheses going on now, with Vasconic fans of R1b changing from Palaeolithic to Mesolithic, and now to European Neolithic and whatnot, or Uralic fans of N1c changing now from Mesolithic EHG to Siberia (for ancestry) or Central Asia (for N1c subclades), or whatever is necessary to believe in ‘continuity’ of their people following the newest genetic papers… Just pick whatever theory you want, call it “mainstream”, and that’s it.
So, if there is no reliable archaeological model connecting Bronze or Iron Age cultures to Eastern European cultures which are supposed to represent the Proto-Slavic and Proto-Baltic homelands… why on earth would any reasonable amateur (not to speak of scholars) dare propose any sort of genetic or linguistic continuity for thousands of years from PIE to early Slavs, a people whose first blurry appearance in historical records happened during the Middle Ages in rather turbulent and genetically admixed regions? It does not make any sense, and it had all odds against it. Blond hair, blue eyes, lactase persistence? Sure, and ABO group, brachycephaly, anthropometry… All very scientifish.
Human ancestry can only help refine solid academic theories, it cannot create one. Every new pet theory used to satisfy modern cultural pre- and misconceptions has failed, and it will fail again, and again, and again…
To have one’s own anthropological model of prehistoric migration requires time and study. It is not enough to play with software and to misuse traditional academic disciplines just to ‘prove’ some completely irrelevant, meaningless, and false continuity.
I am not a fan of continuity theories – that much should be clear for anyone reading this blog. However, most of such proposals’ supremacist (or rather fear-of-inferiority) overtones don’t mean they have to be wrong. It just means that most of them, most of the time, most likely are.
While reading Tommenable’s comments, I thought about a potential alternative model, where one could a priori accept an identification of North Pontic cultures as ‘Indo-Slavonic’, which seems to be the Eastern European R1a continuist trend right now.
NOTE. To accept this model, one should first (not a posteriori) accept an Indo-Slavonic linguistic group on theoretical grounds, of course, and take the steppe ancestral component (and not archaeological data) as the most meaningful aspect to consider for language expansion and exchange (which we know is not the most intelligent approach to cultural or language change).
Thinking about how Genomics could challenge what mainstream Linguistics and Archaeology accepts, the only situation I can think of (using simplistic phylogeography) regarding late Khvalynsk-Sredni Stog contacts (until ca. 3300 BC) is:
That the community of R1b-L51 lineages was in fact an isolated group, and not a western one – i.e. to the east within the Volga-Ural groups, or maybe to the south within the North Caucasian groups.
That the R1b-Z2103 community was a huge one dominating over much of the steppe, from the Dnieper area to the Volga-Ural region (where we know they were).
That R1a-M417 subclades (and especially subclade R1a-Z645) with steppe ancestry, as found in Corded Ware migrants, were only found in the North Pontic area (i.e. in Sredni Stog) during the fourth millennium (until at least 3300 BC, when Yamna substitutes it), and did not form other communities in the forest-steppe or Forest Zone (from where Corded Ware eventually expanded), as it is quite likely.
That both the R1b-Z2103 and R1a-Z645 communities shared obvious genetic connections (whatever they were) around the Dnieper, that could justify a common, shared language.
Only then, if a widespread Graeco-Aryan-speaking community happened to be spread from west to east in the Pontic-Caspian steppe, with close contacts with North Pontic cultures, and having an isolated Northern Late PIE community somewhere different than West Yamna, it could leave for me a reasonable doubt of a cultural connection (maybe “Indo-Slavonic” in nature) of the North Pontic steppe. But then we would probably be stuck – yet again – with some sort of cultural diffusion event, impossible to demonstrate.
Since it is known (in Linguistics, and also in Y-DNA lineages, due to the early expansion of Z2103 subclades) that Graeco-Aryan groups separated early, this model would not be impossible.
Also a priori in favour of that model would be the early expansion of a (Northern IE-speaking) Pre-Tocharian population to the east. On the other hand, from an archaeological point of view, the group reaching Afanasevo seems to have expanded from Repin, just like the community expanding Yamna to the west of the Dnieper.
I really doubt there can be any serious discussion though, apart from among amateur geneticists with a personal interest in this, because:
Dialectal separation within a Late Proto-Indo-European language must have happened late, gradually, and in close contact, allowing for common innovations to spread through dialectal groups.
It does not make sense in terms of prehistoric cultures, since there is no direct connection or migration among steppe cultures but for the Novodanilovka and the Yamna expansions.
Indo-Slavonic is only supported by a handful of linguists, and not in the way or timing described in this model.
NOTE. You can read Kortlandt’s works in Academia.edu (also on his personal website) if you are really interested in knowing more about an Indo-Slavonic proposal, from an expert Balticist and Slavicist. However, if your intent is to demonstrate some ancient ethnic link of “your” people (whatever that means) to mythical Proto-Indo-Europeans, you would not need actual knowledge or sound theories to do that, so you can skip that part. Also, Kortlandt would probably support a later model of Indo-Slavonic expansion in the steppe, related to East Yamna, and later Sintashta, Srubna, etc…
If you think about it, if most modern Slavs were mainly of R1b-L23 lineages instead of R1a-Z645 (a replacement which, as is clear now, is the consequence of a simple resurgence of previous lineages in East-Central Europe, coupled with a later gradual replacement through founder effects, so no big migration history here), and Finnic speakers were mainly of R1a-Z645 lineages (whose replacement by N1c lineages seems also the consequence of quite late consecutive founder effects), I doubt we would be having this reticence to accept sound anthropological models.
NOTE. The change of narratives where certain languages must have accompanied R1a-Z645 and N1c lineages, but in alternative ways not previously described, is obviously unjustified, if linguistic and archaeological data tell a different story. As unjustified as it is to change Yamna for “Neolithic Steppe” as homeland of Late Indo-European, to fit it with the steppe ancestry concept…
It was reported long ago that genetic studies were being made on remains of a surprisingly big battle that happened in the Tollense valley in north-eastern Germany, at the confluence between Nordic, Tumulus/Urnfield, and Proto-Lusatian/Lusatian territories, ca. 1200 BC.
At least 130 bodies and 5 horses have been identified from the bones found. Taking into account that this is a small percentage of the potential battlefield, around 750 bodies are expected to be buried in the riverbank, so an estimated 4,000-strong army fought there, accounting for one in five participants killed and left on the battlefield.
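The arithmetic behind that estimate can be checked in a few lines. This is only a rough sketch using the figures quoted above; the variable names are mine, and the "one in five" casualty ratio is taken at face value rather than derived independently.

```python
# Figures quoted in the text; the variable names are illustrative.
bodies_identified = 130   # bodies identified so far from the bones found
bodies_expected = 750     # bodies expected to be buried in the riverbank
one_in = 5                # assumption from the text: one in five participants killed

# If the expected dead are one in five of all participants,
# the implied number of participants is simply five times the dead.
implied_army = bodies_expected * one_in

# Share of the expected dead that has actually been identified.
identified_share = bodies_identified / bodies_expected

print(f"Implied participants: {implied_army}")
print(f"Share of expected dead identified: {identified_share:.1%}")
```

The implied figure comes out somewhat below the rounded 4,000 quoted, which is what you would expect from an estimate given to the nearest thousand.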
Body armour, shields, helmet, and corselet used may have needed training and specialised groups of warriors, with their organisation being a display of military force. According to Kristiansen , this battle is therefore unlike any other known conflict of this period north of the Alps – circumscribed to raids by small groups of young men –, and may have heralded a radical change in the north, from individual farmsteads and a low population density to heavily fortified settlements.
The Urnfield culture (ca. 1300-750 BC) is associated with the rise of a new warrior elite, and the formation of new farming settlements and their urnfields. In some areas there is continuity from Tumulus to Urnfield culture, with narrowing and concentration of settlements along the river valleys, but there are also wide-ranging migrations. These migrations are similar to those seen later in the La Tène culture. This period is also coincident with the time of the mythical battle of Troy, with the collapse of the Mycenaean civilisation, and with the raids of the Sea Peoples in Egypt, and the marauders of the Hittites.
The majority of sampled individuals fall within the variation of contemporary northern central European samples (including Nordic Late Neolithic and Bronze Age and Únětice samples); however, there are also some outliers closer to Neolithic LBK and modern Basques, suggesting that central and western European cultures were still at that time closely interconnected, continuing thus the connections created during the Bell Beaker expansion a thousand years earlier. The genetic similarity of most samples to modern western Slavic populations (as well as Austrians and Scots) gives support to the origin of Balto-Slavic in Bronze Age north-central Europe, and more specifically in the Lusatian culture.
In fact, scarce aDNA from late Urnfield populations from its north-eastern territories, in Saxony – near the Lusatian culture –, already shows a mixture of lineages, which suggests genetic continuity with older cultures (or more likely a resurgence) after the Bell Beaker expansions: an R1a1a1b1a-Z282 lineage was found in Halberstadt (ca. 1085 BC), and of the eight males studied from the Lichtenstein cave (ca. 1000 BC), five were of haplogroup I2a2b-L38, two of haplogroup R1a1-M459, and one of haplogroup R1b-M343.
Regarding modern populations, the eastern and western peaks in R1a1a1b1a1-M458 lineages might support a west-east migration, as well as an east-west migration, and indeed both in different periods, which is expected to be found if Lusatian is linked to the initial eastward expansion of Balto-Slavic during the Bronze and Iron Ages, and later younger subclades are linked to the West Slavic expansion to the west during Antiquity.
Now, if this is so, then we have to accept that these territories of north-central Europe (between East Germany and Poland), occupied earlier by Corded Ware cultures, adopted Balto-Slavic only after the Bell Beaker expansion; therefore, models arguing for Balto-Slavic origins in east European late Corded Ware groups (or heir cultures), like Trzciniec, Chornoles, Bilozerska, or Milograd (see e.g. the article on Wikipedia) have to be rejected. We also know that Pre-Germanic could have only formed in the Nordic Late Neolithic, after the cultural unification of the Dagger Period, heralded by the arrival of Bell Beakers; and that Indo-Iranian was the language of the Sintashta-Petrovka culture, which had absorbed the previous (Yamna-related) Poltavka culture.
But, if Indo-European was only spoken at both ends of territories previously occupied by Corded Ware cultures – stretching from Scandinavia to the Urals, including the Baltic region… what language did Corded Ware peoples actually speak? The most likely one? Uralic, indeed.
I feel there has recently been an increase in references to quite old – and generally outdated – terms, such as Germano-Balto-Slavic and “Indo-Slavonic” (i.e. Satem), described as Late Indo-European dialects. This is happening in forums and blogs that deal with “Indo-European genetics”, and only marginally (if at all) with the main anthropological subjects that form Indo-European studies, that is Linguistics and Archaeology.
Firstly, let me go apparently against the very aim of this post, by supporting the common traits that these dialects actually share.
Satem Indo-European or Indo-Slavonic
Balto-Slavic is a complex dialect, whose known proto-history and history already offer a difficult picture. Contrary to the opinion of many, there is no single document that can identify the terms Antes, Sklavenes, and Venedi with the cultures that are usually identified as speaking languages ancestral to East Slavic, South Slavic, and West Slavic. These names were used interchangeably in the Byzantine Empire, which was obviously not involved in classifying Slavic peoples by their linguistic branches… For more on the historical identification of Slavic tribes, read Florin Curta’s The Making of the Slavs: History and Archaeology of the Lower Danube Region, c. 500-700 A.D. On the identification of potential candidates for early Slavic and Baltic cultures, you can read the appropriate entries in the Encyclopedia of Indo-European Culture, by Mallory & Adams.
Baltic and Slavic tribes seem to have a too recently recorded history to be able to confidently trace back their cultural predecessors. In its recent history, close to the formation of its community, Proto-Slavic must have had intense contacts with Iranian-speaking peoples. Also, previously, if R1a-M417 subclades are in fact the most common lineages expanded with the Corded Ware culture (as it seems now), they have no doubt shared a common language, most likely a non-Indo-European one. Not Indo-European in the strict sense, at least, since it formed part most likely of the Indo-Uralic continuum that must have been spoken during the Mesolithic in Eastern Europe, and a language probably nearer to Uralic than to classic Indo-European.
A strong connection between Balto-Slavic and Indo-Iranian in a common Satem branch, as supported by Kortlandt (see e.g. Balto-Slavic and Indo-Iranian 2016, or a reconstruction of Schleicher’s Fable in PIE branches), would imply that a Corded Ware culture from the Dnieper-Donets – speaking a Graeco-Aryan dialect – interacted for centuries with Uralic and other Graeco-Aryan languages, only later influenced by North-West Indo-European (as late as its contact with East Germanic during the Barbarian migration). This model cannot justify the shared traits of Balto-Slavic with North-West Indo-European, unless a third, substrate language – like Holzer’s (1989) Temematic proposal – is added to the equation. Such models are not impossible, but seem too complex.
On the other hand, linguistically Balto-Slavic seems to have split in its known branches quite early, and traits such as the satemization trend appear to have affected each main dialect (Baltic and Slavic) differently, as attested in the different ruki development, hence the assumption of its early but different influence of the trend to both Indo-Iranian and Balto-Slavic (or, more exactly, Indo-Iranian, Baltic, and Slavic). Also, the common North-West Indo-European vocabulary, as well as morphological trends shared by NW IE dialects, clearly affects the oldest layer of both languages (hence the parent Proto-Balto-Slavic too), which predates thus the satemization trend, and further contributes to the idea of a common root between West Indo-European (or Italo-Celtic), Northern Indo-European (the language ancestral to Pre-Germanic), and Proto-Balto-Slavic.
Germano-Balto-Slavic or North European
A common group between Germanic and Balto-Slavic is justified by the presence of certain common isoglosses, such as the famous shared oblique cases in *-m- instead of *-bh-, and support for such a group is found recently e.g. in Gamkrelidze-Ivanov (1993-1994) – who nevertheless support a North-West Indo-European continuum –, or in Jasanoff, for whom both languages (regarding phonological traits) “began their post-IE history together”.
On the other hand, such shared traits could have derived either from old contacts – supported traditionally because of their proximity –, or from a common substrate to both without a need for direct contacts, as supported by Kortlandt in Baltic, Slavic, Germanic (2016), among others.
The fact that there might have been a different, third language involved – the hypothetical Temematic substrate language to Balto-Slavic, potentially nearer to Baltic because of the stronger superstrate influence in Slavic – further complicates the dialectal identification of Baltic and Slavic – that is, if one supports a common Germanic and Balto-Slavic group.
I am not implying that a common group of Balto-Slavic with Indo-Aryan (or of Germanic with Balto-Slavic) is fully discarded by linguistics: history and archaeology can indeed support a close interaction between these languages, and there has been historically some support to the inclusion of Balto-Slavic within a Graeco-Aryan group. However, Linguistics and Archaeology are each day more supportive of the association of Italo-Celtic with Germanic in a North-West Indo-European group, and Balto-Slavic with them (Oettinger 1997). See for example any recent article or book by Mallory, Adams, Beekes, Adrados, etc., or if you prefer, refer to the mainstream models followed by scholars in the German, Spanish, Leiden, or American schools. As you probably know, Clackson for the British school supports an abstract “constellation analogy” model for the language reconstruction, and the French school is dominated by archaeologist Jean Paul Demoule’s rejection of a Proto-Indo-European community; both schools, as you can imagine, will have to revise their theories in light of recent genetic studies…
Even Anthony (2007), who has related the Corded Ware culture to the expansion of Indo-European languages through cultural diffusion, recognizing the expansion of Yamna migrants to the west (identifying them with Italo-Celtic and Proto-Greek speakers), has to offer two or three separate cultural diffusion events (!), whereby Pre-Germanic, Proto-Balto-Slavic, and Proto-Indo-Iranian had been learned by the influence of the Yamna culture on neighbouring (unrelated) peoples of Corded Ware cultures: the central European Single Grave culture (from Pre-Germanic Usatovo), the Middle Dnieper culture (from Balto-Slavic in the Contact Zone), and the Potapovka culture (from Poltavka), respectively. No actual spread or migration from Yamna into Corded Ware has been supported since Gimbutas.
Balto-Slavic is indeed a complex group of languages – with some supporting (since Toporov and Ivanov’s proposal in the 1960s) three dialectal groups, composed of East Baltic, West Baltic, and Slavic branches (thus implying an older split of Baltic). Because of the close interaction of eastern Europe with Eurasian invasions, the nature of their language will probably never be resolved. Genetics is not the savior that overcomes these difficulties; so far it has only brought more (albeit no doubt interesting) questions, and even though their correct interpretation might offer some new light, we will be far from obtaining a clear picture of the cultural and linguistic development of Proto-Baltic and Proto-Slavic communities.
What I am criticizing here, therefore, is this recent revisionist trend whereby PIE must have been spoken by R1a-Z645 lineages, a trend found not only among amateur geneticists. I am beginning to think – judging from online comments, posts or tweets – that this trend is becoming stronger as a reaction to the fact that not a single R1a-Z645 sample has been found in Yamna or its expansion. These new revisionist models depict a common group of R1a-Z645 lineages hidden somewhere in the steppe, sharing some sort of Indo-Germanic (??) group, or argue for a shared Late PIE community without dialectal divisions, to justify its potential find somewhere marginal to the PIE territory, and then a later development of Corded Ware into Bell Beaker cultures (and, it is implied, peoples).
While not impossible, these are unlikely models, not based on knowledge but on wishes, since linguistic data strongly suggest a North-West Indo-European dialect including Italo-Celtic, Germanic, and (at the very least in its substrate and thus western R1a lineages) Balto-Slavic, and archaeological findings don’t show any meaningful population exchange between Corded Ware and Yamna… That is, they didn’t until after the first famous papers on the so-called ‘steppe admixture’ of 2015, when (surprise!) Kristiansen jumped on the bandwagon (and Anthony seems to be beginning to suggest the same) of the previously discarded Yamna -> Corded Ware and Corded Ware -> Bell Beaker migrations.
Not a single serious researcher can deny that a hidden community of R1a-M417 in Yamna is possible. But no one should support that it is the most likely explanation for the current genetic picture, whether based on Linguistics, Archaeology, Anthropology, or Genetics (be it phylogeography or admixture analyses).
I think this recent trend must therefore be the fruit of the influence of previous, deeply entrenched concepts regarding the Corded Ware culture and its link with Proto-Indo-European. These concepts are based on Gimbutas’ Kurgan model, Anthony’s revision of it – explaining the expansion as multiple cultural diffusions (thus renewing Gimbutas’ claim) -, and early studies of modern populations’ haplogroups. Apart from those trends, especially worrying for the future of the field (if it is to be taken seriously), is the interest of some pressure groups, including especially eastern Slavic peoples of R1a lineages, and Finnic speakers of N1c lineages, who are linking some fantastic ancient ethnolinguistic community to their modern national pride.
Adapting to reality
You can find support for anything you like in anthropology: there is certainly a paper out there that apparently supports your personal view on prehistoric ethnolinguistic Europe. You only have to do a quick search in Academia.edu, and you can justify whatever new genetic results you personally obtained playing with the freely available datasets and open source software – e.g. from Reich’s lab, or the famous ADMIXTURE. If you are one of those few interested in the field who haven’t tried it out yet, Razib Khan helps introduce you to DIY Genetics, so you can show off some graphics and proportions, like most popular bloggers and forum users are doing. Then you can also publish your results in BioRxiv, just to try it out.
So there is no merit at all in justifying these genetic results by supporting a potential anthropological scenario for it. Heck, you can invent it! Here, I said it. Anyone can do Anthropology. In fact, it seems that everyone does Human Evolutionary Genetics nowadays, no matter their background. Some lab knowledge and experience in doctoral research seems to be enough.
Admixture analyses are obtained using one or more algorithms, which have a limited potential to inform of possible migrations (its ultimate objective, at least regarding its complementary function to Archaeology within Indo-European studies). Such algorithms invariably have:
Intrinsic constraints: You have to understand each algorithm’s intrinsic limitations to be able to apply them correctly, and to derive meaningful but cautious conclusions. Using software commands and obtaining graphics and percentages does not imply you understand the constraints at stake. If you have tried them out, you have seen their great limitations; if you don’t see them, you certainly don’t realize how little you understand of them.
Extrinsic constraints. Most are known, and often mentioned explicitly in research papers:
Few DNA samples, from limited sites.
Scarce and variable material recovered from these samples.
Quality of the retrieval, human errors, etc.
Lack of precise anthropological context.
Admixture results (whether by professionals or amateurs) are nevertheless often illustrated with tailored anthropological models: in the case of the renowned papers most likely because of ignorance of anthropological context, broad (philosophical or theoretic) and precise (historical), or lack of sufficient understanding of the different fields involved, and in the case of many amateur geneticists also (often) to justify a desire for a prehistoric ethnolinguistic identification similar to their social or political agendas, in a new Kossinnian trend.
Admixture analyses are not wrong per se. It is wrong to trust them to inform you of something they can’t; because they need context, and ancient samples need ancient context, which in prehistoric times is obviously quite limited. If you don’t know as much as possible about the ancient context (i.e. Linguistics, Archaeology, Anthropology), you get the wrong conclusions. Period. If you look for papers on ancient context expecting to find whichever model fits your results (or worse, your wish), that is called bias. Don’t expect to get the right conclusions doing that, either. If you find it, that’s called confirmation bias. Such results are not useful to anyone, not even to you: you just deceive yourself and maybe others.
Some apparently think that a group of geneticists can achieve a meaningful interpretation of data just by adding one or more archaeologists to the research group – or as ‘co-authors’ of individual research papers. Wrong again. Ten people with IQ 20 don’t make the reasoning of a person with IQ 200 (not that I believe in measuring intelligence, but you get the point). Similarly, twenty researchers, each one with knowledge exclusively (or almost exclusively) of their own field, can’t achieve a meaningful explanation for the data obtained. Geneticists look for an anthropological model that coherently fits their results. Archaeologists will look for a model known to them that fits the genetic results (or more likely the interpretations thereof) they are given. That way, when working together, they can achieve a common ground. If neither of them understands the complexities and shortcomings of the others’ materials and methods (and their whole background), the results will be formally correct, but still wrong. They need to know all aspects involved in the others’ fields in great detail, to understand all potential implications of new data.
Since the advent of ancient DNA samples and especially PCA analyses, phylogeography (leaning predominantly on Y chromosomes) has been relegated to a (probably deserved) second place in assessing DNA samples. However, as Razib Khan states, “in the scaffold of the ancient DNA framework it can resolve some issues”. I think this is one of those issues, an issue that is not trivial at all – in that it affects migration models from the steppe at a critical period of linguistic expansion -, and the shortcomings of not relying on it are becoming quite evident with each new publication.
Many amateur geneticists that support the mainstream genetic models of the past two years don’t like the ad hoc explanations that others have been constantly giving to support their previous theories. After all, it seems unfair that some people would reject data that offers an obvious prehistoric picture of populations, because of the unwillingness to change one’s own preconceptions, right? For example, against the mainstream steppe migration theory, we have those who support that R1b must have been western European (Palaeolithic or Mesolithic) hunter-gatherers expanded from Iberia; or those who want R1a to have expanded from India. No matter how strong the evidence is against those models, some groups harbour a desire to fit anything in one’s previous image of reality.
However, some people who can’t stand those absurd ad hoc explanations and rationalisations, are quite ready to embrace the idea that, somehow, during the Chalcolithic expansion of Yamna, an imaginary community was formed where communities of divergent lineages R1a-Z645 (found mostly north of the steppe and later in Corded Ware cultures) and R1b-M269 (found mostly in the steppe and later in the cultures known to have evolved from Yamna, like Afanasevo, Vučedol, and Bell Beaker) lived together and spoke the same language for centuries, or even millennia. And that community would have existed after a Late Neolithic westward expansion of the Khvalynsk culture, and another westward expansion of the Repin culture, both of which probably reduced the diversity of Y-DNA lineages within Yamna: the first to R1b-M269 lineages, the second to R1b-L23 subclades.
Both communities of R1a and R1b lineages, described then as united until the Yamna expansion (although no sample of R1a-Z645 subclade has been ascribed to any steppe expansion) would have expanded somehow separately, R1a-M417 exclusively to the north into Corded Ware – without any migratory connection found between Yamna and Corded Ware in mainstream Archaeology -, and forming thus dialectal groups (like “Germano-Balto-Slavic” or “Indo-Slavonic”) that are not supported by mainstream linguistics.
On the other hand, R1b-Z2103 and R1b-L51 lineages, which were already separated within Yamna and probably forming different communities, are known to have spread to the west with the Yamna expansion; in some places and cultures (like Bell Beaker) they are found together, which would be expected in a common migration of separate groups. No single R1b-L23 sample has been found in the Corded Ware expansion, no single R1a-M417 individual in the Yamna expansion.
These convoluted explanations of how R1a lineages must have spoken Indo-European are based on the assumption that admixture analyses (from the current limited data, with the current wrong interpretation of their context) necessarily mean that Corded Ware peoples spread as Yamna migrants – hence R1a lineages must come from Yamna – and then spread into Bell Beaker.
It is possible, and in my opinion expected, that eventually some R1a-M417 subclade will be found in Yamna samples (east or west), and some haplogroup R1b-M269 (especially R1b-L23 and subclades) will be found in samples from Corded Ware cultures (west or east). Indeed, there must have been close contacts between both cultures (between Yamna–Southeast Europe–East Bell Beaker and Corded Ware), and not only through female exogamy. It would be quite strange not to find a single R1b-L23 sample in Corded Ware cultures, or an R1a-M417 sample in Late Proto-Indo-European-speaking territories. Those scattered samples, whenever they are found, will probably not change the data: but they might give a reason for some to keep supporting a model that is not the most likely one. It still won’t be the most reasonable, the simplest model that explains all data.
What it means to be an ‘ethnic’ Balt or Slav
Older models – older even than Gimbutas’ kurgan model of the 1950s, as you can see -, by presupposing an instant breakup of a unitary Proto-Indo-European language into different linguistic communities without previous dialectal relation with each other, cannot explain our common European linguistic heritage. More recent models based on recent genetic studies (and on outdated or newly invented linguistic and archaeological theories), by trying to connect genetically (directly) modern eastern Europeans with Proto-Indo-Europeans, are in fact disconnecting Balto-Slavic peoples from the rest of Europe for three thousand years, and connecting them either with Uralic or with Indo-Iranian speakers. Ethnolinguistic identification, however, is not about genetics – and it has never been, and I hope it will never be -; it is related to self-identification into groups, and more broadly to a common culture, and often specifically a common language.
In terms of language, it makes sense to support a situation where Balto-Slavic was a North-West Indo-European dialect (sharing a common language ancestral to Germanic and Italo-Celtic too), with certain ancient (Uralic?) innovative traits shared with Indo-Iranian and partly with Germanic (but with no direct contacts necessary between these branches). Its recent transition to Baltic and Slavic proto-languages, already by eastern European groups, shows their strong external influence from Uralic and Iranian, respectively, so an identification of Balto-Slavic with the expansion of R1a lineages is probably to be found in a western group of R1a-Z282 subclades expanding eastwards between the Bronze Age and the Iron Age.
Eastern Europe’s Indo-European heritage (Balto-Slavic) is therefore connected to the western European one (Italo-Celtic and Germanic), each with its own linguistic substrate and influences, but with a common, shared ancient language. North-West Indo-European derived in turn from Late Indo-European, a language ancestral to Indo-Iranian and Palaeo-Balkan languages, the latter showing continued contacts with western Europe for millennia.
In the minimum-case scenario – for supporters of a Satem proto-language like Kortlandt – the language substrate to Baltic and Slavic must be a North-West Indo-European language (to fit its shared traits with North-West Indo-European), like Holzer’s Temematic (a hypothesis which Kortlandt seems to support) that would have then been recently absorbed by Satem speakers of Eastern Europe. In that context, central European R1a-Z282 lineages (which form the majority of West Slavic lineages) would have spoken that NWIE language for millennia, until proto-historic times, when a cultural diffusion of a Graeco-Aryan dialect (mainly spoken by R1a-Z93 or eastern European R1a-Z282 lineages, then) would have happened in eastern Europe, and then a cultural diffusion (or demic diffusion?) of Slavic-speaking peoples would have happened to the west, into central Europe.
In none of these scenarios is any sort of Proto-Indo-European -> Balto-Slavic ethnolinguistic, genetic, or territorial continuity to be seen. The former model is not only the simpler explanation for Slavic and Baltic, but it is also the communis opinio today by most Indo-Europeanists, it is supported by Archaeology, and Genetics is likely to keep supporting it with each new paper. I don’t find anything shameful, or that could diminish modern Baltic and Slavic identities a bit, by accepting any of those models, so I don’t understand the imperative need some people seem to have of identifying R1a lineages with the Yamna expansion and thus Proto-Indo-European.
This is just one of many highly hypothetical ancient scenarios, and it requires more assumptions than a continuity of Indo-Uralic (or even Indo-Uralic and Afroasiatic) with R1b lineages – R1a potentially marking the spread of Paleo-Siberian languages –, and above all it is based on controversial linguistic macrofamilies, not (yet?) supported by mainstream anthropological disciplines. It is nevertheless one theory certain romantics can place their hopes in, as R1b communities of the steppe become accepted as those originally speaking (Middle and Late) Proto-Indo-European in the steppe.
I am not saying I am right. There is still too much to be said and corrected. In fact, I could be wrong, and we may lack a lot of interesting data: there might have been a late R1a-R1b North-West Indo-European-speaking community within western Yamna, and we might need to revise what we knew about Archaeology yet again (and maybe even Linguistics!) before admixture algorithms; then maybe geneticists have come to save the day after all. However, all anthropological evidence points strongly (and genetic studies more strongly with each new study) to the image we had previous to the first genetic data based on haplogroups.
I think it is preposterous of some researchers (no matter if professional or amateur geneticists, or archaeologists, or even linguists) to think algorithms can beat more than two hundred years and thousands of works on this matter. In Academia, mathematics rarely revolutionizes a field; it can usually help, but it can also just make you sound scientifish, and point in the wrong direction.
And no, I am not smarter than the rest, I can only judge from what I know, and that is always too little, far less than I would like to. But maybe I am in a more neutral position regarding the end result, given my renewed skepticism about revolutionary methods to solve academic problems, and my indifference as to a western European or eastern European origin of R1b or R1a lineages. And I am not alone in my lack of confidence in the interpretation of recent genetic admixture results – read Volker Heyd’s papers, for example, if you want the view of a renowned and experienced archaeologist who was in the field of Indo-European studies earlier than any of those now popular geneticists.
In fact, I also fell for the R1a-Corded Ware expansion of Late Indo-European, and before many in the Anthropological fields, and with even less proof, back when we only had haplogroups of modern populations and the promises of Cavalli-Sforza. When I decided to publish a grammar to learn Indo-European as a modern language, the aim was to offer a mainstream reconstruction of Late Proto-Indo-European without adding my own contributions; despite this, I added the newly, archaeogenetically-supported Corded Ware migration model (see A Grammar of Modern Indo-European, Third Edition, pp. 74 and ff.).
I guess I liked the picture of an old romantic Europe, divided into western Vasconic (R1b) and eastern Uralic (N1c) hunter-gatherers (and later farmers) being invaded by warring kurgan-makers from the steppe (R1a) … And I really liked the article of Haak et al. (2015) – the first one I read on this subject – which I saw, like everyone else, as supporting what many of us already believed about a single, common expansion of North-West Indo-European into western Europe. It also made our life – regarding the linguistic unity of Balto-Slavic with the West Indo-European core – much easier…
Recent papers, when compared to what linguists and archaeologists had been saying for years – before Y-DNA haplogroups were even a thing for any of these now popular genetic labs (not to speak of internet geneticists) – leave little room for doubt right now. I embraced the results of haplogroup analysis of modern populations, which seemed to support an expansion of Proto-Indo-European R1a lineages with the Corded Ware culture, and thus dismissed Gimbutas’ and Anthony’s model (of a Yamna -> Bell Beaker expansion of Italo-Celtic). I also embraced the results of the publications on genetics of 2015 with open arms.
But I was able to change my mind when the careful observation of individual samples of these recent studies began to contradict what we thought, and I did so publicly only recently (publishing the Indo-European demic diffusion model), and more strongly after the latest papers (publishing the updated second edition), without remorse. And I will reverse that decision again if needed, and change it again and again as I feel necessary, no matter how many times. In Science, to adapt to new data does not make you a brownnoser, it makes you a scientist. Not to adapt to new data does not make you a man of firm ideals, or any chivalrous concept you might have about that, it makes you look ignorant and biased. It’s that simple.
Some of you may think that there is a third way: to keep an old, now unlikely idea you have supported in the past, but without bragging about it in the meantime until it is proven fully wrong, just in case it is demonstrated to be right in the end – because then you might claim you were right all along, like you had some magic understanding or hidden data the rest of us didn’t. I don’t think that’s the correct way to behave in a scientific environment, either. That makes you a coward. And you wouldn’t have been right all along: you would have been right, then wrong, then right again. Everybody can see that, and so can you.
Geneticists working on future publications should be planning ahead for what might happen. The overconfidence of Haak et al. (2015), Mathieson et al. (2015), and Allentoft et al. (2015), including Lazaridis et al. (2016), in supporting a Yamna -> Corded Ware migration, and a Corded Ware -> Bell Beaker migration, is understandable in a rapidly growing field that didn’t leave enough time to study complex anthropological questions. The recent errors of following that simplistic and wrong model in Mathieson et al. (2017), and Olalde et al. (2017), coupled with Kristiansen’s (2017) and Anthony’s (2017) new interpretations (to fit the conclusions of those genetic studies), can be forgiven, because of all the fuss created around the Steppe admixture concept, and the desire of journals to publish popular papers, and of researchers to go with the tide and gain some popularity along the way.
From now on, however, if the evidence keeps pointing in the same direction, a lack of attention to anthropological detail will be simple wishful ignorance, and that cannot be forgiven in any field that strives to ascertain the truth. If continued, this trend will damage the field of Human Evolutionary Biology for years – at least in the view of anthropologists, who are the real filter of this field’s conclusions – when its current results prove wrong. Genetic studies will be banished from anthropological studies, dismissed as a pseudo-science, and avoided by any scientific or academic journal with a minimum of self-respect.
To regain trust in a field that purportedly uses “more scientific methods” but is nonetheless proven that wrong for years in its essential assumption (a Yamna -> Corded Ware migration model), and especially when it is associated with the traditionally despised ‘Kossinnian trends’, will be a hard task for those involved. So many postdoc offers in so many labs being created right now will vanish, as the interest in publishing papers of this discredited field will disappear. This could also threaten the impulse recently given by archaeologists and linguists to migration models, which had been rejected for a long time, and give strength to those who deny them (e.g. in the UK and in France), or who just don’t want to see Archaeology or Linguistics get involved with such a controversial question, or even with each other.
High-impact-factor journals like Nature, Science, PNAS, and those not so famous, as well as their reviewers and readers, are doing a disservice to the endeavour of ascertaining the historical truth, if they allow this to happen without protesting. But such consequences for the field will be of their making, and not that of suspicious anthropologists, who do well in distrusting any revolutionary results published by overconfident researchers from newly developed and too broadly defined subfields.