Fernando López-Menchero and I have published our first draft on the North-West Indo-European proto-language. Our contribution concerns mainly phonetics, and namely two of its most controversial aspects: a common process of laryngeal loss and two series of velars for PIE.
There is also an updated linguistic model for the Corded Ware substrate hypothesis, which seeks to explain certain similarities between Germanic and Balto-Slavic, and between Balto-Slavic and Indo-Iranian, and potential isoglosses between the three.
As you probably know, our interest is (and has been for the past 15 years or so, even before our common project) the reconstruction of a North-West Indo-European proto-language, the ancestor of Italo-Celtic, Germanic, and Balto-Slavic. At least since Krahe’s proposal of an Alteuropäische substrate to European hydronymy, some 70 years ago, Indo-Europeanists have been supporting an Old European branch of Proto-Indo-European.
However, dialectal divisions were tentative. Since Oettinger, some 30 years ago, we have a clearer picture of a group of closely related dialects, namely Italo-Celtic, Germanic, and Balto-Slavic. Although the nature of Balto-Slavic is somewhat contested (by the few scholars who support an Indo-Slavonic group), the minimalist view holds that at least the substrate language of Baltic and Slavic, Holzer’s Temematic, was part of the North-West Indo-European group.
A North-West Indo-European (NWIE) proto-language not only solved the controversial question of Pan-European IE hydronymy (clearly of Late Indo-European nature), but also – and more elegantly – the question of the origin of the many fragmentary languages attested in Western Europe, usually attributed to a “Pre-Celtic” or “Pre-Italic” nature depending on their surrounding languages (Venetic has even been said to be related to Germanic…).
Described first mainly in terms of lexical isoglosses, the concept of a NWIE language was then gradually and strongly founded in common grammatical features, contributed to mainly by the German, North American, and Spanish schools (as you know, the British or French schools are quite divided on the nature of Proto-Indo-European itself…). Recent archaeological models pioneered by Harrison and Heyd (2007) showed how this might have happened, with Yamna migrants that evolved as the East Bell Beaker group, and their subsequent expansion into most of Europe.
This traditional model of a ‘Corded Ware -> Bell Beaker expansion of NWIE’, which we also followed until recently, never fit well with the known migration paths from Yamna (into Balkan Early Bronze Age cultures), with the geographic distribution of Old European hydronymy, or with the guesstimates for Late Indo-European and North-West Indo-European. This compelled us to support a break-up of the proto-language further back in time than warranted by models of language change, and it needed certain unlikely cultural diffusion events over huge areas (because no such migration from Yamna to northern Europe has been attested): along the steppe/forest-steppe zone first, for a diffusion from Yamna into Corded Ware cultures, and along the Danube or the Rhine later, for a diffusion of Corded Ware into Bell Beaker. These models were also based on the wrong interpretation of the first radiocarbon dates of Beakers, which placed the origin of the Bell Beaker people in Iberia (an origin that has been rejected in Archaeology, and now also in Genetics).
Such a ‘Germano-Balto-Slavic’ group faded in Linguistics long ago, with most Indo-Europeanists preferring to talk about late contacts (viz. Celto-Germanic or Italo-Germanic contacts), and for some there is – if any subgroup at all – a core West Indo-European or Italo-Celto-Germanic group, which may be supported by recent genetic research on Bell Beaker peoples, with the Beaker group of the Netherlands being the key. Our research on the potential language spoken by Corded Ware peoples – most likely related to Uralic, from an Indo-Uralic community from the Pontic-Caspian steppe – can elegantly explain the isoglosses that both European dialects share.
It was reported long ago that genetic studies were being made on remains of a surprisingly big battle that happened in the Tollense valley in north-eastern Germany, at the confluence between Nordic, Tumulus/Urnfield, and Proto-Lusatian/Lusatian territories, ca. 1200 BC.
At least 130 bodies and 5 horses have been identified from the bones found. Taking into account that this is a small percentage of the potential battlefield, around 750 bodies are expected to be buried in the riverbank, so an estimated 4,000-strong army fought there, accounting for one in five participants killed and left on the battlefield.
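The "one in five" figure follows directly from the estimates quoted above. Here is a minimal sketch of the arithmetic, using only the numbers in this paragraph; the variable names are my own, and the ratio of identified to expected bodies is just an illustration, not a figure reported by the excavators:

```python
# Figures quoted in the paragraph above; variable names are illustrative.
identified_bodies = 130   # minimum number of individuals identified so far
expected_bodies = 750     # bodies expected to be buried in the riverbank
estimated_army = 4000     # estimated number of participants

# Share of participants killed and left on the battlefield.
casualty_rate = expected_bodies / estimated_army

# Share of the expected dead already identified (illustrative ratio only).
identified_share = identified_bodies / expected_bodies

print(f"Casualty rate: {casualty_rate:.2%} (about 1 in {1 / casualty_rate:.1f})")
print(f"Identified so far: {identified_share:.1%} of the expected bodies")
```

With these inputs the casualty rate is 750 / 4000 = 18.75%, slightly under the rounded "one in five".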
Use of the body armour, shields, helmet, and corselet found may have required training and specialised groups of warriors, with their organisation being a display of military force. According to Kristiansen, this battle is therefore unlike any other known conflict of this period north of the Alps – circumscribed to raids by small groups of young men –, and may have heralded a radical change in the north, from individual farmsteads and a low population density to heavily fortified settlements.
The Urnfield culture (ca. 1300-750 BC) is associated with the rise of a new warrior elite, and the formation of new farming settlements and their urnfields. In some areas there is continuity from Tumulus to Urnfield culture, with narrowing and concentration of settlements along the river valleys, but there are also wide-ranging migrations. These migrations are similar to those seen later in the La Tène culture. This period is also coincident with the time of the mythical battle of Troy, with the collapse of the Mycenaean civilisation, and with the raids of Sea People in Egypt, and the marauders of the Hittites.
The majority of sampled individuals fall within the variation of contemporary northern central European samples (including Nordic Late Neolithic and Bronze Age and Únětice samples); however, there are also some outliers closer to Neolithic LBK and modern Basques, suggesting that central and western European cultures were still at that time closely interconnected, continuing thus the connections created during the Bell Beaker expansion a thousand years earlier. The genetic similarity of most samples to modern western Slavic populations (as well as Austrians and Scots) gives support to the origin of Balto-Slavic in Bronze Age north-central Europe, and more specifically in the Lusatian culture.
In fact, scarce aDNA from late Urnfield populations from its north-eastern territories, in Saxony – near the Lusatian culture –, already shows a mixture of lineages, which suggests genetic continuity with older cultures (or more likely a resurgence) after the Bell Beaker expansions: R1a1a1b1a-Z282 lineage was found in Halberstadt (ca. 1085 BC), and of the eight males studied from the Lichtenstein cave (ca. 1000 BC), five were of haplogroup I2a2b-L38, two of haplogroup R1a1-M459, and one of haplogroup R1b-M343.
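To make the "mixture of lineages" concrete, the Lichtenstein cave counts above can be tallied as shares. This is only a sketch over the eight males mentioned here (the list and variable names are mine), and with a sample this small the percentages should be read as illustrative rather than as population frequencies:

```python
from collections import Counter

# Y-DNA haplogroups of the eight Lichtenstein cave males, as listed above.
lichtenstein_males = ["I2a2b-L38"] * 5 + ["R1a1-M459"] * 2 + ["R1b-M343"]

counts = Counter(lichtenstein_males)
total = sum(counts.values())

# Share of each haplogroup among the sampled males.
shares = {haplogroup: n / total for haplogroup, n in counts.items()}

for haplogroup, n in counts.most_common():
    print(f"{haplogroup}: {n}/{total} ({shares[haplogroup]:.1%})")
```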
Regarding modern populations, the eastern and western peaks in R1a1a1b1a1-M458 lineages might support a west-east migration, as well as an east-west migration, and indeed both in different periods, which is expected to be found if Lusatian is linked to the initial eastward expansion of Balto-Slavic during the Bronze and Iron Ages, and later younger subclades are linked to the West Slavic expansion to the west during Antiquity.
Now, if this is so, then we have to accept that these territories of north-central Europe (between East Germany and Poland), occupied earlier by Corded Ware cultures, adopted Balto-Slavic only after the Bell Beaker expansion; therefore, models arguing for Balto-Slavic origins in east European late Corded Ware groups (or heir cultures), like Trzciniec, Chornoles, Bilozerska, or Milograd (see e.g. the article on Wikipedia) have to be rejected. We also know that Pre-Germanic could have only formed in the Nordic Late Neolithic, after the cultural unification of the Dagger Period, heralded by the arrival of Bell Beakers; and that Indo-Iranian was the language of the Sintashta-Petrovka culture, which had absorbed the previous (Yamna-related) Poltavka culture.
But, if Indo-European was only spoken at both ends of territories previously occupied by Corded Ware cultures – stretching from Scandinavia to the Urals, including the Baltic region… what language did Corded Ware peoples actually speak? The most likely one? Uralic, indeed.
After my first version, findings in Olalde et al. (2017) and Mathieson et al. (2017) supported some of my predictions. Now, after my third, their new data also supports another prediction, because the model is based on solid linguistic and archaeological models. Here is an excerpt from the Indo-European demic diffusion model, 3rd ed. (pp. 55-56):
At the end of the Trypillian culture, herding/hunting trends intensified, and the agricultural system collapsed, with people moving to the steppe zone, as confirmed by the presence of numerous graves to the south (Rassamakin 1999). At the same time, the Trypillian world absorbed a foreign tradition related to materials of settlement sites of the Dnieper steppes – such as the late Sredni Stog culture –, like cord impressions and burial rites similar to the later Corded Ware culture, marking also the transformation of decors and changes in their interpretation (Palaguta 2007).
The similarity in burial rituals between Yamna and Corded Ware made Gimbutas define a common “Kurgan people”, whose relationship has also been long supported by Kristiansen (Kristiansen 1989; Kristiansen et al. 2017). An equivalence of both burial rites has been, however, rejected (Häusler 1963, 1978, 1983), and it is generally agreed that the Yamna culture did not expand to the north of the Tisza River.
The importance of horse exploitation in Deriivka, in the forest-steppe zone of the north Pontic region along the Dnieper region, during the Middle Eneolithic period (probably ca. 3700-3530 BC), suggests that horses played a significant role in the life of this Sredni Stog community (Anthony and Brown 2003). In its late period (ca. 4000-3500 BC), this culture had adopted corded ware pottery, and stone battle-axes.
However, this [sic] western steppe peoples were mainly hunters (Rassamakin 1999), and the ‘herding skill’ essential for wild horse domestication seems absent (Kuzmina 2003). All this has been confirmed with zooarchaeological evidence and new molecular and stable isotope results, suggesting an absence of horse domestication in territories of the late Sredni Stog culture in the north Pontic steppe (Mileto et al. 2017), before the advent of migrants from the Indo-European-speaking Repin culture.
The new sample described in Mathieson et al. (2017), dated ca. 4200 BC (but within a wide range, 5000-3500 BC) is from a site classified as of late Sredni Stog (although potentially from Post-Mariupol / Kvitjana), a culture of hunters who probably did not breed domesticated horses (even after the period of conquest and dominance of Suvorovo-Novodanilovka chiefs, from Indo-Hittite-speaking early Khvalynsk, who had domesticated horses), and – more importantly – is of R1a-M417 lineage, shows high so-called “Yamna component” in ADMIXTURE, and clusters among Corded Ware samples in PCA approximately a thousand years before this culture’s expansion. Information from the supplementary material:
An Eneolithic cemetery of the Sredny Stog II culture was excavated by D. Telegin in 1955-1957 near the village of Alexandria, Kupyansk district, Kharkov region on the left bank of the river Oskol. A total of 33 individuals were recovered. Based on craniometric analysis (I.Potekhina 1999) it was suggested that the Eneolithic inhabitants of Alexandria were not homogeneous and resulted from admixture of local Neolithic hunter-gatherers and early farmers, possibly Trypillian groups. We report genetic data from one individual: I6561
Another individual from Eneolithic Ukraine (of R1b1 xM269 lineage) clusters quite closely with Neolithic samples from the Baltic, which points to the strong connection between both – southern and northern – regions of east-central Europe before the period of great Chalcolithic expansions, and the potential origin of the spread of R1b (xM269) lineages with the Corded Ware culture.
It will be fun to see the mess that certain researchers have made (and will still make in the near future) of their findings coupled with the concept of “Yamna component”, when trying to describe the “proxy ancestral populations” of European Copper Age and Bronze Age cultures… Difficult times ahead for many, after the collapse of the simplistic Yamna -> Corded Ware -> Bell Beaker genetic model laid out since Haak et al. (2015) and Allentoft et al. (2015).
[EDIT 27 September 2017] Not directly related, but here is today’s interesting discussion on Twitter surrounding the ancestral populations of the “Yamnaya component”, for illustration of the discussions to come when this ancestry is divided into different, more precise, older (Neolithic) steppe components, and these in turn shown to contribute to different European and Asian Chalcolithic and Bronze Age cultures:
Rough attempt to understand genetic history of Europe and W. Eurasia. It's tough to get your head around. Comments welcome. pic.twitter.com/lutXFKmluk
Given the variance found in the three samples from Eneolithic Ukraine (comparable to the variance found in east Bell Beaker samples), we may now be getting closer to the precise territory and culture where the Corded Ware culture might have formed, which cannot be much further from the Dnieper-Dniester region before the Yamna expansion to the west ca. 3300 BC, judging from the elevated steppe component.
It seems, because of the proximity of both cultures and the similar dates of their migrations, that the westward expansion of the Yamna culture may have indeed provided an important push (among some strong ‘pull’ forces) for the expansion of Corded Ware peoples.
So Genetics reinforces the solidest models of Archaeology and Linguistics? Professional academics being mostly right in their careful research, and amateur geneticists playing with software being wrong? Who would have thought… More and more papers help thus shut up naysayers who state (again and again) that new algorithms are here to revolutionise these academic fields.
The expansion of peoples is known to be associated with the spread of a certain admixture component + the expansion and reduction in variability of a haplogroup (i.e. few male lineages are usually more successful during the expansion): Neolithic farmers from the Middle East expanding with haplogroup G2a; Natufian component (Levant hunter-gatherers or later, Neolithic farmers) and haplogroup E southward into Africa; CHG component expansion with haplogroup J; WHG expansion into east Europe with haplogroup R1b; etc.
There were (at least) two main expansion processes involving Proto-Indo-European: one causing the branching off of the language ancestral to Anatolian, and another during the spread of Late Indo-European dialects. Based on this, and on known archaeological models, I have predicted since the first version of the demic diffusion model:
Based on haplogroups found until then in Yamna (R1b-M269), Corded Ware (R1a-M417, especially Z645), and Bell Beaker (R1b-L151):
that mainly R1b-L23 (especially L51) lineages and more steppe admixture would be found in east Bell Beaker – confirmed some two months after my publication by Olalde et al. (2017);
and that mainly R1a-M417 (especially Z645) subclades will be found in Corded Ware samples.
Based on the finding of “Yamna component” in the Corded Ware culture: that this admixture must have come from somewhere else. I pointed to eastern Europe, including the forest and forest-steppe zone, especially in the natural continuum of the Dniester-Dnieper region. Especially after Mathieson et al. (2017), in my second and third versions of the model, I have more specifically suggested a southern origin in the region, nearer to where the CHG ancestry must have come from (the Caucasus and cultures formed in contact with it), according to mainstream archaeological data, i.e. cultures of the North Pontic steppe / forest-steppe. But of course, until more samples are available, more CHG ancestry in other cultures of the Forest Zone cannot be ruled out.
For the vast majority of academics, more samples (regionally proportioned) are needed only from early Corded Ware, as we have from Bell Beaker: if they are (as expected) mostly R1a-M417, then everything is clear, and it will finally mean the end for the tiring, now almost ‘traditional’ association R1a – Proto-Indo-European. Some more samples from the potential homeland of the third Corded Ware horizon, most likely Ukraine (Podolia and Volynia regions), nearer to the time of the Corded Ware expansion, would also be great, to locate the actual ancestral population of Corded Ware migrants – recognisable by the main presence of haplogroup R1a-Z645 (formed ca. 3500 BC), and elevated “Yamna component” before the arrival of the Yamna culture…
If, however, early Corded Ware samples of R1b-L23 subclades are found in certain quantity, especially old samples from east-central Europe (excluding Yamna migrants along the Prut), the tricky question of Late Indo-European cultural diffusion will remain: Did Corded Ware peoples adopt a Late Indo-European language from clans of R1b-L23 lineages? That is what Kristiansen and Anthony have been betting on, a cultural diffusion, caused by:
A long-lasting contact, according to Kristiansen (1989,…,2017). He argues that Sredni Stog adopted the language – but obviously not the same culture – from the east, and that it is a genetic and cultural mix from Globular Amphora, Trypillia, and steppe cultures. This has been Kristiansen’s model for almost 30 years, and it follows Marija Gimbutas’ outdated theory of the “Kurgan people”.
A rapid change according to Anthony (2007). He associates the adoption of Pre-Germanic with the domination of Yamna chiefs over Usatovo people, and the adoption of Balto-Slavic by the people from (Corded Ware) Middle Dnieper group because of the technical superiority of neighbouring Yamna herders.
Linguistics, with the growing support of a North-West Indo-European group, points clearly to a European expansion of a community speaking the ancestral language of Italo-Celtic, Germanic, and probably Balto-Slavic. Archaeology, too, showed migration from Yamna only to south-eastern Europe (correcting Gimbutas’ Kurgan model) and later with east Bell Beaker mainly into central, western, and northern Europe.
Even Kristiansen admits that only after the arrival of Bell Beaker in Scandinavia was a linguistic community (i.e. Germanic) formed – although he places the center of gravity in Úněticean influence, and (yet again) a cultural diffusion event into the Danish Dagger period.
Because of more and more data contrasting with old theories, some have elected to develop weak, indemonstrable links, to keep supporting e.g. Gimbutas’ concept of “Kurgan people” in Archaeology, and a sudden, early expansion of all PIE dialects at once in Linguistics. It seems that, after so much fuss about the (misleading) ‘Yamna component’ concept – and so many far-fetched assumptions by amateur geneticists –, the Corded Ware connection will once again hinge on weak, indemonstrable cultural diffusion theories, be it ‘Kurgan peoples’ (including now, of course, Eneolithic cultures of Ukraine) or any culture from eastern Europe that will reveal some close samples to Corded Ware migrants, in terms of PCA, ADMIXTURE, or haplogroup.
So once we find mainly R1a-Z645 in more Corded Ware samples (and this haplogroup and more “Yamna component” in non-Yamna cultures of Eneolithic Ukraine, and potentially Poland or Belarus) we all may finally expect a peaceful acceptance of reality, at least in Genetics? Nope. No siree. Nein. Not then, not ever.
Why? Because some people want their paternal lineage to have lived in their historical region, and spoken their historical language, since time immemorial. It won’t matter if Archaeology, Linguistics, Genetics, etc. don’t support their claims: if they need to use some aspects of admixture, or haplogroups (or a combination of them) from carefully selected samples instead of looking at the whole picture; if they have to support that Indo-Europeans came from a culture different than Yamna, in- or outside of the steppe or forest-steppe, be it the Balkans, Anatolia, Armenia, or the Moon; if their proto-language should then come directly from Indo-Hittite, or from a Germano-Slavonic, or Indo-Slavonic, or Indo-Germanic group, or whatever invented dialectal branch necessary to fit their model, or if they have to support the ‘constellation analogy’ of Clackson, or thousands of years of development for each branch; etc. They will support whatever is necessary.
And this adaptation, obviously, has no end. It’s stupid, I know. But that’s how we are, how we think. We have seen that these sad trends continue no matter what, for decades, and not only regarding Indo-European. Some common examples include:
Indo-Aryan-speaking Indians defending an autochthonous origin of R1a and Indo-European; as well as the ‘opposite’ autochthonous continuity theory of Dravidian-speaking Indians (based on ASI ancestry, haplogroup R2, mtDNA haplogroup M, or whatever is at hand).
Western Europeans defending an autochthonous origin of the R1b haplogroup, with a Palaeolithic or Mesolithic origin, including the language, viz. the recent Indo-European from the Atlantic façade theories (in the Celtic from the West series, by Koch and Cunliffe); the now fading Palaeolithic Continuity Theory; and many other forgotten Eurocentric proposals; as well as the more recent informal hints of a central European/Balkan homeland based on the Villabruna cluster and south-eastern Mesolithic finds, which is at risk of being related to a Balkan origin of Proto-Indo-European…
There is also the ‘opposite’ theory of the autochthonous origin of the Basques, including Proto-Iberians and potentially other peoples like Paleo-Sardinians, based on the previously popular Vasconic-Uralic hypothesis (and an ancient Europe divided into R1b and N1c1 haplogroups), which is still widely believed in certain regions.
Nordic speakers supporting the autochthonous nature of Germanic and haplogroup I1 to Scandinavia.
Armenian speakers delighted to see a proposal of Indo-European homeland in the Armenian highlands, be it supported by glottalic consonants, CHG ancestry, R1b (xM269) or J lineages…
Greek speakers now willing to support continuity of haplogroup J as a ‘native’ Greek lineage, of people speaking Proto-Greek (and in earlier times PIE), because of two Minoan, and one Mycenaean samples found in Lazaridis et al. (2017).
Even Turks linking Yamna with the expansion of Turkic languages. That one is fun to read, almost like a parody of the rest – replacing “Indo-European” with “Turkic”.
For years, a lot of people – me included (at least since 2005) – believed, because of modern maps of R1a distribution, that R1a and Corded Ware are the vector of Indo-European languages. For those of us who don’t have any personal or national tie with this haplogroup, this notion has been easy to change with new data. For others, it obviously isn’t, and it won’t be.
For all these people, a sample, result, or conclusion from any paper, just dubiously in favour, means everything, but a thousand against mean nothing, or can be reinterpreted to support their fantasies.
The Kossinian “autochthonous continuity” crap permeates this relatively new subfield of Human Evolutionary Genetics, as it permeated Indo-European studies (first Linguistics, then Archaeology) in its infancy. It seems to be a generalised human trend, no doubt related to some absurd inferiority complex, mixed with historical romanticism, a certain degree of chauvinism, and (falling in the eternal Godwin’s Law of our field) some outdated, childish notion of ‘supremacy’ linked with the expansion of one’s own language and people.
Such simplistic and popular models are also lucrative, judging by the boom in demand for DNA analysis, which companies embellish with modern fortune tellers (or fortune tellers themselves sell for a price), promising to ascertain your ‘ancestry proportions’ using automated algorithms, so that you don’t have to get lost in complex genetic data and prehistoric accounts, which can’t help you define your “ethnicity”…
Some just don’t want to realize that the spread of prehistoric languages (like Late Indo-European dialects) was a complex, non-uniform, stepped process, devoid of modern romantic concepts, which in genetic terms necessarily included later founder effects and cultural diffusions, so that no one can trace their haplogroup, lineage, family, region, or country to any single culture, language, or ethnic group. The same, by the way, can be said of peoples and countries in historic times.
As I said before, we shall expect supporters of the Kurgan model (and thus the expansion of R1a-Z645 with Yamna) to wait for just one sample of R1a-M417 in Yamna and/or Bell Beaker (which will eventually be found), and just one sample of R1b-M269 in Corded Ware (which will also eventually be found), to blow the horn of victory in this naïve competition against time, general knowledge, and (essentially) themselves.
A sad consequence of how we are is that, because of the obvious influence of these stupid modern ethnolinguistic agendas, because we are not all rowing in the same direction, genetic results and conclusions are still perceived as far-fetched and labile, and thus most archaeologists and linguists prefer not to include genetic results in their investigation. And those who dare to do so, are badly counselled by those who go with the tide, so that their papers become almost instantly outdated.
I feel there has recently been an increase in references to quite old – and generally outdated – terms, such as Germano-Balto-Slavic and “Indo-Slavonic” (i.e. Satem), described as Late Indo-European dialects. This is happening in forums and blogs that deal with “Indo-European genetics”, and only marginally (if at all) with the main anthropological subjects that form Indo-European studies, that is Linguistics and Archaeology.
Firstly, let me go apparently against the very aim of this post, by supporting the common traits that these dialects actually share.
Satem Indo-European or Indo-Slavonic
Balto-Slavic is a complex dialect, whose known proto-history and history already offer a difficult picture. Contrary to the opinion of many, there is no single document that can identify the terms Antes, Sklavenes, and Venedi with the cultures that are usually identified as speaking languages ancestral to East Slavic, South Slavic, and West Slavic. These names were used interchangeably in the Byzantine Empire, which was obviously not involved in classifying Slavic peoples by their linguistic branches… For more on the historical identification of Slavic tribes, read Florin Curta’s The Making of the Slavs: History and Archaeology of the Lower Danube Region, c. 500-700 A.D. On the identification of potential candidates for early Slavic and Baltic cultures, you can read the appropriate entries in the Encyclopedia of Indo-European Culture, by Mallory & Adams.
The recorded history of Baltic and Slavic tribes seems too recent to allow us to trace back their cultural predecessors with confidence. In its recent history, close to the formation of its community, Proto-Slavic must have had intense contacts with Iranian-speaking peoples. Also, previously, if R1a-M417 subclades are in fact the most common lineages expanded with the Corded Ware culture (as it seems now), they no doubt shared a common language, most likely a non-Indo-European one. Not Indo-European in the strict sense, at least, since it most likely formed part of the Indo-Uralic continuum that must have been spoken during the Mesolithic in Eastern Europe, and was probably a language nearer to Uralic than to classic Indo-European.
A strong connection between Balto-Slavic and Indo-Iranian in a common Satem branch, as supported by Kortlandt (see e.g. Balto-Slavic and Indo-Iranian 2016, or a reconstruction of Schleicher’s Fable in PIE branches), would imply that a Corded Ware culture from the Dnieper-Donets – speaking a Graeco-Aryan dialect – interacted for centuries with Uralic and other Graeco-Aryan languages, only later influenced by North-West Indo-European (as late as its contact with East Germanic during the Barbarian migration). This model cannot justify the shared traits of Balto-Slavic with North-West Indo-European, unless a third, substrate language – like Holzer’s (1989) Temematic proposal – is added to the equation. Such models are not impossible, but seem too complex.
On the other hand, linguistically Balto-Slavic seems to have split into its known branches quite early, and traits such as the satemization trend appear to have affected each main dialect (Baltic and Slavic) differently, as attested in the different ruki development, hence the assumption of an early but different influence of the trend on both Indo-Iranian and Balto-Slavic (or, more exactly, Indo-Iranian, Baltic, and Slavic). Also, the common North-West Indo-European vocabulary, as well as morphological trends shared by NW IE dialects, clearly affects the oldest layer of both languages (hence the parent Proto-Balto-Slavic too), which thus predates the satemization trend, and further contributes to the idea of a common root between West Indo-European (or Italo-Celtic), Northern Indo-European (the language ancestral to Pre-Germanic), and Proto-Balto-Slavic.
Germano-Balto-Slavic or North European
A common group between Germanic and Balto-Slavic is justified by the presence of certain common isoglosses, such as the famous shared oblique cases in *-m- instead of *-bh-, and support for such a group is found recently e.g. in Gamkrelidze-Ivanov (1993-1994) – who nevertheless support a North-West Indo-European continuum –, or in Jasanoff, for whom both languages (regarding phonological traits) “began their post-IE history together”.
On the other hand, such shared traits could have derived either from old contacts – supported traditionally because of their proximity –, or from a common substrate to both without a need for direct contacts, as supported by Kortlandt in Baltic, Slavic, Germanic (2016), among others.
The fact that there might have been a different, third language involved – the hypothetical Temematic substrate language to Balto-Slavic, potentially nearer to Baltic because of the stronger superstrate influence in Slavic – further complicates the dialectal identification of Baltic and Slavic – that is, if one supports a common Germanic and Balto-Slavic group.
I am not implying that a common group of Balto-Slavic with Indo-Aryan (or of Germanic with Balto-Slavic) is fully ruled out by Linguistics: history and archaeology can indeed support a close interaction between these languages, and there has been historically some support for the inclusion of Balto-Slavic within a Graeco-Aryan group. However, Linguistics and Archaeology are each day more supportive of the association of Italo-Celtic with Germanic in a North-West Indo-European group, and Balto-Slavic with them (Oettinger 1997). See for example any recent article or book by Mallory, Adams, Beekes, Adrados, etc., or if you prefer, refer to the mainstream models followed by scholars in the German, Spanish, Leiden, or American schools. As you probably know, Clackson for the British school supports an abstract “constellation analogy” model for the language reconstruction, and the French school is dominated by archaeologist Jean-Paul Demoule’s rejection of a Proto-Indo-European community; both schools, as you can imagine, will have to revise their theories in light of recent genetic studies…
Even Anthony (2007), who has related the Corded Ware culture to the expansion of Indo-European languages through cultural diffusion, while recognizing the expansion of Yamna migrants to the west (identifying them with Italo-Celtic and Proto-Greek speakers), has to offer two or three separate cultural diffusion events (!), whereby Pre-Germanic, Proto-Balto-Slavic, and Proto-Indo-Iranian would have been learned through the influence of the Yamna culture on neighbouring (unrelated) peoples of Corded Ware cultures: the Central European Single Grave culture (from Pre-Germanic Usatovo), the Middle Dnieper culture (from Balto-Slavic in the Contact Zone), and the Potapovka culture (from Poltavka), respectively. No actual spread or migration from Yamna into Corded Ware has been supported since Gimbutas.
Balto-Slavic is indeed a complex group of languages, with some supporting (since Toporov and Ivanov’s proposal in the 1960s) three dialectal groups, composed of East Baltic, West Baltic, and Slavic branches (thus implying an older split of Baltic). Because of the close interaction of eastern Europe with Eurasian invasions, the nature of their language will probably never be settled. Genetics is not the savior that overcomes these difficulties; so far it has only brought more (albeit no doubt interesting) questions, and even though their correct interpretation might shed some new light, we will be far from obtaining a clear picture of the cultural and linguistic development of Proto-Baltic and Proto-Slavic communities.
What I am criticizing here, therefore, is this recent revisionist trend whereby PIE must have been spoken by R1a-Z645 lineages, a trend found not only among amateur geneticists. I am beginning to think – judging from online comments, posts or tweets – that this trend is becoming stronger as a reaction to the fact that not a single R1a-Z645 sample has been found in Yamna or its expansion. These new revisionist models depict a common group of R1a-Z645 lineages hidden somewhere in the steppe, sharing some sort of Indo-Germanic (??) group, or argue for a shared Late PIE community without dialectal divisions, to justify its potential find somewhere marginal to the PIE territory, and then a later development of Corded Ware into Bell Beaker cultures (and, it is implied, peoples).
While not impossible, these are unlikely models, not based on knowledge but on wishes, since linguistic data strongly suggest a North-West Indo-European dialect including Italo-Celtic, Germanic, and (at the very least in its substrate and thus western R1a lineages) Balto-Slavic, and archaeological findings don’t show any meaningful population exchange between Corded Ware and Yamna… That is, they didn’t until after the first famous papers on the so-called ‘steppe admixture’ of 2015, when (surprise!) Kristiansen jumped on the bandwagon (and Anthony seems to be beginning to do the same) of the previously discarded Yamna -> Corded Ware and Corded Ware -> Bell Beaker migrations.
No serious researcher can deny that a hidden community of R1a-M417 in Yamna is possible. But no one should maintain that it is the most likely explanation for the current genetic picture, whether based on Linguistics, Archaeology, Anthropology, or Genetics (be it phylogeography or admixture analyses).
I think this recent trend must therefore be the fruit of the influence of previous, deeply entrenched concepts regarding the Corded Ware culture and its link with Proto-Indo-European. These concepts are based on Gimbutas’ Kurgan model, Anthony’s revision of it – explaining the expansion as multiple cultural diffusions (thus renewing Gimbutas’ claim) -, and early studies of modern populations’ haplogroups. Apart from those trends, especially worrying for the future of the field (if it is to be taken seriously), is the interest of some pressure groups, including especially eastern Slavic peoples of R1a lineages, and Finnic speakers of N1c lineages, who are linking some fantastic ancient ethnolinguistic community to their modern national pride.
Adapting to reality
You can find support for anything you like in anthropology: there is certainly a paper out there that apparently supports your personal view on prehistoric ethnolinguistic Europe. You only have to do a quick search in Academia.edu, and you can justify whatever new genetic results you personally obtained playing with the freely available datasets and open source software – e.g. from Reich’s lab, or the famous ADMIXTURE. If you are one of those few interested in the field who haven’t tried it out yet, Razib Khan helps introduce you to DIY Genetics, so you can show off some graphics and proportions, like most popular bloggers and forum users are doing. Then you can also publish your results in BioRxiv, just to try it out.
So there is no merit at all in justifying these genetic results by supporting a potential anthropological scenario for it. Heck, you can invent it! Here, I said it. Anyone can do Anthropology. In fact, it seems that everyone does Human Evolutionary Genetics nowadays, no matter their background. Some lab knowledge and experience in doctoral research seems to be enough.
Admixture analyses are obtained using one or more algorithms, which have a limited potential to inform of possible migrations (their ultimate objective, at least regarding their complementary function to Archaeology within Indo-European studies). Such algorithms invariably have:
Intrinsic constraints: You have to understand each algorithm’s intrinsic limitations to be able to apply them correctly, and to derive meaningful but cautious conclusions. Using software commands and obtaining graphics and percentages does not imply you understand the constraints at stake. If you have tried them out, you have seen their great limitations; if you don’t see them, you should realize how little you understand of them.
Extrinsic constraints. Most are known, and often mentioned explicitly in research papers:
Few DNA samples, from limited sites.
Scarce and variable material recovered from these samples.
Quality of the retrieval, human errors, etc.
Lack of precise anthropological context.
Admixture results (whether by professionals or amateurs) are nevertheless often illustrated with tailored anthropological models. In the case of the renowned papers, this is most likely because of ignorance of the anthropological context, both broad (philosophical or theoretical) and precise (historical), or a lack of sufficient understanding of the different fields involved. In the case of many amateur geneticists, it is also (often) done to justify a desire for a prehistoric ethnolinguistic identification that suits their social or political agendas, in a new Kossinnian trend.
Admixture analyses are not wrong per se. It is wrong to trust them to inform you of something they can’t, because they need context, and ancient samples need ancient context, which in prehistoric times is obviously quite limited. If you don’t know as much as possible about the ancient context (i.e. Linguistics, Archaeology, Anthropology), you get the wrong conclusions. Period. If you look for papers on ancient context expecting to find whichever model fits your results (or worse, your wish), that is called bias. Don’t expect to get the right conclusions doing that, either. If you find it, that’s called confirmation bias. Such results are not useful for anyone, not even for you: you just deceive yourself and maybe others.
Some apparently think that a group of geneticists can achieve a meaningful interpretation of data just by adding one or more archaeologists to the research group – or as ‘co-authors’ of individual research papers. Wrong again. Ten people with IQ 20 don’t make the reasoning of a person with IQ 200 (not that I believe in measuring intelligence, but you get the point). Similarly, twenty researchers, each one with knowledge exclusively (or almost exclusively) of their own field, can’t achieve a meaningful explanation for the data obtained. Geneticists look for an anthropological model that coherently fits their results. Archaeologists will look for a model known to them that fits the genetic results (or more likely the interpretations thereof) they are given. That way, when working together, they can achieve a common ground. If neither of them understands the complexities and shortcomings of the others’ materials and methods (and their whole background), the results will be formally correct, but still wrong. They need to know all aspects involved in the others’ fields in great detail, to understand all potential implications of new data.
Since the advent of ancient DNA samples and especially PCA analyses, phylogeography (leaning predominantly on Y chromosomes) has been relegated to a (probably deserved) second place in assessing DNA samples. However, as Razib Khan states, “in the scaffold of the ancient DNA framework it can resolve some issues”. I think this is one of those issues, an issue that is not trivial at all – in that it affects migration models from the steppe at a critical period of linguistic expansion -, and the shortcomings of not relying on it are becoming quite evident with each new publication.
Many amateur geneticists who support the mainstream genetic models of the past two years don’t like the ad hoc explanations that others have been constantly giving to support their previous theories. After all, it seems unfair that some people would reject data that offers an obvious prehistoric picture of populations, because of their unwillingness to change their own preconceptions, right? For example, against the mainstream steppe migration theory, we have those who maintain that R1b must have belonged to western European (Palaeolithic or Mesolithic) hunter-gatherers who expanded from Iberia; or those who want R1a to have expanded from India. No matter how strong the evidence is against those models, some groups harbour a desire to fit anything into their previous image of reality.
However, some people who can’t stand those absurd ad hoc explanations and rationalisations, are quite ready to embrace the idea that, somehow, during the Chalcolithic expansion of Yamna, an imaginary community was formed where communities of divergent lineages R1a-Z645 (found mostly north of the steppe and later in Corded Ware cultures) and R1b-M269 (found mostly in the steppe and later in the cultures known to have evolved from Yamna, like Afanasevo, Vučedol, and Bell Beaker) lived together and spoke the same language for centuries, or even millennia. And that community would have existed after a Late Neolithic westward expansion of the Khvalynsk culture, and another westward expansion of the Repin culture, both of which probably reduced the diversity of Y-DNA lineages within Yamna: the first to R1b-M269 lineages, the second to R1b-L23 subclades.
Both communities of R1a and R1b lineages, described then as united until the Yamna expansion (although no sample of R1a-Z645 subclade has been ascribed to any steppe expansion) would have expanded somehow separately, R1a-M417 exclusively to the north into Corded Ware – without any migratory connection found between Yamna and Corded Ware in mainstream Archaeology -, and forming thus dialectal groups (like “Germano-Balto-Slavic” or “Indo-Slavonic”) that are not supported by mainstream linguistics.
On the other hand, R1b-Z2103 and R1b-L51 lineages, which were already separated within Yamna and probably forming different communities, are known to have spread to the west with the Yamna expansion. In some places and cultures (like Bell Beaker) they are found together, which would be expected in a common migration of separate groups. Not a single R1b-L23 sample has been found in the Corded Ware expansion, and not a single R1a-M417 individual in the Yamna expansion.
These convoluted explanations of how R1a lineages must have spoken Indo-European are based on the assumption that admixture analyses (from the current limited data, with the current wrong interpretation of their context) necessarily mean that Corded Ware peoples spread as Yamna migrants – hence R1a lineages must come from Yamna – and then spread into Bell Beaker.
It is possible, and in my opinion expected, that eventually some R1a-M417 subclade will be found in Yamna samples (east or west), and some haplogroup R1b-M269 (especially R1b-L23 and subclades) will be found in samples from Corded Ware cultures (west or east). Indeed, there must have been close contacts between both cultures (between Yamna–Southeast Europe–East Bell Beaker and Corded Ware), and not only through female exogamy. It would be quite strange not to find a single R1b-L23 sample in Corded Ware cultures, or an R1a-M417 sample in Late Proto-Indo-European-speaking territories. Those scattered samples, whenever they are found, will probably not change the overall picture, but they might give some a reason to keep supporting a model that is not the most likely one. It still won’t be the most reasonable, simplest model that explains all data.
What it means to be an ‘ethnic’ Balt or Slav
Older models – older even than Gimbutas’ kurgan model of the 1950s, as you can see -, by presupposing an instant breakup of a unitary Proto-Indo-European language into different linguistic communities without previous dialectal relation with each other, cannot explain our common European linguistic heritage. More recent models based on recent genetic studies (and on outdated or newly invented linguistic and archaeological theories), by trying to connect genetically (directly) modern eastern Europeans with Proto-Indo-Europeans, are in fact disconnecting Balto-Slavic peoples from the rest of Europe for three thousand years, and connecting them either with Uralic or with Indo-Iranian speakers. Ethnolinguistic identification, however, is not about genetics – and it has never been, and I hope it will never be -; it is related to self-identification into groups, and more broadly to a common culture, and often specifically a common language.
In terms of language, it makes sense to support a situation where Balto-Slavic was a North-West Indo-European dialect (sharing a common language ancestral to Germanic and Italo-Celtic too), with certain ancient (Uralic?) innovative traits shared with Indo-Iranian and partly with Germanic (but with no direct contacts necessary between these branches). Its recent transition into Baltic and Slavic proto-languages, already among eastern European groups, shows strong external influence from Uralic and Iranian, respectively, so an identification of Balto-Slavic with the expansion of R1a lineages is probably to be found in a western group of R1a-Z282 subclades expanding eastwards between the Bronze Age and the Iron Age.
Eastern Europe’s Indo-European heritage (Balto-Slavic) is therefore connected to the western European one (Italo-Celtic and Germanic), each with its own linguistic substrate and influences, but with a common, shared ancient language. North-West Indo-European derived in turn from Late Indo-European, a language ancestral to Indo-Iranian and Palaeo-Balkan languages, the latter showing continued contacts with western Europe for millennia.
In the minimum-case scenario (for supporters of a Satem proto-language like Kortlandt), the language substrate to Baltic and Slavic must be a North-West Indo-European language (to fit its shared traits with North-West Indo-European), like Holzer’s Temematic (a hypothesis which Kortlandt seems to support), which would have then been recently absorbed by Satem speakers of Eastern Europe. In that context, central European R1a-Z282 lineages (which form the majority of West Slavic lineages) would have spoken that NWIE language for millennia, until proto-historic times, when a cultural diffusion of a Graeco-Aryan dialect (mainly spoken by R1a-Z93 or eastern European R1a-Z282 lineages, then) would have happened in eastern Europe, and then a cultural diffusion (or demic diffusion?) of Slavic-speaking peoples would have happened to the west, into central Europe.
In none of these scenarios is any sort of Proto-Indo-European -> Balto-Slavic ethnolinguistic, genetic, or territorial continuity to be seen. The former model is not only the simpler explanation for Slavic and Baltic, but also the communis opinio among most Indo-Europeanists today; it is supported by Archaeology, and Genetics is likely to keep supporting it with each new paper. I don’t find anything shameful in accepting any of those models, or anything that could diminish modern Baltic and Slavic identities a bit, so I don’t understand the imperative need some people seem to have to identify R1a lineages with the Yamna expansion and thus Proto-Indo-European.
This is just one of many highly hypothetical ancient scenarios, and it requires more assumptions than a continuity of Indo-Uralic (or even Indo-Uralic and Afroasiatic) with R1b lineages (R1a potentially marking the spread of Paleo-Siberian languages), and above all it is based on controversial linguistic macrofamilies, not (yet?) supported by mainstream anthropological disciplines. It is nevertheless one theory certain romantics can place their hopes in, as R1b communities of the steppe become accepted as those originally speaking (Middle and Late) Proto-Indo-European in the steppe.
I am not saying I am right. There is still too much to be said and corrected. In fact, I could be wrong, and we may lack a lot of interesting data: there might have been a late R1a-R1b North-West Indo-European-speaking community within western Yamna, and we might need to revise what we knew about Archaeology yet again (and maybe even Linguistics!) before admixture algorithms; then maybe geneticists have come to save the day after all. However, all anthropological evidence points strongly (and genetic studies more strongly with each new study) to the image we had previous to the first genetic data based on haplogroups.
I think it is preposterous of some researchers (no matter if professional or amateur geneticists, or archaeologists, or even linguists) to think algorithms can beat more than two hundred years and thousands of works on this matter. In Academia, mathematics rarely revolutionizes a field; it can usually help, but it can also just make you sound scientifish, and point you in the wrong direction.
And no, I am not smarter than the rest; I can only judge from what I know, and that is always too little, far less than I would like. But maybe I am in a more neutral position regarding the end result, given my renewed skepticism of revolutionary methods to solve academic problems, and my indifference as to a western European or eastern European origin of R1b or R1a lineages. And I am not alone in my lack of confidence in the interpretation of recent genetic admixture results: read Volker Heyd’s papers, for example, if you want the view of a renowned and experienced archaeologist who was in the field of Indo-European studies earlier than any of those now popular geneticists.
In fact, I also fell for the R1a-Corded Ware expansion of Late Indo-European, and before many in the Anthropological fields, and with even less proof, back when we only had haplogroups of modern populations and the promises of Cavalli-Sforza. When I decided to publish a grammar to learn Indo-European as a modern language, the aim was to offer a mainstream reconstruction of Late Proto-Indo-European without adding my own contributions; despite this, I added the newly, archaeogenetically-supported Corded Ware migration model (see A Grammar of Modern Indo-European, Third Edition, pp. 74 and ff.).
I guess I liked the picture of an old romantic Europe, divided in western Vasconic (R1b) and eastern Uralic (N1c) hunter-gatherers (and later farmers) being invaded by warring kurgan-makers from the steppe (R1a) … And I really liked the article of Haak et al. (2015) – the first one I read on this subject -, which I saw, like everyone else, as supporting what many of us already believed about a single, common expansion of North-West Indo-European into western Europe. It also made our life – regarding the linguistic unity of Balto-Slavic with the West Indo-European core – much easier…
Recent papers, when compared to what linguists and archaeologists had been saying for years (before Y-DNA haplogroups were even a thing for any of these now popular genetic labs, not to speak of internet geneticists), leave little room for doubt right now. I embraced the results of haplogroup analysis of modern populations, which seemed to support an expansion of Proto-Indo-European R1a-lineages with the Corded Ware culture, and thus dismissed Gimbutas’ and Anthony’s model (of a Yamna -> Bell Beaker expansion of Italo-Celtic). I also embraced the results of the publications on genetics of 2015 with open arms.
But, I was able to change my mind when the careful observation of individual samples of these recent studies began to contradict what we thought, and I did so publicly only recently (publishing the Indo-European demic diffusion model), and more strongly after the latest papers (publishing the updated second edition), without remorse. And I will reverse that decision again if needed, and change it again and again as I feel necessary, no matter how many times. In Science, to adapt to new data does not make you a brownnoser, it makes you a scientist. Not to adapt to new data does not make you a man of firm ideals, or any chivalrous concept you might have about that, it makes you look ignorant and biased. It’s that simple.
Some of you may think that there is a third way: to keep an old, now unlikely idea you have supported in the past, but not bragging about it in the meantime until it is proven fully wrong, just in case it is demonstrated to be right in the end – because then you might claim you were right all along, like you had some magic understanding or hidden data the rest of us didn’t. I don’t think that’s the correct way to behave in a scientific environment, either. That makes you a coward. And you wouldn’t have been right all along: you would have been right, then wrong, then right again. Everybody can see that, and so do you.
Geneticists working on future publications should be planning ahead for what might happen. The overconfidence of Haak et al. (2015), Mathieson et al. (2015), and Allentoft et al. (2015), including Lazaridis et al. (2016), in supporting a Yamna -> Corded Ware migration and a Corded Ware -> Bell Beaker migration is understandable in a rapidly growing field that didn’t leave enough time to study complex anthropological questions. The recent errors of following that simplistic and wrong model in Mathieson et al. (2017) and Olalde et al. (2017), coupled with Kristiansen’s (2017) and Anthony’s (2017) new interpretations (to fit the conclusions of those genetic studies), can be forgiven, because of all the fuss created around the Steppe admixture concept, and the desire of journals to publish popular papers, and of researchers to go with the tide and gain some popularity along the way.
From now on, however, if the evidence keeps pointing in the same direction, a lack of attention to anthropological detail will be simple wishful ignorance, and that cannot be forgiven in any field that strives to ascertain the truth. If continued, this trend will damage the field of Human Evolutionary Biology for years (at least in the view of anthropologists, who are the real filter of this field’s conclusions) when its current results prove wrong. Genetic studies will be banished from anthropological studies, dismissed as a pseudo-science, and avoided by any scientific or academic journal worthy of a minimum of self-respect.
To regain trust in a field that purportedly uses “more scientific methods” but is nonetheless proven so wrong for years in its essential assumption (a Yamna -> Corded Ware migration model), and especially when it is associated with the traditionally despised ‘Kossinnian trends’, will be a hard task for those involved. So many postdoc offers in so many labs being created right now will vanish, as the interest in publishing papers of this discredited field will disappear. This could also threaten the recently renewed impulse given by archaeologists and linguists to migration models, which had been rejected for a long time, and encourage those who deny them (e.g. in the UK and in France), or who just don’t want to see Archaeology or Linguistics get involved with such a controversial question, or even with each other.
High-impact factor journals like Nature, Science, PNAS, and those not so famous, as well as their reviewers and readers, are doing a disservice to the endeavour of ascertaining the historical truth, if they allow this to happen without protesting. But such consequences for the field will be their making, and not that of suspicious anthropologists, who do well in distrusting any revolutionary results published by overconfident researchers from newly developed and too broadly defined subfields.
I recently wrote about how Wiik’s model was wrong in supporting a Mesolithic European Vasconic-Uralic harmony – genetically based on the modern distribution of R1b vs. N1c haplogroups -, and thus also the disruption of this harmony by Indo-Europeans (supposedly a population of R1a-lineages invading central Europe from a Balkan homeland).
Romanticism does this quite frequently: it makes us believe in some esoteric fantasy, like the ethnic continuity of our ancestors in the region we live (and a far, far greater original territory that has been unfairly diminished by invaders), providing us with strong links to support our artificial borders and their potential expansion.
Even though my article on the demic diffusion of Indo-European languages comments only briefly on the origins (and potential language) of N1c-lineages and of Proto-Basque and Proto-Uralic languages, I have already received some angry emails from Basque and Finnish genetic amateurs. I don’t get the point of fantasizing about one’s own ethnicity and prehistoric territory, and then going through the five stages of grief when one is confronted with different (usually sounder) theories, time and time again. It seems like a lot of time lost by generations in wholly stupid quests and self-negotiation.
However wrong Wiik’s basic theses are, though, if you have read my paper you have seen that Corded Ware groups spread from north-western Ukraine might have spoken Uralic languages. Therefore, it is reasonable to assume that Pre-Germanic, Pre-Balto-Slavic and Pre-Indo-Iranian might have been adopted by peoples who spoke Uralic languages, probably related with each other, possibly belonging to early Finno-Ugric dialects. In that sense, Wiik’s work has a renewed linguistic interest, regarding the potential substrate words he investigated.
This is not a picture that certain Basque, Finnish, Russian, or Indian romantics would have hoped (or even hope today) for, in terms of ethnic, linguistic, and territorial identification, but that is not a real problem, anyway, just another building of imaginary origins that will fall as many others before them. In the same sense, Germanic ethnogenesis has become more complicated than what some would have wanted, with at least three main paternal lineages with completely different ethnolinguistic origins developing together since ca. 2500 BC to form a more homogeneous community only during the Bronze Age. Therefore, no homogeneous exclusive ethnic ‘original’ European regional community can be fantastically invented anymore.
This seems to me a real coup de grâce to genetic-based nationalism in Europe, and it is encouraging for the European Union that Germany, as the central European country, is not only a central territory, but also a central cultural and genetic bridge between west and east Europe, in terms of history, of North-West Indo-European languages, and paternal lineages and admixture analyses.
Wiik, Kalevi 1999: “Pohjois-Euroopan indoeurooppalaisten kielten suomalais-ugrilainen substraatti”. Pohjan poluilla. Suomalaisten juuret nykytutkimuksen mukaan. Toim. Paul Fogelberg. Bidrag till kännedom av Finlands natur och folk, 153. Helsinki 1999.
Wiik, Kalevi 2002: Eurooppalaisten juuret. Atena, Jyväskylä 2002.