Germanic–Balto-Slavic and Satem (‘Indo-Slavonic’) dialect revisionism by amateur geneticists, or why R1a lineages *must* have spoken Proto-Indo-European


I feel there has recently been an increase in references to quite old – and generally outdated – terms, such as Germano-Balto-Slavic and “Indo-Slavonic” (i.e. Satem), described as Late Indo-European dialects. This is happening in forums and blogs that deal with “Indo-European genetics”, and only marginally (if at all) with the main anthropological subjects that form Indo-European studies, that is Linguistics and Archaeology.

First, let me seemingly go against the very aim of this post by acknowledging the common traits that these dialects actually share.

Satem Indo-European or Indo-Slavonic

Balto-Slavic is a complex dialect, whose known proto-history and history already offer a difficult picture. Contrary to the opinion of many, there is no single document that can identify the terms Antes, Sklavenes, and Venedi with the cultures that are usually identified as speaking languages ancestral to East Slavic, South Slavic, and West Slavic. These names were used interchangeably in the Byzantine Empire, which was obviously not involved in classifying Slavic peoples by their linguistic branches… For more on the historical identification of Slavic tribes, read Florin Curta’s The Making of the Slavs: History and Archaeology of the Lower Danube Region, c. 500-700 A.D. On the identification of potential candidates for early Slavic and Baltic cultures, you can read the appropriate entries in the Encyclopedia of Indo-European Culture, by Mallory & Adams.

The recorded history of Baltic and Slavic tribes seems too recent for their cultural predecessors to be traced back confidently. In its recent history, close to the formation of its community, Proto-Slavic must have had intense contacts with Iranian-speaking peoples. Also, previously, if R1a-M417 subclades are in fact the most common lineages expanded with the Corded Ware culture (as it seems now), they no doubt shared a common language, most likely a non-Indo-European one. Not Indo-European in the strict sense, at least, since it most likely formed part of the Indo-Uralic continuum that must have been spoken during the Mesolithic in Eastern Europe, and was probably a language nearer to Uralic than to classic Indo-European.

A strong connection between Balto-Slavic and Indo-Iranian in a common Satem branch, as supported by Kortlandt (see e.g. Balto-Slavic and Indo-Iranian 2016, or a reconstruction of Schleicher’s Fable in PIE branches), would imply that a Corded Ware culture from the Dnieper-Donets – speaking a Graeco-Aryan dialect – interacted for centuries with Uralic and other Graeco-Aryan languages, only later influenced by North-West Indo-European (as late as its contact with East Germanic during the Barbarian migration). This model cannot justify the shared traits of Balto-Slavic with North-West Indo-European, unless a third, substrate language – like Holzer’s (1989) Temematic proposal – is added to the equation. Such models are not impossible, but seem too complex.

On the other hand, linguistically Balto-Slavic seems to have split into its known branches quite early, and traits such as the satemization trend appear to have affected each main dialect (Baltic and Slavic) differently, as attested in the different ruki development; hence the assumption that the trend influenced Indo-Iranian and Balto-Slavic (or, more exactly, Indo-Iranian, Baltic, and Slavic) early but differently. Also, the common North-West Indo-European vocabulary, as well as morphological trends shared by NW IE dialects, clearly affects the oldest layer of both languages (hence the parent Proto-Balto-Slavic too), which thus predates the satemization trend, and further contributes to the idea of a common root between West Indo-European (or Italo-Celtic), Northern Indo-European (the language ancestral to Pre-Germanic), and Proto-Balto-Slavic.

Germano-Balto-Slavic or North European

A common group between Germanic and Balto-Slavic is justified by the presence of certain common isoglosses, such as the famous shared oblique cases in *-m- instead of *-bh-, and support for such a group is found recently e.g. in Gamkrelidze-Ivanov (1993-1994) – who nevertheless support a North-West Indo-European continuum –, or in Jasanoff, for whom both languages (regarding phonological traits) “began their post-IE history together”.

Proto-Indo-European language tree, including early (Indo-Hittite) and late (European) stages by Trager & Smith (1950). From the paper On the Origin of North Indo-European, Gimbutas (1952)

On the other hand, such shared traits could have derived either from old contacts – supported traditionally because of their proximity –, or from a common substrate to both without a need for direct contacts, as supported by Kortlandt in Baltic, Slavic, Germanic (2016), among others.

The fact that there might have been a different, third language involved – the hypothetical Temematic substrate language to Balto-Slavic, potentially nearer to Baltic because of the stronger superstrate influence in Slavic – further complicates the dialectal identification of Baltic and Slavic – that is, if one supports a common Germanic and Balto-Slavic group.

This Northern group was supported by Gimbutas (1952) based on a previous linguistic paper by Trager & Smith (1950) – published in the infancy of dialectal PIE differentiation –, but this model is not mainstream nowadays. The linguistic model followed by Anthony (based on Ringe, Warnow, and Taylor 2002) did not link Germanic to Balto-Slavic (or to Italo-Celtic, for that matter) more than to any other dialect, but it seems that the three might have spread from the same western (Repin-derived) Yamna region, according to the maps by Anthony and Ringe. See for example their recent article The Indo-European Homeland from Linguistic and Archaeological Perspectives, which sums up and partly corrects Anthony’s detailed account of steppe migrations in the already classic book The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (2007). Dialectal groups and implications are unclear from their publications, with changing linguistic schemes since 2007, but with a quite stable archaeological framework.

Dialectal Late Indo-European

I am not implying that a common group of Balto-Slavic with Indo-Aryan (or of Germanic with Balto-Slavic) is fully discarded by linguistics: history and archaeology can indeed support a close interaction between these languages, and there has been historically some support for the inclusion of Balto-Slavic within a Graeco-Aryan group. However, Linguistics and Archaeology are increasingly supportive of the association of Italo-Celtic with Germanic in a North-West Indo-European group, and Balto-Slavic with them (Oettinger 1997). See for example any recent article or book by Mallory, Adams, Beekes, Adrados, etc., or if you prefer, refer to the mainstream models followed by scholars in the German, Spanish, Leiden, or American schools. As you probably know, Clackson for the British school supports an abstract “constellation analogy” model for the language reconstruction, and the French school is dominated by archaeologist Jean-Paul Demoule’s rejection of a Proto-Indo-European community; both schools, as you can imagine, will have to revise their theories in light of recent genetic studies…

Regarding Archaeology, North-West Indo-European must have expanded with the westward Yamna expansion, i.e. associated with the Bell Beaker expansion. That was supported in mainstream Archaeology before the most recent genetic studies.

Even Anthony (2007), who has related the Corded Ware culture to the expansion of Indo-European languages through cultural diffusion, while recognizing the expansion of Yamna migrants to the west (identifying them with Italo-Celtic and Proto-Greek speakers), has to offer two or three separate cultural diffusion events (!), whereby Pre-Germanic, Proto-Balto-Slavic, and Proto-Indo-Iranian would have been learned through the influence of the Yamna culture on neighbouring (unrelated) peoples of Corded Ware cultures: the Central European Single Grave culture (from Pre-Germanic Usatovo), the Middle Dnieper culture (from Balto-Slavic in the Contact Zone), and the Potapovka culture (from Poltavka), respectively. No actual spread or migration from Yamna into Corded Ware has been supported since Gimbutas.

Balto-Slavic is indeed a complex group of languages – with some supporting (since Toporov and Ivanov’s proposal in the 1960s) three dialectal groups, composed of East Baltic, West Baltic, and Slavic branches (thus implying an older split of Baltic). Because of the close interaction of eastern Europe with Eurasian invasions, the nature of their language will probably never be settled. Genetics is not the savior that overcomes these difficulties; so far it has only brought more (albeit no doubt interesting) questions, and even though their correct interpretation might shed some new light, we will be far from obtaining a clear picture of the cultural and linguistic development of Proto-Baltic and Proto-Slavic communities.

What I am criticizing here, therefore, is this recent revisionist trend whereby PIE must have been spoken by R1a-Z645 lineages, a trend found not only among amateur geneticists. I am beginning to think – judging from online comments, posts or tweets – that this trend is becoming stronger as a reaction to the fact that not a single R1a-Z645 sample has been found in Yamna or its expansion. These new revisionist models depict a common group of R1a-Z645 lineages hidden somewhere in the steppe, sharing some sort of Indo-Germanic (??) group, or argue for a shared Late PIE community without dialectal divisions, to justify its potential find somewhere marginal to the PIE territory, and then a later development of Corded Ware into Bell Beaker cultures (and, it is implied, peoples).

While not impossible, these are unlikely models, not based on knowledge but on wishes, since linguistic data strongly suggest a North-West Indo-European dialect including Italo-Celtic, Germanic, and (at the very least in its substrate and thus western R1a lineages) Balto-Slavic, and archaeological findings don’t show any meaningful population exchange between Corded Ware and Yamna… That is, they didn’t until after the first famous papers on the so-called ‘steppe admixture’ of 2015; since then (surprise!) Kristiansen has already jumped on the bandwagon (and Anthony seems to be beginning to suggest the same) of the previously discarded Yamna -> Corded Ware and Corded Ware -> Bell Beaker migrations.

Not a single serious researcher can deny that a hidden community of R1a-M417 in Yamna is possible. But no one should claim that it is the most likely explanation for the current genetic picture, whether based on Linguistics, Archaeology, Anthropology, or Genetics (be it phylogeography or admixture analyses).

I think this recent trend must therefore be the fruit of the influence of previous, deeply entrenched concepts regarding the Corded Ware culture and its link with Proto-Indo-European. These concepts are based on Gimbutas’ Kurgan model, Anthony’s revision of it – explaining the expansion as multiple cultural diffusions (thus renewing Gimbutas’ claim) –, and early studies of modern populations’ haplogroups. Apart from those trends, especially worrying for the future of the field (if it is to be taken seriously), is the interest of some pressure groups, including especially eastern Slavic peoples of R1a lineages, and Finnic speakers of N1c lineages, who are linking some fantastic ancient ethnolinguistic community to their modern national pride.

“European dialect” expansion of Proto-Indo-European according to The Indo-Europeans: Archeological Problems, Gimbutas (1963). Observe the similarities of the western European expansion to the recently proposed expansion of R1b lineages with western Yamna and Bell Beaker.

Adapting to reality

You can find support for anything you like in anthropology: there is certainly a paper out there that apparently supports your personal view on prehistoric ethnolinguistic Europe. You only have to do a quick search, and you can justify whatever new genetic results you personally obtained playing with the freely available datasets and open source software – e.g. from Reich’s lab, or the famous ADMIXTURE. If you are one of those few interested in the field who haven’t tried it out yet, Razib Khan helps introduce you to DIY Genetics, so you can show off some graphics and proportions, like most popular bloggers and forum users are doing. Then you can also publish your results in bioRxiv, just to try it out.

So there is no merit at all in justifying these genetic results by supporting a potential anthropological scenario for it. Heck, you can invent it! Here, I said it. Anyone can do Anthropology. In fact, it seems that everyone does Human Evolutionary Genetics nowadays, no matter their background. Some lab knowledge and experience in doctoral research seems to be enough.

Admixture analyses are obtained using one or more algorithms, which have a limited potential to inform of possible migrations (their ultimate objective, at least regarding their complementary function to Archaeology within Indo-European studies). Such algorithms invariably have:

  • Intrinsic constraints. You have to understand each algorithm’s intrinsic limitations to be able to apply it correctly, and to derive meaningful but cautious conclusions. Using software commands and obtaining graphics and percentages does not imply you understand the constraints at stake. If you have tried them out, you have seen their great limitations; if you don’t see them, you probably don’t realize how little you understand of them.
  • Extrinsic constraints. Most are known, and often mentioned explicitly in research papers:
    • Few DNA samples, from limited sites.
    • Scarce and variable material recovered from these samples.
    • Quality of the retrieval, human errors, etc.
    • Lack of precise anthropological context.
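
The first extrinsic constraint can be made concrete with a minimal sketch. It assumes (unrealistically for ancient DNA, where individuals often come from the same site or family) that samples are independent random draws from the population. Under that assumption, finding a lineage in none of n samples only bounds its true frequency from above; it does not prove absence. The function name and the sample counts are hypothetical and chosen for illustration only; they are not taken from any published dataset.

```python
def zero_hit_upper_bound(n_samples: int, confidence: float = 0.95) -> float:
    """Upper bound on a lineage's true frequency when it appears in
    0 of n_samples, assuming independent random sampling (an assumption)."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # With true frequency p, the chance of seeing zero carriers is (1 - p)^n.
    # The bound is the p at which that chance drops to (1 - confidence).
    return 1.0 - (1.0 - confidence) ** (1.0 / n_samples)


if __name__ == "__main__":
    # Hypothetical sample counts, for illustration only.
    for n in (5, 10, 30, 100):
        bound = zero_hit_upper_bound(n)
        print(f"0 carriers in {n:>3} samples: frequency below {bound:.1%} (95% bound)")
```

With few samples the bound stays wide, and clustered sampling from a handful of sites makes the real uncertainty larger than this idealized figure suggests.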

Admixture results (whether by professionals or amateurs) are nevertheless often illustrated with tailored anthropological models. In the case of the renowned papers, this is most likely because of ignorance of the anthropological context, both broad (philosophical or theoretical) and precise (historical), or a lack of sufficient understanding of the different fields involved. In the case of many amateur geneticists, it is also (often) done to justify a desire for a prehistoric ethnolinguistic identification that matches their social or political agendas, in a new Kossinnian trend.

Admixture analyses are not wrong per se. It is wrong to trust them to inform you of something they can’t, because they need context, and ancient samples need ancient context, which in prehistoric times is obviously quite limited. If you don’t know as much as possible about the ancient context (i.e. Linguistics, Archaeology, Anthropology), you get the wrong conclusions. Period. If you look for papers on ancient context expecting to find whichever model fits your results (or worse, your wish), that is called bias. Don’t expect to get the right conclusions doing that, either. If you find it, that’s called confirmation bias. Such results are not useful to anyone, not even to you: you just deceive yourself and maybe others.

Some apparently think that a group of geneticists can achieve a meaningful interpretation of data just by adding one or more archaeologists to the research group – or as ‘co-authors’ of individual research papers. Wrong again. Ten people with IQ 20 don’t make the reasoning of a person with IQ 200 (not that I believe in measuring intelligence, but you get the point). Similarly, twenty researchers, each one with knowledge exclusively (or almost exclusively) of their own field, can’t achieve a meaningful explanation for the data obtained. Geneticists look for an anthropological model that coherently fits their results. Archaeologists will look for a model known to them that fits the genetic results (or more likely the interpretations thereof) they are given. That way, when working together, they can achieve a common ground. If neither of them understands the complexities and shortcomings of the others’ materials and methods (and their whole background), the results will be formally correct, but still wrong. They need to know all aspects involved in the others’ fields in great detail, to understand all potential implications of new data.

Since the advent of ancient DNA samples and especially PCA, phylogeography (leaning predominantly on Y chromosomes) has been relegated to a (probably deserved) second place in assessing DNA samples. However, as Razib Khan states, “in the scaffold of the ancient DNA framework it can resolve some issues”. I think this is one of those issues, an issue that is not trivial at all – in that it affects migration models from the steppe at a critical period of linguistic expansion –, and the shortcomings of not relying on it are becoming quite evident with each new publication.

Many amateur geneticists who support the mainstream genetic models of the past two years don’t like the ad hoc explanations that others have been constantly giving to support their previous theories. After all, it seems unfair that some people would reject data that offers an obvious prehistoric picture of populations, because of the unwillingness to change one’s own preconceptions, right? For example, against the mainstream steppe migration theory, we have those who hold that R1b must have belonged to western European (Palaeolithic or Mesolithic) hunter-gatherers who expanded from Iberia; or those who want R1a to have expanded from India. No matter how strong the evidence is against those models, some groups harbour a desire to fit anything into their previous image of reality.

However, some people who can’t stand those absurd ad hoc explanations and rationalisations, are quite ready to embrace the idea that, somehow, during the Chalcolithic expansion of Yamna, an imaginary community was formed where communities of divergent lineages R1a-Z645 (found mostly north of the steppe and later in Corded Ware cultures) and R1b-M269 (found mostly in the steppe and later in the cultures known to have evolved from Yamna, like Afanasevo, Vučedol, and Bell Beaker) lived together and spoke the same language for centuries, or even millennia. And that community would have existed after a Late Neolithic westward expansion of the Khvalynsk culture, and another westward expansion of the Repin culture, both of which probably reduced the diversity of Y-DNA lineages within Yamna: the first to R1b-M269 lineages, the second to R1b-L23 subclades.

Both communities of R1a and R1b lineages, described then as united until the Yamna expansion (although no sample of R1a-Z645 subclade has been ascribed to any steppe expansion) would have expanded somehow separately, R1a-M417 exclusively to the north into Corded Ware – without any migratory connection found between Yamna and Corded Ware in mainstream Archaeology –, and forming thus dialectal groups (like “Germano-Balto-Slavic” or “Indo-Slavonic”) that are not supported by mainstream linguistics.

On the other hand, R1b-Z2103 and R1b-L51 lineages, which were already separated within Yamna and probably forming different communities, are known to have spread to the west with the Yamna expansion; in some places and cultures they are found together (like Bell Beaker), which would be expected in a common migration of separate groups. Not a single R1b-L23 sample has been found in the Corded Ware expansion, and not a single R1a-M417 individual in the Yamna expansion.

These convoluted explanations of how R1a lineages must have spoken Indo-European are based on the assumption that admixture analyses (from the current limited data, with the current wrong interpretation of their context) necessarily mean that Corded Ware peoples spread as Yamna migrants – hence R1a lineages must come from Yamna – and then spread into Bell Beaker.

It is possible, and in my opinion expected, that eventually some R1a-M417 subclade will be found in Yamna samples (east or west), and some haplogroup R1b-M269 (especially R1b-L23 and subclades) will be found in samples from Corded Ware cultures (west or east). Indeed, there must have been close contacts between both cultures (between Yamna–Southeast Europe–East Bell Beaker and Corded Ware), and not only through female exogamy. It would be quite strange not to find a single R1b-L23 sample in Corded Ware cultures, or an R1a-M417 sample in Late Proto-Indo-European-speaking territories. Those scattered samples, whenever they are found, will probably not change the overall picture, but they might give some a reason to keep supporting a model that is not the most likely one. It still won’t be the most reasonable, the simplest model that explains all data.

What it means to be an ‘ethnic’ Balt or Slav

Older models – older even than Gimbutas’ kurgan model of the 1950s, as you can see –, by presupposing an instant breakup of a unitary Proto-Indo-European language into different linguistic communities without previous dialectal relation with each other, cannot explain our common European linguistic heritage. More recent models based on recent genetic studies (and on outdated or newly invented linguistic and archaeological theories), by trying to connect genetically (directly) modern eastern Europeans with Proto-Indo-Europeans, are in fact disconnecting Balto-Slavic peoples from the rest of Europe for three thousand years, and connecting them either with Uralic or with Indo-Iranian speakers. Ethnolinguistic identification, however, is not about genetics – and it has never been, and I hope it will never be –; it is related to self-identification into groups, and more broadly to a common culture, and often specifically a common language.

In terms of language, it makes sense to support a situation where Balto-Slavic was a North-West Indo-European dialect (sharing a common language ancestral to Germanic and Italo-Celtic too), with certain ancient (Uralic?) innovative traits shared with Indo-Iranian and partly with Germanic (but with no direct contacts necessary between these branches). Its recent transition to Baltic and Slavic proto-languages, already by eastern European groups, shows their strong external influence from Uralic and Iranian, respectively, so an identification of Balto-Slavic with the expansion of R1a lineages is probably to be found in a western group of R1a-Z282 subclades expanding eastwards between the Bronze Age and the Iron Age.

Eastern Europe’s Indo-European heritage (Balto-Slavic) is therefore connected to the western European one (Italo-Celtic and Germanic), each with its own linguistic substrate and influences, but with a common, shared ancient language. North-West Indo-European derived in turn from Late Indo-European, a language ancestral to Indo-Iranian and Palaeo-Balkan languages, the latter showing continued contacts with western Europe for millennia.

In the minimum-case scenario – for supporters of a Satem proto-language like Kortlandt – the language substrate to Baltic and Slavic must be a North-West Indo-European language (to fit its shared traits with North-West Indo-European), like Holzer’s Temematic (a hypothesis which Kortlandt seems to support), which would then have been recently absorbed by Satem speakers of Eastern Europe. In that context, central European R1a-Z282 lineages (which form the majority of West Slavic lineages) would have spoken that NWIE language for millennia, until proto-historic times, when a cultural diffusion of a Graeco-Aryan dialect (mainly spoken by R1a-Z93 or eastern European R1a-Z282 lineages, then) would have happened in eastern Europe, and then a cultural diffusion (or demic diffusion?) of Slavic-speaking peoples would have happened to the west, into central Europe.

In none of these scenarios is any sort of Proto-Indo-European -> Balto-Slavic ethnolinguistic, genetic, or territorial continuity to be seen. The former model is not only the simpler explanation for Slavic and Baltic, but it is also the communis opinio among most Indo-Europeanists today, it is supported by Archaeology, and Genetics is likely to keep supporting it with each new paper. I don’t find anything shameful, or anything that could diminish modern Baltic and Slavic identities a bit, in accepting any of those models, so I don’t understand the imperative need some people seem to have to identify R1a lineages with the Yamna expansion and thus Proto-Indo-European.

For those who will still be vying for a more prominent role of haplogroup R1a in the Proto-Indo-European ethnogenesis, there are alternative older scenarios to the arrival of Proto-Indo-European to the steppe, as there are older models for an Anatolian origin of PIE. So, for example, I already laid out the possibility that the invasion of R1a-M417 brought Indo-European (or, more precisely, Indo-Uralic) to eastern Europe – as part of a Uralo-Yukaghir or Paleo-Siberian group –, while R1b communities may have originally been speakers of Afroasiatic (cf. R1b-V88 and the potential Afroasiatic homeland in Lake Megachad), and R2 would have been associated with the spread of Dravidian (and maybe Kartvelian and Altaic), all of them departing from a common Nostratic associated with haplogroup R. This model could find support in genetics in the link found between Mesolithic Northeast Europeans and Neolithic Siberians, from an Ancient North Eurasian (ANE) population probably rich in Y-chromosome haplogroup R1a.

This is just one of many highly hypothetical ancient scenarios, and it requires more assumptions than a continuity of Indo-Uralic (or even Indo-Uralic and Afroasiatic) with R1b lineages – R1a potentially marking the spread of Paleo-Siberian languages –, and above all it is based on controversial linguistic macrofamilies, not (yet?) supported by mainstream anthropological disciplines. It is nevertheless one theory certain romantics can place their hopes in, as R1b communities of the steppe become accepted as those originally speaking (Middle and Late) Proto-Indo-European in the steppe.


I am not saying I am right. There is still too much to be said and corrected. In fact, I could be wrong, and we may lack a lot of interesting data: there might have been a late R1a-R1b North-West Indo-European-speaking community within western Yamna, and we might need to revise what we knew about Archaeology yet again (and maybe even Linguistics!) before admixture algorithms; then maybe geneticists have come to save the day after all. However, all anthropological evidence points strongly (and genetic studies more strongly with each new study) to the image we had previous to the first genetic data based on haplogroups.

I think it is preposterous of some researchers (no matter if professional or amateur geneticists, or archaeologists, or even linguists) to think algorithms can beat more than two hundred years and thousands of works on this matter. In Academia, mathematics rarely revolutionizes a field; it can usually help, but it can also just make you sound scientifish, and point in the wrong direction.

And no, I am not smarter than the rest; I can only judge from what I know, and that is always too little, far less than I would like. But maybe I am in a more neutral position regarding the end result, given my renewed skepticism about revolutionary methods to solve academic problems, and my indifference as to a western European or eastern European origin of R1b or R1a lineages. And I am not alone in my lack of confidence in the interpretation of recent genetic admixture results – read Volker Heyd’s papers, for example, if you want the view of a renowned and experienced archaeologist who was in the field of Indo-European studies earlier than any of those now popular geneticists.

In fact, I also fell for the R1a-Corded Ware expansion of Late Indo-European, and before many in the Anthropological fields, and with even less proof, back when we only had haplogroups of modern populations and the promises of Cavalli-Sforza. When I decided to publish a grammar to learn Indo-European as a modern language, the aim was to offer a mainstream reconstruction of Late Proto-Indo-European without adding my own contributions; despite this, I added the newly, archaeogenetically-supported Corded Ware migration model (see A Grammar of Modern Indo-European, Third Edition, pp. 74 and ff.).

I guess I liked the picture of an old romantic Europe, divided in western Vasconic (R1b) and eastern Uralic (N1c) hunter-gatherers (and later farmers) being invaded by warring kurgan-makers from the steppe (R1a) … And I really liked the article of Haak et al. (2015) – the first one I read on this subject –, which I saw, like everyone else, as supporting what many of us already believed about a single, common expansion of North-West Indo-European into western Europe. It also made our life – regarding the linguistic unity of Balto-Slavic with the West Indo-European core – much easier…

Chalcolithic expansion from Yamna into Corded Ware (as early North-West Indo-European dialects), as interpreted in the second edition of A Grammar of Modern Indo-European.

Recent papers, when compared to what linguists and archaeologists had been saying for years – before Y-DNA haplogroups were even a thing for any of these now popular genetic labs (not to speak of internet geneticists) –, leave little room for doubt right now. I embraced the results of haplogroup analysis of modern populations, which seemed to support an expansion of Proto-Indo-European R1a-lineages with the Corded Ware culture, and thus dismissed Gimbutas’ and Anthony’s model (of a Yamna -> Bell Beaker expansion of Italo-Celtic). I also embraced the results of the publications on genetics of 2015 with open arms.

But I was able to change my mind when the careful observation of individual samples of these recent studies began to contradict what we thought, and I did so publicly only recently (publishing the Indo-European demic diffusion model), and more strongly after the latest papers (publishing the updated second edition), without remorse. And I will reverse that decision again if needed, and change it again and again as I feel necessary, no matter how many times. In Science, to adapt to new data does not make you a brownnoser, it makes you a scientist. Not to adapt to new data does not make you a man of firm ideals, or any chivalrous concept you might have about that, it makes you look ignorant and biased. It’s that simple.

Some of you may think that there is a third way: to keep an old, now unlikely idea you have supported in the past, but without bragging about it in the meantime until it is proven fully wrong, just in case it is demonstrated to be right in the end – because then you might claim you were right all along, as if you had some magic understanding or hidden data the rest of us didn’t. I don’t think that’s the correct way to behave in a scientific environment, either. That makes you a coward. And you wouldn’t have been right all along: you would have been right, then wrong, then right again. Everybody can see that, and so can you.

Geneticists working on future publications should be planning ahead for what might happen. The overconfidence of Haak et al. (2015), Mathieson et al. (2015), and Allentoft et al. (2015), including Lazaridis et al. (2016), in supporting a Yamna -> Corded Ware migration and a Corded Ware -> Bell Beaker migration is understandable in a rapidly growing field that didn’t leave enough time to study complex anthropological questions. The recent errors of following that simplistic and wrong model in Mathieson et al. (2017), and Olalde et al. (2017), coupled with Kristiansen’s (2017) and Anthony’s (2017) new interpretations (to fit the conclusions of those genetic studies), can be forgiven, because of all the fuss created around the Steppe admixture concept, and the desire of journals to publish popular papers, and of researchers to go with the tide and gain some popularity along the way.

From now on, however, if the evidence keeps pointing in the same direction, a lack of attention to anthropological detail will be simple wilful ignorance, and that cannot be forgiven in any field that strives to ascertain the truth. If continued, this trend will damage the field of Human Evolutionary Biology for years – at least in the view of anthropologists, who are the real filter of this field’s conclusions – when its current results prove wrong. Genetic studies will be banished from anthropological studies, dismissed as a pseudo-science, and avoided by any scientific or academic journal with a minimum of self-respect.

To regain trust in a field that purportedly uses “more scientific methods” but has nonetheless been proven wrong for years in its essential assumption (a Yamna -> Corded Ware migration model), especially when it is associated with the traditionally despised ‘Kossinnian trends’, will be a hard task for those involved. Many of the postdoc positions now being created in so many labs will vanish, as interest in publishing papers from this discredited field disappears. This could also threaten the recently renewed interest of archaeologists and linguists in migration models, which had been rejected for a long time, giving impetus to those who deny them (e.g. in the UK and in France), or who just don’t want to see Archaeology or Linguistics get involved with such a controversial question, or even with each other.

High-impact-factor journals like Nature, Science, PNAS, and those not so famous, as well as their reviewers and readers, are doing a disservice to the endeavour of ascertaining the historical truth, if they allow this to happen without protesting. But such consequences for the field will be of their own making, and not that of suspicious anthropologists, who do well in distrusting any revolutionary results published by overconfident researchers from newly developed and too broadly defined subfields.


(Note: featured image is licensed CC-by-sa 4.0 from crates at Wikipedia)

Wiik’s theory about the spread of Uralic into east and central Europe, and the Uralic substrate in Germanic and Balto-Slavic


I recently wrote about how Wiik’s model was wrong in supporting a Mesolithic European Vasconic-Uralic harmony – genetically based on the modern distribution of R1b vs. N1c haplogroups –, and thus also the disruption of this harmony by Indo-Europeans (supposedly a population of R1a-lineages invading central Europe from a Balkan homeland).

Romanticism does this quite frequently: it makes us believe in some esoteric fantasy, like the ethnic continuity of our ancestors in the region we live in (and a far, far greater original territory that has been unfairly diminished by invaders), providing us with strong links to support our artificial borders and their potential expansion.

Even though my article on the demic diffusion of Indo-European languages comments only briefly on the origins (and potential language) of N1c-lineages and of Proto-Basque and Proto-Uralic languages, I have already received some angry emails from Basque and Finnish genetic amateurs. I don’t get the point of fantasizing about one’s own ethnicity and prehistoric territory, and then going through the five stages of grief when one is confronted with different (usually sounder) theories, time and time again. It seems like a lot of time lost by generations in wholly stupid quests and self-negotiation.

However wrong Wiik’s basic theses are, though, if you have read my paper you have seen that Corded Ware groups spreading from north-western Ukraine might have spoken Uralic languages. Therefore, it is reasonable to assume that Pre-Germanic, Pre-Balto-Slavic and Pre-Indo-Iranian might have been adopted by peoples who spoke Uralic languages, probably related to each other, possibly belonging to early Finno-Ugric dialects. In that sense, Wiik’s work has a renewed linguistic interest, regarding the potential substrate words he investigated.

This is not a picture that certain Basque, Finnish, Russian, or Indian romantics would have hoped (or even hope today) for, in terms of ethnic, linguistic, and territorial identification, but that is not a real problem, anyway, just another building of imaginary origins that will fall as many others before them. In the same sense, Germanic ethnogenesis has become more complicated than what some would have wanted, with at least three main paternal lineages with completely different ethnolinguistic origins developing together since ca. 2500 BC to form a more homogeneous community only during the Bronze Age. Therefore, no homogeneous exclusive ethnic ‘original’ European regional community can be fantastically invented anymore.

This seems to me a real coup de grâce to genetic-based nationalism in Europe, and it is encouraging for the European Union that Germany, as the central European country, is not only a central territory, but also a central cultural and genetic bridge between west and east Europe, in terms of history, of North-West Indo-European languages, and paternal lineages and admixture analyses.


EDIT: You can read interesting recent posts on the genetics of Finnic peoples on Razib Khan’s blog: The origin of the Finnic peoples, and The Finnic Peoples Emerged In Baltic After The Bronze Age, the latter discussing the results of a recent paper by Saag et al. (2017).

Rhetoric of debates, discussions and arguments: Useful destructive criticism for scientific & academic research, reasons and personal opinions; the example of Proto-Indo-European language revival

Rhetoric (Wikipedia) is the art of harnessing reason, emotions and authority, through language, with a view to persuading an audience and, by persuading, to convince this audience to act, to pass judgement or to identify with given values. The word derives from PIE root wer-, ‘speak’, as in MIE zero-grade wrdhom, ‘word’, or full-grade werdhom, ‘verb’; from wrētōr, Gk. ρήτωρ (rhētōr), “orator” [built like e.g. wistōr (<*widtor), Gk. ἵστωρ (histōr), “a wise man, one who knows right, a judge” (from which ‘history’), from PIE root weid-, ‘see, know’]; from that noun is adj. wrētorikós, Gk. ρητορικός (rhētorikós), “oratorical, skilled in speaking”, and fem. wrētorikā, Gk. ρητορική (rhētorikē). According to Plato, rhetoric is the “art of enchanting the soul”.

When related to Proto-Indo-European language revival, as well as in modern scientific research of any discipline, discussions are sometimes interesting in light of historical rhetoric, as they might get really close to some classical (counter-)argumentative resources, however unknown they are to their users…

Sophists taught that every argument could be countered with an opposing argument, that an argument’s effectiveness derived from how “likely” it appeared to the audience (its probability of seeming true), and that any probability argument could be countered with an inverted probability argument. Thus, if it seemed likely that a strong, poor man were guilty of robbing a rich, weak man, the strong poor man could argue, on the contrary, that this very likelihood (that he would be a suspect) makes it unlikely that he committed the crime, since he would most likely be apprehended for the crime. They also taught and were known for their ability to make the weaker (or worse) argument the stronger (or better).

So, for example, if people generally think that evolution is very likely to have occurred, because of the scientific data available, one only has to say something like “God put those proofs there to confound people and prove their faith”. And, even if there is no single reason to give why that person is entitled to interpret the Bible that way, and to determine what ‘God thought’ when ‘inventing proofs of a false evolution’, in fact there is no need to give rational arguments: this very likelihood of evolution is in itself a proof of how good God is at cheating us…

Statistics was a discipline mostly unknown to sophists, but I’m sure they more or less imagined the typical bell curve that population beliefs and opinions follow. If interpreted the other way round, one could say that the more an idea is believed by people, the more likely it is that someone will come along with another, competing one. In fact, that’s natural evolution, too: without that universal trend that life has to differentiate itself from the normal, matter would never have changed and become more and more complicated…

That trend is observed in research, too, as man is obviously another animal and his intelligence another natural feature subject to the evolutionary machinery of nature. That’s why Occam’s razor is never a sufficient argument to end a research field or hypothesis: you have e.g. Gimbutas’ theories (or Renfrew’s, if you like) – even though they are obviously not completely proven hypotheses –, about some prehistoric speakers being successful in their conquests and migrations through Eurasia, which infer logically that what happened with the expansion of Indo-European languages is what has almost always happened in the known history of language expansion, using the most probable extrapolation possible from the facts we know. But you will still find competing hypotheses about an unlikely millennium-long, peaceful spread and mix of languages through and from Europe or Asia, based on some controversial facts and a good deal of imagination. And, even if such theories are far from what can generally be considered rational, they will certainly find supporters; and it’s not bad that such unlikely ideas emerge: science is built up thanks to some of those marginal ideas which eventually prove true; apart from the million that prove false and disappear, and some dozens that sadly manage to remain, like homeopathy or Esperanto-like conlanging, as I’ve said before. The same happens with the human body, which through mutation obtained lots of advantages, but at the same time dragged some genetic illnesses along…

About Proto-Indo-European research, it’s more or less straightforward which hypotheses and theories are considered generally accepted, and which ones minority views. Nevertheless, that doesn’t prevent renowned experts from accepting some marginal hypotheses in some aspects of PIE reconstruction, while keeping the general view on other ones; neither does that prevent renowned linguists and philologists from considering Proto-Indo-European, or comparative and historical grammar in general, an absurd work: the ex-Dean of a southern Spanish University, a Latin professor, deems PIE an “invention”; in his words, “from Lat. pater, Gk. pater, and Eng. father, we say there is a language that said what, ‘pater’? pfff”; he obviously considers “language=written & renowned language system”; the problem with that thought is that if PIE becomes spoken (i.e. written too) and renowned, just as Old Latin became Classical Latin – instead of disappearing as the other Italic dialects – the whole reasoning is useless; so it’s also useless now. Two of the most famous Indo-Europeanists in Spain, F. Adrados (e.g. marginal supporter of Etruscan as an IE language) and Bernabé (e.g. marginal supporter of the Glottalic theory, I think), even if dedicated to Indo-European reconstruction, deemed PIE revival – in some news in Spanish newspaper El Mundo – a “utopia”, but considered it possible at the same time that Greek and Latin (respectively) become the EU’s official language: it’s not that they consider speaking PIE impossible, but only that there are “better” alternatives: better, I guess, for Romance or Greek speakers or philologists…

About Proto-Indo-European language revival for Europe, thus, it is difficult to ascertain if it is the most rational choice, as it is to ascertain if liberal thoughts are more rational than conservative ones. I have lived in other countries within the European Union, and have visited other parts of Spain where the spoken language is not Spanish; from that experience, the different attitudes I’ve found are overwhelming: when you speak in English or German anywhere in Europe, the conversation is anything but fluent; also, if you speak English in the UK, German in Germany, French in France, or Czech in Czechia, even mastering the regional language quite well, you’ll never get the same reaction as when a Catalan (from a Catalan-speaking region) speaks Spanish in, say, Galicia (a Galician-Portuguese speaking region), as both use a language (Spanish) common to both of them. That was also the idea behind the first Esperanto out there, probably Volapük, and it has been the idea behind every conlang trying to be THE International Auxiliary Language since then; and none has succeeded. That was also the idea behind Hebrew revival in Israel, for speakers of a hundred different languages living in the same territory: they, too, had other modern, common languages to choose from instead of an ancient, partially incomplete, and “difficult” (in Esperantist terms) one, and it succeeded.

Latin use in Europe, on the other hand, has been declining ever since the first Romance dialects developed, and had its latest official (i.e. legal) use in Europe, apart from the Catholic church, at the beginning of the XX century in Hungary – curiously enough, a non-Indo-European speaking country. Its revival has been proposed a thousand times since then, but it has never recovered its prestige, as Germanic-speaking countries have taken the lead in Western Europe, and Slavic-speaking countries in the East. It is hard to explain now why English- or German- or Polish-speaking peoples should learn and speak again the language of the Romans and the Roman Empire, with which they have little history in common…

The rest of the known language revivals, like Cornish or Manx, or even e.g. the partial revival (“sociolect”) of Katharevousa Greek, not to mention the so-called “revivals” – in fact “language revitalizations” – of Basque, Catalan, Breton, Ukrainian, etc., have been just regionally oriented language (or prestige + vocabulary) revivals with cultural or social purposes.

So, is Proto-Indo-European revival a “correct”, or “sufficiently rational” option, given the known facts? As an opinion, it is neither correct nor incorrect, as being “Indo-Europeanist for Europe” is like being leftist or conservative in politics; just like supporting Hebrew revival wasn’t (a hundred years ago) “sufficiently rational” in itself, and controversy over its revival has never ended. But the reasons behind PIE revival can and should be questioned, as the reasons behind a conlang adoption (i.e. the concepts of “better” and “easier” when applied to language) can and should be critically reviewed. For Proto-Indo-European, this comes down – I think – to two main questions:

1) Did Proto-Indo-European exist? i.e. can we confidently consider any proto-language something different from speculation or mere unproven hypothesis? The answer is “it depends”. Proto-Indo-European was probably a language spoken by prehistoric people, as probable as any generally accepted scientific theory we can support without experimental proof, like theories on the Universe, its creation or development: they might prove wrong in the future, but – following the necessary abstraction and common sense – it’s not difficult to accept most individual premises and facts surrounding them. That might be said about proto-languages like Proto-Slavic (ca. 1 AD), Proto-Germanic (ca. 1000 BC), Proto-Greek or Proto-Indo-Iranian (ca. 2000 BC) or Proto-Indo-European, especially about its European or North-Western subbranch (ca. 2500-2000 BC); on the other hand, however, about proto-languages like ‘Proto-Eurasiatic’ or ‘Proto-Nostratic’, or ‘Proto-Indo-Tyrrhenian’, or ‘Proto-Thraco-Illyrian’, or ‘Proto-Indo-Uralic’, or ‘Proto-Italo-Celtic’ (or even Proto-Italic), or ‘Proto-Balto-Slavic’, and the hundred other proposed combinations, it is impossible to prove beyond doubt if and when they were languages at all.

2) Is the Proto-Indo-European reconstruction trustworthy enough to be “revived”? i.e. can we consider it a speakable language, or just a linguistic theoretical approach? Again, it depends, but here mostly mixed with political opinions. In light of Ancient Hebrew – a language that ceased to be spoken 2500 years ago –, “revived” as a modern language by introducing thousands of newly coined terms – many of them of Indo-European origin –, to the point that some want to name it “Israeli” instead of “Hebrew” (as we call MIE “European” or “Europaio” instead of “Indo-European”), I guess the answer is clearly yes, it’s possible: in any case, Indo-European languages have a continuous history of more than 4000 years, and modern terms need only (in most cases) a sound-law adjustment to be translated into PIE. Also, in light of the other proto-languages with a strong scientific basis and a similar time span, like Proto-Uralic, Proto-Semitic or Proto-Dravidian, there is no possible comparison with Proto-Indo-European: while PIE is practically a fully reconstructed and well-known language without written texts to ‘confirm’ our knowledge, the rest are just experimental (mainly vocabulary-based) reconstructions. There are, thus, proto-languages and proto-languages, as there are well-known natural dead languages and poorly attested ones; PIE is therefore one of the few which might be called today a real, natural language, like Proto-Germanic, Proto-Slavic or Proto-Indo-Aryan.

However, anti-Europeanists (or, better, anti-Indo-Europeanists for the European Union) won’t find it difficult to say a simple “a proto-language is not enough to be revived, as Ancient Hebrew was written down and PIE wasn’t”, thus disguising their sceptical views on the politics behind the project as seemingly rational discussion. Others will also state, in light of our clear confrontation with conlangs, that “a proto-language is nothing different from a conlang”, thus disguising their real interest in spreading their personal desire that a proto-language be similar to a conlang. One only has to say: “Classical Latin couldn’t be reconstructed by comparing Spanish, French and Italian” – when, in fact, the question should be something like “could the common, Late Vulgar Latin be reconstructed with a high degree of confidence, having just the writings of the first mediaeval Romance languages?” The answer is probably a simple “yes, and quite well”, until proven otherwise, but by expressing the first doubt one can easily transform the possible-reconstruction argument into an apparently unlikely one; enough to convince those who want to be convinced…

Thus, whereas some people consider PIE a natural language, confidently reconstructed, but impossible to speak today because of political matters, others just consider it another invention, nothing different from Esperanto, while Esperantists talk about it as a “worse” or “more difficult” alternative to it: you could nevertheless find all opinions mixed together when it comes to destructive discussions, as the objective is not to defend one’s own rational and worked-out idea, but simply to destroy the appearance (or likelihood, in sophistic terms) of the rival’s idea. Be it anti-Europeanism, anti-Indo-European-reconstruction or anti-everything-else-than-Esperanto, you don’t have to defend your position: just repeat your known anti- cliches, and you’ve “won”. Apparently, at least.

Cicero noted what Greek rhetors already knew about usual debates, and how arguments should be made and countered so that no idea is left accepted. In that sense, discussions were (and are) generally so unnecessary that the Socratic Method still seems to be the best philosophical approach to discussions, even those concerning scientific (i.e. “most probable”) facts: instead of arriving at answers, non-expert (and often expert) discussion is used to break down the theories others hold, not “to go beyond the axioms and postulates we take for granted” and obtain a better knowledge, as Greek philosophers put it, but just to destroy what others build up.

So, for example, we might get these general rules to counter any argument, even if it’s not only based on opinions, but also on generally accepted facts:

1) Demonstrate the falseness of a part of the rival’s argument; then, infer the falseness of the whole reasoning. For example, let’s say Gimbutas’ view is out-dated, or that we at Dnghu included something considered nowadays ‘wrong’ in our grammar: then PIE revival is also mistaken; nothing more to explain. Or, let’s say that Hebrew revival is not “equal” to a proto-language revival, and that therefore the comparison is ‘false’ – even if comparisons are there to compare similar cases, not “equal” cases, which would be absurd – then, the whole PIE revival project is ‘equivocal’ or ‘absurd’. That’s the view about PIE revival you can find in some comments made on American blogs out there.

2) You can also confirm a part of your rival’s argument, and then, by doing it, carry that argument to its extreme, to the extent that the consequences of it are intolerable, and the paroxysm completely distorts your rival’s argument. That’s more or less what I usually do when confronting conlanging as a real option for the European Union, by saying “OK, let’s adopt the ‘better’ and ‘easier’ language: first Esperanto, then the “better” and “easier” Esperanzo, then Lojban, then Pilosofio, then Mazematio, etc. etc. ad infinitum” – so, as a conclusion, one might accept that “better” and “easier” are not actually good reasons to adopt a language; hence the arguments based on “better” and “easier” cliches are opinion, not ratio.

3) The most common now (and then, I guess, in spoken language) is personal discredit, by which you can infer that your rival’s argument is also corrupted. That is what some have done when lacking more arguments, calling me personally (and the Indo-European Language Association in general?!) a “racist”, “nazi”, or “KKK-like” group; or trying to discredit me personally by saying I don’t master the English language; or that I misspelled or ‘was wrong’ in reconstructing this or that PIE name or noun; or even just because I am “an amateur” – thus suggesting we all have to be “language professionals” to propose a trustworthy PIE revival. A recent example of this is our latest Esperantist visitor, saying I am “close to being racist” because I propose PIE for the EU – thus obviously inviting readers to identify “language=race”, saying that “I propose one language = I propose one race = I am a racist”, and therefore if “I=racist” and “I propose PIE revival” => “PIE=x”. The whole reasoning is nonsense, but he is not the first – and won’t be the last – educated individual to say (and possibly believe) that…

4) The fourth is actually only a minor method derived from the third, used in desperate cases, which consists in taking a sensitive, emotional example of the consequences of the generalization of the rival’s argument, to demonstrate the moral baseness of the one who defends it; then, if he is discredited, his argument is corrupted, too [see point 3]… That is what some desperate people do when saying that PIE revival for the EU is “bad” (or “worse”) for non-IE-language-speakers like Finnish, Hungarian, Estonian, Basque or Maltese peoples. In fact, anyone who had taken a look at our website, or had made a quick search about me, would have found that I began this project of PIE revival to defend European languages (at least minority languages, as national or official languages are already well protected) against the European Union’s English officious imperium and English-German-French official triumvirate. Also, if we abandoned PIE revival, only some languages (the official, i.e. national ones, 25 today) would get EU support, while the rest would just die out or resist with some regional or private support. With Modern Indo-European, on the other hand, there will only be one official language supported by the European Union, and the rest will be really equal before each other and the Union, be it English, Maltese, Basque, Saami or Piedmontese. Nowadays, English is the language spoken in institutions, Maltese has an official status before the EU, while Saami is official in its country, Basque is only official in its territory, and Piedmontese, Asturian, Breton, and the majority of EU regional languages are only privately and locally defended. Nevertheless, one only has to say “supporting Indo-European is what Nazis did, PIE revival is racist and wants to destroy non-Indo-European peoples and cultures”; and, there you are: nothing proven, nothing reasoned, but the simplest and most efficient FUD you can find to counter the thousand arguments in favour of this revival project.

However unnecessary and unfruitful it might seem, I still discuss – or even directly look for debate –, because I benefit from such long, active pauses from my study, unlike those tiny passive TV- or radio-pauses I insert between study hours, especially in these stressful exam periods. Indeed I can find something to discuss on any website at any time, but I’m generally interested in debating these language policy options. Nevertheless, I find it difficult to understand why some people get mad (at me, the project, or even the association or the whole world), when in fact taking part in any discussion is freely accepted by all of us, and it’s me who puts new ideas and proposals on the table, and the others who just have to criticize them…

Something valuable for life I learned from psychology (possibly the only thing…) is about Chomsky’s reaction to Skinner’s comments: my professor (close to Freudian psychoanalysis), who told us the story – I hope I got it right, I cannot find it out there – thought it was Skinner who “won” the debate, by answering the criticism of Chomsky, who had criticized Skinner’s work, Verbal Behavior, for its “scientistic”, not scientific, concept of the human mind. In fact, the younger Chomsky had just applied science to psychology (a need that psychology still has), simplifying the understanding of mind with a strict cognitive view, and criticizing some traditional views that psychologists accepted as ‘normal’. Skinner and those who followed his behavioural school of thought overreacted, mostly based on the belief that Chomsky’s reasons were against their lives and professional options, when in fact reason and opinion are on different planes. Chomsky, instead of entering the flame war (yes, trolling existed back in the 60’s), did nothing. When asked years later why he didn’t reply as expected to all that criticism, he just said: “they missed the point”; he said what he had to say, criticized what he wanted, proposed an alternative, and left the discussion. And still, even without his answering, the cognitive revolution provoked a shift in American psychology from the 1950s through the 1970s from being primarily behavioral to being primarily cognitive.

If you want to debate about opinions – be it PIE revival, Europeanism, general politics, Star Trek or the sex of angels -, entering into unending criticisms and personal attacks, that’s OK; but you should do it if and when you want, as I only do it because I obtain something beneficial, having a good time, laughing a little bit, relaxing from study, thinking about interesting reasons that might appear for or against my views or ideas, etc. And you should do it to get something in (re)turn, be it that same stress relief I (and most people) get, or other personal or professional benefits whatsoever. If not, if maybe you are getting more stressed trying to “convince” me or others, to “make us change our minds” with great one-minute ‘reasons’, by discussing directly your opinions as if they were ‘true‘, then you are clearly “missing the point” (using Chomsky’s words) with these discussions, and – as our latest Esperantist commenter (Mr. Janoski) puts it – “losing your time”, “trying to understand” something…