On Latin, Turkic, and Celtic – likely stories of mixed societies and little genetic impact


Recent article on The Conversation, The Roman dead: new techniques are revealing just how diverse Roman Britain was, about the paper (behind paywall) A Novel Investigation into Migrant and Local Health-Statuses in the Past: A Case Study from Roman Britain, by Redfern et al. Bioarchaeology International (2018), among others.

Interesting excerpts about Roman London:

We have discovered, for example, that one middle-aged woman from the southern Mediterranean has black African ancestry. She was buried in Southwark with pottery from Kent and a fourth century local coin – her burial expresses British connections, reflecting how people’s communities and lives can be remade by migration. The people burying her may have decided to reflect her life in the city by choosing local objects, but we can’t dismiss the possibility that she may have come to London as a slave.

The evidence for Roman Britain having a diverse population only continues to grow. Bioarchaeology offers a unique and independent perspective, one based upon the people themselves. It allows us to understand more about their life stories than ever before, but requires us to be increasingly nuanced in our understanding, recognising and respecting these people’s complexities.

We already have a more or less clear idea about how little the Roman conquest may have shaped the genetic map of Europe, Africa, or the Middle East, in contrast to other previous or later migrations or conquests.

Also, on the Turkic expansion, the recent paper of Damgaard et al. (Nature 2018) stated:

In the sixth century AD, the Hunnic Empire had been broken up and dispersed as the Turkic Khaganate assumed the military and political domination of the steppes22,23. Khaganates were steppe nomad political organizations that varied in size and became dominant during this period; they can be contrasted to the previous stateless organizations of the Iron Age24. The Turkic Khaganate was eventually replaced by a number of short-lived steppe cultures25 (…).

We find evidence that elite soldiers associated with the Turkic Khaganate are genetically closer to East Asians than are the preceding Huns of the Tian Shan mountains (Supplementary Information section 3.7). We also find that one Turkic Khaganate-period nomad was a genetic outlier with pronounced European ancestries, indicating the presence of ongoing contact with Europe (…).

Analyses of Turk- and Medieval-period population clusters. a, PCA of Tian Shan Hun, Turk, Kimak, Kipchack, Karakhanid and Golden Horde, including 28 individuals analysed at 242,406 autosomal SNP positions. b, Results for model-based clustering analysis at K = 7. Here we illustrate the admixture analyses with K = 7 as it approximately identifies the major component of relevance (Anatolian/European farmer component, Caucasian ancestry, EHG-related ancestry and East Asian ancestry).

These results suggest that Turkic cultural customs were imposed by an East Asian minority elite onto central steppe nomad populations, resulting in a small detectable increase in East Asian ancestry. However, we also find that steppe nomad ancestry in this period was extremely heterogeneous, with several individuals being genetically distributed at the extremes of the first principal component (Fig. 2) separating Eastern and Western descent. On the basis of this notable heterogeneity, we suggest that during the Medieval period steppe populations were exposed to gradual admixture from the east, while interacting with incoming West Eurasians. The strong variation is a direct window into ongoing admixture processes and the multi-ethnic cultural organization of this period.

We already knew that the expansion of the La Tène culture, associated with the expansion of Celtic languages throughout Europe, was probably not accompanied by massive migrations (from the IEDM, 3rd ed.):

The Mainz research project of bio-archaeometric identification of mobility has not proven to date a mass migration of Celtic peoples in central Europe ca. 4th-3rd centuries BC, i.e. precisely in a period where textual evidence informs of large migratory movements (Scheeres 2014). La Tène material culture points to far-reaching inter-regional contacts and cultural transfers (Burmeister 2016).

Also, from the latest paper on Y-chromosome bottleneck:

[The hypothesis of patrilineal kin group competition] has an added benefit in that it could explain the temporal placement of the bottleneck if competition between patrilineal kin groups was the main form of intergroup competition for a limited episode of time after the Neolithic transition. Anthropologists have repeatedly noted that the political salience of unilineal descent groups is greatest in societies of ‘intermediate social scale’ (Korotayev47 and its citations on p. 2), which tend to be post-Neolithic small-scale societies that are acephalous, i.e. without hierarchical institutions48. Corporate kin groups tend to be absent altogether among mobile hunter gatherers with few defensible resource sites or little property (Kelly49 pp. 64–73), or in societies utilizing relatively unoccupied and under-exploited resource landscapes (Earle and Johnson50 pp. 157–171). Once they emerge, complex societies, such as chiefdoms and states, tend to supervene the patrilineal kin group as the unit of intergroup competition, and while they may not eradicate them altogether as sub-polity-level social identities, warfare between such kin groups is suppressed very effectively51,52. These factors restrict the social phenomena responsible for the bottleneck to the period after the initial Neolithic but before the emergence of complex societies, which would place the bottleneck-generating mechanisms in the right period of time for each region of the Old World.

Diachronic map of Late Copper Age migrations including Classical Bell Beaker (east group) expansion from central Europe ca. 2600-2250 BC

However, I recently read in a forum for linguists that the expansion of East Bell Beakers, overwhelmingly of R1b-L21 subclades in the British Isles, “poses a problem”, in that it should be identified with a Celtic expansion earlier than traditionally assumed…

That interpretation would be in line with the simplistic maps we are seeing right now for Bell Beakers (see below for the Copenhagen group).

If anything, the results of Bell Beaker expansions (taken alone) would seem to support a model similar to Cunliffe & Koch’s hypotheses of a rather early Celtic expansion into Great Britain and Iberia from the Atlantic.

Spread of Indo-European languages (by the Copenhagen group).

But it doesn’t. Mallory already explained why in Cunliffe & Koch’s series Celtic from the West: the Bell Beaker expansion is too early for that; even for Italo-Celtic. It should correspond to North-West Indo-European speakers.

Not every population movement that is genetically very significant needs to be significant for the languages attested much later in the region.

This should be obvious to everyone with the many examples we already have. One of the least controversial now would probably be the expansion of R1b-DF27, widespread in Iberia probably at roughly the same time as R1b-L21 was in Great Britain, and still pre-Roman Iberians showed a mix of non-Indo-European languages, non-Celtic languages (at least Galaico-Lusitanian), and also some (certain) Celtic languages. And modern Iberians speak Romance languages, without much genetic impact from the Romans, either…

It is well-established in Academia that the expansion of La Tène is culturally associated with the spread of Celtic languages in Europe, including the British Isles and Iberia. While modern maps of U152 distribution may correspond to the migration of early Celts (or Italo-Celtic speakers) with Urnfield/Hallstatt, the great Celtic expansion across Europe need not show a genetic influence greater than or even equal to that of previous prehistoric migrations.

Post-Bell-Beaker Europe, after ca. 2200 BC.

You can see in these de novo models the same kind of invented theoretical ‘problem’ (as Iosif Lazaridis puts it) that we have seen with the Corded Ware showing steppe ancestry, with Old Hittite samples not showing EHG ancestry, or with CHG ancestry appearing north of the Caucasus but no EHG to the south.

However you may want to explain all these errors in scientific terms (selection bias, under-coverage, over-coverage, faulty statistical methods, etc.), these interpretations were simply the fruit of a lack of knowledge of the anthropological disciplines at play.

Let’s hope the future paper on Celtic expansion takes this into consideration.


Population size potentially affecting rates of language change


Open access Population Size and the Rate of Language Evolution: A Test Across Indo-European, Austronesian, and Bantu Languages, by Greenhill et al. Front. Psychol (2018) 9:576.

Summary (emphasis mine):

What role does speaker population size play in shaping rates of language evolution? There has been little consensus on the expected relationship between rates and patterns of language change and speaker population size, with some predicting faster rates of change in smaller populations, and others expecting greater change in larger populations. The growth of comparative databases has allowed population size effects to be investigated across a wide range of language groups, with mixed results. One recent study of a group of Polynesian languages revealed greater rates of word gain in larger populations and greater rates of word loss in smaller populations. However, that test was restricted to 20 closely related languages from small Oceanic islands. Here, we test if this pattern is a general feature of language evolution across a larger and more diverse sample of languages from both continental and island populations. We analyzed comparative language data for 153 pairs of closely-related sister languages from three of the world’s largest language families: Austronesian, Indo-European, and Niger-Congo. We find some evidence that rates of word loss are significantly greater in smaller languages for the Indo-European comparisons, but we find no significant patterns in the other two language families. These results suggest either that the influence of population size on rates and patterns of language evolution is not universal, or that it is sufficiently weak that it may be overwhelmed by other influences in some cases. Further investigation, for a greater number of language comparisons and a wider range of language features, may determine which of these explanations holds true.

Interesting excerpts:

Our analysis suggests that, as for Polynesian languages, smaller Indo-European languages have greater rates of word loss from basic vocabulary. This result is consistent with the claim that smaller populations are at greater risk of loss of language elements, and other aspects of culture, due to effects of incomplete sampling of variants over generations. However, we note that the relatively small sample size for this dataset complicates the interpretation of this result. Least squares regression after Welch & Waxman test has the same false positive rate but has much less power than Poisson regression when sample size is small (~ten or fewer pairs, Hua et al., 2015). This makes it difficult to interpret the inconsistent results of these two analyses, as they may be due to their difference in the statistical power. Hence, the negative relationship between rates of loss and population size for Indo-European languages would benefit from additional investigation. We do not find evidence for a negative relationship between population size and word loss rates in the Austronesian and Bantu groups. This finding suggests that either these datasets contain too few language variants to have sufficient power to detect rate differences, or that the increased loss rate in small populations is not a universal phenomenon, or that it is a relatively weak force in some language groups and thus may be overwhelmed by other social, linguistic or demographic factors.

Regarding potential drawbacks of the study:

[M]easuring speech community size is notoriously difficult. How exactly does one delimit a speech community (Crystal, 2008) and what degree of proficiency in a language is sufficient to be part of the community (Bloomfield, 1933)? This task is made harder as there are few national censuses that collect detailed speaker statistics. Further, speaker population size can change rapidly with many modern world languages (especially the Indo-European languages) experiencing rapid growth over the last few hundred years (Crystal, 2008), while others have experienced catastrophic declines (Bowern, 2010). For the same reasons, the difficulty of obtaining accurate population estimates is also a problem in biology. Furthermore, the relevant parameter for genetic change—the effective population size—is difficult to estimate directly, even when accurate census information is available (Wang et al., 2016). Likewise, there may be an important role played by population and network density—tight-knit networks may inhibit change, while loosely integrated speech communities (regardless of their size), may facilitate change (Granovetter, 1973; Milroy and Milroy, 1992). One way forward here is perhaps to simulate rates of change over a range of population sizes and network topologies (c.f. Reali et al., 2018).
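The suggestion to simulate rates of change over a range of population sizes can be illustrated with a toy model. The following is a minimal sketch, not the method of Greenhill et al. or of Reali et al.: it assumes a neutral, Wright-Fisher-style copying process in which every speaker of a new generation picks up a word variant from a randomly chosen speaker of the previous one. The function name, parameters, and population sizes are all hypothetical.

```python
import random


def simulate_word_loss(pop_size, generations=50, trials=200,
                       start_freq=0.5, seed=0):
    """Return the fraction of trials in which a word variant is lost.

    Assumption (not from the paper): neutral copying, fixed population
    size, no contact with other languages, no network structure.
    """
    rng = random.Random(seed)
    lost = 0
    for _trial in range(trials):
        # Number of speakers currently using the word variant.
        count = round(pop_size * start_freq)
        for _gen in range(generations):
            freq = count / pop_size
            # Each new speaker copies a random speaker of the previous
            # generation, so the variant is sampled with probability freq.
            count = sum(1 for _ in range(pop_size) if rng.random() < freq)
            if count == 0 or count == pop_size:
                break  # variant lost, or adopted by everyone
        if count == 0:
            lost += 1
    return lost / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, simulate_word_loss(n))
```

In such a model, loss through incomplete sampling of variants is expected to be far more frequent in the smallest population, which is the mechanism the authors invoke for the Indo-European result. Network topology, which they also mention, is deliberately left out of this sketch.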

As conclusions:

Firstly, we provide some evidence that rates of language change can be affected by demographic factors. Even if the effect is not universal, the finding of significant associations between population size and patterns of linguistic change in some languages urges caution for any analysis of language evolution that makes an assumption of uniform rates of change. These results also potentially provide a window on processes of language change in these lineages, providing further impetus to investigate the effect of number of speakers on patterns of language transmission and loss. A more detailed study of language change for a larger number of comparisons might clarify the relationship between population size and word loss rates, particularly within the Indo-European language family.

Secondly, we have shown that the significant patterns of language change identified in a previous study are not a universal phenomenon. Unlike the study of Polynesian languages, we did not find any significant relationships between word gain rate and population size, and the association between loss rates and population size was not evident for all language families analyzed. The lack of universal relationships suggests that it may be difficult to draw general conclusions about the influence of demographic factors on patterns and rates of language change. Many other factors have been proposed to influence rates of language change (Greenhill, 2014) including population density, social structure (Nettle, 1999; Labov, 2007; Ke et al., 2008; Trudgill, 2011), degree of contact, and connectedness with other languages (Matras, 2009; Bowern, 2010), degree of language diffusion within a speech community (Wichmann et al., 2008), degree of bilingualism or multilingualism (Lupyan and Dale, 2010; Bentz and Winter, 2013), language group diversity (Atkinson et al., 2008) and environmental factors such as habitat heterogeneity and latitude (Bowern, 2010; Blust, 2013; Amano et al., 2014). These factors might mediate or overwhelm the effect of speaker population size.

We find no evidence to support the hypothesis that uptake of new words should be faster in small populations, which is based on the assumption that new words can diffuse more efficiently through a smaller speaker population than a larger one (Nettle, 1999). Nor do we find support for the suggestion that large, widespread languages have a tendency to lose linguistic features at a greater rate (Lupyan and Dale, 2010). However, this latter hypothesis is predominantly expected to explain loss of complex linguistic morphology (such as case systems), which may be harder for non-native speakers to learn, rather than basic vocabulary studied here which may be comparatively easier for second language learners to acquire (but see Kempe and Brooks, 2018). Further, our results cannot be interpreted as confirmation of previous studies that suggest there is no effect of population size on rates (Wichmann and Holman, 2009). The detection of significant patterns in rates of lexical change with population size variation in the Polynesian and Indo-European languages, but the failure to identify similar patterns in the Bantu and Austronesian data, suggests that patterns of rates may need to be investigated on a case-by-case basis.


Language continuity despite population replacement in Remote Oceania


New article (behind paywall) Language continuity despite population replacement in Remote Oceania, by Posth et al., Nat. Ecol. Evol. (2018).


Recent genomic analyses show that the earliest peoples reaching Remote Oceania—associated with Austronesian-speaking Lapita culture—were almost completely East Asian, without detectable Papuan ancestry. However, Papuan-related genetic ancestry is found across present-day Pacific populations, indicating that peoples from Near Oceania have played a significant, but largely unknown, ancestral role. Here, new genome-wide data from 19 ancient South Pacific individuals provide direct evidence of a so-far undescribed Papuan expansion into Remote Oceania starting ~2,500 yr BP, far earlier than previously estimated and supporting a model from historical linguistics. New genome-wide data from 27 contemporary ni-Vanuatu demonstrate a subsequent and almost complete replacement of Lapita-Austronesian by Near Oceanian ancestry. Despite this massive demographic change, incoming Papuan languages did not replace Austronesian languages. Population replacement with language continuity is extremely rare—if not unprecedented—in human history. Our analyses show that rather than one large-scale event, the process was incremental and complex, with repeated migrations and sex-biased admixture with peoples from the Bismarck Archipelago.

So, despite the population replacement in Oceania seen recently in Genomics, the people of present-day Vanuatu continue to speak languages descended from those spoken by the initial Austronesian inhabitants, rather than any Papuan language of the incoming migrants.

Professor Gray, Director of the Department of Linguistic and Cultural Evolution at the MPI-SHH, says:

Population replacement with language continuity is extremely rare – if not unprecedented – in human history. The linguist Bob Blust has long argued for a model in which a separate Papuan expansion reaches Vanuatu soon after initial Austronesian settlement, with the initial, and likely undifferentiated, Austronesian language surviving as a lingua franca for diverse Papuan migrant groups.

Dr. Adam Powell, senior author of the study and also of the MPI-SHH, continues,

The demographic history suggested by our ancient DNA analyses provides really strong support for this historical linguistic model, with the early arrival and complex, incremental process of genetic replacement by people from the Bismarck Archipelago. This provides a compelling explanation for the continuity of Austronesian languages despite the almost complete replacement of the initial genetic ancestry of Vanuatu.

Maps showing the migrations in the area, including, in the final map, the migrations revealed by the current study. Credit: Hans Sell, adapted from Skoglund et al. Genomic insights into the peopling of the Southwest Pacific. Nature (2016).

I think we can safely disagree now with their assertion. We are seeing more and more cases of language continuity in spite of population replacement quite clearly in Eurasian prehistory. At least:

All these cases can be explained with founder effects and gradual expansions after an initial arrival, maybe also initial close interaction between different ethnic groups, where one group (and its language) becomes the dominant one.

NOTE. Even if an alternative model is selected (say, that Corded Ware migrants spoke Indo-European languages), alternative language continuity events need to be proposed for some of these regions, so we are beyond their description as ‘rare language events’ already.

What is becoming clearer with ancient samples, therefore, is that there is little space for prehistoric cultural diffusion events (at least massive ones), which were quite popular explanations before the advent of genetic studies.


Language evolution and language change related to ancient DNA


An interesting special issue of the Journal of Language Evolution has appeared, dedicated to ancient DNA and language evolution.

Also, check out the preprint at BioRxiv, Geospatial distributions reflect rates of evolution of features of language, by Kauhanen et al. (2018).


Different structural features of human language change at different rates and thus exhibit different temporal stabilities. Existing methods of linguistic stability estimation depend upon the prior genealogical classification of the world’s languages into language families; these methods result in unreliable stability estimates for features which are sensitive to horizontal transfer between families and whenever data are aggregated from families of divergent time depths. To overcome these problems, we describe a method of stability estimation without family classifications, based on mathematical modelling and the analysis of contemporary geospatial distributions of linguistic features. Regressing the estimates produced by our model against those of a genealogical method, we report broad agreement but also important differences. In particular, we show that our approach is not liable to some of the false positives and false negatives incurred by the genealogical method. Our results suggest that the historical evolution of a linguistic feature leaves a footprint in its global geospatial distribution, and that rates of evolution can be recovered from these distributions by treating language dynamics as a spatially extended stochastic process.

Featured image, modified from the paper: “Empirical geospatial distributions of two linguistic features on the hemisphere from 30° W to 150° E (red: feature present, blue: feature absent): (A) definite marker (WALS feature 37A), (B) object–verb order (WALS feature 83A). Shown are both individual empirical data points (languages, as given by WALS coordinates) and a spatial interpolation (inverse distance weighting) on these points. Map projection: Albers equal-area.”


From Proto-Slavic into Germanic or from Germanic into Proto-Slavic? A review of controversial loanwords


Interesting new article From Proto-Slavic into Germanic or from Germanic into Proto-Slavic? A review of controversial loanwords, by Marta Noińska and Mikołaj Rychło in Studia Rossica Gedanensia (2017) 4:39-52.


Germanic loanwords in Proto-Slavic have been comprehensively analysed by both Western and Eastern scholars, however the problem of borrowings in the opposite direction received far less attention, especially among Western academics. It is worth noticing that Viktor Martynov (1963) proposed as many as 40 borrowings and penetrations from Proto-Slavic into Proto-Germanic. Among these, there are nine (*bljudo, *kupiti, *lěkъ, *lugъ, *lukъ, *plugъ, *pъlkъ, *skotъ, *tynъ) which are considered certain loanwords in the opposite direction in the newest monograph on the topic by Pronk-Tiethoff (2013). The aim of the present paper is to review and juxtapose linguists’ views on the direction and etymology of these borrowings. The authors take into consideration the analyses carried out not only by Saskia Pronk-Tiethoff (2013) and Viktor Martynov (1963), but also by Valentin Kiparsky (1934) and Zbigniew Gołąb (1992). An attempt is made to assess which of the nine words could be borrowings from Proto-Slavic in Germanic.

This question of loanwords (in which direction they went, and approximately when in the different stages of the languages involved), a priori interesting only from a linguistic point of view, might also be very important for ascertaining the oldest layer of vocabulary shared by Germanic and Balto-Slavic, which can hint at their shared substrate immediately after the expansion of East Bell Beakers (or between Pre-Germanic and ‘Temematic’, for Kortlandt and others).


Migration, acculturation, and the maintenance of between-group cultural variation

Preprint at BioRxiv, Migration, acculturation, and the maintenance of between-group cultural variation, by Alex Mesoudi (2017)


How do migration and acculturation affect within- and between-group cultural variation? Classic models from population genetics show that migration rapidly breaks down between-group genetic structure. However, in the case of cultural evolution, migrants (or their children) can acculturate to local cultural behaviors via social learning processes such as conformity, potentially preventing migration from eliminating between-group cultural variation. To explore this verbal claim formally, here I present models that quantify the effect of migration and acculturation on between-group cultural variation, first for a neutral trait and then for an individually-costly cooperative trait. I also review the empirical literature on the strength of migrant acculturation. The models show that surprisingly little conformist acculturation is required to maintain plausible amounts of between-group cultural diversity. Acculturation is countered by assortation, the tendency for individuals to preferentially interact with culturally-similar others. Cooperative traits may also be maintained by payoff-biased social learning but only in the presence of strong sanctioning institutions. While these models provide insight into the potential dynamics of acculturation and migration in cultural evolution, they also highlight the need for more empirical research into the individual-level learning biases that underlie migrant acculturation.
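The interplay between migration and conformist acculturation described in the abstract can be sketched numerically. The following is a simplified two-group illustration, not Mesoudi's actual model: it assumes a single binary trait, a migration step that replaces a fraction m of each group with individuals at the overall mean frequency, and a conformity term of the form a·p(1−p)(2p−1) that pushes each group toward its local majority. The function name and the parameter values are hypothetical.

```python
def between_group_difference(m, a, generations=200):
    """Trait-frequency difference between two groups after repeated
    migration (rate m) and conformist acculturation (strength a).

    Assumption (not from the paper): two equally sized groups, one
    binary trait, deterministic recursion without sampling noise.
    """
    # The trait starts universal in group 1 and absent in group 2.
    p1, p2 = 1.0, 0.0
    for _ in range(generations):
        mean = (p1 + p2) / 2
        # Migration: a fraction m of each group is replaced by
        # individuals drawn at the overall mean trait frequency.
        p1 = (1 - m) * p1 + m * mean
        p2 = (1 - m) * p2 + m * mean
        # Conformist acculturation: frequencies move toward whichever
        # variant is in the local majority.
        p1 += a * p1 * (1 - p1) * (2 * p1 - 1)
        p2 += a * p2 * (1 - p2) * (2 * p2 - 1)
    return p1 - p2


if __name__ == "__main__":
    print("no acculturation:  ", between_group_difference(m=0.1, a=0.0))
    print("with acculturation:", between_group_difference(m=0.1, a=0.5))
```

Under these assumptions, migration alone erases the difference between the groups, whereas adding the conformity term keeps them distinct, which mirrors the qualitative claim of the abstract. Assortation and payoff-biased learning are not represented here.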

The Indo-European Corded Ware Theory group might need to resort to cultural diffusion models again after the ‘Yamnaya ancestral component’ is dismissed as a source for Corded Ware peoples and their migration.

Therefore, if you are somehow interested in keeping this IE-CWC(-R1a) theory alive, learning more about these theoretical models for cultural diffusion is probably your best investment for the future…


Forces driving grammatical change are different to those driving lexical change


A new paper at PNAS, Evolutionary dynamics of language systems, by Greenhill et al. (2017).


Do different aspects of language evolve in different ways? Here, we infer the rates of change in lexical and grammatical data from 81 languages of the Pacific. We show that, in general, grammatical features tend to change faster and have higher amounts of conflicting signal than basic vocabulary. We suggest that subsystems of language show differing patterns of dynamics and propose that modeling this rate variation may allow us to extract more signal, and thus trace language history deeper than has been previously possible.


Understanding how and why language subsystems differ in their evolutionary dynamics is a fundamental question for historical and comparative linguistics. One key dynamic is the rate of language change. While it is commonly thought that the rapid rate of change hampers the reconstruction of deep language relationships beyond 6,000–10,000 y, there are suggestions that grammatical structures might retain more signal over time than other subsystems, such as basic vocabulary. In this study, we use a Dirichlet process mixture model to infer the rates of change in lexical and grammatical data from 81 Austronesian languages. We show that, on average, most grammatical features actually change faster than items of basic vocabulary. The grammatical data show less schismogenesis, higher rates of homoplasy, and more bursts of contact-induced change than the basic vocabulary data. However, there is a core of grammatical and lexical features that are highly stable. These findings suggest that different subsystems of language have differing dynamics and that careful, nuanced models of language change will be needed to extract deeper signal from the noise of parallel evolution, areal readaptation, and contact.

This is in line with the studies by Bentz, like Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms, which suggest a simplification of grammar with language contact.

It might then give further support to my proposal of Uralic as the Corded Ware substrate (common to Balto-Slavic and Indo-Iranian), since they are the only Late Indo-European branches that clearly retain the grammatical complexity in word forms, which, together with their shared phonetic isoglosses (also present partially between Balto-Slavic and Germanic), puts them nearer to a complex, potentially related Uralic (or other Indo-Uralic) branch.

On the other hand, the finding of a greater stability of lexicon gives further support to the concept of a North-West Indo-European group, since one of its foundations (the main one originally) is the shared vocabulary between Italo-Celtic, Germanic, and Balto-Slavic.

Featured image: from the article (copyrighted), “Map showing locations of languages in this study. The phylogenies show the maximum clade credibility tree of the Austronesian languages in our sample. Each phylogeny is colored by the average rate of change, with branches showing more change colored redder, while bluer branches show reductions in rate. Branches with significant shifts are annotated with an asterisk, and the languages showing significantly different rates of change in their grammatical data are located on the map”.