A new paper in PNAS: Evolutionary dynamics of language systems, by Greenhill et al. (2017).
Do different aspects of language evolve in different ways? Here, we infer the rates of change in lexical and grammatical data from 81 languages of the Pacific. We show that, in general, grammatical features tend to change faster and have higher amounts of conflicting signal than basic vocabulary. We suggest that subsystems of language show differing patterns of dynamics and propose that modeling this rate variation may allow us to extract more signal, and thus trace language history deeper than has been previously possible.
Understanding how and why language subsystems differ in their evolutionary dynamics is a fundamental question for historical and comparative linguistics. One key dynamic is the rate of language change. While it is commonly thought that the rapid rate of change hampers the reconstruction of deep language relationships beyond 6,000–10,000 y, there are suggestions that grammatical structures might retain more signal over time than other subsystems, such as basic vocabulary. In this study, we use a Dirichlet process mixture model to infer the rates of change in lexical and grammatical data from 81 Austronesian languages. We show that, on average, most grammatical features actually change faster than items of basic vocabulary. The grammatical data show less schismogenesis, higher rates of homoplasy, and more bursts of contact-induced change than the basic vocabulary data. However, there is a core of grammatical and lexical features that are highly stable. These findings suggest that different subsystems of language have differing dynamics and that careful, nuanced models of language change will be needed to extract deeper signal from the noise of parallel evolution, areal readaptation, and contact.
This is in line with studies by Bentz, such as Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms, which suggest that grammar simplifies under language contact.
It might then give further support to my proposal of Uralic as the Corded Ware substrate (common to Balto-Slavic and Indo-Iranian), since these are the only Late Indo-European branches that clearly retain grammatical complexity in word forms. Together with their shared phonetic isoglosses (also partially present between Balto-Slavic and Germanic), this puts them nearer to a complex, potentially related Uralic (or other Indo-Uralic) branch.
On the other hand, the finding of a greater stability of lexicon gives further support to the concept of a North-West Indo-European group, since one of its foundations (the main one originally) is the shared vocabulary between Italo-Celtic, Germanic, and Balto-Slavic.
Featured image: from the article (copyrighted), “Map showing locations of languages in this study. The phylogenies show the maximum clade credibility tree of the Austronesian languages in our sample. Each phylogeny is colored by the average rate of change, with branches showing more change colored redder, while bluer branches show reductions in rate. Branches with significant shifts are annotated with an asterisk, and the languages showing significantly different rates of change in their grammatical data are located on the map”.