An interesting new paper in Nature: Detecting evolutionary forces in language change, by Newberry, Ahern, Clark, and Plotkin (2017). Discovered via Science Daily.
Examining substantial collections of annotated texts dating from the 12th to the 21st centuries, the researchers found that certain linguistic changes were guided by pressures analogous to natural selection — social, cognitive and other factors — while others seem to have occurred purely by happenstance.
“Linguists usually assume that when a change occurs in a language, there must have been a directional force that caused it,” said Joshua Plotkin, professor of biology in Penn’s School of Arts and Sciences and senior author on the paper. “Whereas we propose that languages can also change through random chance alone. An individual happens to hear one variant of a word as opposed to another and then is more likely to use it herself. Chance events like this can accumulate to produce substantial change over generations. Before we debate what psychological or social forces have caused a language to change, we must first ask whether there was any force at all.”
“One of the great early American linguists, Leonard Bloomfield, said that you can never see a language change, that the change is invisible,” said Robin Clark, a coauthor and professor of linguistics in Penn Arts and Sciences. “But now, because of the availability of these large corpora of texts, we can actually see it, in microscopic detail, and begin to understand the details of how change happened.”
One change is the regularization of past-tense verbs. Using the Corpus of Historical American English, composed of more than 100,000 parsed and digitized texts ranging from 1810 to 2009 and totaling more than 400 million words, the team searched for verbs where both regular and irregular past-tense forms were present, for example, “dived” and “dove” or “wed” and “wedded.”
“There is a vast literature and a lot of mythology on verb regularization and irregularization,” Clark said, “and a lot of people have claimed that the tendency is toward regularization. But what we found was quite different.”
Indeed, the analysis pointed to particular instances where selective forces seem to be driving irregularization. For example, while a swimmer 200 years ago might have “dived,” today we would say they “dove.” The shift towards this irregular form coincided with the invention of cars and the concomitant increase in use of the rhyming irregular verb “drive”/“drove.”
Despite finding selection acting on some verbs, “the vast majority of verbs we analyzed show no evidence of selection whatsoever,” Plotkin said.
The team recognized a pattern: random chance affects rare words more than common ones. When rarely-used verbs changed, that replacement was more likely to be due to chance. But when more common verbs switched forms, selection was more likely to be a factor driving the replacement.
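To make the rare-versus-common pattern concrete, here is a minimal sketch of neutral drift, assuming a simple random-copying model in which every use of a verb is copied from the previous generation with no preference for either form. The model, the function names (`drift`, `replacement_rate`) and the numbers (20 versus 500 uses per generation, 60 generations) are illustrative assumptions, not the authors' data or their statistical test.

```python
import random


def drift(n_tokens, start_freq, generations, rng):
    """Return the variant's frequency after neutral copying.

    Each generation, every one of the n_tokens uses of the verb is copied
    at random from the previous generation, with no bias toward either form.
    """
    freq = start_freq
    for _ in range(generations):
        copies = sum(1 for _ in range(n_tokens) if rng.random() < freq)
        freq = copies / n_tokens
        if freq == 0.0 or freq == 1.0:
            break  # one form has fully replaced the other
    return freq


def replacement_rate(n_tokens, trials=100, generations=60, seed=1):
    """Fraction of simulated verbs in which one form dies out by chance alone."""
    rng = random.Random(seed)
    replaced = 0
    for _ in range(trials):
        final = drift(n_tokens, 0.5, generations, rng)
        if final == 0.0 or final == 1.0:
            replaced += 1
    return replaced / trials


# Hypothetical usage counts, chosen only to contrast a rare and a common verb.
rare = replacement_rate(n_tokens=20)     # verb used 20 times per generation
common = replacement_rate(n_tokens=500)  # verb used 500 times per generation
print(f"rare verb replaced by chance:   {rare:.0%}")
print(f"common verb replaced by chance: {common:.0%}")
```

With so few uses per generation, sampling noise alone is usually enough to eliminate one form of the rare verb, while the common verb's two forms both survive over the same period. Under this toy model, a complete replacement in a frequent verb is therefore hard to explain without some directional force.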
The authors also observed a role of random chance in grammatical change. The periphrastic “do,” as used in “Do they say?” or “They do not say,” did not exist 800 years ago. Back in the 1400s, these sentiments would have been expressed as “Say they?” or “They say not.”
Using the Penn Parsed Corpora of Historical English, which includes 7 million syntactically parsed words from 1,220 British English texts, the researchers found that the use of the periphrastic “do” emerged in two stages, first in questions (“Don’t they say?”) around the 1500s, and then roughly 200 years later in imperative and declarative statements (“They don’t say.”).
While most linguists have assumed that such a distinctive grammatical feature must have been driven to dominance by some selective pressure, the Penn team’s analysis questions that assumption. They found that the first stage of the rising periphrastic “do” use is consistent with random chance. Only the second stage appears to have been driven by a selective pressure.
“It seems that, once ‘do’ was introduced in interrogative phrases, it randomly drifted to higher and higher frequency over time,” said Plotkin. “Then, once it became dominant in the question context, it was selected for in other contexts, the imperative and declarative, probably for reasons of grammatical consistency or cognitive ease.”
As the authors see it, it’s only natural that social-science fields like linguistics increasingly exchange knowledge and techniques with fields like statistics and biology.
“To an evolutionary biologist,” said Newberry, “it’s important that language is maintained through a process of copying language; people learn language by copying other people. That copying introduces minute variation, and those variants get propagated. Each change is an opportunity for a different copying rate, which is the basis for evolution as we know it.”
Featured image: copyrighted, modified from the Supplementary information of the article.
Image (c) Cherissa Dukelow, 2017, licensed under CC-BY-NC-SA 4.0 http://creativecommons.org/licenses/by-nc-sa/4.0/
Image (c) Mitchell Newberry, 2017, licensed under CC-BY-NC 4.0 https://creativecommons.org/licenses/by-nc/4.0/ (see materials at University of Pennsylvania for further sources).