mtDNA haplogroup frequency analysis from Verteba Cave supports a strong cultural frontier between farmers and hunter-gatherers in the North Pontic steppe


A new preprint at bioRxiv, led by a Japanese researcher, analyses the mtDNA of Trypillians from Verteba Cave: Analysis of ancient human mitochondrial DNA from Verteba Cave, Ukraine: insights into the origins and expansions of the Late Neolithic-Chalcolithic Cututeni-Tripolye Culture, by Wakabayashi et al. (2017).

Abstract:

Background: The Eneolithic (~5,500 yrBP) site of Verteba Cave in Western Ukraine contains the largest collection of human skeletal remains associated with the archaeological Cucuteni-Tripolye Culture. Their subsistence economy is based largely on agro-pastoralism and had some of the largest and most dense settlement sites during the Middle Neolithic in all of Europe. To help understand the evolutionary history of the Tripolye people, we performed mtDNA analyses on ancient human remains excavated from several chambers within the cave.

Results: Burials at Verteba Cave are largely commingled and secondary in nature. A total of 68 individual bone specimens were analyzed. Most of these specimens were found in association with well-defined Tripolye artifacts. We determined 28 mtDNA D-Loop (368 bp) sequences and defined 8 sequence types, belonging to haplogroups H, HV, W, K, and T. These results do not suggest continuity with local pre-Eneolithic peoples, but rather complete population replacement. We constructed maximum parsimonious networks from the data and generated population genetic statistics. Nucleotide diversity (π) is low among all sequence types and our network analysis indicates highly similar mtDNA sequence types for samples in chamber G3. Using different sample sizes due to the uncertainly in number of individuals (11, 28, or 15), we found Tajima’s D statistic to vary. When all sequence types are included (11 or 28), we do not find a trend for demographic expansion (negative but not significantly different from zero); however, when only samples from Site 7 (peak occupation) are included, we find a significantly negative value, indicative of demographic expansion.

Conclusions: Our results suggest individuals buried at Verteba Cave had overall low mtDNA diversity, most likely due to increased conflict among sedentary farmers and nomadic pastoralists to the East and North. Early Farmers tend to show demographic expansion. We find different signatures of demographic expansion for the Tripolye people that may be caused by existing population structure or the spatiotemporal nature of ancient data. Regardless, peoples of the Tripolye Culture are more closely related to early European farmers and lack genetic continuity with Mesolithic hunter-gatherers or pre-Eneolithic groups in Ukraine.
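
For readers unfamiliar with the statistics named in the abstract, here is a minimal sketch of how the average number of pairwise differences (the basis of nucleotide diversity π) and Tajima's D are computed from aligned sequences. The toy sequences are invented for illustration and are not the Verteba Cave D-loop data. The function names are my own, and the constants follow the standard formula of Tajima (1989).

```python
from itertools import combinations
from math import sqrt


def pairwise_differences(seqs):
    """Average number of differences between all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs]
    return sum(diffs) / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns showing more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D; returns None when it is undefined (n < 4 or no variation)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: every variant is a singleton,
# the pattern expected after a recent expansion.
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
k = pairwise_differences(toy)
print(k, k / len(toy[0]), segregating_sites(toy), tajimas_d(toy))
```

An excess of rare variants pulls the average pairwise difference below the value expected from the number of segregating sites, so D comes out negative. This is why a significantly negative D is read as a sign of demographic expansion.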

Genetic finds keep supporting the long-lasting cultural and linguistic frontier that Anthony (2007), among others, asserted existed in the North-West Pontic steppe during the Mesolithic and Neolithic, between western steppe cultures and farmers. They also disprove Kristiansen’s theories of a Sredni Stog expansion in Kurgan waves, with a mixture of GAC and Trypillia within the Corded Ware culture:

Previous ancient DNA studies showed that hunter-gatherers before 6,500 yrBP in Europe commonly had haplogroups U, U4, U5, and H, whereas hunter-gatherers after 6,500 yrBP in Europe had less frequency of haplogroup H than before. Haplogroups T and K appeared in hunter-gatherers only after 6,500 yrBP, indicating a degree of admixture in some places between farmers and hunter-gatherers. Farmers before and after 6,500 yrBP in Europe had haplogroups W, HV*, H, T, K, and these are also found in individuals buried at Verteba Cave. Therefore, our data point to a common ancestry with early European farmers. Our data also suggest population replacement. Mathieson et al. analyzed a number of Neolithic Ukrainian samples (petrous bone) from several sites in southern, northern, and western Ukraine, dating to ~8,500 – 6,000 yrBP, and found exclusively U (U4 and U5) mtDNA lineages. It should be noted that ‘Neolithic’ in this context does not mean the adoption of agriculture, but rather simply coinciding with a change in material culture. They also analyzed several Trypillian individuals from Verteba Cave (different samples from the those included in this study). Similar to our findings, they found a wider diversity of mtDNA lineages, including H, HV, and T2b. These data, combined with our results, appear to confirm almost complete population replacement by individuals associated with the Tripolye Culture during the Middle to Late Neolithic.

The findings also hint at potential contacts of Yamna with Usatovo, as predicted by Anthony (2007), or alternatively (lacking precise dates) at contacts with Corded Ware migrants:

Trypillians were very much a distinct people who most likely displaced local hunter-gatherers with little admixture. Haplogroup W was also observed in several specimens deriving from Site G3. Although we are unsure if all of these haplogroups come from a single or multiple individuals, this observation is interesting in that it is relatively rare and isolated among Neolithic samples. It has, however, been found in samples dating to the Bronze Age. In the study by Wilde et al. [35], they found haplogroup W present in two samples from the Early Bronze Age associated with the Yamnaya and Usatovo cultures. The Usatovo culture (~ 3500 – 2500 BC) was found in Romania, Moldova, and southern Ukraine. It was the conglomeration of Tripolye and North Pontic steppe cultures. Therefore, this individual could link the Trypillian peoples to the Usatovo peoples and perhaps to the greater Yamnaya steppe migrations during the Bronze Age that lead to the Corded Ware Culture.

On the other hand, an article written purely in terms of mtDNA haplogroup frequencies seems to offer too little proof of anything today. The lack of Y-DNA haplogroups and of admixture data makes its interpretations provisional, subject to change when these further data are published. Also, radiocarbon dating is reliable only for individuals from one site (Site 7), dated ca. 5,500 cal BP, while “other chambers in the cave are not as confidently dated”…

“Based on the 8 sequence types of the mtDNA D-loop, a maximum parsimonious phylogenetic network was constructed. Circles represent the sequence types, and the size of the circle is proportional to the number of samples. Numbers on the branches between the circles are nucleotide position numbers (+16,000) of the human mitochondrial genome sequence (rCRS). Information about the location (chamber within the cave) where the specimen was excavated is also provided. Areas 2 and 17 are part of Site 7, and these are defined as a separate chamber, although they are located in close proximity within Site 7. The other chambers, Site 20, G2, and G3, are independent and separate locations within the cave. ‘Undefined’ chamber describes an unknown location within the cave. Specimens from each chamber showed deviation for the sequence type distribution observed in the sample set. For example, specimens excavated from Site 7 had five unique sequence types, (I, II, III, IV, and VIII), while specimens excavated from chamber G 21 had mainly one sequence type (V)”. Made available by the authors under a CC-BY-NC-ND 4.0 International license.

We had also seen signs of conflict between Trypillian and steppe cultures in a recent article, Violence at Verteba Cave, Ukraine: New Insights into the Late Neolithic Intergroup Conflict, by Madden et al. (2017):

Many researchers have pointed to the huge “megasites” and construction of fortifications as evidence of intergroup hostilities among the Late Neolithic Tripolye archaeological culture. However, to date, very few skeletal remains have been analyzed for the types of traumatic injury that serve as direct evidence for violent conflict. In this study, we examine trauma on human remains from the Tripolye site of Verteba Cave in western Ukraine. The remains of 36 individuals, including 25 crania, were buried in the gypsum cave as secondary interments. The frequency of cranial trauma is 30-44% among the 25 crania, six males, four females and one adult of indeterminate sex displayed cranial trauma. Of the 18 total fractures, 10 were significantly large and penetrating suggesting lethal force. Over half of the trauma is located on the posterior aspect of the crania, suggesting the victims were attacked from behind. Sixteen of the fractures observed were perimortem and two were antemortem. The distribution and characteristics of the fractures suggest that some of the Tripolye individuals buried at Verteba Cave were victims of a lethal surprise attack. Resources were limited due to population growth and migration, leading to conflict over resource access. It is hypothesized that during this time of change burial in this cave aided in development of identity and ownership of the local territory.
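
The abstract does not explain where the 30-44% range comes from. One reading, which is my assumption and not something the authors state, is that the 11 individuals with cranial trauma are counted once against all 36 individuals and once against the 25 crania. The arithmetic fits:

```python
# Counts taken from the abstract above.
individuals_total = 36
crania = 25
with_trauma = 6 + 4 + 1  # males + females + one adult of indeterminate sex

# Assumed reading: the same 11 cases over two different denominators.
low = with_trauma / individuals_total
high = with_trauma / crania
print(f"{low:.1%} to {high:.1%}")  # prints "30.6% to 44.0%"

# Consistency check on the fracture counts reported in the abstract.
perimortem, antemortem, total_fractures = 16, 2, 18
print(perimortem + antemortem == total_fractures)  # prints "True"
```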

Related:

Correlation does not mean causation: the damage of the ‘Yamnaya ancestral component’, and the ‘Future American’ hypothesis

New Ukraine Eneolithic sample from late Sredni Stog, near homeland of the Corded Ware culture

The concept of “outlier” in studies of Human Ancestry, and the Corded Ware outlier from Esperstedt

Marija Gimbutas and the expansion of the “Kurgan people” based on tumulus-building cultures

Before steppe ancestry: Europe’s genetic diversity shaped mainly by local processes, with varied sources and proportions of hunter-gatherer ancestry


The definitive publication of a BioRxiv preprint article, in Nature: Parallel palaeogenomic transects reveal complex genetic history of early European farmers, by Lipson et al. (2017).

The dataset with all new samples is available at the Reich Lab’s website. You can try my drafts on how to do your own PCA and ADMIXTURE analysis with some of their new datasets.

Abstract:

Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.

There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.

This study keeps confirming the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, though, which offers strong reasons to reject the ‘Indo-European from the west’ hypothesis.

Here are, first, the PCA of the samples included in this paper, and then the PCA of ancient Eurasians (Mathieson et al. 2017) and of modern populations (Lazaridis et al. 2014), for comparison of similar clusters:

First two principal components from the PCA. We computed the principal components (PCs) for a set of 782 present-day western Eurasian individuals genotyped on the Affymetrix Human Origins array (background grey points) and then projected ancient individuals onto these axes. A close-up omitting the present-day Bedouin population is shown. From Lipson et al. (2017).
PCA of South-East European and other European samples from Mathieson et al. (2017)
Ancient and modern samples on Lazaridis et al. (2014)
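
The idea behind these plots, computing the axes on present-day individuals and then projecting ancient individuals onto them, can be sketched in a few lines. This is only an illustration with random synthetic genotypes (hypothetical data, not the Human Origins set), using a plain SVD in NumPy. It ignores missing genotypes and the normalisation steps that real pipelines apply, so it is not the procedure used in the papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: rows are individuals, columns are SNPs (0/1/2 allele counts).
modern = rng.integers(0, 3, size=(50, 200)).astype(float)
ancient = rng.integers(0, 3, size=(5, 200)).astype(float)


def fit_axes(reference, n_components=2):
    """Principal axes computed from the reference (present-day) panel only."""
    mean = reference.mean(axis=0)
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(samples, mean, axes):
    """Place samples on the reference axes without letting them shape the axes."""
    return (samples - mean) @ axes.T


mean, axes = fit_axes(modern)
modern_pcs = project(modern, mean, axes)
ancient_pcs = project(ancient, mean, axes)
print(modern_pcs.shape, ancient_pcs.shape)  # prints "(50, 2) (5, 2)"
```

Because the ancient samples never enter `fit_axes`, their position in the plot only says how they relate to present-day variation, which is worth keeping in mind when reading clusters.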


Evolutionary forces in language change depend on selective pressure, but also on random chance


A new interesting paper from Nature: Detecting evolutionary forces in language change, by Newberry, Ahern, Clark, and Plotkin (2017). Discovered via Science Daily.

The following are excerpts of materials related to the publication (written by Katherine Unger Baillie), from The University of Pennsylvania:

Examining substantial collections of annotated texts dating from the 12th to the 21st centuries, the researchers found that certain linguistic changes were guided by pressures analogous to natural selection — social, cognitive and other factors — while others seem to have occurred purely by happenstance.

“Linguists usually assume that when a change occurs in a language, there must have been a directional force that caused it,” said Joshua Plotkin, professor of biology in Penn’s School of Arts and Sciences and senior author on the paper. “Whereas we propose that languages can also change through random chance alone. An individual happens to hear one variant of a word as opposed to another and then is more likely to use it herself. Chance events like this can accumulate to produce substantial change over generations. Before we debate what psychological or social forces have caused a language to change, we must first ask whether there was any force at all.”

“One of the great early American linguists, Leonard Bloomfield, said that you can never see a language change, that the change is invisible,” said Robin Clark, a coauthor and professor of linguistics in Penn Arts and Sciences. “But now, because of the availability of these large corpora of texts, we can actually see it, in microscopic detail, and begin to understand the details of how change happened.”

One change is the regularization of past-tense verbs. Using the Corpus of Historical American English, comprised of more than 100,000 texts ranging from 1810 to 2009 that have been parsed and digitized — a database that includes more than 400 million words — the team searched for verbs where both regular and irregular past-tense forms were present, for example, “dived” and “dove” or “wed” and “wedded.”

“There is a vast literature and a lot of mythology on verb regularization and irregularization,” Clark said, “and a lot of people have claimed that the tendency is toward regularization. But what we found was quite different.”

Indeed, the analysis pointed to particular instances where it seems selective forces are driving irregularization. For example, while a swimmer 200 years ago might have “dived”, today we would say they “dove.” The shift towards using this irregular form coincided with the invention of cars and concomitant increase in use of the rhyming irregular verb “drive”/“drove.”

Despite finding selection acting on some verbs, “the vast majority of verbs we analyzed show no evidence of selection whatsoever,” Plotkin said.

The team recognized a pattern: random chance affects rare words more than common ones. When rarely-used verbs changed, that replacement was more likely to be due to chance. But when more common verbs switched forms, selection was more likely to be a factor driving the replacement.
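
Why chance should matter more for rare words can be illustrated with a neutral copying model of the kind used in population genetics. The sketch below is my own toy simulation, not the authors' method or data: each generation simply resamples a fixed number of uses of a verb, with no preference for either form, and all parameter values are arbitrary assumptions.

```python
import random


def drift_trajectory(n_tokens, generations, rng, start=0.5):
    """Neutral copying: each generation resamples n_tokens uses of a verb,
    choosing the irregular form with probability equal to its current share."""
    freq = start
    for _ in range(generations):
        irregular = sum(rng.random() < freq for _ in range(n_tokens))
        freq = irregular / n_tokens
    return freq


def mean_shift(n_tokens, generations=20, replicates=200, seed=1):
    """Average absolute change in frequency across independent replicates."""
    rng = random.Random(seed)
    shifts = [abs(drift_trajectory(n_tokens, generations, rng) - 0.5)
              for _ in range(replicates)]
    return sum(shifts) / replicates


rare = mean_shift(n_tokens=20)     # a rarely used verb
common = mean_shift(n_tokens=500)  # a frequently used verb
print(f"rare: {rare:.2f}, common: {common:.2f}")
```

With no selection at all, the rare verb wanders much further from its starting frequency than the common one, so a replacement in a rare verb is weak evidence of any directional force.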

The grammar of negating a sentence has changed from “Ic ne secge” (Beowulf, c. 900) to “Ic ne sege noht” (the Ormulum, c. 1100) to “I seye not” (Chaucer, c. 1400) to “I doe not say” (Shakespeare, c. 1600) before returning to the familiar “I don’t say” (Virginia Woolf, c. 1900). A team from Penn used massive digital libraries along with inference techniques from population genetics to quantify the forces responsible for language evolution, such as in Jespersen’s cycle of negation, depicted here. (c) Cherissa Dukelow, 2017, license information below

The authors also observed a role of random chance in grammatical change. The periphrastic “do,” as used in, “Do they say?” or “They do not say,” did not exist 800 years ago. Back in the 1400s, these sentiments would have been expressed as, “Say they?” or “They say not.”

Using the Penn Parsed Corpora of Historical English, which includes 7 million syntactically parsed words from 1,220 British English texts, the researchers found that the use of the periphrastic “do” emerged in two stages, first in questions (“Don’t they say?”) around the 1500s, and then roughly 200 years later in imperative and declarative statements (“They don’t say.”).

These manuscripts show changes from Old English (Beowulf) through Middle English (Trinity Homilies, Chaucer) to Early Modern English (Shakespeare’s First Folio). Penn researchers used large collections of digitized texts spanning the 12th to the 21st centuries to show that many language changes can be attributed to random chance alone. (c) Mitchell Newberry, 2017, license information below

While most linguists have assumed that such a distinctive grammatical feature must have been driven to dominance by some selective pressure, the Penn team’s analysis questions that assumption. They found that the first stage of the rising periphrastic “do” use is consistent with random chance. Only the second stage appears to have been driven by a selective pressure.

“It seems that, once ‘do’ was introduced in interrogative phrases, it randomly drifted to higher and higher frequency over time,” said Plotkin. “Then, once it became dominant in the question context, it was selected for in other contexts, the imperative and declarative, probably for reasons of grammatical consistency or cognitive ease.”

As the authors see it, it’s only natural that social-science fields like linguistics increasingly exchange knowledge and techniques with fields like statistics and biology.

“To an evolutionary biologist,” said Newberry, “it’s important that language is maintained through a process of copying language; people learn language by copying other people. That copying introduces minute variation, and those variants get propagated. Each change is an opportunity for a different copying rate, which is the basis for evolution as we know it.”

Featured image: copyrighted, modified from the Supplementary information of the article.

Image (c) Cherissa Dukelow, 2017, licensed under CC-BY-NC-SA 4.0 http://creativecommons.org/licenses/by-nc-sa/4.0/
Image (c) Mitchell Newberry, 2017, licensed under CC-BY-NC 4.0 https://creativecommons.org/licenses/by-nc/4.0/ (see materials at University of Pennsylvania for further sources).


Correlation does not mean causation: the damage of the ‘Yamnaya ancestral component’, and the ‘Future American’ hypothesis


Human ancestry can only help solve anthropological questions by using all anthropological disciplines involved. I have said that many times in this blog.

Correlation does not mean causation

Really, it does not.

You might think the tenet ‘correlation does not mean causation’ must be evident by now in Statistics, and also to all those using statistical methods in their research. Sadly, it is not so. A lot of researchers just look for correlation and derive conclusions, without even a sound initial hypothesis to test… You can judge for yourself, e.g. by reading the many instances of this complaint about recent publications in Biomedical and Social Sciences on the interesting blog Statistical Modeling, Causal Inference, and Social Science.

In anthropological questions regarding Indo-European studies there is an added handicap: not taking correlation to mean causation also means (to avoid at least the most obvious confounders) taking into account the multiple linguistic and archaeological data available right now to explain the expansion of Indo-European languages.
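
A toy simulation shows how easily a confounder produces a strong correlation between two variables with no causal link between them. Everything here is invented for illustration: `z` stands for any hidden common cause (say, a shared older ancestry), while `x` and `y` are two measurements that each depend on `z` and on nothing else.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(vx * vy)


rng = random.Random(42)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]    # hidden common cause
x = [zi + rng.gauss(0, 0.5) for zi in z]   # depends on z only
y = [zi + rng.gauss(0, 0.5) for zi in z]   # depends on z only

raw = pearson(x, y)
# Subtract the known contribution of z from both variables and correlate again.
adjusted = pearson([xi - zi for xi, zi in zip(x, z)],
                   [yi - zi for yi, zi in zip(y, z)])
print(f"raw r = {raw:.2f}; after accounting for z, r = {adjusted:.2f}")
```

The raw correlation is strong, yet it all but vanishes once the common cause is accounted for. In real data the confounder is of course not handed to you; it has to be looked for in the archaeological and linguistic evidence.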

You might also believe that international researchers in Human Evolutionary Biology – after all, this is essentially a biomedical discipline – are acquainted with statistical methods and their problems when applied to their field. And that scientific journals – and especially those with the highest impact factors, like Nature, Science, or PNAS – have professional, careful reviewers who would never accept papers that equate correlation with causation, especially when Social Sciences are involved (because this alone might make errors grow exponentially…). Sadly, this is obviously not so, either.

https://imgs.xkcd.com/comics/correlation.png

The ‘Yamnaya component’ concept and its damage

From Allentoft et al. (2015), emphasis is mine:

Both studies [Haak et al. (2015) and this one] found a genetic affinity between samples from a central European culture known as Corded Ware, which existed from around 2500 bc, and samples from the earlier Yamnaya steppe culture. This similarity between distant populations is best explained by a substantial westward expansion of the Yamnaya or their close relatives into central Europe (Fig. 1b). Such an expansion is consistent with the steppe hypothesis, which argues that Corded Ware cultures were a conduit for the dispersal of Indo-European languages into Europe.

More interesting than these vague words, and the short, almost invisible suggestion that Yamna may not be exactly the population behind Corded Ware peoples, are the maps that illustrated their risky hypothesis in Nature. They called it the “steppe hypothesis”, just like that, in general terms, as if everyone defending a steppe origin for Proto-Indo-European would support such a model. In fact, they were referring to the specific hypothesis of one of their authors (Kristiansen), one of the few archaeologists who keep Gimbutas’ concept of the ‘Kurgan peoples’ alive, based on the Corded Ware culture:

Allentoft et al. (2015): “They conclude that the Corded Ware culture of central Europe had ancestry from the Yamnaya. Allentoft et al. also show that the Afanasievo culture to the east is related to the Yamnaya, and that the Sintashta and Andronovo cultures had ancestry from the Corded Ware. Arrows indicate migrations — those from the Corded Ware reflect the evidence that people of this archaeological culture (or their relatives) were responsible for the spreading of Indo-European languages. All coloured boundaries are approximate.”

In many publications that followed, the trend has been to reproduce this graphical model, by asserting (or implying) that Bell Beaker peoples were the result of subsequent Corded Ware migrations, and indeed that Corded Ware peoples migrated from the Yamna culture, and were thus the vector of expansion for Indo-European languages in Europe.

All of this is being proven wrong, as I predicted: see Mathieson et al. (2017) and Olalde et al. (2017) for recently studied samples with ‘steppe component’, older than (and unrelated to) the Yamna culture. However, no retraction (or correction, whatever) has been published to date about the concept of the ‘Yamnaya ancestry expansion’, and its consequences.

We shall see then just a rather surreptitious shift in terminology from ‘Yamnaya’ to ‘steppe’ component, to adapt to the new data – i.e. some damage control while the ship of ‘Yamnaya ancestry’ capsizes – but little else. “Earlier ‘Yamnaya ancestry’, you say? Just, you know, let’s call it ‘steppe ancestry’ and shift the expansion of Indo-European languages to one or two thousand years earlier, and done!”

The damage of this post-truth genetics is already done: we will see the unending distribution on the Internet in general, and on social networks in particular, of these grandiose conclusions, of far-fetched Indo-European migration models that include the Corded Ware culture, of simplistic maps with apparently harmless ‘arrows of migration’ (like the above) representing fictional population movements suggesting nonexistent dialectal branches.

You might be one of those sceptics wary of so many boring statistical rules: “But it’s safe reasoning: Yamnaya samples have an ‘ancestral component’ that is found elevated in Corded Ware samples, and less so in Bell Beaker samples, and PCA showed a similar result…so the migration model Yamnaya -> Corded Ware -> Bell Beaker is a priori correct, right?”

The ‘Future American’ hypothesis

Let me illustrate this attractive “Correlation = Causation” argument, using it to solve the problem of Future American languages.

Suppose we live in a future post-apocalyptic world ca. 3500 AD, with no surviving historical records before 3000 AD. None. Just investigation of cultures and their relationship by Archaeology, proto-languages reconstructed and language families identified by Linguistics, etc.

We have thus Future Germanic and Future Romance as the only language families spoken in Future Western Europe and in the Future Americas, in a distribution similar to the present day*, and we have certain somehow related archaeologically-defined cultures on both sides of the Atlantic, like Briton, Iberian, Norman, or Lowlandish, although their distribution remains partly undefined in time and space.

* If you are really curious about this scenario, you can read about the potential evolution of a Future North-American language.

But what languages did the ancestors of Future Americans speak, and who spread them? That question remains far from being settled by our future researchers, in spite of the solidest linguistic and migration models (talking mainly about Briton and Iberian cultures): too many authorities out there questioning them, fighting to impose their own pet theories.

Suddenly, the newly developed field of Human Ancestry comes to save the day. So let’s say we have this map of ancient samples recovered (dated from, say, the 6th to the 18th century AD), and our study is centered on the newly described “Western European” component (a precise combination of, say, WHG+steppe), which peaks in early samples from the Low Lands – hence we call it, quite daringly, “Lowlandic component“.

Our group is keen to demonstrate that the ancient Lowlandic culture described in Archaeology (marked especially by the worldwide distribution of tulips among other traits) is the origin of Western European and American languages… Now, let’s reach conclusions about migrations in the Middle Ages!

‘Future American’ hypothesis. Migration routes in Western Europe and the Americas during the Middle Ages, based on the ‘Lowlandic component’.

PCA shows that South-West European samples cluster closely to some North-West European samples, and that some late South American samples available cluster at some distance from North American samples – nearer to a native component represented by two individuals with 0% Lowlandic ancestry and a different cluster in PCA. And some North-American samples cluster quite closely to North-West European samples.

Based on the decrease in ‘Lowlandic component’ in the different samples and on PCA, we conclude that Lowlandic peoples (“or their close relatives”) must have migrated at the same time to North America, South America (or potentially from North America to South America?) as well as western, central, and northern Europe. Both migration events must have happened roughly at the same time, in part because both distinct language families appear in a north-south distribution, and Proto-Lowlandic must be (according to Genetics) the ancestor of both, Proto-Future-Germanic and Proto-Future-Romance.

That makes a lot of sense! A huge Lowlandic pressure for migration, you see. Push-pull mechanisms and stuff. A Lowlandic Empire probably (scattered remains are found everywhere)! And, judging by the presence of the ‘Lowlandic component’ in Future East Europe from the Elbe to the Vistula, maybe Lowlandic peoples spread Proto-Slavic, too! We can even date the common Lowlandic-Slavic proto-language this way! So many groundbreaking conclusions!

Future scholars supporting the Lowlandic homeland are on fire; they can’t get enough of publishing papers on the subject. “Two different Future American language families with cultural origins in Britain and Iberia, my ass! Because genetics.”

And don’t forget the future people of haplogroup R1b-U106 and high Lowlandic component: Wow, they are the heirs of those who expanded Future Germanic and Future Romance languages everywhere, aren’t they? How proud they must be. And who wouldn’t want to have these tall, blond, blue-eyed Lowlanders as their forefathers? Personalised genetic analysis is selling like crazy: “let’s know our Lowlandic percentage!”. Everyone is happy, colourful maps with lots of arrows and shit…

But – your future you might ask in awe, seeing that this doesn’t sound quite right, based on your basic archaeological and linguistic knowledge:

  • What about specific models of migration proposed to date? The solidest ones, not just any one that seems to fit?
  • What about the dialectal classification of languages? The mainstream ones, not those that are compatible with this interpretation?
  • What about the archaeological cultures to which individual samples belonged?
  • What about the actual dates of each sample? And how does this date relate to the state of the culture to which it belongs?
  • What about the haplogroups, and the actual subclade of each haplogroup?
  • What about the territories, cultures, and dates not sampled? Could they change this interpretation in light of known archaeological models?
  • And what about the actual origin of that ancestral component they so frivolously named? Did it really appear ex nihilo in the Low Lands, and expand from there?

“Who cares! This new data is sooo coool… And it proves what we wanted, what a coincidence! And it’s numbers, mate! Numbers don’t lie.”

 
No, numbers don’t lie. But people do.

Correlation is fun, isn’t it?

 


Schleicher’s Fable in Proto-Indo-European – pitch and stress accent


Also included in our monograph North-West Indo-European (first draft) is a tentative reconstruction of Schleicher’s fable in North-West Indo-European, and just for illustration of the reconstructed sounds (including pitch and stress accent), a recording has been included.

The recording is available as audio (see above) or video (see below) with captions and multiple subtitles. The captions in North-West Indo-European show acute accents over accented vowels, while stressed syllables are underlined:

I think such a recording was necessary for comparison with the most commonly reconstructed pronunciation, as usually taught in courses. And I am not referring to those professors who still use only stress accent (instead of pitch accent) to pronounce PIE, but to those who, while using pitch accent, place stress on the same syllable.

A good example to illustrate my point is Andrew M. Byrd‘s reading of his version of the fable for the journal Archaeology.

Apart from some controversial decisions regarding the Proto-Indo-Hittite reconstruction (see our explanation of our version, or e.g. Kortlandt’s reconstruction of the Fable (PDF), for more details), his recitation does not seem to contrast pitch and stress accent enough, to the extent that pitch and stress seem to be always on the same syllable. He specialises in Proto-Indo-European phonology, so maybe it is a deliberate choice.

Firstly, as an introduction – in case you don’t know anything about this question – a pitch accent is reconstructed for Proto-Indo-European, based on the reconstructed accent of Old Indian, Greek, Germanic, and Balto-Slavic – hence also valid for North-West Indo-European, even though Italo-Celtic lost it completely.

If you have listened to any tonal language*, you will have noticed that words also have a stress accent, and not necessarily on the same syllable as the tone, but usually on the heaviest one. In fact, I don’t know of an accent pattern with pitch+stress on the same syllable (except for certain reconstructed, labile intermediate stages of a language), and I guess it is so redundant that such a system would always lose one of them.

*pitch-accent systems are also tonal systems, after all, since they involve at least two tones: an acute or rising one, and usually a falling one after it.

You can listen to a sample of the Homeric recitation by Stephen Daitz, with restored Ancient Greek pronunciation, where he contrasts pitch and stress beautifully:

Note: you can buy his readings in restored pronunciation online from Bolchazy-Carducci Publishers. I can’t recommend them highly enough.

You can listen to other samples of Ancient Greek with restored pronunciation by Stefan Hagel (whose Homeric singing is superb), or many others.

To see what I mean by the lack of contrast in Byrd’s pronunciation, just compare the restored pronunciation with these samples of restored Koine Greek from the Biblical Language Center. I think you can hear pitch accent pronounced, but always with stress on the same syllable. After a while, it gets quite monotone (no pun intended); for me, at least*.

*It seems to be, nevertheless, one of the top rated pronunciations of Koine Greek out there.

Pitch accent in my pronunciation is not as noticeable as in that of Stephen Daitz, and still less than in that of Stefan Hagel. But it is not intended to be.

I wanted to combine tone and stress as naturally as possible, as it is found in modern languages, like Chinese, or like South Slavic, Baltic, or Scandinavian languages. I believe PIE phonology cannot be too different from modern natural examples.

Many Modern Greek scholars complain about the artificiality of the restored pronunciation. I’ve heard particularly harsh criticism against Stefan Hagel’s pronunciation: many scholars do not recognise the ancestral language in the restored pronunciation.

While such critics may seem like snob reactionaries, and I really appreciate an exaggerated poetic style for epic poems (I have spent hundreds, probably thousands, of hours listening to Stephen Daitz), I don’t think this is the way Ancient Greek was usually spoken. Listening to Hagel’s pronunciation in the Ancient Greek Assimil, there is a huge contrast between readers who don’t use the restored pronunciation in the recordings (offering thus a decaffeinated Ancient Greek), and Hagel’s reading (or, almost, singing).

In my interpretation of the fable I have tried to follow these ideas, and maybe in the end the pitch accent is not as acute as it should be (a fifth higher). On the other hand, it seemed more natural to me this way.
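For a sense of scale, an interval of a fifth corresponds to a frequency ratio of about 3:2. The following is only a minimal sketch in Python of that arithmetic; the base pitch of 120 Hz and the function name `fifth_above` are assumptions chosen for illustration, not measurements from the recording.

```python
def fifth_above(base_hz: float, equal_tempered: bool = False) -> float:
    """Return the frequency a perfect fifth above base_hz.

    Uses the just-intonation ratio 3:2 by default, or the
    equal-tempered ratio 2**(7/12) (seven semitones) if requested.
    """
    ratio = 2 ** (7 / 12) if equal_tempered else 3 / 2
    return base_hz * ratio


base = 120.0  # hypothetical speaking pitch in Hz, for illustration only
print(fifth_above(base))                                   # just intonation
print(round(fifth_above(base, equal_tempered=True), 1))    # equal temperament
```

With these assumed values, an accented syllable a full fifth higher would sit around 180 Hz against a 120 Hz baseline, which gives an idea of how marked the contrast would be.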

Also, in the final version of my reading, there are many words where it is not clear – not even to me – if there is more than one syllable with pitch or stress accent. This is especially so after my first change of voice to make a more acute ‘sheep voice’, and it then worsens with my graver ‘horse voice’. I really thought recording this was going to be easier!

If you have any comments or suggestions on the pronunciation, they are all welcome.

UPDATE (November 2, 2017): Frederik Kortlandt comments on our paper: “When comparing PIE with other tonal languages, the best candidate is Japanese, which means that the “stress” falls on the last High syllable of a word form or sequence of connected word forms.”

Human ancestry: how to work your own PCA, ADMIXTURE analyses for human evolutionary and genealogical studies


I wrote two days ago, in the post announcing the revised version (October 2017) of the Indo-European demic diffusion model, about dumping the information I had on doing PCA and ADMIXTURE analyses as ‘drafts’, without reviewing them, in the new section of this website called Human Ancestry.

I had some time today to review them, and to correct gross mistakes in the texts, so that they might be more usable now.

I began to work with free datasets to see if I could learn something more about the results of recent genetic research by working with the available free software. For the moment, I don’t see it necessary to continue working with samples myself, because there are many professionals in Bioinformatics doing an excellent job with their publications – much better than I could do – and publishing results early (as pre-prints) and with free licenses, which allow us to reuse and modify their material. To work again with their samples seems most of the time like reinventing the wheel.

After all, my interpretation of Indo-European migrations does not depend on my own analysis of free datasets – or on genetic analysis, or on archaeological fieldwork, for that matter – but on the study of all anthropological questions involved. I am actually more interested in Linguistics, and – only marginally – in Archaeology, as is the field of Indo-European Studies in general.

I did find certain interesting aspects that I have commented on in the model, though: especially by labelling all samples and reading about them carefully (usually in the supplementary notes of the published papers), you can observe certain patterns and derive some information that others might have missed. Such examples include the Corded Ware outlier from Esperstedt (see more on the Corded Ware migration), or the differences in the three samples from early Khvalynsk.

Now that most published data seem to keep supporting what I have suggested – regarding the more complex nature of the steppe component (so-called ‘yamnaya component’), and also regarding the migration from Yamna to Bell Beaker, and a migration of a different population (and probably language) with Corded Ware – I don’t find it worthwhile to spend more of my quite limited time on these tasks.

However, if I need to work with datasets again, I will try to complete the drafts as best I can, especially regarding F3 Statistics and qpGraph, which I didn’t even try. If you want to help improve the sections, you are of course welcome.

If I find time, I might be of help with your work. And even though modern genealogy does not interest me (for the moment), I guess it can also be relevant for drawing conclusions about more recent migrations, so if I can be of help with any interesting work, I will do that too.

3D plot of the datasets Minoans and Mycenaeans + Scythians and Sarmatians, using the same colours as in the Indo-European demic diffusion model.

Related:

  • The concept of “outlier” in studies of Human Ancestry, and the Corded Ware outlier from Esperstedt
  • New Ukraine Eneolithic sample from late Sredni Stog, near homeland of the Corded Ware culture