Mitogenomes show continuity of Neolithic populations in Southern India

New paper (behind paywall) Neolithic phylogenetic continuity inferred from complete mitochondrial DNA sequences in a tribal population of Southern India, by Sylvester et al. Genetica (2018).

The paper presents a complete mtDNA genome study of 113 unrelated individuals from the Melakudiya tribal population, a Dravidian-speaking tribe from the Kodagu district of Karnataka, Southern India.

Some interesting excerpts (emphasis mine):

Autosomal genetic evidence indicates that most of the ethnolinguistic groups in India have descended from a mixture of two divergent ancestral populations: Ancestral North Indians (ANI) related to People of West Eurasia, the Caucasus, Central Asia and the Middle East, and Ancestral South Indians (ASI) distantly related to indigenous Andaman Islanders (Reich et al. 2009). It is presumed that proto-Dravidian language, most likely originated in Elam province of South Western Iran, and later spread eastwards with the movement of people to the Indus Valley and later the subcontinent India (McAlpin et al. 1975; Cavalli-Sforza et al. 1988; Renfrew 1996; Derenko et al. 2013). West Eurasian haplogroups are found across India and harbor many deep-branching lineages of Indian mtDNA pool, and most of the mtDNA lineages of Western Eurasian ancestry must have a recent entry date less than 10 Kya (Kivisild et al. 1999a). The frequency of these lineages is specifically found among the higher caste groups of India (Bamshad et al. 1998, 2001; Basu et al. 2003) and many caste groups are direct descendants of Indo-Aryan immigrants (Cordaux et al. 2004). These waves of various invasions and subsequent migrations resulted in major demographic expansions in the region, which added new languages and cultures to the already colonized populations of India. Although previous genetic studies of the maternal gene pools of Indians had revealed a genetic connection between Iranian populations and the Arabian Peninsula, likely the result of both ancient and recent gene flow (Metspalu et al. 2004; Terreros et al. 2011).

Haplogroup HV14

mtDNA haplogroup HV14 has prominence in North/Western Europe, West Eurasia, Iran, and South Caucasus to Central Asia (Malyarchuk et al. 2008; Schonberg et al. 2011; Derenko et al. 2013; De Fanti et al. 2015). Although Palanichamy identified haplogroup HV14a1 in three Indian samples (Palanichamy et al. 2015), it is restricted to limited unknown distribution. In the present study, by the addition of considerable sequences from the Melakudiya population, a unique novel subclade designated as HV14a1b was found with a high frequency (43%) allowed us to reveal the earliest diverging sequences in the HV14 tree prior to the emergence of HV14a1b in Melakudiya. (…) The coalescence age for haplogroup HV14 in this study is dated ~ 16.1 ± 4.2 kya and the founder age of haplogroup HV14 in Melakudiya tribe, which is represented by a novel clade HV14a1b is ~ 8.5 ± 5.6 kya

Maximum Parsimonious tree of complete mitogenomes constructed using 38 sequences from Melakudiya tribe and 11 previously published sequences belonging to haplogroup HV14 [Supplementary file Table S2] Suffixes @ indicate back mutation, a plus sign (+) an insertion. Control region mutations are underlined, and synonymous transitions are shown in normal font and non-synonymous mutations are shown in bold font. Coalescence ages (Kya) for complete coding region are shown in normal font and synonymous transitions are shown in Italics

Haplogroup U7a3a1a2

The coalescence age of haplogroup U7a3a1a2 dates to ~ 13.3 ± 4.0 kya. (…)

Although, haplogroup U7 has its origin from the Near East and is widespread from Europe to India, the phylogeny of Melakudiya tribe with subclade U7a3a1a2 clusters with populations of India (caste and tribe) and neighboring populations (Irwin et al. 2010; Ranaweera et al. 2014; Sahakyan et al. 2017), hint about the in-situ origin of the subclade in India from Indo-Aryan immigrants.

I am not a native English speaker, but this paper looks like it needs a revision by one.

Also – without comparison with ancient DNA – it is not enough to show coalescence age to prove an origin of haplogroup expansion in the Neolithic instead of later bottlenecks. However, since we are talking about mtDNA, it is likely that their analysis is mostly right.

Finally, one thing is to prove that the origin of the Indus Valley Civilization lies (in part) in peoples from the Iranian plateau, and to show with ASI ancestry that they are probably the origin of Proto-Dravidian expansion, and another completely different thing is to prove an Elamo-Dravidian connection.

Since that group is not really accepted in linguistics, it is like talking about proving – through that Iran Neolithic ancestry – a Sumero-Dravidian, or a Hurro-Dravidian connection…

Related

Yet another Bayesian phylogenetic tree – now for Dravidian

Open access A Bayesian phylogenetic study of the Dravidian language family, by Kolipakam et al. (including Bouckaert and Gray), Royal Society Open Science (2018).

Abstract (emphasis mine):

The Dravidian language family consists of about 80 varieties (Hammarström H. 2016 Glottolog 2.7) spoken by 220 million people across southern and central India and surrounding countries (Steever SB. 1998 In The Dravidian languages (ed. SB Steever), pp. 1–39: 1). Neither the geographical origin of the Dravidian language homeland nor its exact dispersal through time are known. The history of these languages is crucial for understanding prehistory in Eurasia, because despite their current restricted range, these languages played a significant role in influencing other language groups including Indo-Aryan (Indo-European) and Munda (Austroasiatic) speakers. Here, we report the results of a Bayesian phylogenetic analysis of cognate-coded lexical data, elicited first hand from native speakers, to investigate the subgrouping of the Dravidian language family, and provide dates for the major points of diversification. Our results indicate that the Dravidian language family is approximately 4500 years old, a finding that corresponds well with earlier linguistic and archaeological studies. The main branches of the Dravidian language family (North, Central, South I, South II) are recovered, although the placement of languages within these main branches diverges from previous classifications. We find considerable uncertainty with regard to the relationships between the main branches.

MCC tree summary of the posterior probability distribution of the tree sample generated by the analysis with the relaxed covarion model with relative mutation rates estimated. Node bars give the 95% highest posterior density (HPD) limits of the node heights. Numbers over branches give the posterior probability of the node to the right (range 0–1). Colour coding of the branches gives subgroup affiliation: red, South I; blue, Central; purple, North; yellow, South II.

With every new paper using these revamped pseudoscientific linguistic methods popular in the early 2000s (glottochronology, Swadesh lists, phylogenetic trees, mutation rates, etc.), I feel a little more like Sergeant Murtaugh…

Featured image, from the article: “Map of the Dravidian languages in India, Pakistan, Afghanistan and Nepal adapted from Ethnologue [2]. Each polygon represents a language variety (language or dialect). Colours correspond to subgroups (see text). The three large South I languages, Kannada, Tamil and Malayalam are light red, while the smaller South I languages are bright red. Languages present in the dataset used in this paper are indicated by name, with languages with long (950 + years) literatures in bold.”

See also:

The Aryan migration debate, the Out of India models, and the modern “indigenous Indo-Aryan” sectarianism

The Proto-Indo-European Urheimat

Not long ago, the Proto-Indo-European language Urheimat problem used to be cyclic in nature: linguistic and archaeological publications appeared supporting a Copper Age migration from the steppe proposed by Marija Gimbutas, or a Neolithic expansion from Anatolia (or Armenia) proposed by Colin Renfrew, and back again.

I have always supported the simpler, more recent Chalcolithic migration of Late Indo-Europeans from the Pontic-Caspian steppe over an older Neolithic expansion from Anatolia with agriculture. The latter model implied a complex cultural diffusion over a greater span of time than is warranted by linguistic guesstimates, understood as the general grasp that anyone can have on how much a language changes in time, comparing the different stages of different Indo-European languages. Whether they like to talk about it or not, or whether they would describe them as such (or else as terminus ante or post quem), most known linguists and archaeologists involved in Indo-European studies have published at some point their own guesstimates.

To get an idea of how guesstimates work, you only have to learn some Indo-European languages from different branches, the ancient languages from which they are derived, how they have evolved from them through time, and their proto-languages, to see how unlikely it is that the differences from Late Indo-European to Proto-Greek, Proto-Indo-Iranian, Proto-Celtic, or Proto-Italic need a leap of ca. 3000 years almost without change, as required by the Anatolian hypothesis. Some react strongly against guesstimates, arguing that you cannot compare historic or proto-historic changes to prehistoric ones, in order to support a different rate of linguistic change from Proto-Indo-European to the proto-languages. I find this to be a sound criticism, but it is often used to justify a worse, ad hoc estimate that supports another theory.

Glottochronology – in case you are looking for mathematics or statistics to solve the problem – is as useless today as it always was. Not everything – in fact few things in anthropology – can be solved with algorithms and statistics. I do love algorithms and statistics, because their results – if based on sound assumptions – are hard to contest, but not a single good one has been proposed for comparative grammar, as far as I know.

Algorithms solve everything

Steppe hypothesis

The steppe hypothesis was always the simpler connection with modern Indo-European languages, from a linguistic and archaeological point of view, and archaeogenetics (since the advent of haplogroup investigation, and the finding of modern R1a distribution) also supported it. However, it implied a conquest by warring patrilocal peoples who replaced the ‘original’ Neolithic European and Asian populations and languages, and invasions have not been a fashionable anthropological subject for a long time.

One of the consequences of the genocidal racism and xenophobia seen during World War II was the strong reaction to its ideological foundations, and there was a common will to put an end to Kossinna’s trend of historic ethnolinguistic identification of modern peoples. Linguistics and archaeology then searched for more complex models of human relations and exchange, mostly to avoid what appeared as simplistic concepts of migration or invasion. Marija Gimbutas’ simplistic kurganist, male-driven invasion of territories inhabited by matrilocal Old Europeans, albeit reasonable, did not fit well with these post-war times. One could accept historic and proto-historic atrocities and genocide by any people against others, and even tribal conflicts between prehistoric hunter-gatherers that ended in the destruction of one of them, but a violent, massive spread of ‘Aryans’ was considered a dangerous idea to be avoided.

Thanks to the effort of David Anthony (among others) in supporting migration models in Archaeology, the steppe model did have a strong revival even before archaeogenetics began to be a thing in anthropological research.

Anatolian hypothesis

The Anatolian hypothesis, on the other hand, seemed like a fine, long evolution of a language accompanying the peaceful spread of a technological innovation, farming and cattle herding. Originally believed to be mostly a cultural diffusion (now it has been demonstrated to be a mixed diffusion event, with strong demic diffusion in its early phase), it was thus in line with a more politically correct view of prehistoric events.

This cultural diffusion in turn gave way to more peaceful and innovative solutions to language spread, like waves of expansion, or a constellation of languages influencing each other for long periods, so that even the potential reconstruction of a single Proto-Indo-European language or people was doubted. Prehistoric friendly neighbours would have adopted farming and exchanged goods and languages for thousands of years, and only with proto-historic events did people have ethnolinguistic identification that caused conflicts…

While some doubts have recently been expressed by Mathieson et al. (2017) about the steppe hypothesis regarding Proto-Anatolian, it is likely that the lack of enough ancient DNA from the Balkans and Anatolia is the key factor here.

An interesting linguistic proposal, the glottalic theory, while sound in its assumptions and results – much less likely in my opinion than the more common two-dorsal theory, and this much more likely than the prevalent three-dorsal one – gave some theoretical support to the Anatolian (or Armenian) hypothesis, since some proponents felt that a glottalic Proto-Indo-European should have an origin near to the Armenian homeland – because glottalic Proto-Armenian would have retained a phonetic state nearer to the “original” Proto-Indo-European.

That simplistic regional continuity explanation is akin to the trend of Basque researchers to discover links of Proto-Basque with the Pyrenees in Mesolithic and Palaeolithic times, when there is no data to warrant such identifications – and it seems in fact that Proto-Basque, Proto-Iberian, and Palaeo-Sardinian might have accompanied the expansion of farming in the Neolithic. Most of the remaining proponents of the glottalic theory today (like Frederik Kortlandt and Alan Bomhard) would probably accept a steppe migration unrelated to an Armenian or Anatolian origin.

Marginal proposals

There were indeed other marginal proposals, with people supporting origins of Proto-Indo-European in both ends of the current distribution of Indo-European languages, from the “Indo-” in Out of India theories, to the “-European” in Eurocentric proposals. Most Eurocentric proposals – based on certain archaeological cultures and their evolution in- and outside Europe – have been dismissed with archaeological and genetic research, and the remaining ones usually favour the more fashionable peaceful spread of languages.

Palaeolithic Continuity Theory

A small group in support of the more recent Palaeolithic Continuity Theory remains. It seems to me deeply flawed from a linguistic point of view (with a much larger time span needed than for a Neolithic expansion), but their arguments are led by research on genetics and archaeology, and not much is left for European romanticism, so it has always appeared to me as a professionally acceptable – although futile – attempt by eccentric researchers to disentangle prehistoric events.

Similar to what happens with proponents of the Anatolian hypothesis, new linguistic, archaeological, and genetic research is used to remake PCT models – instead of just dismissing them – so it is likely that we will have many different proposals of stepped population movements that will make both models eventually converge with the steppe migration theory, to the point where only the steppe migration theory remains, with some added details on its most ancient origin. I guess sometimes it is difficult to let (part of) your life’s research just go away without fighting for some recognition… You desperately look for a pat on the back from some colleagues, even out of pity, who will tell you ‘it seems you might have been right in some details, after all!’…

Out of India

The Out of India theory is the name given to a group of (mostly) independent models that usually propose a Proto-Indo-European homeland based in or around India. Contrary to the PCT, an Out of India theory set during the Mesolithic or Neolithic would be feasible from a linguistic point of view: you could somehow connect some archaeological migrations to support an east-to-west (and northward) spread of Early Proto-Indo-European-speaking R1a lineages, and genetically it had support in some papers on the modern distribution of R1a subclades, for example in Underhill et al. (2014). Underhill himself has since questioned his conclusion in view of recent papers publishing ancient DNA analysis.

Out of India theories, overall, could thus be as strong (or as weak) as the theories concerning an Anatolian origin, in their potential for explanation of the ancient origin of the Proto-Indo-European language spoken in the steppe during the Neolithic and Chalcolithic. However feasible they might a priori be, I have yet to encounter a decent modern paper with that kind of proposal, based on recent genetic papers. Most modern articles are just Indian nationalist crap, and the only decent papers on this matter are becoming quite old for this relatively young field of Indo-European studies. Maybe that’s because I don’t have enough time to look for the hidden good anthropological papers among so much dirt. After all, it is not a very likely theory, and one has a limited amount of time.

In recent papers, if you get rid of simplistic reactionary and revisionist views, conservative Indo-Aryan Hindu nationalist or religious bigotry, fantastic connections with the Indus Valley civilization, and simplistic identifications of Proto-Indo-European as ‘nearer’ to Vedic Sanskrit – with absurdly old and odd references to Schleicher’s reconstruction and dialectal Indo-Slavonic or Satem references – you are left at best with some basic criticisms of Eurocentrism and the known shortcomings of anthropological disciplines in investigating the Proto-Indo-European Urheimat, but no data to support any connection with India whatsoever.

If there is a reason for a generalised inferiority complex in India, I would find it in the shameless publication and popularity of such worthless research papers, a trend that is also seen in scientific fields, with Indian researchers having an increasingly tough time passing editorial and peer reviews, and thus resorting to national journals. In the case of Indo-European studies, instead of trying to fit data with what we know, the only aim in Indian research seems to be to connect the Indus Valley with Proto-Indo-European, and Proto-Indo-European with a “pure” (i.e. Vedic) Indo-Aryan, to support a mythological Indo-Aryan Hinduist India. And that is mostly what you will find in any Out of India article today, whether based on linguistic, archaeological, or – what is prevalent today – genetic investigation.

This has been The Out of India Controversy Week: it began last week with the publication of a quite decent article in The Hindu by Tony Joseph summing up the current situation of anthropological research. It was followed by reactions in conservative Indian news, and this in turn was contested by Davidski and Razib Khan. The original article by Tony Joseph has been echoed by Victor Mair in Language Log, and I agree with his description of Joseph’s article as “informed, sensitive, balanced, and nuanced. This is responsible science journalism”, even if I disagree with some of his statements (in a different way than Mr. Mair). However, this propaganda disguised as scientific criticism is what you get from Indian nationalists.

EDIT (25/6/2017): Razib Khan has published a thorough post on Indian evolutionary genetics as follow-up to this week’s controversy. I think there is too much effort being invested during these controversies precisely by the people who need not explain themselves. Anyway, good summaries of anthropological matters are always welcome.

EDIT (29/6/2017): Other posts on the subject, from Brown Pundits: On the “Aryan” debate – the linguistics POV; Razib Khan’s Indian genetics, part n of many; and Aryan Migration and its Discontents.

Interestingly, any time new research comes to shake certain Indian nationalist foundations, a stronger backfire effect happens, and more criticism is levelled at the shortcomings of such anthropological research. Because, indeed, if the anthropological theory is flawed, mythical Indo-Aryans spread from the Indus Valley, right…? One can only expect this kind of controversy to escalate in conservative Indian blogs and fora alike, and then de-escalate until the next paper is published. A dialectic cycle whose only evident result is the increased opposition that conservative Indian researchers – or researchers that depend on funding by such groups – will face in publishing anything related to a potential Aryan invasion, and the addition of a stronger bias in Indian research.

Western European history

It might well be because I am western European, and western Europeans tend to accept multiple invasions from the East quite well. After all, they have happened so many times in proto-historical and historical times that they are part of our ethnolinguistic nation-building lore. French people trace their history to the expansion of Celts, Romans, and Franks; Spaniards and Portuguese trace it to the spread of Celts, Ibero-Basques, Romans, and Visigoths; Italians to the expansion of Etruscans, Celts and Italics, Romans, Ostrogoths and Langobards; the English to the expansion of Celts, Angles and Saxons, Vikings, and Normans…

It often seems to me that western Europeans will romanticise their origins no matter what appears in historic and genetic investigation: if Neanderthals are unrelated to Europeans, they are ‘cavemen’; if they intermixed with our ancestors, then they suddenly become quite human in their behaviour, and it is great to have more Neanderthal admixture. If Indo-European-speaking R1a lineages invaded central Europe from the east, and transferred their languages, great, because “we” are heirs of original western European hunter-gatherers of Palaeolithic R1b lineages; if R1b lineages represent an invasion of eastern peoples speaking Late Indo-European, great too, because it means that our paternal forefathers were the ‘original’ Indo-European speakers…

This reaction (our history is great no matter what) seems to be a good one for research, since it allows for any change in our romantic views of the past. This, however, does not seem to be the case for some nations, and this inability to change their views is likely related to the inferiority complex that some nations have developed, in turn probably caused by western European colonialism, so one is left to wonder how responsible we are for modern chauvinist trends.

The sad future

Seeing how so many people of eastern European ancestry are convinced of an origin of R1a-M417 in Indo-European migrations from Yamna – when there is (yet?) not a single piece of evidence for it – may be just as troubling as the Indian case, or maybe more, since it affects an important part of Europe. I cannot believe that even today only western Europeans are capable of romanticising their own past no matter what, while the rest of the world lives in a quest to appropriate whatever they view as some great ancient culture, people, or language for their own ancestors.

I have already received complaints and have seen people (of Y-DNA haplogroup R1a) complain online that their forefathers cannot have been Uralic speakers, and some Uralic speakers (of haplogroup N) that original Uralic speakers cannot have been of R1a lineages. Firstly, if I were eastern European – be it a Germanic, Balto-Slavic, or Uralic speaker – or a speaker of Indo-Aryan languages, of R1a or N lineage, whatever my country of origin, I like to think I would prefer to know where my forefathers actually came from, and what languages they did in fact speak thousands of years ago, even if that disrupts everything I or my fellow countrymen (wrongly) assumed for a long time. Secondly, we – as western Europeans speaking Romance or Germanic languages – have the right to know exactly how our peoples and languages really came to be, even if that means disrupting others’ dreams. Our paternal ancestors probably changed languages 3 or 4 times during their multiple migrations from the east, and were not peaceful hunter-gatherers living since the Palaeolithic in the same region we do now, as traditionally held; if we can get over this, eastern Europeans and Indians can get over it, too.

I think everyone deserves to know the truth, and they will eventually like it and fantasise about it. But many individuals want to block any possible change to keep their current ethnic and nationalist agendas untouched, and that can affect us all. Nationalistic and romantic trends are understandable: Romans needed Virgil at the peak of their conquests to tell them that they had a glorious past in Troy, connecting them to the immortal Greek epics. The most important lesson one can learn from that example is that Italian researchers are still (2000 years later!) influenced by that myth, and they keep trying to look for Anatolian remains in Latin studies, and in the archaeology and evolutionary genetics of Italy. I guess you could therefore say these mythification trends are naturally human… but wasting so much time on quests for mythological identities seems absurd, and can only damage research.

It is sad to think about future generations of Indians looking for any sign to support an autochthonous Indo-Aryan homeland, while the rest of the world keeps moving in the right direction…

(Note: featured image is licensed CC-by-sa 4.0 from Avantiputra7 at Wikipedia)