Mitogenomes show ancient human migrations to and through northern India not of males exclusively


New open article Ancient Human Migrations to and through Jammu Kashmir- India were not of Males Exclusively, by Sharma et al., Scientific Reports 8, N. 851 (2018)


Jammu and Kashmir (J&K), the Northern most State of India, has been under-represented or altogether absent in most of the phylogenetic studies carried out in literature, despite its strategic location in the Himalayan region. Nonetheless, this region may have acted as a corridor to various migrations to and from mainland India, Eurasia or northeast Asia. The belief goes that most of the migrations post-late-Pleistocene were mainly male dominated, primarily associated with population invasions, where female migration may thus have been limited. To evaluate female-centered migration patterns in the region, we sequenced 83 complete mitochondrial genomes of unrelated individuals belonging to different ethnic groups from the state. We observed a high diversity in the studied maternal lineages, identifying 19 new maternal sub-haplogroups (HGs). High maternal diversity and our phylogenetic analyses suggest that the migrations post-Pleistocene were not strictly paternal, as described in the literature. These preliminary observations highlight the need to carry out an extensive study of the endogamous populations of the region to unravel many facts and find links in the peopling of India.


To conclude, the extent of presence of variants defining novel HGs or personal variants indicate high diversity in maternal genetic component of the population of J&K. Statistical analyses indicate that maternal population in J&K have undergone expansion, along with other regions of Indian sub-continent [9]. However, signatures of maternal gene pool expansion in the region past LGM and early Holocene era are also seen, and this is a unique observation for the present study. These distinct signatures and maternal lineages, never reported before in India, apparently suggest that this region might have served as a corridor, yet also as a reservoir for many unreported lineages.

The overall diversity seen in the maternal gene pool of J&K suggests that the migrations to and through this region were not exclusively of males. This data has refined the existing phylogenetic tree and added to the information further diversity of mtDNA in Indian populations. Further, this preliminary study highlights the importance of the region and emphasizes that the populations of this region should be studied extensively to understand the gene pool of Indian populations. Along with the Y chromosomal and mtDNA markers, a study of autosomal markers is also warranted in these population groups. It is anticipated to help in finding some of the missing links in the evolution of modern humans and their migratory history to and from the mainland India and the Indian subcontinent, a future perspective of our study. Further, we would like to emphasize that the endogamous populations should be studied with respect to their individual evolutionary and migration histories, rather than pooling these together as one group, an underlying drawback that has plagued many of the Indian population based studies in the past, diluting individual signatures and masking stories their DNA has to tell.


Modern Hungarian mtDNA more similar to ancient Europeans than to Hungarian conquerors


New preprint at bioRxiv, MITOMIX, an Algorithm to Reconstruct Population Admixture Histories Indicates Ancient European Ancestry of Modern Hungarians, by Maroti et al. (2018).

The estimated age distribution of the shared mt Hgs between Hungarians (Hun), the best hypothetical admix (mixFreq) and the populations contributing to this admix: Belgian/Dutch (BeN), Danish (Dan), Basque (Bsq), Croatian/Serbian (CrS), Baltic Late Bronze Age culture (BalBA), Bell Beaker culture (BellB), Slovakian (Slo). The numbers in parentheses indicate the contributions to the best hypothetical admix.

Abstract (emphasis mine)

By making use of the increasing number of available mitogenomes we propose a novel population genetic distance metric, named Shared Haplogroup Distance (SHD). Unlike FST, SHD is a true mathematical distance that complies with all metric axioms, which enables our new algorithm (MITOMIX) to detect population-level admixture based on SHD minimum optimization. In order to demonstrate the effectiveness of our methodology we analyzed the relation of 62 modern and 25 ancient Eurasian human populations, and compared our results with the most widely used FST calculation. We also sequenced and performed an in-depth analysis of 272 modern Hungarian mtDNA genomes to shed light on the genetic composition of modern Hungarians. MITOMIX analysis showed that in general admixture occurred between neighboring populations, but in some cases it also indicated admixture with migrating populations. SHD and MITOMIX analysis comply with known genetic data and shows that in case of closely related and/or admixing populations, SHD gives more realistic results and provides better resolution than FST. Our results suggest that the majority of modern Hungarian maternal lineages have Late Neolith/Bronze Age European origins (partially shared also with modern Danish, Belgian/Dutch and Basque populations), and a smaller fraction originates from surrounding (Serbian, Croatian, Slovakian, Romanian) populations. However only a minor genetic contribution (<3%) was identified from the IXth Hungarian Conquerors whom are deemed to have brought Hungarians to the Carpathian Basin. Our analysis shows that SHD and MITOMIX can augment previous methods by providing novel insights into past population processes.
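
To make the idea of "admixture by distance minimisation" more concrete, here is a minimal sketch. It is not the authors' MITOMIX code, and the distance used is not SHD, whose definition is not given in the abstract. As a stand-in I assume the total variation distance between haplogroup frequency vectors (which does satisfy the metric axioms), and a plain grid search over mixture weights. The populations `pop_a` and `pop_b`, their haplogroup frequencies, and all function names are hypothetical and only serve the illustration.

```python
from itertools import product


def tv_distance(p, q):
    """Total variation distance between two haplogroup frequency dicts.

    Stand-in for SHD (assumption): half the sum of absolute differences.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def mix(sources, weights):
    """Weighted average of several haplogroup frequency dicts."""
    out = {}
    for freqs, w in zip(sources, weights):
        for hg, f in freqs.items():
            out[hg] = out.get(hg, 0.0) + w * f
    return out


def best_admixture(target, sources, steps=20):
    """Grid search for the mixture of `sources` closest to `target`.

    Weights are multiples of 1/steps and sum to 1.
    Returns (distance, weights) for the best hypothetical admix found.
    """
    best = None
    n = len(sources)
    for combo in product(range(steps + 1), repeat=n - 1):
        if sum(combo) > steps:
            continue  # weights would exceed 1
        weights = [c / steps for c in combo]
        weights.append((steps - sum(combo)) / steps)
        d = tv_distance(target, mix(sources, weights))
        if best is None or d < best[0]:
            best = (d, weights)
    return best


# Hypothetical haplogroup frequencies, for illustration only.
pop_a = {"H": 0.6, "U5": 0.4}
pop_b = {"H": 0.2, "T2": 0.8}
target = {"H": 0.5, "U5": 0.3, "T2": 0.2}

distance, weights = best_admixture(target, [pop_a, pop_b])
print(distance, weights)
```

The grid search is only practical for a handful of source populations. With the dozens of populations in the paper a proper optimiser would be needed, and the real method may differ in every detail.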

Unrooted hierarchic cluster of modern and archaic populations based on the SHD matrix.

It is interesting to keep receiving data showing that language does not correlate well with genetics, whether measured by admixture or by haplogroups, even though this is already known to be the case in regions such as Anatolia, the Baltic, South-Eastern or Northern Europe.

Thorough anthropological models of migration or cultural diffusion are necessary for a proper interpretation of genetic data. There is no shortcut to that.

Co-occurrence of Hungarian Bronze Age mt Hgs. Distribution of mt Hgs found in Hungarian Bronze Age archaic samples in the analyzed populations. The fixation dates are based on Behar et al. [6].

Images made available under a CC-BY-NC-ND 4.0 International license.

Eneolithic Ukraine cultures of the North Pontic steppe and southern steppe-forest, on the Left Bank of the Dnieper


As I said before, Yuri Rassamakin is one archaeologist to follow closely for those interested in Neolithic and Chalcolithic Ukraine (ca. 5000-3300 BC), including Sredni Stog, and their potential connection with the Corded Ware culture, as well as the later expansion of Yamna into the region (and Yamna settlers into south-eastern Europe).

His recent studies include important sites (for Archaeology and recently also for Genomics) such as Dereivka and Alexandria, part of the North Pontic steppe and southern steppe-forest zone, on the Left Bank of the Dnieper river. According to him, many of these sites seem to form part of a common and distinct cultural group.

1) Ihren and Alexandria Burial Grounds of Chalcolithic Period: Problems of Dating and Cultural Inheritance (in Ukrainian), Археологія (2017, Nº 4)

English abstract (sic):

The author discusses the issues of chronology of the known burial grounds. In this article, first of all, the location of a series of burials at Ihren 8 cemetery is revised and the earlier proposed point of view of the author himself is refined. An important moment for that was a revision of a paired bi-ritual burial 7-8 from the excavations in 1932 with the Trypillian painted cup of the second half of the Trypillia B/2. The author presents the arguments for the assumption that the two burials were made not at the same time. As a whole, singled out are the Early Chalcolithic burials with the peculiar for them position on a back with bended knees, accompanied by flint products, first of all tools made on long blades. The second later group is represented by two supine burials which date is determined by a Trypillian cup. Concerning Oleksandriia burial ground, the author confirms his earlier expressed position on the Early Chalcolithic age of the burials with long flint blades, presenting additional arguments, one of which is a publication of a new radiocarbon date for one of the burials. Based on the author’s terminology, graves of the both burial grounds are considered within the borders of the so called Skelianska culture existence, while in Ihren burial ground several burials could be made in the period of so called «hiatus» when there were the Stohivska group sites in the Dnipro River region.

Distribution map of flat burial grounds: 1: Ihren 8; 2: Vynohradnyi Island; 3: Dereivka II; 4: Moliukhiv Buhor; 5: Hospitalnyi Kholm; 6: Oleksandriia

2) The Burial of the Early Eneolithic in Luhansk Region (in Ukrainian), by Y. Rassamakin and E. Chernih, Археологія (2017 Nº 2):

English abstract (sic):

A new burial complex is publishing by authors. This burial complex finds analogies among the Early Eneolithic burials of the Siversky Donets basin according to the rite and inventory (long flint blade). In addition, a set of specific flint products (long blades, triangular «spear heads» and flat adzes) finds analogies at the Aleksandriia settlement, where Skelia-type ceramics are represented. Therefore, there is a reason to combine in the same cultural and chronological context the relevant materials of the Aleksandriia settlement and the Early Eneolithic burials, and consider their as a part of the phenomenon that one of the authors conventionally calls Skelia culture.

Distribution map of Early Eneolithic burials in the Siverskyi Donets basin and adjacent territories: 1: Oleksandriia (burial ground); 2: Yama (Siversk), burial ground; 3: Olkhovatka; 4: Orlovske; 5: Oleksandrivsk (burial ground); 6: Voroshylovhrad; 7: Luhansk 2010; 8: Rebrykivka II (Russian Federation); 9: Donetsk (numbers on the map correspond to Table 1)

It remains to be seen how these new data are interpreted within more complex anthropological models of the potential cultural-historical groups that might have shaped later migrations.


More evidence on the recent arrival of haplogroup N and gradual replacement of R1a lineages in North-Eastern Europe


A new article (in Russian), Kinship Analysis of Human Remains from the Sargat Mounds, Baraba forest-steppe, Western Siberia, by Pilipenko et al. Археология, этнография и антропология Евразии Том 45 № 4 2017, downloadable at ResearchGate.


We present the results of a paleogenetic analysis of nine individuals from two Early Iron Age mounds in the Baraba forest-steppe, associated with the Sargat culture (five from Pogorelka-2 mound 8, and four from Vengerovo-6 mound 1). Four systems of genetic markers were analyzed: mitochondrial DNA, the polymorphic part of the amelogenin gene, autosomal STR-loci, and those of the Y-chromosome. Complete or partial data, obtained for eight of the nine individuals, were subjected to kinship analysis. No direct relatives of the “parent-child” type were detected. However, the data indicate close paternal and maternal kinship among certain individuals. This was evidently one of the reasons why certain individuals were buried under a single mound. Paternal kinship appears to have been of greater importance. The diversity of mtDNA and Y-chromosome lineages among individuals from one and the same mound suggests that kinship was not the only motive behind burying the deceased people jointly. The presence of very similar, though not identical, variants of the Y chromosome in different burial grounds may indicate the existence of groups such as clans, consisting of paternally related males. Our conclusions need further confirmation and detailed elaboration. Keywords: Paleogenetics, ancient DNA, kinship analysis, mitochondrial DNA, uniparental genetic markers, STR-loci, Y-chromosome, Baraba forest-steppe, Sargat culture, Early Iron Age.

From the older study of the same region (Baraba, numbered 4) “Location of ancient human groups with a high frequency of mtDNA haplogroups U5, U4 and U2e lineages. The area of Northern Eurasian anthropological formation is marked by yellow region on the map (References: 1. Bramanti et al., 2009; 2. Malmstrom et al., 2009; 3. Krause et al., 2010; 4. this study)”

Chronological time scale of Bronze Age Cultures from the Baraba region

This is the same team that previously published an ancient mtDNA study of different cultures within the Baraba steppe-forest region (from the Open Access book Population Dynamics in Prehistory and Early History).

The Baraba steppe-forest is a region between the Ob and Irtysh rivers (about 800 km from west to east), stretching over 200 km from the taiga zone in the north to the steppes in the south.

The new study brings a more recent picture of the region, from the Iron Age Sargat culture, ca. 500 BC – 500 AD, with five samples of haplogroup N and two samples of haplogroup R1a.

R1a lineages in the region probably derive from the previous expansion of Andronovo and related cultures, which had absorbed North Caspian steppe populations and their Late Indo-European culture.

N subclades prevalent in certain modern Eurasian populations are probably derived from the expansion of the Seima-Turbino phenomenon.

While samples are scarce, Y-DNA data keeps showing the same picture I have spoken about more than once:

N subclades (potentially originally speaking Proto-Yukaghir languages) gradually replacing haplogroup R1a (originally probably speaking Uralic languages), probably through successive founder effects (such as the bottlenecks found in Finland), which left their Uralic culture and ethnolinguistic identification intact.

Therefore, late Corded Ware groups of North-Eastern Europe (in the Forest Zone and the Baltic), mainly of R1a-Z645 subclades, probably never adopted Late Indo-European languages.


Archaeological origins of Early Proto-Indo-European in the Baltic during the Mesolithic


New article by Leonid Zaliznyak, Mesolithic origins of the first Indo-European cultures in Europe according to the archaeological data (also available in Russian).

The article refers to the common Meso-Neolithic basis of Ukrainian ancient Indo-European cultures (Mariupol, Serednii Stih) and Central Europe (Funnel Beaker and Globular Amphorae cultures) of the fourth millennium BC. Archaeological materials show that the common cultural and genetic substrate of the earliest Indo-Europeans in Europe was forming from the sixth to the fourth millennia BC due to migration of the Western Baltic Mesolithic population to the east through Poland and Polissia to the Dnipro River middle region and further to the Siverskyi Donets River.

I already spoke about the view of the Russian school, and its interpretation of the origin of Proto-Indo-European (and potentially Indo-Uralic) in North-Eastern European Mesolithic. While the genetic interpretation seemed quite off in Klejn’s last article discussing Genetics, Zaliznyak improves the archaeological model to some extent.

This model is partially compatible with the expansion of R1b lineages and the Villabruna cluster with migrating peoples of post-Swiderian cultures into eastern Europe. However – as seems to be often the case with linguists of post-Soviet countries (maybe because of the greater influence of Nostraticists there) – proto-language dates are pushed further back in time than is warranted by usual guesstimates, and thus the model is way off as it approaches the Neolithic, and especially beyond that time.

As you can see, a Post-Swiderian expansion of (a language ancestral to) Proto-Indo-European (e.g. Pre-Indo-Uralic) is compatible with the Indo-European demic diffusion model. On the other hand, it is very difficult to assert anything about that period in terms of language change or evolution, because of scarce and obscured archaeological finds, and because of different admixture waves found in east Europe (in the Pontic-Caspian steppe, forest-steppe, and Forest Zone) during the Palaeolithic-Mesolithic – and even during the Mesolithic-Neolithic – transition.

It is therefore impossible today to ascertain if it was a community of western (R1b) or eastern (R1a) Eurasian lineages who spread Pre-Indo-Uralic; or which combination of WHG:ANE (if any) might have yielded EHG ancestry (and thus how a Pre-Indo-Uralic language might have developed from the influence of west and east Eurasian communities); or how later waves of ANE and CHG ancestry found in steppe populations (during the Neolithic) might have brought cultural change to the communities, or even if they accompanied the more recent R1a-M417 subclades (or haplogroup Q) found in the region…

Spreading of Post-Swiderian and Post-Krasnosillian sites in the Mesolithic of Eastern Europe in the 8th millennium BC. See the article for an explanation of all details.

This Russian (or post-Soviet, or East European) school of thought, which is mainly based on their traditional archaeological models, tries to use new genetic data to obtain plausible archaeological-linguistic models of Indo-European expansion. Nevertheless, this improved model is likely to cause some quick dismissals and be made fun of by certain amateur geneticists.

It is curious, though, that some people are quick to judge archaeologists trying to fit new data to their traditional models (which seems like the right way of obtaining sound models for prehistoric human migrations), but are on the other hand extremely confident about any new model based solely on genetics and their personal desires: very strong confirmation (and rejection) bias at play, indeed.

For example, how could Sredni Stog be Late Indo-European-speaking, if the best candidate for a Late Indo-European-speaking community (the Yamna culture) is almost fully unrelated? For some, simply because of the ‘Yamnaya ancestral component’.

In spite of many naysayers (amateur geneticists who hate archaeological models not fitting their dreams), it seems that otherwise extremely disparate Indo-European schools of thought (like the German, American, and Spanish schools, the British, and even Leiden, the French, and to some extent the East European school) are converging in Linguistics, while in Archaeology Heyd’s model of Yamna migration (independent of the Corded Ware culture) is being accepted as mainstream with help from aDNA analysis, now also partially by Anthony, at last.

Only researchers of a single workgroup (very popular today, it seems) tend to diverge from the general unifying trend, following mostly their own interpretations of new genetic papers in a funny vicious circle that is creating a growing bubble of misinformation with no substantive basis (apart from the controversial existence of a Kurgan people).

Let’s see how this ends up, if new genetic algorithms can truly revolutionise Archaeology and Linguistics, or if academic models will keep proving right over misinterpretations from recent genetic papers…

Featured image, from the article, “The settling of the early Indo-Europeans in the period from the 4th to the 2nd millennia BC”.


The Indus Valley Civilisation in genetics – the Harappan Rakhigarhi project


Razib Khan reports on his new website about an article by Tony Joseph, Who built the Indus Valley civilisation?, itself referring to the potential upcoming results of a genetic analysis project involving Rakhigarhi, the biggest Harappan site.

The possible scenarios based on potential sample results in terms of Y-DNA and mtDNA haplogroups seem to be generally well described, and I would bet, like Khan, on some kind of East-West Eurasian connection. This is all pure speculation, though, and after all we only have to wait one month and see.

Detailed map of Indus Valley Civilization settlements. Key: ville actuelle – modern cities; site indusien – Indus Valley Civilization site; site majeur – major site (from Wikipedia, by Michel Danino)

Out of the potential models laid out by Joseph something struck me as plainly wrong. From the section about R1a and Vedic Aryans (emphasis mine):

In the ancient DNA from Rakhigarhi, scientists identify R1a, one of the hundreds of Y-DNA haplogroups (or male lineages that are passed on from fathers to sons). They also identify H2b — one of the hundreds of mt-DNA haplogroups (or female lineages that are passed on from mothers to daughters) — that has often been found in proximity to R1a.

There is no reason whatsoever to think that this would be the research finding, but if it is, it would cause a global convulsion in the fields of population genetics, history and linguistics. It would also cause great cheer among the advocates of the theory that says that the Indus Valley civilisation was Vedic Aryan.


And it goes on to postulate reasons why such a big fuss will be created about the potential finding of haplogroup R1a, and its implications for the Out-of-India Theory. A global convulsion, no less.

But, since when do genetic findings cause revolutions in Linguistics? Or even in Archaeology?

Just when I thought the identification of R1a with Indo-European could never reach a lower level of unscientific nonsense based on circular reasoning, here is an even worse example.

Not only are there people waiting desperately to see just one sample of an R1a subclade in Yamna to oversimplistically identify (yet again) Corded Ware with the Indo-European expansion; there are also people waiting to find just one sample in India or Central Asia to destroy the current models of steppe origins for Proto-Indo-European.

I guess this childish game is more or less based on the same premises that made some people believe that the concept of the ‘Yamnaya component’ destroyed traditional archaeological models.

Modern haplogroup R1a distribution from The Genetic Atlas (PD), the kind of simplistic maps that generated the current misconceptions (or how to sow the wind among populations with an inferiority complex).

It seems that all new methods involving admixture analysis, PCA, and other statistical tools to study Human Ancestry are still irrelevant for most, and indeed that Archaeology and even Linguistics are at the service of the simplistic identification of ancient languages with modern haplogroup distributions.

We are reliving the 1990s in Genetics, and the 1930s in Archaeology and Linguistics all over again. This must be great news for companies that offer genetic analyses… I wonder if it is also good for Science, though.

The funny thing is, the same people responsible for the survival of these misconceptions, i.e. R1a – Indo-European fanboys, who constantly fan the flames of absurd genetic-genealogical and ethnolinguistic identification, are often the first to criticize models compatible with the Out-of-India Theory.

I really hope some R1a subclade is found among the samples, so that stupidity can reach the lowest possible level in discussions among amateur geneticists obsessed with haplogroup R1a’s role in the expansion of Indo-European speakers. Maybe then the rest of us will be able to overcome these renewed moronic supremacist trends hidden behind supposedly objective migration models.

For those interested in actual Indo-European migration models, the finding of early R1a subclades in central Asia (or India), like the potential finding of R1a subclades in Yamna, changes neither Archaeology nor Linguistics on the Indo-European question.

Genomics is merely helping these disciplines evolve, by supporting certain archaeological models of migration over others, but no revolution has been seen yet, and none is expected.

Each new genetic paper helps support the strongest archaeological models of steppe origins for Proto-Indo-European, and a Late Indo-European expansion compatible with current Linguistic reconstructions.

Featured image: From Wikipedia, Indus Valley Civilization, Mature Phase (2600-1900 BCE), by Jane McIntosh.