Mitogenomes show ancient human migrations to and through Northern India were not of males exclusively


New open access article, Ancient Human Migrations to and through Jammu Kashmir- India were not of Males Exclusively, by Sharma et al., Scientific Reports 8, N. 851 (2018).


Jammu and Kashmir (J&K), the Northern most State of India, has been under-represented or altogether absent in most of the phylogenetic studies carried out in literature, despite its strategic location in the Himalayan region. Nonetheless, this region may have acted as a corridor to various migrations to and from mainland India, Eurasia or northeast Asia. The belief goes that most of the migrations post-late-Pleistocene were mainly male dominated, primarily associated with population invasions, where female migration may thus have been limited. To evaluate female-centered migration patterns in the region, we sequenced 83 complete mitochondrial genomes of unrelated individuals belonging to different ethnic groups from the state. We observed a high diversity in the studied maternal lineages, identifying 19 new maternal sub-haplogroups (HGs). High maternal diversity and our phylogenetic analyses suggest that the migrations post-Pleistocene were not strictly paternal, as described in the literature. These preliminary observations highlight the need to carry out an extensive study of the endogamous populations of the region to unravel many facts and find links in the peopling of India.


To conclude, the extent of presence of variants defining novel HGs or personal variants indicate high diversity in maternal genetic component of the population of J&K. Statistical analyses indicate that maternal population in J&K have undergone expansion, along with other regions of Indian sub-continent [9]. However, signatures of maternal gene pool expansion in the region past LGM and early Holocene era are also seen, and this is a unique observation for the present study. These distinct signatures and maternal lineages, never reported before in India, apparently suggest that this region might have served as a corridor, yet also as a reservoir for many unreported lineages.

The overall diversity seen in the maternal gene pool of J&K suggests that the migrations to and through this region were not exclusively of males. This data has refined the existing phylogenetic tree and added to the information further diversity of mtDNA in Indian populations. Further, this preliminary study highlights the importance of the region and emphasizes that the populations of this region should be studied extensively to understand the gene pool of Indian populations. Along with the Y chromosomal and mtDNA markers, a study of autosomal markers is also warranted in these population groups. It is anticipated to help in finding some of the missing links in the evolution of modern humans and their migratory history to and from the mainland India and the Indian subcontinent, a future perspective of our study. Further, we would like to emphasize that the endogamous populations should be studied with respect to their individual evolutionary and migration histories, rather than pooling these together as one group, an underlying drawback that has plagued many of the Indian population based studies in the past, diluting individual signatures and masking stories their DNA has to tell.


Modern Hungarian mtDNA more similar to ancient Europeans than to Hungarian conquerors


New preprint at BioRxiv, MITOMIX, an Algorithm to Reconstruct Population Admixture Histories Indicates Ancient European Ancestry of Modern Hungarians, by Maroti et al. (2018).

The estimated age distribution of the shared mt Hgs between Hungarians (Hun), the best hypothetical admix (mixFreq) and the populations contributing to this admix: Belgian/Dutch (BeN), Danish (Dan), Basque (Bsq), Croatian/Serbian (CrS), Baltic Late Bronze Age culture (BalBA), Bell Beaker culture (BellB), Slovakian (Slo). The numbers in parentheses indicate the contributions to the best hypothetical admix.

Abstract (emphasis mine)

By making use of the increasing number of available mitogenomes we propose a novel population genetic distance metric, named Shared Haplogroup Distance (SHD). Unlike FST, SHD is a true mathematical distance that complies with all metric axioms, which enables our new algorithm (MITOMIX) to detect population-level admixture based on SHD minimum optimization. In order to demonstrate the effectiveness of our methodology we analyzed the relation of 62 modern and 25 ancient Eurasian human populations, and compared our results with the most widely used FST calculation. We also sequenced and performed an in-depth analysis of 272 modern Hungarian mtDNA genomes to shed light on the genetic composition of modern Hungarians. MITOMIX analysis showed that in general admixture occurred between neighboring populations, but in some cases it also indicated admixture with migrating populations. SHD and MITOMIX analysis comply with known genetic data and shows that in case of closely related and/or admixing populations, SHD gives more realistic results and provides better resolution than FST. Our results suggest that the majority of modern Hungarian maternal lineages have Late Neolith/Bronze Age European origins (partially shared also with modern Danish, Belgian/Dutch and Basque populations), and a smaller fraction originates from surrounding (Serbian, Croatian, Slovakian, Romanian) populations. However only a minor genetic contribution (<3%) was identified from the IXth Hungarian Conquerors whom are deemed to have brought Hungarians to the Carpathian Basin. Our analysis shows that SHD and MITOMIX can augment previous methods by providing novel insights into past population processes.
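The abstract does not define SHD, so the following sketch makes no attempt to reproduce it or the actual MITOMIX algorithm. It only illustrates the general idea of detecting admixture by minimizing a distance: total variation distance (which does satisfy the metric axioms) stands in for SHD, and a simple grid search over mixture weights stands in for the optimization. All function names and haplogroup frequencies are hypothetical.

```python
from itertools import product


def tv_distance(p, q):
    """Total variation distance between two haplogroup frequency dicts.

    Stand-in for SHD (assumption): half the sum of absolute frequency
    differences, which is a true metric on frequency vectors.
    """
    haplogroups = set(p) | set(q)
    return 0.5 * sum(abs(p.get(h, 0.0) - q.get(h, 0.0)) for h in haplogroups)


def mix(sources, weights):
    """Weighted average of several haplogroup frequency dicts."""
    mixed = {}
    for freqs, weight in zip(sources, weights):
        for hg, freq in freqs.items():
            mixed[hg] = mixed.get(hg, 0.0) + weight * freq
    return mixed


def best_admixture(target, sources, steps=10):
    """Grid-search mixture weights (multiples of 1/steps, summing to 1)
    and return (distance, weights) for the mix closest to the target."""
    best = None
    for combo in product(range(steps + 1), repeat=len(sources)):
        if sum(combo) != steps:
            continue
        weights = [c / steps for c in combo]
        dist = tv_distance(target, mix(sources, weights))
        if best is None or dist < best[0]:
            best = (dist, weights)
    return best


# Toy haplogroup frequencies (invented for illustration only).
pop_a = {"H": 0.6, "U": 0.4}
pop_b = {"H": 0.2, "T": 0.8}
target = mix([pop_a, pop_b], [0.7, 0.3])

distance, weights = best_admixture(target, [pop_a, pop_b])
print(distance, weights)
```

Because the toy target is built as a 70/30 mix of the two sources, the search recovers those weights. With real data the minimum distance would not be zero, and the grid search would quickly become too slow for many candidate source populations.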

Unrooted hierarchic cluster of modern and archaic populations based on the SHD matrix.

It is interesting to keep receiving data showing that language does not correlate well with Genomics, whether in terms of admixture or haplogroups, even though this is already known for regions such as Anatolia, the Baltic, South-Eastern or Northern Europe.

Thorough anthropological models of migration or cultural diffusion are necessary for a proper interpretation of genetic data. There is no shortcut to that.

Co-occurrence of Hungarian Bronze Age mt Hgs. Distribution of mt Hgs found in Hungarian Bronze Age archaic samples in the analyzed populations. The fixation dates are based on Behar et al. [6].

Images made available under a CC-BY-NC-ND 4.0 International license.

Ancient Phoenician mtDNA from Sardinia, Lebanon reflects settlement, genetic diversity, and female mobility


New article at PLOS One, Ancient mitogenomes of Phoenicians from Sardinia and Lebanon: A story of settlement, integration, and female mobility, by Matisoo-Smith et al. (2018).


The Phoenicians emerged in the Northern Levant around 1800 BCE and by the 9th century BCE had spread their culture across the Mediterranean Basin, establishing trading posts, and settlements in various European Mediterranean and North African locations. Despite their widespread influence, what is known of the Phoenicians comes from what was written about them by the Greeks and Egyptians. In this study, we investigate the extent of Phoenician integration with the Sardinian communities they settled. We present 14 new ancient mitogenome sequences from pre-Phoenician (~1800 BCE) and Phoenician (~700–400 BCE) samples from Lebanon (n = 4) and Sardinia (n = 10) and compare these with 87 new complete mitogenomes from modern Lebanese and 21 recently published pre-Phoenician ancient mitogenomes from Sardinia to investigate the population dynamics of the Phoenician (Punic) site of Monte Sirai, in southern Sardinia. Our results indicate evidence of continuity of some lineages from pre-Phoenician populations suggesting integration of indigenous Sardinians in the Monte Sirai Phoenician community. We also find evidence of the arrival of new, unique mitochondrial lineages, indicating the movement of women from sites in the Near East or North Africa to Sardinia, but also possibly from non-Mediterranean populations and the likely movement of women from Europe to Phoenician sites in Lebanon. Combined, this evidence suggests female mobility and genetic diversity in Phoenician communities, reflecting the inclusive and multicultural nature of Phoenician society.

Haplogroup assignments, dates, locations and Genbank accession details of all aDNA samples included in analyses.

Featured image, from the article: Map showing Phoenician maritime expansions across the Mediterranean starting from around 800 BCE. Arrows indicate maritime movement. Blue dots indicate coastal sites and pink shaded areas indicate the extent of Phoenician settlements.


Ancient mtDNA from Central America and Mexico


New article, Successful reconstruction of whole mitochondrial genomes from ancient Central America and Mexico, by Morales-Arce et al., Scientific Reports (2017).


The northern and southern peripheries of ancient Mesoamerica are poorly understood. There has been speculation over whether borderland cultures such as Greater Nicoya and Casas Grandes represent Mesoamerican outposts in the Isthmo-Colombian area and the Greater Southwest, respectively. Poor ancient DNA preservation in these regions challenged previous attempts to resolve these questions using conventional genetic techniques. We apply advanced in-solution mitogenome capture and high-throughput sequencing to fourteen dental samples obtained from the Greater Nicoya sites of Jícaro and La Cascabel in northwest Costa Rica (n = 9; A.D. 800–1250) and the Casas Grandes sites of Paquimé and Convento in northwest Mexico (n = 5; A.D. 1200–1450). Full mitogenome reconstruction was successful for three individuals from Jícaro and five individuals from Paquimé and Convento. The three Jícaro individuals belong to haplogroup B2d, a haplogroup found today only among Central American Chibchan-speakers. The five Paquimé and Convento individuals belong to haplogroups C1c1a, C1c5, B2f and B2a, which are found in contemporary populations in North America and Mesoamerica. We report the first successfully reconstructed ancient mitogenomes from Central America, and the first genetic evidence of ancestry affinity of the ancient inhabitants of Greater Nicoya and Casas Grandes with contemporary Isthmo-Colombian and Greater Southwest populations, respectively.

Archaeological sites location and corresponding culture areas as noted in the text. ArcGIS 10.4 software was used to generate the figure. Service layer credits Esri, ArcGIS Online, TerraColor (Earthstar Geographics) 1999; Vivid – Mexico (Digital Globe) 2005, 2009, 2010, 2011, 2012, 2013, 2014, 2015; Metro (Digital Globe) 2016; Vivid Caribbean (Digital Globe) 2013, 2014, 2015, 2016, Vivid (Digital Globe) 2015, Vivid – Mexico (Digital Globe) 2012 and the GIS User Community.

Discovered via Bernard Sécher’s blog.

Featured image: From Wikipedia, author Juan Miguel, “Mesoamérica y toda América Central prehispanica en el siglo XVI (16) — antes de la llegada de los españoles.”


The Indus Valley Civilisation in genetics – the Harappan Rakhigarhi project


Razib Khan reports on his new website about an article by Tony Joseph, Who built the Indus Valley civilisation?, itself referring to the potential upcoming results of a genetic analysis project involving Rakhigarhi, the biggest Harappan site.

The possible scenarios based on potential sample results in terms of Y-DNA and mtDNA haplogroups seem to be generally well described, and I would bet – like Khan – on some kind of East-West Eurasian connection. This is all pure speculation, though, and after all we only have to wait one month and see.

Detailed map of Indus Valley Civilization settlements. Key: ville actuelle – modern cities; site indusien – Indus Valley Civilization site; site majeur – major site (from Wikipedia, by Michel Danino)

Out of the potential models laid out by Joseph, something struck me as plainly wrong. From the section about R1a and Vedic Aryans (emphasis mine):

In the ancient DNA from Rakhigarhi, scientists identify R1a, one of the hundreds of Y-DNA haplogroups (or male lineages that are passed on from fathers to sons). They also identify H2b — one of the hundreds of mt-DNA haplogroups (or female lineages that are passed on from mothers to daughters) — that has often been found in proximity to R1a.

There is no reason whatsoever to think that this would be the research finding, but if it is, it would cause a global convulsion in the fields of population genetics, history and linguistics. It would also cause great cheer among the advocates of the theory that says that the Indus Valley civilisation was Vedic Aryan.


And it goes on to postulate reasons why such a big fuss will be created about the potential finding of haplogroup R1a, and its implications for the Out-of-India Theory. A global convulsion, no less.

But, since when do genetic findings cause revolutions in Linguistics? Or even in Archaeology?

Just when I thought the R1a – Indo-European identification, based on circular reasoning, could not reach a lower level of unscientific nonsense, here is a worse example.

Not only are there people waiting desperately to see just one sample of an R1a subclade in Yamna to oversimplistically identify (yet again) Corded Ware with the Indo-European expansion; there are also people waiting to find just one sample in India or Central Asia to destroy the current models of steppe origins for Proto-Indo-European.

I guess this childish game is more or less based on the same premises that made some people believe that the concept of the ‘Yamnaya component’ destroyed traditional archaeological models.

Modern haplogroup R1a distribution from The Genetic Atlas (PD), the kind of simplistic maps that generated the current misconceptions (or how to sow the wind among populations with an inferiority complex).

It seems that all new methods involving admixture analysis, PCA, and other statistical tools to study Human Ancestry are still irrelevant for most, and indeed that Archaeology and even Linguistics are at the service of the simplistic identification of ancient languages with modern haplogroup distributions.

We are reliving the 1990s in Genetics, and the 1930s in Archaeology and Linguistics all over again. This must be great news for companies that offer genetic analyses… I wonder if it is also good for Science, though.

The funny thing is, the same people responsible for the survival of these misconceptions, i.e. R1a – Indo-European fanboys, who constantly fan the flames of absurd genetic-genealogical and ethnolinguistic identification, are often the first to criticize models compatible with the Out-of-India Theory.

I really hope some R1a subclade is found among the samples, so that stupidity can reach the lowest possible level in discussions among amateur geneticists obsessed with haplogroup R1a’s role in the expansion of Indo-European speakers. Maybe then the rest of us will be able to overcome these renewed moronic supremacist trends hidden behind supposedly objective migration models.

For those interested in actual Indo-European migration models, the finding of early R1a subclades in central Asia (or India) – like the potential finding of R1a subclades in Yamna – changes neither Archaeology nor Linguistics on the Indo-European question.

Genomics is merely helping these disciplines evolve, by supporting certain archaeological models of migration over others, but no revolution has been seen yet, and none is expected.

Each new genetic paper helps support the strongest archaeological models of steppe origins for Proto-Indo-European, and a Late Indo-European expansion compatible with current Linguistic reconstructions.

Featured image: From Wikipedia, Indus Valley Civilization, Mature Phase (2600-1900 BCE), by Jane McIntosh.


Globular Amphora not linked to Pontic steppe migrants – more data against Kristiansen’s Kurgan model of Indo-European expansion


New open access article, Genome diversity in the Neolithic Globular Amphorae culture and the spread of Indo-European languages, by Tassi et al. (2017).


It is unclear whether Indo-European languages in Europe spread from the Pontic steppes in the late Neolithic, or from Anatolia in the Early Neolithic. Under the former hypothesis, people of the Globular Amphorae culture (GAC) would be descended from Eastern ancestors, likely representing the Yamnaya culture. However, nuclear (six individuals typed for 597 573 SNPs) and mitochondrial (11 complete sequences) DNA from the GAC appear closer to those of earlier Neolithic groups than to the DNA of all other populations related to the Pontic steppe migration. Explicit comparisons of alternative demographic models via approximate Bayesian computation confirmed this pattern. These results are not in contrast to Late Neolithic gene flow from the Pontic steppes into Central Europe. However, they add nuance to this model, showing that the eastern affinities of the GAC in the archaeological record reflect cultural influences from other groups from the East, rather than the movement of people.

(a) Principal component analysis on genomic diversity in ancient and modern individuals. (b) K = 3,4 ADMIXTURE analysis based only on ancient variation. (a) Principal component analysis of 777 modern West Eurasian samples with 199 ancient samples. Only transversions considered in the PCA (to avoid confounding effects of post-mortem damage). We represented modern individuals as grey dots, and used coloured and labelled symbols to represent the ancient individuals. (b) Admixture plots at K = 3 and K = 4 of the analysis conducted only considering the ancient individuals. The full plot is shown in electronic supplementary material, figure S7. The ancient populations are sorted by a temporal scale from Pleistocene to Iron Age. The GAC samples of this study are displayed in the box on the right.
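The caption notes that only transversions were used in the PCA to avoid post-mortem damage. Ancient DNA damage typically appears as C→T (and, on the complementary strand, G→A) changes, which are transitions, so dropping all transitions removes the sites where damage could be mistaken for real variation. A minimal sketch of such a filter follows; the SNP identifiers and the tuple layout are hypothetical, and the article's actual pipeline is not described in the caption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(ref, alt):
    """Return True for purine<->pyrimidine changes (transversions),
    False for purine<->purine or pyrimidine<->pyrimidine (transitions)."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a biallelic SNP: {ref}>{alt}")
    return (ref in PURINES) != (alt in PURINES)


def keep_transversions(snps):
    """Filter a list of (snp_id, ref, alt) tuples down to transversions."""
    return [snp for snp in snps if is_transversion(snp[1], snp[2])]


# Hypothetical SNP list for illustration.
snps = [
    ("snp_1", "C", "T"),  # transition, typical damage pattern
    ("snp_2", "A", "C"),  # transversion
    ("snp_3", "G", "A"),  # transition, typical damage pattern
    ("snp_4", "G", "T"),  # transversion
]
print(keep_transversions(snps))
```

The cost of this choice is a smaller SNP set, since transitions are more common than transversions among human variants.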

Excerpt, from the discussion:

In its classical formulation, the Kurgan hypothesis, i.e. a late Neolithic spread of proto-Indo-European languages from the Pontic steppes, regards the GAC people as largely descended from Late Neolithic ancestors from the East, most likely representing the Yamna culture; these populations then continued their Westward movement, giving rise to the later Corded Ware and Bell Beaker cultures. Gimbutas [23] suggested that the spread of Indo-European languages involved conflict, with eastern populations spreading their languages and customs to previously established European groups, which implies some degree of demographic change in the areas affected by the process. The genomic variation observed in GAC individuals from Kierzkowo, Poland, does not seem to agree with this view. Indeed, at the nuclear level, the GAC people show minor genetic affinities with the other populations related with the Kurgan Hypothesis, including the Yamna. On the contrary, they are similar to Early-Middle Neolithic populations, even geographically distant ones, from Iberia or Sweden. As already found for other Late Neolithic populations [18], in the GAC people’s genome there is a component related to those of much earlier hunting-gathering communities, probably a sign of admixture with them. At the nuclear level, there is a recognizable genealogical continuity from Yamna to Corded Ware. However, the view that the GAC people represented an intermediate phase in this large-scale migration finds no support in bi-dimensional representations of genome diversity (PCA and MDS), ADMIXTURE graphs, or in the set of estimated f3-statistics.
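For readers unfamiliar with the f3-statistics mentioned at the end of the excerpt: in its simplest form, f3(C; A, B) is the average over SNPs of (c − a)(c − b), where a, b and c are allele frequencies in the three populations. A clearly negative value suggests that C is a mixture of populations related to A and B. The sketch below uses this plain formula on invented frequencies; it omits the finite-sample bias correction and standard errors that real tools apply, and it is not the exact setup used in the article.

```python
def f3(target, source1, source2):
    """Naive f3(target; source1, source2): the mean over SNPs of
    (c - a) * (c - b), given per-SNP allele frequencies for each population."""
    if not target or not (len(target) == len(source1) == len(source2)):
        raise ValueError("need equal-length, non-empty frequency lists")
    total = sum((c - a) * (c - b) for c, a, b in zip(target, source1, source2))
    return total / len(target)


# Invented allele frequencies at four SNPs, for illustration only.
pop_a = [0.1, 0.9, 0.2, 0.8]
pop_b = [0.9, 0.1, 0.8, 0.2]
admixed = [(a + b) / 2 for a, b in zip(pop_a, pop_b)]  # 50/50 mix of A and B

print(f3(admixed, pop_a, pop_b))  # negative: target lies between the sources
print(f3(pop_a, pop_a, pop_b))    # zero: target identical to one source
```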

Scheme summarizing the five alternative models compared via ABC random forest. We generated by coalescent simulation mtDNA sequences under five models, differing as to the number of migration events considered. The coloured lines represent the ancient samples included in the analysis, namely Unetice (yellow line), Bell Beaker (purple line), Corded Ware (green line) and Globular Amphorae (red line) from Central Europe, Yamnaya (light blue line) and Srubnaya (brown line) from Eastern Europe. The arrows refer to the three waves of migration tested. Model NOMIG was the simplest one, in which the six populations did not have any genetic exchanges; models MIG1, MIG2 and MIG1, 2 differed from NOMIG in that they included the migration events number 1, 2 (from Eastern to Central Europe, respectively before and after the onset of the GAC), or both. Model MIG2, 3 represents a modification of MIG2 model also including a back migration from Central to Eastern Europe after the development of the Corded Ware culture.

Together with Globular Amphora culture samples from Mathieson et al. (2017), this suggests that Kristiansen’s Indo-European Corded Ware Theory is wrong, even in its latest revised models of 2017.

The background shading indicates the three migratory waves proposed by Marija Gimbutas, and personally checked by her in 1995. The symbols refer to the ancient populations considered in the ABC analysis.

On the other hand, the article’s genetic finds have some interesting connections in terms of mtDNA phylogeography, but without a proper archaeological model it is difficult to explain them.

Haplogroup frequencies were obtained for Early Neolithic (EN), Middle Neolithic (MN), Chalcolithic (CA), and Late Neolithic (LN). The color assigned to each haplogroup is represented on the lower right part of each plot. Haplogroup frequencies were plotted geographically using QGIS v2.14.

Text and images from the article under Creative Commons Attribution 4.0 license.

Discovered first via Bernard Sécher’s blog.
