Model for the spread of Transeurasian (Macro-Altaic) communities with farming


Austronesian influence and Transeurasian ancestry in Japanese: A case of farming/language dispersal, by Martine Robbeets, Max Planck Institute for the Science of Human History.


In this paper, I propose a hypothesis reconciling Austronesian influence and Transeurasian ancestry in the Japanese language, explaining the spread of the Japanic languages through farming dispersal. To this end, I identify the original speech community of the Transeurasian language family as the Neolithic Xinglongwa culture situated in the West Liao River Basin in the sixth millennium bc. I argue that the separation of the Japanic branch from the other Transeurasian languages and its spread to the Japanese Islands can be understood as occurring in connection with the dispersal of millet agriculture and its subsequent integration with rice agriculture. I further suggest that a prehistorical layer of borrowings related to rice agriculture entered Japanic from a sister language of proto-Austronesian, at a time when both language families were still situated in the Shandong-Liaodong interaction sphere.

Classification of the Transeurasian languages according to Robbeets (forthcoming)

Another interesting anthropological model to validate with future genomic analyses, although I was never convinced about a grouping (let alone reconstructible proto-language) beyond Micro-Altaic languages.

NOTE. The Max Planck Institute may be a great source of scientific advancement, but in Linguistics you can see from the projects Indo-European languages originate in Anatolia (2012) and A massive migration from the steppe brought Indo-European languages to Europe (2015) (the last one referring to the Corded Ware culture, associated with the study by Haak et al. 2015) that they have not got it quite right with Proto-Indo-European… I like the traditional approach of this paper, though, including a thorough assessment of archaeological and linguistic details.

Featured images: Left: The eastward spread of millet agriculture in association with ancestral speech communities. Right: The spread of agriculture and language to Japan.


Y chromosome C2*-star cluster traces back to ordinary Mongols, rather than Genghis Khan


Article behind paywall, Whole-sequence analysis indicates that the Y chromosome C2*-Star Cluster traces back to ordinary Mongols, rather than Genghis Khan, by Wei, Yan, Lu, et al. Eur J Hum Genet (2018); 26:230–237


The Y-chromosome haplogroup C3*-Star Cluster (revised to C2*-ST in this study) was proposed to be the Y-profile of Genghis Khan. Here, we re-examined the origin of C2*-ST and its associations with Genghis Khan and Mongol populations. We analyzed 34 Y-chromosome sequences of haplogroup C2*-ST and its most closely related lineage. We redefined this paternal lineage as C2b1a3a1-F3796 and generated a highly revised phylogenetic tree of the haplogroup, including 36 sub-lineages and 265 non-private Y-chromosome variants. We performed a comprehensive analysis and age estimation of this lineage in eastern Eurasia, including 18,210 individuals from 292 populations. We discovered that the origin of populations with high frequencies of C2*-ST can be traced to either an ancient Niru’un Mongol clan or ordinary Mongol tribes. Importantly, the age of the most recent common ancestor of C2*-ST (2576 years, 95% CI = 1975–3178) and its sub-lineages, and their expansion patterns, are consistent with the diffusion of all Mongolic-speaking populations, rather than Genghis Khan himself or his close male relatives. We concluded that haplogroup C2*-ST is one of the founder paternal lineages of all Mongolic-speaking populations, and direct evidence of an association between C2*-ST and Genghis Khan has yet to be discovered.
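The dating argument in the abstract can be checked with simple arithmetic. The sketch below uses the TMRCA and confidence interval quoted above; the choice of 2018 as "present" and the traditional birth year of c. 1162 CE for Genghis Khan are my assumptions, not values from the paper.

```python
# Values quoted in the abstract (years before present)
TMRCA_POINT = 2576
CI_LOW, CI_HIGH = 1975, 3178

# Assumptions (not from the paper): "present" is taken as the 2018
# publication year, and Genghis Khan's traditional birth year is c. 1162 CE.
PRESENT_YEAR = 2018
GENGHIS_BIRTH_CE = 1162


def years_before_present(year_ce, present=PRESENT_YEAR):
    """Convert a calendar year (CE) into years before the assumed present."""
    return present - year_ce


genghis_birth_ybp = years_before_present(GENGHIS_BIRTH_CE)
inside_ci = CI_LOW <= genghis_birth_ybp <= CI_HIGH
gap_to_lower_bound = CI_LOW - genghis_birth_ybp

print(f"Genghis Khan born ~{genghis_birth_ybp} years before present")
print(f"Inside the 95% CI of the TMRCA? {inside_ci}")
print(f"Youngest end of the CI predates his birth by ~{gap_to_lower_bound} years")
```

Under these assumptions even the youngest end of the interval falls more than a millennium before his birth, which is the core of the authors' point.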

This is a great example of the mistake one can make when assessing the leading clans of population expansions from the perspective of the renowned case of the Uí Néill clan’s expansion in Ireland.

Just some days ago I wrote about the first Hungarian dynasty’s haplogroup R1a, and the potential association of other Ugric-speaking clans with R1a subclades, so let’s wait and see if future papers on other ancient Hungarian clans and Hungarian settlers bring surprises…


Prehistoric loan relations: Foreign elements in the Proto-Indo-European vocabulary


An interesting ongoing web project, Prehistoric loan relations, on potential loanwords in Proto-Indo-European from Uralic-Yukaghir, Caucasian, and Middle Eastern sources.

Based on a master’s thesis by Bjørn (2017), Foreign elements in the Proto-Indo-European vocabulary (PDF).

From the website (emphasis mine):

This page allows historical linguists to compare and scrutinize proposed prehistoric lexical borrowings from the perspective of Proto-Indo-European. The first entries are all (135 in total) extracted from my master’s thesis “Foreign elements in the Proto-Indo-European vocabulary” (Bjørn 2017). Comments are encouraged at the bottom of each entry. New entries will be added, also on request.

Take this not as the conclusion, but an invitation to join the conversation.

So, we welcome the invitation, and hope that this new project thrives.

Also, I loved his fantasy-like map of the central Eurasian region (featured image on this post).


Expansion of peoples associated with spread of haplogroups: Mongols and C3*-F3918, Arabs and E-M183 (M81)


The expansion of peoples is known to be associated with the spread of a certain admixture component, together with the expansion of a haplogroup and a reduction in its variability. In other words, a few male lineages are usually more successful than the rest during the expansion.
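One common way to put a number on "reduction in variability" is Nei's unbiased gene diversity computed over haplogroup labels. The following is a minimal sketch; the function name and the two sample lists are hypothetical illustrations, not real data.

```python
from collections import Counter


def haplogroup_diversity(labels):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(labels)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical paternal lineages before and after an expansion (not real data)
before = ["A", "B", "C", "D", "A", "B", "C", "D"]
after = ["A", "A", "A", "A", "A", "A", "A", "B"]

print(f"diversity before: {haplogroup_diversity(before):.3f}")
print(f"diversity after:  {haplogroup_diversity(after):.3f}")
```

A value near 1 means many lineages at similar frequencies; a value near 0 means one lineage dominates, which is the pattern expected when a few male lines expand rapidly.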

Known examples include:

Two recent interesting papers add prehistoric cases of potential expansion of cultures associated with haplogroups:

1. Whole Y-chromosome sequences reveal an extremely recent origin of the most common North African paternal lineage E-M183 (M81), by Solé-Morata et al., Scientific Reports (2017).


E-M183 (E-M81) is the most frequent paternal lineage in North Africa and thus it must be considered to explore past historical and demographical processes. Here, by using whole Y chromosome sequences from 32 North African individuals, we have identified five new branches within E-M183. The validation of these variants in more than 200 North African samples, from which we also have information of 13 Y-STRs, has revealed a strong resemblance among E-M183 Y-STR haplotypes that pointed to a rapid expansion of this haplogroup. Moreover, for the first time, by using both SNP and STR data, we have provided updated estimates of the times-to-the-most-recent-common-ancestor (TMRCA) for E-M183, which evidenced an extremely recent origin of this haplogroup (2,000–3,000 ya). Our results also showed a lack of population structure within the E-M183 branch, which could be explained by the recent and rapid expansion of this haplogroup. In spite of a reduction in STR heterozygosity towards the West, which would point to an origin in the Near East, ancient DNA evidence together with our TMRCA estimates point to a local origin of E-M183 in NW Africa.

Distribution of E-M183 subclades among North Africa, the Near East and the Iberian Peninsula. Pie chart sectors areas are proportional to haplogroup frequency and are coloured according to haplogroup in the schematic tree to the right. n: sample size. Map was generated using R software.

An interesting excerpt, from the discussion:

Regarding the geographical origin of E-M183, a previous study suggested that an expansion from the Near East could explain the observed east-west cline of genetic variation that extends into the Near East. Indeed, our results also showed a reduction in STR heterozygosity towards the West, which may be taken to support the hypothesis of an expansion from the Near East. In addition, previous studies based on genome-wide SNPs reported that a North African autochthonous component increase towards the West whereas the Near Eastern decreases towards the same direction, which again support an expansion from the Near East. However, our correlations should be taken carefully because our analysis includes only six locations on the longitudinal axis, none from the Near East. As a result, we do not have sufficient statistical power to confirm a Near Eastern origin. In addition, rather than showing a west-to-east cline of genetic diversity, the overall picture shown by this correlation analysis evidences just low genetic diversity in Western Sahara, which indeed could be also caused by the small sample size (n = 26) in this region. Alternatively, given the high frequency of E-M183 in the Maghreb, a local origin of E-M183 in NW Africa could be envisaged, which would fit the clear pattern of longitudinal isolation by distance reported in genome-wide studies. Moreover, the presence of autochthonous North African E-M81 lineages in the indigenous population of the Canary Islands, strongly points to North Africa as the most probable origin of the Guanche ancestors. This, together with the fact that the oldest indigenous individuals have been dated 2210 ± 60 ya, supports a local origin of E-M183 in NW Africa. Within this scenario, it is also worth to mention that the paternal lineage of an early Neolithic Moroccan individual appeared to be distantly related to the typically North African E-M81 haplogroup, suggesting again a NW African origin of E-M183.
A local origin of E-M183 in NW Africa > 2200 ya is supported by our TMRCA estimates, which can be taken as 2,000–3,000, depending on the data, methods, and mutation rates used.

The TMRCA estimates of a certain haplogroup and its subbranches provide some constraints on the times of their origin and spread. Although our time estimates for E-M78 are slightly different depending on the mutation rate used, their confidence intervals overlap and the dates obtained are in agreement with those obtained by Trombetta et al. Regarding E-M183, as mentioned above, we cannot discard an expansion from the Near East and, if so, according to our time estimates, it could have been brought by the Islamic expansion on the 7th century, but definitely not with the Neolithic expansion, which appeared in NW Africa ~7400 BP and may have featured a strong Epipaleolithic persistence. Moreover, such a recent appearance of E-M183 in NW Africa would fit with the patterns observed in the rest of the genome, where an extensive, male-biased Near Eastern admixture event is registered ~1300 ya, coincidental with the Arab expansion. An alternative hypothesis would involve that E-M183 was originated somewhere in Northwest Africa and then spread through all the region. Our time estimates for the origin of this haplogroup overlap with the end of the third Punic War (146 BCE), when Carthage (in current Tunisia) was defeated and destroyed, which marked the beginning of Roman hegemony of the Mediterranean Sea. About 2,000 ya North Africa was one of the wealthiest Roman provinces and E-M183 may have experienced the resulting population growth.

2. The Y-chromosome haplogroup C3*-F3918, likely attributed to the Mongol Empire, can be traced to a 2500-year-old nomadic group, by Zhang et al., Journal of Human Genetics (2017)


The Mongol Empire had a significant role in shaping the landscape of modern populations. Many populations living in Eurasia may have been the product of population mixture between ancient Mongolians and natives following the expansion of Mongol Empire. Geneticists have found that most of these populations carried the Y-haplogroup C3* (C-M217). To trace the history of haplogroup (Hg) C3* and to further understand the origin and development of Mongolians, ancient human remains from the Jinggouzi, Chenwugou and Gangga archaeological sites, which belonged to the Donghu, Xianbei and Shiwei, respectively, were analysed. Our results show that nine of the eleven males of the Gangga site, two of the eight males of Chenwugou site and all of the twelve males of Jinggouzi site were found to have mutations at M130 (Hg C), M217 (Hg C3), L1373 (C2b, ISOGG2015), with the absence of mutations at M93 (Hg C3a), P39 (Hg C3b), M48 (Hg C3c), M407 (Hg C3d) and P62 (Hg C3f). These samples were attributed to the Y-chromosome Hg C3* (Hg C2b, ISOGG2015), and most of them were further typed as Hg C2b1a based on the mutation at F3918. Finally, we inferred that the Y-chromosome Hg C3*-F3918 can trace its origins to the Donghu ancient nomadic group.
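The typing logic described in the abstract (derived at the backbone markers, ancestral at the known subclade markers, then checked for F3918) can be sketched as a small decision procedure. The marker names and haplogroup labels come from the quoted abstract; the function name, the return strings, and the simplification that an untyped marker counts as ancestral are my assumptions.

```python
# Marker names and labels are taken from the quoted abstract.
BACKBONE = ["M130", "M217", "L1373"]  # Hg C, Hg C3, C2b (ISOGG 2015)
SUBCLADE_MARKERS = {
    "M93": "C3a",
    "P39": "C3b",
    "M48": "C3c",
    "M407": "C3d",
    "P62": "C3f",
}


def classify(derived):
    """Assign a label from the set of markers carrying the derived allele.

    Simplifying assumption: any marker missing from `derived` is treated
    as ancestral, which ignores missing data in real ancient samples.
    """
    if not all(marker in derived for marker in BACKBONE):
        return "not C3 (or untyped)"
    for marker, label in SUBCLADE_MARKERS.items():
        if marker in derived:
            return label
    if "F3918" in derived:
        return "C3*-F3918"
    return "C3*"


# Hypothetical sample, not one of the individuals in the paper
print(classify({"M130", "M217", "L1373", "F3918"}))
```

The paragroup label C3* is thus defined by exclusion: a sample only lands there when none of the named downstream markers is derived.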

The development of Mongolia and the frequencies of haplogroup C3* in modern Eurasians. (a) The development of Mongolia. (b) The frequencies of haplogroup C3 in modern Eurasians. The dotted line represents the approximate boundary between the Xiongnu and the Donghu. The black and grey arrows denote the migration of the Donghu and Mongolians, respectively.

Featured image: Diachronic map of Iron Age migrations ca. 750-250 BC.


How to do modern phylogeography: Relationships between clans and genetic kin explain cultural similarities over vast distances


A preprint has been published on bioRxiv, Relationships between clans and genetic kin explain cultural similarities over vast distances: the case of Yakutia, by Zvenigorosky et al. (2017).


Archaeological studies sample ancient human populations one site at a time, often limited to a fraction of the regions and periods occupied by a given group. While this bias is known and discussed in the literature, few model populations span areas as large and unforgiving as the Yakuts of Eastern Siberia. We systematically surveyed 31,000 square kilometres in the Sakha Republic (Yakutia) and completed the archaeological study of 174 frozen graves, assembled between the 15th and the 19th century. We analysed genetic data (autosomal genotypes, Y-chromosome haplotypes and mitochondrial haplotypes) for all ancient subjects and confronted it to the study of 190 modern subjects from the same area and the same population. Ancient familial links and paternal clan were identified between graves up to 1500 km apart and we provide new data concerning the origins of the contemporary Yakut population and demonstrate that cultural similarities in the past were linked to (i) the expansion of specific paternal clans, (ii) preferential marriage among the elites and (iii) funeral choices that could constitute a bias in any ancient population study.

Even if you are not interested in the cultural and anthropological evolution of this Turkic-speaking people of the Russian Far Eastern region, the method used is an excellent example of how to use archaeology and genetics (especially Y-DNA and mtDNA data) to obtain meaningful results when investigating ancient populations.

For quite some time, probably since the first renowned admixture analyses of ancient DNA samples were published, we have been living under the impression that phylogeography, or simply archaeogenetics as it was called back in the day, is not needed.

Cavalli-Sforza’s assertion that the study of modern populations could offer a clear picture of past population movements is now considered wrong, and the study of Y-DNA and mtDNA haplogroups is today mostly dismissed as being of secondary importance, even among geneticists. Whole-genome investigations (and especially admixture analyses) have been leading the new wave of overconfidence in genetic results, closely tied to ignorance of their shortcomings (and to commercial interests based on desires of ethnic identification), and haplogroups are usually just reported alongside other, not entirely meaningful aspects of ancient DNA analyses.

While it is undeniable that admixture analyses are offering quite interesting results, they must be carefully balanced against known archaeological and linguistic knowledge. Phylogeography – and especially Y-DNA haplogroup assessment – is quite interesting in investigating kinship and clans in patrilocal communities – i.e. most communities in prehistoric and historic periods, unless proven otherwise.

Luckily enough, there are those researchers who still strive to obtain meaningful information from haplotypes. The article referenced in this post is quite interesting due to its phylogeographic method’s applicability to ancient cultures and peoples.

When some geneticists look at simplistic prehistoric maps, like those depicting Yamna, Afanasevo, Corded Ware, and Bell Beaker cultures together, they forget that 1) cultural regions are selected more or less arbitrarily (we only have certain scattered sites for each of these cultures); 2) economic or population contacts are difficult to ascertain and to represent graphically; and 3) time periods for archaeological sites are important – in fact, they are probably THE most important aspect in assessing how accurately a map (and its “arrows” of migration or exchange) represents reality.

A careful, detailed study like this one, if applied to the Pontic-Caspian steppe, would probably reveal how R1b subclades dominated steppe clans, beginning at least during the Suvorovo-Novodanilovka expansion to the west, and certainly representing the vast majority of lineages during the internal expansion in the Early Yamna period and its later expansion east and west of the steppe…

Featured image from the article, summing up Geography, Archaeology, and Genetics of Yakutia – including Y-DNA and mtDNA haplogroups from ancient populations.