Scytho-Siberians of Aldy-Bel and Sagly, of haplogroup R1a-Z93, Q1b-L54, and N


Recently, a paper described Eastern Scythian groups as “Uralic-Altaic” just because of the appearance of haplogroup N in two Pazyryk samples.

This simplistic identification is contested by the varied haplogroups found in early Altaic groups, by the early link of Cimmerians with the expansion of hg. N and Q, by the link of N1c-L392 in north-eastern Europe with Palaeo-Laplandic, and now (paradoxically) by the clear link between early Mongolic expansion and N1c-L392 subclades.

A new paper (behind paywall) offers insight into the prevalent presence of R1a-Z93 among eastern Scytho-Siberian groups (most likely including Samoyedic speakers in the forest-steppes), and a new hint to the westward expansion of haplogroups Q and N (probably coupled with the so-called “Siberian ancestry”) from the east with different groups of Iron Age steppe nomads:

Genetic kinship and admixture in Iron Age Scytho-Siberians, by Mary et al. Human Genetics (2019).

Interesting excerpts (emphasis mine):

From an archeological and historical point of view, the term “Scythians” refers to Iron Age nomadic or seminomadic populations characterized by the presence of three types of artifacts in male burials: typical weapons, specific horse harnesses and items decorated in the so-called “Animal Style”. This complex of goods has been termed the “Scythian triad” and was considered to be characteristic of nomadic groups belonging to the “Scythian World” (Yablonsky 2001). This “Scythian World” includes both the Classic (or European) Scythians from the North Pontic region (7th–3rd century BC) and the Southern Siberian (or Asian) populations of the Scythian period (also called Scytho-Siberians). These include, among others, the Sakas from Kazakhstan, the Tagar population from the Minusinsk Basin (Republic of Khakassia), the Aldy-Bel population from Tuva (Russian Federation) and the Pazyryk and Sagly cultures from the Altai Mountains.

Proportions of Scythian mtDNA haplogroups. Western (blue) and eastern (pink) Eurasian lineages are equally distributed in the Arzhan Scytho-Siberian sample. The U5a2a1 haplogroup shared between the two Scythian groups studied is in bold

In this work, we first aim to address the question of the familial and social organization of Scytho-Siberian groups by studying the genetic relationship of 29 individuals from the Aldy-Bel and Sagly cultures using autosomal STRs. (…) were obtained from 5 archeological sites located in the valley of the Eerbek river in Tuva Republic, Russia (Fig. 1). All the mounds of this archeological site were excavated but DNA samples were not collected from all of them. 14C dates mainly fall within the Hallstatt radiocarbon calibration plateau (ca. 800–400 cal BC) where the chronological resolution is poor. Only one date falls on an earlier segment of calibration curve: Le 9817–2650 ± 25 BP, i.e. 843–792 cal BC with a probability of 94.3% (using the OxCal v4.3.2 program). This sample (Bai-Dag 8, Kurgan 1, grave 10) is not from one of the graves studied but was used to date the kurgan as a whole.

Y-chromosome haplogroups were first assigned using the ISOGG 2018 nomenclature. In order to improve the precision of haplogroup definition, we also analyzed a set of Y-chromosome SNP (Supplementary Table 2). Nine samples belonged to the R1a-M513 haplogroup (defined by marker M513) and two of these nine samples were characterized as belonging to the R1a1a1b2-Z93 haplogroup or one of its subclades. Six samples belonged to the Q1b1a-L54 haplogroup and five of these six samples belonged to the Q1b1a3-L330 subclade. One sample belonged to the N-M231 haplogroup.


The distribution of these haplogroups in the population must be confronted with the prevalence of kinship among the samples. Although five individuals belonged to haplogroup Q1b1a3-L330, three of them (ARZ-T18, ARZ-T19 and ARZ-T20) were paternally related (Fig. 2). It must, therefore, be considered that haplogroup Q1b1a3-L330 is present in three independent instances (given that the remaining two instances exhibit no close familial relationship with other samples or one another). All five were buried on the Eki-Ottug 1 archaeological site (although in two different kurgans).
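The counting step in this excerpt (five carriers, three of them paternally related, hence three independent instances) can be sketched in a few lines. This is only an illustration, assuming kinship is supplied as a list of related pairs. Only ARZ-T18, ARZ-T19 and ARZ-T20 are named in the excerpt, so the labels `L330-a` and `L330-b` and the function name are hypothetical placeholders.

```python
def count_independent_lineages(samples, related_pairs):
    """Count sample groups after merging paternally related individuals (union-find)."""
    parent = {s: s for s in samples}

    def find(x):
        # Follow parent links to the group representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in related_pairs:
        parent[find(a)] = find(b)

    return len({find(s) for s in samples})


# Three named samples from the excerpt plus two hypothetical labels for the unrelated carriers.
l330_samples = ["ARZ-T18", "ARZ-T19", "ARZ-T20", "L330-a", "L330-b"]
paternal_kin = [("ARZ-T18", "ARZ-T19"), ("ARZ-T19", "ARZ-T20")]

independent = count_independent_lineages(l330_samples, paternal_kin)
print(independent)  # 3
```

Treating kinship as a graph and counting connected groups keeps the tally correct even when relatedness is only known pairwise.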

In the same way, although two groups, of two and three individuals, shared haplotypes belonging to the R1a-M513 haplogroup, these groups likely include a father/son pair (ARZ-T2 and ARZ-T12). Therefore, among nine R1a-M513 men, we found six independent haplotypes, one being present in two independent instances. All R1a-M513 haplotypes, however, including those attributed to the R1a1a1b2-Z93 subclade, only differed by one-step mutations, across 5 loci at most. All R1a-M513 individuals were buried on the same site, Eki-Ottug 2, in a single Kurgan.
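To make "only differed by one-step mutations, across 5 loci at most" concrete, here is a minimal sketch of how two Y-STR haplotypes can be compared. The locus names are standard Y-STR markers, but the repeat counts are invented for illustration and are not taken from the paper. The function name is likewise an assumption.

```python
def compare_haplotypes(h1, h2):
    """Return (number of differing loci, largest repeat-count difference)."""
    shared = h1.keys() & h2.keys()
    diffs = [abs(h1[locus] - h2[locus]) for locus in shared if h1[locus] != h2[locus]]
    return len(diffs), max(diffs, default=0)


# Hypothetical repeat counts, for illustration only (not from the paper).
hap_a = {"DYS19": 16, "DYS390": 25, "DYS391": 11, "DYS392": 11}
hap_b = {"DYS19": 16, "DYS390": 24, "DYS391": 10, "DYS392": 11}

n_loci, max_step = compare_haplotypes(hap_a, hap_b)
print(n_loci, max_step)  # 2 1
```

A pair like this one, differing at two loci by a single repeat each, would fit the pattern the authors describe for the R1a-M513 haplotypes.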


Haplogroup R1a-M173 was previously reported for 6 Scytho-Siberian individuals from the Tagar culture (Keyser et al. 2009) and one Altaian Scytho-Siberian from the Sebÿstei site (Ricaut et al. 2004a), whereas haplogroup R1a1a1b2-Z93 (or R1a1a1b-S224) was described for one Scythian from Samara (Mathieson et al. 2015) and two Scytho-Siberians from Berel and the Tuva Republic (Unterländer et al. 2017). On the contrary, North Pontic Scythians were found to belong to the R1b1a1a2 haplogroup (Krzewińska et al. 2018), showing a distinction between the two groups of Scythians. (…) The absence of R1b lineages in the Scytho-Siberian individuals tested so far and their presence in the North Pontic Scythians suggest that these 2 groups had a completely different paternal lineage makeup with nearly no gene flow from male carriers between them.

The seven other male individuals studied in this work were found to carry Eastern Eurasian Y haplogroups Q1b1a and one of its subclades (n = 6) and N (n = 1). Haplogroup Q1b1a-L54 was previously described in four males from the Bronze Age in the Altai Mountains (Hollard et al. 2014, 2018) and was clearly associated with Siberian populations (Regueiro et al. 2013).

The N-M231 haplogroup emerged from haplogroup K in Southern Asia around 21,000 years BCE, maybe in Southern China (Shi et al. 2013; Ilumäe et al. 2016). Previous studies attested to its presence in samples from Neolithic and Bronze Age in China (Li et al. 2011; Cui et al. 2013). Waves of northwestern expansion of this haplogroup are described as beginning during the Paleolithic period (Derenko et al. 2006; Shi et al. 2013) but traces of this expansion in archeological samples were reported only in two Scytho-Siberian males from the Altai (Pilipenko et al. 2015).

The sample of haplogroup N comes from the Aldy-Bel culture (ARZ-T15), from the Eerbek site, but has no radiocarbon date. All Q1b-L330 samples come from the Sagly culture, and three are paternally related. The other Q1b-L54 sample is from other tombs in one kurgan at Aldy Bel.

It seems that – exactly as expected – different waves of steppe nomads brought different lineages at a time (the Iron Age) when many regions incorporated different eastern lineages without necessarily changing language, just like the expansion of N among Ugrians and Samoyeds, and N1c among Finno-Permic peoples, and like many other lineages expanding with federation-like groups in eastern, central, and western Europe.


Updated phylogenetic tree of haplogroup Q-M242 points to Palaeolithic expansions


New paper (behind paywall) Paternal origin of Paleo-Indians in Siberia: insights from Y-chromosome sequences by Wei et al., Eur. J. Hum. Genet. (2018)

Interesting excerpts (for Eurasian migrations):

Differentiation and diffusion in Palaeolithic Siberia

Based on the phylogenetic analyses and the current distributions of relative sub-lineages, we propose that the prehistoric population differentiation in Siberia after the LGM (post-LGM) provided the genetic basis for the emergence of the Paleo-Indian, American aborigine, population. According to the phylogenetic tree of Y-chromosome haplogroup C2-M217 (Fig. 2 and Figure S1), eight sub-lineages emerged in a short period between 15.3 kya and 14.3 kya (Table S5). Within these sub-lineages, haplogroups C2-M48, C2-F1918, and C2-F1756 are predominant paternal lineages in modern Altaic-speaking populations [46, 51, 52]. Samples of haplogroups C2-F8535 and C2-P53.1 were found in two Turkic- and Mongolic-speaking minorities in China (Table S1). Both archeological and genetic data suggest that Altaic-speaking populations are results of population expansion in the past several thousand years in the Altai Mountain, Mongolia Plateau, and Amur River region [51–54].

By contrast, three other sub-lineages, C2-B79, C2-B77, and C2-P39, appear only in Koryaks and Native Americans [16, 35]. The latitude of the Altai Mountain, the Mongolia Plateau, and Amur River region are much lower than that of Beringia, where the ancestors of Native Americans finally separated from their close relatives in Siberia. Therefore, the phylogeographic patterns of sub-lineages of C2-M217 in this study reveal a major splitting event between populations in a lower latitude region of Siberia and ancestors of Koryaks and Native Americans during the post-LGM period.

The sub-lineages of the Y-chromosome Q-M242 haplogroup were found in populations throughout the Eurasia continent. According to available data, the Q1-L804 lineage is exclusively found in Northwest Europe, while Q1-M120 is primarily restricted to East Asia [48]. Additionally, the lineage Q1-L330 is the predominant paternal lineage in Altai, Tuva, and Kets in South Siberia [34–36, 55]. A number of Q1-M242 samples have also been found in ancient remains from South Siberia and adjacent regions [56, 57]. Other sub-lineages of Q-M242 are scattered widely in different geographic regions of Eurasia, including Q1-L275, Q1-M25, and Q1-Y2659 [14, 35, 37, 58]. Additionally, the Y-chromosome of a 6000–5100 BCE sample (I4550) from Zvejnieki, Latvia has been identified as Q1-L56 [59]. These findings suggest that the sub-lineages of Q-M242 started to diffuse throughout Eurasia in a very ancient period.

Founding paternal lineages of American aborigines and their most closely related lineages among Eurasia populations

Emergence of Paleo-Indian populations

The revised phylogenetic tree of Y-chromosome haplogroup Q-M242 in this study provides clues regarding the origin of Native American lineages Q1-M3 and Q1-Z780 (Fig. 3). According to our estimates, haplogroup Q1-L54 expanded rapidly between 17.2 kya and 15.0 kya and finally gave rise to two major founding paternal lineages of Native American populations, known as Q1-Z780 and Q1-M3. Ancient DNA studies indicate that the early population in South Siberia, represented by MA1 genomes, had a genetic influence on both modern western European and Native American populations [7]. Therefore, we conclude that the accumulated diversity of sub-lineages of Q-M242 before 15.3 kya resulted from the in situ differentiation of Q-M242 in Central Eurasia and South Siberia since the Paleolithic Age, and the appearance of the Paleo-Indian population is part of the great human diffusion throughout the Eurasia after the Last Glacial Maximum.

The Southern Caucasus PIE homeland

Image modified from Wang et al. (2018). Samples projected in PCA of 84 modern-day West Eurasian populations (open symbols). Previously known clusters have been marked and referenced. An EHG and a Caucasus ‘clouds’ have been drawn, leaving Pontic-Caspian steppe and derived groups between them. See the original file here.

The origins of Q-M242 in Zvejnieki, like those of Lola (Q1a2-M25) and Steppe Maykop (Q1a2-M25) from Wang et al. (2018), are therefore most likely to be found in migrations throughout North Eurasia dated to the Palaeolithic.

As you might remember, the sample of haplogroup Q1a from Khvalynsk was the closest one (in the PCA, see above) to those we now know most likely represent one or more groups of the steppe north of the Caucasus, which were absorbed during the formation and expansion of Khvalynsk.

NOTE. In fact, the position of this early Khvalynsk sample in the PCA is near the Steppe Eneolithic cluster, in turn near ANE (with the Lola sample Q1a2-M25, circle in dark blue/violet above), and Steppe Maykop (which includes the other Q1a2-M25 sample).

It is often assumed that these populations absorbed in the Pontic-Caspian steppe were dominated by haplogroup J, due to the oldest representatives of CHG ancestry (Kotias Klde and Satsurblia).

However, it would not be surprising now to find out that (one or more of) these “CHG/ANE-rich” groups from the steppe (possibly the Kairshak culture in the North Caspian region) were in fact dominated by Q1-M25 subclades.

If this is the case, I don’t know where the proponents of the (south of the) Caucasus homeland will retreat to.


Recent African origin with hybridization, and back to Africa 70,000 years ago


Open access Carriers of mitochondrial DNA macrohaplogroup L3 basal lineages migrated back to Africa from Asia around 70,000 years ago, by Cabrera et al. BMC Evol Biol (2018) 18(98).

Abstract (emphasis mine):


The main unequivocal conclusion after three decades of phylogeographic mtDNA studies is the African origin of all extant modern humans. In addition, a southern coastal route has been argued for to explain the Eurasian colonization of these African pioneers. Based on the age of macrohaplogroup L3, from which all maternal Eurasian and the majority of African lineages originated, the out-of-Africa event has been dated around 60-70 kya. On the opposite side, we have proposed a northern route through Central Asia across the Levant for that expansion and, consistent with the fossil record, we have dated it around 125 kya. To help bridge differences between the molecular and fossil record ages, in this article we assess the possibility that mtDNA macrohaplogroup L3 matured in Eurasia and returned to Africa as basal L3 lineages around 70 kya.


The coalescence ages of all Eurasian (M,N) and African (L3) lineages, both around 71 kya, are not significantly different. The oldest M and N Eurasian clades are found in southeastern Asia instead of near Africa as expected by the southern route hypothesis. The split of the Y-chromosome composite DE haplogroup is very similar to the age of mtDNA L3. An Eurasian origin and back migration to Africa has been proposed for the African Y-chromosome haplogroup E. Inside Africa, frequency distributions of maternal L3 and paternal E lineages are positively correlated. This correlation is not fully explained by geographic or ethnic affinities. This correlation rather seems to be the result of a joint and global replacement of the old autochthonous male and female African lineages by the new Eurasian incomers.


These results are congruent with a model proposing an out-of-Africa migration into Asia, following a northern route, of early anatomically modern humans carrying pre-L3 mtDNA lineages around 125 kya, subsequent diversification of pre-L3 into the basal lineages of L3, a return to Africa of Eurasian fully modern humans around 70 kya carrying the basal L3 lineages and the subsequent diversification of Eurasian-remaining L3 lineages into the M and N lineages in the outside-of-Africa context, and a second Eurasian global expansion by 60 kya, most probably, out of southeast Asia. Climatic conditions and the presence of Neanderthals and other hominins might have played significant roles in these human movements. Moreover, recent studies based on ancient DNA and whole-genome sequencing are also compatible with this hypothesis.


You can also read the recent interesting open access review How did Homo sapiens evolve? by Julia Galway-Witham, Chris Stringer, Science (2018) 360:6395 1296-1298.


Domesticated horse population structure, selection, and mtDNA geographic patterns


Open access Detecting the Population Structure and Scanning for Signatures of Selection in Horses (Equus caballus) From Whole-Genome Sequencing Data, by Zhang et al., Evolutionary Bioinformatics (2018) 14:1–9.

Abstract (emphasis mine):

Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kB, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. 
The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
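The abstract reports 1052 significant 10 kb windows collapsing into 933 core regions. One plausible way to get from windows to regions, sketched below, is to merge windows that touch or overlap on the same chromosome. This merging rule is an assumption for illustration, not necessarily the procedure used by Zhang et al., and the coordinates are invented.

```python
def merge_windows(windows, size=10_000):
    """Merge significant windows, given as (chrom, start), that touch or overlap."""
    regions = []
    for chrom, start in sorted(windows):
        end = start + size
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Extend the previous region on the same chromosome.
            regions[-1] = (chrom, regions[-1][1], max(regions[-1][2], end))
        else:
            regions.append((chrom, start, end))
    return regions


# Hypothetical significant windows, for illustration only.
hits = [("chr1", 0), ("chr1", 10_000), ("chr1", 50_000), ("chr2", 10_000)]
core_regions = merge_windows(hits)
print(len(hits), len(core_regions))  # 4 3
```

Under this rule the number of regions can never exceed the number of windows, which is consistent with the 1052 versus 933 figures.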

Bayesian clustering output for 5 K values from K = 2 to K = 8 in 45 domestic horses. Each individual is represented by a vertical line, which is partitioned into colored segments that represent the proportion of the inferred K clusters.

Interesting excerpts:

Admixture proportions were assessed without user-defined population information to infer the presence of distinct populations among the samples (Figure 2). At K = 3 or K = 4, Franches-Montagnes and Arabian forms one unique cluster; at K = 5, Jeju pony forms one unique cluster. For other breeds, comparatively strong population structure exists among breeds, and they can be assigned to 2 (or 3) alternate clusters from K = 3 to K = 5 including group A (Duelmener, Fjord, Icelandic, Kazakh, Lichuan, and Mongolian) and group B (Hanoverian, Morgan, Quarter, Sorraia, and Standardbred). For group A, geographically this was unexpected, where Nordic breeds (Norwegian Fjord, Icelandic, and Duelmener) clustered with Asian breeds including the Mongolian. Previous results of mitochondrial DNA have revealed links between the Mongolian horse and breeds in Iceland, Scandinavia, Central Europe, and the British Isles. The Mongol horses are believed to have been originally imported from Russia and subsequently became the basis for the Norwegian Fjord horse [31]. At K = 6, Sorraia forms one unique cluster. The Sorraia horse has no long history as a domestic breed but is considered to be of a nearly ancestral type in the southern part of the Iberian Peninsula [32]. However, our result did not support Sorraia as an independent ancestral type based on result from K = 2 to K = 5, and the unique cluster in K = 6 may be explained by the small population size and recent inbreeding programs. Genetic admixture of Morgan reveals that these breeds are currently or traditionally continually crossed with other breeds from K = 2 to K = 8. The Morgan horse has been a largely closed breed for 200 years or more but there has been some unreported crossbreeding in recent times [33].

Principal component analysis results of all 48 horses. The x-axis denotes the value of PC1, whereas the y-axis denotes the value of PC2. Each dot in the figure represents one individual.

Bayesian clustering and PCA demonstrated the relationships among the horse breeds with weak geographic patterns. The tight grouping within most native breeds and looser grouping of individuals in admixed breeds have been reported previously in modern horses using data from a 54K SNP chip [33, 34]. Cluster analysis reveals that Arabian or Franches-Montagnes forms one unique cluster with relatively low K value, which is consistent with former study using 50K SNP chip [33, 34]. Interestingly, Standardbred forms a unique cluster with relatively high K value in this study, different from previous study [33]. To date, no footprints are available to describe how the earliest domestic horses spread into China in ancient times. Our study found that Kazakh and Lichuan were assigned to the same lineage as other native Asian breeds, in agreement with previous studies on the origin of Chinese domestic horses [4, 5, 35, 36]. The strong genetic relationship between Asian native breeds and European native breeds has made it more difficult to understand the population history of the horse across Eurasia. Low levels of population differentiation observed between breeds might be explained by historical admixture. Unlike the domestic pig in China [8], we suggest that in China, Northern/Southern distinct groups could not be used to genetically distinguish native Chinese horse breeds. We consider that during domestication process of horse, gene flow continued among Chinese-domesticated horses.

Open access Some maternal lineages of domestic horses may have origins in East Asia revealed with further evidence of mitochondrial genomes and HVR-1 sequences, by Ma et al., PeerJ (2018).


There are large populations of indigenous horse (Equus caballus) in China and some other parts of East Asia. However, their matrilineal genetic diversity and origin remained poorly understood. Using a combination of mitochondrial DNA (mtDNA) and hypervariable region (HVR-1) sequences, we aim to investigate the origin of matrilineal inheritance in these domestic horses.

To investigate patterns of matrilineal inheritance in domestic horses, we conducted a phylogenetic study using 31 de novo mtDNA genomes together with 317 others from the GenBank. In terms of the updated phylogeny, a total of 5,180 horse mitochondrial HVR-1 sequences were analyzed.

Eighteen haplogroups (Aw-Rw) were uncovered from the analysis of the whole mitochondrial genomes. Most of which have a divergence time before the earliest domestication of wild horses (about 5,800 years ago) and during the Upper Paleolithic (35–10 KYA). The distribution of some haplogroups shows geographic patterns. The Lw haplogroup contained a significantly higher proportion of European horses than the horses from other regions, while haplogroups Jw, Rw, and some maternal lineages of Cw, have a higher frequency in the horses from East Asia. The 5,180 sequences of horse mitochondrial HVR-1 form nine major haplogroups (A-I). We revealed a corresponding relationship between the haplotypes of HVR-1 and those of whole mitochondrial DNA sequences. The data of the HVR-1 sequences also suggests that Jw, Rw, and some haplotypes of Cw may have originated in East Asia while Lw probably formed in Europe.

Our study supports the hypothesis of the multiple origins of the maternal lineage of domestic horses and some maternal lineages of domestic horses may have originated from East Asia.

Median joining network constructed based on the 247-bp HVR-1 sequences. Circles are proportional to the number of horses represented and a scale indicator (for node sizes) was provided. The length of lines represents the number of variants that separate nodes (some manual adjustment was made for visual clarity). In the circles, the colors of solid pie slices indicate studied horse populations: Orange, European horses; Blue, horses of West Asia; Light Green, horses from East Asia; Grey, ancient horses; Purple, Przewalskii horses.

Geographic distributions of horse mtDNA haplogroups

The analysis of geographic distribution of the mitochondrial genome haplogroups showed that horse populations in Europe or East Asia included all haplogroups defined from the mtDNA genome sequences. The lineage Fw comprised entirely of Przewalskii horses. The two haplogroups Iw and Lw displayed frequency peaks in Europe (14.08% and 37.32%, respectively) and a decline to the east (9.33% and 8.00% in the West Asia, and 6.45% and 12.90% in East Asia, respectively), especially for Lw, which contained the largest number of European horses (Table 2). However, an opposite distribution pattern was observed for haplogroups Aw, Hw, Jw, and Rw, which were harbored by more horses from East Asia than those from other regions. The proportions of horses from East Asia for the four haplogroups were 38%, 88%, 62%, and 54%, respectively.
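Note that this excerpt mixes two different percentages: the frequency of a haplogroup within a region (e.g. Lw at 37.32% of European horses) and the share of a haplogroup's carriers that come from a region (e.g. 88% of Hw horses being East Asian). A small sketch with invented counts shows the difference. The data and function names below are hypothetical and only illustrate the two denominators.

```python
from collections import Counter

# Hypothetical (region, haplogroup) assignments, for illustration only.
horses = [
    ("Europe", "Lw"), ("Europe", "Lw"), ("Europe", "Iw"), ("Europe", "Jw"),
    ("East Asia", "Jw"), ("East Asia", "Jw"), ("East Asia", "Rw"), ("East Asia", "Lw"),
]

pair_counts = Counter(horses)
region_totals = Counter(region for region, _ in horses)
hg_totals = Counter(hg for _, hg in horses)


def freq_within_region(region, hg):
    """Percentage of horses in `region` that carry haplogroup `hg`."""
    return 100 * pair_counts[(region, hg)] / region_totals[region]


def share_of_haplogroup(region, hg):
    """Percentage of all `hg` carriers that come from `region`."""
    return 100 * pair_counts[(region, hg)] / hg_totals[hg]


lw_in_europe = freq_within_region("Europe", "Lw")
jw_from_east_asia = share_of_haplogroup("East Asia", "Jw")
print(lw_in_europe)                 # 50.0
print(round(jw_from_east_asia, 2))  # 66.67
```

The second measure depends on how many horses were sampled per region, so it is best read alongside the sample sizes in the paper's Table 2.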

Schematic phylogeny of mtDNAs genome from modern horses. This tree includes 348 sequences and was rooted at a donkey (E. asinus) mitochondrial genome (not displayed). The topology was inferred by a BEAST approach, whereas a time divergence scale (based on rate substitutions) is shown on the bottom (age estimates were indicated with thousand years (KY)). The percentages on each branch represent Bayesian posterior credibility and the alphabets on the right represent the names of haplogroups. Additional details concerning ages were given in Tables S3 and S6.


Distribution of Southern Iberian haplogroup H indicates exchanges in the western Mediterranean

Recent open access paper The distribution of mitochondrial DNA haplogroup H in southern Iberia indicates ancient human genetic exchanges along the western edge of the Mediterranean, by Hernández, Dugoujon, Novelletto, Rodríguez, Cuesta and Calderón, BMC Genetics (2017).

Abstract (emphasis mine):

The structure of haplogroup H reveals significant differences between the western and eastern edges of the Mediterranean, as well as between the northern and southern regions. Human populations along the westernmost Mediterranean coasts, which were settled by individuals from two continents separated by a relatively narrow body of water, show the highest frequencies of mitochondrial haplogroup H. These characteristics permit the analysis of ancient migrations between both shores, which may have occurred via primitive sea crafts and early seafaring. We collected a sample of 750 autochthonous people from the southern Iberian Peninsula (Andalusians from Huelva and Granada provinces). We performed a high-resolution analysis of haplogroup H by control region sequencing and coding SNP screening of the 337 individuals harboring this maternal marker. Our results were compared with those of a wide panel of populations, including individuals from Iberia, the Maghreb, and other regions around the Mediterranean, collected from the literature.

Both Andalusian subpopulations showed a typical western European profile for the internal composition of clade H, but eastern Andalusians from Granada also revealed interesting traces from the eastern Mediterranean. The basal nodes of the most frequent H sub-haplogroups, H1 and H3, harbored many individuals of Iberian and Maghrebian origins. Derived haplotypes were found in both regions; haplotypes were shared far more frequently between Andalusia and Morocco than between Andalusia and the rest of the Maghreb. These and previous results indicate intense, ancient and sustained contact among populations on both sides of the Mediterranean.

Our genetic data on mtDNA diversity, combined with corresponding archaeological similarities, provide support for arguments favoring prehistoric bonds with a genetic legacy traceable in extant populations. Furthermore, the results presented here indicate that the Strait of Gibraltar and the adjacent Alboran Sea, which have often been assumed to be an insurmountable geographic barrier in prehistory, served as a frequently traveled route between continents.

a, b, c. Interpolated frequency surfaces of clade H and its main sub-clades (H1 and H3). Frequencies (%) are shown in a colour scale. See information about the populations used in Additional files 4 and 5. Map templates were taken from Natural Earth free map repository.

I usually find mtDNA data, especially studies like this one based on modern populations, very difficult to interpret for anthropological purposes. It is well-known that there are important differences in the pattern of Y-DNA and mtDNA expansion and distribution.

A paragraph in this respect caught my attention:

The patterns of variation in the Y-chromosome between western and eastern Andalusians, based on 416 males, have also been investigated for a set of Y-Short Tandem Repeats (Y-STRs) and Y-SNPs ([53, 54, 55]; Calderón et al., unpublished data) in combination with mtDNA analyses ([18, 19] and present study). In general, for both uniparental markers, Andalusians exhibit a typical western European genetic background, with peak frequencies of mtDNA Hg H and Y-chromosome Hg R1b1b2-M269 (45% and 60%, respectively). Interestingly, our results have further revealed that the influence of African female input is far more significant when compared to male influence in contemporary Andalusians. The lack of correspondence between the maternal and paternal genetic profiles of human populations reflects intrinsic differences in migratory behavior related to sex-biased processes and admixture, as well as differences in male and female effective population sizes related to the variance in reproductive success affected, for example, by polygyny [56, 57].

I think that the greater reduction in patrilineal lineages compared to maternal lineages we usually see during and after prehistoric or historic migrations has more to do with the renowned Uí Néill family case and with war-related casualties (since combatants were usually men) than with other more popular explanations, such as enslavement of women or polygyny.

The most successful paternal lines (anywhere in the world) were probably those who remained in power for a long time (be it a patriarchal society based on families, clans, or more complex organizational units), who were richer and thus more capable of having healthy offspring, who in turn were able to survive longer and have more children who inherited power, etc.

In case of recent migrations or population movements that disrupt the previously established organization, after a certain number of generations, successful patrilocal families (usually from incoming lineages) might slowly dominate over a whole region, with poorer families (usually of ‘indigenous’ lineages) suffering a greater – especially perinatal and child – mortality, without any obvious (pre)historic event associated to these gradual changes.

This gradual replacement of paternal lineages is compatible with the adoption of the native language by newcomers. If the number of migrants is greater than the native population, and especially if their technology is more advanced, then a more radical change including ethnolinguistic identification is more likely.

I don’t deny the (pre)historic existence of radical replacement of male populations with continuity of female lineages due to massacres of men, female slavery, or polygyny, but they are probably not the main explanation for most regional differences seen in paternal lineages, and should thus be used with caution.

Gradual replacement and founder effects are also the most logical explanation for why autochthonous continuity myths (that the modern regional prevalence of few successful lineages tended to create in the 2000s) haven’t been corroborated by ancient DNA; e.g. R1b-DF27 in Basques, N1c-M178 in Finnic populations, R1a-Z283 in Slavs, etc. There is nothing different in those areas from other recent founder effects and internal migratory flows seen everywhere in Europe in the past millennia.

Paper discovered via a link by Alberto Gonzalez on Facebook group Iberia ADN


Before steppe ancestry: Europe’s genetic diversity shaped mainly by local processes, with varied sources and proportions of hunter-gatherer ancestry


The definitive publication of a BioRxiv preprint article, in Nature: Parallel palaeogenomic transects reveal complex genetic history of early European farmers, by Lipson et al. (2017).

The dataset with all new samples is available at the Reich Lab’s website. You can try my drafts on how to do your own PCA and ADMIXTURE analysis with some of their new datasets.


Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.

There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.

This study kept confirming the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, though, which offers strong reasons to reject the Indo-European from the west hypothesis.

Here are first the PCA of samples included in this paper, and then the PCA of ancient Eurasians (Mathieson et al. 2017) and modern populations (Lazaridis et al. 2014) for comparison of similar clusters:

First two principal components from the PCA. We computed the principal components (PCs) for a set of 782 present-day western Eurasian individuals genotyped on the Affymetrix Human Origins array (background grey points) and then projected ancient individuals onto these axes. A close-up omitting the present-day Bedouin population is shown. From Lipson et al. (2017).
PCA of South-East European and other European samples from Mathieson et al. (2017)
Ancient and modern samples on Lazaridis et al. (2014)
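The projection approach described in the first caption (axes computed from present-day individuals only, ancient individuals then placed on those axes) can be sketched with plain NumPy. This is a simplified illustration under stated assumptions: genotypes are random 0/1/2 values generated here, there is no missing data, and no per-SNP normalization is applied. Dedicated population-genetics tools handle these aspects differently, and the function names are hypothetical.

```python
import numpy as np


def fit_pca(reference, n_components=2):
    """Return the reference mean and the top principal axes (as rows) via SVD."""
    mean = reference.mean(axis=0)
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(samples, mean, axes):
    """Center samples with the reference mean and project them onto the reference axes."""
    return (samples - mean) @ axes.T


rng = np.random.default_rng(0)
# Hypothetical genotype matrices (individuals x SNPs), values 0/1/2.
modern = rng.integers(0, 3, size=(50, 200)).astype(float)
ancient = rng.integers(0, 3, size=(5, 200)).astype(float)

mean, axes = fit_pca(modern)          # axes are defined by modern samples only
modern_pcs = project(modern, mean, axes)
ancient_pcs = project(ancient, mean, axes)
print(modern_pcs.shape, ancient_pcs.shape)  # (50, 2) (5, 2)
```

Because the ancient samples do not contribute to the axes, their typically lower data quality cannot distort the reference space, which is the point of projecting rather than computing a joint PCA.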