Review article about Ancient Genomics, by Pontus Skoglund and Iain Mathieson


A preprint article by two of the most prolific researchers in Human Ancestry is out, and they request feedback: Ancient genomics: a new view into human prehistory and evolution, by Skoglund and Mathieson (2017). Right now, it is downloadable on Dropbox.


The first decade of ancient genomics has revolutionized the study of human prehistory and evolution. We review new insights based on ancient genomic data, including greatly increased resolution of the timing and structure of the out-of-Africa event, the diversification of present-day non-African populations, and the earliest expansions of those populations into Eurasia and America. Prehistoric genomes now document patterns of population continuity and change on every inhabited continent–in particular the effect of agricultural expansions in Africa, Europe and Oceania–and record a history of natural selection that shapes present-day phenotypic diversity. Despite these advances, much remains unknown, in particular about the genomic histories of Asia–the most populous continent, and Africa–the continent that contains the most genetic diversity. Ancient genomes from these and other regions, integrated with a growing understanding of the genomic basis of human phenotypic diversity, will be in focus during the next decade of research in the field.

The paper may be highly recommended as an introduction for anyone interested in the field of Human Ancestry in general.

However, its short summary of steppe ancestry expansion (where the Corded Ware culture predominates) is still reminiscent of the infamous “Yamnaya -> Corded Ware -> Bell Beaker” model set forth by the 2015 Nature articles on the subject, and Kristiansen’s Indo-European Corded Ware theory.

Here is an excerpt (emphasis mine):

The next substantial change is closely related to ancestry that by around 5000 BP extended over a region of more than 2000 miles of the Eurasian steppe, including in individuals associated with the Yamnaya Cultural Complex in far-eastern Europe (1; 38) and with the Afanasievo culture in the central Asian Altai mountains (1). This “steppe” ancestry is itself a mixture between ancestry that is related to Mesolithic hunter-gatherers of eastern Europe and ancestry that is related to both present-day populations (38) and Mesolithic hunter-gatherers (46) from the Caucasus mountains, and also to the populations of Neolithic (11), and Copper Age (56) Iran. Steppe ancestry appeared in southeastern Europe by 6000 BP (72), northeastern Europe around 5000 BP (47) and central Europe at the time of the Corded Ware Complex around 4600 BP (1; 38). These dates are reasonably tight constraints, because in each case there is no evidence of steppe ancestry in individuals immediately preceding these dates (47; 72). Gene flow on the steppe was extensive and bidirectional, as shown by the eastward flow of Anatolian Neolithic ancestry– reaching well into central Eurasia by the time of the Andronovo culture ~3500 BP (1)–and the westward flow of East Asian ancestry–found in individuals associated with the Iron Age Scythian culture close to the Black Sea ~2500 BP (143).

Copper and Bronze Age population movements (14; 78 Martiniano, 2017 #8761; 85; 112), as well as later movements in the Iron Age and Historical period (70; 119) further distributed steppe ancestry around Europe. Present-day western European populations can be modeled as mixtures of these three ancestry components (Mesolithic hunter-gatherer, Anatolian Neolithic and Steppe) (38; 57). In eastern Europe, further shifts in ancestry are the result of additional or distinct gene flow from Anatolia throughout the Neolithic and Bronze Age in the Aegean (42; 51; 55; 72; 87), and gene flow from Siberian-related populations in Finland and the Baltic region (38). East-west gene flow also brought new ancestry–related to populations from Copper Age Iran–to the Levant during the Copper and Bronze ages (39; 56).

The geographic structure of these population transformations gave rise to population structure of present-day Europe. For example Anatolian Neolithic ancestry is highest in southern European populations like Sardinians, and lowest in northern European populations (38). Steppe ancestry is at high frequency in north-central Europeans and low in the south. Isolation-by-distance may have contributed to these patterns to some extent, but the contribution must have been small. In much of Europe, extreme population discontinuity was the norm.

Featured image: from the article, “Major Holocene population movements and expansions that have been demonstrated using ancient DNA.”


mtDNA haplogroup frequency analysis from Verteba Cave supports a strong cultural frontier between farmers and hunter-gatherers in the North Pontic steppe


A new preprint at BioRxiv, led by a Japanese researcher, analyses the mtDNA of Trypillians from Verteba Cave: Analysis of ancient human mitochondrial DNA from Verteba Cave, Ukraine: insights into the origins and expansions of the Late Neolithic-Chalcolithic Cucuteni-Tripolye Culture, by Wakabayashi et al. (2017).


Background: The Eneolithic (~5,500 yrBP) site of Verteba Cave in Western Ukraine contains the largest collection of human skeletal remains associated with the archaeological Cucuteni-Tripolye Culture. Their subsistence economy is based largely on agro-pastoralism and had some of the largest and most dense settlement sites during the Middle Neolithic in all of Europe. To help understand the evolutionary history of the Tripolye people, we performed mtDNA analyses on ancient human remains excavated from several chambers within the cave.

Results: Burials at Verteba Cave are largely commingled and secondary in nature. A total of 68 individual bone specimens were analyzed. Most of these specimens were found in association with well-defined Tripolye artifacts. We determined 28 mtDNA D-Loop (368 bp) sequences and defined 8 sequence types, belonging to haplogroups H, HV, W, K, and T. These results do not suggest continuity with local pre-Eneolithic peoples, but rather complete population replacement. We constructed maximum parsimonious networks from the data and generated population genetic statistics. Nucleotide diversity (π) is low among all sequence types and our network analysis indicates highly similar mtDNA sequence types for samples in chamber G3. Using different sample sizes due to the uncertainty in number of individuals (11, 28, or 15), we found Tajima’s D statistic to vary. When all sequence types are included (11 or 28), we do not find a trend for demographic expansion (negative but not significantly different from zero); however, when only samples from Site 7 (peak occupation) are included, we find a significantly negative value, indicative of demographic expansion.

Conclusions: Our results suggest individuals buried at Verteba Cave had overall low mtDNA diversity, most likely due to increased conflict among sedentary farmers and nomadic pastoralists to the East and North. Early Farmers tend to show demographic expansion. We find different signatures of demographic expansion for the Tripolye people that may be caused by existing population structure or the spatiotemporal nature of ancient data. Regardless, peoples of the Tripolye Culture are more closely related to early European farmers and lack genetic continuity with Mesolithic hunter-gatherers or pre-Eneolithic groups in Ukraine.
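For readers unfamiliar with the two statistics named in the abstract, here is a minimal sketch of how nucleotide diversity (π) and Tajima’s D can be computed from aligned sequences of equal length. It follows the standard textbook formulas (Tajima 1989), not necessarily the software or settings the authors used, and the four toy sequences are invented for illustration; they are not data from Verteba Cave.

```python
from itertools import combinations
from math import sqrt

def segregating_sites(seqs):
    """Number of alignment columns where at least two sequences differ."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences, per site."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs) / len(seqs[0])

def tajimas_d(seqs):
    """Tajima's D; negative values are the pattern expected after expansion."""
    n = len(seqs)                      # sample size (use n >= 4)
    s = segregating_sites(seqs)
    if s == 0:
        return 0.0                     # no variation, D is undefined; return 0
    k = nucleotide_diversity(seqs) * len(seqs[0])  # mean pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Invented toy alignment: one common type plus three singletons (star-like).
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
print(nucleotide_diversity(toy))  # 0.375
print(tajimas_d(toy))             # negative for this star-like sample
```

The sketch also shows why the abstract’s result depends on sample size: both n and the number of segregating sites enter the denominator, so counting 11, 15 or 28 individuals changes D even when the sequence types are the same.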

Genetic finds keep supporting the long-lasting cultural and linguistic frontier that Anthony (2007) – among others – asserted existed in the North-West Pontic steppe in the Mesolithic and Neolithic, between western steppe cultures and farmers, while they disprove Kristiansen’s theories of Sredni Stog expansion in Kurgan waves with a mixture of GAC and Trypillia within the Corded Ware culture:

Previous ancient DNA studies showed that hunter-gatherers before 6,500 yrBP in Europe commonly had haplogroups U, U4, U5, and H, whereas hunter-gatherers after 6,500 yrBP in Europe had less frequency of haplogroup H than before. Haplogroups T and K appeared in hunter-gatherers only after 6,500 yrBP, indicating a degree of admixture in some places between farmers and hunter-gatherers. Farmers before and after 6,500 yrBP in Europe had haplogroups W, HV*, H, T, K, and these are also found in individuals buried at Verteba Cave. Therefore, our data point to a common ancestry with early European farmers. Our data also suggest population replacement. Mathieson et al. analyzed a number of Neolithic Ukrainian samples (petrous bone) from several sites in southern, northern, and western Ukraine, dating to ~8,500 – 6,000 yrBP, and found exclusively U (U4 and U5) mtDNA lineages. It should be noted that ‘Neolithic’ in this context does not mean the adoption of agriculture, but rather simply coinciding with a change in material culture. They also analyzed several Trypillian individuals from Verteba Cave (different samples from those included in this study). Similar to our findings, they found a wider diversity of mtDNA lineages, including H, HV, and T2b. These data, combined with our results, appear to confirm almost complete population replacement by individuals associated with the Tripolye Culture during the Middle to Late Neolithic.

The findings also hint at potential contacts of Yamna with Usatovo as predicted by Anthony (2007), or alternatively (lacking precise dates) at contacts with Corded Ware migrants:

Trypillians were very much a distinct people who most likely displaced local hunter-gatherers with little admixture. Haplogroup W was also observed in several specimens deriving from Site G3. Although we are unsure if all of these haplogroups come from a single or multiple individuals, this observation is interesting in that it is relatively rare and isolated among Neolithic samples. It has, however, been found in samples dating to the Bronze Age. In the study by Wilde et al. [35], they found haplogroup W present in two samples from the Early Bronze Age associated with the Yamnaya and Usatovo cultures. The Usatovo culture (~ 3500 – 2500 BC) was found in Romania, Moldova, and southern Ukraine. It was the conglomeration of Tripolye and North Pontic steppe cultures. Therefore, this individual could link the Trypillian peoples to the Usatovo peoples and perhaps to the greater Yamnaya steppe migrations during the Bronze Age that led to the Corded Ware Culture.

On the other hand, an article written in terms of mtDNA haplogroup frequencies seems to offer too little proof of anything today. The lack of Y-DNA haplogroups and data on admixture makes their interpretations provisional, subject to change when these further data are published. Also, radiocarbon dating is only confident for individuals of one site (site 7), dated ca. 5,500 cal BP, while “other chambers in the cave are not as confidently dated”…

“Based on the 8 sequence types of the mtDNA D-loop, a maximum parsimonious phylogenetic network was constructed. Circles represent the sequence types, and the size of the circle is proportional to the number of samples. Numbers on the branches between the circles are nucleotide position numbers (+16,000) of the human mitochondrial genome sequence (rCRS). Information about the location (chamber within the cave) where the specimen was excavated is also provided. Areas 2 and 17 are part of Site 7, and these are defined as a separate chamber, although they are located in close proximity within Site 7. The other chambers, Site 20, G2, and G3, are independent and separate locations within the cave. ‘Undefined’ chamber describes an unknown location within the cave. Specimens from each chamber showed deviation for the sequence type distribution observed in the sample set. For example, specimens excavated from Site 7 had five unique sequence types, (I, II, III, IV, and VIII), while specimens excavated from chamber G 21 had mainly one sequence type (V)”. Made available by the authors under a CC-BY-NC-ND 4.0 International license.

We had also seen signs of conflict between Trypillian and steppe cultures in a recent article, Violence at Verteba Cave, Ukraine: New Insights into the Late Neolithic Intergroup Conflict, by Madden et al. (2017):

Many researchers have pointed to the huge “megasites” and construction of fortifications as evidence of intergroup hostilities among the Late Neolithic Tripolye archaeological culture. However, to date, very few skeletal remains have been analyzed for the types of traumatic injury that serve as direct evidence for violent conflict. In this study, we examine trauma on human remains from the Tripolye site of Verteba Cave in western Ukraine. The remains of 36 individuals, including 25 crania, were buried in the gypsum cave as secondary interments. The frequency of cranial trauma is 30-44% among the 25 crania, six males, four females and one adult of indeterminate sex displayed cranial trauma. Of the 18 total fractures, 10 were significantly large and penetrating suggesting lethal force. Over half of the trauma is located on the posterior aspect of the crania, suggesting the victims were attacked from behind. Sixteen of the fractures observed were perimortem and two were antemortem. The distribution and characteristics of the fractures suggest that some of the Tripolye individuals buried at Verteba Cave were victims of a lethal surprise attack. Resources were limited due to population growth and migration, leading to conflict over resource access. It is hypothesized that during this time of change burial in this cave aided in development of identity and ownership of the local territory.
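The “30-44%” range in this abstract is easier to follow with the counts written out. The following is only my reading, an assumption not stated in the abstract: the 11 injured individuals (six males, four females, one adult of indeterminate sex) are divided either by all 36 individuals or by the 25 crania.

```python
# Counts as given in the abstract of Madden et al. (2017).
injured = 6 + 4 + 1        # males + females + adult of indeterminate sex
individuals = 36           # all individuals recovered
crania = 25                # crania available for examination

# Assumption: the two ends of the range come from the two denominators.
low = injured / individuals
high = injured / crania
print(f"{low:.1%} to {high:.1%}")  # 30.6% to 44.0%
```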


Before steppe ancestry: Europe’s genetic diversity shaped mainly by local processes, with varied sources and proportions of hunter-gatherer ancestry


The definitive publication of a BioRxiv preprint article, in Nature: Parallel palaeogenomic transects reveal complex genetic history of early European farmers, by Lipson et al. (2017).

The dataset with all new samples is available at the Reich Lab’s website. You can try my drafts on how to do your own PCA and ADMIXTURE analysis with some of their new datasets.


Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.

There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.

The study does, though, keep confirming the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, which offers strong reasons to reject the ‘Indo-European from the west’ hypothesis.

Here are first the PCA of samples included in this paper, and then the PCA of ancient Eurasians (Mathieson et al. 2017) and modern populations (Lazaridis et al. 2014) for comparison of similar clusters:

First two principal components from the PCA. We computed the principal components (PCs) for a set of 782 present-day western Eurasian individuals genotyped on the Affymetrix Human Origins array (background grey points) and then projected ancient individuals onto these axes. A close-up omitting the present-day Bedouin population is shown. From Lipson et al. (2017).
PCA of South-East European and other European samples from Mathieson et al. (2017)
Ancient and modern samples on Lazaridis et al. (2014)
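The first caption describes a two-step procedure: the axes are computed from present-day individuals only, and ancient individuals are then projected onto them. A minimal sketch of that idea follows, in plain Python and for the first axis only. It is not the authors’ pipeline: real analyses normalize genotypes and handle missing data, and the genotype matrices here (allele counts 0/1/2 at three SNPs) are invented for illustration.

```python
from math import sqrt

def column_means(rows):
    """Per-SNP mean genotype in the reference (present-day) panel."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def first_axis(rows, means, iterations=200):
    """Leading principal axis of the centered panel, by power iteration."""
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    dim = len(means)
    v = [1.0 / sqrt(dim)] * dim
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(dim)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, means, axis):
    """Coordinates on the axis, always centering with the panel's means."""
    return [sum((x - m) * a for x, m, a in zip(row, means, axis))
            for row in rows]

# Invented toy data: two present-day clusters and two ancient samples.
modern = [[0, 0, 2], [0, 1, 2], [2, 2, 0], [2, 1, 0]]
ancient = [[2, 2, 0], [1, 1, 1]]

means = column_means(modern)
axis = first_axis(modern, means)           # defined by modern samples only
modern_pc1 = project(modern, means, axis)
ancient_pc1 = project(ancient, means, axis)
print(modern_pc1)
print(ancient_pc1)
```

Because the ancient samples never enter `first_axis`, they cannot shape the axes; they only receive coordinates on them, which is why such plots show ancient individuals against a fixed present-day background.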


Analysis of R1b-DF27 haplogroups in modern populations adds new information that contrasts with ‘steppe admixture’ results


New open access article published in Scientific Reports, Analysis of the R1b-DF27 haplogroup shows that a large fraction of Iberian Y-chromosome lineages originated recently in situ, by Solé-Morata et al. (2017).


Haplogroup R1b-M269 comprises most Western European Y chromosomes; of its main branches, R1b-DF27 is by far the least known, and it appears to be highly prevalent only in Iberia. We have genotyped 1072 R1b-DF27 chromosomes for six additional SNPs and 17 Y-STRs in population samples from Spain, Portugal and France in order to further characterize this lineage and, in particular, to ascertain the time and place where it originated, as well as its subsequent dynamics. We found that R1b-DF27 is present in frequencies ~40% in Iberian populations and up to 70% in Basques, but it drops quickly to 6–20% in France. Overall, the age of R1b-DF27 is estimated at ~4,200 years ago, at the transition between the Neolithic and the Bronze Age, when the Y chromosome landscape of W Europe was thoroughly remodeled. In spite of its high frequency in Basques, Y-STR internal diversity of R1b-DF27 is lower there, and results in more recent age estimates; NE Iberia is the most likely place of origin of DF27. Subhaplogroup frequencies within R1b-DF27 are geographically structured, and show domains that are reminiscent of the pre-Roman Celtic/Iberian division, or of the medieval Christian kingdoms.

Some people like to say that Y-DNA haplogroup analysis, or phylogeography in general, is of no use anymore (especially modern phylogeography), and they are content to see how ‘steppe admixture’ was (or even is) distributed in Europe to draw conclusions about ancient languages and their expansion. With each new paper, we are seeing the advantages of analysing ancient and modern haplogroups in ascertaining population movements.

Quite recently there was a suggestion based on steppe admixture that Basque-speaking Iberians resisted the invasion from the steppe. Observing the results of this article (dates of expansion and demographic data) we see a clear expansion of Y-DNA haplogroups precisely by the time of Bell Beaker expansion from the east. Y-DNA haplogroups of ancient samples from Portugal point exactly to the same conclusion.

The situation of R1b-DF27 in Basques, as I have pointed out elsewhere, is probably then similar to the genetic drift of Finns, mainly of N1c lineages, speaking today a Uralic language that expanded with Corded Ware and R1a subclades.

The recent article on Mycenaean and Minoan genetics also showed that, when it comes to Europe, most of the demographic patterns we see in admixture are reminiscent of the previous situation; only rarely can we see a clear change in admixture (which would mean an important, sudden replacement of the previous population).

Equating the so-called steppe admixture with Indo-European languages is wrong. Period.

The following are excerpts from the article (emphasis is mine):

Dates and expansions

The average STR variance of DF27 and each subhaplogroup is presented in Suppl. Table 2. As expected, internal diversity was higher in the deeper, older branches of the phylogeny. If the same diversity was divided by population, the most salient finding is that native Basques (Table 2) have a lower diversity than other populations, which contrasts with the fact that DF27 is notably more frequent in Basques than elsewhere in Iberia (Suppl. Table 1). Diversity can also be measured as pairwise differences distributions (Fig. 5). The distribution of mean pairwise differences within Z195 sits practically on top of that of DF27; L176.2 and Z220 have similar distributions, as M167 and Z278 have as well; finally, M153 shows the lowest pairwise distribution values. This pattern is likely to reflect the respective ages of the haplogroups, which we have estimated by a modified, weighted version of the ρ statistic (see Methods).

Z195 seems to have appeared almost simultaneously within DF27, since its estimated age is actually older (4570 ± 140 ya). Of the two branches stemming from Z195, L176.2 seems to be slightly younger than Z220 (2960 ± 230 ya vs. 3320 ± 200 ya), although the confidence intervals slightly overlap. M167 is clearly younger, at 2600 ± 250 ya, a similar age to that of Z278 (2740 ± 270 ya). Finally, M153 is estimated to have appeared just 1930 ± 470 ya.

Haplogroup ages can also be estimated within each population, although they should be interpreted with caution (see Discussion). For the whole of DF27, (Table 3), the highest estimate was in Aragon (4530 ± 700 ya), and the lowest in France (3430 ± 520 ya); it was 3930 ± 310 ya in Basques. Z195 was apparently oldest in Catalonia (4580 ± 240 ya), and with France (3450 ± 269 ya) and the Basques (3260 ± 198 ya) having lower estimates. On the contrary, in the Z220 branch, the oldest estimates appear in North-Central Spain (3720 ± 313 ya for Z220, 3420 ± 349 ya for Z278). The Basques always produce lower estimates, even for M153, which is almost absent elsewhere.
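To make these age estimates less of a black box, here is a minimal sketch of the plain ρ statistic on Y-STR haplotypes: the mean number of repeat-unit steps separating each chromosome from the inferred ancestral haplotype, divided by the mutation rate. The paper uses a modified, weighted version that I do not reproduce; the haplotypes, the per-locus mutation rate and the generation time below are all placeholders chosen for illustration.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common repeat count at each locus, used as the assumed root."""
    return tuple(Counter(locus).most_common(1)[0][0]
                 for locus in zip(*haplotypes))

def rho(haplotypes, root):
    """Mean number of single-repeat steps between each haplotype and the root."""
    steps = sum(sum(abs(a - r) for a, r in zip(h, root)) for h in haplotypes)
    return steps / len(haplotypes)

def age_in_years(haplotypes, mu_per_locus_per_generation, years_per_generation):
    """Unweighted rho dating: rho / (mutation rate per haplotype) in years."""
    root = modal_haplotype(haplotypes)
    mu_haplotype = mu_per_locus_per_generation * len(root)
    return rho(haplotypes, root) / mu_haplotype * years_per_generation

# Invented repeat counts at three Y-STR loci for five chromosomes.
toy = [(14, 12, 10), (14, 12, 10), (15, 12, 10), (14, 11, 10), (14, 12, 12)]
root = modal_haplotype(toy)
print(root)            # (14, 12, 10)
print(rho(toy, root))  # 0.8
# Placeholder rate (0.002 per locus per generation) and 30-year generations.
print(round(age_in_years(toy, 0.002, 30)))  # 4000
```

The sketch makes the Basque result intuitive: a population where most chromosomes sit at or near the modal haplotype yields a low ρ, and therefore a young age estimate, no matter how frequent the haplogroup is there.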

Simplified phylogenetic tree of the R1b-M269 haplogroup. SNPs in italics were not analyzed in this manuscript.


The median value for Tstart has been estimated at 103 generations (Table 4), with a 95% highest probability density (HPD) range of 50–287 generations; effective population size increased from 131 (95% HPD: 100–370) to 72,811 (95% HPD: 52,522–95,334). Considering patrilineal generation times of 30–35 years, our results indicate that R1b-DF27 started its expansion ~3,000–3,500 ya, shortly after its TMRCA.
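The conversion from generations to years is simple, but writing it out shows how wide the interval really is. This sketch uses only the figures quoted above (median 103 generations, 95% HPD 50–287, generation time 30–35 years); the function name is mine.

```python
def generations_to_years(generations, short_generation=30, long_generation=35):
    """Years ago for a generation count, under the two generation times."""
    return generations * short_generation, generations * long_generation

median_years = generations_to_years(103)
hpd_years = (generations_to_years(50)[0], generations_to_years(287)[1])

print(median_years)  # (3090, 3605)
print(hpd_years)     # (1500, 10045)
```

The median lands close to the paper’s rounded “~3,000–3,500 ya”, while the HPD bounds span from about 1,500 to about 10,000 years ago, which is worth keeping in mind when reading the discussion of expansion times.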

As a reference, we applied the same analysis to the whole of R1b-S116, as well as to other common haplogroups such as G2a, I2, and J2a. Interestingly, all four haplogroups showed clear evidence of an expansion (p > 0.99 in all cases), all of them starting at the same time, ~50 generations ago (Table 4), and with similar estimated initial and final populations. Thus, these four haplogroups point to a common population expansion, even though I2 (TMRCA, weighted ρ, 7,800 ya) and J2a (TMRCA, 5,500 ya) are older than R1b-DF27. It is worth noting that the expansion of these haplogroups happened after the TMRCA of R1b-DF27.

Principal component analysis of STR haplotypes. (a) Colored by subhaplogroup, (b) colored by population. Larger squares represent subhaplogroup or population centroids.

Sum up and discussion

We have characterized the geographical distribution and phylogenetic structure of haplogroup R1b-DF27 in W. Europe, particularly in Iberia, where it reaches its highest frequencies (40–70%). The age of this haplogroup appears clear: with independent samples (our samples vs. the 1000 genome project dataset) and independent methods (variation in 15 STRs vs. whole Y-chromosome sequences), the age of R1b-DF27 is firmly grounded around 4000–4500 ya, which coincides with the population upheaval in W. Europe at the transition between the Neolithic and the Bronze Age. Before this period, R1b-M269 was rare in the ancient DNA record, and during it the current frequencies were rapidly reached. It is also one of the haplogroups (along with its daughter clades, R1b-U106 and R1b-S116) with a sequence structure that shows signs of a population explosion or burst. STR diversity in our dataset is much more compatible with population growth than with stationarity, as shown by the ABC results, but, contrary to other haplogroups such as the whole of R1b-S116, G2a, I2 or J2a, the start of this growth is closer to the TMRCA of the haplogroup. Although the median time for the start of the expansion is older in R1b-DF27 than in other haplogroups, and could suggest the action of a different demographic process, all HPD intervals broadly overlap, and thus, a common demographic history may have affected the whole of the Y chromosome diversity in Iberia. The HPD intervals encompass a broad timeframe, and could reflect the post-Neolithic population expansions from the Bronze Age to the Roman Empire.

While when R1b-DF27 appeared seems clear, where it originated may be more difficult to pinpoint. If we extrapolated directly from haplogroup frequencies, then R1b-DF27 would have originated in the Basque Country; however, for R1b-DF27 and most of its subhaplogroups, internal diversity measures and age estimates are lower in Basques than in any other population. Then, the high frequencies of R1b-DF27 among Basques could be better explained by drift rather than by a local origin (except for the case of M153; see below), which could also have decreased the internal diversity of R1b-DF27 among Basques. An origin of R1b-DF27 outside the Iberian Peninsula could also be contemplated, and could mirror the external origin of R1b-M269, even if it reaches there its highest frequencies. However, the search for an external origin would be limited to France and Great Britain; R1b-DF27 seems to be rare or absent elsewhere: Y-STR data are available only for France, and point to a lower diversity and more recent ages than in Iberia (Table 3). Unlike in Basques, drift in a traditionally closed population seems an unlikely explanation for this pattern, and therefore, it does not seem probable that R1b-DF27 originated in France. Then, a local origin in Iberia seems the most plausible hypothesis. Within Iberia, Aragon shows the highest diversity and age estimates for R1b-DF27, Z195, and the L176.2 branch, although, given the small sample size, any conclusion should be taken cautiously. On the contrary, Z220 and Z278 are estimated to be older in North Central Spain (N Castile, Cantabria and Asturias). Finally, M153 is almost restricted to the Basque Country: it is rarely present at frequencies >1% elsewhere in Spain (although see the cases of Alacant, Andalusia and Madrid, Suppl. Table 1), and it was found at higher frequencies (10–17%) in several Basque regions; a local origin seems plausible, but, given the scarcity of M153 chromosomes outside of the Basque Country, the diversity and age values cannot be compared.

Within its range, R1b-DF27 shows some geographical differentiation: Western Iberia (particularly, Asturias and Portugal), with low frequencies of R1b-Z195 derived chromosomes and relatively high values of R1b-DF27* (xZ195); North Central Spain is characterized by relatively high frequencies of the Z220 branch compared to the L176.2 branch; the latter is more abundant in Eastern Iberia. Taken together, these observations seem to match the East-West patterning that has occurred at least twice in the history of Iberia: i) in pre-Roman times, with Celtic-speaking peoples occupying the center and west of the Iberian Peninsula, while the non-Indoeuropean eponymous Iberians settled the Mediterranean coast and hinterland; and ii) in the Middle Ages, when Christian kingdoms in the North expanded gradually southwards and occupied territories held by Muslim fiefs.

Contour maps of the derived allele frequencies of the SNPs analyzed in this manuscript. Population abbreviations as in Table 1. Maps were drawn with SURFER v. 12 (Golden Software, Golden CO, USA).

I wouldn’t trust the absence of R1b-DF27 outside France as proof that its origin must be in Western Europe – especially since we have ancient DNA, and that assertion might prove quite wrong – but aside from that the article seems solid in its analysis of modern populations.


Text and figures from the article, licensed under a Creative Commons Attribution 4.0 International License.

How to do modern phylogeography: Relationships between clans and genetic kin explain cultural similarities over vast distances


A preprint paper has been published in BioRxiv, Relationships between clans and genetic kin explain cultural similarities over vast distances: the case of Yakutia, by Zvenigorosky et al. (2017).


Archaeological studies sample ancient human populations one site at a time, often limited to a fraction of the regions and periods occupied by a given group. While this bias is known and discussed in the literature, few model populations span areas as large and unforgiving as the Yakuts of Eastern Siberia. We systematically surveyed 31,000 square kilometres in the Sakha Republic (Yakutia) and completed the archaeological study of 174 frozen graves, assembled between the 15th and the 19th century. We analysed genetic data (autosomal genotypes, Y-chromosome haplotypes and mitochondrial haplotypes) for all ancient subjects and confronted it to the study of 190 modern subjects from the same area and the same population. Ancient familial links and paternal clan were identified between graves up to 1500 km apart and we provide new data concerning the origins of the contemporary Yakut population and demonstrate that cultural similarities in the past were linked to (i) the expansion of specific paternal clans, (ii) preferential marriage among the elites and (iii) funeral choices that could constitute a bias in any ancient population study.

Even if you are not interested in the cultural and anthropological evolution of this Turkic-speaking people of the Russian Far Eastern region, the method used is an excellent example of how to use archaeology and genetics (especially Y-DNA and mtDNA data) to obtain meaningful results when investigating ancient populations.

For quite some time, probably since the first renowned admixture analyses of ancient DNA samples were published, we have been living under the impression that phylogeography, or simply archaeogenetics as it was called back in the day, is not needed.

Cavalli-Sforza’s assertion that the study of modern populations could offer a clear picture of past population movements is now considered wrong, and the study of Y-DNA and mtDNA haplogroups is today mostly disregarded as of secondary importance, even among geneticists. Whole-genome investigation (and especially admixture analysis) has been leading the new wave of overconfidence in genetic results, closely tied to ignorance of its shortcomings (and to commercial interests based on desires of ethnic identification), and haplogroups are usually just reported alongside other, not entirely meaningful aspects of ancient DNA analyses.

While it is undeniable that admixture analyses are offering quite interesting results, they must be carefully balanced against known archaeological and linguistic knowledge. Phylogeography – and especially Y-DNA haplogroup assessment – is quite interesting in investigating kinship and clans in patrilocal communities – i.e. most communities in prehistoric and historic periods, unless proven otherwise.

Luckily enough, there are researchers who still strive to obtain meaningful information from haplotypes. The article referenced in this post is valuable because its phylogeographic method can be applied to other ancient cultures and peoples.

When some geneticists look at simplistic prehistoric maps, like those depicting Yamna, Afanasevo, Corded Ware, and Bell Beaker cultures together, they forget that 1) cultural regions are selected more or less arbitrarily (we only have certain scattered sites for each of these cultures); 2) economic or population contacts are difficult to ascertain and to represent graphically; and 3) time periods for archaeological sites are important – in fact, they are probably THE most important aspect in assessing how accurately a map (and its “arrows” of migration or exchange) represents reality.

A careful, detailed study like this one, if applied to the Pontic-Caspian steppe, would probably reveal how R1b subclades dominated steppe clans, beginning at least during the Suvorovo-Novodanilovka expansion to the west, and certainly representing the vast majority of lineages during the internal expansion in the Early Yamna period and its later expansion east and west of the steppe…

Featured image from the article, summing up Geography, Archaeology, and Genetics of Yakutia – including Y-DNA and mtDNA haplogroups from ancient populations.


Another hint at the role of Corded Ware peoples in spreading Uralic languages into north-eastern Europe, found in mtDNA analysis of the Finnish population


Open article at Scientific Reports (Nature): Identification and analysis of mtDNA genomes attributed to Finns reveal long-stagnant demographic trends obscured in the total diversity, by Översti et al. (2017).

Of special interest is its depiction of Finland’s past as including the expansion of a Corded Ware population carrying mtDNA U5b1b2 (and probably Y-DNA R1a-M417 subclades), most likely Uralic speakers of the Forest Zone, to the north of the Yamna culture (where Late Proto-Indo-European was spoken).

A later expansion of other subclades, particularly Y-DNA N1c, was probably associated with the later western expansion of the Eurasian Seima-Turbino phenomenon, and its current prevalence among Finnish Y-DNA haplogroups might have been the consequence of the population decline ca. 1500 BC and the later Iron Age population bottleneck (with the population peak ca. 500 AD) described in the article.

That would more naturally explain the ‘cultural diffusion’ of Finnic languages into invading eastern N1c lineages, a diffusion which would in fact have been a long-term, quite gradual replacement of the previously prevalent Y-DNA R1a subclades in the region, as supported by the prevalent “steppe” component in the genome-wide ancestry of Finns.

Therefore, there were probably no sudden, strong population (and thus cultural) changes associated with the arrival of N1c lineages, like the ones seen with R1a (Corded Ware / Uralic) and R1b (Yamna / Proto-Indo-European) expansions in Europe.

How the Saami fit into this scheme is not yet obvious, though.


In Europe, modern mitochondrial diversity is relatively homogeneous and suggests an ubiquitous rapid population growth since the Neolithic revolution. Similar patterns also have been observed in mitochondrial control region data in Finland, which contrasts with the distinctive autosomal and Y-chromosomal diversity among Finns. A different picture emerges from the 843 whole mitochondrial genomes from modern Finns analyzed here. Up to one third of the subhaplogroups can be considered as Finn-characteristic, i.e. rather common in Finland but virtually absent or rare elsewhere in Europe. Bayesian phylogenetic analyses suggest that most of these attributed Finnish lineages date back to around 3,000–5,000 years, coinciding with the arrival of Corded Ware culture and agriculture into Finland. Bayesian estimation of past effective population sizes reveals two differing demographic histories: 1) the ‘local’ Finnish mtDNA haplotypes yielding small and dwindling size estimates for most of the past; and 2) the ‘immigrant’ haplotypes showing growth typical of most European populations. The results based on the local diversity are more in line with that known about Finns from other studies, e.g., Y-chromosome analyses and archaeology findings. The mitochondrial gene pool thus may contain signals of local population history that cannot be readily deduced from the total diversity.

From its results:

In general, there appears to be two loose and largely overlapping clusters among the Finn-characteristic haplogroups: the first between 1,000–2,000 ybp and the second around 3,300–5,500 ybp. The age of the older cluster coincides temporally with the arrival of the Corded-Ware culture and, notably, the spread of agriculture in Finland. The arrival and spread of agriculture, temporally corresponding with the age estimates for most of the haplogroups characteristic of Finns, might be a sign of population size increase enabled by the new mode of subsistence, resulting in reduced drift and accumulation of genetic diversity in the population.


Another insight in the past population sizes in Finland is based on radiocarbon-dated archaeological findings in different time periods. These analyses suggest two prehistoric population peaks in Finland, the Stone Age peak (c. 5,500 ybp) and the Metal Age peak (~1,500 ybp). Both of these peaks were followed by a population decline, which appears to have reached its ebb around 3,500 ybp. These developments are not distinguishable in the BSPs. However, these ages correspond well to the two haplogroup age clusters described above. The presumably less severe Iron Age population bottleneck seen in the archaeological data, 1,500–1,300 ybp, temporally coincides with the population size reduction visible for the Finn-characteristic subhaplogroups.
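The dates in these excerpts are given in years before present (ybp), while the commentary above uses BC/AD dates. As a rough sanity check, the short sketch below converts the quoted figures and tests whether each archaeological event falls inside one of the two haplogroup age clusters. It assumes that "present" means 1950 (the radiocarbon convention, which the excerpt does not state), and the names `ybp_to_calendar` and `cluster_for` are purely illustrative.

```python
# Illustrative sketch only; all figures are copied from the quoted passages.
PRESENT = 1950  # assumption: radiocarbon convention for "present"

def ybp_to_calendar(ybp, present=PRESENT):
    """Convert years before present to a BC/AD label (no year zero)."""
    year = present - ybp  # astronomical year numbering, where 0 means 1 BC
    return f"{year} AD" if year > 0 else f"{1 - year} BC"

# Age clusters of the Finn-characteristic haplogroups, in ybp
clusters = {"younger": (1000, 2000), "older": (3300, 5500)}

# Archaeological population events, in ybp
events = {"Stone Age peak": 5500, "population ebb": 3500, "Metal Age peak": 1500}

def cluster_for(ybp):
    """Return the name of the cluster containing this date, or None."""
    for name, (low, high) in clusters.items():
        if low <= ybp <= high:
            return name
    return None

for label, ybp in events.items():
    print(f"{label}: {ybp} ybp = {ybp_to_calendar(ybp)}, cluster: {cluster_for(ybp)}")
```

Under these assumptions, 3,500 ybp comes out as 1551 BC and 1,500 ybp as 450 AD, close to the ca. 1500 BC and ca. 500 AD used in the commentary. Both peaks fall inside a cluster, but so does the ebb at 3,500 ybp, because the older cluster is so wide; simple containment is therefore only a coarse check.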


Discovered via Eurogenes.