Mitochondrial DNA unsuitable to test for IBD, and undersampling genomes show biased time and rate estimates

Two interesting papers questioning previous methods have been published.

Open access Mitochondrial DNA is unsuitable to test for isolation by distance, by Teske et al. Scientific Reports (2018) 8:8448.

Abstract (emphasis mine):

Tests for isolation by distance (IBD) are the most commonly used method of assessing spatial genetic structure. Many studies have exclusively used mitochondrial DNA (mtDNA) sequences to test for IBD, but this marker is often in conflict with multilocus markers. Here, we report a review of the literature on IBD, with the aims of determining (a) whether significant IBD is primarily a result of lumping spatially discrete populations, and (b) whether microsatellite datasets are more likely to detect IBD when mtDNA does not. We also provide empirical data from four species in which mtDNA failed to detect IBD by comparing these with microsatellite and SNP data. Our results confirm that IBD is mostly found when distinct regional populations are pooled, and this trend disappears when each is analysed separately. Discrepancies between markers were found in almost half of the studies reviewed, and microsatellites were more likely to detect IBD when mtDNA did not. Our empirical data rejected the lack of IBD in the four species studied, and support for IBD was particularly strong for the SNP data. We conclude that mtDNA sequence data are often not suitable to test for IBD, and can be misleading about species’ true dispersal potential. The observed failure of mtDNA to reliably detect IBD, in addition to being a single-locus marker, is likely a result of a selection-driven reduction in genetic diversity obscuring spatial genetic differentiation.

Plots of geographic distances vs. F-statistics for the following species (plots on the left show mtDNA data, those on the right SNP or microsatellite data): (a) Sardinops sagax; (b) Psammogobius knysnaensis; (c) Nerita atramentosa; (d) Siphonaria diemenensis. The density of data points is indicated by colours.
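
For readers unfamiliar with how IBD is usually tested: a common approach is a Mantel test, which correlates pairwise geographic distances with pairwise genetic distances (e.g. FST) and judges significance by permuting the site labels. The sketch below is only an illustration with made-up distances and hypothetical function names; it is not the procedure used by Teske et al., whose exact methods are not described here.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with sites relabelled by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel_test(geo, gen, permutations=999, seed=0):
    """Return (r, one-sided p) for the correlation between two distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(geo)))
    geo_vec = upper_triangle(geo, identity)
    observed = pearson(geo_vec, upper_triangle(gen, identity))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # shuffle site labels of the genetic matrix only
        if pearson(geo_vec, upper_triangle(gen, perm)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Toy data (assumption): six sites along a coast, positions in km.
positions = [0, 120, 250, 400, 560, 700]
geo = [[abs(a - b) for b in positions] for a in positions]
# Invented genetic distances that grow with geographic distance.
gen = [[0.002 * math.sqrt(d) for d in row] for row in geo]

r, p = mantel_test(geo, gen)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Because the permutation shuffles whole sites rather than individual pairs, the non-independence of pairwise distances is respected, which is the main reason a Mantel test is used instead of an ordinary regression.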

Behind paywall, Undersampling Genomes has Biased Time and Rate Estimates Throughout the Tree of Life, by Julie Marin and S. Blair Hedges, Mol Biol Evol (2018), msy103.

Abstract (emphasis mine):

Genomic data drive evolutionary research on the relationships and timescale of life but the genomes of most species remain poorly sampled. Phylogenetic trees can be reconstructed reliably using small data sets and the same has been assumed for the estimation of divergence time with molecular clocks. However, we show here that undersampling of molecular data results in a bias expressed as disproportionately shorter branch lengths and underestimated divergence times in the youngest nodes and branches, termed the small sample artifact. In turn, this leads to increasing speciation and diversification rates towards the present. Any evolutionary analyses derived from these biased branch lengths and speciation rates will be similarly biased. The widely used timetrees of the major species-rich studies of amphibians, birds, mammals, and squamate reptiles are all data-poor and show upswings in diversification rate, suggesting that their results were biased by undersampling. Our results show that greater sampling of genomes is needed for accurate time and rate estimation, which are basic data used in ecological and evolutionary research.

Potential biases on speciation rate estimation. The black line represents constant speciation rate as expected if there are no artifacts or other factors affecting the rate. The small sample artifact (an insufficient number of variable sites) may impact all of the tree and diversification plot, resulting in a rate increase towards the present. The taxonomic artifact (incomplete sampling of taxa or lineages) also may impact all of the tree and diversification plot and results in a speciation rate decrease towards the present. The sparse nodes artifact (stochastic effect of a limited number of nodes) may impact the beginning of the diversification plot, causing decreases or increases in rate.

Do you remember the male-biased expansion from the Pontic-Caspian steppe, that was later contested in its methods by Lazaridis and Reich, but that is today again accepted by Reich and Lazaridis (probably for different reasons, namely Y-DNA evidence)?

Every time I read these kinds of studies rejecting previous methods – which get written and published only because there is a future interest in them, not because they are (or may cause) retractions of previous results and interpretations – I remember these people inventing migration models based on genomic studies and saying “genetics is a science, linguistics/archaeology/anthropology is not”…

NOTE. Even if papers eventually receive a correction, journalists and blogs will keep echoing whatever gets published (see the famous Dennis/Denise will become dentists); there is no end to that. Believe it or not, we still see Underhill et al. (2014) being cited against the most recent papers, and even against the author’s own rejection of his paper’s results.

Especially right now, it must cause some kind of dissociated reasoning among those naysayers, when they need to resort to anthropological disciplines to discuss the latest interpretations of a potential Caucasus origin or North Iranian homeland of Proto-Indo-European…

EDIT (5 JUN 2018): Also, check out the recent review From genome-wide associations to candidate causal variants by statistical fine-mapping, by Schaid, Chen, and Larson.


FADS1 and the timing of human adaptation to agriculture


Open access FADS1 and the timing of human adaptation to agriculture, by Sara Mathieson & Iain Mathieson, bioRxiv (2018).


Variation at the FADS1/FADS2 gene cluster is functionally associated with differences in lipid metabolism and is often hypothesized to reflect adaptation to an agricultural diet. Here, we test the evidence for this relationship using both modern and ancient DNA data. We document pre-out-of-Africa selection for both the derived and ancestral FADS1 alleles and show that almost all the inhabitants of Europe carried the ancestral allele until the derived allele was introduced approximately 8,500 years ago by Early Neolithic farming populations. However, we also show that it was not under strong selection in these populations. Further, we find that this allele, and other proposed agricultural adaptations including variants at LCT/MCM6, SLC22A4 and NAT2, were not strongly selected until the Bronze Age, 2,000-4,000 years ago. Similarly, increased copy number variation at the salivary amylase gene AMY1 is not linked to the development of agriculture although in this case, the putative adaptation precedes the agricultural transition. Our analysis shows that selection at the FADS locus was not tightly linked to the development of agriculture. Further, it suggests that the strongest signals of recent human adaptation may not have been driven by the agricultural transition but by more recent changes in environment or by increased efficiency of selection due to increases in effective population size.

Interesting excerpt for the steppe-related expansion:

Allele frequency trajectories for other putative agricultural adaptation variants. As in Figure 2C, estimated allele frequency trajectories and selection coefficients in different ancient European populations. Significant selection coefficients are labelled.

In the case of FADS1 and all the other examples we investigated, the proposed agricultural adaption was either not temporally linked with agriculture or showed no evidence of selection in agricultural populations. Instead, most of the variants with any evidence of selection were only strongly selected at some point between the Bronze Age and the present day, that is, in a period starting 2000-4000 BP and continuing until the present. This time period is one in which there is relatively limited ancient DNA data, and so we are unable to determine the timing of selection any more accurately. Future research should address the question of why this recent time period saw the most rapid changes in apparently diet associated genes. One plausible hypothesis is that the change in environment at this time was actually more dramatic than the earlier change associated with agriculture. Another is that effective population sizes were so small before this time that selection did not operate efficiently on variants with small selection coefficients. For example, analysis of present-day genomes from the United Kingdom suggests that effective population size increased by a factor of 100-1000 in the past 4500 years (Browning and Browning 2015). Ancient effective population sizes less than 10^4 would suggest that those populations would not be able to efficiently select for variants with selection coefficients on the order of 10^-4 or smaller. Larger ancient DNA datasets from the past 4,000 years will likely resolve this question.
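
The population-size argument in that excerpt rests on a standard rule of thumb: selection only outweighs drift when the population-scaled coefficient 2·Ne·s is well above 1. The sketch below is my own illustration, not the authors’ analysis; it uses Kimura’s diffusion approximation for the fixation probability of a new mutation, and assumes a diploid population whose census size equals Ne (the function names are hypothetical).

```python
import math

def scaled_selection(effective_size, s):
    """Population-scaled selection coefficient 2*Ne*s (diploid convention)."""
    return 2.0 * effective_size * s

def fixation_ratio(effective_size, s):
    """Fixation probability of a new mutation relative to a neutral one.

    Kimura's diffusion approximation, assuming census size == Ne, an
    initial frequency of 1/(2*Ne), and a non-zero s.
    """
    # (1 - exp(-2s)) / (1 - exp(-4*Ne*s)), written with expm1 for precision
    selected = math.expm1(-2.0 * s) / math.expm1(-4.0 * effective_size * s)
    neutral = 1.0 / (2.0 * effective_size)
    return selected / neutral

# s = 1e-4 is the order of magnitude mentioned in the excerpt.
for ne in (1e3, 1e4, 1e5):
    print(f"Ne={ne:.0e}  2*Ne*s={scaled_selection(ne, 1e-4):.1f}  "
          f"fixation vs neutral={fixation_ratio(ne, 1e-4):.2f}")
```

With s = 10^-4, 2·Ne·s is 0.2 for Ne = 10^3 but 20 for Ne = 10^5, so the same variant behaves almost neutrally in the small population and is clearly favoured in the large one.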

This complexity of the reasons for selection reminded me of the comment by Narasimhan on lactase persistence expanding with steppe populations into Central Asia (based on data of the paper where he is the first author):

I always thought that to argue for natural selection in humans (viz. skin color, lactase persistence, etc.) was possible for archaic groups over tens of thousands of years, but that more recent selections would be very difficult to prove, in so far as historical population expansions involve more ‘artificial’ (i.e. man-made or man-caused) societal changes.

NOTE. I am probably more inclined to think about regional outbreaks (especially of new diseases) as one of the few potential short-term selection mechanisms in historical societies, because of their potential to create sudden bottlenecks of better fitted survivors.

I think recent works like these are showing a mixed situation, where maybe some traits were strongly selected for environmental reasons; but most of the time they were probably – like, say, Y-DNA haplogroup bottlenecks in Europe after the steppe-related expansions – due mostly to chance.

The uneasy relationship between Archaeology and Ancient Genomics


News feature Divided by DNA: The uneasy relationship between archaeology and ancient genomics, Two fields in the midst of a technological revolution are struggling to reconcile their views of the past, by Ewen Callaway, Nature (2018) 555:573-576.

Interesting excerpts (emphasis mine):

In duelling 2015 Nature papers [6,7] the teams arrived at broadly similar conclusions: an influx of herders from the grassland steppes of present-day Russia and Ukraine – linked to Yamnaya cultural artefacts and practices such as pit burial mounds – had replaced much of the gene pool of central and Western Europe around 4,500–5,000 years ago. This was coincident with the disappearance of Neolithic pottery, burial styles and other cultural expressions and the emergence of Corded Ware cultural artefacts, which are distributed throughout northern and central Europe. “These results were a shock to the archaeological community,” Kristiansen says.


Still, not everyone was satisfied. In an essay [8] titled ‘Kossinna’s Smile’, archaeologist Volker Heyd at the University of Bristol, UK, disagreed, not with the conclusion that people moved west from the steppe, but with how their genetic signatures were conflated with complex cultural expressions. Corded Ware and Yamnaya burials are more different than they are similar, and there is evidence of cultural exchange, at least, between the Russian steppe and regions west that predate Yamnaya culture, he says. None of these facts negates the conclusions of the genetics papers, but they underscore the insufficiency of the articles in addressing the questions that archaeologists are interested in, he argued. “While I have no doubt they are basically right, it is the complexity of the past that is not reflected,” Heyd wrote, before issuing a call to arms. “Instead of letting geneticists determine the agenda and set the message, we should teach them about complexity in past human actions.”

Many archaeologists are also trying to understand and engage with the inconvenient findings from genetics. (…)
[Carlin:] “I would characterize a lot of these papers as ‘map and describe’. They’re looking at the movement of genetic signatures, but in terms of how or why that’s happening, those things aren’t being explored,” says Carlin, who is no longer disturbed by the disconnect. “I am increasingly reconciling myself to the view that archaeology and ancient DNA are telling different stories.” The changes in cultural and social practices that he studies might coincide with the population shifts that Reich and his team are uncovering, but they don’t necessarily have to. And such biological insights will never fully explain the human experiences captured in the archaeological record.

Reich agrees that his field is in a “map-making phase”, and that genetics is only sketching out the rough contours of the past. Sweeping conclusions, such as those put forth in the 2015 steppe migration papers, will give way to regionally focused studies with more subtlety.

This is already starting to happen. Although the Bell Beaker study found a profound shift in the genetic make-up of Britain, it rejected the notion that the cultural phenomenon was associated with a single population. In Iberia, individuals buried with Bell Beaker goods were closely related to earlier local populations and shared little ancestry with Beaker-associated individuals from northern Europe (who were related to steppe groups such as the Yamnaya). The pots did the moving, not the people.

This final paragraph apparently sums up a view that Reich has of this field, since he repeats it:

Reich concedes that his field hasn’t always handled the past with the nuance or accuracy that archaeologists and historians would like. But he hopes they will eventually be swayed by the insights his field can bring. “We’re barbarians coming late to the study of the human past,” Reich says. “But it’s dangerous to ignore barbarians.”

I would say that true barbarians had neither the habit nor the opportunity to learn from the higher civilizations they attacked or invaded. Geneticists, on the other hand, only have to do what they expect archaeologists to do: study.

EDIT (30 MAR 2018): An interesting new Nature editorial, On the use and abuse of ancient DNA.


David Reich on the influence of ancient DNA on Archaeology and Linguistics

An interesting interview has appeared in The Atlantic, Ancient DNA Is Rewriting Human (and Neanderthal) History, on the occasion of the publication of David Reich’s book Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past.

Some interesting excerpts (I have emphasized some of Reich’s words):

On the efficiency of the Reich Lab

Zhang: How much does it cost to process an ancient DNA sample right now?

Reich: In our hands, a successful sample costs less than $200. That’s only two or three times more than processing them on a present-day person. And maybe about one-third to one half of the samples we screen are successful at this point.

This is probably the most controversial assessment for the Twitterverse, since it puts the Reich Lab at the top of the publishing chain, but I don’t find this fact controversial at all.

Anyone interested in doing genetic studies has free datasets, papers, and bioinformatic tools at hand – thanks to his lab, mostly – to develop new methods and publish papers. Such secondary works probably won’t be published in journals with the highest impact factor, but what can you do, welcome to the scientific world…

Also, by the looks of it, every single researcher involved in recovering an archaeological sample is included as co-author of the papers, so there is a clear benefit for ‘local’ researchers collaborating with the Lab. Therefore, these researchers and their institutions are responsible for whatever unfair situation might be created by their exchange.

On Archaeology’s reaction to Kossinna and Nazi ideas

Zhang: You actually had German collaborators drop out of a study because of these exact concerns, right? One of them wrote, “We must(!) avoid … being compared with the so-called ‘siedlungsarchäologie Method’ from Gustaf Kossinna!”

Reich: Yeah, that’s right. I think one of the things the ancient DNA is showing is actually the Corded Ware culture does correspond coherently to a group of people. I think that was a very sensitive issue to some of our coauthors, and one of the coauthors resigned because he felt we were returning to that idea of migration in archaeology that pots are the same as people. There have been a fair number of other coauthors from different parts of continental Europe who shared this anxiety.

We responded to this by adding a lot of content to our papers to discuss these issues and contextualize them. Our results are actually almost diametrically opposite from what Kossinna thought because these Corded Ware people come from the East, a place that Kossinna would have despised as a source for them. But nevertheless it is true that there’s big population movements, and so I think what the DNA is doing is it’s forcing the hand of this discussion in archaeology, showing that in fact, major movements of people do occur. They are sometimes sharp and dramatic, and they involve large-scale population replacements over a relatively short period of time. We now can see that for the first time.

What the genetics is finding is often outside the range of what the archaeologists are discussing these days.

This is mostly true: Genomics offers a whole new dimension to assess exchanges among groups, and thus helps select among anthropological models of cultural diffusion. It offers another way of interpreting prehistoric cultural evolution and change, including the investigation of potential languages of these cultures, ways of change and replacement, etc.

Also, he acknowledges that a lot of content is added to the papers in search of context – and thus to avoid simplistic assumptions and conclusions – so this is a reasonable way to look at the (often erroneous) cultural and linguistic context which accompanies most genetic papers, and even the new methods being developed to assess samples.

On the other hand, the fact that many in Archaeology didn’t want to discuss migrations does not mean that it was not discussed at all, as he seems to suggest.

On how Genomics fits with traditional disciplines

Zhang: I think at one point in your book you actually describe ancient DNA researchers as the “barbarians” at the gates of the study of history.

Reich: Yeah.

Zhang: Does it feel that way? Have you gotten into arguments with archaeologists over your findings?

Reich: I think archaeologists and linguists find it frustrating that we’re not trained in the language of archaeology and all these sensitivities like about Kossinna. Yet we have this really powerful tool which is this way of looking at things nobody has been able to look at before.

The point I was trying to make there was that even if we’re not always able to articulate the context of our findings very well, this is very new information, and a serious scholar really needs to take this on board. It’s dangerous. Barbarians may not talk in an educated and learned way but they have access to weapons and ways of looking at things that other people haven’t looked to. And time and again we’ve learned in the past that ignoring barbarians is a dangerous thing to do.

I think this is also mostly true: many academics find it frustrating to read these papers, most of which lack a minimal understanding of the topics being discussed.

For example, you can’t pretend to derive meaningful conclusions about Proto-Indo-Europeans knowing nothing about their language and the potential cultures associated with them (and why they were associated with them in the first place)…

I also agree with him that the study of ancient DNA is a very powerful tool. Everyone involved in Anthropology and Archaeology should be trained these days in Genomics – or, at least, they should have the opportunity to do so.

On the dangers of Genomics

Reich: (…) I know there are extremists who are interested in genealogy and genetics. But I think those are very marginal people, and there’s, of course, a concern they may impinge on the mainstream.

But if you actually take any serious look at this data, it just confounds every stereotype. It’s revealing that the differences among populations we see today are actually only a few thousand years old at most and that everybody is mixed. I think that if you pay any attention to this world, and have any degree of seriousness, then you can’t come out feeling affirmed in the racist view of the world. You have to be more open to immigration. You have to be more open to the mixing of different peoples. That’s your own history.

I guess David Reich does not frequent forums on human genetics linked to ethnolinguistic identification, or he would not think of ‘extremists’ as marginal people. Or else we have a different view of what defines an ‘extremist’…


I did not have the best of opinions about David Reich – or any other geneticist involved in publishing anthropological theories, for that matter. I have always had great respect for their scientific work, though.

If anything, this article shows that he knows his own (and his fellow geneticists’) limitations, and the dangers and limitations of Genomics as a whole, so I have more respect for him – and anyone involved with his Lab’s work – after reading this piece.

I would sum up his interview with his humbling sentence:

We should think we really don’t know what we’re talking about.

NOTE. Also on the occasion of the publication of his book, Nature has published the piece Sex, power and ancient DNA – Turi King hails David Reich’s thrilling account of mapping humans through time and place.

After buying Lalueza-Fox’s recent book ‘La forja genètica d’Europa’, I don’t really feel like buying another book on Genomics and migrations from a geneticist. If you have read Reich’s book, please share your impressions.

EDIT (19 MAR 2018): Razib Khan has written a ‘preview of a review’ that he intends to publish in National Review, and it seems the book might be worth it, after all.

EDIT (20 MAR 2018): The New York Times’ Carl Zimmer writes a review, David Reich Unearths Human History Etched in Bone. Seen first in Razib Khan’s Gene Expression blog.

Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula

Open access preprint (which I announced already) at bioRxiv Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula, by Bycroft et al. (2018).

Abstract (emphasis mine):

Genetic differences within or between human populations (population structure) has been studied using a variety of approaches over many years. Recently there has been an increasing focus on studying genetic differentiation at fine geographic scales, such as within countries. Identifying such structure allows the study of recent population history, and identifies the potential for confounding in association studies, particularly when testing rare, often recently arisen variants. The Iberian Peninsula is linguistically diverse, has a complex demographic history, and is unique among European regions in having a centuries-long period of Muslim rule. Previous genetic studies of Spain have examined either a small fraction of the genome or only a few Spanish regions. Thus, the overall pattern of fine-scale population structure within Spain remains uncharacterised. Here we analyse genome-wide genotyping array data for 1,413 Spanish individuals sampled from all regions of Spain. We identify extensive fine-scale structure, down to unprecedented scales, smaller than 10 Km in some places. We observe a major axis of genetic differentiation that runs from east to west of the peninsula. In contrast, we observe remarkable genetic similarity in the north-south direction, and evidence of historical north-south population movement. Finally, without making particular prior assumptions about source populations, we show that modern Spanish people have regionally varying fractions of ancestry from a group most similar to modern north Moroccans. The north African ancestry results from an admixture event, which we date to 860 – 1120 CE, corresponding to the early half of Muslim rule. Our results indicate that it is possible to discern clear genetic impacts of the Muslim conquest and population movements associated with the subsequent Reconquista.

“(a) Binary tree showing the inferred hierarchical relationships between clusters. The colours and points correspond to each cluster as shown on the map, and the length of the coloured rectangles is proportional to the number of individuals assigned to that cluster. We combined some small clusters (Methods) and the thick black branches indicate the clades of the tree that we visualise in the map. We have labeled clusters according to the approximate location of most of their members, but geographic data was not used in the inference. (b) Each individual is represented by a point placed at (or close to) the centroid of their grandparents’ birthplaces. On this map we only show the individuals for whom all four grandparents were born within 80km of their average birthplace, although the data for all individuals were used in the fineSTRUCTURE inference. The background is coloured according to the spatial densities of each cluster at the level of the tree where there are 14 clusters (see Methods). The colour and symbol of each point corresponds to the cluster the individual was assigned to at a lower level of the tree, as shown in (a). The labels and boundaries of Spain’s Autonomous Communities are also shown.”
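
The placement rule in (b) is easy to picture with a small sketch: average the four grandparents’ birthplaces, then keep the individual on the map only if every birthplace lies within 80 km of that average. The code below is my own minimal illustration, not the authors’ pipeline; the coordinates are approximate, the function names are hypothetical, and a plain mean of latitude and longitude is assumed to be a good enough centroid at this scale.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def centroid(points):
    """Simple mean of latitudes and longitudes (an approximation)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def plotted_position(grandparent_birthplaces, max_km=80.0):
    """Return the centroid if all birthplaces are within max_km of it, else None."""
    c = centroid(grandparent_birthplaces)
    if all(haversine_km(c, p) <= max_km for p in grandparent_birthplaces):
        return c
    return None

# Approximate coordinates: Santiago, Lugo, A Coruña, Pontevedra.
local = plotted_position([(42.88, -8.54), (43.01, -7.56),
                          (43.37, -8.40), (42.43, -8.64)])
# Same individual, but with one grandparent born in Seville instead.
mixed = plotted_position([(42.88, -8.54), (43.01, -7.56),
                          (43.37, -8.40), (37.39, -5.98)])
print(local, mixed)
```

The second individual is dropped from the map (though, per the caption, still used in the fineSTRUCTURE inference), which is why the plotted points look more geographically tidy than the full sample.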

Some interesting excerpts:

Our results further imply that north west African-like DNA predominated in the migration. Moreover, admixture mainly, and perhaps almost exclusively, occurred within the earlier half of the period of Muslim rule. Within Spain, north African ancestry occurs in all groups, although levels are low in the Basque region and in a region corresponding closely to the 14th-century ‘Crown of Aragon’. Therefore, although genetically distinct this implies that the Basques have not been completely isolated from the rest of Spain over the past 1300 years.

NOTE. I must add here that the Expulsion of the Moriscos is known to have been quite successful in the old Crown of Aragon – deeply affecting its economy – in contrast with the territories of the Crown of Castile, where they either formed less sizeable communities, or were dispersed and eventually Christianized and integrated with local communities. For example, thousands of Moriscos from Granada were dispersed following the War of the Alpujarras (1568–1571) into different regions of the Crown of Castile, and many could not be later expelled due to the locals’ resistance to follow the expulsion edict.

Perhaps surprisingly, north African ancestry does not reflect proximity to north Africa, or even regions under more extended Muslim control. The highest amounts of north African ancestry found within Iberia are in the west (11%) including in Galicia, despite the fact that the region of Galicia as it is defined today (north of the Miño river), was never under Muslim rule and Berber settlements north of the Douro river were abandoned by. This observation is consistent with previous work using Y-chromosome data. We speculate that the pattern we see is driven by later internal migratory flows, such as between Portugal and Galicia, and this would also explain why Galicia and Portugal show indistinguishable ancestry sharing with non-Spanish groups more generally. Alternatively, it might be that these patterns reflect regional differences in patterns of settlement and integration with local peoples of north African immigrants themselves, or varying extents of the large-scale expulsion of Muslim people, which occurred post-Reconquista and especially in towns and cities.

We estimated ancestry profiles for each point on a fine spatial grid across Spain (Methods). Gray crosses show the locations of sampled individuals used in the estimation. Map shows the fraction contributed from the donor group ‘NorthMorocco’.

Overall, the pattern of genetic differentiation we observe in Spain reflects the linguistic and geopolitical boundaries present around the end of the time of Muslim rule in Spain, suggesting this period has had a significant and long-term impact on the genetic structure observed in modern Spain, over 500 years later. In the case of the UK, similar geopolitical correspondence was seen, but to a different period in the past (around 600 CE). Noticeably, in these two cases, country-specific historical events rather than geographic barriers seem to drive overall patterns of population structure. The observation that fine-scale structure evolves at different rates in different places could be explained if observed patterns tend to reflect those at the ends of periods of significant past upheaval, such as the end of Muslim rule in Spain, and the end of the Anglo-Saxon and Danish Viking invasions in the UK.

Certain people want to believe (well into the 21st century) in ideal ancestral populations and ancient ethnolinguistic identifications linked to their own – or their own country’s dominant – ancestral components and Y-DNA haplogroup.

We are nevertheless seeing how mainly the most recent relevant geopolitical events and late internal migratory flows have shaped the genetic structure (including Y-DNA haplogroup composition) of modern regions and countries, regardless of their populations’ actual language or ethnic identification, whether (pre)historical or modern.

Another surprise for many, I guess.


Population substructure in Iberia, highest in the north-west territory (to appear in Nature)

A manuscript co-authored by Angel Carracedo, from the University of Santiago de Compostela, and (always according to him) pre-accepted in Nature, will offer more insight into the population substructure of Spain, based on autosomal DNA.

Carracedo’s lecture about DNA (in Galician), including his summary of the paper (from December 2017):

Some of the points made in the video:

  • The study shows a situation paralleling – as expected – the expansion of Spanish Medieval kingdoms during the Reconquista (and subsequent repopulation).
  • In it, the biggest surprise seems to be the greater substructure found in Galicia, the north-western Spanish territory – greater even than expected by the authors.
  • As a side note, Galicia shows a great influence from “Moorish” ancestral components, due mainly to the influx from Portugal, which shows more.

It is difficult to judge only from the image and his words, but one could say that there are:

  • Certain quite old ancestral Galician groups;
    • then two – also quite old – ancestral Basque groups;
      • then more recent Galician groups;
        • and then a common, central Spanish group – including
          • a wider Asturian-Catalan group, with a western Asturian-Leonese, and an eastern Catalan subgroup;
          • and a central Castilian-Aragonese group, also with a western Castilian, and an eastern Aragonese subgroup.

Spain’s population substructure, from the video.

We thought that certain parts of the British Isles could show ancestral components related to the old population, although this has not proven exactly right, due to more recent population expansions.

However, this paper might shed light on the controversy surrounding Lusitanian (possibly Gallaico-Lusitanian) as a Pre-Celtic Indo-European group of Iberia, either slightly older as an Italo-Celtic dialect, or potentially from the Bell Beaker expansion, whose genetic imprint might have survived the Roman conquest, which apparently didn’t replace its ancestral population.

Given the presence of a central Spanish group opposed to the other minor groups, and knowing that (at least part of) the Medieval kingdoms should be related to the Occitan region (due to the Celtic expansion, and potentially also later, during the Visigothic Kingdom and the Carolingian Empire), we can only guess that the other (north-western and Basque) groups are potentially quite old, and reflect prehistoric population structures.

Just speculating here, of course. Another interesting genetic paper to await…

Seen first in the Facebook group Iberia ADN.


Ancient mtDNA from Central America and Mexico


New article, Successful reconstruction of whole mitochondrial genomes from ancient Central America and Mexico, by Morales-Arce et al., Scientific Reports (2017).


The northern and southern peripheries of ancient Mesoamerica are poorly understood. There has been speculation over whether borderland cultures such as Greater Nicoya and Casas Grandes represent Mesoamerican outposts in the Isthmo-Colombian area and the Greater Southwest, respectively. Poor ancient DNA preservation in these regions challenged previous attempts to resolve these questions using conventional genetic techniques. We apply advanced in-solution mitogenome capture and high-throughput sequencing to fourteen dental samples obtained from the Greater Nicoya sites of Jícaro and La Cascabel in northwest Costa Rica (n = 9; A.D. 800–1250) and the Casas Grandes sites of Paquimé and Convento in northwest Mexico (n = 5; A.D. 1200–1450). Full mitogenome reconstruction was successful for three individuals from Jícaro and five individuals from Paquimé and Convento. The three Jícaro individuals belong to haplogroup B2d, a haplogroup found today only among Central American Chibchan-speakers. The five Paquimé and Convento individuals belong to haplogroups C1c1a, C1c5, B2f and B2a which, are found in contemporary populations in North America and Mesoamerica. We report the first successfully reconstructed ancient mitogenomes from Central America, and the first genetic evidence of ancestry affinity of the ancient inhabitants of Greater Nicoya and Casas Grandes with contemporary Isthmo-Columbian and Greater Southwest populations, respectively.

Archaeological sites location and corresponding culture areas as noted in the text. ArcGIS 10.4 software was used to generate the figure. Service layer credits Esri, ArcGIS Online, TerraColor (Earthstar Geographics) 1999; Vivid – Mexico (Digital Globe) 2005, 2009, 2010, 2011, 2012, 2013, 2014, 2015; Metro (Digital Globe) 2016; Vivid Caribbean (Digital Globe) 2013, 2014, 2015, 2016, Vivid (Digital Globe) 2015, Vivid – Mexico (Digital Globe) 2012 and the GIS User Community.

Discovered via Bernard Sécher’s blog.

Featured image: From Wikipedia, author Juan Miguel, “Mesoamérica y toda América Central prehispanica en el siglo XVI (16) — antes de la llegada de los españoles.”


Preprint paper: Estimating genetic kin relationships in prehistoric populations, by Monroy Kuhn, Jakobsson, and Günther


A new preprint paper appeared some days ago in bioRxiv, Estimating genetic kin relationships in prehistoric populations, by Uppsala University researchers Jose Manuel Monroy Kuhn, Mattias Jakobsson, and Torsten Günther. You might remember the last two from their work Ancient X chromosomes reveal contrasting sex bias in Neolithic and Bronze Age Eurasian migrations, whose results were said not to be replicable by Lazaridis and Reich (PNAS), something they denied, pointing to the limitations of the current aDNA data (PNAS).

They propose a new, more conservative method to infer close relationships, in contrast with available methods, which are suitable for modern samples. They have implemented the method as a software program called READ, which should work better with degraded samples (typical of ancient DNA) by reducing false positives, at the cost of more false negatives. Abstract:

Archaeogenomic research has proven to be a valuable tool to trace migrations of historic and prehistoric individuals and groups, whereas relationships within a group or burial site have not been investigated to a large extent. Knowing the genetic kinship of historic and prehistoric individuals would give important insights into social structures of ancient and historic cultures. Most archaeogenetic research concerning kinship has been restricted to uniparental markers, while studies using genome-wide information were mainly focused on comparisons between populations. Applications which infer the degree of relationship based on modern-day DNA information typically require diploid genotype data. Low concentration of endogenous DNA, fragmentation and other post-mortem damage to ancient DNA (aDNA) makes the application of such tools unfeasible for most archaeological samples. To infer family relationships for degraded samples, we developed the software READ (Relationship Estimation from Ancient DNA). We show that our heuristic approach can successfully infer up to second degree relationships with as little as 0.1x shotgun coverage per genome for pairs of individuals. We uncover previously unknown relationships among prehistoric individuals by applying READ to published aDNA data from several human remains excavated from different cultural contexts. In particular, we find a group of five closely related males from the same Corded Ware culture site in modern-day Germany, suggesting patrilocality, which highlights the possibility to uncover social structures of ancient populations by applying READ to genome-wide aDNA data.
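To make the idea more tangible, here is a minimal Python sketch of a pairwise mismatch heuristic in the spirit of what the abstract describes. It is not the READ implementation. It assumes pseudo-haploid data (one allele per site per individual, coded 0/1, with None for missing sites), and it assumes that the mismatch rate of each pair is divided by the median over all pairs, so that an unrelated pair sits near 1. Under that assumption the expected values are 0.5 for the same individual or twins, 0.75 for first-degree and 0.875 for second-degree relatives; the cut-offs used below are simply the midpoints between those values, chosen for illustration. All function names are hypothetical.

```python
from statistics import median

# Cut-offs are the midpoints between the expected normalised mismatch
# values (0.5, 0.75, 0.875, 1.0). They are an assumption of this sketch,
# not values taken from the preprint.
CUTOFFS = [
    (0.625, "identical/twin"),
    (0.8125, "first degree"),
    (0.9375, "second degree"),
]


def mismatch_rate(a, b):
    """Proportion of non-matching alleles between two pseudo-haploid
    individuals, counting only sites that are covered in both."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("the two individuals share no covered sites")
    return sum(x != y for x, y in shared) / len(shared)


def classify(normalised_rate):
    """Map a normalised mismatch rate to a relationship label."""
    for cutoff, label in CUTOFFS:
        if normalised_rate < cutoff:
            return label
    return "unrelated"


def classify_pairs(samples):
    """Classify every pair in a dict {name: list of alleles}.

    The median mismatch rate over all pairs serves as the baseline for
    an unrelated pair, which only makes sense if most pairs in the
    dataset really are unrelated.
    """
    names = sorted(samples)
    rates = {
        (m, n): mismatch_rate(samples[m], samples[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }
    baseline = median(rates.values())
    return {pair: classify(rate / baseline) for pair, rate in rates.items()}
```

A real analysis would use many thousands of sites and some measure of uncertainty for each pair; with only a handful of sites the labels are meaningless, so toy inputs are useful only to check the arithmetic.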

The authors applied READ to the 230 ancient European samples from Mathieson et al. (2015), with certain interesting results. For starters, this paper already supports the idea that the five German Corded Ware samples from Esperstedt were all related, thus further supporting to a certain extent the culture’s patrilocality and female exogamy practices:

Of particular interest was a group of five males from Esperstedt in Germany who were associated with the Corded Ware culture – a culture that arose after large scale migrations of males from the east. Around 50 Corded Ware burials, six of them stone cists, were excavated near Esperstedt in the context of road constructions in 2005. Characteristic Corded Ware pottery was found in the graves and all male individuals had been buried on their right hand site. Interestingly, the central individual of the group of related individuals (I1541) was buried in a stone cist approximately 700 meters from the graves of the other four individuals which were all close to each other. The close relationship of this group of only male individuals from the same location suggest patrilocality and female exogamy, a pattern which has also been found from Strontium isotopes at another Corded Ware site just 30 kilometers from Esperstedt and suggested for the Corded Ware culture in general. This represents just one example of how the genetic analysis of relationships can be used to uncover and understand social structures in ancient populations.

It is to be expected that improvements in such methods can help define certain samples more accurately, by inferring their precise subclades. For example, in the case of those relatives from Esperstedt – classified variously as R(xR1b), R1a, or R1a1 – one would be able to assign those related patrilineally to the most precise subclade found among them: in this case, that of the sample I0104 (ca. 2473-2348 BC), of subclade R1a1a1-M417.

However, errors are dependent on the quality of the ancient DNA recovered:

READ does not explicitly model aDNA damage and it only considers one allele at heterozygous sites. This implies that a careful curation of the data is required to avoid errors due to low coverage, short sequence fragments, deamination damage, sequencing errors and potential contamination. We recommend a number of well established filtering steps when working with low coverage aDNA data