Preprint paper: Estimating genetic kin relationships in prehistoric populations, by Monroy Kuhn, Jakobsson, and Günther

A new preprint appeared a few days ago on bioRxiv, Estimating genetic kin relationships in prehistoric populations, by Uppsala University researchers Jose Manuel Monroy Kuhn, Mattias Jakobsson, and Torsten Günther. You might remember the last two from their work Ancient X chromosomes reveal contrasting sex bias in Neolithic and Bronze Age Eurasian migrations, whose results Lazaridis and Reich said could not be replicated (PNAS), a claim the authors disputed by pointing to the limitations of current aDNA data (PNAS).

They propose a new, more conservative method to infer close relationships, in contrast with the available methods, which are suited to modern samples. They have implemented the method in a program called READ, which should work better with degraded samples (typical of ancient DNA) by reducing false positives, at the cost of more false negatives. Abstract:

Archaeogenomic research has proven to be a valuable tool to trace migrations of historic and prehistoric individuals and groups, whereas relationships within a group or burial site have not been investigated to a large extent. Knowing the genetic kinship of historic and prehistoric individuals would give important insights into social structures of ancient and historic cultures. Most archaeogenetic research concerning kinship has been restricted to uniparental markers, while studies using genome-wide information were mainly focused on comparisons between populations. Applications which infer the degree of relationship based on modern-day DNA information typically require diploid genotype data. Low concentration of endogenous DNA, fragmentation and other post-mortem damage to ancient DNA (aDNA) makes the application of such tools unfeasible for most archaeological samples. To infer family relationships for degraded samples, we developed the software READ (Relationship Estimation from Ancient DNA). We show that our heuristic approach can successfully infer up to second degree relationships with as little as 0.1x shotgun coverage per genome for pairs of individuals. We uncover previously unknown relationships among prehistoric individuals by applying READ to published aDNA data from several human remains excavated from different cultural contexts. In particular, we find a group of five closely related males from the same Corded Ware culture site in modern-day Germany, suggesting patrilocality, which highlights the possibility to uncover social structures of ancient populations by applying READ to genome-wide aDNA data.

The authors applied READ to the data from 230 ancient Europeans published by Mathieson et al. (2015), with some interesting results. For starters, the paper supports the idea that the five German Corded Ware samples from Esperstedt were all related, which lends some further support to the culture’s patrilocality and female exogamy practices:

Of particular interest was a group of five males from Esperstedt in Germany who were associated with the Corded Ware culture – a culture that arose after large scale migrations of males from the east. Around 50 Corded Ware burials, six of them stone cists, were excavated near Esperstedt in the context of road constructions in 2005. Characteristic Corded Ware pottery was found in the graves and all male individuals had been buried on their right hand site. Interestingly, the central individual of the group of related individuals (I1541) was buried in a stone cist approximately 700 meters from the graves of the other four individuals which were all close to each other. The close relationship of this group of only male individuals from the same location suggest patrilocality and female exogamy, a pattern which has also been found from Strontium isotopes at another Corded Ware site just 30 kilometers from Esperstedt and suggested for the Corded Ware culture in general. This represents just one example of how the genetic analysis of relationships can be used to uncover and understand social structures in ancient populations.

Improvements in such methods can be expected to help define certain samples more accurately, by inferring their precise subclades. For example, in the case of those relatives from Esperstedt – classified variously as R(xR1b), R1a, or R1a1 – one could assign the patrilineally related individuals to the most precise subclade found among them: in this case, that of the sample I0104 (ca. 2473-2348 BC), of subclade R1a1a1-M417.

However, the error rate depends on the quality of the ancient DNA recovered:

READ does not explicitly model aDNA damage and it only considers one allele at heterozygous sites. This implies that a careful curation of the data is required to avoid errors due to low coverage, short sequence fragments, deamination damage, sequencing errors and potential contamination. We recommend a number of well established filtering steps when working with low coverage aDNA data
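
To make the "one allele at heterozygous sites" point concrete, here is a minimal Python sketch of what such pseudo-haploid data looks like and how two individuals might be compared. It is not READ's code: the function names, the toy genotypes, and the idea of summarising a pair by its share of non-matching alleles are assumptions for illustration only, and the sketch does none of the filtering the authors recommend.

```python
import random


def pseudo_haploidize(genotypes, rng):
    """Keep one randomly chosen allele per site; missing sites (None) stay missing."""
    return [None if g is None else rng.choice(g) for g in genotypes]


def mismatch_rate(hap_a, hap_b):
    """Share of non-matching alleles among sites covered in both individuals."""
    compared = 0
    mismatches = 0
    for a, b in zip(hap_a, hap_b):
        if a is None or b is None:
            continue  # low coverage: the site is missing in at least one individual
        compared += 1
        if a != b:
            mismatches += 1
    return mismatches / compared if compared else float("nan")


# Toy diploid genotypes for two hypothetical individuals (None = no coverage).
ind1 = [("A", "A"), ("A", "G"), None, ("C", "C"), ("T", "T")]
ind2 = [("A", "A"), ("G", "G"), ("C", "T"), ("C", "C"), ("T", "C")]

rng = random.Random(0)  # fixed seed so the random allele choice is repeatable
hap1 = pseudo_haploidize(ind1, rng)
hap2 = pseudo_haploidize(ind2, rng)
rate = mismatch_rate(hap1, hap2)
```

Because a single allele is drawn at random at each heterozygous site, a damaged or misread base feeds straight into the comparison, which is one way to see why the authors insist on careful curation of the data.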