Contrastive principal component analysis (cPCA) to explore patterns specific to a dataset

Interesting open access paper, Exploring patterns enriched in a dataset with contrastive principal component analysis, by Abid, Zhang, Bagaria & Zou, Nature Communications (2018) 9:2134.

Abstract (emphasis mine):

Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis

Read the rest “Contrastive principal component analysis (cPCA) to explore patterns specific to a dataset”
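
As a rough illustration of the idea described in the abstract, here is a minimal sketch of cPCA in Python with NumPy. It assumes the usual formulation, in which the contrastive directions are the leading eigenvectors of the target covariance minus a multiple `alpha` of the background covariance. The function name `contrastive_pca`, the synthetic data and the choice `alpha=1.0` are my own assumptions for the example, not taken from the paper's code.

```python
import numpy as np


def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Sketch of cPCA: directions with high variance in `target`
    and low variance in `background` (rows = samples, columns = features)."""
    # Covariance matrices of each dataset (np.cov centres the data itself).
    cov_target = np.cov(target, rowvar=False)
    cov_background = np.cov(background, rowvar=False)
    # Contrast matrix: target variance penalised by background variance.
    contrast = cov_target - alpha * cov_background
    # The matrix is symmetric, so eigh applies; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(contrast)
    order = np.argsort(eigvals)[::-1][:n_components]
    directions = eigvecs[:, order]
    # Project the centred target data onto the contrastive directions.
    projected = (target - target.mean(axis=0)) @ directions
    return directions, projected


# Synthetic example (assumed data): feature 0 is a strong nuisance signal
# shared by both datasets; feature 1 hides a subgroup found only in the target.
rng = np.random.default_rng(0)
n = 2000
scales = np.array([5.0, 1.0, 1.0])
background = rng.normal(size=(n, 3)) * scales
target = rng.normal(size=(n, 3)) * scales
target[: n // 2, 1] += 6.0  # subgroup shift present only in the target

directions, projected = contrastive_pca(target, background, alpha=1.0, n_components=1)

# Plain PCA on the target alone, for comparison: top eigenvector of its covariance.
pca_top = np.linalg.eigh(np.cov(target, rowvar=False))[1][:, -1]

print("top cPCA direction:", np.round(directions[:, 0], 2))
print("top PCA direction: ", np.round(pca_top, 2))
```

In this toy setting plain PCA should follow the shared nuisance feature, while the contrastive direction should follow the feature that separates the hidden subgroup. In practice the value of `alpha` has to be explored, since it controls the trade-off between the two datasets.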

The concept of “Outlier” in Human Ancestry (III): Late Neolithic samples from the Baltic region and origins of the Corded Ware culture

I have written before about how the Late Neolithic sample from Zvejnieki seemed to be an outlier among Corded Ware samples (read also the Admixture analysis section on the IEDDM), due to its position in PCA, even more than its admixture components or statistical comparison might show.

In the recent update to Northern European samples in Mittnik et al. (2018), an evaluation of events similar to the previous preprint (2017) is given:

Computing D-statistics for each individual of the form D(Baltic LN, Yamnaya; X, Mbuti), we find that the two individuals from the early phase of the

Read the rest “The concept of “Outlier” in Human Ancestry (III): Late Neolithic samples from the Baltic region and origins of the Corded Ware culture”
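
For readers unfamiliar with the notation D(Baltic LN, Yamnaya; X, Mbuti) quoted above, here is a minimal sketch of how a D-statistic of the form D(W, X; Y, Z) can be computed from per-SNP allele frequencies. It assumes the commonly used frequency-based definition and sign convention (positive values meaning W shares more alleles with Y than X does). The function name `d_statistic` and the toy frequencies are hypothetical, and real analyses also estimate a standard error (typically with a block jackknife), which is omitted here.

```python
def d_statistic(p_w, p_x, p_y, p_z):
    """Sketch of D(W, X; Y, Z) from per-SNP allele frequencies (values in [0, 1]).

    Each argument is a sequence with one frequency per SNP, in the same SNP order.
    """
    numerator = 0.0
    denominator = 0.0
    for w, x, y, z in zip(p_w, p_x, p_y, p_z):
        # Numerator: excess allele sharing between W and Y relative to X and Y.
        numerator += (w - x) * (y - z)
        # Denominator: normalisation that bounds the statistic between -1 and 1.
        denominator += (w + x - 2 * w * x) * (y + z - 2 * y * z)
    if denominator == 0:
        raise ValueError("no informative SNPs: denominator is zero")
    return numerator / denominator


# Toy frequencies (assumed, not real data): W tracks Y slightly more than X does.
w = [0.9, 0.2, 0.7, 0.5]
x = [0.7, 0.3, 0.6, 0.5]
y = [1.0, 0.0, 1.0, 0.5]
z = [0.0, 1.0, 0.0, 0.5]
print(d_statistic(w, x, y, z))
```

With this convention, a value near zero means that W and X are symmetrically related to Y, which is the kind of comparison made between the Baltic Late Neolithic individuals and Yamnaya in the quoted passage.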

Differences in ADMIXTURE between Khvalynsk/Yamna and Sredni Stog/Corded Ware

Looking for differences among steppe cultures in Genomics is like looking for a needle in a haystack.

It means, after all, looking for differences among closely related cultures, such as between South-Western and North-Western Anatolian Neolithic cultures, or among Old European cultures (such as Vinča or Cucuteni–Trypillia), or between Iberian cultures after the arrival of steppe-related populations.

These differences between closely related regions, in all these cases and especially among steppe cultures, even when they are supported by Archaeology and anthropological models of migration (and compatible with linguistic models), are expected to be minimal.

Fortunately, we have … Read the rest “Differences in ADMIXTURE between Khvalynsk/Yamna and Sredni Stog/Corded Ware”

The concept of “Outlier” in Human Ancestry (II): Early Khvalynsk, Sredni Stog, West Yamna, Iron Age Bulgaria, Potapovka, Andronovo…

I already wrote about the concept of outlier in Human Ancestry, so I am not going to repeat myself. This is just an update of “outliers” in recent studies, and their potential origins (here I will repeat some of the examples):

Early Khvalynsk: the three samples from the Samara region have quite different positions in PCA, from nearest to EHG (of Y-DNA haplogroup R1a) to nearest to ANE ancestry (of Y-DNA haplogroup Q). This could represent the initial consequences of the second wave of ANE ancestry – as found later in Yamna samples from a neighbouring region –, … Read the rest “The concept of “Outlier” in Human Ancestry (II): Early Khvalynsk, Sredni Stog, West Yamna, Iron Age Bulgaria, Potapovka, Andronovo…”

Globular Amphora not linked to Pontic steppe migrants – more data against Kristiansen’s Kurgan model of Indo-European expansion

New open access article, Genome diversity in the Neolithic Globular Amphorae culture and the spread of Indo-European languages, by Tassi et al. (2017).

Abstract:

It is unclear whether Indo-European languages in Europe spread from the Pontic steppes in the late Neolithic, or from Anatolia in the Early Neolithic. Under the former hypothesis, people of the Globular Amphorae culture (GAC) would be descended from Eastern ancestors, likely representing the Yamnaya culture. However, nuclear (six individuals typed for 597 573 SNPs) and mitochondrial (11 complete sequences) DNA from the GAC appear closer to those of earlier Neolithic groups than to

Read the rest “Globular Amphora not linked to Pontic steppe migrants – more data against Kristiansen’s Kurgan model of Indo-European expansion”

Human ancestry: how to work your own PCA, ADMIXTURE analyses for human evolutionary and genealogical studies

I wrote two days ago, in the post announcing the revised version (October 2017) of the Indo-European demic diffusion model, about dumping the information I had on doing PCA and ADMIXTURE analyses as ‘drafts’, without reviewing them, in the new section of this website called Human Ancestry.

I had some time today to review them, and to correct gross mistakes in the texts, so that they might be more usable now.

I began to work with free datasets to see if I could learn something more about results of recent Genetic research by working with the available … Read the rest “Human ancestry: how to work your own PCA, ADMIXTURE analyses for human evolutionary and genealogical studies”

The concept of “outlier” in studies of Human Ancestry, and the Corded Ware outlier from Esperstedt

While writing the third version of the Indo-European demic diffusion model, I noticed that one Corded Ware sample (labelled I0104) clusters quite closely with steppe samples (i.e. Yamna, Afanasevo, and Potapovka). The other Corded Ware samples cluster, as expected, closely with east-central European samples, which include related cultures such as the Swedish Battle Axe, and later Sintashta, or Potapovka (cultures that are from the steppe proper, but are derived from Corded Ware).

I also noticed after publishing the draft that I had used the wording “Corded Ware outlier” at least once. I certainly had that term … Read the rest “The concept of “outlier” in studies of Human Ancestry, and the Corded Ware outlier from Esperstedt”

Indo-European demic diffusion model, 3rd edition

I have just uploaded the working draft of the third version of the Indo-European demic diffusion model. Unlike the previous two versions, which were published as essays (fully developed papers), this new version adds more information on human admixture, and probably needs important corrections before a definitive edition can be published.

The third version is available right now on ResearchGate and Academia.edu. I will post the PDF at Academia Prisca, as soon as possible:

Feel free to … Read the rest “Indo-European demic diffusion model, 3rd edition”

Palaeogenomic and biostatistical analysis of ancient DNA data from Mesolithic and Neolithic skeletal remains

PhD Thesis, Palaeogenomic and biostatistical analysis of ancient DNA data from Mesolithic and Neolithic skeletal remains, by Zuzana Hofmanova (2017) at the University of Mainz.

Abstract:

Palaeogenomic data have illuminated several important periods of human past with surprising implications for our understanding of human evolution. One of the major changes in human prehistory was Neolithisation, the introduction of the farming lifestyle to human societies. Farming originated in the Fertile Crescent approximately 10,000 years BC and in Europe it was associated with a major population turnover. Ancient DNA from Anatolia, the presumed source area of the demic spread to

Read the rest “Palaeogenomic and biostatistical analysis of ancient DNA data from Mesolithic and Neolithic skeletal remains”