Insights from the Green Saharan Y-chromosomal findings (emphasis mine):
It is widely accepted that sub-Saharan Y chromosomes are dominated by E-M2 lineages carried by Bantu-speaking farmers as they expanded from West Africa starting < 5 kya, reaching South Africa within recent centuries. The E-M2-Bantu lineages lie phylogenetically within the E-M2-Green Sahara lineage and show at least three explosive lineage expansions beginning 4.9–5.3 kya (Fig. 1a). These events of E-M2-Bantu expansion are slightly later than the R-V88 expansion, and highlight the range of male demographic changes in the mid-Holocene. North of the Sahara, in addition to the four trans-Saharan haplogroups, haplogroup E-M81 (which diverged from E-M78 ~ 13 kya) became very common in present-day populations as a result of another massive expansion ~ 2 kya (Fig. 1a).
Although Y chromosomes exist within populations and so share and reflect the general history of those populations, they can sometimes show some departures from other parts of the genome that result from differences in male and female behaviors. D’Atanasio et al. highlight one such contrast in their study. Present-day North African populations show substantial sub-Saharan autosomal and mtDNA genetic components ascribed to the Roman and Arab slave trades 1–2 kya, but carry few sub-Saharan Y lineages from this source, probably reflecting the smaller numbers of male slaves and their reduced reproductive opportunities when compared to those of female slaves. The sub-Saharan Y chromosomes in these North African populations thus originate predominantly from the earlier Green Sahara period.
In this part of Africa, the indigenous languages that are spoken belong to three of the four African linguistic families (Afro-Asiatic, Nilo-Saharan and Niger-Congo). Interestingly, these languages show non-random associations with Y lineages. For example, Chadic languages within the Afro-Asiatic family are associated with haplogroup R-V88, whereas Nilo-Saharan languages are associated with specific sublineages within A3-M13 and E-M78, further illustrating the complex human history of the region.
(…) what are the reasons for the very rapid R-V88 expansion 5–6 kya and E-M81 expansion ~ 2 kya, and how do these expansions fit within general worldwide patterns of male-specific expansions, which in other cases have been linked to cultural and technological changes?
I think that the only known haplogroup expansion that might today fit the spread and dialectalization of Afroasiatic, a proto-language probably contemporaneous with or slightly older than Middle Proto-Indo-European, is that of R1b-V88 lineages. However, without ancient DNA samples to corroborate this, we cannot be sure.
Genetic differences within or between human populations (population structure) has been studied using a variety of approaches over many years. Recently there has been an increasing focus on studying genetic differentiation at fine geographic scales, such as within countries. Identifying such structure allows the study of recent population history, and identifies the potential for confounding in association studies, particularly when testing rare, often recently arisen variants. The Iberian Peninsula is linguistically diverse, has a complex demographic history, and is unique among European regions in having a centuries-long period of Muslim rule. Previous genetic studies of Spain have examined either a small fraction of the genome or only a few Spanish regions. Thus, the overall pattern of fine-scale population structure within Spain remains uncharacterised. Here we analyse genome-wide genotyping array data for 1,413 Spanish individuals sampled from all regions of Spain. We identify extensive fine-scale structure, down to unprecedented scales, smaller than 10 Km in some places. We observe a major axis of genetic differentiation that runs from east to west of the peninsula. In contrast, we observe remarkable genetic similarity in the north-south direction, and evidence of historical north-south population movement. Finally, without making particular prior assumptions about source populations, we show that modern Spanish people have regionally varying fractions of ancestry from a group most similar to modern north Moroccans. The north African ancestry results from an admixture event, which we date to 860 – 1120 CE, corresponding to the early half of Muslim rule. Our results indicate that it is possible to discern clear genetic impacts of the Muslim conquest and population movements associated with the subsequent Reconquista.
Some interesting excerpts:
Our results further imply that north west African-like DNA predominated in the migration. Moreover, admixture mainly, and perhaps almost exclusively, occurred within the earlier half of the period of Muslim rule. Within Spain, north African ancestry occurs in all groups, although levels are low in the Basque region and in a region corresponding closely to the 14th-century ‘Crown of Aragon’. Therefore, although genetically distinct this implies that the Basques have not been completely isolated from the rest of Spain over the past 1300 years.
NOTE. I must add here that the Expulsion of Moriscos is known to have been quite successful in the old Crown of Aragon – deeply affecting its economy – , in contrast with other territories of the Crown of Castile, where they either formed less sizeable communities, or were dispersed and eventually Christianized and integrated with local communities. For example, thousands of Moriscos from Granada were dispersed following the War of Alpujarras (1567–1571) into different regions of the Crown of Castile, and many could not be later expelled due to the locals’ resistance to follow the expulsion edict.
Perhaps surprisingly, north African ancestry does not reflect proximity to north Africa, or even regions under more extended Muslim control. The highest amounts of north African ancestry found within Iberia are in the west (11%) including in Galicia, despite the fact that the region of Galicia as it is defined today (north of the Miño river), was never under Muslim rule and Berber settlements north of the Douro river were abandoned by. This observation is consistent with previous work using Y-chromosome data. We speculate that the pattern we see is driven by later internal migratory flows, such as between Portugal and Galicia, and this would also explain why Galicia and Portugal show indistinguishable ancestry sharing with non-Spanish groups more generally. Alternatively, it might be that these patterns reflect regional differences in patterns of settlement and integration with local peoples of north African immigrants themselves, or varying extents of the large-scale expulsion of Muslim people, which occurred post-Reconquista and especially in towns and cities.
Overall, the pattern of genetic differentiation we observe in Spain reflects the linguistic and geopolitical boundaries present around the end of the time of Muslim rule in Spain, suggesting this period has had a significant and long-term impact on the genetic structure observed in modern Spain, over 500 years later. In the case of the UK, similar geopolitical correspondence was seen, but to a different period in the past (around 600 CE). Noticeably, in these two cases, country-specific historical events rather than geographic barriers seem to drive overall patterns of population structure. The observation that fine-scale structure evolves at different rates in different places could be explained if observed patterns tend to reflect those at the ends of periods of significant past upheaval, such as the end of Muslim rule in Spain, and the end of the Anglo-Saxon and Danish Viking invasions in the UK.
Certain people want to believe (well into the 21st century) in ideal ancestral populations and ancient ethnolinguistic identifications linked to their own – or their own country’s dominant – ancestral components and Y-DNA haplogroup.
We are nevertheless seeing how mainly the most recent relevant geopolitical events and late internal migratory flows have shaped the genetic structure (including Y-DNA haplogroup composition) of modern regions and countries, regardless of their populations’ actual language or ethnic identification, whether (pre)historical or modern.
Next generation sequencing (NGS) technologies offer immense possibilities given the large genomic data they simultaneously deliver. The human Y chromosome serves as good example how NGS benefits various applications in evolution, anthropology, genealogy and forensics. Prior to NGS, the Y-chromosome phylogenetic tree consisted of a few hundred branches, based on NGS data it now contains many thousands. The complexity of both, Y tree and NGS data provide challenges for haplogroup assignment. For effective analysis and interpretation of Y-chromosome NGS data, we present Yleaf, a publically available, automated, user-friendly software for high-resolution Y-chromosome haplogroup inference independently of library and sequencing methods.
In the time of NGS (or massively parallel sequencing, MPS), the amount of genomic data produced and made publically available is rapidly expanding, providing valuable resources for many areas of research and applications. Due to its haploid nature and male-specific inheritance, the non-recombining part of the human Y-chromosome (NRY) is highly suitable for phylogenetic studies and for addressing questions in evolution, anthropology, population history, genealogy and forensics (Jobling & Tyler-Smith, 2017). Over recent years, NGS data allowed the phylogenetic NRY tree to dramatically increase in size and complexity (Hallast et al. 2014; Poznik et al. 2016). The two most comprehensive tree versions ISOGG (http://www.isogg.org/tree) and Yfull (https://www.yfull.com/tree) currently contain thousands of branches. However, the complexity of both, Y tree and NGS data provide immense challenges for NRY haplogroup assignment, which reflects a key element in many NRY applications. Here we introduce Yleaf, a Python-based, easy-to-use, publically-available software tool for effective NRY single nucleotide polymorphism (SNP) calling and subsequent NRY haplogroup inference from NGS data. By comparative whole genome data analysis, we demonstrate high concordance of Yleaf in NRY-SNP calling compared to well-established tools such as SAMtools/BCFtools (Li et al. 2009), and GATK (McKenna, et al. 2010) as well as improved performance of Yleaf in NRY haplogroup assignment relative to previously developed tools such as clean_tree (Ralf et al. 2015), AMY-tree (Van Geystelen et al. 2015), and yHaplo (Poznik, 2016).
Yleaf allows analyzing NRY sequence data from many types of NGS libraries i.e., whole genomes, whole exomes, large genomic regions, and large numbers of targeted amplicons. Several modifications relative to our previously developed clean_tree tool (Ralf et al. 2015) were implemented to optimize the performance especially relevant for extremely large NGS datasets such as whole genomes. For instance, Yleaf extracts the Y-chromosomal reads prior to further processing and uses multi-threading, a batch option is included too. Importantly, Yleaf provides drastically increased haplogroup resolution i.e., from 530 positions defining 432 NRY haplogroups with clean_tree (Ralf et al. 2015) to over 41,000 positions defining 5353 haplogroups with Yleaf. For a detailed method description see the supplementary material.
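To make the idea of haplogroup inference from SNP calls more concrete, here is a minimal Python sketch. It is not Yleaf's code or interface: the tiny tree, the marker positions, the alleles and the function names are all hypothetical, chosen only to illustrate the general principle of picking the deepest branch supported by derived alleles and not contradicted by ancestral calls higher up the tree.

```python
# Hypothetical toy tree: haplogroup -> parent haplogroup (None for the root).
PARENT = {
    "R": None,
    "R1": "R",
    "R1b": "R1",
    "R1b-V88": "R1b",
    "R1b-M269": "R1b",
}

# Hypothetical marker table: position -> (haplogroup it defines, derived allele).
MARKERS = {
    101: ("R", "T"),
    202: ("R1", "G"),
    303: ("R1b", "A"),
    404: ("R1b-V88", "C"),
    505: ("R1b-M269", "G"),
}


def path_to_root(haplogroup):
    """Return the haplogroup followed by all of its ancestors up to the root."""
    path = [haplogroup]
    while PARENT[haplogroup] is not None:
        haplogroup = PARENT[haplogroup]
        path.append(haplogroup)
    return path


def infer_haplogroup(calls):
    """Infer a haplogroup from {position: observed allele} calls.

    Returns the deepest haplogroup with a derived call whose path to the
    root contains no haplogroup with an ancestral call, or None.
    """
    derived, ancestral = set(), set()
    for position, allele in calls.items():
        if position not in MARKERS:
            continue  # position is not informative in this toy table
        haplogroup, derived_allele = MARKERS[position]
        if allele == derived_allele:
            derived.add(haplogroup)
        else:
            ancestral.add(haplogroup)
    candidates = [
        hg for hg in sorted(derived)
        if not any(step in ancestral for step in path_to_root(hg))
    ]
    if not candidates:
        return None
    # Depth in the tree equals the length of the path to the root.
    return max(candidates, key=lambda hg: len(path_to_root(hg)))


sample = {101: "T", 202: "G", 303: "A", 404: "C", 505: "A"}
print(infer_haplogroup(sample))
```

Real tools also have to deal with read depth, base quality, missing positions and conflicting calls across many thousands of markers, none of which this toy version attempts.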
The possible scenarios based on potential sample results in terms of Y-DNA and mtDNA haplogroups seem to be generally well described, and I would bet – like Khan – on some kind of an East-West Eurasian connection. This is all pure speculation, though, and after all we only have to wait one month and see.
Out of the potential models laid out by Joseph something struck me as plainly wrong. From the section about R1a and Vedic Aryans (emphasis mine):
In the ancient DNA from Rakhigarhi, scientists identify R1a, one of the hundreds of Y-DNA haplogroups (or male lineages that are passed on from fathers to sons). They also identify H2b — one of the hundreds of mt-DNA haplogroups (or female lineages that are passed on from mothers to daughters) — that has often been found in proximity to R1a.
There is no reason whatsoever to think that this would be the research finding, but if it is, it would cause a global convulsion in the fields of population genetics, history and linguistics. It would also cause great cheer among the advocates of the theory that says that the Indus Valley civilisation was Vedic Aryan.
And it goes on to postulate reasons why such a big fuss will be created about the potential finding of haplogroup R1a, and its implications for the Out-of-India Theory. A global convulsion, no less.
It seems that all new methods involving admixture analysis, PCA, and other statistical tools to study Human Ancestry are still irrelevant for most, and indeed that Archaeology and even Linguistics are at the service of the simplistic identification of ancient languages with modern haplogroup distributions.
I really hope some R1a subclade is found among the samples, so that stupidity can reach the lowest possible level in discussions among amateur geneticists obsessed with haplogroup R1a’s role in the expansion of Indo-European speakers. Maybe then the rest of us will be able to overcome these renewed moronic supremacist trends hidden behind supposedly objective migration models.
A preprint article by two of the most prolific researchers in Human Ancestry is out, and they request feedback: Ancient genomics: a new view into human prehistory and evolution, by Skoglund and Mathieson (2017). Right now, it is downloadable on Dropbox.
The first decade of ancient genomics has revolutionized the study of human prehistory and evolution. We review new insights based on ancient genomic data, including greatly increased resolution of the timing and structure of the out-of-Africa event, the diversification of present-day non-African populations, and the earliest expansions of those populations into Eurasia and America. Prehistoric genomes now document patterns of population continuity and change on every inhabited continent–in particular the effect of agricultural expansions in Africa, Europe and Oceania–and record a history of natural selection that shapes present-day phenotypic diversity. Despite these advances, much remains unknown, in particular about the genomic histories of Asia–the most populous continent, and Africa–the continent that contains the most genetic diversity. Ancient genomes from these and other regions, integrated with a growing understanding of the genomic basis of human phenotypic diversity, will be in focus during the next decade of research in the field.
The paper may be highly recommended as an introduction for anyone interested in the field of Human Ancestry in general.
The next substantial change is closely related to ancestry that by around 5000 BP extended over a region of more than 2000 miles of the Eurasian steppe, including in individuals associated with the Yamnaya Cultural Complex in far-eastern Europe (1; 38) and with the Afanasievo culture in the central Asian Altai mountains (1). This “steppe” ancestry is itself a mixture between ancestry that is related to Mesolithic hunter-gatherers of eastern Europe and ancestry that is related to both present-day populations (38) and Mesolithic hunter-gatherers (46) from the Caucasus mountains, and also to the populations of Neolithic (11), and Copper Age (56) Iran. Steppe ancestry appeared in southeastern Europe by 6000 BP (72), northeastern Europe around 5000 BP (47) and central Europe at the time of the Corded Ware Complex around 4600 BP (1; 38). These dates are reasonably tight constraints, because in each case there is no evidence of steppe ancestry in individuals immediately preceding these dates (47; 72). Gene flow on the steppe was extensive and bidirectional, as shown by the eastward flow of Anatolian Neolithic ancestry– reaching well into central Eurasia by the time of the Andronovo culture ~3500 BP (1)–and the westward flow of East Asian ancestry–found in individuals associated with the Iron Age Scythian culture close to the Black Sea ~2500 BP (143).
Copper and Bronze Age population movements (14; 78; Martiniano, 2017; 85; 112), as well as later movements in the Iron Age and Historical period (70; 119) further distributed steppe ancestry around Europe. Present-day western European populations can be modeled as mixtures of these three ancestry components (Mesolithic hunter-gatherer, Anatolian Neolithic and Steppe) (38; 57). In eastern Europe, further shifts in ancestry are the result of additional or distinct gene flow from Anatolia throughout the Neolithic and Bronze Age in the Aegean (42; 51; 55; 72; 87), and gene flow from Siberian-related populations in Finland and the Baltic region (38). East-west gene flow also brought new ancestry–related to populations from Copper Age Iran–to the Levant during the Copper and Bronze ages (39; 56).
The geographic structure of these population transformations gave rise to population structure of present-day Europe. For example Anatolian Neolithic ancestry is highest in southern European populations like Sardinians, and lowest in northern European populations (38). Steppe ancestry is at high frequency in north-central Europeans and low in the south. Isolation-by-distance may have contributed to these patterns to some extent, but the contribution must have been small. In much of Europe, extreme population discontinuity was the norm.
Featured image: from the article, “Major Holocene population movements and expansions that have been demonstrated using ancient DNA.”
Background: The Eneolithic (~5,500 yrBP) site of Verteba Cave in Western Ukraine contains the largest collection of human skeletal remains associated with the archaeological Cucuteni-Tripolye Culture. Their subsistence economy is based largely on agro-pastoralism and had some of the largest and most dense settlement sites during the Middle Neolithic in all of Europe. To help understand the evolutionary history of the Tripolye people, we performed mtDNA analyses on ancient human remains excavated from several chambers within the cave.
Results: Burials at Verteba Cave are largely commingled and secondary in nature. A total of 68 individual bone specimens were analyzed. Most of these specimens were found in association with well-defined Tripolye artifacts. We determined 28 mtDNA D-Loop (368 bp) sequences and defined 8 sequence types, belonging to haplogroups H, HV, W, K, and T. These results do not suggest continuity with local pre-Eneolithic peoples, but rather complete population replacement. We constructed maximum parsimonious networks from the data and generated population genetic statistics. Nucleotide diversity (π) is low among all sequence types and our network analysis indicates highly similar mtDNA sequence types for samples in chamber G3. Using different sample sizes due to the uncertainty in number of individuals (11, 28, or 15), we found Tajima’s D statistic to vary. When all sequence types are included (11 or 28), we do not find a trend for demographic expansion (negative but not significantly different from zero); however, when only samples from Site 7 (peak occupation) are included, we find a significantly negative value, indicative of demographic expansion.
Conclusions: Our results suggest individuals buried at Verteba Cave had overall low mtDNA diversity, most likely due to increased conflict among sedentary farmers and nomadic pastoralists to the East and North. Early Farmers tend to show demographic expansion. We find different signatures of demographic expansion for the Tripolye people that may be caused by existing population structure or the spatiotemporal nature of ancient data. Regardless, peoples of the Tripolye Culture are more closely related to early European farmers and lack genetic continuity with Mesolithic hunter-gatherers or pre-Eneolithic groups in Ukraine.
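For readers unfamiliar with the statistics mentioned in this abstract, the following Python sketch shows how the mean number of pairwise differences and Tajima's D can be computed from a set of aligned sequences, using the standard formula from Tajima (1989). The toy sequences are invented for illustration and have nothing to do with the Verteba Cave data; the function names are my own. Note that the mean number of pairwise differences computed here becomes the per-site nucleotide diversity π once divided by the sequence length.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns showing more than one state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences.

    Returns None when the statistic is undefined (fewer than four
    sequences, or no segregating sites).
    """
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Invented toy alignment, four sequences of four sites each.
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
print(mean_pairwise_differences(toy), segregating_sites(toy), tajimas_d(toy))
```

A clearly negative D reflects an excess of rare variants relative to the neutral expectation, which is why the authors read their significantly negative value as a sign of demographic expansion; with samples as small as those in the abstract, the statistic is sensitive to which individuals are included.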
Genetic finds keep supporting the long-lasting cultural and linguistic frontier that Anthony (2007) – among others – asserted existed in the North-West Pontic steppe in the Mesolithic and Neolithic, between western steppe cultures and farmers, while they disprove Kristiansen’s theories of Sredni Stog expansion in Kurgan waves with a mixture of GAC and Trypillia within the Corded Ware culture:
Previous ancient DNA studies showed that hunter-gatherers before 6,500 yrBP in Europe commonly had haplogroups U, U4, U5, and H, whereas hunter-gatherers after 6,500 yrBP in Europe had less frequency of haplogroup H than before. Haplogroups T and K appeared in hunter-gatherers only after 6,500 yrBP, indicating a degree of admixture in some places between farmers and hunter-gatherers. Farmers before and after 6,500 yrBP in Europe had haplogroups W, HV*, H, T, K, and these are also found in individuals buried at Verteba Cave. Therefore, our data point to a common ancestry with early European farmers. Our data also suggest population replacement. Mathieson et al. analyzed a number of Neolithic Ukrainian samples (petrous bone) from several sites in southern, northern, and western Ukraine, dating to ~8,500 – 6,000 yrBP, and found exclusively U (U4 and U5) mtDNA lineages. It should be noted that ‘Neolithic’ in this context does not mean the adoption of agriculture, but rather simply coinciding with a change in material culture. They also analyzed several Trypillian individuals from Verteba Cave (different samples from those included in this study). Similar to our findings, they found a wider diversity of mtDNA lineages, including H, HV, and T2b. These data, combined with our results, appear to confirm almost complete population replacement by individuals associated with the Tripolye Culture during the Middle to Late Neolithic.
The findings also hint at potential contacts of Yamna with Usatovo as predicted by Anthony (2007), or alternatively (lacking precise dates) at contacts with Corded Ware migrants:
Trypillians were very much a distinct people who most likely displaced local hunter-gatherers with little admixture. Haplogroup W was also observed in several specimens deriving from Site G3. Although we are unsure if all of these haplogroups come from a single or multiple individuals, this observation is interesting in that it is relatively rare and isolated among Neolithic samples. It has, however, been found in samples dating to the Bronze Age. In the study by Wilde et al., they found haplogroup W present in two samples from the Early Bronze Age associated with the Yamnaya and Usatovo cultures. The Usatovo culture (~ 3500 – 2500 BC) was found in Romania, Moldova, and southern Ukraine. It was the conglomeration of Tripolye and North Pontic steppe cultures. Therefore, this individual could link the Trypillian peoples to the Usatovo peoples and perhaps to the greater Yamnaya steppe migrations during the Bronze Age that led to the Corded Ware Culture.
On the other hand, an article written in terms of mtDNA haplogroup frequencies seems to offer too little proof of anything today. The lack of Y-DNA haplogroups and data on admixture makes their interpretations provisional, subject to change when these further data are published. Also, radiocarbon dating is only confident for individuals of one site (site 7), dated ca. 5,500 cal BP, while “other chambers in the cave are not as confidently dated”…
Many researchers have pointed to the huge “megasites” and construction of fortifications as evidence of intergroup hostilities among the Late Neolithic Tripolye archaeological culture. However, to date, very few skeletal remains have been analyzed for the types of traumatic injury that serve as direct evidence for violent conflict. In this study, we examine trauma on human remains from the Tripolye site of Verteba Cave in western Ukraine. The remains of 36 individuals, including 25 crania, were buried in the gypsum cave as secondary interments. The frequency of cranial trauma is 30-44%; among the 25 crania, six males, four females and one adult of indeterminate sex displayed cranial trauma. Of the 18 total fractures, 10 were significantly large and penetrating, suggesting lethal force. Over half of the trauma is located on the posterior aspect of the crania, suggesting the victims were attacked from behind. Sixteen of the fractures observed were perimortem and two were antemortem. The distribution and characteristics of the fractures suggest that some of the Tripolye individuals buried at Verteba Cave were victims of a lethal surprise attack. Resources were limited due to population growth and migration, leading to conflict over resource access. It is hypothesized that during this time of change burial in this cave aided in development of identity and ownership of the local territory.
Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.
There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.
This study kept confirming the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, though, which offers strong reasons to reject the Indo-European from the west hypothesis.
Here are first the PCA of samples included in this paper, and then the PCA of ancient Eurasians (Mathieson et al. 2017) and modern populations (Lazaridis et al. 2014) for comparison of similar clusters:
Haplogroup R1b-M269 comprises most Western European Y chromosomes; of its main branches, R1b-DF27 is by far the least known, and it appears to be highly prevalent only in Iberia. We have genotyped 1072 R1b-DF27 chromosomes for six additional SNPs and 17 Y-STRs in population samples from Spain, Portugal and France in order to further characterize this lineage and, in particular, to ascertain the time and place where it originated, as well as its subsequent dynamics. We found that R1b-DF27 is present in frequencies ~40% in Iberian populations and up to 70% in Basques, but it drops quickly to 6–20% in France. Overall, the age of R1b-DF27 is estimated at ~4,200 years ago, at the transition between the Neolithic and the Bronze Age, when the Y chromosome landscape of W Europe was thoroughly remodeled. In spite of its high frequency in Basques, Y-STR internal diversity of R1b-DF27 is lower there, and results in more recent age estimates; NE Iberia is the most likely place of origin of DF27. Subhaplogroup frequencies within R1b-DF27 are geographically structured, and show domains that are reminiscent of the pre-Roman Celtic/Iberian division, or of the medieval Christian kingdoms.
Some people like to say that Y-DNA haplogroup analysis, or phylogeography in general, is of no use anymore (especially modern phylogeography), and they are content to see how ‘steppe admixture’ was (or even is) distributed in Europe to draw conclusions about ancient languages and their expansion. With each new paper, we are seeing the advantages of analysing ancient and modern haplogroups in ascertaining population movements.
Quite recently there was a suggestion based on steppe admixture that Basque-speaking Iberians resisted the invasion from the steppe. Observing the results of this article (dates of expansion and demographic data) we see a clear expansion of Y-DNA haplogroups precisely by the time of Bell Beaker expansion from the east. Y-DNA haplogroups of ancient samples from Portugal point exactly to the same conclusion.
The recent article on Mycenaean and Minoan genetics also showed that, when it comes to Europe, most of the demographic patterns we see in admixture are reminiscent of the previous situation; only rarely can we see a clear change in admixture (which would mean an important, sudden replacement of the previous population).
The following are excerpts from the article (emphasis is mine):
Dates and expansions
The average STR variance of DF27 and each subhaplogroup is presented in Suppl. Table 2. As expected, internal diversity was higher in the deeper, older branches of the phylogeny. If the same diversity was divided by population, the most salient finding is that native Basques (Table 2) have a lower diversity than other populations, which contrasts with the fact that DF27 is notably more frequent in Basques than elsewhere in Iberia (Suppl. Table 1). Diversity can also be measured as pairwise differences distributions (Fig. 5). The distribution of mean pairwise differences within Z195 sits practically on top of that of DF27; L176.2 and Z220 have similar distributions, as M167 and Z278 have as well; finally, M153 shows the lowest pairwise distribution values. This pattern is likely to reflect the respective ages of the haplogroups, which we have estimated by a modified, weighted version of the ρ statistic (see Methods).
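As a rough illustration of how an age can be derived from STR diversity, here is a Python sketch of the plain, unweighted ρ statistic: the average number of repeat-unit steps separating each sampled haplotype from a founder haplotype, divided by the total mutation rate. The authors use a modified, weighted version described in their Methods, which this sketch does not reproduce. The haplotypes, the per-locus mutation rates and the use of the modal haplotype as founder are all assumptions made only for the example.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common repeat count at each locus, used here as the assumed founder."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*haplotypes))


def rho(haplotypes, founder):
    """Mean number of single-repeat steps between each haplotype and the founder."""
    steps = sum(abs(allele - root) for hap in haplotypes for allele, root in zip(hap, founder))
    return steps / len(haplotypes)


def age_in_years(haplotypes, rates_per_generation, years_per_generation):
    """Unweighted rho age: rho divided by the summed per-locus mutation rate."""
    founder = modal_haplotype(haplotypes)
    generations = rho(haplotypes, founder) / sum(rates_per_generation)
    return generations * years_per_generation


# Invented three-locus haplotypes and hypothetical mutation rates.
haps = [(14, 12, 24), (14, 13, 24), (15, 12, 24), (14, 12, 24)]
rates = [0.002, 0.002, 0.001]
print(age_in_years(haps, rates, years_per_generation=30))
```

The sketch also makes clear why a population with lower internal diversity, such as the Basques in this study, yields a younger age estimate for the same haplogroup: fewer steps from the founder translate directly into fewer generations.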
Z195 seems to have appeared almost simultaneously within DF27, since its estimated age is actually older (4570 ± 140 ya). Of the two branches stemming from Z195, L176.2 seems to be slightly younger than Z220 (2960 ± 230 ya vs. 3320 ± 200 ya), although the confidence intervals slightly overlap. M167 is clearly younger, at 2600 ± 250 ya, a similar age to that of Z278 (2740 ± 270 ya). Finally, M153 is estimated to have appeared just 1930 ± 470 ya.
Haplogroup ages can also be estimated within each population, although they should be interpreted with caution (see Discussion). For the whole of DF27, (Table 3), the highest estimate was in Aragon (4530 ± 700 ya), and the lowest in France (3430 ± 520 ya); it was 3930 ± 310 ya in Basques. Z195 was apparently oldest in Catalonia (4580 ± 240 ya), and with France (3450 ± 269 ya) and the Basques (3260 ± 198 ya) having lower estimates. On the contrary, in the Z220 branch, the oldest estimates appear in North-Central Spain (3720 ± 313 ya for Z220, 3420 ± 349 ya for Z278). The Basques always produce lower estimates, even for M153, which is almost absent elsewhere.
The median value for Tstart has been estimated at 103 generations (Table 4), with a 95% highest probability density (HPD) range of 50–287 generations; effective population size increased from 131 (95% HPD: 100–370) to 72,811 (95% HPD: 52,522–95,334). Considering patrilineal generation times of 30–35 years, our results indicate that R1b-DF27 started its expansion ~3,000–3,500 ya, shortly after its TMRCA.
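The conversion from generations to years in this excerpt is simple arithmetic, sketched below in Python with the figures quoted above (median of 103 generations, HPD bounds of 50 and 287, generation times of 30 to 35 years). The function names are mine. Multiplying the median out gives roughly 3,100 to 3,600 years, slightly above the rounded ~3,000–3,500 ya stated by the authors, while the HPD bounds span a far wider window.

```python
GENERATION_TIMES = (30, 35)  # patrilineal generation times in years, as quoted


def generations_to_years(generations, years_per_generation):
    """Convert a time expressed in generations to years."""
    return generations * years_per_generation


def year_range(generations, generation_times=GENERATION_TIMES):
    """Years corresponding to each assumed generation time."""
    return tuple(generations_to_years(generations, g) for g in generation_times)


print(year_range(103))  # median Tstart: (3090, 3605)
print(year_range(50))   # lower HPD bound: (1500, 1750)
print(year_range(287))  # upper HPD bound: (8610, 10045)
```

Seen this way, the interval runs from about 1,500 to about 10,000 years, which is why the authors caution in their discussion that the HPD intervals encompass a broad timeframe.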
As a reference, we applied the same analysis to the whole of R1b-S116, as well as to other common haplogroups such as G2a, I2, and J2a. Interestingly, all four haplogroups showed clear evidence of an expansion (p > 0.99 in all cases), all of them starting at the same time, ~50 generations ago (Table 4), and with similar estimated initial and final populations. Thus, these four haplogroups point to a common population expansion, even though I2 (TMRCA, weighted ρ, 7,800 ya) and J2a (TMRCA, 5,500 ya) are older than R1b-DF27. It is worth noting that the expansion of these haplogroups happened after the TMRCA of R1b-DF27.
Summary and discussion
We have characterized the geographical distribution and phylogenetic structure of haplogroup R1b-DF27 in W. Europe, particularly in Iberia, where it reaches its highest frequencies (40–70%). The age of this haplogroup appears clear: with independent samples (our samples vs. the 1000 Genomes Project dataset) and independent methods (variation in 15 STRs vs. whole Y-chromosome sequences), the age of R1b-DF27 is firmly grounded around 4000–4500 ya, which coincides with the population upheaval in W. Europe at the transition between the Neolithic and the Bronze Age. Before this period, R1b-M269 was rare in the ancient DNA record, and during it the current frequencies were rapidly reached. It is also one of the haplogroups (along with its daughter clades, R1b-U106 and R1b-S116) with a sequence structure that shows signs of a population explosion or burst. STR diversity in our dataset is much more compatible with population growth than with stationarity, as shown by the ABC results, but, contrary to other haplogroups such as the whole of R1b-S116, G2a, I2 or J2a, the start of this growth is closer to the TMRCA of the haplogroup. Although the median time for the start of the expansion is older in R1b-DF27 than in other haplogroups, and could suggest the action of a different demographic process, all HPD intervals broadly overlap, and thus, a common demographic history may have affected the whole of the Y chromosome diversity in Iberia. The HPD intervals encompass a broad timeframe, and could reflect the post-Neolithic population expansions from the Bronze Age to the Roman Empire.
While when R1b-DF27 appeared seems clear, where it originated may be more difficult to pinpoint. If we extrapolated directly from haplogroup frequencies, then R1b-DF27 would have originated in the Basque Country; however, for R1b-DF27 and most of its subhaplogroups, internal diversity measures and age estimates are lower in Basques than in any other population. Then, the high frequencies of R1b-DF27 among Basques could be better explained by drift rather than by a local origin (except for the case of M153; see below), which could also have decreased the internal diversity of R1b-DF27 among Basques. An origin of R1b-DF27 outside the Iberian Peninsula could also be contemplated, and could mirror the external origin of R1b-M269, even if it reaches there its highest frequencies. However, the search for an external origin would be limited to France and Great Britain; R1b-DF27 seems to be rare or absent elsewhere: Y-STR data are available only for France, and point to a lower diversity and more recent ages than in Iberia (Table 3). Unlike in Basques, drift in a traditionally closed population seems an unlikely explanation for this pattern, and therefore, it does not seem probable that R1b-DF27 originated in France. Then, a local origin in Iberia seems the most plausible hypothesis. Within Iberia, Aragon shows the highest diversity and age estimates for R1b-DF27, Z195, and the L176.2 branch, although, given the small sample size, any conclusion should be taken cautiously. On the contrary, Z220 and Z278 are estimated to be older in North Central Spain (N Castile, Cantabria and Asturias). Finally, M153 is almost restricted to the Basque Country: it is rarely present at frequencies >1% elsewhere in Spain (although see the cases of Alacant, Andalusia and Madrid, Suppl. Table 1), and it was found at higher frequencies (10–17%) in several Basque regions; a local origin seems plausible, but, given the scarcity of M153 chromosomes outside of the Basque Country, the diversity and age values cannot be compared.
Within its range, R1b-DF27 shows some geographical differentiation: Western Iberia (particularly Asturias and Portugal) has low frequencies of R1b-Z195 derived chromosomes and relatively high values of R1b-DF27* (xZ195); North Central Spain is characterized by relatively high frequencies of the Z220 branch compared to the L176.2 branch; the latter is more abundant in Eastern Iberia. Taken together, these observations seem to match the East-West patterning that has occurred at least twice in the history of Iberia: i) in pre-Roman times, with Celtic-speaking peoples occupying the center and west of the Iberian Peninsula, while the non-Indoeuropean eponymous Iberians settled the Mediterranean coast and hinterland; and ii) in the Middle Ages, when Christian kingdoms in the North expanded gradually southwards and occupied territories held by Muslim fiefs.
I wouldn’t trust the absence of R1b-DF27 outside France as proof that its origin must be in Western Europe – especially since we have ancient DNA, and that assertion might prove quite wrong – but aside from that, the article seems solid in its analysis of modern populations.
Archaeological studies sample ancient human populations one site at a time, often limited to a fraction of the regions and periods occupied by a given group. While this bias is known and discussed in the literature, few model populations span areas as large and unforgiving as the Yakuts of Eastern Siberia. We systematically surveyed 31,000 square kilometres in the Sakha Republic (Yakutia) and completed the archaeological study of 174 frozen graves, assembled between the 15th and the 19th century. We analysed genetic data (autosomal genotypes, Y-chromosome haplotypes and mitochondrial haplotypes) for all ancient subjects and compared them with the study of 190 modern subjects from the same area and the same population. Ancient familial links and paternal clans were identified between graves up to 1500 km apart, and we provide new data concerning the origins of the contemporary Yakut population and demonstrate that cultural similarities in the past were linked to (i) the expansion of specific paternal clans, (ii) preferential marriage among the elites and (iii) funeral choices that could constitute a bias in any ancient population study.
Even if you are not interested in the cultural and anthropological evolution of this Turkic-speaking people of the Russian Far Eastern region, the method used is an excellent example of how to use archaeology and genetics (especially Y-DNA and mtDNA data) to obtain meaningful results when investigating ancient populations.
For quite some time, probably since the first renowned admixture analyses of ancient DNA samples were published, we have been living under the impression that phylogeography, or simply archaeogenetics as it was called back in the day, is not needed.
Cavalli-Sforza’s assertion that the study of modern populations could offer a clear picture of past population movements is now considered wrong, and the study of Y-DNA and mtDNA haplogroups is today mostly disregarded as of secondary importance, even among geneticists. Whole-genome investigations (and especially admixture analyses) have been leading the new wave of overconfidence in genetic results, closely tied to ignorance of their shortcomings (and to commercial interests based on desires of ethnic identification), and haplogroups are usually just reported alongside other, not entirely meaningful aspects of ancient DNA analyses.
While it is undeniable that admixture analyses are offering quite interesting results, they must be carefully balanced against known archaeological and linguistic knowledge. Phylogeography – and especially Y-DNA haplogroup assessment – is quite interesting in investigating kinship and clans in patrilocal communities – i.e. most communities in prehistoric and historic periods, unless proven otherwise.
Luckily enough, there are those researchers who still strive to obtain meaningful information from haplotypes. The article referenced in this post is quite interesting due to its phylogeographic method’s applicability to ancient cultures and peoples.
When some geneticists look at simplistic prehistoric maps, like those depicting Yamna, Afanasevo, Corded Ware, and Bell Beaker cultures together, they forget that 1) cultural regions are selected more or less arbitrarily (we only have certain scattered sites for each of these cultures); 2) economic or population contacts are difficult to ascertain and to represent graphically; and 3) time periods for archaeological sites are important – in fact, they are probably THE most important aspect in assessing how accurately a map (and its “arrows” of migration or exchange) represents reality.
A later expansion of other subclades – particularly Y-DNA N1c – was probably associated with the later western expansion of the Eurasian Seima-Turbino phenomenon, and its current prevalence in Finnish Y-DNA haplogroups might have been the consequence of the population decline ca. 1500 BC, and later Iron Age population bottleneck (with the population peak ca. 500 AD) described in the article.
That would more naturally explain the ‘cultural diffusion’ of Finnic languages into invading eastern N1c lineages, a diffusion which would have been in fact a long-term, quite gradual replacement of previously prevalent Y-DNA R1a subclades in the region, as supported by the prevalent “steppe” component in genome-wide ancestry of Finns.
Therefore, there were probably no sudden, strong population (and thus cultural) changes associated with the arrival of N1c lineages, like the ones seen with R1a (Corded Ware / Uralic) and R1b (Yamna / Proto-Indo-European) expansions in Europe.
How the Saami fit into this scheme is not yet obvious, though.
In Europe, modern mitochondrial diversity is relatively homogeneous and suggests a ubiquitous rapid population growth since the Neolithic revolution. Similar patterns have also been observed in mitochondrial control region data in Finland, which contrasts with the distinctive autosomal and Y-chromosomal diversity among Finns. A different picture emerges from the 843 whole mitochondrial genomes from modern Finns analyzed here. Up to one third of the subhaplogroups can be considered as Finn-characteristic, i.e. rather common in Finland but virtually absent or rare elsewhere in Europe. Bayesian phylogenetic analyses suggest that most of these attributed Finnish lineages date back to around 3,000–5,000 years, coinciding with the arrival of Corded Ware culture and agriculture into Finland. Bayesian estimation of past effective population sizes reveals two differing demographic histories: 1) the ‘local’ Finnish mtDNA haplotypes yielding small and dwindling size estimates for most of the past; and 2) the ‘immigrant’ haplotypes showing growth typical of most European populations. The results based on the local diversity are more in line with what is known about Finns from other studies, e.g., Y-chromosome analyses and archaeological findings. The mitochondrial gene pool thus may contain signals of local population history that cannot be readily deduced from the total diversity.
From its results:
In general, there appear to be two loose and largely overlapping clusters among the Finn-characteristic haplogroups: the first between 1,000–2,000 ybp and the second around 3,300–5,500 ybp. The age of the older cluster coincides temporally with the arrival of the Corded-Ware culture and, notably, the spread of agriculture in Finland. The arrival and spread of agriculture, temporally corresponding with the age estimates for most of the haplogroups characteristic of Finns, might be a sign of population size increase enabled by the new mode of subsistence, resulting in reduced drift and accumulation of genetic diversity in the population.
Another insight into past population sizes in Finland is based on radiocarbon-dated archaeological findings in different time periods. These analyses suggest two prehistoric population peaks in Finland, the Stone Age peak (c. 5,500 ybp) and the Metal Age peak (~1,500 ybp). Both of these peaks were followed by a population decline, which appears to have reached its ebb around 3,500 ybp. These developments are not distinguishable in the BSPs. However, these ages correspond well to the two haplogroup age clusters described above. The presumably less severe Iron Age population bottleneck seen in the archaeological data, 1,500–1,300 ybp, temporally coincides with the population size reduction visible for the Finn-characteristic subhaplogroups.