The Phase 3 sequence data from 20 populations, comprising five populations for each of the four main geographical regions of Europe, East Asia, South Asia and Africa, were downloaded from the 1000 Genomes Project website (www.1000genomes.org/data), including whole mitochondrial genome data for 1999 individuals. We decided not to analyse populations from the Americas due to the region’s complex history of admixture [13,14].
The European populations were as follows: Finnish sampled in Finland (FIN); European Caucasians resident in Utah, USA (CEU); British in England and Scotland (GBR); an Iberian population from Spain (IBS) and Toscani from Italy (TSI). Representing East Asia were the Han Chinese in Beijing (CHB); Southern Han Chinese (CHS); Dai Chinese from Xishuangbanna, China (CDX); Kinh population from Ho Chi Minh City, Vietnam (KHV) and Japanese from Tokyo (JPT). The South Asian populations were Punjabi Indians from Lahore, Pakistan (PJL); Gujarati Indians in Houston, USA (GIH) as well as Indian Telugu sampled in the UK (ITU); Bengali from Bangladesh (BEB) and Sri Lankan Tamil from the UK (STU). (…)
We analysed our mtDNA data with the extended Bayesian skyline plot (EBSP) method, a Bayesian, non-parametric technique for inferring past population size fluctuations from genetic data. Building on the previous Bayesian skyline plot (BSP) approach, EBSP uses a piecewise-linear model and Markov chain Monte Carlo (MCMC) methods to reconstruct a population’s demographic history and is implemented in the software package BEAST v. 2.3.2. Alignments for each of the 20 populations were loaded separately into the Bayesian Evolutionary Analysis Utility tool (BEAUti v. 2.3.2) in NEXUS format.
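To make the piecewise-linear idea concrete, the following is a minimal sketch of how a population-size curve of that shape can be evaluated at an arbitrary time before present. It is not BEAST's implementation: the function name, the change-point times and the population sizes are all hypothetical, chosen only to mimic a stable size followed by growth towards the present.

```python
from bisect import bisect_right


def piecewise_linear_size(t, times, sizes):
    """Population size at time t (years before present) under a
    piecewise-linear model.

    times: increasing change-point times, times[0] being the present.
    sizes: population size at each change point.
    Outside the covered range the size is held constant.
    """
    if t <= times[0]:
        return sizes[0]
    if t >= times[-1]:
        return sizes[-1]
    i = bisect_right(times, t) - 1          # segment that contains t
    t0, t1 = times[i], times[i + 1]
    n0, n1 = sizes[i], sizes[i + 1]
    # Linear interpolation between the two neighbouring change points.
    return n0 + (n1 - n0) * (t - t0) / (t1 - t0)


# Hypothetical profile: constant size before 14 ka, then linear growth.
times = [0, 14_000, 50_000]
sizes = [100_000, 10_000, 10_000]
for t in (0, 7_000, 20_000):
    print(t, piecewise_linear_size(t, times, sizes))
```

In an actual EBSP analysis the number and position of the change points are themselves sampled by the MCMC, so the curve above corresponds to a single posterior sample rather than to the plotted median profile.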
Regional demographic histories
The five European profiles are presented in figure 2. The four southerly populations all show profiles with a stable size up to approximately 14 ka followed by a sudden, rapid increase that becomes progressively less steep towards the present. There is also a north-south trend, with confidence intervals becoming broader towards the north, particularly for the oldest time-points. The Finnish population profile appears rather different, but this is to be expected both because it is so far north and because previous studies have identified Finns as a strong genetic outlier in Europe [19–22].
The five profiles for South Asia are shown in figure 3. All populations reveal a period of rapid growth approximately 45–40 ka which then slows. Near the present the two southerly populations, GIH and STU, both show evidence of a decline. However, this may be due to these samples being drawn from populations no longer living on the subcontinent, with the downward trend capturing a bottleneck associated with moving to Europe/America, perhaps accentuated by the tendency for immigrant populations to group by region, religion and race.
Native Americans from the Amazon, Andes, and coastal geographic regions of South America have a rich cultural heritage but are genetically understudied, therefore leading to gaps in our knowledge of their genomic architecture and demographic history. In this study, we sequence 150 genomes to high coverage combined with an additional 130 genotype array samples from Native American and mestizo populations in Peru. The majority of our samples possess greater than 90% Native American ancestry, which makes this the most extensive Native American sequencing project to date. Demographic modeling reveals that the peopling of Peru began ∼12,000 y ago, consistent with the hypothesis of the rapid peopling of the Americas and Peruvian archeological data. We find that the Native American populations possess distinct ancestral divisions, whereas the mestizo groups were admixtures of multiple Native American communities that occurred before and during the Inca Empire and Spanish rule. In addition, the mestizo communities also show Spanish introgression largely following Peruvian Independence, nearly 300 y after Spain conquered Peru. Further, we estimate migration events between Peruvian populations from all three geographic regions with the majority of between-region migration moving from the high Andes to the low-altitude Amazon and coast. As such, we present a detailed model of the evolutionary dynamics which impacted the genomes of modern-day Peruvians and a Native American ancestry dataset that will serve as a beneficial resource to addressing the underrepresentation of Native American ancestry in sequencing studies.
The high frequency of Native American mitochondrial haplotypes suggests that European males were the primary source of European admixture with Native Americans, as previously found (23, 24, 41, 42). The only Peruvian populations that have a proportion of the Central American component are in the Amazon (Fig. 2A). This is supported by Homburger et al. (4), who also found Central American admixture in other Amazonian populations and could represent ancient shared ancestry or a recent migration between Central America and the Amazon.
Following the peopling of Peru, we find a complex history of admixture between Native American populations from multiple geographic regions (Figs. 2B and 3 A and C). This likely began before the Inca Empire due to Native American and mestizo groups sharing IBD segments that correspond to the time before the Inca Empire. However, the Inca Empire likely influenced this pattern due to their policy of forced migrations, known as “mitma” (mitmay in Quechua) (28, 31, 37), which moved large numbers of individuals to incorporate them into the Inca Empire. We can clearly see the influence of the Inca through IBD sharing where the center of dominance in Peru is in the Andes during the Inca Empire (Fig. 3C).
A similar policy of large-scale consolidation of multiple Native American populations was continued during Spanish rule through their program of reducciones, or reductions (31, 32), which is consistent with the hypothesis that the Inca and Spanish had a profound impact on Peruvian demography (25). The result of these movements of people created early New World cosmopolitan communities with genetic diversity from the Andes, Amazon, and coast regions as is evidenced by mestizo populations’ ancestry proportions (Fig. 3A). Following Peruvian independence, these cosmopolitan populations were those same ones that predominantly admixed with the Spanish (Fig. 3B). Therefore, this supports our model that the Inca Empire and Spanish colonial rule created these diverse populations as a result of admixture between multiple Native American ancestries, which would then go on to become the modern mestizo populations by admixing with the Spanish after Peruvian independence.
Further, it is interesting that this admixture began before the urbanization of Peru (26) because others suspected the urbanization process would greatly impact the ancestry patterns in these urban centers (25). (…)
We analyzed 391 samples from 12 Argentinian populations from the Center-West, East and North-West regions with the Illumina Human Exome Beadchip v1.0 (HumanExome-12v1-A). We performed principal component analysis to infer patterns of population divergence and migration. We identified proportions and patterns of European, African and Native American ancestry and found a correlation between distance to Buenos Aires and proportion of Native American ancestry, where the highest proportion corresponds to the northernmost populations, which are also the furthest from the Argentinian capital. Most of the European sources are of South European origin, matching historical records, and we see two different Native American components, one that spreads all over Argentina and another that is specifically Andean. The highest percentages of African ancestry were in the Center-West of Argentina, where the old trade routes took the slaves from Buenos Aires to Chile and Peru. Subcontinentally, sources of this African component are represented by both West Africa and groups influenced by the Bantu expansion, the second slightly higher than the first, unlike North America and the Caribbean, where the main source is West Africa. This is reasonable, considering that a large proportion of the ships arriving at the Southern Hemisphere came from Mozambique, Loango and Angola.
We present new data and analysis on the genetic variation of contemporary inhabitants of central Argentina, including a total of 812 unrelated individuals from 20 populations. Our goal was to bring new elements for understanding micro-evolutionary and historical processes that generated the genetic diversity of the region, using molecular markers of uniparental inheritance (mitochondrial DNA and Y chromosome). Almost 76% of the individuals show mitochondrial lineages of American origin. The Native American haplogroups predominate in all surveyed localities, except in one. The larger presence of Eurasian maternal lineages was observed in the plains (Pampas) of the southeast, whereas the African lineages are more frequent in northern Córdoba. On the other hand, the analysis of 258 male samples reveals that 92% of them present Eurasian paternal lineages, 7% carry Native American haplogroups, and only 1% of the males show African lineages. The maternal lineages have high genetic diversity homogeneously distributed throughout central Argentina, probably as a result of a recent common origin and sustained gene flow. Migratory events that occurred in colonial and recent times should have contributed to hiding any traces of differentiation that might have existed in the past. The analysis of paternal lineages also showed a homogeneous distribution of the variation together with a drastic reduction of the native male population.
The immigration waves had less impact in the north–central and northwestern regions, the most populated areas of the country in pre-Hispanic times. The spatial structure of genetic diversity has its origins in historical factors. It is possible to distinguish different stages in migratory processes from abroad, with a heterogeneous regional impact. The genetic composition of central Argentina gives account of these processes. On one hand, the political boundaries between provinces influenced the configuration of the genetic structure of the populations that were formed. In this sense, Córdoba—an important economic and commercial center since colonial times—has a greater component of foreign lineages than the populations of San Luis and Santiago del Estero. On the other hand, the genetic structure of central Argentina also accounts for other processes related to different migration phases and occupations of space over the last 500 years.
Similarly, negative values observed in the neutrality tests (Tajima’s D and Fu’s FS) indicate relatively recent population growth, probably associated with technological and organizational changes leading to new lifestyles and important demographic and territorial expansion. In conclusion, the molecular markers of maternal inheritance show large genetic diversity homogeneously distributed throughout central Argentina, probably as a result of a recent common origin and sustained gene flow between sub-populations. In addition, migratory events that occurred in colonial and recent times should have contributed to hiding any traces of differentiation that might have existed in the past. The analysis of paternal lineages also showed a homogeneous distribution of the variation across the region, but also a drastic reduction of the native male population, with a large prevalence of haplogroups of European origin.
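For readers unfamiliar with the neutrality test mentioned above, here is a small sketch of Tajima's D computed from an alignment, following the standard 1989 formula. The function name and the toy sequences are hypothetical and are not taken from the study; the point is only that an excess of rare variants (as expected after recent growth) pushes D below zero, while an excess of intermediate-frequency variants pushes it above zero.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    # S: number of segregating (polymorphic) alignment columns.
    S = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if S == 0:
        return math.nan  # D is undefined without polymorphism
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Toy alignment in which every variant is a singleton (rare alleles only).
singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
# Toy alignment in which every variant sits at 50% frequency.
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(singletons), tajimas_d(balanced))
```

Fu's FS follows a different logic (it is based on the number of observed haplotypes) and is not covered by this sketch.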
The Hadza and Sandawe populations in present-day Tanzania speak languages containing click sounds and are therefore thought to be distantly related to southern African Khoisan languages. We analyzed genome-wide genotype data for individuals sampled from the Hadza and Sandawe populations in the context of a global data set of 3,528 individuals from 163 ethno-linguistic groups. We found that Hadza and Sandawe individuals share ancestry distinct from and most closely related to Omotic ancestry; share Khoisan ancestry with populations such as ≠Khomani, Karretjie, and Ju/’hoansi in southern Africa; share Niger-Congo ancestry with populations such as Yoruba from Nigeria and Luhya from Kenya, consistent with migration associated with the Bantu Expansion; and share Cushitic ancestry with Somali, multiple Ethiopian populations, the Maasai population in Kenya, and the Nama population in Namibia. We detected evidence for low levels of Arabian, Nilo-Saharan, and Pygmy ancestries in a minority of individuals. Our results indicate that west Eurasian ancestry in eastern Africa is more precisely the Arabian parent of Cushitic ancestry. Relative to the Out-of-Africa migrations, Hadza ancestry emerged early whereas Sandawe ancestry emerged late.
In the Hadza population, the distribution of Y chromosomes includes mostly B2 haplogroups, with a smaller number of E1b1a haplogroups, which are common in Niger-Congo-speaking populations, and E1b1b haplogroups, which are common in Cushitic populations (Tishkoff, et al. 2007). In the Sandawe population, E1b1a and E1b1b haplogroups are more common, with lower frequencies of B2 and A3b2 haplogroups (Tishkoff, et al. 2007).
We found that Hadza ancestry diverged early, rather than late. We found evidence for contributions of Cushitic and Niger-Congo ancestries in Tanzania, consistent with the movements of herding and cultivating Cushitic speakers ~4,000 years ago and agricultural Niger-Congo speakers ~2,500 years ago (Newman 1995). However, we did not find evidence of a substantial contribution of Nilo-Saharan ancestry that might have resulted from movement of pastoralist Nilo-Saharan speakers (Newman 1995). We also identified west Eurasian ancestry in eastern and southern African populations more precisely as the Arabian parent of Cushitic ancestry. Finally, our ancestry analyses support the hypothesis that Omotic, Hadza, and Sandawe languages group together, rather than Omotic languages belonging to the Afroasiatic family and Hadza and Sandawe languages belonging to the Khoisan family.
Understanding how deleterious genetic variation is distributed across human populations is of key importance in evolutionary biology and medical genetics. However, the impact of population size changes and gene flow on the corresponding mutational load remains a controversial topic. Here, we report high-coverage exomes from 300 rainforest hunter-gatherers and farmers of central Africa, whose distinct subsistence strategies are expected to have impacted their demographic pasts. Detailed demographic inference indicates that hunter-gatherers and farmers recently experienced population collapses and expansions, respectively, accompanied by increased gene flow. We show that the distribution of deleterious alleles across these populations is compatible with a similar efficacy of selection to remove deleterious variants with additive effects, and predict with simulations that their present-day additive mutation load is almost identical. For recessive mutations, although an increased load is predicted for hunter-gatherers, this increase has probably been partially counteracted by strong gene flow from expanding farmers. Collectively, our predicted and empirical observations suggest that the impact of the recent population decline of African hunter-gatherers on their mutation load has been modest and more restrained than would be expected under a fully recessive model of dominance.
The expansion of peoples is known to be associated with the spread of a certain admixture component, together with the expansion and reduction in variability of a haplogroup. In other words, a few male lineages are usually more successful during the expansion.
E-M183 (E-M81) is the most frequent paternal lineage in North Africa and thus it must be considered to explore past historical and demographical processes. Here, by using whole Y chromosome sequences from 32 North African individuals, we have identified five new branches within E-M183. The validation of these variants in more than 200 North African samples, from which we also have information of 13 Y-STRs, has revealed a strong resemblance among E-M183 Y-STR haplotypes that pointed to a rapid expansion of this haplogroup. Moreover, for the first time, by using both SNP and STR data, we have provided updated estimates of the times-to-the-most-recent-common-ancestor (TMRCA) for E-M183, which evidenced an extremely recent origin of this haplogroup (2,000–3,000 ya). Our results also showed a lack of population structure within the E-M183 branch, which could be explained by the recent and rapid expansion of this haplogroup. In spite of a reduction in STR heterozygosity towards the West, which would point to an origin in the Near East, ancient DNA evidence together with our TMRCA estimates point to a local origin of E-M183 in NW Africa.
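As an illustration of what a "reduction in STR heterozygosity" measures, the sketch below computes Nei's unbiased gene diversity per locus and averages it across loci. The abstract does not say which estimator the authors used, so this choice is an assumption, and the function names and repeat counts are hypothetical.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's unbiased gene diversity for one locus.

    alleles: list of allele calls (e.g. STR repeat counts), one per sampled
    chromosome. Returns n / (n - 1) * (1 - sum(p_i ** 2)).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(alleles)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - homozygosity)


def mean_str_heterozygosity(haplotypes):
    """Average gene diversity across loci.

    haplotypes: list of equal-length tuples, one repeat count per Y-STR locus.
    """
    loci = list(zip(*haplotypes))  # transpose: one tuple of alleles per locus
    return sum(gene_diversity(list(locus)) for locus in loci) / len(loci)


# Hypothetical two-locus haplotypes from four men.
sample = [(13, 30), (13, 30), (14, 30), (14, 30)]
print(mean_str_heterozygosity(sample))
```

Comparing this average between sampling locations ordered from east to west is one way a longitudinal cline in diversity could be examined.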
An interesting excerpt, from the discussion:
Regarding the geographical origin of E-M183, a previous study suggested that an expansion from the Near East could explain the observed east-west cline of genetic variation that extends into the Near East. Indeed, our results also showed a reduction in STR heterozygosity towards the West, which may be taken to support the hypothesis of an expansion from the Near East. In addition, previous studies based on genome-wide SNPs reported that a North African autochthonous component increases towards the West whereas the Near Eastern decreases towards the same direction, which again supports an expansion from the Near East. However, our correlations should be taken carefully because our analysis includes only six locations on the longitudinal axis, none from the Near East. As a result, we do not have sufficient statistical power to confirm a Near Eastern origin. In addition, rather than showing a west-to-east cline of genetic diversity, the overall picture shown by this correlation analysis evidences just low genetic diversity in Western Sahara, which indeed could be also caused by the small sample size (n = 26) in this region. Alternatively, given the high frequency of E-M183 in the Maghreb, a local origin of E-M183 in NW Africa could be envisaged, which would fit the clear pattern of longitudinal isolation by distance reported in genome-wide studies. Moreover, the presence of autochthonous North African E-M81 lineages in the indigenous population of the Canary Islands strongly points to North Africa as the most probable origin of the Guanche ancestors. This, together with the fact that the oldest indigenous individuals have been dated 2210 ± 60 ya, supports a local origin of E-M183 in NW Africa. Within this scenario, it is also worth mentioning that the paternal lineage of an early Neolithic Moroccan individual appeared to be distantly related to the typically North African E-M81 haplogroup [30], suggesting again a NW African origin of E-M183.
A local origin of E-M183 in NW Africa > 2200 ya is supported by our TMRCA estimates, which can be taken as 2,000–3,000 ya, depending on the data, methods, and mutation rates used.
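To show why such estimates depend on the mutation rate, here is a rough sketch of one common STR-based approach: the average squared distance (ASD) between sampled haplotypes and an assumed ancestral haplotype, which under a strict single-step mutation model grows as mutation rate times generations. This is not necessarily the method used in the paper; the function name, the use of the modal haplotype as ancestor, the mutation rate and the 30-year generation time are all assumptions made for illustration.

```python
from collections import Counter


def asd_tmrca(haplotypes, mu_per_generation, years_per_generation=30):
    """Rough TMRCA in years from Y-STR haplotypes via the ASD statistic.

    haplotypes: list of equal-length tuples of repeat counts.
    mu_per_generation: assumed per-locus mutation rate per generation.
    The ancestral haplotype is approximated by the modal allele per locus.
    """
    loci = list(zip(*haplotypes))  # one tuple of alleles per locus
    ancestral = [Counter(locus).most_common(1)[0][0] for locus in loci]
    # Mean squared repeat-count difference from the assumed ancestor,
    # averaged over haplotypes and loci.
    total = sum(
        (allele - anc) ** 2
        for locus, anc in zip(loci, ancestral)
        for allele in locus
    )
    asd = total / (len(haplotypes) * len(loci))
    generations = asd / mu_per_generation
    return generations * years_per_generation


# Hypothetical single-locus data: modal allele 13, two one-step mutants.
toy = [(13,), (14,), (13,), (12,)]
for mu in (0.002, 0.004):  # assumed rates; doubling mu halves the estimate
    print(mu, asd_tmrca(toy, mu))
```

Because the estimate scales inversely with the assumed rate, a twofold disagreement about the mutation rate translates directly into a twofold range of ages, which is in line with the spread quoted above.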
The TMRCA estimates of a certain haplogroup and its subbranches provide some constraints on the times of their origin and spread. Although our time estimates for E-M78 are slightly different depending on the mutation rate used, their confidence intervals overlap and the dates obtained are in agreement with those obtained by Trombetta et al. Regarding E-M183, as mentioned above, we cannot discard an expansion from the Near East and, if so, according to our time estimates, it could have been brought by the Islamic expansion in the 7th century, but definitely not with the Neolithic expansion, which appeared in NW Africa ~7400 BP and may have featured a strong Epipaleolithic persistence. Moreover, such a recent appearance of E-M183 in NW Africa would fit with the patterns observed in the rest of the genome, where an extensive, male-biased Near Eastern admixture event is registered ~1300 ya, coincidental with the Arab expansion. An alternative hypothesis would be that E-M183 originated somewhere in Northwest Africa and then spread through all the region. Our time estimates for the origin of this haplogroup overlap with the end of the third Punic War (146 BCE), when Carthage (in current Tunisia) was defeated and destroyed, which marked the beginning of Roman hegemony of the Mediterranean Sea. About 2,000 ya North Africa was one of the wealthiest Roman provinces and E-M183 may have experienced the resulting population growth.
The Mongol Empire had a significant role in shaping the landscape of modern populations. Many populations living in Eurasia may have been the product of population mixture between ancient Mongolians and natives following the expansion of the Mongol Empire. Geneticists have found that most of these populations carried the Y-haplogroup C3* (C-M217). To trace the history of haplogroup (Hg) C3* and to further understand the origin and development of Mongolians, ancient human remains from the Jinggouzi, Chenwugou and Gangga archaeological sites, which belonged to the Donghu, Xianbei and Shiwei, respectively, were analysed. Our results show that nine of the eleven males of the Gangga site, two of the eight males of the Chengwugou site and all of the twelve males of the Jinggouzi site were found to have mutations at M130 (Hg C), M217 (Hg C3), L1373 (C2b, ISOGG2015), with the absence of mutations at M93 (Hg C3a), P39 (Hg C3b), M48 (Hg C3c), M407 (Hg C3d) and P62 (Hg C3f). These samples were attributed to the Y-chromosome Hg C3* (Hg C2b, ISOGG2015), and most of them were further typed as Hg C2b1a based on the mutation at F3918. Finally, we inferred that the Y-chromosome Hg C3*-F3918 can trace its origins to the Donghu ancient nomadic group.