Ancient genomes from North Africa evidence Neolithic migrations to the Maghreb

The bioRxiv preprint has now been published (behind paywall): Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe, by Fregel et al., PNAS (2018).

NOTE. I think one of the important changes in this version compared to the preprint is the addition of the recent Iberomaurusian samples.

Abstract (emphasis mine):

The extent to which prehistoric migrations of farmers influenced the genetic pool of western North Africans remains unclear. Archaeological evidence suggests that the Neolithization process may have happened through the adoption of innovations by local Epipaleolithic communities or by demic diffusion from the Eastern Mediterranean shores or Iberia. Here, we present an analysis of individuals’ genome sequences from Early and Late Neolithic sites in Morocco and from Early Neolithic individuals from southern Iberia. We show that Early Neolithic Moroccans (∼5,000 BCE) are similar to Later Stone Age individuals from the same region and possess an endemic element retained in present-day Maghrebi populations, confirming a long-term genetic continuity in the region. This scenario is consistent with Early Neolithic traditions in North Africa deriving from Epipaleolithic communities that adopted certain agricultural techniques from neighboring populations. Among Eurasian ancient populations, Early Neolithic Moroccans are distantly related to Levantine Natufian hunter-gatherers (∼9,000 BCE) and Pre-Pottery Neolithic farmers (∼6,500 BCE). Late Neolithic (∼3,000 BCE) Moroccans, in contrast, share an Iberian component, supporting theories of trans-Gibraltar gene flow and indicating that Neolithization of North Africa involved both the movement of ideas and people. Lastly, the southern Iberian Early Neolithic samples share the same genetic composition as the Cardial Mediterranean Neolithic culture that reached Iberia ∼5,500 BCE. The cultural and genetic similarities between Iberian and North African Neolithic traditions further reinforce the model of an Iberian migration into the Maghreb.

Ancestry inference in ancient samples from North Africa and the Iberian Peninsula. PCA analysis using the Human Origins panel (European, Middle Eastern, and North African populations) and LASER projection of aDNA samples.

Relevant excerpts:

FST and outgroup-f3 distances indicate a high similarity between IAM and Taforalt. As observed for IAM, most Taforalt sample ancestry derives from Epipaleolithic populations from the Levant. However, van de Loosdrecht et al. (17) also reported that one third of Taforalt ancestry was of sub-Saharan African origin. To confirm whether IAM individuals show a sub-Saharan African component, we calculated f4(chimpanzee, African population; Natufian, IAM) in such a way that a positive result for f4 would indicate that IAM is composed both of Levantine and African ancestries. Consistent with the results observed for Taforalt, f4 values are significantly positive for West African populations, with the highest value observed for Gambian and Mandenka (Fig. 3 and SI Appendix, Supplementary Note 10). Together, these results indicate the presence of the same ancestral components in ∼15,000-y old and ∼7,000-y-old populations from Morocco, strongly suggesting a temporal continuity between Later Stone Age and Early Neolithic populations in the Maghreb. However, it is important to take into account that the number of ancient genomes available for comparison is still low and future sampling can provide further refinement in the evolutionary history of North Africa.
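To make the f4 test in this excerpt more concrete, here is a minimal sketch of how such a statistic is computed from allele frequencies. It is not the authors' pipeline: the function name `f4` and all frequencies below are hypothetical toy values, and real analyses (e.g. with ADMIXTOOLS) use hundreds of thousands of SNPs plus a block jackknife to assess significance.

```python
from statistics import mean

def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): average of (a - b) * (c - d) over SNPs.

    Each argument is a list of allele frequencies (0..1), one per SNP,
    for the same SNPs in the same order.
    """
    if not (len(p_a) == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("all frequency lists must cover the same SNPs")
    return mean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))

# Hypothetical toy frequencies for 4 SNPs (NOT real data).
chimp = [0.0, 0.0, 1.0, 0.0]
african = [0.8, 0.6, 0.3, 0.5]
natufian = [0.2, 0.1, 0.9, 0.1]

# Assumption for illustration: IAM modelled as 70% Natufian-like
# and 30% African-like ancestry.
iam = [0.7 * n + 0.3 * a for n, a in zip(natufian, african)]

# f4(chimpanzee, African; Natufian, IAM), as in the excerpt.
admixed = f4(chimp, african, natufian, iam)

# Control: a population identical to Natufian gives exactly zero.
unadmixed = f4(chimp, african, natufian, natufian)

print(f"f4 with African-like admixture: {admixed:.3f}")
print(f"f4 without admixture:           {unadmixed:.3f}")
```

In this toy setup the statistic is positive only when the test population carries the extra African-like component, which is the logic behind reading a positive f4 as a mix of Levantine and African ancestries.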

Genetic analyses have revealed that the population history of modern North Africans is quite complex (11). Based on our aDNA analysis, we identify an Early Neolithic Moroccan component that is (i) restricted to North Africa in present-day populations (11); (ii) the sole ancestry in IAM samples; and (iii) similar to the one observed in Later Stone Age samples from Morocco (17). We conclude that this component, distantly related to that of Epipaleolithic communities from the Levant, represents the autochthonous Maghrebi ancestry associated with Berber populations. Our data suggests that human populations were isolated in the Maghreb since Upper Paleolithic times. Our hypothesis is in agreement with archaeological research pointing to the first stage of the Neolithic expansion in Morocco as the result of a local population that adopted some technological innovations, such as pottery production or farming, from neighboring areas.

By 3,000 BCE, a continuity in the Neolithic spread brought Mediterranean-like ancestry to the Maghreb, most likely from Iberia. Other archaeological remains, such as African elephant ivory and ostrich eggs found in Iberian sites, confirm the existence of contacts and exchange networks through both sides of the Gibraltar strait at this time. Our analyses strongly support that at least some of the European ancestry observed today in North Africa is related to prehistoric migrations, and local Berber populations were already admixed with Europeans before the Roman conquest. Furthermore, additional European/ Iberian ancestry could have reached the Maghreb after KEB people; this scenario is supported by the presence of Iberian-like Bell-Beaker pottery in more recent stratigraphic layers of IAM and KEB caves. Future paleogenomic efforts in North Africa will further disentangle the complex history of migrations that forged the ancestry of the admixed populations we observe today.

Ancestry inference in ancient samples from North Africa and the Iberian Peninsula. (B) ADMIXTURE analysis using the Human Origins dataset (European, Middle Eastern, and North African populations) for modern and ancient samples (K = 8). (D) Detail of ADMIXTURE analysis using the Human Origins dataset (European, Middle Eastern, North African, and sub-Saharan African populations) for modern and ancient samples, including Taforalt.

Also, from the main author’s Twitter account:

I just realized that the paragraph with information on data availability is missing! Sequence data in the European Nucleotide Archive (PRJEB22699). Consensus mtDNA sequences are available at the National Center of Biotechnology Information (Accession Numbers MF991431-MF991448).

I find it hard to believe that this genetic continuity from Upper Palaeolithic to Late Neolithic could be representative of an autochthonous development of Afroasiatic. An important population movement – likely more than one – must be found in ancient DNA influencing North-Central and North-East Africa, probably during the time of the Green Sahara corridor.

See here:

The Proto-Indo-European – Euskarian hypothesis


Another short communication by Juliette Blevins has just been posted, A single sibilant in Proto-Basque: *s, *Rs, *sT and the phonetic basis of the sibilant split:

Blevins (to appear) presents a new reconstruction of Proto-Basque, the mother of Basque and Aquitanian, based on standard methods in historical linguistics: the comparative method and the method of internal reconstruction. Where all previous reconstructions of Proto-Basque assume a contrast between two sibilants, *s, a voiceless apical sibilant, and *z a voiceless laminal sibilant (Martinet 1955; Michelena 1977; Lakarra 1995; Trask 1997), this proposal is unique in positing only a single sibilant *s. Under this account, all instances of Common Basque /z/ are derived from *s. More specifically, in syllable coda, *Rs > *Rz (R a sonorant) while in the syllable onset, *sT > *zT (T an oral stop). The true split of *s into /s/ vs. /z/ occurs when clusters like *Rz or *zT are further simplified to /z/.

In this talk, internal evidence for a single sibilant, *s, in Proto-Basque is presented, and sound changes underlying the sibilant split are examined within the context of Evolutionary Phonology (Blevins 2004, 2006, 2015, 2017). Similar sound changes are identified in other languages with similar cluster types (e.g. Kümmel 2007:232), and the phonetic basis of the sibilant split is informed by recent studies of sibilant retraction (e.g. Stevens and Harrington 2016; Stuart-Smith et al. 2018).

Blevins, already known for her previous work on the Basque language, became known internationally for her recent controversial proposal of a genetic relationship between Proto-Indo-European and Basque. Apparently, a book with her full model, Advances in Proto-Basque Reconstruction with evidence for the Proto-Indo-European-Euskarian Hypothesis, will be published by Routledge soon.

I was never convinced, not just about a genetic connection, but about the very possibility of discovering it if there was any, mainly because such a link would be quite old, and Basque is known to have been greatly influenced by surrounding IE prestige languages for millennia until it was first attested in the 16th century. Internal reconstruction can only yield a rough reconstruction of a few aspects up to a certain point in time, probably not very far beyond the Pre-Roman period, and that only thanks to the available Aquitanian inscriptions.

There are indeed certain known migrations that could be linked with a pan-European population movement, the most likely one for this hypothesis being the Villabruna cluster (the Villabruna sample itself being of haplogroup R1b pre-P297), and especially the expansion of R1b-V88 lineages, found widespread in Europe from west to east, from Mesolithic Iberia to Khvalynsk.

This haplogroup is also found in Sardinia, which may be connected to the expansion of V88 subclades (which I have speculatively proposed could be linked to Afro-Asiatic) into Africa through Italy and the Green Sahara; although it could also be linked to a speculative Vasconic-Iberian – Palaeo-Sardo group.

Without knowing the exact Pre-Proto-Indo-European stage at which Blevins would place the Basque separation, it is difficult to know how it could fit within any macro-language proposal – and thus potential ancestral population expansion.

If you are interested in this hypothesis, I suggest Koch’s controversial paper of 2013, Is Basque an Indo-European Language? (PDF), which appeared in JIES 41 (1 & 2), and of course the many papers rejecting it in the same volume. You also have Forni’s writings supporting this association.

Seeing how many Basque nationalists (obviously obsessed with racial purity) are still rooting for an autochthonous Palaeolithic origin of R1b lineages (especially P312) linked with the Basque language and that huge Vasconic Western Europe; and now, after Olalde & Mathieson 2018, how some are also suggesting a Neolithic link with Sardinians and the Neolithic expansion, for lack of further modern genetic differences with other Western Europeans… I wonder how a lot of people inclined to believe this nonsense today, and mentally linking Vasconic with haplogroup R1b, will paradoxically and necessarily be tied precisely to this kind of macro-family proposal in the future.


David Reich on social inequality and Yamna expansion with few Y-DNA subclades

Interesting article from David Reich that I had missed, at Nautilus, Social Inequality Leaves a Genetic Mark.

It explores one of the main issues we are observing with ancient DNA, the greater reduction in Y-DNA lineages relative to mtDNA lineages, and its most likely explanation (which I discussed recently).

Excerpts interesting for the Indo-European question (emphasis mine):

Gimbutas’s reconstruction has been criticized as fantastical by her critics, and any attempt to paint a vivid picture of what a human culture was like before the period of written texts needs to be viewed with caution. Nevertheless, ancient DNA data has provided evidence that the Yamnaya were indeed a society in which power was concentrated among a small number of elite males. The Y chromosomes that the Yamnaya carried were nearly all of a few types, which shows that a limited number of males must have been extraordinarily successful in spreading their genes. In contrast, in their mitochondrial DNA, the Yamnaya had more diverse sequences [9]. The descendants of the Yamnaya or their close relatives spread their Y chromosomes into Europe and India, and the demographic impact of this expansion was profound, as the Y-chromosome types they carried were absent in Europe and India before the Bronze Age but are predominant in both places today [13].

This Yamnaya expansion also cannot have been entirely friendly, as is clear from the fact that the proportion of Y chromosomes of steppe origin in both western Europe [14] and in India [15] today is much larger than the proportion of the rest of the genome. This preponderance of male ancestry coming from the steppe implies that male descendants of the Yamnaya with political or social power were more successful at competing for local mates than men from the local groups. The most striking example I know is from Iberia in far southwestern Europe, where Yamnaya-derived ancestry arrived suddenly at the onset of the Bronze Age between 4,500 and 4,000 years ago. Daniel Bradley’s laboratory and my laboratory independently produced ancient DNA from individuals of this period [14]. We find that in the first Iberians with Yamnaya-derived ancestry, the proportion of Yamnaya ancestry across the whole genome is almost never more than around 15 percent. However, around 90 percent of males who carry Yamnaya ancestry have a Y-chromosome type of steppe origin that was absent in Iberia prior to that time. It is clear that there were extraordinary hierarchies and imbalances in power at work in the Yamnaya expansions.

David Reich clearly doesn’t give a damn about how other people might react to his commentaries. That’s nice.

In any case, if anyone was still in denial, R1b-M269 expanded with Yamna (through the Bell Beaker expansion) into Iberia, hence yes, 90% of modern Basque male lineages have an origin in the steppe, like the R1b-DF27 sample recently found, and their common ancestor spoke Late Proto-Indo-European.

Findings like these, which should be taken as normal developments of research, are apparently still a trauma for many – like R1a-fans from India realizing most of their paternal ancestors came from the steppe, or its fans from Northern Europe understanding that their paternal ancestors probably spoke Uralic or a related language; or N1c-fans seeing how their paternal ancestors probably didn’t speak Uralic. It seems life isn’t fair to stupid simplistic ethnolinguistic ideas.

Let’s see which Y-DNA haplogroups we find in West Yamna, to verify the latest migration model of Late PIE speakers of the Reich Lab (featured image).

Check out also the BBC News coverage of David Reich and Nick Patterson, the two most influential researchers of the moment in Human Ancestry: How ancient DNA is transforming our view of the past.


Distribution of Southern Iberian haplogroup H indicates exchanges in the western Mediterranean

Recent open access paper The distribution of mitochondrial DNA haplogroup H in southern Iberia indicates ancient human genetic exchanges along the western edge of the Mediterranean, by Hernández, Dugoujon, Novelletto, Rodríguez, Cuesta and Calderón, BMC Genetics (2017).

Abstract (emphasis mine):

The structure of haplogroup H reveals significant differences between the western and eastern edges of the Mediterranean, as well as between the northern and southern regions. Human populations along the westernmost Mediterranean coasts, which were settled by individuals from two continents separated by a relatively narrow body of water, show the highest frequencies of mitochondrial haplogroup H. These characteristics permit the analysis of ancient migrations between both shores, which may have occurred via primitive sea crafts and early seafaring. We collected a sample of 750 autochthonous people from the southern Iberian Peninsula (Andalusians from Huelva and Granada provinces). We performed a high-resolution analysis of haplogroup H by control region sequencing and coding SNP screening of the 337 individuals harboring this maternal marker. Our results were compared with those of a wide panel of populations, including individuals from Iberia, the Maghreb, and other regions around the Mediterranean, collected from the literature.

Both Andalusian subpopulations showed a typical western European profile for the internal composition of clade H, but eastern Andalusians from Granada also revealed interesting traces from the eastern Mediterranean. The basal nodes of the most frequent H sub-haplogroups, H1 and H3, harbored many individuals of Iberian and Maghrebian origins. Derived haplotypes were found in both regions; haplotypes were shared far more frequently between Andalusia and Morocco than between Andalusia and the rest of the Maghreb. These and previous results indicate intense, ancient and sustained contact among populations on both sides of the Mediterranean.

Our genetic data on mtDNA diversity, combined with corresponding archaeological similarities, provide support for arguments favoring prehistoric bonds with a genetic legacy traceable in extant populations. Furthermore, the results presented here indicate that the Strait of Gibraltar and the adjacent Alboran Sea, which have often been assumed to be an insurmountable geographic barrier in prehistory, served as a frequently traveled route between continents.

a, b, c. Interpolated frequency surfaces of clade H and its main sub-clades (H1 and H3). Frequencies (%) are shown in a colour scale. See information about the populations used in Additional files 4 and 5. Map templates were taken from Natural Earth free map repository.

I usually find mtDNA data, especially studies like this one based on modern populations, very difficult to interpret for anthropological purposes. It is well-known that there are important differences in the pattern of Y-DNA and mtDNA expansion and distribution.

A paragraph in this respect caught my attention:

The patterns of variation in the Y-chromosome between western and eastern Andalusians, based on 416 males, have also been investigated for a set of Y-Short Tandem Repeats (Y-STRs) and Y-SNPs [53, 54, 55, Calderón et al., unpublished data] in combination to mtDNA analyses ([18, 19] and present study). In general, for both uniparental markers, Andalusians exhibit a typical western European genetic background, with peak frequencies of mtDNA Hg H and Y-chromosome Hg R1b1b2-M269 (45% and 60%, respectively). Interestingly, our results have further revealed that the influence of African female input is far more significant when compared to male influence in contemporary Andalusians. The lack of correspondence between the maternal and paternal genetic profiles of human populations reflects intrinsic differences in migratory behavior related to sex-biased processes and admixture, as well as differences in male and female effective population sizes related to the variance in reproductive success affected, for example, by polygyny [56, 57].

I think that the greater reduction in patrilineal lineages compared to maternal lineages that we usually see during and after prehistoric or historic migrations has more to do with the renowned Uí Néill family case and with war-related casualties (since combatants were usually men) than with other more popular explanations, such as enslavement of women or polygyny.

The most successful paternal lines (anywhere in the world) were probably those who remained in power for a long time (be it a patriarchal society based on families, clans, or more complex organizational units), who were richer and thus more capable of having healthy offspring, who in turn were able to survive longer and have more children who inherited power, etc.

In the case of recent migrations or population movements that disrupt the previously established organization, after a certain number of generations, successful patrilocal families (usually from incoming lineages) might slowly dominate over a whole region, with poorer families (usually of ‘indigenous’ lineages) suffering a greater – especially perinatal and child – mortality, without any obvious (pre)historic event associated with these gradual changes.
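As a rough illustration of how slow this kind of gradual replacement can be, here is a minimal sketch under strong assumptions: a deterministic haploid model in which the incoming paternal lineage leaves, on average, a fixed fraction more surviving sons per generation than the other lineages. The function name, the 10% starting frequency and the 5% advantage are hypothetical choices for illustration, not estimates from any dataset, and the model ignores drift, migration and population structure.

```python
def generations_to_dominate(start_freq, advantage, target_freq=0.9,
                            max_generations=10_000):
    """Generations until a favoured paternal lineage reaches target_freq.

    start_freq: initial frequency of the lineage (between 0 and 1).
    advantage:  relative excess of surviving sons per generation
                (0.05 means 5% more than the other lineages).
    Returns None if target_freq is not reached within max_generations.
    """
    if not 0 < start_freq < 1:
        raise ValueError("start_freq must be strictly between 0 and 1")
    freq = start_freq
    for generation in range(1, max_generations + 1):
        # Standard haploid selection recursion: the favoured lineage is
        # weighted by (1 + advantage), then frequencies are renormalised.
        freq = freq * (1 + advantage) / (1 + freq * advantage)
        if freq >= target_freq:
            return generation
    return None

# Hypothetical scenario: incoming lineage starts at 10% of men and has
# a modest 5% advantage in surviving sons per generation.
slow = generations_to_dominate(0.10, 0.05)
fast = generations_to_dominate(0.10, 0.20)
print("generations with 5% advantage: ", slow)
print("generations with 20% advantage:", fast)
```

Even a modest, sustained advantage is enough for one lineage to become dominant over a span of generations, without any single dramatic event, which is the point of the argument above.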

This gradual replacement of paternal lineages is compatible with the adoption of the native language by newcomers. If the number of migrants is greater than the native population, and especially if their technology is more advanced, then a more radical change including ethnolinguistic identification is more likely.

I don’t deny the (pre)historic existence of radical replacement of male populations with continuity of female lineages due to massacres of men, female slavery, or polygyny, but they are probably not the main explanation for most regional differences seen in paternal lineages, and should thus be used with caution.

Gradual replacement and founder effects are also the most logical explanation for why autochthonous continuity myths (that the modern regional prevalence of few successful lineages tended to create in the 2000s) haven’t been corroborated by ancient DNA; e.g. R1b-DF27 in Basques, N1c-M178 in Finnic populations, R1a-Z283 in Slavs, etc. There is nothing different in those areas from other recent founder effects and internal migratory flows seen everywhere in Europe in the past millennia.

Paper discovered via a link by Alberto Gonzalez on the Facebook group Iberia ADN.


Iberia in the Copper and Early Bronze Age: Cultural, demographic, and environmental analysis


New paper (behind paywall), Cultural, Demographic and Environmental Dynamics of the Copper and Early Bronze Age in Iberia (3300–1500 BC): Towards an Interregional Multiproxy Comparison at the Time of the 4.2 ky BP Event, Blanco-González, Lillios, López-Sáez, et al. J World Prehist (2018).

Abstract (emphasis mine):

This paper presents the first comprehensive pan-Iberian overview of one of the major episodes of cultural change in later prehistoric Iberia, the Copper to Bronze Age transition (c. 2400–1900 BC), and assesses its relationship to the 4.2 ky BP climatic event. It synthesizes available cultural, demographic and palaeoenvironmental evidence by region between 3300 and 1500 BC. Important variation can be discerned through this comparison. The demographic signatures of some regions, such as the Meseta and the southwest, diminished in the Early Bronze Age, while other regions, such as the southeast, display clear growth in human activities; the Atlantic areas in northern Iberia barely experienced any changes. This paper opens the door to climatic fluctuations and inter-regional demic movements within the Peninsula as plausible contributing drivers of particular historical dynamics.

Division of Iberia into 5 study areas according to their culture history (3300–1550 BC)

Interesting excerpts summarizing key trends in the different regions:

  • Between 2200 and 1900 BC, the northernmost regions (i.e. Galicia, the Cantabrian strip and the northeastern sector to the north of the Ebro valley) underwent relatively minor changes in the realms of settlement and burial practices. (…) In addition, some Atlantic areas show a marked and statistically significant fall in human activity c. 2200 BC, with a subsequent recovery c. 1600 BC, and such observations are matched by paleoenvironmental proxies and a lack of known EBA sites
  • The overall impression from the Meseta is one of sharp disruption in cultural practices; these include both settlement and burial patterns, abrupt shifts in local climate conditions, and striking differences in human pressure on vegetation. However, there was also clear intra-regional variability, with remarkable internal particularities and differential tempos between the western and eastern sectors. In terms of material culture, discontinuity with the Copper Age is the main trend in the western Duero and the Tagus valleys, yet EBA communities to the north of the Central System adopted far more distinctive and therefore traceable site types (hilltops) and material repertoire. This shift was even stronger in the case of the Motillas culture at La Mancha, whose pathway seems closely tied to the Argaric area.
  • Intra-regional variability is also apparent within the northeast (…) In the second millennium BC, material culture changed, long-distance exchange intensified and anthropogenic pressure increased, despite continuity in diverse realms of social practice.
  • The pattern in the southwest was one of marked discontinuity with two key features: a) it follows the general decreasing trends manifest across Atlantic Iberia; and b) its temporality was clearly different from the rest of the Peninsula and apparently unrelated to the 4.2 ky BP event. Thus, a highly conspicuous and rich variety of cultural expressions in the Chalcolithic, with an early and marked peak in human activity during the Beaker phase c. 2500 BC, gave way to a sudden cultural collapse prior to the onset of the EBA
  • The southeast exhibits one of the most remarkable cultural shifts in Western Europe. (…) The radical transformation in Chalcolithic materiality and ways of life could be regarded as a kind of societal collapse. The Argaric, a highly hierarchical and integrated regional polity, is the clearest example of a new scenario that emerged after the 4.2 ky BP event, yet the contributing role of environmental change and immigration from other regions remains to be fully explored.

Since R1b-DF27 lineages are widely distributed in modern western Europe, it is only logical that the recent find of its first ancient sample in Iberia has sparked interest in Chalcolithic and Early Bronze Age Iberian cultures.

There is not much literature in English about Iberian prehistory, especially on the evolution of Bell Beaker culture. Also, most papers in Spanish on this cultural phenomenon – in my humble opinion, as a non-archaeologist – seem to be written from a merely descriptive archaeological point of view, many of them still sharing the radiocarbon-based assessment of origin and distribution of materials, instead of more complex anthropological models of cultural change and potential migrations.

Nevertheless, changes and influences in Iberian cultures are obvious regardless of the view taken on population movements (which are becoming quite clear now), and this paper seems to me a thorough review, very interesting for international researchers when interpreting ancient DNA from Iberia.

Featured image, modified from the paper: “The Bell Beaker culture in the northern Meseta: an artistic recreation of funerary ritual at Fuente Olmedo (Valladolid). Source: Garrido-Pena et al. 2011, Fig. 7.7”.

EDIT (21 MAR 2018): Interesting C14 date repository project Cronología de la Prehistoria de la Península Ibérica (read a brief description, in Spanish).


Iberian prehistoric migrations in Genomics from Neolithic, Chalcolithic, and Bronze Age


New open access paper Four millennia of Iberian biomolecular prehistory illustrate the impact of prehistoric migrations at the far end of Eurasia, by Valdiosera, Günther, Vera-Rodríguez, et al. PNAS (2018) published ahead of print.

Abstract (emphasis mine):

Population genomic studies of ancient human remains have shown how modern-day European population structure has been shaped by a number of prehistoric migrations. The Neolithization of Europe has been associated with large-scale migrations from Anatolia, which was followed by migrations of herders from the Pontic steppe at the onset of the Bronze Age. Southwestern Europe was one of the last parts of the continent reached by these migrations, and modern-day populations from this region show intriguing similarities to the initial Neolithic migrants. Partly due to climatic conditions that are unfavorable for DNA preservation, regional studies on the Mediterranean remain challenging. Here, we present genome-wide sequence data from 13 individuals combined with stable isotope analysis from the north and south of Iberia covering a four-millennial temporal transect (7,500–3,500 BP). Early Iberian farmers and Early Central European farmers exhibit significant genetic differences, suggesting two independent fronts of the Neolithic expansion. The first Neolithic migrants that arrived in Iberia had low levels of genetic diversity, potentially reflecting a small number of individuals; this diversity gradually increased over time from mixing with local hunter-gatherers and potential population expansion. The impact of post-Neolithic migrations on Iberia was much smaller than for the rest of the continent, showing little external influence from the Neolithic to the Bronze Age. Paleodietary reconstruction shows that these populations have a remarkable degree of dietary homogeneity across space and time, suggesting a strong reliance on terrestrial food resources despite changing culture and genetic make-up.

(A) f4 statistics testing affinities of prehistoric European farmers to either early Neolithic Iberians or central Europeans, restricting these reference populations to SNP-captured individuals to avoid technical artifacts driving the affinities. The boxplots in A show the distributions of all individual f4 statistics belonging to the respective groups. The signal is not sensitive to the choice of reference populations and is not driven by hunter-gatherer–related admixture (Datasets S4 and S5). (B) Estimates of ancestry proportions in different prehistoric Europeans as well as modern southwestern Europeans. Individuals from regions of Iberia were grouped together for the analysis in A and B to increase sample sizes per group and reduce noise


We present a comprehensive biomolecular dataset spanning four millennia of prehistory across the whole Iberian Peninsula. Our results highlight the power of archaeogenomic studies focusing on specific regions and covering a temporal transect. The 4,000 y of prehistory in Iberia were shaped by major chronological changes but with little geographic substructure within the Peninsula. The subtle but clear genetic differences between early Neolithic Iberian farmers and early Neolithic central European farmers point toward two independent migrations, potentially originating from two slightly different source populations. These populations followed different routes, one along the Mediterranean coast, giving rise to early Neolithic Iberian farmers, and one via mainland Europe forming early Neolithic central European farmers. This directly links all Neolithic Iberians with the first migrants that arrived with the initial Mediterranean Neolithic wave of expansion. These Iberians mixed with local hunter-gatherers (but maintained farming/pastoral subsistence strategies, i.e., diet), leading to a recovery from the loss of genetic diversity emerging from the initial migration founder bottleneck. Only after the spread of Bell Beaker pottery did steppe-related ancestry arrive in Iberia, where it had smaller contributions to the population compared with the impact that it had in central Europe. This implies that the two prehistoric migrations causing major population turnovers in central Europe had differential effects at the southwestern edge of their distribution: The Neolithic migrations caused substantial changes in the Iberian gene pool (the introduction of agriculture by farmers) (6, 9, 11, 13, 24), whereas the impact of Bronze Age migrations (Yamnaya) was significantly smaller in Iberia than in north-central Europe (24). 
The post-Neolithic prehistory of Iberia is generally characterized by interactions between residents rather than by migrations from other parts of Europe, resulting in relative genetic continuity, while most other regions were subject to major genetic turnovers after the Neolithic (4, 6, 7, 9, 25, 48). Although Iberian populations represent the furthest wave of Neolithic expansion in the westernmost Mediterranean, the subsequent populations maintain a surprisingly high genetic legacy of the original pioneer farming migrants from the east compared with their central European counterparts. This counterintuitive result emphasizes the importance of in-depth diachronic studies in all parts of the continent.


Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula

Open access preprint (which I already announced) at bioRxiv: Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula, by Bycroft et al. (2018).

Abstract (emphasis mine):

Genetic differences within or between human populations (population structure) has been studied using a variety of approaches over many years. Recently there has been an increasing focus on studying genetic differentiation at fine geographic scales, such as within countries. Identifying such structure allows the study of recent population history, and identifies the potential for confounding in association studies, particularly when testing rare, often recently arisen variants. The Iberian Peninsula is linguistically diverse, has a complex demographic history, and is unique among European regions in having a centuries-long period of Muslim rule. Previous genetic studies of Spain have examined either a small fraction of the genome or only a few Spanish regions. Thus, the overall pattern of fine-scale population structure within Spain remains uncharacterised. Here we analyse genome-wide genotyping array data for 1,413 Spanish individuals sampled from all regions of Spain. We identify extensive fine-scale structure, down to unprecedented scales, smaller than 10 Km in some places. We observe a major axis of genetic differentiation that runs from east to west of the peninsula. In contrast, we observe remarkable genetic similarity in the north-south direction, and evidence of historical north-south population movement. Finally, without making particular prior assumptions about source populations, we show that modern Spanish people have regionally varying fractions of ancestry from a group most similar to modern north Moroccans. The north African ancestry results from an admixture event, which we date to 860 – 1120 CE, corresponding to the early half of Muslim rule. Our results indicate that it is possible to discern clear genetic impacts of the Muslim conquest and population movements associated with the subsequent Reconquista.
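As a side note on how a date range such as 860 – 1120 CE can come out of genetic data: admixture dating methods typically estimate the number of generations since the admixture event, which is then converted into calendar years. The sketch below only shows that conversion. The generation length (28 years), the reference year (1950) and the interval in generations are illustrative assumptions, not values quoted from the paper, and the function names are hypothetical.

```python
def admixture_year(generations_ago, generation_years=28.0, reference_year=1950):
    """Convert 'generations since admixture' into an approximate calendar year (CE).

    generation_years and reference_year are assumptions made for illustration.
    """
    return reference_year - generations_ago * generation_years


def admixture_interval(gen_low, gen_high, generation_years=28.0, reference_year=1950):
    """Map an interval in generations onto calendar years as (earliest, latest)."""
    earliest = admixture_year(gen_high, generation_years, reference_year)
    latest = admixture_year(gen_low, generation_years, reference_year)
    return earliest, latest


if __name__ == "__main__":
    # Hypothetical interval: between 30 and 39 generations before the reference year.
    print(admixture_interval(30, 39))
```

Note how sensitive the result is to the assumed generation length: a change of a couple of years per generation shifts a date of this age by almost a century.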

“(a) Binary tree showing the inferred hierarchical relationships between clusters. The colours and points correspond to each cluster as shown on the map, and the length of the coloured rectangles is proportional to the number of individuals assigned to that cluster. We combined some small clusters (Methods) and the thick black branches indicate the clades of the tree that we visualise in the map. We have labeled clusters according to the approximate location of most of their members, but geographic data was not used in the inference. (b) Each individual is represented by a point placed at (or close to) the centroid of their grandparents’ birthplaces. On this map we only show the individuals for whom all four grandparents were born within 80km of their average birthplace, although the data for all individuals were used in the fineSTRUCTURE inference. The background is coloured according to the spatial densities of each cluster at the level of the tree where there are 14 clusters (see Methods). The colour and symbol of each point corresponds to the cluster the individual was assigned to at a lower level of the tree, as shown in (a). The labels and boundaries of Spain’s Autonomous Communities are also shown.”
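The placement rule described in panel (b) can be sketched roughly as follows: average the four grandparental birthplaces, and keep the individual on the map only if every birthplace lies within 80 km of that average. This is a minimal sketch under my own assumptions (plain averaging of latitude and longitude, haversine distances, hypothetical function names and coordinates); the paper's Methods may define the centroid and the distance differently.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def centroid(points):
    """Plain average of latitudes and longitudes (an approximation, fine at short range)."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return (sum(lats) / len(points), sum(lons) / len(points))


def plot_position(grandparent_birthplaces, max_km=80.0):
    """Return the centroid if every birthplace lies within max_km of it, else None."""
    c = centroid(grandparent_birthplaces)
    if all(haversine_km(c, p) <= max_km for p in grandparent_birthplaces):
        return c
    return None


if __name__ == "__main__":
    # Hypothetical coordinates: four grandparents born close to each other.
    local = [(42.88, -8.54), (42.90, -8.50), (42.85, -8.60), (42.87, -8.56)]
    print(plot_position(local))
    # Same individual, but with one grandparent born far away: not shown on the map.
    print(plot_position(local[:3] + [(40.42, -3.70)]))
```

The filter matters for interpretation: individuals with geographically mixed grandparents still inform the clustering, but would blur the map if plotted at an averaged location where none of their ancestors lived.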

Some interesting excerpts:

Our results further imply that north west African-like DNA predominated in the migration. Moreover, admixture mainly, and perhaps almost exclusively, occurred within the earlier half of the period of Muslim rule. Within Spain, north African ancestry occurs in all groups, although levels are low in the Basque region and in a region corresponding closely to the 14th-century ‘Crown of Aragon’. Therefore, although genetically distinct this implies that the Basques have not been completely isolated from the rest of Spain over the past 1300 years.

NOTE. I must add here that the Expulsion of the Moriscos is known to have been quite successful in the old Crown of Aragon, deeply affecting its economy, in contrast with the territories of the Crown of Castile, where they either formed less sizeable communities or were dispersed and eventually Christianized and integrated into local communities. For example, thousands of Moriscos from Granada were dispersed into different regions of the Crown of Castile following the War of the Alpujarras (1567–1571), and many could not be expelled later because locals resisted carrying out the expulsion edict.

Perhaps surprisingly, north African ancestry does not reflect proximity to north Africa, or even regions under more extended Muslim control. The highest amounts of north African ancestry found within Iberia are in the west (11%) including in Galicia, despite the fact that the region of Galicia as it is defined today (north of the Miño river), was never under Muslim rule and Berber settlements north of the Douro river were abandoned by. This observation is consistent with previous work using Y-chromosome data. We speculate that the pattern we see is driven by later internal migratory flows, such as between Portugal and Galicia, and this would also explain why Galicia and Portugal show indistinguishable ancestry sharing with non-Spanish groups more generally. Alternatively, it might be that these patterns reflect regional differences in patterns of settlement and integration with local peoples of north African immigrants themselves, or varying extents of the large-scale expulsion of Muslim people, which occurred post-Reconquista and especially in towns and cities.

We estimated ancestry profiles for each point on a fine spatial grid across Spain (Methods). Gray crosses show the locations of sampled individuals used in the estimation. Map shows the fraction contributed from the donor group ‘NorthMorocco’.

Overall, the pattern of genetic differentiation we observe in Spain reflects the linguistic and geopolitical boundaries present around the end of the time of Muslim rule in Spain, suggesting this period has had a significant and long-term impact on the genetic structure observed in modern Spain, over 500 years later. In the case of the UK, similar geopolitical correspondence was seen, but to a different period in the past (around 600 CE). Noticeably, in these two cases, country-specific historical events rather than geographic barriers seem to drive overall patterns of population structure. The observation that fine-scale structure evolves at different rates in different places could be explained if observed patterns tend to reflect those at the ends of periods of significant past upheaval, such as the end of Muslim rule in Spain, and the end of the Anglo-Saxon and Danish Viking invasions in the UK.

Certain people want to believe (well into the 21st century) in ideal ancestral populations and ancient ethnolinguistic identifications linked to their own, or their own country’s dominant, ancestral components and Y-DNA haplogroup.

We are nevertheless seeing how it is mainly the most recent relevant geopolitical events and late internal migratory flows that have shaped the genetic structure (including Y-DNA haplogroup composition) of modern regions and countries, regardless of their populations’ actual language or ethnic identification, whether (pre)historical or modern.

Another surprise for many, I guess.


Migration vs. Acculturation models for Aegean Neolithic in Genetics — still depending strongly on Archaeology


Recent paper in Proceedings of the Royal Society B: Archaeogenomic analysis of the first steps of Neolithization in Anatolia and the Aegean, by Kılınç et al. (2017).


The Neolithic transition in west Eurasia occurred in two main steps: the gradual development of sedentism and plant cultivation in the Near East and the subsequent spread of Neolithic cultures into the Aegean and across Europe after 7000 cal BCE. Here, we use published ancient genomes to investigate gene flow events in west Eurasia during the Neolithic transition. We confirm that the Early Neolithic central Anatolians in the ninth millennium BCE were probably descendants of local hunter–gatherers, rather than immigrants from the Levant or Iran. We further study the emergence of post-7000 cal BCE north Aegean Neolithic communities. Although Aegean farmers have frequently been assumed to be colonists originating from either central Anatolia or from the Levant, our findings raise alternative possibilities: north Aegean Neolithic populations may have been the product of multiple westward migrations, including south Anatolian emigrants, or they may have been descendants of local Aegean Mesolithic groups who adopted farming. These scenarios are consistent with the diversity of material cultures among Aegean Neolithic communities and the inheritance of local forager know-how. The demographic and cultural dynamics behind the earliest spread of Neolithic culture in the Aegean could therefore be distinct from the subsequent Neolithization of mainland Europe.

The analysis in the paper highlights two points regarding the process of Neolithisation in the Aegean, which is essential to ascertain the impact of later Indo-European migrations of Proto-Anatolian, Proto-Greek and other Palaeo-Balkan speakers (texts partially taken verbatim from the paper):

  • The observation that the two central Anatolian populations cluster together to the exclusion of Neolithic populations of south Levant or of Iran restates the conclusion that farming in central Anatolia in the PPN was established by local groups instead of immigrants, which is consistent with the described cultural continuity between central Anatolian Epipalaeolithic and Aceramic communities. This reiterates the earlier conclusion that the early Neolithisation in the primary zone was largely a process of cultural interaction instead of gene flow.
Principal component analysis (PCA) with modern and ancient genomes. The eigenvectors were calculated using 50 modern west Eurasian populations, onto which genome data from ancient individuals were projected. The gray circles highlight the four ancient gene pools of west Eurasia. Modern-day individuals are shown as gray points. In the Near East, Pre-Neolithic (Epipaleolithic/Mesolithic) and Neolithic individuals genetically cluster by geography rather than by cultural context. For instance, Neolithic individuals of Anatolia cluster to the exclusion of individuals from the Levant or Iran. In Europe, genetic clustering reflects cultural context but not geography: European early Neolithic individuals are genetically distinct from European pre-Neolithic individuals but tightly cluster with Anatolians. PPN: Pre-Pottery/Aceramic Neolithic, PN: Pottery Neolithic, Tepecik: Tepecik-Çiftlik (electronic supplementary material, table S1 lists the number of SNPs per ancient individual).
  • The realisation that there are still two possibilities regarding the question of whether Aegean Neolithisation (post-7000 cal BC) involved similar acculturation processes, or was driven by migration similar to Neolithisation in mainland Europe — a long-standing debate in Archaeology:
    1. Migration from Anatolia to the Aegean: the Aegean Neolithisation must have involved replacement of a local, WHG-related Mesolithic population by incoming easterners. Central Anatolia or south Anatolia / north Levant (for which there are no data) are potential origins of the components observed. Notably, the north Aegeans – Revenia (ca. 6438-6264 BC) and Barcın (ca. 6500-6200 BC) – show higher diversity than the central Anatolians, and the population size of Aegeans was larger than that of central Anatolians. The lack of WHG in later samples indicates that they must have been fully replaced by the eastern migrant farmers.
    2. Adoption of Neolithic elements by local foragers: Alternatively, the Aegean coast Mesolithic populations may have been part of the Anatolian-related gene pool that occupied the Aegean seaboard during the Early Holocene, in an “out-of-the-Aegean” hypothesis. Following the LGM, Aegean emigrants would have dispersed into central Anatolia and established populations that eventually gave rise to the local Epipalaeolithic and later Neolithic communities, in line with the earliest direct evidence for human presence in central Anatolia ca 14 000 cal BCE.
  • On the archaeological evidence (excerpt):

    Instead of a single-sourced colonization process, the Aegean Neolithization may thus have flourished upon already existing coastal and interior interaction networks connecting Aegean foragers with the Levantine and central Anatolian PPN populations, and involved multiple cultural interaction events from its early steps onward [16,20,64,74]. This wide diversity of cultural sources and the potential role of local populations in Neolithic development may set apart Aegean Neolithization from that in mainland Europe. While Mesolithic Aegean genetic data are awaited to fully resolve this issue, researchers should be aware of the possibility that the initial emergence of the Neolithic elements in the Aegean, at least in the north Aegean, involved cultural and demographic dynamics different than those in European Neolithization.

    Featured image, from the article: “Summary of the data analyzed in this study. (a) Map of west Eurasia showing the geographical locations and (b) timeline showing the time period (years BCE) of ancient individuals investigated in the study. Blue circles: individuals from pre-Neolithic context; red triangles: individuals from Neolithic contexts”.


Before steppe ancestry: Europe’s genetic diversity shaped mainly by local processes, with varied sources and proportions of hunter-gatherer ancestry


The definitive publication of a BioRxiv preprint article, in Nature: Parallel palaeogenomic transects reveal complex genetic history of early European farmers, by Lipson et al. (2017).

The dataset with all new samples is available at the Reich Lab’s website. You can try my drafts on how to do your own PCA and ADMIXTURE analysis with some of their new datasets.


Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants who received a limited amount of admixture from resident hunter-gatherers. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Here we investigate the population dynamics of Neolithization across Europe using a high-resolution genome-wide ancient DNA dataset with a total of 180 samples, of which 130 are newly reported here, from the Neolithic and Chalcolithic periods of Hungary (6000–2900 BC, n = 100), Germany (5500–3000 BC, n = 42) and Spain (5500–2200 BC, n = 38). We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways in which gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modelling approaches to elucidate multiple dimensions of historical population interactions.

There were some interesting finds on a regional level, with some late survival of hunter-gatherer ancestry (and Y-DNA haplogroups) in certain specific sites, but nothing especially surprising. This survival of HG ancestry and lineages in Iberia and other regions may be used to revive (yet again) the controversy over the origin of non-Indo-European languages of Europe attested in historical times, such as the only (non-Uralic) one surviving to this day, the Basque language.

This study kept confirming the absence of Y-DNA R1b-M269 subclades in Central Europe before the arrival of Yamna migrants, though, which offers strong reasons to reject the Indo-European from the west hypothesis.

Here are first the PCA of samples included in this paper, and then the PCA of ancient Eurasians (Mathieson et al. 2017) and modern populations (Lazaridis et al. 2014) for comparison of similar clusters:

First two principal components from the PCA. We computed the principal components (PCs) for a set of 782 present-day western Eurasian individuals genotyped on the Affymetrix Human Origins array (background grey points) and then projected ancient individuals onto these axes. A close-up omitting the present-day Bedouin population is shown. From Lipson et al. (2017).
PCA of South-East European and other European samples from Mathieson et al. (2017)
Ancient and modern samples on Lazaridis et al. (2014)
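The projection approach mentioned in the first caption (axes computed from present-day individuals only, ancient individuals placed on them afterwards) can be sketched as below. This is a toy illustration with random genotypes and hypothetical function names, assuming complete data and simple mean-centring. Published analyses rely on dedicated population genetics software and have to deal with the heavy missingness of ancient samples, which this sketch ignores.

```python
import numpy as np


def fit_pca(modern, n_components=2):
    """Compute principal axes from present-day genotypes (individuals x SNPs)."""
    mean = modern.mean(axis=0)
    # SVD of the centred matrix; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(modern - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(samples, mean, axes):
    """Place samples on axes that were defined without them."""
    return (samples - mean) @ axes.T


rng = np.random.default_rng(0)
# Toy genotypes coded 0/1/2: random values, not real data.
modern = rng.integers(0, 3, size=(50, 200)).astype(float)
ancient = rng.integers(0, 3, size=(5, 200)).astype(float)

mean, axes = fit_pca(modern)
modern_pcs = project(modern, mean, axes)    # background points
ancient_pcs = project(ancient, mean, axes)  # projected, never used to define the axes
```

Because the ancient individuals do not contribute to the axes, their positions only describe how they relate to present-day variation; structure that exists only among ancient groups may be compressed or invisible in such plots.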


Iberian Peninsula: Discontinuity in mtDNA between hunter-gatherers and farmers, not so much during the Chalcolithic and EBA


A new preprint paper at BioRxiv, The maternal genetic make-up of the Iberian Peninsula between the Neolithic and the Early Bronze Age, by Szécsényi-Nagy et al. (2017).


Agriculture first reached the Iberian Peninsula around 5700 BCE. However, little is known about the genetic structure and changes of prehistoric populations in different geographic areas of Iberia. In our study, we focused on the maternal genetic makeup of the Neolithic (~ 5500-3000 BCE), Chalcolithic (~ 3000-2200 BCE) and Early Bronze Age (~ 2200-1500 BCE). We report ancient mitochondrial DNA results of 213 individuals (151 HVS-I sequences) from the northeast, central, southeast and southwest regions and thus on the largest archaeogenetic dataset from the Peninsula to date. Similar to other parts of Europe, we observe a discontinuity between hunter-gatherers and the first farmers of the Neolithic. During the subsequent periods, we detect regional continuity of Early Neolithic lineages across Iberia, however the genetic contribution of hunter-gatherers is generally higher than in other parts of Europe and varies regionally. In contrast to ancient DNA findings from Central Europe, we do not observe a major turnover in the mtDNA record of the Iberian Late Chalcolithic and Early Bronze Age, suggesting that the population history of the Iberian Peninsula is distinct in character.

Iberian mtDNA samples

Detailed conclusions of their work:

The present study, based on 213 new and 125 published mtDNA data of prehistoric Iberian individuals suggests a more complex mode of interaction between local hunter-gatherers and incoming early farmers during the Early and Middle Neolithic of the Iberian Peninsula, as compared to Central Europe. A characteristic of Iberian population dynamics is the proportion of autochthonous hunter-gatherer haplogroups, which increased in relation to the distance to the Mediterranean coast. In contrast, the early farmers in Central Europe showed comparatively little admixture of contemporaneous hunter-gatherer groups. Already during the first centuries of Neolithic transition in Iberia, we observe a mix of female DNA lineages of different origins. Earlier hunter-gatherer haplogroups were found together with a variety of new lineages, which ultimately derive from Near Eastern farming groups. On the other hand, some early Neolithic sites in northeast Iberia, especially the early group from the cave site of Els Trocs in the central Pyrenees, seem to exhibit affinities to Central European LBK communities. The diversity of female lineages in the Iberian communities continued even during the Chalcolithic, when populations became more homogeneous, indicating higher mobility and admixture across different geographic regions. Even though the sample size available for Early Bronze Age populations is still limited, especially with regards to El Argar groups, we observe no significant changes to the mitochondrial DNA pool until the end of our time transect (1500 BCE). The expansion of groups from the eastern steppe, which profoundly impacted Late Neolithic and EBA groups of Central and North Europe, cannot (yet) be seen in the contemporaneous population substrate of the Iberian Peninsula at the present level of genetic resolution. 
This highlights the distinct character of the Neolithic transition both in the Iberian Peninsula and elsewhere and emphasizes the need for further in depth archaeogenetic studies for reconstructing the close reciprocal relationship of genetic and cultural processes on the population level.

So it seems more and more likely that the North-West Indo-European invasion during the Copper Age (signaled by changes in Y-DNA lineages) was not accompanied by much mtDNA turnover, unlike in central Europe. What that means – either a male-dominated invasion, or a longer internal evolution of invasive Y-DNA subclades – remains to be seen, but I am still more inclined to see the former as the most likely interpretation, in spite of admixture results.


Featured images: from the article, licensed BY-NC-ND.