Distribution of Southern Iberian haplogroup H indicates exchanges in the western Mediterranean

Recent open access paper The distribution of mitochondrial DNA haplogroup H in southern Iberia indicates ancient human genetic exchanges along the western edge of the Mediterranean, by Hernández, Dugoujon, Novelletto, Rodríguez, Cuesta and Calderón, BMC Genetics (2017).

Abstract (emphasis mine):

The structure of haplogroup H reveals significant differences between the western and eastern edges of the Mediterranean, as well as between the northern and southern regions. Human populations along the westernmost Mediterranean coasts, which were settled by individuals from two continents separated by a relatively narrow body of water, show the highest frequencies of mitochondrial haplogroup H. These characteristics permit the analysis of ancient migrations between both shores, which may have occurred via primitive sea crafts and early seafaring. We collected a sample of 750 autochthonous people from the southern Iberian Peninsula (Andalusians from Huelva and Granada provinces). We performed a high-resolution analysis of haplogroup H by control region sequencing and coding SNP screening of the 337 individuals harboring this maternal marker. Our results were compared with those of a wide panel of populations, including individuals from Iberia, the Maghreb, and other regions around the Mediterranean, collected from the literature.

Both Andalusian subpopulations showed a typical western European profile for the internal composition of clade H, but eastern Andalusians from Granada also revealed interesting traces from the eastern Mediterranean. The basal nodes of the most frequent H sub-haplogroups, H1 and H3, harbored many individuals of Iberian and Maghrebian origins. Derived haplotypes were found in both regions; haplotypes were shared far more frequently between Andalusia and Morocco than between Andalusia and the rest of the Maghreb. These and previous results indicate intense, ancient and sustained contact among populations on both sides of the Mediterranean.

Our genetic data on mtDNA diversity, combined with corresponding archaeological similarities, provide support for arguments favoring prehistoric bonds with a genetic legacy traceable in extant populations. Furthermore, the results presented here indicate that the Strait of Gibraltar and the adjacent Alboran Sea, which have often been assumed to be an insurmountable geographic barrier in prehistory, served as a frequently traveled route between continents.

a, b, c. Interpolated frequency surfaces of clade H and its main sub-clades (H1 and H3). Frequencies (%) are showed in a colour scale. See information about the populations used in Additional files 4 and 5. Map templates were taken from Natural Earth free map repository (http://www.naturalearthdata.com/)

I usually find mtDNA data, especially studies like this one based on modern populations, very difficult to interpret for anthropological purposes. It is well-known that there are important differences in the pattern of Y-DNA and mtDNA expansion and distribution.

A paragraph in this respect caught my attention:

The patterns of variation in the Y-chromosome between western and eastern Andalusians, based on 416 males, have also been investigated for a set of Y-Short Tandem Repeats (Y-STRs) and Y-SNPs [53, 54, 55], Calderón et al., unpublished data] in combination to mtDNA analyses ([18, 19] and present study). In general, for both uniparental makers, Andalusians exhibit a typical western European genetic background, with peak frequencies of mtDNA Hg H and Y-chromosome Hg R1b1b2-M269 (45% and 60%, respectively). Interestingly, our results have further revealed that the influence of African female input is far more significant when compared to male influence in contemporary Andalusians. The lack of correspondence between the maternal and paternal genetic profiles of human populations reflects intrinsic differences in migratory behavior related to sex-biased processes and admixture, as well as differences in male and female effective population sizes related to the variance in reproductive success affected, for example, by polygyny [56, 57].

I think that the greater reduction in paternal lineages compared to maternal lineages that we usually see during and after prehistoric or historic migrations has more to do with the renowned Uí Néill family case and with war-related casualties (since combatants were usually men) than with other more popular explanations, such as enslavement of women or polygyny.

The most successful paternal lines (anywhere in the world) were probably those that remained in power for a long time (be it in a patriarchal society based on families, clans, or more complex organizational units), and that were richer and thus more capable of having healthy offspring, who in turn were able to survive longer and have more children who inherited power, etc.

In the case of recent migrations or population movements that disrupt the previously established organization, after a certain number of generations, successful patrilocal families (usually from incoming lineages) might slowly dominate over a whole region, with poorer families (usually of ‘indigenous’ lineages) suffering greater mortality, especially perinatal and child mortality, without any obvious (pre)historic event associated with these gradual changes.

This gradual replacement of paternal lineages is compatible with the adoption of the native language by newcomers. If the number of migrants is greater than the native population, and especially if their technology is more advanced, then a more radical change including ethnolinguistic identification is more likely.

I don’t deny the (pre)historic existence of radical replacement of male populations with continuity of female lineages due to massacres of men, female slavery, or polygyny, but they are probably not the main explanation for most regional differences seen in paternal lineages, and should thus be used with caution.

Gradual replacement and founder effects are also the most logical explanation for why autochthonous continuity myths (which the modern regional prevalence of a few successful lineages tended to create in the 2000s) haven’t been corroborated by ancient DNA; e.g. R1b-DF27 in Basques, N1c-M178 in Finnic populations, R1a-Z283 in Slavs, etc. There is nothing different in those areas from other recent founder effects and internal migratory flows seen everywhere in Europe in the past millennia.

Paper discovered via a link by Alberto Gonzalez on Facebook group Iberia ADN


Paternal lineages mainly from migrants, maternal lineages mainly from local populations in Argentina

New paper (behind paywall) Genetic variation in populations from central Argentina based on mitochondrial and Y chromosome DNA evidence, by García, Pauro, Bailliet, Bravi & Demarchi, J. Hum. Genet (2018) 63: 493–507.

Abstract (emphasis mine):

We present new data and analysis on the genetic variation of contemporary inhabitants of central Argentina, including a total of 812 unrelated individuals from 20 populations. Our goal was to bring new elements for understanding micro-evolutionary and historical processes that generated the genetic diversity of the region, using molecular markers of uniparental inheritance (mitochondrial DNA and Y chromosome). Almost 76% of the individuals show mitochondrial lineages of American origin. The Native American haplogroups predominate in all surveyed localities, except in one. The larger presence of Eurasian maternal lineages were observed in the plains (Pampas) of the southeast, whereas the African lineages are more frequent in northern Córdoba. On the other hand, the analysis of 258 male samples reveals that 92% of them present Eurasian paternal lineages, 7% carry Native American haplogroups, and only 1% of the males show African lineages. The maternal lineages have high genetic diversity homogeneously distributed throughout central Argentina, probably as result of a recent common origin and sustained gene flow. Migratory events that occurred in colonial and recent times should have contributed to hiding any traces of differentiation that might have existed in the past. The analysis of paternal lineages showed also homogeneous distribution of the variation together with a drastic reduction of the native male population.

Maps showing continental mtDNA haplogroups frequencies in 20 population samples from central Argentina. References for populations abbreviated names are from the tables.

Interesting excerpts:

The immigration waves had less impact in the north–central and northwestern regions, the most populated areas of the country in pre-Hispanic times. The spatial structure of genetic diversity has its origins in historical factors. It is possible to distinguish different stages in migratory processes from abroad, with a heterogeneous regional impact. The genetic composition of central Argentina gives account of these processes. On one hand, the political boundaries between provinces influenced the configuration of the genetic structure of the populations that were formed. In this sense, Córdoba—an important economic and commercial center since colonial times—has a greater component of foreign lineages than the populations of San Luis and Santiago del Estero. On the other hand, the genetic structure of central Argentina also accounts for other processes related to different migration phases and occupations of space over the last 500 years.

Maternal continental contribution (in percentages), and Native American haplogroup frequencies, by population

Similarly, negative values observed in the neutrality tests (Tajima’s D and Fu’s FS), indicate relatively recent population growth, probably associated with technological and organizational changes leading to new lifestyles and important demographic and territorial expansion [75]. In conclusion, the molecular markers of maternal inheritance shows large genetic diversity homogeneously distributed throughout central Argentina, probably as result of a recent common origin and sustained gene flow between sub-populations. In addition, migratory events that occurred in colonial and recent times should have contributed to hiding any traces of differentiation that might have existed in the past. The analysis of paternal lineages showed also homogeneous distribution of the variation across the region but also a drastic reduction of the native male population, with a large prevalence of haplogroups of European origin.
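To make the neutrality-test argument above more concrete, here is a minimal sketch of how Tajima's D can be computed from a set of aligned sequences, using the standard textbook formula. The function name and the toy sequences are illustrative assumptions; they are not data or code from the paper. A negative D (an excess of rare variants) is the pattern usually read as a signal of recent population growth.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (toy sketch)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        raise ValueError("D is undefined without segregating sites")

    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Toy data (made up): every variant is a singleton, as after a rapid expansion
expansion_like = ["TAAA", "ATAA", "AATA", "AAAT"]
# Toy data (made up): two divergent clades at intermediate frequency
structured = ["AAAA", "AAAA", "TTTT", "TTTT"]

d_expansion = tajimas_d(expansion_like)   # negative
d_structured = tajimas_d(structured)      # positive
```

In the first toy sample every mutation is private to one sequence, so D comes out negative; in the second, two deep clades inflate the pairwise differences and D comes out positive.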

Y chromosome haplogroups frequencies in three provinces from central Argentina and other 19 samples from Argentina, Chile, and Paraguay


A history of male migration in and out of the Green Sahara

Open access research highlight A history of male migration in and out of the Green Sahara, by Yali Xue, Genome Biology (2018) 19:30, on the recent paper by D’Atanasio et al.

Insights from the Green Saharan Y-chromosomal findings (emphasis mine):

It is widely accepted that sub-Saharan Y chromosomes are dominated by E-M2 lineages carried by Bantu-speaking farmers as they expanded from West Africa starting < 5 kya, reaching South Africa within recent centuries [4]. The E-M2-Bantu lineages lie phylogenetically within the E-M2-Green Sahara lineage and show at least three explosive lineage expansions beginning 4.9–5.3 kya [5] (Fig. 1a). These events of E-M2-Bantu expansion are slightly later than the R-V88 expansion, and highlight the range of male demographic changes in the mid-Holocene. North of the Sahara, in addition to the four trans-Saharan haplogroups, haplogroup E-M81 (which diverged from E-M78 ~ 13 kya) became very common in present-day populations as a result of another massive expansion ~ 2 kya [6] (Fig. 1a).

Simplified Y-chromosomal phylogeny and inferred past or observed present-day distribution of relevant Y-chromosomal lineages. a Calibrated phylogenetic tree of Y-chromosomal lineages discussed in the text. Green shading represents the period when the present-day Sahara Desert was green and fertile. Lineages represented by filled pentagons have undergone very rapid expansions. b [featured image] The Green Sahara period 5–12 kya. Green shading indicates that the present-day Sahara Desert was green and fertile. The colors within the large oval represent the four Y-chromosomal haplogroups deduced to be present in the region at this time; specific locations are not implied. The arrows indicate the inferred origins of these haplogroups to the north or south, but specific origins and routes are not implied. c The present-day distributions of the four Green Saharan Y-chromosomal haplogroups. Yellow shading indicates the Sahara Desert. Each circle represents a sampled population, with the presence or absence of the four Green Saharan haplogroups shown by the colored sectors; other haplogroups may also be present in these populations, but are not shown. The small arrows indicate the inferred northwards and southwards movements of these haplogroups when the Sahara became uninhabitable.

Although Y chromosomes exist within populations and so share and reflect the general history of those populations, they can sometimes show some departures from other parts of the genome that result from differences in male and female behaviors. D’Atanasio et al. [1] highlight one such contrast in their study. Present-day North African populations show substantial sub-Saharan autosomal and mtDNA genetic components ascribed to the Roman and Arab slave trades 1–2 kya [7], but carry few sub-Saharan Y lineages from this source, probably reflecting the smaller numbers of male slaves and their reduced reproductive opportunities when compared to those of female slaves. The sub-Saharan Y chromosomes in these North African populations thus originate predominantly from the earlier Green Sahara period.

In this part of Africa, the indigenous languages that are spoken belong to three of the four African linguistic families (Afro-Asiatic, Nilo-Saharan and Niger-Congo). Interestingly, these languages show non-random associations with Y lineages. For example, Chadic languages within the Afro-Asiatic family are associated with haplogroup R-V88, whereas Nilo-Saharan languages are associated with specific sublineages within A3-M13 and E-M78, further illustrating the complex human history of the region.

The main question after D’Atanasio et al. (2018) is thus:

(…) what are the reasons for the very rapid R-V88 expansion 5–6 kya [1] and E-M81 expansion ~ 2 kya [6], and how do these expansions fit within general worldwide patterns of male-specific expansions, which in other cases have been linked to cultural and technological changes [5]?

I think that the only known haplogroup expansion that might fit today the spread and dialectalization of Afroasiatic, a proto-language probably contemporaneous with or slightly older than Middle Proto-Indo-European, is that of R1b-V88 lineages. However, without ancient DNA samples to corroborate this, we cannot be sure.


Yleaf: software for human Y-chromosomal haplogroup inference from next generation sequencing data


Brief communication (behind paywall) Yleaf: software for human Y-chromosomal haplogroup inference from next generation sequencing data, by Arwin Ralf, Diego Montiel González, Kaiyin Zhong, and Manfred Kayser, Mol Biol Evol (2018), msy032.


Next generation sequencing (NGS) technologies offer immense possibilities given the large genomic data they simultaneously deliver. The human Y chromosome serves as good example how NGS benefits various applications in evolution, anthropology, genealogy and forensics. Prior to NGS, the Y-chromosome phylogenetic tree consisted of a few hundred branches, based on NGS data it now contains many thousands. The complexity of both, Y tree and NGS data provide challenges for haplogroup assignment. For effective analysis and interpretation of Y-chromosome NGS data, we present Yleaf, a publically available, automated, user-friendly software for high-resolution Y-chromosome haplogroup inference independently of library and sequencing methods.

Here is a link to the Yleaf software’s website, from the Department of Genetic Identification at the Erasmus University Medical Center.

Summary of NGS datasets used for automated NRY haplogrouping with Yleaf


In the time of NGS (or massively parallel sequencing, MPS), the amount of genomic data produced and made publically available is rapidly expanding, providing valuable resources for many areas of research and applications. Due to its haploid nature and male-specific inheritance, the non-recombining part of the human Y-chromosome (NRY) is highly suitable for phylogenetic studies and for addressing questions in evolution, anthropology, population history, genealogy and forensics (Jobling & Tyler-Smith, 2017). Over recent years, NGS data allowed the phylogenetic NRY tree to dramatically increase in size and complexity (Hallast et al. 2014; Poznik et al. 2016). The two most comprehensive tree versions ISOGG (http://www.isogg.org/tree) and Yfull (https://www.yfull.com/tree) currently contain thousands of branches. However, the complexity of both, Y tree and NGS data provide immense challenges for NRY haplogroup assignment, which reflects a key element in many NRY applications. Here we introduce Yleaf, a Python-based, easy-to-use, publically-available software tool for effective NRY single nucleotide polymorphism (SNP) calling and subsequent NRY haplogroup inference from NGS data. By comparative whole genome data analysis, we demonstrate high concordance of Yleaf in NRY-SNP calling compared to well-established tools such as SAMtools/BCFtools (Li et al. 2009), and GATK (McKenna, et al. 2010) as well as improved performance of Yleaf in NRY haplogroup assignment relative to previously developed tools such as clean_tree (Ralf et al. 2015), AMY-tree (Van Geystelen et al. 2015), and yHaplo (Poznik, 2016).

Yleaf allows analyzing NRY sequence data from many types of NGS libraries i.e., whole genomes, whole exomes, large genomic regions, and large numbers of targeted amplicons. Several modifications relative to our previously developed clean_tree tool (Ralf et al. 2015) were implemented to optimize the performance especially relevant for extremely large NGS datasets such as whole genomes. For instance, Yleaf extracts the Y-chromosomal reads prior to further processing and uses multi-threading, a batch option is included too. Importantly, Yleaf provides drastically increased haplogroup resolution i.e., from 530 positions defining 432 NRY haplogroups with clean_tree (Ralf et al. 2015) to over 41,000 positions defining 5353 haplogroups with Yleaf. For a detailed method description see the supplementary material.
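For readers unfamiliar with what "haplogroup inference from SNP calls" means in practice, the general idea can be sketched with a toy example. This is not Yleaf's algorithm or API: the tree below is a heavily simplified, hypothetical five-node excerpt (one defining marker per haplogroup), and the function names are my own. The sketch simply picks the deepest haplogroup whose defining marker is called derived and whose path to the root contains no ancestral calls.

```python
# Hypothetical, heavily simplified tree: haplogroup -> (parent, defining marker).
# Real trees (ISOGG, YFull) have thousands of branches and many markers per node.
TOY_TREE = {
    "R": (None, "M207"),
    "R1": ("R", "M173"),
    "R1a": ("R1", "M420"),
    "R1b": ("R1", "M343"),
    "R1b-V88": ("R1b", "V88"),
}


def lineage(haplogroup, tree):
    """Return the path from a haplogroup up to the root of the toy tree."""
    path = []
    while haplogroup is not None:
        path.append(haplogroup)
        haplogroup = tree[haplogroup][0]
    return path


def assign_haplogroup(calls, tree=TOY_TREE):
    """Pick the deepest haplogroup supported by the SNP calls.

    `calls` maps marker name -> "derived" or "ancestral"; markers without
    a call are simply absent (missing data is tolerated on the path).
    """
    best, best_depth = None, 0
    for haplogroup, (_, marker) in tree.items():
        if calls.get(marker) != "derived":
            continue
        path = lineage(haplogroup, tree)
        # Reject candidates contradicted by an ancestral call upstream
        if any(calls.get(tree[node][1]) == "ancestral" for node in path):
            continue
        if len(path) > best_depth:
            best, best_depth = haplogroup, len(path)
    return best


# Made-up calls for illustration
sample_a = {"M207": "derived", "M173": "derived", "M343": "derived",
            "V88": "derived", "M420": "ancestral"}
sample_b = {"M207": "derived", "V88": "derived", "M343": "ancestral"}

result_a = assign_haplogroup(sample_a)
result_b = assign_haplogroup(sample_b)
```

The second sample shows why a tree-aware check matters: an isolated derived call at V88 is discarded because an upstream marker is ancestral, so the assignment falls back to the deepest consistent node.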

Featured image: From Martiniano et al. (2017).


Archaeological and anthropological studies on the Harappan cemetery of Rakhigarhi, India


New open access paper Archaeological and anthropological studies on the Harappan cemetery of Rakhigarhi, India, by Shinde, Kim, Woo, et al. PLOS One (2018) 13(2): e0192299.


An insufficient number of archaeological surveys has been carried out to date on Harappan Civilization cemeteries. One case in point is the necropolis at Rakhigarhi site (Haryana, India), one of the largest cities of the Harappan Civilization, where most burials within the cemetery remained uninvestigated. Over the course of the past three seasons (2013 to 2016), we therefore conducted excavations in an attempt to remedy this data shortfall. In brief, we found different kinds of graves co-existing within the Rakhigarhi cemetery in varying proportions. Primary interment was most common, followed by the use of secondary, symbolic, and unused (empty) graves. Within the first category, the atypical burials appear to have been elaborately prepared. Prone-positioned internments also attracted our attention. Since those individuals are not likely to have been social deviants, it is necessary to reconsider our pre-conceptions about such prone-position burials in archaeology, at least in the context of the Harappan Civilization. The data presented in this report, albeit insufficient to provide a complete understanding of Harappan Civilization cemeteries, nevertheless does present new and significant information on the mortuary practices and anthropological features at that time. Indeed, the range of different kinds of burials at the Rakhigarhi cemetery do appear indicative of the differences in mortuary rituals seen within Harappan societies, therefore providing a vivid glimpse of how these people respected their dead.

Harappan sites where skeletons were discovered (indicated by dots). Red dot: Rakhigarhi site; dashed dot: skeletons from non-cemetery area; black dots: cemetery sites other than Rakhigarhi.

This is a must-read for anyone wishing to analyze in detail the upcoming Rakhigarhi samples, which will bring more information regarding the Neolithic population of the Indian subcontinent before the migration of Indo-Iranian peoples.


First Hungarian ruling dynasty, the Árpáds, of Y-DNA haplogroup R1a


Open access article DNA profiling of Hungarian King Béla III and other skeletal remains originating from the Royal Basilica of Székesfehérvár, by Olasz, J., Seidenberg, V., Hummel, S. et al. Archaeol Anthropol Sci (2018).


A few decades after the collapse of the Avar Khaganate (c. 822 AD), Hungarian invaders conquered the Carpathian Basin (c. 862–895 AD). The first Hungarian ruling dynasty, the Árpáds played an important role in European history during the Middle Ages. King Béla III (1172–1196) was one of the most significant rulers of the dynasty. He also consolidated Hungarian dominance over the Northern Balkans. The provostry church of the Virgin Mary (commonly known as the Royal Basilica of Székesfehérvár) played a prominent role as a coronation church and burial place of medieval Hungarian kings. The basilica’s building and graves had been destroyed over the centuries. The only royal graves that remained intact were those of King Béla III and his first spouse, Anna of Antioch. These graves were discovered in 1848. We defined the autosomal STR (short tandem repeat) fingerprints of the royal couple and eight additional individuals (two females and six males) found in the Royal Basilica. These results revealed no evidence of first-degree relationship between any of the investigated individuals. Y-chromosomal STR profiles were also established for all the male skeletons. Based upon the Y-chromosomal data, one male skeleton showed an obvious patrilineal relationship to King Béla III. A database search uncovered an existing Y-chromosomal haplotype, which had a single-repeat difference compared to that of King Béla. It was discovered in a person living in an area close to Hungary. This current male line is probably related paternally to the Árpád Dynasty. The control region of the mitochondrial DNA was determined in the royal couple and in the remains of the inferred relative. The mitochondrial results excluded sibling relationship between the King and the patrilineal relative. In summary, we successfully defined a Y-chromosomal profile of King Béla III, which can serve as a reference for the identification of further remains and disputed living descendants of the Árpád Dynasty. 
Among the examined skeletons, we discovered an Árpád member, whose exact affiliation, however, has not yet been established.

The Árpád Dynasty

The Árpád Dynasty (c. 850–1301 AD) played an important role in European history during the Middle Ages (Hóman 1940-1943). The first Great Prince Álmos organised the monarchic state in the northern region of the Black Sea c. 850. A few decades after the collapse of the Avar Khaganate (c. 822 AD), Álmos and his son Árpád conquered the Carpathian Basin (c. 862–895 AD) (Szőke 2014). During the conquest, Hungarian invaders, together with Turkic-speaking Kabars assimilated the Avars and Slavonic groups (Szádeczky-Kardoss 1990). Thus, most of the population in the Carpathian Basin originated from the Hun-Turkic cultural community of the Eurasian Steppe and was accompanied by Slavonic and German-speaking groups (László 1996). The origin of Hungarians is still controversial, and this paper cannot cover this complex subject. The Hungarian Great Principality represented the Eurasian steppe empires in Central Europe from c. 862 until 1000. Saint Stephen I, the last Great Prince (997–1000) and first King (1000–1038) of Hungary re-organised this early Hungarian state as a Christian kingdom. Saint Stephen received the royal crown from the Pope and joined the post-Roman Christian political system and cultural commonwealth of Latin Europe (Pohl 2003; Szabados 2011). Hungary remained an independent state between the German and Byzantine empires (Makk 1989). King Béla III (1172–1196) was one of the most significant rulers of the dynasty. He was the second son of King Géza II (1141–1162) and Queen Euphrosyne, the daughter of Mstislav I (1125–1132), the Great Prince of Kiev. Through the mediation of Byzantine Emperor Manuel I Komnenos, Béla married Anna of Châtillon from Antioch (1150–1184), the half-sister of the Emperor’s wife in 1170. After Manuel’s death, King Béla consolidated Hungarian dominance over the Northern Balkans.

The provostry church of the Virgin Mary (commonly known as the Royal Basilica of Székesfehérvár) was built by Saint Stephen I at the beginning of the eleventh century. The basilica played a prominent role as a church of coronation and as the main burial place of Hungarian kings in the Middle Ages. Fifteen kings, several queens, princes and princesses and clerical and secular dignitaries were buried there over five centuries (Engel 1987)

The five graves excavated by János Érdy. Drawn by János Varsányi (1848). Originally published by Érdy (1853). I: remains of Béla III; II: remains of Anna of Antioch; III: a male skeleton whose identity with II/52 is questioned; IV: the skeleton of an expectant female, only foetal bones remained; V: a crushed skeleton, it has not been preserved.


There were three R1a and two R1b statistically predicted Y haplogroups among the male skeletons (Table 3). These are the most frequent and second most frequent haplogroups (25.6 and 18.1% respectively) in the present Hungarian population (Völgyi et al. 2009). King Béla III was inferred to belong to haplogroup R1a. The R1a Y haplogroup relates paternally to more than 10% of men in a wide geographic area from South Asia to Central Eastern Europe and South Siberia (Underhill et al. 2010). It is the most frequent haplogroup in various populations speaking Slavic, Indo-Iranian, Dravidian, Turkic and Finno-Ugric languages (Underhill et al. 2010).

Kinship analysis

The autosomal STR results contradicted the paternity between King Béla III and II/52. The mitochondrial sequence results excluded siblingship, too. Apart from that, we also tested the hypothesis for siblingship versus non-relationship based on the autosomal STR results using “Familias 3”. The LR (likelihood ratio) for the alternative hypothesis was found to be 7.67, which was inconclusive. Testing the hypothesis for a grandfather-grandson (or uncle-nephew) relationship versus non-relationship resulted in an LR of 5.44, which corresponds to a probability of 84.46% (assuming a prior probability of 50%). This result is indecisive for the hypothesis.
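The step from a likelihood ratio to a probability in the excerpt above is just Bayes' rule in odds form: posterior odds = LR × prior odds. A minimal sketch follows (the function name is mine; this is not the Familias 3 implementation):

```python
def posterior_from_lr(lr, prior=0.5):
    """Posterior probability of the tested relationship, given a likelihood
    ratio and a prior probability (Bayes' rule in odds form)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


# LR values quoted in the excerpt, with the 50% prior it assumes
p_grandfather = posterior_from_lr(5.44)   # about 0.845
p_sibling = posterior_from_lr(7.67)       # about 0.885
```

With the rounded LR of 5.44 this gives about 84.5%, in line with the 84.46% quoted (the small difference presumably comes from rounding of the reported LR). The same calculation for the siblingship LR of 7.67 gives about 88.5%, which helps to see why both results are described as inconclusive.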

The Hungarian conquest of the Carpathian Basin, by Fz22 at Wikipedia.

So, the first Hungarian dynasty, which one can safely say was one of the ruling clans among the Hungarian conquerors (a group of Ugric speakers, stemming originally from North-Eastern Europe, that invaded the Carpathian Basin from the steppe in the 9th c.), belonged to R1a lineages.

Who would have thought, right?


R1b-V88 migration through Southern Italy into Green Sahara corridor, and the Afroasiatic connection

Open access article The peopling of the last Green Sahara revealed by high-coverage resequencing of trans-Saharan patrilineages, by D’Atanasio, Trombetta, Bonito, et al., Genome Biology (2018) 19:20.


Little is known about the peopling of the Sahara during the Holocene climatic optimum, when the desert was replaced by a fertile environment.

In order to investigate the role of the last Green Sahara in the peopling of Africa, we deep-sequence the whole non-repetitive portion of the Y chromosome in 104 males selected as representative of haplogroups which are currently found to the north and to the south of the Sahara. We identify 5,966 mutations, from which we extract 142 informative markers then genotyped in about 8,000 subjects from 145 African, Eurasian and African American populations. We find that the coalescence age of the trans-Saharan haplogroups dates back to the last Green Sahara, while most northern African or sub-Saharan clades expanded locally in the subsequent arid phase.

Our findings suggest that the Green Sahara promoted human movements and demographic expansions, possibly linked to the adoption of pastoralism. Comparing our results with previously reported genome-wide data, we also find evidence for a sex-biased sub-Saharan contribution to northern Africans, suggesting that historical events such as the trans-Saharan slave trade mainly contributed to the mtDNA and autosomal gene pool, whereas the northern African paternal gene pool was mainly shaped by more ancient events.

Maximum parsimony Y chromosome tree and dating of the four trans-Saharan haplogroups. a Phylogenetic relations among the 150 samples analysed here. Each haplogroup is labelled in a different colour. The four Y sequences from ancient samples are marked by the dagger symbol. b Phylogenetic tree of the four trans-Saharan haplogroups, aligned to the timeline (at the bottom). At the tip of each lineage, the ethno-geographic affiliation of the corresponding sample is represented by a circle, coloured according to the legend (bottom left). The last Green Sahara period is highlighted by a green belt in the background

Also, interesting excerpts:

The fertile environment established in the Green Sahara probably promoted demographic expansions and rapid dispersals of the human groups, as suggested by the great homogeneity in the material culture of the early Holocene Saharan populations [62]. Our data for all the four trans-Saharan haplogroups are consistent with this scenario, since we found several multifurcated topologies, which can be considered as phylogenetic footprints of demographic expansions. The multifurcated structure of the E-M2 is suggestive of a first demographic expansion, which occurred about 10.5 kya, at the beginning of the last Green Sahara (Fig. 2; Additional file 2: Figure S4). After this initial expansion, we found that most of the trans-Saharan lineages within A3-M13, E-M2 and R-V88 radiated in a narrow time interval at 8–7 kya, suggestive of population expansions that may have occurred in the same time (Fig. 2; Additional file 2: Figures S3, S4 and S6). Interestingly, during roughly the same period, the Saharan populations adopted pastoralism, probably as an adaptive strategy against a short arid period [1, 62, 63]. So, the exploitation of pastoralism resources and the reestablishment of wetter conditions could have triggered the simultaneous population expansions observed here. R-V88 also shows signals of a further and more recent (~ 5.5 kya) Saharan demographic expansion which involved the R-V1589 internal clade. We observed similar demographic patterns in all the other haplogroups in about the same period and in different geographic areas (A3-M13/V3, E-M2/V3862 and E-M78/V32 in the Horn of Africa, E-M2/M191 in the central Sahel/central Africa), in line with the hypothesis that the start of the desertification may have caused massive economic, demographic and social changes [1].

Finally, the onset of the arid conditions at the end of the last African humid period was more abrupt in the eastern Sahara compared to the central Sahara, where an extensive hydrogeological network buffered the climatic changes, which were not complete before ~ 4 kya [6, 62, 64]. Consistent with these local climatic differences, we observed slight differences among the four trans-Saharan haplogroups. Indeed, we found that the contact between northern and sub-Saharan Africa went on until ~ 4.5 kya in the central Sahara, where we mainly found the internal lineages of E-M2 and R-V88 (Additional file 2: Figures S4 and S6). In the eastern Sahara, we found a sharper and more ancient (> 5 kya) differentiation between the people from northern Africa (and, more generally, from the Mediterranean area) and the groups from the eastern sub-Saharan regions (mainly from the Horn of Africa), as testified by the distribution and the coalescence ages of the A3-M13 and E-M78 lineages (Additional file 2: Figures S3 and S5).

Time estimates and frequency maps of the four trans-Saharan haplogroups and major sub-clades. a Time estimates of the four trans-Saharan clades and their main internal lineages. To the left of the timeline, the time windows of the main climatic/historical African events are reported in different colours (legend in the upper left). b Frequency maps of the main trans-Saharan clades and sub-clades. For each map, the relative frequencies (percentages) are reported to the right

R-V88 has been observed at high frequencies in the central Sahel (northern Cameroon, northern Nigeria, Chad and Niger) and it has also been reported at low frequencies in northwestern Africa [37]. Outside the African continent, two rare R-V88 sub-lineages (R-M18 and R-V35) have been observed in Near East and southern Europe (particularly in Sardinia) [30, 37, 38, 39]. Because of its ethno-geographic distribution in the central Sahel, R-V88 has been linked to the spread of the Chadic branch of the Afroasiatic linguistic family [37, 40].

(…) the R-V88 lineages date back to 7.85 kya and its main internal branch (branch 233) forms a “star-like” topology (“Star-like” index = 0.55), suggestive of a demographic expansion. More specifically, 18 out of the 21 sequenced chromosomes belong to branch 233, which includes eight sister clades, five of which are represented by a single subject. The coalescence age of this sub-branch dates back to 5.73 kya, during the last Green Sahara period. Interestingly, the subjects included in the “star-like” structure come from northern Africa or central Sahel, tracing a trans-Saharan axis. It is worth noting that even the three lineages outside the main multifurcation (branches 230, 231 and 232) are sister lineages without any nested sub-structure. The peculiar topology of the R-V88 sequenced samples suggests that the diffusion of this haplogroup was quite rapid and possibly triggered by the Saharan favourable climate (Fig. 2b).

One of the theories I have proposed in the Indo-European demic diffusion model since the first edition, based mainly on phylogeography, is that R1b-V88 lineages probably crossed the Mediterranean through southern Italy into a Green Sahara region, and spread from there through important green corridors, the humid areas between megalakes. Even though this new study, like the rest of them, is based solely on modern samples, and as such is quite prone to error in assessing ancient distributions (as we have seen in Europe), a southern Italian route (probably through Sicily) for R1b-V88 and a late expansion through the Green Sahara seem more and more likely.

If we accept that the migration of R1b-V88 lineages is the last great expansion through a Green Sahara, then this expansion is a potential candidate for the initial Afroasiatic expansion – whereas older haplogroup expansions would represent languages different than Afroasiatic, and more recent haplogroup expansions would represent subsequent expansions of Afroasiatic dialects, like Semitic, Hamitic, Cushitic, or Chadic – as I explained in an older post.

In absolutely shameless speculative terms, then – as is today common in Genetic studies, by the way, so let’s all have some fun here – instead of some sort of R1b/Eurasiatic continuity in Europe, as some autochthonous continuists would like, this could mean that there would be an old Afroasiatic – R1b connection. That would imply:

NOTE. Regarding the contribution of CHG ancestry in the Pontic-Caspian steppe cultures, it is usually explained as caused by exogamy, or by absorption of a previous population (as in the Indo-Iranian case), although a contribution of communities of mainly J subclades to the formation of Neolithic steppe cultures cannot be ruled out. As for some autochthonous continuists’ belief in some sort of mythical mixed steppe people with mixed haplogroups and mixed language, well…

Simple Nostratic tree by Bomhard (2008)

The Pre-Indo-European linguistic situation, before the formation of Neolithic steppe cultures, seems like pure speculation, because a) language macro-families (with the exception of Afroasiatic) are highly speculative, b) sound anthropological models are lacking for them, and c) migrations inferred from haplogroup distributions of modern populations are often incorrect:

  • Haplogroup R could then be argued to be the source of Nostratic, and earlier subclades the source of Starostin’s Borean, given the distribution of its subclades in Asia and the timing of their migrations.
  • But of course one could also argue that, given the comparatively late population expansions that Genomics is showing, which support Western European linguistic schools (whereas Russian Nostraticists tend to date languages further back in time), the R1b (and not R) expansion could be the marker of Nostratic languages, due to its most likely southern path (and its old subclades found in Iran and the Caucasus), which would be more in line with the wet dreams of Europeans proposing R1b autochthonous continuity theories. I like this option far less because of that, but it cannot be ruled out.

If you have read this blog before, you know I profoundly dislike lexicostatistical and glottochronological methods, and I don’t like mass comparisons either. Whereas these methods pretend to apply mathematics to big (raw) data where there is almost no knowledge of what one is doing, comparative grammar applies complex reasoning where there is a lot of partially processed data.

But, it is always fun to ask “what if they were right?” and follow from there…


mtDNA suggest original East Germanic population linked to Jutland Iron Age and Bell Beaker


Open Access article A mosaic genetic structure of the human population living in the South Baltic region during the Iron Age, by Stolarek et al., at Scientific Reports 8:2455 (2018).

About the site:

Kowalewko is a village in Wielkopolskie voivodeship, close to Poznan, in the middle reaches of the Samica Kierska river. Biritual Roman Age cemetery (site 12), dated from the mid-1st to the beginning of 3rd century AD, is located in the featureless arable fields at the South and West of the village.

About the Wielbark culture:

Chronology spans almost all the Roman Iron Age, since ca. 20 AD to ca. 450 AD. The Wielbark culture is associated with the Goths and Gepids, who migrated from Scandinavia towards the Black Sea, and their successors, who, after several centuries, returned to the lands formerly occupied by their ancestors. Typical features of the culture include inhumation graves rich in goods of numerous ornaments frequently of noble metals, while no implements and weapons have been observed and iron objects very rarely. Less frequent cremations. Barrows recorded within cemeteries reflect emergence of elites. The Wielbark communities built stone constructions, including pebbled floors and circles. This culture is mainly known from cemeteries, as settlements, not fortified, are less recognized.

Location of Kowalewko and a scheme of the Kowalewko cemetery site 12, based on the Fig. 3 from the monograph by Tomasz Skorupka, Kowalewko 12. Biritual cemetery of a population of the Wielbark Culture (mid 1st to beginning of 3rd century AD), published in: Marek Chlodnicki [ed.], Archaeological rescue investigations along the gas transit pipeline, vol. II – Wielkopolska, part 3, Poznan 2001, generated using Corel Draw ver. 12.0, with the author permission. Sampled graves are marked with a red color. Europe and Poland maps were downloaded from Wikimedia Commons (https://commons.wikimedia.org), under the free licence, and modified with Corel Draw ver. 12.0.

Interesting excerpts with emphasis added (and some stylistic changes for abbreviations):

Analysis of genetic distances (see Fig. 2b) showed that both Jutland Iron Age (JIA) and Kowalewko (Kow-OVIA), are the closest to the Central Europe Metapopulation (CEM). However, it should be mentioned that many of the resulting genetic proximities did not reach statistical significance at the alpha level 0.05 (mainly due to the multiple comparisons), thus they should be interpreted with caution. Higher prevalence of the mtDNA haplogroup H in Kowalewko and Jutland Iron Age (its high level is also characteristic for the Bell Beaker Culture) than in the preceding Corded Ware Culture (CWC) and Unetic Culture (UC) supports the hypothesis assuming significant demographic changes in Central Europe after the LN/EBA period. This hypothesis is additionally strengthened by the results of AMOVA analysis indicating that there is some inconsistency between genetic distances and the chronology of the appearance of the studied populations in Central Europe, i.e., the older populations (BBC, CWC) contributed more to the genetic structure of CEM than the younger ones (UC).

Changes in the occurrence of mtDNA haplogroups U5a/U5b in Central Europe are also worth noting. At LN and EBA, the prevailing haplogroup was U5a for BBC/CWC/UC. Next, there was a dominance of U5b for the Kow-OVIA/JIA during IA and now U5a is again more popular (CEM). The first alteration in the U5a/U5b prevalence between the LN/EBA and the IA supports the hypothesis of demographic changes right after the LN, proposed by Brandt et al (2013). The second conversion indicated by our results suggests another crucial demographic event that should occur between the IA and present.

On the basis of the above observations, one may assume that in the IA, specific genetic substructures were formed in Central Europe. Because the demographic history of fossil populations often has a local character [33, 34], it is worth considering the range of the observed changes. These considerations should also take into account the hypothesis on the migrations that most likely occurred between the 3rd and 6th century AD. In this context, it seems necessary to compare Kow-OVIA and JIA with other populations from the IA, in particular those located east of Vistula, and with the populations that inhabited this region during the Middle Ages.

PCA2 vs. PCA3 on the haplogroup frequencies of ‘European Population Transect’ populations

Finally, we found that the genetic structures of female and male subpopulations of Kow-OVIA were significantly different. This fact cannot be explicitly determined based on the results of individual analyses; however, it is quite evident if one considers the whole set of data presented here including the Fisher test on haplogroup frequencies. The analyses of both mtDNA haplogroups and genetic distances indicated that women from Kowalewko were related closer to the EN/MN populations, and the men were closer to the CWC and UC. This observation may explain why the genetic relationships of Kow-OVIA with other ancient European populations were more complex and more difficult to define as it was in the case of JIA. In analyzing Kow-OVIA, we observed multiple overlapping effects of two subpopulations with different genetic affinities. One would speculate that the genetic profile of Kow-OVIA-F resulted from exogamy that was described for the CWC population. This is, however, not the case. We found that the genetic differences between women and men were maintained for the entire observation period, i.e., for 200 years (approximately 8 generations). Such a composition of the genetic structure of Kow-OVIA could exist only if at least one subgroup (Kow-OVIA-F or -M) was periodically exchanged. It would further mean that Kowalewko played some specific roles in that region. According to the recent archaeological studies, the colonization pattern in IA Greater Poland could be linked with the existence of a centralized organization system [32]. Kowalewko could have been one of the important elements of this system. For example, it could have functioned as a garrison for the population closely associated with the JIA, such that warriors stayed in the garrison for only a few years and were then replaced by others. Other scenarios are also possible; however, verification of any hypothesis requires more detailed studies.
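For readers unfamiliar with the Fisher test mentioned in the excerpt, here is a minimal sketch of a two-sided Fisher exact test on a 2×2 table of haplogroup counts. The counts are invented purely for illustration and are not taken from the paper; the function name `fisher_exact_two_sided` is also my own, and the authors may well have used a different implementation or a larger contingency table.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts (NOT from the paper):
# rows = women / men, columns = carriers of some haplogroup / everyone else.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 4))  # 0.035
```

With such small counts, a p-value just under 0.05 would not survive correction for multiple comparisons, which is the same caveat the authors raise about their genetic distances.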

All in all, we know that Wielbark probably represented the initial migration period of East Germanic tribes, traditionally believed to be from Northern Scandinavia, into territory later inhabited by Slavic tribes (and potentially earlier by a Balto-Slavic community).

Other than that, the results show some potential for a stable genomic situation in the Germanic homeland in terms of mtDNA, common after the Bell Beaker expansion, which probably brought Pre-Germanic to Scandinavia.

Nevertheless, only a comprehensive study of all Germanic regions from that period (whole-genome and Y-DNA) might shed light on the real origin of East Germanic peoples, and thus on their contended dialectal position. We already know that certain modern Slavic and Germanic populations cluster closely with some Bronze Age communities of the same region, so differences during the Iron Age may already be quite subtle.

In my humble opinion, the paper offers too many hypotheses for too little interesting data, as is more and more usual in genetic papers. I guess journals expect that to get more attention, although serious reviewers should actually encourage the opposite, and only informal blogs like this one should come up with far-fetched theories, instead of rebutting them…


Ancient Phoenician mtDNA from Sardinia, Lebanon reflects settlement, genetic diversity, and female mobility


New article at PLOS One, Ancient mitogenomes of Phoenicians from Sardinia and Lebanon: A story of settlement, integration, and female mobility, by Matisoo-Smith et al. (2018).


The Phoenicians emerged in the Northern Levant around 1800 BCE and by the 9th century BCE had spread their culture across the Mediterranean Basin, establishing trading posts, and settlements in various European Mediterranean and North African locations. Despite their widespread influence, what is known of the Phoenicians comes from what was written about them by the Greeks and Egyptians. In this study, we investigate the extent of Phoenician integration with the Sardinian communities they settled. We present 14 new ancient mitogenome sequences from pre-Phoenician (~1800 BCE) and Phoenician (~700–400 BCE) samples from Lebanon (n = 4) and Sardinia (n = 10) and compare these with 87 new complete mitogenomes from modern Lebanese and 21 recently published pre-Phoenician ancient mitogenomes from Sardinia to investigate the population dynamics of the Phoenician (Punic) site of Monte Sirai, in southern Sardinia. Our results indicate evidence of continuity of some lineages from pre-Phoenician populations suggesting integration of indigenous Sardinians in the Monte Sirai Phoenician community. We also find evidence of the arrival of new, unique mitochondrial lineages, indicating the movement of women from sites in the Near East or North Africa to Sardinia, but also possibly from non-Mediterranean populations and the likely movement of women from Europe to Phoenician sites in Lebanon. Combined, this evidence suggests female mobility and genetic diversity in Phoenician communities, reflecting the inclusive and multicultural nature of Phoenician society.

Haplogroup assignments, dates, locations and Genbank accession details of all aDNA samples included in analyses.

Featured image, from the article: Map showing Phoenician maritime expansions across the Mediterranean starting from around 800 BCE. Arrows indicate maritime movement. Blue dots indicate coastal sites and pink shaded areas indicate the extent of Phoenician settlements. https://doi.org/10.1371/journal.pone.0190169.g001


We are all special, which also means that none of us is


Adam Rutherford writes You’re Descended from Royalty and So Is Everybody Else – Anybody you can name from ancient history is in your family tree, which I discovered via John Hawks’ new post The surprising connectedness of human genealogies over centuries.


One way to think of it is to accept that everyone of European descent should have billions of ancestors at a time in the 10th century, but there weren’t billions of people around then, so try to cram them into the number of people that actually were. The math that falls out of that apparent impasse is that all of the billions of lines of ancestry have coalesced into not just a small number of people, but effectively literally everyone who was alive at that time. So, by inference, if Charlemagne was alive in the ninth century, which we know he was, and he left descendants who are alive today, which we also know is true, then he is the ancestor of everyone of European descent alive in Europe today.
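The arithmetic behind that argument is easy to sketch. Each generation back doubles the number of slots in a pedigree, so the count of nominal ancestors quickly outgrows any real population and the lines must coalesce. The snippet below is only an illustration under stated assumptions: the 25-year generation length and the population figure of 50 million are placeholders of mine, not numbers from Rutherford's article.

```python
GENERATION_YEARS = 25  # assumed average generation length, for illustration only


def pedigree_slots(generations: int) -> int:
    """Number of ancestor slots in a pedigree a given number of generations back."""
    return 2 ** generations


def generations_until_exceeds(population: int) -> int:
    """Smallest number of generations at which pedigree slots outnumber a population."""
    g = 0
    while pedigree_slots(g) <= population:
        g += 1
    return g


# Roughly 40 generations separate us from the 10th century at 25 years each.
print(pedigree_slots(40))  # 1099511627776 nominal ancestors

# Hypothetical population size (a placeholder, not a historical estimate).
g = generations_until_exceeds(50_000_000)
print(g, g * GENERATION_YEARS)  # 26 650
```

Once the slots outnumber the people who were actually alive, the same individuals must fill many slots at once, which is the coalescence the quote describes.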

Since most of this blog’s posts support academic disciplines looking for answers to the Indo-European question, and constantly give reasons against modern genetic (and phylogenetic) identification, I think it is worth at least a quick read for anyone interested in the field.

I recently referred to the interesting series of posts by Graham Coop on this matter.

Featured image: Europe around 800; the map is public domain, from the Historical Atlas (New York, 1911)