R1b-V88 migration through Southern Italy into Green Sahara corridor, and the Afroasiatic connection

Open access article: "The peopling of the last Green Sahara revealed by high-coverage resequencing of trans-Saharan patrilineages", by D’Atanasio, Trombetta, Bonito, et al., Genome Biology (2018) 19:20.


Little is known about the peopling of the Sahara during the Holocene climatic optimum, when the desert was replaced by a fertile environment.

In order to investigate the role of the last Green Sahara in the peopling of Africa, we deep-sequence the whole non-repetitive portion of the Y chromosome in 104 males selected as representative of haplogroups which are currently found to the north and to the south of the Sahara. We identify 5,966 mutations, from which we extract 142 informative markers then genotyped in about 8,000 subjects from 145 African, Eurasian and African American populations. We find that the coalescence age of the trans-Saharan haplogroups dates back to the last Green Sahara, while most northern African or sub-Saharan clades expanded locally in the subsequent arid phase.

Our findings suggest that the Green Sahara promoted human movements and demographic expansions, possibly linked to the adoption of pastoralism. Comparing our results with previously reported genome-wide data, we also find evidence for a sex-biased sub-Saharan contribution to northern Africans, suggesting that historical events such as the trans-Saharan slave trade mainly contributed to the mtDNA and autosomal gene pool, whereas the northern African paternal gene pool was mainly shaped by more ancient events.


Maximum parsimony Y chromosome tree and dating of the four trans-Saharan haplogroups. (a) Phylogenetic relations among the 150 samples analysed here. Each haplogroup is labelled in a different colour. The four Y sequences from ancient samples are marked by the dagger symbol. (b) Phylogenetic tree of the four trans-Saharan haplogroups, aligned to the timeline (at the bottom). At the tip of each lineage, the ethno-geographic affiliation of the corresponding sample is represented by a circle, coloured according to the legend (bottom left). The last Green Sahara period is highlighted by a green belt in the background.

Also, interesting excerpts:

The fertile environment established in the Green Sahara probably promoted demographic expansions and rapid dispersals of the human groups, as suggested by the great homogeneity in the material culture of the early Holocene Saharan populations [62]. Our data for all the four trans-Saharan haplogroups are consistent with this scenario, since we found several multifurcated topologies, which can be considered as phylogenetic footprints of demographic expansions. The multifurcated structure of the E-M2 is suggestive of a first demographic expansion, which occurred about 10.5 kya, at the beginning of the last Green Sahara (Fig. 2; Additional file 2: Figure S4). After this initial expansion, we found that most of the trans-Saharan lineages within A3-M13, E-M2 and R-V88 radiated in a narrow time interval at 8–7 kya, suggestive of population expansions that may have occurred in the same time (Fig. 2; Additional file 2: Figures S3, S4 and S6). Interestingly, during roughly the same period, the Saharan populations adopted pastoralism, probably as an adaptive strategy against a short arid period [1, 62, 63]. So, the exploitation of pastoralism resources and the reestablishment of wetter conditions could have triggered the simultaneous population expansions observed here. R-V88 also shows signals of a further and more recent (~ 5.5 kya) Saharan demographic expansion which involved the R-V1589 internal clade. We observed similar demographic patterns in all the other haplogroups in about the same period and in different geographic areas (A3-M13/V3, E-M2/V3862 and E-M78/V32 in the Horn of Africa, E-M2/M191 in the central Sahel/central Africa), in line with the hypothesis that the start of the desertification may have caused massive economic, demographic and social changes [1].

Finally, the onset of the arid conditions at the end of the last African humid period was more abrupt in the eastern Sahara compared to the central Sahara, where an extensive hydrogeological network buffered the climatic changes, which were not complete before ~ 4 kya [6, 62, 64]. Consistent with these local climatic differences, we observed slight differences among the four trans-Saharan haplogroups. Indeed, we found that the contact between northern and sub-Saharan Africa went on until ~ 4.5 kya in the central Sahara, where we mainly found the internal lineages of E-M2 and R-V88 (Additional file 2: Figures S4 and S6). In the eastern Sahara, we found a sharper and more ancient (> 5 kya) differentiation between the people from northern Africa (and, more generally, from the Mediterranean area) and the groups from the eastern sub-Saharan regions (mainly from the Horn of Africa), as testified by the distribution and the coalescence ages of the A3-M13 and E-M78 lineages (Additional file 2: Figures S3 and S5).

Time estimates and frequency maps of the four trans-Saharan haplogroups and major sub-clades. (a) Time estimates of the four trans-Saharan clades and their main internal lineages. To the left of the timeline, the time windows of the main climatic/historical African events are reported in different colours (legend in the upper left). (b) Frequency maps of the main trans-Saharan clades and sub-clades. For each map, the relative frequencies (percentages) are reported to the right.

R-V88 has been observed at high frequencies in the central Sahel (northern Cameroon, northern Nigeria, Chad and Niger) and it has also been reported at low frequencies in northwestern Africa [37]. Outside the African continent, two rare R-V88 sub-lineages (R-M18 and R-V35) have been observed in Near East and southern Europe (particularly in Sardinia) [30, 37, 38, 39]. Because of its ethno-geographic distribution in the central Sahel, R-V88 has been linked to the spread of the Chadic branch of the Afroasiatic linguistic family [37, 40].

(…) the R-V88 lineages date back to 7.85 kya and its main internal branch (branch 233) forms a “star-like” topology (“Star-like” index = 0.55), suggestive of a demographic expansion. More specifically, 18 out of the 21 sequenced chromosomes belong to branch 233, which includes eight sister clades, five of which are represented by a single subject. The coalescence age of this sub-branch dates back to 5.73 kya, during the last Green Sahara period. Interestingly, the subjects included in the “star-like” structure come from northern Africa or central Sahel, tracing a trans-Saharan axis. It is worth noting that even the three lineages outside the main multifurcation (branches 230, 231 and 232) are sister lineages without any nested sub-structure. The peculiar topology of the R-V88 sequenced samples suggests that the diffusion of this haplogroup was quite rapid and possibly triggered by the Saharan favourable climate (Fig. 2b).
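To make the counts in that excerpt easier to follow, here is a minimal sketch that rebuilds the described R-V88 topology as a toy tree and checks that the numbers add up. The branch numbers (230 to 233) and the figures 21, 18, 8 and 5 come from the excerpt. Everything else is an assumption for illustration only: the clade names, the 5/4/4 split of the 13 chromosomes in the three larger clades, and the placement of all four branches directly under one root. This is not the paper's method and it does not compute their "star-like" index.

```python
# Toy tree: a dict maps a branch name to its sub-branches; an int is the
# number of sequenced chromosomes in a clade whose inner structure we ignore.
# Hypothetical: clade names, the 5/4/4 split, and the root-level placement.
r_v88 = {
    "branch_230": 1,
    "branch_231": 1,
    "branch_232": 1,
    "branch_233": {
        "clade_a": 5, "clade_b": 4, "clade_c": 4,      # assumed split of 13
        "single_1": 1, "single_2": 1, "single_3": 1,   # five clades with a
        "single_4": 1, "single_5": 1,                  # single subject each
    },
}


def leaf_count(node):
    """Number of sequenced chromosomes below a node."""
    if isinstance(node, int):
        return node
    return sum(leaf_count(child) for child in node.values())


def multifurcations(node, name="R-V88"):
    """Yield (name, n_children) for every node with more than two children."""
    if isinstance(node, int):
        return
    if len(node) > 2:
        yield name, len(node)
    for child_name, child in node.items():
        yield from multifurcations(child, child_name)


main_branch = r_v88["branch_233"]
total = leaf_count(r_v88)
in_main = leaf_count(main_branch)
singletons = sum(1 for child in main_branch.values() if child == 1)
share = in_main / total  # fraction of sequenced chromosomes in branch 233

print(f"{in_main} of {total} chromosomes fall in branch 233 ({share:.0%})")
print("multifurcating nodes:", list(multifurcations(r_v88)))
```

The point of the exercise is simply that 18 chromosomes in branch 233 plus the three unstructured lineages outside it account for all 21 sequenced R-V88 samples, and that a node with eight direct sister clades is what the authors read as a footprint of rapid expansion.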

One of the theories I proposed in the Indo-European demic diffusion model since the first edition, based mainly on phylogeography, is that R1b-V88 lineages probably crossed the Mediterranean through southern Italy into a Green Sahara region, and spread from there through important green corridors, the humid areas between megalakes. Even though this new study, like the rest of them, is based solely on modern samples, and is therefore quite prone to error in assessing ancient distributions (as we have seen in Europe), it seems more and more likely that R1b-V88 took a southern Italian route (probably through Sicily) and expanded late through the Green Sahara.

If we accept that the migration of R1b-V88 lineages is the last great expansion through a Green Sahara, then this expansion is a potential candidate for the initial Afroasiatic expansion. Older haplogroup expansions would then represent languages other than Afroasiatic, and more recent haplogroup expansions would represent subsequent expansions of Afroasiatic dialects, like Semitic, Hamitic, Cushitic, or Chadic, as I explained in an older post.

In absolutely shameless speculative terms, then (as is common today in genetic studies, by the way, so let’s all have some fun here), instead of some sort of R1b/Eurasiatic continuity in Europe, as some autochthonous continuists would like, this could mean that there is an old connection between Afroasiatic and R1b. That would imply:

NOTE. Regarding the contribution of CHG ancestry in the Pontic-Caspian steppe cultures, it is usually explained as caused by exogamy, or by absorption of a previous population (as in the Indo-Iranian case), although a contribution of communities of mainly J subclades to the formation of Neolithic steppe cultures cannot be ruled out. As for some autochthonous continuists’ belief in some sort of mythical mixed steppe people with mixed haplogroups and mixed language, well…

Simple Nostratic tree by Bomhard (2008)

The Pre-Indo-European linguistic situation, before the formation of Neolithic steppe cultures, seems like pure speculation, because a) language macro-families (with the exception of Afroasiatic) are highly speculative, b) sound anthropological models are lacking for them, and c) migrations inferred from haplogroup distributions of modern populations are often incorrect:

  • Haplogroup R could then be argued to be the source of Nostratic, and earlier subclades the source of Starostin’s Borean, given the distribution of its subclades in Asia and the timing of their migrations.
  • But of course one could also argue that, given the comparatively late population expansions that genomics is showing (which support Western European linguistic schools, whereas Russian Nostraticists tend to date languages further back in time), the R1b (and not R) expansion could be the marker of Nostratic languages, due to its most likely southern path (and its old subclades found in Iran and the Caucasus), which would be more in line with the wet dreams of Europeans proposing R1b autochthonous continuity theories. I like this option far less because of that, but it cannot be ruled out.

If you have read this blog before, you know I profoundly dislike lexicostatistical and glottochronological methods, and I don’t like mass comparisons either. Whereas these methods pretend to apply mathematics to big (raw) data where there is almost no knowledge of what one is doing, comparative grammar applies complex reasoning where there is a lot of partially processed data.

But, it is always fun to ask “what if they were right?” and follow from there…
