Sorry for the last few weeks of silence; I have been rather busy lately. I have more projects going on and, because of that, I also wanted to finish one I have been working on for many months already.
I have therefore decided to publish a provisional version of the text, in the hope that it will be useful in the following months, when I won’t be able to update it as often as I would like to:
EDIT (20 JAN 2019): For those of you who are more comfortable reading in your native language, I have placed some links to automatic translations by Google Translate. They might work especially well for the texts of A Game of Clans & A Clash of Chiefs.
Don’t forget to check out the maps included in the supplementary materials: I have added Y-DNA, mtDNA, and ADMIXTURE data using GIS software. The PCA graphics are also important to follow the main text.
NOTE. I have uploaded the files to Academia.edu and ResearchGate, in case the websites are too slow.
I would have preferred to wait for a thorough revision of the section on archaeology and the linguistic sections on Uralic, but I doubt I will have time when the reviews come, so it was either now or maybe next December…
I say so in the introduction, but it is evident that certain aspects of the book are tentative to say the least: the farther back we go from Late Proto-Indo-European, the less clear are many aspects. Also, linguistically I am not convinced about Eurasiatic or Nostratic, although they do have a certain interest when we try to offer a comprehensive view of the past, including ethnolinguistic identities.
I cannot be an expert in everything, and these books cover a lot. I am bound to publish many corrections as new information appears and more reviews are sent. For example, just days ago (before SNP calls of Wang et al. 2018 were published) some paragraphs implied that AME might have expanded Nostratic from the Middle East. Now it does not seem so, and I changed them just before uploading the text. That’s how tentative certain routes are, and how much all of this may change. And that only if we accept a Nostratic phylum…
NOTE. Since the first book I wrote was the linguistic one, and I have spent the last months updating the archaeology + genetics part, now many of you will probably understand 1) why I am so convinced about certain language relationships and 2) how I used many posts to clarify certain ideas and receive comments. Many posts probably offer a good timeline of what I worked with, and when.
I did not add this section to the books, because they are still not ready for print, but I think this is due somewhere now. It is impossible to reference all who have directly or indirectly contributed to this, so this is a list of those I feel have played an important role.
I am indebted to the following people (which does not mean that they share my views, obviously):
First and foremost, to Fernando López-Menchero, for having the patience to review in detail many parts on Indo-European linguistics, knowing that I won’t accept many of his comments anyway. The additional information he offers is invaluable, but I didn’t want to turn this into a huge linguistic encyclopaedia with unending discussions of tiny details of each reconstructed word. I think it is already too big as it is.
I would not have thought about doing this if it were not for the interest of Wekwos (Xavier Delamarre) in publishing a full book about the Indo-European demic diffusion model (in the second half of 2017, I think). It was they who suggested that I extend the content, when all I had done until then was write an essay and draw some maps in my free time between depositing the PhD thesis and defending it.
Sadly, as much as I would like to publish a book with a professional publisher, I don’t think ancient DNA lends itself to the traditional format, so my requests (mainly to have free licenses and to be able to review the text at will, as new genetic papers are published) were logically not acceptable. Also, the main aim of all volumes, especially the linguistic one, is the teaching of essentials of Late Proto-Indo-European and related languages, and this objective would be thwarted by selling each volume for $50-70 and only in printed format. I prefer a wider distribution.
At first I didn’t think much of this proposal, because I do not benefit from this kind of publication in my scientific field, but with time my interest in writing a whole, comprehensive book on the subject grew to the point where it was already an ongoing project, probably by the start of 2018.
I would not have been in contact with Wekwos if it were not for user Camulogène Rix at Anthrogenica, so thanks for that and for the interest in this work.
I would not have thought of writing this either if not for the spontaneous support (with an unexpected phone call!) of a professor of the Complutense University of Madrid, Ángel Gómez Moreno, who is interested in this subject – as is his wife, a professor of Classics more closely associated with Indo-European studies, who helped me with a search for Indo-Europeanists.
EDIT (1 JAN 2019): I remembered that Karin Bojs sent me her book after reading the demic diffusion model. I may have also thought about writing a whole book back then, but mid-2017 is probably too early for the project.
Professor Kortlandt is still to review the text, but he contributed to both previous essays in some very interesting ways, so I hope he can help me improve the parts on Uralic, and maybe alternative accounts of expansion for Balto-Slavic, depending on the time depth that he would consider warranted according to the Temematic hypothesis.
The maps are evidently (for those who are interested in genetics) in part the result of the effort of the late Jean Manco: As you can see from the maps including Y-DNA and mtDNA samples, I have benefitted from her way of organising data and publishing it. Similarly, the work of Iain McDonald in assessing the potential migration routes of R1b and R1a in Europe with the help of detailed maps was behind my idea for the first maps, and consequently behind these, too.
Readers of this blog with interesting comments have also been essential for the improvement of the texts. You can probably see some of your many contributions there. I may not answer many comments, because I am always busy (and sometimes I just don’t have anything interesting to say), but I try to read all of them.
Users of other sites, like Anthrogenica, whose particular points of view and deep knowledge of some very specific aspects are sometimes very useful. In particular, user Anglesqueville helped me to fix some issues with the merging of datasets to obtain the PCAs and ADMIXTURE, and prepared some individual samples to merge them.
Even when I do not post anything, Google Analytics keeps sending me messages about increasing user fidelity (returning users), and stats haven’t really changed (which probably means more people are reading old posts), so thank you for that.
Interesting excerpts (emphasis mine, reference numbers deleted for clarity):
The material culture of the Late Chalcolithic period in the southern Levant contrasts qualitatively with that of earlier and later periods in the same region. The Late Chalcolithic in the Levant is characterized by increases in the density of settlements, introduction of sanctuaries, utilization of ossuaries in secondary burials, and expansion of public ritual practices as well as an efflorescence of symbolic motifs sculpted and painted on artifacts made of pottery, basalt, copper, and ivory. The period’s impressive metal artifacts, which reflect the first known use of the “lost wax” technique for casting of copper, attest to the extraordinary technical skill of the people of this period.
The distinctive cultural characteristics of the Late Chalcolithic period in the Levant (often related to the Ghassulian culture, although this term is not in practice applied to the Galilee region where this study is based) have few stylistic links to the earlier or later material cultures of the region, which has led to extensive debate about the origins of the people who made this material culture. One hypothesis is that the Chalcolithic culture in the region was spread in part by immigrants from the north (i.e., northern Mesopotamia), based on similarities in artistic designs. Others have suggested that the local populations of the Levant were entirely responsible for developing this culture, and that any similarities to material cultures to the north are due to borrowing of ideas and not to movements of people.
Previous genome-wide ancient DNA studies from the Near East have revealed that at the time when agriculture developed, populations from Anatolia, Iran, and the Levant were approximately as genetically differentiated from each other as present-day Europeans and East Asians are today. By the Bronze Age, however, expansion of different Near Eastern agriculturalist populations – Anatolian, Iranian, and Levantine – in all directions and admixture with each other substantially homogenized populations across the region, thereby contributing to the relatively low genetic differentiation that prevails today. [Lazaridis et al.] showed that the Levant Bronze Age population from the site of ‘Ain Ghazal, Jordan (2490–2300 BCE) could be fit statistically as a mixture of around 56% ancestry from a group related to Levantine Pre-Pottery Neolithic agriculturalists (represented by ancient DNA from Motza, Israel and ‘Ain Ghazal, Jordan; 8300–6700 BCE) and 44% related to populations of the Iranian Chalcolithic (Seh Gabi, Iran; 4680–3662 calBCE). [Haber et al.] suggested that the Canaanite Levant Bronze Age population from the site of Sidon, Lebanon (~1700 BCE) could be modeled as a mixture of the same two groups albeit in different proportions (48% Levant Neolithic-related and 52% Iran Chalcolithic-related). However, the Neolithic and Bronze Age sites analyzed so far in the Levant are separated in time by more than three thousand years, making the study of samples that fill in this gap, such as those from Peqi’in, of critical importance.
This procedure produced genome-wide data from 22 ancient individuals from Peqi’in Cave (4500–3900 calBCE) (…)
We find that the individuals buried in Peqi’in Cave represent a relatively genetically homogenous population. This homogeneity is evident not only in the genome-wide analyses but also in the fact that most of the male individuals (nine out of ten) belong to the Y-chromosome haplogroup T, a lineage thought to have diversified in the Near East. This finding contrasts with both earlier (Neolithic and Epipaleolithic) Levantine populations, which were dominated by haplogroup E, and later Bronze Age individuals, all of whom belonged to haplogroup J.
Our finding that the Levant_ChL population can be well-modeled as a three-way admixture between Levant_N (57%), Anatolia_N (26%), and Iran_ChL (17%), while the Levant_BA_South can be modeled as a mixture of Levant_N (58%) and Iran_ChL (42%), but has little if any additional Anatolia_N-related ancestry, can only be explained by multiple episodes of population movement. The presence of Iran_ChL-related ancestry in both populations – but not in the earlier Levant_N – suggests a history of spread into the Levant of peoples related to Iranian agriculturalists, which must have occurred at least by the time of the Chalcolithic. The Anatolian_N component present in the Levant_ChL but not in the Levant_BA_South sample suggests that there was also a separate spread of Anatolian-related people into the region. The Levant_BA_South population may thus represent a remnant of a population that formed after an initial spread of Iran_ChL-related ancestry into the Levant that was not affected by the spread of an Anatolia_N-related population, or perhaps a reintroduction of a population without Anatolia_N-related ancestry to the region. We additionally find that the Levant_ChL population does not serve as a likely source of the Levantine-related ancestry in present-day East African populations.
These genetic results have striking correlates to material culture changes in the archaeological record. The archaeological finds at Peqi’in Cave share distinctive characteristics with other Chalcolithic sites, both to the north and south, including secondary burial in ossuaries with iconographic and geometric designs. It has been suggested that some Late Chalcolithic burial customs, artifacts and motifs may have had their origin in earlier Neolithic traditions in Anatolia and northern Mesopotamia. Some of the artistic expressions have been related to finds and ideas and to later religious concepts such as the gods Inanna and Dumuzi from these more northern regions. The knowledge and resources required to produce metallurgical artifacts in the Levant have also been hypothesized to come from the north.
Our finding of genetic discontinuity between the Chalcolithic and Early Bronze Age periods also resonates with aspects of the archeological record marked by dramatic changes in settlement patterns, large-scale abandonment of sites, many fewer items with symbolic meaning, and shifts in burial practices, including the disappearance of secondary burial in ossuaries. This supports the view that profound cultural upheaval, leading to the extinction of populations, was associated with the collapse of the Chalcolithic culture in this region.
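To make the mixture proportions in the excerpts above more concrete, here is a minimal sketch of what "modeled as a mixture" means at the level of allele frequencies. It is only an illustration, not the paper's method, which is far more sophisticated than a direct frequency fit. The function names (`mix`, `estimate_two_way`) and all allele frequencies are made up; only the proportions (57/26/17 and 58/42) come from the quoted text.

```python
def mix(sources, weights):
    """Expected allele frequencies of a population formed by mixing sources."""
    if len(sources) != len(weights):
        raise ValueError("one weight per source population is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture proportions must sum to 1")
    n_snps = len(sources[0])
    return [
        sum(w * src[i] for w, src in zip(weights, sources))
        for i in range(n_snps)
    ]


def estimate_two_way(target, src1, src2):
    """Least-squares share of src1 in a two-way mixture of src1 and src2."""
    num = sum((t - b) * (a - b) for t, a, b in zip(target, src1, src2))
    den = sum((a - b) ** 2 for a, b in zip(src1, src2))
    if den == 0:
        raise ValueError("sources are identical; the proportion is undefined")
    return num / den


# Hypothetical allele frequencies at four SNPs (NOT real data).
levant_n = [0.10, 0.80, 0.35, 0.60]
anatolia_n = [0.30, 0.50, 0.70, 0.20]
iran_chl = [0.90, 0.20, 0.15, 0.40]

# Proportions quoted in the excerpt: three-way and two-way models.
levant_chl = mix([levant_n, anatolia_n, iran_chl], [0.57, 0.26, 0.17])
levant_ba_south = mix([levant_n, iran_chl], [0.58, 0.42])

# Recover the Levant_N share of the simulated Bronze Age population.
levant_share = estimate_two_way(levant_ba_south, levant_n, iran_chl)
print(round(levant_share, 2))
```

With real data the target is never an exact mixture of the sampled sources, so any such estimate comes with standard errors and a test of model fit, which this toy example leaves out entirely.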
I think the most interesting aspect of this paper is – as usual – the expansion of peoples associated with a single Y-DNA haplogroup. Given that the expansion of Semitic languages in the Middle East – like that of Anatolian languages from the north – must have happened after ca. 3100 BC, coinciding with the collapse of the Uruk period, these Chalcolithic north Levant peoples are probably not related to the later Semitic expansion in the region. This can be said to be supported by their lack of relationship with later Levantine migrations into Africa. The replacement of haplogroup E before the arrival of haplogroup J suggests still more clearly that Natufians and their main haplogroup were not related to the Afroasiatic expansions.
On the other hand, while their ancestry points to neighbouring regional origins, their haplogroup T1a1a (probably T1a1a1b2) may be closely related to that of other Semitic peoples to the south, as found in east Africa and Arabia. This may be due either to a northward migration of these Chalcolithic Levantine peoples from southern regions in the 5th millennium BC, or maybe to a later migration of Semitic peoples from the Levant to the south, coupled with the expansion of this haplogroup, but associated with a distinct population. As we know, ancestry can change within a few generations of intense admixture, while Y-DNA haplogroups are not commonly admixed in prehistoric population expansions.
Without more data from ancient DNA, it is difficult to say. Haplogroup T1a1 is found in Morocco (ca. 3780-3650 calBC), which could point to a recent expansion of a Berbero-Semitic branch; but also in a Balkans Neolithic sample ca. 5800-5400 calBCE, which could suggest an Anatolian origin of the specific subclades encountered here. In any case, a potential origin of Proto-Semitic anywhere near this wide Near Eastern region ca. 4500-3500 BC cannot be ruled out, knowing that their ancestors probably came from Africa.
Also interesting in this paper is that we are yet to find a single prehistoric population expansion not associated with a reduction of variability and expansion of Y-DNA haplogroups. It seems that the supposedly mixed Yamna community remains the only (hypothetical) example in history where expanding patrilineal clans would not share a Y-DNA haplogroup…
The Sahara was wetter and greener during multiple interglacial periods of the Quaternary, when some have suggested it featured very large (mega) lakes, ranging in surface area from 30,000 to 350,000 km2. In this paper, we review the physical and biological evidence for these large lakes, especially during the African Humid Period (AHP) 11–5 ka. Megalake systems from around the world provide a checklist of diagnostic features, such as multiple well-defined shoreline benches, wave-rounded beach gravels where coarse material is present, landscape smoothing by lacustrine sediment, large-scale deltaic deposits, and in places, tufas encrusting shorelines. Our survey reveals no clear evidence of these features in the Sahara, except in the Chad basin. Hydrologic modeling of the proposed megalakes requires mean annual rainfall ≥1.2 m/yr and a northward displacement of tropical rainfall belts by ≥1000 km. Such a profound displacement is not supported by other paleo-climate proxies and comprehensive climate models, challenging the existence of megalakes in the Sahara. Rather than megalakes, isolated wetlands and small lakes are more consistent with the Sahelo-Sudanian paleoenvironment that prevailed in the Sahara during the AHP. A pale-green and discontinuously wet Sahara is the likelier context for human migrations out of Africa during the late Quaternary.
The whole review is an interesting read, but here are some relevant excerpts:
Various researchers have suggested that megalakes coevally covered portions of the Sahara during the AHP and previous periods, such as paleolakes Chad, Darfur, Fezzan, Ahnet-Mouydir, and Chotts (Fig. 2, Table 2). These proposed paleolakes range in size by an order of magnitude in surface area from the Caspian Sea–scale paleo-Lake Chad at 350,000 km2 to Lake Chotts at 30,000 km2. At their maximum, megalakes would have covered ~ 10% of the central and western Sahara, similar to the coverage by megalakes Victoria, Malawi, and Tanganyika in the equatorial tropics of the African Rift today. This observation alone should raise questions of the existence of megalakes in the Sahara, and especially if they developed coevally. Megalakes, because of their significant depth and area, generate large waves that become powerful modifiers of the land surface and leave conspicuous and extensive traces in the geologic record.
Lakes, megalakes, and wetlands
Active ground-water discharge systems abound in the Sahara today, although they were much more widespread in the AHP. They range from isolated springs and wet ground in many oases scattered across the Sahara (e.g., Haynes et al., 1989) to wetlands and small lakes (Kröpelin et al., 2008). Ground water feeding these systems is dominated by fossil AHP-age and older water (e.g., Edmunds and Wright 1979; Sonntag et al., 1980), although recently recharged water (<50 yr) has been locally identified in Saharan ground water (e.g., Sultan et al., 2000; Maduapuchi et al., 2006).
In our view, Lake Chad is the only former megalake in the Sahara firmly documented by sedimentologic and geomorphic evidence. Mega-Lake Chad is thought to have covered ~ 345,000 km2, stretching for nearly 8° (10–18°N) of latitude (Ghienne et al., 2002) (Fig. 2). The presence of paleo-Lake Chad was at one point challenged, but several – and in our view very robust – lines of evidence have been presented to support its development during the AHP. These include: (1) clear paleo-shorelines at various elevations, visible on the ground (Abafoni et al., 2014) and in radar and satellite images (Schuster et al., 2005; Drake and Bristow, 2006; Bouchette et al., 2010); (2) sand spits and shoreline berms (Thiemeyer, 2000; Abafoni et al., 2014); and (3) evaporites and aquatic fauna such as fresh-water mollusks and diatoms in basin deposits (e.g., Servant, 1973; Servant and Servant, 1983). Age determinations for all but the Holocene history of mega-Lake Chad are sparse, but there is evidence for Mio-Pliocene lake(s) (Lebatard et al., 2010) and major expansion of paleo-Lake Chad during the AHP (LeBlanc et al., 2006; Schuster et al., 2005; Abafoni et al., 2014; summarized in Armitage et al., 2015) up to the basin overflow level at ~ 329 m asl.
Insights from hydrologic mass balance of megalakes
Using these conservative conditions (i.e., erring in the direction that will support megalake formation), our hydrologic models for the two biggest central Saharan megalakes (Darfur and Fezzan) require minimum annual average rainfall amounts of ~ 1.1 m/yr to balance moisture losses from their respective basins (Supplementary Table S1). Lake Chad required a similar amount (~1 m/yr; Supplementary Table S1) during the AHP according to our calculations, but this is plausible, because even today the southern third of the Chad basin receives ≥1.2 m/yr (Fig. 2) and experiences a climate similar to Lake Victoria. A modest 5° shift in the rainfall belt would bring this moist zone northward to cover a much larger portion of the Chad basin, which spans N13° ±7°. Estimated rainfall rates for Darfur and Fezzan are slightly less than the average of ~ 1.3 m/yr for the Lake Victoria basin, because of the lower aw values, that is, smaller areas of Saharan megalakes compared with their respective drainage basins (Fig. 15).
Estimates of paleo-rainfall during the AHP
Here major contradictions develop between the model outcomes and paleo-vegetation evidence, because our Sahelo-Sudanian hydrologic model predicts wetter conditions and therefore more tropical vegetation assemblages than found around Lake Victoria today. In fact, none of the very wet rainfall scenarios required by all our model runs can be reconciled with the relatively dry conditions implied by the fossil plant and animal evidence. In short, megalakes cannot be produced in Sahelo-Sudanian conditions past or present; to form, they require a tropical or subtropical setting, and major displacements of the African monsoon or extra-desert moisture sources.
If not megalakes, what size lakes, marshes, discharging springs, and flowing rivers in the Sahara were sustainable in Sahelo-Sudanian climatic conditions? For lakes and perennial rivers to be created and sustained, net rainfall in the basin has to exceed loss to evapotranspiration, evaporation, and infiltration, yielding runoff that then supplies a local lake or river. Our hydrologic models (see Supplementary Material) and empirical observations (Gash et al., 1991; Monteith, 1991) for the Sahel suggest that this limit is in the 200–300 mm/yr range, meaning that most of the Sahara during the AHP was probably too dry to support very large lakes or perennial rivers by means of local runoff. This does not preclude creation of local wetlands supplied by ground-water recharge focused from a very large recharge area or forced to the surface by hydrologic barriers such as faults, nor megalakes like Chad supplied by moisture from the subtropics and tropics outside the Sahel. But it does raise a key question concerning the size of paleolakes, if not megalakes, in the Sahara during the AHP. Our analysis suggests that Sahelo-Sudanian climate could perhaps support a paleolake approximately ≤5000 km2 in area in the Darfur basin and ≤10,000–20,000 km2 in the Fezzan basin. These are more than an order of magnitude smaller than the megalakes envisioned for these basins, but they are still sizable, and if enclosed in a single body of water, should have been large enough to generate clear shorelines (Enzel et al., 2015, 2017). On the other hand, if surface water was dispersed across a series of shallow and extensive but partly disconnected wetlands, as also implied by previous research (e.g., Pachur and Hoelzmann, 1991), then shorelines may not have developed.
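The mass-balance argument in the excerpts above can be sketched with a very simple steady-state model for a closed-basin lake. This is my own simplified assumption, not the model used in the review: rain falls on the lake, a fixed fraction of the rain falling on the rest of the basin reaches the lake as runoff, and the only loss is evaporation from the lake surface (infiltration and ground-water flow are ignored). The function name and every number below are hypothetical; only the definition of aw (lake area relative to drainage basin area) follows the text.

```python
def required_rainfall(lake_evaporation, area_ratio, runoff_coefficient):
    """Rainfall (m/yr) that keeps a closed-basin lake at steady state.

    Assumed balance per unit of basin area:
        P * aw + k * P * (1 - aw) = E * aw
    where
        P  = mean annual rainfall over the basin (m/yr)
        aw = lake area / drainage basin area
        k  = fraction of rain on land that reaches the lake as runoff
        E  = evaporation from the lake surface (m/yr)
    """
    if not 0 < area_ratio <= 1:
        raise ValueError("area_ratio must be in (0, 1]")
    if not 0 <= runoff_coefficient <= 1:
        raise ValueError("runoff_coefficient must be in [0, 1]")
    aw, k = area_ratio, runoff_coefficient
    return lake_evaporation * aw / (aw + k * (1 - aw))


# Illustrative values only (NOT taken from the review).
evaporation = 2.0   # m/yr lost from the open lake surface
runoff = 0.1        # 10% of rain on land ends up in the lake

for ratio in (0.10, 0.15):
    rain = required_rainfall(evaporation, ratio, runoff)
    print(f"aw = {ratio:.2f} -> required rainfall = {rain:.2f} m/yr")
```

Even in this crude form, the model reproduces the qualitative point of the excerpt: a lake that is small relative to its drainage basin (lower aw) needs less rainfall to persist, because more land is collecting runoff for each unit of evaporating lake surface.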
One of the underdeveloped ideas of my Indo-European demic diffusion model was that R1b-V88 had migrated through South Italy to Northern Africa, and from there to the south via the Green Sahara corridor, from where the “upside-down” view of Bender (2007) could have occurred, i.e. Afroasiatic expanding westwards within the Green Sahara, precisely at this time, and from a homeland near the Megalake Chad region (see here).
Whether or not R1b-V88 brought the ‘original’ lineage that expanded Afroasiatic languages may be contended, but after D’Atanasio et al. (2018) it seems that only two lineages, E-M2 and R1b-V88, fit the ‘star-like’ structure suggesting an appropriate haplogroup expansion and necessary regional distribution that could explain the spread of Afroasiatic languages within a reasonable time frame.
This review shows that the hypothesized Green Sahara corridor full of megalakes that some proposed had fully connected Africa from west to east was actually a strip of Sahelo-Sudanian steppe spread to the north of its current distribution, including the Chad megalake, East Africa and Arabia, apart from other discontinuous local wetlands further to the north in Africa. This greenish belt would have probably allowed for the initial spread of early Afroasiatic proto-languages only through the southern part of the current Sahara Desert. This and the R1b-V88 haplogroup distribution in Central and North Africa (with a prevalence among Chadic speakers probably due to later bottlenecks), and the Near East, leave still fewer possibilities for an expansion of Afroasiatic from anywhere else.
If my proposal turns out to be correct, this Afroasiatic-like language would be the one suggested by some in the vocabulary of Old European and North European local groups (viz. Kroonen for the Agricultural Substrate Hypothesis), and not Anatolian farmer ancestry or haplogroup G2, which would have been rather confined to Southern Europe, mainly south of the Loess line, where incoming Middle East farmers encountered the main difficulties spreading agriculture and herding, and where they eventually admixed with local hunter-gatherers.
NOTE. If related to attested languages before the Roman expansion, Tyrsenian would be a good candidate for a descendant of the language of Anatolian farmers, given the more recent expansion of Anatolian ancestry to the Tuscan region (even if already influenced by Iran farmer ancestry), which reinforces its direct connection to the Aegean.
The fiercest opposition to this R1b-V88 – Afroasiatic connection may come from:
Traditional Hamito-Semitic scholars, who try to look for any parent language almost invariably in or around the Near East – the typical “here it was first attested, ergo here must be the origin, too”-assumption (coupled with the cradle of civilization memes) akin to the original reasons behind Anatolian or Out-of-India hypotheses; and of course
autochthonous continuity theories based on modern subclades, of (mainly Semitic) peoples of haplogroup E or J, who will root for either one or the other as the Afroasiatic source no matter what. As we have seen with the R1a – Indo-European hypothesis (see here for its history), this is never the right way to look at prehistoric migrations, though.
I proposed that R1a-M417 was the lineage marking an expansion of Indo-Uralic from the east near Lake Baikal, then obviously connected to Yukaghir and Altaic languages marked by R1a-M17, and that haplogroup R could then be the source of a hypothetical Nostratic expansion (where R2 could mark the Dravidian expansion), with upper clades being maybe responsible for Borean.
However, recent studies have shown early expansions of R1b-P297 to East Europe (Mathieson et al. 2017 & 2018), and of R1b-M73 to East Eurasia probably up to Siberia, and possibly reaching the Pacific (Jeong et al. 2018). Also, the Steppe Eneolithic and Caucasus Eneolithic clusters seen in Wang et al. (2018) would be able to explain the WHG – EHG – ANE ancestry cline seen in Mesolithic and Neolithic Eurasia without a need for westward migrations.
After Narasimhan et al. (2018) and Damgaard et al. (Science 2018), Dravidian is more and more likely to be linked to the expansion of the Indus Valley civilization and haplogroup J, in turn strongly linked to Iranian farmer ancestry, thus giving support to an Elamo-Dravidian group stemming from Iran Neolithic.
NOTE. This Dravidian-IVC and Iran connection has been supported for years by knowledgeable bloggers and commenters alike, see e.g. one of Razib Khan’s posts on the subject. This rather early support for what is obvious today is probably behind the reactionary views by some nationalist Hindus, who probably saw in this a potential reason for a strengthened Indo-Aryan/Dravidian divide adding to the religious patchwork that is modern India.
I am not in a good position to judge Nostratic, and I don’t think Glottochronology, Swadesh lists, or any statistical methods applied to a bunch of words are of any use, here or anywhere. The work of pioneers like Illich-Svitych or Starostin, on the other hand, seems to me a solid attempt to obtain a faithful reconstruction, if rather outdated today.
NOTE. I am still struggling to learn more about Uralic and Indo-Uralic; not because it is more difficult than Indo-European, but because – in comparison to PIE comparative grammar – material about them is scarce, and the few available sources are sometimes contradictory. My knowledge of Afroasiatic is limited to Semitic (Arabic and Akkadian), and the field is not much more developed here than for Uralic…
If one wanted to support a Nostratic proto-language, though, and not being able to take into account genome-wide autosomal admixture, the only haplogroup right now which can connect the expansion of all its branches is R1b-M343:
R1b-L278 expanded from Asia to Europe through the Iranian Plateau, since early subclades are found in Iran and the Caucasus region, thus supporting the separation of Elamo-Dravidian and Kartvelian branches;
R1b-V88 expanding everywhere in Europe, and especially the branch expanding to the south into Africa, may be linked to the initial Afroasiatic expansion through the Pale-Green Sahara corridor (and even a hypothetical expansion with E-M2 subclades and/or from the Middle East would also leave open the influence of V88 and previous R1b subclades from the Middle East in the emergence of the language);
R1b-P297 subclades expanding to the east may be linked to Eurasiatic, giving rise to both Indo-Uralic (M269) and Macro- or Micro-Altaic (M73) expansions.
This is shameless, simplistic speculation, of course, but not more than the Nostratic hypothesis, and it has the main advantage of offering ‘small and late’ language expansions relative to other proposals spanning thousands (or even tens of thousands) of years more of language separation. On the other hand, that would leave Borean out of the question, unless the initial expansion of R1b subclades happened from a community close to Lake Baikal (and Mal’ta) that was also at the origin of the other supposedly related Borean branches, whether linked to haplogroup R or to any other…
NOTE. If Afroasiatic and Indo-Uralic (or Eurasiatic) are not genetically related, my previous simplistic model, R1b-Afroasiatic vs. R1a-Eurasiatic, may still be supported, with R1a-M17 potentially marking the latest meaningful westward population expansion from which EHG ancestry might have developed (see here). Without detailed works on Nostratic comparative grammar and dialectalization, and especially without a lot more Palaeolithic and Mesolithic samples, all this will remain highly speculative, like proposals of the 2000s about Y-DNA-haplogroup – language relationships.
NOTE. I think one of the important changes in this version compared to the preprint is the addition of the recent Iberomaurusian samples.
Abstract (emphasis mine):
The extent to which prehistoric migrations of farmers influenced the genetic pool of western North Africans remains unclear. Archaeological evidence suggests that the Neolithization process may have happened through the adoption of innovations by local Epipaleolithic communities or by demic diffusion from the Eastern Mediterranean shores or Iberia. Here, we present an analysis of individuals’ genome sequences from Early and Late Neolithic sites in Morocco and from Early Neolithic individuals from southern Iberia. We show that Early Neolithic Moroccans (∼5,000 BCE) are similar to Later Stone Age individuals from the same region and possess an endemic element retained in present-day Maghrebi populations, confirming a long-term genetic continuity in the region. This scenario is consistent with Early Neolithic traditions in North Africa deriving from Epipaleolithic communities that adopted certain agricultural techniques from neighboring populations. Among Eurasian ancient populations, Early Neolithic Moroccans are distantly related to Levantine Natufian hunter-gatherers (∼9,000 BCE) and Pre-Pottery Neolithic farmers (∼6,500 BCE). Late Neolithic (∼3,000 BCE) Moroccans, in contrast, share an Iberian component, supporting theories of trans-Gibraltar gene flow and indicating that Neolithization of North Africa involved both the movement of ideas and people. Lastly, the southern Iberian Early Neolithic samples share the same genetic composition as the Cardial Mediterranean Neolithic culture that reached Iberia ∼5,500 BCE. The cultural and genetic similarities between Iberian and North African Neolithic traditions further reinforce the model of an Iberian migration into the Maghreb.
FST and outgroup-f3 distances indicate a high similarity between IAM and Taforalt. As observed for IAM, most Taforalt sample ancestry derives from Epipaleolithic populations from the Levant. However, van de Loosdrecht et al. (17) also reported that one third of Taforalt ancestry was of sub-Saharan African origin. To confirm whether IAM individuals show a sub-Saharan African component, we calculated f4(chimpanzee, African population; Natufian, IAM) in such a way that a positive result for f4 would indicate that IAM is composed both of Levantine and African ancestries. Consistent with the results observed for Taforalt, f4 values are significantly positive for West African populations, with the highest value observed for Gambian and Mandenka (Fig. 3 and SI Appendix, Supplementary Note 10). Together, these results indicate the presence of the same ancestral components in ∼15,000-y old and ∼7,000-y-old populations from Morocco, strongly suggesting a temporal continuity between Later Stone Age and Early Neolithic populations in the Maghreb. However, it is important to take into account that the number of ancient genomes available for comparison is still low and future sampling can provide further refinement in the evolutionary history of North Africa.
Genetic analyses have revealed that the population history of modern North Africans is quite complex (11). Based on our aDNA analysis, we identify an Early Neolithic Moroccan component that is (i) restricted to North Africa in present-day populations (11); (ii) the sole ancestry in IAM samples; and (iii) similar to the one observed in Later Stone Age samples from Morocco (17). We conclude that this component, distantly related to that of Epipaleolithic communities from the Levant, represents the autochthonous Maghrebi ancestry associated with Berber populations. Our data suggests that human populations were isolated in the Maghreb since Upper Paleolithic times. Our hypothesis is in agreement with archaeological research pointing to the first stage of the Neolithic expansion in Morocco as the result of a local population that adopted some technological innovations, such as pottery production or farming, from neighboring areas.
By 3,000 BCE, a continuity in the Neolithic spread brought Mediterranean-like ancestry to the Maghreb, most likely from Iberia. Other archaeological remains, such as African elephant ivory and ostrich eggs found in Iberian sites, confirm the existence of contacts and exchange networks through both sides of the Gibraltar strait at this time. Our analyses strongly support that at least some of the European ancestry observed today in North Africa is related to prehistoric migrations, and local Berber populations were already admixed with Europeans before the Roman conquest. Furthermore, additional European/ Iberian ancestry could have reached the Maghreb after KEB people; this scenario is supported by the presence of Iberian-like Bell-Beaker pottery in more recent stratigraphic layers of IAM and KEB caves. Future paleogenomic efforts in North Africa will further disentangle the complex history of migrations that forged the ancestry of the admixed populations we observe today.
Also, from the main author’s Twitter account:
I just realized that the paragraph with information on data availability is missing! Sequence data in the European Nucleotide Archive (PRJEB22699). Consensus mtDNA sequences are available at the National Center of Biotechnology Information (Accession Numbers MF991431-MF991448).
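For readers unfamiliar with the f4 test mentioned in the excerpt above, here is a minimal sketch of how a statistic of the form f4(chimpanzee, African population; Natufian, IAM) is computed from allele frequencies. The frequencies below are invented toy values, not data from the paper, and the function name `f4` and the 30% admixture fraction are my own assumptions for illustration. Published analyses use hundreds of thousands of SNPs and assess significance with a block jackknife, which this sketch omits.

```python
def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d),
    where a, b, c, d are allele frequencies in each population."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)

# Toy allele frequencies at four SNPs (hypothetical values, illustration only).
chimp    = [0.0, 0.0, 0.0, 0.0]   # outgroup, fixed for the ancestral allele
african  = [0.8, 0.1, 0.6, 0.3]
natufian = [0.2, 0.1, 0.6, 0.5]

# Hypothetical test population: 70% Natufian-like plus 30% African-like ancestry.
alpha = 0.3
admixed = [(1 - alpha) * n + alpha * a for n, a in zip(natufian, african)]

# With no African contribution, the test population equals the Natufian one.
unadmixed = list(natufian)

f4_admixed = f4(chimp, african, natufian, admixed)
f4_unadmixed = f4(chimp, african, natufian, unadmixed)

print("f4 with African admixture:   ", f4_admixed)
print("f4 without African admixture:", f4_unadmixed)
```

In this toy setup the statistic is zero when the test population is identical to the Natufian one, and becomes positive once part of its ancestry is drawn from the African population, which is the logic behind reading a positive f4 as a sign of both Levantine and African ancestries.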
I find it hard to believe that this genetic continuity from Upper Palaeolithic to Late Neolithic could be representative of an autochthonous development of Afroasiatic. An important population movement – likely more than one – must be found in ancient DNA influencing North-Central and North-East Africa, probably during the time of the Green Sahara corridor.
In the last three decades, genetic studies have played an increasingly important role in exploring human history. They have helped to conclusively establish that anatomically modern humans first appeared in Africa roughly 250,000–350,000 years before present and subsequently migrated to other parts of the world. The history of humans in Africa is complex and includes demographic events that influenced patterns of genetic variation across the continent. Through genetic studies, it has become evident that deep African population history is captured by relationships among African hunter–gatherers, as the world’s deepest population divergences occur among these groups, and that the deepest population divergence dates to 300,000 years before present. However, the spread of pastoralism and agriculture in the last few thousand years has shaped the geographic distribution of present-day Africans and their genetic diversity. With today’s sequencing technologies, we can obtain full genome sequences from diverse sets of extant and prehistoric Africans. The coming years will contribute exciting new insights toward deciphering human evolutionary history in Africa.
Regarding potential Afroasiatic origins and expansions:
It is currently believed that farming practices in northeastern and eastern Africa developed independently in the Sahara/Sahel (around 7,000 BP) and the Ethiopian highlands (7,000–4,000 BP), while farming in the Nile River Valley developed as a consequence of the Neolithic Revolution in the Middle East (84). Northeastern and eastern African farmers today speak languages from the Afro-Asiatic and Nilo-Saharan linguistic groups, which is also reflected in their genetic affinities (Figure 3, K=6). In the northern parts of East Africa (South Sudan, Somalia, and Ethiopia), Nilo-Saharan and Afro-Asiatic speakers with farming lifeways have completely replaced hunter–gatherers. It is still largely unclear how farming and herding practices influenced the northeastern African prefarming population structure and whether the spread of farming is better explained by demic or cultural diffusion in this part of the world. Genetic studies of contemporary populations and aDNA have started to provide some insights into population continuity and incoming gene flow in this region of Africa.
For example, studies have shown that a back-migration from Eurasia into Africa affected most of northeastern and eastern Africa (36, 46, 53, 89, 132) (Figure 1b). A genetic baseline of eastern African ancestral genetic variation unaffected by recent Eurasian admixture and farming migrations within the last 4,500 years has been suggested in the form of the genome sequence of a 4,500-year-old individual from Mota, Ethiopia (36). Based on comparisons with the ancient Mota genome, we know that certain populations from northeastern Africa show deep continuity in their local area with very limited gene flow resulting from recent population movements. For example, the Nilotic herder populations from South Sudan (e.g., Dinka, Nuer, and Shilluk) appear to have remained relatively isolated over time and received little to no gene flow from Eurasians, West African Bantu-speaking farmers, and other surrounding groups (53) (Figures 2 and 3). By contrast, the Nubian and Arab populations to their north show gene flow with Eurasians, which has been connected to the Arab expansion (53). The Nubian, Arab, and Beja populations of northeastern Africa roughly display equal admixture fractions from a local northeastern African gene pool (similar to the Nilotic component) and an incoming Eurasian migrant component (53) (Figure 3). The Eurasian component has been linked to the Middle East and the Arab migration, but only the Arab groups shifted to the Semitic languages; the Nubians and Beja groups kept their original languages. The Eurasian gene flow appears to have spread from north to south along the Nile and Blue Nile in a succession of admixture events (53).
This page allows historical linguists to compare and scrutinize proposed prehistoric lexical borrowings from the perspective of Proto-Indo-European. The first entries are all (135 in total) extracted from my master’s thesis “Foreign elements in the Proto-Indo-European vocabulary” (Bjørn 2017). Comments are encouraged at the bottom of each entry. New entries will be added, also on request.
Take this not as the conclusion, but an invitation to join the conversation.
So, we welcome the invitation, and hope that this new project thrives.
The publication of new ancient DNA samples from Africa is near, according to people at the SMBE meeting. As reported by Anthropology.net, a group led by Pontus Skoglund has analysed new samples (complementing the study by Carina Schlebusch), so we will have ancient samples of Africans from 300 to 6,000 years ago. They have been compared to data from modern African populations, and these are among their likely conclusions (to be published):
Several thousand years ago, likely Tanzanian herders migrated far and wide, reaching Southern Africa centuries before the first farmers.
West Africans were likely early contributors to the gene pool of sub-Saharan Africans.
One ancient African herder showed influence from even farther abroad, with 38% of their DNA coming from outside Africa. 9-22% of the DNA of modern farmers, including the southern Khoe-San, comes from East Africans and Eurasian herders.
Modern farmers, the ones as old as 500 years old, did have Bantu DNA in their genomes, but the ancient hunter-gatherers predated the spread of the Bantu.
Razib Khan, asked about the Afroasiatic homeland by David Reich, has taken this opportunity to publish his own hypothesis on the expansion of Afroasiatic, given the known Admixture analyses, using Y-DNA phylogeography, and with reasonable assumptions. He concludes that Afroasiatic expansion might also be associated with the western expansion of E1b1b subclades from a Levantine (“Natufian”) homeland.
I think it is necessary to remind everyone of the many problems unsolved by Indo-European studies – a much older discipline (and with more research published) than Afroasiatic studies. It is already quite revealing that we still can’t trace Proto-Semitic back to its homeland, and that Proto-Semitic is probably as old as Late Proto-Indo-European. We are talking, then, about an ancient proto-language – Afroasiatic – possibly older than Middle Indo-European (or Indo-Hittite), and whose dialects are still not well studied, except for the Semitic and Egyptian branches. Linguistic guesstimates or phylogenetic speculation date the proto-language (and thus the homeland) within a wide range, from 15,000 to 6,000 years ago.
There is an obvious trend (probably driven by Semitic and Egyptian researchers) to place the Afroasiatic homeland near one of the many proposed Semitic homelands, i.e. in East Africa. This is similar to the trend seen in the first half of the 20th century in Indo-European studies, with most proposals locating the Proto-Indo-European homeland in Europe. European languages were the best known, and only the perceived antiquity of Vedic Sanskrit made some propose South Asian origins for the proto-language. However, it was only the careful interpretation of linguistic finds, combined with archaeological data, that eventually yielded the Kurgan hypothesis, which has since been refined.
Razib Khan’s proposal makes sense in that it fits what others have proposed before, i.e. an East African or Middle Eastern Afroasiatic homeland, and in that it links it with the expansion of farming. However, we have to keep in mind that until 5,000 years ago the Sahara was not the desert we know: it had certain important green corridors, humid areas between megalakes. The Sahara might not have been exactly green 10,000 to 5,000 years ago (roughly the time when Afroasiatic must have been spoken), but it had certain regions that allowed for an east-west migration. They also allowed for a west-east migration, though, and – perhaps more importantly – for a sizeable population expansion in central Saharan territory. To forget that is to allow for potentially wrong assumptions to be made.
What we can expect from the next papers on ancient African DNA samples are the results of certain (more recent) population – and thus potentially ethnolinguistic – movements, but they probably won’t solve the question of the Afroasiatic homeland, which lies further back in time than the samples studied. There is a wide void in African prehistory – compared with Near Eastern history – and this research will help close that gap, just like European samples are helping close the gap in the prehistory of western, northern, and eastern Europe, compared to the history of the eastern Mediterranean regions.
I already wrote, regarding the potential ethnolinguistic link between Indo-European and Afroasiatic, that the migration of R1b-V88 lineages from Europe (through southern Italy?) into the Sahara – through the Fezzan-Chad-Chotts and Chad-Chotts-Ahnet-Mouydir megalake green corridors – deserves a close look, since it could have been the key to the successful expansion of Afrasians.
Interesting aspects to take into account are the distribution of R1b-V88 lineages, compared to the location of Chadic languages (probably the most divergent and least known of the group) and to the potential North Afroasiatic group (composed of Egyptian, Berber, and Semitic) and South Afroasiatic group (made up of Cushitic and Omotic). Chadic has been argued to be connected variously to North Afroasiatic, or to the Berber branch, but the Northern group has also been argued to be connected with Cushitic, with Omotic as an independent branch. Also interesting would then be the potential connection between Indo-European (or Indo-Uralic) and Afroasiatic.
We could speculatively place the potential primary Afroasiatic homeland in the south-central Sahara, near the Megachad lake (i.e. near the peak of R1b-V88 lineages), with a secondary homeland in eastern Africa (as in the map above) – and maybe a tertiary homeland (of North Afroasiatic) in the Middle East, associated with the expansion of “Natufians” and E1b1b subclades. The identification of the spread of Afroasiatic languages with the expansion of R1b-V88 lineages needs an anthropological context (linguistic and archaeological) that is obviously lacking today.
It is important to keep all possibilities in sight when reviewing genetic analyses.
These are the statements about the Adamic language and the Tower of Babel found in Abrahamic texts, beliefs, and traditions:
Adamic was the language spoken by Adam and Eve in the Garden of Eden. Adamic is typically identified with either the language used by God to address Adam, or the language invented by Adam (Book of Genesis 2:19).
Genesis is ambiguous on whether the language of Adam was preserved by Adam’s descendants until the confusion of tongues (Genesis 11:1-9), or if it began to evolve naturally even before Babel (Genesis 10:5), into what is usually called Chaldaic:
Dante in his De Vulgari Eloquentia argues that the Adamic language is of divine origin and therefore unchangeable.
In his Divina Commedia, however, Dante changes his view to the effect that the Adamic language was the product of Adam. This had the consequence that it could no longer be regarded as immutable, and hence Hebrew could not be regarded as identical with the language of Paradise.
Also, the nature of that original language remains controversial, interpretations showing many nationalist flavours:
Traditional Jewish exegesis such as Midrash (Genesis Rabbah 38) says that Adam spoke Old Hebrew or rather its linguistic ancestor Proto-Canaanite, because the names he gives Eve – “Isha” (Book of Genesis 2:23) and “Chava” (Genesis 3:20) – only make sense in Hebrew.
Traditional Christians have assumed, based on Genesis 10:5, that the Japhetite, or Indo-European, languages are rather the direct descendants of the Adamic language, having separated before the confusion of tongues, by which Hebrew was also affected.
Early Christian fathers claimed that Adam spoke Latin to explain why God would make it the liturgical language of his Church, although “Latin” here would be a loose way of referring to its ancestor, Proto-Italic or older Europe’s Indo-European.
Modern traditional Catholics follow Anne Catherine Emmerick’s revelations (1790), which stated that the most direct descendants of the Adamic language were Bactrian, Zend and Indian languages (i.e., the Indo-Iranian languages), associating the Adamic language with the then-recent concept of the “common source” of these tongues, now known as Proto-Indo-European:
This language was the pure Hebrew, or Chaldaic. The first tongue, the mother tongue, spoken by Adam, Shem, and Noah, was different, and it is now extant only in isolated dialects. Its first pure offshoots are the Zend, the sacred tongue of India, and the language of the Bactrians. In those languages, words may be found exactly similar to the Low German of my native place.
Many Muslim scholars, following the traditional Jewish identification of Pre-Hebrew as the Adamic language, hence classified within the Semitic language family (which includes the Ge’ez language used in the Book of Enoch), claim that Pre-Arabic – hence Proto-(West-)Semitic – is the original Adamic language. Most of them do not believe the Semitic languages were the direct descendants of the Adamic language, but rather trace them back to Abraham, instead of Noah and Adam.
The confusion of tongues is the initial fragmentation of human languages described in the Book of Genesis 11:1–9, as a result of the construction of the Tower of Babel.
And the Lord said, Behold, the people is one, and they have all one language; and this they begin to do: and now nothing will be restrained from them, which they have imagined to do.
Go to, let us go down, and there confound their language, that they may not understand one another’s speech.
So the Lord scattered them abroad from thence upon the face of all the earth: and they left off to build the city.
The language spoken by Noah and his descendants – whether the original Adamic language (either of divine origin or not) or the derived Chaldaic – split into seventy or seventy-two languages, according to the different traditions. The existence of only one language before Babel in Genesis 11:1
And the whole earth was of one language, and of one speech
has sometimes been interpreted as being in contradiction to Genesis 10:5
Of these were the isles of the nations divided in their lands, every one after his tongue, after their families, in their nations.
This issue only arises, however, if Genesis 10:5 is interpreted as taking place before and separate from the Tower of Babel story, instead of as an overview of events later described in detail in Genesis 11.
It also necessitates that the reference to the earth being “divided” (Genesis 10:25) is taken to mean the division of languages, rather than a physical division of the earth (such as in the formation of continents).
So, to sum up, these are the facts known to us from comparative linguistics, related to those Abrahamic beliefs and interpretations and the biblical chronology:
Mainstream linguists – without any links to religion, just based on comparative grammar – have accepted some form or other of language superfamilies, as in Eurasiatic and Afro-Asiatic < Nostratic < Borean < Proto-World language. This would correspond loosely to that common language of Genesis that was spoken before it was (instantly?) “confounded” into different languages; hence the results obtained in reconstructing subgroupings (say Indo-Uralic, Ural-Altaic) are similar to (or even worse than) those obtained with a more global Nostratic or even Proto-World language.
Most of the earliest attested, reconstructed or (generally accepted) hypothetical languages, like Old Egyptian; (Semitic) Akkadian, Pre-Proto-Canaanite; (Indo-European) Europe’s Indo-European, Proto-Indo-Iranian, Proto-Greek, Common Anatolian; (Uralic) Proto-Finno-Ugric; (Sino-Tibetan) Proto-Sinitic; (Pre-)Proto-Dravidic; etc. can be traced back – depending on the archaeological findings and linguistic theories, inherently inexact – to ca. 2500 BC.
It is therefore odd that before that date everything is ‘more blurred’ (so to speak) in linguistic findings and reconstructions of older linguistic ancestors – as e.g. the hypothesized laryngeals (or their phonetic output) in Late Proto-Indo-European, or the difficult reconstruction of Proto-Semitic, not to mention Proto-Uralic or Proto-Sino-Tibetan. This is the strongest argument to support a theoretical instant split of a common (Chaldaic or Adamic) language into 70 or 72 derived languages, which we know from attested inscriptions, reconstructions or hypotheses, or which disappeared without a trace.
As for their classification into language “families”, they might be related to the families based on consanguinity as described in the Bible, but identifications of those families by modern scholars have blurred the possible links (if any) between older language superfamilies and Noah’s sons; cf. Japhetic’s simplistic identification with Indo-European, or Semitic’s with “Semitic” languages. However, the more traditional identification of Japheth’s sons with “European” peoples (and therefore Eurasiatic languages), and Shem’s sons with (the old concept of) “Asian” peoples (hence with Afro-Asiatic languages) is more reasonable, leaving Ham’s sons with (at least) Austric and Dené-Caucasian languages (see Borean language tree).
Many biblical interpretations of the Adamic language therefore share the mistakes inherent to the culturally biased and simplistic views of many scholars, hence the identification of the original tongue as Proto-Semitic by Jews and Muslims, Proto-Indo-European by many Christians (since Rasmus Rask’s first description of it as “Japetisk”), Sanskrit or Indo-Iranian (Aryan) by Hinduism, etc. That has hindered a more rational interpretation of the Bible and other sacred texts in light of the newest academic findings.
To sum up, we cannot know if the Adamic language existed, or its nature; we don’t know if Chaldaic (the common language before Babel) was the same as Adamic, or if not, if it was global (Proto-World language) or local to the Middle East (Nostratic?) according to Genesis 10:5. We can, however, defend mainstream Abrahamic beliefs on the confusion of tongues and the Tower of Babel as possible (“probability” based on extrapolation has little to do with religion and even with social events that happened more than 4000 years ago) and that the descendants of Noah might have spoken a common language until the centuries on either side of 2500 BC.
All of that holds notwithstanding any possible interpretations of Adamic or Chaldaic by Old Earth Creationists, who usually take the historical accounts of Genesis (its literal interpretation) as real facts only from the Tower of Babel onwards, dismissing the rest of the biblical data from the Flood backwards, and notwithstanding any timeline calculated from genealogies by Young Earth Creationists.