R1b-V88 migration through Southern Italy into Green Sahara corridor, and the Afroasiatic connection


Open access article The peopling of the last Green Sahara revealed by high-coverage resequencing of trans-Saharan patrilineages, by D’Atanasio, Trombetta, Bonito, et al., Genome Biology (2018) 19:20.


Little is known about the peopling of the Sahara during the Holocene climatic optimum, when the desert was replaced by a fertile environment.

In order to investigate the role of the last Green Sahara in the peopling of Africa, we deep-sequence the whole non-repetitive portion of the Y chromosome in 104 males selected as representative of haplogroups which are currently found to the north and to the south of the Sahara. We identify 5,966 mutations, from which we extract 142 informative markers then genotyped in about 8,000 subjects from 145 African, Eurasian and African American populations. We find that the coalescence age of the trans-Saharan haplogroups dates back to the last Green Sahara, while most northern African or sub-Saharan clades expanded locally in the subsequent arid phase.

Our findings suggest that the Green Sahara promoted human movements and demographic expansions, possibly linked to the adoption of pastoralism. Comparing our results with previously reported genome-wide data, we also find evidence for a sex-biased sub-Saharan contribution to northern Africans, suggesting that historical events such as the trans-Saharan slave trade mainly contributed to the mtDNA and autosomal gene pool, whereas the northern African paternal gene pool was mainly shaped by more ancient events.

Maximum parsimony Y chromosome tree and dating of the four trans-Saharan haplogroups. a Phylogenetic relations among the 150 samples analysed here. Each haplogroup is labelled in a different colour. The four Y sequences from ancient samples are marked by the dagger symbol. b Phylogenetic tree of the four trans-Saharan haplogroups, aligned to the timeline (at the bottom). At the tip of each lineage, the ethno-geographic affiliation of the corresponding sample is represented by a circle, coloured according to the legend (bottom left). The last Green Sahara period is highlighted by a green belt in the background

Also, interesting excerpts:

The fertile environment established in the Green Sahara probably promoted demographic expansions and rapid dispersals of the human groups, as suggested by the great homogeneity in the material culture of the early Holocene Saharan populations [62]. Our data for all the four trans-Saharan haplogroups are consistent with this scenario, since we found several multifurcated topologies, which can be considered as phylogenetic footprints of demographic expansions. The multifurcated structure of the E-M2 is suggestive of a first demographic expansion, which occurred about 10.5 kya, at the beginning of the last Green Sahara (Fig. 2; Additional file 2: Figure S4). After this initial expansion, we found that most of the trans-Saharan lineages within A3-M13, E-M2 and R-V88 radiated in a narrow time interval at 8–7 kya, suggestive of population expansions that may have occurred in the same time (Fig. 2; Additional file 2: Figures S3, S4 and S6). Interestingly, during roughly the same period, the Saharan populations adopted pastoralism, probably as an adaptive strategy against a short arid period [1, 62, 63]. So, the exploitation of pastoralism resources and the reestablishment of wetter conditions could have triggered the simultaneous population expansions observed here. R-V88 also shows signals of a further and more recent (~ 5.5 kya) Saharan demographic expansion which involved the R-V1589 internal clade. We observed similar demographic patterns in all the other haplogroups in about the same period and in different geographic areas (A3-M13/V3, E-M2/V3862 and E-M78/V32 in the Horn of Africa, E-M2/M191 in the central Sahel/central Africa), in line with the hypothesis that the start of the desertification may have caused massive economic, demographic and social changes [1].

Finally, the onset of the arid conditions at the end of the last African humid period was more abrupt in the eastern Sahara compared to the central Sahara, where an extensive hydrogeological network buffered the climatic changes, which were not complete before ~ 4 kya [6, 62, 64]. Consistent with these local climatic differences, we observed slight differences among the four trans-Saharan haplogroups. Indeed, we found that the contact between northern and sub-Saharan Africa went on until ~ 4.5 kya in the central Sahara, where we mainly found the internal lineages of E-M2 and R-V88 (Additional file 2: Figures S4 and S6). In the eastern Sahara, we found a sharper and more ancient (> 5 kya) differentiation between the people from northern Africa (and, more generally, from the Mediterranean area) and the groups from the eastern sub-Saharan regions (mainly from the Horn of Africa), as testified by the distribution and the coalescence ages of the A3-M13 and E-M78 lineages (Additional file 2: Figures S3 and S5).

Time estimates and frequency maps of the four trans-Saharan haplogroups and major sub-clades. a Time estimates of the four trans-Saharan clades and their main internal lineages. To the left of the timeline, the time windows of the main climatic/historical African events are reported in different colours (legend in the upper left). b Frequency maps of the main trans-Saharan clades and sub-clades. For each map, the relative frequencies (percentages) are reported to the right

R-V88 has been observed at high frequencies in the central Sahel (northern Cameroon, northern Nigeria, Chad and Niger) and it has also been reported at low frequencies in northwestern Africa [37]. Outside the African continent, two rare R-V88 sub-lineages (R-M18 and R-V35) have been observed in Near East and southern Europe (particularly in Sardinia)[30, 37, 38, 39]. Because of its ethno-geographic distribution in the central Sahel, R-V88 has been linked to the spread of the Chadic branch of the Afroasiatic linguistic family [37, 40].

(…) the R-V88 lineages date back to 7.85 kya and its main internal branch (branch 233) forms a “star-like” topology (“Star-like” index = 0.55), suggestive of a demographic expansion. More specifically, 18 out of the 21 sequenced chromosomes belong to branch 233, which includes eight sister clades, five of which are represented by a single subject. The coalescence age of this sub-branch dates back to 5.73 kya, during the last Green Sahara period. Interestingly, the subjects included in the “star-like” structure come from northern Africa or central Sahel, tracing a trans-Saharan axis. It is worth noting that even the three lineages outside the main multifurcation (branches 230, 231 and 232) are sister lineages without any nested sub-structure. The peculiar topology of the R-V88 sequenced samples suggests that the diffusion of this haplogroup was quite rapid and possibly triggered by the Saharan favourable climate (Fig. 2b).

One of the theories I have proposed in the Indo-European demic diffusion model since the first edition – based mainly on phylogeography – is that R1b-V88 lineages probably crossed the Mediterranean through southern Italy into a Green Sahara region, and spread from there through important green corridors, i.e. humid areas between megalakes. Even though this new study – like the rest of them – is based solely on modern samples, and as such is quite prone to error in assessing ancient distributions (as we have seen in Europe), it seems that a southern Italian route (probably through Sicily) for R1b-V88 and a late expansion through the Green Sahara are more and more likely.

If we accept that the migration of R1b-V88 lineages is the last great expansion through a Green Sahara, then this expansion is a potential candidate for the initial Afroasiatic expansion – whereas older haplogroup expansions would represent languages different than Afroasiatic, and more recent haplogroup expansions would represent subsequent expansions of Afroasiatic dialects, like Semitic, Hamitic, Cushitic, or Chadic – as I explained in an older post.

In absolutely shameless speculative terms, then – as is today common in Genetic studies, by the way, so let’s all have some fun here – instead of some sort of R1b/Eurasiatic continuity in Europe, as some autochthonous continuists would like, this could mean that there would be an old Afroasiatic – R1b connection. That would imply:

NOTE. Regarding the contribution of CHG ancestry in the Pontic-Caspian steppe cultures, it is usually explained as caused by exogamy, or by absorption of a previous population (as in the Indo-Iranian case), although a contribution of communities of mainly J subclades to the formation of Neolithic steppe cultures cannot be ruled out. As for some autochthonous continuists’ belief in some sort of mythical mixed steppe people with mixed haplogroups and mixed language, well…

Simple Nostratic tree by Bomhard (2008)

The Pre-Indo-European linguistic situation, before the formation of Neolithic steppe cultures, seems like pure speculation, because a) language macro-families (with the exception of Afroasiatic) are highly speculative, b) sound anthropological models are lacking for them, and c) migrations inferred from haplogroup distributions of modern populations are often incorrect:

  • Haplogroup R could then be argued to be the source of Nostratic, and earlier subclades the source of Starostin’s Borean, given the distribution of its subclades in Asia and the timing of their migrations.
  • But of course one could also argue that, given the comparatively late population expansions that Genomics is showing, supporting Western European linguistic schools – where Russian Nostraticists tend to date languages further back in time – , an R1b (and not R) expansion could be the marker of Nostratic languages, due to its most likely southern path (and its old subclades found in Iran and the Caucasus), which would be more in line with the wet dreams of Europeans proposing R1b autochthonous continuity theories. I like this option far less because of that, but it cannot be ruled out.

If you have read this blog before, you know I profoundly dislike lexicostatistical and glottochronological methods, and I don’t like mass comparisons either. Whereas these methods pretend to apply mathematics to big (raw) data where there is almost no knowledge of what one is doing, comparative grammar applies complex reasoning where there is a lot of partially processed data.

But, it is always fun to ask “what if they were right?” and follow from there…


Differences in ADMIXTURE between Khvalynsk/Yamna and Sredni Stog/Corded Ware


Looking for differences among steppe cultures in Genomics is like looking for a needle in a haystack.

It means, after all, looking for differences among closely related cultures, such as between South-Western and North-Western Anatolian Neolithic cultures, or among Old European cultures (such as Vinča or Cucuteni–Trypillia), or between Iberian cultures after the arrival of steppe-related populations.

These differences between closely related regions, in all these cases and especially among steppe cultures, even when they are supported by Archaeology and anthropological models of migration (and compatible with linguistic models), are expected to be minimal.

Fortunately, we have phylogeography, which helps us point in the right direction when assessing potential migrations using genomic data.

User Tomenable recently pointed out a curious finding on Anthrogenica, from data available in Mathieson et al (2017): in ADMIXTURE results with K=12, a different ancestral component (in light green in the paper, see below) is traceable from the North Caspian steppe since the Neolithic. This is also partially distinguishable on K=10 and K=11, although not so clearly differentiating among later cultures.

NOTE: Read more on the controversy regarding the ideal number of ancestral populations, the absurd use of ADMIXTURE to solve language questions, and the meaning of cross-validation (CV) values
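For readers unfamiliar with CV values: when ADMIXTURE is run with its `--cv` option, it reports a cross-validation error for each K, and a lower value is usually read as a better fit. Here is a minimal sketch of how such values might be compared, assuming log lines in the usual `CV error (K=10): 0.5` style. The numbers below are invented for illustration and are not those of Mathieson et al. (2017); the helper names `parse_cv_errors` and `best_k` are hypothetical.

```python
import re

# Hypothetical log lines in the format ADMIXTURE prints when run with --cv.
# The error values are made up for illustration, not taken from any paper.
LOG_LINES = [
    "CV error (K=10): 0.41230",
    "CV error (K=11): 0.41105",
    "CV error (K=12): 0.41187",
]

CV_PATTERN = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(lines):
    """Return {K: cv_error} for every line that matches the CV pattern."""
    errors = {}
    for line in lines:
        match = CV_PATTERN.search(line)
        if match:
            errors[int(match.group(1))] = float(match.group(2))
    return errors


def best_k(errors):
    """Return the K with the lowest cross-validation error."""
    return min(errors, key=errors.get)


cv_errors = parse_cv_errors(LOG_LINES)
print(best_k(cv_errors))
```

Note that the lowest CV error only points to the K that best predicts held-out genotypes. It says nothing by itself about whether a given component corresponds to a real ancestral population, which is the point of the controversy mentioned above.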

Unsupervised ADMIXTURE plot from k=10 to 12, on a dataset consisting of 1099 present-day individuals and 476 ancient individuals. We show newly reported ancient individuals and some previously published individuals for comparison.

Explanations for this finding might include, as the user points out, a greater contribution of CHG ancestry in the eastern steppe cultures (Khvalynsk/Yamna) compared to the North Pontic steppe (Sredni Stog/Corded Ware), which is probably one of the main genomic differences between the two groups of cultures, as I pointed out in the Indo-European demic diffusion model (see accounts on the origins of Khvalynsk and Sredni Stog populations and on contacts between Yamna and the Caucasus, and see below also my sketch of Eurasian genomic history).

Interesting is also the appearance of similar ancestral components later in Vučedol – which probably received admixture from Yamna settlers (see admixture components in West Yamna samples and in the Yamna settler from Bulgaria) – , and later still in the Balkans.

On the other hand, previous ancestral components in outliers from the Balkans seem to be more similar to Sredni Stog samples, giving still more strength to the hypothesis that this common (“steppe”) component expanded westward within the Pontic-Caspian steppe with the spread of Suvorovo-Novodanilovka chiefs.

Problems with this interpretation include:

1) The scarce samples available, the different cultures included, and the CV values of the K populations selected in ADMIXTURE.

2) The lack of data for comparison with Bell Beaker peoples (from Olalde et al. 2017).

3) The sample classified as Latvia_LN/CWC has this component. I have already said before that, given the differences with all other Corded Ware samples, this quite early sample might be an outlier, with Khvalynsk/Yamna population connected directly to the ancestors of this individual, possibly through exogamy (as it is clear from my sketch below). Whether or not this is an outlier among CWC populations in the Baltic, only future samples can tell.

4) Three later individuals from Corded Ware in Germany have the component, in a minimal amount. I would bet – judging by their position in the graphic – that this might be explained through the Esperstedt family. These individuals might have in turn got the contribution directly from the oldest member, who shows what seems (in PCA) like a recent admixture from contemporary steppe cultures (such as the Catacomb culture).

NOTE: See my graphics with interesting members of the Esperstedt family marked: ADMIXTURE and PCA (outlier).

Tentative sketch modelling the genetic history of Europe and West Eurasia from ancient populations up to the Neolithic, according to results in recent genetic papers and archaeological models of known migrations.

Again, needle in a haystack… And confirmation bias by me, indeed.

But interesting nonetheless.

EDIT (4 JAN 2017): A reader points out that the interpretation of Unsupervised ADMIXTURE should work backwards (i.e. different contributions into different modern populations), and not based solely on ancestral populations, which is probably right. So again, confirmation bias (and potentially wrong direction fallacy) by me…


The Great Hungarian Plain in a time of change in the Balkans – Neolithic, Chalcolithic, and Bronze Age


I wrote recently about Anthony’s new model of Corded Ware culture expansion from Yamna settlements of Hungary. I am extremely sceptical about it in terms of current genetic finds, and suspicious of the real reasons behind it – probably misinterpretations of the so-called ‘Yamnaya ancestral component’ in recent genetic papers, rather than archaeological finds.

Nevertheless, it means a definitive rejection by Anthony of:

  • The multiple patron-client relationships he proposed to justify a cultural diffusion of Late Indo-European dialects from Yamna into different Corded Ware cultures in the forest-steppe and Forest Zone (see one of his latest summaries of the model in 2015). Now the language change is explained as a pure migration event, and cultural diffusion is not an option. Ergo, if no migration is found from Hungarian Yamna into Lesser Poland, then Corded Ware cultures were not Indo-European-speaking.
  • Ringe’s glottochronological tree for Proto-Indo-European languages (Ringe, Warnow, and Taylor 2002). An early and sudden split of Late PIE dialects in all directions is substituted by a common, Old European language that expanded from a very small area of settlers, in the Carpathian Basin. This is coincident with the current view on North-West Indo-European, and I think that his final acceptance of a sound linguistic model is essential to solve Indo-European questions.
  • The simplistic assumption of Yamna -> Corded Ware -> Bell Beaker migration found in genetic papers of 2015. The new model implies Yamna->Yamna settlers (Eastern Hungary). Yamna settlers are known to have developed into East Bell Beakers (as described by Gimbutas and accepted by Anthony originally, and now also found in the adoption of Heyd’s theory for his new model); therefore a Yamna settlers (Hungary) -> East Bell Beaker evolution is evident and mainstream, now clear also in genetics. It remains to be seen if the additional Yamna settlers (Hungary) -> Proto-Corded Ware migration proposed by him as a novelty in this new model is also right, i.e. if Yamna settlers from Hungary did in fact migrate into sites of Lesser Poland (to form a Proto-Corded Ware culture). If not, then only Heyd’s model remains.

This new model offers thus a more suitable time frame for usual proto-language guesstimates, that would be compatible with a spread of Late Indo-European with Yamna settlers (of R1b lineages) from the steppe into a small region, where North-West Indo-European would have been spoken, and then a potential cultural diffusion through (or founder effect in) a Proto-Corded Ware culture (of R1a-M417 subclades) of Lesser Poland, which is compatible with the Corded Ware Substrate hypothesis.

Since Anthony has stuck his neck out in favour of this new theory – changing some of his popular theories, and rejecting what many geneticists seem to take as certain – , and because of his previous impressive improvements over Gimbutas’ simple steppe theory (now apparently fashionable again), I think he deserves that his proposal of Yamna/Late Indo-European expansion in the Balkans be further investigated, if only to be improved upon.

I recently found the paper 4000-2000 BC in Hungary: The Age of Transformation, by T. Horváth, in Annales Universitatis Apulensis. Series Historica, 20/II, 51-113. While it deals mainly with the potential survival of the Baden culture into the late third millennium BC, it gives some interesting, quite early dates for Yamna (‘Pit’) graves in the Carpathian Basin, and for potential cultural (and population) movements within the Balkans.

A note about the Corded Ware culture in the Carpathian Basin:

Many researchers may assume that it is unnecessary for us to deal with the Corded Ware and Globular Amphorae cultures of north Germany, Poland and Denmark, and if so it does not matter what the names of the periods are. It actually matters a lot. It is true that in these areas there was no Baden complex, but the period had many Baden (and other) culture “period phenomena”. These seem to be part of a larger formation than cultures – evidenced by traces such as cattle burials, the relationship between copper metallurgies and jade – which link these territories even when the culture complexes were different, because these phenomena appear not just in the Baden, but in the Corded Ware and Globular Amphorae area as well (and these cultural complexes partly overlapped each other both in space and time!). Even the characteristics of sites show many similarities: e.g. in the northern part of the Corded Ware distribution area, mainly burials have been discovered (similarly to the Pit Grave culture in the Great Hungarian Plain) and in the southern part only settlements appear.

At the moment we have no explanation regarding the nature of the relationship between them (it is supposed that as a result of geographical conditions the people of the same culture lived in different ecological conditions and they adapted differently to their environment). In considering the whole of Europe around 3500-3000 BC, easily observable settlement signs disappeared (Milisauskas and Kruk, “Late Neolithic/Late Copper Age,” 307), similarly to Hungary, even though in Hungary this occurred from the end of the Middle Copper Age to the Early Bronze Age, between 4000 and 2000 BC. If we do not take into account that the cattle burials of the Baden culture between 3600 and 2800 BC, and possibly even longer than that, have analogies with the cattle burials of areas in the Early and Middle Neolithic Corded Ware culture (because “logically” analogies would be sought in those areas in the Bronze Age but this period is not analogous with that period in those areas), we would not find any spiritual resemblance in their relationships that lies behind their spatial and temporal analogies; cf. comp. Niels Johannsen and Steffen Laursen, “Routes and Wheeled Transport in Late 4th-Early 3rd Millennium Funerary Customs of the Jutland Peninsula: Regional Evidence and European Context,” PZ 85 (2010): 15-58; Horváth “The Intercultural Connections of the Baden „Culture,” 118. It is painful to think about how many relationships we have not explored or even assessed yet!

One version from both maps shown in the article, by T. Horváth: “Since the two cultures surely lived together in the Late Copper Age, their collective map represents the Late Copper Age (supplemented with Vučedol sites). Since the direction of diffusion of the Kostolac ceramic style is still unclear, two map versions were made. On one the Kostolac followed the Danube River, on the other they diffused in the opposite direction. In northeast Hungary, Coțofeni III appeared. On this map Kostolac sites are not depicted as dots but, in light of their position and density, proportionately sized arrows are used.”

On Yamna culture and burials in the Carpathian Basin:

Looking at Pit Grave kurgans on the distribution map, it is apparent that burials are the densest where there were no Boleráz or Baden occupations (in this respect this was a kind of “no man’s land”, but from the whole Late Copper Age perspective it was not: the sites of the Baden complex and Pit Grave complemented each other and even partially overlapped). Apart from burials, no Pit Grave settlements or other types of Pit Grave sites are known in Hungary, therefore we do not know whether Pit Grave settlements were situated near the kurgans or whether they were somewhere else entirely and we simply have not found them yet.

Since the Pit Grave people had a different lifestyle from the Baden, we can assume that, up to the line of the Tisza River, small animal-keeping mobile groups (Pit Grave) met more populated and settled, agriculturalist, indigenous Boleráz-Baden groups. Animal keepers (Pit Grave) settled in areas where agriculturalists (Boleráz and Baden) did not; in some places, however, they crossed each other’s paths (Fig. 5, 7). Sometimes their sites are very close to each other, sometimes they appear on one site and they can be identified in the stratigraphy of a site. In the latter case the kurgan is always situated on top of a Baden settlement, indicating that Pit Grave not only followed the Baden at these sites but may have represented a somewhat higher social power and belief system than the Baden.

The relationship between pastoral, patrilineal, combatant nomadic tribes and agriculturalist communities is often described as some sort of patron and client relationship. In reality, the signs of such assumption are not visible in the Pit Grave-Baden relationship. There are cases when more aggressive herders conquered more developed agriculturalist communities, but there are also cases when the conqueror’s culture was more developed or stronger than that of the conquered. Always, the conquering nomads are the patrons, the rulers and the empire builders.

In our case, timing is important. How much time had passed on those common sites where a Baden settlement was followed by a Pit Grave kurgan? In these cases, it is certain that the kurgan is younger, but how much younger?

From the article, by T. Horváth. “On the 10 locations analysed, surviving Baden can be assumed after 2800 BC. Unfortunately, it is not possible to predict which sites would survive further scrutiny of radiocarbon dating in this respect; only a few dates are available so far. Therefore, on the map of Baden that still existing after the Late Copper Age, I have also represented all sites (up to the Danube River line) and combined them with Early Bronze Age sites. Since the majority of Makó sites are represented by only one find (scattered finds), and the majority of sites have just one grave, it is impossible to ascertain whether it was part of a cemetery, was within a settlement, or was an individual burial without any further features. Therefore, following Dani 2005, I utilized subdivisions: perhaps in the future this fine subdivision will provide a meaningful explanation (1). Since the radiocarbon dates of Pit Grave kurgans clearly show that the Pit Grave survived at least until 2500 BC, I combined the previous map with that of the Pit Grave. This map would show a realistic picture of cultures after 2800 BC east of the Danube River (2).”

To sum up, the Pit Grave and Baden in the Late Copper Age were certainly contemporary from 3350 BC in the Great Hungarian Plain, and they had common sites, sites which were very close to each other, sites which were far from each other, and also independent sites. The Pit Grave culture surely survived in the transitional period, and into Early Bronze Age I, but perhaps even longer. For the most part, the Baden had ended by 2900 BC in the Great Hungarian Plain. Mapping and some other data (e.g. the discovery that Younger-type, not Mondsee-type, metal objects, which can now be considered to be Baden, even appear east of the Danube River) do not exclude the possibility of searching further for traces of Baden surviving in the Great Hungarian Plain together with or alongside the Pit Grave. On the common Baden-Pit Grave sites, even without carbon dating, we can assume from already known stratigraphical data that they closely followed each other in time.

For those of you interested in more detailed radiocarbon analysis and assessment of Yamna burials and settlements, from the steppe to the Balkans, to investigate Anthony’s theory further – apart from those authors referenced by him – , I can recommend reading Y. Rassamakin (e.g. Import and Imitation in Archaeology, 2008), S. Ivanova, or Claudia Gerling (e.g. Prehistoric Mobility and Diet in the West Eurasian Steppes 3500 to 300 BC).

Featured image, from the article, by T. Horváth: Distribution map of the Pit Grave.


The concept of “Outlier” in Human Ancestry (II): Early Khvalynsk, Sredni Stog, West Yamna, Iron Age Bulgaria, Potapovka, Andronovo…


I already wrote about the concept of outlier in Human Ancestry, so I am not going to repeat myself. This is just an update of “outliers” in recent studies, and their potential origins (here I will repeat some of the examples):

Early Khvalynsk: the three samples from the Samara region have quite different positions in PCA, from nearest to EHG (of Y-DNA haplogroup R1a) to nearest to ANE ancestry (of Y-DNA haplogroup Q). This could represent the initial consequences of the second wave of ANE ancestry – as found later in Yamna samples from a neighbouring region – , possibly brought then by Eurasian migrants related to haplogroup Q.
With only 3 samples, this is obviously just a tentative explanation of the finds. The samples can only be reasonably said to show an unstable time for the region in terms of admixture (i.e. probably migration), judging by the data on PCA.

Ukraine Eneolithic samples offer a curious example of how the concept of outlier can change radically: from the third version (May 30th) of the preprint paper of Mathieson et al. (2017), when the Ukraine Eneolithic sample with steppe ancestry (and clustering with central European samples) was the ‘outlier’, to the fourth version (September 19th), when two samples with steppe ancestry clustering close to Corded Ware samples were now the ‘normal’ ones (i.e. those representing Ukraine Eneolithic population), and the outlier was the one clustering closely with Ukraine Mesolithic samples…

PCA and Admixture for south-eastern Europe. Image modified from Mathieson et al. (2017) – Third revision (May 30th), used in the 2nd edition of the Indo-European demic diffusion model.

This is one of the funny consequences of the wrong interpretation of the ‘yamnaya component’, that made geneticists believe at first that, out of two samples (!), the ‘outlier’ was the one with ‘yamnaya’ ancestry, because this component would have been brought by an eastern immigrant from early Khvalynsk…

This example offers yet another reason why precise anthropological context is necessary to offer the right interpretation of results. Within the Indo-European demic diffusion model – based mainly on Archaeology and Linguistics – , the sample with steppe ancestry was the most logical find in the region for a potential origin of the Corded Ware culture, and it was interpreted as such, well before the publication of the fourth version of Mathieson et al. (2017).

PCA of South-East European and other European samples. Image modified from Mathieson et al. (2017) – Fourth revision (September 19th), used in the 3rd edition of the Indo-European demic diffusion model.

West Yamna (to insist on the same question, the ‘yamnaya’ component): we have only four western Yamna samples, two of them showing Anatolian Neolithic ancestry (one of them, from Ukraine, with a strong ‘southern’ drift). On the other hand, Corded Ware migrants do not show this. So we could infer that their migrations were not coetaneous: whereas peoples of Corded Ware culture expanded ca. 3300 BC to the north – in the natural corridor to the Baltic that has been proposed for this culture in Archaeology for decades (and that is well represented by Ukraine Eneolithic samples) – , peoples of Yamna culture expanded to the west, replacing the Ukraine Eneolithic population (i.e. probably those of ‘Proto-Corded Ware culture’), and eventually mixing with Balkan populations of Anatolian Neolithic ancestry.

Potapovka, Andronovo, and Srubna: while Potapovka clusters closely to the steppe, and Andronovo (like Sintashta) clusters closely to Corded Ware (i.e. Ukraine Neolithic / Central-East European), both have certain ‘outliers’ in PCA: the former has one individual clustering closely to Corded Ware, and the latter to the steppe. Both ‘outliers’ fit well with the interpretation of the recent mixture of Corded Ware peoples with steppe populations, and they offer a different image for the evolution of populations of Potapovka and Sintashta-Petrovka, potentially influencing their language. The position of Srubna samples, nearer to Sintashta and Andronovo (but occupying the same territory as the previous Potapovka) offers the image of a late westward conquest from Corded Ware-related populations.

Diachronic map of migrations ca. 2250-1750 BC

Iron Age Bulgaria: a sample of haplogroup R1a-Z93, with more ‘yamnaya’ ancestry than any other previous sample from the Balkans. For some, it might mean continuity from an older time. However – as with the Corded Ware outlier from Esperstedt before it – it is more likely a recent migrant from the steppe. The most likely origin of this individual is therefore people from the steppe, i.e. either the Srubna culture or a related group. Its relatively close cluster in PCA to certain recent Slavic populations can be interpreted in light of the multiple back and forth migrations in the region: of steppe populations to the west (Srubna, Cimmerians, Scythians, Sarmatians,…), and of Slavic-speaking populations:

Diachronic map of Bronze Age migrations ca. 1750-1250 BC.

Well-defined outliers are, therefore, essential to understand a recent history of admixture. On the other hand, the very concept of “outlier” can be a dangerous tool – when the lack of enough samples makes their classification as such unjustified – , leading to wrong interpretations.