Present-day domestic horses are immensely diverse in their maternally inherited mitochondrial DNA, yet they show very little variation on their paternally inherited Y chromosome. Although it has recently been shown that Y chromosomal diversity in domestic horses was higher at least until the Iron Age, when and why this diversity disappeared remain controversial questions. We genotyped 16 recently discovered Y chromosomal single-nucleotide polymorphisms in 96 ancient Eurasian stallions spanning the early domestication stages (Copper and Bronze Age) to the Middle Ages. Using this Y chromosomal time series, which covers nearly the entire history of horse domestication, we reveal how Y chromosomal diversity changed over time. Our results also show that the lack of multiple stallion lineages in the extant domestic population is caused by neither a founder effect nor random demographic effects but instead is the result of artificial selection—initially during the Iron Age by nomadic people from the Eurasian steppes and later during the Roman period. Moreover, the modern domestic haplotype probably derived from another, already advantageous, haplotype, most likely after the beginning of the domestication. In line with recent findings indicating that the Przewalski and domestic horse lineages remained connected by gene flow after they diverged about 45,000 years ago, we present evidence for Y chromosomal introgression of Przewalski horses into the gene pool of European domestic horses at least until medieval times.
The first record of the modern domestic Y chromosome haplotype stems from two Bronze Age samples of similar age. Notably, both samples were found in two distantly located regions: present-day Slovakia (2000–1600 BCE, dated by archaeological context) and western Siberia (14C-dated: 1609–1436 cal. BCE). Although a very recent study proposes an oriental origin of this haplotype (14), we cannot determine the geographical origin of Y-HT-1 with certainty, because this haplotype has not been found thus far in predomestic or wild stallions. There are two possible scenarios: (i) Y-HT-1 emerged within the domestic population by mutation and (ii) Y-HT-1 was already present in wild horses and entered the domestic population either at the beginning of domestication (but initially restricted to Asian horses) or later by introgression (from wild Y-HT-1 carrying studs during the Iron Age). Crosses between domestic animals and their wild counterparts have been observed in several domestic species (15–18); thus, the simplest explanation would be that we missed Y-HT-1 in older samples because of limited geographical sampling. However, the estimated haplotype age is contemporary (Fig. 4) with the assumed starting point of horse domestication ~4000–3500 BCE (19), rendering it likely that Y-HT-1 originated within the domestic horse gene pool. Still, we cannot rule out definitively that it appeared before domestication.
Independent of its geographical origin, Y-HT-1 progressively replaced all other haplotypes—except for one additional lineage that is restricted to Yakutian horses (11). Considering our data, this trend in paternal diversity toward dominance of the modern lineage appears to start in the Bronze Age and becomes even more pronounced during the Iron Age. The Bronze Age was a time of large-scale human migrations across Eurasia (20–22), movements that were undoubtedly facilitated by the spread of horses as a means of transport and warfare. At that time, the western Eurasian steppes were inhabited by highly mobile cultures that largely relied on horses (20, 21, 23, 24). The genetic admixture of northern and central European humans with Caucasians/eastern Europeans did correlate with the spread of the Yamnaya culture from the Pontic-Caspian steppe (25), an area that has repeatedly been suggested as the center of horse domestication (19, 26, 27). Given the importance of domestic horses, it appears that deliberate selection/rejection of certain stallions by these people might have contributed to the loss of paternal diversity. The spread of humans out of this region might also have resulted in the spread of Y-HT-1 from Asia to Europe. This scenario also agrees with recent findings that the low male diversity of extant horses is not caused by recruiting only a limited number of stallions during early domestication (13).
The presence of the Y chromosome haplotype carried by present-day Przewalski horses (Y-HT-2) in early domestic stallions and a European wild horse (Pie05; table S2) could be the result of introgression of Przewalski stallions. Although the original distribution of the Przewalski horse is unknown, it was probably much larger than that of the relict population in Mongolia that produced modern Przewalski horses and might even have extended into Central Europe. However, it is also possible that either Przewalski horses were among the initially domesticated horses or that Y-HT-2 occurred both in Przewalski horses and in those wild horses that are the ancestors of domestic horses, based on autosomal DNA data (30). Regardless of how Y-HT-2 entered the domestic gene pool, it was eventually lost, as were all haplotypes except Y-HT-1. In our sample set, Y-HT-2 was undetectable as early as the third time bin. However, it is possible that Y-HT-2 may have been present during this time period, but with a frequency below 0.11 (with 95% probability). The inferred time trajectories for Y-HT-2 frequencies suggest that it could nevertheless have persisted at very low frequencies until the Middle Ages (Fig. 3). On the basis of these simulations, this finding could be interpreted as a relic of this haplotype’s formerly higher frequency in the domestic horse gene pool. It is also possible that the presence of this haplotype could be the result of mating a wild stallion with a domestic mare, a frequently reported breeding practice when wild horses were still widely distributed. However, a significant contribution of the Przewalski horse to the gene pool of modern domestic horses has been almost ruled out by recent genomic studies (13, 31, 32).
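The "frequency below 0.11 (with 95% probability)" statement above is the kind of bound that follows from seeing a haplotype zero times in a sample. As a rough illustration only: if each of n independently sampled stallions carries the haplotype with probability p, the chance of missing it in all of them is (1 - p)^n, and the largest p still compatible with that at the 95% level is 1 - 0.05^(1/n). The function name and the sample sizes below are hypothetical. The excerpt does not state how many stallions fall in the third time bin, and the authors may have used a different (for example, Bayesian) calculation.

```python
def max_undetected_frequency(n_samples: int, confidence: float = 0.95) -> float:
    """Upper bound on a haplotype's frequency when it is seen 0 times in n_samples.

    Assumes independent sampling with a fixed frequency p, so that
    P(no carrier in n samples) = (1 - p) ** n. Solving (1 - p) ** n = 1 - confidence
    for p gives the bound returned here. This is an illustrative binomial sketch,
    not necessarily the method used in the quoted paper.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_samples)


if __name__ == "__main__":
    # Hypothetical sample sizes, chosen only to show how the bound shrinks with n.
    for n in (10, 26, 50):
        print(n, round(max_undetected_frequency(n), 3))
```

Under these assumptions, a bound near 0.11 corresponds to a time bin of roughly two dozen typed stallions. Smaller bins leave room for much higher undetected frequencies.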
The cemetery site of Sharakhalsun 2 is located approximately 160 km east of Stavropol in the north Caucasus region of Russia [see featured image]. It comprises a linear alignment of mounds situated on the right side of the river Kalaus near the Manych water reserve. This area was a focus of burial activity from the late 5th millennium BCE onwards, and is dotted with tens of thousands of mounds.
Burial mound 6 was 50 m in diameter and 3 m high and was initially constructed by communities of the Steppe Maikop culture in the late 4th millennium BCE (Yakovlev and Samoylenko, 2008). During the third millennium, the mound was reused by groups from the Yamnaya community, who added several graves to the centre (graves 4, 5, 16) and periphery of the mound (grave 3). Several construction layers of the mound embankment can be attributed to these Yamnaya communities.
The most intriguing aspect of mound 6 was the discovery of four burials with wagons or wagon parts. The oldest is grave 18, which was a narrow, deep catacomb-like shaft dug from the side into the existing mound. At the bottom of the shaft the skeleton of an adult male was discovered, buried in a sitting position on a four-wheeled wagon [see figure below]. Most wooden parts of the wagon were poorly preserved but it is obvious that they comprised a complete, assembled wagon that was squeezed into the burial chamber. No other grave inclusions were found. Due to the constant remodelling of the mound when new burials were added, the actual stratigraphic relationship to the central Yamnaya graves is unclear but wooden parts of the wagon have been radiocarbon dated to 4500 ± 40 BP (3356-3033 cal BCE, at 95.4%, OxCal 4.2.4; Bronk Ramsey, 2009), which links the grave to the early Yamnaya culture and specifically to a group that are in between the Maikop and Yamnaya.
Wagon burials are a well-known phenomenon in the Northwest and North Caucasian steppe zone and beyond. The dating of their archaeological contexts associates such graves with the Novotitarovskaya, Yamnaya and Catacomb Cultures (Gei, 2000; Häusler, 1982; Kaiser, 2007; Shishlina et al., 2013). There is a great variation in this type of burial, with some wagons being found intact and assembled within graves, wagons with dismantled wheels being found below burials, or wagon boxes being used as the grave ceiling. Assembled or dismantled wagons have also been found in specific chambers beside the burial pits (Belinskiy and Kalmykov, 2004; Gei, 2000; Häusler, 1982; Limberis and Marchenko, 2002).
Not only is grave 18 the oldest wagon burial in mound 6 at Sharakhalsun but the position of the individual is unique. Out of the approximately 280 wagon burials so far known from the Urals to the lower Danube (Kaiser, 2007), it is the only one where the associated individual was buried sitting on the wagon, in contrast to the typical supine burial position underneath the wagon box.
Various interpretations have been posited for the significance of wagons in funerary rituals of the period, with Kaiser (2003) arguing that their relative rarity in Catacomb Culture burials represents the beginnings of social stratification, while Reinhold et al. (2017) discuss whether they may have been related to ownership rather than active driving. Wagons may have started to be used as ceremonial vehicles rather than for purely utilitarian purposes, with their final function being as a hearse (Uckelmann, 2013), the corpse being laid out on the wagon bed. In cases where the wagons were dismantled, and therefore no longer able to serve a functional purpose, it has been argued that this represents either their symbolic disabling (Knüsel, 2002), or gives them a new ritualistic lease of life (Shishlina et al., 2014). The finding of partial wagons in some burials has been suggested to represent pars pro toto (Kaiser, 2003), with the symbolic importance of the vehicle overriding any practical use they may have had in the funerary rites.
The article goes on to enumerate the different injuries of the skeleton that are compatible with a wagon accident.
The burial of the individual, found in a seated position on a fully assembled wagon, is unique. When this form of burial is considered alongside the number and pattern of fractures found in the individual, which is also unique amongst the wider burial population, it has to be considered whether the individual could have been an active wagon-driver who sustained the majority, if not all, of the injuries in a severe accident whilst engaged in this activity.
There are some skeletal features recorded in the individual that could suggest heavy and unusual physical loading that may have been associated with habitual wagon-driving, although it must always be borne in mind that inferring specific occupations from activity-related skeletal changes is fraught with difficulties (see Jurmain et al., 2011; Villotte and Knüsel, 2013). The individual demonstrated heavy or abnormal use of muscle groups and ligaments involved in anterior and lateral flexion of the neck; elevation and stabilisation of the shoulders; abduction, adduction, rotation, flexion and extension of the arm; extension and flexion of the wrist; flexion, rotation and stabilisation of the thigh; and flexion of the knee (see Appendix for a more detailed description of these entheseal changes). All of these would be typical body movements expected in the action of sitting on a wagon and controlling the cart animals. The same pattern of entheseal changes was found in individuals examined by Molleson and Hodges (1993) and Kozak (2014), who also argued that this could suggest the presence of wagon drivers in their skeletal samples.
[Spondylolisthesis of the fourth lumbar vertebra and nonunion of the left ulna fracture] suggest that the individual recovered from his injuries, despite their severity, and continued with his former activities. However, the non-union of the ulna fracture may have resulted in some functional problems (dos Reis et al., 2009), while vertebral compression fractures often leave patients with chronic pain (Silverman, 1992). The individual had also developed severe secondary arthritis of the head of the first metacarpal and proximal phalanx as a result of the fracture to the metacarpal. Complications may also have arisen with the multiple rib fractures, especially those of the lower ribs, which are often associated with abdominal injuries (Brickley, 2006), while isolated fractures of the fibula can be associated with severe soft tissue damage to the ankle (Galloway, 2014b). Unfortunately, it is not possible to state with any certainty the degree to which the individual may have been affected by any complications in terms of loss of function or pain, as these are very specific to each individual (Petrie, 1967).
The majority of the suite of traumatic injuries suffered by this individual possibly relates to a single accident a number of months, if not years, before his death. The typical aetiology of these injuries would suggest that this may have been a fall from a wagon, with subsequent crushing by the vehicle landing on top of them, or “overrun” of a wheel across the chest of the individual, an accident involving their draft animals, or a combination of all three. The survival and recovery of the individual, despite the severity of his injuries, would probably have been a notable event in the community, and it is interesting to speculate whether the unique positioning of the individual in his grave, sitting on a wagon rather than buried in a supine position underneath the wagon box, was some form of commemoration of the event.
It explores one of the main issues we are observing with ancient DNA, the greater reduction in Y-DNA lineages relative to mtDNA lineages, and its most likely explanation (which I discussed recently).
Excerpts interesting for the Indo-European question (emphasis mine):
Gimbutas’s reconstruction has been criticized as fantastical by her critics, and any attempt to paint a vivid picture of what a human culture was like before the period of written texts needs to be viewed with caution. Nevertheless, ancient DNA data has provided evidence that the Yamnaya were indeed a society in which power was concentrated among a small number of elite males. The Y chromosomes that the Yamnaya carried were nearly all of a few types, which shows that a limited number of males must have been extraordinarily successful in spreading their genes. In contrast, in their mitochondrial DNA, the Yamnaya had more diverse sequences (9). The descendants of the Yamnaya or their close relatives spread their Y chromosomes into Europe and India, and the demographic impact of this expansion was profound, as the Y-chromosome types they carried were absent in Europe and India before the Bronze Age but are predominant in both places today (13).
This Yamnaya expansion also cannot have been entirely friendly, as is clear from the fact that the proportion of Y chromosomes of steppe origin in both western Europe (14) and in India (15) today is much larger than the proportion of the rest of the genome. This preponderance of male ancestry coming from the steppe implies that male descendants of the Yamnaya with political or social power were more successful at competing for local mates than men from the local groups. The most striking example I know is from Iberia in far southwestern Europe, where Yamnaya-derived ancestry arrived suddenly at the onset of the Bronze Age between 4,500 and 4,000 years ago. Daniel Bradley’s laboratory and my laboratory independently produced ancient DNA from individuals of this period (14). We find that in the first Iberians with Yamnaya-derived ancestry, the proportion of Yamnaya ancestry across the whole genome is almost never more than around 15 percent. However, around 90 percent of males who carry Yamnaya ancestry have a Y-chromosome type of steppe origin that was absent in Iberia prior to that time. It is clear that there were extraordinary hierarchies and imbalances in power at work in the Yamnaya expansions.
David Reich clearly doesn’t give a damn about how other people might react to his commentaries. That’s nice.
In any case, if anyone was still in denial, R1b-M269 expanded with Yamna (through the Bell Beaker expansion) into Iberia, hence yes, 90% of modern Basque male lineages have an origin in the steppe, like the R1b-DF27 sample recently found, and their common ancestor spoke Late Proto-Indo-European.
The recent publication of Narasimhan et al. (2018) has made the draft of this post somewhat outdated and, at the same time, even more interesting.
While we wait for the publication of the dataset (and the actual Y-DNA haplogroups and precise subclades with the revision of the paper), and as we watch the wrath of Hindu nationalists vented against the West (as if the steppe was in Western Europe) and science itself, we have already seen confirmation from the Reich Lab of their new approach to Late Proto-Indo-European migrations.
Yamna/Steppe EMBA, previously identified as the direct source of “steppe” ancestry (AKA ‘Yamnaya’ ancestry) and Late Indo-European migrations in Asia – through Corded Ware, it is to be understood – has been officially changed. In the case of Indo-Iranian migrations it is the “Steppe MLBA cloud”, after a direct contribution to it of Yamna/Steppe EMBA, which expanded Indo-Iranian, as I predicted ancient DNA could support.
On Twitter, the main author gave the following response when asked about this change regarding the origin of steppe ancestry in Asian migrants (emphasis mine):
Our reasons are:
The Turan samples show no elevated steppe ancestry till 2000BC.
MLBA is R1a
Indus periphery doesn’t have steppe ancestry but Swat does, and EMBA doesn’t work both in terms of time or genetic ancestry to explain the difference.
I am glad to see it finally recognized that Y-DNA haplogroups and time have to be taken into account, and happy also to see an end to the by now obsolete ‘ADMIXTURE/PCA-only relevance’ in Human Ancestry. The timing of archaeological migrations, the cultural attribution of each sample, and the role of Y-DNA variability reduction and expansion have finally been recognized as equally important to assess potential migrations, as I requested.
This change was already in the making some months ago, when David Anthony – who has worked with the group for this paper and others before it – already changed his official view on Corded Ware – from his previous support of the 2015 model. His latest theory, which linked Yamna settlements in Hungary with a potential mixed society of migrants (of R1b-L23 and R1a-Z645 lineages) from West Yamna, is most likely wrong, too, but it was clearly a brave step forward in the right direction.
The only reasonable model now is that Yamna expanded Late Proto-Indo-European languages with steppe ancestry + R1b-L23 subclades.
You can either accept this change, or you can deny it and wait until one sample of R1a-Z645 appears in West Yamna or central Europe, or one sample of R1b-L23 appears in Corded Ware (as it is obvious it could happen), to keep spreading the wrong ideas still some more years, while the rest of the world goes on: Mallory, Anthony, and other archaeologists co-authoring the latest paper (probably part of the stronger partnership with academics that we were going to see), who had formally put forward complex, detailed theories, investing their time and name in them, have rejected their previous migration models to develop new ones based on the most recent findings. If they can do that, I am sure any amateur geneticist out there can, too.
The Balto-Slavic dialect and its homeland
An interesting question in Linguistics and Archaeology, now that Corded Ware cannot be identified as “Indo-Slavonic” or any other imaginary ancient group (like Indo-Slavo-Germanic), remains thus mostly unchanged since before the famous 2015 genetic papers:
Was Balto-Slavic a dialect of the expanding North-West Indo-European language, a Northern LPIE dialect, as we support, based on morphological and lexical isoglosses?
Or was it part of an Indo-Slavonic group in East Yamna, i.e. a Graeco-Aryan dialect, based mainly on the traditional Satem-Centum phonological division?
I am a strong supporter of Balto-Slavic being a member of a North-West Indo-European group. That’s probably because I educated myself first with the main Spanish books* on Proto-Indo-European reconstruction, and their authors kept repeating this consistent idea, but I have found no relevant data to reject it in the past 15 years.
* Today two of the three volumes are available in English, although they are from the early 1990s, hence a bit outdated. They also maintain certain peculiarities from Adrados’ own personal theories, such as multiple (coloured) laryngeals, 5 cases – with a common ancestral oblique case – for Middle PIE, etc. But they contain many detailed discussions of the different aspects of the reconstruction. They are not an easy introductory manual to the field, though; for that you already have many famous short handbooks out there, like those of Fortson (N.American), Beekes (Leiden), or Meier-Brügger (Germany).
Fernando and I have always maintained that North-West Indo-European must have formed a very recent community, probably connected well into the early 2nd millennium BC for certain recent isoglosses to spread among its early dialects, based on our guesstimates*, and on our belief that it formed at some point not just a dialect continuum, but probably a common language, so we estimated that the expansion was associated with the pan-European influence of Únětice and close early Bronze Age European contacts.
NOTE. I know, you must be thinking “linguistic guesstimates? Bollocks, that’s not Science”. Right? Wrong. When you learn a dozen languages from different branches, half a dozen ancient ones, and then still study some reconstructed proto-languages from them, you begin to make your own assumptions about how the language changes you perceive could have developed according to your mental time frames. If you just learned a second language and some Latin in school, and try to make assumptions as to how language changes, or you believe you can judge it with this limited background, you evidently have the wrong idea of what a guesstimate is. I accept criticism of this concept from a scientist used only to statistical methods, since it comes from pure ignorance of what it means. And I accept alternative guesstimates from linguists whose language backgrounds may differ (and thus their perception of language change). However, I would not accept a glottochronological or otherwise (supposedly) statistical model instead (or a religious model, for that matter), so we have no alternatives to guesstimates for the moment.
In fact, guesstimates and dialectalization have paved the way to the steppe hypothesis, first with the kurgan hypothesis by Marija Gimbutas, then complemented further in the past 60 years by linguists and archaeologists into a detailed Khvalynsk -> Yamna -> Afanasevo/Bell Beaker/Sintashta-Andronovo expansion model, now confirmed with genomics. So either you trust us (or any other polyglot who deals with Indo-European matters, like Adrados, Lehmann, Beekes, Kloekhorst, Kortlandt, etc.), or you begin learning ancient languages and obtaining your own guesstimates, whichever way you prefer. The easy way of numbers + computer science does not exist yet, and is quite far from happening – until we can understand how our brains summarize and select important details involved in obtaining estimates – , no matter what you might be reading (even in Nature or Science) recently…
Data from the 2015 papers changed my understanding of the original NWIE-speaking community, and I have since shifted my preferred anthropological model (from a Northern dialect in Yamna spreading into a loose NWIE-speaking Corded Ware -> Únětice) to a quite close group formed by late Yamna settlers in the Carpathian Basin, expanded as East Bell Beakers, and later continuing with close contacts through Central European EBA.
NOTE. As you can read, we initially rejected Gimbutas’ and Anthony’s (2007) notion of a Late PIE splitting suddenly into all known dialects (viz. Italo-Celtic with Vučedol/Bell Beaker), and looked thus for a common NWIE spread with Corded Ware migrants, with help from inferences of modern haplogroup distribution (as was common in the early 2000s). Language reconstruction was the foundation of that model, and it was right in its own way. It probably gave the wrong idea to geneticists and archaeologists, who quite easily accepted some results from the 2015 papers as supporting this model. But it also helped us develop a new model and predict what would happen in future papers, as demonstrated in O&M 2018. Any alternative linguistic and archaeological model could explain what is seen today in genomics, but our model of North-West Indo-European reconstruction is obviously at present the best fit for it.
Nevertheless, one of the most important Balticists and Slavicists alive, Frederik Kortlandt, posits that there was in fact an Indo-Slavonic group, so one has to take that possibility into account. Not that his ideas are flawless, of course: he defends the glottalic theory – which is still held today by just a handful of researchers – , and I strongly oppose his description of Balto-Slavic and Germanic oblique cases in *-m- (against other LPIE *-bh-) as an ancestral remnant related to Anatolian (an ending which few scholars would agree corresponds to what he claims), since that would probably represent an older split than warranted in our model. I believe genetics is proving that the dialectalization of Late PIE happened as Fernando López-Menchero and I described.
NOTE. The point of these examples of how he has been wrong in LPIE and MPIE reconstruction is not to repeat the common ad hominem arguments used by amateur geneticists to dismiss academic proposals (“he said that and was wrong, ergo he is wrong now”). It is to call attention to the fact that the argument from authority is important for the academic community insofar as it creates a common ground, i.e. especially when there are many relevant scholars agreeing on the same subject. But, indeed, any model can and should be challenged, and all authorities are capable of being wrong, and in fact they often are.
The most common explanation today for the dialectal development *-m- is an innovation (not an archaism), whether morphological (viz. Ita. and Gk. them. pl *-i) or phonological (as I defend); and the most commonly repeated model for the satemization trend (even for those supporting a three-dorsal theory for PIE) is areal contact, whether driven by a previous (most likely Uralic) substratum, or not. Hence, if Kortlandt’s main different phonological and morphological assessments of the parent language are flawed, and they are the basis for his dialectal scheme, it should be revised.
The ‘atomic bomb’ that Indo-Slavonic proponents launched, in my opinion, was Holzer’s Temematic (born roughly at the same time as the renewed Old European concept in the North-West Indo-European model of Oettinger) – and indeed Kortlandt’s acceptance of it. It seems to me like the linguistic equivalent of the archaeological “patron-client relationship” proposed by Anthony for a cultural diffusion of Late PIE into different Corded Ware regions: almost impossible to reject fully, if the Indo-Slavonic superstrate is proposed for a relatively early time.
In my opinion, the shared morphological layer with North-West Indo-European is obviously older than Iranian influence on Slavic, and I think this is communis opinio today. But how could we disentangle the dialectalization of Balto-Slavic, if there is (as it seems) an ancestral substrate layer (most likely Uralic) common to both Balto-Slavic and Indo-Iranian? It seems a very difficult task.
The expansion of Balto-Slavic
In any case, there are two, and only two mainstream choices right now.
NOTE. Mainstream, as in representing trends current today among Indo-Europeanists, so that many programs around the world would explain these alternative models to their students, or they would easily appear in most handbooks. Not like the word “mainstream” you read in any comment out there by anyone who has never been interested in Indo-European studies, and uses any text from any author, written who knows how long ago, merely to justify their ethnic preconceptions coupled with certain genomic finds.
You can agree with:
A) The Spanish and German schools of thought, together with many American and British scholars, as well as archaeologists like Heyd, Mallory, or Prescott, and now Anthony, too: the language ancestral to Balto-Slavic, Germanic, and Italo-Celtic accompanied expanding West Yamna/East Bell Beakers into Europe, and then their speakers – like the rest of peoples everywhere in Europe – admixed later in the different regions.
B) Frederik Kortlandt and other Indo-Slavicists. The ‘original’ Balto-Slavic would have spread with Srubna (and likely Potapovka before it), as a product of the admixture of East Yamna’s Indo-Slavonic with incoming Corded Ware migrants (this would correspond to my description of Indo-Iranian). ‘True’ Balto-Slavic speakers would have then absorbed the Temematic-speaking migrants (equivalent to early Balto-Slavic migrants as described in the demic diffusion model) spreading from the west, most likely in the steppe. Later developments from the steppe would have then brought Baltic to the north, and Slavic to the west.
Therefore, in both cases the language spoken by early R1a-Z645 lineages in Únětice or Mierzanowice/Nitra EBA cultures would have been an eastern North-West Indo-European dialect associated with expanding Bell Beakers, and closely related to Germanic and Italo-Celtic. In the second case, the ancient samples we see genetically closer to modern West Slavs could thus be identified with those speaking the Temematic substrate absorbed later by Balto-Slavic, or maybe by Balts migrating northward, and Slavs spreading west- and southward.
NOTE. In any case, we know that R1a-Z645 subclades resurged in Central-East Europe after the expansion of Bell Beakers, potentially showing an ancient link with the prevalent R1a subclades in the region today. We know that some ancient Central European populations cluster near modern West Slavs, but in other interesting regions (like the British Isles, Central Europe, Scandinavia, or Iberia) we also see close clusters, and nevertheless observe historically documented radical ethnolinguistic changes, as well as many different subsequent genetic inflows and founder effects, that have significantly altered the anthropological picture in these regions, so it could very well be that the lineages we find in ancient samples do not correspond to modern West Slavic lineages, or even similar ancient and modern lineages could show a radical cultural discontinuity (as is likely the case in this to-and-from-the-steppe migration scheme).
Since we are going to see signs of both – west and east admixture – in early Slavic communities near the steppe, and the distribution from South, West, and East Slavs will include a wide “cloud” connecting Central, East, and South-East Europe, as is evident already from early Germanic samples, it may be interesting to shift our attention to the Tollense valley and Lusatian samples, and their predominant Y-DNA haplogroups. Once again, tracking male-driven migrations from Central Europe to the Baltic region and the steppe, and back again to much of Central and South Europe, will determine which groups expanded this eastern NWIE dialect initially and in later times.
Since Baltic and Slavic languages are attested quite late, genetics is likely to help us select among the different available models for Balto-Slavic, although (it is worth repeating) these lineages may not be the same ones that later expanded each dialect.
NOTE. Bronze and Iron Age samples might begin to depict the true Balto-Slavic migration map. Apart from the strong differences in the satemization processes seen among Baltic, Slavic, and Indo-Iranian, from an archaeological point of view the geographic location of the earliest attested Baltic languages and the prehistoric developments of the region seem to me almost incompatible with a homeland in the steppe. Anyway, in the worst-case scenario – for those of us who work with Balto-Slavic to reconstruct North-West Indo-European – there is consensus that there must have been an eastern North-West Indo-European language (which some would call Temematic), whose common traits with Germanic and Italo-Celtic we use to reconstruct their parent language. The question thus remains mostly theoretical, of limited pragmatic use for the reconstruction.
The third way: Baltic Late Neolithic
I have referred to Kristiansen and his group’s position regarding Corded Ware as Indo-European as flawed before. While their latest interpretation (and language identification) was wrong, Kristiansen’s original idea of long-lasting contacts in the Dnieper-Dniester region with the area occupied by late Trypillia developing a Proto-Corded Ware culture was probably right, as we are seeing now.
New data in Mittnik et al. 2018 show some interesting early Late Neolithic samples from the Baltic region – Zvejnieki, Gyvakarai1 (R1a-Z645) and Plinkaigalis242 – proving what I predicted: that elevated steppe ancestry and R1a-Z645 subclades would be found in the Dnieper-Dniester region unrelated to the Yamna expansion, and, it seems, to migrants of the Corded Ware A-horizon.
Funnily enough, this shows that there were probably ancient interactions in the region, as originally asserted by Kristiansen, and probably following some of Victor Klochko’s proposed exchange paths, but earlier than predicted by him.
Funny also how Anthony, too – like Kristiansen – may have been right all along since 2007, in proposing that Corded Ware (the nuclear Corded Ware migrants) stemmed from the Dnieper-Dniester region roughly at the same time as Yamna migrants expanded west, and that they did not have any direct genetic connection (in terms of migrations) with each other.
Both researchers, who collaborated with the latest genomic research, remade their models and now have to revise their most recent proposals in light of the new data. Their pressure to be right in their previous models influences each new paper published, while new genomic data compel them to change their theories under the pressure not to be too wrong again, in a strange vicious circle. Had they remained silent and committed to their archaeological theories, they could have been right all along, each one in their own way.
NOTE. BTW, in case you see ad hominem here too, I feel compelled to say that only thanks to their commitment to disentangling the truth about ancient migrations, and their readiness to collaborate with genetic research – unlike many others in their field – we know today what we know. If they have been wrong many times, it is because they have tried to connect the genetic dots as they were told. If only because of their readiness to explore their science further, they should be praised by all. But, again, that does not mean that they cannot be wrong in their models…
Thanks to Anthony’s latest change of mind, we don’t have to hear the “cultural diffusion” argument anymore, and I consider this a great advance for the field.
NOTE. Not that there could not be prehistoric cultural diffusion events of language (i.e. not accompanied by genetic admixture), of course, but such theories, almost impossible to disprove, probably need much more than a simple “patron-client relationship” proposal and anthropometry to justify them, in a time when we will be able to see almost every meaningful personal exchange in Genomics…
Today – since the finding of Ukraine_Eneolithic sample I6561, of haplogroup R1a-Z93, dated ca. 4200 BC, and likely from the Sredni Stog culture – it seems more likely than ever that the expansion of R1a-Z645 subclades was in fact associated with the spread of steppe admixture probably near the North Pontic forest-steppe region, most likely from the Dnieper-Dniester or Upper Dniester region.
The appearance of a ‘late’ Z93 subclade already at such an early date, with steppe admixture, makes it still more likely that the Proto-Corded Ware culture, from where Corded Ware migrants of R1a-Z645 lineages later spread, was probably associated with this wide region.
NOTE. A migration of Yamna settlers northward along the Prut dated ca. 3000 BC or later could have justified the appearance of steppe admixture in the Dnieper-Dniester region, as I proposed for the Zvejnieki sample, although dates from Baltic samples are likely too early for that. For this to be corroborated, migrants should be accompanied up to a certain region by R1b-L23 lineages, and this could mean in turn a revival of Anthony’s original model of cultural diffusion of 2007. The most likely scenario, however, as predicted by Heyd, given the early appearance of steppe admixture and R1a-Z93 subclades in the forest-steppe during the 5th millennium, is that the admixture happened much earlier than that, fully unrelated to Late PIE migrations.
The modern Baltic and Slavic conundrum
As for some people of Northern European ancestry previously supporting a bulletproof Yamna (R1a/R1b) -> Corded Ware migration that was obviously wrong; now supporting different Sredni Stog -> Corded Ware groups representing Indo-Slavonic (and Germanic??) in a model that is clearly wrong: how are these attempts different from Western Europeans supporting the autochthonous continuity of R1b-P312 lineages against all recent data, from Indians supporting the autochthonous continuity of R1a-M417 lineages no matter what, and from the more recent trend of autochthonous continuity theories for N1c lineages and Uralic in Eastern Europe?
Modern Germanic-speaking peoples can trace their common language to Nordic Iron Age Proto-Germanic, Celts to La Tène’s expansion of Proto-Celtic, and Romance speakers to the Roman expansion (and to an earlier Proto-Italic), all three dating approximately to the Iron Age. Proto-Slavic is dated much later than that, and probably Proto-Baltic too (or maybe earlier depending on the dialectal proposal), with Balto-Slavic being possibly coeval with Pre-Proto-Germanic and Italo-Celtic, but probably slightly later than that. Also, the language ancestral to Slavic may be (like a theoretical Proto-Romance language) impossible to reconstruct with precision, due to multiple substrate (or superstrate?) influences on the wide territory where Proto-Slavic formed and expanded from, in close alliance with steppe communities of different ethnolinguistic backgrounds.
We know that proto-historic Germanic, Celtic, and Italic peoples spread from relatively small regions, and had almost nothing to do with historic groups speaking their daughter languages, let alone modern speakers. Baltic and Slavic are not different.
NOTE. We have read that Weltzin samples clustered closely to Central Europeans (especially Austrians), and at a certain distance from modern Poles. That’s the conclusion of Sell’s PhD thesis, and it may be right, if you take only modern samples for comparison. However, if you have read or thought that they represented some kind of “ancestral Germanic vs. Slavic” battle, please imagine Trump’s voice for my opinion: Wrroonng, wrroonng, wrroonng. They cluster closely with Bell Beaker migrants, Poland BA, and Únětice (in this order), which we now know thanks to the data from O&M 2018 and Mittnik et al. 2018. And we also know who they don’t cluster close to: Corded Ware and Trzciniec samples. Therefore, people from the region near the most likely homelands of Pre-Proto-Germanic and Proto-Balto-Slavic are – as expected – likely descendants of Bell Beaker migrants in Central Europe. The genetic relationship of those ancient samples to modern inhabitants of Central-East Europe? Not obvious – at all.
We also know (and have known for a long time, well before these recent papers) that the oldest attested Indo-European languages – Mycenaean, early Anatolian languages, and Indo-Aryan (through certain words in Mitanni inscriptions) – do not show continuity from the places where they were first attested to the Late and Middle Proto-Indo-European (steppe) homeland either. There should be no problem then in accepting that there is no linguistic, archaeological, or common sense reason to support that Balto-Slavic is older or shows more regional continuity than other IE languages from Europe.
NOTE. Oh yes, Balts saying “Baltic is the most similar language to PIE” I hear you thinking? Uh-huh, sure. And according to some Greeks (supported e.g. by the conclusions from Lazaridis et al. 2017) Mycenaeans were ‘autochthonous’, and Proto-Greek the most similar to PIE. For many Hindus, Vedic Sanskrit is in fact PIE, and the latest paper by Narasimhan et al. (2018) only reinforces this idea (don’t ask me why). Also, Caucasian scholar Gamkrelidze (with Ivanov) supported the origin of the language precisely in the Caucasus, with Armenian being thus the purest language. For Italian fans of Virgil and the Roman Empire, Latin (like Aeneas) comes from Anatolian linguistically and genetically, hence it must be the ‘oldest’ IE dialect alive… No, wait, Danish scholars Kroonen and Iversen quite recently asserted that Germanic is the oldest to branch off, so it should thus be nearest to PIE! I think you can see a pattern here… And don’t forget about the new Vasconic-Uralic hypotheses going on now, with Vasconic fans of R1b changing from Palaeolithic to Mesolithic, and now to European Neolithic and whatnot, or Uralic fans of N1c changing now from Mesolithic EHG to Siberia (for ancestry) or Central Asia (for N1c subclades), or whatever is necessary to believe in ‘continuity’ of their people following the newest genetic papers… Just pick whatever theory you want, call it “mainstream”, and that’s it.
So, if there is no reliable archaeological model connecting Bronze or Iron Age cultures to Eastern European cultures which are supposed to represent the Proto-Slavic and Proto-Baltic homelands… why on earth would any reasonable amateur (not to speak of scholars) dare propose any sort of genetic or linguistic continuity for thousands of years from PIE to early Slavs, a people whose first blurry appearance in historical records happened during the Middle Ages in rather turbulent and genetically admixed regions? It does not make any sense, and it has all the odds against it. Blond hair, blue eyes, lactase persistence? Sure, and ABO group, brachycephaly, anthropometry… All very scientifish.
Human ancestry can only help refine solid academic theories; it cannot create one. Every new pet theory used to satisfy modern cultural pre- and misconceptions has failed, and it will fail again, and again, and again…
To have one’s own anthropological model of prehistoric migration requires time and study. It is not enough to play with software and to misuse traditional academic disciplines just to ‘prove’ some completely irrelevant, meaningless, and false continuity.
The genetic formation of Central and South Asian populations has been unclear because of an absence of ancient DNA. To address this gap, we generated genome-wide data from 362 ancient individuals, including the first from eastern Iran, Turan (Uzbekistan, Turkmenistan, and Tajikistan), Bronze Age Kazakhstan, and South Asia. Our data reveal a complex set of genetic sources that ultimately combined to form the ancestry of South Asians today. We document a southward spread of genetic ancestry from the Eurasian Steppe, correlating with the archaeologically known expansion of pastoralist sites from the Steppe to Turan in the Middle Bronze Age (2300-1500 BCE). These Steppe communities mixed genetically with peoples of the Bactria Margiana Archaeological Complex (BMAC) whom they encountered in Turan (primarily descendants of earlier agriculturalists of Iran), but there is no evidence that the main BMAC population contributed genetically to later South Asians. Instead, Steppe communities integrated farther south throughout the 2nd millennium BCE, and we show that they mixed with a more southern population that we document at multiple sites as outlier individuals exhibiting a distinctive mixture of ancestry related to Iranian agriculturalists and South Asian hunter-gathers. We call this group Indus Periphery because they were found at sites in cultural contact with the Indus Valley Civilization (IVC) and along its northern fringe, and also because they were genetically similar to post-IVC groups in the Swat Valley of Pakistan. 
By co-analyzing ancient DNA and genomic data from diverse present-day South Asians, we show that Indus Periphery-related people are the single most important source of ancestry in South Asia — consistent with the idea that the Indus Periphery individuals are providing us with the first direct look at the ancestry of peoples of the IVC — and we develop a model for the formation of present-day South Asians in terms of the temporally and geographically proximate sources of Indus Periphery-related, Steppe, and local South Asian hunter-gatherer-related ancestry. Our results show how ancestry from the Steppe genetically linked Europe and South Asia in the Bronze Age, and identifies the populations that almost certainly were responsible for spreading Indo-European languages across much of Eurasia.
NOTE. The supplementary material seems to be full of errors right now, because it lists as R1b-M269 (and further subclades) samples that have been previously expressly said were xM269, so we will have to wait to see if there are big surprises here. So, for example, samples from Mal’ta (M269), Iron Gates (M269 and L51), and Latvia Mesolithic (L51), a Deriivka sample from 5230 BC (M269), Armenia_EBA (Z2103)…Also, the sample from Yuzhnyy Oleni Ostrov is R1a-M417 now.
EDIT (1 APR 2018): The main author has confirmed on Twitter that they have used a new Y Chr caller that calls haplogroups given the data provided, and depending on the coverage tried to provide a call to the lowest branch of the tree possible, so there are obviously a lot of mistakes – not just in the subclades of R. A revision of the paper is on its way, and soon more people will be able to work with the actual samples, since they say they are releasing them.
Nevertheless, since subclades (and not haplogroups) are the apparent source of gross errors, for the moment it seems we can say with a great degree of confidence that:
New samples of East Yamna / Poltavka are of haplogroup R1b-L23.
Afanasevo is confirmed to be dominated by R1b-M269.
With lesser confidence in precise subclades, we find that:
A sample from Hajji Firuz in Iran ca. 5650 BC, of subclade R1b-Z2103, may confirm Mesolithic R1b-M269 lineages from the Caucasus as the source of CHG ancestry to Khvalynsk/Yamna, and be thus the reason why Reich wrote about a potential PIE homeland south of the Caucasus. (EDIT 11 APR 2018) The sample shows steppe ancestry, therefore the date is most likely incorrect, and a new radiocarbon dating is due. It is still interesting – depending on the precise subclade – for its potential relationship with IE migrations into the area.
New samples of East Yamna / Poltavka are of haplogroup R1b-Z2103.
Afanasevo migrants are mainly of haplogroup R1b-Z2103.
The Darra-e Kur sample, ca. 2655, of haplogroup R1b-L151, without a clear cultural ascription, may be the expected sign of Afanasevo migrants (Pre-Proto-Tocharian speakers) expanding a Northern Indo-European (in contrast with a Southern or Graeco-Aryan) dialect, in a region closely linked with the later desert mummies in the Tarim Basin. Its early presence there would speak in favour of a migration through the Inner Asian Mountain Corridor previous to the one caused by Andronovo migrants.
Sintashta shows a mixed R1b-Z2103 / R1a-Z93 society.
Later Indo-Iranian migrations are apparently dominated by R1a-Z2123, an early subclade of R1a-Z93, also found in Srubna.
R1b is also seen later in BMAC (ca. 1487 BC), although its subclade is not given.
There is also a sample of R1a-Z283 subclade in the eastern steppe (ca. 1600 BC). What may be interesting about it is that it could mark one of the subclades not responsible for the expansion of Balto-Slavic (or responsible for it with the expansion of Srubna, for those who support an Indo-Slavonic branch related to Sintashta-Potapovka).
A sample of R1b-U106 subclade is found in Loebanr_IA ca. 950 BC, which – together with the sample of Darra-e Kur – is compatible with the presence of L51 in Yamna.
NOTE. Errors in haplogroups of previously published samples make every subclade of new samples from the supplementary table questionable, but all new samples (save for the Darra_i_Kur one) were analysed and probably reported by the Reich Lab, and at least upper subclades in each haplogroup tree seem mostly coherent with what was expected. Also, the contribution of Iranian Farmer related ancestry (a population in turn contributing to Hajji Firuz) to Khvalynsk in their sketch of the genetic history may be a sign of the association of R1b-M269 lineages with CHG ancestry, although previous data on precise R1b subclades in the region contradict this. (EDIT 11 APR 2018) The sample of Hajji Firuz is most likely much younger than the published date, hence its younger subclade may be correct. No revision or comment on this matter has been published, though.
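The distinction drawn above, between upper haplogroups that look reliable and terminal subclades that do not, can be sketched in code. This is only a minimal illustration under stated assumptions, not the caller used in the paper: the tree is heavily simplified to the markers named in this post (intermediate branches of the real Y-DNA tree are omitted), and the `PARENT` table, the `TRUSTED` set, and the `ancestors` / `back_off` helpers are hypothetical names introduced here.

```python
# Simplified parent links for markers mentioned in this post.
# ASSUMPTION: intermediate branches of the real Y-DNA tree are omitted.
PARENT = {
    "R1b-L23": "R1b-M269",
    "R1b-Z2103": "R1b-L23",
    "R1b-L51": "R1b-L23",
    "R1b-L151": "R1b-L51",
    "R1b-U106": "R1b-L151",
    "R1b-P312": "R1b-L151",
    "R1a-Z645": "R1a-M417",
    "R1a-Z93": "R1a-Z645",
    "R1a-Z283": "R1a-Z645",
    "R1a-Z2123": "R1a-Z93",
}


def ancestors(clade):
    """Return the clade followed by every upstream node, nearest first."""
    path = [clade]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def back_off(call, trusted):
    """Return the most specific node on the call's lineage found in `trusted`.

    Returns None when no node on the lineage is trusted.
    """
    for node in ancestors(call):
        if node in trusted:
            return node
    return None


# Hypothetical choice: trust only upper-level nodes of each tree.
TRUSTED = {"R1b-M269", "R1b-L23", "R1a-M417", "R1a-Z645"}

print(back_off("R1b-Z2103", TRUSTED))  # R1b-L23
print(back_off("R1b-U106", TRUSTED))   # R1b-L23
print(back_off("R1a-Z2123", TRUSTED))  # R1a-Z645
```

Read this way, a reported R1b-Z2103 call is taken conservatively as R1b-L23, which is the difference between the higher-confidence and lesser-confidence lists above.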
Also, it seems that the Corded Ware culture appears now irrelevant for Late Proto-Indo-European migrations. Observe:
Our results also shed light on the question of the origins of the subset of Indo-European languages spoken in India and Europe (45). It is striking that the great majority of Indo-European speakers today living in both Europe and South Asia harbor large fractions of ancestry related to Yamnaya Steppe pastoralists (corresponding genetically to the Steppe_EMBA cluster), suggesting that “Late Proto-Indo-European”—the language ancestral to all modern Indo-European languages—was the language of the Yamnaya (46). While ancient DNA studies have documented westward movements of peoples from the Steppe that plausibly spread this ancestry to Europe (5, 31), there has not been ancient DNA evidence of the chain of transmission to South Asia. Our documentation of a large-scale genetic pressure from Steppe_MLBA groups in the 2nd millennium BCE provides a prime candidate, a finding that is consistent with archaeological evidence of connections between material culture in the Kazakh middle-to-late Bronze Age Steppe and early Vedic culture in India (46).
NOTE. If they correct the haplogroups soon, I will update the information in this post. Unless there is a big surprise that merits a new one, of course.
EDIT (1 APR 2018): Multiple minor edits to the original post.
EDIT (2 APR 2018): While I and other simple-minded people were only looking to confirm our previous theories using Y-DNA haplogroups, and are content with wildly speculating over the consequences if some of those strange (probably wrong) ones were true, intelligent people are using their time for something useful, interpreting the results of the investigation as described in the paper, to offer a clearer picture of Indo-Iranian migrations for everyone:
Visit the beautiful interactive map with samples: with their location, PCA, ADMIXTURE and haplogroups (still with those originally given): https://public.tableau.com/profile/vagheesh#!/vizhome/TheGenomicFormationofSouthandCentralAsia/Fig_1
Featured image, from the article: “A Tale of Two Subcontinents. The prehistory of South Asia and Europe are parallel in both being impacted by two successive spreads, the first from the Near East after 7000 BCE bringing agriculturalists who mixed with local hunter-gatherers, and the second from the Steppe after 3000 BCE bringing people who spoke Indo-European languages and who mixed with those they encountered during their migratory movement. Mixtures of these mixed populations then produced the rough clines of ancestry present in both South Asia and in Europe today (albeit with more variable proportions of local hunter-gatherer-related ancestry in Europe than in India), which are (imperfectly) correlated to geography. The plot shows in contour lines the time of the expansion of Near Eastern agriculture. Human movements and mixtures, which also plausibly contributed to the spread of languages, are shown with arrows.”
User Camulogène Rix at Anthrogenica posted an interesting excerpt of Reich’s new book in a thread on ancient DNA studies in the news (emphasis mine):
Ancient DNA available from this time in Anatolia shows no evidence of steppe ancestry similar to that in the Yamnaya (although the evidence here is circumstantial as no ancient DNA from the Hittites themselves has yet been published). This suggests to me that the most likely location of the population that first spoke an Indo-European language was south of the Caucasus Mountains, perhaps in present-day Iran or Armenia, because ancient DNA from people who lived there matches what we would expect for a source population both for the Yamnaya and for ancient Anatolians. If this scenario is right the population sent one branch up into the steppe – mixing with steppe hunter-gatherers in a one-to-one ratio to become the Yamnaya as described earlier – and another to Anatolia to found the ancestors of people there who spoke languages such as Hittite.
The thread has since logically become a trolling hell, and it seems not to have been working properly for hours now.
This new idea based on ancestral components suffers thus from the same essential methodological problems, which equate it – yet again – to pure speculation:
It is a conclusion based on the genomic analysis of few individuals from distant regions and different periods, and – maybe more disturbingly – on the lack of steppe ancestry in the few samples at hand.
Wait, what? Steppe ancestry? So they are trying to derive potential genetic connections among specific prehistoric cultures with a poorly depicted genetic sketch, based on previous flawed concepts (instead of on anthropological disciplines), which seems a rather long stretch for any scientist, whether they are content with seeing themselves as barbaric scientific conquerors of academic disciplines or not. In other words, statistics is also science (in fact, the main one to assert anything in almost any scientific field), and you cannot overcome essential errors (design, sampling, hypothesis testing) merely by using a priori correct statistical methods. Results obtained this way constitute a statistical fallacy.
Even if the sampling and hypothesis testing were fine, to derive anthropological models from genomic investigation is completely wrong. Ancestral component ≠ population.
To include not only potential migrations, but also languages spoken by these potential migrants? It’s sad that we have a need to repeat it, but if ancestral component ≠ population, how could ancestral component = language?
The Proto-Indo-European-speaking community
This is what we know about the formation of a Proto-Indo-European community (i.e. a community speaking a reconstructible Proto-Indo-European language) in the Pontic-Caspian steppe, which is based on linguistic reconstruction and guesstimates, tracing archaeological cultures backwards from cultures known to have spoken ancient (proto-)languages, and helping both disciplines with anthropological models (for which ancient genomics is only helping select certain details) of migration or – rarely – cultural diffusion:
ca. 4500 BC. Khvalynsk probably speaking Middle Proto-Indo-European expands, most likely including Suvorovo-Novodanilovka chiefs into the North Pontic steppe, and probably expanding R1b-M269 lineages for the first time.
ca. 4000 BC. Separated communities develop, including North Pontic cultures probably gradually dominated by R1a-Z645 (potentially speaking Proto-Uralic); and Khvalynsk (and Repin) cultures probably dominated by R1b-L23 lineages, most likely developing a Late Proto-Indo-European already separated from Proto-Anatolian.
ca. 3500 BC. A Proto-Corded Ware population dominated by R1a-Z645 expands to the north, and slightly later an early Yamna community develops from Late Khvalynsk and Repin, expanding to the west of the Don River, and to the east into Afanasevo. This is most likely the period of reduction of variability and expansion of subclades of R1a-Z645 and R1b-L23 that we expect to see with more samples.
For those willingly lost in a myriad of new dreams boosted by the shallow comment contained in David Reich’s paragraph on CHG ancestry, even he does not doubt that the origin of Late Proto-Indo-European lies in Yamna, to the north of the Caucasus, based on Anthony’s (2007) account:
Inner genetic flow among steppe cultures in close contact.
Potentially stable seasonal exchange systems during the Eneolithic among certain steppe groups with settlements of the Northern Caucasus, which may have included bidirectional exogamy practices.
Just to be clear, an expansion of Proto-Anatolian to the south, through the Caucasus, cannot be discarded today. It will remain a possibility until Maykop and more Balkan Chalcolithic and Anatolian-speaking samples are published.
However, an original Early Proto-Indo-European community south of the Caucasus seems to me highly unlikely, based on anthropological data, which should drive any conclusion. From what I could read, here are the rather simplistic arguments used:
Gimbutas and Maykop: Maykop was thought to be (in Gimbutas’ times) a rather late archaeological culture, directly connected to a Transcaucasian Copper Age culture ca. 2400-2300 BC. It has been demonstrated in recent years that this culture is substantially older, and even then language guesstimates for a Late PIE / Proto-Anatolian would not fit a migration to the north. While our ignorance may certainly be used to derive far-fetched conclusions about potential migrations from and to it, using Gimbutas (or any archaeological theory until the 1990s) today does not make any sense. Still less if we think that she favoured a steppe homeland.
NOTE. It seems that the Reich Lab may have already access to Maykop samples, so this suggested Proto-Indo-European – Maykop connection may have some real foundation. Regardless, we already know that intense contacts happened, so there will be no surprise (unless Y-DNA shows some sort of direct continuity from one to the other).
Gamkrelidze & Ivanov: they argued for an Armenian homeland (and are thus at the origin of yet another autochthonous continuity theory), but they did so to support their glottalic theory, i.e. merely to support what they saw as favouring their linguistic model (with Armenian being the most archaic dialect). The glottalic theory is supported today – as far as I know – mainly by Kortlandt, Jagodziński, or (Nostraticist) Bomhard, but even they most likely would not need to argue for an Armenian homeland. In fact, their support of a Graeco-Aryan group (also supported by Gamkrelidze & Ivanov) would be against this, at least in archaeological terms.
Colin Renfrew and the Anatolian homeland: This conceptual umbrella of language spreading with farming everywhere has changed so much and so many times in the past 20 years, with so many glottochronological and archaeological estimates circulating, that you can support anything by now using them. It is mostly used today for abstract models of long-lasting language contacts, cultural diffusion, and constellation analogies. Anyway, he strives to keep up-to-date information to revise the model, that much is certain.
Glottochronology, phylogenetic trees, Swadesh list analysis, statistical estimates, psychics, pyramid power, and healing crystals: no, please, no.
In principle, unlike many other recent autochthonous continuity theories, I doubt there can be much racial-based opposition anywhere in the world to an origin of Proto-Indo-European in the Middle East, where the oldest civilizations appeared – apart, obviously, from modern Northeast and Northwest Caucasian, Kartvelian, or Semitic speakers, who may in turn have to revisit their autochthonous continuity theories radically…
In fact, Proto-Anatolian and Common Anatolian speakers need not share any ancestral component, PCA cluster, or any other statistical parameter related to steppe populations, not even the same Y-DNA haplogroups, given that approximately three thousand years might have passed between their split from an Indo-Hittite community and the first attested Anatolian-speaking communities…We must carefully follow their tracks from Anatolia ca. 1500 BC to the steppe ca. 4500 BC, otherwise we risk creating another mess like the Corded Ware one.
In my opinion, the substantial contribution of EHG ancestry and R1a-M417 lineages to the Pontic-Caspian steppe (probably ca. 6500 BC) from Central or East Eurasia is the most recent sizeable genomic event in the region, and thus the best candidate for the community that expanded a language ancestral to Proto-Indo-European – whether you call it Pre-Proto-Indo-European, Pre-Indo-Uralic, or Eurasiatic, depending on your preferences.
An early (and substantial) contribution of CHG ancestry in Khvalynsk relative to North Pontic cultures, if it is found with new samples, may actually be a further proof of the Caucasian substrate of Proto-Indo-European proposed by Kortlandt (or Bomhard) as contributing to the differentiation of Middle PIE from Uralic. Genomics could thus help support, again, traditional disciplines in accepting or rejecting academic controversial theories.
In the case of an Early PIE (or Indo-Uralic) homeland, genomic data is scarce. But all traditional anthropological disciplines point to the Pontic-Caspian steppe, so we should stick to it, regardless of the informal suggestion written by a renowned geneticist in one paragraph of a book conceived as an introduction to the field.
It seems we are not learning much from the hundreds of peer-reviewed, statistically (superficially, at least) sound genetic papers whose anthropological conclusions have been proven wrong by now. A lot of people should be spending their time learning about the complex, endless methods at hand in this kind of research – not just bioinformatics – instead of fruitlessly speculating about wild unsubstantiated proposals.
As a final note, I would like to remind some in the discussion, who seem to dismiss the identification of CHG with Proto-Indo-European by supporting a “R1a-R1b” community for PIE, of their previous commitment to ancestral components in identifying peoples and languages, and thus their support to Reich’s (and his group’s) fundamental premises.
You cannot have it both ways. At least David Reich is being consistent.
Nevertheless, since we have very few samples, I think we could still see a clear genetic contribution from Yamna to Corded Ware immigrants in the North Caspian region (from Abashevo, in turn a mix of Fatyanovo/Balanovo and Catacomb/Poltavka cultures) in terms of:
Ancestral components and PCA in new Sintashta-Petrovka, Andronovo, and/or later samples – similar to the ‘steppe’ drift seen in Potapovka relative to Sintashta samples, both formed by incoming Corded Ware migrants – and
R1b-L23 subclades, either appearing scattered during the Sintashta melting pot (of Abashevo/R1a-Z645 and East Yamna-Poltavka/R1b-Z2103 peoples), or resurging after this period, as we have seen in Pre-Balto-Slavic territory.
A lot of people seem to be looking like crazy since O&M 2018 for some sort of connection between Corded Ware and Yamna migrants in Eastern and Central Europe (whether in SNP calls of samples published, or among almost forgotten academic papers), either to support the ideas of the 2015 papers – for those who relied on their conclusions and built (even if only mentally) far-fetched migration models around them – or just because of some sort of absurd continuity theory involving modern R1a-Z645 subclades:
Some (the nostalgic ones?) keep looking for just one sample of R1a-Z645 in Yamna, for the same reason.
NOTE. The situation we have seen with the hundreds of samples from O&M 2018, and with the recent additional Eastern European samples, depicts an unexpected, absolutely clear-cut distinction in Y-DNA haplogroups between Corded Ware and Yamna/Bell Beaker: I really can’t see how the situation could be more obvious for everyone, so I doubt any further samples will make certain people change their minds. Their hope is, I guess, that just one sample may give some more oxygen to infinite pet theories, as we are still surprisingly seeing even with reactionary R1b autochthonous continuists in Western Europe…
However, looking into the most likely future for the field, what we should be expecting right now is continuity of Yamna ancestry and lineages in early Proto-Indo-Iranian territory. Since we only have a few samples from Sintashta-Petrovka, Potapovka, and Andronovo, I think there might be a sizeable number of R1b-Z2103 subclades in the territory inhabited by those who – no doubt – spread the language into Central Asia.
in the Balkans (e.g. in Vučedol or Makó-Kosihy-Čaka), including Greek and even in historical Armenian territory (potentially including Iran Iron Age sample F38 and Armenia LBA/IA RISE397, although the Mitanni may be a confounding factor here), showing the expansion of Palaeo-Balkan languages;
If we find now, as I expect, genetic continuity of east Yamna in Sintashta -> Andronovo (relative to other late Corded Ware peoples), probably including haplogroup R1b-Z2103 mixed with R1a-Z93 before its further reduction of subclades (e.g. to L657) and expansion during its subsequent spread southward…
Why exactly do we need Corded Ware to explain migrations of Late Indo-European speakers?
In other words: if we had the data we have today in 2015, would we have a need for Corded Ware to explain Indo-European migrations from the steppe? Are some people so blinded by their will to (appear to) be right in their past interpretations that they can’t just let go?
NOTE. On a side note, wouldn’t it be nice for this paper to publish some other R1b-L23 (x2103) sample – maybe even R1b-L51 – in Yamna, Andronovo, or Afanasevo territory, to end both autochthonous continuity theories (of North-Eastern and Western Europe) at the same time?
I really hope someone in David Reich’s team understands this matter, or else they will still identify Corded Ware as the (now probably ‘a’ instead) vector of expansion of Indo-European languages, and some of us will still have fun for another 2 or 3 years with such conclusions, until someone in the lab realizes that ancestry ≠ population ≠ ethnic identification ≠ language.
NOTE. It seems rather dull to read how people are discussing in the Twitterverse conventional constructs like ‘human race’ as found in Reich’s op-ed in The New York Times, as if such grandiose semantic discussions had any practical meaning, when basic anthropological questions actually relevant for Genomics, like the essential ancestral component ≠ people tenet, seem not to be of interest for anyone in the field…
Since our Indo-European demic diffusion model (and its consequences for our reconstruction of North-West Indo-European) and this blog are becoming more and more popular each day – judging by the constant growth in visits in the past 6 months or so – I guess the simplemindedness and predictability of certain geneticists is benefitting traditional anthropology directly, driving more and more amateur geneticists to look for sound academic models to answer the growing inconsistencies of genetic research.
NOTE. I am not saying the rejection of Corded Ware as spreading Indo-European is definitive. Maybe more samples within some years will depict a clear ancient expansion of Early or Middle Proto-Indo-Europeans from Khvalynsk to the forest-steppe and forest zone, and later with certain Corded Ware migrants into Central Europe, over whose territory a Late Indo-European dialect from Bell Beakers became the superstrate, as some have proposed in the past – e.g. to explain Krahe’s Old European hydronymy. I really doubt you could demonstrate such an old ethnolinguistic identification with a clear, unbroken archaeological trail, though, and we know now that this old hydronymy is probably of Late Indo-European nature (possibly even more recent).
What I am saying is: with the data we have now, it does not make any sense to keep the anthropological models invented by geneticists ex nihilo in 2015, and the hundred different alternative Late Indo-European migration models that are – born – with – each – new – paper.
These Yamna -> Corded Ware migration models haven’t made any sense to me since early 2016, but now after O&M 2017, and especially O&M 2018, I don’t think any geneticist with a little knowledge in Linguistics or Archaeology (if they are decent about their quest for truth in describing ancient European migrations) would buy them, if not for some sort of created ‘tradition’. So let’s ditch Corded Ware as Late Indo-European-speaking, let’s accept that late Corded Ware migrants should most likely be identified as early Uralic speakers, and then future data will tell if we are – again – wrong.
Please, don’t let Genomics become another pseudoscience based solely on Bioinformatics like glottochronology: let anthropologists (preferably mainstream archaeologists, but also the true Indo-Europeanists, linguists) help you interpret your raw data. Don’t deceive yourselves thinking that you have read enough about the Indo-European question, or that you know enough Indo-Europeanists (say what?) to derive your own conclusions.
Use the South Asia paper to begin expressly retracting the Corded Ware mess.
It so happens that the discussion has turned lately mainly to ancient Y-DNA haplogroups, because they help confirm previous mainstream anthropological models of cultural diffusion and migration. It is obviously not reasonable to judge prehistoric ethnolinguistic migrations from ca. 5,000 years ago based on historical nation-states and ethnic or religious concepts invented since the Middle Ages, coupled with “your” people’s main modern (or your own) paternal lineage.
EDIT (27 MAR 2018): Minor corrections and post made shorter.
I already expressed my predictions for 2018. One of the most interesting questions among them is the identification of the early Anatolian offshoot, and this is – I believe – where Genomics has the most to say in Indo-European migrations.
EDIT (10 MAR 2018): The Anatolian westward route within the steppe homeland model refers to the possibility that Proto-Anatolian spread south through the Caucasus, and then westward through Anatolia, as suggested e.g. originally by Marija Gimbutas for Maykop, as a link in the Caucasus.
We all know that this Khvalynsk -> Novodanilovka-Suvorovo -> Cernavoda -> Ezero -> Troy migration model proposed by Anthony shows no conspicuous chain in Archaeology, but obvious contacts (including Genomics) are seen among some of these neighbouring cultures in different times.
We know that remains of Suvorovo-Novodanilovka culture of chiefs emerged around 4400-4200 BC among ordinary local Sredni Stog settlements:
the Novodanilovka rich burials in the steppes, near the Dnieper,
and the Suvorovo group in the Danube delta, roughly coinciding with the massive abandonment of old tell settlements in the area.
One of the strongest cultural connections between Khvalynsk and Suvorovo-Novodanilovka chiefs is the similar polished stone mace-heads shaped like horse heads found in both cultures, a typical steppe prestige object going back to the east Pontic-Caspian steppe beginning ca. 5000-4800 BC.
Its finding in the Danube valley may have signalled the expansion of horse riding, which is compatible with the finding of ancient domesticated horses in the region. Horses were not important in Old European cultures, and it seems that they weren’t in Sredni Stog or Kvitjana either.
NOTE. Telegin, the main source of knowledge on Ukrainian prehistoric cultures for Anthony, was eventually convinced that Suvorovo-Novodanilovka was a separate culture. However, for Anthony (using Telegin’s first impressions), it may have been a wealthy elite among Sredni Stog peoples. Anthony considers Sredni Stog to have been also influenced by Khvalynsk, and thus potentially related to the Suvorovo-Novodanilovka chiefs.
Nevertheless, he obviously cannot link North Pontic Eneolithic cultures to Khvalynsk nor to horse riding (whilst he clearly assumes horse riding for Novodanilovka-Suvorovo chiefs), and he does not link North Pontic cultures to later expansions of Late Proto-Indo-Europeans from late Khvalynsk and Yamna, either.
The question here for Anthony (as with further Proto-Anatolian expansions described in his 2007 book), in my opinion, was to offer a plausible string of connections between Khvalynsk and Anatolia, and the simplest connection one can make among steppe cultures is a general, broad community between North Pontic and North Caspian cultures. That way, the knot tying Khvalynsk to the Danube seems stronger, whatever the origin of Suvorovo-Novodanilovka chiefs.
If, however, a direct genetic connection is made between Suvorovo-Novodanilovka chiefs and Khvalynsk, as in its association with R1b-M269 and R1b-L23 lineages, there will be little need to include Sredni Stog or any other intermediate culture in the equation.
We have already seen a movement of steppe ancestry into mainland Greece, and I would not be surprised if a parallel movement could be seen from Ezero to Troy (or a neighbouring North-West Anatolian region), so that the final migration of Common Anatolian had in fact been triggered by the massive steppe migrations during the Chalcolithic.
NOTE. Whereas we are certain to find R1b-L23 subclades in the direct Balkan migrations from Yamna, the link of steppe->Anatolia migrations may be a little trickier: even if we find out that the Suvorovo-Novodanilovka expansion was associated with an expansion and reduction of haplogroup variability (to haplogroups R1b-M269 and R1b-L23), we don’t know yet if the ca. 1,500 years that passed (and the cultural and population changes that occurred) between Proto-Anatolian and Common Anatolian migrations may have impacted the main haplogroup composition of both communities.
A probably unsurprising (because of its previously known admixture and PCA) but nevertheless disappointing finding came from the Y-SNP call of the haplogroup R1 found in Varna (R1b-V88, given first by Genetiker), leaving us with no new haplogroup data standing out for this period.
This sample’s lack of obvious genetic links with the steppe and early date didn’t deter me from believing it could show subclade M269, and thus a sign of incoming Suvorovo chiefs in the region. After all, R1b-P297 subclades seemed to have almost disappeared from the Balkans by that time, and we know that assessments based only on ancestral components and PCA clusters are not infallible – we are seeing that in many, many samples already.
NOTE. In fact, the first time I checked Mathieson et al. (2018) supplementary tables I thought that the ‘Ukraine_Eneolithic’ sample of R1b-L23 subclade was ‘it’: the first clear proof in ancient samples of incoming Suvorovo chiefs from Khvalynsk I was looking for… Until I realized its date, and that it was more likely a Late Yamna (or Catacomb) sample.
a) If the incoming Suvorovo-Novodanilovka chiefs (most likely originally from Khvalynsk) dominating over North Pontic and Danube regions show – as I bet – R1b-M269, and possibly also early R1b-L23* subclades,
b) Or else they still show mixed lineages, reflecting an older admixed population of the Pontic-Caspian steppe – as the early Khvalynsk and Ukraine Eneolithic samples we have now.
I am not a fan of continuity theories – that much should be clear for anyone reading this blog. However, most of such proposals’ supremacist (or rather fear-of-inferiority) overtones don’t mean they have to be wrong. It just means that most of them, most of the time, most likely are.
While reading Tommenable’s comments, I thought about a potential alternative model, where one could a priori accept an identification of North Pontic cultures as ‘Indo-Slavonic’, which seems to be the Eastern European R1a continuist trend right now.
NOTE. To accept this model, one should first (not a posteriori) accept an Indo-Slavonic linguistic group on theoretical grounds, of course, and take the steppe ancestral component (and not archaeological data) as the most meaningful aspect to consider for language expansion and exchange (which we know is not the most intelligent approach to cultural or language change).
Thinking about how Genomics could challenge what mainstream Linguistics and Archaeology accepts, the only situation I can think of (using simplistic phylogeography) regarding late Khvalynsk-Sredni Stog contacts (until ca. 3300 BC) is:
That the community of R1b-L51 lineages was in fact an isolated group, and not a western one – i.e. to the east within the Volga-Ural groups, or maybe to the south within the North Caucasian groups.
That the R1b-Z2103 community was a huge one dominating over much of the steppe, from the Dnieper area to the Volga-Ural region (where we know they were).
That R1a-M417 subclades (and especially subclade R1a-Z645) with steppe ancestry, as found in Corded Ware migrants, were only found in the North Pontic area (i.e. in Sredni Stog) during the fourth millennium (until at least 3300 BC, when Yamna substitutes it), and did not form other communities in the forest-steppe or Forest Zone (from where Corded Ware eventually expanded), as it is quite likely.
That both the R1b-Z2103 and R1a-Z645 communities shared obvious genetic connections (whatever they were) around the Dnieper, that could justify a common, shared language.
Only then, if a widespread Graeco-Aryan-speaking community happened to be spread from west to east in the Pontic-Caspian steppe, with close contacts with North Pontic cultures, and having an isolated Northern Late PIE community somewhere different than West Yamna, it could leave for me a reasonable doubt of a cultural connection (maybe “Indo-Slavonic” in nature) of the North Pontic steppe. But then we would probably be stuck – yet again – with some sort of cultural diffusion event, impossible to demonstrate.
Since it is known (in Linguistics, and also in Y-DNA lineages, due to the early expansion of Z2103 subclades) that Graeco-Aryan groups separated early, this model would not be impossible.
Also a priori in favour of that model would be the early expansion of a (Northern IE-speaking) Pre-Tocharian population to the east. On the other hand, from an archaeological point of view, the group reaching Afanasevo seems to have expanded from Repin, just like the community expanding Yamna to the west of the Dnieper.
I really doubt there can be any serious discussion though, apart from amateur geneticists with a personal interest in this, because:
Dialectal separation within a Late Proto-Indo-European language must have happened late, gradually, and in close contact, allowing for common innovations to spread through dialectal groups.
It does not make sense in terms of prehistoric cultures, since there is no direct connection or migration among steppe cultures but for the Novodanilovka and the Yamna expansions.
Indo-Slavonic is only supported by a handful of linguists, and not in the way or timing described in this model.
NOTE. You can read Kortlandt’s works in Academia.edu (also on his personal website) if you are really interested in knowing more about an Indo-Slavonic proposal, from an expert Balticist and Slavicist. However, if your intent is to demonstrate some ancient ethnic link of “your” people (whatever that means) to mythical Proto-Indo-Europeans, you would not need actual knowledge or sound theories to do that, so you can skip that part. Also, Kortlandt would probably support a later model of Indo-Slavonic expansion in the steppe, related to East Yamna, and later Sintashta, Srubna, etc…
If you think about it, if most modern Slavs were mainly of R1b-L23 lineages instead of R1a-Z645 (a replacement which, as is clear now, is the consequence of a simple resurgence of previous lineages in East-Central Europe, coupled with a later gradual replacement through founder effects, so no big migration history here), and Finnic speakers were mainly of R1a-Z645 lineages (whose replacement by N1c lineages seems also the consequence of quite late consecutive founder effects), I doubt we would be having this reticence to accept sound anthropological models.
NOTE. The change of narratives where certain languages must have accompanied R1a-Z645 and N1c lineages, but in alternative ways not previously described, is obviously unjustified, if linguistic and archaeological data tell a different story. As unjustified as it is to change Yamna for “Neolithic Steppe” as homeland of Late Indo-European, to fit it with the steppe ancestry concept…
As expected, the first Y-DNA haplogroup of a sample from the North Pontic region (apart from an indigenous European I2 subclade) during its domination by the Yamna culture is of haplogroup R1b-L23, and it is dated ca. 2890-2696 BC. More specifically, it is of Z2103 subclade, the main lineage found to date in Yamna samples. The site in question is Dereivka, “in the southern part of the middle Dnieper, at the boundary between the forest-steppe and the steppe zones”.
There is no data on this individual in the supplementary material, since Eneolithic Dereivka samples come from stored dental remains, but the radiocarbon date (if correct) is unequivocal: the Yamna cultural-historical community dominated over that region at that precise time. Why would the authors name it just “Ukraine_Eneolithic”? They surely took the assessment of archaeologists, and there is no data on it, so I agree this is the safest name to use for a serious paper. This would not be the first sample apparently too early for a certain culture (e.g. Catacomb in this case) which ends up being nevertheless classified as such. And it is also not impossible that it represents another close Ukraine Eneolithic culture, since ancestral cultural groups did not have borders…
NOTE. Why, on the other hand, was the sample from Zvejnieki – classified as of Latvia_LN – assumed to correspond to “Corded Ware” (like the recent samples from Plinkaigalis242 or Gyvakarai1), when we don’t have data on their cultures either? No conspiracy here, just taking assessments from different archaeologists in charge of these samples: those attributed to “Corded Ware” have been equally judged solely by radiocarbon date, but, combining the known archaeological signs of herding in the region arriving around this time with the old belief (similar to the “Iberia is the origin of Bell Beaker peoples” meme) that “only the Corded Ware culture signals the arrival of herding in the Baltic”. This assumption has been contested recently by Furholt, in an anthropological model that is now mainstream, upheld also by Anthony.
We already know that, out of three previous West Yamna samples, one shows Anatolian Neolithic ancestry, the so-called “Yamna outlier”. We also know that one sample from Yamna in Bulgaria also shows Anatolian Neolithic ancestry, with a distinct ‘southern’ drift, clustering closely to East Bell Beaker samples, as we can still see in Mathieson et al. (2018), see below. So, two “outliers” (relative to East Yamna samples) out of four samples… Now a new, fifth sample from Ukraine is another “outlier”, coinciding with (and possibly somewhat late to be a part of) the massive migration waves into Central Europe and the Balkans predicted long ago by academics and now confirmed with Genomics.
I think there are two good explanations right now for its ancestral components and position in PCA:
How many generations are needed for ancestral components and PCA clusters to change to that extent, in regions where only some patrilocal chiefs but indigenous populations remain, and the population probably admixed due to exogamy, back-migrations, and “resurge” events? Not many, obviously, as we see from the differences among the many Bell Beaker samples of R1b-L23 subclades from Olalde et al. (2018)…
b) That this sample shows the first genetic sign of the precise population that contributed to the formation of the Catacomb culture. Since it is a hotly debated topic where and how this culture actually formed to gradually replace the Yamna culture in the central region of the Pontic-Caspian steppe, this sample would be a good hint of how its population came to be.
This could then be not ‘just another West Yamna outlier’, but would actually show meaningful ‘resurge’ of Neolithic Ukraine ancestry in the Catacomb culture.
It could be meaningful to derive hypotheses, in the same way that the late Central European CWC sample from Esperstedt (of R1a-M417 subclade) shows recent exogamy directly from the (now more probably eastern part of the) steppe or steppe-forest, and thus implies great mobility among distant CWC groups. Although, given the BB samples with elevated steppe ancestry and close PCA cluster from Olalde et al. (2018), it could also just mean exogamy from a nearby region, around the Carpathian Basin where Yamna migrants settled…
How to know which is the case? We have to wait for more samples in the region. For the moment, the date seems too early for the known radiocarbon dating of most archaeological remains of the Catacomb Culture.
An important consequence of the addition of these “Yamna outliers” for the future of research on Indo-European migrations is that, especially if confirmed as just another West Yamna sample (with more, similar samples), early Palaeo-Balkan peoples migrating south of the Danube and later through Anatolia may need to be judged not only in terms of ancestral components or PCA (as in the paper on Minoans and Mycenaeans), but also and more decisively using phylogeography, especially with the earliest samples potentially connected with such migrations.
Even without express confirmation of its presence in the steppe, the alternative model of a Balkan origin seems unlikely, given the almost certain continuity of expanding Yamna clans as East Bell Beaker ones, in this clearly massive and relatively quick expansion that did not leave much time for founder effects. But, of course, it is not impossible to think about a previously hidden R1b-L151 community in the Carpathian Basin yet to be discovered, adopting North-West Indo-European (by some sort of founder effect) brought there by Yamna peoples of exclusively R1b-Z2103 lineages. As it is not impossible to think about a hidden and ‘magically’ isolated community of haplogroup R1a-M417 in Yamna waiting to be discovered… Just not very likely, either option.
As to why this sample or the other Bell Beaker samples “solve” the question of R1a-Z645 subclades (typical of Corded Ware migrants) not expanding with Yamna, it’s very simple: it doesn’t. What should have settled that question – in previous papers, at least since 2015 – is the absence of this subclade in elite chiefs of clans expanded from Khvalynsk, Yamna, or their only known offshoots Afanasevo and Bell Beaker. Now we only have still more proof, and no single ‘outlier’ in that respect.
Not that radiocarbon dates or the actual origin of this sample cannot be wrong, mind you; it just strikes me how twisted such biased reasoning may be, depending on the specific sample at hand… Denial, anger, and bargaining, including shameless circular reasoning – we know the drill: we have seen it a hundred times already, with all kinds of supremacist autochthonous continuists who still today manage to place an outdated mythical symbolism on expanding Proto-Indo-Europeans, or on regional ethnolinguistic continuity…