Interesting report by Bernard Sécher on Anthrogenica, about the Ph.D. thesis of Samantha Brunel from Institut Jacques Monod, Paris, Paléogénomique des dynamiques des populations humaines sur le territoire Français entre 7000 et 2000 (2018).
A summary from user Jool, who was there, translated into English by Sécher (slight changes to translation, and emphasis mine):
They have a good hundred samples from the North, Alsace and the Mediterranean coast, from the Mesolithic to the Iron Age.
There is no major surprise compared to the rest of Europe. On the PCA plot, the Mesolithic are with the WHG, the early Neolithics with the first farmers close to the Anatolians. Then there is a small resurgence of hunter-gatherers that moves the Middle Neolithics a little closer to the WHGs.
From the Bronze Age, they have 5 samples with autosomal DNA, all in Bell Beaker archaeological context, which are widely spread on the PCA: one sample very high, close to the Yamnaya and a little above the Corded Ware; two samples right among the Central European Bell Beakers; one fairly low, just above the Neolithic cluster; and a last one fully within that cluster. The most salient point was that the Y chromosomes of their 12 Bronze Age samples (all Bell Beakers) are all R1b, whereas there was no R1b in the Neolithic samples.
Finally they have samples of the Iron Age that cluster on the PCA plot close to the Bronze Age samples. They could not determine if there is continuity with the Bronze Age, or a partial replacement by a genetically close population.
The sample with likely high “steppe ancestry”, clustering closely to Yamna (more than Corded Ware samples) is then probably an early East Bell Beaker individual, probably from Alsace, or maybe close to the Rhine Delta in the north, rather than from the south, since we already have samples from southern France from Olalde et al. (2018) with high Neolithic ancestry, and samples from the Rhine with elevated steppe ancestry, but not that much.
This specific sample, if confirmed as one of those reported as R1b (then likely R1b-L151), as it seems from the wording of the summary, is key because it would finally link Yamna to East Bell Beaker through Yamna Hungary, all of them very “Yamnaya-like”, and therefore R1b-L151 (hence also R1b-L51) directly to the steppe, and not only to the Carpathian Basin (that is, until we have samples from late Repin or West Yamna…)
NOTE. The only alternative explanation for such elevated steppe ancestry would be an admixture between a ‘less Yamnaya-like’ East Bell Beaker + a Central European Corded Ware sample like the Esperstedt outlier + drift, but I don’t think that alternative is the best explanation of its position in the PCA closer to Yamna in any of the infinite parallel universes, so… Also, the sample from Esperstedt is clearly a late outlier likely influenced by Yamna vanguard settlers from Hungary, not the other way round…
Unexpectedly, then, fully Yamnaya-like individuals are found not only in Yamna Hungary ca. 3000-2500 BC, but also among expanding East Bell Beakers later than 2500 BC. This leaves us with unexplained, not-at-all-Yamnaya-like early Corded Ware samples from ca. 2900 BC on. An explanation based on admixture with locals seems unlikely, seeing how Corded Ware peoples continue a north Pontic cluster, being thus different from Yamna and their ancestors since the Neolithic; and how they remained that way for a long time, up to Sintashta, Srubna, Andronovo, and even later samples… A different, non-Indo-European community it is, then.
Let’s wait and see the Ph.D. thesis, when it’s published, and keep observing in the meantime the absurd reactions of denial, anger, bargaining, and depression (stages of grief) among BBC/R1b=Vasconic and CWC/R1a=Indo-European fans, as if they had lost something (?). Maybe one of these reactions is actually the key to changing reality and going back to the 2000s, who knows…
Featured image: initial expansion of the East Bell Beaker Group, by Volker Heyd (2013).
Interesting excerpts (emphasis mine, some links to images and tables deleted for clarity):
Late Bronze Age (LBA) Srubnaya-Alakulskaya individuals carried mtDNA haplogroups associated with Europeans or West Eurasians (17) including H, J1, K1, T2, U2, U4, and U5 (table S3). In contrast, the Iron Age nomads (Cimmerians, Scythians, and Sarmatians) additionally carried mtDNA haplogroups associated with Central Asia and the Far East (A, C, D, and M). The absence of East Asian mitochondrial lineages in the more eastern and older Srubnaya-Alakulskaya population suggests that the appearance of East Asian haplogroups in the steppe populations might be associated with the Iron Age nomads, starting with the Cimmerians.
#UPDATE (5 OCT 2018): Some Y-SNP calls have been published in a Molgen thread, with:
Srubna samples have possibly two R1a-Z280, three R1a-Z93.
Cimmerians may not have R1b: cim357 is reported as R1a.
Some Scythians have low coverage to the point where it is difficult to assign even a reliable haplogroup (they report hg I2 for scy301, or E for scy197, probably based on some shared SNPs?), but those which can be reliably assigned seem R1b-Z2103 [hence probably the use of question marks and asterisks in the table, and the assumption of the paper that all Scythians are R1b-L23]:
The most recent subclade is found in scy305: R1b-Z2103>Z2106 (Z2106+, Y12538/Z8131+)
scy304: R1b-Z2103 (M12149/Y4371/Z8128+).
scy009: R1b-P312>U152>L2 (P312+, U152?, L2+)?
Sarmatians are apparently all R1a-Z93 (including tem002 and tem003).
Srubnaya-Alakulskaya individuals exhibited genetic affinity to northern and northeastern present-day Europeans, and these results were also consistent with outgroup f3 statistics.
The Cimmerian individuals, representing the time period of transition from Bronze to Iron Age, were not homogeneous regarding their genetic similarities to present-day populations according to the PCA. F3 statistics confirmed the heterogeneity of these individuals in comparison with present-day populations
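For readers unfamiliar with the statistic, an outgroup f3 of the form f3(O; A, B) measures the drift shared by populations A and B relative to an outgroup O, as the average over SNPs of (o - a)(o - b), where o, a and b are allele frequencies. Below is a minimal sketch of that idea in Python, assuming invented allele frequencies and leaving out the finite-sample bias correction that real software applies. The function name and all numbers are hypothetical and do not come from the paper.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Naive outgroup f3(O; A, B): mean of (o - a) * (o - b) over SNPs.

    Inputs are per-SNP allele frequencies in [0, 1], one list per population.
    No finite-sample bias correction is applied (a deliberate simplification).
    """
    if not outgroup or not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("need equal-length, non-empty frequency lists")
    total = sum((o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b))
    return total / len(outgroup)


# Hypothetical allele frequencies at four SNPs (illustration only).
outgroup = [0.0, 1.0, 0.0, 1.0]
ancient = [0.6, 0.3, 0.8, 0.5]
modern_close = [0.5, 0.4, 0.7, 0.5]  # drifted away from the outgroup together with `ancient`
modern_far = [0.1, 0.9, 0.2, 0.9]    # stayed close to the outgroup

f3_close = outgroup_f3(outgroup, ancient, modern_close)
f3_far = outgroup_f3(outgroup, ancient, modern_far)
```

A higher value means more shared drift, so ranking present-day populations by f3 against an ancient individual gives the kind of affinity comparison the excerpt refers to; in this toy data `f3_close` is larger than `f3_far`.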
The Scythians reported in this study, from the core Scythian territory in the North Pontic steppe, showed high intragroup diversity. In the PCA, they are positioned as four visually distinct groups compared to the gradient of present-day populations:
A group of three individuals (scy009, scy010, and scy303) showed genetic affinity to north European populations (…).
A group of four individuals (scy192, scy197, scy300, and scy305) showed genetic similarities to southern European populations (…).
A group of three individuals (scy006, scy011, and scy193) located between the genetic variation of Mordovians and populations of the North Caucasus (…). In addition, one Srubnaya-Alakulskaya individual (kzb004), the most recent Cimmerian (cim357), and all Sarmatians fell within this cluster. In contrast to the Scythians, and despite being from opposite ends of the Pontic-Caspian steppe, the five Sarmatians grouped close together in this cluster.
A group of three Scythians (scy301, scy304, and scy311) formed a discrete group between the SC and SE and had genetic affinities to present-day Bulgarian, Greek, Croatian, and Turkish populations (…).
Finally, one individual from a Scythian cultural context (scy332) is positioned outside of the modern West Eurasian genetic variation (Fig. 1C) but shared genetic drift with East Asian populations.
The presence of an SA component (as well as finding of metals imported from Tien Shan Mountains in Muradym 8) could therefore reflect a connection to the complex networks of the nomadic transmigration patterns characteristic of seasonal steppe population movements. These movements, although dictated by the needs of the nomads and their animals, shaped the economic and social networks linking the outskirts of the steppe and facilitated the flow of goods between settled, semi-nomadic, and nomadic peoples. In contrast, all Cimmerians carried the Siberian genetic component. Both the PCA and f4 statistics supported their closer affinities to the Bronze Age western Siberian populations (including Karasuk) than to Srubnaya. It is noteworthy that the oldest of the Cimmerians studied here (cim357) carried almost equal proportions of Asian and West Eurasian components, resembling the Pazyryks, Aldy-Bel, and Iron Age individuals from Russia and Kazakhstan (12). The second oldest Cimmerian (cim358) was also the only one with both uniparental markers pointing toward East Asia. The Q1* Y chromosome sublineage of Q-M242 is widespread among Asians and Native Americans and is thought to have originated in the Altai Mountains (24)
In contrast to the eastern steppe Scythians (Pazyryks and Aldy-Bel) that were closely related to Yamnaya, the western North Pontic Scythians were instead more closely related to individuals from Afanasievo and Andronovo groups. Some of the Scythians of the western Pontic-Caspian steppe lacked the SA and the East Eurasian components altogether and instead were more similar to a Montenegro Iron Age individual (3), possibly indicating assimilation of the earlier local groups by the Scythians.
Toward the end of the Scythian period (fourth century BCE), a possible direct influx from the southern Ural steppe zone took place, as indicated by scy332. However, it is possible that this individual might have originated in a different nomadic group despite being found in a Scythian cultural context.
I am surprised to find this new R1b-L23-based bottleneck in Eastern Iranian expansions so late, but admittedly – based on data from later times in the Pontic-Caspian steppe near the Caucasus – it was always a possibility. The fact that pockets of R1b-L23 lineages remained somehow ‘hidden’ in early Indo-Iranian communities was clear already since Narasimhan et al. (2018), as I predicted could happen, and is compatible with the limited archaeological data on Sintashta-Potapovka populations outside fortified settlements. I already said then that Corded Ware was out of Indo-European migrations; this further supports it.
Even with all these data coming just from a north-west Pontic steppe region (west of the Dnieper), these ‘Cimmerians’ – or rather the ‘Proto-Scythian’ nomadic cultures appearing before ca. 800 BC in the Pontic-Caspian steppes – are shown to be probably formed by diverse peoples from Central Asia who brought about the first waves of Siberian ancestry (and Asian lineages) seen in the western steppes. You can read about a Cimmerian-related culture, Ananino, key for the evolution of Finno-Permic peoples.
Also interesting about the Y-DNA bottleneck seen here is the rejection of the supposed continuous western expansions of R1a-Z645 subclades with steppe tribes since the Bronze Age, and thus a clearer link of the Hungarian Árpád dynasty (of R1a-Z2123 lineage) to either the early Srubna-related expansions or – much more likely – to the actual expansions of Hungarian tribes near the Urals in historic times.
NOTE. I will add the information of this paper to the upcoming post on Ugric and Samoyedic expansions, and the late introduction of Siberian ancestry to these peoples.
A few interesting lessons to be learned:
Remember the fantasy story about that supposed steppe nomadic pastoralist society sharing different Y-DNA lineages? You know, that Yamna culture expanding with R1b from Khvalynsk-Repin into the whole Pontic-Caspian steppes and beyond, developing R1b-dominated Afanasevo, Bell Beaker, and Poltavka, but suddenly appearing (in the middle of those expansions through the steppes) as a different culture, Corded Ware, to the north (in the east-central European forest zone) and dominated by R1a? Well, it hasn’t happened with any other steppe migration, so…maybe Proto-Indo-Europeans were that kind of especially friendly language-teaching neighbours?
Remember that ‘pure-R1a’ Indo-Slavonic society emerged from Sintashta ca. 2100 BC? (or even Graeco-Aryan??) Hmmmm… Another good fantasy story that didn’t happen; just like a central-east European Bronze Age Balto-Slavic R1a continuity didn’t happen, either. So, given that cultures from around Estonia are those showing the closest thing to R1a continuity in Europe until the Iron Age, I assume we have to get ready for the Gulf of Finland Balto-Slavic soon.
Remember that ‘pure-R1a’ expansion of Indo-Europeans based on the Tarim Basin samples? This paper means ipso facto an end to the Tarim Basin – Tocharian artificial controversy. The Pre-Tocharian expansion is represented by Afanasevo, and whether or not (Andronovo-related) groups of R1a-Z645 lineages replaced part or eventually all of its population before, during, or after the Tocharian expansion into the Tarim Basin, this does not change the origin of the language split and expansion from Yamna to Central Asia; just like this paper does not change the fact that these steppe groups were Proto-Iranian (Srubna) and Eastern Iranian (Scythian) speakers, regardless of their dominant haplogroup.
Do you smell that fresher air? It’s the Central and East European post-Communist populist and ethnonationalist bullshit (viz. pure blond R1a-based Pan-Nordicism / pro-Russian Pan-Slavism / Pan-Eurasianism, as well as Pan-Turanism and similar crap from the 19th century) going down the toilet with each new paper.
#EDIT (5 OCT 2018): It seems I was too quick to rant about the consequences of the paper without taking into account the complexity of the data presented. Not the first time this impulsivity happens, I guess it depends on my mood and on the time I have to write a post on the specific work day…
While the data on Srubna, Cimmerians, and Sarmatians shows clearer Y-DNA bottlenecks (of R1a-Z645 subclades) with the new data, the Scythian samples remain controversial, because of the many doubts about the haplogroups (although the most certain cases are R1b-Z2103), their actual date, and cultural attribution. However, I doubt they belong to other peoples, given the expansionist trends of steppe nomads before, during, and after Scythians (as shown in statistical analyses), so most likely they are Scythian or ‘Para-Scythian’ nomadic groups that probably came from the east, whether or not they incorporated Balkan populations. This is further supported by the remaining R1b-P312 and R1b-Z2103 populations in and around the modern Eurasian steppe region.
You can find an interesting and detailed take on the data published (in Russian) at Vol-Vlad’s LiveJournal (you can read an automatic translation from Google). I think that post is maybe too detailed in debunking all information associated with the supposed Scythians, to the point where just a single sample seems to be an actual Scythian (?!), but it is nevertheless interesting to read about the potential pitfalls of the study.
Tracing the origin and expansion of the Turkic and Hunnic confederations, by Flegontov et al.
Turkic-speaking populations, now spread over a vast area in Asia, are highly heterogeneous genetically. The first confederation unequivocally attributed to them was established by the Göktürks in the 6th c. CE. Notwithstanding written resources from neighboring sedentary societies such as Chinese, Persian, Indian and Eastern Roman, earlier history of the Turkic speakers remains debatable, including their potential connections to the Xiongnu and Huns, which dominated the Eurasian steppe in the first half of the 1st millennium CE. To answer these questions, we co-analyzed newly generated human genome-wide data from Central Asia (the 1240K panel), spanning the period from ca. 3000 to 500 YBP, and the data published by de Barros Damgaard et al. (137 ancient human genomes from across the Eurasian steppes, Nature, 2018). Firstly, we generated a PCA projection to understand genetic affinities of ancient individuals with respect to present-day Tungusic, Mongolic, Turkic, Uralic, and Yeniseian-speaking groups. Secondly, we modeled hundreds of present-day and few ancient Turkic individuals using the qpAdm tool, testing various modern/ancient Siberian and ancient West Eurasian proxies for ancestry sources.
A majority of Turkic speakers in Central Asia, Siberia and further to the west share the same ancestry profile, being a mixture of Tungusic or Mongolic speakers and genetically West Eurasian populations of Central Asia in the early 1st millennium CE. The latter are themselves modelled as a mixture of Iron Age nomads (western Scythians or Sarmatians) and ancient Caucasians or Iranian farmers. For some Turkic groups in the Urals and the Altai regions and in the Volga basin, a different admixture model fits the data: the same West Eurasian source + Uralic- or Yeniseian-speaking Siberians. Thus, we have revealed an admixture cline between Scythians and the Iranian farmer genetic cluster, and two further clines connecting the former cline to distinct ancestry sources in Siberia. Interestingly, few Wusun-period individuals harbor substantial Uralic/Yeniseian-related Siberian ancestry, in contrast to preceding Scythians and later Turkic groups characterized by the Tungusic/Mongolic-related ancestry. It remains to be elucidated whether this genetic influx reflects contacts with the Xiongnu confederacy. We are currently assembling a collection of samples across the Eurasian steppe for a detailed genetic investigation of the Hunnic confederacies.
Flegontov: Present day Turkic speakers fall into two clusters of admixture patterns (Uralic/Yeniseian and Tungusic/Mongolic) based on genomic data with ancient Turks belonging almost exclusively to the first cluster. #ISBA8
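To make the idea of modelling a population as a mixture of two source proxies more concrete, here is a minimal sketch in Python. It is not qpAdm, which fits ancestry proportions from f4 statistics and also tests whether the model is rejected; it only solves the much simpler least-squares problem target ≈ alpha × source1 + (1 − alpha) × source2 on allele frequencies. The function name, the population labels and all frequencies are invented for illustration.

```python
def two_source_proportion(target, source1, source2):
    """Least-squares alpha with target ~ alpha * source1 + (1 - alpha) * source2.

    All inputs are per-SNP allele frequencies. The estimate is clipped to [0, 1].
    This is a toy stand-in for the concept, not the qpAdm algorithm.
    """
    num = sum((t - s2) * (s1 - s2) for t, s1, s2 in zip(target, source1, source2))
    den = sum((s1 - s2) ** 2 for s1, s2 in zip(source1, source2))
    if den == 0:
        raise ValueError("sources are identical; the proportion is undefined")
    return min(1.0, max(0.0, num / den))


# Hypothetical allele frequencies at four SNPs (illustration only).
west_eurasian_proxy = [0.9, 0.2, 0.7, 0.1]
siberian_proxy = [0.1, 0.8, 0.3, 0.6]
# A target built as exactly 70% of the first proxy and 30% of the second.
target = [0.66, 0.38, 0.58, 0.25]

alpha = two_source_proportion(target, west_eurasian_proxy, siberian_proxy)
```

With real data the interesting part is precisely what this sketch omits: whether a given pair of proxies fits at all, which is why the abstract speaks of testing various modern and ancient sources.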
New interesting information on the gradual arrival of the “Uralic-Yeniseian” (Siberian) ancestry in eastern Europe with Iranian and Turkic-speaking peoples. We already knew that Siberian ancestry shows no original relationship with Uralic-speaking peoples, so finding more groups who expanded this ancestry westwards across North Eurasia should be no surprise for anyone at this point.
Central Asia and Indo-Iranian
The session The Genomic Formation of South and Central Asia, by David Reich, on the recent paper by Narasimhan et al. (2018).
Ancient DNA and the peopling of the British Isles – pattern and process of the Neolithic transition, by Brace et al.
Over recent years, DNA projects on ancient humans have flourished and large genomic-scale datasets have been generated from across the globe. Here, the focus will be on the British Isles and applying aDNA to address the relative roles of migration, admixture and acculturation, with a specific focus on the transition from a Mesolithic hunter-gatherer society to the Neolithic and farming. Neolithic cultures first appear in Britain ca. 6000 years ago (kBP), a millennium after they appear in adjacent areas of northwestern continental Europe. However, in Britain, at the margins of the expansion the pattern and process of the British Neolithic transition remains unclear. To examine this we present genome-wide data from British Mesolithic and Neolithic individuals spanning the Neolithic transition. These data indicate population continuity through the British Mesolithic but discontinuity after the Neolithic transition, c.6000 BP. These results provide overwhelming support for agriculture being introduced to Britain primarily by incoming continental farmers, with surprisingly little evidence for local admixture. We find genetic affinity between British and Iberian Neolithic populations indicating that British Neolithic people derived much of their ancestry from Anatolian farmers who originally followed the Mediterranean route of dispersal and likely entered Britain from northwestern mainland Europe.
MN Atlantic / Megalithic cultures
Genomics of Middle Neolithic farmers at the fringe of Europe, by Sánchez Quinto et al.
Agriculture emerged in the Fertile Crescent around 11,000 years before present (BP) and then spread, reaching central Europe some 7,500 years ago (ya.) and eventually Scandinavia by 6,000 ya. Recent paleogenomic studies have shown that the spread of agriculture from the Fertile Crescent into Europe was due mainly to a demic process. Such event reshaped the genetic makeup of European populations since incoming farmers displaced and admixed with local hunter-gatherers. The Middle Neolithic period in Europe is characterized by such interaction, and this is a time where a resurgence of hunter-gatherer ancestry has been documented. While most research has been focused on the genetic origin and admixture dynamics with hunter-gatherers of farmers from Central Europe, the Iberian Peninsula, and Anatolia, data from farmers at the North-Western edges of Europe remains scarce. Here, we investigate genetic data from the Middle Neolithic from Ireland, Scotland, and Scandinavia and compare it to genomic data from hunter-gatherers, Early and Middle Neolithic farmers across Europe. We note affinities between the British Isles and Iberia, confirming previous reports. However, we add on to this subject by suggesting a regional origin for the Iberian farmers that putatively migrated to the British Isles. Moreover, we note some indications of particular interactions between Middle Neolithic Farmers of the British Isles and Scandinavia. Finally, our data together with that of previous publications allow us to achieve a better understanding of the interactions between farmers and hunter-gatherers at the northwestern fringe of Europe.
Central European Bronze Age
Ancient genomes from the Lech Valley, Bavaria, suggest socially stratified households in the European Bronze Age, by Mittnik et al.
Archaeogenetic research has so far focused on supra-regional and long-term genetic developments in Central Europe, especially during the third millennium BC. However, detailed high-resolution studies of population dynamics in a microregional context can provide valuable insights into the social structure of prehistoric societies and the modes of cultural transition.
Here, we present the genomic analysis of 102 individuals from the Lech valley in southern Bavaria, Germany, which offers ideal conditions for such a study. Several burial sites containing rich archaeological material were directly dated to the second half of the 3rd and first half of the 2nd millennium BCE and were associated with the Final Neolithic Bell Beaker Complex and the Early and Middle Bronze Age. Strontium isotope data show that the inhabitants followed a strictly patrilocal residential system. We demonstrate the impact of the population movement that originated in the Pontic-Caspian steppe in the 3rd millennium BCE and subsequent local developments. Utilising relatedness inference methods developed for low-coverage modern DNA we reconstruct farmstead related pedigrees and find a strong association between relatedness and grave goods suggesting that social status is passed down within families. The co-presence of biologically related and unrelated individuals in every farmstead implies a socially stratified complex household in the Central European Bronze Age.
Alissa Mittnik of @MPI_SHH with a talk that heralds a new era of studying archaeological sites: using high resolution ancient DNA to reconstruct relatedness patterns—her results reveal patrilocality in Late Neolithic and Bronze Age Central Europe #isba8
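As a rough illustration of how relatedness can be read from low-coverage data, here is a minimal sketch in Python of a pairwise mismatch-rate heuristic in the spirit of the READ tool: take one sampled allele per site and individual, count how often two individuals differ, and normalise by the median over all pairs. This is a swapped-in, simpler technique and not the method used in the talk (the abstract only says “relatedness inference methods developed for low-coverage modern DNA”). All names and calls below are made up.

```python
from statistics import median


def mismatch_rate(calls_a, calls_b):
    """Fraction of SNPs covered in both individuals where the sampled alleles differ.

    Each call is 0 or 1 (one randomly sampled allele per site) or None if missing.
    """
    shared = [(a, b) for a, b in zip(calls_a, calls_b)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no overlapping SNPs")
    return sum(a != b for a, b in shared) / len(shared)


def normalized_rates(individuals):
    """Pairwise mismatch rates divided by the median rate over all pairs."""
    names = sorted(individuals)
    raw = {}
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            raw[(x, y)] = mismatch_rate(individuals[x], individuals[y])
    baseline = median(raw.values())  # assumes most pairs are unrelated
    return {pair: rate / baseline for pair, rate in raw.items()}


# Hypothetical pseudo-haploid calls at eight SNPs (illustration only).
individuals = {
    "ind_A": [0, 0, 1, 1, 0, 1, 0, 1],
    "ind_B": [0, 0, 1, 1, 0, 0, 1, 0],
    "ind_C": [1, 0, 0, 1, 1, 1, 0, 0],
    "ind_D": [0, 1, 1, 0, 1, 0, 0, 1],
}
normalized = normalized_rates(individuals)
closest_pair = min(normalized, key=normalized.get)
```

In that approach unrelated pairs sit around 1.0 and first-degree relatives around 0.75, so pairs well below 1.0 are candidates for pedigree links. With eight invented sites the actual values mean nothing; the sketch only shows that the pair ind_A and ind_B comes out lowest.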
Gene geography of the Russian Far East populations – faces, genome-wide profiles, and Y-chromosomes, by Balanovsky et al.
The Russian Far East is not only a remote area of Eurasia but also a link in the chain of Pacific coast regions, spanning from East Asia to the Americas, and many prehistoric migrations are known along this chain. The Russian Far East is populated by numerous indigenous groups, speaking Tungusic, Turkic, Chukotko-Kamchatkan, Eskimo-Aleut, and isolated languages. This linguistic and geographic variation raises questions about the patterns of genetic variation in the region, which has been significantly undersampled and has received little attention in the genetic literature to date. To fill in this gap we sampled Aleuts, Evenks, Evens, Itelmens, Kamchadals, Koryaks, Nanais, Negidals, Nivkhs, Orochi, Udegeis, Ulchi, and Yakuts. We also collected the demographic information of local populations, took physical anthropological photos, and measured the skin color. The photos resulted in the “synthetic portraits” of many studied groups, visualizing the main features of their faces.
Finland AD 5th-18th c.
Sadly, no information will be shared on the session A 1400-year transect of ancient DNA reveals recent genetic changes in the Finnish population, by Salmela et al. We will have to stick to the abstract:
Objectives: Our objective was to use aDNA to study the population history of Finland. For this aim, we sampled and sequenced 35 individuals from ten archaeological sites across southern Finland, representing a time transect from 5th to 18th century.
Methods: Following genomic DNA extraction and preparation of indexed libraries, the samples were enriched for 1.2 million genome-wide SNPs using in-solution capture and sequenced on an Illumina HiSeq 4000 instrument. The sequence data were then compared to other ancient populations as well as modern Finns, their geographical neighbors and worldwide populations. Authenticity testing of the data as well as population history inference were based on standard computational methods for aDNA, such as principal component analysis and F statistics.
Results: Despite the relatively limited temporal depth of our sample set, we are able to see major genetic changes in the area, from the earliest sampled individuals – who closely resemble the present-day Saami population residing markedly further north – to the more recent ancient individuals who show increased affinity to the neighboring Circum-Baltic populations. Furthermore, the transition to the present-day population seems to involve yet another perturbation of the gene pool.
So, most likely then, in my opinion – although possibly Y-DNA will not be reported – Finns were in the Classical Antiquity period mostly R1a with secondary N1c in the Circum-Baltic region (similar to modern Estonians, as I wrote recently), while Saami were probably mostly a mix of R1a-Z282 and I1 in southern Finland. That’s what the first transition after the 5th c. probably reflects, the spread of Finns (with mainly N1c lineages) to the north, while the more recent transition shows probably the introduction of North Germanic ancestry (and thus also R1b-U106, R1a-Z284, and I1 lineages) in the west.
Dairying in ancient Mongolia
The History of Dairying in ancient Mongolia, by Wilkin et al.
The use of mass spectrometry based proteomics presents a novel method for investigating human dietary intake and subsistence strategies from archaeological materials. Studies of ancient proteins extracted from dental calculus, as well as other archaeological material, have robustly identified both animal and plant-based dietary components. Here we present a recent case study using shotgun proteomics to explore the range and diversity of dairying in the ancient eastern Eurasian steppe. Contemporary and prehistoric Mongolian populations are highly mobile and the ephemerality of temporarily occupied sites, combined with the severe wind deflation common across the steppes, means detecting evidence of subsistence can be challenging. To examine the time depth and geographic range of dairy use in Mongolia, proteins were extracted from ancient dental calculus from 32 individuals spanning burial sites across the country between the Neolithic and Mongol Empire. Our results provide direct evidence of early ruminant milk consumption across multiple time periods, as well as a dramatic increase in the consumption of horse milk in the late Bronze Age. These data provide evidence that dairy foods from multiple species were a key part of subsistence strategies in prehistoric Mongolia and add to our understanding of the importance of early pastoralism across the steppe.
Hypothesis: dairy pastoralism extends into Late #BronzeAge – calculus samples from 31 individuals 3000 BC – AD 1400 – shotgun proteomics; liquid chromatography–mass spectrometry – BLG peptides differentiate ruminant and equine milk, caprine-specific markers
The confirmation of the date 3000-2700 BC for dairying in the eastern steppe further supports what was already known thanks to archaeological remains, that the pastoralist subsistence economy was brought for the first time to the Altai region by expanding late Khvalynsk/Repin – Early Yamna pastoralists that gave rise to the Afanasevo culture.
Neolithic transition in Northeast Asia
Genomic insight into the Neolithic transition peopling of Northeast Asia, by C. Ning
East Asia, representing a large geographic region where around one fifth of the world’s population lives, has been an interesting place for population genetic studies. In contrast to Western Eurasia, East Asia has so far received little attention, even though agriculture here evolved differently from elsewhere around the globe. To date, only very limited genomic studies from East Asia have been published, and the genetic history of East Asia is still largely unknown. In this study, we shotgun sequenced six hunter-gatherer individuals from the Houtaomuga site in Jilin, Northeast China, dated from 12000 to 2300 BP, and three farming individuals from the Banlashan site in Liaoning, Northeast China, dated around 5300 BP. We find a high level of genetic continuity within the northeast Asian Amur River Basin as far back as 12000 BP, a region where populations speak Tungusic languages. We also find that, compared with Houtaomuga hunter-gatherers, the Neolithic farming population harbors a larger proportion of ancestry from Houtaomuga-related hunter-gatherers as well as genetic ancestry from central or perhaps southern China. Our finding further suggests that farming technology was probably introduced into Northeast Asia through demic diffusion.
“Genomic insight into the peopling of Northeast China” – Chao Ning @MPI_SHH #ISBA8. Amazing genomic time transect 12000–2300BP from Houtaomuga, Jilin, PRC with #aDNA evidence for genetic continuity of #Tungusic-like groups in #Amur region even deeper than Chertovy Vorota (5700BC)
A detail of the reported haplogroups of the Houtaomuga site:
Y-DNA in Northeast Asia shows thus haplogroup N1b1 ~5000 BC, probably representative of the Baikal region, with a change to C2b-448del lineages before the Xiongnu period, which were later expanded by Mongols.
Experienced researchers, particularly those interested in population structure and historical inference, typically present STRUCTURE results alongside other methods that make different modelling assumptions. These include TreeMix, ADMIXTUREGRAPH, fineSTRUCTURE, GLOBETROTTER, f3 and D statistics, amongst many others. These models can be used both to probe whether assumptions of the model are likely to hold and to validate specific features of the results. Each also comes with its own pitfalls and difficulties of interpretation. It is not obvious that any single approach represents a direct replacement as a data summary tool. Here we build more directly on the results of STRUCTURE/ADMIXTURE by developing a new approach, badMIXTURE, to examine which features of the data are poorly fit by the model. Rather than intending to replace more specific or sophisticated analyses, we hope to encourage their use by making the limitations of the initial analysis clearer.
The default interpretation protocol
Most researchers are cautious but literal in their interpretation of STRUCTURE and ADMIXTURE results, as caricatured in Fig. 1, as it is difficult to interpret the results at all without making several of these assumptions. Here we use simulated and real data to illustrate how following this protocol can lead to inference of false histories, and how badMIXTURE can be used to examine model fit and avoid common pitfalls.
STRUCTURE and ADMIXTURE are popular because they give the user a broad-brush view of variation in genetic data, while allowing the possibility of zooming down on details about specific individuals or labelled groups. Unfortunately it is rarely the case that sampled data follows a simple history comprising a differentiation phase followed by a mixture phase, as assumed in an ADMIXTURE model and highlighted by case study 1. Naïve inferences based on this model (the Protocol of Fig. 1) can be misleading if sampling strategy or the inferred value of the number of populations K is inappropriate, or if recent bottlenecks or unobserved ancient structure appear in the data. It is therefore useful when interpreting the results obtained from real data to think of STRUCTURE and ADMIXTURE as algorithms that parsimoniously explain variation between individuals rather than as parametric models of divergence and admixture.
For example, if admixture events or genetic drift affect all members of the sample equally, then there is no variation between individuals for the model to explain. Non-African humans have a few percent Neanderthal ancestry, but this is invisible to STRUCTURE or ADMIXTURE since it does not result in differences in ancestry profiles between individuals. The same reasoning helps to explain why for most data sets—even in species such as humans where mixing is commonplace—each of the K populations is inferred by STRUCTURE/ADMIXTURE to have non-admixed representatives in the sample. If every individual in a group is in fact admixed, then (with some exceptions) the model simply shifts the allele frequencies of the inferred ancestral population to reflect the fraction of admixture that is shared by all individuals.
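The point about shared admixture can be illustrated with a toy calculation. This is only a sketch under simple assumptions: the allele frequencies and admixture fractions below are hypothetical, and the function is not part of STRUCTURE or ADMIXTURE. When every individual carries the same fraction of ancestry from a second source, all individuals share one expected allele-frequency profile, so a single "ancestral population" with shifted frequencies explains them equally well.

```python
def expected_allele_freqs(p_a, p_b, fraction_b):
    """Expected per-locus allele frequency for an individual deriving
    `fraction_b` of its ancestry from source B and the rest from source A."""
    return [(1 - fraction_b) * a + fraction_b * b for a, b in zip(p_a, p_b)]


def n_distinct_profiles(profiles):
    """Count how many different expected profiles occur among individuals."""
    return len({tuple(round(x, 12) for x in row) for row in profiles})


# Hypothetical allele frequencies at four loci in two source populations.
P_A = [0.10, 0.50, 0.80, 0.30]
P_B = [0.60, 0.20, 0.40, 0.90]

# Case 1: every individual carries the same 3% of ancestry from B.
uniform = [expected_allele_freqs(P_A, P_B, 0.03) for _ in range(5)]

# Case 2: the share of ancestry from B varies between individuals.
varied = [expected_allele_freqs(P_A, P_B, f) for f in (0.0, 0.03, 0.10, 0.25, 0.50)]

print(n_distinct_profiles(uniform))  # one shared profile: nothing to explain
print(n_distinct_profiles(varied))   # one profile per individual
```

Only in the second case is there variation between individuals that an admixture model could pick up.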
Several methods have been developed to estimate K, but for real data, the assumption that there is a true value is always incorrect; the question rather being whether the model is a good enough approximation to be practically useful. First, there may be close relatives in the sample which violates model assumptions. Second, there might be “isolation by distance”, meaning that there are no discrete populations at all. Third, population structure may be hierarchical, with subtle subdivisions nested within diverged groups. This kind of structure can be hard for the algorithms to detect and can lead to underestimation of K. Fourth, population structure may be fluid between historical epochs, with multiple events and structures leaving signals in the data. Many users examine the results of multiple K simultaneously but this makes interpretation more complex, especially because it makes it easier for users to find support for preconceptions about the data somewhere in the results.
In practice, the best that can be expected is that the algorithms choose the smallest number of ancestral populations that can explain the most salient variation in the data. Unless the demographic history of the sample is particularly simple, the value of K inferred according to any statistically sensible criterion is likely to be smaller than the number of distinct drift events that have practically impacted the sample. The algorithm uses variation in admixture proportions between individuals to approximately mimic the effect of more than K distinct drift events without estimating ancestral populations corresponding to each one. In other words, an admixture model is almost always “wrong” (Assumption 2 of the Core protocol, Fig. 1) and should not be interpreted without examining whether this lack of fit matters for a given question.
Because STRUCTURE/ADMIXTURE accounts for the most salient variation, results are greatly affected by sample size in common with other methods. Specifically, groups that contain fewer samples or have undergone little population-specific drift of their own are likely to be fit as mixes of multiple drifted groups, rather than assigned to their own ancestral population. Indeed, if an ancient sample is put into a data set of modern individuals, the ancient sample is typically represented as an admixture of the modern populations (e.g., ref. 28,29), which can happen even if the individual sample is older than the split date of the modern populations and thus cannot be admixed.
This paper was already available as a preprint in bioRxiv (first published in 2016) and it is incredible that it needed to wait all this time to be published. I found it weird how reviewers focused on the “tone” of the paper. I think it is great to see files from the peer review process published, but we need to know who these reviewers were, to understand their whiny remarks… A lot of geneticists out there need to develop a thick skin, or else we are going to see more and more delays based on a perceived incorrect tone towards the field, which seems a rather subjective reason to force researchers to correct a paper.
A potential hindrance to our advice to upgrade from PCA graphs to PCA biplots is that the SNPs are often so numerous that they would obscure the Items if both were graphed together. One way to reduce clutter, which is used in several figures in this article, is to present a biplot in two side-by-side panels, one for Items and one for SNPs. Another stratagem is to focus on a manageable subset of SNPs of particular interest and show only them in a biplot in order to avoid obscuring the Items. A later section on causal exploration by current methods mentions several procedures for identifying particularly relevant SNPs.
One of several data transformations is ordinarily applied to SNP data prior to PCA computations, such as centering by SNPs. These transformations make a huge difference in the appearance of PCA graphs or biplots. A SNPs-by-Items data matrix constitutes a two-way factorial design, so analysis of variance (ANOVA) recognizes three sources of variation: SNP main effects, Item main effects, and SNP-by-Item (S×I) interaction effects. Double-Centered PCA (DC-PCA) removes both main effects in order to focus on the remaining S×I interaction effects. The resulting PCs are called interaction principal components (IPCs), and are denoted by IPC1, IPC2, and so on. By way of preview, a later section on PCA variants argues that DC-PCA is best for SNP data. Surprisingly, our literature survey did not encounter even a single analysis identified as DC-PCA.
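The double-centering step described above can be sketched in a few lines. This is a minimal illustration on a made-up 0/1 matrix (rows are SNPs, columns are Items), not the authors' code: it removes both main effects and reports the three sums of squares that an ANOVA table would list. The PCA itself, that is, the singular value decomposition of the interaction matrix, is left out.

```python
def double_center(matrix):
    """Split a SNPs-by-Items matrix into main effects and S x I interaction.

    Returns the double-centered (interaction) matrix and the sums of squares
    for SNP main effects, Item main effects, interaction, and the total.
    """
    n_snps, n_items = len(matrix), len(matrix[0])
    grand = sum(map(sum, matrix)) / (n_snps * n_items)
    snp_means = [sum(row) / n_items for row in matrix]
    item_means = [sum(row[j] for row in matrix) / n_snps for j in range(n_items)]
    # Remove both main effects; what remains is the interaction.
    interaction = [
        [matrix[i][j] - snp_means[i] - item_means[j] + grand for j in range(n_items)]
        for i in range(n_snps)
    ]
    sums_of_squares = {
        "SNP main effects": n_items * sum((m - grand) ** 2 for m in snp_means),
        "Item main effects": n_snps * sum((m - grand) ** 2 for m in item_means),
        "SxI interaction": sum(x ** 2 for row in interaction for x in row),
        "Total": sum((x - grand) ** 2 for row in matrix for x in row),
    }
    return interaction, sums_of_squares


# Hypothetical data: 3 SNPs (rows) by 4 Items (columns), minor allele coded 1.
TOY = [
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
]
interaction, ss = double_center(TOY)
for source, value in ss.items():
    print(f"{source}: {value:.4f}")
```

Every row and column of the interaction matrix sums to zero, and the three component sums of squares add up to the total, which is what makes the ANOVA partition a useful check on any PCA variant.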
The axes in PCA graphs or biplots are often scaled to obtain a convenient shape, but actually the axes should have the same scale for many reasons emphasized recently by Malik and Piepho. However, our literature survey found a correct ratio of 1 in only 10% of the articles, a slightly faulty ratio of the larger scale over the shorter scale within 1.1 in 12%, and a substantially faulty ratio above 2 in 16% with the worst cases being ratios of 31 and 44. Especially when the scale along one PCA axis is stretched by a factor of 2 or more relative to the other axis, the relationships among various points or clusters of points are distorted and easily misinterpreted. Also, 7% of the articles failed to show the scale on one or both PCA axes, which leaves readers with an impressionistic graph that cannot be reproduced without effort. The contemporary literature on PCA of SNP data mostly violates the prohibition against stretching axes.
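The ratio "of the larger scale over the shorter scale" can be checked on any published figure by measuring the plotted axes. The sketch below assumes hypothetical axis ranges and panel dimensions; the function names are illustrative and come from neither the paper nor any plotting library.

```python
def axis_scale_ratio(x_range, y_range, width, height):
    """Ratio of the larger axis scale over the smaller one.

    A scale is the number of data units per unit of plotted length
    (centimetres, pixels, ...). A ratio of 1 means an unstretched plot.
    """
    x_scale = (x_range[1] - x_range[0]) / width
    y_scale = (y_range[1] - y_range[0]) / height
    return max(x_scale, y_scale) / min(x_scale, y_scale)


def unstretched_height(x_range, y_range, width):
    """Panel height that gives both axes the same scale for a given width."""
    return width * (y_range[1] - y_range[0]) / (x_range[1] - x_range[0])


# Hypothetical figure: PC1 spans -4..4 and PC2 spans -2..2,
# but both are drawn in a square 10 x 10 panel.
pc1, pc2 = (-4.0, 4.0), (-2.0, 2.0)
print(axis_scale_ratio(pc1, pc2, width=10, height=10))  # PC2 stretched twofold
print(unstretched_height(pc1, pc2, width=10))           # height that avoids stretching
```

In plotting libraries the same effect is usually achieved by requesting an equal aspect ratio for the axes rather than by computing the panel size by hand.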
The percentage of variation captured by each PC is often included in the axis labels of PCA graphs or biplots. In general this information is worth including, but there are two qualifications. First, these percentages need to be interpreted relative to the size of the data matrix because large datasets can capture a small percentage and yet still be effective. For example, for a large dataset with over 107,000 SNPs for over 6,000 persons, the first two components capture only 0.3693% and 0.117% of the variation, and yet the PCA graph shows clear structure (Fig 1A in ). Contrariwise, a PCA graph could capture a large percentage of the total variation, even 50% or more, but that would not guarantee that it will show evident structure in the data. Second, the interpretation of these percentages depends on exactly how the PCA analysis was conducted, as explained in a later section on PCA variants. Readers cannot meaningfully interpret the percentages of variation captured by PCA axes when authors fail to communicate which variant of PCA was used.
Five simple recommendations for effective PCA analysis of SNP data emerge from this investigation.
Use the SNP coding 1 for the rare or minor allele and 0 for the common or major allele.
Use DC-PCA; for any other PCA variant, examine its augmented ANOVA table.
Report which SNP coding and PCA variant were selected, as required by contemporary standards in science for transparency and reproducibility, so that readers can interpret PCA results properly and reproduce PCA analyses reliably.
Produce PCA biplots of both Items and SNPs, rather than merely PCA graphs of only Items, in order to display the joint structure of Items and SNPs and thereby to facilitate causal explanations. Be aware of the arch distortion when interpreting PCA graphs or biplots.
Produce PCA biplots and graphs that have the same scale on every axis.
I read the referenced paper Biplots: Do Not Stretch Them!, by Malik and Piepho (2018), and even though it is not directly applicable to the most commonly available PCA graphs out there, it is a good reminder of the distorting effects of stretching. So for example quite recently in Krause-Kyora et al. (2018), where you can see Corded Ware and BBC samples from Central Europe clustering with samples from Yamna:
NOTE. This is related to a vertical distortion (i.e. horizontal stretching), but possibly also to the addition of some distant outlier sample/s.
The so-called ‘Yamnaya’ ancestry
Every time I read papers like these, I remember commenters who kept swearing that genetics was the ultimate science that would solve anthropological problems, where unscientific archaeology and linguistics could not. Well, it seems that, like radiocarbon analysis, these promising developing methods still need a lot of refinement to achieve something meaningful, and that they mean nothing without traditional linguistics and archaeology… But we already knew that.
Also, if this is happening in most peer-reviewed publications, made by professional geneticists, in journals of high impact factor, you can only wonder how many more errors and misinterpretations can be found in the obscure market of so many amateur geneticists out there. Because amateur geneticist is a commonly used misnomer for people who are not geneticists (since they don’t have the most basic education in genetics), and some of them are not even ‘amateurs’ (because they are selling the outputs of bioinformatic tools)… It’s like calling healers ‘amateur doctors’.
NOTE. While everyone involved in population genetics is interested in knowing the truth, and we all have our confirmation (and other kinds of) biases, for those who get paid to tell people what they want to hear, and who have sold lots of wrong interpretations already, the incentives of ‘being right’ – and thus getting involved in crooked and paranoid behaviour regarding different interpretations – are as strong as the money they can win or lose by promoting themselves and selling more ‘product’.
As a reminder of how badly these wrong interpretations of genetic results – and the influence of the so-called ‘amateurs’ – can reflect on research groups, yet another turn of the screw by the Copenhagen group, in the oral presentations at Languages and migrations in pre-historic Europe (7-12 Aug 2018), organized by the Copenhagen University. The common theme seems to be that Bell Beaker and thus R1b-L23 subclades do represent a direct expansion from Yamna now, as opposed to being derived from Corded Ware migrants, as they supported before.
NOTE. Yes, the “Yamna → Corded Ware → Únětice / Bell Beaker” migration model is still commonplace in the Copenhagen workgroup. Yes, in 2018. Guus Kroonen had already admitted they were wrong, and it was already changed in the graphic representation accompanying a recent interview to Willerslev. However, since there is still no official retraction by anyone, it seems that each member has to reject the previous model in their own way, and at their own pace. I don’t think we can expect anyone at this point to accept responsibility for their wrong statements.
I love the newly invented arrows of migration from Yamna to the north to distinguish among dialects attributed by them to CWC groups, and the intensive use of materials from Heyd’s publications in the presentation, which means they understand he was right – except for the fact that they are used to support a completely different theory, radically opposed to those defended in Heyd’s model…
Now added to the Copenhagen’s unending proposals of language expansions, some pearls from the oral presentation:
Corded Ware north of the Carpathians of R1a lineages developed Germanic;
R1b borugh [?] Italo-Celtic;
the increase in steppe ancestry on north European Bell Beakers mean that they “were a continuation of the Yamnaya/Corded Ware expansion”;
“Corded Ware groups  stopped their expansion and took over the Bell Beaker package before migrating to England” [yep, it literally says that];
Italo-Celtic expanded to the UK and Iberia with Bell Beakers [I guess that included Lusitanian in Iberia, but not Messapian in Italy; or the opposite; or nothing like that, who knows];
2nd millennium BC Bronze Age Atlantic trade systems expanded Proto-Celtic [yep, trade systems expanded the language]
1st millennium BC expanded Gaulish with La Tène, including a “Gaulish version of Celtic to Ireland/UK” [hmmm, dat British Gaulish indeed].
You know, because, why the hell not? A logical, stable, consequential, no-nonsense approach to Indo-European migrations, as always.
Also, compare still more invented arrows of migrations, from Mikkel Nørtoft’s Introducing the Homeland Timeline Map, going against Kristiansen’s multiple arrows, and even against their own recent fantasy map series in showing Bell Beakers stem from Yamna instead of CWC (or not, you never truly know what arrows actually mean):
I really, really loved that perennial arrow of migration from Volosovo, ca. 4000-800 BC (3000+ years, no less!), representing Uralic?, like that, without specifics – which is like saying, “somebody from the eastern forest zone, somehow, at some time, expanded something that was not Indo-European to Finland, and we couldn’t care less, except for the fact that they were certainly not R1a“.
This and Kristiansen’s arrows are the most comical invented migration routes of 2018; and that is saying something, given the dozens of similar maps that people publish in forums and blogs each week.
It’s hard to accept that this is a series of presentations made by professional linguists, archaeologists, and geneticists, as stated by the official website, and still harder to imagine that they collaborate within the same professional workgroup, which includes experienced geneticists and academics.
I propose the following video to close future presentations introducing innovative ideas like those above, to help the audience find the appropriate mood:
Our results revealed tMRCA average values ranging from 4725 to 1175 years ago and support the estimates of Serre et al. (3000–6000 years ago), rather than Morral et al. (52,000 years ago), but the latter figure was challenged by Kaplan et al. because of disagreement with assumptions used in their calculations. In addition, the tMRCA values from western European regions reported herein refine the results of Fichou et al. from a study of Breton CF patients in which the Estiage analysis suggested that the most common recent ancestor lived 115 generations ago. That tMRCA value, however, may have underestimated the age of p.(Phe508del) in Brittany due to consideration of all the haplotypes, even those that were reconstructed with ambiguities, as well as a potential bias associated with consanguinity due to including both haplotypes in homozygous families. In the more stringent Estiage analyses reported herein, those potential biases were avoided for all populations, leading to estimates of the oldest tMRCA values corresponding to the Early Bronze Age in western Europe, which is generally agreed to begin around 3000 BCE. This finding extends our results from a direct investigation of aDNA in teeth from Iron Age burials near Vienna around 350 BCE and allows us to conclude that p.(Phe508del) was present in that region long before then. More specifically, in the Austrian families studied, the Estiage data revealed a mean tMRCA value of 3575 years ago, which converts to 1558 BCE (Middle Bronze Age).
Perhaps most remarkably, the estimated ages of p.(Phe508del) in the three western European regions (France, Ireland, and Denmark) were similar with closely overlapping 95% CI values. This observation is also in line with previously documented spatial autocorrelograms expressing genetic and geographical distance for these populations. Such data provide more insight about the ancient origin of CF in our judgment, both when and where, and lead us to propose that CFTR p.(Phe508del) is derived from ancestors who lived in western Europe during the Bronze Age, as early as 2700 BCE, and that its relatively rapid dissemination occurred because of human migrations around the northwestern Atlantic trading routes and then towards central and eastern Europe. Diffusion from northwestern to central Europe in approximately 1000 years is consistent with the prominent Bronze Age migrations evident in the archeological record [21, 22] and from genomic studies of aDNA. On the other hand, we are assuming a discrete origin of the principal CF-causing variant, but it is possible that p.(Phe508del) arose more than once or earlier, and then reached western Europe subsequently through Neolithic migrations.
[About Bell Beakers] (…) More specifically, their distinctive Bell Beaker pottery appeared and spread across western and central Europe beginning around 3000–2750 BCE and then disappeared between 2200 and 1800 BCE [22, 29]. Their migrations are linked to the advent of western and central European metallurgy, as they manufactured and traded metal goods, especially weapons, while traveling over long distances. Most relevant to our study is the evidence that they migrated in a direction and over a time period that fits well with the pattern of tMRCA data we found for the p.(Phe508del) variant. Olalde et al. have shown that both migration and cultural transmission played a major role in diffusion of the “Beaker Complex” and led to a “profound demographic transformation” of Britain after 2400 BCE. Moreover, the cultural elements that unite the widely distributed Beaker folk are so obvious that some have considered them a distinct ethnicity of Bronze Age people.
From our results, we propose the novel concept that large scale, long term west-to-east migrations of the Bell Beaker Europeans [22, 28–30] during the Bronze Age could explain the dissemination of p.(Phe508del) in Europe and its documented northwest-to-southeast gradient. In fact, our tMRCA data show a temporal gradient also.
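The conversion from "years ago" to calendar dates in the excerpt can be reproduced with a short calculation. This is a sketch under one labelled assumption: the reference year is not stated in the excerpt, and 2018 is used here only because it reproduces the quoted 1558 BCE for 3575 years ago. The function name is illustrative.

```python
def years_ago_to_calendar(years_ago, reference_year=2018):
    """Convert an age in years before `reference_year` to a BCE/CE label.

    `reference_year` is an assumption (see the paragraph above).
    There is no year 0 in the BCE/CE system, so astronomical year 0
    corresponds to 1 BCE, year -1 to 2 BCE, and so on.
    """
    astronomical_year = reference_year - years_ago
    if astronomical_year <= 0:
        return f"{1 - astronomical_year} BCE"
    return f"{astronomical_year} CE"


# tMRCA values quoted in the excerpt (years ago).
for tmrca in (4725, 3575, 1175):
    print(tmrca, "years ago ->", years_ago_to_calendar(tmrca))
```

Under the same assumption, the oldest average value of 4725 years ago lands at 2708 BCE, in line with the "as early as 2700 BCE" of the excerpt, while the youngest, 1175 years ago, falls in the 9th century CE.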
As you can see from the references, they consulted with Barry Cunliffe (or people accepting his theory), who is obsessed with Bell Beakers expanding Celtic languages from the British Isles. He is like the British equivalent of Danish scholar Kristian Kristiansen, and his obsession with Corded Ware = Indo-European (and Germanic = CWC Denmark), immutable no matter what genetic results might show.
The funny thing is, the interpretation of the paper is probably right. From what we can see in the data, it is quite possible that the disease spread with expanding Bell Beakers…only it spread from the East group in Hungary, i.e. from east to west. The regional difference in TMRCA and apparent west—east cline would point to the different expansions of affected lineages in the corresponding regions, and not to an origin in the British Isles.
There has been an undercurrent of intellectual tension between geneticists studying human population history and archaeologists for almost 40 years. The rapid development of paleogenomics, with geneticists working on the very material discovered by archaeologists, appears to have recently heightened this tension. The relationship between these two fields thus far has largely been of a multidisciplinary nature, with archaeologists providing the raw materials for sequencing, as well as a scaffold of hypotheses based on interpretation of archaeological cultures from which the geneticists can ground their inferences from the genomic data. Much of this work has taken place in the context of western Eurasia, which is acting as testing ground for the interaction between the disciplines. Perhaps the major finding has not been any particular historical episode, but rather the apparent pervasiveness of migration events, some apparently of substantial scale, over the past ∼5000 years, challenging the prevailing view of archaeology that largely dismissed migration as a driving force of cultural change in the 1960s. However, while the genetic evidence for ‘migration’ is generally statistically sound, the description of these events as structured behaviours is lacking, which, coupled with often over simplistic archaeological definitions, prevents the use of this information by archaeologists for studying the social processes they are interested in. In order to integrate paleogenomics and archaeology in a truly interdisciplinary manner, it will be necessary to focus less on grand narratives over space and time, and instead integrate genomic data with other forms of archaeological information at the level of individual communities to understand the internal social dynamics, which can then be connected amongst communities to model migration at a regional level.
A smattering of recent studies have begun to follow this approach, resulting in inferences that are not only helping ask questions that are currently relevant to archaeologists, but also potentially opening up new avenues of research.
Interesting excerpts (emphasis mine, reference numbers removed for clarity):
There are two major, somewhat intertwined, problems that currently exist.
First, archaeologists are not critiquing whether the migrations identified by paleogenomics using sophisticated population genetic machinery are actually occurring. Instead, the technical criticism arrives in terms of how these migrations are being ascribed to specific cultures. In many paleogenomic papers, there is a tendency (and often an analytical and technical need) to associate samples with particular archaeological cultures, for which all samples are then treated as possessing some kind of homogeneous and pervasive social identity that is bound in space and time. The major critiques of this thus far have been directed to those studies examining Corded-Ware and Bell-Beaker-related individuals and their potential relationship to the Yamnaya [Vander Linden (2016), Heyd (2017), Furholt (2017)], but are applicable to many other ‘migration’ scenarios described in the recent literature. This is compounded by the use of sometimes small numbers of samples to represent certain cultures from a particular geographic area as representatives of the entire culture at a supra-regional level. Yet often these archaeological cultures such as Corded-Ware and Bell-Beaker themselves show considerable variability in space and time, and even within cemeteries, which is not factored into the genetic analysis.
From a population geneticist’s point of view, this kind of simplification is somewhat understandable and will often likely have very little impact on the final analysis, given that the primary goal is usually to use ancient samples to better understand modern genetic variation. Though there may be a specific historical interest in some of these past events, I would argue that the aim for most population geneticists at a higher level is to try and fit modern patterns of genetic variation using the simplest models possible that take into account past demographic events (for example fitting f-statistics using the ADMIXTUREGRAPH approach), as this is how we are trained. Although sharing an archaeological culture may not mean that a set of individuals are part of the same homogeneous social group in reality, this approach may be a good enough heuristic to find broad genetic connections compared to another group represented by a different culture, which can then ultimately help understand and model modern human population structure. However, for archaeologists interested in the ancient individuals themselves and their social identity, this lumping is unsatisfactory, where sophisticated narratives of the individual migrants and their ancient communities are the intended goal.
The second related problem is that ‘migration’ in the sense used currently in the paleogenomics literature lacks sufficient detail to be of much use for an archaeologist attempting to disentangle the complex social dynamics within and between communities. To truly understand the role of migration as a social process and its contribution towards cultural changes, it is necessary to describe it as a structured behaviour, rather than treating it as an explanatory ‘black box’. Are the migrations occurring as a result of short range waves-of-advance movements, or as long-distance movements via leapfrogging models or stream migrations along established routes dependent on key kinship networks? Are there return migrants, and are some subset of individuals more predisposed to migration driving the signals? Although such models were implemented in past studies (even with classical markers) and are part of the population genetics literature, they are lacking in the current paleogenomics literature when discussing migration. The finding that there is an increase of 12.3% of ancestry type X in population A compared to the preceding population B that is suggestive of a migration, is not particularly useful for examining these kinds of models. It is also unclear to what degree standard population genetic parameters estimated from genomic data such as effective population size, Ne, and gene flow are relevant to models studied in archaeology, given they reflect (somewhat undefined) long-term population sizes and average rates of movements over time, rather than reflecting any kind of reality of census size and mobility in the ancient communities the archaeologists are actually attempting to study.
The text goes on to talk about ways of studying fine-grained social dynamics of local cultures, such as:
define levels of genetic relatedness, but also in terms of material culture, age, sex, stress and activity indicators, stable isotopes for diet reconstruction (nitrogen, d13C and d15N, carbon, 13C/12C) and strontium and oxygen isotopes for mobility (87Sr/86Sr, d18O). Where possible, sites should be examined over multiple generations. In addition it will be incredibly useful to characterize the impact of disease in these communities, which is also proving to be a highly fruitful realm for paleogenomics.
I would say that the main problem is not the obvious limitations of palaeogenomics in terms of identifying prehistoric ethnolinguistic communities and their evolution, which is why it is just another tool to complement archaeology and linguistics. The main problem is the narrow understanding that some people have of the inherent limitations of palaeogenomics – especially when it interests them – , when publicizing simplistic conclusions based on these tools and their results. And I am not referring only to amateurs.
Here, we compiled an extensive continental-scale database, consisting of 3070 radiocarbon dates associated to horse paleontological and archeological finds across the whole of Eurasia, that has been analyzed in association with coarse-scale paleoclimatic reconstructions. We further collected the number of identified specimens (NISP) frequency data for horses versus other ungulates in 1120 archeological layers in Europe (…) This massive amount of data allowed us to track, with unprecedented details, how the geographic distribution of the species changed through time.
Geographic range through time
For most analyses, the data have been divided into climatic periods: pre-LGM (older than 27 ka B.P.), LGM (27 to 18 ka B.P.), Late Glacial (18 to 11.7 ka B.P.), Preboreal (11.7 to 10.6 ka B.P.), Boreal (10.6 to 9.1 ka B.P.), Early Atlantic (9.1 to 7.5 ka B.P.), Late Atlantic (7.5 to 5.5 ka B.P.), and Recent (younger than 5.5 ka B.P.) (Fig. 1, A and B). The spatial and temporal distribution of horse remains compiled in our database reveals a strong imbalance in Eurasia (Fig. 1, A and B).
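The period boundaries listed above translate directly into a lookup table. The sketch below is only an illustration of that binning, with one labelled assumption: the excerpt does not say which period a date falling exactly on a boundary belongs to, so such dates are assigned here to the older period.

```python
# (period name, younger bound, older bound), all in ka B.P., from the excerpt.
CLIMATIC_PERIODS = [
    ("pre-LGM", 27.0, float("inf")),
    ("LGM", 18.0, 27.0),
    ("Late Glacial", 11.7, 18.0),
    ("Preboreal", 10.6, 11.7),
    ("Boreal", 9.1, 10.6),
    ("Early Atlantic", 7.5, 9.1),
    ("Late Atlantic", 5.5, 7.5),
    ("Recent", float("-inf"), 5.5),
]


def climatic_period(age_ka_bp):
    """Return the climatic period for an age given in ka B.P.

    Assumption: a date exactly on a boundary goes to the older period.
    """
    for name, younger, older in CLIMATIC_PERIODS:
        if younger <= age_ka_bp < older:
            return name
    raise ValueError(f"cannot classify age: {age_ka_bp!r}")


# Hypothetical radiocarbon ages, in ka B.P.
for age in (30.0, 20.0, 8.0, 5.0):
    print(age, "->", climatic_period(age))
```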
We found a common trend in both regions for a high number of occurrences at the end of the Pleistocene (with a decrease during the LGM, only visible in Europe), followed by a drastic reduction in the Early and Middle Holocene, and a relative increase toward more recent times. These included both the Early Atlantic in Europe, which started ~9.1 ka B.P., and the time range after 5.5 ka B.P. for Asia. The horse fossil record appears ubiquitous throughout Europe in the Late Pleistocene, while in the Early and Middle Holocene the finds are concentrated in central-western Europe and Iberia. From 7.5 ka B.P., the number of finds increases markedly, and the geographical distribution extends toward the east and southeast.
Different Asian and European niches
This analysis revealed that, in both continents, horses occupied only a portion of the climatic space available. The range covered by random locations shows that the paleoecological conditions present in Europe were only a subset of those found in Asia. However, European horses occupied a much wider climatic space than in Asia, with only limited overlap between the two ranges.
Horses conquered temperate environments from a European source
There is no evidence of climatic barriers between those two populations through time because the forecasts from Europe and Asia always overlap in central Eurasia, except 5 ka B.P. (figs. S3 and S4). An alternative explanation is the role of the Urals as a potential constraint for the dispersal of horses between Europe and north central Asia.
Climatic and habitat association patterns for horses in Europe support increasing habitat fragmentation
The decrease of horse remains in Europe is not characterized by a geographic reduction in the overall extent of the area occupied by the species but in a drop of frequencies in a geographic extent that does not vary much between the Late Glacial and the Early Atlantic (Figs. 1B and 4B). This pattern is more likely to result from habitat fragmentation than from a geographic shift in the climatic range suitable for the species, as observed for many animals during the LGM (23).
In the whole period ranging from the Preboreal (11.7 to 10.6 ka B.P.) to the Late Atlantic (7.5 to 5.5 ka B.P.), the total amount of land space most and likely suitable to horses is wider than in the Late Glacial, and only between 8 to 7 ka ago the European range appears patchy and fragmented (Fig. 4C). When comparing each of four successive time bins during the Holocene (8, 7, 6, and 5 ka B.P., respectively) (Fig. 4E), the difference in successive p-Hor values in Europe shows that the suitability for the species in Iberia, northeastern France, Italy, the Balkans, and eastern Europe steadily increased, while in Central Europe strong differences can be observed between neighboring regions.
Taken at face value, this pattern would suggest that horses were not restricted to open environments but could equally well inhabit closed, forested environments, as previously suggested (18). However, as others recently emphasized (19), the faunal associations in Holocene sites from Europe suggest a different pattern. The PCAs based on faunal assemblages (figs. S1 and S2) separate on the second principal component sites characterized by ungulates associated to forested areas (red deer, wild boar, and roe deer) and all other animals, associated to semi-open and open environments, including horses for most records.
Together, the contrast between the reconstructed microscale and macroscale vegetable coverage in Europe, the increase of horses in mainly forested macroregions, and the spatial pattern of extinction suggest that, from the beginning of the Holocene, the suitable environment became more and more patchy, with open areas increasingly fragmented by forests, where wild populations of horses could have survived in isolation until one or several waves of arrivals of domestic horses, leading to either local admixture or a full replacement of the preexisting local populations.
Our data show that, up to 5.5 ka ago, horse finds do not show association with species characteristic of forested areas such as wild boar and roe deer. We infer that the open and semi-open habitats occupied by horses on a narrow geographic scale appear less and less frequent at a macroenvironmental scale, supporting the possibility of increasing fragmentation of open habitats. This event is also likely to have led to an intensification of genetic isolation for the remaining horse populations, a pattern that still needs to be tested on genomic data.
The suitability of both Iberia and eastern Europe appears constant throughout the entire post-LGM period, in line with these regions being hotspots of genetic diversity and, possibly, the refugia sources for the recolonization of the continent (11). While the Pontic-Caspian region appears not suitable for European horses around the time when horses were first domesticated some 5.5 ka ago (6), part of this region appears suitable for the Asian horses (with the Caspian Sea as the westernmost boundary). This may suggest that horse domestication started from a population background related to an Asian ancestry and that the further spread of the domesticated horses in Europe involved either adaptation to novel niches (possibly through selective breeding) or the application of domestication techniques to local horse populations pre-adapted to these environmental conditions. Testing this scenario will require mapping the genetic structure of the Eurasian horse population within the fifth to third millennium BCE.
Cultural-anthropological research and archaeological remains (see here), genetics (see here and here), and now also thorough palaeoclimatic and archaeological models point to the North Caspian region, settled by the Khvalynsk culture, as the most likely earliest origin of horse domestication. The paper also supports the favorable conditions of western Europe up to Iberia for the introduction of a horse-riding culture.
I intended to write a post about the myth of Corded Ware horse riders, but for the moment I haven’t found the time. Not that Corded Ware pastoralists didn’t have horses, or could not ride them: they were a highly mobile culture of pastoralists stemming from eastern Poland / western Ukraine, so they must have known horses, like many other European cultures of the late 4th / early 3rd millennium influenced by expanding Yamna settlers. But horses just cannot be said to have formed an essential part of their culture, as they did for Khvalynsk-Novodanilovka, and especially Yamna and later East Bell Beaker, Sintashta, etc.
A mere look at these maps suffices to assess the limited role of the horse in north-eastern Europe, the only region where groups of late Corded Ware-derived cultures survived the expansion of Yamna, and especially East Bell Beakers after ca. 2500 BC, which transformed Western, Northern, and Central Europe, and even East Europe reaching the modern Baltic countries, Belarus, and Romania. Even Trzciniec was born out of the influence from expanding Bell Beakers into earlier Corded Ware territory, although the later (Iron Age) relevance of this culture was probably quite limited.
As you can imagine, without horses and horse symbolism, horse riding, carts, and intensive cattle-breeding (associated with Yamna and the broad, east-central European grasslands typical of steppe regions), there can be no Proto-Indo-European, whose reconstructed vocabulary is particularly rich in horse-related words, and whose reconstructed culture, society, and religion cannot be understood without the domesticated horse. In forest regions to the north-east and eastern Europe, there was apparently little space for horses, but plenty of room for other ungulates and thus hunting, and indeed Uralic languages…
In the upcoming months we will see R1a-fans associating Proto-Indo-Europeans more and more with wool, and sheep, and corded ware, and forest regions, until the proposed homeland shifts to the Baltic and Finland, instead of dat boring horse-riding people of the steppes… No wait, it’s already happening.
In recent decades, evidence has accumulated for comparable enclosures of later dates, including the Early Bronze Age Únětice Culture between 2200 and 1600 BC, and thus into the chronological and cultural context of the Nebra sky disc. Based on the analysis of one of these enclosure sites, recently excavated at Pömmelte on the flood plain of the Elbe River near Magdeburg, Saxony-Anhalt, and dating to the late third millennium BC
The main occupation began at 2321–2211 cal BC, with the stratigraphically earliest features containing exclusively Bell Beaker finds. Bell Beaker ceramics continue after 2204–2154 cal BC (boundary occupation I/II), although they were probably undecorated, but are now complemented by Únětice Culture (and other Early Bronze Age) types. At this time, features common to both cultures predominate. Only contexts dating to the late main occupation phase (late phase II) and thereafter contained exclusively Únětice Culture finds. Evidently, the bearers of the Bell Beaker Culture were the original builders of the enclosure. During a second phase of use, Final Neolithic and Early Bronze Age cultures coexisted and intermingled. The material remains, however, should not be taken as evidence for successive groups of differing archaeological cultures, but as witnesses to a cultural transition from the Bell Beaker Culture to the Únětice Culture (Spatzier 2015). The main occupation ended 2086–2021 cal BC with the deconstruction of the enclosure; Bell Beaker finds are now absent. Finally, a few features (among them one shaft) and radiocarbon dates attest the sporadic re-use of the site in a phase of abandonment/re-use that ended 1636–1488 cal BC.
How the above-ground structures possibly influenced perception may reveal another layer of meaning that highlights social functions related to ritual. While zone I was disconnected from the surroundings by a ‘semi-translucent’ post-built border, zones II/III were separated from the outside world by a wooden wall (i.e. the palisade), and zone III probably separated individuals from the crowd gathered in zone II. Accessing the interior or centre therefore meant passing through transitional zones, to first be secluded and then segregated. Exiting the structure meant re-integration and re-connection. The experience possibly induced when entering and leaving the monument reflects the three stages of ‘rites of passage’ described by van Gennep (1909): separation, liminality and incorporation. The enclosure’s outer zone(s) represents the pre- and post-liminal phase; the central area, the liminal phase. Seclusion and liminality in the interior promoted a sense of togetherness, which can be linked to Turner’s “communitas” (1969: 132–33). We might therefore see monuments such as the Pömmelte enclosure as important communal structures for social regulation and the formation of identity.
(…) The long-term stability of these connotations must be emphasised. As with the tradition of making depositions, these meanings were valid from the start of the occupation — c. 2300 BC — until at least the early period following the deconstruction event, c. 2050 BC. While the spatial organisation and the solar alignment of the main entrances were maintained throughout the main occupation, stone axes and ‘formal’ graves indicate the continuation of the spatial concepts described above until the twentieth to nineteenth centuries BC.
These layers of meaning mirror parallel concepts of space including, although not necessarily restricted to, the formation of group identities (see Hansen & Meyer 2013: 5). They can perhaps be better understood as a ‘cosmological geography’ manifested in the symbolism of superimposed levels of conceptual ideas related to space and to certain cardinal points (Figure 8). This idea is closely related to Eliade’s (1959: 29–36) understanding of “organized — hence cosmicized — territory”, that is territory consecrated to provide orientation within the homogeneity of the chaotic ‘outside world’, and the equivalence of spatial consecration and cosmogony. Put differently, the Pömmelte enclosure can be interpreted as a man-made metaphor and an icon of the cosmos, reflecting the Weltanschauung (a comprehensive conception of the world) of the people who built and used it. By bringing together Eliade and Rappaport’s ideas of meaningfulness in relation to religious experience (Rappaport 1999: 391–95), it may be argued that Pömmelte was a place intended to induce oneness with the cosmos. In combining multiple layers that symbolically represent different aspects of life (first-order meaning), the enclosure became an icon metaphorically representing the world (second-order meaning). As this icon was the place to reaffirm life symbolism ritually, through their actions, people perhaps experienced a sense of rootedness in, or unity with, the cosmos (highest-order meaning). Although we can only speculate about the perceptions of ancient people, such a theory aiming to describe general principles of religious experience can provide insight.
The circular enclosure of Pömmelte is the first Central European monumental complex of primarily sacred importance that has been excavated and studied in detail. It reveals aspects of society and belief during the transition from the Final Neolithic to the Early Bronze Age, in the second half of the third millennium BC. Furthermore, it offers details of ritual behaviour and the way that people organised their landscape. A sacred interior was separated from the profane environment, and served as a venue for rites that secured the continuity of the social, spiritual and cosmic order. Ancestor worship formed another integral part of this: a mound-covered burial hut and a square-shaped ditch sanctuary (located, respectively, within and near the enclosure’s south-eastern sector; cf. Figure 2)—dating to 2880–2580 cal BC and attributed to the Corded Ware Culture (Spatzier 2017a: 235–44)—suggest that this site was deliberately chosen. With construction of the ring sanctuary, this place gained an immense expansion in meaning—comparable to Stonehenge. Through architectural transformation, both of these sites developed into sanctuaries with increasingly complex religious functions, including in relation to the cult of the dead. The cosmological and social functions, and the powerful symbolism of the Nebra sky disc and hoard (Meller 2010: 59–70), are reflected in Pömmelte’s monumental architecture.
All of these features—along with Pömmelte’s dating, function and complex ring structure—are well documented for British henge monuments (Harding 2003; Gibson 2005). The continuous use of circular enclosures in Central Europe from around 3000–1500 BC remains to be confirmed, but strong evidence indicates usage spanning from the fifth to the first millennia BC (Spatzier 2017a: 273–96). From 2500 BC onwards, examples in Central Europe, Iberia and Bulgaria (Bertemes 2002; Escudero Carrillo et al. 2017) suggest a Europe-wide concept of sanctuary. This indicates that in extensive communication networks at the beginning of bronze metallurgy (Bertemes 2016), intellectual and religious contents circulated alongside raw materials. The henge monuments of the British Isles are generally considered to represent a uniquely British phenomenon, unrelated to Continental Europe; this position should now be reconsidered. The uniqueness of Stonehenge lies, strictly speaking, with its monumental megalithic architecture.
The Classical Bell Beaker heritage
No serious scholar can argue at this point against the male-biased East Bell Beaker migrations that expanded the European languages related to Late Proto-Indo-European-speaking Yamna (see David Reich’s comments), and thus most likely North-West Indo-European – the ancestor of Italo-Celtic, Germanic, and Balto-Slavic, apart from Pre-Celtic IE in the British Isles, Lusitano-Galician in Iberia, or Messapic in Italy (see here a full account).
With language, these migrants (several tens of thousands) brought their particular Weltanschauung to all of Western, Central, and Northern Europe. Their admixture precisely in Hungary shows that they had close interactions with non-Indo-European peoples (genetically related to the Globular Amphorae culture), something that we knew from the dozens of non-Indo-European words reconstructed exclusively for North-West Indo-European, apart from the few reconstructed non-Indo-European words that NWIE shares with Palaeo-Balkan languages, which point to earlier loans from their ancestors, Yamna settlers migrating along the lower Danube.
It is not difficult to imagine that the initial East Bell Beaker group shared a newly developed common cosmological point of view that clashed with other neighbouring Yamna-related worldviews (e.g. in Balkan EBA cultures) after the cultural ties with Yamna were broken. Interesting in this respect is for example their developed (in mythology as in the new North-West Indo-European concept) *Perkwūnos, the weather god – probably remade (in language as in concept) from a Yamna minor god also behind Old Indian parjányas, the rain god – as one of the main gods from the new Pantheon, distinct from *Dyēus patēr, the almighty father sky god. In support of this, the word *meldh-n- ‘lightning’, behind the name of the mythological hammer of the weather god (cf. Old Norse Mjǫllnir or Latvian Milna), was also a newly coined North-West Indo-European term, although the myth of the hero slaying the dragon with the magical object is older.
Circular enclosures have been known in Europe since the Neolithic. Also, the site selected for the Pömmelte enclosure had been used to bury Corded Ware individuals some centuries before its construction, and Corded Ware symbolism (stone axe vs. quern) is seen in the use given by Bell Beakers and later Únětice at this place. All this and other regional similarities between Bell Beakers and different local cultures (see here an example of Iberian Bell Beakers) point to syncretism of the different Bell Beaker groups with preceding cultures in the occupied regions. After all, their genealogical ancestors included also those of their maternal side, and not all encountered males disappeared, as is clearly seen in the resurgence of previous paternal lineages in Central-East Europe and in Scandinavia. The admixture of Bell Beakers with previous groups (especially those of similar steppe-related ancestry from Corded Ware) needs more complex analyses to clarify potential early dialectal expansions (read what Iosif Lazaridis has to say).
The popular “big and early” expansions
These syncretic trends gave rise to distinct regional cultures, and eventually different local groups rose to power in the new cultural regions and ousted the old structures. Social norms, hierarchy, and pantheons were remade. Events like this must have been repeated again and again in Bronze and Iron Age Europe, and in many cases it was marked by a difference in the prevailing archaeological culture attested, and probably accompanied by certain population replacements that will be seen with more samples and studies of fine-scale population structure.
Some of these cultural changes, marked by evident haplogroup or admixture replacement, are defined as a ‘resurgence’ of ancestry linked to previous populations, although that is obviously not equivalent to a resurgence of a previous cultural group, because they usually represent just a successful local group of the same supraregional culture with a distinct admixture and/or haplogroup (see e.g. resurgence of R1a-Z645 in Central-East European Bronze Age). Social, religious, or ethnic concepts may have changed in each of these episodes, along with the new prestige dialect.
This must then have happened many times during the hundreds (or thousands in some cases) of years until the first attestation of a precise ancient language and culture (read e.g. about one of the latest branches to be attested, Balto-Slavic). Ancient language contacts, like substrates or toponymy, can only rarely be detected after so many changes, so their absence (or the lack of proper studies on them) is usually not relevant – and certainly not an argument – in scholarly discussions. Their presence, on the other hand, is proof of such contacts.
We have dozens of papers supporting Uralic dialectal substrate influence on Pre-Germanic, Proto-Balto-Slavic, and Pre- and Proto-Indo-Iranian (and even Proto-Celtic), as well as superstrate influence of Palaeo-Germanic (i.e. from Pre- to Proto-Germanic) and Proto-Balto-Slavic into Proto-Finno-Saamic, much stronger than the Indo-Iranian adstrate influence on Finno-Ugric (see the relative importance of each influence) which locates all these languages and their evolution to the north and west of the steppe (with Proto-Permic already separated, in North-East Europe, as is Proto-Ugric further east near the Urals), probably around the Baltic and Scandinavia after the expansion of Bell Beakers. These connections have been known in linguistics for decades.
Apart from some early 20th century scholars, only a minority of Indo-Europeanists support nowadays an Indo-European (i.e. centum) substrate for Balto-Slavic, to keep alive an Indo-Slavonic group based on a hypothetical 19th century Satem group; so e.g. Holzer with his Temematic, and Kortlandt supporting him, also with some supposed Indo-European substrate with heavy non-Indo-European influence for Germanic and Balto-Slavic, that now (thanks mainly to the views of the Copenhagen group) have been linked to the Corded Ware culture, as it has become clear even to them that Bell Beakers expanded North-West Indo-European.
For their part, only a minority among Uralicists, such as Kuz’mina, Parpola or Häkkinen, believe in an ‘eastern’ origin of Uralic languages, around the Southern Urals. Genomic finds – like their peers – are clearly not supporting their views. But even if we accept this hypothesis, there is little space beyond Abashevo and related East Corded Ware cultures after the recent papers on Corded Ware and Fennoscandian samples. And yet here we are:
substitutes arrows for Kron-like colors (where danger red = Indo-European) with the same end result of many other late 20th century whole-Europe Kurgan maps, linking Sredni Stog and Corded Ware with Yamna, but obviating the precise origin of Corded Ware peoples (is it Sredni Stog, or is it that immutable Middle Dnieper group? is it West Yamna, or Yamna Hungary? is it wool, or is it wheels?);
relegates Uralic speakers to a tiny corner, a ‘Volosovo’ cultural region, thus near Khvalynsk/Yamna (but not too much), that miraculously survives surrounded by all-early-splitting, all-Northern Eneolithic Indo-Europeans, thus considering Uralic languages irrelevant not only to locate the PIE Urheimat, but also to locate their own homeland; also, cultures identified in color with Uralic speakers expand until the Iron Age with enough care not to even touch in the map one of the known R1a samples published to date (because, for some people, apparently R1a must be Indo-European); and of course N1c or Siberian ancestry are irrelevant, too;
and adds findings of wheels and wool probably in support of some new ideas based on yet another correlation = causation argument (that I cannot then properly criticize without access to its reasoning beyond cute SmartArt-like symbols) similar to their model – already becoming a classic example of wrong use of statistical methods – based on the infamously named Yamnaya ancestral component™, which is obviously still used here, too.
The end result is thus similar to any other simplistic 1990s Gimbutas (or rather the recently radicalized IE Sredni Stog -> Corded Ware -> BBC version by the Danish workgroup) + 2000s R1a-map + 2010s Yamnaya ancestry™; but, hard to believe, it is published in mid-2018. A lot of hours of senseless effort, because after its publication it becomes ipso facto outdated.
For comparison of Yamna and Bell Beaker expansions, here is a recent simplistic, static (and yet more accurate) pair of maps, from the Reich Lab:
If the Copenhagen group keeps on pushing Gimbutas’ long ago outdated IE Sredni Stog -> Corded Ware theory as modified by Kristiansen, with their recently invented Corded Ware -> Bell Beaker model in genetics, at some point they are bound to clash with the Reich-Jena team, which seems to have less attachment to the classic Kurgan model and the wrong interpretations of the 2015 papers, and that would be something to behold. Because, as Cersei would say: “When you play the game of thrones, you win or you die. There is no middle ground.” And when you play the game of credibility, after so many, so wrong publications, well…
NOTE. I have been working on a similar GIS tool for quite some time, using my own maps and compiled genetic data, which I currently only use for my 2018 revision of the Indo-European demic diffusion model. Maybe within some weeks or months I will be able to publish the maps properly, after the revised papers. It’s a pity that so much work on GIS and analysis with genetic data and cultural regions has to be duplicated, but I intend to keep some decent neutrality in my revised cultural maps, and this seems impossible at this point with some workgroups who have put all their eggs in one broken basket…