Corded Ware ancestry in North Eurasia and the Uralic expansion


Now that it has become evident that Late Repin (i.e. Yamnaya/Afanasevo) ancestry was associated with the migration of R1b-L23-rich Late Proto-Indo-Europeans from the steppe in the second half of the 4th millennium BC, there’s still the question of how R1a-rich Uralic speakers of Corded Ware ancestry expanded, and how they spread their languages throughout North Eurasia.

Modern North Eurasians

I have been collecting information from the supplementary data of the latest papers on modern and ancient North Eurasian peoples, including Jeong et al. (2019), Saag et al. (2019), Sikora et al. (2018), or Flegontov et al. (2019), and I have tried to add up their information on ancestral components and their modern and historical distributions.

Fortunately, the current obsession with simplifying ancestry components into three or four general, atemporal groups, and the common use of the same ones across labs, make it very simple to merge data and map them.

Corded Ware ancestry

There is no doubt about the prevalent ancestry among Uralic-speaking peoples. A map isn’t needed to realize that, because ancient and modern data – like those recently summarized in Jeong et al. (2019) – prove it. But maps sure help visualize their intricate relationship better:

Natural neighbor interpolation of Srubnaya ancestry among modern populations. See full map.
Kriging interpolation of Srubnaya ancestry among modern populations. See full map.

Interestingly, the regions with higher Corded Ware-related ancestry are in great part coincident with (pre)historical Finno-Ugric-speaking territories:

Modern distribution of Uralic languages, with ancient territory (in the Common Era) labelled and delimited by a red line. For more information on the ancient territory see here.

Edit (29/7/2019): Here is the full Steppe_MLBA ancestry map, including Steppe_MLBA (vs. Indus Periphery vs. Onge) in modern South Asian populations from Narasimhan et al. (2018), apart from the ‘Srubnaya component’ in North Eurasian populations. ‘Dummy’ variables (with 0% ancestry) have been included to the south and east of the map to avoid weird interpolations of Steppe_MLBA into Africa and East Asia.
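The effect of such 0% dummy points can be illustrated with a minimal sketch. It assumes a simple inverse-distance-weighted (IDW) estimate as a stand-in for natural neighbor interpolation, planar distances on longitude/latitude, and entirely hypothetical coordinates and percentages; it is not the procedure used for the maps.

```python
import math

def idw(points, lon, lat, power=2.0):
    """Inverse-distance-weighted estimate at (lon, lat).

    points: list of (lon, lat, value) tuples.
    NOTE: IDW is a stand-in here, not natural neighbor interpolation.
    """
    num = 0.0
    den = 0.0
    for p_lon, p_lat, value in points:
        d = math.hypot(lon - p_lon, lat - p_lat)
        if d == 0.0:
            return value  # query falls exactly on a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical sampled populations: (longitude, latitude, % ancestry).
samples = [(25.0, 60.0, 40.0), (50.0, 56.0, 35.0), (75.0, 28.0, 20.0)]
# Hypothetical dummy points with 0% ancestry to the south and east.
dummies = [(20.0, 0.0, 0.0), (110.0, 30.0, 0.0)]

target = (25.0, 5.0)  # a location far to the south of every real sample
without_dummies = idw(samples, *target)
with_dummies = idw(samples + dummies, *target)

print(f"without dummies: {without_dummies:.1f}%")
print(f"with dummies:    {with_dummies:.1f}%")
```

Without the dummy points the estimate far from any sample is just a blend of the nearest real values; with them it is pulled towards zero, which is the behaviour the edit note describes.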

Natural neighbor interpolation of Steppe MLBA-like ancestry among modern populations. See full map.

Anatolia Neolithic ancestry

Also interesting are the patterns of non-CWC-related ancestry, in particular the apparent wedge created by expanding East Slavs, which seems to reflect the intrusion of central(-eastern) European ancestry into Finno-Permic territory.

NOTE. Read more on Balto-Slavic hydrotoponymy, on the cradle of Russians as a Finno-Permic hotspot, and about Pre-Slavic languages in North-West Russia.

Natural neighbor interpolation of LBK EN ancestry among modern populations. See full map.
Kriging interpolation of LBK EN ancestry among modern populations. See full map

WHG ancestry

The cline(s) between WHG, EHG, ANE, Nganasan, and Baikal HG are also simplified when some of them are excluded, in this case EHG, which is thus represented in part by WHG, and in part by more eastern ancestries (see below).

Natural neighbor interpolation of WHG ancestry among modern populations. See full map.
Kriging interpolation of WHG ancestry among modern populations. See full map.

Arctic, Tundra or Forest-steppe?

Data on Nganasan-related vs. ANE vs. Baikal HG/Ulchi-related ancestry is difficult to map properly, because these ancestry components are usually reported as mutually exclusive, when they are in fact clearly related in an ancestral cline formed by different ancient North Eurasian populations from Siberia.

When it comes to ascertaining the origin of the multiple CWC-related clines among Uralic-speaking peoples, the question is thus how to properly distinguish the proportions of WHG-, EHG-, Nganasan-, ANE-, or Baikal HG-related ancestral components in North Eurasia, i.e. how each dialectal group admixed with the regional groups which formed part of these clines east and west of the Urals.

The truth is, one ought to test specific ancient samples for each “Siberian” ancestry found in the different Uralic dialectal groups, but the simplistic “Siberian” label somehow gets a pass in many papers (see a recent example).

Below are qpAdm results with best fits for Ulchi ancestry, Afontova Gora 3 ancestry, and Nganasan ancestry, but some populations show good fits for more than one source and with similar proportions, so selecting one necessarily simplifies the distribution of the others.
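The simplification involved in keeping a single best fit per population can be sketched as follows. This is only an illustration: the population names, p-values and proportions are hypothetical, and the 0.05 feasibility threshold and the 5-point tolerance are assumptions, not values taken from the papers.

```python
def pick_best_fit(models, p_threshold=0.05, tolerance=0.05):
    """Pick the "Siberian" source with the highest qpAdm p-value.

    models: dict mapping source name -> (p_value, ancestry_proportion).
    Returns (best_source, alternatives), where alternatives are other
    sources that are not rejected (p >= p_threshold) and whose estimated
    proportion lies within `tolerance` of the best model's proportion.
    """
    best = max(models, key=lambda s: models[s][0])
    _, best_prop = models[best]
    alternatives = sorted(
        s for s, (p, prop) in models.items()
        if s != best and p >= p_threshold and abs(prop - best_prop) <= tolerance
    )
    return best, alternatives

# Hypothetical results: source -> (p-value, proportion of that ancestry).
results = {
    "PopA": {"Ulchi": (0.42, 0.12), "AG3": (0.001, 0.20), "Nganasan": (0.31, 0.10)},
    "PopB": {"Ulchi": (0.02, 0.30), "AG3": (0.003, 0.15), "Nganasan": (0.55, 0.33)},
}

for pop, models in results.items():
    best, alternatives = pick_best_fit(models)
    print(pop, "best:", best, "equally plausible:", alternatives)
```

For the hypothetical PopA, mapping only the Ulchi value hides a Nganasan model that fits almost as well with a similar proportion, which is the kind of case the paragraph above warns about.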

Ulchi ancestry

Natural neighbor interpolation of Ulchi ancestry among modern populations. See full map.
Kriging interpolation of Ulchi ancestry among modern populations. See full map.

ANE ancestry

Natural neighbor interpolation of ANE ancestry among modern populations. See full map.
Kriging interpolation of ANE ancestry among modern populations. See full map.

Nganasan ancestry

Natural neighbor interpolation of Nganasan ancestry among modern populations. See full map.
Kriging interpolation of Nganasan ancestry among modern populations. See full map.

Iran Chalcolithic

A simplistic Iran Chalcolithic-related ancestry is also seen in the Altaic cline(s) which (like Corded Ware ancestry) expanded from Central Asia into Europe – apart from its historical distribution south of the Caucasus:

Natural neighbor interpolation of Iran Chalcolithic ancestry among modern populations. See full map.
Kriging interpolation of Iran Chalcolithic ancestry among modern populations. See full map.

Other models

The first question I imagine some would like to know is: what about other models? Do they show the same results? Here is the simplistic combination of ancestry components published in Damgaard et al. (2018) for the same or similar populations:

NOTE. As you can see, their selection of EHG vs. WHG vs. Nganasan vs. Natufian vs. Clovis is of little use, but it corroborates the results from other papers, and shows some interesting patterns in combination with those above.


EHG ancestry

Natural neighbor interpolation of EHG ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of EHG ancestry among modern populations. See full map.

Natufian ancestry

Natural neighbor interpolation of Natufian ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of Natufian ancestry among modern populations. See full map.

WHG ancestry

Natural neighbor interpolation of WHG ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of WHG ancestry among modern populations. See full map.

Baikal HG ancestry

Natural neighbor interpolation of Baikal hunter-gatherer ancestry among modern populations, data from Damgaard et al. (2018). See full map.
Kriging interpolation of Baikal HG ancestry among modern populations. See full map.

Ancient North Eurasians

Once the modern situation is clear, relevant questions are, for example, whether EHG-, WHG-, ANE-, Nganasan-, and/or Baikal HG-related meta-populations expanded or became integrated into Uralic-speaking territories.

When did these admixture/migration events happen?

How did the ancient distribution or expansion of Palaeo-Arctic, Baikalic, and/or Altaic peoples affect the current distribution of the so-called “Siberian” ancestry, and of hg. N1a, in each specific population?

NOTE. A little excursus is necessary, because the calculated repetition of a hypothetical opposition “N1a vs. R1a” doesn’t make this dichotomy real:

  1. There was not a single ethnolinguistic community represented by hg. R1a after the initial expansion of Eastern Corded Ware groups, or by hg. N1a-L392 after its initial expansion in Siberia.
  2. Different subclades became incorporated in different ways into Bronze Age and Iron Age communities, most of them without an ethnolinguistic change. For example, N1a subclades became incorporated into North Eurasian populations of different languages, reaching Uralic- and Indo-European-speaking territories of north-eastern Europe during the late Iron Age, at a time when their ancestral origin or language in Siberia was impossible to ascertain. Just like the mix found among Proto-Germanic peoples (R1b, R1a, and I1)* or among Slavic peoples (I2a, E1b, R1a)*, the mix of many Uralic groups showing specific percentages of R1a, N1a, or Q subclades* reflects more or less recent admixture or acculturation events with little impact on their languages.

*other typically northern and eastern European haplogroups are also represented in early Germanic (N1a, I2, E1b, J, G2), Slavic (I1, G2, J) and Finno-Permic (I1, R1b, J) peoples.

Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

The problem with mapping the ancestry of the available sampling of ancient populations is that we lack proper temporal and regional transects. The maps that follow include cultures roughly divided into either “Bronze Age” or “Iron Age” groups, although the difference between samples may span up to 2,000 years.

NOTE. Rough estimates for more external groups (viz. Sweden Battle Axe/Gotland_A for the NW, Srubna from the North Pontic area for the SW, Arctic/Nganasan for the NE, and Baikal EBA/“Ulchi-like” for the SE) have been included to offer a wider interpolated area using data already known.

Bronze Age

Similar to modern populations, the selection of best fit “Siberian” ancestry between Baikal HG vs. Nganasan, both potentially ± ANE (AG3), is an oversimplification that needs to be addressed in future papers.

Corded Ware ancestry

Natural neighbor interpolation of Srubnaya ancestry among Bronze Age populations. See full map.

Nganasan-like ancestry

Natural neighbor interpolation of Nganasan-like ancestry among Bronze Age populations. See full map.

Baikal HG ancestry

Natural neighbor interpolation of Baikal Hunter-Gatherer ancestry among Bronze Age populations. See full map.

Afontova Gora 3 ancestry

Natural neighbor interpolation of Afontova Gora 3 ancestry among Bronze Age populations. See full map.

Iron Age

Corded Ware ancestry

Interestingly, the moderate expansion of Corded Ware-related ancestry from the south during the Iron Age may be related to the expansion of hg. N1a-VL29 into the chiefdom-based system of north-eastern Europe, including Ananyino/Akozino and later expanding Akozino warrior-traders around the Baltic Sea.

NOTE. The samples from Levänluhta are centuries older than those from Estonia (and Ingria), and those from Chalmny Varre are modern ones, so this region has to be read as a south-west to north-east distribution from the Iron Age to modern times.

Natural neighbor interpolation of Srubnaya ancestry among Iron Age populations. See full map.

Baikal HG-like ancestry

The fact that this Baltic N1a-VL29 branch belongs in a group together with typically Avar N1a-B197 supports the Altaic origin of the parent group, which is possibly related to the expansion of Baikalic ancestry and Iron Age nomads:

Natural neighbor interpolation of Baikal HG ancestry among Iron Age populations. See full map.

Nganasan-like ancestry

The dilution of Nganasan-like ancestry in an Arctic region featuring “Siberian” ancestry and hg. N1a-L392 at least since the Bronze Age supports the integration of hg. N1a-Z1934, sister clade of Ugric N1a-Z1936, into populations west and east of the Urals with the expansion of Uralic languages to the north into the Tundra region (see here).

The integration of N1a-Z1934 lineages into Finnic-speaking peoples after their migration to the north and east, and the displacement or acculturation of Saami from their ancestral homeland, coinciding with known genetic bottlenecks among Finns, is yet another proof of this evolution:

Natural neighbor interpolation of Nganasan ancestry among Iron Age populations. See full map.

WHG ancestry

Similarly, WHG ancestry doesn’t seem to be related to important population movements throughout the Bronze Age, which excludes the multiple North Eurasian populations that will be found along the clines formed by WHG, EHG, ANE, Nganasan, Baikal HG ancestry as forming part of the Uralic ethnogenesis, although they may be relevant to follow later regional movements of specific populations.

Natural neighbor interpolation of WHG ancestry among Iron Age populations. See full map.


It seems natural that people used to look at maps of haplogroup distribution from the 2000s, coupled with modern language distributions, and would try to interpret them in a certain way, reaching thus the wrong conclusions whose consequences are especially visible today when ancient DNA keeps contradicting them.

In hindsight, though, assuming that Balto-Slavs expanded with Corded Ware and hg. R1a, or that Uralians expanded with “Siberian” ancestry and hg. N1a, was as absurd as looking at maps of ancestry and haplogroup distribution of ancient and modern Native Americans, trying to divide them into “Germanic” or “Iberian”…

The evolution of each specific region and cultural group of North Eurasia is far from being clear. However, the general trend speaks clearly in favour of an ancient, Bronze Age distribution of North Eurasian ancestry and haplogroups that have decreased, diluted, or become incorporated into expanding Uralians of Corded Ware ancestry, occasionally spreading with inter-regional expansions of local groups.

Given the relatively recent push of Altaic and Indo-European languages into ancestral Uralic-speaking territories, only the ancient Corded Ware expansion remains compatible with the spread of Uralic languages into their historical distribution.


Vikings, Vikings, Vikings! “eastern” ancestry in the whole Baltic Iron Age


Open access Population genomics of the Viking world, by Margaryan et al. bioRxiv (2019), with a huge new sampling from the Viking Age.

Interesting excerpts (emphasis mine, modified for clarity):

To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago.

Maps illustrating the following texts have been made based on data from this and other papers:

  • Maps showing ancestry include only data from this preprint (which also includes some samples from Sigtuna).
  • Maps showing haplogroup density include Vikings from other publications, such as those from Sigtuna in Krzewinska et al. (2018), and from Iceland in Ebenesersdóttir et al. (2018).
  • Maps showing haplogroups of ancient DNA samples based on their age include data from all published papers, but with slightly modified locations to avoid overcrowding (randomized distance approx. ± 0.1 long. and lat.).
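The randomized displacement mentioned in the last point can be sketched like this. It is a minimal illustration assuming a uniform offset of at most 0.1 degrees on each axis; the function name, the seed and the coordinates are hypothetical and do not reflect how the maps were actually produced.

```python
import random

def jitter(lon, lat, max_offset=0.1, rng=random):
    """Return (lon, lat) displaced by a uniform random offset on each axis.

    max_offset is in decimal degrees; rng can be any object exposing
    uniform(a, b), e.g. a seeded random.Random for reproducible maps.
    """
    return (
        lon + rng.uniform(-max_offset, max_offset),
        lat + rng.uniform(-max_offset, max_offset),
    )

rng = random.Random(42)   # fixed seed so the layout is reproducible
site = (18.07, 59.33)     # hypothetical burial site with several samples

# Five samples from the same site get five slightly different positions.
jittered = [jitter(*site, rng=rng) for _ in range(5)]
for lon, lat in jittered:
    print(f"{lon:.3f}, {lat:.3f}")
```

At these latitudes 0.1 degrees is on the order of a few kilometres, so samples from one site stay visually grouped while no longer sitting exactly on top of each other.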

Y-DNA haplogroups in Europe during the Viking expansions (full map). See other maps from the Middle Ages.

We find that the transition from the BA to the IA is accompanied by a reduction in Neolithic farmer ancestry, with a corresponding increase in both Steppe-like ancestry and hunter-gatherer ancestry. While most groups show a slight recovery of farmer ancestry during the VA, there is considerable variation in ancestry across Scandinavia. In particular, we observe a wide range of ancestry compositions among individuals from Sweden, with some groups in southern Sweden showing some of the highest farmer ancestry proportions (40% or more in individuals from Malmö, Kärda or Öland).

Ancestry proportions in Norway and Denmark on the other hand appear more uniform. Finally we detect an influx of low levels of “eastern” ancestry starting in the early VA, mostly constrained among groups from eastern and central Sweden as well as some Norwegian groups. Testing of putative source groups for this “eastern” ancestry revealed differing patterns among the Viking Age target groups, with contributions of either East Asian- or Caucasus-related ancestry.

Ancestry proportions of four-way models including additional putative source groups for target groups for which three-way fit was rejected (p ≤ 0.01);

Overall, our findings suggest that the genetic makeup of VA Scandinavia derives from mixtures of three earlier sources: Mesolithic hunter-gatherers, Neolithic farmers, and Bronze Age pastoralists. Intriguingly, our results also indicate ongoing gene flow from the south and east into Iron Age Scandinavia. Thus, these observations are consistent with archaeological claims of wide-ranging demographic turmoil in the aftermath of the Roman Empire with consequences for the Scandinavian populations during the late Iron Age.

Genetic structure within Viking-Age Scandinavia

We find that VA Scandinavians on average cluster into three groups according to their geographic origin, shifted towards their respective present-day counterparts in Denmark, Sweden and Norway. Closer inspection of the distributions for the different groups reveals additional complexity in their genetic structure.

Natural neighbor interpolation of “Danish ancestry” among Vikings.

We find that the ‘Norwegian’ cluster includes Norwegian IA individuals, who are distinct from both Swedish and Danish IA individuals which cluster together with the majority of central and eastern Swedish VA individuals. Many individuals from southwestern Sweden (e.g. Skara) cluster with Danish present-day individuals from the eastern islands (Funen, Zealand), skewing towards the ‘Swedish’ cluster with respect to early and more western Danish VA individuals (Jutland).

Some individuals have strong affinity with Eastern Europeans, particularly those from the island of Gotland in eastern Sweden. The latter likely reflects individuals with Baltic ancestry, as clustering with Baltic BA individuals is evident in the IBS-UMAP analysis and through f4-statistics.

Natural neighbor interpolation of “Norwegian ancestry” among Vikings.

For more on this influx of “eastern” ancestry see my previous posts (including Viking samples from Sigtuna) on Genetic and linguistic continuity in the East Baltic, and on the Pre-Proto-Germanic homeland based on hydrotoponymy.

Baltic ancestry in Gotland

Genetic clustering using IBS-UMAP suggested genetic affinities of some Viking Age individuals with Bronze Age individuals from the Baltic. To further test these, we quantified excess allele sharing of Viking Age individuals with Baltic BA compared to early Viking Age individuals from Salme using f4 statistics. We find that many individuals from the island of Gotland share a significant excess of alleles with Baltic BA, consistent with other evidence of this site being a trading post with contacts across the Baltic Sea.
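The logic of an excess allele sharing test can be sketched from the standard definition of the statistic, f4(A, B; C, D) = mean over SNPs of (a − b)(c − d), where lowercase letters are allele frequencies. The configuration f4(Outgroup, Baltic BA; Salme, Target) used below is an assumption for illustration (the excerpt does not spell out the exact arrangement), and all frequencies are hypothetical. Real analyses also attach a block-jackknife standard error to obtain a Z-score, which is omitted here.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies, one per SNP,
    in the same SNP order for all four populations.
    """
    if not a or not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("need four non-empty frequency lists of equal length")
    return sum(
        (pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d)
    ) / len(a)

# Hypothetical allele frequencies at four SNPs.
outgroup  = [0.5, 0.5, 0.5, 0.5]
baltic_ba = [0.9, 0.1, 0.8, 0.2]
salme     = [0.5, 0.5, 0.5, 0.5]
gotland   = [0.7, 0.3, 0.6, 0.4]   # shifted towards Baltic BA at every SNP

# Assumed configuration: f4(Outgroup, Baltic BA; Salme, Target).
# A positive value means the target shares more alleles with Baltic BA
# than the Salme reference does.
stat_gotland = f4(outgroup, baltic_ba, salme, gotland)
stat_salme = f4(outgroup, baltic_ba, salme, salme)
print(stat_gotland, stat_salme)
```

With these toy frequencies the statistic is positive for the Gotland-like target and zero when the target is identical to the Salme reference, mirroring the contrast described in the excerpt.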

Natural neighbor interpolation of “Finnish ancestry” among Vikings.

The earliest N1a-VL29 sample available comes from Iron Age Gotland (VK579) ca. AD 200-400 (see Iron Age Y-DNA maps), which also proves its presence in the western Baltic before the Viking expansion. The distribution of N1a-VL29 and R1a-Z280 (compared to R1a in general) among Vikings also supports a likely expansion of both lineages in succeeding waves from the east with Akozino warrior-traders, at the same time as they expanded into the Gulf of Finland.

Density of haplogroup R1a-Z280 (samples in pink) overlaid over other R1a samples (in green, with R1a-Z284 in cyan) among Vikings.

Vikings in Estonia

(…) only one Viking raiding or diplomatic expedition has left direct archaeological traces, at Salme in Estonia, where 41 Swedish Vikings who died violently were buried in two boats accompanied by high-status weaponry. Importantly, the Salme boat-burial predates the first textually documented raid (in Lindisfarne in 793) by nearly half a century. Comparing the genomes of 34 individuals from the Salme burial using kinship analyses, we find that these elite warriors included four brothers buried side by side and a 3rd degree relative of one of the four brothers. In addition, members of the Salme group had very similar ancestry profiles, in comparison to the profiles of other Viking burials. This suggests that this raid was conducted by genetically homogeneous people of high status, including close kin. Isotope analyses indicate that the crew descended from the Mälaren area in Eastern Sweden thus confirming that the Baltic-Mid-Swedish interaction took place early in the VA.

Natural neighbor interpolation of “Swedish ancestry” among Vikings.

Viking samples from Estonia thus show ancient Swedes from the Mälaren area, which proves once again that hg. N1a-VL29 (especially subclade N1a-L550) and tiny proportions of so-called “Siberian ancestry” expanded during the Early Iron Age into the whole Baltic Sea area, not only into Estonia, and evidently not spreading with Balto-Finnic languages (since the language influence runs in the opposite direction, west to east, Germanic > Finno-Samic, during the Bronze Age).

N1a-VL29 lineages spread again later eastwards with Varangians, from Sweden into north-eastern Europe, most likely including the ancestors of the Rurikid dynasty. Unsurprisingly, the arrival of Vikings with Swedish ancestry into the East Baltic and their dispersal through the forest zone didn’t cause a language shift of Balto-Finnic, Mordvinic, or East Slavic speakers to Old Norse, either…

NOTE. For N1a-Y4339 – N1a-L550 subclade of Swedish origin – as main haplogroup of modern descendants of Rurikid princes, see Volkov & Seslavin (2019) – full text in comments below. Data from ancient samples show varied paternal lineages even among early rulers traditionally linked to Rurik’s line, which explains some of the discrepancies found among modern descendants:

  • A sample from Chernihiv (VK542) potentially belonging to Gleb Svyatoslavich, the 11th century prince of Tmutarakan/Novgorod, belongs to hg. I2a-Y3120 (a subclade of early Slavic I2a-CTS10228) and has 71% “Modern Polish” ancestry (see below).
  • Izyaslav Ingvarevych, the 13th century prince of Dorogobuzh, Principality of Volhynia/Galicia, is probably behind a sample from Lutsk (VK541), and belongs to hg. R1a-L1029 (a subclade of R1a-M458), showing ca. 95% of “Modern Polish” ancestry.
  • Yaroslav Osmomysl, the 12th century Prince of Halych (now in Western Ukraine), was probably of hg. E1b-V13, yet another clearly early Slavic haplogroup.

Density of haplogroup N1a-VL29, N1a-L550 (samples in pink, most not visible) among Vikings. Samples of hg. R1b in blue, hg. R1a in green, hg. I in orange.

Finnish ancestry

Firstly, modern Finnish individuals are not like ancient Finnish individuals, modern individuals have ancestry of a population not in the reference; most likely Steppe/Russian ancestry, as Chinese are in the reference and do not share this direction. Ancient Swedes and Norwegians are more extreme than modern individuals in PC2 and 4. Ancient UK individuals were more extreme than Modern UK individuals in PC3 and 4. Ancient Danish individuals look rather similar to modern individuals from all over Scandinavia. By using a supervised ancient panel, we have removed recent drift from the signal, which would have affected modern Scandinavians and Finnish populations especially. This is in general a desirable feature but it is important to check that it has not affected inference.

PCA of the ancient and modern samples using the ancient palette, showing different PCs. Modern individuals are grey and the K=7 ancient panel surrogate populations are shown in strong colors, whilst the remaining M-K=7 ancient populations are shown in faded colors.

The story for Modern-vs-ancient Finnish ancestry is consistent, with ancient Finns looking much less extreme than the moderns. Conversely, ancient Norwegians look like less-drifted modern Norwegians; the Danish admixture seen through the use of ancient DNA is hard to detect because of the extreme drift within Norway that has occurred since the admixture event. PC4 vs PC5 is the most important plot for the ancient DNA story: Sweden and the UK (along with Poland, Italy and to an extent also Norway) are visibly extremes of a distribution the same “genes-mirror-geography” that was seen in the Ancient-palette analysis. PC1 vs PC2 tells the same story – and stronger, since this is a high variance-explained PC – for the UK, Poland and Italy.

Uniform manifold approximation and projection (UMAP) analysis of the VA and other ancient samples.

Evidence for Pictish Genomes

The four ancient genomes of Orkney individuals with little Scandinavian ancestry may be the first ones of Pictish people published to date. Yet a similar (>80% “UK” ancestry) individual was found in Ireland (VK545) and five in Scandinavia, implying that Pictish populations were integrated into Scandinavian culture by the Viking Age.

Our interpretation for the Orkney samples can be summarised as follows. Firstly, they represent “native British” ancestry, rather than an unusual type of Scandinavian ancestry. Secondly, that this “British” ancestry was found in Britain before the Anglo-Saxon migrations. Finally, that in Orkney, these individuals would have descended from Pictish populations.

Natural neighbor interpolation of “British ancestry” among Vikings.

(…) ‘UK’ represents a group from which modern British and Irish people all receive an ancestry component. This information together implies that within the sampling frame of our data, they are proxying the ‘Briton’ component in UK ancestry; that is, a pre-Roman genetic component present across the UK. Given they were found in Orkney, this makes it very likely that they were descended from a Pictish population.

Modern genetic variation within the UK sees variation between ‘native Briton’ populations Wales, Scotland, Cornwall and Ireland as large compared to that within the more ‘Anglo-Saxon’ English. This is despite subsequent gene flow into those populations from English-like populations. We have not attempted to disentangle modern genetic drift from historically distinct populations. Roman-era period people in England, Wales, Ireland and Scotland may not have been genetically close to these Orkney individuals, but our results show that they have a shared genetic component as they represent the same direction of variation.

Density of haplogroup R1b-L21 (samples in red), overlaid over all samples of hg. R1b among Vikings (R1b-U106 in green, other R1b-L151 in deep red). To these samples one may add the one from Janakkala in south-western Finland (AD ca. 1300), of hg. R1b-L21, possibly related to these population movements.

For more on Gaelic ancestry and lineages likely representing slaves among early Icelanders, see Ebenesersdóttir et al. (2018).


As in the case of mitochondrial DNA, the overall distribution profile of the Y chromosomal haplogroups in the Viking Age samples was similar to that of the modern North European populations. The most frequently encountered male lineages were the haplogroups I1, R1b and R1a.

Haplogroup I (I1, I2)

The distribution of I1 in southern Scandinavia, including a sample from Zealand (VK532) ca. AD 100 (see Iron Age Y-DNA maps), proves that it had become integrated into the West Germanic population already before their expansions, something that we already suspected thanks to the sampling of Germanic tribes.

Density of haplogroup I (samples in orange) among Vikings. Samples of hg. R1b in blue, hg. R1a in green, N1a in pink.
Density of haplogroup I1 (samples in red) overlaid over all samples of hg. I among Vikings.

Haplogroup R1b (M269, U106, P312)

Especially interesting is the finding of R1b-L151 widely distributed in the historical Nordic Bronze Age region, which is in line with the estimated TMRCA for R1b-P312 subclades found in Scandinavia, despite the known bottleneck among Germanic peoples under U106. Particularly telling in this regard is the finding of rare haplogroups R1b-DF19, R1b-L238, or R1b-S1194. All of that points to the impact of Bell Beaker-derived peoples during the Dagger period, when Pre-Proto-Germanic expanded into Scandinavia.

Also interesting is the finding of hg. R1b-P297 in Troms, Norway (VK531) ca. 2400 BC. R1b-P297 subclades might have expanded to the north through Finland with post-Swiderian Mesolithic groups (read more about Scandinavian hunter-gatherers), and the ancestry of this sample points to that origin.

However, it is also known that ancestry might change within a few generations of admixture, and that the transformation brought about by Bell Beakers with the Dagger Period probably reached Troms, so this could also be an R1b-M269 subclade. In fact, the few available data from this sample show that it comes from the natural harbour Skarsvågen at the NW end of the island of Senja, and that the archaeologist responsible thought it was from the Viking period or slightly earlier, based on the grave form. From Prescott (2017):

In 1995, Prescott and Walderhaug tentatively argued that a dramatic transformation took place in Norway around the Late Neolithic (2350 BCE), and that the swift nature of this transition was tied to the initial Indo-Europeanization of southern and coastal Norway, at least to Trøndelag and perhaps as far north as Troms. (…)

The Bell Beaker/early Late Neolithic, however, represents a source and beginning of these institutions and practices, exhibits continuity to the following metal age periods and integrated most of Northern Europe’s Nordic region into a set of interaction fields. This happened around 2400 BCE, at the MNB to LN transition.

NOTE. This particular sample is not included in the maps of Viking haplogroups.

Density of haplogroup R1b (samples in blue) among Vikings. Samples of hg. I in orange, hg. R1a in green, N1a in pink.
Density of haplogroup R1b-U106 (samples in green) overlaid over all samples of hg. R1b (other R1b-L23 samples in red) among Vikings.
Density of R1b-L151 (xR1b-U106) (samples in deep red) overlaid over all samples of hg. R1b (R1b-U106 in green, other R1b-M269 in blue) among Vikings.

Haplogroup R1a (M417, Z284)

The distribution of hg. R1a-M417, in combination with data on West Germanic peoples, shows that it was mostly limited to Scandinavia, similar to the distribution of I1. In fact, taking into account the distribution of R1a-Z284 in particular, it seems even more isolated, which is compatible with the limited impact of Corded Ware in Denmark or the Northern European Plain, and the likely origin of R1a-Z284 in the expansion with Battle Axe from the Gulf of Finland. The distribution of R1a-Z280 (see map above) is particularly telling, with a distribution around the Baltic Sea mostly coincident with that of N1a.

Density of haplogroup R1a (samples in green) among Vikings. Samples of hg. R1b in blue, of hg. I in orange, N1a in pink.
Density of haplogroup R1a-Z284 (samples in cyan) overlaid over all samples of hg. R1a (in green, with R1a-Z280 in pink) among Vikings.

Other haplogroups

Among the ancient samples, two individuals were identified as carrying derived haplogroups under E1b1b1-M35.1, which are frequently encountered in modern southern Europe, the Middle East and North Africa. Interestingly, the individuals carrying these haplogroups had much less Scandinavian ancestry compared to most samples, as inferred from haplotype-based analysis. A similar pattern was also observed for less frequent haplogroups in our ancient dataset, such as G (n=3), J (n=3) and T (n=2), indicating a possible non-Scandinavian male genetic component in Viking Age Northern Europe. Interestingly, individuals carrying these haplogroups were from the later Viking Age (10th century and younger), which might indicate some male gene influx into the Viking population during the Viking period.

Natural neighbor interpolation of “Italian ancestry” among Vikings.

As the paper says, the small sample size of rare haplogroups makes it impossible to tell whether these differences are statistically significant. Nevertheless, both E1b samples have substantial Modern Polish-like ancestry: one sample from Gotland (VK474), of hg. E1b-L791, has ca. 99% “Polish” ancestry, while the other one from Denmark (VK362), of hg. E1b-V13, has ca. 35% “Polish”, ca. 35% “Italian”, as well as some “Danish” (14%) and minor “British” and “Finnish” ancestry.

Given the E1b-V13 samples of likely Central-East European origin among Lombards, Visigoths, and especially among Early Slavs, and the distribution of “Polish” ancestry among Viking samples, VK362 is probably a close description of the typical ancestry of early Slavs. The peak of Modern Polish-like ancestry around the Upper Pripyat during the (late) Viking Age suggests that Poles (like East Slavs) have probably mixed since the 10th century with more eastern peoples close to north-eastern Europeans, derived from ancient Finno-Ugrians:

Natural neighbor interpolation of “Polish ancestry” among Vikings.

Similarly, the finding of R1a-M458 among Vikings in Funen, Denmark (VK139), in Lutsk, Ukraine (VK541), and in Kurevanikha, Russia (VK160), apart from the early Slav from Usedom, may attest to the origin of the spread of this haplogroup in the western Baltic after the Bell Beaker expansion, once integrated in both Germanic and Balto-Slavic populations, as well as in intermediate Bronze Age peoples that were eventually absorbed by their expansions. This contradicts, again, my simplistic initial assessment of R1a-M458 expansion as linked exclusively (or even mainly) to Balto-Slavs.

Y-DNA haplogroups in Europe during Antiquity (full map). See other maps of cultures and ancient DNA from Antiquity.


Balto-Slavic accentual mobility: an innovation in contact with Balto-Finnic


Some very specific prosodic innovations affected the Balto-Slavic linguistic community, probably at a time when it already showed internal dialectal differences. Whether those innovations were related to archaic remnants stemming from the parent Proto-Indo-European language, and whether that disintegrating community included different dialects, remains an object of active debate.

“Archaic” Balto-Slavic?

The main question about Balto-Slavic is whether this concept represents a single community or rather a continuum formed by two (Baltic and Slavic) or possibly three (East Baltic, West Baltic, Slavic) neighbouring communities, speaking closely related Northern European dialects, which just happened to evolve very close to each other, i.e. in cultures that were closer to each other than they were to Germanic or Balto-Finnic.

In my opinion, their similarities warrant the reconstruction of a single original central-east European community since the dissolution of Bell Beakers, speaking a North-West Indo-European dialect, and most internal differences between Baltic and Slavic may be explained as innovations. The precise identification of a Proto-Balto-Slavic community remains elusive, although the Unetice-Iwno-Mierzanowice triangle remains the best bet, with Trzciniec showing what seems like an Early Slavic-like population reaching up to the East Baltic.

Bell Beaker expansion in eastern Europe and around the Baltic.

The reconstruction of a common Balto-Slavic proto-language is known to range from difficult to impossible, depending on who you ask, not least because of the differences that are discussed in this post, which have been a battlefield of their own making for Balticists and Slavicists for decades. The old tenet that Balto-Slavic had inherited some traits directly from PIE is – in contrast with e.g. the Italo-Celtic concept – surprisingly alive still today.

Take, for example, these internal differences and supposedly archaic traits:

  • The ruKi rule, where Baltic shows mostly *is, *us, and Slavic shows *, *; or the different output of Satemization in Baltic compared to Slavic (and both compared to Indo-Iranian). Nevertheless, the Satemization trends in Balto-Slavic and Indo-Iranian are usually explained together and taken as a sign of a traditional three-velar system for PIE.
    • If you consider Satemization as a late trend in Balto-Slavic, affecting each dialect in a different way, and thus Balto-Slavic phonetic evolution as clearly distinct from the Indo-Iranian trend, rejecting tritectalism, this problem is solved. This would also solve the impossible Indo-Slavonic problem, and the paradox of Balto-Slavic sharing a genetic phylum with Germanic and Italo-Celtic.
    • If you, however, conflate these differences and North-West Indo-European features with an ad hoc explanation of a hypothetical Centum dialect called Temematic, which intends to solve their (in Holzer’s words) unlösbaren (“unsolvable”) inconsistencies, you essentially add a whole new inconsistency without solving the previous ones. For a full rebuttal of Holzer’s Temematic etymologies, see Matasović (2014).
  • Kortlandt’s reconstruction of a PIE 3rd singular *-e (Baltic from *-et, Slavic from *-eti) and 3rd plural *-o, which would have been replaced independently in other Indo-European dialects (by *-eti, *-onti), is reminiscent of his own reconstruction of laryngeals almost up to the attestation of all Indo-European dialects, including Baltic. If you consider these traits an innovation, this artificially created problem is immediately solved.
  • Genitive plural Pre-Baltic *-ōm vs. Pre-Slavic *-ŏm is another commonly cited example. However, I would place this difference among other similar differences found within other related IE dialects, hence a common phonetic innovation (see e.g. below for the classicist view of unstable obliques).
  • Kortlandt’s reconstruction of oblique cases in *-m-, shared with Germanic, as stemming from a common Middle PIE *-mus (based essentially on Old Lithuanian *-mus and on a non-existent equivalent Anatolian formation), hence different from those in *-bʰ-. While you can argue for countless more reasonable alternatives, the most often cited one takes the ins.-dat. pl. *-bʰ- as a common NWIE innovation based on ins. sg. *bʰi-, and the forms in *-m- (including ins. sg.) as a Northern European phonetic innovation. The simplest, most elegant explanation I’ve read to date (I think by Rémy Viredaz) is the similar bilabial change of Giacobo/Giacomo in Italian…

As you can see, some Balto-Slavicists could have written whole books about how their object of study holds the key to solve problems on common Proto-Indo-European paradigms, some of which wouldn’t need solving if they hadn’t been started by Balto-Slavicists themselves…

While all of these “archaic” traits are easily dismissed without further ado (except for some understandable damaged pride among academics), there is one especially pervasive idea among those willing to find the white whale of laryngeal remnants in Indo-European languages (see here for other examples of dubious laryngeal remains).

The prophecy before the battle, Józef Ryszkiewicz, 1890. Or, how to conjure laryngeal remnants in Balto-Slavic.

Accentual development in contact

Whichever position one prefers, the general argument is that the Balto-Slavic accentual system is non-trivial for the classification of both dialects into a common branch. However, that would only be completely true if it were a common innovation, but not so much if it were a natural laryngeal evolution.

In fact, the broken tone preserving a PIE laryngeal, as proposed by Kortlandt – continuing Meillet’s idea of synchronous PIE-PBS developments – was always very difficult to accept. Even the rising pronunciation is not original, and represents a shift of the accent on the initial syllable in Latvian…

In my opinion, the derivation of a modern phenomenon from a PIE laryngeal must always raise a red flag (see below on archaisms vs. innovations in IE languages). As you can see from my take of the fable in Balto-Slavic, which uses Kortlandt’s reconstruction, I preferred not to take into account the reconstructed accents. The fable remains thus a model of what could have been a common Proto-Balto-Slavic, unlike other reconstructions, which are much less tentative.

NOTE. You could argue that accents may be reconstructed in spite of the wrong theory behind them, but this is not true; at least not of all reconstructed accents, some of which require further assumptions. Think about it this way: I wouldn’t take into account a reconstruction of Germanic accent which used Danish glottalized tone for a hypothetical Proto-Germanic laryngeal, even if most accents seemed correct at first sight. The truth is, I didn’t want to dedicate time to go through each reconstructed word and its explanation, so it was easier to delete them all, even though that’s not an actual solution, either. You will find the same doubts in the description of Balto-Slavic evolution in my old Modern Indo-European grammar. The introduction to IE dialects was partially copied from Wikipedia (which, in the case of Balto-Slavic, essentially summarized data from Kortlandt), but in the grammar I just tried to keep the basics, and not very successfully, because you need a comprehensive and coherent description of a language’s evolution. That’s how messed up the question was, and how it still is, even though 15 years of research have passed…

Despite the idea of an “archaic Balto-Slavic”, especially prevalent among older researchers, the current trend is to consider Balto-Slavic prosodic changes as a natural innovation, even among those who would artificially reconstruct laryngeal remnants up to late Balto-Slavic stages.

NOTE. You can read more about the Proto-Indo-European laryngeal loss and vocalism. While the presence of certain laryngeals up to Late PIE is certain, the loss in many environments is also generally agreed upon. This is especially true of a hypothetical Indo-Slavonic branch, like that supported by Kortlandt: even those supporting multiple laryngeal loss events must admit that Indo-Iranian showed no laryngeals before its disintegration, whether they put this loss as an internal Proto-Indo-Iranian evolution, or they place it earlier. Tocharian attests to an evolution similar to the rest of Late PIE dialects (hence to a quite early laryngeal loss trend), and Balkan dialects (supposedly splitting before Indo-Slavonic) also lost laryngeals in a similar way, except for initial ones, which show vocalic output instead of full loss.

So, where does a laryngeal loss fit in this “Indo-Slavonic” scheme, exactly? Before the Tocharian split? Before the Balkan split? After the Balkan split but before the full loss in Indo-Iranian? And where exactly does this group belong regarding Corded Ware, and where does Germanic? No idea (but you can read Kortlandt try fitting his model with Gimbutas’ “Kurgan peoples”). Because it is one thing to reconstruct Proto-Greek, or Proto-Celtic, or Proto-Italic forms without laryngeals and to put them in relation with a purely theoretical three-laryngeal PIE, and quite another to reconstruct laryngeals (including in environments where they were already lost in Tocharian) up to Proto-Baltic and Proto-Slavic, which seems more than just a bit of a stretch…

Indo-European dialectal relationships, from Mallory and Adams (2006).

Thomas Olander offered a summary of the current positions regarding the Balto-Slavic accentual system recently in Indo-European heritage in the Balto-Slavic accentuation system (2013), which also contains a summary of his Mobility Law, to explain this phenomenon as a common Pre-Baltic and Pre-Slavic innovation.

Andersen, an advocate of different Baltic and Slavic dialects developing in contact with Satem dialects, suggested in The Satem Languages of the Indo-European Northwest. First Contacts? (2009), partially based on Olander’s initial proposal, that Baltic and Slavic accentual mobility arose as a result of contact with languages with fixed word-initial ictus: the accent was lost in the word-final mora in pre-Proto-Baltic and, independently, in pre-Proto-Slavic. Hence, the central innovation, the accent loss

technically is not a shared Slavic and Baltic innovation. On the contrary. It shows that the speakers of the Pre-Slavic and Pre-Baltic dialects formed bilingual communities with speakers of contact dialects that were of the same prosodic type, viz. had fixed initial ictus but no free accent.

In the meantime, Olander (2019) has found out about more real-world examples of this same phenomenon:

Prosodic features are known to be susceptible to contact influence (Salmons 1992: 1 and passim). While it does not directly influence the evaluation of the Mobility Law as a non-trivial innovation, it is interesting that most of the alleged parallels are indeed considered to be contact-induced changes due to influence from languages with an ictus on the word-initial syllable (Andersen 2009: 11-14; Rinkevičius 2013): Balto-Fennic in the case of the Karelian and (perhaps through Latvian as an intermediary) Žemaitian dialects, and Hungarian in the case of the Slavonian dialects (for Karelian see Jakobson 1938/2002: 239; Veenker 1967: 74; Thomason & Kaufman 1988: 122, 241; Salmons 1992: 41-42; for Žemaitian see Zinkevičius 1966: 45-46; for Slavonian see Ivić 1958: 287).

I am not aware of any hypotheses on a contact-induced origin for Greek prosodic innovations, but it is at least worth noting that there is agreement on significant substrate influence on Greek. While we may speculate that these substrate language(s) had word-initial ictus like Balto-Fennic and Hungarian, we do not have any actual information about the prosodic system(s) (thus even Beekes 2014: 9, who in other respects provides a fairly detailed picture of the substrate).

The parallels from other speech varieties show that an accent loss of the type suggested for a pre-stage of Baltic and Slavic is a type of prosodic change that has occurred several times in different various systems. In the context of the present paper this means that the sound law itself cannot be classified as a non-trivial innovation; it may have taken place in already differentiated dialects or languages. Also, the parallels suggest that a loss of the accent may be the result of influence from languages with fixed word-initial ictus.

In this time when even linguists agree that substrate/contact languages have to be related to specific ethnolinguistic groups (see here for Germanic), the fact that Olander stops short of naming this substrate behind Pre-Baltic and Pre-Slavic as being Late Uralic in general, or Balto-Finnic in particular, is surprising.

NOTE. Not least because Olander is part of the Homeland Timeline map project of the Copenhagen group (their website is not working right now), and they placed Volosovo as Uralians expanding with Netted Ware in contact with the Baltic during the Bronze Age… So what’s to doubt about Balto-Slavic – Balto-Finnic contacts, exactly? Maybe if Balto-Finnic was the substrate language behind Balto-Slavic (as it was in Germanic), it would mean that Uralic languages were previously spoken in territories that later became Germanic- and Balto-Slavic-speaking?

Still image from the Copenhagen Timeline Map (accessed one year ago), showing in green Volosovo hunter-gatherers who, according to the map, later expand to the north-east with Netted Ware…

Archaism vs. Innovation

If we tried to describe these trends of explaining peculiar traits in recent Indo-European dialects as archaism vs. innovation from a purely theoretical point of view, we could roughly distinguish two different positions (with infinite variants, of course) among academics – just like we could find people more inclined to leftist or rightist trends when speaking about economy. When it comes to linguistics, which is the least messed-up field where one can describe Indo-European and Indo-Europeans, I think we can find two alternative basic tenets:

  • One idea would hold that the oldest attested dialects – and those with an older guesstimated proto-language – are the gold standard as to what the original situation may have been, and about what could be described as an archaism. For example, Ancient Greek and Mycenaean or Vedic Sanskrit for old dialects; Tocharian, or Italic dialects for those with quite old guesstimates, each for different reasons; and Anatolian for both, old dialect and attested early.
  • NOTE. Nevertheless, the phonology of Anatolian inscriptions is often difficult to ascertain, and its ancient dialectal nature stemming from a Middle PIE stage may still be disputed by some. The archaic nature of Tocharian seems to be maybe less generally accepted than that of Anatolian, but I would say there is general consensus on the matter today.

  • The other general idea would support that the most isolated dialects are those which may hold the key to the oldest Indo-European traits, somehow hidden from external influences and areal contacts, and thus from generalized innovative trends that have affected the best known ancient dialects. In that sense, languages like Slavic, Baltic, Albanian, or Armenian – as well as some Balkan fragmentary dialects – are quite common aims of study to reveal exceptional PIE traits.

I think the education system in Southern Europe and South Asia is that of formal classicists. In eastern Europe, I’d reckon the education system – especially in regions that were never connected to the Graeco-Roman tradition – favours linguistics as the study of one’s own and related proto-languages. For northern Europe, I would say it’s 50/50, especially in Scandinavia, depending on whether classicists or linguists dominate the departments of Indo-European. For example, while Germany or Austria would maybe lean more toward the classics, Copenhagen’s obsession with Germanic as the most archaic IE branch is well known…

A 17th-century birch bark manuscript of Pāṇini’s grammar treatise from Kashmir. Image from Wikipedia.

Both positions, when blindly accepted, are bound to fail at some point or another:

  • If you take Classical Sanskrit, Classical Greek, or Classical Latin as an example of Proto-Indo-European, you are bound to make radical mistakes when reconstructing the parent language, more so if you disregard the oldest attested layers of the languages. An interesting view of the so-called Adradists at the Complutense University of Madrid – apart from their famous 9-laryngeal reconstruction – is that Middle PIE had only 5 cases, with a general (unstable) oblique one in Late PIE that later evolved into the attested 5 to 8 cases in the different dialects. That is, in my opinion, a fairly typical classicist error, which would be easily addressed by taking into account the oldest stages, like those attested in Mycenaean and in Old Latin, instead of focusing on classical grammar. The 8-case system is, in fact, one of the few true Balto-Slavic archaisms, supported by external comparanda.
  • On the other hand, if you take Albanian, Armenian, Baltic or Slavic, or even phonetically dubious data like those from some Anatolian inscriptions, you can eventually argue for anything. And I really mean anything; you are leaving the logic door wide open for any crazy-ass opinion about Proto-Indo-European based on traits found in modern languages: From how many velars evolved (if at all, because you may find all of them in Luwian, or still living in Albanian or in Armenian…) and their nature as ejective consonants in Late PIE (based on Armenian or Germanic); to how many laryngeals and when these laryngeals disappeared (if they actually did disappear, because some may even find them in Modern Lithuanian, in Armenian, or in Danish…); etc. Once you believe your own romantic view of some modern language(s) retaining traits from five thousand years ago, there is no stopping that; not for you, but not for anyone else, either.

NOTE. One of the funniest consequences of this type of ‘worldview’, where one assumes that (one’s own interpretations of) modern dialects are as reliable as ancient ones, or even more so, and that Indo-European dialects somehow split at the same time from the parent language (so there was one common “full laryngeal” language, and then all attested dialects evolved from it), is the kind of theory that you can easily find posted on Facebook’s group on Proto-Indo-European. Let’s just say, for the sake of simplicity, that you can compare English ‘sunrise’ with Spanish ‘sonrisa’ “smile” all you want, and assert that both reveal a common origin in PIE *sup- hence from the Sun and the smile going “up” or something, but no explanation of how you reached that conclusion makes up for the fact that this comparison shouldn’t have been started at all. Now replace English and Spanish with Armenian, Slavic, and/or Albanian, invent some new IE sound law, throw one or two laryngeals in the mix, and somehow this might get a pass among certain linguists…

The Celebration of Svetovid on Rügen, Alphonse Mucha, The Slav Epic. Image from Wikipedia. Were Early Slavs some among a selected few romantic peoples to keep the “true” Indo-European language and traditions? Of course not.

While no one can deny the value of different Indo-European branches for the reconstruction of the parent language, no matter how recently they were attested, the only reasonable solution whenever a difficult case arises is to trust ancient dialects more than recent ones. Using data from fringe theories based on recent dialects to build a Proto-Indo-European paradigm, especially when there is contradictory data from ancient IE dialects, is flawed for two reasons:

  1. Languages attested later – especially after periods of population movements and contacts – would show, in general, a greater degree of change. Preferring Old Slavic or Classical Armenian to reconstruct Indo-European over ancient dialects like Ancient Greek, Vedic Sanskrit, or ancient Italic dialects is, in a way, like taking Byzantine Greek, Pali, or Old French as models, respectively.
  2. Classical languages are indeed modified due to the action of grammarians, but once standardized these “languages behind a state” (or religion) are less prone to change, due to the transmission of oral (and written) literature, education, commerce, etc. Languages left to unorganized tribes are less constrained in their evolution, and their internal (substrate) and external (contact) influences are greater and (what’s worse) unknown.

Baltic and Slavic, like Albanian or Armenian, are dialects attested very recently, which may have undergone complex internal and external influences we may never fully understand. Confronted with controversial or inexplicable traits compared to ancient branches like Greek, Indo-Iranian, or Italo-Celtic (especially if they fit with other Indo-European dialects), the conservative solution that will be right most of the time (and I mean 99.9999% of cases) is to assume they represent an innovation over Late PIE.

The fact that some researchers still use these recent dialects as a blank canvas instead, in order to propose unending new ideas about how to reconstruct IE proto-languages, or even older common PIE stages, is shocking. Not “R1a/Steppe” vs. “N1c/Siberian” haplogroup+ancestry bullshit-level shocking, but still unacceptable in a serious academic environment.

The only reason why Balto-Slavicists have failed so many times in this “unsolvable” question that seems to be Proto-Balto-Slavic reconstruction, apart from the known differences between Baltic and Slavic, is precisely the fixation of many with their object of study as a model for other IE languages (and thus for PIE), instead of taking the rest as a model for the reconstruction of Balto-Slavic (or of Proto-Baltic and Proto-Slavic).

Repeating ad nauseam the popular concept of Balto-Slavic (or Baltic and Slavic) being among the most archaic IE dialects, or the slowest evolving IE dialects, and cheap nationalist slogans of the sort, does not help this aim, and just reading or hearing that should make anyone cringe instantly. Not less than reading or hearing about Sanskrit being essentially equal to PIE, or spoken in the Indus Valley 10,000 years ago. Because we are not living in the 19th century, mind you.


The cradle of Russians, an obvious Finno-Volgaic genetic hotspot


First look of an accepted manuscript (behind paywall), Genome-wide sequence analyses of ethnic populations across Russia, by Zhernakova et al. Genomics (2019).

Interesting excerpts:

There remain ongoing discussions about the origins of the ethnic Russian population. The ancestors of ethnic Russians were among the Slavic tribes that separated from the early Indo-European Group, which included ancestors of modern Slavic, Germanic and Baltic speakers, who appeared in the northeastern part of Europe ca. 1,500 years ago. Slavs were found in the central part of Eastern Europe, where they came in direct contact with (and likely assimilation of) the populations speaking Uralic (Volga-Finnish and Baltic-Finnish), and also Baltic languages [11–13]. In the following centuries, Slavs interacted with the Iranian-Persian, Turkic and Scandinavian peoples, all of which in succession may have contributed to the current pattern of genome diversity across the different parts of Russia. At the end of the Middle Ages and in the early modern period, there occurred a division of the East Slavic unity into Russians, Ukrainians and Belarusians. It was the Russians who drove the colonization movement to the East, although other Slavic, Turkic and Finnish peoples took part in this movement, as the eastward migrations brought them to the Ural Mountains and further into Siberia, the Far East, and Alaska. During that interval, the Russians encountered the Finns, Ugrians, and Samoyeds speakers in the Urals, but also the Turkic, Mongolian and Tungus speakers of Siberia. Finally, in the great expanse between the Altai Mountains on the border with Mongolia, and the Bering Strait, they encountered paleo-Asiatic groups that may be genetically closest to the ancestors of the Native Americans. Today’s complex patchwork of human diversity in Russia has continued to be augmented by modern migrations from the Caucasus, and from Central Asia, as modern economic migrations take shape.

Sample relatedness based on genotype data. Eurasia: Principal Component plot of 574 modern Russian genomes. Colors reflect geographical regions of collection; shapes reflect the sample source. Red circles show the location of Genome Russia samples.

In the current study, we annotated whole genome sequences of individuals currently living on the territory of Russia and identifying themselves as ethnic Russian or as members of a named ethnic minority (Fig. 1). We analyzed genetic variation in three modern populations of Russia (ethnic Russians from Pskov and Novgorod regions and ethnic Yakut from the Sakha Republic), and compared them to the recently released genome sequences collected from 52 indigenous Russian populations. The incidence of function-altering mutations was explored by identifying known variants and novel variants and their allele frequencies relative to variation in adjacent European, East Asian and South Asian populations. Genomic variation was further used to estimate genetic distance and relationships, historic gene flow and barriers to gene flow, the extent of population admixture, historic population contractions, and linkage disequilibrium patterns. Lastly, we present demographic models estimating historic founder events within Russia, and a preliminary HapMap of ethnic Russians from the European part of Russia and Yakuts from eastern Siberia.

Sample relatedness based on genotype data. Western Russia and neighboring countries: Principal Component plot of 574 modern Russian genomes. Colors reflect geographical regions of collection; shapes reflect the sample source. Red circles show the location of Genome Russia samples.

The collection of identified SNPs was used to inspect quantitative distinctions among 264 individuals from across Eurasia (Fig. 1) using Principal Component Analysis (PCA) (Fig. 2). The first and the second eigenvectors of the PCA plot are associated with longitude and latitude, respectively, of the sample locations and accurately separate Eurasian populations according to geographic origin. East European samples cluster near Pskov and Novgorod samples, which fall between northern Russians, Finno-Ugric peoples (Karelian, Finns, Veps etc.), and other Northeastern European peoples (Swedes, Central Russians, Estonian, Latvians, Lithuanians, and Ukrainians) (Fig. 2b). Yakut individuals map into the Siberian sample cluster as expected (Fig. 2a). To obtain an extended view of population relationships, we performed a maximum likelihood-based estimation of ancestry and population structure using ADMIXTURE [46](Fig. 2c). The Novgorod and Pskov populations show similar profiles with their Northeastern European ancestors while the Yakut ethnic group showed mixed ancestry similar to the Buryat and Mongolian groups.
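
To make the PCA step in the excerpt more tangible, here is a minimal sketch in Python. It is my own toy illustration, not the pipeline used by Zhernakova et al.: the helper name `genotype_pca` is hypothetical, the genotypes are simulated rather than real samples, and the projection is done with a plain SVD of the centred allele-count matrix, without the per-SNP scaling that dedicated population genetics tools usually apply.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project individuals onto the top principal components.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    Hypothetical helper, for illustration only.
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)  # centre each SNP column
    # SVD of the centred matrix: the rows of u * s are the PC coordinates
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Simulated toy data (an assumption, not real samples): two populations
# of 20 individuals each, with independent allele frequencies at 500 SNPs.
rng = np.random.default_rng(42)
freq_a = rng.uniform(0.1, 0.9, size=500)
freq_b = rng.uniform(0.1, 0.9, size=500)
pop_a = rng.binomial(2, freq_a, size=(20, 500))
pop_b = rng.binomial(2, freq_b, size=(20, 500))

coords = genotype_pca(np.vstack([pop_a, pop_b]))
pc1_a, pc1_b = coords[:20, 0], coords[20:, 0]
print("PC1 range, population A:", pc1_a.min(), pc1_a.max())
print("PC1 range, population B:", pc1_b.min(), pc1_b.max())
```

In this toy setting the two simulated populations end up on opposite sides of the first component, which is the same logic by which the eigenvectors in the paper separate samples by geographic origin.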

Population structure across samples in 178 populations from five major geographic regions (k=5). Samples are pooled across three different studies that covered the territory of Russian Federation (Mallick et al. 2016 [36], Pagani et al. 2016 [37], this study). The optimal k-value was selected by value of cross validation error. Russian samples from all studies (highlighted in bold dark blue) show a slight gradient from Eastern European (Ukrainian, Belorussian, Polish) to North European (Estonian, Karelian, Finnish) structures, reflecting population history of northward expansion. Yakut samples from different studies (highlighted in bold red) also show a slight gradient from Mongolian to Siberian people (Evens), as expected from their original admixture and northward expansions. The samples originated from this study are highlighted, and plotted in separated boxes below.

Possible admixture sources of the Genome Russia populations were addressed more formally by calculating F3 statistics, which is an allele frequency-based measure, allowing to test if a target population can be modeled as a mixture of two source populations [48]. Results showed that Yakut individuals are best modeled as an admixture of Evens or Evenks with various European populations (Supplemental Table S4). Pskov and Novgorod showed admixture of European with Siberian or Finno-Ugric populations, with Lithuanian and Latvian populations being the dominant European sources for Pskov samples.
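
The logic of the F3 test described in the excerpt can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the function `f3` and the simulated allele frequencies are hypothetical, and the naive estimator below is just the average of (c - a)(c - b) over SNPs. Real implementations also correct for finite sample size and assess significance with a block jackknife, both of which are omitted here.

```python
import numpy as np


def f3(target, source_a, source_b):
    """Naive f3(target; A, B): mean over SNPs of (c - a) * (c - b).

    All arguments are arrays of allele frequencies, one value per SNP.
    A clearly negative value suggests that the target can be modelled
    as a mixture of populations related to the two sources.
    """
    c = np.asarray(target, dtype=float)
    a = np.asarray(source_a, dtype=float)
    b = np.asarray(source_b, dtype=float)
    return float(np.mean((c - a) * (c - b)))


# Simulated allele frequencies (an assumption, not data from the paper)
rng = np.random.default_rng(0)
source_a = rng.uniform(0.05, 0.95, size=1000)
source_b = rng.uniform(0.05, 0.95, size=1000)
mixed = 0.5 * source_a + 0.5 * source_b  # idealised 50/50 admixture

print("f3(mixed; A, B) =", f3(mixed, source_a, source_b))  # negative
print("f3(A; mixed, B) =", f3(source_a, mixed, source_b))  # positive
```

With an idealised 50/50 mixture, each term reduces to -0.25 (a - b)², so the statistic is negative when the mixed population is the target, and positive when one of the unadmixed sources is tested as the target instead.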

The heatmaps of gene flow barriers show for each point at the geographical map the interpolated differences in allele frequencies (AF) between the estimated AF at the point with AFs in the vicinity of this point. The direction of the maximal difference in allele frequencies is coded by colors and arrows.

So, Russians expanding in the Middle Ages as acculturated Finno-Volgaic peoples.

Or maybe the true Germano-Slavonic™-speaking area was in north-eastern Europe, until the recent arrival of Finno-Permians with the totally believable Nganasan-Saami horde, whereas Yamna -> Bell Beaker represented Vasconic-Caucasian expanding all over Europe in the Bronze Age. Because steppe ancestry in Fennoscandia and Modern Basques in Iberia.

A really hard choice between equally plausible models.


Consequences of O&M 2018 (III): The Balto-Slavic conundrum in Linguistics, Archaeology, and Genetics

This is part of a series of posts analyzing the findings of the recent Nature papers Olalde et al.(2018) and Mathieson et al.(2018) (abbreviated O&M 2018).

The recent publication of Narasimhan et al. (2018) has outdated the draft of this post a bit, and it has made it at the same time still more interesting.

While we wait for the publication of the dataset (and the actual Y-DNA haplogroups and precise subclades with the revision of the paper), and as we watch the wrath of Hindu nationalists vented against the West (as if the steppe was in Western Europe) and science itself, we have already seen confirmation from the Reich Lab of their new approach to Late Proto-Indo-European migrations.

Yamna/Steppe EMBA, previously identified as the direct source of “steppe” ancestry (AKA ‘Yamnaya’ ancestry) and Late Indo-European migrations in Asia – through Corded Ware, it is to be understood – has been officially changed. In the case of Indo-Iranian migrations it is the “Steppe MLBA cloud”, after a direct contribution to it of Yamna/Steppe EMBA, which expanded Indo-Iranian, as I predicted ancient DNA could support.

On Twitter, the main author responded as follows when asked about this change regarding the origin of steppe ancestry in Asian migrants (emphasis mine):

Our reasons are:

  1. The Turan samples show no elevated steppe ancestry till 2000BC.
  2. MLBA is R1a
  3. Indus periphery doesn’t have steppe ancestry but Swat does, and EMBA doesn’t work both in terms of time or genetic ancestry to explain the difference.

Image modified from Narasimhan et al. (2018), including the most likely proto-language identification of different groups. Original description: “Modeling results including Admixture events, with clines or 2-way mixtures shown in rectangles, and clouds or 3-way mixtures shown in ellipses”. Yes, this map is the latest official view on migrations from the Reich Lab now. See the original full image here.

I am glad to see it finally recognized that Y-DNA haplogroups and time have to be taken into account, and happy also to see an end to the by now obsolete ‘ADMIXTURE/PCA-only relevance’ in Human Ancestry. The timing of archaeological migrations, the cultural attribution of each sample, and the role of Y-DNA variability reduction and expansion have finally been recognized as equally important to assess potential migrations, as I requested.

This change was already in the making some months ago, when David Anthony – who has worked with the group for this paper and others before it – already changed his official view on Corded Ware – from his previous support of the 2015 model. His latest theory, which linked Yamna settlements in Hungary with a potential mixed society of migrants (of R1b-L23 and R1a-Z645 lineages) from West Yamna, is most likely wrong, too, but it was clearly a brave step forward in the right direction.

The only reasonable model now is that Yamna expanded Late Proto-Indo-European languages with steppe ancestry + R1b-L23 subclades.

You can either accept this change, or you can deny it and wait until one sample of R1a-Z645 appears in West Yamna or central Europe, or one sample of R1b-L23 appears in Corded Ware (as it is obvious it could happen), to keep spreading the wrong ideas still some more years, while the rest of the world goes on: Mallory, Anthony, and other archaeologists co-authoring the latest paper (probably part of the stronger partnership with academics that we were going to see), who had formally put forward complex, detailed theories, investing their time and name in them, have rejected their previous migration models to develop new ones based on the most recent findings. If they can do that, I am sure any amateur geneticist out there can, too.

Modified image, from Narasimhan et al. (2018). Anthony’s new model of a Yamna Hungary -> Corded Ware (Małopolska) migration arrow in red. Notice also how they keep the arrow from West Yamna to the north (in black), due probably to the Baltic Late Neolithic samples (see below).

The Balto-Slavic dialect and its homeland

An interesting question in Linguistics and Archaeology, now that Corded Ware cannot be identified as “Indo-Slavonic” or any other imaginary ancient group (like Indo-Slavo-Germanic), remains thus mostly unchanged since before the famous 2015 genetic papers:

  • Was Balto-Slavic a dialect of the expanding North-West Indo-European language, a Northern LPIE dialect, as we support, based on morphological and lexical isoglosses?
  • Or was it part of an Indo-Slavonic group in East Yamna, i.e. a Graeco-Aryan dialect, based mainly on the traditional Satem-Centum phonological division?

I am a strong supporter of Balto-Slavic being a member of a North-West Indo-European group. That’s probably because I educated myself first with the main Spanish books* on Proto-Indo-European reconstruction, and its authors kept repeating this consistent idea, but I have found no relevant data to reject it in the past 15 years.

* Today two of the three volumes are available in English, although they are from the early 1990s, hence a bit outdated. They also maintain certain peculiarities from Adrados’ own personal theories, such as multiple (coloured) laryngeals, 5 cases – with a common ancestral oblique case – for Middle PIE, etc. But they contain lots of detailed discussions on the different aspects of the reconstruction. They are not an easy introductory manual to the field, though; for that there are already many famous short handbooks out there, like those of Fortson (N.American), Beekes (Leiden), or Meier-Brügger (Germany).

Fernando and I have always maintained that North-West Indo-European must have formed a very recent community, probably connected well into the early 2nd millennium BC for certain recent isoglosses to spread among its early dialects, based on our guesstimates*, and on our belief that it formed at some point not just a dialect continuum, but probably a common language, so we estimated that the expansion was associated with the pan-European influence of Únětice and close early Bronze Age European contacts.

NOTE. I know, you must be thinking “linguistic guesstimates? Bollocks, that’s not Science”. Right? Wrong. When you learn a dozen languages from different branches, half a dozen ancient ones, and then still study some reconstructed proto-languages from them, you begin to make your own assumptions about how the language changes you perceive could have developed according to your mental time frames. If you just learned a second language and some Latin in school, and try to make assumptions as to how language changes, or you believe you can judge it with this limited background, you have evidently the wrong idea of what a guesstimate is. I accept criticism to this concept from a scientist used only to statistical methods, since it comes from pure ignorance of what it means. And I accept alternative guesstimates from linguists whose language backgrounds may differ (and thus their perception of language change). However, I would not accept a glottochronological or otherwise (supposedly) statistical model instead (or a religious model, for that matter), so we have no alternatives to guesstimates for the moment.

In fact, guesstimates and dialectalization have paved the way to the steppe hypothesis, first with the kurgan hypothesis by Marija Gimbutas, then complemented further in the past 60 years by linguists and archaeologists into a detailed Khvalynsk -> Yamna -> Afanasevo/Bell Beaker/Sintashta-Andronovo expansion model, now confirmed with genomics. So either you trust us (or any other polyglot who deals with Indo-European matters, like Adrados, Lehmann, Beekes, Kloekhorst, Kortlandt, etc.), or you begin learning ancient languages and obtaining your own guesstimates, whichever way you prefer. The easy way of numbers + computer science does not exist yet, and is quite far from happening – until we can understand how our brains summarize and select important details involved in obtaining estimates – no matter what you might be reading (even in Nature or Science) recently.

Proto-Indo-European dialectal expansion according to Adrados (1998).

Data from the 2015 papers changed my understanding of the original NWIE-speaking community, and I have since shifted my preferred anthropological model (from a Northern dialect in Yamna spreading into a loose NWIE-speaking Corded Ware -> Únětice) to a quite close group formed by late Yamna settlers in the Carpathian Basin, expanded as East Bell Beakers, and later continuing with close contacts through Central European EBA.

NOTE. As you can read, we initially rejected Gimbutas’ and Anthony’s (2007) notion of a Late PIE splitting suddenly into all known dialects (viz. Italo-Celtic with Vučedol/Bell Beaker), and looked thus for a common NWIE spread with Corded Ware migrants, with help from inferences of modern haplogroup distribution (as was common in the early 2000s). Language reconstruction was the foundation of that model, and it was right in its own way. It probably gave the wrong idea to geneticists and archaeologists, who quite easily accepted some results from the 2015 papers as supporting this model. But it also helped us develop a new model and predict what would happen in future papers, as demonstrated in O&M 2018. Any alternative linguistic and archaeological model could explain what is seen today in genomics, but our model of North-West Indo-European reconstruction is obviously at present the best fit for it.

Map of Chalcolithic migrations (A Grammar of Modern Indo-European, 2nd ed. 2008): Corded Ware as the vector of Indo-European languages.

Nevertheless, one of the most important Balticists and Slavicists alive, Frederik Kortlandt, posits that there was in fact an Indo-Slavonic group, so one has to take that possibility into account. Not that his ideas are flawless, of course: he defends the glottalic theory – which is still held today by just a handful of researchers – , and I strongly oppose his description of Balto-Slavic and Germanic oblique cases in *-m- (against other LPIE *-bh-) as an ancestral remnant related to Anatolian (an ending which few scholars would agree corresponds to what he claims), since that would probably represent an older split than warranted in our model. I believe genetics is proving that the dialectalization of Late PIE happened as Fernando López-Menchero and I described.

NOTE. The idea with these examples of how he has been wrong in LPIE and MPIE reconstruction is not to observe the common ad hominem arguments used by amateur geneticists to dismiss academic proposals (“he said that and was wrong, ergo he is wrong now”). It is to bring to attention that the argument from authority is important for the academic community insofar as it creates a common ground, i.e. especially when there are many relevant scholars agreeing on the same subject. But, indeed, any model can and should be challenged, and all authorities are capable of being wrong, and in fact they often are.

The most common explanation today for the dialectal development *-m- is an innovation (not an archaism), whether morphological (viz. Ita. and Gk. them. pl *-i) or phonological (as I defend); and the most commonly repeated model for the satemization trend (even for those supporting a three-dorsal theory for PIE) is areal contact, whether driven by a previous (most likely Uralic) substratum, or not. Hence, if Kortlandt’s main different phonological and morphological assessments of the parent language are flawed, and they are the basis for his dialectal scheme, that scheme should be revised.

The ‘atomic bomb’ that Indo-Slavonic proponents launched, in my opinion, was Holzer’s Temematic (born roughly at the same time as the renewed Old European concept in Oettinger’s North-West Indo-European model) – and indeed Kortlandt’s acceptance of it. It seems to me like the linguistic equivalent of the archaeological “patron-client relationship” proposed by Anthony for a cultural diffusion of Late PIE into different Corded Ware regions: almost impossible to be fully rejected, if the Indo-Slavonic superstrate is proposed for a relatively early time.

In my opinion, the shared morphological layer with North-West Indo-European is obviously older than Iranian influence on Slavic, and I think this is communis opinio today. But how could we disentangle the dialectalization of Balto-Slavic, if there is (as it seems) an ancestral substrate layer (most likely Uralic) common to both Balto-Slavic and Indo-Iranian? It seems a very difficult task.

Diachronic map of migrations in Europe ca. 2250-1750 BC

The expansion of Balto-Slavic

In any case, there are two, and only two mainstream choices right now.

NOTE. Mainstream, as in representing trends current today among Indo-Europeanists, so that many programs around the world would explain these alternative models to their students, or they would easily appear in most handbooks. Not like the word “mainstream” you read in any comment out there by anyone who has never been interested in Indo-European studies, and uses any text from any author, written who knows how long ago, merely to justify their ethnic preconceptions coupled with certain genomic finds.

You can agree with:

A) The Spanish and German schools of thought, together with many American and British scholars, as well as archaeologists like Heyd, Mallory, or Prescott, and now Anthony, too: the language ancestral to Balto-Slavic, Germanic, and Italo-Celtic accompanied expanding West Yamna/East Bell Beakers into Europe, and then their speakers – like the rest of peoples everywhere in Europe – admixed later in the different regions.

B) Frederik Kortlandt and other Indo-Slavicists. The ‘original’ Balto-Slavic would have spread with Srubna (and likely Potapovka before it), as a product of the admixture of East Yamna’s Indo-Slavonic with incoming Corded Ware migrants (this would correspond to my description of Indo-Iranian). ‘True’ Balto-Slavic speakers would have then absorbed the Temematic-speaking migrants (equivalent to early Balto-Slavic migrants as described in the demic diffusion model) spreading from the west, most likely in the steppe. Later developments from the steppe would have then brought Baltic to the north, and Slavic to the west.

Therefore, in both cases the language spoken by early R1a-Z645 lineages in Únětice or Mierzanowice/Nitra EBA cultures would have been an eastern North-West Indo-European dialect associated with expanding Bell Beakers, and closely related to Germanic and Italo-Celtic. In the second case, the ancient samples we see genetically closer to modern West Slavs could thus be identified with those speaking the Temematic substrate absorbed later by Balto-Slavic, or maybe by Balts migrating northward, and Slavs spreading west- and southward.

NOTE. In any case, we know that R1a-Z645 subclades resurged in Central-East Europe after the expansion of Bell Beakers, potentially showing an ancient link with the prevalent R1a subclades in the region today. We know that some ancient Central European populations cluster near modern West Slavs, but in other interesting regions (like the British Isles, Central Europe, Scandinavia, or Iberia) we also see close clusters, and nevertheless observe historically documented radical ethnolinguistic changes, as well as many different subsequent genetic inflows and founder effects, that have significantly altered the anthropological picture in these regions, so it could very well be that the lineages we find in ancient samples do not correspond to modern West Slavic lineages, or even similar ancient and modern lineages could show a radical cultural discontinuity (as is likely the case in this to-and-from-the-steppe migration scheme).

Diachronic map of migrations in Europe ca. 1250-750 BC.

Since we are going to see signs of both – west and east admixture – in early Slavic communities near the steppe, and the distribution from South, West, and East Slavs will include a wide “cloud” connecting Central, East, and South-East Europe, as it is evident already from early Germanic samples, it may be interesting to shift our attention to the Tollense valley and Lusatian samples, and their predominant Y-DNA haplogroups. Once again, tracking male-driven migrations from Central Europe to the Baltic region and the steppe, and back again to much of Central and South Europe, will determine which groups expanded this eastern NWIE dialect initially and in later times.

Since Baltic and Slavic languages are attested quite late, genetics is likely to help us select among the different available models for Balto-Slavic, although (it is worth repeating it) these lineages may not be the same that later expanded each dialect.

NOTE. Bronze and Iron Age samples might begin to depict the true Balto-Slavic migration map. Apart from the strong differences in the satemization processes seen among Baltic, Slavic, and Indo-Iranian, from an archaeological point of view the geographic location of the earliest attested Baltic languages and the prehistoric developments of the region seem to me almost incompatible with a homeland in the steppe. Anyway, in the worst-case scenario – for those of us who work with Balto-Slavic to reconstruct North-West Indo-European – there is consensus that there must be an eastern North-West Indo-European language (which some would call Temematic), whose common traits with Germanic and Italo-Celtic we use to reconstruct their parent language. The question remains thus mostly theoretical, of limited pragmatic use for the reconstruction.

The third way: Baltic Late Neolithic

I have referred to Kristiansen and his group’s position regarding Corded Ware as Indo-European as flawed before. While their latest interpretation (and language identification) was wrong, Kristiansen’s original idea of long-lasting contacts in the Dnieper-Dniester region with the area occupied by late Trypillia developing a Proto-Corded Ware culture was probably right, as we are seeing now.

New data in Mittnik et al. 2018 show some interesting early Late Neolithic samples from the Baltic region – Zvejnieki, Gyvakarai1 (R1a-Z645) and Plinkaigalis242 – , proving what I predicted: that elevated steppe ancestry and R1a-Z645 subclades would be found in the Dnieper-Dniester region unrelated to the Yamna expansion, and, it seems, to migrants of the Corded Ware A-horizon.

Funnily enough, this shows that there were probably ancient interactions in the region, as originally asserted by Kristiansen, and probably following some of Victor Klochko’s proposed exchange paths, but earlier than predicted by him.

Nevertheless, linguist Guus Kroonen (from Kristiansen’s workgroup) issued a quick response to O&M 2018 in yet another twist of his agricultural substrate theory, changing Corded Ware from the vector to a vector of expansion of Late Proto-Indo-European languages (thus following again strictly Gimbutas’ outdated model), which fails thus to tackle the main inconsistencies of their previous models, as shown now with the latest paper on South Asian migrations. As I said, they were always one step behind Anthony, and they still are.

Funny also how Anthony, too – like Kristiansen – , may have been right all along since 2007, in proposing that Corded Ware (the nuclear Corded Ware migrants) stemmed from the Dnieper-Dniester region roughly at the same time as Yamna migrants expanded west, and that they did not have any direct genetic connection (in terms of migrations) with each other.

Most likely Pre-Proto-Anatolian migration with Suvorovo-Novodanilovka chiefs in the North Pontic steppe and the Balkans.

Both researchers, who collaborated with the latest genomic research, remade their models, and have to revise now their most recent proposals with the new data, influencing each new paper published with their pressure to be right in their previous models, and with new genomic data compelling them to change their theories under the pressure not to be too wrong again, in this strange vicious circle. Had they remained silent and committed to their archaeological theories, they could have been right all along, each one in their own way.

NOTE. BTW, in case you see ad hominem here too, I feel compelled to say that only thanks to their commitment to disentangle the truth about ancient migrations, and their readiness to collaborate with genetic research – unlike many others in their field – we know today what we know. If they have been wrong many times, it is because they have tried to connect the genetic dots as they were told. Only because of their readiness to explore their science further they should be praised by all. But, again, that does not mean that they cannot be wrong in their models…

Thanks to Anthony’s latest change of mind, we don’t have to hear the “cultural diffusion” argument anymore, and I consider this a great advance for the field.

NOTE. Not that there could not be prehistoric cultural diffusion events of language (i.e. not accompanied by genetic admixture), of course, but such theories, almost impossible to disprove, probably need much more than a simple “patron-client relationship” proposal and anthropometry to justify them, in a time when we will be able to see almost every meaningful personal exchange in Genomics…

Today – since the finding of Ukraine_Eneolithic sample I6561, of haplogroup R1a-Z93, dated ca. 4200 BC, and likely from the Sredni Stog culture – it seems more likely than ever that the expansion of R1a-Z645 subclades was in fact associated with the spread of steppe admixture probably near the North Pontic forest-steppe region, most likely from the Dnieper-Dniester or Upper Dniester region.

The appearance of a ‘late’ Z93 subclade already at such an early date, with steppe admixture, makes it still more likely that the Proto-Corded Ware culture, from where Corded Ware migrants of R1a-Z645 lineages later spread, was probably associated with this wide region.

In a parallel but unrelated migration, as it is now clear, steppe admixture also expanded with Yamna settlers of R1b-L23 lineages into the North Pontic steppe – from the North Caspian steppe, where it had developed previously as the Khvalynsk and (likely) Repin cultures – roughly at the same time as Proto-Corded Ware expanded to the north, ca. 3300-3000 BC, and then expanded to the west into the Balkans (contributing to the formation of Balkan EBA cultures, and to the East Bell Beaker group).

NOTE. A migration of Yamna settlers northward along the Prut dated ca. 3000 BC or later could have justified the appearance of steppe admixture in the Dnieper-Dniester region, as I proposed for the Zvejnieki sample, although dates from Baltic samples are likely too early for that. For this to be corroborated, migrants should be accompanied up to a certain region by R1b-L23 lineages, and this could mean in turn a revival of Anthony’s original model of cultural diffusion of 2007. The most likely scenario, however, as predicted by Heyd, given the early appearance of steppe admixture and R1a-Z93 subclades in the forest-steppe during the 5th millennium, is that the admixture happened much earlier than that, fully unrelated to Late PIE migrations.

Diachronic map of Copper Age migrations in Europe ca. 3100-2600 BC

The modern Baltic and Slavic conundrum

As for some people of Northern European ancestry previously supporting a bulletproof Yamna (R1a/R1b) -> Corded Ware migration that was obviously wrong; now supporting different Sredni Stog -> Corded Ware groups representing Indo-Slavonic (and Germanic??) in a model that is clearly wrong: how are these attempts different from Western Europeans supporting the autochthonous continuity of R1b-P312 lineages against all recent data, from Indians supporting the autochthonous continuity of R1a-M417 lineages no matter what, and from the more recent trend of autochthonous continuity theories for N1c lineages and Uralic in Eastern Europe?

Modern Germanic-speaking peoples can trace their common language to Nordic Iron Age Proto-Germanic, Celts to La Tène’s expansion of Proto-Celtic, and Romance speakers to the Roman expansion (and to an earlier Proto-Italic), all three dating approximately to the Iron Age. Proto-Slavic is dated much later than that, and probably Proto-Baltic too (or maybe earlier depending on the dialectal proposal), with Balto-Slavic being possibly coeval with Pre-Proto-Germanic and Italo-Celtic, but probably slightly later than that. Also, the language ancestral to Slavic may be (like a theoretical Proto-Romance language) impossible to reconstruct with precision, due to multiple substrate (or superstrate?) influences on the wide territory where Proto-Slavic formed and expanded from, in close alliance with steppe communities of different ethnolinguistic backgrounds.

We know that proto-historic Germanic, Celtic, and Italic peoples spread from relatively small regions, and had almost nothing to do with historic groups speaking their daughter languages, let alone modern speakers. Baltic and Slavic are not different.

NOTE. We have read that Weltzin samples clustered closely to Central Europeans (especially Austrians), and at a certain distance from modern Poles. That’s the conclusion of Sell’s PhD thesis, and it may be right, if you take only modern samples for comparison. However, if you have read or thought that they represented some kind of “ancestral Germanic vs. Slavic” battle, please imagine Trump’s voice for my opinion: Wrroonng, wrroonng, wrroonng. They cluster closely with Bell Beaker migrants, Poland BA, and Únětice (in this order), which we now know thanks to the data from O&M 2018 and Mittnik et al. 2018. And we also know who they don’t cluster close to: Corded Ware and Trzciniec samples. Therefore, people from the region near the most likely homelands of Pre-Proto-Germanic and Proto-Balto-Slavic are – as expected – likely descendants from Bell Beaker migrants in Central Europe. The genetic relationship of those ancient samples to modern inhabitants of Central-East Europe? Not obvious – at all.

PCA of samples from Tollense Valley battlefield and some ancient and modern samples.

We also know (and have known for a long time, well before these recent papers) that the oldest attested Indo-European languages – Mycenaean, early Anatolian languages, and Indo-Aryan (through certain words in Mitanni inscriptions) – do not show continuity from the places where they were first attested to the Late and Middle Proto-Indo-European (steppe) homeland either. There should be no problem then in accepting that there is no linguistic, archaeological, or common sense reason to support that Balto-Slavic is older or shows more regional continuity than other IE languages from Europe.

NOTE. Oh yes, Balts saying “Baltic is the most similar language to PIE” I hear you thinking? Uh-huh, sure. And according to some Greeks (supported e.g. by the conclusions from Lazaridis et al. 2017) Mycenaeans were ‘autochthonous’, and Proto-Greek the most similar to PIE. For many Hindus, Vedic Sanskrit is in fact PIE, and the latest paper by Narasimhan et al. (2018) only reinforces this idea (don’t ask me why). Also, Caucasian scholar Gamkrelidze (with Ivanov) supported the origin of the language precisely in the Caucasus, with Armenian being thus the purest language. For Italian fans of Virgil and the Roman Empire, Latin (like Aeneas) comes from Anatolian linguistically and genetically, hence it must be the ‘oldest’ IE dialect alive… No, wait, Danish scholars Kroonen and Iversen quite recently asserted that Germanic is the oldest to branch off, then it should thus be nearest to PIE! I think you can see a pattern here… And don’t forget about the new Vasconic-Uralic hypotheses going on now, with Vasconic fans of R1b changing from Palaeolithic to Mesolithic, and now to European Neolithic and whatnot, or Uralic fans of N1c changing now from Mesolithic EHG to Siberia (for ancestry) or Central Asia (for N1c subclades), or whatever is necessary to believe in ‘continuity’ of their people following the newest genetic papers… Just pick whatever theory you want, call it “mainstream”, and that’s it.

So, if there is no reliable archaeological model connecting Bronze or Iron Age cultures to Eastern European cultures which are supposed to represent the Proto-Slavic and Proto-Baltic homelands… why on earth would any reasonable amateur (not to speak of scholars) dare propose any sort of genetic or linguistic continuity for thousands of years from PIE to early Slavs, a people whose first blurry appearance in historical records happened during the Middle Ages in rather turbulent and genetically admixed regions? It does not make any sense, and it has all odds against it. Blond hair, blue eyes, lactase persistence? Sure, and ABO group, brachycephaly, anthropometry… All very scientifish.

Diachronic map of migrations during Classical Antiquity in Europe 250 BC – 250 AD.
Where’s Proto-Slavic Wally?


Human ancestry can only help refine solid academic theories, it cannot create one. Every new pet theory used to satisfy modern cultural pre- and misconceptions has failed, and it will fail again, and again, and again…

To have one’s own anthropological model of prehistoric migration requires time and study. It is not enough to play with software and to misuse traditional academic disciplines just to ‘prove’ some completely irrelevant, meaningless, and false continuity.