Corded Ware and Bell Beaker related groups defined by patrilocality and female exogamy


Two interesting new papers on Corded Ware and Bell Beaker peoples appeared last week, supporting yet again what has been well known since 2015 about West Uralic and North-West Indo-European speakers and their expansion.

Below are relevant excerpts (emphasis mine) and comments.

NOTE. I will add analyses of ancestry, renewed Y-DNA maps, etc. if and when I find the time.

I. Corded Ware and Battle Axe cultures

Open access The genomic ancestry of the Scandinavian Battle Axe Culture people and their relation to the broader Corded Ware horizon, by Malmström, Günther, et al. Philos. Trans. R. Soc. (2019).

I.1. Origins of Corded Ware peoples

The discovery of the Alexandria outlier offered clear support for a long-lasting genomic difference between the two distinct cultural groups, Yamnaya and Corded Ware, already visible in the opposition between Khvalynsk and late Sredni Stog ca. 4000 BC, i.e. well before the formation of both Late Eneolithic/Early Bronze Age groups.

However, the realization that it may not have been an Eneolithic individual, but rather a (Middle?) Bronze Age one, suggests that Sredni Stog was possibly not directly related to Corded Ware, and a potential direct connection with Yamnaya might have to be reevaluated, e.g. through the Carpathian Basin, as Anthony (2017) proposed.

Principal component analysis of modern Europeans (grey) and projected ancient Europeans.

This new paper shows two early Corded Ware individuals from Obłaczkowo, Poland (ca. 2900-2600 BC) – hence close to the supposed original Proto-Corded Ware community – with an apparently (almost) full “Steppe-like” ancestry, clustering (almost) with Yamnaya individuals:

Similar to the BAC individuals, the newly sequenced individuals from the present-day Karlova in Estonia and Obłaczkowo in Poland appear to have strong genetic affinities to other individuals from BAC and CWC contexts across the Baltic Sea region. Some individuals from CWC contexts, including the two from Obłaczkowo, cluster closely with the potential source population of steppe-related ancestry, the Yamnaya herders. Notably, these individuals appear to be those with the earliest radiocarbon dates among all genetically investigated individuals from CWC contexts. Overall, for CWC-associated individuals, there is a clear trend of decreasing affinity to Yamnaya herders with time.

NOTE. Interestingly, this sample is almost certainly attributed to the skeleton E8-A, which had been supposedly already investigated by the Copenhagen group as the RISE1 sample:

We note that RISE1 is also described as the individual from Obłaczkowo feature E8-A. However, their genetic results differ from ours. They present this individual as a molecularly determined male that belongs to Y-chromosomal haplogroup (hg) R1b and to mtDNA hg K1b1a1 while our results show this individual to be female, carrying a mtDNA hg U3a’c profile

Since the typical Steppe_MLBA ancestry of Corded Ware groups does not show good fits for (Pre-)Yamnaya-derived ancestry, it is almost certain that these individuals will show no (or almost no) direct Yamnaya-related contribution, but rather a contribution of East European sub-Neolithic groups, more or less close to the steppe-forest region.

NOTE. They might show contributions from Pre-Yamnaya-influenced Sredni Stog, but if they show a contribution of Yamnaya, then they are probably outliers, related to Yamnaya vanguard groups (see image below). And for them to show it, both sources, Yamnaya and Corded Ware, should be clearly distinguishable from each other and their relative contribution quantifiable in formal stats, something difficult (if not impossible) to ascertain today.

Their position in the published PCA – a plot apparently affected by projection bias – suggests a cluster in common with early Baltic samples, which are known to show contributions from East European sub-Neolithic populations (see qpAdm values for Baltic CWC samples).

NOTE. Results for previous samples labelled as Poland CWC are unreliable due to their low coverage.

The most interesting aspect of the ancestry shown by these early samples is their further support for an origin of the culture different from Sredni Stog, and for a rejection of the Alexandria outlier as ancestral to them. This points to a Volhynian-Podolian homeland of Proto-Corded Ware peoples, with an ancestry probably more closely related to the late Maykop Steppe and Trypillian/GAC groups admixed with sub-Neolithic populations of the Eastern European Late Eneolithic.

NOTE. That is, unless there is a reason for the apparent increase in so-called “Steppe-ancestry” during the northward and westward migration of CWC peoples that represents another thing entirely…

Trypillian routes of influence and Yamnaya culture influences in Central and Central-East Europe during the Late Eneolithic / Early Bronze Age. Images by Klochko (2009).

I.2. CWC expansion under R1a bottlenecks

The two males in our dataset (ber1 and poz81) belonged to Y-chromosome R1a haplogroups, as do the majority of males (16/24) from the previously published CWC contexts, while a smaller fraction belonged to R1b [3/24] or I2a [3/24] lineages. The R1a haplogroup has not been found among Neolithic farmer populations nor in hunter–gatherer groups in central and western Europe, but it has been reported from eastern European hunter–gatherers and Eneolithic groups. Individuals from the Pontic–Caspian steppe, associated with the Yamnaya Culture, carry mostly R1b and not R1a haplotypes.
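As a quick check of the proportions quoted above, the counts can be tallied as follows. This is a minimal sketch: the variable names are illustrative, and only the counts themselves (16, 3 and 3 out of 24) come from the quoted passage. Note that the three groups named add up to 22 of the 24 males, so two are not covered by that sentence.

```python
# Counts as quoted from Malmström et al. (2019); variable names are illustrative.
total_males = 24
counts = {"R1a": 16, "R1b": 3, "I2a": 3}

# Share of each haplogroup among previously published CWC males.
shares = {hg: n / total_males for hg, n in counts.items()}

# Males not assigned to any of the three groups named in the quote.
unassigned = total_males - sum(counts.values())

for hg, share in shares.items():
    print(f"{hg}: {counts[hg]}/{total_males} = {share:.1%}")
print(f"not covered by these three groups: {unassigned}/{total_males}")
```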

Sample poz81 is of basal hg. R1a-CTS4385*, an R1a-M417 subclade, supporting once again that most Corded Ware individuals from western and central European groups expanded under R1a-M417 (xZ645) lineages. The Battle Axe sample from Bergsgraven (ca. 2620-2470 BC) shows a basal hg. R1a-Y2395*, an R1a-Z283 subclade leading to the typically Fennoscandian R1a-Z284.

Both findings further support that typical lineages of West CWC groups, including R1a-M417 (xZ645) subclades, were fully replaced by incoming East Bell Beakers, and that the limited expansion of R1a-Z284 and I1 (the latter found in one newly reported Late Neolithic sample from Sweden) was the outcome of later regional bottlenecks within Scandinavia, after the creation of a maritime dominion by the Bell Beaker elites during the Dagger Period.

I.3. CWC and lactase persistence

(…) one of these individuals (kar1) carried at least one allele (-13910 C->T) associated with lactose tolerance, while the other two individuals (ber1 and poz81) carried at least one ancestral variant each, consistent with previous observations of low levels of lactose tolerance variants in the Neolithic and a slight increase among individuals from CWC contexts.

The fact that two early CWC individuals carry ancestral variants could be said to support the improbability of the individual from Alexandria representing a community ancestral to the Corded Ware community. On the other hand, the late CWC individual from Estonia carries one allele, but it still seems that only Bell Beakers and Steppe-related groups show the necessary two alleles during the Early Bronze Age, which is in line with a late Repin/early Yamnaya-related origin of the successful selection of the trait, consistent with the expansion of their specialized semi-nomadic cattle-breeding economy through the steppe biome during the Late Eneolithic.

Maps part of the public data used for the post by Iain Mathieson on Lactase Persistence. “By 2500 BP, the allele is present over a band stretching from Ireland to Central Asia at around 50 degrees latitude. This probably reflects the spread of Steppe ancestry populations in which the allele originated. However, the allele is still rare (say less than 1% frequency) over this entire range. It does not become common anywhere until some time in the past 2500 years – when it reaches its present-day high frequency in Britain and Central Europe”.

I.4. West Uralic spread from the East

The BAC groups fit as a sister group to the CWC-associated group from Estonia but not as a sister group to the CWC groups from Poland or Lithuania (|Z| > 3), indicating some differences in ancestry between these CWC groups and BAC. Supervised admixture modelling suggests that BAC may be the CWC-related group with the lowest YAM-related ancestry and with more ancestry from European Neolithic groups.
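For readers unfamiliar with thresholds like |Z| > 3: the Z score is the estimate of a statistic divided by its standard error, which in this kind of analysis is usually obtained with a block jackknife over the genome. The following is a minimal sketch, assuming an unweighted delete-one jackknife over equal-sized blocks; the function name and the per-block values are invented for illustration and are not taken from the paper or from any published software.

```python
import math

def jackknife_z(block_values):
    """Return (estimate, standard error, Z) from equal-sized genomic blocks.

    Unweighted delete-one-block jackknife of the mean; real analyses
    typically weight blocks by their number of SNPs.
    """
    n = len(block_values)
    if n < 2:
        raise ValueError("need at least two blocks")
    total = sum(block_values)
    estimate = total / n
    # Estimate recomputed with each block left out in turn.
    leave_one_out = [(total - v) / (n - 1) for v in block_values]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    se = math.sqrt(variance)
    return estimate, se, estimate / se

# Invented per-block values of a statistic (not real data).
blocks = [0.010, 0.014, 0.008, 0.012, 0.011]
estimate, se, z = jackknife_z(blocks)
print(f"estimate={estimate:.4f} se={se:.4f} Z={z:.1f} rejected={abs(z) > 3}")
```

With these toy values the estimate lies many standard errors away from zero, so a model predicting zero would be rejected at the |Z| > 3 level.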

While the results of the paper are compatible with a migration from either the Eastern or the Western Baltic into Scandinavia, phylogeography and archaeology support that Battle Axe peoples emerged as a Baltic Corded Ware group close to the Vistula that expanded first to the north-east, and then to the west from Finland. They continued largely unscathed throughout the Bronze Age, chiefly in eastern Fennoscandia, with the development of Balto-Finnic- and Samic-speaking communities.

Correlation between f4(Chimp, LBK, YAM, X), where X is a CWC or BAC individual, and the date (BCE) of each individual. This statistic measures shared drift between CWC and Linear Pottery Culture (LBK) as opposed to YAM and should increase with the higher proportion of Neolithic farmer ancestry in CWC and BAC.
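To see why this statistic should grow with farmer ancestry, recall that f4(A, B; C, D) is the average over SNPs of the product of allele frequency differences (a - b)(c - d). Below is a minimal sketch under strong simplifying assumptions: four invented SNP frequencies per population, a toy individual X built as a plain mixture of YAM and LBK, and hypothetical helper names (f4, mix). It is not the estimator used in the paper, which works on genome-wide data with proper standard errors.

```python
def f4(a, b, c, d):
    """Mean over SNPs of (a - b) * (c - d), given per-SNP allele frequencies."""
    if not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("all populations need the same number of SNPs")
    return sum((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)) / len(a)

def mix(source1, source2, alpha):
    """Toy admixed population: (1 - alpha) of source1 plus alpha of source2."""
    return [(1 - alpha) * s1 + alpha * s2 for s1, s2 in zip(source1, source2)]

# Invented allele frequencies at four SNPs (not real data).
chimp = [0.0, 1.0, 0.0, 1.0]
lbk = [0.8, 0.2, 0.6, 0.9]
yam = [0.2, 0.7, 0.1, 0.5]

# X carries a growing fraction alpha of LBK-like (farmer) ancestry.
for alpha in (0.0, 0.1, 0.3):
    x = mix(yam, lbk, alpha)
    print(alpha, round(f4(chimp, lbk, yam, x), 4))
```

In this toy setup the statistic is zero when X is identical to YAM and increases in proportion to alpha, which is the behaviour the caption describes for later CWC and BAC individuals.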

Radiocarbon dating showed that the three individuals from the Öllsjö megalithic tomb derived from later burials, where oll007 (2860–2500 cal BCE) overlaps with the time interval of the BAC, and oll009 and oll010 (1930–1650 cal BCE) fall within the Scandinavian Late Neolithic and Early Bronze Age

For more on how the Pitted Ware culture may have influenced Uralic-speaking Battle Axe peoples earlier than Indo-European-speaking Bell Beakers in Scandinavia, read about Early Bronze Age Scandinavia and about the emergence of the Pre-Proto-Germanic community.

II. Bell Beakers through the Bronze Age

New paper (behind paywall) Kinship-based social inequality in Bronze Age Europe, by Mittnik et al. Science (2019).

II.1. Yamnaya vanguard settlers

In my last post, I showed how the ancestry of Corded Ware from Esperstedt is consistent with influence by incoming Yamnaya vanguard settlers or early Bell Beakers, stemming ultimately from the Carpathian Basin, something that could be inferred from the position of the Esperstedt outlier in the PCA, and by the knowledge of Yamnaya archaeological influences up to Saxony-Anhalt.

Yamnaya settlers are strongly suspected to have migrated in small so-called vanguard groups to the west and north of the Carpathians in the first half of the 3rd millennium BC, well before the eventual adoption of the Proto-Beaker package and their expansion ca. 2500 BC as East Bell Beakers.

Tauber Valley infiltration

As I mentioned in the books, one of the known – among the many more unknown – sites displaying Yamnaya-related traits and suggesting the expansion of Yamnaya settlers into Central Europe is Lauda-Königshofen, in the Tauber Valley.

From Diet and Mobility in the Corded Ware of Central Europe, by Sjögren, Price, & Kristiansen PLoS One (2017):

A series of CW cemeteries have been excavated in the Tauber valley. There are three large cemeteries known and some 30 smaller sites. The larger ones are Tauberbischofsheim-Dittingheim with 62 individuals, Tauberbischofsheim-Impfingen with 40 individuals, and Lauda-Königshofen with 91 individuals. The cemeteries are dispersed rather regularly along the Tauber valley, on both sides of the river, suggesting a quite densely settled landscape.

The Lauda-Königshofen graves consisted mostly of single inhumations in contracted position, usually oriented E-W or NE-SW. A total of 91 individuals were buried in 69 graves. At least 9 double graves and three graves with 3–4 individuals were present. In contrast to the common CW pattern, sexes were not distinguished by body position, only by grave goods. This trait is common in the Tauber valley and suggests a local burial tradition in this area. Stone axes were restricted to males, pottery to females, while other artifacts were common to both sexes. About a third of the graves were surrounded by ring ditches, suggesting palisade enclosures and possibly over-plowed barrows.

In particular, Frînculeasa, Preda, & Heyd (2015) used Lauda-Königshofen as representative of the mobility of horse-riding Yamnaya nomadic herders migrating into southern Germany, referring to the findings in Trautmann (2012) about the nomadic herders from the Tauber Valley, and their already known differences with other Corded Ware groups.

The likely influence of Yamnaya in the region has been reported at least since the 2000s, repeatedly mentioned by Jozef Bátora (2002, 2003, 2006), who compiled Yamnaya influences in a map that has been copied ever since, with little improvement over time. Heyd believes that there are potentially many Yamnaya remains along the Middle and Lower Danube and tributaries not yet found, though.

NOTE. Looking for this specific site, I realized that Bátora (and possibly many after him who, like me, copied his map) located Lauda-Königshofen in a more south-western position within Baden-Württemberg than its actual location. I have now corrected it in the maps of Chalcolithic migrations.

Yamnaya influences in Central Europe suggestive of vanguard settlements, contemporary with Corded Ware groups. See full map.

Althäuser Hockergrab…Bell Beakers

Unfortunately, though, it is very difficult to attribute the reported R1b-L51 sample from the Tauber valley to a population preceding the arrival of East Bell Beakers in the region, so there is no uncontroversial smoking gun of Yamnaya vanguard settlers – yet. Reasons to doubt a Pre-Beaker origin are as follows:

1. This family of the Tauber valley shows a late radiocarbon date (ca. 2500 BC), i.e. from a time when East Bell Beakers are known to have been already expanding in all directions from the Middle and Upper Danube and its tributaries.

Crouched burial from Althausen (Althäuser Hockergrab), dated ca. 2500 BC.

2. Archaeological information is scarce. Remains of these four individuals were discovered in 1939 and officially reported together with other findings in 1950, without any meaningful data that could distinguish between Bell Beakers and Corded Ware individuals.

This site is located in the Tauber valley, ca. 100 km to the northwest of the Lech valley. The site was discovered during the construction of a sports field in 1939 and was subsequently excavated by G. Müller and O. Paret. Four individuals in crouched position were found in the burial pit of a flat grave. The burial did not contain any grave goods, but due to the type of grave and positioning of the bodies (with heads pointing towards southwest) the site was attributed to the Corded Ware complex.

The classification of this burial as CWC rather than BBC seems to have been based entirely on the numerous CWC findings in the Tauber valley, rather than on its particular burial orientation following a regional custom (foreign to the described standard of both cultures), and on its grave type that was also found among Bell Beaker groups. Like many human remains recovered in dubious circumstances in the 20th century, these samples should probably have been labelled (at least in the genetic paper) more properly as Tauber_LN or Tauber_EBA.

Changes in ancestry over time. (A) Median ages of individuals plotted against z scores of f4 (Mbuti, Test; Yamnaya_Samara, Anatolia_Neolithic) show increase of Anatolian farmer-related ancestry (indicated by more positive z-scores) and decrease of variation in ancestry over time. Grey shading indicates significant z scores, red line shows linear correlation (r = -0.35971; P = 0.003) and dotted lines the 95% confidence interval. (B) ancestry proportions on autosomes calculated with qpAdm. (C) Sex-bias z scores between autosomes and X chromosomes show significant male bias for steppe-related ancestry in the Tauber samples. Image modified from the paper: Surrounded with a blue circle in (A) are females with more Steppe-related ancestry, and in (C) surrounded by squares are the distinct sex biases found in the earliest BBC from the Tauber valley vs. later groups from the Lech valley.

3. In terms of ancestry, there seem to be no gross differences between the Lech Valley BBC individuals and previously reported South German Beakers, originally Yamnaya-like settlers admixing through exogamy with locals, including Corded Ware peoples, as the sex bias of the Lech Valley Beakers proves (see PCA plot below). In other words, northern and eastern Beakers admixed with regional (Epi-)Corded Ware females during their respective expansions, similar to how southern and western Beakers admixed with regional EEF-related females.

The two available Tauber Valley samples (“Tauber_CWC”) show the same pattern: a quite recent Steppe-related male bias and Anatolia_Neolithic-related female bias. Nevertheless, the male sample clusters ‘to the south’ in the PCA relative to all sampled Corded Ware individuals (see PCA plot below), and shows less Yamnaya-like ancestry than what is reported (or can be inferred) for Yamnaya from Hungary or early Bell Beakers of elevated Steppe-related ancestry.

The ancestry and position of the Althäuser male in the PCA are thus fully compatible with recently incoming East Bell Beakers admixing with local peoples (including Corded Ware) through exogamy, but not so much with a sample that would be expected from Yamnaya vanguard + Corded Ware-related ancestry (more like the Esperstedt outlier or the early France Beaker). Compared to the more ‘northern’ (fully Corded Ware-like) position of his female counterpart, there is little to support that both are part of the same native Tauber valley community after generations of ancestry levelling…

Table S9. Three-way qpAdm admixture model for European MN/Chalcolithic group+Yamnaya_Samara. P-values greater than 0.05 (model is not rejected) marked in green.

4. The haplogroup inference is also unrevealing: whereas the paper reports that it is R1b-P310* (xU106, xP312), there is no data to support an xP312 call, so it may well fall within the P312 branch, like most sampled Bell Beaker males. Similarly, the paper also reports that HUGO_180Sk1 (ca. 2340 BC) shows a positive SNP for the U106 trunk, which would make it the earliest known U106 sample and originally from Central Europe, but there is no clear support for this SNP call either, at least not in their downloadable BAM files, as far as I can tell. Even if both were true, they would merely confirm the path of expansion of Yamnaya / East Bell Beakers through the Danube, already visible in confirmed genomic data:

Distribution of ‘archaic’ R1b-L51 subclades in ancient samples, overlaid over a map of Yamnaya and Bell Beaker migrations. In blue, Yamnaya Pre-L51 from Lopatino (not shown) and R1b-L52* from BBC Augsburg. In violet, R1b-L51 (xP312,xU106) from BBC Prague and Poland. In maroon, hg. R1b-L151* from BBC Hungary, BA Bohemia, and (not shown) a potential sample from the Tauber Valley and one from BBC at Mondelange, which is certainly xU106, maybe xP312. Interestingly, the earliest sample of hg. R1b-U106 (a lineage more typical of northern Europe) has been found in a Bell Beaker from Radovesice (ca. 2350 BC), between two of these ‘archaic’ R1b-L51 samples; and a sample possibly of hg. R1b-ZZ11+ (ancestral to DF27 and U152) was found in a Bell Beaker from Quedlinburg, Germany (ca. 2290 BC), to the north-west of Bohemia. The oldest R1b-U152 are logically from Central Europe, too.

II.2. Proto-Celts and the Tumulus culture

The most interesting data from Mittnik et al. (2019) – overshadowed by the (at first sight) striking “CWC” label of the Althäuser male – is the finding that the most likely (Pre-)Proto-Celtic community of Southern Germany shows, as expected, major genetic continuity over time with Yamnaya/East Bell Beaker-derived patrilineal families, which suggests an almost full replacement of other Y-chromosome haplogroups in Southern German Bronze Age communities, too.

Sampled families form part of an evolving Bell Beaker-derived European BA cluster in common with other Indo-European-speaking cultures from Western, Southern, and Northern Europe, also including early Balto-Slavs, clearly distinct from the Corded Ware-related clusters surviving in the Eastern Baltic and the forest zone.

This Central European Bronze Age continuity is particularly visible in many generations of different patrilocal families practising female exogamy, showing patrilineal inheritance mainly under R1b-P312 (mostly U152+) lineages typical of Central European bottlenecks, all of them apparently following a similar sociopolitical system spanning roughly a thousand years, from the arrival of East Bell Beakers in the region (ca. 2500 BC) until – at least – the end of the Middle Bronze Age (ca. 1300 BC):

Here, we show a different kind of social inequality in prehistory, i.e., complex households that consisted of i) a higher-status core family, passing on wealth and status to descendants, ii) unrelated, wealthy and high-status non-local women and iii) local, low-status individuals. Based on comparisons of grave goods, several of the high-status non-local females could have come from areas inhabited by the Unetice culture, i.e., from a distance of at least 350 km. As the EBA evidence from most of Southern Germany is very similar to the Lech valley, we suggest that social structures comparable to our microregion existed in a much broader area. The EBA households in the Lech valley, however, seem similar to the later historically known oikos, the household sphere of classic Greece, as well as the Roman familia, both comprising the kin-related family and their slaves.

Genetic structure of Late Neolithic and Bronze Age individuals from southern Germany. (A) Ancient individuals (covered at 20,000 or more SNPs) projected onto principal components defined by 1129 present day west Eurasians (shown in fig. S6); individuals in this study shown with outlines corresponding to their 87Sr/86Sr isotope value (black: consistent with local values, orange: uncertain/intermediate, red: inconsistent with local values). Selected published ancient European individuals are shown without outlines. Image modified from the paper. Surrounded by triangles in cyan, Corded Ware-like females; with a blue triangle, Yamnaya/Early BBC-like sample from the Tauber valley.

NOTE. For those unfamiliar with the usual clusters formed by the different populations in the PCA, you can check similar graphics: PCA with Bell Beaker communities, PCA with Yamnaya settlers from the Carpathians, a similar one from Wang et al. (2019) showing the Yamnaya-Hungary cline, or the chronological PCAs prepared by me for the books.

The gradual increase in local EEF-like ancestry among South Germany EBA and MBA communities over the previous BBC period offers a reasonable explanation as to how Italic and Celtic communities remained in loose contact (enough to share certain innovations) despite their physical separation by the Alps during the Early Bronze Age, and probably why sampled Bell Beakers from France were found to be the closest source of Celts arriving in Iberia during the Urnfield period.

Furthermore, continued contacts with Únětice-related peoples through exogamy also show how Celtic-speaking communities closer to the Danube might have influenced (and might have been influenced by) Germanic-speaking communities of the Nordic Late Neolithic and Bronze Age, helping explain their potentially long-lasting linguistic exchange.

Like other previous Neolithic or Chalcolithic groups that Yamnaya and Bell Beakers encountered in Europe, ancestry related to the Corded Ware culture became part of Bell Beaker groups during their expansion and later during the ancestry levelling in the European Early Bronze Age, which helps us distinguish the evolution of Indo-European-speaking communities in Europe, and suggests likely contacts between different cultural groups separated by hundreds of km. from each other.

All in all, there is nothing to support that (epi-)Corded Ware groups might have survived in any way in Central or Western Europe: whether through their culture, their Y-chromosome haplogroups, or their ancestry, they followed the fate of other rapidly expanding groups before them, viz. Funnelbeaker, Baden, or Globular Amphorae cultural groups. This is very much unlike the West Uralic-speaking territory in the Eastern Baltic and the Russian forests, where Corded Ware-related cultures thrived during the Bronze Age.

f4-statistics showing differences in ancestry in populations grouped by period. An increase in affinity to ancestry related to Anatolia Neolithic over time. Males and females grouped together shown as upward and downward pointing triangles, respectively.


It was about time that geneticists caught up with the relevance of Y-DNA bottlenecks when assessing migrations and cultural developments.

From Malmström et al. (2019):

The paternal lineages found in the BAC/CWC individuals remain enigmatic. The majority of individuals from CWC contexts that have been genetically investigated this far for the Y-chromosome belong to Y-haplogroup R1a, while the majority of sequenced individuals of the presumed source population of Yamnaya steppe herders belong to R1b. R1a has been found in Mesolithic and Neolithic Ukraine. This opens the possibility that the Yamnaya and CWC complexes may have been structured in terms of paternal lineages—possibly due to patrilineal inheritance systems in the societies — and that genetic studies have not yet targeted the direct sources of the expansions into central and northern Europe.

From Gibbons (2019), a commentary to Mittnik et al. (2019):

Some of the early farmers studied were part of the Neolithic Bell Beaker culture, named for the shape of their pots. Later generations of Bronze Age men who retained Bell Beaker DNA were high-ranking, buried with bronze and copper daggers, axes, and chisels. Those men carried a Y chromosome variant that is still common today in Europe. In contrast, low-ranking men without grave goods had different Y chromosomes, showing a different ancestry on their fathers’ side, and suggesting that men with Bell Beaker ancestry were richer and had more sons, whose genes persist to the present.

There was no sign of these women’s daughters in the burials, suggesting they, too, were sent away for marriage, in a pattern that persisted for 700 years. The only local women were girls from high-status families who died before ages 15 to 17, and poor, unrelated women without grave goods, probably servants, Mittnik says. Strontium levels from three men, in contrast, showed that although they had left the valley as teens, they returned as adults.

Also, from Scientific American:

(…) it has long been assumed that prior to the Athenian and Roman empires,—which arose nearly 2,500 and more than 2,000 years ago, respectively—human social structure was relatively straightforward: you had those who were in power and those who were not. A study published Thursday in Science suggests it was not that simple. As far back as 4,000 years ago, at the beginning of the Bronze Age and long before Julius Caesar presided over the Forum, human families of varying status levels had quite intimate relationships. Elites lived together with those of lower social classes and women who migrated in from outside communities. It appears early human societies operated in a complex, class-based system that propagated through generations.

It seems wrong (to me, at least) that the author and – as he believes – archaeologists and historians had “assumed” a different social system for the European Bronze Age, which would mean they hadn’t read about how Indo-European societies were structured. For example, long ago Benveniste (1969) already drew a coherent picture of these prehistoric peoples based on their reconstructed language alone: regarding their patrilocal and patrilineal family system; regarding their customs of female exogamy and marriage system; and regarding the status of foreigners and slaves as movable property in their society.

A long-lasting and pervasive social system of Bronze Age elites under Yamnaya lineages strikingly similar to this Southern German region can be easily assumed for the British Isles and Iberia, and it is likely to be also found in the Low Countries, Northern Germany, Denmark, Italy, France, Bohemia and Moravia, etc., but also (with some nuances) in Southern Scandinavia and Central-East Europe during the Bronze Age.

Therefore, only the modern genetic pool of some border North-West Indo-European-speaking communities of Europe needs further information to describe a precise chain of events before their eventual expansion in more recent times:

  1. the relative geographical isolation causing the visible regional founder effects in Scandinavia, typical of the maritime dominion of the Nordic Late Neolithic (related thus to the Island Biogeography Theory); and
  2. the situation of the (Pre-)Proto-Balto-Slavic community close to the Western Baltic which, I imagine, will be shown to be related to a resurgence of local lineages, possibly due to a shift of power structures similar to the case described for Babia Góra.

NOTE. Rumour has it that R1b-L23 lineages have already been found among Mycenaeans, while they haven’t been found among sampled early West European Corded Ware groups, so the westward expansion of Indo-European-speaking Yamnaya-derived peoples mainly with R1b-L23 lineages through the Danube Basin merely lacks official confirmation.


More Celts of hg. R1b, more Afanasievo ancestry, more maps


Interesting recent developments:

Celts and hg. R1b


Recent paper (behind paywall) Multi-scale archaeogenetic study of two French Iron Age communities: From internal social- to broad-scale population dynamics, by Fischer et al. J Archaeol Sci (2019).

In it, Fischer and colleagues update their previous data for the Y-DNA of Gauls from the Urville-Nacqueville necropolis, Normandy (ca. 300-100 BC), with 8 samples of hg. R, at least 5 of them R1b. They also report new data from the Gallic cemetery at Gurgy ‘Les Noisats’, Southern Paris Basin (ca. 120-80 BC), with 19 samples of hg. R, at least 13 of them R1b.

In both cases, it is likely that the males of each community belonged to the same paternal lineages, hence the patrilocal residence rules and patrilineality described for Gallic groups, also supported by the different maternal gene pools.

The interesting question is whether these individuals were of hg. R1b-L21, hence mainly local lineages later replaced or displaced to the west, or – a priori much more likely – of some R1b-U152 and/or R1b-DF27 subclades from Central Europe that became less and less prevalent as Celts expanded into more isolated regions south of the Pyrenees and into the British Isles. Such information is lacking in the paper, probably due to the poor coverage of the samples.

Y-DNA haplogroups in Europe during the Early Iron Age. See full map.

Other Celts

As for early Celts, we already have:

Celtiberians from the Basque Country (one of hg. I2a) and likely Celtic genetic influence in north-east Iberia (all R1b), where Iberian languages spread later, showing that Celts expanded from some place in Central Europe, probably already with the Urnfield culture (ca. 1300 BC on).

Two Hallstatt samples from Bylany, Bohemia (ca. 836-780 BC), by Damgaard et al. Nature (2018), one of them of hg. R1b-U152.

Photo and diagram of burial HÜ-I/8, Mitterkirchen, Oberösterreich, Leskovar 1998.

Another Hallstatt HaC/D1 sample from Mitterkirchen, Austria (ca. 850-650/600 BC), by Kiesslich et al. (2012), with predicted hg. G2a (see Athey’s haplogroup prediction).

One sample of early La Tène culture A from Putzenfeld am Dürrnberg, Hallein, Austria (ca 450–380 BC), by Kiesslich et al. (2012), with predicted hg. R1b (see Athey’s haplogroup prediction).

NOTE. For the potential unreliability of haplogroup prediction with Whit Athey’s haplogroup predictor, see e.g. Zhang et al. (2017).

Photo and diagram of Burial 376, Putzenfeld, Dürrnberg bei Hallein, Moser 2007.

Three Britons from Hinxton, South Cambridgeshire (ca. 170 BC – AD 80) from Schiffels et al. (2016), two of them of local hg. R1b-S461.

Indirectly, data of Vikings by Margaryan et al. (2019) from the British Isles and beyond show hg. R1b associated with modern British-like ancestry, also linked to early “Picts”, hence likely associated with Britons even after the Anglo-Saxon settlement. Supporting both (1) my recent prediction of hg. R1b-M167 expanding with Celts and (2) the reason for its presence among modern Scandinavians, is the finding of the first ancient sample of this subclade (VK166) among the Vikings of St John’s College Oxford, associated with the ‘St Brice’s Day Massacre’ (see Margaryan et al. 2019 supplementary materials).

The R1b-M167 sample shows 23.5% British-like ancestry, hence autosomally closer to other local samples (and related to the likely Picts from Orkney) than to some of his deceased partners at the site. Other samples with sizeable British-like ancestry include VK177 (32.6%, hg. R1b-U152), VK173 (33.3%, hg. I2a1b1a), or VK150 (25.6%, hg. I2a1b1a), while typical Germanic subclades like I1 or R1b-U106 – which may be associated with Anglo-Saxons, too – tend to show less.

Y-DNA haplogroups in Europe during the Late Iron Age. See full map.

I remember some commenter asking recently what would happen to the theory of Proto-Indo-European-speaking R1b-rich Yamnaya culture if Celts expanded with hg. R1a, because there was only one hg. R1b and one (possibly) G2a from Hallstatt. As it turns out, they were mostly R1b. However, the increasingly frequent obsession with searching for specific haplogroups and ancestry during the Iron Age and the Middle Ages is weird, even as a desperate attempt, because:

  1. it is evident that the more recent the ancient DNA samples are, the more they are going to resemble modern populations of the same area, so ancient DNA would become essentially useless;
  2. cultures from the early Iron Age onward (and even earlier) were based on increasingly complex sociopolitical systems everywhere, which is reflected in haplogroup and ancestry variability, e.g. among Balts, East Germanic peoples, Slavs (of hg. E1b-V13, I2a-L621), or Tocharians.

In fact, even the finding of hg. R1b among Celts of central and western Europe during the Iron Age is rather unenlightening, because more specific subclades and information on ancestry changes are needed to reach any meaningful conclusion as to migration vs. acculturation waves of expanding Celtic languages, which spread into areas that were mostly Indo-European-speaking since the Bell Beaker expansion.

Afanasevo ancestry in Asia

Wang and colleagues continue to publish interesting analyses, now in the preprint Inland-coastal bifurcation of southern East Asians revealed by Hmong-Mien genomic history, by Xia et al. bioRxiv (2019).

Interesting excerpt (emphasis mine):

Although the Devil’s Cave ancestry is generally the predominant East Asian lineage in North Asia and adjacent areas, there is an intriguing discrepancy between the eastern [Korean, Japanese, Tungusic (except northernmost Oroqen), and Mongolic (except westernmost Kalmyk) speakers] and the western part [West Xiōngnú (~2,150 BP), Tiānshān Hun (~1,500 BP), Turkic-speaking Karakhanid (~1,000 BP) and Tuva, and Kalmyk]. Whereas the East Asian ancestry of populations in the western part has entirely belonged to the Devil’s Cave lineage till now, populations in the eastern part have received the genomic influence from an Amis-related lineage (17.4–52.1%) posterior to the presence of the Devil’s Cave population roughly in the same region (~7,600 BP)12. Analogically, archaeological record has documented the transmission of wet-rice cultivation from coastal China (Shāndōng and/or Liáoníng Peninsula) to Northeast Asia, notably the Korean Peninsula (Mumun pottery period, since ~3,500 BP) and the Japanese archipelago (Yayoi period, since ~2,900 BP)2. Especially for Japanese, the Austronesian-related linguistic influence in Japanese may indicate a potential contact between the Proto-Japonic speakers and population(s) affiliating to the coastal lineage. Thus, our results imply that a southern-East-Asian-related lineage could be arguably associated with the dispersal of wet-rice agriculture in Northeast Asia at least to some extent.

Spatial and temporal distribution of ancestries in East Asians. Reference populations and corresponding hypothesized ancestral populations: (1) Devil’s Cave (~7,600 BP), the northern East Asian lineage; (2) Amis, the southern East Asian lineage (= AHM + AAA + AAN); (3) Hòabìnhian (~7,900 BP), a lineage related to Andamanese and indigenous hunter-gatherer of MSEA; (4) Kolyma (~9,800 BP), “Ancient Palaeo-Siberians”; (5) Afanasievo (~4,800 BP), steppe ancestry; (6) Namazga (~5,200 BP), the lineage of Chalcolithic Central Asian. Here, we report the best-fitting results of qpAdm based on following criteria: (1) a feasible p-value (> 0.05), (2) feasible proportions of all the ancestral components (mean > 0 and standard error < mean), and (3) with the highest p-value if meeting previous conditions.
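The three selection criteria in the caption above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the authors' pipeline: the `Model` class, the function names, and all the numbers are made up for the example, and real qpAdm output would have to be parsed into this shape first.

```python
from dataclasses import dataclass

@dataclass
class Model:
    sources: tuple      # names of the source populations in the model
    p_value: float      # p-value reported for the model
    means: tuple        # estimated ancestry proportions
    std_errors: tuple   # standard errors of those proportions

def is_feasible(model, alpha=0.05):
    """Criteria (1) and (2): p-value above alpha, and every
    component with mean > 0 and standard error < mean."""
    if model.p_value <= alpha:
        return False
    return all(mean > 0 and se < mean
               for mean, se in zip(model.means, model.std_errors))

def best_model(models, alpha=0.05):
    """Criterion (3): among feasible models, keep the highest p-value."""
    feasible = [m for m in models if is_feasible(m, alpha)]
    return max(feasible, key=lambda m: m.p_value) if feasible else None

# Invented numbers, for illustration only.
candidates = [
    Model(("Devils_Cave", "Afanasievo"), 0.30, (0.80, 0.20), (0.03, 0.04)),
    # Rejected: negative proportion, despite the highest p-value.
    Model(("Devils_Cave", "Amis"), 0.62, (1.05, -0.05), (0.04, 0.04)),
    # Rejected: p-value not above 0.05.
    Model(("Devils_Cave", "Namazga"), 0.02, (0.85, 0.15), (0.03, 0.03)),
    # Rejected: last component has standard error larger than its mean.
    Model(("Devils_Cave", "Afanasievo", "Namazga"), 0.45,
          (0.78, 0.12, 0.10), (0.03, 0.05, 0.12)),
]

chosen = best_model(candidates)
print(chosen.sources if chosen else "no feasible model")
```

Note that a rule like this picks one model even when several feasible ones differ only slightly in p-value, which is part of the "one OR the other" problem discussed below.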

In this case, the study doesn’t compare Steppe_MLBA, though, so the findings of Afanasievo ancestry have to be taken with a pinch of salt. They are, however, compared to Namazga, so “Steppe ancestry” is there. Taking into account the limited amount of Yamnaya-like ancestry that could have reached the Tian Shan area with the Srubna-Andronovo horizon in the Iron Age (see here), and the amount of Yamnaya-like ancestry that appears in some of these populations, it seems unlikely that this much “Steppe ancestry” could derive from Steppe_MLBA alone. The most likely explanation is that Turkic peoples had contacts with populations of both Afanasievo (first) and Corded Ware-derived ancestry (later) to the west of Lake Baikal.

Two common strategies in population genomics pose evident problems when assessing the actual gene flow from some populations into others: (1) the simplification of ancestral components into A vs. B vs. C… (when many were already mixed), and (2) the simplistic selection of one OR the other in the preferred models (such as those published for Yamnaya or Corded Ware).

Also, it seems that when the “Steppe”-like contribution is small, both Yamnaya and Corded Ware ancestry will be good fits in admixed populations of Central Asia, due to the presence of peoples of EHG-like (viz. West Siberia HG) and/or CHG-like (viz. Namazga) ancestry in the area. Unless and until these problems are addressed, there is little that can be confidently said about the history of Yamnaya vs. Corded Ware admixture among Asian peoples.

Maps, maps, and more maps

As you have probably noticed if you follow this blog regularly, I have been experimenting with GIS software in the past month or so, trying to map haplogroups and ancestry components (see examples for Vikings, Corded Ware, and Yamnaya). My idea was to show the (pre)historical evolution of ancestry and haplogroups coupled with the atlas of prehistoric migrations, but I have to understand first what I can do with GIS statistical tools.

My latest exercise has been to map modern haplogroup distribution (now added to the main menu above) using data from the latest available reports. While there have been no great surprises – beyond the sometimes awful display of data by some papers – I think it is becoming clearer with each new publication how wrong it was for geneticists to target initially those populations considered “isolated” – hence subject to strong founder effects – to extrapolate language relationships. For example:

  • The mapping of R1b-M269, in particular basal subclades, corresponds nicely with the Indo-European expansions.
  • There is no clear relationship of R1b, not even R1b-DF27 (especially basal subclades), with Basques. There is no apparent relationship between the distribution of R1b-M269 and some mythical non-Indo-European “Old Europeans”, like Etruscans or Caucasian speakers, either.
  • Basal R1a-M417 shows an interesting distribution, as do maps of basal Z282 and Z93 subclades, despite the evident late bottlenecks and acculturation among Slavs.
  • The distribution of hg. N1a-VL29 (and other N1a-L392 subclades) is clearly dissociated from Uralic peoples, and their expansion in the whole Baltic Sea during the Iron Age doesn’t seem to be related to any specific linguistic expansion.
    Modern distribution of haplogroup N1a-VL29. See full map.
  • Even the most recent association in Post et al. (2019) with hg. N1a-Z1639 – due to the lack of relationship of Uralic with N1a-VL29 – seems like a stretch, seeing how it probably expanded from the Kola Peninsula and the East Urals, and neither the Lovozero Ware nor forest hunter-fishers of the Cis- and Trans-Urals regions were Uralic-speaking cultures.
  • The current prevalence of hg. R1b-M73 supports its likely expansion with Turkic-speaking peoples.
  • The distribution of haplogroup R1b-V88 in Africa doesn’t look like it was a mere founder effect in Chadic peoples – although they certainly underwent a bottleneck under it.
  • The distributions of R1a-M420 (xM198) and hg. R1b-M343 (possibly not fully depicted in the east) seem to be related to expansions close to the Caucasus, supporting once more their location in Eastern Europe / West Siberia during the Mesolithic.
  • The mapping of E1b-V13 and I-M170 (I haven’t yet divided it into subclades) is particularly relevant for the recent eastward expansion of early Slavic peoples.

All in all, modern haplogroup distribution might have been used to ascertain prehistoric language movements even in the 2000s. It was the obsession with (and the wrong assumptions about) the “purity” of certain populations – say, Basques or Finns – that caused many of the interpretation problems and circular reasoning we are still seeing today.

I have also updated maps of Y-chromosome haplogroups reported for ancient samples in Europe and/or West Eurasia for the Early Eneolithic, Early Chalcolithic, Late Chalcolithic, Early Bronze Age, Middle Bronze Age, Late Bronze Age, Early Iron Age, Late Iron Age, Antiquity, and Middle Ages.

Haplogroup inference

I have also tried Yleaf v.2 – which seems like an improvement over the infamous v.1 – to test some samples that hobbyists and/or geneticists have reported differently in the past. I have posted the results in this ancient DNA haplogroup page. It doesn’t mean that the inferences I obtain are the correct ones, but now you have yet another source to compare.

Not many surprises here, either:

  • M15-1 and M012, two Proto-Tocharians from Shirenzigou, are of hg. R1b-PH155, not R1b-M269.
  • I0124, the Samara HG, is of hg. R1b-P297, but uncertain for both R1b-M73 and R1b-M269.
  • I0122, the Khvalynsk chieftain, is of hg. R1b-V1636.
  • I2181, the Smyadovo outlier of poor coverage, is possibly of hg. R, and could be of hg. R1b-M269, but could also be even non-P.
  • I6561 from Alexandria is probably of hg. R1a-M417, likely R1a-Z645, maybe R1a-Z93, but can’t be known beyond that, which is more in line with the TMRCA of R1a subclades and the radiocarbon date of the sample.
  • I2181, the Yamnaya individual (supposedly Pre-R1b-L51) at Lopatino II is R1b-M269, negative for R1b-L51. Nothing beyond that.
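Results phrased as “positive for X, uncertain for Y and Z” follow from a simple idea: report the deepest marker with a derived call, and flag the branches below it that have no usable reads. The following is a minimal sketch of that idea only. It is not how Yleaf works internally, the tree is a heavily simplified slice of R1b with intermediate nodes omitted, and the function names and input format are assumptions made for the example.

```python
# Simplified, illustrative slice of the R1b tree (child -> parent).
# Intermediate nodes are omitted on purpose.
PARENT = {
    "R1b-M343": None,
    "R1b-P297": "R1b-M343",
    "R1b-M73": "R1b-P297",
    "R1b-M269": "R1b-P297",
    "R1b-L23": "R1b-M269",
    "R1b-L51": "R1b-L23",
    "R1b-Z2103": "R1b-L23",
}

def depth(marker):
    """Number of steps from a marker up to the root of this toy tree."""
    steps = 0
    while PARENT[marker] is not None:
        marker = PARENT[marker]
        steps += 1
    return steps

def call_haplogroup(calls):
    """calls maps marker -> 'derived' or 'ancestral'.
    Markers absent from the dict are treated as having no coverage.
    Returns (deepest derived marker, untested child branches)."""
    derived = [m for m, state in calls.items()
               if state == "derived" and m in PARENT]
    if not derived:
        return None, []
    terminal = max(derived, key=depth)
    children = [c for c, p in PARENT.items() if p == terminal]
    uncertain = [c for c in children if c not in calls]
    return terminal, uncertain

# Hypothetical call sets mimicking two of the cases listed above.
samara_like = {"R1b-P297": "derived"}
lopatino_like = {"R1b-M269": "derived", "R1b-L51": "ancestral"}

print(call_haplogroup(samara_like))
print(call_haplogroup(lopatino_like))
```

With poor coverage the outcome depends entirely on which positions happen to be covered, which is why different tools and different people can report different terminal subclades for the same sample.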

You can ask me to try mapping more data or to test the haplogroup of more samples, provided you give me a proper link to the relevant data, they are interesting for the subject of this blog…and I have the time to do it.


Vikings, Vikings, Vikings! “eastern” ancestry in the whole Baltic Iron Age


Open access Population genomics of the Viking world, by Margaryan et al. bioRxiv (2019), with a huge new sampling from the Viking Age.

Interesting excerpts (emphasis mine, modified for clarity):

To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago.

Maps illustrating the following texts have been made based on data from this and other papers:

  • Maps showing ancestry include only data from this preprint (which also includes some samples from Sigtuna).
  • Maps showing haplogroup density include Vikings from other publications, such as those from Sigtuna in Krzewinska et al. (2018), and from Iceland in Ebenesersdóttir et al. (2018).
  • Maps showing haplogroups of ancient DNA samples based on their age include data from all published papers, but with slightly modified locations to avoid overcrowding (randomized distance approx. ± 0.1 long. and lat.).
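The randomized displacement mentioned in the last item can be sketched as follows. This is an assumption about one simple way to do it (independent uniform offsets within ± 0.1 degrees on each axis), not a description of the GIS tool actually used; the sample names and coordinates are hypothetical.

```python
import random

def jitter(lon, lat, max_offset=0.1, rng=None):
    """Return the coordinates displaced by a uniform random offset
    of at most max_offset degrees on each axis."""
    rng = rng or random.Random()
    return (lon + rng.uniform(-max_offset, max_offset),
            lat + rng.uniform(-max_offset, max_offset))

# Two hypothetical samples from the same site, which would otherwise
# be drawn exactly on top of each other.
samples = [("sample_A", 17.6, 59.6), ("sample_B", 17.6, 59.6)]

rng = random.Random(42)  # fixed seed so the map is reproducible
jittered = [(name, *jitter(lon, lat, rng=rng)) for name, lon, lat in samples]

for name, lon, lat in jittered:
    print(f"{name}: {lon:.3f}, {lat:.3f}")
```

Fixing the seed keeps the points in the same place every time a map is regenerated, so that samples do not appear to move between versions.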

Y-DNA haplogroups in Europe during the Viking expansions (full map). See other maps from the Middle Ages.

We find that the transition from the BA to the IA is accompanied by a reduction in Neolithic farmer ancestry, with a corresponding increase in both Steppe-like ancestry and hunter-gatherer ancestry. While most groups show a slight recovery of farmer ancestry during the VA, there is considerable variation in ancestry across Scandinavia. In particular, we observe a wide range of ancestry compositions among individuals from Sweden, with some groups in southern Sweden showing some of the highest farmer ancestry proportions (40% or more in individuals from Malmö, Kärda or Öland).

Ancestry proportions in Norway and Denmark on the other hand appear more uniform. Finally we detect an influx of low levels of “eastern” ancestry starting in the early VA, mostly constrained among groups from eastern and central Sweden as well as some Norwegian groups. Testing of putative source groups for this “eastern” ancestry revealed differing patterns among the Viking Age target groups, with contributions of either East Asian- or Caucasus-related ancestry.

Ancestry proportions of four-way models including additional putative source groups for target groups for which three-way fit was rejected (p ≤ 0.01);

Overall, our findings suggest that the genetic makeup of VA Scandinavia derives from mixtures of three earlier sources: Mesolithic hunter-gatherers, Neolithic farmers, and Bronze Age pastoralists. Intriguingly, our results also indicate ongoing gene flow from the south and east into Iron Age Scandinavia. Thus, these observations are consistent with archaeological claims of wide-ranging demographic turmoil in the aftermath of the Roman Empire with consequences for the Scandinavian populations during the late Iron Age.

Genetic structure within Viking-Age Scandinavia

We find that VA Scandinavians on average cluster into three groups according to their geographic origin, shifted towards their respective present-day counterparts in Denmark, Sweden and Norway. Closer inspection of the distributions for the different groups reveals additional complexity in their genetic structure.

Natural neighbor interpolation of “Danish ancestry” among Vikings.

We find that the ‘Norwegian’ cluster includes Norwegian IA individuals, who are distinct from both Swedish and Danish IA individuals which cluster together with the majority of central and eastern Swedish VA individuals. Many individuals from southwestern Sweden (e.g. Skara) cluster with Danish present-day individuals from the eastern islands (Funen, Zealand), skewing towards the ‘Swedish’ cluster with respect to early and more western Danish VA individuals (Jutland).

Some individuals have strong affinity with Eastern Europeans, particularly those from the island of Gotland in eastern Sweden. The latter likely reflects individuals with Baltic ancestry, as clustering with Baltic BA individuals is evident in the IBS-UMAP analysis and through f4-statistics.

Natural neighbor interpolation of “Norwegian ancestry” among Vikings.

For more on this influx of “eastern” ancestry see my previous posts (including Viking samples from Sigtuna) on Genetic and linguistic continuity in the East Baltic, and on the Pre-Proto-Germanic homeland based on hydrotoponymy.

Baltic ancestry in Gotland

Genetic clustering using IBS-UMAP suggested genetic affinities of some Viking Age individuals with Bronze Age individuals from the Baltic. To further test these, we quantified excess allele sharing of Viking Age individuals with Baltic BA compared to early Viking Age individuals from Salme using f4 statistics. We find that many individuals from the island of Gotland share a significant excess of alleles with Baltic BA, consistent with other evidence of this site being a trading post with contacts across the Baltic Sea.
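For readers unfamiliar with the statistic used in this excerpt, f4(A, B; C, D) is the average over SNPs of (a − b)(c − d), where a, b, c, d are the allele frequencies in the four populations. A minimal sketch with made-up frequencies follows. The arrangement f4(Outgroup, Baltic BA; Salme, X) is only one plausible configuration for this kind of test and is an assumption here, since the excerpt does not spell it out. In practice, significance is assessed with a block jackknife over the genome, which is omitted.

```python
def f4(pa, pb, pc, pd):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d),
    given per-SNP allele frequencies for the four populations."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    return sum(terms) / len(terms)

# Invented allele frequencies at four SNPs, for illustration only.
outgroup  = [0.10, 0.50, 0.90, 0.30]
baltic_ba = [0.40, 0.20, 0.60, 0.70]
salme     = [0.20, 0.40, 0.80, 0.40]
gotland   = [0.35, 0.25, 0.65, 0.60]

# With an outgroup in the first position, a positive value indicates
# that the fourth population shares more alleles with the second one
# (here Baltic BA) than the third population (here Salme) does.
value = f4(outgroup, baltic_ba, salme, gotland)
print(value > 0)
```

Swapping the last two populations flips the sign of the statistic, so the order of the arguments has to be stated whenever such values are reported.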

Natural neighbor interpolation of “Finnish ancestry” among Vikings.

The earliest N1a-VL29 sample available comes from Iron Age Gotland (VK579) ca. AD 200-400 (see Iron Age Y-DNA maps), which also proves its presence in the western Baltic before the Viking expansion. The distribution of N1a-VL29 and R1a-Z280 (compared to R1a in general) among Vikings also supports a likely expansion of both lineages in succeeding waves from the east with Akozino warrior-traders, at the same time as they expanded into the Gulf of Finland.

Density of haplogroup R1a-Z280 (samples in pink) overlaid over other R1a samples (in green, with R1a-Z284 in cyan) among Vikings.

Vikings in Estonia

(…) only one Viking raiding or diplomatic expedition has left direct archaeological traces, at Salme in Estonia, where 41 Swedish Vikings who died violently were buried in two boats accompanied by high-status weaponry. Importantly, the Salme boat-burial predates the first textually documented raid (in Lindisfarne in 793) by nearly half a century. Comparing the genomes of 34 individuals from the Salme burial using kinship analyses, we find that these elite warriors included four brothers buried side by side and a 3rd degree relative of one of the four brothers. In addition, members of the Salme group had very similar ancestry profiles, in comparison to the profiles of other Viking burials. This suggests that this raid was conducted by genetically homogeneous people of high status, including close kin. Isotope analyses indicate that the crew descended from the Mälaren area in Eastern Sweden thus confirming that the Baltic-Mid-Swedish interaction took place early in the VA.

Natural neighbor interpolation of “Swedish ancestry” among Vikings.

Viking samples from Estonia thus represent ancient Swedes from the Mälaren area, which proves once again that hg. N1a-VL29 (especially subclade N1a-L550) and tiny proportions of so-called “Siberian ancestry” expanded during the Early Iron Age into the whole Baltic Sea area, not only into Estonia, and evidently did not spread with Balto-Finnic languages (since the language influence runs in the opposite direction, from west to east, Germanic > Finno-Samic, during the Bronze Age).

N1a-VL29 lineages spread again later eastwards with Varangians, from Sweden into north-eastern Europe, most likely including the ancestors of the Rurikid dynasty. Unsurprisingly, the arrival of Vikings with Swedish ancestry into the East Baltic and their dispersal through the forest zone didn’t cause a language shift of Balto-Finnic, Mordvinic, or East Slavic speakers to Old Norse, either…

NOTE. For N1a-Y4339 – N1a-L550 subclade of Swedish origin – as main haplogroup of modern descendants of Rurikid princes, see Volkov & Seslavin (2019) – full text in comments below. Data from ancient samples show varied paternal lineages even among early rulers traditionally linked to Rurik’s line, which explains some of the discrepancies found among modern descendants:

  • A sample from Chernihiv (VK542) potentially belonging to Gleb Svyatoslavich, the 11th century prince of Tmutarakan/Novgorod, belongs to hg. I2a-Y3120 (a subclade of early Slavic I2a-CTS10228) and has 71% “Modern Polish” ancestry (see below).
  • Izyaslav Ingvarevych, the 13th century prince of Dorogobuzh, Principality of Volhynia/Galicia, is probably behind a sample from Lutsk (VK541), and belongs to hg. R1a-L1029 (a subclade of R1a-M458), showing ca. 95% of “Modern Polish” ancestry.
  • Yaroslav Osmomysl, the 12th century Prince of Halych (now in Western Ukraine), was probably of hg. E1b-V13, yet another clearly early Slavic haplogroup.

Density of haplogroup N1a-VL29, N1a-L550 (samples in pink, most not visible) among Vikings. Samples of hg. R1b in blue, hg. R1a in green, hg. I in orange.

Finnish ancestry

Firstly, modern Finnish individuals are not like ancient Finnish individuals, modern individuals have ancestry of a population not in the reference; most likely Steppe/Russian ancestry, as Chinese are in the reference and do not share this direction. Ancient Swedes and Norwegians are more extreme than modern individuals in PC2 and 4. Ancient UK individuals were more extreme than Modern UK individuals in PC3 and 4. Ancient Danish individuals look rather similar to modern individuals from all over Scandinavia. By using a supervised ancient panel, we have removed recent drift from the signal, which would have affected modern Scandinavians and Finnish populations especially. This is in general a desirable feature but it is important to check that it has not affected inference.

PCA of the ancient and modern samples using the ancient palette, showing different PCs. Modern individuals are grey and the K=7 ancient panel surrogate populations are shown in strong colors, whilst the remaining M-K=7 ancient populations are shown in faded colors.

The story for Modern-vs-ancient Finnish ancestry is consistent, with ancient Finns looking much less extreme than the moderns. Conversely, ancient Norwegians look like less-drifted modern Norwegians; the Danish admixture seen through the use of ancient DNA is hard to detect because of the extreme drift within Norway that has occurred since the admixture event. PC4 vs PC5 is the most important plot for the ancient DNA story: Sweden and the UK (along with Poland, Italy and to an extent also Norway) are visibly extremes of a distribution the same “genes-mirror-geography” that was seen in the Ancient-palette analysis. PC1 vs PC2 tells the same story – and stronger, since this is a high variance-explained PC – for the UK, Poland and Italy.

Uniform manifold approximation and projection (UMAP) analysis of the VA and other ancient samples.

Evidence for Pictish Genomes

The four ancient genomes of Orkney individuals with little Scandinavian ancestry may be the first ones of Pictish people published to date. Yet a similar (>80% “UK” ancestry) individual was found in Ireland (VK545) and five in Scandinavia, implying that Pictish populations were integrated into Scandinavian culture by the Viking Age.

Our interpretation for the Orkney samples can be summarised as follows. Firstly, they represent “native British” ancestry, rather than an unusual type of Scandinavian ancestry. Secondly, that this “British” ancestry was found in Britain before the Anglo-Saxon migrations. Finally, that in Orkney, these individuals would have descended from Pictish populations.

Natural neighbor interpolation of “British ancestry” among Vikings.

(…) ‘UK’ represents a group from which modern British and Irish people all receive an ancestry component. This information together implies that within the sampling frame of our data, they are proxying the ‘Briton’ component in UK ancestry; that is, a pre-Roman genetic component present across the UK. Given they were found in Orkney, this makes it very likely that they were descended from a Pictish population.

Modern genetic variation within the UK sees variation between ‘native Briton’ populations Wales, Scotland, Cornwall and Ireland as large compared to that within the more ‘Anglo-Saxon’ English. This is despite subsequent gene flow into those populations from English-like populations. We have not attempted to disentangle modern genetic drift from historically distinct populations. Roman-era period people in England, Wales, Ireland and Scotland may not have been genetically close to these Orkney individuals, but our results show that they have a shared genetic component as they represent the same direction of variation.

Density of haplogroup R1b-L21 (samples in red), overlaid over all samples of hg. R1b among Vikings (R1b-U106 in green, other R1b-L151 in deep red). To these samples one may add the one from Janakkala in south-western Finland (AD ca. 1300), of hg. R1b-L21, possibly related to these population movements.

For more on Gaelic ancestry and lineages likely representing slaves among early Icelanders, see Ebenesersdóttir et al. (2018).


As in the case of mitochondrial DNA, the overall distribution profile of the Y chromosomal haplogroups in the Viking Age samples was similar to that of the modern North European populations. The most frequently encountered male lineages were the haplogroups I1, R1b and R1a.

Haplogroup I (I1, I2)

The distribution of I1 in southern Scandinavia, including a sample from Zealand (VK532) ca. AD 100 (see Iron Age Y-DNA maps), proves that it had become integrated into the West Germanic population already before their expansions, something that we already suspected thanks to the sampling of Germanic tribes.

Density of haplogroup I (samples in orange) among Vikings. Samples of hg. R1b in blue, hg. R1a in green, N1a in pink.
Density of haplogroup I1 (samples in red) overlaid over all samples of hg. I among Vikings.

Haplogroup R1b (M269, U106, P312)

Especially interesting is the finding of R1b-L151 widely distributed in the historical Nordic Bronze Age region, which is in line with the estimated TMRCA for R1b-P312 subclades found in Scandinavia, despite the known bottleneck among Germanic peoples under U106. Particularly telling in this regard is the finding of rare haplogroups R1b-DF19, R1b-L238, or R1b-S1194. All of that points to the impact of Bell Beaker-derived peoples during the Dagger period, when Pre-Proto-Germanic expanded into Scandinavia.

Also interesting is the finding of hg. R1b-P297 in Troms, Norway (VK531) ca. 2400 BC. R1b-P297 subclades might have expanded to the north through Finland with post-Swiderian Mesolithic groups (read more about Scandinavian hunter-gatherers), and the ancestry of this sample points to that origin.

However, it is also known that ancestry might change within a few generations of admixture, and that the transformation brought about by Bell Beakers with the Dagger Period probably reached Troms, so this could also be an R1b-M269 subclade. In fact, the few data available for this sample show that it comes from the natural harbour of Skarsvågen, at the NW end of the island of Senja, and that the archaeologist who reported it thought it was from the Viking period or slightly earlier, based on the grave form. From Prescott (2017):

In 1995, Prescott and Walderhaug tentatively argued that a dramatic transformation took place in Norway around the Late Neolithic (2350 BCE), and that the swift nature of this transition was tied to the initial Indo-Europeanization of southern and coastal Norway, at least to Trøndelag and perhaps as far north as Troms. (…)

The Bell Beaker/early Late Neolithic, however, represents a source and beginning of these institutions and practices, exhibits continuity to the following metal age periods and integrated most of Northern Europe’s Nordic region into a set of interaction fields. This happened around 2400 BCE, at the MNB to LN transition.

NOTE. This particular sample is not included in the maps of Viking haplogroups.

Density of haplogroup R1b (samples in blue) among Vikings. Samples of hg. I in orange, hg. R1a in green, N1a in pink.
Density of haplogroup R1b-U106 (samples in green) overlaid over all samples of hg. R1b (other R1b-L23 samples in red) among Vikings.
Density of R1b-L151 (xR1b-U106) (samples in deep red) overlaid over all samples of hg. R1b (R1b-U106 in green, other R1b-M269 in blue) among Vikings.

Haplogroup R1a (M417, Z284)

The distribution of hg. R1a-M417, in combination with data on West Germanic peoples, shows that it was mostly limited to Scandinavia, similar to the distribution of I1. In fact, taking into account the distribution of R1a-Z284 in particular, it seems even more isolated, which is compatible with the limited impact of Corded Ware in Denmark or the Northern European Plain, and the likely origin of R1a-Z284 in the expansion with Battle Axe from the Gulf of Finland. The distribution of R1a-Z280 (see map above) is particularly telling, with a distribution around the Baltic Sea mostly coincident with that of N1a.

Density of haplogroup R1a (samples in green) among Vikings. Samples of hg. R1b in blue, of hg. I in orange, N1a in pink.
Density of haplogroup R1a-Z284 (samples in cyan) overlaid over all samples of hg. R1a (in green, with R1a-Z280 in pink) among Vikings.

Other haplogroups

Among the ancient samples, two individuals’ derived haplogroups were identified as E1b1b1-M35.1, which are frequently encountered in modern southern Europe, Middle East and North Africa. Interestingly, the individuals carrying these haplogroups had much less Scandinavian ancestry compared to most samples inferred from haplotype based analysis. A similar pattern was also observed for less frequent haplogroups in our ancient dataset, such as G (n=3), J (n=3) and T (n=2), indicating a possible non-Scandinavian male genetic component in the Viking Age Northern Europe. Interestingly, individuals carrying these haplogroups were from the later Viking Age (10th century and younger), which might indicate some male gene influx into the Viking population during the Viking period.

Natural neighbor interpolation of “Italian ancestry” among Vikings.

As the paper says, the small sample size of rare haplogroups cannot distinguish if these differences are statistically relevant. Nevertheless, both E1b samples have substantial Modern Polish-like ancestry: one sample from Gotland (VK474), of hg. E1b-L791, has ca. 99% “Polish” ancestry, while the other one from Denmark (VK362), of hg. E1b-V13, has ca. 35% “Polish”, ca. 35% “Italian”, as well as some “Danish” (14%) and minor “British” and “Finnish” ancestry.

Given the E1b-V13 samples of likely Central-East European origin among Lombards, Visigoths, and especially among Early Slavs, and the distribution of “Polish” ancestry among Viking samples, VK362 is probably a close description of the typical ancestry of early Slavs. The peak of Modern Polish-like ancestry around the Upper Pripyat during the (late) Viking Age suggests that Poles (like East Slavs) have probably mixed since the 10th century with more eastern peoples close to north-eastern Europeans, derived from ancient Finno-Ugrians:

Natural neighbor interpolation of “Polish ancestry” among Vikings.

Similarly, the finding of R1a-M458 among Vikings in Funen, Denmark (VK139), in Lutsk, Ukraine (VK541), and in Kurevanikha, Russia (VK160), apart from the early Slav from Usedom, may attest to the origin of the spread of this haplogroup in the western Baltic after the Bell Beaker expansion, once integrated in both Germanic and Balto-Slavic populations, as well as intermediate Bronze Age peoples that were eventually absorbed by their expansions. This contradicts, again, my simplistic initial assessment of R1a-M458 expansion as linked exclusively (or even mainly) to Balto-Slavs.

Y-DNA haplogroups in Europe during Antiquity (full map). See other maps of cultures and ancient DNA from Antiquity.


European hydrotoponymy (VI): the British Isles and non-Indo-Europeans


The nature of the prehistoric languages of the British Isles is particularly difficult to address: because of the lack of ancient data from certain territories; because of the traditional interpretation of Old European names simply as “Celtic”; and because Vennemann’s re-labelling of the Old European hydrotoponymy as non-Indo-European has helped distract the focus away from the real non-Indo-European substrate on the islands.

Alteuropäisch and Celtic

An interesting summary of hydronymy in the British Isles was already offered long ago, in British and European River-Names, by Kitson, Transactions of the Philological Society (1996) 94(2):73-118. In it, he discusses, among others:

  • Non-serial hydronyms: Drua-/Drav-/Dru-, from drew- sometimes reshaped as derw-; ab-; ag-; al-; alb-; alm-; am-; antjā-; arg-; aw-; dan-; eis-; el-/ol-; er-/or-; kar(r)a-, ker-; nebh-; ned-; n(e)id-; sal-; wig-; weis-/wis-; ur-, wer-; etc.
  • Serial elements: -went-, -m(e)no-, -nt-o-, -n-; -nā-, -tā-; -st-, -r-; etc.

Probably non-Celtic suffixes are found e.g. in Tamesis, paralleled in the Spey Tuesis, and also in Tweed (<*Twesetā?); -no-/-nā- is also particularly frequent in Scottish river-names, but not in English ones. Another interesting case is the reversed relative order of suffixes, -r-st- instead of -st-r-.

Most if not all of them can be explained as of Old European nature. I will leave aside the discussion of particular formations – most of which may be found repeated, complemented, and updated in more modern texts.

Hydronyms ub-, ob-. Another Western European river name.

Bell Beakers as Old Europeans

(…) Bell-beakers are in fact the only archaeological phenomenon of any period of prehistory with a comparably wide spread to that of river-names in the western half of Europe. The presumption must I think be that Beaker Folk were the vector of alteuropäisch river-names to most of western Europe. Rivers in the base Arg-, which we have seen there is cause to think was not already in use at the earliest stage of the river-naming system, and which therefore should be associated with such a vector if one existed, fit their distribution exceptionally well.

That they were a single-speech community can be asserted more confidently of the Beaker Folk than of most archaeologically identified groups for the very reasons that have caused archaeologists difficulty in interpreting them. As McEvedy (1967:28) put it, ‘the bell-beaker folk march convincingly in every prehistorian’s text, but they do so from Spain to Germany in some and from Germany to Spain in others, while lately there has been a tendency to make them go from Spain to Germany and back again (primary and reflux movements)’. One ‘firm datum seems to be that the British beaker folk came from the Rhine-Elbe region.’

This confirms what the long chronology now indicated for Common Indo-European would suggest anyway, and what to me, as remarked above, the rareness of non-Indo-European names in England suggests, that the old dissenting minority of Celticists were right to see the arrival in Britain of Indo-Europeans, as evinced in river-names whether or not in ethnic proto-Celts, as early as the third millennium. McEvedy’s map of Beaker Folk identifies them linguistically with Celto-Ligurians, but in that his admirably tidy mind was, typically, a degree too tidy. Considerations of phonology indicate that more than one linguistic group was involved.

It is normal in reconstructed Indo-European for groups of related words not all to have the same vowel in the root syllable. The commonest vowel gradation is between e, o, and zero; (…) Language-groups that level short a and o include Germanic and Baltic, Slavonic, Illyrian, Hittite and Indo-Iranian; but Celtic and Italic like Greek and Armenian preserve the original distinction. It follows that Celts speaking normal Celtic sounds cannot have been wholly responsible for bringing alteuropäisch river-names to any area. It would seem to follow, as Professor Nicolaisen has consistently urged, that in Spain, Gaul, Britain, and Italy, where the only historically known early Indo-Europeans were speakers of non-levelling languages, they were preceded by speakers of levelling languages not historically known. This hypothesis, pretty well required by the linguistic evidence, finds so good an archaeological correlate in the Beaker People that I think it would now be flying in the face of the evidence not to accept those as bearers of the river-names to these countries.

Bell Beaker Civilization (CAD O. Lemercier).

The funny note is the rejection of the steppe homeland by Kitson in favour of Central European Neolithic cultures, due in part to the ‘impossibility’ of proto-Finnic loans from East Indo-European, if Proto-Indo-European was spoken in the steppe. As I said recently, the lack of knowledge of Uralic languages and Indo-European – Uralic contacts has clearly conditioned the Urheimat question for both Proto-Indo-European and Proto-Uralic researchers.

On the other hand, the identification of Bell Beakers with Old Europeans was not something new. Already in the 1950s Hugh Hencken talked about this, and J.P. Mallory (who described Bell Beakers more exactly in 2013 as North-West Indo-Europeans) is sure that this idea had been used even before the 1950s.

The question is, though, whether the reasoning of those researchers was detailed enough to be considered a modern approach to the question, because Krahe in the 1940s seems to offer the first reliable data to make that assumption. In any case, Gimbutas’ idea of Kurgan warriors imposing Indo-European languages everywhere, so over-represented in Encyclopedia-like texts since the end of the 1990s, was not the only, and probably never the main hypothesis among many Indo-Europeanists.

Celts part of Bell Beakers?

Regarding Koch and Cunliffe’s revival of the autochthonous Celts idea, one can find a similar traditional view among British researchers of the early and mid-20th century – and a proper rejection based on hydrotoponymy. It seems that many fringe theories in Indo-European studies, from Nordic or Baltic homelands to autochthonous Celts to the Europa Vasconica, can be traced back to revivalist waves of romantic views of the 19th c.:

What the late Professor C. F. C. Hawkes called in British archaeology ‘cumulative Celticity’, built up by successions of comparatively small tribal migrations, will then have operated on the linguistic side as well. That the predecessors of the Celts proper for so long had in most of Britain been people of similar Indo-European speech explains why there is not a significant survival of recognizably non-Indo-European river-names, and why the few serious candidates for non-Indo-European among recorded place-names all seem to be in Scotland. That the river-names kept their north European non-Celtic phonology will be because the Celts proper took them over as names, with denotative not fully lexical meaning. (…)

(…) I think non-Celtic Indo-European-speakers are likely to have been involved in fact, whether or not they are the whole story, both because that is the hypothesis which makes best sense of the archaeological evidence (…)

(…) because it is widely accepted that placenames in the Low Countries imply the existence of at least one group of not historically attested Indo-European-speakers, not the same as the ones we are concerned with. So do names in Spain, another country where the only historically attested early Indo-Europeans were Celtic. Comparing Spanish alteuropäisch names with British ones gives a glimpse of the dialectal range that must have characterized the Beaker phenomenon. Either group shares one feature with historical Celtic that the other lacks. The Spanish names like Celtic proper mostly keep Indo-European o. There the diagnostic feature is initial p (Schmoll 1959:93, 78-80; Rodriguez 1980), lost from Celtic and the alteuropäisch of Britain.

Interesting is also the early reaction against Vennemann’s much publicized interpretation of Krahe’s Old European as ‘Vasconic’. This is a useful comment which is still applicable to the same non-existent ‘problem’ found by some Indo-Europeanists, depending on their ideas about Indo-European dialectalization:

It is again naughty of Vennemann (1994:244) to call his laryngealist explanation ‘the only kind of explanation that I know’. At least he does not quite go so far in his laryngealism as to posit a proto-Indo-European in which the vowel a never existed, as Kuiper does.

NOTE. It is difficult to understand why the work of so many Indo-Europeanists is usually not known, while Vennemann’s far-fetched theory has been endlessly repeated. I reckon it must be the same phenomenon of personal and professional contacts, involvement in editorial decisions, and simplification in mass media which makes Kristiansen and his theories frequently published and cited nowadays.



Based on these data, I entertained the idea of arguing for a Pre-Celtic Indo-European language in A Storm of Words, called Pre-Pritenic, with a tentative fable based on the data described below for the Insular Celtic substrate, but eventually deleted the whole text, because (unlike other tentative fables, like the Lusitanian or Venetic ones) it was pure speculation with not even fragmentary data to rely on. Here is a fragment of the discussion:

Among the main reasons adduced to reject the non-Celtic nature of Pritenic is Orkney, a region where Pictish carved stones have been found (indicator of a centralised Pictish power and identity). The name was attested first as Gk. Orkas / Orkádos (secondary source, from Pytheas of Massilia, ca. 322-285 BC, or possibly much later) and Lat. Orchades / Orcades (by Latin sources in the 1st century AD), and it was used to describe the northernmost promontory in Scotland, commonly identified as Dunnet Head in Caithness. It is supposed to derive its name from Cel. *φorko- ‘pig’, because speakers of Old Irish interpreted the name for the island later as Insi Orc ‘island of the pigs’. Therefore, Pritenic would have undergone the prototypical Common Celtic evolution of NWIE *p- → Ø- (see above).

This argument is flawed, insofar as the Celtic interpretation of the name may be the same kind of reinterpretation later made by Norwegian settlers, who read the name according to Old Norse orkn ‘seal’ and identified it as ‘island of the seals’. In fact, texts published in the 19th and 20th centuries looked for an etymology even closer to the interpreter, who usually saw it as ‘island of the orcas’.

The region name orc- could be speculatively linked to NWIE *ork-i- ‘cut off, divide’, cf. Ita. *erk-i- (vowel analogically changed), Hitt. ārk- (<*hork-ei-), in Latin found with the meaning ‘divide (an inheritance)’, hence noun Lat. erctum ‘inheritance, inherited part’.

Maybe more interesting is a connection to *or-, as found in British rivers or streams Arrow, Oare Water (Som), Ayre, Armet Water, Arnot Burn, Ernan Water etc. for which cognates Skt. arvan(t)- ‘running, swift’, árṇa- ‘surging’, Gmc. *arnia- ‘lively, energetic’ have been proposed (Forster 1941; Nicolaisen 1976; Kitson 1996). Similar to these derivatives in -n-, -m-, one could argue for a denominative suffixation in *-ko-, not uncommon in Old European toponyms (Villar Liébana 2007), which could be interpreted originally as ‘(region) pertaining to the Or (river, stream)’. The a-vocalism of Old European does not need further explanation, being fairly common in the British Isles (Kitson 1996).

I tried to look for rivers and streams in Caithness that fit a potential border for an ancestral tribe, but after reading many (and I really mean too many) texts on Scotland’s hydronymy, which is a quite well-researched area, I didn’t like the idea of plunging into such a speculative task; not when I have this blog for that… I deleted the text from the book, seeing how it doesn’t really add anything of value and may have distracted from its real aim. If any reader wants to post potential candidates for this delimiting river ‘Or’ in Caithness, feel free to post that below.

Or- hydronyms, mapped by Villar (2007). He considered it a variant of ur-, uro-, and only included one certain occurrence from old river-names in southern England.

Weak (if any) support of a non-Celtic nature of the names might also be found in the late description of Ptolemy’s Geographia (originally ca. 150 AD), Tauroedoúnou tēs kai Orkádos kaloumenēs, translated in Latin as Tarved(r)um, quod et Orcas promontorium dicitur. The original name seems to be formed from *tau-r-, as is common in Indo-European *taur-o- (compare also river Taum), whereas the commonly used Latin translation seems to rely on a Celtic *tarw-o-.

Always Celtic?

As with other Pictish material, these questions are unlikely to be settled without unequivocal sources pointing to the original names and their meaning. The autochthonous trend is set lately by Guto Rhys, whose work is thorough and methodologically sound, although his reviews tend to dismiss all evidence of a non-Celtic (or even non-Brittonic) layer in Pictland as described in previous works, mostly because of the lack of direct sources or uncontroverted data:

Where a supposed divergence is found in certain names, a lack of proper reading or interpretation of materials (or lack of enough cases to generalize them), combined with similar names in other (neighbouring or distant) Celtic languages, is adduced.

However, the same arguments can indeed be used to reject his proposal of a Celtic nature of many names which cannot be simply explained with other clearly Celtic examples: namely, that all similarities are due to later influences, re-analysis and modifications of Old European terms according to Celtic phonemic (or etymological) patterns, or that the Brittonic nature of many names is due to convergence of the attested Pritenic naming conventions with neighbouring dialects.

In the end, the only conclusion is that there is a clear impasse in hydrotoponymic research in the British Isles, particularly in Scotland, with an impossibility of describing non-Celtic or non-Indo-European Pre-Pritenic layers, due in great part – in my opinion – to the trend among many British Celticists to consider Celtic as autochthonous to the Atlantic. This hinders the proper investigation of the question, just like the trend among Basque studies to consider the western Pyrenees as the eternal Vasconic homeland hinders a fair investigation of the actual Vasconic proto-history.

Probable maximum extent of Pictland is also highlighted in blue and overlain on the modern outline of northern Britain. Image from Noble, G., Goldberg, M., & Hamilton, D. (2018).

Non-Indo-Europeans in Northern Europe

Insular Celtic substrate

Matasović, a specialist in Celtic languages and author of the famous Etymological Dictionary of Proto-Celtic (IEED 2009), writes in The substratum in Insular Celtic (2009):

Syntactic evidence

The syntactic parallels between Insular Celtic and Afro-Asiatic languages (which used to be called Hamito-Semitic) were noted more than a century ago by Morris-Jones (1899), and subsequently discussed by a number of scholars. These parallels include the following.

  1. The VSO order, attested both in OIr. and in Brythonic from the earliest documents (…).
  2. The existence of special relative forms of the verb, (…).
  3. The existence of prepositions inflected for person (or prepositional pronouns), (…).
  4. Prepositional progressive verbal forms, (…).
  5. The existence of the opposition between the “absolute” and “conjunct” verbal forms. (…)

The aforementioned features of Old Irish and Insular Celtic syntax (and a few others) are all found in Afro-Asiatic languages, often in several branches of that family, but usually in Berber and Ancient Egyptian (see e.g. Isaac 2001, 2007a).

Orin Gensler, in his unpublished dissertation (1993) applied refined statistical methods showing that the syntactic parallels between Insular Celtic and Afro-Asiatic cannot be attributed to chance. The crucial point is that these parallels include features that are otherwise rare cross-linguistically, but co-occur precisely in those two groups of languages. This more or less amounts to a proof that there was some connection between Insular Celtic and Afro-Asiatic at some stage in prehistory, but the exact nature of that connection is still open to speculation.

“Atlantic” typology

Insular Celtic also shares a number of areal isoglosses with languages of Western Africa, sometimes also with Basque, which shows that the Insular Celtic — Afroasiatic parallels should be viewed in light of the larger framework of prehistoric areal convergences in Western Europe and NW Africa.

The text goes on with typologically rare features found in West Europe and West Africa, such as the inter-dental fricative /þ/ (also in English, Icelandic, Castilian Spanish); initial consonant mutations/regular alterations of initial consonants caused by the grammatical category of the preceding word; the common order demonstrative-noun (within the NP) reversed; the vigesimal counting system; or use of demonstrative articles.

Lexical evidence

(…) only 38 words shared by Brythonic and Goidelic without any plausible IE etymology. These words belong to the semantic fields that are usually prone to borrowing, including words referring to animals (…), plants (…), and elements of the physical world (…). Note that cognates of these words may be unattested in Gaulish and Celtiberian because these languages are poorly attested, so that the actual number of exclusive loanwords from substratum language(s) in Insular Celtic is probably even lower. In my opinion it is not higher than 1% of the vocabulary. The large majority of substratum words in Irish and Welsh (and, generally, in Goidelic and Brythonic) is not shared by these two languages, which probably means that the sources were different substrates of, respectively, Ireland and Britain; (…)


The thesis that Insular Celtic languages were subject to strong influences from an unknown, presumably non-Indo-European substratum, hardly needs to be argued for. However, the available evidence is consistent with several different hypotheses regarding the areal and genetic affiliation of this substratum, or, more probably, substrata. The syntactic parallels between the Insular Celtic and Afro-Asiatic languages are probably not accidental, but they should not be taken to mean that the pre-Celtic substratum of Britain and Ireland belonged to the Afro-Asiatic stock. It is also possible that it was a language, or a group of languages (not necessarily related), that belonged to the same macro-area as the Afro-Asiatic languages of North Africa. The parallels between Insular Celtic, Basque, and the Atlantic languages of the Niger-Congo family, presented in the second part of this paper, are consistent with the hypothesis that there was a large linguistic macro-area, encompassing parts of NW Africa, as well as large parts of Western Europe, before the arrival of the speakers of Indo-European, including Celtic.

Map of tuk-, tok-, tuch-, tug- (with India). Interestingly, the language of the -tuk- represents a more recent layer in Iberia than the earlier Old European serial elements, pointing to a West European expansion from the north, although it may have an Indo-European etymology.

Language of the geminates

Further evidence of the potential presence of non-Indo-European speakers at the arrival of Insular Celtic may be found in Schrijver’s Non-Indo-European surviving in Ireland in the first millennium AD (2000), and his less enthusiastic revision More on Non-Indo-European surviving in Ireland in the first millennium AD (2005). Both are referenced and adequately summarized by Matasović (2009).

Even more interesting than the discussion of potential non-Indo-Europeans still lingering in Ireland until well into the Common Era is the discussion in his paper Lost Languages in Northern Europe (2001). Apart from other non-Indo-European borrowings in northern Europe, most of which must clearly be included within the European agricultural substrate, Schrijver tries to interpret the relative chronology of a substratum language of northern Europe, described by Kuiper (1995) as A2, and by Schrijver as “language of geminates”.

This substrate language is heavily present in Germanic (see e.g. Boutkan 1998), but also in Celtic and Balto-Slavic:

A highly characteristic feature of words deriving from this language is the variation of the final root consonant, which may be single or double, voiced or voiceless, and prenasalized. (…)

Incidentally, the language of geminates cannot be Uralic, as another of its characteristics is the frequent occurrence of word-initial *kn- and *kl-, and Uralic languages do not allow consonant clusters at the beginning of the word. On the other hand, and at the risk of explaining obscura per obscuriora, one might consider the possibility that the consonant gradation of Lappish and Baltic Finnic is somehow connected with the alternation of consonants at the end of the first syllable in the “language of geminates”.

The idea that the Northern European language of geminates could play an intermediary role in loan contacts between Northern and Western Indo-European on the one hand and Finno-Ugric on the other may also account for the fact that Finno-Ugric words could end up as far away as Celtic, which as far as we know was never in direct contact with a branch of Uralic.

Schrijver later changed his view about certain aspects of this substrate, from a “language of geminates” influencing Balto-Finnic which in turn influenced Germanic, to Pre-Balto-Finnic speakers being the substrate of Germanic, and both evolving at the same time in contact in Scandinavia. In fact, we know that Pre-Proto-Germanic evolved in southern Scandinavia, with a core in Jutland that shifted to the south, so the location must have been close to the North European Plain.

Also fitting this model is the substrate behind Balto-Slavic (spoken in the West Baltic), which must have also been (Para-)Balto-Finnic. However, the frequent word-initial *kn- and *kl- and the loanwords appearing in the Celtic homeland (also including Early Balto-Finnic) must place this Uralic (± non-Indo-European) language contact also well into Central European Corded Ware groups.

Similarly, one of the shared features between Finnic and Mordvinic is precisely the presence of certain geminated consonants. A revision of the data in combination with these facts should shift the weight of evidence towards a (Para-)Balto-Finnic-speaking Baltic area during the Early Bronze Age, certainly encompassing the Battle Axe culture.

Corded Ware groups. “Classical funerary practices” found regularly, in darker shades. Modified from Furholt (2014).

Afroasiatic-like substrate and Vasconic

The only archaeological culture that could fit most of these data, in the currently known relative chronological time frame, would be the Megalithic expansion in Western Europe, or potentially (maybe in addition to this early layer) the expansion of the Proto-Beaker package, which could have spread a Basque-Iberian language (see e.g. my take on Basque-Iberians).

Whether the language behind the Insular Celtic substrate (or, rather, some of its dialects) had true Afroasiatic syntactic features or it was just a language with features which happened to be similar to Afroasiatic is irrelevant. It’s impossible to reconstruct with confidence a Pre-Proto-Basque language with the currently available information.

Interestingly, it is possible to argue for an Afroasiatic branch surviving among hunter-gatherers adopting Neolithic traits in northern Europe. This has been proposed many times in the past, and one could argue in palaeogenomics for indirect supporting data, such as the expansion of WHG ancestry from south-east Europe (close to Anatolia, forming a cline with AME), and in particular of hg. R1b-V88 from the steppe – potentially associated with the Afroasiatic expansion into Africa from a Nostratic community.

NOTE. I will not resort here to typologically-based arguments similar to the “Hamito-Semit(id)ic” and “Vasconic-Uralic” Europe that were commonly in use in the 1990s, because they are in great part based on the mere re-labelling of Old European layers as “Vasconic” and flawed mass lexical/grammatical comparisons. For linguists favourable to this kind of reasoning, the theory set forth here is probably easier, though, as it will be for those supporting a Neolithic expansion of Indo-European from the Mediterranean. This, however, has its own set of problems, as I have already discussed.

Distribution of megaliths in western, central, and northern Europe (after Muller 2006; graphic: Holger Dieterich).

Single Grave culture

The non-Indo-European substrate of Insular Celtic, in combination with the oldest hydrotoponymic layers – almost exclusively of Old European nature – of Britain and likely all of Ireland, can more easily be explained as a first layer of North-West Indo-European speakers heavily influenced by an Afroasiatic(-like) substrate reaching the British Isles, possibly with a slightly richer set of non-Indo-European loanwords at the time. Their language would have been later replaced by the closely related Celtic dialects imposed by elites in the Early Iron Age, which could have then easily absorbed this (mainly syntactic) substrate.

There is little space to argue for a hypothetical non-Indo-European expansion from another region, or for an in situ substrate, due to:

  1. the radical population replacement (and Y-chromosome bottleneck) in Britain and northern Ireland stemming from the Lower Rhine;
  2. the lack of meaningful population movements during the Bronze Age (at least from outside the islands);
  3. the final east-west movements of Celtic languages;
  4. the presence of the same (mainly syntactic) substrate in both Goidelic and Brittonic; and
  5. the minimal non-Indo-European lexical borrowings and hydrotoponymy, different in each island.

Based on archaeological and palaeogenomic data, the only reasonable direct connection of north-western Bell Beakers and this substrate language would be then the Corded Ware groups from north-western Europe – i.e. the traditionally named Single Grave culture from northern Germany and Denmark, and the Protruding Foot Beaker culture from the Netherlands.

The main reasons for this are as follows:

1. Early Corded Ware wave

The earliest Corded Ware burials from northern Europe (ca. 2900-2800 BC) show important differences, so no strict funerary norms existed at first (Furholt 2014):

  • In southern Sweden the prevailing orientation is north-east–south-west, and south–north; contrary to the supposed rule, male individuals are regularly deposited on their left and females on their right side.
  • In the Danish Isles and north-eastern Germany, the Final Neolithic / Single Grave Period is characterized by a majority of megalithic graves, with only some single graves from typical barrows.
  • In south Germany, west–east and collective burials prevail, while in Switzerland no graves are found.
  • In Kuyavia (north-central Poland), Hesse (Germany), or the Baltic, west–east orientation and gender differentiation cannot be proven statistically.

Corded Ware and neighbouring groups. Top: cultural map. Bottom: varied Y-chromosome haplogroups from ancient DNA samples. See full maps.

In genetics, the area that would become the ‘core Corded Ware province’ only after ca. 2700 BC also shows a surprising variability in the oldest samples in terms of haplogroups (which may indicate a recent departure of migrants from a mixed homeland); in terms of admixture, at least one sample clusters close to EEF groups, while later ones from Esperstedt – of hg. R1a-M417 (possibly xZ645) – show a likely admixture with Yamna vanguard groups expanding from the Carpathian Basin.

On the contrary, the slightly later eastern expansions as Battle Axe and Abashevo show long-lasting genetic continuity and a marked bottleneck under R1a-Z645 subclades, as well as a clear cultural connection through the Fatyanovo culture. The role of local populations, particularly females, in preserving local customs in the Single Grave culture (see Bourgeois and Kroon 2017) is also quite relevant to the continuity of the regional culture in spite of migrations.

2. Single Grave culture in Denmark

The Corded Ware culture in Denmark was particularly weak in its human impact compared to previous farmers (see e.g. Feeser et al. 2019), and also in its cultural traits, adopting Funnel Beaker culture traits up to a point where even the Copenhagen group describes cultural continuity, likely entailing an important substrate language impact (see e.g. Iversen and Kroonen 2017).

It’s not difficult to realize that this same argument used for Semitic-like terms in Germanic by Kroonen (2012) – e.g. words for ‘lentil’, ‘pea’, and ‘turnip’ – and supported by the Copenhagen group may be used to support the adoption of a non-Uralic substrate in a Uralic-speaking Corded Ware area (as Schrijver does), which later influenced the incoming Bell Beakers that developed into the Pre-Proto-Germanic speakers.

From Iversen (2016):

As it appears from the analysis above, the situation in East Denmark during the 3rd millennium BC is culturally rather complex. The continued use of megalithic entombments and the almost total rejection of the Single Grave burial custom show a strong affiliation with old Funnel Beaker traditions even after the end of the Funnel Beaker culture. (…) With an almost total lack of the two defining elements of the Single Grave culture – interments in single graves and the prominent position of stone battle axes – one can hardly talk about a Single Grave culture in East Denmark. What we see is rather the adoption of various Single Grave, Battle Axe and Pitted Ware cultural traits into a setting that was basically a continuation of Funnel Beaker norms and traditions (Iversen 2015).

Single Grave and Battle Axe culture graves in Denmark and Scania (dots). Grey colouring: distribution of Jutland Single Graves. Dark grey: initial phase ca. 2850–2800 BC. Cross: megalithic tombs with Single Grave/Battle Axe culture finds (Iversen 2013 fig. 3).

The reason why East Denmark so conservatively upheld the Funnel Beaker traditions must be found in the area’s old position as a ‘megalithic heartland’, which reaches back to the early 4th millennium BC when dolmens and passage graves were constructed in very large numbers. (…) The result was a cultural blend governed by old Funnel Beaker norms and the use of Pitted Ware, Single Grave and Battle Axe material culture. This situation continued until the beginning of the Late Neolithic (ca. 2350 BC) when cultural and social development took a new course and flint daggers and metal objects appeared/re-appeared in South Scandinavia.

The radical change brought about in the Late Neolithic “Dagger Period” is commonly agreed to be associated with the arrival and expansion of the Pre-Proto-Germanic community (read more here).

3. Single Grave culture in the Netherlands

The Corded Ware culture in the Netherlands is particularly disconnected culturally from its eastern core areas, which is reflected in the likely survival of a non-Indo-European language around the Low Countries, in the so-called Nordwestblock area. From Kroon et al. (2019):

The connections between changes in ceramic production techniques and social changes (see Fig. 2) allow for the formulation of hypotheses about the technological impact of the scenarios that archaeologists have proposed for the introduction of the CWC. If migration (i.e. an influx of new communities that bring new material culture) causes the spread of the CWC, then CWC vessels should differ from the vessels of previous communities in all respects: resilient, group-related, and salient techniques. However, if the introduction of the CWC is the result of diffusion of stylistic traits and moving objects, both these imported objects (different raw materials and production sequences) and changes in salient techniques should be observed when comparing CWC vessels to VLC vessels. Network interactions should yield the same changes as diffusion, as the combined movement of people, objects and styles within existing networks leads to the introduction of CWC. However, network interactions should yield one additional characteristic. Given that new people are integrated into extant communities, the occurrence of vessels with different resilient techniques, but group-related techniques that are stable relative to previous communities, is to be expected.

Schematic representation of the hypothesised changes in ceramic technology for diffusion (above) and migration (below) scenarios for the spread of the CWC. Image from Kroon et al. (2019).

The over-arching transitional process in the Western coastal area of the Netherlands is local continuity with diffusion and network interaction traits. Interestingly, the supra-regional networks of the VLC communities in this region, as well as some of the defining technological practices within these networks, remain intact throughout the CWC transition.

In the absence of detailed genetic and isotopic data from Late Neolithic individuals from the western coastal areas of the Netherlands, direct conclusions on the relations between the migrations demonstrated by genetic analyses in other regions and the outcomes of this study remain speculative. However, if a similar shift in the late Neolithic gene pool from this area can be detected, this raises questions on the impact of such migrations on knowledge transmission and local traditions. If such a change cannot be attested, questions should be raised about the nature of the CWC in this particular area. Questions that will ultimately boil down to what we define as CWC.

In other words, the introduction of Corded Ware in the Netherlands, which we can assume was driven by migrations – evidenced by the arrival of “Steppe ancestry” (see below) – would need to be interpreted in light of the adoption of a different set of cultural traits in this region. Combining linguistic and archaeological data, there is strong evidence that the Corded Ware ideology and its internal coherence might have been broken in the westernmost territories, hence the likely survival of the local culture and language(s).

Further reasons for this independence from the Uralic homeland, supporting the advantages of a cultural and linguistic integration among regional groups, include:

  1. the gradual shift of the core Corded Ware territory to the east in the centuries leading to the mid-3rd millennium BC;
  2. the similar weakened grip in Jutland (see above);
  3. the development of an isolated “classical” package in western Europe disconnected from eastern groups; and
  4. the cultural and genetic impact of expanding vanguard Yamna settlers.

General distribution of SGC settlements in the Netherlands. A = the tidal area in the province of Noord-Holland; B = the coastal barriers and Older Dunes area; C = the central river district; D = the northern, central and southern Dutch Pleistocene areas. a = certain/probable settlement; b = possible settlement; c = wooden trackway. Legend Holocene: 1 = coastal barriers and dunes. 2 = marine clay. 3 = peat. 4 = river clay. 5 = river dunes. 6 = water. Image modified from Drenth, Brinkkemper, and Lauwerier (2006).

4. Old Europeans in Britain

This predominant non-Indo-European language would later be the substrate language of Bell Beakers from the Lower Rhine and the British Isles.

Culturally, the same process as in the previous Single Grave culture period may have happened in the Low Countries, due to the culturally favorable situation there. This might be inferred from the continuity of Protruding Foot Beaker into All-Over Ornamented Beaker, most likely an imitation of the expanding Proto-Beaker package by locals of the Single Grave culture.

Arguably, though, the same situation should have happened in all other Proto-Beaker regions favourable to cultural change and witnessing admixture with locals, such as Iberia, and the social relevance of this imitation is far from being accepted by almost anyone except for archaeologists working around the Rhine… From Heise (2014):

While in 1955 the Maritime Beaker was considered to be intrusive, the 1976 work seemed to prove that in the Netherlands a continuous development from Protruding Foot Beaker (PFB) to All-Over Ornamented (AOO) Beaker to Maritime Beaker occurred. Nevertheless, the authors stressed that it was not possible to identify ‘the’ origin of the ‘Bell Beaker Culture’ in the Lower Rhine Area since typical artefacts (wristguards, daggers) were not known to be associated with the early AOO and Maritime pottery. Furthermore they argued against the “misleading simplification” of a single point of origin (Lanting & van der Waals 1976, 2). However, this last observation was not appreciated or was simply ignored by large parts of the research community and the theory was subsequently applied as a universal solution in many parts of Europe.

Typological development of Beakers in the Netherlands (PFB: Protruding Foot Beakers; AOO: All Over Ornamented Beakers; BB: Bell Beakers) (after Lanting & van der Waals 1976, 4, fig. 1).

In fact, most archaeologists have unequivocally rejected a Single Grave – Classical Bell Beaker continuity, and Heyd’s model has been recently confirmed in paleogenomics, which shows an evident expansion of East Bell Beakers from Yamna settlers in the Carpathian Basin (see here). We may nevertheless still save the following assertion, as particularly relevant for the continuity of non-Indo-European languages among the Single Grave groups of the Lower Rhine:

Marc Vander Linden argued that the “local validity of the Dutch sequence cannot […] be questioned” (2012, 76).

Olalde et al. (2019) showed how British, Dutch, and French Beakers have excess “Steppe ancestry” relative to Central European Beakers from Germany, who are in turn closest to the origin of Old Europeans in Iberia (i.e. Galaico-Lusitanian, “Ligurian”), the Lower Danube (i.e. Celtic), Italy (i.e. Italic, Venetic, Messapic), Sicily, and even Denmark (i.e. Germanic). This excess “Steppe ancestry” probably implies admixture with local Single Grave populations of the Lower Rhine, which is further supported by the position of these Lower Rhine Beakers in the PCA (using British Beakers and Netherlands BA as proxies), clustering – among Bell Beakers – closest to Corded Ware samples.

PCA of ancient Eurasian samples, with Corded Ware clusters drawn. Rhine/British Bell Beakers partially overlapping them. See full PCAs.

Furthermore, the emergence of Bell Beakers in the British Isles represents a radical replacement, with a population turnover of ca. 90% of the local population, and Yamna lineages representing more than 90% of the haplogroups of individuals in Chalcolithic and Bronze Age Britain and Ireland, apart from an evident Y-chromosome bottleneck under hg. R1b-S461 (and its subclade R1b-L21), maintained during the whole Bronze Age. The scarce non-Indo-European hydrotoponymy attests to the lack of integration of local populations or their languages into the new society. All this suggests an initial swift and massive intrusion marking the linguistic evolution of the British Isles until the Iron Age.

The arrival of Insular Celtic in the British Isles will likely be defined by an increase in ancestry related to Central Europe (and probably haplogroups, too). Since the Afroasiatic-like substrate is unrelated to Common Celtic, the non-Indo-European substrate must be associated with preceding Bronze Age populations of western Europe, most likely with Bronze Age Britons, who are in turn derived from Bell Beakers from the Lower Rhine admixed with Single Grave peoples. The latter, therefore, must have passed on their Afroasiatic-like language as the substrate of Lower Rhine Beakers.

5. Vasconic from the north

Another indirect proof of the survival of non-Indo-Europeans in northern Europe is offered by Basques. Vasconic speakers came originally from some place beyond Aquitaine, and very recently before the Roman conquests, because place- and river-names show an overwhelming Old European substratum to the north of the Pyrenees, and exclusively Old European to the south.

Their origin is potentially quite far away, since Modern Basques show a similar cluster to that found in Iron Age Celtiberians of the Basque country. This could essentially mean that Basques were peoples of north/central European ancestry (see below fitting models of origin populations), because they must have arrived in Aquitaine after the arrival of Celtiberians, and with a similar ancestry.

In the words of Olalde et al. (2019):

(…) increases in Steppe ancestry were not always accompanied by switches to Indo-European languages. This is consistent with the genetic profile of present-day Basques who speak the only non-Indo-European language in Western Europe but overlap genetically with Iron Age populations showing substantial levels of Steppe ancestry.


The Tollense Valley near Rügen in the West Baltic shows LBA people clustering with Modern Basques (see here). This is compatible with the arrival (or displacement) of Vasconic-speaking Northern/Central Europeans close to the Rhine, possibly originally from northern France, very likely close to the Atlantic area during the Final Bronze Age / Early Iron Age based on cultural interactions.

Population movements from central-northern Europe into western Europe. Top: Cultures of the Late Bronze Age. Bottom: PCA with Bronze Age samples and drawn clusters. Marked in red is the Tollense site (please note: the approximate Tollense cluster does not include outliers, among them those closer to Modern Basques). Also marked is the British Bell Beaker sample closest to CWC populations. See full maps and whole PCAs.

Pre-Steppe languages in Europe?

An alternative to Old Europeans of the British Isles would be to support some kind of non-Indo-European/Vasconic continuity in the Atlantic façade close to the English Channel and the North Sea, given the current lack of palaeogenomic data on Bell Beakers and later groups in the area, and the potential Vasconic nature of Megalithic/Proto-Beaker groups that might have survived there.

The main problems with this approach are the lack of such an Afroasiatic-like substrate in Gaulish, which should have shown the same substrate as Insular Celtic, and the impossibility of associating this Afroasiatic-like substrate with Vasconic, both potentially representing completely different languages. A counterargument would be that we don’t have that much information on Gaulish and its dialects – or on the syntax of Vasconic, for that matter – to reject this hypothesis straight away…

In any case, the survival of pockets of non-Indo-European, non-Uralic speakers in northern Europe, even after Steppe-related expansions, should not shock anyone:

If the survival of non-Indo-European-speaking groups happened despite the swift expansion and radical population replacement brought about by the Bell Beaker folk – so called traditionally because of its unitary culture suggesting a unitary language community –, and non-Uralic-speaking groups in areas dominated by Corded Ware peoples, it could certainly have happened, and even more so, with Corded Ware and Bell Beaker groups at the western and northern edges of their expansions, due to the early loss of contact with their respective core cultural regions.


Even obscure components of place or river names, like those from northern Europe, the Nordwestblock area, and the British Isles, might be better explained as Old European exceptions than any other alternative, i.e. either as an Indo-European layer over a non-Indo-European one or vice versa, or both in different periods, before the eventual unifying Celtic, Roman, and (later) Germanic expansions.

All in all, one could say about substrates and hydrotoponymy in the British Isles, the Lower Rhine, and in northern Europe as a whole, that the potentially interesting non-Indo-European forms are precisely those which do not interest either scholarly ‘faction’:

  • those supporting a non-Indo-European Western Europe, because it doesn’t represent the whole substrate, and can’t be used to argue for a Europa Vasconica or Europa Afroasiatica;
  • those supporting a Palaeo-Indo-European Western Europe, because their limited presence concentrated in isolated pockets doesn’t deny the Indo-Europeanness of the Old European layer anywhere.

However, these are the details that should be studied and that could define what happened exactly after steppe-related migrations, e.g. in the Single Grave cultural area before and after North-West Indo-Europeans admixed with its population, and thus what happened in the British Isles, too.

Ignoring the (mostly useless) typological comparisons, my bet would be for an ancient Uralic layer heavily admixed with local non-Uralic peoples, especially intense in the Single Grave culture. This Proto-Uralic layer would be of a dialect or dialects (assuming succeeding CWC waves and later local expansions) different from the known Late Proto-Uralic – which expanded with eastern Corded Ware groups.

Describing the phonetic features of this layer could improve our knowledge of Early Proto-Uralic, as well as some specifics of the evolution of Germanic, Balto-Slavic, and potentially Celtic and Balto-Finnic.

This would be similar to the relevance of Aquitanian toponyms for Proto-Basque reconstruction, or of the alteuropäische substratum when it conflicts with the Proto-Indo-European dialectal reconstruction of some linguists (e.g. the laryngeal Pre-Indo-Slavonic of Kortlandt) which, as Kitson implies, should question the dialectal reconstruction of this minority of Indo-Europeanists, and not the Indo-European nature of the substratum.


European hydrotoponymy (II): Basques and Iberians after Lusitanians and “Ligurians”


The first layer in hydrotoponymy of Iberia is clearly Indo-European, in territories that were occupied by Indo-Europeans when Romans arrived, but also in most of those occupied by non-Indo-Europeans.

Among Indo-European peoples, the traditional paradigm – carried around in Wikipedia-like texts until our days – has been to classify their languages as “Pre-Celtic” despite the non-Celtic phonetics (especially the initial p-), because the same toponyms appear in areas occupied by Celts (e.g. Parisii, Pictones, Pelendones, Palantia); or – even worse – just as “Celtic”, because of the famous -briga and related components. This was evidently not tenable at the end of the 20th century, and it is simply anachronistic today.

NOTE. Since Indo-Europeans and non-Indo-Europeans of Western Europe show strong Y-chromosome bottlenecks under R1b-P312 lineages, maps below show the evolution of cultural groups side by side with ADMIXTURE of ancient DNA samples instead. The map series on prehistorical migrations contains also Y-DNA and mtDNA maps.

Most excerpts below (emphasis mine) are translated from Spanish (see the original text here):

Top Left: Arrival of Indo-European-speaking East Bell Beakers and likely disruption of the Basque-Iberian community (ca. 2500 BC on). Top Right: corresponding (unsupervised) ADMIXTURE map of ancient DNA samples. Arrival of Central European ancestry (“Steppe ancestry”, roughly represented by the blue color), with other components still prevalent, roughly including Anatolia Neolithic (brown), WHG (red), and sporadically Northern African (violet). Notice the high proportion of Central European ancestry in central and north-western Iberia. See full maps including Y-DNA and mtDNA. Bottom: PCA of Bell Beaker and contemporaneous samples.


While the non-Celtic Indo-European nature of Lusitanian is certain, the nature of the “Pre-Celtic” language spoken by peoples such as Cantabri, Astures, Pelendones, Carpetani and Vettones is still being discussed, due to the scarcity of material to work with.


From Hacia una definición del lusitano, by Vallejo (2013):

It is certain that the delimitation of the geographical area set by Tovar is still valid, basically determined by the known direct documents, that is, the traditionally accepted inscriptions (the classic ones of Lamas de Moledo, Arroyo de la Luz and Cabeço das Fráguas, in addition to the new ones from Arroyo and the recent one from Arronches; see Fig. 1), to which some others could be added: the new bilingual inscription from Viseu necessarily compels us to consider it as indigenous, because it contains terms that belong to the core of the language and not only onomastics (I refer to the nexus igo and the nicknames deibabor and deibobor). By virtue of this new incorporation, we can also consider other texts as indigenous, although they do not include a common lexicon (see Fig. 1, inscriptions 7 to 22), in the expectation that many Lusitanian scribes were consciously mixing two linguistic registers (code switching), one to refer to the deities (for which they frequently used indigenous inflection) and another for anthroponyms (always with Latin inflection).

Left: Early Bronze Age cultures in Iberia (in red, likely Indo-European groups; in green, likely non-Indo-European groups). Right: Unsupervised ADMIXTURE of ancient DNA samples. See full maps including Y-DNA and mtDNA.

Firstly, it is striking that this geographical profile drawn by the texts corresponds almost exactly to the distribution of large series of anthroponyms and theonyms.* Among the abundant names of people we can highlight those with a large number of repetitions whose appearance is circumscribed to our region of study (see Fig. 2). Some of them are truly frequent and lack parallels on the outside, such as the stem Tanc- / Tang- (of Tanginus) with no fewer than 130 attestations, or Tonc- / Tong- (of Tongius or Tongetamus) with 70. Others also show sufficiently representative figures, such as Camalus and Maelo (with 46 repetitions each), Celtius (with 29), Caturo or Sunua (with 23), Camira (with 22), Doquirus (with 20), Louesius (with 18), Al(l)ucquius (with 17) or Malge(i)nus (with 16). According to these quantities, it appears that these are not casual occurrences of names, taking into account that chance tends to be reduced to a minimum in the study of the Iberian Peninsula, since we can easily handle the entire peninsular corpus. In turn, Reue, Bandue, Nauiae and Crougiae are the theonyms that best represent the Lusitanian-Galician area, coinciding fundamentally (Figure 3) with the picture that anthroponymy and texts had drawn, although with fewer examples.

Top left: Lusitanian (long and short) inscriptions; top right: Map of the distribution of statue-menhirs and south-western stelae, by Rodríguez-Corral (2014) [(1) stelae in Beira Alta and Tras-os-Montes (Portugal), and Orense (Galicia, Spain); (2) both in the same territory: northwestern statue-menhirs and southwestern stelae; (3) hybridization of both into the same material form (stela/stela-menhir from Pedra Alta)]; bottom left: Lusitanian theonymy; bottom right: Lusitanian anthroponymy.

* The other subdivision of the onomastics, toponymy, presents difficulty in the elaboration of series, by the few repetitions of segments, once the universal element -briga has been eliminated.

It is not only these groups of names and roots that help us define a large northwestern area, but, as I have had occasion to mention in other places, some onomastic data that share a similar distribution can also be added: the desinence -oi (with an assimilation in -oe / -ui) of theonymic dative singular, the ending -bo of dative plural, the presence of the noun-forming suffix -aiko-, in addition to other phonetic features such as the passage of e > ei in anthroponymy, the reduction ug > uo, and the shift w > b.

Genetic isolation in modern north-western Iberia (northern Portugal / southern Galicia) is greater than in other Iberian regions, forming different ancestral clusters splitting before others (including Basques). Image from Bycroft et al. (2018). See explanatory video by Carracedo.


From The concept of Onomastic Landscape: the case of the Astures, by Vallejo (2013):

(…) First of all, it seems that there is an independent onomastic area, which can be defined by a series of names and suffixes that are repeated there exclusively or predominantly. This area does not seem to correspond with what we know of the Lusitanian-Galician onomastics nor of the more coastal Asturian; it also differs from the Celtiberian area, with which it does not have features in common. In this way, and always in the conjectural terrain, we could find ourselves before an Indo-European non-Celtic language different from the Lusitanian language.

A peculiarity that will have to be investigated is the presence of an excessively wide border corridor, where the names of the southern Astures (Augustales) do not predominate, but neither those of the northern Astures (Transmontanos). Similarly, we will have to see the scope of the hypothesis that there might have been a language perhaps differentiated from that spoken in the Lusitanian, Galician or Celtiberian zones; the lower documentary richness of the Asturian zone of Transmontana makes it more difficult to guarantee that it is not the same linguistic area as the one we isolate among Asturian cities.

In any case, de Hoz, even taking into account the difficulty of an affirmation of this type, pointed out ambiguously that we could find ourselves in front of different languages. On the other hand, the absence of texts directly transmitted by this people leaves us without a definitive confirmation of the argument that it is a linguistically differentiated region, but it does not invalidate it at all. These drawbacks require the suspension of the exact characterization of our area, awaiting advances in the field of epigraphy and methodology.



The following are mainly excerpts from Villar (2007, 2014):

Lenguas, genes y culturas en la Prehistoria de Europa y Asia suroccidental (2007). Buy the ebook online (or the printed version, if available).



The information provided by place-names and hydronyms on the one hand and anthroponyms on the other is of undoubted historical value in both cases, but of different specific significance. Anthroponyms reflect the present situation at the moment when living people were using them. It is an aspect very sensitive to social changes of all kinds, reaching its highest level of instability when there is language change.

(…) the Pre-Roman anthroponymic inventory of the Basque Country and Navarre indicates that prior to the arrival of Romans the language spoken was Indo-European (reflected in the names used) in the territories of Caristii, Varduli and Autrigones, while in Vasconic territory (especially in the current Navarre) most of the speakers chose Iberian names. In the territories of the current Basque Country, only a negligible statistical proportion chose Basque names, whereas in Navarre it was a minority of the population. That’s how things were towards the 3rd century BC.


Cities and rivers are not subject to the ephemeral life cycle of humans. Rivers have very long cycles that go far beyond the life time not only of individuals, but also of languages and cultures. Cities are also generally very stable, although social circumstances occasionally cause one to be abandoned or destroyed, while new ones are created from time to time. That means that the names of rivers and cities are not subject to fashions or frequent change. Nor does a language change imply a renewal of the previous hydronymy and toponymy.

Speakers of the new languages incorporated into a territory learn from the natives the hydronymic and toponymic system, producing what we call the “toponymic transmission”. (…) it requires a prolonged contact between the native population and the new occupants, which can only occur when the indigenous population is not annihilated quickly and radically.

Top Left: Middle Bronze Age cultures in Iberia (in red, likely Indo-European groups; in green, likely non-Indo-European groups). Top Right: Unsupervised ADMIXTURE of ancient DNA samples. See full maps including Y-DNA and mtDNA. Bottom: PCA of Bronze Age groups.

The ancient onomastic data of the Basque Country and Navarre can be summarized as follows:

  • Ancient hydronymy, the longest lasting onomastic component, is not Basque, but Indo-European in its entirety.
  • The old toponymy, which follows it in durability, is also Indo-European in its entirety, except Pompaelo (now Pamplona) and Oiarso (now Oyarzun).
  • And anthroponymy, which reflects the language used at the time when those names were in use, is also massively Indo-European, although 10-15% of anthroponyms are of Vasconic etymology.

(…) the existing data show that, while in Roman times in Hispania there were only a couple of place-names in the Pyrenean border and a dozen anthroponyms of Vasconic etymology, in Aquitaine there was an abundant anthroponymy of that etymology.

Left: Late Bronze Age cultures in Iberia (in red, likely Indo-European groups; in green, likely non-Indo-European groups). Right: Unsupervised ADMIXTURE of ancient DNA samples. See full maps including Y-DNA and mtDNA.

This set of facts is most compatible with a hypothesis that postulates a late infiltration of this type of population from Aquitaine, which at the time of the Roman conquest had only managed to establish a bridgehead, consisting of a small population center in Navarre and Alto Aragón and nothing else, except some isolated individuals in the current provinces of Álava, Vizcaya and Guipúzcoa. The almost complete absence of old place-names of Vasconic etymology would be explained in this way: Vasconic speakers, recently arrived and still in small numbers, would not have had the possibility of altering in depth the toponymic heritage prior to their arrival, which was Indo-European.

The idea of a late Vasconization of a part of those territories, in the Early Middle Ages or late Antiquity, is not new. Already in the 1920s M. Gómez Moreno said about the modern Basque provinces, with the district of Estella in Navarra, that “personal nomenclature allows comparisons of definitive value, probative that there lived people of the Cantabrian-Asturian race [who for Gómez Moreno were Indo-European], without the slightest trace of perceptible Basqueness”. For him, the first Indo-European people to penetrate the peninsula would have been Ligurian, which evolved into Cantabrians, Asturians, Venetians, Lusitanians, Tormogi, Vacaeans, Autrigones, Caristii and Varduli.

Top Left: Pre-Roman cultures in Iberia (in red/brown, Indo-European groups; in pink, Greek; in yellow, Phoenician; in green, likely non-Indo-European groups; Tartessian is disputed). Top Right: Unsupervised ADMIXTURE of ancient DNA samples. See full maps including Y-DNA and mtDNA. Bottom: PCA of Iron Age groups.


If, as we said above, Basque speakers began to enter the Iberian Peninsula from the other side of the Pyrenees only from the Roman-Republican era, to intensify their presence in the following centuries, we must assume that they were to the north of the Pyrenees already before those dates. And, indeed, the existence of this abundant Vasconic anthroponymy shows that in the first centuries of our era – while Vasconic speakers in the Peninsula were very few in number – their population in Aquitaine was abundant.

In a provisional manner we can advance that [Aquitaine’s] hydronyms are also known in other places of Europe and easily compatible with Indo-European etymologies (Argantia, Aturis, Tarnes, Sigmanos); and among the place names there are also many that are compatible with non-Gallic Indo-European etymologies, or not necessarily Gallic (Curianum, Aquitania, Burdigala, Cadurci, Auscii, Eluii, Rutani, Cala- (gorris), Latusates, Cossion, Sicor, Oscidates, Vesuna, etc.).

In addition to those place names that we classify as generically Indo-European, there are not a few Celtic ones (Lugdunum, Mediolanum, Noviomagos, Segodunon, Bituriges, Petrucorii, Pinpedunni), several Latin ones (Aquae Augustae, Convenae, ad Sextum, Augusta), and even some Celto-Latin hybrids (Augustonemeton, Augustoriton). On the other hand, there are hardly any names, whether serial or not, that have a reasonable possibility of being explained by Vasconic etymology (Anderedon could be one of them).

Consequently, the onomastic question of Aquitaine is not compatible with the possibility that Vasconic is the “primordial element” there, either. On the contrary, it is compatible with the hypothesis that they arrived also late in Aquitaine, when hydro-toponymy was already established. They had to Vasconize all or part of the previous population, which came to use Vasconic anthroponymy to a large extent. But the previous toponymy remained and the Vasconization process was probably soon interrupted by Celticization first, and Romanization later.

Aquitani and neighbouring tribes around the Pyrenees, as described by the Romans (ca. 1st c. BC). The Basque language likely expanded south and west of the Pyrenees into Indo-European-speaking territories during the Roman period. The term ‘Vascones’ only became applied to Basque-speaking tribes in medieval times. Map modified from image by Sémhur at Wikipedia.

A prediction in genetics

This is how Francisco Villar and co-authors from the University of Salamanca saw what would happen with the genetic studies of modern Basques in 2007, based on the similarity with neighbouring Iberians and French, and the late intrusion of the language in its current territory:

Unfortunately, linguistics does not have the means to establish the moment of that arrival in terms of absolute chronology. In any case, this hypothesis is not incompatible with some peculiarities in the frequency of certain genes of the Basque-speaking population. Indeed, today we tend to attribute these peculiarities to the joint action of genetic drift and isolation; to which perhaps we could add a bottleneck in the Vasconic founding population that would one day settle in Aquitaine.

Indoeuropeos, iberos, vascos y sus parientes (2014). Buy the ebook online (or the printed version, if available).

Also Villar, in 2014:

In the hypothesis that I propose, future speakers of Basque would have settled initially in Aquitaine, where there would have been an inevitable genetic diffusion with pre-existing [first stage] populations. On the other hand, Basque speakers from Aquitaine would have started to arrive in the Basque Country and Navarre only from Roman times (only a couple of Vasconic toponyms, at least one of them of recent creation; scarce anthroponyms of Vasconic etymology). The part of those populations that mixed with the pre-existing Palaeo-Indo-Europeans (Indo-European names of rivers; general Indo-European toponymy) saw how the uniqueness of their haplogroups, if there was any, was diluted, making it difficult to distinguish from the general [Indo-European] background; being a minority, it could even have been lost as a result of adverse genetic drift.

Olalde et al. (2019) confirmed this hypothesis that modern Basques are quite similar to investigated Iron Age Indo-Europeans from Iberia (such as Celtiberians sampled from the Basque Country):

For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age. The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken. This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition.

Modern Basques show therefore, paradoxically, an ancestry similar to recent Iron Age Indo-European invaders (quite likely the ancestors of Celtiberians), which confirms the hypothesis of bottlenecks/founder effects followed by a very recent isolation of its population:

(…) the genetic profile of present-day Basques who speak the only non-Indo-European language in Western Europe […] overlap genetically with Iron Age populations showing substantial levels of Steppe ancestry.

Left: Roman period in Iberia. Right: Unsupervised ADMIXTURE of ancient DNA samples. See full maps including Y-DNA and mtDNA. Notice increase of steppe ancestry in the north, associated with the (Late Bronze Age / Early Iron Age) arrival of Central Europeans.


Regarding the Iberian language, the circumstances of analysis are less favorable. However, we can observe in the ancient toponymy of typically Iberian areas (the Spanish Levant and Catalonia) a considerable proportion of toponymy of Indo-European etymology, often identical to that which F. Villar (2000) has called “Southern-Iberian-Pyrenean”. In fact, its presence in the Levant is nothing else but a continuation from Catalonia to the South along the Mediterranean coast. Here are some examples: Caluba, Sorobis, Uduba, Lesuros, Urce / Urci, Turbula, Arsi / Arse, Asterum, Cartalias, Castellona, Lassira, Lucentum, Saguntum, Trete, Calpe, Lacetani, Onusa, Palantia, Saetabis, Saetabicula, Sarna, Segestica, Sicana, Turia, Turicae, Turis.

Compatible with the Indo-European etymology can also be Blanda, Sebelacum, Sucro, Tader, Sigarra, Mastia, Contestania, Liria, Lauro, Indibilis, Herna, Edeta, Dertosa, Cesetania, Cossetani, Celeret, Bernaba, Biscargis, (…)

Finally, in other place names there are Indo-European components in hybrid toponymic syntagms, such as:

  1. orc- / urc-: Orceiabar, Urcarailur, Urceatin, Urcebas, Urcecere, Urcescer, Urceticer.
  2. Il-: Iltukoite, Iluro (3), Ilurci, Ilorci, Ilurcis, Ilucia, Iliturgi, Ilarcurris, Iluberitani, etc.


Examples like these show that in Catalonia and the Spanish Levant the Iberian language is not the deepest identifiable substrate language, but that it took root there when there was previously an Indo-European language that had created a considerable network of toponyms and hydronyms that we can recognize, and over which Iberians settled as a superstrate. The pre-existence of an Indo-European language in the historically Iberian area is further corroborated by the fact that its ancient hydronyms are all Indo-European, with the exception of a single river that has a name that is supposed to be Iberian: the Iberus (Ebro), from which obviously the country and its inhabitants took their name. No doubt ib- was an appellation for river, so that in the language that created that hydronym the Iber should have simply been “the river”. But we will see in the body of this work that ib- is in various places outside the Iberian Peninsula as an appellation for «river», which will force us to rethink its supposed Iberian affiliation. In fact, the Iberus had another name, Elaisos, whose etymology is compatible with Indo-European. As we know with certainty that after Iberians no other Indo-European peoples came to their territory before the Romans, the Indo-European creators of that hydronymy must have been there before the Iberians. And its antiquity must be considerable because, as we have already said, the vast majority of its hydronyms (Alebus, Caluba, Lesuros, Palantia, Saetabis, Sigarra, Sucro, Tader, Turia and Uduba, Elaisos) belong to that anonymous Indo-European language that left no written texts and had no historical continuity.

Inscriptions in Iberia ca. 2nd–1st c. BC. Purple squares show Celtiberian inscriptions, blue circles show Iberian inscriptions. Image modified from Hesperia – Banco de datos de lenguas paleohispánicas.

Villar (2014):

A language that settles in a territory is not always able to eradicate the existing ones definitively. Even a political system as unitary and unifying as the Roman was not able to eradicate the Basque language. And nowadays in Latin America, despite the crushing cultural dominance of Spanish, despite the means for the schooling of a modern society, in spite of the media, a multitude of pre-Columbian languages are spoken that coexist with the language of culture, the only one that is written in those countries. In those situations, which can be prolonged for quite a lot of time, there are individuals who only speak the language newly imposed, others who speak only the language that has resisted disappearing, and others who speak both, in a broad framework of bilingualism. My proposal is that something similar to that must have happened in the Iberian territory when the Romans arrived: A language of culture, Iberian, diversified into more or less distant local dialects, coexisted with several previous languages, equally differentiated from the dialectal point of view. This explains the irruption in the Iberian texts of non-Iberian anthroponyms and, above all, the existence there of a Palaeo-Indo-European hydro-toponymy that had remained in use not only because it was transmitted to Iberian speakers, but also because its native users were still present.


European hydrotoponymy (I): Old European substrate and its relative chronology


These first two posts on Old European hydro-toponymy contain excerpts mainly from Indoeuropeos, iberos, vascos y sus parientes, by Francisco Villar, Universidad de Salamanca (2014), but also from materials of Lenguas, genes y culturas en la Prehistoria de Europa y Asia suroccidental, by Villar et al. Universidad de Salamanca (2007). I can’t recommend both books highly enough for anyone interested in the history of Pre-Roman peoples in Iberia and Western Europe.

NOTE. Both books also contain detailed information on hydrotoponymy of other regions, like Northern Europe, the Aegean and the Middle East, with some information about Asia, apart from (outdated) genetic data, but their main aim is obviously the Prehistory of Iberia and neighbouring regions like France, Italy, or Northern Africa.

Here are only some excerpts (emphasis mine), translated from Spanish (see the original texts here), accompanied by images from both books.

Indoeuropeos, iberos, vascos y sus parientes (2014). Buy the ebook online (Or printed version, if available).

Alteuropäisch and Krahe

The investigation of “Old European” or Alteuropäisch, popularized by Krahe, began precisely with the study of some toponyms and personal names spread all over Europe, previously considered “Ligurian” (by H. d’Arbois de Jubainville and C. Jullian) or “Illyrian” (by J. Pokorny), with which those linguistic groups – in turn badly known – were given an excessive extension, based only on some lexical coincidences.

This is a comment made by the author about Krahe’s data and his opinions, frequently used against his compiled data, which I find paradoxically applicable to Villar’s data and his tentative assignment of the relative linguistic chronology to an absolute one – including the expansion of a “Mesolithic” Indo-European vs. a “Neolithic” Basque / Iberian vs. a Bronze Age Celtic – when it is now clear that the sequence of events was much later than that:

A derogatory and globally disqualifying attitude to everything that sounds like Alteuropäisch and Krahe is very widespread today, sometimes without the necessary discrimination between different hypotheses, or even between data and hypothesis. It is not fair that the version of H. Krahe and that of W. P. Schmid be disqualified in a single simplistic judgment as if they were the same thing. But it is a major mistake to reduce the value of the hydro-toponymic data of Europe by the mere fact that Krahe attributed an implausible historical explanation to them. The data are real and still need an adequate explanation within a real historical framework, despite the unfeasibility of Krahe’s explanation.

With that we reach a point that I want to highlight. Among those who are allergic to anything that involves deviating one iota from the Indo-European paradigm as a single event, an attitude gaining momentum considers that hydro-toponymy was introduced in the different regions of Europe and Southeast Asia by the same Indo-European languages that appear historically occupying their territory. H. Krahe had argued strongly against this possibility, so now I will save myself a deeper refutation and I will limit myself to pointing out some difficulties that position is forced to face.

Sala, Sala, Sala, Sala, Sala, Sala, Sala, Sala, Sala, Sala, Sala, Salaca/Salis, Salaceni, Salacia, Salacia, Salaeni, Salam, Salandona, Salangi, Salangi, Salaniana, Sãlantas, Salapa, Salapeni, Salaphitanum, Salapia / Salpia / Salapina palus / Salpe, Salar, Salara, Salarama, Salarbima, Salariga, Salars, Salas, Salat, Salauris, Salcitani, Sale, Sale, Sale, Sale stagnum, Salecon, Saleia, Salentina, Salentini, Salernum, Salerni, Sales, Sali, Salia, Salia, Salica, Salica, Salice, Salii, Salija, Salinẽlis, Salìnis, Salìnis, Salìnis, Salìnis, Salinsae, Salionca, Salius, Salō, Salō, Saloca, Salodurum, Salona, Salonae, Salonenica, Salonia, Saloniana, Salonime, Salonium, Salontia, Saluca, Salum, Salum, Salunatasi, Saluntum / Salluntum, Salùpis, Sãlupis, Salur, Salurnis, Selepitani, Sõlis.

The defenders of that alternative have to assume that the process of dialectalization, that before the migrations from the Urheimat was separating into the different Indo-European branches, affected each of them in the phonetic aspect in the general naming vocabulary, but left them unaltered in its phonetic predialectal state with regards to hydro-toponymy, as well as a good part of the naming lexicon related to the concepts of “river, water” and the different qualities of water currents. For example, according to those sharing that opinion, the Hispanic Palantia of the area of Vaccei would be in fact Celtic, but in that name the loss of the initial /p/ that characterizes Celtic would not have been applicable. Similarly, the hydro-toponymy in Germania is largely exempt from the Lautverschiebung, in Greece the loss of initial /s/, etc. These names not only fail to suffer the dialectal innovations corresponding to their zones, but sometimes they present innovations different from the features of the dialect involved. For example the word *mori “sea, standing water” is sometimes found in the hydro-toponymy of Gaul in the form *mari instead of *mori proper of Celtic (Marantium, Marisanga, Marsus), which in the framework of the paradigm has to be inevitably interpreted as a non-Celtic innovation.
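The argument above can be read as a simple consistency check: if a name really were inherited within a given branch, it should show that branch's sound changes. What follows is only a minimal sketch of that check, not Villar's method. The rule table `UNEXPECTED_SHAPES` and the function `anomalies` are hypothetical names, and the two patterns are crude assumptions that stand in for the Celtic features mentioned in the text (loss of initial /p/, *mori rather than *mari). In particular, not every name in mar- need derive from *mori.

```python
import re

# Illustrative assumption: for each dialect, word shapes that an inherited
# (non-substrate) name would NOT be expected to show. Deliberately crude.
UNEXPECTED_SHAPES = {
    "Celtic": [
        (re.compile(r"^p", re.IGNORECASE), "initial /p/ retained (Celtic lost it)"),
        (re.compile(r"^mar", re.IGNORECASE), "*mari instead of Celtic *mori"),
    ],
}

def anomalies(name, dialect):
    """Return the reasons why `name` does not look inherited in `dialect`."""
    rules = UNEXPECTED_SHAPES.get(dialect, [])
    return [reason for pattern, reason in rules if pattern.search(name)]

# Names quoted in the text as problematic for a Celtic attribution
for name in ["Palantia", "Marantium", "Marisanga", "Marsus"]:
    print(name, anomalies(name, "Celtic"))
```

A name that triggers no rule is not thereby shown to be Celtic; the check only flags forms that conflict with the expected dialectal phonetics.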

Potential geographic relationship between a priori unrelated graphic-phonetic variants.

Names of this nature that appear in areas where a pre-Roman historical Indo-European language never existed remain unexplained, such as in North Africa, Arabia Felix or the Caucasus: Lake Pallantias in Libya; the Salat River in Mauritania Tingitana; Auso in Mauritania Caesariensis; the Alonta River in Georgia; the Abas River in Caucasian Albania; Salma and Salapeni in Arabia Felix; etc. Of course, for these cases it is always possible to deny any relationship of kinship between these forms and their European cognates, and attribute everything to the chance of random homophonies. Thus, once again, the annoying comparative data are sacrificed in the sacred altar of the paradigm, despite the fact that they are so numerous and consistent that if there were no blind faith in the current dogma, they would be sufficient to articulate a new paradigm over them.

The choice of each Indo-Europeanist between the non-Indo-European and the Indo-European interpretation to explain the prehistoric toponymy of Europe is not motivated by the fact that they manage partial sets of hydronyms that are more propitious alternatively for the one or the other option. On the contrary, frequently the same batch of materials is claimed by both trends as its own. An extreme example is that of Th. Vennemann, who considers simply as non-Indo-European (specifically Paleo-Basque) exactly the same material that H. Krahe used to support his Indo-European interpretation. Thus, the structure and linguistic characteristics of the studied material have little role in the choice of one or the other path, which is rather conditioned by convictions and adhesion to a varied range of personal beliefs, traditional dogmas and scientific paradigms.

Lenguas, genes y culturas en la Prehistoria de Europa y Asia suroccidental (2007). Buy the ebook online (or the printed version, if available).

The linguistic column

The sequence of languages that were successively spoken in any territory constitutes what by analogy [with the “geological column”] we could call its “ethno-linguistic column”.

Next I offer the list of the languages detected in the compositional (and to a lesser extent derivational) toponymic syntagms in which the appellatives ub-, up-, ab-, ap-, ur-, il-, igi, tuk, -ip – analyzed in this work – are involved.

From the interaction of the different strata in words and hybrid syntagms we can, therefore, establish the linguistic column in the Iberian Peninsula and its neighboring territories (Western Europe and Northern Africa) with the following sequence:

1. A first stratum of very old chronology, which in a previous publication I have proposed to call Palaeo-Indo-European [“arqueo-indoeuropeo”]. The toponymic elements belonging to this stratum dealt with throughout this text are abundant: kerso-, turso-, alawo-, lako-, mido-, silo-, tibo-, etc.

They always function as determinant toponyms of a place-name in any other language. It never uses the name “city” (or “river”) in hybrid syntagms. Their place names (determinants) are combined with names of the following languages:

   a) Iberian in Iberia or Southern France: kiŕś-iltiŕ, tuŕś-iltiŕ, alaun-iltiŕte, lakunm ∙ -iltiŕte.

   b) The language of the igi in southern Iberia and perhaps Northern Africa: Cantigi, Saltigi, Sagigi, Sicingi.

   c) The southern language of the postponed -il: Mid-ili, Sil-ili, Tib-ili.

   d) The language of the postponed -ip: Lac-ipo, Ost-ipo, Vent-ipo.

   e) Celtic in Gaul: kerso-ialos > Cersolius > Cerseuil; Ibili-duros > Ibliodurus.

Cariensi, Carantium, Carandonis, Carae, Caraca / Caracca, Carrinensis, Cariaca, Carneus, Carula, Carlae, Carieco, Cariocieco, Caricillum, Carona, Carnona, Caranta, Carantonus / Carantana, Caronte, Carantomum / Carantomium, Carronenses / Garronenses, Cares / Carus, Caranusca, Carona, Caro vicus, Carninia, Carus, Carnutes, Carnonis castrum, Carenses, Caralis / Carallis, Carni, Carnicum, Caraceni, Careia, Carici, Carant / Carrant, Carnonacae, Carontō, Cariolum, Caritani, Carinum, Carantani, Carnuntum, Cariniana Vallis, Cariones, Careotae, Caroia, Caria, Careum, Carnae, Caran, Carnasium, Carnus, Carneates, Carnium, Carenus, Karlasuwa, Carnias, Karahna, Karna, Cariuntis, Kariuna, Careotis, Karu, Caralitis, Carus, Carnasso, Cares, Carene, Caranum, Caria, Carina, Carura, Caralis, Coralis, Carana, Carnalis, Carinum, Carnus, Carium, Carnium, Carnus Carnuntus / Carnusii, Chariuntas, Carandra, Carna, Carana, Carine, Cariatae, Caralae, Carura, Carei, Carura, Caricum, Caranis, Caralia, Carustum, Carystus, Carastasei.

This first Palaeo-Indo-European layer also corresponds to:

Several Palaeo-Indo-European varieties that have ab-, ap-, ub-, up- as a name for «river». To them belong also numerous place names (balsa-, siko-, wol-, etc.) that act as first members of compounds in both monoglot and hybrid syntagms.

Palaeo-Indo-European varieties in which ur- is the name “river”.


2. The second stratum in decreasing order of antiquity is formed by the language of the place name igi “city”, although its presence is only verified with certainty in Iberia (especially in the south) and Northern Africa:

   a) It sets the igi name in compounds with Palaeo-Indo-European toponyms as in Salt-, Ast-, Olont-, Cant-, Aur- (Hispania) and Sagigi, Sicingi (Northern Africa).

   b) It works as the first place-name of the compound when the second is il: Igilium, Igilgili, Singili.

3. The third stratum is the language of the name il “city”:

   a) It puts the nickname il as determined in hybrid syntagms with Palaeo-Indo-European determinants: Mid-ili, Sil-ili, Tib-ili.

   b) It puts the nickname il as determined in hybrid syntagms with determinant toponyms igi: Igilium, Igilgili, Singili.

   c) It puts the place names (determinants) in front of the name (determined) of the language -ip (Il-ipa, Il-ipula and Il-ipla).


4. Fourth is the language of the name ip- “city”, which puts the name (determined) in syntagms with:

   a) Palaeo-Indo-European toponym (determinant): Lac-ipo, Ost-ipo, Vent-ipo.

   b) Toponym (determinant) il: Ilipa.

   c) Second generation hybrid toponym of Palaeo-Indo-European + il: Balsilippa.

   d) In the Balsilippa and Sicilippa conglomerates, the three strata appear in the expected sequence: Palaeo-Indo-European + il + ip.


5. In the fifth place of the sequence is the language of the tuk-:

   a) It puts the name tuk- in compounds in which the place-name is a Palaeo-Indo-European element: Acatucci (see Aduatuci in Germania).

   b) It puts the name tuk- “height, top” in compounds in which the place-name is an ip- fossilized as place-names: Iptuci, etc.

   c) On at least one occasion an ip-fossilized syntagm acts as a toponym opposite a Celtic name: Itucodon (<Iptuco-dunum).

NOTE. Even though Villar talks about this stratum -tuk in Germania (Aduatukus) and the British Isles (Itucodon), only one case is found in each territory.


6. The last place is occupied by Celtic:

   a) In Itucodon it puts the name (dunum) in front of a complex toponym of two previous strata, ip- + tuk-; and in Iliodurus it gives the name duro- in front of an equally complex Ibliodurus (<Ibili + duro).

   b) In bilbiliz it puts the casual morpheme in a fossilized bi-member toponym of a previous stratum, one of whose components is il-: Bilbil-iz.

[First column modified to include relative instead of absolute chronology]
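The way the column is derived from hybrid compounds can be sketched as an ordering problem. This is only an illustration under a stated assumption, not Villar's actual procedure: in each hybrid toponym, the stratum supplying the determinant (first element) is taken to be older than the stratum supplying the determined appellative (last element). The stratum labels and the function name `linguistic_column` are hypothetical; the compounds are the ones cited in the list above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (toponym, strata of its elements in order: determinant first, determined last)
HYBRIDS = [
    ("Cantigi", ["PIE", "igi"]),
    ("Igilium", ["igi", "il"]),
    ("Mid-ili", ["PIE", "il"]),
    ("Il-ipa", ["il", "ip"]),
    ("Lac-ipo", ["PIE", "ip"]),
    ("Balsilippa", ["PIE", "il", "ip"]),
    ("Iptuci", ["ip", "tuk"]),
    ("Itucodon", ["ip", "tuk", "Celtic"]),
    ("kerso-ialos", ["PIE", "Celtic"]),
]

def linguistic_column(hybrids):
    """Order strata from oldest to youngest using 'older precedes younger' pairs."""
    predecessors = {}
    for _toponym, strata in hybrids:
        # Each adjacent pair of elements yields one constraint: older before younger.
        for older, younger in zip(strata, strata[1:]):
            predecessors.setdefault(older, set())
            predecessors.setdefault(younger, set()).add(older)
    # static_order() yields every stratum after all of its predecessors.
    return list(TopologicalSorter(predecessors).static_order())

column = linguistic_column(HYBRIDS)
print(column)
```

A topological sort also makes contradictions visible: if two compounds implied opposite orders for the same pair of strata, `TopologicalSorter` would raise a `CycleError` instead of returning a sequence.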

A hard change of paradigm

It cost me more effort to accept that ub- is a dialectal variant of a known Indo-European word for “water, river”, of which I previously knew three others: ap-, ab-, up-. The obviousness of the phonetic correlation ap- / ab- // up- / ub- together with the semantic link with rivers, which can be verified above all outside of Spain, but is also present in our Peninsula, wore down my resistance little by little. And with it fell the first trench of the dogma, unshakable until that moment, that everything in the south of the Peninsula had to be non-Indo-European.


Along with this serial component, many other isolated place names were revealed as very likely of Indo-European etymology, both in the “Iberian” East and in the “Tartessian” South. So the ubiquity of Indo-European throughout the Peninsula began to impose itself on me painfully. I say painfully because I lacked a paradigm in which to fit the new perspective that was making its way into my mind, which was therefore suspended in nothing, without any theoretical support, leaving me with a feeling that I was losing my footing. And for a time I was reluctant to accept the profound implications that all of this entailed.

All il languages, in any of their locations, exhibit a compositional behavior in hybrid toponymic syntagms that place them all in an intermediate position between the clearly [first/second layer] strata, with place-names for their human settlements semantically derived from water realities (ur), and those clearly attributable to the [fifth layer] with appellations derived from settlements in heights (briga, dunum). But in that intermediate segment of the column there are three strata: 1) il, 2) ip-, 3) tuk-. In Andalusia there is an additional one: the igi stratum, of opaque semantics, which immediately precedes the il stratum.

Hydronyms in -or-, -ur-.

To postulate that any of the toponymic strata of our column imply a new linguistic stratum, certain additional requirements will be necessary. One of them is that, in addition to the name in question, the languages involved should share other features that could not have been lent, such as the very precise order of elements in the compounds Toponym + Name coexisting with Name + Adjective. Or the sharing of additional lexical elements that are not usually subject to loans, such as the semantically basic adjectives beri «new» and bels «black».

Unfortunately, the toponymic method, like the Comparative Method itself, does not have the capacity to establish precise absolute chronologies. (…)

Linguistic chronology

Old European hydrotoponymy. Baltic data compensated. Statistical method Kriging.

In Europe (Hispania, South of France, Germania, British Isles, Baltic) the oldest stratum that can be identified is an indeterminable number of palaeo-varieties of the Indo-European macro-family, which do not have a direct local relationship with historical Indo-European languages, to the extent that we can verify. In fact, we have seen that stratigraphic signs lead us to consider the main Indo-European pre-Roman language of Hispania, the Celtic language, as a stratum after the il language, which in turn is later than the peninsular Indo-European palaeo-varieties.

In North Africa there is also a Palaeo-Indo-European stratum present. But there is also a very old non-Indo-European stratum whose identity I cannot define through the material used. Nor has it been possible for me to establish the relative antiquity of one and the other on African soil.

Another of the languages involved, which has il- as an appellation for “city” in the Southwest of Hispania and North Africa, could have some kind of kinship relationship with Basque on the one hand and the Iberian language on the other, but in the same indirect form that I have just pointed out for the Indo-European palaeo-varieties with respect to the historical Indo-European languages. Or in other words: the language(s) of the place-names referred to in this work would be palaeo-varieties of a linguistic family to which two known historical languages, Iberian and Basque, may have belonged, although we can’t establish a relation of direct affiliation either between those two historical languages among themselves, or between any of them and the palaeo-varieties of the prehistoric toponymy.

[First column modified to include relative instead of absolute chronology]

In general, Celtic does not have in its historical territories the onomastic behavior of an ancestral language, but that of an intrusive language, whose presence there is not only more recent than other Indo-European varieties, but also after that of various non-Indo-European strata, which are themselves ranked between the oldest detected (Palaeo-Indo-European) and the last of Pre-Romans, which is Celtic itself. If we only detected two strata, the Indo-European and the Celtic ones, we could discuss if it is possible that both are one and the same, so that what we define as “Celtic” is nothing other than the modern in situ evolution of Palaeo-Indo-European. But examples like those of kiŕśiltiŕ, kerso-ialos, Cirsa or Itucodon, among many others analyzed throughout this book, make it unlikely. And, in addition, the mediation of several strata in the column between the Palaeo-Indo-European language of Cirsa, as well as the greater antiquity of the ip- and tuk- languages in Spanish, Gallic and British territory, defines the latter as a new and more recent layer than the aforementioned, which burst into its historical sites during the Iron Age.

Because Archaeology continues to deny the existence of population movements of a size worthy of consideration in the Iron Age, it is necessary to accept that the Indo-European Problem remains intact. It is understandable that, faced with this aporia, many minds who are uncomfortable living with doubts prefer to adopt a creed (the traditional, the Neolithic or the continuist) and expose it as a certainty to their students in the classrooms or their colleagues in conferences and publications. It’s not my case. For me, with Voltaire, “le doute est désagréable, mais la certitude est ridicule”. Or with Manzoni: “E men male l’agitarsi nel dubbio, che riposar nell’errore”.

Continue reading on European hydrotoponymy (II): Basques, Iberians, and Etruscans after Old Europeans.


Haplogroup R1b-M167/SRY2627 linked to Celts expanding with the Urnfield culture


As you can see from my interest in the recently published Olalde et al. (2019) Iberia paper, once you accept that East Bell Beakers expanded North-West Indo-European, the most important question becomes how its known dialects spread to their known historic areas.

We already had a good idea about the expansion of Celts, based on proto-historical accounts, fragmentary languages, and linguistic guesstimates, but the connection of Celtic with either Urnfield or slightly later Hallstatt/La Tène was always blurred, due to the lack of precise data on population movements.

The latest paper on Iberia is interesting for many details, such as:

  • The express dismissal of the newest pet theory based on the simplistic “steppe ancestry = IE”: the obsessive comparisons of Dutch Bell Beakers as the origin of basically anything that moves in Europe.
  • A discrete influx of North African ancestry in certain samples before the Moorish invasion (which was probably mediated by peoples of North African rather than Levantine admixture).
  • The finding of very Mycenaean-like Greek colonies of the 5th century (interestingly, under R1b lineages).

Modified from section of PCA of ancient samples by Olalde et al. (2019). “IE Iberia” refers to Pre-Celtic Indo-European languages of Iberia, such as Galaico-Lusitanian in the west (see more on Lusitanian), and a potentially Ligurian-related language in the North-East and southern France.

The paper is, however, of particular importance from the perspective of historical linguistics. It confirms that:

  • Celtic-speaking peoples expanded in Iberia likely during the Late Bronze Age – Early Iron Age (probably with the Urnfield culture, before 1000 BC) with North/Central European ancestry.

NOTE. The paper marks what are believed to be the boundaries of non-Indo-European languages during the Iron Age in later times, extrapolating that situation to the past. Mediterranean sites with Iberian traits (ca. 6th century on) were probably non-Indo-European-speaking tribes, but it is unclear what happened in the centuries before their sampling, and there are no clear boundaries. The arrival of these Celts from central Europe with the Urnfield culture makes it very likely that the Iberian expansion to the north happened later, thus incorporating this central European ancestry in the process. The southern (orientalizing, Tartessian) site of La Angorrilla shows incineration and influence from Phoenician settlers, and their actual language is also far from clear. The other investigated samples, with higher central European contribution, are from Celtiberian sites.

  • The slightly later arrival of (Phoenician, Greek and) Latin-speaking peoples into Iberia is marked by Central/Eastern Mediterranean and North African ancestry.

Expansion of different ancestry components in Iberia during Prehistory. Modified from Olalde et al. (2019) to include labels with populations expanding with each component.

While both confirm what was more or less already known about the oldest attested NWIE dialects, and further support the role of East Bell Beakers in expanding North-West Indo-European, the first part is interesting for two main reasons:

  1. Koch’s Celtic from the West hypothesis, which made a recent comeback with a renewed model based on “steppe ancestry”, is once again rejected in population genomics, as expected. At this point I doubt this will mean anything to the supporters of the theory (because you can propose as many “Celtic-over-Celtic” layers as you want), but if you are not obsessed with autochthonous continuity of Celtic languages in the Atlantic area we might begin to judge the most correct dialectal split (and thus classification) among those proposed to date, based on ancestry and haplogroup expansions.
  2. We believed in the 2000s that the expansion of haplogroup R1b-M167 (TMRCA ca. 1100 BC for YTree or 1700 BC for YFull) was coupled with the expansion of Iberians from the Pyrenees, in turn (thus) closely related to Basques. This non-IE presence has been contested with toponymic data in linguistics, and with the testing of many modern samples and the subsequent discovery of the widespread distribution of the subclade in western and northern Europe. Now it has become even more likely (lacking confirmation with aDNA) that this haplogroup expanded with Celts.

NOTE. Regarding R1b SNPs, YTree has more samples (and thus more SNPs) to work with estimates, due to its connection with FTDNA groups, so it is in principle more reliable (although estimates were calculated in 2017). Nevertheless, the methods to estimate the age of the MRCA are different between YTree and YFull.

YTree estimations of TMRCA for R1b-Z262 (left) and R1b-M167 (right).

Why this is important has to do with the realization that Celts must have expanded explosively in all directions during the estimated range for Common Celtic (ca. 1500-1000 BC), and as such R1b-M167 is probably going to be one of the clear Y-DNA markers of the Celtic expansion, when it appears in the ancient DNA record, maybe in new SNP calls from samples of the Olalde et al. (2019) paper, or in future Urnfield/Hallstatt/La Tène papers.

Sister clades derived from R1b-Z262 (TMRCA ca. 1650 BC for YTree, or 2700 BC for YFull), although sharing a quite old origin, may have taken part in the same communities that expanded R1b-M167, likely from some point in central Europe, possibly as remnants of a previous (Tumulus culture?) central European expansion, as the sample SZ5 from Szólád (R1b-CTS1595) and the distribution of modern samples suggest.

Left: Modern distribution of upstream clade L176.2 (YFull R1b-CTS4188); Right: Modern distribution of M167. Both include later expansions within Iberia (probably with the Crown of Aragon during the Reconquista). Contour maps of the derived allele frequencies of the SNPs analyzed in Solé-Morata et al. (2017).

EDIT (23 APRIL): In Hernández et al. (2018), the TMRCA of R1b-M167 is reported as 3372-3718 ybp:

The youngest sub-branch, R1b-M167, dates to approximately 3.5 kya (95% CI= 2.5-5.3 kya), i.e. even after the Bronze Age.
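To compare this estimate with the range given above for Common Celtic (ca. 1500-1000 BC), the "years ago" figures need to be turned into calendar years. The snippet below is just a back-of-the-envelope sketch. It assumes that "present" means 2000 CE, a round reference year chosen here for illustration that may not match the baseline used in the paper, and it ignores the missing year zero, which is negligible at this precision. The function names are hypothetical.

```python
# Assumption: "years before present" is counted back from 2000 CE.
REFERENCE_YEAR_CE = 2000

def ybp_to_bc(ybp: int) -> int:
    """Convert years before present to an approximate year BC."""
    return ybp - REFERENCE_YEAR_CE

def overlaps(range_a, range_b):
    """True if two (older_bc, younger_bc) ranges share at least one year."""
    older_a, younger_a = range_a
    older_b, younger_b = range_b
    return younger_a <= older_b and younger_b <= older_a

# Figures quoted in the text, expressed in years
point_bc = ybp_to_bc(3500)                   # 3.5 kya point estimate
ci_bc = (ybp_to_bc(5300), ybp_to_bc(2500))   # 95% CI, older bound first
common_celtic_bc = (1500, 1000)              # range cited for Common Celtic

print(point_bc, ci_bc, overlaps(ci_bc, common_celtic_bc))
```

Under that assumption the confidence interval is far wider than the Common Celtic window, so an overlap by itself is weak evidence. It only shows that the estimate does not contradict the proposed link.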

Contour (surface) maps displaying the frequencies of Y-chromosome haplogroup and its sub-lineages across Europe and the Mediterranean basin. Modified from Hernández et al. (2018).

NOTE. Admittedly, the maps are mainly based on Iberian samples and certain limited sampling elsewhere, so most of the frequencies displayed in other territories are extrapolated. Since the percentage of R1b-M167 in France is estimated to be ca. 3%, and in Bavaria ca. 5%, the frequency in Central Europe is probably much higher, and around the Mediterranean much lower, than represented in them.

The Celtic expansion might not have been a mass migration of peoples replacing all male lines of their controlled territories (as was common in the Neolithic and Chalcolithic), because of the Bronze Age dominant chiefdom-based system that relied on alliances, but it is becoming clear that Early Celts are also going to show the expansion of certain successful male lineages.

Oh, and you can say goodbye to the autochthonous “Vasconic = R1b-DF27” (latest heir of the “Vasconic = R1b-P312”) theory, too, if – for some strange reason – you hadn’t already.

EDIT (16 MAR) Just in case the wording is not clear: the fact that this haplogroup most likely expanded with Celts does not mean that its lineages didn’t become eventually incorporated into Iberian cultures and adopted non-IE languages: some of them probably did at some point, in some regions of northern Iberia, and most were certainly later incorporated to the Roman civilization and spoke Latin, then to the medieval kingdoms with their languages, and so on until the present day… Only those eventually associated with Iron Age Aquitanians may have retained their non-IE language, unless those lineages today associated with Basques were incorporated later to the Basque-speaking regions by expanding medieval kingdoms. A complex picture repeated everywhere in Europe: no haplogroup+language continuity in sight, anywhere.

NOTE. This is currently the most likely interpretation of the data, based on age estimates derived from mutations; it has not been confirmed with ancient samples.


Iberia: East Bell Beakers spread Indo-European languages; Celts expanded later


New paper (behind paywall), The genomic history of the Iberian Peninsula over the past 8000 years, by Olalde et al. Science (2019).

NOTE. Access to article from Reich Lab: main paper and supplementary materials.


We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean.

Interesting excerpts:

From the Bronze Age (~2200–900 BCE), we increase the available dataset (6, 7, 17) from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period (Fig. 1, C and D), albeit with less impact in the south (table S13). The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry (Fig. 2B). These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups (Fig. 2B and fig. S6).

Y-chromosome turnover was even more pronounced (Fig. 2B), as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269. These patterns point to a higher contribution of incoming males than females, also supported by a lower proportion of nonlocal ancestry on the X-chromosome (table S14 and fig. S7), a paradigm that can be exemplified by a Bronze Age tomb from Castillejo del Bonete containing a male with Steppe ancestry and a female with ancestry similar to Copper Age Iberians.


For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age (Figs. 1, C and D, and 2B). The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken (fig. S6 and tables S11 and S12).

This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition (18). Unlike in Central or Northern Europe, where Steppe ancestry likely marked the introduction of Indo-European languages (12), our results indicate that, in Iberia, increases in Steppe ancestry were not always accompanied by switches to Indo-European languages.

I think it is obvious that they are extrapolating the traditional (and not that well known) linguistic picture of Iron Age Iberia, projecting it back onto the Urnfield period and earlier as if that picture (especially its non-Indo-European languages) had been continuous.

What this data shows is, as expected, the arrival of Celtic languages in Iberia after Bell Beakers and, by extension, in the rest of western Europe. Somewhat surprisingly, this may have happened during the Urnfield period, and not during the La Tène period.

Also important are the precise subclades:

We thus detect three Bronze Age males who belonged to DF27 (154, 155), confirming its presence in Bronze Age Iberia. The other Iberian Bronze Age males could belong to DF27 as well, but the extremely low recovery rate of this SNP in our dataset prevented us to study its true distribution. All the Iberian Bronze Age males with overlapping sequences at R1b-L21 were negative for this mutation. Therefore, we can rule out Britain as a plausible proximate origin since contemporaneous British males are derived for the L21 subtype.

New open access paper Survival of Late Pleistocene Hunter-Gatherer Ancestry in the Iberian Peninsula, by Villalba-Mouco et al. Cell (2019):

BAL0051 could be assigned to haplogroup I1, while BAL003 carries the C1a1a haplogroup. To the limits of our typing resolution, EN/MN individuals CHA001, CHA003, ELT002 and ELT006 share haplogroup I2a1b, which was also reported for Loschbour [73] and Motala HG [13], and other LN and Chalcolithic individuals from Iberia [7, 9], as well as Neolithic Scotland, France, England [9], and Lithuania [14]. Both C1 and I1/ I2 are considered typical European HG lineages prior to the arrival of farming. Interestingly, CHA002 was assigned to haplogroup R1b-M343, which together with an EN individual from Cova de Els Trocs (R1b1a) confirms the presence of R1b in Western Europe prior to the expansion of steppe pastoralists that established a related male lineage in Bronze Age Europe [3, 6, 9, 13, 19]. The geographical vicinity and contemporaneity of these two sites led us to run genomic kinship analysis in order to rule out any first or second degree of relatedness. Early Neolithic individual FUC003 carries the Y haplogroup G2a2a1, commonly found in other EN males from Neolithic Anatolia [13], Starçevo, LBK Hungary [18], Impressa from Croatia and Serbia Neolithic [19] and Czech Neolithic [9], but also in MN Croatia [19] and Chalcolithic Iberia [9].

See also