In it, Fischer and colleagues update their previous data for the Y-DNA of Gauls from the Urville-Nacqueville necropolis, Normandy (ca. 300-100 BC), with 8 samples of hg. R, at least 5 of them R1b. They also report new data from the Gallic cemetery at Gurgy ‘Les Noisats’, Southern Paris Basin (ca. 120-80 BC), with 19 samples of hg. R, at least 13 of them R1b.
In both cases, the members of each community likely belonged to the same paternal lineages, consistent with the patrilocal residence rules and patrilineality described for Gallic groups, which is also supported by the different maternal gene pools.
The interesting data would be whether these individuals were of hg. R1b-L21, hence mainly local lineages later replaced or displaced to the west, or – a priori much more likely – of some R1b-U152 and/or R1b-DF27 subclades from Central Europe that became less and less prevalent as Celts expanded into more isolated regions south of the Pyrenees and into the British Isles. Such information is lacking in the paper, probably due to the poor coverage of the samples.
Three Britons from Hinxton, South Cambridgeshire (ca. 170 BC – AD 80) from Schiffels et al. (2016), two of them of local hg. R1b-S461.
Indirectly, data of Vikings by Margaryan et al. (2019) from the British Isles and beyond show hg. R1b associated with modern British-like ancestry, also linked to early “Picts”, hence likely associated with Britons even after the Anglo-Saxon settlement. Supporting both (1) my recent prediction of hg. R1b-M167 expanding with Celts and (2) the reason for its presence among modern Scandinavians, is the finding of the first ancient sample of this subclade (VK166) among the Vikings of St John’s College Oxford, associated with the ‘St Brice’s Day Massacre’ (see Margaryan et al. 2019 supplementary materials).
The R1b-M167 sample shows 23.5% British-like ancestry, hence autosomally closer to other local samples (and related to the likely Picts from Orkney) than to some of his deceased partners at the site. Other samples with sizeable British-like ancestry include VK177 (32.6%, hg. R1b-U152), VK173 (33.3%, hg. I2a1b1a), or VK150 (25.6%, hg. I2a1b1a), while typical Germanic subclades like I1 or R1b-U106 – which may be associated with Anglo-Saxons, too – tend to show less.
I remember some commenter asking recently what would happen to the theory of Proto-Indo-European-speaking R1b-rich Yamnaya culture if Celts expanded with hg. R1a, because there was only one hg. R1b sample and one (possibly) G2a sample from Hallstatt. As it turns out, they were mostly R1b. However, the increasingly frequent obsession with searching for specific haplogroups and ancestry during the Iron Age and the Middle Ages is weird, even as a desperate attempt, because:
it is evident that the more recent the ancient DNA samples are, the more they are going to resemble modern populations of the same area, so ancient DNA would become essentially useless;
cultures from the early Iron Age onward (and even earlier) were based on increasingly complex sociopolitical systems everywhere, which is reflected in haplogroup and ancestry variability, e.g. among Balts, East Germanic peoples, Slavs (of hg. E1b-V13, I2a-L621), or Tocharians.
In fact, even the finding of hg. R1b among Celts of central and western Europe during the Iron Age is rather unenlightening, because more specific subclades and information on ancestry changes are needed to reach any meaningful conclusion as to migration vs. acculturation waves of expanding Celtic languages, which spread into areas that were mostly Indo-European-speaking since the Bell Beaker expansion.
Although the Devil’s Cave ancestry is generally the predominant East Asian lineage in North Asia and adjacent areas, there is an intriguing discrepancy between the eastern [Korean, Japanese, Tungusic (except northernmost Oroqen), and Mongolic (except westernmost Kalmyk) speakers] and the western part [West Xiōngnú (~2,150 BP), Tiānshān Hun (~1,500 BP), Turkic-speaking Karakhanid (~1,000 BP) and Tuva, and Kalmyk]. Whereas the East Asian ancestry of populations in the western part has entirely belonged to the Devil’s Cave lineage till now, populations in the eastern part have received the genomic influence from an Amis-related lineage (17.4–52.1%) posterior to the presence of the Devil’s Cave population roughly in the same region (~7,600 BP) [12]. Analogically, archaeological record has documented the transmission of wet-rice cultivation from coastal China (Shāndōng and/or Liáoníng Peninsula) to Northeast Asia, notably the Korean Peninsula (Mumun pottery period, since ~3,500 BP) and the Japanese archipelago (Yayoi period, since ~2,900 BP) [2]. Especially for Japanese, the Austronesian-related linguistic influence in Japanese may indicate a potential contact between the Proto-Japonic speakers and population(s) affiliating to the coastal lineage. Thus, our results imply that a southern-East-Asian-related lineage could be arguably associated with the dispersal of wet-rice agriculture in Northeast Asia at least to some extent.
In this case, the study doesn’t compare Steppe_MLBA, though, so the findings of Afanasievo ancestry have to be taken with a pinch of salt. They are, however, compared to Namazga, so “Steppe ancestry” is there. Taking into account the limited amount of Yamnaya-like ancestry that could have reached the Tian Shan area with the Srubna-Andronovo horizon in the Iron Age (see here), and the amount of Yamnaya-like ancestry that appears in some of these populations, it seems unlikely that this amount of “Steppe ancestry” would emerge as based only on Steppe_MLBA, hence the most likely contacts of Turkic peoples with populations of both Afanasievo (first) and Corded Ware-derived ancestry (later) to the west of Lake Baikal.
(1) The simplification of ancestral components into A vs. B vs. C… (when many were already mixed), and (2) the simplistic selection of one OR the other in the preferred models (such as those published for Yamnaya or Corded Ware), both common strategies in population genomics, pose evident problems when assessing the actual gene flow from some populations into others.
Also, it seems that when the “Steppe”-like contribution is small, both Yamnaya and Corded Ware ancestry will be good fits in admixed populations of Central Asia, due to the presence of peoples of EHG-like (viz. West Siberia HG) and/or CHG-like (viz. Namazga) ancestry in the area. Unless and until these problems are addressed, there is little that can be confidently said about the history of Yamnaya vs. Corded Ware admixture among Asian peoples.
Maps, maps, and more maps
As you have probably noticed if you follow this blog regularly, I have been experimenting with GIS software in the past month or so, trying to map haplogroups and ancestry components (see examples for Vikings, Corded Ware, and Yamnaya). My idea was to show the (pre)historical evolution of ancestry and haplogroups coupled with the atlas of prehistoric migrations, but I have to understand first what I can do with GIS statistical tools.
My latest exercise has been to map modern haplogroup distribution (now added to the main menu above) using data from the latest available reports. While there have been no great surprises – beyond the sometimes awful display of data by some papers – I think it is becoming clearer with each new publication how wrong it was for geneticists to target initially those populations considered “isolated” – hence subject to strong founder effects – to extrapolate language relationships. For example:
The mapping of R1b-M269, in particular basal subclades, corresponds nicely with the Indo-European expansions.
There is no clear relationship of R1b, not even R1b-DF27 (especially basal subclades), with Basques. There is no apparent relationship between the distribution of R1b-M269 and some mythical non-Indo-European “Old Europeans”, like Etruscans or Caucasian speakers, either.
Basal R1a-M417 shows an interesting distribution, as do maps of basal Z282 and Z93 subclades, despite the evident late bottlenecks and acculturation among Slavs.
The distribution of hg. N1a-VL29 (and other N1a-L392 subclades) is clearly dissociated from Uralic peoples, and their expansion in the whole Baltic Sea during the Iron Age doesn’t seem to be related to any specific linguistic expansion.
Even the most recent association in Post et al. (2019) with hg. N1a-Z1639 – due to the lack of relationship of Uralic with N1a-VL29 – seems like a stretch, seeing how it probably expanded from the Kola Peninsula and the East Urals, and neither the Lovozero Ware nor forest hunter-fishers of the Cis- and Trans-Urals regions were Uralic-speaking cultures.
All in all, modern haplogroup distribution might have been used to ascertain prehistoric language movements even in the 2000s. It was the obsession with (and the wrong assumptions about) the “purity” of certain populations – say, Basques or Finns – that caused many of the interpretation problems and circular reasoning we are still seeing today.
I have also tried Yleaf v.2 – which seems like an improvement over the infamous v.1 – to test some samples that hobbyists and/or geneticists have reported differently in the past. I have posted the results in this ancient DNA haplogroup page. It doesn’t mean that the inferences I obtain are the correct ones, but now you have yet another source to compare.
I0124, the Samara HG, is of hg. R1b-P297, but uncertain for both R1b-M73 and R1b-M269.
I0122, the Khvalynsk chieftain, is of hg. R1b-V1636.
I2181, the Smyadovo outlier of poor coverage, is possibly of hg. R, and could be of hg. R1b-M269, but could also be even non-P.
I6561 from Alexandria is probably of hg. R1a-M417, likely R1a-Z645, maybe R1a-Z93, but can’t be known beyond that, which is more in line with the TMRCA of R1a subclades and the radiocarbon date of the sample.
I2181, the Yamnaya individual (supposedly Pre-R1b-L51) at Lopatino II is R1b-M269, negative for R1b-L51. Nothing beyond that.
You can ask me to try mapping more data or to test the haplogroup of more samples, provided you give me a proper link to the relevant data, they are interesting for the subject of this blog…and I have the time to do it.
Interesting excerpts (emphasis mine, modified for clarity):
To understand the genetic structure and influence of the Viking expansion, we sequenced the genomes of 442 ancient humans from across Europe and Greenland ranging from the Bronze Age (c. 2400 BC) to the early Modern period (c. 1600 CE), with particular emphasis on the Viking Age. We find that the period preceding the Viking Age was accompanied by foreign gene flow into Scandinavia from the south and east: spreading from Denmark and eastern Sweden to the rest of Scandinavia. Despite the close linguistic similarities of modern Scandinavian languages, we observe genetic structure within Scandinavia, suggesting that regional population differences were already present 1,000 years ago.
Maps illustrating the following texts have been made based on data from this and other papers:
Maps showing ancestry include only data from this preprint (which also includes some samples from Sigtuna).
Maps showing haplogroups of ancient DNA samples based on their age include data from all published papers, but with slightly modified locations to avoid overcrowding (randomized distance approx. ± 0.1 long. and lat.).
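The location randomization mentioned above can be sketched roughly as follows. This is only a minimal illustration, not the actual GIS workflow used for the maps: the function name `jitter_location`, the uniform distribution, and the example coordinates are all assumptions; only the approximate ± 0.1 degree offset comes from the text.

```python
import random

def jitter_location(lon, lat, max_offset=0.1, rng=None):
    """Return (lon, lat) shifted by a random offset of at most
    max_offset degrees on each axis, so that samples from the same
    site do not overlap on a map.

    Assumption: offsets are drawn uniformly; the text only says
    "randomized distance approx. +/- 0.1 long. and lat.".
    """
    rng = rng or random
    new_lon = lon + rng.uniform(-max_offset, max_offset)
    new_lat = lat + rng.uniform(-max_offset, max_offset)
    return new_lon, new_lat

# Hypothetical example: three samples reported at the same coordinates.
rng = random.Random(42)  # fixed seed so the map is reproducible
site = (18.5, 57.5)      # illustrative coordinates only
jittered = [jitter_location(*site, rng=rng) for _ in range(3)]
```

Passing a seeded `random.Random` instance keeps the displaced points stable between map renders, which matters if the same layer is exported several times.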
We find that the transition from the BA to the IA is accompanied by a reduction in Neolithic farmer ancestry, with a corresponding increase in both Steppe-like ancestry and hunter-gatherer ancestry. While most groups show a slight recovery of farmer ancestry during the VA, there is considerable variation in ancestry across Scandinavia. In particular, we observe a wide range of ancestry compositions among individuals from Sweden, with some groups in southern Sweden showing some of the highest farmer ancestry proportions (40% or more in individuals from Malmö, Kärda or Öland).
Ancestry proportions in Norway and Denmark on the other hand appear more uniform. Finally we detect an influx of low levels of “eastern” ancestry starting in the early VA, mostly constrained among groups from eastern and central Sweden as well as some Norwegian groups. Testing of putative source groups for this “eastern” ancestry revealed differing patterns among the Viking Age target groups, with contributions of either East Asian- or Caucasus-related ancestry.
Overall, our findings suggest that the genetic makeup of VA Scandinavia derives from mixtures of three earlier sources: Mesolithic hunter-gatherers, Neolithic farmers, and Bronze Age pastoralists. Intriguingly, our results also indicate ongoing gene flow from the south and east into Iron Age Scandinavia. Thus, these observations are consistent with archaeological claims of wide-ranging demographic turmoil in the aftermath of the Roman Empire with consequences for the Scandinavian populations during the late Iron Age.
Genetic structure within Viking-Age Scandinavia
We find that VA Scandinavians on average cluster into three groups according to their geographic origin, shifted towards their respective present-day counterparts in Denmark, Sweden and Norway. Closer inspection of the distributions for the different groups reveals additional complexity in their genetic structure.
We find that the ‘Norwegian’ cluster includes Norwegian IA individuals, who are distinct from both Swedish and Danish IA individuals which cluster together with the majority of central and eastern Swedish VA individuals. Many individuals from southwestern Sweden (e.g. Skara) cluster with Danish present-day individuals from the eastern islands (Funen, Zealand), skewing towards the ‘Swedish’ cluster with respect to early and more western Danish VA individuals (Jutland).
Some individuals have strong affinity with Eastern Europeans, particularly those from the island of Gotland in eastern Sweden. The latter likely reflects individuals with Baltic ancestry, as clustering with Baltic BA individuals is evident in the IBS-UMAP analysis and through f4-statistics.
Genetic clustering using IBS-UMAP suggested genetic affinities of some Viking Age individuals with Bronze Age individuals from the Baltic. To further test these, we quantified excess allele sharing of Viking Age individuals with Baltic BA compared to early Viking Age individuals from Salme using f4 statistics. We find that many individuals from the island of Gotland share a significant excess of alleles with Baltic BA, consistent with other evidence of this site being a trading post with contacts across the Baltic Sea.
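For readers unfamiliar with f4-statistics, the basic quantity is the average, over SNPs, of the product of two allele-frequency differences. The sketch below is only an illustration of that definition: the population arrangement, the function name `f4`, and the toy frequencies are assumptions, since the excerpt does not give the exact configuration used in the preprint. Real analyses also estimate a standard error (typically with a block jackknife) to decide whether a value is significantly different from zero, which is omitted here.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d),
    where each argument is a list of allele frequencies for one
    population, aligned by SNP."""
    n = len(freq_a)
    if n == 0 or not (n == len(freq_b) == len(freq_c) == len(freq_d)):
        raise ValueError("frequency lists must be non-empty and of equal length")
    return sum(
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    ) / n

# Toy allele frequencies (invented for illustration, not real data).
outgroup  = [0.1, 0.5, 0.9]
baltic_ba = [0.3, 0.4, 0.7]
salme     = [0.2, 0.5, 0.8]
target    = [0.3, 0.4, 0.7]  # hypothetical Viking Age individual or group

# With this (assumed) arrangement, a positive value means the target shares
# more alleles with Baltic BA than the Salme group does.
excess = f4(outgroup, baltic_ba, salme, target)
```

If the target had exactly the same frequencies as the Salme group, the statistic would be zero, which is the null expectation against which an "excess of alleles" is measured.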
The earliest N1a-VL29 sample available comes from Iron Age Gotland (VK579) ca. AD 200-400 (see Iron Age Y-DNA maps), which also proves its presence in the western Baltic before the Viking expansion. The distribution of N1a-VL29 and R1a-Z280 (compared to R1a in general) among Vikings also supports a likely expansion of both lineages in succeeding waves from the east with Akozino warrior-traders, at the same time as they expanded into the Gulf of Finland.
Vikings in Estonia
(…) only one Viking raiding or diplomatic expedition has left direct archaeological traces, at Salme in Estonia, where 41 Swedish Vikings who died violently were buried in two boats accompanied by high-status weaponry. Importantly, the Salme boat-burial predates the first textually documented raid (in Lindisfarne in 793) by nearly half a century. Comparing the genomes of 34 individuals from the Salme burial using kinship analyses, we find that these elite warriors included four brothers buried side by side and a 3rd degree relative of one of the four brothers. In addition, members of the Salme group had very similar ancestry profiles, in comparison to the profiles of other Viking burials. This suggests that this raid was conducted by genetically homogeneous people of high status, including close kin. Isotope analyses indicate that the crew descended from the Mälaren area in Eastern Sweden thus confirming that the Baltic-Mid-Swedish interaction took place early in the VA.
N1a-VL29 lineages spread again later eastwards with Varangians, from Sweden into north-eastern Europe, most likely including the ancestors of the Rurikid dynasty. Unsurprisingly, the arrival of Vikings with Swedish ancestry into the East Baltic and their dispersal through the forest zone didn’t cause a language shift of Balto-Finnic, Mordvinic, or East Slavic speakers to Old Norse, either…
NOTE. For N1a-Y4339 – N1a-L550 subclade of Swedish origin – as main haplogroup of modern descendants of Rurikid princes, see Volkov & Seslavin (2019) – full text in comments below. Data from ancient samples show varied paternal lineages even among early rulers traditionally linked to Rurik’s line, which explains some of the discrepancies found among modern descendants:
A sample from Chernihiv (VK542) potentially belonging to Gleb Svyatoslavich, the 11th century prince of Tmutarakan/Novgorod, belongs to hg. I2a-Y3120 (a subclade of early Slavic I2a-CTS10228) and has 71% “Modern Polish” ancestry (see below).
Izyaslav Ingvarevych, the 13th century prince of Dorogobuzh, Principality of Volhynia/Galicia, is probably behind a sample from Lutsk (VK541), and belongs to hg. R1a-L1029 (a subclade of R1a-M458), showing ca. 95% of “Modern Polish” ancestry.
Firstly, modern Finnish individuals are not like ancient Finnish individuals, modern individuals have ancestry of a population not in the reference; most likely Steppe/Russian ancestry, as Chinese are in the reference and do not share this direction. Ancient Swedes and Norwegians are more extreme than modern individuals in PC2 and 4. Ancient UK individuals were more extreme than Modern UK individuals in PC3 and 4. Ancient Danish individuals look rather similar to modern individuals from all over Scandinavia. By using a supervised ancient panel, we have removed recent drift from the signal, which would have affected modern Scandinavians and Finnish populations especially. This is in general a desirable feature but it is important to check that it has not affected inference.
The story for Modern-vs-ancient Finnish ancestry is consistent, with ancient Finns looking much less extreme than the moderns. Conversely, ancient Norwegians look like less-drifted modern Norwegians; the Danish admixture seen through the use of ancient DNA is hard to detect because of the extreme drift within Norway that has occurred since the admixture event. PC4 vs PC5 is the most important plot for the ancient DNA story: Sweden and the UK (along with Poland, Italy and to an extent also Norway) are visibly extremes of a distribution the same “genes-mirror-geography” that was seen in the Ancient-palette analysis. PC1 vs PC2 tells the same story – and stronger, since this is a high variance-explained PC – for the UK, Poland and Italy.
Evidence for Pictish Genomes
The four ancient genomes of Orkney individuals with little Scandinavian ancestry may be the first ones of Pictish people published to date. Yet a similar (>80% “UK” ancestry) individual was found in Ireland (VK545) and five in Scandinavia, implying that Pictish populations were integrated into Scandinavian culture by the Viking Age.
Our interpretation for the Orkney samples can be summarised as follows. Firstly, they represent “native British” ancestry, rather than an unusual type of Scandinavian ancestry. Secondly, that this “British” ancestry was found in Britain before the Anglo-Saxon migrations. Finally, that in Orkney, these individuals would have descended from Pictish populations.
(…) ‘UK’ represents a group from which modern British and Irish people all receive an ancestry component. This information together implies that within the sampling frame of our data, they are proxying the ‘Briton’ component in UK ancestry; that is, a pre-Roman genetic component present across the UK. Given they were found in Orkney, this makes it very likely that they were descended from a Pictish population.
Modern genetic variation within the UK sees variation between ‘native Briton’ populations Wales, Scotland, Cornwall and Ireland as large compared to that within the more ‘Anglo-Saxon’ English. This is despite subsequent gene flow into those populations from English-like populations. We have not attempted to disentangle modern genetic drift from historically distinct populations. Roman-era period people in England, Wales, Ireland and Scotland may not have been genetically close to these Orkney individuals, but our results show that they have a shared genetic component as they represent the same direction of variation.
As in the case of mitochondrial DNA, the overall distribution profile of the Y chromosomal haplogroups in the Viking Age samples was similar to that of the modern North European populations. The most frequently encountered male lineages were the haplogroups I1, R1b and R1a.
Haplogroup I (I1, I2)
The distribution of I1 in southern Scandinavia, including a sample from Sealand (VK532) ca. AD 100 (see Iron Age Y-DNA maps) proves that it had become integrated into the West Germanic population already before their expansions, something that we already suspected thanks to the sampling of Germanic tribes.
Haplogroup R1b (M269, U106, P312)
Especially interesting is the finding of R1b-L151 widely distributed in the historical Nordic Bronze Age region, which is in line with the estimated TMRCA for R1b-P312 subclades found in Scandinavia, despite the known bottleneck among Germanic peoples under U106. Particularly telling in this regard is the finding of rare haplogroups R1b-DF19, R1b-L238, or R1b-S1194. All of that points to the impact of Bell Beaker-derived peoples during the Dagger period, when Pre-Proto-Germanic expanded into Scandinavia.
Also interesting is the finding of hg. R1b-P297 in Troms, Norway (VK531) ca. 2400 BC. R1b-P297 subclades might have expanded to the north through Finland with post-Swiderian Mesolithic groups (read more about Scandinavian hunter-gatherers), and the ancestry of this sample points to that origin.
However, it is also known that ancestry might change within a few generations of admixture, and that the transformation brought about by Bell Beakers with the Dagger Period probably reached Troms, so this could also be an R1b-M269 subclade. In fact, the few available data from this sample show that it comes from the natural harbour Skarsvågen at the NW end of the island Senja, and that the excavating archaeologist thought it was from the Viking period or slightly earlier, based on the grave form. From Prescott (2017):
In 1995, Prescott and Walderhaug tentatively argued that a dramatic transformation took place in Norway around the Late Neolithic (2350 BCE), and that the swift nature of this transition was tied to the initial Indo-Europeanization of southern and coastal Norway, at least to Trøndelag and perhaps as far north as Troms. (…)
The Bell Beaker/early Late Neolithic, however, represents a source and beginning of these institutions and practices, exhibits continuity to the following metal age periods and integrated most of Northern Europe’s Nordic region into a set of interaction fields. This happened around 2400 BCE, at the MNB to LN transition.
NOTE. This particular sample is not included in the maps of Viking haplogroups.
Among the ancient samples, two individuals were identified as carrying derived haplogroups of E1b1b1-M35.1, which are frequently encountered in modern southern Europe, Middle East and North Africa. Interestingly, the individuals carrying these haplogroups had much less Scandinavian ancestry compared to most samples inferred from haplotype based analysis. A similar pattern was also observed for less frequent haplogroups in our ancient dataset, such as G (n=3), J (n=3) and T (n=2), indicating a possible non-Scandinavian male genetic component in the Viking Age Northern Europe. Interestingly, individuals carrying these haplogroups were from the later Viking Age (10th century and younger), which might indicate some male gene influx into the Viking population during the Viking period.
As the paper says, the small sample size of rare haplogroups cannot distinguish if these differences are statistically relevant. Nevertheless, both E1b samples have substantial Modern Polish-like ancestry: one sample from Gotland (VK474), of hg. E1b-L791, has ca. 99% “Polish” ancestry, while the other one from Denmark (VK362), of hg. E1b-V13, has ca. 35% “Polish”, ca. 35% “Italian”, as well as some “Danish” (14%) and minor “British” and “Finnish” ancestry.
Given the E1b-V13 samples of likely Central-East European origin among Lombards, Visigoths, and especially among Early Slavs, and the distribution of “Polish” ancestry among Viking samples, VK362 is probably a close description of the typical ancestry of early Slavs. The peak of Modern Polish-like ancestry around the Upper Pripyat during the (late) Viking Age suggests that Poles (like East Slavs) have probably mixed since the 10th century with more eastern peoples close to north-eastern Europeans, derived from ancient Finno-Ugrians:
Similarly, the finding of R1a-M458 among Vikings in Funen, Denmark (VK139), in Lutsk, Poland (VK541), and in Kurevanikha, Russia (VK160), apart from the early Slav from Usedom, may attest to the origin of the spread of this haplogroup in the western Baltic after the Bell Beaker expansion, once integrated in both Germanic and Balto-Slavic populations, as well as intermediate Bronze Age peoples that were eventually absorbed by their expansions. This contradicts, again, my simplistic initial assessment of R1a-M458 expansion as linked exclusively (or even mainly) to Balto-Slavs.
The nature of the prehistoric languages of the British Isles is particularly difficult to address: because of the lack of ancient data from certain territories; because of the traditional interpretation of Old European names simply as “Celtic”; and because Vennemann’s re-labelling of the Old European hydrotoponymy as non-Indo-European has helped distract the focus away from the real non-Indo-European substrate on the islands.
Alteuropäisch and Celtic
An interesting summary of hydronymy in the British Isles was already offered long ago, in British and European River-Names, by Kitson, Transactions of the Philological Society (1996) 94(2):73-118. In it, he discusses, among others:
Non-serial hydronyms: Drua-/Drav-/Dru-, from drew- sometimes reshaped as derw-; ab-; ag-; al-; alb-; alm-; am-; antjā-; arg-; aw-; dan-; eis-; el-/ol-; er-/or-; kar(r)a-, ker-; nebh-; ned-; n(e)id-; sal-; wig-; weis-/wis-; ur-, wer-; etc.
Serial elements: -went-, -m(e)no-, -nt-o-, -n-; -nā-, -tā-; -st-, -r-; etc.
Probably non-Celtic suffixes are found e.g. in Tamesis, paralleled in the Spey Tuesis, and also in Tweed (<*Twesetā?); -no-/-nā- is also particularly frequent in Scottish river-names, but not in English ones. Another interesting case is the reversed relative order of suffixes, -r-st- instead of -st-r-.
Most if not all of them can be explained as of Old European nature. I will leave aside the discussion of particular formations – most of which may be found repeated, complemented, and updated in more modern texts.
Bell Beakers as Old Europeans
(…) Bell-beakers are in fact the only archaeological phenomenon of any period of prehistory with a comparably wide spread to that of river-names in the western half of Europe. The presumption must I think be that Beaker Folk were the vector of alteuropäisch river-names to most of western Europe. Rivers in the base Arg-, which we have seen there is cause to think was not already in use at the earliest stage of the river-naming system, and which therefore should be associated with such a vector if one existed, fit their distribution exceptionally well.
That they were a single-speech community can be asserted more confidently of the Beaker Folk than of most archaeologically identified groups for the very reasons that have caused archaeologists difficulty in interpreting them. As McEvedy (1967:28) put it, ‘the bell-beaker folk march convincingly in every prehistorian’s text, but they do so from Spain to Germany in some and from Germany to Spain in others, while lately there has been a tendency to make them go from Spain to Germany and back again (primary and reflux movements)’. One ‘firm datum seems to be that the British beaker folk came from the Rhine-Elbe region.’
This confirms what the long chronology now indicated for Common Indo-European would suggest anyway, and what to me, as remarked above, the rareness of non-Indo-European names in England suggests, that the old dissenting minority of Celticists were right to see the arrival in Britain of Indo-Europeans, as evinced in river-names whether or not in ethnic proto-Celts, as early as the third millennium. McEvedy’s map of Beaker Folk identifies them linguistically with Celto-Ligurians, but in that his admirably tidy mind was, typically, a degree too tidy. Considerations of phonology indicate that more than one linguistic group was involved.
It is normal in reconstructed Indo-European for groups of related words not all to have the same vowel in the root syllable. The commonest vowel gradation is between e, o, and zero; (…) Language-groups that level short a and o include Germanic and Baltic, Slavonic, Illyrian, Hittite and Indo-Iranian; but Celtic and Italic like Greek and Armenian preserve the original distinction. It follows that Celts speaking normal Celtic sounds cannot have been wholly responsible for bringing alteuropäisch river-names to any area. It would seem to follow, as Professor Nicolaisen has consistently urged, that in Spain, Gaul, Britain, and Italy, where the only historically known early Indo-Europeans were speakers of non-levelling languages, they were preceded by speakers of levelling languages not historically known. This hypothesis, pretty well required by the linguistic evidence, finds so good an archaeological correlate in the Beaker People that I think it would now be flying in the face of the evidence not to accept those as bearers of the river-names to these countries.
The funny note is the rejection of the steppe homeland by Kitson in favour of Central European Neolithic cultures, due in part to the ‘impossibility’ of proto-Finnic loans from East Indo-European, if Proto-Indo-European was spoken in the steppe. As I said recently, the lack of knowledge of Uralic languages and Indo-European – Uralic contacts has clearly conditioned the Urheimat question for both Proto-Indo-European and Proto-Uralic researchers.
The question is, though, to what extent the reasoning of those researchers was detailed enough to be considered a modern approach to the question, because Krahe in the 1940s seems to offer the first reliable data to make that assumption. In any case, Gimbutas’ idea of Kurgan warriors imposing Indo-European languages everywhere, so over-represented in Encyclopedia-like texts since the end of the 1990s, was not the only, and probably never the main, hypothesis among many Indo-Europeanists.
Celts part of Bell Beakers?
Regarding Koch and Cunliffe’s revival of the autochthonous Celts idea, one can find a similar traditional view among British researchers of the early and mid-20th century – and a proper rejection based on hydrotoponymy. It seems that many fringe theories in Indo-European studies, from Nordic or Baltic homelands to autochthonous Celts to the Europa Vasconica, can be traced back to revivalist waves of romantic views of the 19th c.:
What the late Professor C. F. C. Hawkes called in British archaeology ‘cumulative Celticity’, built up by successions of comparatively small tribal migrations, will then have operated on the linguistic side as well. That the predecessors of the Celts proper for so long had in most of Britain been people of similar Indo-European speech explains why there is not a significant survival of recognizably non-Indo-European river-names, and why the few serious candidates for non-Indo-European among recorded place-names all seem to be in Scotland. That the river-names kept their north European non-Celtic phonology will be because the Celts proper took them over as names, with denotative not fully lexical meaning. (…)
(…) I think non-Celtic Indo-European-speakers are likely to have been involved in fact, whether or not they are the whole story, both because that it is the hypothesis which makes best sense of the archaeological evidence (…)
(…) because it is widely accepted that placenames in the Low Countries imply the existence of at least one group of not historically attested Indo-European-speakers, not the same as the ones we are concerned with. So do names in Spain, another country where the only historically attested early Indo-Europeans were Celtic. Comparing Spanish alteuropäisch names with British ones gives a glimpse of the dialectal range that must have characterized the Beaker phenomenon. Either group shares one feature with historical Celtic that the other lacks. The Spanish names like Celtic proper mostly keep Indo-European o. There the diagnostic feature is initial p (Schmoll 1959:93, 78-80; Rodriguez 1980), lost from Celtic and the alteuropäisch of Britain.
Interesting is also the early reaction against Vennemann’s much publicized interpretation of Krahe’s Old European as ‘Vasconic’. This is a useful comment which is still applicable to the same non-existent ‘problem’ found by some Indo-Europeanists, depending on their ideas about Indo-European dialectalization:
It is again naughty of Vennemann (1994:244) to call his laryngealist explanation ‘the only kind of explanation that I know’. At least he does not quite go so far in his laryngealism as to posit a proto-Indo-European in which the vowel a never existed, as Kuiper does.
NOTE. It is difficult to understand why the work of so many Indo-Europeanists is usually not known, while Vennemann’s far-fetched theory has been endlessly repeated. I reckon it must be the same phenomenon of personal and professional contacts, involvement in editorial decisions, and simplification in mass media which makes Kristiansen and his theories frequently published and cited nowadays.
Based on these data, I entertained the idea of arguing for a Pre-Celtic Indo-European language in A Storm of Words, called Pre-Pritenic, with a tentative fable based on the data described below for the Insular Celtic substrate, but eventually deleted the whole text, because (unlike other tentative fables, like the Lusitanian or Venetic ones) it was pure speculation with not even fragmentary data to rely on. Here is a fragment of the discussion:
Among the main reasons adduced to reject the non-Celtic nature of Pritenic is Orkney, a region where Pictish carved stones have been found (indicator of a centralised Pictish power and identity). The name was attested first as Gk. Orkas / Orkádos (secondary source, from Pytheas of Massilia, ca. 322-285 BC, or possibly much later) and Lat. Orchades / Orcades (by Latin sources in the 1st century AD), and it was used to describe the northernmost promontory in Scotland, commonly identified as Dunnet Head in Caithness. It is supposed to derive its name from Cel. *φorko- ‘pig’, because speakers of Old Irish interpreted the name for the island later as Insi Orc ‘island of the pigs’. Therefore, Pritenic would have undergone the prototypical Common Celtic evolution of NWIE *p- → Ø- (see above).
This argument is flawed, in so far as the Celtic interpretation of the name may be a reinterpretation of the same kind as the later one by Norwegian settlers, who read the name according to Old Norse orkn ‘seal’, to identify it as ‘island of the seals’. In fact, texts published in the 19th and 20th century looked for an etymology even closer to the interpreter, who usually saw it as ‘island of the orcas’.
The region name orc- could be speculatively linked to NWIE *ork-i- ‘cut off, divide’, cf. Ita. *erk-i- (vowel analogically changed), Hitt. ārk- (<*hork-ei-), in Latin found with the meaning ‘divide (an inheritance)’, hence noun Lat. erctum ‘inheritance, inherited part’.
Maybe more interesting is a connection to *or-, as found in British rivers or streams Arrow, Oare Water (Som), Ayre, Armet Water, Arnot Burn, Ernan Water etc. for which cognates Skt. arvan(t)- ‘running, swift’, árṇa- ‘surging’, Gmc. *arnia- ‘lively, energetic’ have been proposed (Forster 1941; Nicolaisen 1976; Kitson 1996). Similar to these derivatives in -n-, -m-, one could argue for a denominative suffixation in *-ko-, not uncommon in Old European toponyms (Villar Liébana 2007), which could be interpreted originally as ‘(region) pertaining to the Or (river, stream)’. The a-vocalism of Old European does not need further explanation, being fairly common in the British Isles (Kitson 1996).
I tried to look for rivers and streams in Caithness that fit a potential border for an ancestral tribe, but after reading many (and I really mean too many) texts on Scotland’s hydronymy, which is a quite well-researched area, I didn’t like the idea of plunging into such a speculative task; not when I have this blog for that… I deleted the text from the book, seeing how it doesn’t really add anything of value and may have distracted from its real aim. If any reader wants to post potential candidates for this delimiting river ‘Or’ in Caithness, feel free to post that below.
Weak (if any) support of a non-Celtic nature of the names might also be found in the late description of Ptolemy’s Geographia (originally ca. 150 AD), Tauroedoúnou tēs kai Orkádos kaloumenēs, translated in Latin as Tarved(r)um, quod et Orcas promontorium dicitur. The original name seems to be formed from *tau-r-, as is common in Indo-European *taur-o- (compare also river Taum), whereas the commonly used Latin translation seems to rely on a Celtic *tarw-o-.
As with other Pictish material, these questions are unlikely to be settled without unequivocal sources pointing to the original names and their meaning. The autochthonous trend is set lately by Guto Rhys, whose work is thorough and methodologically sound, although his reviews tend to dismiss all evidence of a non-Celtic (or even non-Brittonic) layer in Pictland as described in previous works, mostly because of the lack of direct sources or uncontroverted data:
Where a supposed divergence is found in certain names, a lack of proper reading or interpretation of materials (or lack of enough cases to generalize them), combined with similar names in other (neighbouring or distant) Celtic languages, is adduced.
However, the same arguments can indeed be used to reject his proposal of a Celtic nature of many names which cannot be simply explained with other clearly Celtic examples: namely, that all similarities are due to later influences, re-analysis and modifications of Old European terms according to Celtic phonemic (or etymological) patterns, or that the Brittonic nature of many names is due to convergence of the attested Pritenic naming conventions with neighbouring dialects.
In the end, the only conclusion is that there is a clear impasse in hydrotoponymic research in the British Isles, particularly in Scotland, with an impossibility of describing non-Celtic or non-Indo-European Pre-Pritenic layers, due in great part – in my opinion – to the trend among many British Celticists to consider Celtic as autochthonous to the Atlantic. This hinders the proper investigation of the question, just like the trend among Basque studies to consider the western Pyrenees as the eternal Vasconic homeland hinders a fair investigation of the actual Vasconic proto-history.
The syntactic parallels between Insular Celtic and Afro-Asiatic languages (which used to be called Hamito-Semitic) were noted more than a century ago by Morris-Jones (1899), and subsequently discussed by a number of scholars. These parallels include the following.
The VSO order, attested both in OIr. and in Brythonic from the earliest documents (…).
The existence of special relative forms of the verb, (…).
The existence of prepositions inflected for person (or prepositional pronouns), (…).
Prepositional progressive verbal forms, (…).
The existence of the opposition between the “absolute” and “conjunct” verbal forms. (…)
The aforementioned features of Old Irish and Insular Celtic syntax (and a few others) are all found in Afro-Asiatic languages, often in several branches of that family, but usually in Berber and Ancient Egyptian (see e.g. Isaac 2001, 2007a).
Orin Gensler, in his unpublished dissertation (1993) applied refined statistical methods showing that the syntactic parallels between Insular Celtic and Afro-Asiatic cannot be attributed to chance. The crucial point is that these parallels include features that are otherwise rare cross-linguistically, but co-occur precisely in those two groups of languages. This more or less amounts to a proof that there was some connection between Insular Celtic and Afro-Asiatic at some stage in prehistory, but the exact nature of that connection is still open to speculation.
Insular Celtic also shares a number of areal isoglosses with languages of Western Africa, sometimes also with Basque, which shows that the Insular Celtic — Afroasiatic parallels should be viewed in light of the larger framework of prehistoric areal convergences in Western Europe and NW Africa.
The text goes on with typologically rare features found in West Europe and West Africa, such as the inter-dental fricative /þ/ (also in English, Icelandic, Castilian Spanish); initial consonant mutations/regular alterations of initial consonants caused by the grammatical category of the preceding word; the common order demonstrative-noun (within the NP) reversed; the vigesimal counting system; or use of demonstrative articles.
(…) only 38 words shared by Brythonic and Goidelic without any plausible IE etymology. These words belong to the semantic fields that are usually prone to borrowing, including words referring to animals (…), plants (…), and elements of the physical world (…). Note that cognates of these words may be unattested in Gaulish and Celtiberian because these languages are poorly attested, so that the actual number of exclusive loanwords from substratum language(s) in Insular Celtic is probably even lower. In my opinion it is not higher than 1% of the vocabulary. The large majority of substratum words in Irish and Welsh (and, generally, in Goidelic and Brythonic) is not shared by these two languages, which probably means that the sources were different substrates of, respectively, Ireland and Britain; (…)
The thesis that Insular Celtic languages were subject to strong influences from an unknown, presumably non-Indo-European substratum, hardly needs to be argued for. However, the available evidence is consistent with several different hypotheses regarding the areal and genetic affiliation of this substratum, or, more probably, substrata. The syntactic parallels between the Insular Celtic and Afro-Asiatic languages are probably not accidental, but they should not be taken to mean that the pre-Celtic substratum of Britain and Ireland belonged to the Afro-Asiatic stock. It is also possible that it was a language, or a group of languages (not necessarily related), that belonged to the same macro-area as the Afro-Asiatic languages of North Africa. The parallels between Insular Celtic, Basque, and the Atlantic languages of the Niger-Congo family, presented in the second part of this paper, are consistent with the hypothesis that there was a large linguistic macro-area, encompassing parts of NW Africa, as well as large parts of Western Europe, before the arrival of the speakers of Indo-European, including Celtic.
Even more interesting than the discussion of potential non-Indo-Europeans still lingering in Ireland until well into the Common Era, is the discussion in his paper Lost Languages in Northern Europe (2001). Apart from other non-Indo-European borrowings in northern Europe, most of which must clearly be included within the European agricultural substrate, Schrijver tries to interpret the relative chronology of a substratum language of northern Europe, described by Kuiper (1995) as A2, and by Schrijver as “language of geminates”.
This substrate language is heavily present in Germanic (see e.g. Boutkan 1998), but also in Celtic and Balto-Slavic:
A highly characteristic feature of words deriving from this language is the variation of the final root consonant, which may be single or double, voiced or voiceless, and prenasalized. (…)
Incidentally, the language of geminates cannot be Uralic, as another of its characteristics is the frequent occurrence of word-initial *kn- and *kl-, and Uralic languages do not allow consonant clusters at the beginning of the word. On the other hand, and at the risk of explaining obscura per obscuriora, one might consider the possibility that the consonant gradation of Lappish and Baltic Finnic is somehow connected with the alternation of consonants at the end of the first syllable in the “language of geminates”.
The idea that the Northern European language of geminates could play an intermediary role in loan contacts between Northern and Western Indo-European on the one hand and Finno-Ugric on the other may also account for the fact that Finno-Ugric words could end up as far away as Celtic, which as far as we know was never in direct contact with a branch of Uralic.
Schrijver later changed his view about certain aspects of this substrate, from a “language of geminates” influencing Balto-Finnic which in turn influenced Germanic, to Pre-Balto-Finnic speakers being the substrate of Germanic, and both evolving at the same time in contact in Scandinavia. In fact, we know that Pre-Proto-Germanic evolved in southern Scandinavia, with a core in Jutland that shifted to the south, so the location must have been close to the North European Plain.
Also fitting this model is the substrate behind Balto-Slavic (spoken in the West Baltic), which must have also been (Para-)Balto-Finnic. However, the frequent word-initial *kn- and *kl- and the loanwords appearing in the Celtic homeland (also including Early Balto-Finnic) must place this Uralic (± non-Indo-European) language contact also well into Central European Corded Ware groups.
The only archaeological culture that could fit most of these data, in the currently known relative chronological time frame, would be the Megalithic expansion in Western Europe, or potentially (maybe in addition to this early layer) the expansion of the Proto-Beaker package, which could have spread a Basque-Iberian language (see e.g. my take on Basque-Iberians).
Whether the language behind the Insular Celtic substrate (or, rather, some of its dialects) had true Afroasiatic syntactic features or it was just a language with features which happened to be similar to Afroasiatic is irrelevant. It’s impossible to reconstruct with confidence a Pre-Proto-Basque language with the currently available information.
NOTE. I will not resort here to typologically-based arguments similar to the “Hamito-Semit(id)ic” and “Vasconic-Uralic” Europe that were commonly in use in the 1990s, because they are in great part based on the mere re-labelling of Old European layers as “Vasconic” and flawed mass lexical/grammatical comparisons. For linguists favourable to this kind of reasoning, the theory set forth here is probably easier, though, as it will be for those supporting a Neolithic expansion of Indo-European from the Mediterranean. This, however, has its own set of problems, as I have already discussed.
Single Grave culture
The non-Indo-European substrate of Insular Celtic, in combination with the oldest hydrotoponymic layers – almost exclusively of Old European nature – of Britain and likely all of Ireland, can more easily be explained as a first layer of North-West Indo-European speakers heavily influenced by an Afroasiatic(-like) substrate reaching the British Isles, possibly with a slightly richer set of non-Indo-European loanwords at the time. Their language would have been later replaced by the closely related Celtic dialects imposed by elites in the Early Iron Age, which could have then easily absorbed this (mainly syntactic) substrate.
There is little space to argue for a hypothetical non-Indo-European expansion from another region, or for an in situ substrate, due to:
the presence of the same (mainly syntactic) substrate in both Goidelic and Brittonic; and
the minimal non-Indo-European lexical borrowings and hydrotoponymy, different in each island.
Based on archaeological and palaeogenomic data, the only reasonable direct connection of north-western Bell Beakers and this substrate language would be then the Corded Ware groups from north-western Europe – i.e. the traditionally named Single Grave culture from northern Germany and Denmark, and the Protruding Foot Beaker culture from the Netherlands.
The main reasons for this are as follows:
1. Early Corded Ware wave
The earliest Corded Ware burials from northern Europe (ca. 2900-2800 BC) show important differences, so no strict funerary norms existed at first (Furholt 2014):
In southern Sweden the prevailing orientation is north-east–south-west, and south–north; contrary to the supposed rule, male individuals are regularly deposited on their left and females on their right side.
In the Danish Isles and north-eastern Germany, the Final Neolithic / Single Grave Period is characterized by a majority of megalithic graves, with only some single graves from typical barrows.
In south Germany, west–east and collective burials prevail, while in Switzerland no graves are found.
In Kuyavia (north-central Poland), Hesse (Germany), or the Baltic, west–east orientation and gender differentiation cannot be proven statistically.
In genetics, the area that would become the ‘core Corded Ware province’ only after ca. 2700 BC also shows a surprising variability in the oldest samples in terms of haplogroups (which may indicate a recent departure of migrants from a mixed homeland); in terms of admixture, at least one sample clusters close to EEF groups, while later ones from Esperstedt – of hg. R1a-M417 (possibly xZ645) – show a likely admixture with Yamna vanguard groups expanding from the Carpathian Basin.
The Corded Ware culture in Denmark was particularly weak in its human impact compared to previous farmers (see e.g. Feeser et al. 2019), and also in its cultural traits, adopting Funnel Beaker culture traits up to a point where even the Copenhagen group describes cultural continuity, likely entailing an important substrate language impact (see e.g. Iversen and Kroonen 2017).
As it appears from the analysis above, the situation in East Denmark during the 3rd millennium BC is culturally rather complex. The continued use of megalithic entombments and the almost total rejection of the Single Grave burial custom show a strong affiliation with old Funnel Beaker traditions even after the end of the Funnel Beaker culture. (…) With an almost total lack of the two defining elements of the Single Grave culture – interments in single graves and the prominent position of stone battle axes – one can hardly talk about a Single Grave culture in East Denmark. What we see is rather the adoption of various Single Grave, Battle Axe and Pitted Ware cultural traits into a setting that was basically a continuation of Funnel Beaker norms and traditions (Iversen 2015).
The reason why East Denmark so conservatively upheld the Funnel Beaker traditions must be found in the area’s old position as a ‘megalithic heartland’, which reaches back to the early 4th millennium BC when dolmens and passage graves were constructed in very large numbers. (…) The result was a cultural blend governed by old Funnel Beaker norms and the use of Pitted Ware, Single Grave and Battle Axe material culture. This situation continued until the beginning of the Late Neolithic (ca. 2350 BC) when cultural and social development took a new course and flint daggers and metal objects appeared/ re-appeared in South Scandinavia.
The Corded Ware culture in the Netherlands is particularly disconnected culturally from its eastern core areas, which is reflected in the likely survival of a non-Indo-European language around the Low Countries, in the so-called Nordwestblock area. From Kroon et al. (2019):
The connections between changes in ceramic production techniques and social changes (see Fig. 2) allow for the formulation of hypotheses about the technological impact of the scenarios that archaeologists have proposed for the introduction of the CWC. If migration (i.e. an influx of new communities that bring new material culture) causes the spread of the CWC, then CWC vessels should differ from the vessels of previous communities in all respects: resilient, group-related, and salient techniques. However, if the introduction of the CWC is the result of diffusion of stylistic traits and moving objects, both these imported objects (different raw materials and production sequences) and changes in salient techniques should be observed when comparing CWC vessels to VLC vessels. Network interactions should yield the same changes as diffusion, as the combined movement of people, objects and styles within existing networks leads to the introduction of CWC. However, network interactions should yield one additional characteristic. Given that new people are integrated into extant communities, the occurrence of vessels with different resilient techniques, but group-related techniques that are stable relative to previous communities, is to be expected.
The over-arching transitional process in the Western coastal area of the Netherlands is local continuity with diffusion and network interaction traits. Interestingly, the supra-regional networks of the VLC communities in this region, as well as some of the defining technological practices within these networks, remain intact throughout the CWC transition.
In the absence of detailed genetic and isotopic data from Late Neolithic individuals from the western coastal areas of the Netherlands, direct conclusions on the relations between the migrations demonstrated by genetic analyses in other regions and the outcomes of this study remain speculative. However, if a similar shift in the late Neolithic gene pool from this area can be detected, this raises questions on the impact of such migrations on knowledge transmission and local traditions. If such a change cannot be attested, questions should be raised about the nature of the CWC in this particular area. Questions that will ultimately boil down to what we define as CWC.
In other words, the introduction of Corded Ware in the Netherlands, which we can assume was driven by migrations – evidenced by the arrival of “Steppe ancestry” (see below) – would need to be interpreted in light of the adoption of a different set of cultural traits in this region. Combining linguistic and archaeological data, there is strong evidence that the Corded Ware ideology and its internal coherence might have been broken in the westernmost territories, hence the likely survival of the local culture and language(s).
Further reasons for this independence from the Uralic homeland, supporting the advantages of a cultural and linguistic integration among regional groups, include:
This predominant non-Indo-European language would later be the substrate language of Bell Beakers from the Lower Rhine and the British Isles.
Culturally, the same process as in the previous Single Grave culture period may have happened in the Low Countries, due to the culturally favorable situation there. This might be inferred from the continuity of Protruding Foot Beaker into All-Over Ornamented Beaker, most likely an imitation of the expanding Proto-Beaker package by locals of the Single Grave culture.
Arguably, though, the same situation should have happened in all other Proto-Beaker regions favourable to cultural change and witnessing admixture with locals, such as Iberia, and the social relevance of this imitation is far from being accepted by almost anyone except for archaeologists working around the Rhine… From Heise (2014):
While in 1955 the Maritime Beaker was considered to be intrusive, the 1976 work seemed to prove that in the Netherlands a continuous development from Protruding Foot Beaker (PFB) to All-Over Ornamented (AOO) Beaker to Maritime Beaker occurred. Nevertheless, the authors stressed that it was not possible to identify ‘the’ origin of the ‘Bell Beaker Culture’ in the Lower Rhine Area since typical artefacts (wristguards, daggers) were not known to be associated with the early AOO and Maritime pottery. Furthermore they argued against the “misleading simplification” of a single point of origin (Lanting & van der Waals 1976, 2). However, this last observation was not appreciated or was simply ignored by large parts of the research community and the theory was subsequently applied as a universal solution in many parts of Europe.
In fact, most archaeologists have unequivocally rejected a Single Grave – Classical Bell Beaker continuity, and Heyd’s model has been recently confirmed in paleogenomics, which shows an evident expansion of East Bell Beakers from Yamna settlers in the Carpathian Basin (see here). We may nevertheless still save the following assertion, as particularly relevant for the continuity of non-Indo-European languages among the Single Grave groups of the Lower Rhine:
Marc Vander Linden argued that the “local validity of the Dutch sequence cannot […] be questioned” (2012, 76).
Olalde et al. (2019) showed how British, Dutch, and French Beakers have excess “Steppe ancestry” relative to Central European Beakers from Germany, who are in turn closest to the origin of Old Europeans in Iberia (i.e. Galaico-Lusitanian, “Ligurian”), the Lower Danube (i.e. Celtic), Italy (i.e. Italic, Venetic, Messapic), Sicily, and even Denmark (i.e. Germanic). This excess “Steppe ancestry” probably implies admixture with local Single Grave populations of the Lower Rhine, which is further supported by the position of these Lower Rhine Beakers in the PCA (using British Beakers and Netherlands BA as proxies), clustering – among Bell Beakers – closest to Corded Ware samples.
Furthermore, the emergence of Bell Beakers in the British Isles represents a radical replacement, with a population turnover of ca. 90% of the local population, and Yamna lineages representing more than 90% of the haplogroups of individuals in Chalcolithic and Bronze Age Britain and Ireland, apart from an evident Y-chromosome bottleneck under hg. R1b-S461 (and its subclade R1b-L21), maintained during the whole Bronze Age. The scarce non-Indo-European hydrotoponymy attests to the lack of integration of local populations or their languages into the new society. All this suggests an initial swift and massive intrusion marking the linguistic evolution of the British Isles until the Iron Age.
The arrival of Insular Celtic in the British Isles will likely be defined by an increase in ancestry related to Central Europe (and probably haplogroups, too). Since the Afroasiatic-like substrate is unrelated to Common Celtic, the non-Indo-European substrate must be associated with preceding Bronze Age populations of western Europe, most likely with Bronze Age Britons, who are in turn derived from Bell Beakers from the Lower Rhine admixed with Single Grave peoples. The latter, therefore, must have passed on their Afroasiatic-like language as the substrate of Lower Rhine Beakers.
5. Vasconic from the north
Another indirect proof of the survival of non-Indo-Europeans in northern Europe is offered by Basques. Vasconic speakers came originally from some place beyond Aquitaine, and very recently before the Roman conquests, because place- and river-names show an overwhelming Old European substratum to the north of the Pyrenees, and exclusively Old European to the south.
Their origin is potentially quite far away, since Modern Basques show a similar cluster to that found in Iron Age Celtiberians of the Basque country. This could essentially mean that Basques were peoples of north/central European ancestry (see below fitting models of origin populations), because they must have arrived in Aquitaine after the arrival of Celtiberians, and with a similar ancestry.
(…) increases in Steppe ancestry were not always accompanied by switches to Indo-European languages. This is consistent with the genetic profile of present-day Basques who speak the only non-Indo-European language in Western Europe but overlap genetically with Iron Age populations showing substantial levels of Steppe ancestry.
The Tollense Valley near Rügen in the West Baltic shows LBA people clustering with Modern Basques (see here). This is compatible with the arrival (or displacement) of Vasconic-speaking Northern/Central Europeans close to the Rhine, possibly originally from northern France, very likely close to the Atlantic area during the Final Bronze Age / Early Iron Age based on cultural interactions.
Pre-Steppe languages in Europe?
An alternative to Old Europeans of the British Isles would be to support some kind of non-Indo-European/Vasconic continuity in the Atlantic façade close to the English Channel and the North Sea, given the current lack of palaeogenomic data on Bell Beakers and later groups in the area, and the potential Vasconic nature of Megalithic/Proto-Beaker groups that might have survived there.
The main problems with this approach are the lack of such an Afroasiatic-like substrate in Gaulish, which should have shown the same substrate as Insular Celtic, and the impossibility of associating this Afroasiatic-like substrate with Vasconic, both potentially representing completely different languages. A counterargument would be that we don’t have that much information on Gaulish and its dialects – or on the syntax of Vasconic, for that matter – to reject this hypothesis straight away…
In any case, the survival of pockets of non-Indo-European, non-Uralic speakers in northern Europe, even after Steppe-related expansions, should not shock anyone:
If the survival of non-Indo-European-speaking groups happened despite the swift expansion and radical population replacement brought about by the Bell Beaker folk – so called traditionally because of its unitary culture suggesting a unitary language community –, and non-Uralic-speaking groups in areas dominated by Corded Ware peoples, it could certainly have happened, and even more so, with Corded Ware and Bell Beaker groups at the western and northern edges of their expansions, due to the early loss of contact with their respective core cultural regions.
Even obscure components of place or river names, like those from northern Europe, the Nordwestblock area, and the British Isles, might be better explained as Old European exceptions than any other alternative, i.e. either as an Indo-European layer over a non-Indo-European one or vice versa, or both in different periods, before the eventual unifying Celtic, Roman, and (later) Germanic expansions.
All in all, one could say about substrates and hydrotoponymy in the British Isles, the Lower Rhine, and in northern Europe as a whole, that the potentially interesting non-Indo-European forms are precisely those which do not interest either scholarly ‘faction’:
those supporting a non-Indo-European Western Europe, because it doesn’t represent the whole substrate, and can’t be used to argue for a Europa Vasconica or Europa Afroasiatica;
those supporting a Palaeo-Indo-European Western Europe, because their limited presence concentrated in isolated pockets doesn’t deny the Indo-Europeanness of the Old European layer anywhere.
However, these are the details that should be studied and that could define what happened exactly after steppe-related migrations, e.g. in the Single Grave cultural area before and after North-West Indo-Europeans admixed with its population, and thus what happened in the British Isles, too.
Ignoring the (mostly useless) typological comparisons, my bet would be for an ancient Uralic layer heavily admixed with local non-Uralic peoples, especially intense in the Single Grave culture. This Proto-Uralic layer would be of a dialect or dialects (assuming succeeding CWC waves and later local expansions) different from the known Late Proto-Uralic – which expanded with eastern Corded Ware groups.
Describing the phonetic features of this layer could improve our knowledge of Early Proto-Uralic, as well as some specifics of the evolution of Germanic, Balto-Slavic, and potentially Celtic and Balto-Finnic.
This would be similar to the relevance of Aquitanian toponyms for Proto-Basque reconstruction, or of the alteuropäische substratum when it conflicts with the Proto-Indo-European dialectal reconstruction of some linguists (e.g. the laryngeal Pre-Indo-Slavonic of Kortlandt), a conflict which, as Kitson implies, should call into question the dialectal reconstruction of this minority of Indo-Europeanists, and not the Indo-European nature of the substratum.
The first layer in hydrotoponymy of Iberia is clearly Indo-European, in territories that were occupied by Indo-Europeans when Romans arrived, but also in most of those occupied by non-Indo-Europeans.
Among Indo-European peoples, the traditional paradigm – carried around in Wikipedia-like texts to this day – has been to classify their languages as “Pre-Celtic” despite the non-Celtic phonetics (especially the initial p-), because the same toponyms appear in areas occupied by Celts (e.g. Parisii, Pictones, Pelendones, Palantia); or – even worse – just as “Celtic”, because of the famous -briga and related components. This was evidently not tenable at the end of the 20th century, and it is simply anachronistic today.
While the non-Celtic Indo-European nature of Lusitanian is certain, the nature of the “Pre-Celtic” language spoken by peoples such as Cantabri, Astures, Pelendones, Carpetani and Vettones is still being discussed, due to the scarcity of material to work with.
It is certain that the delimitation of the geographical area set by Tovar is still valid, basically determined by the known direct documents, that is, the traditionally accepted inscriptions (the classic ones of Lamas de Moledo, Arroyo de la Luz and Cabeço das Fráguas, in addition to the new ones from Arroyo and the recent one from Arronches; see Fig. 1), to which some others could be added: the new bilingual inscription from Viseu necessarily compels us to consider it as indigenous, because it contains terms that belong to the core of the language and not only onomastics (I refer to the connective igo and the appellatives deibabor and deibobor). By virtue of this new incorporation, we can also consider other texts as indigenous, although they do not include a common lexicon (see Fig. 1, inscriptions 7 to 22), on the assumption that many Lusitanian scribes were consciously mixing two linguistic registers (code switching), one to refer to the deities (for which they frequently used indigenous inflection) and another for anthroponyms (always with Latin inflection).
Firstly, it is striking that this geographical profile drawn by the texts corresponds almost exactly to the distribution of large series of anthroponyms and theonyms.* Among the abundant names of people we can highlight those with a large number of repetitions whose appearance is circumscribed to our region of study (see Fig. 2). Some of them are truly frequent and lack parallels on the outside, such as the stem Tanc- / Tang- (of Tanginus) with no less than 130 attestations, or Tonc- / Tong- (of Tongius or Tongetamus) with 70. Others also show sufficiently representative figures, such as Camalus and Maelo (with 46 repetitions each), Celtius (with 29), Caturo or Sunua (with 23), Camira (with 22), Doquirus (with 20), Louesius (with 18), Al(l)ucquius (with 17) or Malge(i)nus (with 16). According to these quantities, it appears that these are not casual occurrences of names, taking into account that chance tends to be reduced to a minimum in the study of the Iberian Peninsula, since we can easily handle the entire peninsular corpus. In turn, Reue, Bandue, Nauiae and Crougiae are the theonyms that best represent the Lusitanian-Galician area, coinciding fundamentally (Figure 3) with the picture that anthroponymy and texts had drawn, although with fewer examples.
* The other subdivision of the onomastics, toponymy, presents difficulty in the elaboration of series, by the few repetitions of segments, once the universal element -briga has been eliminated.
It is not only these groups of names and roots that help us define a large northwestern area, but, as I have had occasion to mention in other places, some onomastic data that share a similar distribution can also be added: the desinence -oi (with an assimilation in -oe / -ui) of the theonymic dative singular, the ending -bo of the dative plural, the presence of the noun-forming suffix -aiko-, in addition to other phonetic features such as the shift e > ei in anthroponymy, the reduction ug > uo, and the change w > b.
(…) First of all, it seems that there is an independent onomastic area, which can be defined by a series of names and suffixes that are repeated there exclusively or predominantly. This area does not seem to correspond with what we know of the Lusitanian-Galician onomastics nor of the more coastal Asturian; it also differs from the Celtiberian area, with which it does not have features in common. In this way, and always in the conjectural terrain, we could find ourselves before an Indo-European non-Celtic language different from the Lusitanian language.
A peculiarity that will have to be investigated is the presence of an excessively wide border corridor, where the names of the southern Astures (Augustales) do not predominate, but neither do those of the northern Astures (Transmontanos). Similarly, we will have to see the scope of the hypothesis that there might have been a language perhaps differentiated from that spoken in the Lusitanian, Galician or Celtiberian zones; the lower documentary richness of the Asturian zone of Transmontana makes it more difficult to guarantee that it is not the same linguistic area as the one we isolate among Asturian cities.
In any case, de Hoz, even taking into account the difficulty of an affirmation of this type, pointed out ambiguously that we could find ourselves in front of different languages. On the other hand, the absence of texts directly transmitted by this people leaves us without a definitive confirmation of the argument that it is a linguistically differentiated region, but it does not invalidate it at all. These drawbacks require the suspension of the exact characterization of our area, awaiting advances in the field of epigraphy and methodology.
The information provided by place-names and hydronyms on the one hand and anthroponyms on the other is of undoubted historical value in both cases, but of different specific significance. Anthroponyms reflect the present situation at the moment when living people were using them. It is an aspect very sensitive to social changes of all kinds, reaching its highest level of instability when there is language change.
(…) the Pre-Roman anthroponymic inventory of the Basque Country and Navarre indicates that prior to the arrival of Romans the language spoken was Indo-European (reflected in the names used) in the territories of Caristii, Varduli and Autrigones, while in Vasconic territory (especially in the current Navarre) most of the speakers chose Iberian names. In the territories of the current Basque Country, only a negligible statistical proportion chose Basque names, whereas in Navarre it was a minority of the population. That’s how things were towards the 3rd century BC.
Cities and rivers are not subject to the ephemeral life cycle of humans. Rivers have very long cycles that go far beyond the life time not only of individuals, but also of languages and cultures. Cities are also generally very stable, although social circumstances occasionally cause one to be abandoned or destroyed, while new ones are created from time to time. That means that the names of rivers and cities are not subject to fashions or frequent change. Nor does a language change imply a renewal of the previous hydronymy and toponymy.
Speakers of the new languages incorporated into a territory learn from the natives the hydronymic and toponymic system, producing what we call the “toponymic transmission”. (…) it requires a prolonged contact between the native population and the new occupants, which can only occur when the indigenous population is not annihilated quickly and radically.
The ancient onomastic data of the Basque Country and Navarre can be summarized as follows:
Ancient hydronymy, the longest lasting onomastic component, is not Basque, but Indo-European in its entirety.
The old toponymy, which follows it in durability, is also Indo-European in its entirety, except Pompaelo (now Pamplona) and Oiarso (now Oyarzun).
And anthroponymy, which reflects the language used at the time when those names were in use, is also massively Indo-European, although between 10 and 15% of anthroponyms are of Vasconic etymology.
(…) the existing data show that, while in Roman times in Hispania there were only a couple of place-names in the Pyrenean border and a dozen anthroponyms of Vasconic etymology, in Aquitaine there was an abundant anthroponymy of that etymology.
This set of facts is most compatible with a hypothesis postulating a late infiltration of this type of population from Aquitaine, which at the time of the Roman conquest had only managed to establish a bridgehead, consisting of a small population center in Navarre and Alto Aragón and nothing else, except some isolated individuals in the current provinces of Álava, Vizcaya and Guipúzcoa. The almost complete absence of old place-names of Vasconic etymology would be explained in this way: Vasconic speakers, recently arrived and still in small numbers, would not have had the possibility of altering in depth the toponymic heritage prior to their arrival, which was Indo-European.
The idea of a late Vasconization of a part of those territories, in the Early Middle Ages or late Antiquity, is not new. Already in the 1920s M. Gómez Moreno said about the modern Basque provinces, with the district of Estella in Navarra, that “personal nomenclature allows comparisons of definitive value, probative that there lived people of the Cantabrian-Asturian race [who for Gómez Moreno were Indo-European], without the slightest trace of perceptible Basqueness”. For him, the first Indo-European people to penetrate the peninsula would have been Ligurian, which evolved into Cantabrians, Asturians, Venetians, Lusitanians, Tormogi, Vacaeans, Autrigones, Caristii and Varduli.
If, as we said above, Basque speakers began to enter the Iberian Peninsula from the other side of the Pyrenees only from the Roman-Republican era, to intensify their presence in the following centuries, we must assume that they were to the north of the Pyrenees already before those dates. And, indeed, the existence of this abundant Vasconic anthroponymy shows that in the first centuries of our era, while Vasconic speakers in the Peninsula were very few in number, their population in Aquitaine was abundant.
In a provisional manner we can advance that [Aquitaine’s] hydronyms are also known in other places of Europe and easily compatible with Indo-European etymologies (Argantia, Aturis, Tarnes, Sigmanos); and among the place names there are also many that are compatible with non-Gallic Indo-European etymologies, or not necessarily Gallic (Curianum, Aquitania, Burdigala, Cadurci, Auscii, Eluii, Rutani, Cala- (gorris), Latusates, Cossion, Sicor, Oscidates, Vesuna, etc.).
In addition to those place names that we classify as generically Indo-European, there are not a few Celtic ones (Lugdunum, Mediolanum, Noviomagos, Segodunon, Bituriges, Petrucorii, Pinpedunni), several Latin ones (Aquae Augustae, Convenae, ad Sextum, Augusta), and even some Celto-Latin hybrids (Augustonemeton, Augustoriton). On the other hand, there are hardly any names, whether serial or not, that have a reasonable possibility of being explained by Vasconic etymology (Anderedon could be one of them).
Consequently, the onomastic question of Aquitaine is not compatible with the possibility that Vasconic is the “primordial element” there, either. On the contrary, it is compatible with the hypothesis that they also arrived late in Aquitaine, when hydro-toponymy was already established. They had to Vasconize all or part of the previous population, which came to use Vasconic anthroponymy to a large extent. But the previous toponymy remained and the Vasconization process was probably soon interrupted by Celticization first, and Romanization later.
A prediction in genetics
This is how Francisco Villar and co-authors from the University of Salamanca saw what would happen with the genetic studies of modern Basques in 2007, based on the similarity with neighbouring Iberians and French, and the late intrusion of the language in its current territory:
Unfortunately, linguistics does not have the means to establish the moment of that arrival in terms of absolute chronology. In any case, this hypothesis is not incompatible with some peculiarities in the frequency of certain genes of the Basque-speaking population. Indeed, today we tend to attribute these peculiarities to the joint action of genetic drift and isolation; to which perhaps we could add a bottleneck in the Vasconic founding population that would one day settle in Aquitaine.
Also Villar, in 2014:
In the hypothesis that I propose, future speakers of Basque would have settled initially in Aquitaine, where there would have been an inevitable genetic diffusion with pre-existing [first stage] populations. On the other hand, Basque speakers from Aquitaine would have started to arrive to the Basque Country and Navarre only from Roman times (only a couple of Vasconic toponyms, at least one of them of recent creation; scarce anthroponyms of Vasconic etymology). The part of those populations that mixed with the pre-existing Palaeo-Indo-Europeans (Indo-European names of rivers; general Indo-European toponymy) saw how the uniqueness of their haplogroups, if there was any, was diluted, making it difficult to distinguish from the general [Indo-European] background; being a minority, it could have been even lost as a result of adverse genetic drift.
Olalde et al. (2019) confirmed this hypothesis that modern Basques are quite similar to investigated Iron Age Indo-Europeans from Iberia (such as Celtiberians sampled from the Basque Country):
For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age. The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken. This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition.
Modern Basques show therefore, paradoxically, an ancestry similar to recent Iron Age Indo-European invaders (quite likely the ancestors of Celtiberians), which confirms the hypothesis of bottlenecks/founder effects followed by a very recent isolation of its population:
(…) the genetic profile of present-day Basques who speak the only non-Indo-European language in Western Europe overlap genetically with Iron Age populations showing substantial levels of Steppe ancestry.
Regarding the Iberian language, the circumstances of analysis are less favorable. However, we can observe in the ancient toponymy of typically Iberian areas (the Spanish Levant and Catalonia) a considerable proportion of toponymy of Indo-European etymology, often identical to that which F. Villar (2000) has called “Southern-Iberian-Pyrenean”. In fact, its presence in the Levant is nothing else but a continuation from Catalonia to the South along the Mediterranean coast. Here are some examples: Caluba, Sorobis, Uduba, Lesuros, Urce / Urci, Turbula, Arsi / Arse, Asterum, Cartalias, Castellona, Lassira, Lucentum, Saguntum, Trete, Calpe, Lacetani, Onusa, Palantia, Saetabis, Saetabicula, Sarna, Segestica, Sicana, Turia, Turicae, Turis.
Compatible with the Indo-European etymology can also be Blanda, Sebelacum, Sucro, Tader, Sigarra, Mastia, Contestania, Liria, Lauro, Indibilis, Herna, Edeta, Dertosa, Cesetania, Cossetani, Celeret, Bernaba, Biscargis, (…)
Finally, in other place names there are Indo-European components in hybrid toponymic syntagms, such as:
Examples like these show that in Catalonia and the Spanish Levant the Iberian language is not the deepest identifiable substrate language, but that it took root there when there was previously an Indo-European language that had created a considerable network of toponyms and hydronyms that we can recognize, and over which Iberians settled as a superstrate. The pre-existence of an Indo-European language in the historically Iberian area is further corroborated by the fact that its ancient hydronyms are all Indo-European, with the exception of a single river that has a name that is supposed to be Iberian: the Iberus (Ebro), from which obviously the country and its inhabitants took their name. No doubt ib- was an appellation for river, so that in the language that created that hydronym the Iber would simply have been “the river”. But we will see in the body of this work that ib- is found in various places outside the Iberian Peninsula as an appellation for «river», which will force us to rethink its supposed Iberian affiliation. In fact, the Iberus had another name, Elaisos, whose etymology is compatible with Indo-European. As we know with certainty that after Iberians no other Indo-European peoples came to their territory before the Romans, the Indo-European creators of that hydronymy must have been there before the Iberians. And its antiquity must be considerable because, as we have already said, the vast majority of its hydronyms (Alebus, Caluba, Lesuros, Palantia, Saetabis, Sigarra, Sucro, Tader, Turia and Uduba, Elaisos) belong to that anonymous Indo-European language that left no written texts and had no historical continuity.
A language that settles in a territory is not always able to eradicate the existing ones definitively. Even a political system as unitary and unifying as the Roman was not able to eradicate the Basque language. And nowadays in Latin America, despite the crushing cultural dominance of Spanish, despite the means for the schooling of a modern society, in spite of the media, a multitude of pre-Columbian languages are spoken that coexist with the language of culture, the only one that is written in those countries. In those situations, which can be prolonged for quite a long time, there are individuals who only speak the newly imposed language, others who speak only the language that has resisted disappearing, and others who speak both, in a broad framework of bilingualism. My proposal is that something similar to that must have happened in the Iberian territory when the Romans arrived: A language of culture, Iberian, diversified into more or less distant local dialects, coexisted with several previous languages, equally differentiated from the dialectal point of view. This explains the irruption in the Iberian texts of non-Iberian anthroponyms and, above all, the existence there of a Palaeo-Indo-European hydro-toponymy that had remained in use not only because it was transmitted to Iberian speakers, but also because its native users were still present.
NOTE. Both books also contain detailed information on hydrotoponymy of other regions, like Northern Europe, the Aegean and the Middle East, with some information about Asia, apart from (outdated) genetic data, but their main aim is obviously the Prehistory of Iberia and neighbouring regions like France, Italy, or Northern Africa.
Here are only some excerpts (emphasis mine), translated from Spanish (see the original texts here), accompanied by images from both books.
Alteuropäisch and Krahe
The investigation of “Old European” or Alteuropäisch, popularized by Krahe, began precisely with the study of some toponyms and personal names spread all over Europe, previously considered “Ligurian” (by H. d’Arbois de Jubainville and C. Jullian) or “Illyrian” (by J. Pokorny), with which those linguistic groups – in turn badly known – were given an excessive extension, based only on some lexical coincidences.
This is a comment made by the author about Krahe’s data and his opinions, frequently used against his compiled data, which I find paradoxically applicable to Villar’s data and his tentative assignment of the relative linguistic chronology to an absolute one – including the expansion of a “Mesolithic” Indo-European vs. a “Neolithic” Basque / Iberian vs. a Bronze Age Celtic – when it is now clear that the sequence of events was much later than that:
A derogatory and globally disqualifying attitude to everything that sounds like Alteuropäisch and Krahe is very widespread today, sometimes without the necessary discrimination between different hypotheses, or even between data and hypothesis. It is not fair that the version of H. Krahe and that of W. P. Schmid be disqualified in a single simplistic judgment as if they were the same thing. But it is a major mistake to reduce the value of the hydro-toponymic data of Europe by the mere fact that Krahe attributed an implausible historical explanation to them. The data are real and still need an adequate explanation within a real historical framework, despite the unfeasibility of Krahe’s explanation.
With that we reach a point that I want to highlight. Among those who are allergic to anything that involves deviating one iota from the Indo-European paradigm as a single event, an attitude gaining momentum considers that hydro-toponymy was introduced in the different regions of Europe and Southeast Asia by the same Indo-European languages that appear historically occupying their territory. H. Krahe had argued strongly against this possibility, so now I will save myself a deeper refutation and I will limit myself to pointing out some difficulties that position is forced to face.
The defenders of that alternative have to assume that the process of dialectalization, which before the migrations from the Urheimat was splitting the language into the different Indo-European branches, affected the phonetics of the general vocabulary of each branch, but left hydro-toponymy in its predialectal phonetic state, together with a good part of the lexicon related to the concepts of “river, water” and the different qualities of water currents. For example, according to those sharing that opinion, the Hispanic Palantia of the area of Vaccei would be in fact Celtic, but in that name the loss of the initial /p/ that characterizes Celtic would not have been applicable. Similarly, the hydro-toponymy of Germania is largely exempt from the Lautverschiebung, that of Greece from the loss of initial /s/, etc. These names not only fail to undergo the dialectal innovations corresponding to their zones, but sometimes they present innovations different from the features of the dialect involved. For example the word *mori “sea, standing water” is sometimes found in the hydro-toponymy of Gaul in the form *mari instead of the *mori proper to Celtic (Marantium, Marisanga, Marsus), which in the framework of the paradigm has to be inevitably interpreted as a non-Celtic innovation.
Names of this nature that appear in areas where a pre-Roman historical Indo-European language never existed remain unexplained, such as in North Africa, Arabia Felix or the Caucasus: Lake Pallantias in Libya; the Salat River in Mauretania Tingitana; Auso in Mauretania Caesariensis; the Alonta River in Georgia; the Abas River in Caucasian Albania; Salma and Salapeni in Arabia Felix; etc. Of course, for these cases it is always possible to deny any relationship of kinship between these forms and their European cognates, and attribute everything to the chance of random homophonies. Thus, once again, the annoying comparative data are sacrificed on the sacred altar of the paradigm, despite the fact that they are so numerous and consistent that if there were no blind faith in the current dogma, they would be sufficient to articulate a new paradigm over them.
The choice of each Indo-Europeanist between the non-Indo-European and the Indo-European interpretation to explain the prehistoric toponymy of Europe is not motivated by the fact that they manage partial sets of hydronyms that are more propitious alternatively for the one or the other option. On the contrary, frequently the same batch of materials is claimed by both trends as its own. An extreme example is that of Th. Vennemann, who considers simply as non-Indo-European (specifically Paleo-Basque) exactly the same material that H. Krahe used to support his Indo-European interpretation. Thus, the structure and linguistic characteristics of the studied material have little role in the choice of one or the other path, which is rather conditioned by convictions and adhesion to a varied range of personal beliefs, traditional dogmas and scientific paradigms.
The linguistic column
The sequence of languages that were successively spoken in any territory constitutes what by analogy [with the “geological column”] we could call its “ethno-linguistic column”.
Next I offer the list of the languages detected in the compositional (and to a lesser extent derivational) toponymic syntagms in which the appellatives ub-, up-, ab-, ap-, ur-, il-, igi, tuk, -ip – analyzed in this work – are involved.
From the interaction of the different strata in words and hybrid syntagms we can, therefore, establish the linguistic column in the Iberian Peninsula and its neighboring territories (Western Europe and Northern Africa) with the following sequence:
1. A first stratum of very old chronology, which in a previous publication I have proposed to call Palaeo-Indo-European [“arqueo-indoeuropeo”]. The toponymic elements belonging to this stratum dealt with throughout this text are abundant: kerso-, turso-, alawo-, lako-, mido-, silo-, tibo-, etc.
They always function as determinant toponyms of an appellative in any other language. This stratum never supplies the appellative “city” (or “river”) in hybrid syntagms. Its place names (determinants) are combined with appellatives of the following languages:
a) Iberian in Iberia or Southern France: kiŕś-iltiŕ, tuŕś-iltiŕ, alaun-iltiŕte, lakunm ∙ -iltiŕte.
b) The language of the igi in southern Iberia and perhaps Northern Africa: Cantigi, Saltigi, Sagigi, Sicingi.
c) The southern language of the postponed -il: Mid-ili, Sil-ili, Tib-ili.
d) The language of the postponed -ip: Lac-ipo, Ost-ipo, Vent-ipo.
This first Palaeo-Indo-European layer also corresponds to:
Several Palaeo-Indo-European varieties that have ab-, ap-, ub-, up- as an appellative for «river». To them belong also numerous place names (balsa-, siko-, wol-, etc.) that act as first members of compounds in both monoglot and hybrid syntagms.
Palaeo-Indo-European varieties in which ur- is the appellative for “river”.
2. The second stratum in decreasing order of antiquity is formed by the language of the place name igi “city”, although its presence is only verified with certainty in Iberia (especially in the south) and Northern Africa:
a) It sets the igi name in compounds with Palaeo-Indo-European toponyms as in Salt-, Ast-, Olont-, Cant-, Aur- (Hispania) and Sagigi, Sicingi (Northern Africa).
b) It works as the first place-name of the compound when the second is il: Igilium, Igilgili, Singili.
3. The third stratum is the language of the name il “city”:
a) It sets the appellative il as the determined element in hybrid syntagms with Palaeo-Indo-European determinants: Mid-ili, Sil-ili, Tib-ili.
b) It sets the appellative il as the determined element in hybrid syntagms with igi toponyms as determinants: Igilium, Igilgili, Singili.
c) It puts the place names (determinants) in front of the name (determined) of the language -ip (Il-ipa, Il-ipula and Il-ipla).
4. Fourth is the language of the name ip- “city”, which puts the name (determined) in syntagms with:
a) Palaeo-Indo-European toponym (determinant): Lac-ipo, Ost-ipo, Vent-ipo.
b) Toponym (determinant) il: Ilipa.
c) Second generation hybrid toponym of Palaeo-Indo-European + il: Balsilippa.
d) In the Balsilippa and Sicilippa conglomerates, the three strata appear in the expected sequence: Palaeo-Indo-European + il + ip.
5. In the fifth place of the sequence is the language of the tuk-:
a) It puts the name tuk- in compounds in which the place-name is a Palaeo-Indo-European element: Acatucci (see Aduatuci in Germania).
b) It puts the name tuk- “height, top” in compounds in which the place-name is an ip- fossilized as place-names: Iptuci, etc.
c) On at least one occasion an ip-fossilized syntagm acts as a toponym opposite a Celtic name: Itucodon (<Iptuco-dunum).
NOTE. Even though Villar talks about this stratum -tuk in Germania (Aduatukus) and the British Isles (Itucodon), only one case is found in each territory.
6. The last place is occupied by Celtic:
a) In Itucodon it sets the appellative (dunum) against a complex toponym of two previous strata, ip- + tuk-; and in Ibliodurus it sets the appellative duro- against an equally complex toponym (<Ibili + duro).
b) In bilbiliz it puts the casual morpheme in a fossilized bi-member toponym of a previous stratum, one of whose components is il-: Bilbil-iz.
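NOTE. The reasoning behind this column can be read as an ordering problem: each hybrid syntagm says that the stratum of its determinant is older than the stratum of its determined appellative, and the column is any sequence that respects all those statements. The following is only a minimal sketch under that assumption. The list of pairs is my own illustrative encoding of the examples quoted above, and the function name linguistic_column is hypothetical; it is not Villar's procedure. It needs Python 3.9 or later for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# (older stratum = determinant, younger stratum = determined, example)
# Illustrative encoding of the hybrid syntagms listed in the text.
HYBRIDS = [
    ("Palaeo-Indo-European", "igi", "Cant-igi"),
    ("Palaeo-Indo-European", "il", "Mid-ili"),
    ("Palaeo-Indo-European", "ip", "Lac-ipo"),
    ("Palaeo-Indo-European", "tuk", "Acatucci"),
    ("igi", "il", "Igilium"),
    ("il", "ip", "Il-ipa"),
    ("ip", "tuk", "Iptuci"),
    ("tuk", "Celtic", "Itucodon"),
]


def linguistic_column(hybrids):
    """Order strata from oldest to youngest using 'older before younger' pairs."""
    predecessors = {}
    for older, younger, _example in hybrids:
        # Each younger stratum records which strata must precede it.
        predecessors.setdefault(younger, set()).add(older)
        predecessors.setdefault(older, set())
    # static_order() raises graphlib.CycleError if the pairs contradict each other.
    return list(TopologicalSorter(predecessors).static_order())


column = linguistic_column(HYBRIDS)
for position, stratum in enumerate(column, start=1):
    print(position, stratum)
```

With these pairs the order is fully determined, because every stratum is linked to the next by at least one hybrid. If two strata never co-occurred in a compound, the sort would still return an order, but their relative position would be arbitrary, which is why the igi stratum can only be placed with certainty where it actually appears.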
A hard change of paradigm
It cost me more effort to accept that ub- is a dialectal variant of a known Indo-European word for “water, river”, of which three others were previously known: ap-, ab-, up-. The obviousness of the phonetic correlation ap- / ab- // up- / ub- together with the semantic link with rivers, which can be verified above all outside of Spain, but is also present in our Peninsula, wore down my resistance little by little. And with it fell the first trench of the dogma, unshakable until that moment, that everything in the south of the Peninsula had to be non-Indo-European.
Along with this serial component, many other isolated place names were revealed as very likely of Indo-European etymology, both in the “Iberian” East and in the “Tartessian” South. So the ubiquity of Indo-European throughout the Peninsula began to impose itself on me painfully. I say painfully because I lacked a paradigm in which to fit the new perspective that was making its way into my mind, which was therefore suspended in nothing, without any theoretical support, leaving me with a feeling that I was losing my footing. And for a time I was reluctant to accept the profound implications that all of this entailed.
All il languages, in any of their locations, exhibit a compositional behavior in hybrid toponymic syntagms that place them all in an intermediate position between the clearly [first/second layer] strata, with place-names for their human settlements semantically derived from water realities (ur), and those clearly attributable to the [fifth layer] with appellations derived from settlements in heights (briga, dunum). But in that intermediate segment of the column there are three strata: 1) il, 2) ip-, 3) tuk-. In Andalusia there is an additional one: the igi stratum, of opaque semantics, which immediately precedes the il stratum.
To postulate that any of the toponymic strata of our column implies a new linguistic stratum, certain additional requirements will be necessary. One of them is that, in addition to the appellative in question, the languages involved should share other features that could not have been borrowed, such as the very precise order of elements in the compounds Toponym + Name coexisting with Name + Adjective. Or the sharing of additional lexical elements that are not usually subject to borrowing, such as the semantically basic adjectives beri «new» and bels «black».
Unfortunately, the toponymic method, like the Comparative Method itself, does not have the capacity to establish precise absolute chronologies. (…)
In Europe (Hispania, South of France, Germania, British Isles, Baltic) the oldest stratum that can be identified is an indeterminable number of palaeo-varieties of the Indo-European macro-family, which do not have a direct local relationship with historical Indo-European languages, to the extent that we can verify. In fact, we have seen that stratigraphic signs lead us to consider the main Indo-European pre-Roman language of Hispania, the Celtic language, as a stratum after the il language, which in turn is later than the peninsular Indo-European palaeo-varieties.
In North Africa there is also a Palaeo-Indo-European stratum present. But there is also a very old non-Indo-European stratum whose identity I cannot define through the material used. Nor has it been possible for me to establish the relative antiquity of one and the other on African soil.
Another of the languages involved, which has il- as an appellation for “city” in the Southwest of Hispania and North Africa, could have some kind of kinship relationship with Basque on the one hand and the Iberian language on the other, but in the same indirect form that I have just pointed out for the Indo-European palaeo-varieties with respect to the historical Indo-European languages. Or in other words: the language(s) of the place-names referred to in this work would be palaeo-varieties of a linguistic family to which two known historical languages, Iberian and Basque, may have belonged, although we cannot establish a relation of direct affiliation either between those two historical languages themselves or between any of them and the palaeo-varieties of the prehistoric toponymy.
In general, Celtic does not have in its historical territories the onomastic behavior of an ancestral language, but that of an intrusive language, whose presence there is not only more recent than other Indo-European varieties, but also after that of various non-Indo-European strata, which are themselves ranked between the oldest detected (Palaeo-Indo-European) and the last of Pre-Romans, which is Celtic itself. If we only detected two strata, the Indo-European and the Celtic ones, we could discuss if it is possible that both are one and the same, so that what we define as “Celtic” is nothing other than the modern in situ evolution of Palaeo-Indo-European. But examples like those of kiŕśiltiŕ, kerso-ialos, Cirsa or Itucodon, among many others analyzed throughout this book, make it unlikely. And, in addition, the mediation of several strata in the column between the Palaeo-Indo-European language of Cirsa, as well as the greater antiquity of the ip- and tuk- languages in Spanish, Gallic and British territory, defines the latter as a new and more recent layer than the aforementioned, which burst into its historical sites during the Iron Age.
Because Archaeology continues to deny the existence of population movements of a size worthy of consideration in the Iron Age, it is necessary to accept that the Indo-European Problem remains intact. It is understandable that, faced with this aporia, many minds who are uncomfortable living with doubts prefer to adopt a creed (the traditional, the Neolithic or the continuist) and present it as a certainty to their students in the classrooms or their colleagues in conferences and publications. It’s not my case. For me, with Voltaire, “le doute est désagréable, mais la certitude est ridicule”. Or with Manzoni: “È men male l’agitarsi nel dubbio, che riposar nell’errore”.
We already had a good idea about the expansion of Celts, based on proto-historical accounts, fragmentary languages, and linguistic guesstimates, but the connection of Celtic with either Urnfield or slightly later Hallstatt/La Tène was always blurred, due to the lack of precise data on population movements.
The latest paper on Iberia is interesting for many details, such as:
A discrete influx of North African ancestry in certain samples before the Moorish invasion (which was probably mediated by peoples of North African rather than Levantine admixture).
The finding of very Mycenaean-like Greek colonies of the 5th century (interestingly, under R1b lineages).
The paper is, however, of particular importance from the perspective of historical linguistics. It confirms that:
Celtic-speaking peoples expanded in Iberia likely during the Late Bronze Age – Early Iron Age (probably with the Urnfield culture, before 1000 BC) with North/Central European ancestry.
NOTE. The paper marks what are believed to be the boundaries of non-Indo-European languages in the later Iron Age, extrapolating that situation to the past. Mediterranean sites with Iberian traits (ca. 6th century on) were probably non-Indo-European-speaking tribes, but it is unclear what happened in the centuries before their sampling, and there are no clear boundaries. The arrival of these Celts from central Europe with the Urnfield culture makes it very likely that the Iberian expansion to the north happened later, thus incorporating this central European ancestry in the process. The southern (orientalizing, Tartessian) site of La Angorrilla shows incineration and influence from Phoenician settlers, and the actual language of its inhabitants is also far from clear. The other investigated samples, with higher central European contribution, are from Celtiberian sites.
The slightly later arrival of (Phoenician, Greek and) Latin-speaking peoples into Iberia is marked by Central/Eastern Mediterranean and North African ancestry.
While both confirm what was more or less already known about the oldest attested NWIE dialects, and further support the role of East Bell Beakers in expanding North-West Indo-European, the first part is interesting for two main reasons:
Koch’s Celtic from the West hypothesis, which made a recent comeback with a renewed model based on “steppe ancestry”, is once again rejected in population genomics, as expected. At this point I doubt this will mean anything to the supporters of the theory (because you can propose as many “Celtic-over-Celtic” layers as you want), but if you are not obsessed with autochthonous continuity of Celtic languages in the Atlantic area, we might begin to judge the most correct dialectal split (and thus classification) among those proposed to date, based on ancestry and haplogroup expansions.
We believed in the 2000s that the expansion of haplogroup R1b-M167 (TMRCA ca. 1100 BC for YTree or 1700 BC for YFull) was coupled with the expansion of Iberians from the Pyrenees, in turn (thus) closely related to Basques. This non-IE presence has been contested with toponymic data in linguistics, and with the testing of many modern samples and the subsequent discovery of the widespread distribution of the subclade in western and northern Europe. Now it has become even more likely (lacking confirmation with aDNA) that this haplogroup expanded with Celts.
NOTE. Regarding R1b SNPs, YTree has more samples (and thus more SNPs) to work with estimates, due to its connection with FTDNA groups, so it is in principle more reliable (although estimates were calculated in 2017). Nevertheless, the methods to estimate the age of the MRCA are different between YTree and YFull.
Why this is important has to do with the realization that Celts must have expanded explosively in all directions during the estimated range for Common Celtic (ca. 1500-1000 BC), and as such R1b-M167 is probably going to be one of the clear Y-DNA markers of the Celtic expansion, when it appears in the ancient DNA record, maybe in new SNP calls from samples of the Olalde et al. (2019) paper, or in future Urnfield/Hallstatt/La Tène papers.
Sister clades derived from R1b-Z262 (TMRCA ca. 1650 BC for YTree, or 2700 for YFull), although sharing a quite old origin, may have taken part in the same communities that expanded R1b-M167, likely from some point in central Europe, possibly as remnants of a previous (Tumulus culture?) central European expansion, as the sample SZ5 from Szólád (R1b-CTS1595) and the distribution of modern samples suggest.
The youngest sub-branch, R1b-M167, dates to approximately 3.5 kya (95% CI= 2.5-5.3 kya), i.e. even after the Bronze Age.
NOTE. Admittedly, the maps are mainly based on Iberian samples and certain limited sampling elsewhere, so most of the frequencies displayed in other territories are extrapolated. Since the percentage of R1b-M167 in France is estimated to be ca. 3%, and in Bavaria ca. 5%, the distribution in Central Europe is probably much higher, and around the Mediterranean much lower than represented in them.
The Celtic expansion might not have been a mass migration of peoples replacing all male lines of their controlled territories (as was common in the Neolithic and Chalcolithic), because of the Bronze Age dominant chiefdom-based system that relied on alliances, but it is becoming clear that Early Celts are also going to show the expansion of certain successful male lineages.
Oh, and you can say goodbye to the autochthonous “Vasconic = R1b-DF27” (latest heir of the “Vasconic = R1b-P312”) theory, too, if – for some strange reason – you hadn’t already.
EDIT (16 MAR) Just in case the wording is not clear: the fact that this haplogroup most likely expanded with Celts does not mean that its lineages didn’t eventually become incorporated into Iberian cultures and adopt non-IE languages: some of them probably did at some point, in some regions of northern Iberia, and most were certainly later incorporated into Roman civilization and spoke Latin, then into the medieval kingdoms with their languages, and so on until the present day… Only those eventually associated with Iron Age Aquitanians may have retained their non-IE language, unless those lineages today associated with Basques were incorporated later into the Basque-speaking regions by expanding medieval kingdoms. A complex picture repeated everywhere in Europe: no haplogroup+language continuity in sight, anywhere.
NOTE: This is currently the most likely interpretation of data based on estimations of mutations; it is not confirmed with ancient samples.
We assembled genome-wide data from 271 ancient Iberians, of whom 176 are from the largely unsampled period after 2000 BCE, thereby providing a high-resolution time transect of the Iberian Peninsula. We document high genetic substructure between northwestern and southeastern hunter-gatherers before the spread of farming. We reveal sporadic contacts between Iberia and North Africa by ~2500 BCE and, by ~2000 BCE, the replacement of 40% of Iberia’s ancestry and nearly 100% of its Y-chromosomes by people with Steppe ancestry. We show that, in the Iron Age, Steppe ancestry had spread not only into Indo-European–speaking regions but also into non-Indo-European–speaking ones, and we reveal that present-day Basques are best described as a typical Iron Age population without the admixture events that later affected the rest of Iberia. Additionally, we document how, beginning at least in the Roman period, the ancestry of the peninsula was transformed by gene flow from North Africa and the eastern Mediterranean.
From the Bronze Age (~2200–900 BCE), we increase the available dataset (6, 7, 17) from 7 to 60 individuals and show how ancestry from the Pontic-Caspian steppe (Steppe ancestry) appeared throughout Iberia in this period (Fig. 1, C and D), albeit with less impact in the south (table S13). The earliest evidence is in 14 individuals dated to ~2500–2000 BCE who coexisted with local people without Steppe ancestry (Fig. 2B). These groups lived in close proximity and admixed to form the Bronze Age population after 2000 BCE with ~40% ancestry from incoming groups (Fig. 2B and fig. S6).
Y-chromosome turnover was even more pronounced (Fig. 2B), as the lineages common in Copper Age Iberia (I2, G2, and H) were almost completely replaced by one lineage, R1b-M269. These patterns point to a higher contribution of incoming males than females, also supported by a lower proportion of nonlocal ancestry on the X-chromosome (table S14 and fig. S7), a paradigm that can be exemplified by a Bronze Age tomb from Castillejo del Bonete containing a male with Steppe ancestry and a female with ancestry similar to Copper Age Iberians.
For the Iron Age, we document a consistent trend of increased ancestry related to Northern and Central European populations with respect to the preceding Bronze Age (Figs. 1, C and D, and 2B). The increase was 10 to 19% (95% confidence intervals given here and in the percentages that follow) in 15 individuals along the Mediterranean coast where non-Indo-European Iberian languages were spoken; 11 to 31% in two individuals at the Tartessian site of La Angorrilla in the southwest with uncertain language attribution; and 28 to 43% in three individuals at La Hoya in the north where Indo-European Celtiberian languages were likely spoken (fig. S6 and tables S11 and S12).
This trend documents gene flow into Iberia during the Late Bronze Age or Early Iron Age, possibly associated with the introduction of the Urnfield tradition (18). Unlike in Central or Northern Europe, where Steppe ancestry likely marked the introduction of Indo-European languages (12), our results indicate that, in Iberia, increases in Steppe ancestry were not always accompanied by switches to Indo-European languages.
I think it is obvious they are extrapolating the traditional (not that well-known) linguistic picture of Iberia during the Iron Age, believing in continuity of that picture (especially non-Indo-European languages) during the Urnfield period and earlier.
What this data shows is, as expected, the arrival of Celtic languages in Iberia after Bell Beakers and, by extension, in the rest of western Europe. Somewhat surprisingly, this may have happened during the Urnfield period, and not during the La Tène period.
Also important are the precise subclades:
We thus detect three Bronze Age males who belonged to DF27 (154, 155), confirming its presence in Bronze Age Iberia. The other Iberian Bronze Age males could belong to DF27 as well, but the extremely low recovery rate of this SNP in our dataset prevented us to study its true distribution. All the Iberian Bronze Age males with overlapping sequences at R1b-L21 were negative for this mutation. Therefore, we can rule out Britain as a plausible proximate origin since contemporaneous British males are derived for the L21 subtype.
BAL0051 could be assigned to haplogroup I1, while BAL003 carries the C1a1a haplogroup. To the limits of our typing resolution, EN/MN individuals CHA001, CHA003, ELT002 and ELT006 share haplogroup I2a1b, which was also reported for Loschbour and Motala HG, and other LN and Chalcolithic individuals from Iberia [7, 9], as well as Neolithic Scotland, France, England, and Lithuania. Both C1 and I1/I2 are considered typical European HG lineages prior to the arrival of farming. Interestingly, CHA002 was assigned to haplogroup R1b-M343, which together with an EN individual from Cova de Els Trocs (R1b1a) confirms the presence of R1b in Western Europe prior to the expansion of steppe pastoralists that established a related male lineage in Bronze Age Europe [3, 6, 9, 13, 19]. The geographical vicinity and contemporaneity of these two sites led us to run genomic kinship analysis in order to rule out any first or second degree of relatedness. Early Neolithic individual FUC003 carries the Y haplogroup G2a2a1, commonly found in other EN males from Neolithic Anatolia, Starčevo, LBK Hungary, Impressa from Croatia and Serbia Neolithic and Czech Neolithic, but also in MN Croatia and Chalcolithic Iberia.
Sorry for the last weeks of silence, I have been rather busy lately. I have more projects going on, and (because of that) I also wanted to finish a project I have been working on for many months already.
I have therefore decided to publish a provisional version of the text, in the hope that it will be useful in the following months, when I won’t be able to update it as often as I would like to:
EDIT (20 JAN 2019): For those of you who are more comfortable reading in your native language, I have placed some links to automatic translations by Google Translate. They might work especially well for the texts of A Game of Clans & A Clash of Chiefs.
Don’t forget to check out the maps included in the supplementary materials: I have added Y-DNA, mtDNA, and ADMIXTURE data using GIS software. The PCA graphics are also important to follow the main text.
NOTE. The files are on my server; I have also uploaded them to Academia.edu and ResearchGate, in case the website is too slow.
I would have preferred to wait for a thorough revision of the section on archaeology and the linguistic sections on Uralic, but I doubt I will have time when the reviews come, so it was either now or maybe next December…
I say so in the introduction, but it is evident that certain aspects of the book are tentative to say the least: the farther back we go from Late Proto-Indo-European, the less clear are many aspects. Also, linguistically I am not convinced about Eurasiatic or Nostratic, although they do have a certain interest when we try to offer a comprehensive view of the past, including ethnolinguistic identities.
I cannot be an expert in everything, and these books cover a lot. I am bound to publish many corrections as new information appears and more reviews are sent. For example, just days ago (before SNP calls of Wang et al. 2018 were published) some paragraphs implied that AME might have expanded Nostratic from the Middle East. Now it does not seem so, and I changed them just before uploading the text. That’s how tentative certain routes are, and how much all of this may change. And that only if we accept a Nostratic phylum…
NOTE. Since the first book I wrote was the linguistic one, and I have spent the last months updating the archaeology + genetics part, now many of you will probably understand 1) why I am so convinced about certain language relationships and 2) how I used many posts to clarify certain ideas and receive comments. Many posts offer probably a good timeline of what I worked with, and when.
I did not add this section to the books, because they are still not ready for print, but I think this is due somewhere now. It is impossible to reference all who have directly or indirectly contributed to this, so this is a list of those I feel have played an important role.
I am indebted to the following people (which does not mean that they share my views, obviously):
First and foremost, to Fernando López-Menchero, for having the patience to review with detail many parts on Indo-European linguistics, knowing that I won’t accept many of his comments anyway. The additional information he offers is invaluable, but I didn’t want to turn this into a huge linguistic encyclopaedia with unending discussions of tiny details of each reconstructed word. I think it is already too big as it is.
I would not have thought about doing this if it were not for the interest of Wekwos (Xavier Delamarre) in publishing a full book about the Indo-European demic diffusion model (in the second half of 2017, I think). It was they who suggested that I extend the content, when all I had done until then was write an essay and draw some maps in my free time between depositing the PhD thesis and defending it.
Sadly, as much as I would like to publish a book with a professional publisher, I don’t think ancient DNA lends itself to the traditional format, so my requests (mainly to have free licenses and to be able to review the text at will, as new genetic papers are published) were logically not acceptable. Also, the main aim of all volumes, especially the linguistic one, is the teaching of essentials of Late Proto-Indo-European and related languages, and this objective would be thwarted by selling each volume for $50-70 and only in printed format. I prefer a wider distribution.
At first I didn’t think much of this proposal, because I do not benefit from this kind of publication in my scientific field, but with time my interest in writing a whole, comprehensive book on the subject grew to the point where it was already an ongoing project, probably by the start of 2018.
I would not have been in contact with Wekwos if it were not for user Camulogène Rix at Anthrogenica, so thanks for that and for the interest in this work.
I would not have thought of writing this either if not for the spontaneous support (with an unexpected phone call!) of a professor of the Complutense University of Madrid, Ángel Gómez Moreno, who is interested in this subject – as is his wife, a professor of Classics more closely associated with Indo-European studies, who helped me with a search for Indo-Europeanists.
EDIT (1 JAN 2019): I remembered that Karin Bojs sent me her book after reading the demic diffusion model. I may have also thought about writing a whole book back then, but mid-2017 is probably too early for the project.
Professor Kortlandt is still to review the text, but he contributed to both previous essays in some very interesting ways, so I hope he can help me improve the parts on Uralic, and maybe alternative accounts of expansion for Balto-Slavic, depending on the time depth that he would consider warranted according to the Temematic hypothesis.
The maps are evidently (for those who are interested in genetics) in part the result of the effort of the late Jean Manco: As you can see from the maps including Y-DNA and mtDNA samples, I have benefitted from her way of organising data and publishing it. Similarly, the work of Iain McDonald in assessing the potential migration routes of R1b and R1a in Europe with the help of detailed maps was behind my idea for the first maps, and consequently behind these, too.
Readers of this blog with interesting comments have also been essential for the improvement of the texts. You can probably see some of your many contributions there. I may not answer many comments, because I am always busy (and sometimes I just don’t have anything interesting to say), but I try to read all of them.
Users of other sites, like Anthrogenica, whose particular points of view and deep knowledge of some very specific aspects are sometimes very useful. In particular, user Anglesqueville helped me to fix some issues with the merging of datasets to obtain the PCAs and ADMIXTURE, and prepared some individual samples to merge them.
Even without posting anything, Google Analytics keeps sending me messages about increasing user fidelity (returning users), and stats haven’t really changed (which probably means more people are reading old posts), so thank you for that.