Common Slavs from the Lower Danube, expanding with haplogroup E1b-V13?

Florin Curta has published online his draft for Eastern Europe in the Middle Ages (500-1300), Brill’s Companions to European History, Vol. 10 (2019), apparently due to appear in June.

Some interesting excerpts, relevant for the latest papers (emphasis mine):

The Archaeology of the Early Slavs

(…) One of the most egregious problems with the current model of the Slavic migration is that it is not at all clear where it started. There is in fact no agreement as to the exact location of the primitive homeland of the Slavs, if there ever was one. The problem with the idea of tracing the origin of the Slavs to the Zarubyntsi culture dated between the 3rd century BC and the first century AD is that a gap of about 200 years separates it from the Kiev culture (dated between the 3rd and the 4th century AD), which is also attributed to the Slavs. Furthermore, another century separates the Kiev culture from the earliest assemblages attributed to the Prague culture. It remains unclear as to where the (prehistoric) Slavs went after the first century, and whence they could return, two centuries later, to the same region from which their ancestors had left. The obvious cultural discontinuity in the region of the presumed homeland raises serious doubts about any attempts to write the history of the Slavic migration on such a basis. There is simply no evidence of the material remains of the Zarubyntsi, Kiev, or even Prague culture in the southern and southwestern direction of the presumed migration of the Slavs towards the Danube frontier of the Roman Empire.

Moreover, the material culture revealed by excavations of 6th- to 7th-century settlements and, occasionally, cremation cemeteries in northwestern Russia, Belarus, Poland, Moravia, and Bohemia is radically different from that in the lands north of the Danube river, which according to the early Byzantine sources were inhabited at that time by Sclavenes: no settlement layout with a central, open area; no wheel-made pottery or pottery thrown on a tournette; no clay rolls inside clay ovens; few, if any clay pans; no early Byzantine coins, buckles, or remains of amphorae; no fibulae with bent stem, and few, if any bow fibulae. Conversely, those regions have produced elements of material culture that have no parallels in the lands north of the river Danube: oval, trough-like settlement features (which are believed to be remains of above-ground, log-houses); exclusively handmade pottery of specific forms; very large settlements, with over 300 houses; fortified sites that functioned as religious or communal centers; and burials under barrows. With no written sources to inform about the names and identities of the populations living in the 6th and 7th centuries in East Central and Eastern Europe, those contrasting material culture profiles could hardly be interpreted as ethnic commonality. In other words, there is no serious basis for attributing to the Sclavenes (or, at least, to those whom early Byzantine authors called so) any of the many sites excavated in Russia, Belarus, Poland, Moravia, and Bohemia.

Common Slavic expanding with Prague-Korchak from the east…or was it from the west?

Migrations

There is of course evidence of migrations in the 6th and 7th centuries, but not in the directions assumed by historians. For example, there are clear signs of settlement discontinuity in northern Germany and in northwestern Poland. German archaeologists believe that the bearers of the Prague culture who reached northern Germany came from the south (from Bohemia and Moravia), and not from the east (from neighboring Poland or the lands farther to the east). At any rate, no archaeological assemblage attributed to the Slavs either in northern Germany or in northern Poland may be dated earlier than ca. 700. In Poland, settlement discontinuity was postulated, to make room for the new, Prague culture introduced gradually from the southeast (from neighboring Ukraine). However, there is increasing evidence of 6th-century settlements in Lower Silesia (western Poland and the lands along the Middle Oder) that have nothing to do with the Prague culture. Nor is it clear how and when did the Prague culture spread over the entire territory of Poland. No site of any of the three archaeological cultures in Eastern Europe that have been attributed to the Slavs (Kolochin, Pen’kivka, and Prague/Korchak) has so far been dated earlier than the sites in the Lower Danube region where the 6th century sources located the Sclavenes. Neither the Kolochin, nor the Pen’kivka cultures expanded westwards into East Central or Southeastern Europe; on the contrary, they were themselves superseded in the late 7th or 8th century by other archaeological cultures originating in eastern Ukraine. Meanwhile, there is an increasing body of archaeological evidence pointing to very strong cultural influences from the Lower and Middle Danube to the Middle Dnieper region during the 7th century—the opposite of the alleged direction of Slavic migration.

When did the Slavs appear in those regions of East Central and Eastern Europe where they are mentioned in later sources? A resistant stereotype of the current scholarship on the early Slavs is that “Slavs are Slavonic-speakers; Slavonic-speakers are Slavs.”* If so, when did people in East Central and Eastern Europe become “Slavonic speakers”? There is in fact no evidence that the Sclavenes mentioned by the 6th-century authors spoke Slavic (or what linguists now call Common Slavic). Nor can the moment be established (with any precision), at which Slavic was adopted or introduced in any given region of East Central and Eastern Europe.** To explain the spread of Slavic across those regions, some have recently proposed the model of a koiné, others that of a lingua franca. The latter was most likely used within the Avar polity during the last century of its existence (ca. 700 to ca. 800).

*Ziółkowski, “When did the Slavs originate?” p. 211. On the basis of the meaning of the Old Church Slavonic word ięzyk (“language,” but also “people” or “nation”), Darden, “Who were the Sclaveni?” p. 138 argues that the meaning of the name the Slavs gave to themselves was closely associated with the language they spoke.

**Uncertainty in this respect dominates even in recent studies of contacts between Slavic and Romance languages (particularly Romanian), even though such contacts are presumed to have been established quite early (Paliga, “When could be dated ‘the earliest Slavic borrowings’?”; Boček, Studie). Recent studies of the linguistic interactions between speakers of Germanic and speakers of Slavic languages suggest that the adoption of place names of Slavic origin was directly linked to the social context of language contact between the 9th and the 13th centuries (Klír, “Sociální kontext”).

Avars

During the 6th century, the area between the Danube and the Tisza in what is today Hungary, was only sparsely inhabited, and probably a “no man’s land” between the Lombard and Gepid territories. It is only after ca. 600 that this area was densely inhabited, as indicated by a number of new cemeteries that came into being along the Tisza and north of present-day Kecskemét. There can therefore be no doubt about the migration of the Avars into the Carpathian Basin, even though it was probably not a single event and did not involve only one group of population, or even a cohesive ethnic group.

The number of graves with weapons and of burials with horses is particularly large in cemeteries excavated in southwestern Slovakia and in neighboring, eastern Austria. This was a region of special status on the border of the qaganate, perhaps a “militarized frontier.” From that region, the Avar mores and fashions spread farther to the west and to the north, into those areas of East Central Europe in which, for reasons that are still not clear, Avar symbols of social rank were particularly popular, as demonstrated by numerous finds of belt fittings. Emulating the success of the Avar elites sometimes involved borrowing other elements of social representation, such as the preferential deposition of weapons and ornamented belts. For example, in the early 8th century, a few males were buried in Carinthia (southern Austria) with richly decorated belts imitating those in fashion in the land of the Avars, but also with Frankish weapons and spurs. Much like in the Avar-age cemeteries in Slovakia and Hungary, the graves of those socially prominent men are often surrounded by many burials without any grave goods whatsoever.

Territory of the early Avar Qaganate and the location of the investigated sites in the Carpathian Basin in Csáky et al. (2019).

Carantanians

Carantania was a northern neighbor of the Lombard duchy of Friuli, which was inhabited by Slavs. According to Paul the Deacon, who was writing in the late 780s, those Slavs called their country Carantanum, by means of a corruption of the name of ancient Carnuntum (a former Roman legionary camp on the Danube, between Vienna and Bratislava). Carantanians were regarded as Slavs by the author of a report known as the Conversion of the Bavarians and Carantanians, and written in ca. 870 in order to defend the position of the archbishop of Salzburg against the claims of Methodius, the bishop of Pannonia. According to this text, a duke named Boruth was ruling over Carantania when he was attacked by Avars in ca. 740. He called for the military assistance of his Bavarian neighbors. The Bavarian duke Odilo (737–748) obliged, defeated the Avars, but in the process also subdued the Carantanians to his authority. Once Bavarian overlordship was established in Carantania, Odilo took with him as hostages Boruth’s son Cacatius and his nephew Chietmar (Hotimir). Both were baptized in Bavaria. During the 743 war between Odilo and Charles Martel’s two sons, Carloman and Pepin (the Mayors of the Palace in Austrasia and Neustria, respectively), Carantanian troops fought on the Bavarian side. The Bavarian domination cleared the field for missions of conversion to Christianity sent by Virgil, the new bishop of Salzburg (746–784). Many missionaries were of Bavarian origin, but some were Irish monks.

Moravians

Several Late Avar cemeteries dated to the last quarter of the 8th century are known from the lands north of the middle course of the river Danube, in what is today southern Slovakia and the valley of the Lower Morava [see image below]. By contrast, only two cemeteries have so far been found in Moravia (the eastern part of the present-day Czech Republic), along the middle and upper course of the Morava and along its tributary, the Dyje. In both Dolní Dunajovice and Hevlín, the latest graves may be dated by means of strap ends and belt mounts with human figures to the very end of the Late Avar period. (…)

The archaeological evidence pertaining to burial assemblages dated to the early 9th century is completely different. Shortly before or after 800, all traces of cremation—with or without barrows—disappear from the valley of the Morava river and southwestern Slovakia, two regions in which cremation had been the preferred burial rite during the previous centuries. This dramatic cultural change has often been interpreted as a direct influence of both Avar and Frankish burial rites, but it coincides in time with the adoption of Christianity by local elites. In spite of conversion, however, the representation of status through furnished burial continued well into the 9th century. Unlike Avar-age sites in Hungary and the surrounding regions, many men were buried in 9th-century Moravia together with their spurs, in addition to such weapons as battle axes, “winged” lance heads, or swords with high-quality steel blades of Frankish production.

Relevant Moravian sites mentioned in Curta’s new book.

When the Magyars inflicted a crushing defeat on the Bavarians at Bratislava (July 4, 907), the fate of Moravia was sealed as well. Moravia and the Moravians disappear from the radar of the written sources, and historians and archaeologists alike believe that the polity collapsed as a result of the Magyar raids.

Magyars

(…) although there can be no doubt about the relations between Uelgi and the sites in Hungary attributed to the first generations of Magyars, those relations indicate a migration directly from the Trans-Ural lands, and not gradually, with several other stops in the forest-steppe and steppe zones of Eastern Europe. In the lands west of the Ural Mountains, the Magyars are now associated with the Kushnarenkovo (6th to 8th century) and Karaiakupovo (8th to 10th century) cultures, and with such burial sites as Sterlitamak (near Ufa, Bashkortostan) and Bol’shie Tigany (near Chistopol, Tatarstan).* However, the same problem with chronology makes it difficult to draw the model of a migration from the lands along the Middle Volga. Many parallels for the so typically Magyar sabretache plates found in Hungary are from that region. They have traditionally been dated to the 9th century, but more recent studies point to the coincidence in time between specimens found in Eastern Europe and those from Hungary.

* Ivanov, Drevnie ugry-mad’iary; Ivanov and Ivanova, “Uralo-sibirskie istoki”; Boldog et al., “From the ancient homelands,” p. 3; Ivanov, “Similarities.” Ivanov, “Similarities,” p. 562 points out that the migration out of the lands along the Middle Volga is implied by the disappearance of both cultures (Kushnarenkovo and Karaiakupovo) in the mid-9th century. For the Kushnarenkovo culture, see Kazakov, “Kushnarenkovskie pamiatniki.” For the Karaiakupovo culture, see Mogil’nikov, “K probleme.”

Given that the Magyars are first mentioned in relation to events taking place in the Lower Danube area in the 830s, the Magyar sojourn in Etelköz must have been no longer than 60 years or so—a generation. (…)

A detail of the Arrival of the Hungarians, the vast (1800 m²) cyclorama by Árpád Feszty and his assistants, painted to celebrate the 1000th anniversary of the Magyar conquest of Hungary and now displayed at the Ópusztaszer National Heritage Park in Hungary. This specific detail is probably based on the account in the Annals of Fulda, which narrates under the year 894 that the Hungarians crossed the Danube into Pannonia where they “killed men and old women outright and carried off the young women alone with them like cattle to satisfy their lusts and reduced the whole” province “to desert”.

It has become obvious by now that one’s impression of the Magyars as “Easterners” and “steppe-like” was (and still is) primarily based on grave finds, while the settlement material is considerably more aligned with what is otherwise known from other contemporary settlement sites in Central and Southeastern Europe. The dominant feature on the 10th- and 11th-century settlements in Hungary is the sunken-floored building of rectangular plan, with a stone oven in a corner. Similarly, the pottery resulting from the excavation of settlement sites is very similar to that known from many other such sites in Eastern Europe. Moreover, while clear changes taking place in burial customs between ca. 900 and ca. 1100 are visible in the archaeological record from cemeteries, there are no substantial differences between 10th- and the 11th-century settlements in Hungary. (…)

As a matter of fact, the increasing quantity of paleobotanical and zooarchaeological data from 10th-century settlements strongly suggests that the economy of the first generations of Magyars in Hungary was anything but nomadic. To call those Magyars “half-nomad” is not only wrong, but also misleading, as it implies that they were half-way toward civilization, with social changes taking place that must have had material culture correlates otherwise visible in the burial customs.

Comments

The origin of “Slavs” (i.e. that of “Slavonic” as a language, whatever the ancestral Proto-Slavic ethnic make-up was) is almost as complicated as the origin of Albanians, Basques, Balts, or Finns. Their entry into history is very recent, with few reliable sources available until well into the Middle Ages. If you add our ignorance of their origin to the desire of every single researcher or amateur out there to connect them to their own region (or, still worse, to all the regions where they were historically attested), we are bound to find contradictory data and a constantly biased selection of information.

Furthermore, it is extremely complicated to connect any recent population to its ancestral (linguistic) one through haplogroups prevalent today, and just absurd to connect them through ancestral components. This, which was already suspected for many populations, has been confirmed recently for Basques in Olalde et al. (2019) and will be confirmed soon for Finns with a study of the Proto-Fennic populations in the Gulf of Finland.

NOTE. Yes, the “my parents look like Corded Ware in this PCA” argument never made sense. Ever. Why adults would constantly engage in that kind of false 5,000-year-old connection instead of learning history – or their own family history – escapes all comprehension. But if something is certain about human nature, it is that we will still see nativism and ancestry/haplogroup fetishism for any modern region or modern haplogroups and their historically attested ethnolinguistic groups.

Genetic structure of modern Balto-Slavic populations within a European context according to the three genetic systems. Image from Kushniarevich et al. (2015)

As you can see from my maps and writings, I prefer neat and simple concepts: in linguistics, in archaeology, and in population movements. Hence my aversion to the kind of endless proto-historical accounts (and interpretations of them) necessary to ascertain the origins of recent peoples (Slavs in this case), and my usual preference for:

  • Clear dialectal classifications, whether or not they can be as clear cut as I describe them. The only thing that sets Slavic apart from other recent languages is its connection with Baltic, luckily for both. Even though this connection is disputed by some linguists, and the question is always far from being resolved, a homeland of Proto-Balto-Slavic would almost necessarily need to be set to the north of the Carpathian Mountains in the Bronze Age (or at least close to them).
  • NOTE. A dismissal of a connection with Baltic would leave Slavic a still more complicated orphan, and its dialectal classification within Late PIE more dubious. Its union with Baltic in a Balto-Slavic group locates it close to Germanic, and thus as a Bronze Age North-West Indo-European dialect close to northern Germany. So bear with me in accepting this connection, or enter the linguistic hell of arguing for Indo-Slavonic of R1a-Z93 mixed with Temematic….

  • A priori “pots = people” assumption, which may lead to important errors, but fewer than the usual “pots != people” of modern archaeologists. The traditional identification of the Common Slavic expansion with the Prague-Korchak culture – however undefined this culture may be – has clear advantages: it may be connected (although admittedly with many archaeological holes) with western cultures expanding east during the Bronze Age, and then west again after the Iron Age, and thus potentially also with Baltic.
  • A simplistic “haplogroup expansion = ethnolinguistic expansion”, which is quite useful for prehistoric migrations, but enters into evident contradictions as we approach the Iron Age. Common Slavs may be speculatively (for all we know) associated with an expansion of recent R1a-M458 lineages – among other haplogroups – from the east, and possibly Balto-Slavic as an earlier expansion of older subclades from the west, as I proposed in A Clash of Chiefs.
Modern distribution of R1a-M458, after Underhill et al. (2015).

NOTE. Most R1a-Z280 lineages are more obviously connected with ancient Finno-Ugric peoples, as is now clear (see here and here).

Slavs appeared first in the Danube?

No matter what my personal preference is, one can’t ignore the growing evidence, and it seems that Florin Curta’s long-standing view of a Danubian origin of expansion for Common Slavic, including its condition as a lingua franca of late Avars, won’t be easy to reject any time soon:

1) Theories concerning Chernyakhov as a Slavic homeland will apparently need to be fully rejected, due to the Germanic-like ancestry that will be reported in the study by Järve et al. (2019).

EDIT (3 MAY 2019). From their poster Shift in the genetic landscape of the western Eurasian Steppe not due to Scythian dominance, but rather at the transition to the Chernyakhov culture (Ostrogoths):

(…) the transition from the Scythian to the Chernyakhov culture (~2,100–1,700 cal BP) does mark a shift in the Ponto-Caspian genetic landscape. Our results agree well with the Ostrogothic origins of the Chernyakhov culture and support the hypothesis that Scythian dominance was cultural rather than achieved through population replacement.

PCA of novel and published ancient samples from Scythian/Sarmatian and related groups on the background of modern samples presented as population medians. Δ – ref. 1, ○ – ref. 2, □ – ref. 3, ◊ – this study. Embedded are the locations of some of the samples. Notice the wide cluster formed by the three samples, from Hungarian Scythians in the west to steppe-like peoples in the east.

2) Therefore, unless Przeworsk shows the traditionally described mixture of populations in terms of ancestry and/or haplogroups, it will also be a sign of East Germanic peoples expanding south (and potentially displacing the ancestors of Slavs in either direction, east or south).

It would seem we are stuck in a Danubian vs. Kievan homeland for Common Slavs, then:

3) About the homeland in the Kiev culture, two early Avar females from Szólád have been said to cluster “among Modern Slavic populations” based on some data in Amorim et al. (2018).

Rather than supporting an origin of Slavs in common with modern Russians, Poles, and Ukrainians as observed in the PCA, though, the admixture of AV1 and AV2 (ca. AD 540-640) paradoxically supports an admixture of Modern Slavs of Eastern Europe in common with early Avar peoples (an Altaic-speaking population) and other steppe groups with an origin in East Asia… So this admixture would actually support a western origin of the Common Slavs with which East Asian Avars may have admixed, and whose descendants are necessarily sampled at later times.

Procrustes transformed PCA of medieval ancient samples against POPRES imputed SNP dataset. AV1 and AV2 samples have been circled in red. Color coding of medieval samples is same as in Figs 1 and 2. Two- and three-letter codes for POPRES samples: AL=Albania, AT=Austria, BA=Bosnia-Herzegovina, BE=Belgium, BG=Bulgaria, CH=Switzerland, CY=Cyprus, CZ=Czech Republic, DE=Germany, DK=Denmark, ES=Spain, FI=Finland, FR=France, GB=United Kingdom, GR=Greece, HR=Croatia, HU=Hungary, IE=Ireland, IT=Italy, KS=Kosovo, LV=Latvia, MK=Macedonia, NO=Norway, NL=Netherlands, PL=Poland, PT=Portugal, RO=Romania, SM=Serbia and Montenegro, RU=Russia, Sct=Scotland, SE=Sweden, SI=Slovenia, SK=Slovakia, TR=Turkey, UA=Ukraine.

4) Favouring Curta’s Danubian origin (or even an origin near Bohemia) at the moment are thus:

  • The “western” cluster of Early Slavs from Brandýsek, Bohemia (ca. AD 600-900).
  • Two likely Slavic individuals from Usedom, in Mecklenburg-Vorpommern (AD 1200) show hg. R1a-M458 and E1b-M215 (Freder 2010).
  • An early West Slav individual from Hrádek nad Nisou in Northern Bohemia (ca. AD 1330) also shows E1b-M215 (Vanek et al. 2015).
  • One sample from Székkutas-Kápolnadülő (SzK/239) among middle or late Avars (ca. AD 650-710), a supposed Slavonic-speaking polity, of hg. E1b-V13.
  • Two samples from Karos (K1/13 and K2/6) among Hungarian conquerors (ca. AD 895-950), likely both of hg. E1b-V13, probably connected to the alliance with Moravian elites.
  • Possibly a West Slavic sample from Poland in the High Middle Ages (see below).

A later Hungarian sample (II/53) from the Royal Basilica, where King Béla was interred, of hg. E1b1, supports the importance of this haplogroup among elite conquerors, although its original relation to the other buried individuals is unknown.

NOTE. You can see all ancient samples of haplogroup E to date on this Map of ancient E samples, with care to identify the proper subclades related to south-eastern Europe. About the ancestral origin of the haplogroup in Europe, you may read Potential extra Iberomaurusian-related gene flow into European farmers, by Chad Rohlfsen.

Even assuming that the R1a sample reported from the late Avar period is of a subclade typically associated with Slavs (I know, circular reasoning here), which is not warranted, we would already have 6 E1b1b vs. 1-2 R1a-M458 in populations that can actually be assumed to represent early Slavonic speakers (unlike many earlier cultures potentially associated with them), clearly earlier than other Slavic-speaking populations that will be sampled in eastern Europe. It is more and more likely that Early Slavs are going to strengthen Curta’s view, and this may somehow complicate the link of Proto-Slavic with eastern European BA cultures like Trzciniec or Lusatian.
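The tally above can be reproduced with a minimal sketch. It assumes that the six E1b samples are the five listed under point 4 plus the Royal Basilica individual, and that the expected Polish sample is left out because it is still unpublished. That grouping is an assumption made for illustration, as are the `status` labels and the function names.

```python
from collections import Counter

# (site / sample ID, reported haplogroup, status)
# status "reported": haplogroup as given in the text above.
# status "uncertain": the late Avar R1a sample, whose subclade is not known.
# The expected Polish sample is deliberately omitted (assumption: unpublished).
SAMPLES = [
    ("Usedom", "R1a-M458", "reported"),
    ("Usedom", "E1b-M215", "reported"),
    ("Hrádek nad Nisou", "E1b-M215", "reported"),
    ("Székkutas-Kápolnadülő SzK/239", "E1b-V13", "reported"),
    ("Karos K1/13", "E1b-V13", "reported"),
    ("Karos K2/6", "E1b-V13", "reported"),
    ("Royal Basilica II/53", "E1b1", "reported"),
    ("Late Avar period", "R1a", "uncertain"),
]


def major_clade(haplogroup):
    """Collapse a subclade label to the coarse group used in the comparison."""
    if haplogroup.startswith("E1b"):
        return "E1b"
    return haplogroup.split("-")[0]


def tally(samples, include_uncertain=False):
    """Count samples per coarse haplogroup, optionally adding uncertain ones."""
    return Counter(
        major_clade(hg)
        for _, hg, status in samples
        if status == "reported" or include_uncertain
    )


if __name__ == "__main__":
    print(tally(SAMPLES))
    print(tally(SAMPLES, include_uncertain=True))
```

Under these assumptions the strict count gives six E1b against one R1a, and including the late Avar sample raises R1a to two, which is the "1-2" range quoted above.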

NOTE. I am still expecting a clear expansion associated with Prague-Korchak, though, including a connection with bottlenecks based on R1a-M458 in the Middle Ages, whether the expansion is eventually shown to be from the west (i.e. Bohemia -> Prague -> Korchak), or from the east (i.e. Kiev -> Korchak -> Prague), and whether or not this cultural community was later replaced by other ‘true’ Slavonic-speaking cultures through acculturation or population movements.

Common theories on Slavic origins. After “The Early Slavs. Culture and Society in Early Medieval Europe” by P. M. Barford, Cornell University Press (2001). Image by Hxseek at Wikipedia.

5) Back to Przeworsk and the “north of the Carpathians” homeland (i.e. between the Upper Oder and the Upper Dniester), but compatible with Curta’s view: Even if Common Slavic is eventually evidenced to be driven by small migrations north and south of the Danube during the Roman Iron Age, before turning into a mostly “R1a-rich” migration or acculturation to the north in Bohemia and then east (which is what this early E1b-V13 connection suggests), this does not dismiss the traditional idea that Late Bronze Age – Iron Age central-eastern Europe was the Proto-Slavic homeland, i.e. likely the Pomeranian culture disturbed by the East Germanic migrations first (in Przeworsk), and the migrations of steppe nomads later (around the Danube).

Even without taking into account the connection with Baltic, the relevance of haplogroup E1b-V13 among Early Slavs may well be a sign of an ancestral population from the northern or eastern Carpathian region, supported by the finding of this haplogroup among the westernmost Scythians. The expansion of some modern E1b-CTS1273 lineages may link Slavic ancestrally with the Lusatian culture, which is an eastern (very specific) Urnfield culture group, stemming from central-east Europe.

An important paper in this respect is the upcoming Zenczak et al., where another hg. E1b1 will be added to the list above: such a sample is expected from Poland (from Kowalewko, Maslomecz, Legowo or Niemcza), either from the Roman Iron Age or Early Middle Ages, close to an early population of likely Scandinavian origin (eight I1 samples), apart from other varied haplogroups, with little relevance of R1a. Whether this E-V13 sample is an Iron Age one (justifying the bottleneck under E-V13 to the south) or, maybe more likely, a late one from the Middle Ages (maybe supporting a connection of the Gothic/Slavic E1b bottleneck with southern Chernyakhov or further west along the Danube) is unclear.

The finding of south-eastern European ancestry and lineages in both Early Slavs and East Germanic tribes* suggests therefore a Slavonic homeland near (or within) the Przeworsk culture, close to the Albanoid one, as proposed based on topohydronymy. This may point to a complex process of acculturation of different eastern European populations which formed alliances, as was common during the Iron Age and later periods, and which cannot be interpreted as a clear picture of their languages’ original homeland and ancestral peoples (in the case of East Germanic tribes, apparently originally expanding from Scandinavia under strong I1 bottlenecks).

* Iberian samples of the Visigothic period in Spain show up to 25% E1b-V13 samples, with a mixture of haplogroups including local and foreign lineages, as well as some more E1b-V13 samples later during the Muslim period. Out of the two E1b samples from Longobards in Amorim et al. (2018), only SZ18 from Szólád (ca. AD 412-604) is within E1b-V13, in a very specific early branch (SNP M35.2), further locating the expansion of hg. E1b-V13 near the Danube. Samples of haplogroup J (maybe J2a) or G2a among Germanic tribes (and possibly in Poland’s Roman Iron Age / Early Middle Ages) are impossible to compare with early Hungarian ones without precise subclades.

East Slavic expansion in topo-hydronymy. Image from Udolph (1997, 2016).

I already interpreted the earlier Slavic samples we had as a sign of a Carpathian origin and very recent bottlenecks under R1a lineages among Modern Slavs:

The finding of haplogroup E1b1b-M215 in two independent early West Slavic individuals further supports that the current distribution of R1a1a1b1a-Z282 lineages in Slavic populations is the product of recent bottlenecks. The lack of a precise subclade within the E1b1b-M215 tree precludes a proper interpretation of a potential origin, but they are probably under European E1b1b1a1b1-L618 subclade E1b1b1a1b1a-V13 (formed ca. 6100 BC, TMRCA ca. 2800 BC), possibly under the mutation CTS1273 (formed ca. 2600 BC, TMRCA ca. 2000 BC), in common with other ancient populations around the Carpathians (see below §viii.11. Thracians and Albanians). This gross geographic origin would support the studies of the Common Slavic homeland based on toponymy (Figure 66), which place it roughly between the Upper Oder and the Upper Dniester, north of the Carpathians (Udolph 1997, 2016).

EDIT (8 APR 2019): Another interesting piece of data is the haplogroup distribution among Modern Slavs and neighbouring peoples (see Wikipedia). For example, the bottleneck seen in Modern Albanians, under Z5017 subclade, also points to an origin of the expansion of E1b-V13 subclades among multiethnic groups around the Lower Danube coinciding with the Roman Iron Age, given the estimates for the arrival of Proto-Albanian close to the Latin and Greek linguistic frontier.

Remarkable is also its distribution among Rusyns, East Slavs from the Carpathians not associated with the Kievan Rus’, isolated thus quite soon from East Slavic expansions to the east. They were reported to show ca. 35% hg. E1b-V13 globally in FTDNA, with a frequency similar to or higher than R1a, in common with South Slavic peoples*, reflecting thus a situation similar to the source of East Slavs before further R1a-based bottlenecks (and/or acculturation events) to the east:

* Although probably due in part to founder effects and biased familial sampling, this should be assumed to be common to all FTDNA sampling, anyway.

Map showing the full geographic extent of the Rusyn people in Central Europe, prior to World War I (Carpatho Rusyn Society).

Repeating what should be already evident: in complex organizations and/or demographically dense populations (more common since the Iron Age), we can’t expect language change to happen in the same way as during the known Neolithic or Chalcolithic population replacements, be it in Finland, Hungary, Iberia, or Poland. For example, no matter whether Romans (2nd c. BC) brought some R1b-U152 and other Mediterranean lineages to Iberia; Germanic peoples entering Hispania (AD 5th c.) were of typically Germanic lineages or not; Muslims who spoke mainly Berber (AD 8th c.) and were mainly of hg. E1b-M81 (and J?) brought North African ancestry; etc. the language or languages of Iberia changed (or not) with the political landscape: neither with radical population replacements (or full population continuity), nor with the dominant haplogroups’ ancestral language.

Y-chromosome haplogroups are, in those cases, useful for ascertaining a more recent origin of the population. Just as the finding of certain R1a-Z645, I2a-L621 & N-L392 lineages among Hungarians shows a recent origin near the Trans-Urals forest-steppes, or the finding of I1, R1b-U106 & E1b-V13 among Visigoths shows a recent origin near the Danube, the finding of Early Slavs (ca. AD 6th-7th c.) originally with small elite groups of hg. R1a-M458 & E1b-V13 from the Lower/Middle Danube – if strengthened with more Early Slavic samples, with Slavonic partially expanding as a lingua franca in some regions – is not necessarily representative of the Proto-Slavic community, just as it is clearly not representative of the later expansion of Slavic dialects. It would be representative, though, of the same processes of acculturation repeated all over Eurasia at least since the Iron Age, where no genetic continuity can be found with ancestral languages.

Related

R1a-Z280 and R1a-Z93 shared by ancient Finno-Ugric populations; N1c-Tat expanded with Micro-Altaic

Two important papers have appeared regarding the supposed link of Uralians with haplogroup N.

Avars of haplogroup N1c-Tat

Preprint Genetic insights into the social organisation of the Avar period elite in the 7th century AD Carpathian Basin, by Csáky et al. bioRxiv (2019).

Interesting excerpts (emphasis mine):

After 568 AD the Avars settled in the Carpathian Basin and founded the Avar Qaganate that was an important power in Central Europe until the 9th century. Part of the Avar society was probably of Asian origin, however the localisation of their homeland is hampered by the scarcity of historical and archaeological data.

Here, we study mitogenome and Y chromosomal STR variability of twenty-six individuals, a number of them representing a well-characterised elite group buried at the centre of the Carpathian Basin more than a century after the Avar conquest.

The Y-STR analyses of 17 males give evidence on a surprisingly homogeneous Y chromosomal composition. Y chromosomal STR profiles of 14 males could be assigned to haplogroup N-Tat (also N1a1-M46). N-Tat haplotype I was found in four males from Kunpeszér with identical alleles on at least nine loci. The full Y-STR haplotype I, reconstructed from AC17 with 17 detected STRs, is rare in our days. Only nine matches were found among haplotypes in YHRD database, such as samples from the Ural Region, Northern Europe (Estonia, Finland), and Western Alaska (Yupiks). We performed Median Joining (MJ) network analysis using N-Tat haplotypes with ten shared STR loci (Fig. 3, Table S9). All modern N-Tat samples included in the network had derived allele of L708 as well. Haplotype I (Cluster 1 in Fig. 3) is shared by eight populations on the MJ network among the 24 identical haplotypes. Cluster 1 represents the founding lineage, as it is described in Siberian populations, because this haplotype is shared by the most populations and it is more diverse than Cluster 2.

Nine males share N-Tat haplotype II (on a minimum of eight detected alleles), all of them buried in the Danube-Tisza Interfluve. We found 30 direct matches of this N-Tat haplotype II in the YHRD database, using the complete 17 STR Y-filer profile of AC1, AC12, AC14, AC15, AC19 samples. Most hits came from Mongolia (seven Buryats and one Khalkh) and from Russia (six Yakuts), but identical haplotypes also occur in China (five in Xinjiang and four in Inner Mongolia provinces). On the MJ network, this haplotype II is represented by Cluster 2 and is composed of 45 samples (including 32 Buryats) from six populations (Fig. 3).

y-str-haplogroup-n-mongolian-ugrians
Median Joining network of 162 N-Tat Y-STR haplotypes Allelic information of ten Y-STR loci were used for the network. Only those Avar samples were included, which had results for these ten Y-STR loci. The founder haplotype I (Cluster 1) is shared by eight populations including three Mongolian, three Székely, three northern Mansi, two southern Mansi, two Hungarian, eight Khanty, one Finn and two Avar (AC17, AC26) chromosomes. Haplotype II (Cluster 2) includes 45 haplotypes from six populations studied: 32 Buryats, two Mongolians, one Székely, one Uzbek, one Uzbek Madjar, two northern Mansi and six Avars (AC1, AC12, AC14, AC15, AC19 and KSZ 37). Haplotype III (indicated by a red arrow) is AC8. Information on the modern reference samples is seen in Table S9.

A third N-Tat lineage (type III) was represented only once in the Avar dataset (AC8), and has no direct modern parallels from the YHRD database. This haplotype on the MJ network (see red arrow in Fig. 3) seems to be a descendent from other haplotype cluster that is shared by three populations (two Buryat from Mongolia, three Khanty and one Northern Mansi samples). This haplotype cluster also differs one molecular step (locus DYS393) from haplotype II. We classified the Avar samples to downstream subgroup N-F4205 within the N-Tat haplogroup, based on the results of ours and Ilumäe et al.18 and constructed a second network (Fig. S4). The N-F4205 network results support the assumption that the N-Tat Avar samples belong to N-F4205 subgroup (see SI chapter 1d for more details).
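The "one molecular step" comparison in the excerpt above can be illustrated with a small sketch. It assumes a simple stepwise distance (the sum of absolute repeat-count differences over loci typed in both samples); the haplotype values below are invented for illustration and are not taken from the paper, and `step_distance` is a hypothetical helper name.

```python
from typing import Dict, Optional

# A Y-STR haplotype as locus name -> repeat count (None = allele not detected).
Haplotype = Dict[str, Optional[int]]


def step_distance(a: Haplotype, b: Haplotype) -> int:
    """Sum of absolute repeat differences over loci typed in both haplotypes."""
    shared = [locus for locus in a
              if a[locus] is not None and b.get(locus) is not None]
    return sum(abs(a[locus] - b[locus]) for locus in shared)


# Invented example values, NOT the alleles reported for the Avar samples.
hap_ii = {"DYS19": 14, "DYS390": 23, "DYS393": 14, "DYS391": None}
hap_iii = {"DYS19": 14, "DYS390": 23, "DYS393": 13, "DYS391": 10}

# Only DYS393 differs among the loci typed in both, so the distance is one step.
print(step_distance(hap_ii, hap_iii))
```

Ancient samples often lack some loci, which is why the sketch skips undetected alleles instead of counting them as differences. Multi-copy loci would need separate handling that is not shown here.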

Based on our calculation, the age of accumulated STR variance (TMRCA) within N-Tat lineage for all samples is 7.0 kya (95% CI: 4.9 – 9.2 kya), considering the core haplotype (Cluster 1) to be the founding lineage. Y haplogroup N-Tat was not detected by large scale Eurasian ancient DNA studies but it occurs in late Bronze Age Inner Mongolia and late medieval Yakuts, among them N-Tat has still the highest frequency.
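The excerpt does not spell out how the age of accumulated STR variance was computed. One common approach, sketched here as an assumption and not as the authors' actual procedure, takes the average squared difference (ASD) in repeat counts between each sample and an assumed founder haplotype, and divides it by a per-locus mutation rate. The mutation rate and generation length in the example call are placeholders, not values from the paper.

```python
from typing import Dict, List

Haplotype = Dict[str, int]


def asd_to_founder(samples: List[Haplotype], founder: Haplotype) -> float:
    """Average squared repeat-count difference from the founder, over samples and loci."""
    loci = list(founder)
    total = 0
    for sample in samples:
        total += sum((sample[locus] - founder[locus]) ** 2 for locus in loci)
    return total / (len(samples) * len(loci))


def tmrca_years(asd: float, mutation_rate: float, generation_years: float) -> float:
    """Under a single-step mutation model, ASD grows by mutation_rate per generation."""
    return asd / mutation_rate * generation_years


# Invented two-locus data, for illustration only.
founder = {"locusA": 10, "locusB": 12}
samples = [{"locusA": 10, "locusB": 12}, {"locusA": 11, "locusB": 12}]

asd = asd_to_founder(samples, founder)
# Placeholder rate (per locus per generation) and generation length in years.
print(tmrca_years(asd, mutation_rate=0.001, generation_years=25))
```

The result scales inversely with the chosen mutation rate, so the estimate depends heavily on that choice as well as on which haplotype is taken as the founder.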

Two males (AC4 and AC7) from the Transtisza group belong to two different haplotypes of Y-haplogroup Q1. Both Q1a-F1096 and Q1b-M346 haplotypes have neither direct nor one step neighbour matches in the worldwide YHRD database. A network of the Q1b-M346 haplotype shows that this male had a probable Altaian or South Siberian paternal genetic origin.

EDIT (5 APR 2019): The paper offers an interesting late sample before the arrival of Hungarian conquerors, although we don’t know which precise lineage the sample belongs to:

One sample in our dataset (HC9) comes from this population, and both his mtDNA (T1a1b) and Y chromosome (R1a) support Eastern European connections. (…) Furthermore, we excluded sample HC9 from population-genetic statistical analyses because it belongs to a later period (end of 7th – early 9th centuries)

Apparently, then, results are consistent with what was already known from studies of modern populations:

According to Ilumäe et al. study, the frequency peak of N-F4205 (N3a5-F4205) chromosomes is close to the Transbaikal region of Southern Siberia and Mongolia, and we conclude that most Avar N-Tat chromosomes probably originated from a common source population of people living in this area, completely in line with the results of Ilumäe et al.

haplogroup_n1
Geographic-Distribution Map of hg N3 from Ilumäe et al.

Finno-Ugrians share haplogroup R1a-Z280

Another paper, behind paywall, Genetic history of Bashkirian Mari and Southern Mansi ethnic groups in the Ural region, by Dudás et al. Molecular Genetics and Genomics (2019).

Interesting excerpts (emphasis mine):

Y‑chromosome diversity

The most frequent haplogroups of the Bashkirian Maris were N1b-P43 (42%), R1a-Z280 (16%), R1a-Z93 (16%), N1c-Tat (13%), and J2-M172 (7%). Furthermore, subgroup R1b-M343 accounted for 4% and I2a-P37 covered 2% of the lineages. None of the Mari N1c Y chromosomes belonged to the N1c subgroups investigated (L1034, VL29, Z1936).

In the case of the Southern Mansi males, the most frequent haplogroups were N1b-P43 (33%), N1c-L1034 (28%) and R1a-Z280 (19%). The frequencies of the remaining haplogroups were as follows: R1a-M458 (6%), I1-L22 (3%), I2a-P37 (3%), and R1b-P312 (3%). The haplotype and haplogroup diversities of the Bashkirian Mari group were 0.9929 and 0.7657, whereas these values for the Southern Mansi were 0.9984 and 0.7873, respectively. The results show that, in both populations, haplotypes are much more diverse than haplogroups.
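Diversity values like these are usually computed with Nei's gene diversity, although the excerpt does not state the formula or the sample sizes, so the following is a sketch under that assumption. `gene_diversity` works on raw counts with the small-sample correction n/(n-1); `uncorrected_diversity` works on frequencies alone.

```python
from typing import List


def uncorrected_diversity(freqs: List[float]) -> float:
    """1 - sum(p_i^2): chance that two random draws differ, with no sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def gene_diversity(counts: List[int]) -> float:
    """Nei's unbiased estimator: n/(n-1) * (1 - sum(p_i^2)), from raw counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two samples")
    freqs = [c / n for c in counts]
    return n / (n - 1) * uncorrected_diversity(freqs)


# Rounded Bashkirian Mari haplogroup percentages quoted above.
mari_freqs = [0.42, 0.16, 0.16, 0.13, 0.07, 0.04, 0.02]
print(round(uncorrected_diversity(mari_freqs), 4))

# Hypothetical counts: if every sampled man carries a different haplotype,
# the corrected estimate reaches its maximum.
print(gene_diversity([1] * 40))
```

Plugging the rounded Mari percentages into the uncorrected form gives about 0.75, a little below the reported 0.7657, which is what a small-sample correction would produce; the sample size is not given in the excerpt. The second call shows why haplotype diversity sits near 1: almost every man has his own STR haplotype, while haplogroups pool many men into a few classes.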

bashkir-mari-southern-mansi
Haplogroup frequencies of the Bashkirian Mari and the Southern Mansi ethnic groups in Ural region

Genetic structure

(..) the studied Bashkirian Mari and Southern Mansi population groups formed a compact cluster along with two Khanty, Northern Mansi, Mari, and Estonian populations based on close Fst-genetic distances (< 0.05), with nonsignificant p values (p > 0.05) except for the Estonian population. All of these populations belong to the Finno-Ugric language family. Interestingly, the other Mansi population studied by Pimenoff et al. (2008) (pop # 38) was located a great distance from the Southern Mansi group (0.268). In addition, the Bashkir population (pop # 6) did not show a close genetic affinity to the Bashkirian Mari group (0.194), even though it is the host population. However, the Russian population from the Eastern European region of Russia (pop # 49) showed a genetic distance of 0.055 with the Southern Mansi group. All Hungarian speaking populations (pops 13, 22, 23, 24, 50, and 51) showed close genetic affinities to each other and to the neighbouring populations, but not to the two studied populations.

y-dna-hungarians-ugric-mansi
Multidimensional scaling (MDS) plot constructed on Fst-genetic distances of Y haplogroup frequencies of 63 populations compared. The haplogroup frequency data used for population comparison together with references are seen in Online Resource 2 (ESM_2). Pairwise Fst-genetic distances and p values between 63 populations were calculated as shown in Online Resource 3 (ESM_3). Fig. 4: Multidimensional scaling (MDS) plot constructed on Rst-genetic distances of 10 STR-based Y haplotype frequencies of 21 populations compared. Image modified to include labels of modern populations.

Phylogenetic analysis

Median-joining networks were constructed for:

N-P43 (earlier N1b):

(…) TMRCA estimates for this haplogroup were made for all P43 samples (n = 157) 8.7 kya (95% CI 6.7–10.8 kya), for the N-P43 Asian.

N1c-Tat:

(…) 75% of Buryats belonged to Haplotype 2, indicating that the Buryats studied by us is a young and isolated population (Bíró et al. 2015). Bashkirian Mari samples derive from Haplotype 2 via Haplotype 3 (see dark purple circles on the top of Fig. 6a). Haplotype 3 contained six males (2 Buryat, 1 Northern Mansi, and 3 Khanty samples from Pimenoff et al. 2008). The biggest Bashkirian Mari haplotype node (3 Mari samples) was positioned three mutational steps away from Haplotype 1 and the remaining Mari samples can be derived from this haplotype. Southern Mansi haplotypes were scattered within the network except for two, which formed a smaller haplotype node with two Northern Mansi and two Khanty samples from Pimenoff et al. (2008).

n1c-n-tat-uralic-ugric
Median-Joining Networks (MJ) of 153 N-Tat (a) and 26 N-L1034 (b) haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. For N-Tat network, we used data from Southern Mansi (n = 11), Bashkirian Mari (n = 6) samples with Hungarian (n = 12), Hungarian speaking Székely (n = 6), Northern Mansi (n = 14), Mongolian (n = 16), Buryat (n = 44), Finnish (n = 13), Uzbek Madjar (n = 2), Uzbek (n = 3), Khanty (n = 4) populations studied earlier by us (Fehér et al. 2015; Bíró et al. 2015) and Khanty (n = 18) and Mansi (n = 4) studied by Pimenoff et al. (2008)

R1a-Z280 haplotypes, shared by Maris, Mansis, and Hungarians, hence ancient Finno-Ugrians:

The founder R1a-Z280 haplotype was shared by four samples from four populations (1 Bashkirian Mari; 1 Southern Mansi; 1 Hungarian speaking Székely; and 1 Hungarian), as presented in Fig. 7 (Haplotype 1). Haplotype 2 included five males (3 Bashkirian Mari and 2 Hungarian), as it can be seen in Fig. 7. Haplotype 4 included two shared haplotypes (1 Bashkirian Mari and one Hungarian speaking Csángó). The remaining two Bashkirian Mari haplotypes differ from the founder haplotype (Haplotype 1) by two mutational steps via Hungarian or Hungarian and Bashkirian Mari shared haplotypes. Beside Haplotype 1, the remaining Southern Mansi haplotypes were shared with Hungarians (Haplotype 5 or turquoise blue and red-coloured circles above Haplotype 7) or with Hungarians and Hungarian speaking Székely group (Haplotypes 3, 5, and 6). Haplotype 7 included ten Hungarian speakers (Hungarian, Székely, and Csángó). One Hungarian and one Uzbek Khwarezm shared haplotype can be found in Fig. 7 as well (red and white-coloured circle). All the other haplotypes were scattered in the network. The age of accumulated STR variation within R1a-Z280 lineage for 93 samples is estimated to be 9.4 kya (95% CI 6.5–12.4 kya) considering Haplotype 1 (Fig. 7) to be the founder.

r1a-z280-ugrians
Median-Joining Networks (MJ) of 93 R1a-Z280 haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. We used haplotype data from Bashkirian Mari (n = 7), Southern Mansi (n = 7), Hungarian (n = 52), Hungarian speaking Székely (n = 11), Hungarian speaking Csángó (n = 10), Uzbek Ferghana (n = 2), Uzbek Tashkent (n = 1), Uzbek Khwarezm (n = 1) and Northern Mansi (n = 2) populations

R1a-Z93 as isolated lineages among Permic and Ugric populations:

Figure 8 depicts an MJ network of R1a-Z93* samples using 106 haplotypes from the 14 populations (Fig. 8). All of the Bashkirian Mari samples (7 haplotypes) formed a very isolated branch and differed from the one Hungarian haplotype (Fig. 8, see Haplotype 1) by seven mutational steps as well from two Uzbek Tashkent samples (see Haplotype 3). Another Hungarian sample shared two haplotypes of Uzbek Khwarezm samples in Haplotype 4. This haplotype can be derived from Haplotype 3 (Uzbek Tashkent). Haplotype 2 included one Hungarian and one Khakassian male. The remaining three Hungarian haplotypes are outliers in the network and are not shared by any sample. The other population samples included in the network either form independent clusters such as Altaians, Khakassians, Khanties, and Uzbek Madjars or were scattered in the network. The age of accumulated STR variation (TMRCA) within R1a-Z93* lineage for 106 samples is estimated as 11.6 kya (95% CI 9.3–14.0 kya) considering an Armenian haplotype (Fig. 8, “A”) to be the founder and the median haplotype.

r1a-z93-ugrians
Median-Joining Networks (MJ) of 106 R1a-Z93 haplotypes constructed. The circle sizes are proportional to the haplotype frequencies. The smallest area is equivalent to one individual. We used the next haplotype data: 7 Bashkirian Mari, 6 Khanty, 4 Uzbek Madjar, 5 Uzbek Ferghana, 9 Uzbek Tashkent, 7 Uzbek Khwarezm, 2 Mongolian, 2 Buryat, 6 Hungarian samples tested by us for this study or published earlier (Bíró et al. 2015) and populations (3 Armenian; 3 Afghan Tajik; 16 Altaian; 24 Khakassian; 12 Kyrgyz) from Underhill et al. (2015)

Comments

The results of modern populations for N (especially N1c) subclades show really wide clusters and ancient TMRCA, consistent with their known ancient and wide distribution in northern and eastern Eurasian groups, and thus with infiltration of different lineages with eastern nomads (and northern Arctic populations) coupled with later bottlenecks, as well as acculturation of groups.

EDIT (2 APR): Interesting is the specific subclade to which ancient Mongolic-speaking Avars belong (information from Yfull): N1c-F4205 (TMRCA ca. 500 BC), a subclade of N1c-Y6058 (formed ca. 2800 BC, TMRCA ca. 2800 BC). This branch also gives the “European” branch N1c-CTS10760 (formed ca. 2800 BC, TMRCA ca. 2100 BC), and is a subclade of a branch of N1c-L392 (formed ca. 4400 BC, TMRCA ca. 2800 BC). A northern expansion of N1c-L392 is probably represented by its branch N1c-Z1936 (formed ca. 2800 BC, TMRCA ca. 2100 BC), the most likely candidate to appear in the Kola Peninsula in the Bronze Age as the Palaeo-Laplandic population (see here). Read more about potential routes of expansion of haplogroup N.

On the other hand, R1a-Z280 lineages form a tight cluster connecting Permic with Ugric groups, with R1a-Z93 showing early isolation (probably) between Cis-Urals and Trans-Urals regions. While both Corded Ware lineages in Finno-Ugrians are most likely related to the Abashevo expansion through Seima-Turbino and the Andronovo-like Horizon (and potentially later Eurasian expansions), a plausible hypothesis would be that Finno-Ugrians are related to an expansion of R1a-Z283 haplogroups (we already knew about the Finno-Permic connection), while the ancient connection between Permians and Hungarians with R1a-Z93 would correspond to this haplogroup’s potentially tighter link with an early Samoyedic split.

I don’t think that an explosive expansion of eastern Corded Ware groups of R1a-Z645 lineages will show a clear-cut division of haplogroups among Eastern Uralic groups, though, and culturally I doubt we will have such a clear image, either (similar to how the explosive expansion of Bell Beakers cannot be easily divided by regional/language group into R1b-L151 subclades before the known bottlenecks). Relevant in this regard are the known Z93 samples from the Árpád dynasty.

Nevertheless, this data may represent a slightly more recent wave of R1a-Z280 lineages linked to the expansion of Ugric into the Trans-Uralian region, after their split from Finno-Permic, still in close contact with Indo-Iranians in Poltavka and Sintashta-Potapovka, evident from the early and late Indo-Iranian borrowings, during a common period when Samoyedic had already separated.

Such a “Z283 over Z93” layer in the Trans-Urals (and Cis-Urals?) forest-steppes would be similar to the apparent replacement of Z284 by Z282 in the Eastern Baltic during the Bronze Age (possibly with the second or Estonian Battle Axe wave or, much more likely, during later population movements). Such an early R1a-Z93 split could potentially also be supported by the separation into bottlenecks under “Northern” (R1a-Z283) Finno-Ugric-speaking Abashevo-related groups and “Southern” (R1a-Z93) acculturated Indo-Iranian-speaking Abashevo migrants developing Sintashta-Potapovka admixing with Poltavka R1b-Z2103 herders.

r1a-z282-z280-z2125-distribution
Modified image, from Underhill et al. (2015). Spatial frequency distributions of Z282 (green) and Z93 (blue) affiliated haplogroups. Notice the potential Finno-Ugric-associated distribution of Z282 (especially R1a-M558, a Z280 subclade), and the expansion of R1a-Z2123 subclades with Central Asian forest-steppe groups.

Conclusion

Let’s review some of the most common myths about Hungarians (and Finno-Ugrians in general) repeated ad nauseam, side by side with my assertions:

❌ N (especially N1c-Tat) in ancient and modern samples represent the True Uralic™ N1c peoples including Magyar tribes? Nope.

✅ Ancient N (especially N1c-Tat) lineages among Uralic populations expanded relatively recently, and differently in different regions (including eastern steppe nomads and northern arctic populations) not associated with a particular language or language group? Yep (read the series on Corded Ware = Uralic expansion).

❌ Modern Hungarian R1a-Z280 lineages represent the majority of the native population, poor Slavic ‘peasants’ from the Carpathian Basin, forcibly acculturated by a minority of bad bad Hungarian hordes? Nope.

✅ Modern Hungarian R1a-Z280 subclades represent Ugric lineages in common with ancient R1a-Z645 Finno-Ugric populations from north-eastern Europe and the Trans-Urals? Yep (see Avars and Ugrians).

❌ Modern Hungarian R1a-Z93 lineages represent acculturated Iranian/Turkic peoples from the steppes? Not likely.

✅ Modern Hungarian R1a-Z93 lineages represent a remnant of the expansion of Corded Ware to the east, potentially more clearly associated with Samoyedic? Much more likely.

finno-ugric-haplogroup-n
Map of archaeological cultures in north-eastern Europe ca. 8th-3rd centuries BC. [The Mid-Volga Akozino group not depicted] Shaded area represents the Ananino cultural-historical society. Fading purple arrows represent likely stepped movements of subclades of haplogroup N for centuries (e.g. Siberian → Ananino → Akozino → Fennoscandia [N-VL29]; Circum-Arctic → forest-steppe [N1, N2]; etc.). Blue arrows represent eventual expansions of Uralic peoples to the north. Modified image from Vasilyev (2002).

Sooo, the theory of a “diluted” Y-DNA in Modern Hungarians from originally fully N-dominated conquerors subjugating native R1a-Z280 Slavs from the Carpathian Basin is not backed up by genetic studies? The ethnic Iranian-Turkic R1a-Z93 federation in the steppes that ended up speaking Magyar is not real?? Who would’ve thunk.

Another true story whose rejection in genetics could not be predicted, like, not at all.

Totally unexpected, too, the drift of “R1a=IE” fans with the newest genetic findings towards a Molgen-like “Yamna/R1b = Vasconic-Caucasian”, “N1c = Uralic-Altaic”, and “R1a = the origin of the white world in Mother Russia”. So much for the supposed interest in “Steppe ancestry” and fancy statistics.

Related

Common pitfalls in human genomics and bioinformatics: ADMIXTURE, PCA, and the ‘Yamnaya’ ancestral component

invasion-from-the-steppe-yamnaya

Good timing for the publication of two interesting papers that a lot of people should read very carefully:

ADMIXTURE

Open access A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots, by Daniel J. Lawson, Lucy van Dorp & Daniel Falush, Nature Communications (2018).

Interesting excerpts (emphasis mine):

Experienced researchers, particularly those interested in population structure and historical inference, typically present STRUCTURE results alongside other methods that make different modelling assumptions. These include TreeMix, ADMIXTUREGRAPH, fineSTRUCTURE, GLOBETROTTER, f3 and D statistics, amongst many others. These models can be used both to probe whether assumptions of the model are likely to hold and to validate specific features of the results. Each also comes with its own pitfalls and difficulties of interpretation. It is not obvious that any single approach represents a direct replacement as a data summary tool. Here we build more directly on the results of STRUCTURE/ADMIXTURE by developing a new approach, badMIXTURE, to examine which features of the data are poorly fit by the model. Rather than intending to replace more specific or sophisticated analyses, we hope to encourage their use by making the limitations of the initial analysis clearer.

The default interpretation protocol

Most researchers are cautious but literal in their interpretation of STRUCTURE and ADMIXTURE results, as caricatured in Fig. 1, as it is difficult to interpret the results at all without making several of these assumptions. Here we use simulated and real data to illustrate how following this protocol can lead to inference of false histories, and how badMIXTURE can be used to examine model fit and avoid common pitfalls.

admixture-protocol
A protocol for interpreting admixture estimates, based on the assumption that the model underlying the inference is correct. If these assumptions are not validated, there is substantial danger of over-interpretation. The “Core protocol” describes the assumptions that are made by the admixture model itself (Protocol 1, 3, 4), and inference for estimating K (Protocol 2). The “Algorithm input” protocol describes choices that can further bias results, while the “Interpretation” protocol describes assumptions that can be made in interpreting the output that are not directly supported by model inference

Discussion

STRUCTURE and ADMIXTURE are popular because they give the user a broad-brush view of variation in genetic data, while allowing the possibility of zooming down on details about specific individuals or labelled groups. Unfortunately it is rarely the case that sampled data follows a simple history comprising a differentiation phase followed by a mixture phase, as assumed in an ADMIXTURE model and highlighted by case study 1. Naïve inferences based on this model (the Protocol of Fig. 1) can be misleading if sampling strategy or the inferred value of the number of populations K is inappropriate, or if recent bottlenecks or unobserved ancient structure appear in the data. It is therefore useful when interpreting the results obtained from real data to think of STRUCTURE and ADMIXTURE as algorithms that parsimoniously explain variation between individuals rather than as parametric models of divergence and admixture.

For example, if admixture events or genetic drift affect all members of the sample equally, then there is no variation between individuals for the model to explain. Non-African humans have a few percent Neanderthal ancestry, but this is invisible to STRUCTURE or ADMIXTURE since it does not result in differences in ancestry profiles between individuals. The same reasoning helps to explain why for most data sets—even in species such as humans where mixing is commonplace—each of the K populations is inferred by STRUCTURE/ADMIXTURE to have non-admixed representatives in the sample. If every individual in a group is in fact admixed, then (with some exceptions) the model simply shifts the allele frequencies of the inferred ancestral population to reflect the fraction of admixture that is shared by all individuals.
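The point about shared admixture being invisible can be checked with a toy calculation. The sketch below assumes the usual admixture-model expectation (an individual's expected allele frequency at a locus is the ancestry-weighted average of the ancestral frequencies); all numbers are invented, and `expected_allele_freqs` is a hypothetical helper, not part of STRUCTURE or ADMIXTURE.

```python
from typing import List


def expected_allele_freqs(q: List[float], p: List[List[float]]) -> List[float]:
    """Expected per-locus allele frequency for ancestry vector q and K x L frequencies p."""
    n_loci = len(p[0])
    return [sum(q[k] * p[k][j] for k in range(len(q))) for j in range(n_loci)]


# Invented allele frequencies at three loci for two ancestral populations.
p_two = [[0.10, 0.50, 0.90],   # population A
         [0.80, 0.20, 0.40]]   # population B
shared_q = [0.97, 0.03]        # every individual carries the same 3% from B

# Model 1 (K = 2): all individuals share the same ancestry vector.
model_two = expected_allele_freqs(shared_q, p_two)

# Model 2 (K = 1): one "unadmixed" population whose frequencies are shifted.
p_shifted = [0.97 * a + 0.03 * b for a, b in zip(p_two[0], p_two[1])]
model_one = expected_allele_freqs([1.0], [p_shifted])

print(model_two)
print(model_one)
```

Both models predict the same allele frequencies for every individual, so the data cannot favour the two-population description; the simpler model absorbs the shared admixture into its ancestral frequencies, as the excerpt describes.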

Several methods have been developed to estimate K, but for real data, the assumption that there is a true value is always incorrect; the question rather being whether the model is a good enough approximation to be practically useful. First, there may be close relatives in the sample which violates model assumptions. Second, there might be “isolation by distance”, meaning that there are no discrete populations at all. Third, population structure may be hierarchical, with subtle subdivisions nested within diverged groups. This kind of structure can be hard for the algorithms to detect and can lead to underestimation of K. Fourth, population structure may be fluid between historical epochs, with multiple events and structures leaving signals in the data. Many users examine the results of multiple K simultaneously but this makes interpretation more complex, especially because it makes it easier for users to find support for preconceptions about the data somewhere in the results.

In practice, the best that can be expected is that the algorithms choose the smallest number of ancestral populations that can explain the most salient variation in the data. Unless the demographic history of the sample is particularly simple, the value of K inferred according to any statistically sensible criterion is likely to be smaller than the number of distinct drift events that have practically impacted the sample. The algorithm uses variation in admixture proportions between individuals to approximately mimic the effect of more than K distinct drift events without estimating ancestral populations corresponding to each one. In other words, an admixture model is almost always “wrong” (Assumption 2 of the Core protocol, Fig. 1) and should not be interpreted without examining whether this lack of fit matters for a given question.

admixture-pitfalls
Three scenarios that give indistinguishable ADMIXTURE results. a Simplified schematic of each simulation scenario. b Inferred ADMIXTURE plots at K= 11. c CHROMOPAINTER inferred painting palettes.

Because STRUCTURE/ADMIXTURE accounts for the most salient variation, results are greatly affected by sample size in common with other methods. Specifically, groups that contain fewer samples or have undergone little population-specific drift of their own are likely to be fit as mixes of multiple drifted groups, rather than assigned to their own ancestral population. Indeed, if an ancient sample is put into a data set of modern individuals, the ancient sample is typically represented as an admixture of the modern populations (e.g., ref. 28,29), which can happen even if the individual sample is older than the split date of the modern populations and thus cannot be admixed.

This paper was already available as a preprint in bioRxiv (first published in 2016) and it is incredible that it needed to wait all this time to be published. I found it weird how reviewers focused on the “tone” of the paper. I think it is great to see files from the peer review process published, but we need to know who these reviewers were, to understand their whiny remarks… A lot of geneticists out there need to develop a thick skin, or else we are going to see more and more delays based on a perceived incorrect tone towards the field, which seems a rather subjective reason to force researchers to correct a paper.

PCA of SNP data

Open access Effective principal components analysis of SNP data, by Gauch, Qian, Piepho, Zhou, & Chen, bioRxiv (2018).

Interesting excerpts:

A potential hindrance to our advice to upgrade from PCA graphs to PCA biplots is that the SNPs are often so numerous that they would obscure the Items if both were graphed together. One way to reduce clutter, which is used in several figures in this article, is to present a biplot in two side-by-side panels, one for Items and one for SNPs. Another stratagem is to focus on a manageable subset of SNPs of particular interest and show only them in a biplot in order to avoid obscuring the Items. A later section on causal exploration by current methods mentions several procedures for identifying particularly relevant SNPs.

One of several data transformations is ordinarily applied to SNP data prior to PCA computations, such as centering by SNPs. These transformations make a huge difference in the appearance of PCA graphs or biplots. A SNPs-by-Items data matrix constitutes a two-way factorial design, so analysis of variance (ANOVA) recognizes three sources of variation: SNP main effects, Item main effects, and SNP-by-Item (S×I) interaction effects. Double-Centered PCA (DC-PCA) removes both main effects in order to focus on the remaining S×I interaction effects. The resulting PCs are called interaction principal components (IPCs), and are denoted by IPC1, IPC2, and so on. By way of preview, a later section on PCA variants argues that DC-PCA is best for SNP data. Surprisingly, our literature survey did not encounter even a single analysis identified as DC-PCA.
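As a rough illustration of what double-centering does, the sketch below removes row means and column means from a small matrix and adds back the grand mean, leaving only the interaction term. It is plain Python on invented 0/1 data and does not reproduce the authors' software; a full DC-PCA would follow this step with a singular value decomposition, which is omitted here.

```python
from typing import List

Matrix = List[List[float]]


def double_center(matrix: Matrix) -> Matrix:
    """Subtract row and column main effects, leaving the row-by-column interaction."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n_cols)]
            for i in range(n_rows)]


# Invented SNPs-by-Items matrix (rows = SNPs, columns = Items), coded 0/1.
snp_by_item = [[1, 0, 0, 1],
               [0, 0, 1, 1],
               [1, 1, 0, 0]]

centered = double_center(snp_by_item)
for row in centered:
    print([round(value, 3) for value in row])
```

After the transformation every row and every column averages to zero, so neither SNP main effects nor Item main effects can dominate the leading components.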

The axes in PCA graphs or biplots are often scaled to obtain a convenient shape, but actually the axes should have the same scale for many reasons emphasized recently by Malik and Piepho [3]. However, our literature survey found a correct ratio of 1 in only 10% of the articles, a slightly faulty ratio of the larger scale over the shorter scale within 1.1 in 12%, and a substantially faulty ratio above 2 in 16% with the worst cases being ratios of 31 and 44. Especially when the scale along one PCA axis is stretched by a factor of 2 or more relative to the other axis, the relationships among various points or clusters of points are distorted and easily misinterpreted. Also, 7% of the articles failed to show the scale on one or both PCA axes, which leaves readers with an impressionistic graph that cannot be reproduced without effort. The contemporary literature on PCA of SNP data mostly violates the prohibition against stretching axes.

pca-how-to
DC-PCA biplot for oat data. The gradient in the CA-arranged matrix in Fig 13 is shown here for both lines and SNPs by the color scheme red, pink, black, light green, dark green.

The percentage of variation captured by each PC is often included in the axis labels of PCA graphs or biplots. In general this information is worth including, but there are two qualifications. First, these percentages need to be interpreted relative to the size of the data matrix because large datasets can capture a small percentage and yet still be effective. For example, for a large dataset with over 107,000 SNPs for over 6,000 persons, the first two components capture only 0.3693% and 0.117% of the variation, and yet the PCA graph shows clear structure (Fig 1A in [4]). Contrariwise, a PCA graph could capture a large percentage of the total variation, even 50% or more, but that would not guarantee that it will show evident structure in the data. Second, the interpretation of these percentages depends on exactly how the PCA analysis was conducted, as explained in a later section on PCA variants. Readers cannot meaningfully interpret the percentages of variation captured by PCA axes when authors fail to communicate which variant of PCA was used.

Conclusion

Five simple recommendations for effective PCA analysis of SNP data emerge from this investigation.

  1. Use the SNP coding 1 for the rare or minor allele and 0 for the common or major allele.
  2. Use DC-PCA; for any other PCA variant, examine its augmented ANOVA table.
  3. Report which SNP coding and PCA variant were selected, as required by contemporary standards in science for transparency and reproducibility, so that readers can interpret PCA results properly and reproduce PCA analyses reliably.
  4. Produce PCA biplots of both Items and SNPs, rather than merely PCA graphs of only Items, in order to display the joint structure of Items and SNPs and thereby to facilitate causal explanations. Be aware of the arch distortion when interpreting PCA graphs or biplots.
  5. Produce PCA biplots and graphs that have the same scale on every axis.
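Recommendations 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions. It assumes a binary 0/1 matrix with Items as rows and SNPs as columns, and it takes DC-PCA to mean PCA applied to a double-centered matrix. The function names are hypothetical and the decomposition step itself is omitted.

```python
def code_minor_allele(genotypes):
    """Recode a 0/1 SNP matrix (rows = Items, columns = SNPs) so that
    1 marks the rarer allele in each column and 0 the commoner one."""
    n_rows = len(genotypes)
    coded = [row[:] for row in genotypes]
    for j in range(len(genotypes[0])):
        ones = sum(row[j] for row in genotypes)
        if ones > n_rows / 2:          # 1 is currently the major allele
            for row in coded:
                row[j] = 1 - row[j]    # flip the whole column
    return coded


def double_center(matrix):
    """Subtract row and column means and add back the grand mean,
    so that every row and every column of the result averages zero."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]


# Hypothetical toy data: 4 Items x 3 SNPs
raw = [[1, 0, 1],
       [1, 0, 0],
       [1, 1, 1],
       [0, 0, 1]]
coded = code_minor_allele(raw)
centered = double_center(coded)
print(coded)
```

The double-centered matrix would then be passed to a singular value decomposition to obtain the scores for Items and SNPs that a biplot displays.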

I read the referenced paper Biplots: Do Not Stretch Them!, by Malik and Piepho (2018), and even though it is not directly applicable to the most commonly available PCA graphs out there, it is a good reminder of the distorting effects of stretching. See, for example, the recent PCA in Krause-Kyora et al. (2018), where Corded Ware and BBC samples from Central Europe appear to cluster with samples from Yamna:

NOTE. This is related to a vertical distortion (i.e. horizontal stretching), but possibly also to the addition of one or more distant outlier samples.

pca-cwc-yamna-bbc
Principal Component Analysis (PCA) of the human Karsdorf and Sorsum samples together with previously published ancient populations projected on 27 modern day West Eurasian populations (not shown) based on a set of 1.23 million SNPs (Mathieson et al., 2015). https://doi.org/10.7554/eLife.36666.006

The so-called ‘Yamnaya’ ancestry

Every time I read papers like these, I remember commenters who kept swearing that genetics was the ultimate science that would solve anthropological problems, where unscientific archaeology and linguistics could not. Well, it seems that, like radiocarbon analysis, these promising developing methods still need a lot of refinement to achieve something meaningful, and that they mean nothing without traditional linguistics and archaeology… But we already knew that.

Also, if this is happening in most peer-reviewed publications, made by professional geneticists, in journals of high impact factor, you can only wonder how many more errors and misinterpretations can be found in the obscure market of so many amateur geneticists out there. Because amateur geneticist is a commonly used misnomer for people who are not geneticists (since they don’t have the most basic education in genetics), and some of them are not even ‘amateurs’ (because they are selling the outputs of bioinformatic tools)… It’s like calling healers ‘amateur doctors’.

NOTE. While everyone involved in population genetics is interested in knowing the truth, and we all have our confirmation (and other kinds of) biases, for those who get paid to tell people what they want to hear, and who have sold lots of wrong interpretations already, the incentives of ‘being right’ – and thus getting involved in crooked and paranoid behaviour regarding different interpretations – are as strong as the money they can win or lose by promoting themselves and selling more ‘product’.

As a reminder of how badly these wrong interpretations of genetic results – and the influence of the so-called ‘amateurs’ – can reflect on research groups, yet another turn of the screw by the Copenhagen group, in the oral presentations at Languages and migrations in pre-historic Europe (7-12 Aug 2018), organized by the Copenhagen University. The common theme seems to be that Bell Beaker and thus R1b-L23 subclades do represent a direct expansion from Yamna now, as opposed to being derived from Corded Ware migrants, as they supported before.

NOTE. Yes, the “Yamna → Corded Ware → Únětice / Bell Beaker” migration model is still commonplace in the Copenhagen workgroup. Yes, in 2018. Guus Kroonen had already admitted they were wrong, and it was already changed in the graphic representation accompanying a recent interview with Willerslev. However, since there is still no official retraction by anyone, it seems that each member has to reject the previous model in their own way, and at their own pace. I don’t think we can expect anyone at this point to accept responsibility for their wrong statements.

So their lead archaeologist, Kristian Kristiansen, in The Indo-Europeanization of Europé (sic):

kristiansen-migrations
Kristiansen’s (2018) map of Indo-European migrations

I love the newly invented arrows of migration from Yamna to the north to distinguish among dialects attributed by them to CWC groups, and the intensive use of materials from Heyd’s publications in the presentation, which means they understand he was right – except for the fact that they are used to support a completely different theory, radically opposed to those defended in Heyd’s model.

Now added to the Copenhagen’s unending proposals of language expansions, some pearls from the oral presentation:

  • Corded Ware north of the Carpathians of R1a lineages developed Germanic;
  • R1b borugh [?] Italo-Celtic;
  • the increase in steppe ancestry on north European Bell Beakers mean that they “were a continuation of the Yamnaya/Corded Ware expansion”;
  • “Corded Ware groups [] stopped their expansion and took over the Bell Beaker package before migrating to England” [yep, it literally says that];
  • Italo-Celtic expanded to the UK and Iberia with Bell Beakers [I guess that included Lusitanian in Iberia, but not Messapian in Italy; or the opposite; or nothing like that, who knows];
  • 2nd millennium BC Bronze Age Atlantic trade systems expanded Proto-Celtic [yep, trade systems expanded the language];
  • 1st millennium BC expanded Gaulish with La Tène, including a “Gaulish version of Celtic to Ireland/UK” [hmmm, dat British Gaulish indeed].

You know, because, why the hell not? A logical, stable, consequential, no-nonsense approach to Indo-European migrations, as always.

Also, compare still more invented arrows of migrations, from Mikkel Nørtoft’s Introducing the Homeland Timeline Map, going against Kristiansen’s multiple arrows, and even against their own recent fantasy map series in showing Bell Beakers stem from Yamna instead of CWC (or not, you never truly know what arrows actually mean):

corded-ware-migrations
Nørtoft’s (2018) maps of Indo-European migrations.

I really, really loved that perennial arrow of migration from Volosovo, ca. 4000-800 BC (3000+ years, no less!), representing Uralic?, like that, without specifics – which is like saying, “somebody from the eastern forest zone, somehow, at some time, expanded something that was not Indo-European to Finland, and we couldn’t care less, except for the fact that they were certainly not R1a”.

This and Kristiansen’s arrows are the most comical invented migration routes of 2018; and that is saying something, given the dozens of similar maps that people publish in forums and blogs each week.

NOTE. You can read a more reasonable account of how haplogroup R1b-L51 and R1a-Z645 subclades expanded, and which dialects most likely expanded with them.

We don’t know where these scholars of the Danish workgroup stand at this moment, or if they ever had (or intended to have) a common position – beyond their persistent ideas of Yamnaya™ ancestral component = Indo-European and R1a must be Indo-European – because each new publication changes some essential aspects without expressly stating so, and thus makes everything still messier.

It’s hard to accept that this is a series of presentations made by professional linguists, archaeologists, and geneticists, as stated by the official website, and still harder to imagine that they collaborate within the same professional workgroup, which includes experienced geneticists and academics.

I propose the following video to close future presentations introducing innovative ideas like those above, to help the audience find the appropriate mood:

On the origin of haplogroup R1b-L51 in late Repin / early Yamna settlers

steppe-eneolithic-migrations

A recent comment on the hypothetical Central European origin of PIE helped me remember that, when news appeared that R1b-L51 had been found in Khvalynsk ca. 4250-4000 BC, I began to think about alternative scenarios for the expansion of this haplogroup, with one of them including Central Europe.

Because, if YFull’s (and Iain McDonald’s) estimation of the split of R1b-L23 into L51 and Z2103 (ca. 4100 BC, TMRCA ca. 3700 BC) was wrong, by as much as the R1a-Z645 estimates proved wrong, and both subclades were older than expected, then maybe R1b-L51 was not part of the Yamna expansion, but rather part of an earlier expansion with Suvorovo-Novodanilovka into central Europe.

That is, R1b-L51 and R1b-Z2103 would have expanded with Khvalynsk-Novodanilovka migrants, and they would have either disappeared among local populations, or settled and expanded with successful lineages in certain regions. I think this may give rise to two potential models.

A hidden group in the European east-central steppes?

Here is what Heyd (2011), for example, has to say about the effect of the Khvalynsk-Novodanilovka expansion in the 4th millennium BC, with the first Kurgan wave that shattered the social, economic, and cultural foundations of south-eastern Europe (before the expansion of west Yamna migrants in the region):

indo-european-anatolian-uralic-migrations
Proto-Anatolian migrations with Khvalynsk-Novodanilovka expansion, including ADMIXTURE data from Wang et al. (2018).

As the Boleraz and Baden tumuli cases in Serbia and Hungary demonstrate, there are earlier, 4th millennium cal. B.C. round tumuli in the Carpathian basin. There are also earlier north-Pontic steppe populations who infiltrated similar environments west of the Black Sea prior to the rise of the Yamnaya culture. This situation can be traced back to the 2nd half of the 5th millennium cal. B.C. to a group of distinct burials, zoomorphic maceheads, long flint blades, triangular flint points, etc., summarized under the term Suvorovo-Novodanilovka (Govedarica 2004; Rassamakin 2004; Anthony 2007; Heyd forthcoming 2011). They also erected round personalized tumuli, though smaller in size and height, above inhumations of single individuals. Suvorovo and Casimcea are the key examples in the lower Danube region of Romania. In northeast Bulgaria, the primary grave of Polska Kosovo (ochre-stained supine extended body position: information communicated by S. Alexandrov) can also be seen as such, as should the Targovishte-“Gonova mogila” primary grave 1 in the Thracian plain with a burial arranged in a supine position with flexed legs, southeast-northwest orientated, and strewed with ochre (Kanchev 1991, p. 56-57; Ivanova Gaydarska 2007). In addition to the many copper and shell beads, the 17.4 cm long obsidian blade is exceptional, which links this grave to the Csongrád-“Kettoshalom” grave in the south Hungarian plain (Ecsedy 1979). It also yielded an obsidian blade (13.2 cm long) and copper, shell and limestone beads.

suvorovo-novodanilovka-expansion-europe
The Southeast European distribution of graves of the Suvorovo-Novodanilovka group and such unequipped ones mentioned in the text which can be attributed by burial custom and stratigraphic position in the barrow, plus zoomorphic and abstract animal head sceptres as well as specific maceheads with knobs as from Decea Maresului (mid-5th millennium until around 4000 BC). Heyd (2016).

However, no traces of a tumulus have been recorded above the Kettoshalom tomb. Conventionally, it is dated to the Bodrogkeresztur-period in east Hungary, shortly after 4000 cal. B.C., which would correspond very well with the suggested Cernavodă I (or its less known cultural equivalent in the Thracian plain) attribution for the “Gonova mogila” grave, a cultural background to which the Csongrád grave should have also belonged. Bodrogkeresztur and Cernavodă I periods are not the only examples of 4th millennium cal. B.C. tumuli and burials displaying this steppe connection. Indeed we can find this early steppe impact throughout the 4th millennium cal. B.C. These include adscriptions to the Horodiștea II (Corlateni-Dealul Stadole, grave I: Burtanescu 1998, p. 37; Holbocai, grave 34: Comșa 1998, p. 16); to Gordinești-Cernavodă II (Liești-Movila Arbănașu, grave 22: Brudiu 2000); to Gorodsk-Usatovo (Corlăteni Dealul Cetăţii, grave I: Comșa 1998, p. 17-18, in Romania; Durankulak, grave 982: Vajsov 2002, in Bulgaria); and to Cernavodă III (Golyama Detelina, tum. 4: Leshtakov, Borisov 1995), and early (end of 4th millennium cal. B.C.) Ezero in Ovchartsi, primary grave (Kalchev 1994, p. 134-138) and Golyama Detelina, tum. 2 (Kanchev 1991) in Bulgaria. Also the Boleráz and Baden tumuli of Banjevac-Tolisavac and Mokrin in the south Carpathian basin account for this, since one should perhaps take into account primary grave 12 of the Sárrédtudavari-Orhalom tumulus in the Hungarian Alfold: a left-sided crouched juvenile (15-17 y) individual in an oval, NW-SE orientated grave pit 14C dated to 3350-3100 cal. B.C. at 2 sigma (Dani, Nepper 2006). Neither the burial custom (no ochre strewing or depositing a lump of ochre has been recorded), nor date account for its ascription to the Yamnaya!

All of these tumuli and burials demonstrate, though, that there is already a constant but perhaps low-level 4th millennium cal. B.C. steppe interaction, linking the regions of the north of the Black Sea with those of the west, and reaching deep into the Carpathian basin. This has to be acknowledged, even if these populations remain small, bounded to their steppe habitat with an economy adapted to this special environment, and are not always visible in the record. Indirect hints may help in seeing them, such as the frequent occurrence of horse bones, regarded as deriving from domesticated horses, in Hungarian Baden settlements (Bokonyi 1978; Benecke 1998), and in those of the south German Cham Culture (Matuschik 1999, p. 80-82) and the east German Bernburg Culture (Becker 1999; Benecke 1999). These occur, however, always in low numbers, perhaps not enough to maintain and regenerate a herd. Does this point us towards otherwise archaeologically hidden horsebreeders in the Carpathian basin, before the Yamnaya? In any case, I hope to make one case clear: these are by no means Yamnaya burials in the strict definition! Attribution to the Yamnaya in its strict definition applies.

pit-graves-central-europe
Distribution of Pit-Grave burials west of the Black Sea likely dating to the 2nd half of the 4th millennium BC (triangles: side-crouched burials; filled circles: supine extended burials; open circles: suspected). In Alin Frînculeasa, Bianca Preda, Volker Heyd, Pit-Graves, Yamnaya and Kurgans along the Lower Danube.

Also, about the expansion of Yamna settlers along the steppes:

However, it should have been made clear by the distribution map of the Western Yamnaya that they were confining themselves solely to their own, well-known, steppe habitat and therefore not occupying, or pushing away and expelling, the locally settled farming societies. Also, living solely in the steppes requires another lifestyle, and quite different economic and social bases, most likely very different to the established farming societies. Although surely regarded as incoming strangers, they may therefore not have been seen as direct competitors. This argument can be further enforced when remembering that the lowlands and the steppes in the southeast of Europe had already been populated throughout the 4th millennium cal. B.C., as demonstrated above, by societies with a similar north-Pontic steppe origin and tradition, albeit in lower numbers. It is only for these groups that the Yamnaya may have become a threat, but their common origin and perhaps a similar economic/ social background with comparable lifestyles would surely have assisted to allow rapid assimilation. More important, though, is that farming societies in this region may therefore have been accustomed to dealing and interacting with different people and ethnic strangers for a long time. (…)

When assessing farming and steppe societies’ interaction from a general point of view, attitudes can diverge in three main directions:

  1. the violent one; with raids, fights, struggles, warfare, suppression and finally the superiority and exploitation of the one over the other;
  2. the peaceful one; with a continuous exchange of gifts, goods, work, information and genes in a balanced reciprocal system, leading eventually to the merging of the two societies and creation of a new identity;
  3. the neutral one; with the two societies ignoring each other for a long time.

What we see from trying to understand the record of the Yamnaya, based on their tumuli and burials, and the local and neighbouring contemporary societies, based on their settlements, hoards, and graves, is likely a mixture of all three scenarios, with the balance perhaps more towards exchange in a highly dynamic system with alterations over time. However, violence and raids cannot be ruled out; they would be difficult to see in the archaeological record; or only indirectly, such as the building of hill forts, particularly the defence-like chain of Vucedol hillforts along the south shore of the Danube on the Serbian/Croatian border zone (Tasic 1995a), and the retreat of people into them (Falkenstein 1998, p. 261-262), with other interpretations also possible. And finally, we are dealing here with very different local and neighbouring societies, as well as with more distant contemporary ones, looking, in reality, rather like a chequer board of societies and archaeological cultures (see Parzinger 1993 for the overview). These display different regional backgrounds and traditions leading to different social and settlement organizations, different economic bases and material cultures in the wide areas between Prut and Maritza rivers, and Black Sea and Tisza river. They surely found their individual way of responding to the incoming and settling Yamnaya people.

yamna-tumuli-west-carpathians
Yamnaya tumuli signalling the expansion of West Yamna from ca. 3100 BC (especially after ca. 2950 BC). Heyd (2011).

The best data we have about this potential non-Yamna origin of R1b-L51 – and thus in favour of its admixture in the Carpathian basin – lies in:

  1. The majority of R1b-Z2103 subclades found to date among Yamna samples.
  2. The presence of R1b-Z2103 in the Catacomb culture – in the Northern Caucasus and in Ukraine.
  3. The limited presence of (ancient and modern) R1b-L51 in eastern Europe and India, whose isolated finds are commonly (and simplistically) attributed to ‘late migrations’.
  4. The presence of R1b-L51 (xZ2103) in cultures related to the ‘Yamna package’, but supposedly not to Yamna settlers. So for example I7043, of haplogroup R1b-L151(xU106,xP312), ca. 2500-2200 BC from Szigetszentmiklós-Üdülősor, probably from the Bell Beaker (Csepel group), but maybe from the early Nagyrév culture.
  5. The expansion of its subclades apparently only from a single region, around the Carpathian basin, in contrast to R1b-Z2103.
  6. The already ‘diluted’ steppe admixture found in the earliest samples with respect to Yamna, which points to the appearance after the Yamna admixture with the local population.
  7. Ukrainian archaeologists (in contrast to their Russian colleagues) point to the relevance of North Pontic cultures like Kvitjana and Lower Mikhailovka in the development of Early Yamna in the west, and some eastern European researchers also believe in this similarity.
  8. If R1b-Z2103 and R1b-L51 had expanded with Suvorovo-Novodanilovka migrants to the west, and had admixed later as Hungary_LCA-LBA-like peoples with Yamna migrants during the long-term contacts with other ‘kurganized cultures’ ca. 2900-2500 BC in the Great Hungarian Plains, it could explain some peculiar linguistic traits of North-West Indo-European, and also why R1b-Z2103 appears in cultures associated with this earlier ‘steppe influence’ (i.e. not directly related to Yamna) such as Vučedol (with a R1b-Z2103 sample, see below). That could also explain the presence of R1b-L151(xP312, xU106) in similar Balkan cultures, possibly not directly related to Yamna.
PCA-r1b-l51
Image modified from Wang et al. (2018). PCA of ancient and modern samples. Red circle in dashed line around Varna, Greece Neolithic, and (approximate position of) Smyadovo outliers, part of Khvalynsk-Novodanilovka settlers.

A hidden group among north or west Pontic Eneolithic steppe cultures?

The expansion of Khvalynsk as Novodanilovka into the North Pontic area happened through the south across the steppe, near the coast, with the forest-steppe region working as a clear natural border for this culture of likely horse-riding chieftains, whose economy was probably based on some rudimentary form of mobile pastoralism.

Although archaeologists are divided as to the origin of each individual Middle Eneolithic group near the Black Sea after the end of the Khvalynsk-Novodanilovka period, it seems more or less clear that steppe cultures like Cernavodă, Lower Mikhailovka, or Kvitjana are closer (or “more archaic”) in their steppe features, which connects them to Volga–Ural and Northern Caucasus cultures, like Northern Caucasus, Repin or Khvalynsk.

On the other hand, forest-steppe cultures like Dereivka (including Alexandria) show innovative traits and contacts with para- or sub-Neolithic cultures to the north, like Comb-Pit Ware groups, apart from corded decoration influenced by Trypillian groups to the west, especially in their later (‘Proto-Corded Ware‘) stage after ca. 3500 BC.

If Ukrainian researchers like Rassamakin are right, Early Yamna expanded not only from Repin settlers, but also from local steppe cultures adopting Repin traits to develop an Early Yamna culture, similar to how eastern (Volga–Ural) groups seem to have synchronously adopted Early Yamna without a massive influx of Repin settlements.

Furthermore, local traits develop in southern groups, like anthropomorphic stelae (shared with Kemi-Oba, direct heir of Lower Mikhailovka), and rich burials featuring wagons. These traits are seen in west Yamna settlers.

north-pontic-kvityana-dereivka-repin
Modified from Rassamakin (1999), adding red color to Repin expansion. The system of the latest Eneolithic Pontic cultures and the sites of the Zhivotilovo-Volchanskoe type: 1) Volchanskoe; 2) Zhivotilovka; 3) Vishnevatoe; 4) Koisug.

Problems of this model include:

  1. In the North Pontic area – in contrast to the Volga–Ural region – there was a clear “colonization” wave of Repin settlers, also supported by Ukrainian researchers, based on the number of new settlements and burials, and on the progressive retreat of Dereivka, Kvitjana, as well as (more recent) Maykop- and Trypillia-related groups from the North Pontic area ca. 3350/3300 BC. It seems unlikely that these expansionist, semi-nomadic, cattle-breeding, patrilineally-related steppe clans that were driving all native populations out of their territories suddenly decided, at some point during their spread into the North Pontic area ca. 3300-3100 BC, to join forces with some foreign male lineages from the area, and then continue their expansion to the west…
  2. Similar to the fate of R1b-P297 subclades in the Baltic after the expansion of Corded Ware migrants, previous haplogroups of the North Pontic region – such as R1a, R1b-V88, and I2 subclades – basically disappeared from the ancient DNA record after the expansion of Khvalynsk-Novodanilovka, and then after the expansion of Yamna, as is clear from Yamna, Afanasevo, and Bell Beaker samples obtained to date. This, in combination with what we know about Y-chromosome bottlenecks in post-Neolithic expansions, leaves little space to think that a big enough territorial group with a majority of “native” haplogroups could survive later expansions (be it R1b-L51 or R1a-Z645).
  3. Supporting an expansion of the same male (and partly female) population, the Yamna admixture from east to west is quite homogeneous, with the only difference found in (non-significant) EEF-like proportion which becomes elevated in distant areas [apart from significant ‘southern’ contribution to certain outlier samples]. Based on the also homogeneous Y-DNA picture, the heterogeneity must come, in general, from the female exogamy practiced by expanding groups.
  4. There is a short period, spanning some centuries (approximately 3300-2700 BC), in which the North Pontic area – especially the forest-steppe territories to the west of the Dnieper, i.e. the Upper Dniester, Boh, and Prut-Siret areas – is a chaos of incoming and emigrating, expanding and shrinking groups of different cultures, such as late Trypillian groups, Maykop-related traits, TRB, GAC, (Proto-)Corded Ware, and Early Yamna settlements. No natural geographic frontier can be delimited between these groups, which probably interacted in different ways. Nevertheless, based on their cultural traits, admixture, and especially on their Y-DNA, it seems that they never incorporated foreign male lineages, beyond those they probably had during their initial expansion trends.
  5. The further expansionist waves of Early Yamna seen ca. 3100 BC, from the Danube Delta to the west, give an overall image of continuously expanding patrilineal clans of R1b-M269 subclades since the Khvalynsk-Novodanilovka migration, in different periodic steps, mostly from eastern Pontic-Caspian nuclei, usually overriding all encountered cultures and (especially male) populations, rather than showing long-term collaboration and interaction. Such interaction is seen only in exceptional cases, e.g. the long-term admixture between Abashevo and Poltavka, as seen in Proto-Indo-Iranian peoples and their language.
PCA-Ukraine-r1b-l51
Image modified from Wang et al. (2018). PCA of ancient and modern samples. Arrows depicting Khvalynsk -> Yamna drift (blue), and hypothetic approximate Ukraine Eneolithic -> Yamna drift accompanying R1b-L51 (red).

Consequences

We are living through an exemplary ego-, (ethno-)nationalism-, and/or supremacy-deflating moment for some individuals of eastern and northern European descent who believed that R1a or ‘steppe ancestry proportions’ meant something special. The same can be said about those who had internalized some social or ethnolinguistic meaning for the origin of R1b in western Europe, N1c in north-eastern Europe, as well as Greeks, Iranians, Armenians, or Mediterranean peoples in general of ‘Near Eastern’ ancestry or haplogroups, or peoples of Near Eastern origin and/or language.

These people had linked their haplogroups or ancestry with some fantasy continuity of ‘their’ ancestral populations to ‘their’ territories or languages (or both), and all are being proven wrong.

Apart from teaching such people a lesson about what simplistic views are useful for – whether they are based on ABO or RH group, white skin, blond hair, blue eyes, lactase persistence, or on one’s own ancestry or Y-DNA haplogroup – it teaches the rest of us what can happen in the near future among western Europeans. Because, until recently, most western Europeans were comfortably settled thinking that our ancestors were some remnant population from an older, Palaeolithic or Mesolithic population, who acquired Indo-European languages by way of cultural diffusion in different periods, including only minor migrations.

Judging by what we can see now among some individuals of Northern and Eastern European descent, the only thing that can worsen the air of superiority among western Europeans is when they realize (within a few years, when all these stupid battles to control the narrative fade) that not only are they the cultural ‘heirs’ of the Graeco-Roman tradition that began with the Roman Empire, but that most of them are the direct patrilineal descendants of Khvalynsk, Yamna, Bell Beaker, and European Bronze Age peoples, and thus direct descendants of Middle PIE, Late PIE, and NWIE speakers.

steppe-chalcolithic-migrations
Steppe-related migrations ca. 3100-2600 BC with tentative linguistic identification.

The finding of R1b-L51 and R1b-Z2103 among expanding Suvorovo-Novodanilovka chieftains, with pockets of R1b-L51 remaining in steppe-like societies of the Balkans and the Carpathian Basin, would have beautifully complemented what we know about the East Yamna admixture with R1a-Z93 subclades (Uralic speakers) ca. 2600-2100 BC to form Proto-Indo-Iranian, and about the regional admixtures seen in the Balkans, e.g. in Proto-Greeks, with the prevalent J subclades of the region.

It would have meant an end to any modern culture or nation identifying themselves with the ‘true’ Late PIE and Yamna heirs, because these would be exclusively associated with the expansion of R1b-Z2103 subclades with late Repin, and later as the full-fledged Late PIE with Yamna settlers to south-east and central Europe, and to the southern Urals. The language would then obviously have undergone different changes in all these territories through long-lasting admixture with other populations. In that sense, it would have ended the ideas of supremacy in western Europe before they even began.

The most likely future

However limited the evidence, it nevertheless seems that R1b-L51 expanded with Yamna, based on the estimates for the haplogroups involved, and on marginal hints at the variability of L23 subclades within Yamna and neighbouring populations. If R1b-L51 expanded with West Repin / Early Yamna settlers, these are possible reasons why it has not yet been found among Yamna samples:

steppe-eneolithic-migrations
Simplified map of Repin expansions from ca. 3500/3400 BC.
  • The subclade division of Yamna settlers need not be 50:50 for L51:Z2103, either in time or in space. I think this is the simplistic view underlying many thoughts on this matter. Many different expanding patrilineal clans of L23 subclades may have been more or less successful in different areas, and non-Z2103 may have been in the minority, or more isolated relative to Z2103-clans among expanding peoples on the steppe, especially in the east. In fact, we usually talk in terms of “Z2103 vs. L51” as if
    1. these two were the only L23 subclades; and
    2. both had split and succeeded (expanding) synchronously;

    that is, as if there had not been multiple subclades of both haplogroups, and as if there had not been different expansion waves for hundreds of years stemming from different evolving nuclei, involving each time only limited (successful) clans. Many different subclades of haplogroups L23 (xZ2103, xL51), Z2103, and L51 must have been unsuccessful during the ca. 1,500 years of late Khvalynsk and late Repin-Early Yamna expansions in which they must have participated (for approximately 60-75 generations, based on a mean 20-25 years).

  • If we want to imagine a pocket of ‘hidden’ L51 for some region of the North Pontic or Carpathian region, the same can be imagined – and much more likely – for any unsampled territory of expanding late Repin/Early Yamna settlers from the Lower Don – Lower Volga region (probably already a mixed society of L51 and Z2103 subclades since their beginning, as the early Repin culture, ca. 3800 BC), with L51 clans being probably successful to the west.
  • The Repin culture expanded only in small, mobile settlements from the Lower Don – Lower Volga to the north, east, and south, starting ca. 3500/3400 BC, in the waves that eventually gave a rather early distant offshoot in the Altai region, i.e. Afanasevo, which starts ca. 3300 BC in the archaeological record. The majority of R1b-Z2103 subclades found to date in Afanasevo also supports either
    • a mixed Repin society, with Z2103-clans predominating among eastern settlers; or
    • a Repin society marked by haplogroup L51, and thus a cultural diffusion of late Repin/Early Yamna traits among neighbouring (Khvalynsk, Samara, etc.) groups of essentially the same (early Khvalynsk-Novodanilovka) genetic stock in the Volga–Ural region.

    Both options could justify a majority of Z2103 in the Lower Volga–Ural region, with the latter being supported by the scattered archaeological remains of late Repin in the region before the synchronous emergence of Early Yamna findings in the whole Pontic-Caspian steppe.

  • Most Z2103 from Yamna samples to date are from around 3100 BC (on average) onward, and from the right bank of the Lower Don to the east, particularly from the Lower Volga–Ural area (especially the Samara region), which – based on the center of expansion of late Repin settlers – may be depicting an artificially high Z2103-distribution of the whole Yamna community.
repin-expansion-khvalynsk-cultures
Repin expansion into the Volga–Ural region from ca. 3500/3400 BC. Map made by me based on maps and data from Morgunova (2014, 2016). Lopatino is marked with number 64.
  • Yamna sample I0443, R1b-L23 (Y410+, L51-), ca. 3300-2700 BCE from Lopatino II, points to an intermediate subclade between L23 and L51, near one of the supposed late Repin sites (based on kurgan burials with late Repin cultural traits) in the Samara region.
  • Other Balkan cultures potentially unrelated to the Yamna expansion also show Z2103 (and not only L51) subclades, like I3499 (ca. 2884-2666 calBC), of the Vučedol culture, from Beli Manastir-Popova zemlja, which points to the infiltration of Yamna peoples in other cultures. In any case, the appearance of R1b-L23 subclades in the region happens only after the Yamna expansion ca. 3100 BC, probably through intrusions into different neighbouring regions, if these Balkan cultures are not directly derived from Yamna settlements (which is probably the case of the Csepel Bell Beaker or early Nagyrév sample, see above).
  • The diversity of haplogroups found in or around the Carpathian Basin in Late Chalcolithic / Early Bronze Age samples, including L151(xP312, xU106), P312, U106, Z2103, makes it the most likely sink of Yamna settlers, who spread thus with expanding family clans of different R1b-L23 subclades.
  • Even though some Yamna vanguard groups are known to have expanded up to Saxony-Anhalt before ca. 2700 BC, haplogroup Z2103 seems to be restricted to more eastern regions, which suggests that R1b-L51 was already successful among expanding West Yamna clans in Hungary, which gave rise only later to expanding East Bell Beakers (overwhelmingly of L151 subclades). The source of R1b-L51 and L151 expansion over Z2103 must lie therefore in the West Yamna period, and not in the Bell Beaker expansion.
indo-european-uralic-migrations-yamna-gac
Yamna migrants ca. 3300-2600 BC. Most likely site of admixture with GAC circled in red.
  • The R1b-Z2103 found in Poltavka, Catacomb, and to the south point to a late migration displacing the western R1b-L51, only after the late Repin expansion. This is also seen in the steppe ancestry and R1b-Z2103 south of the Caucasus, in Hajji Firuz, which points to this route as a potential source of the supposed “Earliest Proto-Indo-Iranian” (the mariannu term) of the Near East. A similar replacement event happened some centuries later with expanding R1a-Z93 subclades from the east wiping out haplogroup R1b-Z2103 from the Pontic-Caspian steppe.
  • Many ancient samples from Khvalynsk, Northern Caucasus, Yamna, or later ones are reported simply as R1b-M269 or L23, without a clear subclade, so the simplistic ‘Yamna–Z2103’ picture is not real: if one takes into account that Z2103 might have been successful quite early in the eastern region, it is more likely to obtain a successful Y-SNP call of a Z2103 subclade in the Volga–Ural region than a xZ2103 one.
  • There are some modern samples of R1b-L51 in eastern Europe and Asia, whose common simplistic attribution to “late expansions” is usually not substantiated; and also ancient R1b-L51 samples might be confirmed soon for Asia.
  • ‘Western’ features described by archaeologists for West Yamna settlers, associated with Kemi Oba and southern Yamna groups in the North Pontic area – like rich burials with anthropomorphic stelae and wagons – are actually absent in burials from settlers beyond Bulgaria, which does not support their affiliation with these local steppe groups of the Black Sea. Also, a mix with local traditions is seen across all Early Yamna groups of the Pontic-Caspian steppe, and still genetics and common cultural traits point to their homogenization under the same patrilineal clans expanding continuously for centuries. The maintenance of local traditions (as evidenced by East Bell Beakers in Iberia related to Iberian Proto-Beakers) is often not a useful argument in genetics, especially when the female population is not replaced.
yamna-settlers-hungary
Yamna settlers in the Great Pannonian Plain, showing only kurgans of Hungary ca. 2950-2500 BC. Yamna Hungary was one of the biggest West Yamna provinces. From Horváth et al. (2013).

Conclusion

This is what we know, using linguistics, archaeology, and genetics:

  • Middle Proto-Indo-European expanded with Khvalynsk-Novodanilovka after ca. 4800 BC, with the first Suvorovo settlements dated ca. 4600 BC.
  • Archaic Late Proto-Indo-European expanded with late Repin (or Volga–Ural settlers related to Khvalynsk, influenced by the Repin expansion) into Afanasevo ca. 3500/3400 BC.
  • Late Proto-Indo-European expanded with Early Yamna settlers to the west into central Europe and the Balkans ca. 3100 BC; and also to the east (as Pre-Proto-Indo-Iranian) into the southern Urals ca. 2600 BC.
  • North-West Indo-European expanded with Yamna Hungary -> East Bell Beakers, from ca. 2500 BC.
  • Proto-Indo-Iranian expanded with Sintashta, Potapovka, and later Andronovo and Srubna from ca. 2100 BC.

It seems that the subclades from Khvalynsk ca. 4250-4000 BC were wrongly reported – like those of Narasimhan et al. (2018). However, even if they are real and YFull estimates have to be revised, and even if the split had happened before the expansion of Suvorovo-Novodanilovka, the most likely origin of R1b-L51 among Bell Beakers will still be the expansion of late Repin / Early Yamna settlers, and that is what ancient DNA samples will most likely show, whatever the social or political consequences.

The only relevance of the finding of R1b-L51 in one place or another – especially if it is found to be a remnant of a Middle PIE expansion coupled with centuries of admixture and interaction in the Carpathian Basin – is the potential influence of an archaic PIE (or non-IE) layer on the development of North-West Indo-European in Yamna Hungary -> East Bell Beaker. That is, more or less like the Uralic influence related to the appearance of R1a-Z93 among Proto-Indo-Iranians, of R1a-Z284 among Pre-Germanic peoples, and of R1a-Z282 among Balto-Slavic peoples.

I think there is little that ancient DNA samples from West Yamna could add to what we know in general terms of archaeology or linguistics at this point regarding Late PIE migrations, beyond many interesting details. I am sure that those who have not attributed some random 6,000-year-old paternal ancestor any magical (ethnic or nationalist) meaning are just having fun, enjoying more and more the precise data we have now on European prehistoric populations.

As for those who believe in magical consequences of genetic studies, I don’t think there is anything for them in this quest beyond the artificially created grand-daddy issues. And, funnily enough, those who played (and play) the ‘neutrality’ card to feel superior in front of others – the “I only care about the truth”-type of lie, while secretly longing for grandpa’s ethnolinguistic continuity – are suffering the hardest fall.

Related

The Danube Corridor Hypothesis and the Carpathian Basin in the Aurignacian

palaeolithic-migrations

Open access review, The Danube Corridor Hypothesis and the Carpathian Basin: Geological, Environmental and Archaeological Approaches to Characterizing Aurignacian Dynamics, by Wei Chu, J World Prehist (2018).

Abstract (emphasis mine):

Early Upper Paleolithic sites in the Danube catchment have been put forward as evidence that the river was an important conduit for modern humans during their initial settlement of Europe. Central to this model is the Carpathian Basin, a region covering most of the Middle Danube. As the archaeological record of this region is still poorly understood, this paper aims to provide a contextual assessment of the Carpathian Basin’s geological and paleoenvironmental archives, starting with the late Upper Pleistocene. Subsequently, it compiles early Upper Paleolithic data from the region to provide a synchronic appraisal of the Aurignacian archaeological evidence. It then uses this data to test whether the relative absence of early Upper Paleolithic sites is obscured by a taphonomic bias. Finally, it reviews current knowledge of the Carpathian Basin’s archaeological record and concludes that, while it cannot reject the Danube corridor hypothesis, further (geo)archaeological work is required to understand the link between the Carpathian Basin and Central and Southeastern Europe.

Interesting excerpt:

Though the Carpathian Basin record currently supports the idea of an exogenous, early entrance of the early Upper Paleolithic into the Carpathian Basin unrelated to any of the preceding MP or transitional industries, the dispersal across the Carpathian Basin is not suggestive of rapid demic expansion, as is evidenced by the relatively late hybridization of the Peștera cu Oase fossil and implied by persistent Mousterian technological elements (Fu et al. 2015; Horvath 2009; Noiret 2005).

carpathian-basin-aurignacian
Map of the Carpathian Basin showing major physiographic features, principal early Upper Paleolithic localities and environmental proxies mentioned in the text. Red stars indicate major archaeological sites; black stars indicate minor archaeological sites. Blue circles indicate modern human remains and black circles are loess profiles (see Tables 1 and 3 for locality information). Projection is latitude–longitude WGS84; DEM is SRTM (Color figure online)

This begs the question of where the makers of the early Upper Paleolithic in the Carpathian Basin came from. Aside from a handful of Aurignacian sites (e.g. Bacho Kiro, Temnata) and Kozarnika, whose link to the Aurignacian remains tenuous, no other sites directly connect the Carpathian Basin Aurignacian to the south in the Balkans. Additionally, Anatolia has also yet to provide empirical evidence of a connection between Southwestern Europe and the early hominin technocomplexes of the Levant. Therefore, a western source for the Carpathian Basin early Upper Paleolithic is conceivable, especially considering the early Willendorf dates which, if correct, pre-date any of the evidence in the Carpathian Basin. If the Danube was as easy a conduit as has been suggested, it is equally likely that it may have seen hominin movement in the opposite direction (Sitlivy et al. 2014). Indeed, increasing genetic and archaeological evidence (Adler et al. 2008; Anikovich et al. 2007; López et al. 2016) supports the idea that the earliest modern humans coming from the Middle East and into Europe may have bypassed Southeastern Europe (at least overland), opting for a route running through the Caucasus, dispersing east through the East European Plain and then north of the Carpathians.

The recent reanalysis of Central European early Upper Paleolithic assemblages and possibly Initial Upper Paleolithic sensu lato finds farther east in the geographically connected Moravian Plains (Bohunician) suggests that early modern humans were present in Central Europe far sooner than previously recognized (Müller et al. 2011; Nigst et al. 2014; Richter et al. 2008, 2009). This notion could lead to a major modification in our understanding of the origin and cultural ontogeny of the Aurignacian technocomplex (Sitlivy et al. 2014).

This suggests that while hominins were undoubtedly present within the Middle Danube catchment in the late Upper Pleistocene, it is currently difficult to tell from the archaeological record whether they entered the Carpathian Basin on direct ‘highways’, in waves (Hublin 2015), or more piecemeal; furthermore, the evidence is too sparse to suggest a directional trajectory. Indeed, the gap in the Danube record suggests that the situation may be more complicated than has previously been thought. Furthermore, climatic reconstructions, illustrated by advances in loess stratigraphy, faunal/floral records and geochemistry, suggest a necessary diversion from the rugged karstic regions of the basin that may have been more familiar hunting areas for previous (Neanderthal) populations. This may have resulted in more frequent or seasonal use of the lowlands within the earlier parts of MIS 3 that may have prompted subsequent modifications in hominin subsistence behavior. A prolonged/intense modern human presence in the Carpathian Basin throughout the late Upper Pleistocene is testified to by a higher frequency of lithic sites with increased artifact density. Increased sedimentation rates in the later part of the Pleistocene may have also helped to offset the palimpsest effect that might have skewed the record.

Related:

The Indo-European demic diffusion model, and the “R1b – Indo-European” association

yamna_bell_beaker_cut

Beginning with the new year, I wanted to commit myself to some predictions, as I did last year, even though they constantly change with new data.

I recently read Proto-Indo-European homelands – ancient genetic clues at last?, by Edward Pegler, which is a good summary of the current state of the art in the Indo-European question for many geneticists – and thus a great example of how well Genetics can influence Indo-European studies, and how badly it can be used to interpret actual cultural events – although more time is necessary for some to realize it. Notice for example the distribution of ‘Yamnaya’ in 3000 BC, all the way to Latvia (based on the initial findings of Mathieson et al. 2017), and the map of 2000 BC with ‘Corded Ware’, both suggesting communities linked by admixture and unrelated to actual cultures.

Some people – especially those interested in keeping a simplistic picture of Europe, either divided into admixture groups or simplistic R1b-Vasconic / R1a-Indo-European / N1c-Uralic (or any combination thereof) – want (others) to believe that I am linking ‘Indo-Europeans’ with haplogroup R1b. That is simply not true. In fact, my model dismisses such simplistic identifications of the reconstructible proto-languages with any modern peoples, admixtures, or haplogroups.

vasconic-uralic
Simplistic Vasconic/R1b-Uralic/N1c distribution, and intruding Indo-European/R1a, according to Wiik.

The beauty of the model lies, therefore, precisely in that if you take any modern group speaking Indo-European languages, none can trace back their combination of language, admixture, and/or haplogroup to a common Indo-European-speaking people. All our ancestral lines have no doubt changed language families (and indeed cultures), they have admixed, and our European regions’ paternal lines have changed, so that any dreams of ‘purity’ or linguistic/cultural/regional continuity become absurd.

That conclusion, which should be obvious to all, has been denied for a long time in blogs and forums alike, and is behind the effort of many of those involved in amateur genetics.

Main linguistic aim

The main consequence of the model, as the title of the paper suggests, is that reconstructible Indo-European proto-languages expanded with people, i.e. with actual communities, which is what we can assert with the help of Genomics. From a personal (or ethnic, or political) point of view genomics is useless, but from an anthropological (and thus linguistic) point of view, genomics can be a very useful tool to decide between alternative models of language diffusion, which has given lots of headaches to those of us involved in Indo-European studies.

The demic diffusion theory for the three main stages of the proto-language expansion was originally, therefore, a dismissal of impossible-to-prove cultural diffusion models for the proto-language – e.g. the adoption of Late Proto-Indo-European by Corded Ware groups due to a patron-client relationship (as proposed by Anthony), or a long-lasting connection between cultures (as proposed by Kristiansen, and favoured by “constellation analogy” proponents like Clackson, who negated the existence of common proto-languages). It also means the acceptance of the easiest anthropological model for language change: migration and – consequently – replacement.

By the time of the famous 2015 papers, I had been dealing for some time with the idea that the shared features between Indo-Iranian and Balto-Slavic may have been due to a common substrate, and must have therefore had some reflection in genomic finds. The data in these papers, and the addition of a weak connection between Pre-Germanic and Balto-Slavic communities, together with their clearest genetic link – R1a-M417 subclades (especially European Z283) – made it still easier to propose a Corded Ware substrate, partially common to the three.

Allentoft Corded Ware
Allentoft et al. “Arrows indicate migrations — those from the Corded Ware reflect the evidence that people of this archaeological culture (or their relatives) were responsible for the spreading of Indo-European languages. All coloured boundaries are approximate.”

Before the famous 2015 papers (and even after them, if we followed their interpretation), we were left to wonder why the languages of the supposed vector of Indo-European expansion, Corded Ware migrants – represented by R1a-Z645 subclades, and supposedly continued unchanged into modern populations in their ‘original’ ancestral territories, Balto-Slavic and Indo-Iranian –, were precisely the (phonetically) most divergent Indo-European languages relative to the parent Late Indo-European proto-language.

My paper implied therefore the dismissal of an unlikely Indo-Slavonic group, as proposed by Kortlandt, and of a still less feasible Germano-Slavonic, or Germano-Indo-Slavonic (?) group, as loosely implied by some in the past, and maybe supported in certain archaeological models (viz. Kristiansen or partially Anthony), and presently by some geneticists since their simplistic 2015 papers on “massive migrations from the steppe”, and amateur genetic fans with infinite pet theories, indeed.

A common Corded Ware substrate to Balto-Slavic and Indo-Iranian, and common also partially between Balto-Slavic and Germanic (as supported by Kortlandt, too, albeit with different linguistic connotations), would explain their common features. The Corded Ware culture (and Uralic, tentatively proposed by me as the group’s main language family) is a strong potential connection between them, further supported by phylogeography, too.

Other consequences

Interpretations in my paper help thus dismiss the simplistic Yamna -> Corded Ware -> Bell Beaker migration model implied with phylogeography in the 2000s, and revived again by geneticists and Kristiansen’s workgroup based on the famous 2015 papers, whereby – due to the “Yamnaya ancestral component” – the Yamna culture would have been composed of communities of R1a-M417 and R1b-M269 lineages which remained against all odds ‘related but separated’ for more than two thousand years, sharing a common unitary language (why? and how?), and which expanded from Yamna (mainly R1b-L23) into Corded Ware (mainly R1a-M417) and then into Bell Beaker (mainly R1b-L51), in imaginary migration waves whose traces Archaeology has not found, or Anthropology described, before.

While phylogeography (especially the distribution of ancient samples of certain R1b and R1a subclades) was the main genetic aspect I used in combination with Archaeology and Anthropology to challenge the reliability of the “Yamnaya ancestral component” in assessing migrations – and thus Kristiansen’s now-popular-again modified Kurgan model –, my main aim was to prove a recent expansion of Late Proto-Indo-European from the steppe, and a still more recent expansion of a common group of speakers of North-West Indo-European, the language ancestral to Italo-Celtic, Germanic, and probably Balto-Slavic (or ‘Temematic’, the NWIE substrate of Balto-Slavic, according to some linguists).

My arguments serve for this purpose, and modern distributions of haplogroups or admixture are fully irrelevant: I am ready to change my view at any time, regarding the role of any haplogroup, or ancestral component, archaeological data, or anthropological migration model, to the extent that it supports the soundest linguistic model.

proto-indo-european-stages
Stages of Proto-Indo-European evolution. IU: Indo-Uralic; PU: Proto-Uralic; PAn: Pre-Anatolian; PToch: Pre-Tocharian; Fin-Ugr: Finno-Ugric. The period between Balkan IE and Proto-Greek could be divided in two periods: an older one, called Proto-Greek (close to the time when NWIE was spoken), probably including Macedonian, and spoken somewhere in the Balkans; and a more recent one, called Mello-Greek, coinciding with the classically reconstructed Proto-Greek, already spoken in the Greek peninsula (West 2007). Similarly, the period between Northern Indo-European and North-West Indo-European could be divided, after the split of Pre-Tocharian, into a North-West Indo-European proper, during the expansion of Yamna to the west, and an Old European period, coinciding with the formation and expansion of the East Bell Beaker group.

Gimbutas’ old theory of sudden and recent expansion served well to support a real community of Proto-Indo-European speakers, as did later the Yamna -> Corded Ware -> Bell Beaker theory that circulated in the 2000s based on modern phylogeography, and as did later partially Anthony’s updated steppe theory (2007). On the other hand, Kristiansen’s long-lasting connections among north-west Pontic steppe cultures and Globular Amphorae and Trypillian cultures, did not fit well with a close community expanding rapidly – although recent genetic data on Trypillia and Globular Amphorae might be compelling him to improve his migration theory.

So, if the data turn out not to be as I expect now, I will reflect that in future versions of the paper. I have no problem saying I am wrong. I have been wrong many times before, and something I am certain of is that I am wrong now in many details, and that I am going to be in the future.

If, for example, R1b-L23(xZ2105) is demonstrated to come from Hungary and not the steppe (as supported by Balanovsky) or R1a-M417 samples are proved to have expanded with West Yamna settlers (as recently proposed by Anthony, see below the Balto-Slavic question), I would support the same model from a linguistic point of view, but modified to reflect these facts. Or if a direct migration link is found in Archaeology from Yamna to Corded Ware, and from Corded Ware to Bell Beaker (as proposed in the 2015 papers), I will revise that too (again, see the image below). Or, if – as the Lazaridis et al. (2017) paper on Minoans and Mycenaeans suggested – the Anatolian hypothesis (that is, one of the multiple ones proposed) turns out to be somehow right, I will support it.

calcolithic-expansion
My map of Late Proto-Indo-European expansion (A Grammar of Modern Indo-European, 2006), following Gimbutas and Mallory.

Haplogroups are the least important aspect of the whole model; they are just another piece of data that has to be taken into account for a thorough explanation of migrations. They have become essential today because of the apparent lack of vision on the part of geneticists, who failed to use them to adjust their findings of admixture with findings of haplogroup expansions, favouring thus a marginal theory of long-lasting steppe expansion instead of the mainstream anthropological models.

Since many of these alternative scenarios seem less and less likely with each new paper, it is probably more efficient to talk about which developments are most likely to challenge my model.

Main points

My main predictions – based mostly on language guesstimates, archaeological cultures, and anthropological models of migration –, even with the scarce genomic data we had, have been proven right until now with new samples from Mathieson et al. (2017) and Olalde et al. (2017), among other papers of this past year. These were my original assumptions:

(1) A Middle Proto-Indo-European expansion defined by the appearance of steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1b-M269 and R1b-L23 lineages;

(2) A Late Proto-Indo-European expansion defined by steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1b-L23 subclades; and

(3) A North-West Indo-European expansion defined by steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1b-L51 subclades.

The expansion of Corded Ware peoples, associated with steppe ancestry + reduction in haplogroup diversity and expansion of (mainly) R1a-Z645 subclades, represents thus a different migration, which is compatible with the different nature of the Corded Ware culture, unrelated to Yamna and without migration waves from one to the other (although there were certainly contacts in neighbouring regions).

As you can see, none of the 3+1 expansion models implies that no other haplogroup can be found in the culture or regions involved (others have in fact been found, and still the models remain valid): these migrations imply a reduction of haplogroup diversity, and the expansion of certain subclades as is common in population expansions throughout history. While we all accept this general idea, some people have difficulties accepting just those cases not compatible with their dreams of autochthonous continuity.

Nevertheless, there are still voids in genetic investigation.

Controversial aspects

In my humble opinion, these are potential conflict periods and the most likely areas of change for the future of the theory:

1. When and how did R1b-M269 lineages become “chiefs” in the steppe?

Based on scarce data from Khvalynsk, it seems that during the Neolithic there were many haplogroups in the North Pontic and North Caspian steppes. A reduction to R1b-M269 subclades must have happened either just before or (as I support) during (the migrations that caused) the Suvorovo-Novodanilovka expansion among Sredni Stog, probably coinciding also with the expansion (or one of the expansions) of CHG ancestry (and thus the appearance of ‘Steppe component’ in the steppe). My theory was based initially on Anthony’s account and TMRCA of haplogroups of modern populations (both ca. 4200-4000 BC), but recent samples of the Balkans (R1b-M269 and steppe ancestry) seem to trace the population expansion some centuries back.

If my assessment is correct, then modern populations of haplogroup R1b-M269* and R1b-L23* in the Balkans probably reflect that ancient expansion, and samples related to Proto-Anatolian cultures in the Balkans will most likely be of R1b-M269 subclades and R1b-L23*. After admixture in the Balkans, later migrations of Anatolian languages into Anatolia might be associated with a different admixture component and haplogroups; we don’t have enough data yet.

If the haplogroup reduction and expansion in Khvalynsk happened later than the Suvorovo-Novodanilovka expansion, then we might find the expansion of Pre- or Proto-Anatolian associated with many different haplogroups, such as R1b (xM269), R1a, I, J, or G2, and more or less associated with steppe ancestry in the Balkans.

Another reason for finding such variety of haplogroups in ancient samples from the Balkans would be that this Khvalynsk group of “chiefs” traversed – and mixed with – the Sredni Stog population. Nevertheless, if we suppose homogeneity in haplogroups in Khvalynsk during the expansion, a high proportion of different haplogroups explained by admixture with the local population of Sredni Stog would challenge the whole “chief domination” explanation by Anthony, and we would have to return to the “different culture” theory by Rassamakin and potentially an older migration from Khvalynsk. In any case, both researchers show clear links of the Suvorovo-Novodanilovka phenomenon to Khvalynsk, and a differentiation with the surrounding Sredni Stog culture.

A less likely model would support the identification of the whole Eneolithic Pontic-Caspian steppe as a loose Indo-Hittite-speaking community, which would be in my opinion too big a territory and too loose a cultural bond to justify such a long-lasting close linguistic connection. This will probably be the refuge of certain people looking desperately for R1a-IE connections. However, the nature of the western steppe will remain distinct from Late Proto-Indo-European, which must have developed in the Yamna culture, so autochthonous continuity is not on the table anymore, in any case…

suvorovo-novodanilovka-region
Coexistence of the Varna-Gumelniţa culture and the Suvorovo phase of the sceptre-bearer communities. 1 — Fălciu; 2 — Fundeni-Lungoţi; 3 — Novoselskaja; 4 — Suvorovo; 5 — Casimcea; 6 — Kjulevča; 7 — Reka Devnja; 8 — Drama; 9 — Gonova mogila; 10 — Reževo; 11 — geographically separate Decea variant of the sceptre bearer group (after Govedarica, Manzura 2011: Abb. 5, adapted).

2. How did R1a-M417 (and especially R1a-Z645) haplogroups come to dominate over the Corded Ware cultures?

If I am right (again, based on TMRCA of modern populations), then it is precisely at the time of the potential expansion of Proto-Corded Ware from the Dnieper-Dniester forest, forest-steppe, and steppe regions, ca. 3300-3000 BC. Furholt’s recent radiocarbon analysis and suggestions of a Lesser Poland origin of the third or A-horizon, on which disparate archaeologists such as Anthony or Klejn rely now, seem to suggest also that Corded Ware was a cultural complex rather than a compact culture reflecting a migration of peoples – similar thus to the Bell Beaker complex.

This cultural complex interpretation of Corded Ware contrasts with the quite homogeneous late samples we have, suggesting clear migration waves in northern Europe, at least at some point in time, so Genomics will be a great tool to ascertain when and from where, approximately, Corded Ware peoples expanded. Right now, it seems that Eneolithic Ukraine populations are the closest to its origin, so the traditional interpretation of its regional origin by Kristiansen or Anthony remains valid.

3. How was Indo-Iranian adopted by Corded Ware invaders?

This is rather an anthropological question. We need reasonable models of founder effect/cultural diffusion necessary for that to happen – similar to the ones necessary to explain the arrival of N1c subclades into north-east Europe, or the arrival of R1b subclades in Basque/Iberian-speaking regions in south-west Europe. My description of potential events in the eastern steppe – based partially on Anthony – is merely a short sketch. Genomic data is unlikely to offer more than it does today (replacement of haplogroups, and gradually of some steppe component, by late Corded Ware groups in the steppe), but let’s see what new samples can contribute.

As for what some Indians – and other people willing to confront them – are looking for, regarding R1a-M417 and/or Indo-European origins in India, I don’t see the point: we already know a) that the origin of the expansion is in the steppe and b) that Hindu nationalist bigots will not accept results from research that oppose their views. I don’t expect huge surprises there, just more fruitless discussions (fomented by those who live from trolling or conspiracies)…

4. Yamna settlers from Hungary

Anthony’s new theory – and the nature of Balto-Slavic – hinges on the presence of R1a-M417 subclades (associated with later Corded Ware samples) in Yamna settlers of Hungary, potentially originally from the North Pontic area, where the oldest sample has been found.

My ‘modified’ version of Anthony’s new model (the only one I deem even remotely feasible) includes the expansion of a Proto-Corded Ware from Lesser Poland, but (given the overwhelming R1b found in East Bell Beaker), with R1a-M417 being associated with the region. How to explain this language change with objective data? Well, we have Bell Beaker expanding to these areas at a later time, so we would need to find R1b-L23 settlers in Lesser Poland, and then a resurgence of haplogroup R1a-M417. If not, resorting yet again to cultural diffusion from Yamna “patrons” to Corded Ware “clients” of Lesser Poland would bring us back to square one, now with the ‘steppe ancestry’ controversy included…

Since some Eastern Europeans are (for no obvious reason whatsoever) putting their hopes on that IE-R1a-CWC association, let’s hope some samples of R1a-M417 in Yamna or Hungary give them a break, so that they can begin accepting something closer to mainstream anthropological models. We could then work from there a Yamna-> Bell Beaker / North-West Indo-European association truce, and from there keep accepting that no single haplogroup from Yamna settlers is linked with modern languages, cultures or ethnic groups.

yamna-region
Localization of Central-European funerary monuments with elements of the Pit Grave culture (after Bátora 2006).

5. How and when was Balto-Slavic associated with haplogroup R1a?

If we accept the Southern or Graeco-Aryan nature of Balto-Slavic with influence from an absorbed North-West Indo-European dialect, “Temematic” (as Kortlandt does), then Indo-Slavonic adopted in the steppe from Potapovka by Sintashta and Poltavka populations divided ca. 2000 BC into Indo-Iranian (migrating to the east with Andronovo), and Balto-Slavic (migrating westward with the Srubna culture). History from there is not straightforward, and it should follow Srubna, Thraco-Cimmerian, or other late expansions from cultures of the steppe.

On the other hand, if it is a Northern dialect related closely to Germanic and Italo-Celtic (in a North-West Indo-European group), then its origin has to be found in the initial expansion of East Bell Beakers, and its development into either the Únětice culture (of Balkan and thus potentially “Southern IE” influence), or the Mierzanowice-Nitra culture (of Corded Ware and thus potentially Uralic influence), or maybe from both, given the intermediate substrate found in Germanic and Balto-Slavic.

It is my opinion that the association of Balto-Slavic with haplogroup R1a is quite early after the East Bell Beaker expansion, probably initially with the subclade typically associated with West Slavic, R1a-M458. I do not have much data to support this (apart from the most common linguistic model), just modern haplogroup distribution maps and common TMRCA, and highly hypothetical archaeological-anthropological models. Genetics will hopefully bring more data.

Let’s see also what information on ancient haplogroups we can obtain from the Tollense valley (already showing a close cluster with modern West Slavic populations) and steppe regions.

6. How did Germanic, Celtic, and Italic expand?

Germanic is probably the most interesting one. Following the expansion of R1b-L51 subclades (especially R1b-U106) and steppe ancestry (a confounding factor, with the previous expansion of R1a-Z284 subclades) in Scandinavia is going to be fascinating. Anthropological models already point to a linguistic and archaeological expansion of Pre-Germanic with Bell Beaker peoples.

The expansion of Celtic seems to be associated with chiefdoms, untraceable today in terms of haplogroups, and it thus seems different from previous expansions. New studies might tell how that happened, whether it was actually in successive waves, as proposed, or whether we don’t have enough data yet to reach conclusions.

We don’t know either how Italic expanded into the Italian Peninsula, or whether Latin expanded with peoples from Italy, if at all, or whether it was mostly a cultural diffusion event, as it seems.

Regarding Etruscan, while I think it is a controversy that began with fantastic accounts and was ignited by a few finds of Middle Eastern ancestry (which seem logical from the point of view of regional contacts), it will also be important for Italian linguists and archaeologists to accept the most likely scenario.

As for Palaeo-Hispanic languages, while steppe ancestry is found quite reduced in R1b-L51 subclades (after so many different expansions and admixture events since the departure from the steppe), their distribution from the Chalcolithic onwards and the resurgence of native haplogroups may serve to ascertain which Pre-Roman tribes were associated with the oldest regions where these subclades dominated. For that aim, a closer look at the developments in Aquitania and other pre-Roman Vasconic- and Iberian-speaking regions may shed some light on how founder effects might develop to leave the native language intact (in a case similar to the adoption of Indo-Iranian by post-Corded Ware Sintashta and Potapovka in the eastern Pontic-Caspian steppe).

NOTE: Although mostly unrelated, linguistic questions may also be somehow altered with a change of migration models. For example, our current Corded Ware Substrate Hypothesis – strongly contested by Kortlandt and others – implies that Uralic was potentially the language spoken by Eneolithic Ukraine / Proto-Corded Ware peoples, therefore early Uralic languages were spoken by Corded Ware peoples, as a substrate for Germanic and Balto-Slavic, and Balto-Slavic and Indo-Iranian. If an Indo-Hittite branch different from Late PIE is accepted for Eneolithic Ukraine (thus suggesting a millennia-long cultural-historical community in the steppe), then the model still stands (e.g. Ger. and BSl. *-mos/-mus, as stated by Kortlandt, would correspond to the oldest morphological IE layer). As you can read in the different versions of our model, the different possibilities for the common substrate are stated, and the most likely one selected. But the most likely a priori option sometimes turns out to be wrong…

NOTE 2: You can comment whatever you want here, but I opened a specific thread in our forum if you want serious comments on the model to stick and be further discussed.

Featured images: from the book Interactions, changes and meanings. Essays in honour of Igor Manzura on the occasion of his 60th birthday. Țerna S., Govedarica B. (eds.). 2016. Kishinev: Stratum Plus.


Admixture of Srubna and Huns in Hungarian conquerors

hungarian-conqueror-migrations

New preprint at bioRxiv, Mitogenomic data indicate admixture components of Asian Hun and Srubnaya origin in the Hungarian Conquerors, by Neparáczki et al. (2018).

Abstract (emphasis mine):

It has been widely accepted that the Finno-Ugric Hungarian language, originated from proto Uralic people, was brought into the Carpathian Basin by the Hungarian Conquerors. From the middle of the 19th century this view prevailed against the deep-rooted Hungarian Hun tradition, maintained in folk memory as well as in Hungarian and foreign written medieval sources, which claimed that Hungarians were kinsfolk of the Huns. In order to shed light on the genetic origin of the Conquerors we sequenced 102 mitogenomes from early Conqueror cemeteries and compared them to sequences of all available databases. We applied novel population genetic algorithms, named Shared Haplogroup Distance and MITOMIX, to reveal past admixture of maternal lineages. Phylogenetic and population genetic analysis indicated that more than one third of the Conqueror maternal lineages were derived from Central-Inner Asia and their most probable ultimate sources were the Asian Huns. The rest of the lineages most likely originated from the Bronze Age Potapovka-Poltavka-Srubnaya cultures of the Pontic-Caspian steppe, which area was part of the later European Hun empire. Our data give support to the Hungarian Hun tradition and provides indirect evidence for the genetic connection between Asian and European Huns. Available data imply that the Conquerors did not have a major contribution to the gene pool of the Carpathian Basin, raising doubts about the Conqueror origin of Hungarian language.

hungarian-conqueror-mtdna
“Comparison of major Hg distributions from modern and ancient populations. Asian main Hg-s are designated with brackets. Major Hg distribution of Conqueror samples from this study are very similar to that of other 91 Conquerors taken from previous studies [11,12]. Scythians and ancient Xiongnus show similar Hg composition to the bracketed Asian fraction of the Conqueror samples, but Hg B is present just in Xiongnus. Modern Hungarians have very small Asian components pointing at small contribution from the Conquerors. Of the 289 modern Hungarian mitogenomes 272 are published in [29]. Scythian Hg-s are from [48,49,55,59,71–74]. Xiongnu Hg-s are from [66–69].”

Just recently another article contributed to a similar idea. I already talked about the Bronze Age R1a-Z93 sample with high steppe ancestry found in the Balkans, and its likely origin in an expansion of the Srubna or a related culture. No truce, therefore, for those looking for autochthonous continuity anywhere in Europe.

We are seeing how multiple migrations shaped the history of the Carpathian Basin (and its complex genetic structure) – and of Europe in general – often from the Pontic-Caspian steppe. That is clear from many different prehistoric and historical times, such as the expansions of Suvorovo-Novodanilovka, Yamna, Srubna, Thraco-Cimmerians, Sarmatians, Scythians, Huns…

About the linguistic interpretations based on genetics contained in the paper (Hungarian language as a legacy of Huns), well, you know my stance regarding the Yamnaya ancestral concept (and the wrong linguistic interpretations derived from it, which many sadly keep to this day), and regarding the use of genetics in general to solve language questions.

This is yet another example of how (what some people would call) “scientific data” is useless without sound anthropological models.

Featured image, from the article: “Hypothetic origin and migration route of different components of the Hungarian Conquerors. Bluish line frames the Eurasian steppe zone, within which all presumptive ancestors of the Conquerors were found. Yellow area designates the Xiongnu Empire at its zenith from which area the East Eurasian lineages originated. Phylogeographical distribution of modern East Eurasian sequence matches (Fig. 1) well correspond to this territory, especially considering that Yakuts, Evenks and Evens lived more south in the past [108], and European Tatars also originated from this area. Regions where Asian and European Scythian remains were found are labeled green, pink is the presumptive range of the Srubnaya culture. Migrants of Xiongnu origin most likely incorporated descendants of these groups. The map was created using QGIS 2.18.4[109]”.

Article available under a CC-BY-NC-ND 4.0 International license.

Discovered via Razib Khan.


The Great Hungarian Plain in a time of change in the Balkans – Neolithic, Chalcolithic, and Bronze Age

hungary-yamna-burials-map

I wrote recently about Anthony’s new model of Corded Ware culture expansion from Yamna settlements of Hungary. I am extremely sceptical about it in terms of current genetic finds, and suspicious of the real reasons behind it – probably misinterpretations of the so-called ‘Yamnaya ancestral component’ in recent genetic papers, rather than archaeological finds.

Nevertheless, it means a definitive rejection by Anthony of:

  • The multiple patron-client relationships he proposed to justify a cultural diffusion of Late Indo-European dialects from Yamna into different Corded Ware cultures in the forest-steppe and Forest Zone (see one of his latest summaries of the model in 2015). Now the language change is explained as a pure migration event, and cultural diffusion is not an option. Ergo, if no migration is found from Hungarian Yamna into Lesser Poland, then Corded Ware cultures were not Indo-European-speaking.
  • Ringe’s glottochronological tree for Proto-Indo-European languages (Ringe, Warnow, and Taylor 2002). An early and sudden split of Late PIE dialects in all directions is substituted by a common, Old European language that expanded from a very small area of settlers, in the Carpathian Basin. This is coincident with the current view on North-West Indo-European, and I think that his final acceptance of a sound linguistic model is essential to solve Indo-European questions.
  • The simplistic assumption of Yamna -> Corded Ware -> Bell Beaker migration found in genetic papers of 2015. The new model implies Yamna->Yamna settlers (Eastern Hungary). Yamna settlers are known to have developed into East Bell Beakers (as described by Gimbutas and accepted by Anthony originally, and now also found in the adoption of Heyd’s theory for his new model); therefore a Yamna settlers (Hungary) -> East Bell Beaker evolution is evident and mainstream, now clear also in genetics. It remains to be seen if the additional Yamna settlers (Hungary) -> Proto-Corded Ware migration proposed by him as a novelty in this new model is also right, i.e. if Yamna settlers from Hungary did in fact migrate into sites of Lesser Poland (to form a Proto-Corded Ware culture). If not, then only Heyd’s model remains.

This new model thus offers a more suitable time frame for the usual proto-language guesstimates, one that would be compatible with a spread of Late Indo-European with Yamna settlers (of R1b lineages) from the steppe into a small region, where North-West Indo-European would have been spoken, and then a potential cultural diffusion through (or founder effect in) a Proto-Corded Ware culture (of R1a-M417 subclades) of Lesser Poland, which is compatible with the Corded Ware Substrate hypothesis.

Since Anthony has stuck his neck out in favour of this new theory – changing some of his popular theories, and rejecting what many geneticists seem to take as certain – and because of his previous impressive improvements over Gimbutas’ simple steppe theory (now apparently fashionable again), I think he deserves that his proposal of Yamna/Late Indo-European expansion in the Balkans be further investigated, if only to be improved upon.

I recently found the paper 4000-2000 BC in Hungary: The Age of Transformation, by T. Horváth, in Annales Universitatis Apulensis. Series Historica, 20/II, 51-113. While it deals mainly with the potential survival of the Baden culture into the late third millennium BC, it gives some interesting, quite early dates for Yamna (‘Pit’) graves in the Carpathian Basin, and for potential cultural (and population) movements within the Balkans.

A note about the Corded Ware culture in the Carpathian Basin:

Many researchers may assume that it is unnecessary for us to deal with the Corded Ware and Globular Amphorae cultures of north Germany, Poland and Denmark, and if so it does not matter what the names of the periods are. It actually matters a lot. It is true that in these areas there was no Baden complex, but the period had many Baden (and other) culture “period phenomena”. These seem to part of a larger formation than cultures – evidenced by traces such as cattle burials, the relationship between copper metallurgies and jade – which link these territories even when the culture complexes were different, because these phenomena appear not just in the Baden, but in the Corded Ware and Globular Amphorae area as well (and these cultural complexes partly overlapped each other both in space and time!). Even the characteristics of sites show many similarities: e.g. in the northern part of corded ware distribution area, mainly burials have been discovered (similarly to the Pit Grave culture in the Great Hungarian Plain) and in the southern part only settlements appear.

At the moment we have no explanation regarding the nature of the relationship between them (it is supposed that as a result of geographical conditions the people of the same culture lived in different ecological conditions and they adapted differently to their environment). In considering the whole of Europe around 3500-3000 BC, easily observable settlement signs disappeared (Milisauskas and Kruk, “Late Neolithic/Late Copper Age,” 307), similarly to Hungary, even though in Hungary this occurred from the end of the Middle Copper Age to the Early Bronze Age, between 4000 and 2000 BC. If we do not take into account that the cattle burials of the Baden culture between 3600 and 2800 BC, and possibly even longer than that, have analogies with the cattle burials of areas in the Early and Middle Neolithic Corded Ware culture (because “logically” analogies would be sought in those areas in the Bronze Age but this period is not analogous with that period in those areas), we would not find any spiritual resemblance in their relationships that lies behind their spatial and temporal analogies; cf. comp. Niels Johannsen and Steffen Laursen, “Routes and Wheeled Transport in Late 4th-Early 3rd Millennium Funerary Customs of the Jutland Peninsula: Regional Evidence and European Context,” PZ 85 (2010): 15-58; Horváth “The Intercultural Connections of the Baden „Culture,” 118. It is painful to think about how many relationships we have not explored or even assessed yet!

hungary-yamna-corded-ware-map
One version from both maps shown in the article, by T. Horváth: “Since the two cultures surely lived together in the Late Copper Age, their collective map represents the Late Copper Age (supplemented with Vučedol sites). Since the direction of diffusion of the Kostolac ceramic style is still unclear, two map versions were made. On one the Kostolac followed the Danube River, on the other they diffused in the opposite direction. In northeast Hungary, Coțofeni III appeared. On this map Kostolac sites are not depicted as dots but, in light of their position and density, proportionately sized arrows are used.”

On Yamna culture and burials in the Carpathian Basin:

Looking at Pit Grave kurgans on the distribution map, it is apparent that burials are the densest where there were no Boleráz or Baden occupations (in this respect this was a kind of “no man’s land”, but from the whole Late Copper Age perspective it was not: the sites of the Baden complex and Pit Grave complemented each other and even partially overlapped). Apart from burials, no Pit Grave settlements or other types of Pit Grave sites are known in Hungary, therefore we do not know whether Pit Grave settlements were situated near the kurgans or whether were somewhere else entirely and we simply have not found them yet.

Since the Pit Grave people had a different lifestyle from the Baden, we can assume that, up to the line of the Tisza River, small animal-keeping mobile groups (Pit Grave) met more populated and settled, agriculturalist, indigenous Boleráz-Baden groups. Animal keepers (Pit Grave) settled in areas where agriculturalists (Boleráz and Baden) did not; in some places, however, they crossed each other’s paths (Fig. 5, 7). Sometimes their sites are very close to each other, sometimes they appear on one site and they can be identified in the stratigraphy of a site. In the latter case the kurgan is always situated on top of a Baden settlement, indicating that Pit Grave not only followed the Baden at these sites but may have represented a somewhat higher social power and belief system than the Baden.

The relationship between pastoral, patrilineal, combatant nomadic tribes and agriculturalist communities is often described as some sort of patron and client relationship. In reality, the signs of such assumption are not visible in the Pit Grave-Baden relationship. There are cases when more aggressive herders conquered more developed agriculturalist communities, but there are also cases when the conqueror’s culture was more developed or stronger than that of the conquered. Always, the conquering nomads are the patrons, the rulers and the empire builders.

In our case, timing is important. How much time had passed on those common sites where a Baden settlement was followed by a Pit Grave kurgan? In these cases, it is certain that the kurgan is younger, but how much younger?

hungary-yamna-settlements
From the article, by T. Horváth. “On the 10 locations analysed, surviving Baden can be assumed after 2800 BC. Unfortunately, it is not possible to predict which sites would survive further scrutiny of radiocarbon dating in this respect; only a few dates are available so far. Therefore, on the map of Baden that still existing after the Late Copper Age, I have also represented all sites (up to the Danube River line) and combined them with Early Bronze Age sites. Since the majority of Makó sites are represented by only one find (scattered finds), and the majority of sites have just one grave, it is impossible to ascertain whether it was part of a cemetery, was within a settlement, or was an individual burial without any further features. Therefore, following Dani 2005, I utilized subdivisions: perhaps in the future this fine subdivision will provide a meaningful explanation (1). Since the radiocarbon dates of Pit Grave kurgans clearly show that the Pit Grave survived at least until 2500 BC, I combined the previous map with that of the Pit Grave. This map would show a realistic picture of cultures after 2800 BC east of the Danube River (2).”

To sum up, the Pit Grave and Baden in the Late Copper Age were certainly contemporary from 3350 BC in the Great Hungarian Plain, and they had common sites, sites which were very close to each other, sites which were far from each other, and also independent sites. The Pit Grave culture surely survived in the transitional period, and into Early Bronze Age I, but perhaps even longer. For the most part, the Baden had ended by 2900 BC in the Great Hungarian Plain. Mapping and some other data (e.g. the discovery that Younger-type, not Mondsee-type, metal objects, which can now be considered to be Baden, even appear east of the Danube River) does not exclude the possibility of searching further for traces of Baden surviving in the Great Hungarian Plain together with or alongside to the Pit Grave. On the common Baden-Pit Grave sites, even without carbon dating, we can assume from already known stratigraphical data that they closely followed each other in time.

For those of you interested in a more detailed radiocarbon analysis and assessment of Yamna burials and settlements, from the steppe to the Balkans, in order to investigate Anthony’s theory further, I can recommend reading, apart from those authors referenced by him, Y. Rassamakin (e.g. Import and Imitation in Archaeology, 2008), S. Ivanova, or Claudia Gerling (e.g. Prehistoric Mobility and Diet in the West Eurasian Steppes 3500 to 300 BC).

Featured image, from the article, by T. Horváth: Distribution map of the Pit Grave.

Related: