Migrations painted by Irish and Scottish genetic clusters, and their relationship with British and European ones


Interesting and related publications, now appearing in pairs…

1. The Irish DNA Atlas: Revealing Fine-Scale Population Structure and History within Ireland, by Gilbert et al., in Scientific Reports (2017).


The extent of population structure within Ireland is largely unknown, as is the impact of historical migrations. Here we illustrate fine-scale genetic structure across Ireland that follows geographic boundaries and present evidence of admixture events into Ireland. Utilising the ‘Irish DNA Atlas’, a cohort (n = 194) of Irish individuals with four generations of ancestry linked to specific regions in Ireland, in combination with 2,039 individuals from the Peoples of the British Isles dataset, we show that the Irish population can be divided in 10 distinct geographically stratified genetic clusters; seven of ‘Gaelic’ Irish ancestry, and three of shared Irish-British ancestry. In addition we observe a major genetic barrier to the north of Ireland in Ulster. Using a reference of 6,760 European individuals and two ancient Irish genomes, we demonstrate high levels of North-West French-like and West Norwegian-like ancestry within Ireland. We show that our ‘Gaelic’ Irish clusters present homogenous levels of ancient Irish ancestries. We additionally detect admixture events that provide evidence of Norse-Viking gene flow into Ireland, and reflect the Ulster Plantations. Our work informs both on Irish history, as well as the study of Mendelian and complex disease genetics involving populations of Irish ancestry.

The European ancestry profiles of 30 Irish and British clusters. (a) The total ancestry contribution summarised by majority European country of origin to each of the 30 Irish and British clusters. (b) (left) The ancestry contributions of 19 European clusters that donate at least 2.5% ancestry to any one Irish or British cluster. (right) The geographic distribution of the 19 European clusters, shown as the proportion of individuals in each European region belonging to each of the 19 European clusters. The proportion of individuals from each European region not a member of the 19 European clusters is shown in grey. Total numbers of individuals from each region are shown in white text. Not all Europeans included in the analysis were phenotyped geographically. The figure was generated in the statistical software language R [46], version 3.4.1, using various packages. The map of Europe was sourced from the R software package “mapdata” (https://CRAN.R-project.org/package=mapdata).

2. New preprint on BioRxiv, Insular Celtic population structure and genomic footprints of migration, by Byrne, Martiniano et al. (2017).


Previous studies of the genetic landscape of Ireland have suggested homogeneity, with population substructure undetectable using single-marker methods. Here we have harnessed the haplotype-based method fineSTRUCTURE in an Irish genome-wide SNP dataset, identifying 23 discrete genetic clusters which segregate with geographical provenance. Cluster diversity is pronounced in the west of Ireland but reduced in the east where older structure has been eroded by historical migrations. Accordingly, when populations from the neighbouring island of Britain are included, a west-east cline of Celtic-British ancestry is revealed along with a particularly striking correlation between haplotypes and geography across both islands. A strong relationship is revealed between subsets of Northern Irish and Scottish populations, where discordant genetic and geographic affinities reflect major migrations in recent centuries. Additionally, Irish genetic proximity of all Scottish samples likely reflects older strata of communication across the narrowest inter-island crossing. Using GLOBETROTTER we detected Irish admixture signals from Britain and Europe and estimated dates for events consistent with the historical migrations of the Norse-Vikings, the Anglo-Normans and the British Plantations. The influence of the former is greater than previously estimated from Y chromosome haplotypes. In all, we paint a new picture of the genetic landscape of Ireland, revealing structure which should be considered in the design of studies examining rare genetic variation and its association with traits.

Here are some interesting excerpts (emphasis mine):

Population structure in Ireland

The geographical distribution of this deep subdivision of Leinster resembles pre-Norman territorial boundaries which divided Ireland into fifths (cúige), with north Leinster a kingdom of its own known as Meath (Mide) [15]. However interpreted, the firm implication of the observed clustering is that despite its previously reported homogeneity, the modern Irish population exhibits genetic structure that is subtly but detectably affected by ancestral population structure conferred by geographical distance and, possibly, ancestral social structure.

ChromoPainter PC1 demonstrated high diversity amongst clusters from the west coast, which may be attributed to longstanding residual ancient (possibly Celtic) structure in regions largely unaffected by historical migration. Alternatively, genetic clusters may also have diverged as a consequence of differential influence from outside populations. This diversity between western genetic clusters cannot be explained in terms of geographic distance alone.

In contrast to the west of Ireland, eastern individuals exhibited relative homogeneity; (…) The overall pattern of western diversity and eastern homogeneity in Ireland may be explained by increased gene flow and migration into and across the east coast of Ireland from geographically proximal regions, the closest of which is the neighbouring island of Britain.

Analysis of variance of the British admixture component in cluster groups showed a significant difference (p < 2×10⁻¹⁶), indicating a role for British Anglo-Saxon admixture in distinguishing clusters, and ChromoPainter PC2 was correlated with the British component (p < 2×10⁻¹⁶), explaining approximately 43% of the variance. PC2 therefore captures an east to west Anglo-Celtic cline in Irish ancestry. This may explain the relative eastern homogeneity observed in Ireland, which could be a result of the greater English influence in Leinster and the Pale during the period of British rule in Ireland following the Norman invasion, or simply geographic proximity of the Irish east coast to Britain. Notably, the Ulster cluster group harboured an exceptionally large proportion of the British component (Fig 1D and 1E), undoubtedly reflecting the strong influence of the Ulster Plantations in the 17th century and its residual effect on the ethnically British population that has remained.

Fine-grained population structure in Ireland. (A) fineSTRUCTURE clustering dendrogram for 1,035 Irish individuals. Twenty-three clusters are defined, which are combined into cluster groups for clusters that are neighbouring in the dendrogram, overlapping in principal component space (B) and sampled from regions that are geographically contiguous. Details for each cluster in the dendrogram are provided in S1 Fig. (B) Principal components analysis (PCA) of haplotypic similarity, based on ChromoPainter coancestry matrix for Irish individuals. Points are coloured according to cluster groups defined in (A); the median location of each cluster group is plotted. (C) Map of Ireland showing the sampling location for a subset of 588 individuals analysed in (A) and (B), coloured by cluster group. Points have been randomly jittered within a radius of 5 km to preserve anonymity. Precise sampling location for 44 Northern Irish individuals from the People of the British Isles dataset was unknown; these individuals are plotted geometrically in a circle. (D) “British admixture component” (ADMIXTURE estimates; k=2) for Irish cluster groups. This component has the largest contribution in ancient Anglo-Saxons and the SEE cluster. (E) Linear regression of principal component 2 (B) versus British admixture component (r² = 0.43; p < 2×10⁻¹⁶). Points are coloured by cluster group. (Standard error for ADMIXTURE point estimates presented in S11 Fig.)
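To make the "43% of the variance" statement concrete, here is a minimal sketch of how an r² value comes out of a simple linear regression of one variable (say, a PC2 coordinate) on another (say, an admixture proportion). This is not the authors' pipeline; the function name `linear_fit` and the numbers in the example are made up purely for illustration.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared). r_squared is the share of
    the variance in y that the fitted line accounts for.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the preprint.
    british_component = [0.10, 0.15, 0.22, 0.30, 0.41, 0.18]
    pc2_position = [-0.8, -0.1, 0.3, 0.2, 1.1, -0.5]
    slope, intercept, r2 = linear_fit(british_component, pc2_position)
    print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r2:.2f}")
```

An r² of 0.43 in this sense means that a straight line through the cloud of points accounts for a bit less than half of the spread in PC2; the rest is left unexplained by the British component.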

On the genetic structure of the British Isles

The genetic substructure observed in Ireland is consistent with long term geographic diversification of Celtic populations and the continuity shown between modern and Early Bronze Age Irish people

Clusters representing Celtic populations harbouring less Anglo-Saxon influence separate out above and below SEE on PC4. Notably, northern Irish clusters (NLU), Scottish (NISC, SSC and NSC), Cumbria (CUM) and North Wales (NWA) all separate out at a mutually similar level, representing northern Celtic populations. The southern Celtic populations Cornwall (COR), south Wales (SWA) and south Munster (SMN) also separate out on similar levels, indicating some shared haplotypic variation between geographically proximate Celtic populations across both Islands. It is notable that after the split of the ancestrally divergent Orkney, successive ChromoPainter PCs describe diversity in British populations where “Anglo-saxonization” was repelled [22]. PC3 is dominated by Welsh variation, while PC4 in turn splits North and South Wales significantly, placing south Wales adjacent to Cornwall and north Wales at the other extreme with Cumbria, all enclaves where Brittonic languages persisted.

In an interesting symmetry, many Northern Irish samples clustered strongly with southern Scottish and northern English samples, defining the Northern Irish/Cumbrian/Scottish (NICS) cluster group. More generally, by modelling Irish genomes as a linear mixture of haplotypes from British clusters, we found that Scottish and northern English samples donated more haplotypes to clusters in the north of Ireland than to the south, reflecting an overall correlation between Scottish/north English contribution and ChromoPainter PC1 position in Fig 1 (Linear regression: p < 2×10⁻¹⁶, r² = 0.24).

North to south variation in Ireland and Britain are therefore not independent, reflecting major gene flow between the north of Ireland and Scotland (Fig 5) which resonates with three layers of historical contacts. First, the presence of individuals with strong Irish affinity among the third generation PoBI Scottish sample can be plausibly attributed to major economic migration from Ireland in the 19th and 20th centuries [6]. Second, the large proportion of Northern Irish who retain genomes indistinguishable from those sampled in Scotland accords with the major settlements (including the Ulster Plantation) of mainly Scottish farmers following the 16th Century Elizabethan conquest of Ireland which led to these forming the majority of the Ulster population. Third, the suspected Irish colonisation of Scotland through the Dál Riata maritime kingdom, which expanded across Ulster and the west coast of Scotland in the 6th and 7th centuries, linked to the introduction and spread of Gaelic languages [3]. Such a migratory event could work to homogenise older layers of Scottish population structure, in a similar manner as noted on the east coasts of Britain and Ireland. Earlier communications and movements across the Irish Sea are also likely, which at its narrowest point separates Ireland from Scotland by approximately 20 km.

Genes mirror geography in the British Isles. (A) fineSTRUCTURE clustering dendrogram for combined Irish and British data. Data principally split into Irish and British groups before subdividing into a total of 50 distinct clusters, which are combined into cluster groups for clusters that formed clades in the dendrogram, overlapped in principal component space (B) and were sampled from regions that are geographically contiguous. Names and labels follow the geographical provenance for the majority of data within the cluster group. Details for each cluster in the dendrogram are provided in S2 Fig. (B) Principal component analysis (PCA) of haplotypic similarity based on the ChromoPainter coancestry matrix, coloured by cluster group with their median locations labelled. We have chosen to present PC1 versus PC4 here as these components capture new information regarding correlation between haplotypic variation across Britain and Ireland and geography, while PC2 and PC3 (Fig 4) capture previously reported splitting for Orkney and Wales from Britain [7]. A map of Ireland and Britain is shown for comparison, coloured by sampling regions for cluster groups, the boundaries of which are defined by the Nomenclature of Territorial Units for Statistics (NUTS 2010), with some regions combined. Sampling regions are coloured by the cluster group with the majority presence in the sampling region; some sampling regions have significant minority cluster group representations as well, for example the Northern Ireland sampling region (UKN0; NUTS 2010) is majorly explained by the NICS cluster group but also has significant representation from the NLU cluster group. The PCA plot has been rotated clockwise by 5 degrees to highlight its similarity with the geographical map of the Ireland and Britain. NI, Northern Ireland; PC, principal component. 
Cluster groups that share names with groups from Fig 1 (NLU; SMN; CLN; CNN) have an average of 80% of their samples shared with the initial cluster groups. © EuroGeographics for the map and administrative boundaries, note some boundaries have been subsumed or modified to better reflect sampling regions.

Genomic footprints of migration into Ireland

Quite interestingly, it is haplogroups, and not admixture, that define the oldest migration layers into Ireland. Without evidence from paternal Y-DNA lineages we would probably not be able to ascertain the oldest migrations and the languages brought by migrants, including Celtic languages:

Of all the European populations considered, ancestral influence in Irish genomes was best represented by modern Scandinavians and northern Europeans, with a significant single-date one-source admixture event overlapping the historical period of the Norse-Viking settlements in Ireland (p < 0.01; fit quality FQB > 0.985; Fig 6). (…) This suggests a contribution of historical Viking settlement to the contemporary Irish genome and contrasts with previous estimates of Viking ancestry in Ireland based on Y chromosome haplotypes, which have been very low [25]. The modern-day paucity of Norse-Viking Y chromosome haplotypes may be a consequence of drift with the small patrilineal effective population size, or could have social origins with Norse males having less influence after their military defeat and demise as an identifiable community in the 11th century, with persistence of the autosomal signal through recombination.

European admixture date estimates in northwest Ulster did not overlap the Viking age but did include the Norman period and the Plantations

The genetic legacies of the populations of Ireland and Britain are therefore extensively intertwined and, unlike admixture from northern Europe, too complex to model with GLOBETROTTER.

All-Ireland GLOBETROTTER admixture date estimates for European and British surrogate admixing populations. A summary of the date estimates and 95% confidence intervals for inferred admixture events into Ireland from European and British admixing sources is shown in (A), with ancestry proportion estimates for each historical source population for the two events and example coancestry curves shown in (B). In the coancestry curves Relative joint probability estimates the pairwise probability that two haplotype chunks separated by a given genetic distance come from the two modeled source populations respectively (ie FRA(8) and NOR-SG); if a single admixture event occurred, these curves are expected to decay exponentially at a rate corresponding to the number of generations since the event. The green fitted line describes this GLOBETROTTER fitted exponential decay for the coancestry curve. If the sources come from the same ancestral group the slope of this curve will be negative (as with FRA(8) vs FRA(8)), while a positive slope indicates that sources come from different admixing groups (as with FRA(8) vs NOR-SG). The adjacent bar plot shows the inferred genetic composition of the historical admixing sources modelled as a mixture of the sampled modern populations. A European admixture event was estimated by GLOBETROTTER corresponding to the historical record of the Viking age, with major contributions from sources similar to modern Scandinavians and northern Europeans and minor contributions from southern European-like sources. For admixture date estimates from British-like sources the influence of the Norman settlement and the Plantations could not be disentangled, with the point estimate date for admixture falling between these two eras and GLOBETROTTER unable to adequately resolve source and proportion details of admixture event (fit quality FQB< 0.985). The relative noise of the coancestry curves reflects the uncertainty of the British event. 
Cluster labels (for the European clustering dendrogram, see S4 Fig; for the PoBI clustering dendrogram, see S3 Fig): FRA(8), France cluster 8; NOR-SG, Norway, with significant minor representations from Sweden and Germany; SE_ENG, southeast England; N_SCOT(4) northern Scotland cluster 4.
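The caption says that, for a single admixture event, the coancestry curve decays exponentially at a rate corresponding to the number of generations since the event. A rough sketch of that idea follows: fit a straight line to the logarithm of the curve and read the generations off the slope. This is not GLOBETROTTER's actual fitting procedure. The function names, the assumption that the curve has already had its baseline subtracted (so all values are positive), the 28 years per generation and the reference year 1950 are all assumptions made for illustration.

```python
import math


def fit_generations(distances_morgans, curve_values):
    """Estimate g in y = A * exp(-g * d) by least squares on log(y).

    distances_morgans: genetic distances d between haplotype chunks.
    curve_values: baseline-subtracted curve heights (must be > 0).
    """
    if len(distances_morgans) != len(curve_values) or len(curve_values) < 2:
        raise ValueError("need two equally long sequences (at least 2 points)")
    if any(v <= 0 for v in curve_values):
        raise ValueError("curve values must be positive to take logs")
    logs = [math.log(v) for v in curve_values]
    n = len(logs)
    mean_d = sum(distances_morgans) / n
    mean_l = sum(logs) / n
    sdd = sum((d - mean_d) ** 2 for d in distances_morgans)
    sdl = sum((d - mean_d) * (l - mean_l)
              for d, l in zip(distances_morgans, logs))
    # log(y) = log(A) - g * d, so g is minus the fitted slope.
    return -sdl / sdd


def generations_to_year(generations, reference_year=1950,
                        years_per_generation=28):
    """Convert generations before the reference year to a calendar year.

    Both defaults are illustrative assumptions, not values from the paper.
    """
    return reference_year - years_per_generation * generations


if __name__ == "__main__":
    # Synthetic, noise-free curve for an event 35 generations ago.
    distances = [0.01 * k for k in range(1, 21)]
    curve = [0.4 * math.exp(-35 * d) for d in distances]
    g = fit_generations(distances, curve)
    print(f"generations={g:.1f} year={generations_to_year(g):.0f}")
```

With real curves the noise mentioned in the caption matters: the noisier the points, the wider the range of decay rates (and hence dates) that fit them about equally well, which is why the British event could not be resolved.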

Another study that strengthens the need to ascertain haplogroup-admixture differences between Yamna/Bell Beaker and Sredni Stog/Corded Ware.

Text and images from preprint article under a CC-BY-NC-ND 4.0 International license.

Featured image, from the article in Scientific Reports: The clustering of individuals with Irish and British ancestry based solely on genetics. Shown are 30 clusters identified by fineStructure from 2,103 Irish and British individuals. The dendrogram (left) shows the tree of clusters inferred by fineStructure and the map (right) shows the geographic origin of 192 Atlas Irish individuals and 1,611 British individuals from the Peoples of the British Isles (PoBI) cohort, labelled according to fineStructure cluster membership. Individuals are placed at the average latitude and longitude of either their great-grandparental (Atlas) or grandparental (PoBI) birthplaces. Great Britain is separated into England, Scotland, and Wales. The island of Ireland is split into the four Provinces; Ulster, Connacht, Leinster, and Munster. The outline of Britain was sourced from Global Administrative Areas (2012). GADM database of Global Administrative Areas, version 2.0. www.gadm.org. The outline of Ireland was sourced from Open Street Map Ireland, Copyright OpenStreetMap Contributors, (https://www.openstreetmap.ie/) – data available under the Open Database Licence. The figure was plotted in the statistical software language R [46], version 3.4.1, with various packages.

Expansion of peoples associated with spread of haplogroups: Mongols and C3*-F3918, Arabs and E-M183 (M81)


Two recent interesting papers on the potential expansion of cultures associated with haplogroups:

1. Whole Y-chromosome sequences reveal an extremely recent origin of the most common North African paternal lineage E-M183 (M81), by Solé-Morata et al., Scientific Reports (2017).


E-M183 (E-M81) is the most frequent paternal lineage in North Africa and thus it must be considered to explore past historical and demographical processes. Here, by using whole Y chromosome sequences from 32 North African individuals, we have identified five new branches within E-M183. The validation of these variants in more than 200 North African samples, from which we also have information of 13 Y-STRs, has revealed a strong resemblance among E-M183 Y-STR haplotypes that pointed to a rapid expansion of this haplogroup. Moreover, for the first time, by using both SNP and STR data, we have provided updated estimates of the times-to-the-most-recent-common-ancestor (TMRCA) for E-M183, which evidenced an extremely recent origin of this haplogroup (2,000–3,000 ya). Our results also showed a lack of population structure within the E-M183 branch, which could be explained by the recent and rapid expansion of this haplogroup. In spite of a reduction in STR heterozygosity towards the West, which would point to an origin in the Near East, ancient DNA evidence together with our TMRCA estimates point to a local origin of E-M183 in NW Africa.

Distribution of E-M183 subclades among North Africa, the Near East and the Iberian Peninsula. Pie chart sectors areas are proportional to haplogroup frequency and are coloured according to haplogroup in the schematic tree to the right. n: sample size. Map was generated using R software.

An interesting excerpt, from the discussion:

Regarding the geographical origin of E-M183, a previous study suggested that an expansion from the Near East could explain the observed east-west cline of genetic variation that extends into the Near East. Indeed, our results also showed a reduction in STR heterozygosity towards the West, which may be taken to support the hypothesis of an expansion from the Near East. In addition, previous studies based on genome-wide SNPs reported that a North African autochthonous component increase towards the West whereas the Near Eastern decreases towards the same direction, which again support an expansion from the Near East. However, our correlations should be taken carefully because our analysis includes only six locations on the longitudinal axis, none from the Near East. As a result, we do not have sufficient statistical power to confirm a Near Eastern origin. In addition, rather than showing a west-to-east cline of genetic diversity, the overall picture shown by this correlation analysis evidences just low genetic diversity in Western Sahara, which indeed could be also caused by the small sample size (n = 26) in this region. Alternatively, given the high frequency of E-M183 in the Maghreb, a local origin of E-M183 in NW Africa could be envisaged, which would fit the clear pattern of longitudinal isolation by distance reported in genome-wide studies. Moreover, the presence of autochthonous North African E-M81 lineages in the indigenous population of the Canary Islands, strongly points to North Africa as the most probable origin of the Guanche ancestors. This, together with the fact that the oldest indigenous individuals have been dated 2210 ± 60 ya, supports a local origin of E-M183 in NW Africa. Within this scenario, it is also worth to mention that the paternal lineage of an early Neolithic Moroccan individual appeared to be distantly related to the typically North African E-M81 haplogroup [30], suggesting again a NW African origin of E-M183.
A local origin of E-M183 in NW Africa > 2200 ya is supported by our TMRCA estimates, which can be taken as 2,000–3,000, depending on the data, methods, and mutation rates used.

The TMRCA estimates of a certain haplogroup and its subbranches provide some constraints on the times of their origin and spread. Although our time estimates for E-M78 are slightly different depending on the mutation rate used, their confidence intervals overlap and the dates obtained are in agreement with those obtained by Trombetta et al. Regarding E-M183, as mentioned above, we cannot discard an expansion from the Near East and, if so, according to our time estimates, it could have been brought by the Islamic expansion on the 7th century, but definitely not with the Neolithic expansion, which appeared in NW Africa ~7400 BP and may have featured a strong Epipaleolithic persistence. Moreover, such a recent appearance of E-M183 in NW Africa would fit with the patterns observed in the rest of the genome, where an extensive, male-biased Near Eastern admixture event is registered ~1300 ya, coincidental with the Arab expansion. An alternative hypothesis would involve that E-M183 was originated somewhere in Northwest Africa and then spread through all the region. Our time estimates for the origin of this haplogroup overlap with the end of the third Punic War (146 BCE), when Carthage (in current Tunisia) was defeated and destroyed, which marked the beginning of Roman hegemony of the Mediterranean Sea. About 2,000 ya North Africa was one of the wealthiest Roman provinces and E-M183 may have experienced the resulting population growth.

2. The Y-chromosome haplogroup C3*-F3918, likely attributed to the Mongol Empire, can be traced to a 2500-year-old nomadic group, by Zhang et al., Journal of Human Genetics (2017)


The Mongol Empire had a significant role in shaping the landscape of modern populations. Many populations living in Eurasia may have been the product of population mixture between ancient Mongolians and natives following the expansion of Mongol Empire. Geneticists have found that most of these populations carried the Y-haplogroup C3* (C-M217). To trace the history of haplogroup (Hg) C3* and to further understand the origin and development of Mongolians, ancient human remains from the Jinggouzi, Chenwugou and Gangga archaeological sites, which belonged to the Donghu, Xianbei and Shiwei, respectively, were analysed. Our results show that nine of the eleven males of the Gangga site, two of the eight males of Chengwugou site and all of the twelve males of Jinggouzi site were found to have mutations at M130 (Hg C), M217 (Hg C3), L1373 (C2b, ISOGG2015), with the absence of mutations at M93 (Hg C3a), P39 (Hg C3b), M48 (Hg C3c), M407 (Hg C3d) and P62 (Hg C3f). These samples were attributed to the Y-chromosome Hg C3* (Hg C2b, ISOGG2015), and most of them were further typed as Hg C2b1a based on the mutation at F3918. Finally, we inferred that the Y-chromosome Hg C3*-F3918 can trace its origins to the Donghu ancient nomadic group.
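The typing logic in the abstract (derived state at some markers, ancestral state at others) can be written down as a small decision procedure. The sketch below only encodes the marker-to-haplogroup pairs named in the abstract; the function name, the order of the checks and the wording of the returned labels are my own illustrative choices, not the authors' protocol.

```python
# Marker -> subclade pairs exactly as listed in the abstract.
NAMED_SUBCLADES = {
    "M93": "C3a",
    "P39": "C3b",
    "M48": "C3c",
    "M407": "C3d",
    "P62": "C3f",
}


def classify_c3(derived_markers):
    """Assign a coarse label from the set of markers found in derived state.

    Any marker not in `derived_markers` is treated as ancestral (or
    untested), which is a simplification made for this illustration.
    """
    derived = set(derived_markers)
    if "M130" not in derived:
        return "not Hg C"
    if "M217" not in derived:
        return "Hg C, not C3"
    # A derived state at any named subclade marker rules out C3*.
    for marker, subclade in NAMED_SUBCLADES.items():
        if marker in derived:
            return subclade
    if "L1373" in derived and "F3918" in derived:
        return "C3*-F3918 (C2b1a)"
    if "L1373" in derived:
        return "C3* (C2b)"
    return "C3, unresolved"


if __name__ == "__main__":
    # Marker profile described in the abstract for most ancient samples.
    print(classify_c3({"M130", "M217", "L1373", "F3918"}))
```

The asterisk in C3* is doing exactly this job: it marks samples that belong to C3 but to none of the named subclades tested.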

The development of Mongolia and the frequencies of haplogroup C3* in modern Eurasians. a The development of Mongolia. b The frequencies of haplogroup C3 in modern Eurasians. The dotted line represents the approximate boundary between the Xiongnu and the Donghu. The black and grey arrows denote the migration of the Donghu and Mongolians, respectively

The expansion of peoples is known to be associated with the spread of a certain admixture component, together with the expansion (and reduction in variability) of a haplogroup. In other words, a few male lineages are usually more successful during the expansion.

Other known examples include:

Featured image: Diachronic map of Iron Age migrations ca. 750-250 BC.


Why we shouldn’t care about the fixation of Neo-Nazis with the Middle Ages

People are obsessed with what racists, white supremacists, Neo-Nazis, etc. use to cover their ignorance, to hide their lack of political or social arguments, and to boost their pathologically low self-confidence. Now it seems to be the Middle Ages.

Some time ago I already read about this new trend, but I didn’t care. For me, as a supporter of a revival of Indo-European as a modern language, it was a relief that their fixation was somewhere different than Indo-Europeans.

The usual false syllogism for Indo-European questions goes: right populists support the supremacy of Aryans, ergo supporting the existence of expansions/language/social customs/etc. of Indo-Europeans means supporting Aryan supremacy. You can see the immediate association by the general population of Indo-European matters with Aryan supremacy by looking for information on Indo-European + white supremacy/Aryans/nazism, etc. on the Internet.

If you do that search, you might read a lot of right populist crap using Indo-European matters to support their ideas. You might even begin to associate one with the other, because it seems as if research on Indo-European questions somehow boosted those extremist ideals, right? If you think that, you are obviously part of the problem.

Apart from Aryans, Nazis have had fixations with ancient symbols (like the Swastika or Celtic symbolism), neo-paganism, the Roman Empire, the western European ‘heir empires of Rome’ that ensued, Catholicism, Germanic peoples, Romans, Greeks, Slavs, whiteness, blondness, Neanderthals…

And all of this has come at a cost for anyone involved or interested in any of those themes. It is only natural that Nazis evolve; just like shit decays, they move on. However, their interest in medieval times is not new (as is clear from the featured image of this post, and other propaganda from the time); it is just stronger now.

Now I see some medieval scholars complaining, on Twitter and in the news, calling on everyone to do something to protect the field of Medieval Studies.

But why? Why should we care about those who will regard medieval historians as tainted with Nazism? About you being called a Nazi, about people tacitly suggesting that you support their ideas? What have you done to protect Indo-Europeanists from the accusations, from the name-calling, from the shame?

Perhaps more importantly: now that you have become aware of this problem for the study of the western Middle Ages… What have you planned to do to help Indo-European studies once you are free from that yoke, that presumption of guilt? Probably nothing, you just care about yourselves. We all do.

I think it might be actually beneficial for Academia if more scholars suffer the same discrimination, if Nazis keep widening their areas of interest, so that we can all just ignore a simplistic and overused Nazi-shaming by stupid critics.

I don’t recall anyone defending Indo-Europeanists from those playing the Nazi card. The most recent example I know is the discussion around the Lazaridis et al. (2017) paper on Minoans and Mycenaeans. Some outrage from those involved in Human Evolutionary Biology (read the comments), but not too much from the rest of the world; too much concern this year about poor medievalists to care, I suppose.

You might not remember the infinite other times when you didn’t care about us being called Nazis because of our interest in (or writings about) our beloved academic field. But we do. And if you are complaining now, you certainly knew what was happening (what is happening), because how else could you know what this new love of right populists means for Medieval History, the shit it will bring?

Now your turn has come to enjoy the populace’s unending ad Nazium arguments. Publish anything about the social life in the Mediaevum, about medieval wars, religion, peoples, languages, symbols, etc., and just about anything that does not follow perfect political correctness will get you publicly shamed. Publish anything remotely interesting, and populist sites will publicise and manipulate your words, and critics and journalists will destroy your work by using populists’ words to describe it, not yours.

But, really, you shouldn’t care about the automatic association of your field with Nazis, about the unending insults, about the tacit (and oftentimes also explicit) link they will make of your work with Nazi ideas.

Just take a look at Indo-European studies. Not many Nazis have felt inclined to study (this or any other subject) because of their historical fixation with Aryans, so fear not, they will not take over your scholarships. However, their fixation has been a great filter for our field, to get rid of the weak of heart, of those who care too much about what other people think, of those who are not convinced that this is what they want to do.

It seems to me that Indo-European studies has fewer scholars than it should, compared to other (in my humble opinion less promising or interesting) subjects, but the community is strong. Not much fuck is given about political correctness when publishing theories and models on the spread of Indo-Europeans, on their myths and customs, on their language. ‘Patrilocality’, ‘violent conquest’, ‘migration of peoples’, ‘women exchange’, ‘slavery’, are common (otherwise unpopular) words to describe their history and evolution, their ancestry, and they are becoming popular to describe anthropological evolution in general. We are in a privileged position to observe reality, and also the stupid political correctness of many.

Also, you might find comfort when passing this moment of truth professionally and personally knowing that, in the future, another field – whose scholars don’t give a fuck now about your popular shaming – will be their love object, and you will be able to tell them what I am telling you now.

Welcome to the dark side!

(EDIT 9/SEP/2017) I just realized that most (tacit or explicit) Nazi-shaming comes from people within the field, who are obviously the ones interested in what you write. It is without a doubt the easiest way to criticise the work of your peers, since critics won’t need to do their research and answer formally with careful investigation. So good luck with that too!

Human ancestry solves language questions? New admixture citebait


A paper in Scientific Reports, Human ancestry correlates with language and reveals that race is not an objective genomic classifier, by Baker, Rotimi, and Shriner (2017).

Abstract (emphasis mine):

Genetic and archaeological studies have established a sub-Saharan African origin for anatomically modern humans with subsequent migrations out of Africa. Using the largest multi-locus data set known to date, we investigated genetic differentiation of early modern humans, human admixture and migration events, and relationships among ancestries and language groups. We compiled publicly available genome-wide genotype data on 5,966 individuals from 282 global samples, representing 30 primary language families. The best evidence supports 21 ancestries that delineate genetic structure of present-day human populations. Independent of self-identified ethno-linguistic labels, the vast majority (97.3%) of individuals have mixed ancestry, with evidence of multiple ancestries in 96.8% of samples and on all continents. The data indicate that continents, ethno-linguistic groups, races, ethnicities, and individuals all show substantial ancestral heterogeneity. We estimated correlation coefficients ranging from 0.522 to 0.962 between ancestries and language families or branches. Ancestry data support the grouping of Kwadi-Khoe, Kx’a, and Tuu languages, support the exclusion of Omotic languages from the Afroasiatic language family, and do not support the proposed Dené-Yeniseian language family as a genetically valid grouping. Ancestry data yield insight into a deeper past than linguistic data can, while linguistic data provide clarity to ancestry data.

Regarding European ancestry:

Southern European ancestry correlates with both Italic and Basque speakers (r = 0.764, p = 6.34 × 10⁻⁴⁹). Northern European ancestry correlates with Germanic and Balto-Slavic branches of the Indo-European language family as well as Finno-Ugric and Mordvinic languages of the Uralic family (r = 0.672, p = 4.67 × 10⁻³⁴). Italic, Germanic, and Balto-Slavic are all branches of the Indo-European language family, while the correlation with languages of the Uralic family is consistent with an ancient migration event from Northern Asia into Northern Europe. Kalash ancestry is widely spread but is the majority ancestry only in the Kalash people (Table S3). The Kalasha language is classified within the Indo-Iranian branch of the Indo-European language family.
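For readers unfamiliar with what such an r value measures, here is a minimal sketch of a Pearson correlation between an ancestry proportion and membership in a language group. It is not the authors' pipeline: the function name pearson_r and the toy numbers are made up for illustration, and coding language membership as 0/1 is an assumption of this sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data (NOT from the paper): one value per sampled population.
# ancestry_share: average proportion of a given inferred ancestry.
# speaks_group:   1 if the population speaks a language of the group, else 0.
ancestry_share = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
speaks_group = [1, 1, 1, 0, 0, 0]

r = pearson_r(ancestry_share, speaks_group)
print(f"r = {r:.3f}")
```

A high r here only says that the two columns vary together across the sampled populations; it says nothing by itself about how or when a language spread.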

Sure, admixture analysis came to save the day. Yet again. Now it’s not just Archaeology related to language anymore, it’s Linguistics; all modern languages and their classification, no less. Because why the hell not? Why would anyone study languages, history, archaeology, etc. when you can run certain algorithms on free datasets of modern populations to explain everything?

What I am criticising here, as always, is not the study per se, its methods (PCA, the use of Admixture or any other tools), or its results, which might be quite interesting – even regarding the origin or position of certain languages (or more precisely their speakers) within their linguistic groups; it’s the many broad, unsupported, striking conclusions (read the article if you want to see more wishful thinking).

This is obviously simplistic citebait that benefits only journals and authors (and is therefore tacitly encouraged), but not knowledge, because it is not supported by any linguistic or archaeological data or expertise.

Is anyone with a minimum knowledge of languages, or general anthropology, actually reviewing these articles?


Featured image: Ancestry analysis of the global data set, from the article.

Spread of Indo-European folktale traditions related to cultural and demic diffusion (using genomic data)


New article in PNAS, Inferring patterns of folktale diffusion using genomic data, by Bortolini et al. (2017).


Observable patterns of cultural variation are consistently intertwined with demic movements, cultural diffusion, and adaptation to different ecological contexts [Cavalli-Sforza and Feldman (1981) Cultural Transmission and Evolution: A Quantitative Approach; Boyd and Richerson (1985) Culture and the Evolutionary Process]. The quantitative study of gene–culture coevolution has focused in particular on the mechanisms responsible for change in frequency and attributes of cultural traits, the spread of cultural information through demic and cultural diffusion, and detecting relationships between genetic and cultural lineages. Here, we make use of worldwide whole-genome sequences [Pagani et al. (2016) Nature 538:238–242] to assess the impact of processes involving population movement and replacement on cultural diversity, focusing on the variability observed in folktale traditions (n = 596) [Uther (2004) The Types of International Folktales: A Classification and Bibliography. Based on the System of Antti Aarne and Stith Thompson] in Eurasia. We find that a model of cultural diffusion predicted by isolation-by-distance alone is not sufficient to explain the observed patterns, especially at small spatial scales (up to ~4,000 km). We also provide an empirical approach to infer presence and impact of ethnolinguistic barriers preventing the unbiased transmission of both genetic and cultural information. After correcting for the effect of ethnolinguistic boundaries, we show that, of the alternative models that we propose, the one entailing cultural diffusion biased by linguistic differences is the most plausible. Additionally, we identify 15 tales that are more likely to be predominantly transmitted through population movement and replacement and locate putative focal areas for a set of tales that are spread worldwide.

I am very interested in folktales and their origins within Proto-Indo-European culture, so the title alone was an immediate click-bait for me. It did, as always, disappoint in its methods and conclusions, but just the idea it proposes is of great interest for future studies.

There are gross limitations in assessing folktales using simply the Aarne-Thompson-Uther Classification without further analysis or explanation, apart from a summary of tales in the supplementary materials.

But their maps and simplistic hypothesized waves of diffusion (‘African origin’, ‘northern Eurasian’, ‘Eastern European’, or ‘Middle-Eastern/Caucasian’) seem to me as if they try to swim with the tide of the current literature regarding the identification of Proto-Indo-European demic diffusion with “steppe admixture” distribution (and ancient language family diffusion in general through admixture), and as such it can only be wrong.

If you just look at the actual folktale distribution (black dots) and compare it with prehistoric cultures and ancient Y-DNA distribution, you realize their maps don’t make much sense, and more complex methods (and a clearer idea of what admixture represents) are needed.

If their intention was to get published in a journal of high impact factor, they succeeded, so good for them. I am glad this subject gets more attention. Of course, their conclusions are kept formally in line with the many limitations of their methods, and are the most interesting aspect of the article:

By correcting for the presence of ethnolinguistic barriers, we find that the null model of cultural diffusion predicted by IBD alone cannot explain the observed distribution of folktales across Eurasia. Instead, beyond ~4,000 km, cultural diffusion biased by linguistic barriers exhibits the highest correlation at all geographic bins. At small geographic bins (<4,000 km), population movements and linguistic barriers may be more relevant than geographic proximity, pointing once again at the possible importance of small-scale processes of cultural transmission for testing more specific hypotheses when using genetic evidence. In addition, processes other than simple cultural diffusion may be more relevant for a smaller group of tales shared by pairs of populations that are genetically closer than populations not exhibiting those tales. Looking for smaller packages of tales or individual tales and their variants can be useful to shed light on the formation process of this vast body of popular knowledge. The long-range patterns detected by our analyses may complement this picture by suggesting a more ancient origin of some of these folktales (SI Appendix). On a broader level, these results can be used in the future to infer directional trends of cultural dispersal as well as to test for the emergence of systematic social biases [such as prestige bias, conformism/anticonformism, heterophily, and content-dependent biases] or cultural barriers different from linguistic ones, which have a chronology that may be independently ascertained.
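To make the idea of comparing correlations across geographic bins more concrete, here is a minimal sketch that groups population pairs by geographic distance and computes, within each bin, the Pearson correlation between genetic distance and folktale dissimilarity. It is an illustration under stated assumptions, not the procedure used in the article: the function names, the 4,000 km bin width applied to toy data, and all numbers are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; returns None when either sequence is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def binned_correlation(pairs, bin_width_km, min_pairs=3):
    """Correlate genetic and folktale distances within geographic bins.

    pairs: iterable of (geo_km, genetic_distance, tale_distance),
           one tuple per pair of populations.
    Returns {(bin_start_km, bin_end_km): r} for bins with enough pairs.
    """
    bins = defaultdict(list)
    for geo_km, genetic, tale in pairs:
        bins[int(geo_km // bin_width_km)].append((genetic, tale))

    result = {}
    for index, values in sorted(bins.items()):
        if len(values) < min_pairs:
            continue  # too few pairs for a meaningful coefficient
        genetic = [v[0] for v in values]
        tale = [v[1] for v in values]
        key = (index * bin_width_km, (index + 1) * bin_width_km)
        result[key] = pearson_r(genetic, tale)
    return result


# Hypothetical toy pairs (NOT data from the article).
toy_pairs = [
    (500, 0.01, 0.20), (1500, 0.02, 0.30), (3000, 0.03, 0.40),
    (5000, 0.05, 0.60), (6000, 0.06, 0.55), (7000, 0.07, 0.50),
]
by_bin = binned_correlation(toy_pairs, bin_width_km=4000)
for (start, end), r in by_bin.items():
    print(f"{start}-{end} km: r = {r:.2f}")
```

A real analysis would also need a significance test that accounts for the non-independence of pairwise distances (for instance a permutation-based one), which this sketch leaves out.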

If you are interested in studies about folktales, and especially those related to Indo-European traditions, you can check out the following articles I found interesting in the past:


Featured image (featured also in the article): Possible focal area and dispersion pattern for tale ATU313 “The Magic Flight,” one of the most popular folktales in this dataset, which may have been additionally spread through population movement and replacement. It is interesting to note how this tale reached locations that are far from its putative origin (such as Japan and southeastern Africa), whereas it was not retained by many populations located in between (gray dots).

My European Family: The First 54,000 years, by Karin Bojs


I have recently read the book My European Family: The First 54,000 years (2015), by Karin Bojs, a well-known Swedish science journalist and former science editor of Dagens Nyheter.

It is written in a fresh, dynamic style, contains a general introduction to Genetics, Archaeology, and their relation to language, and was written at a time of great change (2015) for the disciplines involved.

The book is well informed, it strikes a balance between responsible science journalism and entertaining content, and it is at times nuanced, going beyond the limits of popular science books. It is not written for scholars, although you might learn – as I did – interesting details about researchers and institutions of the anthropological disciplines involved. It contains, for example, interviews with known academics, which she uses to share details about their personalities and careers, which give – in my opinion – much-needed context to some of their publications.

Since I am clearly biased against some of the findings and research papers which are nevertheless considered mainstream in the field (like the identification of haplogroup R1a with the Proto-Indo-European expansion, or the concept of steppe admixture), I asked my wife (who knew almost nothing about genetics, or Indo-European studies) to read it and write a summary, if she liked it. She did. So much so that I have convinced her to read The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (2007), by David Anthony.

Here is her summary of the book, translated from Spanish:

The book is divided into three main parts: The Hunters, The Farmers, and The Indo-Europeans. Each is in turn divided into chapters which introduce and break down information in an entertaining way, mixing it with accounts of her interactions and personal genealogical quest.

Part one, The Hunters, offers intriguing accounts about the direct role music had in the development of the first civilizations, the first mtDNA analyses of dogs (Savolainen), and the discovery of the author’s Saami roots. Explanations about the first DNA studies and their value for archaeological studies are clear and comprehensible for any non-specialized reader. Interviews help give a close view of investigations, like that of Frederic Plassard’s in Les Combarelles cave.

Part two, The Farmers, begins with her trip to Cyprus, and arouses the interest of the reader with her description of the circular houses, her notes on the Basque language, the new papers and theories related to DNA analyses, the theory that cats chose to live with humans, the first beers, and the houses built over graves. Karin Bojs analyses the subgroup H1g1 of her grandmother Hilda, and how it belonged to the first migratory wave into Central Europe. This interest in her grandmother’s origins leads her to a conference in Pilsen about the first farmers in Europe, where she learns firsthand of the results of studies by János Jakucs, and of studies of nuclear DNA. Later on she interviews Guido Brandt and Joachim Burger, with whom she talks about haplogroups U, H, and J.

The chapter on Ötzi and the South Tyrol Museum of Archaeology (Bolzano) introduces the reader to the first prehistoric individual whose DNA was analysed, belonging to haplogroup G2a4, but also revealing other information on the Iceman, such as his lactose intolerance.

Part three, dealing with the origin of Indo-Europeans, begins with the difficulties that researchers have in locating the origin of horse domestication (which probably happened in western Kazakhstan, in the Russian steppe between the rivers Volga and Don). She mentions studies by David Anthony and on the Yamna culture, and its likely role in the diffusion of Proto-Indo-European. In an interview with Mallory in Belfast, she recalls the potential interest of far-right extremists in genetic studies (and early links of the Journal of Indo-European Studies to a certain ideology), as well as controversial statements of Gimbutas, and her potentially biased vision as a refugee from communist Europe. During the interview, Mallory had a copy of the latest genetic paper sent to Nature Magazine by Haak et al., not yet published, for review, but he didn’t share it.

Then haplogroups R1a and R1b are introduced as the most common in Europe. She visits the Halle State Museum of Prehistory (where the Nebra sky disk is exhibited), and later Krakow, where she interviews Slawomir Kadrow, dealing with the potential creation of the Corded Ware culture from a mix of Funnelbeaker and Globular Amphorae cultures. New studies of ancient DNA samples, published in the meantime, suggest that about 75% of Corded Ware ancestry can be traced to Yamna-like populations in admixture analyses.

In the following chapters there is a broad review of all studies published to date, as well as individuals studied in different parts of Europe, stressing the importance of ships for the expansion of R1b lineages (Hjortspring boat).

The concluding chapter is dedicated to the Vikings, and is used to dispel their image as aggressive warmongers, sketching their relevance as founders of the Russian state.

To sum up, it is a well-documented book, written in a clear style, and capable of awakening the reader’s interest in genetic and anthropological research. The author enthusiastically looks for new publications and information from researchers, but is at the same time critical of them, often showing her own personal reactions to new discoveries. All of this offers a complex personal dynamic often shared by the reader, who stays engaged with her first-person account for the full length of the book.

Mayte Batalla (July 2017)

DISCLAIMER: The author sent me a copy of the book (a translation into Spanish), so there is a potential conflict of interest in this review. She didn’t ask for a review, though, and it was my wife who did it.