Two more studies on the genetic history of East Asia: Han Chinese and Thailand

A comprehensive map of genetic variation in the world’s largest ethnic group – Han Chinese, by Chiang et al. (2017).

It is believed – based on uniparental markers from modern and ancient DNA samples and array-based genome-wide data – that Han Chinese originated in the Central Plain region of China during prehistoric times, expanding with agriculture and technology northward and southward, to become the largest Chinese ethnic group.

Abstract:

As are most non-European populations around the globe, the Han Chinese are relatively understudied in population and medical genetics studies. From low-coverage whole-genome sequencing of 11,670 Han Chinese women we present a catalog of 25,057,223 variants, including 548,401 novel variants that are seen at least 10 times in our dataset. Individuals from our study come from 19 out of 22 provinces across China, allowing us to study population structure, genetic ancestry, and local adaptation in Han Chinese. We identify previously unrecognized population structure along the East-West axis of China and report unique signals of admixture across geographical space, such as European influences among the Northwestern provinces of China. Finally, we identified a number of highly differentiated loci, indicative of local adaptation in the Han Chinese. In particular, we detected extreme differentiation among the Han Chinese at MTHFR, ADH7, and FADS loci, suggesting that these loci may not be specifically selected in Tibetan and Inuit populations as previously suggested. On the other hand, we find that Neandertal ancestry does not vary significantly across the provinces, consistent with admixture prior to the dispersal of modern Han Chinese. Furthermore, contrary to a previous report, Neandertal ancestry does not explain a significant amount of heritability in depression. Our findings provide the largest genetic data set so far made available for Han Chinese and provide insights into the history and population structure of the world’s largest ethnic group.

Using Shanghai individuals as representatives, shared drift between Chinese and ancient humans is computed with outgroup f3 statistics of the form f3(Mbuti; X, Y), with ancient individuals grouped into approximately Palaeolithic, Mesolithic, Neolithic, and Chalcolithic-Medieval times. Modern Chinese individuals are found to share more drift with pre-Neolithic hunter-gatherers than with Neolithic farmers (Featured image from the article).
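As a rough illustration of what an outgroup f3 statistic measures, here is a minimal sketch in Python. It assumes per-SNP allele frequencies are already available for the outgroup and the two test populations, and it uses the naive estimator (the mean of (o - x)(o - y) over SNPs). It leaves out the finite-sample bias correction and the block-jackknife standard errors that real analyses normally rely on, and it is not the pipeline used in the paper. The function name and the toy frequencies are hypothetical.

```python
import numpy as np


def outgroup_f3(freq_outgroup, freq_x, freq_y):
    """Naive outgroup f3(O; X, Y): mean over SNPs of (o - x) * (o - y).

    Larger values mean X and Y share more genetic drift since
    splitting from the outgroup O.
    """
    o = np.asarray(freq_outgroup, dtype=float)
    x = np.asarray(freq_x, dtype=float)
    y = np.asarray(freq_y, dtype=float)
    if not (o.shape == x.shape == y.shape):
        raise ValueError("all frequency arrays must have the same length")
    return float(np.mean((o - x) * (o - y)))


# Toy allele frequencies at four SNPs (made up for illustration only).
outgroup = [0.1, 0.5, 0.9, 0.2]   # stands in for the outgroup, e.g. Mbuti
modern = [0.6, 0.5, 0.3, 0.8]     # stands in for the modern test population
ancient_a = [0.5, 0.6, 0.4, 0.7]  # drifted together with `modern`
ancient_b = [0.2, 0.5, 0.8, 0.3]  # stayed close to the outgroup

f3_a = outgroup_f3(outgroup, modern, ancient_a)
f3_b = outgroup_f3(outgroup, modern, ancient_b)
print(f3_a > f3_b)  # ancient_a shares more drift with `modern`
```

Ranking ancient groups by this value against a fixed modern population is the kind of comparison behind statements such as "greater shared drift with pre-Neolithic hunter-gatherers".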

EDIT (17/7/2017): Davidski at Eurogenes shares an interesting view on the European admixture estimates for northwestern China:

These sorts of estimates always look way off. And I doubt that it’s largely the result of the Silk Road, which linked China to the Near East and Mediterranean rather than to Northern Europe. More likely it reflects gene flow from the Pontic-Caspian steppe in Eastern Europe during the Bronze and Iron ages, via the Afanasievo, Andronovo, and other closely related steppe peoples


New insights from Thailand into the maternal genetic history of Mainland Southeast Asia, by Kutanan et al. (2017)

Abstract:

Tai-Kadai (TK) is one of the major language families in Mainland Southeast Asia (MSEA), with a concentration in the area of Thailand and Laos. Our previous study of 1,234 mtDNA genome sequences supported a demic diffusion scenario in the spread of TK languages from southern China to Laos as well as northern and northeastern Thailand. Here we add an additional 560 mtDNA sequences from 22 groups, with a focus on the TK-speaking central Thai people and the Sino-Tibetan speaking Karen. We find extensive diversity, including 62 haplogroups not reported previously from this region. Demic diffusion is still a preferable scenario for central Thais, emphasizing the extension and expansion of TK people through MSEA, although there is also some support for an admixture model. We also tested competing models concerning the genetic relationships of groups from the major MSEA languages, and found support for an ancestral relationship of TK and Austronesian-speaking groups.

My European Family: The First 54,000 years, by Karin Bojs

I have recently read the book My European Family: The First 54,000 Years (2015), by Karin Bojs, a well-known Swedish science journalist and former science editor of Dagens Nyheter.

It is written in a fresh, dynamic style, and offers a general introduction to genetics, archaeology, and their relation to language, at a time of great change (2015) for the disciplines involved.

The book is well informed, it strikes a balance between responsible science journalism and entertaining content, and it is at times nuanced, going beyond the limits of popular science books. It is not written for scholars, although you might learn – as I did – interesting details about researchers and institutions of the anthropological disciplines involved. It contains, for example, interviews with well-known academics, which she uses to share details about their personalities and careers, and which give – in my opinion – much-needed context to some of their publications.

Since I am clearly biased against some of the findings and research papers which are nevertheless considered mainstream in the field (like the identification of haplogroup R1a with the Proto-Indo-European expansion, or the concept of steppe admixture), I asked my wife (who knew almost nothing about genetics or Indo-European studies) to read it and write a summary, if she liked it. She did. So much so that I have convinced her to read The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (2007), by David Anthony.

Here is her summary of the book, translated from Spanish:

The book is divided into three main parts: The Hunters, The Farmers, and The Indo-Europeans. Each is in turn split into chapters which introduce and break down information in an entertaining way, mixing it with accounts of her interactions with researchers and her personal genealogical quest.

Part one, The Hunters, offers intriguing accounts about the direct role music had in the development of the first civilizations, the first mtDNA analyses of dogs (Savolainen), and the discovery of the author’s Saami roots. Explanations about the first DNA studies and their value for archaeological studies are clear and comprehensible for any non-specialized reader. Interviews help give a close view of investigations, like that of Frederic Plassard’s in Les Combarelles cave.

Part two, The Farmers, begins with her trip to Cyprus, and arouses the interest of the reader with her description of the circular houses, her notes on the Basque language, the new papers and theories related to DNA analyses, the theory that cats chose to live with humans, the first beers, and the houses built over graves. Karin Bojs analyses the subgroup H1g1 of her grandmother Hilda, and how it belonged to the first migratory wave into Central Europe. This interest in her grandmother’s origins leads her to a conference in Pilsen about the first farmers in Europe, where she learns firsthand about the results of studies by János Jakucs, and about studies of nuclear DNA. Later on she interviews Guido Brandt and Joachim Burger, with whom she talks about haplogroups U, H, and J.

The chapter on Ötzi and the South Tyrol Museum of Archaeology (Bolzano) introduces the reader to the first prehistoric individual whose DNA was analysed, belonging to haplogroup G2a4, but also revealing other information on the Iceman, such as his lactose intolerance.

Part three, dealing with the origin of Indo-Europeans, begins with the difficulties that researchers have in locating the origin of horse domestication (which probably happened in western Kazakhstan, in the Russian steppe between the rivers Volga and Don). She mentions studies by David Anthony and on the Yamna culture, and its likely role in the diffusion of Proto-Indo-European. In an interview with Mallory in Belfast, she recalls the potential interest of far-right extremists in genetic studies (and early links of the Journal of Indo-European Studies to certain ideologies), as well as controversial statements by Gimbutas, and her potentially biased vision as a refugee from communist Europe. During the interview, Mallory had a copy of the latest genetic paper by Haak et al., which had been sent to Nature for review and was not yet published, but he didn’t share it.

Then haplogroups R1a and R1b are introduced as the most common in Europe. She visits the Halle State Museum of Prehistory (where the Nebra sky disk is exhibited), and later Krakow, where she interviews Slawomir Kadrow, dealing with the potential creation of the Corded Ware culture from a mix of Funnelbeaker and Globular Amphorae cultures. New studies of ancient DNA samples, published in the meantime, show in admixture analyses that Corded Ware samples share about 75% of their ancestry with Yamna.

In the following chapters there is a broad review of all studies published to date, as well as individuals studied in different parts of Europe, stressing the importance of ships for the expansion of R1b lineages (Hjortspring boat).

The concluding chapter is dedicated to the Vikings, and is used to demystify them as aggressive warmongers, sketching their relevance as founders of the Russian state.

To sum up, it is a well-documented book, written in a clear style, and capable of awakening the reader’s interest in genetic and anthropological research. The author enthusiastically looks for new publications and information from researchers, but is at the same time critical of them, often showing her own personal reactions to new discoveries. All of this creates a complex personal dynamic often shared by the reader, who stays engaged with her first-person account for the full length of the book.

Mayte Batalla (July 2017)

DISCLAIMER: The author sent me a copy of the book (a translation into Spanish), so there is a potential conflict of interest in this review. She didn’t ask for a review, though, and it was my wife who did it.

Indo-European pastoralists healthier than modern populations? Genomic health improving over time

A new paper has appeared at bioRxiv, The Genomic Health Of Ancient Hominins (2017), by Berens, Cooper and Lachance.

Important results are available at: http://popgen.gatech.edu/ancient-health/.

While the study’s many limitations are obvious to the authors, they still suggest certain interesting possibilities as the most important conclusions:

  • In general, genetic risk scores (GRS) of ancient individuals are similar to those of present-day individuals
  • Genomic health seems to be improving over time
  • Pastoralists could have been healthier than older and modern populations

Some details and shortcomings of the study (most stated by them, bold is from me) include:

  • Allele selection: only some of the known autosomal disease-associated SNPs were included
  • Discovered disease-associated SNPs are known to be biased toward European diseases
  • Ancient sample selection and genomic quality: only 147 of the 449 available ancient genomes were included, with a conventional cut-off requiring coverage of at least 50% of the 3180 focal disease-associated loci. The included samples do not all cover the same loci. All this can affect whether an individual has a high or low GRS (a relationship was found between GRS percentiles and sequencing coverage for ancient samples).
  • Phase 3 of the 1000 Genomes Project was used. However, many disease alleles that segregated in the past remain undiscovered – therefore, GRS for ancient individuals should be considered to be underestimated.
  • Genetic risk scores were calculated for each individual (with different sets of disease-associated loci), hence they were not comparable across individuals. So GRS were standardized as GRS percentiles, with certain assumptions, comparing them to modern individuals
  • Multiple comparisons with all data available, using multiple groups, in the small sample selected: comparisons were made between standardized GRS percentile, sample age (i.e. estimated date), mode of subsistence, and geographic location.
  • Older samples have worse coverage, especially Altai Neandertal, Ust’-Ishim, and Denisovan (which might influence results in hunter-gatherers)
  • Northern ancient individuals (using latitude values) show healthier genomes, but most ancient individuals are from Eurasia, and samples are heterogeneous.
  • Agriculturalists show a higher genetic risk for dental/periodontal diseases than hunter-gatherers and pastoralists. However, this disease has the smallest number of risk loci (k = 40), so risk in older samples might be underestimated, and pastoralists are the more recent agriculturalist population (most used agriculture as a complementary diet), so it is only natural that selection had an impact over time in this aspect.
  • Pastoralists have the smallest sample size (19 samples) and geographic range, so conclusions about this group are even less trustworthy.
  • Genetic risk percentile ≠ Genomic health ≠ phenotypic health (not deterministic), and also disease-associated alleles in modern populations ≠ same effects in past environments.
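To make the coverage cut-off and the percentile standardization in the list above more concrete, here is a minimal sketch in Python. The scoring formula (a weighted mean of risk-allele dosage over covered loci), the function names, and the toy genotypes are all assumptions for illustration. The paper's actual GRS definition and weighting may differ.

```python
import numpy as np


def passes_coverage(dosages, min_fraction=0.5):
    """True if at least `min_fraction` of the focal loci are genotyped.

    Missing loci are marked with np.nan.
    """
    d = np.asarray(dosages, dtype=float)
    return float(np.mean(~np.isnan(d))) >= min_fraction


def genetic_risk_score(dosages, weights):
    """Weighted mean risk-allele dosage (0..1) over covered loci.

    dosages: risk-allele counts (0, 1, or 2) per locus, np.nan if missing.
    weights: hypothetical positive per-locus effect weights.
    """
    d = np.asarray(dosages, dtype=float)
    w = np.asarray(weights, dtype=float)
    covered = ~np.isnan(d)
    if not covered.any():
        raise ValueError("no covered loci")
    return float(np.sum(d[covered] * w[covered]) / np.sum(2.0 * w[covered]))


def grs_percentile(ancient_dosages, modern_dosages, weights):
    """Percentile of an ancient GRS among modern individuals.

    Modern scores are recomputed on the loci covered in the ancient
    sample, so each ancient individual gets its own reference
    distribution.
    """
    a = np.asarray(ancient_dosages, dtype=float)
    covered = ~np.isnan(a)
    w = np.asarray(weights, dtype=float)[covered]
    modern = np.asarray(modern_dosages, dtype=float)[:, covered]
    ancient_score = genetic_risk_score(a[covered], w)
    modern_scores = np.array([genetic_risk_score(row, w) for row in modern])
    return 100.0 * float(np.mean(modern_scores <= ancient_score))


# Toy data: 4 modern individuals x 3 loci, equal weights (all made up).
weights = [1.0, 1.0, 1.0]
modern_panel = [[0, 2, 1], [1, 0, 1], [2, 1, 1], [1, 2, 0]]
ancient_high = [2, np.nan, 2]       # covered at 2 of 3 loci
ancient_sparse = [np.nan, np.nan, 1]  # covered at 1 of 3 loci

print(passes_coverage(ancient_high), passes_coverage(ancient_sparse))
print(grs_percentile(ancient_high, modern_panel, weights))
```

Because every ancient sample is scored on a different subset of loci, two percentiles are only loosely comparable, which is one way to see why coverage can correlate with the reported GRS percentiles.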

To sum up, this is an interesting approach to studying genomic health with the scarce data available, but it makes too many comparisons and tests too many hypotheses. It is reminiscent of a brute-force attack on the data, which can yield statistically significant results anytime, anywhere.
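The multiple-comparisons concern can be illustrated with a short calculation. The sketch below assumes independent tests at a nominal significance level of 0.05. The number of tests is a hypothetical value, not a count taken from the paper, and the function names are made up for this example.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among `num_tests`
    independent tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below `alpha` (Bonferroni correction)."""
    return alpha / num_tests


# Hypothetical number of comparisons, for illustration only.
num_tests = 20
fwer = familywise_error_rate(num_tests)
threshold = bonferroni_threshold(num_tests)
print(f"FWER for {num_tests} tests: {fwer:.2f}")
print(f"Bonferroni per-test threshold: {threshold:.4f}")
```

With a few dozen uncorrected comparisons, finding at least one "significant" association becomes more likely than not, even if none of the effects is real.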