All Ancient DNA Samples - Y-DNA and mtDNA Haplogroups
                       
Downloading this spreadsheet as Excel retains the formatting. Links may not copy to a new sheet.
The entire work is identified by the Version Number and Date given in the second-to-last and last columns. https://haplogroup.info/
File name includes Version Number, as well as Subversion Number for minor changes, if there are any. Current Link to Files
Copyright 2018-2021 Carlos Quiles.
Copyright 2013-2018 Jean Manco.
Information on document structure, version history, how to cite this document, etc. https://indo-european.eu/
https://creativecommons.org/licenses/by/4.0/
Because ancient DNA research and standards such as ISOGG, FTDNA, and YFull keep changing, this data needs constant updates, and inconsistencies in nomenclature may often be found.
For updates, corrections, or clarifications, as well as to request access to protected files, please write to this e-mail: cquiles@dnghu.org
BACKGROUND: "Confirmed" subclade (FTDNA, YFull); Y-SNPs reviewed in 2021; Approximate subclade / data; Inferred subclade (external data); No corresponding subclade / No data
FONT: Data for personal use; Needs rechecking/retesting; Warning / unreliable analysis; Calls checked/expanded; Wrong call now corrected
SYMBOLS/USAGE
* Indicates basal subclade (negative for all known subclades).
~ Indicates only an approximate location on the tree.
(xY) Indicates negative for SNP Y.
pre-Y Indicates inferred SNP location between named branch Y and its parent haplogroup.
? Indicates that there is contrasting or not fully conclusive evidence for the SNP inference.
Please Note: Samples which do not have inferences of haplogroups are included for other purposes, such as formal stats of genome-wide SNP data.
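The symbol conventions above lend themselves to automated parsing. The following is a minimal sketch only: the function name `parse_label`, the `HaplogroupLabel` fields, and the sample labels are hypothetical, and it assumes that `*` is a suffix, `pre-` is a prefix, `~` and `?` may appear anywhere in the label, and multiple SNPs inside `(x...)` are comma-separated. Actual cells in the spreadsheet may not follow these assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class HaplogroupLabel:
    base: str                  # label with all markers stripped
    basal: bool = False        # trailing "*": negative for all known subclades
    approximate: bool = False  # "~": approximate location on the tree
    pre: bool = False          # leading "pre-": between branch and its parent
    uncertain: bool = False    # "?": contrasting or inconclusive evidence
    excluded: list = field(default_factory=list)  # SNPs listed in "(x...)"


def parse_label(raw: str) -> HaplogroupLabel:
    """Split a haplogroup label into its base name and symbol flags."""
    text = raw.strip()

    # Collect SNPs reported as negative, e.g. "(xL23)" or "(xL23,Z2103)".
    excluded = []
    for group in re.findall(r"\(x([^)]*)\)", text):
        excluded.extend(s.strip() for s in group.split(",") if s.strip())
    text = re.sub(r"\(x[^)]*\)", "", text)

    uncertain = "?" in text
    approximate = "~" in text
    text = text.replace("?", "").replace("~", "").strip()

    pre = text.startswith("pre-")
    if pre:
        text = text[len("pre-"):]

    basal = text.endswith("*")
    if basal:
        text = text[:-1]

    return HaplogroupLabel(text.strip(), basal, approximate, pre, uncertain, excluded)
```

For example, `parse_label("pre-R1b-M269(xL23)?")` yields the base `R1b-M269` with `pre` and `uncertain` set and `L23` recorded as excluded.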
COLUMN NAMES AND VALUES
Object-ID Official lab name (preferably the Master-ID from the Reich Lab's AADR compendium).
Colloquial-Skeletal Other lab or skeletal names.
Latitude Reported/approximate latitude.
Longitude Reported/approximate longitude.
RandLat Randomized latitude (to avoid overcrowding).
RandLong Randomized longitude (to avoid overcrowding).
Sex Reported genetic or anthropological sex.
mtDNA-coverage mtDNA coverage (merged data).
mtDNA-haplogroup Reported mtDNA haplogroup.
mtDNA-Haplotree mtDNA in FTDNA nomenclature.
mt-FTDNA Link to mtDNA subclade on FTDNA.
mtree mtDNA in YFull nomenclature.
mt-YFull Link to mtDNA subclade on YFull.
FTDNA-mt-Haplotree FTDNA Haplotree standard nomenclature.
mtFAR Inverted Formed-Age-Ratio for terminal mt-SNP (NOTE: Values greater than 1 have been rounded to 1). FAR metric proposed by Jari Kinnunen at https://haplotree.info/. Formation dates collected from YFull (2019-2020) and adapted for the FTDNA haplotree in SNP Tracker at http://scaledinnovation.com/.
mt-Simple Simple mtDNA subclade grouping.
mt-Symbol mtDNA symbol.
HVS-I Reported variants for called MT haplogroup / HVS-I.
HVS-II Reported private mutations (recent papers) / HVS-II.
HVS-NO Mutations not present in sample.
mt-SNPs Newly reported mutations.
Responsible-mtDNA Person (real name or pseudonym) or institution responsible for updated subclade, usually based on BAM analysis.
Y-DNA Reported Y-DNA haplogroup.
Y-New Updated Y-DNA.
SNP-positive Positive SNPs.
SNP-negative Negative SNPs.
SNP-dubious Dubious SNPs.
Y-SNP Accession ID and link to files with Y-SNP calls made with YLeaf + pathPhynder, selecting only SNPs from the FTDNA Haplotree.
NRY Markers with haplogroup information, YLeaf v. 2.2 over FASTQ (occasionally over BAM). [When multiple files are available and not merged, the highest number is selected.]
Y-Simple Simple Y-DNA subclade grouping.
YTree Detailed Y-DNA subclade (selecting YTree.net nomenclature for R1b subclades when available).
Y-Haplotree-Variant Y-DNA in FTDNA nomenclature, available in search by variants.
Y-Haplotree-Public Y-DNA in FTDNA nomenclature, available in default search.
Y-FTDNA Link to Y-DNA subclade on FTDNA.
YFull Y-DNA in YFull nomenclature.
Y-YFull Link to Y-DNA subclade on YFull.
ISOGG2019 Y-DNA in the last available ISOGG standard (2018-2020) nomenclature.
FTDNA-Y-Haplotree FTDNA Haplotree standard nomenclature.
Y-FAR Inverted Formed-Age-Ratio for terminal Y-SNP (NOTE: Values greater than 1 have been rounded to 1). FAR metric proposed by Jari Kinnunen at https://haplotree.info/. Formation dates collected from YFull (2019-2020) and adapted for the FTDNA haplotree in SNP Tracker at http://scaledinnovation.com/.
Y-Symbol Y-DNA symbol (simple).
Y-Symbol2 Y-DNA symbol (multiple).
Responsible-SNP Person (real name or pseudonym) or institution responsible for updated subclade, usually based on BAM analysis.
SNPs SNPs hit on autosomal targets.
Autosomal-Coverage Coverage on autosomal targets.
Damage-Rate Damage rate in first nucleotide on sequences overlapping 1240k targets (merged data).
Assessment ASSESSMENT (Xcontam interval is listed if lower bound is >0.005, "QUESTIONABLE" if lower bound is 0.01-0.02, "QUESTIONABLE_CRITICAL" or "FAIL" if lower bound is >0.02) (mtcontam confidence interval is listed if coverage >2 and upper bound is <0.98: 0.9-0.95 is "QUESTIONABLE"; <0.9 is "QUESTIONABLE_CRITICAL"; questionable status is overridden by ANGSD with PASS if upper bound of contamination is <0.01 and QUESTIONABLE if upper bound is 0.01-0.05) (damage for ds.half is "QUESTIONABLE_CRITICAL/FAIL" if <0.01, "QUESTIONABLE" for 0.01-0.03, and recorded but passed if 0.03-0.05; libraries with fully-treated last base are "QUESTIONABLE_CRITICAL" or "FAIL" if <0.03, "QUESTIONABLE" if 0.03-0.06, and recorded but passed if 0.06-0.1) (sexratio is "QUESTIONABLE" if [0.03,0.10] or [0.30,0.35); "QUESTIONABLE_CRITICAL/FAIL" if (0.10,0.30)).
Kinship-Notes Reported kinship.
Source Source in format "Author(Paper)Year". See "Publications" below.
Method-Date Method for determining the date (including any warnings).
Date Reported date, usually 95% CI calibrated BC/AD, but also calBP, BP, archaeological period, etc.
Mean Average year expressed as negative (calBC) to positive (calAD) values.
CalBC_top Earliest year in the reported range expressed as negative (calBC) or positive (calAD) values.
CalBC_bot Latest year in the reported range expressed as negative (calBC) or positive (calAD) values.
QGIS_top Earliest year in the reported range in format "XXXX BC" or "XXXX AD" for QGIS plugin TimeManager.
QGIS_bot Latest year in the reported range in format "XXXX BC" or "XXXX AD" for QGIS plugin TimeManager.
Q2W_top Earliest year in simple (false) year range that works for qgis2web plugin.
Q2W_bot Latest year in simple (false) year range that works for qgis2web plugin.
ArcGIS_top Earliest year in simple (false) year range that works for ArcGIS.
ArcGIS_bot Latest year in simple (false) year range that works for ArcGIS.
Age Age estimation in anthropology or historical record.
Simplified_Culture Simplified culture.
Culture_Grouping Cultural groups, including regional divisions.
Label Official label for lab (admixture) analyses.
Location Archaeological site or modern location.
SiteID More specific archaeological feature, burial, or description.
Country Modern country of the archaeological site.
LP Lactase persistence (0, 1, NA, or ...)
Skin Reported or updated skin color prediction.
Hair Reported or updated hair color and shade prediction.
Eye Reported or updated eye color prediction.
P1104A P1104A variant, data from KernerAmJHumGenet2021.
Other Plague, CCR5-D32, etc.
Version-ID Version-ID selected from Reich Lab's compendium AADR for autosomal analyses (SmartPCA, ADMIXTURE,...)
Version Version Number. Column includes dates of last corrections, additions, etc.
AnatoliaN-Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for Anatolia Neolithic component.
IranN-Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for Iran Neolithic component.
ANE-Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for ANE (AG2 and AG3).
WHG-Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for WHG.
BaikalEN_Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for Baikal_EN.
Papuan-Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for SE Asia (Thailand_BA, peaks among Papuan).
AfricaHG-Admixture Supervised ADMIXTURE results (K=7, pruned for LD 200 25 0.5) for Africa_HG.
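Two of the threshold rules quoted in the Assessment entry above (sexratio and Xcontam lower bound) can be expressed as small functions. This is an illustrative sketch, not the procedure actually used to fill the column: the function names are hypothetical, and the `PASS`, `LISTED`, and `NOT_LISTED` return values are assumptions for cases where the entry names no flag.

```python
def sex_ratio_flag(ratio: float) -> str:
    """Classify a sex-ratio value with the intervals quoted in the Assessment entry."""
    # Open interval (0.10, 0.30): critical.
    if 0.10 < ratio < 0.30:
        return "QUESTIONABLE_CRITICAL/FAIL"
    # Closed interval [0.03, 0.10] or half-open interval [0.30, 0.35): questionable.
    if 0.03 <= ratio <= 0.10 or 0.30 <= ratio < 0.35:
        return "QUESTIONABLE"
    # Assumption: values outside the listed intervals are not flagged.
    return "PASS"


def xcontam_flag(lower_bound: float) -> str:
    """Classify the lower bound of the X-chromosome contamination interval."""
    if lower_bound > 0.02:
        return "QUESTIONABLE_CRITICAL/FAIL"
    if lower_bound >= 0.01:
        return "QUESTIONABLE"
    if lower_bound > 0.005:
        # Assumption: the interval is listed in the cell but carries no flag.
        return "LISTED"
    return "NOT_LISTED"
```

The boundary handling follows the bracket notation of the entry for sexratio; for Xcontam, the entry gives "0.01-0.02" and ">0.02", so a lower bound of exactly 0.02 is treated here as "QUESTIONABLE".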
TO DO LIST
1. Check radiocarbon references and calBP -> calBC/AD from published archaeological papers.
2. Review ancestral, derived, and dubious SNPs.
3. Improve (unify) notation for ancestral, derived, and dubious SNPs (Y-STR, etc.) for automated use.
4. Check potential patrilineal relatives for more accurate subclade estimation.
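For item 1, a calBP value can be converted to the signed-year convention used in CalBC_top/CalBC_bot and then to the "XXXX BC"/"XXXX AD" text format used in QGIS_top/QGIS_bot. The sketch below is a simplified illustration with hypothetical function names. It assumes calBP counts years before AD 1950, that the signed columns use historical numbering with no year zero (so -1 means 1 BC), and that the text format is not zero-padded; none of these details are stated in this sheet.

```python
def calbp_to_signed_year(cal_bp: int) -> int:
    """Convert calibrated years BP (before AD 1950) to a signed calendar year.

    Negative results are calBC, positive results are calAD; there is no year 0.
    """
    year = 1950 - cal_bp  # astronomical numbering, where 0 corresponds to 1 BC
    # Shift non-positive years by one to skip the non-existent year zero.
    return year if year >= 1 else year - 1


def qgis_label(signed_year: int) -> str:
    """Format a signed year as 'XXXX BC' or 'XXXX AD'."""
    if signed_year == 0:
        raise ValueError("year 0 does not exist in BC/AD numbering")
    if signed_year < 0:
        return f"{abs(signed_year)} BC"
    return f"{signed_year} AD"


def mean_year(top: int, bottom: int) -> float:
    """Midpoint of a signed-year range (earliest, latest)."""
    return (top + bottom) / 2
```

For instance, 4500 calBP maps to -2551 and is labelled "2551 BC" under these assumptions. Many publications use the shortcut BC = BP - 1950, which differs by one year, so converted values should be checked against the convention of each source.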