Conventional DNA profiling of Short Tandem Repeats (STRs) provides little evidentiary value in the absence of reference profiles or in the case of a non-match. Recently, the forensic DNA intelligence field has flourished to provide investigators with valuable information from DNA samples that can narrow the pool of potential matches by identifying previously unknown reference individuals. Intelligence data of special interest include the biogeographical ancestry (BGA) and external visible characteristics (EVCs), such as eye, hair and skin colour, of unknown DNA sample donors.
Technological advances such as next-generation sequencing and microarrays have been crucial to the establishment of population diversity repositories comprising millions of DNA markers, the most abundant of which are single nucleotide polymorphisms (SNPs). Large-scale SNP studies of global populations have enabled reconstructions of mitochondrial (mtDNA) and non-recombining Y-chromosome (NRY) phylogenies, providing highly comprehensive population-specific patterns of maternal and paternal genetic variation. Similarly, numerous patterns of autosomal genetic variation have been identified between different populations. These studies have culminated in panels of markers capable of resolving ancestry at the continental level. The identification of autosomal SNPs associated with human pigmentation variation has also resulted in the discovery of specific SNPs capable of predicting EVCs. Several DNA intelligence and phenotyping assays for the inference of BGA and for the prediction of eye, hair and skin colour have subsequently been developed. However, most of these intelligence tools have focused on the analysis of a single class of SNPs, which limits the amount of ancestry intelligence that can be obtained. Because forensic biological evidence is scarce and often environmentally compromised, performing numerous individual intelligence tests is not optimal, and a consolidated DNA intelligence diagnostic test is very much needed.
This study aimed to develop a SNP genotyping system that combined autosomal, NRY and mtDNA markers for comprehensive predictions of BGA and EVCs. Candidate SNPs were selected through literature and database searches to identify loci exhibiting skewed allele frequency differences between Sub-Saharan African, North African, Middle Eastern, European, South Asian and East Asian populations. A hierarchical arrangement comprising five separate multiplexes was implemented, in which SNP typing was performed by single-base extension assays. The haploid mtDNA and NRY SNPs were grouped into Multiplexes 1 to 4, with SNPs defining maternal and paternal lineages (haplogroups) affiliated with the same geographic region grouped in the same reaction. The markers defining basal haplogroups were included in Multiplex 1, the results of which were then used to identify the subsequent multiplex(es) required to achieve further haplogroup resolution, thereby minimising the number of tests required. The autosomal SNPs were typed separately in Multiplex 5.
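The hierarchical testing strategy described above can be sketched as a simple routing step. This is only an illustration: the basal lineage labels, the routing table and the function name `multiplexes_to_run` are hypothetical placeholders rather than the assay's actual haplogroup assignments, and the sketch assumes that Multiplex 1 and Multiplex 5 are run for every sample.

```python
# Hypothetical routing table: basal lineage label -> follow-up multiplex.
# Labels and assignments are placeholders, NOT the assay's real design.
FOLLOW_UP = {
    "mt_basal_A": 2,
    "mt_basal_B": 3,
    "y_basal_A": 2,
    "y_basal_B": 4,
}


def multiplexes_to_run(mt_basal, y_basal=None):
    """Return the sorted list of multiplexes needed for one sample.

    mt_basal / y_basal are the basal lineage calls from Multiplex 1;
    y_basal may be None (for example, when no NRY result is available).
    """
    # Assumption: Multiplex 1 (basal markers) and Multiplex 5 (autosomal
    # SNPs) are always run; Multiplexes 2 to 4 only when indicated.
    required = {1, 5}
    for label in (mt_basal, y_basal):
        if label in FOLLOW_UP:
            required.add(FOLLOW_UP[label])
    return sorted(required)


if __name__ == "__main__":
    print(multiplexes_to_run("mt_basal_A", "y_basal_A"))
    print(multiplexes_to_run("mt_basal_B", "y_basal_B"))
```

When the maternal and paternal basal lineages point to the same region, a single follow-up reaction suffices, which is how such a design would reduce the number of tests performed on limited sample material.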
A performance evaluation of the 5-multiplex SNP assay was undertaken on 146 individuals originating from the six major population groups of interest. Population genetic analyses of the mtDNA and NRY haplotypes and autosomal genotypes revealed that a greater degree of population differentiation was achieved with the selected NRY and autosomal SNPs than with the mtDNA SNPs. Moreover, the results indicated that the assay primarily allowed for the differentiation of continental ancestry, with neighbouring populations, such as those of Europe, the Middle East and South Asia, often difficult to distinguish. However, the observed correlation between the declared and inferred geographic regions of maternal and paternal origin was high: 73-100% for maternal and 79-100% for paternal regional BGA. The bi-parental BGA predictions ranged from 85 to 95%, provided Middle Easterners and Europeans were grouped into a single Western Eurasian population. In 99% of cases, two of the three SNP classes correctly predicted the same ancestry from one of the five broad geographical regions (Sub-Saharan Africa, North Africa, Western Eurasia, South Asia and East Asia). High prediction accuracies were also observed for the inference of EVCs, including hair colour (86-88%) and eye colour (81-95%). The DNA intelligence assay also performed well with low starting amounts of genomic DNA, with full profiles obtained from as little as 100 pg of template, and with routine casework biological samples. Consequently, this study presented the successful development of a novel, consolidated DNA intelligence tool that has displayed high performance for the inference of regional (continental) BGA and EVCs in preliminary tests. Further validations of the assay are required; however, the developed 5-multiplex SNP assay remains a valuable DNA intelligence diagnostic tool for the forensic science community.
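The agreement between SNP classes reported above suggests one way the three regional calls could be combined. The following is a minimal sketch under the assumption of a simple two-of-three majority rule; the study does not describe its combination rule in this summary, and the function name `consensus_region` is hypothetical.

```python
from collections import Counter


def consensus_region(mtdna, nry, autosomal):
    """Return the region supported by at least two SNP classes, else None.

    Each argument is a broad region label (e.g. "East Asia") or None when
    that class gave no call (for example, no NRY result for a female donor).
    """
    calls = [c for c in (mtdna, nry, autosomal) if c is not None]
    if not calls:
        return None
    region, count = Counter(calls).most_common(1)[0]
    # Assumed rule: require agreement between at least two marker classes.
    return region if count >= 2 else None


if __name__ == "__main__":
    print(consensus_region("East Asia", "East Asia", "South Asia"))
```

Under this assumed rule, a sample with three discordant calls returns no consensus and would need to be interpreted class by class.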