Nonparametric Binary Classification to Distinguish Closely Related versus Unrelated P. falciparum Parasites.

Publisher:
American Society of Tropical Medicine and Hygiene
Publication Type:
Journal Article
Citation:
American Journal of Tropical Medicine and Hygiene, 2021, 104, (5), pp. 1830-1835
Issue Date:
2021-04-05
Assessing genetic relatedness of Plasmodium falciparum genotypes is a key component of antimalarial efficacy trials. Previous methods have focused on determining a priori definitions of the level of genetic similarity sufficient to classify two infections as sharing the same strain. However, factors such as mixed-strain infections, allelic suppression, imprecise typing methods, and heterozygosity complicate comparisons of apicomplexan genotypes. Here, we introduce a novel method for nonparametric statistical testing of relatedness for P. falciparum parasites. First, the background distribution of genetic distance between unrelated strains is computed. Second, a threshold genetic distance is computed from this empiric distribution of distances to demarcate genetic distances that are unlikely to have arisen by chance. Third, the genetic distance between paired samples is computed, and paired samples with genetic distances below the threshold are classified as related. The method is designed to work with any arbitrary genetic distance definition. We validated this procedure using two independent approaches to calculating genetic distance. We assessed the concordance of the novel nonparametric classification with a gold-standard Bayesian approach for 175 pairs of recurrent P. falciparum episodes from previously published malaria efficacy trials with microsatellite data from five studies in Guinea and Angola. The novel nonparametric approach was 98% sensitive and 84-89% specific in correctly identifying related genotypes compared with a gold-standard Bayesian algorithm. The approach provides a unified and systematic method to statistically assess relatedness of P. falciparum parasites using arbitrary genetic distance methodologies.
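The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The distance function (fraction of loci with no shared allele), the data layout (a dict mapping each marker name to a set of observed alleles), the significance level alpha = 0.05, and all function names are assumptions made for illustration; the abstract states only that any genetic distance definition can be used.

```python
from itertools import combinations


def mismatch_distance(a, b):
    """Hypothetical distance: fraction of shared loci at which two samples
    have no allele in common. Each sample maps marker name -> set of alleles."""
    loci = a.keys() & b.keys()
    if not loci:
        raise ValueError("samples have no loci in common")
    mismatches = sum(1 for locus in loci if not (a[locus] & b[locus]))
    return mismatches / len(loci)


def background_distances(unrelated_samples, distance=mismatch_distance):
    """Step 1: distances between all pairs of samples assumed to be unrelated."""
    return sorted(distance(x, y) for x, y in combinations(unrelated_samples, 2))


def distance_threshold(sorted_distances, alpha=0.05):
    """Step 2: empirical cutoff. At most a fraction alpha of the background
    distances lie strictly below the returned value (alpha is an assumption)."""
    if not sorted_distances:
        raise ValueError("background distribution is empty")
    index = int(alpha * len(sorted_distances))
    return sorted_distances[index]


def is_related(sample_a, sample_b, threshold, distance=mismatch_distance):
    """Step 3: a pair is classified as related if its distance is below the cutoff."""
    return distance(sample_a, sample_b) < threshold


# Toy data (invented for illustration, not taken from the study).
s1 = {"m1": {1}, "m2": {2}, "m3": {3}}
s2 = {"m1": {4}, "m2": {5}, "m3": {6}}
s3 = {"m1": {7}, "m2": {8}, "m3": {3}}
s4 = {"m1": {9}, "m2": {2}, "m3": {10}}

background = background_distances([s1, s2, s3, s4])
threshold = distance_threshold(background, alpha=0.05)

day0 = {"m1": {1}, "m2": {2}, "m3": {3}}
recurrence_same = {"m1": {1, 4}, "m2": {2}, "m3": {3}}
recurrence_new = {"m1": {11}, "m2": {12}, "m3": {13}}

same_is_related = is_related(day0, recurrence_same, threshold)
new_is_related = is_related(day0, recurrence_new, threshold)
```

Because the classification depends only on the ranking of distances against the background distribution, swapping in a different distance function requires no change to the thresholding or classification steps.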